File | Last commit | Last updated
[Ops][Feature] Add DeepSeek V4 custom operators (#9228)

## Summary
- port DeepSeek V4 custom ops from GDzhu01/vllm-ascend-deepseekv4
- add attention ops: compressor, inplace_partial_rotary_mul, quant_lightning_indexer, quant_lightning_indexer_metadata, sparse_attn_sharedkv, sparse_attn_sharedkv_metadata
- add GMM ops: grouped_matmul_swiglu_quant, grouped_matmul_swiglu_quant_weight_nz_tensor_list, grouped_matmul_swiglu_quant_v2
- add MoE ops: hc_post, hc_pre_inv_rms, hc_pre_sinkhorn, moe_gating_top_k_hash, scatter_nd_update_v2
- register torch bindings and meta kernels for the new public APIs

## Source branches
- GDzhu01/vllm-ascend-deepseekv4:vllm_ds_uncontigous_018_lf for the first 13 ops
- GDzhu01/vllm-ascend-deepseekv4:v4_v0.18.0_0412 for grouped_matmul_swiglu_quant_v2

## Validation
- `git diff --check HEAD~1..HEAD`
- checked that source-branch test artifacts/logs were not included
- no local NPU execution was run; this is expected for a local macOS workspace, and CI should cover build validation
- vLLM version: v0.20.2
- vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3

## Co-authors
Co-authored-by: 1132509010 <1132509010@qq.com>
Co-authored-by: ader47 <1661888967@qq.com>
Co-authored-by: anakin-wx <1084704046@qq.com>
Co-authored-by: anon189Ty <Stari_Falcon@outlook.com>
Co-authored-by: ChangminTao <taocm123@qq.com>
Co-authored-by: chenchris2 <1349418798@qq.com>
Co-authored-by: ChenxiQ <chenxi.qian.cq@outlook.com>
Co-authored-by: coder-fny <985619145@qq.com>
Co-authored-by: fuzhihong699 <fuzhihong4@huawei.com>
Co-authored-by: GDzhu01 <809721801@qq.com>
Co-authored-by: goldVitaminC <297780618@qq.com>
Co-authored-by: HiC4Sh1e <chenjie137@huawei.com>
Co-authored-by: hwhaokun <haokun0405@163.com>
Co-authored-by: kirliavc <jlc@pku.edu.cn>
Co-authored-by: lcfenglinwan <lcfenglin@qq.com>
Co-authored-by: Liexss <924834690@qq.com>
Co-authored-by: linfeng-yuan <1102311262@qq.com>
Co-authored-by: liuyan190974 <shandaliuyan@163.com>
Co-authored-by: LookAround0301 <lixushi@huawei.com>
Co-authored-by: maoxx241 <maomaoyu870@gmail.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
Co-authored-by: monologue815 <monologue815@qq.com>
Co-authored-by: MosCloud <bwzhang1991@163.com>
Co-authored-by: nomewang <nomeyue@outlook.com>
Co-authored-by: nwpu-zxr <zhouxuerong2@huawei.com>
Co-authored-by: pinfa <1819563383@qq.com>
Co-authored-by: pjgao <1783198484@qq.com>
Co-authored-by: QiuChunshuo <qiuchunshuo@huawei.com>
Co-authored-by: QwertyJack <7554089+QwertyJack@users.noreply.github.com>
Co-authored-by: realliujiaxu <realliujiaxu@163.com>
Co-authored-by: showMeYourCode1997 <934005226@qq.com>
Co-authored-by: SidaoY <1024863041@qq.com>
Co-authored-by: slippersss <slippersss@126.com>
Co-authored-by: Toneymiller <1476209578@qq.com>
Co-authored-by: wangxiyuan <wangxiyuan1007@gmail.com>
Co-authored-by: weijinqian_v1 <weijinqian@huawei.com>
Co-authored-by: weinachuan <1173732899@qq.com>
Co-authored-by: WithHades <244036962@qq.com>
Co-authored-by: wjunLu <wjunlu217@gmail.com>
Co-authored-by: WOE-Y <876362620@qq.com>
Co-authored-by: wxh571001500 <571001500@qq.com>
Co-authored-by: wxsIcey <1790571317@qq.com>
Co-authored-by: xmpp777 <yangming2@huawei.com>
Co-authored-by: yiz-liu <liu_yizhou@outlook.com>
Co-authored-by: yzylxyypl <yzylxyypl@gmail.com>
Co-authored-by: zcc-zjut <zcczxy2019@163.com>
Co-authored-by: zhangsicheng5 <zhangsicheng5@huawei.com>
Co-authored-by: zhaozx-cn <zhaozx2116@163.com>
Co-authored-by: zhenwenqi_2024 <zhenwenqi_2022@qq.com>
Co-authored-by: ZT-AIA <1028681969@qq.com>
Signed-off-by: maoxx241 <maomaoyu870@gmail.com>

Last updated: 11 days ago

[Feature][Ops] Add A5 custom operator build support (#9271)

### What this PR does / why we need it?
This PR adds the A5 / ascend950 custom operator build and binding support needed by the DeepSeek V4 A5 path.

Changes included:
- Updates the ascend950 CUSTOM_OPS_ARRAY in csrc/build_aclnn.sh to match the A5 ACLNN branch operator list.
- Adds A5-specific custom operator directories for indexer_compress_epilog, indexer_compress_epilog_v2, kv_compress_epilog, kv_quant_sparse_attn_sharedkv, kv_quant_sparse_attn_sharedkv_metadata, load_index_kv_cache, hc_pre, and swiglu_group_quant.
- Wires torch and meta registrations for the new A5 operators, including npu_hc_pre_v2 and grouped_matmul_swiglu_quant_weight_nz.
- Skips the direct vllm_ascend_kernels target for ascend950 and uses VLLM_ENABLE_ATB_AND_DIRECT_KERNELS to guard direct-kernel includes, schemas, implementations, and meta registrations. This keeps ascend950 and 310P import-safe when those direct kernels are not built.
- Maps ascend950 to the newer Ascend950PR_9599 CANN platform name used by the new build framework.
- Updates quant_lightning_indexer for A5 PA cache views: it keeps the existing stride / scale_stride schema, avoids forcing 950 key/key-scale tensors to be contiguous, and uses the passed strides in the arch35 PA offset path.

A3 conflict-sensitive note:
- The A5 source branch also contains changes under shared A3 operator paths such as compressor, sparse_attn_sharedkv, hc_pre_sinkhorn, hc_pre_inv_rms, hc_post, inplace_partial_rotary_mul, moe_gating_top_k_hash, and fused-MoE Python code.
- This PR intentionally does not overwrite those shared implementations. For shared operator names that remain in the ascend950 build list, it uses the current main-branch implementation and only adds the A5-specific operator directories missing from main.
- quant_lightning_indexer is the intentional shared-path exception because A5 needs dim0-stride-aware PA cache access.
- quant_lightning_indexer_metadata stays on the mainline Ascend950 SoC-version behavior and does not add legacy Ascend910_95 compatibility.
- compressor is intentionally not changed in this PR and should be handled by the operator owners separately.
- vLLM version: v0.20.2
- vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3

---------

Signed-off-by: maoxx241 <maomaoyu870@gmail.com>

Last updated: 10 days ago

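The dim0-stride-aware PA cache access mentioned for quant_lightning_indexer in #9271 can be illustrated with a small sketch. It is not the operator's code: `pa_block_offset`, `read_block`, `BLOCK_ELEMS`, and `DIM0_STRIDE` are hypothetical names, and a plain Python list stands in for the cache buffer. It only shows why an offset computed from the passed stride differs from one that assumes a contiguous tensor.

```python
# Hypothetical illustration: offsets into a block cache whose blocks are
# padded, so the view over it is not contiguous along dim 0.

BLOCK_ELEMS = 6   # logical elements per block in the view (assumed value)
DIM0_STRIDE = 8   # physical distance between block starts (assumed value)

# Flat backing buffer holding 3 padded blocks; values equal their index.
buffer = list(range(3 * DIM0_STRIDE))


def pa_block_offset(block_idx, dim0_stride, inner_offset=0):
    """Element offset of a position inside a strided block view."""
    return block_idx * dim0_stride + inner_offset


def read_block(block_idx, dim0_stride):
    """Read the logical elements of one block using the given dim0 stride."""
    start = pa_block_offset(block_idx, dim0_stride)
    return buffer[start:start + BLOCK_ELEMS]


# Using the stride that was passed in lands on the real start of block 1.
strided = read_block(1, DIM0_STRIDE)

# Assuming contiguity (stride == block size) reads across a block boundary.
assumed_contiguous = read_block(1, BLOCK_ELEMS)
```

Here `strided` starts at element 8, the true start of block 1, while `assumed_contiguous` starts at element 6 and picks up padding from block 0. Honoring the passed stride avoids having to copy the view into a contiguous tensor first.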
[Feature] Update custom op build framework (#8146)

## Summary
- update the custom-op build and packaging framework
- align current custom-op integration under csrc while keeping the current main-branch operator implementations
- improve custom-op runtime environment bootstrap so single-op tests and offline service no longer depend on manually sourcing vendor env scripts
- keep the non-custom-op build path unchanged

## Notes
- this PR focuses on the custom-op build framework update and related runtime loading path changes
- operator implementations continue to follow the current main-branch codebase
- final validation for this PR state relies on the CI results on GitHub

## Validation
- remote A2 targeted verification completed for custom-op build/install and runtime bootstrap
- full CI is used as the final validation gate for this PR state
- vLLM version:
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.19.0

Signed-off-by: maoxx241 <maomaoyu870@gmail.com>

Last updated: 18 days ago

[Ops][BugFix] Reuse common tiling_base for custom ops (#9103)

## Summary
- remove duplicated host-side tiling_base and error_log headers from imported custom ops
- include the common tiling_base headers directly from affected host tiling files
- keep CeilDiv/CeilAlign in the common host tiling utility and expose them through the common error_log compatibility header
- keep kernel-side local error_log headers untouched
- include csrc changes in the 310P light-test tracker and map ascend310p builds to arch22
- update CANN 9.0 Ascend950 SOC naming from ascend910_95/ASCEND910_95 to ascend950/ASCEND950

## Validation
- `git diff --check`
- `git diff --cached --check`
- `bash -n csrc/build.sh`
- `bash -n csrc/build_aclnn.sh`
- `python -m py_compile csrc/cmake/scripts/util/const_var.py csrc/cmake/scripts/util/opdesc_parser.py csrc/scripts/util/const_var.py`
- `rg -n "ASCEND910_95|ascend910_95|Ascend910_9599" .`

Fixes the custom-op build issues seen in the PR #9066 CI logs and avoids per-operator tiling_base copies.

- vLLM version: v0.20.1
- vLLM main: https://github.com/vllm-project/vllm/commit/c7aa186d67b6f051680831418e957c67f34ba7a2

Signed-off-by: maoxx241 <maomaoyu870@gmail.com>

Last updated: 16 days ago

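For readers unfamiliar with the CeilDiv/CeilAlign helpers that #9103 keeps in the common host tiling utility, the arithmetic these names conventionally stand for can be sketched as below. This is a Python illustration under that assumption, not the C++ helpers themselves: the listing does not show their signatures or how they treat a zero divisor, so `ceil_div`, `ceil_align`, and the `ValueError` check are choices made for this sketch.

```python
def ceil_div(value, divisor):
    """Integer division rounded up, e.g. how many tiles cover `value` items."""
    if divisor <= 0:
        # Assumption of this sketch; the real helper may behave differently.
        raise ValueError("divisor must be positive")
    return (value + divisor - 1) // divisor


def ceil_align(value, alignment):
    """Round `value` up to the nearest multiple of `alignment`."""
    return ceil_div(value, alignment) * alignment


# Example: 10 elements split into tiles of 4 need 3 tiles,
# and the padded length is 12.
tiles = ceil_div(10, 4)
padded = ceil_align(10, 4)
```

Tiling code typically needs both forms: the tile count drives the loop bounds, while the aligned size determines how much buffer space to reserve.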
[Feature] Update custom op build framework (#8146)

## Summary
- update the custom-op build and packaging framework
- align current custom-op integration under csrc while keeping the current main-branch operator implementations
- improve custom-op runtime environment bootstrap so single-op tests and offline service no longer depend on manually sourcing vendor env scripts
- keep the non-custom-op build path unchanged

## Notes
- this PR focuses on the custom-op build framework update and related runtime loading path changes
- operator implementations continue to follow the current main-branch codebase
- final validation for this PR state relies on the CI results on GitHub

## Validation
- remote A2 targeted verification completed for custom-op build/install and runtime bootstrap
- full CI is used as the final validation gate for this PR state
- vLLM version:
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.19.0

Signed-off-by: maoxx241 <maomaoyu870@gmail.com>

18 days ago