vllm_ascend/tests/e2e/nightly/single_node · yilunh/vllm_ascend - AtomGit

File	Last commit	Last updated
models	[CI] add weekly case (#9380) ### What this PR does / why we need it? Adds a weekly test case that runs on a fixed schedule. ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? By running the test. - vLLM version: v0.20.2 - vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3 --------- Signed-off-by: chen-commits <1636718796@qq.com> Signed-off-by: chen <1636718796@qq.com>	8 days ago
ops	[Feature][Model] Switch DeepSeekV4 hc_pre to fused op (#9396) ### What this PR does / why we need it? This PR switches DeepSeekV4 `hc_pre` from the composite small-op path to the `aclnnHcPre`-backed `npu_hc_pre_v2` interface. It also aligns the runtime `hc_pre` torch binding contract with the CANN recipe binding, while leaving the Meta implementation as shape inference only so that torch compile does not trip over runtime-only checks. The runtime contract is: `x` must be 3D or 4D BF16; `hc_mult` / `hc` must be 4; `d` must be 4096 or 7168; `hc_fn` must be `[24, hc * d]`; `hc_scale` must be `[3]`; `hc_base` must be `[24]`; non-`x` tensors must be FP32. For Ascend950, `npu_hc_pre_v2` follows the CANN recipe's batch filter and falls back to the composite path when `bs > 512` and `bs` is not aligned to 8192. Other SoCs continue to use the fused path. Reference: https://gitcode.com/cann/cann-recipes-infer/blob/master/ops/ascendc/torch_ops_extension/custom_ops/csrc/npu_hc_pre.cpp - vLLM version: v0.20.2 - vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3 Signed-off-by: maoxx241 <maomaoyu870@gmail.com>	8 days ago
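
The input contract and batch filter described in the `ops` commit message can be sketched in plain Python. This is a minimal illustration, not the vllm_ascend implementation: `TensorSpec`, `check_hc_pre_inputs`, and `should_fall_back` are hypothetical names, `d` is assumed to be the last dimension of `x`, "aligned to 8192" is assumed to mean "a multiple of 8192", and `"Ascend910B"` in the usage note is only a placeholder for some other SoC.

```python
from dataclasses import dataclass
from typing import Tuple

HC = 4                      # required value of hc_mult / hc
ALLOWED_D = (4096, 7168)    # allowed hidden sizes per the commit message


@dataclass(frozen=True)
class TensorSpec:
    """Hypothetical stand-in for a tensor: only shape and dtype name."""
    shape: Tuple[int, ...]
    dtype: str  # "bf16" or "fp32"


def check_hc_pre_inputs(x, hc_fn, hc_scale, hc_base, hc_mult=HC):
    """Raise ValueError if the inputs violate the listed contract."""
    if x.dtype != "bf16" or len(x.shape) not in (3, 4):
        raise ValueError("x must be a 3D or 4D BF16 tensor")
    if hc_mult != HC:
        raise ValueError("hc_mult must be 4")
    d = x.shape[-1]  # assumption: d is the last dimension of x
    if d not in ALLOWED_D:
        raise ValueError("d must be 4096 or 7168")
    expected = {
        "hc_fn": (hc_fn, (24, hc_mult * d)),
        "hc_scale": (hc_scale, (3,)),
        "hc_base": (hc_base, (24,)),
    }
    for name, (spec, shape) in expected.items():
        if spec.shape != shape:
            raise ValueError(f"{name} must have shape {list(shape)}")
        if spec.dtype != "fp32":
            raise ValueError(f"{name} must be FP32")


def should_fall_back(soc: str, bs: int) -> bool:
    """Batch filter: only Ascend950 ever leaves the fused path."""
    # Assumption: "not aligned to 8192" means bs is not a multiple of 8192.
    return soc == "Ascend950" and bs > 512 and bs % 8192 != 0
```

Under these assumptions, `should_fall_back("Ascend950", 1024)` selects the composite path, while `should_fall_back("Ascend950", 8192)` and any call with another SoC name such as `"Ascend910B"` stay on the fused path.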