File | Last commit | Last updated
[CI] add weekly case (#9380)

### What this PR does / why we need it?
Adds a weekly test case, which CI runs on a fixed schedule.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By running the test.

- vLLM version: v0.20.2
- vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3

---------

Signed-off-by: chen-commits <1636718796@qq.com>
Signed-off-by: chen <1636718796@qq.com>

8 days ago
[Feature][Model] Switch DeepSeekV4 hc_pre to fused op (#9396)

### What this PR does / why we need it?
This PR switches DeepSeekV4 hc_pre from the composite small-op path to the aclnnHcPre-backed npu_hc_pre_v2 interface.

It also aligns the runtime contract of the hc_pre torch binding with the CANN recipe binding. The Meta implementation stays shape inference only, so torch.compile does not trip over runtime-only checks. The runtime contract is:

- x must be 3D or 4D BF16
- hc_mult / hc must be 4
- d must be 4096 or 7168
- hc_fn must be [24, hc * d]
- hc_scale must be [3]
- hc_base must be [24]
- non-x tensors must be FP32

For Ascend950, npu_hc_pre_v2 follows the CANN recipe's batch filter and falls back to the composite path when bs > 512 and bs is not aligned to 8192. Other SoCs continue to use the fused path.

Reference: https://gitcode.com/cann/cann-recipes-infer/blob/master/ops/ascendc/torch_ops_extension/custom_ops/csrc/npu_hc_pre.cpp

- vLLM version: v0.20.2
- vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3

Signed-off-by: maoxx241 <maomaoyu870@gmail.com>

8 days ago
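The hc_pre contract and the Ascend950 batch filter described in #9396 can be sketched in plain Python. This is an illustrative sketch only, not the code from the PR: the function names `hc_pre_violations` and `use_composite_path` are hypothetical, shapes and dtypes are passed as tuples and strings rather than tensors, the layout of x is assumed to be `[..., hc, d]`, and "not aligned to 8192" is assumed to mean "not a multiple of 8192".

```python
from typing import List, Sequence

HC = 4                    # required hc_mult / hc
ALLOWED_D = (4096, 7168)  # permitted values of d
ROWS = 24                 # leading size of hc_fn and length of hc_base


def hc_pre_violations(
    x_shape: Sequence[int],
    x_dtype: str,
    hc_fn_shape: Sequence[int],
    hc_scale_shape: Sequence[int],
    hc_base_shape: Sequence[int],
    aux_dtype: str,
) -> List[str]:
    """Return the contract violations for one call; an empty list means valid."""
    errors: List[str] = []
    if len(x_shape) not in (3, 4):
        # Without a 3D/4D x, hc and d cannot be read, so stop here.
        return ["x must be 3D or 4D"]
    if x_dtype != "bfloat16":
        errors.append("x must be BF16")
    hc, d = x_shape[-2], x_shape[-1]  # assumed layout: [..., hc, d]
    if hc != HC:
        errors.append("hc must be 4")
    if d not in ALLOWED_D:
        errors.append("d must be 4096 or 7168")
    if tuple(hc_fn_shape) != (ROWS, hc * d):
        errors.append("hc_fn must be [24, hc * d]")
    if tuple(hc_scale_shape) != (3,):
        errors.append("hc_scale must be [3]")
    if tuple(hc_base_shape) != (ROWS,):
        errors.append("hc_base must be [24]")
    if aux_dtype != "float32":
        errors.append("non-x tensors must be FP32")
    return errors


def use_composite_path(is_ascend950: bool, bs: int) -> bool:
    """Batch filter: only Ascend950 falls back, and only for bs > 512
    that is not a multiple of 8192 (assumed meaning of "aligned")."""
    return is_ascend950 and bs > 512 and bs % 8192 != 0
```

Keeping checks like these in the runtime path only, and out of the Meta (shape inference) path, matches the split the PR description gives for the torch binding.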