File | Last commit | Last updated
[Bugfix][LoRA] Fix the bug when runs Qwen3-Reranker-0.6B with LoRA. (#7156)

### What this PR does / why we need it?

Fix the error reported while initializing the qwen3-reranker-0.6b model with --enable-lora, and add a test case to verify the fix.

- vLLM version: v0.17.0
- vLLM main: https://github.com/vllm-project/vllm/commit/4034c3d32e30d01639459edd3ab486f56993876d

Signed-off-by: paulyu12 <507435917@qq.com>
Co-authored-by: Mengqing Cao <cmq0113@163.com>

Last updated: 2 months ago
[Model] Support pooling models (#3122)

### What this PR does / why we need it?

Support pooling models (like bge-reranker-v2-m3) in vllm-ascend. This PR covers the three embed model types (cls_token, mean_token, lasttoken).

After this [commit](https://github.com/vllm-project/vllm/commit/17373dcd93ca60554d72cef4e159e70abbfd15af), vLLM supports adapting pooling models on the v1 engine. This PR includes the corresponding adaptations on the vllm-ascend side.

Fixes #1960

- vLLM version: v0.12.0
- vLLM main: https://github.com/vllm-project/vllm/commit/ad32e3e19ccf0526cb6744a5fed09a138a5fb2f9

Signed-off-by: lianyibo <lianyibo1@kunlunit.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
Co-authored-by: MengqingCao <cmq0113@163.com>

Last updated: 5 months ago
[CI]Style: Convert test/ to ruff format(Batch #2) (#6739)

### What this PR does / why we need it?

| File Path |
| :--- |
| tests/e2e/multicard/4-cards/long_sequence/test_accuracy.py |
| tests/e2e/multicard/4-cards/long_sequence/test_basic.py |
| tests/e2e/multicard/4-cards/long_sequence/test_chunked_prefill_cp.py |
| tests/e2e/multicard/4-cards/long_sequence/test_mtp.py |
| tests/e2e/multicard/4-cards/long_sequence/test_prefix_caching_cp.py |
| tests/e2e/multicard/4-cards/spec_decode/test_mtp_qwen3_next.py |
| tests/e2e/multicard/4-cards/test_data_parallel_tp2.py |
| tests/e2e/multicard/4-cards/test_kimi_k2.py |
| tests/e2e/multicard/4-cards/test_qwen3_next.py |
| tests/e2e/nightly/multi_node/scripts/multi_node_config.py |
| tests/e2e/nightly/multi_node/scripts/test_multi_node.py |
| tests/e2e/nightly/multi_node/scripts/utils.py |
| tests/e2e/singlecard/pooling/test_classification.py |
| tests/e2e/singlecard/pooling/test_embedding.py |
| tests/e2e/singlecard/pooling/test_scoring.py |
| tests/e2e/singlecard/spec_decode/test_mtp_eagle_correctness.py |
| tests/e2e/singlecard/spec_decode/test_v1_spec_decode.py |
| tests/e2e/utils.py |
| tests/e2e/vllm_interface/singlecard/test_sampler.py |
| tests/e2e/weekly/single_node/models/test_qwen3_30b_acc.py |

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

- vLLM version: v0.15.0
- vLLM main: https://github.com/vllm-project/vllm/commit/9562912cead1f11e8540fb91306c5cbda66f0007

Signed-off-by: MrZ20 <2609716663@qq.com>

Last updated: 1 month ago