File / Last commit / Last updated
[CI] Refactor light test cases and update test coverage (#9059)

### What this PR does / why we need it?

Main changes:

- Adds dedicated light E2E test files for:
  - singlecard basic light coverage, including Qwen3 dense, embedding, VLM, and W8A8 Eagle3 cases
  - 2-card light coverage, including Qwen3 MoE TP2/EP and Qwen3-VL PP2 multimodal cases
  - 4-card light coverage, including DeepSeek W8A8 TP/PP/EP/EPLB and PD disaggregation cases
- Adds RemotePDServer and DisaggPDProxy helpers for ordinary PD disaggregation E2E tests.
  - Supports launching multiple vLLM serve processes in one test.
  - Assigns ASCEND_RT_VISIBLE_DEVICES based on each server's tp * dp requirement.
  - Launches the disaggregated prefill proxy and validates requests through the proxy endpoint.
- Updates E2E CI config to run the new light suites.
  - Replaces old light suite entries with the new test_light.py cases.
  - Adds a 4-card light CI job for the new 4-card light coverage.
- Increase patch_qwen3_vl_moe_pp_layer_range until the commit of the vllm code includes:
  - https://github.com/vllm-project/vllm/commit/cee6751e548357478a9943cae5786062b7b95127

| Feature | Qwen3-0.6B | Qwen3-8B-W8A8 | Qwen3-Embedding | Qwen3.5-0.8B | Qwen3-30B | Qwen3-VL-30B | DeepSeek-V3.2-W8A8-Pruning | DeepSeek-V3.2-W8A8-Pruning |
| -- | -- | -- | -- | -- | -- | -- | -- | -- |
| Card count | 1 | 1 | 1 | 1 | 2 | 2 | 4 | 4 |
| Dense | ✅ | ✅ |  |  |  |  |  |  |
| MoE |  |  |  |  | ✅ |  | ✅ | ✅ |
| Embedding |  |  | ✅ |  |  |  |  |  |
| Mamba/SSM |  |  |  | ✅ |  |  |  |  |
| Multimodal Reasoning |  |  |  | ✅ |  | ✅ |  |  |
| TP |  |  |  |  | ✅ |  | ✅ |  |
| PP |  |  |  |  |  | ✅ | ✅ |  |
| EP |  |  |  |  | ✅ |  | ✅ |  |
| EPLB |  |  |  |  | ✅ |  |  |  |
| Full Graph |  | ✅ |  | ✅ | ✅ | ✅ | ✅ | ✅ |
| PIECEWISE Graph | ✅ |  | ✅ |  |  |  |  |  |
| PD disaggregation |  |  |  |  |  |  |  | ✅ |
| W8A8 |  | ✅ |  |  |  |  | ✅ | ✅ |
| MTP |  |  |  | ✅ |  |  |  |  |
| Eagle-3 |  | ✅ |  |  |  |  |  |  |
| SFA/DSA |  |  |  |  |  |  |  | ✅ |

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Updated E2E-Light cases with A2/A3 passed.

- vLLM version: v0.20.1
- vLLM main: https://github.com/vllm-project/vllm/commit/c7aa186d67b6f051680831418e957c67f34ba7a2

---------

Signed-off-by: MrZ20 <2609716663@qq.com>

Last updated: 11 days ago
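The device assignment described above (each server gets `tp * dp` cards through `ASCEND_RT_VISIBLE_DEVICES`) could look roughly like the sketch below. This is an illustration only, not the PR's code: `ServerSpec`, `assign_visible_devices`, and the contiguous first-fit allocation are all assumptions, and the real RemotePDServer helper may allocate cards differently.

```python
from dataclasses import dataclass


@dataclass
class ServerSpec:
    """Hypothetical description of one vLLM serve process (not from the PR)."""
    name: str
    tp: int = 1  # tensor-parallel size
    dp: int = 1  # data-parallel size

    @property
    def cards_needed(self) -> int:
        # Each server needs tp * dp cards, as stated in the PR description.
        return self.tp * self.dp


def assign_visible_devices(specs, total_cards):
    """Give each server a contiguous, non-overlapping range of card ids.

    Returns a dict mapping server name to the environment variables that
    would be passed to that server's process.
    """
    env_by_server = {}
    next_free = 0
    for spec in specs:
        end = next_free + spec.cards_needed
        if end > total_cards:
            raise ValueError(
                f"{spec.name} needs {spec.cards_needed} cards, "
                f"but only {total_cards - next_free} are left"
            )
        device_ids = ",".join(str(i) for i in range(next_free, end))
        env_by_server[spec.name] = {"ASCEND_RT_VISIBLE_DEVICES": device_ids}
        next_free = end
    return env_by_server


if __name__ == "__main__":
    # Assumed 4-card layout: one prefill server and one decode server.
    servers = [ServerSpec("prefill", tp=2), ServerSpec("decode", tp=1, dp=2)]
    for name, env in assign_visible_devices(servers, total_cards=4).items():
        print(name, env)
```

Each resulting dict would be merged into the environment of the matching `vllm serve` subprocess, so that servers started in the same test do not compete for the same cards.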