E2E Test Workflow Guide

This document explains how to manage and extend the E2E test suite for vllm-ascend: how to add new test cases, how the automatic partitioning mechanism works, and how to run suites locally.

1. Adding a New Test Case

All E2E test cases are defined and managed in the .github/workflows/scripts/config.yaml file.

Steps

1. Prepare the Test Script: Ensure your test script (.py file) is placed in the appropriate location under the tests/e2e/ directory (e.g., tests/e2e/singlecard/ or tests/e2e/multicard/).
2. Modify config.yaml: Open .github/workflows/scripts/config.yaml and locate the corresponding test suite (e.g., e2e-singlecard or e2e-multicard-2-cards).
3. Add Configuration Entry: Add a new entry under the corresponding list. Each entry contains the following fields:
   - name: The relative path to the test file. If you only need to run a specific test function within the file, use :: as a separator, e.g., path/to/test.py::test_func.
   - estimated_time: The estimated time (in seconds) required to run the test. This field is crucial as it is used for automatic load balancing (partitioning).
   - is_skipped (Optional): If set to true, the test will be skipped.

Example

Suppose you want to add a new test named tests/e2e/singlecard/test_new_feature.py with an estimated runtime of 120 seconds:

suites:
  e2e-singlecard:
    # ... other existing tests ...
    - name: tests/e2e/singlecard/test_new_feature.py
      estimated_time: 120

To add a specific test function:

    - name: tests/e2e/singlecard/test_new_feature.py::test_specific_case
      estimated_time: 60
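
As a rough illustration of how the three fields could be interpreted once config.yaml has been loaded, the sketch below works on plain dictionaries. The names `SuiteEntry` and `parse_entries` are hypothetical and are not taken from run_suite.py; the real script may represent entries differently.

```python
from typing import NamedTuple, Optional


class SuiteEntry(NamedTuple):
    """Hypothetical in-memory form of one config.yaml entry."""
    path: str                 # test file, relative to the repository root
    function: Optional[str]   # name after "::", or None to run the whole file
    estimated_time: float     # estimated runtime in seconds


def parse_entries(entries):
    """Convert loaded config entries (dicts) into SuiteEntry tuples.

    Entries with is_skipped set to true are dropped, mirroring the
    documented meaning of that field.
    """
    parsed = []
    for entry in entries:
        if entry.get("is_skipped", False):
            continue
        # "path/to/test.py::test_func" -> ("path/to/test.py", "::", "test_func")
        path, separator, function = entry["name"].partition("::")
        parsed.append(
            SuiteEntry(path, function if separator else None,
                       float(entry["estimated_time"]))
        )
    return parsed


# The two entries from the example above, plus a made-up skipped entry.
example_entries = [
    {"name": "tests/e2e/singlecard/test_new_feature.py", "estimated_time": 120},
    {"name": "tests/e2e/singlecard/test_new_feature.py::test_specific_case",
     "estimated_time": 60},
    {"name": "tests/e2e/singlecard/test_hypothetical.py",
     "estimated_time": 30, "is_skipped": True},
]
parsed_example = parse_entries(example_entries)
```

Here `parsed_example` holds two entries: the first targets the whole file, the second targets only `test_specific_case`, and the skipped entry is omitted.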

2. Automatic Partitioning Mechanism

To speed up CI execution, we support splitting large test suites into multiple parallel Jobs (partitions). The partitioning logic is primarily implemented in the auto_partition function in .github/workflows/scripts/run_suite.py.

Principle

The partitioning algorithm uses a Greedy Approach to achieve load balancing, aiming to make the total estimated runtime of each partition as equal as possible.

1. Read Configuration: The script reads all non-skipped test cases and their estimated_time from config.yaml.
2. Sort (Balanced Assignment): Test cases are sorted by estimated_time in descending order. Distributing the heaviest tasks first generally yields a well-balanced split, although a greedy assignment does not guarantee the optimal one.
3. Assign: Iterating through the sorted test cases, each case is assigned to the partition (Bucket) with the current minimum total time.
4. Re-sort (Fast Feedback): Within each partition, tests are re-sorted by estimated_time in ascending order. This allows the CI to cover as many test cases as possible in the early stages.

TIP: If you need to prioritize a new test case, you can temporarily set its estimated_time to 0 so that it runs first within its assigned partition, then update it to the actual value later.
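
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in run_suite.py: the function name `auto_partition_sketch` is hypothetical, test cases are assumed to be dictionaries with `name`, `estimated_time`, and an optional `is_skipped` key, and ties between equally loaded partitions are broken by the lowest index.

```python
def auto_partition_sketch(files, rank, size):
    """Return the test cases for partition `rank` (0 to size-1) of `size`."""
    # Step 1: keep only non-skipped test cases.
    active = [f for f in files if not f.get("is_skipped", False)]

    # Step 2: sort by estimated_time, heaviest first.
    ordered = sorted(active, key=lambda f: f["estimated_time"], reverse=True)

    # Step 3: assign each case to the partition with the smallest total so far.
    buckets = [[] for _ in range(size)]
    totals = [0.0] * size
    for f in ordered:
        target = totals.index(min(totals))  # assumption: ties go to lowest index
        buckets[target].append(f)
        totals[target] += f["estimated_time"]

    # Step 4: within the requested partition, run the shortest tests first.
    return sorted(buckets[rank], key=lambda f: f["estimated_time"])


# Made-up test cases, used only to show the assignment.
sample_files = [
    {"name": "a.py", "estimated_time": 100},
    {"name": "b.py", "estimated_time": 60},
    {"name": "c.py", "estimated_time": 50},
    {"name": "d.py", "estimated_time": 10},
]
partition_0 = [f["name"] for f in auto_partition_sketch(sample_files, 0, 2)]
partition_1 = [f["name"] for f in auto_partition_sketch(sample_files, 1, 2)]
```

With these sample values, partition 0 receives d.py and a.py (110 seconds in total) and partition 1 receives c.py and b.py (also 110 seconds), each listed shortest first.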

How to Modify Partitioning Logic

If you need to adjust the partitioning strategy, please modify the .github/workflows/scripts/run_suite.py file.

Algorithm Location: auto_partition function.
Input Parameters:
- files: List of test files (including estimated_time).
- rank: Index of the current partition (0 to size-1).
- size: Total number of partitions.

Invocation: CI workflows (e.g., .github/workflows/_e2e_test.yaml) call the script via command-line arguments:

python3 .github/workflows/scripts/run_suite.py --suite <suite_name> --auto-partition-id <index> --auto-partition-size <total_count>
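
For readers who want to see how these three flags might be parsed, here is a minimal argparse sketch. Only the flag names come from the command above; the helper names, the help texts, and the validation rules (both partition flags must be given together, and the id must lie between 0 and size-1) are assumptions and may not match what run_suite.py actually does.

```python
import argparse


def build_parser():
    """Build a parser for the three documented flags (sketch only)."""
    parser = argparse.ArgumentParser(
        description="Run one E2E suite, optionally as a single partition.")
    parser.add_argument("--suite", required=True,
                        help="suite name from config.yaml, e.g. e2e-singlecard")
    parser.add_argument("--auto-partition-id", type=int, default=None,
                        help="index of the partition to run (0 to size-1)")
    parser.add_argument("--auto-partition-size", type=int, default=None,
                        help="total number of partitions")
    return parser


def parse_args(argv):
    """Parse argv and apply the assumed consistency checks."""
    parser = build_parser()
    args = parser.parse_args(argv)
    has_id = args.auto_partition_id is not None
    has_size = args.auto_partition_size is not None
    if has_id != has_size:
        # Assumption: the two partition flags only make sense together.
        parser.error("--auto-partition-id and --auto-partition-size "
                     "must be given together")
    if has_size and not 0 <= args.auto_partition_id < args.auto_partition_size:
        parser.error("--auto-partition-id must be in the range 0 to size-1")
    return args


args = parse_args(["--suite", "e2e-singlecard",
                   "--auto-partition-id", "0",
                   "--auto-partition-size", "2"])
```

argparse converts the dashes in the flag names to underscores, so the parsed values are available as `args.suite`, `args.auto_partition_id`, and `args.auto_partition_size`.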

Notes

Accurate Estimated Time: To achieve the best load balancing, please provide an accurate estimated_time in config.yaml. If a new test is very time-consuming but its estimated time is set too low, the partition it lands in may time out.
Number of Partitions: The number of partitions (auto-partition-size) is typically defined in the strategy.matrix of the GitHub Actions workflow definition file (e.g., _e2e_test.yaml).

3. Running Tests Locally

You can use the run_suite.py script to run test suites locally:

# Run the full e2e-singlecard suite
python3 .github/workflows/scripts/run_suite.py --suite e2e-singlecard

# Simulate partitioned execution (e.g., partition 0 of 2)
python3 .github/workflows/scripts/run_suite.py --suite e2e-singlecard --auto-partition-id 0 --auto-partition-size 2
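
To reproduce every partition of a suite locally, one command per partition index is needed. The small helper below only builds those commands and prints them; it does not run anything. The function name `partition_commands` is hypothetical, and the script path and flags are copied from the commands above.

```python
import shlex

SCRIPT = ".github/workflows/scripts/run_suite.py"


def partition_commands(suite, size):
    """Return one argv list per partition index, from 0 to size-1."""
    return [
        ["python3", SCRIPT,
         "--suite", suite,
         "--auto-partition-id", str(index),
         "--auto-partition-size", str(size)]
        for index in range(size)
    ]


if __name__ == "__main__":
    # Print the commands so they can be copied into a shell.
    for command in partition_commands("e2e-singlecard", 2):
        print(shlex.join(command))
```

Each returned list can also be passed directly to `subprocess.run` from the repository root if you prefer to execute the partitions from Python.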