文件最后提交记录最后更新时间
[Doc][skill] Rework main2main skill and add deterministic automation scripts (#9232) ### What this PR does / why we need it? This PR reworks the main2main skill into a structured, script-driven pipeline for keeping vLLM-Ascend aligned with upstream vLLM main. It follows the direction in RFC #7074: make main2main caller-agnostic, script-deterministic, incremental, CI-gated, and bounded. The skill now separates high-level agent guidance from deterministic helper scripts and detailed reference material. New structure: ```text main2main/ ├── SKILL.md │ └── Compact entrypoint: guardrails, workflow overview, and pre-completion checklist. ├── scripts/ │ ├── detect_commits.py │ │ └── Initialize workspace and detect base/target vLLM commits. │ ├── plan_steps.py │ │ └── Deterministic step planner for bounded upstream commit ranges. │ ├── step-planner.yaml │ │ └── Classification, weight, and budget configuration for step planning. │ ├── check_and_commit.py │ │ └── Validate commit paths and create signed commits only after CI passes. │ └── run_main2main_ci.py │ └── Run CI and extract structured log summaries. └── reference/ ├── adapt-guide.md │ └── Detailed method for the Adapt phase. ├── diagnosis-guide.md │ └── Detailed method for the Fix-CI loop. ├── final-summary.md │ └── Final reviewer-facing summary template. └── error-pattern-examples.md └── Concrete examples for common CI error fixes. ``` Key workflow changes: - Splits upstream vLLM commit drift into bounded steps before adaptation. - Updates the pinned vLLM commit reference per step. - Requires CI verification for every step, including no-op adapt steps. - Uses structured CI log summaries instead of reading raw logs into agent context. - Commits only after CI passes or only environment flakes remain. - Stops cleanly on bounded fix-loop failure by saving patch and failure summary instead of silently skipping failed steps. - This reduces manual main2main maintenance by making repeatable operations deterministic and keeping the agent focused on the hard parts: understanding upstream API changes, adapting vLLM-Ascend code, and diagnosing CI failures. Reference: [#7074](https://github.com/vllm-project/vllm-ascend/issues/7074) ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.20.2 - vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3 --------- Signed-off-by: Meihan-chen <zr010426ztt@outlook.com>9 天前
[Doc][skill] Rework main2main skill and add deterministic automation scripts (#9232) ### What this PR does / why we need it? This PR reworks the main2main skill into a structured, script-driven pipeline for keeping vLLM-Ascend aligned with upstream vLLM main. It follows the direction in RFC #7074: make main2main caller-agnostic, script-deterministic, incremental, CI-gated, and bounded. The skill now separates high-level agent guidance from deterministic helper scripts and detailed reference material. New structure: ```text main2main/ ├── SKILL.md │ └── Compact entrypoint: guardrails, workflow overview, and pre-completion checklist. ├── scripts/ │ ├── detect_commits.py │ │ └── Initialize workspace and detect base/target vLLM commits. │ ├── plan_steps.py │ │ └── Deterministic step planner for bounded upstream commit ranges. │ ├── step-planner.yaml │ │ └── Classification, weight, and budget configuration for step planning. │ ├── check_and_commit.py │ │ └── Validate commit paths and create signed commits only after CI passes. │ └── run_main2main_ci.py │ └── Run CI and extract structured log summaries. └── reference/ ├── adapt-guide.md │ └── Detailed method for the Adapt phase. ├── diagnosis-guide.md │ └── Detailed method for the Fix-CI loop. ├── final-summary.md │ └── Final reviewer-facing summary template. └── error-pattern-examples.md └── Concrete examples for common CI error fixes. ``` Key workflow changes: - Splits upstream vLLM commit drift into bounded steps before adaptation. - Updates the pinned vLLM commit reference per step. - Requires CI verification for every step, including no-op adapt steps. - Uses structured CI log summaries instead of reading raw logs into agent context. - Commits only after CI passes or only environment flakes remain. - Stops cleanly on bounded fix-loop failure by saving patch and failure summary instead of silently skipping failed steps. - This reduces manual main2main maintenance by making repeatable operations deterministic and keeping the agent focused on the hard parts: understanding upstream API changes, adapting vLLM-Ascend code, and diagnosing CI failures. Reference: [#7074](https://github.com/vllm-project/vllm-ascend/issues/7074) ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? - vLLM version: v0.20.2 - vLLM main: https://github.com/vllm-project/vllm/commit/0d4d334eaa583b9c09aa4eb7538c22db99fd84b3 --------- Signed-off-by: Meihan-chen <zr010426ztt@outlook.com>9 天前
README.md

vLLM Ascend skills

This directory contains the skills for vLLM Ascend.

Note: Please copy the skills directory .agents/skills to .claude/skills if you want to use the skills in this repo with Claude code.

Table of Contents

vLLM Ascend Model Adapter Skill

Adapt and debug models for vLLM on Ascend NPU — covering both already-supported architectures and new models not yet registered in vLLM.

What it does

This skill guides an AI agent through a deterministic workflow to:

  1. Triage a model checkpoint (architecture, quant type, multimodal capability).
  2. Implement minimal code changes in /vllm-workspace/vllm and /vllm-workspace/vllm-ascend.
  3. Validate via a two-stage gate (dummy fast gate + real-weight mandatory gate).
  4. Deliver one signed commit with code, test config, and tutorial doc.

File layout

File Purpose
SKILL.md Skill definition, constraints, and execution playbook
references/workflow-checklist.md Step-by-step commands and templates
references/troubleshooting.md Symptom-action pairs for common failures
references/fp8-on-npu-lessons.md FP8 checkpoint handling on Ascend
references/multimodal-ep-aclgraph-lessons.md VL, EP, and ACLGraph patterns
references/deliverables.md Required outputs and commit discipline

Quick start

  1. Open a conversation with the AI agent inside the vllm-ascend dev container.
  2. Invoke the skill (e.g. /vllm-ascend-model-adapter).
  3. Provide the model path (default /models/<model-name>) and the originating issue number.
  4. The agent follows the playbook in SKILL.md and produces a ready-to-merge commit.

Key constraints

  • Never upgrade transformers.
  • Start vllm serve from /workspace (direct command, port 8000).
  • Dummy-only evidence is not sufficient — real-weight validation is mandatory.
  • Final delivery is exactly one signed commit in the current repo.

Two-stage validation

  • Stage A (dummy): fast architecture / operator / API path check with --load-format dummy.
  • Stage B (real): real-weight loading, fp8/quant path, KV sharding, runtime stability.

Both stages require request-level verification (/v1/models + at least one chat request), not just startup success.

vLLM Ascend main2main Skill

Adapt vLLM-Ascend to track upstream vLLM main branch changes incrementally: detect commit drift, plan steps, adapt code, run CI, and commit verified changes.

What it does

When upstream vLLM changes — function signatures, config fields, module paths, base class methods — vllm-ascend breaks. This skill:

  1. Detects the commit drift between the currently adapted vLLM commit and a target upstream commit.
  2. Splits the commit range into manageable steps.
  3. For each step: adapts vllm-ascend code, updates commit references, runs CI, and commits only verified changes.
  4. If a step fails CI, diagnoses the failures and applies fixes in rounds until CI passes or a stop condition is met.

The two hardest parts — figuring out what to adapt and diagnosing CI failures — are guided by reference playbooks. The rest (commit detection, step planning, CI execution, committing) is handled by scripts.

The skill itself does not create PRs; it produces committed, CI-verified changes in the working tree and a final reviewer-facing summary.

File layout

File Purpose
SKILL.md Skill definition, guardrails, and execution playbook
reference/adapt-guide.md Step-by-step adaptation workflow with file mapping tables
reference/diagnosis-guide.md CI failure diagnosis workflow with error→fix mappings
reference/error-pattern-examples.md Concrete upstream-change-to-fix pattern examples
reference/final-summary.md Template for the final reviewer-facing summary
scripts/detect_commits.py Detect base and target commits in both repos
scripts/plan_steps.py Split the commit range into manageable steps
scripts/per_ci_check.py Pre-CI checks: version guards, version string consistency, temp files
scripts/run_main2main_ci.py Run CI suites and produce structured result summaries
scripts/check_and_commit.py Verify preconditions and commit changes
scripts/update_commit_reference.py Update the vLLM commit reference in vllm-ascend

Quick start

  1. Open a conversation with the AI agent inside the vllm-ascend dev container.
  2. Invoke the skill (e.g. /main2main).
  3. The agent detects drift, plans steps, and iterates through each step:
    • Adapt code following reference/adapt-guide.md
    • Update commit references via scripts/update_commit_reference.py
    • Run CI via scripts/run_main2main_ci.py
    • Commit via scripts/check_and_commit.py
  4. Output: committed changes per step, per-step summaries in /tmp/main2main/steps/, and a final summary.

Key guidance

  • Only modify vllm-ascend, never the upstream vLLM repo.
  • Intermediate files go in /tmp/main2main/ — never in the repo.
  • Every step runs CI, including no-op adapt steps where no code changed.
  • Use vllm_version_is() guards (not hasattr() or try/except) for code that must work with both the release version and upstream main.
  • Never read raw CI logs directly — use the round-N-summary.json produced by the CI wrapper.
  • Do not stop because a run is long; the only valid stop reasons are in reference/diagnosis-guide.md Step 4.

vLLM Ascend Release Note Writer Skill

You just need to say: Please help me write a 0.13.0 release note based on commits from v0.11.0 and releases/v0.13.0

What it does

This skill guides you through a structured workflow to:

  1. Fetch commits between two versions using the provided script.
  2. Analyze and categorize each commit in a CSV workspace.
  3. Draft highlights and write polished release notes.
  4. Generate release notes organized by category (Features, Hardware Support, Performance, Dependencies, etc.).

File layout

File Purpose
SKILL.md Skill definition, workflow, and writing guidelines
references/ref-past-release-notes-highlight.md Style and category reference for release notes
scripts/fetch_commits-optimize.py Script to fetch commits between versions

Quick start

  1. Open a conversation with the AI agent.
  2. Invoke the skill (e.g. /vllm-ascend-release-note-writer).
  3. Follow the workflow steps:
    • Fetch commits between versions
    • Analyze commits in CSV format
    • Draft and edit highlights
  4. Output files are saved to vllm-ascend-release-note/output/$version

Key guidelines

  • Use one-level headings (###) for sections in a specific order: Highlights, Features, Hardware and Operator Support, Performance, Dependencies, Deprecation & Breaking Changes, Documentation, Others.
  • Focus on user-facing impact and include context for practical usage.
  • Verify details by checking linked PRs (use GitHub API for descriptions if needed).
  • Keep notes concise and avoid unnecessary technical details.