
GRPO User Guide

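Introduction

This guide uses the MindSpeed MM repository's reproduction of the Group Relative Policy Optimization (GRPO) post-training method as an example to help users get started quickly. Support for additional models is planned.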
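
For readers new to GRPO, the "group relative" part can be illustrated with a minimal sketch: several responses are sampled for the same prompt (the `n_samples` setting in the performance table), and each response's reward is normalized against the mean and standard deviation of its own group. The function name `group_relative_advantages`, the `eps` constant, the use of the population standard deviation, and the reward values are assumptions made for illustration; this is not the code used in MindSpeed MM or verl.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize the rewards of one group of responses to the same prompt.

    Hypothetical helper for illustration only. It uses the population
    standard deviation; real implementations may use a different estimator.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    # eps avoids division by zero when every response gets the same reward
    return [(r - mu) / (sigma + eps) for r in rewards]


# Made-up rewards for one prompt with n_samples = 5 responses
rewards = [1.0, 0.0, 0.0, 1.0, 1.0]
advantages = group_relative_advantages(rewards)
```

Responses that score above their group's average receive a positive advantage and the rest a negative one, so no separate value model is needed to provide a baseline.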

Supported Models

Performance Data

| Model | Dataset | Machine model | GBS | n_samples | max_prompt_length | max_response_length | max_num_batched_tokens | End-to-end TPS |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Qwen2.5VL-7B | geo3k | Atlas 200T A2 Box16 | 512 | 5 | 1024 | 2048 | 8192 | 142.42 |
| Qwen2.5VL-32B | geo3k | Atlas 200T A2 Box16 | 256 | 5 | 1024 | 2048 | 8192 | 88.32 |
| Qwen2.5VL-7B | Non-public dataset | Atlas 200T A2 Box16 | 16 | 4 | 18,000 | 512 | 19,000 | 428.38 |
| Qwen2.5VL-32B | Non-public dataset | Atlas 200T A2 Box16 | 32 | 8 | 18,000 | 512 | 20,000 | 99.65 |
| Qwen3VL-8B | geo3k | Atlas 200T A2 Box16 | 512 | 5 | 1024 | 2048 | 8192 | 429 |
| Qwen3VL-8B | geo3k | Atlas 200T A3 Box8 | 512 | 5 | 1024 | 2048 | 8192 | 364*2 |
| Qwen3VL-30B | geo3k | Atlas 200T A2 Box16 | 64 | 5 | 1024 | 2048 | 8192 | 21.76 |
| Qwen3VL-30B | geo3k | Atlas 200T A3 Box8 | 64 | 5 | 1024 | 2048 | 8192 | 19.1*2 |
| Qwen3VL-30B | geo3k | Atlas 200T A2 Box16 | 64 | 5 | 16384 | 1024 | 18000 | 275 |
| Qwen3VL-30B | geo3k | Atlas 200T A3 Box8 | 64 | 5 | 16384 | 1024 | 18000 | 267*2 |
Note: Performance results on the non-public dataset are for reference only.