| File | Last commit | Last updated |
|---|---|---|
| | [Feature] Add the muon optimizer and adapt it for FSDP2. Co-authored-by: hanyyy<hanyue42@huawei.com>. Merge request Ascend/MindSpeed-MM!1964 (master into master; created by vasileone, committed by hanyyy, merged by ascend-robot). Motivation: add the Muon optimizer and make it compatible with FSDP2. Modification: added a Muon optimizer implementation with FSDP2 compatibility; added patches for Megatron-core to enable zero-code-change injection; fixed the Muon optimizer failing under the FSDP2 context (sharded parameter handling, gradient sync, and step behavior). | 6 months ago |
| | [Feature] add downcast to bf16 for fsdp2 and remove --bf16. Co-authored-by: 林明哲<linmingzhe3@huawei.com>. Merge request Ascend/MindSpeed-MM!1914 (fix1208 into master; created by LinMingZhe, committed by 林明哲, merged by ascend-robot). Motivation: remove the redundant --bf16 flag to cut memory usage and boost performance; introduce a weight-loading downcast to bf16 that keeps computational precision identical to the previous --bf16 behavior. Modification: updated the wan2.2 and vace configs; modified the dcp patch and imported it into pretrain_sora.py; added --downcast-to-bf16 to args and documented it in fsdp2.md. | 6 months ago |
| | [Modify] Specify the value of sharding_size for wan2.2. Co-authored-by: AZe_404<wangze62@h-partners.com>. Merge request Ascend/MindSpeed-MM!1873 (fsdp2_sharding_size into master; created by AZe_404, committed by AZe_404, merged by ascend-robot). Motivation and modification: specify the value of sharding_size for wan2.2. | 6 months ago |