File    Last commit    Last updated
!2254 refactor: fused_rope, level1(l1) Merge pull request !2254 from 邓佳/refactor_rope 1 year ago
!2726 [mindspore][master] adapt to megatron0.12.1 for mindspeed-mm Merge pull request !2726 from wangshuangling/master 9 months ago
[auto settings] fix auto settings feature. Merge request Ascend/MindSpeed!2926 (master into master), created by mhlinoer, merged by ascend-robot. 7 months ago
perf(verl ckpt): ckpt load and save acceleration. Merge request Ascend/MindSpeed!3074 (verl_load_and_save_ckpt into master), created by lmztju, merged by ascend-robot. Test scenario 1: verl with the Megatron backend, dapo-qwen3-30b, loading and saving checkpoints on two 910A3 machines with local storage. ![image.png](https://raw.gitcode.com/user-images/assets/7404741/cea9c052-2550-4d2c-9dd7-bcfe0cb33d99/image.png 'image.png') 5 months ago
fix: fix the initialization error of compress-optimizer. Merge request Ascend/MindSpeed!3319 (master into master), created by NingGuangyou, merged by ascend-robot. 2 months ago
!2650 Add TE class and args modification to support verl Merge pull request !2650 from Jializheng/args 10 months ago
fix: te ulysses. Merge request Ascend/MindSpeed!3349 (fix_te_ulysses into master), created by clc2025, merged by ascend-robot. 2 months ago
!2738 [feat.] Support the Megatron Custom FSDP feature Merge pull request !2738 from yuqi/custom_fsdp 9 months ago
!2734 feat: balanced moe Merge pull request !2734 from 邓佳/core_r0.12.1_bm_v2 8 months ago
Bugfix: do not set enable_gloo_process_groups when disable-gloo-groups is not enabled. Merge request Ascend/MindSpeed!2887 (gloo_fix2 into master), created by fishhhqi, merged by ascend-robot. 7 months ago
!2777 [Bugfix] Fix dist_train Merge pull request !2777 from yuqi/dist_train_fix 9 months ago
[Bugfix] Fix Megatron checkpoint saving and loading compatibility for the torch_dcp format. Merge request Ascend/MindSpeed!3077 (fix1202 into master), created by LinMingZhe, merged by ascend-robot. 5 months ago
fix: NPU datadump level: L0 & mix. Merge request Ascend/MindSpeed!3351 (master into master), created by yulelanmei, merged by ascend-robot. Why: --npu-datadump did not yet support the L0 and mix dump levels, so the feature needed to be extended. User-facing change: N/A. Tested by enabling --npu-datadump with level set to L0 or mix in config.json. Test record: https://wiki.huawei.com/domains/148330/wiki/296621/WIKI2026032510543405 2 months ago
feat(triton): sort_chunks_by_idx. Merge request Ascend/MindSpeed!2997 (master into master), created by guofanfeng23, merged by ascend-robot. Integrates the sort_chunks_by_idx Triton operator. Operator verification results: https://wiki.huawei.com/domains/152732/wiki/307991/WIKI202511219117266 5 months ago
feat: hccl op mode set. Merge request Ascend/MindSpeed!3376 (core_r0.12.1_adaptive_hccl_op_v2 into master), created by Jia_Austin, merged by ascend-robot. What it does: feat(torch): hccl op mode set. User-facing change: --hccl-op-mode. 1 month ago
!1967 fix coc demo, import error and feature validate logic Merge pull request !1967 from wangyuansheng8/master 1 year ago
feat: mxfp8-32x32 quant. Merge request Ascend/MindSpeed!3471 (feat_mxfp8-32x32 into master), created by kyle_zhangchi, merged by ascend-robot. What it does: adds an mxfp8-32x32 quantization operator under the Megatron framework to reduce the memory used by weights. User-facing change: --fp8-recipe gains the mxfp8-32x32 option. https://gitcode.com/Ascend/MindSpeed/commit/e065cbca6873bfc02661d088b07d90224333e87d?ref=feat_mxfp8-32x32&prId=3471 Verification document: https://wiki.huawei.com/domains/170864/wiki/367830/WIKI2026051111046509 7 days ago
quant fp8 optimizer 6 months ago
feat: add custom pp layout. Merge request Ascend/MindSpeed!3496 (add_pp_layout into master), created by wuweiqiang24, merged by ascend-robot. Adds the pipeline-model-parallel-layout feature, which supports a custom layer arrangement for each PP stage. Verification link: https://wiki.huawei.com/domains/137239/wiki/268925/WIKI2026052611233549 issue: https://gitcode.com/Ascend/MindSpeed/issues/166 9 hours ago
feat: SwapMuon add save/load ckpt support. Merge request Ascend/MindSpeed!3518 (master into master), created by JialiZheng1, merged by ascend-robot. RFC: https://gitcode.com/Ascend/MindSpeed/issues/164 11 hours ago
feat: add custom pp layout. Merge request Ascend/MindSpeed!3496 (add_pp_layout into master), created by wuweiqiang24, merged by ascend-robot. Adds the pipeline-model-parallel-layout feature, which supports a custom layer arrangement for each PP stage. Verification link: https://wiki.huawei.com/domains/137239/wiki/268925/WIKI2026052611233549 issue: https://gitcode.com/Ascend/MindSpeed/issues/166 9 hours ago
add w8a16 quant. Merge request Ascend/MindSpeed!3460 (Dev_W8A16 into master), created by goodflower9, merged by ascend-robot. What it does: adds W8A16 MXFP8 QAT support based on the existing QAT flow. User-facing change: yes, users can enable W8A16 MXFP8 QAT with --qat-scheme w8a16-mxfp8. Related doc: docs/zh/features/qat_quant.md. Tested by enabling fake quantization with the parameter --qat-scheme w8a16-mxfp8. https://wiki.huawei.com/domains/76578/wiki/233229/WIKI2026051211063362 3 days ago
feat(ut/qos/torch): add UTs and fix bugs missed in the code. Merge request Ascend/MindSpeed!3309 (master into master), created by Klayyy, merged by ascend-robot. 1. Adds feature UTs for the AI QoS feature. 2. Fixes bugs found by self-review while adding the UTs: 2.1 renames the torch_npu._C._distributed_c10d.ProcessGroupHCCL.Options() call; 2.2 improves the ValueError messages raised in qos_feature.py; 2.3 in qos.py, in the priority assignment for the minimum-conflict combination, removes duplicated code and unused imports and adds a missing comma in _PARALLEL_TYPES; 2.4 in qos.py, fixes the sdma qos handling, which mistakenly used roce. 3. Adds H2D QoS use of the PCIe asynchronous channel, with a new set_h2d_qos interface on top of the DCMI interface that can be called from Python. 4. Updates the aiQos README on calling the DCMI interface and adds the build method for the DCMI interface SO. 2 months ago
!2675 Add once warning and args check Merge pull request !2675 from Jializheng/master 10 months ago
fix: mc2_validate_args. Merge request Ascend/MindSpeed!3394 (fix_mc2_validate_args into master), created by clc2025, merged by ascend-robot. Related issue: DTS2026040735953. 1 month ago
!2635 Security: file path validation/permissions Merge pull request !2635 from glhyy/secmaster 10 months ago
Feat: adaptor for DeepSeek V4. Merge request Ascend/MindSpeed!3427 (master into master), created by wuweiqiang24, merged by ascend-robot. What it does: adaptor for DeepSeek V4. 1 month ago
feat: add custom pp layout. Merge request Ascend/MindSpeed!3496 (add_pp_layout into master), created by wuweiqiang24, merged by ascend-robot. Adds the pipeline-model-parallel-layout feature, which supports a custom layer arrangement for each PP stage. Verification link: https://wiki.huawei.com/domains/137239/wiki/268925/WIKI2026052611233549 issue: https://gitcode.com/Ascend/MindSpeed/issues/166 9 hours ago
!2112 MindSpeed L0 reconstruction Merge pull request !2112 from Jializheng/master 1 year ago
!2650 Add TE class and args modification to support verl Merge pull request !2650 from Jializheng/args 10 months ago