File    Last commit    Last updated
doc: adjust doc. Co-authored-by: liutongtong27 <liutongtong15@h-partners.com>. Created-by: liutongtong27. Commit-by: liutongtong27. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3305 (merge master_menutest into master). Description: none provided. Last updated: 2 months ago.
feat: add custom pp layout. Co-authored-by: wuweiqiang24 <wuweiqiang11@huawei.com>. Created-by: wuweiqiang24. Commit-by: wuweiqiang24. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3496 (merge add_pp_layout into master). Description: adds the pipeline-model-parallel-layout feature, which supports a custom layer layout for each PP stage. Verification link: https://wiki.huawei.com/domains/137239/wiki/268925/WIKI2026052611233549. Issue: https://gitcode.com/Ascend/MindSpeed/issues/166. Last updated: 8 hours ago.
feat: add custom pp layout. Co-authored-by: wuweiqiang24 <wuweiqiang11@huawei.com>. Created-by: wuweiqiang24. Commit-by: wuweiqiang24. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3496 (merge add_pp_layout into master). Description: adds the pipeline-model-parallel-layout feature, which supports a custom layer layout for each PP stage. Verification link: https://wiki.huawei.com/domains/137239/wiki/268925/WIKI2026052611233549. Issue: https://gitcode.com/Ascend/MindSpeed/issues/166. Last updated: 8 hours ago.
feat: Add MXFP8 support for FSDP. Co-authored-by: EVA1 <jingsiyu1@huawei.com>, quancs001 <quancs@qq.com>, h00638954 <huangzhiyuan8@huawei.com>. Created-by: quancs001. Commit-by: EVA1; quancs001; h00638954. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3387 (merge fsdp_lp_ag into master). Description: adds MXFP8 training support for Dense/MoE models under the FSDP framework, including: 1. a basic low-precision communication + FSDP framework; 2. support for MXFP8 with low-precision communication; 3. EP and EFSDP adaptation for MoE models. Last updated: 1 month ago.
fix: NPU datadump level: L0 & mix. Co-authored-by: yulelanmei <huangyijie8@huawei.com>. Created-by: yulelanmei. Commit-by: yulelanmei. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3351 (merge master into master). Description: What this PR does / why we need it: --npu-datadump is not yet adapted to the L0 and mix dump levels, so the feature needs to be enhanced. User-facing change: N/A. How it was tested: enable --npu-datadump and set level to L0 or mix in config.json; test record: https://wiki.huawei.com/domains/148330/wiki/296621/WIKI2026032510543405. Last updated: 2 months ago.
style: Triton operator path normalization. Co-authored-by: LinShua <707894133@qq.com>. Created-by: LinShua. Commit-by: LinShua. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3432 (merge master_GDN_triton_stype into master). Description: What this PR does / why we need it: removes the GDN operator under the lite path; it will later be archived uniformly under the ops path. Related issue: https://gitcode.com/Ascend/MindSpeed/issues/147. User-facing change: NA. How it was tested: see the test files in the PR. Last updated: 1 month ago.
[modify][mindspore] register patchs for coalescing_manager. Co-authored-by: weixin_47897441 <wuyouqi1@h-partners.com>. Created-by: weixin_47897441. Commit-by: weixin_47897441. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3233 (merge master-0129 into master). Description: the coalescing_manager-related patches were recently removed from megtron_basic.py. Because MSA currently lacks support for the communication operators associated with coalescing_manager, the removed patches are migrated to MindSpore so that models such as qwen3vl launch normally. Last updated: 3 months ago.
remove deprecated code 2. Co-authored-by: 赵一帆 <zhaoyifan15@huawei.com>. Created-by: zhao-yifan27. Commit-by: 赵一帆. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!2972 (merge master into master). Description: removes deprecated code: the deprecated code under mindspeed/model and the deprecated ampipe code under mindspeed/moe. Last updated: 6 months ago.
remove deprecated code 2. Co-authored-by: 赵一帆 <zhaoyifan15@huawei.com>. Created-by: zhao-yifan27. Commit-by: 赵一帆. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!2972 (merge master into master). Description: removes deprecated code: the deprecated code under mindspeed/model and the deprecated ampipe code under mindspeed/moe. Last updated: 6 months ago.
!872 [feature: conv3d supports parallelism along the depth dimension]. Merge pull request !872 from Shitong Li/master_new_conv3d. Last updated: 1 year ago.
Feat: adaptor for DeepSeek V4. Co-authored-by: wuweiqiang24 <wuweiqiang11@huawei.com>. Created-by: wuweiqiang24. Commit-by: wuweiqiang24. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3427 (merge master into master). Description: What this PR does / why we need it: Adaptor for DeepSeek V4. Last updated: 1 month ago.
fix npu_sparse_attn_sharedkv ops in Mindspeed. Co-authored-by: boes129 <chenqi185@huawei.com>. Created-by: boes129. Commit-by: boes129. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3477 (merge chenqi_f1 into master). Description: What this PR does / why we need it: 1. The aclnn interface of the new npu_sparse_attn_sharedkv operator changed: orikv stride and cmpkv stride parameters were added to accommodate different memory frameworks. MindSpeed was not aware of the change and still called the operator through the old interface, which caused a core dump on the operator call during deepseekv4 flash pre-training. The fix follows the new operator interface example at https://gitcode.com/cann/cann-recipes-infer/pull/387/diffs and the ops-transformer repository commit https://gitcode.com/cann/ops-transformer/commit/fc04f943c12b87c6581527bd558fbe38cee31879?ref=master. 2. The original file did not meet the .clang-format requirements and could not pass the pipeline, so it was reformatted according to the project's .clang-format file. User-facing change: NA. How it was tested: with the new operator package and the modified npu_sparse_attn_shared_kv.cpp swapped in, deepseekv4 flash pre-training completed through mindspeed, confirming that the change takes effect. Last updated: 8 days ago.
quant fp8 optimizer. Last updated: 6 months ago.
!716 perf: gpt_dataset and initialize in megatron. Merge pull request !716 from 邓佳/dev_patch. Last updated: 1 year ago.
fix(quant): only hif8 add dst_type_max args. Co-authored-by: Muu <koimuu@163.com>. Created-by: Muuyo. Commit-by: Muu. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3514 (merge fix-hif8-tensorwise into master). Description: fix(quant): only hif8 add dst_type_max args. Last updated: 3 days ago.
!2238 refactor: tokenizer refactoring. Merge pull request !2238 from wangruiqi/master. Last updated: 1 year ago.
!2114 Refactor the noop layer for pipeline parallel. Merge pull request !2114 from liurong1995/feature_noop. Last updated: 1 year ago.
!2827 fix: fix get_full_args. Merge pull request !2827 from YE ZHENYUAN/master0910. Last updated: 8 months ago.
Support TransformerEngine. Co-authored-by: MingzhenWang <wangmingzhen4@huawei.com>, Muu <koimuu@163.com>, x30061065 <xuyuanhui3@h-partners.com>, 耿瑞良 <gengruiliang@huawei.com>. Created-by: mingzhenwang. Commit-by: mingzhenwang; Muu; MingzhenWang; x30061065; 耿瑞良. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!2947 (merge lingqu_master into master). Description: 1. supports the TELinear layer; 2. supports FP8 computation (quantmatmul/gmm); 3. supports multiple data types (FP8/HiF8); 4. supports multiple quantization strategies (delayed/tensorwise/blockwise/mxfp8); 5. the TELinear layer supports fused communication and computation. Last updated: 6 months ago.
[Bugfix] Fix Megatron checkpoint saving&loading compatibility for torch_dcp format. Co-authored-by: 林明哲 <linmingzhe3@huawei.com>. Created-by: LinMingZhe. Commit-by: 林明哲. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3077 (merge fix1202 into master). Description: Fix Megatron checkpoint saving&loading compatibility for torch_dcp format. Last updated: 5 months ago.
!2228 add MindSpeedFeaturesManager. Merge pull request !2228 from Jializheng/master. Last updated: 1 year ago.
!2308 Adaptation core_r0.12.0. Merge pull request !2308 from 邓佳/core_r0.12.0_dev. Last updated: 1 year ago.
!2675 Add once warning and args check. Merge pull request !2675 from Jializheng/master. Last updated: 10 months ago.
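【bugfix!!!】fbov COC&share_expert_sync fix. Co-authored-by: EX_mitsu <yangjie409@h-partners.com>. Created-by: EX_mitsuX. Commit-by: EX_mitsuX; EX_mitsu. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3005 (merge master into master). Description: fixes the bug where enabling COC under TP1 was not checked (an unintended scenario; enabling COC with TP1 should bring no benefit). Readjusts the compute streams and adds waits and synchronization to fix a synchronization issue that can occur when computation runs too fast (problem scenario: COC, the permute fused operator, and shared experts enabled together; the issue disappears when launch_blocking is used). Fixes the TE detection that was no longer compatible. Fixes missing attributes on some TE modules. Last updated: 6 months ago.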
fix: Add mindspeed config to subclass of transformer config. Co-authored-by: JialiZheng <jializheng@huawei.com>. Created-by: JialiZheng1. Commit-by: JialiZheng. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3284 (merge master into master). Description: Add mindspeed config to subclass of transformer config. Last updated: 2 months ago.
Add MoE expert load balancing. Co-authored-by: zhanggaolu2 <252028123@qq.com>. Created-by: zhanggaolu2. Commit-by: zhanggaolu2. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!2845 (merge expert_loadbalance2master into master). Description: adds the MoE expert load-balancing feature. Last updated: 7 months ago.
fix: fix the alltoall_seq token dispatcher NaN bug. Co-authored-by: guofanfeng <guofanfeng1@huawei.com>. Created-by: guofanfeng23. Commit-by: guofanfeng. Merged-by: ascend-robot. Merge request: Ascend/MindSpeed!3249 (merge bug_fix into master). Description: fixes the alltoall_seq token dispatcher NaN bug; see https://wiki.huawei.com/domains/152732/wiki/307991/WIKI2026020210028614. Last updated: 3 months ago.
!1825 Long-sequence support for BNSD and arguments rework. Merge pull request !1825 from YE ZHENYUAN/master. Last updated: 1 year ago.