File    Last commit    Last updated
!3071 [pytorch][refactor]refact affinity feature Merge pull request !3071 from 丁子叉/master 9 months ago
!3215 [mindspore][master] add mindspore_register_args Merge pull request !3215 from 孔德硕/parse_args 8 months ago
[pytorch][bugfix]add non-zero validation for --moe-router-group-topk Co-authored-by: zhyebin01<zhangyebin@h-partners.com> # message auto-generated for no-merge-commit merge: !4110 merge master into master [pytorch][bugfix]add non-zero validation for --moe-router-group-topk Created-by: zhyebin01 Commit-by: zhyebin01 Merged-by: ascend-robot Description: add non-zero validation for --moe-router-group-topk See merge request: Ascend/MindSpeed-LLM!4110 3 months ago
[pytorch][feature]add megatron-cache-dir Co-authored-by: qyzqyz<quyueze@h-partners.com> # message auto-generated for no-merge-commit merge: !3694 merge ckpt_merge into master [pytorch][feature]add megatron-cache-dir Created-by: qyzqyz Commit-by: qyzqyz Merged-by: ascend-robot Description: add megatron-cache-dir See merge request: Ascend/MindSpeed-LLM!3694 6 months ago
[pytorch][feature] kvallgather supports TND Co-authored-by: Jia_Austin<dengjia6@huawei.com> # message auto-generated for no-merge-commit merge: !4277 merge fix_te_tnd into master [pytorch][feature] kvallgather supports TND Created-by: Jia_Austin Commit-by: Jia_Austin Merged-by: ascend-robot Description: ## What this PR does / why we need it? feat: TE tnd ## Does this PR introduce any user-facing change? NA ## How was this patch tested? Turn on and off TE CP TND See merge request: Ascend/MindSpeed-LLM!4277 2 months ago
feat(pytorch): add dsv4 mg2hf Co-authored-by: qyzqyz<quyueze@h-partners.com> # message auto-generated for no-merge-commit merge: !4458 merge master into master feat(pytorch): add dsv4 mg2hf Created-by: qyzqyz Commit-by: qyzqyz Merged-by: ascend-robot Description: ## What this PR does / why we need it? 1. add dsv4 mg2hf - only support pp - only support etp = 1 or tp = 1 2. fix dsv4 hf2mg vpp ## Does this PR introduce any user-facing change? When converting a dsv4 base model with mg2hf, set --model-type-hf to deepseek4_base ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4458 19 days ago
feat(pytorch): add DeepSeek4 fine-tuning template Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com> # message auto-generated for no-merge-commit merge: !4436 merge dsv4 into master feat(pytorch): add DeepSeek4 fine-tuning template Created-by: HANHU1CHEN Commit-by: HanhuiChen Merged-by: ascend-robot Description: ## What this PR does / why we need it? Adds a fine-tuning template for the DeepSeek4 model series to support its specific prompt format, including thinking mode, tool calling (DSML format), and reasoning effort control. ## Does this PR introduce any user-facing change? Yes: users can now select --prompt-type deepseek4 to fine-tune DeepSeek4 models. Two new behaviors are also exposed: - --enable-thinking controls thinking vs chat mode - --reasoning-effort {max,high} inserts a max-effort instruction prefix; only valid when thinking is enabled - --drop-thinking controls whether reasoning content is kept in each turn ## How was this patch tested? Tested with byte-level alignment against the official encoding_dsv4 script. See merge request: Ascend/MindSpeed-LLM!4436 25 days ago
[pytorch][sh] update dpo in master Co-authored-by: yanzhixiao<yanzhixiao@h-partners.com> # message auto-generated for no-merge-commit merge: !3478 merge update-dpo into master [pytorch][sh] update dpo in master Created-by: yanzhixiao23 Commit-by: yanzhixiao Merged-by: ascend-robot Description: update dpo in master See merge request: Ascend/MindSpeed-LLM!3478 7 months ago
!3136 [pytorch][refactor]refactor evaluation feature Merge pull request !3136 from HanhuiChen/evaluation 9 months ago
Adapt low precision optimization on LLM Co-authored-by: 34739<34739974@qq.com> Co-authored-by: g00943493<gengruiliang@huawei.com> # message auto-generated for no-merge-commit merge: !3823 merge master into master Adapt low precision optimization on LLM Created-by: ruilgeng Commit-by: g00943493;ruilgeng;34739 Merged-by: ascend-robot Description: adapt low precision optimization on LLM See merge request: Ascend/MindSpeed-LLM!3823 4 months ago
!3369 [pytorch][optimize]fix profile step setting and qwen3 scripts Merge pull request !3369 from 丁子叉/profile 8 months ago
feat(torch): Add swiglu func with limit Co-authored-by: iansheng<shengjiayi@huawei.com> # message auto-generated for no-merge-commit merge: !4428 merge swiglu426 into master feat(torch): Add swiglu func with limit Created-by: iansheng Commit-by: iansheng Merged-by: ascend-robot Description: ## What this PR does / why we need it? Please describe the background and detailed changes of the PR. If it is a bugfix, please attach the related issue. ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4428 1 month ago
[mindio][bugfix] Fixed an anomaly in accuracy when --distributed-optimizer-no-replica is enabled. Co-authored-by: wangguoyan<wangguoyan6@h-partners.com> # message auto-generated for no-merge-commit merge: !4178 merge master into master [mindio][bugfix] Fixed an anomaly in accuracy when --distributed-optimizer-no-replica is enabled. Created-by: guoywang Commit-by: wangguoyan Merged-by: ascend-robot Description: [pytorch][mindio][bugfix] Fixed an anomaly in accuracy when --distributed-optimizer-no-replica is enabled. See merge request: Ascend/MindSpeed-LLM!4178 3 months ago
!3128 [pytorch][refactor]refactor inference feature Merge pull request !3128 from HanhuiChen/inference 9 months ago
feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training Co-authored-by: yanzhenghao<yanzhenghao2@huawei.com> Co-authored-by: xuguoliang3<xuguoliang3@huawei.com> Co-authored-by: fangminghao<fangminghao@huawei.com> # message auto-generated for no-merge-commit merge: !4437 merge 20260409_vdp into master feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training Created-by: xuguoliang3 Commit-by: xuguoliang3;yanzhenghao;fangminghao Merged-by: ascend-robot Description: ## What this PR does / why we need it? support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4437 13 days ago
fix: add low precision opt verification Co-authored-by: yulelanmei<huangyijie8@huawei.com> # message auto-generated for no-merge-commit merge: !4504 merge master into master fix: add low precision opt verification Created-by: yulelanmei Commit-by: yulelanmei Merged-by: ascend-robot Description: Fix [#1398](https://gitcode.com/Ascend/MindSpeed-LLM/issues/1398) ## What this PR does / why we need it? The low precision optimizer features --quant-states and --quant-grads are incompatible with --swap-optimizer, but there was no verification for this combination. ## Does this PR introduce any user-facing change? When a user enables both parameters, a verification message is displayed. ## How was this patch tested? Turned on both parameters together in a training script. See merge request: Ascend/MindSpeed-LLM!4504 7 days ago
feat(pytorch): support deepseekv4_flash in mcore backend Co-authored-by: dingzicha1997<dingzilin@huawei.com> # message auto-generated for no-merge-commit merge: !4420 merge geneva2 into master feat(pytorch): support deepseekv4_flash in mcore backend Created-by: dingzicha1997 Commit-by: dingzicha1997 Merged-by: ascend-robot Description: ## What this PR does / why we need it? Please describe the background and detailed changes of the PR. If it is a bugfix, please attach the related issue. ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4420 1 month ago
[pytorch][feature]add chunk_loss Co-authored-by: LinShua<707894133@qq.com> # message auto-generated for no-merge-commit merge: !3942 merge chunk_loss_master into master [pytorch][feature]add chunk_loss Created-by: LinShua Commit-by: LinShua Merged-by: ascend-robot Description: [pytorch][feature]add chunk_loss See merge request: Ascend/MindSpeed-LLM!3942 5 months ago
[pytorch][feature]PLM-1.8B pretrain/sft Co-authored-by: EVA1<jingsiyu1@huawei.com> # message auto-generated for no-merge-commit merge: !3637 merge master into master [pytorch][feature]PLM-1.8B pretrain/sft Created-by: EVA1 Commit-by: EVA1 Merged-by: ascend-robot Description: 1. PLM-1.8B model support: dataset format conversion, weight conversion, fine-tuning, and pretraining; 2. Precision is aligned; the SFT relative error is below one part in a thousand. See merge request: Ascend/MindSpeed-LLM!3637 6 months ago
fix(pytorch):fix pipeline testcase Co-authored-by: yanzhixiao<yanzhixiao@h-partners.com> # message auto-generated for no-merge-commit merge: !4443 merge bugfix-pipeline into master fix(pytorch):fix pipeline testcase Created-by: yanzhixiao23 Commit-by: yanzhixiao Merged-by: ascend-robot Description: ## What this PR does / why we need it? fix pipeline testcase ## Does this PR introduce any user-facing change? NA ## How was this patch tested? NA See merge request: Ascend/MindSpeed-LLM!4443 26 days ago
feat(pytorch): Add MindSpeed Muon feature Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com> # message auto-generated for no-merge-commit merge: !4549 merge master into master feat(pytorch): Add MindSpeed Muon feature Created-by: HANHU1CHEN Commit-by: HanhuiChen Merged-by: ascend-robot Description: ## What this PR does / why we need it? Replaces the in-repo self-maintained Muon optimizer with MindSpeed's native Muon implementation, removing the legacy code and adapting the patch registration accordingly. ## Does this PR introduce any user-facing change? No change to the Muon usage interface; existing Muon training scripts and arguments continue to work. The underlying implementation is switched to MindSpeed's native version. ## How was this patch tested? Precision has been verified: training with the native Muon optimizer was aligned against the previous self-maintained implementation, with consistent loss and grad-norm behavior. See merge request: Ascend/MindSpeed-LLM!4549 17 hours ago
[pytorch][bugfix]fix num-layer-list of ckpt mg2hf Co-authored-by: qyzqyz<quyueze@h-partners.com> # message auto-generated for no-merge-commit merge: !3775 merge master into master [pytorch][bugfix]fix num-layer-list of ckpt mg2hf Created-by: qyzqyz Commit-by: qyzqyz Merged-by: ascend-robot Description: fix num-layer-list of ckpt mg2hf See merge request: Ascend/MindSpeed-LLM!3775 6 months ago
feat: QAT supports the W4A16 fake quantization option only at the MOE layer. Co-authored-by: xusiyang<xusiyang2@huawei.com> # message auto-generated for no-merge-commit merge: !4446 merge master into master feat: QAT supports the W4A16 fake quantization option only at the MOE layer. Created-by: weixin_44492126 Commit-by: xusiyang Merged-by: ascend-robot Description: ## What this PR does / why we need it? QAT supports an option to apply W4A16 fake quantization only at the MOE layer ## Does this PR introduce any user-facing change? Adds validation for the option that controls W4A16 use at the MOE layer ## How was this patch tested? Verified on 910b machines https://wiki.huawei.com/domains/171785/wiki/358154/WIKI2026040110634082 See merge request: Ascend/MindSpeed-LLM!4446 22 days ago
!3287 [pytorch][bugfix]fix coc in adapter_v2 Merge pull request !3287 from 丁子叉/coc 8 months ago
feat(pytorch): support o lora rank and q lora rank in v4pro Co-authored-by: dingzicha1997<dingzilin@huawei.com> # message auto-generated for no-merge-commit merge: !4427 merge master into master feat(pytorch): support o lora rank and q lora rank in v4pro Created-by: dingzicha1997 Commit-by: dingzicha1997 Merged-by: ascend-robot Description: ## What this PR does / why we need it? Please describe the background and detailed changes of the PR. If it is a bugfix, please attach the related issue. ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4427 1 month ago
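Several entries in this listing describe argument checks: !4504 (--quant-states and --quant-grads are incompatible with --swap-optimizer), !4436 (--reasoning-effort is only valid when thinking is enabled), and !4110 (non-zero validation for --moe-router-group-topk). The sketch below shows how such checks could look with plain argparse. It is not the repository's code: the flag names come from the commit messages, while the argument types, defaults, error messages, and the `build_parser` and `validate_args` helpers are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names are taken from the commit messages; the types and
    # defaults here are assumptions, not the project's real definitions.
    parser = argparse.ArgumentParser(description="argument validation sketch")
    parser.add_argument("--quant-states", action="store_true")
    parser.add_argument("--quant-grads", action="store_true")
    parser.add_argument("--swap-optimizer", action="store_true")
    parser.add_argument("--enable-thinking", action="store_true")
    parser.add_argument("--reasoning-effort", choices=["max", "high"], default=None)
    parser.add_argument("--moe-router-group-topk", type=int, default=None)
    return parser


def validate_args(args: argparse.Namespace) -> None:
    """Raise ValueError for the combinations the entries describe as invalid."""
    # !4504: low precision optimizer flags cannot be combined with swap optimizer.
    if args.swap_optimizer and (args.quant_states or args.quant_grads):
        raise ValueError(
            "--quant-states/--quant-grads are incompatible with --swap-optimizer"
        )
    # !4436: reasoning effort only applies in thinking mode.
    if args.reasoning_effort is not None and not args.enable_thinking:
        raise ValueError("--reasoning-effort requires --enable-thinking")
    # !4110: the group top-k value must be non-zero when it is given.
    if args.moe_router_group_topk is not None and args.moe_router_group_topk == 0:
        raise ValueError("--moe-router-group-topk must be non-zero")


if __name__ == "__main__":
    validate_args(build_parser().parse_args())
```

Running the validation once, right after parsing, keeps the failure close to the command line that caused it, before any expensive model or optimizer setup starts.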