MindSpeed-LLM/mindspeed_llm/features_manager · Ascend/MindSpeed-LLM - AtomGit

ascend-robot feat(pytorch): Add MindSpeed Muon feature

File	Last commit	Last updated
affinity	!3071 [pytorch][refactor]refact affinity feature Merge pull request !3071 from 丁子叉/master	9 months ago
ai_framework	!3215 [mindspore][master] add mindspore_register_args Merge pull request !3215 from 孔德硕/parse_args	8 months ago
arguments	[pytorch][bugfix]add non-zero validation for `--moe-router-group-topk` Co-authored-by: zhyebin01<zhangyebin@h-partners.com> # message auto-generated for no-merge-commit merge: !4110 merge master into master [pytorch][bugfix]add non-zero validation for `--moe-router-group-topk` Created-by: zhyebin01 Commit-by: zhyebin01 Merged-by: ascend-robot Description: add non-zero validation for `--moe-router-group-topk` See merge request: Ascend/MindSpeed-LLM!4110	3 months ago
common	[pytorch][feature]add megatron-cache-dir Co-authored-by: qyzqyz<quyueze@h-partners.com> # message auto-generated for no-merge-commit merge: !3694 merge ckpt_merge into master [pytorch][feature]add megatron-cache-dir Created-by: qyzqyz Commit-by: qyzqyz Merged-by: ascend-robot Description: add megatron-cache-dir See merge request: Ascend/MindSpeed-LLM!3694	6 months ago
context_parallel	[pytorch][feature] kvallgather supports TND Co-authored-by: Jia_Austin<dengjia6@huawei.com> # message auto-generated for no-merge-commit merge: !4277 merge fix_te_tnd into master [pytorch][feature] kvallgather supports TND Created-by: Jia_Austin Commit-by: Jia_Austin Merged-by: ascend-robot Description: ## What this PR does / why we need it? feat: TE tnd ## Does this PR introduce any user-facing change? NA ## How was this patch tested? Turn on and off TE CP TND See merge request: Ascend/MindSpeed-LLM!4277	2 months ago
convert_checkpoint	feat(pytorch): add dsv4 mg2hf Co-authored-by: qyzqyz<quyueze@h-partners.com> # message auto-generated for no-merge-commit merge: !4458 merge master into master feat(pytorch): add dsv4 mg2hf Created-by: qyzqyz Commit-by: qyzqyz Merged-by: ascend-robot Description: ## What this PR does / why we need it? 1. add dsv4 mg2hf - only support pp - only support etp = 1 or tp = 1 2. fix dsv4 hf2mg vpp ## Does this PR introduce any user-facing change? if use base model of dsv4 to do mg2hf convert, please set --model-type-hf with deepseek4_base ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4458	19 days ago
dataset	feat(pytorch): add DeepSeek4 fine-tuning template Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com> # message auto-generated for no-merge-commit merge: !4436 merge dsv4 into master feat(pytorch): add DeepSeek4 fine-tuning template Created-by: HANHU1CHEN Commit-by: HanhuiChen Merged-by: ascend-robot Description: ## What this PR does / why we need it? Adds a fine-tuning template for the DeepSeek4 model series to support its specific prompt format, including thinking mode, tool calling (DSML format), and reasoning effort control. ## Does this PR introduce any user-facing change? Yes — users can now select --prompt-type deepseek4 to fine-tune DeepSeek4 models. Two new behaviors are also exposed: - `--enable-thinking` controls thinking vs chat mode - `--reasoning-effort {max,high}` inserts a max-effort instruction prefix; only valid when thinking is enabled - `--drop-thinking` controls whether reasoning content is kept in each turn ## How was this patch tested? Tested with byte-level alignment against the official encoding_dsv4 script. See merge request: Ascend/MindSpeed-LLM!4436	25 days ago
dpo	[pytorch][sh] update dpo in master Co-authored-by: yanzhixiao<yanzhixiao@h-partners.com> # message auto-generated for no-merge-commit merge: !3478 merge update-dpo into master [pytorch][sh] update dpo in master Created-by: yanzhixiao23 Commit-by: yanzhixiao Merged-by: ascend-robot Description: update dpo in master See merge request: Ascend/MindSpeed-LLM!3478	7 months ago
evaluation	!3136 [pytorch][refactor]refactor evaluation feature Merge pull request !3136 from HanhuiChen/evaluation	9 months ago
finetune	Adapt low precision optimization on LLM Co-authored-by: 34739<34739974@qq.com> Co-authored-by: g00943493<gengruiliang@huawei.com> # message auto-generated for no-merge-commit merge: !3823 merge master into master Adapt low precision optimization on LLM Created-by: ruilgeng Commit-by: g00943493;ruilgeng;34739 Merged-by: ascend-robot Description: adapt low precision optimization on LLM See merge request: Ascend/MindSpeed-LLM!3823	4 months ago
functional	!3369 [pytorch][optimize]fix profile step setting and qwen3 scripts Merge pull request !3369 from 丁子叉/profile	8 months ago
fusions	feat(torch): Add swiglu func with limit Co-authored-by: iansheng<shengjiayi@huawei.com> # message auto-generated for no-merge-commit merge: !4428 merge swiglu426 into master feat(torch): Add swiglu func with limit Created-by: iansheng Commit-by: iansheng Merged-by: ascend-robot Description: ## What this PR does / why we need it? Please describe the background and detailed changes of the PR. If it is a bugfix, please attach the related issue. ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4428	1 month ago
high_availability	[mindio][bugfix] Fixed an anomaly in accuracy when --distributed-optimizer-no-replica is enabled. Co-authored-by: wangguoyan<wangguoyan6@h-partners.com> # message auto-generated for no-merge-commit merge: !4178 merge master into master [mindio][bugfix] Fixed an anomaly in accuracy when --distributed-optimizer-no-replica is enabled. Created-by: guoywang Commit-by: wangguoyan Merged-by: ascend-robot Description: [pytorch][mindio][bugfix] Fixed an anomaly in accuracy when --distributed-optimizer-no-replica is enabled. See merge request: Ascend/MindSpeed-LLM!4178	3 months ago
inference	!3128 [pytorch][refactor]refactor inference feature Merge pull request !3128 from HanhuiChen/inference	9 months ago
layerwise_disaggregated_training	feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training Co-authored-by: yanzhenghao<yanzhenghao2@huawei.com> Co-authored-by: xuguoliang3<xuguoliang3@huawei.com> Co-authored-by: fangminghao<fangminghao@huawei.com> # message auto-generated for no-merge-commit merge: !4437 merge 20260409_vdp into master feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training Created-by: xuguoliang3 Commit-by: xuguoliang3;yanzhenghao;fangminghao Merged-by: ascend-robot Description: ## What this PR does / why we need it? support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4437	13 days ago
low_precision	fix: add low precision opt verification Co-authored-by: yulelanmei<huangyijie8@huawei.com> # message auto-generated for no-merge-commit merge: !4504 merge master into master fix: add low precision opt verification Created-by: yulelanmei Commit-by: yulelanmei Merged-by: ascend-robot Description: Fix [#1398](https://gitcode.com/Ascend/MindSpeed-LLM/issues/1398) ## What this PR does / why we need it? The low precision optimizer features: --quant-states & --quant-grads are incompatible with --swap-optimizer, but no verification right now. ## Does this PR introduce any user-facing change? When user uses both parameters, a verification message is displayed. ## How was this patch tested? Turn on both parameters in a train script in addition. See merge request: Ascend/MindSpeed-LLM!4504	7 days ago
megatron_basic	feat(pytorch): support deepseekv4_flash in mcore backend Co-authored-by: dingzicha1997<dingzilin@huawei.com> # message auto-generated for no-merge-commit merge: !4420 merge geneva2 into master feat(pytorch): support deepseekv4_flash in mcore backend Created-by: dingzicha1997 Commit-by: dingzicha1997 Merged-by: ascend-robot Description: ## What this PR does / why we need it? Please describe the background and detailed changes of the PR. If it is a bugfix, please attach the related issue. ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4420	1 month ago
memory	[pytorch][feature]add chunk_loss Co-authored-by: LinShua<707894133@qq.com> # message auto-generated for no-merge-commit merge: !3942 merge chunk_loss_master into master [pytorch][feature]add chunk_loss Created-by: LinShua Commit-by: LinShua Merged-by: ascend-robot Description: [pytorch][feature]add chunk_loss See merge request: Ascend/MindSpeed-LLM!3942	5 months ago
models	[pytorch][feature]PLM-1.8B pretrain/sft Co-authored-by: EVA1<jingsiyu1@huawei.com> # message auto-generated for no-merge-commit merge: !3637 merge master into master [pytorch][feature]PLM-1.8B pretrain/sft Created-by: EVA1 Commit-by: EVA1 Merged-by: ascend-robot Description: 1. PLM-1.8B model support: dataset format conversion, weight conversion, fine-tuning, and pretraining; 2. Precision is aligned, with SFT relative error below one in a thousand. See merge request: Ascend/MindSpeed-LLM!3637	6 months ago
moe	fix(pytorch)：fix pipeline testcase Co-authored-by: yanzhixiao<yanzhixiao@h-partners.com> # message auto-generated for no-merge-commit merge: !4443 merge bugfix-pipeline into master fix(pytorch)：fix pipeline testcase Created-by: yanzhixiao23 Commit-by: yanzhixiao Merged-by: ascend-robot Description: ## What this PR does / why we need it? fix pipeline testcase ## Does this PR introduce any user-facing change? NA ## How was this patch tested? NA See merge request: Ascend/MindSpeed-LLM!4443	26 days ago
optimizer	feat(pytorch): Add MindSpeed Muon feature Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com> # message auto-generated for no-merge-commit merge: !4549 merge master into master feat(pytorch): Add MindSpeed Muon feature Created-by: HANHU1CHEN Commit-by: HanhuiChen Merged-by: ascend-robot Description: ## What this PR does / why we need it? Replaces the in-repo self-maintained Muon optimizer with MindSpeed's native Muon implementation, removing the legacy code and adapting the patch registration accordingly. ## Does this PR introduce any user-facing change? No change to the Muon usage interface; existing Muon training scripts and arguments continue to work. The underlying implementation is switched to MindSpeed's native version. ## How was this patch tested? Precision has been verified: training with the native Muon optimizer was aligned against the previous self-maintained implementation, with consistent loss and grad-norm behavior. See merge request: Ascend/MindSpeed-LLM!4549	17 hours ago
pipeline_parallel	[pytorch][bugfix]fix num-layer-list of ckpt mg2hf Co-authored-by: qyzqyz<quyueze@h-partners.com> # message auto-generated for no-merge-commit merge: !3775 merge master into master [pytorch][bugfix]fix num-layer-list of ckpt mg2hf Created-by: qyzqyz Commit-by: qyzqyz Merged-by: ascend-robot Description: fix num-layer-list of ckpt mg2hf See merge request: Ascend/MindSpeed-LLM!3775	6 months ago
qat	feat: QAT supports the W4A16 fake quantization option only at the MOE layer. Co-authored-by: xusiyang<xusiyang2@huawei.com> # message auto-generated for no-merge-commit merge: !4446 merge master into master feat: QAT supports the W4A16 fake quantization option only at the MOE layer. Created-by: weixin_44492126 Commit-by: xusiyang Merged-by: ascend-robot Description: ## What this PR does / why we need it? QAT supports an option to apply W4A16 fake quantization only at the MOE layer ## Does this PR introduce any user-facing change? Adds validation for the option that controls W4A16 use at the MOE layer ## How was this patch tested? Verified on a 910b machine https://wiki.huawei.com/domains/171785/wiki/358154/WIKI2026040110634082 See merge request: Ascend/MindSpeed-LLM!4446	22 days ago
tensor_parallel	!3287 [pytorch][bugfix]fix coc in adapter_v2 Merge pull request !3287 from 丁子叉/coc	8 months ago
tokenizer	feat(pytorch): add DeepSeek4 fine-tuning template Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com> # message auto-generated for no-merge-commit merge: !4436 merge dsv4 into master feat(pytorch): add DeepSeek4 fine-tuning template Created-by: HANHU1CHEN Commit-by: HanhuiChen Merged-by: ascend-robot Description: ## What this PR does / why we need it? Adds a fine-tuning template for the DeepSeek4 model series to support its specific prompt format, including thinking mode, tool calling (DSML format), and reasoning effort control. ## Does this PR introduce any user-facing change? Yes — users can now select --prompt-type deepseek4 to fine-tune DeepSeek4 models. Two new behaviors are also exposed: - `--enable-thinking` controls thinking vs chat mode - `--reasoning-effort {max,high}` inserts a max-effort instruction prefix; only valid when thinking is enabled - `--drop-thinking` controls whether reasoning content is kept in each turn ## How was this patch tested? Tested with byte-level alignment against the official encoding_dsv4 script. See merge request: Ascend/MindSpeed-LLM!4436	25 days ago
transformer	feat(pytorch): support o lora rank and q lora rank in v4pro Co-authored-by: dingzicha1997<dingzilin@huawei.com> # message auto-generated for no-merge-commit merge: !4427 merge master into master feat(pytorch): support o lora rank and q lora rank in v4pro Created-by: dingzicha1997 Commit-by: dingzicha1997 Merged-by: ascend-robot Description: ## What this PR does / why we need it? Please describe the background and detailed changes of the PR. If it is a bugfix, please attach the related issue. ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-LLM!4427	1 month ago
__init__.py	feat(pytorch): Add MindSpeed Muon feature Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com> # message auto-generated for no-merge-commit merge: !4549 merge master into master feat(pytorch): Add MindSpeed Muon feature Created-by: HANHU1CHEN Commit-by: HanhuiChen Merged-by: ascend-robot Description: ## What this PR does / why we need it? Replaces the in-repo self-maintained Muon optimizer with MindSpeed's native Muon implementation, removing the legacy code and adapting the patch registration accordingly. ## Does this PR introduce any user-facing change? No change to the Muon usage interface; existing Muon training scripts and arguments continue to work. The underlying implementation is switched to MindSpeed's native version. ## How was this patch tested? Precision has been verified: training with the native Muon optimizer was aligned against the previous self-maintained implementation, with consistent loss and grad-norm behavior. See merge request: Ascend/MindSpeed-LLM!4549	17 hours ago