Ascend LLM Distributed Training Framework

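Introduction

MindSpeed LLM is a distributed training suite for large language models built on the Ascend ecosystem. It aims to provide partners in the Huawei Ascend chip ecosystem with an end-to-end LLM training solution, covering distributed pretraining, distributed instruction fine-tuning, and the accompanying development toolchain, such as data preprocessing, weight conversion, online inference, and baseline evaluation.

Directory Structure

The MindSpeed LLM code is organized along modular design principles. For details, see the project guide.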

MindSpeed-LLM/
 ├── ci                        # CI gatekeeping
 ├── configs                   # Configuration files
 ├── docker                    # Docker build configuration
 ├── docs                      # Project documentation
 ├── examples                  # Model example scripts
 ├── mindspeed_llm             # Core code
 ├── pre-commit                # pre-commit hook configuration
 ├── tests                     # Test cases
 ├── convert_ckpt.py           # Weight conversion tool
 ├── convert_ckpt_v2.py        # Weight conversion tool v2
 ├── evaluation.py             # Model evaluation tool
 ├── inference.py              # Model inference tool
 ├── inference_fsdp2.py        # FSDP2 inference tool
 ├── posttrain_gpt.py          # Post-training workflow
 ├── preprocess_data.py        # Data preprocessing tool
 ├── preprocess_prompt.py      # Prompt preprocessing tool
 ├── pretrain_deepseek4.py     # DeepSeek4 pretraining workflow
 ├── pretrain_gpt.py           # Pretraining workflow
 ├── pretrain_mamba.py         # Mamba model pretraining workflow
 ├── rlhf_gpt.py               # RLHF training workflow
 ├── train_fsdp2.py            # FSDP2 training workflow
 ├── requirements.txt          # Python dependencies
 ├── setup.py                  # Installation configuration
 └── README.md                 # Project README
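
As a rough illustration of how the top-level scripts in the tree above map to tasks, the following Python sketch checks a local checkout for the entry points the tree lists. The task labels (`pretrain`, `posttrain`, and so on) and the helper `missing_entry_points` are hypothetical names chosen for this example; only the file names are taken from the tree.

```python
from pathlib import Path

# Task label (hypothetical) -> entry-point script (file names from the tree above).
ENTRY_POINTS = {
    "pretrain": "pretrain_gpt.py",
    "pretrain_mamba": "pretrain_mamba.py",
    "pretrain_deepseek4": "pretrain_deepseek4.py",
    "posttrain": "posttrain_gpt.py",
    "rlhf": "rlhf_gpt.py",
    "train_fsdp2": "train_fsdp2.py",
    "inference": "inference.py",
    "inference_fsdp2": "inference_fsdp2.py",
    "evaluation": "evaluation.py",
    "convert_ckpt": "convert_ckpt.py",
    "convert_ckpt_v2": "convert_ckpt_v2.py",
    "preprocess_data": "preprocess_data.py",
    "preprocess_prompt": "preprocess_prompt.py",
}


def missing_entry_points(repo_root):
    """Return the sorted entry-point file names not found under repo_root."""
    root = Path(repo_root)
    return sorted(
        name for name in ENTRY_POINTS.values() if not (root / name).is_file()
    )


if __name__ == "__main__":
    # Assumes the current directory is the root of a MindSpeed-LLM checkout.
    missing = missing_entry_points(".")
    print("all entry points present" if not missing else f"missing: {missing}")
```

Running the script from the repository root gives a quick sanity check that a checkout matches the layout described here.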

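Documentation Navigation

The documentation navigation page is a complete guide to the MindSpeed LLM documentation and covers the following core content:

Installation guide: how to install and configure MindSpeed LLM
Quick start: an introductory walkthrough from environment installation to launching training
Model list: models supported on the PyTorch framework
Feature list: descriptions of the performance and memory optimization features
Training schemes: complete schemes for pretraining, fine-tuning, inference, and evaluation
Toolchain: usage instructions for weight conversion, dataset processing, performance profiling and analysis, deterministic computation, and other tools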

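Release Notes

See the release notes for the latest and historical MindSpeed LLM releases, including each version's software compatibility matrix, version compatibility notes, and the update notes for the current version.

Installation

For detailed installation steps and environment configuration, see the MindSpeed LLM installation guide (PyTorch-based).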

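Quick Start

This section helps developers quickly launch LLM pretraining and fine-tuning jobs. For the detailed steps, see the quick start guide listed in the documentation navigation.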

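Supported Models

MindSpeed LLM currently has built-in support for pretraining and fine-tuning more than a hundred widely used LLMs. For the list of supported models, see:

PyTorch framework model support list

Training Schemes and Features

MindSpeed LLM includes training schemes such as distributed pretraining and distributed fine-tuning. For details, see the training schemes and features documentation.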

Online Inference

Feature	Mcore	Released	Contributor
Streaming inference	✅	✅	【Ascend】
Chat dialogue	✅	✅	【Ascend】
YaRN context extension	✅	❌	【Ascend】

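Note

In the tables in this document, "Released" means the feature has shipped in a commercial release, "✅" means supported, and "❌" means not supported.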
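
To make the legend concrete, here is a minimal Python sketch that reads the online inference table as tab-separated text and turns the marks into booleans. The rows are copied from the table above (rendered in English); the function name `parse_feature_table` and the `MARKS` mapping are illustrative and not part of MindSpeed LLM.

```python
# Rows copied from the online inference table above (tab-separated).
TABLE = (
    "Feature\tMcore\tReleased\tContributor\n"
    "Streaming inference\t✅\t✅\t【Ascend】\n"
    "Chat dialogue\t✅\t✅\t【Ascend】\n"
    "YaRN context extension\t✅\t❌\t【Ascend】"
)

# Legend from the note: "✅" means supported, "❌" means not supported.
MARKS = {"✅": True, "❌": False}


def parse_feature_table(text):
    """Parse a tab-separated feature table into a list of dicts."""
    header, *rows = [line.split("\t") for line in text.splitlines()]
    parsed = []
    for row in rows:
        record = dict(zip(header, row))
        for column in ("Mcore", "Released"):
            record[column] = MARKS[record[column]]
        parsed.append(record)
    return parsed


features = parse_feature_table(TABLE)
# Features supported on Mcore that have not shipped in a commercial release.
unreleased = [f["Feature"] for f in features if f["Mcore"] and not f["Released"]]
print(unreleased)
```

The same approach applies to the other feature tables in this document, since they share the "Mcore" and "Released" columns.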

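Open-Source Dataset Evaluation

For the repository's model baselines, see the open-source dataset evaluation baselines.

Scenario	Dataset	Mcore	Released	Contributor
Evaluation	MMLU	✅	❌	【Ascend】
Evaluation	CEval	✅	❌	【Ascend】
Evaluation	BoolQ	✅	❌	【Ascend】
Evaluation	BBH	✅	❌	【Ascend】
Evaluation	AGIEval	✅	❌	【Ascend】
Evaluation	HumanEval	✅	❌	【Ascend】
Evaluation	CMMLU	✅	❌	【Ascend】
Evaluation	GSM8k	✅	❌	【Ascend】
Evaluation	HellaSwag	✅	❌	【Ascend】
Evaluation	NeedleBench	✅	❌	【Ascend】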

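Development Toolchain

Weight Conversion

MindSpeed LLM supports converting weights in both directions between the Huggingface and Megatron-core formats, and supports merging LoRA weights. For conversion parameters and usage, see the weight conversion documentation.

Source format	Target format	Partitioning features	LoRA	Contributor	Released
Huggingface	Megatron-core	tp, pp, dpp, vpp, cp, ep, loop layer	❌	【Ascend】	❌
Megatron-core	Huggingface		✅	【Ascend】	❌
Megatron-core	Megatron-core	tp, pp, dpp, vpp, cp, ep, loop layer	✅	【Ascend】	❌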

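Data Preprocessing

MindSpeed LLM supports data preprocessing for multiple task types, including pretraining and instruction fine-tuning.

Task scenario	Dataset	Mcore	Released	Contributor
Pretraining	Pretraining data processing	✅	✅	【Ascend】
Fine-tuning	Alpaca style	✅	✅	【Ascend】
Fine-tuning	ShareGPT style	✅	✅	【Ascend】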

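Performance Profiling

Scenario	Feature	Mcore	Released	Contributor
Performance profiling	Collect profiling data on Ascend chips	✅	❌	【Ascend】

High Availability

Scenario	Feature	Mcore	Released	Contributor
High availability	Enable deterministic computation on Ascend chips	✅	❌	【Ascend】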

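Version Maintenance Policy

A MindSpeed LLM version goes through the following five maintenance stages:

Status	Duration	Description
Planning	1-3 months	Features are planned.
Development	3 months	Features are developed.
Maintenance	6-12 months	All resolved issues are merged and releases are published. The maintenance policy depends on the version type: regular versions are maintained for 6 months and long-term support versions for 12 months.
No maintenance	0-3 months	All resolved issues are merged; there are no dedicated maintainers and no releases.
End of life (EOL)	N/A	The branch no longer accepts any changes.

Maintenance policy for released MindSpeed LLM versions: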

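MindSpeed LLM version	Tag	Maintenance policy	Current status	Release date	Next status	EOL date
26.1.0	v26.1.0	Regular version	Development	Release expected 2026/6/30	No maintenance expected from 2026/12/30
26.0.0	v26.0.0	Regular version	Maintenance	2026/3/30	No maintenance expected from 2026/9/30
2.3.0	v2.3.0	Regular version	Maintenance	2025/12/30	No maintenance expected from 2026/6/30
2.2.0	v2.2.0	Regular version	EOL	2025/9/30	End of life	2026/3/30
2.1.0	v2.1.0	Regular version	EOL	2025/6/30	End of life	2025/12/30
2.0.0	v2.0.0	Regular version	EOL	2025/3/30	End of life	2025/9/30
1.0.0	v1.0.0	Regular version	EOL	2024/12/30	End of life	2025/6/30
1.0.RC3	v1.0.RC3.0	Regular version	EOL	2024/09/30	End of life	2025/3/30
1.0.RC2	v1.0.RC2.0	Regular version	EOL	2024/06/30	End of life	2024/12/30
1.0.RC1	v1.0.RC1.0	Regular version	EOL	2024/03/30	End of life	2024/9/30
bk_origin_23	\	Demo	EOL	2023	End of life	2024/6/30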
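
The dates in the table above are consistent with the stated maintenance windows: for regular versions, the end of maintenance falls 6 months after the release date. The following Python sketch reproduces that arithmetic with the standard library only. The names `add_months`, `maintenance_end`, and the keys `"regular"` and `"lts"` are illustrative assumptions, not part of any MindSpeed LLM tooling.

```python
import calendar
from datetime import date

# Maintenance window in months per version type, as stated in the policy table.
# The keys "regular" and "lts" are labels chosen for this example.
MAINTENANCE_MONTHS = {"regular": 6, "lts": 12}


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month when needed.
    """
    index = start.year * 12 + (start.month - 1) + months
    year, month_zero_based = divmod(index, 12)
    month = month_zero_based + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def maintenance_end(release_date, kind="regular"):
    """Return the date on which the maintenance window ends."""
    return add_months(release_date, MAINTENANCE_MONTHS[kind])


# Release dates of 2.3.0 and 26.0.0 taken from the table above.
print(maintenance_end(date(2025, 12, 30)))
print(maintenance_end(date(2026, 3, 30)))
```

For 2.3.0 (released 2025/12/30) this yields 2026-06-30, and for 26.0.0 (released 2026/3/30) it yields 2026-09-30, matching the "no maintenance expected from" dates in the table.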

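Roadmap

Future plans are kept up to date in the MindSpeed LLM RoadMap. You are welcome to check it for the latest LLM planning updates.

Community Meetings

For the schedule of MindSpeed LLM TC and SIG meetings, see the Ascend Meeting Center.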

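Join Us

To exchange development experience, share usage tips, and get project updates promptly, we have created the MindSpeed LLM community chat group. Whether you are already using the project or just have ideas to share, you are welcome to join.

How to join:

Scan the QR code to join the WeChat group directly (the QR code is valid for 7 days and is refreshed regularly).
Add the Ascend Open Source Assistant to get the group link and join the MindSpeed LLM community chat group.

[QR code: MindSpeed LLM community chat group]

[QR code: Ascend Open Source Assistant]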

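Security Statement

MindSpeed LLM security statement

Disclaimer

To MindSpeed LLM users

The models provided by MindSpeed LLM are for your non-commercial use only.
MindSpeed LLM depends on third-party open-source software such as Megatron, which is provided and maintained by third-party communities. Fixes for problems caused by third-party open-source software depend on contributions and feedback from those communities. You should understand that the MindSpeed LLM repository does not guarantee fixes for problems in the third-party open-source software itself, nor does it guarantee that all vulnerabilities and errors in third-party open-source software will be tested and corrected.
For each model, the MindSpeed LLM platform only suggests, for reference, datasets that can be used for training. Huawei does not provide any datasets. If you use these datasets for training, take particular care to comply with the license of each dataset. Huawei bears no responsibility for any infringement dispute arising from your use of a dataset.
If you find any problem while using MindSpeed LLM models (including but not limited to functional or compliance problems), please submit an issue on Gitcode. We will review and resolve it promptly.

To dataset owners

If you do not want your dataset to be mentioned in connection with the models in MindSpeed LLM, or you want the description of your dataset to be updated, please submit an issue on Gitcode. We will remove or update the description of your dataset as requested in your issue. We sincerely appreciate your understanding of and contribution to MindSpeed LLM.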

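License

For the license governing use of the MindSpeed LLM product, see LICENSE.
Documentation under the docs directory of MindSpeed LLM is licensed under CC-BY 4.0. For details, see LICENSE.

Contributing

If you would like to report a problem or contribute code to MindSpeed LLM, see the contribution guide.

FAQ

For common problems encountered in basic use of the MindSpeed LLM repository, see the MindSpeed LLM FAQ. For problems the FAQ does not cover, search the repository's issue list for similar reports or submit a new issue.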

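Acknowledgements

MindSpeed LLM is jointly contributed by the following Huawei departments and Ascend ecosystem partners:

Huawei:

Computing Product Line: Ascend
Public Development Department: NAIE
Global Technical Service Department: GTS
Huawei Cloud Computing: Cloud

Ecosystem partners:

China Mobile Cloud (移动云): Dayun Zhenze (大云震泽) intelligent computing platform
Industrial and Commercial Bank of China (工商银行) Software Development Center, Big Data and Artificial Intelligence Laboratory

Thank you for every PR from the community. Contributions to MindSpeed LLM are welcome.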
