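Huawei Ascend's multimodal large-model suite for large-scale distributed training, supporting multimodal generation and multimodal understanding.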

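Introduction

MindSpeed MM is an Ascend multimodal large-model suite for large-scale distributed training. It supports training of mainstream multimodal large models and aims to provide an end-to-end multimodal training solution for Huawei Ascend chips. It includes preset mainstream models, data engineering, distributed training and acceleration, and support for pretraining, fine-tuning, post-training, and online inference tasks.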

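Roadmap

📈 The roadmap is updated continuously in the MindSpeed MM RoadMap. The community is welcome to use it to interact with the project and raise requests.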

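Join Us

To exchange development experience, share usage tips, and get project updates promptly, we have created an official MindSpeed MM WeChat group.

Whether you are already using this project or simply have ideas of your own, you are welcome to join 👋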

Showcase

Text-to-video: Wan 2.2 T2V

Prompt: Ultra HD, 4K, cinematic composition, low contrast ratio, low saturation, cool tone; The queen wears an iron crown and rides on the dragon over the city. She holds a big flag that shows:" MindSpeed MM".

Text-to-video: OpensoraPlan 1.5 T2V

Prompt: A fluffy white rabbit with soft, velvety fur and twitching pink nose sits curiously near a rustic wooden fence, surrounded by a lush garden of vibrant wildflowers and tall grasses swaying gently in the breeze. The rabbit's large, expressive eyes scan the environment, reflecting the golden hues of the setting sun. As it nibbles on a patch of clover, its ears perk up at the distant sound of chirping birds. The fence, weathered and covered in patches of moss, adds a charming, pastoral backdrop to this serene scene, capturing the essence of a peaceful countryside moment.

Prompt: A majestic Berlin tower stands tall against the night sky, its structure bathed in a mesmerizing array of vibrant lights, casting a kaleidoscope of colors across the cityscape. The tower's intricate architectural details are highlighted by the illumination, creating a stunning contrast against the deep indigo sky. As the camera pans upward, the lights shift, revealing a dynamic play of shadows and hues that dance across the tower's surface. The surrounding city lights twinkle in harmony, enhancing the tower's grandeur and creating a breathtaking visual symphony that captures the essence of Berlin's vibrant nightlife.

Text-to-image: Qwen-Image -> image editing: Flux.1-Kontext

Prompt for generation: A coffee shop entrance features a chalkboard sign reading "MindSpeed Coffee 😊 $2 per cup," with a neon light displaying "MindSpeed MM". Next to it hangs a poster showing a beautiful Chinese woman, and beneath the poster is written "Welcome to use MindSpeed MM". Ultra HD, 4K, cinematic composition. (Qwen-Image)

Prompt for editing: Change the decoration of the coffee shop to a modern style with white painting. (Flux.1-Kontext)

Understanding model: Qwen2VL

Input image for both models:

Input text for both models: Please describe the image shortly

Qwen2VL inference result: The image depicts a serene lakeside scene with a wooden dock extending into the calm waters. The dock is made of weathered wooden planks and leads to a small platform with a ladder, suggesting it is used for swimming or diving. The lake is surrounded by lush green forests and mountains in the background, creating a picturesque and tranquil setting. The sky is overcast, adding to the calm and peaceful atmosphere of the scene.

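Input text for Qwen2VL: 请用中文简短描述这张照片 (in English: "Please briefly describe this photo in Chinese")

Qwen2VL inference result: 这张图片展示了一座木制码头延伸到平静的湖面上，背景是连绵的山脉和茂密的森林。天空多云，整体色调偏冷，给人一种宁静和自然的感觉。 (in English: "The image shows a wooden dock extending onto a calm lake, with rolling mountains and dense forest in the background. The sky is cloudy and the overall tone is cool, giving a sense of tranquility and nature.")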

Version Notes

MindSpeed MM supports Ascend training hardware such as the Atlas 800T A2. The software version compatibility table is as follows:

MindSpeed MM version	MindSpeed version	Megatron version	PyTorch version	torch_npu version	CANN version	Python version
master (mainline)	master	Core 0.12.1	2.6.0, 2.7.1	In development	In development	Python3.10
2.3.0 (commercial)	2.3.0_core_r0.12.1	Core 0.12.1	2.6.0, 2.7.1	7.3.0	8.5.0	Python3.10
2.2.0 (commercial)	2.2.0_core_r0.12.1	Core 0.12.1	2.6.0, 2.7.1	7.2.0	8.3.RC1	Python3.10
2.1.0 (commercial)	2.1.0_core_r0.8.0	Core 0.8.0	2.1.0, 2.6.0	7.1.0	8.2.RC1	Python3.8, Python3.10
2.0.0 (commercial)	2.0.0_core_r0.8.0	Core 0.8.0	2.1.0	7.0.0	8.1.RC1	Python3.8, Python3.10
1.0.0 (commercial)	1.0.0_core_r0.6.0	Core 0.6.0	2.1.0	6.0.0	8.0.0	Python3.8, Python3.10

For more details, see the version compatibility table.
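
As a rough illustration of how the compatibility table can be checked programmatically, the sketch below transcribes the commercial rows into a dictionary and compares an environment against them. The dictionary layout and the helper name `check_environment` are assumptions made for this example and are not part of MindSpeed MM. The comparison is a plain string match, so a PyTorch version string with a build suffix would need to be stripped first.

```python
# Commercial rows transcribed from the compatibility table above.
# The layout of COMPAT and the helper below are illustrative assumptions,
# not an API provided by MindSpeed MM.
COMPAT = {
    "2.3.0": {"mindspeed": "2.3.0_core_r0.12.1", "megatron": "Core 0.12.1",
              "torch": ("2.6.0", "2.7.1"), "torch_npu": "7.3.0",
              "cann": "8.5.0", "python": ("3.10",)},
    "2.2.0": {"mindspeed": "2.2.0_core_r0.12.1", "megatron": "Core 0.12.1",
              "torch": ("2.6.0", "2.7.1"), "torch_npu": "7.2.0",
              "cann": "8.3.RC1", "python": ("3.10",)},
    "2.1.0": {"mindspeed": "2.1.0_core_r0.8.0", "megatron": "Core 0.8.0",
              "torch": ("2.1.0", "2.6.0"), "torch_npu": "7.1.0",
              "cann": "8.2.RC1", "python": ("3.8", "3.10")},
    "2.0.0": {"mindspeed": "2.0.0_core_r0.8.0", "megatron": "Core 0.8.0",
              "torch": ("2.1.0",), "torch_npu": "7.0.0",
              "cann": "8.1.RC1", "python": ("3.8", "3.10")},
    "1.0.0": {"mindspeed": "1.0.0_core_r0.6.0", "megatron": "Core 0.6.0",
              "torch": ("2.1.0",), "torch_npu": "6.0.0",
              "cann": "8.0.0", "python": ("3.8", "3.10")},
}


def check_environment(mm_version, torch_version, python_version):
    """Return a list of mismatches between an environment and the table."""
    row = COMPAT[mm_version]
    problems = []
    if torch_version not in row["torch"]:
        problems.append(f"PyTorch {torch_version} not in {row['torch']}")
    # Compare only major.minor of the Python version, as the table does.
    python_minor = ".".join(python_version.split(".")[:2])
    if python_minor not in row["python"]:
        problems.append(f"Python {python_minor} not in {row['python']}")
    return problems


print(check_environment("2.3.0", "2.7.1", "3.10.12"))
```

An empty list means the PyTorch and Python versions match the row for the chosen MindSpeed MM version. The torch_npu and CANN entries are kept in the dictionary for reference but are not checked here.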

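Installation

For installation instructions, see the installation guide.

Quick Start

MindSpeed MM uses the Qwen2.5-VL-3B and Wan2.1-T2V-1.3B models as examples to help developers quickly get preset models running efficiently on Ascend NPUs. For detailed steps, see the quick start guide.

Features and Models

Overview of Supported Features

Model \ Feature	TP	TP-SP	VPP	PP	CP	Distributed Optimizer	Recomputation	LoRA	RL	FSDP2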
InternVL3.5-30B							✔			✔
Qwen3-VL-8B							✔			✔
Qwen3-VL-30B							✔			✔
Wan2.2					CP (Ulysses)		✔			✔
OpenSoraPlan1.5-T2V	✔	✔					✔
Wan2.1					CP (Ulysses)	✔	✔	✔		✔
HunyuanVideo	✔	✔			CP (Ulysses)	✔	✔	✔
CogVideoX series-T2V	✔	✔			CP (Ulysses)	✔	✔	✔
CogVideoX series-I2V	✔	✔			CP (Ulysses)	✔	✔	✔
OpensoraPlan1.3-T2V	✔	✔	✔	✔	CP (Ulysses)	✔	✔
OpensoraPlan1.3-I2V	✔	✔	✔	✔	CP (Ulysses)	✔	✔
GLM-4.1V				✔		✔	✔
Qwen2VL-2B	✔	✔		✔	CP (Ulysses)	✔	✔	✔
Qwen2VL-7B	✔	✔		✔	CP (Ulysses)	✔	✔	✔
Qwen2VL-72B	✔	✔		✔	CP (Ulysses)	✔	✔	✔	DPO
Qwen2.5VL-3B	✔	✔		✔		✔	✔		GRPO
Qwen2.5VL-7B	✔	✔		✔		✔	✔		GRPO
Qwen2.5VL-32B	✔	✔		✔		✔	✔		GRPO
Qwen2.5VL-72B	✔	✔		✔		✔	✔	✔
Qwen2.5Omni-7B	✔			✔		✔		✔
Qwen3-Omni							✔			✔
InternVL3-8B	✔	✔	✔	✔	CP (Ring)	✔	✔
InternVL3-78B	✔	✔	✔	✔	CP (Ring)	✔	✔

Notes:

TP: Tensor Parallel
TP-SP: Tensor Parallel with Sequence Parallel
VPP: Virtual Pipeline Parallel
PP: Pipeline Parallel
DSP: Dynamic Sequence Parallel
CP (Ulysses): Context Parallel based on DeepSpeed Ulysses sequence parallelism
CP (Ring Attention): Context Parallel with Ring Attention (shown as CP (Ring) in the table above)
Distributed Optimizer: Zero Redundancy Optimizer (ZeRO)
Recomputation: activation recomputation, which recomputes activations during the backward pass to reduce memory use
LoRA: Low-Rank Adaptation
RL: Reinforcement Learning
FSDP2: Fully Sharded Data Parallel, second generation
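
As a rough illustration of how the parallel dimensions listed above combine: in Megatron-style training the number of devices is the product of the tensor, pipeline, context, and data parallel sizes, so the data parallel size follows from the other three. The sketch below assumes that relation and ignores other dimensions such as expert parallelism. The function name is hypothetical and this is not MindSpeed MM code.

```python
def data_parallel_size(world_size: int, tp: int = 1, pp: int = 1, cp: int = 1) -> int:
    """Derive the data parallel size, assuming world_size = TP * PP * CP * DP."""
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world size {world_size} is not divisible by TP*PP*CP = {model_parallel}"
        )
    return world_size // model_parallel


# Example: one 8-device node with TP=2 and PP=2 leaves 2 data parallel replicas.
print(data_parallel_size(8, tp=2, pp=2))
```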

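Compatible Versions and Supported Models

[Measured performance of the current version (hardware: Atlas 900 A2 PODc)]

For each supported model listed below, the model's README provides usage instructions covering the detailed training, inference, and fine-tuning workflows.

Hyperlinks in the Model column point to each model's folder; hyperlinks in the Parameters column point to the model's community resources.

In the Certification column, [Pass] marks models that have passed testing and [Test] marks models still under test.

SPS is samples per second; FPS is frames per second; TPS is tokens per second.

(Note: SPS and FPS values are cluster throughput; TPS values are per-device throughput.)

Average sequence length is the average sequence length of the dataset used during performance testing, computed as a weighted average over the frequency of each sequence length.

Affinity scenarios are cases where a small number of structural or parameter adjustments make the model better suited to Ascend and improve its performance.

A3 refers to Atlas A3 training series hardware.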
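
The frequency-weighted average sequence length described above can be sketched as follows. This is a minimal illustration of the stated definition, not the script used to produce the figures in the table; the function name and the sample lengths are made up for the example.

```python
from collections import Counter


def average_sequence_length(lengths):
    """Weighted average of sequence lengths, weighted by how often each occurs."""
    counts = Counter(lengths)  # maps sequence length -> number of occurrences
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences given")
    return sum(length * count for length, count in counts.items()) / total


# Hypothetical dataset: two samples of length 512, one of 1024, one of 2048.
print(average_sequence_length([512, 512, 1024, 2048]))
```

Weighting each distinct length by its frequency gives the same value as the plain mean over all samples; the `Counter` form simply mirrors the definition in the text.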

MindSpeed MM Model List
Model task	Model	Parameters	Task	Cluster	Precision	NPU performance	Reference performance	Average sequence length	Certification
Multimodal generation	Lumina-mGPT 2.0	7B	Fine-tuning	1x8	BF16	8.24 (SPS)	8.79 (SPS)	1024	[Pass]
	OpenSoraPlan1.5	8.5B	Pretraining	1x8	BF16	0.83 (SPS)	/	/	[Contributed by Peking University]
	Wan2.2-T2V	5B	Pretraining	1x4 (A3)	BF16	3.18 (SPS)	2.93 (SPS)	/	[Test]
	Wan2.2-T2V	A14B	Pretraining	1x8 (A3)	BF16	0.710 (SPS)	0.292 (SPS)	/	[Test]
	Wan2.2-TI2V	5B	Pretraining	1x4 (A3)	BF16	3.18 (SPS)	2.93 (SPS)	/	[Test]
	Wan2.2-I2V	A14B	Pretraining	1x8 (A3)	BF16	0.671 (SPS)	0.294 (SPS)	/	[Test]
	Wan2.1-T2V	1.3B	Pretraining	1x8	BF16	0.918 (SPS)	1.04 (SPS)	/	[Pass]
		1.3B	LoRA fine-tuning	1x8	BF16	0.954 (SPS)	1.042 (SPS)	/	[Pass]
		14B	Pretraining	1x8	BF16	0.160 (SPS)	0.160 (SPS)	/	[Pass]
		14B	LoRA fine-tuning	1x8	BF16	0.179 (SPS)	0.174 (SPS)	/	[Pass]
	Wan2.1-I2V	1.3B	Pretraining	1x8	BF16	0.76 (SPS)	/	/	[Pass]
		14B	Pretraining	1x8	BF16	0.130 (SPS)	/	/	[Pass]
		14B	LoRA fine-tuning	1x8	BF16	0.179 (SPS)	0.173 (SPS)	/	[Pass]
	HunyuanVideo-T2V	13B	Pretraining	1x8	BF16	0.171 (SPS)	0.181 (SPS)	/	[Pass]
	HunyuanVideo-I2V	13B	Pretraining	1x8	BF16	0.164 (SPS)	0.202 (SPS)	/	[Pass]
	OpenSora 1.0	5.5B	Pretraining	1x8	BF16	3.18 (SPS)	2.04 (SPS)	/	[Pass]
	OpenSora 1.2	5.2B	Pretraining	1x8	BF16	7.31 (SPS)	8.15 (SPS)	/	[Test]
	OpenSora 2.0-T2V	11B	Pretraining	1x8	BF16	1.33 (SPS)	1.46 (SPS)	/	[Pass]
	OpenSoraPlan 1.2	8.7B	Pretraining	1x8	BF16	0.42 (SPS)	0.37 (SPS)	/	[Pass]
	OpenSoraPlan 1.3-T2V	8.6B	Pretraining	1x8	BF16	1.29 (SPS)	1.27 (SPS)	/	[Pass]
	OpenSoraPlan 1.3-I2V	8.6B	Pretraining	1x8	BF16	1.17 (SPS)	1.15 (SPS)	/	[Pass]
	WFVAE	0.18B	Pretraining	1x8	BF16	23.860 (SPS)	26.091 (SPS)	/	[Pass]
	CogVideoX-T2V	5B	Pretraining	1x8	BF16	1.14 (SPS)	1.00 (SPS)	6976	[Pass]
	CogVideoX-I2V	5B	Pretraining	1x8	BF16	1.13 (SPS)	0.84 (SPS)	6976	[Pass]
	CogVideoX 1.5-T2V	5B	Pretraining	1x8	BF16	1.44 (SPS)	1.75 (SPS)	6976	[Pass]
	CogVideoX 1.5-T2V	5B	LoRA fine-tuning	1x8	BF16	2.76 (SPS)	2.64 (SPS)	/	[Pass]
	CogVideoX 1.5-I2V	5B	Pretraining	1x8	BF16	1.43 (SPS)	1.44 (SPS)	6976	[Pass]
	CogVideoX 1.5-I2V	5B	LoRA fine-tuning	1x8	BF16	2.33 (SPS)	2.04 (SPS)	/	[Pass]
	Qihoo-T2X	1.1B	Inference	1x1	BF16	/	/	/	[Contributed by Qihoo 360]
	SDXL	3.5B	Pretraining	1x8	BF16	29.92 (FPS)	30.65 (FPS)	/	[Pass]
	SDXL	3.5B	Pretraining	1x8	FP16	28.51 (FPS)	30.23 (FPS)	/	[Pass]
	SD3	2B	Full-parameter fine-tuning	1x8	BF16	16.09 (FPS)	16.01 (FPS)	/	[Pass]
	SD3.5	8.1B	Full-parameter fine-tuning	1x8	BF16	26.20 (FPS)	28.33 (FPS)	/	[Pass]
	SD3.5	8.1B	LoRA fine-tuning	1x8	FP16	47.93 (FPS)	47.95 (FPS)	/	[Pass]
	Flux	12B	Full-parameter fine-tuning	1x8	BF16	55.23 (FPS)	53.65 (FPS)	/	[Pass]
	Flux-Kontext	12B	Full-parameter fine-tuning	1x8	BF16	1.97 (FPS)	2.00 (FPS)	/	[Pass]
	Sana	1.6B	LoRA fine-tuning	1x8	BF16	28.7 (FPS)	32.8 (FPS)	/	[Pass]
	HiDream	17B	LoRA fine-tuning	1x8	BF16	18.37 (FPS)	19.61 (FPS)	/	[Pass]
	Kolors	2.6B	Inference	1x1	FP16	/	/	/	[Test]
	Qwen-Image	27B	LoRA fine-tuning	1x8	BF16	23.02 (FPS)	21.54 (FPS)	/	[Pass]
	Qwen-Image-Edit	27B	LoRA fine-tuning	1x8	BF16	20.59 (FPS)	17.47 (FPS)	/	[Test]
Multimodal understanding	GLM-4.1V	9B	Fine-tuning	1x8	BF16	1074.64 (TPS)	908.49 (TPS)	707	[Pass]
	LLaVA 1.5	7B	Full-parameter fine-tuning	1x8	BF16	3632.31 (TPS)	3757.98 (TPS)	602	[Test]
	InternVL 2.0	2B	Fine-tuning	1x8	BF16	7653.12 (TPS)	5089.99 (TPS)	1813	[Pass]
		8B	Fine-tuning	1x8	BF16	2914.39 (TPS)	2492.87 (TPS)	1813	[Pass]
		26B	Fine-tuning	1x8	BF16	750.12 (TPS)	738.79 (TPS)	1813	[Pass]
		76B	Full-parameter fine-tuning	8x16	BF16	214 (TPS)	191 (TPS)	1813	[Pass]
	InternVL 2.5	78B	Fine-tuning	8x8	BF16	228.33 (TPS)	/	1896	[Test]
	InternVL 3.0	8B	Fine-tuning	1x8	BF16	2344.58 (TPS)	2211.93 (TPS)	2653	[Pass]
	InternVL 3.0	78B	Fine-tuning	4x8 (A3)	BF16	228.82 (TPS)	283.15 (TPS)	1932	[Pass]
	InternVL 3.5	30B	Fine-tuning	1x8 (A3)	BF16	52.76 (TPS)	47.73 (TPS)	201	[Test]
	Qwen2-VL	2B	Fine-tuning	1x8	BF16	2941.17 (TPS)	3004.04 (TPS)	689	[Pass]
		7B	Fine-tuning	1x8	BF16	1143.74 (TPS)	1004.22 (TPS)	689	[Pass]
		72B	Fine-tuning	4x8 (A3)	BF16	261.25 (TPS)	257.63 (TPS)	689	[Pass]
	Qwen2.5-VL	3B	Fine-tuning	1x8	BF16	2047.19 (TPS)	1876.66 (TPS)	689	[Pass]
		7B	Fine-tuning	1x8	BF16	1620.87 (TPS)	1091.20 (TPS)	689	[Pass]
		32B	Fine-tuning	2x8	BF16	257.50 (TPS)	/	689	[Pass]
		72B	Fine-tuning	4x8 (A3)	BF16	322.96 (TPS)	256.28 (TPS)	689	[Pass]
	Qwen3-VL	8B	Fine-tuning	1x8	BF16	146.54 (TPS)	129.71 (TPS)	179	[Test]
		30B	Fine-tuning	1x8 (A3)	BF16	179.57 (TPS)	/	185	[Test]
		235B	Fine-tuning	16x8 (A3)	BF16	598.05 (TPS)	/	16116	[Test]
	Qwen2.5-Omni	7B	Fine-tuning	1x8	BF16	575.01 (TPS)	534.28 (TPS)	296	[Pass]
	Qwen3-Omni	30B	Fine-tuning	2x4 (A3)	BF16	131.3 (TPS)	16.4 (TPS)	288	[Test]
Speech recognition	Whisper	1.5B	Pretraining	1x8	BF16	93.38 (SPS)	109.23 (SPS)	/	[Test]
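
To compare the NPU performance and reference performance columns, one option is to parse each throughput cell and take the ratio. The sketch below is an illustration only. The helper names are hypothetical, a cell of "/" is treated as "no data", and the ratio is meaningful only when both cells use the same unit.

```python
import re


def parse_throughput(cell):
    """Parse a cell such as '0.918 (SPS)' into (value, unit); return None for '/'."""
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(?:\((\w+)\))?\s*", cell)
    if match is None:
        return None
    return float(match.group(1)), match.group(2)


def relative_performance(npu_cell, reference_cell):
    """Return NPU throughput divided by reference throughput, or None if unavailable."""
    npu = parse_throughput(npu_cell)
    reference = parse_throughput(reference_cell)
    if npu is None or reference is None:
        return None
    if npu[1] and reference[1] and npu[1] != reference[1]:
        raise ValueError(f"unit mismatch: {npu[1]} vs {reference[1]}")
    return npu[0] / reference[0]


# Qwen2-VL 7B fine-tuning row from the table above.
ratio = relative_performance("1143.74 (TPS)", "1004.22 (TPS)")
print(f"{ratio:.2f}")
```

A ratio above 1 means the measured NPU throughput is higher than the reference figure for that row.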

Common Parameter Reference

The parameters used when running the MindSpeed MM suite are explained in the README.

Planned Features

[New model] JanusPro
[Model feature] CogVideoX: PP
[Model feature] OpensoraPlan1.3: CP (Ring Attention)
[Model feature] Qwen2VL: VPP, CP (Ulysses & Ring Attention)
[Model feature] InternVL2: TP, CP (Ulysses & Ring Attention)
[Base feature] Hetero-parallel

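Tools

Ascend Profiling Collection Tool

MindSpeed MM integrates the Ascend profiling collection tool to support analysis of model runtime behavior. Based on its configuration, the tool collects key information such as operators and device memory. It supports both dynamic and static collection modes, which helps developers analyze model bottlenecks; choose the mode that fits your scenario.

See the profiling section of the README for details.

MindStudio Insight Performance Analysis Tool

For performance tuning of large models in cluster scenarios, we recommend MindStudio Insight, a visual tuning tool. MindStudio Insight provides visualizations including a Timeline view, communication analysis, and compute time, so that users can analyze potential performance bottlenecks and find ways to eliminate or reduce them.

For installation and usage, see the MindStudio Insight Operation Guide.

Feature Extraction for Sora-like Models

MindSpeed MM supports extracting and saving video and text features.

See the section on feature extraction for Sora-like models in the README for details.

Memory Snapshot Extraction

MindSpeed MM integrates the Ascend memory snapshot collection tool to support analysis of model runtime behavior.

See the memory snapshot extraction section of the README for details.

Using TensorBoard

MindSpeed MM supports TensorBoard.

See the TensorBoard section of the README for details.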

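Version Maintenance

MindSpeed MM versions go through the following five maintenance phases:

Status	Duration	Description
Planning	1-3 months	Plan features
Development	3 months	Develop features
Maintenance	6-12 months	Merge all resolved issues and publish releases. Different maintenance policies apply to different MindSpeed MM versions: regular versions are maintained for 6 months and long-term support versions for 12 months
Unmaintained	0-3 months	Merge all resolved issues; no dedicated maintainers and no releases
End of life (EOL)	N/A	The branch no longer accepts any changes

Maintenance policy for released MindSpeed MM versions:

MindSpeed MM version	Maintenance policy	Current status	Release date	Next status
2.3.0	Regular version	Maintenance	2025/12/30	Expected to be unmaintained from 2026/06/30
2.2.0	Regular version	Maintenance	2025/09/30	Expected to be unmaintained from 2026/03/30
2.1.0	Regular version	Unmaintained	2025/06/30	Expected to be unmaintained from 2025/12/30
2.0.0	Regular version	Unmaintained	2025/03/30	Expected to be unmaintained from 2025/09/30
1.0.0	Regular version	Unmaintained	2024/12/30	Expected to be unmaintained from 2025/06/30
1.0.RC3	Regular version	Unmaintained	2024/09/30	Expected to be unmaintained from 2025/03/30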
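
The dates in the table are consistent with adding the maintenance period (6 months for a regular version) to the release date. A minimal sketch of that arithmetic follows. The function names and the policy keys are assumptions for this example, and clamping to the last day of a shorter month is a choice made here rather than a documented project rule.

```python
import calendar
from datetime import date

# Maintenance periods stated in the phase table: 6 months for regular versions,
# 12 months for long-term support versions. The keys are illustrative names.
MAINTENANCE_MONTHS = {"regular": 6, "lts": 12}


def add_months(start, months):
    """Add whole months to a date, clamping the day to the end of the target month."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def unmaintained_from(release, policy="regular"):
    """Estimate the date on which a version leaves the maintenance phase."""
    return add_months(release, MAINTENANCE_MONTHS[policy])


# Version 2.3.0 was released on 2025/12/30 as a regular version.
print(unmaintained_from(date(2025, 12, 30)))
```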

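FAQ

Security Statement

MindSpeed MM Security Statement

Disclaimer

To MindSpeed MM users

The models provided by MindSpeed MM are for your non-commercial use only.
For each model, the MindSpeed MM platform only suggests, for reference, datasets that can be used for training. Huawei does not provide any datasets. If you use these datasets for training, take particular care to comply with the license of each dataset. Huawei bears no liability for any infringement dispute arising from your use of a dataset.
If you find any problem while using MindSpeed MM models (including but not limited to functional or compliance problems), please submit an issue on Gitcode and we will review and resolve it promptly.
The third-party open-source software that MindSpeed MM depends on, such as Megatron, is provided and maintained by third-party communities. Fixes for problems caused by third-party open-source software depend on the contributions and feedback of those communities. You should understand that the MindSpeed MM repository does not guarantee that problems in the third-party open-source software itself will be fixed, nor that all vulnerabilities and errors in third-party open-source software will be tested and corrected.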