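MindSpore Backend Support

MindSpeed MM supports MindSpore, Huawei's in-house AI framework, as a backend. The goal is an easy-to-use, end-to-end multimodal model training solution on the full Huawei stack, with better performance. The MindSpore backend provides a set of APIs that mirror PyTorch's, so users can switch backends without adapting their code.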

🔥🔥🔥Latest News

🚀🚀🚀 The MindSpore backend now supports the CogVideoX, Qwen2.5VL, Qwen2.5Omni, and Qwen3VL models.

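Version Compatibility

The dependencies for MindSpeed MM with the MindSpore backend are listed in the table below. For installation steps, see the basic installation guide.

Dependency	Version
Ascend NPU driver and firmware	In-development version
Ascend CANN	In-development version
MindSpore	2.7.2
Python	>=3.10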
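
The two pinned entries in the table (Python and MindSpore) can be checked from inside the target environment. The following is a minimal sketch, not part of MindSpeed MM: it assumes MindSpore was installed with pip under the distribution name `mindspore`, and the helper names (`check_python`, `installed_mindspore_version`, `check_environment`) are hypothetical. The driver, firmware, and CANN versions are not checked here.

```python
import sys
from importlib import metadata

REQUIRED_PYTHON = (3, 10)      # from the table: Python >= 3.10
REQUIRED_MINDSPORE = "2.7.2"   # from the table: MindSpore 2.7.2


def check_python(version_info=sys.version_info):
    """Return True if the interpreter meets the minimum Python version."""
    return tuple(version_info[:2]) >= REQUIRED_PYTHON


def installed_mindspore_version():
    """Return the installed MindSpore version string, or None if it is absent.

    Assumption: the pip distribution is named "mindspore".
    """
    try:
        return metadata.version("mindspore")
    except metadata.PackageNotFoundError:
        return None


def check_environment():
    """Collect readable problems; an empty list means both checks passed."""
    problems = []
    if not check_python():
        found_py = ".".join(str(part) for part in sys.version_info[:3])
        problems.append(f"Python >= 3.10 expected, found {found_py}")
    found_ms = installed_mindspore_version()
    if found_ms is None:
        problems.append("MindSpore is not installed")
    elif found_ms != REQUIRED_MINDSPORE:
        problems.append(
            f"MindSpore {REQUIRED_MINDSPORE} expected, found {found_ms}"
        )
    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

Running the script prints one warning per mismatch and nothing when both checks pass.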

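Environment Setup

For detailed deployment steps, see the deployment documentation.

Quick Start

For quick-start instructions, see the quick start documentation.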

Models and Features

Supported Features Overview

Model \ Feature	TP	TP-SP	PP	CP	EP	Distributed Optimizer	Recomputation
CogVideoX series (T2V)	✔	✔		CP (Ulysses)		✔	✔
CogVideoX series (I2V)	✔	✔		CP (Ulysses)		✔	✔
Qwen2.5VL-7B	✔		✔			✔
Qwen2.5VL-72B	✔		✔			✔
Qwen2.5Omni-7B			✔			✔
Qwen3VL-8B		✔	✔			✔
Qwen3VL-30B		✔	✔		✔	✔

Notes:

TP: Tensor Parallel
TP-SP: Tensor Parallel with Sequence Parallel
VPP: Virtual Pipeline Parallel
PP: Pipeline Parallel
DSP: Dynamic Sequence Parallel
CP (Ulysses): Context Parallel by leveraging DeepSpeed Ulysses with Sequence Parallel
CP (Ring Attention): Context Parallel with Ring Attention
EP: Expert Parallel
Distributed Optimizer: Zero Redundancy Optimizer (ZeRO)
Recomputation: Activation Recomputation (activations are recomputed in the backward pass to save memory)
LoRA: Low-Rank Adaptation
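
The parallel dimensions listed above share one pool of devices. As a minimal sketch, assuming the Megatron-style convention that the total device count equals TP × PP × CP × DP (DP being the data-parallel size), the function below derives DP from the other degrees. Whether the MindSpore backend enforces exactly this decomposition is an assumption, `data_parallel_size` is a hypothetical helper, and the example values are illustrative rather than recommended configurations.

```python
def data_parallel_size(world_size, tp=1, pp=1, cp=1):
    """Return the data-parallel size left after TP, PP and CP are assigned.

    Assumes world_size == tp * pp * cp * dp (Megatron-style decomposition).
    """
    if min(world_size, tp, pp, cp) < 1:
        raise ValueError("all sizes must be positive integers")
    model_parallel = tp * pp * cp
    if world_size % model_parallel != 0:
        raise ValueError(
            f"world_size={world_size} is not divisible by "
            f"tp*pp*cp={model_parallel}"
        )
    return world_size // model_parallel


# Illustrative values only: 8 devices split into TP=2 and PP=2
# leave 2 data-parallel replicas.
print(data_parallel_size(8, tp=2, pp=2))
```

A degree combination that does not divide the device count evenly raises `ValueError`, which catches an inconsistent setting before a job is launched.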

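Supported Models

Models supported by MindSpeed MM (MindSpore backend)
Task Category	Model	Parameters	Task	Cluster	Precision	Reference Performance	Average Sequence Length	Status	Supported Version
Multimodal generation	CogVideoX-T2V	5B	Pretraining	1x8	BF16	0.46 (SPS)	/	✅	2.3.0
Multimodal generation	CogVideoX-I2V	5B	Pretraining	1x8	BF16	0.46 (SPS)	/	✅	2.3.0
Multimodal generation	CogVideoX 1.5-T2V	5B	Pretraining	1x8	BF16	2.09 (SPS)	/	✅	2.3.0
Multimodal generation	CogVideoX 1.5-T2V	5B	LoRA fine-tuning	1x8	BF16	3.03 (SPS)	/	In progress
Multimodal generation	CogVideoX 1.5-I2V	5B	Pretraining	1x8	BF16	2.01 (SPS)	/	✅	2.3.0
Multimodal generation	CogVideoX 1.5-I2V	5B	LoRA fine-tuning	1x8	BF16	3.92 (SPS)	/	In progress
Multimodal generation	Wan2.1-I2V	1.3B	Pretraining	1x8	BF16	0.61 (SPS)	/	✅	2.3.0
Multimodal generation	Wan2.1-T2V	1.3B	Pretraining	1x8	BF16	0.73 (SPS)	/	✅	2.3.0
Multimodal generation	Wan2.1-T2V	1.3B	LoRA fine-tuning	1x8	BF16	/	/	In progress
Multimodal understanding	Qwen2.5-VL	3B	Fine-tuning	1x8	BF16	21.79 (SPS)	563	In progress
Multimodal understanding	Qwen2.5-VL	7B	Fine-tuning	1x8	BF16	12.67 (SPS)	563	✅	2.3.0
Multimodal understanding	Qwen2.5-VL	32B	Fine-tuning	2x8	BF16	/	563	In progress
Multimodal understanding	Qwen2.5-VL	72B	Fine-tuning	8x8	BF16	256.28 (TPS)	563	✅	2.3.0
Multimodal understanding	Qwen2.5Omni	7B	Fine-tuning	1x8	BF16	/	/	✅	2.3.0
Multimodal understanding	Qwen3VL	8B	Fine-tuning	1x8	BF16	/	/	✅	2.3.0
Multimodal understanding	Qwen3VL	30B	Fine-tuning	1x16	BF16	/	/	✅	2.3.0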
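
The cluster column uses an `AxB` notation. The sketch below assumes it reads as A nodes with B NPUs per node, so that `8x8` means 64 devices in total. That reading is an assumption rather than something this document states, and `parse_cluster` and `total_devices` are hypothetical helper names.

```python
def parse_cluster(spec):
    """Split a cluster spec such as "8x8" into (nodes, devices_per_node).

    Assumption: the first number is the node count and the second is the
    number of NPUs per node.
    """
    nodes, per_node = spec.lower().split("x")
    return int(nodes), int(per_node)


def total_devices(spec):
    """Return the total device count implied by a cluster spec."""
    nodes, per_node = parse_cluster(spec)
    return nodes * per_node


for spec in ("1x8", "2x8", "8x8", "1x16"):
    print(spec, "->", total_devices(spec), "devices")
```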

Feature Roadmap

Tools

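Ascend Profiling Collection Tool

MindSpeed MM integrates the Ascend profiling tool to analyze how a model runs. Based on its configuration, the tool collects key information such as operator activity and device memory usage, and it supports both dynamic and static collection modes. It helps developers locate model bottlenecks, and the collection mode can be chosen to fit the scenario.

For details, see the profiling section of the README.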

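MindStudio Insight Performance Analysis Tool

For performance tuning of large models in cluster scenarios, we recommend MindStudio Insight, a visual tuning tool. It provides visualizations including a timeline view, communication analysis, and compute time analysis, which help users find potential performance bottlenecks and decide how to eliminate or reduce them.

For usage, see the "MindStudio Insight Operation Guide". The tool can be downloaded from "MindStudio Insight".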