MindSpeed MM Test Case Authoring Guide

This document explains in detail how to contribute DT (developer test) cases to MindSpeed MM.

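1. Background and References

1.1 Paths Related to the CI Gate in the MindSpeed MM Repository

| Purpose | Path |
| --- | --- |
| Test cases | `MindSpeed-MM/tests` |
| CI launch code | `MindSpeed-MM/ci` |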

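1.2 Scope of the CI Gate

The CI gate guards the following two metrics:

Functionality: the code runs correctly.
Performance: performance must not degrade by more than 5%.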
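
The 5% limit can be read as a relative comparison against a baseline value. The sketch below is only an illustration of that arithmetic, not the check the CI actually runs: it assumes a lower-is-better metric such as seconds per step (this guide does not name the metric), and the function names are hypothetical.

```python
def regression_ratio(baseline: float, current: float) -> float:
    """Relative slowdown of `current` versus `baseline`.

    Assumption: the metric is lower-is-better (for example, seconds per step),
    so a positive result means the current run is slower than the baseline.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) / baseline


def within_gate(baseline: float, current: float, tolerance: float = 0.05) -> bool:
    """Return True when the slowdown does not exceed the tolerance (5% by default)."""
    return regression_ratio(baseline, current) <= tolerance


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    print(within_gate(100.0, 104.0))  # 4% slower than baseline
    print(within_gate(100.0, 106.0))  # 6% slower than baseline
```

For a higher-is-better metric such as throughput, the sign of the comparison would need to be flipped.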

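2. CI Gate Coverage List

Every PR must pass the full set of CI gate test cases before it is merged.

2.1 ST (System Test) Coverage List

Note: ST cases guard the performance metric; performance must not degrade by more than 5%.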

| Module | Features | Scripts |
| --- | --- | --- |
| Pretrain | CogVideoX T2V, TP=2, CP=2, Ulysses CP | pretrain_cogvideox_t2v_1_0.sh |
| Pretrain | CogVideoX I2V, TP=2, PP=2, CP=2, Ulysses CP | pretrain_cogvideox_i2v_1.5.sh |
| Pretrain | HunyuanVideo T2V, TP=4, CP=1 | pretrain_hunyuanvideo_t2v.sh |
| Pretrain | OpenSoraPlan 1.3, TP=2, CP=2 | pretrain_opensoraplan1_3.sh |
| Pretrain | Wan2.1 T2V, FSDP2 | pretrain_wan2.1_t2v.sh |
| Pretrain | Wan2.2 I2V, FSDP2 | pretrain_wan2.2_i2v.sh |
| Finetune | Qwen2VL 7B, TP=1, PP=4 | finetune_qwen2vl_7B.sh |
| Finetune | Qwen2.5VL 7B, TP=2, PP=2 | finetune_qwen2_5_vl_7b.sh |
| Finetune | DeepSeekVL2, TP=2, PP=2 | finetune_deepseekvl2.sh |
| Finetune | Qwen3Omni, FSDP2 | finetune_qwen3omni.sh |
| Finetune | Qwen3VL 30B, FSDP2 | finetune_qwen3vl_30B.sh |
| Posttrain | Qwen2VL, DPO, TP=2, PP=4 | posttrain_qwen2vl_dpo.sh |
| Inference | Qwen2VL 7B, PP=1 | inference_qwen2vl_7b_pp1.sh |
| Inference | Qwen2VL 7B, PP=4 | inference_qwen2vl_7b_pp4.sh |
| Inference | CogVideoX T2V 1.5 | inference_cogvideox_t2v_1.5.sh |
| Inference | InternVL2.5 | inference_internvl2_5.sh |
| Inference | Wan2.2 T2V, CP=2 | inference_wan2.2_t2v.sh |

2.2 UT (Unit Test) Coverage List

Note: UT cases guard the functionality metric, ensuring that the code runs correctly.

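| Module | Features | Scripts |
| --- | --- | --- |
| Loss | Chunk Loss | test_chunkloss.py |
| Tools | Profiler (performance analysis tool) | test_profiler.py |
| Data | Data utility functions | test_utils.py |
| Data | Multimodal data processing plugin | test_mm_plugin.py |
| Models - Vision | Vision RoPE index computation (Qwen2VL) | test_qwen2vl_get_rope_index.py |
| Models - Vision | Vision RoPE index computation (Qwen2.5VL) | test_qwen2_5vl_get_rope_index.py |
| Models - Vision | Vision RoPE index computation (Qwen2.5Omni) | test_qwen2_5_omni_get_rope_index.py |
| Models - Vision | Vision RoPE Processor (Qwen2VL) | test_qwen2vl_rope_processor.py |
| Models - Transformers | Attention Utils (Qwen3VL) | test_attention_utils.py |
| Models - Transformers | Attention Modules (Qwen3Omni) | test_attention_modules.py |
| Models - Text Encoder | Text encoder processing | test_text_encoder_processor.py |
| Models - Text Encoder | Tokenizer processing | test_tokenzier_processor.py |
| Models - Audio Encoder | Audio encoder processing | test_audio_encoder_processor.py |
| Models - AE | AutoEncoder processing | test_ae_processor.py |
| Models - Diffusion | IDDPM Scheduler | test_iddpm.py |
| Models - Diffusion | Diffusers Scheduler | test_diffusers_scheduler.py |
| Models - Diffusion | Wan Flow Match Scheduler | test_wan_flow_match_scheduler.py |
| Models - Diffusion | CogVideoX diffusion model | test_cogvideo_diffusion.py |
| Models - Diffusion | Hunyuan I2V diffusion model | test_hunyuan_i2v_diffusion.py |
| Models - Common | Activation functions | test_activations.py |
| Models - Common | Attention mechanism | test_attention.py |
| Models - Common | Unaligned split | test_unaligned_split.py |
| Models - Common | Positional embeddings | test_pos_embeddings.py |
| Models - Common | CogVideoX positional embeddings | test_cogvideox_pos_emb.py |
| Tasks | Sora GRPO Trainer | test_sora_grpo_trainer.py |
| Tasks | Flux GRPO Trainer | test_flux_grpo_trainer.py |
| Checkpoint | Weight conversion | test_weight_convert.py |
| Checkpoint | Encoder Balance Comm | test_encoder_balance_comm.py |
| Checkpoint | MoE Expert Weight Convert | test_moe_expert_weight_convert.py |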

3. Development Workflow

flowchart LR
    A[Requirements analysis] --> B[Test case design]
    B --> C[Code development]
    C --> D[Local verification]
    D --> E[CI gate]
    E --> F[PR review]
    F --> G[Merge]

4. Development Conventions

4.1 Naming Conventions

4.1.1 ST Case Naming Rules

| Test type | Naming rule | Example |
| --- | --- | --- |
| pretrain | `pretrain_` + model name + `.sh` | `pretrain_cogvideox_t2v_1_0.sh` |
| finetune | `finetune_` + model name + `.sh` | `finetune_qwen2vl_7B.sh` |
| posttrain | `posttrain_` + model name + `_` + task type + `.sh` | `posttrain_qwen2vl_dpo.sh` |
| inference | `inference_` + model name + `.sh` | `inference_qwen2vl_7b_pp1.sh` |
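
The naming rules above lend themselves to a quick mechanical check before a script is submitted. The following is a minimal sketch under stated assumptions: the function name `classify_st_script` is hypothetical, and the set of characters allowed in a model name (letters, digits, `_`, `.`) is inferred from the script names listed in this guide rather than from any stated rule.

```python
import re
from typing import Optional

# Prefix, then a model-name part, then the .sh extension.
# Assumption: model names use only letters, digits, "_" and ".".
_ST_NAME = re.compile(
    r"^(pretrain|finetune|posttrain|inference)_([A-Za-z0-9][A-Za-z0-9_.]*)\.sh$"
)


def classify_st_script(filename: str) -> Optional[str]:
    """Return the test type if `filename` follows the ST naming rule, else None."""
    match = _ST_NAME.match(filename)
    if match is None:
        return None
    test_type, rest = match.group(1), match.group(2)
    # posttrain scripts need both a model name and a task type, separated by "_".
    if test_type == "posttrain" and "_" not in rest.strip("_"):
        return None
    return test_type


if __name__ == "__main__":
    for name in ("pretrain_wan2.1_t2v.sh", "posttrain_qwen2vl_dpo.sh", "run_model.sh"):
        print(name, "->", classify_st_script(name))
```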

4.1.2 UT Case Naming Rules

`test_` + the target file name, or the name of the feature or function under test

Example: `test_chunkloss.py`

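4.2 Test Case Conventions

4.2.1 ST Case Requirements

Environment configuration: the CI servers run on NPU hardware, so the NPU environment variables must be set correctly.
Data shuffling must be disabled: multimodal training cases must turn off data shuffling so that results are reproducible.
Reduced-layer runs: to save resources while keeping the test meaningful, the model must run with a reduced number of layers, but the layer count must not be set so low that performance fluctuates excessively.
Baseline data: each ST case needs an accompanying baseline data file, placed in the `st/baseline_results/` directory and named `${script_name}.json`.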
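
The baseline-file convention in the last requirement can be sketched as a small path helper. This is an illustration only: the helper names are hypothetical, and it assumes that `${script_name}` means the script's file name without the `.sh` extension and that the `st/` directory sits under `tests/`. The structure of the JSON content is not described in this guide, so the loader simply returns whatever the file contains.

```python
import json
from pathlib import Path


def baseline_path(script: str, tests_root: str = "tests") -> Path:
    """Map an ST script to its expected baseline file.

    Assumption: ${script_name} is the script file name with ".sh" removed.
    """
    script_name = Path(script).stem  # strips only the final ".sh" suffix
    return Path(tests_root) / "st" / "baseline_results" / f"{script_name}.json"


def load_baseline(script: str, tests_root: str = "tests"):
    """Load the baseline JSON for a script; raises FileNotFoundError if it is missing."""
    with baseline_path(script, tests_root).open(encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    print(baseline_path("st/shell_scripts/pretrain_wan2.1_t2v.sh").as_posix())
```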

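4.2.2 UT Case Requirements

Coding style: must be consistent with the existing UT cases.
Naming: every case uses `test` as its name prefix.
Directory layout: organizing folders by feature is recommended.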
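
As a rough illustration of the `test` prefix requirement, the sketch below shows what a small UT file could look like. It is not taken from the repository: the file path, the helper `split_evenly`, and the test names are all hypothetical, and a real UT would import the function under test from the MindSpeed MM package instead of defining it inline. Existing UT cases remain the reference for style.

```python
# Hypothetical file: tests/ut/data/test_example_split.py


def split_evenly(total: int, parts: int) -> list:
    """Hypothetical function under test: split `total` into `parts` near-equal sizes."""
    if parts <= 0:
        raise ValueError("parts must be positive")
    base, extra = divmod(total, parts)
    # The first `extra` chunks receive one additional element.
    return [base + 1 if i < extra else base for i in range(parts)]


def test_split_evenly_preserves_total():
    assert sum(split_evenly(10, 3)) == 10


def test_split_evenly_sizes_differ_by_at_most_one():
    sizes = split_evenly(10, 3)
    assert max(sizes) - min(sizes) <= 1


def test_split_evenly_rejects_non_positive_parts():
    try:
        split_evenly(10, 0)
    except ValueError:
        return
    raise AssertionError("expected ValueError for parts=0")
```

Because every function name starts with `test`, pytest's default discovery rules would collect all three cases from a file named this way.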

4.2.3 CI Gate Time Limit

The entire CI gate run must take less than 40 minutes.
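
When adding a case, it can help to estimate whether the gate still fits in this limit. The sketch below is a simple illustration with hypothetical names and numbers; it assumes the cases run one after another so that their durations add up, which this guide does not state.

```python
CI_BUDGET_SECONDS = 40 * 60  # the 40-minute limit stated above


def fits_ci_budget(existing_seconds, new_case_seconds, budget=CI_BUDGET_SECONDS):
    """Return True if the total run time stays strictly below the budget.

    Assumption: cases run sequentially, so wall time is the sum of durations.
    """
    return sum(existing_seconds) + new_case_seconds < budget


if __name__ == "__main__":
    # Hypothetical durations in seconds, for illustration only.
    print(fits_ci_budget([1200, 900], 121))
    print(fits_ci_budget([1200, 1100], 121))
```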

4.2.4 Resource Path Conventions

| Resource type | Path |
| --- | --- |
| Model weights | `/home/ci_resource/models` |
| Datasets | `/home/ci_resource/data` |
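
A case that depends on these locations can verify them up front and fail with a clear message. The following is a minimal sketch: the two root paths come from the table above, while the helper name and the subdirectory names in the usage example are hypothetical.

```python
from pathlib import Path

# Roots taken from the resource path table above.
MODEL_ROOT = Path("/home/ci_resource/models")
DATA_ROOT = Path("/home/ci_resource/data")


def missing_resources(paths):
    """Return the subset of `paths` that does not exist on this machine."""
    return [str(p) for p in paths if not Path(p).exists()]


if __name__ == "__main__":
    # "example_model" and "example_dataset" are placeholder names.
    required = [MODEL_ROOT / "example_model", DATA_ROOT / "example_dataset"]
    missing = missing_resources(required)
    if missing:
        print("Missing CI resources:", ", ".join(missing))
    else:
        print("All required resources are present.")
```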

5. Appendix

5.1 Directory Structure

tests/
├── README.md                        # This document
├── conftest.py                      # Global pytest configuration
├── st/                              # System test cases
│   ├── shell_scripts/               # ST scripts
│   │   ├── pretrain_*.sh            # Pretraining cases
│   │   ├── finetune_*.sh            # Fine-tuning cases
│   │   ├── posttrain_*.sh           # Post-training cases
│   │   └── inference_*.sh           # Inference cases
│   ├── run_configs/                 # Case configuration files
│   ├── baseline_results/            # Baseline data
│   ├── st_run.sh                    # ST entry point
│   └── local_st_run.sh              # Script for running ST locally
└── ut/                              # Unit test cases
    ├── loss/                        # Loss UTs
    ├── tools/                       # Tool UTs
    ├── data/                        # Data processing UTs
    ├── models/                      # Model UTs
    │   ├── vision/                  # Vision model UTs
    │   ├── transformers/            # Transformer UTs
    │   ├── text_encoder/            # Text encoder UTs
    │   ├── audio_encoder/           # Audio encoder UTs
    │   ├── ae/                      # Autoencoder UTs
    │   ├── diffusion/               # Diffusion model UTs
    │   └── common/                  # Common module UTs
    ├── tasks/                       # Task UTs
    └── test_*.py                    # UTs in the ut/ root