ascend-robot[Docs] Modify current repository URLs to relative paths

文件	最后提交记录	最后更新时间
README.md	[Docs] Modify current repository URLs to relative paths Co-authored-by: AZe_404<wangze62@h-partners.com> # message auto-generated for no-merge-commit merge: !2360 merge chg_branch_2600 into 26.0.0 [Docs] Modify current repository URLs to relative paths Created-by: AZe_404 Commit-by: AZe_404 Merged-by: ascend-robot Description: ## What this PR does / why we need it? 1. 拉取代码修改为拉取26.0.0分支，包括之前未指定版本的MindSpeed Core 2. 将MM仓库内的链接修改为相对路径访问 ## Does this PR introduce any user-facing change? Please describe whether the PR will result in any user-facing usage changes. If there is related documentation, please specify its path. ## How was this patch tested? Please explain how to verify the correctness and effectiveness of this feature, as well as its usage constraints and limitations. See merge request: Ascend/MindSpeed-MM!2360	1 个月前
bf16_accelerate_config.yaml	!1485 [Feature]add qwen image in diffusers Merge pull request !1485 from OrangeSoda/add_qwen_image	8 个月前
finetune_qwen_image_dreambooth_lora_deepspeed_bf16.sh	[Docs] Document corrections Co-authored-by: js1234567<jiangshuo9@h-partners.com> # message auto-generated for no-merge-commit merge: !2108 merge master into master [Docs] Document corrections Created-by: js1234567 Commit-by: js1234567 Merged-by: ascend-robot Description: ## Motivation Document corrections: 1. 添加2.3.0配套信息 2. 中英文标点问题 3. 链接版本更新 4. CANN8.5.0版本配置环境变量刷新, 涉及环境变量设置需全面排查修改 ## Modification Readme.md ## Self-test (Optional) If modifications to this PR may cause/fix function/accuracy/performance DTSs/issues, a self-inspection record needs to be attached. ## BC-breaking (Optional) If there are compatibility issues, such as dependencies on cann/torch_npu versions, they need to be explained in the PR. ## Checklist Before PR: - [x] The new code needs to comply with the Clean Code specification. - [x] The PR content is self-checked, and the expression can be clear and the writing standardized After PR: - [ ] CLA has been signed and all committers have signed the CLA in this PR. - [ ] The ci-pipeline is passed, Code Check is passed. See merge request: Ascend/MindSpeed-MM!2108	3 个月前
infer_qwen_image_text2img_bf16.py	!1485 [Feature]add qwen image in diffusers Merge pull request !1485 from OrangeSoda/add_qwen_image	8 个月前
infer_qwen_image_text2img_lora_bf16.py	[Docs]Supplement the discussion of readme based on feedback from testers Co-authored-by: peng-hengduo<penghengduo@huawei.com> # message auto-generated for no-merge-commit merge: !1696 merge update_qwenimage_q4 into master [Docs]Supplement the discussion of readme based on feedback from testers Created-by: peng-hengduo Commit-by: peng-hengduo Merged-by: ascend-robot Description: ## Motivation Some descriptions in the README were revised based on the issues raised by Q4 testers, along with related questions. The main changes were made to the README.md file. ## Modification Some descriptions in the README were revised based on the issues raised by Q4 testers, along with related questions. The main changes were made to the README.md file. ## Self-test (Optional) If modifications to this PR may cause/fix function/accuracy/performance DTSs/issues, a self-inspection record needs to be attached. ## BC-breaking (Optional) If there are compatibility issues, such as dependencies on cann/torch_npu versions, they need to be explained in the PR. ## Checklist Before PR: - [ ] The new code needs to comply with the Clean Code specification. - [ ] The PR content is self-checked, and the expression can be clear and the writing standardized After PR: - [ ] CLA has been signed and all committers have signed the CLA in this PR. - [ ] The ci-pipeline is passed, Code Check is passed. See merge request: Ascend/MindSpeed-MM!1696	6 个月前
requirements_qwen_image.txt	[Docs] peft version fix Co-authored-by: js1234567<jiangshuo9@h-partners.com> # message auto-generated for no-merge-commit merge: !1768 merge master into master [Docs] peft version fix Created-by: js1234567 Commit-by: js1234567 Merged-by: ascend-robot Description: ## Motivation peft version fix ## Modification Readme requirement.txt ## Self-test (Optional) If modifications to this PR may cause/fix function/accuracy/performance DTSs/issues, a self-inspection record needs to be attached. ## BC-breaking (Optional) If there are compatibility issues, such as dependencies on cann/torch_npu versions, they need to be explained in the PR. ## Checklist Before PR: - [x] The new code needs to comply with the Clean Code specification. - [x] The PR content is self-checked, and the expression can be clear and the writing standardized After PR: - [x] CLA has been signed and all committers have signed the CLA in this PR. - [x] The ci-pipeline is passed, Code Check is passed. See merge request: Ascend/MindSpeed-MM!1768	6 个月前

Diffusers

Qwen Image
引用
- 公网地址说明

Qwen Image

模型介绍

Qwen Image是基于 MMDiT 扩散骨干与 Qwen2.5-VL 文本编码器构建的多模态图像生成模型，其核心结构创新在于采用 MSRoPE 多模态位置编码解决文本与图像位置混淆问题；功能上以卓越的多语言文本渲染（尤其中文）和精准图像编辑为特色，同时具备强大的通用图像生成能力。

版本说明

参考实现

url=https://github.com/huggingface/diffusers
commit_id=7a2b78bf0f788d311cc96b61e660a8e13e3b1e63

变更记录

2025.09.08：首次发布Qwen Image

微调

环境搭建

【模型开发时推荐使用配套的环境版本】

请参考安装指南

软件与驱动安装

# 创建并激活python环境，安装 torch 和 torch_npu，请参考上述安装指南

# 将shell脚本中的环境变量路径修改为真实路径，下面为参考路径
source /usr/local/Ascend/cann/set_env.sh

克隆仓库到本地服务器

git clone --branch 26.0.0 https://gitcode.com/Ascend/MindSpeed-MM.git

模型搭建

3.1 【下载 Qwen Image GitHub参考实现在模型根目录下执行以下命令，安装模型对应PyTorch版本需要的依赖】

git clone https://github.com/huggingface/diffusers.git
cd diffusers
git checkout 7a2b78b
cp -r ../MindSpeed-MM/examples/diffusers/qwen_image/* ./examples/dreambooth

【主要代码路径】

code_path=examples/dreambooth/

3.2【安装其余依赖库】

pip install -e .
pip install -r examples/dreambooth/requirements_qwen_image.txt # 安装对应依赖

微调

【准备微调数据集】
- 用户需自行获取并解压pokemon-blip-captions数据集，并在以下启动shell脚本中将dataset_name参数设置为本地数据集的绝对路径
```
dataset_name="pokemon-blip-captions" # 数据集 路径
```
- pokemon-blip-captions数据集格式如下:
```
pokemon-blip-captions
├── dataset_infos.json
├── README.md
└── data
      └── train-001.parquet
```
说明： 该数据集的训练过程脚本只作为一种参考示例。
- 如用自己的微调数据集，需在shell脚本中修改dataset_name：
```
dataset_name="/path/customized_datasets" # 数据集路径
```
在shell脚本accelerate launch目录下（40行左右）将修改 dataset_name=$dataset_name，并将instance_prompt改为与自己数据集所匹配的prompt，caption_column修改为数据集匹配名称:
```
# Example
accelerate launch --config_file $config_file \
  ./examples/dreambooth/train_dreambooth_lora_qwen_image.py \
  --pretrained_model_name_or_path=$model_name  \
  --dataset_name=$dataset_name \
  --caption_column="text" \
  --instance_prompt="a photo of pokemon" \
```
【配置 Lora 微调脚本】

联网情况下，微调模型可通过以下步骤下载。无网络时，用户可访问huggingface官网自行下载Qwen Image模型 model_name模型
```
model_name="Qwen/Qwen-Image" # 预训练模型路径
```
获取对应的微调模型后，在以下shell启动微调脚本中将model_name参数设置为本地预训练模型绝对路径。若需要，可根据deepspeed分布式训练配置修改accelerate中的配置，即bf16_accelerate_config.yaml。将config_file参数设置为该yaml文件的绝对路径。
```
model_name="Qwen/Qwen-Image"
dataset_name="pokemon-blip-captions"
batch_size=8
num_processors=8
max_train_steps=5000
mixed_precision="bf16"
resolution=512
gradient_accumulation_steps=1
config_file="./examples/dreambooth/bf16_accelerate_config.yaml"

# accelerate launch --config_file $config_file \ 目录下
--dataloader_num_workers=0 \ # 请基于系统配置与数据大小进行调整num workers
```

【修改代码文件】

【Optional】如机器未联网，需对save_model_card进行修改：将save_model_card删除或者放到args.push_to_hub目录下：

validation_prompt = args.validation_prompt if args.validation_prompt else args.final_validation_prompt # 原代码

if args.push_to_hub:
    save_model_card(
        (args.hub_model_id or Path(args.output_dir).name) if not args.push_to_hub else repo_id,
        images=images,
        base_model=args.pretrained_model_name_or_path,
        instance_prompt=args.instance_prompt,
        validation_prompt=validation_prompt,
        repo_folder=args.output_dir,
    )
    upload_folder(
        repo_id=repo_id,
        folder_path=args.output_dir,
        commit_message="End of training",
        ignore_patterns=["step_*", "epoch_*"],
    ) # 原代码

【Optional】多机运行

修改config文件

vim ./examples/dreambooth/bf16_accelerate_config.yaml

将文件中的deepspeed_multinode_launcher, main_process_ip, 以及main_process_port消除注释而进行使用。

    zero_stage: 2
    deepspeed_multinode_launcher: standard
  main_process_ip: localhost  # 主节点IP
  main_process_port: 6000     # 主节点port
  machine_rank: 0             # 当前机器的rank
  num_machines: 1             # 总共的机器数
  num_processes: 8            # 总共的卡数

【启动 Qwen Image 微调脚本】

本任务主要提供dreambooth_lora_qwen_image微调脚本，支持多卡训练。

启动Qwen Image dreambooth_lora微调脚本
```
bash examples/dreambooth/finetune_qwen_image_dreambooth_lora_deepspeed_bf16.sh
```

推理

环境搭建及运行

同微调对应章节

【DREAMBOOTH微调Qwen Image模型推理】

vim ./examples/dreambooth/infer_qwen_image_text2img_bf16.py # 进入运行推理的Python文件

修改路径

model_name="Qwen/Qwen-Image" # Qwen Image pretrained model
output_path = "./infer_result"  # Inference result output folder

按需修改prompt

prompt = "一片森林中，一只可爱的小鹿在俯身喝水，小鹿旁边有一块木板，写着'MindSpeed-MM'的字"

运行代码

python examples/dreambooth/infer_qwen_image_text2img_bf16.py

【lora微调Qwen Image模型推理】

vim ./examples/dreambooth/infer_qwen_image_text2img_lora_bf16.py

修改路径

model_name="Qwen/Qwen-Image" # Qwen Image pretrained model
lora_path = "qwen_image_lora" # Folder containing trained lora weights
output_path = "./infer_result"  # Inference result output folder

按需修改prompt

prompt = "一片森林中，一只可爱的小鹿在俯身喝水，小鹿旁边有一块木板，写着'MindSpeed-MM'的字"

运行代码

python examples/dreambooth/infer_qwen_image_text2img_lora_bf16.py

环境变量声明

环境变量	描述	取值说明
`ASCEND_SLOG_PRINT_TO_STDOUT`	是否开启日志打印	`0`: 关闭日志打屏 `1`: 开启日志打屏
`ASCEND_GLOBAL_LOG_LEVEL`	设置应用类日志的日志级别及各模块日志级别，仅支持调试日志	`0`: 对应DEBUG级别 `1`: 对应INFO级别 `2`: 对应WARNING级别 `3`: 对应ERROR级别 `4`: 对应NULL级别，不输出日志
`TASK_QUEUE_ENABLE`	用于控制开启task_queue算子下发队列优化的等级	`0`: 关闭 `1`: 开启Level 1优化 `2`: 开启Level 2优化
`COMBINED_ENABLE`	设置combined标志。设置为0表示关闭此功能；设置为1表示开启，用于优化非连续两个算子组合类场景	`0`: 关闭 `1`: 开启
`CPU_AFFINITY_CONF`	控制CPU端算子任务的处理器亲和性，即设定任务绑核	设置`0`或未设置: 表示不启用绑核功能 `1`: 表示开启粗粒度绑核 `2`: 表示开启细粒度绑核
`HCCL_CONNECT_TIMEOUT`	用于限制不同设备之间socket建链过程的超时等待时间	需要配置为整数，取值范围`[120,7200]`，默认值为`120`，单位`s`
`PYTORCH_NPU_ALLOC_CONF`	控制缓存分配器行为	`expandable_segments:<value>`: 使能内存池扩展段功能，即虚拟内存特征
`HCCL_EXEC_TIMEOUT`	控制设备间执行时同步等待的时间，在该配置时间内各设备进程等待其他设备执行通信同步	需要配置为整数，取值范围`[68,17340]`，默认值为`1800`，单位`s`
`ACLNN_CACHE_LIMIT`	配置单算子执行API在Host侧缓存的算子信息条目个数	需要配置为整数，取值范围`[1, 10,000,000]`，默认值为`10000`
`TOKENIZERS_PARALLELISM`	用于控制Hugging Face的transformers库中的分词器（tokenizer）在多线程环境下的行为	`False`: 禁用并行分词 `True`: 开启并行分词
`OMP_NUM_THREADS`	设置执行期间使用的线程数	需要配置为整数

引用

公网地址说明

代码涉及公网地址参考公网地址