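File | Last commit | Last updated
!244 [Feature] Add qwen2vl data module and training script. Merge pull request !244 from pjgao/master. 1 year ago
!599 [Feature] Add VAE training script and configuration files. Merge pull request !599 from zs-Derrick/master. 1 year ago
!729 [Security] Security hardening of the loading functionality. Merge pull request !729 from htwang/master. 1 year ago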
[Docs] Annotation Standardization Rectification. Co-authored-by: LKONE <wanglikai4@huawei.com>. Merge request !2010 (master into master); created by wanglikai1019, committed by LKONE, merged by ascend-robot. Motivation: rectify the annotations in the code repository in accordance with the annotation specification. Modification: (1) replace Chinese annotations with English ones; (2) organize the environment variable documentation and add explanations in the startup scripts (currently implemented for Qwen3VL and Wan2.2); (3) add annotations to the public dataset functions and public model classes of Wan2.2 and Qwen3VL; (4) adjust the annotation format (e.g., indentation). See merge request: Ascend/MindSpeed-MM!2010. 4 months ago
[Feature] Bagel Dataset. Co-authored-by: ningmengliu <liuhao438@huawei.com>. Merge request !1845 (master into master); created by ningmenglh, committed by ningmengliu, merged by ascend-robot. Motivation and modification: Bagel scripts and dataset. See merge request: Ascend/MindSpeed-MM!1845. 5 months ago
!1222 [Feature] Wan DPO supports PP. Merge pull request !1222 from meng-coding/master. 10 months ago
[Docs] Annotation Standardization Rectification. Co-authored-by: LKONE <wanglikai4@huawei.com>. Merge request !2010 (master into master); created by wanglikai1019, committed by LKONE, merged by ascend-robot. Motivation: rectify the annotations in the code repository in accordance with the annotation specification. Modification: (1) replace Chinese annotations with English ones; (2) organize the environment variable documentation and add explanations in the startup scripts (currently implemented for Qwen3VL and Wan2.2); (3) add annotations to the public dataset functions and public model classes of Wan2.2 and Qwen3VL; (4) adjust the annotation format (e.g., indentation). See merge request: Ascend/MindSpeed-MM!2010. 4 months ago
[Bugfix] Resolve multiple issues: unused code, index out of bounds, undefined variables, resource leaks. Co-authored-by: zhangxubin <1656631289@qq.com>. Merge request !1662 (master into master); created by MoCuishle-M, committed by zhangxubin, merged by ascend-robot. Motivation: fix some security issues. Modification: (1) removed unused code and fixed logic errors; (2) fixed array out-of-bounds access; (3) fixed usage of undefined variables; (4) fixed resource leaks by ensuring proper release. Checklist: all items checked (Clean Code compliance, PR self-check, CLA signed by all committers, ci-pipeline and Code Check passed). See merge request: Ascend/MindSpeed-MM!1662. 6 months ago
[Feature] Fix slow parsing of JSON files in large datasets. Co-authored-by: feng0w0 <houyufeng4@huawei.com>. Merge request !1944 (master into master); created by feng0w0, committed by feng0w0, merged by ascend-robot. Motivation: (1) JSON parsing is very slow when processing large datasets; (2) the parsed JSON data contains keys that are never used during training. Modification: (1) replace pandas with orjson as the library used to parse a single JSON file; (2) use multiple processes to speed up the processing of multiple JSON files, with shared memory to reduce data transfer time between processes (set in data.json: dataset_param.basic_parameters.use_multiprocess); (3) retain only the specified keys while parsing JSON files (set in data.json: dataset_param.basic_parameters.reserved_keys). See merge request: Ascend/MindSpeed-MM!1944. 4 months ago
!1321 [Bugfix] Import dependency isolation. Merge pull request !1321 from zs-Derrick/master. 10 months ago
feat(torch): Support text-only pretraining. Co-authored-by: yaoyaoxu <xuyaoyao.824404@huawei.com>. Merge request !2261 (pretrain_data_preprocess into master); created by yaoyaoxu, committed by yaoyaoxu, merged by ascend-robot. What this PR does: (1) supports text-only pretraining with pure FSDP; (2) supports text-only pretraining with the dual Megatron + FSDP backend; (3) provides documentation for the pretraining feature. See merge request: Ascend/MindSpeed-MM!2261. 2 months ago
!16 [Feature] Add T2VDataset to the dataset module. Merge pull request !16 from bigdog1206/t2v_dataset_240727. 1 year ago
[Feature] wan2.2 t2v A14B. Co-authored-by: 林明哲 <linmingzhe3@huawei.com>. Merge of 0918wan22FSDP into master; created by gcw_lpJGTC0j, committed by 林明哲, merged by ascend-robot. Motivation: add wan2.2 cfg and shell scripts with FSDP2. Modification: wan2.2 cfg and shell scripts. See merge request: Ascend/MindSpeed-MM!1522. 8 months ago
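
The JSON-loading change described in merge request !1944 above (orjson parsing, per-file parallelism, key filtering) can be illustrated roughly as follows. This is only a sketch under stated assumptions, not the repository's implementation: the function names `load_json_filtered` and `load_many` are hypothetical, each file is assumed to hold a JSON array of objects, the shared-memory transfer mentioned in the PR is omitted, and the standard-library `json` module is used as a fallback when orjson is not installed.

```python
import json
from concurrent.futures import ProcessPoolExecutor
from functools import partial
from pathlib import Path

try:
    import orjson  # third-party parser named in the PR
    _loads = orjson.loads
except ImportError:
    _loads = json.loads  # stdlib fallback so the sketch still runs


def load_json_filtered(path, reserved_keys):
    """Parse one JSON file (assumed: a list of objects), keeping only reserved_keys."""
    records = _loads(Path(path).read_bytes())
    keep = set(reserved_keys)
    return [{k: v for k, v in rec.items() if k in keep} for rec in records]


def load_many(paths, reserved_keys, use_multiprocess=True, max_workers=None):
    """Load several JSON files, optionally one file per worker process.

    Hypothetical helper: the parameter names mirror the data.json options
    mentioned in the PR, but the signature itself is an assumption.
    """
    worker = partial(load_json_filtered, reserved_keys=tuple(reserved_keys))
    if use_multiprocess and len(paths) > 1:
        with ProcessPoolExecutor(max_workers=max_workers) as pool:
            per_file = list(pool.map(worker, paths))  # map preserves input order
    else:
        per_file = [worker(p) for p in paths]
    # Flatten the per-file lists into a single list of records.
    return [rec for recs in per_file for rec in recs]
```

Dropping unused keys at parse time reduces both memory use and the amount of data pickled back from worker processes, which is presumably why the PR pairs key filtering with multiprocessing. When calling `load_many` with `use_multiprocess=True` from a script, the call should sit under an `if __name__ == "__main__":` guard on platforms that spawn worker processes.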