MindSpeed-LLM/mindspeed_llm/features_manager/layerwise_disaggregated_training · Ascend/MindSpeed-LLM - AtomGit

ascend-robot: feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training

78212c65 created 13 days ago

File	Last commit	Last updated
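__init__.py	[pytorch][feature] model fine-tuning supports U-shaped split learning. Co-authored-by: xuguoliang3 <xuguoliang3@huawei.com>, fangminghao <fangminghao@huawei.com>. Merge request !4248 (20260225_perf into master). Created-by: xuguoliang3; Commit-by: xuguoliang3, fangminghao; Merged-by: ascend-robot. Description: 1. Support deploying the model's first and last layers on the same device (non-uniform PP splits are supported). 2. Add a weight conversion script so that weights are split and loaded correctly. 3. Build the communication groups correctly so that activations and gradients propagate and training runs correctly. 4. Only the first pipeline stage processes training samples; intermediate hidden layers do not read the raw data. 5. Support overlapping computation with communication, keeping the compute-efficiency loss below 3% in the remote (long-distance) convergence scenario. See merge request: Ascend/MindSpeed-LLM!4248	2 months ago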
u_shaped_split_feature.py	feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training. Co-authored-by: yanzhenghao <yanzhenghao2@huawei.com>, xuguoliang3 <xuguoliang3@huawei.com>, fangminghao <fangminghao@huawei.com>. Merge request !4437 (20260409_vdp into master). Created-by: xuguoliang3; Commit-by: xuguoliang3, yanzhenghao, fangminghao; Merged-by: ascend-robot. See merge request: Ascend/MindSpeed-LLM!4437	13 days ago
vdp_feature.py	feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training. Co-authored-by: yanzhenghao <yanzhenghao2@huawei.com>, xuguoliang3 <xuguoliang3@huawei.com>, fangminghao <fangminghao@huawei.com>. Merge request !4437 (20260409_vdp into master). Created-by: xuguoliang3; Commit-by: xuguoliang3, yanzhenghao, fangminghao; Merged-by: ascend-robot. See merge request: Ascend/MindSpeed-LLM!4437	13 days ago
vtp_feature.py	feat(pytorch): support different DP or DP&TP configuration on edge and cloud for layerwise_disaggregated_training. Co-authored-by: yanzhenghao <yanzhenghao2@huawei.com>, xuguoliang3 <xuguoliang3@huawei.com>, fangminghao <fangminghao@huawei.com>. Merge request !4437 (20260409_vdp into master). Created-by: xuguoliang3; Commit-by: xuguoliang3, yanzhenghao, fangminghao; Merged-by: ascend-robot. See merge request: Ascend/MindSpeed-LLM!4437	13 days ago
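
The first item in the description of merge request !4248 (first and last layers sharing one device, with a non-uniform PP split) can be pictured with a small sketch. This is not code from the repository: the function name `u_shaped_layer_assignment`, its parameters, and the returned keys are hypothetical and chosen only to illustrate the idea of a U-shaped layer split.

```python
from typing import Dict, List


def u_shaped_layer_assignment(
    num_layers: int, head_layers: int, tail_layers: int
) -> Dict[str, List[int]]:
    """Illustrative only: split layer indices into a U shape.

    The first `head_layers` and the last `tail_layers` layer indices are
    grouped together (they would share one device); the remaining layers
    form the middle part. All names here are hypothetical.
    """
    if head_layers < 0 or tail_layers < 0:
        raise ValueError("head_layers and tail_layers must be non-negative")
    if head_layers + tail_layers > num_layers:
        raise ValueError("head_layers + tail_layers exceeds num_layers")

    layers = list(range(num_layers))
    tail_start = num_layers - tail_layers
    return {
        # First and last layers placed together (the two ends of the "U").
        "head_tail_device": layers[:head_layers] + layers[tail_start:],
        # Everything in between; the split need not be uniform.
        "middle": layers[head_layers:tail_start],
    }
```

For example, `u_shaped_layer_assignment(8, 2, 1)` groups layers 0, 1 and 7 together and leaves layers 2 to 6 in the middle. How the actual feature maps layers to pipeline stages is defined in `u_shaped_split_feature.py`, not by this sketch.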