MindSpeed/mindspeed/mindspore/core · Ascend/MindSpeed - AtomGit

ascend-robot: [mindspore]fix distrib_optim save

File	Last commit	Last updated
data_parallel	!2585 megatron v012 adaptor Merge pull request !2585 from 张珂/megatron012_mindspeed	10 months ago
distributed	[mindspore][feature] Support Megatron Custom FSDP	6 months ago
fusions	!2812 [mindspore][update][master]update useless wrapper and add transformers patch Merge pull request !2812 from 杨承翰/dispatch_master	8 months ago
memory	[mindspore][master]adapt swap attention for Qwen2.5vl-72B. Co-authored-by: iquoyuw<wuyouqi1@h-partners.com>. Merge request !2999 (sa-master into master); Created-by: weixin_47897441; Commit-by: iquoyuw; Merged-by: ascend-robot. Description: Adapt the swap attention feature to reduce device memory usage for Qwen2.5vl-72B. Self-verification results: 1. Feature behavior aligns with ptA, with the same memory reduction under the same operators. 2. With the feature disabled or enabled, precision aligns with ptA with zero error. See merge request: Ascend/MindSpeed!2999	6 months ago
models	!2799 [mindspore][master]del useless patch Merge pull request !2799 from 杨承翰/dispatch_master	8 months ago
optimizer	[mindspore]fix distrib_optim save. Co-authored-by: ybwang19<1605891897@qq.com>. Merge request !3218 (master into master); Created-by: ybwang19; Commit-by: ybwang19; Merged-by: ascend-robot. Description: Related issue: https://e.gitee.com/mind_spore/dashboard?issue=IDJ2KI. Before the change, precision after resuming training from a checkpoint did not align between the PT and MS sides. After the change, precision after resuming from a checkpoint aligns between the PT and MS sides. See merge request: Ascend/MindSpeed!3218	4 months ago
pipeline_parallel	[mindspore][bugfix] bugfix for chatglm3-6b pretrain. Co-authored-by: wanglijun55<wanglijun54@huawei.com>. Merge request !2996 (branch-glm3 into master); Created-by: wanglijun55; Commit-by: wanglijun55; Merged-by: ascend-robot. Description of changes: 1. Related issue: https://e.gitee.com/mind_spore/issues/table?issue=ID5HCI. chatglm3-6b pretrain failed with the error "syncstream failed". The root cause was that out.untyped_storage().resize_(0) in the original patch left the loss tensor uninitialized, so it was changed to out.data = torch.empty([]). 2. Because the deallocate_output_tensor_ patch changed, the assert loss.numel() == 1 check in the custom_backward patch is now identical to megatron's, so the custom_backward patch was removed. Self-verification results: 1. chatglm3-6b pretrain works correctly. 2. qwen3-235b finetune works correctly. See merge request: Ascend/MindSpeed!2996	6 months ago
tensor_parallel	[Feature] alltoallvc performance optimization in MindSpeed. Co-authored-by: yashi999<guyashi1@huawei.com>. Merge request (master into master); Created-by: yashi999; Commit-by: yashi999; Merged-by: ascend-robot. See merge request: Ascend/MindSpeed !2838	8 months ago
transformer	del patch	6 months ago
timers.py	!2585 megatron v012 adaptor Merge pull request !2585 from 张珂/megatron012_mindspeed	10 months ago
utils.py	!2762 [feature][mindspore] support qwenvl-25-lora using Tensor.backward Merge pull request !2762 from zihaowang26/master_lora	8 months ago