MindSpeed/mindspeed/mindspore/core/optimizer (branch: master)
Latest commit: [mindspore]fix distrib_optim save (3a6ca5b9), by ascend-robot, created on January 22
Files (name, last commit, last updated):
swap_optimizer
    Last commit: [mindspore][master] add swap-optimizer patch (merge pull request !2793 from wangjialin/swap_optimizer_pin_memory)
    Last updated: 8 months ago
__init__.py
    Last commit: megatron v012 adaptor (merge pull request !2585 from 张珂/megatron012_mindspeed)
    Last updated: 10 months ago
adamw.py
    Last commit: [mindspore][master]modify npu_apply_fused_adamw_v2 to msadapter (merge pull request !2789 from minghu111/master0828)
    Last updated: 8 months ago
distrib_optimizer.py
    Last commit: [mindspore]fix distrib_optim save (merge request Ascend/MindSpeed!3218, master into master)
    Co-authored-by: ybwang19<1605891897@qq.com>
    Created-by: ybwang19; Commit-by: ybwang19; Merged-by: ascend-robot
    Related issue: https://e.gitee.com/mind_spore/dashboard?issue=IDJ2KI
    Description: Before the change, accuracy after resuming training from a checkpoint was not aligned between the PT and MS sides. After the change, it is aligned.
    Last updated: 4 months ago
optimizer.py
    Last commit: [mindspore][bugfix] bugfix for chatglm3-6b pretrain (merge request Ascend/MindSpeed!2996, branch-glm3 into master)
    Co-authored-by: wanglijun55<wanglijun54@huawei.com>
    Created-by: wanglijun55; Commit-by: wanglijun55; Merged-by: ascend-robot
    Related issue: https://e.gitee.com/mind_spore/issues/table?issue=ID5HCI
    Change description:
    1. chatglm3-6b pretrain failed with "syncstream failed". The cause was traced to the original patch: its out.untyped_storage().resize_(0) left the loss tensor uninitialized. The patch now uses out.data = torch.empty([]) instead.
    2. Because the deallocate_output_tensor_ patch changed, the assert loss.numel() == 1 check in the custom_backward patch now matches Megatron, so the custom_backward patch was removed.
    Self-verification results:
    1. chatglm3-6b pretrain works as expected.
    2. qwen3-235b finetune works as expected.
    Last updated: 6 months ago
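
The fix described in the optimizer.py commit message can be illustrated with a minimal sketch in plain PyTorch. This is not the MindSpeed patch itself: the helper name `deallocate_output_tensor` and the explicit `device`/`dtype` arguments are assumptions made for illustration, and only the `out.data = torch.empty([])` assignment comes from the commit message. The idea is that replacing `.data` with a 0-dim placeholder releases the large activation while keeping the autograd graph, and the placeholder still satisfies a `numel() == 1` check.

```python
import torch


def deallocate_output_tensor(out: torch.Tensor) -> None:
    """Hypothetical helper: drop the payload of `out` but keep its grad_fn."""
    # A 0-dim tensor has exactly one element, so numel() == 1 holds afterwards.
    # device/dtype are passed explicitly here as an assumption of this sketch.
    out.data = torch.empty([], device=out.device, dtype=out.dtype)


# Build a small non-leaf tensor so that it carries a grad_fn.
x = torch.ones(4, 8, requires_grad=True)
out = x * 2.0

deallocate_output_tensor(out)

# The placeholder is a scalar-shaped tensor; the autograd node is still attached.
print(out.shape, out.numel(), out.grad_fn is not None)
```

By contrast, shrinking the underlying storage to zero bytes leaves a tensor whose shape still claims the original size but which has no backing memory, which is the situation the commit message describes as an uninitialized loss tensor.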