File | Last commit | Last updated
[mindspore][feature] support seq1f1b
Co-authored-by: zihaowang26 <wangzihao111@huawei.com>
# message auto-generated for no-merge-commit merge:
!2994 merge master_seq1f1b into master
[mindspore][feature] support seq1f1b
Created-by: zihaowang26
Commit-by: zihaowang26
Merged-by: ascend-robot
Description: support seq1f1b
See merge request: Ascend/MindSpeed!2994
6 months ago
[mindspore][bugfix] bugfix for chatglm3-6b pretrain
Co-authored-by: wanglijun55 <wanglijun54@huawei.com>
# message auto-generated for no-merge-commit merge:
!2996 merge branch-glm3 into master
[mindspore][bugfix] bugfix for chatglm3-6b pretrain
Created-by: wanglijun55
Commit-by: wanglijun55
Merged-by: ascend-robot
Description:
**Change description:**
1. Related issue: https://e.gitee.com/mind_spore/issues/table?issue=ID5HCI
   chatglm3-6b pretrain failed with the error "syncstream failed". The root cause was traced to the out.untyped_storage().resize_(0) call in the original patch, which leaves the loss tensor uninitialized. The fix replaces it with out.data = torch.empty([]).
2. Because the deallocate_output_tensor_ patch was rewritten, the assert loss.numel() == 1 check in the custom_backward patch now behaves the same as in Megatron, so the custom_backward patch is removed.
**Self-verification results:**
1. chatglm3-6b pretrain works correctly.
   ![image.png](https://raw.gitcode.com/user-images/assets/7404741/e7ba3506-f520-4625-a65a-05c6bfac84e8/image.png 'image.png')
2. qwen3-235b finetune works correctly.
   ![image.png](https://raw.gitcode.com/user-images/assets/7404741/950063c9-8aa4-497e-be74-30d6c111cd7a/image.png 'image.png')
See merge request: Ascend/MindSpeed!2996
6 months ago