File | Last commit | Last updated
!3171 [pytorch][feature]Deprecate some moe parameters: n-group, topk-group, etc. Merge pull request !3171 from shengjy/dep_moe | 9 months ago
!3335 [pytorch][bugfix]fix qwen3_next mpt bug Merge pull request !3335 from guozhihua/fix_qwen3_next_mtp | 8 months ago
!3370 [pytorch][bugfix]restore the einsum operation for next states of mamba Merge pull request !3370 from sunjunjie/master | 8 months ago
[pytorch][bugfix]fix deepseek3 tnd bug in mbs > 1 Co-authored-by: guozhihua2<guozhihua2@huawei.com> # message auto-generated for no-merge-commit merge: !4235 merge deepseek3_tnd_mbs_2.2.0 into 2.2.0 [pytorch][bugfix]fix deepseek3 tnd bug in mbs > 1 Created-by: guozhihua2 Commit-by: guozhihua2 Merged-by: ascend-robot Description: 1. Remove the dimension conversion under tnd in the mla part of deepseek3. The original implementation comes from megatron, which uses tnd globally, whereas llm uses tnd only during the fa computation. See merge request: Ascend/MindSpeed-LLM!4235 | 3 months ago
!1998 rename: repo package name from modellink to mindspeed_llm Merge pull request !1998 from MeiFei/master-package-rename | 1 year ago