MindSpeed-LLM/mindspeed_llm/tasks/models/transformer · Ascend/MindSpeed-LLM - AtomGit

ascend-robot [pytorch][bugfix]fix deepseek3 tnd bug in mbs > 1

File	Last commit	Last updated
__init__.py	!2030 CodeChcek rectification-master Merge pull request !2030 from shenjiarun/master	1 year ago
attention.py	!2089 refactor: move spec related structure into right position Merge pull request !2089 from RuanZhiXiang/refactor_mla_attention	1 year ago
fast_mlp.py	!1958 Restructure the repository file layout Merge pull request !1958 from DONGHAORAN/master	1 year ago
hunyuan_large_attention.py	[pytorch][bugfix]fix about cleancode Co-authored-by: qyzqyz<quyueze@h-partners.com>	8 months ago
hunyuan_rope.py	!2249 [HunyuanLargeMoE] part of model Merge pull request !2249 from zhoubeirong/0218-model-part1	1 year ago
mla_dot_product_attention.py	!3160 [pytorch][bugfix]fix MLA init and args bug Merge pull request !3160 from HanhuiChen/master	9 months ago
mla_up_proj_overlap_tp_comm.py	!3156 [pytorch][bugfix]Fix MLA up proj overlap and remove deprecated funcs Merge pull request !3156 from shengjy/master	9 months ago
multi_latent_attention.py	!4235 [pytorch][bugfix]fix deepseek3 tnd bug in mbs > 1 Co-authored-by: guozhihua2<guozhihua2@huawei.com> Merge deepseek3_tnd_mbs_2.2.0 into 2.2.0. Created-by: guozhihua2 Commit-by: guozhihua2 Merged-by: ascend-robot Description: 1. Remove the dimension conversion applied under TND in the MLA part of deepseek3. The original code follows the Megatron implementation, which uses TND globally, whereas MindSpeed-LLM uses TND only during the FA computation. See merge request: Ascend/MindSpeed-LLM!4235	3 months ago
qwen3_next_full_attention.py	!3316 [pytorch][model]add qwen3_next model Merge pull request !3316 from guozhihua/qwen3_next_master	9 months ago
qwen3_next_gated_deltanet_attention.py	!3443 [pytorch][model]change l2norm in qwen3_next for hf Co-authored-by: guozhihua<guozhihua2@huawei.com> Merge qwen3_next_l2norm_2.2 into 2.2.0. Created-by: guozhihua2 Commit-by: guozhihua Merged-by: ascend-robot See merge request: Ascend/MindSpeed-LLM!3443	8 months ago
transformer_layer.py	!4196 [pytorch][bugfix] fix the bug with gemma enabling recomputation Co-authored-by: yanzhixiao<yanzhixiao@h-partners.com> Merge bugfix-gemma-2.2.0 into 2.2.0. Created-by: yanzhixiao23 Commit-by: yanzhixiao Merged-by: ascend-robot See merge request: Ascend/MindSpeed-LLM!4196	3 months ago
transformer_layer_hunyuan.py	!3025 [pytorch][bugfix] fix the hunyuan model Merge pull request !3025 from yanzhixiao/bugfix-hunyuan	11 months ago