MindSpeed-LLM/mindspeed_llm/tasks/models/spec · Ascend/MindSpeed-LLM - AtomGit

ascend-robot refactor(pytorch): update deepseek4 shell

File	Last commit	Last updated
__init__.py	!1998 rename: repo package name from modellink to mindspeed_llm Merge pull request !1998 from MeiFei/master-package-rename	1 year ago
bailing_spec.py	!3238 [pytorch][model]support bailing_mini 0day Merge pull request !3238 from jzh/bailing	8 months ago
deepseek4_spec.py	refactor(pytorch): update deepseek4 shell Co-authored-by: dingzicha1997<dingzilin@huawei.com> # message auto-generated for no-merge-commit merge: !4423 merge master into master Created-by: dingzicha1997 Commit-by: dingzicha1997 Merged-by: ascend-robot See merge request: Ascend/MindSpeed-LLM!4423	1 month ago
deepseek_spec.py	[pytorch][feature]support transformer_engine and FP8 training Co-authored-by: mingzhenwang<wangmingzhen4@huawei.com> # message auto-generated for no-merge-commit merge: !3565 merge 0917merge_master into master Created-by: mingzhenwang Commit-by: mingzhenwang Merged-by: ascend-robot Description: 1. LLM supports transformer_engine 2. deepseek/qwen MoE supports the TELinear layer See merge request: Ascend/MindSpeed-LLM!3565	5 months ago
gemma2_spec.py	!2032 refactor: TransformerLayerSpecWithPostNorm Merge pull request !2032 from RuanZhiXiang/refactor-support-feature-spec-mapping	1 year ago
grok_spec.py	!2032 refactor: TransformerLayerSpecWithPostNorm Merge pull request !2032 from RuanZhiXiang/refactor-support-feature-spec-mapping	1 year ago
hunyuan_spec.py	!2414 codecheck changes Merge pull request !2414 from shenjiarun/master	1 year ago
longcat_spec.py	fix(pytorch): modify longcat mlp to enable TE Co-authored-by: MissingPompeii<guohao120@huawei.com> # message auto-generated for no-merge-commit merge: !4318 merge master_te into master Created-by: MissingPompeii Commit-by: MissingPompeii Merged-by: ascend-robot Description: In the original longcat implementation, mlps_0, mlps_0 take the non-TE branch by default (use_te=False by default), and linear_fc* uses ColumnParallelLinear and RowParallelLinear; after this change, when use_te is enabled, mlps_0, mlps_0 can take the TE branch, and linear_fc* uses TEColumnParallelLinear and TERowParallelLinear. See merge request: Ascend/MindSpeed-LLM!4318	2 months ago
mamba_spec.py	!2555 [mamba] supports state_space_duality implementations for both mamba2 and mamba2_hybrid architectures. Merge pull request !2555 from guozhihua/master	1 year ago
minicpm_spec.py	[pytorch][coverage] add minicpm,lingmini,ling-think-format Co-authored-by: jzh6229<jiangzhihui4@huawei.com> # message auto-generated for no-merge-commit merge: !3700 merge coverage_jzh into master Created-by: jzh6229 Commit-by: jzh6229 Merged-by: ascend-robot See merge request: Ascend/MindSpeed-LLM!3700	6 months ago
phi35_moe_spec.py	!2414 codecheck changes Merge pull request !2414 from shenjiarun/master	1 year ago
plm_spec.py	[pytorch][feature]PLM-1.8B pretrain/sft Co-authored-by: EVA1<jingsiyu1@huawei.com> # message auto-generated for no-merge-commit merge: !3637 merge master into master Created-by: EVA1 Commit-by: EVA1 Merged-by: ascend-robot Description: 1. PLM-1.8B model support: dataset format conversion, weight conversion, fine-tuning, and pretraining; 2. Accuracy is aligned, and the SFT relative error is below one part in a thousand. See merge request: Ascend/MindSpeed-LLM!3637	6 months ago
qwen2_moe_spec.py	!3105 [pytorch][refactor]add qwen2_moe spec with shared_expert_gate Merge pull request !3105 from guozhihua/add_qwen2_moe_spec	9 months ago
qwen3_next_spec.py	!3335 [pytorch][bugfix]fix qwen3_next mpt bug Merge pull request !3335 from guozhihua/fix_qwen3_next_mtp	8 months ago
qwen3_spec.py	style(pytorch): SCA compliance rectification Co-authored-by: zhyebin01<zhangyebin@h-partners.com> # message auto-generated for no-merge-commit merge: !4355 merge master into master Created-by: zhyebin01 Commit-by: zhyebin01 Merged-by: ascend-robot Description: What this PR does / why we need it: SCA compliance rectification. User-facing change: no. How this patch was tested: pipeline test passed. See merge request: Ascend/MindSpeed-LLM!4355	2 months ago