File / Last commit / Last updated
!1998 rename: repo package name from modellink to mindspeed_llm. Merge pull request !1998 from MeiFei/master-package-rename. 1 year ago
!3238 [pytorch][model] support bailing_mini 0day. Merge pull request !3238 from jzh/bailing. 8 months ago
refactor(pytorch): update deepseek4 shell. Co-authored-by: dingzicha1997 <dingzilin@huawei.com>. Merge request Ascend/MindSpeed-LLM!4423 (merge master into master); created and committed by dingzicha1997, merged by ascend-robot. 1 month ago
[pytorch][feature] support transformer_engine and FP8 training. Co-authored-by: mingzhenwang <wangmingzhen4@huawei.com>. Merge request Ascend/MindSpeed-LLM!3565 (merge 0917merge_master into master); created and committed by mingzhenwang, merged by ascend-robot. Description: 1. LLM supports transformer_engine; 2. deepseek/qwen moe supports the TELinear layer. 5 months ago
!2032 refactor: TransformerLayerSpecWithPostNorm. Merge pull request !2032 from RuanZhiXiang/refactor-support-feature-spec-mapping. 1 year ago
!2032 refactor: TransformerLayerSpecWithPostNorm. Merge pull request !2032 from RuanZhiXiang/refactor-support-feature-spec-mapping. 1 year ago
!2414 codecheck changes. Merge pull request !2414 from shenjiarun/master. 1 year ago
fix(pytorch): modify longcat mlp to enable TE. Co-authored-by: MissingPompeii <guohao120@huawei.com>. Merge request Ascend/MindSpeed-LLM!4318 (merge master_te into master); created and committed by MissingPompeii, merged by ascend-robot. Description: in the original longcat implementation, mlps_0, mlps_0 take the non-TE branch by default (use_te=False by default), and linear_fc* calls ColumnParallelLinear and RowParallelLinear; after this change, when use_te is enabled, mlps_0, mlps_0 can take the TE branch, and linear_fc* calls TEColumnParallelLinear and TERowParallelLinear. 2 months ago
!2555 [mamba] supports state_space_duality implementations for both mamba2 and mamba2_hybrid architectures. Merge pull request !2555 from guozhihua/master. 1 year ago
[pytorch][coverage] add minicpm,lingmini,ling-think-format. Co-authored-by: jzh6229 <jiangzhihui4@huawei.com>. Merge request Ascend/MindSpeed-LLM!3700 (merge coverage_jzh into master); created and committed by jzh6229, merged by ascend-robot. 6 months ago
!2414 codecheck changes. Merge pull request !2414 from shenjiarun/master. 1 year ago
[pytorch][feature] PLM-1.8B pretrain/sft. Co-authored-by: EVA1 <jingsiyu1@huawei.com>. Merge request Ascend/MindSpeed-LLM!3637 (merge master into master); created and committed by EVA1, merged by ascend-robot. Description: 1. PLM-1.8B model support: dataset format conversion, weight conversion, fine-tuning, and pretraining; 2. accuracy is aligned, with SFT relative error below 0.1%. 6 months ago
!3105 [pytorch][refactor] add qwen2_moe spec with shared_expert_gate. Merge pull request !3105 from guozhihua/add_qwen2_moe_spec. 9 months ago
!3335 [pytorch][bugfix] fix qwen3_next mpt bug. Merge pull request !3335 from guozhihua/fix_qwen3_next_mtp. 8 months ago
style(pytorch): SCA compliance rectification. Co-authored-by: zhyebin01 <zhangyebin@h-partners.com>. Merge request Ascend/MindSpeed-LLM!4355 (merge master into master); created and committed by zhyebin01, merged by ascend-robot. Description: SCA compliance rectification; no user-facing change; pipeline test passed. 2 months ago