MindSpeed-LLM/examples/fsdp2/minimax_m27 · Ascend/MindSpeed-LLM - AtomGit

ascend-robot · feat(pytorch): add minimax-m2.7 model in FSDP2 backend

2198322c · created on April 16

File	Last commit	Last updated
chat_minimax_m2p7_fsdp2_A3.sh	feat(pytorch): add minimax-m2.7 model in FSDP2 backend	1 month ago
pretrain_minimax_m2p7_229b_4K_fsdp2_A3.sh	feat(pytorch): add minimax-m2.7 model in FSDP2 backend	1 month ago
pretrain_minimax_m2p7_229b_4K_fsdp2_A3.yaml	feat(pytorch): add minimax-m2.7 model in FSDP2 backend	1 month ago

Commit message (shared by all three files):

feat(pytorch): add minimax-m2.7 model in FSDP2 backend

Co-authored-by: HanhuiChen<chenhanhui1@h-partners.com>

# message auto-generated for no-merge-commit merge:
!4410 merge minimax27 into master

feat(pytorch): add minimax-m2.7 model in FSDP2 backend

Created-by: HANHU1CHEN
Commit-by: HanhuiChen;HANHU1CHEN
Merged-by: ascend-robot
Description:

## What this PR does / why we need it?
Adds support for the MiniMax-M27 model in the FSDP2 backend, including model architecture adaptation and configuration integration, enabling distributed training and inference under the MindSpeed-LLM framework.

## Does this PR introduce any user-facing change?
Yes. Users can now launch MiniMax-M27 training and inference via the FSDP2 backend by specifying model_id=minimax_m27. Refer to any newly added example scripts under the examples/fsdp2 directory for usage details.

## How was this patch tested?
Verified on Ascend NPU with the following scenarios:
- Full-parameter training: Multi-device distributed training runs successfully.
- Full-parameter inference: The model loads correctly and produces coherent responses as expected.

See merge request: Ascend/MindSpeed-LLM!4410