File | Last commit | Last updated
feat(pytorch): add minimax-m2.7 model in FSDP2 backend

Co-authored-by: HanhuiChen <chenhanhui1@h-partners.com>

# message auto-generated for no-merge-commit merge:
!4410 merge minimax27 into master

feat(pytorch): add minimax-m2.7 model in FSDP2 backend

Created-by: HANHU1CHEN
Commit-by: HanhuiChen;HANHU1CHEN
Merged-by: ascend-robot
Description:

## What this PR does / why we need it?
Adds support for the MiniMax-M27 model in the FSDP2 backend, including model architecture adaptation and configuration integration, enabling distributed training and inference under the MindSpeed-LLM framework.

## Does this PR introduce any user-facing change?
Yes. Users can now launch MiniMax-M27 training and inference via the FSDP2 backend by specifying model_id=minimax_m27. Refer to the newly added example scripts under the examples/fsdp2 directory for usage details.

## How was this patch tested?
Verified on Ascend NPU with the following scenarios:
- Full-parameter training: Multi-device distributed training runs successfully.
- Full-parameter inference: The model loads correctly and produces coherent responses as expected.

See merge request: Ascend/MindSpeed-LLM!4410

Last updated: 1 month ago