| feat(pytorch): add minimax-m2.7 model in FSDP2 backend
Co-authored-by: HanhuiChen <chenhanhui1@h-partners.com>
# message auto-generated for no-merge-commit merge:
!4410 merge minimax27 into master
feat(pytorch): add minimax-m2.7 model in FSDP2 backend
Created-by: HANHU1CHEN
Commit-by: HanhuiChen;HANHU1CHEN
Merged-by: ascend-robot
Description: ## What this PR does / why we need it?
Adds support for the MiniMax-M2.7 model in the FSDP2 backend, covering model architecture adaptation and configuration integration. This enables distributed training and inference under the MindSpeed-LLM framework.
## Does this PR introduce any user-facing change?
Yes. Users can now launch MiniMax-M2.7 training and inference via the FSDP2 backend by specifying model_id=minimax_m27. For usage details, refer to any example scripts added under the examples/fsdp2 directory.
## How was this patch tested?
Verified on Ascend NPU in the following scenarios:
- Full-parameter training: Multi-device distributed training runs successfully.
- Full-parameter inference: The model loads correctly and produces coherent responses as expected.
See merge request: Ascend/MindSpeed-LLM!4410 | 1 month ago |