MindSpeed-MM/examples/bagel/fsdp2_config.yaml
MindSpeed-MM: a multimodal large-model training suite for Ascend chips

Commit 289bdb4f (ascend-robot, created 2025-12-24): [Bugfix] Bug in Bagel folder changes and validation

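# Key names mirror PyTorch FSDP2 (fully_shard / MixedPrecisionPolicy). The notes
# below are a reading of those names, not documented MindSpeed-MM behavior.
sharding_size: 8  # presumably the number of devices each parameter is sharded across
sub_modules_to_wrap:  # module classes to wrap as individual FSDP units
  - mindspeed_mm.models.omni.mllms.bagel_qwen2_mot.Qwen2MoTDecoderLayer
reshard_after_forward: True  # free gathered parameters after forward, re-gather in backward
param_dtype: bf16  # dtype of parameters during forward/backward compute
reduce_dtype: bf16  # dtype used for gradient reduction
cast_forward_inputs: True  # cast floating-point forward inputs to param_dtype
recompute_modules:  # module classes whose activations are likely recomputed in backward
  - mindspeed_mm.models.omni.mllms.bagel_qwen2_mot.Qwen2MoTDecoderLayer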
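
The keys above could be consumed roughly as follows. This is a minimal sketch, not MindSpeed-MM's actual loader: `FSDP2Config`, `resolve_class`, and `DTYPE_NAMES` are hypothetical names introduced for illustration, and the `fp16`/`fp32` dtype strings are assumptions (only `bf16` appears in this file). The keyword names returned by `mixed_precision_kwargs` match PyTorch's `MixedPrecisionPolicy`, but the values are kept as dtype names so the sketch runs without `torch`.

```python
import importlib
from dataclasses import dataclass, field

# Assumed mapping from the config's dtype strings to torch dtype attribute names.
DTYPE_NAMES = {"bf16": "bfloat16", "fp16": "float16", "fp32": "float32"}


def resolve_class(dotted_path: str):
    """Import 'package.module.ClassName' and return the class object."""
    module_path, _, class_name = dotted_path.rpartition(".")
    if not module_path:
        raise ValueError(f"expected 'module.ClassName', got {dotted_path!r}")
    return getattr(importlib.import_module(module_path), class_name)


@dataclass
class FSDP2Config:
    """Hypothetical typed view of the YAML keys in this file."""
    sharding_size: int
    sub_modules_to_wrap: list = field(default_factory=list)
    reshard_after_forward: bool = True
    param_dtype: str = "bf16"
    reduce_dtype: str = "bf16"
    cast_forward_inputs: bool = True
    recompute_modules: list = field(default_factory=list)

    def __post_init__(self):
        # Reject values that could not describe a valid sharding setup.
        if self.sharding_size < 1:
            raise ValueError("sharding_size must be >= 1")
        for name in (self.param_dtype, self.reduce_dtype):
            if name not in DTYPE_NAMES:
                raise ValueError(f"unsupported dtype string: {name!r}")

    def mixed_precision_kwargs(self) -> dict:
        """Keyword names follow MixedPrecisionPolicy; values are dtype names."""
        return {
            "param_dtype": DTYPE_NAMES[self.param_dtype],
            "reduce_dtype": DTYPE_NAMES[self.reduce_dtype],
            "cast_forward_inputs": self.cast_forward_inputs,
        }


LAYER = "mindspeed_mm.models.omni.mllms.bagel_qwen2_mot.Qwen2MoTDecoderLayer"
# The dict a YAML parser would produce for this file.
raw = {
    "sharding_size": 8,
    "sub_modules_to_wrap": [LAYER],
    "reshard_after_forward": True,
    "param_dtype": "bf16",
    "reduce_dtype": "bf16",
    "cast_forward_inputs": True,
    "recompute_modules": [LAYER],
}
config = FSDP2Config(**raw)
print(config.mixed_precision_kwargs())
```

With `mindspeed_mm` installed, `resolve_class(LAYER)` would return the decoder-layer class to pass to the wrapping and recompute logic. The sketch leaves the paths as strings so it has no dependency on the package.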