MindSpeed-MM/examples/wan2.2/A14B/fsdp2_config.yaml - Code preview - MindSpeed-MM: a multimodal large-model training suite for Ascend chips - AtomGit

ascend-robot: [Modify] Specify the value of sharding_size for wan2.2

9be65619, created on December 2, 2025

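sharding_size: 16                # size of the FSDP2 sharding group (devices the parameters are sharded across)
sub_modules_to_wrap:             # modules that each become their own FSDP2 unit
  - predictor.blocks.{*}
  - predictor.head
reshard_after_forward: True      # free the gathered parameters after forward and gather them again in backward
param_dtype: "bf16"              # dtype of the gathered parameters used for compute
reduce_dtype: "fp32"             # dtype used for gradient reduction
ignored_modules:                 # modules left out of FSDP2 sharding
  - ae
  - text_encoder
cast_forward_inputs: True        # cast floating-point forward inputs to param_dtype
recompute_modules:               # modules whose activations are recomputed in backward
  - predictor.blocks.{*}
num_to_forward_prefetch: 0       # modules to prefetch ahead in forward (0: none requested)
num_to_backward_prefetch: 0      # modules to prefetch ahead in backward (0: none requested)
offload_to_cpu: False            # keep sharded state on the device rather than on the CPU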
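
The `predictor.blocks.{*}` entries above read like a wildcard over the per-block submodules. The following is a minimal sketch of how such a pattern could be expanded against a model's module names. It assumes that `{*}` stands for exactly one dot-free path component; this is an illustration, not MindSpeed-MM's actual matching logic, and `expand_pattern` and the module names are hypothetical.

```python
import re


def expand_pattern(pattern, module_names):
    """Return the module names matching `pattern`.

    Assumption (not confirmed by the source): `{*}` matches exactly one
    dot-free path component, such as a block index.
    """
    # Escape the literal pieces and join them with a single-component wildcard.
    parts = pattern.split("{*}")
    regex = "[^.]+".join(re.escape(part) for part in parts)
    return [name for name in module_names if re.fullmatch(regex, name)]


# Hypothetical module names, for illustration only.
module_names = [
    "ae",
    "text_encoder",
    "predictor.blocks.0",
    "predictor.blocks.1",
    "predictor.blocks.1.attn",
    "predictor.head",
]

sub_modules_to_wrap = ["predictor.blocks.{*}", "predictor.head"]
wrapped = [
    name
    for pattern in sub_modules_to_wrap
    for name in expand_pattern(pattern, module_names)
]
print(wrapped)
```

Under this assumption, nested submodules such as `predictor.blocks.1.attn` are not matched on their own, so each block would be wrapped as a single unit.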