torch.distributed.fsdp

Note

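If an API's "Supported" column is "Yes" and its "Restrictions and notes" column is "-", the API is supported to the same extent as the native API.

API name Supported Restrictions and notes
torch.distributed.fsdp.FullyShardedDataParallel Supports bf16, fp16, and fp32.
When using FSDP on Ascend NPUs, passing a device argument such as "device_id=torch.device("npu:0")" is recommended.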
torch.distributed.fsdp.FullyShardedDataParallel.apply -
torch.distributed.fsdp.FullyShardedDataParallel.check_is_root -
torch.distributed.fsdp.FullyShardedDataParallel.clip_grad_norm_ -
torch.distributed.fsdp.FullyShardedDataParallel.flatten_sharded_optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.forward -
torch.distributed.fsdp.FullyShardedDataParallel.fsdp_modules -
torch.distributed.fsdp.FullyShardedDataParallel.full_optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.get_state_dict_type -
torch.distributed.fsdp.FullyShardedDataParallel.module -
torch.distributed.fsdp.FullyShardedDataParallel.named_buffers -
torch.distributed.fsdp.FullyShardedDataParallel.named_parameters -
torch.distributed.fsdp.FullyShardedDataParallel.no_sync -
torch.distributed.fsdp.FullyShardedDataParallel.optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.optim_state_dict_to_load -
torch.distributed.fsdp.FullyShardedDataParallel.register_comm_hook -
torch.distributed.fsdp.FullyShardedDataParallel.rekey_optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.scatter_full_optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.set_state_dict_type -
torch.distributed.fsdp.FullyShardedDataParallel.shard_full_optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.sharded_optim_state_dict -
torch.distributed.fsdp.FullyShardedDataParallel.state_dict_type -
torch.distributed.fsdp.FullyShardedDataParallel.summon_full_params -
torch.distributed.fsdp.BackwardPrefetch -
torch.distributed.fsdp.ShardingStrategy -
torch.distributed.fsdp.MixedPrecision -
torch.distributed.fsdp.CPUOffload -
torch.distributed.fsdp.StateDictConfig -
torch.distributed.fsdp.FullStateDictConfig -
torch.distributed.fsdp.ShardedStateDictConfig -
torch.distributed.fsdp.LocalStateDictConfig -
torch.distributed.fsdp.OptimStateDictConfig -
torch.distributed.fsdp.FullOptimStateDictConfig -
torch.distributed.fsdp.ShardedOptimStateDictConfig -
torch.distributed.fsdp.LocalOptimStateDictConfig -
torch.distributed.fsdp.StateDictSettings -
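
The device_id recommendation for FullyShardedDataParallel in the table above can be sketched as follows. This is a minimal illustration, not a reference implementation: npu_device_name and wrap_with_fsdp are hypothetical helper names, and the code assumes that torch_npu is installed, that one process drives one NPU, and that the process group has already been initialized. The bf16 MixedPrecision policy is only one of the supported dtype choices.

```python
def npu_device_name(local_rank: int) -> str:
    """Return the device string for this process, e.g. "npu:0" (hypothetical helper)."""
    if local_rank < 0:
        raise ValueError("local_rank must be non-negative")
    return f"npu:{local_rank}"


def wrap_with_fsdp(model, local_rank: int):
    """Wrap `model` in FSDP on a single Ascend NPU (hypothetical helper).

    Assumes torch_npu is installed and torch.distributed is already
    initialized by the caller.
    """
    # Imported lazily so this file can still be imported on a machine
    # that has no NPU software stack installed.
    import torch
    import torch_npu  # noqa: F401  (makes the "npu" device type available)
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP
    from torch.distributed.fsdp import MixedPrecision

    device = torch.device(npu_device_name(local_rank))
    torch.npu.set_device(device)

    # Assumption: bf16 everywhere. fp16 or the fp32 default are also listed
    # as supported.
    bf16_policy = MixedPrecision(
        param_dtype=torch.bfloat16,
        reduce_dtype=torch.bfloat16,
        buffer_dtype=torch.bfloat16,
    )

    # device_id tells FSDP which device to place the sharded parameters on.
    return FSDP(model, device_id=device, mixed_precision=bf16_policy)
```

In a typical launch, each process would read its rank from the LOCAL_RANK environment variable, initialize the process group (on Ascend this is usually done with the "hccl" backend), and then call wrap_with_fsdp(model, local_rank). Passing the device explicitly avoids relying on whichever device happens to be current when the module is wrapped.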