torch.optim

Note

If an API's "Supported" value is "Yes" and its "Limitations and Notes" value is "-", the API is supported to the same degree as the native API.

API Name Supported Limitations and Notes
torch.optim.Optimizer -
Optimizer.add_param_group -
Optimizer.load_state_dict -
Optimizer.state_dict -
Optimizer.step -
Optimizer.zero_grad -
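torch.optim.Adadelta Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.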
torch.optim.Adadelta.add_param_group -
torch.optim.Adadelta.load_state_dict -
torch.optim.Adadelta.register_load_state_dict_post_hook -
torch.optim.Adadelta.register_load_state_dict_pre_hook -
torch.optim.Adadelta.register_state_dict_post_hook -
torch.optim.Adadelta.register_state_dict_pre_hook -
torch.optim.Adadelta.register_step_post_hook -
torch.optim.Adadelta.register_step_pre_hook -
torch.optim.Adadelta.state_dict -
torch.optim.Adadelta.step -
torch.optim.Adadelta.zero_grad -
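torch.optim.Adagrad Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.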
torch.optim.Adagrad.add_param_group -
torch.optim.Adagrad.load_state_dict -
torch.optim.Adagrad.register_load_state_dict_post_hook -
torch.optim.Adagrad.register_load_state_dict_pre_hook -
torch.optim.Adagrad.register_state_dict_post_hook -
torch.optim.Adagrad.register_state_dict_pre_hook -
torch.optim.Adagrad.register_step_post_hook -
torch.optim.Adagrad.register_step_pre_hook -
torch.optim.Adagrad.state_dict -
torch.optim.Adagrad.step -
torch.optim.Adagrad.zero_grad -
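torch.optim.Adam Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.
May fall back to CPU execution in some cases.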
torch.optim.Adam.add_param_group -
torch.optim.Adam.load_state_dict -
torch.optim.Adam.register_load_state_dict_post_hook -
torch.optim.Adam.register_load_state_dict_pre_hook -
torch.optim.Adam.register_state_dict_post_hook -
torch.optim.Adam.register_state_dict_pre_hook -
torch.optim.Adam.register_step_post_hook -
torch.optim.Adam.register_step_pre_hook -
torch.optim.Adam.state_dict -
torch.optim.Adam.step -
torch.optim.Adam.zero_grad -
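torch.optim.AdamW Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.
With fused enabled (fused=True), the grad_scale and found_inf parameters are not yet supported. The implementation is aligned against _single_tensor_adamw: fp32 results match CPU/CUDA, whereas fp16 and bf16 are computed in upcast precision and do not match CPU/CUDA.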
torch.optim.AdamW.add_param_group -
torch.optim.AdamW.load_state_dict -
torch.optim.AdamW.register_load_state_dict_post_hook -
torch.optim.AdamW.register_load_state_dict_pre_hook -
torch.optim.AdamW.register_state_dict_post_hook -
torch.optim.AdamW.register_state_dict_pre_hook -
torch.optim.AdamW.register_step_post_hook -
torch.optim.AdamW.register_step_pre_hook -
torch.optim.AdamW.state_dict -
torch.optim.AdamW.step Supports fp16, fp32
torch.optim.AdamW.zero_grad Supports fp16, fp32
torch.optim.SparseAdam.add_param_group -
torch.optim.SparseAdam.load_state_dict -
torch.optim.SparseAdam.register_load_state_dict_post_hook -
torch.optim.SparseAdam.register_load_state_dict_pre_hook -
torch.optim.SparseAdam.register_state_dict_post_hook -
torch.optim.SparseAdam.register_state_dict_pre_hook -
torch.optim.SparseAdam.register_step_post_hook -
torch.optim.SparseAdam.register_step_pre_hook -
torch.optim.SparseAdam.step - -
torch.optim.SparseAdam.zero_grad - -
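torch.optim.Adamax Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.
torch.optim.Adamax.add_param_group Supports fp16, fp32
torch.optim.Adamax.load_state_dict Supports fp16, fp32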
torch.optim.Adamax.register_load_state_dict_post_hook -
torch.optim.Adamax.register_load_state_dict_pre_hook -
torch.optim.Adamax.register_state_dict_post_hook -
torch.optim.Adamax.register_state_dict_pre_hook -
torch.optim.Adamax.register_step_post_hook -
torch.optim.Adamax.register_step_pre_hook -
torch.optim.Adamax.state_dict Supports fp16, fp32
torch.optim.Adamax.step Supports fp16, fp32
torch.optim.Adamax.zero_grad Supports fp16, fp32
torch.optim.ASGD Supports fp16, fp32
torch.optim.ASGD.add_param_group Supports fp16, fp32
torch.optim.ASGD.load_state_dict Supports fp16, fp32
torch.optim.ASGD.register_load_state_dict_post_hook -
torch.optim.ASGD.register_load_state_dict_pre_hook -
torch.optim.ASGD.register_state_dict_post_hook -
torch.optim.ASGD.register_state_dict_pre_hook -
torch.optim.ASGD.register_step_post_hook -
torch.optim.ASGD.register_step_pre_hook -
torch.optim.ASGD.state_dict Supports fp16, fp32
torch.optim.ASGD.step Supports fp16, fp32
torch.optim.ASGD.zero_grad Supports fp16, fp32
torch.optim.LBFGS -
torch.optim.LBFGS.add_param_group -
torch.optim.LBFGS.load_state_dict -
torch.optim.LBFGS.register_load_state_dict_post_hook -
torch.optim.LBFGS.register_load_state_dict_pre_hook -
torch.optim.LBFGS.register_state_dict_post_hook -
torch.optim.LBFGS.register_state_dict_pre_hook -
torch.optim.LBFGS.register_step_post_hook -
torch.optim.LBFGS.register_step_pre_hook -
torch.optim.LBFGS.state_dict -
torch.optim.LBFGS.step -
torch.optim.LBFGS.zero_grad -
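torch.optim.NAdam Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.
torch.optim.NAdam.add_param_group Supports fp16, fp32
torch.optim.NAdam.load_state_dict Supports fp16, fp32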
torch.optim.NAdam.register_load_state_dict_post_hook -
torch.optim.NAdam.register_load_state_dict_pre_hook -
torch.optim.NAdam.register_state_dict_post_hook -
torch.optim.NAdam.register_state_dict_pre_hook -
torch.optim.NAdam.register_step_post_hook -
torch.optim.NAdam.register_step_pre_hook -
torch.optim.NAdam.state_dict Supports fp16, fp32
torch.optim.NAdam.step Supports fp16, fp32
torch.optim.NAdam.zero_grad Supports fp16, fp32
torch.optim.RAdam Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.
torch.optim.RAdam.add_param_group Supports fp16, fp32
torch.optim.RAdam.load_state_dict Supports fp16, fp32
torch.optim.RAdam.register_load_state_dict_post_hook -
torch.optim.RAdam.register_load_state_dict_pre_hook -
torch.optim.RAdam.register_state_dict_post_hook -
torch.optim.RAdam.register_state_dict_pre_hook -
torch.optim.RAdam.register_step_post_hook -
torch.optim.RAdam.register_step_pre_hook -
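torch.optim.RAdam.state_dict Supports fp16, fp32
torch.optim.RAdam.step Supports fp16, fp32
torch.optim.RAdam.zero_grad Supports fp16, fp32
torch.optim.RMSprop Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.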
torch.optim.RMSprop.add_param_group -
torch.optim.RMSprop.load_state_dict -
torch.optim.RMSprop.register_load_state_dict_post_hook -
torch.optim.RMSprop.register_load_state_dict_pre_hook -
torch.optim.RMSprop.register_state_dict_post_hook -
torch.optim.RMSprop.register_state_dict_pre_hook -
torch.optim.RMSprop.register_step_post_hook -
torch.optim.RMSprop.register_step_pre_hook -
torch.optim.RMSprop.state_dict -
torch.optim.RMSprop.step -
torch.optim.RMSprop.zero_grad -
torch.optim.Rprop -
torch.optim.Rprop.add_param_group Supports fp16, fp32
torch.optim.Rprop.load_state_dict Supports fp16, fp32
torch.optim.Rprop.register_load_state_dict_post_hook -
torch.optim.Rprop.register_load_state_dict_pre_hook -
torch.optim.Rprop.register_state_dict_post_hook -
torch.optim.Rprop.register_state_dict_pre_hook -
torch.optim.Rprop.register_step_post_hook -
torch.optim.Rprop.register_step_pre_hook -
torch.optim.Rprop.state_dict Supports fp16, fp32
torch.optim.Rprop.step Supports fp16, fp32
torch.optim.Rprop.zero_grad Supports fp16, fp32
torch.optim.SGD Supports bf16, fp16, fp32
With foreach enabled (the default foreach=None, or foreach=True), performance degrades when the optimizer manages too many parameter groups, owing to how foreach operators work. In that case, set foreach=False.
torch.optim.SGD.add_param_group Supports fp16, fp32
torch.optim.SGD.load_state_dict Supports fp16, fp32
torch.optim.SGD.register_load_state_dict_post_hook -
torch.optim.SGD.register_load_state_dict_pre_hook -
torch.optim.SGD.register_state_dict_post_hook -
torch.optim.SGD.register_state_dict_pre_hook -
torch.optim.SGD.register_step_post_hook -
torch.optim.SGD.register_step_pre_hook -
torch.optim.SGD.state_dict Supports fp16, fp32
torch.optim.SGD.step Supports fp16, fp32
torch.optim.SGD.zero_grad Supports fp16, fp32
torch.optim.lr_scheduler.LambdaLR -
torch.optim.lr_scheduler.LambdaLR.get_last_lr -
torch.optim.lr_scheduler.LambdaLR.load_state_dict -
torch.optim.lr_scheduler.LambdaLR.print_lr -
torch.optim.lr_scheduler.LambdaLR.state_dict -
torch.optim.lr_scheduler.MultiplicativeLR -
torch.optim.lr_scheduler.MultiplicativeLR.get_last_lr Supports fp32
torch.optim.lr_scheduler.MultiplicativeLR.load_state_dict Supports fp32
torch.optim.lr_scheduler.MultiplicativeLR.print_lr Supports fp32
torch.optim.lr_scheduler.MultiplicativeLR.state_dict Supports fp32
torch.optim.lr_scheduler.StepLR -
torch.optim.lr_scheduler.StepLR.get_last_lr Supports fp16, fp32
torch.optim.lr_scheduler.StepLR.load_state_dict Supports fp16, fp32
torch.optim.lr_scheduler.StepLR.print_lr Supports fp16, fp32
torch.optim.lr_scheduler.StepLR.state_dict Supports fp16, fp32
torch.optim.lr_scheduler.MultiStepLR -
torch.optim.lr_scheduler.MultiStepLR.get_last_lr Supports fp16, fp32
torch.optim.lr_scheduler.MultiStepLR.load_state_dict Supports fp16, fp32
torch.optim.lr_scheduler.MultiStepLR.print_lr Supports fp16, fp32
torch.optim.lr_scheduler.MultiStepLR.state_dict Supports fp16, fp32
torch.optim.lr_scheduler.ConstantLR -
torch.optim.lr_scheduler.ConstantLR.get_last_lr Supports fp32
torch.optim.lr_scheduler.ConstantLR.load_state_dict Supports fp32
torch.optim.lr_scheduler.ConstantLR.print_lr Supports fp32
torch.optim.lr_scheduler.ConstantLR.state_dict Supports fp32
torch.optim.lr_scheduler.LinearLR -
torch.optim.lr_scheduler.LinearLR.get_last_lr -
torch.optim.lr_scheduler.LinearLR.load_state_dict -
torch.optim.lr_scheduler.LinearLR.print_lr -
torch.optim.lr_scheduler.LinearLR.state_dict -
torch.optim.lr_scheduler.ExponentialLR -
torch.optim.lr_scheduler.ExponentialLR.get_last_lr -
torch.optim.lr_scheduler.ExponentialLR.load_state_dict -
torch.optim.lr_scheduler.ExponentialLR.print_lr -
torch.optim.lr_scheduler.ExponentialLR.state_dict -
torch.optim.lr_scheduler.PolynomialLR -
torch.optim.lr_scheduler.PolynomialLR.get_last_lr -
torch.optim.lr_scheduler.PolynomialLR.load_state_dict -
torch.optim.lr_scheduler.PolynomialLR.print_lr -
torch.optim.lr_scheduler.PolynomialLR.state_dict -
torch.optim.lr_scheduler.CosineAnnealingLR -
torch.optim.lr_scheduler.CosineAnnealingLR.get_last_lr -
torch.optim.lr_scheduler.CosineAnnealingLR.load_state_dict -
torch.optim.lr_scheduler.CosineAnnealingLR.print_lr -
torch.optim.lr_scheduler.CosineAnnealingLR.state_dict -
torch.optim.lr_scheduler.ChainedScheduler -
torch.optim.lr_scheduler.ChainedScheduler.get_last_lr -
torch.optim.lr_scheduler.ChainedScheduler.load_state_dict -
torch.optim.lr_scheduler.ChainedScheduler.print_lr -
torch.optim.lr_scheduler.ChainedScheduler.state_dict -
torch.optim.lr_scheduler.SequentialLR -
torch.optim.lr_scheduler.SequentialLR.get_last_lr -
torch.optim.lr_scheduler.SequentialLR.load_state_dict -
torch.optim.lr_scheduler.SequentialLR.print_lr -
torch.optim.lr_scheduler.SequentialLR.state_dict -
torch.optim.lr_scheduler.ReduceLROnPlateau -
torch.optim.lr_scheduler.CyclicLR -
torch.optim.lr_scheduler.CyclicLR.get_last_lr -
torch.optim.lr_scheduler.CyclicLR.get_lr -
torch.optim.lr_scheduler.CyclicLR.print_lr -
torch.optim.lr_scheduler.OneCycleLR -
torch.optim.lr_scheduler.OneCycleLR.get_last_lr -
torch.optim.lr_scheduler.OneCycleLR.load_state_dict -
torch.optim.lr_scheduler.OneCycleLR.print_lr -
torch.optim.lr_scheduler.OneCycleLR.state_dict -
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts -
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.get_last_lr -
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.load_state_dict -
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.print_lr -
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.state_dict -
torch.optim.lr_scheduler.CosineAnnealingWarmRestarts.step -
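
Several optimizer rows in the table recommend foreach=False when many parameter groups are being optimized. The following is a minimal sketch of that setting, assuming a hypothetical toy model (torch.nn.Linear) and torch.optim.Adam as the example optimizer. It runs on the CPU for portability, so device placement for the target backend is omitted, and it is an illustration of the option rather than a tuned configuration.

```python
import torch

# Hypothetical toy model, used only to have some parameters to optimize.
model = torch.nn.Linear(4, 2)

# One parameter group per tensor: a small stand-in for the
# "too many parameter groups" situation described in the table.
param_groups = [{"params": [p]} for p in model.parameters()]

# foreach=False selects the per-tensor update path instead of the
# multi-tensor (foreach) path.
optimizer = torch.optim.Adam(param_groups, lr=1e-3, foreach=False)

# One ordinary training step.
loss = model(torch.randn(8, 4)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()

# Each group records the foreach setting it was created with.
uses_foreach = [group["foreach"] for group in optimizer.param_groups]
```

Whether foreach=False actually helps depends on the model and the number of groups, so it is worth timing both settings before committing to one.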