The module pyro.optim provides support for optimization in Pyro. In particular, it provides PyroOptim, which wraps PyTorch optimizers and manages optimizers for dynamically generated parameters (see the tutorial SVI Part I for a discussion). Any custom optimization algorithms are also found here.


class PyroOptim(optim_constructor, optim_args)[source]

Bases: object

A wrapper for torch.optim.Optimizer objects that helps with managing dynamically generated parameters.

Parameters:
  • optim_constructor – a torch.optim.Optimizer
  • optim_args – a dictionary of learning arguments for the optimizer, or a callable that returns such dictionaries
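To illustrate the idea behind the wrapper, here is a minimal sketch (not Pyro's actual implementation, and the class and attribute names are illustrative) of how per-parameter optimizers can be created lazily as new parameters appear, with optim_args given either as a dict or as a callable:

```python
# Sketch of dynamic-parameter management: an optimizer instance is created
# per parameter, the first time that parameter is seen.
class LazyOptimWrapper:
    def __init__(self, optim_constructor, optim_args):
        self.optim_constructor = optim_constructor
        # optim_args may be a dict or a callable returning a dict per parameter
        self.optim_args = optim_args
        self.optim_objs = {}  # parameter name -> optimizer instance

    def _get_args(self, name):
        # support both the dict form and the callable form of optim_args
        return self.optim_args(name) if callable(self.optim_args) else self.optim_args

    def __call__(self, params):
        for name in params:
            if name not in self.optim_objs:
                self.optim_objs[name] = self.optim_constructor(**self._get_args(name))


# Toy stand-in for a torch.optim.Optimizer, to keep the sketch dependency-free
class ToyOptimizer:
    def __init__(self, lr=0.01):
        self.lr = lr


wrapper = LazyOptimWrapper(ToyOptimizer, {"lr": 0.1})
wrapper(["weight", "bias"])   # optimizers are created on first sight
wrapper(["weight", "scale"])  # only "scale" gets a new optimizer
```

The callable form of optim_args is what allows different parameters to receive different hyperparameters (e.g. per-module learning rates).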

get_state()[source]

Get state associated with all the optimizers in the form of a dictionary with key-value pairs (parameter name, optim state dicts).


load(filename)[source]

Load optimizer state from disk.

Parameters: filename – file name to load from


save(filename)[source]

Save optimizer state to disk.

Parameters: filename – file name to save to


set_state(state_dict)[source]

Set the state associated with all the optimizers using the state obtained from a previous call to get_state().
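The state-management methods above can be sketched as follows. This is a hedged, dependency-free illustration of the pattern, not Pyro's implementation: the per-parameter optimizer state dicts are collected into one dictionary, and serialization handles the disk round trip (pickle stands in here for whatever serializer the library actually uses):

```python
import os
import pickle
import tempfile


# Toy stand-in for a torch.optim.Optimizer with state_dict/load_state_dict
class ToyOptimizer:
    def __init__(self, lr=0.01):
        self.lr = lr

    def state_dict(self):
        return {"lr": self.lr}

    def load_state_dict(self, state):
        self.lr = state["lr"]


class StatefulWrapper:
    def __init__(self):
        self.optim_objs = {}  # parameter name -> optimizer instance

    def get_state(self):
        # key-value pairs: (parameter name, optim state dict)
        return {name: opt.state_dict() for name, opt in self.optim_objs.items()}

    def set_state(self, state_dict):
        for name, state in state_dict.items():
            self.optim_objs[name].load_state_dict(state)

    def save(self, filename):
        with open(filename, "wb") as f:
            pickle.dump(self.get_state(), f)

    def load(self, filename):
        with open(filename, "rb") as f:
            self.set_state(pickle.load(f))


w = StatefulWrapper()
w.optim_objs["weight"] = ToyOptimizer(lr=0.1)
path = os.path.join(tempfile.mkdtemp(), "optim_state.pkl")
w.save(path)
w.optim_objs["weight"].lr = 999.0  # perturb, then restore from disk
w.load(path)
```

After load(), the perturbed learning rate is restored to the saved value, which is the typical use case for checkpointing an SVI run.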


class ClippedAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, clip_norm=10.0, lrd=1.0)[source]

Bases: torch.optim.optimizer.Optimizer

Small modification to the Adam algorithm implemented in torch.optim.Adam to include gradient clipping and learning rate decay.

Parameters:
  • params – iterable of parameters to optimize or dicts defining parameter groups
  • lr – learning rate (default: 1e-3)
  • betas (Tuple) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
  • eps – term added to the denominator to improve numerical stability (default: 1e-8)
  • weight_decay – weight decay (L2 penalty) (default: 0)
  • clip_norm – magnitude of norm to which gradients are clipped (default: 10.0)
  • lrd – rate at which learning rate decays (default: 1.0)
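The update rule can be sketched for a single scalar parameter as below. This is a reading of the parameter descriptions above, not Pyro's verbatim implementation: it assumes the gradient is clamped elementwise to [-clip_norm, clip_norm] and that the learning rate is multiplied by lrd after every step:

```python
import math


def clipped_adam(grad_fn, x, lr=0.001, betas=(0.9, 0.999), eps=1e-8,
                 clip_norm=10.0, lrd=1.0, steps=100):
    """Adam with gradient clipping and learning-rate decay, for one scalar."""
    b1, b2 = betas
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        g = max(-clip_norm, min(clip_norm, g))  # clip the gradient
        m = b1 * m + (1 - b1) * g               # first-moment running average
        v = b2 * v + (1 - b2) * g * g           # second-moment running average
        m_hat = m / (1 - b1 ** t)               # bias correction
        v_hat = v / (1 - b2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
        lr *= lrd                               # decay the learning rate
    return x


# Minimize f(x) = x**2 (gradient 2*x) starting far from the optimum, so the
# large initial gradient actually gets clipped.
x_final = clipped_adam(lambda x: 2 * x, x=100.0, lr=0.5, clip_norm=10.0,
                       lrd=0.999, steps=500)
```

Clipping bounds the size of early updates when gradients are large, while lrd < 1.0 gives an exponentially decaying step size, which helps SVI settle near an optimum.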


Reference: Adam: A Method for Stochastic Optimization, Diederik P. Kingma, Jimmy Ba https://arxiv.org/abs/1412.6980


step(closure=None)[source]

Performs a single optimization step.

Parameters: closure – An optional closure that reevaluates the model and returns the loss.
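The closure protocol follows torch.optim convention: step() may call the closure to re-evaluate the loss, and returns whatever the closure returns. A dependency-free sketch with a toy gradient-descent optimizer (the names here are illustrative, not part of any library API):

```python
# Toy optimizer demonstrating the step(closure) protocol: the closure
# recomputes the loss and gradient before the parameter update is applied.
class ToySGD:
    def __init__(self, params, lr=0.1):
        self.params = params  # list of [value, gradient] pairs
        self.lr = lr

    def step(self, closure=None):
        loss = closure() if closure is not None else None
        for p in self.params:
            p[0] -= self.lr * p[1]  # gradient-descent update
        return loss


# One parameter x = 3.0 minimizing f(x) = x**2; the closure recomputes the
# loss and writes the gradient 2*x into the parameter's gradient slot.
param = [3.0, 0.0]
opt = ToySGD([param], lr=0.1)


def closure():
    x = param[0]
    param[1] = 2 * x  # gradient of x**2
    return x * x      # loss


loss = opt.step(closure)
```

Most first-order optimizers never need the closure, but optimizers that evaluate the loss more than once per step (such as L-BFGS in torch.optim) rely on it.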