Optimization¶

The module pyro.optim provides support for optimization in Pyro. In particular, it provides PyroOptim, which is used to wrap PyTorch optimizers and to manage optimizers for dynamically generated parameters (see the SVI Part I tutorial for a discussion). Custom optimization algorithms are also found here.

PyroOptim¶

class PyroOptim(optim_constructor, optim_args)[source]

Bases: object

A wrapper for torch.optim.Optimizer objects that helps with managing dynamically generated parameters

Parameters:
optim_constructor – a torch.optim.Optimizer
optim_args – a dictionary of learning arguments for the optimizer, or a callable that returns such dictionaries
get_state()[source]

Get the state of all the optimizers in the form of a dictionary mapping parameter names to optimizer state dicts

load(filename)[source]

Load optimizer state from disk

Parameters: filename – file name to load from

save(filename)[source]

Save optimizer state to disk

Parameters: filename – file name to save to
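The save/load round trip can be sketched with the standard library. This is a pickle-based stand-in for illustration only (the function names `save_state`/`load_state` are hypothetical; the real implementation likely delegates to PyTorch serialization):

```python
import os
import pickle
import tempfile

def save_state(state, filename):
    # Persist an optimizer-state dictionary to disk
    # (pickle stands in for torch.save here).
    with open(filename, "wb") as f:
        pickle.dump(state, f)

def load_state(filename):
    # Read the optimizer-state dictionary back from disk.
    with open(filename, "rb") as f:
        return pickle.load(f)

# Round-trip a small state dict through a temporary file.
state = {"loc": {"step": 3, "exp_avg": 0.12}}
path = os.path.join(tempfile.mkdtemp(), "optim_state.pkl")
save_state(state, path)
restored = load_state(path)
```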

set_state(state_dict)[source]

Set the state of all the optimizers using the state obtained from a previous call to get_state()
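The per-parameter bookkeeping can be sketched in plain Python. Everything below is an illustrative toy (the classes `SimpleSGD` and `ToyPyroOptim` are hypothetical, not Pyro code): the wrapper lazily creates one optimizer instance per parameter the first time that parameter appears, which is how dynamically generated parameters can be handled, and `get_state`/`set_state` round-trip the per-parameter state dicts.

```python
class SimpleSGD:
    """Toy optimizer for a single scalar parameter (stand-in for torch.optim)."""
    def __init__(self, lr=0.1):
        self.lr = lr

    def step(self, value, grad):
        return value - self.lr * grad

    def state_dict(self):
        return {"lr": self.lr}

    def load_state_dict(self, state):
        self.lr = state["lr"]


class ToyPyroOptim:
    """Sketch of the wrapper idea: one optimizer per parameter, created lazily."""
    def __init__(self, optim_constructor, optim_args):
        self.optim_constructor = optim_constructor
        self.optim_args = optim_args
        self.optim_objs = {}  # parameter name -> optimizer instance

    def _get_optim(self, name):
        # A parameter seen for the first time gets its own fresh optimizer.
        if name not in self.optim_objs:
            self.optim_objs[name] = self.optim_constructor(**self.optim_args)
        return self.optim_objs[name]

    def step(self, params, grads):
        return {name: self._get_optim(name).step(value, grads[name])
                for name, value in params.items()}

    def get_state(self):
        # (parameter name, optim state dict) pairs, mirroring get_state()
        return {name: opt.state_dict() for name, opt in self.optim_objs.items()}

    def set_state(self, state_dict):
        # Restore each optimizer from a previous get_state() result.
        for name, state in state_dict.items():
            self._get_optim(name).load_state_dict(state)


optim = ToyPyroOptim(SimpleSGD, {"lr": 0.5})
params = optim.step({"loc": 4.0}, {"loc": 2.0})  # loc: 4.0 - 0.5 * 2.0 = 3.0
state = optim.get_state()                        # {"loc": {"lr": 0.5}}
```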

class ClippedAdam(params, lr=0.001, betas=(0.9, 0.999), eps=1e-08, weight_decay=0, clip_norm=10.0, lrd=1.0)[source]

Bases: torch.optim.optimizer.Optimizer

Small modification to the Adam algorithm implemented in torch.optim.Adam to include gradient clipping and learning rate decay.

Parameters:
params – iterable of parameters to optimize or dicts defining parameter groups
lr – learning rate (default: 1e-3)
betas (Tuple) – coefficients used for computing running averages of the gradient and its square (default: (0.9, 0.999))
eps – term added to the denominator to improve numerical stability (default: 1e-8)
weight_decay – weight decay (L2 penalty) (default: 0)
clip_norm – magnitude of norm to which gradients are clipped (default: 10.0)
lrd – rate at which learning rate decays (default: 1.0)

Reference

A Method for Stochastic Optimization, Diederik P. Kingma, Jimmy Ba https://arxiv.org/abs/1412.6980
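The two modifications can be sketched for a single scalar parameter in plain Python. This is an illustrative re-implementation under stated assumptions, not the actual torch.optim-based code: the gradient is rescaled so its magnitude is at most clip_norm, and the effective learning rate is lr * lrd**t at step t.

```python
import math

def clipped_adam_step(x, grad, state, lr=1e-3, betas=(0.9, 0.999),
                      eps=1e-8, clip_norm=10.0, lrd=1.0):
    """One Adam step on a scalar x, with gradient clipping and lr decay.

    `state` holds the running moment estimates and the step count.
    """
    beta1, beta2 = betas
    # Gradient clipping: rescale so the gradient magnitude is at most clip_norm.
    if abs(grad) > clip_norm:
        grad = grad * clip_norm / abs(grad)
    state["t"] += 1
    # Learning rate decay: multiply lr by lrd once per step.
    step_lr = lr * lrd ** state["t"]
    # Standard Adam moment estimates with bias correction.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad * grad
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return x - step_lr * m_hat / (math.sqrt(v_hat) + eps)

# Minimize f(x) = x**2 from x = 5; the gradient is 2*x, so early
# gradients (up to 10) are clipped to clip_norm = 1.0.
state = {"m": 0.0, "v": 0.0, "t": 0}
x = 5.0
for _ in range(200):
    x = clipped_adam_step(x, 2 * x, state, lr=0.1, clip_norm=1.0, lrd=0.999)
```

With lrd left at its default of 1.0 the learning rate stays constant, recovering ordinary Adam plus clipping.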

step(closure=None)[source]

Performs a single optimization step.

Parameters: closure – an optional closure that re-evaluates the model and returns the loss