# Optimal Experiment Design

The `pyro.contrib.oed` module provides tools to create optimal experiment designs for Pyro models. In particular, it provides estimators for the average posterior entropy (APE) criterion.

To estimate the APE for a particular design, use:

```python
def model(design):
    ...

eig = vi_ape(model, design, ...)
```


APE can then be minimised using existing optimisers in `pyro.optim`.

## Expected Information Gain

`vi_ape(model, design, observation_labels, target_labels, vi_parameters, is_parameters, y_dist=None)`

Estimates the average posterior entropy (APE) loss function using variational inference (VI).

The APE loss function estimated by this method is defined as

$$APE(d)=E_{Y\sim p(y|\theta, d)}[H(p(\theta|Y, d))]$$

where $$H[p(x)]$$ is the differential entropy. The APE is related to expected information gain (EIG) by the equation

$$EIG(d)=H[p(\theta)]-APE(d)$$

In particular, minimising the APE is equivalent to maximising the EIG, since the prior entropy $$H[p(\theta)]$$ does not depend on the design $$d$$.

**Parameters:**

- **model** (*function*) – A pyro model accepting `design` as its only argument.
- **design** (*torch.Tensor*) – Tensor representation of the design.
- **observation_labels** (*list*) – A subset of the sample sites present in `model`. These sites are regarded as future observations; all other sites are regarded as latent variables over which a posterior is to be inferred.
- **target_labels** (*list*) – A subset of the sample sites over which the posterior entropy is to be measured.
- **vi_parameters** (*dict*) – Variational inference parameters, which should include: `optim`, an instance of `pyro.Optim`; `guide`, a guide function compatible with `model`; `num_steps`, the number of VI steps to take; and `loss`, the loss function to use for VI.
- **is_parameters** (*dict*) – Importance sampling parameters for the marginal distribution of $$Y$$. May include `num_samples`, the number of samples to draw from the marginal.
- **y_dist** (*pyro.distributions.Distribution*) – (optional) The distribution assumed for the response variable $$Y$$.

**Returns:** Loss function estimate

**Return type:** torch.Tensor
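The EIG–APE relationship above can be checked in closed form on a hypothetical conjugate Gaussian model ($$\theta \sim N(0,1)$$, $$y \mid \theta \sim N(\theta, \sigma^2)$$, with the noise level $$\sigma$$ standing in for the design $$d$$). This is an illustration only and does not use `vi_ape`:

```python
import math

def prior_entropy():
    # differential entropy of the N(0, 1) prior: 0.5 * log(2 * pi * e)
    return 0.5 * math.log(2 * math.pi * math.e)

def ape(sigma):
    # the posterior of theta given y is Gaussian with variance
    # sigma^2 / (1 + sigma^2), independent of y, so the average
    # posterior entropy is just the entropy of that Gaussian
    post_var = sigma ** 2 / (1 + sigma ** 2)
    return 0.5 * math.log(2 * math.pi * math.e * post_var)

def eig(sigma):
    # closed-form expected information gain for this conjugate model
    return 0.5 * math.log(1 + 1 / sigma ** 2)

# EIG(d) = H[p(theta)] - APE(d): lower noise gives lower APE, hence higher EIG
```

Note that for this model, reducing the observation noise reduces APE and increases EIG by the same amount, since the prior entropy term is fixed.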
`naive_rainforth_eig(model, design, observation_labels, target_labels=None, N=100, M=10, M_prime=None)`

Naive Rainforth (i.e. Nested Monte Carlo) estimate of the expected information gain (EIG). The estimate is

$$\frac{1}{N}\sum_{n=1}^N \log p(y_n | \theta_n, d) - \log \left(\frac{1}{M}\sum_{m=1}^M p(y_n | \theta_m, d)\right)$$

Monte Carlo estimation is attempted for the $$\log p(y | \theta, d)$$ term if the parameter `M_prime` is passed. Otherwise, it is assumed that $$\log p(y | \theta, d)$$ can safely be read from the model itself.

**Parameters:**

- **model** (*function*) – A pyro model accepting `design` as its only argument.
- **design** (*torch.Tensor*) – Tensor representation of the design.
- **observation_labels** (*list*) – A subset of the sample sites present in `model`. These sites are regarded as future observations; all other sites are regarded as latent variables over which a posterior is to be inferred.
- **target_labels** (*list*) – A subset of the sample sites over which the posterior entropy is to be measured.
- **N** (*int*) – Number of outer expectation samples.
- **M** (*int*) – Number of inner expectation samples for $$p(y|d)$$.
- **M_prime** (*int*) – Number of samples for $$p(y | \theta, d)$$ if required.

**Returns:** EIG estimate

**Return type:** torch.Tensor
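To make the nested Monte Carlo estimator concrete, here is a pure-Python sketch (not the library function) applied to a hypothetical conjugate Gaussian model $$\theta \sim N(0,1)$$, $$y \mid \theta \sim N(\theta, 1)$$, whose EIG is $$\frac{1}{2}\log 2$$ in closed form. Here $$\log p(y | \theta, d)$$ is read directly from the model, as in the `M_prime=None` case:

```python
import math
import random

random.seed(0)
SIGMA = 1.0  # observation noise (standing in for the design d)

def log_normal_pdf(x, mean, sd):
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)

def nmc_eig(N=1000, M=1000):
    total = 0.0
    for _ in range(N):
        theta_n = random.gauss(0.0, 1.0)    # theta_n ~ p(theta)
        y_n = random.gauss(theta_n, SIGMA)  # y_n ~ p(y | theta_n, d)
        # log p(y_n | theta_n, d), read directly from the model
        log_lik = log_normal_pdf(y_n, theta_n, SIGMA)
        # inner sum: (1/M) * sum_m p(y_n | theta_m, d) with fresh prior draws
        inner = sum(
            math.exp(log_normal_pdf(y_n, random.gauss(0.0, 1.0), SIGMA))
            for _ in range(M)
        )
        total += log_lik - math.log(inner / M)
    return total / N

analytic_eig = 0.5 * math.log(2)  # closed form for this conjugate model
estimate = nmc_eig()
```

The estimate converges to the analytic value as `N` and `M` grow; the inner average is the main source of bias, which shrinks at rate $$O(1/M)$$.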
`donsker_varadhan_eig(model, design, observation_labels, target_labels, num_samples, num_steps, T, optim, return_history=False, final_design=None, final_num_samples=None)`

Donsker-Varadhan estimate of the expected information gain (EIG).

The Donsker-Varadhan representation of EIG is

$$\sup_T E_{p(y, \theta | d)}[T(y, \theta)] - \log E_{p(y|d)p(\theta)}[\exp(T(\bar{y}, \bar{\theta}))]$$

where $$T$$ is any (measurable) function.

This method optimises the loss function over a pre-specified class of functions $$T$$.

**Parameters:**

- **model** (*function*) – A pyro model accepting `design` as its only argument.
- **design** (*torch.Tensor*) – Tensor representation of the design.
- **observation_labels** (*list*) – A subset of the sample sites present in `model`. These sites are regarded as future observations; all other sites are regarded as latent variables over which a posterior is to be inferred.
- **target_labels** (*list*) – A subset of the sample sites over which the posterior entropy is to be measured.
- **num_samples** (*int*) – Number of samples per iteration.
- **num_steps** (*int*) – Number of optimisation steps.
- **T** (*function or torch.nn.Module*) – Optimisable function $$T$$ for use in the Donsker-Varadhan loss function.
- **optim** (*pyro.optim.Optim*) – Optimiser to use.
- **return_history** (*bool*) – If `True`, also returns a tensor giving the loss function at each step of the optimisation.
- **final_design** (*torch.Tensor*) – The final design tensor to evaluate at. If `None`, uses `design`.
- **final_num_samples** (*int*) – The number of samples to use at the final evaluation. If `None`, uses `num_samples`.

**Returns:** EIG estimate, optionally with the full optimisation history

**Return type:** torch.Tensor or tuple
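The Donsker-Varadhan representation can be verified exactly on a small discrete toy problem (a hypothetical example, not using the module): with the optimal critic $$T^*(y, \theta) = \log \frac{p(y, \theta)}{p(y)p(\theta)}$$, the bound is tight and recovers the mutual information between $$y$$ and $$\theta$$, which is the EIG; any other $$T$$ gives a smaller value.

```python
import math
from itertools import product

# hypothetical joint p(y, theta) on {0,1}^2, chosen to be correlated
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_y = {y: sum(p[(y, t)] for t in (0, 1)) for y in (0, 1)}
p_t = {t: sum(p[(y, t)] for y in (0, 1)) for t in (0, 1)}

def dv_bound(T):
    # E_{p(y, theta)}[T(y, theta)] - log E_{p(y) p(theta)}[exp(T(y, theta))]
    joint_term = sum(p[(y, t)] * T(y, t) for (y, t) in p)
    prod_term = sum(
        p_y[y] * p_t[t] * math.exp(T(y, t))
        for y, t in product((0, 1), (0, 1))
    )
    return joint_term - math.log(prod_term)

# mutual information between y and theta (the EIG for this toy problem)
mi = sum(p[(y, t)] * math.log(p[(y, t)] / (p_y[y] * p_t[t])) for (y, t) in p)

def t_star(y, t):
    # optimal critic: the log density ratio
    return math.log(p[(y, t)] / (p_y[y] * p_t[t]))
```

In the library, `T` is instead a parametric function (e.g. a neural network) whose parameters are trained with `optim` to push the bound towards this supremum.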
`barber_agakov_ape(model, design, observation_labels, target_labels, num_samples, num_steps, guide, optim, return_history=False, final_design=None, final_num_samples=None)`

Barber-Agakov estimate of average posterior entropy (APE).

The Barber-Agakov representation of APE is

$$\sup_{q} E_{p(y, \theta | d)}[\log q(\theta | y, d)]$$

where $$q$$ is any distribution on $$\theta$$.

This method optimises the loss over a given guide family `guide` representing $$q$$.

**Parameters:**

- **model** (*function*) – A pyro model accepting `design` as its only argument.
- **design** (*torch.Tensor*) – Tensor representation of the design.
- **observation_labels** (*list*) – A subset of the sample sites present in `model`. These sites are regarded as future observations; all other sites are regarded as latent variables over which a posterior is to be inferred.
- **target_labels** (*list*) – A subset of the sample sites over which the posterior entropy is to be measured.
- **num_samples** (*int*) – Number of samples per iteration.
- **num_steps** (*int*) – Number of optimisation steps.
- **guide** (*function*) – Guide family for use in the (implicit) posterior estimation. The parameters of `guide` are optimised to maximise the Barber-Agakov objective.
- **optim** (*pyro.optim.Optim*) – Optimiser to use.
- **return_history** (*bool*) – If `True`, also returns a tensor giving the loss function at each step of the optimisation.
- **final_design** (*torch.Tensor*) – The final design tensor to evaluate at. If `None`, uses `design`.
- **final_num_samples** (*int*) – The number of samples to use at the final evaluation. If `None`, uses `num_samples`.

**Returns:** APE estimate, optionally with the full optimisation history

**Return type:** torch.Tensor or tuple
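The variational principle behind this estimator can be checked on a small discrete toy problem (a hypothetical example, not using the module): the Barber-Agakov objective $$E_{p(y, \theta)}[\log q(\theta | y)]$$ is maximised when $$q$$ equals the exact posterior, at which point it equals minus the average posterior entropy.

```python
import math

# hypothetical joint p(y, theta) on {0,1}^2
p = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
p_y = {y: sum(p[(y, t)] for t in (0, 1)) for y in (0, 1)}
posterior = {(y, t): p[(y, t)] / p_y[y] for (y, t) in p}  # exact p(theta | y)

def ba_objective(q):
    # Barber-Agakov objective: E_{p(y, theta)}[log q(theta | y)]
    return sum(p[(y, t)] * math.log(q[(y, t)]) for (y, t) in p)

# APE: the y-averaged entropy of the exact posterior
ape = -sum(p[(y, t)] * math.log(posterior[(y, t)]) for (y, t) in p)

# a deliberately poor variational family member: q uniform in theta
uniform_q = {k: 0.5 for k in p}
```

In the library, the role of `posterior` is played by the trainable `guide`: its parameters are optimised with `optim` to drive the objective towards this optimum.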
`class EwmaLog(alpha)`

Logarithm function with exponentially weighted moving average for gradients.

For input `inputs`, this function returns `inputs.log()`. However, it computes the gradient as

$$\frac{\sum_{t=0}^{T-1} \alpha^t}{\sum_{t=0}^{T-1} \alpha^t x_{T-t}}$$

where $$x_t$$ are historical input values passed to this function, $$x_T$$ being the most recently seen value.

This gradient may help with numerical stability when the sequence of inputs to the function converges.
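The smoothing rule can be sketched in a few lines of plain Python (an illustrative re-implementation, not the pyro class, which operates on tensors inside autograd). For a convergent input sequence $$x_t \to x$$, the smoothed gradient approaches $$1/x$$, the ordinary derivative of the logarithm at the limit:

```python
import math

class EwmaLogSketch:
    """Illustrative pure-Python analogue of EwmaLog (hypothetical class,
    not the pyro implementation)."""

    def __init__(self, alpha):
        self.alpha = alpha
        self.history = []  # x_1, ..., x_T in order of arrival

    def __call__(self, x):
        self.history.append(x)
        return math.log(x)  # the forward value is the plain logarithm

    def grad(self):
        # gradient uses an exponentially weighted moving average of inputs:
        # (sum_{t=0}^{T-1} alpha^t) / (sum_{t=0}^{T-1} alpha^t * x_{T-t})
        T = len(self.history)
        num = sum(self.alpha ** t for t in range(T))
        den = sum(self.alpha ** t * self.history[T - 1 - t] for t in range(T))
        return num / den

# feed a convergent sequence x_t -> 2; the smoothed gradient approaches 1/2
ewma = EwmaLogSketch(alpha=0.9)
outputs = [ewma(2 + 1 / t) for t in range(1, 201)]
```

Because the gradient is a ratio of weighted sums over the whole history, a single noisy input moves it far less than it would move the instantaneous derivative $$1/x_T$$.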