Optimal Experiment Design¶
Tasks such as choosing the next question to ask in a psychology study, designing an election polling strategy, and deciding which compounds to synthesize and test in biological sciences are all fundamentally asking the same question: how do we design an experiment to maximize the information gathered? Bayesian optimal experimental design (BOED) is a powerful methodology for tackling such problems.
In the BOED framework, we begin with a Bayesian model with a likelihood \(p(y\theta,d)\) and a prior \(p(\theta)\) on the target latent variables. In Pyro, any fully Bayesian model can be used in the BOED framework. The sample sites corresponding to experimental outcomes are the observation sites, those corresponding to latent variables of interest are the target sites. The design \(d\) is the argument to the model, and is not a random variable.
In the BOED framework, we choose the design that optimizes the expected information gain (EIG) on the targets \(\theta\) from running the experiment
\(\text{EIG}(d) = \mathbf{E}_{p(yd)} [H[p(\theta)] − H[p(\thetay, d)]]\) ,
where \(H[·]\) represents the entropy and \(p(\thetay, d) \propto p(\theta)p(y\theta, d)\) is the posterior we get from running the experiment with design \(d\) and observing \(y\). In other words, the optimal design is the one that, in expectation over possible future observations, most reduces posterior entropy over the target latent variables. If the predictive model is correct, this forms a design strategy that is (onestep) optimal from an informationtheoretic viewpoint. For further details, see [1, 2].
The pyro.contrib.oed
module provides tools to create optimal experimental
designs for Pyro models. In particular, it provides estimators for the
expected information gain (EIG).
To estimate the EIG for a particular design, we first set up our Pyro model. For example:
def model(design):
# This line allows batching of designs, treating all batch dimensions as independent
with pyro.plate_stack("plate_stack", design.shape):
# We use a Normal prior for theta
theta = pyro.sample("theta", dist.Normal(torch.tensor(0.0), torch.tensor(1.0)))
# We use a simple logistic regression model for the likelihood
logit_p = theta  design
y = pyro.sample("y", dist.Bernoulli(logits=logit_p))
return y
We then select an appropriate EIG estimator, such as:
eig = nmc_eig(model, design, observation_labels=["y"], target_labels=["theta"], N=2500, M=50)
It is possible to estimate the EIG across a grid of designs:
designs = torch.stack([design1, design2], dim=0)
to find the best design from a number of options.
[1] Chaloner, Kathryn, and Isabella Verdinelli. “Bayesian experimental design: A review.” Statistical Science (1995): 273304.
[2] Foster, Adam, et al. “Variational Bayesian Optimal Experimental Design.” arXiv preprint arXiv:1903.05480 (2019).
Expected Information Gain¶

laplace_eig
(model, design, observation_labels, target_labels, guide, loss, optim, num_steps, final_num_samples, y_dist=None, eig=True, **prior_entropy_kwargs)[source]¶ Estimates the expected information gain (EIG) by making repeated Laplace approximations to the posterior.
Parameters:  model (function) – Pyro stochastic function taking design as only argument.
 design (torch.Tensor) – Tensor of possible designs.
 observation_labels (list) – labels of sample sites to be regarded as observables.
 target_labels (list) – labels of sample sites to be regarded as latent variables of interest, i.e. the sites that we wish to gain information about.
 guide (function) – Pyro stochastic function corresponding to model.
 loss – a Pyro loss such as pyro.infer.Trace_ELBO().differentiable_loss.
 optim – optimizer for the loss
 num_steps (int) – Number of gradient steps to take per sampled pseudoobservation.
 final_num_samples (int) – Number of y samples (pseudoobservations) to take.
 y_dist – Distribution to sample y from if None we use the Bayesian marginal distribution.
 eig (bool) – Whether to compute the EIG or the average posterior entropy (APE). The EIG is given by EIG = prior entropy  APE. If True, the prior entropy will be estimated analytically, or by Monte Carlo as appropriate for the model. If False the APE is returned.
 prior_entropy_kwargs (dict) – parameters for estimating the prior entropy: num_prior_samples indicating the number of samples for a MC estimate of prior entropy, and mean_field indicating if an analytic form for a meanfield prior should be tried.
Returns: EIG estimate, optionally includes full optimization history
Return type:

vi_eig
(model, design, observation_labels, target_labels, vi_parameters, is_parameters, y_dist=None, eig=True, **prior_entropy_kwargs)[source]¶ Deprecated since version 0.4.1: Use posterior_eig instead.
Estimates the expected information gain (EIG) using variational inference (VI).
The APE is defined as
\(APE(d)=E_{Y\sim p(y\theta, d)}[H(p(\thetaY, d))]\)where \(H[p(x)]\) is the differential entropy. The APE is related to expected information gain (EIG) by the equation
\(EIG(d)=H[p(\theta)]APE(d)\)in particular, minimising the APE is equivalent to maximising EIG.
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 vi_parameters (dict) – Variational inference parameters which should include:
optim: an instance of
pyro.Optim
, guide: a guide function compatible with model, num_steps: the number of VI steps to make, and loss: the loss function to use for VI  is_parameters (dict) – Importance sampling parameters for the marginal distribution of \(Y\). May include num_samples: the number of samples to draw from the marginal.
 y_dist (pyro.distributions.Distribution) – (optional) the distribution assumed for the response variable \(Y\)
 eig (bool) – Whether to compute the EIG or the average posterior entropy (APE). The EIG is given by EIG = prior entropy  APE. If True, the prior entropy will be estimated analytically, or by Monte Carlo as appropriate for the model. If False the APE is returned.
 prior_entropy_kwargs (dict) – parameters for estimating the prior entropy: num_prior_samples indicating the number of samples for a MC estimate of prior entropy, and mean_field indicating if an analytic form for a meanfield prior should be tried.
Returns: EIG estimate, optionally includes full optimization history
Return type:

nmc_eig
(model, design, observation_labels, target_labels=None, N=100, M=10, M_prime=None, independent_priors=False)[source]¶ Nested Monte Carlo estimate of the expected information gain (EIG). The estimate is, when there are not any random effects,
\[\frac{1}{N}\sum_{n=1}^N \log p(y_n  \theta_n, d)  \frac{1}{N}\sum_{n=1}^N \log \left(\frac{1}{M}\sum_{m=1}^M p(y_n  \theta_m, d)\right)\]where \(\theta_n, y_n \sim p(\theta, y  d)\) and \(\theta_m \sim p(\theta)\). The estimate in the presence of random effects is
\[\frac{1}{N}\sum_{n=1}^N \log \left(\frac{1}{M'}\sum_{m=1}^{M'} p(y_n  \theta_n, \widetilde{\theta}_{nm}, d)\right) \frac{1}{N}\sum_{n=1}^N \log \left(\frac{1}{M}\sum_{m=1}^{M} p(y_n  \theta_m, \widetilde{\theta}_{m}, d)\right)\]where \(\widetilde{\theta}\) are the random effects with \(\widetilde{\theta}_{nm} \sim p(\widetilde{\theta}\theta=\theta_n)\) and \(\theta_m,\widetilde{\theta}_m \sim p(\theta,\widetilde{\theta})\). The latter form is used when M_prime != None.
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 N (int) – Number of outer expectation samples.
 M (int) – Number of inner expectation samples for p(yd).
 M_prime (int) – Number of samples for p(y  theta, d) if required.
 independent_priors (bool) – Only used when M_prime is not None. Indicates whether the prior distributions for the target variables and the nuisance variables are independent. In this case, it is not necessary to sample the targets conditional on the nuisance variables.
Returns: EIG estimate, optionally includes full optimization history
Return type:

donsker_varadhan_eig
(model, design, observation_labels, target_labels, num_samples, num_steps, T, optim, return_history=False, final_design=None, final_num_samples=None)[source]¶ DonskerVaradhan estimate of the expected information gain (EIG).
The DonskerVaradhan representation of EIG is
\[\sup_T E_{p(y, \theta  d)}[T(y, \theta)]  \log E_{p(yd)p(\theta)}[\exp(T(\bar{y}, \bar{\theta}))]\]where \(T\) is any (measurable) function.
This methods optimises the loss function over a prespecified class of functions T.
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 num_samples (int) – Number of samples per iteration.
 num_steps (int) – Number of optimization steps.
 or torch.nn.Module T (function) – optimisable function T for use in the DonskerVaradhan loss function.
 optim (pyro.optim.Optim) – Optimiser to use.
 return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimization.
 final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
 final_num_samples (int) – The number of samples to use at the final evaluation, If None, uses `num_samples.
Returns: EIG estimate, optionally includes full optimization history
Return type:

posterior_eig
(model, design, observation_labels, target_labels, num_samples, num_steps, guide, optim, return_history=False, final_design=None, final_num_samples=None, eig=True, prior_entropy_kwargs={}, *args, **kwargs)[source]¶ Posterior estimate of expected information gain (EIG) computed from the average posterior entropy (APE) using \(EIG(d) = H[p(\theta)]  APE(d)\). See [1] for full details.
The posterior representation of APE is
\(\sup_{q}\ E_{p(y, \theta  d)}[\log q(\theta  y, d)]\)where \(q\) is any distribution on \(\theta\).
This method optimises the loss over a given guide family representing \(q\).
[1] Foster, Adam, et al. “Variational Bayesian Optimal Experimental Design.” arXiv preprint arXiv:1903.05480 (2019).
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 num_samples (int) – Number of samples per iteration.
 num_steps (int) – Number of optimization steps.
 guide (function) – guide family for use in the (implicit) posterior estimation. The parameters of guide are optimised to maximise the posterior objective.
 optim (pyro.optim.Optim) – Optimiser to use.
 return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimization.
 final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
 final_num_samples (int) – The number of samples to use at the final evaluation, If None, uses `num_samples.
 eig (bool) – Whether to compute the EIG or the average posterior entropy (APE). The EIG is given by EIG = prior entropy  APE. If True, the prior entropy will be estimated analytically, or by Monte Carlo as appropriate for the model. If False the APE is returned.
 prior_entropy_kwargs (dict) – parameters for estimating the prior entropy: num_prior_samples indicating the number of samples for a MC estimate of prior entropy, and mean_field indicating if an analytic form for a meanfield prior should be tried.
Returns: EIG estimate, optionally includes full optimization history
Return type:

marginal_eig
(model, design, observation_labels, target_labels, num_samples, num_steps, guide, optim, return_history=False, final_design=None, final_num_samples=None)[source]¶ Estimate EIG by estimating the marginal entropy \(p(yd)\). See [1] for full details.
The marginal representation of EIG is
\(\inf_{q}\ E_{p(y, \theta  d)}\left[\log \frac{p(y  \theta, d)}{q(y  d)} \right]\)where \(q\) is any distribution on \(y\). A variational family for \(q\) is specified in the guide.
Warning
This method does not estimate the correct quantity in the presence of random effects.
[1] Foster, Adam, et al. “Variational Bayesian Optimal Experimental Design.” arXiv preprint arXiv:1903.05480 (2019).
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 num_samples (int) – Number of samples per iteration.
 num_steps (int) – Number of optimization steps.
 guide (function) – guide family for use in the marginal estimation. The parameters of guide are optimised to maximise the loglikelihood objective.
 optim (pyro.optim.Optim) – Optimiser to use.
 return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimization.
 final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
 final_num_samples (int) – The number of samples to use at the final evaluation, If None, uses `num_samples.
Returns: EIG estimate, optionally includes full optimization history
Return type:

lfire_eig
(model, design, observation_labels, target_labels, num_y_samples, num_theta_samples, num_steps, classifier, optim, return_history=False, final_design=None, final_num_samples=None)[source]¶ Estimates the EIG using the method of LikelihoodFree Inference by Ratio Estimation (LFIRE) as in [1]. LFIRE is run separately for several samples of \(\theta\).
[1] Kleinegesse, Steven, and Michael Gutmann. “Efficient Bayesian Experimental Design for Implicit Models.” arXiv preprint arXiv:1810.09912 (2018).
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 num_y_samples (int) – Number of samples to take in \(y\) for each \(\theta\).
 num_steps (int) – Number of optimization steps.
 classifier (function) – a Pytorch or Pyro classifier used to distinguish between samples of \(y\) under \(p(yd)\) and samples under \(p(y\theta,d)\) for some \(\theta\).
 optim (pyro.optim.Optim) – Optimiser to use.
 return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimization.
 final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
 final_num_samples (int) – The number of samples to use at the final evaluation, If None, uses `num_samples.
Param: int num_theta_samples: Number of initial samples in \(\theta\) to take. The likelihood ratio is estimated by LFIRE for each sample.
Returns: EIG estimate, optionally includes full optimization history
Return type:

vnmc_eig
(model, design, observation_labels, target_labels, num_samples, num_steps, guide, optim, return_history=False, final_design=None, final_num_samples=None)[source]¶ Estimates the EIG using Variational Nested Monte Carlo (VNMC). The VNMC estimate [1] is
\[\frac{1}{N}\sum_{n=1}^N \left[ \log p(y_n  \theta_n, d)  \log \left(\frac{1}{M}\sum_{m=1}^M \frac{p(\theta_{mn})p(y_n  \theta_{mn}, d)} {q(\theta_{mn}  y_n)} \right) \right]\]where \(q(\theta  y)\) is the learned variational posterior approximation and \(\theta_n, y_n \sim p(\theta, y  d)\), \(\theta_{mn} \sim q(\thetay=y_n)\).
As \(N \to \infty\) this is an upper bound on EIG. We minimise this upper bound by stochastic gradient descent.
Warning
This method cannot be used in the presence of random effects.
[1] Foster, Adam, et al. “Variational Bayesian Optimal Experimental Design.” arXiv preprint arXiv:1903.05480 (2019).
Parameters:  model (function) – A pyro model accepting design as only argument.
 design (torch.Tensor) – Tensor representation of design
 observation_labels (list) – A subset of the sample sites present in model. These sites are regarded as future observations and other sites are regarded as latent variables over which a posterior is to be inferred.
 target_labels (list) – A subset of the sample sites over which the posterior entropy is to be measured.
 num_samples (tuple) – Number of (\(N, M\)) samples per iteration.
 num_steps (int) – Number of optimization steps.
 guide (function) – guide family for use in the posterior estimation. The parameters of guide are optimised to minimise the VNMC upper bound.
 optim (pyro.optim.Optim) – Optimiser to use.
 return_history (bool) – If True, also returns a tensor giving the loss function at each step of the optimization.
 final_design (torch.Tensor) – The final design tensor to evaluate at. If None, uses design.
 final_num_samples (tuple) – The number of (\(N, M\)) samples to use at the final evaluation, If None, uses `num_samples.
Returns: EIG estimate, optionally includes full optimization history
Return type:
Generalised Linear Mixed Models¶
Warning
This module will eventually be deprecated in favor of brmp
The pyro.contrib.oed.glmm
module provides models and guides for
generalised linear mixed models (GLMM). It also includes the
Normalinversegamma family.
To create a classical Bayesian linear model, use:
from pyro.contrib.oed.glmm import known_covariance_linear_model
# Note: coef is a pvector, observation_sd is a scalar
# Here, p=1 (one feature)
model = known_covariance_linear_model(coef_mean=torch.tensor([0.]),
coef_sd=torch.tensor([10.]),
observation_sd=torch.tensor(2.))
# An n x p design tensor
# Here, n=2 (two observations)
design = torch.tensor(torch.tensor([[1.], [1.]]))
model(design)
A nonlinear link function may be introduced, for instance:
from pyro.contrib.oed.glmm import logistic_regression_model
# No observation_sd is needed for logistic models
model = logistic_regression_model(coef_mean=torch.tensor([0.]),
coef_sd=torch.tensor([10.]))
Random effects may be incorporated as regular Bayesian regression coefficients.
For random effects with a shared covariance matrix, see pyro.contrib.oed.glmm.lmer_model()
.