Introduction to mcunit

Increasingly sophisticated MCMC and Monte Carlo algorithms raise the scope for errors, either in derivations of mathematical quantities needed for sampling, or implementation errors. Testing for such errors should be an integral and routine part of the workflow of any Bayesian analysis.

mcunit makes it easy to do this, by allowing for unit testing MCMC and other Monte Carlo implementations within the framework of the testthat package (Wickham 2011). The methods correspond to those proposed in Gandy and Scott (2020). They are based on statistical hypothesis testing, and are designed to achieve an arbitrarily low false rejection probability, so that the user can be confident that a correctly implemented algorithm will almost certainly not be rejected. By default, this false rejection rate is set to \(10^{-5}\).

MCMC samplers can be testes using expect_mcmc or expect_mcmc_reversible. These functions use different statistical hypothesis tests described in Section 2.1 and 2.2 of (Gandy and Scott 2020) respectively. expect_mcmc_reversible requires that the sampler to be tested is (or at least, is expected to be) reversible.

This vignette demonstrates how to use these tests, using the following simple example. Consider the model \[y \sim \theta_1 + \theta_2 + \epsilon,\] where \(\theta := (\theta_1,\theta_2)\) is apriori independent, zero mean normal with standard deviation \(\sigma=10\). The white noise term \(\epsilon\) is independent from \(\theta\) and also zero mean normal but with variance \(\sigma_{\epsilon}^2\) = 0.1. While inference is easy here, we consider drawing samples from the posterior \(\pi(\theta \mid y)\) using a Gibbs sampler. The posterior conditional distributions for \(\theta_1\) and \(\theta_2\) are normal with expectations \[\mathbb{E}(\theta_i | y, \theta_j) = \frac{\sigma^2}{\sigma^2_{\epsilon} + \sigma^2}(y - \theta_j),\] and variances \[\mathbb{V}(\theta_i | y, \theta_j) = \frac{1}{\frac{1}{\sigma^2_{\epsilon}} + \frac{1}{\sigma^2}}.\]

Correct Sampler

Start with a correctly implemented random scan Gibbs sampler. First load the package and set the seed.

require(mcunit)
## Loading required package: mcunit
set.seed(10)

The following function updates one element of \(\theta\) given the other and given the observed data \(y\).

gibbsUpdate <- function(y, theta_j) {
  # Samples theta_i given y and theta_j
  mean <- 100 * (y - theta_j) / (100 + 0.1)
  var <- 1. / (1. / 100 + 1. / 0.1)
  rnorm(1, mean=mean, sd=sqrt(var))
}

The argument y is the observed data, and theta_j is the component to condition on. From this, we define a correctly implemented random scan sampler.

randomScan <- function(theta, y, thinning) {
  # Random Scan Gibbs
  for(i in 1:thinning) {
    # select index to update
    i <- sample.int(2,1)
    theta[i] <- gibbsUpdate(y, theta[i %% 2 + 1])
  }
  theta
}

This function takes the current state of the chain theta, the data y and a thinning parameter which determines the number of individual updates to make before returning the new state.

Our task is to test the correctness of randomScan. We will do this using both expect_mcmc and expect_mcmc_reversible. This latter method can be used because the sampler should, if correct, be reversible.

Both methods require a list object which describes the MCMC sampler to be tested. The list must contain the following elements:

Please see the documentation for expect_mcmc and expect_mcmc_reversible for further details on this.

We begin constructing this list by defining the prior sampler and the data sampler.

obj <- list()
obj$genprior <- function() rnorm(n=2, mean=0, sd=10)
obj$gendata <- function(theta) sum(theta) + rnorm(n=1, mean=0, sd=sqrt(0.1))

As test functions, we use the components \(\theta_1\), \(theta_2\), the prior density \(\pi(\theta)\) and likelihood function \(p(y \mid \theta)\). The density and likelihood appear to be a good default choice because they are one-dimensional regardless of the dimension of the parameter vector and data.

priorDensity <- function(theta) prod(dnorm(theta, mean=0, sd=10))
likelihood <- function(theta, y) dnorm(y, mean=sum(theta), sd=sqrt(0.1))
testVec <- function(theta, y) c(theta[1], theta[2], priorDensity(theta), likelihood(theta, y)) 
obj$test <- testVec

Finally, we are left to define a single MCMC step.

obj$stepMCMC <- function(theta, dat, thinning) randomScan(theta, dat, thinning)
print(obj)
## $genprior
## function() rnorm(n=2, mean=0, sd=10)
## 
## $gendata
## function(theta) sum(theta) + rnorm(n=1, mean=0, sd=sqrt(0.1))
## 
## $test
## function(theta, y) c(theta[1], theta[2], priorDensity(theta), likelihood(theta, y))
## 
## $stepMCMC
## function(theta, dat, thinning) randomScan(theta, dat, thinning)

First consider expect_mcmc. We use \(100\) thinning steps between samples to give the test some power to detect errors.

expect_mcmc(obj, thinning = 100)

No errors were detected, because there are none. Recall that the false rejection rate is set to \(10^{-5}\) by default, and so repeated application of this test should very rarely flag randomScan.

This sampler is also reversible, and so we try expect_mcmc_reversible.

expect_mcmc_reversible(obj, thinning = 10, nsteps = 10)

Again, no error is detected (as expected).

A Correct Non-Reversible Sampler

Consider systematic scan of the vector \(\theta\) rather than random scan.

systematicScan <- function(theta, y, thinning) {
  # Systematic Scan Gibbs
  for(i in 1:thinning) {
    theta[1] <- gibbsUpdate(y, theta[2])
    theta[2] <- gibbsUpdate(y, theta[1])
  }
  theta
}

This sampler is correct and expect_mcmc is unlikely to falsely detect errors.

obj$stepMCMC <- function(theta, dat, thinning) systematicScan(theta, dat, thinning)
expect_mcmc(obj, thinning = 100)

Good. What happens using expect_mcmc_reversible with thinning of \(1\)?

expect_mcmc_reversible(obj, thinning = 1, nsteps = 10)

The test failed. This illustrates why expect_mcmc_reversible should not be used for a non-reversible sampler - we have no guarantee over the false rejection rate. Even though this sampler is correctly implemented, we are erroneously detecting an error.

An Incorrect Sampler

Now we make a genuine mistake in the sampler, and investigate whether the methods detect it. We replace the variance terms \(\sigma^2\) and \(\sigma^2_{\epsilon}\) in \(\mathbb{V}(\theta_i | y, \theta_j)\) with their corresponding standard deviations (i.e. \(\sigma\) and \(\sigma_{\epsilon}\)).

gibbsUpdate <- function(y, theta_j) {
  # Samples theta_i given y and theta_j
  mean <- 100 * (y - theta_j) / (100 + 0.1)
  var <- 1. / (1. / 10 + 1. / sqrt(0.1))
  rnorm(1, mean=mean, sd=sqrt(var))
}

Also make sure to switch back to random scan…

obj$stepMCMC <- function(theta, dat, thinning) randomScan(theta, dat, thinning)

Rerunning the tests…

expect_mcmc(obj, thinning = 100)
## Error: Test failed for components  with p-values=NULL in iteration 1
expect_mcmc_reversible(obj, thinning = 10, nsteps = 10)
## Error: Test failed for components  with p-values=NULL in iteration 1

Great! Both tests detect a problem. Even better, you can be pretty much sure there is a problem because the chance of a false rejection is upper bounded by \(10^{-5}\)!

References

Gandy, A., and Scott, J. (2020), “Unit testing for MCMC and other Monte Carlo methods.”

Wickham, H. (2011), “Testthat: Get started with testing,” The R Journal, 3, 5–10.