
Objectives

The aim of this session is to learn how to fit a stochastic model using the particle MCMC Metropolis-Hastings algorithm (pMCMC). As you saw in the lecture, fitting a model with a likelihood approach requires two ingredients: a function to evaluate the likelihood of the data, and an algorithm to explore the parameter space.

In the previous session, you used the function mcmcMh to explore the parameter space with the Metropolis-Hastings algorithm, and the function dTrajObs to evaluate the likelihood for your deterministic model. To fit a stochastic model, we can still use the mcmcMh function to explore the parameter space efficiently, but we need a new function to evaluate the likelihood. This new function is called a particle filter.

In this session you will:

  1. code a particle filter to estimate the likelihood of a stochastic model
  2. learn how to calibrate the number of particles
  3. fit the stochastic SEIT4L model to the Tristan da Cunha outbreak with pMCMC

Code a particle filter

Before running the pMCMC, we need to code a particle filter to estimate the marginal log-likelihood of the data at a given proposed theta.

Below you can find the skeleton of such a function. We have inserted comments at each step of the algorithm. If you are struggling at any point, follow the link below the code for a more guided example.

# This is a function that takes five arguments:
# - fitmodel: a fitmodel object
# - theta: named numeric vector. Values of the parameters for which the marginal log-likelihood is desired.
# - initState: named numeric vector. Initial values of the state variables.
# - data: data frame. Observation times and observed data.
# - nParticles: number of particles used by the filter.
# The function returns the value of the marginal log-likelihood
my_particleFilter <- function(fitmodel, theta, initState, data, nParticles) {

    ## Initialisation of the algorithm
    # Initialise the state and the weight of your particles

    ## Start for() loop over observation times

        # Resample particles according to their weights
        # You can use the `sample()` function of R

        ## Start for() loop over particles

            # Propagate the particle from current observation time to the next one
            # using the function `fitmodel$simulate`

            # Weight the particle with the likelihood of the observed data point
            # using the function `fitmodel$dPointObs`

        ## End for() loop over particles

    ## End for() loop over observation times

    ## Compute and return the marginal log-likelihood
    # sum of the log of the mean of the weights at each observation time

}

If you have trouble filling any of the empty bits, have a look at our more guided example.
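For reference, here is one way the skeleton could be completed: a minimal bootstrap particle filter, assuming the fitR conventions for fitmodel$simulate, fitmodel$dPointObs and fitmodel$stateNames (a sketch; our guided example and solution may differ in detail):

my_particleFilter <- function(fitmodel, theta, initState, data, nParticles) {

    ## Initialisation of the algorithm
    # all particles start from initState, with equal weights
    stateParticles <- rep(list(initState), nParticles)
    weightParticles <- rep(1 / nParticles, nParticles)
    margLogLike <- 0
    currentTime <- 0

    ## Loop over observation times
    for (i in seq_len(nrow(data))) {

        dataPoint <- unlist(data[i, ])
        nextTime <- dataPoint[["time"]]

        # Resample particles according to their weights
        indexResampled <- sample(x = nParticles, size = nParticles,
                                 replace = TRUE, prob = weightParticles)
        stateParticles <- stateParticles[indexResampled]

        ## Loop over particles
        for (p in seq_len(nParticles)) {

            # Propagate the particle from the current observation time to the next one
            traj <- fitmodel$simulate(theta = theta,
                                      initState = stateParticles[[p]],
                                      times = c(currentTime, nextTime))

            # state of the particle at the next observation time
            modelPoint <- unlist(traj[nrow(traj), fitmodel$stateNames])

            # Weight the particle with the likelihood of the observed data point
            weightParticles[p] <- fitmodel$dPointObs(dataPoint = dataPoint,
                                                     modelPoint = modelPoint,
                                                     theta = theta)

            stateParticles[[p]] <- modelPoint
        }

        currentTime <- nextTime

        ## Update the marginal log-likelihood: log of the mean weight
        margLogLike <- margLogLike + log(mean(weightParticles))
    }

    ## Return the marginal log-likelihood
    return(margLogLike)
}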

Run a particle filter

Try to run your particle filter with the following inputs:

# load seit4lStoch
data(models)

# load data
data(fluTdc1971)

# theta close to the mean posterior estimate of the deterministic SEIT4L model
theta <- c(R_0 = 7, D_lat = 1, D_inf = 4, alpha = 0.5, D_imm = 10, rho = 0.65)

# init state as before
initState <- c(S = 279, E = 0, I = 2, T1 = 3, T2 = 0, T3 = 0, T4 = 0, L = 0, Inc = 0)

# run the particle filter with 20 particles
my_particleFilter(seit4lStoch, theta, initState, data = fluTdc1971, nParticles = 20)
## [1] -124.5306

Does your particle filter return the same value for the marginal log-likelihood? Can you explain why?

What can you notice when you:

  • run the particle filter several times with the same inputs?
  • increase the number of particles?

Calibrate the number of particles

Can you think of and implement an algorithm to calibrate the number of particles?
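If you need a starting point, one simple approach is to run the filter repeatedly for each candidate number of particles and monitor the spread of the log-likelihood estimates; a common rule of thumb for pMCMC is to pick the smallest number of particles for which the standard deviation of the log-likelihood is around 1. Below is a sketch reusing theta, initState and fluTdc1971 from above (calibrateParticles and its settings are illustrative, not prescribed):

# run the particle filter nReplicates times for each candidate number of
# particles, and report the mean and standard deviation of the estimates
calibrateParticles <- function(nParticlesVec, nReplicates = 10) {
    sapply(nParticlesVec, function(n) {
        logLikes <- replicate(nReplicates,
                              my_particleFilter(seit4lStoch, theta, initState,
                                                data = fluTdc1971,
                                                nParticles = n))
        c(nParticles = n, meanLogLike = mean(logLikes), sdLogLike = sd(logLikes))
    })
}

calibrateParticles(c(16, 32, 64, 128, 256))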

Compare your approach with ours and determine an optimal number of particles.

Run pMCMC

You can now write a new wrapper for the function dLogPosterior that takes margLogLike = my_particleFilter as an argument. This wrapper will then be passed to mcmcMh.

Note that the function dLogPosterior doesn't have an nParticles argument, so you might wonder how to pass one on to my_particleFilter. Have a look at the documentation of dLogPosterior: you should notice the ... argument, which allows you to pass any extra arguments to the function margLogLike.

You have probably noticed that running my_particleFilter is time-consuming. As we mentioned in the previous session, this can lead to wasted time and computational resources if you initialise the pMCMC with parameter values far from the target and with a Gaussian proposal that is very different from the target.

In order to initialise your pMCMC efficiently, you can make use of the results of the fit of the deterministic SEIT4L model from the previous session. Remember that you can load these results as follows:


data(mcmcSeit4lInfoPrior)
# this should load 2 objects in your environment: mcmcSeit4lInfoPriorTheta1 and
# mcmcSeit4lInfoPriorTheta2. Each one is a list of 3 elements returned by
# mcmcMh
names(mcmcSeit4lInfoPriorTheta1)
## [1] "trace"           "acceptanceRate"  "covmatEmpirical"

In the previous session, you initialised mcmcMh with a diagonal covariance matrix for the Gaussian proposal using the argument proposalSd. You can also pass a non-diagonal covariance matrix, accounting for correlations between parameters, by using the argument covmat (type ?mcmcMh for more details).
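For example, you could reuse the deterministic fit loaded above to build both the starting point and the proposal (a sketch; it assumes the trace columns are named after seit4lStoch$thetaNames, so that the logPrior, logLikelihood and logDensity columns are dropped):

# posterior mean of the deterministic fit as a starting point for the pMCMC
initTheta <- colMeans(mcmcSeit4lInfoPriorTheta1$trace[, seit4lStoch$thetaNames])

# empirical covariance matrix of the deterministic fit for the Gaussian proposal
covmat <- mcmcSeit4lInfoPriorTheta1$covmatEmpirical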

You can now set up the pMCMC by filling in the empty bits below:

# wrapper for posterior
my_posteriorSto <- function(theta) {

    my_fitmodel <- # INSERT HERE
    my_initState <- # INSERT HERE
    my_nParticles <- # INSERT HERE

    return(dLogPosterior(fitmodel = my_fitmodel,
                         theta = theta,
                         initState = my_initState,
                         data = fluTdc1971,
                         margLogLike = my_particleFilter,
                         nParticles = my_nParticles))
}

# theta to initialise the pMCMC
initTheta <- # INSERT HERE

# covariance matrix for the Gaussian proposal
covmat <- # INSERT HERE

# lower and upper limits of each parameter (must be named vectors)
lower <- # INSERT HERE
upper <- # INSERT HERE

# number of iterations for the pMCMC
nIterations <- # INSERT HERE

# additional parameters for the adaptive pMCMC, see ?mcmcMh for more details
adaptSizeStart <- # INSERT HERE
adaptSizeCooling <- # INSERT HERE
adaptShapeStart <- # INSERT HERE

If you have trouble filling some of the empty bits, have a look at our solution.
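As a starting point, here is one plausible way to fill in the blanks (a sketch only: the first three assignments go inside the wrapper, the bounds follow the parameter constraints, and the adaptation settings are illustrative assumptions rather than recommended values):

# inside the wrapper
my_fitmodel <- seit4lStoch
my_initState <- c(S = 279, E = 0, I = 2, T1 = 3, T2 = 0, T3 = 0, T4 = 0,
                  L = 0, Inc = 0)
my_nParticles <- 128

# initTheta and covmat from the deterministic fit, as shown above
initTheta <- colMeans(mcmcSeit4lInfoPriorTheta1$trace[, seit4lStoch$thetaNames])
covmat <- mcmcSeit4lInfoPriorTheta1$covmatEmpirical

# rates and durations are positive; alpha and rho are proportions
lower <- c(R_0 = 0, D_lat = 0, D_inf = 0, alpha = 0, D_imm = 0, rho = 0)
upper <- c(R_0 = Inf, D_lat = Inf, D_inf = Inf, alpha = 1, D_imm = Inf, rho = 1)

nIterations <- 3000

# illustrative adaptation settings (see ?mcmcMh)
adaptSizeStart <- 100
adaptSizeCooling <- 0.99
adaptShapeStart <- 200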

Then you should be able to run mcmcMh:

# run the pMCMC
my_pMCMC <- mcmcMh(target = my_posteriorSto,
                   initTheta = initTheta,
                   covmat = covmat,
                   limits = list(lower = lower, upper = upper),
                   nIterations = nIterations,
                   adaptSizeStart = adaptSizeStart,
                   adaptSizeCooling = adaptSizeCooling,
                   adaptShapeStart = adaptShapeStart)

Analyse a pMCMC with 16 particles

If you have run a pMCMC, you will have noticed that it takes quite a lot of time, as we need to run a particle filter at each iteration. Since the computation time scales linearly with the number of particles, you might be tempted to reduce this number. Let's have a look at what we would get by running a pMCMC with 16 particles.

To save time, we have run 8 chains in parallel for you. Each chain was started from a different initTheta and run for 3000 iterations. The values of initTheta were chosen close to the mean posterior estimates of the deterministic fit, and the empirical covariance matrix of that fit was used for the Gaussian proposal kernel. Each chain took 6 hours to complete on a scientific computing cluster and can be loaded as follows:

data(pmcmcSeit4lInfoPrior)
# this should load 2 objects: pmcmcSeit4lInfoPrior16 and pmcmcSeit4lInfoPrior128,
# each a list containing 8 chains
length(pmcmcSeit4lInfoPrior16)
## [1] 8
# each chain is a list of 3 elements returned by mcmcMh
names(pmcmcSeit4lInfoPrior16[[1]])
## [1] "trace"           "acceptanceRate"  "covmatEmpirical"
# the trace contains 9 variables for 3000 iterations
dim(pmcmcSeit4lInfoPrior16[[1]]$trace)
## [1] 3000    9

We can combine the traces of the 8 chains into an mcmc.list object as follows:

library("coda")
trace <- mcmc.list(lapply(pmcmcSeit4lInfoPrior16, function(chain) {
    mcmc(chain$trace)
}))
head(trace, 1)
## [[1]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
##      logLikelihood logDensity
## [1,]     -119.7101  -131.0695
## [2,]     -119.7101  -131.0695
## 
## [[2]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm      rho logPrior
## [1,] 9.156946 1.369854 4.422742 0.4634092 10.10724 0.624404 -12.7751
## [2,] 9.156946 1.369854 4.422742 0.4634092 10.10724 0.624404 -12.7751
##      logLikelihood logDensity
## [1,]     -120.8215  -133.5966
## [2,]     -120.8215  -133.5966
## 
## [[3]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
##      logLikelihood logDensity
## [1,]     -122.2201  -133.5796
## [2,]     -122.2201  -133.5796
## 
## [[4]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0   D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 5.205378 1.14665 2.257054 0.4387091 11.13603 0.5932959 -10.03886
## [2,] 5.205378 1.14665 2.257054 0.4387091 11.13603 0.5932959 -10.03886
##      logLikelihood logDensity
## [1,]     -124.2126  -134.2514
## [2,]     -124.2126  -134.2514
## 
## [[5]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 8.059997 1.707677 2.917582 0.4784999 9.896543 0.6653181 -10.10542
## [2,] 8.059997 1.707677 2.917582 0.4784999 9.896543 0.6653181 -10.10542
##      logLikelihood logDensity
## [1,]     -118.9109  -129.0163
## [2,]     -118.9109  -129.0163
## 
## [[6]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
##      logLikelihood logDensity
## [1,]     -118.4847  -129.8442
## [2,]     -118.4847  -129.8442
## 
## [[7]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
##      logLikelihood logDensity
## [1,]     -122.5704  -133.9299
## [2,]     -122.5704  -133.9299
## 
## [[8]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1 
## End = 2 
## Thinning interval = 1 
##           R_0    D_lat    D_inf     alpha    D_imm       rho  logPrior
## [1,] 7.686268 1.924227 2.114245 0.5073793 8.322857 0.6307255 -9.651117
## [2,] 7.686268 1.924227 2.114245 0.5073793 8.322857 0.6307255 -9.651117
##      logLikelihood logDensity
## [1,]     -122.8299   -132.481
## [2,]     -122.8299   -132.481
## 
## attr(,"class")
## [1] "mcmc.list"

You can check that the chains were started from different values of initTheta and that we used informative priors.

Take 15 min to analyse these chains and conclude on the choice of using 16 particles to save computation time.
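For instance, the following standard coda diagnostics may be useful (a sketch; burnAndThin is the fitR helper for discarding burn-in, and the burn-in of 1000 iterations is an assumption to adapt to what the trace plots suggest):

# acceptance rate of each chain
sapply(pmcmcSeit4lInfoPrior16, function(chain) chain$acceptanceRate)

# trace plots, between-chain convergence and effective sample size
plot(trace)
gelman.diag(trace)
effectiveSize(trace)

# posterior summaries after discarding a burn-in
summary(burnAndThin(trace, burn = 1000))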

You can compare your conclusions with ours.

Analyse a pMCMC with 128 particles

Let's increase the number of particles to 128, as suggested by the calibration analysis. Again, to save time, we have run 8 chains of 3000 iterations in parallel. Each chain took 40 hours to complete on a scientific computing cluster; the results can be loaded and stored in an mcmc.list object as follows:

# load
data(pmcmcSeit4lInfoPrior)
# create
trace <- mcmc.list(lapply(pmcmcSeit4lInfoPrior128, function(chain) {
    mcmc(chain$trace)
}))

Redo the same analysis as for 16 particles. What differences do you notice? In particular, try to compare the posterior distributions with 16 and 128 particles using the function plotPosteriorDensity, as sketched below.
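A minimal sketch of that comparison (it assumes trace16 and trace128 hold the combined traces for 16 and 128 particles, and a burn-in of 1000 iterations; both are assumptions to adapt):

trace16 <- mcmc.list(lapply(pmcmcSeit4lInfoPrior16, function(chain) mcmc(chain$trace)))
trace128 <- mcmc.list(lapply(pmcmcSeit4lInfoPrior128, function(chain) mcmc(chain$trace)))

plotPosteriorDensity(list(n16 = burnAndThin(trace16, burn = 1000),
                          n128 = burnAndThin(trace128, burn = 1000)))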

You can also have a look at our solution if that helps.

Stochastic vs deterministic fit

Compare the posteriors of the deterministic and stochastic SEIT4L models. Which model provides the better fit? Can you explain why?

Check your answer here.

Going further

Filtered trajectories

  • In addition to the log-likelihood, a particle filter can also return the filtered trajectories (i.e. all the trajectories that “survived” until the last observation time). You can update your filter so it keeps track of and returns the filtered trajectories. Alternatively, there is a function in the package called particleFilter that will do it for you (see ?particleFilter for documentation). Have a look at its code.
  • If you run a particle filter with our function particleFilter, you can then plot the filtered trajectories using the function plotSmc (see the sketch after this list).
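A minimal sketch, reusing the inputs defined earlier in this session (the argument order of particleFilter is assumed to match my_particleFilter; check ?particleFilter):

# run the package particle filter, which records the filtered trajectories
smc <- particleFilter(seit4lStoch, theta, initState, data = fluTdc1971,
                      nParticles = 128)

# plot the filtered trajectories
plotSmc(smc)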

Optimization

You might have noted that the for() loop over particles could be parallelized, as particles can be propagated independently. You could take advantage of this to code a parallel loop and make your algorithm even faster, as sketched below. If you have never coded a parallel program in R, you can also have a look at the code of particleFilter. In fact, all the test runs were performed on a scientific computing cluster with 12-core machines, so without parallelization of the particle filter you should expect the computation times quoted above to be roughly 12 times longer.
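As an illustration, the body of the particle loop in my_particleFilter could be replaced by something like the following (a sketch using the base parallel package; mclapply relies on forking, so it will not parallelize on Windows, and the variables are those of the filter above):

library("parallel")

# propagate and weight all particles in parallel; each element of the result
# holds the new state and the weight of one particle
propagated <- mclapply(stateParticles, function(state) {
    traj <- fitmodel$simulate(theta = theta, initState = state,
                              times = c(currentTime, nextTime))
    modelPoint <- unlist(traj[nrow(traj), fitmodel$stateNames])
    weight <- fitmodel$dPointObs(dataPoint = dataPoint,
                                 modelPoint = modelPoint, theta = theta)
    list(state = modelPoint, weight = weight)
}, mc.cores = detectCores())

stateParticles <- lapply(propagated, "[[", "state")
weightParticles <- sapply(propagated, "[[", "weight")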

SMC^2

For pMCMC, we have combined a particle filter (to estimate the likelihood and sample from the posterior distribution of trajectories) with Metropolis-Hastings MCMC (to sample from the posterior distribution of parameters). Another option for sampling from the joint posterior distribution of parameters and trajectories is to run what is called SMC^2 (or SMC squared), where sequential Monte Carlo is used both for the parameters and the state trajectories. See this presentation by Pierre Jacob for more details and some visualisations.


This web site and the material contained in it were originally created in support of an annual short course on Model Fitting and Inference for Infectious Disease Dynamics at the London School of Hygiene & Tropical Medicine. All material is under an MIT license. Please report any issues or suggestions for improvement on the corresponding GitHub issue tracker. We are always keen to hear about any uses of the material here, so please do get in touch using the Discussion board if you have any questions or ideas, or if you find the material here useful or use it in your own teaching.