The aim of this session is to learn how you can fit a stochastic model using the particle MCMC Metropolis-Hastings algorithm (pMCMC). As you saw in the lecture, fitting a model with a likelihood approach requires two ingredients: a way to evaluate the likelihood of the data given the parameters, and an algorithm to explore the parameter space.
In the previous session, you used the function mcmcMh to explore the parameter space with the Metropolis-Hastings algorithm, and the function dTrajObs to evaluate the likelihood of your deterministic model. In order to fit a stochastic model, we can still use the mcmcMh function to explore the parameter space efficiently, but we need a new function to evaluate the likelihood. This new function is called a particle filter.
In this session, you will code your own particle filter, use it to evaluate the likelihood of a stochastic model, combine it with mcmcMh to run a pMCMC, and analyse the results.
Before running the pMCMC, we need to code a particle filter in order to evaluate the marginal log-likelihood of the data at a given proposed theta.
Below you can find the skeleton of such a function. We have inserted comments at each step of the algorithm. If you are struggling at any point, follow the link below the code for a more guided example.
# This is a function that takes five arguments:
# - fitmodel: a fitmodel object
# - theta: named numeric vector. Values of the parameters for which the marginal log-likelihood is desired.
# - initState: named numeric vector. Initial values of the state variables.
# - data: data frame. Observation times and observed data.
# - nParticles: number of particles used by the filter.
# The function returns the value of the marginal log-likelihood
my_particleFilter <- function(fitmodel, theta, initState, data, nParticles) {
    ## Initialisation of the algorithm
    # Initialise the state and the weight of your particles

    ## Start for() loop over observation times

        # Resample particles according to their weights
        # You can use the `sample()` function of R

        ## Start for() loop over particles

            # Propagate the particle from the current observation time to the next one
            # using the function `fitmodel$simulate`

            # Weight the particle with the likelihood of the observed data point
            # using the function `fitmodel$dPointObs`

        ## End for() loop over particles

    ## End for() loop over observation times

    ## Compute and return the marginal log-likelihood
    # sum of the log of the mean of the weights at each observation time
}
If you have trouble filling any of the empty bits, have a look at our more guided example.
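If you would rather check your attempt against a worked version, here is one possible implementation, written as a minimal sketch rather than the reference solution. It follows the skeleton above; the use of fitmodel$stateNames to extract the state at the next observation time, and the argument names passed to fitmodel$dPointObs, are assumptions based on the fitR conventions.
# A possible implementation of the skeleton above (sketch, not the reference
# solution). Assumes fitmodel$simulate() returns a data frame with one row per
# time point, and fitmodel$stateNames contains the names of the state variables.
my_particleFilter <- function(fitmodel, theta, initState, data, nParticles) {
    ## Initialisation of the algorithm
    margLogLike <- 0
    # all particles start from initState, with equal weights
    stateParticles <- rep(list(initState), nParticles)
    weightParticles <- rep(1 / nParticles, length = nParticles)
    currentTime <- 0
    ## Loop over observation times
    for (i in seq_len(nrow(data))) {
        dataPoint <- unlist(data[i, ])
        nextTime <- dataPoint["time"]
        # Resample particles according to their weights
        # (sample() normalises the weights internally)
        indexResampled <- sample(x = nParticles, size = nParticles,
                                 replace = TRUE, prob = weightParticles)
        stateParticles <- stateParticles[indexResampled]
        ## Loop over particles
        for (p in seq_len(nParticles)) {
            # Propagate the particle to the next observation time
            traj <- fitmodel$simulate(theta = theta,
                                      initState = stateParticles[[p]],
                                      times = c(currentTime, nextTime))
            # Extract the state of the model at the next observation time
            modelPoint <- unlist(traj[2, fitmodel$stateNames])
            # Weight the particle with the likelihood of the observed data point
            weightParticles[p] <- fitmodel$dPointObs(dataPoint = dataPoint,
                                                     modelPoint = modelPoint,
                                                     theta = theta)
            stateParticles[[p]] <- modelPoint
        }
        currentTime <- nextTime
        # Update the marginal log-likelihood with the log of the mean weight
        margLogLike <- margLogLike + log(mean(weightParticles))
    }
    return(margLogLike)
}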
Try to run your particle filter with the following inputs:
# load seit4lStoch
data(models)
# load data
data(fluTdc1971)
# theta close to the mean posterior estimate of the deterministic SEIT4L model
theta <- c(R_0 = 7, D_lat = 1, D_inf = 4, alpha = 0.5, D_imm = 10, rho = 0.65)
# init state as before
initState <- c(S = 279, E = 0, I = 2, T1 = 3, T2 = 0, T3 = 0, T4 = 0, L = 0, Inc = 0)
# run the particle filter with 20 particles
my_particleFilter(seit4lStoch, theta, initState, data = fluTdc1971, nParticles = 20)
## [1] -124.5306
Does your particle filter return the same value for the marginal log-likelihood? Can you explain why?
What can you notice when you run the particle filter several times with the same inputs, when you change the number of particles, and when you change the value of theta?
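For instance, you can inspect the run-to-run variability of the estimator directly:
# repeated runs with identical inputs illustrate the Monte Carlo noise of the
# particle filter estimate
replicate(5, my_particleFilter(seit4lStoch, theta, initState,
                               data = fluTdc1971, nParticles = 20))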
Can you think of and implement an algorithm to calibrate the number of particles?
Compare your approach with ours and determine an optimal number of particles.
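If you want a starting point, below is a minimal sketch of one possible strategy (the function calibrateParticles, the candidate values, and the number of replicates are illustrative, not the course solution): for each candidate number of particles, estimate the Monte Carlo standard deviation of the marginal log-likelihood over repeated runs, and increase the number of particles until this standard deviation becomes acceptably small.
# sketch of a calibration routine: returns the Monte Carlo standard deviation
# of the marginal log-likelihood for each candidate number of particles
calibrateParticles <- function(fitmodel, theta, initState, data,
                               candidates = 2^(4:8), nReplicates = 10) {
    sdLogLike <- sapply(candidates, function(nParticles) {
        sd(replicate(nReplicates, my_particleFilter(
            fitmodel, theta, initState, data, nParticles
        )))
    })
    setNames(sdLogLike, candidates)
}
calibrateParticles(seit4lStoch, theta, initState, data = fluTdc1971)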
You can now write a new wrapper for the function dLogPosterior that passes margLogLike = my_particleFilter as an argument. This wrapper will then be passed to mcmcMh. Note that the function dLogPosterior doesn't have a nParticles argument, so you might wonder how to pass one to my_particleFilter. Have a look at the documentation of dLogPosterior: you should notice the ... argument, which allows you to pass any extra arguments on to the function margLogLike.
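For example, reusing the inputs defined above, an extra named argument is simply forwarded:
# nParticles is not a formal argument of dLogPosterior: it is captured by the
# `...` argument and passed on to my_particleFilter
dLogPosterior(fitmodel = seit4lStoch, theta = theta, initState = initState,
              data = fluTdc1971, margLogLike = my_particleFilter,
              nParticles = 20)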
You have probably noticed that running my_particleFilter is time consuming. As we mentioned in the previous session, this can lead to a waste of time and computational resources if you initialise the pMCMC with parameter values far from the target and with a Gaussian proposal that is very different from the target.
In order to initialise your pMCMC efficiently, you can make use of the results of the fit of the deterministic SEIT4L model from the previous session. Remember that you can load these results as follows:
data(mcmcSeit4lInfoPrior)
# this should load 2 objects in your environment: mcmcSeit4lInfoPriorTheta1 and
# mcmcSeit4lInfoPriorTheta2. Each one is a list of 3 elements returned by
# mcmcMh
names(mcmcSeit4lInfoPriorTheta1)
## [1] "trace" "acceptanceRate" "covmatEmpirical"
In the previous session, you initialised mcmcMh with a diagonal covariance matrix for the Gaussian proposal, using the argument proposalSd. In fact, you can also pass a non-diagonal covariance matrix, accounting for correlations between parameters, by using the argument covmat (type ?mcmcMh for more details).
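For instance, the empirical covariance matrix of the deterministic fit loaded above can be passed directly:
# reuse the empirical covariance matrix of the deterministic fit for the
# Gaussian proposal
covmat <- mcmcSeit4lInfoPriorTheta1$covmatEmpirical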
You can now set up the pMCMC by filling in the empty bits below:
# wrapper for posterior
my_posteriorSto <- function(theta) {
    my_fitmodel <- # INSERT HERE
    my_initState <- # INSERT HERE
    my_nParticles <- # INSERT HERE
    return(dLogPosterior(fitmodel = my_fitmodel,
                         theta = theta,
                         initState = my_initState,
                         data = fluTdc1971,
                         margLogLike = my_particleFilter,
                         nParticles = my_nParticles))
}
# theta to initialise the pMCMC
initTheta <- # INSERT HERE
# covariance matrix for the Gaussian proposal
covmat <- # INSERT HERE
# lower and upper limits of each parameter (must be named vectors)
lower <- # INSERT HERE
upper <- # INSERT HERE
# number of iterations for the pMCMC
nIterations <- # INSERT HERE
# additional parameters for the adaptive pMCMC, see ?mcmcMh for more details
adaptSizeStart <- # INSERT HERE
adaptSizeCooling <- # INSERT HERE
adaptShapeStart <- # INSERT HERE
If you have trouble filling some of the empty bits, have a look at our solution.
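If it helps to see a complete configuration, here is one possible way to fill in the blanks, as a sketch with illustrative choices rather than the course solution. It reuses theta and initState from above; the bounds and the adaptive settings are assumptions.
# wrapper filled in with illustrative choices (not necessarily the course solution)
my_posteriorSto <- function(theta) {
    my_fitmodel <- seit4lStoch
    my_initState <- c(S = 279, E = 0, I = 2, T1 = 3, T2 = 0, T3 = 0,
                      T4 = 0, L = 0, Inc = 0)
    my_nParticles <- 128 # as suggested by the calibration analysis
    return(dLogPosterior(fitmodel = my_fitmodel,
                         theta = theta,
                         initState = my_initState,
                         data = fluTdc1971,
                         margLogLike = my_particleFilter,
                         nParticles = my_nParticles))
}
# start close to the mean posterior estimate of the deterministic fit and
# reuse its empirical covariance matrix for the Gaussian proposal
initTheta <- c(R_0 = 7, D_lat = 1, D_inf = 4, alpha = 0.5, D_imm = 10, rho = 0.65)
covmat <- mcmcSeit4lInfoPriorTheta1$covmatEmpirical
# alpha and rho are proportions, hence bounded by 1 (illustrative bounds)
lower <- c(R_0 = 0, D_lat = 0, D_inf = 0, alpha = 0, D_imm = 0, rho = 0)
upper <- c(R_0 = Inf, D_lat = Inf, D_inf = Inf, alpha = 1, D_imm = Inf, rho = 1)
nIterations <- 3000
# illustrative adaptive settings: start adapting the size of the proposal
# after 100 iterations and its shape after 200
adaptSizeStart <- 100
adaptSizeCooling <- 0.99
adaptShapeStart <- 200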
Then you should be able to run mcmcMh:
# run the pMCMC
my_pMCMC <- mcmcMh(target = my_posteriorSto,
                   initTheta = initTheta,
                   covmat = covmat,
                   limits = list(lower = lower, upper = upper),
                   nIterations = nIterations,
                   adaptSizeStart = adaptSizeStart,
                   adaptSizeCooling = adaptSizeCooling,
                   adaptShapeStart = adaptShapeStart)
If you have run a pMCMC, you will have noticed that it takes quite a long time, as a particle filter must be run at each iteration. Since the computation time scales linearly with the number of particles, you might be tempted to reduce this number. Let's have a look at what we would get by running a pMCMC with 16 particles.
To save time, we have run 8 chains in parallel for you. Each chain was started from a different initTheta and run for 3000 iterations. The values of initTheta were chosen close to the mean posterior estimates of the deterministic fit, and their empirical covariance matrix was also used for the Gaussian proposal kernel. Each chain took 6 hours to complete on a scientific computing cluster and can be loaded as follows:
data(pmcmcSeit4lInfoPrior)
# this should load the list pmcmcSeit4lInfoPrior16, which contains the 8
# chains run with 16 particles
length(pmcmcSeit4lInfoPrior16)
## [1] 8
# each chain is a list of 3 elements returned by mcmcMh
names(pmcmcSeit4lInfoPrior16[[1]])
## [1] "trace" "acceptanceRate" "covmatEmpirical"
# the trace contains 9 variables for 3000 iterations
dim(pmcmcSeit4lInfoPrior16[[1]]$trace)
## [1] 3000 9
We can combine the traces of the 8 chains into an mcmc.list object as follows:
library("coda")
trace <- mcmc.list(lapply(pmcmcSeit4lInfoPrior16, function(chain) {
    mcmc(chain$trace)
}))
head(trace, 1)
## [[1]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## logLikelihood logDensity
## [1,] -119.7101 -131.0695
## [2,] -119.7101 -131.0695
##
## [[2]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 9.156946 1.369854 4.422742 0.4634092 10.10724 0.624404 -12.7751
## [2,] 9.156946 1.369854 4.422742 0.4634092 10.10724 0.624404 -12.7751
## logLikelihood logDensity
## [1,] -120.8215 -133.5966
## [2,] -120.8215 -133.5966
##
## [[3]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## logLikelihood logDensity
## [1,] -122.2201 -133.5796
## [2,] -122.2201 -133.5796
##
## [[4]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 5.205378 1.14665 2.257054 0.4387091 11.13603 0.5932959 -10.03886
## [2,] 5.205378 1.14665 2.257054 0.4387091 11.13603 0.5932959 -10.03886
## logLikelihood logDensity
## [1,] -124.2126 -134.2514
## [2,] -124.2126 -134.2514
##
## [[5]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 8.059997 1.707677 2.917582 0.4784999 9.896543 0.6653181 -10.10542
## [2,] 8.059997 1.707677 2.917582 0.4784999 9.896543 0.6653181 -10.10542
## logLikelihood logDensity
## [1,] -118.9109 -129.0163
## [2,] -118.9109 -129.0163
##
## [[6]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## logLikelihood logDensity
## [1,] -118.4847 -129.8442
## [2,] -118.4847 -129.8442
##
## [[7]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## [2,] 7.670397 1.287948 3.711294 0.4749959 9.282156 0.6462679 -11.35949
## logLikelihood logDensity
## [1,] -122.5704 -133.9299
## [2,] -122.5704 -133.9299
##
## [[8]]
## Markov Chain Monte Carlo (MCMC) output:
## Start = 1
## End = 2
## Thinning interval = 1
## R_0 D_lat D_inf alpha D_imm rho logPrior
## [1,] 7.686268 1.924227 2.114245 0.5073793 8.322857 0.6307255 -9.651117
## [2,] 7.686268 1.924227 2.114245 0.5073793 8.322857 0.6307255 -9.651117
## logLikelihood logDensity
## [1,] -122.8299 -132.481
## [2,] -122.8299 -132.481
##
## attr(,"class")
## [1] "mcmc.list"
You can check that the chains were started from different values of initTheta, and that we used informative priors.
Take 15 minutes to analyse these chains and draw a conclusion about the choice of using 16 particles to save computation time.
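If you are unsure where to start, a few standard coda diagnostics go a long way (a sketch; the burn-in of 1000 iterations is an illustrative choice):
# trace and density plots: visual check of mixing and agreement between chains
plot(trace)
# Gelman-Rubin convergence diagnostic across the 8 chains
gelman.diag(trace)
# effective sample size per parameter, after discarding a burn-in
effectiveSize(window(trace, start = 1000))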
You can compare your conclusions with ours.
Let's increase the number of particles to 128, as suggested by the calibration analysis. Again, to save time, we have run 8 chains of 3000 iterations in parallel. Each chain took 40 hours to complete on a scientific computing cluster, and can be loaded and stored in an mcmc.list object as follows:
# load the chains run with 128 particles
data(pmcmcSeit4lInfoPrior)
# combine their traces into an mcmc.list object
trace <- mcmc.list(lapply(pmcmcSeit4lInfoPrior128, function(chain) {
    mcmc(chain$trace)
}))
Redo the same analysis as for 16 particles. What differences do you notice? In particular, try to compare the posterior distributions obtained with 16 and 128 particles using the function plotPosteriorDensity.
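For instance, you could keep the two sets of traces under separate names and overlay their posterior densities (a sketch; the burn-in of 1000 iterations is an illustrative choice):
trace16 <- mcmc.list(lapply(pmcmcSeit4lInfoPrior16, function(chain) {
    mcmc(chain$trace)
}))
trace128 <- mcmc.list(lapply(pmcmcSeit4lInfoPrior128, function(chain) {
    mcmc(chain$trace)
}))
# discard the burn-in and overlay the posterior densities of the two runs
plotPosteriorDensity(list(
    n16 = burnAndThin(trace16, burn = 1000),
    n128 = burnAndThin(trace128, burn = 1000)
))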
You can also have a look at our solution if that helps.
Compare the posteriors of the deterministic and stochastic SEIT4L models. Which model provides the better fit? Can you explain why?
Check your answer here.
Instead of using your own particle filter, you can use the function particleFilter included in fitR, which will do it for you (see ?particleFilter for documentation). Have a look at its code. If you use particleFilter, you can then plot the filtered trajectories using the function plotSmc.
You might have noticed that the for() loop over particles could be parallelized, since the particles are propagated independently. You could take advantage of this to code a parallel loop and make your algorithm even faster. If you have never coded a parallel program in R, you can also have a look at the code of particleFilter. In fact, all the test runs were performed on a scientific computing cluster with 12-core machines, so without parallelization of the particle filter you should expect the computation times above to be roughly 12 times longer.
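As a starting point, here is a minimal sketch of how the inner loop of the sketch filter above could be parallelized with the built-in parallel package (the variables are those of the sketch; note that mclapply relies on forking and falls back to serial execution on Windows):
library("parallel")
# propagate and weight all particles in parallel (one task per particle);
# this replaces the inner for() loop over particles
results <- mclapply(stateParticles, function(currentState) {
    traj <- fitmodel$simulate(theta = theta, initState = currentState,
                              times = c(currentTime, nextTime))
    modelPoint <- unlist(traj[2, fitmodel$stateNames])
    list(state = modelPoint,
         weight = fitmodel$dPointObs(dataPoint = dataPoint,
                                     modelPoint = modelPoint,
                                     theta = theta))
}, mc.cores = detectCores())
# collect the propagated states and the weights
stateParticles <- lapply(results, function(x) x$state)
weightParticles <- sapply(results, function(x) x$weight)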
For the pMCMC we have combined a particle filter (to estimate the likelihood and sample from the posterior distribution of trajectories) with Metropolis-Hastings MCMC (to sample from the posterior distribution of parameters). Another option for sampling from the joint posterior distribution of parameters and trajectories is to run what is called SMC^2 (or SMC squared), where Sequential Monte Carlo is used both for the parameters and the state trajectories. See this presentation by Pierre Jacob for more details and some visualisations.