Particle Filters

Estimated time: 2-2.5 hours (includes computation time for calibration)

Introduction

Where we are: You’ve completed the core workflow (Introduction → MCMC → Diagnostics → Model Checking) and seen the SEITL model. Now we tackle stochastic models, where the same parameters produce different trajectories each run.

In the previous sessions, we fitted deterministic models using MCMC. For deterministic models, computing the likelihood is straightforward: given parameters, we solve the ODEs to get a trajectory, then evaluate the probability of the data given that trajectory.

But what about stochastic models? In the SEITL session, we saw that the same parameters can produce very different trajectories due to demographic stochasticity. There’s no single “expected” trajectory to compare against the data.

This session introduces particle filters — the computational technique that makes Bayesian inference possible for stochastic models.

Particle filters weren’t covered in the R-based course. If this is your first encounter with them, that’s expected — they’re a new topic rather than a Julia translation of something you’ve seen before. In R, the pomp package and the odin/dust/mcstate ecosystem provide particle filtering capabilities.

Objectives

In this session you will:

  1. Understand why stochastic models need different inference methods
  2. Learn the core particle filter algorithm: propagate, weight, resample
  3. Implement the SEIT4L model using the state-space model interface
  4. Calibrate the number of particles needed for reliable inference
Important reminder: Stochastic variability

Particle filters use stochastic simulation, so your results will differ slightly from others’ results and from your own repeated runs. The patterns (e.g., how ESS changes with particle count) should be similar, but exact numbers will vary. This is expected.

Warning: Computational requirements

Particle filters are computationally intensive. For this session:

  • Memory: ~2-4 GB RAM for 256 particles (scales linearly with particle count)
  • Runtime: Single likelihood evaluation takes 0.1-0.5 seconds with 100 particles
  • Calibration section: Running 30 replicates at each of eight particle counts takes a few minutes

If you’re on a slower machine, reduce particle counts in the calibration exercises (e.g., use 50, 100, 200 instead of 64, 128, 256, 512).

Source file

The source file of this session is located at sessions/particle_filters.qmd.

Load libraries

using Random ## for random numbers
using Distributions ## for probability distributions
using DataFrames ## for data frames
using Plots ## for plots
using StatsPlots ## for statistical plots
using CSV ## for CSV file reading
using StatsBase ## for statistical functions
using SSMProblems ## for state-space model interface
using GeneralisedFilters ## for particle filtering
using DrWatson

Load shared particle filter code

The SEIT4L Gillespie simulator and SSMProblems interface are defined in a shared file (code shown in main text below).

include(joinpath(project_root, "src", "seit4l_particle_filter.jl"))
Main.Notebook.run_particle_filter

Initialisation

We set a random seed for reproducibility.

Random.seed!(1234)
TaskLocalRNG()

Why deterministic likelihood fails for stochastic models

First, consider the problem. With a deterministic model, the trajectory is fully determined by the parameters:

\[\text{Parameters} \rightarrow \text{Unique trajectory} \rightarrow P(\text{data}|\text{trajectory})\]

With a stochastic model, the same parameters can produce infinitely many different trajectories:

\[\text{Parameters} \rightarrow \text{Many possible trajectories} \rightarrow \text{???}\]

The likelihood we need is \(P(\text{data}|\theta)\), which marginalises over all possible trajectories:

\[P(\text{data}|\theta) = \int P(\text{data}|\text{trajectory}) \times P(\text{trajectory}|\theta) \, d(\text{trajectory})\]

This integral is intractable - we can’t enumerate all possible trajectories. Particle filters give us a Monte Carlo approximation.
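To make the integral concrete, here is a naive Monte Carlo sketch of it (illustrative only: `simulate_trajectory` is a hypothetical stand-in for any stochastic simulator, and a Poisson observation model is assumed). It simulates whole trajectories from the model and averages their likelihoods:

```julia
using Distributions, Random, Statistics

function naive_mc_likelihood(rng, simulate_trajectory, obs, θ; n_sims=1000)
    liks = map(1:n_sims) do _
        # One draw from P(trajectory | θ)
        traj = simulate_trajectory(rng, θ, length(obs))
        # P(data | trajectory), assuming Poisson observations
        prod(pdf(Poisson(max(x, 1e-10)), y) for (x, y) in zip(traj, obs))
    end
    return mean(liks)  # Monte Carlo average ≈ ∫ P(data|traj) P(traj|θ) d(traj)
end
```

In practice this naive estimator has enormous variance: almost every unguided trajectory misses the data somewhere, so nearly all terms in the average are close to zero. Particle filters estimate the same quantity far more efficiently by correcting the trajectories at every observation time.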

Load data and define SEIT4L dynamics

# Load Tristan da Cunha data
flu_tdc = CSV.read(datadir("flu_tdc_1971.csv"), DataFrame)

# Parameters (reasonable values from literature)
θ_example = Dict(
    :R_0 => 7.0,
    :D_lat => 1.0,
    :D_inf => 4.0,
    :α => 0.5,
    :D_imm => 10.0,
    :ρ => 0.65
)

# Initial state for SEIT4L
init_state = Dict(
    :S => 279.0,
    :E => 0.0,
    :I => 2.0,
    :T1 => 3.0,
    :T2 => 0.0,
    :T3 => 0.0,
    :T4 => 0.0,
    :L => 0.0
)
Dict{Symbol, Float64} with 8 entries:
  :I  => 2.0
  :T2 => 0.0
  :L  => 0.0
  :S  => 279.0
  :T4 => 0.0
  :T1 => 3.0
  :E  => 0.0
  :T3 => 0.0

The Gillespie simulator

Note: Why SEIT4L instead of SEITL?

From here on, we use the SEIT4L model (Erlang-4 temporary immunity) rather than the basic SEITL (exponential temporary immunity). As we saw in the SEITL session, the Erlang distribution gives more realistic timing for immunity loss. The original Camacho et al. (2011) paper found that SEIT4L fits the Tristan da Cunha data better. Everything below works identically with SEITL — just with fewer T sub-compartments.

The particle filter needs to simulate many stochastic trajectories. We use the Gillespie algorithm introduced in the SEITL session, adapted here for the SEIT4L model with its four T compartments:

Gillespie step implementation (click to expand)
# Stoichiometry matrix: how each transition changes [S, E, I, T1, T2, T3, T4, L]
const STOICH = [
    [-1,  1,  0,  0,  0,  0,  0,  0],  # S -> E (infection)
    [ 0, -1,  1,  0,  0,  0,  0,  0],  # E -> I (become infectious)
    [ 0,  0, -1,  1,  0,  0,  0,  0],  # I -> T1 (recovery, temporary immunity)
    [ 0,  0,  0, -1,  1,  0,  0,  0],  # T1 -> T2 (Erlang stages)
    [ 0,  0,  0,  0, -1,  1,  0,  0],  # T2 -> T3
    [ 0,  0,  0,  0,  0, -1,  1,  0],  # T3 -> T4
    [ 0,  0,  0,  0,  0,  0, -1,  1],  # T4 -> L (long-term immunity, probability α)
    [ 1,  0,  0,  0,  0,  0, -1,  0],  # T4 -> S (immunity wanes, probability 1 - α)
]

# Transition rates for the current state; entry i matches row i of STOICH
function compute_rates(state, θ)
    S, E, I, T1, T2, T3, T4, L = state
    N = sum(state)
    β = θ[:R_0] / θ[:D_inf]
    return [
        β * S * I / N,                      # S -> E
        E / θ[:D_lat],                      # E -> I
        I / θ[:D_inf],                      # I -> T1
        4 * T1 / θ[:D_imm],                 # T1 -> T2 (Erlang stages)
        4 * T2 / θ[:D_imm],                 # T2 -> T3
        4 * T3 / θ[:D_imm],                 # T3 -> T4
        θ[:α] * 4 * T4 / θ[:D_imm],         # T4 -> L
        (1 - θ[:α]) * 4 * T4 / θ[:D_imm],   # T4 -> S
    ]
end

function gillespie_step(rng, state, θ)
    rates = compute_rates(state, θ)
    daily_incidence = 0
    t = 0.0

    while t < 1.0  # Simulate one day
        total_rate = sum(rates)
        if total_rate == 0
            break
        end

        # Time to next event
        dt = randexp(rng) / total_rate
        if t + dt > 1.0
            break
        end
        t += dt

        # Choose which event occurs, proportional to its rate
        event = sample(rng, 1:length(rates), Weights(rates))

        # Update state
        state = state .+ STOICH[event]

        # Track E -> I transitions as new cases
        if event == 2
            daily_incidence += 1
        end

        # Recalculate rates with the new state
        rates = compute_rates(state, θ)
    end

    return state, daily_incidence
end

Key aspects:

  • Stoichiometry matrix: Defines how each transition changes the state vector
  • Rates: Computed from current state and parameters
  • Event selection: Exponential waiting time, event chosen proportional to rates
  • Incidence tracking: Counts E→I transitions as new cases

How particle filters work

The particle filter approximates the likelihood integral by:

  1. Propagate: Simulate many independent trajectories (“particles”) forward through time
  2. Weight: At each observation time, assign weights based on how well each particle matches the data
  3. Resample: Replace low-weight particles with copies of high-weight particles

We can visualise this with our SEIT4L model:

# Propagate 20 particles for 5 days
n_particles = 20
particles = [Float64[init_state[:S], init_state[:E], init_state[:I],
                     init_state[:T1], init_state[:T2], init_state[:T3],
                     init_state[:T4], init_state[:L]] for _ in 1:n_particles]

# Store trajectories (days 0-5)
trajectories = [zeros(6) for _ in 1:n_particles]

# Simulate 5 days
for day in 1:5
    for i in 1:n_particles
        particles[i], inc = gillespie_step(Random.default_rng(), particles[i], θ_example)
        trajectories[i][day+1] = inc
    end
end

# For weighting step later
day_5_predictions = [trajectories[i][6] for i in 1:n_particles]
day_5_observed = flu_tdc.obs[6]

# Plot
p = plot(title="Particle Filter: Propagation Step",
         xlabel="Time (days)", ylabel="Daily incidence")

# Particle trajectories
for i in 1:n_particles
    plot!(p, 0:5, trajectories[i], alpha=0.4, color=:steelblue, linewidth=1, label="")
end

# All observations - earlier in gray, previous (day 4) in red, current target (day 5) as open circle
scatter!(p, flu_tdc.time[1:4], flu_tdc.obs[1:4],
         label="Earlier observations", markersize=4, color=:gray)
scatter!(p, [flu_tdc.time[5]], [flu_tdc.obs[5]],
         label="Previous observation", markersize=7, color=:red)
scatter!(p, [flu_tdc.time[6]], [flu_tdc.obs[6]],
         label="Current target", markersize=7, markershape=:circle,
         markerstrokecolor=:red, markerstrokewidth=2, color=:white)

plot!(p, legend=:topright)

Each blue line is one particle trajectory. The particles spread apart because each follows a different sequence of random events. The current target observation (the open circle at day 5) will be used to weight the particles.

The weighting step

Particles that predict incidence close to the observed value get higher weights:

# Compute weights (Poisson likelihood)
log_w = [logpdf(Poisson(max(θ_example[:ρ] * pred, 0.01)), day_5_observed)
         for pred in day_5_predictions]

# Normalise
particle_weights = exp.(log_w .- maximum(log_w))
particle_weights ./= sum(particle_weights)

# Visualise
p = bar(1:n_particles, particle_weights,
        xlabel="Particle", ylabel="Weight",
        title="Particle Weights at Day 5 (obs = $day_5_observed)",
        legend=false, alpha=0.7)

# Add predictions as labels
for (i, pred) in enumerate(day_5_predictions)
    annotate!(p, i, particle_weights[i] + 0.02, text("$(round(Int, pred))", 7))
end

plot!(p)

Particles predicting values closer to the observation get higher weights.

The resampling step

When weights become too uneven, we resample - drawing particles with replacement according to their weights:

# Resample particles
n_effective = 1.0 / sum(particle_weights.^2)  # Effective sample size
println("Effective sample size: $(round(n_effective, digits=1)) out of $n_particles")

# Draw new particles
resampled_idx = sample(1:n_particles, Weights(particle_weights), n_particles)

# Count how many times each original particle was selected
counts = [count(==(i), resampled_idx) for i in 1:n_particles]

p = bar(1:n_particles, counts,
        xlabel="Original particle", ylabel="Copies after resampling",
        title="Resampling: High-weight particles duplicated",
        legend=false, alpha=0.7, color=:green)
plot!(p)
Effective sample size: 2.0 out of 20

High-weight particles are duplicated, low-weight particles are discarded. This focuses computational effort on plausible trajectories.

Exercise: Understand the particle filter steps

Look at the three plots above (propagation, weighting, resampling).

  1. In the propagation plot, why do the particle trajectories spread apart over time?
  2. In the weighting plot, particles with predicted incidence closer to the observed value get higher weights. Why does Poisson likelihood give this behaviour?
  3. After resampling, some particles are duplicated many times while others disappear entirely. Why is this beneficial for estimating the likelihood?
Tip: Why resampling is essential (particle degeneracy)

Without resampling, particle filters suffer from degeneracy: after several time steps, almost all weight concentrates on a single particle. This happens because multiplicative weights compound - a particle that’s slightly better at time 1 gets a slightly higher weight, making it even more likely to dominate at time 2, and so on.

Imagine 100 particles after 10 time steps without resampling:

  • 1 particle has weight 0.99
  • 99 particles share the remaining 0.01

This means we’re effectively using just 1 trajectory to estimate the likelihood - a very poor approximation. Resampling fixes this by redistributing particles to maintain diversity, keeping the effective sample size high.
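The collapse is easy to simulate. This small sketch (not part of the session code) compounds random per-step weights for 100 particles over 10 steps without ever resampling, then reports the effective sample size; standard-normal log-weight increments stand in for the observation weights:

```julia
using Random, Statistics

Random.seed!(42)
n, steps = 100, 10
log_w = zeros(n)
for _ in 1:steps
    log_w .+= randn(n)  # stand-in for per-step log observation weights
end
w = exp.(log_w .- maximum(log_w))  # exponentiate stably
w ./= sum(w)                        # normalise
ess = 1 / sum(w .^ 2)               # effective sample size
println("ESS after $steps steps without resampling: $(round(ess, digits=1)) of $n")
```

Typical runs give an ESS in the low single digits: effectively one or two trajectories are doing all the work.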

Accumulating the likelihood

The log-likelihood is estimated by summing the log of mean weights at each time step:

\[\log \hat{P}(\text{data}|\theta) = \sum_{t=1}^{T} \log \left( \frac{1}{N} \sum_{i=1}^{N} w_t^{(i)} \right)\]

Strictly, it is the likelihood estimate \(\hat{P}(\text{data}|\theta)\), not its logarithm, that is unbiased: if we run the particle filter many times, the average of the likelihood estimates converges to the true likelihood. The log estimate is biased slightly downward (by Jensen’s inequality), which is why the mean log-likelihood in the calibration below rises as the particle count grows.
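In code, the accumulation formula is a one-liner over the per-step raw (unnormalised) particle weights:

```julia
using Statistics

# Sum of log mean raw weights, one term per observation time
loglik_from_weights(weights_per_step) =
    sum(log(mean(w)) for w in weights_per_step)

# Toy example: two time steps, three particles each
loglik_from_weights([[0.2, 0.1, 0.3], [0.05, 0.02, 0.02]])  # log(0.2) + log(0.03) ≈ -5.12
```

Note the weights must be the raw likelihood values \(w_t^{(i)}\), taken before the normalisation used for resampling.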

Implementing the state-space model interface

For production use, we define the model using the SSMProblems.jl interface. This separates the model from the filtering algorithm, so you can swap in different filtering methods (bootstrap, guided, auxiliary) without rewriting the model. It also integrates with Turing.jl for the Particle MCMC in the next session.

A state-space model has three components:

  1. Initial state distribution (SEIT4LInitial) - where does the system start?
  2. Transition dynamics (SEIT4LDynamics) - how does the state evolve? (uses gillespie_step)
  3. Observation process (PoissonObservation) - how do observations relate to the state?

Here is the full interface implementation (also available in src/seit4l_particle_filter.jl):

# Initial state: deterministic starting point
struct SEIT4LInitial <: SSMProblems.StatePrior
    init_state::Vector{Float64}
end

function SSMProblems.simulate(rng::AbstractRNG, prior::SEIT4LInitial; kwargs...)
    vcat(prior.init_state, 0.0)  # Append 0 for initial daily incidence
end

# Dynamics: Gillespie simulation for one day
struct SEIT4LDynamics <: SSMProblems.LatentDynamics
    θ::Dict{Symbol, Float64}
end

function SSMProblems.simulate(rng::AbstractRNG, dyn::SEIT4LDynamics,
                              step::Integer, prev_state; kwargs...)
    state = collect(prev_state[1:8])  # Extract compartments
    new_state, daily_inc = gillespie_step(rng, state, dyn.θ)
    return vcat(new_state, daily_inc)  # Return state with incidence appended
end

# Observation: Poisson likelihood on reported cases
struct PoissonObservation <: SSMProblems.ObservationProcess
    ρ::Float64  # Reporting rate
end

function SSMProblems.distribution(obs::PoissonObservation, step::Integer,
                                  state; kwargs...)
    daily_inc = state[end]
    Poisson(max(obs.ρ * daily_inc, 1e-10))
end
  • SSMProblems.LatentDynamics - abstract type for state transition models. Implement simulate to define how the state evolves.
  • SSMProblems.ObservationProcess - abstract type for observation models. Implement distribution to return a Distributions.jl distribution.
  • AbstractRNG parameter allows reproducible random number generation across particles.

How the pieces fit together:

  • The state vector is [S, E, I, T1, T2, T3, T4, L, daily_inc] - the 9th element tracks daily incidence for the observation process
  • SEIT4LDynamics wraps the Gillespie simulator, calling gillespie_step each day
  • PoissonObservation assumes observed cases follow Poisson(ρ × true_incidence)

Running the bootstrap particle filter

The run_particle_filter function assembles the SSM components and runs the bootstrap filter:

function run_particle_filter(θ, obs, n_particles;
                             init_state=[279.0, 0.0, 2.0, 3.0, 0.0, 0.0, 0.0, 0.0])
    # Define SSM components
    initial = SEIT4LInitial(init_state)
    dynamics = SEIT4LDynamics(θ)
    observation = PoissonObservation(θ[:ρ])

    # Create state-space model
    model = StateSpaceModel(initial, dynamics, observation)

    # Run bootstrap particle filter
    rng = Random.default_rng()
    algo = BF(n_particles)  # Bootstrap Filter
    _, log_lik = GeneralisedFilters.filter(rng, model, algo, obs)

    return log_lik
end

This ties everything together: the initial state, Gillespie dynamics, and Poisson observations form a StateSpaceModel, which the bootstrap filter (BF) processes to estimate the log-likelihood.

Testing it:

# Run with 100 particles
ll = run_particle_filter(θ_example, flu_tdc.obs, 100)
println("Log-likelihood estimate: $(round(ll, digits=1))")
Log-likelihood estimate: -118.2

Calibrating particle count

A key question: how many particles do we need?

  • Too few particles produce noisy likelihood estimates, causing poor MCMC mixing
  • Too many particles waste computation

The principled approach targets a specific variance of the log-likelihood estimate. Pitt et al. (2012) and Doucet et al. (2015) showed that MCMC efficiency is maximised when the standard deviation of the log-likelihood estimate is between 1 and 3.

Why is some noise acceptable? MCMC with a noisy likelihood (called “pseudo-marginal MCMC”) still targets the correct posterior — the noise slows down mixing but doesn’t introduce bias. An SD of 1-3 is a sweet spot: low enough for reasonable MCMC acceptance rates, high enough to avoid wasting computation on unnecessary precision.
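A rough rule of thumb (assuming, as is approximately true for particle filters, that the variance of the log-likelihood estimate scales as \(1/N\)) lets you extrapolate from a pilot run:

```julia
# Estimate the particle count needed for a target SD, given a pilot run.
# Assumes Var(log-likelihood) ∝ 1/N, which holds only approximately.
particles_for_target_sd(n_pilot, sd_pilot, sd_target) =
    ceil(Int, n_pilot * (sd_pilot / sd_target)^2)

particles_for_target_sd(64, 3.4, 1.5)  # ≈ 329 particles
```

This is only a starting point; the calibration below measures the SD directly at each particle count.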

Calibrating (this takes 1-2 minutes to run all particle counts):

# Reference parameters (near what we expect the posterior mode to be)
θ_cal = Dict(:R_0 => 6.0, :D_lat => 1.3, :D_inf => 2.0,
             :α => 0.5, :D_imm => 10.5, :ρ => 0.7)

# Test different particle counts
particle_counts = 2 .^ (2:9)  # 4, 8, 16, ..., 512
n_reps = 30

calibration = DataFrame(n_particles=Int[], mean_ll=Float64[],
                        sd_ll=Float64[], time_per_run=Float64[])

for n in particle_counts
    t0 = time()
    lls = [run_particle_filter(θ_cal, flu_tdc.obs, n) for _ in 1:n_reps]
    elapsed = (time() - t0) / n_reps

    push!(calibration, (n, mean(lls), std(lls), elapsed))
end

calibration
8×4 DataFrame
 Row │ n_particles  mean_ll   sd_ll      time_per_run
     │ Int64        Float64   Float64    Float64
─────┼────────────────────────────────────────────────
   1 │           4  -147.911  26.6157      0.00483963
   2 │           8  -134.305  17.1061      0.00285196
   3 │          16  -128.136  14.4657      0.003629
   4 │          32  -118.14    6.38355     0.00707473
   5 │          64  -117.235   3.43089     0.0142501
   6 │         128  -115.902   0.70281     0.0287174
   7 │         256  -115.981   0.427675    0.0580098
   8 │         512  -115.822   0.356587    0.12216
# Plot calibration results
p1 = plot(calibration.n_particles, calibration.mean_ll,
    xlabel="Number of particles", ylabel="Mean log-likelihood",
    title="Bias", legend=false, marker=:circle, linewidth=2, xscale=:log2)

p2 = plot(calibration.n_particles, calibration.sd_ll,
    xlabel="Number of particles", ylabel="SD of log-likelihood",
    title="Variance (target: 1-3)", legend=false, marker=:circle,
    linewidth=2, xscale=:log2)
hline!(p2, [1, 3], color=:grey, linestyle=:dot, alpha=0.7)

p3 = plot(calibration.n_particles, calibration.time_per_run,
    xlabel="Number of particles", ylabel="Time (seconds)",
    title="Computation time", legend=false, marker=:circle,
    linewidth=2, xscale=:log2)

plot(p1, p2, p3, layout=(1, 3), size=(900, 300))

Look at the variance plot (middle). We want the SD to be in the range 1-3. Read off which particle count achieves this.

In our runs, we typically see:

  • 4-16 particles: SD > 10, far too noisy for reliable inference
  • 32-64 particles: SD around 3-6, borderline
  • 128 particles: SD around 1.5-3, in the target range
  • 256-512 particles: SD < 1.5, diminishing returns for the extra computation

Your numbers will differ due to stochastic variability, but the pattern should be similar. For the PMCMC session, we’ll use around 128-256 particles based on this calibration.

Exercise: Choose the number of particles

Based on the calibration plots above:

  1. What particle count achieves SD of log-likelihood between 1 and 3?
  2. What’s the trade-off between 128 and 256 particles for this problem?
  3. Why might you want to use more particles than the minimum needed?
Exercise: Sensitivity to parameters

The particle filter likelihood estimate depends on the parameters. Explore how the log-likelihood changes as you vary R₀:

# Vary R_0 and compute log-likelihoods
R0_values = 3.0:1.0:12.0
n_particles = 128  # Use your calibrated value

ll_results = Float64[]
for R0 in R0_values
    θ_test = Dict(:R_0 => R0, :D_lat => 1.3, :D_inf => 2.0,
                  :α => 0.5, :D_imm => 10.5, :ρ => 0.7)
    ll = run_particle_filter(θ_test, flu_tdc.obs, n_particles)
    push!(ll_results, ll)
end

plot(R0_values, ll_results, xlabel="R_0", ylabel="Log-likelihood",
     title="Likelihood surface", marker=:circle, legend=false)

Questions:

  1. Which value of R₀ gives the highest log-likelihood?
  2. Run the same R₀ value multiple times. How much does the log-likelihood vary? (This is the noise from particle filtering)
  3. How does the shape of this likelihood surface compare to what you’d expect from deterministic model fitting?

Summary

Stochastic models cannot be fitted with standard MCMC because the likelihood requires integrating over all possible trajectories. Particle filters solve this by approximating the integral with Monte Carlo simulation.

The algorithm is elegant: propagate many trajectories (particles) forward, weight them by how well they match observations, resample to focus on plausible trajectories, and accumulate likelihood contributions. The resulting estimate is unbiased, though noisy.

The number of particles is a critical tuning parameter. Too few gives high variance in the likelihood estimate, causing poor MCMC mixing. Too many wastes computation. The optimal choice targets a log-likelihood standard deviation between 1 and 3.

Tip: Learning points
  • Stochastic models require particle filters because the likelihood marginalises over all possible trajectories
  • Particle filters work by: propagate particles forward, weight by observation likelihood, resample when weights become uneven
  • SSMProblems.jl interface separates model specification from filtering algorithms
  • Calibrate particle count to achieve log-likelihood SD between 1 and 3 (too few = noisy, too many = slow)
  • The likelihood estimate is unbiased but noisy - this noise slows MCMC but doesn’t bias posteriors
  • Next step: The noisy likelihood estimate slots directly into the Metropolis-Hastings acceptance ratio — wherever we previously computed log_likelihood(θ), we now use run_particle_filter(θ, data, n_particles) instead
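To preview how that works, here is a minimal sketch of one pseudo-marginal Metropolis-Hastings step (a Gaussian random-walk proposal is assumed, and `loglik_hat` and `log_prior` are placeholder callables, not session code). The key detail is that the noisy estimate for the current parameters is stored and reused, never re-estimated:

```julia
using Random

function pmmh_step(rng, θ, ll_current, log_prior, loglik_hat; step_size=0.5)
    # Random-walk proposal
    θ_prop = θ .+ step_size .* randn(rng, length(θ))
    # One particle-filter run gives a fresh noisy log-likelihood estimate
    ll_prop = loglik_hat(θ_prop)
    # Standard MH acceptance ratio, using the stored estimate for the current θ
    log_α = (ll_prop + log_prior(θ_prop)) - (ll_current + log_prior(θ))
    if log(rand(rng)) < log_α
        return θ_prop, ll_prop   # accept: store the new noisy estimate
    else
        return θ, ll_current     # reject: keep the stored estimate
    end
end
```

Reusing `ll_current`, rather than re-running the filter for the current θ, is what makes pseudo-marginal MCMC target the exact posterior despite the noise.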

Next session: In Particle MCMC, we’ll use the particle filter within an MCMC framework to fit the stochastic SEIT4L model to data.

References