4 Example: Abakaliki smallpox outbreak

Here we will explore fitting a simple \(SIR\) model to the famous Abakaliki smallpox data (see e.g. O’Neill and Roberts 1999). This chapter is mainly for you to have a go at!

Firstly, load the SimBIID library:

## load library
library(SimBIID)

Note that the data, originally from (Thompson and Foege 1968), can be found in the outbreaks package, albeit in a slightly different form. We use the simpler data in (O’Neill and Roberts 1999), which correspond to the inter-removal times for 30 individuals, where the times have been relabelled such that time 0 is the time of the first removal. For convenience these data have been collapsed to time-series counts and included in the SimBIID package:

## read in the data 
data(smallpox)
summary(smallpox)

##       time          removals    
##  Min.   : 0.00   Min.   :1.000  
##  1st Qu.:28.00   1st Qu.:1.000  
##  Median :47.00   Median :1.000  
##  Mean   :43.43   Mean   :1.304  
##  3rd Qu.:57.50   3rd Qu.:1.500  
##  Max.   :76.00   Max.   :3.000

4.1 SIR model

Recall, for a continuous-time \(SIR\) model we have rates of transition between states given by: \[\begin{align*} P\left[S_{t + \delta t} = S_t - 1, I_{t + \delta t} = I_t + 1\right] &\approx \beta S I\\ P\left[I_{t + \delta t} = I_t - 1, R_{t + \delta t} = R_t + 1\right] &\approx \gamma I\\ \end{align*}\]

4.2 Summary statistics

In the first instance, as before, let’s match to:

final epidemic size (i.e. total number of removals across the time course of the epidemic), and
time of final removal (in this case when the epidemic process ceased).

Although simple, these two measures serve to give us some information on both the length and magnitude of the epidemic, and should contain useful information about the parameters. In this case the final removal time is 76 days and the final epidemic size is 30 individuals.

4.3 Simulation model

In order to use the ABCSMC() function, we need to define a function that runs the simulations and returns an NA if the simulation is rejected, or a vector of summary statistics if it is not.

Note: one thing to be careful of for this particular system, is that the data are aligned to time 0 being the time of the first removal, not infection. Just for illustration, we are going to simplify things in the first instance, and assume that the first infection happens 10 days before the first removal, which corresponds to adding 10 days to each of our observed removal times.

Copy-and-paste the function below. Go through this function and understand what each line is doing.

## reset times so that initial infection occurs
## 10 days before first removal
smallpox$time <- smallpox$time + 10

## define the targeted summary statistics
sumStat <- c(
    finalsize = sum(smallpox$removals), 
    finaltime = max(smallpox$time)
)

## set initial states (1 initial infection in population of 120)
iniStates <- c(S = 119, I = 1, R = 0)

Task

Set up an SIR model using mparseRcpp() to simulate smallpox outbreaks. Check that the model compiles.

Task

Write a function called simSIR() that is used to simulate an outbreak and extract the relevant summary statistics from the model runs. the function needs to take at least the arguments: pars, data, tols, u, and return an NA if no match, or a vector of summary statistics if there is a match.

Task

Set up a set of gamma prior distributions for \(\beta\) and \(\gamma\), with shape parameter 10 in both cases, but rate parameters of 10,000 and 100 respectively.

Task

Run 5 generations of the sequential ABC routine of (Toni et al. 2009) using starting tolerances of \(\epsilon = 50\) and ptols = 0.2.

Task

Plot the approximate posteriors and simulated outcomes. Produce posterior mean and standard deviation estimates for the parameters.

References

O’Neill, Philip D., and Gareth O. Roberts. 1999. “Bayesian Inference for Partially Observed Stochastic Epidemics.” Journal of the Royal Statistical Society. Series A (General) 162: 121–29.

Thompson, David, and William Foege. 1968. “Faith Tabernacle Smallpox Epidemic.” World Health Organization.

Toni, Tina, David Welch, Natalja Strelkowa, Andreas Ipsen, and Michael P.H. Strumpf. 2009. “Approximate Bayesian Computation Scheme for Parameter Inference and Model Selection in Dynamical Systems.” Journal of the Royal Society Interface 6: 187–202.