1 Introduction

A pandemic (e.g., caused by influenza viruses such as H1N1, H5N1) is an epidemic of an infectious disease spreading through human populations across a large region (e.g., Potter 2001). Recently, the risk of pandemic influenza has been a significant public-health concern, and much attention has been paid to obtaining more precise and timely estimates and predictions of influenza activity.

Compartment epidemic models, such as the susceptible-infectious-recovered (SIR) model, have been widely used to study the dynamical evolution of an infectious disease in a large population. The SIR model was first proposed by Kermack and McKendrick (1927) to explain plague and cholera epidemics, and it was extended to the SIR-susceptible (SIRS) model to allow imperfect immunity (Kermack and McKendrick 1932, 1933; Dushoff et al. 2004). Assume that at any given time \(t\), a fixed population can be split into three compartments: \(S(t)\) denotes the number of susceptibles; \(I(t)\) denotes the number of infectious; and \(R(t)\) denotes the number of “recovereds” (which includes deaths). Then in an SIRS model, the dynamical process is captured through the following set of deterministic (i.e., non-stochastic) nonlinear ordinary differential equations (ODEs):

$$\begin{aligned} \frac{\mathrm{d}S}{\mathrm{d}t}&= -\beta S I+\phi R, \end{aligned}$$
(1)
$$\begin{aligned} \frac{\mathrm{d}I}{\mathrm{d}t}&= \beta S I-\gamma I,\end{aligned}$$
(2)
$$\begin{aligned} \frac{\mathrm{d}R}{\mathrm{d}t}&= \gamma I-\phi R, \end{aligned}$$
(3)

In (1)–(3), \(\beta \) denotes the transmission rate (also referred to as the contact or infection rate) per unit time, which can be expressed as the fraction of contacts between susceptible individuals and infectious individuals that result in an infection. Further, \(\gamma \) denotes the rate of “recovery” per unit time, which is the rate at which infectious individuals are removed from being infectious due to recovery (or death); then \(\frac{1}{\gamma }\) is the average duration of the infectious period. Finally, \(\phi \) denotes the rate of loss of immunity of recovered individuals per unit time, which is the rate at which recovered individuals become susceptible again (Anderson and May 1991; Hethcote 2000); then \(\frac{1}{\phi }\) is the average duration of the immunity period.

Notice that by adding Eqs. (1)–(3), we can obtain

$$\begin{aligned} \frac{\mathrm{d}S}{\mathrm{d}t}+\frac{\mathrm{d}I}{\mathrm{d}t}+ \frac{\mathrm{d}R}{\mathrm{d}t}=0. \end{aligned}$$
(4)

Thus, the model postulates a fixed total population, \(N\), without entries and exits of demographic type (i.e., there are no births or deaths from causes other than the disease itself). Clearly, it represents a short-term phenomenon where the total number of people in all three compartments together is constant. Hence, (4) implies that for \(t\) in an interval of \(\mathbb {R}_+,\)

$$\begin{aligned} S(t) + I(t) + R(t) = N. \end{aligned}$$
(5)

This assumption (5) is commonly referred to as mass balance (Reluga 2004). As discussed above, the SIRS model in (1)–(3) assumes a fraction \(\phi \) of members of the recovered class can rejoin the susceptible class; the traditional SIR model is obtained when \(\phi =0\).
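As a quick numerical check of the mass-balance property (4), the right-hand sides of (1)–(3) can be evaluated at any state and confirmed to sum to zero. The sketch below does this in Python; the state and rate values are illustrative, not taken from the article.

```python
# Evaluate the SIRS derivatives (1)-(3) at one state and check that they
# sum to zero, which is the mass-balance property (4).
# The state (S, I, R) and rates (beta, gamma, phi) below are illustrative.

def sirs_derivatives(S, I, R, beta, gamma, phi):
    """Right-hand sides of the SIRS ODEs (1)-(3)."""
    dS = -beta * S * I + phi * R
    dI = beta * S * I - gamma * I
    dR = gamma * I - phi * R
    return dS, dI, dR

dS, dI, dR = sirs_derivatives(S=990.0, I=10.0, R=0.0,
                              beta=5e-4, gamma=0.33, phi=0.05)
print(abs(dS + dI + dR) < 1e-12)  # mass balance: the derivatives sum to 0
```

The cancellation holds term by term: \(\beta S I\), \(\gamma I\), and \(\phi R\) each appear once with a plus sign and once with a minus sign across the three equations.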

A discretized version of (1)–(3) can be written as a set of deterministic difference equations. That is, when time \(t\) is discrete (in units of \(\varDelta \) days),

$$\begin{aligned} S(t+1)&= S(t)- \beta S(t)I(t)\varDelta +\phi R(t)\varDelta ,\end{aligned}$$
(6)
$$\begin{aligned} I(t+1)&= I(t)+\beta S(t)I(t)\varDelta -\gamma I(t)\varDelta ,\end{aligned}$$
(7)
$$\begin{aligned} R(t+1)&= R(t)+ \gamma I(t)\varDelta -\phi R(t)\varDelta ; \end{aligned}$$
(8)

and the discrete-time mass-balance constraint is,

$$\begin{aligned} S(t) + I(t) + R(t) = N, \end{aligned}$$
(9)

where \(t=1,2,\ldots , T\).
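The recursion (6)–(8) is straightforward to iterate. The following sketch steps a small hypothetical population forward with \(\varDelta =1\) day and confirms the discrete mass-balance constraint (9) at every step; the population size and parameter values are illustrative only.

```python
# Sketch of the discrete-time CSIRS recursion (6)-(8), with Delta = 1 day.
# Population size and parameter values are illustrative only.

def csirs_step(S, I, R, beta, gamma, phi, delta=1.0):
    """One step of the difference equations (6)-(8)."""
    S_new = S - beta * S * I * delta + phi * R * delta
    I_new = I + beta * S * I * delta - gamma * I * delta
    R_new = R + gamma * I * delta - phi * R * delta
    return S_new, I_new, R_new

N = 1000.0
S, I, R = 999.0, 1.0, 0.0
for t in range(60):
    S, I, R = csirs_step(S, I, R, beta=1e-3, gamma=0.33, phi=0.05)
    assert abs(S + I + R - N) < 1e-6   # discrete mass balance (9)

print(round(S, 1), round(I, 1), round(R, 1))
```

Because each flow term (\(\beta S I\varDelta \), \(\gamma I\varDelta \), \(\phi R\varDelta \)) is subtracted from one compartment and added to another, the total \(S(t)+I(t)+R(t)\) is conserved at every step.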

In past decades, the deterministic SIR model and its various extensions (e.g., SIRS, SIR including birth and death rates, migration, etc.) have been used extensively for infectious-disease estimation and prediction in large, well-mixed populations (e.g., Schenzle 1984; Anderson and May 1991; Keeling and Rohani 2007). The classic SIRS (CSIRS) model given by (1)–(3) or (6)–(8) is appealing because of its straightforward modeling strategy and its easily interpretable parameters. However, there are various sources of uncertainty in the model: First, there is uncertainty in the counts \(\{S(t), I(t),R(t)\}\) themselves; that is, the counts in the compartments are observed with error. Second, the rather simple model (1)–(3) [or (6)–(8) in the discrete-time setting] may not capture the uncertainties in the hidden dynamical epidemic process, such as those caused by the presence of heterogeneous populations. Third, the values of the parameters \(\beta ,\,\gamma \), and \(\phi \) are uncertain.

A variety of stochastic models have been developed recently, through a probabilistic mechanism that involves a Markov chain of SIR states (e.g., Bailey 1975; Andersson and Britton 2000; Allen 2003; Xu et al. 2007). Some more recent stochastic models involve complex networks (e.g., Halloran et al. 2002; Zhou et al. 2006; Volz 2008) or drug resistance (e.g., Chao et al. 2012) to avoid the assumption of homogeneous mixing. However, these stochastic models ignore the noisy nature of the data, and they apply mass balance to the observed counts rather than to the true counts. Furthermore, these models typically rely on many carefully chosen parameters, such as transmission rates, recovery rates, and so forth in heterogeneous populations; uncertainty about where the model’s parameter vector is located in the parameter space is not accounted for. Our strategy is to deal with each source of uncertainty using a Bayesian hierarchical statistical model, where the counts are observed with error and the dynamical evolution embodied in (6)–(8) is stochastic.

Bayesian percolation models have proven popular for modeling spatio-temporal dynamical processes (e.g., Catterall et al. 2012; Gibson et al. 2006) and have been applied to epidemics (e.g.,  Cook et al. 2007), but they ignore the true process hidden behind the noisy data. More recent Bayesian hierarchical models, which are widely used for mapping non-infectious diseases, aim to capture the true spatial process (e.g., Besag et al. 1991; Carlin and Banerjee 2002), but their process models and parameter models are not appropriate for epidemics. Those that do have a dynamical spatial statistical component have not generally been parameterized in terms of the interpretable components of the epidemic (e.g., Mugglin et al. 2002; Wood 2010).

Recently, partially observed nonlinear stochastic dynamical systems (also known as partially observed Markov processes, or state-space models) have been used extensively for infectious-disease estimation and prediction. A wide range of inference techniques have been proposed and implemented in the R statistical language as part of the package pomp (http://cran.at.r-project.org/web/packages/pomp/), such as nonlinear forecasting (e.g., Kendall et al. 1999, 2005), iterated filtering (Ionides et al. 2006; King et al. 2008; He et al. 2010), and approximate Bayesian particle filtering (e.g., Liu and West 2001; Arulampalam et al. 2001; Dukić et al. 2012; Rasmussen et al. 2011). Some more recent state-space models involve spatial components in the dynamical process model (e.g., Patterson et al. 2008). Some of these models are not appropriate for modeling epidemic flows (e.g., Kendall et al. 1999, 2005; Patterson et al. 2008). Those that are extensions of classic compartment epidemic models (e.g., the SIR model and the SEIR model) and that do pay attention to the underlying true process hidden behind the noisy data, either ignore the source of variation that captures randomness in the (hidden) epidemic process (e.g., Rasmussen et al. 2011), or they do not preserve the mass-balance property (e.g., Liu and West 2001; Dukić et al. 2012), which may introduce biased results. Recent extensions to stochastic models with a master equation have similar problems with mass balance (e.g., Alonso et al. 2007).

Notice that it is important to impose mass balance appropriately, in order to improve the accuracy of estimating counts in all compartments. If one only pays attention to estimating the infectious population (I) and ignores modeling the susceptible population (S) and the recovered population (R) (for example, simply using a value of R from the literature and setting \(S=N-I-R\), as if it were observed), the result may be biased estimates of important model parameters (e.g., the recovery rate \(\gamma \)) and, finally, bias in the estimation of I. Further, bias in \(\gamma \) may lead to bias in the reproduction number, \(R_0\) (defined in Sect. 4.1), which in turn may affect the estimate of the proportion of susceptibles that need to be vaccinated to achieve “herd immunity” (Fine et al. 2011). Accurate estimates to support public-health decisions are critical for disease control and prevention.

In this article, we return to the classic SIRS (CSIRS) model (6)–(8) for motivation, and we propose a discrete-time, mass-balanced (Bayesian) hierarchical SIRS, or HSIRS, model, which is based directly on counts and imposes mass balance appropriately on the underlying true counts, rather than on the observed counts. Our model captures the randomness in the epidemic process by assuming that the dynamical process occurs on a log-odds-ratio scale, transformed from the scale where the true counts are mass-balanced. This new dynamical approach to infectious-disease modeling also shows how infectious diseases in heterogeneous populations could be modeled hierarchically.

In Sect. 2, we propose the HSIRS model for infectious-disease data. The actual computations associated with the posterior analysis involve local linearization of difference equations; see Sect. 3. In Sect. 4, we simulate in discrete time, a dataset from the HSIRS model and from a CSIRS model that is modified to incorporate observation error, and then we infer all unknowns of the models through Markov chain Monte Carlo (MCMC) analysis. Comparisons are given to CSIRS-model-based inference. In Sect. 5, we extend the example to a carefully designed simulation experiment with sufficient replication to make definitive comparisons between the HSIRS model and the CSIRS model. Discussion and conclusions are given in Sect. 6. Some technical derivations are given in the “Appendices”.

2 Bayesian hierarchical statistical SIRS (HSIRS) model

We assume that there is a true (unobserved) process underlying the observed epidemic counts, which we incorporate into the framework of a Bayesian hierarchical statistical model. This typically consists of three components: the data model (i.e., the conditional distribution of the data given hidden processes and parameters); the process model (i.e., the conditional distribution of the hidden processes given parameters); and the parameter model (i.e., the prior distribution of the parameters).

2.1 Data model

We model the raw counts directly rather than modeling the rates derived from the counts (e.g., Dukić et al. 2012, use Gaussian distributions to model the rates) and assume that the data model consists of (conditionally) independent Poisson distributions evolving at discrete time points. That is, for time points \(t=1,2,\ldots ,T\), in units of \(\varDelta \) days, the data model is

$$\begin{aligned} Z_S(t)|P_S(t)\sim \hbox {ind. }\mathrm{Poisson}(\lambda _N P_S(t)), \end{aligned}$$
(10)

independent from

$$\begin{aligned} Z_I(t)|P_I(t) \sim \hbox {ind. }\mathrm{Poisson}(\lambda _N P_I(t)), \end{aligned}$$
(11)

where \(Z_S(t)\) and \(Z_I(t)\) are the observed number of susceptible and infectious individuals at time \(t\), respectively; “ind.” is shorthand for “independent”; \(\lambda _N\) denotes the true total population count, and \(P_S(t)\) and \(P_I(t)\) are the underlying true rates of susceptible and infectious individuals at time \(t\), respectively. Since \(\lambda _N\) is known from demography and \(\lambda _N=Z_S(t)+Z_I(t)+Z_R(t)\), the value of \(Z_R(t)\) follows. Thus, the data are \(\{(Z_S(t), Z_I(t)): t=1,2,\ldots ,T\}\).
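A minimal sketch of sampling from the data model (10)–(11) is given below. The population size \(\lambda _N\) and true rates are illustrative (and deliberately small), and the pure-Python Poisson sampler is included only to keep the sketch self-contained.

```python
# Sketch: draw one time point of observed counts from the data model
# (10)-(11), with conditionally independent Poisson observations.
# lam_N, P_S, and P_I below are illustrative values, not from the article.
import math
import random

random.seed(42)

def _knuth(lam):
    """Knuth's Poisson sampler, adequate for small means."""
    if lam <= 0.0:
        return 0
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

def rpois(mean):
    """Poisson sampler for larger means, using additivity of Poissons."""
    total = 0
    while mean > 25.0:
        total += _knuth(25.0)
        mean -= 25.0
    return total + _knuth(mean)

lam_N = 1000          # small population, for illustration only
P_S, P_I = 0.6, 0.1   # illustrative true rates at some time t

Z_S = rpois(lam_N * P_S)    # data model (10)
Z_I = rpois(lam_N * P_I)    # data model (11)
Z_R = lam_N - Z_S - Z_I     # follows, since lam_N is known
print(Z_S, Z_I, Z_R)
```

Note that, as in the text, \(Z_R(t)\) is not modeled separately: it is recovered from \(\lambda _N\) and the two observed counts.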

2.2 Process model

Recall the discrete-time CSIRS model defined by (6)–(8), which assumes that mass balance holds for the observed population. However, the appropriate place to impose mass balance is on the true (hidden) process. That is, for \(t=1,2,\ldots , T\), we have

$$\begin{aligned} \lambda _S(t)+\lambda _I(t)+\lambda _R(t)=\lambda _N, \end{aligned}$$
(12)

where \(\lambda _S(t),\,\lambda _I(t)\), and \(\lambda _R(t)\) are the underlying true (but hidden) counts of susceptible, infectious, and recovered individuals at time \(t\), respectively. Now define the true (hidden) rates, \(P_S(t),\,P_I(t)\), and \(P_R(t)\), via

$$\begin{aligned}&\lambda _S(t)\equiv \lambda _NP_S(t),\end{aligned}$$
(13)
$$\begin{aligned}&\lambda _I(t)\equiv \lambda _NP_I(t),\end{aligned}$$
(14)
$$\begin{aligned}&\lambda _R(t)\equiv \lambda _NP_R(t), \end{aligned}$$
(15)

where \(P_R(t)\) denotes the underlying true rate of recovered individuals at time \(t\). Then by substituting (13)–(15) into (12), it is straightforward to see that the mass balance in (12) can be rewritten as,

$$\begin{aligned} P_S(t)+P_I(t)+P_R(t)=1, \end{aligned}$$
(16)

and hence for \(t=1,2,\ldots , T\),

$$\begin{aligned} P_R(t)=1-P_S(t)-P_I(t). \end{aligned}$$
(17)

Recall that the difference equations defined in (6)–(8) give easily interpretable dynamics, in which individuals move from the susceptible state, to the infectious state, then to the recovered state (and some individuals may become susceptible again). We wish to model this “SIRS flow” on the hidden process where \(t\) is discrete (in units of \(\varDelta \) days). We do this by deriving a set of deterministic difference equations on \(\lambda _S(t),\,\lambda _I(t)\), and \(\lambda _R(t)\). That is, for \(t=1,2,\ldots \), our process model is,

$$\begin{aligned} \lambda _S(t+1)&= \lambda _S(t)-\beta \varDelta \lambda _S(t)\lambda _I(t)+ \phi \varDelta \lambda _R(t),\end{aligned}$$
(18)
$$\begin{aligned} \lambda _I(t+1)&= \lambda _I(t)+ \beta \varDelta \lambda _S(t)\lambda _I(t)-\gamma \varDelta \lambda _I(t),\end{aligned}$$
(19)
$$\begin{aligned} \lambda _R(t+1)&= \lambda _R(t)+\gamma \varDelta \lambda _I(t)-\phi \varDelta \lambda _R(t). \end{aligned}$$
(20)

In (18)–(20), the SIRS flow has been preserved, and the rate parameters \(\beta ,\,\phi \), and \(\gamma \) are in units of per day (\(d^{-1}\)).

According to the definition of \(\lambda _S(t),\,\lambda _I(t)\), and \(\lambda _R(t)\) in (13)–(15), Eqs. (18)–(20) can be rewritten in terms of the true proportions, \(P_S(t),\,P_I(t)\), and \(P_R(t)\):

$$\begin{aligned} P_S(t+1)&= P_S(t)- \beta \lambda _N P_S(t)P_I(t)\varDelta +\phi P_R(t)\varDelta ,\end{aligned}$$
(21)
$$\begin{aligned} P_I(t+1)&= P_I(t)+\beta \lambda _N P_S(t)P_I(t)\varDelta -\gamma P_I(t)\varDelta , \end{aligned}$$
(22)
$$\begin{aligned} P_R(t+1)&= P_R(t)+ \gamma P_I(t)\varDelta -\phi P_R(t)\varDelta . \end{aligned}$$
(23)

Deterministic equations, such as those in (21)–(23), are unable to capture the uncertainties in the hidden epidemic model. Possible sources of uncertainty are the presence of heterogeneous populations and the existence of other categories such as exposed individuals. To handle the complexity while still preserving the mass balance, we apply the logit transformation to the true rates, which changes the scale of variability from \([0,1]\) to \((-\infty , \infty )\). That is, for \(t=1,2,\ldots ,\) define

$$\begin{aligned} W_S(t)&\equiv \mathrm{log}\left( \frac{P_S(t)}{P_R(t)}\right) ,\end{aligned}$$
(24)
$$\begin{aligned} W_I(t)&\equiv \mathrm{log}\left( \frac{P_I(t)}{P_R(t)}\right) , \end{aligned}$$
(25)

where \(W_S(t)\) and \(W_I(t)\) are the log odds ratios of susceptible-over-recovered populations, and infectious-over-recovered populations, respectively, at time \(t\). On the log-odds-ratio scale (\(\mathbf {W}\)-scale), we construct the process model in terms of \(\mathbf {W}(t)\equiv (W_S(t), W_I(t))^\prime \):

$$\begin{aligned} \mathbf {W}(t+1)={\varvec{\mu }}^W(t)+{\varvec{\xi }}(t+1), \end{aligned}$$
(26)

for discrete time \(t=1,2,\ldots ,\) in units of \(\varDelta \) days. We now discuss each of the components of (26), in turn. The vector \({\varvec{\mu }}^W(t)\equiv (\mu ^W_S(t), \mu ^W_I(t))'\) is the dynamical process that captures the temporal dependence. In “Appendix A.1”, we derive the nonlinear dynamical structure of \({\varvec{\mu }}^W(t)\) using (21)–(25). This derivation retains the SIRS flow on the hidden process; that is, for discrete time \(t=1,2,\ldots ,\) in units of \(\varDelta \) days,

$$\begin{aligned} \mu ^W_S(t)&= W_S(t)\nonumber \\&\quad +\,\mathrm{log}\left[ 1+\frac{\phi \varDelta }{\mathrm{exp} \left( W_S(t)\right) }-\frac{\beta \lambda _N\mathrm{exp}\left( W_I(t) \right) \varDelta }{1+\mathrm{exp}\left( W_S(t)\right) + \mathrm{exp}\left( W_I(t)\right) }\right] \nonumber \\&\quad +\,\mathrm{log}\left[ \frac{1}{1+\gamma \mathrm{exp}\left( W_I(t) \right) \varDelta -\phi \varDelta }\right] ,\nonumber \\ \end{aligned}$$
(27)
$$\begin{aligned} \mu ^W_I(t)&= W_I(t)\nonumber \\&\quad +\,\mathrm{log}\left[ 1-\gamma \varDelta +\frac{\beta \lambda _N \mathrm{exp}\left( W_S(t)\right) \varDelta }{1+\mathrm{exp}\left( W_S(t) \right) +\mathrm{exp}\left( W_I(t)\right) }\right] \nonumber \\&\quad +\,\mathrm{log}\left[ \frac{1}{1+\gamma \mathrm{exp}\left( W_I(t) \right) \varDelta -\phi \varDelta }\right] ,\nonumber \\ \end{aligned}$$
(28)

where recall that \(\beta ,\,\gamma \), and \(\phi \), are the transmission rate, recovery rate, and loss-of-immunity rate per day, respectively.
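The dynamics (27)–(28) can be checked numerically against one step of the \(\mathbf {P}\)-scale recursion (21)–(23): transforming to proportions, stepping, and transforming back should reproduce \({\varvec{\mu }}^W(t)\) exactly, since that is how (27)–(28) are derived. The sketch below performs this cross-check; the parameter and state values are illustrative.

```python
# Implement the nonlinear W-scale dynamics (27)-(28) and cross-check
# against one step of the P-scale recursion (21)-(23).
# Parameter and state values below are illustrative only.
import math

def mu_W(W_S, W_I, beta, gamma, phi, lam_N, delta=1.0):
    """The nonlinear dynamical process mu^W(t) of (27)-(28)."""
    denom = 1.0 + math.exp(W_S) + math.exp(W_I)
    common = math.log(1.0 / (1.0 + gamma * math.exp(W_I) * delta
                             - phi * delta))
    mu_S = (W_S
            + math.log(1.0 + phi * delta / math.exp(W_S)
                       - beta * lam_N * math.exp(W_I) * delta / denom)
            + common)
    mu_I = (W_I
            + math.log(1.0 - gamma * delta
                       + beta * lam_N * math.exp(W_S) * delta / denom)
            + common)
    return mu_S, mu_I

beta, gamma, phi, lam_N, delta = 5e-7, 0.33, 0.05, 1.0e6, 1.0
W_S, W_I = 2.0, -1.0

# P-scale state implied by (W_S, W_I):
denom = 1.0 + math.exp(W_S) + math.exp(W_I)
P_S, P_I, P_R = (math.exp(W_S) / denom, math.exp(W_I) / denom, 1.0 / denom)

# One step of (21)-(23):
P_S1 = P_S - beta * lam_N * P_S * P_I * delta + phi * P_R * delta
P_I1 = P_I + beta * lam_N * P_S * P_I * delta - gamma * P_I * delta
P_R1 = P_R + gamma * P_I * delta - phi * P_R * delta

mu_S, mu_I = mu_W(W_S, W_I, beta, gamma, phi, lam_N, delta)
print(abs(mu_S - math.log(P_S1 / P_R1)) < 1e-9,
      abs(mu_I - math.log(P_I1 / P_R1)) < 1e-9)
```

The agreement confirms that (27)–(28) are exactly (21)–(23) re-expressed on the \(\mathbf {W}\)-scale, so the SIRS flow is retained on the hidden process.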

We denote the vector \({\varvec{\xi }}(t)\equiv (\xi _S(t), \xi _I(t))'\), to be the small-scale variation that captures the uncertainties in the hidden epidemic process. For \(t=1,2,\ldots \), we define

$$\begin{aligned} {\varvec{\xi }}(t)\sim \mathrm{MVN}(\mathbf {0}, {\varvec{\varSigma }}_{{\varvec{\xi }}}(t)), \end{aligned}$$
(29)

a multivariate normal (MVN) distribution with mean \(\mathbf {0}\) and diagonal covariance matrix \({\varvec{\varSigma }}_{{\varvec{\xi }}}(t) \equiv \mathrm{diag}(\sigma ^2_{\xi _S}(t), \sigma ^2_{\xi _I}(t)),\) with nonnegative variance components, \(\sigma ^2_{\xi _S}(t)\) and \(\sigma ^2_{\xi _I}(t)\). For the sake of simplicity, in this article we assume that \(\sigma ^2_{\xi _S}(t)=\sigma ^2_{\xi _S}\) and \(\sigma ^2_{\xi _I}(t)=\sigma ^2_{\xi _I}\), for all \(t=1,2,\ldots \). Notice that the (negative) covariation of \(W_S(t)\) and \(W_I(t)\) [defined in (24) and (25)] due to the mass-balance constraint in (16) has been structured through the covariance of the nonlinear dynamical process \({\varvec{\mu }}^W(t)\). Therefore, it is reasonable to assume that \({\varvec{\varSigma }}_{{\varvec{\xi }}}(t)\), which is the covariance of the conditional distribution, \([\mathbf {W}(t+1)|{\varvec{\mu }}^W(t)]\), has a diagonal structure. Also notice that we can still preserve mass balance while adding the small-scale variation, \({\varvec{\xi }}(t)\), on the \(\mathbf {W}\)-scale, because the mass balance is preserved when

$$\begin{aligned} P_S(t)&= \frac{\mathrm{exp}(W_S(t))}{1+\mathrm{exp}(W_S(t)) +\mathrm{exp}(W_I(t))},\\ P_I(t)&= \frac{\mathrm{exp}(W_I(t))}{1+\mathrm{exp}(W_S(t)) +\mathrm{exp}(W_I(t))},\\ P_R(t)&= \frac{1}{1+\mathrm{exp}(W_S(t))+\mathrm{exp}(W_I(t))}. \end{aligned}$$

Finally, notice that these equations do not put a constraint on the range of the values \(W_S(t)\) and \(W_I(t)\), so adding the error vector \({\varvec{\xi }}(\cdot )\) in (26) will not destroy the balance. However, if an error vector is added directly on the \(\mathbf {P}\)-scale, the mass balance constraint (16) is difficult to retain.
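The identities above make this point concrete: any pair \((W_S(t), W_I(t))\in \mathbb {R}^2\), whether or not it has been perturbed by \({\varvec{\xi }}(\cdot )\), maps back to proportions that sum to exactly 1. A minimal sketch, with illustrative \(\mathbf {W}\) values and noise scale:

```python
# Sketch: adding Gaussian small-scale variation on the W-scale, as in (26),
# cannot violate mass balance, because the inverse transform always yields
# proportions summing to 1. The W values and noise scale are illustrative.
import math
import random

random.seed(0)

def w_to_p(W_S, W_I):
    """Inverse transform from the W-scale back to mass-balanced proportions."""
    denom = 1.0 + math.exp(W_S) + math.exp(W_I)
    return math.exp(W_S) / denom, math.exp(W_I) / denom, 1.0 / denom

W_S = 2.0 + random.gauss(0.0, 0.3)    # perturbed log odds ratios
W_I = -1.0 + random.gauss(0.0, 0.3)

P_S, P_I, P_R = w_to_p(W_S, W_I)
print(abs(P_S + P_I + P_R - 1.0) < 1e-12)
```

By contrast, Gaussian noise added directly to \((P_S(t), P_I(t), P_R(t))\) would generally break the sum-to-one constraint (16), which is the point made in the text.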

The strategy of transforming from the hidden proportion scale (\(\mathbf {P}\)-scale) to the hidden log-odds-ratio scale (\(\mathbf {W}\)-scale) and adding the small-scale variation on the \(\mathbf {W}\)-scale rather than on the \(\mathbf {P}\)-scale, is key to retaining the mass-balance constraint while allowing flexible SIRS flow to be handled. To our knowledge, this is a new approach to infectious-disease modeling; other Bayesian approaches (e.g., Dukić et al. 2012) do not preserve mass balance after building uncertainties into the process model.

2.3 Parameter model

To complete the model, we now specify the joint prior distribution for the parameters, which includes the transmission rate per unit time, \(\beta \); the rate of recovery per unit time, \(\gamma \); the loss-of-immunity rate per unit time, \(\phi \); and the variance components, \(\sigma ^2_{\xi _S}\) and \(\sigma ^2_{\xi _I}\). Notice that the difference equations in (6)–(8) impose a natural constraint on \(\beta \); that is, for any time \(t=1,2,\ldots \),

$$\begin{aligned} \beta S(t) I(t) \le N-(1-\phi )R(t), \end{aligned}$$
(30)

because the number of individuals that become infectious at a certain time \(t\) should be less than or equal to the total number that could be infected at that time. Hence,

$$\begin{aligned} 0\le \beta \le \frac{N-(1-\phi ) R(t)}{S(t)I(t)}. \end{aligned}$$
(31)

In the context of the HSIRS model, this amounts to ensuring that \(\beta \) is bounded above by a hyperparameter, \(\beta _{\mathrm{max}}\). Furthermore, it is straightforward to see that \(\gamma \in [0,1]\) and \(\phi \in [0,1]\), by their definitions. Typically, information about the recovery rate \(\gamma \) is easier to obtain than about the other rate parameters; we apply a logit transformation to \(\gamma \),

$$\begin{aligned} \theta _{\gamma }\equiv \mathrm{log}\left( \frac{\gamma }{1-\gamma }\right) , \end{aligned}$$
(32)

and assign a Gaussian prior to \(\theta _{\gamma }\).

Assuming statistical independence of parameters and using \([Y]\) as generic notation for the probability distribution of \(Y\), we assume that the parameter model can be written as

$$\begin{aligned} \left[ \beta , \theta _\gamma ,\phi , \sigma ^2_{\xi _S}, \sigma ^2_{\xi _I}\right] =[\beta ][\theta _\gamma ][\phi ][\sigma ^2_{\xi _S}][\sigma ^2_{\xi _I}], \end{aligned}$$
(33)

where we specify the prior distributions of individual parameters as follows:

$$\begin{aligned}&\beta \sim \mathrm{Uniform}[0, \beta _{max}],\\&\theta _{\gamma }\sim \mathrm{Normal}(\mu _{\theta _{\gamma }},\sigma ^2_{\theta _{\gamma }}),\\&\phi \sim \mathrm{Uniform}[0,1],\\&\sigma ^2_{\xi _S}\sim \hbox { Inverse Gamma }\left( a_{\xi _S},b_{\xi _S}\right) ,\\&\sigma ^2_{\xi _I}\sim \hbox { Inverse Gamma }\left( a_{\xi _I},b_{\xi _I}\right) . \end{aligned}$$

Notice that the uniform distributions on \(\beta \) and \(\phi \) could easily be replaced by the very flexible Generalized Beta distributions on their supports.

In practice, the hyperparameter \(\beta _{max}\) could be data-driven using (31):

$$\begin{aligned} \beta _{max}\equiv \min _t\left( \frac{\lambda _N-Z_R(t)}{Z_S(t)Z_I(t)} \right) , \end{aligned}$$
(34)

where

$$\begin{aligned} Z_R(t)=\max \left\{ 1, \lambda _N-Z_S(t)-Z_I(t)\right\} . \end{aligned}$$
(35)

The hyperparameter \(\mu _{\theta _{\gamma }}\) could be data-driven through the inverse relationship between recovery rate and the mean duration of the infectious period for individuals (e.g., Lloyd 2001), which is about 3 days for common influenza (Centers for Disease Control; http://www.cdc.h1n1flu/recommendations.htm). In our case, we use \(\mu _{\theta _{\gamma }}= \mathrm{log}(0.33/(1-0.33))=-0.708\), with \(\sigma ^2_{\theta _{\gamma }}=0.01\). The other hyperparameters, namely, \(a_{\xi _S},\,b_{\xi _S},\,a_{\xi _I}\), and \(b_{\xi _I}\), need to be specified; for example, the choice, \(a_{\xi _S}=a_{\xi _I}=0.25\) and \(b_{\xi _S}=b_{\xi _I}=0.4\), results in a fairly vague prior for the variance components.
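A short sketch of these data-driven hyperparameter choices follows: \(\beta _{max}\) from (34)–(35), and \(\mu _{\theta _{\gamma }}\) from a 3-day mean infectious period. The observed counts below are toy values for illustration only.

```python
# Sketch of the data-driven hyperparameters of Sect. 2.3:
# beta_max via (34)-(35), and mu_theta_gamma via logit(gamma) with
# gamma = 0.33 per day. The observed counts are toy values.
import math

lam_N = 10000
Z_S = [9000, 8500, 8000]   # toy observed susceptible counts
Z_I = [100, 500, 800]      # toy observed infectious counts

# Eq. (35): implied recovered counts, floored at 1.
Z_R = [max(1, lam_N - s - i) for s, i in zip(Z_S, Z_I)]

# Eq. (34): data-driven upper bound for beta.
beta_max = min((lam_N - r) / (s * i)
               for s, i, r in zip(Z_S, Z_I, Z_R))

# Prior mean for theta_gamma = logit(gamma), with gamma about 1/3 per day:
mu_theta_gamma = math.log(0.33 / (1 - 0.33))

print(beta_max > 0, round(mu_theta_gamma, 3))
```

The logit value reproduces the \(-0.708\) quoted in the text, and \(\beta _{max}\) scales like \(1/(\lambda _N P_S P_I)\), so it shrinks as the epidemic grows.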

3 \(\mathbf {W}\)-scale approximations for the HSIRS model

Here, we derive a calibrated approximation to the nonlinear \(\mathbf {W}\)-scale process in the HSIRS model. From “Appendix A.2”, for \(t=1,2,\ldots \), Eq. (26) in the HSIRS model can be approximated by

$$\begin{aligned} \mathbf {W}(t+1)={\varvec{\mu }}^{LW}(t)+{\varvec{\zeta }}(t+1), \end{aligned}$$
(36)

where recall that \(\mathbf {W}(t)\equiv (W_S(t), W_I(t))'\) is the true log-odds-ratio vector.

The vector \({\varvec{\zeta }} (t)\equiv (\zeta _S(t), \zeta _I(t))'\) in (36) is the small-scale-variation vector that captures the uncertainties in the epidemic process as well as the higher-order terms in the Taylor-series expansions. For \(t=1,2,\ldots \), we assume a Multivariate Normal (MVN) distribution,

$$\begin{aligned} {\varvec{\zeta }}(t)\sim \mathrm{MVN}\left( \mathbf {0}, {\varvec{\varSigma }}_{{\varvec{\zeta }}}(t)\right) \!, \end{aligned}$$
(37)

where \({\varvec{\varSigma }}_{{\varvec{\zeta }}}(t)\equiv \mathrm{diag}\left( \sigma ^2_{\zeta _S}(t),\sigma ^2_{\zeta _I}(t)\right) \) is the covariance matrix of \({\varvec{\zeta }} (t)\), and \(\sigma ^2_{\zeta _S}(t)\) and \(\sigma ^2_{\zeta _I}(t) \) are nonnegative variance-component parameters. Typically, the components of \({\varvec{\varSigma }}_{{\varvec{\zeta }}}(t)\) are larger than the respective components of \({\varvec{\varSigma }}_{{\varvec{\xi }}}(t)\), because \(\{{\varvec{\zeta }}(t)\}\) also captures the higher-order terms left after matching the linear approximation.

In (36), the vector \({\varvec{\mu }}^{LW}(t)\equiv (\mu ^{LW}_S(t), \mu ^{LW}_I(t))'\) is a linear dynamical process derived through Taylor-series expansions that approximates the nonlinear stochastic process \({\varvec{\mu }}^{W}(t)\) defined in (27)–(28). From “Appendix A.2”, for \(t=1,2,\ldots \),

$$\begin{aligned} \mu ^{LW}_{S}(t)&= J_0(t)+J_1(t)W_S(t)+J_2(t)W_I(t),\end{aligned}$$
(38)
$$\begin{aligned} \mu ^{LW}_{I}(t)&= J_3(t)+J_4(t)W_S(t)+J_5(t)W_I(t), \end{aligned}$$
(39)

where

$$\begin{aligned} J_0(t)&\equiv \mathrm{log}\hat{A}_1(t)+\frac{1}{\hat{A}_1(t)} +\phi \left[ \frac{\mathrm{exp}(\hat{A}_5(t))(1- \hat{A}_5(t))}{\hat{A}_1(t)}+\frac{1}{\hat{A}_2(t)} \right] \varDelta \nonumber \\&\quad -\,\frac{\beta \lambda _N}{\hat{A}_1(t)}\left( 1- \frac{1}{1-\hat{A}_7(t)}+\frac{B_0(t)+\hat{A}_7(t)}{(1- \hat{A}_7(t))^2}\right) \varDelta \nonumber \\&\quad -\,\mathrm{log}\hat{A}_2(t)-\frac{1}{\hat{A}_2(t)}- \frac{\gamma }{\hat{A}_2(t)}e^{\hat{A}_6(t)}(1-\hat{A}_6(t))\varDelta , \end{aligned}$$
(40)
$$\begin{aligned}&\displaystyle J_1(t)\equiv 1-\frac{ \phi \mathrm{exp}\left( \hat{A}_5(t) \right) \varDelta }{\hat{A}_1(t)}-\frac{\beta \lambda _NB_1(t) \varDelta }{\hat{A}_1(t)(1-\hat{A}_7(t))^2},\end{aligned}$$
(41)
$$\begin{aligned}&\displaystyle J_2(t)\equiv -\frac{\beta \lambda _NB_2(t)\varDelta }{\hat{A}_1(t)(1- \hat{A}_7(t))^2}-\frac{\gamma e^{\hat{A}_6(t)} \varDelta }{\hat{A}_2(t)}, \end{aligned}$$
(42)
$$\begin{aligned} J_3(t)&\equiv \mathrm{log}\hat{A}_3(t)+ \frac{1-\gamma \varDelta }{\hat{A}_3(t)}+ \frac{\beta \lambda _N\varDelta }{\hat{A}_3(t)(1-\hat{A}_{10}(t))}- \frac{\beta \lambda _N(B_3(t)+\hat{A}_{10}(t))\varDelta }{\hat{A}_3(t)(1- \hat{A}_{10}(t))^2}\nonumber \\&\quad -\,\mathrm{log}\hat{A}_2(t)-\frac{1}{\hat{A}_2(t)}- \frac{\gamma e^{\hat{A}_6(t)}(1-\hat{A}_6(t))\varDelta }{\hat{A}_2(t)}+ \frac{\phi \varDelta }{\hat{A}_2(t)}, \end{aligned}$$
(43)
$$\begin{aligned}&\displaystyle J_4(t)\equiv \frac{-\beta \lambda _N B_4(t) \varDelta }{\hat{A}_3(t)(1-\hat{A}_{10}(t))^2},\end{aligned}$$
(44)
$$\begin{aligned}&\displaystyle J_5(t)\equiv 1-\frac{\beta \lambda _N B_5(t)\varDelta }{\hat{A}_3(t)(1- \hat{A}_{10}(t))^2}-\frac{\gamma e^{\hat{A}_6(t)} \varDelta }{\hat{A}_2(t)}. \end{aligned}$$
(45)

Also,

$$\begin{aligned} B_0(t)&\equiv e^{\hat{A}_4(t)}(1-\hat{A}_4(t))B^*(t),\end{aligned}$$
(46)
$$\begin{aligned} B_1(t)&\equiv \frac{e^{\hat{A}_5(t)+\hat{A}_4(t)}(1- \hat{A}_4(t))}{(1-\hat{A}_9(t))^2}-e^{\hat{A}_4(t)}B^*(t),\end{aligned}$$
(47)
$$\begin{aligned} B_2(t)&\equiv e^{\hat{A}_4(t)}B^*(t),\end{aligned}$$
(48)
$$\begin{aligned} B_3(t)&\equiv e^{\hat{A}_4(t)}(1-\hat{A}_4(t))+e^{\hat{A}_5(t)}(1-\hat{A}_5(t)),\end{aligned}$$
(49)
$$\begin{aligned} B_4(t)&\equiv -e^{\hat{A}_4(t)}-e^{\hat{A}_5(t)},\end{aligned}$$
(50)
$$\begin{aligned} B_5(t)&\equiv e^{\hat{A}_4(t)}, \end{aligned}$$
(51)

where

$$\begin{aligned} B^*(t)=\frac{1}{1-\hat{A}_9(t)}+\frac{e^{\hat{A}_5(t)}\left( \hat{A}_5(t)-1\right) -\hat{A}_9(t)}{(1-\hat{A}_9(t))^2}, \end{aligned}$$
(52)

and \(\{\hat{A}_j(t): j=1,\ldots ,10\}\) are discussed below.

Notice that by combining Eqs. (36)–(52), we are able to write \(\mathbf {W}(t+1)\) given by (36) in a multivariate autoregressive form:

$$\begin{aligned} \mathbf {W}(t+1)=\mathbf {C}(t)+\mathbf {H}(t)\mathbf {W}(t) +{\varvec{\zeta }}(t+1), \end{aligned}$$
(53)

where \(\mathbf {C}(t)\equiv (J_0(t), J_3(t))'\) and the \(2\times 2\) matrix \(\mathbf {H}(t)\) is

$$\begin{aligned} \mathbf {H}(t)\equiv \left( \begin{array}{cc} J_1(t) &{} J_2(t) \\ J_4(t) &{} J_5(t) \\ \end{array} \right) . \end{aligned}$$

The general idea behind (38)–(39) is to use \(\hat{A}_i(t),\,i=1,\ldots ,10\), as an initialization of the Taylor-series expansion of the nonlinear process \({\varvec{\mu }}^{W}(t)\) in (26). Formulas for \(\hat{A}_i(t)\) and the quantity \(A_i(t)\) that it approximates are given in Table 1. In practice, we use empirical values obtained from data \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\) to obtain \(\hat{A}_i(t)\) close to \(A_i(t)\), where from Table 1 \(\{A_i(t):\,i=1,\ldots ,10\}\) are nonlinear components in the HSIRS model. From the transformation in (24) and (25), we can obtain

$$\begin{aligned} \lambda _NP_S(t)&= \frac{\lambda _N\mathrm{exp}(W_S(t))}{1 +\mathrm{exp}(W_S(t))+\mathrm{exp}(W_I(t))},\\ \lambda _NP_I(t)&= \frac{\lambda _N\mathrm{exp}(W_I(t))}{1+ \mathrm{exp}(W_S(t))+\mathrm{exp}(W_I(t))},\\ \lambda _NP_R(t)&= \frac{\lambda _N}{1+\mathrm{exp}(W_S(t))+ \mathrm{exp}(W_I(t))}. \end{aligned}$$

Now \(\lambda _NP_S(t),\,\lambda _NP_I(t)\), and \(\lambda _NP_R(t)\) are the means of \(\{Z_S(t)\},\,\{Z_I(t)\}\), and \(\{Z_R(t)\}\), and hence the values in the column “Initializations (\(\hat{A}(t)\))” in Table 1 are reasonable choices for \(\{\hat{A}_i(t)\}\). If data are not available at some time point, we obtain \(\{\hat{A}_i(t)\}\) in Table 1 by replacing the observed data \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\) with the predicted counts provided by a simple model, such as the CSIRS model. The goal is to obtain \(\{\hat{A}_i(t)\}\) as close as possible to \(\{A_i(t)\}\).

Table 1 Table of the initializations \(\{\hat{A}_i(t):i=1,\ldots ,10\}\) and the quantities \(\{A_i(t):i=1,\ldots ,10\}\) that they approximate

In Table 1, \(\beta _0,\,\gamma _0\), and \(\phi _0\) are initial values of \(\beta ,\,\gamma \), and \(\phi \), respectively, which are used to enhance the Taylor-series expansions. We use values obtained from the CSIRS model (e.g., Anderson and May 1991; Wearing et al. 2005; Burr and Chowell 2006) for \(\beta _0,\,\gamma _0\), and \(\phi _0\). Further details of implementation can be found in Sect. 4.3. We performed sensitivity studies and noticed that the Markov chain Monte Carlo (MCMC) based on this linear approximation is not sensitive to the initializations (even in forecasting), because the small-scale-variation vector in (36) can absorb the higher-order terms in the Taylor-series expansion (e.g., Cressie and Wikle 2011, Section 7.3.3). Like the data model, the parameter model is unchanged, except that subscript \(\xi \) is replaced with subscript \(\zeta \); see Sect. 2.3 for details.

Finally, the approximate HSIRS model, which we call the ASIRS model, consists of the data model defined in (10)–(11), the linear dynamical process model for \(\{\mathbf {W}(t)\}\) defined in (36), and the parameter model defined in (33) with \(\xi \) replaced by \(\zeta \). Because of its computational efficiency, the ASIRS model is used in the MCMC algorithm that produces our posterior analysis.
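The autoregressive form (53) is what makes the ASIRS model computationally convenient. The sketch below propagates \(\mathbf {W}(t)\) forward using fixed illustrative values in place of the time-varying coefficients \(\mathbf {C}(t)\) and \(\mathbf {H}(t)\) (which in the ASIRS model are built from (40)–(45)), with independent Gaussian \({\varvec{\zeta }}\) as in (37), and confirms that each state still maps back to mass-balanced proportions.

```python
# Generic sketch of the linear-Gaussian propagation (53),
#   W(t+1) = C(t) + H(t) W(t) + zeta(t+1).
# C, H, and the noise scales are illustrative stand-ins, NOT the J_i(t)
# coefficients (40)-(45) of the ASIRS model.
import math
import random

random.seed(7)

C = (0.02, -0.05)                    # stands in for (J_0(t), J_3(t))'
H = ((0.98, -0.03),                  # stands in for the 2x2 matrix H(t)
     (0.01, 0.97))
sigma_zeta = (0.05, 0.05)            # standard deviations of zeta

W = (2.0, -1.0)                      # initial log odds ratios
traj = [W]
for t in range(30):
    zeta = tuple(random.gauss(0.0, s) for s in sigma_zeta)
    W = (C[0] + H[0][0] * W[0] + H[0][1] * W[1] + zeta[0],
         C[1] + H[1][0] * W[0] + H[1][1] * W[1] + zeta[1])
    traj.append(W)

# Every state on the W-scale maps back to mass-balanced proportions:
denom = 1.0 + math.exp(W[0]) + math.exp(W[1])
P = (math.exp(W[0]) / denom, math.exp(W[1]) / denom, 1.0 / denom)
print(abs(sum(P) - 1.0) < 1e-12)
```

In the MCMC algorithm, this linear-Gaussian structure is what allows efficient full-conditional updates of \(\{\mathbf {W}(t)\}\).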

4 Posterior analysis of a simulated epidemic

4.1 Simulated data

In this section, an epidemic is simulated from each of two processes meant to mimic an H1N1 outbreak in Franklin County, Ohio. The two datasets are from an HSIRS model (defined in Sect. 2) and a modified-CSIRS model (defined below), which we refer to as the HSIRS dataset and the modified-CSIRS dataset, respectively. Henceforth, the time step of the difference equations is \(\varDelta =1\) day. Notice that for epidemics that develop on a different time scale, the value of \(\varDelta \) can easily be adjusted according to the infectious nature of the underlying virus.

4.1.1 The HSIRS dataset

For simulating the HSIRS dataset from the HSIRS model, we select \(\gamma ^*=0.33\) per day as the true value of \(\gamma \), since \(\gamma \) is the reciprocal of the mean duration of the infectious period, which is about 3 days for common influenza (Centers for Disease Control; http://www.cdc.h1n1flu/recommendations.htm). (The inverse relation between rate and average duration follows from the assumption that recovery is a process with constant intensity, that is, a constant rate per unit of time, which implies a negative-exponential density.) Furthermore, in epidemiology, the basic reproduction number, \(R_0\), is defined as the number of secondary infections caused by a single infective, introduced into a population made up entirely of susceptible individuals, over this individual’s course of the infection. Therefore, \(R_0\) can be obtained by

$$\begin{aligned} R_0=\frac{\beta \lambda _N}{\gamma }. \end{aligned}$$
(54)

Typically, \(R_0\) has a value between 1 and 2 for new strains of Influenza A in human communities (e.g., Anderson 2006). Therefore, since \(\lambda _N=1{,}068{,}978\) in the 2000 Census in Franklin County, Ohio, and \(\gamma ^*=0.33\), if we select \(\beta ^*=5.1\times 10^{-7}\) as the true value of \(\beta \), then we obtain \(R_0\approx 1.65\), which mimics a pandemic flu. Furthermore, for illustration, we simply assume \(\phi ^*=0.05\), a typical value of the rate of loss of immunity used for pandemic strains over long time periods (e.g., Bansal et al. 2010).
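As a quick numerical check of (54), the selected true values can be verified to give \(R_0\approx 1.65\); the following illustrative Python snippet (not part of the original analysis) performs the calculation:

```python
# Illustrative check of Eq. (54): R_0 = beta * lambda_N / gamma,
# using the true values selected for the simulation.
beta_star = 5.1e-7        # transmission rate per day
gamma_star = 0.33         # recovery rate per day
lambda_N = 1_068_978      # Franklin County, Ohio (2000 Census)

R0 = beta_star * lambda_N / gamma_star
print(round(R0, 2))       # approximately 1.65
```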

We now turn to the components of variance. Recall that the signal-to-noise ratio (SNR) can be defined as

$$\begin{aligned} \mathrm{SNR}\equiv \frac{\mu }{\sigma }, \end{aligned}$$
(55)

where \(\mu \) is the signal mean and \(\sigma \) is the standard deviation of the noise. We denote \(\mathrm{SNR}_{W_S}(t)\) and \(\mathrm{SNR}_{W_I}(t)\) to be the SNR for the log odds ratios \(W_S(t)\) and \(W_I(t)\), respectively, at time \(t\). Then, for \(t=1,2,\ldots ,T+F\), we have

$$\begin{aligned} \mathrm{SNR}_{W_S}(t)&\equiv \frac{\mu ^W_{S}(t)}{\sigma _{\xi _S}(t)},\end{aligned}$$
(56)
$$\begin{aligned} \mathrm{SNR}_{W_I}(t)&\equiv \frac{\mu ^W_{I}(t)}{\sigma _{\xi _I}(t)}, \end{aligned}$$
(57)

where the right-hand sides of (56) and (57) are given by (27)–(29), and the time points \(T+1,\ldots ,T+F\) represent a forecast period of \(F\) days.

As a baseline at time \(t=1\), we assume the proportions of susceptible and infectious individuals, namely \(P^*_S(1)\) and \(P^*_I(1)\), to be \(0.94\) and \(0.01\), respectively. Hence,

$$\begin{aligned} \mu ^{W*}_{S}(1)&= \mathrm{log}\left( \frac{0.94}{1-0.94-0.01} \right) =2.934,\end{aligned}$$
(58)
$$\begin{aligned} \mu ^{W*}_{I}(1)&= \mathrm{log}\left( \frac{0.01}{1-0.94-0.01} \right) =-1.609, \end{aligned}$$
(59)

which results in

$$\begin{aligned} {\varvec{\mu }}^{W*}(1)=\left( \mu ^{W*}_{S}(1), \mu ^{W*}_{I}(1)\right) '=(2.934,-1.609)', \end{aligned}$$

the initial mean of the hidden log-odds-ratio vector, \(\mathbf {W}(1)\).

For the simulated epidemic, we assume that the true values of SNR at time \(t=1\) are

$$\begin{aligned} \mathrm{SNR}^*_{W_S}(1)=\mathrm{SNR}^*_{W_I}(1)=15. \end{aligned}$$

More choices of SNR are discussed in Sect. 5. Then from (56)–(57), we have

$$\begin{aligned} \sigma ^{2}_{\xi _S}(1)&= \left( 2.934/15\right) ^2=0.038,\end{aligned}$$
(60)
$$\begin{aligned} \sigma ^{2}_{\xi _I}(1)&= \left( -1.609/15\right) ^2=0.012. \end{aligned}$$
(61)

Recall that we assume \(\sigma ^{2}_{\xi _S}(t)=\sigma ^{2}_{\xi _S}\) and \(\sigma ^{2}_{\xi _I}(t)=\sigma ^{2}_{\xi _I}\), for \(t=1,2,\ldots \). Therefore, from (60)–(61), we select \(\sigma ^{2*}_{\xi _S}=0.038\) and \(\sigma ^{2*}_{\xi _I}=0.012\) as the true values of \(\sigma ^{2}_{\xi _S}\) and \(\sigma ^{2}_{\xi _I}\), respectively, and we write \({\varvec{\varSigma }}^*_{\xi }\equiv \mathrm{diag}(\sigma ^{2*}_{\xi _S}, \sigma ^{2*}_{\xi _I})\).
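The baseline calculations in (58)–(61) can be reproduced in a few lines; the following is an illustrative Python sketch:

```python
import math

# Baseline proportions at t = 1; P_R(1) follows from mass balance.
P_S1, P_I1 = 0.94, 0.01
P_R1 = 1.0 - P_S1 - P_I1                 # = 0.05

# Log odds ratios relative to the recovered category, Eqs. (58)-(59).
mu_WS1 = math.log(P_S1 / P_R1)           # 2.934
mu_WI1 = math.log(P_I1 / P_R1)           # -1.609

# SNR = mu/sigma, so sigma^2 = (mu/SNR)^2; Eqs. (60)-(61) with SNR = 15.
SNR = 15.0
sigma2_xi_S = (mu_WS1 / SNR) ** 2        # 0.038
sigma2_xi_I = (mu_WI1 / SNR) ** 2        # 0.012
```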

We simulate daily data for \(T+F=45\) days, where \(T=35\) and \(F=10\). Specifically, for \(t=1\), we simulate

$$\begin{aligned} \mathbf {W}(1)\sim \mathrm{MVN}\left( {\varvec{\mu }}^{W*}(1), {\varvec{\varSigma }}^*_{\xi }\right) ; \end{aligned}$$
(62)

then for \(t=2,\ldots ,45\), we simulate \(\{\mathbf {W}(t)\}\) using (26) and obtain \(\{\mathbf {P}(t)\}\) using transformations defined by (78)–(79) in “Appendix A.1”. Finally, we generate observed counts of susceptible and infectious individuals, \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\), from the Poisson distribution defined in (10)–(11), conditional on \(\{\mathbf {P}(t)\}\). These counts, \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\), represent the HSIRS dataset.
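The generation scheme just described can be sketched as follows. This is an illustrative sketch: the function `step`, which implements one transition of the nonlinear W-scale dynamics in (26), is assumed to be supplied by the user and is not reproduced here; the transformations (78)–(79) and the Poisson data model (10)–(11) are as in the text.

```python
import numpy as np

rng = np.random.default_rng(0)
lambda_N, T_plus_F = 1_068_978, 45

def w_to_p(w):
    # Transformations (78)-(79): map the log odds ratios
    # W = (W_S, W_I)' back to the proportions (P_S, P_I).
    e_s, e_i = np.exp(w[0]), np.exp(w[1])
    denom = 1.0 + e_s + e_i
    return np.array([e_s / denom, e_i / denom])

def simulate_hsirs(step, mu_W1, Sigma_xi):
    # Initial state from Eq. (62); subsequent states via the (assumed)
    # one-step dynamics `step`; counts from the Poisson data model.
    W = np.empty((T_plus_F, 2))
    W[0] = rng.multivariate_normal(mu_W1, Sigma_xi)
    for t in range(1, T_plus_F):
        W[t] = step(W[t - 1])
    P = np.array([w_to_p(w) for w in W])
    Z_S = rng.poisson(lambda_N * P[:, 0])
    Z_I = rng.poisson(lambda_N * P[:, 1])
    return Z_S, Z_I
```

With the true parameter values of this subsection built into `step`, the returned \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\) constitute one realization of the HSIRS dataset.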

4.1.2 The modified-CSIRS dataset

Before we illustrate the procedure for generating the modified-CSIRS dataset, we first define the modified-CSIRS model. Recall that the CSIRS model is deterministic, which we now modify to incorporate uncertainty in the observations and in the parameters. Specifically, the modified-CSIRS model is a hierarchical statistical model that consists of a data model defined by (10)–(11), a deterministic process model defined by the CSIRS model in (21)–(23), and a parameter model. That is, for \(t=1,2,\ldots ,\) in units of \(\varDelta \) days, the modified-CSIRS model can be written as:

Data model:

$$\begin{aligned} Z_S(t)|\lambda _S(t)&\sim \hbox {ind. }\mathrm{Poisson}(\lambda _NP_S(t)),\\ Z_I(t)|\lambda _I(t)&\sim \hbox {ind. }\mathrm{Poisson}(\lambda _NP_I(t)). \end{aligned}$$

Process model:

$$\begin{aligned} P_S(t+1)&= P_S(t)-\beta \lambda _N P_S(t)P_I(t)\varDelta +\phi P_R(t)\varDelta ,\\ P_I(t+1)&= P_I(t)+\beta \lambda _N P_S(t)P_I(t)\varDelta -\gamma P_I(t)\varDelta ,\\ P_R(t+1)&= P_R(t)+\gamma P_I(t)\varDelta -\phi P_R(t)\varDelta . \end{aligned}$$

Parameter model:

$$\begin{aligned} {[}\beta , \gamma , \phi ]=[\beta ][\gamma ][\phi ]. \end{aligned}$$

In this hierarchical statistical model, recall that \(Z_S(t)\) and \(Z_I(t)\) are the observed susceptible and infectious counts, respectively; \(P_S(t),\,P_I(t)\), and \(P_R(t)\) are the hidden true proportions; \(\beta ,\,\gamma \), and \(\phi \) are the transmission rate, the recovery rate, and the loss-of-immunity rate, respectively, per day; and \(\varDelta \) is the time step. As specified at the beginning of this section, \(\varDelta =1\). Notice that in the modified-CSIRS model, the process model is deterministic and does not capture any uncertainty in the hidden epidemic process.

For simulating the modified-CSIRS dataset from the modified-CSIRS model, we likewise simulate daily data for \(T+F=45\) days, where \(T=35\) and \(F=10\). For the unknown parameters, \(\beta ,\,\gamma \), and \(\phi \), we select the same values as in Sect. 4.1.1, and we select the same starting proportions, \(P^*_S(1)=0.94\) and \(P^*_I(1)=0.01\). Then for \(t=2,\ldots ,45\), we simulate \(\{\mathbf {P}(t)\}\) from the deterministic process defined in (21)–(23); and finally we generate \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\) using the Poisson distribution defined in (10)–(11), conditional on \(\{\mathbf {P}(t)\}\). These counts, \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\), represent the modified-CSIRS dataset.
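The modified-CSIRS dataset can be generated with the short sketch below (illustrative Python; the process model is the difference form of (21)–(23) given earlier in this subsection, and the data model is (10)–(11)):

```python
import numpy as np

rng = np.random.default_rng(1)

# True values from Sect. 4.1.1 (per day); Delta = 1 day.
beta, gamma, phi = 5.1e-7, 0.33, 0.05
lambda_N, Delta, T_plus_F = 1_068_978, 1.0, 45

P = np.empty((T_plus_F, 3))               # columns: P_S, P_I, P_R
P[0] = [0.94, 0.01, 0.05]
for t in range(T_plus_F - 1):
    S, I, R = P[t]
    # Deterministic process model (difference form of (21)-(23)).
    P[t + 1] = [S - beta * lambda_N * S * I * Delta + phi * R * Delta,
                I + beta * lambda_N * S * I * Delta - gamma * I * Delta,
                R + gamma * I * Delta - phi * R * Delta]

# Poisson data model (10)-(11), conditional on {P(t)}.
Z_S = rng.poisson(lambda_N * P[:, 0])
Z_I = rng.poisson(lambda_N * P[:, 1])
```

Note that the three increments in the process model cancel, so mass balance (\(P_S+P_I+P_R=1\)) is preserved at every step.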

Notice that the HSIRS dataset mimics noisy observations from an epidemic with uncertainties in the underlying epidemic process, whereas the modified-CSIRS dataset mimics noisy observations from a deterministic epidemic process. Although for each dataset, we simulated data for \(T+F=45\) days, when we fit models, we assume that data are only available on the first \(T=35\) days and are missing in the last \(F=10\) days. Thus, we can assess the performance of these models on forecasting, by comparing their forecasts of \(\lambda _S(t)\) and \(\lambda _I(t)\) to the true values known from simulation.

Figure 1a–d show the daily observed susceptible counts \(\{Z_S(t)\}\) and infectious counts \(\{Z_I(t)\}\), for each of the two datasets, as a function of time (for all 45 days). We use the vertical line on each plot to emphasize that we assume data are only available for the first \(T=35\) days. Comparing these plots, we can clearly see that the epidemic patterns in the modified-CSIRS dataset (Fig. 1c, d) are much smoother than those in the HSIRS dataset. Nevertheless, the two datasets suggest similar epidemic patterns over the 45 days: both clearly indicate an epidemic between day 15 and day 40, and in both the infectious population reaches its peak between day 20 and day 30.

Fig. 1
figure 1

Plots of counts \(\{\mathbf {Z}(t)\}\) for the HSIRS and the modified-CSIRS datasets as a function of time. The left-hand plots a and c show \(\{Z_S(t)\}\), and the right-hand plots b and d show \(\{Z_I(t)\}\). The vertical line in each plot emphasizes that for the analysis the data are only available on days 1–35, and they are missing on days 36–45. a HSIRS dataset, b HSIRS dataset, c modified-CSIRS dataset, d modified-CSIRS dataset

4.2 Fitting the HSIRS model

As mentioned in Sect. 3, we derive a well-calibrated Gaussian linear process (the ASIRS model) to approximate the nonlinear W-scale process in the HSIRS model, which improves computational efficiency in posterior analysis and forecasting. Recall that within the 45-day study period, data are only available on the first \(T=35\) days and are missing for the last \(F=10\) days.

Based on the ASIRS model specified in Sect. 3, the joint posterior distribution of all “unknowns” is proportional to a product of the data model, the process model, and the parameter model. Notice that \(\{W_S(t)\}\) and \(\{W_I(t)\}\) are transformations of \(\{P_S(t)\}\) and \(\{P_I(t)\}\), and hence the conditioning in the data model can be equivalently written in terms of \(\mathbf {W}(t)\). Therefore, combining Eqs. (78)–(79) in “Appendix A.1”, the data model given by (10) and (11) can be written as follows: For \(t=1,\ldots ,T\),

$$\begin{aligned} Z_S(t)|\mathbf {W}(t)\sim \hbox { ind. }\mathrm{Poisson}\left( \frac{\lambda _N \mathrm{exp}(W_S(t))}{1+\mathrm{exp}(W_I(t))+ \mathrm{exp}(W_S(t))}\right) ,\\ Z_I(t)|\mathbf {W}(t)\sim \hbox { ind. }\mathrm{Poisson}\left( \frac{\lambda _N \mathrm{exp}(W_I(t))}{1+\mathrm{exp}(W_I(t))+ \mathrm{exp}(W_S(t))}\right) . \end{aligned}$$

Therefore, the joint posterior distribution of all unknowns is

$$\begin{aligned}&\left[ \beta , \theta _{\gamma },\phi , \sigma ^2_{\zeta _S}, \sigma ^2_{\zeta _I}, \{W_S(t)\}, \{W_I(t)\}|Z_S(1),\ldots ,Z_S(T), Z_I(1), \ldots , Z_I(T)\right] \nonumber \\&\quad \propto \prod _{t=1}^{T}[Z_S(t)|W_S(t),W_I(t)]\cdot \prod _{t=1}^{T}[Z_I(t)|W_S(t), W_I(t)]\cdot [W_S(1), W_I(1)| \sigma ^2_{\zeta _S}, \sigma ^2_{\zeta _I}]\nonumber \\&\qquad \times \left( \prod _{t=2}^{T+F}[W_S(t), W_I(t)|\beta ,\theta _{\gamma }, \phi , W_S(t-1), W_I(t-1), \sigma ^2_{\zeta _S},\sigma ^2_{\zeta _I}]\right) \cdot [\beta ][ \theta _{\gamma }][\phi ][\sigma ^2_{\zeta _S}][\sigma ^2_{\zeta _I}], \end{aligned}$$
(63)

where at \(t=1,\,\mathbf {W}(1)\equiv (W_S(1), W_I(1))'\) has distribution,

$$\begin{aligned} \mathbf {W}(1)|\sigma ^2_{\zeta _S},\sigma ^2_{\zeta _I} \thicksim \mathrm{MVN}\left( {\varvec{\mu }}^W(1),{\varvec{\varSigma }}_{\zeta } \right) , \end{aligned}$$

and recall that \({\varvec{\varSigma }}_{\zeta }= \mathrm{diag}(\sigma ^2_{\zeta _S},\sigma ^2_{\zeta _I}).\) There is strong prior information on what happens at \(t=1\), which allows the hyperparameter \({\varvec{\mu }}^W(1)\) to be specified. For example, in Sect. 4.1, we specify it as,

$$\begin{aligned} {\varvec{\mu }}^W(1)={\varvec{\mu }}^{W*}(1)\equiv (2.934,-1.609)'. \end{aligned}$$

(Recall from Sect. 4.1 that \({\varvec{\mu }}^{W*}(1)\) is the initial mean of the log-odds-ratio vector used for simulating the data.) The parameter model defined in Sect. 2.3 consists of independent prior distributions on each parameter. Regarding the parameter-model specification for \({\varvec{\varSigma }}_{\zeta }\), we specify fairly vague priors for \(\sigma ^2_{\zeta _S}\) and \(\sigma ^2_{\zeta _I}\) by choosing independent Inverse Gamma distributions with hyperparameters, \(a_{\zeta _S}=a_{\zeta _I}=0.25,\,b_{\zeta _S}=b_{\zeta _I}=0.4\). Recall from Sect. 2.3 that we also specify \(\beta \sim \mathrm{Uniform}(0, \beta _{max})\), where \(\beta _{max}\) is given by (34); \(\theta _{\gamma }\sim \mathrm{Normal}(0.33, 0.01)\); \(\phi \sim \mathrm{Uniform}[0, 1]\).

Even under the ASIRS model, the posterior distribution is not available analytically, owing to a normalizing constant that cannot be obtained in closed form. However, we can sample from the posterior distribution using a Markov chain Monte Carlo (MCMC) algorithm with a Gibbs sampler that incorporates Metropolis–Hastings steps where necessary (e.g., Waller et al. 1997). These are based on the full conditional distributions given in “Appendix A.3”.

The MCMC algorithm draws samples cyclically from each full conditional distribution, conditioning on the most recent samples drawn from the other full conditionals. This iterative procedure defines a Markov chain whose stationary distribution is the joint distribution of all the unknowns given the data (i.e., the posterior distribution). Hence, after a “burn-in” number of iterations, we obtain samples from the posterior distribution. In our case, the full conditionals \([\sigma ^2_{\zeta _S}|\mathrm{rest}]\) and \([\sigma ^2_{\zeta _I}|\mathrm{rest}]\) are Inverse Gamma distributions, and \([W_S(t), W_I(t)|\mathrm{rest}]\) for \(t>T\) is a Multivariate Normal (MVN) distribution; here, “rest” represents all other unknowns as well as the data \(\{Z_S(1),\ldots ,Z_S(T)\}\) and \(\{Z_I(1),\ldots ,Z_I(T)\}\) (see “Appendix A.3”). The full conditional distributions of all the other unknowns cannot be simulated from directly, and so Metropolis–Hastings updates are applied (e.g., Robert and Casella 2004).
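To fix ideas, the Metropolis–Hastings update used for such non-standard full conditionals can be sketched generically as follows. This is an illustrative sketch only: the actual full conditional densities are those of “Appendix A.3”, and a random-walk proposal with tuning parameter `scale` is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)

def mh_update(x, log_full_conditional, scale):
    # One random-walk Metropolis-Hastings update of a scalar unknown x,
    # given the log density of its full conditional (up to a constant).
    proposal = x + scale * rng.standard_normal()
    log_ratio = log_full_conditional(proposal) - log_full_conditional(x)
    return proposal if np.log(rng.uniform()) < log_ratio else x
```

One Gibbs cycle then applies `mh_update` (or a direct draw, where the full conditional is Inverse Gamma or MVN) to each unknown in turn, conditioning on the current values of all the others.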

4.3 Posterior analysis and forecasting

4.3.1 Posterior analysis based on the HSIRS model

For the purpose of comparison, we fit the approximate HSIRS (i.e., ASIRS) model to each of the two datasets, and we refer to them as:

  • case AH: fit the ASIRS model to the HSIRS dataset

  • case AM: fit the ASIRS model to the modified-CSIRS dataset

For each of these two cases, we ran an MCMC chain of 30,000 iterations. After a burn-in of 3,000 iterations, we obtained a total of 27,000 samples from the posterior distribution for each case listed above. Notice that this is not a particle-filtering approach, in which new data are used to update current and past posteriors, thereby avoiding the need to re-run the MCMC.

Figure 2 shows the posterior behavior of \(\beta ,\,\gamma ,\,\sigma ^2_{\zeta _S},\,\sigma ^2_{\zeta _I}\), and \(\phi \), for the case AH. Similar figures are obtained for the case AM (not shown). The posterior median, mean, variance, and 95 % Bayes credible interval (95 % CI) for parameters in each of the two cases are shown in Tables 2 and 3. For both cases, we can see that, except for \(\gamma \) (recall that \(\theta _{\gamma }\) in the tables is the logit transformation of \(\gamma \)), the posteriors for the parameters are much tighter than the priors. See also Fig. 3, which shows the prior and posterior distributions of \(\beta \) (chosen as a representative parameter) for both cases. Hence, there is substantial learning about these parameters, but there is almost no learning for \(\gamma \) because it already has a tight prior.

Fig. 2
figure 2

Trace plots for the case AH: a trace plot for \(\beta \); b trace plot for \(\gamma \); c trace plot for \(\sigma ^2_{\zeta _S}\); d trace plot for \(\sigma ^2_{\zeta _I}\); e trace plot for \(\phi \). Red lines in plots ae indicate the true values of the parameters used for simulating the HSIRS dataset, namely, \(\beta ^*,\,\gamma ^*,\,\sigma ^{2*}_{\xi _S},\,\sigma ^{2*}_{\xi _I}\), and \(\phi ^*\), respectively. (Color figure online)

Fig. 3
figure 3

Prior (red line) and Posterior (histogram) distributions of \(\beta \) for a case AH; b case AM. In each plot, blue dashed lines indicate 95 % posterior credible intervals; the green solid line indicates the true value. (Color figure online)

Table 2 Table of parameters and summaries of their priors and posteriors for the case AH
Table 3 Table of parameters and summaries of their priors and posteriors for the case AM

Because the data are simulated, we have the opportunity to compare the posterior samples with the true values. As mentioned in Sect. 4.1, for both datasets, we use \(\beta ^*=5.1\times 10^{-7},\,\gamma ^*=0.33\), and \(\phi ^*=0.05\) as the true values of the transmission rate, the recovery rate, and the loss-of-immunity rate, respectively (in units of per day). Tables 2 and 3 show that ASIRS, the (approximate) hierarchical model, performs well in recovering these parameters, and the posterior 95 % CIs of \(\beta \) and \(\phi \) are much tighter than those of their priors. (The parameter \(\gamma \) already has a tight prior, and its posterior hardly changes.) Figure 2a, b, e show the true values of the three parameters as red lines for the case AH. We find the agreement is excellent, especially for \(\beta \) and \(\gamma \). The same conclusions hold for the case AM (not shown).

Recall that the HSIRS dataset has uncertainties associated with the hidden epidemic process, and we selected \(\sigma ^{2*}_{\xi _S}=0.038,\,\sigma ^{2*}_{\xi _I}=0.012\) as the true values of the small-scale variance components. Tables 2 and 3 indicate that when fitting the ASIRS model to the HSIRS dataset (case AH), the posterior samples of the small-scale variances in the ASIRS model (i.e., \(\sigma ^{2}_{\zeta _S}\) and \(\sigma ^{2}_{\zeta _I}\)) tend to be quite a bit larger than these values. These results are expected since, as explained in Sect. 3, the small-scale variation terms in the ASIRS model have the flexibility to capture higher-order terms not included in the linear approximations.

To assess the general performance of the ASIRS model to approximate the HSIRS model, we use a discrepancy measure, as described in Cressie and Wikle (2011), Section 2.2.2, due to Gelman et al. (1996). Based on a given discrepancy measure, we obtain a posterior predictive p value and perform a diagnostic procedure to determine whether the model fits the data.

Specifically, for the \(k\)th sample from the MCMC, \(k=1,\ldots ,m=27{,}000\), we define a discrepancy measure, \(\psi \left( \mathbf {Z};\varvec{\lambda }^{(k)}\right) \), as follows:

$$\begin{aligned} \psi \left( \mathbf {Z};\varvec{\lambda }^{(k)}\right) \!\equiv \!\sum _{t=1}^{T}\left[ \mathbf {Z}(t)\!-\! \mathrm{E}\left( \mathbf {Z}(t)|{\varvec{\lambda }}^{(k)}(t) \right) \right] '\left( {\varvec{\varSigma }}^{(k)}_Z(t) \right) ^{-1}\left[ \mathbf {Z}(t)\!-\!\mathrm{E}\left( \mathbf {Z}(t)|{\varvec{\lambda }}^{(k)}(t)\right) \right] ,\nonumber \\ \end{aligned}$$
(64)

where \({\varvec{\lambda }}^{(k)} (t)\equiv \left( \lambda _S^{(k)}(t), \lambda _I^{(k)}(t)\right) '\) denotes the vector of true counts at time \(t\) for the \(k\)th MCMC sample, and

$$\begin{aligned} \lambda _S^{(k)}(t)&\equiv \lambda _NP_S^{(k)}(t),\end{aligned}$$
(65)
$$\begin{aligned} \lambda _I^{(k)}(t)&\equiv \lambda _NP_I^{(k)}(t). \end{aligned}$$
(66)

Recall that we assume data are only available on the first \(T=35\) days for each of the two datasets. Based on the Poisson data model defined in (10)–(11) and on the definitions of \({\varvec{\lambda }}^{(k)}(t)\) in (65) and (66),

$$\begin{aligned} {\varvec{\varSigma }}^{(k)}_Z(t)\equiv \mathrm{diag}\left( \lambda _S^{(k)}(t), \lambda _I^{(k)}(t)\right) , \end{aligned}$$

represents the data model’s covariance matrix, and hence (64) takes the form of a Wald statistic. Furthermore, for \(t=1,\ldots ,T\), if we use \(\mathbf {Z}_{rep}(t)\) to denote an independent replicate of the data, then the posterior-predictive distribution of \(\mathbf {Z}_{rep}(t)\) can be defined as (Gelman et al. 1996),

$$\begin{aligned} \left[ \mathbf {Z}_{rep}(t)|\mathbf {Z}(t)\right] =\int \int \left[ \mathbf {Z}_{rep}(t)|\{\mathbf {W}(t)\},{\varvec{\varTheta }} \right] \left[ \{\mathbf {W}(t)\},{\varvec{\varTheta }}|\mathbf {Z}(t) \right] d\{\mathbf {W}(t)\}\,d{\varvec{\varTheta }}, \end{aligned}$$
(67)

where

$$\begin{aligned} {\varvec{\varTheta }}\equiv \{\beta ,\gamma , \phi , \sigma ^2_{\zeta _S}, \sigma ^2_{\zeta _I}\}. \end{aligned}$$

Thus, for the \(k\)th sample, \(\mathbf {Z}^{(k)}_{rep}(t)\) is drawn from the posterior distribution \(\left[ \mathbf {Z}_{rep}(t)|\mathbf {Z}(t)\right] ,\,t=1,\ldots ,T\), and we obtain the discrepancy measure \(\psi \left( \mathbf {Z}^{(k)}_{rep};\varvec{\lambda }^{(k)}\right) \) by replacing \(\mathbf {Z}\) in (64) with \(\mathbf {Z}^{(k)}_{rep}(t)\) defined in (67). Therefore, the replicates,

$$\begin{aligned} \{\mathbf {Z}^{(1)}_{rep}(t),\mathbf {Z}^{(2)}_{rep}(t), \ldots , \mathbf {Z}^{(m)}_{rep}(t)\}, \end{aligned}$$

should “look like” the data \(\mathbf {Z}(t)\) if the model is appropriate. Based on this idea, we can apply posterior-predictive diagnostics to the case AH and the case AM. Since the HSIRS dataset is simulated from the HSIRS model, we can also assess the performance of the ASIRS model in approximating the HSIRS model.

The posterior predictive p value is obtained as follows (Gelman et al. 1996):

$$\begin{aligned} \mathrm{Pr}\left( \psi (\mathbf {Z}_{rep};\varvec{\lambda }) \ge \psi (\mathbf {Z};\varvec{\lambda })\right) = \frac{1}{m}\sum _{k=1}^{m}\mathrm{I}\left[ \psi \left( \mathbf {Z}^{(k)}_{rep};\varvec{\lambda }^{(k)}\right) \ge \psi \left( \mathbf {Z};\varvec{\lambda }^{(k)}\right) \right] , \end{aligned}$$
(68)

where \(\mathrm{I}(\cdot )\) denotes an indicator function.

Based on (68), we obtain the posterior predictive p value for the case AH to be 0.458; hence, there is no striking evidence for a lack of model fit when fitting the ASIRS model to the HSIRS dataset. This conclusion is further supported by Fig. 4, which plots \(\psi \left( \mathbf {Z}^{(k)}_{rep};\varvec{\lambda }^{(k)}\right) \) against \(\psi \left( \mathbf {Z};\varvec{\lambda }^{(k)}\right) \) for the case AH; there is no striking departure from the 45-degree line. Indeed, the same is true for the case AM, and we conclude that there is no evidence for lack of fit for the ASIRS model when fitted to either dataset. That is, the ASIRS model appears to have considerable flexibility.
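The diagnostic computation can be summarized in the following sketch: given posterior draws of the true-count trajectories, it forms the Wald-type discrepancy (64), draws replicate datasets from the Poisson data model, and evaluates (68). (Illustrative Python; the array shapes are assumptions.)

```python
import numpy as np

rng = np.random.default_rng(3)

def discrepancy(Z, lam):
    # Eq. (64): Z and lam are (T, 2) arrays of counts and true means
    # (lambda_S, lambda_I); the data-model covariance is diag(lam).
    return np.sum((Z - lam) ** 2 / lam)

def posterior_predictive_pvalue(Z, lam_samples):
    # Eq. (68): lam_samples has shape (m, T, 2), one trajectory per
    # MCMC draw; each replicate Z_rep is drawn from Poisson(lam).
    exceed = 0
    for lam in lam_samples:
        Z_rep = rng.poisson(lam)
        exceed += discrepancy(Z_rep, lam) >= discrepancy(Z, lam)
    return exceed / len(lam_samples)
```

A p value near 0 or 1 signals a discrepancy between the data and the replicates; values in the middle of (0, 1), such as the 0.458 reported above, indicate no evidence of lack of fit.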

Fig. 4
figure 4

Scatter plot of \(\psi \left( \mathbf {Z};\varvec{\lambda }^{(k)}\right) \) versus \(\psi \left( \mathbf {Z}^{(k)}_{rep};\varvec{\lambda }^{(k)}\right) \) for the case AH

4.3.2 Comparisons of the fitted ASIRS model and the fitted CSIRS model

We now fit the (discrete-time) CSIRS model to each of the two simulated datasets described in Sect. 4.1. In practice, estimates of \(\beta ,\,\gamma \), and \(\phi \), denoted as \(\hat{\beta },\,\hat{\gamma }\), and \(\hat{\phi }\), are needed in order to solve for \(\{S(t)\}\) and \(\{I(t)\}\) in Eqs. (6)–(8). Here, we minimize the sum of squares between the estimated and observed infectious counts over time (e.g., Anderson and May 1991; Wearing et al. 2005; Burr and Chowell 2006). That is,

$$\begin{aligned} (\hat{\beta },\hat{\gamma },\hat{\phi })= \mathrm{arg}\min _{(\beta ,\gamma ,\phi )}\left[ \sum _{t=1}^{T} \left( \hat{I}(t;\beta ,\gamma ,\phi )-Z_I(t)\right) ^2\right] , \end{aligned}$$
(69)

where \(\hat{I}(t; \beta ,\gamma ,\phi )\) is the deterministic estimate of infectious counts at time \(t\), obtained after substituting \(\beta ,\,\gamma \), and \(\phi \) into Eqs. (6)–(8). Recall that here \(T=35\); also notice that these least-squares estimates, \(\hat{\beta },\,\hat{\gamma }\), and \(\hat{\phi }\), are the values selected in Sect. 3 for the initial MCMC values (denoted \(\beta _0,\,\gamma _0\), and \(\phi _0\) there). Finally, the CSIRS forecasts are given by (6)–(8) for \(t=T+1,\ldots ,T+F;\) using obvious notation, they are \(\{\hat{S}(t; \hat{\beta }, \hat{\gamma }, \hat{\phi }):t=T+1,\ldots ,T+F\},\,\{\hat{I}(t; \hat{\beta }, \hat{\gamma }, \hat{\phi }):t=T+1,\ldots ,T+F\}\), and \(\{\hat{R}(t; \hat{\beta }, \hat{\gamma }, \hat{\phi }):t=T+1,\ldots ,T+F\}\), where \(T=35\) and \(F=10\). For \(t=1,\ldots ,T\), the observed counts are taken to be the CSIRS-predicted counts.
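The least-squares fit in (69) can be sketched as below, with the forward trajectory computed from the discrete-time CSIRS recursions. The optimizer choice (Nelder–Mead via SciPy) and the starting values are illustrative assumptions, not the authors’ exact implementation.

```python
import numpy as np
from scipy.optimize import minimize

lambda_N, Delta, T = 1_068_978, 1.0, 35

def infectious_trajectory(beta, gamma, phi, P1=(0.94, 0.01, 0.05)):
    # Deterministic infectious counts I-hat(t; beta, gamma, phi) from
    # the discrete-time CSIRS recursions (cf. Eqs. (6)-(8)).
    S, I, R = P1
    out = np.empty(T)
    for t in range(T):
        out[t] = lambda_N * I
        S, I, R = (S - beta * lambda_N * S * I * Delta + phi * R * Delta,
                   I + beta * lambda_N * S * I * Delta - gamma * I * Delta,
                   R + gamma * I * Delta - phi * R * Delta)
    return out

def fit_csirs(Z_I, theta0=(5.0e-7, 0.3, 0.05)):
    # Minimize the sum of squares in Eq. (69) over (beta, gamma, phi).
    sse = lambda th: np.sum((infectious_trajectory(*th) - Z_I) ** 2)
    return minimize(sse, theta0, method="Nelder-Mead").x
```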

Now, for the HSIRS model (approximated by ASIRS), the appropriate predictions are obtained from posterior inference on \(\lambda _S(t)\), and \(\lambda _I(t)\), for the entire study period \(t=1,\ldots ,T,T+1,\ldots ,T+F\), but based only on data from \(t=1,\ldots ,T\). The Bayesian predictions are actually given for \(P_S(t)\) and \(P_I(t)\); then to obtain ASIRS-predicted counts, Eqs. (13)–(15) along with mass balance given by (17) are used.

In the infectious-disease setting, the number of infectious individuals is one of the most important quantities of interest. The estimated trajectories given in Fig. 5 show infectious counts for the entire 45-day study period, obtained by fitting the ASIRS model and the CSIRS model to each of the two datasets. These are compared to the true (hidden) values of infectious counts. Consider the modified-CSIRS dataset: upon inspection of Fig. 5b, we see that both models can capture the overall epidemic pattern very well, not only during the first 35 days when data are available, but also in the last 10 days when there are no data. Now consider the HSIRS dataset: when the underlying epidemic process has stochastic components, the disadvantage of the CSIRS model becomes apparent; upon inspection of Fig. 5a, estimates from fitting the CSIRS model are oversmoothed, even on the days when data are available. In contrast, the agreement between the true value and the posterior median of \(\{\lambda _I(t)\}\), obtained from fitting the ASIRS model, is excellent when data are available. On the last 10 days when there are no data, the hierarchical model is able to predict the general downward trend at the end of the epidemic process; however, the CSIRS model is unable to deal with the uncertainties, mistakenly forecasting that the epidemic is maintained in the last 10 days.

Fig. 5
figure 5

Estimated infectious counts obtained from fitting the ASIRS model and the CSIRS model to the HSIRS and modified-CSIRS datasets, as a function of time. The vertical line in each plot emphasizes that data are only available on days 1–35, and that they are missing on days 36–45. The blue solid line is the posterior median of \(\{\lambda _I(t)\}\) obtained from the ASIRS model. The black thin line is the estimate of the infectious counts from the CSIRS model. The pink stars give the true infectious counts. a Results based on HSIRS dataset (infectious), b results based on modified-CSIRS dataset (infectious). (Color figure online)

Another disadvantage of the CSIRS model is that it is unable to provide any uncertainty measures to accompany its deterministic-modeling strategy. In contrast, when fitting the hierarchical model, we can obtain uncertainty measures for any quantity of interest, based on its posterior distribution. For example, Fig. 6 shows the 0.025, 0.25, 0.5, 0.75, and 0.975 quantiles of the posterior distributions of the hidden infectious counts, \(\lambda _I(t)\), during the forecasting period \(\{T+1,\ldots ,T+F\}\). We can see that in all cases, the posterior 50 % CI of \(\lambda _I(t)\) obtained from fitting the ASIRS model covers the true values at all times.
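Such quantile bands are straightforward to compute from the MCMC output; an illustrative sketch follows (the array shape is an assumption):

```python
import numpy as np

def forecast_quantiles(lam_I_draws):
    # lam_I_draws: (m, F) posterior draws of lambda_I(t) over the
    # forecast days t = T+1, ..., T+F.  Rows of the result are the
    # 0.025, 0.25, 0.5, 0.75, and 0.975 quantile bands (cf. Fig. 6).
    return np.percentile(lam_I_draws, [2.5, 25, 50, 75, 97.5], axis=0)
```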

Fig. 6
figure 6

Posterior quantiles of \(\{\lambda _I(t)\}\) for the infectious population, obtained from the ASIRS model during the forecasting period from days 36 to 45 (based on data from day 1 to day 35, from the respective datasets). Lines (from bottom to top) indicate the 0.025, 0.25, 0.5, 0.75 and 0.975 quantiles, and pink stars give the true values of \(\{\lambda _I(t)\}\). a Case AH forecasting (infectious), b case AM forecasting (infectious). (Color figure online)

Now consider the susceptible counts. Figures 7 and 8 clearly indicate that the ASIRS model performs better than the CSIRS model when fitted to the HSIRS dataset.

Fig. 7
figure 7

Estimated susceptible counts obtained from fitting the ASIRS model and the CSIRS model to the HSIRS and modified-CSIRS datasets, as a function of time. The vertical line in each plot emphasizes that data are only available on days 1–35, and that they are missing on days 36–45. The blue solid line is the posterior median of \(\{\lambda _S(t)\}\) obtained from the ASIRS model. The black thin line is the estimate of the susceptible counts from the CSIRS model. The pink stars give the true susceptible counts. a Results based on HSIRS dataset (susceptible), b results based on modified-CSIRS dataset (susceptible). (Color figure online)

Fig. 8
figure 8

Posterior quantiles of \(\{\lambda _S(t)\}\) for the susceptible population, obtained from the ASIRS model during the forecasting period from day 36 to day 45 (based on data from day 1 to day 35, from the respective datasets). Lines (from bottom to top) indicate the 0.025, 0.25, 0.5, 0.75 and 0.975 quantiles, and pink stars give the true values of \(\{\lambda _S(t)\}\). a Case AH forecasting (susceptible), b case AM forecasting (susceptible). (Color figure online)

The analyses given in this section illustrate the advantages and the practicalities associated with a hierarchical statistical approach to inferring the dynamical evolution of an infectious disease. In the next section, we confirm those inferential advantages through a simulation experiment.

5 A simulation experiment

In this section, a simulation experiment is presented to compare the performances of the HSIRS model (actually, the approximate ASIRS model) and the CSIRS model using design-based criteria under various factor combinations (i.e., a factorial experimental design), assuming that data are from an HSIRS model (defined in Sect. 2). That is, the processes \(\{\mathbf {W}(t)\}\) and \(\{\mathbf {P}(t)\}\) are simulated according to (26) and (78)–(79), from which the data processes \(\{Z_S(t)\}\) and \(\{Z_I(t)\}\) are simulated from the Poisson distributions defined in (10)–(11), conditional on \(\{\mathbf {P}(t)\}\). As in Sect. 4.1.1, we set the true values of the parameters in (26) as \(\gamma ^*=0.33\) and \(\phi ^*=0.05\), and we simulate the baseline \(\mathbf {W}(1)\) from (62) with \({\varvec{\mu }}^{W*}(1)=(2.934,-1.609)'\). The baseline small-scale variation \({\varvec{\varSigma }}^*_{\xi }\) in (62), the values of \(T\) and \(F\), and the parameter \(\beta \) in (26) are chosen in ways that relate to factors in the experiment and will be described in Sect. 5.1. For each of the factor combinations, we simulate \(L=100\) replications.

5.1 Factors of the simulation experiment

Four factors are considered in this simulation experiment: fitted model (FM) is a factor that compares the HSIRS model and the CSIRS model; in the terminology of experimental design, this is considered the “treatment.” Three other factors related to the data-generating schemes are also included; these are data information (DI), small-scale variation (SV), and transmission rate (TR). The details of these four factors are now presented.

Fitted Model (FM)

The factor FM has two levels, where FM \(=\) 0 represents the CSIRS model and FM \(=\) 1 represents the HSIRS model. The HSIRS model is specified by (10), (11), and (26) in Sect. 2, and by (78), (79), and (80) in “Appendix A.1”. The CSIRS model is specified by (6)–(8) in Sect. 1.

Data Information (DI)

Two different levels of DI are considered. One is where there are data for 15 days (DI \(=\) 0), and the other for 35 days (DI \(=\) 1). In each situation, we perform forecasting for up to 10 days beyond the last day where there were data. The choice of 15 days (DI \(=\) 0) was based on CDC’s response to the 2009 H1N1 pandemic. According to “The 2009 H1N1 Pandemic: Summary Highlights, April 2009–April 2010” (CDC: http://www.cdc.gov/h1n1flu/cdcresponse.html), an H1N1 infection was first detected in the US in a 10-year-old patient in California on April 15, 2009; 15 days later, that is, on April 29, 2009, WHO raised the influenza pandemic alert from phase 4 to phase 5, signaling that a pandemic was imminent. We selected 35 days (DI \(=\) 1) because, in retrospect, the 2009 H1N1 pandemic took 35 days to develop from the time the first case occurred in Mexico (Fig. 9, from “Outbreak of Swine-Origin Influenza A (H1N1) Virus Infection—Mexico, March–April 2009”; CDC: http://www.cdc.gov/mmwR/preview/mmwrhtml/mm58d0430a2.html).

Fig. 9

A figure from “Outbreak of Swine-Origin Influenza A (H1N1) Virus Infection—Mexico, March–April 2009,” Centers for Disease Control and Prevention (CDC)

Small-scale Variation (SV)

Two different levels of SV are considered. Recall the definitions of signal-to-noise ratios, \(\mathrm{SNR}_{W_S}(t)\) and \(\mathrm{SNR}_{W_I}(t)\), in Eqs. (56) and (57) in Sect. 4.1.1. These are put equal to each other, and the two levels are SNR \(=\) 15 (SV \(=\) 0) and SNR \(=\) 5 (SV \(=\) 1). Consider the initial time \(t=1\). Then for SV \(=\) 0,

$$\begin{aligned} \mathrm{SNR}_{W_S}(1)= \mathrm{SNR}_{W_I}(1)=15; \end{aligned}$$

hence (60)–(61) give \(\sigma ^{2}_{\xi _S}(1)=0.038,\, \sigma ^{2}_{\xi _I}(1)=0.012\). For SV \(=\) 1,

$$\begin{aligned} \mathrm{SNR}_{W_S}(1)= \mathrm{SNR}_{W_I}(1)=5; \end{aligned}$$

then from (56)–(57), we have

$$\begin{aligned} \sigma ^{2}_{\xi _S}(1)&= \left( 2.934/5\right) ^2=0.344,\\ \sigma ^{2}_{\xi _I}(1)&= \left( -1.609/5\right) ^2=0.104. \end{aligned}$$

To control this factor in the simulation, we assume henceforth that \(\sigma ^{2}_{\xi _S}(t)\equiv \sigma ^{2}_{\xi _S}(1)\) and \(\sigma ^{2}_{\xi _I}(t)\equiv \sigma ^{2}_{\xi _I}(1)\), for \(t=1,2,\ldots \). Therefore, SV \(=\) 0 represents {\(\sigma ^{2}_{\xi _S}(t)=0.038,\,\sigma ^{2}_{\xi _I}(t)=0.012\), SNR \(=\) 15}, and SV \(=\) 1 represents {\(\sigma ^{2}_{\xi _S}(t)=0.344,\,\sigma ^{2}_{\xi _I}(t)=0.104\), SNR \(=\) 5}.
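The small-scale variances quoted above can be reproduced directly from the two SNR levels. The sketch below assumes \(\sigma_{\xi}(1) = |\mu^W(1)|/\mathrm{SNR}\), which is consistent with the quoted values; the formal definitions are Eqs. (56)–(57) of the paper and are not reproduced here.

```python
# Small-scale variances implied by the two SNR levels at t = 1, assuming
# sigma_xi(1) = |mu_W(1)| / SNR (consistent with the values quoted in the text).
mu_W1 = (2.934, -1.609)   # baseline mean vector mu_W(1)

def xi_variances(snr, mu=mu_W1):
    """Return (sigma2_xi_S(1), sigma2_xi_I(1)) for a common SNR."""
    return tuple((abs(m) / snr) ** 2 for m in mu)

for snr in (15, 5):
    s2_S, s2_I = xi_variances(snr)
    print(f"SNR = {snr}: sigma2_xi_S(1) = {s2_S:.3f}, sigma2_xi_I(1) = {s2_I:.3f}")
```

Running this recovers the pairs (0.038, 0.012) for SNR \(=\) 15 and (0.344, 0.104) for SNR \(=\) 5.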

Transmission Rate (TR)

Two levels of the transmission rate (TR) per unit time are considered: \(\beta =3.3958\times 10^{-7}\) (TR \(=\) 0) and \(\beta =5.8654\times 10^{-7}\) (TR \(=\) 1). These two levels of \(\beta \) result in the basic reproduction number \(R_0=1.1\) and \(1.9\), respectively, given by Eq. (54). According to Anderson (2006), \(R_0\) usually has a value between 1 and 2 for new strains of influenza in the human community. Therefore, by selecting these two levels of TR, we are aiming to mimic scenarios with small and large transmission rates, respectively, but still within the usual range.
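As a consistency check on these two levels, one can assume the standard SIR relation \(R_0 = \beta N/\gamma\) (the exact form of Eq. (54) is not reproduced in this excerpt) and back out the population size implied by the TR \(=\) 0 level:

```python
# Cross-check of the two transmission-rate levels against R0, assuming the
# standard relation R0 = beta * N / gamma (a stand-in for Eq. (54)).
gamma = 0.33

beta_0 = 3.3958e-7   # TR = 0, stated to give R0 = 1.1
beta_1 = 5.8654e-7   # TR = 1, stated to give R0 = 1.9

N_implied = 1.1 * gamma / beta_0          # population consistent with TR = 0
R0_check = beta_1 * N_implied / gamma     # should then be close to 1.9
print(N_implied, R0_check)
```

Under this assumed relation, the implied population (about \(1.07\times 10^6\)) makes the second level of \(\beta\) yield \(R_0\approx 1.9\), as stated.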

5.2 Results of the simulation experiment

In order to compare the approximate HSIRS model and the CSIRS model, we define a response variable based on the empirical mean squared prediction error (MSPE) in each simulation run: Let \(T^O\) denote the times at which data are observed, and let \(T^M\) denote the times at which data are missing; note that in our case, \(T^M\) is a 10-day forecast period that follows \(T^O\). Let \(P_X(t,l)\) denote the \(l\)-th simulated realization of the underlying true rate of the population in subgroup \(X\) [see Eqs. (13)–(15) for the definition of “true rate”]; in our case, \(X\) denotes either the susceptible (S) or the infectious (I) population. Let \(\hat{P}_X(t,l)\) denote a generic predictor of \(P_X(t,l)\). Note that \(P_X(t,l)\) contains the same information as \(\lambda _X(t,l)\), which denotes the \(l\)-th realization of the simulated true counts for the population in subgroup \(X\), since \(\lambda _X(t,l)=\lambda _NP_X(t,l)\) and \(\lambda _N\) is a known constant total population. Then we define

$$\begin{aligned} MSPE _X(T^*,l)&\equiv \frac{1}{|T^*|}\sum _{t\in T^*}\left( \hat{P}_X(t,l)-P_X(t,l)\right) ^2;\quad l\!=\!1,\ldots ,L, \end{aligned}$$
(70)

where \(T^*=T^O\) or \(T^M,\,|T^*|\) is the number of days in the time period \(T^*\), and \(L\) is the total number of simulation runs for each of the factor combinations in this study. The bias of the predictor can be studied through

$$\begin{aligned} BIAS_X(T^*,l)\equiv \frac{1}{|T^*|}\sum _{t\in T^*}\left( \hat{P}_X(t,l)-P_X(t,l)\right) ;\quad l=1,\ldots ,L. \end{aligned}$$
(71)
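The two criteria (70)–(71) are straightforward to compute per replicate; a minimal sketch:

```python
import numpy as np

def mspe(P_hat, P_true):
    """Empirical MSPE over a window T* for one replicate, as in (70)."""
    P_hat, P_true = np.asarray(P_hat, float), np.asarray(P_true, float)
    return np.mean((P_hat - P_true) ** 2)

def bias(P_hat, P_true):
    """Empirical bias over a window T* for one replicate, as in (71)."""
    P_hat, P_true = np.asarray(P_hat, float), np.asarray(P_true, float)
    return np.mean(P_hat - P_true)
```

Here `P_hat` and `P_true` hold the predicted and simulated true rates over the days in \(T^O\) or \(T^M\), for one subgroup \(X\) and one replicate \(l\).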

We performed an analysis of variance (ANOVA) to investigate which factors are important and under which scenarios the approximate HSIRS model provides substantial improvement over the CSIRS model. First, we consider MSPE. Because of the skewness and mean-variance dependence in \(\left\{ MSPE _X(T^*,l)\right\} \), we apply a fourth-root transformation to the response (Cressie et al. 2010). The boxplots in Figs. 10 and 11 show that the distribution of the differences of the fourth-root transformations (superscripts “C” and “H” denote, respectively, analysis under the CSIRS model and under the approximate HSIRS model):

$$\begin{aligned} \left\{ MSPE ^C_X(T^*,l)^{1/4}\!-\! MSPE ^H_X(T^*,l)^{1/4}:l\!=\!1,\ldots ,L;\, X\!=\!S,I; \,T^*\!=\!T^O, T^M \right\} , \end{aligned}$$

is suitable for a classical ANOVA. We then define

$$\begin{aligned} A^Y_X(T^*)\equiv \mathrm{ave}_l\left\{ MSPE _X(T^*,l)^{1/4}\right\} ; \,X=S,I; Y=C,H; \, T^*=T^O,T^M, \end{aligned}$$
(72)

where the average is taken over \(L=100\) replications for each of the factor combinations. We construct the “response” variable of the ANOVA study as \(A^C_X(T^*)-A^H_X(T^*)\), for \(X=S\) and \(I\). These paired comparisons will tell us for which factor combinations the approximate HSIRS model shows improvement over the CSIRS model. Four ANOVAs showing up to two-way interactions are reported in Tables 4, 5, 6 and 7; they are based on the response variables, \(A^C_S(T^O)-A^H_S(T^O),\,A^C_S(T^M)-A^H_S(T^M),\,A^C_I(T^O)-A^H_I(T^O)\), and \(A^C_I(T^M)-A^H_I(T^M)\), respectively.
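Constructing the ANOVA response (72) from the replicated MSPEs can be sketched as follows; a positive value indicates that the approximate HSIRS model outperforms the CSIRS model for that factor combination:

```python
import numpy as np

def anova_response(mspe_C, mspe_H):
    """Paired ANOVA response A^C - A^H of (72): average the fourth-root-
    transformed MSPEs over the L replicates under each model, then
    take the difference. A positive value favors the HSIRS model."""
    A_C = np.mean(np.asarray(mspe_C, float) ** 0.25)
    A_H = np.mean(np.asarray(mspe_H, float) ** 0.25)
    return A_C - A_H
```

In the experiment, `mspe_C` and `mspe_H` would be the length-\(L\) vectors of \(MSPE_X(T^*,l)\) obtained under the CSIRS and approximate HSIRS analyses, respectively, for a fixed \(X\) and \(T^*\).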

Fig. 10

Boxplots of differences, \(\left\{ MSPE ^C_S(T^*,l)^{1/4}- MSPE ^H_S(T^*,l)^{1/4}:l=1,\ldots ,L\right\} \), for the susceptible population, at given levels of DI (Data Information), TR (Transmission Rate), and SV (Small-scale Variation). Left panels \(T^*=T^O\); right panels \(T^*=T^M\). Upper panels DI \(=\) 0; lower panels DI \(=\) 1. The x axis labels are ‘ab’, where ‘a’ (=0 and 1) represents the level for the factor TR, and ‘b’ (=0 and 1) represents the level for the factor SV. The triangles show the mean of the differences

Fig. 11

Boxplots of differences, \(\left\{ MSPE ^C_I(T^*,l)^{1/4}- MSPE ^H_I(T^*,l)^{1/4}:l=1,\ldots ,L\right\} \), for the infectious population, at given levels of DI (Data Information), TR (Transmission Rate), and SV (Small-scale Variation). Left panels \(T^*=T^O\); right panels \(T^*=T^M\). Upper panels DI \(=\) 0; lower panels DI \(=\) 1. The x axis labels are ‘ab’, where ‘a’ (=0 and 1) represents the level for the factor TR, and ‘b’ (=0 and 1) represents the level for the factor SV. The triangles show the mean of the differences

Table 4 Analysis of variance (ANOVA) on \(A^C_S(T^O)-A^H_S(T^O)\) up to two-way interactions
Table 5 Analysis of variance (ANOVA) on \(A^C_I(T^O)-A^H_I(T^O)\) up to two-way interactions
Table 6 Analysis of variance (ANOVA) on \(A^C_S(T^M)-A^H_S(T^M)\) up to two-way interactions
Table 7 Analysis of variance (ANOVA) on \(A^C_I(T^M)-A^H_I(T^M)\) up to two-way interactions

Table 4 gives the paired-comparison ANOVA for the susceptible population, for \(T^*=T^O\). It indicates that the main effect of the small-scale variation factor, SV, explains over 65 % of the variability in the response. We also find that the difference, \(A^C_S(T^O)-A^H_S(T^O)\), increases by 83 % when SV \(=\) 1, relative to its value when SV \(=\) 0. The remaining variability is mostly explained by the main effects of DI and TR. Figure 12 supports the results shown in Table 4, where the red and blue lines show that the SVTR, SVDI, and TRDI interactions for \(A^C_S(T^O)-A^H_S(T^O)\) are not obvious. Moreover, from Fig. 12, we see that for all factor combinations, the approximate HSIRS model always outperforms the CSIRS model, for \(T^*=T^O\), since the responses are always greater than zero. Notice that they increase as the data record becomes longer (DI \(=\) 1), as the SNR becomes smaller (SV \(=\) 1), and as the transmission rate becomes larger (TR \(=\) 1), which are all reasonable results that match our intuition. Clearly, as more variability is introduced into the pandemic and its observed counts, the relative inability of the CSIRS model to handle it becomes more obvious.

Fig. 12

Plots showing interaction for the susceptible population, between SV (Small-scale Variation) and TR (Transmission Rate), between SV and DI (Data Information), and between TR and DI in the ANOVA of \(A^C_S(T^*)-A^H_S(T^*)\) (red and blue lines \(T^*=T^O\); green and black lines \(T^*=T^M\)). (Color figure online)

Table 5 gives the paired-comparison ANOVA for the infectious population, again for \(T^*=T^O\). It indicates that SV and TR are the two most important factors; their main effects together explain over 89 % of the variability in the response. By investigating the main effects of SV and TR, we find that the difference, \(A^C_I(T^O)-A^H_I(T^O)\), increases by 85 % when SV \(=\) 1, relative to its value at SV \(=\) 0; and it increases by 81 % when TR \(=\) 1, relative to its value at TR \(=\) 0. The pattern is broadly consistent with that for the susceptible population, and the responses increase when the data record becomes longer (DI \(=\) 1), as we expect. These results are supported further by Fig. 13, where the red and blue lines show the SVTR, SVDI, and TRDI interactions for \(A^C_I(T^O)-A^H_I(T^O)\); it again indicates no evident interactions. Importantly, Fig. 13 shows that for \(T^*=T^O\) and all factor combinations, the approximate HSIRS model outperforms the CSIRS model.

Fig. 13

Plots showing interaction for the infectious population between SV (Small-scale Variation) and TR (Transmission rate), between SV and DI (Data Information), and between TR and DI in the ANOVA of \(A^C_I(T^*)-A^H_I(T^*)\) (red and blue lines: \(T^*=T^O\); green and black lines: \(T^*=T^M\)). (Color figure online)

Forecasting in the period \(T^M\) is of more interest, so we turn our attention to the susceptible population, Table 6, and the ANOVA of \(A^C_S(T^M)-A^H_S(T^M)\). It indicates that DI is the dominant factor, explaining over 60 % of the variability in the response. We also find that the difference, \(A^C_S(T^M)-A^H_S(T^M)\), increases by 145 % when DI \(=\) 1, relative to its value when DI \(=\) 0. The remaining variability is mostly explained by the main effect of SV and its interaction with TR. The green and black lines in Fig. 12 show the SVTR, SVDI, and TRDI interactions for \(A^C_S(T^M)-A^H_S(T^M)\). They support the results shown in Table 6, namely that the SVTR interaction is more evident than the other two interactions. We see that when TR \(=\) 0, the value of \(A^C_S(T^M)-A^H_S(T^M)\) increases substantially when the SNR becomes smaller (SV \(=\) 1), in contrast to very little change between the two levels of SV when TR \(=\) 1. From Fig. 12, the values of \(A^C_S(T^M)-A^H_S(T^M)\) are always greater than zero, and they increase as the SNR becomes smaller (SV \(=\) 1) or as the data record becomes longer (DI \(=\) 1), which matches our intuition. This again indicates that for the susceptible population, the approximate HSIRS model always outperforms the CSIRS model holding all other factors the same, even though the difference is not as pronounced as during the period \(T^O\). Since forecasting is inherently difficult, this is not surprising.

Table 7 gives the paired-comparison ANOVA for the infectious population in the period \(T^M\): the SVTR interaction has the largest impact on the response and explains about 50 % of the variability. The main effects of TR, SV, and the SVDI interaction explain most of the remaining variation. These results are supported further by Fig. 13, where the green and black lines show the SVTR, SVDI, and TRDI interactions for \(A^C_I(T^M)-A^H_I(T^M)\). We see that the SVTR interaction is similar to that for the susceptible population, but it is more evident, because when TR \(=\) 1, the value of \(A^C_I(T^M)-A^H_I(T^M)\) decreases as the SNR becomes smaller (SV \(=\) 1). A similar pattern is seen for the SVDI interaction; however, the TRDI interaction is not as evident as the other two interactions. Notice that at both levels of DI, the values of \(A^C_I(T^M)-A^H_I(T^M)\) are larger when TR is smaller (TR \(=\) 0). Importantly, Fig. 13 shows that for \(T^*=T^M\) and all factor combinations, the values of \(A^C_I(T^M)-A^H_I(T^M)\) are always greater than zero; that is, the approximate HSIRS model always outperforms the CSIRS model.

To investigate further the performances of the approximate HSIRS model and the CSIRS model, we turn our attention to bias [as specified in (71)] and define the fraction of MSPE that can be explained by the squared bias as

$$\begin{aligned} \kappa ^Y_X(T^*,l)\!\equiv \! \frac{BIAS^Y_X(T^*,l) ^2}{ MSPE ^Y_X(T^*,l)}; \,X\!=\!S,I;\quad T^*\!=\!T^O,T^M;\,\, Y\!=\!H,C; \,\, l\!=\!1,\ldots ,L. \end{aligned}$$
(73)

Notice that \(0\le \kappa ^Y_X(T^*,l)\le 1\). The boxplots in Figs. 14 and 15 show the distributions of \(\{\kappa ^H_S(T^*,l)\}\) and \(\{\kappa ^H_I(T^*,l)\}\), respectively, for all factor combinations. For both the susceptible and infectious populations, Figs. 14 and 15 show that the fractions of MSPE explained by the squared bias are much smaller (around 0.1 or less) in the period \(T^O\) than those in the period \(T^M\) (around 0.5), for all factor combinations. Again because forecasting is inherently difficult, this is not surprising. By comparing Figs. 14 and 15 with Figs. 10 and 11, we see that those factor combinations where squared bias is relatively small are the same combinations where the approximate HSIRS model yields much better MSPE than that for the CSIRS model. In particular, Figs. 14 and 15 clearly show that in the period \(T^O\), the responses decrease (or do not increase) as the data record becomes longer (DI \(=\) 1), as the SNR becomes smaller (SV \(=\) 1), and as the transmission rate becomes larger (TR \(=\) 1), which supports our conclusion that the approximate HSIRS model has better predictive capability even when more variability is introduced into the pandemic through its observed counts. Figures 14 and 15 indicate strong interactions among DI, SV, and TR in the period \(T^M\), which agrees with the conclusions drawn from Figs. 12 and 13.
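Computing (73) per replicate is immediate; note that \(\kappa\) is guaranteed to lie in \([0,1]\) because the mean of the squared errors is at least the square of the mean error:

```python
def kappa(bias_val, mspe_val):
    """Fraction of MSPE explained by squared bias, as in (73).
    Lies in [0, 1]: the mean of squared errors is at least the
    square of the mean error (Jensen's inequality)."""
    return bias_val ** 2 / mspe_val
```

Here `bias_val` and `mspe_val` are the outputs of (71) and (70) for the same subgroup \(X\), window \(T^*\), model \(Y\), and replicate \(l\).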

Fig. 14

Boxplots of fractions of MSPE explained by squared bias, \(\left\{ \kappa ^H_S(T^*,l):l=1,\ldots ,L\right\} \), for the susceptible population, at given levels of DI (Data Information), TR (Transmission Rate), and SV (Small-scale Variation). Left panels \(T^*=T^O\); right panels \(T^*=T^M\). Upper panels DI \(=\) 0; lower panels DI \(=\) 1. The x axis labels are ‘ab’, where ‘a’ (=0 and 1) represents the level for the factor TR, and ‘b’ (=0 and 1) represents the level for the factor SV. The triangles show the mean of the fraction

Fig. 15

Boxplots of fractions of MSPE explained by squared bias, \(\left\{ \kappa ^H_I(T^*,l):l=1,\ldots ,L\right\} \), for the infectious population, at given levels of DI (Data Information), TR (Transmission Rate), and SV (Small-scale Variation). Left panels \(T^*=T^O\); right panels \(T^*=T^M\). Upper panels DI \(=\) 0; lower panels DI \(=\) 1. The x axis labels are ‘ab’, where ‘a’ (=0 and 1) represents the level for the factor TR, and ‘b’ (=0 and 1) represents the level for the factor SV. The triangles show the mean of the fractions

6 Discussion and conclusions

In this article, we develop a Bayesian hierarchical SIRS (HSIRS) model that captures the various sources of uncertainties in modeling infectious diseases such as seasonal or pandemic influenza. Important features of our HSIRS model are that it preserves mass balance on the (hidden) true counts rather than on the observed counts, and that the dynamical process is modeled on a log-odds-ratio scale. Furthermore, our approach captures the stochastic and discrete nature of the epidemic process, as well as keeping the SIRS flow that underlies the CSIRS model.

In Sect. 4, we simulated two datasets, an HSIRS dataset and a modified-CSIRS dataset, where we assumed that data were available on the first 35 days and missing on the following 10 days. Then we used an MCMC algorithm to fit the HSIRS model to each of the two datasets; for computational efficiency, a well calibrated linear approximation was used. We saw that the approximate HSIRS (ASIRS) model was a very good approximation, and that accounting for all known uncertainties led to a superior performance over the deterministic classic SIRS (CSIRS) model.

In Sect. 5, a carefully designed simulation experiment, with sufficient replication at various levels of four factors, was presented. It allows us to conclude that the approximate HSIRS model offers an accurate and computationally efficient approach to analyzing infectious-disease data. The comparisons given there clearly show that the HSIRS model is better than the CSIRS model, according to both the MSPE and bias criteria, during both the observation period and the forecast period, for all the conditions tested.

In this research, the HSIRS model is based on assuming a constant population for a short period of time; however, the model is not restricted to this assumption and can be extended to include population turnover that handles long-term influenza dynamics. For example, Eqs. (21)–(23) in Sect. 2.2 could instead be derived from an SIRS flow with birth and death rates (see the SIR flow incorporating birth and death rates given in Anderson and May 1991). This extension is useful because recovery from seasonal influenza may give immunity to that specific pathogen, but this immunity can be lost over the years due to the evolution of the virus (e.g., Dushoff et al. 2004). In ongoing research, we are investigating more complicated epidemic dynamics that incorporate not only birth and death but also emigration/immigration processes for appropriate time periods. Also, spatial (e.g., Hooten and Wikle 2010; Oleson and Wikle 2013) and multivariate (e.g., Zhuang et al. 2013) aspects could be incorporated into these hierarchical dynamical models through vector-valued processes, although the form of such models would require careful integration of the aforementioned emigration/immigration processes.