
A Message-Passing Approach For Recurrent State Epidemic Models On Networks


A Message-Passing Approach for Recurrent-State Epidemic Models on Networks
Munik Shrestha
Samuel V. Scarpino
Cristopher Moore

SFI WORKING PAPER: 2015-05-014

SFI Working Papers contain accounts of scientific work of the author(s) and do not necessarily represent
the views of the Santa Fe Institute. We accept papers intended for publication in peer-reviewed journals or
proceedings volumes, but not papers that have already appeared in print. Except for papers by our external
faculty, papers must be based on work done at SFI, inspired by an invited visit to or collaboration at SFI, or
funded by an SFI grant.

©NOTICE: This working paper is included by permission of the contributing author(s) as a means to ensure
timely distribution of the scholarly and technical work on a non-commercial basis. Copyright and all rights
therein are maintained by the author(s). It is understood that all persons copying this information will
adhere to the terms and constraints invoked by each author's copyright. These works may be reposted
only with the explicit permission of the copyright holder.

www.santafe.edu

SANTA FE INSTITUTE
A message-passing approach for recurrent-state epidemic models on networks
Munik Shrestha
University of New Mexico, Albuquerque, NM 87131, USA and
Santa Fe Institute, 1399 Hyde Park road, Santa Fe, NM 87501, USA

Samuel V. Scarpino and Cristopher Moore


Santa Fe Institute, 1399 Hyde Park road, Santa Fe, NM 87501, USA
(Dated: May 8, 2015)
Epidemic processes are common out-of-equilibrium phenomena of broad interdisciplinary interest.
Recently, dynamic message-passing (DMP) has been proposed as an efficient algorithm for simulating
epidemic models on networks [1–5], and in particular for estimating the probability that a given
node will become infectious at a particular time. To date, DMP has been applied exclusively to models
with one-way state changes, as opposed to models like SIS (susceptible-infectious-susceptible)
and SIRS (susceptible-infectious-recovered-susceptible) where nodes can return to previously inhabited
states. Because many real-world epidemics can exhibit such recurrent dynamics, we propose
a DMP algorithm for complex, recurrent epidemic models on networks. Our approach takes correlations
between neighboring nodes into account while preventing causal signals from backtracking
to their immediate source, and thus avoids “echo chamber effects” where a pair of adjacent nodes
each amplify the probability that the other is infectious. We demonstrate that this approach well
approximates results obtained from Monte Carlo simulation and that its accuracy is often superior
to the pair approximation (which also takes second-order correlations into account). Moreover, our
approach is more computationally efficient than the pair approximation, especially for complex
epidemic models: the number of variables in our DMP approach grows as 2mk where m is the number
of edges and k is the number of states, as opposed to mk^2 for the pair approximation. We suspect
that the resulting reduction in computational effort, as well as the conceptual simplicity of DMP,
will make it a useful tool in epidemic modeling, especially for inference tasks where there is a large
parameter space to explore.

I. INTRODUCTION

Mathematical models of epidemic processes are intrinsically non-linear and multiplicative. These models
include the spread of disease [6, 7], transmission of social behaviors [8–11], cascades of banking failures [12,
13], forest fires [14–16], the propagation of marginal probabilities in constraint satisfaction problems [17, 18]
and the dynamics of magnetic and glassy systems [19].
The classical approach to modeling epidemics, such as the SIR model where each node is Susceptible,
Infectious, or Recovered, assumes that at any given time each individual exists in a single state or “com-
partment” [6, 7]. To make these models analytically tractable, it is often assumed that the population is
well mixed, so that interaction between any two individuals is equally likely; in physical terms, we assume
the model is mean-field (also known as mass-action mixing in the epidemiology literature). Despite this
unrealistic assumption, mean-field models capture some essential features of epidemics, such as a threshold
above which we have an endemic phase with a non-zero fraction of infected individuals, and below which we
have outbreaks of size o(n) so that the equilibrium fraction of infected individuals is zero.
In reality, contacts between individuals in the population are often highly structured, with some pairs of
individuals much more likely to interact than others due to location or demographics [11, 20]. To relax the
mean-field assumption, while retaining some measure of tractability, we can assume that individuals interact
on a network, whose structure captures the heterogeneity in the population [21, 22]. However, replacing the
mean-field approximation with a contact network substantially increases a model’s complexity.
One reasonable goal is to compute the one-point marginals, e.g., for each node i the probability Ii (t)
that i is infectious at time t. In addition to being of direct interest, these marginals help us perform tasks
such as inferring the originator of an epidemic, determining an optimal set of nodes to immunize in order
to minimize the final size of an outbreak, or calculating the probability that an entire group of nodes will
remain uninfected after a fixed time [23–27].
We can always compute these marginals by performing Monte Carlo experiments. However, since we need
to perform many independent trials in order to collect good statistics, this is computationally expensive on
large networks. This problem is compounded if we need to scan through parameter space, or if we want
to explore many different initial conditions, vaccination strategies, etc. Therefore, it would be desirable to
compute these marginals using, say, a system of differential equations, with variables that directly model the
probabilities of various events.
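To make the Monte Carlo baseline concrete, here is a minimal sketch of a continuous-time simulation of the SIS model. It assumes a Gillespie-style event loop, which is one standard way to run such simulations; the text does not specify the authors' implementation. The function names, the adjacency-dict graph format, and the seed handling are illustrative choices.

```python
import random

def gillespie_sis(adj, initially_infected, lam, rho, t_end, rng):
    """Run one continuous-time SIS trajectory and return the set of nodes infectious at t_end.

    adj is a dict mapping each node to a list of neighbors (undirected graph).
    Assumption: a Gillespie-style loop; not taken from the paper.
    """
    infected = set(initially_infected)
    t = 0.0
    while infected:
        # Edges along which a transmission can happen: (infectious, susceptible) pairs.
        si_edges = [(j, i) for j in infected for i in adj[j] if i not in infected]
        total_rate = lam * len(si_edges) + rho * len(infected)
        if total_rate == 0.0:
            break  # nothing can happen any more
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            break
        if rng.random() * total_rate < lam * len(si_edges):
            infected.add(rng.choice(si_edges)[1])            # transmission event
        else:
            infected.remove(rng.choice(list(infected)))      # recovery back to Susceptible
    return infected

def estimate_marginals(adj, initially_infected, lam, rho, t, trials, seed=0):
    """Estimate I_i(t) for every node as the fraction of trials in which i is infectious at time t."""
    rng = random.Random(seed)
    counts = {v: 0 for v in adj}
    for _ in range(trials):
        for v in gillespie_sis(adj, initially_infected, lam, rho, t, rng):
            counts[v] += 1
    return {v: counts[v] / trials for v in adj}

# Hypothetical example: a path 0 - 1 - 2 with node 0 initially infectious.
path = {0: [1], 1: [0, 2], 2: [1]}
print(estimate_marginals(path, [0], lam=0.1, rho=0.05, t=10.0, trials=2000))
```

Each estimate carries sampling error of order one over the square root of the number of trials, which is why many trials are needed for every parameter setting.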
The most naive way to do this, as we review below, uses the one-point marginals themselves as variables.
However, this approach completely ignores correlations between nodes. At the other extreme, to model the
system exactly, we would need to keep track of the entire joint distribution: but if there are n individuals,
each of which can be in one of k states, this results in a coupled system with k^n variables. This exponential
scaling quickly renders most models computationally intractable, even on moderately sized networks.
In between these two extremes, we can approximate the joint distribution by “moment closure,” assuming
that higher-order marginals can be written in terms of lower-order ones. This gives a hierarchy of increasingly
accurate (and computationally expensive) approximations, familiar in physics as cluster expansions. At the
first level of this hierarchy we assume that the nodes are uncorrelated, and approximate two-point marginals
such as [Ii (t) ∧ Ij (t)] (the probability that i and j are both infectious at time t) as Ii (t)Ij (t). At the second
level, commonly referred to in the epidemiology literature as the pair approximation, we close the hierarchy
at the level of pairs [Ii (t) ∧ Ij (t)] by assuming that three-point correlations can be factored in terms of
two-point correlations. For a comprehensive review of these methods, see [22, 28].
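As an illustration of the second level of the hierarchy, one common form of the pair closure for three nodes on a path i, j, k (with j in the middle) is written below. The state labels A, B, C are generic placeholders, and the exact closure used in any given study may differ; see the reviews cited above.

```latex
% Pair-approximation closure for a path i - j - k (a sketch; A, B, C are generic states).
% It treats i and k as independent once the state of the middle node j is known.
\[
  [A_i \wedge B_j \wedge C_k] \;\approx\;
  \frac{[A_i \wedge B_j]\,[B_j \wedge C_k]}{[B_j]} .
\]
```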
In this paper, we study an alternative method, namely Dynamic Message-Passing (DMP). As in belief
propagation [32, 33], here variables or “messages” are defined on a network’s directed edges: for instance,
Ij→i denotes the probability that j was infected by one of its neighbors other than i, so that the epidemic
might spread from j to i. However, unlike belief propagation, where the posterior distributions are updated
according to Bayes’ rule, here we write differential equations for the messages over time.
For many epidemic models, such as SI (susceptible-infectious), SIR (susceptible-infectious-recovered) and
SEIR (susceptible-exposed-infectious-recovered), only one-way state changes can occur. For example, in the
SIR model, once an individual has left the Susceptible class and become Infectious, they cannot return to
being Susceptible; once they become Recovered, they are immune to future infections, and might as well be
Removed. For these non-recurrent models, DMP is known to be an efficient algorithm to estimate Ii (t),
and it is exact on trees [1]; it can also be applied to threshold models [3–5] and used for inference [23].
However, for many real-world diseases individuals can return to previously inhabited states. In these
recurrent models, such as SIS (susceptible-infectious-susceptible), SIRS (susceptible-infectious-recovered-
susceptible), and SEIS (susceptible-exposed-infectious-susceptible), individuals can cycle through the states
multiple times, giving multiple waves of infection traveling through the population. The most obvious
examples of recurrent models are seasonal influenza, where due to the evolution of the virus individuals are
repeatedly infected during their lifetime [41], vaccination where protective immunity wanes over time [42], and
diseases curable by treatment which does not result in antibody-mediated immunity, such as gonorrhea [47].
In all three cases, individuals leave the Susceptible class, only to return at some point in the future (although
for influenza, it is worth mentioning that if the evolutionary rate of the virus is functionally related to the
number of susceptible individuals, then the recovery rate may not be independent from the state of one’s
neighbors.) Unfortunately, the DMP approach of [1] cannot be directly extended to recurrent models, since
their equations for messages only track the first time an individual makes the transition to a given state.
The purpose of this paper is to develop a novel DMP algorithm for recurrent models of epidemics on
networks, which we call rDMP. We will show that rDMP gives very good approximations for marginal
probabilities on networks, and is often more accurate than the pair approximation. Moreover, whereas the
pair approximation requires keeping track of mk^2 variables, if there are m edges and k states per node,
rDMP requires just 2mk variables. For complex models where k is large—for instance, for diseases with
multiple stages of infection or immunity, or multiple-disease epidemics where one disease makes individuals
more susceptible to another one—this gives a substantial reduction in the computational effort required.
Finally, the rDMP approach is conceptually simple, making it easy to write down the system of differential
equations for a wide variety of epidemic models.

FIG. 1. We define messages on the directed edges of a network to carry causal information of the flow of contagion,
e.g. Ij→i is the probability that j is Infectious because it received the infection from a neighbor k other than i. This
prevents effects from immediately backtracking to the node they came from, and avoids “echo chamber” infections.

FIG. 2. Two simple, yet illustrative, cases of networks, where the darker node is initially Infectious. As we discuss,
in these simple cases one can see the motivation for our approach, which prevents infection signals from backtracking to
the node they immediately came from.

II. MESSAGE-PASSING AND PREVENTING THE ECHO CHAMBER EFFECT

As shown in Fig. 1, the variables of rDMP are messages along directed edges of the network (in addition
to one-point marginals). For instance, Ij→i is the probability that j is Infectious because it was infected by
one of its other neighbors k. The intuition behind this is the following, where we take the SIS model as an
example. If i is Susceptible, the rate at which j will infect i is proportional to the probability Ij that j is
infected. But when computing this rate, we only include the contribution to Ij that comes from neighbors
other than i. In other words, we deliberately neglect the event that j receives the infection from i, and
immediately passes it back to i, even if i has become Susceptible in the intervening time.
This choice avoids a kind of “echo chamber” effect, where neighboring nodes artificially amplify each
others’ probability of being Infectious. For instance, consider a simple but pathological case of the SI model
where there are only two nodes in the graph, i and j, with an edge between them as shown in Fig. 2. If
the transmission rate is λ, and if we assume the nodes are independent (i.e., if we use first-order moment
closure) we obtain the following differential equations,

\begin{align}
\frac{dI_i}{dt} &= \lambda S_i I_j \nonumber \\
\frac{dI_j}{dt} &= \lambda S_j I_i , \tag{1}
\end{align}
where Si (t) = 1 − Ii (t) and similarly for j.
Now suppose that j is initially Infectious with probability δ, and that i is initially Susceptible, i.e., Ij (0) = δ
and Ii (0) = 0. Since in the SI model nodes never recover, the infection will eventually spread from j to i,
but only if j was Infectious in the first place. Thus the marginals Ii (t) and Ij (t) should tend to δ as t → ∞.
However, integrating Eq. (1) gives a different result. Once Ii becomes positive, dIj /dt becomes positive
as well, allowing i to infect j with the infection that it received from j in the first place. As a result, Ij (t)
approaches 1 as t → ∞. Thus the “echo chamber” between i and j leads to the absurd result that j eventually
becomes Infectious, even though with probability 1 − δ there was no initial infection in the system.
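The echo chamber in Eq. (1) is easy to reproduce numerically. The sketch below integrates the two equations with a plain forward-Euler step; the step size, time horizon, and the values λ = 1 and δ = 0.1 are illustrative assumptions, not values from the paper.

```python
def integrate_naive_si(lam=1.0, delta=0.1, t_end=200.0, dt=0.01):
    """Forward-Euler integration of Eq. (1) for the two-node SI example.

    Starts from I_i(0) = 0 and I_j(0) = delta, as in the text.
    lam, delta, t_end and dt are illustrative choices.
    """
    I_i, I_j = 0.0, delta
    for _ in range(int(round(t_end / dt))):
        dI_i = lam * (1.0 - I_i) * I_j   # i is infected by j
        dI_j = lam * (1.0 - I_j) * I_i   # j is "re-infected" by i: the echo
        I_i += dt * dI_i
        I_j += dt * dI_j
    return I_i, I_j

I_i, I_j = integrate_naive_si()
print(I_i, I_j)  # both end up close to 1, although the correct limit is delta
```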
In the rDMP approach, we fix this problem by replacing Ii and Ij with the messages they send each other,

\begin{align*}
\frac{dI_i}{dt} &= \lambda S_i I_{j \to i} , \\
\frac{dI_j}{dt} &= \lambda S_j I_{i \to j} ,
\end{align*}
so that i can only infect j if i received the infection from some node other than j. (Below we give the
equations on a general network, including the time derivatives of the messages.) In this example, there are
no other nodes, so if Ij→i (0) = δ and Ii→j (0) = 0, then Ij (t) = δ for all t as it should be.
Note that we do not claim that rDMP is exact in this case. In particular, as in (1), Ii (t) tends to 1 as
t → ∞. This is because, unlike the system of [1], rDMP assumes that the events that j infects i at different
times are independent.
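The two-node case can be worked out in closed form. The derivation below is a sketch that assumes the SI message equations are those of the SIS system given later in the paper with the recovery rate set to zero; the exact marginal in the last line follows from j being infectious with probability δ and then infecting i after an exponentially distributed time with rate λ.

```latex
% Two-node SI example under rDMP (assumption: message equations as in the SIS case with rho = 0).
% With no third node, neither message has an incoming term, so both stay at their initial values.
\begin{align*}
\frac{dI_{j \to i}}{dt} &= 0 \;\Rightarrow\; I_{j \to i}(t) = \delta ,
&
\frac{dI_{i \to j}}{dt} &= 0 \;\Rightarrow\; I_{i \to j}(t) = 0 , \\
\frac{dI_j}{dt} &= \lambda S_j I_{i \to j} = 0 \;\Rightarrow\; I_j(t) = \delta ,
&
\frac{dI_i}{dt} &= \lambda (1 - I_i)\,\delta \;\Rightarrow\; I_i(t) = 1 - e^{-\lambda \delta t} .
\end{align*}
% Exact marginal for comparison:
\[
  I_i^{\mathrm{exact}}(t) = \delta \left( 1 - e^{-\lambda t} \right) \;\to\; \delta
  \quad \text{as } t \to \infty .
\]
```

So rDMP gives the correct marginal for j at all times, while its estimate for i agrees with the exact one to first order in t and then overshoots.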
In this two-node example, of course, the pair approximation is exact, since it maintains separate variables
such as [Si ∧ Ij ] for each of the joint states of the two nodes. However, the pair approximation is subject to
other forms of the echo chamber effect. Consider a network with three nodes, as in Fig. 2 (right), where j is a
common neighbor of i and k. The pair approximation assumes that, conditioned on the state of j, the states
of i and k are independent; however, in a recurrent epidemic model, i and k could be correlated, for instance
if j infected them both and then returned to the Susceptible state. As a result, the pair approximation is
vulnerable to a distance-two echo chamber, where i and k infect each other through j. As in the two-node
case, rDMP prevents this.
Preventing backtracking completely may seem like a strong assumption, and in recurrent models it is
a priori possible, for instance, for a node to re-infect the neighbor it was infected by. Despite the well-
documented importance of recurrent infections for diseases including (but certainly not limited to) seasonal
influenza [41], Plasmodium malaria [49], and urinary tract infections [48], little is known about the source of
recurrent infections. For certain sexually transmitted diseases such as gonorrhea [47] and repeated ringworm
infections [50], there is evidence that backtracking plays a significant role; on the other hand, it may be that
recurrent infections are caused by different strains, each of which is acting essentially without backtracking.
Thus while our non-backtracking assumption is clearly invalid in some cases, we believe it is a reasonable
approach for most recurrent state infections.

III. THE rDMP EQUATIONS FOR THE SIS, SIRS, AND SEIS MODELS

In this section, we illustrate the rDMP approach for several recurrent epidemic models. We start with the
simplest one: in the SIS model, each node is either Infectious (I) or Susceptible (S). Infectious nodes infect
their Susceptible neighbors at rate λ, and their infections wane back into the Susceptible state at rate ρ. We
denote the probability that node i is Infectious or Susceptible by Ii and Si respectively. The objective
then is to efficiently and accurately compute these probabilities as a function of time t.
We define variables or “messages” that live on the directed edges (i, j) of the network. The directed nature
of these messages prevents infection from backtracking from an Infectious node back to its infection source,
e.g., if node i infects node j, then we prevent j from re-infecting i. In addition to tracking the one-point
marginal Ij , we define a message Ij→i from j to i as the probability that j is in the Infectious state as a result
of being infected from one of its neighbors other than i. Given these incoming messages, the rate at which
Ii evolves in time is given by

\[
\frac{dI_i}{dt} = -\rho I_i + \lambda S_i \sum_{j \in \partial i} I_{j \to i} , \tag{2}
\]

where ∂i denotes the neighbors of i. Similarly, the rate at which Ij→i evolves in time is given by

\[
\frac{dI_{j \to i}}{dt} = -\rho I_{j \to i} + \lambda S_j \sum_{k \in \partial j \setminus i} I_{k \to j} , \tag{3}
\]

where k ∈ ∂j \ i denotes the neighbors of j excluding i.
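The SIS system of Eqs. (2) and (3) is short enough to integrate directly. The following is a minimal sketch using forward-Euler steps. The adjacency-dict graph format, the step size, and the choice to initialize each message at the sender's marginal (consistent with the two-node example, where Ij→i(0) = δ) are assumptions, not details taken from the paper.

```python
import math

def rdmp_sis(adj, infected0, lam, rho, t_end, dt=0.01):
    """Forward-Euler integration of the SIS rDMP equations (2) and (3).

    adj       : dict mapping each node to a list of its neighbors (undirected graph)
    infected0 : dict mapping each node to its initial probability of being Infectious
    Returns a dict with the one-point marginals I_i at time t_end.
    """
    I = dict(infected0)
    # Assumption: every message starts at the sender's marginal, I_{j->i}(0) = I_j(0).
    msg = {(j, i): infected0[j] for j in adj for i in adj[j]}
    for _ in range(int(round(t_end / dt))):
        dI = {}
        for i in adj:
            incoming = sum(msg[(j, i)] for j in adj[i])
            dI[i] = -rho * I[i] + lam * (1.0 - I[i]) * incoming                 # Eq. (2)
        dmsg = {}
        for (j, i) in msg:
            incoming = sum(msg[(k, j)] for k in adj[j] if k != i)               # skip i: no backtracking
            dmsg[(j, i)] = -rho * msg[(j, i)] + lam * (1.0 - I[j]) * incoming   # Eq. (3)
        for i in adj:
            I[i] += dt * dI[i]
        for e in msg:
            msg[e] += dt * dmsg[e]
    return I

# Two-node SI check (rho = 0): I_j stays at delta and I_i follows 1 - exp(-lam * delta * t).
two_nodes = {"i": ["j"], "j": ["i"]}
result = rdmp_sis(two_nodes, {"i": 0.0, "j": 0.1}, lam=1.0, rho=0.0, t_end=5.0)
print(result["j"], result["i"], 1.0 - math.exp(-0.5))
```

A production implementation would more likely hand the right-hand side to an adaptive ODE solver; fixed-step Euler is used here only to keep the sketch dependency-free.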


For the SIRS model, we let ρ and γ denote the transition rates from Infectious to Recovered and from
Recovered to Susceptible respectively. Then the rDMP system for the SIRS model is given by

\[
\frac{dI_{j \to i}}{dt} = -\rho I_{j \to i} + \lambda S_j \sum_{k \in \partial j \setminus i} I_{k \to j} , \tag{4}
\]

which is coupled with the one-point marginals through


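\begin{align}
\frac{dS_i}{dt} &= \gamma R_i - \lambda S_i \sum_{j \in \partial i} I_{j \to i} \nonumber \\
\frac{dI_i}{dt} &= -\rho I_i + \lambda S_i \sum_{j \in \partial i} I_{j \to i} \nonumber \\
\frac{dR_i}{dt} &= \rho I_i - \gamma R_i . \tag{5}
\end{align}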

In the SEIS model, upon becoming exposed to an infected neighbor, Susceptible nodes first go through a
latent period called the Exposed state. In this state, individuals are infected but not yet Infectious. Exposed
nodes become Infectious at the rate ε, and Infectious nodes again wane back to Susceptible at rate ρ. The
rDMP system for the SEIS model is
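\begin{align}
\frac{dE_{j \to i}}{dt} &= -\varepsilon E_{j \to i} + \lambda S_j \sum_{k \in \partial j \setminus i} I_{k \to j} , \nonumber \\
\frac{dI_{j \to i}}{dt} &= -\rho I_{j \to i} + \varepsilon E_{j \to i} , \tag{6}
\end{align}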
which is coupled with the one-point marginals as
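\begin{align}
\frac{dS_i}{dt} &= \rho I_i - \lambda S_i \sum_{j \in \partial i} I_{j \to i} \nonumber \\
\frac{dI_i}{dt} &= -\rho I_i + \varepsilon E_i \nonumber \\
\frac{dE_i}{dt} &= -\varepsilon E_i + \lambda S_i \sum_{j \in \partial i} I_{j \to i} . \tag{7}
\end{align}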

Note that here we track messages for the Exposed state, in addition to one-point marginals, since they act
as precursors for the Infectious messages. There is no need to track messages for the Susceptible state, since
it does not cause state changes in its neighbors.
Generalizing these equations to more complex epidemic models with k different states, as opposed to three
or four, is straightforward. Even in a model where every state can cause state changes in its neighbors—for
instance, where having Susceptible neighbors speeds up the rate of recovery, or where Exposed nodes can
also infect their neighbors at a lower rate—the total number of variables we need to track in a network
with n nodes and m edges is at most 2mk in addition to the nk one-point marginals. In contrast, the pair
approximation requires mk^2 states to keep track of the joint distribution of every neighboring pair.
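The variable counts quoted above can be tabulated for a given network size. The helper below is an illustrative sketch: the function name is hypothetical, the rDMP count is the upper bound from the text (2mk messages plus nk marginals), and the example sizes are chosen only for illustration.

```python
def variable_counts(n_nodes, m_edges, k_states):
    """Return (rDMP upper bound, pair approximation) variable counts as quoted in the text."""
    rdmp = 2 * m_edges * k_states + n_nodes * k_states   # 2mk messages + nk one-point marginals
    pair = m_edges * k_states ** 2                       # mk^2 joint pair states
    return rdmp, pair

# Hypothetical network with n = 100 nodes and m = 150 edges.
for k in (2, 4, 10):
    print(k, variable_counts(100, 150, k))
# k = 10 gives (4000, 15000)
```

With these counts the advantage of rDMP appears as k grows; for very small k the two totals are comparable.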

IV. EXPERIMENTS IN REAL AND SYNTHETIC NETWORKS

FIG. 3. Results on the SIS model. On the left, the marginal probability that node 29 in Zachary’s Karate Club (see
inset on right) is Infectious as a function of time. We compare the true marginal derived by 10^5 independent Monte
Carlo simulations with that estimated by rDMP, the independent node approximation, and the pair approximation.
On the right is the L1 error, averaged over all nodes; we see that rDMP is the most accurate of the three methods.
Here the transmission rate is λ = 0.1, the waning rate is ρ = 0.05, and vertex 0 (colored red) was initially infected.

FIG. 4. A scatterplot of the steady-state marginals Ii for the n = 33 nodes in Zachary’s Karate Club, with the
same parameters as in Fig. 3. The vertical axis is the true marginal computed by Monte Carlo simulations; the
horizontal axis is the estimated marginals from rDMP (black ⋆) and the pair approximation (blue ×). Both methods
overestimate the marginal, but rDMP is closer to the true value (the line y = x) for every node.

In this section we report on numerical experiments for rDMP for the SIS and SIRS models on real and
synthetic networks. As a performance metric, we use the average L1 error per node between the marginals
computed from rDMP and the true probabilities computed (up to sampling error) using continuous-time
Monte Carlo simulations. That is,

\[
L_1^{\mathrm{rDMP}}(t) = \frac{1}{n} \sum_i \left| I_i^{\mathrm{MC}}(t) - I_i^{\mathrm{rDMP}}(t) \right| . \tag{8}
\]

We use this metric to compare the performance of rDMP with the independent-node approximation and
the pair approximation, or equivalently first- and second-order moment closure [22, 28]. As we will see,
for a wide range of parameters, rDMP is more accurate than either of these approaches, even though it is
computationally easier than the pair approximation.
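The error metric is a mean absolute difference over nodes at a fixed time. A minimal sketch, assuming both sets of marginals are stored as dicts keyed by node (an illustrative data layout, not one prescribed by the paper):

```python
def average_l1_error(true_marginals, estimated_marginals):
    """Average L1 error per node between two sets of marginals at one time point."""
    nodes = list(true_marginals)
    total = sum(abs(true_marginals[i] - estimated_marginals[i]) for i in nodes)
    return total / len(nodes)

# Hypothetical values for two nodes: |0.5 - 0.4| and |0.2 - 0.4| average to about 0.15.
print(average_l1_error({0: 0.5, 1: 0.2}, {0: 0.4, 1: 0.4}))
```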

FIG. 5. The difference between $L_1^{\mathrm{rDMP}}$ and $L_1^{\mathrm{pair}}$ on Zachary’s Karate Club for various values of the ratio ρ/λ. We
rescale time so that λ = 0.1 as before. In the blue region, $L_1^{\mathrm{rDMP}} < L_1^{\mathrm{pair}}$ and rDMP is more accurate; in the red
region, $L_1^{\mathrm{rDMP}} > L_1^{\mathrm{pair}}$. We see that rDMP is more accurate except at early times or when ρ/λ is small.

In Fig. 3, we show results for the SIS model on Zachary’s Karate Club [34]. On the left, we show the
marginal probability that a particular node is Infectious as a function of time, estimated by rDMP and
by first- and second-order moment closure, compared with the true marginals given by Monte Carlo
simulation. On the right, we show the average L1 error for the three methods. Here λ = 0.1, ρ = 0.05, and
the initial condition consists of a single infected node (shown in red in the inset). The Monte Carlo results
were averaged over 10^5 runs. We see that rDMP is significantly more accurate than the other two, except
at some early times when the pair approximation marginally outperforms rDMP.
As a further illustration, in Fig. 4 we show the steady-state marginal Ii for each node i (measured by
running the system until t = 50, at which point Ii (t) is nearly constant), with the same parameters and
initial condition as in Fig. 3. We show the true marginal of each node on the y-axis, and the marginals
estimated by rDMP and the pair approximation on the x-axis. If the estimated marginals were perfectly
accurate, the points would fall on the line y = x. Both methods overestimate the marginals to some extent,
but rDMP is more accurate than the pair approximation on every node. Thus rDMP makes accurate
estimates of the marginals on individual nodes, as opposed to just the average across the population.
To investigate how rDMP compares with the pair approximation across a broader range of parameters,
in Fig. 5 we vary the ratio between waning rate ρ and the transmission rate λ. Since we can always rescale
time by multiplying λ and ρ by the same constant, we do this by holding λ = 0.1 as before, and varying ρ.
We then measure the difference in the L1 error of the two methods, $L_1^{\mathrm{rDMP}} - L_1^{\mathrm{pair}}$.
In the blue region, rDMP is more accurate than the pair approximation; in the red region, it is less so.
We see that rDMP is more accurate except at early times (as in Fig. 3) or when ρ is small compared to λ,
i.e., if the model is close to the SI model where Infectious nodes rarely become Susceptible again.
In Fig. 6, we simulate the SIS model on an Erdős-Rényi graph with n = 100 and average degree 3, with
λ = 0.4, ρ = 0.1, and a single initially Infectious node. As with the Karate Club, rDMP does a better job of
tracking the true fraction of Infectious nodes, except at early times when the pair approximation is superior;
in particular, it does a better job of computing the steady-state size of the epidemic.
In Fig. 7 we show results for the SIRS model on Zachary’s Karate Club. As in Fig. 3, on the left we show
the marginal probability I29 that node 29 is Infectious; on the right, we show the L1 error for Ii averaged
over the network. In the insets, we show the marginal probability R29 for the Recovered state and the
corresponding average L1 error. Here the transmission rate is λ = 0.1, the waning rate from Infectious to
Recovered is ρ = 0.05, and the rate from Recovered to Susceptible is γ = 0.2. The initial condition consisted
of a single infected node, and Monte Carlo results were averaged over 10^5 runs. As for the SIS model, rDMP
is significantly more accurate than the independent node approximation, and is more accurate than the pair
approximation except at early times.
We found similar results on many other families of networks, including random regular graphs, random
geometric graphs, scale-free networks, Newman-Watts-Strogatz small world networks, and a social network
of dolphins [29]. Namely, rDMP outperforms the first-order approximation where nodes are independent,
and outperforms the pair approximation across a wide range of parameters and times.

FIG. 6. The fraction f of Infectious nodes as a function of time in the SIS model on an Erdős-Rényi graph (inset)
with n = 100 and average degree 3. Here λ = 0.4, ρ = 0.1, and the initial condition consists of a single Infectious
node (colored red). Monte Carlo results were averaged over 10^3 independent runs. Except at early times, rDMP
tracks the true trajectory more closely.

FIG. 7. The SIRS model on the Karate Club. On the left, we show the true and estimated marginal probability that
node 29 is Infectious (main figure) or Recovered (inset) as a function of time. On the right is the average L1 error
for the Infectious and Recovered states. The transmission rate is λ = 0.1, and the transition rates from Infectious
to Recovered and from Recovered to Susceptible are ρ = 0.05 and γ = 0.2 respectively. Node 0 (colored red) was
initially infected. Monte Carlo results were averaged over 10^5 runs. As for the SIS model, rDMP is significantly more
accurate than the first-order model where nodes are independent, and is more accurate than the pair approximation
except at early times.

V. LINEAR STABILITY, EPIDEMIC THRESHOLDS, AND RELATED WORK

Systems of differential equations for rDMP, such as (3), do not appear to have a closed analytic form
due to their nonlinearities. On the other hand, we can compute quantities such as epidemic thresholds
by linearizing around a stationary point, such as {I∗j→i = 0} where the initial outbreak is small. Given a
perturbation ϵj→i = Ij→i − I∗j→i , the linear stability of the system, i.e., whether or not ϵj→i diverges in time,
is governed by the eigenvalues of the Jacobian matrix J of the right hand side of (3) at the stationary point
I∗i . The Jacobian for (3) at {I∗j→i } is

\[
J_{(j \to i),(k \to j')} = -\delta_{kj}\,\delta_{ij'}\,\rho + \lambda (1 - I^*_j)\, B_{(j \to i),(k \to j')} , \tag{9}
\]

where

\[
B_{(j \to i),(k \to j')} = \delta_{jj'} (1 - \delta_{ik}) . \tag{10}
\]

This definition of B is another way of saying that the edge k → j influences edges j → i for i ≠ k, but does
not backtrack to k. This corresponds to our assumption that infections, for instance, do not bounce from k
to j and back again and create an echo chamber effect. For this reason, B is also known in the literature as
the non-backtracking matrix [36] or the Hashimoto matrix [31].
Now, for a small perturbation $\vec{\epsilon}$ away from a stationary point {I∗j→i }, the linearized system of (3) becomes

\[
\frac{d\vec{\epsilon}}{dt} = J \vec{\epsilon} . \tag{11}
\]

If J has any eigenvalues with positive real part, then $\|\vec{\epsilon}(t)\|$ grows exponentially in time. So, the fixed point
{I∗j→i } is stable as long as the leading eigenvalue J1 of J has negative real part.
One trivial, but important, stationary point to test is I∗j→i = 0 for all edges. A small perturbation around
$\vec{0}$ corresponds to a small initial probability that each node is infected. From (9), J becomes

\[
J = \lambda \left( B - \frac{\rho}{\lambda} \mathbb{1} \right) , \tag{12}
\]

where $\mathbb{1}$ is the 2m × 2m identity matrix. So, the leading eigenvalue of J becomes positive when the largest
eigenvalue B1 of B is greater than ρ/λ. In other words, if

\[
R_0 = \frac{\lambda}{\rho} B_1 > 1 , \tag{13}
\]
where R0 is the reproductive number, even a small initial probability of infection will lead to a widespread
endemic state, where the infection becomes extensive. If (13) does not hold, a small initial probability of
infection will instead decay back to an infection-less state.
Since B is not symmetric, not all its eigenvalues are real. However, by the Perron-Frobenius theorem,
its leading eigenvalue is real; moreover, it is upper bounded by A1 , the leading eigenvalue of the adjacency
matrix A. Interestingly, if we examine the linear stability of the first-order approximation where nodes are
independent [22], the epidemic threshold for the SIS model is given by

\[
\frac{\lambda}{\rho} A_1 > 1 . \tag{14}
\]

Since B1 ≤ A1 , the threshold (13) gives a better upper bound for the true epidemic threshold than we
would get from the first-order approximation. A similar threshold for the SIR model in sparse networks, or
equivalently for percolation, using B1 was recently demonstrated in [37]. (We note that when backtracking
is allowed, it has important consequences for epidemic thresholds on power-law networks [38].)
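To see the two thresholds side by side, the sketch below builds the non-backtracking matrix of Eq. (10) and the adjacency matrix for a small graph and estimates their leading eigenvalues by power iteration. The helper names and the pure-Python power iteration are illustrative assumptions; power iteration as written assumes a nonnegative matrix whose leading eigenvalue is simple and strictly dominant, so a general-purpose eigenvalue solver would be the safer choice on arbitrary graphs.

```python
def nonbacktracking_matrix(adj):
    """Build B from Eq. (10): row (j -> i), column (k -> j) is 1 whenever k != i."""
    edges = [(j, i) for j in adj for i in adj[j]]          # all directed edges j -> i
    index = {e: a for a, e in enumerate(edges)}
    B = [[0.0] * len(edges) for _ in edges]
    for (j, i) in edges:
        for k in adj[j]:
            if k != i:                                     # do not backtrack to i
                B[index[(j, i)]][index[(k, j)]] = 1.0
    return B

def adjacency_matrix(adj):
    """Dense 0/1 adjacency matrix of an undirected graph given as an adjacency dict."""
    nodes = list(adj)
    pos = {v: a for a, v in enumerate(nodes)}
    A = [[0.0] * len(nodes) for _ in nodes]
    for v in nodes:
        for w in adj[v]:
            A[pos[v]][pos[w]] = 1.0
    return A

def leading_eigenvalue(M, iterations=1000):
    """Power iteration with max-norm scaling; returns 0.0 if the iterate vanishes (e.g. B on a tree)."""
    x = [1.0] * len(M)
    value = 0.0
    for _ in range(iterations):
        y = [sum(row[c] * x[c] for c in range(len(x))) for row in M]
        value = max(abs(v) for v in y)
        if value == 0.0:
            return 0.0
        x = [v / value for v in y]
    return value

# Hypothetical example: the complete graph on 4 nodes (3-regular), with illustrative rates.
k4 = {v: [w for w in range(4) if w != v] for v in range(4)}
B1 = leading_eigenvalue(nonbacktracking_matrix(k4))
A1 = leading_eigenvalue(adjacency_matrix(k4))
lam, rho = 0.1, 0.25
print(B1, A1, lam / rho * B1 > 1, lam / rho * A1 > 1)
```

For this graph and these rates, condition (14) is met while condition (13) is not, which illustrates how the two criteria can disagree when ρ/λ lies between B1 and A1.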
FIG. 8. Same as in Fig. 3, but with transmission rate λ = 0.1 and waning rate ρ = 0.54. A well-known upper bound
on the epidemic threshold of the SIS model can be computed from the leading eigenvalue A1 of the adjacency matrix
(the Jacobian matrix of the first-moment-closure approach) of a network. In other words, if ρ/λ < A1 , it is known from the
first-moment method that an infection-free state becomes unstable and epidemics become widespread and endemic.
Here we show the results from the SIS model in Zachary’s Karate Club, where A1 ≈ 6.7. Even though ρ/λ = 5.4 < A1 , which
is well below the threshold from the first-moment method, the contagion fades away eventually, which is correctly
captured by our DMP approach.

Whereas the leading eigenvalue of B governs the epidemic threshold, the spectral gap between B’s top
two eigenvalues governs how quickly the epidemic converges to the leading behavior (at least until we leave
the linear regime). Qualitatively, this depends on bottlenecks in the network such as those due to community
structure, where an epidemic spreads quickly in one community but then takes a longer time to cross over
into another. Indeed, the second eigenvector of the non-backtracking matrix B was recently used to detect
community structure [36].
Similarly, just as the leading eigenvector of B was recently shown to be a good measure of importance
or “centrality” of a node [40], it may be helpful in identifying “superspreaders”—nodes where an initial
infection will generate the largest outbreak, and be the most likely to lead to a widespread epidemic.
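
As a worked check of the numbers quoted for Fig. 8, using only the values stated in the caption (λ = 0.1, ρ = 0.54, A_1 ≈ 6.7):

```latex
\frac{\rho}{\lambda} = \frac{0.54}{0.1} = 5.4 < A_1 \approx 6.7
\quad\Longrightarrow\quad
\frac{\lambda}{\rho} A_1 \approx \frac{6.7}{5.4} \approx 1.24 > 1 .
```

The first-order condition (14) is therefore satisfied, so that approximation predicts an endemic state. If threshold (13) has the analogous form (λ/ρ) B_1 > 1, which is an assumption here, the observed die-out corresponds to B_1 < 5.4, which is compatible with B_1 ≤ A_1.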

VI. CONCLUSION

Modern epidemiological studies often require recurrent models, where nodes can return to their previously
inhabited states multiple times. For example, consider diseases such as influenza where individuals are
infected multiple times throughout their lives, or whooping cough where vaccine effectiveness wanes over
time; in both cases, individuals return to the Susceptible class. In this paper we have extended Dynamic
Message-Passing (DMP) to recurrent epidemic models. Our rDMP approach defines messages on the directed
edges of a network in such a way as to prevent signals, such as the spread of infection, from backtracking
immediately to the node that they came from. By preventing these “echo chamber effects,” rDMP obtains
good estimates of the time-varying marginal probabilities on a wide variety of networks, estimating both the
fraction of infectious individuals in the entire network, and the probabilities that individual nodes become
infected.
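
To make the idea of messages on directed edges concrete, here is a minimal sketch of an SIS message-passing system integrated with forward Euler. The equations used, dI_j/dt = -ρ I_j + λ (1 - I_j) Σ_{k∈∂j} I_{k→j} and dI_{j→i}/dt = -ρ I_{j→i} + λ (1 - I_j) Σ_{k∈∂j\i} I_{k→j}, are an assumed form chosen to be consistent with the non-backtracking linearization discussed above; the function name, the uniform initial condition, and the step size are likewise hypothetical.

```python
def rdmp_sis(nbrs, lam, rho, i0=0.1, dt=0.01, t_max=100.0):
    """Euler-integrate an assumed SIS message-passing system.

    nbrs maps each node to the set of its neighbors.
    Returns a dict of node marginals I_j at time t_max.
    """
    nodes = sorted(nbrs)
    I = {j: i0 for j in nodes}
    # msg[(j, i)] is the message I_{j->i}: j is infectious without i's help.
    msg = {(j, i): i0 for j in nodes for i in nbrs[j]}
    for _ in range(int(round(t_max / dt))):
        incoming = {j: sum(msg[(k, j)] for k in nbrs[j]) for j in nodes}
        new_I, new_msg = {}, {}
        for j in nodes:
            s_j = 1.0 - I[j]
            new_I[j] = I[j] + dt * (-rho * I[j] + lam * s_j * incoming[j])
            for i in nbrs[j]:
                # Exclude the message i -> j, so infection cannot echo straight back.
                drive = incoming[j] - msg[(i, j)]
                new_msg[(j, i)] = msg[(j, i)] + dt * (-rho * msg[(j, i)] + lam * s_j * drive)
        I, msg = new_I, new_msg
    return I

# Example graph (assumption): complete graph on 4 nodes, for which A_1 = 3 and B_1 = 2.
k4 = {u: {v for v in range(4) if v != u} for u in range(4)}
below = rdmp_sis(k4, lam=0.2, rho=0.5)   # (lam/rho) * B_1 = 0.8
above = rdmp_sis(k4, lam=0.5, rho=0.5)   # (lam/rho) * B_1 = 2.0
print(sum(below.values()) / 4, sum(above.values()) / 4)
```

In the first run the infection decays even though (λ/ρ) A_1 = 1.2 exceeds one, while in the second it settles at a nonzero endemic level.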
Like the pair approximation, rDMP takes correlations between neighboring nodes into account. However,
our experiments show that rDMP is more accurate than the pair approximation for a wide variety of network
structures and parameters. Moreover, rDMP is computationally less expensive than the pair approximation,
especially for complex epidemic models with a large number of states, using O(mk) instead of O(mk^2)
variables for models with k states on networks with m edges.
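
The scaling can be illustrated with a rough count. The constants below are assumptions made only for illustration (one message per directed edge per state for rDMP, one joint probability per edge per ordered pair of states for the pair approximation, plus node marginals in both cases); only the O(mk) versus O(mk^2) growth comes from the text. The network size is that of Zachary's Karate Club (34 nodes, 78 edges).

```python
def rdmp_variable_count(n_nodes, n_edges, n_states):
    # Assumed: 2m directed edges times k states, plus n*k node marginals.
    return 2 * n_edges * n_states + n_nodes * n_states

def pair_variable_count(n_nodes, n_edges, n_states):
    # Assumed: m edges times k^2 joint states, plus n*k node marginals.
    return n_edges * n_states ** 2 + n_nodes * n_states

n, m = 34, 78  # Zachary's Karate Club
for k in (2, 4, 10):
    print(k, rdmp_variable_count(n, m, k), pair_variable_count(n, m, k))
```

Under these assumed constants the two counts coincide for k = 2 and diverge as k grows, which is why the difference matters mainly for models with many states.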
Finally, rDMP is conceptually simple, allowing the user to immediately write down the system of differential
equations for a wide variety of epidemic models, such as those with multiple stages of infection or
immunity [43, 44], or those with multiple interacting diseases [45, 46]. We expect that given its simplicity
and accuracy, it will be an attractive option for future epidemiological studies.

VII. ACKNOWLEDGMENTS

This work is supported by AFOSR and DARPA under grant #FA9550-12-1-0432. MS performed this work
while a Graduate Fellow at the Santa Fe Institute, and SVS was supported by the Santa Fe Institute and
the Omidyar Group. We are grateful to Mason Porter and Joel Miller for helpful conversations regarding
recurrent state epidemic models.

[1] B. Karrer and M.E.J. Newman, Message passing approach for general epidemic models. Phys. Rev. E 82, 016101
(2010)
[2] J. C. Miller, A. C. Slim, and E. M. Volz, Edge-based compartmental modelling for infectious disease
spread. Journal of the Royal Society Interface 9, 890-906 (2010).
[3] M. Shrestha and C. Moore, Message passing approach for threshold models of behavior in networks. Phys. Rev.
E 89, 022805 (2014)
[4] F. Altarelli, A. Braunstein, L. Dall’Asta, and R. Zecchina, Large deviations of cascade processes on graphs.
Phys. Rev. E 87 062115 (2013)
[5] A.Y. Lokhov, M. Mézard, and L. Zdeborová, Dynamic message-passing equations for models with unidirectional
dynamics. Phys. Rev. E 91, 012811 (2015)
[6] N. T. J. Bailey, The Mathematical Theory of Infectious Diseases and its Applications. Hafner Press, New York
(1975).
[7] R. M. Anderson and R. M. May, Infectious Diseases of Humans. Oxford University Press, Oxford (1991).
[8] M. Granovetter, Threshold models of collective behavior. American Journal of Sociology 83(6), 1420-1443 (1978).
[9] M. Granovetter, The strength of weak ties. American Journal of Sociology 78(6), 1360-1380 (1973).
[10] J.H. Miller and S.E. Page, The standing ovation problem. Complexity 9, 8-16 (2004).
[11] B. Gonçalves, N. Perra, A. Vespignani, Modeling Users’ Activity on Twitter Networks: Validation of Dunbar’s
Number. PLoS ONE 6 (8), e22656 (2011).
[12] R. M. May and A. G. Haldane, Systemic risk in banking ecosystems. Nature 469, 351-355 (2011).
[13] F. Caccioli, M. Shrestha, C. Moore, and J. D. Farmer, Stability analysis of financial contagion due to overlapping
portfolios. Journal of Banking & Finance 46, 233-245 (2014).
[14] P. Bak, K. Chen, and C. Tang, A forest-fire model and some thoughts on turbulence. Phys. Lett. A, 147, 297-300
(1990).
[15] B. Drossel and F. Schwabl, Self-organized critical forest-fire model. Phys. Rev. Lett. 69, 1629-1632 (1992).
[16] P. Grassberger, Critical behaviour of the Drossel-Schwabl forest fire model. New J. Phys. 4, 17 (2002).
[17] M. Mézard and A. Montanari, Information, Physics, and Computation. Oxford University Press (2009).
[18] C. Moore and S. Mertens, The Nature of Computation. Oxford University Press (2011).
[19] R. Morris, Zero-temperature Glauber dynamics on Z^d. Prob. Theory Rel. Fields 149, 3-4 (2011).
[20] R. I. M. Dunbar, Neocortex size as a constraint on group size in primates. Journal of Human Evolution 22 (6),
469-493 (1992)
[21] L. A. Meyers, Contact network epidemiology: Bond percolation applied to infectious disease prediction and
control, Bulletin of the American Mathematical Society 44 63-86 (2007).
[22] M. E. J. Newman, Networks: An Introduction. Oxford University Press (2010).
[23] A.Y. Lokhov, M. Mézard, H. Ohta, and L. Zdeborová, Inferring the origin of an epidemic with dynamic
message-passing algorithm. Phys. Rev. E 90, 012801 (2014)
[24] F. Altarelli, A. Braunstein, L. Dall’Asta, A. Ingrosso, and R. Zecchina, The zero-patient problem with noisy
observations. J. Stat. Mech P10016 (2014)
[25] F. Altarelli, A. Braunstein, L. Dall’Asta, J.R. Wakeling, and R. Zecchina, Containing epidemic outbreaks by
message-passing techniques. Phys. Rev. X 4 021024 (2014)
[26] F. Altarelli, A. Braunstein, L. Dall’Asta, and R. Zecchina, Optimizing spread dynamics on graphs by message
passing. J. Stat. Mech P09011 (2013)
[27] F. Altarelli, A. Braunstein, A. Ramezanpour, and R. Zecchina, Stochastic optimization by message passing. J.
Stat. Mech P11009 (2011)

[28] M. A. Porter and J. P. Gleeson, Dynamical systems on networks: A tutorial. arXiv:1403.7663 (2014).
[29] D. Lusseau, K. Schneider, O. J. Boisseau, P. Haase, E. Slooten, and S. M. Dawson, The bottlenose dolphin
community of Doubtful Sound features a large proportion of long-lasting associations. Behavioral Ecology and
Sociobiology 54, 396-405 (2003).
[30] P. Zhang and C. Moore, Scalable detection of statistically significant communities and hierarchies:
message-passing for modularity. Proceedings of the National Academy of Sciences 111 (51), 18144-18149 (2014).
[31] K. Hashimoto, Zeta functions of finite graphs and representations of p-adic groups. Advanced Studies in Pure
Mathematics 15 211-280 (1989).
[32] J. Pearl, Reverend Bayes on inference engines: a distributed hierarchical approach. AAAI Proceedings 82, (1982).
[33] A. Decelle, F. Krzakala, C. Moore, and L. Zdeborová, Asymptotic analysis of the stochastic block model for
modular networks and its algorithmic applications. Phys. Rev. E 84, 066106 (2011).
[34] W. W. Zachary, An information flow model for conflict and fission in small groups. Journal of Anthropological
Research 33 (4), 452-473 (1977).
[35] M. J. Keeling and P. Rohani, Modeling Infectious Diseases in Humans and Animals. Princeton and Oxford:
Princeton University Press (2008).
[36] F. Krzakala, C. Moore, E. Mossel, J. Neeman, A. Sly, L. Zdeborová, and P. Zhang, Spectral redemption in
clustering sparse networks. Proceedings of the National Academy of Sciences 110 (52), 20935-20940 (2013).
[37] B. Karrer, M. E. J. Newman, and L. Zdeborová, Percolation on sparse networks. Phys. Rev. Lett. 113, 208702
(2014).
[38] S. Chatterjee and R. Durrett, Contact processes on random graphs with power law degree distributions have
critical value 0. The Annals of Probability 37, 2332–2356 (2009).
[39] H. W. Watson and F. Galton, On the Probability of the Extinction of Families. Journal of the Anthropological
Institute of Great Britain 4, 138-144 (1875).
[40] T. Martin, X. Zhang, M. E. J. Newman, Localization and centrality in networks. Phys. Rev. E 90, 052808
(2014).
[41] D. J. D. Earn, J. Dushoff, and S. A. Levin, Ecology and evolution of the flu. Trends in Ecology & Evolution 17, 334–340
(2002).
[42] M. G. M. Gomes, L. J. White, and G. F. Medley, Infection, reinfection, and vaccination under suboptimal immune
protection: epidemiological perspectives. Journal of Theoretical Biology 228, 539–549 (2004).
[43] S. Melnik, J. A. Ward, J. P. Gleeson, and M. A. Porter, Multi-stage complex contagions. Chaos 23, 013124
(2013).
[44] J. C. Miller and E. M. Volz, Incorporating Disease and Population Structure into Models of SIR Disease in
Contact Networks. PLoS ONE 8 (8), e69162 (2013).
[45] B. Karrer and M. E. J. Newman, Competing epidemics on complex networks, Phys. Rev. E 84, 036106 (2011).
[46] J. C. Miller, Cocirculation of infectious diseases on networks, Phys. Rev. E 87, 060801 (2013).
[47] M. R. Golden, W. L. H. Whittington, H. H. Handsfield, J. P. Hughes, W. E. Stamm, M. Hogben, A. Clark,
C. Malinski, J. R. L. Helmers, K. K. Thomas, and K. K. Holmes, Effect of expedited treatment of sex partners
on recurrent or persistent gonorrhea or chlamydial infection. New England Journal of Medicine 352, 676–685
(2005).
[48] P. H. Conway, A. Cnaan, T. Zaoutis, B. V. Henry, R. W. Grundmeier, and R. Keren, Recurrent urinary tract
infections in children: risk factors and association with prophylactic antimicrobials. Journal of the American
Medical Association 2, 179–186 (2007).
[49] G. M. Jeffery, Epidemiological significance of repeated infections with homologous and heterologous strains and
species of Plasmodium. Bulletin of the World Health Organization 35, 873 (1966).
[50] L. M. Drusin, B. G. Ross, K. H. Rhodes, A. N. Krauss, and R. A. Scott, Nosocomial Ringworm in a Neonatal
Intensive Care Unit: A Nurse and Her Cat. Infection Control 21, 605–607 (2000).
