A Time-Dependent SIR Model For COVID-19 With Undetectable Infected Persons
For (Q1), we analyze the cases in China and aim to predict how the virus spreads. Specifically, we propose using a time-dependent susceptible-infected-recovered (SIR) model to analyze and predict the number of infected persons and the number of recovered persons (including deaths). The traditional SIR model has two time-invariant parameters: the transmission rate β and the recovering rate γ. The transmission rate β means that each individual has on average β contacts with randomly chosen others per unit time. On the other hand, the recovering rate γ indicates that individuals in the infected state recover or die at a fixed average rate γ. The traditional SIR model neglects the time-varying property of β and γ, and it is too simple to predict the trend of the disease precisely and effectively. Therefore, we propose using a time-dependent SIR model, where both the transmission rate β and the recovering rate γ are functions of time t. Our idea is to use machine learning methods to track the transmission rate β(t) and the recovering rate γ(t), and then use them to predict the number of infected persons and the number of recovered persons at a certain time t in the future. Our time-dependent SIR model can dynamically adjust the crucial parameters, such as β(t) and γ(t), to adapt to changes in control policies, which differs from the existing SIR and SEIR models in the literature, e.g., [3], [4], [5], [6], and [7]. For example, our model shows that a city-wide lockdown can lower the transmission rate substantially. Most data-driven and curve-fitting methods for the prediction of COVID-19, e.g., [8], [9], and [10], seem to track data perfectly; however, they lack physical insight into the spread of the disease. Moreover, they are very sensitive to the sudden change in the definition of confirmed cases on Feb. 12, 2020. On the other hand, our time-dependent SIR model can examine the epidemic control policy of the Chinese government and provide reasonable explanations. Using the data provided by the National Health Commission of the People’s Republic of China (NHC) [1], we show that the one-day prediction errors for the numbers of confirmed cases are almost all less than 3%, except for the day when the definition of confirmed cases was changed (Feb. 12, 2020).
For (Q2), the basic reproduction number R0, defined as the number of additional infections caused by an infected person before that person recovers, is one of the commonly used metrics for checking whether the disease will become an outbreak. In the classical SIR model, R0 is simply β/γ, as an infected person takes (on average) 1/γ days to recover and, during that period, is in contact with (on average) β persons per day. In our time-dependent SIR model, the basic reproduction number R0(t) is a function of time, and it is defined as β(t)/γ(t). If R0(t) > 1, the disease will spread exponentially and infect a certain fraction of the total population. Otherwise, the disease will eventually be contained. Therefore, by observing the change of R0(t) with respect to time, or even predicting R0(t) in the future, we can check whether certain epidemic control policies are effective. Using the data provided by the National Health Commission of the People’s Republic of China (NHC) [1], we show that the turning point (peak), defined as the day on which the basic reproduction number drops below 1, is predicted to be Feb. 17, 2020. Moreover, the disease in China will end about 6 weeks after its peak in our (deterministic) model if the current contagious disease control policies are maintained in China. In that case, the total number of confirmed cases in China is predicted to be around 80,000 under our (deterministic) model.
For (Q3), we propose our second SIR model, which has two types of infected persons: detectable infected persons (type I) and undetectable infected persons (type II). Type I infected persons have a lower transmission rate than type II infected persons (as type I infected persons can be isolated). For such a model, whether the disease is controllable is characterized by the spectral radius of a 2 × 2 matrix. If the spectral radius of that matrix is larger than 1, then there is an outbreak. On the other hand, if it is smaller than 1, then there is no outbreak. One interesting result is that the spectral radius of that matrix is larger (resp. smaller) than 1 if the basic reproduction number R0 is larger (resp. smaller) than 1. The curve on which the spectral radius equals 1 is known as the percolation threshold curve in a phase transition diagram [11]. Using the historical data from Jan. 22, 2020 to Mar. 2, 2020 from WHO [2], we extend our study to some other countries, including Japan, Singapore, South Korea, Italy, and Iran. Our numerical results show that several countries, including South Korea, Italy, and Iran, are above the percolation threshold curve and are on the verge of COVID-19 outbreaks as of Mar. 2, 2020.
The rest of the paper is organized as follows: In Section II, we propose the time-dependent SIR model. We then extend the model to the SIR model with undetectable infected persons in Section III. In Section IV, we conduct several numerical experiments to illustrate the effectiveness of our two models. In Section V, we put forward some discussions and suggestions to control COVID-19. The paper is concluded in Section VI.
II. THE TIME-DEPENDENT SIR MODEL
A. Susceptible-infected-recovered (SIR) Model
In the typical mathematical model of infectious disease, one often simplifies the virus-host interaction and the evolution of an epidemic into a few basic disease states. One of the simplest epidemic models, known as the susceptible-infected-recovered (SIR) model [11], includes three states: the susceptible state, the infected state, and the recovered state. An individual in the susceptible state is one who does not yet have the disease at time t, but may be infected through contact with a person infected with the disease. The infected state refers to an individual who has the disease at time t and may infect a susceptible individual (if they come into contact with each other). The recovered state refers to an individual who has either recovered or died from the disease and is no longer contagious at time t. Also, a recovered individual never returns to the susceptible state. Deaths are counted in the recovered state because, from an epidemiological point of view, recovery and death are basically the same thing: in either case the individual no longer has much impact on the spread of the disease. As such, these individuals can be effectively removed from the pool of potential hosts of the disease [12]. Denote by S(t), X(t) and R(t) the numbers of
THE LATEST VERSION WILL BE PLACED ON HTTP://GIBBS1.EE.NTHU.EDU.TW/A TIME DEPENDENT SIR MODEL FOR COVID 19.PDF 3
susceptible persons, infected persons, and recovered persons at time t. In summary, we believe that the SIR model above closely resembles the COVID-19 outbreak, and we adopt it as our basic model in this paper.
The traditional SIR model has two time-invariant parameters: the transmission rate β and the recovering rate γ. The transmission rate β means that each individual has on average β contacts with randomly chosen others per unit time. On the other hand, the recovering rate γ indicates that individuals in the infected state recover or die at a fixed average rate γ. The traditional SIR model neglects the time-varying property of β and γ. This assumption is too simple to predict the trend of the disease precisely and effectively. Therefore, we propose the time-dependent SIR model, where both the transmission rate β and the recovering rate γ are functions of time t. Such a time-dependent SIR model is much better suited to tracking the spread and control of the disease and to predicting its future trend.
B. Differential Equations for the Time-dependent SIR Model
For the traditional SIR model, the three variables S(t), X(t) and R(t) are governed by the following differential equations (see, e.g., the book [11]):
\[ \frac{dS(t)}{dt} = \frac{-\beta S(t) X(t)}{n}, \]
\[ \frac{dX(t)}{dt} = \frac{\beta S(t) X(t)}{n} - \gamma X(t), \]
\[ \frac{dR(t)}{dt} = \gamma X(t). \]
We note that
\[ S(t) + X(t) + R(t) = n, \tag{1} \]
where n is the total population. Let β(t) and γ(t) be the transmission rate and the recovering rate at time t. Replacing β and γ by β(t) and γ(t) in the differential equations above yields
\[ \frac{dS(t)}{dt} = \frac{-\beta(t) S(t) X(t)}{n}, \tag{2} \]
\[ \frac{dX(t)}{dt} = \frac{\beta(t) S(t) X(t)}{n} - \gamma(t) X(t), \tag{3} \]
\[ \frac{dR(t)}{dt} = \gamma(t) X(t). \tag{4} \]
The three variables S(t), X(t) and R(t) still satisfy (1).
Now we briefly explain the intuition behind these three equations. Equation (2) describes the rate of change of the number of susceptible persons S(t) at time t. If the total population is n, then the probability that a randomly chosen person is in the susceptible state is S(t)/n. Hence, an individual in the infected state will contact (on average) β(t)S(t)/n people in the susceptible state per unit time, which implies that the number of newly infected persons is β(t)S(t)X(t)/n (as there are X(t) people in the infected state at time t). Correspondingly, the number of people in the susceptible state decreases by β(t)S(t)X(t)/n. Additionally, as every individual in the infected state recovers with rate γ(t), there are (on average) γ(t)X(t) people who recover at time t. This is shown in (4), which describes the rate of change of R(t) at time t. Since the three variables S(t), X(t) and R(t) still satisfy (1), we have
\[ \frac{dX(t)}{dt} = -\left( \frac{dS(t)}{dt} + \frac{dR(t)}{dt} \right), \]
which is the number of people changing from the susceptible state to the infected state minus the number of people changing from the infected state to the recovered state (see (3)).
C. Discrete Time Time-dependent SIR Model
Since the COVID-19 data are updated daily [1], we convert the differential equations in (2), (3), and (4) into discrete-time difference equations:
\[ S(t+1) - S(t) = \frac{-\beta(t) S(t) X(t)}{n}, \tag{5} \]
\[ X(t+1) - X(t) = \frac{\beta(t) S(t) X(t)}{n} - \gamma(t) X(t), \tag{6} \]
\[ R(t+1) - R(t) = \gamma(t) X(t). \tag{7} \]
Again, the three variables S(t), X(t) and R(t) still satisfy (1).
At the beginning of the disease spread, the number of confirmed cases is very low, and most of the population is in the susceptible state. Hence, for our analysis of the initial stage of COVID-19, we assume {S(t) ≈ n, t ≥ 0}, and further simplify (5) and (6) as follows (with (7) unchanged):
\[ S(t+1) - S(t) = -\beta(t) X(t), \tag{8} \]
\[ X(t+1) - X(t) = \beta(t) X(t) - \gamma(t) X(t). \tag{9} \]
From the difference equations above, one can easily derive β(t) and γ(t) for each day. From (7), we have
\[ \gamma(t) = \frac{R(t+1) - R(t)}{X(t)}. \tag{10} \]
Using (7) in (9) yields
\[ \beta(t) = \frac{[X(t+1) - X(t)] + [R(t+1) - R(t)]}{X(t)}. \tag{11} \]
Given the historical data from a certain period {X(t), R(t), 0 ≤ t ≤ T − 1}, we can measure the corresponding {β(t), γ(t), 0 ≤ t ≤ T − 2} by using (10) and (11). With the above information, we can use machine learning methods to predict the time-varying transmission rates and recovering rates.
D. Tracking Transmission Rate β(t) and Recovering Rate γ(t) by Ridge Regression
In this subsection, we track and predict β(t) and γ(t) by the commonly used Finite Impulse Response (FIR) filters in linear systems. Denote by β̂(t) and γ̂(t) the predicted transmission rate and recovering rate. From the FIR filters, they are predicted as follows:
\[ \hat{\beta}(t) = a_1 \beta(t-1) + a_2 \beta(t-2) + \cdots + a_J \beta(t-J) + a_0 = \sum_{j=1}^{J} a_j \beta(t-j) + a_0, \tag{12} \]
\[ \hat{\gamma}(t) = b_1 \gamma(t-1) + b_2 \gamma(t-2) + \cdots + b_K \gamma(t-K) + b_0 = \sum_{k=1}^{K} b_k \gamma(t-k) + b_0. \tag{13} \]
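As an illustration only, the measurement step in (10) and (11) can be sketched in Python as follows. The function names `measure_rates` and `simulate`, as well as the synthetic rates used to exercise them, are hypothetical and introduced here purely for the example; they are not real epidemic data. The sketch assumes the initial-stage approximation S(t) ≈ n, so that (9) and (7) hold, and it also evaluates the ratio β(t)/γ(t), which is how R0(t) is defined in the introduction.

```python
def measure_rates(X, R):
    """Measure beta(t) and gamma(t) for t = 0..T-2 from daily counts via (10) and (11)."""
    if len(X) != len(R):
        raise ValueError("X and R must have the same length")
    beta, gamma = [], []
    for t in range(len(X) - 1):
        if X[t] <= 0:
            raise ValueError("X(t) must be positive to measure the rates")
        dX = X[t + 1] - X[t]
        dR = R[t + 1] - R[t]
        gamma.append(dR / X[t])         # equation (10)
        beta.append((dX + dR) / X[t])   # equation (11)
    return beta, gamma


def simulate(X_init, R_init, beta, gamma):
    """Roll the simplified model forward with (9) and (7), assuming S(t) ~ n."""
    X, R = [X_init], [R_init]
    for b, g in zip(beta, gamma):
        x, r = X[-1], R[-1]
        X.append(x + b * x - g * x)     # equation (9)
        R.append(r + g * x)             # equation (7)
    return X, R


# Synthetic, made-up rates (an assumption for this example, not real data).
true_beta = [0.30, 0.25, 0.20, 0.15, 0.10]
true_gamma = [0.05, 0.05, 0.06, 0.08, 0.10]

X, R = simulate(100.0, 0.0, true_beta, true_gamma)
beta, gamma = measure_rates(X, R)

# Time-dependent basic reproduction number R0(t) = beta(t) / gamma(t).
reproduction = [b / g for b, g in zip(beta, gamma)]
```

Because the synthetic counts are generated by (9) and (7) themselves, the measured rates coincide with the rates used to generate them up to floating-point error; with real data, the measured rates also absorb reporting noise.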
In (12) and (13), J and K are the orders of the two FIR filters (0 < J, K < T − 2), and aj, j = 0, 1, . . . , J, and bk, k = 0, 1, . . . , K, are the coefficients of the impulse responses of these two FIR filters.
There are several widely used machine learning methods for estimating the coefficients of the impulse response of an FIR filter, e.g., ordinary least squares (OLS), regularized least squares (i.e., ridge regression), and partial least squares (PLS) [13]. In this paper, we choose ridge regression as our estimation method, which solves the following optimization problems:
\[ \min_{a_j} \sum_{t=J}^{T-2} \left( \beta(t) - \hat{\beta}(t) \right)^2 + \alpha_1 \sum_{j=0}^{J} a_j^2, \tag{14} \]
\[ \min_{b_k} \sum_{t=K}^{T-2} \left( \gamma(t) - \hat{\gamma}(t) \right)^2 + \alpha_2 \sum_{k=0}^{K} b_k^2, \tag{15} \]
where α1 and α2 are the regularization parameters.
E. Tracking the Number of Infected Persons X̂(t) and the Number of Recovered Persons R̂(t) of the Time-dependent SIR Model
ALGORITHM 1: Tracking Discrete Time Time-dependent SIR Model
Input: {X(t), R(t), 0 ≤ t ≤ T − 1}, regularization parameters α1 and α2, orders of the FIR filters J and K, prediction window W.
Output: {β(t), γ(t), 0 ≤ t ≤ T − 2}, {β̂(t), γ̂(t), t ≥ T − 1}, and {X̂(t), R̂(t), t ≥ T}.
1: Measure {β(t), γ(t), 0 ≤ t ≤ T − 2} using (11) and (10), respectively.
2: Train the ridge regression using (14) and (15).
3: Estimate β̂(T − 1) and γ̂(T − 1) by (12) and (13), respectively.
4: Estimate the number of infected persons X̂(T) and recovered persons R̂(T) on the next day T using (16) and (17), respectively.
5: while T ≤ t ≤ T + W do
6:   Estimate β̂(t) and γ̂(t) by (12) and (13), respectively.
7:   Predict X̂(t + 1) and R̂(t + 1) using (18) and (19), respectively.
8: end while
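As a rough, dependency-free sketch (not the authors' implementation), the steps of Algorithm 1 might look as follows in Python. All function names, the `horizon` argument (which plays the role of the prediction window), and the synthetic counts are hypothetical and introduced only for illustration. Two assumptions are made that the text does not spell out: the ridge problems (14) and (15) are solved in closed form through their normal equations, and for days beyond the observed data the FIR filters are fed their own earlier predictions in place of measured rates.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def fit_fir_ridge(series, order, alpha):
    """Minimise the objective of (14)/(15); returns [a_0, a_1, ..., a_order]."""
    if alpha <= 0 or len(series) <= order:
        raise ValueError("need alpha > 0 and more samples than the filter order")
    rows = [[1.0] + [series[t - j] for j in range(1, order + 1)]
            for t in range(order, len(series))]
    targets = series[order:]
    d = order + 1
    # Normal equations (Z^T Z + alpha I) w = Z^T y.
    # The constant term a_0 is penalised too, as the sums in (14)/(15) start at 0.
    A = [[sum(z[i] * z[k] for z in rows) + (alpha if i == k else 0.0)
          for k in range(d)] for i in range(d)]
    b = [sum(z[i] * y for z, y in zip(rows, targets)) for i in range(d)]
    return solve(A, b)


def fir_predict(coeffs, history):
    """Evaluate (12)/(13) using the most recent values at the end of history."""
    return coeffs[0] + sum(c * history[-j] for j, c in enumerate(coeffs[1:], start=1))


def track_and_predict(X, R, J, K, alpha1, alpha2, horizon):
    """Return predicted (X_hat, R_hat) for the `horizon` days after the data."""
    T = len(X)
    # Step 1: measure the rates with (11) and (10).
    beta = [((X[t + 1] - X[t]) + (R[t + 1] - R[t])) / X[t] for t in range(T - 1)]
    gamma = [(R[t + 1] - R[t]) / X[t] for t in range(T - 1)]
    # Step 2: train the two ridge regressions.
    a = fit_fir_ridge(beta, J, alpha1)
    b = fit_fir_ridge(gamma, K, alpha2)
    # Remaining steps: predict the rates, then roll (16)-(19) forward.
    # Assumption: predicted rates are appended and reused by the filters.
    beta_hat, gamma_hat = list(beta), list(gamma)
    X_hat, R_hat = [X[-1]], [R[-1]]
    for _ in range(horizon):
        bt = fir_predict(a, beta_hat)
        gt = fir_predict(b, gamma_hat)
        beta_hat.append(bt)
        gamma_hat.append(gt)
        x, r = X_hat[-1], R_hat[-1]
        X_hat.append((1 + bt - gt) * x)   # (16) / (18)
        R_hat.append(r + gt * x)          # (17) / (19)
    return X_hat[1:], R_hat[1:]


# Synthetic, made-up counts (not real data): constant beta = 0.2 and gamma = 0.1.
X, R = [100.0], [0.0]
for _ in range(11):
    R.append(R[-1] + 0.1 * X[-1])
    X.append(1.1 * X[-1])

X_hat, R_hat = track_and_predict(X, R, J=3, K=3, alpha1=1e-6, alpha2=1e-6, horizon=5)
```

With constant synthetic rates and a very small regularization parameter, the first predicted value stays close to the 10% daily growth built into the data; larger values of α1 and α2 shrink the filter coefficients and pull the predicted rates toward zero.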
In this section, we show how we use the two FIR filters to track and predict the number of infected persons and the number of recovered persons in the time-dependent SIR model. Given a period of historical data {X(t), R(t), 0 ≤ t ≤ T − 1}, we first measure {β(t), γ(t), 0 ≤ t ≤ T − 2} by (10) and (11). Then we solve the ridge regression (with the objective functions in (14) and (15) and the constraints in (12) and (13)) to learn the coefficients of the FIR filters, i.e., aj, j = 0, 1, . . . , J and bk, k = 0, 1, . . . , K. Once we learn these coefficients, we can predict β̂(t) and γ̂(t) at time t = T − 1 by the trained ridge regression in (12) and (13).
Denote by X̂(t) (resp. R̂(t)) the predicted number of infected (resp. recovered) persons at time t. To predict X̂(t) and R̂(t) at time t = T, we simply replace β(t) and γ(t) by β̂(t) and γ̂(t) in (7) and (9). This leads to
\[ \hat{X}(T) = \left( 1 + \hat{\beta}(T-1) - \hat{\gamma}(T-1) \right) X(T-1), \tag{16} \]
\[ \hat{R}(T) = R(T-1) + \hat{\gamma}(T-1) X(T-1). \tag{17} \]
To predict X̂(t) and R̂(t) for t > T, we estimate β̂(t) and γ̂(t) by using (12) and (13). Similar to (16) and (17), we predict X̂(t) and R̂(t) as follows:
\[ \hat{X}(t+1) = \left( 1 + \hat{\beta}(t) - \hat{\gamma}(t) \right) \hat{X}(t), \quad t \ge T, \tag{18} \]
\[ \hat{R}(t+1) = \hat{R}(t) + \hat{\gamma}(t) \hat{X}(t), \quad t \ge T. \tag{19} \]
The detailed steps of our tracking/predicting method are outlined in Algorithm 1.
We note that this deterministic epidemic model is based on the mean-field approximation for X(t) and R(t). Such an approximation is a result of the law of large numbers. Therefore, when X(t) and R(t) are relatively small, the mean-field approximation may not be as accurate as expected. In those cases, one might have to resort to stochastic epidemic models, such as Markov chains.
III. THE SIR MODEL WITH UNDETECTABLE INFECTED PERSONS
According to the recent report from WHO [2], only 87.9% of COVID-19 patients have a fever, and 67.7% of them have a dry cough. This means that there exist asymptomatic infections. Recent studies in [7] and [14] also pointed out the existence of asymptomatic carriers of COVID-19. Those people are unaware that they are contagious, and thus infect more people. The transmission rate can increase dramatically in this circumstance.
To take the undetectable infected persons into account, we propose the SIR model with undetectable infected persons in this section. We assume that there are two types of infected persons. The individuals who are detectable (with obvious symptoms) are categorized as type I infected persons, and the asymptomatic individuals who are undetectable are categorized as type II infected persons. An infected individual is of type I with probability w1 and of type II with probability w2, where w1 + w2 = 1. Besides, the two types of infected persons have different transmission rates and recovering rates, depending on whether or not they are under treatment or isolation. We denote by β1(t) and γ1(t) the transmission rate and the recovering rate of type I at time t. Similarly, β2(t) and γ2(t) are the transmission rate and the recovering rate of type II at time t.
A. The Governing Equations for the SIR Model with Undetectable Infected Persons
Now we derive the governing equations for the SIR model with two types of infected persons. Let X1(t) (resp. X2(t)) be the number of type I (resp. type II) infected persons at time t. Similar to the derivation of (6), (7) in Section II-C, we assume that {S(t) ≈ n, t ≥ 0} in the initial stage of the epidemic
and split X(t) into two types of infected persons. We have the following difference equations:
Now we find the larger eigenvalue of the matrix A. Let I be the 2 × 2 identity matrix and
reference value. We mark these data points for β(t) and γ(t) with the gray dashed curve.
B. Parameter Setup
For our time-dependent SIR model, we set the orders of the FIR filters for predicting β(t) and γ(t) to 3, i.e., J = K = 3. The stopping criterion of the model is set to X(t) ≤ 0. Since the numbers of infected persons before Jan. 27, 2020 are too