HL Math IA v2

Jenna Kohls

Ms. Hadden
IB HL Math
15 October 2016
IA Final Rough Draft
A History of Ebola Outbreaks through the SIR Model
Abstract
The first outbreak of Ebola occurred in the country currently recognized as the
Democratic Republic of the Congo in 1976. There have been multiple outbreaks in
different areas since then, but by far the most significant and most deadly was the 2014
outbreak.1 From both a mathematical and epidemiological standpoint, there is much to
be learned from this outbreak.
The SIR model is a method of calculating disease spread as a function of time, using
three equations for the number of people Susceptible to, Infected with, and
Recovered from a disease.2 By examining the outbreak of 2014 with this model, it can
be seen that //conclusion

Aim

1 (BBC News, 2016)


2 (Smith & Moore, 2004)

The Ebola outbreak drew the public's attention to a serious deficiency in awareness and
research. It is a matter of public health and safety that the most accurate mathematical
methods are used to predict and describe the spread of diseases, particularly
those which are capable of killing thousands. By analyzing this most recent
outbreak, I aim to explore the efficacy of the SIR model, and determine its values and
limitations in predicting the spread of Ebola.
Rationale
The SIR model has long been a standard in epidemiological modeling, as an intersection
of accuracy and simplicity.3 For a model to be worthwhile, it clearly needs to be as
accurate as possible, but it is also important to consider things beyond accuracy.
Precision, as opposed to accuracy, is the ability of results to be replicated and
generalized. Data can be precise but not accurate, or accurate but not precise. When it
comes to modeling disease spread, the precision of the results is just as important as
the accuracy. While models are often used and tested retroactively, their most valuable
function is their ability to predict future disease spread. If a model is fitted too
closely to one outbreak, it will not generalize well, meaning that it will lose precision
when applied to situations outside of the original. A model derived from a specific
outbreak may match the actual results perfectly, but it must be highly detailed in order
to reach that level of accuracy. If the same model, accurate for one outbreak, is then
used to predict the results of a new outbreak, its results may be less predictive of
reality than those of a less accurate model. Essentially, a detailed or complex model is
not necessarily superior, and in order to generalize a model and get the most use out of
it, a certain level of accuracy must be sacrificed.

3 (Smith & Moore, 2004)
This contradiction, this classic struggle, between precision and accuracy is both a
fundamental principle of scientific study and a complex philosophical discussion, which I
find fascinating. It is similar to the Heisenberg uncertainty principle, which asserts that
it is impossible to measure both the position and the momentum of a particle exactly at
the same time. It has a strong mathematical foundation behind it, because at a
sufficiently small scale of measurement, the uncertainty becomes large enough that the
measurement loses all meaning.4 However, this mathematical statement also makes sense on
a philosophical level. When you focus too much on where an object is, you can't see where
it is going, and vice versa. In the same way, if you are too focused on one moment in time
or point in your life, you can't properly see where your life is headed. Conversely, if
you are too focused on your future, you can't properly appreciate each moment. Everything
comes down to striking a balance between the two. The duality of this principle, the
intersection of science and philosophy, is beautiful to me, an art all its own.
The SIR Model is the perfect example of this conflict between precision and accuracy.
Researchers are constantly creating new and increasingly complex models to map the
spread of specific diseases, but the SIR model requires only three functions, and its
principles apply to a host of different diseases.5 Therefore, by analyzing this significant
Ebola outbreak, the practical efficiency of the SIR model can be explored and
evaluated. In order to be justified, its results should compare well with actual statistics,
while avoiding unnecessarily complicated calculations.

4 (Schombert, 2005)
5 (Weisstein)
Introduction
The 2014 outbreak killed more than five times as many people as all other known outbreaks
combined. As of January 2016, 11,315 people had been reported as having died from
the disease in six countries: Liberia, Guinea, Sierra Leone, Nigeria, the US and Mali.
The total number of reported cases is about 28,637. On 13 January 2016, the World
Health Organization declared the last of the countries affected, Liberia, to be
Ebola-free.6 As this outbreak has now come to an end, it becomes important to reflect on
the meaningfulness of the data collected. This most recent outbreak drew more attention
to Ebola worldwide than ever before. Ironically, it was also largely caused
by a lack of preparation and serious attention being given to the disease prior to the
outbreak.
The SIR Model
The SIR Model uses the following three variables:
S = number of people that are susceptible to the disease
I = number of people infected with the disease
R = number of people recovered from the disease, with total immunity

6 (BBC News, 2016)



The model assumes a fixed population of N people, and only works in a closed system,
where there are no births, and no deaths other than those caused by the disease.
Therefore the total population can be written as:7
$$N = S + I + R$$
Although this is a simplification, on short time scales the use of a closed system is
beneficial for keeping the model neat.

Equation 1:

$$\frac{dS}{dt} = -\beta I S$$

In Equation 1, $\frac{dS}{dt}$ refers to the rate of change of the number of people
susceptible to the disease over time. S decreases at a rate proportional to $I$ and $S$
because of the nature of the three categories. As people become infected, they are no
longer susceptible to the disease. The only way to leave the set of susceptible people is
by becoming infected, therefore the rate of change of the number of people who are
susceptible is a function of the number of those who are already susceptible, the number
of those who are already infected, and the amount of contact between the susceptible and
infected. $\beta$ refers to the rate of infection. This is calculated for each individual
case, and will be expanded on later.

Equation 2:

$$\frac{dR}{dt} = \gamma I$$

$\frac{dR}{dt}$ refers to the rate of change of the number of people recovered over time.
This illustrates that the rate at which people recover is dependent upon the
number of people infected, as in order to become recovered, one must have been
infected. If the duration of the time infected is shorter, then the rate of recovery
increases. Therefore, $\frac{dR}{dt}$ is directly proportional to $I$, with $\gamma$ as
the constant of proportionality. Again, $\gamma$ is a parameter, in this case the rate of
recovery, and will be expanded upon later.

7 (Dolgoarshinnykh & Lalley, 2002)

Equation 3:

$$\frac{dI}{dt} = \beta I S - \gamma I$$

In Equation 3, $\frac{dI}{dt}$ refers to the rate of change of the number of people
infected. This is dependent on the number of people susceptible and the number of people
infected, as well as the infection rate of the disease between the two compartments. As
the population of $I$ increases, the population of $S$ decreases, because in order for
there to be more infected people, there must be a decrease in the number of susceptible
people. Because the total population $N = S + I + R$ is fixed, the three rates of change
must add up to zero. Thus, this equation is a consequence of the fact that:

$$\frac{dI}{dt} = -\frac{dS}{dt} - \frac{dR}{dt}$$

into which we can substitute Equations 1 and 2, giving us the final equation.
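As a rough illustration, the three rate equations can be collected into one small Python function. This is only a sketch: the name `sir_derivatives` and the numbers passed to it are made up for the example and are not part of the investigation. Here `beta` stands for the rate of infection and `gamma` for the rate of recovery.

```python
def sir_derivatives(s, i, r, beta, gamma):
    """Return (dS/dt, dI/dt, dR/dt) for the SIR model.

    beta is the rate of infection and gamma is the rate of recovery.
    r is accepted for symmetry; none of the three rates depends on it.
    """
    new_infections = beta * i * s   # people moving from S to I per day
    new_recoveries = gamma * i      # people moving from I to R per day
    ds_dt = -new_infections
    di_dt = new_infections - new_recoveries
    dr_dt = new_recoveries
    return ds_dt, di_dt, dr_dt


# Illustrative values only (assumed for this example).
ds, di, dr = sir_derivatives(s=990, i=10, r=0, beta=0.0005, gamma=0.1)
print(ds, di, dr)
# The three rates cancel, so N = S + I + R does not change.
print(abs(ds + di + dr) < 1e-12)
```

Writing the rate of change of I as the difference between new infections and new recoveries makes it easy to see why the three rates always sum to zero.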


Parameters

In addition to $\beta$ (the rate of infection) and $\gamma$ (the rate of recovery), it is
necessary to define two other parameters for this model:

D = Duration of disease for those recovered
M = Mortality rate for those who die per day

Based on the previous 30 years of Ebola data, M has been calculated by the World
Health Organization as 0.7, or 70%. This figure incorporates the known clinical outcome
of the countries in which Ebola is prevalent.8

8 (Epatko, 2014)

Two additional equations are generated from these parameters:

Equation 4:

$$\gamma = \frac{1}{D}$$

The rate of recovery is the reciprocal of the duration of the disease, as a given
individual can only experience one recovery in a given period of time. For example, if
the duration of the time spent infected is 10 days, then the rate at which an infected
person becomes recovered is:

$$\gamma = \frac{1}{10} = 0.1 = 10\% \text{ per day}$$

Equation 5:

$$\beta = \frac{M}{S}$$

This equation shows that the infection rate of the disease is dependent on the mortality
rate and the number of people susceptible to the disease. This value is always between
0 and 1, where a value of 1 suggests a 100% infection rate and a value of 0 suggests a
0% infection rate. For example, if the mortality rate of the population is 50% and the
number of people susceptible is 100, then the rate of infection would be

$$\beta = \frac{0.5}{100} = 0.005, \text{ or } 0.5\%$$
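The two worked examples for Equations 4 and 5 can be checked with a few lines of Python. This is a minimal sketch; the function names `recovery_rate` and `infection_rate` are invented here for illustration.

```python
def recovery_rate(duration_days):
    """Equation 4: gamma is the reciprocal of the disease duration D."""
    return 1 / duration_days


def infection_rate(mortality_rate, susceptible):
    """Equation 5 as defined in this exploration: beta = M / S."""
    return mortality_rate / susceptible


gamma_example = recovery_rate(10)          # D = 10 days
beta_example = infection_rate(0.5, 100)    # M = 50%, S = 100
print(gamma_example)  # 0.1
print(beta_example)   # 0.005
```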

Evaluation of the SIR model on the 2014 Liberian Outbreak


If we now take the example of the Ebola outbreak in Liberia in 2014, we can assign the
parameters the following values.
According to the CDC,9 the total population of Liberia, N, is 4,294,000, and according to
data from the WHO,10 the number of people infected is I = 846 and the number of people
dead is 481. Seeing as R includes everyone who has received permanent immunity, it
includes those who have died in addition to those who have recovered with permanent
immunity. Therefore, the number of people recovered is
$R = 481 + (0.3 \times 846) \approx 735$, and the parameters can be given the following
values:

$$N = 4294000$$
$$I = 846$$
$$R = 735$$

Therefore, $S = N - (I + R) = 4294000 - (735 + 846) = 4292419$.

The duration of the disease ranges from 2 to 18 days, therefore we can roughly
estimate the duration of the disease at the midpoint, i.e. 10 days.

9 (Centers for Disease Control and Prevention, 2014)


10 (WHO)

$$D = 10, \qquad \gamma = \frac{1}{10} = 0.1$$

As discussed earlier, the mortality rate of Ebola is 0.7 and the number of people
susceptible is 4292419. Therefore, from Equation 5,

$$\beta \text{ (the rate of infection)} = \frac{0.7}{4292419} = 1.63 \times 10^{-7}$$
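The starting values and the two parameters can be reproduced with a short Python calculation. This is a sketch of the arithmetic only; the variable names are my own, and the factor 0.3 is simply the one used in the calculation of R.

```python
N = 4_294_000            # total population of Liberia
infected = 846           # I at the start
deaths = 481             # people dead
recovering_share = 0.3   # factor applied to the infected when calculating R
M = 0.7                  # mortality rate
D = 10                   # estimated duration of the disease in days

# R counts the dead plus the share of the infected treated as recovered
recovered = round(deaths + recovering_share * infected)
susceptible = N - (infected + recovered)

gamma = 1 / D            # Equation 4
beta = M / susceptible   # Equation 5

print(recovered, susceptible)        # 735 4292419
print(gamma, f"{beta:.3g}")          # 0.1 1.63e-07
```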

In order to use the SIR model to predict the evolution of the disease, it would be helpful
if we could solve the system of differential equations. Unfortunately, we cannot
completely solve these equations with an explicit formula.11

11 (Matemtic, 2013)

Therefore, for each day, the values of $\frac{dS}{dt}$, $\frac{dI}{dt}$ and
$\frac{dR}{dt}$ will be calculated using Equations 1, 2 and 3. Then assume that the S
value for the following day is the previous S value + $\frac{dS}{dt}$ for that point in
time, and likewise for I and R. Here can be seen the transition from t = 0 to t = 1.
Using Equations 1, 2 and 3 from earlier, with $\beta$ rounded to $1.6 \times 10^{-7}$
(the value that gives the figures below), the following values for the three rates of
change of S, I and R can be calculated.

$$\left.\frac{dS}{dt}\right|_{t=0} = -(1.6 \times 10^{-7}) \times 846 \times 4292419 \approx -581$$

$$\left.\frac{dI}{dt}\right|_{t=0} = (1.6 \times 10^{-7}) \times 846 \times 4292419 - (0.1 \times 846) \approx 496$$

$$\left.\frac{dR}{dt}\right|_{t=0} = 0.1 \times 846 \approx 85$$

Therefore, at t = 1,

$$S(1) = 4292419 - 581 = 4291838$$
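The day-by-day procedure described above is a forward Euler method with a step of one day. The sketch below shows one way it might be written in Python. The function name `euler_sir` is invented for this example, and the infection rate is taken as 1.6e-7, which is an assumption on my part: it is the rounded value that reproduces the figure of -581 for the first day.

```python
def euler_sir(s, i, r, beta, gamma, days):
    """Advance the SIR model one day at a time (forward Euler, step = 1 day).

    Returns one tuple (t, S, I, R, dS/dt, dI/dt, dR/dt) per day.
    """
    rows = []
    for t in range(days + 1):
        ds = -beta * i * s    # Equation 1
        dr = gamma * i        # Equation 2
        di = -ds - dr         # Equation 3: beta*I*S - gamma*I
        rows.append((t, s, i, r, ds, di, dr))
        # next day's value = today's value + today's rate of change
        s, i, r = s + ds, i + di, r + dr
    return rows


rows = euler_sir(4292419, 846, 735, beta=1.6e-7, gamma=0.1, days=60)
for t, s, i, r, ds, di, dr in rows[:3]:
    print(t, round(s), round(i), round(r), round(ds), round(di), round(dr))
```

The values are kept unrounded between steps and only rounded for printing, which is how a spreadsheet would usually behave.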

The following table shows the results of this calculation over a two month period.

 t  Susceptible  Infected  Recovered    dS/dt    dI/dt   dR/dt    S+I+R
 0      4292419       846        735     -581      496      85  4294000
 1      4291838      1342        820     -922      788     134  4294000
 2      4290916      2130        954    -1462     1249     213  4294000
 3      4289454      3379       1167    -2319     1981     338  4294000
 4      4287134      5361       1505    -3677     3141     536  4294000
 5      4283457      8502       2041    -5827     4977     850  4294000
 6      4277631     13478       2891    -9225     7877    1348  4294000
 7      4268406     21355       4239   -14585    12449    2136  4294000
 8      4253821     33804       6374   -23008    19627    3380  4294000
 9      4230814     53432       9755   -36169    30826    5343  4294000
10      4194644     84258      15098   -56549    48123    8426  4294000
11      4138095    132381      23524   -87649    74411   13238  4294000
12      4050446    206792      36762  -134016   113337   20679  4294000
13      3916430    320129      57441  -200602   168589   32013  4294000
14      3715828    488718      89454  -290559   241687   48872  4294000
15      3425269    730405     138326  -400294   327253   73041  4294000
16      3024975   1057658     211366  -511902   406137  105766  4294000
17      2513073   1463795     317132  -588580   442200  146379  4294000
18      1924493   1905995     463512  -586892   396292  190600  4294000
19      1337601   2302288     654111  -492727   262498  230229  4294000
20       844874   2564786     884340  -346707    90229  256479  4294000
21       498167   2655015    1140819  -211622   -53879  265501  4294000
22       286544   2601136    1406320  -119255  -140859  260114  4294000
23       167290   2460277    1666434   -65853  -180175  246028  4294000
24       101437   2280102    1912461   -37006  -191004  228010  4294000
25        64431   2089097    2140471   -21537  -187373  208910  4294000
26        42895   1901724    2349381   -13052  -177121  190172  4294000
27        29843   1724604    2539554    -8235  -164226  172460  4294000
28        21608   1560378    2712014    -5395  -150643  156038  4294000
29        16213   1409735    2868052    -3657  -137316  140973  4294000
30        12556   1272418    3009025    -2556  -124686  127242  4294000
31        10000   1147733    3136267    -1836  -112937  114773  4294000
32         8164   1034796    3251040    -1352  -102128  103480  4294000
33         6812    932668    3354520    -1017   -92250   93267  4294000
34         5796    840418    3447787     -779   -83262   84042  4294000
35         5016    757155    3531828     -608   -75108   75716  4294000
36         4409    682047    3607544     -481   -67724   68205  4294000
37         3927    614324    3675749     -386   -61046   61432  4294000
38         3541    553277    3737181     -313   -55014   55328  4294000
39         3228    498263    3792509     -257   -49569   49826  4294000
40         2971    448694    3842335     -213   -44656   44869  4294000
41         2757    404038    3887205     -178   -40226   40404  4294000
42         2579    363813    3927608     -150   -36231   36381  4294000
43         2429    327581    3963990     -127   -32631   32758  4294000
44         2302    294951    3996748     -109   -29386   29495  4294000
45         2193    265564    4026243      -93   -26463   26556  4294000
46         2100    239101    4052799      -80   -23830   23910  4294000
47         2019    215271    4076709      -70   -21458   21527  4294000
48         1950    193814    4098236      -60   -19321   19381  4294000
49         1889    174493    4117618      -53   -17397   17449  4294000
50         1837    157096    4135067      -46   -15663   15710  4294000
51         1791    141433    4150777      -41   -14103   14143  4294000
52         1750    127330    4164920      -36   -12697   12733  4294000
53         1714    114633    4177653      -31   -11432   11463  4294000
54         1683    103201    4189116      -28   -10292   10320  4294000
55         1655     92909    4199436      -25    -9266    9291  4294000
56         1631     83642    4208727      -22    -8342    8364  4294000
57         1609     75300    4217091      -19    -7511    7530  4294000
58         1589     67789    4224621      -17    -6762    6779  4294000
59         1572     61028    4231400      -15    -6087    6103  4294000
60         1557     54940    4237503      -14    -5480    5494  4294000

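The table was generated in a spreadsheet with this format, where beta and gamma stand for
the cells holding those two parameters:

       A    B (S)  C (I)  D (R)  E (dS/dt)    F (dI/dt)              G (dR/dt)  H (S+I+R)
Row 2  t    B2     C2     D2     E2           F2                     G2         B2+C2+D2
Row 3  t+1  B2+E2  C2+F2  D2+G2  -beta*C3*B3  beta*C3*B3 - gamma*C3  gamma*C3   B3+C3+D3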

Data Analysis

The results show an initial steep increase in the number of infected, which eventually
peaks and then falls away, while at the same time the number of recovered people
increases. The three equations relate to each other in a way that fits with the way Ebola
was likely spread, with a large increase at the beginning that gradually decreases as
awareness of the disease spreads. The peak in I can also be found by taking the
derivative of I, which is $\frac{dI}{dt}$, and finding where it is equal to zero.
Checking the table, we see that the derivative of I goes from positive to negative
between t = 20 and t = 21, meaning that with this model, the highest number of patients
actively experiencing Ebola occurs about 21 days into the spread of the disease
(I = 2,655,015 at t = 21).
Also note that the number of susceptible people will never reach zero, only tending
towards it, because the only way for the entire population to be unsusceptible would be
a complete wipe of the population or the introduction of a vaccine.
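The condition for the peak can also be checked with a short calculation. Setting the rate of change of I to zero, with I greater than zero, means the infection term and the recovery term must balance, which happens when S equals gamma divided by beta. The sketch below assumes beta = 1.6e-7 and gamma = 0.1 (the rounded values that reproduce the first row of the table) and compares the result with the S values in the table for days 20 and 21.

```python
beta = 1.6e-7    # rate of infection, rounded (an assumption of this sketch)
gamma = 0.1      # rate of recovery

# dI/dt = I * (beta * S - gamma) is zero (for I > 0) when S = gamma / beta
s_threshold = gamma / beta
print(round(s_threshold))  # 625000

# Susceptible values copied from the table on either side of the sign change
s_day_20 = 844874
s_day_21 = 498167
print(s_day_21 < s_threshold < s_day_20)  # True
```

The threshold falls between the two table values, which agrees with the sign change of the derivative between t = 20 and t = 21.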
Discussion of the SIR model
Values

It is a very quick and straightforward model. With minimal outside data, we were able to
realistically model the spread of Ebola. As the efficiency of computing increases, this
becomes more and more important. It also has clearly defined parameters for such
outside data, like the mortality rate of a disease, making it easier and more valid to
generalize to another disease.
Limitation
The calculated beta and gamma values are often inaccurate, and a small deviation from
the correct value can result in great changes in the overall model.
For example, changing the gamma value from 0.1 to 0.3 can lead to the following
changes:

In this situation, a skewed value in the duration of sickness can drastically alter the
results.
A main weakness of this model is that it relies on a closed system, meaning it
cannot and does not account for any births or any deaths caused by something other
than the disease. This is, of course, unrealistic. On a small scale, the differences may
be negligible, but before too much weight is placed on the SIR model's predictions, a
way to compensate for this would need to be created.
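The effect of changing gamma can be explored by rerunning the day-by-day calculation with each value and comparing the largest number of infected people. This is a sketch under my own assumptions: the function name `peak_infected` is invented, the step is one day, and beta is taken as 1.6e-7.

```python
def peak_infected(beta, gamma, s=4292419, i=846, r=735, days=60):
    """Largest daily value of I from a forward Euler run with a 1-day step."""
    peak = i
    for _ in range(days):
        ds = -beta * i * s            # rate of change of S
        dr = gamma * i                # rate of change of R
        s, i, r = s + ds, i - ds - dr, r + dr
        peak = max(peak, i)
    return peak


for gamma in (0.1, 0.3):
    print(gamma, round(peak_infected(beta=1.6e-7, gamma=gamma)))
```

A larger gamma means infected people leave the infected group sooner, so the peak is lower.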


Comparison to Recorded Data

[Figure: I model vs. I actual]

Now that the Ebola outbreak has been officially declared ended, we can compare the SIR
model's predictions to the actual outcome in Liberia, using statistics from the WHO12.
Time (days)  I model  I actual
         10    84258      1378
         15   730405      1680
         20  2564786      1871
         25  2089097      2046
         30  1272418      2407
         35   757155      3022
         40   448694      3280
         45   265564      3696
         50   157096      3834
         55    92909      4076
         60    54940      4262
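To put a number on the gap between the two columns, the ratio of the model value to the recorded value can be calculated for each day. The short sketch below uses only the values from the comparison table; the variable names are my own.

```python
days = [10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60]
i_model = [84258, 730405, 2564786, 2089097, 1272418, 757155,
           448694, 265564, 157096, 92909, 54940]
i_actual = [1378, 1680, 1871, 2046, 2407, 3022, 3280, 3696, 3834, 4076, 4262]

# How many times larger the model value is than the recorded value
ratios = [model / actual for model, actual in zip(i_model, i_actual)]

for day, ratio in zip(days, ratios):
    print(f"day {day:2d}: model is {ratio:7.1f} times the recorded value")
```

On every day in the table the model value is larger than the recorded value, with the widest gap at day 20.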

12 (Chretien, Riley, & George, 2015)



Because of the limitations of graphing the two data sets on the same set of axes, the
actual I data appears like a graph of y = 0 in comparison to the SIR model's results.
Therefore, it needs to be graphed separately to see the actual shape of the data.

[Figure: I(t) actual]

We can see that the SIR model has significantly inflated the number of people who were
infected with Ebola, and the overall shape of the graph is quite different. As discussed
earlier, however, a different gamma value can change the SIR model drastically, and is
difficult to calculate accurately. Accordingly, I was able to find a different gamma value
(the rate of recovery) that generated a result similar to the actual data. It is graphed
below in blue against the actual data, with a gamma value of 0.679995559.

[Figure: Adjusted value (0.679995559)]

In order to get a graph as close as this is to the actual data, I had to use nine significant
figures, and it still is not an exact match. This demonstrates the level of accuracy
required in the parameters for the SIR model to work, because the gamma value is
calculated through extreme simplification.
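One way a better gamma value could be searched for is to try many candidates and keep the one whose results are closest to the recorded data. The sketch below is an assumption about how such a search might look, not the method used to find 0.679995559: it uses a simple grid of candidates, a sum of squared differences as the measure of closeness, and beta = 1.6e-7, so its answer should not be expected to match the adjusted value exactly. The names `infected_by_day` and `squared_error` are invented for this example.

```python
# Recorded I values from the comparison table, keyed by day
ACTUAL = {10: 1378, 15: 1680, 20: 1871, 25: 2046, 30: 2407, 35: 3022,
          40: 3280, 45: 3696, 50: 3834, 55: 4076, 60: 4262}


def infected_by_day(beta, gamma, s=4292419, i=846, r=735, days=60):
    """Forward Euler run (1-day step); returns the I values for days 0..days."""
    values = [i]
    for _ in range(days):
        ds = -beta * i * s
        dr = gamma * i
        s, i, r = s + ds, i - ds - dr, r + dr
        values.append(i)
    return values


def squared_error(gamma, beta=1.6e-7):
    """Sum of squared differences between modelled and recorded I."""
    model = infected_by_day(beta, gamma)
    return sum((model[day] - actual) ** 2 for day, actual in ACTUAL.items())


# Candidate gamma values from 0.050 to 1.000 in steps of 0.001
candidates = [k / 1000 for k in range(50, 1001)]
best_gamma = min(candidates, key=squared_error)
print(best_gamma, squared_error(best_gamma) < squared_error(0.1))
```

A finer grid around the best candidate would give more decimal places, which hints at why so many significant figures were needed to get a close match.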
Conclusion
This exploration was able to evaluate the effectiveness of the SIR model as an
intersection of precision and accuracy. Clearly, after being compared to actual data, the
model cannot accurately account for all of the variables that affect disease spread, and
resulted in a prediction very different from reality. However, the model, once adjusted
for a more accurate rate of recovery, produced a remarkably similar result with a
relatively small number of calculations involved. Therefore, while not being the most
accurate model for the spread of Ebola, the SIR model was able to be precise, and
therefore maintains an important role in the modeling of disease spread.
Bibliography
BBC News. (2016). Ebola: Mapping the Outbreak. British Broadcasting Corporation.
Dolgoarshinnykh, R., & Lalley, S. P. (2002). Epidemic Modeling: SIRS Models.
Epatko, L. (2014, October 16). 70 percent Ebola death rate? Here's how they calculate
it. Retrieved from PBS News Hour:
http://www.pbs.org/newshour/rundown/70-percent-ebola-death-rate-calculate/
IB Maths Resources from British International School Phuket. (2014). Modelling
Infectious Diseases.
Schombert, J. (2005, April 21). Uncertainty Principle. (U. o. Oregon, Producer)
Retrieved from 21st Century Science:
http://abyss.uoregon.edu/~js/21st_century_science/lectures/lec14.html
Smith, D., & Moore, L. (2004, December). The SIR Model for Spread of Disease - The
Differential Equation Model. Retrieved from Mathematical Association of America:
http://www.maa.org/press/periodicals/loci/joma/the-sir-model-for-spread-of-disease-the-differential-equation-model
Weisstein, E. W. (n.d.). SIR Model. Retrieved from
http://mathworld.wolfram.com/SIRModel.html
