Probability and Probability Distributions
Department of Epidemiology and Biostatistics
Objectives of the chapter
At the end of this chapter, students are expected to understand the
following:
Probability
The difference between probability and probability distribution
Types of probability
Conditional probability
Distribution for categorical variable
Distribution for continuous variable
Different distribution tables
Normal distribution
Student t-distribution
Chi-square distribution
Probability
Probability is the language of chance.
The deliberate use of chance is the central idea of statistical
designs for producing data.
Probabilities are used in everyday communication
Probability…
Medicine is also not an exact science; physicians can seldom
predict an outcome with absolute certainty.
E.g., to formulate a diagnosis, a physician must rely on available
diagnostic information about a patient
–History and physical examination
–Laboratory studies, X-ray findings, ECG, etc
Although no test result is absolutely accurate, it does affect the
probability of the presence (or absence) of a disease.
Probability cont…
An understanding of probability is fundamental for
quantifying the uncertainty that is inherent in the decision-
making process
Example: The sample space for the sex of newborns when two
mothers are in the gynecology ward to give birth is:
{MM, MF, FM, FF}
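The sample space above can be listed mechanically. A minimal sketch, assuming each birth is recorded as M or F and the order of the two mothers matters (the variable names are illustrative):

```python
from itertools import product

# Each newborn is recorded as M (male) or F (female).
outcomes_per_birth = ["M", "F"]

# Ordered pairs (first mother, second mother) make up the sample space.
sample_space = ["".join(pair) for pair in product(outcomes_per_birth, repeat=2)]
print(sample_space)  # ['MM', 'MF', 'FM', 'FF']
```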
Types of probability cont…
Empirical (or statistical) probability is based on observations
obtained from experiments (a large number of trials) or from
historical data.
$$P(E) = \frac{\text{Frequency of event } E}{\text{Total frequency}} = \frac{f}{n}$$
Example:
A medical doctor found that, out of 100,000 patients who visited
the hospital, 50 were cancer cases.
What is the probability that a patient to be examined will be
positive for cancer?
Number of trials   Frequency of event   Relative frequency
10                 3                    0.30
100                61                   0.61
1000               496                  0.496 ≈ 0.50
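The relative-frequency calculation can be sketched in a few lines. The helper name `empirical_probability` is an assumption made for illustration; the counts are those of the cancer example and the trial figures above.

```python
def empirical_probability(frequency, total):
    """Relative frequency f / n of an event seen `frequency` times in `total` trials."""
    if total <= 0:
        raise ValueError("total must be positive")
    return frequency / total

# Cancer example: 50 cases among 100,000 patients.
p_cancer = empirical_probability(50, 100_000)
print(p_cancer)  # 0.0005

# Relative frequencies for the trial counts listed above.
for n, f in [(10, 3), (100, 61), (1000, 496)]:
    print(n, f, empirical_probability(f, n))
```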
Types of probability cont…
Subjective Probability: It is usually set from intuition,
educated guesses, or estimates.
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
[Figure: Venn diagram of two overlapping events A and B, with the overlap labelled A ∩ B]
Mutually Exclusive Events
[Figure: Venn diagram of two non-overlapping events A and B inside the sample space S]
Independent Events
Two events are independent if the occurrence of one of the
events does not affect the probability of the other event.
Example:
Let event A stand for “the sex of the first child from a
mother is female”, and let event B stand for “the sex of the
second child from the same mother is female”.
Are A and B independent?
Solution
P(B | A) = P(B) = 0.5
The occurrence of A does not affect the probability
of B, so the events are independent.
Multiplicative rule of probability
Bayes Theorem
$$P(B \mid A) = \frac{P(B)\,P(A \mid B)}{P(B)\,P(A \mid B) + P(B')\,P(A \mid B')}$$
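As a rough illustration of Bayes' theorem, the sketch below computes P(B | A) from P(B), P(A | B) and P(A | B'). The function name and the screening-test numbers are hypothetical and do not come from the text.

```python
def bayes_posterior(prior_b, p_a_given_b, p_a_given_not_b):
    """P(B | A) from P(B), P(A | B) and P(A | B')."""
    numerator = prior_b * p_a_given_b
    denominator = numerator + (1 - prior_b) * p_a_given_not_b
    return numerator / denominator

# Hypothetical screening test (illustrative values only):
# prevalence P(B) = 0.01, sensitivity P(A | B) = 0.90, false-positive rate P(A | B') = 0.05.
posterior = bayes_posterior(0.01, 0.90, 0.05)
print(round(posterior, 4))  # 0.1538
```

With these assumed inputs, a positive result raises the probability of disease from 1% to roughly 15%, because most positives still come from the much larger disease-free group.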
Application of conditional probability
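Frequency of cocaine use    Male    Female    Total
--------------------------------------------------------------------------------------------
1-19 times                    32         7       39
20-99 times                   18        20       38
more than 100 times           25         9       34
--------------------------------------------------------------------------------------------
Total                         75        36      111
--------------------------------------------------------------------------------------------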
Questions
d. Given that the person has used cocaine less than 100 times,
what is the probability of being female?
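Question d can be worked directly from the counts. The sketch below assumes that, in each row of the table, the first count is males and the second is females; the column labels are not legible in the table as given, so treat that assignment as an assumption.

```python
# Counts by frequency of cocaine use.
# ASSUMPTION: the first count in each row is males, the second is females.
counts = {
    "1-19 times": {"male": 32, "female": 7},
    "20-99 times": {"male": 18, "female": 20},
    "more than 100 times": {"male": 25, "female": 9},
}

# Conditioning event: fewer than 100 uses, i.e. the first two rows.
under_100 = ["1-19 times", "20-99 times"]
females_under_100 = sum(counts[row]["female"] for row in under_100)
total_under_100 = sum(sum(counts[row].values()) for row in under_100)

# P(female | fewer than 100 uses) = count in both / count in the conditioning event.
p_female_given_under_100 = females_under_100 / total_under_100
print(females_under_100, total_under_100, round(p_female_given_under_100, 4))  # 27 77 0.3506
```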
Permutations cont…
Example: Five different new drugs are given simultaneously,
one to each of five patients. The drugs are compared by the
length of time taken to cure the patients. (Assume that the
five patients are the same in all other characteristics, such
as disease type, severity status, sex, and age.)
a) How many drugs are possible for 1st place (the
fastest to cure)?
b) How many possible arrangements are there for the first
three drugs?
c) How many possible arrangements of all the drugs
are there?
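The three counts in this example are permutations of 5 objects taken 1, 3, and 5 at a time. A minimal sketch using Python's standard `math` module (variable names are illustrative):

```python
from math import factorial, perm

n_drugs = 5

first_place = perm(n_drugs, 1)   # a) candidates for 1st place
first_three = perm(n_drugs, 3)   # b) ordered arrangements of the first three places: 5 * 4 * 3
all_five = factorial(n_drugs)    # c) ordered arrangements of all five drugs: 5!

print(first_place, first_three, all_five)  # 5 60 120
```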
Counting Rules cont…
Combinations
When the order in which the events occurred is of no interest,
we are dealing with combinations. The number of possible
combinations of n objects taken r at a time is
$$\binom{n}{r} = \frac{n!}{r!\,(n-r)!}$$
Combinations cont…
a. With replacement?
b. Without replacement?
Sampling with and without … (answer)
If sampling is with replacement:
Number of ways = 10 x 10 = 100
Value on face   1     2     3     4     5     6
Probability     1/6   1/6   1/6   1/6   1/6   1/6
In any probability distribution, each probability must be between 0
and 1, and the probabilities must sum to 1.
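Both conditions are easy to check for the fair-die distribution. A small sketch, using exact fractions to avoid rounding error (the helper name `is_valid_distribution` is an assumption for illustration):

```python
from fractions import Fraction

# Fair die: each face 1..6 has probability 1/6.
die = {face: Fraction(1, 6) for face in range(1, 7)}

def is_valid_distribution(dist):
    """True if every probability lies in [0, 1] and the probabilities sum to 1."""
    probs = list(dist.values())
    return all(0 <= p <= 1 for p in probs) and sum(probs) == 1

print(is_valid_distribution(die))  # True
```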
Probability distribution of categorical
cont…
Bernoulli Distribution
When a random process or experiment, called a trial, can
result in only one of two mutually exclusive outcomes, such
as
male or female,
dead or alive,
sick or well,
full-term or premature,
the single trial is called a Bernoulli trial.
Bernoulli Distribution cont…
In a Bernoulli trial, the outcome of an
experiment can either be success (i.e., 1) or
failure (i.e., 0).
If you have only two possible outcomes (call them 1/0, yes/no,
or success/failure) in n independent trials, then the
probability of exactly X “successes” is:
$$P(X) = \binom{n}{X}\, p^{X} (1-p)^{n-X}$$
where
n = number of trials
X = number of successes out of n trials
p = probability of success
1 − p = probability of failure
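The probability of exactly X successes in n independent trials can be evaluated directly. A minimal sketch; the function name and the example (3 female births out of 5, with p = 0.5) are illustrative assumptions, not figures from the text.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials with success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# Hypothetical example: exactly 3 female births among 5, assuming p = 0.5.
print(binomial_pmf(3, 5, 0.5))  # 0.3125

# The probabilities over x = 0..n should sum to 1.
print(sum(binomial_pmf(x, 5, 0.5) for x in range(6)))
```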
Binomial distribution….
Probability distribution
The Poisson distribution
When the probability of “success” is very small, e.g., the
probability of a mutation, then $p^{X}$ and $(1-p)^{n-X}$ become too
small to calculate exactly by the binomial distribution.
In such cases, the Poisson distribution becomes useful.
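To see how closely the Poisson distribution tracks the binomial when p is small, the sketch below compares the two. It assumes the standard Poisson formula P(X = x) = e^(−λ) λ^x / x! with λ = np; the values n = 10,000 and p = 0.00024 are hypothetical, chosen only so that λ = 2.4.

```python
from math import comb, exp, factorial

def binomial_pmf(x, n, p):
    """Exact binomial probability of x successes in n trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """Poisson probability of x events when the expected count is lam."""
    return exp(-lam) * lam**x / factorial(x)

# Hypothetical rare-event setting: many trials, tiny success probability.
n, p = 10_000, 0.00024
lam = n * p  # expected number of successes, about 2.4

for x in range(5):
    print(x, round(binomial_pmf(x, n, p), 5), round(poisson_pmf(x, lam), 5))
```

The two columns should agree closely, which is why the Poisson form is convenient when n is large and p is small.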
The Poisson distribution…
i.e. λ = 2.4
[Figure: density curves f(x) plotted against x for normal, uniform, and skewed distributions]
There is an infinite number of continuous random variables.
Properties of Normal Distributions
The most important probability distribution in statistics is
the normal distribution.
Normal curve
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^{2}}$$
where σ² = population variance, µ = population mean,
e = 2.718…, π = 3.14…
This is a bell-shaped curve with different centers and spreads
depending on µ and σ.
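The normal density can be evaluated directly from its formula. A minimal sketch, cross-checked against `statistics.NormalDist` from the Python standard library; the mean of 10 and standard deviation of 5 are illustrative values only.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return exp(-0.5 * z * z) / (sigma * sqrt(2 * pi))

# Illustrative parameters (assumed for this sketch).
mu, sigma = 10, 5
for x in (0, 5, 10, 15, 20):
    # Hand-written formula next to the library value; the two should match.
    print(x, round(normal_pdf(x, mu, sigma), 5), round(NormalDist(mu, sigma).pdf(x), 5))
```

Changing `mu` shifts the center of the curve, and changing `sigma` changes its spread.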
Properties of Normal Distributions
Properties of a Normal Distribution
1. The mean, median, and mode are equal.
2. The normal curve is bell-shaped and symmetric about
the mean.
3. The total area under the curve is equal to one.
4. The normal curve approaches, but never touches, the
x-axis as it extends farther and farther away from the
mean.
5. Between µ − σ and µ + σ (in the center of the curve), the
graph curves downward. The graph curves upward to the
left of µ − σ and to the right of µ + σ. The points at which
the curve changes from curving upward to curving
downward are called the inflection points.
Properties of Normal Distributions
[Figure: normal curve with the x-axis marked at µ − 3σ, µ − 2σ, µ − σ, µ, µ + σ, µ + 2σ, µ + 3σ]
The Family of Normal Distribution
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-(x-\mu)^{2}/2\sigma^{2}}$$
b. By preparing tables containing areas for each curve
[Figure: standard normal curve with the z-axis marked from −3 to 3]
The Standard Normal Distribution
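[Figure: standard normal curve with the z-axis marked from −3 to 3, labelled at z = −3.49, z = 0, and z = 3.49]
At z = 0, the cumulative area is 0.5000.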
The Standard Normal Table
Example:
Find the cumulative area that corresponds to a z-score
of 2.71.
Standard Normal Table
z .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
0.0 .5000 .5040 .5080 .5120 .5160 .5199 .5239 .5279 .5319 .5359
0.1 .5398 .5438 .5478 .5517 .5557 .5596 .5636 .5675 .5714 .5753
0.2 .5793 .5832 .5871 .5910 .5948 .5987 .6026 .6064 .6103 .6141
2.6 .9953 .9955 .9956 .9957 .9959 .9960 .9961 .9962 .9963 .9964
2.7 .9965 .9966 .9967 .9968 .9969 .9970 .9971 .9972 .9973 .9974
2.8 .9974 .9975 .9976 .9977 .9977 .9978 .9979 .9979 .9980 .9981
Find the area by finding 2.7 in the left hand column, and
then moving across the row to the column under 0.01.
The area to the left of z = 2.71 is 0.9966.
The Standard Normal Table
Example:
Find the cumulative area that corresponds to a z-score
of −0.25.
Standard Normal Table
z     .09   .08   .07   .06   .05   .04   .03   .02   .01   .00
−3.4  .0002 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003 .0003
−3.3  .0003 .0004 .0004 .0004 .0004 .0004 .0004 .0005 .0005 .0005
−0.3  .3483 .3520 .3557 .3594 .3632 .3669 .3707 .3745 .3783 .3821
−0.2  .3859 .3897 .3936 .3974 .4013 .4052 .4090 .4129 .4168 .4207
−0.1  .4247 .4286 .4325 .4364 .4404 .4443 .4483 .4522 .4562 .4602
−0.0  .4641 .4681 .4721 .4761 .4801 .4840 .4880 .4920 .4960 .5000
Find the area by finding −0.2 in the left-hand column, and then
moving across the row to the column under 0.05.
The area to the left of z = −0.25 is 0.4013.
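The two table lookups, z = 2.71 and z = −0.25 (the second area, 0.4013, is below 0.5, so that z-score is negative), can be checked against a computed cumulative distribution function. A sketch using `statistics.NormalDist` from the Python standard library:

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# cdf(z) is the cumulative area to the left of z, the quantity the table lists.
area_271 = standard_normal.cdf(2.71)
area_neg_025 = standard_normal.cdf(-0.25)

print(round(area_271, 4))      # 0.9966
print(round(area_neg_025, 4))  # 0.4013
```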
Guidelines for Finding Areas
Finding Areas Under the Standard Normal Curve
1. Sketch the standard normal curve and shade the
appropriate area under the curve.
2. Find the area by following the directions for each case
shown.
a. To find the area to the left of z, find the area that
corresponds to z in the Standard Normal Table.
[Figure: standard normal curve shaded to the left of z = 1.23. Step 1: Use the table to find the area for the z-score.]
b. To find the area to the right of z, use the Standard
Normal Table to find the area that corresponds to z.
Then subtract the area from 1.
[Figure: standard normal curve shaded to the right of z = 1.23. Step 1: Use the table to find the area for the z-score.]
c. To find the area between two z-scores, find the area
corresponding to each z-score in the Standard
Normal Table. Then subtract the smaller area from
the larger area.
2. The area to the left of z = 1.23 is 0.8907.
3. The area to the left of z = −0.75 is 0.2266.
4. Subtract to find the area of the region between the two
z-scores: 0.8907 − 0.2266 = 0.6641.
[Figure: standard normal curve shaded between z = −0.75 and z = 1.23]
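The three cases (area to the left, to the right, and between two z-scores) can be sketched with the standard library's `NormalDist`. The z-scores 1.23 and −0.75 are the ones used in the guidelines; the variable names are illustrative.

```python
from statistics import NormalDist

z_dist = NormalDist()  # standard normal: mean 0, standard deviation 1

# a. Area to the left of z = 1.23: read the cumulative area directly.
left_of_123 = z_dist.cdf(1.23)

# b. Area to the right of z = 1.23: subtract the cumulative area from 1.
right_of_123 = 1 - z_dist.cdf(1.23)

# c. Area between z = -0.75 and z = 1.23: larger cumulative area minus the smaller.
between = z_dist.cdf(1.23) - z_dist.cdf(-0.75)

print(round(left_of_123, 4), round(right_of_123, 4), round(between, 4))
```

Because the table rounds each area to four decimals before subtracting, the table-based answer for the region between the two z-scores (0.6641) can differ from the computed value in the last decimal place.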
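[Figure: a normal curve for x with µ = 10, shaded to the left of x = 15, beside the standard normal curve for z with µ = 0, shaded to the left of z = 1. The two shaded regions have the same area.]
P(x < 15) = P(z < 1) = shaded area under the curve = 0.8413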
Probability and Normal Distributions
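Example: The average weight of pregnant women attending
prenatal care at a clinic was 78 kg, with a standard deviation of
8 kg. If the weights are normally distributed, find the probability
that a randomly selected pregnant woman weighs more than 85 kg.
µ = 78, σ = 8
$$z = \frac{x-\mu}{\sigma} = \frac{85-78}{8} = 0.875 \approx 0.88$$
[Figure: normal curve for x with µ = 78, shaded to the right of x = 85, and the matching standard normal curve with µ = 0, shaded to the right of z = 0.88]
P(x > 85) = P(z > 0.88) = 1 − P(z < 0.88) = 1 − 0.8106 = 0.1894
The probability that a randomly selected pregnant woman weighs
more than 85 kg is 0.1894.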
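The same calculation can be sketched with `statistics.NormalDist`, once with the z-score rounded to two decimals as a printed table requires, and once without rounding. The variable names are illustrative.

```python
from statistics import NormalDist

mu, sigma = 78, 8   # mean and standard deviation of the weights (kg)
x = 85

# Standardize: z = (x - mu) / sigma.
z = (x - mu) / sigma
print(z)  # 0.875

# Table-style answer: z rounded to 0.88 before looking up the area.
p_table = 1 - NormalDist().cdf(0.88)
print(round(p_table, 4))  # 0.1894

# Unrounded answer, working directly on the weight scale.
p_exact = 1 - NormalDist(mu, sigma).cdf(x)
print(round(p_exact, 4))
```

The small gap between the two results comes only from rounding z to two decimals for the table lookup.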
Probability and Normal Distributions
Example:
From the above example, find the probability that a randomly
selected pregnant woman weighs between 60 and 80 kg.
$$z_1 = \frac{x-\mu}{\sigma} = \frac{60-78}{8} = -2.25$$
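The second z-score and the final subtraction follow the same pattern. A sketch of the remaining steps, assuming the same µ = 78 and σ = 8 and the usual standard normal table areas (0.5987 for z = 0.25 and 0.0122 for z = −2.25; the second value does not appear in the table rows reproduced here):

$$
\begin{aligned}
z_2 &= \frac{80-78}{8} = 0.25 \\
P(60 < x < 80) &= P(-2.25 < z < 0.25) = P(z < 0.25) - P(z < -2.25) \\
&= 0.5987 - 0.0122 = 0.5865
\end{aligned}
$$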
Example:
Find the z-score that corresponds to P75 (the 75th percentile).
[Figure: standard normal curve with an area of 0.75 shaded to the left of z]
z ≈ 0.67
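Finding a percentile reverses the table lookup: start from the area and recover z. A minimal sketch using the inverse cumulative distribution function of `statistics.NormalDist`:

```python
from statistics import NormalDist

# inv_cdf(area) returns the z-score with that cumulative area to its left.
z_75 = NormalDist().inv_cdf(0.75)
print(round(z_75, 2))  # 0.67

# Round trip: the area to the left of that z-score is 0.75 again.
print(round(NormalDist().cdf(z_75), 4))  # 0.75
```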