Chapter 4 Introduction To Probability
Wullo S. (MPH)
7/9/21
Learning outcomes
After studying this chapter, the student will be able to:
4.1 Define basic terms in probability
4.2 Describe set theory and probability
4.3 Identify types of probability
4.4 Identify types of random variable and probability distribution
4.5 List common probability distributions and their
properties
PROBABILITY CONCEPTS
Introduction
Probability lays the foundation for statistical inference
This chapter provides a brief overview of the probability
concepts necessary for understanding topics covered in
the chapters that follow
It also provides a context for understanding the probability
distributions used in statistical inference
Basic Terms of Probability
• Probability can be defined as the chance of an event
occurring.
• Probability experiment: a process that leads to well-defined
results, or an action through which specific
results/outcomes (counts, measurements or responses)
are obtained, but whose result cannot be predicted with certainty in advance.
Example:
• Tossing a coin and observing the face showing up is a
probability experiment.
• Outcome: the result of a single trial in a probability
experiment. It is also called a simple event.
Example: the sex of a newborn delivered in a
delivery room is either male or female.
Exercise
Find the sample space for the gender of the children
if a family has three children. Use B for boy and G
for girl
And also find:
a. The probability of obtaining at least two girls in a
family?
b. The probability of getting at most two boys in a family?
c. The probability of getting one boy and two girls in a
family?
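The sample space for this exercise can be enumerated mechanically. The sketch below is only an illustration: it assumes that boys and girls are equally likely and that births are independent, so that every sequence of three children has the same probability. The names `sample_space` and `probability` are chosen for this example.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered sequence of three births (B = boy, G = girl).
sample_space = ["".join(children) for children in product("BG", repeat=3)]

def probability(event):
    """Classical probability: favourable outcomes / total outcomes.

    Assumes all outcomes in sample_space are equally likely.
    """
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))

p_at_least_two_girls = probability(lambda o: o.count("G") >= 2)
p_at_most_two_boys = probability(lambda o: o.count("B") <= 2)
p_one_boy_two_girls = probability(lambda o: o.count("B") == 1)

print(sample_space)
print(p_at_least_two_girls, p_at_most_two_boys, p_one_boy_two_girls)
```

Writing each event as a condition on the outcome string keeps the code close to the wording of the question ("at least two girls", "at most two boys").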
Types of probability
1. Classical (or theoretical) probability
It is used when each outcome in a sample space is
equally likely to occur.
That is, if an experiment has n equally likely outcomes,
then each possible outcome has probability 1/n.
Equivalently, the probability of an event E is:
P(E) = n(E) / n(S) = (number of outcomes in E) / (total number of outcomes in the sample space S)
Types of probability cont…
2. Empirical (or statistical) probability: is based on
observations obtained from experiments, a large
number of trials, or historical data.
Example:
• A medical doctor found that out of 100,000 patients
who visited the hospital, 50 were cancer cases. What
is the probability that a patient to be examined will be
positive for cancer?
P(+ve for cancer) = 50/100,000 = 0.0005
Example 2
In a sample of 50 people, 21 had type O blood, 22 had type A
blood, 5 had type B blood, and 2 had type AB blood. Set up
a frequency distribution and find the following probabilities
a. A person has type O blood
b. A person has type A or type B blood
c. A person has neither type A nor type O blood
d. A person does not have type AB blood
Solution
Blood type Frequency
A 22
B 5
AB 2
O 21
Total 50
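The four probabilities asked for in Example 2 follow directly from the frequency table. The sketch below is one possible way to compute them as empirical probabilities (frequency divided by the sample size); the helper name `p` is made up for this illustration.

```python
from fractions import Fraction

# Frequency distribution from the solution table.
frequency = {"A": 22, "B": 5, "AB": 2, "O": 21}
total = sum(frequency.values())  # 50 people in the sample

def p(*blood_types):
    """Empirical probability that a person has one of the given blood types."""
    return Fraction(sum(frequency[t] for t in blood_types), total)

p_o = p("O")                  # a. type O
p_a_or_b = p("A", "B")        # b. type A or type B (mutually exclusive types)
p_not_a_not_o = p("B", "AB")  # c. neither A nor O, i.e. B or AB
p_not_ab = 1 - p("AB")        # d. complement of type AB

for label, value in [("a", p_o), ("b", p_a_or_b), ("c", p_not_a_not_o), ("d", p_not_ab)]:
    print(label, value, float(value))
```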
• Union of events: The union of two events A and B, denoted
by (A ∪ B), consists of all outcomes that are in A or in B or
in both A and B.
If A and B are two events, then
P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
If A and B are mutually exclusive (they cannot occur together, so P(A ∩ B) = 0), then
P(A ∪ B) = P(A) + P(B)
Example
In a hospital unit there are 8 nurses and 5 physicians; 7 nurses
and 3 physicians are females. If a staff person is selected,
find the probability that the subject is:
a. nurse or a male
b. physician or female
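• Solution
Staff        Male   Female   Total
Physician    2      3        5
Nurse        1      7        8
Total        3      10       13
a. The probability that the subject is a nurse or a male is
P(N ∪ M) = P(N) + P(M) − P(N ∩ M)
= 8/13 + 3/13 − 1/13
= 10/13
≈ 0.769
b. The probability that the subject is a physician or a female is
P(Ph ∪ F) = P(Ph) + P(F) − P(Ph ∩ F)
= 5/13 + 10/13 − 3/13
= 12/13
≈ 0.923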
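The addition rule used in this example can be checked against a direct count of the staff table. This is a minimal sketch; the dictionary `counts` and the helper `p` are names chosen for the illustration.

```python
from fractions import Fraction

# Staff counts from the table: (role, sex) -> number of staff.
counts = {
    ("physician", "male"): 2,
    ("physician", "female"): 3,
    ("nurse", "male"): 1,
    ("nurse", "female"): 7,
}
total = sum(counts.values())  # 13 staff members

def p(predicate):
    """Probability that a randomly selected staff member satisfies predicate."""
    matching = sum(n for (role, sex), n in counts.items() if predicate(role, sex))
    return Fraction(matching, total)

# a. Nurse or male, by the addition rule P(N) + P(M) - P(N and M).
p_nurse_or_male = (
    p(lambda role, sex: role == "nurse")
    + p(lambda role, sex: sex == "male")
    - p(lambda role, sex: role == "nurse" and sex == "male")
)

# b. Physician or female, by the same rule.
p_physician_or_female = (
    p(lambda role, sex: role == "physician")
    + p(lambda role, sex: sex == "female")
    - p(lambda role, sex: role == "physician" and sex == "female")
)

# Cross-check part a by counting the union directly.
p_direct = p(lambda role, sex: role == "nurse" or sex == "male")

print(p_nurse_or_male, p_physician_or_female, p_direct)
```

Subtracting the intersection avoids counting the one male nurse (or the three female physicians) twice.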
3. Subjective probability
• In this view, probability is treated as a quantifiable level of belief
ranging from 0 (complete disbelief) to 1 (complete belief)
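Conditional probability
The conditional probability of an event A, given that an event B has already occurred, is found by dividing the probability that both events
occurred by the probability that the first event (B) has occurred. The formula is
P(A|B) = P(A ∩ B)/P(B)
Where: P(B) ≠ 0
Special case: When events A and B are independent, then:
P(A|B) = P(A)
P(A ∩ B) = P(A)P(B)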
Example
• In a certain high school class, consisting of 60 girls and 40
boys, it is observed that 24 girls and 16 boys wear
eyeglasses. If a student is picked at random from this class,
the probability that the student wears eyeglasses, P(E), is
40/100, or 0.4.
a. What is the probability that a student picked at random wears
eyeglasses, given that the student is a boy?
Solution: Let B denote the event that the student is a boy. Then
P(E|B) = P(E ∩ B)/P(B) = (16/100)/(40/100) = 16/40 = 0.4
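The conditional probability in the eyeglasses example can be worked out step by step as follows. This is a sketch using the counts stated in the example; the variable names are chosen for the illustration.

```python
from fractions import Fraction

# Class composition from the example.
girls, boys = 60, 40
girls_with_glasses, boys_with_glasses = 24, 16
total = girls + boys

p_boy = Fraction(boys, total)                            # P(B)
p_glasses_and_boy = Fraction(boys_with_glasses, total)   # P(E and B)
p_glasses = Fraction(girls_with_glasses + boys_with_glasses, total)  # P(E)

# Conditional probability: P(E | B) = P(E and B) / P(B).
p_glasses_given_boy = p_glasses_and_boy / p_boy

# If P(E | B) equals P(E), the two events are independent.
independent = p_glasses_given_boy == p_glasses

print(p_glasses_given_boy, p_glasses, independent)
```

Here P(E|B) equals P(E), so in this class wearing eyeglasses is independent of being a boy, which matches the special case stated with the formula.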
Factorial
For any positive integer n, n factorial, denoted as
n!, is defined as:
n! = n × (n−1) × (n−2) × … × 3 × 2 × 1
By convention, 0! = 1.
e.g. 3! = 3 × 2 × 1 = 6
5! = 5 × 4 × 3 × 2 × 1 = 120
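Permutation rules
• Permutation: an arrangement of objects in a specific order. The number of possible
permutations is the number of different orders in
which particular events can occur. When r objects are taken from n distinct objects, the number of
possible permutations is
P(n, r) = n! / (n − r)!
Example: 6P2 = 6! / (6 − 2)! = 6! / 4! = 6 × 5 = 30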
Combination
The number of ways r objects can be chosen from a set of n
objects without considering the order of selection is
called the number of combinations of n objects taken r
at a time, denoted by
C(n, r) = n! / (r! (n − r)!)
C(8,6) = 8! / (6! 2!) = 28
C(8,0) = 8! / (0! 8!) = 1
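The counting rules above can be checked with Python's standard `math` module (`math.perm` and `math.comb` are available from Python 3.8). The sketch also recomputes one value from the factorial definition to show that the two agree.

```python
import math

# Factorials from the examples.
three_factorial = math.factorial(3)   # 3 x 2 x 1
five_factorial = math.factorial(5)    # 5 x 4 x 3 x 2 x 1

# Permutations: ordered selections of r objects from n, n! / (n - r)!.
perm_6_2 = math.perm(6, 2)

# Combinations: unordered selections, n! / (r! (n - r)!).
comb_8_6 = math.comb(8, 6)
comb_8_0 = math.comb(8, 0)

# The same combination computed directly from the factorial definition.
comb_8_6_by_formula = math.factorial(8) // (math.factorial(6) * math.factorial(2))

print(three_factorial, five_factorial, perm_6_2, comb_8_6, comb_8_0)
```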
Probability Distribution
• Discrete random variables: have a finite number of possible
values or an infinite number of values that can be counted
– The word counted means that they can be enumerated
using the numbers 1, 2, 3, etc.
• Continuous random variables: can assume all values in the interval
between any two given values
– They can assume an infinite
number of values, including decimal and fractional
values
Examples of discrete random variables:
• Toss a coin n times and count the number of heads.
• Number of car accidents per week.
• Number of defective items in a given company.
• Number of bacteria per two cubic centimeters of water.
Examples of continuous random variables:
• Height of students at a certain college.
• Mark of a student.
• Duration of a certain disease.
• Length of time required to complete a given training.
The probability distribution of a discrete random variable is a
table, graph, formula, or other device used to specify all
possible values of a random variable along with their
respective probabilities
Example:
Consider the experiment of tossing a coin three times. Let X be
the number of heads. Construct the probability distribution
of X
X 0 1 2 3
P(x) 1/8 3/8 3/8 1/8
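The table for the number of heads in three tosses can be built by listing all outcomes and counting heads in each. The sketch below assumes a fair coin, so that all eight outcomes are equally likely; the names `outcomes` and `distribution` are chosen for this example.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All equally likely outcomes of three tosses, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

# X = number of heads in each outcome; count how often each value occurs.
heads_counts = Counter(outcome.count("H") for outcome in outcomes)

# Probability distribution of X: value -> probability.
distribution = {
    x: Fraction(count, len(outcomes)) for x, count in sorted(heads_counts.items())
}

for x, prob in distribution.items():
    print(x, prob)
```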
Example 2:
Construct a probability distribution for rolling a single die.
Solution
Since the sample space is 1, 2, 3, 4, 5, 6 and each outcome
has a probability of 1/6, the distribution is:
X     1    2    3    4    5    6
p(x)  1/6  1/6  1/6  1/6  1/6  1/6
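Two requirements for a probability distribution
• The sum of the probabilities of all events in the sample
space must be equal to 1; i.e. ΣP(x) = 1
• The probability of each event in the sample space must be
between 0 and 1, inclusive; i.e. 0 ≤ P(x) ≤ 1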
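Properties of continuous probability distribution
1. The density is never negative: f(x) ≥ 0 for all x
2. The total area under the density curve is 1: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
3. The probability that X lies between a and b is the area under the curve between them: $P(a \le X \le b) = \int_a^b f(x)\,dx$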
Introduction to expectation
Definition: the expected value (also known as the
mean) of a random variable is a measure of the
central location of the random variable.
1. Discrete R.V
$$E(X) = X_1 P(X_1) + X_2 P(X_2) + \dots + X_n P(X_n) = \sum_{i=1}^{n} X_i P(X_i)$$
2. Continuous R.V
$$E(X) = \int_a^b x f(x)\,dx$$
Variance of a probability distribution
• The expected value of X is its mean
Mean of X = E(X)
• The variance of X is given by:
$$\mathrm{Var}(X) = E(X^2) - (E(X))^2$$
where
$$E(X^2) = \begin{cases} \sum_{i=1}^{n} X_i^2 P(X_i), & \text{if } X \text{ is discrete} \\ \int_{x} x^2 f(x)\,dx, & \text{if } X \text{ is continuous} \end{cases}$$
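As an illustration of the discrete formulas, the sketch below computes the mean and variance of X, the number of heads in three tosses of a fair coin, using the distribution P(0) = 1/8, P(1) = 3/8, P(2) = 3/8, P(3) = 1/8 from the earlier example. The function names are chosen for this example.

```python
from fractions import Fraction

# Distribution of X = number of heads in three fair coin tosses.
distribution = {
    0: Fraction(1, 8),
    1: Fraction(3, 8),
    2: Fraction(3, 8),
    3: Fraction(1, 8),
}

def expected_value(dist):
    """E(X) = sum of x * P(x) over all values x."""
    return sum(x * p for x, p in dist.items())

def variance(dist):
    """Var(X) = E(X^2) - (E(X))^2."""
    second_moment = sum(x**2 * p for x, p in dist.items())
    return second_moment - expected_value(dist) ** 2

mean_x = expected_value(distribution)
var_x = variance(distribution)
print(mean_x, var_x)
```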
Example
Let X be a continuous R.V with distribution
$$f(x) = \begin{cases} \frac{1}{2}x, & 0 \le x \le 2 \\ 0, & \text{otherwise} \end{cases}$$
Then find
a) P (1<x<1.5)
b) E(x)
c) Var(x)
d) E(3x² − 2x)
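Parts a) to c) of this example can be worked out exactly, assuming the density is f(x) = x/2 on the interval from 0 to 2 and 0 elsewhere. The sketch below uses the antiderivative of x^k · (x/2), which is x^(k+2) / (2(k+2)); the helper name `moment_integral` is made up for this illustration.

```python
from fractions import Fraction

def moment_integral(k, a, b):
    """Integral of x**k * f(x) from a to b for f(x) = x/2.

    Uses the antiderivative x**(k + 2) / (2 * (k + 2)).
    Only valid when both limits lie inside [0, 2], where f(x) = x/2.
    """
    a, b = Fraction(a), Fraction(b)
    return (b ** (k + 2) - a ** (k + 2)) / (2 * (k + 2))

# Sanity check: a density must integrate to 1 over its support.
total_area = moment_integral(0, 0, 2)

# a) P(1 < X < 1.5) is the area under f between 1 and 1.5.
p_between = moment_integral(0, 1, Fraction(3, 2))

# b) E(X) is the integral of x * f(x) over [0, 2].
mean_x = moment_integral(1, 0, 2)

# c) Var(X) = E(X^2) - (E(X))^2.
second_moment = moment_integral(2, 0, 2)
var_x = second_moment - mean_x**2

print(total_area, p_between, mean_x, var_x)
```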
DISCRETE PROBABILITY DISTRIBUTION
1. Binomial Distribution
Binomial distribution Cont..
Definition: The outcomes of a binomial experiment and the
corresponding probabilities of these outcomes are called a
binomial distribution.
If the probability of success on an individual trial is p,
then the binomial probability is defined by:
$$P(X = x) = \binom{n}{x} p^{x} (1-p)^{n-x}, \quad x = 0, 1, \dots, n$$
Where:
– x = the number of successes
– p = the probability of success
– n = the number of trials
– 1−p = the probability of failure
• When using the binomial formula to solve problems, we have to
identify three things:
The number of trials (n)
The probability of a success on any one trial (p) and
The number of successes of interest (x)
Example: Suppose that an examination consists of six true-or-false
questions, and assume that a student has no
knowledge of the subject matter. The probability that the
student will guess the correct answer to the first question is
30%. Likewise, the probability of guessing each of the
remaining questions correctly is also 30%.
a) What is the probability of getting exactly three correct
answers?
b) What is the probability of getting exactly two correct
answers?
c) What is the probability of getting at most two correct
answers?
d) What is the probability of getting fewer than five correct
answers?
e) Find the expected value and the standard deviation.
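• Solution: Here n = 6 and p = 0.3.
a. P(X = 3) = C(6,3)(0.3)^3(0.7)^3 ≈ 0.1852
b. P(X = 2) = C(6,2)(0.3)^2(0.7)^4 ≈ 0.3241
c. P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) ≈ 0.7443
d. P(X < 5) = 1 − P(X = 5) − P(X = 6) ≈ 0.9891
e. E(X) = np = 6(0.3) = 1.8; standard deviation = √(np(1−p)) = √1.26 ≈ 1.12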
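The answers to the examination example (n = 6 trials, p = 0.3) can be checked numerically. The following is a minimal sketch; `binomial_pmf` is a helper written for this illustration, not a library function.

```python
from math import comb, sqrt

def binomial_pmf(x, n, p):
    """Binomial probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 6, 0.3

p_exactly_3 = binomial_pmf(3, n, p)                           # part a
p_exactly_2 = binomial_pmf(2, n, p)                           # part b
p_at_most_2 = sum(binomial_pmf(x, n, p) for x in range(3))    # part c: x = 0, 1, 2
p_less_than_5 = sum(binomial_pmf(x, n, p) for x in range(5))  # part d: x = 0..4

mean = n * p                  # part e: expected value np
sd = sqrt(n * p * (1 - p))    # part e: standard deviation

print(round(p_exactly_3, 4), round(p_exactly_2, 4))
print(round(p_at_most_2, 4), round(p_less_than_5, 4))
print(round(mean, 2), round(sd, 4))
```

"At most two" and "fewer than five" are cumulative probabilities, so they are obtained by summing the formula over the relevant values of x.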
Exercise
1. Suppose 14 percent of mothers admitted to smoking one or
more cigarettes per day during pregnancy. If a random
sample of size 10 is selected from this population, what is
the probability that it will contain exactly four mothers who
admitted to smoking during pregnancy?
2. Suppose that 80% of adults with allergies report
symptomatic relief with a specific medication. If the
medication is given to 10 new patients with allergies, what is
the probability that it is effective in exactly seven? Assume
that the replications are independent.
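2. Poisson distribution
The probability distribution of a Poisson random variable X
representing the number of successes occurring in a given time
interval or a specified region of space is given by the formula:
$$P(X = k) = \frac{e^{-\lambda} \lambda^{k}}{k!}, \quad k = 0, 1, 2, \dots$$
Where
• k = the number of successes in the interval
• λ = the average number of successes per unit time (or region)
• e is approximately 2.71828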
Example:
In a study of drug-induced anaphylaxis among patients taking
rocuronium bromide as part of their anesthesia, the
occurrence of anaphylaxis followed a Poisson distribution
with λ =12 incidents per year in Norway. Find the probability
that in the next year, among patients receiving rocuronium,
a. Exactly three will experience anaphylaxis
b. At least two will experience anaphylaxis
c. At most two will experience anaphylaxis
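• Solution: Here λ = 12.
a. P(X = 3) = e^(−12) 12^3 / 3! ≈ 0.0018
b. P(X ≥ 2) = 1 − P(X = 0) − P(X = 1) = 1 − 13e^(−12) ≈ 0.9999
c. P(X ≤ 2) = P(X = 0) + P(X = 1) + P(X = 2) = 85e^(−12) ≈ 0.0005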
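The anaphylaxis example (λ = 12 incidents per year) can be evaluated numerically as follows. This is a sketch; `poisson_pmf` is a helper written for this illustration.

```python
from math import exp, factorial

def poisson_pmf(k, lam):
    """Poisson probability of exactly k events when the mean count is lam."""
    return exp(-lam) * lam**k / factorial(k)

lam = 12  # average number of incidents per year, from the example

p_exactly_3 = poisson_pmf(3, lam)                                    # part a
p_at_least_2 = 1 - poisson_pmf(0, lam) - poisson_pmf(1, lam)         # part b
p_at_most_2 = sum(poisson_pmf(k, lam) for k in range(3))             # part c: k = 0, 1, 2

print(round(p_exactly_3, 5), round(p_at_least_2, 5), round(p_at_most_2, 5))
```

For "at least two", the complement rule is used because the Poisson variable has no upper limit, so summing upward from k = 2 would never end.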
Exercise
In a certain population an average of 13 new cases of
esophageal cancer are diagnosed each year. If the annual
incidence of esophageal cancer follows a Poisson
distribution, find the probability that in a given year the
number of newly diagnosed cases of esophageal cancer will
be:
A. Exactly 10 cases
B. At least three cases
C. No more than 3
D. Between nine and 12, inclusive
E. Fewer than two
CONTINUOUS PROBABILITY DISTRIBUTIONS
Normal distribution
A random variable X has a normal distribution if its probability density function is
$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}, \quad -\infty < x < \infty$$
where
• X is a normal random variable,
• μ is the mean
• σ is the standard deviation
• π is approximately 3.14159, and e is approximately 2.71828.
Characteristics of Normal Distribution
• It links frequency distribution to probability distribution
• It has a bell-shaped curve
• It is symmetric around the mean: the two halves of the
curve are the same (mirror images)
• Hence mean = median = mode
• The total area under the curve is 1 (or 100%)
• Every normal distribution has the same shape as the standard
normal distribution.
Normal Curve
• The graph of the normal distribution depends on two factors:
the mean and the standard deviation.
• The mean of the distribution determines the location of the center of the
graph, and the standard deviation determines the height and width of the
graph.
• When the standard deviation is large, the curve is short and wide; when
the standard deviation is small, the curve is tall and narrow.
• All normal distributions look like a symmetric, bell-shaped curve.
Standard Normal Distribution
• It makes life a lot easier for us if we standardize our normal
curve, with a mean of zero and a standard deviation of 1
unit.
• We can transform all the observations of any normal random
variable X with mean μ and variance σ² to a new set of
observations of another normal random variable Z with mean
0 and variance 1 using the following transformation:
$$Z = \frac{X - \mu}{\sigma}$$
• About 68% of the area under the curve falls within 1
standard deviation of the mean
• About 95% of the area under the curve falls within 2
standard deviations of the mean
• About 99.7% of the area under the curve falls within 3
standard deviations of the mean
• [Figure: graph of the standardized (mean 0 and variance 1) normal
curve]
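The percentages quoted for areas within a given number of standard deviations of the mean can be verified numerically. For a standard normal variable Z, the area between −k and k equals erf(k/√2), where erf is the error function from Python's `math` module. The sketch below is only a check of those figures; `area_within` is a name chosen for this example.

```python
from math import erf, sqrt

def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return erf(k / sqrt(2))

within_1_sd = area_within(1)
within_2_sd = area_within(2)
within_3_sd = area_within(3)

for k, area in [(1, within_1_sd), (2, within_2_sd), (3, within_3_sd)]:
    print(f"within {k} SD: {area:.4f}")
```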
Probability and Normal Distributions
Table of normal distribution
• Example 1: Suppose we want to compute the area
under the standard normal curve to the left of z = 1.45
• This area is the probability that Z is less than 1.45.
• The probability can be read from the standard normal table by combining
the row for 1.4 in the first column with the column for 0.05 in the
first row.
• The shaded area in the diagram, to the left of z = 1.45, represents all values
that lie below 1.45 standard deviations above the mean.
• The area of this shaded portion is 0.9265 (or 92.65% of the
total area under the curve).
Example:
Find the area to the left of z = 2.06
Solution
Step 1: Draw the figure
Step 2: We are looking for the area under the standard normal
distribution to the left of z = 2.06. It is 0.9803. Hence, 98.03%
of the area is less than z = 2.06.
Find the area between z = 1.68 and z = −1.37.
Solution
Step 1: Draw the figure as shown.
Step 2: Since the area desired is between two given z values, look up
the areas corresponding to the two z values and subtract the smaller area from the
larger area. (Do not subtract the z values.) The area for z = 1.68 is 0.9535,
and the area for z = −1.37 is 0.0853. The area between the two z values is
0.9535 − 0.0853 = 0.8682 or 86.82%
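The table look-ups in the examples above can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later), whose `cdf` method returns the area to the left of a value. This is offered as a check on the table values, not as a replacement for learning to read the table.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

# Area to the left of a z value is the cumulative distribution function.
area_left_of_1_45 = standard_normal.cdf(1.45)
area_left_of_2_06 = standard_normal.cdf(2.06)

# Area between two z values: subtract the smaller area from the larger.
area_between = standard_normal.cdf(1.68) - standard_normal.cdf(-1.37)

print(round(area_left_of_1_45, 4))
print(round(area_left_of_2_06, 4))
print(round(area_between, 4))
```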
Example:
For subject A, a 27-year-old female, the ammonia concentration
in parts per billion (ppb) followed a normal distribution over 30
days with mean 491 and standard deviation 119. What is the
probability that on a random day, the subject’s ammonia
concentration is between 292 and 649 ppb?
Solution:
We find the z value corresponding to an x of 292 by
z = (292 − 491)/119 = −1.67
Similarly, for x = 649,
z = (649 − 491)/119 = 1.33
From the standard normal table, the area to the left of z = −1.67 is 0.0475
and the area to the left of z = 1.33 is 0.9082.
The area desired is the difference between these, 0.9082
− 0.0475 = 0.8607.
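The ammonia example can be reproduced in two ways: by standardizing and rounding z to two decimals, as is done when using a printed table, or by working directly with a normal distribution with mean 491 and standard deviation 119. The sketch below uses `statistics.NormalDist` (Python 3.8 or later); the small difference between the two results comes only from rounding z.

```python
from statistics import NormalDist

mu, sigma = 491, 119
low, high = 292, 649

# Table method: standardize, round z to two decimals, then look up areas.
z_low = round((low - mu) / sigma, 2)
z_high = round((high - mu) / sigma, 2)
standard_normal = NormalDist()
p_table_method = standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Direct method: use the distribution of X itself, with no rounding of z.
ammonia = NormalDist(mu=mu, sigma=sigma)
p_direct = ammonia.cdf(high) - ammonia.cdf(low)

print(z_low, z_high)
print(round(p_table_method, 4), round(p_direct, 4))
```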
Exercise:
1. For another subject (a 29-year-old male), the acetone levels
were normally distributed with a mean of 870 and a standard
deviation of 211 ppb. Find the probability that on a given day
the subject’s acetone level is:
a. Between 600 and 1000 ppb
b. Over 900 ppb
c. Under 500 ppb
d. Between 900 and 1100 ppb
2. If the total cholesterol values for a certain population are
approximately normally distributed with a mean of 200
mg/100 ml and a standard deviation of 20 mg/100 ml, find the
probability that an individual picked at random from this
population will have a cholesterol value:
a. Between 180 and 200 mg/100 ml
b. Greater than 225 mg/100 ml
c. Less than 150 mg/100 ml
d. Between 190 and 210 mg/100 ml
Student t-distribution
• It is often the case that one wants to calculate the size
of sample needed to obtain a certain level of confidence
in survey results.
• Unfortunately, this calculation requires prior knowledge
of the population standard deviation σ.
• Realistically, σ is unknown.
• Often a preliminary sample will be taken so that a
reasonable estimate of this critical population parameter
can be made.
• If such a preliminary sample is not taken, but
confidence intervals for the population mean are to be
constructed using an unknown σ, then the distribution
known as the Student t distribution can be used.
Student’s t-distribution cont…
• Suppose we have a simple random sample of size n
drawn from a Normal population with mean μ and
standard deviation σ. Let us denote the sample mean
by x̄ and the sample standard deviation by s. Then the
quantity
$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$
has a t distribution with n − 1 degrees of freedom.
Some properties of the t-distribution are:
The t distribution shares some characteristics of the normal
distribution and differs from it in others. The t distribution is
similar to the standard normal distribution in these ways:
1. It is bell-shaped.
2. It is symmetric about the mean.
3. The mean, median, and mode are equal to 0 and are located
at the center of the distribution.
4. It converges to the normal distribution as the sample size gets
large.
5. The curve never touches the x axis.
The t distribution differs from the standard normal distribution in
the following ways:
1. The variance is greater than 1.
2. The t distribution is actually a family of curves based on
the concept of degrees of freedom, which is related to
sample size.
3. As the sample size increases, the t distribution
approaches the standard normal distribution.
Thank you very much!!!