The Normal Distribution
Continuous Distributions
A continuous random variable is a variable whose possible values form some interval of
numbers. Typically, a continuous variable involves a measurement of something, such as the
height of a person, the weight of a newborn baby, or the length of time a car battery lasts.
EXAMPLES:
1. Let x = the amount of milk a cow produces in one day. This is a continuous random variable
because it can have any value over a continuous span. During a single day, a cow might yield
an amount of milk that can be any value between 0 gallons and 5 gallons. It would be possible
to get 4.123456 gallons, because the cow is not restricted to the discrete amounts of 0, 1, 2, 3,
4, or 5 gallons.
2. The measure of voltage for a smoke-detector battery can be any value between 0 volts and
9 volts. It is therefore a continuous random variable.
In the histograms we have seen thus far, the frequencies, percentages, proportions, or probabilities were represented by the heights of the rectangles, or by their areas. In the continuous
case, we also represent probabilities by areas, though not by areas of rectangles but by areas under
continuous curves.
Continuous curves such as the one shown on the right are the graphs of functions called probability densities, or informally, continuous distributions. Probability densities are characterized by the fact that the area under the curve between any two values a and b gives the
probability that a random variable having this continuous distribution will take on a value on
the interval from a to b.
The total area under the curve must equal 1.
EXAMPLE: Verify that $f(x) = \frac{x}{8}$ can serve as the probability density of a random variable
defined over the interval from x = 0 to x = 4. Then find the probabilities that a random
variable having the given probability density will take on a value
(a) less than 2;
(b) less than or equal to 2.
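The two requirements for a density (it is never negative, and the total area under it equals 1) can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `f` and `area` and the midpoint-rule integration are illustrative assumptions. Since the region under $f(x) = x/8$ is a triangle, the exact areas are $\frac{4^2}{16} = 1$ over the whole interval and $\frac{2^2}{16} = 0.25$ to the left of 2.

```python
def f(x):
    """Candidate density f(x) = x/8 on [0, 4]; zero elsewhere."""
    return x / 8 if 0 <= x <= 4 else 0.0


def area(a, b, steps=100_000):
    """Midpoint-rule approximation of the area under f between a and b."""
    width = (b - a) / steps
    return sum(f(a + (i + 0.5) * width) for i in range(steps)) * width


# Requirement 1: the density is never negative on its interval.
never_negative = all(f(4 * i / 1000) >= 0 for i in range(1001))

# Requirement 2: the total area under the curve equals 1.
total_area = area(0, 4)

# Parts (a) and (b): a single point carries no area, so for a continuous
# variable P(X < 2) and P(X <= 2) are the same number.
p_less_than_2 = area(0, 2)

print(never_negative, round(total_area, 4), round(p_less_than_2, 4))
```

Parts (a) and (b) have the same answer because the area above a single point is zero.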
We can interpret the above statement in several ways. Theoretically, it says that standardizing
converts all normal distributions to the standard normal distribution, as depicted in the Figure
below.
EXAMPLE: Find z if
(a) the standard-normal-curve area between 0 and z is 0.4484
(b) the standard-normal-curve area to the left of z is 0.9868
Solution:
(a) z = 1.63
(b) $0.9868 - 0.5000 = 0.4868$, so $z = 2.22$
z = 0.98
z = 1.47
z = 0.41
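Looking up z from a given area is an inverse lookup in the standard normal table. As a hedged cross-check of parts (a) and (b), the sketch below uses `statistics.NormalDist` from the Python standard library in place of the printed table; the variable names are illustrative and not part of the original notes.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# (a) The area between 0 and z is 0.4484, so the area to the left of z
#     is 0.5 (the left half of the curve) plus 0.4484.
z_a = standard_normal.inv_cdf(0.5 + 0.4484)

# (b) The area to the left of z is given directly as 0.9868.
z_b = standard_normal.inv_cdf(0.9868)

print(round(z_a, 2), round(z_b, 2))
```

Rounded to two decimals, the results should agree with the table values z = 1.63 and z = 2.22.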
EXAMPLE: If a random variable has the normal distribution with $\mu = 82.0$ and $\sigma = 4.8$, find
the probabilities that it will take on a value
(a) less than 89.2
(b) greater than 78.4
(c) between 83.2 and 88.0
(d) between 73.6 and 90.4
Solution:
(a) We have
$z = \frac{89.2 - 82}{4.8} = 1.5$
therefore the probability is $0.4332 + 0.5 = 0.9332$.
(b) We have
$z = \frac{78.4 - 82}{4.8} = -0.75$
therefore the probability is $0.2734 + 0.5 = 0.7734$.
(c) We have
$z_1 = \frac{83.2 - 82}{4.8} = 0.25$ and $z_2 = \frac{88 - 82}{4.8} = 1.25$
therefore the probability is $0.3944 - 0.0987 = 0.2957$.
(d) We have
$z_1 = \frac{73.6 - 82}{4.8} = -1.75$ and $z_2 = \frac{90.4 - 82}{4.8} = 1.75$
therefore the probability is $0.4599 + 0.4599 = 0.9198$.
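The four answers above can be reproduced by standardizing each endpoint and evaluating the area to the left of the resulting z-score. This is a minimal sketch under the assumption that the error function `math.erf` stands in for the printed table; the helper names `phi` and `z_score` are illustrative.

```python
from math import erf, sqrt


def phi(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def z_score(x, mu, sigma):
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


mu, sigma = 82.0, 4.8

p_a = phi(z_score(89.2, mu, sigma))                                  # less than 89.2
p_b = 1.0 - phi(z_score(78.4, mu, sigma))                            # greater than 78.4
p_c = phi(z_score(88.0, mu, sigma)) - phi(z_score(83.2, mu, sigma))  # between 83.2 and 88.0
p_d = phi(z_score(90.4, mu, sigma)) - phi(z_score(73.6, mu, sigma))  # between 73.6 and 90.4

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {p:.4f}")
```

The table stores areas between 0 and z, which is why the hand solution adds or subtracts 0.5; `phi` returns the area to the left of z directly, so those adjustments are already built in.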
We have
$z_1 = \frac{115 - 100}{16} = 0.9375$ and $z_2 = \frac{140 - 100}{16} = 2.5$
therefore the probability is $0.4938 - 0.3264 = 0.1674$. It follows that 16.74% of all people have
IQs between 115 and 140. Equivalently, the probability is 0.1674 that a randomly selected
person will have an IQ between 115 and 140.
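A two-decimal table forces $z_1 = 0.9375$ to be rounded to 0.94 before the lookup, so software that works with the unrounded value gives a slightly different answer. The sketch below illustrates the size of that effect. It assumes mean 100 and standard deviation 16, as read off the z-score computation above, and uses `statistics.NormalDist` in place of the table.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=16)   # parameters taken from the z-scores above
standard_normal = NormalDist()      # mean 0, standard deviation 1

# Unrounded: evaluate the IQ distribution directly at 115 and 140.
p_unrounded = iq.cdf(140) - iq.cdf(115)

# Table style: round z1 = 0.9375 to 0.94 first, then look up the areas.
p_table_style = standard_normal.cdf(2.50) - standard_normal.cdf(0.94)

print(f"unrounded: {p_unrounded:.4f}  table style: {p_table_style:.4f}")
```

The two values differ only in the third decimal place, which is typical when z-scores are rounded for a table lookup.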
EXAMPLE: One of the larger species of tarantulas is the Grammostola mollicoma, whose
common name is the Brazilian giant tawny red. A tarantula has two body parts. The anterior
part of the body is covered above by a shell, or carapace. From a recent article by F. Costa
and F. Perez-Miles titled "Reproductive Biology of Uruguayan Theraphosids" (The Journal of
Arachnology, Vol. 30, No. 3, pp. 571-587), we find that the carapace length of the adult male
G. mollicoma is normally distributed with mean 18.14 mm and standard deviation 1.76 mm.
(a) Find the percentage of adult male G. mollicoma that have carapace length between 16 mm
and 17 mm.
(b) Find the percentage of adult male G. mollicoma that have carapace length exceeding 19
mm.
Solution:
(a) We have
$z_1 = \frac{16 - 18.14}{1.76} \approx -1.22$ and $z_2 = \frac{17 - 18.14}{1.76} \approx -0.65$
therefore the probability is $0.3888 - 0.2422 = 0.1466$. It follows that 14.66% of adult male G.
mollicoma have carapace length between 16 mm and 17 mm. Equivalently, the probability is
0.1466 that a randomly selected adult male G. mollicoma has carapace length between 16 mm
and 17 mm.
(b) We have
$z = \frac{19 - 18.14}{1.76} \approx 0.49$
therefore the probability is $0.5 - 0.1879 = 0.3121$. It follows that 31.21% of adult male G.
mollicoma have carapace length exceeding 19 mm. Equivalently, the probability is 0.3121 that
a randomly selected adult male G. mollicoma has carapace length exceeding 19 mm.
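Percentage questions of this kind all reduce to "area between two endpoints", where one endpoint may be missing. The following sketch wraps that idea in a small helper. The function `percent_between` is hypothetical (introduced here for illustration only), and `statistics.NormalDist` replaces the table, so the results can differ from the hand solution in the last digits because the z-scores are not rounded.

```python
from statistics import NormalDist


def percent_between(dist, low=None, high=None):
    """Percentage of the distribution lying between low and high.

    Passing None for an endpoint leaves the interval open on that side.
    """
    lower_area = 0.0 if low is None else dist.cdf(low)
    upper_area = 1.0 if high is None else dist.cdf(high)
    return 100.0 * (upper_area - lower_area)


# Carapace length of adult male G. mollicoma, in millimetres.
carapace = NormalDist(mu=18.14, sigma=1.76)

pct_16_to_17 = percent_between(carapace, 16, 17)   # part (a)
pct_over_19 = percent_between(carapace, low=19)    # part (b)

print(f"(a) {pct_16_to_17:.2f}%  (b) {pct_over_19:.2f}%")
```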
EXAMPLE: As reported in Runner's World magazine, the times of the finishers in the New
York City 10-km run are normally distributed with mean 61 minutes and standard deviation 9
minutes.
(a) Determine the percentage of finishers with times between 50 and 70 minutes.
(b) Determine the percentage of finishers with times less than 75 minutes.
Solution:
(a) We have
$z_1 = \frac{50 - 61}{9} \approx -1.22$ and $z_2 = \frac{70 - 61}{9} = 1$
therefore the probability is $0.3888 + 0.3413 = 0.7301$, or 73.01%.
(b) We have
$z = \frac{75 - 61}{9} \approx 1.5556$
therefore the probability is $0.5 + 0.4406 = 0.9406$, or 94.06%.
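The phrase "percentage of finishers" can be made concrete by simulation: draw many finishing times from the stated normal distribution and count how many fall in each range. This is an illustrative sketch only. The sample size, the seed, and the variable names are assumptions, and the simulated shares match the table answers only up to sampling noise.

```python
import random

random.seed(0)  # fixed seed so that repeated runs give the same sample

MEAN_MINUTES, SD_MINUTES = 61.0, 9.0
N_FINISHERS = 200_000  # simulated finishers, not the real field size

times = [random.gauss(MEAN_MINUTES, SD_MINUTES) for _ in range(N_FINISHERS)]

# Share of simulated finishers in each range of interest.
share_50_to_70 = sum(50 <= t <= 70 for t in times) / N_FINISHERS
share_under_75 = sum(t < 75 for t in times) / N_FINISHERS

print(f"(a) {100 * share_50_to_70:.1f}%  (b) {100 * share_under_75:.1f}%")
```

With a sample this large, the simulated shares should land within about a percentage point of 73.01% and 94.06%.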
EXAMPLE: An article by S. M. Berry titled "Drive for Show and Putt for Dough" (Chance,
Vol. 12(4), pp. 50-54) discussed driving distances of PGA players. The mean distance for
tee shots on the 1999 men's PGA tour is 272.2 yards with a standard deviation of 8.12 yards.
Assuming that the 1999 tee-shot distances are normally distributed, find the percentage of such
tee shots that went
(a) between 260 and 280 yards.
(b) more than 300 yards.
Solution:
(a) We have
$z_1 = \frac{260 - 272.2}{8.12} \approx -1.5$ and $z_2 = \frac{280 - 272.2}{8.12} \approx 0.96$
therefore the probability is $0.4332 + 0.3315 = 0.7647$, or 76.47%.
(b) We have
$z = \frac{300 - 272.2}{8.12} \approx 3.42$
therefore the probability is $0.5 - 0.4997 = 0.0003$, or 0.03%.
EXAMPLE: A student is taking a true-false exam with 10 questions. Assume that the student
guesses at all 10 questions.
(a) Determine the probability that the student gets either 7 or 8 answers correct.
(b) Approximate the probability obtained in part (a) by an area under a suitable normal curve.
Solution: Let X denote the number of correct answers by the student. Then X has the binomial
distribution with parameters n = 10 (the 10 questions) and p = 0.5 (the probability of a correct
guess).
(a) Probabilities for X are given by the binomial probability formula
$P(X = x) = \binom{10}{x} (0.5)^x (1 - 0.5)^{10 - x}$
Using this formula, we get the probability distribution of X, as shown in the Table below.
According to that table, the probability the student gets either 7 or 8 answers correct is
P (X = 7 or 8) = P (X = 7) + P (X = 8) = 0.1172 + 0.0439 = 0.1611
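The binomial probabilities used in part (a) can be generated directly from the formula. The sketch below is a minimal illustration that uses `math.comb` for the binomial coefficient; the function name `binomial_pmf` and the dictionary `distribution` are assumptions introduced here, not names from the original notes.

```python
from math import comb


def binomial_pmf(x, n, p):
    """P(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.5  # 10 true-false questions, each guessed at random

# Full probability distribution of X, the number of correct answers.
distribution = {x: binomial_pmf(x, n, p) for x in range(n + 1)}

p_7_or_8 = distribution[7] + distribution[8]

for x, prob in distribution.items():
    print(f"{x:2d}  {prob:.4f}")
print(f"P(X = 7 or 8) = {p_7_or_8:.4f}")
```

The eleven probabilities sum to 1, and the entries for 7 and 8 should round to the values 0.1172 and 0.0439 used above.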
(b) Referring to Table below, we drew the probability histogram of X. Because the probability
histogram is bell shaped, probabilities for X can be approximated by areas under a normal
curve. The appropriate normal curve is the one whose parameters are the same as the mean
and standard deviation of X, which are
$\mu = np = 10 \cdot 0.5 = 5$
and
$\sigma = \sqrt{np(1 - p)} = \sqrt{10 \cdot 0.5 \cdot (1 - 0.5)} \approx 1.58$
The probability P (X = 7 or 8) equals the area of the corresponding bars of the histogram,
cross-hatched in the Figure above. Note that the cross-hatched area approximately equals the
area under the normal curve between 6.5 and 8.5.
The Figure makes clear why we consider the area under the normal curve between 6.5 and
8.5 instead of between 7 and 8. This adjustment is called the correction for continuity. It
is required because we are approximating the distribution of a discrete variable by that of a
continuous variable.
In any case, the Figure shows that P (X = 7 or 8) roughly equals the area under the normal
curve with parameters $\mu = 5$ and $\sigma = 1.58$ that lies between 6.5 and 8.5. To compute this area,
we convert to z-scores and then find the corresponding area under the standard normal curve.
We have
$z_1 = \frac{6.5 - 5}{1.58} \approx 0.95$ and $z_2 = \frac{8.5 - 5}{1.58} \approx 2.22$
therefore the probability is $0.4868 - 0.3289 = 0.1579$. This is close to P (X = 7 or 8), which, as
we found in part (a), is 0.1611.
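To see how much the correction for continuity matters, the sketch below computes the normal-curve area both with the corrected endpoints (6.5 to 8.5) and with the uncorrected ones (7 to 8). It is an illustration under the assumption that `statistics.NormalDist` replaces the table lookup; the variable names are not from the original notes.

```python
from math import sqrt
from statistics import NormalDist

n, p = 10, 0.5
mu = n * p                      # mean of X
sigma = sqrt(n * p * (1 - p))   # standard deviation of X

approximating_curve = NormalDist(mu, sigma)

# With the correction for continuity: each bar of the histogram extends
# half a unit on either side of its integer value.
p_corrected = approximating_curve.cdf(8.5) - approximating_curve.cdf(6.5)

# Without the correction: only the area between 7 and 8 is counted,
# which covers half of each of the two bars.
p_uncorrected = approximating_curve.cdf(8) - approximating_curve.cdf(7)

print(f"corrected: {p_corrected:.4f}  uncorrected: {p_uncorrected:.4f}")
```

The corrected area should be close to the exact value 0.1611, while the uncorrected area falls well short of it.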
We can now write a general step-by-step method for approximating binomial probabilities by
areas under a normal curve.
EXAMPLE: The probability is 0.80 that a person of age 20 years will be alive at age 65 years.
Suppose that 500 people of age 20 are selected at random. Determine the probability that
(a) exactly 390 of them will be alive at age 65.
(b) between 375 and 425 of them, inclusive, will be alive at age 65.
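Solution: We have n = 500 and p = 0.8, therefore
$np = 500 \cdot 0.8 = 400$ and $n(1 - p) = 500 \cdot 0.2 = 100$
Both $np$ and $n(1 - p)$ are greater than 5, so we can continue. We get
$\mu = np = 400$ and $\sigma = \sqrt{np(1 - p)} = \sqrt{500 \cdot 0.8 \cdot 0.2} \approx 8.94$
(a) To make the correction for continuity, we subtract 0.5 from 390 and add 0.5 to 390. Thus
we need the area under the normal curve with parameters $\mu = 400$ and $\sigma = 8.94$ that lies
between 389.5 and 390.5. We have
$z_1 = \frac{389.5 - 400}{8.94} \approx -1.17$ and $z_2 = \frac{390.5 - 400}{8.94} \approx -1.06$
therefore the probability is about $0.3790 - 0.3554 = 0.0236$ that exactly 390 of the 500 people
selected will be alive at age 65.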
(b) To make the correction for continuity, we subtract 0.5 from 375 and add 0.5 to 425. Thus
we need to determine the area under the normal curve with parameters $\mu = 400$ and $\sigma = 8.94$
that lies between 374.5 and 425.5. As in part (a), we convert to z-scores
$z_1 = \frac{374.5 - 400}{8.94} \approx -2.85$ and $z_2 = \frac{425.5 - 400}{8.94} \approx 2.85$
and then find the corresponding area under the standard normal curve. This area is $0.4978 \cdot 2 = 0.9956$.
So, $P(375 \le X \le 425) = 0.9956$, approximately.
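The steps used in this solution (check that $np$ and $n(1 - p)$ are large enough, compute $\mu$ and $\sigma$, apply the correction for continuity, and find the area) can be collected into one function. The following is a sketch under stated assumptions: the function name `normal_approx_binomial` is hypothetical, the threshold of 5 is taken from the check made in the solution above, and `statistics.NormalDist` replaces the table lookup.

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_binomial(n, p, low, high):
    """Approximate P(low <= X <= high) for a binomial X by a normal-curve area.

    low and high are integers; the correction for continuity widens the
    interval by 0.5 on each side.
    """
    # Step 1: check that the approximation is reasonable (threshold assumed to be 5).
    if n * p < 5 or n * (1 - p) < 5:
        raise ValueError("np and n(1 - p) should both be at least 5")
    # Step 2: mean and standard deviation of the binomial distribution.
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    # Steps 3 and 4: correct for continuity, then find the area under the curve.
    curve = NormalDist(mu, sigma)
    return curve.cdf(high + 0.5) - curve.cdf(low - 0.5)


p_exactly_390 = normal_approx_binomial(500, 0.8, 390, 390)   # part (a)
p_375_to_425 = normal_approx_binomial(500, 0.8, 375, 425)    # part (b)

print(f"(a) {p_exactly_390:.4f}  (b) {p_375_to_425:.4f}")
```

Because the z-scores are not rounded here, the results may differ from 0.0236 and 0.9956 in the last digit or two.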