Normal Distribution
[Figure: normal curves labeled by their means (μ1, μ2) and standard deviations (σ1, σ2)]
6.) For the normal distribution, the standard
deviation is more meaningful than a mere
measure of dispersion: it tells us what share of the
distribution lies within a given distance of the mean.
If we draw lines perpendicular to the x-axis at the
points 1 standard deviation to the left and
1 standard deviation to the right of the mean,
then the area bounded by the normal curve, the
perpendicular lines and the x-axis is approximately
68.3% of the total distribution.
Example: If the lengths of Filipino babies at birth
approximate a normal distribution with a mean of 41 cm and
a standard deviation of 4 cm, then approximately 68.3% of
Filipino babies measure between 37 cm (41 − 4) and 45 cm
(41 + 4) at birth.
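[Figure: AREAS UNDER THE NORMAL CURVE BOUNDED BY
DIFFERENT STANDARD DEVIATIONS. Horizontal axis marked at
-3s, -2s, -1s, +1s, +2s, +3s; about 68.3%, 95% and 99.7% of the
area lie within 1, 2 and 3 standard deviations of the mean.]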
If the perpendicular lines are drawn two standard
deviations away from the mean, then the area covered is
approximately 95%.
Around 95% of the babies are between 33 and 49 cm
long at birth: 41 − (2 × 4) = 33 and 41 + (2 × 4) = 49.
If the lines are drawn three standard deviations away from
the mean, the area covered is approximately 99.7%. Beyond
three standard deviations, the areas are almost negligible.
In summary:
1 SD away from the mean = area covered is approximately 68.3%
2 SD away from the mean = area covered is approximately 95%
3 SD away from the mean = area covered is approximately 99.7%
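The areas in this summary can be checked numerically. The following is a minimal sketch in Python, using the standard identity that the area within k standard deviations of the mean equals erf(k / √2); the function name area_within is an illustrative choice, not part of the original slides.

```python
import math

def area_within(k):
    """Area under a normal curve within k standard deviations of the mean."""
    # For a standard normal Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"{k} SD away from the mean: area = {area_within(k):.4f}")
```

The computed areas are about 0.6827, 0.9545 and 0.9973, which round to the percentages quoted in the summary.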
THE IMPORTANCE OF THE NORMAL
DISTRIBUTION
Why is the normal distribution important?
The answer lies in the usefulness of the normal
distribution in explaining many biological phenomena
and in the role it plays in statistical inference.
Examples of biological variables with approximately
normal distributions:
Height, weight, blood pressure, serum uric acid
(continuous numerical)
Discrete variables, e.g. heart rate after exercise
Note that these variables assume positive values
only, whereas the values of a normal distribution
range from −∞ to +∞, and infinity can never be
attained.
What if variables are not normally distributed?
Non-normal data can often be transformed to approximately
normal data, for example with a square root or
logarithmic transformation.
E.g. the minimum dose that will produce an effect in an
animal may be converted to its natural logarithm
to yield approximately normal data.
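The effect of a logarithmic transformation can be illustrated with simulated data. The sketch below is only an illustration: the "doses" are hypothetical values drawn from a log-normal distribution with arbitrarily chosen parameters, and skewness is used as a rough indicator of departure from the symmetric normal shape.

```python
import math
import random
import statistics

def skewness(values):
    """Population skewness: close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical right-skewed "minimum effective doses" (simulated, not real data)
doses = [rng.lognormvariate(1.0, 0.8) for _ in range(5000)]
log_doses = [math.log(d) for d in doses]  # natural logarithm of each dose

print(f"skewness of raw doses: {skewness(doses):.2f}")
print(f"skewness of log doses: {skewness(log_doses):.2f}")
```

The raw values are strongly right-skewed, while their natural logarithms are nearly symmetric, which is the behavior the transformation is meant to achieve.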
The method of data transformation depends on the type
of distribution of the data. (e.g. Z-transformation)
Other types of distributions in biostatistics:
Binomial
Poisson
Student's t
These can be approximated by the normal
distribution in large sample studies.
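As an illustration of this approximation, the sketch below compares an exact binomial probability with its normal approximation. The values n = 100, p = 0.3 and k = 35 are hypothetical, and the continuity correction (k + 0.5) is a common refinement that the slides do not mention.

```python
import math
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))

def normal_approx_cdf(k, n, p):
    """Normal approximation to P(X <= k), with a continuity correction."""
    mean = n * p
    sd = math.sqrt(n * p * (1 - p))
    return NormalDist(mean, sd).cdf(k + 0.5)

n, p, k = 100, 0.3, 35  # hypothetical values, for illustration only
exact = binomial_cdf(k, n, p)
approx = normal_approx_cdf(k, n, p)
print(f"exact = {exact:.4f}, normal approximation = {approx:.4f}")
```

With a sample this large the two probabilities agree closely; with small n or p near 0 or 1 the agreement would be poorer.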
The normal distribution is easy to work with.
The most important reason for the use of the
normal distribution stems from the so-called central
limit theorem.
CENTRAL LIMIT THEOREM
A general law about the distribution of commonly
computed statistics (measures used to characterize a sample).
One of its implications is that if we repeatedly draw
large samples of size n from a population that has a
mean μ and standard deviation σ, then the
distribution of the sample means will approximate
a normal distribution.
The resulting distribution will have a mean μ and
a standard deviation σ / √n. This holds true
even if the population from which the samples are
taken is not normally distributed.
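The theorem can be checked with a small simulation. This is a sketch under stated assumptions: the population is taken to be exponential with mean 1 and SD 1 (a clearly non-normal, skewed choice made here for illustration), and the sample size and number of repetitions are arbitrary.

```python
import math
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
n = 50                    # size of each sample
repeats = 2000            # number of samples drawn

# Assumed population: exponential with rate 1, so mean = 1 and SD = 1
pop_mean, pop_sd = 1.0, 1.0

sample_means = [
    statistics.fmean([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(repeats)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.stdev(sample_means)

print(f"mean of sample means: {mean_of_means:.3f} (population mean = {pop_mean})")
print(f"SD of sample means:   {sd_of_means:.3f} (sigma / sqrt(n) = {pop_sd / math.sqrt(n):.3f})")
```

The mean of the sample means lands close to the population mean, and their standard deviation lands close to σ / √n, even though the population itself is skewed.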
THE STANDARD NORMAL DISTRIBUTION
According to the fifth property, the normal
distribution is actually a collection or family of
distributions, with each member having its own
population mean μ and standard deviation σ.
The most important member of this family is the one
having μ = 0 and σ = 1 (unity).
This member of the family is called the standard
normal distribution.
Let's say we take a sample of values from a
normally distributed population with a mean μ
and standard deviation σ, and denote these
values as x's. To convert this particular normal
distribution into a standard normal distribution,
we subtract the value of μ from each x value and
divide the result by σ to get a new variable Z:
Z = (x − μ) / σ
Z has a normal distribution with mean = 0 and
SD = 1. The variable Z is called the standard
normal deviate.
In the example of the lengths of Filipino babies,
which are normally distributed with mean = 41 cm
and SD = 4 cm:
Z = (x − 41) / 4
The value of 45 cm in this distribution is
equivalent to z=1 in the standard normal
distribution.
The proportion or percentage of Filipino babies
who measure more than 45 cm at birth is
equal to the area under the standard normal
curve to the right of z = 1.
The value of 33 cm in the original distribution is
transformed to z = −2. The percentage of Filipino
babies who have lengths less than 33 cm is equal
to the area under the curve to the left of z = −2.
The proportion of Filipino babies who measure
between 33 cm and 45 cm at birth is the same as
the area between z = −2 and z = 1 in the standard
normal curve.
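The three areas described for the baby-length example can be computed as follows. This is a minimal sketch using Python's statistics.NormalDist in place of a printed z-table; the variable and function names are illustrative.

```python
from statistics import NormalDist

mean_cm, sd_cm = 41, 4      # baby lengths, from the example
std_normal = NormalDist()   # standard normal: mean 0, SD 1

def z_score(x):
    """Convert a length x (in cm) to its standard normal deviate."""
    return (x - mean_cm) / sd_cm

above_45 = 1 - std_normal.cdf(z_score(45))   # area to the right of z = 1
below_33 = std_normal.cdf(z_score(33))       # area to the left of z = -2
between = std_normal.cdf(z_score(45)) - std_normal.cdf(z_score(33))

print(f"P(length > 45 cm)      = {above_45:.4f}")
print(f"P(length < 33 cm)      = {below_33:.4f}")
print(f"P(33 < length < 45 cm) = {between:.4f}")
```

Under these assumptions roughly 16% of babies measure more than 45 cm, about 2% measure less than 33 cm, and about 82% fall in between.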
APPLICATIONS OF THE NORMAL
DISTRIBUTION
IT ALLOWS US TO SOLVE PROBLEMS OF TWO TYPES:
1.) ALLOWS US TO COMPUTE FOR PROPORTIONS
OR PERCENTAGES OF VALUES THAT BELONG TO
DIFFERENT CATEGORIES OF THE VARIABLE OF
INTEREST.
- The proportion belonging to a certain category can
also be interpreted as the probability that a member
drawn from the population will belong to this category.
- These categories may be represented by ranges of
values or intervals.
Example:
If the distribution of systolic BP of non-hypertensive
men is known to be normal with a mean of 110
mmHg and an SD of 15 mmHg, then the proportion
of non-hypertensive men who have systolic BP
between 100 and 120 mmHg can be determined.
This proportion is computed by solving for the
area under the curve with μ = 110 and σ = 15,
bounded by the x values x1 = 100 and x2 = 120.
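A sketch of this computation is shown here, done two ways: directly on the original scale and after converting x1 and x2 to z values. It assumes Python's statistics.NormalDist as a stand-in for the z-table; both routes should give the same area.

```python
from statistics import NormalDist

sbp = NormalDist(mu=110, sigma=15)   # mean and SD from the example

# Directly on the original scale: area between x1 = 100 and x2 = 120
direct = sbp.cdf(120) - sbp.cdf(100)

# Via the standard normal curve: convert x1 and x2 to z values first
z1 = (100 - 110) / 15
z2 = (120 - 110) / 15
std = NormalDist()
via_z = std.cdf(z2) - std.cdf(z1)

print(f"direct: {direct:.4f}, via z values: {via_z:.4f}")
```

Both approaches give an area of about 0.495, so roughly half of non-hypertensive men would have systolic BP between 100 and 120 mmHg under these assumptions.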
2.) IT ALLOWS US TO DETERMINE THE X VALUES
THAT BOUND A SPECIFIED AREA OF
DISTRIBUTION OF THIS VARIABLE.
WE MAY BE INTERESTED IN DETERMINING
THE SBP OF THE HIGHEST 10% OF NON-
HYPERTENSIVE MEN. (Use the z-table)
PROBLEM 1:
What is the proportion of non-hypertensive men
who have systolic blood pressure above 120
mmHg?
Solution:
Here x = 120. We compute Z using the formula
Z = (x − μ) / σ = (120 − 110) / 15 = 0.67
Look for the area of the standard normal curve
corresponding to z = 0.67. The table gives us A = 0.2514.
This is the area under the curve to the right of z = 0.67. In the
normal distribution with μ = 110 and σ = 15, 0.2514 is also the
area to the right of x = 120, so about 25.14% of non-hypertensive
men have systolic BP above 120 mmHg.
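[Figure: normal curve of X with μ = 110 and σ = 15 mmHg, shaded
area A = 0.2514 to the right of x = 120; corresponding standard
normal curve with μ = 0 and σ = 1, shaded area to the right of
z = 0.67.]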
PROBLEM 2
What is the proportion of non-hypertensive men
with systolic BP less than 90 mmHg ?
Z = (x − μ) / σ = (90 − 110) / 15 = −1.33
The value of x = 90 corresponds to Z = −1.33.
Note that there are no negative z-values in the table.
However, we know that the standard normal curve is
symmetric about its mean (0).
The area to the left of Z = −1.33 is the same as the
area to the right of Z = +1.33. This area in the table is
0.0918 or 9.18%.
Therefore, the proportion of non-hypertensive men
with systolic BP less than 90 mmHg is 9.18%.
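[Figure: normal curve of X with μ = 110 and σ = 15 mmHg, shaded
area A = 0.0918 to the left of x = 90; corresponding standard
normal curve with μ = 0 and σ = 1, shaded area A = 0.0918 to the
left of z = −1.33.]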
PROBLEM 3
Calculate the area under the curve for systolic BP > 90
and < 120 mmHg.
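[Figure: normal curve of X with μ = 110 and σ = 15, area between
x = 90 and x = 120; corresponding standard normal curve with
μ = 0 and σ = 1, area between z = −1.33 and z = 0.67.]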
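One way to work Problems 1 to 3 together is sketched below. It assumes the same table-lookup approach as the solutions above (z rounded to two decimals, upper-tail areas, symmetry for negative z), with Python's statistics.NormalDist standing in for the printed table. The slides do not state the answer to Problem 3; here it is obtained by subtracting the two tail areas from 1.

```python
from statistics import NormalDist

MU, SIGMA = 110, 15   # SBP of non-hypertensive men, from the text
std = NormalDist()    # standard normal curve

def table_z(x):
    """z value rounded to two decimals, as it would be looked up in a z-table."""
    return round((x - MU) / SIGMA, 2)

def upper_tail(z):
    """Area under the standard normal curve to the right of z."""
    return 1 - std.cdf(z)

area_above_120 = upper_tail(table_z(120))          # Problem 1: right of z = 0.67
area_below_90 = upper_tail(abs(table_z(90)))       # Problem 2: symmetry, z = 1.33
area_between = 1 - area_above_120 - area_below_90  # Problem 3: what remains

print(f"SBP > 120:      {area_above_120:.4f}")
print(f"SBP < 90:       {area_below_90:.4f}")
print(f"90 < SBP < 120: {area_between:.4f}")
```

With table-rounded z values the tail areas match the table values of 0.2514 and 0.0918, leaving an area of about 0.6568 between 90 and 120 mmHg.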
PROBLEM 4:
What is the 90th percentile of the systolic BP levels of
non-hypertensive men?
Solution:
From the table, the z value with an area of 0.10 to its right is z = 1.28.
X = zσ + μ
X = (1.28)(15) + 110
= 129.2
Therefore, 129.2 mmHg is the 90th percentile of the SBP of
non-hypertensive men. Ninety percent of non-hypertensive
men have SBP less than or equal to 129.2 mmHg.
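The percentile calculation can also be sketched in code. This assumes Python's statistics.NormalDist, whose inv_cdf method returns the value below which a given proportion of the distribution lies; it replaces the reverse lookup in the z-table.

```python
from statistics import NormalDist

MU, SIGMA = 110, 15   # SBP of non-hypertensive men, from the text

# z value with 90% of the standard normal curve to its left (about 1.28)
z90 = NormalDist().inv_cdf(0.90)

# Convert back to the original scale: X = z * sigma + mu
x90 = z90 * SIGMA + MU

# The same percentile taken directly from the SBP distribution
direct = NormalDist(MU, SIGMA).inv_cdf(0.90)

print(f"z = {z90:.4f}, 90th percentile = {x90:.1f} mmHg")
```

The unrounded z value is slightly larger than the table value of 1.28, but the resulting percentile still rounds to 129.2 mmHg.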
EXERCISES TO DO:
1.) The hemoglobin level of household heads in
Magallanes, Cavite has a mean of 12.63 gm% and an
SD of 2.45 gm%. Assuming that hemoglobin levels
follow a normal distribution:
1.1 What is the probability that a household head would
be classified as having severe anemia if the cut-off point
used is < gm%?
1.2 What percentage of the population of household
heads would be classified as normal if the cut-off
point used is a hemoglobin level of > 12 gm%?
2.) Suppose that the height of 6-year-old Filipino boys
is normally distributed with a mean of 110 cm and
a variance of 40 cm².
2.1 What proportion of 6-year-old Filipino boys are
taller than 103 cm?
2.2 Between what two values do the middle 80% of the
heights fall?