.5 The Central Limit Theorem
Statisticians are also interested in knowing the distribution of the means of samples taken
from a population.
DISTRIBUTION OF THE SAMPLE MEANS
This is a distribution obtained by using the means computed from random samples of a
specific size taken from a population. If the samples are randomly selected, the sample means will
be somewhat different from the population mean $\mu$. The differences are caused by sampling error.
Sampling error is the difference between the sample measure and the corresponding population
measure due to the fact that the sample is not a perfect representation of the population.
Properties of the Distribution of Sample Means
1. The mean of the sample means will be the same as the population mean.
2. The standard deviation of the sample means will be smaller than the standard
deviation of the population, and it will be equal to the population standard deviation
divided by the square root of the sample size.
The standard deviation of the sample mean is called the standard error of the mean.
This is equal to
$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$
Another property of the sampling distribution of the sample mean refers to the shape of the
distribution and is explained by the central limit theorem.
Central Limit Theorem
As the sample size $n$ increases, the shape of the distribution of the sample means
taken from a population with mean $\mu$ and standard deviation $\sigma$ will approach
a normal distribution. This distribution will have a mean $\mu$ and a standard
deviation $\sigma/\sqrt{n}$.
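The properties stated above can be checked empirically. The following is a minimal simulation sketch, not part of the original text. It assumes a skewed exponential population with mean 1 and standard deviation 1 (chosen purely for illustration), draws many samples of size 30, and compares the mean and standard deviation of the sample means with $\mu$ and $\sigma/\sqrt{n}$. The function name `simulate_sample_means` and all parameter values are hypothetical.

```python
import random
import statistics
from math import sqrt


def simulate_sample_means(n, trials, seed=0):
    """Return the means of `trials` random samples of size `n`.

    Assumption for illustration: the population is exponential with
    rate 1, so its mean and standard deviation are both 1.
    """
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        sample = [rng.expovariate(1.0) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means


n = 30
means = simulate_sample_means(n=n, trials=5000)

mean_of_means = statistics.fmean(means)  # should be close to mu = 1
sd_of_means = statistics.stdev(means)    # should be close to sigma / sqrt(n)
standard_error = 1.0 / sqrt(n)           # theoretical value, about 0.18

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f}")
print(f"sigma / sqrt(n):      {standard_error:.3f}")
```

Even though the assumed population is far from normal, a histogram of `means` would look roughly bell-shaped, which is the behavior the theorem describes.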
The central limit theorem can be used to answer questions about sample means in the same
manner that the normal distribution can be used to answer questions about individual values. There is
a new formula to be used for the $z$ values.
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$$
where $\bar{X}$ is the sample mean and $\sigma/\sqrt{n}$ is the standard deviation of the sample means.
It is important to remember two things when using the central limit theorem:
1. When the original variable is normally distributed, the distribution of the sample means will be
normally distributed, for any sample size $n$.
2. When the distribution of the original variable departs from normality, a sample size of 30 or
more is needed to use the normal distribution to approximate the distribution of the sample
means. The larger the sample, the better the approximation will be.
Example 1
The mean serum cholesterol of a large population of overweight adults is 220 mg/dl
and the standard deviation is 16.3 mg/dl. If a sample of 30 adults is selected, find the
probability that the mean will be between 220 and 222 mg/dl.
Transform to $z$ values. The mean, 220, corresponds to $z = 0$. For 222,
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{222 - 220}{16.3/\sqrt{30}} = 0.67$$
The required area is between $z = 0$ and $z = 0.67$, which is 0.2486. Hence, the
probability that the mean will be between 220 and 222 mg/dl is 24.86%. That is,
$P(220 < \bar{X} < 222) = 0.2486$.
Example 2
The average age of accountants is 43 years, with a standard deviation of 5 years.
If an accounting firm employs 30 accountants, find the probability that the average
age of the group is greater than 44.2 years old.
Since the question concerns the mean of a sample with a size of 30, the formula
$z = (\bar{X} - \mu)/(\sigma/\sqrt{n})$ is used. The $z$ value is
$$z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} = \frac{44.2 - 43}{5/\sqrt{30}} = 1.31$$
The area between $z = 0$ and $z = 1.31$ is 0.4049.
Hence, the area to the right of $z = 1.31$ is $0.5 - 0.4049 = 0.0951$. Therefore, the probability that the
average age of the group is greater than 44.2 years old is 0.0951 or 9.51%. That is,
$P(\bar{X} > 44.2) = 0.0951$.
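The two worked examples can be cross-checked numerically. The sketch below is an illustration rather than part of the original solutions. It uses Python's standard-library `statistics.NormalDist` in place of a printed normal table, and the helper name `z_for_sample_mean` is hypothetical. For the accountants example it assumes a standard deviation of 5 years, which is consistent with the $z$ value of 1.31 reported in the text. Because $z$ is not rounded to two decimals here, the probabilities differ from the table-based answers in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def z_for_sample_mean(x_bar, mu, sigma, n):
    """z value for a sample mean: (x_bar - mu) / (sigma / sqrt(n))."""
    return (x_bar - mu) / (sigma / sqrt(n))


std_normal = NormalDist()  # mean 0, standard deviation 1

# Cholesterol example: P(220 < sample mean < 222) with n = 30.
z1 = z_for_sample_mean(222, 220, 16.3, 30)
p1 = std_normal.cdf(z1) - std_normal.cdf(0.0)

# Accountants example: P(sample mean > 44.2) with n = 30.
# Assumption: sigma = 5 years.
z2 = z_for_sample_mean(44.2, 43, 5, 30)
p2 = 1 - std_normal.cdf(z2)

print(f"z1 = {z1:.2f}, probability = {p1:.4f}")
print(f"z2 = {z2:.2f}, probability = {p2:.4f}")
```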
Summary of Formulas and Their Uses
1. $z = \dfrac{X - \mu}{\sigma}$
   Used to gain information about an individual data value when
   the variable is normally distributed.
2. $z = \dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$
   Used to gain information when applying the central limit
   theorem about a sample mean when the variable is normally
   distributed or when the sample size is 30 or more.
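The difference between the two formulas is easiest to see when both are applied to the same cutoff. The sketch below uses made-up numbers (population mean 50, standard deviation 10, cutoff 52, sample size 25), none of which come from the text, and a hypothetical helper `prob_less_than`. Setting `n=1` gives the formula for an individual value, and `n>1` gives the formula for a sample mean.

```python
from math import sqrt
from statistics import NormalDist


def prob_less_than(x, mu, sigma, n=1):
    """P(value < x) for an individual (n = 1) or a sample mean (n > 1)."""
    z = (x - mu) / (sigma / sqrt(n))
    return NormalDist().cdf(z)


# Hypothetical population: mu = 50, sigma = 10.
p_individual = prob_less_than(52, mu=50, sigma=10)         # z = 0.2
p_sample_mean = prob_less_than(52, mu=50, sigma=10, n=25)  # z = 1.0

print(f"individual value below 52: {p_individual:.4f}")
print(f"mean of 25 values below 52: {p_sample_mean:.4f}")
```

The sample mean is much more likely than a single value to fall below 52, because the standard error $\sigma/\sqrt{n}$ is smaller than $\sigma$.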
The average annual precipitation for a certain city is 30.83 inches, with a standard
deviation of 5 inches. If a random sample of 10 years is selected, find the probability
that the mean will be between 32 and 33 inches. Assume the variable is normally
distributed.
Answer the following.
The average number of kilos of meat a person consumes in a year is 100 kilos. Assume that the
standard deviation is 11 kilos and the distribution is approximately normal.
a) Find the probability that a person selected at random consumes less than
105 kilos per year.
b) If a sample of 40 individuals is selected, find the probability that the mean of the sample
will be less than 105 kilos per year.
The mean weight of 20-year-old females is 126 pounds and the standard deviation is 15.7.
If a sample of 25 females is selected, find the probability that the mean of the sample will be
greater than 128.3 pounds. Assume that the variable is normally distributed.
Assume that the mean systolic blood pressure of normal adults is 120 millimeters of mercury
(mmHg) and the standard deviation is 5.6. Assume that the variable is normally distributed.
a) If an individual is selected, find the probability that the individual's pressure will be
between 120 and 121.8 mmHg.
b) If a sample of 30 adults is randomly selected, find the probability that the sample mean will
be between 120 and 121.8 mmHg.
Confidence Intervals
Estimation is the process of estimating the value of a parameter from information obtained from
a sample. An important aspect of estimation is the size of the sample. These will all be discussed in
this chapter.
Confidence Intervals for the Mean when $\sigma$ is known or $n \ge 30$
A point estimate is a specific numerical value estimate of a parameter. The best point
estimate of the population mean $\mu$ is the sample mean $\bar{X}$.
Sample measures are used to estimate population measures. These are called estimators.
A good estimator should satisfy three properties:
1. It must be unbiased; that is, its expected value must equal the parameter
being estimated.
2. It must be consistent.
3. It must be a relatively efficient estimator. Of all the statistics that can be used to estimate
a parameter, the relatively efficient estimator has the smallest variance.
For the most part, the point estimate will be different from the population mean due to sampling
error. There is no way of knowing how close the point estimate is to the population mean. For this
reason, statisticians prefer another type of estimate called an interval estimate.
An interval estimate of a parameter is an interval or a range of values used to estimate
the parameter. This estimate may or may not contain the value of the parameter being estimated. In an
interval estimate, the parameter is specified as being between two values. A degree of confidence
is assigned before an interval estimate is made. The confidence level is the probability that the
interval estimate will contain the parameter. Three common confidence intervals are used: the 90%,
the 95%, and the 99% confidence intervals.
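Formula for the Confidence Interval of the Mean for a Specific $\alpha$
$$\bar{X} - z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right) < \mu < \bar{X} + z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$$
For a 95% confidence interval, $z_{\alpha/2} = 1.96$, and for a 99% confidence interval,
$z_{\alpha/2} = 2.58$.
The relationship between $\alpha$ and the confidence level is that the stated confidence level is the
percentage equivalent of the decimal value of $1 - \alpha$, and vice versa. For example, when the
99% confidence interval is to be found, $\alpha = 0.01$, since $1 - 0.01 = 0.99$.
The term $z_{\alpha/2}\left(\frac{\sigma}{\sqrt{n}}\right)$ is called the maximum error of estimate.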
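The interval formula translates directly into a few lines of code. The following is a sketch under stated assumptions, not the text's own procedure. It obtains $z_{\alpha/2}$ from `statistics.NormalDist().inv_cdf` instead of a table, and it uses hypothetical data (sample mean 50, $\sigma = 6$, $n = 36$) so that $\sigma/\sqrt{n} = 1$ and the result is easy to check by hand. The function name `z_confidence_interval` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known or n >= 30."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    max_error = z_crit * sigma / sqrt(n)          # maximum error of estimate
    return x_bar - max_error, x_bar + max_error


# Hypothetical data: sample mean 50, sigma 6, n = 36.
low, high = z_confidence_interval(x_bar=50, sigma=6, n=36, confidence=0.95)
print(f"95% confidence interval: {low:.2f} < mu < {high:.2f}")
```

With these numbers the maximum error of estimate is about 1.96, so the interval runs from roughly 48.04 to 51.96.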
Example 1 A study of 40 bowlers showed that their average … The standard
deviation of the population is 6.
a) Find the 95% confidence interval of the mean score for all bowlers.
b) Find the 95% confidence interval of the mean score …
instead of a sample of 40.
It is necessary to determine the size of the sample to make an accurate estimate. It depends
on the maximum error of estimate, the population standard deviation, and the degree of confidence.
Formula for the Minimum Sample Size
The minimum sample size is determined by the formula
$$n = \left(\frac{z_{\alpha/2} \cdot \sigma}{E}\right)^2$$
where $E$ is the maximum error of estimate. If necessary, round the answer up to obtain
a whole number.
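A short sketch of the sample-size calculation follows. The inputs ($\sigma = 6$, $E = 2$) are hypothetical and chosen only to show the rounding-up step, and the function name `minimum_sample_size` is an assumption. The critical value comes from `statistics.NormalDist` in place of a table.

```python
from math import ceil
from statistics import NormalDist


def minimum_sample_size(sigma, max_error, confidence):
    """Smallest whole n with z_{alpha/2} * sigma / sqrt(n) <= max_error."""
    alpha = 1 - confidence
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return ceil((z_crit * sigma / max_error) ** 2)  # always round up


# Hypothetical inputs: sigma = 6, maximum error of estimate E = 2.
n_95 = minimum_sample_size(sigma=6, max_error=2, confidence=0.95)
n_99 = minimum_sample_size(sigma=6, max_error=2, confidence=0.99)
print(n_95, n_99)
```

Rounding up instead of to the nearest integer matters here: rounding down would give a maximum error slightly larger than the one requested.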
8.2 Confidence Intervals for the Mean when $\sigma$ is unknown and $n < 30$
When the population standard deviation is unknown and the sample size is less than 30,
the standard deviation from the sample can be used in place of the population standard deviation.
A different distribution, called the $t$ distribution, must be used when the variable is normally or
approximately normally distributed.
Properties of the t Distribution
The $t$ distribution is similar to the standard normal distribution in the following ways.
1. It is bell-shaped.
2. It is symmetrical about the mean.
3. The mean, median, and mode are equal to 0 and are located at the center of
the distribution.
4. The curve never touches the $x$-axis.
The $t$ distribution differs from the standard normal distribution in the following ways.
1. The variance is greater than 1.
2. The $t$ distribution is a family of curves based on the degrees of freedom,
which is a number related to the sample size.
3. As the sample size increases, the $t$ distribution approaches the standard normal
distribution.
The degrees of freedom are the number of values that are free to vary after a sample
statistic has been computed. The symbol d.f. will be used for degrees of freedom. The degrees of
freedom for the confidence interval for the mean is $n - 1$.
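Formula for a Specific Confidence Interval for the Mean when $\sigma$ is unknown and $n < 30$
$$\bar{X} - t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right) < \mu < \bar{X} + t_{\alpha/2}\left(\frac{s}{\sqrt{n}}\right)$$
$s$ = sample standard deviation; the degrees of freedom are $n - 1$.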
Example 1 Find the $t_{\alpha/2}$ for a 99% confidence interval when the sample size …
Example 2 The average hemoglobin reading for a sample of 20 WLSA teachers was 16 grams
per 100 milliliters, with a sample standard deviation of 2 grams. Find the
95% confidence interval of the true mean.
When to use the $z$ or $t$ distribution
• If $\sigma$ is known and $n \ge 30$, use $z_{\alpha/2}$.
• If $\sigma$ is unknown and $n \ge 30$, use $s$ in place of $\sigma$ with $z_{\alpha/2}$.
• If $\sigma$ is unknown and $n < 30$, use $s$ in place of $\sigma$ with $t_{\alpha/2}$.
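As an illustration of the small-sample case, the hemoglobin example above (sample size 20, sample mean 16, sample standard deviation 2) can be worked as follows. This is a sketch, not the text's own solution. Python's standard library has no $t$ quantile function, so the critical value 2.093 for 19 degrees of freedom at 95% confidence is entered by hand from a standard $t$ table. The function name `t_confidence_interval` is hypothetical.

```python
from math import sqrt


def t_confidence_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown and n < 30.

    `t_crit` is t_{alpha/2} with n - 1 degrees of freedom, looked up
    in a t table by the caller.
    """
    max_error = t_crit * s / sqrt(n)
    return x_bar - max_error, x_bar + max_error


# Table value (assumed input): t_{alpha/2} for 95% confidence, d.f. = 19.
T_CRIT_95_DF19 = 2.093

low, high = t_confidence_interval(x_bar=16, s=2, n=20, t_crit=T_CRIT_95_DF19)
print(f"95% confidence interval: {low:.2f} < mu < {high:.2f}")
```

Under these inputs the interval is approximately 15.06 to 16.94 grams per 100 milliliters.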
8.3 Confidence Intervals for Variances and Standard Deviation
In statistics, the variance and standard deviation of a variable are as important as the mean.
In order to calculate the confidence intervals for these two, a new statistical distribution will
be used: the chi-square distribution.
The chi-square variable is a family of curves based on the number of degrees of freedom. The
symbol of chi-square is $\chi^2$ (Greek letter chi, pronounced as “ki”). The chi-square distribution
is obtained from the values of $(n - 1)s^2/\sigma^2$ when random samples are selected from a normally
distributed population whose variance is $\sigma^2$.
A chi-square variable cannot be negative, and the distributions are positively skewed. The
area under each chi-square distribution is equal to 1.00 or 100%. Appendix D gives the values for
chi-square distributions. The degrees of freedom is equal to $n - 1$.
Example 1 Find the values for $\chi^2_{\text{right}}$ and $\chi^2_{\text{left}}$ for a 95% confidence interval when $n = 20$.
Confidence Interval for a Variance and a Standard Deviation
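Variance:
$$\frac{(n-1)s^2}{\chi^2_{\text{right}}} < \sigma^2 < \frac{(n-1)s^2}{\chi^2_{\text{left}}}$$
Standard Deviation:
$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\text{right}}}} < \sigma < \sqrt{\frac{(n-1)s^2}{\chi^2_{\text{left}}}}$$
degrees of freedom = $n - 1$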
Example 2 Find the 99% confidence interval for the variance and standard deviation for the
lifetime of batteries if a sample of 20 batteries has a standard deviation of 1.7 …
Assume that the variable is normally distributed.
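One way the battery example could be worked is sketched below. This is an illustration, not the text's own solution. The two chi-square critical values for 19 degrees of freedom at 99% confidence (38.582 on the right, 6.844 on the left) are entered by hand from a standard chi-square table, since the Python standard library has no chi-square quantile function. The function name `variance_interval` is hypothetical, and the results are in whatever units the sample standard deviation of 1.7 was measured in.

```python
from math import sqrt


def variance_interval(s, n, chi2_right, chi2_left):
    """Confidence interval for the variance from table chi-square values."""
    numerator = (n - 1) * s ** 2
    return numerator / chi2_right, numerator / chi2_left


# Table values (assumed inputs) for 99% confidence, d.f. = 19.
CHI2_RIGHT = 38.582
CHI2_LEFT = 6.844

var_low, var_high = variance_interval(s=1.7, n=20,
                                      chi2_right=CHI2_RIGHT,
                                      chi2_left=CHI2_LEFT)
# The interval for the standard deviation takes square roots of both ends.
sd_low, sd_high = sqrt(var_low), sqrt(var_high)

print(f"variance:           {var_low:.1f} < sigma^2 < {var_high:.1f}")
print(f"standard deviation: {sd_low:.1f} < sigma < {sd_high:.1f}")
```

Note that the larger chi-square value goes in the denominator of the lower limit, which is why the right-tail value appears on the left side of the inequality.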