Chapter 3 Estimation
ESTIMATION
Definition
• A procedure by which a numerical value (or values) is assigned to a
population parameter (such as the mean, median, mode, variance or
standard deviation) based on information collected from a sample.
• Eg. 2:
Suppose the Malaysian Census Bureau takes a sample of 10,000 households and
finds that the mean housing expenditure per month, $\bar{x}$, is RM1370. If the
bureau assigns this value to the population mean, then RM1370 is called an
estimate of μ. The sample statistic used to estimate a population parameter is
called an estimator. Thus the sample mean $\bar{x}$ is an estimator of the
population mean μ.
Properties of good estimator
• Unbiased estimator - the estimator should be “close” in some sense to
the true value of the unknown parameter. That is, the expected value
(the mean of the estimates obtained from samples of a given size) is
equal to the parameter being estimated.
• Consistent estimator - as the sample size increases, the value of the
estimator approaches the value of the parameter being estimated.
• Relatively efficient estimator – of all the statistics that can be used to
estimate a parameter, the relatively efficient estimator has the
smallest variance.
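As a worked illustration of the first two properties for the sample mean, the derivation below assumes the observations $X_1, \dots, X_n$ are independent with common mean μ and variance σ² (an assumption made here for the sketch, not stated in the definitions above):

```latex
\begin{align*}
E(\bar{X}) &= \frac{1}{n}\sum_{i=1}^{n} E(X_i) = \mu
  && \text{(unbiased)} \\
\operatorname{Var}(\bar{X}) &= \frac{1}{n^{2}}\sum_{i=1}^{n} \operatorname{Var}(X_i)
  = \frac{\sigma^{2}}{n} \;\to\; 0 \ \text{as } n \to \infty
  && \text{(consistent)}
\end{align*}
```

Because the sample mean is unbiased and its variance shrinks to zero as n grows, its value concentrates around μ for large samples.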
Types of estimation
A. Point estimate
A point estimate is a single number used to estimate a population
parameter. The best point estimate of the population mean μ is the
sample mean $\bar{x}$.
Eg: the bureau can state that the mean housing expenditure per
month μ for all households is about RM1370.
The point estimate of a population parameter = the value of the
corresponding sample statistic.
The standard error of $\bar{X}$ and the estimated standard error are, respectively,
$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$ or $\hat{\sigma}_{\bar{x}} = \dfrac{S}{\sqrt{n}}$
• Eg 1: An article in the Journal of Heat and Mass Transfer described a new
method of measuring the thermal conductivity of Armco iron. Using a
temperature of 100°F and a power input of 550 watts, the following 10
measurements of thermal conductivity were obtained:
41.60, 41.48, 42.34, 41.95, 41.86,
42.18, 41.72, 42.26, 41.81, 42.04
a) Find the point estimate of the mean thermal conductivity at 100°F and
550 watts power input.
b) Find the standard error of the sample mean.
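A minimal sketch of parts (a) and (b) for the thermal conductivity data, using only Python's standard library. The variable names are illustrative, and the standard error is the estimated one (s/√n), since σ is not given.

```python
import math
import statistics

# The 10 thermal conductivity measurements from the example.
conductivity = [41.60, 41.48, 42.34, 41.95, 41.86,
                42.18, 41.72, 42.26, 41.81, 42.04]

n = len(conductivity)
x_bar = statistics.mean(conductivity)   # (a) point estimate of the mean
s = statistics.stdev(conductivity)      # sample standard deviation (n - 1 divisor)
se = s / math.sqrt(n)                   # (b) estimated standard error of the mean

print(f"point estimate = {x_bar:.3f}")            # 41.924
print(f"estimated standard error = {se:.4f}")     # 0.0898
```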
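Point estimate $\bar{x}$ ± margin of error E
Upper bound: $\mu \le \bar{x} + z_{\alpha}\,\dfrac{\sigma}{\sqrt{n}}$
Lower bound: $\mu \ge \bar{x} - z_{\alpha}\,\dfrac{\sigma}{\sqrt{n}}$
Sample size: $n = \left(\dfrac{z_{\alpha/2}\,\sigma}{E}\right)^{2}$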
Sol:
Eg 6: A company that produces detergents wants to estimate the mean
amount of detergent in 64-ounce jugs at a 99% confidence level. The
company knows that the standard deviation of the amount of the
detergent in all such jugs is 0.2 ounces. How large a sample should the
company take so that the estimate is within 0.04 ounces of the
population mean?
Sol:
Eg 7: A scientist wishes to estimate the average depth of a river. He
wants to be 98% confident that the estimate is accurate within 2 feet.
From a previous study, the standard deviation of the depths measured
was 4.33 feet. How large a sample is required?
Sol:
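The sample size calculations in Eg 6 and Eg 7 can be sketched as follows. The function name `sample_size_for_mean` is hypothetical, and the critical value comes from `statistics.NormalDist` rather than a printed z table, so a table value rounded to two decimals (for example 2.58 at 99%) can give a result one unit larger.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(sigma, margin, confidence):
    """Smallest n with z_{alpha/2} * sigma / sqrt(n) <= margin (sigma known)."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    # n must be a whole number, so always round up.
    return math.ceil((z * sigma / margin) ** 2)

# Eg 6: sigma = 0.2 ounces, within 0.04 ounces, 99% confidence.
n_detergent = sample_size_for_mean(sigma=0.2, margin=0.04, confidence=0.99)
# Eg 7: sigma = 4.33 feet, within 2 feet, 98% confidence.
n_river = sample_size_for_mean(sigma=4.33, margin=2, confidence=0.98)

print(n_detergent, n_river)   # 166 26
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the target E.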
Confidence interval for the mean: σ Unknown
• If the population standard deviation σ is not known, we use the sample
standard deviation s in its place.
• The value of $t_{\alpha/2}$ is obtained from the t distribution table with n − 1
degrees of freedom, and the interval is $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$.
B. ONE-SIDED/UPPER BOUND CONFIDENCE INTERVAL
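• If $\bar{x}$ and s are the mean and standard deviation of a random sample
of size n from a normal population with UNKNOWN variance σ², a
100(1−α)% upper confidence bound on μ is given by
$\mu \le \bar{x} + t_{\alpha,\,n-1}\,\dfrac{s}{\sqrt{n}}$
and a 100(1−α)% lower confidence bound on μ is given by
$\mu \ge \bar{x} - t_{\alpha,\,n-1}\,\dfrac{s}{\sqrt{n}}$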
• Differences
The t-distribution differs from the standard normal distribution in the following ways.
i. The variance is greater than 1.
ii. The t-distribution is actually a family of curves based on the concept of degrees of
freedom, which is related to sample size.
iii. As the sample size increases, the t-distribution approaches the standard normal
distribution.
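To illustrate how the t-distribution approaches the standard normal distribution, the sketch below compares a few upper 2.5% points of t with $z_{0.025}$. The t values are copied from a standard t table (they are not computed here), and the choice of degrees of freedom is arbitrary.

```python
from statistics import NormalDist

# Upper 2.5% points t_{0.025, df}, copied from a standard t table.
t_table = {5: 2.571, 10: 2.228, 30: 2.042, 120: 1.980}

z = NormalDist().inv_cdf(0.975)   # z_{0.025}, about 1.960

for df, t in t_table.items():
    # The gap t - z shrinks as the degrees of freedom grow.
    print(f"df = {df:>3}: t = {t:.3f}, t - z = {t - z:.3f}")
```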
Many statistical distributions use the concept of degrees of freedom, and the
formulas for finding the degrees of freedom vary for different statistical
tests. The degrees of freedom are the number of values that are free to vary
after a sample statistic has been computed.
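A small illustration of "free to vary": once the sample mean of five values is fixed, only four values can be chosen freely and the fifth is forced. The numbers and the helper name `last_value` are made up for this sketch.

```python
def last_value(known_values, sample_mean):
    """Value forced on the final observation once the sample mean is fixed."""
    n = len(known_values) + 1
    # The n values must sum to n * mean, so the last one is whatever is left.
    return n * sample_mean - sum(known_values)

free_values = [8, 12, 9, 11]                       # four values chosen freely
print(last_value(free_values, sample_mean=10))     # 10
```

With n = 5 and the mean fixed, there are n − 1 = 4 degrees of freedom.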
Eg 9: Ten randomly selected automobiles were stopped, and the tread depth
of the right front tyre was measured. The mean was 0.32 inch, and the
standard deviation was 0.08 inch. Find the 95% confidence interval of the
mean depth. Assume that the variable is approximately normally distributed.
Sol:
Eg 10: Johan, the manager of the paint store, wants to estimate the
mean amount of product sold per day. Twenty business days are
monitored, and an average of 32 litres is sold daily. The sample
standard deviation is 12 litres. Calculate the confidence limits at the 95%
confidence level.
Sol:
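Eg 9 and Eg 10 both give summary statistics with σ unknown and n < 30, so a t interval applies. The sketch below is one way to carry out the arithmetic. The function name `t_interval` is hypothetical, and the critical values are read from a standard t table with n − 1 degrees of freedom rather than computed.

```python
import math

def t_interval(x_bar, s, n, t_crit):
    """Two-sided interval x_bar +/- t_crit * s / sqrt(n) (sigma unknown)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

# Eg 9: n = 10, so df = 9 and t_{0.025, 9} = 2.262 (table value).
tread = t_interval(x_bar=0.32, s=0.08, n=10, t_crit=2.262)
# Eg 10: n = 20, so df = 19 and t_{0.025, 19} = 2.093 (table value).
paint = t_interval(x_bar=32, s=12, n=20, t_crit=2.093)

print(f"Eg 9:  {tread[0]:.3f} to {tread[1]:.3f} inch")     # 0.263 to 0.377
print(f"Eg 10: {paint[0]:.2f} to {paint[1]:.2f} litres")   # 26.38 to 37.62
```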
Eg 11: An article in the journal Materials Engineering (1989, Vol. II,
pp. 275-281) described the results of tensile adhesion tests on 22 U-700
alloy specimens. The loads at specimen failure are as follows (in
megapascals):
19.8 10.1 14.9 7.5 15.4 15.4
15.4 18.5 7.9 12.7 11.9 11.4
11.4 14.1 17.6 16.7 15.8
19.5 8.8 13.6 11.9 11.4
Find a 95% confidence interval for the mean load at failure in the tensile
adhesion test on U-700 alloy specimens.
Sol:
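For Eg 11 the sample mean and standard deviation must first be computed from the raw data. The following is a sketch using the standard library, with the critical value $t_{0.025,\,21} = 2.080$ taken from a standard t table (an input assumed here, not derived in code).

```python
import math
import statistics

# Loads at specimen failure (megapascals) for the 22 specimens.
loads = [19.8, 10.1, 14.9, 7.5, 15.4, 15.4,
         15.4, 18.5, 7.9, 12.7, 11.9, 11.4,
         11.4, 14.1, 17.6, 16.7, 15.8,
         19.5, 8.8, 13.6, 11.9, 11.4]

n = len(loads)
x_bar = statistics.mean(loads)
s = statistics.stdev(loads)        # sample standard deviation (n - 1 divisor)

t_crit = 2.080                     # t_{0.025, 21} from a t table
margin = t_crit * s / math.sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"n = {n}, mean = {x_bar:.2f}, s = {s:.2f}")          # n = 22, mean = 13.71, s = 3.55
print(f"95% CI: {lower:.2f} to {upper:.2f} megapascals")    # 12.14 to 15.29
```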
• For Eg 9-11, construct the 99% upper-bound and lower-bound confidence
intervals.
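A sketch of the one-sided calculation for the two examples that give summary statistics (Eg 9 and Eg 10). A one-sided bound uses $t_{\alpha}$ rather than $t_{\alpha/2}$, so at 99% the table values are $t_{0.01,\,9} = 2.821$ and $t_{0.01,\,19} = 2.539$. The function name is hypothetical, and the critical values are table lookups, not computed.

```python
import math

def one_sided_bounds(x_bar, s, n, t_crit):
    """Return (lower bound, upper bound); each is a separate one-sided statement."""
    margin = t_crit * s / math.sqrt(n)   # t_crit is t_{alpha, n-1}, not t_{alpha/2}
    return x_bar - margin, x_bar + margin

# Eg 9: n = 10, df = 9, t_{0.01, 9} = 2.821.
tread_lower, tread_upper = one_sided_bounds(0.32, 0.08, 10, t_crit=2.821)
# Eg 10: n = 20, df = 19, t_{0.01, 19} = 2.539.
paint_lower, paint_upper = one_sided_bounds(32, 12, 20, t_crit=2.539)

print(f"Eg 9:  mu >= {tread_lower:.3f}   or   mu <= {tread_upper:.3f}")
print(f"Eg 10: mu >= {paint_lower:.2f}   or   mu <= {paint_upper:.2f}")
```

The lower and upper bounds are each reported on their own at 99% confidence. Taken together they would form a 98% two-sided interval, not a 99% one.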
GUIDELINES
Guideline: when to use the z or t distribution
Is σ known?
  Yes: Use $z_{\alpha/2}$ values no matter what the sample size is.*
  No: Is n ≥ 30?
    Yes: Use $z_{\alpha/2}$ values and s in place of σ in the formula.
    No: Use $t_{\alpha/2}$ values and s in the formula.**
* Variable must be normally distributed when n < 30.
** Variable must be approximately normally distributed.
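The decision guideline can be written as a small function. This is only a sketch of the flowchart logic. The function name and the returned labels are illustrative, and the normality conditions in the footnotes still have to be checked separately.

```python
def choose_critical_value(sigma_known, n):
    """Return (distribution, spread estimate) following the z-or-t guideline."""
    if sigma_known:
        return ("z", "sigma")   # any sample size (normal population if n < 30)
    if n >= 30:
        return ("z", "s")       # large sample: s stands in for sigma
    return ("t", "s")           # small sample, sigma unknown: t with n - 1 df

print(choose_critical_value(sigma_known=True, n=12))    # ('z', 'sigma')
print(choose_critical_value(sigma_known=False, n=64))   # ('z', 's')
print(choose_critical_value(sigma_known=False, n=10))   # ('t', 's')
```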
Confidence intervals for the variance
If s² is the sample variance from a random sample of n observations
from a normal distribution with unknown variance σ², then a 100(1−α)%
confidence interval on σ² is
$\dfrac{(n-1)s^{2}}{\chi^{2}_{\alpha/2,\,n-1}} \le \sigma^{2} \le \dfrac{(n-1)s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}$
where $\chi^{2}_{\alpha/2,\,n-1}$ and $\chi^{2}_{1-\alpha/2,\,n-1}$ are the upper and lower 100α/2
percentage points of the chi-square distribution with n − 1 degrees of
freedom.
Confidence intervals for the standard deviation
If s is the sample standard deviation from a random sample of n
observations from a normal distribution with unknown variance σ²,
then a 100(1−α)% confidence interval on σ is
$\sqrt{\dfrac{(n-1)s^{2}}{\chi^{2}_{\alpha/2,\,n-1}}} \le \sigma \le \sqrt{\dfrac{(n-1)s^{2}}{\chi^{2}_{1-\alpha/2,\,n-1}}}$
where $\chi^{2}_{\alpha/2,\,n-1}$ and $\chi^{2}_{1-\alpha/2,\,n-1}$ are the upper and lower 100α/2
percentage points of the chi-square distribution with n − 1 degrees of
freedom.
• To calculate these confidence intervals, a new statistical distribution is
needed. It is called the chi-square distribution.
• The chi-square variable is similar to the t variable in that its
distribution is a family of curves based on the number of degrees of
freedom.
Characteristics of the χ² distribution
Eg 13: Find the 95% confidence interval for the variance and standard deviation of the
nicotine content of cigarettes manufactured if a sample of 20 cigarettes has a standard
deviation of 1.6 mg.
Sol:
Hence, you can be 95% confident that the true variance of the nicotine content is between
1.5 and 5.5 mg², and that the true standard deviation of the nicotine content of all
cigarettes manufactured is between 1.2 and 2.3 mg, based on a sample of 20 cigarettes.
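The arithmetic behind the Eg 13 result can be sketched as below. The function name is hypothetical, and the chi-square percentage points for 19 degrees of freedom ($\chi^2_{0.025} = 32.852$, $\chi^2_{0.975} = 8.907$) are taken from a standard chi-square table rather than computed.

```python
import math

def variance_interval(s, n, chi2_upper, chi2_lower):
    """CI on sigma^2: (n-1)s^2 / chi2_upper  to  (n-1)s^2 / chi2_lower."""
    numerator = (n - 1) * s ** 2
    # Dividing by the larger chi-square value gives the smaller endpoint.
    return numerator / chi2_upper, numerator / chi2_lower

# Eg 13: n = 20, s = 1.6 mg, df = 19, table values for a 95% interval.
var_low, var_high = variance_interval(s=1.6, n=20,
                                      chi2_upper=32.852, chi2_lower=8.907)
sd_low, sd_high = math.sqrt(var_low), math.sqrt(var_high)

print(f"variance: {var_low:.1f} to {var_high:.1f}")   # 1.5 to 5.5
print(f"std dev:  {sd_low:.1f} to {sd_high:.1f} mg")  # 1.2 to 2.3
```

The interval for σ is obtained by taking square roots of the endpoints of the interval for σ².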
Past Year’s Questions
Q3 (b) Mid Term Exam Sem 2 2014/2015
An Izod impact test was performed on 30 specimens of PVC pipe. The sample
mean is $\bar{x}$ = 0.25 and the sample standard deviation is s = 0.25. Find a 99%
confidence interval for the variance of the Izod impact strength and interpret
the finding.
i. Find a 95% confidence interval for the mean of the beam frequency.
Hence, can you conclude that the mean of the beam frequency is 230
hertz? Explain.
ii. Construct a 98% confidence interval for the standard deviation of the
beam frequency.
Q2(a) Sem 1 2015/2016
The data below show the IQ scores of eight individuals, each selected
from among the youngest members of a family.
IQ scores: 131 119 103 93 108 100 111 130
i. Estimate the sample mean.
ii. Construct the 95% confidence interval for the population mean.
Q4(d) Sem 1 2015/2016
A health care professional wishes to estimate the mean birth weight of
infants. How large a sample must be obtained if she desires to be 90%
confident that the true mean is within 2 kg of the sample mean?
Assume σ = 8 kg.
Q4(e) Sem 1 2015/2016
The weights (in kg) of 8 adult males are given as follows. Construct
a 90% confidence interval to estimate the variance of weight for all
adult males.