Estimation & Hypothesis Testing: Decisions (Inferences)
LECTURE 2
Introduction
In this lecture, we begin the study of the second branch of statistics, which is named
"INFERENTIAL STATISTICS".
Inferential statistics is the branch concerned with making decisions (inferences) about
population parameters by applying some basic statistical methods (techniques).
Inferential statistics is divided into two basic areas of study:
1. Estimation: the process of estimating the value of a population parameter using the
information known about a sample taken from this population.
2. Hypothesis Testing: the process of testing claims about the population parameters,
which may or may not be true, in order to support decision making.
The most common population parameters used in inferential statistics are "mean, proportion,
variance, and standard deviation".
In both estimation and hypothesis testing, sample statistics are used to estimate the
population parameters. These statistics are called estimators; see the following table:
MEASURES      Mean         Variance      Proportion
Population    $\mu$        $\sigma^2$    $p$
Sample        $\bar{x}$    $s^2$         $\hat{p}$
A good estimator should satisfy three basic properties:
(1) It is unbiased: the expected (mean) value of the estimator is equal to the
corresponding population parameter.
(2) It is consistent: as the sample size increases, the value of the estimator
approaches the corresponding population parameter.
(3) It is relatively efficient: of all the statistics that can be used to estimate the
population parameter, the estimator has the smallest variance.
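The first two properties can be illustrated with a small simulation. The sketch below is only an illustration, not part of the lecture: it assumes a hypothetical normal population (the values of `MU` and `SIGMA` are made up) and uses the sample mean as the estimator. It does not address relative efficiency.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so repeated runs give the same numbers

MU, SIGMA = 50.0, 10.0  # hypothetical population mean and standard deviation


def simulate_sample_means(n, repeats=1000):
    """Draw `repeats` random samples of size n and return their sample means."""
    return [
        sum(random.gauss(MU, SIGMA) for _ in range(n)) / n
        for _ in range(repeats)
    ]


results = {}
for n in (5, 50, 500):
    means = simulate_sample_means(n)
    # Average of the sample means (unbiasedness) and their spread (consistency).
    results[n] = (mean(means), stdev(means))
    print(f"n = {n:3d}: average of sample means = {results[n][0]:.2f}, "
          f"spread = {results[n][1]:.2f}")
```

For every sample size the average of the sample means should stay close to the assumed population mean (unbiasedness), while their spread should shrink as n grows (consistency).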
Some assumptions must hold before making decisions (inferences) about the population
parameters:
The samples must be randomly selected.
The sample size is greater than or equal to 30 (so that, by the central limit theorem, the sample mean is approximately normally distributed).
If the sample size is less than 30, the population must be approximately normally distributed.
Remark
When the sample size increases, the margin of error will decrease.
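For a 90% confidence interval, we have $z_{\alpha/2} = 1.65$. (Proof Later)
For a 95% confidence interval, we have $z_{\alpha/2} = 1.96$. (Proof Later)
For a 99% confidence interval, we have $z_{\alpha/2} = 2.58$. (Proof Later)
The confidence interval for the population mean ($\mu$), when the population standard deviation ($\sigma$) is known, can be written as
$$\text{C.I} = \bar{x} \pm z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}},$$
where $E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$ is the margin of error.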
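The critical values quoted in the remark above can be checked numerically. The following is a minimal sketch, assuming Python 3.8 or later (for `statistics.NormalDist`); the function names and the value `sigma = 10.0` are illustrative choices, not part of the lecture.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value z_{alpha/2} for a confidence level such as 0.95."""
    alpha = 1.0 - confidence
    # z_{alpha/2} leaves an area of alpha/2 in the upper tail of the standard normal.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def margin_of_error(confidence, sigma, n):
    """Margin of error E = z_{alpha/2} * sigma / sqrt(n) when sigma is known."""
    return z_critical(confidence) * sigma / sqrt(n)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.3f}")

# Illustrative check that the margin of error shrinks as n grows (sigma is made up).
for n in (25, 100, 400):
    print(f"n = {n:3d}: E = {margin_of_error(0.95, 10.0, n):.2f}")
```

To three decimals the critical values are 1.645, 1.960 and 2.576; the lecture uses the rounded table values 1.65, 1.96 and 2.58.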
Example 1
A researcher wishes to estimate the number of days it takes an automobile dealer to sell a Chevrolet
Aveo. A sample of 50 cars had a mean time on the dealer’s lot of 54 days. Assume the population
standard deviation to be 6.0 days. Find the best point estimate of the population mean and the
95% confidence interval of the population mean.
Solution
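Since we have
$$n = 50, \quad \bar{x} = 54, \quad \sigma = 6, \quad z_{\alpha/2} = 1.96,$$
the best point estimate of the population mean is given as
$$\bar{x} = 54 \text{ days}.$$
The 95% confidence interval of the population mean is given as
$$\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
$$54 - (1.96) \cdot \frac{6}{\sqrt{50}} < \mu < 54 + (1.96) \cdot \frac{6}{\sqrt{50}}$$
$$54 - 1.7 < \mu < 54 + 1.7$$
$$52.3 < \mu < 55.7.$$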
Hence, with 95% confidence, the interval (52.3, 55.7) will contain the population mean.
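The calculation in Example 1 can be reproduced with a few lines of code. This is a minimal sketch; the function name `z_confidence_interval` is a hypothetical helper introduced here, and the critical value is passed in as read from the table.

```python
from math import sqrt


def z_confidence_interval(x_bar, sigma, n, z_crit):
    """Return (lower, upper) for the population mean when sigma is known."""
    margin = z_crit * sigma / sqrt(n)  # margin of error E
    return x_bar - margin, x_bar + margin


# Data from Example 1: n = 50, sample mean 54 days, sigma = 6.0 days, 95% confidence.
lower, upper = z_confidence_interval(x_bar=54, sigma=6.0, n=50, z_crit=1.96)
print(f"{lower:.1f} < mu < {upper:.1f}")  # prints: 52.3 < mu < 55.7
```

The same helper applies to any other case with a known population standard deviation by changing the arguments.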
Example 2
A survey of 30 emergency room patients found that the average waiting time for treatment was
174.3 minutes. Assuming that the population standard deviation is 46.5 minutes, find the best
point estimate of the population mean and the 99% confidence interval of the population mean.
Solution
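Since we have
$$n = 30, \quad \bar{x} = 174.3, \quad \sigma = 46.5, \quad z_{\alpha/2} = 2.58,$$
the best point estimate of the population mean is given as
$$\bar{x} = 174.3 \text{ minutes}.$$
The 99% confidence interval of the population mean is given as
$$\bar{x} - z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} < \mu < \bar{x} + z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$$
$$174.3 - (2.58) \cdot \frac{46.5}{\sqrt{30}} < \mu < 174.3 + (2.58) \cdot \frac{46.5}{\sqrt{30}}$$
$$174.3 - 21.9 < \mu < 174.3 + 21.9$$
$$152.4 < \mu < 196.2.$$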
Hence, we can be 99% confident that the mean waiting time for treatment for all emergency
room patients is between 152.4 and 196.2 minutes.
Sample Size
An important question in estimating the population mean is: how large should the sample be
in order to make an accurate estimate?
The answer is not simple because it depends on three factors: the margin of error, the
population standard deviation, and the degree of confidence.
To find the sample size that helps researchers make an accurate estimate of
the population mean, we solve the margin of error formula for $n$ as follows:
$$E = z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} \;\Rightarrow\; E \cdot \sqrt{n} = z_{\alpha/2} \cdot \sigma \;\Rightarrow\; \sqrt{n} = \frac{z_{\alpha/2} \cdot \sigma}{E} \;\Rightarrow\; n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2.$$
That is, the minimum sample size needed for an interval estimate of the population mean
is given by the formula
$$n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2.$$
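The sample size formula can be sketched in code as follows. The function name `minimum_sample_size` and the input values in the usage line are illustrative assumptions, not taken from the lecture. Because a sample size must be a whole number, the sketch rounds the result up.

```python
from math import ceil


def minimum_sample_size(z_crit, sigma, margin):
    """Minimum n from n = (z_crit * sigma / margin)^2, rounded up to a whole number."""
    return ceil((z_crit * sigma / margin) ** 2)


# Illustrative values: 95% confidence, sigma = 15, desired margin of error E = 3.
print(minimum_sample_size(z_crit=1.96, sigma=15.0, margin=3.0))  # prints 97
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the requested value.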
Example 3
A scientist wishes to estimate the average depth of a river. He wants to be 99% confident that the
estimate is accurate within 2 feet. From a previous study, the standard deviation of the depths
measured was 4.33 feet. Find the minimum sample size needed to do this.
Solution
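Since we have
$$z_{\alpha/2} = 2.58, \quad E = 2, \quad \sigma = 4.33,$$
the minimum sample size needed to estimate the average depth of the river is given by
$$n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2 = \left( \frac{(2.58)(4.33)}{2} \right)^2 = 31.2 \approx 32.$$
Hence, rounding up to the next whole number, the scientist needs a sample of at least 32 depth measurements.
Example 4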
A researcher wishes to estimate within $300 the true average amount of money a county spends
on road repairs each year. If she wants to be 90% confident, how large a sample is necessary? The
standard deviation is known to be $900.
Solution
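Since we have
$$z_{\alpha/2} = 1.65, \quad E = 300, \quad \sigma = 900,$$
the sample size necessary to estimate the average amount of money is given by
$$n = \left( \frac{z_{\alpha/2} \cdot \sigma}{E} \right)^2 = \left( \frac{(1.65)(900)}{300} \right)^2 = 24.5 \approx 25.$$
Hence, rounding up to the next whole number, the researcher needs a sample of at least 25.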
Theorem
The confidence interval for the population mean ($\mu$) when the population standard deviation ($\sigma$)
is unknown is given by the formula
$$\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}},$$
where
$\bar{x}$ = the sample mean and $n$ = the sample size,
$s$ = the sample standard deviation (known or calculated),
$t_{\alpha/2}$ = the critical value read from the t distribution table.
The values of $t_{\alpha/2}$ are found according to the degrees of freedom (d.f. $= n - 1$) and the confidence level.
Example 5
Find the value of $t_{\alpha/2}$ for a 95% confidence interval when the sample size is 22.
Solution
Since we have
Confidence Level = 95% and d.f. $= n - 1 = 22 - 1 = 21$,
then, from the t distribution table, we get
$$t_{\alpha/2} = 2.080.$$
Example 6
Ten randomly selected people were asked how long they slept at night. The mean time for that
sample was 7.1 hours, and the standard deviation was 0.78 hours. Find the 95% confidence interval
of the population mean time. Assume the variable is normally distributed.
Solution
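Since we have
$$n = 10, \quad \bar{x} = 7.1, \quad s = 0.78, \quad \text{d.f.} = n - 1 = 10 - 1 = 9, \quad t_{\alpha/2} = 2.262,$$
the 95% confidence interval of the population mean is given as
$$\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$
$$7.1 - (2.262) \cdot \frac{0.78}{\sqrt{10}} < \mu < 7.1 + (2.262) \cdot \frac{0.78}{\sqrt{10}}$$
$$7.1 - 0.56 < \mu < 7.1 + 0.56$$
$$6.54 < \mu < 7.66.$$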
Hence, we can be 95% confident that the population mean time is between 6.54 and 7.66 hours.
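The calculation in Example 6 can be reproduced in code. This is a minimal sketch: the function name `t_confidence_interval` is a hypothetical helper, and the critical value is supplied by hand from the t distribution table, since the Python standard library has no t distribution.

```python
from math import sqrt


def t_confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for the population mean when sigma is unknown.

    t_crit is the table value of t_{alpha/2} with n - 1 degrees of freedom.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Data from Example 6: n = 10, sample mean 7.1, s = 0.78, d.f. = 9, 95% confidence.
lower, upper = t_confidence_interval(x_bar=7.1, s=0.78, n=10, t_crit=2.262)
print(f"{lower:.2f} < mu < {upper:.2f}")  # prints: 6.54 < mu < 7.66
```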
Example 7
The following data represent a sample of the number of home fires started by candles for the past
seven years.
5460 5900 6090 6310 7160 8440 9930
Find the 99% confidence interval for the mean number of home fires started by candles in all years.
Solution
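Since
$$\bar{x} = \frac{\sum x_i}{n} = \frac{5460 + 5900 + \cdots + 9930}{7} = \frac{49{,}290}{7} = 7041.4,$$
$$s^2 = \frac{\sum x_i^2 - n\bar{x}^2}{n - 1} = \frac{362{,}629{,}900 - 7(49{,}290/7)^2}{7 - 1} = \frac{15{,}557{,}885.71}{6} = 2{,}592{,}980.95,$$
$$s = \sqrt{s^2} = \sqrt{2{,}592{,}980.95} = 1610.3,$$
where the unrounded mean $49{,}290/7$ is used in $s^2$ to avoid rounding error.
Now, we have
$$n = 7, \quad \bar{x} = 7041.4, \quad s = 1610.3, \quad \text{d.f.} = n - 1 = 7 - 1 = 6, \quad t_{\alpha/2} = 3.707.$$
Then, the 99% confidence interval of the population mean is given as
$$\bar{x} - t_{\alpha/2} \cdot \frac{s}{\sqrt{n}} < \mu < \bar{x} + t_{\alpha/2} \cdot \frac{s}{\sqrt{n}}$$
$$7041.4 - (3.707) \cdot \frac{1610.3}{\sqrt{7}} < \mu < 7041.4 + (3.707) \cdot \frac{1610.3}{\sqrt{7}}$$
$$7041.4 - 2256.2 < \mu < 7041.4 + 2256.2$$
$$4785.2 < \mu < 9297.6.$$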
Hence, we can be 99% confident that the mean number of home fires started by candles in all years
is between 4785.2 and 9297.6 fires.
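When the raw data are available, as in Example 7, the sample mean and sample standard deviation can be computed directly. The sketch below assumes Python's standard `statistics` module, whose `stdev` divides by n - 1, and takes the critical value 3.707 from the t table as in the example. The variable names are illustrative.

```python
from math import sqrt
from statistics import mean, stdev

# Number of home fires started by candles in each of the seven years (Example 7).
fires = [5460, 5900, 6090, 6310, 7160, 8440, 9930]

n = len(fires)
x_bar = mean(fires)
s = stdev(fires)   # sample standard deviation (divides by n - 1)
t_crit = 3.707     # table value for 99% confidence with d.f. = n - 1 = 6

margin = t_crit * s / sqrt(n)
lower, upper = x_bar - margin, x_bar + margin

print(f"x_bar = {x_bar:.1f}, s = {s:.1f}, margin = {margin:.1f}")
print(f"{lower:.1f} < mu < {upper:.1f}")
```

Because the code keeps full precision in the intermediate steps, its bounds can differ from the hand calculation in the last decimal place.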
Completed, praise be to God.