Estimation
Testing of Hypothesis
ESTIMATION Terms
In the theory of estimation, a STATISTIC is called an ESTIMATOR (for example, the sample mean $\bar{x}$).
Point Estimate
is a SINGLE VALUE of the estimator, obtained from available
sample observations
Example: the proportion of vegetarians in a random sample of 50
PGP students is a point estimate of the corresponding
proportion in the population of all PGP students.
Interval Estimate (Confidence Interval)
is an INTERVAL that provides an upper and lower bound for a
specific unknown population parameter.
Example: we can be 95% confident that the interval (45%, 52%) contains
the true proportion of vegetarians among all PGP students.
The point estimate always lies within the interval estimate.
POINT ESTIMATION
Bias
An unbiased estimator is on target on average.
A biased estimator is off target on average.
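The sample mean is an unbiased estimator of the population mean:
$$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} E(x) = E(x) = \mu$$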
The sample variance with divisor n is not an unbiased estimator of the population variance. That is why,
when the mean and variance are both unknown, the sample variance is computed with divisor n − 1
instead.
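Let $T = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2$. Then
$$
\begin{aligned}
E(T) &= E\left(\frac{1}{n}\sum_{i=1}^{n} x_i^2\right) - E(\bar{x}^2) \\
     &= \frac{1}{n}\sum_{i=1}^{n}\left[\mathrm{var}(x_i) + (E(x_i))^2\right] - \left[\mathrm{var}(\bar{x}) + (E(\bar{x}))^2\right] \\
     &= \frac{1}{n}\sum_{i=1}^{n}\left[\sigma^2 + \mu^2\right] - \left[\frac{\sigma^2}{n} + \mu^2\right] \\
     &= \sigma^2 - \frac{\sigma^2}{n} = \frac{n-1}{n}\,\sigma^2
\end{aligned}
$$
The unbiased estimator of the population variance is therefore
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$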
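The bias can also be seen numerically. The sketch below is a minimal simulation and is not part of the original slides: it assumes normally distributed data with variance 4 and samples of size n = 5 (illustrative choices, as are the function names), and averages both variance estimators over many samples.

```python
import random

def variance_estimates(sample):
    """Return (divisor-n estimate, divisor-(n-1) estimate) for one sample."""
    n = len(sample)
    mean = sum(sample) / n
    ss = sum((x - mean) ** 2 for x in sample)  # sum of squared deviations
    return ss / n, ss / (n - 1)

def average_estimates(n=5, sigma=2.0, trials=20000, seed=0):
    """Average both estimators over many simulated samples (assumed normal data)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total_biased = 0.0
    total_unbiased = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        biased, unbiased = variance_estimates(sample)
        total_biased += biased
        total_unbiased += unbiased
    return total_biased / trials, total_unbiased / trials

avg_T, avg_s2 = average_estimates()
print(f"average with divisor n:     {avg_T:.3f}")
print(f"average with divisor n - 1: {avg_s2:.3f}")
```

With these settings the divisor-n average should settle near (n − 1)/n × 4 = 3.2, while the divisor-(n − 1) average should settle near the true variance of 4.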
Consistency
A consistent estimator converges towards the
parameter being estimated as the sample size
increases, i.e. $E(T) \to \theta$ and
$V(T) \to 0$ as $n \to \infty$, where $\theta$ is the parameter being estimated.
[Figure: sampling distributions of a consistent estimator for n = 10 and n = 100. Source: McGraw-Hill/Irwin, © 2007 The McGraw-Hill Companies, Inc. All rights reserved.]
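As a worked illustration, take T to be the sample mean of n independent observations with mean $\mu$ and variance $\sigma^2$ (an assumption made here for concreteness). Both consistency conditions can then be checked directly:

$$E(\bar{x}) = \mu, \qquad V(\bar{x}) = \frac{1}{n^2}\sum_{i=1}^{n} V(x_i) = \frac{\sigma^2}{n} \to 0 \text{ as } n \to \infty$$

so the sample mean is a consistent estimator of $\mu$.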
Efficiency
An estimator is efficient if it has a relatively small variance (and
standard deviation).
Example
Suppose you want to estimate the mean and sd of a batsman's score
in one-day cricket. You randomly choose 5 different
innings and record the scores below:
20 52 8 63 11
Find unbiased estimates of the mean and variance.
$$\bar{x} = 30.8$$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = 628.7, \qquad s = \sqrt{628.7} \approx 25.07$$
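The same calculation can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions use the divisor n − 1. This is only a sketch for verifying the arithmetic; the variable names are illustrative.

```python
import statistics

scores = [20, 52, 8, 63, 11]  # the five recorded innings

x_bar = statistics.mean(scores)    # sample mean
s2 = statistics.variance(scores)   # sample variance, divisor n - 1
s = statistics.stdev(scores)       # square root of the sample variance

print(f"mean = {x_bar:.1f}, variance = {s2:.1f}, sd = {s:.2f}")
```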
Interval Estimator = Point Estimator ± Margin of Error
Upper confidence limit = Point Estimator + Margin of Error
Lower confidence limit = Point Estimator − Margin of Error
Cases: mean μ with σ known; mean μ with σ unknown; proportion p
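Confidence Interval of μ (σ known)
Assumption: the population is normal or the sample size n is large, and σ is known.
$$\left(\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}\right)$$
The quantity $z_{\alpha/2}\,\sigma/\sqrt{n}$ is called the sampling error (margin of error).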
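Example: with $\bar{x} = 122$, $\sigma = 20$, $n = 30$ and 95% confidence, $z_{0.025} = 1.96$ and
$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 122 \pm 1.96\frac{20}{\sqrt{30}} = 122 \pm (1.96)(3.65) = 122 \pm 7.15 = [114.85,\ 129.15]$$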
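A minimal sketch of the σ-known interval in Python follows. The function name `z_interval` and its parameters are illustrative, not from the slides; the critical value comes from `statistics.NormalDist` in the standard library rather than from a table.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence=0.95):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z value with right-tail area alpha/2
    margin = z * sigma / sqrt(n)                  # sampling error
    return x_bar - margin, x_bar + margin

low, high = z_interval(122, 20, 30, confidence=0.95)
print(f"({low:.2f}, {high:.2f})")
```

Because the code uses the unrounded critical value and standard error, the endpoints can differ from a hand calculation in the second decimal place.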
What is happening?
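Sampling Distribution of the Mean
[Figure: density f(x̄) of the sample mean, with 95% of the area in the centre and 2.5% in each tail; intervals $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$ are marked for several sample means.]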
Background
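We define $z_{\alpha/2}$ as the z value that cuts off a right-tail area of $\alpha/2$ under the standard
normal curve.
$$P(z > z_{\alpha/2}) = \alpha/2$$
$$P(z < -z_{\alpha/2}) = \alpha/2$$
$$P(-z_{\alpha/2} < z < z_{\alpha/2}) = (1 - \alpha)$$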
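[Figure: standard normal density f(z), with central area (1 − α) between $-z_{\alpha/2}$ and $z_{\alpha/2}$ and area α/2 in each tail.]
A (1 − α)100% confidence interval for μ is
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Standard Normal Distribution: critical values
α/2      z(α/2)
0.005    2.576
0.010    2.326
0.025    1.960
0.050    1.645
0.100    1.282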
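The tabulated critical values can be reproduced from the inverse of the standard normal distribution function. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is illustrative.

```python
from statistics import NormalDist

def z_critical(tail_area):
    """z value that cuts off a right-tail area of `tail_area` under N(0, 1)."""
    return NormalDist().inv_cdf(1.0 - tail_area)

for tail_area in (0.005, 0.010, 0.025, 0.050, 0.100):
    print(f"{tail_area:.3f}  {z_critical(tail_area):.3f}")
```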
[Figure: Example 10.1. Plots of the standard normal density f(z) and of densities f(x).]
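Confidence Interval of μ (σ unknown, large sample)
Assumption: the sample size n is large; σ is unknown and is estimated by s.
$$\left(\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}},\; \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}\right)$$
where
$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
For large n, dividing by n or by n − 1 gives practically the same value of s.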
Practice Problem 1:
A manufacturer of light bulbs claims that its light bulbs have a mean life of μ hours
with a standard deviation of 85 hours. A random sample of 40 such bulbs is
selected for testing. If the sample produces a mean value of 1505 hours, find the
95% confidence interval of μ.
Solution: Given n = 40 (large), σ = 85 (known), 1 − α = 0.95, α = 0.05, $\bar{x} = 1505$,
$$z_{\alpha/2} = z_{0.025} = 1.96$$
Therefore, the 95% CI of μ is given by
$$\left(1505 - 1.96\frac{85}{\sqrt{40}},\; 1505 + 1.96\frac{85}{\sqrt{40}}\right) = (1478.66,\ 1531.34)$$
Practice Problem 2:
For a sample of 50 customers at a popular restaurant, the mean waiting
time is 1.52 hours with an sd of 2.25 hours. Construct the
99% confidence interval for the population mean.
Solution: Given n = 50 (large), s = 2.25 (estimated), 1 − α = 0.99, α = 0.01, $\bar{x} = 1.52$,
$$z_{\alpha/2} = z_{0.005} = 2.58$$
Therefore, the 99% CI of μ is given by
$$\left(1.52 - 2.58\frac{2.25}{\sqrt{50}},\; 1.52 + 2.58\frac{2.25}{\sqrt{50}}\right) = (1.52 - 0.82,\ 1.52 + 0.82) = (0.70,\ 2.34)$$
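Confidence interval for a population proportion p (large sample):
$$\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right)$$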
Practice Problem 3:
A marketing research firm wants to estimate the share that foreign companies
have in the Indian market for certain products. A random sample of 100
consumers is obtained, and it is found that 34 people in the sample are users
of foreign-made products; the rest are users of domestic products. Give a
95% confidence interval for the share of foreign products in this market.
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{100}} = 0.34 \pm (1.96)(0.04737) = 0.34 \pm 0.0928 = [0.2472,\ 0.4328]$$
Thus, the firm may be 95% confident that foreign manufacturers control
anywhere from 24.72% to 43.28% of the market.
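With 90% confidence ($z_{0.05} = 1.645$) the interval is narrower:
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.645\sqrt{\frac{(0.34)(0.66)}{100}} = 0.34 \pm (1.645)(0.04737) = 0.34 \pm 0.07792 = [0.2621,\ 0.4179]$$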
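With 95% confidence and a larger sample (n = 200):
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{200}} = 0.34 \pm (1.96)(0.03350) = 0.34 \pm 0.0657 = [0.2743,\ 0.4057]$$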
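The effect of the confidence level and of the sample size on the width of the interval can be compared side by side. The sketch below is illustrative only: the function name `proportion_interval` is an assumption, and the z values are passed in as read from the normal table rather than computed.

```python
from math import sqrt

def proportion_interval(p_hat, n, z):
    """Large-sample confidence interval for a population proportion."""
    se = sqrt(p_hat * (1.0 - p_hat) / n)  # estimated standard error of p_hat
    margin = z * se
    return p_hat - margin, p_hat + margin

cases = [
    ("95%, n = 100", 100, 1.96),
    ("90%, n = 100", 100, 1.645),
    ("95%, n = 200", 200, 1.96),
]
for label, n, z in cases:
    low, high = proportion_interval(0.34, n, z)
    print(f"{label}: [{low:.4f}, {high:.4f}], width = {high - low:.4f}")
```

Lowering the confidence level or increasing the sample size both shrink the margin of error, which is why the second and third intervals are narrower than the first.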
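Confidence Interval of μ (σ unknown)
Assumption
Population Distribution is Normal
Population Standard Deviation, σ, is unknown
$$\left(\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}},\; \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}\right)$$
where
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$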
Practice Problem 4:
A stock market analyst wants to estimate the average return on a certain
stock. A random sample of 15 days yields an average (annualized) return of
$\bar{x} = 10.37\%$ and a standard deviation of s = 3.5%. Assuming a normal
population of returns, give a 95% confidence interval for the average return
on this stock.
The critical value of t for df = (n − 1) = (15 − 1) = 14 and a right-tail area of 0.025 is:
$$t_{0.025} = 2.145$$
Taking s = 3.5% as computed with divisor n, the unbiased estimate of the standard deviation is:
$$\hat{s}^2 = \frac{15}{14}(3.5)^2 = 13.125; \qquad \hat{s} = 3.623$$
The corresponding confidence interval or interval estimate is:
$$\bar{x} \pm t_{0.025}\frac{\hat{s}}{\sqrt{n}} = 10.37 \pm 2.145\frac{3.623}{\sqrt{15}} = 10.37 \pm 2.01 = [8.36,\ 12.38]$$
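A small-sample interval of this kind can be sketched in Python as follows. The function names are illustrative. The critical value is passed in as read from a t table (2.145 for 14 degrees of freedom), because the standard library has no t quantile function. The conversion helper assumes the reported sd was computed with divisor n.

```python
from math import sqrt

def unbiased_sd(sd_divisor_n, n):
    """Convert an sd computed with divisor n to the divisor-(n-1) version."""
    return sd_divisor_n * sqrt(n / (n - 1))

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean using a tabulated t critical value."""
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin

s_hat = unbiased_sd(3.5, 15)                       # assumes 3.5 used divisor n
low, high = t_interval(10.37, s_hat, 15, t_crit=2.145)
print(f"s_hat = {s_hat:.3f}, interval = [{low:.2f}, {high:.2f}]")
```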
Sample-Size Determination
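Before determining the necessary sample size, three questions must be
answered:
How close should the estimate be to the unknown parameter? (the bound, B)
What confidence level, 1 − α, is required?
What is the estimate of the population standard deviation σ (or, for a proportion, of p)?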
Example 1
A marketing research firm wants to conduct a survey to estimate the average
amount spent on entertainment by each person visiting a popular resort. The
people who plan the survey would like to determine the average amount spent by
all people visiting the resort to within $120, with 95% confidence. From past
operation of the resort, an estimate of the population standard deviation is
σ = $400. What is the minimum required sample size?
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} = \frac{(1.96)^2(400)^2}{120^2} = 42.684 \;\Rightarrow\; n = 43$$
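The sample-size rule for a mean is easy to wrap in a helper that always rounds up, since a fractional observation is not possible. This is a minimal sketch; the function name and parameter names are illustrative, and z is supplied from the normal table.

```python
from math import ceil

def sample_size_for_mean(sigma, bound, z):
    """Smallest n with z * sigma / sqrt(n) <= bound."""
    return ceil((z * sigma / bound) ** 2)

n_required = sample_size_for_mean(sigma=400, bound=120, z=1.96)
print(n_required)  # 42.684 rounded up to 43
```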
Example 2
The manufacturers of a sports car want to estimate the proportion of people in a
given income bracket who are interested in the model. The company wants to
know the population proportion, p, to within 0.10 with 99% confidence. Current
company records indicate that the proportion p may be around 0.25. What is the
minimum required sample size for this survey?
$$n = \frac{z_{\alpha/2}^2\,pq}{B^2} = \frac{(2.576)^2(0.25)(0.75)}{(0.10)^2} = 124.42 \;\Rightarrow\; n = 125$$
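The corresponding rule for a proportion can be sketched the same way. The function name is illustrative; the call uses the prior guess p = 0.25 and the bound 0.10 that appears in the arithmetic for this example.

```python
from math import ceil

def sample_size_for_proportion(p_guess, bound, z):
    """Smallest n with z * sqrt(p * q / n) <= bound, given a prior guess for p."""
    q_guess = 1.0 - p_guess
    return ceil(z ** 2 * p_guess * q_guess / bound ** 2)

n_required = sample_size_for_proportion(p_guess=0.25, bound=0.10, z=2.576)
print(n_required)  # 124.42 rounded up to 125
```

The required sample size grows with the inverse square of the bound, so halving B roughly quadruples n.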
Problem
NDTV randomly selected 10,000 final year students
across different management schools in India and
asked them about their career choices. 4% said they
want to take the plunge and start their own companies
even if that meant giving up lucrative job offers from
established MNCs. Find a 99% confidence interval for
the true population proportion of management
students in India who want to start their own companies.