Estimation

INFERENTIAL STATISTICS
Statistical inference may be divided into two major areas:
Estimation
Testing of Hypothesis
ESTIMATION Terms
In the theory of estimation, a STATISTIC is called an ESTIMATOR (for example, $\bar{x}$).
The value of an estimator is called an ESTIMATE.
Point Estimate
is a SINGLE VALUE of the estimator, obtained from available
sample observations
Example : proportion of vegetarians in a random sample of 50
PGP students can be a point estimate of the corresponding
proportion in the population of all PGP students.
Interval Estimate (Confidence Interval)
is an INTERVAL that provides a lower and an upper bound for a specific unknown population parameter.
Example: with 95% confidence, the interval (45%, 52%) contains the true proportion of vegetarians among all PGP students.
For intervals of the form point estimate ± margin of error, the point estimate always lies within (at the center of) the interval estimate.
POINT ESTIMATION
Point Estimators and Their Properties
An estimator of a parameter is a statistic used to estimate the parameter. The most commonly used estimator of the:

Population (parameter)                            Estimator (statistic)
Mean ($\mu$)                             is the   Mean ($\bar{x}$)
Variance ($\sigma^2$)                    is the   Variance ($s^2$)
Standard Deviation ($\sigma$)            is the   Standard Deviation ($s$)
Proportion ($p$)                         is the   Proportion ($\hat{p}$)
Difference of means ($\mu_1 - \mu_2$)    is the   Difference of means ($\bar{x}_1 - \bar{x}_2$)
Desirable properties of estimators include:

Unbiasedness
Efficiency
Consistency
Unbiased and Biased Estimators
Bias:
An unbiased estimator is on target on average.
A biased estimator is off target on average.
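Properties of estimator: Unbiasedness
$T$ is said to be an unbiased estimator of $\theta$ iff $E(T) = \theta$.
Example: the sample mean is an unbiased estimator of the population mean.
$$E(\bar{x}) = E\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n}\sum_{i=1}^{n} E(x_i) = \frac{1}{n}\sum_{i=1}^{n} E(x) = \mu$$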
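Example of biased estimator: Sample variance
Given a sample of size $n$ from a population with unknown mean ($\mu$) and variance ($\sigma^2$), we estimate the mean as we already know ($\bar{x}$) and the variance (intuitively) as:
$$T = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 - \bar{x}^2$$
This divisor-$n$ sample variance is not an unbiased estimator of the population variance:
$$E(T) = \frac{1}{n}\sum_{i=1}^{n} E(x_i^2) - E(\bar{x}^2)$$
$$= \frac{1}{n}\sum_{i=1}^{n}\left[\mathrm{var}(x_i) + (E(x_i))^2\right] - \left[\mathrm{var}(\bar{x}) + (E(\bar{x}))^2\right]$$
$$= \frac{1}{n}\sum_{i=1}^{n}\left[\sigma^2 + \mu^2\right] - \left[\frac{\sigma^2}{n} + \mu^2\right] = \frac{n-1}{n}\,\sigma^2$$
That is why, when the mean and variance are unknown, the following equation is used for the sample variance:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$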
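The bias factor $(n-1)/n$ can be checked exactly on a toy case. The sketch below is only an illustration: the three-value population and the helper name `expected_variance_estimates` are assumptions, not part of the slides. It enumerates every equally likely sample of size 2 drawn with replacement and averages both variance estimates using exact fractions.

```python
from fractions import Fraction
from itertools import product


def expected_variance_estimates(population, n):
    """Average the divisor-n and divisor-(n-1) variance estimates over every
    equally likely ordered sample of size n drawn with replacement."""
    total_biased = Fraction(0)
    total_unbiased = Fraction(0)
    count = 0
    for sample in product(population, repeat=n):
        mean = Fraction(sum(sample), n)
        ss = sum((Fraction(x) - mean) ** 2 for x in sample)  # sum of squares
        total_biased += ss / n          # T, divisor n
        total_unbiased += ss / (n - 1)  # s^2, divisor n - 1
        count += 1
    return total_biased / count, total_unbiased / count


# Hypothetical toy population, chosen only to keep the enumeration small.
population = [1, 2, 3]
mu = Fraction(sum(population), len(population))
sigma2 = sum((Fraction(x) - mu) ** 2 for x in population) / len(population)

e_T, e_s2 = expected_variance_estimates(population, n=2)
print("sigma^2 =", sigma2, " E(T) =", e_T, " E(s^2) =", e_s2)
```

With $n = 2$ the divisor-$n$ estimator averages to half of $\sigma^2$, while $s^2$ averages to $\sigma^2$ itself, in line with the derivation.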
Consistency
A consistent estimator converges towards the parameter being estimated as the sample size increases, i.e. $E(T) \to \theta$ and $V(T) \to 0$ as $n \to \infty$.
[Figure: sampling distributions of a consistent estimator for n = 10 and n = 100. Source: McGraw-Hill/Irwin, © 2007 The McGraw-Hill Companies, Inc. All rights reserved.]
Efficiency
An estimator is efficient if it has a relatively small variance (and standard deviation).
An efficient estimator is, on average, closer to the parameter being estimated.
An inefficient estimator is, on average, farther from the parameter being estimated.
Sample mean vs sample median (for a normal population):
E(sample mean) = $\mu$
E(sample median) = $\mu$
V(sample mean) is $\sigma^2/n$
V(sample median) is approximately $1.57\,\sigma^2/n$
Both are unbiased, but the sample mean has the smaller variance, so it is the more efficient estimator of $\mu$.
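Example
Suppose you want to estimate the mean and sd of the score of a batsman in one-day cricket. You have randomly chosen 5 different innings and recorded the scores below:
20 52 8 63 11
Find unbiased estimates of the mean and variance.
$$\bar{x} = \frac{20 + 52 + 8 + 63 + 11}{5} = 30.8$$
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{2514.8}{4} = 628.7, \qquad s = \sqrt{628.7} \approx 25.07$$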
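As a quick check of this example, the following sketch uses Python's standard `statistics` module, whose `variance` and `stdev` functions use the divisor $n - 1$. The variable names are illustrative.

```python
import statistics

scores = [20, 52, 8, 63, 11]  # the five recorded innings

x_bar = statistics.mean(scores)    # point estimate of the mean
s2 = statistics.variance(scores)   # unbiased variance estimate (divisor n - 1)
s = statistics.stdev(scores)       # square root of s2

print(round(x_bar, 2), round(s2, 2), round(s, 2))
```

The figure 25.07 is the standard deviation $s$; the variance estimate is $s^2 = 628.7$.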
Interval Estimator = Point Estimator ± Margin of Error
Elements of Interval Estimation
A probability that the population parameter falls somewhere within the interval.
[Figure: a confidence interval centered on the sample statistic (point estimator), running from the lower confidence limit to the upper confidence limit.]
Elements of Interval Estimation
Confidence Coefficient/Level: probability that the confidence interval will contain the true parameter.
Denoted by $100(1 - \alpha)\%$, e.g. 90%, 95%, 99%.
$\alpha$: probability that the interval does not contain the parameter.
The confidence coefficient is the area under the curve of the sampling distribution between the confidence limits.
Interval Estimator (Large Sample)
Confidence Intervals
  Mean
    $\sigma$ known
    $\sigma$ unknown
  Proportion
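Confidence Interval of $\mu$ ($\sigma$ known)
Assumption
Population Standard Deviation, $\sigma$, is Known
Sample size is large
Confidence Interval Estimator of $\mu$:
Let $x_1, x_2, \ldots, x_n$ be an iid random sample of size $n$, drawn from a population with mean $\mu$ and sd $\sigma$.
The $100(1-\alpha)\%$ Confidence Interval of $\mu$ is
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
The quantity $z_{\alpha/2}\,\sigma/\sqrt{n}$ is often called the margin of error or the sampling error.
For example, if: $n = 30$, $\sigma = 20$, $\bar{x} = 122$
A 95% confidence interval:
$\alpha = 0.05$; $\alpha/2 = 0.025$; $z_{0.025} = 1.96$
$$\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}} = 122 \pm 1.96\frac{20}{\sqrt{30}} = 122 \pm (1.96)(3.65) = 122 \pm 7.15 = [114.85,\ 129.15]$$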
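The calculation above can be sketched in a few lines of Python. The function name `z_interval` is an assumption made for illustration; the critical value comes from the standard library's `statistics.NormalDist` rather than a printed table, so the limits differ from the slide's in the second decimal place because no intermediate rounding is done.

```python
from math import sqrt
from statistics import NormalDist


def z_interval(x_bar, sigma, n, confidence=0.95):
    """Return (lower, upper) for x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2}
    margin = z * sigma / sqrt(n)                 # margin of error
    return x_bar - margin, x_bar + margin


lower, upper = z_interval(x_bar=122, sigma=20, n=30, confidence=0.95)
print(round(lower, 2), round(upper, 2))
```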
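What is happening?
[Figure: Sampling Distribution of the Mean, $f(\bar{x})$. The central 95% of the area lies between $\mu - 1.96\,\sigma/\sqrt{n}$ and $\mu + 1.96\,\sigma/\sqrt{n}$, with 2.5% in each tail. Sample means $\bar{x}$ are marked below the curve.]
2.5% fall below the interval
2.5% fall above the interval
95% fall within the interval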
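Background
We define $z_{\alpha/2}$ as the z value that cuts off a right-tail area of $\alpha/2$ under the standard normal curve.
$$P(z > z_{\alpha/2}) = \alpha/2$$
$$P(z < -z_{\alpha/2}) = \alpha/2$$
$$P(-z_{\alpha/2} < z < z_{\alpha/2}) = (1 - \alpha)$$
[Figure: Standard Normal Distribution, $f(z)$, with central area $(1 - \alpha)$ and an area of $\alpha/2$ in each tail.]
$(1 - \alpha)100\%$ Confidence Interval:
$$\bar{x} \pm z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$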
Critical Values of z and Levels of Confidence

$(1 - \alpha)$    $\alpha/2$    $z_{\alpha/2}$
0.99              0.005         2.576
0.98              0.010         2.326
0.95              0.025         1.960
0.90              0.050         1.645
0.80              0.100         1.282

[Figure: Standard Normal Distribution, $f(z)$, with central area $(1 - \alpha)$ and tail areas $\alpha/2$.]
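The tabulated critical values can be reproduced numerically. This is a minimal sketch using the standard library's `statistics.NormalDist`; the helper name `critical_z` is illustrative.

```python
from statistics import NormalDist


def critical_z(confidence):
    """z value that cuts off a right-tail area of alpha/2."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.99, 0.98, 0.95, 0.90, 0.80):
    # Columns: confidence level, alpha/2, z_{alpha/2}
    print(level, round((1.0 - level) / 2.0, 3), round(critical_z(level), 3))
```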
Example 10.1
Confidence level and the Width of the Confidence Interval
When sampling from the same population, using a fixed sample size, the higher the confidence level, the wider the confidence interval.
[Figure: two Standard Normal Distribution plots, $f(z)$, comparing the central 80% and 95% areas.]
80% Confidence Interval: $\bar{x} \pm 1.28\,\sigma/\sqrt{n}$
95% Confidence Interval: $\bar{x} \pm 1.96\,\sigma/\sqrt{n}$
Sample Size and the Width of the Confidence Interval
When sampling from the same population, using a fixed confidence level, the larger the sample size, n, the narrower the confidence interval.
[Figure: two Sampling Distribution of the Mean plots, $f(\bar{x})$, one for each sample size.]
95% Confidence Interval: n = 20
95% Confidence Interval: n = 40
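Confidence Interval of $\mu$ ($\sigma$ unknown)
Assumption
Population Standard Deviation, $\sigma$, is unknown
Sample size is large

Let $x_1, x_2, \ldots, x_n$ be an iid random sample of size $n$, drawn from a population with mean $\mu$ and sd $\sigma$. The $100(1-\alpha)\%$ Confidence Interval of $\mu$ is
$$\bar{x} - z_{\alpha/2}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{s}{\sqrt{n}}$$
where
$$s = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
(for large $n$, dividing by $n - 1$ instead of $n$ gives practically the same value).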
Practice Problem 1:
A manufacturer of light bulbs claims that its light bulbs have a mean life of $\mu$ hours with a standard deviation of 85 hours. A random sample of 40 such bulbs is selected for testing. If the sample produces a mean value of 1505 hours, find the 95% Confidence Interval of $\mu$.
Solution: Given $n = 40$ (large), $\sigma = 85$ (known), $\bar{x} = 1505$, $1 - \alpha = 0.95$, $\alpha = 0.05$,
$z_{\alpha/2} = z_{0.025} = 1.96$
Therefore, the 95% CI of $\mu$ is given by
$$\left[1505 - 1.96\frac{85}{\sqrt{40}},\ 1505 + 1.96\frac{85}{\sqrt{40}}\right] = [1478.66,\ 1531.34]$$
Practice Problem 2:
Waiting times (in hours) at a popular restaurant are found to have a mean of 1.52 hours with sd 2.25 hours for a sample of 50 customers. Construct the 99% confidence interval for the population mean.
Solution: Given $n = 50$ (large), $s = 2.25$ (estimated), $\bar{x} = 1.52$, $1 - \alpha = 0.99$, $\alpha = 0.01$,
$z_{\alpha/2} = z_{0.005} = 2.58$
Therefore, the 99% CI of $\mu$ is given by
$$\left[1.52 - 2.58\frac{2.25}{\sqrt{50}},\ 1.52 + 2.58\frac{2.25}{\sqrt{50}}\right] = [0.70,\ 2.34]$$
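Both practice problems use the same arithmetic, which can be sketched as below. The function name `large_sample_interval` is illustrative, and the critical values 1.96 and 2.58 are passed in as the rounded table values used in the solutions. For Practice Problem 2 the margin of error is $2.58 \times 2.25/\sqrt{50} \approx 0.82$, so the lower limit comes out near 0.70.

```python
from math import sqrt


def large_sample_interval(x_bar, sd, n, z):
    """Return (lower, upper) for x_bar +/- z * sd / sqrt(n).

    sd is sigma when it is known, or the sample value s when it is not.
    """
    margin = z * sd / sqrt(n)
    return x_bar - margin, x_bar + margin


# Practice Problem 1: sigma known, 95% confidence
p1 = large_sample_interval(x_bar=1505, sd=85, n=40, z=1.96)
# Practice Problem 2: s estimated from the sample, 99% confidence
p2 = large_sample_interval(x_bar=1.52, sd=2.25, n=50, z=2.58)

print([round(v, 2) for v in p1])
print([round(v, 2) for v in p2])
```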
Large-Sample Confidence Intervals for the Population Proportion, p
The estimator of the population proportion, $p$, is the sample proportion, $\hat{p}$. If the sample size is large, $\hat{p}$ has an approximately normal distribution, with $E(\hat{p}) = p$ and $V(\hat{p}) = pq/n$, where $q = (1 - p)$. When the population proportion is unknown, use the estimated value, $\hat{p}$, to estimate the standard deviation of $\hat{p}$.
For estimating $p$, a sample is considered large enough when both $np$ and $nq$ are greater than 5.
Large-Sample Confidence Intervals for the Population Proportion, p
Assumptions
Two Categorical Outcomes
Population Follows Binomial Distribution
Large Sample
The $100(1-\alpha)\%$ Confidence Interval for the population proportion $p$ is given by
$$\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$
Practice Problem 3:
A marketing research firm wants to estimate the share that foreign companies
have in the Indian market for certain products. A random sample of 100
consumers is obtained, and it is found that 34 people in the sample are users
of foreign-made products; the rest are users of domestic products. Give a
95% confidence interval for the share of foreign products in this market.
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{100}} = 0.34 \pm (1.96)(0.04737) = 0.34 \pm 0.0928 = [0.2472,\ 0.4328]$$
Thus, the firm may be 95% confident that foreign manufacturers control
anywhere from 24.72% to 43.28% of the market.
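A minimal sketch of the proportion interval follows. The function name `proportion_interval` is an assumption for illustration, and the rounded table value 1.96 is passed in directly. The function also applies the rule of thumb that both $n\hat{p}$ and $n\hat{q}$ should exceed 5.

```python
from math import sqrt


def proportion_interval(successes, n, z):
    """Large-sample interval p_hat +/- z * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    q_hat = 1.0 - p_hat
    # Rule of thumb for "large enough": n*p_hat and n*q_hat both above 5.
    if n * p_hat <= 5 or n * q_hat <= 5:
        raise ValueError("sample too small for the normal approximation")
    margin = z * sqrt(p_hat * q_hat / n)
    return p_hat - margin, p_hat + margin


lower, upper = proportion_interval(successes=34, n=100, z=1.96)
print(round(lower, 4), round(upper, 4))
```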
Reducing the Width of Confidence Intervals: The Value of Information

The width of a confidence interval can be reduced only at the price of:
a lower level of confidence, or
a larger sample.

Lower Level of Confidence: 90% Confidence Interval (n = 100)
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.645\sqrt{\frac{(0.34)(0.66)}{100}} = 0.34 \pm (1.645)(0.04737) = 0.34 \pm 0.07792 = [0.2621,\ 0.4179]$$

Larger Sample Size: Sample Size, n = 200 (95% Confidence Interval)
$$\hat{p} \pm z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{200}} = 0.34 \pm (1.96)(0.03350) = 0.34 \pm 0.0657 = [0.2743,\ 0.4057]$$
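The two trade-offs can be compared side by side by computing interval widths. This sketch assumes a helper named `proportion_interval_width` (not from the slides) and takes critical values from `statistics.NormalDist` instead of the table.

```python
from math import sqrt
from statistics import NormalDist


def proportion_interval_width(p_hat, n, confidence):
    """Full width (upper - lower) of the large-sample proportion interval."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return 2.0 * z * sqrt(p_hat * (1.0 - p_hat) / n)


baseline = proportion_interval_width(0.34, n=100, confidence=0.95)
lower_confidence = proportion_interval_width(0.34, n=100, confidence=0.90)
larger_sample = proportion_interval_width(0.34, n=200, confidence=0.95)

print(round(baseline, 4), round(lower_confidence, 4), round(larger_sample, 4))
```

Both alternatives give a narrower interval than the 95% interval with n = 100; doubling the sample size shrinks the width by a factor of $\sqrt{2}$.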
Interval Estimator (Small Sample)

Confidence Intervals
  Mean
    $\sigma$ known
    $\sigma$ unknown
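Confidence Interval of $\mu$ ($\sigma$ known)
Assumption
Population Distribution is Normal
Population Standard Deviation, $\sigma$, is known

Let $x_1, x_2, \ldots, x_n$ be an iid random sample of size $n$, drawn from a normal distribution with mean $\mu$ and sd $\sigma$.
The $100(1-\alpha)\%$ Confidence Interval of $\mu$ is
$$\bar{x} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$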
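Confidence Interval of $\mu$ ($\sigma$ unknown)
Assumption
Population Distribution is Normal
Population Standard Deviation, $\sigma$, is unknown

Let $x_1, x_2, \ldots, x_n$ be a random sample of size $n$, drawn from a normal distribution with mean $\mu$ and sd $\sigma$. The $100(1-\alpha)\%$ Confidence Interval of $\mu$ is
$$\bar{x} - t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}} \le \mu \le \bar{x} + t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$$
where $t_{\alpha/2,\,n-1}$ is the value of the t distribution with $n-1$ degrees of freedom that cuts off a tail area of $\alpha/2$ to its right, and
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$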
Practice Problem 4:
A stock market analyst wants to estimate the average return on a certain stock. A random sample of 15 days yields an average (annualized) return of $\bar{x} = 10.37\%$ and a standard deviation of $s = 3.5\%$. Assuming a normal population of returns, give a 95% confidence interval for the average return on this stock.
The critical value of t for df = (n - 1) = (15 - 1) = 14 and a right-tail area of 0.025 is:
$t_{0.025} = 2.145$
Treating the reported $s = 3.5$ as computed with divisor $n$, the unbiased estimate is
$$\hat{s}^2 = \frac{15}{14}s^2 = \frac{15}{14}(3.5)^2 = 13.125; \qquad \hat{s} = 3.623$$
The corresponding confidence interval or interval estimate is:
$$\bar{x} \pm t_{0.025}\frac{\hat{s}}{\sqrt{n}} = 10.37 \pm 2.145\frac{3.623}{\sqrt{15}} = 10.37 \pm 2.01 = [8.36,\ 12.38]$$
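The small-sample interval can be sketched as follows. Python's standard library has no inverse t distribution, so this sketch assumes the critical value is read from a t table and passed in; the function name `t_interval` is illustrative. With $t_{0.025} = 2.145$, $\hat{s} = 3.623$ and $n = 15$, the margin of error works out to about 2.01.

```python
from math import sqrt


def t_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) for x_bar +/- t_crit * s / sqrt(n).

    t_crit is t_{alpha/2, n-1}, looked up in a t table by the caller.
    """
    margin = t_crit * s / sqrt(n)
    return x_bar - margin, x_bar + margin


# Adjusted standard deviation from the problem: sqrt((15 / 14) * 3.5**2)
s_hat = sqrt(15 / 14 * 3.5 ** 2)

lower, upper = t_interval(x_bar=10.37, s=s_hat, n=15, t_crit=2.145)
print(round(s_hat, 3), round(lower, 2), round(upper, 2))
```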
Sample-Size Determination
Before determining the necessary sample size, three questions must be answered:
How close do you want your sample estimate to be to the unknown parameter? (What is the desired bound, B?)
What do you want the desired confidence level $(1-\alpha)$ to be so that the distance between your estimate and the parameter is less than or equal to B?
What is your estimate of the variance (or standard deviation) of the population in question?
For example: $(1 - \alpha)100\%$ Confidence Interval for $\mu$: $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$, where the Bound is $B = z_{\alpha/2}\,\sigma/\sqrt{n}$
Minimum Sample Size: Mean and Proportion
Minimum required sample size in estimating the population mean, $\mu$:
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2}$$
Bound of estimate: $B$ (Known)
Minimum required sample size in estimating the population proportion, $p$:
$$n = \frac{z_{\alpha/2}^2\,pq}{B^2}$$
Example 1
A marketing research firm wants to conduct a survey to estimate the average
amount spent on entertainment by each person visiting a popular resort. The
people who plan the survey would like to determine the average amount spent by
all people visiting the resort to within $120, with 95% confidence. From past
operation of the resort, an estimate of the population standard deviation is
s = $400. What is the minimum required sample size?
$$n = \frac{z_{\alpha/2}^2\,\sigma^2}{B^2} = \frac{(1.96)^2(400)^2}{120^2} = 42.684 \Rightarrow n = 43$$
Example 2
The manufacturers of a sports car want to estimate the proportion of people in a given income bracket who are interested in the model. The company wants to know the population proportion, p, to within 0.10 with 99% confidence. Current company records indicate that the proportion p may be around 0.25. What is the minimum required sample size for this survey?
$$n = \frac{z_{\alpha/2}^2\,pq}{B^2} = \frac{(2.576)^2(0.25)(0.75)}{(0.10)^2} = 124.42 \Rightarrow n = 125$$
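Both sample-size formulas round up to the next whole observation, which can be sketched as below. The function names are illustrative, the critical values are the table values used in the examples, and the bound for the proportion example is taken as B = 0.10 to match the calculation.

```python
from math import ceil


def min_sample_size_mean(z, sigma, bound):
    """Smallest n with z * sigma / sqrt(n) <= bound."""
    return ceil((z * sigma / bound) ** 2)


def min_sample_size_proportion(z, p, bound):
    """Smallest n with z * sqrt(p * (1 - p) / n) <= bound."""
    return ceil(z ** 2 * p * (1.0 - p) / bound ** 2)


n_mean = min_sample_size_mean(z=1.96, sigma=400, bound=120)
n_prop = min_sample_size_proportion(z=2.576, p=0.25, bound=0.10)
print(n_mean, n_prop)
```

Rounding up rather than to the nearest integer guarantees that the margin of error does not exceed B.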
Problem
NDTV randomly selected 10,000 final year students
across different management schools in India and
asked them about their career choices. 4% said they
want to take the plunge and start their own companies
even if that meant giving up lucrative job offers from
established MNCs. Find a 99% confidence interval of
the true population proportion of management
students in India who want to launch their own start-ups.
