Unit - 1 Sampling distribution and estimation part 2
Point estimation:
A particular value of a statistic that is used to estimate a given population parameter is known as a point estimate, and the process of finding such an estimate is known as point estimation.
For example, if the sample mean (x̄) is used to estimate the population mean (µ), the estimation is called point estimation. Similarly, the statement by the higher authority of the Dell company that the production of laptops in the next year will be 50,00,000 is an example of a point estimate.
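As an illustration, here is a minimal Python sketch (the population and all figures are hypothetical, not from these notes) of computing a point estimate: a single number, the sample mean, stands in for the unknown µ.

```python
import random

random.seed(42)

# Hypothetical population: yearly laptop orders (in units) per dealer.
population = [random.gauss(mu=50, sigma=10) for _ in range(100_000)]

# Draw a simple random sample and use its mean as a point estimate of µ.
sample = random.sample(population, k=500)
x_bar = sum(sample) / len(sample)

print(f"Point estimate of the population mean: {x_bar:.2f}")
```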
Interval Estimation:
One of the drawbacks of point estimation is that the point estimate rarely coincides with the true value of the population parameter. To overcome this drawback, interval estimation is preferred.
An estimate in which the unknown value of the population parameter lies within an interval with a certain probability is called an interval estimate, and the process of finding such an interval is called interval estimation.
For example, the statement by the higher authority of the Dell company that the total production of laptops in the next year will be between 40,00,000 and 55,00,000 is an example of an interval estimate.
Estimators and Estimates
A sample statistic which is used to estimate a population parameter is called an estimator.
In other words, estimators are functions of the sample observations. The sample mean (x̄) as an estimator of the population mean (µ), the sample standard deviation (s) as an estimator of the population standard deviation (σ), and the sample proportion (p) as an estimator of the population proportion (P) are some examples of estimators.
An estimator can be considered a good estimator if it is close to the true value of the population parameter. More precisely, an estimator is said to be a good estimator if it satisfies the following four properties:
Unbiasedness
Consistency
Efficiency
Sufficiency
On the other hand, the realized (specific) value of the estimator is called an estimate. For example, suppose the average sale of a product is estimated to be Rs. 75,000 with a standard deviation of Rs. 1,000. Here, the numerical values Rs. 75,000 and Rs. 1,000 are estimates of the population mean (µ) and the population standard deviation (σ) respectively.
Unbiasedness:
Let ‘t’ be an estimator of a population parameter ‘θ’. The estimator ‘t’ is said to be an unbiased estimator of θ if E(t) = θ, i.e. the mean of the sampling distribution of the statistic is equal to the population parameter.
For example, the sample mean (x̄) is an unbiased estimator of the population mean (µ), and the sample proportion (p) is an unbiased estimator of the population proportion (P).
On the other hand, an estimator t is said to be a biased estimator of the population parameter θ if E(t) ≠ θ.
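A small simulation sketch (normal population with assumed µ and σ, chosen purely for illustration) of unbiasedness: averaging x̄ over many repeated samples recovers µ, since E(x̄) = µ.

```python
import random
import statistics

random.seed(1)
mu, sigma, n, reps = 100.0, 15.0, 25, 10_000

# Collect the sample mean from many repeated samples of size n;
# unbiasedness says their average should sit very close to µ.
sample_means = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(statistics.mean(sample))

print(f"µ = {mu}, average of x̄ over {reps} samples = "
      f"{statistics.mean(sample_means):.3f}")
```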
Consistency:
Let ‘t’ be an estimator, which is a function of the sample observations. The estimator is said to be a consistent estimator of the population parameter ‘θ’ if ‘t’ converges in probability to θ as n → ∞,
i.e. tₙ →ᵖ θ as n → ∞
or, lim (n→∞) P(|tₙ − θ| < ε) = 1, for every ε > 0.
The consistency property implies that as the sample size increases, the value of the estimator t comes ever closer to the value of the population parameter θ.
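A quick sketch of consistency in action (population parameters assumed for illustration): as n grows, x̄ settles ever closer to µ.

```python
import random
import statistics

random.seed(2)
mu, sigma = 50.0, 10.0

# The sample mean converges in probability to µ as n increases.
for n in (10, 100, 1_000, 10_000, 100_000):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    print(f"n = {n:>6}:  x̄ = {statistics.mean(sample):.4f}")
```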
Efficiency:
Let t1 and t2 be two consistent estimators of the parameter θ. Then the estimator t1 is said to be a more efficient estimator than the estimator t2 if the variance of t1 is less than the variance of t2,
i.e. V(t1) < V(t2), so that
Efficiency (E) = V(t1) / V(t2) < 1
An estimator t is said to be the most efficient estimator of the population parameter θ if its variance is the least in comparison with the variances of all other estimators of θ. It is also called the best estimator of θ.
Thus, the smaller the variance of a consistent estimator, the more efficient the estimator, and vice versa; in short, less variance means more efficiency.
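A simulation sketch comparing two consistent estimators of µ for a normal population (parameters assumed): the sample mean has smaller variance than the sample median, so E = V(t1)/V(t2) comes out below 1.

```python
import random
import statistics

random.seed(3)
mu, sigma, n, reps = 0.0, 1.0, 25, 20_000

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

v1 = statistics.variance(means)    # V(t1): variance of the sample mean
v2 = statistics.variance(medians)  # V(t2): variance of the sample median

# For a normal population V(t1) < V(t2), so the ratio falls below 1.
print(f"V(mean) = {v1:.5f}, V(median) = {v2:.5f}, E = {v1 / v2:.3f}")
```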
Sufficiency:
An estimator ‘t’ is said to be a sufficient estimator of the population parameter θ if it contains all the information in the sample regarding the population parameter.
For example, the sample mean (x̄) is a sufficient estimator of the population mean (µ) because it uses all the information given in the sample. On the other hand, the sample median and the sample mode are not sufficient estimators of the population parameter θ.
Properties of sufficient estimators:
i. If a sufficient estimator exists for some parameter, then it is also the most efficient estimator.
ii. It is always consistent.
iii. It may or may not be unbiased.
Confidence Interval (CI) or Confidence Limit (CL):
The interval [t1, t2] within which the unknown value of the population parameter (θ) is expected to lie with a certain probability is known as the confidence interval (Neyman). It is also called the fiducial interval or fiducial limits (R. A. Fisher).
The confidence level refers to the probability that the confidence limits will contain the true value of the parameter. For example, a 95% confidence level indicates that there is a 95% probability that the parameter lies within the confidence limits, and a 5% risk that it lies outside them.
Let t1 and t2 be two estimators (or statistics) and θ be the parameter of the population. Then the interval defined by
P{t1 ≤ θ ≤ t2} = 1 − α, where 1 − α is the confidence level or confidence coefficient, is called a confidence interval.
More generally, the 100(1 − α)% confidence interval is given by
{statistic ± critical value of the test statistic × S.E.(statistic)}
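This general form translates directly into a one-line computation; a minimal sketch with hypothetical figures:

```python
# Generic interval: statistic ± (critical value) × S.E.(statistic).
def confidence_interval(statistic, critical_value, standard_error):
    margin = critical_value * standard_error
    return statistic - margin, statistic + margin

# Hypothetical example: x̄ = 75,000, S.E.(x̄) = 500, Z = 1.96 (95% level).
print(confidence_interval(75_000, 1.96, 500))  # (74020.0, 75980.0)
```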
Confidence Interval (CI) of population mean (large samples):
The 100(1 – α) % confidence interval (limits) for the population
mean is given by
CI = x̄ ± Z·S.E.(x̄) = x̄ ± Z·(σ/√n)·√((N−n)/(N−1))   if N is known
or, CI = x̄ ± Z·(σ/√n)   if N is unknown/infinite
Where,
S.E.(x̄) = standard error of the sample mean
Z = value of Z at probability α (if α = 5%, then Z₀.₀₅ = 1.96)
σ = population standard deviation
n = sample size
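A sketch of the large-sample interval (figures hypothetical), applying the finite-population correction only when N is supplied:

```python
import math

def ci_mean_large(x_bar, sigma, n, z=1.96, N=None):
    """100(1 − α)% CI for µ; the correction √((N−n)/(N−1))
    is applied only when the population size N is known."""
    se = sigma / math.sqrt(n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return x_bar - z * se, x_bar + z * se

# Hypothetical figures: x̄ = 520, σ = 40, n = 100.
print(ci_mean_large(520, 40, 100))           # N unknown/infinite
print(ci_mean_large(520, 40, 100, N=2_000))  # N known
```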
CI = x̄ ± Z·(s/√n)·√((N−n)/(N−1))   if N is known and σ is unknown (if the sample SD is given)
or, CI = x̄ ± Z·(s/√n)   if N is unknown.
If the sample size is small (n < 30), the confidence interval is given by (if data are given):
CI = x̄ ± t(α, n−1)·(S/√n)·√((N−n)/N)   if N is known
CI = x̄ ± t(α, n−1)·(S/√n)   if N is unknown
and, if the biased sample standard deviation is given:
CI = x̄ ± t(α, n−1)·(s/√(n−1))·√((N−n)/N)   or,   CI = x̄ ± t(α, n−1)·(s/√(n−1))
Where,
S = unbiased sample standard deviation.
s = biased sample standard deviation.
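For the small-sample case, the critical value t(α, n−1) can be looked up with SciPy (assumed available here); a sketch using the biased sample SD, so the standard error is s/√(n−1):

```python
import math
from scipy.stats import t  # SciPy is assumed to be installed

def ci_mean_small(x_bar, s_biased, n, alpha=0.05):
    """Small-sample (n < 30) CI for µ using the biased sample SD s;
    t(α, n−1) is the two-tailed critical value with n−1 df."""
    t_crit = t.ppf(1 - alpha / 2, df=n - 1)
    se = s_biased / math.sqrt(n - 1)
    return x_bar - t_crit * se, x_bar + t_crit * se

# Hypothetical figures: x̄ = 48, s = 6, n = 16.
print(ci_mean_small(48, 6, 16))
```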
Confidence interval (CI) for population proportion:
Let ‘P’ be the population proportion and ‘p’ be the sample proportion. Then the 100(1 − α)% confidence interval/limits for the population proportion is given by
CI = p ± Z·S.E.(p), where q = 1 − p and Q = 1 − P. In particular,
CI = p ± Z·√(PQ/n)·√((N−n)/(N−1))   if P and N are known.
CI = p ± Z·√(PQ/n)   if N is not known.
CI = p ± Z·√(pq/n)·√((N−n)/(N−1))   if N is given and P is not known.
CI = p ± Z·√(pq/n)   if P is not known and N is not given.
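A sketch for the proportion interval (survey figures hypothetical), using q = 1 − p and applying the finite-population correction only when N is given:

```python
import math

def ci_proportion(p, n, z=1.96, N=None):
    """100(1 − α)% CI for P based on the sample proportion p,
    with q = 1 − p; √((N−n)/(N−1)) is used only when N is given."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None:
        se *= math.sqrt((N - n) / (N - 1))
    return p - z * se, p + z * se

# Hypothetical survey: 120 of 400 respondents prefer the product.
print(ci_proportion(p=120 / 400, n=400))
```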
Central Limit Theorem (CLT):
The central limit theorem is one of the most remarkable results in the theory of probability, and it plays an important role in statistics. The concept of the CLT was first introduced by Abraham De Moivre (initial version), and it was later improved and extended by Laplace, Liapounov, Lindeberg, and Lévy.
The Central Limit Theorem states that the sum of a large number of independent random variables follows approximately a normal distribution.
i.e. “If X1, X2, …, Xn are independent random variables following any distribution, then under certain general conditions their sum ΣX = X1 + X2 + … + Xn is asymptotically (n → ∞) normally distributed.” The theorem describes the effect of an increase in sample size on the shape of a sampling distribution.
In other words, the CLT states that the distribution of the sample mean approximates a normal distribution as the sample size increases (assuming that all the samples are identical in size, and regardless of the shape of the population distribution).
If x1, x2, …, xn is an independent random sample of size ‘n’ from any population, then the sample mean (x̄) is approximately normally distributed with mean µ and variance σ²/n, provided that n is sufficiently large.
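A simulation sketch of the CLT (an exponential population is assumed purely for illustration): even though the population is strongly skewed, the sample means cluster around µ with variance close to σ²/n.

```python
import random
import statistics

random.seed(4)
n, reps = 50, 5_000

# Draw repeated samples from a non-normal (exponential) population;
# by the CLT the distribution of x̄ is approximately normal.
sample_means = [
    statistics.mean(random.expovariate(1.0) for _ in range(n))
    for _ in range(reps)
]

# Exponential(1) has µ = 1 and σ² = 1, so x̄ ≈ N(1, 1/n).
print(f"average of x̄  = {statistics.mean(sample_means):.4f}  (≈ 1)")
print(f"variance of x̄ = {statistics.variance(sample_means):.5f} (≈ {1 / n})")
```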