Chapter 2: Sampling - Simple Random Sampling
Such a process can be implemented through programming, using the discrete uniform distribution. Any number between 1 and N can be generated from this distribution, and the corresponding unit can be selected into the sample by associating an index with each sampling unit. Many statistical software packages, such as R and SAS, have built-in functions for drawing a sample using SRSWOR or SRSWR.
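As an illustrative sketch (not part of the notes; the values of $N$ and $n$ are arbitrary), both schemes can be implemented with Python's standard library:

```python
# Illustrative sketch: drawing SRSWOR and SRSWR samples with Python's
# standard library; N and n are arbitrary example values.
import random

N, n = 10, 4
units = list(range(1, N + 1))     # an index associated with each sampling unit

srswor = random.sample(units, n)                     # without replacement
srswr = [random.choice(units) for _ in range(n)]     # with replacement

print("SRSWOR:", srswor)   # n distinct units
print("SRSWR :", srswr)    # repetitions possible
```

In R, the built-in function sample() serves the same purpose: sample(N, n) draws an SRSWOR sample and sample(N, n, replace = TRUE) an SRSWR sample from the labels 1, ..., N.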
Notations:
The following notations will be used in further notes:

$\bar{y} = \dfrac{1}{n}\sum_{i=1}^{n} y_i$ : sample mean

$\bar{Y} = \dfrac{1}{N}\sum_{i=1}^{N} Y_i$ : population mean

$S^2 = \dfrac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \dfrac{1}{N-1}\Big(\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\Big)$ : population mean squared deviation

$\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \dfrac{1}{N}\Big(\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\Big)$ : population variance

$s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \dfrac{1}{n-1}\Big(\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\Big)$ : sample variance
Probability of drawing a sample:

1. SRSWOR
Note that a unit can be selected at any one of the $n$ draws. Let $P_j(i)$ denote the probability of selecting the $i$th unit at the $j$th draw; then $P_j(i) = 1/N$ for every draw, so the probability that the $i$th unit is included in the sample is
$$P(i) = P_1(i) + P_2(i) + \cdots + P_n(i) = \underbrace{\frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N}}_{n \text{ times}} = \frac{n}{N}.$$
Now if $u_1, u_2, \ldots, u_n$ are the $n$ units selected in the sample, then the probability of their selection is
$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{n}{N}\cdot\frac{n-1}{N-1}\cdots\frac{1}{N-n+1} = \frac{1}{\binom{N}{n}}.$$
Alternative approach:
The probability of drawing a sample in SRSWOR can alternatively be found as follows:
Let $u_{i(k)}$ denote the $i$th unit drawn at the $k$th draw. Note that the $i$th unit can be any unit out of the $N$ units. Then $s_o = (u_{i(1)}, u_{i(2)}, \ldots, u_{i(n)})$ is an ordered sample in which the order of the units in which they are drawn, i.e., $u_{i(1)}$ drawn at the first draw, $u_{i(2)}$ drawn at the second draw, and so on, is also considered. The probability of selecting such an ordered sample is
$$P(s_o) = P(u_{i(1)})\,P(u_{i(2)} \mid u_{i(1)})\cdots P(u_{i(n)} \mid u_{i(1)}, \ldots, u_{i(n-1)}).$$
Here $P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)})$ is the probability of drawing $u_{i(k)}$ at the $k$th draw given that $u_{i(1)}, u_{i(2)}, \ldots, u_{i(k-1)}$ have already been drawn in the first $(k-1)$ draws. Such probability is obtained as
$$P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)}) = \frac{1}{N-k+1}.$$
So
$$P(s_o) = \prod_{k=1}^{n} \frac{1}{N-k+1} = \frac{(N-n)!}{N!}.$$
The order of the units is irrelevant in the sample, and the $n$ units can be drawn in $n!$ different orders, so the probability of drawing the sample is
$$n!\;\frac{(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$$
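This result can be checked numerically; a Monte Carlo sketch (illustrative, not from the notes; the population labels and the target sample are arbitrary choices):

```python
# Illustrative check: under SRSWOR each particular unordered sample of
# size n has probability 1 / C(N, n).
import math
import random

N, n = 6, 3
target = frozenset({1, 2, 3})          # one particular unordered sample
trials = 200_000

hits = sum(frozenset(random.sample(range(1, N + 1), n)) == target
           for _ in range(trials))

print("empirical  :", hits / trials)            # approx. 0.05
print("theoretical:", 1 / math.comb(N, n))      # n!(N-n)!/N! = 1/20 = 0.05
```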
2. SRSWR
When $n$ units are selected with SRSWR, the total number of possible samples is $N^n$. The probability of drawing a sample is $\dfrac{1}{N^n}$.

Alternatively, let $u_i$ be the $i$th unit selected in the sample. This unit can be selected in the sample either at the first draw, second draw, ..., or $n$th draw. At any stage, there are always $N$ units in the population in the case of SRSWR, so the probability of selection of $u_i$ at any stage is $1/N$ for all $i = 1, 2, \ldots, n$. Then the probability of their selection is
$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{1}{N}\cdot\frac{1}{N}\cdots\frac{1}{N} = \frac{1}{N^n}.$$
Probability of drawing a unit:

1. SRSWOR
Let $A_\ell$ denote the event that a particular unit $u_j$ is not selected at the $\ell$th draw. The probability of selecting, say, the $j$th unit at the $k$th draw is
$$P[\text{selection of } u_j \text{ at } k\text{th draw}] = P(A_1)\,P(A_2 \mid A_1)\cdots P(A_{k-1} \mid A_1, \ldots, A_{k-2})\,P(\bar{A}_k \mid A_1, \ldots, A_{k-1})$$
$$= \Big(1 - \frac{1}{N}\Big)\Big(1 - \frac{1}{N-1}\Big)\cdots\Big(1 - \frac{1}{N-k+2}\Big)\frac{1}{N-k+1} = \frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-k+1}{N-k+2}\cdot\frac{1}{N-k+1} = \frac{1}{N}.$$

2. SRSWR
$$P[\text{selection of } u_j \text{ at } k\text{th draw}] = \frac{1}{N}.$$
Estimation of the population mean

SRSWOR
Let $t_i$ denote the total of the $y$ values in the $i$th possible sample, $i = 1, 2, \ldots, \binom{N}{n}$. Then
$$E(\bar{y}) = \frac{1}{n} E\Big(\sum_{j=1}^{n} y_j\Big) = \frac{1}{n} E(t_i) = \frac{1}{n}\,\frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}} t_i = \frac{1}{n}\,\frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}} \Big(\sum_{j=1}^{n} y_j\Big).$$
When $n$ units are sampled from $N$ units without replacement, each unit of the population can occur with $(n-1)$ other units selected out of the remaining $(N-1)$ units of the population, and so each unit occurs in $\binom{N-1}{n-1}$ of the $\binom{N}{n}$ possible samples. So
$$\sum_{i=1}^{\binom{N}{n}} \Big(\sum_{j=1}^{n} y_j\Big) = \binom{N-1}{n-1} \sum_{i=1}^{N} y_i.$$
Now
$$E(\bar{y}) = \frac{(N-1)!}{(n-1)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{n\,N!} \sum_{i=1}^{N} y_i = \frac{1}{N}\sum_{i=1}^{N} y_i = \bar{Y}.$$
Thus $\bar{y}$ is an unbiased estimator of $\bar{Y}$. Alternatively, the following approach can also be adopted to show the unbiasedness property:
$$E(\bar{y}) = \frac{1}{n}\sum_{j=1}^{n} E(y_j) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{N} Y_i\,P_j(i) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{N} Y_i\,\frac{1}{N} = \frac{1}{n}\sum_{j=1}^{n} \bar{Y} = \bar{Y}.$$
SRSWR
$$E(\bar{y}) = E\Big(\frac{1}{n}\sum_{i=1}^{n} y_i\Big) = \frac{1}{n}\sum_{i=1}^{n} E(y_i) = \frac{1}{n}\sum_{i=1}^{n} (Y_1 P_1 + Y_2 P_2 + \cdots + Y_N P_N) = \frac{1}{n}\sum_{i=1}^{n} \bar{Y} = \bar{Y},$$
where $P_i = \dfrac{1}{N}$ for all $i = 1, 2, \ldots, N$ is the probability of selection of a unit. Thus $\bar{y}$ is an unbiased estimator of the population mean under SRSWR also.
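Since every possible sample is equally likely under both schemes, the unbiasedness can be verified exactly for a small toy population by enumerating all samples; an illustrative sketch (the population values are arbitrary):

```python
# Illustrative exact check: averaging ybar over all equally likely samples
# recovers Ybar under both SRSWOR and SRSWR.
from itertools import combinations, product
from statistics import mean

y = [3, 7, 11, 15, 24]             # arbitrary toy population
N, n = len(y), 2
Ybar = mean(y)

wor = [mean(s) for s in combinations(y, n)]    # all C(N, n) unordered samples
wr = [mean(s) for s in product(y, repeat=n)]   # all N**n ordered samples

print(Ybar, mean(wor), mean(wr))   # all three values coincide: 12.0
```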
Variance of the estimate
Assume that each observation has variance $\sigma^2$. Then
$$V(\bar{y}) = E(\bar{y} - \bar{Y})^2 = E\Big[\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{Y})\Big]^2 = E\Big[\frac{1}{n^2}\sum_{i=1}^{n} (y_i - \bar{Y})^2 + \frac{1}{n^2}\mathop{\sum\sum}_{i \neq j} (y_i - \bar{Y})(y_j - \bar{Y})\Big]$$
$$= \frac{1}{n^2}\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + \frac{1}{n^2}\mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y}) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 + \frac{K}{n^2} = \frac{N-1}{Nn} S^2 + \frac{K}{n^2},$$
where $K = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y})$, assuming that each observation has variance $\sigma^2 = \frac{N-1}{N} S^2$. Now we find $K$ under the two sampling schemes.
SRSWOR
$$K = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y}).$$
Consider
$$E(y_i - \bar{Y})(y_j - \bar{Y}) = \frac{1}{N(N-1)} \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}).$$
Since
$$\Big[\sum_{k=1}^{N} (y_k - \bar{Y})\Big]^2 = \sum_{k=1}^{N} (y_k - \bar{Y})^2 + \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}),$$
$$0 = (N-1) S^2 + \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}),$$
we have
$$\frac{1}{N(N-1)} \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}) = \frac{1}{N(N-1)}\big[-(N-1)S^2\big] = -\frac{S^2}{N}.$$
Thus $K = -n(n-1)\dfrac{S^2}{N}$, and substituting this value of $K$, the variance of $\bar{y}$ under SRSWOR is
$$V(\bar{y}_{WOR}) = \frac{N-1}{Nn} S^2 - \frac{1}{n^2}\, n(n-1)\, \frac{S^2}{N} = \frac{N-n}{Nn}\, S^2.$$
SRSWR
$$K = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y}) = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})\,E(y_j - \bar{Y}) = 0$$
because the $i$th and $j$th draws ($i \neq j$) are independent. Thus the variance of $\bar{y}$ under SRSWR is
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}\, S^2.$$
It is to be noted that if $N$ is infinite (large enough), then
$$V(\bar{y}) = \frac{S^2}{n}$$
in both the cases of SRSWOR and SRSWR. So the factor $\dfrac{N-n}{N}$ is responsible for changing the variance of $\bar{y}$ when the sample is drawn from a finite population in comparison to an infinite population. This is why $\dfrac{N-n}{N}$ is called the finite population correction (fpc). It may be noted that $\dfrac{N-n}{N} = 1 - \dfrac{n}{N}$, so $\dfrac{N-n}{N}$ is close to 1 if the ratio of sample size to population size, $\dfrac{n}{N}$, is very small or negligible. The term $\dfrac{n}{N}$ is called the sampling fraction. In practice, the fpc can be ignored whenever $\dfrac{n}{N} < 5\%$, and for many purposes even if it is as high as 10%. Ignoring the fpc will result in overestimation of the variance of $\bar{y}$.
Efficiency of $\bar{y}$ under SRSWOR over SRSWR
$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}\, S^2$$
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}\, S^2 = \frac{N-n}{Nn}\, S^2 + \frac{n-1}{Nn}\, S^2 = V(\bar{y}_{WOR}) + \text{a positive quantity}.$$
Thus
$$V(\bar{y}_{WR}) > V(\bar{y}_{WOR}),$$
and so SRSWOR is more efficient than SRSWR.
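Both variance formulas, and this inequality, can be verified exactly by enumerating every possible sample of a small toy population; an illustrative sketch (the population values are arbitrary):

```python
# Illustrative exact check of the two variance formulas and the efficiency
# comparison, by enumerating all samples of an arbitrary toy population.
from itertools import combinations, product
from statistics import mean, pvariance

y = [3, 7, 11, 15, 24]
N, n = len(y), 2
S2 = pvariance(y) * N / (N - 1)    # S^2 (divisor N-1); pvariance uses divisor N

v_wor = pvariance([mean(s) for s in combinations(y, n)])
v_wr = pvariance([mean(s) for s in product(y, repeat=n)])

print(v_wor, (N - n) / (N * n) * S2)   # 19.5 = 19.5
print(v_wr, (N - 1) / (N * n) * S2)    # 26.0 = 26.0, and v_wr > v_wor
```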
Estimation of variance from a sample
Consider
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n-1}\sum_{i=1}^{n} \big[(y_i - \bar{Y}) - (\bar{y} - \bar{Y})\big]^2 = \frac{1}{n-1}\Big[\sum_{i=1}^{n} (y_i - \bar{Y})^2 - n(\bar{y} - \bar{Y})^2\Big].$$
Hence
$$E(s^2) = \frac{1}{n-1}\Big[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 - n\,E(\bar{y} - \bar{Y})^2\Big] = \frac{1}{n-1}\Big[\sum_{i=1}^{n} Var(y_i) - n\,Var(\bar{y})\Big] = \frac{1}{n-1}\big[n\sigma^2 - n\,Var(\bar{y})\big].$$
In the case of SRSWOR,
$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}\, S^2,$$
and so
$$E(s^2) = \frac{n}{n-1}\Big[\sigma^2 - \frac{N-n}{Nn}\, S^2\Big] = \frac{n}{n-1}\Big[\frac{N-1}{N}\, S^2 - \frac{N-n}{Nn}\, S^2\Big] = S^2.$$
In the case of SRSWR,
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}\, S^2,$$
and so
$$E(s^2) = \frac{n}{n-1}\Big[\sigma^2 - \frac{N-1}{Nn}\, S^2\Big] = \frac{n}{n-1}\Big[\frac{N-1}{N}\, S^2 - \frac{N-1}{Nn}\, S^2\Big] = \frac{N-1}{N}\, S^2 = \sigma^2.$$
Hence
$$E(s^2) = \begin{cases} S^2 & \text{in SRSWOR} \\ \sigma^2 & \text{in SRSWR.} \end{cases}$$
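This pair of results can also be checked exactly by enumeration; an illustrative sketch with an arbitrary toy population:

```python
# Illustrative exact check: the mean of s^2 over all samples equals S^2
# under SRSWOR and sigma^2 under SRSWR.
from itertools import combinations, product
from statistics import mean, pvariance, variance

y = [3, 7, 11, 15, 24]
N, n = len(y), 3
sigma2 = pvariance(y)              # divisor N
S2 = sigma2 * N / (N - 1)          # divisor N-1

print(mean(variance(s) for s in combinations(y, n)), S2)        # 65.0 = 65.0
print(mean(variance(s) for s in product(y, repeat=n)), sigma2)  # 52.0 = 52.0
```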
Standard errors
The standard error of $\bar{y}$ is defined as $\sqrt{Var(\bar{y})}$.
In order to estimate the standard error, one simple option is to consider the square root of the estimate of the variance of the sample mean:
• under SRSWOR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\dfrac{N-n}{Nn}}\; s$;
• under SRSWR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\dfrac{N-1}{Nn}}\; s$.
It is to be noted that this estimator does not possess the same properties as $\widehat{Var}(\bar{y})$. The reason is that if $\hat{\theta}$ is an estimator of $\theta$, then $\sqrt{\hat{\theta}}$ is not necessarily an estimator of $\sqrt{\theta}$. In fact, $\hat{\sigma}(\bar{y})$ is a negatively biased estimator under SRSWOR.
Consider $s$ as an estimator of $S$. Let
$$s^2 = S^2 + \varepsilon \quad \text{with} \quad E(\varepsilon) = 0, \quad E(\varepsilon^2) = Var(s^2).$$
Write
$$s = (S^2 + \varepsilon)^{1/2} = S\Big(1 + \frac{\varepsilon}{S^2}\Big)^{1/2} = S\Big(1 + \frac{\varepsilon}{2S^2} - \frac{\varepsilon^2}{8S^4} + \ldots\Big),$$
assuming $\varepsilon$ will be small compared to $S^2$; as $n$ becomes large, the probability of such an event approaches one. Neglecting the powers of $\varepsilon$ higher than two and taking expectation, we have
$$E(s) = S\Big[1 - \frac{Var(s^2)}{8S^4}\Big],$$
where
$$Var(s^2) = \frac{2S^4}{n-1}\Big[1 + \frac{n-1}{2n}(\beta_2 - 3)\Big] \quad \text{for large } N,$$
$$\mu_j = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^j,$$
$$\beta_2 = \frac{\mu_4}{S^4} : \text{coefficient of kurtosis}.$$
Thus
$$E(s) = S\Big[1 - \frac{1}{4(n-1)} - \frac{\beta_2 - 3}{8n}\Big],$$
$$Var(s) = S^2 - S^2\Big[1 - \frac{Var(s^2)}{8S^4}\Big]^2 \approx \frac{Var(s^2)}{4S^2} = \frac{S^2}{2(n-1)}\Big[1 + \frac{n-1}{2n}(\beta_2 - 3)\Big].$$
Note that for a normal distribution, $\beta_2 = 3$ and we obtain
$$Var(s) = \frac{S^2}{2(n-1)}.$$
Both $Var(s)$ and $Var(s^2)$ are inflated due to non-normality to the same extent, by the inflation factor
$$1 + \frac{n-1}{2n}(\beta_2 - 3),$$
and this does not depend on the coefficient of skewness. This is an important result to be kept in mind while determining the sample size, in which it is assumed that $S^2$ is known. If the inflation factor is ignored and the population is non-normal, then the reliability of $s^2$ may be misleading.
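As a numerical illustration (the toy population below is an arbitrary choice, and $\mu_2^2$ is used in place of $S^4$, which is adequate for large $N$), the inflation factor can be computed as follows:

```python
# Illustrative computation of the inflation factor 1 + (n-1)/(2n)(beta2 - 3)
# for an arbitrary right-skewed toy population.
from statistics import mean

Y = [1, 1, 2, 2, 3, 4, 6, 9, 14, 22]
N, n = len(Y), 10
Ybar = mean(Y)
mu2 = sum((v - Ybar) ** 2 for v in Y) / N
mu4 = sum((v - Ybar) ** 4 for v in Y) / N
beta2 = mu4 / mu2 ** 2             # coefficient of kurtosis

inflation = 1 + (n - 1) / (2 * n) * (beta2 - 3)
print(f"beta2 = {beta2:.2f}, inflation factor = {inflation:.2f}")  # > 1 here
```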
Alternative approach:
The results for the unbiasedness property and the variance of sample mean can also be proved in an
alternative way as follows:
(i) SRSWOR
With the $i$th unit of the population, we associate a random variable $a_i$ defined as follows:
$$a_i = \begin{cases} 1 & \text{if the } i\text{th unit occurs in the sample} \\ 0 & \text{otherwise,} \end{cases} \quad i = 1, 2, \ldots, N.$$
Then,
$$E(a_i) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$
$$E(a_i^2) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$
$$E(a_i a_j) = 1 \times \text{Probability that the } i\text{th and } j\text{th units are included in the sample} = \frac{n(n-1)}{N(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$$
From these results, we can obtain
$$Var(a_i) = E(a_i^2) - \big(E(a_i)\big)^2 = \frac{n(N-n)}{N^2}, \quad i = 1, 2, \ldots, N,$$
$$Cov(a_i, a_j) = E(a_i a_j) - E(a_i)\,E(a_j) = -\frac{n(N-n)}{N^2(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$$
We can rewrite the sample mean as
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i.$$
Then
$$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} E(a_i)\, y_i = \bar{Y}$$
and
$$Var(\bar{y}) = \frac{1}{n^2}\, Var\Big(\sum_{i=1}^{N} a_i y_i\Big) = \frac{1}{n^2}\Big[\sum_{i=1}^{N} Var(a_i)\, y_i^2 + \mathop{\sum\sum}_{i \neq j} Cov(a_i, a_j)\, y_i y_j\Big].$$
Substituting the values of $Var(a_i)$ and $Cov(a_i, a_j)$ in the expression of $Var(\bar{y})$ and simplifying, we get
$$Var(\bar{y}) = \frac{N-n}{Nn}\, S^2.$$
To show that $E(s^2) = S^2$, consider
$$s^2 = \frac{1}{n-1}\Big[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\Big] = \frac{1}{n-1}\Big[\sum_{i=1}^{N} a_i y_i^2 - n\bar{y}^2\Big].$$
Hence, taking expectation, we get
$$E(s^2) = \frac{1}{n-1}\Big[\sum_{i=1}^{N} E(a_i)\, y_i^2 - n\big\{Var(\bar{y}) + \bar{Y}^2\big\}\Big].$$
Substituting the values of $E(a_i)$ and $Var(\bar{y})$ in this expression and simplifying, we get $E(s^2) = S^2$.
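The moments of the indicator variables themselves can be checked by enumerating all $\binom{N}{n}$ samples; an illustrative sketch with arbitrary $N$ and $n$:

```python
# Illustrative exact check of the indicator-variable moments under SRSWOR.
from itertools import combinations
from statistics import mean

N, n = 6, 3
samples = list(combinations(range(N), n))

a_i = [1 if 0 in s else 0 for s in samples]                 # indicator of unit i
a_ij = [1 if (0 in s and 1 in s) else 0 for s in samples]   # units i and j together

print(mean(a_i), n / N)                                # E(a_i) = 0.5
print(mean(a_ij), n * (n - 1) / (N * (N - 1)))         # E(a_i a_j) = 0.2
print(mean(a_ij) - (n / N) ** 2,
      -n * (N - n) / (N ** 2 * (N - 1)))               # Cov(a_i, a_j) = -0.05
```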
(ii) SRSWR
Let a random variable $a_i$ associated with the $i$th unit of the population denote the number of times the $i$th unit occurs in the sample, $i = 1, 2, \ldots, N$. So $a_i$ assumes the values $0, 1, 2, \ldots, n$. The joint distribution of $a_1, a_2, \ldots, a_N$ is multinomial, given by
$$P(a_1, a_2, \ldots, a_N) = \frac{n!}{\prod_{i=1}^{N} a_i!}\;\frac{1}{N^n},$$
where $\sum_{i=1}^{N} a_i = n$. For this multinomial distribution, we have
$$E(a_i) = \frac{n}{N}, \qquad Var(a_i) = \frac{n(N-1)}{N^2}, \quad i = 1, 2, \ldots, N,$$
$$Cov(a_i, a_j) = -\frac{n}{N^2}, \quad i \neq j = 1, 2, \ldots, N.$$
We rewrite the sample mean as
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i.$$
Hence, taking the expectation of $\bar{y}$ and substituting the value $E(a_i) = n/N$, we obtain
$$E(\bar{y}) = \bar{Y}.$$
Further,
$$Var(\bar{y}) = \frac{1}{n^2}\Big[\sum_{i=1}^{N} Var(a_i)\, y_i^2 + \mathop{\sum\sum}_{i \neq j} Cov(a_i, a_j)\, y_i y_j\Big].$$
Substituting the values $Var(a_i) = n(N-1)/N^2$ and $Cov(a_i, a_j) = -n/N^2$ and simplifying, we get
$$Var(\bar{y}) = \frac{N-1}{Nn}\, S^2.$$
To prove that $E(s^2) = \dfrac{N-1}{N}\, S^2 = \sigma^2$ in SRSWR, consider
$$(n-1)\, s^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 = \sum_{i=1}^{N} a_i y_i^2 - n\bar{y}^2,$$
$$(n-1)\, E(s^2) = \sum_{i=1}^{N} E(a_i)\, y_i^2 - n\big\{Var(\bar{y}) + \bar{Y}^2\big\} = \frac{n}{N}\sum_{i=1}^{N} y_i^2 - n\,\frac{(N-1)}{nN}\, S^2 - n\bar{Y}^2 = \frac{(n-1)(N-1)}{N}\, S^2,$$
so that
$$E(s^2) = \frac{N-1}{N}\, S^2 = \sigma^2.$$
Estimation of the population total
Sometimes it is also of interest to estimate the population total $Y_T = \sum_{i=1}^{N} Y_i = N\bar{Y}$. It can be estimated by
$$\hat{Y}_T = N\hat{\bar{Y}} = N\bar{y}.$$
Obviously,
$$E\big(\hat{Y}_T\big) = N\, E(\bar{y}) = N\bar{Y} = Y_T,$$
$$Var\big(\hat{Y}_T\big) = N^2\, Var(\bar{y}) = \begin{cases} N^2\, \dfrac{N-n}{Nn}\, S^2 = \dfrac{N(N-n)}{n}\, S^2 & \text{for SRSWOR} \\[2ex] N^2\, \dfrac{N-1}{Nn}\, S^2 = \dfrac{N(N-1)}{n}\, S^2 & \text{for SRSWR,} \end{cases}$$
and the estimates of the variance are
$$\widehat{Var}\big(\hat{Y}_T\big) = \begin{cases} \dfrac{N(N-n)}{n}\, s^2 & \text{for SRSWOR} \\[2ex] \dfrac{N^2}{n}\, s^2 & \text{for SRSWR.} \end{cases}$$
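A minimal sketch of the computation (the population size and the sample values below are hypothetical):

```python
# A minimal sketch with hypothetical values: estimating the population total
# and its estimated variance from an SRSWOR sample.
from statistics import mean, variance

N = 200                                         # population size, assumed known
sample = [12.0, 9.5, 14.2, 11.1, 10.3, 13.8]    # hypothetical sample values
n = len(sample)

Y_T_hat = N * mean(sample)                      # N * ybar
var_hat = N * (N - n) / n * variance(sample)    # N(N-n) s^2 / n for SRSWOR

print(f"estimated total = {Y_T_hat:.1f}, standard error = {var_hat ** 0.5:.1f}")
```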
Confidence limits for the population mean
Assume that the population is normally distributed $N(\mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$. Then $\dfrac{\bar{y} - \bar{Y}}{\sqrt{Var(\bar{y})}}$ follows $N(0,1)$ when $\sigma^2$ is known. If $\sigma^2$ is unknown and is estimated from the sample, then $\dfrac{\bar{y} - \bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}}$ follows a $t$-distribution with $(n-1)$ degrees of freedom. When $\sigma^2$ is known, the $100(1-\alpha)\%$ confidence interval is given by
$$P\Bigg[-Z_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{Var(\bar{y})}} \le Z_{\alpha/2}\Bigg] = 1 - \alpha$$
or
$$P\Big[\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\Big] = 1 - \alpha,$$
and the confidence limits are
$$\Big[\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})},\;\; \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\Big],$$
where $Z_{\alpha/2}$ denotes the upper $\frac{\alpha}{2}\%$ points of the $N(0,1)$ distribution. Similarly, when $\sigma^2$ is unknown, the $100(1-\alpha)\%$ confidence interval is
$$P\Bigg[-t_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}} \le t_{\alpha/2}\Bigg] = 1 - \alpha$$
or
$$P\Big[\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\Big] = 1 - \alpha,$$
and the confidence limits are
$$\Big[\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})},\;\; \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\Big],$$
where $t_{\alpha/2}$ denotes the upper $\frac{\alpha}{2}\%$ points of the $t$-distribution with $(n-1)$ degrees of freedom.
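A minimal sketch of the normal-quantile interval, reusing the hypothetical sample from above; for small $n$ with $\sigma^2$ unknown, the quantile $t_{\alpha/2, n-1}$ would replace $Z_{\alpha/2}$:

```python
# A minimal sketch with hypothetical values: a 95% confidence interval for
# Ybar under SRSWOR using the standard normal quantile.
from statistics import NormalDist, mean, variance

N, alpha = 200, 0.05
sample = [12.0, 9.5, 14.2, 11.1, 10.3, 13.8]
n = len(sample)

ybar = mean(sample)
se = ((N - n) / (N * n) * variance(sample)) ** 0.5   # estimated SE of ybar
z = NormalDist().inv_cdf(1 - alpha / 2)              # Z_{alpha/2} = 1.96

print(f"95% CI: ({ybar - z * se:.2f}, {ybar + z * se:.2f})")
```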
Determination of sample size
An important constraint in determining the sample size is that information regarding the population standard deviation $S$ should be known for these criteria. The reason and need for this will become clear when we derive the sample size in the next section. A question arises about how to have information about $S$ beforehand. The possible solutions to this issue are to conduct a pilot survey, collect a preliminary sample of small size, estimate $S$, and use it as the known value of $S$. Alternatively, such information can also be obtained from past data, past experience, the long association of the experimenter with the experiment, prior information, etc.

Now we find the sample size under different criteria assuming that the samples have been drawn using SRSWOR. The case of SRSWR can be derived similarly.
1. Pre-specified variance
The sample size is to be determined such that the variance of $\bar{y}$ does not exceed a given value, say $V$. In this case, find $n$ such that
$$Var(\bar{y}) \le V$$
$$\text{or} \quad \frac{N-n}{Nn}\, S^2 \le V$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} \le \frac{V}{S^2}$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} \le \frac{1}{n_e}$$
$$\text{or} \quad n \ge \frac{n_e}{1 + \dfrac{n_e}{N}},$$
where $n_e = \dfrac{S^2}{V}$.
It may be noted here that $n_e$ can be known only when $S^2$ is known. This is the reason that compels us to assume that $S$ should be known. The same reason will also be seen in the other cases.
The smallest sample size needed in this case is
$$n_{smallest} = \frac{n_e}{1 + \dfrac{n_e}{N}}.$$
If $N$ is large, then the required $n$ is $n \ge n_e$ and $n_{smallest} = n_e$.
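A small helper function for this criterion (a sketch; $S^2$ is assumed known, e.g., from a pilot survey, and the argument values are illustrative):

```python
# A small sketch: the smallest n with Var(ybar) <= V under SRSWOR,
# n = n_e / (1 + n_e/N) with n_e = S^2/V, rounded up.
import math

def n_for_variance(S2: float, V: float, N: int) -> int:
    ne = S2 / V
    return math.ceil(ne / (1 + ne / N))

print(n_for_variance(S2=65.0, V=2.0, N=500))   # -> 31
```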
2. Pre-specified estimation error
It may be required that the sample mean $\bar{y}$ should not differ from the population mean $\bar{Y}$ by more than a specified amount of absolute estimation error $e$ with given probability, i.e.,
$$P\big[\,|\bar{y} - \bar{Y}| \le e\,\big] = 1 - \alpha.$$
Since $\bar{y}$ follows $N\Big(\bar{Y}, \dfrac{N-n}{Nn}\, S^2\Big)$, assuming the normal distribution for the population, we can write
$$P\Bigg[\frac{|\bar{y} - \bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{e}{\sqrt{Var(\bar{y})}}\Bigg] = 1 - \alpha,$$
which implies that
$$Z_{\alpha/2}^2\, Var(\bar{y}) = e^2$$
$$\text{or} \quad Z_{\alpha/2}^2\, \frac{N-n}{Nn}\, S^2 = e^2$$
$$\text{or} \quad n = \frac{\Big(\dfrac{Z_{\alpha/2}\, S}{e}\Big)^2}{1 + \dfrac{1}{N}\Big(\dfrac{Z_{\alpha/2}\, S}{e}\Big)^2},$$
which is the required sample size. If $N$ is large, then
$$n = \Big(\frac{Z_{\alpha/2}\, S}{e}\Big)^2.$$
3. Pre-specified width of the confidence interval
If the requirement is that the width of the confidence interval of $\bar{y}$ with confidence coefficient $(1-\alpha)$ should not exceed a pre-specified amount $W$, then the sample size $n$ is determined such that
$$2 Z_{\alpha/2} \sqrt{Var(\bar{y})} \le W$$
$$\text{or} \quad 2 Z_{\alpha/2} \sqrt{\frac{N-n}{Nn}}\; S \le W$$
$$\text{or} \quad 4 Z_{\alpha/2}^2 \Big(\frac{1}{n} - \frac{1}{N}\Big) S^2 \le W^2$$
$$\text{or} \quad \frac{1}{n} \le \frac{1}{N} + \frac{W^2}{4 Z_{\alpha/2}^2\, S^2}$$
$$\text{or} \quad n \ge \frac{\dfrac{4 Z_{\alpha/2}^2\, S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2\, S^2}{N W^2}}.$$
The minimum sample size required is
$$n_{smallest} = \frac{\dfrac{4 Z_{\alpha/2}^2\, S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2\, S^2}{N W^2}}.$$
If $N$ is large, then
$$n \ge \frac{4 Z_{\alpha/2}^2\, S^2}{W^2}$$
and the minimum sample size needed is
$$n_{smallest} = \frac{4 Z_{\alpha/2}^2\, S^2}{W^2}.$$
4. Pre-specified coefficient of variation
The coefficient of variation (CV) is defined as the ratio of the standard error (or standard deviation) and the mean. Knowledge of the coefficient of variation has played an important role in sampling theory, as this information helps in deriving efficient estimators.
If it is desired that the coefficient of variation of $\bar{y}$ should not exceed a given or pre-specified value of the coefficient of variation, say $C_0$, then the required sample size $n$ is to be determined such that
$$CV(\bar{y}) \le C_0$$
$$\text{or} \quad \frac{\sqrt{Var(\bar{y})}}{\bar{Y}} \le C_0$$
$$\text{or} \quad \frac{\dfrac{N-n}{Nn}\, S^2}{\bar{Y}^2} \le C_0^2$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} \le \frac{C_0^2}{C^2}$$
$$\text{or} \quad n \ge \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}$$
is the required sample size, where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation.
The smallest sample size needed in this case is
$$n_{smallest} = \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}.$$
If $N$ is large, then
$$n \ge \frac{C^2}{C_0^2} \quad \text{and} \quad n_{smallest} = \frac{C^2}{C_0^2}.$$
5. Pre-specified relative error
When $\bar{y}$ is used for estimating the population mean $\bar{Y}$, the relative estimation error is defined as $\dfrac{\bar{y} - \bar{Y}}{\bar{Y}}$. If it is required that such relative estimation error should not exceed a pre-specified value $R$ with probability $(1-\alpha)$, then such requirement can be satisfied by expressing it as
$$P\Bigg[\frac{|\bar{y} - \bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{R\bar{Y}}{\sqrt{Var(\bar{y})}}\Bigg] = 1 - \alpha.$$
Assuming the population to be normally distributed, $\bar{y}$ follows $N\Big(\bar{Y}, \dfrac{N-n}{Nn}\, S^2\Big)$. So it can be written that
$$Z_{\alpha/2}^2\, \frac{N-n}{Nn}\, S^2 = R^2 \bar{Y}^2$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} = \frac{R^2}{C^2 Z_{\alpha/2}^2}$$
$$\text{or} \quad n = \frac{\Big(\dfrac{Z_{\alpha/2}\, C}{R}\Big)^2}{1 + \dfrac{1}{N}\Big(\dfrac{Z_{\alpha/2}\, C}{R}\Big)^2},$$
where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation and should be known.
If $N$ is large, then
$$n = \Big(\frac{Z_{\alpha/2}\, C}{R}\Big)^2.$$
6. Pre-specified cost
Let an amount of money $C$ be designated for the sample survey to collect $n$ observations, $C_0$ be the overhead cost, and $C_1$ be the cost of collecting one unit in the sample. Then the total cost $C$ can be expressed as
$$C = C_0 + n C_1$$
$$\text{or} \quad n = \frac{C - C_0}{C_1}$$
is the required sample size.
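Criteria 2 through 6 all reduce to the same pattern of computing a base size $n_e$ and shrinking it by the finite population correction; a consolidated sketch (function names and argument values are illustrative, and $S$ and $C = S/\bar{Y}$ are assumed known beforehand):

```python
# Consolidated sketch of criteria 2-6; all values passed are illustrative.
import math
from statistics import NormalDist

def z_of(alpha: float) -> float:
    return NormalDist().inv_cdf(1 - alpha / 2)       # Z_{alpha/2}

def fpc_round(ne: float, N: int) -> int:
    return math.ceil(ne / (1 + ne / N))              # shrink by the fpc

def n_for_error(S, e, N, alpha=0.05):                # 2. estimation error e
    return fpc_round((z_of(alpha) * S / e) ** 2, N)

def n_for_width(S, W, N, alpha=0.05):                # 3. CI width W
    return fpc_round(4 * z_of(alpha) ** 2 * S ** 2 / W ** 2, N)

def n_for_cv(C, C0, N):                              # 4. coefficient of variation
    return fpc_round((C / C0) ** 2, N)

def n_for_relative_error(C, R, N, alpha=0.05):       # 5. relative error R
    return fpc_round((z_of(alpha) * C / R) ** 2, N)

def n_for_cost(C_total, C0, C1):                     # 6. budget C = C0 + n*C1
    return math.floor((C_total - C0) / C1)

print(n_for_error(S=8.0, e=1.5, N=1000),
      n_for_cv(C=0.9, C0=0.1, N=1000),
      n_for_cost(C_total=5000, C0=800, C1=35))
```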