Chapter2 Sampling Simple Random Sampling

1. Simple random sampling (SRS) is a method where every sampling unit has an equal chance of being selected. Samples can be drawn with or without replacement. 2. SRS without replacement (SRSWOR) selects units one by one without replacing them, while SRS with replacement (SRSWR) replaces selected units back into the population. 3. The probability of selecting a simple random sample of size n from a population of size N is 1/C(N,n) for SRSWOR and 1/N^n for SRSWR.

Chapter -2

Simple Random Sampling

Simple random sampling (SRS) is a method of selecting a sample of n sampling units out of a population of N sampling units such that every sampling unit has an equal chance of being chosen.

The samples can be drawn in two possible ways.

• Without replacement: the units, once chosen, are not placed back in the population.
• With replacement: the selected units are placed back in the population before the next draw.

1. Simple random sampling without replacement (SRSWOR):

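SRSWOR is a method of selection of n units out of the N units one by one such that at any stage of
selection, any one of the remaining units has the same chance of being selected. As shown later in this chapter, the unconditional probability that a given unit is selected at any particular draw is 1/N.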

2. Simple random sampling with replacement (SRSWR):

SRSWR is a method of selection of n units out of the N units one by one such that at each stage of
selection, each unit has an equal chance of being selected, i.e., 1/ N .

Procedure of selection of a random sample:

The procedure of selection of a random sample follows the following steps:
1. Identify the N units in the population with the numbers 1 to N .
2. Choose any random number arbitrarily in the random number table and start reading numbers.
3. Choose the sampling unit whose serial number corresponds to the random number drawn
from the table of random numbers.
4. In the case of SRSWR, all the random numbers are accepted even if repeated more than once.
In the case of SRSWOR, if any random number is repeated, then it is ignored, and more
numbers are drawn.

Sampling Theory| Chapter 2 | Simple Random Sampling | Shalabh, IIT Kanpur

Such a process can be implemented through programming using the discrete uniform
distribution. Any number between 1 and N can be generated from this distribution, and the
corresponding unit can be selected into the sample by associating an index with each sampling unit.
Many statistical software packages, such as R and SAS, have built-in functions for drawing a sample using
SRSWOR or SRSWR.
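The selection procedure described above can be sketched in Python as follows. This is only an illustration: the unit labels, the seed and the function names `draw_srswor` and `draw_srswr` are hypothetical, and the standard library's `random.randint` plays the role of the random number table.

```python
import random

def draw_srswor(population, n, rng):
    """Draw n distinct units; a repeated random number is ignored and redrawn."""
    N = len(population)
    chosen = []
    while len(chosen) < n:
        idx = rng.randint(1, N)      # discrete uniform on 1, 2, ..., N
        if idx not in chosen:        # SRSWOR: skip serial numbers already drawn
            chosen.append(idx)
    return [population[i - 1] for i in chosen]

def draw_srswr(population, n, rng):
    """Draw n units; every random number is accepted, even if repeated."""
    N = len(population)
    return [population[rng.randint(1, N) - 1] for _ in range(n)]

rng = random.Random(2024)                      # fixed seed, for reproducibility only
units = ["u%d" % i for i in range(1, 11)]      # hypothetical population, N = 10
wor_sample = draw_srswor(units, 4, rng)
wr_sample = draw_srswr(units, 4, rng)
```

In practice the built-in `random.sample` (without replacement) and `random.choices` (with replacement) do the same job; the explicit loops are kept here only to mirror the steps listed above.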

Notations:
The following notations will be used in further notes:

N: Number of sampling units in the population (Population size).

n: Number of sampling units in the sample (sample size)
Y: The characteristic under consideration
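$Y_i$: Value of the characteristic for the $i$th unit of the population

$\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$: sample mean

$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$: population mean

$$S^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 = \frac{1}{N-1}\left(\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right)$$

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(Y_i - \bar{Y})^2 = \frac{1}{N}\left(\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\right)$$

$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2 = \frac{1}{n-1}\left(\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right)$$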

Probability of drawing a sample :

1. SRSWOR:
If n units are selected by SRSWOR, the total number of possible samples is $\binom{N}{n}$.
So, the probability of selecting any one of these samples is $\dfrac{1}{\binom{N}{n}}$.

Note that a unit can be selected at any one of the n draws. Let $u_i$ be the $i$th unit selected in the sample.
This unit can be selected in the sample either at the first draw, second draw, …, or $n$th draw.
Let $P_j(i)$ denote the probability of selection of $u_i$ at the $j$th draw, $j = 1, 2, \ldots, n$. Then the probability that $u_i$ is included in the sample is

$$P(i) = P_1(i) + P_2(i) + \cdots + P_n(i) = \frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N} \;(n \text{ times}) = \frac{n}{N}.$$

Now if $u_1, u_2, \ldots, u_n$ are the n units selected in the sample, then the probability of their selection is

$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n).$$

Note that when the second unit is to be selected, there are (n – 1) units left to be selected in the
sample from the population of (N – 1) units. Similarly, when the third unit is to be selected, there are
(n – 2) units left to be selected in the sample from the population of (N – 2) units and so on.
If $P(u_1) = \dfrac{n}{N}$, then

$$P(u_2) = \frac{n-1}{N-1}, \; \ldots, \; P(u_n) = \frac{1}{N-n+1}.$$

Thus

$$P(u_1, u_2, \ldots, u_n) = \frac{n}{N}\cdot\frac{n-1}{N-1}\cdot\frac{n-2}{N-2}\cdots\frac{1}{N-n+1} = \frac{1}{\binom{N}{n}}.$$

Alternative approach:
The probability of drawing a sample in SRSWOR can alternatively be found as follows:

Let $u_{i(k)}$ denote the $i$th unit drawn at the $k$th draw. Note that the $i$th unit can be any unit out of the N
units. Then $s_o = (u_{i(1)}, u_{i(2)}, \ldots, u_{i(n)})$ is an ordered sample in which the order in which the units
are drawn, i.e., $u_{i(1)}$ drawn at the first draw, $u_{i(2)}$ drawn at the second draw and so on, is also
considered. The probability of selection of such an ordered sample is

$$P(s_o) = P(u_{i(1)})\,P(u_{i(2)} \mid u_{i(1)})\,P(u_{i(3)} \mid u_{i(1)} u_{i(2)}) \cdots P(u_{i(n)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(n-1)}).$$

Here $P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)})$ is the probability of drawing $u_{i(k)}$ at the $k$th draw given that
$u_{i(1)}, u_{i(2)}, \ldots, u_{i(k-1)}$ have already been drawn in the first (k – 1) draws.

Such a probability is obtained as

$$P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)}) = \frac{1}{N-k+1}.$$

So

$$P(s_o) = \prod_{k=1}^{n} \frac{1}{N-k+1} = \frac{(N-n)!}{N!}.$$

The number of orders in which a given sample of size n can be drawn $= n!$

Probability of drawing a sample in a given order $= \dfrac{(N-n)!}{N!}$

So the probability of drawing a sample in which the order of units in which they are drawn is irrelevant is

$$n!\,\frac{(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$$
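The counting argument for SRSWOR can be checked by brute force on a small case. The sketch below is illustrative only: the sizes N = 5 and n = 3 and the target sample {1, 2, 3} are arbitrary choices, not part of the notes. It enumerates every ordered sample and confirms that an unordered sample has probability $1/\binom{N}{n}$.

```python
from fractions import Fraction
from itertools import permutations
from math import comb, factorial

N, n = 5, 3                                           # hypothetical small case
ordered = list(permutations(range(1, N + 1), n))      # all ordered samples

# Each ordered sample has probability (N - n)! / N!
p_ordered = Fraction(factorial(N - n), factorial(N))

# The unordered sample {1, 2, 3} arises from n! different orderings
orderings = [s for s in ordered if set(s) == {1, 2, 3}]
p_unordered = len(orderings) * p_ordered

total_probability = len(ordered) * p_ordered
```

Here `p_unordered` equals `Fraction(1, comb(N, n))`, and the ordered-sample probabilities sum to one.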

2. SRSWR
When n units are selected with SRSWR, the total number of possible samples is $N^n$.
The probability of drawing a sample is $\dfrac{1}{N^n}$.

Alternatively, let $u_i$ be the $i$th unit selected in the sample. This unit can be selected in the sample
either at the first draw, second draw, …, or $n$th draw. At any stage, there are always N units in the
population in the case of SRSWR, so the probability of selection of $u_i$ at any stage is 1/N for all
i = 1, 2, …, n. Then the probability of selection of n units $u_1, u_2, \ldots, u_n$ in the sample is

$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{1}{N}\cdot\frac{1}{N}\cdots\frac{1}{N} = \frac{1}{N^n}.$$

Probability of drawing a unit
1. SRSWOR
Let $A_\ell$ denote the event that a particular unit $u_j$ is not selected at the $\ell$th draw. The probability of
selecting, say, the $j$th unit at the $k$th draw is

$$\begin{aligned}
P(\text{selection of } u_j \text{ at } k\text{th draw}) &= P(A_1 \cap A_2 \cap \cdots \cap A_{k-1} \cap \bar{A}_k) \\
&= P(A_1)\,P(A_2 \mid A_1)\,P(A_3 \mid A_1 A_2) \cdots P(A_{k-1} \mid A_1 A_2 \cdots A_{k-2})\,P(\bar{A}_k \mid A_1 A_2 \cdots A_{k-1}) \\
&= \left(1 - \frac{1}{N}\right)\left(1 - \frac{1}{N-1}\right)\left(1 - \frac{1}{N-2}\right)\cdots\left(1 - \frac{1}{N-k+2}\right)\frac{1}{N-k+1} \\
&= \frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-k+1}{N-k+2}\cdot\frac{1}{N-k+1} \\
&= \frac{1}{N}.
\end{aligned}$$

2. SRSWR

$$P[\text{selection of } u_j \text{ at } k\text{th draw}] = \frac{1}{N}.$$
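The result that a given unit is selected at the $k$th draw with probability 1/N under SRSWOR can also be confirmed by enumeration. In the sketch below the sizes N = 6 and n = 4 and the helper name `prob_unit_at_draw` are hypothetical; every ordered sample without replacement is treated as equally likely.

```python
from fractions import Fraction
from itertools import permutations

N, n = 6, 4                                          # hypothetical small case
ordered = list(permutations(range(1, N + 1), n))     # equally likely ordered samples

def prob_unit_at_draw(j, k):
    """Probability that unit j is the one selected at draw k (1-based)."""
    hits = sum(1 for s in ordered if s[k - 1] == j)
    return Fraction(hits, len(ordered))

probs = {(j, k): prob_unit_at_draw(j, k)
         for j in range(1, N + 1) for k in range(1, n + 1)}

# Summing over the n draws gives the inclusion probability n / N
inclusion_unit_1 = sum(probs[(1, k)] for k in range(1, n + 1))
```

Every entry of `probs` equals 1/N, and adding the draw-wise probabilities for one unit reproduces the inclusion probability n/N.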

Estimation of population mean and population variance

One of the main objectives after the selection of a sample is to learn about the tendency of the data to
cluster around a central value and about the scatter of the data around that central value. Among
various measures of central tendency and dispersion, the popular choices are the arithmetic mean and the
variance. So the population mean and population variability are generally measured by the arithmetic
mean (or weighted arithmetic mean) and the variance, respectively. There are various popular estimators
for estimating the population mean and population variance. Among them, the sample arithmetic mean
and the sample variance are more popular than other estimators. One of the reasons to use these estimators
is that they possess nice statistical properties. Moreover, they are also obtained through well-established
statistical estimation procedures like maximum likelihood estimation, least squares
estimation, method of moments, etc., under several standard statistical distributions. One may also
consider other measures like the median, mode, geometric mean or harmonic mean for measuring the
central tendency, and mean deviation, absolute deviation, Pitman nearness, etc. for measuring the
dispersion. Numerical procedures like bootstrapping can be used to study the properties of such estimators.

1. Estimation of population mean
Let us consider the sample arithmetic mean $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ as an estimator of the population mean
$\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ and verify that $\bar{y}$ is an unbiased estimator of $\bar{Y}$ under the two cases.

SRSWOR
Let $t_i = \sum_{i=1}^{n} y_i$ denote the total of a sample, where in the sums below the outer index runs over the $\binom{N}{n}$ possible samples. Then

$$\begin{aligned}
E(\bar{y}) &= \frac{1}{n} E\left(\sum_{i=1}^{n} y_i\right) \\
&= \frac{1}{n} E(t_i) \\
&= \frac{1}{n}\left[\frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}} t_i\right] \\
&= \frac{1}{n}\,\frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}}\left(\sum_{i=1}^{n} y_i\right).
\end{aligned}$$

When n units are sampled from N units without replacement, each unit of the population can occur
with other units selected out of the remaining $(N-1)$ units in the population, and each unit occurs in
$\binom{N-1}{n-1}$ of the $\binom{N}{n}$ possible samples. So

$$\sum_{i=1}^{\binom{N}{n}}\left(\sum_{i=1}^{n} y_i\right) = \binom{N-1}{n-1}\sum_{i=1}^{N} Y_i.$$

Now

$$\begin{aligned}
E(\bar{y}) &= \frac{(N-1)!}{(n-1)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{n\,N!}\sum_{i=1}^{N} Y_i \\
&= \frac{1}{N}\sum_{i=1}^{N} Y_i \\
&= \bar{Y}.
\end{aligned}$$

Thus $\bar{y}$ is an unbiased estimator of $\bar{Y}$.

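Alternatively, the following approach can also be adopted to show the unbiasedness property. Let
$P_j(i) = \dfrac{1}{N}$ denote the probability of selection of the $i$th unit at the $j$th stage. Then

$$\begin{aligned}
E(\bar{y}) &= \frac{1}{n}\sum_{j=1}^{n} E(y_j) \\
&= \frac{1}{n}\sum_{j=1}^{n}\left[\sum_{i=1}^{N} Y_i P_j(i)\right] \\
&= \frac{1}{n}\sum_{j=1}^{n}\left[\sum_{i=1}^{N} Y_i \cdot \frac{1}{N}\right] \\
&= \frac{1}{n}\sum_{j=1}^{n} \bar{Y} \\
&= \bar{Y}.
\end{aligned}$$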

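SRSWR

$$\begin{aligned}
E(\bar{y}) &= \frac{1}{n} E\left(\sum_{i=1}^{n} y_i\right) \\
&= \frac{1}{n}\sum_{i=1}^{n} E(y_i) \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(Y_1 P_1 + Y_2 P_2 + \cdots + Y_N P_N\right) \\
&= \frac{1}{n}\sum_{i=1}^{n}\left(Y_1\frac{1}{N} + Y_2\frac{1}{N} + \cdots + Y_N\frac{1}{N}\right) \\
&= \frac{1}{n}\sum_{i=1}^{n} \bar{Y} \\
&= \bar{Y},
\end{aligned}$$

where $P_i = \dfrac{1}{N}$ for all $i = 1, 2, \ldots, N$ is the probability of selection of a unit. Thus $\bar{y}$ is an unbiased
estimator of the population mean under SRSWR also.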
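Unbiasedness under both schemes can be verified exactly on a toy population by averaging the sample mean over all equally likely samples. The population values below are made up purely for illustration.

```python
from fractions import Fraction
from itertools import combinations, product

Y = [3, 7, 8, 12, 20]            # hypothetical population values
N, n = len(Y), 2
Ybar = Fraction(sum(Y), N)       # population mean

def sample_mean(sample):
    return Fraction(sum(sample), len(sample))

# SRSWOR: the C(N, n) unordered samples are equally likely
wor_means = [sample_mean(s) for s in combinations(Y, n)]
E_wor = sum(wor_means) / len(wor_means)

# SRSWR: the N**n ordered samples are equally likely
wr_means = [sample_mean(s) for s in product(Y, repeat=n)]
E_wr = sum(wr_means) / len(wr_means)
```

Both `E_wor` and `E_wr` equal `Ybar` exactly, since the arithmetic is done with `Fraction`.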

Variance of the estimate
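Assume that each observation has some variance $\sigma^2$. Then

$$\begin{aligned}
V(\bar{y}) &= E(\bar{y} - \bar{Y})^2 \\
&= E\left[\frac{1}{n}\sum_{i=1}^{n}(y_i - \bar{Y})\right]^2 \\
&= E\left[\frac{1}{n^2}\sum_{i=1}^{n}(y_i - \bar{Y})^2 + \frac{1}{n^2}\sum_{i \ne j}^{n}\sum^{n}(y_i - \bar{Y})(y_j - \bar{Y})\right] \\
&= \frac{1}{n^2}\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + \frac{1}{n^2}\sum_{i \ne j}^{n}\sum^{n} E(y_i - \bar{Y})(y_j - \bar{Y}) \\
&= \frac{1}{n^2}\sum_{i=1}^{n}\sigma^2 + \frac{K}{n^2} \\
&= \frac{N-1}{Nn} S^2 + \frac{K}{n^2}
\end{aligned}$$

where $K = \sum_{i \ne j}^{n}\sum^{n} E(y_i - \bar{Y})(y_j - \bar{Y})$, assuming that each observation has variance $\sigma^2$. Now we find
K under the setups of SRSWR and SRSWOR.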

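SRSWOR

$$K = \sum_{i \ne j}^{n}\sum^{n} E(y_i - \bar{Y})(y_j - \bar{Y}).$$

Consider

$$E(y_i - \bar{Y})(y_j - \bar{Y}) = \frac{1}{N(N-1)}\sum_{k \ne \ell}^{N}\sum^{N}(Y_k - \bar{Y})(Y_\ell - \bar{Y}).$$

Since

$$\left[\sum_{k=1}^{N}(Y_k - \bar{Y})\right]^2 = \sum_{k=1}^{N}(Y_k - \bar{Y})^2 + \sum_{k \ne \ell}^{N}\sum^{N}(Y_k - \bar{Y})(Y_\ell - \bar{Y}),$$

$$0 = (N-1)S^2 + \sum_{k \ne \ell}^{N}\sum^{N}(Y_k - \bar{Y})(Y_\ell - \bar{Y}),$$

we have

$$\frac{1}{N(N-1)}\sum_{k \ne \ell}^{N}\sum^{N}(Y_k - \bar{Y})(Y_\ell - \bar{Y}) = \frac{1}{N(N-1)}\left[-(N-1)S^2\right] = -\frac{S^2}{N}.$$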

Thus $K = -n(n-1)\dfrac{S^2}{N}$ and so, substituting the value of K, the variance of $\bar{y}$ under SRSWOR is

$$\begin{aligned}
V(\bar{y}_{WOR}) &= \frac{N-1}{Nn} S^2 - \frac{1}{n^2}\,n(n-1)\frac{S^2}{N} \\
&= \frac{N-n}{Nn} S^2.
\end{aligned}$$
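As a numerical check of the SRSWOR variance formula, the sketch below enumerates every sample of size n from a small made-up population and compares the exact variance of the sample mean with $\frac{N-n}{Nn}S^2$. The population values are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

Y = [3, 7, 8, 12, 20]                                # hypothetical population values
N, n = len(Y), 2
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)       # population S^2 (divisor N - 1)

# Exact variance of the sample mean over all C(N, n) equally likely samples
means = [Fraction(sum(s), n) for s in combinations(Y, n)]
var_enumerated = sum((m - Ybar) ** 2 for m in means) / len(means)

# Formula derived above for SRSWOR
var_formula = Fraction(N - n, N * n) * S2
```

For this population both quantities equal 249/20, i.e. 12.45.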

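SRSWR

$$\begin{aligned}
K &= \sum_{i \ne j}^{n}\sum^{n} E(y_i - \bar{Y})(y_j - \bar{Y}) \\
&= \sum_{i \ne j}^{n}\sum^{n} E(y_i - \bar{Y})\,E(y_j - \bar{Y}) \\
&= 0
\end{aligned}$$

because the $i$th and $j$th draws $(i \ne j)$ are independent.
Thus the variance of $\bar{y}$ under SRSWR is

$$V(\bar{y}_{WR}) = \frac{N-1}{Nn} S^2.$$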
It is to be noted that if N is infinite (or large enough), then

$$V(\bar{y}) = \frac{S^2}{n}$$

in both the cases of SRSWOR and SRSWR. So the factor $\dfrac{N-n}{N}$ is responsible for changing the
variance of $\bar{y}$ when the sample is drawn from a finite population in comparison to an infinite
population. This is why $\dfrac{N-n}{N}$ is called the finite population correction (fpc). It may be noted that
$\dfrac{N-n}{N} = 1 - \dfrac{n}{N}$, so $\dfrac{N-n}{N}$ is close to 1 if the ratio of the sample size to the population size, $\dfrac{n}{N}$, is very small or
negligible. The term $\dfrac{n}{N}$ is called the sampling fraction. In practice, the fpc can be ignored whenever
$\dfrac{n}{N} < 5\%$ and, for many purposes, even if it is as high as 10%. Ignoring the fpc will result in the
overestimation of the variance of $\bar{y}$.

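Efficiency of $\bar{y}$ under SRSWOR over SRSWR

$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn} S^2$$

$$\begin{aligned}
V(\bar{y}_{WR}) &= \frac{N-1}{Nn} S^2 \\
&= \frac{N-n}{Nn} S^2 + \frac{n-1}{Nn} S^2 \\
&= V(\bar{y}_{WOR}) + \text{a positive quantity.}
\end{aligned}$$

Thus

$$V(\bar{y}_{WR}) > V(\bar{y}_{WOR})$$

and so, SRSWOR is more efficient than SRSWR.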
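The comparison can be illustrated by enumerating all $N^n$ with-replacement samples of a small made-up population and setting the exact SRSWR variance against the SRSWOR formula. The population values are hypothetical.

```python
from fractions import Fraction
from itertools import product

Y = [3, 7, 8, 12, 20]                                # hypothetical population values
N, n = len(Y), 2
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)

# Exact variance of the sample mean over all N**n equally likely SRSWR samples
wr_means = [Fraction(sum(s), n) for s in product(Y, repeat=n)]
var_wr = sum((m - Ybar) ** 2 for m in wr_means) / len(wr_means)

var_wor = Fraction(N - n, N * n) * S2                # SRSWOR formula
excess = var_wr - var_wor                            # the "positive quantity"
```

Here `var_wr` matches $\frac{N-1}{Nn}S^2$ and `excess` matches $\frac{n-1}{Nn}S^2$, which is positive whenever n > 1.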

Estimation of variance from a sample

Since the expressions for the variance of the sample mean involve $S^2$, which is based on population
values, these expressions cannot be used in real-life applications. In order to estimate the variance
of $\bar{y}$ on the basis of a sample, an estimator of $S^2$ (or equivalently $\sigma^2$) is needed. Consider $s^2$ as an
estimator of $S^2$ (or $\sigma^2$); we investigate its bias in the cases of SRSWOR and
SRSWR.

Consider

$$\begin{aligned}
s^2 &= \frac{1}{n-1}\sum_{i=1}^{n}(y_i - \bar{y})^2 \\
&= \frac{1}{n-1}\sum_{i=1}^{n}\left[(y_i - \bar{Y}) - (\bar{y} - \bar{Y})\right]^2 \\
&= \frac{1}{n-1}\left[\sum_{i=1}^{n}(y_i - \bar{Y})^2 - n(\bar{y} - \bar{Y})^2\right]
\end{aligned}$$

$$\begin{aligned}
E(s^2) &= \frac{1}{n-1}\left[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 - nE(\bar{y} - \bar{Y})^2\right] \\
&= \frac{1}{n-1}\left[\sum_{i=1}^{n} Var(y_i) - n\,Var(\bar{y})\right] \\
&= \frac{1}{n-1}\left[n\sigma^2 - n\,Var(\bar{y})\right].
\end{aligned}$$

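In the case of SRSWOR

$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn} S^2$$

and so

$$\begin{aligned}
E(s^2) &= \frac{n}{n-1}\left[\sigma^2 - \frac{N-n}{Nn} S^2\right] \\
&= \frac{n}{n-1}\left[\frac{N-1}{N} S^2 - \frac{N-n}{Nn} S^2\right] \\
&= S^2.
\end{aligned}$$

In the case of SRSWR

$$V(\bar{y}_{WR}) = \frac{N-1}{Nn} S^2$$

and so

$$\begin{aligned}
E(s^2) &= \frac{n}{n-1}\left[\sigma^2 - \frac{N-1}{Nn} S^2\right] \\
&= \frac{n}{n-1}\left[\frac{N-1}{N} S^2 - \frac{N-1}{Nn} S^2\right] \\
&= \frac{N-1}{N} S^2 \\
&= \sigma^2.
\end{aligned}$$

Hence

$$E(s^2) = \begin{cases} S^2 & \text{in SRSWOR} \\ \sigma^2 & \text{in SRSWR.} \end{cases}$$

An unbiased estimate of $Var(\bar{y})$ is

$$\hat{V}(\bar{y}_{WOR}) = \frac{N-n}{Nn} s^2$$

in the case of SRSWOR and

$$\hat{V}(\bar{y}_{WR}) = \frac{N-1}{Nn}\cdot\frac{N}{N-1} s^2 = \frac{s^2}{n}$$

in the case of SRSWR.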
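The two expectations of $s^2$ can be confirmed exactly by enumeration. The sketch below uses a made-up population; `sample_var` is a hypothetical helper computing $s^2$ with divisor n − 1.

```python
from fractions import Fraction
from itertools import combinations, product

Y = [3, 7, 8, 12, 20]                                    # hypothetical population values
N, n = len(Y), 2
Ybar = Fraction(sum(Y), N)
S2 = sum((y - Ybar) ** 2 for y in Y) / (N - 1)           # divisor N - 1
sigma2 = sum((y - Ybar) ** 2 for y in Y) / N             # divisor N

def sample_var(sample):
    """Sample variance s^2 with divisor n - 1."""
    m = Fraction(sum(sample), len(sample))
    return sum((y - m) ** 2 for y in sample) / (len(sample) - 1)

wor = [sample_var(s) for s in combinations(Y, n)]        # SRSWOR samples
wr = [sample_var(s) for s in product(Y, repeat=n)]       # SRSWR samples
E_s2_wor = sum(wor) / len(wor)
E_s2_wr = sum(wr) / len(wr)
```

`E_s2_wor` equals `S2` and `E_s2_wr` equals `sigma2`, in line with the result above.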

Standard errors
The standard error of $\bar{y}$ is defined as $\sqrt{Var(\bar{y})}$.

In order to estimate the standard error, one simple option is to consider the square root of the estimate
of the variance of the sample mean.

• under SRSWOR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\dfrac{N-n}{Nn}}\; s$.

• under SRSWR, a possible estimator is $\hat{\sigma}(\bar{y}) = \dfrac{s}{\sqrt{n}}$, the square root of $\hat{V}(\bar{y}_{WR}) = \dfrac{s^2}{n}$.

It is to be noted that this estimator does not possess the same properties as $\widehat{Var}(\bar{y})$.
The reason is that if $\hat{\theta}$ is an unbiased estimator of $\theta$, then $\sqrt{\hat{\theta}}$ is not necessarily an unbiased estimator of $\sqrt{\theta}$.

In fact, $\hat{\sigma}(\bar{y})$ is a negatively biased estimator under SRSWOR.

The approximate expressions for the large N case are as follows:

(Reference: Sampling Theory of Surveys with Applications, P.V. Sukhatme, B.V. Sukhatme, S.
Sukhatme, C. Asok, Iowa State University Press and Indian Society of Agricultural Statistics,
1984, India)

Consider $s$ as an estimator of $S$.
Let

$$s^2 = S^2 + \varepsilon \quad \text{with } E(\varepsilon) = 0, \; E(\varepsilon^2) = Var(s^2).$$

Write

$$\begin{aligned}
s &= (S^2 + \varepsilon)^{1/2} \\
&= S\left(1 + \frac{\varepsilon}{S^2}\right)^{1/2} \\
&= S\left(1 + \frac{\varepsilon}{2S^2} - \frac{\varepsilon^2}{8S^4} + \cdots\right)
\end{aligned}$$

assuming $\varepsilon$ will be small as compared to $S^2$; as n becomes large, the probability of such an
event approaches one. Neglecting the powers of $\varepsilon$ higher than two and taking expectation, we have

$$E(s) = \left[1 - \frac{Var(s^2)}{8S^4}\right] S$$

where

$$Var(s^2) = \frac{2S^4}{n-1}\left[1 + \left(\frac{n-1}{2n}\right)(\beta_2 - 3)\right] \quad \text{for large } N,$$

$$\mu_j = \frac{1}{N}\sum_{i=1}^{N}(Y_i - \bar{Y})^j,$$

$$\beta_2 = \frac{\mu_4}{S^4}: \text{coefficient of kurtosis.}$$

Thus

$$E(s) = S\left[1 - \frac{1}{4(n-1)} - \frac{\beta_2 - 3}{8n}\right]$$

$$\begin{aligned}
Var(s) &= S^2 - S^2\left[1 - \frac{1}{8}\frac{Var(s^2)}{S^4}\right]^2 \\
&= \frac{Var(s^2)}{4S^2} \\
&= \frac{S^2}{2(n-1)}\left[1 + \left(\frac{n-1}{2n}\right)(\beta_2 - 3)\right],
\end{aligned}$$

where the second line again neglects higher-order terms.
Note that for a normal distribution, $\beta_2 = 3$ and we obtain

$$Var(s) = \frac{S^2}{2(n-1)}.$$

Both $Var(s)$ and $Var(s^2)$ are inflated due to nonnormality to the same extent, by the inflation factor

$$\left[1 + \left(\frac{n-1}{2n}\right)(\beta_2 - 3)\right]$$

and this does not depend on the coefficient of skewness.

This is an important result to be kept in mind while determining the sample size in which it is
assumed that $S^2$ is known. If the inflation factor is ignored and the population is non-normal, then the
reliability of $s^2$ may be misleading.

Alternative approach:
The results for the unbiasedness property and the variance of the sample mean can also be proved in
an alternative way as follows:

(i) SRSWOR
With the $i$th unit of the population, we associate a random variable $a_i$ defined as follows:

$$a_i = \begin{cases} 1, & \text{if the } i\text{th unit occurs in the sample} \\ 0, & \text{if the } i\text{th unit does not occur in the sample} \end{cases} \qquad (i = 1, 2, \ldots, N).$$

Then,

$$E(a_i) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$

$$E(a_i^2) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$

$$E(a_i a_j) = 1 \times \text{Probability that the } i\text{th and } j\text{th units are included in the sample} = \frac{n(n-1)}{N(N-1)}, \quad i \ne j = 1, 2, \ldots, N.$$

From these results, we can obtain

$$Var(a_i) = E(a_i^2) - \left(E(a_i)\right)^2 = \frac{n(N-n)}{N^2}, \quad i = 1, 2, \ldots, N,$$

$$Cov(a_i, a_j) = E(a_i a_j) - E(a_i)E(a_j) = -\frac{n(N-n)}{N^2(N-1)}, \quad i \ne j = 1, 2, \ldots, N.$$

We can rewrite the sample mean as

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i Y_i.$$

Then

$$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} E(a_i)\,Y_i = \bar{Y}$$

and

$$Var(\bar{y}) = \frac{1}{n^2} Var\left(\sum_{i=1}^{N} a_i Y_i\right) = \frac{1}{n^2}\left[\sum_{i=1}^{N} Var(a_i)\,Y_i^2 + \sum_{i \ne j}^{N}\sum^{N} Cov(a_i, a_j)\,Y_i Y_j\right].$$

Substituting the values of $Var(a_i)$ and $Cov(a_i, a_j)$ in the expression of $Var(\bar{y})$ and simplifying, we
get

$$Var(\bar{y}) = \frac{N-n}{Nn} S^2.$$

To show that $E(s^2) = S^2$, consider

$$s^2 = \frac{1}{n-1}\left[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\right] = \frac{1}{n-1}\left[\sum_{i=1}^{N} a_i Y_i^2 - n\bar{y}^2\right].$$

Hence, taking expectation, we get

$$E(s^2) = \frac{1}{n-1}\left[\sum_{i=1}^{N} E(a_i)\,Y_i^2 - n\left\{Var(\bar{y}) + \bar{Y}^2\right\}\right].$$

Substituting the values of $E(a_i)$ and $Var(\bar{y})$ in this expression and simplifying, we get $E(s^2) = S^2$.

(ii) SRSWR
Let a random variable $a_i$ associated with the $i$th unit of the population denote the number of times
the $i$th unit occurs in the sample, $i = 1, 2, \ldots, N$. So $a_i$ assumes values 0, 1, 2, …, n. The joint distribution
of $a_1, a_2, \ldots, a_N$ is the multinomial distribution given by

$$P(a_1, a_2, \ldots, a_N) = \frac{n!}{\prod_{i=1}^{N} a_i!}\cdot\frac{1}{N^n}$$

where $\sum_{i=1}^{N} a_i = n$. For this multinomial distribution, we have

$$E(a_i) = \frac{n}{N},$$

$$Var(a_i) = \frac{n(N-1)}{N^2}, \quad i = 1, 2, \ldots, N,$$

$$Cov(a_i, a_j) = -\frac{n}{N^2}, \quad i \ne j = 1, 2, \ldots, N.$$

We rewrite the sample mean as

$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i Y_i.$$

Hence, taking the expectation of $\bar{y}$ and substituting the value of $E(a_i) = n/N$, we obtain that

$$E(\bar{y}) = \bar{Y}.$$

Further,

$$Var(\bar{y}) = \frac{1}{n^2}\left[\sum_{i=1}^{N} Var(a_i)\,Y_i^2 + \sum_{i \ne j}^{N}\sum^{N} Cov(a_i, a_j)\,Y_i Y_j\right].$$

Substituting the values of $Var(a_i) = n(N-1)/N^2$ and $Cov(a_i, a_j) = -n/N^2$ and simplifying, we get

$$Var(\bar{y}) = \frac{N-1}{Nn} S^2.$$

To prove that $E(s^2) = \dfrac{N-1}{N} S^2 = \sigma^2$ in SRSWR, consider

$$(n-1)s^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 = \sum_{i=1}^{N} a_i Y_i^2 - n\bar{y}^2,$$

$$\begin{aligned}
(n-1)E(s^2) &= \sum_{i=1}^{N} E(a_i)\,Y_i^2 - n\left\{Var(\bar{y}) + \bar{Y}^2\right\} \\
&= \frac{n}{N}\sum_{i=1}^{N} Y_i^2 - n\cdot\frac{N-1}{nN} S^2 - n\bar{Y}^2 \\
&= \frac{(n-1)(N-1)}{N} S^2,
\end{aligned}$$

$$E(s^2) = \frac{N-1}{N} S^2 = \sigma^2.$$

Estimator of population total:

Sometimes, it is also of interest to estimate the population total, e.g., total household income, total
expenditures, etc. Let $Y_T$ denote the population total

$$Y_T = \sum_{i=1}^{N} Y_i = N\bar{Y}$$

which can be estimated by

$$\hat{Y}_T = N\hat{\bar{Y}} = N\bar{y}.$$

Obviously

$$E(\hat{Y}_T) = N E(\bar{y}) = N\bar{Y}.$$
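$$Var(\hat{Y}_T) = N^2\,Var(\bar{y}) = \begin{cases} N^2\left(\dfrac{N-n}{Nn}\right) S^2 = \dfrac{N(N-n)}{n} S^2 & \text{for SRSWOR} \\[2ex] N^2\left(\dfrac{N-1}{Nn}\right) S^2 = \dfrac{N(N-1)}{n} S^2 & \text{for SRSWR} \end{cases}$$

and the estimates of the variance of $\hat{Y}_T$ are

$$\widehat{Var}(\hat{Y}_T) = \begin{cases} \dfrac{N(N-n)}{n} s^2 & \text{for SRSWOR} \\[2ex] \dfrac{N^2}{n} s^2 & \text{for SRSWR.} \end{cases}$$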

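A minimal sketch of the total estimator and its estimated variance is given below. The function name `estimate_total`, the sample values and the population size N = 20 are all hypothetical, and the SRSWR branch assumes the variance estimate $N^2 s^2 / n$, which follows from $\hat{V}(\bar{y}_{WR}) = s^2/n$.

```python
from fractions import Fraction

def estimate_total(sample, N, with_replacement=False):
    """Return (estimated population total, estimated variance of that estimate)."""
    n = len(sample)
    ybar = Fraction(sum(sample), n)
    s2 = sum((y - ybar) ** 2 for y in sample) / (n - 1)   # sample variance s^2
    total_hat = N * ybar                                  # N times the sample mean
    if with_replacement:
        var_hat = Fraction(N * N, n) * s2                 # SRSWR: N^2 s^2 / n
    else:
        var_hat = Fraction(N * (N - n), n) * s2           # SRSWOR: N (N - n) s^2 / n
    return total_hat, var_hat

sample = [4, 6, 8, 10]                                    # hypothetical sample values
total_wor, var_wor = estimate_total(sample, N=20)
total_wr, var_wr = estimate_total(sample, N=20, with_replacement=True)
```

The point estimate is the same under both schemes; only the estimated variance differs, and it is smaller under SRSWOR.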
Confidence limits for the population mean

Now we construct the $100(1-\alpha)\%$ confidence interval for the population mean. Assume that the
population is normally distributed $N(\mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$. Then $\dfrac{\bar{y} - \bar{Y}}{\sqrt{Var(\bar{y})}}$ follows
$N(0, 1)$ when $\sigma^2$ is known. If $\sigma^2$ is unknown and is estimated from the sample, then $\dfrac{\bar{y} - \bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}}$
follows a $t$-distribution with $(n-1)$ degrees of freedom. When $\sigma^2$ is known, then the $100(1-\alpha)\%$
confidence interval is given by

$$P\left[-Z_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{Var(\bar{y})}} \le Z_{\alpha/2}\right] = 1 - \alpha$$

or

$$P\left[\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\right] = 1 - \alpha$$

and the confidence limits are

$$\left(\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})},\; \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\right)$$

where $Z_{\alpha/2}$ denotes the upper $100(\alpha/2)\%$ point of the $N(0, 1)$ distribution. Similarly, when $\sigma^2$ is unknown,
then the $100(1-\alpha)\%$ confidence interval is

$$P\left[-t_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}} \le t_{\alpha/2}\right] = 1 - \alpha$$

or

$$P\left[\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\right] = 1 - \alpha$$

and the confidence limits are

$$\left(\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})},\; \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\right)$$

where $t_{\alpha/2}$ denotes the upper $100(\alpha/2)\%$ point of the $t$-distribution with $(n-1)$ degrees of freedom.
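For the known-variance case, the limits can be computed with the Python standard library as sketched below. The function name `ci_known_variance` and the numbers in the example are hypothetical; the sketch assumes SRSWOR, so that $Var(\bar{y}) = \frac{N-n}{Nn}S^2$, and covers only the normal-quantile case because the standard library does not provide t quantiles.

```python
from math import sqrt
from statistics import NormalDist

def ci_known_variance(ybar, S2, N, n, alpha=0.05):
    """100(1 - alpha)% limits for the population mean under SRSWOR, S^2 known."""
    var_ybar = (N - n) / (N * n) * S2            # variance of the sample mean
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point of N(0, 1)
    half_width = z * sqrt(var_ybar)
    return ybar - half_width, ybar + half_width

# Hypothetical example: sample mean 50, S^2 = 100, N = 1000, n = 100
lower, upper = ci_known_variance(50.0, 100.0, 1000, 100)
```

When $S^2$ is replaced by the sample value $s^2$, the normal quantile would be replaced by the corresponding t quantile with n − 1 degrees of freedom, as described above.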

Determination of sample size

The size of the sample is needed before the survey starts and goes into operation. One point to be
kept in mind is that when the sample size increases, the variance of estimators decreases but the cost
of survey increases and vice versa. So there has to be a balance between the two aspects- cost and
variability. The sample size can be determined on the basis of prescribed values of the standard error
of the sample mean, the error of estimation, the width of the confidence interval, coefficient of
variation of the sample mean, the relative error of sample mean or total cost among several others.

An important requirement in determining the sample size is that information about the
population standard deviation S should be available for these criteria. The reason and need for this
will be clear when we derive the sample size in the next section. A question arises about how to obtain
information about S beforehand. One possible solution is to conduct a pilot survey,
collect a preliminary sample of small size, estimate S, and use the estimate as a known value of S.
Alternatively, such information can also be collected from past data, past experience, the long
association of the experimenter with the experiment, prior information, etc.

Now we find the sample size under different criteria assuming that the samples have been drawn
using SRSWOR. The case for SRSWR can be derived similarly.

1. Pre-specified variance
The sample size is to be determined such that the variance of $\bar{y}$ should not exceed a given value, say
V. In this case, find n such that

$$Var(\bar{y}) \le V$$

or

$$\frac{N-n}{Nn} S^2 \le V$$

or

$$\frac{1}{n} - \frac{1}{N} \le \frac{V}{S^2}$$

or

$$\frac{1}{n} - \frac{1}{N} \le \frac{1}{n_e}$$

or

$$n \ge \frac{n_e}{1 + \dfrac{n_e}{N}}$$

where $n_e = \dfrac{S^2}{V}$.
It may be noted here that $n_e$ can be known only when $S^2$ is known. This compels us to assume
that S is known. The same reasoning will also be seen in other cases.
The smallest sample size needed in this case is

$$n_{smallest} = \frac{n_e}{1 + \dfrac{n_e}{N}}.$$

If N is large, then the required n is

$$n \ge n_e \quad \text{and} \quad n_{smallest} = n_e.$$
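The rule above translates directly into a small helper. The name `n_for_variance` and the example figures are hypothetical, and rounding up to the next integer is an added practical assumption, since a sample size must be a whole number.

```python
from math import ceil

def n_for_variance(S2, V, N=None):
    """Smallest n with Var(ybar) <= V under SRSWOR; N=None means a large population."""
    n_e = S2 / V
    if N is None:
        return ceil(n_e)
    return ceil(n_e / (1 + n_e / N))

n_finite = n_for_variance(100, 1, N=1000)    # n_e = 100, N = 1000
n_large = n_for_variance(100, 1)             # large-N approximation
```

With these inputs the finite-population value is 91, compared with 100 when the population is treated as infinite.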

2. Pre-specified estimation error

It may be possible to have some prior knowledge of the population mean $\bar{Y}$. It may be required that the
sample mean $\bar{y}$ should not differ from it by more than a specified amount of absolute estimation
error $e$, which is a small quantity. Such a requirement can be satisfied by associating a probability
$(1-\alpha)$ with it and can be expressed as

$$P\left[\,|\bar{y} - \bar{Y}| \le e\,\right] = 1 - \alpha.$$

Since $\bar{y}$ follows $N\left(\bar{Y}, \dfrac{N-n}{Nn} S^2\right)$ assuming the normal distribution for the population, we can write

$$P\left[\frac{|\bar{y} - \bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{e}{\sqrt{Var(\bar{y})}}\right] = 1 - \alpha$$

which implies that

$$\frac{e}{\sqrt{Var(\bar{y})}} = Z_{\alpha/2}$$

or

$$Z_{\alpha/2}^2\,Var(\bar{y}) = e^2$$

or

$$Z_{\alpha/2}^2\,\frac{N-n}{Nn} S^2 = e^2$$

or

$$n = \frac{\left(\dfrac{Z_{\alpha/2}\,S}{e}\right)^2}{1 + \dfrac{1}{N}\left(\dfrac{Z_{\alpha/2}\,S}{e}\right)^2}$$

which is the required sample size. If N is large, then

$$n = \left(\frac{Z_{\alpha/2}\,S}{e}\right)^2.$$

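The sample size for a pre-specified estimation error can be computed as in the following sketch. The helper name `n_for_error` and the example values are hypothetical, the normal quantile comes from `statistics.NormalDist`, and rounding up to an integer is an added assumption.

```python
from math import ceil
from statistics import NormalDist

def n_for_error(S, e, N=None, alpha=0.05):
    """Sample size so that |ybar - Ybar| <= e with probability 1 - alpha (SRSWOR)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)      # upper alpha/2 point of N(0, 1)
    n0 = (z * S / e) ** 2                        # large-N sample size
    n = n0 if N is None else n0 / (1 + n0 / N)   # finite-population adjustment
    return ceil(n)

n_finite = n_for_error(S=10, e=2, N=1000)
n_large = n_for_error(S=10, e=2)
```

For S = 10, e = 2 and α = 0.05 this gives 88 units when N = 1000 and 97 units in the large-N case.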
3. Pre-specified width of the confidence interval

If the requirement is that the width of the confidence interval based on $\bar{y}$ with confidence coefficient
$(1-\alpha)$ should not exceed a pre-specified amount W, then the sample size n is determined such that

$$2 Z_{\alpha/2}\sqrt{Var(\bar{y})} \le W$$

assuming $\sigma^2$ is known and the population is normally distributed. This can be expressed as

$$2 Z_{\alpha/2}\sqrt{\frac{N-n}{Nn}}\; S \le W$$

or

$$4 Z_{\alpha/2}^2\left(\frac{1}{n} - \frac{1}{N}\right) S^2 \le W^2$$
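or

$$\frac{1}{n} \le \frac{1}{N} + \frac{W^2}{4 Z_{\alpha/2}^2 S^2}$$

or

$$n \ge \frac{\dfrac{4 Z_{\alpha/2}^2 S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2 S^2}{N W^2}}.$$

The minimum sample size required is

$$n_{smallest} = \frac{\dfrac{4 Z_{\alpha/2}^2 S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2 S^2}{N W^2}}.$$

If N is large, then

$$n \ge \frac{4 Z_{\alpha/2}^2 S^2}{W^2}$$

and the minimum sample size needed is

$$n_{smallest} = \frac{4 Z_{\alpha/2}^2 S^2}{W^2}.$$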
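A short sketch of the width criterion follows. The helper name `n_for_width` and the inputs are hypothetical, and rounding up is an added assumption.

```python
from math import ceil
from statistics import NormalDist

def n_for_width(S, W, N=None, alpha=0.05):
    """Smallest n so that the known-variance interval has width at most W (SRSWOR)."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = 4 * z ** 2 * S ** 2 / W ** 2            # large-N sample size
    n = n0 if N is None else n0 / (1 + n0 / N)
    return ceil(n)

n_finite = n_for_width(S=10, W=4, N=1000)
n_large = n_for_width(S=10, W=4)
```

A width of W corresponds to a half-width of W/2, so these values coincide with the estimation-error criterion evaluated at e = W/2.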

4. Pre-specified coefficient of variation

The coefficient of variation (CV) is defined as the ratio of standard error (or standard deviation)
and mean. The knowledge of the coefficient of variation has played an important role in the sampling
theory as this information has helped in deriving efficient estimators.

If it is desired that the coefficient of variation of $\bar{y}$ should not exceed a given or pre-specified value
of the coefficient of variation, say $C_0$, then the required sample size n is to be determined such that

$$CV(\bar{y}) \le C_0$$

or

$$\frac{\sqrt{Var(\bar{y})}}{\bar{Y}} \le C_0$$
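or

$$\frac{\dfrac{N-n}{Nn} S^2}{\bar{Y}^2} \le C_0^2$$

or

$$\frac{1}{n} - \frac{1}{N} \le \frac{C_0^2}{C^2}$$

or

$$n \ge \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}$$

is the required sample size, where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation.
The smallest sample size needed in this case is

$$n_{smallest} = \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}.$$

If N is large, then

$$n \ge \frac{C^2}{C_0^2}$$

and

$$n_{smallest} = \frac{C^2}{C_0^2}.$$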
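The coefficient-of-variation criterion can be sketched in the same way. The helper name `n_for_cv` and the example values are hypothetical, and rounding up is an added assumption.

```python
from math import ceil

def n_for_cv(C, C0, N=None):
    """Smallest n so that CV(ybar) <= C0, given the population CV C (SRSWOR)."""
    n0 = (C / C0) ** 2                           # large-N sample size
    n = n0 if N is None else n0 / (1 + n0 / N)   # finite-population adjustment
    return ceil(n)

n_finite = n_for_cv(C=0.5, C0=0.05, N=1000)
n_large = n_for_cv(C=0.5, C0=0.125)
```

With C = 0.5 and a target of 0.05 in a population of 1000 units the result is 91; with a looser target of 0.125 and a large population, 16 units suffice.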

5. Pre-specified relative error

When $\bar{y}$ is used for estimating the population mean $\bar{Y}$, then the relative estimation error is defined
as $\dfrac{\bar{y} - \bar{Y}}{\bar{Y}}$. If it is required that such relative estimation error should not exceed a pre-specified value
R with probability $(1-\alpha)$, then such a requirement can be satisfied by expressing it as

$$P\left[\frac{|\bar{y} - \bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{R\bar{Y}}{\sqrt{Var(\bar{y})}}\right] = 1 - \alpha.$$

Assuming the population to be normally distributed, $\bar{y}$ follows $N\left(\bar{Y}, \dfrac{N-n}{Nn} S^2\right)$.

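So it can be written that

$$\frac{R\bar{Y}}{\sqrt{Var(\bar{y})}} = Z_{\alpha/2}$$

or

$$Z_{\alpha/2}^2\left(\frac{N-n}{Nn}\right) S^2 = R^2\bar{Y}^2$$

or

$$\left(\frac{1}{n} - \frac{1}{N}\right) = \frac{R^2}{C^2 Z_{\alpha/2}^2}$$

or

$$n = \frac{\left(\dfrac{Z_{\alpha/2}\,C}{R}\right)^2}{1 + \dfrac{1}{N}\left(\dfrac{Z_{\alpha/2}\,C}{R}\right)^2}$$

where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation and should be known.
If N is large, then

$$n = \left(\frac{Z_{\alpha/2}\,C}{R}\right)^2.$$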

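The relative-error criterion has the same structure with C/R in place of S/e, as the sketch below shows. The helper name `n_for_relative_error` and the inputs are hypothetical, and rounding up is an added assumption.

```python
from math import ceil
from statistics import NormalDist

def n_for_relative_error(C, R, N=None, alpha=0.05):
    """Sample size so that the relative error is at most R with probability 1 - alpha."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    n0 = (z * C / R) ** 2                        # large-N sample size
    n = n0 if N is None else n0 / (1 + n0 / N)   # finite-population adjustment
    return ceil(n)

n_finite = n_for_relative_error(C=0.5, R=0.1, N=1000)
n_large = n_for_relative_error(C=0.5, R=0.1)
```

For C = 0.5, R = 0.1 and α = 0.05 this gives 88 units when N = 1000 and 97 units in the large-N case.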
6. Pre-specified cost
Let an amount of money C be designated for a sample survey to collect n observations, $C_0$ be the
overhead cost and $C_1$ be the cost of collection of one unit in the sample. Then the total cost C can be
expressed as

$$C = C_0 + n C_1$$

or

$$n = \frac{C - C_0}{C_1}$$

is the required sample size.
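In code the cost rule is a one-line computation. The sketch below uses made-up figures and rounds down with integer division, on the assumption that a fractional unit cannot be surveyed within the budget.

```python
def n_for_budget(C, C0, C1):
    """Largest whole number of units affordable: total C, overhead C0, unit cost C1."""
    return (C - C0) // C1

# Hypothetical budget: total 10000, overhead 2000, 25 per sampled unit
n_affordable = n_for_budget(10000, 2000, 25)
```

With these figures the budget covers 320 sampled units.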

Sampling Theory| Chapter 2 | Simple Random Sampling | Shalabh, IIT Kanpur

Page23
23

BG2209 Syllabus
No ratings yet
BG2209 Syllabus
3 pages
Research Methodology: Rebecca Jadon
100% (1)
Research Methodology: Rebecca Jadon
48 pages
Applied Multivariate Statistical Analysis Solution Manual PDF
No ratings yet
Applied Multivariate Statistical Analysis Solution Manual PDF
18 pages
Jupyter Notebook Project DM Nikita Chaturvedi 25.07.2021
100% (5)
Jupyter Notebook Project DM Nikita Chaturvedi 25.07.2021
83 pages
Chapter2 Sampling Simple Random Sampling
No ratings yet
Chapter2 Sampling Simple Random Sampling
24 pages
Simple Random Sampling Without Replacement (SRSWOR)
No ratings yet
Simple Random Sampling Without Replacement (SRSWOR)
30 pages
Chapter2 Sampling Simple Random Sampling PDF
No ratings yet
Chapter2 Sampling Simple Random Sampling PDF
23 pages
Simple Random Sampling Without Replacement (SRSWOR)
No ratings yet
Simple Random Sampling Without Replacement (SRSWOR)
23 pages
Lesson 1. Simple Random Sampling
No ratings yet
Lesson 1. Simple Random Sampling
23 pages
SP Sampling Lect 5
No ratings yet
SP Sampling Lect 5
28 pages
LECTURE 3
No ratings yet
LECTURE 3
8 pages
Sampling
No ratings yet
Sampling
7 pages
Chapter3 Sampling Proportions Percentages
No ratings yet
Chapter3 Sampling Proportions Percentages
10 pages
SP Sampling Lect 4
No ratings yet
SP Sampling Lect 4
13 pages
Chapter3 Sampling Proportions Percentages
No ratings yet
Chapter3 Sampling Proportions Percentages
10 pages
Creative Commons Attribution-Noncommercial-Sharealike License
No ratings yet
Creative Commons Attribution-Noncommercial-Sharealike License
50 pages
Chapter3 Sampling Proportions Percentages
No ratings yet
Chapter3 Sampling Proportions Percentages
10 pages
Sampling For Proportions and Percentages
No ratings yet
Sampling For Proportions and Percentages
8 pages
Chapter7 Sampling Varying Probability Sampling
No ratings yet
Chapter7 Sampling Varying Probability Sampling
32 pages
Chapter7 Sampling Varying Probability Sampling
No ratings yet
Chapter7 Sampling Varying Probability Sampling
32 pages
Chapter7 Sampling Varying Probability Sampling
No ratings yet
Chapter7 Sampling Varying Probability Sampling
32 pages
Sampling CH-2
No ratings yet
Sampling CH-2
22 pages
Simple Random Sampling
No ratings yet
Simple Random Sampling
18 pages
Cap 1 de Vries
No ratings yet
Cap 1 de Vries
30 pages