Chapter 2: Sampling - Simple Random Sampling
Such a process can be implemented through programming, using the discrete uniform distribution. Any number between 1 and N can be generated from this distribution, and the corresponding unit can be selected into the sample by associating an index with each sampling unit. Many statistical software packages, such as R and SAS, have built-in functions for drawing a sample using SRSWOR or SRSWR.
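As an illustrative sketch (not part of the notes; the values of $N$ and $n$ are arbitrary), both schemes can be implemented with Python's standard library:

```python
# Illustrative sketch: drawing SRSWOR and SRSWR samples with Python's
# standard library; N and n are arbitrary example values.
import random

N, n = 10, 4
units = list(range(1, N + 1))     # an index associated with each sampling unit

srswor = random.sample(units, n)                     # without replacement
srswr = [random.choice(units) for _ in range(n)]     # with replacement

print("SRSWOR:", srswor)   # n distinct units
print("SRSWR :", srswr)    # repetitions possible
```

In R, the built-in function sample() serves the same purpose: sample(N, n) draws an SRSWOR sample and sample(N, n, replace = TRUE) an SRSWR sample from the labels 1, ..., N.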
Notations:
The following notations will be used in further notes:

$\bar{y} = \dfrac{1}{n}\sum_{i=1}^{n} y_i$ : sample mean

$\bar{Y} = \dfrac{1}{N}\sum_{i=1}^{N} Y_i$ : population mean

$S^2 = \dfrac{1}{N-1}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \dfrac{1}{N-1}\Big(\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\Big)$ : population mean squared deviation

$\sigma^2 = \dfrac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^2 = \dfrac{1}{N}\Big(\sum_{i=1}^{N} Y_i^2 - N\bar{Y}^2\Big)$ : population variance

$s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \dfrac{1}{n-1}\Big(\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\Big)$ : sample variance
Probability of drawing a sample:

1. SRSWOR
Note that a unit can be selected at any one of the $n$ draws. Let $P_j(i)$ denote the probability of selecting the $i$th unit at the $j$th draw; then $P_j(i) = 1/N$ for every draw, so the probability that the $i$th unit is included in the sample is
$$P(i) = P_1(i) + P_2(i) + \cdots + P_n(i) = \underbrace{\frac{1}{N} + \frac{1}{N} + \cdots + \frac{1}{N}}_{n \text{ times}} = \frac{n}{N}.$$
Now if $u_1, u_2, \ldots, u_n$ are the $n$ units selected in the sample, then the probability of their selection is
$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{n}{N}\cdot\frac{n-1}{N-1}\cdots\frac{1}{N-n+1} = \frac{1}{\binom{N}{n}}.$$
Alternative approach:
The probability of drawing a sample in SRSWOR can alternatively be found as follows:
Let $u_{i(k)}$ denote the $i$th unit drawn at the $k$th draw. Note that the $i$th unit can be any unit out of the $N$ units. Then $s_o = (u_{i(1)}, u_{i(2)}, \ldots, u_{i(n)})$ is an ordered sample in which the order of the units in which they are drawn, i.e., $u_{i(1)}$ drawn at the first draw, $u_{i(2)}$ drawn at the second draw, and so on, is also considered. The probability of selecting such an ordered sample is
$$P(s_o) = P(u_{i(1)})\,P(u_{i(2)} \mid u_{i(1)})\cdots P(u_{i(n)} \mid u_{i(1)}, \ldots, u_{i(n-1)}).$$
Here $P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)})$ is the probability of drawing $u_{i(k)}$ at the $k$th draw given that $u_{i(1)}, u_{i(2)}, \ldots, u_{i(k-1)}$ have already been drawn in the first $(k-1)$ draws. Such probability is obtained as
$$P(u_{i(k)} \mid u_{i(1)} u_{i(2)} \cdots u_{i(k-1)}) = \frac{1}{N-k+1}.$$
So
$$P(s_o) = \prod_{k=1}^{n} \frac{1}{N-k+1} = \frac{(N-n)!}{N!}.$$
The order of the units is irrelevant in the sample, and the $n$ units can be drawn in $n!$ different orders, so the probability of drawing the sample is
$$n!\;\frac{(N-n)!}{N!} = \frac{1}{\binom{N}{n}}.$$
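This result can be checked numerically; a Monte Carlo sketch (illustrative, not from the notes; the population labels and the target sample are arbitrary choices):

```python
# Illustrative check: under SRSWOR each particular unordered sample of
# size n has probability 1 / C(N, n).
import math
import random

N, n = 6, 3
target = frozenset({1, 2, 3})          # one particular unordered sample
trials = 200_000

hits = sum(frozenset(random.sample(range(1, N + 1), n)) == target
           for _ in range(trials))

print("empirical  :", hits / trials)            # approx. 0.05
print("theoretical:", 1 / math.comb(N, n))      # n!(N-n)!/N! = 1/20 = 0.05
```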
2. SRSWR
When $n$ units are selected with SRSWR, the total number of possible samples is $N^n$. The probability of drawing a sample is $\dfrac{1}{N^n}$.

Alternatively, let $u_i$ be the $i$th unit selected in the sample. This unit can be selected in the sample either at the first draw, second draw, ..., or $n$th draw. At any stage, there are always $N$ units in the population in the case of SRSWR, so the probability of selection of $u_i$ at any stage is $1/N$ for all $i = 1, 2, \ldots, n$. Then the probability of their selection is
$$P(u_1, u_2, \ldots, u_n) = P(u_1)\,P(u_2)\cdots P(u_n) = \frac{1}{N}\cdot\frac{1}{N}\cdots\frac{1}{N} = \frac{1}{N^n}.$$
Probability of drawing a unit:

1. SRSWOR
Let $A_\ell$ denote the event that a particular unit $u_j$ is not selected at the $\ell$th draw. The probability of selecting, say, the $j$th unit at the $k$th draw is
$$P[\text{selection of } u_j \text{ at } k\text{th draw}] = P(A_1)\,P(A_2 \mid A_1)\cdots P(A_{k-1} \mid A_1, \ldots, A_{k-2})\,P(\bar{A}_k \mid A_1, \ldots, A_{k-1})$$
$$= \Big(1 - \frac{1}{N}\Big)\Big(1 - \frac{1}{N-1}\Big)\cdots\Big(1 - \frac{1}{N-k+2}\Big)\frac{1}{N-k+1} = \frac{N-1}{N}\cdot\frac{N-2}{N-1}\cdots\frac{N-k+1}{N-k+2}\cdot\frac{1}{N-k+1} = \frac{1}{N}.$$

2. SRSWR
$$P[\text{selection of } u_j \text{ at } k\text{th draw}] = \frac{1}{N}.$$
Estimation of the population mean

SRSWOR
Let $t_i$ denote the total of the $y$ values in the $i$th possible sample, $i = 1, 2, \ldots, \binom{N}{n}$. Then
$$E(\bar{y}) = \frac{1}{n} E\Big(\sum_{j=1}^{n} y_j\Big) = \frac{1}{n} E(t_i) = \frac{1}{n}\,\frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}} t_i = \frac{1}{n}\,\frac{1}{\binom{N}{n}} \sum_{i=1}^{\binom{N}{n}} \Big(\sum_{j=1}^{n} y_j\Big).$$
When $n$ units are sampled from $N$ units without replacement, each unit of the population can occur with $(n-1)$ other units selected out of the remaining $(N-1)$ units of the population, and so each unit occurs in $\binom{N-1}{n-1}$ of the $\binom{N}{n}$ possible samples. So
$$\sum_{i=1}^{\binom{N}{n}} \Big(\sum_{j=1}^{n} y_j\Big) = \binom{N-1}{n-1} \sum_{i=1}^{N} y_i.$$
Now
$$E(\bar{y}) = \frac{(N-1)!}{(n-1)!\,(N-n)!}\cdot\frac{n!\,(N-n)!}{n\,N!} \sum_{i=1}^{N} y_i = \frac{1}{N}\sum_{i=1}^{N} y_i = \bar{Y}.$$
Thus $\bar{y}$ is an unbiased estimator of $\bar{Y}$. Alternatively, the following approach can also be adopted to show the unbiasedness property:
$$E(\bar{y}) = \frac{1}{n}\sum_{j=1}^{n} E(y_j) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{N} Y_i\,P_j(i) = \frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{N} Y_i\,\frac{1}{N} = \frac{1}{n}\sum_{j=1}^{n} \bar{Y} = \bar{Y}.$$
SRSWR
$$E(\bar{y}) = E\Big(\frac{1}{n}\sum_{i=1}^{n} y_i\Big) = \frac{1}{n}\sum_{i=1}^{n} E(y_i) = \frac{1}{n}\sum_{i=1}^{n} (Y_1 P_1 + Y_2 P_2 + \cdots + Y_N P_N) = \frac{1}{n}\sum_{i=1}^{n} \bar{Y} = \bar{Y},$$
where $P_i = \dfrac{1}{N}$ for all $i = 1, 2, \ldots, N$ is the probability of selection of a unit. Thus $\bar{y}$ is an unbiased estimator of the population mean under SRSWR also.
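Since every possible sample is equally likely under both schemes, the unbiasedness can be verified exactly for a small toy population by enumerating all samples; an illustrative sketch (the population values are arbitrary):

```python
# Illustrative exact check: averaging ybar over all equally likely samples
# recovers Ybar under both SRSWOR and SRSWR.
from itertools import combinations, product
from statistics import mean

y = [3, 7, 11, 15, 24]             # arbitrary toy population
N, n = len(y), 2
Ybar = mean(y)

wor = [mean(s) for s in combinations(y, n)]    # all C(N, n) unordered samples
wr = [mean(s) for s in product(y, repeat=n)]   # all N**n ordered samples

print(Ybar, mean(wor), mean(wr))   # all three values coincide: 12.0
```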
Variance of the estimate
Assume that each observation has variance $\sigma^2$. Then
$$V(\bar{y}) = E(\bar{y} - \bar{Y})^2 = E\Big[\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{Y})\Big]^2 = E\Big[\frac{1}{n^2}\sum_{i=1}^{n} (y_i - \bar{Y})^2 + \frac{1}{n^2}\mathop{\sum\sum}_{i \neq j} (y_i - \bar{Y})(y_j - \bar{Y})\Big]$$
$$= \frac{1}{n^2}\sum_{i=1}^{n} E(y_i - \bar{Y})^2 + \frac{1}{n^2}\mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y}) = \frac{1}{n^2}\sum_{i=1}^{n} \sigma^2 + \frac{K}{n^2} = \frac{N-1}{Nn} S^2 + \frac{K}{n^2},$$
where $K = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y})$, assuming that each observation has variance $\sigma^2 = \frac{N-1}{N} S^2$. Now we find $K$ under the two sampling schemes.
SRSWOR
$$K = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y}).$$
Consider
$$E(y_i - \bar{Y})(y_j - \bar{Y}) = \frac{1}{N(N-1)} \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}).$$
Since
$$\Big[\sum_{k=1}^{N} (y_k - \bar{Y})\Big]^2 = \sum_{k=1}^{N} (y_k - \bar{Y})^2 + \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}),$$
$$0 = (N-1) S^2 + \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}),$$
we have
$$\frac{1}{N(N-1)} \mathop{\sum\sum}_{k \neq \ell} (y_k - \bar{Y})(y_\ell - \bar{Y}) = \frac{1}{N(N-1)}\big[-(N-1)S^2\big] = -\frac{S^2}{N}.$$
Thus $K = -n(n-1)\dfrac{S^2}{N}$, and substituting this value of $K$, the variance of $\bar{y}$ under SRSWOR is
$$V(\bar{y}_{WOR}) = \frac{N-1}{Nn} S^2 - \frac{1}{n^2}\, n(n-1)\, \frac{S^2}{N} = \frac{N-n}{Nn}\, S^2.$$
SRSWR
$$K = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})(y_j - \bar{Y}) = \mathop{\sum\sum}_{i \neq j} E(y_i - \bar{Y})\,E(y_j - \bar{Y}) = 0$$
because the $i$th and $j$th draws ($i \neq j$) are independent. Thus the variance of $\bar{y}$ under SRSWR is
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}\, S^2.$$
It is to be noted that if $N$ is infinite (large enough), then
$$V(\bar{y}) = \frac{S^2}{n}$$
in both the cases of SRSWOR and SRSWR. So the factor $\dfrac{N-n}{N}$ is responsible for changing the variance of $\bar{y}$ when the sample is drawn from a finite population in comparison to an infinite population. This is why $\dfrac{N-n}{N}$ is called the finite population correction (fpc). It may be noted that $\dfrac{N-n}{N} = 1 - \dfrac{n}{N}$, so $\dfrac{N-n}{N}$ is close to 1 if the ratio of sample size to population size, $\dfrac{n}{N}$, is very small or negligible. The term $\dfrac{n}{N}$ is called the sampling fraction. In practice, the fpc can be ignored whenever $\dfrac{n}{N} < 5\%$, and for many purposes even if it is as high as 10%. Ignoring the fpc will result in overestimation of the variance of $\bar{y}$.
Efficiency of $\bar{y}$ under SRSWOR over SRSWR
$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}\, S^2$$
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}\, S^2 = \frac{N-n}{Nn}\, S^2 + \frac{n-1}{Nn}\, S^2 = V(\bar{y}_{WOR}) + \text{a positive quantity}.$$
Thus
$$V(\bar{y}_{WR}) > V(\bar{y}_{WOR}),$$
and so SRSWOR is more efficient than SRSWR.
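Both variance formulas, and this inequality, can be verified exactly by enumerating every possible sample of a small toy population; an illustrative sketch (the population values are arbitrary):

```python
# Illustrative exact check of the two variance formulas and the efficiency
# comparison, by enumerating all samples of an arbitrary toy population.
from itertools import combinations, product
from statistics import mean, pvariance

y = [3, 7, 11, 15, 24]
N, n = len(y), 2
S2 = pvariance(y) * N / (N - 1)    # S^2 (divisor N-1); pvariance uses divisor N

v_wor = pvariance([mean(s) for s in combinations(y, n)])
v_wr = pvariance([mean(s) for s in product(y, repeat=n)])

print(v_wor, (N - n) / (N * n) * S2)   # 19.5 = 19.5
print(v_wr, (N - 1) / (N * n) * S2)    # 26.0 = 26.0, and v_wr > v_wor
```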
Estimation of variance from a sample
Consider
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2 = \frac{1}{n-1}\sum_{i=1}^{n} \big[(y_i - \bar{Y}) - (\bar{y} - \bar{Y})\big]^2 = \frac{1}{n-1}\Big[\sum_{i=1}^{n} (y_i - \bar{Y})^2 - n(\bar{y} - \bar{Y})^2\Big].$$
Hence
$$E(s^2) = \frac{1}{n-1}\Big[\sum_{i=1}^{n} E(y_i - \bar{Y})^2 - n\,E(\bar{y} - \bar{Y})^2\Big] = \frac{1}{n-1}\Big[\sum_{i=1}^{n} Var(y_i) - n\,Var(\bar{y})\Big] = \frac{1}{n-1}\big[n\sigma^2 - n\,Var(\bar{y})\big].$$
In the case of SRSWOR,
$$V(\bar{y}_{WOR}) = \frac{N-n}{Nn}\, S^2,$$
and so
$$E(s^2) = \frac{n}{n-1}\Big[\sigma^2 - \frac{N-n}{Nn}\, S^2\Big] = \frac{n}{n-1}\Big[\frac{N-1}{N}\, S^2 - \frac{N-n}{Nn}\, S^2\Big] = S^2.$$
In the case of SRSWR,
$$V(\bar{y}_{WR}) = \frac{N-1}{Nn}\, S^2,$$
and so
$$E(s^2) = \frac{n}{n-1}\Big[\sigma^2 - \frac{N-1}{Nn}\, S^2\Big] = \frac{n}{n-1}\Big[\frac{N-1}{N}\, S^2 - \frac{N-1}{Nn}\, S^2\Big] = \frac{N-1}{N}\, S^2 = \sigma^2.$$
Hence
$$E(s^2) = \begin{cases} S^2 & \text{in SRSWOR} \\ \sigma^2 & \text{in SRSWR.} \end{cases}$$
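This pair of results can also be checked exactly by enumeration; an illustrative sketch with an arbitrary toy population:

```python
# Illustrative exact check: the mean of s^2 over all samples equals S^2
# under SRSWOR and sigma^2 under SRSWR.
from itertools import combinations, product
from statistics import mean, pvariance, variance

y = [3, 7, 11, 15, 24]
N, n = len(y), 3
sigma2 = pvariance(y)              # divisor N
S2 = sigma2 * N / (N - 1)          # divisor N-1

print(mean(variance(s) for s in combinations(y, n)), S2)        # 65.0 = 65.0
print(mean(variance(s) for s in product(y, repeat=n)), sigma2)  # 52.0 = 52.0
```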
Standard errors
The standard error of $\bar{y}$ is defined as $\sqrt{Var(\bar{y})}$.
In order to estimate the standard error, one simple option is to consider the square root of the estimate of the variance of the sample mean:
• under SRSWOR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\dfrac{N-n}{Nn}}\; s$;
• under SRSWR, a possible estimator is $\hat{\sigma}(\bar{y}) = \sqrt{\dfrac{N-1}{Nn}}\; s$.
It is to be noted that this estimator does not possess the same properties as $\widehat{Var}(\bar{y})$. The reason is that if $\hat{\theta}$ is an estimator of $\theta$, then $\sqrt{\hat{\theta}}$ is not necessarily an estimator of $\sqrt{\theta}$. In fact, $\hat{\sigma}(\bar{y})$ is a negatively biased estimator under SRSWOR.
Consider $s$ as an estimator of $S$. Let
$$s^2 = S^2 + \varepsilon \quad \text{with} \quad E(\varepsilon) = 0, \quad E(\varepsilon^2) = Var(s^2).$$
Write
$$s = (S^2 + \varepsilon)^{1/2} = S\Big(1 + \frac{\varepsilon}{S^2}\Big)^{1/2} = S\Big(1 + \frac{\varepsilon}{2S^2} - \frac{\varepsilon^2}{8S^4} + \ldots\Big),$$
assuming $\varepsilon$ will be small compared to $S^2$; as $n$ becomes large, the probability of such an event approaches one. Neglecting the powers of $\varepsilon$ higher than two and taking expectation, we have
$$E(s) = S\Big[1 - \frac{Var(s^2)}{8S^4}\Big],$$
where
$$Var(s^2) = \frac{2S^4}{n-1}\Big[1 + \frac{n-1}{2n}(\beta_2 - 3)\Big] \quad \text{for large } N,$$
$$\mu_j = \frac{1}{N}\sum_{i=1}^{N} (Y_i - \bar{Y})^j,$$
$$\beta_2 = \frac{\mu_4}{S^4} : \text{coefficient of kurtosis}.$$
Thus
$$E(s) = S\Big[1 - \frac{1}{4(n-1)} - \frac{\beta_2 - 3}{8n}\Big],$$
$$Var(s) = S^2 - S^2\Big[1 - \frac{Var(s^2)}{8S^4}\Big]^2 \approx \frac{Var(s^2)}{4S^2} = \frac{S^2}{2(n-1)}\Big[1 + \frac{n-1}{2n}(\beta_2 - 3)\Big].$$
Note that for a normal distribution, $\beta_2 = 3$ and we obtain
$$Var(s) = \frac{S^2}{2(n-1)}.$$
Both $Var(s)$ and $Var(s^2)$ are inflated due to non-normality to the same extent, by the inflation factor
$$1 + \frac{n-1}{2n}(\beta_2 - 3),$$
and this does not depend on the coefficient of skewness. This is an important result to be kept in mind while determining the sample size, in which it is assumed that $S^2$ is known. If the inflation factor is ignored and the population is non-normal, then the reliability of $s^2$ may be misleading.
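As a numerical illustration (the toy population below is an arbitrary choice, and $\mu_2^2$ is used in place of $S^4$, which is adequate for large $N$), the inflation factor can be computed as follows:

```python
# Illustrative computation of the inflation factor 1 + (n-1)/(2n)(beta2 - 3)
# for an arbitrary right-skewed toy population.
from statistics import mean

Y = [1, 1, 2, 2, 3, 4, 6, 9, 14, 22]
N, n = len(Y), 10
Ybar = mean(Y)
mu2 = sum((v - Ybar) ** 2 for v in Y) / N
mu4 = sum((v - Ybar) ** 4 for v in Y) / N
beta2 = mu4 / mu2 ** 2             # coefficient of kurtosis

inflation = 1 + (n - 1) / (2 * n) * (beta2 - 3)
print(f"beta2 = {beta2:.2f}, inflation factor = {inflation:.2f}")  # > 1 here
```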
Alternative approach:
The results for the unbiasedness property and the variance of sample mean can also be proved in an
alternative way as follows:
(i) SRSWOR
With the $i$th unit of the population, we associate a random variable $a_i$ defined as follows:
$$a_i = \begin{cases} 1 & \text{if the } i\text{th unit occurs in the sample} \\ 0 & \text{otherwise,} \end{cases} \quad i = 1, 2, \ldots, N.$$
Then,
$$E(a_i) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$
$$E(a_i^2) = 1 \times \text{Probability that the } i\text{th unit is included in the sample} = \frac{n}{N}, \quad i = 1, 2, \ldots, N,$$
$$E(a_i a_j) = 1 \times \text{Probability that the } i\text{th and } j\text{th units are included in the sample} = \frac{n(n-1)}{N(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$$
From these results, we can obtain
$$Var(a_i) = E(a_i^2) - \big(E(a_i)\big)^2 = \frac{n(N-n)}{N^2}, \quad i = 1, 2, \ldots, N,$$
$$Cov(a_i, a_j) = E(a_i a_j) - E(a_i)\,E(a_j) = -\frac{n(N-n)}{N^2(N-1)}, \quad i \neq j = 1, 2, \ldots, N.$$
We can rewrite the sample mean as
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i.$$
Then
$$E(\bar{y}) = \frac{1}{n}\sum_{i=1}^{N} E(a_i)\, y_i = \bar{Y}$$
and
$$Var(\bar{y}) = \frac{1}{n^2}\, Var\Big(\sum_{i=1}^{N} a_i y_i\Big) = \frac{1}{n^2}\Big[\sum_{i=1}^{N} Var(a_i)\, y_i^2 + \mathop{\sum\sum}_{i \neq j} Cov(a_i, a_j)\, y_i y_j\Big].$$
Substituting the values of $Var(a_i)$ and $Cov(a_i, a_j)$ in the expression of $Var(\bar{y})$ and simplifying, we get
$$Var(\bar{y}) = \frac{N-n}{Nn}\, S^2.$$
To show that $E(s^2) = S^2$, consider
$$s^2 = \frac{1}{n-1}\Big[\sum_{i=1}^{n} y_i^2 - n\bar{y}^2\Big] = \frac{1}{n-1}\Big[\sum_{i=1}^{N} a_i y_i^2 - n\bar{y}^2\Big].$$
Hence, taking expectation, we get
$$E(s^2) = \frac{1}{n-1}\Big[\sum_{i=1}^{N} E(a_i)\, y_i^2 - n\big\{Var(\bar{y}) + \bar{Y}^2\big\}\Big].$$
Substituting the values of $E(a_i)$ and $Var(\bar{y})$ in this expression and simplifying, we get $E(s^2) = S^2$.
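The moments of the indicator variables themselves can be checked by enumerating all $\binom{N}{n}$ samples; an illustrative sketch with arbitrary $N$ and $n$:

```python
# Illustrative exact check of the indicator-variable moments under SRSWOR.
from itertools import combinations
from statistics import mean

N, n = 6, 3
samples = list(combinations(range(N), n))

a_i = [1 if 0 in s else 0 for s in samples]                 # indicator of unit i
a_ij = [1 if (0 in s and 1 in s) else 0 for s in samples]   # units i and j together

print(mean(a_i), n / N)                                # E(a_i) = 0.5
print(mean(a_ij), n * (n - 1) / (N * (N - 1)))         # E(a_i a_j) = 0.2
print(mean(a_ij) - (n / N) ** 2,
      -n * (N - n) / (N ** 2 * (N - 1)))               # Cov(a_i, a_j) = -0.05
```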
(ii) SRSWR
Let a random variable $a_i$ associated with the $i$th unit of the population denote the number of times the $i$th unit occurs in the sample, $i = 1, 2, \ldots, N$. So $a_i$ assumes the values $0, 1, 2, \ldots, n$. The joint distribution of $a_1, a_2, \ldots, a_N$ is multinomial, given by
$$P(a_1, a_2, \ldots, a_N) = \frac{n!}{\prod_{i=1}^{N} a_i!}\;\frac{1}{N^n},$$
where $\sum_{i=1}^{N} a_i = n$. For this multinomial distribution, we have
$$E(a_i) = \frac{n}{N}, \qquad Var(a_i) = \frac{n(N-1)}{N^2}, \quad i = 1, 2, \ldots, N,$$
$$Cov(a_i, a_j) = -\frac{n}{N^2}, \quad i \neq j = 1, 2, \ldots, N.$$
We rewrite the sample mean as
$$\bar{y} = \frac{1}{n}\sum_{i=1}^{N} a_i y_i.$$
Hence, taking the expectation of $\bar{y}$ and substituting the value $E(a_i) = n/N$, we obtain
$$E(\bar{y}) = \bar{Y}.$$
Further,
$$Var(\bar{y}) = \frac{1}{n^2}\Big[\sum_{i=1}^{N} Var(a_i)\, y_i^2 + \mathop{\sum\sum}_{i \neq j} Cov(a_i, a_j)\, y_i y_j\Big].$$
Substituting the values $Var(a_i) = n(N-1)/N^2$ and $Cov(a_i, a_j) = -n/N^2$ and simplifying, we get
$$Var(\bar{y}) = \frac{N-1}{Nn}\, S^2.$$
To prove that $E(s^2) = \dfrac{N-1}{N}\, S^2 = \sigma^2$ in SRSWR, consider
$$(n-1)\, s^2 = \sum_{i=1}^{n} y_i^2 - n\bar{y}^2 = \sum_{i=1}^{N} a_i y_i^2 - n\bar{y}^2,$$
$$(n-1)\, E(s^2) = \sum_{i=1}^{N} E(a_i)\, y_i^2 - n\big\{Var(\bar{y}) + \bar{Y}^2\big\} = \frac{n}{N}\sum_{i=1}^{N} y_i^2 - n\,\frac{(N-1)}{nN}\, S^2 - n\bar{Y}^2 = \frac{(n-1)(N-1)}{N}\, S^2,$$
so that
$$E(s^2) = \frac{N-1}{N}\, S^2 = \sigma^2.$$
Estimation of the population total
Sometimes it is also of interest to estimate the population total $Y_T = \sum_{i=1}^{N} Y_i = N\bar{Y}$. It can be estimated by
$$\hat{Y}_T = N\hat{\bar{Y}} = N\bar{y}.$$
Obviously,
$$E\big(\hat{Y}_T\big) = N\, E(\bar{y}) = N\bar{Y} = Y_T,$$
$$Var\big(\hat{Y}_T\big) = N^2\, Var(\bar{y}) = \begin{cases} N^2\, \dfrac{N-n}{Nn}\, S^2 = \dfrac{N(N-n)}{n}\, S^2 & \text{for SRSWOR} \\[2ex] N^2\, \dfrac{N-1}{Nn}\, S^2 = \dfrac{N(N-1)}{n}\, S^2 & \text{for SRSWR,} \end{cases}$$
and the estimates of the variance are
$$\widehat{Var}\big(\hat{Y}_T\big) = \begin{cases} \dfrac{N(N-n)}{n}\, s^2 & \text{for SRSWOR} \\[2ex] \dfrac{N^2}{n}\, s^2 & \text{for SRSWR.} \end{cases}$$
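A minimal sketch of the computation (the population size and the sample values below are hypothetical):

```python
# A minimal sketch with hypothetical values: estimating the population total
# and its estimated variance from an SRSWOR sample.
from statistics import mean, variance

N = 200                                         # population size, assumed known
sample = [12.0, 9.5, 14.2, 11.1, 10.3, 13.8]    # hypothetical sample values
n = len(sample)

Y_T_hat = N * mean(sample)                      # N * ybar
var_hat = N * (N - n) / n * variance(sample)    # N(N-n) s^2 / n for SRSWOR

print(f"estimated total = {Y_T_hat:.1f}, standard error = {var_hat ** 0.5:.1f}")
```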
Confidence limits for the population mean
Assume that the population is normally distributed $N(\mu, \sigma^2)$ with mean $\mu$ and variance $\sigma^2$. Then $\dfrac{\bar{y} - \bar{Y}}{\sqrt{Var(\bar{y})}}$ follows $N(0,1)$ when $\sigma^2$ is known. If $\sigma^2$ is unknown and is estimated from the sample, then $\dfrac{\bar{y} - \bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}}$ follows a $t$-distribution with $(n-1)$ degrees of freedom. When $\sigma^2$ is known, the $100(1-\alpha)\%$ confidence interval is given by
$$P\Bigg[-Z_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{Var(\bar{y})}} \le Z_{\alpha/2}\Bigg] = 1 - \alpha$$
or
$$P\Big[\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})} \le \bar{Y} \le \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\Big] = 1 - \alpha,$$
and the confidence limits are
$$\Big[\bar{y} - Z_{\alpha/2}\sqrt{Var(\bar{y})},\;\; \bar{y} + Z_{\alpha/2}\sqrt{Var(\bar{y})}\Big],$$
where $Z_{\alpha/2}$ denotes the upper $\frac{\alpha}{2}\%$ points of the $N(0,1)$ distribution. Similarly, when $\sigma^2$ is unknown, the $100(1-\alpha)\%$ confidence interval is
$$P\Bigg[-t_{\alpha/2} \le \frac{\bar{y} - \bar{Y}}{\sqrt{\widehat{Var}(\bar{y})}} \le t_{\alpha/2}\Bigg] = 1 - \alpha$$
or
$$P\Big[\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})} \le \bar{Y} \le \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\Big] = 1 - \alpha,$$
and the confidence limits are
$$\Big[\bar{y} - t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})},\;\; \bar{y} + t_{\alpha/2}\sqrt{\widehat{Var}(\bar{y})}\Big],$$
where $t_{\alpha/2}$ denotes the upper $\frac{\alpha}{2}\%$ points of the $t$-distribution with $(n-1)$ degrees of freedom.
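A minimal sketch of the normal-quantile interval, reusing the hypothetical sample from above; for small $n$ with $\sigma^2$ unknown, the quantile $t_{\alpha/2, n-1}$ would replace $Z_{\alpha/2}$:

```python
# A minimal sketch with hypothetical values: a 95% confidence interval for
# Ybar under SRSWOR using the standard normal quantile.
from statistics import NormalDist, mean, variance

N, alpha = 200, 0.05
sample = [12.0, 9.5, 14.2, 11.1, 10.3, 13.8]
n = len(sample)

ybar = mean(sample)
se = ((N - n) / (N * n) * variance(sample)) ** 0.5   # estimated SE of ybar
z = NormalDist().inv_cdf(1 - alpha / 2)              # Z_{alpha/2} = 1.96

print(f"95% CI: ({ybar - z * se:.2f}, {ybar + z * se:.2f})")
```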
Determination of sample size
An important constraint in determining the sample size is that information regarding the population standard deviation $S$ should be known for these criteria. The reason and need for this will become clear when we derive the sample size in the next section. A question arises about how to have information about $S$ beforehand. The possible solutions to this issue are to conduct a pilot survey, collect a preliminary sample of small size, estimate $S$, and use it as the known value of $S$. Alternatively, such information can also be obtained from past data, past experience, the long association of the experimenter with the experiment, prior information, etc.

Now we find the sample size under different criteria assuming that the samples have been drawn using SRSWOR. The case of SRSWR can be derived similarly.
1. Pre-specified variance
The sample size is to be determined such that the variance of $\bar{y}$ does not exceed a given value, say $V$. In this case, find $n$ such that
$$Var(\bar{y}) \le V$$
$$\text{or} \quad \frac{N-n}{Nn}\, S^2 \le V$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} \le \frac{V}{S^2}$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} \le \frac{1}{n_e}$$
$$\text{or} \quad n \ge \frac{n_e}{1 + \dfrac{n_e}{N}},$$
where $n_e = \dfrac{S^2}{V}$.
It may be noted here that $n_e$ can be known only when $S^2$ is known. This is the reason that compels us to assume that $S$ should be known. The same reason will also be seen in the other cases.
The smallest sample size needed in this case is
$$n_{smallest} = \frac{n_e}{1 + \dfrac{n_e}{N}}.$$
If $N$ is large, then the required $n$ is $n \ge n_e$ and $n_{smallest} = n_e$.
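A small helper function for this criterion (a sketch; $S^2$ is assumed known, e.g., from a pilot survey, and the argument values are illustrative):

```python
# A small sketch: the smallest n with Var(ybar) <= V under SRSWOR,
# n = n_e / (1 + n_e/N) with n_e = S^2/V, rounded up.
import math

def n_for_variance(S2: float, V: float, N: int) -> int:
    ne = S2 / V
    return math.ceil(ne / (1 + ne / N))

print(n_for_variance(S2=65.0, V=2.0, N=500))   # -> 31
```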
2. Pre-specified estimation error
It may be required that the sample mean $\bar{y}$ should not differ from the population mean $\bar{Y}$ by more than a specified amount of absolute estimation error $e$ with given probability, i.e.,
$$P\big[\,|\bar{y} - \bar{Y}| \le e\,\big] = 1 - \alpha.$$
Since $\bar{y}$ follows $N\Big(\bar{Y}, \dfrac{N-n}{Nn}\, S^2\Big)$, assuming the normal distribution for the population, we can write
$$P\Bigg[\frac{|\bar{y} - \bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{e}{\sqrt{Var(\bar{y})}}\Bigg] = 1 - \alpha,$$
which implies that
$$Z_{\alpha/2}^2\, Var(\bar{y}) = e^2$$
$$\text{or} \quad Z_{\alpha/2}^2\, \frac{N-n}{Nn}\, S^2 = e^2$$
$$\text{or} \quad n = \frac{\Big(\dfrac{Z_{\alpha/2}\, S}{e}\Big)^2}{1 + \dfrac{1}{N}\Big(\dfrac{Z_{\alpha/2}\, S}{e}\Big)^2},$$
which is the required sample size. If $N$ is large, then
$$n = \Big(\frac{Z_{\alpha/2}\, S}{e}\Big)^2.$$
3. Pre-specified width of the confidence interval
If the requirement is that the width of the confidence interval of $\bar{y}$ with confidence coefficient $(1-\alpha)$ should not exceed a pre-specified amount $W$, then the sample size $n$ is determined such that
$$2 Z_{\alpha/2} \sqrt{Var(\bar{y})} \le W$$
$$\text{or} \quad 2 Z_{\alpha/2} \sqrt{\frac{N-n}{Nn}}\; S \le W$$
$$\text{or} \quad 4 Z_{\alpha/2}^2 \Big(\frac{1}{n} - \frac{1}{N}\Big) S^2 \le W^2$$
$$\text{or} \quad \frac{1}{n} \le \frac{1}{N} + \frac{W^2}{4 Z_{\alpha/2}^2\, S^2}$$
$$\text{or} \quad n \ge \frac{\dfrac{4 Z_{\alpha/2}^2\, S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2\, S^2}{N W^2}}.$$
The minimum sample size required is
$$n_{smallest} = \frac{\dfrac{4 Z_{\alpha/2}^2\, S^2}{W^2}}{1 + \dfrac{4 Z_{\alpha/2}^2\, S^2}{N W^2}}.$$
If $N$ is large, then
$$n \ge \frac{4 Z_{\alpha/2}^2\, S^2}{W^2}$$
and the minimum sample size needed is
$$n_{smallest} = \frac{4 Z_{\alpha/2}^2\, S^2}{W^2}.$$
4. Pre-specified coefficient of variation
The coefficient of variation (CV) is defined as the ratio of the standard error (or standard deviation) and the mean. Knowledge of the coefficient of variation has played an important role in sampling theory, as this information helps in deriving efficient estimators.
If it is desired that the coefficient of variation of $\bar{y}$ should not exceed a given or pre-specified value of the coefficient of variation, say $C_0$, then the required sample size $n$ is to be determined such that
$$CV(\bar{y}) \le C_0$$
$$\text{or} \quad \frac{\sqrt{Var(\bar{y})}}{\bar{Y}} \le C_0$$
$$\text{or} \quad \frac{\dfrac{N-n}{Nn}\, S^2}{\bar{Y}^2} \le C_0^2$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} \le \frac{C_0^2}{C^2}$$
$$\text{or} \quad n \ge \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}$$
is the required sample size, where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation.
The smallest sample size needed in this case is
$$n_{smallest} = \frac{\dfrac{C^2}{C_0^2}}{1 + \dfrac{C^2}{N C_0^2}}.$$
If $N$ is large, then
$$n \ge \frac{C^2}{C_0^2} \quad \text{and} \quad n_{smallest} = \frac{C^2}{C_0^2}.$$
5. Pre-specified relative error
When $\bar{y}$ is used for estimating the population mean $\bar{Y}$, the relative estimation error is defined as $\dfrac{\bar{y} - \bar{Y}}{\bar{Y}}$. If it is required that such relative estimation error should not exceed a pre-specified value $R$ with probability $(1-\alpha)$, then such requirement can be satisfied by expressing it as
$$P\Bigg[\frac{|\bar{y} - \bar{Y}|}{\sqrt{Var(\bar{y})}} \le \frac{R\bar{Y}}{\sqrt{Var(\bar{y})}}\Bigg] = 1 - \alpha.$$
Assuming the population to be normally distributed, $\bar{y}$ follows $N\Big(\bar{Y}, \dfrac{N-n}{Nn}\, S^2\Big)$. So it can be written that
$$Z_{\alpha/2}^2\, \frac{N-n}{Nn}\, S^2 = R^2 \bar{Y}^2$$
$$\text{or} \quad \frac{1}{n} - \frac{1}{N} = \frac{R^2}{C^2 Z_{\alpha/2}^2}$$
$$\text{or} \quad n = \frac{\Big(\dfrac{Z_{\alpha/2}\, C}{R}\Big)^2}{1 + \dfrac{1}{N}\Big(\dfrac{Z_{\alpha/2}\, C}{R}\Big)^2},$$
where $C = \dfrac{S}{\bar{Y}}$ is the population coefficient of variation and should be known.
If $N$ is large, then
$$n = \Big(\frac{Z_{\alpha/2}\, C}{R}\Big)^2.$$
6. Pre-specified cost
Let an amount of money $C$ be designated for the sample survey to collect $n$ observations, $C_0$ be the overhead cost, and $C_1$ be the cost of collecting one unit in the sample. Then the total cost $C$ can be expressed as
$$C = C_0 + n C_1$$
$$\text{or} \quad n = \frac{C - C_0}{C_1}$$
is the required sample size.
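Criteria 2 through 6 all reduce to the same pattern of computing a base size $n_e$ and shrinking it by the finite population correction; a consolidated sketch (function names and argument values are illustrative, and $S$ and $C = S/\bar{Y}$ are assumed known beforehand):

```python
# Consolidated sketch of criteria 2-6; all values passed are illustrative.
import math
from statistics import NormalDist

def z_of(alpha: float) -> float:
    return NormalDist().inv_cdf(1 - alpha / 2)       # Z_{alpha/2}

def fpc_round(ne: float, N: int) -> int:
    return math.ceil(ne / (1 + ne / N))              # shrink by the fpc

def n_for_error(S, e, N, alpha=0.05):                # 2. estimation error e
    return fpc_round((z_of(alpha) * S / e) ** 2, N)

def n_for_width(S, W, N, alpha=0.05):                # 3. CI width W
    return fpc_round(4 * z_of(alpha) ** 2 * S ** 2 / W ** 2, N)

def n_for_cv(C, C0, N):                              # 4. coefficient of variation
    return fpc_round((C / C0) ** 2, N)

def n_for_relative_error(C, R, N, alpha=0.05):       # 5. relative error R
    return fpc_round((z_of(alpha) * C / R) ** 2, N)

def n_for_cost(C_total, C0, C1):                     # 6. budget C = C0 + n*C1
    return math.floor((C_total - C0) / C1)

print(n_for_error(S=8.0, e=1.5, N=1000),
      n_for_cv(C=0.9, C0=0.1, N=1000),
      n_for_cost(C_total=5000, C0=800, C1=35))
```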