Module 2 Notes

MODULE-2

Sampling Theory
Sample survey
It is a method of drawing an inference about the characteristic of a population or universe by
observing only a part of the population.
Population
In everyday language, the word population means the group of inhabitants living in a certain region.
One may count the number of heads and determine the size of the population.
Statistical Population
A statistical population is the aggregate or collection of measurements, on a given variable or
variables, enumerated or measured from each and every object under investigation.
For example, one may collect data on the income of individuals; the entire set of income
measurements then constitutes the population of incomes.
A statistical population may be finite or infinite.
E.g.: The total number of college teachers under the University of Calicut is a finite population.
When the number of units is indeterminable, as with the number of stars in the sky, the population
is called an infinite population.
Univariate-Bivariate-Multivariate population
If we have collected data on only one variable, we have a ‘univariate population’. If two
variables are measured on every individual, say marks in English and marks in Psychology, or
heights and weights of students, we have a bivariate population. In general, one may define a
multivariate population, where more than two variables are measured on every individual.
Sample
A sample is a finite subset of the population.
The number of individuals or objects in the sample is called the size of the sample.
The process of obtaining a suitable sample from a population is called sampling.
Census and sample method
In any statistical investigation, one is always interested in studying population
characteristics. This can be done by studying either all the items in the population or only a part
drawn from it. If we study each and every element of the population, the process is called the
census method; if we study only a sample, the process is called a sample survey, the sample
method or sampling.
The Indian population census and a socio-economic survey of a whole village by
a college planning forum are examples of census studies. The National Sample Survey enquiries
(for example, the agricultural labour enquiry) are examples of sample studies.
In principle, the results obtained by a census are likely to be more reliable than those
obtained by a sample survey, but the census method is very costly and time consuming. Moreover,
in some surveys sampling is the only possible method, and in others it may be the better method.
Hence sampling has acquired a great deal of popularity these days.
SAMPLING AND NON-SAMPLING ERRORS
To appreciate the need for sample surveys, it is necessary to understand clearly the role of
sampling and non-sampling errors in complete enumeration and sample surveys. The error arising
from drawing inferences about the population on the basis of a few observations is termed
sampling error. Clearly, sampling error in this sense is non-existent in a census, since the whole
population is surveyed. However, the errors arising mainly at the stage of ascertainment and
processing of data, which are termed non-sampling errors, are common to both censuses and
sample surveys.
Sampling Errors
Even if the utmost care has been taken in selecting a sample, the results derived from a sample
study may not be exactly equal to the true value in the population. The reason is that the estimate
is based on a part and not on the whole, and samples are seldom, if ever, perfect miniatures of the
population. Hence sampling gives rise to certain errors known as sampling errors. However, these
errors can be controlled.
Sampling errors are of two types: biased and unbiased.
The sampling error usually decreases as the sample size increases.
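The relation between sample size and sampling error can be illustrated with a small simulation. The following is only a sketch and is not part of the original notes: the population of 10,000 incomes is synthetic, and the function name mean_abs_sampling_error, the sample sizes and the seeds are assumptions chosen for illustration.

```python
import random
import statistics

def mean_abs_sampling_error(population, n, repeats=200, seed=0):
    """Average |sample mean - population mean| over repeated random samples of size n."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(population)
    errors = []
    for _ in range(repeats):
        sample = rng.sample(population, n)  # draw n units without replacement
        errors.append(abs(statistics.fmean(sample) - true_mean))
    return statistics.fmean(errors)

# Synthetic population of incomes (an assumption, for illustration only)
pop_rng = random.Random(42)
population = [pop_rng.gauss(30000, 8000) for _ in range(10000)]

small = mean_abs_sampling_error(population, n=25)
large = mean_abs_sampling_error(population, n=2500)
print(small, large)  # the second value should be much smaller than the first
```

Under these assumptions, the average error of the sample mean for samples of 2500 units comes out far smaller than for samples of 25 units, which is the behaviour the statement above describes.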

Non-Sampling Errors
When a complete enumeration of units in the universe is made, one would expect it to
give rise to data free from errors. However, in practice it is not so. For example, it is difficult to
completely avoid errors of observation or ascertainment. Similarly, in the processing of data,
tabulation errors may be committed that affect the final results. Errors arising in this manner are
termed non-sampling errors, as they are due to factors other than the inductive process of
inferring about the population from a sample. Thus, the data obtained in an investigation by
complete enumeration, although free from sampling error, would still be subject to non-sampling
error, whereas the results of a sample survey would be subject to sampling error as well as
non-sampling error.
Characteristics of a good sample
• The sample must represent the population correctly.
• It must be free from bias.
• It must possess the least sampling error.
• It must be optimum in size.
• Only independent units must be selected in the sample.
Advantages of sampling
Sampling is preferred to the census method of enquiry for the following reasons:
• The sample method is comparatively more economical.
• The sample method ensures completeness and a high degree of accuracy due to the small
area of operation.
• It is possible to obtain more detailed information in a sample survey than in a complete
enumeration.
• Sampling is also advocated where a census is neither necessary nor desirable.
• In some cases sampling is the only feasible method. For example, if we have to test the
sharpness of blades and we test each blade, perhaps the whole of the product will be wasted;
in such circumstances the census method will not be suitable.
• A sample survey is much more scientific than a census because the extent of the
reliability of its results can be known, whereas this is not always possible in a census.
• With sampling, the volume of data is small, so it can be collected and
analyzed quickly, and one can arrive at decisions on problems which are required to be
solved urgently.
Limitations of sampling
• In order to obtain accurate results, it is indispensable that a sample survey be properly
planned and executed; otherwise incorrect and misleading results may be obtained.
• Most sampling requires the services of experts, and if there is a paucity of such people,
sampling may give unsatisfactory results owing to the use of faulty methods of selection,
inappropriate sampling design, or inefficient methods of estimation.
• Where one is interested in minute details of the characteristics of each individual constituent
of a universe, sampling is ruled out.
• There are various sources of error in a sample survey. Every attempt must be made to
minimize the chances of such errors; otherwise the right inference cannot be drawn.
• If the sample is not truly representative, or the wrong type of sampling method is selected,
then the sample will fail to give the true characteristics of the population.
Thus the choice between the sample and census methods of enquiry must be carefully made. If the
population is small and precise information is needed concerning it, a census will be appropriate.
But when the population is very large or the field of enquiry is very wide, and quick results are
needed, sampling is to be resorted to.
On the basis of the advantages of sampling narrated above, one should not come to the hasty
conclusion that sampling is always to be preferred to a census. Sometimes the sample must be so
large in order to achieve the required accuracy that one might just as well take a census.
Important terminologies
• Target population: The population to which the investigator wants to generalize his
results.
• Sampling unit: Before selecting the sample, the population must be divided into parts that
are called sampling units. These units must cover the whole of the population and they
must not overlap, in the sense that every element in the population belongs to one and only
one unit.
• Sampling frame: The sampling frame is the list from which the potential respondents are
drawn. It defines the set of elements from which a researcher can select a sample
of the target population.
• Sampling scheme: The method of selecting sampling units from the sampling frame.
• Sampling bias: It is said to occur if the sample units do not possess the population
characteristics. Due to this, the sample becomes unrepresentative of the population, with the
result that the inference drawn from the sample about the population may be inaccurate.

Methods of sampling
When it is decided to take a sample from a population, it is necessary to choose a method of
sampling. There are many different ways of selecting a sample from a population. The choice
depends upon the nature of the data and the purpose of the enquiry.
The various methods of sampling can be grouped under two broad heads:
• Probability sampling (Random sampling)
• Non-probability sampling (non-random sampling)
Random Sampling
Random sampling methods are those in which every item in the universe has a known chance, or
probability, of being chosen for the sample.
Non-Probability sampling
Non-probability sampling methods are those which do not provide every item in the universe with
a known chance of being included in the sample. The selection process is partially subjective.
SAMPLING SCHEMES
There are two types of sampling schemes, namely
i. Unrestricted random sampling
ii. Restricted random sampling
Unrestricted random sampling
In this type of sampling, each and every unit of the population has an equal chance of being
included in the sample. Simple random sampling is an example of unrestricted sampling.
Restricted random sampling
If an investigator has an idea about the heterogeneity of the sampling units, the population is
divided into homogeneous groups and a sample is drawn independently from each group. Such a
process of sampling is known as restricted random sampling. Stratified sampling, systematic
sampling, multistage sampling etc. are covered under the category of restricted sampling.

1. Simple random sampling


Random sampling is a purely scientific technique based on definite principles. The sample
obtained by this method is called a random sample. By a random sample we mean a sample in which
every unit in the population is given an equal chance or opportunity of being included.
A random sample can be selected by either of two methods:
i. Lottery method
ii. Random number method
Lottery Method
The lottery method is practicable when the population size is comparatively small. To select a
random sample by this method, first
• Assign serial numbers as 1,2,3,…N to each item
• Write these numbers on pieces of paper of equal size and of the same quality.
• Roll the papers called ‘lots’ and shuffle them thoroughly
• Take one lot at random. Note the number on it.
• Now select the item corresponding to this number into the sample
• Repeat the process till we get the required number of items into the sample.
Random number method
As the population size increases, it becomes more and more difficult to work with lots, and one
can simulate this process on a computer or by using a table of random numbers. We can associate
a serial number with each member of our population and then instruct a computer to pick a
number from 1 through N using its pseudo-random number generator. This ensures that every
number from 1 through N has an equal probability of being picked, and so the sample selected
is a random sample.
We can also use a table of random numbers to pick a simple random sample. These are tables
in which the digits 0 to 9 are arranged by a mathematical process of
randomization. They are tabulated as 2-, 3-, 4- or 5-digit numbers. After assigning serial numbers
to each sampling unit, open any page of the random number table. Then make a blindfold selection
of a number. Starting from that number, proceed along a row or column; the successive
numbers that occur give a random selection of items.
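The computer-based version of the random number method can be sketched in a few lines. This is an illustrative sketch rather than a prescribed procedure: the function name simple_random_sample, the values N = 500 and n = 10, and the seed are assumptions. It relies on Python's standard random.sample, which draws without replacement so that no serial number is picked twice.

```python
import random

def simple_random_sample(N, n, seed=None):
    """Return n distinct serial numbers drawn from 1..N, each equally likely."""
    if not 0 <= n <= N:
        raise ValueError("sample size must be between 0 and N")
    rng = random.Random(seed)  # pseudo-random number generator
    return sorted(rng.sample(range(1, N + 1), n))

# Hypothetical population of 500 serially numbered units, sample of 10
sample = simple_random_sample(N=500, n=10, seed=1)
print(sample)
```

Fixing the seed only makes the draw repeatable for checking; in an actual survey the seed would normally be left unset.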

Merits of Simple random sampling


• Personal bias is eliminated as the selection depends solely on chance.
• It is a very fair way of selecting samples from a given population, since every member is
given equal opportunities of being selected.
• There is no need for thorough knowledge of the members of the population.
• The accuracy of a sample can be tested by examining another sample from the same
universe.
Demerits of simple random sampling
• In many situations, it is not possible to have a complete sampling frame.
• Preparing cards or making use of a random number table is tedious.
• The members of a simple random sample generally lie far apart
geographically, increasing the time and cost of data collection.
• A larger sample is needed by this method than by the stratified random sampling method.
• Where there are large differences between the members of the population, stratified random
sampling is better than simple random sampling.

2. Systematic sampling
This type of sampling is very convenient in practice and is often used in field surveys. This
method is operationally more convenient than simple random sampling and ensures at the same
time equal probability of inclusion of each unit in the sample.
If we can arrange the units in a definite order, say alphabetical, chronological or geographical,
this method can be employed. Once the sampling units are arranged in some order, give them
serial numbers 1, 2, 3, … . Divide them into a number of groups equal to the required
sample size. From the first group, choose an item at random using the lottery method. If we select
the 4th item, choose the 4th item of every group systematically at equally spaced intervals. This
will constitute a systematic sample.

Procedure
Suppose we want to select a systematic sample of ‘n’ units from a population of N units. First
arrange the units in a definite order. The order of arrangement should not be related to the variable
under study. Now give serial numbers 1, 2, 3, …, N to the sampling units. Divide
them into n groups of size k, where k = N/n is called the sampling interval. Then select a number
‘r’ at random from the numbers 1, 2, …, k. The number ‘r’ is known as the random start. The units
numbered r, r + k, r + 2k, …, r + (n − 1)k then form the sample.
For example,
Let N = 95 and n = 8.
Here k = N/n = 95/8 = 11.875, which is rounded up to k = 12.
Hence choose a number at random from 1 to 12. Suppose the number drawn is r = 7. Then the
7th, (7 + 12)th, (7 + 24)th, …, (7 + (8 − 1) × 12)th units of the population form the sample. That is,
the 7th, 19th, 31st, …, 91st units of the population are selected.
Example:
Imagine a retail store wants to conduct a customer satisfaction survey. The store receives about
500 customers every day, and they want to survey 50 customers.
1. Define the Population: The population is all customers visiting the store in one day, which is
500.
2. Determine the Sample Size and Interval:
Desired sample size: 50 customers.
Population size: 500 customers.
Calculate the sampling interval (k): k = 500/50 = 10.
3. Select a Random Starting Point (r): Choose a random number between 1 and 10. Suppose you
choose 7.
4. Select the Sample:
Start with the 7th customer. Then select every 10th customer thereafter. The selected customers
would be those arriving in positions: 7, 17, 27, 37, 47, 57, 67, 77, 87, 97, …, 497.
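The procedure and the two worked examples can be reproduced with a short sketch. The function name systematic_sample is an assumption made for illustration, and rounding the interval up with math.ceil follows the N = 95 example; other rounding conventions exist.

```python
import math
import random

def systematic_sample(N, n, r=None, seed=None):
    """Return the serial numbers r, r+k, r+2k, ... that do not exceed N."""
    k = math.ceil(N / n)  # sampling interval, rounded up as in the N = 95 example
    if r is None:
        r = random.Random(seed).randint(1, k)  # random start between 1 and k
    if not 1 <= r <= k:
        raise ValueError("random start must lie between 1 and k")
    return list(range(r, N + 1, k))

print(systematic_sample(95, 8, r=7))        # [7, 19, 31, 43, 55, 67, 79, 91]
print(systematic_sample(500, 50, r=7)[:5])  # [7, 17, 27, 37, 47]
```

Note that when N/n is not a whole number and k is rounded up, some random starts yield fewer than n units. With N = 95 and k = 12, for instance, a start of r = 12 would call for a 96th unit, which does not exist.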
Note:
Systematic sampling is preferably used when the information is to be collected from units that
occur in a serial order, such as cards, trees in a forest, houses in blocks, or entries in a register.
Merits
• It is a more convenient and easier method than simple random sampling.
• The time and work involved in sampling by this method are relatively less.
• The results obtained are also found to be generally satisfactory, provided care is taken to
see that there are no periodic features in the ordering of the population.

Demerits
1. The main limitation of the method is that it becomes less and less representative if we are
dealing with populations having “hidden periodicities”. Also, if the population is ordered in
a systematic way with respect to the characteristic the investigator is interested in, then it
is possible that only certain types of items will be included in the sample, or at least
more of certain types than others. For instance, in a study of workers' wages, the list may
be such that every tenth worker on the list gets wages above Rs. 20000 per month.
2. All the population units must be available in a systematic and sequential order.
3. Stratified random sampling
This method of sampling is generally adopted when the population is heterogeneous with
respect to some characteristic. We divide the population into homogeneous groups known as
strata and from each stratum random samples are drawn proportionally according to any
random sampling procedure. We have to ensure that the units within any stratum are
homogeneous with regard to the characteristic under investigation.
Let the population of N units be divided into k strata containing N1, N2, …, Nk units respectively,
where N = N1 + N2 + … + Nk.
If n units are to be selected from the population of N units on a proportional basis, the number
of units n1 drawn from the first stratum of N1 units is n1 = N1 × n/N.
In general, the number of units ni to be drawn from the ith stratum, which contains Ni units, is
ni = Ni × n/N.

For example, suppose we want to study the academic level of 2000 students in a college. Let
us assume that this consists of 600 I DC, 500 II DC, 400 III DC, 300 I PG and 200 II PG
students. In this situation we can select a stratified random sample of 200 students. For this
purpose, divide the students into 5 strata: I DC, II DC, III DC, I PG and II PG. Now, from each
stratum, select the students proportionally using any one of the sampling methods. Since
n/N = 200/2000 = 1/10, we have to select 60 I DC, 50 II DC, 40 III DC, 30 I PG and 20 II PG
students from the respective strata. This will constitute a stratified random sample of 200 students.
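The proportional allocation in this example can be checked with a short sketch. The function name proportional_allocation is an assumption made for illustration; the stratum sizes are those given above.

```python
def proportional_allocation(strata_sizes, n):
    """Return {stratum: N_i * n / N}, the proportional share of a sample of size n."""
    N = sum(strata_sizes.values())  # total population size
    return {name: N_i * n / N for name, N_i in strata_sizes.items()}

strata = {"I DC": 600, "II DC": 500, "III DC": 400, "I PG": 300, "II PG": 200}
allocation = proportional_allocation(strata, n=200)
print(allocation)
```

Here every share is a whole number. In general N_i × n/N may be fractional, in which case the shares have to be rounded so that they still add up to n; the sketch does not attempt that step.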

Merits
• It is more representative, since each stratum is adequately represented in the sample.
• Since the strata are homogeneous, the precision will be greater.
• Stratification often leads to administrative convenience.
• Because it provides greater precision, a stratified sample often requires a smaller sample,
which saves money.
• A stratified sample can guard against an “unrepresentative” sample (e.g., an all-male
sample from a mixed-gender population).
Demerits
• It may not always be easy to divide a population into homogeneous strata.
• If stratification is faulty, the results will be biased.
• A sampling frame has to be prepared separately for each stratum.
• When examining multiple criteria, stratifying variables may be related to some, but not to
others, further complicating the design and potentially reducing the utility of the strata.

4. Cluster sampling
Cluster sampling is done when each population unit is a group of elements. Every
member of the population is assigned to one, and only one, group. Each group is called a
cluster. For example, a school or a family is a cluster because a school comprises many
students and, similarly, a family may have several members in it. In cluster sampling, a sample
is obtained by selecting clusters from the population on the basis of simple random sampling.
The sample comprises the elements of the randomly selected clusters. For example, if a survey is
to be conducted among the school children in a particular district, then all the schools in that
district are clusters. If a sample of 25 schools is to be selected, simple or systematic random
sampling can be used. Thus, the sample so obtained is a group of clusters, as each
school is a cluster. If the purpose of the survey is to know the opinion of teachers on
whether or not to make sports compulsory in the curriculum, then every teacher in each
selected school must be interviewed.

Cluster sampling is very economical in the sense that, at a single go, all the persons in the
selected clusters are investigated, but at the same time it is very susceptible to sampling bias.
In the above case, for instance, one is likely to get similar responses from all the teachers in
one school due to the particular philosophy of that school.

Let us take up the situation where we are interested in estimating the demand for a curry
powder in a residential colony. The colony is divided into 11 blocks, called block A through
block K. We might use cluster sampling in this situation by treating each block as a cluster.
We will select 2 blocks out of the 11 blocks at random and then collect information from all
families residing in these 2 blocks.
■ Clusters should be heterogeneous within, and the different clusters should be similar to each
other. A cluster is, ideally, a mini-population with all the features of the population.
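The residential colony example can be sketched as follows. This is only an illustration: the function name cluster_sample, the five families per block and the family identifiers are hypothetical, since the notes give no actual data.

```python
import random

def cluster_sample(clusters, m, seed=None):
    """Pick m clusters at random; return the chosen names and all their elements."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), m)  # simple random sample of clusters
    elements = [e for name in chosen for e in clusters[name]]  # every element of each chosen cluster
    return chosen, elements

# Hypothetical colony: blocks A to K, each with a made-up list of 5 family IDs
blocks = {b: [f"{b}{i}" for i in range(1, 6)] for b in "ABCDEFGHIJK"}
chosen, families = cluster_sample(blocks, m=2, seed=7)
print(chosen, len(families))
```

The random selection happens only at the level of blocks; once a block is chosen, every family in it enters the sample, which is what distinguishes cluster sampling from stratified sampling.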
Advantages of Cluster Sampling
1. One of the biggest advantages of cluster sampling is that it is cost-effective. One can get a
bigger sample at a limited cost.
2. It is useful if the complete list of the population is not available or constructing the complete
list of the population is difficult.
3. It is particularly useful if a researcher is interested to know the characteristics at the cluster
level rather than at the individual level. For example in studying the hygiene conditions in
schools of a district, cluster sampling would be more appropriate.
Disadvantages of Cluster Sampling
1. Clusters are required to be of the same level, but they may not have the same essential
characteristics. For example, in studying the attitude on introducing the vocational education
in schools, a survey may be conducted in a few selected schools operating in urban areas.
Thus, geographically schools may belong to the urban areas but they may differ in terms of
their status like private schools, government schools or central schools.
2. For the same sample size, cluster sampling generally provides less precision than either
simple random sampling or stratified sampling.

Multistage cluster sampling


In multistage sampling, we select a sample by using combinations of different sampling
methods. Samples are drawn in different stages. For example, in stage 1, cluster sampling can
be used to choose clusters from a population. Then, in stage 2, one may use simple random
sampling to select a subset of elements from each chosen cluster for the final sample.
Multistage sampling is used when the population is organized in different clusters. For
example, in investigating the scientific temper among the school children in a state, one may
divide the state into districts where each district may be further divided into schools. Here,
each district may be considered as a cluster of schools. In the first stage, a few districts are
randomly selected in the sample by using simple random or stratified sampling. In the second
stage, from each of these districts, a few schools may be selected by using simple random or
stratified sampling. Finally, in stage three, a few students may be randomly selected from each
of the schools selected in the second stage. The samples together so obtained in the third stage
from each of the schools comprise the ultimate sample. The most important consideration in
using multistage sampling is administrative convenience. It is normally used because it reduces
cost. This technique is sometimes used when no general sampling frame exists. Multistage
sampling becomes two-stage sampling if the ultimate sample is drawn in two stages.

Another example:
Suppose we want to take a sample of 5000 households from the state of UP. At the
first stage, the state may be divided into a number of districts and a few districts selected at
random. At the second stage, each selected district may be subdivided into a number of villages
and a sample of villages taken at random. At the third stage, a number of households may be
selected from each of the villages selected at the second stage.
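The three stages can be sketched as nested random selections. The function name three_stage_sample, the toy frame of 6 districts × 5 villages × 20 households and the numbers drawn at each stage are all assumptions for illustration; they are not figures from the UP example.

```python
import random

def three_stage_sample(state, n_districts, n_villages, n_households, seed=None):
    """state: {district: {village: [household, ...]}}. Returns the selected households."""
    rng = random.Random(seed)
    selected = []
    for d in rng.sample(sorted(state), n_districts):          # stage 1: districts
        villages = state[d]
        for v in rng.sample(sorted(villages), n_villages):    # stage 2: villages
            selected.extend(rng.sample(villages[v], n_households))  # stage 3: households
    return selected

# Hypothetical frame: 6 districts, 5 villages each, 20 households per village
state = {
    f"D{d}": {f"D{d}-V{v}": [f"D{d}-V{v}-H{h}" for h in range(20)] for v in range(5)}
    for d in range(6)
}
households = three_stage_sample(state, 2, 3, 4, seed=11)
print(len(households))  # 2 districts x 3 villages x 4 households = 24
```

For simplicity the sketch holds the whole frame in memory. In practice, village and household lists would be prepared only for the units selected at the preceding stage.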
Merits of multi stage sampling
■ Multi stage sampling introduces flexibility in sampling method, which is lacking in other
methods
■ It is the most time efficient and cost efficient probability design for large geographical
areas.
■ This method is much easier compared with other methods
■ In this type of sampling, a list of households is necessary only for the zones selected for
the sample, and a listing of individuals is necessary only for the chosen households.
■ It provides a way of selecting a random sample where there is no frame.
Demerits
■ Multi-stage cluster sampling is generally less accurate than simple random sampling with the
same sample size
■ The sampling error in this method is greater
■ This method is statistically less efficient than other probability sampling methods

Non-probability sampling
It is not always possible to undertake a probability method of sampling, such as random
sampling. For example, a complete sampling frame may not be available for certain groups of
the population, e.g. the elderly, people who are attending a football match, or people who shop in
a particular part of town. Another factor to bear in mind is that many of the probability
sampling methods described above may mean that researchers would have to undertake a
postal or telephone survey, or might be expected to go from house to house, and such surveys
can suffer from problems such as a low response rate. In a non-probability sample, some people
have a greater, but unknown, chance of selection than others.

This type of sampling technique does not give any assurance that every element has a chance
of being included in the sample. In non-probability samples, there is no way of calculating the
margin of error and the confidence level. The important non-probability methods are as follows:
• Convenience sampling
• Purposive sampling (Subjective Sampling)
• Judgment sampling
Convenience sampling
In this type of sampling, the choice of the sample is left completely to the convenience of the
interviewer. The cost involved in picking up the sample is minimum and the cost of data collection
is also generally low.
This technique is considered to be the easiest, most economical and least time-consuming.
For example, standing at a mall or a grocery store and asking people to answer questions.
However, such samples can suffer from excessive bias from known or unknown sources and also
there is no way that the possible errors can be quantified.
Purposive sampling (Subjective Sampling)
A purposive sample is one, which is selected by the researcher subjectively. The researcher
attempts to obtain sample that appears to him/her to be representative of the population and will
usually try to ensure that a range from one extreme to the other is included. In other words, in
purposive sampling the people, units or elements in the sample are selected because they are
regarded as having similar characteristics to the people in the designated research population.
So, for example, in research investigating the management skills of owner/managers of small
enterprises, the researcher might select some typical owner managers to take part in the study.
They will not be selected randomly.
Consider another example where a researcher wants to know the attitude of high anxiety students
towards certain academic issues. In this case, it is difficult to define or identify the population and
hence the researcher may select any group of high anxiety students, which is easily available for
the study.
One advantage of this kind of sample is that it is usually possible to get a targeted sample together
very quickly and hence cheaply.
This technique suffers from the drawback that the sample drawn is not easily defensible as being
representative of the population, owing to the potential subjectivity of the researcher.
Judgment sampling
In judgment sampling, the judgment or opinion of some experts forms the basis for sample
selection. The experts are persons who are believed to have information on the population which
can help in giving us better samples. Such sampling is very useful when we want to study rare
events.
Advantage
1. When there are only a small number of sampling units, random selection may miss some
important representative elements, whereas judgment sampling will definitely include
those units.
Disadvantage
1. The main disadvantage of this method is that sample may be affected due to the bias of the
investigator. There is a risk that the investigator may try to choose a sample in such a way
so as to establish his own preconceived opinions.
Previous Year Questions
1. What is meant by size of the sample? (2 marks)
2. Define Target population. (2 marks)
3. Compare sampling and non-sampling errors. (2 marks)
4. Briefly explain non-probability sampling methods (5 marks)
5. Describe simple random sampling and stratified random sampling. (5 marks)
6. Explain the advantages of sampling over census. Explain sampling and non-sampling errors.
Also explain simple random sampling and stratified random sampling. (10 marks)
7. Define sampling. (2 marks)
8. Distinguish between population and sample. (2 marks)
9. Distinguish between cluster and stratum. (2 marks)
10. Describe i) Simple random sampling ii) Systematic sampling (5 marks)
11. What is meant by sampling unit? (2 marks)
12. What is meant by non-random sampling? (2 marks)
13. Discuss about the advantages and limitations of sampling over census. (5 marks)
14. What is meant by probability sampling? Discuss in detail various methods of probability
sampling. (10 marks)
