This document provides an overview of sampling concepts and methods. It defines key terms like population, sample, sampling frame, sampling error, and non-sampling error. It discusses the steps in sample design including defining the population, identifying the sampling frame, selecting a sampling method/technique, and determining sample size. Probability and non-probability sampling methods are introduced. Practical considerations for determining an optimal sample size are outlined. The goal of sampling is to make inferences about a population from a representative subset or sample.
2. Unit-4 Sampling: Syllabus
• Sampling: Basic Concepts: Defining the Universe, Concepts
of Statistical Population, Sample, Characteristics of a good
sample. Sampling Frame, determining the sample frame,
Sampling errors, Non Sampling errors, Methods to reduce
the errors, Sample Size constraints, Non Response.
• Probability Sample: Simple Random Sample, Systematic
Sample, Stratified Random Sample, Area Sampling &
Cluster Sampling.
• Non Probability Sample: Judgment Sampling, Convenience
Sampling, Purposive Sampling, Quota Sampling &
Snowballing Sampling methods.
• Determining size of the sample: Practical considerations in
sampling and sample size, (sample size determination
formulae and numerical not expected)
7. Today’s class is on?
Sampling and its types
Before that, what do we need?
We need a good understanding of Population,
Sample & some definitions.
8. Definitions of related terms
• Universe or Population
– All the items under consideration in any field of
inquiry constitute a ‘Universe’ or ‘Population’.
• Census
• A complete enumeration / counting of all the
items in the population is known as a census
inquiry.
• A census inquiry is often not possible in practice,
so quite often the researcher selects only
a few items from the universe for study
purposes. The items so selected constitute
what is technically called a ‘sample’.
9. Sample Design : Population
• Population:
• refers to the entire group of people, events or
things of interest that the researcher wishes
to investigate
• In statistics, a population is a set of similar
items or events which is of interest for some
question or experiment.
10. Population
➢ It is the whole point of interest.
➢It refers to the group of people, items
or units under investigation and
includes every individual.
➢It is the entire pool from which a
statistical sample is drawn.
Population may refer to people,
objects, events, hospital visits,
measurements etc.
11. • Sampling:
Sampling is the process of selecting a
sufficient number of elements from the
population, so that the study of the sample &
understanding of its properties or
characteristics would make it possible for us to
generalize such properties or characteristics to
the population.
• Sampling is the process of selection of a
representative subset of individuals from within a
population to estimate characteristics of the whole
population.
12. Definitions of related terms : Sample
❑ A sample is a group of units / elements selected from
a larger group – the population
❑A sample is a subset of the population which is
representative of the population from which it was drawn.
❑ The sample is the group of individuals who participate in
the study, and the population is the broader group of people
to whom the study results will apply.
13. • Sampling frame:
The listing of all accessible population units from
which you will draw your sample is called the
sampling frame.
14. • Sample Design
– A sample design is a definite plan determined
before any data are actually collected for
obtaining a sample from a given population.
15. • Representative sample: a sample that reflects the
population accurately so that it is a microcosm
(small-scale version) of the population.
• Sampling bias: a distortion in the
representativeness of the sample that arises when
some members of the population (or more precisely
the sampling frame) stand little or no chance of
being selected for inclusion in the sample.
16. • Probability sample: a sample that has been selected
using random selection so that each unit in the
population has a known chance of being selected. The
aim of probability sampling is to keep sampling error to a
minimum.
• Non-probability sample: a sample that has not
been selected using a random selection method.
Essentially, this implies that some units in the
population are more likely to be selected than
others.
17. • Sampling error: the difference between a sample
and the population from which it is selected, even
though a probability sample has been selected.
• Non-sampling error: differences between the
population and the sample that arise either from
deficiencies in the sampling approach, such as an
inadequate sampling frame or non-response , or
from such problems as poor question wording, poor
interviewing, or faulty processing of data.
21. Ultimate goal of using statistics
[Figure: a sample is drawn from the population; characteristics of the population are inferred from the sample]
22. Ultimate goal of using statistics
• Generalize the study findings from a small subgroup of the
population to the whole population of interest, while quantifying
the uncertainty.
23. Example of Sampling
• Say we have a population of 321 infants in
Wadgaon, and we want to estimate the proportion of low
birth weight.
• We will draw a sample of size 50 from the population.
• Estimate the mean birth weight.
• Estimate the proportion of low birth weight.
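The example above can be sketched in Python. This is a minimal illustration only: the birth weights are synthetic (generated, not real Wadgaon data), and the 2,500 g cutoff for "low birth weight" is an assumption that the slide does not state.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

N, n = 321, 50            # population and sample sizes from the slide
LOW_BW_CUTOFF_G = 2500    # assumed cutoff for "low birth weight", in grams

# Hypothetical population: synthetic birth weights in grams (not real data).
population = [random.gauss(2900, 450) for _ in range(N)]

# Draw a simple random sample of 50 infants without replacement.
sample = random.sample(population, n)

# Sample estimates of the two population quantities of interest.
sample_mean = sum(sample) / n
sample_prop_low = sum(w < LOW_BW_CUTOFF_G for w in sample) / n

print(f"estimated mean birth weight: {sample_mean:.0f} g")
print(f"estimated proportion of low birth weight: {sample_prop_low:.2f}")
```

Both printed values are estimates: a different sample of 50 would give slightly different numbers.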
24. Sampling is a Prerequisite of
conducting a statistical analysis for
a valid inference
But before that we need:
• A better understanding of the research question.
• A clear idea of the target population to which you are
going to generalize your findings from a sample.
• A sample that is representative of the population.
• A properly designed study.
• An optimum sample size.
• Valid measures of the study outcome.
• An appropriate choice of statistical analysis.
25. Need of Sampling
• Lower Cost
• Greater accuracy of results.
• Greater speed of data collection
• Availability of Population elements
• Sample Versus Census. ( e.g. safety test)
26. Characteristics of good sample
design / Sampling
• 1) Sample design must result in a truly
representative sample.
• 2) Sample design must result in a small
sampling error.
• 3) Sample design must be viable in the
context of funds available for the
research study.
27. Characteristics of good sample
design
• 4) Sample design must be such that
systematic bias can be controlled in a
better way.
• 5) The sample should be such that the results
of the sample study can be applied, in
general, to the universe with a reasonable
level of confidence.
30. The Sampling Design Process
Define the Population
Determine the Sampling Frame
Select Sampling Technique(s)
Determine the Sample Size
Execute the Sampling Process
38. Sampling Frame:
• A perfect frame identifies each element once and
only once. Perfect frames are rarely available in real
life. A frame is subject to several types of defect which
may be broadly classified on the following lines.
• Incomplete Frame: When some legitimate (genuine)
sampling units of the population are omitted, the frame
is said to be incomplete.
• Inaccurate Frame: When some of the sampling units
of the population are listed inaccurately, or some units
which do not actually exist are included, the frame is
said to be inaccurate.
• If you use the list of ration cards as a frame to
select persons, such a frame will be inaccurate because details about
the persons, such as age, are never updated.
39. Sampling Frame:
• Inadequate Frame: A frame which, by its structure, does not include all units
of the population is an inadequate frame.
• If you use the list of names in the telephone directory of a city
as the frame for selecting a sample to collect information about a
consumer product, it will be an inadequate frame. It will
include only those persons who have a telephone, omitting
the majority of the residents of the city.
• Out of Date Frame: A frame is out of date when it has not
been updated, although it was accurate, complete and
adequate at the time of preparation.
• Census blocks used as a frame to select a sample of households are
fairly accurate immediately after the decennial census, but the frame becomes out of date as time passes.
41. 4) Determine the Sample Size
• Common Misconceptions
– The sample should be a proportion (often 5
or 10 per cent) of the population;
– The sample should total about 500;
– Any increase in the sample size will increase
the precision of the sample results.
◙ No such rule-of-thumb method is
adequate.
42. WHAT IS SAMPLE SIZE?
• This is the sub-population to be studied in order to
make an inference to a reference population (a
broader population to which the findings from a study
are to be generalized).
• In a census, the sample size is equal to the population
size. However, in research, because of time
and budget constraints, a representative sample is
normally used.
• The larger the sample size, the more
accurate the findings from a study.
43. WHAT IS SAMPLE SIZE DETERMINATION
• Sample size determination is the
mathematical estimation of the number of
subjects/units to be included in a study.
• When a representative sample is taken from
a population, the findings are generalized to
the population.
• Optimum sample size determination is
required for the following reasons:
– To allow for appropriate analysis
– To provide the desired level of accuracy
– To allow validity of significance tests
44. Sample Sizes Used in Marketing Research Studies
Type of Study                                                 | Minimum Size | Typical Range
Problem identification research (e.g. market potential)       | 500          | 1,000-2,500
Problem-solving research (e.g. pricing)                       | 200          | 300-500
Product tests                                                 | 200          | 300-500
Test marketing studies                                        | 200          | 300-500
TV, radio, or print advertising (per commercial or ad tested) | 150          | 200-300
Test-market audits                                            | 10 stores    | 10-20 stores
Focus groups                                                  | 2 groups     | 4-12 groups
45. SIZE OF SAMPLE
• Most researchers find it difficult to
determine the size of the sample.
• Krejcie and Morgan (1970) have
prepared a table of recommended sample sizes for a given population size.
47. There is only one method of determining
sample size that allows the researcher to
PREDETERMINE the accuracy of the sample
results…
The Confidence Interval
Method of Determining
Sample Size
48. Sample Size Formula - Proportion
• The sample size formula for estimating a
proportion (also called a percentage or share):
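A standard form of this formula, consistent with the symbols used on the later slides (z for the confidence level, p and q = 1 − p for the estimated shares), is n = (z² × p × q) / e², where e is the acceptable sample error expressed as a proportion. The sketch below assumes that form; the function name and the symbol e are illustrative and do not come from the slides.

```python
import math


def sample_size_for_proportion(p, e, z=1.96):
    """Sample size n = z^2 * p * q / e^2, rounded up to a whole respondent.

    p : expected share in the population (0 to 1)
    e : acceptable sample error as a proportion (0.05 means +/- 5%)
    z : standard normal value for the chosen confidence level
    """
    q = 1.0 - p
    return math.ceil(z ** 2 * p * q / e ** 2)


# Worst-case variability (p = q = 50%), +/- 5% error, 95% confidence.
print(sample_size_for_proportion(p=0.5, e=0.05))   # 385
# Tightening the error to +/- 3% raises the required sample size.
print(sample_size_for_proportion(p=0.5, e=0.03))   # 1068
```

Note that the population size does not appear in this form of the formula, which is one reason the "fixed percentage of the population" rule of thumb is unreliable.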
49. The Central Limit Theorem allows us to
use the logic of the Normal Curve
Distribution
• Since 95% of samples drawn from a
population will fall within ± 1.96 x
Sample error (this logic is based upon
our understanding of the normal curve)
we can make the following statement:
….
50. Practical Considerations in Sample Size
Determination
• How to estimate variability (p and q
shares) in the population
• Expect the worst case (p=50%; q=50%)
• Estimate variability: results of previous
studies or conduct a pilot study
51. Practical Considerations in Sample Size
Determination
• How to determine the amount of desired
sample error
• Researchers should work with managers
to make this decision. How much error is
the manager willing to tolerate (less error
= more accuracy)?
• Convention is ± 5%
• The more important the decision, the less
should be the acceptable level of the
sample error
52. Practical Considerations in Sample Size
Determination
• How to decide on the level of confidence
desired
• Researchers should work with managers
to make this decision. The higher the
desired confidence level, the larger the
sample size needed
• Convention is 95% confidence level
(z = 1.96, which is ± 1.96 s.d.’s)
• The more important the decision, the more
likely the manager will want more
confidence. For example, a 99%
confidence level has a z=2.58.
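The link between confidence level, z, and sample size can be sketched as follows. This assumes the standard proportion formula n = z² × p × q / e² (not printed on these slides) with the worst-case p = q = 50% and a ± 5% error; the function names are illustrative.

```python
import math
from statistics import NormalDist


def z_for_confidence(level):
    """Two-sided standard normal value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(0.5 + level / 2)


def sample_size(level, p=0.5, e=0.05):
    """Assumed formula: n = z^2 * p * (1 - p) / e^2, rounded up."""
    z = z_for_confidence(level)
    return math.ceil(z ** 2 * p * (1 - p) / e ** 2)


for level in (0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_for_confidence(level):.2f}, "
          f"n = {sample_size(level)}")
```

Moving from 95% to 99% confidence raises z from about 1.96 to about 2.58, and the required sample grows accordingly.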
53. Sample Size…
• Many numerical techniques for
determining sample sizes are available ,
but suffice it to say that the larger the
sample size is, the more accurate we can
expect the sample estimates to be.
54. If Sample size is too large
• The study will be difficult and costly
• Time constraint
• Available cases, e.g. rare disease.
• Loss of accuracy.
• Hence, optimum sample size must be
determined before commencement of a study.
56. Sampling and Non-Sampling Errors…
• Two major types of error can arise when a sample of
observations is taken from a population:
• sampling error and non sampling error.
• Sampling error refers to differences between the sample
and the population that exist only because of the
observations that happened to be selected for the sample.
It is random, and we have no control over it.
• Non sampling errors are more serious and are due to
mistakes made in the acquisition of data or due to the
sample observations being selected improperly.
They are most likely caused by poor planning, sloppy work, act of the
Goddess of Statistics, etc.
57. Sampling Error…
• Sampling error refers to differences
between the sample and the population
that exist only because of the observations
that happened to be selected for the
sample.
• Increasing the sample size will reduce this
type of error.
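A small simulation can illustrate this. It is a sketch under stated assumptions: the population is synthetic, and the spread (standard deviation) of repeated sample means is used as a stand-in for the size of the sampling error.

```python
import random
import statistics


def spread_of_sample_means(population, n, repeats=500, seed=0):
    """Draw `repeats` simple random samples of size n and return the
    standard deviation of their means (a proxy for sampling error)."""
    rng = random.Random(seed)
    means = [statistics.fmean(rng.sample(population, n))
             for _ in range(repeats)]
    return statistics.stdev(means)


# Hypothetical population of 10,000 measurements (synthetic data).
pop_rng = random.Random(1)
population = [pop_rng.gauss(50, 10) for _ in range(10_000)]

small = spread_of_sample_means(population, n=25)
large = spread_of_sample_means(population, n=400)
print(f"spread of sample means, n=25:  {small:.2f}")
print(f"spread of sample means, n=400: {large:.2f}")
```

The means from the larger samples cluster more tightly around the population mean. This affects only sampling error; a larger sample does nothing to remove non-sampling errors such as a poor frame.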
61. Sampling errors
• Random sampling error – the deviation between the
sample statistic and the population parameter caused
by chance in selecting a random sample.
– this is the only component of error that the margin of error accounts for
• Bad Sampling Methods
– Convenience Sampling
– Voluntary Response
• Undercoverage – when some members of the
population are left out of the process of choosing the
sample.
62. Undercoverage
• The sampling frame is the list of individuals
from which the sample is actually
chosen.
– If the sampling frame leaves out certain classes
of people, a random sample from that frame will be
biased.
63. Example- Undercoverage
• We used a telephone book to randomly
choose numbers to dial and ask “What
brand of soap do you use most often?”
– Population: All Indian adults
– Sampling Frame: All adults with listed phone numbers
– Error: Undercoverage
• By using the telephone book, we have left out all those
people who do not have phones and all the people who
have unlisted phone numbers.
64. Reducing sampling error
• If sampling principles are applied carefully within the
constraints of available resources, sampling error can be
kept to a minimum.
• Increase Sample size
• Stratification
66. Non sampling errors
• 1) Coverage / Selection bias : This may be
due to Under coverage (Missing elements)
or Over coverage ( duplications / Population
sampled is not exactly the population of
interest)
• 2) Non Response: Non response error may
be
i) Unit non- response: Failure to obtain
response from pre-chosen sampling unit
ii) Item non response: Failure to obtain
response for specific question or item
67. Non sampling errors
• 3) Measurement / Data acquisition error:
• Occurs when recorded responses differ
from the actual responses. It may be due to:
i) Respondent: False information provided by the
respondent due to:
prestige issues /
sensitivity /
misunderstanding /
lack of motivation to give a correct answer
68. Non sampling errors
• 3) Measurement / Data acquisition error:
• ii) Instrument : (Questionnaire)
---- Unclear , ambiguous / difficult
question / incomplete options
iii) Interviewer error:
---- Inadequate training
Wrong method for interview
4) Processing / Data handling error:
Errors in Coding, data entry, Data analysis
69. Nonsampling errors
• Processing errors- mistakes in mechanical
tasks, such as doing arithmetic or entering
responses into a computer.
• Response errors – occur when a subject
gives an incorrect response.
– e.g. not understanding a question, lying about a
question.
• Nonresponse – the failure to obtain data from
a selected individual in the survey.
70. Reducing non-sampling errors
• Can be minimised by adopting any of the following
approaches:
– using an up-to-date and accurate sampling
frame.
– careful selection of the time the survey is
conducted.
– planning for follow up of non-respondents.
– careful questionnaire design.
– providing thorough training and periodic
retraining of interviewers and processing staff.
71. Reducing non-sampling errors – cont’d
- designing good systems to capture errors that occur
during the process of collecting data, sometimes called
Data Quality Assurance Systems.
73. Classification of Sampling
Techniques
• Sampling Techniques
– Nonprobability Sampling Techniques: Convenience Sampling, Judgmental Sampling, Quota Sampling, Snowball Sampling
– Probability Sampling Techniques: Simple Random Sampling, Systematic Sampling, Stratified Sampling, Cluster Sampling, Other Sampling Techniques
74. Why we need to know the classification of
sampling?
• To select the appropriate sampling for a
specific study
• To describe the sampling method in a
single word.
• Anything more?
75. Probability Sampling
➢The best sampling is probability sampling, because it
increases the likelihood of obtaining samples that are
representative of the population.
➢ It is based on the fact that every member of a
population has a known and/or equal chance of
being selected.
➢Probability sample: a sample that has been selected using
random selection so that each unit in the population has a
known chance of being selected.
➢ For example, if you had a population of 100 people,
each person's probability of being selected into the
sample for the study would be known.
76. Advantages of Probability Sampling
•Involves a lesser degree of judgment
When a number is assigned to each item of the population, the number is
chosen at random, which makes the process more effective and
more accurate.
•Comparatively easier way of sampling
Probability sampling does not involve any long or complex process.
•Sample representative of population
Probability sampling uses random numbers, which helps ensure that the
samples vary as much as the population itself.
• Less prone to bias due to random selection of subjects.
77. Disadvantages of Probability Sampling
• Chances of selecting a specific class of samples only
If a surveyor is appointed to collect data on
family members, it is likely that s/he will develop a
habit of numbering from the eldest member to the
youngest, so the numbers only ever increase or decrease.
In this case, only the oldest or the youngest generations will be taken as
samples.
• No advantage with small samples.
• Time consuming and costly.
78. Types of Probability Sampling
• 1) Simple Random Sampling
• 2) Systematic Sampling
• 3) Stratified Random Sample,
• 4) Cluster Sampling
• 5) Area Sampling
79. 1) Simple Random Sampling
• A sampling process where each element
in the target population has an equal
chance or probability of inclusion in the
sample.
Probability of selection = Sample size / Population size
80. 1) Simple Random Sampling
• The simple random sample is the most
basic form of probability sample.
• With random sampling, each unit of the
population has an equal probability of
inclusion in the sample.
• E.g. imagine that we decide that we have
enough money to interview 450 (out of a
total of 9,000) employees at the company.
This means that the probability of inclusion
in the sample is
81. 1) Simple Random Sampling
• 450 / 9,000 = 0.05, i.e. 1 in 20.
• This is known as the sampling fraction and
is expressed as n/N,
• where n is the sample size and
• N is the population size
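The employee example can be sketched in Python as follows. The employee ID list is a hypothetical sampling frame invented for illustration; `random.sample` draws without replacement, so every employee has the same chance n/N of inclusion.

```python
import random

N, n = 9000, 450                         # population and sample sizes from the example
employee_ids = list(range(1, N + 1))     # hypothetical sampling frame of employee IDs

sampling_fraction = n / N                # n/N = 0.05, i.e. 1 in 20

rng = random.Random(7)                   # fixed seed so the sketch is repeatable
sample = rng.sample(employee_ids, n)     # simple random sample without replacement

print(f"sampling fraction n/N = {sampling_fraction}")
print(f"first five selected IDs: {sorted(sample)[:5]}")
```

In practice the frame would be the company's actual employee list rather than a range of numbers.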
82. Simple Random Sampling (SRS)
❑When population is:
▪ Small
▪ Homogeneous
▪ Readily available
❑Each element of the frame has equal probability of
selection.
❑Provides for greatest number of possible samples.
▪ Assigning number to each unit in sampling frame.
❑ A table of random numbers or lottery system is used
to determine which units are selected
84. Advantages and Disadvantages of SRS
❖ Advantages:
▪ Easy to conduct.
▪ High probability of achieving a representative
sample.
▪ Meets assumptions of many statistical procedures.
▪ Easy to analyze data.
❖ Disadvantages:
▪ Identification of all members of the population can
be difficult.
▪ Time consuming and costly.
▪ Larger sample needed.
85. 2) Systematic Sampling
• The systematic sampling procedure involves
the selection of every kth case in a list. This
procedure is useful when the sampling frame is
available in the form of a list.
Skip interval (k) = Population size / Sample size
86. 2) Systematic Sampling
• With this kind of sample, you select units
directly from the sampling frame—that is,
without resorting to a table of random
numbers.
87. Systematic Sampling
➢The elements of the population are put in a list.
➢Then every kth element in the list is chosen (systematically)
for inclusion in the sample.
➢For example, if the population of study contained 2,000
students at a high school and the researcher wanted a sample
of 100 students, the students are put in a list and every 20th
student is selected for inclusion in the sample.
➢To guard against human bias,
the researcher should select the first individual at random:
a ‘systematic sample with a random start’.
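A systematic sample with a random start can be sketched as below. The student list is a hypothetical frame, and the function assumes the population size is an exact multiple of the sample size, as in the 2,000 / 100 example.

```python
import random


def systematic_sample(frame, n, rng=random):
    """Select every kth element of `frame` after a random start.

    Assumes len(frame) is a multiple of n, so k = N / n is a whole number.
    """
    k = len(frame) // n           # skip interval: population size / sample size
    start = rng.randrange(k)      # random start within the first interval
    return frame[start::k][:n]


# Hypothetical frame: 2,000 students in list order.
students = [f"student_{i:04d}" for i in range(1, 2001)]
sample = systematic_sample(students, 100, random.Random(3))

print(len(sample))                # 100
print(sample[:3])
```

Only the starting position is random; once it is fixed, the rest of the sample is determined, which is why a hidden pattern in the list order can bias the result.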
89. Advantages and Disadvantages of SS
❖ Advantages:
▪ Simple to design.
▪ Sample easy to select.
▪ Easier than simple random.
▪ Easy to determine sampling distribution of mean or
proportion.
❖ Disadvantages:
▪ Trends in list may bias results.
▪ Difficult to assess precision of estimate from one
survey.
▪ Moderate in cost.
90. 3) Stratified Sampling
• In this method, the population is divided into strata
according to some criteria / characteristics
that are common.
• Then from each stratum a specified number of
units is picked.
• E.g. if a survey requires separation by gender,
the population is divided into two
strata, i.e. male & female.
91. Stratified Sampling
❑Stratified Random Sampling is an improvement over
systematic sampling.
❑ In this method, the population is divided into smaller
homogeneous groups or strata on the basis of some
characteristics and from each of these smaller
homogeneous groups or strata members are selected
randomly.
❑ Finally, within each stratum, the simple random or
systematic sampling method is used to select the final
sample.
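The procedure can be sketched as follows, assuming proportional allocation (the same sampling fraction in every stratum) and simple random sampling within each stratum. The frame of 600 female and 400 male units is hypothetical, and the function name is illustrative.

```python
import random
from collections import defaultdict


def stratified_sample(units, stratum_of, fraction, rng):
    """Split `units` into strata, then draw a simple random sample
    of the same fraction from each stratum (proportional allocation)."""
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)

    sample = []
    for members in strata.values():
        n_h = round(len(members) * fraction)   # units to pick from this stratum
        sample.extend(rng.sample(members, n_h))
    return sample


# Hypothetical frame: (gender, id) pairs, 600 female and 400 male.
frame = [("F", i) for i in range(600)] + [("M", i) for i in range(400)]
sample = stratified_sample(frame, lambda u: u[0], 0.10, random.Random(11))

print(sum(1 for u in sample if u[0] == "F"))   # 60
print(sum(1 for u in sample if u[0] == "M"))   # 40
```

Because each stratum is sampled separately, the sample reproduces the 60/40 split of the population exactly, which a simple random sample would only do on average.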
94. Advantages and Disadvantages of
Stratified Sampling
❖ Advantages:
▪ Control of sample size in strata.
▪ Increased statistical efficiency.
▪ Provides data to represent and analyze subgroups.
▪ Enables use of different methods in strata.
❖ Disadvantages:
▪ Increased error if subgroups are selected at different
rates.
▪ Time consuming and expensive.
▪ Requires prior knowledge of the composition and distribution of the
population.
95. Cluster Sampling
• Cluster sampling is a probability sampling
technique where researchers divide the
population into multiple groups (clusters) for
research.
• We are not choosing clusters based on
characteristics, i.e. the clusters are heterogeneous.
• Researchers then select random groups with a
simple random or systematic random sampling
technique for data collection and data analysis.
96. Cluster Sampling
❑ Study population is divided into heterogeneous groups
usually based on geographical areas.
❑ These groups are known as “clusters”, and a sample of
them is randomly drawn.
❑ One thing to consider: there should be heterogeneity within the clusters
and homogeneity between the clusters.
❑The method is mostly feasible in the case of a diverse
population spread over different areas.
98. Types of cluster sampling
•There are two ways to classify this sampling technique.
•The first way is based on the number of stages followed
to obtain the cluster sample, and the
•second way is the representation of the groups in the
entire cluster.
•In most cases, sampling by clusters happens over multiple
stages.
•A stage is considered to be the step taken to get to the
desired sample. We can divide this technique into single-
stage, two-stage, and multiple stages.
Cluster Sampling Example
99. • Single-stage cluster sampling:
• As the name suggests, sampling is done just
once. An example of single-stage cluster
sampling – An NGO wants to create a sample
of girls across five neighboring towns to
provide education. Using single-stage
sampling, the NGO randomly selects towns
(clusters) to form a sample and extend help to
the girls deprived of education in those towns.
100. • Two-stage cluster sampling:
• Here, instead of selecting all the elements of a
cluster, only a handful of members are chosen
from each group by implementing systematic or
simple random sampling.
• An example of two-stage cluster sampling – A
business owner wants to explore the performance
of his/her plants that are spread across various
parts of the U.S. The owner creates clusters of the
plants. He/she then selects random samples from
these clusters to conduct research.
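The difference between single-stage and two-stage cluster sampling can be sketched as follows. The five towns and the 40 girls per town are hypothetical data loosely modelled on the NGO example, and the function names are illustrative.

```python
import random


def single_stage_cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep every unit in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]


def two_stage_cluster_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then draw a simple random sample
    of units inside each chosen cluster."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit
            for c in chosen
            for unit in rng.sample(clusters[c], n_per_cluster)]


# Hypothetical frame: five towns (clusters) with 40 girls each.
clusters = {f"town_{t}": [f"town_{t}_girl_{i}" for i in range(40)]
            for t in "ABCDE"}

one_stage = single_stage_cluster_sample(clusters, 2, random.Random(5))
two_stage = two_stage_cluster_sample(clusters, 2, 10, random.Random(5))

print(len(one_stage))   # 80: all 40 girls in each of 2 towns
print(len(two_stage))   # 20: 10 girls in each of 2 towns
```

In both versions only the list of clusters is needed up front; a list of individuals is required only inside the clusters that were actually chosen.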
102. Area Sampling
• Area sampling methods have been applied to
national populations, county populations, and even
smaller areas where there are well-defined
political or natural boundaries.
• Suppose you want to survey the adult residents of
a city. You would rarely be able to secure a listing
of such individuals. It would be simple, however, to
get a detailed city map that shows the blocks of
the city. If you take a sample of these blocks, you
are also taking a sample of the adult residents of
the city.
103. Area Sampling
❑ This sampling is also called cluster sampling.
❑ Area sampling is a method of sampling used when no
complete frame of reference is available.
❑ The total area under investigation is divided into small
sub-areas which are sampled at random or according
to a restricted process (stratification of sampling).
❑ Each of the chosen sub-areas is then fully inspected and
enumerated, and may form the basis for further sampling
if desired.
105. Advantages and Disadvantages of Cluster
Sampling
❖ Advantages:
▪ Very feasible when populations are large and spread
over a large geographical region.
▪ Economically more efficient.
▪ Increased variability is observed in results.
❖Disadvantages:
▪ If chosen cluster sample has a biased opinion then the
entire population is inferred to have the same opinion.
▪ Often lower statistical efficiency due to subgroups
being homogeneous rather than heterogeneous.
▪ Larger sampling error.
106. Non-Probability Sampling Methods
◼ Convenience Sample :
The sampling procedure used to obtain those
units or people most conveniently available.
✓Subjects selected because it is easy to access them.
• No reason tied to purposes of research.
▪Students in your class, people on State Street, friends
◼ Why: speed and cost
◼ External validity?
◼ Internal validity
◼ Is it ever justified?
107. Convenience Sampling
Convenience sampling attempts to obtain a sample of
convenient elements. Often, respondents are selected
because they happen to be in the right place at the right
time.
– use of students, and members of social organizations
– mall intercept interviews without qualifying the
respondents
– department stores using charge account lists
– “people on the street” interviews
109. ◼ Advantages
◼ Very low cost
◼ Extensively used/understood
◼ No need for list of population elements
◼ Disadvantages
◼ Variability and bias cannot be measured
or controlled
◼ Projecting data beyond sample not
justified.
110. Judgment or Purposive Sample
◼ The sampling procedure in which an
experienced researcher selects the sample
based on some appropriate characteristic
of sample members… to serve a purpose.
➢Subjects selected for a good reason tied to
purposes of research
➢ Small samples < 30, not large enough for power of
probability sampling.
➢ Nature of research requires small sample
➢ Choose subjects with appropriate variability in
what you are studying
➢ Hard-to-get populations that cannot be found
through screening general population
111. Judgmental Sampling
Judgmental sampling is a form of convenience
sampling in which the population elements are selected
based on the judgment of the researcher.
– test markets
– purchase engineers selected in industrial marketing
research
– expert witnesses used in court
112. ◼ Advantages
◼ Moderate cost
◼ Commonly used/understood
◼ Sample will meet a specific objective
◼ Disadvantages
◼ Bias!
◼ Projecting data beyond sample not
justified.
113. Quota Sample
◼ The sampling procedure that ensures
that a certain characteristic of a
population sample will be represented
to the exact extent that the
investigator desires.
◼ Specific number of sample units (Quota)
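Quota filling can be sketched as below. The stream of passers-by and the quota of 10 men and 10 women are hypothetical. Respondents are accepted in the order they happen to arrive until each quota is full, so there is no random selection and no sampling frame.

```python
def quota_sample(stream, group_of, quotas):
    """Accept people from `stream` in arrival order until every
    group's quota is filled. Selection is not random."""
    remaining = dict(quotas)
    sample = []
    for person in stream:
        group = group_of(person)
        if remaining.get(group, 0) > 0:
            sample.append(person)
            remaining[group] -= 1
        if not any(remaining.values()):
            break                      # all quotas are full
    return sample


# Hypothetical stream of passers-by: every third person is female.
passers_by = [("F", i) if i % 3 == 0 else ("M", i) for i in range(200)]
sample = quota_sample(passers_by, lambda p: p[0], {"M": 10, "F": 10})

print(sum(1 for p in sample if p[0] == "M"))   # 10
print(sum(1 for p in sample if p[0] == "F"))   # 10
```

The quotas guarantee the desired composition of the sample, but who ends up inside each quota depends on convenience, which is why variability and bias cannot be measured.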
116. ◼ Advantages
◼ moderate cost
◼ Very extensively used/understood
◼ No need for list of population elements
◼ Introduces some elements of
stratification
◼ Disadvantages
◼ Variability and bias cannot be measured
or controlled (classification of subjects)
◼ Projecting data beyond sample not
justified.
117. Snowball sampling
◼ The sampling procedure in which the
initial respondents are chosen by
probability or non-probability methods,
and then additional respondents are
obtained from information provided by the
initial respondents.
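The referral process can be sketched as a breadth-first walk over a referral network. The names and the `referrals` mapping are hypothetical, and the cap on sample size is an assumption added to make the sketch terminate at a chosen size.

```python
from collections import deque


def snowball_sample(referrals, seeds, max_size):
    """Start from the initial respondents (`seeds`) and keep adding
    the people they refer, until `max_size` respondents are reached."""
    sample = []
    seen = set(seeds)
    queue = deque(seeds)
    while queue and len(sample) < max_size:
        person = queue.popleft()
        sample.append(person)
        for contact in referrals.get(person, []):
            if contact not in seen:    # do not recruit anyone twice
                seen.add(contact)
                queue.append(contact)
    return sample


# Hypothetical referral network: who names whom.
referrals = {
    "ana": ["ben", "cho"],
    "ben": ["dev"],
    "cho": ["dev", "eli"],
    "dev": [],
    "eli": ["fay"],
}

sample = snowball_sample(referrals, seeds=["ana"], max_size=5)
print(sample)   # ['ana', 'ben', 'cho', 'dev', 'eli']
```

Everyone in the sample is connected to the initial respondent, which shows why the sampling units are not independent.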
119. ◼ Advantages
◼ low cost
◼ Useful in specific circumstances
◼ Useful for locating rare populations
◼ Disadvantages
◼ Bias because sampling units not
independent
◼ Projecting data beyond sample not
justified.
120. Panel Sampling
• The same units or elements are measured
on subsequent occasions.
• E.g. some households are surveyed to learn their
consumption pattern, and the same households
are surveyed again after six months.
121. Master Samples
• A master sample is one from which
repeated sub-samples can be taken as
and when required from the same area of
population.