Chapter 6: Introduction to sampling and sample size determination
Samrawit .F (MSC)
Objective
Terminologies
• Sampling frame: the list of all units in the reference population, from
which the sample is to be picked.
Sampling technique
Why sample?
• The population of interest is usually too large to
attempt to survey all of its members.
Basic conditions in sampling process
• Sample must be well chosen: – Representativeness
• Sample must be sufficiently large: – Minimizes
sampling variation
• There must be adequate coverage of the sample: –
Information should be obtained from almost all
Ø Two keys
1. Selecting the right people
§ Have to be selected scientifically so that they are representative of the
population
2. Selecting the right number of the right people
§ To minimize sampling errors, i.e., choosing the wrong people by chance
Basic terms
Ø A population is a group of individuals (persons, objects, or
items) from which samples are taken for measurement.
Ø Reference population (or target population): the population
of interest to which the researchers would like to
generalize.
Ø Source population: the subset of the target population from
which samples are drawn.
Ø Study population: the group that is studied, either in total or
by selecting a sample of its members.
Ø Study unit: the unit on which information will be collected:
persons, families, housing units, health facilities, schools.
Hierarchy of sampling
• Study subjects: the actual participants in the study
• Sample: subjects who are selected
• Sampling frame: the list of potential subjects from which the sample is drawn
• Source population: the population from whom the study subjects would be obtained
• Target population: the population to whom the results would be applied
Advantage of sampling
Drawback of sampling
Characteristics of Good Samples
1. Representation
• Sample surveys are almost never conducted for the purposes
of describing the particular sample under study. Rather they
are conducted for purposes of understanding the larger
population from which the sample was initially selected
Ø Three factors that influence sample representativeness
• Sampling procedure
• Sample size
• Participation (response)
Types of sampling method
2. Non-probability sampling is a sampling method where
every item has an unknown chance of being selected
Types of sampling
1. Simple random sampling
• Involves random selection
• Most common form of probability sampling.
• To use a SRS method:
– Make a numbered list of all the units in the population
(sampling frame)
– Each unit should be numbered from 1 to N (where N is
the size of the population)
– Decide on the size of the sample
– Select the required number.
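The SRS steps above can be sketched in Python. This is only an illustration: the frame of 500 numbered units, the sample size of 20, and the function name `simple_random_sample` are assumptions made for the example, not part of the slides.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement."""
    if n > len(frame):
        raise ValueError("sample size cannot exceed the frame size")
    rng = random.Random(seed)  # seed only makes the draw reproducible
    return rng.sample(frame, n)


# Hypothetical sampling frame: units numbered 1..N with N = 500
frame = list(range(1, 501))

# Decide on the sample size (20 here) and select the required number
sample = simple_random_sample(frame, n=20, seed=42)
print(sorted(sample))
```

Every unit in the frame has the same chance of selection, which is what makes the draw a simple random sample.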
Random number table
• It is a table of random numbers constructed by a process
that
SIMPLE RANDOM SAMPLING
2. Systematic random sampling
3. Stratified sampling
• It is done when the population is known to have
heterogeneity with regard to some factors and those factors
are used for stratification
1) Proportionate STRS
2) Disproportionate STRS
• In the case of Proportionate STRS
- Determine the proportion of each stratum in
the study population:
p = (number of elements in the stratum) / (total population size)
- Determine the number of elements to be
selected from each stratum = (n) x (p)
- Select the required number of elements from
each stratum with the SRS technique.
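As a rough sketch of proportionate allocation, the function below computes n x p for each stratum. The stratum names and sizes are hypothetical, and the rounding rule is an assumption: after rounding, the per-stratum counts may not add up exactly to n and may need a manual adjustment.

```python
def proportionate_allocation(stratum_sizes, n):
    """Return the number of elements to select from each stratum (n x p)."""
    total = sum(stratum_sizes.values())
    allocation = {}
    for name, size in stratum_sizes.items():
        p = size / total               # proportion of the stratum in the population
        allocation[name] = round(n * p)  # rounded; totals may differ slightly from n
    return allocation


# Hypothetical study population of 1,000 split into two strata
strata = {"urban": 600, "rural": 400}
print(proportionate_allocation(strata, n=100))
```

The selected counts would then be drawn within each stratum by simple random sampling.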
• In the case of Disproportionate STRS
- Allocate an equal sample size to each stratum
- Determine the number of elements to be
selected from each stratum = sample size (n) / number of strata (k)
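The equal allocation described above can be sketched as follows. The three strata, their sizes, and the choice to round n / k down are assumptions for illustration; the sketch also assumes every stratum has at least n / k units.

```python
import random


def equal_allocation_sample(strata, n, seed=None):
    """Select n / k units from each of the k strata by simple random sampling."""
    rng = random.Random(seed)
    per_stratum = n // len(strata)  # n / k, rounded down (an assumption)
    return {name: rng.sample(units, per_stratum) for name, units in strata.items()}


# Hypothetical strata of unequal size (100, 30 and 20 units)
strata = {
    "A": list(range(0, 100)),
    "B": list(range(100, 130)),
    "C": list(range(130, 150)),
}
selected = equal_allocation_sample(strata, n=30, seed=7)
print({name: len(units) for name, units in selected.items()})
```

Because small strata receive the same number of units as large ones, estimates for the whole population usually need weighting.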
Ø The advantage of stratified random sampling is that it
increases the likelihood of representation, especially if
the sample size is small
4. Cluster sampling
• It is the selection of groups of study units (clusters) instead of
the selection of study units individually
Steps in cluster sampling
1. Divide the population into groups or clusters
2. A number of clusters are selected randomly to represent
the total population, and then all units within selected
clusters are included in the sample.
3. No units from non-selected clusters are included in the
sample.
4. Differs from stratified sampling, where some units are
selected from each group.
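A minimal sketch of the cluster sampling steps, assuming the population is already grouped into named clusters. The village names and household labels below are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Randomly select whole clusters and include every unit inside them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # step 2: pick clusters at random
    # All units in the chosen clusters enter the sample; other clusters contribute none
    sample = [unit for name in chosen for unit in clusters[name]]
    return chosen, sample


# Hypothetical population: households grouped by village (the clusters)
villages = {
    "village_1": ["h1", "h2", "h3"],
    "village_2": ["h4", "h5"],
    "village_3": ["h6", "h7", "h8", "h9"],
    "village_4": ["h10"],
}
chosen, sample = cluster_sample(villages, n_clusters=2, seed=1)
print(chosen, sample)
```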
5. Multi-stage sampling
• Similar to cluster sampling, except that it involves picking
a sample from within each chosen cluster, rather than
including all units in the cluster.
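The difference from cluster sampling can be illustrated with a two-stage sketch: clusters are chosen first, then units are sampled inside each chosen cluster. The district data and the function name `two_stage_sample` are assumptions for the example; real multi-stage designs may have more stages.

```python
import random


def two_stage_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Stage 1: sample clusters. Stage 2: sample units within each chosen cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    result = {}
    for name in chosen:
        units = clusters[name]
        k = min(n_per_cluster, len(units))  # small clusters contribute all their units
        result[name] = rng.sample(units, k)
    return result


# Hypothetical clusters: five districts with 40 households each
districts = {f"district_{i}": [f"d{i}_hh{j}" for j in range(40)] for i in range(1, 6)}
print(two_stage_sample(districts, n_clusters=2, n_per_cluster=5, seed=3))
```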
Advantages
• No need to have a list of all units in the population.
• Saves a great amount of time and effort
Disadvantages
• More information is needed in this type of sampling, which
may not be available
• Sampling error is compounded at each stage
• Provides less precise estimates
2. Non-probability sampling
1. Convenience sampling
• Sometimes known as grab or opportunity sampling
or accidental or haphazard sampling.
• For convenience, the study units that are available
at the time of data collection are selected
• Used in many clinic-based studies
2. Quota sampling
• Selection continues until a specific number of units (quotas) for
different categories of the population has been
selected.
• Similar to stratified sampling but does not involve random
selection
• It is based on the researcher’s judgment
3. Purposive sampling
• Often used in qualitative studies (such as those conducting
focus group discussions and in-depth interviews)
4. Snowball sampling
• Also called chain referral sampling
Errors in sampling
1. Sampling error – Random error
Ø The uncertainty associated with an estimate that is
based on data gathered from a sample of the
population rather than the full population is known
as sampling error.
Ø It is an error arising from the sampling process
itself
Ø Sampling error can be minimized by increasing the
size of the sample.
Ø Cannot be avoided or totally eliminated
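The effect of sample size on sampling error can be illustrated with a small simulation. Everything here is an assumption for demonstration: the artificial population of the values 1 to 10,000, the two sample sizes, and the use of the spread of repeated sample means as a stand-in for sampling error.

```python
import random
import statistics


def spread_of_sample_means(population, n, repeats=500, seed=0):
    """Standard deviation of the means of repeated simple random samples of size n."""
    rng = random.Random(seed)
    means = [statistics.mean(rng.sample(population, n)) for _ in range(repeats)]
    return statistics.stdev(means)


# Hypothetical population of 10,000 values
population = list(range(1, 10001))

small = spread_of_sample_means(population, n=25)
large = spread_of_sample_means(population, n=400)
print(f"spread with n=25: {small:.1f}, spread with n=400: {large:.1f}")
```

The spread shrinks as n grows but does not reach zero unless the whole population is measured, which matches the last two points above.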
2. Non-sampling error (Bias)
It is a type of systematic error in the design or
conduct of a sampling procedure which results in
distortion of the sample, so that it is no longer
representative of the reference population.
Sample size determination
• A sample size determination is the act of choosing
the number of observations or replicates to include in
a statistical sample.
Sample Size Determination
The answer will depend on the aims, nature and scope of the
study and on the expected result, all of which should be carefully
considered at the planning stage.
Basic things in sample size determination
Sample……
o If the sample (“n”) is large
§ Increases accuracy
§ Costly / complex
o If the sample is small
§ Decreases accuracy
§ Less costly
Ø Take the optimum sample size. How?
Factors to determine sample size
• Size of population
• Resources – subjects, financial, manpower
• Method of Sampling- random, stratified
• Degree of difference to be detected
• Degree of Accuracy (or errors)
- Type I error (alpha) p<0.05
- Type II error (beta) less than 0.2 (20%)
- Power of the test : more than 0.8 (80%)
• Statistical Formulae
• Dropout rate, non-compliance to treatment
o Sample size determination depending on outcome variables.
• The third category covers continuous response variables such as
birth weight, age at first marriage, blood pressure and serum
uric acid level, for which numerical measurements are usually
made.
• In this case the data are summarized in the form of means and
standard deviations or their derivatives.
Sample Size………...
The sample size determination formulas come from the formulas for
the maximum error of the estimates and are derived by solving for n.
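As a sketch of that derivation for a mean, assume the maximum error w of the estimate is the usual normal-based margin of error, with z_{α/2} the standard normal value for the chosen confidence level and σ the population standard deviation (σ is notation introduced here for illustration). Solving for n gives:

```latex
w = z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}
\;\Longrightarrow\;
\sqrt{n} = \frac{z_{\alpha/2}\,\sigma}{w}
\;\Longrightarrow\;
n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{w^{2}}
```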
Sample size for single population mean
Maximum acceptable difference (d or w): This is the maximum
amount of error that you are willing to accept.
Desired confidence level (Zα/2): your level of certainty that
the sample mean does not differ from the true population mean by
more than the maximum acceptable difference. Commonly we use
a 95% confidence level.
Then the sample size determination formula for a single population
mean is defined by:
$$n = \frac{z_{\alpha/2}^{2}\,\sigma^{2}}{w^{2}}$$
where σ is the population standard deviation.
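The single-mean formula, n = z²σ²/w² with z the standard normal value for the chosen confidence level, σ the population standard deviation and w the maximum acceptable difference, can be sketched in Python. The values σ = 15 and w = 2 are hypothetical, and rounding up to the next whole subject is an assumption made here.

```python
import math
from statistics import NormalDist


def sample_size_mean(sigma, w, confidence=0.95):
    """Sample size for a single population mean: n = z^2 * sigma^2 / w^2."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for 95% confidence
    return math.ceil((z ** 2) * (sigma ** 2) / (w ** 2))


# Hypothetical inputs: standard deviation 15, maximum acceptable difference 2
print(sample_size_mean(sigma=15, w=2))
```

Halving w roughly quadruples the required sample size, since w enters the formula squared.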
Sample Size for Single Population Proportion
Then the formula for the sample size of a single population proportion is defined
as:
$$n = \frac{z_{\alpha/2}^{2}\,p\,(1-p)}{w^{2}}$$
Where α = the level of significance, which can be obtained as 1 − confidence level.
p = best estimate of the population proportion
w = maximum acceptable difference
z_{α/2} = the value from the standard normal table for the given confidence level
Example 1
An MPH student wants to conduct research on the prevalence of ANC utilization among
mothers in DABAT district. Given that the prevalence found in a previous study was
45.7%, what sample size should he take to address his objective?
Solution:
Ø Margin of error (d or w) = 5%
Ø A confidence level of 95% gives Zα/2 = 1.96.
Ø Then using the formula:
$$n = \frac{z_{\alpha/2}^{2}\,p\,(1-p)}{w^{2}} = \frac{1.96^{2} \times 0.457 \times (1-0.457)}{0.05^{2}} = \frac{1.96^{2} \times 0.457 \times 0.543}{0.05^{2}} \approx 381.3$$
Rounding up gives n = 382.
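The calculation in Example 1 can be checked with a short Python sketch of the single-proportion formula n = z² p (1 − p) / w². The function name is hypothetical, z = 1.96 is the 95% value used in the example, and rounding up to a whole subject is assumed.

```python
import math


def sample_size_proportion(p, w, z=1.96):
    """Sample size for a single population proportion: n = z^2 * p * (1 - p) / w^2."""
    return math.ceil((z ** 2) * p * (1 - p) / (w ** 2))


# Example 1: previous prevalence 45.7%, margin of error 5%, 95% confidence
print(sample_size_proportion(p=0.457, w=0.05))  # 382
```

When no previous estimate of p is available, p = 0.5 maximizes p (1 − p) and so gives the largest sample size from this formula.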
Some Considerations
Sample size for case control study
Sample size in cohort study
Incorrect sample size will lead to
o Wrong conclusions
o Waste of resources
o Loss of money
o Ethical problems
o Delay in completion
Any questions?
Thank you