Sampling Issues in Research
Epidemiologic Research
Sampling
1. Introduction and concepts in sampling
- Sampling is an important issue in research and in day-to-day life.
Census vs sample
• In a census, every animal in the population is evaluated.
• In a sample, data are only collected from a sub-set of the
population.
• Taking measurements or collecting data on a sample of the
population is more convenient than collecting data on the entire
population.
• In a census, the only source of error is the measurement itself.
• In a sample, error can arise from both measurement and sampling.
• A census requires more time and more resources, and the observations may be less reliable.
Sampling unit: the smallest division of the population that can be selected
for our sample, e.g. a house, a calf or a herd.
Parameters vs statistics
Descriptive versus analytic studies
Hierarchy of populations
The external population is the population to which it might be
possible to extrapolate results from a study.
Types of error
• In a study based on a sample of observations, the variability of the
outcome being measured, measurement error, and sample-to-sample
variability all affect the results we obtain.
• Hence, when we make inferences based on the sample data, they are
subject to error.
• In analytical studies, two types of error can occur:
• Type I (α) error: You conclude that the outcomes are different when in
fact they are not.
• Type II (β) error: You conclude that the outcomes are not different when
in fact they are.
The α level is used as the significance level.
1 − β measures the power of the test.
• Statistical test results reported in the medical literature are aimed at disproving the
null hypothesis (i.e. that there is no difference among groups).
• If differences are found, they are reported with a P-value, which expresses the
probability of observing differences at least this large by chance alone if the null
hypothesis were true, that is, in the absence of any effect of the factor being evaluated.
Accuracy and precision
1. Population representativeness
2. Access required
Sampling methods / types of sampling
Probability sampling
Non-random sampling
A type of sample that is not produced by random selection.
1. Judgement sample
2. Convenience sample
• A convenience sample is chosen because it is easy to obtain.
• For instance, nearby herds, herds with good handling facilities,
herds with records that are easily accessible, volunteer herds, etc.
might be selected for study.
3. Purposive sample
• The selection of this type of sample is based on the elements
possessing one or more attributes such as known exposure to a risk
factor or a specific disease status.
• This approach is often used in observational analytic studies.
Random sampling/probability sampling
It is one in which every element in the population has a known, non-zero
probability of being included in the sample.
However, random does not mean haphazard: the selection must follow a formal
random procedure, and this can be done in several different ways.
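Simple random sampling is the most basic of these ways. A minimal sketch in Python, assuming a hypothetical sampling frame of 100 animal IDs (the function name, the frame and the seed are illustrative only; the seed is fixed just to make the draw repeatable):

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Draw n units from the sampling frame without replacement.

    Every unit has the same known, non-zero probability n / len(frame)
    of being included in the sample.
    """
    rng = random.Random(seed)
    return rng.sample(list(frame), n)

# Hypothetical sampling frame: animal IDs 1..100
frame = range(1, 101)
sample = simple_random_sample(frame, 10, seed=42)

# Inclusion probability of each animal under this design
inclusion_probability = 10 / len(frame)
```

The key requirement is a complete sampling frame (a list of every unit in the population) from which the draw is made.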
2. Systematic random sampling
• In a systematic random sample, a
3. Stratified random sample
• Prior to sampling, the population is divided into mutually exclusive
strata based on factors likely to affect the outcome.
• Then, within each stratum, a simple or systematic random sample is
chosen.
• The simplest form of stratified random sampling is called proportional
(the number sampled within each stratum is proportional to the total
number in the stratum).
There are three advantages of stratified random sampling.
1. It ensures that all strata are represented in the sample.
2. The precision of overall estimates might be greater than those derived
from a simple random sample.
3. It produces estimates of stratum-specific outcomes, although the
precision of these estimates will be lower than the precision of the
overall estimate.
For example, you would make up two lists, one of cats and one of dogs, and
sample from each list.
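The cats-and-dogs example can be sketched in Python as a proportional stratified sample. The population sizes (300 cats, 700 dogs), the total sample size and the function name are assumptions made up for illustration:

```python
import random

def proportional_stratified_sample(strata, total_n, seed=None):
    """Draw a simple random sample within each stratum.

    strata: dict mapping stratum name -> list of units in that stratum.
    The number drawn from each stratum is proportional to the stratum
    size (rounded, so the total may differ slightly from total_n).
    """
    rng = random.Random(seed)
    population_size = sum(len(units) for units in strata.values())
    sample = {}
    for name, units in strata.items():
        n_stratum = round(total_n * len(units) / population_size)
        sample[name] = rng.sample(units, n_stratum)
    return sample

# Hypothetical population: two lists, one of cats and one of dogs
strata = {
    "cats": [f"cat{i}" for i in range(300)],
    "dogs": [f"dog{i}" for i in range(700)],
}
sample = proportional_stratified_sample(strata, total_n=50, seed=1)
```

With these numbers the sample contains 15 cats and 35 dogs, so both strata are represented in proportion to their size.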
• Cluster sampling is a probability sampling technique in which
sampling is applied at an aggregated level (a group) of
individual units.
• Typically, the individual remains the unit of interest (for
example, its disease status), but
• the sampling unit becomes a grouping of individual animals, such
as the herd they belong to.
• All elements within each randomly selected group are then
included in the sample.
• Therefore, this technique requires a sampling frame only for
the groups, not for the members within the groups.
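One-stage cluster sampling as described above can be sketched as follows. The herd structure (20 herds of 5 animals) and the function name are hypothetical:

```python
import random

def one_stage_cluster_sample(herds, n_herds, seed=None):
    """Randomly select herds, then include every animal in each selected herd.

    herds: dict mapping herd ID -> list of animals in that herd.
    The random selection only uses the list of herd IDs, i.e. the
    sampling frame for the groups.
    """
    rng = random.Random(seed)
    selected = rng.sample(sorted(herds), n_herds)
    animals = [animal for herd in selected for animal in herds[herd]]
    return selected, animals

# Hypothetical population: 20 herds of 5 animals each
herds = {
    f"herd{h}": [f"herd{h}-animal{a}" for a in range(5)]
    for h in range(20)
}
selected, animals = one_stage_cluster_sample(herds, n_herds=4, seed=7)
```

Here four herds are drawn at random and all 20 animals in those herds enter the sample.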
4. Multistage sampling
Involves several levels of random sampling (one at each stage). For
example, within each herd selected at national level, a further random
selection process is used to determine which animals are to be studied
(two-stage sampling).
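A minimal two-stage sketch, assuming a hypothetical population of 20 herds with 30 animals each (function name and sizes are illustrative only):

```python
import random

def two_stage_sample(herds, n_herds, n_per_herd, seed=None):
    """Two-stage sampling: random herds first, then random animals within them.

    herds: dict mapping herd ID -> list of animals in that herd.
    Returns a dict mapping each selected herd -> its sampled animals.
    """
    rng = random.Random(seed)
    # Stage 1: random selection of herds (primary sampling units)
    selected = rng.sample(sorted(herds), n_herds)
    # Stage 2: random selection of animals within each selected herd
    return {
        herd: rng.sample(herds[herd], min(n_per_herd, len(herds[herd])))
        for herd in selected
    }

# Hypothetical population: 20 herds of 30 animals each
herds = {
    f"herd{h}": [f"herd{h}-animal{a}" for a in range(30)]
    for h in range(20)
}
sample = two_stage_sample(herds, n_herds=5, n_per_herd=6, seed=3)
```

Unlike one-stage cluster sampling, a sampling frame of animals is needed here, but only for the herds selected at the first stage.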
Summary and comparison of different sampling methods
Importance of sample size calculation (objectives)
• Usually a researcher would like to show a statistically significant
difference, but the difference should also be meaningful.
Sample Size determination
• A decision as to the required sample size has to be based on
different calculations depending on whether estimates for a
categorical or a continuous variable are to be calculated.
• Other factors to consider include the size of the smallest subgroup and the
actual variability of the variable of interest in the population.
Definitions and some concepts
• Null hypothesis (H0): No difference between groups
H0: p1 = p2 (proportions) or H0: μ1 = μ2 (means)
• P-value
– Test of significance of H0
– Based on the distribution of a test statistic, assuming H0 is true
– It is NOT the probability that H0 is true
• Type I error: Rejecting H0 when H0 is true
where
n = required sample size
pexp = expected prevalence
d = desired absolute precision
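The formula that these symbols belong to is not reproduced in these notes. As a sketch, assuming it is the usual formula for estimating a prevalence in a large population with 95% confidence, n = 1.96² × pexp × (1 − pexp) / d², the calculation could look like this (the function name and the example values are illustrative):

```python
import math

def sample_size_prevalence(p_exp, d, z=1.96):
    """Sample size to estimate a prevalence in a large population.

    Assumed formula: n = z^2 * p_exp * (1 - p_exp) / d^2,
    with z = 1.96 for 95% confidence. The result is rounded up
    to a whole number of animals.
    """
    if not 0 < p_exp < 1:
        raise ValueError("p_exp must be between 0 and 1")
    if d <= 0:
        raise ValueError("d must be positive")
    return math.ceil(z ** 2 * p_exp * (1 - p_exp) / d ** 2)

# Illustrative values only: expected prevalence 30%, precision +/- 5%
n = sample_size_prevalence(p_exp=0.30, d=0.05)

# With no prior idea of the prevalence, 50% gives the largest sample size
n_worst_case = sample_size_prevalence(p_exp=0.50, d=0.05)
```

Under these assumptions the two examples give 323 and 385 animals respectively.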
If the population size N is small relative to n, the sample size
obtained needs to be adjusted as follows:
where
N = population size
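The adjustment formula itself is missing from these notes. A commonly used form, assumed here rather than taken from the slides, is n_adj = (N × n) / (N + n). A minimal sketch with hypothetical numbers:

```python
import math

def adjust_for_finite_population(n, N):
    """Adjust a sample size n calculated for a large population.

    Assumed formula: n_adj = (N * n) / (N + n), where N is the
    population size. The result is rounded up to a whole animal.
    """
    if n <= 0 or N <= 0:
        raise ValueError("n and N must be positive")
    return math.ceil(N * n / (N + n))

# Hypothetical example: n = 385 was calculated, but only 1,000 animals exist
n_adj = adjust_for_finite_population(n=385, N=1000)

# For a very large population the adjustment makes no practical difference
n_large = adjust_for_finite_population(n=385, N=1_000_000)
```

With these values the adjusted sample size drops from 385 to 278, while for a population of one million it stays at 385.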
Table: the approximate sample size required to estimate
prevalence in a large population with the desired fixed-width
confidence limits. (Modified from Cannon and Roe, 1982.)
One-stage cluster sampling
• The appropriate formula for a 95% confidence interval is then:
$$ g = \frac{1.96^2 \{ n V_c + P_{exp} (1 - P_{exp}) \}}{n d^2} $$
where:
g = number of clusters to be sampled;
n = predicted average number of animals per cluster;
Pexp = expected prevalence;
d = desired absolute precision;
Vc = between-cluster variance.
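A short sketch of this calculation, reading the formula as g = 1.96² {n Vc + Pexp (1 − Pexp)} / (n d²). The function name and the input values are hypothetical and chosen only for illustration:

```python
import math

def clusters_one_stage(n, p_exp, d, v_c, z=1.96):
    """Number of clusters g for one-stage cluster sampling.

    g = z^2 * (n * Vc + Pexp * (1 - Pexp)) / (n * d^2)
    n:     predicted average number of animals per cluster
    p_exp: expected prevalence
    d:     desired absolute precision
    v_c:   between-cluster variance
    """
    g = z ** 2 * (n * v_c + p_exp * (1 - p_exp)) / (n * d ** 2)
    return math.ceil(g)

# Illustrative values: 20 animals per cluster, Pexp = 20%, d = 5%, Vc = 0.01
g = clusters_one_stage(n=20, p_exp=0.20, d=0.05, v_c=0.01)
```

With these inputs about 28 clusters would be needed. Setting Vc to zero reduces the total number of animals (g × n) to that of a simple random sample.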
Two-stage cluster sampling
(1) When the total sample size is fixed, and the number of clusters to
be sampled is required:
$$ g = \frac{1.96^2 \, T_s V_c}{d^2 T_s - 1.96^2 P_{exp} (1 - P_{exp})} $$
where
g = number of clusters to be sampled;
Pexp = expected prevalence;
d = desired absolute precision;
Ts = total number of animals to be sampled;
Vc = between-cluster variance
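A sketch of case (1), reading the formula as g = 1.96² Ts Vc / (d² Ts − 1.96² Pexp (1 − Pexp)). The function name and the values are hypothetical; the check on the denominator is an added safeguard, not part of the formula:

```python
import math

def clusters_two_stage(t_s, p_exp, d, v_c, z=1.96):
    """Number of clusters g when the total sample size Ts is fixed.

    g = z^2 * Ts * Vc / (d^2 * Ts - z^2 * Pexp * (1 - Pexp))
    """
    denominator = d ** 2 * t_s - z ** 2 * p_exp * (1 - p_exp)
    if denominator <= 0:
        # Ts is too small to reach precision d, however many clusters are used
        raise ValueError("total sample size too small for this precision")
    return math.ceil(z ** 2 * t_s * v_c / denominator)

# Illustrative values: Ts = 1,000 animals, Pexp = 20%, d = 5%, Vc = 0.01
g = clusters_two_stage(t_s=1000, p_exp=0.20, d=0.05, v_c=0.01)
```

With these inputs the 1,000 animals would need to be spread over about 21 clusters.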
(2) Number of animals to be sampled when the number of clusters is
fixed:
$$ T_s = \frac{1.96^2 \, g P_{exp} (1 - P_{exp})}{g d^2 - 1.96^2 V_c} $$
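A sketch of case (2), reading the formula as Ts = 1.96² g Pexp (1 − Pexp) / (g d² − 1.96² Vc). The function name and the values are hypothetical; the check on the denominator is an added safeguard:

```python
import math

def animals_two_stage(g, p_exp, d, v_c, z=1.96):
    """Total number of animals Ts when the number of clusters g is fixed.

    Ts = z^2 * g * Pexp * (1 - Pexp) / (g * d^2 - z^2 * Vc)
    """
    denominator = g * d ** 2 - z ** 2 * v_c
    if denominator <= 0:
        # Too few clusters: no number of animals can reach precision d
        raise ValueError("number of clusters too small for this precision")
    return math.ceil(z ** 2 * g * p_exp * (1 - p_exp) / denominator)

# Illustrative values: g = 21 clusters, Pexp = 20%, d = 5%, Vc = 0.01
t_s = animals_two_stage(g=21, p_exp=0.20, d=0.05, v_c=0.01)
```

With these inputs about 917 animals in total would be required, roughly 44 per cluster.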
If you are sampling from a finite population (e.g. <1,000 animals), then the
formula to determine the required sample size is:
where
N = population size
• If you are sampling from an infinite population, then the following
approximate formula can be used:
where
n = the required sample size,
α is usually set to 0.05 or 0.01,
q = (1 − minimum expected prevalence).
• If you take the required sample and get no positive results (assuming that
you set α to 0.05), then you can say that you are 95% confident that the
prevalence of the disease in the population is below the minimum
threshold that you specified for the disease in question.
• Thus, you accept this as sufficient evidence of the absence of the disease.
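The approximate formula is not shown in these notes. Given the definitions of α and q, it is assumed here to be n = ln(α) / ln(q), which follows from requiring that the probability of drawing n negative animals in a row, q^n, is no larger than α. A minimal sketch (the function name is illustrative):

```python
import math

def sample_size_to_detect(min_prevalence, alpha=0.05):
    """Sample size to detect disease in a very large (infinite) population.

    Assumed formula: n = ln(alpha) / ln(q), with
    q = 1 - minimum expected prevalence. Rounded up to a whole animal.
    """
    if not 0 < min_prevalence < 1:
        raise ValueError("min_prevalence must be between 0 and 1")
    q = 1 - min_prevalence
    return math.ceil(math.log(alpha) / math.log(q))

# 95% confidence of detecting disease if prevalence is at least 5%
n_5_percent = sample_size_to_detect(min_prevalence=0.05, alpha=0.05)

# 95% confidence of detecting disease if prevalence is at least 1%
n_1_percent = sample_size_to_detect(min_prevalence=0.01, alpha=0.05)
```

Under this assumption, 59 animals are needed for a 5% threshold and 299 for a 1% threshold: the lower the prevalence to be ruled out, the larger the sample.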
The sample size required for rate studies
• If we wish not only to detect disease, but also wish to estimate its prevalence,
then a somewhat more complex calculation is used to estimate sample size.
• As you might expect, the sample size is larger than that needed to detect only the
presence of disease.
• While it is useful to understand the principles behind these calculations,
the required sample sizes can be obtained much more quickly from
tables or specialised epidemiological computer software such as
EpiInfo or EpiScope.
Sample size: Comparing proportions or means
n = sample size per group to compare two proportions or two
means
Sample size for comparing proportions
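The formula for this heading is not preserved in these notes. As a sketch only, assuming the widely used normal-approximation formula for two equal-sized groups and a two-sided test, n = [Zα √(2pq) + Zβ √(p1q1 + p2q2)]² / (p1 − p2)², where p = (p1 + p2)/2 and q = 1 − p. The function name and the example proportions are hypothetical:

```python
import math
from statistics import NormalDist

def sample_size_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Sample size per group to compare two proportions.

    Assumed formula (normal approximation, equal group sizes,
    two-sided alpha):
    n = (z_alpha * sqrt(2*p*q) + z_beta * sqrt(p1*q1 + p2*q2))^2 / (p1 - p2)^2
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # e.g. 0.84 for power = 0.80
    p_bar = (p1 + p2) / 2
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return math.ceil((term_null + term_alt) ** 2 / (p1 - p2) ** 2)

# Illustrative values: detect 20% vs 10% with alpha = 0.05 and power = 80%
n_per_group = sample_size_two_proportions(p1=0.20, p2=0.10)
```

With these inputs roughly 199 animals per group are needed; a smaller difference between p1 and p2, or a higher power, increases n.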