Chapter 1 - Sampling and Experimental Design

This document provides an overview of key concepts in sampling and experimental design covered in sections 1.3-1.5 of Chapter 1, including: 1) Biased and random sampling methods, with random sampling being preferred since it avoids systematic differences between the sample and population. 2) Observational studies versus experiments, with experiments being preferred for determining causation since treatments are randomly assigned. 3) The difference between prospective and retrospective observational studies, with prospective preferred due to less potential for confounding variables and bias. 4) The definition of a confounding variable as one that is related to both the explanatory and response variables, obscuring the effect of the explanatory variable.

Uploaded by

Yassine Belhaje
Copyright
© All Rights Reserved

Chapter 1 - Sampling and Experimental Design

Read sections 1.3 - 1.5

Sampling
(1.3.3 and 1.4.2)

Sampling Plans: methods of selecting individuals from a population. We are interested in sampling
plans such that results from the sample can be used to make conclusions about the population.
Biased Samples: Bias occurs when the sample tends to differ from the population in a systematic
way. When this happens, results from the sample cannot be used to make conclusions about the
population of interest.

1. Convenience Sample - An “easily available” sample of individuals which was convenient for
the researcher to collect. This is a BAD sampling plan since the individuals in the convenience
sample may systematically differ from the population and therefore may not represent the entire
population.

2. Voluntary Response Sample - A sample of individuals who volunteer to participate. This
is a BAD sampling plan since the individuals who volunteer may systematically differ from the
population and therefore may not represent the entire population.

Types of Bias in Sampling:

• Selection Bias - The sampling plan excludes some part of the population from the selection
process. Those excluded from the selection process systematically differ from those included.
EXAMPLES:
– Phone surveys exclude (1) households without a phone, (2) prisoners, and (3) homeless
people.
– Call-in polls on TV exclude (1) individuals without a TV, (2) individuals not watching the
program, and (3) individuals who do not care to participate

• Measurement/Response Bias - The method of observation tends to produce measurements
that differ from the true value of the response.
EXAMPLES:
– uncalibrated scale
– untrained or ill-trained technician
– wording of a survey or interviewer influence

• Non-response Bias - Data is not obtained from all individuals in the sample. This bias occurs
when those who respond systematically differ from those who do not respond.
EXAMPLES:
– Telephone and mail surveys

IMPORTANT POINTS to remember:

• A biased sample is a biased sample, regardless of its size! Collecting more data in a biased fashion
will not correct the problem.

• A biased sample still contains information about a population, but this population is not the one
that a researcher is interested in! Information can still be gleaned from biased samples, but one
must be wary of the interpretation.
EXAMPLES:
– Drug trials using human volunteers
– Studies on animals which have been specifically bred for experiments

QUESTION: What type of sample and bias?


A researcher is interested in the opinions of MSU students about updating gym
equipment. A surveyor stands at the gym entrance door and uses the next 50 people
who enter as a sample and asks each their opinion about updating gym equipment.

Random Sampling: A sample of individuals who have been chosen randomly from the population.
Random samples tend to represent the population from which they are chosen since randomization
does not systematically favor some individuals in the population over others.

Because random samples tend to be representative of the population of interest, inference is valid. In other
words, results from a random sample can be generalized to make conclusions about the population.

Random Sampling can be done:

• With replacement, which means that after an individual is selected to be in the sample, that
individual can potentially be selected into the sample again. This method needs to be used when
the sample size n is more than 5% of the population size N, that is, when 20n > N.

• Without replacement, which is much more commonly used, means that once an individual is
selected to be in the sample, that individual may not be selected again. Therefore, the sample
consists of n distinct individuals. This method is used when the population size is infinite, or
when the population size is N and the sample size n is no more than 5% of the population size,
that is, when 20n ≤ N.
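The two schemes and the 5% rule of thumb can be sketched in Python. This is only an illustration: the helper name exceeds_five_percent and the population of 1000 numbered individuals are made up for the example. The standard-library call random.sample draws without replacement, and random.choices draws with replacement.

```python
import random

def exceeds_five_percent(n, N):
    """Return True when the sample size n is more than 5% of the population size N (20n > N)."""
    return 20 * n > N

# Hypothetical population: individuals numbered 1..1000.
population = list(range(1, 1001))
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Without replacement: the sample consists of n distinct individuals.
without_replacement = rng.sample(population, k=40)

# With replacement: the same individual may be selected more than once.
with_replacement = rng.choices(population, k=40)

print(exceeds_five_percent(40, 1000))  # 20 * 40 = 800, which is not greater than 1000
print(len(set(without_replacement)) == len(without_replacement))
```

Checking the rule before choosing a scheme keeps the decision explicit instead of leaving it implicit in whichever sampling call is used.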

QUESTION: With or without Replacement?

1. Randomly sampling a 5 card hand from a standard deck of 52 playing cards:


2. Randomly sampling 2000 Montanans:

Types of Random Samples:

1. Simple Random Sample (SRS) - Each possible sample of size n has an equal chance of being
selected from the population.

• How to Select a SRS:


– Put slips of paper in a hat, mix well, then choose n slips.
– Use a computer:
(a) Create a sampling frame, a numbered list of all individuals in the population.
(b) Use a random number generator to select individuals from the list.
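Steps (a) and (b) might look like the following in Python. The sampling frame of 200 individuals and the function name simple_random_sample are hypothetical; the sketch relies on random.sample, which draws without replacement so that every subset of size n is equally likely.

```python
import random

# (a) Hypothetical sampling frame: a numbered list of all individuals in the population.
sampling_frame = {number: f"individual_{number}" for number in range(1, 201)}

def simple_random_sample(frame, n, seed=None):
    """(b) Use a random number generator to pick n distinct numbers from the frame."""
    rng = random.Random(seed)
    chosen_numbers = rng.sample(sorted(frame), k=n)
    return [frame[number] for number in chosen_numbers]

srs = simple_random_sample(sampling_frame, n=10, seed=1)
print(srs)
```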

2. Stratified Random Sample - Separate the population into non-overlapping homogeneous
groups, called strata. Take a SRS from each stratum, then combine the SRSs to form the stratified
random sample.

• Stratifying is beneficial if the population consists of strata that differ with regard to the
variable of interest.
• Usually, n_i, the size of the SRS from each stratum, is proportional to N_i, the size of that stratum
within the population.
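A minimal sketch of proportional allocation, assuming three made-up strata of sizes 600, 300, and 100. The function name stratified_sample and the stratum labels are illustrative only. Each n_i is computed as n times N_i / N and rounded, so with less convenient sizes the total may differ slightly from n.

```python
import random

def stratified_sample(strata, n, seed=None):
    """Take an SRS from each stratum, with n_i roughly proportional to N_i."""
    rng = random.Random(seed)
    N = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        n_i = round(n * len(members) / N)  # proportional allocation (rounded)
        sample.extend(rng.sample(members, k=n_i))
    return sample

# Hypothetical strata; in practice these come from the population itself.
strata = {
    "stratum_A": [f"A{i}" for i in range(600)],
    "stratum_B": [f"B{i}" for i in range(300)],
    "stratum_C": [f"C{i}" for i in range(100)],
}
sample = stratified_sample(strata, n=50, seed=2)
print(len(sample))
```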

QUESTION:
Give an example of a study for which stratifying would be necessary.

3. Cluster Sample - If the population naturally consists of non-overlapping groups, called clusters,
where each cluster is heterogeneous (i.e. it represents and reflects the variability in the population)
then a SRS of clusters can be drawn. All individuals in the selected clusters form the cluster
sample.
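Cluster sampling can be sketched the same way: the random draw is made over cluster labels, and every individual in a chosen cluster enters the sample. The eight clusters of five members each and the function name cluster_sample are assumptions made for illustration.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Draw an SRS of clusters, then keep all individuals in the selected clusters."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), k=num_clusters)
    sample = [member for name in chosen for member in clusters[name]]
    return chosen, sample

# Hypothetical population that naturally falls into 8 non-overlapping clusters.
clusters = {
    f"cluster_{c}": [f"c{c}_member{m}" for m in range(5)]
    for c in range(8)
}
chosen, sample = cluster_sample(clusters, num_clusters=2, seed=3)
print(chosen, len(sample))
```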

QUESTION:
(a) There are about 20 sections of STAT 216 offered at MSU each semester. How
would you use cluster sampling to choose a sample from all STAT 216 students?

(b) What are the two main differences between a stratified random sample and a
cluster sample?

4. Systematic Sample - Select every kth individual from the numbered population list. This works
well only if:
• the variable of interest is not related to the order of the list or
• the variable of interest is related to the list’s order, but not in a cyclic manner.
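A short sketch of systematic selection, assuming a numbered list of 100 individuals and k = 10. Starting at a random position among the first k entries is a common convention that these notes do not spell out, so treat it (and the name systematic_sample) as an assumption.

```python
import random

def systematic_sample(numbered_list, k, seed=None):
    """Select every kth individual, starting from a random position among the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed random start: an index from 0 to k-1
    return numbered_list[start::k]

# Hypothetical numbered population list.
population = list(range(1, 101))
sample = systematic_sample(population, k=10, seed=4)
print(sample)
```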

QUESTION: Would Systematic Sampling Work Well?


(a) A Phonebook:

(b) Husband/Wife Listing:


1. husband 2. wife 3. husband 4. wife etc.

Observation and Experimentation
(1.3.5 and 1.4.1)

Observational Study: A study which observes individuals and measures variables, but does not
attempt to influence the responses.

• An observational study on individuals from a random sample allows one to generalize conclusions
about the sample to the population.

• An observational study cannot show cause-and-effect relationships because there is the
possibility that the response is affected by some variable(s) other than the ones being measured.
That is, confounding variables may be present. “It ain’t what you don’t know that gets you
into trouble. It’s what you know for sure that just ain’t so.” - Mark Twain

• In prospective observational studies, investigators choose a sample and collect new data
generated from that sample. That is, the investigators “look forward in time.”

• In retrospective observational studies, investigators “look backwards in time” and use data
that have already been collected. Retrospective studies are often criticized for having more
confounding and bias compared to prospective studies.

QUESTION: Prospective or Retrospective Observational Study?


1. A study that follows marijuana users in Colorado for 5 years.
2. A study of illegal immigrant activity last year in Arizona.

Experiment: A study in which treatment(s) are deliberately imposed on individuals in order to
observe their response.

• An experiment in which the treatments are randomly assigned to individuals can provide evidence
for a cause-and-effect relationship. Furthermore, if the individuals are from a random sample,
then one can generalize conclusions from the experiment to the population.

To recognize the difference between an Observational Study and an Experiment, ask yourself, “Was
there a treatment imposed on the individuals?” In an experiment, the researcher determines (randomly)
which individuals receive which treatment. In an observational study, the individuals already belong to
their groups before the study begins; the researcher does not assign them.

QUESTION: Observational Study or Experiment?


1. A study of the birth weight of babies and the mother’s level of coffee consumption.
2. A study of lab mice whose spinal cords have been severed.
3. A study of gender versus salary.
4. A study of grizzly bear attacks.
5. A study of the number of 1’s rolled on a weighted die.

Confounding Variable: A variable that is related to the response variable and to the explanatory
variable in such a way that makes it impossible to distinguish the effects of the confounding variable
on the response from the effects of the explanatory variable on the response.
EXAMPLES:
• In a study of gender differences in salary, it was found that female nurses (in a certain hospital)
have higher salaries, on average, than do male nurses. It also was found that female nurses
have a greater number of years of experience than do male nurses. Years of experience is a
confounding variable. It may be that the data give no clue as to whether the salary difference is
due to gender discrimination or due to years of experience.

• In a study investigating the association between the occurrence of low birth weight babies and
the mother’s level of coffee consumption, it was found that an increase in the mother’s coffee
consumption is associated with an increase in the risk of having a low birth weight baby. It also
was found that moms who smoke also consume large amounts of coffee and moms who do not
smoke consume no or small amounts of coffee. Smoking is a confounding variable. Are the low
birth weights due to the smoking or the coffee? CAN’T TELL!
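To see how a confounding variable can produce an apparent effect, here is a small Python sketch built on entirely made-up records in the spirit of the salary example. In this constructed data, salary depends only on experience, yet the overall averages differ by gender because the women have more experience. Real data are rarely this clean, and when the confounder is not recorded no such breakdown is possible.

```python
from collections import defaultdict
from statistics import mean

# Made-up records: (gender, experience level, salary in thousands). Not real data.
records = [
    ("F", "high", 80), ("F", "high", 80), ("F", "high", 80), ("F", "low", 60),
    ("M", "high", 80), ("M", "low", 60), ("M", "low", 60), ("M", "low", 60),
]

def mean_salary_by(records, key):
    """Average salary within each group defined by key(gender, experience)."""
    groups = defaultdict(list)
    for gender, experience, salary in records:
        groups[key(gender, experience)].append(salary)
    return {group: mean(values) for group, values in groups.items()}

overall = mean_salary_by(records, lambda gender, experience: gender)
within = mean_salary_by(records, lambda gender, experience: (gender, experience))

print(overall)  # averages by gender alone differ
print(within)   # averages by gender within each experience level agree
```

Comparing groups within levels of the suspected confounder is one way to examine it when it has been measured; it does not by itself establish cause and effect.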

Principles of Experimental Design
(1.5)

Experimental Designs: methods of assigning treatments to individuals (units or cases)

Unit: an individual in an experiment
Subject: a human experimental unit
Factor: a categorical explanatory variable
Treatment: a combination of levels of factors
Extraneous Factor: a factor that is not of primary interest and yet affects the response variable. An
extraneous factor is called a confounding variable if its effect on the response cannot be distinguished
from the effect of another factor on the response.
The goal of an experiment is to determine the effects of factor(s) on the response while taking into
account extraneous factors that also affect the response.
Control Group: a group that receives no treatment (or a placebo). The response of the treatment
group is compared to the response of the control group to determine effectiveness of the treatment.
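Since a treatment is a combination of levels of factors, the full set of treatments can be enumerated mechanically. The sketch below assumes two hypothetical factors (temperature with two levels, fertilizer with three) and uses itertools.product to list every combination.

```python
from itertools import product

# Hypothetical factors and their levels (illustrative names only).
factors = {
    "temperature": ["low", "high"],
    "fertilizer": ["none", "standard", "double"],
}

# Each treatment is one combination of a level from every factor.
treatments = [
    dict(zip(factors, levels))
    for levels in product(*factors.values())
]

for treatment in treatments:
    print(treatment)
```

With 2 levels of one factor and 3 of the other, there are 2 x 3 = 6 treatments.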

Placebo: a treatment that has no active ingredients (a fake treatment). A placebo is supposed
to resemble the real treatment in appearance, taste, and feel so that subjects believe they
are receiving the true treatment. Use of a placebo mitigates “the power of suggestion.” That is, a
treatment, when thought to be beneficial, tends to positively affect responses (and a treatment
thought to be non-beneficial tends to negatively affect responses).

Single-blind: the subjects do not know which treatment they received. A single-blind experiment
keeps the subjects’ unconscious expectations from favoring one treatment over another.
EXAMPLE:
Give an example of an experiment which cannot be made single-blind.

Double-blind: neither the subject nor the person recording the response knows which treatment was
received. A double-blind experiment keeps the unconscious expectations of both the subjects and the
recorder from favoring one treatment over another.

Four Basic Principles of Experimental Design:

1. Direct Control - Holding extraneous factors constant for all units so that the effects of the
extraneous factors are not confounded with the factors of interest.

2. Random Assignment - Treatments are randomly assigned to units in order to create similar
experimental groups. In other words, the values of the extraneous variables will be similar, on
average, for each experimental group.

3. Replication - The experiment is replicated on many units for each treatment group to reduce
the role of random variation due to uncontrolled and “unblocked” extraneous variables.

• If there were only one unit in each of two treatment groups, then it could happen that these
two units are quite different. But if we randomly assign several more units to each group,
then any differences will tend to get “evened out”.

4. Blocking - Units are classified into subgroups or blocks so that the extraneous factors are held
constant for all units within a given block. Treatments are randomly assigned to units within
each block.
• “Block what you can, randomize what you cannot.” - George Box
• Blocks and strata are different. Blocking refers to classifying experimental units into blocks
whereas stratification refers to classifying individuals of a population into strata.
• The samples from the strata in a stratified random sample can be the blocks in an
experiment.
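Random assignment with replication can be sketched as shuffling the units and dealing them out to the treatment groups in turn. The 12 units, the two group labels, and the function name randomly_assign are hypothetical.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them to treatment groups in rotation."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for position, unit in enumerate(shuffled):
        groups[treatments[position % len(treatments)]].append(unit)
    return groups

# Hypothetical experimental units; several units per group provides replication.
units = [f"unit_{i}" for i in range(12)]
groups = randomly_assign(units, ["treatment", "control"], seed=5)
print({name: len(members) for name, members in groups.items()})
```

Dealing in rotation after a shuffle keeps group sizes as equal as possible while leaving membership to chance.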

Two Basic Experimental Designs:

1. Completely Randomized Design (CRD): Experimental units are randomly assigned to each
treatment (using principles 1-3).

EXAMPLE:
Consider an experiment to compare the effect of using two different types of neurons (olfactory and
motor) and two different antibiotics (amoxicillin and tetracycline) to repair severed spinal cords
in laboratory mice.

What are the experimental units?

What is the response?

Give one factor in the experiment.

Give a second factor.

What are the treatments?

What are some potential extraneous factors?

How can direct control be used so that these extraneous factors don’t obscure the effects of the
treatments on the response?

2. Randomized Block Design (RBD): Units are classified into blocks that are similar with
respect to extraneous variable(s), then units are randomly assigned to treatments, independently within
each block (using principles 1-4).
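A minimal sketch of the assignment step in a randomized block design, assuming two made-up blocks of four units each and two treatments labeled T1 and T2. The function name randomized_block_assignment is illustrative; the randomization is carried out separately inside each block.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Within each block, shuffle the units and deal treatments in rotation."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # a separate shuffle for every block
        for position, unit in enumerate(shuffled):
            treatment = treatments[position % len(treatments)]
            assignment[unit] = (block_name, treatment)
    return assignment

# Hypothetical blocks: units within a block are similar on the extraneous variable.
blocks = {
    "block_1": ["u1", "u2", "u3", "u4"],
    "block_2": ["u5", "u6", "u7", "u8"],
}
assignment = randomized_block_assignment(blocks, ["T1", "T2"], seed=6)
for unit, (block_name, treatment) in sorted(assignment.items()):
    print(unit, block_name, treatment)
```

Because every block receives each treatment equally often, differences between blocks cannot be mistaken for differences between treatments.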

EXAMPLE:
Consider an experiment to determine the effect of different wheat strains (called A, B, and C) on crop
yield (in bushels/acre).
What are the experimental units?

What is the response?

Give the factor in the experiment.

What are the treatments?

What are some potential extraneous factors?

How can direct control and blocking be used to account for these extraneous factors so they don’t
obscure the effects of the treatments on the response?

Exercises
Sampling, p. 58: 1.9 - 1.15 odd
Observational studies, p. 60: 1.17 - 1.29 odd
Experiments, p. 63: 1.31 - 1.37 odd
