An Introduction To Experimental Design
Alex Sanchez-Pla
Departamento de Genética, Microbiología y Estadística (UB)
4) Some resources
2 / 48
Experiments: What, Why, How
3 / 48
Types of studies in medical research
DOI: https://doi.org/10.3238/arztebl.2009.0262
4 / 48
Experimental Studies
Investigate changes in the response variable.
Investigators play an active role by (randomly) assigning each individual to a group.
5 / 48
Observational Studies
Investigate changes in the response variable.
Assignment of subjects to the groups is outside the control of the investigator,
who can only observe the response variable in each group.
6 / 48
Why experiment?
Experimental studies are a very common type of research study.
The purposes of experimental studies are diverse. For example, we could propose
experiments to:
7 / 48
An example and basic ideas
8 / 48
Some definitions
An experiment is any investigation in which a particular set of conditions is applied
and the results are observed and evaluated. For example, a study
of a drug in pre-diabetic mice.
Each group of experimental conditions is a treatment or factor. For example "Drug" and
"Age" are two factors.
Each particular condition within a factor is a level of that factor. For example,
Drug/Placebo and Old/Young are the two levels of the factors "Drug" and "Age", respectively.
The results that we observe after applying a treatment are the responses.
9 / 48
Some important definitions:
Experimental Unit (EU): the physical entity or subject exposed to the treatment
independently of other units.
Observational Unit (OU): the unit on which the observations, that is, the measurements,
are made.
10 / 48
Types of variability
Random variability
Differences expected to be observed
when different subjects from the same
sample are measured.
It is almost always present to a greater
or lesser degree.
Systematic variability
3. The way treatment levels are assigned to the experimental units, that is, the
experimental design
12 / 48
What characterizes a good experimental
design?
It avoids biases or systematic errors
It allows a precise estimation of the response, which implies that the random error is
as low as possible.
It has wide validity: the experimental units are a sample of the population in question,
so it is possible to extrapolate the conclusions of the sample to the population.
13 / 48
How to get a good design
Try to apply some ideas, basic and somewhat redundant, but which, together, guarantee
a good result.
Randomization,
Replication,
Local control.
And also
Plan design and analysis at the same time,
Involve your favorite statistician from the beginning (or before) of the experiment.
14 / 48
Design checklist
1. Define the objectives of the experiment.
2. Identify all possible sources of variation.
3. Select an appropriate experimental design.
4. Specify the experimental process.
5. Conduct a pilot study.
6. Specify the hypothesized model.
7. Describe the tests to be performed.
8. Estimate the required sample size using the results of the pilot study.
9. Review your decisions in Steps 1 through 8 and make the necessary revisions.
15 / 48
Design, experimentation and analysis
16 / 48
Principles of experimental design
17 / 48
Basic principles of experimental design
Good experimental designs share common traits.
Apart from the fact that they are based on the logic of experimentation and the
scientific method,
they usually rely on some ideas, whose application guarantees good designs, or, in
any case, better designs than those studies in which they are not taken into
account explicitly.
Randomization
Replication
18 / 48
1. Randomization
Since it is not possible to avoid random variations, we can randomly assign treatments
to units to try to compensate for the effect of such variation.
Random assignment also protects against bias in the allocation. For example, a doctor
may be "tempted" to give the drug she thinks works best to patients with the worst prognosis.
19 / 48
2. Replication
There is general agreement on the need to apply each treatment independently to
several experimental units.
This ...
Having the appropriate sample size does not guarantee the presence of an effect: the
often-heard sentence "we didn't detect any effect, but if we can collect enough samples
the effect will be seen" can be considered a statistical myth.
20 / 48
Replicates, power and precision
The number of repetitions r is directly related to the precision of the experiment.
Variability is inversely related to the precision of the experiment.
$$\frac{1}{\operatorname{var}(\bar{X})} = \frac{r}{\sigma^2} \qquad (\ast)$$
While this is stated for estimating the sample mean, the rule can be easily extended to
other characteristics.
From this relation, it is straightforward to derive formulae for the sample size needed
for estimation.
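As a minimal sketch of such a derivation: since var(X̄) = σ²/r, a confidence interval for the mean with half-width E at level 1 − α needs roughly r ≥ (z σ / E)², where z is the 1 − α/2 normal quantile. The function name and the example values are illustrative assumptions, and σ is assumed to be known (in practice it would be estimated, for instance from a pilot study).

```python
import math
from statistics import NormalDist

def replicates_for_margin(sigma, margin, conf_level=0.95):
    """Smallest r with z * sigma / sqrt(r) <= margin (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    return math.ceil((z * sigma / margin) ** 2)

# Hypothetical values: standard deviation 10, desired half-width 5
print(replicates_for_margin(sigma=10, margin=5))
```

Halving the desired margin roughly quadruples the number of replicates, which follows directly from the square in the formula.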
21 / 48
How many replicates are needed?
If the goal of an experiment is not only to estimate one characteristic but also to
compare groups, that is, to detect the effect of a treatment, this can also be accounted
for.
One can compute the sample size needed given the previous four values or,
One can fix any four and compute the other one (for instance, the power given a
sample size, etc.)
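The idea of fixing all quantities but one can be sketched for the comparison of two group means. This sketch assumes that the quantities involved are the difference to detect (delta), the standard deviation (sigma), the significance level (alpha), the power and the sample size per group, and it uses the normal approximation to a two-sided test. The function names are hypothetical, and an exact calculation based on the t distribution would give slightly larger sample sizes.

```python
import math
from statistics import NormalDist

_norm = NormalDist()

def n_per_group(delta, sigma, alpha=0.05, power=0.80):
    """Approximate sample size per group to detect a mean difference delta."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    z_beta = _norm.inv_cdf(power)
    return math.ceil(2 * (sigma * (z_alpha + z_beta) / delta) ** 2)

def power_for_n(n, delta, sigma, alpha=0.05):
    """Approximate power with n units per group (opposite tail ignored)."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    return _norm.cdf(delta / (sigma * math.sqrt(2 / n)) - z_alpha)

# Hypothetical values: detect a difference of one standard deviation
n = n_per_group(delta=1.0, sigma=1.0)
print(n, round(power_for_n(n, delta=1.0, sigma=1.0), 3))
```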
22 / 48
Technical and biological replicates
Technical replications allow quantifying the
variability associated with the technique used.
Biological replications allow quantifying the
variability associated with the study population.
The total variability can be decomposed into various components of the variance.
$$\sigma^2_{\mathrm{TOTAL}} = \sigma^2_{\mathrm{TEC}} + \sigma^2_{\mathrm{BIO}} + \sigma^2_{\mathrm{ERR}}$$
In general:
$$\sigma^2_{\mathrm{TEC}} < \sigma^2_{\mathrm{BIO}}$$
source: https://www.licor.com/bio/blog/technical-and-biological-replicates
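A small sketch can illustrate why the two kinds of replicates are not interchangeable. It assumes a balanced nested layout (n_bio biological units, each measured n_tec times) and, for simplicity, absorbs the residual error into the technical component. Under those assumptions the variance of the overall mean is var_bio / n_bio + var_tec / (n_bio * n_tec). The function name and the numbers are hypothetical.

```python
def variance_of_mean(var_bio, var_tec, n_bio, n_tec):
    """Variance of the overall mean in a balanced nested layout."""
    return var_bio / n_bio + var_tec / (n_bio * n_tec)

# Hypothetical variance components with var_tec < var_bio.
# Both layouts use 12 measurements in total.
few_bio = variance_of_mean(4.0, 1.0, n_bio=3, n_tec=4)
many_bio = variance_of_mean(4.0, 1.0, n_bio=12, n_tec=1)
print(few_bio, many_bio)
```

With the same total number of measurements, the layout with more biological replicates gives the smaller variance, because technical replicates do not reduce the biological component.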
23 / 48
Replicates or pools?
Sometimes it may be decided to combine mRNA from different samples to form a
"pooled sample" or pool
Don't use pools when individual information is important (e.g. paired designs).
A sample with 3 pooled individuals is not the same as 3 individual samples!
24 / 48
3. Local control
In many situations the samples are not all homogeneous.
If there are systematic differences between groups of samples ("blocks"), the effects of
interest (for example, the effect of a treatment) can be affected by differences between
samples of different blocks.
In other words, it may not be clear whether the differences observed are attributable to
the effect of the treatment or to other factors, which we call confounders.
Local control or blocking, that is, distributing each treatment evenly among the different
blocks, is the way to minimize this undesired effect.
25 / 48
How to apply local control
This design does not apply good local control.
Treatment effect can be confused with the effect of age or that of the production batch.
This design applies good local control.
The possible effect of sex or of the production batch is distributed among
the different levels of treatment, which will allow them to be analyzed separately.
26 / 48
Batch effect and its adjustment
A Principal Component Analysis (PCA) can reveal the presence of undetected blocks in
the design.
If the different levels of the "batch" are distributed among the treatment levels, it is
possible to correct for the batch effect.
If the two are confounded (for example, each treatment has been done
in a distinct batch), the effects cannot be separated.
27 / 48
Basic types of experimental designs
28 / 48
Experimental designs
A key point in any experiment is the way in which
the experimental units are assigned to the treatments, and
the best possible local control is achieved, given the circumstances of the
experiment.
To achieve the best possible design, we will take into account the components that
define each design.
29 / 48
Design components
When considering the choice of a design for an experiment we must take into account:
This depends on the resources, the available units, the desired precision, the
heterogeneity between EUs.
30 / 48
From components to design
Design: Completely randomized block design
Treatments design: 1 factor (k levels), 1 block (l levels), k × l EUs
Error control design: Assign treatments 1 ... k to the EUs in each block
Observational design: 1 EU = 1 OU
31 / 48
Designs, Models and Analysis
The design of the treatments and the observational design help us to choose the
appropriate design for an experiment.
Error control design defines how the randomization is carried out, that is, the
assignment of individuals to the treatments.
32 / 48
Experimental design and ANOVA
Sometimes the design of the experiment is confused with its analysis, which is carried
out using analysis of variance (ANOVA) techniques.
This is understandable, because defining the experimental design also sets the way
the data will be analyzed. That is, the two are related, but they are not the same.
A common problem is that some books and statistics courses pay no
attention to how treatments were allocated to individuals and simply provide the
data already collected.
This makes it difficult for students to realize that the experimental design was
carried out before the data were collected.
Summarizing: although the treatment design suggests a certain analysis model, the
experimental design should not be confused with the analysis of the data collected in
the experiment!
33 / 48
Experimental design and ANOVA
Treatments Design Error control D. Analysis
34 / 48
Completely randomized design
Gene therapy experiment: compare four techniques to correct faulty genes
20 genetically identical and modified mice, affected by the disease to be treated, are
selected.
35 / 48
Completely randomized design
The simplest design, suitable for comparing several treatments on a homogeneous
sample.
The analysis will usually be carried out by means of a one-way analysis of variance
(ANOVA).
36 / 48
Randomization in a completely randomized design (CRD)
There are many libraries that allow randomization, but it can also be done easily with a
small script.
Randomization is carried out before the experiment and it only indicates which
treatment each experimental unit will receive.
Once the experiment is carried out, it is usual to present the data ordered by the
treatments received, which eliminates the evidence that the assignment has been made
randomly.
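A minimal sketch of such a script, using the 20 mice and four techniques of the gene therapy example. Equal replication (5 mice per technique), the labels and the function name are assumptions made for illustration. A fixed seed is used so that the assignment can be reproduced and documented.

```python
import random

def randomize_crd(units, treatments, seed=None):
    """Randomly assign treatments to units with equal replication."""
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    reps = len(units) // len(treatments)
    # One label per unit: each treatment repeated `reps` times, then shuffled
    labels = [t for t in treatments for _ in range(reps)]
    rng = random.Random(seed)
    rng.shuffle(labels)
    return dict(zip(units, labels))

# Hypothetical labels for the mice and the techniques
mice = [f"mouse_{i:02d}" for i in range(1, 21)]
plan = randomize_crd(mice, ["T1", "T2", "T3", "T4"], seed=2024)
for mouse, technique in plan.items():
    print(mouse, technique)
```

Keeping the printed plan in the original unit order preserves the evidence that the assignment was made at random.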
37 / 48
Randomized block design
After exposure to a poison, cells can be treated by different substances that accelerate
regeneration.
A study wants to compare six of these growth factors (5 are treatments and 1 is a
control).
Due to a problem, there is not enough culture medium to grow all treatments
with replicates. Instead, there are 4 culture media available.
38 / 48
Randomized block design
The completely randomized design loses utility if the experimental material is not
homogeneous.
In these cases, we can apply local control (blocking) and divide the experimental
material in homogeneous subgroups, which we will call blocks.
Once the samples have been distributed among the blocks, the treatments are applied
to the experimental units randomly and independently of the other blocks.
$$Y_{ij} = \mu + \rho_i + \tau_j + e_{ij}, \quad i = 1, \dots, k, \; j = 1, \dots, l.$$
The analysis will usually be carried out by means of an analysis of variance (ANOVA) of
two factors without interaction.
Obviously, if it is not possible to distribute the samples evenly between the blocks, the
situation becomes complicated and we are faced with unbalanced designs.
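Randomization within blocks can be sketched as follows for the growth factor example, with the 4 culture media acting as blocks and the 6 treatments applied once in each block. The labels and the function name are hypothetical. The key point is that a fresh random order is drawn for every block.

```python
import random

def randomize_rcbd(blocks, treatments, seed=None):
    """Within each block, place every treatment once in an independent random order."""
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # new permutation for each block
        layout[block] = order
    return layout

# Hypothetical labels: 4 culture media (blocks), 1 control and 5 growth factors
media = ["medium_1", "medium_2", "medium_3", "medium_4"]
treatments = ["control", "GF1", "GF2", "GF3", "GF4", "GF5"]
layout = randomize_rcbd(media, treatments, seed=11)
for medium, order in layout.items():
    print(medium, order)
```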
39 / 48
Block or randomize?
Block what you can and randomize what you cannot - Box, Hunter & Hunter (1978)
Randomization provides a rough balance between variables that have not been taken
into account.
Local control eliminates the effect of differences between blocks, thereby ensuring that
differences between treatments cannot be due to differences between blocks.
40 / 48
Factorial design
A study was conducted to assess the effect of a drug and a diet on systolic blood
pressure.
20 people with high blood pressure were randomized to one of four treatment
conditions.
At the end of the treatment period, systolic blood pressure was assessed.
It is a factorial design in which each of the two treatments (drug, diet) can be randomly
assigned to each individual.
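The assignment in this example can be sketched by treating every combination of factor levels as one treatment condition. The level names ("placebo"/"drug", "standard"/"modified"), the equal replication of 5 people per condition and the function name are assumptions made for illustration.

```python
import itertools
import random

def randomize_factorial(units, factors, seed=None):
    """Assign each unit to one combination of factor levels, equally replicated."""
    combos = list(itertools.product(*factors.values()))
    if len(units) % len(combos) != 0:
        raise ValueError("number of units must be a multiple of the number of combinations")
    reps = len(units) // len(combos)
    labels = combos * reps  # every combination appears `reps` times
    rng = random.Random(seed)
    rng.shuffle(labels)
    names = list(factors)
    return {unit: dict(zip(names, combo)) for unit, combo in zip(units, labels)}

# Hypothetical level names for the two factors
factors = {"drug": ["placebo", "drug"], "diet": ["standard", "modified"]}
people = [f"subject_{i:02d}" for i in range(1, 21)]
plan = randomize_factorial(people, factors, seed=7)
for person, condition in plan.items():
    print(person, condition)
```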
41 / 48
Factorial design
This design is useful to study the effects of several factors simultaneously.
The "treatments" are all combinations of the different factors under study.
The fact that each combination is replicated makes it possible to study, not only the
effects of each factor separately, but also the interaction between them.
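The linear model that describes a two-factor design with interaction, with t and s levels
of the two factors and r replicates per combination, is the following:
$$Y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + e_{ijk}, \quad i = 1, \dots, t, \; j = 1, \dots, s, \; k = 1, \dots, r.$$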
The analysis will usually be carried out by means of an analysis of variance (ANOVA) of
two factors with interaction.
42 / 48
Repeated measures design
A study wanted to measure the concentration of certain metabolites in plasma after two
dietary interventions consisting of adding an amount of olive oil or an equivalent
amount of walnuts to the standard diet.
21 mice subjected to the same diet were taken and an intervention (water, olive oil or
nuts) was randomly assigned.
The concentration of the metabolite in blood was measured before the
intervention and at 24h, 48h and one week after it.
43 / 48
Repeated measures design
When we take more than one measurement in each experimental unit, we have a
within-subjects design.
In this case, the data have different characteristics from the previous ones.
Apart from this, they offer the same possibilities as with other designs, but with an
additional source of variability, "time".
The analysis of repeated measures data is a broad topic of its own. Although the ANOVA of
repeated measures is traditionally used, the current trend is to perform the analyses
using linear mixed models, which are much more flexible.
44 / 48
Summarizing ...
A good experimental design is essential to carry out good experiments.
Experimental design means planning in advance, that is, before and not after the
experiment.
The experimental design must consider all steps: from sampling to data analysis.
Applying grounded principles such as randomization, replication and local control is key
to obtain good experimental designs.
The analysis of designed experiments is usually carried out with analysis of variance
(ANOVA). While each design can be associated with an ANOVA model, the two should not be
confused.
Whenever possible we should have statistical support from the beginning of the study.
45 / 48
And, as the master said ...
46 / 48
References and resources
47 / 48
References and resources
3rs-reduction.co.uk
The "Statistical Analysis" section takes a brief tour of some experimental design
models and their analysis.
A book for an introductory course to design of experiments that, after being sold
out in bookstores, the author decided to provide freely on the internet.
It takes a "traditional" approach to the subject and contains aspects that today
would be approached differently, but it continues to be very interesting.
48 / 48