Topic 9. Factorial Experiments (ST&D Chapter 15)
                          A = N level
B = P level         a1 = N0    a2 = N1    Mean       a2 - a1 (se A)
b1 = P0               40.9       47.8      44.4        6.9
b2 = P1               42.4       50.2      46.3        7.8
Mean                  41.6       49.0      45.3        7.4 (me A)
b2 - b1 (se B)         1.5        2.4      1.9 (me B)
The differences a2 - a1 and b2 - b1 are called the simple effects, denoted (se A) and (se
B). The differences between the means are the main effects, denoted (me A) and (me B).
One way of using these data is to consider the effect of N on yield at each P level
separately. This information could be useful to a grower who is constrained to use one or
the other P level. This is called analyzing the simple effects (se) of N. The simple effects
of applying nitrogen are to increase yield by 6.9 lb/acre at P0 and by 7.8 lb/acre at P1.
It is possible that the effect of N on yield is the same whether or not P is applied. In that
case, the two simple effects estimate the same quantity and differ only due to
experimental error. One is then justified in looking at the difference between the two
marginal means to obtain a main yield response of 7.4 lb/acre. This is called the main effect (me)
of N on yield. If the effect of P is the same at every N level, then one can do the same
thing for that factor to obtain a main effect of 1.9 lb/acre.
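These quantities can be verified with a few lines of arithmetic. The following is only a sketch (the dataset name npmeans and the variable names are invented for illustration); it reproduces the simple and main effects from the four cell means in the table above:

data npmeans;
   /* cell means from the N x P table above */
   a1b1 = 40.9;  a2b1 = 47.8;       /* N0 and N1 at P0 */
   a1b2 = 42.4;  a2b2 = 50.2;       /* N0 and N1 at P1 */
   seA_b1 = a2b1 - a1b1;            /* simple effect of N at P0 = 6.9 */
   seA_b2 = a2b2 - a1b2;            /* simple effect of N at P1 = 7.8 */
   meA = (seA_b1 + seA_b2)/2;       /* main effect of N = 7.35, about 7.4 */
   meB = ((a1b2 + a2b2) - (a1b1 + a2b1))/2;   /* main effect of P = 1.95, about 1.9 */
   put seA_b1= seA_b2= meA= meB=;   /* writes the results to the SAS log */
run;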
9. 4. Interaction
If the simple effects of Factor A are the same across all levels of Factor B, the two factors
are said to be independent. In such cases, it is appropriate to analyze the main effects of
each factor. It may, however, be the case that the effects are not independent. For
example, one might expect the application of P to permit a fuller expression of the yield
potential of the applied N. In that case, the effect of N in the presence of P would be
much larger than the effect of N in the absence of P. When the effect of one factor
depends on the level of another factor, the two factors are said to exhibit an interaction.
An interaction is a measure of the difference in the effect of one factor at the
different levels of another factor. Interaction is a common and fundamental
scientific idea.
One of the primary objectives of factorial experiments, other than efficiency, is to study
the interactions among factors. The sum of squares of an interaction measures the
departure of the group means from the values expected on the basis of purely additive
effects. In common biological terminology, a large positive deviation of this sort is called
synergism. When drugs act synergistically, the result of the interaction of the two drugs
may be above and beyond the simple addition of the separate effects of each drug. When
the combination of levels of two factors inhibits each other's effects, we call it
interference. Both synergism and interference increase the interaction SS.
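With the numbers from the N x P table at the beginning of this topic, this "departure from additivity" can be computed directly. The sketch below (dataset name invented) compares the yield expected for N1P1 under purely additive effects with the observed cell mean:

data additivity;
   a1b1 = 40.9;  a2b1 = 47.8;  a1b2 = 42.4;  a2b2 = 50.2;   /* cell means from the N x P table */
   expected  = a1b1 + (a2b1 - a1b1) + (a1b2 - a1b1);   /* 40.9 + 6.9 + 1.5 = 49.3 */
   departure = a2b2 - expected;                         /* 50.2 - 49.3 = 0.9      */
   put expected= departure=;
run;

The positive departure (0.9) is in the direction the text calls synergism; it is this kind of departure that the interaction SS measures.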
These differences between the simple effects of two factors, also known as first-order
interactions or two-way interactions, can be visualized in the following interaction
plots.
[Interaction plots: response Y plotted against the levels of factor A (a1, a2), with one line for each level of B (b1, b2). The panels labeled "Low me B, no interaction" and "High me B, no interaction" show parallel lines; the remaining two panels show non-parallel (interacting) patterns. Vertical distances mark the simple effects, e.g. se B at a1 and se A at b1.]
Pitfalls of Interpreting Interactions in Transformed Data

Original data (Y = X):
                   W/o A    With A
W/o B                20        30
With B               35        45
Effect of B          15        15

Transformed data (Y^2):
                   W/o A    With A
W/o B               400       900
With B             1225      2025
Effect of B         825      1125

Effect A: YES on both scales. On the original scale the effect of B is the same (15) with and without A, so there is no interaction. After the Y^2 transformation the effect of B is 825 without A and 1125 with A, so an interaction appears that is only an artifact of the transformation.

[Plots in the original showed the four treatment points against A for the two B levels: parallel lines on the original Y = X scale, diverging lines on the Y^2 scale.]
9. 5. 1. Reasons for carrying out factorial experiments
1. To investigate interactions: If factors are not independent, single factor experiments
provide a disorderly, incomplete, and often quite misleading picture of the system.
More than this, most of the interesting questions today concern interactions.
2. To establish the dependence or independence of factors of interest: In the initial
phases of an investigation, pilot or exploratory factorial experiments can establish
which factors are independent and can therefore be more fully analyzed in separate
experiments.
3. To offer recommendations that must apply over a wide range of conditions: One can
introduce "subsidiary factors" (e.g. soil type) into an experiment to ensure that any
recommended results apply across a necessary range of circumstances.
9. 5. 2. Some disadvantages of factorial experiments
1. The total possible number of treatment level combinations increases rapidly as the
number of factors increases. For example, to investigate 7 factors (3 levels each) in a
factorial experiment requires, at minimum, 2187 experimental units.
2. Higher order interactions (three-way, four-way, etc.) are very difficult to interpret.
So a large number of factors greatly complicates the interpretation of results.
9. 6. Differences between nested and factorial experiments (Biometry pages 322-323)
People are often confused between nested and factorial experiments. Consider a factorial
experiment in which growth of leaf discs was measured in tissue culture with five
different types of sugars at two different pH levels. In what way does this differ from a
nested design in which each sugar solution is prepared twice, so there are two batches of
sugar for each treatment? The following tables represent both designs, using asterisks to
represent measurements of the response variable.
2x5 factorial experiment

                     Sugar Type
            1     2     3     4     5
pH1         *     *     *     *     *
            *     *     *     *     *
pH2         *     *     *     *     *
            *     *     *     *     *

Nested experiment

                     Sugar Type
            1     2     3     4     5
Batch 1     *     *     *     *     *
            *     *     *     *     *
Batch 2     *     *     *     *     *
            *     *     *     *     *
The data tables look very similar, so what's the difference here? The factorial analysis
implies that the two pH classes are common across the entire study (i.e. pH level 1 is a
specific pH level that is the same across all sugar treatments). By analogy, if you were to
analyze the nested experiment as a two-way factorial ANOVA, it would imply that
Batches are common across the entire study. But this is not so. Batch 1 for Treatment 1
has no closer relation to Batch 1 for Treatment 2 than it does to Batch 2 for Treatment 2.
"Batch" is an ID, and Batches 1 and 2 are simply arbitrary designations for two randomly
prepared sugar solutions for each treatment.
Now, if all batches labeled 1 were prepared by the same technician on the same day,
while all batches labeled 2 were made by someone else on another day, then 1 and 2
would represent meaningfully common classes across the study. In this case, the
experiment could properly be analyzed using a twoway ANOVA with Technicians/Days
as blocks (RCBD).
While they both require two-way ANOVAs, RCBD's differ from true factorial
experiments in their objective. In this example, we are not interested in the effect of the
batches or in the interaction between batches and sugar types. Our main interest is to
control for this additional source of variation so that we can better detect the differences
among treatments; toward this end, we assume there to be no interactions.
When presented with an experimental description and its accompanying dataset, the
critical question to be asked to differentiate factors from experimental units or
subsamples is this: Do the classes in question have a consistent meaning across the
experiment, or are they simply IDs? Notice that ID (or dummy) classes can be swapped
without affecting the analysis (switching the names of "Batch 1" and "Batch 2" within
any given Sugar Type has no consequences), whereas factor classes cannot (switching
"pH1" and "pH2" within any given Sugar Type would completely muddle the analysis).
9. 7. The two-way factorial analysis (for fixed-effects model or Model I)
9. 7. 1. The linear model for two-way factorial experiments
The linear model for a two-way factorial analysis is
Yijk = µ + αi + βj + (αβ)ij + εijk

Here αi represents the main effect of level i of factor A (i = 1,...,a), βj represents the main
effect of level j of factor B (j = 1,...,b), (αβ)ij represents the interaction of level i of factor A with
level j of factor B, and εijk is the error associated with replication k of the factor
combination ij (k = 1,...,r). In dot notation:

Yijk - Ȳ... = (Ȳi.. - Ȳ...) + (Ȳ.j. - Ȳ...) + (Ȳij. - Ȳi.. - Ȳ.j. + Ȳ...) + (Yijk - Ȳij.)
               main effect A    main effect B    interaction                 experimental error
The null hypotheses for a two-factor experiment are: all αi = 0, all βj = 0, and all (αβ)ij = 0. The F
statistics for each of these hypotheses may be interpreted independently due to the
orthogonality of their respective sums of squares (they are equivalent to orthogonal
contrasts). The sum of squares partition becomes:

TSS = SSA + SSB + SSAB + SSE
9. 7. 2. ANOVA for a two-way factorial design (for fixed-effects model or Model I)
In the ANOVA for two-way factorial experiments, the Treatments SS is partitioned into
three orthogonal components: a SS for each factor and an interaction. This partitioning is
valid even when the overall F test among treatments is not significant. Indeed, there are
situations where one factor, say B, has no effect and hence contributes no more to the
SST than one would expect by chance; a significant response to A might well be lost in
an overall test of significance. In a factorial experiment the overall SST is more often just
an intermediate computational quantity rather than an end product.
In a two-way factorial (a x b), there are a total of ab treatment combinations and therefore
(ab - 1) treatment degrees of freedom. The main effect of factor A has (a - 1) df and the
main effect of factor B has (b - 1) df. The interaction (AxB) has (a - 1)(b - 1) df. With r
replications per treatment combination, there are a total of rab experimental units in the
study and, therefore, (rab - 1) total degrees of freedom.
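In SAS this partition is obtained by listing both factors and their interaction in the MODEL statement. A minimal sketch for a two-way factorial in a CRD (the dataset and variable names are assumptions):

proc glm data=twoway;     /* hypothetical dataset with class variables A, B and response Y */
   class A B;
   model Y = A B A*B;     /* equivalently: model Y = A|B; */
run; quit;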
General ANOVA table for a two-way CRD factorial experiment:

Source      df               SS      MS      F
Factor A    a - 1            SSA     MSA     MSA/MSE
Factor B    b - 1            SSB     MSB     MSB/MSE
AxB         (a - 1)(b - 1)   SSAB    MSAB    MSAB/MSE
Error       ab(r - 1)        SSE     MSE
Total       rab - 1          TSS
The interaction SS is the variation due to the departures of group means from the
values expected on the basis of additive combinations of the two factors' main effects.
The significance of the interaction F test determines what kind of subsequent analysis
is appropriate: if the interaction is not significant, the main effects can be interpreted
directly; if it is significant, the analysis should focus on the simple effects (see 9.7.6).
8 x 8 Latin Square
(each cell is one of the 8 treatment combinations; each appears once in every row and once in every column)

24  21  12  13  23  14  11  22
11  23  14  22  12  24  21  13
22  13  24  21  11  23  12  14
12  14  11  24  13  22  23  21
13  22  23  11  21  12  14  24
14  12  21  23  22  13  24  11
23  11  22  14  24  21  13  12
21  24  13  12  14  11  22  23
Quackgrass shoots per square foot (ST&D p. 391). Factors: D (3 or 10) and R (0, 4, or 8), in an RCBD with 4 blocks.

                        Block
D     R        1      2      3      4     Total
3     0      15.7   14.6   16.5   14.7    61.5
3     4       9.8   14.6   11.9   12.4    48.7
3     8       7.9   10.3    9.7    9.6    37.5
10    0      18.0   17.4   15.1   14.4    64.9
10    4      13.6   10.6   11.8   13.3    49.3
10    8       8.8    8.2   11.3   11.2    39.5
Total        73.8   75.7   76.3   75.6   301.4
SAS Program

data STDp391;
input D R block number @@;
cards;
3  0 1 15.7    3  0 2 14.6    3  0 3 16.5    3  0 4 14.7
10 0 1 18.0    10 0 2 17.4    10 0 3 15.1    10 0 4 14.4
3  4 1  9.8    3  4 2 14.6    3  4 3 11.9    3  4 4 12.4
10 4 1 13.6    10 4 2 10.6    10 4 3 11.8    10 4 4 13.3
3  8 1  7.9    3  8 2 10.3    3  8 3  9.7    3  8 4  9.6
10 8 1  8.8    10 8 2  8.2    10 8 3 11.3    10 8 4 11.2
;
proc GLM;
class D R block;
model number= block D R D*R;
means D|R / lsd;
contrast 'R lineal'    R -1  0  1;
contrast 'R quadratic' R  1 -2  1;
run; quit;
If you have only 1 rep (1 block), you cannot include D*R in the model (see 9.7.5).
Dependent Variable: NUMBER

Source             DF   Sum of Squares   Mean Square   F Value   Pr > F
Model               8      156.235000     19.529375      7.44    0.0005
Error              15       39.383333      2.625556
Corrected Total    23      195.618333

Source    DF   SS            Mean Square   F Value   Pr > F
BLOCK      3     0.581667      0.193889      0.07    0.9731
D          1     1.500000      1.500000      0.57    0.4614
R          2   153.663333     76.831667     29.26    0.0001
D*R        2     0.490000      0.245000      0.09    0.9114
Level of D    N     Mean
10           12    12.8083
3            12    12.3083

Level of R    N     Mean
0             8    15.8000
4             8    12.2500
8             8     9.6250

Level of D   Level of R    N    Mean          Std Dev
3            0             4    15.3750000    0.89953692
3            4             4    12.1750000    1.97040605
3            8             4     9.3750000    1.03077641
10           0             4    16.2250000    1.74427635
10           4             4    12.3250000    1.39373599
10           8             4     9.8750000    1.60701587

Contrast       DF   Contrast SS   Mean Square   F Value   Pr > F
R lineal        1   152.522500    152.522500     58.09    0.0001
R quadratic     1     1.140833      1.140833      0.43    0.5198
The figure below was produced using the Analyst application. Within the Factorial
ANOVA window there is an option to produce plots of the dependent means for the
two-way effects. Parallel lines, like those observed in this graph, indicate absence of
interaction: the differences among R doses are the same for the different D levels, and as
a consequence of these constant differences the lines are parallel.
[Interaction plot from Analyst: mean NUMBER plotted against the D levels, with one line for each R dose (R=0, R=4, R=8). The three lines are essentially parallel.]
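The same plot can also be produced in code instead of through the Analyst menus. A sketch using the STDp391 dataset above (PROC SGPLOT is assumed to be available):

proc sort data=STDp391;
   by D R;
proc means data=STDp391 noprint;
   by D R;
   var number;
   output out=cellmeans mean=mean_number;   /* mean over the 4 blocks for each D x R cell */
run;
proc sgplot data=cellmeans;
   series x=D y=mean_number / group=R markers;   /* one line per R dose */
run;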
An alternative is to code each D x R combination as a single treatment (TRT):
D3 R0 = TRT 1, D3 R4 = TRT 2, D3 R8 = TRT 3, D10 R0 = TRT 4, D10 R4 = TRT 5, D10 R8 = TRT 6.
Then analyze the TRT means as if TRT were a one-way classification of the data and use
contrasts to partition the interaction. The last two contrasts below are the interaction contrasts
(a good discussion is available in SAS System for Linear Models, 3rd Ed., pp. 94-104).
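The TRT variable itself can be created from D and R in a data step; a sketch (the dataset name trtcoded is invented, and note that with ORDER=DATA the class levels follow their order of appearance, so the data are sorted by TRT first to make levels 1-6 line up with the contrast coefficients below):

data trtcoded;
   set STDp391;
   if      D=3  and R=0 then TRT=1;
   else if D=3  and R=4 then TRT=2;
   else if D=3  and R=8 then TRT=3;
   else if D=10 and R=0 then TRT=4;
   else if D=10 and R=4 then TRT=5;
   else if D=10 and R=8 then TRT=6;
run;
proc sort data=trtcoded;
   by TRT;
run;

If this is run immediately before the PROC GLM below, the procedure (having no DATA= option) uses the most recently created dataset.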
proc glm order=data;
class TRT block;
model number= block TRT;
contrast 'D'                           TRT  1  1  1 -1 -1 -1;
contrast 'R lineal'                    TRT -1  0  1 -1  0  1;
contrast 'R quadratic'                 TRT  1 -2  1  1 -2  1;
contrast 'Interaction lineal R * D'    TRT -1  0  1  1  0 -1;
contrast 'Interaction quadratic R * D' TRT  1 -2  1 -1  2 -1;
run; quit;
Source             DF   SS        MS       F Value   Pr > F
Model               8   156.235   19.529    7.44     0.0005
Error              15    39.383    2.626
Corrected Total    23   195.618

Source   DF   SS        MS       F Value   Pr > F
block     3     0.582    0.194    0.07     0.9731
TRT       5   155.653   31.131   11.86    <.0001

Contrast       DF   Contrast SS   MS        F Value   Pr > F
D               1     1.500        1.500     0.57     0.4614
R lineal        1   152.522      152.522    58.09    <.0001
R quadratic     1     1.141        1.141     0.43     0.5198
Int R L*D       1     0.123        0.122     0.05     0.8319
Int R Q*D       1     0.367        0.367     0.14     0.7135
For comparison, the output from the original factorial model (block D R D*R) of the same data:

Class    Levels   Values
block    4        1 2 3 4

Source    DF   SS        MS       F Value   Pr > F
BLOCK      3     0.582    0.194     0.07    0.9731
D          1     1.500    1.500     0.57    0.4614
R          2   153.663   76.832    29.26    0.0001
D*R        2     0.490    0.245     0.09    0.9114

Contrast       DF   Contrast SS   MS        F Value   Pr > F
R lineal        1   152.522      152.522    58.09     0.0001
R quadratic     1     1.141        1.141     0.43     0.5198
Coding of type in terms of Vrn1 and Vrn2:

Type   Vrn1   Vrn2
 1      0      0
 2      0      1
 3      0      2
 4      1      0
 5      1      1
 6      1      2
 7      2      0
 8      2      1
 9      2      2

data interpart;
input type Vrn1 Vrn2 days @@;
cards;
1 0 0  89   1 0 0  97   1 0 0  98   1 0 0 101   1 0 0 100
2 0 1 133   2 0 1 148   2 0 1 138   2 0 1 128   2 0 1 130
2 0 1 134   2 0 1 133   2 0 1 148   2 0 1 144   2 0 1 130
2 0 1 137   2 0 1 138   2 0 1 148   2 0 1 133   2 0 1 141
2 0 1 131
3 0 2 163   3 0 2 153   3 0 2 156   3 0 2 153   3 0 2 148
3 0 2 161
4 1 0  83   4 1 0  87   4 1 0  81   4 1 0  99   4 1 0  78
4 1 0  92   4 1 0  85   4 1 0  83   4 1 0 103   4 1 0  98
4 1 0  92   4 1 0  66   4 1 0 109   4 1 0 110   4 1 0  83
4 1 0  91
5 1 1 121   5 1 1 121   5 1 1 118   5 1 1 123   5 1 1 108
5 1 1 112   5 1 1  98   5 1 1 116   5 1 1 110   5 1 1 113
5 1 1 122   5 1 1 124   5 1 1 126   5 1 1 106   5 1 1 129
5 1 1 122   5 1 1 125   5 1 1 125   5 1 1 118   5 1 1 117
5 1 1 116
6 1 2 140   6 1 2 125   6 1 2 132   6 1 2 133   6 1 2 125
6 1 2 125   6 1 2 128   6 1 2 135   6 1 2 178   6 1 2 135
6 1 2 128   6 1 2 136   6 1 2 134   6 1 2 121
7 2 0  81   7 2 0  99   7 2 0  73   7 2 0  91   7 2 0  88
7 2 0 103   7 2 0  99
8 2 1 137   8 2 1 153   8 2 1  86   8 2 1 120   8 2 1 120
8 2 1 106   8 2 1 112   8 2 1 118   8 2 1 114   8 2 1 118
8 2 1 111   8 2 1 120   8 2 1 126   8 2 1 119   8 2 1 117
9 2 2 124   9 2 2 124
;
proc glm order=data;
class vrn1 vrn2;
model days= vrn1|vrn2;
contrast 'Lineal Vrn1'    vrn1 -1  0  1;
contrast 'Quadratic Vrn1' vrn1  1 -2  1;

proc glm order=data;
class type;
model days= type;
contrast 'Lineal Vrn1'  Type -1 -1 -1  0  0  0  1  1  1;
contrast 'Quadrat Vrn1' Type  1  1  1 -2 -2 -2  1  1  1;
contrast 'Lineal Vrn2'  Type -1  0  1 -1  0  1 -1  0  1;
contrast 'Quadrat Vrn2' Type  1 -2  1  1 -2  1  1 -2  1;
contrast 'Int l by l'   Type  1  0 -1  0  0  0 -1  0  1;
contrast 'Int l by q'   Type -1  2 -1  0  0  0  1 -2  1;
contrast 'Int q by l'   Type -1  0  1  2  0 -2 -1  0  1;
contrast 'Int q by q'   Type  1 -2  1 -2  4 -2  1 -2  1;
run; quit;
3x3 Factorial

Class   Levels   Values
Vrn1    3        0 1 2
Vrn2    3        0 1 2

Source             DF    SS      MS     F Value   Pr > F
Model               8   38006    4751    42.97    <.0001
Error              93   10282     111
Corrected Total   101   48288

Source        DF   Type III SS   MS       F Value   Pr > F
Vrn1           2     4435         2217     20.06    <.0001
Vrn2           2    21310        10655     96.37    <.0001
Vrn1*Vrn2      4      808          202      1.83    0.1303 NS

Contrast        DF   SS     MS     F Value   Pr > F
Lineal Vrn1      1   2829   2829    25.58    <.0001
Quadrat Vrn1     1    847    847     7.66    0.0068

Class   Levels   Values
Type    9        1 2 3 4 5 6 7 8 9

Source             DF    SS      MS     F Value   Pr > F
Type                8   38006    4751    42.97    <.0001
Error              93   10282     111
Corrected Total   101   48288

Contrast        DF   SS      MS      F Value   Pr > F
Lineal Vrn1      1    2829    2829    25.58    <.0001
Quadrat Vrn1     1     847     847     7.66    0.0068
Lineal Vrn2      1   16181   16181   146.35    <.0001
Quadrat Vrn2     1    1650    1650    14.92    0.0002
Int l by l       1     631     631     5.71    0.0189
Int l by q       1       0       0     0.00    0.9523
Int q by l       1      12      12     0.11    0.7465
Int q by q       1     161     161     1.46    0.2305
Note that even though the overall interaction in the 3x3 factorial is not significant, the
linear-by-linear interaction contrast is significant.
Note also that the linear and quadratic contrasts for the main effect of Vrn1 are identical in both
analyses.
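The interaction contrast coefficients above are simply products (Kronecker products) of the main-effect contrast coefficients, with Vrn2 changing fastest across the 9 Types, as in the coding table. If SAS/IML is licensed, this can be checked with the @ (Kronecker product) operator; the sketch below reproduces two of the rows:

proc iml;
   linVrn1  = {-1  0  1};            /* linear contrast over Vrn1 = 0, 1, 2 */
   linVrn2  = {-1  0  1};            /* linear contrast over Vrn2 = 0, 1, 2 */
   quadVrn2 = { 1 -2  1};            /* quadratic contrast over Vrn2        */
   lin_by_lin  = linVrn1 @ linVrn2;     /*  1 0 -1 0 0 0 -1 0 1 = 'Int l by l' */
   lin_by_quad = linVrn1 @ quadVrn2;    /* -1 2 -1 0 0 0 1 -2 1 = 'Int l by q' */
   print lin_by_lin lin_by_quad;
quit;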
9.7.4.4. Example of a nested factor within a factorial design
Assume that in the quack-grass shoots experiment (9.7.4.1), two random samples of 1
square foot were taken in each plot (each R x D combination within each block). The values for
the two subsamples were created so that their average is identical to the value in the previous
exercise. The correct model includes the subsamples nested within the
R*D*Block combinations.
data STDp391;
input D R Block plot number @@;
* two records (subsamples) per plot; the residual left in plot(D*R*Block) after Block, D, R
  and D*R (15 df) is the among-plot error;
cards;
3  0 1 1 14.7   3  4 1 1  8.8   3  8 1 1  6.9
3  0 1 1 16.7   3  4 1 1 10.8   3  8 1 1  8.9
3  0 2 1 13.6   3  4 2 1 13.6   3  8 2 1  9.3
3  0 2 1 15.6   3  4 2 1 15.6   3  8 2 1 11.3
3  0 3 1 15.5   3  4 3 1 10.9   3  8 3 1  8.7
3  0 3 1 17.5   3  4 3 1 12.9   3  8 3 1 10.7
3  0 4 1 13.7   3  4 4 1 11.4   3  8 4 1  8.6
3  0 4 1 15.7   3  4 4 1 13.4   3  8 4 1 10.6
10 0 1 1 17.0   10 4 1 1 12.6   10 8 1 1  7.8
10 0 1 1 19.0   10 4 1 1 14.6   10 8 1 1  9.8
10 0 2 1 16.4   10 4 2 1  9.6   10 8 2 1  7.2
10 0 2 1 18.4   10 4 2 1 11.6   10 8 2 1  9.2
10 0 3 1 14.1   10 4 3 1 10.8   10 8 3 1 10.3
10 0 3 1 16.1   10 4 3 1 12.8   10 8 3 1 12.3
10 0 4 1 13.4   10 4 4 1 12.3   10 8 4 1 10.2
10 0 4 1 15.4   10 4 4 1 14.3   10 8 4 1 12.2
;
proc GLM;
class D R Block plot;
model number= Block D R D*R plot(D*R*Block);
random plot(D*R*Block);
test h= D   e= plot(D*R*Block);
test h= R   e= plot(D*R*Block);
test h= D*R e= plot(D*R*Block);
run; quit;
Source             DF    SS      MS     F      Pr > F
Model              23   391.2    17.0    8.51  <.0001
Error              24    48.0     2.0
Corrected Total    47   439.2

Source              DF    SS       MS       F       Pr > F
Block                3     1.16     0.39    0.19    0.90
D                    1     3.00     3.00    1.50    0.23
R                    2   307.33   153.66   76.83   <.0001
D*R                  2     0.98     0.49    0.24    0.7846
plot(D*R*Block)     15    78.77     5.25    2.63    0.0170
Tests using plot(D*R*Block) as the error term:

Source   DF    SS       MS       F Value   Pr > F
D         1     3.00     3.00     0.57     0.4614
R         2   307.33   153.66    29.26    <.0001
D*R       2     0.98     0.49     0.09     0.9114

Variance Component         Estimate      %
Var(Block)                 -0.40528      0
Var(D)                      0.10458      1
Var(R)                      9.57333     72
Var(D*R)                   -0.59514      0
Var(plot(D*R*Block))        1.62556     12
Var(Error)                  2.00000     15
Note that the default F tests from PROC GLM are wrong because SAS automatically uses
the last term in the model (the variation among subsamples) as the error. Once you specify the
correct error term (plot(D*R*Block)) for each hypothesis (h=D, h=R, or h=D*R), SAS divides by
the correct error term. The real replication is the block, not the two subsamples. If you are
confused by this analysis, use the average of the subsamples and you will get the correct result
(remember the similar exercise in Homework 3, problem 5).
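A sketch of that "average the subsamples first" analysis, using the subsample dataset STDp391 defined in this section:

proc sort data=STDp391;
   by D R Block;
proc means data=STDp391 noprint;
   by D R Block;
   var number;
   output out=plotmeans mean=plotmean;   /* one record per plot (D x R x Block cell) */
run;
proc glm data=plotmeans;
   class D R Block;
   model plotmean = Block D R D*R;       /* MSE is now the correct among-plot error */
run; quit;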
The variance-component output indicates the relative contribution of each component to the
total variance. In this case the major component is the significant R factor, and within the error
term the variance among subsamples is similar to the variance among replications (plots).
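Variance-component estimates of this kind can also be requested directly. One possibility is PROC VARCOMP, which by default treats every effect in the MODEL statement as random; this is only a sketch, and the estimates in the table above may have been obtained with a different procedure or method:

proc varcomp data=STDp391 method=type1;
   class D R Block plot;
   model number = Block D R D*R plot(D*R*Block);
run;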
The objective of introducing a nested factor is to understand the sources of variance in
the error term. This information can be used later to optimize the distribution of resources
between the number of samples and subsamples, as indicated above.
9. 7. 5 Two-way factorial in a CRD with one replication per cell
When only one observation per treatment combination is available, there is no source of
variation with which to estimate the experimental error. However, the interaction can be used
as the error term if it is possible to assume that there is no significant interaction between
the factors. Tukey's additivity test can be used to test for the presence of some of these
interactions.
The interaction is not specified in the model, and the interaction variation is used as an
estimate of the experimental error. In the following example, only the first block from the
previous experiment is used, analyzed as a CRD.
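One way to set this up, assuming the original quackgrass dataset from 9.7.4.1 (one observation per plot) is the one in memory, is to subset Block 1 first; the dataset name block1 is invented:

data block1;
   set STDp391;
   if block = 1;     /* keep only the first block */
run;

If this step is run immediately before the PROC GLM below, the procedure (having no DATA= option) analyzes this subset.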
proc glm;
class D R;
model number= D R;
Dependent Variable: NUMBER

Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              3        81.5            27.2       25.87    0.0375
Error              2         2.1             1.1
Corrected Total    5        83.6

Source   DF   Type I SS   Mean Square   F Value   Pr > F
D         1      8.2          8.2         7.77    0.4349
R         2     73.3         36.7        34.86    0.0279
9. 7. 6 Example with significant interaction (fixed-effects model, ST&D, p. 358)
The interpretation of factorial experiments is often complicated when the
interactions are large. This is especially true if the effects change direction, as they do in
this example. Factor A in this experiment is time of bleeding of a lamb, and Factor B is
treatment vs. no treatment with estrogen. Here are the treatment totals of the 5
replications:

                                B = estrogen
A = time            (b1) = control    (b2) = treated     Total
(a1) = A.M.              66.39             96.80         163.19
(a2) = P.M.             182.67            139.06         321.73
Total                   249.06            235.86         484.92

(each cell is the total of the 5 observations)
SAS analysis

data fact1;
input id time $ estgn $ phos @@;
cards;
1 am c  8.53   2 am t 17.53   3 pm c 39.14   4 pm t 32.00
1 am c 20.53   2 am t 21.07   3 pm c 26.20   4 pm t 23.80
1 am c 12.53   2 am t 20.80   3 pm c 31.33   4 pm t 28.87
1 am c 14.00   2 am t 17.33   3 pm c 45.80   4 pm t 25.06
1 am c 10.80   2 am t 20.07   3 pm c 40.20   4 pm t 29.33
;
proc glm;
class time estgn;
model phos=time|estgn;
proc glm;
class id;
model phos= id;
contrast 'Between time within control' id 1  0 -1  0;
contrast 'Between time within treated' id 0  1  0 -1;
contrast 'Between estrogen levels, am' id 1 -1  0  0;
contrast 'Between estrogen levels, pm' id 0  0  1 -1;
run; quit;
OUTPUT

Source        DF   Anova SS      Mean Square   F Value   Pr > F
TIME           1   1256.74658    1256.74658    52.93     0.0001
ESTGN          1      8.71200       8.71200     0.37     0.5532
TIME*ESTGN     1    273.94802     273.94802    11.54     0.0037
ERROR         16    379.92000      23.75000
If interactions are present in a fixed-effects model, the next step is the analysis
of the simple effects.
One general way of testing the simple effects is to use the BY statement (always run PROC
SORT first, sorting by the variable used in the BY statement).
proc sort;
by time;
proc glm;
class estgn;
model phos= estgn;
means estgn / Hovtest= Levene;
by time;
proc sort;
by estgn;
proc glm;
class time;
model phos=time;
means time / Hovtest= Levene;
by estgn;
run; quit;
You need to test the ANOVA assumptions for each one-way ANOVA.
One-Way ANOVAs

                                DF   SS            Mean Square   Pr > F
Between time within control      1   1352.10384    1352.10384    0.0004
Between time within treated      1    178.59076     178.59076    0.0011
Between estrogen levels, am      1     92.47681      92.47681    0.0237
Between estrogen levels, pm      1    190.18321     190.18321    0.0495
An alternative, when there are clear preplanned hypotheses, is to use an ID variable
and test the simple effects with contrasts (ST&D page 362). These contrasts are not
orthogonal, and the results are not identical to those above because the two approaches use
different MSEs. We will generally use the first approach.
Second PROC GLM (with id as a class variable)

Source            DF    SS           MS          F Value   Pr > F
ID                 3   1539.40660   513.13553    21.61     0.0001
Error             16    379.92328    23.74520
Corrected Total   19   1919.32988
CONTRASTS

Contrast                        DF   Contrast SS   Mean Square   F Value   Pr > F
Between time within control      1   1352.10384    1352.10384    56.94     0.0001
Between time within treated      1    178.59076     178.59076     7.52     0.0145
Between estrogen levels, am      1     92.47681      92.47681     3.89     0.0660
Between estrogen levels, pm      1    190.18321     190.18321     8.01     0.0121
9. 8. Three way ANOVA (fixed-effects model)
There is no reason to restrict the factorial design to a consideration of only two
factors. Three or more factors may be analyzed simultaneously each at different levels.
However, as the number of factors increases, even without replication within subgroups,
the number of experimental units required becomes very large. It is frequently impossible or
prohibitively costly to carry out such an experiment. A 4x4x4 factorial requires 64
experimental units to represent each combination of factors. Moreover, if only 64 e.u. are
used, there will be no replication to estimate the experimental error, and some
interactions would have to be used as an estimate of the experimental error (on the
assumption that no added interaction effect is present).
There are also logistic difficulties with such large experiments. It may not be
possible to run all the tests in one day or to hold all of the material in a single controlled
environmental chamber. Thus treatments may be confounded with undesired effects if
different treatments are applied under not quite the same experimental conditions.
The third problem that accompanies a factorial ANOVA with several main effects is
the large number of possible interactions. A two-way ANOVA has only one interaction,
A X B. A three-factor factorial has three first-order interactions (A X B, A X C, and
B X C) and one second-order interaction (A X B X C).
The fixed model is assumed to be: µijk = µ + αi + βj + γk + (αβ)ij + (αγ)ik + (βγ)jk + (αβγ)ijk
A four-factor factorial has 6 first-order interactions, four second-order interactions,
and one third-order interaction (A X B X C X D). The numbers of interactions go up
rapidly as the numbers of factors increase. The testing of their significance, and more
importantly, their interpretation becomes exceedingly complex.
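In SAS the full three-factor model is specified the same way as the two-factor one. A minimal sketch (the dataset and variable names are assumptions):

proc glm data=threeway;    /* hypothetical dataset with factors A, B, C and response Y */
   class A B C;
   model Y = A|B|C;        /* expands to A B C A*B A*C B*C A*B*C */
run; quit;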
9. 8. 1. Example of a three-way factorial ANOVA (Taken from: C.J. Monlezun.1979.
Two-dimensional plots for interpreting interactions in the three-factor analysis of
variance model. The American Statistician 33:63-69.)
The following hypothetical population means for a 3x5x2 experiment are used to
illustrate an example with no three-way interaction. A graphical technique to show
three-way interactions is discussed.
          B1    B2    B3    B4    B5
A1C1      61    39   121    79    91
A2C1      38    61    82    68    31
A3C1      81    49    41    59    61
A1C2      31    68    78   122    92
A2C2      27   103    57   127    43
A3C2     113   143    63   167   128
The lines of the mean plots for the fixed level C1 (left, figure next page) and for C2 (right)
are not parallel, indicating a two-way interaction between A and B at both levels of C.
The first-order interaction (AxB) now has two values: (AxB, C1) and (AxB, C2). The
interaction term (AxB) is the average of these.
[Figure: AB cell means plotted against the B levels, one line per A level (A1, A2, A3); left panel for C1, right panel ("AB combinations for C2") for C2. The lines are not parallel in either panel.]
If, however, the differences between levels of A are taken across the levels of B for the
two different C levels, the plot of these differences reveals no BC interaction. The lack of a
BC interaction in the differences between levels of A indicates that no ABC interaction is
present in these means, i.e. (αβγ)ijk = 0. A graphical check of whether (αβγ)ijk = 0 is
satisfied in the general situation requires a - 1 different graphs.
Phrasing these results in words, we can say that factors A and B interact in the
same way for all levels of factor C.
[Figure: the differences (A1 - A2) and (A2 - A3), computed at each B level, plotted against the B levels with one line for C1 and one for C2. In each panel the C1 and C2 lines are parallel, i.e. there is no BC interaction in the A differences and hence no ABC interaction.]