Anova
Complete Business Statistics, 6th edition
by Amir D. Aczel & Jayavel Sounderpandian
Chapter 9
Analysis of Variance
9 Analysis of Variance
Using Statistics
The Hypothesis Test of Analysis of Variance
The Theory and Computations of ANOVA
The ANOVA Table and Examples
Further Analysis
Models, Factors, and Designs
Two-Way Analysis of Variance
Blocking Designs
9 LEARNING OBJECTIVES
After studying this chapter you should be able to:
r populations
We assume that the r populations under study:
[Figure: three population distributions, Population 1, Population 2, and Population 3, with means $\mu_1$, $\mu_2$, and $\mu_3$]
That is, the test statistic in an analysis of variance is based on the ratio of
two estimators of a population variance, and is therefore based on the F
distribution, with (r-1) degrees of freedom in the numerator and (n-r)
degrees of freedom in the denominator.
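Under the null hypothesis $H_0: \mu_1 = \mu_2 = \mu_3$, the test statistic is:
$F_{(r-1,\,n-r)} = \dfrac{\text{between-sample variance estimate}}{\text{within-sample variance estimate}}$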
In any of these situations, we would not expect the sample means to all be nearly
equal. We would expect the variation among the sample means (between
sample) to be large, relative to the variation around the individual sample means
(within sample).
If the null hypothesis is false, the numerator in the test statistic is expected to be
large, relative to the denominator:
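$F_{(r-1,\,n-r)} = \dfrac{\text{between-sample variance estimate}}{\text{within-sample variance estimate}}$
[Figure: density $f(F)$ of the $F_{(3,50)}$ distribution; the critical point for $\alpha = 0.05$ is 2.79]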
Example 9-1
Randomly chosen groups of customers were served different types of coffee and asked to rate the
coffee on a scale of 0 to 100: 21 were served pure Brazilian coffee, 20 were served pure Colombian
coffee, and 22 were served pure African-grown coffee.
The resulting test statistic was F = 2.02
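$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: Not all three means are equal
$n_1 = 21$, $n_2 = 20$, $n_3 = 22$, $n = 21 + 20 + 22 = 63$, $r = 3$
The critical point for $\alpha = 0.05$ is:
$F_{(r-1,\,n-r)} = F_{(3-1,\,63-3)} = F_{(2,60)} = 3.15$
Because the test statistic 2.02 is smaller than the critical point 3.15, $H_0$ cannot be rejected at the 0.05 level of significance.
[Figure: F Distribution with 2 and 60 Degrees of Freedom; the test statistic 2.02 lies to the left of the critical point $F_{(2,60)} = 3.15$, which cuts off a right-tail area of $\alpha = 0.05$]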
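The degrees-of-freedom bookkeeping and the decision rule of Example 9-1 can be sketched in a few lines of Python. The function name `f_test_decision` is hypothetical, and the critical point 3.15 is taken from the example rather than computed.

```python
def f_test_decision(sample_sizes, f_statistic, critical_point):
    """Return (df_numerator, df_denominator, reject_h0) for a one-way ANOVA F test."""
    r = len(sample_sizes)            # number of populations (treatments)
    n = sum(sample_sizes)            # total number of observations
    df_num, df_den = r - 1, n - r    # degrees of freedom of the F statistic
    reject = f_statistic > critical_point  # right-tailed test
    return df_num, df_den, reject


# Example 9-1: coffee ratings, critical point F(2, 60) = 3.15 at alpha = 0.05
df_num, df_den, reject = f_test_decision([21, 20, 22], 2.02, 3.15)
print(df_num, df_den, reject)  # 2 60 False
```

The critical point is passed in as an argument because the standard library has no F-distribution quantile function; a statistics package or an F table would supply it.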
Treatment (i)                   Sample point (j)   Value ($x_{ij}$)
i = 1 Triangle                  1                  4
      Triangle                  2                  5
      Triangle                  3                  7
      Triangle                  4                  8
Mean of Triangles                                  6
i = 2 Square                    1                  10
      Square                    2                  11
      Square                    3                  12
      Square                    4                  13
Mean of Squares                                    11.5
i = 3 Circle                    1                  1
      Circle                    2                  2
      Circle                    3                  3
Mean of Circles                                    2
Grand mean of all data points                      6.909
[Figure: the data points on a number line, with the sample means $\bar{x}_1 = 6$, $\bar{x}_2 = 11.5$, $\bar{x}_3 = 2$ and the grand mean $\bar{\bar{x}} = 6.909$]
Error deviation: $e_{ij} = x_{ij} - \bar{x}_i$
Treatment deviation: $t_i = \bar{x}_i - \bar{\bar{x}}$
Total deviation: $Tot_{24} = x_{24} - \bar{\bar{x}} = 13 - 6.909 = 6.091$
Error deviation: $e_{24} = x_{24} - \bar{x}_2 = 13 - 11.5 = 1.5$
Treatment deviation: $t_2 = \bar{x}_2 - \bar{\bar{x}} = 11.5 - 6.909 = 4.591$
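As a check on the three deviations for the fourth square, the following minimal sketch recomputes them from the triangle, square, and circle data and confirms that the treatment and error deviations add up to the total deviation. The helper name `deviations` is hypothetical.

```python
# Data from the triangles, squares, and circles example
groups = {
    "Triangle": [4, 5, 7, 8],
    "Square": [10, 11, 12, 13],
    "Circle": [1, 2, 3],
}
all_values = [x for xs in groups.values() for x in xs]
grand_mean = sum(all_values) / len(all_values)  # 76 / 11


def deviations(group, j):
    """Return (treatment, error, total) deviations for the j-th point (1-based) of a group."""
    xs = groups[group]
    x_ij = xs[j - 1]
    group_mean = sum(xs) / len(xs)
    treatment = group_mean - grand_mean   # t_i
    error = x_ij - group_mean             # e_ij
    total = x_ij - grand_mean             # Tot_ij
    return treatment, error, total


t2, e24, tot24 = deviations("Square", 4)
print(round(t2, 3), round(e24, 3), round(tot24, 3))  # 4.591 1.5 6.091
```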
Squared Deviations
$t_i^2 + e_{ij}^2 = (\bar{x}_i - \bar{\bar{x}})^2 + (x_{ij} - \bar{x}_i)^2$
$Tot_{ij}^2 = (x_{ij} - \bar{\bar{x}})^2$
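$\sum_{i=1}^{r}\sum_{j=1}^{n_i} Tot_{ij}^2 = \sum_{i=1}^{r} n_i t_i^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i} e_{ij}^2$
$\sum_{i=1}^{r}\sum_{j=1}^{n_i} (x_{ij} - \bar{\bar{x}})^2 = \sum_{i=1}^{r} n_i (\bar{x}_i - \bar{\bar{x}})^2 + \sum_{i=1}^{r}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2$
$SST = SSTR + SSE$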
SST measures the total variation in the data set, the variation of all individual data
points from the grand mean.
SSTR measures the explained variation, the variation of individual sample means
from the grand mean. It is that part of the variation that is possibly expected, or
explained, because the data points are drawn from different populations. It is the
variation between groups of data points.
SSE measures unexplained variation, the variation within each group that cannot be
explained by possible differences between the groups.
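The three sums of squares can be computed directly from their definitions. The sketch below does so for the triangle, square, and circle data and checks that the total splits into the explained and unexplained parts; the function name `sums_of_squares` is hypothetical.

```python
def sums_of_squares(groups):
    """Return (SST, SSTR, SSE) for a list of samples, one list per treatment."""
    all_values = [x for xs in groups for x in xs]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(xs) / len(xs) for xs in groups]

    # Total variation: every point against the grand mean
    sst = sum((x - grand_mean) ** 2 for x in all_values)
    # Explained variation: each sample mean against the grand mean, weighted by n_i
    sstr = sum(len(xs) * (m - grand_mean) ** 2 for xs, m in zip(groups, means))
    # Unexplained variation: every point against its own sample mean
    sse = sum((x - m) ** 2 for xs, m in zip(groups, means) for x in xs)
    return sst, sstr, sse


data = [[4, 5, 7, 8], [10, 11, 12, 13], [1, 2, 3]]  # triangles, squares, circles
sst, sstr, sse = sums_of_squares(data)
print(round(sst, 3), round(sstr, 3), round(sse, 3))  # 176.909 159.909 17.0
```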
The degrees of freedom are additive in the same way as are the sums of squares:
df(total) = df(treatment) + df(error)
(n - 1) = (r - 1) + (n - r)
$MSTR = \dfrac{SSTR}{r-1}$
$MSE = \dfrac{SSE}{n-r}$
$MST = \dfrac{SST}{n-1}$
(Note that the additive properties of sums of squares do not extend to the mean squares: $MST \neq MSTR + MSE$.)
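To see that the mean squares are not additive, the sketch below divides each sum of squares by its degrees of freedom, using SSTR = 159.9 and SSE = 17 from the triangle, square, and circle data (r = 3, n = 11). The function name `mean_squares` is hypothetical.

```python
def mean_squares(sstr, sse, r, n):
    """Return (MSTR, MSE, MST) given the sums of squares, r treatments, n observations."""
    sst = sstr + sse            # sums of squares are additive
    mstr = sstr / (r - 1)       # treatment mean square
    mse = sse / (n - r)         # error mean square
    mst = sst / (n - 1)         # total mean square
    return mstr, mse, mst


mstr, mse, mst = mean_squares(159.9, 17.0, r=3, n=11)
print(round(mstr, 2), round(mse, 3), round(mst, 2))  # 79.95 2.125 17.69
# MSTR + MSE = 82.075, which is far from MST = 17.69
```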
$E(MSE) = \sigma^2$
$E(MSTR) = \sigma^2 + \dfrac{\sum_{i=1}^{r} n_i (\mu_i - \mu)^2}{r-1}$, which equals $\sigma^2$ when the null hypothesis is true and exceeds $\sigma^2$ when the null hypothesis is false,
where $\mu_i$ is the mean of population i and $\mu$ is the combined mean of all r populations.
That is, the expected mean square error (MSE) is simply the common population variance
(remember the assumption of equal population variances), but the expected treatment mean
square (MSTR) is the common population variance plus a term related to the variation of the
individual population means around the grand population mean.
If the null hypothesis is true so that the population means are all equal, the second term in
the E(MSTR) formulation is zero, and E(MSTR) is equal to the common population variance.
$F_{(r-1,\,n-r)} = \dfrac{MSTR}{MSE}$
Treatment   j   Value ($x_{ij}$)   $x_{ij} - \bar{x}_i$   $(x_{ij} - \bar{x}_i)^2$
Triangle    1   4                  -2                     4
Triangle    2   5                  -1                     1
Triangle    3   7                  1                      1
Triangle    4   8                  2                      4
Square      1   10                 -1.5                   2.25
Square      2   11                 -0.5                   0.25
Square      3   12                 0.5                    0.25
Square      4   13                 1.5                    2.25
Circle      1   1                  -1                     1
Circle      2   2                  0                      0
Circle      3   3                  1                      1
Sum             76                 0                      17

Treatment   $\bar{x}_i - \bar{\bar{x}}$   $(\bar{x}_i - \bar{\bar{x}})^2$   $n_i(\bar{x}_i - \bar{\bar{x}})^2$
Triangle    -0.909                        0.826281                          3.305124
Square      4.591                         21.077281                         84.309124
Circle      -4.909                        24.098281                         72.294843
Sum                                                                         159.909091
$SSE = \sum_{i=1}^{r}\sum_{j=1}^{n_i} (x_{ij} - \bar{x}_i)^2 = 17$
$SSTR = \sum_{i=1}^{r} n_i (\bar{x}_i - \bar{\bar{x}})^2 = 159.9$
$MSTR = \dfrac{SSTR}{r-1} = \dfrac{159.9}{3-1} = 79.95$
$MSE = \dfrac{SSE}{n-r} = \dfrac{17}{8} = 2.125$
$F_{(2,8)} = \dfrac{MSTR}{MSE} = \dfrac{79.95}{2.125} = 37.62$
Critical point ($\alpha = 0.01$): 8.65
$H_0$ may be rejected at the 0.01 level of significance.
ANOVA Table
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio
Treatment             SSTR = 159.9     (r-1) = 2            MSTR = 79.95   37.62
Error                 SSE = 17.0       (n-r) = 8            MSE = 2.125
Total                 SST = 176.9      (n-1) = 10           MST = 17.69
[Figure: density $f(F)$ of the $F_{(2,8)}$ distribution; the critical point 8.65 cuts off a right-tail area of 0.01]
[Figure: Template Output]
Resort            Mean Response ($\bar{x}_i$)
Guadeloupe        89
Martinique        75
Eleuthra          73
Paradise Island   91
St. Lucia         85

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square     F Ratio
Treatment             SSTR = 14208     (r-1) = 4            MSTR = 3552     7.04
Error                 SSE = 98356      (n-r) = 195          MSE = 504.39
Total                 SST = 112564     (n-1) = 199          MST = 565.65

[Figure: density $f(F)$ of the $F_{(4,200)}$ distribution; the critical point 3.41 cuts off a right-tail area of 0.01]
The resultant F ratio is larger than the critical point for $\alpha = 0.01$, so the
null hypothesis may be rejected.
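The treatment row of the resort table follows from the other entries, since SSTR = SST - SSE and each mean square is a sum of squares divided by its degrees of freedom. A minimal sketch, assuming r = 5 resorts and n = 200 respondents as in the example (the function name `complete_anova_table` is hypothetical):

```python
def complete_anova_table(sst, sse, r, n):
    """Derive the remaining one-way ANOVA table entries from SST and SSE."""
    sstr = sst - sse                       # treatment sum of squares by subtraction
    df_tr, df_err, df_tot = r - 1, n - r, n - 1
    mstr = sstr / df_tr
    mse = sse / df_err
    mst = sst / df_tot
    return {
        "SSTR": sstr,
        "df": (df_tr, df_err, df_tot),
        "MSTR": mstr,
        "MSE": mse,
        "MST": mst,
        "F": mstr / mse,
    }


table = complete_anova_table(sst=112564, sse=98356, r=5, n=200)
print(table["SSTR"], table["df"], table["MSTR"])    # 14208 (4, 195, 199) 3552.0
print(round(table["MSE"], 2), round(table["MST"], 2), round(table["F"], 2))
# 504.39 565.65 7.04
```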
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square    F Ratio
Treatment             SSTR = 879.3     (r-1) = 3            MSTR = 293.1   8.52
Error                 SSE = 18541.6    (n-r) = 539          MSE = 34.4
Total                 SST = 19420.9    (n-1) = 542          MST = 35.83
[Flowchart: ANOVA. If H0 is not rejected, stop. If H0 is rejected, proceed to further analysis: confidence intervals for population means and the Tukey pairwise comparisons test.]
The sample means are unbiased estimators of the population means.
The mean square error (MSE) is an unbiased estimator of the common
population variance.
Resort            Mean Response ($\bar{x}_i$)
Guadeloupe        89
Martinique        75
Eleuthra          73
Paradise Island   91
St. Lucia         85

SST = 112564, SSE = 98356, $n_i = 40$, $n = (5)(40) = 200$, MSE = 504.39

$\bar{x}_i \pm t_{\alpha/2}\sqrt{\dfrac{MSE}{n_i}} = \bar{x}_i \pm 1.96\sqrt{\dfrac{504.39}{40}} = \bar{x}_i \pm 6.96$

89 ± 6.96 = [82.04, 95.96]
75 ± 6.96 = [68.04, 81.96]
73 ± 6.96 = [66.04, 79.96]
91 ± 6.96 = [84.04, 97.96]
85 ± 6.96 = [78.04, 91.96]
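The five intervals can be reproduced with a short sketch. It assumes the critical value 1.96 used in the example; the function name `mean_confidence_interval` is hypothetical.

```python
import math


def mean_confidence_interval(sample_mean, mse, n_i, critical_value=1.96):
    """Interval sample_mean +/- critical_value * sqrt(MSE / n_i)."""
    half_width = critical_value * math.sqrt(mse / n_i)
    return sample_mean - half_width, sample_mean + half_width


resort_means = {
    "Guadeloupe": 89,
    "Martinique": 75,
    "Eleuthra": 73,
    "Paradise Island": 91,
    "St. Lucia": 85,
}
intervals = {
    resort: mean_confidence_interval(mean, mse=504.39, n_i=40)
    for resort, mean in resort_means.items()
}
for resort, (low, high) in intervals.items():
    print(f"{resort}: [{low:.2f}, {high:.2f}]")
```

Because every sample has the same size and the intervals share one MSE, all five have the same half-width of about 6.96.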
$T = q_\alpha\sqrt{\dfrac{MSE}{n_i}}$
There are $\binom{r}{2} = \dfrac{r!}{2!(r-2)!}$ pairs of population means to compare. For example, if $r = 3$:
$H_0: \mu_1 = \mu_2$      $H_0: \mu_1 = \mu_3$      $H_0: \mu_2 = \mu_3$
$H_1: \mu_1 \neq \mu_2$   $H_1: \mu_1 \neq \mu_3$   $H_1: \mu_2 \neq \mu_3$
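The count of pairwise comparisons grows quickly with r. The sketch below lists the pairs and checks the count against r!/(2!(r-2)!); the function name `pairwise_comparisons` is hypothetical.

```python
from itertools import combinations
from math import comb


def pairwise_comparisons(r):
    """List every pair (i, j), i < j, of the r population means to be compared."""
    return list(combinations(range(1, r + 1), 2))


pairs = pairwise_comparisons(3)
print(pairs)                              # [(1, 2), (1, 3), (2, 3)]
print(len(pairwise_comparisons(5)), comb(5, 2))  # 10 10
```

With the five resorts of the earlier example there are already ten pairs, which is why a single criterion for all pairs is convenient.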
The bars indicate the three groupings of populations with possibly equal
means: 2 and 3; 2, 3, and 5; and 1, 4, and 5.
For example:
One-factor models based on sets of resorts, types of airplanes, or
kinds of sweaters
Two-factor models based on firm and location
Three-factor models based on color, shape, and size of an ad
Fixed-Effects and Random-Effects Models
A fixed-effects model is one in which the levels of the factor under
study (the treatments) are fixed in advance. Inference is valid only
for the levels under study.
A random-effects model is one in which the levels of the factor
under study are randomly chosen from an entire population of levels
(treatments). Inference is valid for the entire population of levels.
Experimental Design
A completely-randomized design is one in which the
The effect on the population mean that can be attributed to the levels of either factor alone
is called a main effect.
An interaction effect between two factors occurs if the total effect at some pair of levels of
the two factors or treatments differs significantly from the simple addition of the two main
effects. Factors that do not interact are called additive.
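The arithmetic behind "additive" can be illustrated with a small sketch. It computes, for each cell of a table of cell means, the part left over after the grand mean and the two main effects are removed; this remainder is zero everywhere when the factors are additive. The cell means below are hypothetical, equal cell sizes are assumed, the function name `interaction_terms` is hypothetical, and the sketch shows only the departure from additivity, not a test of its significance.

```python
def interaction_terms(cell_means):
    """Cell mean minus row mean minus column mean plus grand mean, for each cell."""
    a, b = len(cell_means), len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (a * b)
    row_means = [sum(row) / b for row in cell_means]
    col_means = [sum(cell_means[i][j] for i in range(a)) / a for j in range(b)]
    return [
        [cell_means[i][j] - row_means[i] - col_means[j] + grand for j in range(b)]
        for i in range(a)
    ]


additive = [[10, 12], [14, 16]]      # hypothetical: effects simply add
interacting = [[10, 12], [14, 20]]   # hypothetical: one cell exceeds the additive value

print(interaction_terms(additive))     # [[0.0, 0.0], [0.0, 0.0]]
print(interaction_terms(interacting))  # [[1.0, -1.0], [-1.0, 1.0]]
```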
                    Attribute
Factor A: Resort    Friendship   Sports   Culture   Excitement
Guadeloupe          n11          n12      n13       n14
Martinique          n21          n22      n23       n24
Eleuthra            n31          n32      n33       n34
Paradise Island     n41          n42      n43       n44
St. Lucia           n51          n52      n53       n54

[Figure: Graphical Display of Effects. Two panels plot Rating against Resort (one line per attribute) and Rating against Attribute (one line per resort). Annotation: Eleuthra/sports interaction: combined effect greater than additive main effects.]
Sums of Squares
In a two-way ANOVA:
$x_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \epsilon_{ijk}$
SST = SSTR + SSE
SST = SSA + SSB + SS(AB) + SSE
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                           F Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)                       F = MSA/MSE
Factor B              SSB              b-1                  MSB = SSB/(b-1)                       F = MSB/MSE
Interaction           SS(AB)           (a-1)(b-1)           MS(AB) = SS(AB)/[(a-1)(b-1)]          F = MS(AB)/MSE
Error                 SSE              ab(n-1)              MSE = SSE/[ab(n-1)]
Total                 SST              abn-1
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square   F Ratio
Location              1824             2                    912           8.94
Artist                2230             2                    1115          10.93
Interaction           804              4                    201           1.97
Error                 8262             81                   102
Total                 13120            89
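The mean squares and F ratios of the Location and Artist example can be rebuilt from the sums of squares. The sketch assumes a = 3 locations, b = 3 artists, and n = 10 observations per cell; these values are not stated in the table and are inferred from the error and total degrees of freedom (81 and 89) and from the ratios of sums of squares to mean squares. The function name `two_way_anova` is hypothetical.

```python
def two_way_anova(ssa, ssb, ssab, sse, a, b, n):
    """Return degrees of freedom, mean squares, and F ratios for a two-way ANOVA."""
    df = {
        "A": a - 1,
        "B": b - 1,
        "AB": (a - 1) * (b - 1),
        "Error": a * b * (n - 1),
        "Total": a * b * n - 1,
    }
    mse = sse / df["Error"]
    ms = {"A": ssa / df["A"], "B": ssb / df["B"], "AB": ssab / df["AB"]}
    f = {source: value / mse for source, value in ms.items()}  # each tested against MSE
    return df, ms, mse, f


df, ms, mse, f = two_way_anova(1824, 2230, 804, 8262, a=3, b=3, n=10)
print(df)   # {'A': 2, 'B': 2, 'AB': 4, 'Error': 81, 'Total': 89}
print(ms, mse)
print({source: round(value, 2) for source, value in f.items()})
# {'A': 8.94, 'B': 10.93, 'AB': 1.97}
```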
Hypothesis Tests
[Figure: two F density plots. First panel: F Distribution with 2 and 81 Degrees of Freedom, critical point $F_{0.01} = 4.88$ ($\alpha = 0.01$). Second panel: critical point $F_{0.05} = 2.48$ ($\alpha = 0.05$).]
$T = q_\alpha\sqrt{\dfrac{MSE}{bn}}$
where the degrees of freedom of the q distribution are now a and ab(n-1). Note
that MSE is divided by bn.
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                                 F Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)                             F = MSA/MSE
Factor B              SSB              b-1                  MSB = SSB/(b-1)                             F = MSB/MSE
Factor C              SSC              c-1                  MSC = SSC/(c-1)                             F = MSC/MSE
Interaction (AB)      SS(AB)           (a-1)(b-1)           MS(AB) = SS(AB)/[(a-1)(b-1)]                F = MS(AB)/MSE
Interaction (AC)      SS(AC)           (a-1)(c-1)           MS(AC) = SS(AC)/[(a-1)(c-1)]                F = MS(AC)/MSE
Interaction (BC)      SS(BC)           (b-1)(c-1)           MS(BC) = SS(BC)/[(b-1)(c-1)]                F = MS(BC)/MSE
Interaction (ABC)     SS(ABC)          (a-1)(b-1)(c-1)      MS(ABC) = SS(ABC)/[(a-1)(b-1)(c-1)]         F = MS(ABC)/MSE
Error                 SSE              abc(n-1)             MSE = SSE/[abc(n-1)]
Total                 SST              abcn-1
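As with the sums of squares, the degrees of freedom in the three-way table add up to the total. The sketch below computes each entry and checks the sum; the factor sizes used (a = 2, b = 3, c = 4, n = 5) are hypothetical and the function name `three_way_df` is hypothetical.

```python
def three_way_df(a, b, c, n):
    """Degrees of freedom for each row of a three-way ANOVA table."""
    return {
        "A": a - 1,
        "B": b - 1,
        "C": c - 1,
        "AB": (a - 1) * (b - 1),
        "AC": (a - 1) * (c - 1),
        "BC": (b - 1) * (c - 1),
        "ABC": (a - 1) * (b - 1) * (c - 1),
        "Error": a * b * c * (n - 1),
        "Total": a * b * c * n - 1,
    }


df3 = three_way_df(a=2, b=3, c=4, n=5)   # hypothetical factor sizes
parts = sum(value for source, value in df3.items() if source != "Total")
print(df3["Error"], df3["Total"], parts)  # 96 119 119
```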
Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                      F Ratio
Factor A              SSA              a-1                  MSA = SSA/(a-1)                  F = MSA/MS(AB)
Factor B              SSB              b-1                  MSB = SSB/(b-1)                  F = MSB/MS(AB)
Error                 SS(AB)           (a-1)(b-1)           MS(AB) = SS(AB)/[(a-1)(b-1)]
Total                 SST              ab-1
$x_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$
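Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square                   F Ratio
Blocks                SSBL             n-1                  MSBL = SSBL/(n-1)             F = MSBL/MSE
Treatments            SSTR             r-1                  MSTR = SSTR/(r-1)             F = MSTR/MSE
Error                 SSE              (n-1)(r-1)           MSE = SSE/[(n-1)(r-1)]
Total                 SST              nr-1

Source of Variation   Sum of Squares   df    Mean Square   F Ratio
Blocks                2750             39    70.51         0.69
Treatments            2640             2     1320          12.93
Error                 7960             78    102.05
Total                 13350            119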
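The entries of the block-design example can be checked with the sketch below. It assumes n = 40 blocks and r = 3 treatments, values inferred from the degrees of freedom 39 and 2 rather than stated in the table; the function name `block_design_anova` is hypothetical.

```python
def block_design_anova(ssbl, sstr, sse, n_blocks, r):
    """Mean squares and F ratios for a randomized complete block design."""
    df_bl = n_blocks - 1
    df_tr = r - 1
    df_err = (n_blocks - 1) * (r - 1)
    msbl, mstr, mse = ssbl / df_bl, sstr / df_tr, sse / df_err
    return {
        "df": (df_bl, df_tr, df_err, n_blocks * r - 1),
        "MSBL": msbl,
        "MSTR": mstr,
        "MSE": mse,
        "F_blocks": msbl / mse,
        "F_treatments": mstr / mse,
    }


result = block_design_anova(ssbl=2750, sstr=2640, sse=7960, n_blocks=40, r=3)
print(result["df"])                                          # (39, 2, 78, 119)
print(round(result["MSBL"], 2), result["MSTR"], round(result["MSE"], 2))
# 70.51 1320.0 102.05
print(round(result["F_blocks"], 2), round(result["F_treatments"], 2))
# 0.69 12.93
```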