Analysis of Variance
15.1 Introduction
Analysis of variance compares two or more
populations of interval data.
Specifically, we are interested in determining
whether differences exist between the population
means.
The procedure works by analyzing the sample
variances.
Weekly sales by marketing strategy (see file Xm15-01)

Convenience   Quality   Price
529           804       672
658           630       531
793           774       443
514           717       596
663           679       602
719           604       502
711           620       659
606           697       689
461           706       675
529           615       512
498           492       691
663           719       733
604           787       698
495           699       776
485           572       561
557           523       572
353           584       469
557           634       581
542           580       679
614           624       532
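Notation
Independent samples are drawn from k populations (treatments).
$x_{ij}$ denotes the i-th observation of the j-th sample; for example, $x_{11}$ is the
first observation of the first sample and $x_{22}$ is the second observation of the
second sample.

              Sample 1         Sample 2         ...   Sample k
              $x_{11}$         $x_{12}$               $x_{1k}$
              $x_{21}$         $x_{22}$               $x_{2k}$
              ...              ...                    ...
              $x_{n_1,1}$      $x_{n_2,2}$            $x_{n_k,k}$
Sample size   $n_1$            $n_2$                  $n_k$
Sample mean   $\bar{x}_1$      $\bar{x}_2$            $\bar{x}_k$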
Terminology
In the context of this problem:
Response variable: weekly sales.
Responses: actual sale values.
Experimental unit: weeks in the three cities when we
record sales figures.
Factor: the criterion by which we classify the populations
(the treatments). In this problem the factor is the marketing
strategy.
Factor levels: the population (treatment) names. In this
problem the factor levels are the marketing strategies.
Graphical demonstration:
Employing two types of variability
[Figure: two plots of Treatment 1, Treatment 2, and Treatment 3. The sample means are
$\bar{x}_1 = 10$, $\bar{x}_2 = 15$, and $\bar{x}_3 = 20$ in both plots; the plots differ in
within-sample variability.]
The sample means are the same as before,
but the larger within-sample variability
makes it harder to draw a conclusion
about the population means.
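\[ SST = \sum_{j=1}^{k} n_j (\bar{x}_j - \bar{\bar{x}})^2 \]
where the grand mean is
\[ \bar{\bar{x}} = \frac{n_1\bar{x}_1 + n_2\bar{x}_2 + \dots + n_k\bar{x}_k}{n_1 + n_2 + \dots + n_k} \]
For the weekly sales data:
\[ SST = 20(577.55 - 613.07)^2 + 20(653.00 - 613.07)^2 + 20(608.65 - 613.07)^2 = 57{,}512.23 \]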
\[ SSE = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (x_{ij} - \bar{x}_j)^2 \]
\[ MST = \frac{SST}{k-1} = \frac{57{,}512.23}{3-1} = 28{,}756.12 \]
Calculation of MSE
Mean Square for Error
\[ MSE = \frac{SSE}{n-k} = \frac{506{,}983.50}{60-3} = 8{,}894.45 \]
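The sums of squares and mean squares above can be reproduced directly from the weekly sales data. The following is a minimal sketch using only the Python standard library; the variable names (`samples`, `grand_mean`, and so on) are illustrative choices and do not come from the original example.

```python
from statistics import mean

# Weekly sales for the three marketing strategies (file Xm15-01).
samples = {
    "Convenience": [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
                    498, 663, 604, 495, 485, 557, 353, 557, 542, 614],
    "Quality": [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
                492, 719, 787, 699, 572, 523, 584, 634, 580, 624],
    "Price": [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
              691, 733, 698, 776, 561, 572, 469, 581, 679, 532],
}

k = len(samples)                                  # number of treatments
n = sum(len(x) for x in samples.values())         # total sample size
grand_mean = sum(sum(x) for x in samples.values()) / n
sample_means = {name: mean(x) for name, x in samples.items()}

# Between-treatments variation: SST = sum_j n_j (xbar_j - grand mean)^2
sst = sum(len(x) * (sample_means[name] - grand_mean) ** 2
          for name, x in samples.items())

# Within-treatments variation: SSE = sum_j sum_i (x_ij - xbar_j)^2
sse = sum((value - sample_means[name]) ** 2
          for name, x in samples.items() for value in x)

mst = sst / (k - 1)   # mean square for treatments
mse = sse / (n - k)   # mean square for error
f_stat = mst / mse

print(f"SST = {sst:.2f}, SSE = {sse:.2f}")
print(f"MST = {mst:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
```

Working from the raw observations rather than from rounded means avoids accumulating rounding error in SST.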
Required Conditions:
1. The populations tested
are normally distributed.
2. The variances of all the
populations tested are
equal.
\[ F = \frac{MST}{MSE} = \frac{28{,}756.12}{8{,}894.45} = 3.23 \]
with the following degrees of freedom: $\nu_1 = k - 1 = 2$ and $\nu_2 = n - k = 57$.
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_1$: At least two means differ
Test statistic: $F = \dfrac{MST}{MSE}$
R.R.: $F > F_{\alpha,\,k-1,\,n-k}$
The F test
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_1$: At least two means differ
Test statistic: $F = \dfrac{MST}{MSE} = \dfrac{28{,}756.12}{8{,}894.45} = 3.23$
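The decision for this test can be sketched in a few lines. The sketch below assumes the summary values from the example (MST = 28,756.12, MSE = 8,894.45, k = 3, n = 60) and the critical value 3.16 reported in the spreadsheet output. For the p-value it relies on a special case that holds only when the numerator degrees of freedom equal 2: the upper tail of the F distribution is then $(1 + 2f/\nu_2)^{-\nu_2/2}$. For other degrees of freedom a statistics library or an F table would be needed.

```python
# Summary values taken from the worked example.
mst, mse = 28756.12, 8894.45
k, n = 3, 60

f_stat = mst / mse
df1, df2 = k - 1, n - k

# Closed-form upper-tail probability, valid only for df1 == 2.
if df1 != 2:
    raise ValueError("closed form below requires 2 numerator degrees of freedom")
p_value = (1 + df1 * f_stat / df2) ** (-df2 / 2)

f_crit = 3.16  # F(.05, 2, 57), as reported in the spreadsheet output
reject_h0 = f_stat > f_crit

print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```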
p-value = FDIST(3.23,2,57) = .0467
[Figure: distribution plot accompanying the test.]
SUMMARY
Groups        Count   Sum     Average   Variance
Convenience   20      11551   577.55    10775.00
Quality       20      13060   653.00    7238.11
Price         20      12173   608.65    8670.24

ANOVA
Source of Variation   SS       df   MS      F      P-value   F crit
Between Groups        57512    2    28756   3.23   0.0468    3.16
Within Groups         506984   57   8894
Total                 564496   59
[Figure: two response plots. In the first, the response is shown for Treatments 1, 2,
and 3, the levels of a single factor. In the second, the response is shown for
Levels 1 to 3 of Factor A and Levels 1 and 2 of Factor B.]
Random effects
If the levels included in our analysis represent a random
sample of all the possible levels, we have a random-effect
ANOVA.
The conclusion of the random-effect ANOVA applies to all the
levels (not only those studied).
Randomized Blocks
Block all the observations with some
commonality across treatments
[Figure: observations for Treatments 1 to 4 grouped into Block 1, Block 2, and Block 3.]
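                           Treatment
Block            1                2                ...   k                Block mean
1                $x_{11}$         $x_{12}$         ...   $x_{1k}$         $\bar{x}[B]_1$
2                $x_{21}$         $x_{22}$         ...   $x_{2k}$         $\bar{x}[B]_2$
.
.
.
b                $x_{b1}$         $x_{b2}$         ...   $x_{bk}$         $\bar{x}[B]_b$
Treatment mean   $\bar{x}[T]_1$   $\bar{x}[T]_2$   ...   $\bar{x}[T]_k$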
SS(Total) = SST + SSB + SSE
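Here $x_{ij}$ is the observation in block i under treatment j, $\bar{x}[T]_j$ is the mean of
treatment j, $\bar{x}[B]_i$ is the mean of block i, and $\bar{\bar{x}}$ is the grand mean.

Total sum of squares
\[ SS(Total) = (x_{11} - \bar{\bar{x}})^2 + (x_{21} - \bar{\bar{x}})^2 + \dots + (x_{12} - \bar{\bar{x}})^2 + (x_{22} - \bar{\bar{x}})^2 + \dots + (x_{1k} - \bar{\bar{x}})^2 + (x_{2k} - \bar{\bar{x}})^2 + \dots \]

Sum of squares for treatments
\[ SST = b(\bar{x}[T]_1 - \bar{\bar{x}})^2 + b(\bar{x}[T]_2 - \bar{\bar{x}})^2 + \dots + b(\bar{x}[T]_k - \bar{\bar{x}})^2 \]

Sum of squares for blocks
\[ SSB = k(\bar{x}[B]_1 - \bar{\bar{x}})^2 + k(\bar{x}[B]_2 - \bar{\bar{x}})^2 + \dots + k(\bar{x}[B]_b - \bar{\bar{x}})^2 \]

Sum of squares for error
\[ SSE = (x_{11} - \bar{x}[T]_1 - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{21} - \bar{x}[T]_1 - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots \]
\[ \quad + (x_{12} - \bar{x}[T]_2 - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{22} - \bar{x}[T]_2 - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots \]
\[ \quad + (x_{1k} - \bar{x}[T]_k - \bar{x}[B]_1 + \bar{\bar{x}})^2 + (x_{2k} - \bar{x}[T]_k - \bar{x}[B]_2 + \bar{\bar{x}})^2 + \dots \]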
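The partition of the total variation in a randomized block design can be illustrated numerically. The sketch below uses a small hypothetical data set (3 blocks by 4 treatments, values invented purely for illustration) and checks that SS(Total) = SST + SSB + SSE.

```python
# Hypothetical data, NOT from the chapter: rows are blocks (b = 3),
# columns are treatments (k = 4).
data = [
    [10.0, 12.0, 11.0, 14.0],
    [8.0, 11.0, 9.0, 12.0],
    [13.0, 15.0, 13.0, 17.0],
]

b = len(data)        # number of blocks
k = len(data[0])     # number of treatments
grand_mean = sum(sum(row) for row in data) / (b * k)

treatment_means = [sum(row[j] for row in data) / b for j in range(k)]
block_means = [sum(row) / k for row in data]

ss_total = sum((x - grand_mean) ** 2 for row in data for x in row)
sst = b * sum((m - grand_mean) ** 2 for m in treatment_means)
ssb = k * sum((m - grand_mean) ** 2 for m in block_means)
sse = sum(
    (data[i][j] - treatment_means[j] - block_means[i] + grand_mean) ** 2
    for i in range(b)
    for j in range(k)
)

print(f"SS(Total) = {ss_total:.4f}")
print(f"SST + SSB + SSE = {sst + ssb + sse:.4f}")
```

Removing the block variation (SSB) from what would otherwise be counted as error is what makes the blocked design more sensitive to treatment differences.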
Mean Squares
To perform hypothesis tests for treatments and blocks we
need
Mean square for treatments
\[ MST = \frac{SST}{k-1} \]
Mean square for blocks
\[ MSB = \frac{SSB}{b-1} \]
Mean square for error
\[ MSE = \frac{SSE}{n-k-b+1} \]
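Test statistic for treatments
\[ F = \frac{MST}{MSE} \]
Testing the mean response for treatments
R.R.: $F > F_{\alpha,\,k-1,\,n-k-b+1}$
Test statistic for blocks
\[ F = \frac{MSB}{MSE} \]
Testing the mean response for blocks
R.R.: $F > F_{\alpha,\,b-1,\,n-k-b+1}$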
ANOVA
Source of Variation   SS       df   MS       F       P-value   F crit
Blocks                3848.7   24   160.36   10.11   0.0000    1.67
Treatments            196.0    3    65.32    4.12    0.0094    2.73
Error                 1142.6   72   15.87
Total                 5187.2   99
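The mean squares and F statistics in this output follow from the SS and df columns. The sketch below is a quick arithmetic check. It assumes the row with 24 degrees of freedom is the blocks row and the row with 3 degrees of freedom is the treatments row, which is consistent with four treatments and an error df of n - k - b + 1 = 72.

```python
# SS and df values copied from the ANOVA output; row labels are an assumption.
ss = {"blocks": 3848.7, "treatments": 196.0, "error": 1142.6}
df = {"blocks": 24, "treatments": 3, "error": 72}

k = df["treatments"] + 1   # number of treatments
b = df["blocks"] + 1       # number of blocks
n = k * b                  # one observation per block/treatment cell

# The error degrees of freedom should equal n - k - b + 1.
error_df_ok = df["error"] == n - k - b + 1

ms = {source: ss[source] / df[source] for source in ss}
f_treatments = ms["treatments"] / ms["error"]
f_blocks = ms["blocks"] / ms["error"]

f_crit = {"treatments": 2.73, "blocks": 1.67}   # from the output
reject_treatments = f_treatments > f_crit["treatments"]
reject_blocks = f_blocks > f_crit["blocks"]

print(f"F(treatments) = {f_treatments:.2f}, reject H0: {reject_treatments}")
print(f"F(blocks) = {f_blocks:.2f}, reject H0: {reject_blocks}")
```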
Chapter 15 - continued
Analysis of Variance
City     Marketing strategy   Advertising medium
City 1   Convenience          TV
City 2   Convenience          Newspaper
City 3   Quality              TV
City 4   Quality              Newspaper
City 5   Price                TV
City 6   Price                Newspaper

See file Xm15-03.
The p-value =.0452.
We conclude that there is evidence that differences
exist in the mean weekly sales among the six cities.
                      Factor A: Marketing strategy
Factor B:
Advertising media     Convenience    Quality        Price
TV                    City 1 sales   City 3 sales   City 5 sales
Newspapers            City 2 sales   City 4 sales   City 6 sales

Calculations are based on the sum of squares for factor A, SS(A).
Graphical description of the possible relationships between factors A and B.
[Figure: four panels, each plotting the mean response against the levels of factor A,
with one line per level of factor B. Recoverable panel titles:
1. Difference between the levels of factor A, and difference between the levels of
   factor B; no interaction.
2. Difference between the levels of factor A; no difference between the levels of
   factor B (the lines for Level 1 and Level 2 of factor B coincide).
3. Interaction.]
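Sums of squares
\[ SS(A) = rb \sum_{i=1}^{a} (\bar{x}[A]_i - \bar{\bar{x}})^2 \]
\[ SS(B) = ra \sum_{j=1}^{b} (\bar{x}[B]_j - \bar{\bar{x}})^2 \]
\[ SS(AB) = r \sum_{i=1}^{a} \sum_{j=1}^{b} (\bar{x}[AB]_{ij} - \bar{x}[A]_i - \bar{x}[B]_j + \bar{\bar{x}})^2 \]
\[ SSE = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{r} (x_{ijk} - \bar{x}[AB]_{ij})^2 \]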
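Factor A:
\[ F = \frac{MS(A)}{MSE} = \frac{SS(A)/(a-1)}{SSE/(n-ab)} \]
Rejection region: $F > F_{\alpha,\,a-1,\,n-ab}$
Factor B:
\[ F = \frac{MS(B)}{MSE} = \frac{SS(B)/(b-1)}{SSE/(n-ab)} \]
Rejection region: $F > F_{\alpha,\,b-1,\,n-ab}$
Interaction:
\[ F = \frac{MS(AB)}{MSE} = \frac{SS(AB)/[(a-1)(b-1)]}{SSE/(n-ab)} \]
Rejection region: $F > F_{\alpha,\,(a-1)(b-1),\,n-ab}$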
Required conditions:
1. The response distribution is normal.
2. The treatment variances are equal.
3. The samples are independent.
Weekly sales by marketing strategy and advertising medium
(rows 1 to 10: TV; rows 11 to 20: Newspapers)

              Convenience   Quality   Price
TV            491           677       575
              712           627       614
              558           590       706
              447           632       484
              479           683       478
              624           760       650
              546           690       583
              444           548       536
              582           579       579
              672           644       795
Newspapers    464           689       803
              559           650       584
              759           704       525
              557           652       498
              528           576       812
              670           836       565
              534           628       708
              657           798       546
              557           497       616
              474           841       587
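The two-factor sums of squares can be computed from these 60 observations with a short script. The sketch below uses only the standard library. It assumes that the first ten values under each strategy belong to the TV cities and the next ten to the newspaper cities; the names `cells`, `levels_a`, and `levels_b` are illustrative. The closed-form p-value is used only for the two tests with 2 numerator degrees of freedom, where the upper tail of the F distribution equals $(1 + 2f/\nu_2)^{-\nu_2/2}$.

```python
from statistics import mean

# cells[(strategy, medium)] holds the r = 10 weekly sales for one city.
cells = {
    ("Convenience", "TV"): [491, 712, 558, 447, 479, 624, 546, 444, 582, 672],
    ("Convenience", "Newspapers"): [464, 559, 759, 557, 528, 670, 534, 657, 557, 474],
    ("Quality", "TV"): [677, 627, 590, 632, 683, 760, 690, 548, 579, 644],
    ("Quality", "Newspapers"): [689, 650, 704, 652, 576, 836, 628, 798, 497, 841],
    ("Price", "TV"): [575, 614, 706, 484, 478, 650, 583, 536, 579, 795],
    ("Price", "Newspapers"): [803, 584, 525, 498, 812, 565, 708, 546, 616, 587],
}
levels_a = ["Convenience", "Quality", "Price"]   # factor A: marketing strategy
levels_b = ["TV", "Newspapers"]                  # factor B: advertising media
a, b, r = len(levels_a), len(levels_b), 10
n = a * b * r

grand = mean(v for vals in cells.values() for v in vals)
mean_a = {i: mean(v for j in levels_b for v in cells[(i, j)]) for i in levels_a}
mean_b = {j: mean(v for i in levels_a for v in cells[(i, j)]) for j in levels_b}
mean_ab = {key: mean(vals) for key, vals in cells.items()}

ss_a = r * b * sum((mean_a[i] - grand) ** 2 for i in levels_a)
ss_b = r * a * sum((mean_b[j] - grand) ** 2 for j in levels_b)
ss_ab = r * sum(
    (mean_ab[(i, j)] - mean_a[i] - mean_b[j] + grand) ** 2
    for i in levels_a
    for j in levels_b
)
sse = sum((v - mean_ab[key]) ** 2 for key, vals in cells.items() for v in vals)

df_error = n - a * b
mse = sse / df_error
f_a = (ss_a / (a - 1)) / mse
f_b = (ss_b / (b - 1)) / mse
f_ab = (ss_ab / ((a - 1) * (b - 1))) / mse


def p_value_df1_2(f, df2):
    """Upper-tail F probability; valid only for 2 numerator degrees of freedom."""
    return (1 + 2 * f / df2) ** (-df2 / 2)


p_a = p_value_df1_2(f_a, df_error)     # marketing: a - 1 = 2
p_ab = p_value_df1_2(f_ab, df_error)   # interaction: (a - 1)(b - 1) = 2

print(f"SS(A) = {ss_a:.1f}, SS(B) = {ss_b:.1f}, SS(AB) = {ss_ab:.1f}, SSE = {sse:.1f}")
print(f"F(A) = {f_a:.2f}, F(B) = {f_b:.2f}, F(AB) = {f_ab:.2f}")
```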
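ANOVA
Source of Variation           SS         df   MS        F      P-value   F crit
Media (factor B)              13172.0    1    13172.0   1.42   0.2387    4.02
Marketing (factor A)          98838.6    2    49419.3   5.33   0.0077    3.17
Interaction (Marketing*Media) 1609.6     2    804.8     0.09   0.9171    3.17
Error                         501136.7   54   9280.3
Total                         614757.0   59

Factor A = Marketing strategy
F = MS(A)/MSE = MS(Marketing)/MSE = 5.33
Fcritical = $F_{\alpha,\,a-1,\,n-ab} = F_{.05,\,3-1,\,60-(3)(2)}$ = 3.17 (p-value = .0077)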
Factor B = Advertising media
F = MS(B)/MSE = MS(Media)/MSE = 1.42
Fcritical = $F_{\alpha,\,b-1,\,n-ab} = F_{.05,\,2-1,\,60-(3)(2)}$ = 4.02 (p-value = .2387)
Interaction AB = Marketing*Media
F = MS(AB)/MSE = MS(Marketing*Media)/MSE = .09
Fcritical = $F_{\alpha,\,(a-1)(b-1),\,n-ab} = F_{.05,\,(3-1)(2-1),\,60-(3)(2)}$ = 3.17 (p-value = .9171)
\[ LSD = t_{\alpha/2} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}, \qquad d.f. = n - k \]
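As an illustration, the LSD can be evaluated for the one-way weekly sales example (MSE = 8,894.45, 20 observations per sample, n - k = 57). The critical value `t_crit = 2.0025` is an assumed approximation of $t_{.025,57}$ taken from a t table; it does not appear in the original material.

```python
from itertools import combinations
from math import sqrt

# Summary values from the one-way example.
means = {"Convenience": 577.55, "Quality": 653.00, "Price": 608.65}
sizes = {"Convenience": 20, "Quality": 20, "Price": 20}
mse = 8894.45

t_crit = 2.0025  # ASSUMED table value, approximately t(.025, 57)


def lsd(n_i, n_j):
    """Least significant difference for two samples of sizes n_i and n_j."""
    return t_crit * sqrt(mse * (1 / n_i + 1 / n_j))


# Pairs whose sample means differ by more than the LSD.
significant = [
    (first, second)
    for first, second in combinations(means, 2)
    if abs(means[first] - means[second]) > lsd(sizes[first], sizes[second])
]

print(f"LSD = {lsd(20, 20):.2f}")
print("Pairs that differ:", significant)
```

Because no adjustment is made for the number of comparisons, this procedure inflates the experimentwise Type I error as the number of populations grows.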
Bonferroni Adjustment
The procedure:
Compute the number of pairwise comparisons (C)
[C = k(k-1)/2], where k is the number of populations.
Set $\alpha = \alpha_E / C$, where $\alpha_E$ is the true probability of making at
least one Type I error (called the experimentwise Type I error).
We can conclude that $\mu_i$ and $\mu_j$ differ (at the $\alpha_E / C$ significance
level) if
\[ |\bar{x}_i - \bar{x}_j| > t_{\alpha_E/(2C)} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}, \qquad d.f. = n - k \]
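For the weekly sales data (k = 3, so C = 3), the Bonferroni critical difference is
\[ t_{\alpha_E/(2C)} \sqrt{MSE\left(\frac{1}{n_i} + \frac{1}{n_j}\right)} = 2.467 \sqrt{8894\left(\frac{1}{20} + \frac{1}{20}\right)} = 73.54 \]
and the differences between the sample means are
\[ |\bar{x}_1 - \bar{x}_2| = |577.55 - 653.00| = 75.45 \]
\[ |\bar{x}_1 - \bar{x}_3| = |577.55 - 608.65| = 31.10 \]
\[ |\bar{x}_2 - \bar{x}_3| = |653.00 - 608.65| = 44.35 \]
Only the first difference exceeds the critical difference.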
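The Bonferroni procedure can be sketched as follows for the weekly sales example. The critical value 2.467 and MSE = 8894 are the figures used in the calculation above; the experimentwise level `alpha_e = 0.05` and the variable names are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

means = {"Convenience": 577.55, "Quality": 653.00, "Price": 608.65}
n_per_sample = 20
mse = 8894

k = len(means)
num_comparisons = k * (k - 1) // 2       # C = k(k-1)/2
alpha_e = 0.05                           # ASSUMED experimentwise Type I error
alpha = alpha_e / num_comparisons        # per-comparison significance level

t_crit = 2.467  # t(alpha/2, 57), as used in the worked example
critical_difference = t_crit * sqrt(mse * (1 / n_per_sample + 1 / n_per_sample))

# Pairs of populations whose means are declared different.
differ = [
    (first, second)
    for first, second in combinations(means, 2)
    if abs(means[first] - means[second]) > critical_difference
]

print(f"C = {num_comparisons}, alpha per comparison = {alpha:.4f}")
print("Pairs that differ:", differ)
```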
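\[ \omega = q_{\alpha}(k, \nu) \sqrt{\frac{MSE}{n_g}} \]
k = the number of samples
$\nu$ = degrees of freedom = n - k
$n_g$ = number of observations per sample
(recall, all the sample sizes are the same)
$\alpha$ = significance level
$q_{\alpha}(k, \nu)$ = a critical value obtained from the studentized range table
More generally, $n_g$ can be taken as the harmonic mean of the sample sizes:
\[ n_g = \frac{k}{1/n_1 + 1/n_2 + \dots + 1/n_k} \]
For the weekly sales data:
\[ \omega = q_{.05}(3, 57) \sqrt{\frac{MSE}{n_g}} = q_{.05}(3, 57) \sqrt{\frac{8894}{20}} = 71.70 \]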
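The critical difference $\omega$ can be checked with a short calculation. In the sketch below, `q_crit = 3.40` is an assumed studentized range table value for $q_{.05}(3, 57)$ (tables usually list 60 degrees of freedom as the nearest entry); it is not stated in the original material. The helper `harmonic_sample_size` is a hypothetical name for the $n_g$ formula.

```python
from itertools import combinations
from math import sqrt


def harmonic_sample_size(sizes):
    """n_g = k / (1/n_1 + ... + 1/n_k); equals the common size when all sizes match."""
    return len(sizes) / sum(1 / n for n in sizes)


means = {"Convenience": 577.55, "Quality": 653.00, "Price": 608.65}
sizes = [20, 20, 20]
mse = 8894

q_crit = 3.40  # ASSUMED table value for q(.05; k = 3, nu = 57)

n_g = harmonic_sample_size(sizes)
omega = q_crit * sqrt(mse / n_g)

# Pairs whose sample means differ by more than omega.
differ = [
    (first, second)
    for first, second in combinations(means, 2)
    if abs(means[first] - means[second]) > omega
]

print(f"n_g = {n_g:.1f}, omega = {omega:.2f}")
print("Pairs that differ:", differ)
```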
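[Table residue: columns headed Population and Mean, with the difference
$\bar{x}_{max} - \bar{x}_{min}$ between the largest and smallest sample means.]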