Notes On ANOVA For Comparing Multiple Algorithms

Uploaded by

bodev46157

1. Overview of the Problem

When comparing the performance of multiple algorithms, we often train and test several
algorithms on multiple datasets to evaluate their error rates. Given L algorithms and K training
sets, we induce K classifiers for each algorithm and test them on K validation sets. This results in
L groups of K error rates each. The goal is to determine if there are statistically significant
differences in error rates among these algorithms.

2. ANOVA Framework

Objective:

• Test whether there are significant differences in mean error rates across the L algorithms.

Hypotheses:

• Null Hypothesis ($H_0$): All algorithms have the same mean error rate, $\mu_1 = \mu_2 = \dots = \mu_L$.
• Alternative Hypothesis ($H_1$): At least one algorithm has a different mean error rate.

Data Assumptions:

• Error rates $X_{ij}$ (validation set $i$, algorithm $j$) are normally distributed with mean $\mu_j$ and common variance $\sigma^2$.
• Normality holds approximately because the number of validation errors follows a binomial distribution, which is close to normal when the validation set is large.
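
The normal approximation behind the second assumption can be written out as follows. This is a sketch under added assumptions: each validation set is taken to contain $N$ examples, and algorithm $j$ is taken to err independently on each example with probability $p_j$. The symbols $N$, $p_j$, and $E_{ij}$ (the error count) are introduced here for illustration and do not appear in the original notes.

```latex
E_{ij} \sim \mathrm{Binomial}(N, p_j), \qquad
X_{ij} = \frac{E_{ij}}{N} \;\approx\; \mathcal{N}\!\left(p_j,\ \frac{p_j (1 - p_j)}{N}\right)
\quad \text{for large } N .
```

Under this sketch the variance depends on $p_j$, so the common variance $\sigma^2$ is an additional assumption that is only reasonable when the error rates are not too far apart.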

3. ANOVA Procedure

a. Estimators of Variance:

1. Between-Group Variance (Estimator $\hat{\sigma}_b^2$):
   o Group Mean: $m_j = \frac{1}{K} \sum_{i=1}^{K} X_{ij}$
   o Overall Mean: $m = \frac{1}{L} \sum_{j=1}^{L} m_j$
   o Between-Group Sum of Squares (SSB): $SSB = K \sum_{j=1}^{L} (m_j - m)^2$
   o Estimator: $\hat{\sigma}_b^2 = \frac{SSB}{L - 1}$
2. Within-Group Variance (Estimator $\hat{\sigma}_w^2$):
   o Group Variance: $S_j^2 = \frac{1}{K - 1} \sum_{i=1}^{K} (X_{ij} - m_j)^2$
   o Within-Group Sum of Squares (SSW): $SSW = \sum_{j=1}^{L} \sum_{i=1}^{K} (X_{ij} - m_j)^2$
   o Estimator: $\hat{\sigma}_w^2 = \frac{SSW}{L \cdot (K - 1)}$
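
The two estimators can be sketched in a few lines of Python. This is a minimal illustration, assuming the error rates are stored as L equal-length lists (one per algorithm). The function name `variance_estimators` and the sample numbers are hypothetical and not part of the original notes.

```python
from statistics import mean


def variance_estimators(groups):
    """Return (between, within) variance estimates for L groups of K values.

    `groups` is an assumed layout: one list of K error rates per algorithm.
    """
    L = len(groups)
    K = len(groups[0])
    if any(len(g) != K for g in groups):
        raise ValueError("all groups must contain the same number of error rates")

    group_means = [mean(g) for g in groups]   # m_j
    overall_mean = mean(group_means)          # m

    # SSB = K * sum_j (m_j - m)^2
    ssb = K * sum((m_j - overall_mean) ** 2 for m_j in group_means)
    # SSW = sum_j sum_i (X_ij - m_j)^2
    ssw = sum((x - m_j) ** 2 for g, m_j in zip(groups, group_means) for x in g)

    return ssb / (L - 1), ssw / (L * (K - 1))


# Made-up error rates for two algorithms over three validation sets.
sigma_b2, sigma_w2 = variance_estimators([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
print(sigma_b2, sigma_w2)
```

The equal-length check reflects the balanced design in these notes, where every algorithm is evaluated on the same K validation sets.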
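b. F-Ratio Calculation:

• F-Ratio: $F_0 = \frac{\hat{\sigma}_b^2}{\hat{\sigma}_w^2}$

   o The F-Ratio compares the variance between groups to the variance within groups. Under $H_0$ both are estimates of $\sigma^2$, so $F_0$ should be close to 1; a large $F_0$ is evidence against $H_0$.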

c. Decision Rule:

• If $F_0$ is greater than the critical value $F_{\alpha, L-1, L(K-1)}$ from the F-distribution table, reject the null hypothesis.
• If $F_0$ does not exceed the critical value, fail to reject the null hypothesis.
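
The F-ratio and decision rule can be sketched as a small helper. This is an illustration only: the function name `anova_decision` is hypothetical, and the critical value is assumed to be looked up by the caller (for example from an F-distribution table) rather than computed here.

```python
def anova_decision(sigma_b2, sigma_w2, critical_value):
    """Return (F0, reject_h0) given the two variance estimates.

    `critical_value` stands for F_{alpha, L-1, L(K-1)}, supplied by the caller.
    """
    if sigma_w2 <= 0:
        raise ValueError("within-group variance estimate must be positive")
    f0 = sigma_b2 / sigma_w2
    return f0, f0 > critical_value


# Hypothetical estimates 6.0 and 1.5, with a table value of 3.89
# (the critical value for alpha = 0.05 with 2 and 12 degrees of freedom).
f0, reject = anova_decision(6.0, 1.5, critical_value=3.89)
print(f0, reject)
```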

4. ANOVA Table

Source of Variation   Sum of Squares   Degrees of Freedom   Mean Square              F-Ratio
Between Groups        SSB              L - 1                MSB = SSB / (L - 1)      MSB / MSW
Within Groups         SSW              L(K - 1)             MSW = SSW / (L(K - 1))
Total                 SST              L·K - 1

• Total Sum of Squares (SST): $SST = \sum_{j=1}^{L} \sum_{i=1}^{K} (X_{ij} - m)^2$, which decomposes as $SST = SSB + SSW$.
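
A quick numerical check of the table is that the sums of squares and the degrees of freedom add up, that is, SSB + SSW = SST and (L - 1) + L(K - 1) = L·K - 1. The sketch below builds the table rows for made-up data. The function name `anova_table` and the dictionary layout are assumptions for illustration, and equal-sized groups are assumed.

```python
def anova_table(groups):
    """Build one-way ANOVA table rows for L groups of K values each."""
    L, K = len(groups), len(groups[0])
    values = [x for g in groups for x in g]
    m = sum(values) / (L * K)               # overall mean
    means = [sum(g) / K for g in groups]    # group means m_j

    ssb = K * sum((mj - m) ** 2 for mj in means)
    ssw = sum((x - mj) ** 2 for g, mj in zip(groups, means) for x in g)
    sst = sum((x - m) ** 2 for x in values)

    msb, msw = ssb / (L - 1), ssw / (L * (K - 1))
    return {
        "between": {"ss": ssb, "df": L - 1, "ms": msb, "f": msb / msw},
        "within": {"ss": ssw, "df": L * (K - 1), "ms": msw},
        "total": {"ss": sst, "df": L * K - 1},
    }


# Made-up error rates for two algorithms over three validation sets.
table = anova_table([[1.0, 2.0, 3.0], [2.0, 3.0, 4.0]])
for source, row in table.items():
    print(source, row)
```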

5. Post Hoc Testing

Purpose:

• To identify which specific groups differ after finding a significant difference with ANOVA.

Common Tests:

• Least Significant Difference (LSD) Test: $t = \frac{m_i - m_j}{\sqrt{2 \hat{\sigma}_w^2 / K}}$

   o Compare pairwise means to check for significant differences, using the t-distribution with $L(K-1)$ degrees of freedom.
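
The LSD statistic is a one-line computation once the within-group estimate is known. The sketch below uses a hypothetical helper `lsd_t` and made-up inputs. The resulting value would then be compared against a critical t-value for L(K - 1) degrees of freedom, which is not computed here.

```python
import math


def lsd_t(mean_i, mean_j, sigma_w2, K):
    """t statistic for the difference between two group means (LSD test)."""
    standard_error = math.sqrt(2.0 * sigma_w2 / K)
    return (mean_i - mean_j) / standard_error


# Made-up numbers: two group means, a within-group estimate of 2.0, and K = 4.
t = lsd_t(12.0, 10.0, sigma_w2=2.0, K=4)
print(t)
```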

Multiple Comparisons Correction:

• Bonferroni Correction: Adjust the significance level to account for multiple comparisons: $\alpha_{adj} = \frac{\alpha}{T}$

   o Where $T$ is the number of comparisons.
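
As a small illustration, the adjusted level can be computed by enumerating the comparisons. This sketch assumes that every pair of algorithms is compared, so that T = L(L - 1)/2. The notes only say that T is the number of comparisons, so that choice and the helper name `bonferroni_alpha` are assumptions.

```python
from itertools import combinations


def bonferroni_alpha(alpha, algorithms):
    """Return (adjusted alpha, list of pairs) for all pairwise comparisons."""
    pairs = list(combinations(algorithms, 2))
    if not pairs:
        raise ValueError("need at least two algorithms to compare")
    return alpha / len(pairs), pairs


alpha_adj, pairs = bonferroni_alpha(0.05, ["A", "B", "C"])
print(pairs)      # the three pairs A-B, A-C, B-C
print(alpha_adj)  # 0.05 divided by the number of pairs
```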


6. Summary

ANOVA is a powerful tool for comparing multiple algorithms' performance. It assesses whether
the observed differences in error rates are statistically significant by analyzing variances within
and between groups. Significant results suggest that at least one algorithm's performance is
significantly different, warranting further investigation through post hoc tests.

Let's go through a detailed example of ANOVA with calculations and post hoc testing. Suppose
we have three algorithms and want to compare their error rates using 5-fold cross-validation.
Here is the step-by-step process:

Example Dataset

Let’s assume we have the following error rates (in percentage) for three algorithms (A, B, and C)
across 5 folds:

Algorithm   Fold 1   Fold 2   Fold 3   Fold 4   Fold 5
A           10       12       11       13       12
B           15       14       16       17       15
C           20       19       21       22       20

Step 1: Calculate Group Means and Overall Mean

Group Means:

• Algorithm A: $m_A = \frac{10+12+11+13+12}{5} = \frac{58}{5} = 11.6$
• Algorithm B: $m_B = \frac{15+14+16+17+15}{5} = \frac{77}{5} = 15.4$
• Algorithm C: $m_C = \frac{20+19+21+22+20}{5} = \frac{102}{5} = 20.4$
• Overall Mean: $m = \frac{(10+12+11+13+12)+(15+14+16+17+15)+(20+19+21+22+20)}{15} = \frac{237}{15} = 15.8$

Step 2: Calculate Sum of Squares

a. Between-Group Sum of Squares (SSB):

$$SSB = 5 \times [(11.6 - 15.8)^2 + (15.4 - 15.8)^2 + (20.4 - 15.8)^2]$$

$$SSB = 5 \times [(-4.2)^2 + (-0.4)^2 + (4.6)^2]$$

$$SSB = 5 \times [17.64 + 0.16 + 21.16] = 5 \times 38.96 = 194.8$$

b. Within-Group Sum of Squares (SSW):

• Algorithm A:

$$S_A^2 = \frac{(10 - 11.6)^2 + (12 - 11.6)^2 + (11 - 11.6)^2 + (13 - 11.6)^2 + (12 - 11.6)^2}{5 - 1}$$

$$S_A^2 = \frac{2.56 + 0.16 + 0.36 + 1.96 + 0.16}{4} = \frac{5.2}{4} = 1.3$$

• Algorithm B:

$$S_B^2 = \frac{(15 - 15.4)^2 + (14 - 15.4)^2 + (16 - 15.4)^2 + (17 - 15.4)^2 + (15 - 15.4)^2}{5 - 1}$$

$$S_B^2 = \frac{0.16 + 1.96 + 0.36 + 2.56 + 0.16}{4} = \frac{5.2}{4} = 1.3$$

• Algorithm C:

$$S_C^2 = \frac{(20 - 20.4)^2 + (19 - 20.4)^2 + (21 - 20.4)^2 + (22 - 20.4)^2 + (20 - 20.4)^2}{5 - 1}$$

$$S_C^2 = \frac{0.16 + 1.96 + 0.36 + 2.56 + 0.16}{4} = \frac{5.2}{4} = 1.3$$

Each group's sum of squared deviations is $(K - 1) \cdot S_j^2 = 4 \times 1.3 = 5.2$, so

$$SSW = 4 \times 1.3 + 4 \times 1.3 + 4 \times 1.3 = 15.6$$

Step 3: Calculate Mean Squares and F-Ratio

• Mean Square Between (MSB):

$$MSB = \frac{SSB}{L - 1} = \frac{194.8}{3 - 1} = \frac{194.8}{2} = 97.4$$

• Mean Square Within (MSW):

$$MSW = \frac{SSW}{L \cdot (K - 1)} = \frac{15.6}{3 \times (5 - 1)} = \frac{15.6}{12} = 1.3$$

• F-Ratio:

$$F_0 = \frac{MSB}{MSW} = \frac{97.4}{1.3} \approx 74.92$$

Step 4: Decision and Post Hoc Testing

a. Compare F-Ratio to Critical Value:

• Assume significance level $\alpha = 0.05$, degrees of freedom for the numerator $df_B = L - 1 = 2$, and for the denominator $df_W = L \cdot (K - 1) = 12$.
• From F-distribution tables, the critical value $F_{0.05,2,12}$ is approximately 3.89.
• Since $F_0 \approx 74.92$ is much greater than 3.89, we reject the null hypothesis.

b. Post Hoc Testing:

Least Significant Difference (LSD) Test:

• Standard Error:

$$SE = \sqrt{2 \times MSW / K} = \sqrt{2 \times 1.3 / 5} = \sqrt{0.52} \approx 0.7211$$

• Critical t-value (two-tailed, for 12 degrees of freedom, $\alpha = 0.05$) is approximately 2.18.

• Pairwise Comparisons:
   o Algorithm A vs. B:

     Difference $= |11.6 - 15.4| = 3.8$

     $t = \frac{3.8}{0.7211} \approx 5.27$

     Since 5.27 > 2.18, this difference is significant.

   o Algorithm A vs. C:

     Difference $= |11.6 - 20.4| = 8.8$

     $t = \frac{8.8}{0.7211} \approx 12.20$

     Since 12.20 > 2.18, this difference is significant.

   o Algorithm B vs. C:

     Difference $= |15.4 - 20.4| = 5.0$

     $t = \frac{5.0}{0.7211} \approx 6.93$

     Since 6.93 > 2.18, this difference is significant.

Summary: All pairwise comparisons are significant, indicating that all algorithms have
significantly different error rates.

By following these steps, we have used ANOVA to determine that there are significant
differences in error rates among the algorithms and used post hoc tests to pinpoint where those
differences lie.
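
The arithmetic in the worked example can be re-checked with a short script. This is a sketch using only the Python standard library. The variable names are illustrative, and the two critical values are the table values quoted in the example (3.89 for the F-test, 2.18 for the t-test), hard-coded rather than computed.

```python
import math

# Error rates (in percent) from the example table: one list per algorithm.
errors = {
    "A": [10, 12, 11, 13, 12],
    "B": [15, 14, 16, 17, 15],
    "C": [20, 19, 21, 22, 20],
}
L = len(errors)
K = len(errors["A"])

means = {name: sum(xs) / K for name, xs in errors.items()}   # group means m_j
m = sum(means.values()) / L                                  # overall mean

ssb = K * sum((mj - m) ** 2 for mj in means.values())
ssw = sum((x - means[name]) ** 2 for name, xs in errors.items() for x in xs)
msb = ssb / (L - 1)
msw = ssw / (L * (K - 1))
f0 = msb / msw

# Table values quoted in the example (alpha = 0.05, 2 and 12 degrees of freedom).
F_CRIT, T_CRIT = 3.89, 2.18

print(f"m = {m:.1f}, SSB = {ssb:.1f}, SSW = {ssw:.1f}")
print(f"MSB = {msb:.1f}, MSW = {msw:.2f}, F0 = {f0:.2f}, reject H0: {f0 > F_CRIT}")

# LSD post hoc test for every pair of algorithms.
se = math.sqrt(2 * msw / K)
names = list(errors)
lsd = {}
for i, a in enumerate(names):
    for b in names[i + 1:]:
        t = abs(means[a] - means[b]) / se
        lsd[(a, b)] = t
        print(f"{a} vs. {b}: t = {t:.2f}, significant: {t > T_CRIT}")
```

Running a check like this alongside a hand calculation is a cheap way to catch slips in the overall mean or the sums of squares, since every later quantity depends on them.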
