Risk Bounds for the Majority Vote: From a PAC-Bayesian Analysis to a Learning Algorithm

Germain, Pascal; Lacasse, Alexandre; Laviolette, François; Marchand, Mario; Roy, Jean-Francis

arXiv:1503.08329 (stat)

[Submitted on 28 Mar 2015 (v1), last revised 28 Jul 2015 (this version, v2)]

Abstract: We propose an extensive analysis of the behavior of majority votes in binary classification. In particular, we introduce a risk bound for majority votes, called the C-bound, that takes into account the average quality of the voters and their average disagreement. We also propose an extensive PAC-Bayesian analysis that shows how the C-bound can be estimated from various observations contained in the training data. The analysis intends to be self-contained and can be used as introductory material to PAC-Bayesian statistical learning theory. It starts from a general PAC-Bayesian perspective and ends with uncommon PAC-Bayesian bounds. Some of these bounds contain no Kullback-Leibler divergence and others allow kernel functions to be used as voters (via the sample compression setting). Finally, out of the analysis, we propose the MinCq learning algorithm that basically minimizes the C-bound. MinCq reduces to a simple quadratic program. Aside from being theoretically grounded, MinCq achieves state-of-the-art performance, as shown in our extensive empirical comparison with both AdaBoost and the Support Vector Machine.
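
To make the abstract's description of the C-bound concrete, the following is a minimal sketch that evaluates it on a labeled sample. It assumes the commonly stated form of the bound, 1 - (1 - 2R)^2 / (1 - 2d), where R is the Gibbs risk (average voter error) and d is the expected disagreement between two voters drawn from the weighting; the abstract itself does not spell out this formula. The function name `empirical_c_bound` and the toy data are hypothetical, and the value computed here is only the bound evaluated on the sample, not one of the PAC-Bayesian guarantees discussed in the paper.

```python
def empirical_c_bound(votes, labels, weights):
    """Evaluate the C-bound on a sample (illustrative sketch).

    votes[j][i] is the output in {-1, +1} of voter j on example i,
    labels[i] is the true label in {-1, +1}, and weights (summing to 1)
    is the distribution over voters.
    """
    m = len(labels)
    # Margin of the weighted vote on each example: y_i * sum_j q_j * h_j(x_i).
    margins = [
        labels[i] * sum(q * h[i] for q, h in zip(weights, votes))
        for i in range(m)
    ]
    mu1 = sum(margins) / m                    # first moment of the margin
    mu2 = sum(mg * mg for mg in margins) / m  # second moment of the margin
    gibbs_risk = (1 - mu1) / 2                # average quality of the voters
    disagreement = (1 - mu2) / 2              # average disagreement between voters
    if mu1 <= 0:
        # The assumed form of the bound requires a Gibbs risk below 1/2.
        raise ValueError("C-bound requires a positive first margin moment")
    # Same quantity as 1 - (1 - 2R)^2 / (1 - 2d), written with margin moments.
    c_bound = 1 - mu1 ** 2 / mu2
    # Risk of the majority vote itself; a zero margin is counted as an error.
    mv_risk = sum(1 for mg in margins if mg <= 0) / m
    return gibbs_risk, disagreement, c_bound, mv_risk


# Toy data (hypothetical): three voters, each wrong on a different example.
votes = [
    [1, 1, -1],
    [1, -1, 1],
    [-1, 1, 1],
]
labels = [1, 1, 1]
weights = [1 / 3, 1 / 3, 1 / 3]
gibbs_risk, disagreement, c_bound, mv_risk = empirical_c_bound(votes, labels, weights)
print(gibbs_risk, disagreement, c_bound, mv_risk)
```

In this toy case every voter errs on one third of the examples, yet the voters disagree enough that the majority vote makes no mistake. The sketch reflects that: the Gibbs risk stays at about 1/3 while the C-bound drops to about 0, which is the kind of effect the abstract attributes to accounting for disagreement.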

Comments:	Published in JMLR
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:1503.08329 [stat.ML]
	(or arXiv:1503.08329v2 [stat.ML] for this version)
	https://doi.org/10.48550/arXiv.1503.08329
Journal reference:	Journal of Machine Learning Research 2015, vol. 16, p. 787-860

Submission history

From: Pascal Germain
[v1] Sat, 28 Mar 2015 17:19:49 UTC (1,045 KB)
[v2] Tue, 28 Jul 2015 20:08:16 UTC (1,045 KB)
