Abstract
Recent theoretical work applying the methods of statistical learning theory has renewed interest in long-established learning paradigms such as Bayesian inference and Gibbs algorithms. Sample complexity bounds have been given for these paradigms in the zero-error case. This paper studies the behavior of these algorithms without that assumption. Results include uniform convergence of the Gibbs algorithm towards Bayesian inference, the rate of convergence of the empirical loss towards the generalization loss, and convergence of the generalization error towards the optimal loss in the underlying class of functions.
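As a hedged illustration of the two paradigms compared in the paper (not code from the paper itself), the following Python sketch contrasts the Gibbs algorithm, which draws a single hypothesis from a posterior, with Bayesian inference, which predicts by a posterior-weighted vote. The toy threshold class, the label-noise level, and the pseudo-posterior temperature are all illustrative assumptions.

```python
# Minimal sketch: Gibbs algorithm vs. Bayesian inference on a toy noisy problem.
# All modelling choices (data distribution, hypothesis class, beta) are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: label is 1 iff x > 0.3, with 10% label noise (non-zero error case).
def sample_data(n):
    x = rng.uniform(0, 1, n)
    y = (x > 0.3).astype(int)
    flip = rng.uniform(0, 1, n) < 0.1
    return x, np.where(flip, 1 - y, y)

# Finite class of threshold classifiers h_t(x) = 1[x > t].
thresholds = np.linspace(0, 1, 101)

def empirical_loss(t, x, y):
    return np.mean((x > t).astype(int) != y)

x_train, y_train = sample_data(200)
losses = np.array([empirical_loss(t, x_train, y_train) for t in thresholds])

# Pseudo-posterior over hypotheses (Gibbs measure, inverse temperature beta).
beta = 50.0
post = np.exp(-beta * len(x_train) * losses)
post /= post.sum()

# Gibbs algorithm: draw ONE hypothesis from the posterior and predict with it.
t_gibbs = rng.choice(thresholds, p=post)
gibbs_pred = lambda x: (x > t_gibbs).astype(int)

# Bayesian inference: posterior-weighted majority vote over all hypotheses.
def bayes_pred(x):
    votes = post @ (x[None, :] > thresholds[:, None]).astype(float)
    return (votes > 0.5).astype(int)

x_test, y_test = sample_data(5000)
print("Gibbs generalization loss:", np.mean(gibbs_pred(x_test) != y_test))
print("Bayes generalization loss:", np.mean(bayes_pred(x_test) != y_test))
print("Best empirical loss in class:", losses.min())
```

With enough training data both estimated losses approach the optimal loss in the class (here roughly the 10% noise level), which is the kind of convergence behavior the paper bounds without the zero-error assumption.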
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Teytaud, O., Paugam-Moisy, H. (2001). Bounds on the Generalization Ability of Bayesian Inference and Gibbs Algorithms. In: Dorffner, G., Bischof, H., Hornik, K. (eds) Artificial Neural Networks — ICANN 2001. ICANN 2001. Lecture Notes in Computer Science, vol 2130. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44668-0_38
DOI: https://doi.org/10.1007/3-540-44668-0_38
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42486-4
Online ISBN: 978-3-540-44668-2