Abstract
This paper surveys certain developments in the use of probabilistic techniques for the modelling of generalization. Some of the main methods and key results are discussed. Many details are omitted, the aim being to give a high-level overview of the types of approaches taken and methods used.
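To give a flavour of the kind of result surveyed, the following is an illustrative example (not a statement quoted from the paper): the classical Vapnik-Chervonenkis uniform-convergence bound. For a class H of {0,1}-valued functions with growth function \Pi_H, and an i.i.d. sample z of size m drawn from a distribution P,

\[
  P^m\!\left\{ \sup_{h \in H} \bigl| \mathrm{er}_P(h) - \widehat{\mathrm{er}}_z(h) \bigr| \ge \epsilon \right\}
  \;\le\; 4\,\Pi_H(2m)\, e^{-\epsilon^2 m / 8},
\]

where er_P(h) denotes the true error of h under P and \widehat{er}_z(h) its empirical error on the sample z. By Sauer's lemma, if H has finite VC dimension d then \Pi_H(2m) \le (2em/d)^d, so the right-hand side tends to 0 as m grows: empirical error rates converge to true error rates uniformly over the class, which is the probabilistic modelling of generalization at issue.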
Cite this paper
Anthony, M. (2002). Mathematical Modelling of Generalization. In: Marinaro, M., Tagliaferri, R. (eds) Neural Nets. WIRN 2002. Lecture Notes in Computer Science, vol 2486. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45808-5_20