Abstract
The neuromanifold, or the parameter space of multilayer perceptrons, includes complex singularities at which the Fisher information matrix degenerates. The parameters are unidentifiable at singularities, and this causes serious difficulties in learning, known as plateaus in the cost function. The natural or adaptive natural gradient method has been proposed to overcome this difficulty. It is important to study the relation between the generalization error and the training error at the singularities, because the generalization error is estimated in terms of the training error. The generalization error is studied for both the maximum likelihood estimator (mle) and the Bayesian predictive distribution estimator in terms of the Gaussian random field, using a simple model. This elucidates the strange behaviors of learning dynamics around singularities.
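The natural gradient update mentioned above preconditions the ordinary gradient by the inverse Fisher information matrix, θ ← θ − η F⁻¹∇L. A minimal sketch follows; this is an illustration under assumed choices (a toy quadratic loss whose Fisher matrix is taken to be the Hessian A, and a small damping term to keep F invertible near singularities where it degenerates), not the authors' implementation:

```python
import numpy as np

def natural_gradient_step(theta, grad, fisher, lr=0.5, damping=1e-8):
    """One natural-gradient step: theta <- theta - lr * F^{-1} grad.
    The damping term keeps F invertible near singularities, where the
    Fisher information matrix degenerates."""
    f = fisher + damping * np.eye(len(theta))
    return theta - lr * np.linalg.solve(f, grad)

# Toy example: badly conditioned quadratic loss L(theta) = 0.5 theta^T A theta.
# For this illustration we assume the Fisher matrix equals A, so the natural
# gradient rescales all directions equally and avoids the slow "plateau-like"
# progress that plain gradient descent shows along the flat direction.
A = np.array([[3.0, 0.0],
              [0.0, 0.05]])
theta = np.array([1.0, 1.0])
for _ in range(50):
    grad = A @ theta
    theta = natural_gradient_step(theta, grad, A)
print(theta)  # close to the minimum at the origin
```

With plain gradient descent the step along the 0.05 eigendirection would shrink very slowly; solving against F removes this anisotropy, which is the intuition behind using the (adaptive) natural gradient to escape plateaus.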
Copyright information
© 2001 Springer-Verlag Berlin Heidelberg
Cite this paper
Amari, S., Ozeki, T., Park, H. (2001). Generalization Error and Training Error at Singularities of Multilayer Perceptrons. In: Mira, J., Prieto, A. (eds) Connectionist Models of Neurons, Learning Processes, and Artificial Intelligence. IWANN 2001. Lecture Notes in Computer Science, vol 2084. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45720-8_37
DOI: https://doi.org/10.1007/3-540-45720-8_37
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42235-8
Online ISBN: 978-3-540-45720-6