Abstract
To avoid overfitting in neural learning, a regularization term is added to the loss function to be minimized; this term is naturally derived from the Bayesian standpoint. The present paper studies how to determine the regularization constant from the points of view of the empirical Bayes approach, the minimum description length (MDL) approach, and the network information criterion (NIC) approach. An asymptotic statistical analysis is given to elucidate their differences. These approaches are tightly connected with the method of model selection. The analysis shows the superiority of the NIC approach.
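As a concrete illustration of the setting the abstract describes, the sketch below minimizes a ridge-regularized least-squares loss in one dimension and selects the regularization constant from a candidate grid. The selection rule here is plain held-out validation error, used only as a hedged stand-in for the empirical Bayes, MDL, and NIC criteria analyzed in the paper; all data and candidate values are hypothetical.

```python
# Hypothetical illustration: ridge-regularized least squares in one dimension.
# The loss is L(w) = sum_i (y_i - w*x_i)^2 + lam * w^2, minimized in closed
# form by w = sum(x*y) / (sum(x^2) + lam).  The regularization constant lam
# is chosen by minimizing squared error on a held-out validation set -- a
# simple stand-in for the empirical Bayes / MDL / NIC criteria in the paper.

def ridge_fit(xs, ys, lam):
    """Closed-form minimizer of the regularized loss."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def sq_error(w, xs, ys):
    """Unregularized squared error of slope w on a data set."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys))

def select_lambda(train, valid, candidates):
    """Pick the regularization constant with the lowest validation error."""
    return min(candidates,
               key=lambda lam: sq_error(ridge_fit(*train, lam), *valid))

# Noisy observations of y = 2x; the training noise makes the unregularized
# fit overshoot the true slope, so a positive lam generalizes better.
train = ([1.0, 2.0, 3.0, 4.0], [2.4, 4.5, 6.8, 8.6])
valid = ([1.5, 2.5, 3.5], [3.0, 5.0, 7.0])
lam = select_lambda(train, valid, [0.0, 1.0, 3.0, 10.0])
w = ridge_fit(*train, lam)
print(lam, round(w, 3))  # -> 3.0 2.006
```

The paper's point is precisely that the criterion used in `select_lambda` can be replaced by Bayesian evidence, description length, or the NIC, and that these choices behave differently asymptotically.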
References
H. Akaike (1974) A new look at the statistical model identification, IEEE Transactions on Automatic Control, 19, 716–723.
G. te Brake, J.N. Kok and P.M.B. Vitányi (1995) Model Selection for Neural Networks: Comparing MDL and NIC, NeuroCOLT Technical Report NC-TR-95-021.
D.J.C. MacKay (1992) Bayesian interpolation, Neural Computation, 4, 415–447.
J.E. Moody (1992) The effective number of parameters: an analysis of generalization and regularization in nonlinear learning systems, in NIPS4, pp.847–854.
N. Murata, S. Yoshizawa and S. Amari (1994) Network information criterion — determining the number of hidden units for artificial neural network models, IEEE Transactions on Neural Networks, 5, 865–872.
T. Poggio and F. Girosi (1990) Regularization algorithms for learning that are equivalent to multilayer networks, Science, 247, 978–982.
B.D. Ripley (1996) Pattern Recognition and Neural Networks, Cambridge University Press.
J. Rissanen (1989) Stochastic Complexity in Statistical Inquiry, Singapore: World Scientific Publishing Co.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Amari, Si., Murata, N. (1997). Statistical analysis of regularization constant — From Bayes, MDL and NIC points of view. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032486
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0