Abstract
To avoid overfitting in neural learning, a regularization term is added to the loss function to be minimized; this term is naturally derived from the Bayesian standpoint. The present paper studies how to determine the regularization constant from the points of view of the empirical Bayes approach, the minimum description length (MDL) approach, and the network information criterion (NIC) approach. An asymptotic statistical analysis is given to elucidate their differences. These approaches are tightly connected with the method of model selection. The analysis shows the superiority of the NIC approach.
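As a concrete illustration of the setting the abstract describes, the sketch below minimizes a ridge-regularized least-squares loss in one dimension and selects the regularization constant from a candidate grid. The selection rule here is plain held-out validation error, used only as a hedged stand-in for the empirical Bayes, MDL, and NIC criteria analyzed in the paper; all data and candidate values are hypothetical.

```python
# Hypothetical illustration: ridge-regularized least squares in one dimension.
# The loss is L(w) = sum_i (y_i - w*x_i)^2 + lam * w^2, minimized in closed
# form by w = sum(x*y) / (sum(x^2) + lam).  The regularization constant lam
# is chosen by minimizing squared error on a held-out validation set -- a
# simple stand-in for the empirical Bayes / MDL / NIC criteria in the paper.

def ridge_fit(xs, ys, lam):
    """Closed-form minimizer of the regularized loss."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

def sq_error(w, xs, ys):
    """Unregularized squared error of slope w on a data set."""
    return sum((y - w * x) ** 2 for x, y in zip(xs, ys))

def select_lambda(train, valid, candidates):
    """Pick the regularization constant with the lowest validation error."""
    return min(candidates,
               key=lambda lam: sq_error(ridge_fit(*train, lam), *valid))

# Noisy observations of y = 2x; the training noise makes the unregularized
# fit overshoot the true slope, so a positive lam generalizes better.
train = ([1.0, 2.0, 3.0, 4.0], [2.4, 4.5, 6.8, 8.6])
valid = ([1.5, 2.5, 3.5], [3.0, 5.0, 7.0])
lam = select_lambda(train, valid, [0.0, 1.0, 3.0, 10.0])
w = ridge_fit(*train, lam)
print(lam, round(w, 3))  # -> 3.0 2.006
```

The paper's point is precisely that the criterion used in `select_lambda` can be replaced by Bayesian evidence, description length, or the NIC, and that these choices behave differently asymptotically.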
References
H. Akaike (1974) A new look at the statistical model identification, IEEE Transactions on Automatic Control, 19, 716–723.
G. te Brake, J.N. Kok and P.M.B. Vitányi (1995) Model Selection for Neural Networks: Comparing MDL and NIC, NeuroCOLT Technical Report NC-TR-95-021.
D.J.C. MacKay (1992) Bayesian interpolation, Neural Computation, 4, 415–447.
J.E. Moody (1992) The effective number of parameters: an analysis of generalization and regularization in nonlinear learning systems, in NIPS4, pp.847–854.
N. Murata, S. Yoshizawa and S. Amari (1994) Network information criterion — determining the number of hidden units for artificial neural network models, IEEE Transactions on Neural Networks, 5, 865–872.
T. Poggio and F. Girosi (1990) Regularization algorithms for learning that are equivalent to multilayer networks, Science, 247, 978–982.
B.D. Ripley (1996) Pattern Recognition and Neural Networks, Cambridge University Press.
J. Rissanen (1989) Stochastic Complexity in Statistical Inquiry, Singapore: World Scientific Publishing Co.
Copyright information
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Amari, Si., Murata, N. (1997). Statistical analysis of regularization constant — From Bayes, MDL and NIC points of view. In: Mira, J., Moreno-Díaz, R., Cabestany, J. (eds) Biological and Artificial Computation: From Neuroscience to Technology. IWANN 1997. Lecture Notes in Computer Science, vol 1240. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0032486
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63047-0
Online ISBN: 978-3-540-69074-0