Adaptive Method of Realizing Natural Gradient Learning for Multilayer Perceptrons

Published: 01 June 2000

Abstract

The natural gradient learning method is known to have ideal performance for on-line training of multilayer perceptrons: it avoids the plateaus that cause the slow convergence of the backpropagation method, and it is Fisher efficient, whereas the conventional method is not. However, implementing the method requires calculating the Fisher information matrix and its inverse, which is practically very difficult. This article proposes an adaptive method of obtaining the inverse of the Fisher information matrix directly. It generalizes the adaptive Gauss-Newton algorithms and provides a solid theoretical justification for them. Simulations show that the proposed adaptive method works very well for realizing natural gradient learning.
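The key idea in the abstract — updating an estimate of the inverse Fisher information matrix online, so that no explicit matrix inversion is ever needed — can be sketched with a rank-one recursion. The sketch below is a minimal illustration on a toy linear-Gaussian model y = w·x + noise, not the paper's multilayer perceptron, and the step sizes `eta` and `eps` are illustrative assumptions rather than the paper's choices.

```python
import numpy as np

# Hedged sketch of adaptive natural gradient learning on a toy
# linear-Gaussian model y = w.x + noise (an illustrative stand-in for
# the paper's MLP). G_inv is an online estimate of the inverse Fisher
# information matrix, updated directly from the score vector
# s = grad_w log p(y | x; w) via the rank-one recursion
#     G_inv <- (1 + eps) * G_inv - eps * (G_inv s)(G_inv s)^T
# so no matrix is ever inverted explicitly.

rng = np.random.default_rng(0)
d = 3
w_true = np.array([1.0, -2.0, 0.5])   # target parameters (toy example)
w = np.zeros(d)                        # parameters being learned
G_inv = 0.1 * np.eye(d)                # initial inverse-Fisher estimate
eta, eps = 0.01, 0.002                 # learning rate / Fisher-update rate

for t in range(6000):
    x = rng.normal(size=d)
    y = w_true @ x + rng.normal()      # unit-variance observation noise
    err = y - w @ x
    s = err * x                        # score of the Gaussian model
    v = G_inv @ s
    # rank-one adaptive update of the inverse Fisher estimate
    G_inv = (1.0 + eps) * G_inv - eps * np.outer(v, v)
    # natural gradient step: ordinary gradient preconditioned by G_inv
    w = w + eta * G_inv @ s
```

Because `G_inv` tracks the inverse directly, each step costs O(d²), versus the O(d³) that recomputing and inverting the Fisher matrix at every step would require — which is the practical difficulty the abstract points to.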


Published In

Neural Computation, Volume 12, Issue 6 (June 2000), 231 pages.

Publisher: MIT Press, Cambridge, MA, United States.
    Cited By

• (2021) Tensor normal training for deep learning models. Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 26040–26052. doi:10.5555/3540261.3542255
• (2020) WoodFisher. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 18098–18109. doi:10.5555/3495724.3497243
• (2020) Practical quasi-Newton methods for training deep neural networks. Proceedings of the 34th International Conference on Neural Information Processing Systems, pp. 2386–2396. doi:10.5555/3495724.3495925
• (2019) EA-CG. Proceedings of the Thirty-Third AAAI Conference on Artificial Intelligence and Thirty-First Innovative Applications of Artificial Intelligence Conference and Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, pp. 3337–3346. doi:10.1609/aaai.v33i01.33013337
• (2018) Exact natural gradient in deep linear networks and application to the nonlinear case. Proceedings of the 32nd International Conference on Neural Information Processing Systems, pp. 5945–5954. doi:10.5555/3327345.3327494
• (2018) Numerical analysis near singularities in RBF networks. Journal of Machine Learning Research, 19(1), 1–39. doi:10.5555/3291125.3291126
• (2018) Dynamics of learning in MLP. Neural Computation, 30(1), 1–33. doi:10.1162/neco_a_01029
• (2017) Active bias. Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 1003–1013. doi:10.5555/3294771.3294867
• (2017) Probabilistic line searches for stochastic optimization. Journal of Machine Learning Research, 18(1), 4262–4320. doi:10.5555/3122009.3176863
• (2017) Building Proteins in a Day. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39(4), 706–718. doi:10.1109/TPAMI.2016.2627573
