
The nature of unsupervised learning in deep neural networks: A new understanding and novel approach

Published: 01 July 2016

Abstract

Over the last decade, deep neural networks have become a hot topic in machine learning and a breakthrough technology for processing images, video, speech, text, and audio. Thanks to its deep architecture, a deep neural network overcomes some limitations of shallow neural networks. In this paper we investigate the nature of unsupervised learning in the restricted Boltzmann machine (RBM). We prove that maximizing the log-likelihood of the input data distribution of an RBM is equivalent to minimizing the cross-entropy, and to a special case of minimizing the mean squared error; the nature of unsupervised learning is therefore invariant to the choice of training criterion. Building on this result, we propose a new technique, called "REBA", for the unsupervised training of deep neural networks. In contrast to Hinton's conventional approach to RBM learning, which is based on a linear training rule, the proposed technique is founded on a nonlinear training rule. We show that the classical equations for RBM learning are a special case of the proposed technique, so the proposed approach is more universal than the traditional energy-based model. We demonstrate the performance of the REBA technique on a well-known benchmark problem. The main contribution of this paper is a novel view of, and a new understanding of, unsupervised learning in deep neural networks.
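For context, the baseline the abstract contrasts REBA with is Hinton's classical contrastive-divergence (CD-1) update for a Bernoulli RBM, whose weight update is linear in the unit activations. Below is a minimal NumPy sketch of that classical rule; all function and variable names here are illustrative, and the nonlinear REBA rule itself is defined in the paper and not reproduced here.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_vis, b_hid, v0, lr=0.01, rng=None):
    # One CD-1 step for a Bernoulli RBM (the classical linear-rule
    # baseline). Shapes: W is (n_vis, n_hid), v0 is (batch, n_vis).
    if rng is None:
        rng = np.random.default_rng()
    h0 = sigmoid(v0 @ W + b_hid)                      # positive phase
    h0_s = (rng.random(h0.shape) < h0).astype(float)  # sample hidden states
    v1 = sigmoid(h0_s @ W.T + b_vis)                  # reconstruction
    h1 = sigmoid(v1 @ W + b_hid)                      # negative phase
    n = v0.shape[0]
    # Gradient ascent on the log-likelihood: data term minus model term.
    W += lr * (v0.T @ h0 - v1.T @ h1) / n
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (h0 - h1).mean(axis=0)
    return W, b_vis, b_hid

By the equivalence stated in the abstract, this maximum-likelihood ascent can equally be read as minimizing the cross-entropy, or, in a special case, the mean squared error, between the data v0 and the reconstruction v1.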

Published In

Optical Memory and Neural Networks, Volume 25, Issue 3, July 2016, 76 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. data visualization
  2. deep learning
  3. deep neural networks
  4. machine learning
  5. restricted Boltzmann machine

Qualifiers

  • Article

Cited By

  • (2022) Method for Reducing Neural-Network Models of Computer Vision, Pattern Recognition and Image Analysis, vol. 32, no. 2, pp. 294-300, doi:10.1134/S1054661822020146. Online publication date: 1-Jun-2022.
  • (2022) Hybrid deep learning diagonal recurrent neural network controller for nonlinear systems, Neural Computing and Applications, vol. 34, no. 24, pp. 22367-22386, doi:10.1007/s00521-022-07673-9. Online publication date: 1-Dec-2022.
  • (2021) Deep Neural Networks: Selected Aspects of Learning and Application, Pattern Recognition and Image Analysis, vol. 31, no. 1, pp. 132-143, doi:10.1134/S1054661821010090. Online publication date: 1-Jan-2021.
  • (2021) The Reduction of Fully Connected Neural Network Parameters Using the Pre-training Technique, Proc. 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), pp. 937-941, doi:10.1109/IDAACS53288.2021.9661015. Online publication date: 22-Sep-2021.
  • (2021) Deep learning controller for nonlinear system based on Lyapunov stability criterion, Neural Computing and Applications, vol. 33, no. 5, pp. 1515-1531, doi:10.1007/s00521-020-05077-1. Online publication date: 1-Mar-2021.
