
The nature of unsupervised learning in deep neural networks: A new understanding and novel approach

Published: 01 July 2016

Abstract

Over the last decade, deep neural networks have become a hot topic in machine learning and a breakthrough technology for processing images, video, speech, text, and audio. Thanks to its deep architecture, a deep neural network overcomes some limitations of shallow neural networks. In this paper we investigate the nature of unsupervised learning in the restricted Boltzmann machine (RBM). We prove that maximizing the log-likelihood of the input data distribution of an RBM is equivalent to minimizing the cross-entropy, and to a special case of minimizing the mean squared error; the nature of unsupervised learning is therefore invariant to the choice of training criterion. Building on this result, we propose a new technique, called "REBA", for the unsupervised training of deep neural networks. In contrast to Hinton's conventional approach to RBM learning, which is based on a linear training rule, the proposed technique is founded on a nonlinear training rule. We show that the classical equations for RBM learning are a special case of the proposed technique, so the proposed approach is more universal than the traditional energy-based model. We demonstrate the performance of the REBA technique on a well-known benchmark problem. The main contribution of this paper is a novel view of, and a new understanding of, unsupervised learning in deep neural networks.
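For context, the baseline the abstract contrasts REBA with is Hinton's classical contrastive-divergence (CD-1) update for a Bernoulli RBM, whose weight update is linear in the unit activations. Below is a minimal NumPy sketch of that classical rule; all function and variable names here are illustrative, and the nonlinear REBA rule itself is defined in the paper and not reproduced here.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(W, b_vis, b_hid, v0, lr=0.01, rng=None):
    # One CD-1 step for a Bernoulli RBM (the classical linear-rule
    # baseline). Shapes: W is (n_vis, n_hid), v0 is (batch, n_vis).
    if rng is None:
        rng = np.random.default_rng()
    h0 = sigmoid(v0 @ W + b_hid)                      # positive phase
    h0_s = (rng.random(h0.shape) < h0).astype(float)  # sample hidden states
    v1 = sigmoid(h0_s @ W.T + b_vis)                  # reconstruction
    h1 = sigmoid(v1 @ W + b_hid)                      # negative phase
    n = v0.shape[0]
    # Gradient ascent on the log-likelihood: data term minus model term.
    W += lr * (v0.T @ h0 - v1.T @ h1) / n
    b_vis += lr * (v0 - v1).mean(axis=0)
    b_hid += lr * (h0 - h1).mean(axis=0)
    return W, b_vis, b_hid

By the equivalence stated in the abstract, this maximum-likelihood ascent can equally be read as minimizing the cross-entropy, or, in a special case, the mean squared error, between the data v0 and the reconstruction v1.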

Published In

Optical Memory and Neural Networks, Volume 25, Issue 3, July 2016, 76 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. data visualization
  2. deep learning
  3. deep neural networks
  4. machine learning
  5. restricted Boltzmann machine

Qualifiers

  • Article

Cited By

  • (2022) Method for Reducing Neural-Network Models of Computer Vision, Pattern Recognition and Image Analysis, vol. 32, no. 2, pp. 294-300, doi:10.1134/S1054661822020146. Online publication date: 1-Jun-2022.
  • (2022) Hybrid deep learning diagonal recurrent neural network controller for nonlinear systems, Neural Computing and Applications, vol. 34, no. 24, pp. 22367-22386, doi:10.1007/s00521-022-07673-9. Online publication date: 1-Dec-2022.
  • (2021) Deep Neural Networks: Selected Aspects of Learning and Application, Pattern Recognition and Image Analysis, vol. 31, no. 1, pp. 132-143, doi:10.1134/S1054661821010090. Online publication date: 1-Jan-2021.
  • (2021) The Reduction of Fully Connected Neural Network Parameters Using the Pre-training Technique, Proc. 2021 11th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS), pp. 937-941, doi:10.1109/IDAACS53288.2021.9661015. Online publication date: 22-Sep-2021.
  • (2021) Deep learning controller for nonlinear system based on Lyapunov stability criterion, Neural Computing and Applications, vol. 33, no. 5, pp. 1515-1531, doi:10.1007/s00521-020-05077-1. Online publication date: 1-Mar-2021.
