Abstract
Hierarchical neural networks for object recognition have a long history. In recent years, novel methods for incrementally learning a hierarchy of features from unlabeled inputs were proposed as a good starting point for supervised training. These deep learning methods—together with advances in parallel computing—made it possible to successfully attack problems that were previously impractical in terms of depth and input size. In this article, we introduce the reader to the basic concepts of deep learning, discuss selected methods in detail, and present application examples from computer vision and speech recognition.
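To make the idea of incrementally learning a feature hierarchy from unlabeled inputs concrete, the sketch below greedily pretrains a small stack of tied-weight autoencoders: each layer is trained to reconstruct its input, and its hidden activations become the input of the next layer. This is a minimal illustration only, not the implementation discussed in the article; the data, layer sizes, and hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_autoencoder(X, n_hidden, epochs=50, lr=0.1):
    """Train one tied-weight autoencoder layer by plain gradient descent."""
    n_in = X.shape[1]
    W = rng.normal(0.0, 0.1, size=(n_in, n_hidden))
    b_h = np.zeros(n_hidden)   # hidden bias
    b_o = np.zeros(n_in)       # output (reconstruction) bias
    for _ in range(epochs):
        H = sigmoid(X @ W + b_h)        # encode
        R = sigmoid(H @ W.T + b_o)      # decode with tied weights
        err = R - X                     # reconstruction error
        # backpropagate through decoder and encoder
        d_o = err * R * (1.0 - R)
        d_h = (d_o @ W) * H * (1.0 - H)
        gW = X.T @ d_h + d_o.T @ H      # tied weights: both paths contribute
        W -= lr * gW / len(X)
        b_h -= lr * d_h.mean(axis=0)
        b_o -= lr * d_o.mean(axis=0)
    return W, b_h

# hypothetical unlabeled data: 200 samples with 32 features
X = rng.random((200, 32))

# greedy layer-wise pretraining: each layer learns features of the one below
layers, inp = [], X
for n_hidden in (16, 8):
    W, b = train_autoencoder(inp, n_hidden)
    layers.append((W, b))
    inp = sigmoid(inp @ W + b)  # hidden code becomes input of the next layer

# the pretrained weights would then initialize a deep network that is
# fine-tuned with supervised backpropagation on labeled data.
```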