Abstract
Deep learning (DL) builds on a mixture of artificial intelligence (AI) and machine learning (ML) principles that encompasses all those presented in the previous chapters. Two main deep learning extensions of the artificial neural network (ANN), the convolutional neural network (CNN) and the recurrent neural network (RNN), are explained in terms of model architecture. The CNN has a feedforward structure with a series of hidden convolution-pooling layers, followed by a fully connected layer and then the output layer, which makes it a considerably more powerful deep learning procedure than traditional shallow-learning ANN models. Regularization options that guard the CNN against over- or under-fitting are mentioned. As for the RNN in the DL domain, it relies on backpropagation through the network to reach the final solution in a short time and can be represented in either compressed (folded) or unfolded neural network form. The training, testing, and prediction stages of the CNN and the RNN are explained comparatively in the text.
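As a hedged illustration of the layer sequences described above (not code from the chapter), the following PyTorch sketch shows a toy CNN with convolution-pooling blocks feeding a fully connected layer and an output layer, plus a toy RNN whose recurrent layer is trained by backpropagation over the unfolded sequence. All layer sizes, the 28×28 grayscale input, and the class names TinyCNN/TinyRNN are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal CNN sketch: hidden convolution-pooling layers, then a fully
# connected layer, then the output layer (sizes are assumptions).
class TinyCNN(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # convolution layer
            nn.ReLU(),
            nn.MaxPool2d(2),                              # pooling layer
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 7 * 7, 32),   # fully connected layer (assumes 28x28 input)
            nn.ReLU(),
            nn.Dropout(p=0.5),           # one common regularization against over-fitting
            nn.Linear(32, num_classes),  # output layer
        )

    def forward(self, x):
        return self.classifier(self.features(x))

# Minimal RNN sketch: the recurrent layer is unfolded over the time steps
# and trained with backpropagation through time.
class TinyRNN(nn.Module):
    def __init__(self, input_size: int = 4, hidden_size: int = 16, num_classes: int = 10):
        super().__init__()
        self.rnn = nn.RNN(input_size, hidden_size, batch_first=True)
        self.out = nn.Linear(hidden_size, num_classes)

    def forward(self, x):
        _, h_n = self.rnn(x)            # final hidden state after unfolding over time
        return self.out(h_n.squeeze(0))

cnn, rnn = TinyCNN(), TinyRNN()
print(cnn(torch.randn(2, 1, 28, 28)).shape)   # torch.Size([2, 10])
print(rnn(torch.randn(2, 5, 4)).shape)        # torch.Size([2, 10])
```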
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Şen, Z. (2023). Deep Learning. In: Shallow and Deep Learning Principles. Springer, Cham. https://doi.org/10.1007/978-3-031-29555-3_9
DOI: https://doi.org/10.1007/978-3-031-29555-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-29554-6
Online ISBN: 978-3-031-29555-3
eBook Packages: Engineering, Engineering (R0)