DOI: 10.1145/3278681.3278691

Dropout algorithms for recurrent neural networks

Published: 26 September 2018

Abstract

In the last decade, hardware advances have allowed neural networks to grow much larger. Dropout is a popular deep learning technique which has been shown to improve the performance of large neural networks. Recurrent neural networks are powerful networks specialised for problems involving time-series data. Three different approaches to incorporating Dropout into recurrent neural networks have been suggested; however, these approaches have not been evaluated under identical experimental conditions. This article investigates the performance of these Dropout approaches using a 2D physics simulation benchmark. Statistical tests showed that using Dropout did improve network performance on the benchmark. However, contrary to the literature, the Dropout approach that was expected to perform poorly performed well, and the approach that was expected to perform well performed poorly.
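
The abstract does not say which three Dropout variants are compared, but the approaches most often discussed for recurrent networks differ chiefly in where the dropout mask is applied and how often it is resampled: a fresh mask at every time step versus a single mask reused across the whole sequence (the variational scheme of Gal and Ghahramani). Below is a minimal NumPy sketch of that contrast, not the paper's code; the dimensions, the dropout rate of 0.5, and all function names are illustrative assumptions.

    # Minimal sketch (illustrative, not the paper's implementation): a tanh RNN
    # with inverted dropout on the hidden state, contrasting per-step masks
    # with one "variational" mask reused at every time step.
    import numpy as np

    rng = np.random.default_rng(0)

    def inverted_dropout_mask(shape, rate, rng):
        # Zero each unit with probability `rate` and rescale the survivors by
        # 1/(1 - rate) so the expected activation is unchanged.
        keep = 1.0 - rate
        return (rng.random(shape) < keep) / keep

    def rnn_forward(x_seq, W_x, W_h, b, rate=0.5, per_step_masks=True, rng=rng):
        # per_step_masks=True : resample the mask at every step.
        # per_step_masks=False: sample one mask and reuse it for all steps,
        #                       in the spirit of variational dropout.
        n_hid = W_h.shape[0]
        h = np.zeros(n_hid)
        fixed_mask = inverted_dropout_mask((n_hid,), rate, rng)
        states = []
        for x_t in x_seq:
            mask = (inverted_dropout_mask((n_hid,), rate, rng)
                    if per_step_masks else fixed_mask)
            h = np.tanh(W_x @ x_t + W_h @ (h * mask) + b)
            states.append(h)
        return np.stack(states)

    # Toy example: 6 time steps of 3-dimensional input, 4 hidden units.
    T, n_in, n_hid = 6, 3, 4
    x_seq = rng.normal(size=(T, n_in))
    W_x = rng.normal(scale=0.1, size=(n_hid, n_in))
    W_h = rng.normal(scale=0.1, size=(n_hid, n_hid))
    b = np.zeros(n_hid)

    print(rnn_forward(x_seq, W_x, W_h, b, per_step_masks=True).shape)   # (6, 4)
    print(rnn_forward(x_seq, W_x, W_h, b, per_step_masks=False).shape)  # (6, 4)

The per-step variant perturbs the recurrent signal differently at every step, which earlier work argued disrupts the network's memory, whereas the fixed-mask variant silences the same hidden units for the entire sequence; which of the paper's evaluated approaches matched which expectation is detailed in the full text.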


Published In

SAICSIT '18: Proceedings of the Annual Conference of the South African Institute of Computer Scientists and Information Technologists
September 2018, 362 pages
ISBN: 9781450366472
DOI: 10.1145/3278681
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. deep learning
  2. dropout
  3. recurrent neural networks

Qualifiers

  • Research-article

Conference

SAICSIT '18

Acceptance Rates

Overall acceptance rate: 187 of 439 submissions (43%)


Article Metrics

  • Downloads (last 12 months): 21
  • Downloads (last 6 weeks): 0
Reflects downloads up to 03 Feb 2025


Cited By

  • (2024) "Text summarization based on semantic graphs: an abstract meaning representation graph-to-text deep learning approach", Journal of Big Data, 11(1). DOI: 10.1186/s40537-024-00950-5. Online publication date: 14 Jul 2024.
  • (2023) "Evaluation and Optimization of Heat Extraction Strategies Based on Deep Neural Network in the Enhanced Geothermal System", Journal of Energy Engineering, 149(1). DOI: 10.1061/JLEED9.EYENG-4579. Online publication date: Feb 2023.
  • (2023) "Hemispheric prediction of solar cycle 25 based on a deep learning technique", Advances in Space Research. DOI: 10.1016/j.asr.2023.11.015. Online publication date: Nov 2023.
  • (2023) "TemporalFC: A Temporal Fact Checking Approach over Knowledge Graphs", The Semantic Web – ISWC 2023, pp. 465-483. DOI: 10.1007/978-3-031-47240-4_25. Online publication date: 27 Oct 2023.
  • (2022) "HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs", The Semantic Web – ISWC 2022, pp. 462-480. DOI: 10.1007/978-3-031-19433-7_27. Online publication date: 16 Oct 2022.
  • (2021) "Simultaneous identification of groundwater pollution source spatial–temporal characteristics and hydraulic parameters based on deep regularization neural network-hybrid heuristic algorithm", Journal of Hydrology, 600:126586. DOI: 10.1016/j.jhydrol.2021.126586. Online publication date: Sep 2021.
  • (2021) "Identification of groundwater contamination sources and hydraulic parameters based on Bayesian regularization deep neural network", Environmental Science and Pollution Research. DOI: 10.1007/s11356-020-11614-1. Online publication date: 4 Jan 2021.
  • (2020) "Uncertainty Quantification through Dropout in Time Series Prediction by Echo State Networks", Mathematics, 8(8):1374. DOI: 10.3390/math8081374. Online publication date: 17 Aug 2020.
  • (2020) "Emotional Speaker Recognition based on Machine and Deep Learning", 2020 2nd International Multidisciplinary Information Technology and Engineering Conference (IMITEC), pp. 1-8. DOI: 10.1109/IMITEC50163.2020.9334138. Online publication date: 25 Nov 2020.
  • (2019) "Dropout for Recurrent Neural Networks", Recent Advances in Big Data and Deep Learning, pp. 38-47. DOI: 10.1007/978-3-030-16841-4_5. Online publication date: 3 Apr 2019.
