DOI: 10.1145/3278681.3278691

Dropout algorithms for recurrent neural networks

Published: 26 September 2018

Abstract

In the last decade, hardware advances have allowed neural networks to grow much larger. Dropout is a popular deep learning technique which has been shown to improve the performance of large neural networks. Recurrent neural networks are powerful networks specialised for problems involving time-series data. Three different approaches to incorporating Dropout into recurrent neural networks have been suggested; however, these approaches have not been evaluated under identical experimental conditions. This article investigates the performance of these Dropout approaches using a 2D physics simulation benchmark. Statistical tests showed that using Dropout did improve network performance on the benchmark. However, contrary to the literature, the Dropout approach that was expected to perform poorly performed well, and the approach that was expected to perform well performed poorly.
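
The abstract does not say which three Dropout variants are compared, but the approaches most often discussed for recurrent networks differ chiefly in where the dropout mask is applied and how often it is resampled: a fresh mask at every time step versus a single mask reused across the whole sequence (the variational scheme of Gal and Ghahramani). Below is a minimal NumPy sketch of that contrast, not the paper's code; the dimensions, the dropout rate of 0.5, and all function names are illustrative assumptions.

    # Minimal sketch (illustrative, not the paper's implementation): a tanh RNN
    # with inverted dropout on the hidden state, contrasting per-step masks
    # with one "variational" mask reused at every time step.
    import numpy as np

    rng = np.random.default_rng(0)

    def inverted_dropout_mask(shape, rate, rng):
        # Zero each unit with probability `rate` and rescale the survivors by
        # 1/(1 - rate) so the expected activation is unchanged.
        keep = 1.0 - rate
        return (rng.random(shape) < keep) / keep

    def rnn_forward(x_seq, W_x, W_h, b, rate=0.5, per_step_masks=True, rng=rng):
        # per_step_masks=True : resample the mask at every step.
        # per_step_masks=False: sample one mask and reuse it for all steps,
        #                       in the spirit of variational dropout.
        n_hid = W_h.shape[0]
        h = np.zeros(n_hid)
        fixed_mask = inverted_dropout_mask((n_hid,), rate, rng)
        states = []
        for x_t in x_seq:
            mask = (inverted_dropout_mask((n_hid,), rate, rng)
                    if per_step_masks else fixed_mask)
            h = np.tanh(W_x @ x_t + W_h @ (h * mask) + b)
            states.append(h)
        return np.stack(states)

    # Toy example: 6 time steps of 3-dimensional input, 4 hidden units.
    T, n_in, n_hid = 6, 3, 4
    x_seq = rng.normal(size=(T, n_in))
    W_x = rng.normal(scale=0.1, size=(n_hid, n_in))
    W_h = rng.normal(scale=0.1, size=(n_hid, n_hid))
    b = np.zeros(n_hid)

    print(rnn_forward(x_seq, W_x, W_h, b, per_step_masks=True).shape)   # (6, 4)
    print(rnn_forward(x_seq, W_x, W_h, b, per_step_masks=False).shape)  # (6, 4)

The per-step variant perturbs the recurrent signal differently at every step, which earlier work argued disrupts the network's memory, whereas the fixed-mask variant silences the same hidden units for the entire sequence; which of the paper's evaluated approaches matched which expectation is detailed in the full text.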


Published In

SAICSIT '18: Proceedings of the Annual Conference of the South African Institute of Computer Scientists and Information Technologists
September 2018, 362 pages
ISBN: 9781450366472
DOI: 10.1145/3278681
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. deep learning
  2. dropout
  3. recurrent neural networks

Qualifiers

  • Research-article

Conference

SAICSIT '18

Acceptance Rates

Overall acceptance rate: 187 of 439 submissions (43%)


Article Metrics

  • Downloads (last 12 months): 21
  • Downloads (last 6 weeks): 0
Reflects downloads up to 03 Feb 2025


Cited By

  • (2024) "Text summarization based on semantic graphs: an abstract meaning representation graph-to-text deep learning approach", Journal of Big Data, 11(1). DOI: 10.1186/s40537-024-00950-5. Online publication date: 14 Jul 2024.
  • (2023) "Evaluation and Optimization of Heat Extraction Strategies Based on Deep Neural Network in the Enhanced Geothermal System", Journal of Energy Engineering, 149(1). DOI: 10.1061/JLEED9.EYENG-4579. Online publication date: Feb 2023.
  • (2023) "Hemispheric prediction of solar cycle 25 based on a deep learning technique", Advances in Space Research. DOI: 10.1016/j.asr.2023.11.015. Online publication date: Nov 2023.
  • (2023) "TemporalFC: A Temporal Fact Checking Approach over Knowledge Graphs", The Semantic Web – ISWC 2023, pp. 465-483. DOI: 10.1007/978-3-031-47240-4_25. Online publication date: 27 Oct 2023.
  • (2022) "HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs", The Semantic Web – ISWC 2022, pp. 462-480. DOI: 10.1007/978-3-031-19433-7_27. Online publication date: 16 Oct 2022.
  • (2021) "Simultaneous identification of groundwater pollution source spatial–temporal characteristics and hydraulic parameters based on deep regularization neural network-hybrid heuristic algorithm", Journal of Hydrology, 600:126586. DOI: 10.1016/j.jhydrol.2021.126586. Online publication date: Sep 2021.
  • (2021) "Identification of groundwater contamination sources and hydraulic parameters based on Bayesian regularization deep neural network", Environmental Science and Pollution Research. DOI: 10.1007/s11356-020-11614-1. Online publication date: 4 Jan 2021.
  • (2020) "Uncertainty Quantification through Dropout in Time Series Prediction by Echo State Networks", Mathematics, 8(8):1374. DOI: 10.3390/math8081374. Online publication date: 17 Aug 2020.
  • (2020) "Emotional Speaker Recognition based on Machine and Deep Learning", 2020 2nd International Multidisciplinary Information Technology and Engineering Conference (IMITEC), pp. 1-8. DOI: 10.1109/IMITEC50163.2020.9334138. Online publication date: 25 Nov 2020.
  • (2019) "Dropout for Recurrent Neural Networks", Recent Advances in Big Data and Deep Learning, pp. 38-47. DOI: 10.1007/978-3-030-16841-4_5. Online publication date: 3 Apr 2019.
