DOI: 10.1145/3377930.3390158
Research Article · Open Access

AutoLR: an evolutionary approach to learning rate policies

Published: 26 June 2020

Abstract

The choice of a proper learning rate is paramount for good Artificial Neural Network training and performance. In the past, one had to rely on experience and trial and error to find an adequate learning rate. Presently, a plethora of state-of-the-art automatic methods exist that make the search for a good learning rate easier. While these techniques are effective and have yielded good results over the years, they are general solutions, which means that the optimization of the learning rate for specific network topologies remains largely unexplored. This work presents AutoLR, a framework that evolves learning rate schedulers for a specific neural network architecture using Structured Grammatical Evolution. The system was used to evolve learning rate policies that were compared against a commonly used baseline learning rate. Results show that training with certain evolved policies is more efficient than with the established baseline, and suggest that this approach is a viable means of improving a neural network's performance.
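
To make the mechanism concrete, below is a minimal sketch of the kind of loop the abstract describes: grammatical evolution of an expression that maps the epoch number to a learning rate. This is an illustration, not the authors' implementation. The toy grammar, the 8-bit codons, the truncation selection, and above all the stand-in fitness function (a hypothetical proxy that merely rewards positive, non-increasing rates) are assumptions made here; in AutoLR, fitness comes from training the target network architecture under the candidate schedule, and the genotype-to-phenotype mapping uses Structured Grammatical Evolution rather than the classic codon-wrapping mapping sketched below.

```python
# Minimal sketch: evolving a learning rate schedule with classic
# grammatical evolution. NOT the AutoLR implementation; the grammar,
# codon encoding, selection scheme, and fitness proxy are illustrative
# assumptions (see the text above).
import random

# Toy grammar: a schedule is an arithmetic expression in `epoch`.
GRAMMAR = {
    "<expr>":  [["(", "<expr>", "<op>", "<expr>", ")"], ["<const>"], ["epoch"]],
    "<op>":    [["+"], ["-"], ["*"]],
    "<const>": [["0.1"], ["0.01"], ["0.001"], ["0.9"]],
}

def decode(genotype):
    """GE mapping: each codon picks a production (modulo the number of
    options); codons wrap around if the derivation needs more of them."""
    symbols, out, used = ["<expr>"], [], 0
    while symbols:
        if used > 10 * len(genotype):          # runaway recursion: invalid
            return None
        sym = symbols.pop(0)
        if sym in GRAMMAR:
            options = GRAMMAR[sym]
            pick = options[genotype[used % len(genotype)] % len(options)]
            symbols = list(pick) + symbols
            used += 1
        else:
            out.append(sym)
    return " ".join(out)

def fitness(expr, epochs=20):
    """Stand-in fitness. In AutoLR this would be the accuracy of the target
    network trained under the candidate schedule; here a hypothetical proxy
    rewards schedules whose rates stay in (0, 1] and never increase."""
    if expr is None:
        return -1.0
    try:
        # Safe here because the expressions come from our own grammar.
        lrs = [eval(expr, {"epoch": e}) for e in range(1, epochs + 1)]
    except Exception:
        return -1.0
    if any(not (0.0 < lr <= 1.0) for lr in lrs):
        return -1.0
    return sum(b <= a for a, b in zip(lrs, lrs[1:])) / (epochs - 1)

def evolve(pop_size=50, codons=16, gens=30, mut_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(256) for _ in range(codons)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda g: fitness(decode(g)), reverse=True)
        elite = pop[: pop_size // 5]           # truncation selection + elitism
        pop = elite + [
            [rng.randrange(256) if rng.random() < mut_rate else codon
             for codon in rng.choice(elite)]   # mutated copy of an elite parent
            for _ in range(pop_size - len(elite))
        ]
    return decode(max(pop, key=lambda g: fitness(decode(g))))

if __name__ == "__main__":
    expr = evolve()
    print("evolved schedule:", expr)
    # To apply during real training with Keras's standard callback API:
    # tf.keras.callbacks.LearningRateScheduler(
    #     lambda epoch, lr: eval(expr, {"epoch": epoch + 1}))
```

Under this setup, an evolved expression can be plugged into real training through a standard scheduler callback, as the final comment indicates; only the fitness evaluation would need to be replaced by actual network training to approach what the paper does.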




Published In

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
June 2020
1349 pages
ISBN: 9781450371285
DOI: 10.1145/3377930
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. learning rate schedulers
  2. structured grammatical evolution


Funding Sources

  • European Social Fund
  • Fundação para a Ciência e a Tecnologia (FCT), Portugal
  • FCT - Foundation for Science and Technology

Conference

GECCO '20

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%



Article Metrics

  • Downloads (Last 12 months): 107
  • Downloads (Last 6 weeks): 21
Reflects downloads up to 03 Feb 2025


Cited By

  • (2025) AI-Powered Cow Detection in Complex Farm Environments. Smart Agricultural Technology, 100770. https://doi.org/10.1016/j.atech.2025.100770. Online publication date: Jan-2025.
  • (2024) Optimizing tomato plant phenotyping detection. Computers and Electronics in Agriculture, 218(C). https://doi.org/10.1016/j.compag.2024.108728. Online publication date: 1-Mar-2024.
  • (2023) Rethinking Learning Rate Tuning in the Era of Large Language Models. 2023 IEEE 5th International Conference on Cognitive Machine Intelligence (CogMI), 112-121. https://doi.org/10.1109/CogMI58952.2023.00025. Online publication date: 1-Nov-2023.
  • (2023) A survey: evolutionary deep learning. Soft Computing, 27(14), 9401-9423. https://doi.org/10.1007/s00500-023-08316-4. Online publication date: 23-May-2023.
  • (2022) Selecting and Composing Learning Rate Policies for Deep Neural Networks. ACM Transactions on Intelligent Systems and Technology, 14(2), 1-25. https://doi.org/10.1145/3570508. Online publication date: 3-Nov-2022.
  • (2022) Evolving Adaptive Neural Network Optimizers for Image Classification. Genetic Programming, 3-18. https://doi.org/10.1007/978-3-031-02056-8_1. Online publication date: 13-Apr-2022.
