DOI: 10.1145/3377930.3390158
Research Article · Open Access

AutoLR: an evolutionary approach to learning rate policies

Published: 26 June 2020

Abstract

The choice of a proper learning rate is paramount for good Artificial Neural Network training and performance. In the past, one had to rely on experience and trial and error to find an adequate learning rate. Presently, a plethora of state-of-the-art automatic methods exist that make the search for a good learning rate easier. While these techniques are effective and have yielded good results over the years, they are general solutions, which means that the optimization of the learning rate for specific network topologies remains largely unexplored. This work presents AutoLR, a framework that evolves learning rate schedulers for a specific neural network architecture using Structured Grammatical Evolution. The system was used to evolve learning rate policies that were compared against a commonly used baseline learning rate. Results show that training with certain evolved policies is more efficient than with the established baseline, and suggest that this approach is a viable means of improving a neural network's performance.
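
To make the mechanism concrete, below is a minimal sketch of the kind of loop the abstract describes: grammatical evolution of an expression that maps the epoch number to a learning rate. This is an illustration, not the authors' implementation. The toy grammar, the 8-bit codons, the truncation selection, and above all the stand-in fitness function (a hypothetical proxy that merely rewards positive, non-increasing rates) are assumptions made here; in AutoLR, fitness comes from training the target network architecture under the candidate schedule, and the genotype-to-phenotype mapping uses Structured Grammatical Evolution rather than the classic codon-wrapping mapping sketched below.

```python
# Minimal sketch: evolving a learning rate schedule with classic
# grammatical evolution. NOT the AutoLR implementation; the grammar,
# codon encoding, selection scheme, and fitness proxy are illustrative
# assumptions (see the text above).
import random

# Toy grammar: a schedule is an arithmetic expression in `epoch`.
GRAMMAR = {
    "<expr>":  [["(", "<expr>", "<op>", "<expr>", ")"], ["<const>"], ["epoch"]],
    "<op>":    [["+"], ["-"], ["*"]],
    "<const>": [["0.1"], ["0.01"], ["0.001"], ["0.9"]],
}

def decode(genotype):
    """GE mapping: each codon picks a production (modulo the number of
    options); codons wrap around if the derivation needs more of them."""
    symbols, out, used = ["<expr>"], [], 0
    while symbols:
        if used > 10 * len(genotype):          # runaway recursion: invalid
            return None
        sym = symbols.pop(0)
        if sym in GRAMMAR:
            options = GRAMMAR[sym]
            pick = options[genotype[used % len(genotype)] % len(options)]
            symbols = list(pick) + symbols
            used += 1
        else:
            out.append(sym)
    return " ".join(out)

def fitness(expr, epochs=20):
    """Stand-in fitness. In AutoLR this would be the accuracy of the target
    network trained under the candidate schedule; here a hypothetical proxy
    rewards schedules whose rates stay in (0, 1] and never increase."""
    if expr is None:
        return -1.0
    try:
        # Safe here because the expressions come from our own grammar.
        lrs = [eval(expr, {"epoch": e}) for e in range(1, epochs + 1)]
    except Exception:
        return -1.0
    if any(not (0.0 < lr <= 1.0) for lr in lrs):
        return -1.0
    return sum(b <= a for a, b in zip(lrs, lrs[1:])) / (epochs - 1)

def evolve(pop_size=50, codons=16, gens=30, mut_rate=0.1, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randrange(256) for _ in range(codons)] for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=lambda g: fitness(decode(g)), reverse=True)
        elite = pop[: pop_size // 5]           # truncation selection + elitism
        pop = elite + [
            [rng.randrange(256) if rng.random() < mut_rate else codon
             for codon in rng.choice(elite)]   # mutated copy of an elite parent
            for _ in range(pop_size - len(elite))
        ]
    return decode(max(pop, key=lambda g: fitness(decode(g))))

if __name__ == "__main__":
    expr = evolve()
    print("evolved schedule:", expr)
    # To apply during real training with Keras's standard callback API:
    # tf.keras.callbacks.LearningRateScheduler(
    #     lambda epoch, lr: eval(expr, {"epoch": epoch + 1}))
```

Under this setup, an evolved expression can be plugged into real training through a standard scheduler callback, as the final comment indicates; only the fitness evaluation would need to be replaced by actual network training to approach what the paper does.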




Published In

GECCO '20: Proceedings of the 2020 Genetic and Evolutionary Computation Conference
June 2020
1349 pages
ISBN: 9781450371285
DOI: 10.1145/3377930
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. learning rate schedulers
  2. structured grammatical evolution


Funding Sources

  • European Social Fund
  • Fundação para a Ciência e a Tecnologia (FCT), Portugal
  • FCT - Foundation for Science and Technology

Conference

GECCO '20

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%



Article Metrics

  • Downloads (Last 12 months): 107
  • Downloads (Last 6 weeks): 21
Reflects downloads up to 03 Feb 2025


Cited By

  • (2025) AI-Powered Cow Detection in Complex Farm Environments. Smart Agricultural Technology, 100770. https://doi.org/10.1016/j.atech.2025.100770. Online publication date: Jan-2025.
  • (2024) Optimizing tomato plant phenotyping detection. Computers and Electronics in Agriculture, 218(C). https://doi.org/10.1016/j.compag.2024.108728. Online publication date: 1-Mar-2024.
  • (2023) Rethinking Learning Rate Tuning in the Era of Large Language Models. 2023 IEEE 5th International Conference on Cognitive Machine Intelligence (CogMI), 112-121. https://doi.org/10.1109/CogMI58952.2023.00025. Online publication date: 1-Nov-2023.
  • (2023) A survey: evolutionary deep learning. Soft Computing, 27(14), 9401-9423. https://doi.org/10.1007/s00500-023-08316-4. Online publication date: 23-May-2023.
  • (2022) Selecting and Composing Learning Rate Policies for Deep Neural Networks. ACM Transactions on Intelligent Systems and Technology, 14(2), 1-25. https://doi.org/10.1145/3570508. Online publication date: 3-Nov-2022.
  • (2022) Evolving Adaptive Neural Network Optimizers for Image Classification. Genetic Programming, 3-18. https://doi.org/10.1007/978-3-031-02056-8_1. Online publication date: 13-Apr-2022.
