DOI: 10.1145/2908812.2908898

Epsilon-Lexicase Selection for Regression

Published: 20 July 2016

Abstract

Lexicase selection is a parent selection method that considers test cases separately, rather than in aggregate, when performing parent selection. It performs well in discrete error spaces but not on the continuous-valued problems that compose most system identification tasks. In this paper, we develop a new form of lexicase selection for symbolic regression, named ε-lexicase selection, that redefines the pass condition for individuals on each test case in a more effective way. We run a series of experiments on real-world and synthetic problems with several treatments of ε and quantify how ε affects parent selection and model performance. ε-lexicase selection is shown to be effective for regression, producing better fit models compared to other techniques such as tournament selection and age-fitness Pareto optimization. We demonstrate that ε can be adapted automatically for individual test cases based on the population performance distribution. Our experiments show that ε-lexicase selection with automatic ε produces the most accurate models across tested problems with negligible computational overhead. We show that behavioral diversity is exceptionally high in lexicase selection treatments, and that ε-lexicase selection makes use of more fitness cases when selecting parents than lexicase selection, which helps explain the performance improvement.
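The selection step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact pass condition or the statistic used to set ε, so the sketch assumes an individual passes a test case when its error is within ε of the best error among the remaining candidates, and it assumes ε is the median absolute deviation of the population's errors on that case. The names `mad` and `epsilon_lexicase_select` are hypothetical.

```python
import random
from statistics import median


def mad(values):
    """Median absolute deviation, used here as an assumed measure of spread."""
    m = median(values)
    return median([abs(v - m) for v in values])


def epsilon_lexicase_select(errors, rng=random):
    """Return the index of one selected parent.

    errors[i][t] is the absolute error of individual i on test case t.
    """
    n_cases = len(errors[0])
    # Assumption: one epsilon per test case, derived from the population's
    # error distribution on that case.
    eps = [mad([ind[t] for ind in errors]) for t in range(n_cases)]

    candidates = list(range(len(errors)))
    cases = list(range(n_cases))
    rng.shuffle(cases)  # each selection event uses a fresh random case order

    for t in cases:
        best = min(errors[i][t] for i in candidates)
        # Assumed pass condition: within epsilon of the best remaining error.
        candidates = [i for i in candidates if errors[i][t] <= best + eps[t]]
        if len(candidates) == 1:
            break
    # Any survivors left after all cases are treated as equally good.
    return rng.choice(candidates)


# Four individuals, two test cases; individual 0 is near-best on both.
population_errors = [[0.0, 0.0], [1.0, 50.0], [50.0, 1.0], [50.0, 50.0]]
parent = epsilon_lexicase_select(population_errors, random.Random(0))
```

Setting every ε to zero in this sketch recovers a strict pass condition, where only individuals tied for the best error on a case survive it. A nonzero ε lets near-best individuals survive a case, so later cases in the ordering still influence the outcome.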

Published In

GECCO '16: Proceedings of the Genetic and Evolutionary Computation Conference 2016
July 2016
1196 pages
ISBN: 9781450342063
DOI: 10.1145/2908812
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. genetic programming
  2. parent selection
  3. regression
  4. system identification

Qualifiers

  • Research-article

Funding Sources

  • ACI
  • NSF

Conference

GECCO '16: Genetic and Evolutionary Computation Conference
July 20 - 24, 2016
Denver, Colorado, USA

Acceptance Rates

GECCO '16 Paper Acceptance Rate: 137 of 381 submissions, 36%
Overall Acceptance Rate: 1,669 of 4,410 submissions, 38%
