DOI: 10.1145/3341105.3373921
research-article

Leveraging asynchronous parallel computing to produce simple genetic programming computational models

Published: 30 March 2020

    Abstract

    Traditionally, reducing complexity in Machine Learning promises benefits such as less overfitting. However, complexity control in Genetic Programming (GP) often means reducing the sizes of the evolving expressions, and past literature shows that size reduction does not necessarily reduce overfitting. In fact, whether size consistently represents complexity is itself debatable. Therefore, this paper proposes the evaluation time of an evolving model (the computational time required to evaluate the model on data) as the estimate of its complexity. Evaluation time depends on the size but, crucially, also on the composition of an evolving model, and can thus distil its underlying complexity. To discourage complexity, this paper takes an innovative approach that evaluates multiple models asynchronously and concurrently. These models race to completion: those that finish earlier join the population earlier and breed further in a steady-state fashion. Consequently, the computationally simpler models, even if less accurate, get further chances to evolve before the more accurate yet expensive models join the population. Crucially, since evaluation times vary from one execution to another, this paper also shows how to significantly reduce this variation.
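    The racing scheme described above can be illustrated with a minimal sketch. It is not the paper's implementation: it replaces real parallel workers with a simulated clock so that the behaviour is reproducible, and every name in it (simulate_async_steady_state, toy_cost, toy_fitness, toy_breed) is hypothetical. A "model" here is just an integer size, its simulated evaluation time equals that size, and its error is an arbitrary toy function.

```python
import heapq
import random


def simulate_async_steady_state(initial, cost, fitness, breed,
                                workers=4, budget=50, seed=0):
    """Simulate asynchronous steady-state evaluation on a virtual clock.

    Returns (join_order, population): the models in the order they finished
    evaluating, and the evaluated population as (fitness, model) pairs.
    """
    rng = random.Random(seed)
    pending = list(initial)   # unevaluated models waiting for a free worker
    running = []              # min-heap of (finish_time, launch_index, model)
    population = []
    join_order = []
    clock = 0.0
    started = 0               # total evaluations launched so far

    def launch(model):
        nonlocal started
        # Assumption: cost(model) stands in for the measured evaluation time.
        heapq.heappush(running, (clock + cost(model), started, model))
        started += 1

    # Fill the workers with the first models.
    while pending and len(running) < workers and started < budget:
        launch(pending.pop(0))

    while running:
        # The model with the earliest finish time wins the race.
        clock, _, model = heapq.heappop(running)
        population.append((fitness(model), model))
        join_order.append(model)
        # Steady state: the freed worker immediately takes the next model,
        # bred from whatever has already joined the population.
        if started < budget:
            launch(pending.pop(0) if pending else breed(population, rng))
    return join_order, population


def toy_cost(size):
    return float(size)        # hypothetical: bigger models take longer


def toy_fitness(size):
    return abs(size - 7)      # hypothetical error, lower is better


def toy_breed(population, rng):
    # Binary tournament on error, then mutate the size by one step.
    a, b = rng.choice(population), rng.choice(population)
    parent = min(a, b)[1]
    return max(1, parent + rng.choice((-1, 1)))
```

    With three workers and initial models of sizes 5, 1 and 3 (and a budget of three evaluations), the join order is 1, 3, 5: the cheapest model enters the population first regardless of its error. With a larger budget, children of cheap models are launched, and may finish, before the expensive initial models have joined at all, which is the bias towards simplicity that the abstract describes.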
    The paper compares the proposed method with both standard GP and GP with an effective bloat control method on six challenging symbolic regression problems. The results demonstrated that the proposed asynchronous parallel GP (APGP) indeed produces individuals that are smaller, faster and more accurate than those in standard GP. While GP with bloat control (GP+BC) produced smaller individuals, it did so at the cost of lower accuracy than APGP on both training and test data, which calls into question the overall benefits of bloat control. Also, while APGP took the fewest evaluations to match the training accuracy of GP, GP+BC took the most.
    These results, and the portability of evaluation time as an estimate of complexity, encourage further experimentation and fine-tuning of this hitherto unexplored style of GP.
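    To make the notion of evaluation time concrete, the sketch below times a model over a data set. The abstract does not say how the paper reduces run-to-run variation; taking the minimum over several repeated timings is a common general-purpose technique and is used here purely as an assumption. The function name evaluation_time and the two example models are hypothetical.

```python
import math
import time


def evaluation_time(model, data, repeats=5):
    """Return the smallest wall-clock time (seconds) over `repeats` runs.

    Assumption: the minimum is used as a noise-resistant estimate; this is
    a generic timing practice, not the method from the paper.
    """
    best = math.inf
    for _ in range(repeats):
        start = time.perf_counter()
        for x in data:
            model(x)
        best = min(best, time.perf_counter() - start)
    return best


# Two expressions with the same number of operators but different composition.
def arithmetic_model(x):
    return (x + x) * x


def transcendental_model(x):
    return math.sin(math.exp(x)) * x


data = [i / 10000 for i in range(10000)]   # inputs in [0, 1)
t_arith = evaluation_time(arithmetic_model, data)
t_trans = evaluation_time(transcendental_model, data)
```

    Both example models contain two operators, so a size-based measure would treat them as equally complex, whereas their measured evaluation times can differ because the operators themselves differ in cost. The actual timings depend on the machine and are not shown here.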

    Cited By

    • (2021) Evolving simple and accurate symbolic regression models via asynchronous parallel computing. Applied Soft Computing 104:C. DOI: 10.1016/j.asoc.2021.107198. Online publication date: 1-Jun-2021

    Published In

    SAC '20: Proceedings of the 35th Annual ACM Symposium on Applied Computing
    March 2020
    2348 pages
    ISBN:9781450368667
    DOI:10.1145/3341105

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. genetic programming
    2. model complexity
    3. parallel computing

    Qualifiers

    • Research-article

    Conference

    SAC '20
    SAC '20: The 35th ACM/SIGAPP Symposium on Applied Computing
    March 30 - April 3, 2020
    Brno, Czech Republic

    Acceptance Rates

    Overall Acceptance Rate 1,650 of 6,669 submissions, 25%
