DOI: 10.1145/3067695.3082506
research-article

Econometric genetic programming outperforms traditional econometric algorithms for regression tasks

Published: 15 July 2017

Abstract

Econometric Genetic Programming (EGP) evolves multiple linear regressions through Genetic Programming (GP), which is responsible for model selection, aiming to generate high-accuracy regressions whose parameters remain potentially interpretable. It uses statistical significance as a feature selection tool, directly and efficiently identifying introns and controlling bloat. In this paper, EGP is tested against traditional feature-selection econometric algorithms in regression tasks, namely Partial Least Squares Regression, Ridge Regression and Stepwise Forward Regression, and outperforms them on all three datasets. The way EGP explores the search space of possible regressors and models is crucial to its results. EGP is carefully constructed in light of econometric theory for cross-sectional datasets, giving rigorous treatment to topics such as homoscedasticity and heteroscedasticity, statistical inference for estimated parameters, and sampling criteria. It also benefits from a mathematical proof relating accuracy and statistical significance: accuracy will only increase if the absolute value of the regressor's test statistic in a two-sided hypothesis test is higher than a predefined value.
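
The link between statistical significance and accuracy described in the abstract can be illustrated with a minimal sketch. This is not the EGP algorithm: it contains no genetic programming, and the abstract does not state which accuracy measure or which predefined value the authors use. The sketch assumes homoscedastic errors, synthetic data, and a hypothetical cutoff of |t| > 2 for feature selection. It also checks a standard textbook result that has the same shape as the abstract's claim: adding one regressor raises adjusted R² exactly when that regressor's |t| exceeds 1. All function names (`ols_t_statistics`, `adjusted_r2`, `select_significant`) are illustrative.

```python
import numpy as np


def ols_t_statistics(X, y):
    """Return OLS coefficients and t-statistics (homoscedastic errors assumed)."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (n - k)          # unbiased error-variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)     # classical covariance of beta
    return beta, beta / np.sqrt(np.diag(cov))


def adjusted_r2(X, y):
    """Adjusted R^2 of an OLS fit; X must contain an intercept column."""
    n, k = X.shape
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - (ss_res / (n - k)) / (ss_tot / (n - 1))


def select_significant(X, y, threshold=2.0):
    """Keep the intercept (column 0) plus columns with |t| > threshold.

    The threshold is a hypothetical choice, not a value taken from the paper.
    """
    _, t = ols_t_statistics(X, y)
    return [0] + [j for j in range(1, X.shape[1]) if abs(t[j]) > threshold]


# Synthetic cross-sectional data: two relevant regressors and one irrelevant one.
rng = np.random.default_rng(0)
n = 200
x1, x2, irrelevant = rng.normal(size=(3, n))
y = 1.0 + 2.0 * x1 - 3.0 * x2 + rng.normal(scale=0.5, size=n)
X = np.column_stack([np.ones(n), x1, x2, irrelevant])

kept = select_significant(X, y)
_, t_full = ols_t_statistics(X, y)

# Textbook identity: the extra column improves adjusted R^2 iff its |t| > 1.
improves = adjusted_r2(X, y) > adjusted_r2(X[:, :3], y)
print("kept columns:", kept)
print("|t| of extra column:", abs(t_full[3]), "improves adjusted R^2:", improves)
```

In this sketch a regressor that fails the significance cutoff plays the role of an intron: dropping it shortens the model without a meaningful loss of fit, which is the intuition behind using significance to control bloat.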

Cited By

  • (2019) Transfer learning in constructive induction with Genetic Programming. Genetic Programming and Evolvable Machines. DOI: 10.1007/s10710-019-09368-y. Online publication date: 5-Nov-2019

Published In

GECCO '17: Proceedings of the Genetic and Evolutionary Computation Conference Companion
July 2017
1934 pages
ISBN:9781450349390
DOI:10.1145/3067695

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. feature selection
  2. genetic programming
  3. model selection
  4. multiple regression

Qualifiers

  • Research-article

Conference

GECCO '17

Acceptance Rates

Overall Acceptance Rate 1,669 of 4,410 submissions, 38%
