Article

Estimating classifier performance with genetic programming

Authors: Leonardo Trujillo, Yuliana Martínez, Patricia Melin

EuroGP'11: Proceedings of the 14th European conference on Genetic programming

Pages 274 - 285

Published: 27 April 2011

Abstract

A fundamental task that must be addressed before classifying a set of data is choosing the proper classification method. In other words, to make a reasoned choice, a researcher must infer which classifier will achieve the best performance on the classification problem. This task is not trivial, and in practice it is mostly resolved based on personal experience and individual preference. This paper presents a methodological approach to produce estimators of classifier performance based on descriptive measures of the problem data. The proposal is to use Genetic Programming (GP) to evolve mathematical operators that take descriptors of the problem data as input and output the expected error that a particular classifier might achieve if it is used to classify the data. The approach is evaluated on a large set of 500 two-class problems of multimodal data, using a neural network for classification, and the experimental tests show that GP can produce accurate estimators of classifier performance. The results suggest that the GP approach could provide a tool that helps researchers make a reasoned decision regarding the applicability of a classifier to a particular problem.
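The idea in the abstract (evolve an expression that maps descriptors of a problem to the error a classifier is expected to achieve) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two descriptors ("overlap" and "imbalance"), the synthetic relation between descriptors and error, the function set, and the simplified evolutionary loop (truncation selection with subtree mutation only, no crossover). It is a toy symbolic-regression example, not the authors' experimental setup.

```python
import math
import random

# Function set: name -> (arity, implementation). Division is "protected"
# so that evolved expressions never raise ZeroDivisionError.
FUNCS = {
    "add": (2, lambda a, b: a + b),
    "sub": (2, lambda a, b: a - b),
    "mul": (2, lambda a, b: a * b),
    "div": (2, lambda a, b: a / b if abs(b) > 1e-9 else 1.0),
}

def random_tree(rng, n_inputs, depth):
    """Grow a random expression tree over descriptor inputs x0..x{n-1}."""
    if depth == 0 or rng.random() < 0.3:
        if rng.random() < 0.7:
            return ("x", rng.randrange(n_inputs))
        return ("const", rng.uniform(-1.0, 1.0))
    name = rng.choice(sorted(FUNCS))
    arity = FUNCS[name][0]
    return (name,) + tuple(random_tree(rng, n_inputs, depth - 1) for _ in range(arity))

def evaluate(tree, x):
    """Evaluate an expression tree on one descriptor vector x."""
    if tree[0] == "x":
        return x[tree[1]]
    if tree[0] == "const":
        return tree[1]
    fn = FUNCS[tree[0]][1]
    return fn(*(evaluate(child, x) for child in tree[1:]))

def rmse(tree, problems):
    """Fitness: RMSE between predicted and observed classifier error."""
    total = 0.0
    for descriptors, observed_error in problems:
        diff = evaluate(tree, descriptors) - observed_error
        total += diff * diff
    value = math.sqrt(total / len(problems))
    return value if math.isfinite(value) else float("inf")

def mutate(rng, tree, n_inputs, depth=2):
    """Subtree mutation: replace a randomly chosen node with a fresh subtree."""
    if tree[0] in ("x", "const") or rng.random() < 0.3:
        return random_tree(rng, n_inputs, depth)
    i = rng.randrange(1, len(tree))
    return tree[:i] + (mutate(rng, tree[i], n_inputs, depth),) + tree[i + 1:]

def evolve(problems, n_inputs, pop_size=60, generations=30, seed=0):
    """Simplified GP loop; the best individual always survives (elitism)."""
    rng = random.Random(seed)
    pop = [random_tree(rng, n_inputs, 3) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda t: rmse(t, problems))
        history.append(rmse(pop[0], problems))
        parents = pop[: pop_size // 4]
        children = [mutate(rng, rng.choice(parents), n_inputs)
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    pop.sort(key=lambda t: rmse(t, problems))
    history.append(rmse(pop[0], problems))
    return pop[0], history

def make_synthetic_problems(n, seed=1):
    """Hypothetical data: (descriptors, observed error) pairs, invented here."""
    rng = random.Random(seed)
    problems = []
    for _ in range(n):
        overlap, imbalance = rng.random(), rng.random()
        error = 0.5 * overlap * (1.0 - 0.5 * imbalance)  # made-up relation
        problems.append(([overlap, imbalance], error))
    return problems

if __name__ == "__main__":
    data = make_synthetic_problems(40)
    best, history = evolve(data, n_inputs=2)
    print("best expression:", best)
    print("RMSE per generation (first, last):", history[0], history[-1])
```

In a real study, each `(descriptors, observed_error)` pair would come from computing descriptive measures on an actual dataset and training the classifier of interest on it; the synthetic generator above only stands in for that step.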

Cited By

Martínez Y, Trujillo L, Legrand P, Galván-López E (2016). Prediction of expected performance for a genetic programming classifier. Genetic Programming and Evolvable Machines 17:4 (409-449). DOI: 10.1007/s10710-016-9265-9. Online publication date: 1-Dec-2016.
https://dl.acm.org/doi/10.1007/s10710-016-9265-9
Trujillo L, Martínez Y, Galván López E, Legrand P, Moore J (2012). A comparative study of an evolvability indicator and a predictor of expected performance for genetic programming. Proceedings of the 14th annual conference companion on Genetic and evolutionary computation (1489-1490). DOI: 10.1145/2330784.2331006. Online publication date: 7-Jul-2012.
https://dl.acm.org/doi/10.1145/2330784.2331006
Trujillo L, Martínez Y, Melin P, Lanzi P (2011). How many neurons? Proceedings of the 13th annual conference companion on Genetic and evolutionary computation (175-176). DOI: 10.1145/2001858.2001956. Online publication date: 12-Jul-2011.
https://dl.acm.org/doi/10.1145/2001858.2001956

Estimating classifier performance with genetic programming
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches

Information & Contributors

Information

Published In

EuroGP'11: Proceedings of the 14th European conference on Genetic programming

April 2011

348 pages

ISBN:9783642204067

Editors:
Sara Silva (INESC-ID Lisboa, Lisboa, Portugal)
James A. Foster (University of Idaho, Department of Biological Sciences, Moscow, ID)
Miguel Nicolau (University College Dublin, Dublin 4, Ireland)
Penousal Machado (University of Coimbra, Faculty of Sciences and Technology, Department of Informatics Engineering, Coimbra, Portugal)
Mario Giacobini (University of Torino, Department of Animal Production Epidemiology and Ecology, Grugliasco, TO, Italy)

Sponsors

The Museum of Human Anatomy ("Luigi Rolando")
HuGeF: The Human Genetics Foundation of Torino
The Museum of Criminal Anthropology ("Cesare Lombroso")
The University of Torino

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

Article

Bibliometrics

Article Metrics

Total Citations: 4
Total Downloads: 0
Downloads (Last 12 months): 0
Downloads (Last 6 weeks): 0

Reflects downloads up to 05 Feb 2025

Citations

Cited By

Trujillo L, Martínez Y, Galván-López E, Legrand P, Lanzi P (2011). Predicting problem difficulty for genetic programming applied to data classification. Proceedings of the 13th annual conference on Genetic and evolutionary computation (1355-1362). DOI: 10.1145/2001576.2001759. Online publication date: 12-Jul-2011.
https://dl.acm.org/doi/10.1145/2001576.2001759
