DOI: 10.1145/2330163.2330335
GECCO Conference Proceedings · Research article

A hyper-heuristic evolutionary algorithm for automatically designing decision-tree algorithms

Published: 07 July 2012

Abstract

Decision-tree induction is one of the most widely employed methods for extracting knowledge from data, since its representation of knowledge is intuitive and easily understood by humans. The most successful strategy for inducing decision trees, the greedy top-down approach, has been continuously improved by researchers over the years. Following recent breakthroughs in the automatic design of machine learning algorithms, this work proposes HEAD-DT, a hyper-heuristic evolutionary algorithm for automatically generating decision-tree induction algorithms. We perform extensive experiments on 20 public data sets to assess the performance of HEAD-DT, comparing it to traditional decision-tree algorithms such as C4.5 and CART. Results show that HEAD-DT can generate algorithms that significantly outperform C4.5 and CART in predictive accuracy and F-measure.
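The abstract does not enumerate HEAD-DT's actual design space, so the following is only an illustrative sketch of the general idea: an evolutionary loop searching over design choices of a greedy top-down decision-tree inducer (split criterion, minimum split size, maximum depth). All names, the gene encoding, and the toy data set are hypothetical, not HEAD-DT's real components.

```python
import math
import random
from collections import Counter

random.seed(0)

def make_data(n=200):
    """Noisy XOR-style toy dataset: ((x, y), label) pairs."""
    data = []
    for _ in range(n):
        x, y = random.random(), random.random()
        label = 1 if (x > 0.5) != (y > 0.5) else 0
        if random.random() < 0.1:           # 10% label noise
            label = 1 - label
        data.append(((x, y), label))
    return data

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

IMPURITY = {"gini": gini, "entropy": entropy}

def build_tree(data, depth, genes):
    """Greedy top-down induction whose design choices come from `genes`."""
    labels = [lab for _, lab in data]
    majority = Counter(labels).most_common(1)[0][0]
    if (depth >= genes["max_depth"] or len(data) < genes["min_split"]
            or len(set(labels)) == 1):
        return majority                      # leaf: majority-class label
    impurity = IMPURITY[genes["criterion"]]
    best = None
    for feat in (0, 1):                      # candidate (feature, threshold) splits
        for thr in (0.25, 0.5, 0.75):
            left = [d for d in data if d[0][feat] <= thr]
            right = [d for d in data if d[0][feat] > thr]
            if not left or not right:
                continue
            score = (len(left) * impurity([lab for _, lab in left])
                     + len(right) * impurity([lab for _, lab in right])) / len(data)
            if best is None or score < best[0]:
                best = (score, feat, thr, left, right)
    if best is None:
        return majority
    _, feat, thr, left, right = best
    return (feat, thr,
            build_tree(left, depth + 1, genes),
            build_tree(right, depth + 1, genes))

def predict(node, point):
    while isinstance(node, tuple):           # descend until a leaf label
        feat, thr, left, right = node
        node = left if point[feat] <= thr else right
    return node

def fitness(genes, train, valid):
    """Fitness of a candidate *algorithm design*: validation accuracy."""
    tree = build_tree(train, 0, genes)
    return sum(predict(tree, p) == lab for p, lab in valid) / len(valid)

def random_genes():
    return {"criterion": random.choice(["gini", "entropy"]),
            "min_split": random.choice([2, 10, 30]),
            "max_depth": random.choice([1, 2, 3, 4])}

def mutate(genes):
    child = dict(genes)
    key = random.choice(list(child))
    child[key] = random_genes()[key]         # resample one design component
    return child

data = make_data()
train, valid = data[:140], data[140:]
population = [random_genes() for _ in range(8)]
for generation in range(10):
    population.sort(key=lambda g: -fitness(g, train, valid))
    survivors = population[:4]               # truncation selection
    population = survivors + [mutate(random.choice(survivors)) for _ in range(4)]
best = max(population, key=lambda g: fitness(g, train, valid))
```

The key point the sketch illustrates is the hyper-heuristic level of search: individuals are not trees but tree-induction *algorithms*, and fitness is measured by running the induced tree on held-out data.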




Published In

GECCO '12: Proceedings of the 14th Annual Conference on Genetic and Evolutionary Computation
July 2012, 1396 pages
ISBN: 9781450311779
DOI: 10.1145/2330163

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. decision trees
      2. evolutionary algorithms
      3. hyper-heuristics

      Conference

GECCO '12: Genetic and Evolutionary Computation Conference
July 7–11, 2012
Philadelphia, Pennsylvania, USA

      Acceptance Rates

Overall acceptance rate: 1,669 of 4,410 submissions (38%)


Cited By

• (2024) An Experimental Analysis on Automated Machine Learning for Software Defect Prediction. 2024 IEEE Congress on Evolutionary Computation (CEC), pp. 1–8. DOI: 10.1109/CEC60901.2024.10611946
• (2023) Development of rapid and effective risk prediction models for stroke in the Chinese population: a cross-sectional study. BMJ Open, 13(3):e068045. DOI: 10.1136/bmjopen-2022-068045
• (2023) Towards improving decision tree induction by combining split evaluation measures. Knowledge-Based Systems, 277:110832. DOI: 10.1016/j.knosys.2023.110832
• (2023) Automatic design of machine learning via evolutionary computation: A survey. Applied Soft Computing, 143:110412. DOI: 10.1016/j.asoc.2023.110412
• (2022) Metaheuristics for data mining: survey and opportunities for big data. Annals of Operations Research, 314(1):117–140. DOI: 10.1007/s10479-021-04496-0
• (2022) A Metaheuristic Perspective on Learning Classifier Systems. In Metaheuristics for Machine Learning, pp. 73–98. DOI: 10.1007/978-981-19-3888-7_3
• (2021) Hyper Heuristic Evolutionary Approach for Constructing Decision Tree Classifiers. Journal of Information and Communication Technology, 20(2):249–276. DOI: 10.32890/jict2021.20.2.5
• (2021) Feature Selection and Deep Learning for Deterioration Prediction of the Bridges. Journal of Performance of Constructed Facilities, 35(6). DOI: 10.1061/(ASCE)CF.1943-5509.0001653
• (2019) Metaheuristics for data mining. 4OR. DOI: 10.1007/s10288-019-00402-4
• (2018) Active Learning of Regular Expressions for Entity Extraction. IEEE Transactions on Cybernetics, 48(3):1067–1080. DOI: 10.1109/TCYB.2017.2680466
