Boosting meta-learning with simulated data complexity measures

Published: 01 January 2020

Abstract

Meta-Learning has been widely used in recent years to support the recommendation of the most suitable machine learning algorithm(s) and hyperparameters for new datasets. Traditionally, a meta-base is created containing meta-features extracted from several datasets, along with the performance of a pool of machine learning algorithms when applied to these datasets. The meta-features must describe essential aspects of the dataset and distinguish different problems and solutions. However, for Meta-Learning to be computationally efficient, the extraction of the meta-feature values must also have a low computational cost, given the trade-off between the time spent running all the algorithms and the time required to extract the meta-features. One class of measures with successful results in the characterization of classification datasets is concerned with estimating the underlying complexity of the classification problem. These data complexity measures take into account the overlap between classes imposed by the feature values, the separability of the classes, and the distribution of the instances within the classes. However, extracting these measures from datasets usually has a high computational cost. In this paper, we propose an empirical approach designed to decrease the computational cost of computing the data complexity measures while preserving their descriptive ability. The proposal consists of a novel Meta-Learning system able to predict the values of the data complexity measures for a dataset using simpler meta-features as input. In an extensive set of experiments, we show that the predictive performance achieved by Meta-Learning systems that use the predicted data complexity measures is similar to the performance obtained with the original data complexity measures, while the computational cost involved in their computation is significantly reduced.
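To illustrate the idea described in the abstract, the sketch below shows how such a system could be assembled: a regressor is trained on a meta-base that maps simple, cheap meta-features to complexity-measure values computed offline, and is then used to predict the measures for a new dataset instead of computing them directly. This is a minimal sketch under stated assumptions; the meta-features, measure names, regressor choice, and random data are illustrative placeholders, not the authors' exact pipeline.

```python
# Minimal sketch (not the authors' exact pipeline): train a meta-regressor that
# predicts data complexity measures (e.g., an overlap measure such as F1 and a
# neighborhood measure such as N1) from cheap, simple meta-features
# (e.g., number of instances, attributes, classes, class entropy).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical meta-base: one row per training dataset.
# Columns of X_simple: [n_instances, n_attributes, n_classes, class_entropy]
X_simple = rng.random((200, 4))
# Targets: values of two complexity measures computed offline, once,
# on the training datasets (placeholders here).
y_complexity = rng.random((200, 2))

# One regressor per complexity measure; Random Forest is an illustrative choice.
meta_regressors = []
for j in range(y_complexity.shape[1]):
    reg = RandomForestRegressor(n_estimators=200, random_state=0)
    scores = cross_val_score(reg, X_simple, y_complexity[:, j],
                             scoring="neg_mean_absolute_error", cv=5)
    print(f"measure {j}: CV MAE = {-scores.mean():.3f}")
    meta_regressors.append(reg.fit(X_simple, y_complexity[:, j]))

# For a new dataset, extract only the cheap meta-features and predict the
# complexity measures, avoiding their expensive direct computation.
x_new = rng.random((1, 4))
predicted_measures = [reg.predict(x_new)[0] for reg in meta_regressors]
print("predicted complexity measures:", predicted_measures)
```

In practice, the predicted measures would then serve as meta-features in a downstream Meta-Learning system for algorithm recommendation, which is the setting the paper evaluates.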


Cited By

  • (2024) Data Complexity: A New Perspective for Analyzing the Difficulty of Defect Prediction Tasks. ACM Transactions on Software Engineering and Methodology 33(6), 1–45. DOI: 10.1145/3649596. Online publication date: 27-Jun-2024.
  • (2023) Investigating the Performance of Data Complexity & Instance Hardness Measures as A Meta-Feature in Overlapping Classes Problem. Proceedings of the 2023 7th International Conference on Cloud and Big Data Computing, 1–9. DOI: 10.1145/3616131.3616132. Online publication date: 17-Aug-2023.


        Published In

        Intelligent Data Analysis, Volume 24, Issue 5, 2020, 259 pages

        Publisher

        IOS Press, Netherlands

        Author Tags

        1. Meta-learning
        2. meta-features
        3. complexity measures

        Qualifiers

        • Research-article
