
Automatic recommendation of feature selection algorithms based on dataset characteristics

Published: 15 December 2021

Abstract

Feature selection in real-world data mining problems is essential to make the learning task efficient and more accurate. Identifying the best feature selection algorithm among the many available is a complex activity that still relies heavily on human experts or on random trial-and-error procedures. The automated machine learning community has therefore taken steps toward automating this process. In this paper, we address the metalearning challenge of recommending feature selection algorithms by proposing a novel meta-feature engineering model. Our model considers a broad collection of meta-features that enable the study of the relationship between dataset properties and feature selection algorithm performance in terms of several criteria. We arrange the input meta-features into eight categories: (i) simple, (ii) statistical, (iii) information-theoretical, (iv) complexity, (v) landmarking, (vi) based on symbolic models, (vii) based on images, and (viii) based on complex networks (graphs). The target meta-features emerge from a multi-criteria performance measure, built on five individual performance indexes, that assesses feature selection methods grounded in information, distance, dependence, consistency, and precision measures. We evaluate our proposal using a recently developed framework that extracts the input meta-features from 213 benchmark datasets and ranks the assessed feature selection algorithms to fill in the target meta-features in meta-bases. This evaluation uses five state-of-the-art classification methods to induce recommendation models from the meta-bases: C4.5, Random Forest, XGBoost, ANN, and SVM. The results show that an average accuracy of up to 90% can be reached by applying our meta-feature engineering model. This work is the first to provide, through an extensive empirical evaluation, a careful discussion of the strengths and limitations of more than 160 meta-features. These meta-features, while designed to aid the task of feature selection algorithm recommendation, can readily be employed in other metalearning scenarios. We therefore believe our findings are a valuable contribution to the fields of automated machine learning and data mining, as well as to the feature extraction and pattern recognition communities.
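The recommendation loop the abstract describes, extracting input meta-features from datasets, labeling each dataset with its best-performing feature selection algorithm (the target meta-feature), and inducing a recommendation model from the resulting meta-base, can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: the function names (`meta_features`, `recommend`), the four meta-features computed, the 1-nearest-neighbor meta-model, and the algorithm names in the toy meta-base are all assumptions made for the sketch.

```python
# Minimal sketch of a metalearning recommendation loop (illustrative only).
import math
from statistics import pvariance

def meta_features(X, y):
    """Extract a few 'simple', 'statistical', and 'information-theoretical'
    input meta-features from a dataset (X: list of rows, y: class labels)."""
    n, d = len(X), len(X[0])
    counts = {}
    for label in y:
        counts[label] = counts.get(label, 0) + 1
    # Class entropy (information-theoretical) and mean feature variance (statistical).
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    mean_var = sum(pvariance(col) for col in zip(*X)) / d
    return [n, d, entropy, mean_var]

def recommend(meta_base, new_meta):
    """1-NN meta-model: recommend the feature selection algorithm that
    performed best on the meta-base dataset most similar to the new one."""
    best = min(meta_base, key=lambda entry: math.dist(entry[0], new_meta))
    return best[1]  # the winning algorithm's name (target meta-feature)

# Meta-base: (input meta-features, best feature selection algorithm) pairs,
# filled here with made-up entries purely for illustration.
meta_base = [
    ([100, 5, 1.0, 0.2], "relief"),
    ([5000, 200, 2.3, 4.1], "info_gain"),
    ([300, 20, 0.9, 0.5], "cfs"),
]
print(recommend(meta_base, [250, 18, 1.1, 0.4]))  # nearest entry wins: cfs
```

In the paper, the meta-model is a full classifier (C4.5, Random Forest, XGBoost, ANN, or SVM) trained over 160+ meta-features from 213 datasets; the 1-NN stand-in above only makes the data flow of meta-base construction and querying concrete.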

Highlights

A novel meta-feature engineering model recommends feature selection algorithms.
The proposal obtains promising results from 213 datasets with hit rates of up to 90%.
Simple, landmarking, image-based, and graph-based input meta-features stand out.
A multi-criteria performance measure rigorously assesses candidate algorithms.
Chains of binary or multiclass classifiers can efficiently rank candidate algorithms.
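The last highlight, ranking candidate algorithms with a chain of classifiers, can be sketched as follows. This is a hypothetical illustration of the chaining idea only: the classifier at each rank position predicts the next algorithm given the meta-features and the partial ranking built so far. The stub lambdas stand in for trained multiclass models, and all names are assumptions of the sketch, not the paper's code.

```python
# Hypothetical sketch: ranking candidate algorithms via a classifier chain.
def rank_with_chain(chain, meta_feats, candidates):
    ranking, remaining = [], list(candidates)
    for clf in chain:
        # Each classifier sees the meta-features plus the partial ranking.
        choice = clf(meta_feats, ranking)
        if choice not in remaining:   # fall back if the prediction is already placed
            choice = remaining[0]
        ranking.append(choice)
        remaining.remove(choice)
    return ranking + remaining        # append any algorithms left unranked

# Stub "classifiers" standing in for trained multiclass models.
chain = [
    lambda mf, partial: "info_gain" if mf["n_features"] > 50 else "relief",
    lambda mf, partial: "cfs",
]
print(rank_with_chain(chain, {"n_features": 120},
                      ["relief", "info_gain", "cfs"]))
# -> ['info_gain', 'cfs', 'relief']
```

The design point of the chain is that each position conditions on the choices already made, so a full ranking over k algorithms decomposes into k tractable classification problems instead of one prediction over k! permutations.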




            Published In

Expert Systems with Applications: An International Journal, Volume 185, Issue C (December 2021), 1550 pages

            Publisher

Pergamon Press, Inc., United States


            Author Tags

            1. Feature engineering
            2. Characterization measures
            3. Algorithm selection
            4. Recommendation system
            5. Filter
            6. Wrapper

            Qualifiers

            • Research-article


            Cited By

• (2024) Meta-learning for vessel time series data imputation method recommendation. Expert Systems with Applications 251(C). doi:10.1016/j.eswa.2024.124016. Online publication date: 24-Jul-2024.
• (2023) Hybrid Feature Selection Framework for Building Resource Efficient Intrusion Detection Systems Model in the Internet of Things. Proceedings of the 8th International Conference on Sustainable Information Engineering and Technology, pp. 16–22. doi:10.1145/3626641.3626923. Online publication date: 24-Oct-2023.
• (2023) Eight years of AutoML: categorisation, review and trends. Knowledge and Information Systems 65(12), 5097–5149. doi:10.1007/s10115-023-01935-1. Online publication date: 1-Dec-2023.
• (2022) Empirical study on meta-feature characterization for multi-objective optimization problems. Neural Computing and Applications 34(19), 16255–16273. doi:10.1007/s00521-022-07302-5. Online publication date: 1-Oct-2022.
• (2022) An Ontological Approach for Recommending a Feature Selection Algorithm. Web Engineering, pp. 300–314. doi:10.1007/978-3-031-09917-5_20. Online publication date: 5-Jul-2022.
