Abstract
Many strategies have been exploited for the task of feature selection, in an effort to identify more compact and better quality feature subsets. A number of evaluation metrics have been developed recently that can judge the quality of a given feature subset as a whole, rather than assessing the qualities of individual features. Effective techniques of a stochastic nature have also emerged, allowing good quality solutions to be discovered without resorting to exhaustive search. This paper provides a comprehensive review of the most recent methods for feature selection that originated from nature-inspired meta-heuristics, where the more classic approaches such as genetic algorithms and ant colony optimisation are also included for comparison. A good number of the reviewed methodologies have been significantly modified in the present work, in order to systematically support generic subset-based evaluators and higher dimensional problems. Such modifications are carried out because the original studies either work exclusively with certain subset evaluators (e.g., rough set-based methods), or are limited to specific problem domains. A total of ten different algorithms are examined, and their mechanisms and workflows are summarised in a unified manner. The performance of the reviewed approaches is compared using high dimensional, real-valued benchmark data sets. The selected feature subsets are also used to build classification models, in an effort to further validate their efficacy.
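The two ideas the abstract combines, scoring a feature subset as a whole and searching the subset space stochastically instead of exhaustively, can be illustrated with a minimal sketch. It is not any of the ten reviewed algorithms: the search is a plain random bit-flip hill climber, and the evaluator `toy_subset_merit`, the `REDUNDANCY_GROUPS` layout, and the `SIZE_PENALTY` value are hypothetical stand-ins invented for illustration. In the toy setup, features 0 and 3 are assumed to carry the same information, features 1 and 2 are each informative, and the rest are noise.

```python
import random
from typing import Callable, FrozenSet, Tuple

Subset = FrozenSet[int]

# Hypothetical toy problem: each group holds features that carry the same
# information, so selecting more than one member of a group adds nothing.
REDUNDANCY_GROUPS = [{0, 3}, {1}, {2}]
SIZE_PENALTY = 0.05  # assumed penalty per selected feature, favours compact subsets


def toy_subset_merit(subset: Subset) -> float:
    """Score a subset as a whole: information groups covered minus a size penalty."""
    covered = sum(1 for group in REDUNDANCY_GROUPS if group & subset)
    return covered - SIZE_PENALTY * len(subset)


def stochastic_subset_search(
    n_features: int,
    evaluate: Callable[[Subset], float],
    iterations: int = 500,
    seed: int = 0,
) -> Tuple[Subset, float]:
    """Random bit-flip hill climbing over feature subsets.

    Works with any evaluator that maps a subset to a score (higher is better).
    """
    rng = random.Random(seed)
    # Start from a random subset in which each feature is included with probability 0.5.
    current: Subset = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    current_score = evaluate(current)
    for _ in range(iterations):
        flip = rng.randrange(n_features)
        candidate = current ^ {flip}  # toggle membership of one feature
        score = evaluate(candidate)
        if score > current_score:  # accept strict improvements only
            current, current_score = candidate, score
    return current, current_score


if __name__ == "__main__":
    best, merit = stochastic_subset_search(8, toy_subset_merit)
    print(sorted(best), merit)
```

Because the evaluator sees the whole subset, it can penalise redundancy between features 0 and 3, which a per-feature ranking would miss. The search only needs the `evaluate` callable, so a different subset-based evaluator can be swapped in without changing the search loop.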
Cite this article
Diao, R., Shen, Q. Nature inspired feature selection meta-heuristics. Artif Intell Rev 44, 311–340 (2015). https://doi.org/10.1007/s10462-015-9428-8
DOI: https://doi.org/10.1007/s10462-015-9428-8