Abstract
Feature selection (FS) is an integral part of many machine learning problems, as it yields more accurate and time-efficient classification models. In recent times, many new FS algorithms have been proposed that combine well-established algorithms to overcome the drawbacks of the constituent methods. The usual way of combining is to run the constituents consecutively or simultaneously; in many cases such rudimentary combinations fail to properly exploit the advantages of the individual algorithms, which motivates an alternative approach. In the proposed method, each constituent algorithm first runs to completion without interruption and produces its own candidate solutions. The most dominant features are selected directly, and the remaining features are combined using the concept of a histogram, with each feature assigned a fuzzy weightage based on the quality of the candidate solutions in which it appears. We combine the outcomes of three popular algorithms with complementary exploitation–exploration trade-offs, namely the genetic algorithm (GA), binary particle swarm optimisation (BPSO) and ant colony optimisation (ACO). The proposed FS method is evaluated on 14 popular UCI datasets, and the results are compared with well-known FS models such as the gravitational search algorithm, histogram-based multi-objective GA, GA, BPSO and ACO; the comparison shows that our algorithm outperforms the others.
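To make the combination step concrete, the following is a minimal sketch of how the histogram-based fuzzy weighting described above could be realised. It assumes that each constituent algorithm (GA, BPSO, ACO) has already run to completion and returns its final population as binary feature masks together with the classification accuracy of each candidate; the function names (`histogram_fuzzy_score`, `select_features`) and the exact normalisation are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def histogram_fuzzy_score(populations, fitnesses, n_features):
    """Sketch: fuzzy weightage of each feature from GA/BPSO/ACO outputs.

    populations: list of (pop_size, n_features) binary arrays, one per algorithm
    fitnesses:   list of (pop_size,) arrays with each candidate's accuracy
    """
    score = np.zeros(n_features)
    for pop, fit in zip(populations, fitnesses):
        # Histogram count of every feature, weighted by the quality
        # (fitness) of the candidate solution in which it appears.
        score += (pop * fit[:, None]).sum(axis=0)
    return score / score.max()  # normalised fuzzy membership in [0, 1]

def select_features(populations, fitnesses, n_features, n_select):
    membership = histogram_fuzzy_score(populations, fitnesses, n_features)
    # Keep the most dominant features, i.e. those with the highest membership.
    return np.argsort(membership)[::-1][:n_select]
```

In this illustrative reading, the three populations come from GA, BPSO and ACO run independently to completion, and a feature's dominance is simply its fitness-weighted histogram count, mirroring the abstract's description of weighting features by the quality of the candidate solutions in which they appear.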

Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Ghosh, M., Guha, R., Singh, P.K. et al. A histogram based fuzzy ensemble technique for feature selection. Evol. Intel. 12, 713–724 (2019). https://doi.org/10.1007/s12065-019-00279-6