research-article

An oscillatory particle swarm optimization feature selection algorithm for hybrid data based on mutual information entropy

Published: 01 February 2024

Abstract

Hybrid data lead to overfitting in machine learning models, which may reduce the accuracy of classification. Feature selection can not only reduce the computational cost of processing hybrid data but also improve the accuracy of classification. The particle swarm optimization (PSO) algorithm has clear advantages in feature selection. This paper presents an oscillatory particle swarm optimization feature selection algorithm for hybrid data based on mutual information entropy. First, a new distance function on the object set of a hybrid information system (HIS) is built, which yields a tolerance relation on this object set. Then, mutual information entropy is presented to measure the uncertainty of the HIS. On this basis, the maximum-relevance and minimal-redundancy model (MRMR model) for the HIS is proposed. Based on the MRMR model, a feature selection algorithm (denoted as MRMR) for the HIS is naturally designed. As the integration of the MRMR model and PSO can effectively explore all possible feature subsets, an oscillatory particle swarm optimization algorithm based on the MRMR model (denoted as OPSO-MRMR) for the HIS is also designed. Moreover, the MRMR model is utilized to define a fitness function that evaluates the quality of particles. The particle position update process is modified by means of a two-order oscillatory equation. Finally, an experimental analysis is conducted to compare the two designed algorithms with five other algorithms. The statistical analysis of classification accuracy and F1 score shows that OPSO-MRMR improves precision by 5.8% and 10.7% compared to the other six algorithms, respectively.
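The maximum-relevance, minimal-redundancy idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the paper defines mutual information entropy on a tolerance relation over hybrid data, whereas the sketch below swaps in classical Shannon mutual information on purely discrete features. The function names `mutual_information` and `mrmr_select` and the toy data are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Shannon mutual information I(X;Y) in bits for two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * math.log2(c * n / (px[x] * py[y]))
        for (x, y), c in pxy.items()
    )


def mrmr_select(features, labels, k):
    """Greedy selection: add the feature with the best relevance minus mean redundancy."""
    relevance = {name: mutual_information(col, labels) for name, col in features.items()}
    selected = []

    def score(name):
        if not selected:
            return relevance[name]
        redundancy = sum(
            mutual_information(features[name], features[s]) for s in selected
        ) / len(selected)
        return relevance[name] - redundancy

    while len(selected) < min(k, len(features)):
        candidates = [name for name in features if name not in selected]
        selected.append(max(candidates, key=score))
    return selected


# Toy data (hypothetical): the label is a + b, "a_copy" duplicates "a",
# and "noise" is independent of the label.
features = {
    "a": [0, 0, 1, 1, 0, 0, 1, 1],
    "a_copy": [0, 0, 1, 1, 0, 0, 1, 1],
    "b": [0, 1, 0, 1, 0, 1, 0, 1],
    "noise": [0, 0, 0, 0, 1, 1, 1, 1],
}
labels = [0, 1, 1, 2, 0, 1, 1, 2]

print(mrmr_select(features, labels, 2))
```

On this toy data the redundancy term steers the second pick away from the duplicated column, even though the duplicate is just as relevant to the label as the feature it copies.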

Highlights

Mutual information entropy is presented to measure the uncertainty of a hybrid information system.
A maximum-relevance and minimal-redundancy model (MRMR model) is put forward.
A feature selection algorithm (denoted as MRMR) is designed.
An oscillatory particle swarm optimization algorithm based on the MRMR model (denoted as OPSO-MRMR) is designed.
An experimental analysis is conducted to compare the two designed algorithms with five other algorithms.
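
The oscillatory position update mentioned in the abstract and highlights can be sketched as follows. The abstract does not give the exact equation, so this assumes one commonly cited form of the two-order oscillating PSO velocity update, in which the previous position enters the cognitive and social terms through coefficients `xi1` and `xi2`. The sigmoid transfer used to turn velocities into a binary feature mask, the clamp `v_max`, and all parameter defaults are likewise assumptions, not details taken from the paper.

```python
import math
import random


def oscillatory_step(x, x_prev, v, pbest, gbest,
                     w=0.7, c1=2.0, c2=2.0, xi1=0.5, xi2=0.5, rng=random):
    """One assumed two-order oscillating velocity update for a single particle.

    With xi1 = xi2 = 0 this reduces to the standard PSO velocity update.
    """
    new_v = []
    for d in range(len(x)):
        r1, r2 = rng.random(), rng.random()
        cognitive = pbest[d] - (1 + xi1) * x[d] + xi1 * x_prev[d]
        social = gbest[d] - (1 + xi2) * x[d] + xi2 * x_prev[d]
        new_v.append(w * v[d] + c1 * r1 * cognitive + c2 * r2 * social)
    return new_v


def binarize(v, v_max=6.0, rng=random):
    """Sigmoid transfer: select feature d with probability 1 / (1 + exp(-v[d]))."""
    bits = []
    for vd in v:
        vd = max(-v_max, min(v_max, vd))  # clamp to avoid overflow in exp
        prob = 1.0 / (1.0 + math.exp(-vd))
        bits.append(1 if rng.random() < prob else 0)
    return bits


# Example call with hypothetical values for a three-feature particle.
velocity = oscillatory_step(
    x=[1.0, 0.0, 1.0], x_prev=[0.0, 0.0, 1.0], v=[0.0, 0.0, 0.0],
    pbest=[1.0, 1.0, 0.0], gbest=[1.0, 1.0, 1.0],
)
mask = binarize(velocity)
```

In a full feature selection loop, each binary mask would be scored by a fitness function (in the paper, one built from the MRMR model) to update the personal and global bests.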


Cited By

  • (2024) Single-objective and multi-objective mixed-variable grey wolf optimizer for joint feature selection and classifier parameter tuning, Applied Soft Computing 165:C, 10.1016/j.asoc.2024.112121. Online publication date: 1-Nov-2024

Information

Published In

Applied Soft Computing  Volume 152, Issue C
Feb 2024
1017 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Feature selection
  2. Hybrid data
  3. MRMR model
  4. Fitness function
  5. OPSO-MRMR

Qualifiers

  • Research-article
