Abstract
Searching for an optimal feature subset in a high-dimensional feature space is an NP-complete problem; hence, traditional optimization algorithms are inefficient on large-scale feature selection problems, and meta-heuristic algorithms are widely adopted to solve such problems efficiently. This study proposes a regression-based particle swarm optimization for the feature selection problem. The proposed algorithm can increase population diversity and avoid entrapment in local optima by improving the jump ability of flying particles. Data sets from the UCI machine learning databases are used to evaluate the effectiveness of the proposed approach, with classification accuracy as the criterion for classifier performance. Results show that the proposed approach outperforms both genetic algorithms and sequential search algorithms.
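For orientation, the kind of wrapper search summarized above can be illustrated with a minimal sketch of a standard binary particle swarm optimizer, in which each particle is a 0/1 mask over the features and its fitness is the classification accuracy obtained with the selected features. This is the generic binary PSO with a sigmoid transfer function, not the regression-based variant proposed in this study. The function names (`loo_1nn_accuracy`, `binary_pso`, `make_toy_data`), the parameter values, the leave-one-out 1-nearest-neighbor classifier, and the synthetic data are all assumptions made for illustration.

```python
import math
import random


def loo_1nn_accuracy(X, y, mask):
    """Leave-one-out 1-NN accuracy using only features where mask[j] == 1."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty subset cannot classify anything
    correct = 0
    for i in range(len(X)):
        best_dist, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_dist:
                best_dist, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def binary_pso(X, y, n_particles=10, n_iter=30, w=0.7, c1=1.5, c2=1.5,
               v_max=4.0, seed=0):
    """Generic binary PSO (illustrative parameter values, not from the paper)."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(n_particles)]
    vel = [[0.0] * n_feat for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [loo_1nn_accuracy(X, y, p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            for j in range(n_feat):
                # Velocity update: inertia + cognitive pull + social pull.
                v = (w * vel[i][j]
                     + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                     + c2 * rng.random() * (gbest[j] - pos[i][j]))
                vel[i][j] = max(-v_max, min(v_max, v))
                # The sigmoid maps velocity to the probability that the bit is 1.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            fit = loo_1nn_accuracy(X, y, pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


def make_toy_data(n=40, seed=1):
    """Synthetic data: feature 0 carries the class signal, features 1-4 are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        row = [label * 2.0 + rng.gauss(0, 0.3)]
        row += [rng.gauss(0, 1.0) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    mask, acc = binary_pso(X, y)
    print("selected feature mask:", mask, "LOO accuracy:", acc)
```

In this baseline the velocity clamp `v_max` keeps every bit-flip probability away from 0 and 1, which is one simple way to preserve some diversity in the swarm. The approach proposed here targets the same concern by improving the jump ability of the particles.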
Acknowledgments
This work was partially supported by the National Science Council of Taiwan, R.O.C., under Grant Number NSC 100-2410-E-030-007-MY2.
Cite this article
Chen, KH., Chen, LF. & Su, CT. A new particle swarm feature selection method for classification. J Intell Inf Syst 42, 507–530 (2014). https://doi.org/10.1007/s10844-013-0295-y