
A fast feature weighting algorithm of data gravitation classification

Published: 01 January 2017

Abstract

The data gravitation classification (DGC) model is a new classification method that has attracted considerable research interest in recent years. Feature weights are the key parameters of DGC models, because classification performance is highly sensitive to them. Existing DGC models obtain optimised feature weights with wrapper-like algorithms; although such algorithms yield high classification accuracy, they also make the models computationally expensive. In this study, we propose a fast feature weighting algorithm for DGC models, called FFW-DGC. We use the concepts of feature discrimination and feature redundancy to measure the importance of a feature, construct two fuzzy subsets to represent these concepts, and combine the two subsets to compute the feature weights used in gravitational computation. We evaluate FFW-DGC on 25 standard data sets and 22 imbalanced data sets, comparing it with 11 classifiers, including the swarm-intelligence-based DGC model (PSO-DGC). The competitive results show that FFW-DGC not only attains high classification accuracy but also runs hundreds of times faster than PSO-DGC.
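The abstract summarizes the method without formulas, so the sketch below is only a minimal illustration of the general idea of weighted data gravitation classification, not the paper's FFW-DGC algorithm. It assumes the common DGC formulation in which each training point attracts a query with force proportional to 1/d^2 under a feature-weighted distance, and it uses a Fisher-style variance ratio as a stand-in for feature discrimination, mean absolute correlation as a stand-in for feature redundancy, and the hypothetical combination w = discrimination * (1 - redundancy) in place of the paper's fuzzy-subset construction.

```python
import numpy as np

def feature_weights(X, y, eps=1e-12):
    """Illustrative filter-style weights (a stand-in for the paper's
    fuzzy-subset construction). Discrimination: Fisher-style ratio of
    between-class to within-class variance, per feature. Redundancy:
    mean absolute Pearson correlation with the remaining features."""
    classes = np.unique(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - overall_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    disc = between / (within + eps)
    disc = disc / (disc.max() + eps)              # scale to [0, 1]
    corr = np.abs(np.corrcoef(X, rowvar=False))
    np.fill_diagonal(corr, 0.0)
    red = corr.mean(axis=1)                       # mean |correlation| with others
    return disc * (1.0 - red)                     # hypothetical combination rule

def dgc_predict(X_train, y_train, X_test, w, eps=1e-12):
    """Weighted data gravitation: each training point attracts a query with
    force ~ 1 / d_w^2; predict the class exerting the largest total force."""
    classes = np.unique(y_train)
    preds = []
    for x in X_test:
        d2 = ((X_train - x) ** 2 * w).sum(axis=1)  # weighted squared distance
        force = 1.0 / (d2 + eps)
        totals = [force[y_train == c].sum() for c in classes]
        preds.append(classes[int(np.argmax(totals))])
    return np.array(preds)

# Toy usage on synthetic two-class data.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 1.0, (50, 4)), rng.normal(2.0, 1.0, (50, 4))])
y = np.repeat([0, 1], 50)
w = feature_weights(X, y)
print(w, dgc_predict(X, y, X[:3], w))
```

Because such weights are computed once from simple statistics rather than searched for by a wrapper such as PSO, a filter-style weighting of this kind is what makes an approach like FFW-DGC fast relative to PSO-DGC.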


      Information

      Published In

      Information Sciences: an International Journal, Volume 375, Issue C
      January 2017
      314 pages

      Publisher

      Elsevier Science Inc.

      United States

      Publication History

      Published: 01 January 2017

      Author Tags

      1. Classification
      2. Data gravitation
      3. Feature selection
      4. Machine learning

      Qualifiers

      • Research-article


      Cited By

      • (2023) Weighting Approaches in Data Mining and Knowledge Discovery: A Review. Neural Processing Letters 55:8, 10393-10438. https://doi.org/10.1007/s11063-023-11332-y
      • (2022) Ranking-based biased learning swarm optimizer for large-scale optimization. Information Sciences: an International Journal 493:C, 120-137. https://doi.org/10.1016/j.ins.2019.04.037
      • (2022) Gravitation balanced multiple kernel learning for imbalanced classification. Neural Computing and Applications 34:16, 13807-13823. https://doi.org/10.1007/s00521-022-07187-4
      • (2021) Auto-weighted concept factorization for joint feature map and data representation learning. Journal of Intelligent & Fuzzy Systems 41:1, 69-81. https://doi.org/10.3233/JIFS-200298
      • (2019) A feature selection approach combining neural networks with genetic algorithms. AI Communications 32:5-6, 361-372. https://doi.org/10.3233/AIC-190626
      • (2019) A comparative study of neural-network feature weighting. Artificial Intelligence Review 52:1, 469-493. https://doi.org/10.1007/s10462-019-09700-z
      • (2018) Accelerating nearest neighbor partitioning neural network classifier based on CUDA. Engineering Applications of Artificial Intelligence 68:C, 53-62. https://doi.org/10.1016/j.engappai.2017.10.023
      • (2017) A general feature-weighting function for classification problems. Expert Systems with Applications 72:C, 177-188. https://doi.org/10.1016/j.eswa.2016.12.016
