Feature subset selection using a new definition of classifiability

Published: 01 June 2003
  • Abstract

    The performance of most practical classifiers improves when correlated or irrelevant features are removed. Machine-based classification is thus often preceded by subset selection: a procedure that identifies the relevant features of a high-dimensional data set. At present, the most widely used subset selection technique is the so-called "wrapper" approach, in which a search algorithm identifies candidate subsets and the actual classifier is used as a "black box" to evaluate the fitness of each subset. Evaluating the fitness of a subset, however, requires cross-validation or another resampling-based procedure for error estimation, necessitating the construction of a large number of classifiers for each subset. This significant computational burden makes the wrapper approach impractical when a large number of features are present.

    In this paper, we present an approach to subset selection based on a novel definition of the classifiability of a given data set. The classifiability measure we propose characterizes the relative ease with which some labeled data can be classified. We use this definition of classifiability to systematically add the feature that leads to the largest increase in classifiability. The proposed approach does not require the construction of classifiers at each step and therefore does not suffer from as high a computational burden as a wrapper approach. Our results over several different data sets indicate that the subsets obtained are at least as good as those obtained with the wrapper approach.
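    The search procedure the abstract describes can be sketched as greedy forward selection driven by a filter score computed directly on the data, with no classifier trained during the search. The paper's actual classifiability measure is not reproduced here; the `separation_score` below is a hypothetical stand-in (a simple between-class/within-class scatter ratio) used purely to illustrate the loop structure.

    ```python
    # Sketch of greedy forward feature selection driven by a generic
    # "classifiability"-style score. The score used here is a stand-in,
    # not the measure proposed in the paper.
    import numpy as np

    def separation_score(X, y):
        """Stand-in score: ratio of between-class to within-class scatter
        over the currently selected feature columns."""
        classes = np.unique(y)
        overall_mean = X.mean(axis=0)
        between = sum(
            (y == c).sum() * np.sum((X[y == c].mean(axis=0) - overall_mean) ** 2)
            for c in classes
        )
        within = sum(
            np.sum((X[y == c] - X[y == c].mean(axis=0)) ** 2) for c in classes
        )
        return between / (within + 1e-12)

    def greedy_forward_selection(X, y, k):
        """Repeatedly add the feature whose inclusion most increases the
        score; unlike a wrapper, no classifier is built at any step."""
        selected, remaining = [], list(range(X.shape[1]))
        for _ in range(k):
            best_f = max(
                remaining,
                key=lambda f: separation_score(X[:, selected + [f]], y),
            )
            selected.append(best_f)
            remaining.remove(best_f)
        return selected
    ```

    Compared with a wrapper, each candidate evaluation is a single score computation rather than a full cross-validated training run, which is the source of the computational savings the abstract claims.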



    Published In

    Pattern Recognition Letters  Volume 24, Issue 9-10
    01 June 2003
    501 pages

    Publisher

    Elsevier Science Inc.

    United States

    Author Tags

    1. classification
    2. dimensionality reduction
    3. feature selection


    Cited By

    • (2022) Automatic Assessment of Quality of your Data for AI. Proceedings of the 5th Joint International Conference on Data Science & Management of Data (9th ACM IKDD CODS and 27th COMAD), 354-357. DOI: 10.1145/3493700.3493774
    • (2019) How Complex Is Your Classification Problem? ACM Computing Surveys 52(5), 1-34. DOI: 10.1145/3347711
    • (2015) An automatic extraction method of the domains of competence for learning classifiers using data complexity measures. Knowledge and Information Systems 42(1), 147-180. DOI: 10.1007/s10115-013-0700-4
    • (2013) Predicting noise filtering efficacy with data complexity measures for nearest neighbor classification. Pattern Recognition 46(1), 355-364. DOI: 10.1016/j.patcog.2012.07.009
    • (2013) Analysis of data complexity measures for classification. Expert Systems with Applications 40(12), 4820-4831. DOI: 10.1016/j.eswa.2013.02.025
    • (2012) Shared domains of competence of approximate learning models using measures of separability of classes. Information Sciences 185(1), 43-65. DOI: 10.1016/j.ins.2011.09.022
    • (2012) Linear separability and classification complexity. Expert Systems with Applications 39(9), 7796-7807. DOI: 10.1016/j.eswa.2012.01.090
    • (2011) An accumulative points/votes based approach for feature selection. Proceedings of the 16th Iberoamerican Congress on Pattern Recognition (CIARP 2011), 399-408. DOI: 10.1007/978-3-642-25085-9_47
    • (2009) Subspace clustering of images using ant colony optimisation. Proceedings of the 16th IEEE International Conference on Image Processing, 229-232.
    • (2008) Fast 3D reconstruction and recognition. Proceedings of the 8th Conference on Signal Processing, Computational Geometry and Artificial Vision, 15-21.
