
An introduction to variable and feature selection

Published: 01 March 2003

    Abstract

    Variable and feature selection have become the focus of much research in areas of application for which datasets with tens or hundreds of thousands of variables are available. These areas include text processing of internet documents, gene expression array analysis, and combinatorial chemistry. The objective of variable selection is three-fold: improving the prediction performance of the predictors, providing faster and more cost-effective predictors, and providing a better understanding of the underlying process that generated the data. The contributions of this special issue cover a wide range of aspects of such problems: providing a better definition of the objective function, feature construction, feature ranking, multivariate feature selection, efficient search methods, and feature validity assessment methods.
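    As a concrete illustration of the simplest idea mentioned above, the sketch below ranks variables individually by their absolute Pearson correlation with the target and then trains a classifier on the top-ranked subset. It is not code from the paper: the synthetic dataset, the choice of the ten top-ranked variables, and the logistic-regression classifier are illustrative assumptions, with scikit-learn used only for convenience.

```python
# Minimal sketch (illustrative, not from the paper): univariate variable
# ranking by |Pearson correlation| with the target, then evaluation of a
# classifier on the top-k variables versus on all variables.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic data: many variables, only a few informative (assumed sizes).
X, y = make_classification(n_samples=200, n_features=500,
                           n_informative=10, random_state=0)

# Filter criterion: rank each variable by its absolute correlation with y.
scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
top_k = np.argsort(scores)[::-1][:10]   # indices of the 10 best-ranked variables

# Compare prediction performance with and without variable selection.
clf = LogisticRegression(max_iter=1000)
print("all variables :", cross_val_score(clf, X, y, cv=5).mean())
print("top-10 ranked :", cross_val_score(clf, X[:, top_k], y, cv=5).mean())
```

    Note that ranking on the full dataset and then cross-validating only the final classifier, as done here for brevity, lets information from the test folds leak into the selection step; a careful assessment would redo the ranking inside each cross-validation fold.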

    References

    [1]
    E. Amaldi and V. Kann. On the approximation of minimizing non zero variables or unsatisfied relations in linear systems. Theoretical Computer Science, 209: 237-260, 1998.
    [2]
    R. Bekkerman, R. El-Yaniv, N. Tishby, and Y. Winter. Distributional word clusters vs. words for text categorization. JMLR, 3: 1183-1208 (this issue), 2003.
    [3]
    A. Ben-Hur and I. Guyon. Detecting stable clusters using principal component analysis. In M.J. Brownstein and A. Khodursky, editors, Methods In Molecular Biology, pages 159-182. Humana Press, 2003.
    [4]
    Y. Bengio and N. Chapados. Extensions to metric-based model selection. JMLR, 3: 1209- 1227 (this issue), 2003.
    [5]
    J. Bi, K. Bennett, M. Embrechts, C. Breneman, and M. Song. Dimensionality reduction via sparse support vector machines. JMLR, 3: 1229-1243 (this issue), 2003.
    [6]
    A. Blum and P. Langley. Selection of relevant features and examples in machine learning. Artificial Intelligence, 97(1-2): 245-271, December 1997.
    [7]
    B. Boser, I. Guyon, and V. Vapnik. A training algorithm for optimal margin classifiers. In Fifth Annual Workshop on Computational Learning Theory, pages 144-152, Pittsburgh, 1992. ACM.
    [8]
    L. Breiman, J. H. Friedman, R. A. Olshen, and C. J. Stone. Classification and Regression Trees. Wadsworth and Brooks, 1984.
    [9]
    R. Caruana and V. de Sa. Benefitting from the variables that variable selection discards. JMLR, 3: 1245-1264 (this issue), 2003.
    [10]
    I. Dhillon, S. Mallela, and R. Kumar. A divisive information-theoretic feature clustering algorithm for text classification. JMLR, 3: 1265-1287 (this issue), 2003.
    [11]
    T. G. Dietterich. Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation, 10(7): 1895-1924, 1998.
    [12]
    R. O. Duda, P. E. Hart, and D. G. Stork. Pattern Classification. John Wiley & Sons, USA, 2nd edition, 2001.
    [13]
    T. R. Golub et al. Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring. Science, 286: 531-537, 1999.
    [14]
    G. Forman. An extensive empirical study of feature selection metrics for text classification. JMLR, 3: 1289-1306 (this issue), 2003.
    [15]
    T. Furey, N. Cristianini, N. Duffy, D. Bednarski, M. Schummer, and D. Haussler. Support vector machine classification and validation of cancer tissue samples using microarray expression data. Bioinformatics, 16: 906-914, 2000.
    [16]
    A. Globerson and N. Tishby. Sufficient dimensionality reduction. JMLR, 3: 1307-1331 (this issue), 2003.
    [17]
    Y. Grandvalet and S. Canu. Adaptive scaling for feature selection in SVMs. In NIPS 15, 2002.
    [18]
    I. Guyon, J. Weston, S. Barnhill, and V. Vapnik. Gene selection for cancer classification using support vector machines. Machine Learning, 46(1-3): 389-422, 2002.
    [19]
    T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning. Springer series in statistics. Springer, New York, 2001.
    [20]
    T. Jebara and T. Jaakkola. Feature selection and dualities in maximum entropy discrimination. In 16th Annual Conference on Uncertainty in Artificial Intelligence, 2000.
    [21]
    K. Kira and L. Rendell. A practical approach to feature selection. In D. Sleeman and P. Edwards, editors, International Conference on Machine Learning, pages 368-377, Aberdeen, July 1992. Morgan Kaufmann.
    [22]
    R. Kohavi and G. John. Wrappers for feature selection. Artificial Intelligence, 97(1-2): 273-324, December 1997.
    [23]
    D. Koller and M. Sahami. Toward optimal feature selection. In 13th International Conference on Machine Learning, pages 284-292, July 1996.
    [24]
    Y. LeCun, J. Denker, S. Solla, R. E. Howard, and L. D. Jackel. Optimal brain damage. In D. S. Touretzky, editor, Advances in Neural Information Processing Systems II, San Mateo, CA, 1990. Morgan Kaufmann.
    [25]
    G. Monari and G. Dreyfus. Withdrawing an example from the training set: an analytic estimation of its effect on a nonlinear parameterized model. Neurocomputing Letters, 35: 195-201, 2000.
    [26]
    C. Nadeau and Y. Bengio. Inference for the generalization error. Machine Learning (to appear), 2001.
    [27]
    A. Y. Ng. On feature selection: learning with exponentially many irrelevant features as training examples. In 15th International Conference on Machine Learning, pages 404- 412. Morgan Kaufmann, San Francisco, CA, 1998.
    [28]
    A. Y. Ng and M. Jordan. Convergence rates of the voting Gibbs classifier, with application to Bayesian feature selection. In 18th International Conference on Machine Learning, 2001.
    [29]
    J. Pearl. Causality. Cambridge University Press, 2000.
    [30]
    F. Pereira, N. Tishby, and L. Lee. Distributional clustering of English words. In Proc. Meeting of the Association for Computational Linguistics, pages 183-190, 1993.
    [31]
    S. Perkins, K. Lacker, and J. Theiler. Grafting: Fast incremental feature selection by gradient descent in function space. JMLR, 3: 1333-1356 (this issue), 2003.
    [32]
    A. Rakotomamonjy. Variable selection using SVM-based criteria. JMLR, 3: 1357-1370 (this issue), 2003.
    [33]
    J. Reunanen. Overfitting in making comparisons between variable selection methods. JMLR, 3: 1371-1382 (this issue), 2003.
    [34]
    I. Rivals and L. Personnaz. MLPs (mono-layer polynomials and multi-layer perceptrons) for non-linear modeling. JMLR, 3: 1383-1398 (this issue), 2003.
    [35]
    B. Schoelkopf and A. Smola. Learning with Kernels. MIT Press, Cambridge MA, 2002.
    [36]
    D. Schuurmans. A new metric-based approach to model selection. In 9th Innovative Applications of Artificial Intelligence Conference, pages 552-558, 1997.
    [37]
    H. Stoppiglia, G. Dreyfus, R. Dubois, and Y. Oussar. Ranking a random feature for variable and feature selection. JMLR, 3: 1399-1414 (this issue), 2003.
    [38]
    R. Tibshirani. Regression shrinkage and selection via the lasso. Technical report, Stanford University, Palo Alto, CA, June 1994.
    [39]
    N. Tishby, F. C. Pereira, and W. Bialek. The information bottleneck method. In Proc. of the 37th Annual Allerton Conference on Communication, Control and Computing, pages 368-377, 1999.
    [40]
    K. Torkkola. Feature extraction by non-parametric mutual information maximization. JMLR, 3: 1415-1438 (this issue), 2003.
    [41]
    V. G. Tusher, R. Tibshirani, and G. Chu. Significance analysis of microarrays applied to the ionizing radiation response. PNAS, 98: 5116-5121, April 2001.
    [42]
    V. Vapnik. Estimation of dependencies based on empirical data. Springer series in statistics. Springer, 1982.
    [43]
    V. Vapnik. Statistical Learning Theory. John Wiley & Sons, N.Y., 1998.
    [44]
    A. Vehtari and J. Lampinen. Bayesian input variable selection using posterior probabilities and expected utilities. Report B31, 2002.
    [45]
    J. Weston, A. Elisseeff, B. Schoelkopf, and M. Tipping. Use of the zero norm with linear models and kernel methods. JMLR, 3: 1439-1461 (this issue), 2003.
    [46]
    J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, and V. Vapnik. Feature selection for SVMs. In NIPS 13, 2000.
    [47]
    E.P. Xing and R.M. Karp. Cliff: Clustering of high-dimensional microarray data via iterative feature filtering using normalized cuts. In 9th International Conference on Intelligence Systems for Molecular Biology, 2001.

    Published In

    The Journal of Machine Learning Research, Volume 3 (March 2003), 1437 pages
    ISSN: 1532-4435, EISSN: 1533-7928
    Publisher: JMLR.org
