Feature selection for support vector machines with RBF kernel

Liu, Quanzhong; Chen, Chihau; Zhang, Yang; Hu, Zhengguo

doi:10.1007/s10462-011-9205-2

Published: 09 February 2011

Volume 36, pages 99–115, (2011)

Artificial Intelligence Review

Quanzhong Liu¹,
Chihau Chen²,
Yang Zhang³ &
…
Zhengguo Hu¹

Abstract

Linear kernel Support Vector Machine Recursive Feature Elimination (SVM-RFE) is known as an excellent feature selection algorithm. Nonlinear SVM is a black box classifier for which we do not know the mapping function $\Phi$ explicitly. Thus, the weight vector $w$ cannot be explicitly computed. In this paper, we propose a feature selection algorithm, Support Vector Machine with RBF kernel based on Recursive Feature Elimination (SVM-RBF-RFE), which expands the nonlinear RBF kernel into its Maclaurin series and then computes the weight vector $w$ from the series according to the contribution each feature makes to the classification hyperplane. Using $w_i^2$ as the ranking criterion, SVM-RBF-RFE starts with all the features and eliminates the feature with the least squared weight at each step until all the features are ranked. We use SVM and KNN classifiers to evaluate nested subsets of features selected by SVM-RBF-RFE. Experimental results on 3 UCI and 3 microarray datasets show that SVM-RBF-RFE generally performs better than information gain and SVM-RFE.
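
The two ingredients named in the abstract can be illustrated with a minimal sketch; it is not the paper's implementation. It assumes the common RBF form $K(x,z)=\exp(-\gamma\|x-z\|^2)$ and uses the standard identity $K(x,z)=e^{-\gamma\|x\|^2}e^{-\gamma\|z\|^2}\sum_{n\ge 0}(2\gamma\,x\cdot z)^n/n!$ for the Maclaurin expansion. The elimination loop ranks features by squared weight, but the weight function shown here (`mean_difference_weights`) is a hypothetical stand-in chosen only to keep the example self-contained; the paper's series-derived weight formula is not given in the abstract and is not reproduced. All function names are illustrative.

```python
import math
from typing import Callable, List, Sequence

Matrix = Sequence[Sequence[float]]


def rbf_kernel(x: Sequence[float], z: Sequence[float], gamma: float) -> float:
    """Exact RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def rbf_maclaurin(x: Sequence[float], z: Sequence[float],
                  gamma: float, order: int) -> float:
    """RBF kernel with exp(2*gamma*x.z) replaced by its Maclaurin series
    truncated after the term of degree `order`."""
    dot = sum(a * b for a, b in zip(x, z))
    norms = sum(a * a for a in x) + sum(b * b for b in z)
    t = 2.0 * gamma * dot
    series = sum(t ** n / math.factorial(n) for n in range(order + 1))
    return math.exp(-gamma * norms) * series


def rfe_rank(X: Matrix, y: Sequence[int],
             weight_fn: Callable[[Matrix, Sequence[int]], List[float]]) -> List[int]:
    """Recursive feature elimination: repeatedly recompute weights on the
    surviving features and drop the one with the smallest squared weight.
    Returns original feature indices, most important first."""
    remaining = list(range(len(X[0])))
    eliminated: List[int] = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w = weight_fn(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated[::-1]


def mean_difference_weights(X: Matrix, y: Sequence[int]) -> List[float]:
    """Stand-in weight (NOT the paper's): per-feature difference between the
    mean of the +1 class and the mean of the -1 class."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    return [
        sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg)
        for j in range(len(X[0]))
    ]


# Toy data: feature 0 separates the classes strongly, feature 1 weakly,
# feature 2 not at all.
X_demo = [[2.0, 0.5, 0.1], [1.8, 0.4, -0.1], [-2.0, -0.5, 0.1], [-1.9, -0.3, -0.1]]
y_demo = [1, 1, -1, -1]
ranking = rfe_rank(X_demo, y_demo, mean_difference_weights)
```

In a real experiment, `weight_fn` would train an SVM on the surviving features at every step and return weights derived from it; the loop itself would not need to change.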

Author information

Authors and Affiliations

College of Mechanical and Electric Engineering, Northwest A & F University, Yangling, 712100, Shaanxi Province, China
Quanzhong Liu & Zhengguo Hu
Electrical & Computer Engineering, University of Massachusetts Dartmouth, Dartmouth, MA, 02747-2300, USA
Chihau Chen
College of Information Engineering, Northwest A & F University, Yangling, 712100, Shaanxi Province, China
Yang Zhang

Corresponding author

Correspondence to Quanzhong Liu.

About this article

Cite this article

Liu, Q., Chen, C., Zhang, Y. et al. Feature selection for support vector machines with RBF kernel. Artif Intell Rev 36, 99–115 (2011). https://doi.org/10.1007/s10462-011-9205-2

Issue Date: August 2011
