Abstract
Feature selection techniques can make the results of a classification problem easier to interpret, especially in image texture analysis, by showing which features contribute most to classification performance. This paper evaluates several feature selection techniques for classification on a biomedical image texture dataset (2-DE gel images), with the aim of studying both their performance and the stability of the features they select. We analyse three techniques: subgroup-based multiple kernel learning (MKL), which performs feature selection by down-weighting or eliminating subsets of features that share similar characteristics, and two conventional feature selection techniques, namely recursive feature elimination (RFE) with different classifiers (naive Bayes, support vector machines, bagged trees, random forest and linear discriminant analysis) and a genetic algorithm-based approach with an SVM as decision function. The classifiers were compared using ten repetitions of tenfold cross-validation, and the best technique found is SVM-RFE, with an AUROC score of \(95.88 \pm 0.39\,\%\). However, this method is not significantly better than RFE-TREE, RFE-RF and grouped MKL, whilst MKL uses a smaller number of features, which increases the interpretability of the results. MKL always selects the same features, which are related to wavelet-based textures, while the RFE methods focus mainly on co-occurrence matrix-based features, but with high instability in the number of features selected.
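The evaluation protocol named in the abstract (ten repetitions of tenfold cross-validation, summarised as a mean AUROC with its spread) can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic one-feature data, the helper names (`auroc`, `repeated_stratified_folds`, `fit_direction`), the trivial stand-in classifier, and the choice to average AUROC over the individual folds are all assumptions made for illustration only.

```python
import random
import statistics


def auroc(scores, labels):
    """AUROC via the Mann-Whitney U statistic; tied scores count as half a win."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def repeated_stratified_folds(labels, k=10, repeats=10, seed=0):
    """Yield (repeat, train_idx, test_idx) for `repeats` reshuffled k-fold splits."""
    rng = random.Random(seed)
    for r in range(repeats):
        folds = [[] for _ in range(k)]
        # Deal each class round-robin so every fold keeps the class balance.
        for cls in sorted(set(labels)):
            idx = [i for i, y in enumerate(labels) if y == cls]
            rng.shuffle(idx)
            for j, i in enumerate(idx):
                folds[j % k].append(i)
        for f in range(k):
            test = sorted(folds[f])
            train = sorted(i for g in range(k) if g != f for i in folds[g])
            yield r, train, test


def fit_direction(xs, ys):
    """Hypothetical stand-in classifier: learn which way the feature points."""
    mean_pos = statistics.fmean(x for x, y in zip(xs, ys) if y == 1)
    mean_neg = statistics.fmean(x for x, y in zip(xs, ys) if y == 0)
    return 1.0 if mean_pos >= mean_neg else -1.0


# Synthetic data (assumption): one feature, two balanced classes.
data_rng = random.Random(42)
labels = [0] * 50 + [1] * 50
feature = [data_rng.gauss(2.0 * y, 1.0) for y in labels]

fold_aucs = []
for _, train, test in repeated_stratified_folds(labels, k=10, repeats=10, seed=1):
    # Fit on the training folds only, then score the held-out fold.
    direction = fit_direction([feature[i] for i in train], [labels[i] for i in train])
    scores = [direction * feature[i] for i in test]
    fold_aucs.append(auroc(scores, [labels[i] for i in test]))

mean_auc = statistics.fmean(fold_aucs)
sd_auc = statistics.stdev(fold_aucs)
print(f"AUROC: {100 * mean_auc:.2f} +/- {100 * sd_auc:.2f} % over {len(fold_aucs)} folds")
```

In a real comparison the stand-in classifier would be replaced by each feature selection method wrapped around its classifier, with selection repeated inside every training fold so that the held-out fold never influences which features are kept.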
Acknowledgments
This work is supported by “Collaborative Project on Medical Informatics (CIMED)” PI13/00280 funded by the Carlos III Health Institute from the Spanish National plan for Scientific and Technical Research and Innovation 2013–2016 and the European Regional Development Funds (FEDER), UK Medical Research Council (G10000427, MC_UU_12013/8) and “Development of new image analysis techniques in 2D Gel for Biomedical research” (ref. 10SIN105004PR) funded by Xunta de Galicia. The authors thank the Galicia Supercomputing Centre (CESGA) for the provision of computational support. The authors also thank Dr. G.-Z. Yang for providing the dataset.
Additional information
Communicated by I. R. Ruiz.
Cite this article
Fernandez-Lozano, C., Seoane, J.A., Gestal, M. et al. Texture classification using feature selection and kernel-based techniques. Soft Comput 19, 2469–2480 (2015). https://doi.org/10.1007/s00500-014-1573-5
DOI: https://doi.org/10.1007/s00500-014-1573-5