Abstract
The support vector machine (SVM) is a supervised machine learning approach that is regarded as a high point of statistical learning for small-sample data sets. SVMs have shown excellent learning and generalization ability and have been employed extensively in many areas. This paper presents a performance analysis, from a statistical point of view, of six types of SVM for diagnosis on the classical Wisconsin breast cancer problem. The classification performance of the standard SVM (St-SVM) is analyzed and compared with that of five modified classifiers: the proximal support vector machine (PSVM), the Lagrangian support vector machine (LSVM), the finite Newton method for the Lagrangian support vector machine (NSVM), the linear programming support vector machine (LPSVM), and the smooth support vector machine (SSVM). The experimental results show that these SVM classifiers achieve fast, simple, and efficient breast cancer diagnosis. In the training phase, LSVM had the lowest accuracy (95.6107 %), while St-SVM performed better than the other methods on all performance indices (accuracy = 97.71 %) and was closely followed by LPSVM (accuracy = 97.3282 %). In the validation phase, however, LPSVM achieved an overall accuracy of 97.1429 %, which was superior to LSVM (95.4286 %), SSVM (96.5714 %), PSVM (96 %), NSVM (96.5714 %), and St-SVM (94.86 %). The ROC and Matthews correlation coefficient (MCC) values for LPSVM were 0.9938 and 0.9369, respectively, which outperformed the other classifiers. The results strongly suggest that LPSVM can aid in the diagnosis of breast cancer.
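For readers unfamiliar with the indices quoted above, the following minimal sketch shows how accuracy and MCC (Matthews correlation coefficient) are computed from the four cells of a binary confusion matrix. The function names and the counts are illustrative assumptions; they are not taken from the paper, and the paper's own evaluation code is not shown here.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient of a binary confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only (not results from the paper).
tp, tn, fp, fn = 50, 40, 5, 5
print(f"accuracy = {100 * accuracy(tp, tn, fp, fn):.2f} %")
print(f"MCC      = {mcc(tp, tn, fp, fn):.4f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix symmetrically, so it stays informative when the benign and malignant classes are unequal in size.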
References
Abonyi J, Szeifert F (2003) Supervised fuzzy clustering for the identification of fuzzy classifiers. Pattern Recognit Lett 24(14):2195–2207
Akay MF (2009) Support vector machines combined with feature selection for breast cancer diagnosis. Expert Syst Appl 36(2):3240–3247
Bishop C (1997) Neural networks for pattern recognition. Clarendon Press, Oxford
Blanz V, Scholkopf B, Bulthoff H et al (1996) Comparison of view–based object recognition algorithms using realistic 3d models. In: von der Malsburg C, von Seelen W, Vorbruggen JC, Sendhoff B (eds) Artificial Neural Networks—ICANN’96, Springer Lecture Notes in Computer Science, Berlin, vol 1112, pp 251–256
Boyle P, Levin B (2008) World Cancer report 2008. International Agency for Research on Cancer, Lyon
Burges CJC (1998) A tutorial on support vector machines for pattern recognition. Data Min Knowl Discov 2(2):121–167
Burges CJC, Scholkopf B (1997) Improving the accuracy and speed of support vector learning machines. In: Mozer M, Jordan M, Petsche T (eds) Advances in neural information processing systems 9. MIT Press, Cambridge, pp 375–381
Cedeño AM, Domínguez JQ, Andina D (2011) WBCD breast cancer database classification applying artificial metaplasticity neural network. Expert Syst Appl 38(8):9573–9579
Chang RF, Wu WJ, Moon WK et al (2003) Support vector machines for diagnosis of breast tumors on US images. Acad Radiol 10(2):189–197
Chen HL, Yang B, Liu J, Liu DY (2011) A support vector machine classifier with rough set based feature selection for breast cancer diagnosis. Expert Syst Appl 38(7):9014–9022
Chen HL, Yang B, Wang G et al (2011) Support vector machine based diagnostic system for breast cancer using swarm intelligence. J Med Syst. doi:10.1007/s10916-011-9723-0
Cortes C, Vapnik V (1995) Support-vector networks. Mach Learn 20:273–297
Cristianini N, Taylor JS (2000) An introduction to support vector machines and other kernel-based learning methods. Cambridge University Press, Cambridge
Evgeniou T, Pontil M, Poggio T (2000) Regularization networks and support vector machines. In: Bartlett P, Scholkopf B, Schuurmans D, Smola AJ (eds) Advances in large margin classifiers. MIT Press, Cambridge, pp 171–203
Fan CY, Chang PC, Lin JJ, Hsieh JC (2011) A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification. Appl Soft Comput 11(1):632–644
Francois D, Rossi F, Wertz V, Verleysen M (2007) Resampling methods for parameter-free and robust feature selection with mutual information. Neurocomputing 70:1276–1288
Fung G, Mangasarian OL (2004) A feature selection Newton method for support vector machine classification. Comput Optim Appl 28(2):185–202
Fung G, Mangasarian OL (2003) Finite Newton method for Lagrangian support vector machine classification. Neurocomputing 55(1–2):39–55
Fung G, Mangasarian OL (2001) Proximal support vector machine classifiers. Proceedings of KDD’01 seventh ACM SIGKDD international conference on Knowledge Discovery and Data Mining, San Francisco, pp 77–86. ISBN: 1-58113-391-X. doi:10.1145/502512.502527
Goodman D, Boggess L, Watkins A (2002) Artificial immune system classification of multiple-class problems. In: Dagli CH, Buczak AL, Ghosh J, Ersoy O, Kercel SW (eds) Intell Eng Syst Artif Neural Net, vol 12, pp 179–184
Gunn SR (1998) Support vector machines for classification and regression. Technical Report, Faculty of Engineering, University of Southampton
Hamilton HJ, Shan N, Cerone N (1996) RIAC: a rule induction algorithm based on approximate classification. Technical Report CS 96-06, University of Regina. ISBN 0-7731-0321-X
Hsu CW, Chang CC, Lin CJ (2003) A practical guide to support vector classification. Technical Report, Department of Computer Science and Information Engineering, National Taiwan University
Huang ML, Hung YH, Chen WY (2010) Neural network classifier with entropy based feature selection on breast cancer diagnosis. J Med Syst 34(5):865–873
Joachims T, Nedellec C, Rouveirol C (1998) Text categorization with support vector machines: learning with many relevant features. Springer-Verlag GmbH, Berlin
Joachims T (1998) SVM light. http://svmlight.joachims.org/
Karabatak M, Ince MC (2009) An expert system for detection of breast cancer based on association rules and neural network. Expert Syst Appl 36(2, Part 2):3465–3469
Kerekes J (2008) Receiver operating characteristic curve confidence intervals and regions. IEEE Geosci Remote Sens Lett 5(2):251–255
Lee YJ, Mangasarian OL (2001) SSVM: a smooth support vector machine. Comput Optim Appl 20:5–22
Liu HX, Zhang RS, Luan F et al (2003) Diagnosing breast cancer based on support vector machines. J Chem Inf Comput Sci 43(3):900–907
Mangasarian OL, Setiono R, Wolberg WH (1990) Pattern recognition via linear programming: theory and application to medical diagnosis. Proceedings of the workshop on large-scale numerical optimization, SIAM, Philadelphia, pp 22–31
Mangasarian OL, Musicant DR (2000) Lagrangian support vector machine classification. Technical Report, Data Mining Institute, Computer Sciences Department, University of Wisconsin
Mangasarian OL, Musicant DR (1999) Successive overrelaxation for support vector machines. IEEE Trans Neural Networks 10:1032–1037
Mangasarian OL (2000) Generalized support vector machines. In: Smola A, Bartlett P, Scholkopf B, Schuurmans D (eds) Advances in large margin classifiers. MIT Press, Cambridge, pp 135–146
McAree B, O’Donnell ME, Spence A et al (2010) Breast cancer in women under 40 years of age: a series of 57 cases from Northern Ireland. Breast 19(2):97–104
Mitchell T (1997) Machine learning. The McGraw-Hill Companies, Inc., New York
Nauck D, Kruse R (1999) Obtaining interpretable fuzzy classification rules from medical data. Artif Intell Med 16(2):149–169
NCSS (2012) Statistical and power analysis software. http://www.ncss.com. Accessed in April 2012
Osuna E, Freund R, Girosi F (1997) Training support vector machines: an application to face detection. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Jun 17–19, pp 130–136
Park SH, Goo JM, Jo CH (2004) Receiver operating characteristic (ROC) curve: practical review for radiologists. Korean J Radiol 5(1):11–18
Pena-Reyes CA, Sipper M (1999) A fuzzy-genetic approach to breast cancer diagnosis. Artif Intell Med 17(2):131–155
Polat K, Gunes S (2007) Breast cancer diagnosis using least square support vector machine. Digit Signal Process 17(4):694–701
Platt J (1998) Sequential minimal optimization: A fast algorithm for training support vector machines. Technical Report MSR-TR-98-14
Quinlan J (1996) Improved use of continuous attributes in C4.5. J Artif Intell Res 4:77–90
Sahan S, Polat K, Kodaz H, Günes S (2007) A new hybrid method based on fuzzy-artificial immune system and k-nn algorithm for breast cancer diagnosis. Comput Biol Med 37(3):415–423
Schmidt M (1996) Identifying speaker with support vector networks. Interface’96 Proceedings, Sydney
Scholkopf B, Burges C, Vapnik V (1995) Extracting support data for a given task. In: Fayyad UM, Uthurusamy R (eds) Proceedings, first international conference on knowledge discovery & data mining. AAAI Press, Menlo Park
Scholkopf B, Burges C, Vapnik V (1996) Incorporating invariances in support vector learning machines. In: von der Malsburg C, von Seelen W, Vorbruggen JC, Sendhoff B (eds) Artificial neural networks- ICANN’96, vol 1112. Springer Lecture Notes in Computer Science, Berlin, pp 47–52
Setiono R (2000) Generating concise and accurate classification rules for breast cancer diagnosis. Artif Intell Med 18(3):205–219
Sherrod PH (2011) DTREG predictive modeling software. www.dtreg.com
Ster B, Dobnikar A (1996) Neural networks in medical diagnosis: comparison with other methods. Proceedings of the international conference on engineering applications of neural networks, pp 427–430
Taylor JS, Cristianini N (2004) Kernel methods for pattern analysis. Cambridge University Press, Cambridge
Übeyli ED (2005) A mixture of experts network structure for breast cancer diagnosis. J Med Syst 29(5):569–579
Übeyli ED (2009) Adaptive neuro-fuzzy inference systems for automatic detection of breast cancer. J Med Syst 33(5):353–358
Übeyli ED (2007) Implementing automated diagnostic systems for breast cancer detection. Expert Syst Appl 33(4):1054–1062
UCI (2012) Machine learning repository. http://archive.ics.uci.edu/ml/index.html. Accessed on 10 Aug 2012
Vapnik VN (1995) The nature of statistical learning theory. Springer, New York
Vapnik VN (1999) The nature of statistical learning theory, 2nd edn. Springer, New York
Vapnik V, Golowich S, Smola A (1997) Support vector method for function approximation, regression estimation, and signal processing. In: Mozer M, Jordan M, Petsche T (eds) Advances in neural information processing systems 9. MIT Press, Cambridge, pp 281–287
Wolberg WH, Mangasarian OL (1990) Multisurface method of pattern separation for medical diagnosis applied to breast cytology. Proc Natl Acad Sci 87:9193–9196
Yuan Q, Cai C, Xiao H et al (2007) Diagnosis of breast tumours and evaluation of prognostic risk by using machine learning approaches. Commun Comput Inform Sci 2:1250–1260
Acknowledgments
I gratefully acknowledge Phillip H. Sherrod [50], software developer and consultant on predictive modeling, for his support and consultation during the modeling process.
Cite this article
Azar, A.T., El-Said, S.A. Performance analysis of support vector machines classifiers in breast cancer mammography recognition. Neural Comput & Applic 24, 1163–1177 (2014). https://doi.org/10.1007/s00521-012-1324-4
DOI: https://doi.org/10.1007/s00521-012-1324-4