
Faster Support Vector Machines

Published: 08 October 2021

Abstract

The time complexity of support vector machines (SVMs) prohibits training on huge datasets with millions of data points. Recently, multilevel approaches to training SVMs have been developed to allow for time-efficient training on huge datasets. While regular SVMs perform the entire training in a single, time-consuming optimization step, multilevel SVMs first build a hierarchy of successively smaller problems that resemble the original problem and then train an SVM model for each hierarchy level, benefiting from the solved models of previous levels. We present a faster multilevel support vector machine that uses a label propagation algorithm to construct the problem hierarchy. Extensive experiments indicate that our approach is up to orders of magnitude faster than the previous fastest algorithm while having comparable classification quality. For example, one of our sequential solvers is already on average a factor of 15 faster than the parallel ThunderSVM algorithm, while having similar classification quality.
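To make the multilevel idea concrete, the following is a minimal Python sketch of such a training loop. It is not the authors' implementation: the label-propagation coarsening is replaced by a per-class k-means surrogate, the hyperparameter search is a toy grid over C, and the uncoarsening phase only carries the hyperparameters found on the coarser level forward (the actual method benefits from the solved coarse models in richer ways). The helper names `coarsen` and `multilevel_svm` are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC


def coarsen(X, y, factor=4, seed=0):
    """Build one coarser level: cluster each class separately and keep the
    cluster centroids (labeled with that class) as the coarse training points."""
    Xc, yc = [], []
    for label in np.unique(y):
        pts = X[y == label]
        k = max(1, len(pts) // factor)
        centers = KMeans(n_clusters=k, n_init=1, random_state=seed).fit(pts).cluster_centers_
        Xc.append(centers)
        yc.append(np.full(k, label))
    return np.vstack(Xc), np.concatenate(yc)


def multilevel_svm(X, y, coarsest_size=500, C_grid=(0.1, 1.0, 10.0)):
    # Coarsening phase: shrink the training set until it is small enough
    # to afford a hyperparameter search.
    hierarchy = [(X, y)]
    while len(hierarchy[-1][1]) > coarsest_size:
        hierarchy.append(coarsen(*hierarchy[-1]))

    # Solve the coarsest problem: search over C (training accuracy is used
    # here for brevity; cross-validation would be used in practice).
    X_coarse, y_coarse = hierarchy[-1]
    best_C, best_model, best_score = C_grid[0], None, -1.0
    for C in C_grid:
        cand = SVC(C=C, kernel="rbf", gamma="scale").fit(X_coarse, y_coarse)
        score = cand.score(X_coarse, y_coarse)
        if score > best_score:
            best_C, best_model, best_score = C, cand, score

    # Uncoarsening phase: retrain on each finer level, reusing the coarse
    # level's hyperparameters instead of searching from scratch.
    model = best_model
    for X_fine, y_fine in reversed(hierarchy[:-1]):
        model = SVC(C=best_C, kernel="rbf", gamma="scale").fit(X_fine, y_fine)
    return model
```

Usage would look like `model = multilevel_svm(X_train, y_train)` followed by `model.predict(X_test)`; the point is only to illustrate how the hierarchy lets the expensive hyperparameter search run on a small problem while the final model is still trained on the full data.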

Cited By

  • (2023) A Quantitative Detection Algorithm for Multi-Test Line Lateral Flow Immunoassay Applied in Smartphones. Sensors 23:14 (6401). https://doi.org/10.3390/s23146401. Online publication date: 14-Jul-2023.
  • (2023) RadWise: A Rank-Based Hybrid Feature Weighting and Selection Method for Proteomic Categorization of Chemoirradiation in Patients with Glioblastoma. Cancers 15:10 (2672). https://doi.org/10.3390/cancers15102672. Online publication date: 9-May-2023.
  • (2023) Determining the shelf life and quality changes of potatoes (Solanum tuberosum) during storage using electronic nose and machine learning. PLOS ONE 18:4 (e0284612). https://doi.org/10.1371/journal.pone.0284612. Online publication date: 28-Apr-2023.
  • (2023) Transient trend prediction of safety parameters for small modular reactor considering equipment degradation. Annals of Nuclear Energy 181 (109507). https://doi.org/10.1016/j.anucene.2022.109507. Online publication date: Feb-2023.
  • (2022) Data-Driven Fatigue Damage Monitoring and Early Warning Model for Bearings. Wireless Communications & Mobile Computing 2022. https://doi.org/10.1155/2022/7611670. Online publication date: 1-Jan-2022.
  • (2022) An Automatic Pronunciation Error Detection and Correction Mechanism in English Teaching Based on an Improved Random Forest Model. Journal of Electrical and Computer Engineering 2022. https://doi.org/10.1155/2022/6011993. Online publication date: 1-Jan-2022.
  • (2022) Gaussian Pyramid for Nonlinear Support Vector Machine. Applied Computational Intelligence and Soft Computing 2022. https://doi.org/10.1155/2022/5255346. Online publication date: 1-Jan-2022.
  • (2022) New incremental SVM algorithms for human activity recognition in smart homes. Journal of Ambient Intelligence and Humanized Computing 14:10 (13433-13450). https://doi.org/10.1007/s12652-022-03798-w. Online publication date: 24-Mar-2022.
  • (undated) Transient Trend Prediction of Safety Parameters for Small Modular Reactor Considering Equipment Degradation. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4046458.

Published In

ACM Journal of Experimental Algorithmics, Volume 26
December 2021
479 pages
ISSN:1084-6654
EISSN:1084-6654
DOI:10.1145/3446425

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 08 October 2021
Accepted: 01 August 2021
Revised: 01 July 2021
Received: 01 October 2020
Published in JEA Volume 26

Author Tags

  1. Support vector machines
  2. graph clustering
  3. kernel
  4. model parameter optimization
  5. multilevel training

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • European Research Council
  • ERC
  • DFG
