DOI: 10.1145/1390156.1390273

SVM optimization: inverse dependence on training set size

Published: 05 July 2008
Abstract

    We discuss how the runtime of SVM optimization should decrease as the size of the training data increases. We present theoretical and empirical results demonstrating how a simple subgradient descent approach indeed displays such behavior, at least for linear kernels.
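
    To make the subgradient approach concrete, here is a minimal sketch of a Pegasos-style stochastic subgradient solver for the regularized linear SVM objective. It is an illustration under stated assumptions: the function name, step-size schedule, and toy data below are not taken from the paper.

    import numpy as np

    def pegasos_train(X, y, lam=0.01, epochs=5, seed=0):
        """Stochastic subgradient descent for the linear SVM objective
        (lam/2)*||w||^2 + (1/n) * sum_i max(0, 1 - y_i * <w, x_i>),
        processing one randomly chosen example per step."""
        rng = np.random.default_rng(seed)
        n, d = X.shape
        w = np.zeros(d)
        t = 0
        for _ in range(epochs):
            for i in rng.permutation(n):
                t += 1
                eta = 1.0 / (lam * t)        # step size decays with the iteration count
                margin = y[i] * X[i].dot(w)
                w *= (1.0 - eta * lam)       # subgradient of the regularization term
                if margin < 1:               # hinge loss is active for this example
                    w += eta * y[i] * X[i]
        return w

    # Toy usage on synthetic separable data (hypothetical example). The number of
    # stochastic steps needed to reach a fixed suboptimality is governed by lam and
    # the target accuracy rather than by n, which is why more data need not mean
    # more optimization work.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(5000, 20))
    w_true = rng.normal(size=20)
    y = np.sign(X @ w_true)
    w = pegasos_train(X, y, lam=0.1)
    print("training accuracy:", np.mean(np.sign(X @ w) == y))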




    Published In

    ICML '08: Proceedings of the 25th International Conference on Machine Learning
    July 2008
    1310 pages
    ISBN: 9781605582054
    DOI: 10.1145/1390156

    Sponsors

    • Pascal
    • University of Helsinki
    • Xerox
    • Federation of Finnish Learned Societies
    • Google Inc.
    • NSF
    • Machine Learning Journal/Springer
    • Microsoft Research
    • Intel
    • Yahoo!
    • Helsinki Institute for Information Technology
    • IBM

    Publisher

    Association for Computing Machinery

    New York, NY, United States



    Acceptance Rates

    Overall acceptance rate: 140 of 548 submissions (26%)

