DOI: 10.1145/1390156.1390191
research-article

Efficient projections onto the l1-ball for learning in high dimensions

Published: 05 July 2008

Abstract

We describe efficient algorithms for projecting a vector onto the l1-ball and present two projection methods. The first performs exact projection in O(n) expected time, where n is the dimension of the space. The second handles vectors in which k elements are perturbed outside the l1-ball, projecting in O(k log(n)) time. This setting is especially useful for online learning in sparse feature spaces, such as text categorization. We demonstrate the merits and effectiveness of our algorithms in numerous batch and online learning tasks. We show that variants of stochastic gradient projection methods augmented with our efficient projection procedures outperform interior point methods, which are considered state-of-the-art optimization techniques. We also show that in online settings, gradient updates with l1 projections outperform the exponentiated gradient algorithm while obtaining models with high degrees of sparsity.
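
To make the projection step concrete, here is a minimal sketch of Euclidean projection onto the l1-ball using the widely known sort-based approach. It runs in O(n log n) time and is not the O(n) expected-time or O(k log(n)) procedure the abstract refers to. The function name project_onto_l1_ball and the radius parameter z are illustrative assumptions, not names taken from the paper.

```python
def project_onto_l1_ball(v, z=1.0):
    """Euclidean projection of v onto {w : ||w||_1 <= z}.

    Sort-based O(n log n) sketch (an assumption for illustration,
    not the paper's linear-time method).
    """
    if z <= 0:
        raise ValueError("radius z must be positive")

    magnitudes = [abs(x) for x in v]

    # A vector already inside the ball is its own projection.
    if sum(magnitudes) <= z:
        return list(v)

    # Sort magnitudes in decreasing order and search for the
    # soft-threshold theta that brings the l1 norm down to z.
    mu = sorted(magnitudes, reverse=True)
    running_sum = 0.0
    theta = 0.0
    for j, m in enumerate(mu, start=1):
        running_sum += m
        candidate = (running_sum - z) / j
        if m - candidate > 0:
            theta = candidate  # the last j that passes defines theta
        else:
            break  # the condition cannot hold again for larger j

    # Shrink each magnitude by theta, clip at zero, restore the sign.
    return [(1.0 if x >= 0 else -1.0) * max(abs(x) - theta, 0.0) for x in v]
```

Calling project_onto_l1_ball([3.0, -1.0, 0.5], z=2.0) shrinks every coordinate by the same threshold and zeroes the small ones, which is the source of the sparsity mentioned in the abstract. Inside a projected gradient loop, this function would be applied to the weight vector after each gradient step.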


Published In

ICML '08: Proceedings of the 25th international conference on Machine learning
July 2008
1310 pages
ISBN: 9781605582054
DOI: 10.1145/1390156
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • Pascal
  • University of Helsinki
  • Xerox
  • Federation of Finnish Learned Societies
  • Google Inc.
  • NSF
  • Machine Learning Journal/Springer
  • Microsoft Research
  • Intel
  • Yahoo!
  • Helsinki Institute for Information Technology
  • IBM

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ICML '08
Sponsor:
  • Microsoft Research
  • Intel
  • IBM

Acceptance Rates

Overall Acceptance Rate 140 of 548 submissions, 26%

Article Metrics

  • Downloads (last 12 months): 257
  • Downloads (last 6 weeks): 22
Reflects downloads up to 01 Feb 2025

Cited By

  • (2025) Iterative Thresholding and Projection Algorithms and Model-Based Deep Neural Networks for Sparse LQR Control Design. IEEE Transactions on Automatic Control 70:2 (1100-1114). DOI: 10.1109/TAC.2024.3453089. Online publication date: Feb-2025
  • (2025) Efficient proximal subproblem solvers for a nonsmooth trust-region method. Computational Optimization and Applications. DOI: 10.1007/s10589-024-00628-x. Online publication date: 4-Jan-2025
  • (2024) Projection-free variance reduction methods for stochastic constrained multi-level compositional optimization. Proceedings of the 41st International Conference on Machine Learning (21962-21987). DOI: 10.5555/3692070.3692952. Online publication date: 21-Jul-2024
  • (2024) A2Q+. Proceedings of the 41st International Conference on Machine Learning (9275-9291). DOI: 10.5555/3692070.3692439. Online publication date: 21-Jul-2024
  • (2024) Portfolio Optimization with Multi-Trend Objective and Accelerated Quasi-Newton Method. Symmetry 16:7 (821). DOI: 10.3390/sym16070821. Online publication date: 30-Jun-2024
  • (2024) Passive Aggressive Ensemble for Online Portfolio Selection. Mathematics 12:7 (956). DOI: 10.3390/math12070956. Online publication date: 23-Mar-2024
  • (2024) Enhanced Multi-View Low-Rank Graph Optimization for Dimensionality Reduction. Electronics 13:12 (2421). DOI: 10.3390/electronics13122421. Online publication date: 20-Jun-2024
  • (2024) Multi-view latent space learning framework via adaptive graph embedding. Journal of Electronic Imaging 33:06. DOI: 10.1117/1.JEI.33.6.063016. Online publication date: 1-Nov-2024
  • (2024) A Parallel Zeroth-Order Framework for Efficient Cellular Network Optimization. IEEE Transactions on Wireless Communications 23:11 (17522-17538). DOI: 10.1109/TWC.2024.3454106. Online publication date: Nov-2024
  • (2024) Sequential Manipulation Against Rank Aggregation: Theory and Algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:12 (9353-9370). DOI: 10.1109/TPAMI.2024.3416710. Online publication date: Dec-2024
