
Sketched iterative algorithms for structured generalized linear models

Published: 10 August 2019

Abstract

    Recent years have seen advances in optimizing large-scale statistical estimation problems. In statistical learning settings, iterative optimization algorithms have been shown to enjoy geometric convergence. While powerful, such results hold only for the original dataset and may face computational challenges when the sample size is large. In this paper, we study sketched iterative algorithms, in particular sketched-PGD (projected gradient descent) and sketched-SVRG (stochastic variance reduced gradient), for structured generalized linear models, and show that these methods continue to converge geometrically to the statistical error under suitable assumptions. Moreover, the sketching dimension is allowed to be even smaller than the ambient dimension, which can lead to significant speed-ups. The sketched iterative algorithms introduced here provide an additional dimension along which to study the trade-off between statistical accuracy and computation time.
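    To make the idea concrete, here is a minimal sketch of sketched-PGD for the simplest structured case the abstract covers: sparse least-squares regression. This is an illustrative reconstruction, not the paper's implementation; the problem sizes, the Gaussian sketch, the hard-thresholding projection, and the step size are all assumptions chosen for the demo. Each iteration draws a fresh sketch `S` with `m` rows (smaller than the sample size `n`), takes a gradient step on the sketched objective, and projects back onto the sparse structure set.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic sparse linear model: y = A @ x_true + noise (sizes are illustrative)
    n, p, s = 200, 50, 5                     # samples, ambient dimension, sparsity
    A = rng.standard_normal((n, p))
    x_true = np.zeros(p)
    x_true[:s] = 1.0
    y = A @ x_true + 0.01 * rng.standard_normal(n)

    def hard_threshold(v, k):
        """Project onto k-sparse vectors: keep the k largest-magnitude entries."""
        out = np.zeros_like(v)
        keep = np.argsort(np.abs(v))[-k:]
        out[keep] = v[keep]
        return out

    def sketched_pgd(A, y, k, m=80, iters=60, step=0.5):
        """PGD on a freshly sketched least-squares objective at every iteration."""
        n, p = A.shape
        x = np.zeros(p)
        for _ in range(iters):
            S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian sketch, m < n
            SA, Sy = S @ A, S @ y
            grad = SA.T @ (SA @ x - Sy) / n               # sketched gradient
            x = hard_threshold(x - step * grad, k)        # projection step
        return x

    x_hat = sketched_pgd(A, y, k=s)
    print(np.linalg.norm(x_hat - x_true))  # settles near the statistical noise floor
    ```

    Each gradient step touches only the m-row sketched matrix rather than all n samples, which is where the speed-up over ordinary PGD comes from; the price is the per-iteration sketching noise, which the paper's analysis controls so that geometric convergence to the statistical error is preserved.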


      Published In

      IJCAI'19: Proceedings of the 28th International Joint Conference on Artificial Intelligence
      August 2019, 6589 pages
      ISBN: 9780999241141

      Sponsors: Sony Corporation; Huawei Technologies Co. Ltd.; Baidu Research; The International Joint Conferences on Artificial Intelligence, Inc. (IJCAI); Lenovo

      Publisher: AAAI Press
