DOI: 10.1145/3357384.3358021

Loopless Semi-Stochastic Gradient Descent with Less Hard Thresholding for Sparse Learning

Published: 03 November 2019

Abstract

Stochastic gradient hard thresholding methods have recently been shown to work well for solving large-scale empirical risk minimization problems under sparsity constraints. Many stochastic hard thresholding methods (e.g., SVRG-HT) conduct a full gradient update at a fixed frequency and perform a hard thresholding operation at every iteration, which leads to high computational complexity, especially for high-dimensional, sparse problems. To be more efficient on large-scale datasets, we propose a loopless (single-layer) semi-stochastic gradient hard thresholding (LSSG-HT) method. The proposed algorithm updates the full gradient with a given probability p and performs hard thresholding only once every m iterations, which reduces the hard thresholding complexity in theory to O((κ_s/m) log(1/ε)), compared with O(κ_s log(1/ε)) for SVRG-HT. We prove that our algorithm converges to an optimal solution at a linear rate. Furthermore, we present an asynchronous parallel variant of LSSG-HT. Numerical experiments demonstrate the efficiency of our algorithms in comparison with state-of-the-art algorithms.
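To make the update scheme concrete, the following is a minimal sketch of a loopless semi-stochastic gradient hard thresholding loop, written for a least-squares objective. It illustrates the scheme the abstract describes and is not the authors' implementation: the function names, the step size eta, and all default values are illustrative assumptions.

```python
import numpy as np

def hard_threshold(w, s):
    """Keep the s largest-magnitude entries of w; zero out the rest."""
    out = np.zeros_like(w)
    idx = np.argpartition(np.abs(w), -s)[-s:]
    out[idx] = w[idx]
    return out

def lssg_ht(X, y, s, eta=0.01, p=0.05, m=10, n_iters=1000, seed=0):
    """Sketch of loopless semi-stochastic gradient hard thresholding for
    min ||Xw - y||^2 / (2n) subject to ||w||_0 <= s (assumed objective)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    w_ref = w.copy()
    full_grad = X.T @ (X @ w_ref - y) / n  # full gradient at the reference point

    for t in range(1, n_iters + 1):
        i = rng.integers(n)
        # Semi-stochastic (variance-reduced) gradient estimate.
        g_i = X[i] * (X[i] @ w - y[i])
        g_ref = X[i] * (X[i] @ w_ref - y[i])
        w = w - eta * (g_i - g_ref + full_grad)

        # Hard threshold only once every m iterations, not at every step.
        if t % m == 0:
            w = hard_threshold(w, s)

        # Loopless: refresh the reference point with probability p per
        # iteration instead of on SVRG's fixed outer-loop schedule.
        if rng.random() < p:
            w_ref = w.copy()
            full_grad = X.T @ (X @ w_ref - y) / n

    return hard_threshold(w, s)
```

The two knobs the abstract highlights appear directly: thresholding fires every m iterations, which is what drives the claimed O((κ_s/m) log(1/ε)) hard thresholding complexity, and the full gradient is refreshed at random times with probability p rather than in a fixed outer loop.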

References

[1]
Sohail Bahmani, Bhiksha Raj, and Petros Boufounos. 2013. Greedy sparsity-constrained optimization. JMLR 14 (2013), 807--841.
[2]
Thomas Blumensath and Mike E Davies. 2009. Iterative hard thresholding for compressed sensing. Appl. Comput. Harmon. Anal. 27, 3 (2009), 265--274.
[3]
Jinghui Chen and Quanquan Gu. 2016. Accelerated Stochastic Block Coordinate Gradient Descent for Sparsity Constrained Nonconvex Optimization. In UAI. 132--141.
[4]
Jinghui Chen and Quanquan Gu. 2017. Fast Newton Hard Thresholding Pursuit for Sparsity Constrained Nonconvex Optimization. In SIGKDD. 757--766.
[5]
David L Donoho. 2006. Compressed sensing. IEEE Trans. Inform. Theory 52, 4 (2006), 1289--1306.
[6]
Simon Foucart. 2011. Hard thresholding pursuit: an algorithm for compressive sensing. SIAM J. Numer. Anal. 49, 6 (2011), 2543--2563.
[7]
Hongchang Gao and Heng Huang. 2018. Stochastic Second-Order Method for Large-Scale Nonconvex Sparse Learning Models. In IJCAI. 2128--2134.
[8]
Prateek Jain, Ambuj Tewari, and Purushottam Kar. 2014. On iterative hard thresholding methods for high-dimensional M-estimation. In NIPS. 685--693.
[9]
Ali Jalali, Christopher C Johnson, and Pradeep K Ravikumar. 2011. On learning discrete graphical models using greedy methods. In NIPS. 1935--1943.
[10]
Rie Johnson and Tong Zhang. 2013. Accelerating stochastic gradient descent using predictive variance reduction. In NIPS. 315--323.
[11]
Jakub Konečný and Peter Richtárik. 2017. Semi-stochastic gradient descent methods. Front. Appl. Math. Stat. 3 (2017), 1--14.
[13]
Dmitry Kovalev, Samuel Horváth, and Peter Richtárik. 2019. Don't Jump Through Hoops and Remove Those Loops: SVRG and Katyusha are Better Without the Outer Loop. arXiv:1901.08689 (2019).
[14]
Remi Leblond, Fabian Pedregosa, and Simon Lacoste-Julien. 2017. ASAGA: Asynchronous Parallel SAGA. In AISTATS. 46--54.
[15]
Xingguo Li, Tuo Zhao, Raman Arora, Han Liu, and Jarvis Haupt. 2016. Stochastic variance reduced optimization for nonconvex sparse learning. In ICML. 917--925.
[16]
Qihang Lin, Zhaosong Lu, and Lin Xiao. 2014. An accelerated proximal coordinate gradient method. In NIPS. 3059--3067.
[17]
Horia Mania, Xinghao Pan, Dimitris Papailiopoulos, Benjamin Recht, Kannan Ramchandran, and Michael I Jordan. 2017. Perturbed iterate analysis for asynchronous stochastic optimization. SIAM J. Optim. 27, 4 (2017), 2202--2229.
[18]
Elaine Crespo Marques, Nilson Maciel, Lirida A. B. Naviner, Hao Cai, and Jun Yang. 2019. A Review of Sparse Recovery Algorithms. IEEE Access 7 (2019), 1300--1322.
[19]
Balas Kausik Natarajan. 1995. Sparse approximate solutions to linear systems. SIAM J. Comput. 24, 2 (1995), 227--234.
[20]
Deanna Needell and Tina Woolf. 2017. An asynchronous parallel approach to sparse recovery. In Information Theory and Applications Workshop (ITA). 1--5.
[21]
Nam Nguyen, Deanna Needell, and Tina Woolf. 2017. Linear convergence of stochastic iterative greedy algorithms with sparse constraints. IEEE Trans. Inform. Theory 63, 11 (2017), 6869--6895.
[22]
Yagyensh Chandra Pati, Ramin Rezaiifar, and Perinkulam Sambamurthy Krishnaprasad. 1993. Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition. In Asilomar Conference on Signals, Systems and Computers. 40--44.
[23]
Garvesh Raskutti, Martin J Wainwright, and Bin Yu. 2010. Restricted eigenvalue properties for correlated Gaussian designs. JMLR 11 (2010), 2241--2259.
[24]
Benjamin Recht, Christopher Re, Stephen Wright, and Feng Niu. 2011. Hogwild: A lock-free approach to parallelizing stochastic gradient descent. In NIPS. 693--701.
[25]
Sashank Reddi, Ahmed Hefny, Suvrit Sra, Barnabas Poczos, and Alexander Smola. 2015. On variance reduction in stochastic gradient descent and its asynchronous variants. In NIPS. 2647--2655.
[26]
Jie Shen and Ping Li. 2018. A Tight Bound of Hard Thresholding. JMLR 18 (2018), 1--42.
[27]
Joel A Tropp and Anna C Gilbert. 2007. Signal recovery from random measurements via orthogonal matching pursuit. IEEE Trans. Inform. Theory 53, 12 (2007), 4655--4666.
[28]
Fei Wen, Lei Chu, Peilin Liu, and Robert C. Qiu. 2018. A Survey on Nonconvex Regularization-Based Sparse and Low-Rank Recovery in Signal Processing, Statistics, and Machine Learning. IEEE Access 6 (2018), 69883--69906.
[29]
Xiaotong Yuan, Ping Li, and Tong Zhang. 2014. Gradient hard thresholding pursuit for sparsity-constrained optimization. In ICML. 127--135.
[30]
Yuchen Zhang and Lin Xiao. 2017. Stochastic primal-dual coordinate method for regularized empirical risk minimization. JMLR 18, 1 (2017), 2939--2980.
[31]
Zheng Zhang, Yong Xu, Jian Yang, Xuelong Li, and David Zhang. 2015. A Survey of Sparse Representation: Algorithms and Applications. IEEE Access 3 (2015), 490--530.
[32]
Baojian Zhou, Feng Chen, and Yiming Ying. 2019. Stochastic Iterative Hard Thresholding for Graph-structured Sparsity Optimization. In ICML. 7563--7573.
[33]
Kaiwen Zhou, Fanhua Shang, and James Cheng. 2018. A Simple Stochastic Variance Reduced Algorithm with Fast Convergence Rates. In ICML. 5975--5984.
[34]
Pan Zhou, Xiaotong Yuan, and Jiashi Feng. 2018. Efficient stochastic gradient hard thresholding. In NeurIPS. 1984--1993.



Published In

CIKM '19: Proceedings of the 28th ACM International Conference on Information and Knowledge Management
November 2019
3373 pages
ISBN:9781450369763
DOI:10.1145/3357384


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. asynchronous parallel
  2. hard thresholding
  3. semi-stochastic gradient
  4. single-layer optimization
  5. sparse learning

Qualifiers

  • Research-article

Conference

CIKM '19

Acceptance Rates

CIKM '19 Paper Acceptance Rate 202 of 1,031 submissions, 20%;
Overall Acceptance Rate 1,861 of 8,427 submissions, 22%


Article Metrics

  • Downloads (Last 12 months): 10
  • Downloads (Last 6 weeks): 1
Reflects downloads up to 09 Jan 2025


Cited By

  • (2024) SAB: Self-Adaptive Bias. AI 5:4, 2761-2772. DOI: 10.3390/ai5040133. Online publication date: 6-Dec-2024
  • (2022) Efficient Gradient Support Pursuit With Less Hard Thresholding for Cardinality-Constrained Learning. IEEE Transactions on Neural Networks and Learning Systems 33:12, 7806-7817. DOI: 10.1109/TNNLS.2021.3087805. Online publication date: Dec-2022
  • (2022) Asynchronous Parallel, Sparse Approximated SVRG for High-Dimensional Machine Learning. IEEE Transactions on Knowledge and Data Engineering 34:12, 5636-5648. DOI: 10.1109/TKDE.2021.3070539. Online publication date: 1-Dec-2022
  • (2020) Stochastic Recursive Gradient Support Pursuit and Its Sparse Representation Applications. Sensors 20:17, 4902. DOI: 10.3390/s20174902. Online publication date: 30-Aug-2020
  • (2020) Carpe Diem, Seize the Samples Uncertain "at the Moment" for Adaptive Batch Selection. Proceedings of the 29th ACM International Conference on Information & Knowledge Management, 1385-1394. DOI: 10.1145/3340531.3411898. Online publication date: 19-Oct-2020
