Dynamics of Stochastic Gradient Algorithms

Li, Qianxiao; Tai, Cheng; E, Weinan

arXiv:1511.06251v2 (cs)

[Submitted on 19 Nov 2015 (v1), revised 20 Nov 2015 (this version, v2), latest version 20 Jun 2017 (v3)]

Abstract: Stochastic gradient algorithms (SGA) are increasingly popular in machine learning applications and have become "the algorithm" for extremely large-scale problems. Although some convergence results exist, little is known about their dynamics. In this paper, we propose the method of stochastic modified equations (SME) to analyze the dynamics of the SGA. Using this technique, we can give precise characterizations of both the initial convergence speed and the eventual oscillations, at least in some special cases. Furthermore, the SME formalism allows us to characterize various speed-up techniques, such as introducing momentum and adjusting the learning rate and the mini-batch sizes. Previously, these techniques relied mostly on heuristics. Besides introducing simple examples to illustrate the SME formalism, we also apply the framework to improve the relaxed randomized Kaczmarz method for solving linear equations. The SME framework is a precise and unifying approach to understanding and improving the SGA, and has the potential to be applied to many more stochastic algorithms.

Comments: Changes: 1. Fixed a sign mistake in eq. (74). 2. Factor of d in eq. (98), and thus figure 10's estimate of k^*. 3. Fixed some typos and figure scales
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
MSC classes: 68W20
Cite as: arXiv:1511.06251 [cs.LG] (or arXiv:1511.06251v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1511.06251

Submission history

From: Qianxiao Li
[v1] Thu, 19 Nov 2015 16:49:33 UTC (3,851 KB)
[v2] Fri, 20 Nov 2015 19:58:15 UTC (4,801 KB)
[v3] Tue, 20 Jun 2017 13:56:33 UTC (1,326 KB)
