Directional Smoothness and Gradient Methods: Convergence and Adaptivity

Mishkin, Aaron; Khaled, Ahmed; Wang, Yuanhao; Defazio, Aaron; Gower, Robert M.

arXiv:2403.04081 (cs)

[Submitted on 6 Mar 2024]

Abstract: We develop new sub-optimality bounds for gradient descent (GD) that depend on the conditioning of the objective along the path of optimization, rather than on global, worst-case constants. Key to our proofs is directional smoothness, a measure of gradient variation that we use to develop upper-bounds on the objective. Minimizing these upper-bounds requires solving implicit equations to obtain a sequence of strongly adapted step-sizes; we show that these equations are straightforward to solve for convex quadratics and lead to new guarantees for two classical step-sizes. For general functions, we prove that the Polyak step-size and normalized GD obtain fast, path-dependent rates despite using no knowledge of the directional smoothness. Experiments on logistic regression show our convergence guarantees are tighter than the classical theory based on L-smoothness.
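
As an illustration of the ideas named in the abstract, the following is a minimal sketch (not the authors' code) of gradient descent with the classical Polyak step-size on a diagonal convex quadratic. The objective f(x) = 0.5 * sum(a_i * x_i^2), the known optimal value f_star = 0, and all function names are assumptions made for this example. Along the path it also records the ratio ||grad f(x_next) - grad f(x)|| / ||x_next - x||, a simple measure of gradient variation between consecutive iterates; the paper's exact definition of directional smoothness may differ from this ratio.

```python
import math


def f(a, x):
    # Diagonal convex quadratic: f(x) = 0.5 * sum_i a_i * x_i^2 (assumed objective).
    return 0.5 * sum(ai * xi * xi for ai, xi in zip(a, x))


def grad(a, x):
    # Gradient of the diagonal quadratic above.
    return [ai * xi for ai, xi in zip(a, x)]


def norm(v):
    # Euclidean norm of a list of floats.
    return math.sqrt(sum(vi * vi for vi in v))


def polyak_gd(a, x0, steps, f_star=0.0):
    """Gradient descent with the classical Polyak step-size.

    Returns the final iterate and the gradient-variation ratio observed
    between each pair of consecutive iterates.
    """
    x = list(x0)
    ratios = []
    for _ in range(steps):
        g = grad(a, x)
        g_norm_sq = sum(gi * gi for gi in g)
        if g_norm_sq == 0.0:
            break  # already at a stationary point
        # Classical Polyak step: (f(x) - f_star) / ||grad f(x)||^2.
        eta = (f(a, x) - f_star) / g_norm_sq
        x_next = [xi - eta * gi for xi, gi in zip(x, g)]
        step = norm([u - v for u, v in zip(x_next, x)])
        if step > 0.0:
            g_next = grad(a, x_next)
            ratios.append(norm([u - v for u, v in zip(g_next, g)]) / step)
        x = x_next
    return x, ratios


if __name__ == "__main__":
    curvatures = [1.0, 10.0]  # global smoothness constant L = 10 for this f
    x_final, ratios = polyak_gd(curvatures, [1.0, 1.0], steps=200)
    print("final objective:", f(curvatures, x_final))
    print("largest path-wise ratio:", max(ratios))
```

For this quadratic the recorded ratios can never exceed the largest curvature (here 10), and they may be smaller along directions the iterates actually travel; that gap between path-wise and worst-case constants is the kind of difference the abstract refers to.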

Comments: Twenty-four pages
Subjects: Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as: arXiv:2403.04081 [cs.LG] (or arXiv:2403.04081v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2403.04081

Submission history

From: Aaron Mishkin
[v1] Wed, 6 Mar 2024 22:24:05 UTC (2,526 KB)
