SANIA: Polyak-type Optimization Framework Leads to Scale Invariant Stochastic Algorithms

Abdukhakimov, Farshed; Xiang, Chulu; Kamzolov, Dmitry; Gower, Robert; Takáč, Martin

arXiv:2312.17369 (cs)

[Submitted on 28 Dec 2023]

Abstract: Adaptive optimization methods are widely recognized as among the most popular approaches for training Deep Neural Networks (DNNs). Techniques such as Adam, AdaGrad, and AdaHessian utilize a preconditioner that modifies the search direction by incorporating information about the curvature of the objective function. However, despite their adaptive characteristics, these methods still require manual fine-tuning of the step-size. This, in turn, impacts the time required to solve a particular problem. This paper presents an optimization framework named SANIA to tackle these challenges. Beyond eliminating the need for manual step-size hyperparameter settings, SANIA incorporates techniques to address poorly scaled or ill-conditioned problems. We also explore several preconditioning methods, including Hutchinson's method, which approximates the Hessian diagonal of the loss function. We conclude with an extensive empirical examination of the proposed techniques across classification tasks, covering both convex and non-convex contexts.
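The abstract mentions Hutchinson's method for approximating the Hessian diagonal. As a rough illustration only, and not the authors' implementation, the generic estimator averages z * (H z) over random sign (Rademacher) vectors z, which needs only Hessian-vector products. The function name `hutchinson_diag`, the sample count, and the small test matrix `A` below are assumptions made for this sketch; in a neural network setting the product H z would typically come from automatic differentiation rather than an explicit matrix.

```python
import random

def hutchinson_diag(hvp, dim, num_samples=2000, seed=0):
    """Estimate diag(H) as the average of z * (H z) over Rademacher vectors z.

    hvp: callable returning the Hessian-vector product H z for a list z.
    (Hypothetical helper for illustration; not taken from the paper.)
    """
    rng = random.Random(seed)
    est = [0.0] * dim
    for _ in range(num_samples):
        # Each entry of z is +1 or -1 with equal probability.
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        hz = hvp(z)
        for i in range(dim):
            est[i] += z[i] * hz[i]
    return [e / num_samples for e in est]

# Assumed toy problem: f(x) = 0.5 * x^T A x, whose Hessian is exactly A.
A = [[4.0, 1.0, 0.0],
     [1.0, 3.0, 0.5],
     [0.0, 0.5, 2.0]]

def hvp(v):
    # Explicit matrix-vector product, standing in for an autodiff HVP.
    return [sum(a * x for a, x in zip(row, v)) for row in A]

diag_est = hutchinson_diag(hvp, dim=3)
print(diag_est)  # should be close to the true diagonal [4.0, 3.0, 2.0]
```

The off-diagonal terms cancel in expectation because distinct entries of z are independent with zero mean, so the estimate approaches the true diagonal as the number of samples grows.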

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2312.17369 [cs.LG]
	(or arXiv:2312.17369v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2312.17369

Submission history

From: Dmitry Kamzolov
[v1] Thu, 28 Dec 2023 21:28:08 UTC (1,664 KB)
