Accelerating Value Iteration with Anchoring

Lee, Jongmin; Ryu, Ernest K.

arXiv:2305.16569v2 (cs)

[Submitted on 26 May 2023 (v1), last revised 28 Oct 2023 (this version, v2)]

Abstract: Value Iteration (VI) is foundational to the theory and practice of modern reinforcement learning, and it is known to converge at an $\mathcal{O}(\gamma^k)$-rate, where $\gamma$ is the discount factor. Surprisingly, however, the optimal rate for the VI setup was not known, and finding a general acceleration mechanism has been an open problem. In this paper, we present the first accelerated VI for both the Bellman consistency and optimality operators. Our method, called Anc-VI, is based on an \emph{anchoring} mechanism (distinct from Nesterov's acceleration), and it reduces the Bellman error faster than standard VI. In particular, Anc-VI exhibits an $\mathcal{O}(1/k)$-rate for $\gamma\approx 1$ or even $\gamma=1$, while standard VI has rate $\mathcal{O}(1)$ for $\gamma\ge 1-1/k$, where $k$ is the iteration count. We also provide a complexity lower bound matching the upper bound up to a constant factor of $4$, thereby establishing optimality of the accelerated rate of Anc-VI. Finally, we show that the anchoring mechanism provides the same benefit in the approximate VI and Gauss--Seidel VI setups as well.
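
To make the anchoring idea concrete, the following is a minimal sketch for a small tabular MDP, not the authors' implementation. The abstract does not state the update rule, so the sketch assumes the form $U^k = \beta_k U^0 + (1-\beta_k)\,T U^{k-1}$ with $\beta_k = 1/\sum_{i=0}^{k}\gamma^{-2i}$, which reduces to $\beta_k = 1/(k+1)$ at $\gamma=1$; this schedule is recalled from the paper body and should be checked against it. The array layout and all function names (`make_random_mdp`, `bellman_optimality`, `anchor_weight`, `standard_vi`, `anchored_vi`, `bellman_error`) are hypothetical choices made for illustration.

```python
import numpy as np


def make_random_mdp(num_states, num_actions, seed=0):
    """Hypothetical helper: random tabular MDP with P[a, s, s'] and R[s, a]."""
    rng = np.random.default_rng(seed)
    P = rng.random((num_actions, num_states, num_states))
    P /= P.sum(axis=2, keepdims=True)  # each row becomes a probability distribution
    R = rng.random((num_states, num_actions))
    return P, R


def bellman_optimality(P, R, gamma, V):
    """Apply the Bellman optimality operator T to the value vector V."""
    # Q[s, a] = R[s, a] + gamma * sum_{s'} P[a, s, s'] * V[s']
    Q = R + gamma * np.einsum("ast,t->sa", P, V)
    return Q.max(axis=1)


def anchor_weight(k, gamma):
    """Assumed anchor coefficient beta_k = 1 / sum_{i=0}^{k} gamma^(-2i).

    Plain floats overflow for small gamma and very large k; keep k moderate.
    """
    return 1.0 / sum(gamma ** (-2 * i) for i in range(k + 1))


def standard_vi(P, R, gamma, V0, iters):
    """Standard value iteration: V <- T V."""
    V = V0.copy()
    for _ in range(iters):
        V = bellman_optimality(P, R, gamma, V)
    return V


def anchored_vi(P, R, gamma, V0, iters):
    """Anchored value iteration (assumed form): V <- beta_k V0 + (1 - beta_k) T V."""
    V = V0.copy()
    for k in range(1, iters + 1):
        beta = anchor_weight(k, gamma)
        # Pull the Bellman update back toward the starting point V0 (the anchor).
        V = beta * V0 + (1.0 - beta) * bellman_optimality(P, R, gamma, V)
    return V


def bellman_error(P, R, gamma, V):
    """Sup-norm Bellman error ||T V - V||_inf."""
    return float(np.max(np.abs(bellman_optimality(P, R, gamma, V) - V)))


if __name__ == "__main__":
    P, R = make_random_mdp(num_states=10, num_actions=3, seed=0)
    gamma, iters = 0.99, 50
    V0 = np.zeros(10)
    print("standard VI Bellman error:",
          bellman_error(P, R, gamma, standard_vi(P, R, gamma, V0, iters)))
    print("anchored VI Bellman error:",
          bellman_error(P, R, gamma, anchored_vi(P, R, gamma, V0, iters)))
```

The rates quoted in the abstract are worst-case guarantees, so a particular random instance need not show anchored VI ahead of standard VI; the script only prints both Bellman errors for comparison. One property that holds by construction is that starting from a fixed point of $T$ leaves the anchored iterates at that fixed point, since the update is then a convex combination of two identical vectors.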

Subjects:	Machine Learning (cs.LG); Optimization and Control (math.OC)
Cite as:	arXiv:2305.16569 [cs.LG]
	(or arXiv:2305.16569v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2305.16569
Journal reference:	Neural Information Processing Systems 2023

Submission history

From: Jongmin Lee
[v1] Fri, 26 May 2023 01:32:21 UTC (571 KB)
[v2] Sat, 28 Oct 2023 12:59:45 UTC (600 KB)
