Iterated $Q$-Network: Beyond One-Step Bellman Updates in Deep Reinforcement Learning

Vincent, Théo; Palenicek, Daniel; Belousov, Boris; Peters, Jan; D'Eramo, Carlo

arXiv:2403.02107 (cs)

[Submitted on 4 Mar 2024 (v1), last revised 24 Oct 2024 (this version, v3)]

Abstract: Most reinforcement learning methods are strongly affected by the computational effort and data required to obtain effective estimates of action-value functions, which in turn determine overall performance and the sample efficiency of the learning procedure. Typically, action-value functions are estimated through an iterative scheme that alternates between applying an empirical approximation of the Bellman operator and projecting the result onto a chosen function space. It has been observed that this scheme can potentially be generalized to carry out multiple iterations of the Bellman operator at once, benefiting the underlying learning algorithm. Until now, however, it has been challenging to implement this idea effectively, especially in high-dimensional problems. In this paper, we introduce iterated $Q$-Network (i-QN), a novel principled approach that enables multiple consecutive Bellman updates by learning a tailored sequence of action-value functions where each serves as the target for the next. We show that i-QN is theoretically grounded and that it can be seamlessly used in value-based and actor-critic methods. We empirically demonstrate the advantages of i-QN in Atari 2600 games and MuJoCo continuous control problems.
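The chaining idea in the abstract, a sequence of action-value functions where each serves as the target for the next, can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the paper works with neural networks, while the tables, the toy two-state MDP, the learning rate, and the names `bellman_target`, `chained_update`, and `q_chain` below are all assumptions made for illustration.

```python
def bellman_target(q_prev, reward, next_state, done, gamma):
    """Empirical Bellman optimality target built from the previous table in the chain."""
    if done:
        return reward
    return reward + gamma * max(q_prev[next_state])


def chained_update(q_chain, transition, gamma, lr):
    """Update Q_1..Q_K on one transition; Q_{k-1} acts as the target for Q_k.

    Q_0 (q_chain[0]) is never updated here. Returns the summed squared
    TD error over the K chained regressions.
    """
    state, action, reward, next_state, done = transition
    # Compute every target before updating anything, so each Q_k regresses
    # toward the pre-update value of Q_{k-1}.
    targets = [
        bellman_target(q_chain[k - 1], reward, next_state, done, gamma)
        for k in range(1, len(q_chain))
    ]
    loss = 0.0
    for k, target in enumerate(targets, start=1):
        td_error = target - q_chain[k][state][action]
        loss += td_error ** 2
        q_chain[k][state][action] += lr * td_error
    return loss


# Hypothetical toy MDP as (state, action, reward, next_state, done) tuples.
# From state 0, action 0 moves to state 1 with no reward; state 1 pays 1.0
# for action 0. All other actions terminate immediately.
transitions = [
    (0, 0, 0.0, 1, False),
    (0, 1, 0.5, 0, True),
    (1, 0, 1.0, 0, True),
    (1, 1, 0.0, 0, True),
]

num_states, num_actions, K = 2, 2, 2
# Chain of K + 1 tables, all initialised to zero; index 0 is the fixed Q_0.
q_chain = [
    [[0.0] * num_actions for _ in range(num_states)]
    for _ in range(K + 1)
]

for sweep in range(3):
    sweep_loss = sum(
        chained_update(q_chain, t, gamma=0.9, lr=1.0) for t in transitions
    )
```

In this toy setting, `q_chain[1]` ends up as one Bellman update of `q_chain[0]`, and `q_chain[2]` as two: only the second table assigns a nonzero value to `(state 0, action 0)`, because the reward there is two steps away.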

Comments:	Preprint
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.02107 [cs.LG]
	(or arXiv:2403.02107v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2403.02107

Submission history

From: Théo Vincent
[v1] Mon, 4 Mar 2024 15:07:33 UTC (5,858 KB)
[v2] Sat, 25 May 2024 11:42:15 UTC (6,682 KB)
[v3] Thu, 24 Oct 2024 16:50:57 UTC (5,132 KB)
