Reinforcement Learning with Random Delays

Ramstedt, Simon; Bouteiller, Yann; Beltrame, Giovanni; Pal, Christopher; Binas, Jonathan

arXiv:2010.02966v3 (cs)

[Submitted on 6 Oct 2020 (v1), last revised 4 May 2021 (this version, v3)]

Abstract: Action and observation delays commonly occur in many Reinforcement Learning applications, such as remote control scenarios. We study the anatomy of randomly delayed environments, and show that partially resampling trajectory fragments in hindsight allows for off-policy multi-step value estimation. We apply this principle to derive Delay-Correcting Actor-Critic (DCAC), an algorithm based on Soft Actor-Critic with significantly better performance in environments with delays. This is shown theoretically and also demonstrated practically on a delay-augmented version of the MuJoCo continuous control benchmark.
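
To make the notion of a randomly delayed environment concrete, here is a minimal Python sketch of a random action delay wrapped around a toy environment. It is an illustration only, not the paper's DCAC algorithm or its exact delay model: the names `ToyIntegrator` and `RandomActionDelay`, the uniform delay distribution, and the choice to return the action buffer alongside the observation are all assumptions made for this example.

```python
import random
from collections import deque


class ToyIntegrator:
    """Hypothetical 1-D environment: the state moves by the applied action."""

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, action):
        self.x += action
        reward = -abs(self.x)  # reward is highest when the state stays near 0
        return self.x, reward


class RandomActionDelay:
    """Applies the action sent `delay` steps ago, with `delay` drawn
    uniformly from 0..max_delay at every step (an assumed distribution)."""

    def __init__(self, env, max_delay, default_action=0.0, seed=None):
        self.env = env
        self.max_delay = max_delay
        self.default_action = default_action
        self.rng = random.Random(seed)

    def reset(self):
        # Buffer of the most recent actions, newest first, padded with defaults.
        size = self.max_delay + 1
        self.buffer = deque([self.default_action] * size, maxlen=size)
        obs = self.env.reset()
        return obs, tuple(self.buffer)

    def step(self, action):
        self.buffer.appendleft(action)       # index 0 is the action just sent
        delay = self.rng.randint(0, self.max_delay)
        applied = self.buffer[delay]         # action sent `delay` steps ago
        obs, reward = self.env.step(applied)
        # The observation is augmented with the action buffer, so the agent
        # can see which actions may still be waiting to take effect.
        return (obs, tuple(self.buffer)), reward, delay
```

With `max_delay=0` the wrapper reduces to the undelayed environment. With a larger `max_delay`, the agent cannot know in advance which buffered action will be applied, which is why this sketch exposes the buffer as part of the observation.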

Comments:	ICLR 2021
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:2010.02966 [cs.LG]
	(or arXiv:2010.02966v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2010.02966

Submission history

From: Yann Bouteiller
[v1] Tue, 6 Oct 2020 18:39:23 UTC (12,221 KB)
[v2] Thu, 8 Oct 2020 17:56:31 UTC (11,948 KB)
[v3] Tue, 4 May 2021 20:27:33 UTC (14,915 KB)
