In-context Reinforcement Learning with Algorithm Distillation

Laskin, Michael; Wang, Luyu; Oh, Junhyuk; Parisotto, Emilio; Spencer, Stephen; Steigerwald, Richie; Strouse, DJ; Hansen, Steven; Filos, Angelos; Brooks, Ethan; Gazeau, Maxime; Sahni, Himanshu; Singh, Satinder; Mnih, Volodymyr

Computer Science > Machine Learning

arXiv:2210.14215 (cs)

[Submitted on 25 Oct 2022]

Title:In-context Reinforcement Learning with Algorithm Distillation

Authors:Michael Laskin, Luyu Wang, Junhyuk Oh, Emilio Parisotto, Stephen Spencer, Richie Steigerwald, DJ Strouse, Steven Hansen, Angelos Filos, Ethan Brooks, Maxime Gazeau, Himanshu Sahni, Satinder Singh, Volodymyr Mnih

View PDF

Abstract:We propose Algorithm Distillation (AD), a method for distilling reinforcement learning (RL) algorithms into neural networks by modeling their training histories with a causal sequence model. Algorithm Distillation treats learning to reinforcement learn as an across-episode sequential prediction problem. A dataset of learning histories is generated by a source RL algorithm, and then a causal transformer is trained by autoregressively predicting actions given their preceding learning histories as context. Unlike sequential policy prediction architectures that distill post-learning or expert sequences, AD is able to improve its policy entirely in-context without updating its network parameters. We demonstrate that AD can reinforcement learn in-context in a variety of environments with sparse rewards, combinatorial task structure, and pixel-based observations, and find that AD learns a more data-efficient RL algorithm than the one that generated the source data.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2210.14215 [cs.LG]
	(or arXiv:2210.14215v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2210.14215

Submission history

From: Michael Laskin [view email]
[v1] Tue, 25 Oct 2022 17:57:49 UTC (2,993 KB)

Computer Science > Machine Learning

Title:In-context Reinforcement Learning with Algorithm Distillation

Submission history

Access Paper:

References & Citations

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators

Computer Science > Machine Learning

Title:In-context Reinforcement Learning with Algorithm Distillation

Submission history

Access Paper:

References & Citations

BibTeX formatted citation

Bookmark

Bibliographic and Citation Tools

Code, Data and Media Associated with this Article

Demos

Recommenders and Search Tools

arXivLabs: experimental projects with community collaborators