Behavior-Guided Reinforcement Learning

Pacchiano, Aldo; Parker-Holder, Jack; Tang, Yunhao; Choromanska, Anna; Choromanski, Krzysztof; Jordan, Michael I.

arXiv:1906.04349v3 (cs)

[Submitted on 11 Jun 2019 (v1), revised 29 Sep 2019 (this version, v3), latest version 4 Mar 2020 (v4)]

Abstract: We introduce a new approach for comparing reinforcement learning policies, using Wasserstein distances (WDs) in a newly defined latent behavioral space. We show that by utilizing the dual formulation of the WD, we can learn score functions over trajectories that can be in turn used to lead policy optimization towards (or away from) (un)desired behaviors. Combined with smoothed WDs, the dual formulation allows us to devise efficient algorithms that take stochastic gradient descent steps through WD regularizers. We incorporate these regularizers into two novel on-policy algorithms, Behavior-Guided Policy Gradient and Behavior-Guided Evolution Strategies, which we demonstrate can outperform existing methods in a variety of challenging environments. We also provide an open source demo.
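
The idea of a smoothed WD with dual potentials can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract describes stochastic gradient steps on the dual, whereas this sketch substitutes log-domain Sinkhorn iterations on a small discrete problem. The function name `smoothed_wd_dual`, the squared Euclidean cost, the regularization strength `gamma`, and the use of plain vectors as stand-ins for behavioral embeddings of trajectories are all assumptions made for illustration.

```python
import numpy as np


def _logsumexp(a, axis):
    # Numerically stable log(sum(exp(a))) along one axis.
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))


def smoothed_wd_dual(x, y, gamma=0.1, n_iters=200):
    """Entropy-smoothed optimal transport between two point clouds.

    x: (n, d) array, hypothetical behavioral embeddings from one policy.
    y: (m, d) array, hypothetical behavioral embeddings from another policy.
    Returns (transport_cost, f, g), where f and g are the dual potentials.
    """
    n, m = len(x), len(y)
    # Assumed ground cost: squared Euclidean distance between embeddings.
    cost = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    # Uniform weights on both empirical measures, stored as logs.
    log_a = np.full(n, -np.log(n))
    log_b = np.full(m, -np.log(m))
    f = np.zeros(n)
    g = np.zeros(m)
    for _ in range(n_iters):
        # Alternating exact maximization of the smoothed dual in f, then g.
        f = -gamma * _logsumexp((g[None, :] - cost) / gamma + log_b[None, :], axis=1)
        g = -gamma * _logsumexp((f[:, None] - cost) / gamma + log_a[:, None], axis=0)
    # Transport plan implied by the current potentials.
    log_plan = (f[:, None] + g[None, :] - cost) / gamma + log_a[:, None] + log_b[None, :]
    plan = np.exp(log_plan)
    return float(np.sum(plan * cost)), f, g


rng = np.random.default_rng(0)
emb = rng.normal(size=(20, 2))
d_near, f_near, g_near = smoothed_wd_dual(emb, emb + 0.01)
d_far, f_far, g_far = smoothed_wd_dual(emb, emb + 5.0)
print(d_near < d_far)
```

In this sketch the potentials `f` and `g` assign one value per embedded trajectory, which is loosely analogous to the score functions over trajectories mentioned in the abstract. How such scores would enter a policy gradient or evolution strategies update is not specified here and is left to the paper.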

Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:1906.04349 [cs.LG]
	(or arXiv:1906.04349v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.04349

Submission history

From: Jack Parker-Holder
[v1] Tue, 11 Jun 2019 02:06:51 UTC (5,600 KB)
[v2] Wed, 19 Jun 2019 14:57:54 UTC (5,600 KB)
[v3] Sun, 29 Sep 2019 16:02:32 UTC (7,565 KB)
[v4] Wed, 4 Mar 2020 08:27:28 UTC (7,787 KB)
