Adversarially Guided Actor-Critic

Flet-Berliac, Yannis; Ferret, Johan; Pietquin, Olivier; Preux, Philippe; Geist, Matthieu

arXiv:2102.04376 (cs)

[Submitted on 8 Feb 2021]

Abstract: Despite definite success in deep reinforcement learning problems, actor-critic algorithms are still confronted with sample inefficiency in complex environments, particularly in tasks where efficient exploration is a bottleneck. These methods consider a policy (the actor) and a value function (the critic) whose respective losses are built using different motivations and approaches. This paper introduces a third protagonist: the adversary. While the adversary mimics the actor by minimizing the KL-divergence between their respective action distributions, the actor, in addition to learning to solve the task, tries to differentiate itself from the adversary predictions. This novel objective stimulates the actor to follow strategies that could not have been correctly predicted from previous trajectories, making its behavior innovative in tasks where the reward is extremely rare. Our experimental analysis shows that the resulting Adversarially Guided Actor-Critic (AGAC) algorithm leads to more exhaustive exploration. Notably, AGAC outperforms current state-of-the-art methods on a set of various hard-exploration and procedurally-generated tasks.
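
To make the actor/adversary interplay described in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract only states that the adversary minimizes the KL-divergence to the actor's action distribution and that the actor tries to differ from the adversary's predictions. The function names, the log-probability-difference bonus, and the coefficient `c` are illustrative assumptions.

```python
import math

def softmax(logits):
    """Turn raw action scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same actions."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def adversary_loss(actor_probs, adversary_probs):
    """The adversary mimics the actor: it minimizes KL(actor || adversary)."""
    return kl_divergence(actor_probs, adversary_probs)

def guided_advantage(advantage, actor_probs, adversary_probs, action, c=0.1):
    """Hypothetical actor signal: the task advantage plus a bonus that grows
    when the adversary assigned low probability to the action taken.
    The exact form of the bonus and the weight c are assumptions."""
    bonus = math.log(actor_probs[action]) - math.log(adversary_probs[action])
    return advantage + c * bonus

# Toy distributions over three actions.
actor = softmax([2.0, 0.5, 0.1])
adversary_close = softmax([2.0, 0.5, 0.1])  # predicts the actor perfectly
adversary_far = softmax([0.1, 0.5, 2.0])    # predicts the actor poorly

loss_close = adversary_loss(actor, adversary_close)
loss_far = adversary_loss(actor, adversary_far)
adv_close = guided_advantage(1.0, actor, adversary_close, action=0)
adv_far = guided_advantage(1.0, actor, adversary_far, action=0)
```

In this toy setup, an adversary that already predicts the actor contributes no bonus, while a poorly predicting adversary increases the actor's learning signal for the action it failed to anticipate, which is the exploration pressure the abstract describes.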

Comments:	Accepted at ICLR 2021
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2102.04376 [cs.LG]
	(or arXiv:2102.04376v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2102.04376

Submission history

From: Yannis Flet-Berliac
[v1] Mon, 8 Feb 2021 17:31:13 UTC (1,431 KB)
