Correcting Experience Replay for Multi-Agent Communication

Ahilan, Sanjeevan; Dayan, Peter


arXiv:2010.01192 (cs)

[Submitted on 2 Oct 2020 (v1), last revised 28 Feb 2021 (this version, v2)]



Abstract: We consider the problem of learning to communicate using multi-agent reinforcement learning (MARL). A common approach is to learn off-policy, using data sampled from a replay buffer. However, messages received in the past may not accurately reflect the current communication policy of each agent, and this complicates learning. We therefore introduce a 'communication correction' which accounts for the non-stationarity of observed communication induced by multi-agent learning. It works by relabelling the received message to make it likely under the communicator's current policy, so that it better reflects the receiver's current environment. To account for cases in which agents are both senders and receivers, we introduce an ordered relabelling scheme. Our correction is computationally efficient and can be integrated with a range of off-policy algorithms. We find in our experiments that it substantially improves the ability of communicating MARL systems to learn across a variety of cooperative and competitive tasks.
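The relabelling idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is one plausible reading of the abstract, assuming deterministic communication policies, a stored trajectory of private observations, and that each agent's message at step t depends on its own observation and on the other agents' messages from step t-1. The names `relabel_trajectory`, `private_obs`, `initial_msgs`, and `comm_policies` are hypothetical. Messages are recomputed in temporal order so that each relabelled message is built from already-relabelled inputs, which is the role an ordered scheme would play when agents both send and receive.

```python
import numpy as np

def relabel_trajectory(private_obs, initial_msgs, comm_policies):
    """Recompute stored messages under the agents' current communication policies.

    private_obs:   array of shape (T, n_agents, obs_dim), sampled from a replay buffer.
    initial_msgs:  array of shape (n_agents, msg_dim), messages assumed present before step 0.
    comm_policies: list of n_agents callables (own_obs, incoming_msgs) -> message (msg_dim,).
                   These stand in for the *current* policies (an assumption of this sketch).
    Returns an array of shape (T, n_agents, msg_dim); entry [t, i] is the message
    agent i would send at step t under its current policy.
    """
    T, n_agents, _ = private_obs.shape
    prev = np.asarray(initial_msgs, dtype=float)
    out = np.empty((T, n_agents, prev.shape[1]))
    for t in range(T):  # temporal order: earlier messages are relabelled first
        for i, policy in enumerate(comm_policies):
            # Agent i receives the (already relabelled) previous-step messages of the others.
            incoming = np.delete(prev, i, axis=0)
            out[t, i] = policy(private_obs[t, i], incoming)
        prev = out[t]
    return out

if __name__ == "__main__":
    # Toy usage: two agents whose message is the sum of their observation and incoming messages.
    toy_policy = lambda o, inc: np.array([o.sum() + inc.sum()])
    relabelled = relabel_trajectory(np.ones((3, 2, 2)), np.zeros((2, 1)), [toy_policy, toy_policy])
    print(relabelled[:, :, 0])
```

In an off-policy learner, the relabelled messages would replace the stale ones in the sampled batch before the update is computed; how the paper handles stochastic policies or the exact ordering is not stated in the abstract and is left open here.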

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
Cite as: arXiv:2010.01192 [cs.LG] (or arXiv:2010.01192v2 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.2010.01192

Submission history

From: Sanjeevan Ahilan
[v1] Fri, 2 Oct 2020 20:49:24 UTC (2,304 KB)
[v2] Sun, 28 Feb 2021 22:42:12 UTC (5,821 KB)
