Dynamic Safe Interruptibility for Decentralized Multi-Agent Reinforcement Learning

Mhamdi, El Mahdi El; Guerraoui, Rachid; Hendrikx, Hadrien; Maurer, Alexandre

arXiv:1704.02882 (cs)

[Submitted on 10 Apr 2017 (v1), last revised 22 May 2017 (this version, v2)]

Abstract: In reinforcement learning, agents learn by performing actions and observing their outcomes. Sometimes, it is desirable for a human operator to interrupt an agent in order to prevent dangerous situations from happening. Yet, as part of their learning process, agents may link these interruptions, which impact their reward, to specific states and deliberately avoid them. The situation is particularly challenging in a multi-agent context because agents might not only learn from their own past interruptions, but also from those of other agents. Orseau and Armstrong defined safe interruptibility for one learner, but their work does not naturally extend to multi-agent systems. This paper introduces dynamic safe interruptibility, an alternative definition more suited to decentralized learning problems, and studies this notion in two learning frameworks: joint action learners and independent learners. We give realistic sufficient conditions on the learning algorithm to enable dynamic safe interruptibility in the case of joint action learners, yet show that these conditions are not sufficient for independent learners. We show however that if agents can detect interruptions, it is possible to prune the observations to ensure dynamic safe interruptibility even for independent learners.
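
The pruning idea mentioned at the end of the abstract can be illustrated with a minimal sketch. It assumes a tabular, independent Q-learner that receives a boolean flag whenever an interruption is detected and simply discards that observation. The class name `PruningQLearner`, the `interrupted` flag, and the hyperparameters are hypothetical and chosen for illustration only; the paper's actual pruning rule and the conditions under which it guarantees dynamic safe interruptibility may differ.

```python
from collections import defaultdict


class PruningQLearner:
    """Independent tabular Q-learner that ignores interrupted transitions.

    Illustrative sketch only: not the authors' implementation.
    """

    def __init__(self, actions, alpha=0.5, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        self.q = defaultdict(float)  # maps (state, action) -> value estimate

    def update(self, state, action, reward, next_state, interrupted):
        # Hypothetical pruning rule: drop any observation made while an
        # interruption was detected, so it never influences the Q-values.
        if interrupted:
            return False
        # Standard Q-learning update on the observations that are kept.
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        return True


learner = PruningQLearner(actions=["left", "right"])
learner.update("s0", "right", 1.0, "s1", interrupted=False)  # used for learning
learner.update("s0", "left", -10.0, "s1", interrupted=True)  # pruned
```

In this sketch the penalized, interrupted transition leaves the Q-table untouched, so the learner has no value estimate that would steer it away from the state where the interruption occurred.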

Subjects:	Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multiagent Systems (cs.MA); Machine Learning (stat.ML)
Cite as:	arXiv:1704.02882 [cs.AI]
	(or arXiv:1704.02882v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.1704.02882

Submission history

From: Hadrien Hendrikx
[v1] Mon, 10 Apr 2017 14:38:37 UTC (24 KB)
[v2] Mon, 22 May 2017 11:01:28 UTC (24 KB)
