Using Monte Carlo Tree Search as a Demonstrator within Asynchronous Deep RL

Kartal, Bilal; Hernandez-Leal, Pablo; Taylor, Matthew E.

arXiv:1812.00045 (cs)

[Submitted on 30 Nov 2018]

Abstract: Deep reinforcement learning (DRL) has achieved great success in recent years with the help of novel methods and greater compute power. However, several challenges remain, such as convergence to locally optimal policies and long training times. In this paper, we first augment the Asynchronous Advantage Actor-Critic (A3C) method with a novel self-supervised auxiliary task, Terminal Prediction, which measures temporal closeness to terminal states; we call the resulting method A3C-TP. Second, we propose a new framework in which planning algorithms such as Monte Carlo tree search, or other sources of (simulated) demonstrators, can be integrated into asynchronous distributed DRL methods. Compared to vanilla A3C, both proposed methods learn faster and converge to better policies on a two-player mini version of the Pommerman game.
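
The abstract does not say how "temporal closeness to terminal states" is computed. As an illustration only, the sketch below assumes one plausible reading: for a finished episode of N steps, the target for step i (counting from 1) is i / N, and the auxiliary loss is the mean squared error between the predictions and these targets. The function names and the loss choice are hypothetical and are not taken from the abstract.

```python
def terminal_prediction_targets(episode_length):
    """Assumed targets: step i (1-based) of an N-step episode gets i / N.

    Values rise from 1/N at the first step to 1.0 at the terminal step.
    This is an illustrative assumption, not a definition from the abstract.
    """
    if episode_length <= 0:
        raise ValueError("episode_length must be positive")
    return [i / episode_length for i in range(1, episode_length + 1)]


def terminal_prediction_loss(predictions, targets):
    """Mean squared error between predicted and target closeness values."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        raise ValueError("at least one step is required")
    squared_errors = [(p - t) ** 2 for p, t in zip(predictions, targets)]
    return sum(squared_errors) / len(squared_errors)
```

In an actor-critic setup, a term like this would typically be added to the policy and value losses with a small weight, so that the auxiliary signal shapes the shared representation without dominating training. The weighting is also an assumption here.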

Comments:	9 pages, 6 figures, To appear at AAAI-19 Workshop on Reinforcement Learning in Games
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
Cite as:	arXiv:1812.00045 [cs.LG]
	(or arXiv:1812.00045v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1812.00045

Submission history

From: Bilal Kartal
[v1] Fri, 30 Nov 2018 20:37:17 UTC (688 KB)
