BeBold: Exploration Beyond the Boundary of Explored Regions

Zhang, Tianjun; Xu, Huazhe; Wang, Xiaolong; Wu, Yi; Keutzer, Kurt; Gonzalez, Joseph E.; Tian, Yuandong

arXiv:2012.08621 (cs)

[Submitted on 15 Dec 2020]

Abstract: Efficient exploration under sparse rewards remains a key challenge in deep reinforcement learning. To guide exploration, previous work makes extensive use of intrinsic reward (IR). There are many heuristics for IR, including visitation counts, curiosity, and state-difference. In this paper, we analyze the pros and cons of each method and propose the regulated difference of inverse visitation counts as a simple but effective criterion for IR. The criterion helps the agent explore Beyond the Boundary of explored regions and mitigates common issues in count-based methods, such as short-sightedness and detachment. The resulting method, BeBold, solves the 12 most challenging procedurally-generated tasks in MiniGrid with just 120M environment steps, without any curriculum learning. In comparison, the previous SoTA only solves 50% of the tasks. BeBold also achieves SoTA on multiple tasks in NetHack, a popular rogue-like game that contains more challenging procedurally-generated environments.
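The abstract names the criterion but does not spell out a formula, so the following is only a minimal sketch of one plausible reading: the intrinsic reward is the difference of inverse visitation counts between the next state and the current state, with "regulated" assumed here to mean clipping negative values to zero. The class name `InverseCountBonus`, its methods, and the use of hashable state keys are all hypothetical choices for illustration, not the authors' implementation.

```python
from collections import defaultdict


class InverseCountBonus:
    """Hypothetical helper: clipped difference of inverse visitation counts."""

    def __init__(self):
        # Visitation counts N(s), keyed by any hashable state representation.
        self.counts = defaultdict(int)

    def visit(self, state):
        """Record one visit to `state`."""
        self.counts[state] += 1

    def reward(self, state, next_state):
        """Intrinsic reward for the transition state -> next_state.

        Assumed form: max(1/N(next_state) - 1/N(state), 0).
        """
        self.visit(next_state)
        inv_next = 1.0 / self.counts[next_state]
        # Guard against an unrecorded current state (assumption: treat as N=1).
        inv_cur = 1.0 / max(self.counts[state], 1)
        return max(inv_next - inv_cur, 0.0)


bonus = InverseCountBonus()
for _ in range(3):
    bonus.visit("a")                    # "a" is a well-explored state
r_out = bonus.reward("a", "b")          # stepping into a never-seen state
r_back = bonus.reward("b", "a")         # stepping back into the explored region
```

Under this reading, a step from a frequently visited state into a rarely visited one earns a positive bonus, while a step back toward well-explored states earns nothing, which matches the abstract's description of rewarding moves beyond the boundary of explored regions.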

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Machine Learning (stat.ML)
Cite as:	arXiv:2012.08621 [cs.LG]
	(or arXiv:2012.08621v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2012.08621

Submission history

From: Tianjun Zhang
[v1] Tue, 15 Dec 2020 21:26:54 UTC (1,752 KB)
