Computer Science > Machine Learning
[Submitted on 20 Jun 2019 (this version), latest version 9 Jul 2019 (v2)]
Title:Near-optimal Reinforcement Learning using Bayesian Quantiles
View PDFAbstract:We study model-based reinforcement learning in finite communicating Markov Decision Process. Algorithms in this settings have been developed in two different ways: the first view, which typically provides frequentist performance guarantees, uses optimism in the face of uncertainty as the guiding algorithmic principle. The second view is based on Bayesian reasoning, combined with posterior sampling and Bayesian guarantees. In this paper, we develop a conceptually simple algorithm, Bayes-UCRL that combines the benefits of both approaches to achieve state-of-the-art performance for finite communicating MDP. In particular, we use Bayesian Prior similarly to Posterior Sampling. However, instead of sampling the MDP, we construct an optimistic MDP using the quantiles of the Bayesian prior. We show that this technique enjoys a high probability worst-case regret of order $\tilde{\mathcal{O}}(\sqrt{DSAT})$. Experiments in a diverse set of environments show that our algorithms outperform previous methods.
Submission history
From: Aristide Charles Yedia Tossou [view email][v1] Thu, 20 Jun 2019 06:32:36 UTC (4,905 KB)
[v2] Tue, 9 Jul 2019 21:47:50 UTC (4,914 KB)
Current browse context:
cs.LG
References & Citations
Bibliographic and Citation Tools
Bibliographic Explorer (What is the Explorer?)
Litmaps (What is Litmaps?)
scite Smart Citations (What are Smart Citations?)
Code, Data and Media Associated with this Article
CatalyzeX Code Finder for Papers (What is CatalyzeX?)
DagsHub (What is DagsHub?)
Gotit.pub (What is GotitPub?)
Papers with Code (What is Papers with Code?)
ScienceCast (What is ScienceCast?)
Demos
Recommenders and Search Tools
Influence Flower (What are Influence Flowers?)
Connected Papers (What is Connected Papers?)
CORE Recommender (What is CORE?)
IArxiv Recommender
(What is IArxiv?)
arXivLabs: experimental projects with community collaborators
arXivLabs is a framework that allows collaborators to develop and share new arXiv features directly on our website.
Both individuals and organizations that work with arXivLabs have embraced and accepted our values of openness, community, excellence, and user data privacy. arXiv is committed to these values and only works with partners that adhere to them.
Have an idea for a project that will add value for arXiv's community? Learn more about arXivLabs.