Near-optimal Reinforcement Learning using Bayesian Quantiles

Tossou, Aristide; Basu, Debabrota; Dimitrakakis, Christos

arXiv:1906.09114v1 (cs)

[Submitted on 20 Jun 2019 (this version), latest version 9 Jul 2019 (v2)]

Abstract: We study model-based reinforcement learning in finite communicating Markov Decision Processes (MDPs). Algorithms in this setting have been developed along two lines. The first, which typically provides frequentist performance guarantees, uses optimism in the face of uncertainty as the guiding algorithmic principle. The second is based on Bayesian reasoning, combined with posterior sampling and Bayesian guarantees. In this paper, we develop a conceptually simple algorithm, Bayes-UCRL, that combines the benefits of both approaches to achieve state-of-the-art performance for finite communicating MDPs. In particular, we use a Bayesian prior, as in posterior sampling. However, instead of sampling the MDP, we construct an optimistic MDP using the quantiles of the Bayesian prior. We show that this technique enjoys a high-probability worst-case regret of order $\tilde{\mathcal{O}}(\sqrt{DSAT})$. Experiments in a diverse set of environments show that our algorithms outperform previous methods.
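
The quantile idea in the abstract can be illustrated on a single Bernoulli reward. The sketch below is not the paper's Bayes-UCRL algorithm, which also covers transition probabilities and planning. It only shows how an upper quantile of a Beta distribution, updated with observed outcomes, can serve as an optimistic estimate in place of a sampled value. The Beta(1, 1) prior, the level `1 - delta`, and all function names are assumptions made for illustration.

```python
import math


def beta_cdf(x, a, b):
    """CDF of Beta(a, b) at x, valid for positive integers a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(
        math.comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1)
    )


def beta_quantile(q, a, b, tol=1e-10):
    """Quantile of Beta(a, b) at level q, found by bisection on the CDF."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if beta_cdf(mid, a, b) < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def optimistic_reward(successes, failures, delta):
    """Upper (1 - delta) quantile of the Beta distribution over the mean reward.

    Assumes a uniform Beta(1, 1) prior (hypothetical choice), so the updated
    distribution after the observations is Beta(1 + successes, 1 + failures).
    """
    return beta_quantile(1 - delta, 1 + successes, 1 + failures)


if __name__ == "__main__":
    # The optimistic estimate stays above 0.5 but tightens as data accumulates.
    for s, f in [(0, 0), (5, 5), (50, 50)]:
        print(s, f, round(optimistic_reward(s, f, delta=0.05), 4))
```

With no observations the estimate is simply the 0.95 quantile of the uniform distribution. As evenly split observations accumulate, it moves down towards 0.5, which is the behaviour an optimism-based method relies on.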

Comments:	arXiv admin note: substantial text overlap with arXiv:1905.12425
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Machine Learning (stat.ML)
Cite as:	arXiv:1906.09114 [cs.LG]
	(or arXiv:1906.09114v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1906.09114

Submission history

From: Aristide Charles Yedia Tossou
[v1] Thu, 20 Jun 2019 06:32:36 UTC (4,905 KB)
[v2] Tue, 9 Jul 2019 21:47:50 UTC (4,914 KB)
