Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Balakrishnan, Sreejith; Nguyen, Quoc Phong; Low, Bryan Kian Hsiang; Soh, Harold

arXiv:2011.08541 (cs)

[Submitted on 17 Nov 2020]

Title: Efficient Exploration of Reward Functions in Inverse Reinforcement Learning via Bayesian Optimization

Authors: Sreejith Balakrishnan, Quoc Phong Nguyen, Bryan Kian Hsiang Low, Harold Soh

Abstract: The problem of inverse reinforcement learning (IRL) is relevant to a variety of tasks including value alignment and robot learning from demonstration. Despite significant algorithmic contributions in recent years, IRL remains an ill-posed problem at its core; multiple reward functions coincide with the observed behavior and the actual reward function is not identifiable without prior knowledge or supplementary information. This paper presents an IRL framework called Bayesian optimization-IRL (BO-IRL) which identifies multiple solutions that are consistent with the expert demonstrations by efficiently exploring the reward function space. BO-IRL achieves this by utilizing Bayesian optimization along with our newly proposed kernel that (a) projects the parameters of policy-invariant reward functions to a single point in a latent space and (b) ensures nearby points in the latent space correspond to reward functions yielding similar likelihoods. This projection allows the use of standard stationary kernels in the latent space to capture the correlations present across the reward function space. Empirical results on synthetic and real-world environments (model-free and model-based) show that BO-IRL discovers multiple reward functions while minimizing the number of expensive exact policy optimizations.
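The kernel idea in (a) and (b) can be illustrated with a toy sketch. The code below is not the projection proposed in the paper; it is a hypothetical stand-in that assumes a linear reward over trajectory features and a Boltzmann distribution over a small fixed set of trajectories. The names `trajectory_probs`, `project`, `rbf`, and `latent_kernel` are made up for this illustration. It shows how two reward parameter vectors that induce the same trajectory distribution collapse to one latent point, so that a standard stationary (RBF) kernel applied in the latent space treats them as identical, whereas the same kernel applied to the raw parameters does not.

```python
import numpy as np

def trajectory_probs(theta, features):
    """Boltzmann distribution over a fixed set of trajectories (toy assumption).

    features: (n_traj, d) array of trajectory feature counts.
    theta:    (d,) array of reward weights.
    """
    returns = features @ theta
    returns = returns - returns.max()  # shift for numerical stability
    weights = np.exp(returns)
    return weights / weights.sum()

def project(theta, features):
    """Hypothetical latent projection: the vector of trajectory probabilities."""
    return trajectory_probs(theta, features)

def rbf(u, v, lengthscale=1.0):
    """Standard stationary RBF kernel between two vectors."""
    diff = np.asarray(u, dtype=float) - np.asarray(v, dtype=float)
    return float(np.exp(-0.5 * (diff @ diff) / lengthscale**2))

def latent_kernel(theta_a, theta_b, features, lengthscale=1.0):
    """RBF kernel evaluated on the projected (latent) points."""
    return rbf(project(theta_a, features), project(theta_b, features), lengthscale)

# Three toy trajectories; the last feature is constant across all of them,
# so changing its weight shifts every return equally and leaves the
# trajectory distribution (and hence the likelihood) unchanged.
features = np.array([[1.0, 0.0, 1.0],
                     [0.0, 1.0, 1.0],
                     [1.0, 1.0, 1.0]])
theta_a = np.array([1.0, -0.5, 0.0])
theta_b = np.array([1.0, -0.5, 3.0])  # differs only in the constant-feature weight

k_latent = latent_kernel(theta_a, theta_b, features)  # equivalent rewards: maximal similarity
k_raw = rbf(theta_a, theta_b)                         # raw parameters look far apart
```

In this toy setting a Gaussian process surrogate built on `latent_kernel` would not need to evaluate `theta_b` separately once `theta_a` has been evaluated, which is the kind of saving the abstract attributes to exploring the reward space in the latent representation.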

Comments:	Accepted to 34th Conference on Neural Information Processing Systems (NeurIPS 2020). Includes Appendix. 21 pages
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:2011.08541 [cs.LG]
	(or arXiv:2011.08541v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2011.08541

Submission history

From: Sreejith Balakrishnan
[v1] Tue, 17 Nov 2020 10:17:45 UTC (5,319 KB)
