Reinforcement Learning with Non-Exponential Discounting

Schultheis, Matthias; Rothkopf, Constantin A.; Koeppl, Heinz

arXiv:2209.13413 (cs)

[Submitted on 27 Sep 2022 (v1), last revised 7 Dec 2022 (this version, v2)]

Abstract: Commonly in reinforcement learning (RL), rewards are discounted over time using an exponential function to model time preference, thereby bounding the expected long-term reward. In contrast, in economics and psychology, it has been shown that humans often adopt a hyperbolic discounting scheme, which is optimal when a specific task termination time distribution is assumed. In this work, we propose a theory for continuous-time model-based reinforcement learning generalized to arbitrary discount functions. This formulation covers the case in which there is a non-exponential random termination time. We derive a Hamilton-Jacobi-Bellman (HJB) equation characterizing the optimal policy and describe how it can be solved using a collocation method, which uses deep learning for function approximation. Further, we show how the inverse RL problem can be approached, in which one tries to recover properties of the discount function given decision data. We validate the applicability of our proposed approach on two simulated problems. Our approach opens the way for the analysis of human discounting in sequential decision-making tasks.
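
The link between discounting and a random termination time mentioned in the abstract can be illustrated with a small numerical sketch. It treats the discount function as the probability that the task has not yet terminated at time t. The specific prior used here (an exponential prior with mean k over an unknown constant hazard rate) is an assumption chosen for illustration, as are all function names; the abstract does not state which termination distribution the authors use. Under that assumption, the survival probability works out to the hyperbolic form 1 / (1 + k t).

```python
import math


def exponential_discount(t, rate):
    """Survival probability when the termination hazard is a known constant."""
    return math.exp(-rate * t)


def hyperbolic_discount(t, k):
    """Hyperbolic discount function 1 / (1 + k t)."""
    return 1.0 / (1.0 + k * t)


def mixture_survival(t, k, n=20_000, upper=40.0):
    """Survival probability when the constant hazard rate is unknown.

    Assumption (illustrative): the hazard rate has an exponential prior
    with mean k. The integration variable u is the rate in units of k,
    so its prior density is exp(-u), and the survival probability given
    u is exp(-k * u * t). The integral over u is approximated with the
    midpoint rule on [0, upper].
    """
    h = upper / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        total += math.exp(-u) * math.exp(-k * u * t)
    return total * h


if __name__ == "__main__":
    k = 0.5  # hypothetical value, chosen only for the demonstration
    for t in (0.0, 1.0, 5.0, 20.0):
        print(t, mixture_survival(t, k), hyperbolic_discount(t, k))
```

With a known hazard rate the same survival argument gives the usual exponential discount, so the two schemes differ only in what is assumed about the termination time.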

Comments:	22 pages, 3 figures, published at 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Subjects:	Machine Learning (cs.LG); Systems and Control (eess.SY); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
Cite as:	arXiv:2209.13413 [cs.LG]
	(or arXiv:2209.13413v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2209.13413

Submission history

From: Matthias Schultheis
[v1] Tue, 27 Sep 2022 14:13:16 UTC (1,594 KB)
[v2] Wed, 7 Dec 2022 10:55:46 UTC (1,595 KB)
