Tackling Long-Horizon Tasks with Model-based Offline Reinforcement Learning

Park, Kwanyoung; Lee, Youngwoon

arXiv:2407.00699 (cs)

[Submitted on 30 Jun 2024]

Abstract: Model-based offline reinforcement learning (RL) is a compelling approach that addresses the challenge of learning from limited, static data by generating imaginary trajectories using learned models. However, it falls short in solving long-horizon tasks due to high bias in value estimation from model rollouts. In this paper, we introduce a novel model-based offline RL method, Lower Expectile Q-learning (LEQ), which enhances long-horizon task performance by mitigating the high bias in model-based value estimation via expectile regression of $\lambda$-returns. Our empirical results show that LEQ significantly outperforms previous model-based offline RL methods on long-horizon tasks, such as the D4RL AntMaze tasks, matching or surpassing the performance of model-free approaches. Our experiments demonstrate that expectile regression, $\lambda$-returns, and critic training on offline data are all crucial for addressing long-horizon tasks. Additionally, LEQ achieves performance comparable to the state-of-the-art model-based and model-free offline RL methods on the NeoRL benchmark and the D4RL MuJoCo Gym tasks.
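
The two ingredients named in the abstract, $\lambda$-returns and expectile regression, can be illustrated with a minimal sketch. This uses the generic textbook definitions and is not taken from the authors' code: the function names, the list-based rollout format, and the values of `gamma`, `lam`, and `tau` are all assumptions chosen for illustration. With `tau` below 0.5, overestimating the target costs more than underestimating it, so the fitted value is pulled toward a lower expectile of the return, which gives a conservative estimate.

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.95):
    """Compute TD(lambda) returns for one rollout, working backwards in time.

    rewards[t] is the reward at step t; next_values[t] is a critic's
    estimate of the value of the state reached after step t.
    Recursion: G_t = r_t + gamma * ((1 - lam) * V_next_t + lam * G_{t+1}),
    bootstrapped from the last value estimate at the end of the rollout.
    """
    returns = [0.0] * len(rewards)
    next_return = next_values[-1]  # bootstrap beyond the final step
    for t in reversed(range(len(rewards))):
        returns[t] = rewards[t] + gamma * (
            (1 - lam) * next_values[t] + lam * next_return
        )
        next_return = returns[t]
    return returns


def expectile_loss(prediction, target, tau=0.1):
    """Asymmetric squared error: |tau - 1[target < prediction]| * (target - prediction)^2.

    tau = 0.1 is a hypothetical setting; tau = 0.5 recovers half the
    ordinary squared error.
    """
    diff = target - prediction
    weight = (1 - tau) if diff < 0 else tau
    return weight * diff * diff


# Hypothetical three-step imagined rollout.
targets = lambda_returns([0.0, 0.0, 1.0], [0.5, 0.6, 0.0])
losses = [expectile_loss(0.7, g) for g in targets]
```

Under these assumptions, setting `lam=0` reduces `lambda_returns` to one-step TD targets, and `lam=1` gives the discounted sum of rewards plus the final bootstrap value.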

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2407.00699 [cs.LG]
	(or arXiv:2407.00699v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2407.00699

Submission history

From: Youngwoon Lee
[v1] Sun, 30 Jun 2024 13:44:59 UTC (821 KB)
