Align Your Intents: Offline Imitation Learning via Optimal Transport

Bobrin, Maksim; Buzun, Nazar; Krylov, Dmitrii; Dylov, Dmitry V.

arXiv:2402.13037 (cs)

[Submitted on 20 Feb 2024 (v1), last revised 4 Oct 2024 (this version, v2)]

Abstract: Offline Reinforcement Learning (RL) addresses the problem of sequential decision-making by learning an optimal policy from pre-collected data, without interacting with the environment. So far, it has remained somewhat impractical, because one rarely knows the reward explicitly and it is hard to distill it retrospectively. Here, we show that an imitating agent can still learn the desired behavior merely from observing the expert, despite the absence of explicit rewards or action labels. In our method, AILOT (Aligned Imitation Learning via Optimal Transport), we use a special representation of states in the form of intents that incorporate pairwise spatial distances within the data. Given such representations, we define an intrinsic reward function via the optimal transport distance between the expert's and the agent's trajectories. We report that AILOT outperforms state-of-the-art offline imitation learning algorithms on D4RL benchmarks and improves the performance of other offline RL algorithms through dense reward relabelling in sparse-reward tasks.
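
The abstract does not spell out how the optimal transport reward is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that intent embeddings for one agent trajectory and one expert trajectory are already available as arrays, and it assumes a Euclidean cost, uniform marginals, and an entropy-regularised (Sinkhorn) solver. The names `sinkhorn_plan` and `ot_rewards` and the parameters `epsilon` and `n_iters` are hypothetical.

```python
import numpy as np


def sinkhorn_plan(cost, epsilon=0.05, n_iters=200):
    """Approximate OT plan between two uniform marginals (Sinkhorn iterations).

    Assumption: `epsilon` is relative to the largest cost entry, which keeps
    the exponentials in a numerically safe range.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)          # uniform mass on agent steps
    b = np.full(m, 1.0 / m)          # uniform mass on expert steps
    scale = cost.max() + 1e-8
    kernel = np.exp(-cost / (epsilon * scale))
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (kernel @ v)         # rescale rows towards marginal a
        v = b / (kernel.T @ u)       # rescale columns towards marginal b
    return u[:, None] * kernel * v[None, :]


def ot_rewards(agent_z, expert_z, epsilon=0.05):
    """Per-step intrinsic rewards for the agent trajectory.

    agent_z:  (T_agent, d) array of assumed intent embeddings.
    expert_z: (T_expert, d) array of assumed intent embeddings.
    Each step's reward is minus its share of the transport cost, so steps
    that align well with the expert get rewards close to zero.
    """
    diff = agent_z[:, None, :] - expert_z[None, :, :]
    cost = np.linalg.norm(diff, axis=-1)   # assumed Euclidean cost
    plan = sinkhorn_plan(cost, epsilon)
    return -(plan * cost).sum(axis=1)
```

Under these assumptions, the rewards sum to minus the (regularised) transport cost between the two trajectories, so an offline RL algorithm trained on the relabelled data is pushed towards trajectories that stay close to the expert's in the embedding space.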

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2402.13037 [cs.LG]
	(or arXiv:2402.13037v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2402.13037

Submission history

From: Nazar Buzun
[v1] Tue, 20 Feb 2024 14:24:00 UTC (2,705 KB)
[v2] Fri, 4 Oct 2024 07:24:42 UTC (3,719 KB)
