A Dual Approach to Imitation Learning from Observations with Offline Datasets

Sikchi, Harshit; Chuck, Caleb; Zhang, Amy; Niekum, Scott

arXiv:2406.08805 (cs)

[Submitted on 13 Jun 2024 (v1), last revised 19 Sep 2024 (this version, v2)]

Abstract: Demonstrations are an effective alternative to task specification for learning agents in settings where designing a reward function is difficult. However, demonstrating expert behavior in the action space of the agent becomes unwieldy when robots have complex, unintuitive morphologies. We consider the practical setting where an agent has a dataset of prior interactions with the environment and is provided with observation-only expert demonstrations. Typical learning-from-observations approaches have required learning either an inverse dynamics model or a discriminator as an intermediate step of training. Errors in these intermediate one-step models compound during downstream policy learning or deployment. We overcome these limitations by directly learning a multi-step utility function that quantifies how each action impacts the agent's divergence from the expert's visitation distribution. Using the principle of duality, we derive DILO (Dual Imitation Learning from Observations), an algorithm that can leverage arbitrary suboptimal data to learn imitating policies without requiring expert actions. DILO reduces the learning-from-observations problem to that of simply learning an actor and a critic, bearing similar complexity to vanilla offline RL. This allows DILO to gracefully scale to high-dimensional observations and demonstrate improved performance across the board. Project page (code and videos): this https URL

Comments:	8th Conference on Robot Learning (CoRL 2024), Munich, Germany. 23 pages
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Robotics (cs.RO)
Cite as:	arXiv:2406.08805 [cs.LG]
	(or arXiv:2406.08805v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2406.08805

Submission history

From: Harshit Sikchi
[v1] Thu, 13 Jun 2024 04:39:42 UTC (19,296 KB)
[v2] Thu, 19 Sep 2024 21:38:58 UTC (19,298 KB)
