Language Control Diffusion: Efficiently Scaling through Space, Time, and Tasks

Zhang, Edwin; Lu, Yujie; Wang, William; Zhang, Amy

arXiv:2210.15629 (cs)

[Submitted on 27 Oct 2022 (v1), last revised 18 Jan 2024 (this version, v3)]

Abstract: Training generalist agents is difficult across several axes, requiring us to deal with high-dimensional inputs (space), long horizons (time), and generalization to novel tasks. Recent advances with architectures have allowed for improved scaling along one or two of these axes, but are still computationally prohibitive to use. In this paper, we propose to address all three axes by leveraging Language to Control Diffusion models as a hierarchical planner conditioned on language (LCD). We effectively and efficiently scale diffusion models for planning in extended temporal, state, and task dimensions to tackle long-horizon control problems conditioned on natural language instructions, as a step towards generalist agents. A comparison of LCD with other state-of-the-art models on the CALVIN language robotics benchmark shows that LCD outperforms other SOTA methods in multi-task success rates, whilst improving inference speed over other comparable diffusion models by 3.3x to 15x. We show that LCD can successfully leverage the unique strength of diffusion models to produce coherent long-range plans while addressing their weakness in generating low-level details and control.

Comments:	ICLR 2024, Project and code available at this https URL
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
Cite as:	arXiv:2210.15629 [cs.LG]
	(or arXiv:2210.15629v3 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2210.15629

Submission history

From: Eddie Zhang
[v1] Thu, 27 Oct 2022 17:20:50 UTC (9,164 KB)
[v2] Tue, 11 Apr 2023 02:15:04 UTC (3,351 KB)
[v3] Thu, 18 Jan 2024 00:43:41 UTC (3,690 KB)
