MOCA: A Modular Object-Centric Approach for Interactive Instruction Following

Singh, Kunal Pratap; Bhambri, Suvaansh; Kim, Byeonghwi; Mottaghi, Roozbeh; Choi, Jonghyun

arXiv:2012.03208v2 (cs)

[Submitted on 6 Dec 2020 (v1), revised 29 May 2021 (this version, v2), latest version 2 Sep 2021 (v3)]

Abstract: Performing simple household tasks based on language directives is very natural to humans, yet it remains an open challenge for AI agents. Recently, an 'interactive instruction following' task has been proposed to foster research on reasoning over long instruction sequences that require object interactions in a simulated environment. It involves solving open problems from the vision, language, and navigation literature at each step. To address this multifaceted problem, we propose a modular architecture that decouples the task into visual perception and action policy, and name it MOCA, a Modular Object-Centric Approach. We evaluate our method on the ALFRED benchmark and empirically validate that it outperforms prior art by significant margins on all metrics, with good generalization performance (a high success rate in unseen environments). Our code is available at this https URL.

Comments:	12 pages, 6 figures
Subjects:	Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
Cite as:	arXiv:2012.03208 [cs.AI]
	(or arXiv:2012.03208v2 [cs.AI] for this version)
	https://doi.org/10.48550/arXiv.2012.03208

Submission history

From: Byeonghwi Kim
[v1] Sun, 6 Dec 2020 07:59:22 UTC (4,470 KB)
[v2] Sat, 29 May 2021 15:49:23 UTC (4,470 KB)
[v3] Thu, 2 Sep 2021 13:14:59 UTC (5,539 KB)
