Research Article • Open Access • Just Accepted

“Do this instead” – Robots that Adequately Respond to Corrected Instructions

Online AM: 22 September 2023
Abstract

Natural language instructions are effective for tasking autonomous robots and for quickly teaching them new knowledge. Yet human instructors are not perfect: they are likely to make mistakes at times and will correct themselves when they notice errors in their own instructions. In this paper, we introduce a complete system that allows robots to handle such corrections, both during task instruction and during action execution. We then demonstrate its operation through spoken language in an integrated cognitive robotic architecture on two tasks: a navigation-and-retrieval task and a meal-assembly task. Verbal corrections occur before, during, and after verbally taught task sequences, demonstrating that the proposed methods enable fast corrections not only of the semantics generated from the instructions, but also of overt robot behavior, in a manner shown to be reasonable when compared to human behavior and expectations.
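The full correction-handling pipeline is described in the paper itself rather than on this page, but the core idea the abstract describes, intercepting a corrective utterance and repairing either the taught task sequence or the action currently being executed, can be illustrated with a small sketch. The Python sketch below is purely hypothetical: the names (Correction, TaskScript, apply_correction) are illustrative stand-ins and are not the interfaces of the DIARC architecture used in the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Correction:
    """A parsed corrective utterance, e.g. 'no, do this instead: ...' (hypothetical)."""
    replaces_last_step: bool             # "do this instead" replaces the last taught step
    new_step: Optional[str] = None       # replacement or appended action, if any
    abort_current_action: bool = False   # correction arrived while an action was executing

class TaskScript:
    """Hypothetical store for a verbally taught sequence of task steps."""
    def __init__(self) -> None:
        self.steps: List[str] = []
        self.executing_index: Optional[int] = None  # index of step currently running, if any

    def teach(self, step: str) -> None:
        self.steps.append(step)

    def apply_correction(self, c: Correction) -> None:
        # Correction during teaching: retract the most recently taught step.
        if c.replaces_last_step and self.steps and self.executing_index is None:
            self.steps.pop()
        # Correction during execution: stop the running action, keep only completed steps.
        if c.abort_current_action and self.executing_index is not None:
            self.steps = self.steps[: self.executing_index]
            self.executing_index = None
        # In either case, splice in the corrected step if one was given.
        if c.new_step is not None:
            self.steps.append(c.new_step)

# Usage: the instructor teaches a step, then says "no, grab the red mug instead".
script = TaskScript()
script.teach("grab the blue mug")
script.apply_correction(Correction(replaces_last_step=True, new_step="grab the red mug"))
print(script.steps)  # ['grab the red mug']
```

Under these assumptions, a correction given during teaching retracts the most recent step, while a correction given mid-execution first aborts the running action before the replacement step is added.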


Cited By

• (2024) Toward Competent Robot Apprentices: Enabling Proactive Troubleshooting in Collaborative Robots. Machines 12(1), 73. https://doi.org/10.3390/machines12010073. Online publication date: 18-Jan-2024.

    Published In

ACM Transactions on Human-Robot Interaction (Just Accepted)
EISSN: 2573-9522
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Online AM: 22 September 2023
    Accepted: 07 August 2023
    Revised: 08 July 2023
    Received: 15 June 2022

    Author Tag

    1. Human-robot Interaction

    Qualifiers

    • Research-article

    Article Metrics

• Downloads (last 12 months): 307
• Downloads (last 6 weeks): 33
