Research Article • Open Access • Just Accepted

“Do this instead” – Robots that Adequately Respond to Corrected Instructions

Online AM: 22 September 2023
Abstract

Natural language instructions are effective for tasking autonomous robots and for quickly teaching them new knowledge. Yet human instructors are not perfect: they are likely to make mistakes at times and will correct themselves when they notice errors in their own instructions. In this paper, we introduce a complete system that allows robots to handle such corrections, both during task instruction and during action execution. We then demonstrate its operation through spoken language in an integrated cognitive robotic architecture on two tasks: a navigation-and-retrieval task and a meal-assembly task. Verbal corrections occur before, during, and after verbally taught task sequences, demonstrating that the proposed methods enable fast corrections not only of the semantics generated from the instructions, but also of overt robot behavior, in a manner shown to be reasonable when compared to human behavior and expectations.
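The full correction-handling pipeline is described in the paper itself rather than on this page, but the core idea the abstract describes, intercepting a corrective utterance and repairing either the taught task sequence or the action currently being executed, can be illustrated with a small sketch. The Python sketch below is purely hypothetical: the names (Correction, TaskScript, apply_correction) are illustrative stand-ins and are not the interfaces of the DIARC architecture used in the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Correction:
    """A parsed corrective utterance, e.g. 'no, do this instead: ...' (hypothetical)."""
    replaces_last_step: bool             # "do this instead" replaces the last taught step
    new_step: Optional[str] = None       # replacement or appended action, if any
    abort_current_action: bool = False   # correction arrived while an action was executing

class TaskScript:
    """Hypothetical store for a verbally taught sequence of task steps."""
    def __init__(self) -> None:
        self.steps: List[str] = []
        self.executing_index: Optional[int] = None  # index of step currently running, if any

    def teach(self, step: str) -> None:
        self.steps.append(step)

    def apply_correction(self, c: Correction) -> None:
        # Correction during teaching: retract the most recently taught step.
        if c.replaces_last_step and self.steps and self.executing_index is None:
            self.steps.pop()
        # Correction during execution: stop the running action, keep only completed steps.
        if c.abort_current_action and self.executing_index is not None:
            self.steps = self.steps[: self.executing_index]
            self.executing_index = None
        # In either case, splice in the corrected step if one was given.
        if c.new_step is not None:
            self.steps.append(c.new_step)

# Usage: the instructor teaches a step, then says "no, grab the red mug instead".
script = TaskScript()
script.teach("grab the blue mug")
script.apply_correction(Correction(replaces_last_step=True, new_step="grab the red mug"))
print(script.steps)  # ['grab the red mug']
```

Under these assumptions, a correction given during teaching retracts the most recent step, while a correction given mid-execution first aborts the running action before the replacement step is added.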


Cited By

• (2024) Toward Competent Robot Apprentices: Enabling Proactive Troubleshooting in Collaborative Robots. Machines 12(1), 73. https://doi.org/10.3390/machines12010073. Online publication date: 18-Jan-2024.

    Published In

ACM Transactions on Human-Robot Interaction (Just Accepted)
EISSN: 2573-9522
    Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the owner/author(s).

Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Online AM: 22 September 2023
    Accepted: 07 August 2023
    Revised: 08 July 2023
    Received: 15 June 2022

    Author Tag

    1. Human-robot Interaction

    Qualifiers

    • Research-article

    Article Metrics

• Downloads (last 12 months): 307
• Downloads (last 6 weeks): 33
