DOI: 10.1109/ICRA.2018.8460937

Learning to Parse Natural Language to Grounded Reward Functions with Weak Supervision

Published: 21 May 2018

Abstract

In order to intuitively and efficiently collaborate with humans, robots must learn to complete tasks specified using natural language. We represent natural language instructions as goal-state reward functions specified using lambda calculus. Using reward functions as language representations allows robots to plan efficiently in stochastic environments. To map sentences to such reward functions, we learn a weighted linear Combinatory Categorial Grammar (CCG) semantic parser. The parser, including both its parameters and the CCG lexicon, is learned through a validation procedure that, unlike prior approaches, does not require planner execution, annotated reward functions, or labeled parse trees. To learn the CCG lexicon and parse weights, we use coarse lexical generation and validation-driven perceptron weight updates, following the approach of Artzi and Zettlemoyer [4]. We present results on the Cleanup World domain [18] to demonstrate the potential of our approach. On a collected corpus of 23 tasks with 2037 corresponding sentences, containing combinations of nested referential expressions, comparators, and object properties, we report an F1 score of 0.82. Our goal-condition learning approach achieves results comparable to a baseline that performs planning during learning while reducing computation time by orders of magnitude. Further, we conduct an experiment with just 6 labeled demonstrations to show the ease of teaching a robot behaviors using our method, and we show that parsing models learned from small data sets can generalize to commands not seen during training.
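
To make the representation concrete, the sketch below shows, in ordinary Python, what a goal-state reward function grounded from a lambda-calculus parse might look like for a Cleanup World command such as "take the red block to the green room". The state encoding, object fields, and specific reward values are illustrative assumptions for exposition, not the paper's actual implementation (which builds on the BURLAP framework [17] and the Cornell SPF parser [2]).

    # Illustrative sketch only: a goal-state reward function for a Cleanup World
    # command, with the lambda-calculus goal condition rendered as a Python
    # predicate over states. Class names, fields, and reward values are
    # assumptions, not the paper's actual API.
    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Obj:
        name: str    # e.g. "block1"
        kind: str    # e.g. "block", "agent"
        color: str   # e.g. "red"
        room: str    # name of the room currently containing the object

    State = frozenset  # a state is a set of Obj instances

    def goal_condition(state: State) -> bool:
        # Grounding of a parse such as
        #   lambda s . exists b . block(b) and red(b) and in(b, green_room)
        # for the command "take the red block to the green room".
        return any(o.kind == "block" and o.color == "red" and o.room == "green_room"
                   for o in state)

    def reward(state: State, action: str, next_state: State) -> float:
        # Goal-state reward: positive on reaching the goal, a small step cost
        # otherwise (one common convention; the exact shaping is an assumption here).
        return 1.0 if goal_condition(next_state) else -0.01

    # A stock MDP planner can optimize this reward directly.
    start = frozenset({Obj("block1", "block", "red", "blue_room")})
    goal  = frozenset({Obj("block1", "block", "red", "green_room")})
    print(reward(start, "move_north", goal))  # 1.0

In the paper's setup, such reward functions are produced by the learned CCG parser rather than written by hand; because a goal condition can be checked against states directly, the validation signal used during training does not require running a planner.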

References

[1]
Ken R. Anderson, Timothy J. Hickey, and Peter Norvig. The JScheme language and implementation, 2013. URL http://jscheme.sourceforge.net/jscheme/main.html.
[2]
Yoav Artzi. Cornell SPF: Cornell Semantic Parsing Framework, 2016.
[3]
Yoav Artzi and Luke Zettlemoyer. Bootstrapping semantic parsers from conversations. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, 2011.
[4]
Yoav Artzi and Luke Zettlemoyer. Weakly supervised learning of semantic parsers for mapping instructions to actions. In Annual Meeting of the Association for Computational Linguistics, 2013.
[5]
Yoav Artzi, Dipanjan Das, and Slav Petrov. Learning compact lexicons for CCG semantic parsing. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 1273–1283. Association for Computational Linguistics, October 2014.
[6]
Dilip Arumugam, Siddharth Karamcheti, Nakul Gopalan, Lawson L. S. Wong, and Stefanie Tellex. Accurately and efficiently interpreting human-robot instructions of varying granularities. In Robotics: Science and Systems XIII, 2017.
[7]
R. Bellman. A Markovian decision process. Indiana University Mathematics Journal, 6:679–684, 1957.
[8]
B. Carpenter. Type-Logical Semantics. MIT Press, Cambridge, MA, USA, 1997.
[9]
Stephen Clark and James R. Curran. Wide-coverage efficient statistical parsing with CCG and log-linear models. Computational Linguistics, 33(4):493–552, 2007.
[10]
Carlos Diuk, Andre Cohen, and Michael L. Littman. An object-oriented representation for efficient reinforcement learning. In International Conference on Machine Learning, 2008.
[11]
Nakul Gopalan, Marie desJardins, Michael L. Littman, James MacGlashan, Shawn Squire, Stefanie Tellex, John Winder, and Lawson L. S. Wong. Planning with abstract Markov decision processes. In Proceedings of the Twenty-Seventh International Conference on Automated Planning and Scheduling, pages 480–488, 2017.
[12]
Kelvin Guu, Panupong Pasupat, Evan Zheran Liu, and Percy Liang. From language to programs: Bridging reinforcement learning and maximum marginal likelihood. CoRR, abs/1704.07926, 2017.
[13]
Thomas M. Howard, Stefanie Tellex, and Nicholas Roy. A natural language planner interface for mobile manipulators. In IEEE International Conference on Robotics and Automation, 2014.
[14]
Michael Janner, Karthik Narasimhan, and Regina Barzilay. Representation learning for grounded spatial reasoning. arXiv preprint, 2017.
[15]
Jayant Krishnamurthy and Tom M. Mitchell. Weakly supervised training of semantic parsers. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2012.
[16]
Tom Kwiatkowski, Luke Zettlemoyer, Sharon Goldwater, and Mark Steedman. Lexical generalization in CCG grammar induction for semantic parsing. In Proceedings of the Conference on Empirical Methods in Natural Language Processing, pages 1512–1523. Association for Computational Linguistics, 2011.
[17]
James MacGlashan. Brown-UMBC Reinforcement Learning and Planning (BURLAP) project page. http://burlap.cs.brown.edu/, 2014.
[18]
James MacGlashan, Monica Babeş-Vroman, Marie desJardins, Michael L. Littman, Smaranda Muresan, Shawn Squire, Stefanie Tellex, Dilip Arumugam, and Lei Yang. Grounding English commands to reward functions. In Robotics: Science and Systems, 2015.
[19]
Dipendra Misra, John Langford, and Yoav Artzi. Mapping instructions and visual observations to actions with reinforcement learning. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2017.
[20]
Andrew Y. Ng and Stuart J. Russell. Algorithms for inverse reinforcement learning. In Proceedings of the Seventeenth International Conference on Machine Learning, ICML'00, pages 663–670, San Francisco, CA, USA, 2000. Morgan Kaufmann Publishers Inc. ISBN 1-55860-707-2.
[21]
Amir Pnueli. The temporal logic of programs. In Proceedings of the 18th Annual Symposium on Foundations of Computer Science, SFCS'77, pages 46–57, Washington, DC, USA, 1977. IEEE Computer Society. https://doi.org/10.1109/SFCS.1977.32.
[22]
Morgan Quigley, Josh Faust, Tully Foote, and Jeremy Leibs. ROS: an open-source robot operating system.
[23]
Pierre Sermanet, Kelvin Xu, and Sergey Levine. Unsupervised perceptual rewards for imitation learning. CoRR, abs/1612.06699, 2016.
[24]
Mark Steedman. The Syntactic Process. MIT Press, Cambridge, MA, USA, 2000. ISBN 0-262-19420-1.
[25]
Stefanie Tellex, Thomas Kollar, Steven Dickerson, Matthew R. Walter, Ashis Gopal Banerjee, Seth Teller, and Nicholas Roy. Understanding natural language commands for robotic navigation and mobile manipulation. In AAAI Conference on Artificial Intelligence, 2011.
[26]
Edward C. Williams, Mina Rhee, Nakul Gopalan, and Stefanie Tellex. Learning to parse natural language to grounded reward functions with weak supervision. In AAAI Fall Symposium on Natural Communication for Human-Robot Collaboration, 2017.
[27]
Luke Zettlemoyer and Michael Collins. Learning to map sentences to logical form: Structured classification with probabilistic categorial grammars. In Proceedings of the Conference on Uncertainty in Artificial Intelligence, 2005.
[28]
Luke Zettlemoyer and Michael Collins. Online learning of relaxed CCG grammars for parsing to logical form. In Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, 2007.

Published In

2018 IEEE International Conference on Robotics and Automation (ICRA), May 2018, 5954 pages
Publisher: IEEE Press
