Abstract
Recent work has demonstrated the feasibility of employing Large Language Models (LLMs) as robot task planners. However, LLM planners often lack awareness of whether actions are executable in the physical world, leading to impractical task instructions. Previous solutions rely on fine-tuning the planner for specific tasks and robot functions, which limits transferability across tasks and incurs extra cost. We propose a framework that writes task details and robot operational methods into the system prompts of an LLM planner and an LLM executor, respectively, allowing the executor to provide feedback when it deems the planner's instructions infeasible, so that the planner can replan. This achieves zero-shot generalization across tasks and reduces the need for task-specific learning by the planner, such as fine-tuning a policy head. Experimental results demonstrate the effectiveness of our approach in task completion and its ability to generalize to different tasks.
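The framework described above can be pictured as a dialogue loop between two LLM roles: a planner that proposes the next instruction and an executor that either accepts it or "says no" with a natural-language reason, which is fed back for replanning. The sketch below is illustrative only, not the authors' implementation; the `chat` helper, prompt wording, message format, and stopping rule are all assumptions.

```python
# Minimal sketch of a planner-executor feedback loop (illustrative only).
# `chat(system, messages)` is a hypothetical helper that sends a system prompt
# plus a message history to an LLM and returns its reply as text.

def chat(system: str, messages: list[dict]) -> str:
    """Placeholder for an LLM call; swap in a real chat-completion API."""
    raise NotImplementedError

# Task details go to the planner; robot operational methods go to the executor.
PLANNER_SYSTEM = (
    "You are a robot task planner. Given the task and any feedback, output the "
    "next action as a short natural-language instruction. Reply 'DONE' when "
    "the task is complete."
)
EXECUTOR_SYSTEM = (
    "You are a robot executor and know the robot's operational methods. "
    "If an instruction is executable, reply 'OK: <skill call>'. Otherwise "
    "reply 'NO: <reason the instruction cannot be executed>'."
)

def plan_and_execute(task: str, max_steps: int = 10) -> list[str]:
    """Planner proposes an action; executor accepts it or explains why it is
    not executable so the planner can replan."""
    history = [{"role": "user", "content": f"Task: {task}"}]
    executed = []
    for _ in range(max_steps):
        instruction = chat(PLANNER_SYSTEM, history)
        if instruction.strip().upper() == "DONE":
            break
        verdict = chat(EXECUTOR_SYSTEM,
                       [{"role": "user", "content": instruction}])
        history.append({"role": "assistant", "content": instruction})
        if verdict.startswith("OK"):
            executed.append(verdict)  # hand the approved skill call to the robot
        else:
            # The executor "says no": return its natural-language reason to the
            # planner so it can replan, with no task-specific fine-tuning.
            history.append({"role": "user", "content": f"Feedback: {verdict}"})
    return executed
```

Keeping the feedback in natural language is what lets the same loop transfer across tasks: only the two system prompts change, not the planner's weights.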
Acknowledgments
This research was jointly funded by the National Natural Science Foundation of China (Grant No. 62306304) and the CAS Project for Young Scientists in Basic Research (Grant No. YSBR-040). The authors would also like to thank the anonymous reviewers for their careful reading.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Zhao, X., Jing, M., Wu, Y. (2024). Agent Can Say No: Robot Task Planning by Natural Language Feedback Between Planner and Executor. In: Huang, D.S., Zhang, X., Zhang, C. (eds.) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol. 14879. Springer, Singapore. https://doi.org/10.1007/978-981-97-5675-9_13
DOI: https://doi.org/10.1007/978-981-97-5675-9_13
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5674-2
Online ISBN: 978-981-97-5675-9