Abstract
Recent advances in vision-based navigation and exploration have shown impressive capabilities in photorealistic indoor environments. However, these methods still struggle with long-horizon tasks and require large amounts of data to generalize to unseen environments. In this work, we present a novel reinforcement learning approach for multi-object search that combines short-term and long-term reasoning in a single model while avoiding the complexities arising from hierarchical structures. In contrast to existing multi-object search methods that act in granular discrete action spaces, our approach achieves exceptional performance in continuous action spaces. We perform extensive experiments and show that our approach generalizes to unseen apartment environments with limited data. Furthermore, we demonstrate zero-shot transfer of the learned policies to an office environment in real-world experiments.
F. Schmalstieg and D. Honerkamp contributed equally to this work.
This work was funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No 871449-OpenDR.
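As a rough illustration of the continuous action space mentioned in the abstract, the sketch below trains a PPO agent that outputs two-dimensional continuous velocity commands instead of selecting from a discrete action set. This is a minimal stand-in, not the paper's implementation: the environment class SearchEnvStub, its observation layout, and its reward are placeholder assumptions; only the use of a continuous Box action space reflects the setup described above.

```python
# Minimal sketch, NOT the authors' implementation: it only illustrates training
# a policy over a continuous action space with Stable-Baselines3 PPO.
# "SearchEnvStub", its observation layout, and its reward are placeholder
# assumptions standing in for the paper's multi-object search task.
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO


class SearchEnvStub(gym.Env):
    """Stand-in environment: continuous velocity commands, dummy observations."""

    def __init__(self):
        # Continuous action: (linear velocity, angular velocity), normalized to [-1, 1].
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        # Dummy flat observation (e.g., a flattened egocentric map patch would go here).
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(64,), dtype=np.float32)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._t = 0
        return self.observation_space.sample(), {}

    def step(self, action):
        self._t += 1
        obs = self.observation_space.sample()
        reward = 0.0  # placeholder; a real search task would reward finding target objects
        terminated = False
        truncated = self._t >= 200  # episode time limit
        return obs, reward, terminated, truncated, {}


if __name__ == "__main__":
    # Continuous Box actions work out of the box with PPO's Gaussian policy head.
    model = PPO("MlpPolicy", SearchEnvStub(), verbose=0)
    model.learn(total_timesteps=2_048)
```

Swapping the Box action space for a Discrete one is the only change needed to recover the coarser discrete-action setting that the abstract contrasts against.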
Electronic supplementary material
Supplementary material 1 (mp4, 9254 KB)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Schmalstieg, F., Honerkamp, D., Welschehold, T., Valada, A. (2023). Learning Long-Horizon Robot Exploration Strategies for Multi-object Search in Continuous Action Spaces. In: Billard, A., Asfour, T., Khatib, O. (eds.) Robotics Research. ISRR 2022. Springer Proceedings in Advanced Robotics, vol. 27. Springer, Cham. https://doi.org/10.1007/978-3-031-25555-7_5
Print ISBN: 978-3-031-25554-0
Online ISBN: 978-3-031-25555-7