Abstract
Recent advances in vision-based navigation and exploration have shown impressive capabilities in photorealistic indoor environments. However, these methods still struggle with long-horizon tasks and require large amounts of data to generalize to unseen environments. In this work, we present a novel reinforcement learning approach for multi-object search that combines short-term and long-term reasoning in a single model while avoiding the complexities arising from hierarchical structures. In contrast to existing multi-object search methods that act in granular discrete action spaces, our approach achieves exceptional performance in continuous action spaces. We perform extensive experiments and show that our approach generalizes to unseen apartment environments with limited data. Furthermore, we demonstrate zero-shot transfer of the learned policies to an office environment in real-world experiments.
F. Schmalstieg and D. Honerkamp contributed equally to this work.
This work was funded by the European Union’s Horizon 2020 research and innovation program under grant agreement No 871449-OpenDR.
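As a rough illustration of the continuous action space mentioned in the abstract, the sketch below trains a PPO agent that outputs two-dimensional continuous velocity commands instead of selecting from a discrete action set. This is a minimal stand-in, not the paper's implementation: the environment class SearchEnvStub, its observation layout, and its reward are placeholder assumptions; only the use of a continuous Box action space reflects the setup described above.

```python
# Minimal sketch, NOT the authors' implementation: it only illustrates training
# a policy over a continuous action space with Stable-Baselines3 PPO.
# "SearchEnvStub", its observation layout, and its reward are placeholder
# assumptions standing in for the paper's multi-object search task.
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO


class SearchEnvStub(gym.Env):
    """Stand-in environment: continuous velocity commands, dummy observations."""

    def __init__(self):
        # Continuous action: (linear velocity, angular velocity), normalized to [-1, 1].
        self.action_space = spaces.Box(low=-1.0, high=1.0, shape=(2,), dtype=np.float32)
        # Dummy flat observation (e.g., a flattened egocentric map patch would go here).
        self.observation_space = spaces.Box(low=0.0, high=1.0, shape=(64,), dtype=np.float32)
        self._t = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._t = 0
        return self.observation_space.sample(), {}

    def step(self, action):
        self._t += 1
        obs = self.observation_space.sample()
        reward = 0.0  # placeholder; a real search task would reward finding target objects
        terminated = False
        truncated = self._t >= 200  # episode time limit
        return obs, reward, terminated, truncated, {}


if __name__ == "__main__":
    # Continuous Box actions work out of the box with PPO's Gaussian policy head.
    model = PPO("MlpPolicy", SearchEnvStub(), verbose=0)
    model.learn(total_timesteps=2_048)
```

Swapping the Box action space for a Discrete one is the only change needed to recover the coarser discrete-action setting that the abstract contrasts against.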
Electronic supplementary material
Supplementary material 1 (mp4, 9254 KB)
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Schmalstieg, F., Honerkamp, D., Welschehold, T., Valada, A. (2023). Learning Long-Horizon Robot Exploration Strategies for Multi-object Search in Continuous Action Spaces. In: Billard, A., Asfour, T., Khatib, O. (eds.) Robotics Research. ISRR 2022. Springer Proceedings in Advanced Robotics, vol. 27. Springer, Cham. https://doi.org/10.1007/978-3-031-25555-7_5
Print ISBN: 978-3-031-25554-0
Online ISBN: 978-3-031-25555-7