Abstract
Visual scene understanding and place recognition are among the most challenging problems that mobile robots must solve to achieve autonomous navigation. To reduce the high computational complexity of many globally optimal search strategies, a new two-stage loop closure detection (LCD) strategy is developed in this paper. The front-end sequence node level matching (FSNLM) algorithm exploits the local continuity constraint of the motion process, which avoids a blind search for the globally optimal match: it matches image nodes via a sliding window to accurately locate locally optimal candidate node sets. In addition, the back-end image level matching (BILM) algorithm, combined with an improved semantic model, DeepLab_AE, uses a convolutional neural network (CNN) as a feature detector to extract visual descriptors. This replaces traditional feature detectors, which are manually designed by computer vision researchers and do not generalize to all environments. Finally, the performance of the two-stage LCD algorithm is evaluated on five public datasets and compared with that of other state-of-the-art algorithms. The evaluation results show that the proposed method compares favorably with these algorithms.
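The front-end idea described above, restricting the search for loop-closure candidates to a local window over the image sequence instead of scoring every map node globally, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' FSNLM implementation: the descriptor source, window size, and score threshold are all placeholders chosen for the example.

```python
import numpy as np

def cosine_similarity(a, b):
    # Pairwise cosine similarity between two sets of image descriptors
    # (rows of `a` and `b`), e.g. CNN feature vectors.
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def sliding_window_candidates(query_desc, map_desc, window=5, threshold=0.2):
    """Front-end sketch: rather than comparing the query against every
    map node (global search), slide a fixed-size window over the map
    sequence and keep only windows whose mean similarity exceeds a
    threshold, returning the best node inside each accepted window."""
    sims = cosine_similarity(query_desc, map_desc)   # (n_query, n_map)
    seq_score = sims.mean(axis=0)                    # per-map-node score
    candidates = []
    for start in range(len(map_desc) - window + 1):
        score = seq_score[start:start + window].mean()
        if score >= threshold:
            best = start + int(np.argmax(seq_score[start:start + window]))
            candidates.append((best, float(score)))
    return candidates
```

Only the windows that pass the threshold survive as candidate node sets, which is what bounds the cost of the more expensive back-end image-level verification.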
Data Availability
All data generated or analysed during this study are included in this published article.
Acknowledgments
This work was supported in part by the National Natural Science Foundation of China (No. 61873175, No. 71601022), the Key Project B Class of Beijing Natural Science Fund (grant number KZ201710028028), the Youth Innovative Research Team of Capital Normal University, the Academy for Multidisciplinary Studies of Capital Normal University, and the Beijing Youth Talent Support Program (grant number CIT&TCD201804036).
Funding
This study was supported in part by grants from the National Natural Science Foundation of China (No. 61873175, No. 71601022), the Key Project B Class of Beijing Natural Science Fund (grant number KZ201710028028), the Youth Innovative Research Team of Capital Normal University, the Academy for Multidisciplinary Studies of Capital Normal University, and the Beijing Youth Talent Support Program (grant number CIT&TCD201804036).
Author information
Authors and Affiliations
Contributions
Zhonghua Wang developed a new two-stage loop closure detection algorithm, and was a major contributor in writing the manuscript. Lifeng Wu analyzed and interpreted the experimental data regarding the loop closure detection. Zhen Peng and Yong Guan made constructive comments on the algorithm and checked the manuscript for typos and grammar, which improved the quality of writing. All authors read and approved the final manuscript.
Corresponding author
Ethics declarations
Ethical Approval
Not applicable.
Consent to Participate
Not applicable.
Consent to Publish
Not applicable.
Competing Interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Wang, Z., Peng, Z., Guan, Y. et al. Two-Stage vSLAM Loop Closure Detection Based on Sequence Node Matching and Semi-Semantic Autoencoder. J Intell Robot Syst 101, 29 (2021). https://doi.org/10.1007/s10846-020-01302-0