Abstract
Loop closure detection is one of the most important modules in Simultaneous Localization and Mapping (SLAM), because it enables a robot to recover the global topology among different places. A loop closure is detected when the current place is recognized to match a previously visited place. When SLAM is executed over a long period of time, loop closure detection faces additional challenges. Illumination, weather, and vegetation conditions can change significantly during life-long SLAM, resulting in severe perceptual aliasing and appearance variation problems for loop closure detection. To address these problems, we propose a new Robust Multimodal Sequence-based (ROMS) method for robust loop closure detection in long-term visual SLAM. In our ROMS method, a sequence of images is used to represent each place, and every image in the sequence is encoded by multiple feature modalities so that different places can be recognized discriminatively. We formulate robust place recognition as a convex optimization problem with structured sparsity regularization, based on the fact that only a small set of template places can match the query place. In addition, we develop a new algorithm that solves the formulated optimization problem efficiently and is theoretically guaranteed to converge to the global optimum. Our ROMS method is evaluated through extensive experiments on three large-scale benchmark datasets, which record scenes across different times of day, months, and seasons. Experimental results demonstrate that our ROMS method outperforms existing loop closure detection methods in long-term SLAM and achieves state-of-the-art performance.
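Concretely, the structured-sparsity formulation mentioned above can be read off from the smoothed objective given in the Notes below by letting \(\zeta \rightarrow 0\); the interpretation of the symbols here is inferred from the Notes and the Appendix rather than quoted from the main body:

\[
\min _{\mathbf {A}} \; \sum _{i=1}^{s}\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2 \;+\; \lambda _1 \sum _{i=1}^{n}\Vert \mathbf {a}^i\Vert _2 \;+\; \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \Vert \mathbf {a}_i^j\Vert _2 ,
\]

where the columns of \(\mathbf {D}\) hold the multimodal features of the \(n\) scene templates, \(\mathbf {b}_i\) is the \(i\)-th image of the query sequence (\(i = 1, \ldots , s\)), \(\mathbf {a}_i\) is its coefficient vector (a column of \(\mathbf {A}\)), \(\mathbf {a}^i\) is the \(i\)-th row of \(\mathbf {A}\), and \(\mathbf {a}_i^j\) is the block of \(\mathbf {a}_i\) associated with the \(j\)-th of the \(k\) template groups.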
Notes
The template groups can be designed to have overlaps, e.g., using sliding window techniques. However, in the experiments, we found that groups with or without overlaps result in almost identical performance, as demonstrated by the example in Fig. 5c, since our method can activate highly similar scene templates outside of the selected group (and vice versa) to solve the sequence misalignment issue.
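To make the two grouping strategies concrete, the following sketch builds both non-overlapping template groups and sliding-window (overlapping) groups over template indices. It is only an illustration: the function names, group size, and stride are hypothetical and not taken from the paper.

```python
def disjoint_groups(n_templates, group_size):
    """Consecutive, non-overlapping groups of template indices."""
    return [list(range(start, min(start + group_size, n_templates)))
            for start in range(0, n_templates, group_size)]

def sliding_window_groups(n_templates, group_size, stride):
    """Overlapping groups from a sliding window; stride < group_size yields overlap."""
    return [list(range(start, start + group_size))
            for start in range(0, n_templates - group_size + 1, stride)]

# Example: 12 templates and groups of 4; a stride of 2 makes consecutive windows share half their indices.
disjoint_groups(12, 4)            # three disjoint groups covering indices 0-3, 4-7, 8-11
sliding_window_groups(12, 4, 2)   # five overlapping groups starting at indices 0, 2, 4, 6, 8
```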
When \(\mathbf {D}\mathbf {a}_i - \mathbf {b}_i = \mathbf {0}\), Eq. 8 is not differentiable. Following Gorodnitsky and Rao (1997) and Wang et al. (2013), we can regularize the i-th diagonal element of the matrix \(\mathbf {U}\) using \(u_{ii} = \frac{1}{2\sqrt{\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert ^2_2 + \zeta }}\). Similarly, when \(\mathbf {a}^i = \mathbf {0}\), the i-th diagonal element of the matrix \(\mathbf {V}\) can be regularized using \(\frac{1}{2\sqrt{\Vert \mathbf {a}^i\Vert _2^2 + \zeta }}\). When \(\mathbf {a}_i^j = \mathbf {0}\), we employ the same small perturbation to regularize the j-th diagonal block of \(\mathbf {W}^i\) as \(\frac{1}{2\sqrt{\Vert \mathbf {a}_i^j\Vert ^2_2 + \zeta }} \mathbf {I}_j\). Then, the derived algorithm can be proved to minimize the following function: \(\sum _{i=1}^{s}\sqrt{\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2^2 + \zeta } + \lambda _1 \sum _{i=1}^{n}\sqrt{\Vert \mathbf {a}^i\Vert _2^2+ \zeta } + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \sqrt{\Vert \mathbf {a}_i^j\Vert _2^2 + \zeta }\). It is easy to verify that this new problem reduces to the problem in Eq. 8 as \(\zeta \rightarrow 0\).
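The note above specifies how the diagonal weight matrices are regularized; the sketch below turns these definitions into a minimal iterative re-weighted solver. It is an illustration under stated assumptions, not the paper's Algorithm 1: in particular, the closed-form column update (obtained by setting the gradient of the weighted quadratic surrogate to zero) and all variable names are assumptions.

```python
import numpy as np

def reweighted_solver(D, B, groups, lam1, lam2, n_iters=50, zeta=1e-8):
    """Sketch of an iterative re-weighted solver for the structured-sparsity objective.

    D: (d, n) matrix of scene-template features, B: (d, s) query-sequence features,
    groups: list of index lists partitioning the n templates into k groups.
    The weights follow the zeta-regularized definitions in the note above; the
    per-column closed-form update is an assumption, not the paper's Algorithm 1.
    """
    d, n = D.shape
    s = B.shape[1]
    A = np.zeros((n, s))
    for _ in range(n_iters):
        # u_ii = 1 / (2 * sqrt(||D a_i - b_i||^2 + zeta)): one weight per query image
        u = 1.0 / (2.0 * np.sqrt(np.sum((D @ A - B) ** 2, axis=0) + zeta))
        # v_ii = 1 / (2 * sqrt(||a^i||^2 + zeta)): one weight per template (row of A)
        v = 1.0 / (2.0 * np.sqrt(np.sum(A ** 2, axis=1) + zeta))
        for i in range(s):
            # Block-diagonal W^i: one weight per template group, applied to column a_i
            w = np.empty(n)
            for g in groups:
                w[g] = 1.0 / (2.0 * np.sqrt(np.sum(A[g, i] ** 2) + zeta))
            # Assumed closed-form update: (u_ii D^T D + lam1 V + lam2 W^i) a_i = u_ii D^T b_i
            H = u[i] * (D.T @ D) + lam1 * np.diag(v) + lam2 * np.diag(w)
            A[:, i] = np.linalg.solve(H, u[i] * (D.T @ B[:, i]))
    return A
```

As the residual of a query image, a template row, or a group block shrinks, its weight grows, which drives entire rows and groups of \(\mathbf {A}\) toward zero and produces the structured sparsity described in the abstract.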
References
Angeli, A., Filliat, D., Doncieux, S., & Meyer, J. A. (2008). Fast and incremental method for loop-closure detection using bags of visual words. IEEE Transactions on Robotics, 24(5), 1027–1037.
Arroyo, R., Alcantarilla, P., Bergasa, L., & Romera, E. (2015). Towards life-long visual localization using an efficient matching of binary sequences from images. In IEEE international conference on robotics and automation.
Badino, H., Huber, D., & Kanade, T. (2012). Real-time topometric localization. In IEEE international conference on robotics and automation.
Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., et al. (2016). Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics, 32(6), 1309–1332.
Cadena, C., Gálvez-López, D., Tardós, J. D., & Neira, J. (2012). Robust place recognition with stereo sequences. IEEE Transactions on Robotics, 28(4), 871–885.
Chen, C., & Wang, H. (2006). Appearance-based topological Bayesian inference for loop-closing detection in a cross-country environment. The International Journal of Robotics Research, 25(10), 953–983.
Cummins, M., & Newman, P. (2008). FAB-MAP: Probabilistic localization and mapping in the space of appearance. The International Journal of Robotics Research, 27(6), 647–665.
Cummins, M., & Newman, P. (2009). Highly scalable appearance-only SLAM-FAB-MAP 2.0. In Robotics: Science and systems.
Estrada, C., Neira, J., & Tardós, J. D. (2005). Hierarchical SLAM: Real-time accurate mapping of large environments. IEEE Transactions on Robotics, 21(4), 588–596.
Gálvez-López, D., & Tardós, J. D. (2012). Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics, 28(5), 1188–1197.
Glover, A. J., Maddern, W. P., Milford M. J., & Wyeth, G. F. (2010). FAB-MAP + RatSLAM: Appearance-based SLAM for multiple times of day. In IEEE international conference on robotics and automation.
Glover, A., Maddern, W., Warren, M., Reid, S., Milford, M., & Wyeth, G. (2012). OpenFABMAP: An open source toolbox for appearance-based loop closure detection. In IEEE international conference on robotics and automation.
Goldberg, S. B., Maimone, M. W., & Matthies, L. (2002). Stereo vision and rover navigation software for planetary exploration. In IEEE aerospace conference proceedings.
Gorodnitsky, I. F., & Rao, B. D. (1997). Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. IEEE Transactions on Signal Processing, 45(3), 600–616.
Gutmann, J. S., & Konolige, K. (1999). Incremental mapping of large cyclic environments. In IEEE international symposium on computational intelligence in robotics and automation.
Han, F., Wang, H., & Zhang, H. (2018). Learning of integrated holism-landmark representations for long-term loop closure detection. In AAAI conference on artificial intelligence.
Han, F., Yang, X., Deng, Y., Rentschler, M., Yang, D., & Zhang, H. (2017). SRAL: Shared representative appearance learning for long-term visual place recognition. IEEE Robotics and Automation Letters, 2(2), 1172–1179.
Hansen, P., & Browning, B. (2014). Visual place recognition using HMM sequence matching. In IEEE/RSJ international conference on intelligent robots and systems.
Henry, P., Krainin, M., Herbst, E., Ren, X., & Fox, D. (2012). RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments. The International Journal of Robotics Research, 31(5), 647–663.
Ho, K. L., & Newman, P. (2007). Detecting loop closure with scene sequences. International Journal of Computer Vision, 74(3), 261–286.
Johns, E., & Yang, G. Z. (2013). Feature co-occurrence maps: Appearance-based localisation throughout the day. In IEEE international conference on robotics and automation.
Kleiner, A., & Dornhege, C. (2007). Real-time localization and elevation mapping within urban search and rescue scenarios. Journal of Field Robotics, 24(8–9), 723–745.
Klopschitz, M., Zach, C., Irschara, A., & Schmalstieg, D. (2008). Generalized detection and merging of loop closures for video sequences. In 3D data processing, visualization, and transmission.
Labbe, M., & Michaud, F. (2013). Appearance-based loop closure detection for online large-scale and long-term operation. IEEE Transactions on Robotics, 29(3), 734–745.
Labbe, M., & Michaud, F. (2014). Online global loop closure detection for large-scale multi-session graph-based SLAM. In IEEE/RSJ international conference on intelligent robots and systems.
Latif, Y., Cadena, C., & Neira, J. (2013). Robust loop closing over time for pose graph SLAM. The International Journal of Robotics Research, 32, 1611–1626.
Latif, Y., Huang, G., Leonard, J., & Neira, J. (2014). An online sparsity-cognizant loop-closure algorithm for visual navigation. In Robotics: Science and systems conference.
Li, S., Huang, H., Zhang, Y., & Liu, M. (2015). An efficient multi-scale convolutional neural network for image classification based on PCA. In International conference on real-time computing and robotics.
Lowry, S., Sünderhauf, N., Newman, P., Leonard, J. J., Cox, D., Corke, P., et al. (2016). Visual place recognition: A survey. IEEE Transactions on Robotics, 32, 1.
Milford, M. J., & Wyeth, G. F. (2012). SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. In IEEE international conference on robotics and automation.
Milford, M. J., Wyeth, G. F., & Prasser, D. (2004). RatSLAM: A hippocampal model for simultaneous localization and mapping. In IEEE international conference on robotics and automation.
Mur-Artal, R., Montiel, J. M. M., & Tardos, J. D. (2015). ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5), 1147–1163.
Mur-Artal, R., & Tardós, J. D. (2014). Fast relocalisation and loop closing in keyframe-based SLAM. In IEEE international conference on robotics and automation.
Naseer, T., Ruhnke, M., Stachniss, C., Spinello, L., & Burgard, W. (2015). Robust visual SLAM across seasons. In IEEE/RSJ international conference on intelligent robots and systems.
Naseer, T., Spinello, L., Burgard, W., & Stachniss, C. (2014). Robust visual robot localization across seasons using network flows. In AAAI conference on artificial intelligence.
Nie, F., Huang, H., Cai, X., & Ding, C. H. (2010). Efficient and robust feature selection via joint \(\ell _{2,1}\)-norms minimization. In Advances in neural information processing systems.
Pepperell, E., Corke, P., & Milford, M. J. (2014). All-environment visual place recognition with SMART. In IEEE international conference on robotics and automation.
Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems.
Santos, J. M., Couceiro, M. S., Portugal, D., & Rocha, R. P. (2015). A sensor fusion layer to cope with reduced visibility in SLAM. Journal of Intelligent & Robotic Systems, 80(3), 401–422.
Sünderhauf, N., Neubert, P., & Protzel, P. (2013). Are we there yet? Challenging SeqSLAM on a 3000 km journey across all four seasons. In Workshop on IEEE international conference on robotics and automation.
Sünderhauf, N., & Protzel, P. (2011). BRIEF-Gist—closing the loop by simple means. In IEEE/RSJ international conference on intelligent robots and systems.
Sünderhauf, N., Shirazi, S., Jacobson, A., Dayoub, F., Pepperell, E., Upcroft, B., & Milford, M. (2015). ConvNet landmarks: Viewpoint-robust, condition-robust, training-free. In Robotics: Science and systems.
Thrun, S., Burgard, W., & Fox, D. (2000). A real-time algorithm for mobile robot mapping with applications to multi-robot and 3D mapping. In IEEE international conference on robotics and automation.
Thrun, S., & Leonard, J. J. (2008). Simultaneous localization and mapping. In B. Siciliano & O. Khatib (Eds.), Springer handbook of robotics (pp. 871–889). Berlin: Springer.
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B (Methodological), 58, 267–288.
Wang, H., Nie, F., & Huang, H. (2013). Multi-view clustering and feature learning via structured sparsity. In International conference on machine learning.
Zhang, H., Han, F., & Wang, H. (2016). Robust multimodal sequence-based loop closure detection via structured sparsity. In Robotics: Science and systems.
Acknowledgements
This work was partially supported by ARO W911NF-17-1-0447, NSF-IIS 1423591, and NSF-IIS 1652943.
Additional information
This is one of several papers published in Autonomous Robots comprising the “Special Issue on Robotics Science and Systems”.
Appendix

Proof of Lemma 1
For any vectors \(\tilde{\mathbf {v}}\) and \(\mathbf {v} \ne \mathbf {0}\), the following inequality holds: \(\Vert \tilde{\mathbf {v}}\Vert _2 - \frac{\Vert \tilde{\mathbf {v}}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \le \Vert \mathbf {v}\Vert _2 - \frac{\Vert \mathbf {v}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \).
Proof
Obviously, the inequality \(-(\Vert \tilde{\mathbf {v}}\Vert _2 - \Vert \mathbf {v}\Vert _2)^2 \le 0\) holds. Expanding the square and rearranging, we have
\[
2\Vert \tilde{\mathbf {v}}\Vert _2 \Vert \mathbf {v}\Vert _2 - \Vert \tilde{\mathbf {v}}\Vert _2^2 \le \Vert \mathbf {v}\Vert _2^2 .
\]
Dividing both sides by \(2\Vert \mathbf {v}\Vert _2 > 0\) yields
\[
\Vert \tilde{\mathbf {v}}\Vert _2 - \frac{\Vert \tilde{\mathbf {v}}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \le \frac{\Vert \mathbf {v}\Vert _2}{2} = \Vert \mathbf {v}\Vert _2 - \frac{\Vert \mathbf {v}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} .
\]
This completes the proof. \(\square \)
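As a quick numerical sanity check of the lemma (not part of the original proof), take \(\tilde{\mathbf {v}} = (3, 4)^{\top }\) and \(\mathbf {v} = (1, 0)^{\top }\):
\[
\Vert \tilde{\mathbf {v}}\Vert _2 - \frac{\Vert \tilde{\mathbf {v}}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} = 5 - \frac{25}{2} = -7.5
\;\le\;
\Vert \mathbf {v}\Vert _2 - \frac{\Vert \mathbf {v}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} = 1 - \frac{1}{2} = 0.5 .
\]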
Proof of Theorem 1
Algorithm 1 monotonically decreases the objective value of the problem in Eq. 8 in each iteration.
Proof
Assume the update of \(\mathbf {A}\) is \(\tilde{\mathbf {A}}\). According to Step 6 in Algorithm 1, we know that:
\[
\tilde{\mathbf {A}} = \arg \min _{\mathbf {A}} \; Tr\big ((\mathbf {D}\mathbf {A}-\mathbf {B})\mathbf {U}(\mathbf {D}\mathbf {A}-\mathbf {B})^{\top }\big ) + \lambda _1 Tr(\mathbf {A}^{\top }\mathbf {V}\mathbf {A}) + \lambda _2 \sum _{i=1}^{s} \mathbf {a}_i^{\top }\mathbf {W}^i \mathbf {a}_i ,
\]
where \(Tr(\cdot )\) is the trace of a matrix. Thus, we can derive
\[
Tr\big ((\mathbf {D}\tilde{\mathbf {A}}-\mathbf {B})\mathbf {U}(\mathbf {D}\tilde{\mathbf {A}}-\mathbf {B})^{\top }\big ) + \lambda _1 Tr(\tilde{\mathbf {A}}^{\top }\mathbf {V}\tilde{\mathbf {A}}) + \lambda _2 \sum _{i=1}^{s} \tilde{\mathbf {a}}_i^{\top }\mathbf {W}^i \tilde{\mathbf {a}}_i
\le
Tr\big ((\mathbf {D}\mathbf {A}-\mathbf {B})\mathbf {U}(\mathbf {D}\mathbf {A}-\mathbf {B})^{\top }\big ) + \lambda _1 Tr(\mathbf {A}^{\top }\mathbf {V}\mathbf {A}) + \lambda _2 \sum _{i=1}^{s} \mathbf {a}_i^{\top }\mathbf {W}^i \mathbf {a}_i .
\]
According to the definition of \(\mathbf {U}\), \(\mathbf {V}\), and \(\mathbf {W}^i\), we have
\[
\sum _{i=1}^{s}\frac{\Vert \mathbf {D}\tilde{\mathbf {a}}_i-\mathbf {b}_i\Vert _2^2}{2\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2} + \lambda _1 \sum _{i=1}^{n}\frac{\Vert \tilde{\mathbf {a}}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2} + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k}\frac{\Vert \tilde{\mathbf {a}}_i^j\Vert _2^2}{2\Vert \mathbf {a}_i^j\Vert _2}
\le
\sum _{i=1}^{s}\frac{\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2^2}{2\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2} + \lambda _1 \sum _{i=1}^{n}\frac{\Vert \mathbf {a}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2} + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k}\frac{\Vert \mathbf {a}_i^j\Vert _2^2}{2\Vert \mathbf {a}_i^j\Vert _2} .
\]
According to Lemma 1, we can obtain the following inequalities:
\[
\sum _{i=1}^{s}\Big (\Vert \mathbf {D}\tilde{\mathbf {a}}_i-\mathbf {b}_i\Vert _2 - \frac{\Vert \mathbf {D}\tilde{\mathbf {a}}_i-\mathbf {b}_i\Vert _2^2}{2\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2}\Big ) \le \sum _{i=1}^{s}\Big (\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2 - \frac{\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2^2}{2\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2}\Big ),
\]
\[
\sum _{i=1}^{n}\Big (\Vert \tilde{\mathbf {a}}^i\Vert _2 - \frac{\Vert \tilde{\mathbf {a}}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2}\Big ) \le \sum _{i=1}^{n}\Big (\Vert \mathbf {a}^i\Vert _2 - \frac{\Vert \mathbf {a}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2}\Big ),
\]
\[
\sum _{i=1}^{s}\sum _{j=1}^{k}\Big (\Vert \tilde{\mathbf {a}}_i^j\Vert _2 - \frac{\Vert \tilde{\mathbf {a}}_i^j\Vert _2^2}{2\Vert \mathbf {a}_i^j\Vert _2}\Big ) \le \sum _{i=1}^{s}\sum _{j=1}^{k}\Big (\Vert \mathbf {a}_i^j\Vert _2 - \frac{\Vert \mathbf {a}_i^j\Vert _2^2}{2\Vert \mathbf {a}_i^j\Vert _2}\Big ).
\]
Computing the summation of the three inequalities above on both sides (weighted by \(1\), \(\lambda _1\), and \(\lambda _2\), respectively), and adding the result to the inequality derived from the definitions of \(\mathbf {U}\), \(\mathbf {V}\), and \(\mathbf {W}^i\), we obtain:
\[
\sum _{i=1}^{s}\Vert \mathbf {D}\tilde{\mathbf {a}}_i-\mathbf {b}_i\Vert _2 + \lambda _1 \sum _{i=1}^{n}\Vert \tilde{\mathbf {a}}^i\Vert _2 + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k}\Vert \tilde{\mathbf {a}}_i^j\Vert _2
\le
\sum _{i=1}^{s}\Vert \mathbf {D}\mathbf {a}_i-\mathbf {b}_i\Vert _2 + \lambda _1 \sum _{i=1}^{n}\Vert \mathbf {a}^i\Vert _2 + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k}\Vert \mathbf {a}_i^j\Vert _2 .
\]
Therefore, Algorithm 1 monotonically decreases the objective value in each iteration.\(\square \)
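The monotone decrease established by Theorem 1 can also be checked empirically. The snippet below is a minimal sanity check, not code from the paper; it assumes the hypothetical `reweighted_solver` sketch from the Notes above is in scope and evaluates the \(\zeta \)-regularized objective after increasing numbers of iterations.

```python
import numpy as np

# Assumes the hypothetical `reweighted_solver` sketch from the Notes above is in scope.
def smoothed_objective(D, B, A, groups, lam1, lam2, zeta=1e-8):
    """The zeta-regularized objective from the Notes, which the re-weighted updates decrease."""
    loss = np.sum(np.sqrt(np.sum((D @ A - B) ** 2, axis=0) + zeta))
    rows = np.sum(np.sqrt(np.sum(A ** 2, axis=1) + zeta))
    grps = sum(np.sqrt(np.sum(A[g, i] ** 2) + zeta)
               for i in range(A.shape[1]) for g in groups)
    return loss + lam1 * rows + lam2 * grps

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 12))   # 12 scene templates with 20-dimensional features
B = rng.standard_normal((20, 3))    # a query sequence of 3 images
groups = [list(range(j, j + 4)) for j in range(0, 12, 4)]
vals = [smoothed_objective(D, B,
                           reweighted_solver(D, B, groups, 0.1, 0.1, n_iters=t),
                           groups, 0.1, 0.1)
        for t in range(1, 9)]
assert all(b <= a + 1e-6 for a, b in zip(vals, vals[1:]))  # objective never increases
```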
About this article
Cite this article
Han, F., Wang, H., Huang, G. et al. Sequence-based sparse optimization methods for long-term loop closure detection in visual SLAM. Auton Robot 42, 1323–1335 (2018). https://doi.org/10.1007/s10514-018-9736-3