
Sequence-based sparse optimization methods for long-term loop closure detection in visual SLAM


Abstract

Loop closure detection is one of the most important modules in Simultaneous Localization and Mapping (SLAM), because it enables the discovery of the global topology among different places. A loop closure is detected when the current place is recognized as matching a previously visited place. When SLAM is executed over a long-term period, loop closure detection faces additional challenges. Illumination, weather, and vegetation conditions can change significantly during life-long SLAM, resulting in the critical problems of strong perceptual aliasing and appearance variation in loop closure detection. To address these problems, we propose a new Robust Multimodal Sequence-based (ROMS) method for robust loop closure detection in long-term visual SLAM. A sequence of images is used as the representation of places in our ROMS method, where each image in the sequence is encoded by multiple feature modalities so that different places can be recognized discriminatively. We formulate the robust place recognition problem as a convex optimization problem with structured sparsity regularization, based on the fact that only a small set of template places can match the query place. In addition, we develop a new algorithm to solve the formulated optimization problem efficiently, with a theoretical guarantee of convergence to the global optimum. Our ROMS method is evaluated through extensive experiments on three large-scale benchmark datasets, which record scenes across different times of the day, months, and seasons. Experimental results demonstrate that our ROMS method outperforms existing loop closure detection methods in long-term SLAM and achieves state-of-the-art performance.


Notes

  1. The template groups can be designed to have overlaps, e.g., using sliding window techniques. However, in our experiments, we found that groups with or without overlaps result in almost identical performance, as demonstrated by the example in Fig. 5c, since our method can activate highly similar scene templates outside of the selected group (and vice versa) to address the sequence misalignment issue.

  2. When \(\mathbf {D}\mathbf {a}_i - \mathbf {b}_i = \mathbf {0}\), Eq. 8 is not differentiable. Following Gorodnitsky and Rao (1997) and Wang et al. (2013), we can regularize the ith diagonal element of the matrix \(\mathbf {U}\) using \(u_{ii} = \frac{1}{2\sqrt{\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert ^2_2 + \zeta }}\). Similarly, when \(\mathbf {a}^i = \mathbf {0}\), the ith diagonal element of the matrix \(\mathbf {V}\) can be regularized using \(\frac{1}{2\sqrt{\Vert \mathbf {a}^i\Vert _2^2 + \zeta }}\). When \(\mathbf {a}_i^j = \mathbf {0}\), we employ the same small perturbation to regularize the jth diagonal block of \(\mathbf {W}^i\) as \(\frac{1}{2\sqrt{\Vert \mathbf {a}_i^j\Vert ^2_2 + \zeta }} \mathbf {I}_j\). Then, the derived algorithm can be proved to minimize the following function: \(\sum _{i=1}^{s}\sqrt{\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2^2 + \zeta } + \lambda _1 \sum _{i=1}^{n}\sqrt{\Vert \mathbf {a}^i\Vert _2^2+ \zeta } + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \sqrt{\Vert \mathbf {a}_i^j\Vert _2^2 + \zeta }\). It is easy to verify that this new problem reduces to the problem in Eq. 8 when \(\zeta \rightarrow 0\). A minimal illustrative sketch of this regularization is given after these notes.
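
For concreteness, the following is a minimal sketch (not the authors' released implementation) of how the \(\zeta \)-regularized weight matrices \(\mathbf {U}\), \(\mathbf {V}\), and \(\mathbf {W}^i\) described in note 2 could be computed with NumPy/SciPy. The names `D`, `A`, `B`, `groups`, and `zeta`, and the assumption that the groups partition the template indices in order, are illustrative choices rather than details taken from the paper.

```python
import numpy as np
from scipy.linalg import block_diag

def reweighting_matrices(D, A, B, groups, zeta=1e-8):
    """Zeta-regularized reweighting matrices, as described in note 2 (sketch).

    D: (d, n) dictionary of template features; A: (n, s) sparse codes;
    B: (d, s) query features; groups: list of index arrays that partition
    range(n) in order, one per template group. Returns the diagonal U (s x s),
    the diagonal V (n x n), and a list of block-diagonal matrices W^i, one
    per query column a_i.
    """
    R = D @ A - B                                               # residual D a_i - b_i per column
    U = np.diag(1.0 / (2.0 * np.sqrt(np.sum(R ** 2, axis=0) + zeta)))
    V = np.diag(1.0 / (2.0 * np.sqrt(np.sum(A ** 2, axis=1) + zeta)))
    W = []
    for i in range(A.shape[1]):
        # jth diagonal block of W^i uses the group norm ||a_i^j||
        blocks = [np.eye(len(g)) / (2.0 * np.sqrt(np.sum(A[g, i] ** 2) + zeta))
                  for g in groups]
        W.append(block_diag(*blocks))
    return U, V, W
```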

References

  • Angeli, A., Filliat, D., Doncieux, S., & Meyer, J. A. (2008). Fast and incremental method for loop-closure detection using bags of visual words. IEEE Transactions on Robotics, 24(5), 1027–1037.

  • Arroyo, R., Alcantarilla, P., Bergasa, L., & Romera, E. (2015). Towards life-long visual localization using an efficient matching of binary sequences from images. In IEEE international conference on robotics and automation.

  • Badino, H., Huber, D., & Kanade, T. (2012). Real-time topometric localization. In IEEE international conference on robotics and automation.

  • Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., et al. (2016). Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics, 32(6), 1309–1332.

  • Cadena, C., Gálvez-López, D., Tardós, J. D., & Neira, J. (2012). Robust place recognition with stereo sequences. IEEE Transactions on Robotics, 28(4), 871–885.

  • Chen, C., & Wang, H. (2006). Appearance-based topological Bayesian inference for loop-closing detection in a cross-country environment. The International Journal of Robotics Research, 25(10), 953–983.

  • Cummins, M., & Newman, P. (2008). FAB-MAP: Probabilistic localization and mapping in the space of appearance. The International Journal of Robotics Research, 27(6), 647–665.

  • Cummins, M., & Newman, P. (2009). Highly scalable appearance-only SLAM-FAB-MAP 2.0. In Robotics: Science and systems.

  • Estrada, C., Neira, J., & Tardós, J. D. (2005). Hierarchical SLAM: Real-time accurate mapping of large environments. IEEE Transactions on Robotics, 21(4), 588–596.

  • Gálvez-López, D., & Tardós, J. D. (2012). Bags of binary words for fast place recognition in image sequences. IEEE Transactions on Robotics, 28(5), 1188–1197.

  • Glover, A. J., Maddern, W. P., Milford, M. J., & Wyeth, G. F. (2010). FAB-MAP + RatSLAM: Appearance-based SLAM for multiple times of day. In IEEE international conference on robotics and automation.

  • Glover, A., Maddern, W., Warren, M., Reid, S., Milford, M., & Wyeth, G. (2012). OpenFABMAP: An open source toolbox for appearance-based loop closure detection. In IEEE international conference on robotics and automation.

  • Goldberg, S. B., Maimone, M. W., & Matthies, L. (2002). Stereo vision and rover navigation software for planetary exploration. In IEEE aerospace conference proceedings.

  • Gorodnitsky, I. F., & Rao, B. D. (1997). Sparse signal reconstruction from limited data using FOCUSS: A re-weighted minimum norm algorithm. IEEE Transactions on Signal Processing, 45(3), 600–616.

  • Gutmann, J. S., & Konolige, K. (1999). Incremental mapping of large cyclic environments. In IEEE international symposium on computational intelligence in robotics and automation.

  • Han, F., Wang, H., & Zhang, H. (2018). Learning of integrated holism-landmark representations for long-term loop closure detection. In AAAI conference on artificial intelligence.

  • Han, F., Yang, X., Deng, Y., Rentschler, M., Yang, D., & Zhang, H. (2017). SRAL: Shared representative appearance learning for long-term visual place recognition. IEEE Robotics and Automation Letters, 2(2), 1172–1179.

  • Hansen, P., & Browning, B. (2014). Visual place recognition using HMM sequence matching. In IEEE/RSJ international conference on intelligent robots and systems.

  • Henry, P., Krainin, M., Herbst, E., Ren, X., & Fox, D. (2012). RGB-D mapping: Using Kinect-style depth cameras for dense 3D modeling of indoor environments. The International Journal of Robotics Research, 31(5), 647–663.

  • Ho, K. L., & Newman, P. (2007). Detecting loop closure with scene sequences. International Journal of Computer Vision, 74(3), 261–286.

  • Johns, E., & Yang, G. Z. (2013). Feature co-occurrence maps: Appearance-based localisation throughout the day. In IEEE international conference on robotics and automation.

  • Kleiner, A., & Dornhege, C. (2007). Real-time localization and elevation mapping within urban search and rescue scenarios. Journal of Field Robotics, 24(8–9), 723–745.

  • Klopschitz, M., Zach, C., Irschara, A., & Schmalstieg, D. (2008). Generalized detection and merging of loop closures for video sequences. In 3D data processing, visualization, and transmission.

  • Labbe, M., & Michaud, F. (2013). Appearance-based loop closure detection for online large-scale and long-term operation. IEEE Transactions on Robotics, 29(3), 734–745.

  • Labbe, M., & Michaud, F. (2014). Online global loop closure detection for large-scale multi-session graph-based SLAM. In IEEE/RSJ international conference on intelligent robots and systems.

  • Latif, Y., Cadena, C., & Neira, J. (2013). Robust loop closing over time for pose graph SLAM. The International Journal of Robotics Research, 32, 1611–1626.

  • Latif, Y., Huang, G., Leonard, J., & Neira, J. (2014). An online sparsity-cognizant loop-closure algorithm for visual navigation. In Robotics: Science and systems conference.

  • Li, S., Huang, H., Zhang, Y., & Liu, M. (2015). An efficient multi-scale convolutional neural network for image classification based on PCA. In International conference on real-time computing and robotics.

  • Lowry, S., Sünderhauf, N., Newman, P., Leonard, J. J., Cox, D., Corke, P., et al. (2016). Visual place recognition: A survey. IEEE Transactions on Robotics, 32(1), 1–19.

  • Milford, M. J., & Wyeth, G. F. (2012). SeqSLAM: Visual route-based navigation for sunny summer days and stormy winter nights. In IEEE international conference on robotics and automation.

  • Milford, M. J., Wyeth, G. F., & Prasser, D. (2004). RatSLAM: A hippocampal model for simultaneous localization and mapping. In IEEE international conference on robotics and automation.

  • Mur-Artal, R., Montiel, J. M. M., & Tardos, J. D. (2015). ORB-SLAM: A versatile and accurate monocular SLAM system. IEEE Transactions on Robotics, 31(5), 1147–1163.

  • Mur-Artal, R., & Tardós, J. D. (2014). Fast relocalisation and loop closing in keyframe-based SLAM. In IEEE international conference on robotics and automation.

  • Naseer, T., Ruhnke, M., Stachniss, C., Spinello, L., & Burgard, W. (2015). Robust visual SLAM across seasons. In IEEE/RSJ international conference on intelligent robots and systems.

  • Naseer, T., Spinello, L., Burgard, W., & Stachniss, C. (2014). Robust visual robot localization across seasons using network flows. In AAAI conference on artificial intelligence.

  • Nie, F., Huang, H., Cai, X., & Ding, C. H. (2010). Efficient and robust feature selection via joint \(\ell _{2,1}\)-norms minimization. In Advances in neural information processing systems.

  • Pepperell, E., Corke, P., & Milford, M. J. (2014). All-environment visual place recognition with SMART. In IEEE international conference on robotics and automation.

  • Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in neural information processing systems.

  • Santos, J. M., Couceiro, M. S., Portugal, D., & Rocha, R. P. (2015). A sensor fusion layer to cope with reduced visibility in SLAM. Journal of Intelligent & Robotic Systems, 80(3), 401–422.

  • Sünderhauf, N., Neubert, P., & Protzel, P. (2013). Are we there yet? Challenging SeqSLAM on a 3000 km journey across all four seasons. In Workshop on IEEE international conference on robotics and automation.

  • Sünderhauf, N., & Protzel, P. (2011). BRIEF-Gist—closing the loop by simple means. In IEEE/RSJ international conference on intelligent robots and systems.

  • Sünderhauf, N., Shirazi, S., Jacobson, A., Dayoub, F., Pepperell, E., Upcroft, B., & Milford, M. (2015). ConvNet landmarks: Viewpoint-robust, condition-robust, training-free. In Robotics: Science and systems.

  • Thrun, S., Burgard, W., & Fox, D. (2000). A real-time algorithm for mobile robot mapping with applications to multi-robot and 3D mapping. In IEEE international conference on robotics and automation.

  • Thrun, S., & Leonard, J. J. (2008). Simultaneous localization and mapping. In B. Siciliano & O. Khatib (Eds.), Springer handbook of robotics (pp. 871–889). Berlin: Springer.

  • Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society Series B (Methodological), 58, 267–288.

  • Wang, H., Nie, F., & Huang, H. (2013). Multi-view clustering and feature learning via structured sparsity. In International conference on machine learning.

  • Zhang, H., Han, F., & Wang, H. (2016). Robust multimodal sequence-based loop closure detection via structured sparsity. In Robotics: Science and systems.


Acknowledgements

This work was partially supported by ARO W911NF-17-1-0447, NSF-IIS 1423591, and NSF-IIS 1652943.

Author information

Corresponding author

Correspondence to Fei Han.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This is one of several papers published in Autonomous Robots comprising the “Special Issue on Robotics Science and Systems”.

Appendix

Proof of Lemma 1

For any vectors \(\tilde{\mathbf {v}}\) and \(\mathbf {v}\) with \(\mathbf {v} \ne \mathbf {0}\), the following inequality holds: \(\Vert \tilde{\mathbf {v}}\Vert _2 - \frac{\Vert \tilde{\mathbf {v}}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \le \Vert \mathbf {v}\Vert _2 - \frac{\Vert \mathbf {v}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \).

Proof

Obviously, the inequality \(-(\Vert \tilde{\mathbf {v}}\Vert _2 - \Vert \mathbf {v}\Vert _2)^2 \le 0\) holds. Thus, we have:

$$\begin{aligned}&-(\Vert \tilde{\mathbf {v}}\Vert _2 - \Vert \mathbf {v}\Vert _2)^2 \le 0 \Rightarrow 2\Vert \tilde{\mathbf {v}}\Vert _2\Vert \mathbf {v}\Vert _2 - \Vert \tilde{\mathbf {v}}\Vert _2^2 \le \Vert \mathbf {v}\Vert _2^2 \\&\Rightarrow \Vert \tilde{\mathbf {v}}\Vert _2 - \frac{\Vert \tilde{\mathbf {v}}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \le \Vert \mathbf {v}\Vert _2 - \frac{\Vert \mathbf {v}\Vert _2^2}{2\Vert \mathbf {v}\Vert _2} \end{aligned}$$

This completes the proof. \(\square \)
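
As a quick numerical sanity check of Lemma 1 (not part of the original proof), the inequality can be verified for randomly drawn vectors; the snippet below is illustrative only and assumes \(\mathbf {v} \ne \mathbf {0}\) so the denominators are well defined.

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    v_tilde, v = rng.normal(size=4), rng.normal(size=4)
    nvt, nv = np.linalg.norm(v_tilde), np.linalg.norm(v)
    # Lemma 1: ||v~|| - ||v~||^2 / (2||v||) <= ||v|| - ||v||^2 / (2||v||)
    assert nvt - nvt ** 2 / (2 * nv) <= nv - nv ** 2 / (2 * nv) + 1e-12
```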

Proof of Theorem 1

Algorithm 1 monotonically decreases the objective value of the problem in Eq. 8 in each iteration.

Proof

Let \(\tilde{\mathbf {A}}\) denote the updated \(\mathbf {A}\). According to Step 6 in Algorithm 1, we know that:

$$\begin{aligned} \tilde{\mathbf {A}} =\,&\underset{\mathbf {A}}{\mathrm {arg\,min}}\; Tr((\mathbf {DA} - \mathbf {B})\mathbf {U}(\mathbf {DA}-\mathbf {B})^\top ) \nonumber \\&+ \lambda _1 Tr(\mathbf {A}^\top \mathbf {VA}) + \lambda _2 \sum _{i=1}^{s} \mathbf {a}_i^\top \mathbf {W}^i \mathbf {a}_i \,, \end{aligned}$$
(12)

where \(Tr(\cdot )\) is the trace of a matrix. Thus, we can derive

$$\begin{aligned}&Tr((\mathbf {D}\tilde{\mathbf {A}} - \mathbf {B})\mathbf {U}(\mathbf {D}\tilde{\mathbf {A}}-\mathbf {B})^\top ) \nonumber \\&\qquad + \lambda _1 Tr(\tilde{\mathbf {A}}^\top \mathbf {V}\tilde{\mathbf {A}}) + \lambda _2 \sum _{i=1}^{s} \tilde{\mathbf {a}}_i^\top \mathbf {W}^i \tilde{\mathbf {a}}_i \nonumber \\&\quad \le Tr((\mathbf {DA} - \mathbf {B})\mathbf {U}(\mathbf {DA}-\mathbf {B})^\top ) \nonumber \\&\qquad + \lambda _1 Tr(\mathbf {A}^\top \mathbf {VA}) + \lambda _2 \sum _{i=1}^{s} \mathbf {a}_i^\top \mathbf {W}^i \mathbf {a}_i \end{aligned}$$
(13)

According to the definition of \(\mathbf {U}\), \(\mathbf {V}\), and \(\mathbf {W}\), we have

$$\begin{aligned}&\sum _{i=1}^{s} \frac{\Vert \mathbf {D}\tilde{\mathbf {a}}_i - \mathbf {b}_i\Vert ^2_2}{2\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2} + \lambda _1 \sum _{i=1}^{n} \frac{\Vert \tilde{\mathbf {a}}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2} + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \frac{\Vert \tilde{\mathbf {a}}_i^j\Vert ^2_2}{2\Vert \mathbf {a}_i^j\Vert _2} \nonumber \\&\quad \le \sum _{i=1}^{s} \frac{\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert ^2_2}{2\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2} + \lambda _1 \sum _{i=1}^{n} \frac{\Vert \mathbf {a}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2} + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \frac{\Vert \mathbf {a}_i^j\Vert ^2_2}{2\Vert \mathbf {a}_i^j\Vert _2} \end{aligned}$$
(14)

According to Lemma 1, we can obtain the following inequalities:

$$\begin{aligned}&\sum _{i=1}^{s} \left( \Vert \mathbf {D}\tilde{\mathbf {a}}_i - \mathbf {b}_i\Vert _2 - \frac{\Vert \mathbf {D}\tilde{\mathbf {a}}_i - \mathbf {b}_i\Vert ^2_2}{2\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2} \right) \le \sum _{i=1}^{s} \left( \Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2 - \frac{\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert ^2_2}{2\Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _2} \right) \nonumber \\&\sum _{i=1}^{n} \left( \Vert \tilde{\mathbf {a}}^i\Vert _2 - \frac{\Vert \tilde{\mathbf {a}}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2} \right) \le \sum _{i=1}^{n} \left( \Vert \mathbf {a}^i\Vert _2 - \frac{\Vert \mathbf {a}^i\Vert _2^2}{2\Vert \mathbf {a}^i\Vert _2} \right) \nonumber \\&\sum _{i=1}^{s}\sum _{j=1}^{k} \left( \Vert \tilde{\mathbf {a}}_i^j\Vert _2 - \frac{\Vert \tilde{\mathbf {a}}_i^j\Vert ^2_2}{2\Vert \mathbf {a}_i^j\Vert _2} \right) \le \sum _{i=1}^{s}\sum _{j=1}^{k} \left( \Vert \mathbf {a}_i^j\Vert _2 - \frac{\Vert \mathbf {a}_i^j\Vert ^2_2}{2\Vert \mathbf {a}_i^j\Vert _2} \right) \end{aligned}$$
(15)

Adding the three inequalities in Eq. 15, with the second weighted by \(\lambda _1\) and the third by \(\lambda _2\), to Eq. 14 on both sides, we obtain:

$$\begin{aligned}&\sum _{i=1}^{s} \Vert \mathbf {D}\tilde{\mathbf {a}}_i - \mathbf {b}_i\Vert _{2} + \lambda _1 \sum _{i=1}^{n} \Vert \tilde{\mathbf {a}}^i\Vert _{2} + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \Vert \tilde{\mathbf {a}}_i^j\Vert _{2} \nonumber \\&\quad \le \sum _{i=1}^{s} \Vert \mathbf {D}\mathbf {a}_i - \mathbf {b}_i\Vert _{2} + \lambda _1 \sum _{i=1}^{n} \Vert \mathbf {a}^i\Vert _{2} + \lambda _2 \sum _{i=1}^{s}\sum _{j=1}^{k} \Vert \mathbf {a}_i^j\Vert _{2} \end{aligned}$$
(16)

Therefore, Algorithm 1 monotonically decreases the objective value in each iteration.\(\square \)
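
To make the iterative scheme whose convergence this theorem establishes concrete, below is a minimal sketch of how Algorithm 1's alternation could look. It reuses the hypothetical `reweighting_matrices` helper from note 2 above and solves the weighted quadratic problem of Eq. 12 column by column in closed form; the initialization, parameter names, and iteration count are assumptions for illustration, and this is a sketch rather than the authors' implementation.

```python
import numpy as np

def roms_solve(D, B, groups, lam1=1.0, lam2=1.0, n_iter=50, zeta=1e-8):
    """Iteratively reweighted sketch of minimizing the Eq. 8 objective.

    Alternates between (i) recomputing U, V, W^i from the current codes A
    (note 2) and (ii) minimizing the weighted quadratic objective of Eq. 12,
    which decouples across the columns a_i and admits a closed-form solution.
    """
    d, n = D.shape
    s = B.shape[1]
    A = np.linalg.lstsq(D, B, rcond=None)[0]   # simple initialization (an assumption, not from the paper)
    DtD, DtB = D.T @ D, D.T @ B
    for _ in range(n_iter):
        U, V, W = reweighting_matrices(D, A, B, groups, zeta)
        for i in range(s):
            u_ii = U[i, i]
            # a_i = argmin_a  u_ii ||D a - b_i||^2 + lam1 a^T V a + lam2 a^T W^i a
            H = u_ii * DtD + lam1 * V + lam2 * W[i]
            A[:, i] = np.linalg.solve(H, u_ii * DtB[:, i])
    return A
```

By Theorem 1, each pass of this loop does not increase the objective value of Eq. 8, which is the monotonicity property proved above.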

Cite this article

Han, F., Wang, H., Huang, G. et al. Sequence-based sparse optimization methods for long-term loop closure detection in visual SLAM. Auton Robot 42, 1323–1335 (2018). https://doi.org/10.1007/s10514-018-9736-3
