Abstract
Structural geometry constraints, such as perpendicularity, parallelism and coplanarity, are ubiquitous in man-made scenes, especially Manhattan scenes. By fully exploiting these structural properties, we propose a monocular visual-inertial odometry (VIO) system that uses point and line features with structural constraints. First, a coarse-to-fine vanishing point estimation method with line segment consistency verification is presented to classify lines into structural and non-structural lines accurately at low computational cost. Then, to obtain precise estimates of the camera pose and the positions of 3D landmarks, a cost function that combines structural line constraints with feature reprojection residuals and inertial measurement unit residuals is minimized within a sliding window framework. For the geometric representation of lines, Plücker coordinates and the orthonormal representation are used for 3D line transformation and non-linear optimization, respectively. Extensive evaluations on two public datasets verify that the proposed system achieves higher localization accuracy and robustness than other state-of-the-art VIO systems with acceptable time consumption.
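To illustrate the line parameterization mentioned in the abstract, the following minimal sketch (Python/NumPy; all function names are illustrative and not from the paper) shows how Plücker coordinates of a 3D line can be formed, transformed between frames, and converted to the 4-DoF orthonormal representation used for non-linear optimization, in the spirit of Bartoli and Sturm's formulation.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix so that skew(a) @ b == np.cross(a, b)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def plucker_from_points(p1, p2):
    """Plücker coordinates (n, d) of the line through 3D points p1 and p2:
    d is the line direction, n the normal of the plane spanned by the line and the origin."""
    d = p2 - p1
    n = np.cross(p1, p2)
    return n, d

def transform_line(n_w, d_w, R_cw, t_cw):
    """Transform a Plücker line from the world frame to the camera frame,
    assuming points transform as x_c = R_cw @ x_w + t_cw."""
    n_c = R_cw @ n_w + skew(t_cw) @ (R_cw @ d_w)
    d_c = R_cw @ d_w
    return n_c, d_c

def orthonormal_from_plucker(n, d):
    """Minimal orthonormal representation (U in SO(3), W in SO(2)) of a Plücker line."""
    U = np.column_stack([n / np.linalg.norm(n),
                         d / np.linalg.norm(d),
                         np.cross(n, d) / np.linalg.norm(np.cross(n, d))])
    w = np.array([np.linalg.norm(n), np.linalg.norm(d)])
    w = w / np.linalg.norm(w)
    W = np.array([[w[0], -w[1]],
                  [w[1],  w[0]]])
    return U, W
```

This is only a sketch of the standard line geometry; the paper's actual residual terms, Jacobians and structural line constraints are defined in the body of the article.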
Data availability
The datasets used are publicly available at: EuRoC MAV Dataset: https://projects.asl.ethz.ch/datasets/doku.php?id=kmavvisualinertialdatasets#the_euroc_mav_dataset; TUM-VI Dataset: https://vision.in.tum.de/data/datasets/visual-inertial-dataset
References
Joo, K., Kim, P., Hebert, M., Kweon, I.S., Kim, H.J.: Linear RGB-D SLAM for structured environments. IEEE Trans. Pattern Anal. Mach. Intell. 44, 8403–8419 (2021)
Guclu, O., Can, A.B.: Integrating global and local image features for enhanced loop closure detection in RGB-D SLAM systems. Vis. Comput. 36(6), 1271–1290 (2020)
Zhou, Y., Yan, F., Zhou, Z.: Handling pure camera rotation in semi-dense monocular SLAM. Vis. Comput. 35(1), 123–132 (2019)
Miao, R., Liu, P., Wen, F., Gong, Z., Xue, W., Ying, R.: R-SDSO: robust stereo direct sparse odometry. Vis. Comput. 38(6), 2207–2221 (2022)
Davison, A.J., Reid, I.D., Molton, N.D., Stasse, O.: MonoSLAM: real-time single camera SLAM. IEEE Trans. Pattern Anal. Mach. Intell. 29(6), 1052–1067 (2007)
He, M., Zhu, C., Huang, Q., Ren, B., Liu, J.: A review of monocular visual odometry. Vis. Comput. 36(5), 1053–1065 (2020)
Cui, H., Tu, D., Tang, F., Xu, P., Liu, H., Shen, S.: Vidsfm: robust and accurate structure-from-motion for monocular videos. IEEE Trans. Image Process. 31, 2449–2462 (2022)
Greene, W.N., Roy, N.: Metrically-scaled monocular SLAM using learned scale factors. In: 2020 IEEE International Conference on Robotics and Automation (ICRA), pp. 43–50. IEEE (2020)
Lin, Y., Gao, F., Qin, T., Gao, W., Liu, T., Wu, W., Yang, Z., Shen, S.: Autonomous aerial navigation using monocular visual-inertial fusion. J. Field Robot. 35(1), 23–51 (2018)
Almalioglu, Y., Turan, M., Saputra, M.R.U., de Gusmão, P.P., Markham, A., Trigoni, N.: SelfVIO: self-supervised deep monocular visual-inertial odometry and depth estimation. Neural Netw. 150, 119–136 (2022)
Li, N., Ai, H.: EfiLoc: large-scale visual indoor localization with efficient correlation between sparse features and 3d points. Vis. Comput. 38(6), 2091–2106 (2022)
Qin, T., Li, P., Shen, S.: VINS-Mono: a robust and versatile monocular visual-inertial state estimator. IEEE Trans. Robot. 34(4), 1004–1020 (2018)
Mur-Artal, R., Montiel, J.M.M., Tardos, J.D.: ORB-SLAM: a versatile and accurate monocular SLAM system. IEEE Trans. Robot. 31(5), 1147–1163 (2015)
Lee, J., Park, S.-Y.: PLF-VINS: real-time monocular visual-inertial SLAM with point-line fusion and parallel-line fusion. IEEE Robot. Autom. Lett. 6(4), 7033–7040 (2021)
Lu, Y., Song, D.: Robust RGB-D odometry using point and line features. In: Proceedings of the IEEE International Conference on Computer Vision, pp. 3934–3942 (2015)
Hughes, C., Denny, P., Glavin, M., Jones, E.: Equidistant fish-eye calibration and rectification by vanishing point extraction. IEEE Trans. Pattern Anal. Mach. Intell. 32(12), 2289–2296 (2010)
Kim, P., Coltin, B., Kim, H.J.: Low-drift visual odometry in structured environments by decoupling rotational and translational motion. In: 2018 IEEE International Conference on Robotics and Automation (ICRA), pp. 7247–7253. IEEE (2018)
Li, H., Xing, Y., Zhao, J., Bazin, J.-C., Liu, Z., Liu, Y.-H.: Leveraging structural regularity of Atlanta world for monocular slam. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 2412–2418. IEEE (2019)
Burri, M., Nikolic, J., Gohl, P., Schneider, T., Rehder, J., Omari, S., Achtelik, M.W., Siegwart, R.: The EuRoC micro aerial vehicle datasets. Int. J. Robot. Res. 35(10), 1157–1163 (2016)
Schubert, D., Goll, T., Demmel, N., Usenko, V., Stückler, J., Cremers, D.: The TUM VI benchmark for evaluating visual-inertial odometry. In: 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 1680–1687. IEEE (2018)
He, Y., Zhao, J., Guo, Y., He, W., Yuan, K.: PL-VIO: tightly-coupled monocular visual-inertial odometry using point and line features. Sensors 18(4), 1159 (2018)
Fu, Q., Wang, J., Yu, H., Ali, I., Guo, F., He, Y., Zhang, H.: PL-VINS: real-time monocular visual-inertial SLAM with point and line features. arXiv preprint arXiv:2009.07462 (2020)
Lim, H., Jeon, J., Myung, H.: UV-SLAM: unconstrained line-based SLAM using vanishing points for structural mapping. IEEE Robot. Autom. Lett. 7, 1518–1525 (2022)
Zou, D., Wu, Y., Pei, L., Ling, H., Yu, W.: StructVIO: visual-inertial odometry with structural regularity of man-made environments. IEEE Trans. Robot. 35(4), 999–1013 (2019)
Huang, G.: Visual-inertial navigation: a concise review. In: 2019 International Conference on Robotics and Automation (ICRA), pp. 9572–9582. IEEE (2019)
Weiss, S., Achtelik, M.W., Lynen, S., Chli, M., Siegwart, R.: Real-time onboard visual-inertial state estimation and self-calibration of mavs in unknown environments. In: 2012 IEEE International Conference on Robotics and Automation, pp. 957–964. IEEE (2012)
Kneip, L., Weiss, S., Siegwart, R.: Deterministic initialization of metric state estimation filters for loosely-coupled monocular vision-inertial systems. In: 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems, pp. 2235–2241. IEEE (2011)
Bloesch, M., Omari, S., Hutter, M., Siegwart, R.: Robust visual inertial odometry using a direct EKF-based approach. In: 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 298–304. IEEE (2015)
Jones, E.S., Soatto, S.: Visual-inertial navigation, mapping and localization: a scalable real-time causal approach. Int. J. Robot. Res. 30(4), 407–430 (2011)
Shi, J., et al.: Good features to track. In: 1994 Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, pp. 593–600. IEEE (1994)
Pumarola, A., Vakhitov, A., Agudo, A., Sanfeliu, A., Moreno-Noguer, F.: PL-SLAM: real-time monocular visual SLAM with points and lines. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp. 4503–4508. IEEE (2017)
Von Gioi, R.G., Jakubowicz, J., Morel, J.-M., Randall, G.: LSD: a fast line segment detector with a false detection control. IEEE Trans. Pattern Anal. Mach. Intell. 32(4), 722–732 (2008)
Zhang, L., Koch, R.: An efficient and robust line segment matching approach based on LBD descriptor and pairwise geometric consistency. J. Vis. Commun. Image Represent. 24(7), 794–805 (2013)
Li, Y., Brasch, N., Wang, Y., Navab, N., Tombari, F.: Structure-SLAM: low-drift monocular SLAM in indoor environments. IEEE Robot. Autom. Lett. 5(4), 6583–6590 (2020)
Yunus, R., Li, Y., Tombari, F.: ManhattanSLAM: robust planar tracking and mapping leveraging mixture of Manhattan frames. In: 2021 IEEE International Conference on Robotics and Automation (ICRA), pp. 6687–6693. IEEE (2021)
Lu, X., Yao, J., Li, H., Liu, Y., Zhang, X.: 2-line exhaustive searching for real-time vanishing point estimation in Manhattan world. In: 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pp. 345–353. IEEE (2017)
Zhou, H., Zou, D., Pei, L., Ying, R., Liu, P., Yu, W.: StructSLAM: visual SLAM with building structure lines. IEEE Trans. Veh. Technol. 64(4), 1364–1375 (2015)
Xu, B., Wang, P., He, Y., Chen, Y., Chen, Y., Zhou, M.: Leveraging structural information to improve point line visual-inertial odometry. IEEE Robot. Autom. Lett. 7(2), 3483–3490 (2022)
Peng, X., Liu, Z., Wang, Q., Kim, Y.-T., Lee, H.-S.: Accurate visual-inertial SLAM by Manhattan frame re-identification. In: 2021 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp. 5418–5424. IEEE (2021)
Lucas, B.D., Kanade, T.: An iterative image registration technique with an application to stereo vision. In: IJCAI’81: 7th International Joint Conference on Artificial Intelligence, vol. 2, pp. 674–679 (1981)
Forster, C., Carlone, L., Dellaert, F., Scaramuzza, D.: On-manifold preintegration for real-time visual-inertial odometry. IEEE Trans. Robot. 33(1), 1–21 (2016)
Bouguet, J.-Y., et al.: Pyramidal implementation of the affine Lucas Kanade feature tracker description of the algorithm. Intel Corp. 5(1–10), 4 (2001)
Bartoli, A., Sturm, P.: Structure-from-motion using lines: representation, triangulation, and bundle adjustment. Comput. Vis. Image Underst. 100(3), 416–441 (2005)
Agarwal, S., Mierle, K.: Ceres solver: tutorial and reference. Google 2(72), 8 (2012)
Toldo, R., Fusiello, A.: Robust multiple structures estimation with J-linkage. In: European Conference on Computer Vision, pp. 537–547. Springer (2008)
Funding
This work is partly supported by the National Natural Science Foundation of China under Grant No.61973009.
Author information
Contributions
JZ: Conceptualization, Methodology, Software, Writing-original draft. JY: Supervision, Funding acquisition, Writing-review; JM: Writing-review and editing.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhang, J., Yang, J. & Ma, J. Monocular visual-inertial odometry leveraging point-line features with structural constraints. Vis Comput 40, 647–661 (2024). https://doi.org/10.1007/s00371-023-02807-z
DOI: https://doi.org/10.1007/s00371-023-02807-z