Real-time High-accuracy Three-Dimensional Reconstruction with Consumer RGB-D Cameras

Published: 14 September 2018

Abstract

We present an integrated approach for reconstructing high-fidelity three-dimensional (3D) models using consumer RGB-D cameras. RGB-D registration and reconstruction algorithms are prone to errors from scanning noise, making it hard to perform 3D reconstruction accurately. The key idea of our method is to assign a probabilistic uncertainty model to each depth measurement, which then guides the scan alignment and depth fusion. This allows us to effectively handle inherent noise and distortion in depth maps while keeping the overall scan registration procedure under the iterative closest point framework for simplicity and efficiency. We further introduce a local-to-global, submap-based, and uncertainty-aware global pose optimization scheme to improve scalability and guarantee global model consistency. Finally, we have implemented the proposed algorithm on the GPU, achieving real-time 3D scanning frame rates and updating the reconstructed model on-the-fly. Experimental results on simulated and real-world data demonstrate that the proposed method outperforms state-of-the-art systems in terms of the accuracy of both recovered camera trajectories and reconstructed models.
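
The idea of attaching an uncertainty to each depth measurement and letting it guide fusion can be illustrated with a minimal sketch. The abstract does not state the paper's actual noise model or fusion rule, so everything below is an assumption for illustration: `depth_sigma` is a hypothetical noise model with placeholder coefficients, and `fuse_measurements` uses plain inverse-variance weighting of independent Gaussian measurements, which is a standard technique and not necessarily the authors' formulation.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple


def depth_sigma(z_m: float, a: float = 0.001, b: float = 0.002) -> float:
    """Hypothetical noise model: standard deviation (metres) grows with depth squared.

    The coefficients a and b are illustrative placeholders, not values from the paper.
    """
    return a + b * z_m * z_m


@dataclass
class Estimate:
    mean: float      # fused value (metres)
    variance: float  # variance of the fused value (metres^2)


def fuse_measurements(measurements: Iterable[Tuple[float, float]]) -> Estimate:
    """Fuse (value, sigma) pairs by inverse-variance weighting.

    Assumes independent Gaussian errors; low-sigma measurements dominate the result.
    """
    weight_sum = 0.0
    weighted_value_sum = 0.0
    for value, sigma in measurements:
        if sigma <= 0.0:
            raise ValueError("sigma must be positive")
        weight = 1.0 / (sigma * sigma)
        weight_sum += weight
        weighted_value_sum += weight * value
    if weight_sum == 0.0:
        raise ValueError("at least one measurement is required")
    return Estimate(mean=weighted_value_sum / weight_sum, variance=1.0 / weight_sum)


if __name__ == "__main__":
    # Two observations of the same surface point: one confident, one noisy.
    fused = fuse_measurements([(1.00, 0.01), (1.10, 0.10)])
    print(fused)
```

In this sketch the fused value stays close to the confident measurement and the fused variance is smaller than that of either input, which is the behaviour one would expect from any uncertainty-aware fusion scheme.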

Supplementary Material

cao (cao.zip)
Supplemental movie, appendix, image, and software files for Real-time High-accuracy Three-Dimensional Reconstruction with Consumer RGB-D Cameras

    Published In

    ACM Transactions on Graphics, Volume 37, Issue 5
    October 2018
    140 pages
    ISSN: 0730-0301
    EISSN: 1557-7368
    DOI: 10.1145/3278329
    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 14 September 2018
    Accepted: 01 May 2018
    Revised: 01 April 2018
    Received: 01 October 2017
    Published in TOG Volume 37, Issue 5

    Author Tags

    1. 3D scan registration
    2. RGB-D scanning
    3. scene reconstruction

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Joint NSFC-DFG Research Program
    • German Research Foundation, DFG
    • European Research Council, ERC Advanced Grant ACROSS
    • Natural Science Foundation of China
