
HRBF-Fusion: Accurate 3D Reconstruction from RGB-D Data Using On-the-fly Implicits

Published: 28 April 2022

Abstract

Reconstruction of high-fidelity 3D objects or scenes is a fundamental research problem. Recent advances in RGB-D fusion have demonstrated the potential of producing 3D models from consumer-level RGB-D cameras. However, due to the discrete nature and limited resolution of their surface representations (e.g., point- or voxel-based), existing approaches suffer from accumulated camera-tracking errors and distortion in the reconstruction, leading to unsatisfactory 3D models. In this article, we present a method that uses on-the-fly implicits of Hermite Radial Basis Functions (HRBFs) as a continuous surface representation for camera tracking in an existing RGB-D fusion framework. Furthermore, curvature estimation and confidence evaluation are coherently derived from the inherent surface properties of the on-the-fly HRBF implicits, which together enable higher-quality data fusion. We argue that, compared with discrete representations, our continuous yet on-the-fly surface representation effectively mitigates the impact of noise through its robustness and constrains the reconstruction with its inherent surface smoothness. Experimental results on various real-world and synthetic datasets demonstrate that HRBF-Fusion outperforms state-of-the-art approaches in terms of tracking robustness and reconstruction accuracy.
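
For context, the following is a minimal sketch of a standard Hermite radial basis function (HRBF) implicit and of a generic point-to-implicit tracking residual, written in LaTeX. The symbols (centers x_j, coefficients alpha_j and beta_j, kernel psi, frame points p_k, rigid pose T, robust loss rho) are illustrative assumptions; the paper's closed-form, on-the-fly HRBF construction and its actual tracking energy may differ in details not stated in the abstract.

\[
  f(\mathbf{x}) \;=\; \sum_{j=1}^{N} \Big( \alpha_j\,\varphi_j(\mathbf{x}) \;+\; \boldsymbol{\beta}_j \cdot \nabla \varphi_j(\mathbf{x}) \Big),
  \qquad \varphi_j(\mathbf{x}) = \psi\big(\lVert \mathbf{x}-\mathbf{x}_j \rVert\big),
\]
subject to the Hermite interpolation conditions \( f(\mathbf{x}_i) = 0 \) and \( \nabla f(\mathbf{x}_i) = \mathbf{n}_i \) at the fused surface points \(\mathbf{x}_i\) with oriented normals \(\mathbf{n}_i\). Camera tracking against such an implicit is typically posed as minimizing a first-order estimate of each observed point's distance to the zero level set:
\[
  E(\mathbf{T}) \;=\; \sum_{k} \rho\!\left( \frac{f(\mathbf{T}\,\mathbf{p}_k)}{\lVert \nabla f(\mathbf{T}\,\mathbf{p}_k) \rVert} \right).
\]
Because \(f\) is smooth, its gradient and Hessian are available analytically, which is the kind of inherent surface property the abstract refers to for curvature estimation and confidence evaluation.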

Supplementary Material

xu (xu.zip)
Supplemental movie, appendix, image, and software files for HRBF-Fusion: Accurate 3D Reconstruction from RGB-D Data Using On-the-fly Implicits



Published In

ACM Transactions on Graphics, Volume 41, Issue 3
June 2022, 213 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3517033

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 28 April 2022
Online AM: 23 February 2022
Accepted: 01 February 2022
Revised: 01 December 2021
Received: 01 November 2020
Published in TOG Volume 41, Issue 3


Author Tags

  1. 3D reconstruction
  2. closed-form HRBFs
  3. registration
  4. camera tracking
  5. fusion

Qualifiers

  • Research-article
  • Refereed

Funding Sources

  • National Key Research and Development Program of China
  • National Natural Science Foundation of China
  • Natural Science Foundation of Jiangsu Province


