research-article

Real-time non-rigid reconstruction using an RGB-D camera

Authors:

Michael Zollhöfer,

Matthias Nießner,

Christoph Rhemann,

Christopher Zach,

Matthew Fisher,

Andrew Fitzgibbon,

Christian Theobalt,

Marc Stamminger

ACM Transactions on Graphics (TOG), Volume 33, Issue 4

Article No.: 156, Pages 1 - 12

https://doi.org/10.1145/2601097.2601165

Published: 27 July 2014

Abstract

We present a combined hardware and software solution for markerless reconstruction of non-rigidly deforming physical objects with arbitrary shape in real-time. Our system uses a single self-contained stereo camera unit built from off-the-shelf components and consumer graphics hardware to generate spatio-temporally coherent 3D models at 30 Hz. A new stereo matching algorithm estimates real-time RGB-D data. We start by scanning a smooth template model of the subject as they move rigidly. This geometric surface prior avoids strong scene assumptions, such as a kinematic human skeleton or a parametric shape model. Next, a novel GPU pipeline performs non-rigid registration of live RGB-D data to the smooth template using an extended non-linear as-rigid-as-possible (ARAP) framework. High-frequency details are fused onto the final mesh using a linear deformation model. The system is an order of magnitude faster than state-of-the-art methods, while matching the quality and robustness of many offline algorithms. We show precise real-time reconstructions of diverse scenes, including: large deformations of users' heads, hands, and upper bodies; fine-scale wrinkles and folds of skin and clothing; and non-rigid interactions performed by users on flexible objects such as toys. We demonstrate how acquired models can be used for many interactive scenarios, including re-texturing, online performance capture and preview, and real-time shape and motion re-targeting.
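The abstract does not spell out the extended non-linear energy used for registration. As a rough illustration of the underlying idea only, the sketch below evaluates the classic as-rigid-as-possible energy from Sorkine and Alexa's formulation: for each vertex, the deformed edges are compared against the rest-pose edges rotated by a per-vertex rotation. The function names, the toy triangle, and the optional `weights` dictionary are assumptions made for this example and are not taken from the paper. The paper's GPU pipeline will also contain further terms that are not reproduced here.

```python
import math


def rotate(R, v):
    """Apply a 3x3 matrix R (nested lists) to a 3-vector v."""
    return [sum(R[r][c] * v[c] for c in range(3)) for r in range(3)]


def rot_z(theta):
    """Rotation by theta radians about the z-axis."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def arap_energy(rest, deformed, neighbors, rotations, weights=None):
    """Sum over directed edges (i, j) of
    w_ij * ||(p'_i - p'_j) - R_i (p_i - p_j)||^2.

    `weights` is a hypothetical dict keyed by (i, j); uniform weights
    of 1 are used when it is omitted.
    """
    energy = 0.0
    for i, nbrs in neighbors.items():
        for j in nbrs:
            w = 1.0 if weights is None else weights[(i, j)]
            rest_edge = [a - b for a, b in zip(rest[i], rest[j])]
            deformed_edge = [a - b for a, b in zip(deformed[i], deformed[j])]
            predicted = rotate(rotations[i], rest_edge)
            energy += w * sum(
                (d - p) ** 2 for d, p in zip(deformed_edge, predicted)
            )
    return energy


# Toy mesh (assumption): one triangle, every vertex adjacent to the other two.
rest = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

# A rigid rotation is explained by the per-vertex rotations, so the
# energy is zero up to floating-point round-off.
R = rot_z(math.pi / 2)
rigid = [rotate(R, p) for p in rest]
print(arap_energy(rest, rigid, neighbors, [R, R, R]))

# Uniform scaling by 2 cannot be explained by rotations alone, so the
# energy is strictly positive.
identity = rot_z(0.0)
scaled = [[2.0 * x for x in p] for p in rest]
print(arap_energy(rest, scaled, neighbors, [identity] * 3))
```

In a full solver the rotations and deformed positions would both be unknowns that are optimized jointly. Here they are supplied by hand to show what the energy penalizes.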

Supplementary Material

ZIP File (a156-zollhofer.zip)

Supplemental material.

116.50 MB

MP4 File (a156-sidebyside.mp4)

22.91 MB

Index Terms

Real-time non-rigid reconstruction using an RGB-D camera
1. Applied computing
  1. Document management and text processing
    1. Document capture
      1. Document scanning
      2. Graphics recognition and interpretation
2. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
  2. Computer graphics
    1. Image manipulation

Information & Contributors

Information

Published In

ACM Transactions on Graphics Volume 33, Issue 4

July 2014

1366 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/2601097

Copyright © 2014 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Bibliometrics & Citations

Article Metrics

Total Citations: 328
Total Downloads: 3,563
Downloads (Last 12 months): 107
Downloads (Last 6 weeks): 12

Reflects downloads up to 25 Feb 2025

Citations

Cited By

Jang D, Yang D, Jang D, Choi B, Lee S, Shin D (2024). ELMO: Enhanced Real-time LiDAR Motion Capture through Upsampling. ACM Transactions on Graphics 43:6 (1-14). Online publication date: 19-Dec-2024
https://dl.acm.org/doi/10.1145/3687991
Sariyanidi E, Zampella C, Schultz R, Tunç B (2024). Inequality-Constrained 3D Morphable Face Model Fitting. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:2 (1305-1318). Online publication date: 1-Feb-2024
https://dl.acm.org/doi/10.1109/TPAMI.2023.3334948
Tajdari F, Huysmans T, Yao X, Xu J, Zebarjadi M, Song Y (2024). 4D Feet: Registering Walking Foot Shapes Using Attention Enhanced Dynamic-Synchronized Graph Convolutional LSTM Network. IEEE Open Journal of the Computer Society 5 (343-355). Online publication date: 2024
https://doi.org/10.1109/OJCS.2024.3406645
Cuau L, Santos J, Poignet P, Zemiti N (2024). Direct TPS-based 3D non-rigid motion estimation on 3D colored point cloud in eye-in-hand configuration. 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) (11581-11586). Online publication date: 14-Oct-2024
https://doi.org/10.1109/IROS58592.2024.10802824
Zhao M, Jiang J, Ma L, Xin S, Meng G, Yan D (2024). Correspondence-Free Non-Rigid Point Set Registration Using Unsupervised Clustering Analysis. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (21199-21208). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.02003
Wang H, Wang J, Agapito L (2024). MorpheuS: Neural Dynamic 360° Surface Reconstruction from Monocular RGB-D Video. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (20965-20976). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.01981
Jiang Y, Shen Z, Wang P, Su Z, Hong Y, Zhang Y, Yu J, Xu L (2024). HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (19734-19745). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.01866
Tretschk E, Golyanik V, Zollhöfer M, Božič A, Lassner C, Theobalt C (2024). SceNeRFlow: Time-Consistent Reconstruction of General Dynamic Scenes. 2024 International Conference on 3D Vision (3DV) (1424-1435). Online publication date: 18-Mar-2024
https://doi.org/10.1109/3DV62453.2024.00136
Nawaz M, Nazir T (2024). EDet-BTR: EfficientDet-based brain tumor recognition from the magnetic resonance imaging. Biomedical Signal Processing and Control 96 (106618). Online publication date: Oct-2024
https://doi.org/10.1016/j.bspc.2024.106618
Chen R, Zhao J, Zhang F, Chalmers A, Rhee T (2024). Neural Radiance Fields for Dynamic View Synthesis Using Local Temporal Priors. Computational Visual Media (74-90). Online publication date: 10-Apr-2024
https://dl.acm.org/doi/10.1007/978-981-97-2095-8_5
