
Reconstruction of Personalized 3D Face Rigs from Monocular Video

Published: 18 May 2016

Abstract

We present a novel approach for the automatic creation of a personalized high-quality 3D face rig of an actor from just monocular video data (e.g., vintage movies). Our rig is based on three distinct layers that allow us to model the actor’s facial shape as well as capture his person-specific expression characteristics at high fidelity, ranging from coarse-scale geometry to fine-scale static and transient detail on the scale of folds and wrinkles. At the heart of our approach is a parametric shape prior that encodes the plausible subspace of facial identity and expression variations. Based on this prior, a coarse-scale reconstruction is obtained by means of a novel variational fitting approach. We represent person-specific idiosyncrasies, which cannot be represented in the restricted shape and expression space, by learning a set of medium-scale corrective shapes. Fine-scale skin details, such as wrinkles, are captured from video via shading-based refinement, and a generative detail formation model is learned. Both the medium- and fine-scale detail layers are coupled with the parametric prior by means of a novel sparse linear regression formulation. Once reconstructed, all layers of the face rig can be conveniently controlled by a small number of blendshape expression parameters, as widely used by animation artists. We show captured face rigs and their motions for several actors filmed in different monocular video formats, including legacy footage from YouTube, and demonstrate how they can be used for 3D animation and 2D video editing. Finally, we evaluate our approach qualitatively and quantitatively and compare it to related state-of-the-art methods.
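To make the layered rig concrete, the sketch below evaluates such a rig at animation time: a coarse delta-blendshape layer, medium-scale corrective shapes whose coefficients are predicted from the blendshape weights by a sparse linear regressor, and fine-scale detail applied as per-vertex displacements along the mesh normals. All function and parameter names are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

# Minimal sketch of evaluating a three-layer face rig (hypothetical names,
# not the paper's code): coarse blendshapes, medium-scale corrective shapes,
# and fine-scale detail along the vertex normals, all driven by the same
# expression weights.

def evaluate_rig(neutral, expr_basis, expr_weights,
                 corrective_basis, corrective_regressor,
                 normals, detail_regressor):
    """Return deformed vertex positions of shape (N, 3).

    neutral              : (N, 3) fitted neutral (identity) mesh
    expr_basis           : (K, N, 3) expression blendshape deltas
    expr_weights         : (K,) blendshape activations
    corrective_basis     : (M, N, 3) learned medium-scale corrective shapes
    corrective_regressor : (M, K) sparse matrix mapping expression weights
                           to corrective coefficients
    normals              : (N, 3) unit vertex normals of the coarse mesh
    detail_regressor     : (N, K) per-vertex weights predicting fine-scale
                           displacement magnitude along the normal
    """
    # Coarse layer: standard delta-blendshape evaluation.
    coarse = neutral + np.tensordot(expr_weights, expr_basis, axes=1)

    # Medium layer: corrective coefficients are a (sparse) linear function
    # of the expression weights, so the corrections follow the animation.
    corr_coeffs = corrective_regressor @ expr_weights        # (M,)
    medium = np.tensordot(corr_coeffs, corrective_basis, axes=1)

    # Fine layer: scalar displacement per vertex along its normal,
    # again predicted linearly from the expression weights.
    detail = (detail_regressor @ expr_weights)[:, None] * normals

    return coarse + medium + detail
```

The point mirrored here is that the medium- and fine-scale layers are functions of the same expression weights that drive the coarse layer, so animating the low-dimensional blendshape parameters automatically carries the person-specific corrections and wrinkle detail along.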

Supplementary Material

garrido (garrido.zip)
Supplemental movie, appendix, image, and software files for "Reconstruction of Personalized 3D Face Rigs from Monocular Video"




    Published In

ACM Transactions on Graphics, Volume 35, Issue 3
    June 2016
    128 pages
    ISSN: 0730-0301
    EISSN: 1557-7368
    DOI: 10.1145/2903775
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 May 2016
    Accepted: 01 January 2016
    Revised: 01 December 2015
    Received: 01 September 2015
    Published in TOG Volume 35, Issue 3


    Author Tags

    1. 3D model fitting
    2. blendshapes
    3. corrective shapes
    4. facial animation
    5. shape-from-shading
    6. video editing

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Technicolor
    • ERC Starting Grant CapReal


Cited By

    • (2024) 3D Gaussian Blendshapes for Head Avatar Animation. ACM SIGGRAPH 2024 Conference Papers, 1-10. DOI: 10.1145/3641519.3657462. Online publication date: 13-Jul-2024.
    • (2024) Self-supervised 3D face reconstruction based on dense key points. Seventh International Conference on Computer Graphics and Virtuality (ICCGV 2024), 9. DOI: 10.1117/12.3029465. Online publication date: 13-May-2024.
    • (2024) SketchMetaFace: A Learning-Based Sketching Interface for High-Fidelity 3D Character Face Modeling. IEEE Transactions on Visualization and Computer Graphics 30(8), 5260-5275. DOI: 10.1109/TVCG.2023.3291703. Online publication date: 1-Aug-2024.
    • (2024) Unlocking Human-Like Facial Expressions in Humanoid Robots: A Novel Approach for Action Unit Driven Facial Expression Disentangled Synthesis. IEEE Transactions on Robotics 40, 3850-3865. DOI: 10.1109/TRO.2024.3422051. Online publication date: 1-Jan-2024.
    • (2024) Inequality-Constrained 3D Morphable Face Model Fitting. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(2), 1305-1318. DOI: 10.1109/TPAMI.2023.3334948. Online publication date: 1-Feb-2024.
    • (2024) Human Motion Generation: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 46(4), 2430-2449. DOI: 10.1109/TPAMI.2023.3330935. Online publication date: 1-Apr-2024.
    • (2024) Continuously Controllable Facial Expression Editing in Talking Face Videos. IEEE Transactions on Affective Computing 15(3), 1400-1413. DOI: 10.1109/TAFFC.2023.3334511. Online publication date: Jul-2024.
    • (2024) MonoNPHM: Dynamic Head Reconstruction from Monocular Videos. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 10747-10758. DOI: 10.1109/CVPR52733.2024.01022. Online publication date: 16-Jun-2024.
    • (2024) Learning Dynamic Tetrahedra for High-Quality Talking Head Synthesis. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 5209-5219. DOI: 10.1109/CVPR52733.2024.00498. Online publication date: 16-Jun-2024.
    • (2024) 3D Facial Expressions through Analysis-by-Neural-Synthesis. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2490-2501. DOI: 10.1109/CVPR52733.2024.00241. Online publication date: 16-Jun-2024.
