Article

3D Face Reconstruction with Dense Landmarks

Authors: Tadas Baltrušaitis, Charlie Hewitt, Matthew Johnson, Nikola Milosavljević, Stephan Garbin, Ivan Stojiljković, Julien Valentin

Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XIII

Pages 160 - 177

https://doi.org/10.1007/978-3-031-19778-9_10

Published: 23 October 2022

Abstract

Landmarks often play a key role in face analysis, but many aspects of identity or expression cannot be represented by sparse landmarks alone. Thus, in order to reconstruct faces more accurately, landmarks are often combined with additional signals like depth images or techniques like differentiable rendering. Can we keep things simple by just using more landmarks? In answer, we present the first method that accurately predicts 10× as many landmarks as usual, covering the whole head, including the eyes and teeth. This is accomplished using synthetic training data, which guarantees perfect landmark annotations. By fitting a morphable model to these dense landmarks, we achieve state-of-the-art results for monocular 3D face reconstruction in the wild. We show that dense landmarks are an ideal signal for integrating face shape information across frames by demonstrating accurate and expressive facial performance capture in both monocular and multi-view scenarios. Finally, our method is highly efficient: we can predict dense landmarks and fit our 3D face model at over 150 FPS on a single CPU thread. Please see our website: https://microsoft.github.io/DenseLandmarks/.
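
As a rough illustration of what "fitting a morphable model to dense landmarks" can mean, the sketch below fits a toy linear shape model to 2D landmarks with ridge-regularized least squares. It is not the authors' implementation. The fixed orthographic camera, the random shape basis, the landmark count of 680 (ten times a common 68-point layout), and the names `fit_linear_model` and `reconstruct` are all assumptions made for this example.

```python
import numpy as np


def reconstruct(mean_shape, basis, coeffs):
    """Return (N, 3) vertices of a toy linear shape model."""
    # basis has shape (N, 3, K); matmul with a (K,) vector gives (N, 3).
    return mean_shape + basis @ coeffs


def fit_linear_model(landmarks_2d, mean_shape, basis, reg=1e-3):
    """Fit model coefficients to observed 2D landmarks.

    landmarks_2d: (N, 2) observed landmark positions.
    mean_shape:   (N, 3) mean 3D position of each landmark vertex.
    basis:        (N, 3, K) linear shape basis.
    Assumption: a fixed orthographic camera looking along the z axis,
    so projection simply drops the z coordinate.
    """
    n, _, k = basis.shape
    # Stack the x/y rows of the basis into a (2N, K) design matrix.
    A = basis[:, :2, :].reshape(2 * n, k)
    # Residual between observations and the projected mean shape.
    b = (landmarks_2d - mean_shape[:, :2]).reshape(2 * n)
    # Ridge-regularized normal equations: (A^T A + reg * I) c = A^T b.
    return np.linalg.solve(A.T @ A + reg * np.eye(k), A.T @ b)


# Synthetic demo data (hypothetical sizes, random model).
rng = np.random.default_rng(0)
num_landmarks, num_coeffs = 680, 10
mean_shape = rng.normal(size=(num_landmarks, 3))
basis = rng.normal(size=(num_landmarks, 3, num_coeffs))
true_coeffs = rng.normal(size=num_coeffs)

# Project the ground-truth shape and add a little observation noise.
observed = reconstruct(mean_shape, basis, true_coeffs)[:, :2]
observed = observed + 0.01 * rng.normal(size=observed.shape)

estimated = fit_linear_model(observed, mean_shape, basis)
print("max coefficient error:", np.abs(estimated - true_coeffs).max())
```

With many more landmarks than model coefficients, the system is heavily over-determined, which is one intuition for why a dense landmark signal alone can constrain the shape well. A real pipeline would also need to estimate camera pose and expression, which this sketch leaves out.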

Index Terms

3D Face Reconstruction with Dense Landmarks
1. Computing methodologies

Index terms have been assigned to the content through auto-classification.

Published In

Computer Vision – ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XIII

Oct 2022

803 pages

ISBN:978-3-031-19777-2

DOI:10.1007/978-3-031-19778-9

Editors:
Shai Avidan, Tel Aviv University, Tel Aviv, Israel
Gabriel Brostow, University College London, London, UK
Moustapha Cissé, Google AI, Accra, Ghana
Giovanni Maria Farinella, University of Catania, Catania, Italy
Tal Hassner, Facebook (United States), Menlo Park, CA, USA

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022.

Publisher

Springer-Verlag

Berlin, Heidelberg

Cited By

Nguyen B, Nguyen H, Ly N (2023) A New Coarse-To-Fine 3D Face Reconstruction Method Based On 3DMM Flame and Transformer: CoFiT-3D FaRe. Proceedings of the 12th International Symposium on Information and Communication Technology, pp. 393-400. Online publication date: 7-Dec-2023
https://dl.acm.org/doi/10.1145/3628797.3628960

Navarro I, Kneubuehler D, Verhulsdonck T, Du Bois E, Welch W, Shang C, Sachs I, Mcguire M, Zordan V, Bhat K (2023) Audiovisual Inputs for Learning Robust, Real-time Facial Animation with Lip Sync. Proceedings of the 16th ACM SIGGRAPH Conference on Motion, Interaction and Games, pp. 1-12. Online publication date: 15-Nov-2023
https://dl.acm.org/doi/10.1145/3623264.3624451

Trevithick A, Chan M, Stengel M, Chan E, Liu C, Yu Z, Khamis S, Chandraker M, Ramamoorthi R, Nagano K (2023) Real-Time Radiance Fields for Single-Image Portrait View Synthesis. ACM Transactions on Graphics 42(4), pp. 1-15. Online publication date: 26-Jul-2023
https://dl.acm.org/doi/10.1145/3592460

Guo J, Yu J, Lattas A, Deng J (2022) Perspective Reconstruction of Human Faces by Joint Mesh and Landmark Regression. Computer Vision – ECCV 2022 Workshops, pp. 350-365. Online publication date: 23-Oct-2022
https://dl.acm.org/doi/10.1007/978-3-031-25072-9_23
