
Inequality-Constrained 3D Morphable Face Model Fitting

Published: 01 February 2024

Abstract

3D morphable model (3DMM) fitting on 2D data is traditionally done via unconstrained optimization with regularization terms to ensure that the result is a plausible face shape and is consistent with a set of 2D landmarks. This paper presents inequality-constrained 3DMM fitting as the first alternative to regularization in optimization-based 3DMM fitting. Inequality constraints on the 3DMM's shape coefficients ensure face-like shapes without modifying the objective function for smoothness, thus allowing more flexibility to capture person-specific shape details. Moreover, inequality constraints on landmarks increase robustness in a way that does not require per-image tuning. We show that the proposed method stands out with its ability to estimate person-specific face shapes by jointly fitting a 3DMM to multiple frames of a person. Further, when used with a robust objective function, namely gradient correlation, the method can work "in-the-wild" even with a 3DMM constructed from controlled data. Lastly, we show how to use the log-barrier method to implement the method efficiently. To our knowledge, we present the first 3DMM fitting framework that requires *no learning* yet is accurate, robust, and efficient. The absence of learning enables a generic solution that allows flexibility in the input image size, interchangeable morphable models, and incorporation of the camera matrix.
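The abstract's central idea, replacing regularization with hard inequality bounds on the shape coefficients enforced via the log-barrier method, can be sketched on a toy problem. The example below is a hypothetical illustration, not the paper's implementation: the linear landmark model, the symmetric per-coefficient bound `c`, and all hyperparameters are illustrative stand-ins, and plain gradient descent serves as the inner solver.

```python
import numpy as np

# Toy stand-in for a 3DMM: 2D landmarks are a linear function of
# K shape coefficients alpha (basis B) plus a mean shape.
rng = np.random.default_rng(0)
K, L = 5, 8                       # number of coefficients, landmark coordinates
B = rng.normal(size=(L, K))       # hypothetical landmark basis
mean = rng.normal(size=L)
target = mean + B @ np.clip(rng.normal(size=K), -2.0, 2.0)  # observed landmarks

c = 3.0                           # bound: -c <= alpha_i <= c keeps shapes face-like

def objective(alpha):
    """Plain landmark fitting error, with no regularization term."""
    r = mean + B @ alpha - target
    return 0.5 * r @ r

def grad(alpha):
    return B.T @ (mean + B @ alpha - target)

def barrier_fit(t0=1.0, mu=10.0, outer=6, inner=200, lr=1e-2):
    """Log-barrier method: minimize
        f(alpha) - (1/t) * sum_i log(c**2 - alpha_i**2),
    increasing t so the barrier tightens toward the hard bounds."""
    alpha = np.zeros(K)           # strictly feasible start
    t = t0
    for _ in range(outer):
        for _ in range(inner):
            # gradient of the objective plus the barrier term
            g = grad(alpha) + (1.0 / t) * 2.0 * alpha / (c**2 - alpha**2)
            step = lr * g
            new = alpha - step
            # backtrack so the iterate stays strictly inside the feasible box
            while np.any(np.abs(new) >= c):
                step *= 0.5
                new = alpha - step
            alpha = new
        t *= mu                   # tighten the barrier
    return alpha

alpha_hat = barrier_fit()
assert np.all(np.abs(alpha_hat) < c)  # inequality constraints hold
```

Because the bounds carry the plausibility prior, the objective itself stays an unmodified fitting error; in the paper this is what leaves room for person-specific shape detail that a smoothness regularizer would suppress.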


Cited By

  • Detecting Autism from Head Movements using Kinesics. Proc. 26th International Conference on Multimodal Interaction, pp. 350–354, doi:10.1145/3678957.3685711, Nov. 2024.
  • From coin to 3D face sculpture portraits in the round of Roman emperors. Computers and Graphics, vol. 123, doi:10.1016/j.cag.2024.103999, Nov. 2024.


Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 46, Issue 2, Feb. 2024, 652 pages.
Publisher: IEEE Computer Society, United States.


          Qualifiers

          • Research-article
