Abstract
Morphable models are essential for the statistical modeling of 3D faces. Previous work on morphable models has mostly focused on large-scale facial geometry while ignoring facial details. This paper extends morphable models to represent facial details by learning a Structure-aware Editable Morphable Model (SEMM). SEMM introduces a detail structure representation based on the distance field of wrinkle lines, jointly modeled with detail displacements to establish better correspondences and enable intuitive manipulation of wrinkle structure. In addition, SEMM introduces two transformation modules that translate expression blendshape weights and age values into changes in latent space, allowing effective semantic detail editing while maintaining identity. Extensive experiments demonstrate that the proposed model compactly represents facial details, outperforms previous methods in expression animation both qualitatively and quantitatively, and achieves effective age editing and wrinkle-line editing of facial details. Code and model are available at https://github.com/gerwang/facial-detail-manipulation.
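The distance-field structure representation mentioned in the abstract can be made concrete with a small sketch. The snippet below is a hypothetical illustration rather than the authors' released code: it rasterizes 2D wrinkle polylines in UV space and converts them into a truncated, normalized unsigned distance field of the kind SEMM pairs with detail displacement maps. The function name, resolution, and truncation threshold are assumptions made for illustration only; it relies just on NumPy and SciPy.

# Hypothetical sketch (assumed names and parameters, not the authors' code):
# turn 2D wrinkle polylines into a truncated unsigned distance-field image.
import numpy as np
from scipy.ndimage import distance_transform_edt

def wrinkle_distance_field(polylines, size=256, truncation=0.05):
    """polylines: list of (N, 2) arrays of UV coordinates in [0, 1].
    Returns a (size, size) float32 map: 0 on wrinkle lines, 1 far from them."""
    background = np.ones((size, size), dtype=bool)          # True = no wrinkle here
    for line in polylines:
        pts = np.asarray(line, dtype=np.float64) * (size - 1)
        for p, q in zip(pts[:-1], pts[1:]):                  # densely sample each segment
            n = int(np.ceil(np.linalg.norm(q - p))) + 1
            seg = np.linspace(p, q, n)
            cols = np.clip(np.rint(seg[:, 0]).astype(int), 0, size - 1)
            rows = np.clip(np.rint(seg[:, 1]).astype(int), 0, size - 1)
            background[rows, cols] = False                   # mark wrinkle pixels
    dist = distance_transform_edt(background) / (size - 1)   # unsigned distance in UV units
    return (np.minimum(dist, truncation) / truncation).astype(np.float32)

# Example: a single diagonal forehead wrinkle
field = wrinkle_distance_field([np.array([[0.2, 0.30], [0.8, 0.35]])])

Pairing such a structure map with a displacement map gives the model an explicit, editable handle on where wrinkles lie, which is what the abstract refers to as intuitive manipulation of wrinkle structure.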
Acknowledgements
This work was supported by the Beijing Natural Science Foundation (JQ19015), the NSFC (Nos. 62021002 and 61727808), the National Key R&D Program of China (2018YFA0704000), and the Key Research and Development Project of Tibet Autonomous Region (XZ202101ZY0019G). It was also supported by THUIBCS, Tsinghua University, and BLBCI, Beijing Municipal Education Commission. Feng Xu is the corresponding author.
Copyright information
© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Ling, J., Wang, Z., Lu, M., Wang, Q., Qian, C., Xu, F. (2022). Structure-Aware Editable Morphable Model for 3D Facial Detail Animation and Manipulation. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13663. Springer, Cham. https://doi.org/10.1007/978-3-031-20062-5_15
DOI: https://doi.org/10.1007/978-3-031-20062-5_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20061-8
Online ISBN: 978-3-031-20062-5