
MoFaNeRF: Morphable Facial Neural Radiance Field

  • Conference paper

Computer Vision – ECCV 2022 (ECCV 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13663)

Abstract

We propose a parametric model that maps free-view images into a vector space of coded facial shape, expression and appearance with a neural radiance field, namely Morphable Facial NeRF. Specifically, MoFaNeRF takes the coded facial shape, expression and appearance along with the space coordinate and view direction as input to an MLP, and outputs the radiance of the space point for photo-realistic image synthesis. Compared with conventional 3D morphable models (3DMM), MoFaNeRF shows superiority in directly synthesizing photo-realistic facial details, even for eyes, mouths, and beards. Moreover, continuous face morphing can be easily achieved by interpolating the input shape, expression and appearance codes. By introducing identity-specific modulation and a texture encoder, our model synthesizes accurate photometric details and shows strong representation ability. Our model supports multiple applications, including image-based fitting, random generation, face rigging, face editing, and novel view synthesis. Experiments show that our method achieves higher representation ability than previous parametric models and competitive performance in several applications. To the best of our knowledge, our work is the first facial parametric model built upon a neural radiance field that can be used in fitting, generation and manipulation. The code and data are available at https://github.com/zhuhao-nju/mofanerf.
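The conditional MLP and code-interpolation morphing described in the abstract can be sketched as follows. This is a minimal toy illustration, not the paper's implementation: the code dimensions, the two-layer architecture, the random weights, and the name `mofanerf_mlp` are all assumptions introduced here for clarity.

```python
import numpy as np

# Toy sketch of a MoFaNeRF-style conditional radiance field:
# an MLP maps (shape code, expression code, appearance code,
# 3D point, view direction) -> (RGB colour, volume density).
rng = np.random.default_rng(0)

D_SHAPE, D_EXPR, D_APP, HIDDEN = 64, 32, 64, 128  # illustrative sizes

# Randomly initialised weights, stand-ins for a trained network.
W1 = rng.standard_normal((D_SHAPE + D_EXPR + D_APP + 3 + 3, HIDDEN)) * 0.01
W2 = rng.standard_normal((HIDDEN, 4)) * 0.01  # 3 RGB channels + 1 density

def mofanerf_mlp(shape_code, expr_code, app_code, xyz, view_dir):
    """Toy forward pass: concatenate the conditioning codes with the
    query point and view direction, then run a two-layer MLP."""
    x = np.concatenate([shape_code, expr_code, app_code, xyz, view_dir])
    h = np.maximum(x @ W1, 0.0)            # ReLU hidden layer
    out = h @ W2
    rgb = 1.0 / (1.0 + np.exp(-out[:3]))   # sigmoid keeps colour in [0, 1]
    sigma = np.maximum(out[3], 0.0)        # density must be non-negative
    return rgb, sigma

# Continuous face morphing: linearly interpolate two identities'
# shape codes while holding expression and appearance fixed.
s_a, s_b = rng.standard_normal(D_SHAPE), rng.standard_normal(D_SHAPE)
e, a = rng.standard_normal(D_EXPR), rng.standard_normal(D_APP)
p, d = np.zeros(3), np.array([0.0, 0.0, 1.0])

for t in (0.0, 0.5, 1.0):
    s = (1.0 - t) * s_a + t * s_b
    rgb, sigma = mofanerf_mlp(s, e, a, p, d)
```

In the full method each (rgb, sigma) sample would be composited along camera rays by volume rendering, as in NeRF; the sketch only shows the conditioning interface that makes the model morphable.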

Y. Zhuang and H. Zhu—These authors contributed equally to this work.



Acknowledgement

This work was supported by NSFC grants 62025108 and 62001213, and by the Tencent Rhino-Bird Joint Research Program. We thank Dr. Yao Yao for his valuable suggestions and Dr. Yuanxun Lu for proofreading the paper.

Author information

Correspondence to Xun Cao.


Electronic supplementary material

Supplementary material (PDF, 9379 KB) is available with the online version of this chapter.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Zhuang, Y., Zhu, H., Sun, X., Cao, X. (2022). MoFaNeRF: Morphable Facial Neural Radiance Field. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13663. Springer, Cham. https://doi.org/10.1007/978-3-031-20062-5_16

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-20062-5_16

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20061-8

  • Online ISBN: 978-3-031-20062-5

  • eBook Packages: Computer Science, Computer Science (R0)
