ReN Human: Learning Relightable Neural Implicit Surfaces for Animatable Human Rendering

Authors:

Yuchi Huo

ACM Transactions on Graphics, Volume 43, Issue 5

Article No.: 162, Pages 1 - 22

https://doi.org/10.1145/3678002

Published: 09 August 2024

Abstract

Implicit neural representations have recently been widely used to learn the appearance of human bodies in a canonical space, where they can be animated with a parametric human model. However, how to decompose material properties from such an implicit representation for relighting has not yet been thoroughly investigated. We address this problem with a novel framework, ReN Human, that takes sparse or even monocular input videos captured under unconstrained lighting and produces a 3D human representation that can be rendered under novel views, poses, and lighting. Our method represents the human as a deformable implicit neural representation and decomposes geometry, material, and environment illumination to capture a relightable and animatable human model. Moreover, we introduce a volumetric lighting grid consisting of spherical Gaussian mixtures to learn spatially varying illumination, and animatable visibility probes to model the dynamic self-occlusion caused by human motion. Specifically, we learn the material property fields and illumination using a physically based rendering layer that uses Monte Carlo importance sampling to facilitate differentiation of the complex rendering integral. We demonstrate that our approach outperforms recent novel view and pose synthesis methods on a challenging benchmark with sparse videos, enabling high-fidelity human relighting.
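As a rough illustration of two ingredients named in the abstract, the sketch below evaluates a spherical Gaussian mixture as incoming radiance and estimates the outgoing radiance of a surface point with Monte Carlo importance sampling. It is not the authors' implementation. It assumes a purely Lambertian surface with its normal along +z, cosine-weighted sampling, a single global mixture rather than a volumetric grid, and no visibility term. All function and parameter names (`sg_mixture`, `sample_cosine_hemisphere`, `diffuse_radiance`) are hypothetical.

```python
import math
import random


def sg_mixture(direction, lobes):
    """Evaluate sum_k a_k * exp(s_k * (dot(d, axis_k) - 1)) for a unit direction.

    lobes: list of (axis, sharpness, rgb_amplitude); axes are unit vectors.
    """
    rgb = [0.0, 0.0, 0.0]
    for axis, sharpness, amplitude in lobes:
        cos_angle = sum(d * a for d, a in zip(direction, axis))
        weight = math.exp(sharpness * (cos_angle - 1.0))
        for c in range(3):
            rgb[c] += weight * amplitude[c]
    return rgb


def sample_cosine_hemisphere(rng):
    """Draw a unit direction about +z with pdf cos(theta) / pi."""
    u1, u2 = rng.random(), rng.random()
    r, phi = math.sqrt(u1), 2.0 * math.pi * u2
    return (r * math.cos(phi), r * math.sin(phi), math.sqrt(1.0 - u1))


def diffuse_radiance(albedo, lobes, num_samples=2048, seed=0):
    """Monte Carlo estimate of the integral of (albedo / pi) * L_i * cos(theta).

    With the cosine-weighted pdf, each sample's integrand / pdf ratio
    simplifies to albedo * L_i, so the estimate is albedo times the mean
    sampled incoming radiance.
    """
    rng = random.Random(seed)
    total = [0.0, 0.0, 0.0]
    for _ in range(num_samples):
        incoming = sg_mixture(sample_cosine_hemisphere(rng), lobes)
        for c in range(3):
            total[c] += incoming[c]
    return [albedo[c] * total[c] / num_samples for c in range(3)]
```

For example, a single lobe with sharpness 0 and unit amplitude behaves as a constant white environment, so `diffuse_radiance([0.8, 0.5, 0.2], [((0.0, 0.0, 1.0), 0.0, (1.0, 1.0, 1.0))])` should simply return the albedo. A full renderer of the kind the abstract describes would also need a specular BRDF term, self-occlusion, and a differentiable implementation, none of which this sketch attempts.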

Supplementary Material

tog-22-0119-File003 (tog-22-0119-file003.mp4)


Published In

ACM Transactions on Graphics Volume 43, Issue 5

October 2024

196 pages

EISSN:1557-7368

DOI:10.1145/3613708

Editor:
Carol O'Sullivan
Trinity College Dublin, Ireland

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 09 August 2024

Online AM: 15 July 2024

Accepted: 05 July 2024

Revised: 07 December 2023

Received: 05 December 2022

Published in TOG Volume 43, Issue 5

Qualifiers

Research-article

Funding Sources

NSFC
Zhejiang Province “Jianbing” Research and Development Project
National Natural Science Foundation of China
National Key R&D Program of China
Information Technology Center and State Key Lab of CAD&CG, Zhejiang University
