research-article

ReN Human: Learning Relightable Neural Implicit Surfaces for Animatable Human Rendering

Published: 09 August 2024

Abstract

Implicit neural representations have recently been widely used to learn the appearance of human bodies in a canonical space, which can then be animated using a parametric human model. However, how to decompose material properties from such an implicit representation for relighting has not yet been thoroughly investigated. We address this problem with a novel framework, ReN Human, that takes sparse or even monocular input videos captured under unconstrained lighting and produces a 3D human representation that can be rendered with novel views, poses, and lighting. Our method represents humans as a deformable implicit neural representation and decomposes the geometry and material of the human, as well as the environment illumination, to capture a relightable and animatable human model. Moreover, we introduce a volumetric lighting grid consisting of spherical Gaussian mixtures to learn spatially varying illumination, and animatable visibility probes to model the dynamic self-occlusion caused by human motion. Specifically, we learn the material property fields and illumination using a physically based rendering layer that uses Monte Carlo importance sampling to facilitate differentiation of the complex rendering integral. We demonstrate that our approach outperforms recent novel-view and novel-pose synthesis methods on a challenging benchmark with sparse videos, enabling high-fidelity human relighting.
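
To make two of the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' implementation, of how lighting expressed as a spherical Gaussian mixture can be integrated with Monte Carlo importance sampling. It assumes a scalar radiance, a surface normal fixed at +z, a purely diffuse (cosine) term sampled with a cosine-weighted density, and no visibility term; the function names `sg_eval`, `sg_mixture`, `sample_cosine_hemisphere`, and `estimate_irradiance` are hypothetical and chosen only for illustration.

```python
import math
import random


def sg_eval(direction, axis, sharpness, amplitude):
    """One spherical Gaussian lobe: amplitude * exp(sharpness * (d . axis - 1))."""
    cos_angle = sum(a * b for a, b in zip(direction, axis))
    return amplitude * math.exp(sharpness * (cos_angle - 1.0))


def sg_mixture(direction, lobes):
    """Incoming radiance from a mixture of lobes given as (axis, sharpness, amplitude)."""
    return sum(sg_eval(direction, *lobe) for lobe in lobes)


def sample_cosine_hemisphere(rng):
    """Draw a unit direction around +z with density cos(theta) / pi."""
    u1, u2 = rng.random(), rng.random()
    radius = math.sqrt(u1)
    phi = 2.0 * math.pi * u2
    return (radius * math.cos(phi), radius * math.sin(phi), math.sqrt(1.0 - u1))


def estimate_irradiance(lobes, num_samples=4096, seed=0):
    """Estimate the hemisphere integral of L(w) * cos(theta) for a +z normal.

    Because samples are drawn with density cos(theta) / pi, the cosine
    cancels and each sample contributes L(w) * pi to the average.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        total += sg_mixture(sample_cosine_hemisphere(rng), lobes)
    return math.pi * total / num_samples


if __name__ == "__main__":
    # Illustrative lighting only: one sharp lobe overhead and one broad lobe from the side.
    lighting = [((0.0, 0.0, 1.0), 20.0, 1.0), ((1.0, 0.0, 0.0), 2.0, 0.5)]
    print(estimate_irradiance(lighting))
```

Matching the sampling density to the cosine factor is what keeps the variance low in this toy case. A learned pipeline of the kind the abstract describes would additionally need the material response and a visibility term inside the integrand, which this sketch deliberately omits.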

Supplementary Material

tog-22-0119-File003 (tog-22-0119-file003.mp4)
Supplementary material



    Published In

    ACM Transactions on Graphics  Volume 43, Issue 5
    October 2024
    196 pages
    EISSN:1557-7368
    DOI:10.1145/3613708

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 09 August 2024
    Online AM: 15 July 2024
    Accepted: 05 July 2024
    Revised: 07 December 2023
    Received: 05 December 2022
    Published in TOG Volume 43, Issue 5


    Author Tags

    1. Relighting
    2. signed distance function
    3. animatable human
    4. inverse rendering
    5. volume rendering

    Qualifiers

    • Research-article

    Funding Sources

    • NSFC
    • Zhejiang Province “Jianbing” Research and Development Project
    • National Natural Science Foundation of China
    • National Key R&D Program of China
    • Information Technology Center and State Key Lab of CAD&CG, Zhejiang University
