research-article

Open access

Diffusion Posterior Illumination for Ambiguity-Aware Inverse Rendering

Authors:

Marc Habermann,

Shunsuke Saito,

Michael Zollhöfer,

Thomas Leimkühler,

Christian Theobalt

ACM Transactions on Graphics (TOG), Volume 42, Issue 6

Article No.: 233, Pages 1 - 14

https://doi.org/10.1145/3618357

Published: 05 December 2023

Abstract

Inverse rendering, the process of inferring scene properties from images, is a challenging inverse problem. The task is ill-posed, as many different scene configurations can give rise to the same image. Most existing solutions incorporate priors into the inverse-rendering pipeline to encourage plausible solutions, but they do not consider the inherent ambiguities and the multi-modal distribution of possible decompositions. In this work, we propose a novel scheme that integrates a denoising diffusion probabilistic model pre-trained on natural illumination maps into an optimization framework involving a differentiable path tracer. The proposed method allows sampling from combinations of illumination and spatially-varying surface materials that both look natural and explain the image observations. We further conduct an extensive comparative study of different priors on illumination used in previous work on inverse rendering. Our method excels in recovering materials and producing highly realistic and diverse environment map samples that faithfully explain the illumination of the input images.
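
The idea of drawing samples that satisfy a prior and also explain an observation can be illustrated on a toy problem. The sketch below is not the authors' pipeline: it swaps the learned diffusion prior for a standard normal prior with a known score, swaps the differentiable path tracer for a linear forward model `a · x`, and uses plain unadjusted Langevin dynamics as the sampler. All names (`prior_score`, `likelihood_score`, `langevin_posterior_samples`) and all parameter values are assumptions chosen for illustration.

```python
import numpy as np

def prior_score(x):
    # Score (gradient of the log-density) of a standard normal prior N(0, I).
    # Stand-in for a learned prior; an assumption of this toy example.
    return -x

def likelihood_score(x, a, y, sigma):
    # Gradient with respect to x of log N(y; a . x, sigma^2).
    # The linear map a . x stands in for a renderer.
    residual = y - x @ a
    return np.outer(residual, a) / sigma**2

def langevin_posterior_samples(a, y, sigma, n_samples=2000, n_steps=1000,
                               step=1e-3, seed=0):
    # Unadjusted Langevin dynamics on the posterior, whose score is the
    # sum of the prior score and the likelihood score.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_samples, a.size))
    for _ in range(n_steps):
        grad = prior_score(x) + likelihood_score(x, a, y, sigma)
        noise = rng.standard_normal(x.shape)
        x = x + step * grad + np.sqrt(2.0 * step) * noise
    return x

# Ill-posed toy problem: two unknowns, one observation of their sum.
a = np.array([1.0, 1.0])
y = 1.0
sigma = 0.1
samples = langevin_posterior_samples(a, y, sigma)

predictions = samples @ a              # how well each sample explains y
ambiguity = samples[:, 0] - samples[:, 1]  # direction the observation cannot see
print("mean prediction:", predictions.mean())
print("spread along ambiguous direction:", ambiguity.std())
```

In this toy setting the samples agree closely on the observed sum while varying widely in how that sum is split between the two unknowns, which is the kind of ambiguity the abstract refers to.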

Supplementary Material

ZIP File (papers_431s4-file4.zip): supplemental, 4.67 MB

MP4 File (papers_431s4-file3.mp4): supplemental, 128.54 MB

Cited By

Zhang Y, Liu Y, Xie Z, Yang L, Liu Z, Yang M, Zhang R, Kou Q, Lin C, Wang W, Jin X. (2024). DreamMat: High-quality PBR Material Generation with Geometry- and Light-aware Diffusion Models. ACM Transactions on Graphics 43:4 (1-18). Online publication date: 19-Jul-2024.
https://dl.acm.org/doi/10.1145/3658170

Index Terms

Diffusion Posterior Illumination for Ambiguity-Aware Inverse Rendering
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
  2. Computer graphics
    1. Rendering

Published In

ACM Transactions on Graphics Volume 42, Issue 6

December 2023

1565 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/3632123

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States
