Neural Wavelet-domain Diffusion for 3D Shape Generation

Authors:

Chi-Wing Fu

SA '22: SIGGRAPH Asia 2022 Conference Papers

Article No.: 24, Pages 1 - 9

https://doi.org/10.1145/3550469.3555394

Published: 30 November 2022

Abstract

This paper presents a new approach for 3D shape generation that enables direct generative modeling on a continuous implicit representation in the wavelet domain. Specifically, we propose a compact wavelet representation, a pair of coarse and detail coefficient volumes, that implicitly represents 3D shapes via truncated signed distance functions and multi-scale biorthogonal wavelets. We then formulate a pair of neural networks: a generator based on the diffusion model, which produces diverse shapes in the form of coarse coefficient volumes, and a detail predictor, which produces compatible detail coefficient volumes to enrich the generated shapes with fine structures and details. Both quantitative and qualitative experimental results demonstrate that our approach generates diverse, high-quality shapes with complex topology and structures, clean surfaces, and fine details, exceeding the 3D generation capabilities of state-of-the-art models.
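
To make the coarse/detail split described in the abstract more concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a 1D signal instead of a 3D volume, a single decomposition level, and the LeGall 5/3 (CDF 5/3) biorthogonal wavelet written in lifting form; the abstract does not say which biorthogonal filter the paper uses, and the function names `truncated_sdf`, `forward_53`, and `inverse_53` are hypothetical.

```python
# Illustrative sketch only: 1D, one level, LeGall 5/3 lifting (assumed filter).

def truncated_sdf(xs, center=0.0, radius=0.5, trunc=0.1):
    """Signed distance to a 1D 'ball' (an interval), clamped to [-trunc, trunc]."""
    return [max(-trunc, min(trunc, abs(x - center) - radius)) for x in xs]


def forward_53(signal):
    """One level of the 5/3 biorthogonal transform via lifting.

    Expects an even-length list; returns (coarse, detail), each half as long.
    """
    even, odd = signal[0::2], signal[1::2]
    n = len(odd)
    # Predict step: detail is the error of predicting odd samples from even
    # neighbours (the last neighbour is mirrored at the boundary).
    detail = [odd[i] - 0.5 * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]
    # Update step: smooth the even samples using neighbouring details.
    coarse = [even[i] + 0.25 * (detail[max(i - 1, 0)] + detail[i]) for i in range(n)]
    return coarse, detail


def inverse_53(coarse, detail):
    """Undo forward_53 by running the lifting steps in reverse order."""
    n = len(coarse)
    even = [coarse[i] - 0.25 * (detail[max(i - 1, 0)] + detail[i]) for i in range(n)]
    odd = [detail[i] + 0.5 * (even[i] + even[min(i + 1, n - 1)]) for i in range(n)]
    signal = [0.0] * (2 * n)
    signal[0::2] = even
    signal[1::2] = odd
    return signal


# Sample the truncated SDF on 64 points in [-1, 1].
xs = [-1.0 + 2.0 * i / 63 for i in range(64)]
tsdf = truncated_sdf(xs)

# Split into a coarse part and a detail part.
coarse, detail = forward_53(tsdf)

# Using both parts recovers the signal up to floating-point rounding.
recon = inverse_53(coarse, detail)
max_err = max(abs(a - b) for a, b in zip(tsdf, recon))

# Dropping the detail part gives a smoother approximation of the shape.
approx = inverse_53(coarse, [0.0] * len(detail))
approx_err = max(abs(a - b) for a, b in zip(tsdf, approx))

print("coarse length:", len(coarse), "detail length:", len(detail))
print("max reconstruction error:", max_err)
print("max coarse-only error:", approx_err)
```

In this toy setting the detail coefficients vanish wherever the truncated distance is locally constant or linear, which is one intuition for why a coarse volume can carry most of a shape's structure while a separate detail volume adds fine features. How the paper's two networks actually use these volumes is described in the full text, not in this sketch.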

Index Terms

Neural Wavelet-domain Diffusion for 3D Shape Generation
1. Computing methodologies
  1. Computer graphics
    1. Shape modeling
      1. Mesh models
      2. Shape analysis
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Information & Contributors

Information

Published In

SA '22: SIGGRAPH Asia 2022 Conference Papers

November 2022

482 pages

ISBN:9781450394703

DOI:10.1145/3550469

Editors:
Soon Ki Jung (Kyungpook National University, South Korea),
Jehee Lee (Seoul National University, South Korea),
Adam Bargteil (University of Maryland Baltimore County, USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGGRAPH: ACM Special Interest Group on Computer Graphics and Interactive Techniques

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Data Availability

presentation https://dl.acm.org/doi/10.1145/3550469.3555394#3550469.3555394.mp4

Appendix https://dl.acm.org/doi/10.1145/3550469.3555394#Supp_304.pdf

Funding Sources

the Research Grants Council of the Hong Kong Special Administrative Region

Conference

SA '22

Sponsor:

SIGGRAPH

SA '22: SIGGRAPH Asia 2022

December 6 - 9, 2022

Daegu, Republic of Korea

Acceptance Rates

Overall Acceptance Rate 178 of 869 submissions, 20%

Bibliometrics & Citations

Article Metrics

Total citations: 61
Total downloads: 802
Downloads (last 12 months): 214
Downloads (last 6 weeks): 14

Reflects downloads up to 08 Feb 2025

Citations

Cited By

Fukaya K, Daylamani-Zad D, Agius H (2025). Intelligent Generation of Graphical Game Assets: A Conceptual Framework and Systematic Review of the State of the Art. ACM Computing Surveys 57:5 (1-38). Online publication date: 9-Jan-2025.
https://dl.acm.org/doi/10.1145/3708499
Hui K, Sanghi A, Rampini A, Malekshan K, Liu Z, Shayani H, Fu C, Salakhutdinov R, Kolter Z, Heller K, Weller A, Oliver N, Scarlett J, Berkenkamp F (2024). Make-a-shape. Proceedings of the 41st International Conference on Machine Learning (20660-20681). Online publication date: 21-Jul-2024.
https://dl.acm.org/doi/10.5555/3692070.3692899
Ge J, Zhou M, Fu C (2024). Learn to Create Simple LEGO Micro Buildings. ACM Transactions on Graphics 43:6 (1-13). Online publication date: 19-Dec-2024.
https://dl.acm.org/doi/10.1145/3687755
Peng H, Zhang J, Guo M, Cao Y, Hu S (2024). CharacterGen: Efficient 3D Character Generation from Single Images with Multi-View Pose Canonicalization. ACM Transactions on Graphics 43:4 (1-13). Online publication date: 19-Jul-2024.
https://dl.acm.org/doi/10.1145/3658217
Xu X, Lambourne J, Jayaraman P, Wang Z, Willis K, Furukawa Y (2024). BrepGen: A B-rep Generative Diffusion Model with Structured Latent Geometry. ACM Transactions on Graphics 43:4 (1-14). Online publication date: 19-Jul-2024.
https://dl.acm.org/doi/10.1145/3658129
Bhat S, Mitra N, Wonka P (2024). LOOSECONTROL: Lifting ControlNet for Generalized Depth Conditioning. ACM SIGGRAPH 2024 Conference Papers (1-11). Online publication date: 13-Jul-2024.
https://dl.acm.org/doi/10.1145/3641519.3657525
Petrov D, Goyal P, Thamizharasan V, Kim V, Gadelha M, Averkiou M, Chaudhuri S, Kalogerakis E (2024). GEM3D: GEnerative Medial Abstractions for 3D Shape Synthesis. ACM SIGGRAPH 2024 Conference Papers (1-11). Online publication date: 13-Jul-2024.
https://dl.acm.org/doi/10.1145/3641519.3657415
Hu J, Hui K, Liu Z, Zhang H, Fu C (2024). CNS-Edit: 3D Shape Editing via Coupled Neural Shape Optimization. ACM SIGGRAPH 2024 Conference Papers (1-12). Online publication date: 13-Jul-2024.
https://dl.acm.org/doi/10.1145/3641519.3657412
Hu J, Hui K, Liu Z, Li R, Fu C (2024). Neural Wavelet-domain Diffusion for 3D Shape Generation, Inversion, and Manipulation. ACM Transactions on Graphics 43:2 (1-18). Online publication date: 3-Jan-2024.
https://dl.acm.org/doi/10.1145/3635304
Jang Y, Hyun K (2024). Advancing 3D CAD with Workflow Graph-Driven Bayesian Command Inferences. Extended Abstracts of the CHI Conference on Human Factors in Computing Systems (1-6). Online publication date: 11-May-2024.
https://dl.acm.org/doi/10.1145/3613905.3650895
