research-article

PhysFiT: Physical-aware 3D Shape Understanding for Finishing Incomplete Assembly

Published: 13 November 2024

Abstract

Understanding the part composition and structure of 3D shapes is crucial for a wide range of 3D applications, including 3D part assembly and 3D assembly completion. 3D assembly completion is the more complicated of the two: it involves using a toolkit of candidate parts to repair broken or incomplete furniture that is missing several parts. Given an incomplete assembly, 3D assembly completion seeks to identify the missing parts from multiple candidates, determine their poses, and produce a complete assembly that is well-connected, structurally stable, and aesthetically pleasing. This task requires not only specialized knowledge of part composition but, more importantly, an awareness of physical constraints, i.e., connectivity, stability, and symmetry. Neglecting these constraints often results in assemblies that are visually plausible but impractical. To address this challenge, we propose PhysFiT, a physical-aware 3D shape understanding framework. The framework is built upon attention-based part relation modeling and incorporates connection modeling, simulation-free stability optimization, and symmetric transformation consistency. We evaluate its efficacy on 3D part assembly and on 3D assembly completion, a novel assembly task presented in this work. Extensive experiments demonstrate the effectiveness of PhysFiT in constructing geometrically sound and physically compliant assemblies.
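
To make the stability constraint concrete, the following is a minimal sketch of one common simulation-free check: an assembly resting on the ground is statically stable if its center of mass, projected onto the ground plane, lies inside the convex hull of its ground contact points. This is not the PhysFiT formulation, which the abstract does not specify. The function names, the z-up convention, the uniform per-point mass, and the contact tolerance `eps` are all assumptions made for illustration.

```python
from typing import List, Tuple

Point3 = Tuple[float, float, float]
Point2 = Tuple[float, float]


def _cross(o: Point2, a: Point2, b: Point2) -> float:
    """2D cross product of vectors o->a and o->b (positive for a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points: List[Point2]) -> List[Point2]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower: List[Point2] = []
    for p in pts:
        while len(lower) >= 2 and _cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper: List[Point2] = []
    for p in reversed(pts):
        while len(upper) >= 2 and _cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def inside_convex(poly: List[Point2], q: Point2) -> bool:
    """True if q lies inside or on a counter-clockwise convex polygon."""
    if len(poly) < 3:
        return False  # a point or a line segment cannot support the shape
    n = len(poly)
    return all(_cross(poly[i], poly[(i + 1) % n], q) >= 0 for i in range(n))


def is_statically_stable(parts: List[List[Point3]], eps: float = 1e-3) -> bool:
    """Hypothetical check: each part is a point cloud, z is up (assumption),
    and every point carries equal mass (assumption)."""
    pts = [p for part in parts for p in part]
    n = len(pts)
    com = (sum(p[0] for p in pts) / n, sum(p[1] for p in pts) / n)
    z_min = min(p[2] for p in pts)
    # Points within eps of the lowest point are treated as ground contacts.
    support = [(p[0], p[1]) for p in pts if p[2] <= z_min + eps]
    return inside_convex(convex_hull(support), com)


# A four-legged table: each leg is sampled at its bottom and top.
legs = [[(x, y, 0.0), (x, y, 1.0)] for x in (-1.0, 1.0) for y in (-1.0, 1.0)]
top = [(-1.0, -1.0, 1.0), (1.0, -1.0, 1.0), (1.0, 1.0, 1.0), (-1.0, 1.0, 1.0)]
table_is_stable = is_statically_stable(legs + [top])

# Three legs and a heavy top overhanging the missing corner.
three_legs = [[(x, y, 0.0), (x, y, 1.0)] for x, y in ((-1, -1), (-1, 1), (1, -1))]
overhang = [(3.0, 3.0, 1.0)] * 3
broken_is_stable = is_statically_stable(three_legs + [overhang])
```

A check of this kind is binary and not differentiable, so it could only serve as an evaluation criterion; a learning framework would need a smooth surrogate, for example a penalty on the distance from the projected center of mass to the support region.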

Published In

ACM Transactions on Graphics, Volume 44, Issue 1
February 2025
123 pages
EISSN: 1557-7368
DOI: 10.1145/3696812

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 13 November 2024
Online AM: 29 October 2024
Accepted: 15 October 2024
Revised: 11 September 2024
Received: 10 March 2023
Published in TOG Volume 44, Issue 1

Author Tags

  1. Automatic assembly
  2. 3D assembly completion
  3. 3D shape modeling
  4. transformer

Qualifiers

  • Research-article

Funding Sources

  • National Natural Science Foundation of China
  • Shanghai Municipal Science and Technology Major Project
  • Fundamental Research Funds for the Central Universities

Article Metrics

  • Total Citations: 0
  • Total Downloads: 256
  • Downloads (Last 12 months): 256
  • Downloads (Last 6 weeks): 177

Reflects downloads up to 24 Dec 2024
