research-article

Mesh deformation-based single-view 3D reconstruction of thin eyeglasses frames with differentiable rendering

Authors:

Zhiyong Su

Volume 135, Issue C

https://doi.org/10.1016/j.gmod.2024.101225

Published: 01 October 2024

Abstract

With the support of Virtual Reality (VR) and Augmented Reality (AR) technologies, 3D virtual eyeglasses try-on is becoming a popular way to select a pair of eyeglasses from the comfort of one's own home. Reconstructing eyeglasses frames from a single image with traditional depth-based and image-based methods is extremely difficult because the frames lack sufficient texture features, contain thin elements, and exhibit severe self-occlusions. In this paper, we propose the first mesh deformation-based reconstruction framework for recovering high-precision 3D full-frame eyeglasses models from a single RGB image, leveraging prior and domain-specific knowledge. Specifically, building on a synthetic eyeglasses frame dataset, we first define a class-specific eyeglasses frame template with predefined keypoints. Then, given an input eyeglasses frame image with thin structures and few texture features, we design a keypoint detector and refiner that detect the predefined keypoints in a coarse-to-fine manner so that the camera pose can be estimated accurately. After that, using differentiable rendering, we propose a novel optimization approach that produces correct geometry by progressively performing free-form deformation (FFD) on the template mesh. We define a series of loss functions to enforce consistency between the rendered result and the corresponding RGB input, using constraints from inherent structure, silhouettes, keypoints, and per-pixel shading information, among others. Experimental results on both the synthetic dataset and real images demonstrate the effectiveness of the proposed algorithm.
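As a rough illustration of the free-form deformation step mentioned in the abstract, the sketch below implements the classical trivariate Bernstein FFD of Sederberg and Parry in plain Python. It is not the authors' implementation: the function names (`bernstein`, `make_lattice`, `ffd`), the lattice resolution, and the assumption that vertices are already expressed in the lattice's local coordinates in [0, 1]^3 are all illustrative choices. The paper's differentiable rendering and loss terms are not modeled here.

```python
from math import comb
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]
Lattice = List[List[List[Point]]]


def bernstein(n: int, i: int, t: float) -> float:
    """Bernstein basis polynomial B_{i,n}(t)."""
    return comb(n, i) * (t ** i) * ((1.0 - t) ** (n - i))


def make_lattice(l: int, m: int, n: int) -> Lattice:
    """Regular (l+1) x (m+1) x (n+1) control lattice on the unit cube.

    Hypothetical helper: with these undisplaced control points the
    deformation is the identity map.
    """
    return [[[(i / l, j / m, k / n) for k in range(n + 1)]
             for j in range(m + 1)]
            for i in range(l + 1)]


def ffd(points: Sequence[Point], lattice: Lattice) -> List[Point]:
    """Deform points given in local lattice coordinates (s, t, u) in [0, 1]^3."""
    l = len(lattice) - 1
    m = len(lattice[0]) - 1
    n = len(lattice[0][0]) - 1
    deformed: List[Point] = []
    for s, t, u in points:
        x = y = z = 0.0
        for i in range(l + 1):
            bi = bernstein(l, i, s)
            for j in range(m + 1):
                bj = bernstein(m, j, t)
                for k in range(n + 1):
                    # Weight of control point (i, j, k) for this vertex.
                    w = bi * bj * bernstein(n, k, u)
                    px, py, pz = lattice[i][j][k]
                    x += w * px
                    y += w * py
                    z += w * pz
        deformed.append((x, y, z))
    return deformed


if __name__ == "__main__":
    lattice = make_lattice(2, 2, 2)
    # Move one corner control point to bend the enclosed volume.
    lattice[2][2][2] = (1.2, 1.1, 1.0)
    template_vertices = [(0.25, 0.5, 0.75), (1.0, 1.0, 1.0)]
    print(ffd(template_vertices, lattice))
```

In an optimization setting such as the one the abstract describes, the control point positions would be the free variables, and each vertex position is a linear function of them, which is what makes FFD convenient to combine with gradient-based losses.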

Highlights

• A mesh deformation-based single-view reconstruction algorithm for thin eyeglasses.

• A synthetic eyeglasses frame mesh dataset.

• A coarse-to-fine network for detecting and refining keypoints.

• An unsupervised free-form deformation method for refining reconstructed mesh.

Published In

Graphical Models Volume 135, Issue C

Oct 2024

63 pages

The Author(s).

Publisher

Academic Press Professional, Inc.

United States
