research-article

Mesh deformation-based single-view 3D reconstruction of thin eyeglasses frames with differentiable rendering

Published: 01 October 2024

Abstract

Supported by Virtual Reality (VR) and Augmented Reality (AR) technologies, 3D virtual eyeglasses try-on is becoming a popular way for customers to choose a pair of eyeglasses from the comfort of their own home. Reconstructing eyeglasses frames from a single image with traditional depth-based and image-based methods is extremely difficult because frames have few texture features, thin elements, and severe self-occlusions. In this paper, we propose the first mesh deformation-based reconstruction framework for recovering high-precision 3D full-frame eyeglasses models from a single RGB image, leveraging prior and domain-specific knowledge. Specifically, we first construct a synthetic eyeglasses frame dataset and, based on it, define a class-specific eyeglasses frame template with predefined keypoints. Then, given an input eyeglasses frame image with thin structure and few texture features, we design a keypoint detector and refiner that detect the predefined keypoints in a coarse-to-fine manner, so that the camera pose can be estimated accurately. After that, using differentiable rendering, we propose a novel optimization approach that produces correct geometry by progressively performing free-form deformation (FFD) on the template mesh. We define a series of loss functions that enforce consistency between the rendered result and the corresponding RGB input, using constraints from inherent structure, silhouettes, keypoints, and per-pixel shading information, among others. Experimental results on both the synthetic dataset and real images demonstrate the effectiveness of the proposed algorithm.
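
To make the free-form deformation step concrete, the following is a minimal sketch of FFD in its classic trivariate Bernstein form, written with NumPy. It is not the authors' implementation: the function names (`bernstein`, `ffd`), the lattice resolution, and the use of per-control-point offsets are assumptions made for illustration. The abstract does not specify the lattice, the differentiable renderer, or the exact loss terms, so none of those are modeled here.

```python
import numpy as np
from math import comb


def bernstein(n, i, t):
    """Bernstein basis polynomial B_{i,n}(t), evaluated element-wise on t."""
    return comb(n, i) * t**i * (1.0 - t) ** (n - i)


def ffd(vertices, control_offsets, bbox_min, bbox_max):
    """Deform mesh vertices with a lattice of control-point offsets.

    vertices:        (N, 3) array of template mesh vertices.
    control_offsets: (l+1, m+1, n+1, 3) array; displacement of each lattice
                     control point from its rest position on a regular grid
                     spanning the bounding box (hypothetical parameterization).
    bbox_min/max:    (3,) arrays; the box must have non-zero extent per axis.
    """
    l, m, n = (s - 1 for s in control_offsets.shape[:3])
    # Local lattice coordinates (s, t, u) in [0, 1]^3 for every vertex.
    stu = (vertices - bbox_min) / (bbox_max - bbox_min)
    displacement = np.zeros_like(vertices, dtype=float)
    for i in range(l + 1):
        bi = bernstein(l, i, stu[:, 0])
        for j in range(m + 1):
            bj = bernstein(m, j, stu[:, 1])
            for k in range(n + 1):
                bk = bernstein(n, k, stu[:, 2])
                # Tensor-product weight of control point (i, j, k) per vertex.
                weight = (bi * bj * bk)[:, None]
                displacement += weight * control_offsets[i, j, k]
    # The rest lattice reproduces each vertex exactly, so only the offsets
    # contribute to the deformation.
    return vertices + displacement


verts = np.array([[0.0, 0.0, 0.0], [0.5, 0.5, 0.5], [1.0, 1.0, 1.0]])
offsets = np.zeros((3, 3, 3, 3))
offsets[2, :, :, 0] = 0.2  # push the control points on the x = 1 face along +x
deformed = ffd(verts, offsets, np.zeros(3), np.ones(3))
```

In an optimization setting such as the one the abstract describes, the control-point offsets would be the free variables and would be updated by gradients of the image-space losses. That would call for an autodiff framework rather than plain NumPy, which is used here only to keep the sketch self-contained.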

Highlights

A mesh deformation-based single-view reconstruction algorithm for thin eyeglasses.
A synthetic eyeglasses frame mesh dataset.
A coarse-to-fine network for detecting and refining keypoints.
An unsupervised free-form deformation method for refining the reconstructed mesh.


Information & Contributors

Information

Published In

Graphical Models, Volume 135, Issue C
Oct 2024
63 pages

Publisher

Academic Press Professional, Inc.

United States

Author Tags

  1. Reconstruction
  2. Image-based reconstruction
  3. Free-form deformation
  4. Eyeglasses frame
  5. Thin structure

Qualifiers

  • Research-article
