Learning Visible Surface Area Estimation for Irregular Objects

Authors:

Yonghong Tian

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

Pages 2333 - 2343

https://doi.org/10.1145/3503161.3548017

Published: 10 October 2022

Abstract

Visible surface area estimation for irregular objects, one of the most fundamental and challenging topics in mathematics, supports a wide range of applications. Existing techniques usually estimate the visible surface area via mathematical modeling from 3D point clouds. However, 3D scanners are expensive, and the corresponding evaluation methods are complex. In this paper, we propose a novel problem setting, deep learning for visible surface area estimation, which is the first attempt to estimate the visible surface area of irregular objects from monocular images. Technically, we first build a novel visible surface area estimation dataset with 9099 real annotations. Then, we design a learning-based architecture to predict the visible surface area, built around two core modules: the classification module and the area-bins module. The classification module predicts the distribution interval of the visible surface area and assists network training for more accurate estimation. Meanwhile, the area-bins module, which uses a transformer encoder, distinguishes differences in visible surface area between irregular objects of the same category. The experimental results demonstrate that our approach can effectively estimate the visible surface area of irregular objects across various categories and sizes. We hope that this work will attract further research into this newly identified yet crucial research direction. Our source code and data are available at https://github.com/liuxu0303/VSAnet.
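
The abstract does not say how the area-bins module turns its bins into a single area value. As a rough illustration only, the sketch below shows one common pattern for bin-based regression: predict normalized bin widths over a fixed area range, take the bin centers, and return the probability-weighted sum of those centers. The function names, the area range, and the use of plain Python lists in place of network outputs are all assumptions made for this example and are not taken from the paper or its repository.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def bin_centers(width_logits, area_min, area_max):
    """Split [area_min, area_max] into bins whose relative widths come from
    softmax(width_logits), and return the center of each bin.
    (Hypothetical helper; the paper's actual binning may differ.)"""
    widths = softmax(width_logits)  # normalized widths, sum to 1
    span = area_max - area_min
    centers = []
    left = area_min
    for w in widths:
        right = left + w * span
        centers.append(0.5 * (left + right))
        left = right
    return centers


def estimate_area(width_logits, prob_logits, area_min, area_max):
    """Expected area: probability-weighted sum of the bin centers."""
    centers = bin_centers(width_logits, area_min, area_max)
    probs = softmax(prob_logits)
    return sum(p * c for p, c in zip(probs, centers))


if __name__ == "__main__":
    # Four equal-width bins over an assumed area range of 0 to 100.
    uniform = estimate_area([0.0] * 4, [0.0] * 4, 0.0, 100.0)
    peaked = estimate_area([0.0] * 4, [0.0, 0.0, 0.0, 50.0], 0.0, 100.0)
    print(uniform, peaked)
```

With equal bin widths and uniform probabilities the estimate falls at the midpoint of the range; concentrating the probability on the last bin moves the estimate toward that bin's center.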

Supplementary Material

MP4 File (MM22-fp1131.mp4)

Presentation video for "Learning Visible Surface Area Estimation for Irregular Objects", covering the introduction, dataset construction, approach, experiments, and summary.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia

October 2022

7537 pages

ISBN:9781450392037

DOI:10.1145/3503161

General Chairs:
João Magalhães, NOVA University of Lisbon, Portugal
Alberto del Bimbo, University of Florence, Italy
Shin'ichi Satoh, National Institute of Informatics, Japan
Nicu Sebe, University of Trento, Italy

Program Chairs:
Xavier Alameda-Pineda, Inria, Grenoble, France
Qin Jin, Renmin University of China, China
Vincent Oria, New Jersey Institute of Technology, USA
Laura Toni, University College London, UK

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China (NSFC)
National Key R&D Program of China

Conference

MM '22: The 30th ACM International Conference on Multimedia

Sponsor: SIGMM

October 10 - 14, 2022

Lisboa, Portugal

Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%
