DOI: 10.1145/3503161.3548017
research-article

Learning Visible Surface Area Estimation for Irregular Objects

Published: 10 October 2022

Abstract

Visible surface area estimation for irregular objects, one of the most fundamental and challenging topics in mathematics, supports a wide range of applications. Existing techniques usually estimate the visible surface area via mathematical modeling from 3D point clouds. However, 3D scanners are expensive, and the corresponding evaluation method is too complex. In this paper, we propose a novel problem setting, deep learning for visible surface area estimation, which is the first attempt to estimate the visible surface area of irregular objects from monocular images. Technically, we first build a novel visible surface area estimation dataset including 9099 real annotations. Then, we design a learning-based architecture to predict the visible surface area, including two core modules (i.e., the classification module and the area-bins module). The classification module predicts the visible surface area distribution interval and assists network training for more accurate visible surface area estimation. Meanwhile, the area-bins module, which uses a transformer encoder, distinguishes differences in visible surface area between irregular objects of the same category. The experimental results demonstrate that our approach can effectively estimate the visible surface area of irregular objects of various categories and sizes. We hope that this work will attract further research into this newly identified, yet crucial research direction. Our source code and data are available at https://github.com/liuxu0303/VSAnet.
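
The abstract does not say how the two modules produce their outputs, so the following is only a minimal sketch of one plausible bins-style formulation, not the authors' implementation. It assumes the area-bins module yields per-bin logits and relative bin widths, and that the estimate is the probability-weighted mean of the bin centers. It also assumes the classification target is the index of the coarse interval containing the ground-truth area. All function names, the area range, and the interval edges are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bin_centers(widths, area_min, area_max):
    """Map relative bin widths to bin centers on [area_min, area_max]."""
    total = sum(widths)
    span = area_max - area_min
    centers, left = [], area_min
    for w in widths:
        width = span * w / total
        centers.append(left + width / 2)
        left += width
    return centers


def estimate_area(bin_logits, widths, area_min, area_max):
    """Assumed read-out: expected value of the bin centers."""
    probs = softmax(bin_logits)
    centers = bin_centers(widths, area_min, area_max)
    return sum(p * c for p, c in zip(probs, centers))


def interval_index(area, edges):
    """Assumed classification target: index of the interval holding `area`."""
    for i in range(len(edges) - 1):
        if edges[i] <= area < edges[i + 1]:
            return i
    return len(edges) - 2  # values at or above the top edge go in the last interval


# Hypothetical usage: four equal-width bins, uniform logits, areas in [0, 100].
area = estimate_area([0.0, 0.0, 0.0, 0.0], [1, 1, 1, 1], 0.0, 100.0)
label = interval_index(30.0, [0.0, 25.0, 50.0, 100.0])
```

In this sketch, the interval label gives a coarse supervision signal while the weighted bin centers give a continuous estimate. Whether the paper combines the two in this way is not stated in the abstract.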

Supplementary Material

MP4 File (MM22-fp1131.mp4)
This presentation video covers the work "Learning Visible Surface Area Estimation for Irregular Objects". It contains the introduction, dataset construction, our approach, experiments, and summary.

    Published In

    MM '22: Proceedings of the 30th ACM International Conference on Multimedia
    October 2022
    7537 pages
    ISBN:9781450392037
    DOI:10.1145/3503161
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep learning
    2. monocular image
    3. visible surface area

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China (NSFC)
    • National Key R&D Program of China

    Conference

    MM '22

    Acceptance Rates

    Overall Acceptance Rate 995 of 4,171 submissions, 24%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 115
    • Downloads (Last 12 months): 31
    • Downloads (Last 6 weeks): 3
    Reflects downloads up to 03 Oct 2024