Context-Aware 3D Points of Interest Detection via Spatial Attention Mechanism

Published: 12 July 2023

Abstract

Detecting points of interest is a fundamental problem in 3D shape analysis and benefits various tasks in multimedia processing. Traditional learning-based detection methods usually rely on each vertex’s own geometric features to discriminate points of interest from other vertices. Observing that points of interest depend not only on their own geometric features but also on the geometric features of surrounding vertices, we propose in this article a novel context-aware algorithm for detecting 3D points of interest that adopts a spatial attention mechanism. Built around a context attention module, our deep neural network attends simultaneously to the geometric features of vertices and to their local contexts while extracting points of interest. To obtain satisfactory extraction results, our method adaptively assigns different weights to those features in a data-driven way. Extensive experiments on the SHREC 2007, SHREC 2011, and SHREC 2014 datasets show that our algorithm outperforms existing methods.
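
To make the idea of weighting a vertex's own features against its local context more concrete, the following is a minimal sketch and not the network described in the article. It assumes that each vertex already has a geometric feature vector and a list of neighboring vertex indices, and it scores neighbors with a plain scaled dot product, where a trained model would use learned projections. The function name `context_attention` and the toy data are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def context_attention(features, neighbors):
    """Fuse each vertex feature with an attention-weighted sum of its neighbors.

    features:  list of n feature vectors (each a list of d floats)
    neighbors: list of n lists of neighbor vertex indices
    Returns (fused, weights): fused[i] has length 2 * d, weights[i] sums to 1.
    """
    d = len(features[0])
    fused, all_weights = [], []
    for i, nbrs in enumerate(neighbors):
        # Assumed scoring rule: scaled dot product between the vertex feature
        # and each neighbor feature (no learned parameters in this sketch).
        scores = [dot(features[i], features[j]) / math.sqrt(d) for j in nbrs]
        weights = softmax(scores)
        # Local context: attention-weighted sum of the neighbor features.
        context = [
            sum(w * features[j][c] for w, j in zip(weights, nbrs))
            for c in range(d)
        ]
        # Concatenate the vertex's own feature with its local context.
        fused.append(features[i] + context)
        all_weights.append(weights)
    return fused, all_weights


# Toy input: 5 vertices with 4-dimensional features and hand-written adjacency.
random.seed(0)
features = [[random.gauss(0, 1) for _ in range(4)] for _ in range(5)]
neighbors = [[1, 2], [0, 2, 3], [0, 1], [1, 4], [3]]
fused, weights = context_attention(features, neighbors)
print(len(fused), len(fused[0]))
```

In this sketch the attention weights depend only on the input features. In a data-driven setting such as the one the abstract describes, the weighting would instead be learned from labeled points of interest.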

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 6
    November 2023
    858 pages
    ISSN: 1551-6857
    EISSN: 1551-6865
    DOI: 10.1145/3599695
    Editor: Abdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 12 July 2023
    Online AM: 11 May 2023
    Accepted: 09 May 2023
    Revised: 24 March 2023
    Received: 20 November 2022
    Published in TOMM Volume 19, Issue 6

    Author Tags

    1. 3D point of interest
    2. deep learning
    3. attention mechanism

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • Zhejiang Provincial Natural Science Foundation of China
    • Ningbo Major Special Projects of the “Science and Technology Innovation 2025”
