CGR-Block: Correlated Feature Extractor and Geometric Feature Fusion for Point Cloud Analysis
Abstract
1. Introduction
- (1) Perform a max-pooling function to aggregate the features of neighbor points indiscriminately (e.g., PointNet++ [5]). Although this scheme is currently the most widely adopted in point cloud analysis networks, and has the advantages of low computational cost and fast inference, it does not distinguish semantic differences between features.
- (2) Construct a multilayer perceptron (MLP) that takes a set of neighbor point features as input, outputs a set of weights, and then uses these weights to compute a weighted sum of the neighbor point features (e.g., RandLA-Net [6]). However, a simple MLP has difficulty learning a meaningful set of weights, and rescaling the features with the learned weights yields no observable improvement over undifferentiated max-pooling.
- (3) Use the self-attention mechanism to capture long-range interactions between point features in a purely data-driven, learning-based way and adjust the features adaptively (e.g., Point Transformer [7], PCT [8]). Although the results of this scheme are remarkable, the computational complexity and memory consumption of self-attention are $O(N^2)$, where $N$ is the number of input points, so naive self-attention is not suitable for processing large point clouds. (A minimal sketch contrasting these three aggregation schemes is given after this list.)
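To make the three schemes concrete, the following is a minimal, illustrative PyTorch sketch; the tensor shapes, module names, and the single-head attention form are our own simplifying assumptions, not code from the cited works. `feats` holds the features of the K neighbors of each of the N center points.

```python
import torch
import torch.nn as nn

# feats: (B, N, K, C) -- features of the K neighbors of each of N center points.

def max_pool_aggregation(feats):
    """Scheme (1): indiscriminate max-pooling over neighbors (PointNet++-style)."""
    return feats.max(dim=2).values                                  # (B, N, C)

class WeightedSumAggregation(nn.Module):
    """Scheme (2): an MLP scores each neighbor, then a weighted sum (RandLA-Net-style)."""
    def __init__(self, channels):
        super().__init__()
        self.score_fn = nn.Linear(channels, channels)               # per-neighbor, per-channel scores

    def forward(self, feats):
        weights = torch.softmax(self.score_fn(feats), dim=2)        # normalize over the K neighbors
        return (weights * feats).sum(dim=2)                         # (B, N, C)

class NaiveSelfAttention(nn.Module):
    """Scheme (3): single-head self-attention over all N points.
    The (N x N) attention map is what makes compute and memory O(N^2)."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Linear(channels, channels)
        self.k = nn.Linear(channels, channels)
        self.v = nn.Linear(channels, channels)

    def forward(self, point_feats):                                 # (B, N, C)
        q, k, v = self.q(point_feats), self.k(point_feats), self.v(point_feats)
        attn = torch.softmax(q @ k.transpose(1, 2) / q.shape[-1] ** 0.5, dim=-1)  # (B, N, N)
        return attn @ v                                             # (B, N, C)
```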
- The proposed CGR-block can simultaneously extract and fuse abstract semantic features and local geometric tokens of the point cloud, and it can serve as the basic module to construct the network for point cloud analysis.
- The proposed correlated feature extractor unit can mine inter-point interaction information in a heuristic way and extract features efficiently.
- The proposed geometric feature fusion unit generates a compact geometric pattern token at each stage of feature extraction and fuses it into the deep semantic feature extraction process, contributing considerably to accuracy at a negligible overhead (an illustrative sketch of this idea is given after this list).
- We conduct extensive experiments on point cloud classification and part segmentation tasks to verify that CGR-Net, the network built from the proposed CGR-block, matches or outperforms state-of-the-art approaches.
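Since the precise definitions of the units are given in Section 3, the snippet below is only a hedged illustration of the fusion idea stated above: a small shared MLP pools relative neighbor coordinates into a compact geometric token, which is then fused with each point's semantic feature before the next extraction stage. The layer sizes, the 4-dimensional (relative xyz plus distance) encoding, and the concatenation-based fusion are assumptions made for illustration, not the authors' exact design.

```python
import torch
import torch.nn as nn

class GeometricTokenFusion(nn.Module):
    """Illustrative sketch (not the paper's exact unit): summarize the local geometry of
    each point into a compact token and fuse it with the point's semantic feature."""
    def __init__(self, sem_channels, geo_channels=16):
        super().__init__()
        # Per-neighbor geometric encoding: [relative xyz, distance] -> geo_channels.
        self.geo_mlp = nn.Sequential(nn.Linear(4, geo_channels), nn.ReLU(),
                                     nn.Linear(geo_channels, geo_channels))
        self.fuse = nn.Linear(sem_channels + geo_channels, sem_channels)

    def forward(self, sem_feats, center_xyz, neighbor_xyz):
        # sem_feats: (B, N, C)   center_xyz: (B, N, 3)   neighbor_xyz: (B, N, K, 3)
        rel = neighbor_xyz - center_xyz.unsqueeze(2)                 # (B, N, K, 3)
        dist = rel.norm(dim=-1, keepdim=True)                        # (B, N, K, 1)
        geo = self.geo_mlp(torch.cat([rel, dist], dim=-1))           # (B, N, K, geo_channels)
        geo_token = geo.max(dim=2).values                            # compact geometric token
        return self.fuse(torch.cat([sem_feats, geo_token], dim=-1))  # (B, N, C)
```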
2. Related Work
2.1. Multiple-View-Based and Voxel-Based Methods
2.2. Discrete Point-Based Methods
3. Methods
3.1. Overview
3.2. Farthest Point Sampling (FPS) and K-Nearest Neighbors Grouping (KNN)
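Both operations are standard building blocks; the following is a minimal, unoptimized reference sketch of the sampling-and-grouping step (the function names and the plain pairwise-distance computation are our own simplifications, not the implementation used in the paper).

```python
import torch

def farthest_point_sampling(xyz, num_samples):
    """Iteratively pick the point farthest from the already-selected centers.
    xyz: (N, 3) point coordinates -> indices of num_samples sampled centers."""
    n = xyz.shape[0]
    selected = torch.zeros(num_samples, dtype=torch.long)
    min_dist = torch.full((n,), float("inf"))
    farthest = 0                                        # start from an arbitrary point
    for i in range(num_samples):
        selected[i] = farthest
        d = ((xyz - xyz[farthest]) ** 2).sum(dim=1)     # squared distance to the new center
        min_dist = torch.minimum(min_dist, d)           # distance to the nearest selected center
        farthest = int(min_dist.argmax())               # next center: the most distant point
    return selected

def knn_group(xyz, center_idx, k):
    """For each sampled center, return the indices of its k nearest neighbors."""
    centers = xyz[center_idx]                           # (M, 3)
    dist = torch.cdist(centers, xyz)                    # (M, N) pairwise Euclidean distances
    return dist.topk(k, largest=False).indices          # (M, k) neighbor indices

# Usage sketch: 512 centers with 32 neighbors each, as in typical hierarchical point networks.
# centers = farthest_point_sampling(xyz, 512); groups = knn_group(xyz, centers, 32)
```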
3.3. Correlated Feature Extractor
3.4. Geometric Feature Fusion
3.5. CGR-Block
3.6. Network Architecture
4. Results
4.1. Classification on ModelNet40
4.2. Classification on ScanObjectNN
4.3. Part Segmentation on the ShapeNet-Part
4.4. Ablation Studies
4.4.1. The Validity of the Components of CGR-Block
4.4.2. The Output Dimension of the Geometric Feature Fusion Unit
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Su, H.; Maji, S.; Kalogerakis, E.; Learned-Miller, E. Multi-view convolutional neural networks for 3d shape recognition. In Proceedings of the IEEE International Conference on Computer vision, Santiago, Chile, 7–13 December 2015; pp. 945–953. [Google Scholar]
- Chen, C.; Fragonara, L.Z.; Tsourdos, A. Gapnet: Graph attention based point neural network for exploiting local feature of point cloud. arXiv 2019, arXiv:1905.08705. [Google Scholar]
- Guerry, J.; Boulch, A.; Le Saux, B.; Moras, J.; Plyer, A.; Filliat, D. Snapnet-r: Consistent 3d multi-view semantic labeling for robotics. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Venice, Italy, 22–29 October 2017; pp. 669–678. [Google Scholar]
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 652–660. [Google Scholar]
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep hierarchical feature learning on point sets in a metric space. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. Randla-net: Efficient semantic segmentation of large-scale point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 11108–11117. [Google Scholar]
- Zhao, H.; Jiang, L.; Jia, J.; Torr, P.H.; Koltun, V. Point transformer. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 16259–16268. [Google Scholar]
- Guo, M.H.; Cai, J.X.; Liu, Z.N.; Mu, T.J.; Martin, R.R.; Hu, S.M. Pct: Point cloud transformer. Comput. Vis. Media 2021, 7, 187–199. [Google Scholar] [CrossRef]
- He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
- Wu, Z.; Song, S.; Khosla, A.; Yu, F.; Zhang, L.; Tang, X.; Xiao, J. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 1912–1920. [Google Scholar]
- Uy, M.A.; Pham, Q.H.; Hua, B.S.; Nguyen, T.; Yeung, S.K. Revisiting point cloud classification: A new benchmark dataset and classification model on real-world data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 1588–1597. [Google Scholar]
- Yi, L.; Kim, V.G.; Ceylan, D.; Shen, I.C.; Yan, M.; Su, H.; Lu, C.; Huang, Q.; Sheffer, A.; Guibas, L. A scalable active framework for region annotation in 3d shape collections. ACM Trans. Graph. 2016, 35, 1–12. [Google Scholar] [CrossRef]
- Feng, Y.; Zhang, Z.; Zhao, X.; Ji, R.; Gao, Y. GVCNN: Group-View Convolutional Neural Networks for 3D Shape Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018. [Google Scholar]
- Maturana, D.; Scherer, S. Voxnet: A 3d convolutional neural network for real-time object recognition. In Proceedings of the 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, Germany, 28 September 2015–2 October 2015; pp. 922–928. [Google Scholar]
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep learning for 3d point clouds: A survey. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 4338–4364. [Google Scholar] [CrossRef] [PubMed]
- Zhang, K.; Hao, M.; Wang, J.; de Silva, C.W.; Fu, C. Linked dynamic graph cnn: Learning on point cloud via linking hierarchical features. arXiv 2019, arXiv:1904.10014. [Google Scholar]
- Simonovsky, M.; Komodakis, N. Dynamic edge-conditioned filters in convolutional neural networks on graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3693–3702. [Google Scholar]
- Liu, J.; Ni, B.; Li, C.; Yang, J.; Tian, Q. Dynamic points agglomeration for hierarchical point sets learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 7546–7555. [Google Scholar]
- Li, J.; Chen, B.M.; Lee, G.H. So-net: Self-organizing network for point cloud analysis. In Proceedings of the IEEE Conference on Computer VISION and pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 9397–9406. [Google Scholar]
- Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. Pointcnn: Convolution on x-transformed points. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 3–8 December 2018; Volume 31. [Google Scholar]
- Xie, S.; Liu, S.; Chen, Z.; Tu, Z. Attentional shapecontextnet for point cloud recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4606–4615. [Google Scholar]
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the Advances in Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Volume 30. [Google Scholar]
- Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
- Liu, Y.; Fan, B.; Xiang, S.; Pan, C. Relation-shape convolutional neural network for point cloud analysis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8895–8904. [Google Scholar]
- Xu, M.; Zhang, J.; Zhou, Z.; Xu, M.; Qi, X.; Qiao, Y. Learning geometry-disentangled representation for complementary understanding of 3d object point cloud. arXiv 2021, arXiv:2012.10921. [Google Scholar]
- Xu, Y.; Fan, T.; Xu, M.; Zeng, L.; Qiao, Y. Spidercnn: Deep learning on point sets with parameterized convolutional filters. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 87–102. [Google Scholar]
- Wu, W.; Qi, Z.; Fuxin, L. Pointconv: Deep convolutional networks on 3d point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9621–9630. [Google Scholar]
- Komarichev, A.; Zhong, Z.; Hua, J. A-cnn: Annularly convolutional neural networks on point clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 7421–7430. [Google Scholar]
- Engel, N.; Belagiannis, V.; Dietmayer, K. Point transformer. IEEE Access 2021, 9, 134826–134840. [Google Scholar] [CrossRef]
- Han, X.F.; Kuang, Y.J.; Xiao, G.Q. Point Cloud Learning with Transformer. arXiv 2021, arXiv:2104.13636. [Google Scholar]
- Yan, X.; Zheng, C.; Li, Z.; Wang, S.; Cui, S. Pointasnl: Robust point clouds processing using nonlocal neural networks with adaptive sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 5589–5598. [Google Scholar]
- Thomas, H.; Qi, C.R.; Deschaud, J.E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. Kpconv: Flexible and deformable convolution for point clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 6411–6420. [Google Scholar]
- Liu, Z.; Hu, H.; Cao, Y.; Zhang, Z.; Tong, X. A closer look at local aggregation operators in point cloud analysis. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 326–342. [Google Scholar]
- Liu, Y.; Fan, B.; Meng, G.; Lu, J.; Xiang, S.; Pan, C. Densepoint: Learning densely contextual representation for efficient point cloud processing. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 5239–5248. [Google Scholar]
- Ben-Shabat, Y.; Lindenbaum, M.; Fischer, A. 3d point cloud classification and segmentation using 3d modified fisher vector representation for convolutional neural networks. arXiv 2017, arXiv:1711.08241. [Google Scholar]
- Qiu, S.; Anwar, S.; Barnes, N. Dense-resolution network for point cloud classification and segmentation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Virtual, 5–9 January 2021; pp. 3813–3822. [Google Scholar]
- Qiu, S.; Anwar, S.; Barnes, N. Geometric back-projection network for point cloud classification. IEEE Trans. Multimed. 2021, 24, 1943–1955. [Google Scholar] [CrossRef]
- Goyal, A.; Law, H.; Liu, B.; Newell, A.; Deng, J. Revisiting point cloud shape classification with a simple and effective baseline. In Proceedings of the International Conference on Machine Learning—PMLR, Virtual, 18–24 July 2021; pp. 3809–3820. [Google Scholar]
- Cheng, S.; Chen, X.; He, X.; Liu, Z.; Bai, X. Pra-net: Point relation-aware network for 3d point cloud analysis. IEEE Trans. Image Process. 2021, 30, 4436–4448. [Google Scholar] [CrossRef] [PubMed]
- Huang, Q.; Wang, W.; Neumann, U. Recurrent slice networks for 3d segmentation of point clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2626–2635. [Google Scholar]
- Klokov, R.; Lempitsky, V. Escape from cells: Deep kd-networks for the recognition of 3d point cloud models. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 863–872. [Google Scholar]
- Su, H.; Jampani, V.; Sun, D.; Maji, S.; Kalogerakis, E.; Yang, M.H.; Kautz, J. Splatnet: Sparse lattice networks for point cloud processing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2530–2539. [Google Scholar]
- Yi, L.; Su, H.; Guo, X.; Guibas, L.J. Syncspeccnn: Synchronized spectral cnn for 3d shape segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 2282–2290. [Google Scholar]
- Shen, Y.; Feng, C.; Yang, Y.; Tian, D. Mining point cloud local structures by kernel correlation and graph pooling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 4548–4557. [Google Scholar]
- Atzmon, M.; Maron, H.; Lipman, Y. Point convolutional neural networks by extension operators. arXiv 2018, arXiv:1803.10091. [Google Scholar] [CrossRef]
Table: Classification results on ModelNet40 (P: xyz coordinates, N: normal vectors).

Methods | Inputs | #Points | mAcc (%) | OA (%)
---|---|---|---|---
PointNet [4] | P | 1k | 86.0 | 89.2 |
PointNet++ [5] | P + N | 1k | - | 91.9 |
SpiderCNN [26] | P + N | 1k | - | 92.4 |
PointCNN [20] | P | 1k | 88.1 | 92.5 |
PointConv [27] | P + N | 1k | - | 92.5 |
A-CNN [28] | P + N | 1k | 90.3 | 92.6 |
Point Trans. (Engel et al., 2020) [29] | P | 1k | - | 92.8 |
DGCNN [16] | P | 1k | 90.2 | 92.9 |
MLMSPT [30] | P | 1k | - | 92.9 |
RS-CNN [24] | P | 1k | - | 92.9 |
PointASNL [31] | P | 1k | - | 92.9 |
KPConv [32] | P | 7k | - | 92.9 |
PCT [8] | P | 1k | - | 93.2 |
PosPool [33] | P | 5k | - | 93.2 |
DensePoint [34] | P | 1k | - | 93.2 |
PointASNL [31] | P + N | 1k | - | 93.2 |
RS-CNN * [24] | P | 1k | - | 93.6 |
Point Trans. (Zhao et al., 2021) [7] | P | 1k | 90.6 | 93.7 |
GDANet * [25] | P | 1k | - | 93.8 |
Ours | P + N | 1k | 91.9 | 94.1 |
Table: Classification results on ScanObjectNN.

Methods | mAcc (%) | OA (%)
---|---|---
3DmFV [35] | 58.1 | 63.0 |
PointNet [4] | 63.4 | 68.2 |
SpiderCNN [26] | 69.8 | 73.7 |
PointNet++ [5] | 75.4 | 77.9 |
DGCNN [16] | 73.6 | 78.1 |
PointCNN [20] | 75.1 | 78.5 |
BGA-DGCNN [11] | 75.7 | 79.7 |
BGA-PN++ [11] | 77.5 | 80.2 |
DRNet [36] | 78.0 | 80.3 |
GBNet [37] | 77.8 | 80.5 |
SimpleView [38] | - | 80.5 ± 0.3 |
PRANet [39] | 79.1 | 82.1 |
Ours | 82.7 | 83.5 |
Table: Part segmentation results on ShapeNet-Part (IoU, %).

Methods | Class mIoU | Instance mIoU | Airplane | Bag | Cap | Car | Chair | Earphone | Guitar | Knife | Lamp | Laptop | Motorbike | Mug | Pistol | Rocket | Skateboard | Table
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---
Kd-Net [41] | 77.4 | 82.3 | 80.1 | 74.6 | 74.3 | 70.3 | 88.6 | 73.5 | 90.2 | 87.2 | 81.0 | 94.9 | 57.4 | 86.7 | 78.1 | 51.8 | 69.9 | 80.3 |
PointNet [4] | 80.4 | 83.7 | 83.4 | 78.7 | 82.5 | 74.9 | 89.6 | 73.0 | 91.5 | 85.9 | 80.8 | 95.3 | 65.2 | 93.0 | 81.2 | 57.9 | 72.8 | 80.6 |
SCN [21] | 81.8 | 84.6 | 83.8 | 80.8 | 83.5 | 79.3 | 90.5 | 69.8 | 91.7 | 86.5 | 82.9 | 96.0 | 69.2 | 93.8 | 82.5 | 62.9 | 74.4 | 80.8 |
SPLATNet [42] | 82.0 | 84.6 | 81.9 | 83.9 | 88.6 | 79.5 | 90.1 | 73.5 | 91.3 | 84.7 | 84.5 | 96.3 | 69.7 | 95.0 | 81.7 | 59.2 | 70.4 | 81.3 |
SO-Net [19] | 80.8 | 84.6 | 81.9 | 83.5 | 84.8 | 78.1 | 90.8 | 72.2 | 90.1 | 83.6 | 82.3 | 95.2 | 69.3 | 94.2 | 80.0 | 51.6 | 72.1 | 82.6 |
SyncCNN [43] | 82.0 | 84.7 | 81.6 | 81.7 | 81.9 | 75.2 | 90.2 | 74.9 | 93.0 | 86.1 | 84.7 | 95.6 | 66.7 | 92.7 | 81.6 | 60.6 | 82.9 | 82.1 |
KCNet [44] | 82.2 | 84.7 | 82.8 | 81.5 | 86.4 | 77.6 | 90.3 | 76.8 | 91.0 | 87.2 | 84.5 | 95.5 | 69.2 | 94.4 | 81.6 | 60.1 | 75.2 | 81.3 |
RS-Net [40] | 81.4 | 84.9 | 82.7 | 86.4 | 84.1 | 78.2 | 90.4 | 69.3 | 91.4 | 87.0 | 83.5 | 95.4 | 66.0 | 92.6 | 81.8 | 56.1 | 75.8 | 82.2 |
DGCNN [16] | 82.3 | 85.1 | 84.2 | 83.7 | 84.4 | 77.1 | 90.9 | 78.5 | 91.5 | 87.3 | 82.9 | 96.0 | 67.8 | 93.3 | 82.6 | 59.7 | 75.5 | 82.0 |
PCNN [45] | 81.8 | 85.1 | 82.4 | 80.1 | 85.5 | 79.5 | 90.8 | 73.2 | 91.3 | 86.0 | 85.0 | 95.7 | 73.2 | 94.8 | 83.3 | 51.0 | 75.0 | 81.8 |
PointNet++ [5] | 81.9 | 85.1 | 82.4 | 79.0 | 87.7 | 77.3 | 90.8 | 71.8 | 91.0 | 85.9 | 83.7 | 95.3 | 71.6 | 94.1 | 81.3 | 58.7 | 76.4 | 82.6 |
SpiderCNN [26] | 82.4 | 85.3 | 83.5 | 81.0 | 87.2 | 77.5 | 90.7 | 76.8 | 91.1 | 87.3 | 83.3 | 95.8 | 70.2 | 93.5 | 82.7 | 59.7 | 75.8 | 82.8 |
Ours | 82.8 | 85.5 | 82.4 | 79.7 | 87.7 | 79.4 | 90.4 | 76.2 | 91.2 | 85.7 | 84.3 | 95.8 | 75.3 | 94.7 | 81.5 | 61.4 | 76.6 | 82.2 |
Table: Ablation study on the components of the CGR-block (ModelNet40 classification).

Ablations | mAcc (%) | OA (%)
---|---|---
(1) Simplify correlated feature extractor | 91.6 | 93.7 |
(2) Remove geometric feature fusion | 90.2 | 92.2 |
(3) Remove shortcut | 91.8 | 93.1 |
(4) The full network | 91.9 | 94.1 |
Table: Effect of the output dimension of the geometric feature fusion unit (ModelNet40 classification).

Output dimension | 16 | 32 | 64 | 128 | 256
---|---|---|---|---|---
mAcc (%) | 91.0 | 91.2 | 91.6 | 91.9 | 91.7 |
OA (%) | 93.5 | 93.6 | 93.9 | 94.1 | 93.5 |
#params (M) | 5.66 | 5.75 | 5.95 | 6.37 | 7.31 |
#FLOPs/sample (M) | 1461.3 | 1504.7 | 1596.2 | 1798.2 | 2277.6 |