MSIDA-Net: Point Cloud Semantic Segmentation via Multi-Spatial Information and Dual Adaptive Blocks
Abstract
1. Introduction
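- We propose a multiple spatial information encoding block that encodes each local neighbourhood in the Cartesian, spherical, and cylindrical coordinate systems, so that the network learns more types of spatial information about the point cloud;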
- We propose a coordinate systems attentive pooling fusion (CSAPF) block to sufficiently learn local context features. The local features encoded in each of the three coordinate systems first pass through attentive pooling, which yields one set of attention scores per coordinate system. The three sets of scores are then averaged, so that each neighbouring point obtains a more reasonable attention score for feature learning;
- We propose a local aggregation features attention (LAFA) block. Because the distribution of points differs from one local region to another, this block learns the aggregated feature of each local region in the different coordinate systems, together with adaptive weights that express how much each aggregated feature contributes. In other words, the better a local aggregation feature describes its local region, the more weight it receives. This block allows the relations among the coordinate systems to be learned adequately and improves the method's understanding of local regions.
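
The three coordinate systems named in the first contribution can be illustrated with a small sketch. It uses the standard textbook conversions from a Cartesian offset to spherical and cylindrical coordinates; the function name `encode_neighbour`, the tuple layout, and the choice to describe a neighbour relative to its centre point are assumptions made for illustration, and the features actually used by the paper may contain further terms.

```python
import math


def encode_neighbour(center, neighbour):
    """Describe one neighbour relative to a centre point in three coordinate systems.

    Both arguments are (x, y, z) tuples. Returns a dict with one 3-tuple per
    coordinate system. Hypothetical helper, not the authors' implementation.
    """
    dx = neighbour[0] - center[0]
    dy = neighbour[1] - center[1]
    dz = neighbour[2] - center[2]

    rho = math.hypot(dx, dy)                     # distance within the xy-plane
    r = math.sqrt(dx * dx + dy * dy + dz * dz)   # Euclidean distance
    azimuth = math.atan2(dy, dx)                 # angle in the xy-plane
    polar = math.atan2(rho, dz)                  # angle from the +z axis, in [0, pi]

    return {
        "cartesian": (dx, dy, dz),
        "spherical": (r, azimuth, polar),
        "cylindrical": (rho, azimuth, dz),
    }


if __name__ == "__main__":
    encoded = encode_neighbour((0.0, 0.0, 0.0), (1.0, 1.0, 0.0))
    for system, values in encoded.items():
        print(system, values)
```

Using `atan2(rho, dz)` for the polar angle avoids a division by the radius, so the sketch also behaves sensibly when a neighbour coincides with its centre point.
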
2. Related Work
2.1. Projection-Based Methods
2.2. Voxel-Based Methods
2.3. Point-Based Methods
3. Methodology
3.1. Spatial Information Encoding Based on Multiple Coordinate Systems
3.1.1. The Spatial Information Encoding of the Cartesian Coordinate System
3.1.2. The Spatial Information Encoding of the Spherical Coordinate System
3.1.3. The Spatial Information Encoding of the Cylindrical Coordinate System
3.2. Coordinate Systems Attentive Pooling Fusion
3.2.1. Calculating the Attention Scores of Neighbouring Points in Each Coordinate System
3.2.2. Attention Scores Fusion
3.3. Local Aggregation Features Attention
3.4. Loss Function
4. Experiments
4.1. Datasets
4.2. Results of Semantic Segmentation
Method | mIoU (%) | Road | Sidewalk | Parking | Other-ground | Building | Car | Truck | Bicycle | Motorcycle | Other-vehicle | Vegetation | Trunk | Terrain | Person | Bicyclist | Motorcyclist | Fence | Pole | Traffic-sign |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PointNet [9] | 14.6 | 61.6 | 35.7 | 15.8 | 1.4 | 41.4 | 46.3 | 0.1 | 1.3 | 0.3 | 0.8 | 31 | 4.6 | 17.6 | 0.2 | 0.2 | 0 | 12.9 | 2.4 | 3.7 |
PointNet++ [10] | 20.1 | 72 | 41.8 | 18.7 | 5.6 | 62.3 | 53.7 | 0.9 | 1.9 | 0.2 | 0.2 | 46.5 | 13.8 | 30 | 0.9 | 1 | 0 | 16.9 | 6 | 8.9 |
SquSegV2 [44] | 39.7 | 88.6 | 67.6 | 45.8 | 17.7 | 73.7 | 81.8 | 13.4 | 18.5 | 17.9 | 14 | 71.8 | 35.8 | 60.2 | 20.1 | 25.1 | 3.9 | 41.1 | 20.2 | 36.3 |
TangentConv [19] | 40.9 | 83.9 | 63.9 | 33.4 | 15.4 | 83.4 | 90.8 | 15.2 | 2.7 | 16.5 | 12.1 | 79.5 | 49.3 | 58.1 | 23 | 28.4 | 8.1 | 49 | 35.8 | 28.5 |
PointASNL [45] | 46.8 | 87.4 | 74.3 | 24.3 | 1.8 | 83.1 | 87.9 | 39 | 0 | 25.1 | 29.2 | 84.1 | 52.2 | 70.6 | 34.2 | 57.6 | 0 | 43.9 | 57.8 | 36.9 |
RandLA-Net [14] | 53.9 | 90.7 | 73.7 | 60.3 | 20.4 | 86.9 | 94.2 | 40.1 | 26 | 25.8 | 38.9 | 81.4 | 61.3 | 66.8 | 49.2 | 48.2 | 7.2 | 56.3 | 49.2 | 47.7 |
PolarNet [46] | 54.3 | 90.8 | 74.4 | 61.7 | 21.7 | 90 | 93.8 | 22.9 | 40.3 | 30.1 | 28.5 | 84 | 65.5 | 67.8 | 43.2 | 40.2 | 5.6 | 67.8 | 51.8 | 57.5 |
MinkNet42 [47] | 54.3 | 91.1 | 69.7 | 63.8 | 29.3 | 92.7 | 94.3 | 26.1 | 23.1 | 26.2 | 36.7 | 83.7 | 68.4 | 64.7 | 43.1 | 36.4 | 7.9 | 57.1 | 57.3 | 60.1 |
BAAF-Net [13] | 59.9 | 90.9 | 74.4 | 62.2 | 23.6 | 89.8 | 95.4 | 48.7 | 31.8 | 35.5 | 46.7 | 82.7 | 63.4 | 67.9 | 49.5 | 55.7 | 53 | 60.8 | 53.7 | 52 |
FusionNet [48] | 61.3 | 91.8 | 77.1 | 68.8 | 30.8 | 92.5 | 95.3 | 41.8 | 47.5 | 37.7 | 34.5 | 84.5 | 69.8 | 68.5 | 59.5 | 56.8 | 11.9 | 69.4 | 60.4 | 66.5 |
Ours | 59.8 | 90.7 | 74.9 | 63.1 | 27.1 | 91.1 | 95.6 | 52.3 | 35.3 | 43.3 | 46.1 | 82.1 | 64.5 | 67 | 52.6 | 57.5 | 22.7 | 64 | 54.4 | 51.6 |
4.3. Ablation Experiments
4.3.1. Ablation of the CSAPF Block and LAFA Block
4.3.2. Ablation Experiments for the Features of Coordinate Systems
4.3.3. Ablation of the Fusion Attention Score
4.4. Information about Experiments and Model Size
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Guo, Y.; Wang, H.; Hu, Q.; Liu, H.; Liu, L.; Bennamoun, M. Deep Learning for 3D Point Clouds: A Survey. IEEE Trans. Pat. Anal. Mach. Intell. 2020, 43, 4338–4364.
- Lu, T.; Wang, L.; Wu, G. Cga-net: Category Guided Aggregation for Point Cloud Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 11693–11702.
- Landrieu, L.; Simonovsky, M. Large-Scale Point Cloud Semantic Segmentation with Superpoint Graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 4558–4567.
- Landrieu, L.; Boussaha, M. Point Cloud Oversegmentation with Graph-Structured Deep Metric Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 7440–7449.
- Li, Y.; Ma, L.; Zhong, Z.; Cao, D.; Li, J. TGNet: Geometric Graph CNN on 3-D Point Cloud Segmentation. IEEE Trans. Geo. Rem. Sens. 2020, 58, 3588–3600.
- Bazazian, D.; Nahata, D. DCG-net: Dynamic Capsule Graph Convolutional Network for Point Clouds. IEEE Access 2020, 8, 188056–188067.
- Liu, J.; Ni, B.; Li, C.; Yang, J.; Tian, Q. Dynamic Points Agglomeration for Hierarchical Point Sets Learning. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 7546–7555.
- Liang, Z.; Yang, M.; Deng, L.; Wang, C.; Wang, B. Hierarchical Depthwise Graph Convolutional Neural Network for 3D Semantic Segmentation of Point Clouds. In Proceedings of the 2019 IEEE International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 8152–8158.
- Qi, C.R.; Su, H.; Mo, K.; Guibas, L.J. Pointnet: Deep Learning on Point Sets for 3D Classification and Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 652–660.
- Qi, C.R.; Yi, L.; Su, H.; Guibas, L.J. Pointnet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space. arXiv 2017, arXiv:1706.02413.
- Wang, Y.; Sun, Y.; Liu, Z.; Sarma, S.E.; Bronstein, M.M.; Solomon, J.M. Dynamic Graph CNN for Learning on Point Clouds. ACM Trans. Grap. 2019, 38, 1–12.
- Pan, L.; Chew, C.M.; Lee, G.H. PointAtrousGraph: Deep Hierarchical Encoder-Decoder with Point Atrous Convolution for Unorganized 3D Points. In Proceedings of the IEEE/CVF International Conference on Robotics and Automation (ICRA), Virtual, 31 May–31 August 2020; pp. 1113–1120.
- Qiu, S.; Anwar, S.; Barnes, N. Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 1757–1767.
- Hu, Q.; Yang, B.; Xie, L.; Rosa, S.; Guo, Y.; Wang, Z.; Trigoni, N.; Markham, A. Randla-net: Efficient Semantic Segmentation of Large-Scale Point Clouds. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 14–19 June 2020; pp. 11108–11117.
- Fan, S.; Dong, Q.; Zhu, F.; Lv, Y.; Ye, P.; Wang, F.Y. SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 19–25 June 2021; pp. 14504–14513.
- Lawin, F.J.; Danelljan, M.; Tosteberg, P.; Bhat, G.; Khan, F.S.; Felsberg, M. Deep Projective 3D Semantic Segmentation. In Proceedings of the International Conference on Computer Analysis of Images and Patterns (CAIP), Ystad, Sweden, 22–24 August 2017; pp. 95–107.
- Boulch, A.; Lesaux, B.; Audebert, N. Unstructured Point Cloud Semantic Labeling Using Deep Segmentation Networks. In Proceedings of the 3DOR@ Eurographics, Lyon, France, 23–24 April 2017; p. 3.
- Badrinarayanan, V.; Kendall, A.; Cipolla, R. Segnet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation. IEEE Trans. Pat. Anal. Mach. Intell. 2017, 39, 2481–2495.
- Tatarchenko, M.; Park, J.; Koltun, V.; Zhou, Q.Y. Tangent Convolutions for Dense Prediction in 3D. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 3887–3896.
- Huang, J.; You, S. Point Cloud Labeling Using 3D Convolutional Neural Network. In Proceedings of the IEEE/CVF Conference on Pattern Recognition (ICPR), Stockholm, Sweden, 16–21 May 2016; pp. 2670–2675.
- Tchapmi, L.P.; Choy, C.B.; Armeni, I.; Gwak, J.Y.; Savarese, S. Segcloud: Semantic Segmentation of 3D Point Clouds. In Proceedings of the IEEE/CVF Conference on 3D Vision (3DV), Qingdao, China, 10–12 October 2017; pp. 537–547.
- Meng, H.Y.; Gao, L.; Lai, Y.K.; Manocha, D. Vv-net: Voxel Vae Net with Group Convolutions for Point Cloud Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 8500–8508.
- Dai, A.; Ritchie, D.; Bokeloh, M.; Reed, S.; Strum, J.; Nießner, M. Scancomplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 4578–4587.
- Hu, Z.; Bai, X.; Shang, J.; Zhang, R.; Dong, J.; Wang, X.; Sun, G.; Fu, H.; Tai, C.L. Vmnet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Virtual, 11–17 October 2021; pp. 15488–15498.
- Ye, M.; Xu, S.; Cao, T.; Chen, Q. Drinet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Virtual, 11–17 October 2021; pp. 7447–7456.
- Ma, Y.; Guo, Y.; Liu, H.; Lei, Y.; Wen, G. Global Context Reasoning for Semantic Segmentation of 3D Point Clouds. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Snowmass Village, CO, USA, 2–5 May 2020; pp. 2931–2940.
- Wang, L.; Huang, Y.; Hou, Y.; Zhang, S.; Shan, J. Graph Attention Convolution for Point Cloud Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 10296–10305.
- Zhang, W.; Xiao, C. PCAN: 3D Attention Map Learning Using Contextual Information for Point Cloud Based Retrieval. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 12436–12445.
- Yang, J.; Zhang, Q.; Ni, B.; Li, L.; Liu, J.; Zhou, M.; Tian, Q. Modeling Point Clouds with Self-Attention and Gumbel Subset Sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3323–3332.
- Hua, B.S.; Tran, M.K.; Yeung, S.K. Pointwise Convolutional Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 984–993.
- Thomas, H.; Qi, C.R.; Deschaud, J.E.; Marcotegui, B.; Goulette, F.; Guibas, L.J. Kpconv: Flexible and Deformable Convolution for Point Clouds. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 6411–6420.
- Engelmann, F.; Kontogianni, T.; Leibe, B. Dilated Point Convolutions: On the Receptive Field Size of Point Convolutions on 3D Point Clouds. In Proceedings of the IEEE International Conference on Robotics and Automation (ICRA), Virtual, 31 May–31 August 2020; pp. 9463–9469.
- Lei, H.; Akhtar, N.; Mian, A. Spherical Kernel for Efficient Graph Convolution on 3D Point Clouds. IEEE Trans. Pat. Anal. Mach. Intell. 2020, 43, 3664–3680.
- Lin, Z.H.; Huang, S.Y.; Wang, Y.C.F. Learning of 3D Graph Convolution Networks for Point Cloud Analysis. IEEE Trans. Pat. Anal. Mach. Intell. 2021. Early Access.
- Armeni, I.; Sax, S.; Zamir, A.R.; Savarese, S. Joint 2D-3D-Semantic Data for Indoor Scene Understanding. arXiv 2017, arXiv:1702.01105.
- Hackel, T.; Savinov, N.; Ladicky, L.; Wegner, J.D.; Schindler, K.; Pollefeys, M. Semantic3d. net: A New Large-Scale Point Cloud Classification Benchmark. arXiv 2017, arXiv:1704.03847.
- Behley, J.; Garbade, M.; Milioto, A.; Quenzel, J.; Behnke, S.; Stachniss, C.; Gall, J. Semantickitti: A Dataset for Semantic Scene Understanding of Lidar Sequences. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 9297–9307.
- Huang, Q.; Wang, W.; Neumann, U. Recurrent Slice Networks for 3D Segmentation of Point Clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–22 June 2018; pp. 2626–2635.
- Ye, X.; Li, J.; Huang, H.; Du, L.; Zhang, X. 3D Recurrent Neural Networks with Context Fusion for Point Cloud Semantic Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 403–417.
- Li, Y.; Bu, R.; Sun, M.; Wu, W.; Di, X.; Chen, B. Pointcnn: Convolution on X-Transformed Points. Adv. Neural Inf. Process. Syst. 2018, 31.
- Zhao, H.; Jiang, L.; Fu, C.W.; Jia, J. Pointweb: Enhancing Local Neighborhood Features for Point Cloud Processing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 5565–5573.
- Zhang, Z.; Hua, B.S.; Yeung, S.K. Shellnet: Efficient Point Cloud Convolutional Neural Networks Using Concentric Shells Statistics. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Korea, 27–28 October 2019; pp. 1607–1616.
- Truong, G.; Gilani, S.Z.; Islam, S.M.S.; Suter, D. Fast Point Cloud Registration Using Semantic Segmentation. In Proceedings of the 2019 Digital Image Computing: Techniques and Applications (DICTA), Perth, Australia, 2–4 December 2019; pp. 1–8.
- Wu, B.; Zhou, X.; Zhao, S.; Yue, X.; Keutzer, K. Squeezesegv2: Improved Model Structure and Unsupervised Domain Adaptation for Road-Object Segmentation from a Lidar Point Cloud. In Proceedings of the 2019 IEEE International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; pp. 4376–4382.
- Yan, X.; Zheng, C.; Li, Z.; Wang, S.; Cui, S. Pointasnl: Robust Point Clouds Processing Using Nonlocal Neural Networks with Adaptive Sampling. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 14–19 June 2020; pp. 5589–5598.
- Zhang, Y.; Zhou, Z.; David, P.; Yue, X.; Xi, Z.; Gong, B.; Foroosh, H. Polarnet: An Improved Grid Representation for Online Lidar Point Clouds Semantic Segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Virtual, 14–19 June 2020; pp. 9601–9610.
- Choy, C.; Gwak, J.; Savarese, S. 4D Spatio-Temporal Convnets: Minkowski Convolutional Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 3075–3084.
- Zhang, F.; Fang, J.; Wah, B.; Torr, P. Deep Fusionnet for Point Cloud Semantic Segmentation. In Proceedings of the European Conference on Computer Vision (ECCV), Virtual, 23–28 August 2020; pp. 644–663.
Methods | OA (%) | mAcc (%) | mIoU (%) | Ceil. | Floor | Wall | Beam | Col. | Wind. | Door | Table | Chair | Sofa | Book. | Board | Clut. |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
PointNet [9] | 78.6 | 66.2 | 47.6 | 88.0 | 88.7 | 69.3 | 42.4 | 23.1 | 47.5 | 51.6 | 54.1 | 42.0 | 9.6 | 38.2 | 29.4 | 35.2 |
RSNet [38] | - | 66.5 | 56.5 | 92.5 | 92.8 | 78.6 | 32.8 | 34.4 | 51.6 | 68.1 | 59.7 | 60.1 | 16.4 | 50.2 | 44.9 | 52.0 |
3P-RNN [39] | 86.9 | - | 56.3 | 92.9 | 93.8 | 73.1 | 42.5 | 25.9 | 47.6 | 59.2 | 60.4 | 66.7 | 24.8 | 57.0 | 36.7 | 51.6 |
SPG [3] | 86.4 | 73.0 | 62.1 | 89.9 | 95.1 | 76.4 | 62.8 | 47.1 | 55.3 | 68.4 | 73.5 | 69.2 | 63.2 | 45.9 | 8.7 | 52.9 |
PointCNN [40] | 88.1 | 75.6 | 65.4 | 94.8 | 97.3 | 75.8 | 63.3 | 51.7 | 58.4 | 57.2 | 71.6 | 69.1 | 39.1 | 61.2 | 52.2 | 58.6 |
PointWeb [41] | 87.3 | 76.2 | 66.7 | 93.5 | 94.2 | 80.8 | 52.4 | 41.3 | 64.9 | 68.1 | 71.4 | 67.1 | 50.3 | 62.7 | 62.2 | 58.5 |
ShellNet [42] | 87.1 | - | 66.8 | 90.2 | 93.6 | 79.9 | 60.4 | 44.1 | 64.9 | 52.9 | 71.6 | 84.7 | 53.8 | 64.6 | 48.6 | 59.4 |
KPConv [31] | - | 79.1 | 70.6 | 93.6 | 92.4 | 83.1 | 63.9 | 54.3 | 66.1 | 76.6 | 57.8 | 64.0 | 69.3 | 74.9 | 61.3 | 60.3 |
RandLA [14] | 88.0 | 82.0 | 70.0 | 93.1 | 96.1 | 80.6 | 62.4 | 48.0 | 64.4 | 69.4 | 69.4 | 76.4 | 60.0 | 64.2 | 65.9 | 60.1 |
SCF-Net [15] | 88.4 | 82.7 | 71.6 | 93.3 | 96.4 | 80.9 | 64.9 | 47.4 | 64.5 | 70.1 | 71.4 | 81.6 | 67.2 | 64.4 | 67.5 | 60.9 |
BAAF-Net [13] | 88.9 | 83.1 | 72.2 | 93.3 | 96.8 | 81.6 | 61.9 | 49.5 | 65.4 | 73.3 | 72.0 | 83.7 | 67.5 | 64.3 | 67.0 | 62.4 |
Ours | 89.2 | 83.7 | 73.0 | 93.6 | 97.0 | 82.1 | 67.1 | 52.1 | 66.0 | 71.8 | 73.5 | 80.2 | 70.3 | 67.3 | 64.0 | 63.3 |
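
The OA, mAcc, and mIoU columns above are standard segmentation metrics that can all be derived from a confusion matrix. The following is a minimal sketch of those definitions, not the evaluation script used for the reported numbers; the function name `segmentation_metrics` is hypothetical, and skipping classes that never occur is an assumption, since benchmarks differ in how they treat absent classes.

```python
def segmentation_metrics(conf):
    """Compute OA, mAcc and mIoU from a square confusion matrix.

    conf[i][j] is the number of points whose true class is i and whose
    predicted class is j. The matrix must contain at least one point.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    correct = sum(conf[c][c] for c in range(n))

    ious, accs = [], []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                       # true class c, predicted otherwise
        fp = sum(conf[r][c] for r in range(n)) - tp  # predicted c, true class differs
        if tp + fp + fn > 0:                         # assumption: skip unused classes
            ious.append(tp / (tp + fp + fn))
        if tp + fn > 0:                              # class accuracy needs ground truth
            accs.append(tp / (tp + fn))

    return {
        "OA": correct / total,            # overall accuracy
        "mAcc": sum(accs) / len(accs),    # mean per-class accuracy
        "mIoU": sum(ious) / len(ious),    # mean intersection over union
    }


if __name__ == "__main__":
    print(segmentation_metrics([[8, 2], [1, 9]]))
```

Multiplying each value by 100 gives percentages in the form used in the tables.
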
Methods | mIoU (%) | OA (%) | Man-made. | Natural. | High Veg. | Low Veg. | Buildings | Hard Scape | Scanning Art. | Cars |
---|---|---|---|---|---|---|---|---|---|---|
SnapNet [17] | 59.1 | 88.6 | 82.0 | 77.3 | 79.7 | 22.9 | 91.1 | 18.4 | 37.3 | 64.4 |
SEGCloud [21] | 61.3 | 88.1 | 83.9 | 66.0 | 86.0 | 40.5 | 91.1 | 30.9 | 27.5 | 64.3 |
ShellNet [42] | 69.3 | 93.2 | 96.3 | 90.4 | 83.9 | 41.0 | 94.2 | 34.7 | 43.9 | 70.2 |
GACNet [27] | 70.8 | 91.9 | 86.4 | 77.7 | 88.5 | 60.6 | 94.2 | 37.3 | 43.5 | 77.8 |
SPG [3] | 73.2 | 94.0 | 97.4 | 92.6 | 87.9 | 44.0 | 93.2 | 31.0 | 63.5 | 76.2 |
KPConv [31] | 74.6 | 92.9 | 90.9 | 82.2 | 84.2 | 47.9 | 94.9 | 40.0 | 77.3 | 79.7 |
RGNet [43] | 74.7 | 94.5 | 97.5 | 93.0 | 88.1 | 48.1 | 94.6 | 36.2 | 72.0 | 68.0 |
RandLA-Net [14] | 77.4 | 94.8 | 95.6 | 91.4 | 86.6 | 51.5 | 95.7 | 51.5 | 69.8 | 76.8 |
SCF-Net [15] | 77.6 | 94.7 | 97.1 | 91.8 | 86.3 | 51.2 | 95.3 | 50.5 | 67.9 | 80.7 |
Ours | 77.8 | 94.6 | 97.5 | 94.9 | 87.0 | 54.9 | 94.2 | 42.8 | 72.0 | 78.8 |
Method | Variant | CSAPF Block | LAFA Block | mIoU (%) | OA (%) |
---|---|---|---|---|---|
MSIDA-Net | Ours1 | | | 64.2 | 87.8 |
MSIDA-Net | Ours2 | √ | | 66.0 | 89.2 |
MSIDA-Net | Ours3 | | √ | 66.2 | 88.6 |
MSIDA-Net | MSIDA | √ | √ | 66.9 | 89.3 |
Method | Variant | Remove Cartesian Feature | Remove Spherical Feature | Remove Cylindrical Feature | mIoU (%) | OA (%) |
---|---|---|---|---|---|---|
MSIDA-Net | Ours4 | √ | | | 65.4 | 88.3 |
MSIDA-Net | Ours5 | | √ | | 65.3 | 88.7 |
MSIDA-Net | Ours6 | | | √ | 65.5 | 88.8 |
MSIDA-Net | Full features | | | | 66.9 | 89.3 |

Method | Variant | Ablation Methods (Concatenate) | mIoU (%) | OA (%) |
---|---|---|---|---|
MSIDA-Net | Ours7 | Encode [Spherical Feature + Cylindrical Feature] and Cartesian Feature | 66.1 | 88.9 |
MSIDA-Net | Ours8 | Encode [Cartesian Feature + Cylindrical Feature] and Spherical Feature | 65.3 | 88.7 |
MSIDA-Net | Ours9 | Encode [Cartesian Feature + Spherical Feature] and Cylindrical Feature | 66.5 | 89.0 |
Method | Variant | Cartesian Ratio | Spherical Ratio | Cylindrical Ratio | mIoU (%) | OA (%) |
---|---|---|---|---|---|---|
MSIDA-Net | Ours10 | 0.10 | 0.30 | 0.60 | 65.7 | 89.0 |
MSIDA-Net | Ours11 | 0.10 | 0.60 | 0.30 | 65.4 | 88.7 |
MSIDA-Net | Ours12 | 0.30 | 0.10 | 0.60 | 65.2 | 88.5 |
MSIDA-Net | Ours13 | 0.30 | 0.60 | 0.10 | 66.3 | 88.7 |
MSIDA-Net | Ours14 | 0.60 | 0.10 | 0.30 | 64.8 | 88.3 |
MSIDA-Net | Ours15 | 0.60 | 0.30 | 0.10 | 65.4 | 88.2 |
MSIDA-Net | Original | 0.33 | 0.33 | 0.33 | 66.9 | 89.3 |

Method | Variant | Ablation Methods (Secondary Weighting) | mIoU (%) | OA (%) |
---|---|---|---|---|
MSIDA-Net | Ours16 | - | 66.1 | 88.6 |
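
The ratio ablation can be read as a comparison of different weightings of the attention scores obtained in the three coordinate systems, with the equal-ratio row corresponding to the plain average described for the CSAPF block. The sketch below illustrates that reading under several assumptions: each neighbour receives a single scalar score per coordinate system, scores are normalised with a softmax over the neighbours, and the names `softmax`, `fuse_attention`, and `attentive_pool` are hypothetical helpers rather than the authors' code.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_attention(logits_by_system, weights=None):
    """Fuse per-neighbour attention scores from several coordinate systems.

    logits_by_system maps a coordinate-system name to a list of K raw scores,
    one per neighbouring point. weights maps the same names to fusion ratios;
    when omitted, every system gets an equal share (a plain average).
    """
    names = list(logits_by_system)
    if weights is None:
        weights = {name: 1.0 / len(names) for name in names}
    k = len(logits_by_system[names[0]])
    fused = [0.0] * k
    for name in names:
        scores = softmax(logits_by_system[name])
        for i, score in enumerate(scores):
            fused[i] += weights[name] * score
    return fused


def attentive_pool(features, scores):
    """Aggregate K neighbour feature vectors as a score-weighted sum."""
    dim = len(features[0])
    return [
        sum(scores[i] * features[i][d] for i in range(len(features)))
        for d in range(dim)
    ]


if __name__ == "__main__":
    logits = {
        "cartesian": [0.2, 0.1, -0.3],
        "spherical": [0.5, -0.2, 0.0],
        "cylindrical": [0.0, 0.4, 0.1],
    }
    averaged = fuse_attention(logits)
    ratios = {"cartesian": 0.10, "spherical": 0.60, "cylindrical": 0.30}
    weighted = fuse_attention(logits, ratios)
    print(averaged, weighted)
```

When the ratios sum to one, the fused scores still sum to one over the neighbours, so they can be passed directly to the weighted sum in `attentive_pool`.
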
Method | Dataset | Parameters (Millions) | Training Speed (Batch/s) | Test Time (s) | Max Inference Points (Millions) |
---|---|---|---|---|---|
MSIDA-Net | S3DIS | 15.98 | 1.50 | 47.1 | 0.37 |
MSIDA-Net | Semantic3D | 15.98 | 0.85 | 91.7 | 0.36 |
MSIDA-Net | SemanticKITTI | 3.94 | 1.39 | 433.5 | 0.40 |
© 2022 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Shuang, F.; Li, P.; Li, Y.; Zhang, Z.; Li, X. MSIDA-Net: Point Cloud Semantic Segmentation via Multi-Spatial Information and Dual Adaptive Blocks. Remote Sens. 2022, 14, 2187. https://doi.org/10.3390/rs14092187