Deep-Learning-Based Point Cloud Analysis II

Gao, Wei; Li, Ge

doi:10.1007/978-981-97-9570-3_6

Abstract

The emergence of advanced 3D sensing technologies, such as LiDAR, has significantly increased the availability of point cloud data, driving the need for robust analytics through deep learning. Point clouds, with their detailed spatiotemporal structures, are vital across numerous applications, requiring innovative approaches for effective interpretation and utilization. This chapter delves into the intersection of deep learning and point cloud analytics, covering essential tasks like point classification and semantic segmentation. It then explores place recognition, object retrieval, and registration, emphasizing their importance in interpreting dynamic environments. This chapter concludes with an examination of multimodal analysis, showcasing the synergistic potential of integrating point cloud data with other data modalities. Each section systematically unpacks the problems, general solution strategies, seminal contributions, and emerging trends, encapsulating the state-of-the-art in deep-learning-based point cloud analytics and paving the way for future advancements in the field.


Author information

Authors and Affiliations

School of ECE, Peking University, Shenzhen, China
Wei Gao & Ge Li

About this chapter

Cite this chapter

Gao, W., Li, G. (2025). Deep-Learning-Based Point Cloud Analysis II. In: Deep Learning for 3D Point Clouds. Springer, Singapore. https://doi.org/10.1007/978-981-97-9570-3_6

DOI: https://doi.org/10.1007/978-981-97-9570-3_6
Published: 10 October 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-9569-7
Online ISBN: 978-981-97-9570-3
eBook Packages: Computer Science, Computer Science (R0)
