DOI: 10.1145/3552457.3555731
research-article

Transformer and Upsampling-Based Point Cloud Compression

Published: 10 October 2022 Publication History

Abstract

Learning-based point cloud compression has exhibited superior coding performance over traditional methods such as MPEG G-PCC. Because conventional point cloud representation formats (e.g., octree or voxel) introduce additional errors that degrade reconstruction quality, we use the point-based representation directly and develop a framework that leverages transformer and upsampling techniques for point cloud compression. To extract latent features that characterize an input point cloud well, we build an end-to-end learning framework: at the encoder side, we leverage cascading transformers to extract and enhance useful features for entropy coding; at the decoder side, in addition to the transformers, an upsampling module that uses both coordinates and features is devised to reconstruct the point cloud progressively. Experimental results demonstrate that the proposed method achieves the best coding performance among state-of-the-art point-based methods, e.g., gains of more than 1 dB in D1 and D2 PSNR at a bitrate of 0.10 bpp, along with more visually pleasing reconstructions. Extensive ablation studies also confirm the effectiveness of the transformer and upsampling modules.
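The D1 PSNR and bpp figures quoted above can be made concrete with a small sketch. It is only an illustration of a point-to-point (D1) error and a bits-per-point rate, not the evaluation code used in the paper: the function names are hypothetical, the brute-force nearest-neighbour search is chosen for brevity, and the PSNR peak convention (here simply `peak ** 2 / MSE`) is an assumption that differs between evaluation tools.

```python
import math

def nearest_sq_dist(p, cloud):
    """Squared Euclidean distance from point p to its nearest neighbour in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)

def d1_mse(src, dst):
    """Mean squared point-to-point error from every point in src to dst."""
    return sum(nearest_sq_dist(p, dst) for p in src) / len(src)

def d1_psnr(original, reconstructed, peak):
    """Symmetric D1 PSNR in dB.

    `peak` is an assumed signal peak (for example the bounding-box size);
    the exact convention is not given in the abstract.
    """
    # Use the worse of the two directions so missing and spurious points both count.
    mse = max(d1_mse(original, reconstructed), d1_mse(reconstructed, original))
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def bits_per_point(num_bits, num_input_points):
    """Bitrate in bits per input point (bpp)."""
    return num_bits / num_input_points

if __name__ == "__main__":
    original = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    reconstructed = [(0.0, 0.0, 0.1), (1.0, 0.0, 0.0)]
    print(d1_psnr(original, reconstructed, peak=1.0))
    print(bits_per_point(2048, 20480))
```

D2 (point-to-plane) PSNR additionally projects each error vector onto the surface normal, which requires normals and is omitted here.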

Supplementary Material

MP4 File (APCCPA22-apccpa08.mp4)
Learning-based point cloud geometry compression (PCGC) has received extensive attention. Existing methods fall into two categories: octree/voxel-based methods and point-based methods. A typical octree-based example is MPEG G-PCC, which mainly handles large-scale point clouds. Point-based methods generally employ PointNet to exploit point correlations and are widely used for small-scale point clouds. In this work, we also focus on the point-based approach. We construct an end-to-end framework: the encoder first downsamples the point cloud and captures its multi-scale features, then enhances them for entropy encoding; the decoder follows the reverse procedure, enhancing the received features for coordinate reconstruction. Furthermore, we embed a Transformer in the feature enhancement process of both the encoder and the decoder to aggregate and emphasize valuable features. Experimental results show that our method greatly outperforms state-of-the-art PCGC methods, e.g., by an average of 39% D1 BDBR and 43% D2 BDBR.
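As a rough illustration of how a Transformer can "aggregate and emphasize" per-point features, the sketch below applies single-head scaled dot-product self-attention to a small list of feature vectors. It is not the authors' module: the function names are hypothetical, and the query, key, and value projections are assumed to be the identity so that the example stays short. Learned projections, multiple heads, positional encodings, and residual connections are all left out.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(features):
    """Single-head scaled dot-product self-attention.

    Assumption: identity Q/K/V projections. Each output row is a weighted
    average of all input rows, weighted by similarity to the query point.
    """
    d = len(features[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for q in features:
        # Similarity of this point's feature to every point's feature.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in features]
        weights = softmax(scores)
        # Aggregate all features using the attention weights.
        out.append([sum(w * v[j] for w, v in zip(weights, features))
                    for j in range(d)])
    return out

if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    for row in self_attention(feats):
        print(row)
```

Each output feature mixes information from every point, with more weight given to points whose features are similar to the query point's.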

    Published In

    APCCPA '22: Proceedings of the 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis
    October 2022
    72 pages
    ISBN:9781450394918
    DOI:10.1145/3552457
    • General Chairs:
    • Wei Gao,
    • Ge Li,
    • Hui Yuan,
    • Raouf Hamzaoui,
    • Zhu Li,
    • Shan Liu

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. deep neural network
    2. geometric information
    3. point cloud compression
    4. transformer

    Qualifiers

    • Research-article

    Conference

    MM '22

    Acceptance Rates

    APCCPA '22 Paper Acceptance Rate 8 of 8 submissions, 100%;
    Overall Acceptance Rate 8 of 8 submissions, 100%

    Cited By

    • (2025) A Multiple Compression Approach using Attribute-based Signatures. Open Research Europe 5(49). DOI: 10.12688/openreseurope.19247.1. Online publication date: 10-Feb-2025
    • (2025) A Versatile Point Cloud Compressor Using Universal Multiscale Conditional Coding – Part I: Geometry. IEEE Transactions on Pattern Analysis and Machine Intelligence 47:1 (269-287). DOI: 10.1109/TPAMI.2024.3462938. Online publication date: Jan-2025
    • (2024) Pointsoup. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (5380-5388). DOI: 10.24963/ijcai.2024/595. Online publication date: 3-Aug-2024
    • (2024) Encoding auxiliary information to restore compressed point cloud geometry. Proceedings of the Thirty-Third International Joint Conference on Artificial Intelligence (2189-2197). DOI: 10.24963/ijcai.2024/242. Online publication date: 3-Aug-2024
    • (2024) Octree-Retention Fusion: A High-Performance Context Model for Point Cloud Geometry Compression. Proceedings of the 2024 International Conference on Multimedia Retrieval (1150-1154). DOI: 10.1145/3652583.3657620. Online publication date: 30-May-2024
    • (2024) GRNet: Geometry Restoration for G-PCC Compressed Point Clouds Using Auxiliary Density Signaling. IEEE Transactions on Visualization and Computer Graphics 30:10 (6740-6753). DOI: 10.1109/TVCG.2023.3336936. Online publication date: Oct-2024
    • (2024) Wireless Point Cloud Transmission. 2024 IEEE 25th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC) (851-855). DOI: 10.1109/SPAWC60668.2024.10694621. Online publication date: 10-Sep-2024
    • (2024) Spatial-temporal Semantic Communications for Point Cloud-based Volumetric Media. ICC 2024 - IEEE International Conference on Communications (4704-4710). DOI: 10.1109/ICC51166.2024.10622850. Online publication date: 9-Jun-2024
    • (2024) NeRI: Implicit Neural Representation of LiDAR Point Cloud Using Range Image Sequence. ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (8020-8024). DOI: 10.1109/ICASSP48485.2024.10446596. Online publication date: 14-Apr-2024
    • (2024) A Density-Aware Point Cloud Geometry Compression Leveraging Cluster-Centric Processing. IEEE Access 12 (81441-81452). DOI: 10.1109/ACCESS.2024.3411029. Online publication date: 2024