research-article

A Dynamic 3D Point Cloud Dataset for Immersive Applications

Authors:

Chun-Ying Huang,

Cheng-Hsin Hsu

MMSys '23: Proceedings of the 14th Conference on ACM Multimedia Systems

Pages 376 - 383

https://doi.org/10.1145/3587819.3592546

Published: 08 June 2023

Abstract

Motion estimation in a 3D point cloud sequence is a fundamental operation with many applications, including compression, error concealment, and temporal upscaling. While there have been multiple research contributions toward estimating the motion vector of points between frames, there is a lack of a dynamic 3D point cloud dataset with motion ground truth to benchmark against. In this paper, we present an open dynamic 3D point cloud dataset to fill this gap. Our dataset consists of synthetically generated objects with pre-determined motion patterns, allowing us to generate the motion vectors for the points. Our dataset contains nine objects in three categories (shape, avatar, and textile) with different animation patterns. We also provide semantic segmentation of each avatar object in the dataset. Our dataset can be used by researchers who need temporal information across frames. As an example, we present an evaluation of two motion estimation methods using our dataset.
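
The abstract's central idea is that a known, pre-determined motion pattern yields exact per-point motion vectors to benchmark an estimator against. The following is a minimal sketch of that idea, not the authors' pipeline. It assumes point i in one frame corresponds to point i in the next. The rotation pattern, the nearest-neighbor baseline, the end-point error metric, and all function names are hypothetical choices made for illustration.

```python
import math

def rotate_z(points, angle):
    """Apply a known motion pattern: rotate (x, y, z) points about the z-axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in points]

def ground_truth_motion(frame_a, frame_b):
    """Exact motion vectors, assuming index i is the same point in both frames."""
    return [tuple(b - a for a, b in zip(p, q)) for p, q in zip(frame_a, frame_b)]

def nearest_neighbor_motion(frame_a, frame_b):
    """Naive baseline: assume each point moved to its closest point in the next frame."""
    estimates = []
    for p in frame_a:
        q = min(frame_b, key=lambda candidate: math.dist(p, candidate))
        estimates.append(tuple(b - a for a, b in zip(p, q)))
    return estimates

def mean_endpoint_error(estimated, truth):
    """Average Euclidean distance between estimated and true motion vectors."""
    return sum(math.dist(e, t) for e, t in zip(estimated, truth)) / len(truth)

def evaluate(angle_degrees):
    """Benchmark the baseline on a four-point ring rotated by a known angle."""
    frame0 = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0), (0.0, -1.0, 0.0)]
    frame1 = rotate_z(frame0, math.radians(angle_degrees))
    truth = ground_truth_motion(frame0, frame1)
    estimate = nearest_neighbor_motion(frame0, frame1)
    return mean_endpoint_error(estimate, truth)

if __name__ == "__main__":
    for angle in (10, 60):
        print(f"rotation {angle} deg: mean end-point error = {evaluate(angle):.3f}")
```

In this toy setup the baseline matches the ground truth for a small rotation and fails for a large one, where the closest point in the next frame is no longer the same point. Errors of this kind can only be measured when motion ground truth is available.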

Cited By

Zhu M, Sun Y, Li N, Zhou J, Chen S, Hsu C, Liu Y. (2024). Dynamic 6-DoF Volumetric Video Generation: Software Toolkit and Dataset. 2024 IEEE 26th International Workshop on Multimedia Signal Processing (MMSP), 1-6. Online publication date: 2-Oct-2024.
https://doi.org/10.1109/MMSP61759.2024.10743552

Published In

MMSys '23: Proceedings of the 14th ACM Multimedia Systems Conference

June 2023

495 pages

ISBN:9798400701481

DOI:10.1145/3587819

Copyright © 2023 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMM: ACM Special Interest Group on Multimedia

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

MMSys '23

Sponsor:

SIGMM

MMSys '23: 14th Conference on ACM Multimedia Systems

June 7 - 10, 2023

Vancouver, BC, Canada

Acceptance Rates

Overall Acceptance Rate 176 of 530 submissions, 33%

Article Metrics

Total Citations: 1
Total Downloads: 274
Downloads (Last 12 months): 108
Downloads (Last 6 weeks): 14

Reflects downloads up to 16 Feb 2025
