Dictionary Compression in Point Cloud Data Management

Published: 06 June 2019

Abstract

Nowadays, massive amounts of point cloud data can be collected thanks to advances in data acquisition and processing technologies such as dense image matching and airborne LiDAR scanning. With the increase in volume and precision, point cloud data offers a useful source of information for natural-resource management, urban planning, self-driving cars, and more. At the same time, the scale at which point cloud data is produced introduces management challenges: it is important to achieve efficiency in terms of both query performance and space requirements. Traditional file-based solutions to point cloud management offer space efficiency; however, they can neither scale to such massive data nor provide the declarative power of a DBMS.
In this article, we propose a time- and space-efficient solution for storing and managing point cloud data in a main-memory column-store DBMS. Our solution, Space-Filling Curve Dictionary-Based Compression (SFC-DBC), applies dictionary-based compression to the spatial data management domain and enhances it with indexing capabilities by using space-filling curves. SFC-DBC does so by constructing the space-filling curve over a compressed, artificially introduced dictionary space. Consequently, SFC-DBC significantly speeds up query execution yet requires no additional storage resources compared to traditional dictionary-based compression. Compared to space-filling-curve-based approaches, it minimizes the storage footprint and increases resilience to skew. As a proof of concept, we develop and evaluate our approach as a research prototype in the context of SAP HANA. SFC-DBC outperforms other dictionary-based compression schemes by up to 61% in terms of space and up to 9.4× in terms of query performance.
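
The idea of building a space-filling curve over dictionary space rather than over raw coordinates can be illustrated with a minimal sketch. It is not the SFC-DBC prototype: the abstract does not state which curve is used or how the dictionaries are laid out, so the Z-order (Morton) curve, the per-column sorted dictionaries, and all function names below are assumptions made for illustration only.

```python
from bisect import bisect_left


def build_dictionary(values):
    """Sorted list of distinct values; a value's ID is its position (assumed layout)."""
    return sorted(set(values))


def encode(dictionary, value):
    """Return the value ID, i.e. the position of `value` in the sorted dictionary."""
    idx = bisect_left(dictionary, value)
    if idx == len(dictionary) or dictionary[idx] != value:
        raise KeyError(value)
    return idx


def interleave_bits(ids, bits):
    """Z-order (Morton) code: interleave the low `bits` bits of each value ID."""
    code = 0
    for bit in range(bits):
        for dim, value_id in enumerate(ids):
            code |= ((value_id >> bit) & 1) << (bit * len(ids) + dim)
    return code


def sfc_keys(points):
    """Dictionary-encode each coordinate column, then compute one curve key per point.

    The curve is built over the dense value IDs, not over the raw coordinates,
    so the key width depends on the number of distinct values per column.
    """
    if not points:
        return [], []
    columns = list(zip(*points))
    dictionaries = [build_dictionary(col) for col in columns]
    # Bits needed to represent the largest value ID in any column (at least 1).
    bits = max(1, max((len(d) - 1).bit_length() for d in dictionaries))
    keys = [
        interleave_bits([encode(d, v) for d, v in zip(dictionaries, p)], bits)
        for p in points
    ]
    return dictionaries, keys


points = [(10.5, 200.0), (10.5, 300.0), (99.9, 200.0), (99.9, 300.0)]
dictionaries, keys = sfc_keys(points)
```

Because the value IDs are dense integers, two distinct values per column need only one bit per dimension here, regardless of how far apart the raw coordinates lie. This hints at why a curve over dictionary space can be less sensitive to skewed coordinate distributions, though the sketch makes no claim about the paper's measured results.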

Published In

ACM Transactions on Spatial Algorithms and Systems, Volume 5, Issue 1
Special Issue on SIGSPATIAL 2017
March 2019
146 pages
ISSN:2374-0353
EISSN:2374-0361
DOI:10.1145/3336122
Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 June 2019
Accepted: 01 December 2018
Revised: 01 September 2018
Received: 01 April 2018
Published in TSAS Volume 5, Issue 1

Author Tags

  1. Point cloud
  2. data compression
  3. multidimensional data access methods
  4. spatial data management

Qualifiers

  • Research-article
  • Research
  • Refereed

Funding Sources

  • European Research Council (ERC)
  • EU FP7 programme (ERC-2013-CoG)
