DOI: 10.1145/2791347.2791374

The hyperdyadic index and generalized indexing and query with PIQUE

Published: 29 June 2015

Abstract

Many scientists rely on indexing and query to identify trends and anomalies within extreme-scale scientific data. Compressed bitmap indexing (e.g., FastBit) is the go-to indexing method for many scientific datasets and query workloads. Recently, the ALACRITY compressed inverted index was shown to be a viable alternative. Notably, though FastBit and ALACRITY employ very different data structures (bitmap vs. inverted list) and binning methods (decimal-precision vs. bit-wise), close examination reveals marked similarities in index structure.
Motivated by this observation, we ask two questions. First, "Can we generalize FastBit and ALACRITY to an index model encompassing both?" And second, if so, "Can such a generalized framework enable other, new indexing methods?" This paper answers both questions in the affirmative.
First, we present PIQUE, a Parallel Indexing and Query Unified Engine, based on formal mathematical decomposition of the indexing process. PIQUE factors out commonalities in indexing, employing algorithmic/data structure "plugins" to mix orthogonal indexing concepts such as FastBit compressed bitmaps with ALACRITY binning, all within one framework.
Second, we define the hyperdyadic tree index, distinct from both bitmap and inverted indexes, demonstrating good index compression while maintaining high query performance. We implement the hyperdyadic tree index within PIQUE, reinforcing our unified indexing model.
We conduct a performance study of the hyperdyadic tree index vs. WAH compressed bitmaps, both within PIQUE and compared to FastBit, a state-of-the-art bitmap index system. The hyperdyadic tree index shows a 1.14-1.90x storage reduction vs. compressed bitmaps, with comparable or better query performance under most scenarios tested.
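
The structural similarity the abstract points to can be illustrated with a minimal sketch. It is not PIQUE, FastBit, or ALACRITY code: every function name below is hypothetical, no compression is applied, and the two binning functions are rough stand-ins for decimal-style and bit-wise binning. The sketch assumes only that an index maps each bin to the set of rows whose values fall in that bin, so the binning step and the per-bin representation (bitmap or inverted list) can be chosen independently.

```python
import math
import struct
from typing import Callable, Dict, List, Sequence

Binning = Callable[[float], int]


def width_bin(width: float) -> Binning:
    """Hypothetical fixed-width binning: bin key is floor(value / width)."""
    return lambda v: math.floor(v / width)


def bitwise_bin(keep_bits: int) -> Binning:
    """Hypothetical bit-wise binning: keep the top `keep_bits` bits of the
    IEEE-754 double representation (sign, exponent, leading mantissa bits)."""
    def bin_of(v: float) -> int:
        raw = int.from_bytes(struct.pack(">d", v), "big")
        return raw >> (64 - keep_bits)
    return bin_of


def build_inverted(values: Sequence[float], bin_of: Binning) -> Dict[int, List[int]]:
    """Inverted-list form: each bin maps to an ascending list of row IDs."""
    index: Dict[int, List[int]] = {}
    for row, v in enumerate(values):
        index.setdefault(bin_of(v), []).append(row)
    return index


def build_bitmaps(values: Sequence[float], bin_of: Binning) -> Dict[int, int]:
    """Bitmap form: each bin maps to a bit vector (a Python int here);
    bit r is set when row r falls in the bin."""
    index: Dict[int, int] = {}
    for row, v in enumerate(values):
        b = bin_of(v)
        index[b] = index.get(b, 0) | (1 << row)
    return index


def query_inverted(index: Dict[int, List[int]], lo: int, hi: int) -> List[int]:
    """Rows whose bin key lies in [lo, hi]: union of the matching lists."""
    rows: List[int] = []
    for b, ids in index.items():
        if lo <= b <= hi:
            rows.extend(ids)
    return sorted(rows)


def query_bitmaps(index: Dict[int, int], lo: int, hi: int) -> int:
    """Rows whose bin key lies in [lo, hi]: bitwise OR of the matching bitmaps."""
    result = 0
    for b, bits in index.items():
        if lo <= b <= hi:
            result |= bits
    return result


def bitmap_to_rows(bits: int) -> List[int]:
    """Decode a bit vector into an ascending list of row IDs."""
    rows, row = [], 0
    while bits:
        if bits & 1:
            rows.append(row)
        bits >>= 1
        row += 1
    return rows


if __name__ == "__main__":
    values = [0.5, 1.2, 1.9, 3.4, 0.1, 2.7]
    binning = width_bin(1.0)
    inv = build_inverted(values, binning)
    bm = build_bitmaps(values, binning)
    # Both representations answer the same bin-range query with the same rows.
    print(query_inverted(inv, 1, 2))
    print(bitmap_to_rows(query_bitmaps(bm, 1, 2)))
```

Swapping `width_bin(1.0)` for `bitwise_bin(12)` changes only how values are grouped; both index builders and both query routines are unaffected, which is the kind of orthogonality the plugin design in the abstract describes.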

Cited By

  • (2020) Parallel Query Service for Object-centric Data Management Systems. 2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), pages 406-415. DOI: 10.1109/IPDPSW50202.2020.00076. Online publication date: May-2020.
  • (2016) AMR-aware in situ indexing and scalable querying. Proceedings of the 24th High Performance Computing Symposium, pages 1-9. DOI: 10.22360/SpringSim.2016.HPC.012. Online publication date: 3-Apr-2016.

Published In

SSDBM '15: Proceedings of the 27th International Conference on Scientific and Statistical Database Management
June 2015
390 pages
ISBN: 9781450337090
DOI: 10.1145/2791347
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Conference

SSDBM 2015

Acceptance Rates

Overall Acceptance Rate 56 of 146 submissions, 38%

Article Metrics

  • Downloads (last 12 months): 56
  • Downloads (last 6 weeks): 9
Reflects downloads up to 30 Aug 2024.
