DOI: 10.1145/1559845.1559878
research-article

Self-organizing tuple reconstruction in column-stores

Published: 29 June 2009

Abstract

Column-stores have gained popularity as a promising physical design alternative. Each attribute of a relation is physically stored as a separate column, allowing queries to load only the required attributes. The overhead incurred is on-the-fly tuple reconstruction for multi-attribute queries. Each tuple reconstruction is a join of two columns based on tuple IDs, making it a significant cost component. The ultimate physical design is to have multiple presorted copies of each base table, such that tuples are already appropriately organized in multiple different orders across the various columns. This, however, requires the ability to predict the workload, idle time to prepare, and infrequent updates.
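To make the cost of tuple reconstruction concrete, the following is a minimal sketch, assuming a toy relation R(a, b) held as two Python lists in which the list position plays the role of the tuple ID. The names `col_a`, `col_b`, `select_ids`, and `reconstruct` are hypothetical and chosen only for illustration; they are not taken from the paper or from any column-store.

```python
# Hypothetical toy relation R(a, b) stored column-wise; list position = tuple ID.
col_a = [12, 3, 7, 25, 9]
col_b = ["u", "v", "w", "x", "y"]


def select_ids(column, low, high):
    """Return the tuple IDs whose value v satisfies low <= v < high."""
    return [tid for tid, v in enumerate(column) if low <= v < high]


def reconstruct(column, tuple_ids):
    """Fetch the values of another column for the qualifying tuple IDs.

    This positional lookup is the join on tuple IDs described in the abstract.
    """
    return [column[tid] for tid in tuple_ids]


ids = select_ids(col_a, 5, 13)      # selection on attribute a
b_values = reconstruct(col_b, ids)  # tuple reconstruction for attribute b
print(ids, b_values)
```

Every additional attribute in a query needs another lookup of this kind, which is why the order in which the columns are stored matters.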
In this paper, we propose a novel design, partial sideways cracking, that minimizes the tuple reconstruction cost in a self-organizing way. It achieves performance similar to using presorted data, but without requiring the heavy initial presorting step itself. Instead, it handles dynamic, unpredictable workloads with no idle time and frequent updates. Auxiliary dynamic data structures, called cracker maps, provide a direct mapping between pairs of attributes used together in queries for tuple reconstruction. A map is continuously physically reorganized as an integral part of query evaluation, providing faster and reduced data access for future queries. To enable flexible and self-organizing behavior in storage-limited environments, maps are materialized only partially as demanded by the workload. Each map is a collection of separate chunks that are individually reorganized, dropped or recreated as needed. We implemented partial sideways cracking in an open-source column-store. A detailed experimental analysis demonstrates that it brings significant performance benefits for multi-attribute queries.
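The idea of a map that is reorganized as a side effect of query evaluation can be sketched as follows. This is an illustrative toy, not the paper's implementation: the class `CrackerMap`, its methods, and the use of plain Python lists are assumptions made here, and the sketch omits the partial materialization into chunks, storage limits, and updates. It only shows pairs of two attributes being kept together and partitioned incrementally by the range predicates that queries happen to use.

```python
import bisect


class CrackerMap:
    """Toy map over (a, b) pairs, reorganized by range selections on a."""

    def __init__(self, col_a, col_b):
        self.pairs = list(zip(col_a, col_b))
        self.bounds = []  # sorted pivot values the map has been cracked on
        self.starts = []  # starts[i]: first position holding a >= bounds[i]

    def _crack(self, pivot):
        """Partition one piece around pivot and return the split position."""
        i = bisect.bisect_left(self.bounds, pivot)
        if i < len(self.bounds) and self.bounds[i] == pivot:
            return self.starts[i]  # already cracked on this value
        # Only the piece that can contain the pivot is touched.
        lo = self.starts[i - 1] if i > 0 else 0
        hi = self.starts[i] if i < len(self.bounds) else len(self.pairs)
        piece = self.pairs[lo:hi]
        left = [p for p in piece if p[0] < pivot]
        right = [p for p in piece if p[0] >= pivot]
        self.pairs[lo:hi] = left + right
        pos = lo + len(left)
        self.bounds.insert(i, pivot)
        self.starts.insert(i, pos)
        return pos

    def select_b(self, low, high):
        """Return b for tuples with low <= a < high (assumes low <= high)."""
        start = self._crack(low)
        end = self._crack(high)
        # Qualifying pairs are now contiguous, so b is read without a join.
        return [b for _, b in self.pairs[start:end]]


m = CrackerMap([12, 3, 7, 25, 9], ["u", "v", "w", "x", "y"])
first = m.select_b(5, 13)   # cracks the whole map on 5 and 13
second = m.select_b(7, 10)  # only re-partitions the piece between 5 and 13
print(first, second, m.bounds)
```

In this sketch the second query touches only the three pairs that fall between the earlier boundaries, which illustrates how earlier queries reduce data access for later ones.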


Published In

SIGMOD '09: Proceedings of the 2009 ACM SIGMOD International Conference on Management of data
June 2009
1168 pages
ISBN:9781605585512
DOI:10.1145/1559845
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. database cracking
  2. self-organization

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '09: International Conference on Management of Data
June 29 - July 2, 2009
Providence, Rhode Island, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Cited By

  • (2024) Optimizing the B+tree Index with Hotness Awareness and Adaptivity. Advanced Intelligent Computing Technology and Applications, pp. 356-367. DOI: 10.1007/978-981-97-5581-3_29. Online publication date: 1-Aug-2024
  • (2023) Adaptive Indexing in High-Dimensional Metric Spaces. Proceedings of the VLDB Endowment 16:10, pp. 2525-2537. DOI: 10.14778/3603581.3603592. Online publication date: 1-Jun-2023
  • (2023) Cracking-Like Join for Trusted Execution Environments. Proceedings of the VLDB Endowment 16:9, pp. 2330-2343. DOI: 10.14778/3598581.3598602. Online publication date: 10-Jul-2023
  • (2023) Adaptive Indexing of Objects with Spatial Extent. Proceedings of the VLDB Endowment 16:9, pp. 2248-2260. DOI: 10.14778/3598581.3598596. Online publication date: 1-May-2023
  • (2023) HAD B+-Tree: A Hotness-Aware Adaptive B+-Tree for SSD/HDD-Based Hybrid Storage Architecture. 2023 2nd International Conference on Sensing, Measurement, Communication and Internet of Things Technologies (SMC-IoT), pp. 91-95. DOI: 10.1109/SMC-IoT62253.2023.00024. Online publication date: 29-Dec-2023
  • (2023) Towards a Signature Based Compression Technique for Big Data Storage. 2023 IEEE 39th International Conference on Data Engineering Workshops (ICDEW), pp. 100-104. DOI: 10.1109/ICDEW58674.2023.00022. Online publication date: Apr-2023
  • (2022) 60 Years of Databases (part four). PROBLEMS IN PROGRAMMING, pp. 57-95. DOI: 10.15407/pp2022.02.057. Online publication date: Jun-2022
  • (2022) Tiresias. Proceedings of the VLDB Endowment 15:11, pp. 3126-3136. DOI: 10.14778/3551793.3551857. Online publication date: 29-Sep-2022
  • (2022) Proteus: Autonomous Adaptive Storage for Mixed Workloads. Proceedings of the 2022 International Conference on Management of Data, pp. 700-714. DOI: 10.1145/3514221.3517834. Online publication date: 10-Jun-2022
  • (2022) EgpuIP: An Embedded GPU Accelerated Library for Image Processing. 2022 IEEE 24th Int Conf on High Performance Computing & Communications; 8th Int Conf on Data Science & Systems; 20th Int Conf on Smart City; 8th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 914-921. DOI: 10.1109/HPCC-DSS-SmartCity-DependSys57074.2022.00147. Online publication date: Dec-2022
