
Fast updates on read-optimized databases using multi-core CPUs

Published: 01 September 2011 Publication History

Abstract

Read-optimized columnar databases use differential updates to handle writes: they maintain a separate write-optimized delta partition, which is periodically merged with the read-optimized, compressed main partition. This merge process introduces significant overhead and unacceptable downtime in update-intensive systems that aspire to combine transactional and analytical workloads in one system.
In the first part of the paper, we report data analyses of 12 SAP Business Suite customer systems. In the second part, we present an optimized merge process that reduces the merge overhead of current systems by a factor of 30. Our linear-time merge algorithm exploits the high compute and bandwidth resources of modern multi-core CPUs through architecture-aware optimizations and efficient parallelization. This enables compressed in-memory column stores to handle the transactional update rate required by enterprise applications while retaining the properties of read-optimized databases for analytic-style queries.
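
To make the delta/main merge concrete, the following is a minimal single-threaded sketch and not the paper's algorithm. It assumes the main partition of one column is dictionary-compressed (a sorted dictionary plus one value ID per row) and that the delta holds uncompressed values. The function name `merge_column` and all variable names are hypothetical. The pass over the two sorted dictionaries is linear. Building the delta dictionary here uses a plain sort for brevity, and the parallel and architecture-aware optimizations mentioned in the abstract are not shown.

```python
def merge_column(main_dict, main_ids, delta_values):
    """Merge an uncompressed delta into a dictionary-encoded main column.

    main_dict    -- sorted list of distinct values (assumed layout)
    main_ids     -- one index into main_dict per row of the main partition
    delta_values -- raw values written since the last merge, in insert order
    Returns (new_dict, new_ids) for the merged main partition.
    """
    # Distinct delta values in sorted order (plain sort, for brevity).
    delta_dict = sorted(set(delta_values))

    # Step 1: merge the two sorted dictionaries in a single pass and record
    # where every old entry ends up in the new dictionary.
    new_dict = []
    main_remap = [0] * len(main_dict)   # old main ID -> new ID
    delta_remap = {}                    # delta value -> new ID
    i = j = 0
    while i < len(main_dict) or j < len(delta_dict):
        if j == len(delta_dict) or (
            i < len(main_dict) and main_dict[i] < delta_dict[j]
        ):
            value = main_dict[i]
        else:
            value = delta_dict[j]
        new_id = len(new_dict)
        new_dict.append(value)
        # Advance whichever side(s) contributed this value; a value present
        # in both dictionaries is stored only once.
        if i < len(main_dict) and main_dict[i] == value:
            main_remap[i] = new_id
            i += 1
        if j < len(delta_dict) and delta_dict[j] == value:
            delta_remap[value] = new_id
            j += 1

    # Step 2: rewrite the old value IDs and append the encoded delta rows.
    new_ids = [main_remap[old_id] for old_id in main_ids]
    new_ids.extend(delta_remap[v] for v in delta_values)
    return new_dict, new_ids


if __name__ == "__main__":
    new_dict, new_ids = merge_column(
        ["apple", "cherry"], [0, 1, 1, 0], ["banana", "cherry", "date"]
    )
    print(new_dict)
    print(new_ids)
```

After the merge the delta is empty again and every row, old or new, is addressed through the single sorted dictionary, which is the property that keeps the main partition read-optimized.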


Published In

Proceedings of the VLDB Endowment  Volume 5, Issue 1
September 2011
84 pages

Publisher

VLDB Endowment


Qualifiers

  • Research-article


