
SolarDB: Toward a Shared-Everything Database on Distributed Log-Structured Storage

Published: 25 June 2019

Abstract

Efficient transaction processing over large databases is a key requirement for many mission-critical applications. Although modern databases have achieved good performance through horizontal partitioning, their performance deteriorates when cross-partition distributed transactions have to be executed. This article presents SolarDB, a distributed relational database system that has been successfully tested at a large commercial bank. The key features of SolarDB include (1) a shared-everything architecture based on a two-layer log-structured merge-tree; (2) a new concurrency control algorithm that works with the log-structured storage, which ensures efficient and non-blocking transaction processing even when the storage layer is compacting data among nodes in the background; and (3) fine-grained data access to effectively minimize and balance network communication within the cluster. According to our empirical evaluations on TPC-C, Smallbank, and a real-world workload, SolarDB outperforms the existing shared-nothing systems by up to 50x when there are close to or more than 5% distributed transactions.

Information & Contributors

Information

Published In

ACM Transactions on Storage, Volume 15, Issue 2
Systor 2018 Special Section on ATC 2018, Special Section on OSDI 2018 and Regular Papers
May 2019
187 pages
ISSN: 1553-3077
EISSN: 1553-3093
DOI: 10.1145/3326597
Editor: Sam H. Noh
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 25 June 2019
Accepted: 01 March 2019
Received: 01 January 2019
Published in TOS Volume 15, Issue 2

Author Tags

  1. Shared-everything architecture
  2. concurrency control
  3. log-structured storage

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 243
  • Downloads (Last 6 weeks): 24
Reflects downloads up to 01 Nov 2024

Cited By

  • (2023) A Model and Survey of Distributed Data-Intensive Systems. ACM Computing Surveys 56:1 (1-69). DOI: 10.1145/3604801. Online publication date: 26-Aug-2023
  • (2022) Halo: A Hybrid PMem-DRAM Persistent Hash Index with Fast Recovery. Proceedings of the 2022 International Conference on Management of Data (1049-1063). DOI: 10.1145/3514221.3517884. Online publication date: 10-Jun-2022
  • (2022) p2KVS. Proceedings of the Seventeenth European Conference on Computer Systems (575-591). DOI: 10.1145/3492321.3519567. Online publication date: 28-Mar-2022
  • (2022) PolarDB-X: An Elastic Distributed Relational Database for Cloud-Native Applications. 2022 IEEE 38th International Conference on Data Engineering (ICDE) (2859-2872). DOI: 10.1109/ICDE53745.2022.00259. Online publication date: May-2022
  • (2021) Dependence-Cognizant Locking Improvement for the Main Memory Database Systems. Mathematical Problems in Engineering 2021 (1-12). DOI: 10.1155/2021/6654461. Online publication date: 20-Feb-2021
  • (2021) Nova-LSM: A Distributed, Component-based LSM-tree Key-value Store. Proceedings of the 2021 International Conference on Management of Data (749-763). DOI: 10.1145/3448016.3457297. Online publication date: 9-Jun-2021
  • (2021) Continuously Bulk Loading over Range Partitioned Tables for Large Scale Historical Data. 2021 IEEE 37th International Conference on Data Engineering (ICDE) (960-971). DOI: 10.1109/ICDE51399.2021.00088. Online publication date: Apr-2021
