DOI: 10.1145/3318464.3389724

Chiller: Contention-centric Transaction Execution and Data Partitioning for Modern Networks

Published: 31 May 2020

    Abstract

    Distributed transactions on high-overhead TCP/IP-based networks were conventionally considered to be prohibitively expensive and thus were avoided at all costs. To that end, the primary goal of almost any existing partitioning scheme is to minimize the number of cross-partition transactions. However, with the new generation of fast RDMA-enabled networks, this assumption is no longer valid. In fact, recent work has shown that distributed databases can scale even when the majority of transactions are cross-partition. In this paper, we first make the case that the new bottleneck which hinders truly scalable transaction processing in modern RDMA-enabled databases is data contention, and that optimizing for data contention leads to different partitioning layouts than optimizing for the number of distributed transactions. We then present Chiller, a new approach to data partitioning and transaction execution, which aims to minimize data contention for both local and distributed transactions. Finally, we evaluate Chiller using various workloads, and show that our partitioning and execution strategy outperforms traditional partitioning techniques which try to avoid distributed transactions, by up to a factor of 2.
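
The claim that the two objectives can prefer different layouts can be illustrated with a toy example. The sketch below is not Chiller's partitioning algorithm or cost model. The workload, the set of hot records, the two layouts, and the "hot-split" count (transactions whose hot records sit on more than one partition) are all assumptions made up for illustration; the hot-split count is only a crude stand-in for contention.

```python
from typing import Dict, List, Set

Layout = Dict[str, int]  # record key -> partition id


def distributed_txns(txns: List[List[str]], layout: Layout) -> int:
    """Count transactions whose records live on more than one partition."""
    return sum(1 for t in txns if len({layout[k] for k in t}) > 1)


def hot_split_txns(txns: List[List[str]], layout: Layout, hot: Set[str]) -> int:
    """Crude contention proxy (an assumption, not the paper's metric):
    count transactions whose *hot* records span more than one partition."""
    return sum(1 for t in txns if len({layout[k] for k in t if k in hot}) > 1)


# Hypothetical workload: h1 and h2 are frequently updated ("hot") records.
HOT = {"h1", "h2"}
WORKLOAD = [
    ["h1", "a", "b"],
    ["h2", "c", "d"],
    ["h1", "h2"],
]

# Layout X keeps each hot record next to the cold records it is used with.
LAYOUT_X = {"h1": 0, "a": 0, "b": 0, "h2": 1, "c": 1, "d": 1}
# Layout Y co-locates the two hot records instead (same partition sizes).
LAYOUT_Y = {"h1": 0, "h2": 0, "a": 0, "b": 1, "c": 1, "d": 1}

if __name__ == "__main__":
    for name, layout in (("X", LAYOUT_X), ("Y", LAYOUT_Y)):
        print(
            name,
            "distributed:", distributed_txns(WORKLOAD, layout),
            "hot-split:", hot_split_txns(WORKLOAD, layout, HOT),
        )
```

In this toy setup, layout X has fewer distributed transactions, while layout Y has fewer transactions with hot records split across partitions, so the two objectives rank the layouts in opposite order.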

    Supplementary Material

    MP4 File (3318464.3389724.mp4)
    Presentation Video

      Published In

      SIGMOD '20: Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data
      June 2020
      2925 pages
      ISBN:9781450367356
      DOI:10.1145/3318464

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. RDMA
      2. data partitioning
      3. distributed transactions

      Qualifiers

      • Research-article

      Funding Sources

      • Google
      • NSF CAREER Award
      • Microsoft
      • Intel
      • Mellanox
      • German Research Foundation (DFG)
      • DOE Early Career Award
      • Huawei
      • NSF IIS Career Award

      Conference

      SIGMOD/PODS '20

      Acceptance Rates

      Overall Acceptance Rate 785 of 4,003 submissions, 20%
