DOI: 10.1145/2723372.2737788
research-article

Optimizing Optimistic Concurrency Control for Tree-Structured, Log-Structured Databases

Published: 27 May 2015
    Abstract

    Scaling out a database system typically requires partitioning the database across multiple servers. If applications do not partition perfectly, then transactions accessing multiple partitions end up being distributed, which has well-known scalability challenges. To address them, we describe a high-performance transaction mechanism that uses optimistic concurrency control on a multi-versioned tree-structured database stored in a shared log. The system scales out by adding servers, without partitioning the database.
    Our solution is modeled on the Hyder architecture, published by Bernstein, Reid, and Das at CIDR 2011. We present the design and evaluation of the first full implementation of that architecture. The core of the system is a log roll-forward algorithm, called meld, that does optimistic concurrency control. Meld is inherently sequential and is therefore the main bottleneck. Our main algorithmic contributions are optimizations to meld that significantly increase transaction throughput. They use a pipelined design that parallelizes meld onto multiple threads. The slowest pipeline stage is much faster than the original meld algorithm, yielding a 3x improvement of system throughput over the original meld algorithm.
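
    The roll-forward idea can be illustrated with a minimal sketch. It is not the paper's meld algorithm: it assumes a flat key-value store in place of the multi-versioned tree, runs on a single thread with no pipelining, and the names Intention, Melder, and meld as used here are hypothetical. Each logged record carries the log position of the snapshot its transaction ran against, and a record commits only if nothing it read or wrote was changed by a record committed after that snapshot.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Intention:
    """One transaction's logged effects (hypothetical record layout)."""
    snapshot: int      # log position of the state the transaction executed against
    reads: frozenset   # keys the transaction read
    writes: dict       # key -> new value


@dataclass
class Melder:
    """Rolls the shared log forward, deciding commit or abort per record."""
    state: dict = field(default_factory=dict)    # last committed value per key
    version: dict = field(default_factory=dict)  # key -> log position of last committed write
    position: int = 0                            # number of log records processed so far

    def meld(self, intent: Intention) -> bool:
        """Process the next log record; return True if it commits."""
        self.position += 1
        touched = set(intent.reads) | set(intent.writes)
        # Conflict: some touched key was rewritten after the transaction's snapshot.
        if any(self.version.get(key, 0) > intent.snapshot for key in touched):
            return False
        for key, value in intent.writes.items():
            self.state[key] = value
            self.version[key] = self.position
        return True


# Two transactions start from the empty snapshot (position 0) and both update "x".
melder = Melder()
first = Intention(snapshot=0, reads=frozenset({"x"}), writes={"x": 1})
second = Intention(snapshot=0, reads=frozenset({"x"}), writes={"x": 2})
later = Intention(snapshot=1, reads=frozenset({"x"}), writes={"y": 3})
decisions = [melder.meld(t) for t in (first, second, later)]
```

    In this sketch the first record commits, the second aborts because "x" changed after its snapshot, and the third commits because it ran against the state that already included the first. The decision depends only on the contents and order of the log, so any server that processes the same log in the same order reaches the same outcome without coordinating with the others.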

    References

    [1]
    Adya, A., R. Gruber, B. Liskov, U. Maheshwari. Efficient optimistic concurrency control using loosely synchronized clocks. SIGMOD 1995, pp. 23--34.
    [2]
    Agrawal, D., A. Bernstein, P. Gupta, S. Sengupta. Distributed optimistic concurrency control with reduced rollback. Distributed Computing 2, 1 (1987), pp. 45--59.
    [3]
    Aguilera, M.K., W.M. Golab, and M.A. Shah: A practical scalable distributed B-tree. PVLDB 1(1): 598--609, 2008.
    [4]
    Balakrishnan, M., D. Malkhi, J.D. Davis, V. Prabhakaran, M. Wei, T. Wobber: CORFU: A distributed shared log. ACM Trans. Comput. Syst. 31(4): 10 (2013).
    [5]
    Balakrishnan, M., D. Malkhi, T. Wobber, M. Wu, V. Prabhakaran, M. Wei, J.D. Davis, S. Rao, T. Zou, A. Zuck: Tango: distributed data structures over a shared log. SOSP 2013, pp. 325--340.
    [6]
    Bernstein, P.A. and S. Das. Scaling optimistic concurrency control by approximately partitioning the certifier and log. IEEE Data Eng. Bull. 38, 1 (March 2015).
    [7]
    Bernstein, P.A., C.W. Reid, S. Das: Hyder - A transactional record manager for shared flash. CIDR 2011, pp. 9--20.
    [8]
    Bernstein, P.A., C.W. Reid, M. Wu, X. Yuan: Optimistic concurrency control by melding trees. PVLDB 4(11): 944--955 (2011).
    [9]
    Bernstein, P.A., D.W. Shipman, J.B. Rothnie Jr.: Concurrency Control in a System for Distributed Databases (SDD-1). ACM TODS 5(1): 18--51 (1980).
    [10]
    Cooper, B.F, A. Silberstein, E. Tam, R. Ramakrishnan, R. Sears: Benchmarking cloud serving systems with YCSB. SoCC 2010, pp. 143--154.
    [11]
    Elnikety, S., S. Dropsho, and F. Pedone. Tashkent: Uniting durability with transaction ordering for high-performance scalable database replication. EuroSys 2006, pp. 117--130.
    [12]
    Finlayson, R., and D. Cheriton. Log Files: An extended file service exploiting write-once storage. SOSP 1987, pp. 139--148.
    [13]
    Fredkin, E.: Trie memory. CACM, 3(9):490--499, 1960.
    [14]
    Gazagnaire, T. and V. Hanquez: OXenstored--An efficient hierarchical and transactional database using functional programming with reference cell comparisons. ICFP 2009, pp. 203--214.
    [15]
    Graefe, G. Write-optimized B-trees. VLDB 2004, pp. 672--683.
    [16]
    Gruber, R. E. Optimistic concurrency control for nested distributed transactions, 1989.
    [17]
    Guibas, L.J., and R. Sedgewick: A Dichromatic Framework for Balanced Trees. FOCS 1978, pp. 8--21.
    [18]
    Kapritsos, M., Y. Wang, V. Quema, A. Clement, L. Alvisi, M. Dahlin. All about Eve: Execute-verify replication for multicore servers. OSDI 2012, pp. 237--250.
    [19]
    Kung, H. T., and J.T. Robinson. On optimistic methods for concurrency control. ACM TODS 6, 2 (June 1981), 213--226.
    [20]
    Larson, P.-A., S. Blanas, C. Diaconu, C. Freedman, J.M. Patel, M. Zwilling. High-performance concurrency control mechanisms for main-memory databases. PVLDB 5, 4 (2011): 298--309.
    [21]
    Lausen, G. Concurrency control in database systems: A step towards the integration of optimistic methods and locking. ACM 1982, pp. 64--68.
    [22]
    Lee, S.-W., and B. Moon. Design of flash-based DBMS: An in-page logging approach. SIGMOD 2007, pp. 55--66.
    [23]
    Mu, S., Y. Cui, Y. Zhang, W. Lloyd, J. Li. Extracting more concurrency from distributed transactions. OSDI 2014, pp. 479--494.
    [24]
    Mullender, S. J., and A.S. Tanenbaum. A distributed file service based on optimistic concurrency control. SOSP 1985, pp. 51--62.
    [25]
    O'Neil, P.E., E. Cheng, D. Gawlick, E.J. O'Neil. The log-structured merge-tree (LSM-tree). Acta Inf. 33, 4 (June 1996): 351--385.
    [26]
    Phatak, S. and B.R. Badrinath, Bounded locking for optimistic concurrency control. Rutgers Univ., Dept. of Computer Science, Tech. Report #DCS-TR-380.
    [27]
    Rosenblum, M., and J.K. Ousterhout. The design and implementation of a log-structured file system. SOSP 1991, pp. 1--15.
    [28]
    Sears, R., and R. Ramakrishnan. bLSM: A general purpose log structured merge tree. SIGMOD 2012, pp. 217--228.
    [29]
    Sheth, A. P., and M.T. Liu. Integrating locking and optimistic concurrency control in distributed database systems. ICDCS 1986, pp. 89--99.
    [30]
    Thomasian, A., and E. Rahm. A new distributed optimistic concurrency control method and a comparison of its performance with two-phase locking. ICDCS 1990, pp. 294--301.
    [31]
    Thomson, A., T. Diamond, S-C Weng, K. Ren, P. Shao, D.J. Abadi: Calvin: fast distributed transactions for partitioned database systems. SIGMOD 2012, pp. 1--12.
    [32]
    Tokutek. http://www.tokutek.com/.
    [33]
    Wu, C.-H., L.-P. Chang, T.-W. Kuo. An efficient R-tree implementation over flash-memory storage systems. GIS 2003, pp. 17--24.

    Published In

    SIGMOD '15: Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
    May 2015
    2110 pages
    ISBN:9781450327589
    DOI:10.1145/2723372
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. optimistic concurrency control
    2. scale-out transaction processing

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'15: International Conference on Management of Data
    May 31 - June 4, 2015
    Melbourne, Victoria, Australia

    Acceptance Rates

    SIGMOD '15 Paper Acceptance Rate 106 of 415 submissions, 26%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%

    Cited By

    • (2023) OCC2T: An Early-Read Dual-Track OCC Algorithm For Mixed Mode Systems. Proceedings of the 38th ACM/SIGAPP Symposium on Applied Computing, pp. 321-330. DOI: 10.1145/3555776.3577757. Online publication date: 27-Mar-2023
    • (2022) Robustness Against Read Committed. Proceedings of the 41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pp. 1-14. DOI: 10.1145/3517804.3524162. Online publication date: 12-Jun-2022
    • (2022) Hihooi: A Database Replication Middleware for Scaling Transactional Databases Consistently. IEEE Transactions on Knowledge and Data Engineering 34:2, pp. 691-707. DOI: 10.1109/TKDE.2020.2987560. Online publication date: 1-Feb-2022
    • (2022) Transaction Processing on Modern Hardware. Online publication date: 26-Feb-2022
    • (2021) Robustness against read committed for transaction templates. Proceedings of the VLDB Endowment 14:11, pp. 2141-2153. DOI: 10.14778/3476249.3476268. Online publication date: 1-Jul-2021
    • (2021) Boki. Proceedings of the ACM SIGOPS 28th Symposium on Operating Systems Principles, pp. 691-707. DOI: 10.1145/3477132.3483541. Online publication date: 26-Oct-2021
    • (2020) Gossip-based visibility control for high-performance geo-distributed transactions. The VLDB Journal — The International Journal on Very Large Data Bases 30:1, pp. 93-114. DOI: 10.1007/s00778-020-00626-5. Online publication date: 21-Sep-2020
    • (2019) Transaction Processing on Modern Hardware. Synthesis Lectures on Data Management 14:2, pp. 1-138. DOI: 10.2200/S00896ED1V01Y201901DTM058. Online publication date: 8-Mar-2019
    • (2019) On supporting efficient snapshot isolation for hybrid workloads with multi-versioned indexes. Proceedings of the VLDB Endowment 13:2, pp. 211-225. DOI: 10.14778/3364324.3364334. Online publication date: 1-Oct-2019
    • (2019) Ocean vista. Proceedings of the VLDB Endowment 12:11, pp. 1471-1484. DOI: 10.14778/3342263.3342627. Online publication date: 1-Jul-2019