DOI: 10.1145/3297858.3304040
research-article

MV-RLU: Scaling Read-Log-Update with Multi-Versioning

Published: 04 April 2019

Abstract

This paper presents multi-version read-log-update (MV-RLU), an extension of the read-log-update (RLU) synchronization mechanism. While RLU has many merits, including an intuitive programming model and excellent performance for read-mostly workloads, we observed that its performance drops significantly in workloads with more write operations. The core problem is that RLU manages only two versions. To overcome this limitation, we extend RLU to support multi-versioning and propose new techniques to make multi-versioning efficient. At the core of the MV-RLU design is concurrent autonomous garbage collection, which prevents the reclamation of invisible versions from becoming a bottleneck and reduces version traversal overhead, the main overhead of a multi-version design. We extensively evaluate MV-RLU against state-of-the-art synchronization mechanisms, including RCU, RLU, software transactional memory (STM), and lock-free approaches, on concurrent data structures and real-world applications (database concurrency control and an in-memory key-value store). Our evaluation results show that MV-RLU significantly outperforms other techniques for a wide range of workloads with varying contention levels and data-set sizes.
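
To make the terms in the abstract concrete, the following is a minimal, single-threaded sketch of a generic multi-version object. It is not the MV-RLU implementation, which the abstract does not detail. The names `Version`, `VersionedObject`, `commit_ts`, `snapshot_ts`, and `collect` are hypothetical and chosen only for illustration. The sketch shows two things the abstract mentions: a reader may have to traverse a chain of versions to find the one visible to it, and versions that no active reader can see ("invisible" versions) are the ones that can be reclaimed.

```python
from dataclasses import dataclass
from typing import Any, Optional, Tuple


@dataclass
class Version:
    """One committed copy of an object, linked to the next-older copy."""
    commit_ts: int
    value: Any
    older: Optional["Version"] = None


class VersionedObject:
    """Hypothetical multi-version object: a newest-first chain of versions."""

    def __init__(self, value: Any, commit_ts: int = 0) -> None:
        self.newest = Version(commit_ts, value)

    def write(self, value: Any, commit_ts: int) -> None:
        # A writer prepends a new version; older versions stay readable.
        assert commit_ts > self.newest.commit_ts
        self.newest = Version(commit_ts, value, older=self.newest)

    def read(self, snapshot_ts: int) -> Tuple[Any, int]:
        # Walk newest-to-oldest until a version visible to this snapshot
        # is found. Returns (value, number of chain hops taken).
        hops = 0
        v = self.newest
        while v is not None and v.commit_ts > snapshot_ts:
            v = v.older
            hops += 1
        if v is None:
            raise LookupError("no version visible at this snapshot")
        return v.value, hops

    def collect(self, oldest_active_ts: int) -> int:
        # Find the newest version visible to the oldest active reader.
        # Everything older than it is invisible to all readers, so it is
        # unlinked. Returns the number of versions reclaimed.
        v = self.newest
        while v is not None and v.commit_ts > oldest_active_ts:
            v = v.older
        reclaimed = 0
        if v is not None:
            dead = v.older
            v.older = None
            while dead is not None:
                reclaimed += 1
                dead = dead.older
        return reclaimed
```

In this sketch, a reader with an old snapshot pays one hop per newer version, so long chains make reads slower, and `collect` must know the oldest active snapshot before it can free anything. A real design would also need atomic operations and memory ordering, which are omitted here.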


    Published In

    ASPLOS '19: Proceedings of the Twenty-Fourth International Conference on Architectural Support for Programming Languages and Operating Systems
    April 2019
    1126 pages
    ISBN:9781450362405
    DOI:10.1145/3297858
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. concurrency control
    2. garbage collection
    3. multi-version
    4. synchronization

    Qualifiers

    • Research-article

    Funding Sources

    • Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIT)

    Conference

    ASPLOS '19

    Acceptance Rates

    ASPLOS '19 Paper Acceptance Rate 74 of 351 submissions, 21%;
    Overall Acceptance Rate 535 of 2,713 submissions, 20%

