research-article

One-shot Garbage Collection for In-memory OLTP through Temporality-aware Version Storage

Published: 30 May 2023

Abstract

Most modern in-memory online transaction processing (OLTP) engines rely on multi-version concurrency control (MVCC) to provide data consistency guarantees in the presence of conflicting data accesses. MVCC improves concurrency by generating a new version of a record on every write, thus increasing the storage requirements. Existing approaches rely on garbage collection and chain consolidation to reduce the length of version chains and reclaim space by freeing unreachable versions. However, finding unreachable versions requires the traversal of long version chains, which incurs random accesses right into the critical path of transaction execution, hence limiting scalability.
This paper introduces OneShotGC, a new multi-version storage design that eliminates version traversal during garbage collection, with minimal discovery and memory management overheads. OneShotGC leverages the temporal correlations across versions to opportunistically cluster them into contiguous memory blocks that can be released in one shot. We implement OneShotGC in Proteus and use YCSB and TPC-C to experimentally evaluate its performance with respect to the state-of-the-art, where we observe an improvement of up to 2x in transactional throughput.
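The idea of clustering versions by time so that a whole block can be released at once can be illustrated with a minimal sketch. This is not the OneShotGC implementation: the abstract does not describe its data structures, so the class names, the fixed-size time window, and the reclamation rule below (a block is freed once the oldest active transaction started at or after the latest invalidation timestamp in that block) are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    """Hypothetical batch of old versions superseded within the same time window."""
    window: int                  # window id = invalidated_at // window_size (assumed scheme)
    max_invalidated_at: int = 0  # latest timestamp at which a version in this block was superseded
    versions: list = field(default_factory=list)


class TemporalVersionStore:
    """Illustrative store that groups superseded versions by invalidation time."""

    def __init__(self, window_size: int):
        self.window_size = window_size
        self.blocks = {}  # window id -> Block

    def retire(self, key, old_value, invalidated_at: int) -> None:
        """Place a superseded version in the block for its invalidation window."""
        window = invalidated_at // self.window_size
        block = self.blocks.setdefault(window, Block(window))
        block.versions.append((key, old_value, invalidated_at))
        block.max_invalidated_at = max(block.max_invalidated_at, invalidated_at)

    def collect(self, oldest_active_start_ts: int) -> int:
        """Drop every block no active transaction can still read.

        A version superseded at time t is visible only to transactions that
        started before t, so a block is dead once the oldest active start
        timestamp has reached its max_invalidated_at. Blocks are dropped
        whole; no per-record version chain is traversed.
        """
        dead = [w for w, b in self.blocks.items()
                if b.max_invalidated_at <= oldest_active_start_ts]
        freed = 0
        for w in dead:
            freed += len(self.blocks.pop(w).versions)
        return freed


store = TemporalVersionStore(window_size=10)
store.retire("a", 1, invalidated_at=3)
store.retire("b", 2, invalidated_at=7)
store.retire("a", 5, invalidated_at=12)
freed_early = store.collect(oldest_active_start_ts=5)  # block 0 still holds a version readable at ts 5
freed_later = store.collect(oldest_active_start_ts=8)  # block 0 is now unreachable as a whole
```

In this sketch the cost of reclamation depends on the number of blocks rather than on the length of any version chain, which is the property the abstract attributes to releasing contiguous blocks in one shot.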

Supplemental Material

MP4 File
OneShotGC: Presentation video for SIGMOD 2023

Published In

Proceedings of the ACM on Management of Data  Volume 1, Issue 1
PACMMOD
May 2023
2807 pages
EISSN:2836-6573
DOI:10.1145/3603164
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DBMS
  2. GC
  3. MVCC
  4. OLTP
  5. garbage collection

Qualifiers

  • Research-article

Funding Sources

  • Swiss National Science Foundation
