research-article

Taurus: lightweight parallel logging for in-memory database management systems

Published: 01 October 2020

Abstract

Existing single-stream logging schemes are unsuitable for in-memory database management systems (DBMSs) as the single log is often a performance bottleneck. To overcome this problem, we present Taurus, an efficient parallel logging scheme that uses multiple log streams, and is compatible with both data and command logging. Taurus tracks and encodes transaction dependencies using a vector of log sequence numbers (LSNs). These vectors ensure that the dependencies are fully captured in logging and correctly enforced in recovery. Our experimental evaluation with an in-memory DBMS shows that Taurus's parallel logging achieves up to 9.9X and 2.9X speedups over single-streamed data logging and command logging, respectively. It also enables the DBMS to recover up to 22.9X and 75.6X faster than these baselines for data and command logging, respectively. We also compare Taurus with two state-of-the-art parallel logging schemes and show that the DBMS achieves up to 2.8X better performance on NVMe drives and 9.2X on HDDs.
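The LSN-vector idea in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes one vector entry per log stream, that a transaction inherits dependencies by taking the element-wise maximum of the vectors it touches, and that a logged transaction is safe to replay once every stream has been recovered at least up to the LSN recorded for it. The names `merge_lv` and `can_replay` are hypothetical.

```python
from typing import List

LSNVector = List[int]  # one LSN per log stream (assumed layout)


def merge_lv(a: LSNVector, b: LSNVector) -> LSNVector:
    """Combine two dependency vectors by element-wise maximum."""
    if len(a) != len(b):
        raise ValueError("LSN vectors must cover the same log streams")
    return [max(x, y) for x, y in zip(a, b)]


def can_replay(recovered: LSNVector, txn_lv: LSNVector) -> bool:
    """True if every stream is recovered at least up to the LSN the transaction depends on."""
    if len(recovered) != len(txn_lv):
        raise ValueError("LSN vectors must cover the same log streams")
    return all(r >= t for r, t in zip(recovered, txn_lv))


# A transaction that read two records picks up both records' dependencies.
txn_lv = merge_lv([10, 0, 5], [3, 7, 5])

# Stream 1 has only been recovered up to LSN 6, so the transaction must wait.
blocked = can_replay([12, 6, 9], txn_lv)

# Once stream 1 reaches LSN 7, all dependencies are satisfied.
ready = can_replay([12, 7, 5], txn_lv)
```

The element-wise comparison is what lets streams be recovered in parallel in this sketch: a transaction waits only on the specific streams it depends on rather than on a single global log order.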




Published In

Proceedings of the VLDB Endowment, Volume 14, Issue 2
October 2020
167 pages
ISSN: 2150-8097

Publisher

VLDB Endowment

Publication History

Published: 01 October 2020
Published in PVLDB Volume 14, Issue 2

Qualifiers

  • Research-article

Article Metrics

  • Downloads (last 12 months): 39
  • Downloads (last 6 weeks): 2
Reflects downloads up to 24 Jan 2025.

Cited By

  • (2025) Aion: Live Migration for In-Memory Databases with Zero Downtime and Reduced Redundant Data Transfer. Data Science and Engineering. DOI: 10.1007/s41019-024-00276-5. Online publication date: 15-Jan-2025.
  • (2024) Lupin: Tolerating Partial Failures in a CXL Pod. Proceedings of the 2nd Workshop on Disruptive Memory Systems, pp. 41-50. DOI: 10.1145/3698783.3699377. Online publication date: 3-Nov-2024.
  • (2024) TimeCloth: Fast Point-in-Time Database Recovery in The Cloud. Companion of the 2024 International Conference on Management of Data, pp. 214-226. DOI: 10.1145/3626246.3653382. Online publication date: 9-Jun-2024.
  • (2024) Log Replaying for Real-Time HTAP: An Adaptive Epoch-Based Two-Stage Framework. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 2096-2108. DOI: 10.1109/ICDE60146.2024.00167. Online publication date: 13-May-2024.
  • (2024) Fast Parallel Recovery for Transactional Stream Processing on Multicores. 2024 IEEE 40th International Conference on Data Engineering (ICDE), pp. 1478-1491. DOI: 10.1109/ICDE60146.2024.00122. Online publication date: 13-May-2024.
  • (2024) Poplar: Partially-Ordered Parallel Logging for Lower Isolation Levels. Web and Big Data, pp. 477-493. DOI: 10.1007/978-981-97-7238-4_30. Online publication date: 31-Aug-2024.
  • (2023) DecLog: Decentralized Logging in Non-Volatile Memory for Time Series Database Systems. Proceedings of the VLDB Endowment 17:1, pp. 1-14. DOI: 10.14778/3617838.3617839. Online publication date: 1-Sep-2023.
  • (2023) R³: Record-Replay-Retroaction for Database-Backed Applications. Proceedings of the VLDB Endowment 16:11, pp. 3085-3097. DOI: 10.14778/3611479.3611510. Online publication date: 24-Aug-2023.
  • (2023) Knock Out 2PC with Practicality Intact: a High-performance and General Distributed Transaction Protocol. 2023 IEEE 39th International Conference on Data Engineering (ICDE), pp. 2317-2331. DOI: 10.1109/ICDE55515.2023.00179. Online publication date: Apr-2023.
  • (2022) p2KVS. Proceedings of the Seventeenth European Conference on Computer Systems, pp. 575-591. DOI: 10.1145/3492321.3519567. Online publication date: 28-Mar-2022.
