Live Patching for Distributed In-Memory Key-Value Stores

Published: 20 December 2024

Abstract

Providers of high-availability data stores need to roll out software updates without causing noticeable downtime. For distributed data stores like Redis Cluster, the state of the art is a rolling update, in which the nodes are restarted in sequence. This requires preserving, restoring, and resynchronizing the database state, which can significantly prolong updates for larger memory states and thus delay critical security fixes. In this article, we propose applying software updates directly in memory without restarting any nodes. We present the first fully operational live patching solution for Redis Cluster on Linux. We support both push- and pull-based distribution of patches, trading dissemination speed against cluster elasticity, that is, the ability of nodes to dynamically join or leave the cluster. Our integration is very lightweight, as it piggybacks on the cluster-internal gossip protocol. Our experiments benchmark live patching against state-of-the-art rolling updates. In one scenario, live patching updates the entire cluster orders of magnitude faster, without unfavorable trade-offs regarding throughput, tail latencies, or network consumption. To showcase generalizability, we provide guidelines on integrating live patching into distributed database systems and successfully apply them to a primary-replica PostgreSQL setup. Given our overall promising results, we discuss the opportunities of live patching in database DevOps.
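
The trade-off between push- and pull-based patch distribution can be illustrated with a toy simulation. The sketch below is not the article's implementation: the `Node` class, the `push` and `gossip_pull_round` functions, and the round-robin peer schedule are all hypothetical, chosen only to make the example deterministic. It assumes that a push reaches exactly the nodes that are members at that moment, while a pull lets each node compare patch levels with one peer per gossip round and catch up if it is behind.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical cluster node; patch_level is the newest patch applied."""
    name: str
    patch_level: int = 0


def push(members, level):
    """Push model (assumed): deliver the patch to all current members at once."""
    for node in members:
        node.patch_level = max(node.patch_level, level)


def gossip_pull_round(members, round_no):
    """Pull model (assumed): each node contacts one peer per round and
    adopts the peer's patch level if the peer is ahead."""
    n = len(members)
    # Snapshot so every exchange in this round sees start-of-round state.
    levels = [m.patch_level for m in members]
    for i, node in enumerate(members):
        # Deterministic round-robin peer choice; never selects the node itself.
        peer = (i + 1 + round_no % (n - 1)) % n
        if levels[peer] > node.patch_level:
            node.patch_level = levels[peer]


def pull_until_converged(members):
    """Run pull rounds until every node has the newest level; return round count."""
    target = max(m.patch_level for m in members)
    rounds = 0
    while any(m.patch_level < target for m in members):
        gossip_pull_round(members, rounds)
        rounds += 1
    return rounds


cluster = [Node(f"n{i}") for i in range(5)]
push(cluster, level=1)            # fast: all five members patched in one step
late = Node("late-joiner")
cluster.append(late)              # joined after the push, so still unpatched
print("after push:", late.patch_level)
rounds = pull_until_converged(cluster)
print("after pull:", late.patch_level, "rounds:", rounds)
```

In this sketch the push is fast but misses the node that joins afterwards, whereas the pull rounds bring the late joiner up to date at the cost of extra gossip rounds.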

Published In

Proceedings of the ACM on Management of Data  Volume 2, Issue 6
SIGMOD
December 2024
792 pages
EISSN:2836-6573
DOI:10.1145/3709598
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 20 December 2024
Published in PACMMOD Volume 2, Issue 6

Author Tags

  1. benchmarking
  2. database cluster
  3. key-value stores
  4. live patching

Qualifiers

  • Research-article
