DOI: 10.1145/2588555.2595642
research-article

Fast database restarts at Facebook

Published: 18 June 2014

    Abstract

    Facebook engineers query multiple databases to monitor and analyze Facebook products and services. The fastest of these databases is Scuba, which achieves subsecond query response time by storing all of its data in memory across hundreds of servers. We are continually improving the code for Scuba and would like to push new software releases at least once a week. However, restarting a Scuba machine clears its memory. Recovering a machine's data from disk, about 120 GB, takes 2.5-3 hours of reading and formatting. Even 10 minutes is a long downtime for the critical applications that rely on Scuba, such as detecting user-facing errors. Restarting only 2% of the servers at a time limits the amount of unavailable data, but prolongs the restart to about 12 hours, during which users see only partial query results and one engineer needs to monitor the servers carefully. We need a faster, less engineer-intensive solution to enable frequent software upgrades.
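To put the recovery figures in perspective, the short calculation below converts the stated 120 GB and 2.5-3 hours into an average recovery rate per machine. It assumes decimal gigabytes (the abstract does not say which unit convention is meant), and the helper name is purely illustrative.

```python
GB = 10**9  # assumption: decimal gigabytes


def effective_rate_mb_s(total_bytes, seconds):
    """Average recovery rate in MB/s for a given data size and duration."""
    return total_bytes / seconds / 10**6


data_bytes = 120 * GB
slow = effective_rate_mb_s(data_bytes, 3 * 3600)    # 3 hours
fast = effective_rate_mb_s(data_bytes, 2.5 * 3600)  # 2.5 hours
print(f"disk recovery: {slow:.1f}-{fast:.1f} MB/s")
# prints "disk recovery: 11.1-13.3 MB/s"
```

This rate covers both reading and formatting the data, so it is not a measure of raw disk bandwidth.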
    In this paper, we show that using shared memory provides a simple, effective, and fast solution to upgrading servers. Our key observation is that we can decouple the memory lifetime from the process lifetime. When we shut down a server for a planned upgrade, we know that the memory state is valid (unlike when a server shuts down unexpectedly). We can therefore use shared memory to preserve memory state from the old server process to the new process. Our solution does not increase the server memory footprint and allows recovery at memory speeds, about 2-3 minutes per server. This solution maximizes uptime and availability, which has led to much faster and more frequent rollouts of new features and improvements. Furthermore, this technique can be applied to the in-memory state of any database, even if the memory contains a cache of a much larger disk-resident data set, as in most databases.
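The idea of decoupling memory lifetime from process lifetime can be illustrated with a minimal sketch. This is not Scuba's implementation: the segment layout, the valid flag, and the function names are all assumptions made for illustration, and a memory-mapped file (placed under /dev/shm when it exists) stands in for a shared memory segment. The old process copies its state into the segment and marks it valid only as the last step of a planned shutdown; the new process trusts the segment only if that flag is set, and otherwise signals that recovery from disk is needed.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical segment layout: 1-byte valid flag, 8-byte payload length, payload.
HEADER = struct.Struct("<?Q")


def default_segment_path(name="restart_demo_segment"):
    # /dev/shm is a memory-backed filesystem on Linux; fall back to the temp dir.
    base = "/dev/shm" if os.path.isdir("/dev/shm") else tempfile.gettempdir()
    return os.path.join(base, name)


def planned_shutdown(path, payload):
    """Old process: copy state into the segment, then set the valid flag last."""
    size = HEADER.size + len(payload)
    fd = os.open(path, os.O_RDWR | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        os.ftruncate(fd, size)
        with mmap.mmap(fd, size) as seg:
            HEADER.pack_into(seg, 0, False, len(payload))  # not yet valid
            seg[HEADER.size:size] = payload
            HEADER.pack_into(seg, 0, True, len(payload))   # now valid
    finally:
        os.close(fd)


def startup(path):
    """New process: return preserved state, or None to signal disk recovery."""
    try:
        fd = os.open(path, os.O_RDWR)
    except FileNotFoundError:
        return None  # no segment, e.g. after an unplanned shutdown or reboot
    try:
        size = os.fstat(fd).st_size
        if size < HEADER.size:
            return None
        with mmap.mmap(fd, size) as seg:
            valid, length = HEADER.unpack_from(seg, 0)
            if not valid or HEADER.size + length > size:
                return None  # incomplete copy: do not trust the contents
            data = bytes(seg[HEADER.size:HEADER.size + length])
    finally:
        os.close(fd)
    os.unlink(path)  # the segment has been consumed; release its memory
    return data
```

In use, the outgoing server process would call `planned_shutdown(path, state)` just before exiting, and the incoming process would call `startup(path)` and fall back to disk recovery when it returns `None`. For brevity the sketch copies the whole payload at once, so it briefly holds two copies of the state; it does not attempt to reproduce the property that the solution leaves the server memory footprint unchanged.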


    Information

    Published In

    SIGMOD '14: Proceedings of the 2014 ACM SIGMOD International Conference on Management of Data
    June 2014
    1645 pages
    ISBN: 9781450323765
    DOI: 10.1145/2588555

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. database
    2. recovery
    3. restart
    4. rollover
    5. shared memory

    Qualifiers

    • Research-article

    Conference

    SIGMOD/PODS'14

    Acceptance Rates

    SIGMOD '14 Paper Acceptance Rate 107 of 421 submissions, 25%;
    Overall Acceptance Rate 785 of 4,003 submissions, 20%
