DOI: 10.1145/2043556.2043560
research-article

Fast crash recovery in RAMCloud

Published: 23 October 2011

    Abstract

    RAMCloud is a DRAM-based storage system that provides inexpensive durability and availability by recovering quickly after crashes, rather than storing replicas in DRAM. RAMCloud scatters backup data across hundreds or thousands of disks, and it harnesses hundreds of servers in parallel to reconstruct lost data. The system uses a log-structured approach for all its data, in DRAM as well as on disk: this provides high performance both during normal operation and during recovery. RAMCloud employs randomized techniques to manage the system in a scalable and decentralized fashion. In a 60-node cluster, RAMCloud recovers 35 GB of data from a failed server in 1.6 seconds. Our measurements suggest that the approach will scale to recover larger memory sizes (64 GB or more) in less time with larger clusters.
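
The headline figures in the abstract (35 GB recovered in 1.6 seconds on a 60-node cluster) imply an aggregate recovery rate that can be worked out directly. The sketch below is illustrative arithmetic only: the function name is hypothetical, decimal units (1 GB = 1000 MB) are assumed, and the per-node figure assumes all 60 nodes contribute evenly, which the abstract does not state.

```python
def recovery_throughput(data_gb: float, seconds: float, nodes: int) -> tuple[float, float]:
    """Return (aggregate GB/s, per-node MB/s).

    Assumptions (not from the paper): decimal units, and recovery work
    spread evenly across every node in the cluster.
    """
    aggregate_gb_s = data_gb / seconds
    per_node_mb_s = aggregate_gb_s * 1000 / nodes
    return aggregate_gb_s, per_node_mb_s


# Figures quoted in the abstract: 35 GB, 1.6 s, 60 nodes.
aggregate, per_node = recovery_throughput(data_gb=35, seconds=1.6, nodes=60)
print(f"aggregate: {aggregate:.1f} GB/s, per node: {per_node:.0f} MB/s")
```

Under these assumptions the aggregate rate is roughly 22 GB/s, or about 365 MB/s per node, which is far more than a single disk or network link would typically sustain and is why the recovery is spread across many disks and servers in parallel.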

    Published In

    SOSP '11: Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles
    October 2011
    417 pages
    ISBN: 9781450309776
    DOI: 10.1145/2043556

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. crash recovery
    2. main memory databases
    3. scalability
    4. storage systems

    Acceptance Rates

    Overall Acceptance Rate 131 of 716 submissions, 18%
