DOI: 10.1145/1294261.1294278
Article

Sinfonia: a new paradigm for building scalable distributed systems

Published: 14 October 2007
  • Abstract

    We propose a new paradigm for building scalable distributed systems. Our approach does not require dealing with message-passing protocols -- a major complication in existing distributed systems. Instead, developers just design and manipulate data structures within our service called Sinfonia. Sinfonia keeps data for applications on a set of memory nodes, each exporting a linear address space. At the core of Sinfonia is a novel minitransaction primitive that enables efficient and consistent access to data, while hiding the complexities that arise from concurrency and failures. Using Sinfonia, we implemented two very different and complex applications in a few months: a cluster file system and a group communication service. Our implementations perform well and scale to hundreds of machines.
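To make the abstract's model concrete, here is a minimal single-process sketch of memory nodes with linear address spaces and an atomic minitransaction over them. It assumes a minitransaction bundles compare, read, and write items, which the abstract itself does not spell out. All names (`MemoryNode`, `Minitransaction`, `exec_and_commit`) are hypothetical and are not Sinfonia's API. The sketch ignores distribution, concurrency, failures, and the commit protocol entirely.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    """Hypothetical stand-in for one memory node: a flat byte-addressable space."""
    size: int
    data: bytearray = field(init=False)

    def __post_init__(self):
        self.data = bytearray(self.size)  # zero-initialized address space


@dataclass
class Minitransaction:
    """Assumed shape: items are tuples addressed by (node index, offset)."""
    compares: list = field(default_factory=list)  # (node, addr, expected bytes)
    reads: list = field(default_factory=list)     # (node, addr, length)
    writes: list = field(default_factory=list)    # (node, addr, new bytes)


def _check(node, addr, length):
    # Keep every access inside the node's fixed address range.
    if addr < 0 or addr + length > node.size:
        raise IndexError("access outside the node's address space")


def exec_and_commit(nodes, tx):
    """Run tx against nodes; return (committed, read_results).

    If any compare item mismatches, nothing is read or written.
    """
    for n, addr, expected in tx.compares:
        _check(nodes[n], addr, len(expected))
        if bytes(nodes[n].data[addr:addr + len(expected)]) != expected:
            return False, []
    # Validate every read and write before touching data, so a bad item
    # cannot leave the transaction partially applied.
    for n, addr, length in tx.reads:
        _check(nodes[n], addr, length)
    for n, addr, new in tx.writes:
        _check(nodes[n], addr, len(new))
    results = [bytes(nodes[n].data[a:a + l]) for n, a, l in tx.reads]
    for n, addr, new in tx.writes:
        nodes[n].data[addr:addr + len(new)] = new
    return True, results


# Usage: claim a flag on node 0 and update node 1 only if the flag is unset.
nodes = [MemoryNode(16), MemoryNode(16)]
tx = Minitransaction(
    compares=[(0, 0, b"\x00")],
    writes=[(0, 0, b"\x01"), (1, 4, b"hi")],
)
first, _ = exec_and_commit(nodes, tx)   # commits: the flag byte is still zero
second, _ = exec_and_commit(nodes, tx)  # aborts: the compare item now fails
```

The compare-then-write pattern in the usage lines is one way an application could update data on several nodes in a single step, without writing any message-passing code of its own.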

    Published In

    SOSP '07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles
    October 2007
    378 pages
    ISBN: 9781595935915
    DOI: 10.1145/1294261
    • ACM SIGOPS Operating Systems Review, Volume 41, Issue 6
      SOSP '07
      December 2007
      363 pages
      ISSN: 0163-5980
      DOI: 10.1145/1323293
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. distributed systems
    2. fault tolerance
    3. scalability
    4. shared memory
    5. transactions
    6. two-phase commit

    Qualifiers

    • Article

    Conference

    SOSP07: ACM SIGOPS 21st Symposium on Operating Systems Principles 2007
    October 14 - 17, 2007
    Stevenson, Washington, USA

    Acceptance Rates

    Overall Acceptance Rate 131 of 716 submissions, 18%
