DOI: 10.1145/2815400.2815416
research-article
Open access

Implementing linearizability at large scale and low latency

Published: 04 October 2015

Abstract

Linearizability is the strongest form of consistency for concurrent systems, but most large-scale storage systems settle for weaker forms of consistency. RIFL provides a general-purpose mechanism for converting at-least-once RPC semantics to exactly-once semantics, thereby making it easy to turn non-linearizable operations into linearizable ones. RIFL is designed for large-scale systems and is lightweight enough to be used in low-latency environments. RIFL handles data migration by associating linearizability metadata with objects in the underlying store and migrating metadata with the corresponding objects. It uses a lease mechanism to implement garbage collection for metadata. We have implemented RIFL in the RAMCloud storage system and used it to make basic operations such as writes and atomic increments linearizable; RIFL adds only 530 ns to the 13.5 μs base latency for durable writes. We also used RIFL to construct a new multi-object transaction mechanism in RAMCloud; RIFL's facilities significantly simplified the transaction implementation. The transaction mechanism can commit simple distributed transactions in about 20 μs and it outperforms the H-Store main-memory database system for the TPC-C benchmark.
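
The core idea of turning at-least-once delivery into exactly-once execution can be illustrated with a small sketch. It is not RIFL's or RAMCloud's code: the `Server` class, the `(client_id, sequence_number)` identifier layout, the `handle` and `expire_client` methods, and the in-memory dictionaries are all hypothetical names chosen for illustration, and durability, migration, and real lease timing are left out. The sketch assumes each RPC carries a unique identifier and that the server keeps a completion record per identifier, so a retried RPC returns the saved result instead of executing again.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Tuple

# Hypothetical identifier layout: (client_id, sequence_number).
RpcId = Tuple[int, int]


@dataclass
class Server:
    """Toy in-memory server that remembers the result of each completed RPC."""
    store: Dict[str, int] = field(default_factory=dict)
    completions: Dict[RpcId, Any] = field(default_factory=dict)

    def handle(self, rpc_id: RpcId, op: Callable[[Dict[str, int]], Any]) -> Any:
        # A retry of an already-completed RPC returns the saved result
        # instead of re-executing the operation.
        if rpc_id in self.completions:
            return self.completions[rpc_id]
        result = op(self.store)
        # A real system would make this record durable together with the
        # update; here both simply live in process memory.
        self.completions[rpc_id] = result
        return result

    def expire_client(self, client_id: int) -> None:
        # Stand-in for lease-based garbage collection: once a client's lease
        # has expired, its completion records can be dropped.
        self.completions = {
            k: v for k, v in self.completions.items() if k[0] != client_id
        }


def increment(key: str, delta: int) -> Callable[[Dict[str, int]], int]:
    """Build a non-idempotent operation: add delta to a counter."""
    def op(store: Dict[str, int]) -> int:
        store[key] = store.get(key, 0) + delta
        return store[key]
    return op


server = Server()
first = server.handle((7, 1), increment("counter", 1))
retry = server.handle((7, 1), increment("counter", 1))  # duplicate delivery
fresh = server.handle((7, 2), increment("counter", 1))  # new RPC, new id
print(first, retry, fresh, server.store["counter"])
```

In this sketch the duplicate delivery of RPC `(7, 1)` leaves the counter unchanged and returns the original result, which is the behavior that makes a non-idempotent operation such as an atomic increment safe to retry.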

Supplementary Material

MP4 File (p71.mp4)

Published In

SOSP '15: Proceedings of the 25th Symposium on Operating Systems Principles
October 2015
499 pages
ISBN: 9781450338349
DOI: 10.1145/2815400
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article

Funding Sources

  • NetApp
  • Emulex
  • Google
  • Huawei
  • Inventec
  • Samsung
  • VMware
  • Facebook
  • NEC

Conference

SOSP '15

Acceptance Rates

SOSP '15 Paper Acceptance Rate 30 of 181 submissions, 17%;
Overall Acceptance Rate 131 of 716 submissions, 18%
