DOI: 10.1145/2592798.2592822
research-article

Kronos: the design and implementation of an event ordering service

Published: 14 April 2014

Abstract

This paper proposes a new approach to determining the order of interdependent operations in a distributed system. The key idea behind our approach is to factor the task of tracking happens-before relationships out of the components that comprise the system, and to centralize it in a separate event ordering service. This not only simplifies the implementation of individual components by freeing them from having to propagate dependence information, but also enables dependence relationships to be maintained across multiple independent systems. A novel API enables the system to detect and take advantage of concurrency whenever possible by maintaining fine-grained information and binding events to a time order as late as possible. We demonstrate the benefits of this approach through several example applications, including a transactional key-value store and an online graph store. Experiments show that our event ordering service scales well and has low overhead in practice.
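
To make the idea of a standalone ordering service concrete, the following is a minimal in-memory sketch, not the API described in the paper. The class name `EventOrderingSketch` and the methods `create_event`, `assign_order`, and `query_order` are hypothetical names chosen for illustration. The sketch assumes the service stores happens-before relationships as a directed acyclic graph, treats events with no path between them as concurrent, and rejects any new ordering that would contradict an existing one.

```python
from collections import defaultdict


class EventOrderingSketch:
    """Toy happens-before graph. All names are illustrative assumptions."""

    def __init__(self):
        self._next_id = 0
        # Maps an event to the events ordered directly after it.
        self._successors = defaultdict(set)

    def create_event(self):
        """Register a new event, initially unordered relative to all others."""
        event = self._next_id
        self._next_id += 1
        self._successors[event]  # touch the key so the event is recorded
        return event

    def _reaches(self, src, dst):
        """Return True if dst can be reached from src along ordering edges."""
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self._successors[node])
        return False

    def query_order(self, a, b):
        """Report how two distinct events are ordered, if at all."""
        if a == b:
            raise ValueError("query_order expects two distinct events")
        if self._reaches(a, b):
            return "before"
        if self._reaches(b, a):
            return "after"
        return "concurrent"

    def assign_order(self, a, b):
        """Bind a before b; refuse if that contradicts the existing order."""
        if self._reaches(b, a):
            return False  # b already precedes a (or a == b): would form a cycle
        self._successors[a].add(b)
        return True


service = EventOrderingSketch()
write_x, write_y, read_x = (service.create_event() for _ in range(3))

# No order has been bound yet, so the two writes may proceed concurrently.
assert service.query_order(write_x, write_y) == "concurrent"

# An order is bound only once a dependence is actually discovered.
assert service.assign_order(write_x, read_x)
assert service.query_order(read_x, write_x) == "after"
```

In this sketch, components ask the service about order instead of attaching dependence metadata to their own messages, and an order is fixed only when `assign_order` is called, which is one simple reading of binding events to a time order as late as possible.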


Published In

EuroSys '14: Proceedings of the Ninth European Conference on Computer Systems
April 2014
388 pages
ISBN:9781450327046
DOI:10.1145/2592798

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

EuroSys 2014
EuroSys 2014: Ninth Eurosys Conference 2014
April 14 - 16, 2014
Amsterdam, The Netherlands

Acceptance Rates

EuroSys '14 Paper Acceptance Rate 27 of 147 submissions, 18%;
Overall Acceptance Rate 241 of 1,308 submissions, 18%
