research-article
Open access

The Network is Reliable: An informal survey of real-world communications failures

Published: 08 July 2014

Abstract

"The network is reliable" tops Peter Deutsch’s classic list, "Eight fallacies of distributed computing," "all [of which] prove to be false in the long run and all [of which] cause big trouble and painful learning experiences." Understanding and accounting for the implications of network behavior is key to designing robust distributed programs; in fact, six of Deutsch’s "fallacies" pertain directly to limitations on networked communications. This should be unsurprising: the ability (and often the requirement) to communicate over a shared channel is a defining characteristic of distributed programs, and many of the key results in the field concern the possibility or impossibility of performing distributed computations under particular sets of network conditions.


Published In

Queue  Volume 12, Issue 7
Practice
July 2014
43 pages
ISSN:1542-7730
EISSN:1542-7749
DOI:10.1145/2639988
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article
  • Popular
  • Refereed
