DOI: 10.1145/1529974.1529993
research-article

Reducing the costs of large-scale BFT replication

Published: 15 September 2008 Publication History

Abstract

We identify three key challenges in designing large-scale fault-tolerant services. The first is keeping best-case performance stable in the presence of failures, which are becoming increasingly commonplace. The second is tolerating worst-case failures without major service disruptions. The third is minimizing the cost of replicating a large number of services. While most previous work has focused on the first two challenges, we propose new approaches to reduce the costs of BFT replication in large-scale services.

Published In

LADIS '08: Proceedings of the 2nd Workshop on Large-Scale Distributed Systems and Middleware
September 2008
85 pages
ISBN:9781605582962
DOI:10.1145/1529974

Sponsors

  • IBMR: IBM Research

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Byzantine fault tolerance
  2. cost-efficient speculative execution
  3. message histories
  4. replier quorums

Qualifiers

  • Research-article

Conference

LADIS '08
Sponsor:
  • IBMR
LADIS '08: Conference on Large Scale Distributed Systems and Middleware
September 15 - 17, 2008
Yorktown Heights, New York, USA
