Zyzzyva: Speculative Byzantine fault tolerance

Published: 01 January 2010

Abstract

A longstanding vision in distributed systems is to build reliable systems from unreliable components. An enticing formulation of this vision is Byzantine Fault-Tolerant (BFT) state machine replication, in which a group of servers collectively act as a correct server even if some of the servers misbehave or malfunction in arbitrary (“Byzantine”) ways. Despite this promise, practitioners hesitate to deploy BFT systems, at least partly because of the perception that BFT must impose high overheads.
In this article, we present Zyzzyva, a protocol that uses speculation to reduce the cost of BFT replication. In Zyzzyva, replicas reply to a client's request without first running an expensive three-phase commit protocol to agree on the order to process requests. Instead, they optimistically adopt the order proposed by a primary server, process the request, and reply immediately to the client. If the primary is faulty, replicas can become temporarily inconsistent with one another, but clients detect inconsistencies, help correct replicas converge on a single total ordering of requests, and only rely on responses that are consistent with this total order. This approach allows Zyzzyva to reduce replication overheads to near their theoretical minima and to achieve throughputs of tens of thousands of requests per second, making BFT replication practical for a broad range of demanding services.
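
The client-side consistency check described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no thresholds, so the 3f+1 and 2f+1 quorum sizes are taken as assumptions from the body of the paper, and the names `SpecResponse` and `classify` are hypothetical. Signature checking, commit certificates, and view changes are omitted.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class SpecResponse:
    """One replica's speculative reply (hypothetical field names)."""
    replica_id: int
    view: int
    seq: int             # sequence number the primary proposed
    history_digest: str  # digest of the replica's request history
    reply: str


def classify(responses, f):
    """Decide what a client does with the speculative replies it has so far.

    Assumes 3f + 1 replicas, of which at most f are faulty.
    Returns (decision, reply), where decision is one of:
      "complete": all 3f + 1 replicas agree, so the reply can be used now
      "commit":   at least 2f + 1 agree, so the client must help replicas converge
      "retry":    too few matching replies, so the client resends the request
    """
    # Count each replica at most once so a faulty replica cannot vote twice.
    by_replica = {}
    for r in responses:
        by_replica.setdefault(r.replica_id, r)

    # Replies match only if order, history, and result all agree.
    counts = Counter(
        (r.view, r.seq, r.history_digest, r.reply) for r in by_replica.values()
    )
    if not counts:
        return ("retry", None)

    key, votes = counts.most_common(1)[0]
    reply = key[3]
    if votes >= 3 * f + 1:
        return ("complete", reply)
    if votes >= 2 * f + 1:
        return ("commit", reply)
    return ("retry", None)
```

With f = 1, four matching replies let the client accept the result immediately, three matching replies put it on the slower path where it helps replicas agree on the order, and two or fewer give it nothing it can rely on.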

Published In

ACM Transactions on Computer Systems, Volume 27, Issue 4
December 2009
69 pages
ISSN:0734-2071
EISSN:1557-7333
DOI:10.1145/1658357
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 January 2010
Accepted: 01 September 2009
Revised: 01 June 2009
Received: 01 March 2009
Published in TOCS Volume 27, Issue 4

Author Tags

  1. Byzantine fault tolerance
  2. output commit
  3. replication
  4. speculative execution

Qualifiers

  • Research-article
  • Research
  • Refereed
