DOI: 10.1145/3569902.3569909
research-article

Strengthening Atomic Multicast for Partitioned State Machine Replication

Published: 17 January 2023

Abstract

Partitioned state machine replication is a technique that extends classical state machine replication with state partitioning (or sharding) to provide both fault tolerance and performance scalability. The crux of the technique is ordering client requests within a partition, among the replicas that implement the partition, and across partitions, involving all the replicas accessed by the request. To cope with the complexity of ordering requests, partitioned state machine replication can use atomic multicast, a communication abstraction. Atomic multicast provides the means for requests to be propagated reliably and consistently to one or more sets of groups of replicas, where each replica group implements one partition. The paper revisits atomic multicast from the perspective of partitioned state machine replication and makes the following contributions: First, we show that if one implements partitioned state machine replication using an atomic multicast with global total order, a strong order property, then replicas would need to further coordinate as part of the execution of requests to ensure correctness. Second, we introduce a stronger version of atomic multicast that accounts for real-time dependencies between requests. Our proposed atomic multicast can be used to order requests within and across partitions so that replicas do not need to further coordinate to ensure linearizability. Third, we extend a well-known implementation of atomic multicast to ensure the stronger order property.
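The gap between global total order and an order that also respects real-time dependencies can be illustrated with a small checker. What follows is a minimal sketch, not the paper's formal definitions or algorithm. It assumes the common formulation of global total order as acyclicity of the delivery relation (a precedes b if some group delivers a before b), and it models a real-time dependency as a pair (a, b) meaning that request a completed before request b was issued. The function names, group names, and message identifiers are hypothetical.

```python
def delivery_edges(deliveries):
    """Return the set of pairs (a, b) such that some group delivers a before b.

    `deliveries` maps a group name to the sequence of messages it delivered.
    """
    edges = set()
    for seq in deliveries.values():
        for i, a in enumerate(seq):
            for b in seq[i + 1:]:
                edges.add((a, b))
    return edges


def is_acyclic(nodes, edges):
    """Kahn's algorithm: True if the directed graph has no cycle."""
    indegree = {n: 0 for n in nodes}
    successors = {n: [] for n in nodes}
    for a, b in edges:
        successors[a].append(b)
        indegree[b] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    visited = 0
    while ready:
        n = ready.pop()
        visited += 1
        for s in successors[n]:
            indegree[s] -= 1
            if indegree[s] == 0:
                ready.append(s)
    return visited == len(nodes)


def satisfies_global_total_order(deliveries):
    """Assumed definition: the delivery relation across all groups is acyclic."""
    edges = delivery_edges(deliveries)
    nodes = {m for seq in deliveries.values() for m in seq}
    return is_acyclic(nodes, edges)


def satisfies_real_time_order(deliveries, real_time):
    """Assumed definition: delivery relation plus real-time pairs is acyclic."""
    edges = delivery_edges(deliveries) | set(real_time)
    nodes = {m for pair in edges for m in pair}
    nodes |= {m for seq in deliveries.values() for m in seq}
    return is_acyclic(nodes, edges)


# Hypothetical run: m1 is addressed to g1, m2 to g2, and m3 to both groups.
deliveries = {"g1": ["m3", "m1"], "g2": ["m2", "m3"]}
# m1 completed before the client issued m2.
real_time = {("m1", "m2")}

print(satisfies_global_total_order(deliveries))          # True
print(satisfies_real_time_order(deliveries, real_time))  # False
```

In this run the delivery relation alone gives m2, m3, m1, which is acyclic, so global total order holds. Adding the real-time pair (m1, m2) closes the cycle m1, m2, m3, m1, so no single order is consistent with both the deliveries and real time. This is the kind of execution in which replicas would need extra coordination to preserve linearizability.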


Cited By

  • (2024) Extending State Machine Replication through Composition. Proceedings of the 13th Latin-American Symposium on Dependable and Secure Computing, pp. 231–240. DOI: 10.1145/3697090.3697106. Online publication date: 26-Nov-2024

Published In

LADC '22: Proceedings of the 11th Latin-American Symposium on Dependable Computing
November 2022
167 pages
ISBN: 9781450397377
DOI: 10.1145/3569902

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. multicast protocols
  2. scalable state machine replication

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Funding Sources

  • CNPq - Brazil
  • CAPES - Brazil - PUCRS PrInt
  • FAPERGS - RS - Brazil

Conference

LADC 2022
LADC 2022: Latin-American Symposium on Dependable Computing
November 21 - 24, 2022
Fortaleza/CE, Brazil
