DOI: 10.1145/1011767.1011816

Communication-efficient leader election and consensus with limited link synchrony

Published: 25 July 2004

Abstract

We study the degree of synchrony required to implement the leader election failure detector Ω and to solve consensus in partially synchronous systems. We show that in a system with n processes and up to f process crashes, one can implement Ω and solve consensus provided there exists some (unknown) correct process with f outgoing links that are eventually timely. In the special case f = 1, which is important in practice, this implies that a single eventually timely link suffices to implement Ω and solve consensus; all the other links in the system, Θ(n²) of them, may be asynchronous. There is no need to know which link p → q is eventually timely, when it becomes timely, or what its bound on message delay is. Surprisingly, it is not even required that the source p or the destination q of this link be correct: either p or q may actually crash, in which case the link p → q is eventually timely in a trivial way and is useless for sending messages. We show that these results are in a sense optimal: even if every process has f - 1 eventually timely links, Ω cannot be implemented and consensus cannot be solved. We also give an algorithm that implements Ω in systems where some correct process has f outgoing links that are eventually timely, such that eventually only f links carry messages, and we show that this is optimal. For f = 1, this algorithm ensures that all links except one eventually become quiescent.
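
The sufficient condition stated in the abstract can be illustrated with a small check over a set of directed links. The following is a minimal sketch, not the paper's algorithm: the function names, the string process identifiers, and the representation of links as (source, destination) pairs are all assumptions made for illustration. It also assumes one reading of the abstract, namely that any link with a crashed endpoint counts as eventually timely "in a trivial way".

```python
from typing import Set, Tuple

Link = Tuple[str, str]  # hypothetical representation: (source, destination)


def directed_link_count(n: int) -> int:
    """Number of directed links among n processes, which grows as n^2."""
    return n * (n - 1)


def eventually_timely(link: Link, declared_timely: Set[Link], crashed: Set[str]) -> bool:
    """A link is treated as eventually timely if it is declared so, or
    trivially if either endpoint crashes (assumed reading of the abstract)."""
    src, dst = link
    return link in declared_timely or src in crashed or dst in crashed


def sources_with_f_timely_links(
    processes: Set[str],
    crashed: Set[str],
    declared_timely: Set[Link],
    f: int,
) -> Set[str]:
    """Return the correct processes that have at least f outgoing
    eventually timely links."""
    result = set()
    for p in processes - crashed:  # only correct processes qualify
        count = sum(
            1
            for q in processes
            if q != p and eventually_timely((p, q), declared_timely, crashed)
        )
        if count >= f:
            result.add(p)
    return result


def condition_holds(
    processes: Set[str],
    crashed: Set[str],
    declared_timely: Set[Link],
    f: int,
) -> bool:
    """True if some correct process has f outgoing eventually timely links."""
    return bool(sources_with_f_timely_links(processes, crashed, declared_timely, f))
```

For example, with four processes, f = 1, and no crashes, a single declared timely link is enough for `condition_holds` to return true, while the remaining directed links are left unconstrained. If the destination of that link crashes instead, every correct process still has a trivially timely outgoing link under the assumption above.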

Published In

PODC '04: Proceedings of the twenty-third annual ACM symposium on Principles of distributed computing
July 2004
422 pages
ISBN: 1581138024
DOI: 10.1145/1011767

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. asynchronous systems
  2. consensus
  3. distributed systems
  4. failure detector
  5. fault tolerance
  6. leader election
  7. partially synchronous systems
  8. synchronous systems

Qualifiers

  • Article

Conference

PODC04: Principles of Distributed Computing 2004
July 25-28, 2004
St. John's, Newfoundland, Canada

Acceptance Rates

Overall Acceptance Rate 740 of 2,477 submissions, 30%

Cited By

  • (2024) Disaster-FD: A Failure Detector for Disaster-Prone Environments. Proceedings of the 13th Latin-American Symposium on Dependable and Secure Computing, pp. 200-209. DOI: 10.1145/3697090.3697096. Online publication date: 26-Nov-2024
  • (2024) Topological Characterization of Consensus in Distributed Systems. Journal of the ACM 71:6, pp. 1-48. DOI: 10.1145/3687302. Online publication date: 22-Aug-2024
  • (2024) Self-Stabilizing Indulgent Zero-degrading Binary Consensus. Theoretical Computer Science, article 114387. DOI: 10.1016/j.tcs.2024.114387. Online publication date: Jan-2024
  • (2023) Leaderless consensus. Journal of Parallel and Distributed Computing 176, pp. 95-113. DOI: 10.1016/j.jpdc.2023.01.009. Online publication date: Jun-2023
  • (2022) Collective Iterative Learning Control: Exploiting Diversity in Multi-Agent Systems for Reference Tracking Tasks. IEEE Transactions on Control Systems Technology 30:4, pp. 1390-1402. DOI: 10.1109/TCST.2021.3109646. Online publication date: Jul-2022
  • (2021) Leaderless Consensus. 2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), pp. 392-402. DOI: 10.1109/ICDCS51616.2021.00045. Online publication date: Jul-2021
  • (2021) A synod based deterministic and indulgent leader election protocol for asynchronous large groups. International Journal of Parallel, Emergent and Distributed Systems 37:2, pp. 220-247. DOI: 10.1080/17445760.2021.1879067. Online publication date: 1-Feb-2021
  • (2021) Probabilistic and Temporal Failure Detectors for Solving Distributed Problems. Journal of Parallel and Distributed Computing. DOI: 10.1016/j.jpdc.2021.07.017. Online publication date: Jul-2021
  • (2020) Building a Data Store with the Dynamic Structure. Automatic Control and Computer Sciences 53:7, pp. 794-810. DOI: 10.3103/S0146411619070265. Online publication date: 4-Mar-2020
  • (2020) PALE: Time Bounded Practical Agile Leader Election. IEEE Transactions on Parallel and Distributed Systems 31:2, pp. 470-485. DOI: 10.1109/TPDS.2019.2933620. Online publication date: 1-Feb-2020
