DOI: 10.1007/11945529_5
Article

When consensus meets self-stabilization

Published: 12 December 2006

    Abstract

    This paper presents a self-stabilizing failure detector, asynchronous consensus and replicated state-machine algorithm suite, the components of which can be started in an arbitrary state and converge to act as a virtual state-machine.
    Self-stabilizing algorithms can cope with transient faults. Transient faults can alter the system state to an arbitrary state and hence cause a temporary violation of the safety property of the consensus. New requirements for consensus that fit the ongoing nature of self-stabilizing algorithms are presented. The wait-free consensus (and the replicated state-machine) algorithm presented is a classic combination of a failure detector and a (memory-bounded) rotating coordinator consensus that satisfies both eventual safety and eventual liveness.
    Several new techniques and paradigms are introduced. The bounded memory failure detector abstracts away synchronization assumptions using bounded heartbeat counters combined with a balance-unbalance mechanism. The practically infinite paradigm is introduced in the scope of self-stabilization, where an execution of, say, 2^64 sequential steps is regarded as (practically) infinite. Finally, we present the first self-stabilizing wait-free reset mechanism that ensures eventual safety and can be used in other scopes.
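
    To give a feel for why a 64-bit step count can be treated as practically infinite, the short calculation below estimates how long it would take to exhaust it. It is only a back-of-the-envelope sketch: the rate of one step per nanosecond and the name `years_to_exhaust` are assumptions made for illustration and do not come from the paper.

```python
# Back-of-the-envelope estimate: how long does it take to perform
# 2**64 sequential steps at a fixed step rate?

STEPS = 2 ** 64                        # step count treated as "practically infinite"
STEP_RATE_HZ = 1_000_000_000           # ASSUMPTION: one step per nanosecond
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year, in seconds


def years_to_exhaust(steps: int = STEPS, rate_hz: float = STEP_RATE_HZ) -> float:
    """Return the number of years needed to execute `steps` steps at `rate_hz`."""
    return steps / rate_hz / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_to_exhaust():.1f} years")
```

    Under the assumed rate this comes to about 584.5 years, far longer than the expected lifetime of any deployed system; a slower step rate only lengthens that horizon.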

    Cited By

    • (2011) Pragmatic self-stabilization of atomic memory in message-passing systems. Proceedings of the 13th international conference on Stabilization, safety, and security of distributed systems, 19-31. DOI: 10.5555/2050613.2050617. Online publication date: 10-Oct-2011.
    • (2011) Stabilizing consensus with the power of two choices. Proceedings of the twenty-third annual ACM symposium on Parallelism in algorithms and architectures, 149-158. DOI: 10.1145/1989493.1989516. Online publication date: 4-Jun-2011.
    • (2009) Consensus When All Processes May Be Byzantine for Some Time. Proceedings of the 11th International Symposium on Stabilization, Safety, and Security of Distributed Systems, 120-132. DOI: 10.1007/978-3-642-05118-0_9. Online publication date: 5-Nov-2009.

      Published In

      Guide Proceedings
      OPODIS'06: Proceedings of the 10th international conference on Principles of Distributed Systems
      December 2006
      440 pages
      ISBN: 3540499903

      Sponsors

      • EPHE: l'Ecole Pratique des Hautes Etudes
      • LaISC: Laboratoire d'Informatique et des Systèmes Complexes
      • AUF: Agence Universitaire de la Francophonie
      • Université Paris 8

      Publisher

      Springer-Verlag

      Berlin, Heidelberg

      Author Tags

      1. consensus
      2. distributed reset
      3. failure detector
      4. self-stabilization
      5. state-machine
      6. wait-free
