A reset of a distributed system is safe if it does not complete "prematurely," i.e., without having reset some process in the system. Safe resets are possible in the presence of certain faults, such as process fail-stops and repairs, but are not always possible in the presence of more general faults, such as arbitrary transients. In this paper, we design a bounded-memory distributed-reset program that possesses two tolerances: (1) in the presence of fail-stops and repairs, it always executes resets safely, and (2) in the presence of a finite number of transient faults, it eventually executes resets safely. Designing this multitolerance in the reset program introduces the novel concern of designing a safety detector that is itself multitolerant. A broad application of our multitolerant safety detector is to make any total program likewise multitolerant.