Multitolerance in Distributed Reset
December 1998
1998 Technical Report
Publisher:
  • Chicago Journal of Theoretical Computer Science
Published: 07 December 1998
Abstract

A reset of a distributed system is safe if it does not complete "prematurely," i.e., without having reset some process in the system. Safe resets are possible in the presence of certain faults, such as process fail-stops and repairs, but are not always possible in the presence of more general faults, such as arbitrary transients. In this paper, we design a bounded-memory distributed-reset program that possesses two tolerances: (1) in the presence of fail-stops and repairs, it always executes resets safely, and (2) in the presence of a finite number of transient faults, it eventually executes resets safely. Designing this multitolerance in the reset program introduces the novel concern of designing a safety detector that is itself multitolerant. A broad application of our multitolerant safety detector is to make any total program likewise multitolerant.
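
The safety condition stated in the abstract can be read as a simple predicate. The sketch below is only an illustration of that definition, not the report's program or its safety detector: the function name `reset_is_safe`, its parameters, and the assumption of a fixed, globally known set of processes are all hypothetical.

```python
def reset_is_safe(all_processes, reset_processes, completed):
    """Illustrative check of the abstract's definition of a safe reset.

    Assumption (not from the report): the set of processes is fixed and
    known, and we can observe which processes the reset has reached.
    """
    if not completed:
        # A reset that has not completed cannot have completed prematurely.
        return True
    # A completed reset is safe only if no process was left un-reset.
    return set(all_processes) <= set(reset_processes)
```

For example, under these assumptions a reset over processes `{"p", "q"}` that completes after reaching only `"p"` violates the predicate, while the same reset still in progress does not.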

Contributors
  • Michigan State University
  • The Ohio State University
