DOI: 10.1145/122759.122772

Restoring consistent global states of distributed computations

Published: 01 December 1991

References

[1] Mike Accetta, Robert Baron, William Bolosky, David Golub, Richard Rashid, Avadis Tevanian, and Michael Young. Mach: A new kernel foundation for UNIX development. In Proceedings of the Summer USENIX Conference, July 1986.
[2] David F. Bacon. How to log all filesystem operations (while only writing a few to disk). Technical Report RC, IBM T.J. Watson Research Center, 1990.
[3] J.F. Bartlett. A 'nonstop' operating system. In 11th Hawaii International Conference on System Sciences, University of Hawaii, 1978.
[4] Anita Borg, Wolfgang Blau, Wolfgang Graetsch, Ferdinand Herrmann, and Wolfgang Oberle. Fault tolerance under UNIX. ACM Transactions on Computer Systems, 7(1):1-24, February 1989.
[5] David F. Bacon and Seth Copen Goldstein. Hardware-assisted replay of multiprocessor programs. In Proceedings of the ACM/ONR Workshop on Parallel and Distributed Debugging, Santa Cruz, CA, May 1991.
[6] K. Mani Chandy and Leslie Lamport. Distributed snapshots: Determining global states of distributed systems. ACM Transactions on Computer Systems, 3(1):63-75, February 1985.
[7] Carol Critchlow and Kim Taylor. The inhibition spectrum and the achievement of causal consistency. In Proceedings of the Ninth Annual ACM Symposium on Principles of Distributed Computing, 1990.
[8] Anne Dinning and Edith Schonberg. An empirical comparison of monitoring algorithms for access anomaly detection. In 2nd ACM Conference PPoPP, pages 1-10, March 1990.
[9] German S. Goldszmidt, Shmuel Katz, and Shaula Yemini. High level language debugging for concurrent programs. ACM Transactions on Computer Systems, November 1990.
[10] Hector Garcia-Molina, F. Germano, and W. Kohler. Debugging a distributed computer system. IEEE Transactions on Software Engineering, SE-10(2):210-219, March 1984.
[11] J. Joyce, Greg Lomow, K. Slind, and Brian W. Unger. Monitoring distributed systems. ACM Transactions on Computer Systems, 5(2):121-150, May 1987.
[12] David B. Johnson. Distributed System Fault Tolerance Using Message Logging. PhD thesis, Rice University, 1989.
[13] Richard Koo and Sam Toueg. Checkpointing and rollback-recovery for distributed systems. IEEE Transactions on Software Engineering, SE-13(1), January 1987.
[14] Leslie Lamport. Time, clocks, and the ordering of events in a distributed system. Communications of the ACM, 21(7):558-565, July 1978.
[15] Thomas J. LeBlanc and John M. Mellor-Crummey. Debugging parallel programs with instant replay. IEEE Transactions on Computers, C-36(4), April 1987.
[16] Kai Li, Jeffrey F. Naughton, and James S. Plank. Checkpointing multicomputer applications. Technical Report CS-TR-315-91, Princeton University, Department of Computer Science, 1991. Submitted to the Symposium on Reliable Distributed Systems, Pisa, Italy, September 1991.
[17] Richard J. LeBlanc and Arnold D. Robbins. Event-driven monitoring of distributed programs. In 5th International Conference on Distributed Computer Systems, pages 515-522, 1985.
[18] Barton P. Miller and Jong-Deok Choi. Breakpoints and halting in distributed programs. In International Conference on Distributed Computer Systems, 1988.
[19] Sang Lyul Min and Jong-Deok Choi. An efficient cache-based access anomaly detection scheme. In Proceedings of the Fourth International Conference on Architectural Support for Programming Languages and Operating Systems, pages 235-244, Santa Clara, CA, April 1991.
[20] Brian Randell. System structure for software fault tolerance. IEEE Transactions on Software Engineering, SE-1(2):220-232, June 1975.
[21] Robert E. Strom, David F. Bacon, and Shaula Alexander Yemini. Volatile logging in n-fault-tolerant distributed systems. In The Eighteenth Annual International Symposium on Fault-Tolerant Computing: Digest of Papers, pages 44-49, June 1988.
[22] Edward T. Smith. Debugging tools for message-based, communicating processes. In 4th International Conference on Distributed Computer Systems, pages 303-310, 1984.
[23] Robert E. Strom and Shaula Alexander Yemini. Optimistic recovery in distributed systems. ACM Transactions on Computer Systems, 3(3):204-226, August 1985.

Published In

PADD '91: Proceedings of the 1991 ACM/ONR workshop on Parallel and distributed debugging
December 1991
206 pages
ISBN:0897914570
DOI:10.1145/122759
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

WPDD91: Workshop on Parallel & Distributed Debugging
May 20 - 21, 1991
Santa Cruz, California, USA

Cited By

  • (2018) Computational modeling of consistent observation of asynchronous distributed computation on N–manifold. Cogent Engineering, 5:1 (1-16). DOI: 10.1080/23311916.2018.1528029. Online publication date: 26-Sep-2018
  • (2014) HotRestore. Proceedings of the 28th USENIX conference on Large Installation System Administration (1-16). DOI: 10.5555/2717491.2717492. Online publication date: 9-Nov-2014
  • (2013) Asynchronous Distributed Checkpointing. Distributed Algorithms for Message-Passing Systems (189-218). DOI: 10.1007/978-3-642-38123-2_8. Online publication date: 2013
  • (2005) Determining consistent states of distributed objects participating in a remote method call. Proceedings of the 5th international conference on Computational Science - Volume Part I (355-363). DOI: 10.1007/11428831_44. Online publication date: 22-May-2005
  • (2003) Replay debugging of real-time systems using time machines. Proceedings International Parallel and Distributed Processing Symposium (8). DOI: 10.1109/IPDPS.2003.1213515. Online publication date: 2003
  • (1999) Using message semantics for fast-output commit in checkpointing-and-rollback recovery. Proceedings of the 32nd Annual Hawaii International Conference on Systems Sciences. 1999. HICSS-32. Abstracts and CD-ROM of Full Papers (10). DOI: 10.1109/HICSS.1999.772986. Online publication date: 1999
  • (1999) Communication-Induced Determination of Consistent Snapshots. IEEE Transactions on Parallel and Distributed Systems, 10:9 (865-877). DOI: 10.1109/71.798312. Online publication date: 1-Sep-1999
  • (1998) Critical Path Profiling of Message Passing and Shared-Memory Programs. IEEE Transactions on Parallel and Distributed Systems, 9:10 (1029-1040). DOI: 10.1109/71.730530. Online publication date: 1-Oct-1998
  • (1997) Replaying distributed programs without message logging. Proceedings. The Sixth IEEE International Symposium on High Performance Distributed Computing (Cat. No.97TB100183) (137-147). DOI: 10.1109/HPDC.1997.622370. Online publication date: 1997
  • (1997) Independent global snapshots in large distributed systems. Proceedings Fourth International Conference on High-Performance Computing (462-467). DOI: 10.1109/HIPC.1997.634530. Online publication date: 1997