Abstract
This paper proposes a new approach to rollback-recovery that uses a multi-agent system in distributed computing systems. Previous rollback-recovery protocols depended on the inherent communication mechanisms and the operating system, which degraded computing performance in distributed computing systems. Using multiple agents, we propose a rollback-recovery protocol that works independently of the operating system. We define three kinds of agents. The first is a recovery agent, which performs the rollback-recovery protocol after a failure. The second is an information agent, which constructs domain knowledge, in the form of fault-tolerance rules and information, during failure-free operation. The third is a facilitator agent, which controls efficient communication between agents. We also propose a rollback-recovery protocol based on these agents and simulate it using Java and an agent communication language in a CORBA environment.
This work was supported by grant No. R01-2001-00354 from the Korea Science & Engineering Foundation.
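The three agent roles described in the abstract can be illustrated with a small Java sketch. This is not the authors' implementation: the class names, the message verbs ("tell", "ask", "reply", "failure"), and the "process:checkpoint" content format are all assumptions made for illustration, loosely modelled on the style of an agent communication language. The sketch uses plain in-process method calls in place of CORBA.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative sketch only: every class, method, and message name here is hypothetical.
final class AgentSketch {

    /** An agent reacts to messages that the facilitator routes to it. */
    interface Agent {
        String name();
        void receive(String performative, String content, Facilitator via);
    }

    /** Facilitator agent: routes messages between registered agents by name. */
    static final class Facilitator {
        private final Map<String, Agent> registry = new HashMap<>();

        void register(Agent agent) {
            registry.put(agent.name(), agent);
        }

        void send(String to, String performative, String content) {
            Agent target = registry.get(to);
            if (target == null) {
                throw new IllegalArgumentException("unknown agent: " + to);
            }
            target.receive(performative, content, this);
        }
    }

    /** Information agent: records checkpoint information during failure-free operation. */
    static final class InformationAgent implements Agent {
        private final Map<String, Integer> lastCheckpoint = new HashMap<>();

        public String name() { return "information"; }

        public void receive(String performative, String content, Facilitator via) {
            if (performative.equals("tell")) {
                // Assumed content format "p1:3": process p1 took checkpoint 3.
                String[] parts = content.split(":");
                lastCheckpoint.put(parts[0], Integer.parseInt(parts[1]));
            } else if (performative.equals("ask")) {
                // Answer with the latest known checkpoint (0 if none was recorded).
                Integer checkpoint = lastCheckpoint.get(content);
                via.send("recovery", "reply", content + ":" + (checkpoint == null ? 0 : checkpoint));
            }
        }
    }

    /** Recovery agent: after a failure, looks up the checkpoint to roll back to. */
    static final class RecoveryAgent implements Agent {
        final List<String> rollbacks = new ArrayList<>();

        public String name() { return "recovery"; }

        public void receive(String performative, String content, Facilitator via) {
            if (performative.equals("failure")) {
                via.send("information", "ask", content);
            } else if (performative.equals("reply")) {
                // A real protocol would restore process state here; the sketch only records the decision.
                rollbacks.add(content);
            }
        }
    }

    /** Runs one failure-free phase followed by one failure of process p1. */
    static List<String> demo() {
        Facilitator facilitator = new Facilitator();
        RecoveryAgent recovery = new RecoveryAgent();
        facilitator.register(new InformationAgent());
        facilitator.register(recovery);

        facilitator.send("information", "tell", "p1:2");
        facilitator.send("information", "tell", "p1:3");
        facilitator.send("recovery", "failure", "p1");
        return recovery.rollbacks;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Routing every message through the facilitator keeps the recovery and information agents unaware of each other's location, which is the property that would let the transport be swapped for a remote one without changing either agent.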
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Lee, HM., Chung, KS., Shin, SC., Lee, DW., Lee, WG., Yu, HC. (2002). A Recovery Technique Using Multi-agent in Distributed Computing Systems. In: Arbab, F., Talcott, C. (eds) Coordination Models and Languages. COORDINATION 2002. Lecture Notes in Computer Science, vol 2315. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-46000-4_23
DOI: https://doi.org/10.1007/3-540-46000-4_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43410-8
Online ISBN: 978-3-540-46000-8
eBook Packages: Springer Book Archive