Abstract
This article describes the implementation of checkpointing and recovery services in a Java-based distributed platform. Our case study is SUMA, a distributed execution platform implemented on top of Grid services. SUMA is designed to execute Java bytecode, with additional support for parallel processing. The SUMA middleware is built on commodity software and communication technologies, including Java, CORBA, and Globus services. The implementation of SUMA that runs on top of Globus services is called SUMA/G.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Cardinale, Y., Hernández, E. (2005). Parallel Checkpointing on a Grid-Enabled Java Platform. In: Sloot, P.M.A., Hoekstra, A.G., Priol, T., Reinefeld, A., Bubak, M. (eds) Advances in Grid Computing - EGC 2005. EGC 2005. Lecture Notes in Computer Science, vol 3470. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11508380_75
DOI: https://doi.org/10.1007/11508380_75
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26918-2
Online ISBN: 978-3-540-32036-4
eBook Packages: Computer Science (R0)