Abstract
Scheduling in metacomputing environments is an active field of research as the vision of a Computational Grid becomes more concrete. An important class of Grid applications consists of long-running parallel computations with large numbers of somewhat independent tasks (Monte-Carlo simulations, parameter-space searches, etc.). A number of Grid middleware projects are available to implement such applications, but scheduling strategies remain an open research issue. This is mainly due to the diversity of both Grid resource types and their availability patterns. The purpose of this work is to develop and validate a general adaptive scheduling algorithm for task farming applications, along with a user interface that makes the algorithm accessible to domain scientists. Our algorithm is general in that it is not tailored to a particular Grid middleware and requires very few assumptions concerning the nature of the resources. Our first testbed is NetSolve, as it allows quick and easy development of the algorithm by isolating the developer from issues such as process control, I/O, remote software access, and fault tolerance.
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Casanova, H., Kim, M., Plank, J., Dongarra, J. (1999). Adaptive Scheduling for Task Farming with Grid Middleware. In: Amestoy, P., et al. Euro-Par’99 Parallel Processing. Euro-Par 1999. Lecture Notes in Computer Science, vol 1685. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-48311-X_3
DOI: https://doi.org/10.1007/3-540-48311-X_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-66443-7
Online ISBN: 978-3-540-48311-3
eBook Packages: Springer Book Archive