Abstract
In this paper, we present an experimental study of deterministic non-preemptive multiple workflow scheduling strategies on a Grid. We distinguish twenty-five strategies according to the type and amount of information they require. We analyze scheduling strategies that consist of two and four stages: labeling, adaptive allocation, prioritization, and parallel machine scheduling. We apply these strategies to the execution of the Cybershake, Epigenomics, Genome, Inspiral, LIGO, Montage, and SIPHT workflow applications. To compare performance, we perform a joint analysis of three metrics. A case study is given, and the corresponding results indicate that well-known DAG scheduling algorithms designed for single-DAG and single-machine settings are not well suited for Grid scheduling scenarios where user run time estimates are available. We show that the proposed new strategies outperform the other strategies in terms of approximation factor, mean critical path waiting time, and critical path slowdown. The robustness of these strategies is also discussed.
Cite this article
Hirales-Carbajal, A., Tchernykh, A., Yahyapour, R. et al. Multiple Workflow Scheduling Strategies with User Run Time Estimates on a Grid. J Grid Computing 10, 325–346 (2012). https://doi.org/10.1007/s10723-012-9215-6
DOI: https://doi.org/10.1007/s10723-012-9215-6