DOI: 10.1109/CCGrid.2015.74
research-article

Towards a realistic scheduler for mixed workloads with workflows

Published: 04 May 2015

Abstract

Many fields of modern science require huge amounts of computation, and workflows are a very popular tool in e-Science because they allow many small, simple tasks to be organized to solve big problems. They are used in astronomy, bioinformatics, machine learning, social network analysis, physics, and many other branches of science. Workflows are notoriously difficult to schedule, and the vast majority of research on workflow scheduling is concerned with scheduling single workflows with known runtimes. The goal of this PhD research is to bring more realism to the problem of workflow scheduling in actual systems. First, in real systems, multiple workflows may contend for the available resources. Second, task runtimes are not always known, and the estimates that are available may be wrong. Third, workflows are usually not the only type of job submitted to a system; there may, for instance, also be parallel applications and bags-of-tasks. Accordingly, the purpose of this PhD research is to create and analyze policies for online scheduling of workloads that consist of workflows, with and without known task runtimes, together with jobs of other types. We are in the process of simulating policies, and we will validate our results by means of an implementation and real-world experiments with the Koala-w workflow processing system.

Cited By

  • (2015) Big omics data experience. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-12. DOI: 10.1145/2807591.2807595. Online publication date: 15-Nov-2015

Published In

CCGRID '15: Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing
May 2015
1277 pages
ISBN: 9781479980062

Publisher

IEEE Press

Author Tags

  1. scheduling
  2. workflows
  3. workloads

Qualifiers

  • Research-article

Conference

CCGrid '15
