Workflow Scheduling Algorithms For Grid Computing
Summary. Workflow scheduling is one of the key issues in the management of workflow execution. Scheduling is a process that maps and manages the execution of interdependent tasks on distributed resources. It involves allocating suitable resources to workflow tasks so that the execution can be completed to satisfy objective functions specified by users. Proper scheduling can have a significant impact on the performance of the system. In this chapter, we investigate existing workflow scheduling algorithms developed and deployed by various Grid projects.
7.1 Introduction
Grids [22] have emerged as a global cyber-infrastructure for the next-generation
of e-Science and e-business applications, by integrating large-scale, distributed
and heterogeneous resources. A number of Grid middleware and management
tools such as Globus [21], UNICORE [1], Legion [27] and Gridbus [13] have been
developed, in order to provide infrastructure that enables users to access remote
resources transparently over a secure, shared scalable world-wide network. More
recently, Grid computing has progressed towards a service-oriented paradigm
[7,24] which defines a new way of service provisioning based on utility computing
models. Within utility Grids, each resource is represented as a service to which
consumers can negotiate their usage and Quality of Service.
Scientific communities in areas such as high-energy physics, gravitational-
wave physics, geophysics, astronomy and bioinformatics, are utilizing Grids to
share, manage and process large data sets. In order to support complex sci-
entific experiments, distributed resources such as computational devices, data,
applications, and scientific instruments need to be orchestrated while managing
the application workflow operations within Grid environments [36]. Workflow
is concerned with the automation of procedures, whereby files and other data
are passed between participants according to a defined set of rules in order to
achieve an overall goal [30]. A workflow management system defines, manages
and executes workflows on computing resources.
F. Xhafa, A. Abraham (Eds.): Metaheuristics for Scheduling in Distributed Computing Environments, SCI 146, pp. 173–214, 2008. © Springer-Verlag Berlin Heidelberg 2008 (springerlink.com)
J. Yu, R. Buyya, and K. Ramamohanarao
[Fig. 7.1. Architecture of a Grid workflow management system: at build time, Grid users employ workflow modeling and definition tools to produce a Grid workflow specification, aided by Grid information services such as a resource info service (e.g. MDS); at run time, a workflow enactment engine executes the specification over Grid middleware and feeds status back to a Grid workflow monitor.]
Fig. 7.1 shows an architecture of workflow management systems for Grid com-
puting. In general, a workflow specification is created by a user using workflow
modeling tools, or generated automatically with the aid of Grid information ser-
vices such as MDS(Monitoring and Discovery Services) [20] and VDS (Virtual
Data System) [23] prior to the run time. A workflow specification defines work-
flow activities (tasks) and their control and data dependencies. At run time, a
workflow enactment engine manages the execution of the workflow by utilizing
Grid middleware. There are three major components in a workflow enactment
engine: workflow scheduling, data movement and fault management. Workflow
scheduling discovers resources and allocates tasks on suitable resources to
meet users’ requirements, while data movement manages data transfer between
selected resources and fault management provides mechanisms for failure han-
dling during execution. In addition, the enactment engine provides feedback to
a monitor so that users can view the workflow process status through a Grid
workflow monitor. Workflow scheduling is one of the key issues in workflow
management [59].
Scheduling is a process that maps and manages the execution of interdependent
tasks on distributed resources. It allocates suitable resources to
workflow tasks so that the execution can be completed to satisfy objective func-
tions imposed by users. Proper scheduling can have significant impact on the per-
formance of the system. In general, the problem of mapping tasks on distributed
services belongs to a class of problems known as NP-hard problems [53]. For such
problems, no known algorithms are able to generate the optimal solution within
polynomial time. Solutions based on exhaustive search are impractical as the
overhead of generating schedules is very high. In Grid environments, scheduling
decisions must be made in the shortest time possible, because there are many
users competing for resources, and time slots desired by one user could be taken
up by another user at any moment.
Many heuristics and meta-heuristics based algorithms have been proposed
to schedule workflow applications in heterogeneous distributed system environ-
ments. In this chapter, we discuss several existing workflow scheduling algorithms
developed and deployed in various Grid environments.
[Fig. 7.2. A taxonomy of Grid workflow scheduling algorithms:
- Best-effort based scheduling
  - Heuristics based
    - Immediate mode
    - Batch mode
    - List scheduling: dependency mode, dependency-batch mode
    - Cluster based scheduling
    - Duplication based scheduling
- QoS-constraint based scheduling
  - Deadline-constrained: heuristics based, metaheuristics based
  - Budget-constrained: heuristics based, metaheuristics based
  - Others]
Best-effort based scheduling attempts to minimize the execution time, ignoring other factors such as the monetary cost of accessing resources and various users' QoS satisfaction levels. On the other hand, QoS constraint based scheduling attempts to optimize performance under the most important QoS constraints, for example time minimization under budget constraints or cost minimization under deadline constraints.
Batch mode: Max-min            vGrADS (Rice University, USA)               EMAN bio-imaging
Batch mode: Sufferage          vGrADS (Rice University, USA)               EMAN bio-imaging
Dependency mode: HEFT          ASKALON (University of Innsbruck, Austria)  WIEN2K quantum chemistry & Invmod hydrological
Dependency-batch mode: Hybrid  Sakellariou & Zhao (University of Manchester, UK)  Randomly generated task graphs
Cluster/duplication based: THAN  Ranaweera & Agrawal (University of Cincinnati, USA)  Randomly generated task graphs
7.3.1 Heuristics
In general, there are four classes of scheduling heuristics for workflow applications, namely individual task scheduling, list scheduling, cluster based scheduling and duplication based scheduling.
List scheduling
A list scheduling heuristic prioritizes workflow tasks and schedules the tasks based on their priorities. There are two major phases in a list scheduling heuristic, the task prioritizing phase and the resource selection phase [33]. The task prioritizing phase sets the priority of each task with a rank value and generates a scheduling list by sorting the tasks according to their rank values. The resource selection phase selects tasks in the order of their priorities and maps each selected task to its optimal resource.
Different list scheduling heuristics use different attributes and strategies to
decide the task priorities and the optimal resource for each task. We catego-
rize workflow-based list scheduling algorithms as either batch, dependency or
dependency-batch mode.
The batch mode groups workflow tasks into several sets of independent tasks and considers tasks only in the current group. The dependency mode ranks each workflow task based on its weight value and the rank values of its interdependent tasks, while the dependency-batch mode further uses a batch mode algorithm to re-rank the independent tasks with similar rank values.
Batch mode
Batch mode scheduling algorithms were initially designed for scheduling parallel independent tasks, such as bag-of-tasks and parameter-sweep applications, on a pool of resources. Since the number of resources is much less than the number of tasks, the tasks need to be scheduled on the resources in a certain order. A batch mode algorithm provides a strategy to order and map these parallel tasks on the resources, in order to complete their execution at the earliest time. Even though batch mode scheduling algorithms aim at the scheduling problem of independent tasks, they can also be applied to optimize the execution time of a workflow application which consists of many independent parallel tasks and a limited number of resources.
Symbol      Definition
EET(t, r)   Estimated Execution Time: the amount of time the resource r will take to execute the task t, from the time the task starts to execute on the resource.
EAT(t, r)   Estimated Availability Time: the time at which the resource r is available to execute task t.
FAT(t, r)   File Available Time: the earliest time by which all the files required by the task t will be available at the resource r.
ECT(t, r)   Estimated Completion Time: the estimated time by which task t will complete execution at resource r: ECT(t, r) = EET(t, r) + max(EAT(t, r), FAT(t, r)).
MCT(t)      Minimum Estimated Completion Time: minimum ECT for task t over all available resources.
Algorithm   Features
Min-Min     It gives high scheduling priority to tasks with the shortest execution time.
Max-Min     It gives high scheduling priority to tasks with long execution times.
Sufferage   It gives high scheduling priority to tasks whose completion time on the second-best resource is far from that on the best resource, i.e. the resource which can complete the task at the earliest time.
tasks assigned to their best choice (which can complete the tasks at the earliest time) than the Max-Min heuristic [12]. Experimental results conducted by Maheswaran et al. [39] and Casanova et al. [14] have shown that the Min-Min heuristic outperforms the Max-Min heuristic. However, since Max-Min schedules tasks with the longest execution time first, a long-running task may have more chance of being executed in parallel with shorter tasks. Therefore, it might be expected that the Max-Min heuristic performs better than the Min-Min heuristic in cases where there are many more short tasks than long tasks [12, 39].
On the other hand, since the Sufferage heuristic considers the adverse effect on the completion time of a task if it is not scheduled on the resource with the minimum completion time [39], it is expected to perform better in cases where there is a large performance difference between resources. The experimental results conducted by Maheswaran et al. show that the Sufferage heuristic produced the shortest makespan in a highly heterogeneous environment among the three heuristics discussed in this section. However, Casanova et al. [14] argue that the Sufferage heuristic could perform worst in the case of data-intensive applications in multiple-cluster environments.
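Under the ECT model of the symbol table above, i.e. ECT(t, r) = EET(t, r) + max(EAT(t, r), FAT(t, r)), the three batch mode heuristics can be sketched in Python. This is an illustrative sketch, not code from any of the cited systems; the dictionary-based data structures and function names are assumptions, and FAT is taken as zero for simplicity.

```python
def ect(task, res, eet, eat, fat):
    """ECT(t, r) = EET(t, r) + max(EAT(t, r), FAT(t, r))."""
    return eet[task][res] + max(eat[res], fat.get((task, res), 0.0))

def batch_schedule(tasks, resources, eet, mode="min-min"):
    """Schedule independent tasks using Min-Min, Max-Min or Sufferage.

    eet[t][r]: estimated execution time of task t on resource r.
    EAT is tracked as each resource's release time; FAT defaults to zero
    (no input files) to keep the sketch small.
    """
    eat = {r: 0.0 for r in resources}
    fat = {}                       # (task, resource) -> file available time
    schedule, unmapped = {}, set(tasks)
    while unmapped:
        metrics = {}
        for t in unmapped:
            # completion time of t on every resource, best first
            cts = sorted((ect(t, r, eet, eat, fat), r) for r in resources)
            best_ct, best_r = cts[0]
            if mode == "sufferage":
                # how much t "suffers" if it loses its best resource
                key = (cts[1][0] if len(cts) > 1 else best_ct) - best_ct
            else:
                key = best_ct      # MCT(t)
            metrics[t] = (key, best_ct, best_r)
        if mode == "min-min":      # shortest MCT scheduled first
            t = min(unmapped, key=lambda u: metrics[u][0])
        else:                      # max-min / sufferage: largest key first
            t = max(unmapped, key=lambda u: metrics[u][0])
        _, ct, r = metrics[t]
        schedule[t] = r
        eat[r] = ct                # resource busy until the task completes
        unmapped.remove(t)
    return schedule
```

Each iteration maps one task and updates the chosen resource's availability, so the three heuristics differ only in which unmapped task is committed first.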
[Fig. 7.3. A weighted task graph: tasks t0, ..., t5 carry node weights w0, ..., w5, and each edge (ti, tj) carries a weight wi,j; e.g. t0 connects to t1, t2 and t3 with edge weights w0,1, w0,2 and w0,3.]
Dependency Mode
Dependency mode scheduling algorithms are derived from algorithms for scheduling a task graph with interdependent tasks on distributed computing environments. They provide a strategy to map workflow tasks onto heterogeneous resources based on analyzing the dependencies of the entire task graph, in order to complete these interdependent tasks at the earliest time. Unlike batch mode algorithms, they rank the priorities of all tasks in a workflow application at one time.
Many dependency mode heuristics rank tasks based on the weights of task nodes and edges in a task graph. As illustrated in Fig. 7.3, a weight wi is assigned to a task Ti and a weight wi,j is assigned to an edge (Ti, Tj). Many list scheduling schemes [33] developed for scheduling task graphs on homogeneous systems set the weight of each task and edge equal to its estimated execution time and communication time, since in a homogeneous environment the execution time of a task and the data transmission time are identical on all available resources. However, in a Grid environment, resources are heterogeneous: the computation time varies from resource to resource and the communication time varies from data link to data link between resources. Therefore, the processing speeds of different resources and the transmission speeds of different data links need to be taken into account, and an approximation approach is required to weight tasks and edges for computing the rank values.
Zhao and Sakellariou [62] proposed six possible approximation options: mean value, median value, worst value, best value, simple worst value, and simple best value. These approximation approaches assign a weight to each task node and edge as either the average, median, maximum, or minimum computation time and communication time of processing the task over all possible resources. Instead of using approximation values of execution time and transmission time, Shi and Dongarra [46] assign a higher weight to tasks that have fewer capable resources. Their motivation is quite similar to the QoS guided Min-Min scheduling, i.e., it may cause a longer delay if tasks with scarce capable resources are not scheduled first, because there are fewer choices of resources to process these tasks.
Tasks in the workflow are then ordered in HEFT based on a rank function. For an exit task Ti, the rank value is:

rank(Ti) = wi    (7.3)

The rank values of other tasks are computed recursively based on Eqs. 7.1, 7.2 and 7.3, as shown in Eq. 7.4:

rank(Ti) = wi + max over Tj ∈ succ(Ti) of (wi,j + rank(Tj))    (7.4)
where succ(Ti) is the set of immediate successors of task Ti. The algorithm then sorts the tasks by decreasing order of their rank values. The task with a higher rank value is given higher priority. In the resource selection phase, tasks are scheduled in the order of their priorities and each task is assigned to the resource that can complete it at the earliest time.
Even though the original HEFT proposed by Topcuoglu et al. [51] computes the rank value for each task using the mean value of the task execution time and communication time over all resources, Zhao and Sakellariou [62] investigated and compared the performance of the HEFT algorithm produced by the other approximation methods in different cases. The results of the experiments showed that the mean value method is not the most efficient choice, and that the performance could differ significantly from one application to another [62].
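The upward ranks of Eqs. 7.3 and 7.4 together with the resource selection phase can be sketched in Python. This is an illustrative sketch, not the reference implementation of Topcuoglu et al.: it omits insertion-based slot search, charges the edge weight even between tasks mapped to the same resource, and all names are assumptions.

```python
def heft_schedule(tasks, succ, w, wc, resources, eet):
    """HEFT sketch: upward ranks per Eqs. 7.3-7.4, then earliest-completion
    resource selection.

    w[t]: weight of task t (e.g. mean execution time over resources)
    wc[(i, j)]: weight of edge (Ti, Tj) (e.g. mean communication time)
    eet[t][r]: estimated execution time of task t on resource r
    """
    rank = {}
    def upward_rank(t):
        # rank(Ti) = wi for an exit task, otherwise
        # rank(Ti) = wi + max over Tj in succ(Ti) of (wi,j + rank(Tj))
        if t not in rank:
            rank[t] = w[t] + max(
                (wc[(t, s)] + upward_rank(s) for s in succ.get(t, [])),
                default=0.0)
        return rank[t]
    for t in tasks:
        upward_rank(t)
    order = sorted(tasks, key=lambda t: rank[t], reverse=True)

    pred = {t: [p for p in tasks if t in succ.get(p, [])] for t in tasks}
    avail = {r: 0.0 for r in resources}    # when each resource becomes free
    finish, schedule = {}, {}
    for t in order:
        best_ct, best_r = None, None
        for r in resources:
            # a task may start once all predecessor data has arrived
            ready = max((finish[p] + wc[(p, t)] for p in pred[t]), default=0.0)
            ct = max(ready, avail[r]) + eet[t][r]
            if best_ct is None or ct < best_ct:
                best_ct, best_r = ct, r
        finish[t], schedule[t] = best_ct, best_r
        avail[best_r] = best_ct
    return order, schedule
```

Sorting by decreasing rank guarantees that every task is mapped after all of its predecessors, which is what makes the single greedy pass well defined.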
Dependency-Batch Mode
Sakellariou and Zhao [45] proposed a hybrid heuristic for scheduling DAGs on heterogeneous systems. The heuristic combines the dependency mode and the batch mode. As described in Algorithm 7.5, the heuristic first computes the rank value of each task and ranks all tasks in decreasing order of their rank values (Algorithm 7.5: lines 1-3). It then creates groups of independent tasks (Algorithm 7.5: lines 4-11). In the grouping phase, it processes tasks in the order of their rank values and adds tasks into the current group. Once it finds a task which has a dependency with any task within the group, it creates a new group. As a result, a number of groups of independent tasks are generated, and a group number is assigned
2: compute the rank value of each task according to Eqs. 7.3 and 7.4
3: sort the tasks into a scheduling list Q by decreasing order of task rank value
4: create a new group Gi and i = 0
5: while Q is not empty do
6:     t ← remove the first task from Q
7:     if t has a dependency with a task in Gi then
8:         i++; create a new group Gi
9:     end if
10:    add t to Gi
11: end while
12: j = 0
13: while j <= i do
14:    schedule the tasks in Gj using a batch mode algorithm
15:    j++
16: end while
based on the order of creation, which follows the rank values of the tasks: tasks in an earlier-created group have higher rank values than tasks in a later group. It then schedules the tasks group by group, using a batch mode algorithm to reprioritize the tasks within each group.
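The grouping phase (lines 4-11 of Algorithm 7.5) can be sketched in Python. The deps mapping, assumed here to hold each task's ancestor set, and the function name are illustrative, not from the original paper.

```python
def hybrid_groups(ranked_tasks, deps):
    """Grouping phase of the hybrid heuristic (Algorithm 7.5, lines 4-11).

    ranked_tasks: tasks already sorted by decreasing rank value.
    deps: task -> set of its ancestor tasks (tasks it depends on).
    A new group is opened whenever the next task has a dependency with
    any task already in the current group.
    """
    groups = [[]]
    for t in ranked_tasks:
        if any(g in deps.get(t, set()) or t in deps.get(g, set())
               for g in groups[-1]):
            groups.append([])      # dependency found: start a new group
        groups[-1].append(t)
    return groups
```

Each resulting group contains only mutually independent tasks, so it can be handed directly to a batch mode algorithm such as Min-Min.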
7.3.2 Meta-heuristics
Meta-heuristics provide both a general structure and strategy guidelines for developing a heuristic to solve a computational problem. They are generally applied to large and complicated problems, and provide an efficient way of moving quickly toward a very good solution. Many meta-heuristics have been applied to solving workflow scheduling problems, including GRASP, Genetic Algorithms and Simulated Annealing. The details of these algorithms are presented in the sub-sections that follow.
The construction phase (Algorithm 7.7: line 8 and lines 15-24) generates a feasible solution. A feasible solution for the workflow scheduling problem is required to meet the following conditions: a task must be started after all its predecessors have been completed, and every task appears once and only once in the schedule. In the construction phase, a restricted candidate list (RCL) is used to record the best candidates, though not necessarily the top candidate, among the resources for processing each task. There are two major mechanisms that can be used to generate the RCL: the cardinality-based RCL and the value-based RCL.
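A value-based RCL can be sketched as follows: keep every candidate resource whose greedy cost lies within a fraction alpha of the best-to-worst range, then pick one at random, which is the greedy-randomized step of GRASP. The parameter alpha and the function signature are illustrative assumptions, not from the specific GRASP variant described here.

```python
import random

def value_based_rcl_pick(candidates, cost, alpha=0.3):
    """Greedy-randomized selection with a value-based RCL: keep every
    candidate whose cost is within a fraction alpha of the best-to-worst
    range, then choose one of them uniformly at random."""
    costs = {c: cost(c) for c in candidates}
    c_min, c_max = min(costs.values()), max(costs.values())
    threshold = c_min + alpha * (c_max - c_min)
    rcl = [c for c in candidates if costs[c] <= threshold]
    return random.choice(rcl)
```

With alpha = 0 the choice collapses to pure greedy selection, and with alpha = 1 it becomes a uniformly random pick, so alpha controls the greediness/randomness trade-off.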
Table 7.5. Fitness Values and Slots for Roulette Wheel Selection

Individual   Fitness value   Slot size   Slot range
1            0.45            0.25        0 - 0.25
2            0.30            0.17        0.26 - 0.42
3            0.25            0.14        0.43 - 0.56
4            0.78            0.44        0.57 - 1.00

[Fig. 7.5. (a) Workflow application and schedule: tasks T0-T7 mapped onto resources R1-R4 over time. (b) Separated machine string and scheduling string. (c) Two-dimensional string.]
to achieve the objective function. For example, the fitness function developed in [32] is Cmax − FT(I), where Cmax is the maximum completion time observed so far and FT(I) is the completion time of the individual I. As the objective is to minimize the execution time, an individual with a large fitness value is fitter than one with a small fitness value.
After the fitness evaluation process, the new individuals are compared with the
previous generation. The selection process is then conducted to retain the fittest
individuals in the population, as successive generations evolve. Many methods
for selecting the fittest individuals have been used for solving task scheduling
problems such as roulette wheel selection, rank selection and elitism.
The roulette wheel selection assigns each individual to a slot of a roulette
wheel and the slot size occupied by each individual is determined by its fitness
value. For example, there are four individuals (see Table 7.5) and their fitness
values are 0.45, 0.30, 0.25 and 0.78, respectively. The slot size of an individual is
calculated by dividing its fitness value by the sum of all individual fitness values in the population. As illustrated in Fig. 7.6, individual 1 is placed in the slot ranging from 0 to 0.25 while individual 2 is in the slot ranging from 0.26 to 0.42. After
that, a random number is generated between 0 and 1, which is used to determine
which individuals will be preserved to the next generation. The individuals with
a higher fitness value are more likely to be selected since they occupy a larger
slot range.
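Roulette wheel selection as described above can be sketched in Python; the function name and the dict-based representation of the population are assumptions.

```python
import random

def roulette_wheel_select(fitness):
    """Pick one individual; selection probability is proportional to its
    fitness value (its slot size on the wheel).
    fitness: individual -> positive fitness value."""
    total = sum(fitness.values())
    pick, acc = random.uniform(0.0, total), 0.0
    for ind, f in fitness.items():
        acc += f
        if pick <= acc:
            return ind
    return ind  # guard against floating-point round-off
```

Using the example of Table 7.5, a call with fitness values 0.45, 0.30, 0.25 and 0.78 gives individual 4 the largest slot and hence the highest chance of surviving into the next generation.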
The roulette wheel selection will have problems when there are large dif-
ferences between the fitness values of individuals in the population [41]. For
example, if the best fitness value is 95% of all slots of the roulette wheel, other
individuals will have very few chances to be selected. Unlike the roulette wheel
selection in which the slot size of an individual is proportional to its fitness value,
a rank selection process first sorts all individuals from best to worst according to their fitness values and then assigns slots based on their rank. For example, the size of the slots implemented by Doğan and Özgüner [16] is proportional to the rank value. As shown in Table 7.6, the size of the slot for individual I is defined as PI = R(I) / Σi=1..n R(i), where R(I) is the rank value of I and n is the number of all individuals. Both the roulette wheel selection and the
rank selection select individuals according to their fitness value. The higher the
fitness value, the higher the chance it will be selected into the next generation.
However, this does not guarantee that the individual with the highest value goes
to the next generation for reproduction. Elitism can be incorporated into these
two selection methods, by first copying the fittest individual into the next gener-
ation and then using the rank selection or roulette wheel selection to construct
the rest of the population. Hou et al. [32] showed that the elitism method can
improve the performance of the genetic algorithm.
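Rank selection combined with elitism can be sketched as follows. The ranking convention (the worst individual gets rank 1, the best rank n, slot sizes proportional to rank) follows the description above, while the function name and signature are illustrative assumptions.

```python
import random

def rank_select_with_elitism(fitness, n_out):
    """Rank selection with elitism: the fittest individual is always kept,
    and the remaining slots are filled with probability proportional to
    rank (the worst individual has rank 1, the best rank n)."""
    ordered = sorted(fitness, key=fitness.get)              # worst ... best
    ranks = {ind: i + 1 for i, ind in enumerate(ordered)}   # R(I)
    total = sum(ranks.values())                             # sum of R(i)
    selected = [ordered[-1]]                                # elitism: keep the best
    while len(selected) < n_out:
        pick, acc = random.uniform(0.0, total), 0.0
        for ind in ordered:
            acc += ranks[ind]
            if pick <= acc:
                selected.append(ind)
                break
    return selected
```

Because slot sizes depend on rank rather than raw fitness, one dominant individual cannot monopolize the wheel, which addresses the large-fitness-gap problem noted above.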
In addition to selection, crossover and mutation are two other major genetic
operators. Crossovers are used to create new individuals in the current popula-
tion by combining and rearranging parts of the existing individuals. The idea
behind the crossover is that it may result in an even better individual by combin-
ing two fittest individuals [32]. Mutations occasionally occur in order to allow a
certain child to obtain features that are not possessed by either parent. It helps
a genetic algorithm to explore new and potentially better genetic material than
was previously considered. The frequency of mutation operation occurrence is
controlled by the mutation rate whose value is determined experimentally [32].
an even better solution is more likely to be derived from good solutions. Instead of creating a new solution by randomized search, SA and GAs generate new solutions by randomly modifying current, already known good solutions. SA uses a point-to-point method, where only one solution is modified in each iteration, whereas GAs manipulate a population of solutions in parallel, which reduces the probability of becoming trapped in a local optimum [65]. Another benefit of producing a collection of solutions at each iteration is that the search time can be significantly decreased by using parallelization techniques.
Compared with heuristics based scheduling approaches, the advantage of the meta-heuristics based approaches is that they produce an optimized scheduling solution based on the performance of the entire workflow, rather than a part of the workflow as considered by heuristics based approaches. Thus, unlike a heuristics based approach designed for a specific type of workflow application, they can produce good quality solutions for different types of workflow applications (e.g. different workflow structures, data- and computation-intensive workflows, etc.). However, the scheduling time required by meta-heuristics based algorithms to produce a good quality solution is significantly higher. Therefore, heuristics based scheduling algorithms are well suited to workflows with a simple structure, while meta-heuristics based approaches have greater potential for large workflows with complex structures. It is also common to combine the two types of scheduling approaches by using a solution generated by a heuristic based algorithm as the starting point for a meta-heuristics based algorithm, in order to generate a satisfactory solution in a shorter time.
workflow (known as the deadline). Cost is the total expense of executing a workflow, including usage charges for accessing remote resources and data transfer costs (known as the budget). In this section, we present scheduling algorithms based on these two constraints, called deadline constrained scheduling and budget constrained scheduling. Tables 7.9 and 7.10 present an overview of QoS constrained workflow scheduling algorithms.
faster may charge a higher price. Scheduling tasks with the best-effort scheduling algorithms presented in the previous sections, attempting to minimize the execution time, would result in high and unnecessary cost. Therefore, a deadline constrained scheduling algorithm intends to minimize the execution cost while meeting the specified deadline constraint.
Two heuristics have been developed to minimize the cost while meeting a
specified time constraint. One is proposed by Menascé and Casalicchio [37], denoted as Back-tracking, and the other is proposed by Yu et al. [60], denoted as
Deadline Distribution.
Back-tracking
[Figure: workflow task partitioning for deadline distribution. Tasks T0-T14, including synchronization tasks, are grouped into task partitions V0-V8: (a) before partitioning, (b) after partitioning.]
where PVi is the set of parent task partitions of Vi. The three attributes of a task partition Vi, its ready time rt[Vi], estimated execution time eet[Vi] and sub-deadline dl[Vi], are related by:

dl[Vi] = rt[Vi] + eet[Vi]
A sub-deadline can also be assigned to each task based on the deadline of its task partition. If the task is a synchronization task, its sub-deadline is equal to the deadline of its task partition. However, if a task is a simple task of a branch, its sub-deadline is assigned by dividing the deadline of its partition according to its processing time. Let Pi be the set of parent tasks of Ti and Si the set of resources that are capable of executing Ti. Let t_i^j be the sum of the input data transmission time and the execution time of executing Ti on the j-th resource in Si. The sub-deadline of a task in partition V is defined by:

dl[Ti] = rt[Ti] + eet[Ti]

where

eet[Ti] = ( min over 1<=j<=|Si| of t_i^j / Σ over Tk ∈ V of (min over 1<=l<=|Sk| of t_k^l) ) × eet[V]

rt[Ti] = 0 if Ti = Tentry, otherwise rt[Ti] = max over Tj ∈ Pi of dl[Tj]
Once each task has its own sub-deadline, a local optimal schedule can be generated for each task. If each local schedule guarantees that its task execution can be completed within its sub-deadline, the whole workflow execution will be completed within the overall deadline. Similarly, the result of the cost minimization solution for each task leads to an optimized cost solution for the entire workflow. Therefore, an optimized workflow schedule can be constructed from all local optimal schedules. The schedule allocates every workflow task to a selected service such that each task can meet its assigned sub-deadline at low execution cost.
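The proportional sub-deadline division described above can be sketched in Python for a single branch (pipeline) of tasks within one partition: each task receives a share of the partition's execution window proportional to its minimum execution time over capable resources, and its sub-deadline is its ready time plus that share. The chain representation and all names are assumptions.

```python
def distribute_sub_deadlines(chain, min_time, eet_partition, rt_entry=0.0):
    """Divide a partition's execution window eet[V] among the tasks of one
    branch, in proportion to each task's minimum execution time over its
    capable resources; a task's sub-deadline is its ready time (the
    sub-deadline of its predecessor) plus its share.

    chain: tasks of the branch, in execution order.
    min_time[t]: minimum execution time of task t over capable resources.
    """
    total = sum(min_time[t] for t in chain)
    dl, rt = {}, rt_entry
    for t in chain:
        share = min_time[t] / total * eet_partition
        dl[t] = rt + share
        rt = dl[t]         # successor becomes ready when this task must end
    return dl
```

The last task's sub-deadline equals rt_entry + eet_partition by construction, so the branch as a whole stays within the partition's deadline.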
Fcost(I) = c(I) / (B^α × maxCost^(1−α)), α ∈ {0, 1}    (7.10)

where c(I) is the sum of the task execution cost and data transmission cost of I, maxCost is the cost of the most expensive solution in the current population and B is the budget constraint. α is a binary variable: α = 1 if users specify a budget constraint, otherwise α = 0.
For the budget constrained scheduling, the time-fitness component is designed
to encourage the genetic algorithm to choose individuals with earliest completion
time from the current population. For the deadline constrained scheduling, it
encourages the formation of individuals that satisfy the deadline constraint. The
time fitness function of an individual I is defined by:

Ftime(I) = t(I) / (D^β × maxTime^(1−β)), β ∈ {0, 1}    (7.11)

where t(I) is the completion time of I, maxTime is the largest completion time in the current population and D is the deadline constraint. β is a binary variable: β = 1 if users specify a deadline constraint, otherwise β = 0.
For the deadline constrained scheduling problem, the final fitness function combines the two parts and is expressed as:

F(I) = Ftime(I) if Ftime(I) > 1, otherwise F(I) = Fcost(I)    (7.12)
7 Workflow Scheduling Algorithms for Grid Computing 203
For the budget constrained scheduling problem, the final fitness function combines the two parts and is expressed as:

F(I) = Fcost(I) if Fcost(I) > 1, otherwise F(I) = Ftime(I)    (7.13)
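The case analysis of Eqs. 7.12 and 7.13 can be sketched as a single function; the names are illustrative.

```python
def combined_fitness(f_cost, f_time, deadline_constrained):
    """Final fitness per Eqs. 7.12-7.13: under a deadline constraint an
    individual that violates the deadline (Ftime > 1) is judged by Ftime,
    otherwise by Fcost; under a budget constraint the roles are swapped."""
    if deadline_constrained:
        return f_time if f_time > 1 else f_cost
    return f_cost if f_cost > 1 else f_time
```

In both cases a fitness value greater than 1 flags a constraint violation, so the genetic algorithm first drives the population toward feasibility and only then optimizes the secondary objective.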
In order to apply mutation operators in a Grid environment, two types of mutation operations were developed: swapping mutation and replacing mutation. Swapping mutation aims to change the execution order of tasks in an individual that compete for the same time slot. It randomly selects a resource and swaps the positions of two randomly selected tasks on that resource. Replacing mutation re-allocates an alternative resource to a task in an individual. It randomly selects a task and replaces its current resource assignment with a resource randomly selected from the resources which are able to execute the task.
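The two mutation operators can be sketched in Python. The schedule representation, a dict mapping each resource to its ordered task queue, and the names are assumptions.

```python
import random

def swapping_mutation(schedule):
    """Swap the positions of two randomly chosen tasks queued on one
    randomly chosen resource (changes execution order, not placement).
    schedule: resource -> ordered list of tasks."""
    candidates = [r for r, ts in schedule.items() if len(ts) >= 2]
    if candidates:
        r = random.choice(candidates)
        i, j = random.sample(range(len(schedule[r])), 2)
        schedule[r][i], schedule[r][j] = schedule[r][j], schedule[r][i]
    return schedule

def replacing_mutation(schedule, capable):
    """Move one randomly chosen task to a randomly chosen resource that is
    able to execute it.  capable: task -> list of capable resources."""
    r, t = random.choice([(r, t) for r, ts in schedule.items() for t in ts])
    new_r = random.choice(capable[t])
    if new_r != r:
        schedule[r].remove(t)
        schedule.setdefault(new_r, []).append(t)
    return schedule
```

Both operators preserve feasibility: every task still appears exactly once, and replacing mutation only ever moves a task to a resource listed as capable of executing it.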
Unlike best-effort scheduling, in which only a single objective (either optimizing time or system utilization) is considered, QoS constrained scheduling needs to consider more factors such as monetary cost and reliability. It needs to optimize multiple objectives, some of which may be conflicting. However, as the number of factors and objectives to be considered increases, it becomes infeasible to develop a simple heuristic that solves QoS constrained scheduling optimization problems. For this reason, we believe that meta-heuristics based scheduling approaches such as genetic algorithms will play a more important role in multi-objective and multi-constraint based workflow scheduling.
According to many Grid workflow projects [11, 35, 55], workflow application
structures can be categorized as either balanced structure or unbalanced structure.
Examples of balanced structure include Neuro-Science application workflows [63]
and EMAN refinement workflows [35], while the examples of unbalanced struc-
ture include protein annotation workflows [40] and Montage workflows [11].
Fig. 7.9 shows two workflow structures, a balanced-structure application and an
unbalanced-structure application, used in our experiments. As shown in Fig. 7.9a,
the balanced-structure application consists of several parallel pipelines, which re-
quire the same types of services but process different data sets. In Fig. 7.9b, the
structure of the unbalanced-structure application is more complex. Unlike the
balanced-structure application, many parallel tasks in the unbalanced structure
require different types of services, and their workloads and I/O data vary significantly.
[Figure: interaction between the workflow system and Grid services: 1. a Grid service registers its service type with the GIS; 2. the workflow system queries the GIS for a service type (e.g. type A); 3. the GIS returns a service list; 4. the workflow system queries a service for available slots (availableSlotQuery(duration)); 5. the service returns its slots; 6. the workflow system makes a reservation for a task (makeReservation(task)).]

Table 7.13. Service speed and corresponding price for executing a task
Table 7.13. Service speed and corresponding price for executing a task
querying the GridSim Index Service (GIS). Every service is able to provide free
slot query, and handle reservation request and reservation commitment.
There are 15 types of services with various price rates in the simulated Grid testbed, each of which is supported by 10 service providers with various processing capabilities. The topology of the system is such that all services are connected to one another, and the available network bandwidths between services are 100 Mbps, 200 Mbps, 512 Mbps and 1024 Mbps.
For the experiments, the cost that a user needs to pay for a workflow execution comprises two parts: processing cost and data transmission cost. Table 7.13
shows an example of processing cost, while Table 7.14 shows an example of
data transmission cost. It can be seen that the processing cost and transmission
cost are inversely proportional to the processing time and transmission time
respectively.
In order to evaluate algorithms on a reasonable deadline constraint we also
implemented a time optimization algorithm, HEFT, and a cost optimization
algorithm, Greedy Cost (GC). The HEFT algorithm is a list scheduling algorithm which attempts to schedule DAG tasks with minimum execution time on a heterogeneous environment. The GC approach minimizes workflow execution cost by assigning tasks to the services of lowest cost. The deadlines used for the experiments are based on the results of these two algorithms. Let Tmax and Tmin be the total execution times produced by GC and HEFT respectively. The deadline D is then defined as a function of Tmin, Tmax and a user deadline parameter k.
Execution Time/Deadline
1 1
0.95 0.9
0.9 0.8
0.85 0.7
0.8 0.6
0 2 4 6 8 10 0 2 4 6 8 10
User Deadline (k) User Deadline (k)
Fig. 7.12. Execution cost for scheduling balanced- and unbalanced-structure
applications
Parameter Value/Type
Population size 10
Maximum generation 100
Crossover probability 0.9
Reordering mutation probability 0.5
Replacing mutation probability 0.5
Selection scheme elitism-rank selection
Initial individuals randomly generated
Fig. 7.14. Normalized Execution Time and Cost for Scheduling Balanced-structure
Application
210 J. Yu, R. Buyya, and K. Ramamohanarao
Fig. 7.15. Normalized Execution Time and Cost for Scheduling Unbalanced-structure
Application
GA performs worse than TD, since its values are higher than those of TD,
especially for the balanced-structure application. However, the results improve
when GA and TD are combined, by inserting the solution produced by TD into the
initial population of GA. As shown in Fig. 7.15a, the value of GA+TD is much
lower than that of GA and TD at the tight deadline.
As the deadline increases, both GA and TD can meet the deadline (see
Fig. 7.14a and 7.15a) and GA can outperform TD. For example, the execution time
(see Fig. 7.14a) and cost (see Fig. 7.14b) produced by GA at k = 2 are lower
than those of TD. However, as shown in Fig. 7.14b, the performance of GA
degrades and TD performs better when the deadline becomes very large (k = 8
and 10). In general, GA+TD performs best. This shows that the genetic algorithm
can improve the results returned by other simple heuristics by employing those
heuristic results as individuals in its initial population.
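This seeding strategy can be sketched as follows. The chromosome encoding (one service index per task), the operators and the fitness function are simplified placeholders rather than the chapter's exact GA; the point is that a heuristic solution (e.g. TD's schedule) joins the initial population and is preserved by elitism.

```python
import random

# Sketch of seeding a GA's initial population with a heuristic solution,
# as described above. Encoding, operators and fitness are simplified
# placeholders, not the chapter's exact GA.
def init_population(pop_size, num_tasks, num_services, seed_solution=None):
    pop = [[random.randrange(num_services) for _ in range(num_tasks)]
           for _ in range(pop_size)]
    if seed_solution is not None:
        pop[0] = list(seed_solution)    # heuristic individual joins generation 0
    return pop

def evolve(pop, num_services, fitness, generations=100,
           cx_prob=0.9, mut_prob=0.5):
    for _ in range(generations):
        pop.sort(key=fitness)                       # rank individuals
        nxt = [list(pop[0])]                        # elitism: keep the best
        while len(nxt) < len(pop):
            # Select parents from the better-ranked half (rank selection).
            a, b = random.sample(pop[:len(pop) // 2 + 1], 2)
            cut = random.randrange(1, len(a))       # one-point crossover
            child = a[:cut] + b[cut:] if random.random() < cx_prob else list(a)
            if random.random() < mut_prob:          # replacing mutation
                child[random.randrange(len(child))] = random.randrange(num_services)
            nxt.append(child)
        pop = nxt
    return min(pop, key=fitness)
```

Because the seeded individual is carried forward by elitism, the GA can never return a schedule worse (under the fitness function) than the heuristic solution it started from, which matches the observation above that GA+TD dominates both GA and TD.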
7.6 Conclusions
In this chapter, we have presented a survey of workflow scheduling algorithms for
Grid computing. We have categorized current existing Grid workflow schedul-
ing algorithms as either best-effort based scheduling or QoS constraint based
scheduling.
Best-effort scheduling algorithms target community Grids in which resource
providers offer free access. Several heuristic- and metaheuristic-based
algorithms that aim to minimize workflow execution time on community Grids
have been presented. The comparison of these algorithms in terms of computing
time, application scenarios and resource scenarios has also been examined in
detail. Since the service provisioning model of community Grids is best effort,
quality of service and service availability cannot be guaranteed. Therefore, we
have also discussed several techniques for employing the scheduling algorithms
in dynamic Grid environments.
Acknowledgment
We would like to thank Hussein Gibbins and Chee Shin Yeo for their comments
on this paper. This work is partially supported through Australian Research
Council (ARC) Discovery Project grant.
References
1. Almond, J., Snelling, D.: UNICORE: Uniform Access to Supercomputing as an
Element of Electronic Commerce. Future Generation Computer Systems 15, 539–
548 (1999)
2. The Austrian Grid Consortium, http://www.austriangrid.at
3. Bajaj, R., Agrawal, D.P.: Improving Scheduling of Tasks in a Heterogeneous En-
vironment. IEEE Transactions on Parallel and Distributed Systems 15, 107–118
(2004)
4. Berman, F., et al.: New Grid Scheduling and Rescheduling Methods in the GrADS
Project. International Journal of Parallel Programming (IJPP) 33(2-3), 209–229
(2005)
5. Berriman, G.B., et al.: Montage: a Grid Enabled Image Mosaic Service for the
National Virtual Observatory. In: ADASS XIII, ASP Conference Series (2003)
6. Berti, G., et al.: Medical Simulation Services via the Grid. In: HealthGRID 2003
conference, Lyon, France, January 16-17 (2003)
7. Benkner, S., et al.: VGE - A Service-Oriented Grid Environment for On-Demand
Supercomputing. In: The 5th IEEE/ACM International Workshop on Grid Com-
puting (Grid 2004), Pittsburgh, PA, USA (November 2004)
8. Binato, S., et al.: A GRASP for job shop scheduling. In: Essays and surveys on
meta-heuristics, pp. 59–79. Kluwer Academic Publishers, Dordrecht (2001)
9. Blackford, L.S., et al.: ScaLAPACK: a linear algebra library for message-passing
computers. In: The Eighth SIAM Conference on Parallel Processing for Scientific
Computing (Minneapolis, MN, 1997), Philadelphia, PA, USA, p. 15 (1997)
10. Blaha, P., et al.: WIEN2k: An Augmented Plane Wave plus Local Orbitals Program
for Calculating Crystal Properties. Institute of Physical and Theoretical Chemistry,
Vienna University of Technology (2001)
11. Blythe, J., et al.: Task Scheduling Strategies for Workflow-based Applications in
Grids. In: IEEE International Symposium on Cluster Computing and the Grid
(CCGrid 2005) (2005)
12. Braun, T.D., Siegel, H.J., Beck, N.: A Comparison of Eleven static Heuristics for
Mapping a Class of Independent Tasks onto Heterogeneous Distributed Computing
Systems. Journal of Parallel and Distributed Computing 61, 801–837 (2001)
13. Buyya, R., Venugopal, S.: The Gridbus Toolkit for Service Oriented Grid and
Utility Computing: An overview and Status Report. In: The 1st IEEE International
Workshop on Grid Economics and Business Models, GECON 2004, Seoul, Korea,
April 23 (2004)
14. Casanova, H., et al.: Heuristics for Scheduling Parameter Sweep Applications in
Grid Environments. In: The 9th Heterogeneous Computing Workshop (HCW 2000)
(April 2000)
15. Cooper, K., et al.: New Grid Scheduling and Rescheduling Methods in the GrADS
Project. In: NSF Next Generation Software Workshop, International Parallel and
Distributed Processing Symposium, Santa Fe (April 2004)
16. Doǧan, A., Özgüner, F.: Genetic Algorithm Based Scheduling of Meta-Tasks with
Stochastic Execution Times in Heterogeneous Computing Systems. Cluster Com-
puting 7, 177–190 (2004)
17. Deelman, E., et al.: Pegasus: Mapping scientific workflows onto the grid. In: Euro-
pean Across Grids Conference, pp. 11–20 (2004)
18. Fahringer, T., et al.: ASKALON: a tool set for cluster and Grid computing. Con-
currency and Computation: Practice and Experience 17, 143–169 (2005)
19. Feo, T.A., Resende, M.G.C.: Greedy Randomized Adaptive Search Procedures.
Journal of Global Optimization 6, 109–133 (1995)
20. Fitzgerald, S., et al.: A Directory Service for Configuring High-Performance Dis-
tributed Computations. In: The 6th IEEE Symposium on High-Performance Dis-
tributed Computing, Portland State University, Portland, Oregon, August 5-8
(1997)
21. Foster, I., Kesselman, C.: Globus: A Metacomputing Infrastructure Toolkit. Inter-
national Journal of Supercomputer Applications 11(2), 115–128 (1997)
22. Foster, I., Kesselman, C. (eds.): The Grid: Blueprint for a Future Computing In-
frastructure. Morgan Kaufmann Publishers, USA (1999)
23. Foster, I., et al.: Chimera: A Virtual Data System for Representing, Querying and
Automating Data Derivation. In: The 14th Conference on Scientific and Statistical
Database Management, Edinburgh, Scotland (July 2002)
24. Foster, I., et al.: The Physiology of the Grid, Open Grid Service Infrastructure
WG. In: Global Grid Forum (2002)
25. Goldberg, D.E.: Genetic Algorithms in Search, Optimization, and Machine Learn-
ing. Addison-Wesley, Reading (1989)
26. Goldberg, D.E., Deb, K.: A comparative analysis of selection schemes used in
genetic algorithms. Foundations of Genetic Algorithms, 69–93 (1991)
27. Grimshaw, A., Wulf, W.: The Legion vision of a worldwide virtual computer. Com-
munications of the ACM 40(1), 39–45 (1997)
28. He, X., Sun, X., von Laszewski, G.: QoS Guided Min-Min Heuristic for Grid Task
Scheduling. Journal of Computer Science and Technology 18(4), 442–451 (2003)
29. Hillier, F.S., Lieberman, G.J.: Introduction to Operations Research. McGraw-Hill
Science, New York (2005)
30. Hollinsworth, D.: The Workflow Reference Model, Workflow Management Coali-
tion, TC00-1003 (1994)
31. Hoos, H.H., Stützle, T.: Stochastic Local Search: Foundation and Applications.
Elsevier Science and Technology (2004)
32. Hou, E.S.H., Ansari, N., Ren, H.: A Genetic Algorithm for Multiprocessor Schedul-
ing. IEEE Transactions on Parallel and Distributed Systems 5(2), 113–120 (1994)
33. Kwok, Y.K., Ahmad, I.: Static Scheduling Algorithms for Allocating Directed Task
Graphs to Multiprocessors. ACM Computing Surveys 31(4), 406–471 (1999)
34. Ludtke, S., Baldwin, P., Chiu, W.: EMAN: Semiautomated software for high-
resolution single-particle reconstructions. Journal of Structural Biology 128, 82–97
(1999)
35. Mandal, A., et al.: Scheduling Strategies for Mapping Application Workflows onto
the Grid. In: IEEE International Symposium on High Performance Distributed
Computing (HPDC 2005) (2005)
36. Mayer, A., et al.: Workflow Expression: Comparison of Spatial and Temporal Ap-
proaches. In: Workflow in Grid Systems Workshop, GGF-10, Berlin, March 9 (2004)
37. Menascè, D.A., Casalicchio, E.: A Framework for Resource Allocation in Grid
Computing. In: The 12th Annual International Symposium on Modeling, Analysis,
and Simulation of Computer and Telecommunications Systems (MASCOTS 2004),
Volendam, The Netherlands, October 5-7 (2004)
38. Metropolis, N., et al.: Equations of state calculations by fast computing machines.
Journal of Chemical Physics 21, 1087–1091 (1953)
39. Maheswaran, M., et al.: Dynamic Matching and Scheduling of a Class of Indepen-
dent Tasks onto Heterogeneous Computing Systems. In: The 8th Heterogeneous
Computing Workshop (HCW 1999), San Juan, Puerto Rico, April 12 (1999)
40. O’Brien, A., Newhouse, S., Darlington, J.: Mapping of Scientific Workflow within
the e-Protein project to Distributed Resources, UK e-Science All Hands Meeting,
Nottingham, UK (2004)
41. Obitko, M.: Introduction to Genetic Algorithms (March 2006),
http://cs.felk.cvut.cz/~xobitko/ga/
42. Prodan, R., Fahringer, T.: Dynamic Scheduling of Scientific Workflow Applications
on the Grid using a Modular Optimisation Tool: A Case Study. In: The 20th Sym-
posium of Applied Computing (SAC 2005), Santa Fe, New Mexico, USA, March
2005. ACM Press, New York (2005)
43. Rutschmann, P., Theiner, D.: An inverse modelling approach for the estimation of
hydrological model parameters. Journal of Hydroinformatics (2005)
44. Sakellariou, R., Zhao, H.: A Low-Cost Rescheduling Policy for Efficient Mapping
of Workflows on Grid Systems. Scientific Programming 12(4), 253–262 (2004)
45. Sakellariou, R., Zhao, H.: A Hybrid Heuristic for DAG Scheduling on Heteroge-
neous Systems. In: The 13th Heterogeneous Computing Workshop (HCW 2004),
Santa Fe, New Mexico, USA, April 26 (2004)
46. Shi, Z., Dongarra, J.J.: Scheduling workflow applications on processors with dif-
ferent capabilities. Future Generation Computer Systems 22, 665–675 (2006)
47. Spooner, D.P., et al.: Performance-aware Workflow Management for Grid Com-
puting. The Computer Journal (2004)
48. Sulistio, A., Buyya, R.: A Grid Simulation Infrastructure Supporting Advance
Reservation. In: The 16th International Conference on Parallel and Distributed
Computing and Systems (PDCS 2004), November 9-11. MIT, Cambridge (2004)
49. Tannenbaum, T., et al.: Condor - A Distributed Job Scheduler. In: Computing
with Linux. MIT Press, Cambridge (2002)
50. Thickins, G.: Utility Computing: The Next New IT Model. Darwin Magazine (April
2003)
51. Topcuoglu, H., Hariri, S., Wu, M.Y.: Performance-Effective and Low-Complexity
Task Scheduling for Heterogeneous Computing. IEEE Transactions on Parallel and
Distributed Systems 13(3), 260–274 (2002)
52. Tsiakkouri, E., et al.: Scheduling Workflows with Budget Constraints. In: Gor-
latch, S., Danelutto, M. (eds.) The CoreGRID Workshop on Integrated research
in Grid Computing, Technical Report TR-05-22, University of Pisa, Dipartimento
Di Informatica, Pisa, Italy, November 28-30, pp. 347–357 (2005)
53. Ullman, J.D.: NP-complete Scheduling Problems. Journal of Computer and System
Sciences 10, 384–393 (1975)
54. Wang, L., et al.: Task Mapping and Scheduling in Heterogeneous Computing En-
vironments Using a Genetic-Algorithm-Based Approach. Journal of Parallel and
Distributed Computing 47, 8–22 (1997)
55. Wieczorek, M., Prodan, R., Fahringer, T.: Scheduling of Scientific Workflows in
the ASKALON Grid Environment. ACM SIGMOD Record 34(3), 56–62 (2005)
56. Wu, A.S., et al.: An Incremental Genetic Algorithm Approach to Multiprocessor
Scheduling. IEEE Transactions on Parallel and Distributed Systems 15(9), 824–834
(2004)
57. YarKhan, A., Dongarra, J.J.: Experiments with Scheduling Using Simulated An-
nealing in a Grid Environment. In: Parashar, M. (ed.) GRID 2002. LNCS, vol. 2536.
Springer, Heidelberg (2002)
58. Young, L., et al.: Scheduling Architecture and Algorithms within the ICENI Grid
Middleware. In: UK e-Science All Hands Meeting, pp. 5–12. IOP Publishing Ltd.,
Bristol, UK, Nottingham, UK (2003)
59. Yu, J., Buyya, R.: A Taxonomy of Workflow Management Systems for Grid Com-
puting. Journal of Grid Computing 3(3-4), 171–200 (2005)
60. Yu, J., Buyya, R., Tham, C.K.: A Cost-based Scheduling of Scientific Workflow
Applications on Utility Grids. In: The first IEEE International Conference on e-
Science and Grid Computing, Melbourne, Australia, December 5-8 (2005)
61. Yu, J., Buyya, R.: Scheduling Scientific Workflow Applications with Deadline and
Budget Constraints using Genetic Algorithms. Scientific Programming 14(3-4),
217–230 (2006)
62. Zhao, H., Sakellariou, R.: An experimental investigation into the rank function
of the heterogeneous earliest finish time scheduling algorithm. In: Kosch, H.,
Böszörményi, L., Hellwagner, H. (eds.) Euro-Par 2003. LNCS, vol. 2790, pp. 189–
194. Springer, Heidelberg (2003)
63. Zhao, Y., et al.: Grid Middleware Services for Virtual Data Discovery, Composition,
and Integration. In: The Second Workshop on Middleware for Grid Computing,
Toronto, Ontario, Canada (2004)
64. Zomaya, A.Y., Ward, C., Macey, B.: Genetic Scheduling for Parallel Processor Sys-
tems: Comparative Studies and Performance Issues. IEEE Transactions on Parallel
and Distributed Systems 10(8), 795–812 (1999)
65. Zomaya, A.Y., Teh, Y.H.: Observations on Using Genetic Algorithms for Dynamic
Load-Balancing. IEEE Transactions on Parallel and Distributed Systems 12(9),
899–911 (2001)