A Reference Model for Grid Architectures and Its Analysis
Carmen Bratosin, Wil van der Aalst, Natalia Sidorova, and Nikola Trčka
1 Introduction
The main conclusion, on which most of the respondents agree, is that grid computing is about sharing resources in a distributed environment. This definition, however, only offers an idea of what a grid is and not of how it actually works.
In order to classify all the functionalities that a grid system should provide, [13] describes a grid architecture as composed of five layers: (1) the fabric, providing resources such as computational units and network resources; (2) the connectivity layer, composed of communication and authentication protocols; (3) the resource layer, implementing negotiation, monitoring, accounting, and payment for individual resources; (4) the collective layer, focusing on global resource management; and finally, (5) the layer composed of user applications. A similar classification is given in [1], where the architecture is composed of four layers: (1) resources, composed of the actual grid resources like computers and storage facilities; (2) network, connecting the resources; (3) the middleware layer, equivalent to the collective layer of [13], but also including some of the functionality of the resource layer (e.g., monitoring); and (4) the application layer. In both [13] and [1], as well as in most of the other similar work done by practitioners, the grid architecture is described only at a very high level. The separation between the main parts of the grid is not well defined. Moreover, there is a huge gap between the architectural models, usually given in terms of informal diagrams, and the actual grid implementations, which use an engineering-like approach. A good conceptual reference model for grids is missing.
This paper tries to fill the gap between high-level architectural diagrams and concrete implementations by providing a colored Petri net (CPN) [14] describing a reference grid architecture. Petri nets [16] are a well-established graphical formalism, able to model concurrency, parallelism, communication and synchronization. CPNs extend Petri nets with data, time and hierarchy, and combine their strength with the strength of programming languages. For these reasons, we consider CPNs to be a suitable language for modeling grid architectures. The CPN reference model, being formal, resolves ambiguities and provides semantics. Its graphical nature and hierarchical composition contribute to a better understanding of the whole grid mechanism. Moreover, as CPN models are executable and supported by CPN Tools [11] (a powerful modeling and simulation framework), the model can be used for rapid prototyping, and for all kinds of analysis ranging from model checking to performance analysis.
The literature refers to different types of grids, based on the main applications supported. For example, a data grid is used for managing large sets of data distributed over several locations, and a computational grid focuses on offering computing power for large and distributed applications. Each type of grid has its particular characteristics, making it a non-trivial task to unify them. This paper focuses only on computational grids. However, we also take into account some data aspects, such as the input and output data of computational tasks and the duration of data transfer, as these are important aspects for the analysis.
(Computational) grids are used in different domains ranging from biology and
physics to weather forecasting and business intelligence. Although the results
presented in this paper are highly generic, we focus on process mining as an application domain. The basic idea of process mining is to discover, monitor and
improve real processes (i.e., not assumed processes) by extracting knowledge
from event logs [4,3]. It is characterized by an abundance of data (i.e., event
logs containing millions of events) and potentially computing intensive analysis
routines (e.g., genetic algorithms for process discovery). However, as many of its
algorithms can be distributed by partitioning the log or model, process mining
is an interesting application domain for grid computing.
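To make the distribution argument concrete, the sketch below splits an event log into per-case partitions that could be mined independently on different grid resources. This is plain Python under our own assumptions about the log representation; it is not part of the CPN model.

```python
from itertools import islice

def partition_log(event_log, num_chunks):
    """Split an event log (case id -> list of events) into num_chunks
    roughly equal groups of cases, e.g. one group per grid resource."""
    cases = list(event_log.items())
    chunk_size = max(1, -(-len(cases) // num_chunks))  # ceiling division
    it = iter(cases)
    return [dict(islice(it, chunk_size)) for _ in range(num_chunks)]

# A toy log with four cases, partitioned for two resources.
log = {"c1": ["a", "b"], "c2": ["a", "c"], "c3": ["a", "b", "c"], "c4": ["a"]}
print(partition_log(log, 2))
```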
At TU/e we are involved in many challenging applications of process mining that could benefit from the grid (a recent example is our analysis of huge logs coming from the “CUSTOMerCARE Remote Services Network” of Philips Medical Systems). We need a good experimental framework that allows us to experiment with different scheduling techniques and grid application designs. To show how our CPN model can be applied in this direction, and that it is not only suitable as a descriptive model, we perform a simulation study. Using a small (but typical) process mining application as input, we conduct several simple experiments to see how parameters such as the arrival rate, distribution strategies, and data transfer influence the throughput time of an application and the resource utilization. The simulations are done under the realistic hypothesis that the resources are unreliable, i.e., can appear and disappear at any moment in time. For the visualization and analysis of the results we use the link between CPN Tools and the SPSS software [2] and the ProM framework [5]. Note that in this paper we do not aim to come up with a novel middleware design, or to invent a new scheduling policy, but rather to illustrate the powerful capabilities of the model and its simulation environment.
The rest of the paper is organized as follows. In the remainder of this section
we discuss some related work. In Section 2 we present a grid architecture and its
CPN model. The simulation experiments are presented in Section 3. Section 4
concludes the paper.
Related Work. While formal techniques are widely used to describe grid workflows [12,6,8], only a few attempts have been made to specify the semantics of a complete grid. In [15] a semantic model for grid systems is given using Abstract State Machines [7] as the underlying formalism. The model is very high-level (a refinement method is only informally proposed) and no analysis is performed. [18] gives a formal semantic specification (in terms of the Pi-calculus) for the dynamic binding and interactive concurrency of grid services. The focus there is on grid service composition.
In order to analyze grid behavior several researchers developed grid simulators.
(The most notable examples are SimGrid [10] and GridSim [9].) These simulators
are typically Java or C implementations, meant to be used for the analysis of
scheduling algorithms. They do not provide a clear reference model as their
functionality is hidden in code. This makes it difficult to check the alignment
between the real grid and the simulated grid.
In [8] we proposed to model grid workflows using CPNs, and we also used process mining as a running example. In that paper, however, we fully covered only the application layer of the grid; for the other layers the basic functionality was modeled, just to close the model and make the analysis possible. We also completely abstracted from data aspects.
[Fig. 1. The top-level CPN model: the application layer, the Middleware and the Resources modules, connected by places for jobs, finished jobs, job cancellations, data registration and removal, and resource claims.]
2.1 Application Layer

The application layer allows users to define their experiments without being concerned with lower-level issues such as resource discovery or data movement. In order to achieve this goal the application layer
provides a grid workflow description language that is independent of resources
and their properties. Defined workflows can therefore be (re)used on different
grid platforms.
In our case, CPNs are themselves used to model grid workflows. However, they are expected to follow a certain pattern. The user describes each job as a tuple (ID, AP, OD), where ID is the set of input data, AP is the application to be executed, and OD is the set of output data. Every data element is represented by its logical name, leaving the user the possibility to parameterize the workflow at run-time with actual data. All data is case related, i.e., if the user wants to perform multiple experiments on the same data set, the data is replicated and the dependency information is lost. It is assumed that the set ID contains only existing data, i.e., that at least one resource has this data (e.g., created by previous jobs). It is also assumed that AP is an application existing at the level of the resources.
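To make this concrete, the job tuple can be written down as a small record. The sketch below is plain Python, not the CPN colour set used in the actual model; the field names are ours.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Job:
    """A job as seen by the application layer: input data (ID), the
    application to execute (AP), and output data (OD), all referred to
    by logical data names."""
    input_data: List[str]    # ID: names of data that must already exist
    application: str         # AP: an application available at the resources
    output_data: List[str]   # OD: names of the data the job will produce

# Example: the filtering step of the workflow discussed below.
filter_job = Job(input_data=["Log", "FilterFile"],
                 application="Filter",
                 output_data=["FLog"])
```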
In Figure 2 we present the CPN model of the application layer. In this case the layer consists of only one workflow, but multiple cases can be generated by the GenCases module. Every workflow is preceded by the substitution transition RegisterData. This transition, as the name implies, registers for every new case the location of the input data required for the first job. The workflow from Figure 2 thus needs a “Log” and a “FilterFile”. When all the jobs in a case have been executed, the application layer sends a message to the middleware instructing it to delete all the data of the case (transition Garbage Removal).
In Figure 3, an example of a workflow is presented, describing a simple, but
very typical, process mining experiment. The event log (“Log”) is first filtered
using a filter described in “Filter File”. Then, the obtained “Filtered Log” is
mined and the result of the mining algorithm (“PN”) is assessed by using the conformance checker (to see, e.g., how many traces from the log can be reproduced by the mined model).
[Fig. 2. The CPN model of the application layer: cases generated by GenCases have their input data registered, the workflow is executed, and garbage removal is requested when a case completes.]
[Fig. 3. An example process mining workflow: the Log is filtered using the FilterFile, the resulting filtered log (FLog) is mined into a Petri net (PN), and the conformance checker produces a CCResult.]
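Using the Job record sketched earlier, the workflow of Figure 3 amounts to three jobs chained by their logical data names. The application names below are our shorthand, not necessarily the plugin names used in the actual experiment.

```python
# The process mining workflow of Figure 3 as a list of job tuples.
workflow = [
    Job(["Log", "FilterFile"], "Filter", ["FLog"]),        # filter the event log
    Job(["FLog"], "Mine", ["PN"]),                          # discover a Petri net
    Job(["FLog", "PN"], "ConformanceCheck", ["CCResult"]),  # assess the mined model
]
```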
All jobs follow the pattern from Figure 4. Each logical data name (from ID and OD) is modeled as a distinct place. In this way we can easily observe the data dependencies between jobs. These dependencies can be used in an optimization algorithm, for example to decide whether some data is no longer needed, or when some data should be moved closer to the jobs that will need it.
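A minimal sketch of such a dependency check, reusing the Job and workflow sketches above (this is our illustration, not the authors' optimization algorithm): a data element can be discarded once no remaining job lists it as input.

```python
def obsolete_data(known_data, remaining_jobs):
    """Return the logical data names that no remaining job still needs."""
    still_needed = {name for job in remaining_jobs for name in job.input_data}
    return set(known_data) - still_needed

# After the Filter job has run, only FLog (and later PN) is still needed:
print(obsolete_data(["Log", "FilterFile", "FLog"], workflow[1:]))
```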
[Fig. 4. The CPN pattern followed by every job: the job identifier combines the case and task ids, input and output data elements appear as separate places, and a job either finishes or is canceled.]
2.2 Middleware
The link between user applications and resources is made via the middleware layer (Figure 5). This layer contains the intelligence needed to discover, allocate, and monitor resources for jobs. We consider just one centralized middleware, but our model can easily be extended to a distributed middleware. We also restrict ourselves to a middleware working according to a “just-in-time” strategy, i.e., the search for an available resource is done only at the moment a job becomes available. If there are multiple suitable resources, an allocation policy is applied. Look-ahead strategies and advanced planning techniques are not considered in this paper.
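The just-in-time idea can be pictured as follows. This is a plain Python sketch under our own assumptions about the resource view and the policy interface; the actual model expresses the matching with CPN ML guards and functions.

```python
def match(job, resource_view, policy):
    """Just-in-time matching: consult only the middleware's current (and
    possibly outdated) view of the resources; return a resource id or None."""
    candidates = [rid for rid, info in resource_view.items()
                  if info["available"]
                  and info["free_cpus"] > 0
                  and job.application in info["applications"]]
    return policy(candidates, resource_view) if candidates else None

# One possible allocation policy: pick the resource with the most free CPUs.
def most_free_cpus(candidates, resource_view):
    return max(candidates, key=lambda rid: resource_view[rid]["free_cpus"])
```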
The place GlobalResInformation models an information database containing
the current state of resources. The middleware uses this information to match
jobs with resources, and to monitor the behavior of resources. The database
is updated based on the information received from the resource layer and the
realized matches.
The Data Catalog is a database containing information about the location of data elements and the amount of storage area that they occupy. This information is also used when allocating jobs and when constructing the lists of data transfers needed by allocated jobs.
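A possible shape for the catalog, and for a transfer-list helper in the spirit of the createTransferList function visible in the scheduler module, is sketched below in Python; the field names and the helper itself are our assumptions.

```python
# Data catalog: logical name -> (set of resource ids holding a copy, size in GB).
catalog = {"Log": ({"res1"}, 50.0), "FilterFile": ({"res2"}, 0.1)}

def create_transfer_list(target_rid, job, catalog):
    """List the input data elements that must be shipped to the allocated
    resource, each with a source resource that currently holds a copy."""
    transfers = []
    for name in job.input_data:
        holders, size_gb = catalog[name]
        if target_rid not in holders:
            transfers.append({"data": name, "from": next(iter(holders)),
                              "to": target_rid, "size_gb": size_gb})
    return transfers
```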
[Fig. 5. The CPN model of the middleware layer, with modules for job receiving, scheduling, monitoring, fault handling, and data management, built around the global resource information and the data catalogue.]
[Fig. 6. The scheduling module: jobs from the pool are matched to resources according to the scheduling policy, a claim request is sent to the resource layer, and a failed claim returns the job to the pool.]
When the middleware receives the message that a job is finished, it updates the
global resource information database and forwards this message to the application
layer.
Jobs can fail at the resource layer. Therefore, a fault handling mechanism
is defined (transition Fault-Handling). When a job fails, the middleware tries
to re-allocate it. However, if the necessary input data is no longer available at
the resource level, the middleware is unable to execute the job and it sends a
message to the application layer that the job is canceled.
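In Python terms, the fault-handling rule reads as below; the catalog shape follows the earlier sketch, and reschedule/cancel stand for the corresponding messages in the model.

```python
def handle_failed_job(job, catalog, reschedule, cancel):
    """Re-allocate a failed job if every input still has a holder in the
    catalog; otherwise notify the application layer that the job is canceled."""
    if all(name in catalog and catalog[name][0] for name in job.input_data):
        reschedule(job)
    else:
        cancel(job)
```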
2.3 Resource Layer
Every resource is described in terms of the available computing power (expressed in the number of CPUs), the amount of storage area available for hosting data, the list of supported applications, and the set of running and allocated jobs. The resources are unaware of job issuers and of job dependencies. Every job is executed on just one resource. However, resources can work on multiple jobs at the same time. Figure 7 presents the conceptual representation of the functionalities of the resource layer in terms of CPNs.
The set of resources is assumed to be fixed, but the resources are considered unreliable. They can appear/disappear at any moment in time, except when transferring data. The transition Resource Dynamics, governed by a stochastic clock, simulates the possibility that a resource becomes unavailable. When this happens, all the data on the resource is lost and all its running jobs are aborted.
After a successful match by the middleware, the transition Claim is used to represent a guarantee that the allocated resource can execute the job. Recall that this phase is necessary because the allocation at the middleware level is based on possibly outdated information about the state of the resources. If the claiming succeeds, one CPU and the estimated necessary storage area are reserved at the resource. The resource is now ready to perform the job, and waits for the full job description to arrive. The job description also contains the locations of the input data and the information on which application to execute. The substitution transition Transfer models the gathering of the necessary input data from other resources. If the input data is no longer present on a source node, the job is aborted. If the transfer starts, we assume that it ends successfully. Note that the reserved CPU remains unoccupied during the transfer. When all the input data is present on the allocated resource, the job starts executing.
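The claim step can be sketched as follows, assuming a resource simply tracks its free CPUs and free storage (an illustration in Python; the model itself performs this check with CPN ML guards).

```python
def claim(resource, estimated_storage_gb):
    """Try to reserve one CPU and the estimated storage area for a job.
    Returns True if the claim succeeds, False if the middleware's view of
    this resource turned out to be outdated."""
    if resource["free_cpus"] >= 1 and resource["free_storage_gb"] >= estimated_storage_gb:
        resource["free_cpus"] -= 1
        resource["free_storage_gb"] -= estimated_storage_gb
        return True
    return False
```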
The resources can always offer their capabilities, so the resource layer constantly updates the middleware on the current state of the resources. There are two types of updates sent: (1) the recognition of new data (transferred data, or data generated by job execution) and (2) signals announcing to the middleware that a resource is still available. While the former is sent on every change in the resource status, the latter is sent periodically.
The Remove Data transition models the fact that any data can be deleted from a resource at the request of the middleware. These requests can arrive and be fulfilled at any moment in time.
[Fig. 7. The CPN model of the resource layer: claiming, data transfer, job execution, resource dynamics governed by a stochastic clock, and status updates sent to the middleware.]
The reference model presented in this section offers a clear view and a good
understanding of our grid architecture. The next section shows how we can use
this model to also analyze the behavior of the grid.
3 Simulation Experiments

The testbed for our experiment is as follows. We consider a resource pool containing 6 identical resources, each having 3 CPUs and a storage area of 1000 GB. The resources are unreliable, and can appear/disappear at any moment. Their dynamics are governed by a uniform distribution. We assume that the resources are used exclusively for our process mining applications. Every resource can perform the three process mining operations, i.e., Filtering, Mining and Conformance Check. All user applications follow the workflow structure from Figure 3. The individual cases arrive according to an exponential distribution, and have uniformly distributed input file sizes (i.e., the sizes of the log and the filter file). We take the scheduling policy to be first-ready-first-served, but the scheduling algorithm gives priority to the more advanced cases, i.e., Conformance Check jobs have higher priority than Filtering jobs. The motivation for this comes from the fact that garbage removal takes place only at case completion.
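In Python terms, this priority rule can be sketched as a sort key over the waiting jobs; the stage names follow our earlier shorthand, and ties within a stage are broken by arrival time (first-ready-first-served).

```python
# Lower number = scheduled earlier; more advanced cases come first.
STAGE_PRIORITY = {"ConformanceCheck": 0, "Mine": 1, "Filter": 2}

def pick_next_job(waiting_jobs):
    """waiting_jobs: list of (arrival_time, application_name) pairs."""
    return min(waiting_jobs, key=lambda j: (STAGE_PRIORITY[j[1]], j[0]))

print(pick_next_job([(10.0, "Filter"), (12.5, "ConformanceCheck"), (3.0, "Filter")]))
# -> (12.5, 'ConformanceCheck')
```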
The performance measures in question are examined for job arrival rates of 2, 4, 6, 8, and 10 jobs per 100 time units. We perform 10 independent simulations for each of the examined configurations, and we calculate 95% confidence intervals. Each simulation run is limited to 2000 jobs.
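For reference, a 95% confidence interval over ten replications can be computed with a standard t-based formula, as sketched below; the sample values are placeholders, not measured results.

```python
from statistics import mean, stdev

def confidence_interval_95(samples, t_quantile=2.262):
    """t-based 95% confidence interval; 2.262 is the two-sided t quantile
    for 9 degrees of freedom (10 replications). Returns (lower, upper)."""
    half_width = t_quantile * stdev(samples) / (len(samples) ** 0.5)
    return mean(samples) - half_width, mean(samples) + half_width

print(confidence_interval_95([198, 205, 190, 211, 202, 199, 207, 195, 204, 200]))
```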
In our first experiment we assume that data required for a job is always copied to the allocated resource, and never moved (i.e., it also stays on the source resource). This strategy, on the one hand, overloads the grid with a lot of replicated data and therefore reduces performance. On the other hand, however, it gives the middleware more options when allocating a job, thus improving the performance.
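The difference between copying and moving boils down to whether the source keeps its copy after a transfer; using the catalog shape assumed earlier, this is a one-line difference.

```python
def transfer(data_name, src_rid, dst_rid, catalog, move=False):
    """Register a data transfer in the catalog: copy by default, or move."""
    holders, _size_gb = catalog[data_name]
    holders.add(dst_rid)
    if move:
        holders.discard(src_rid)  # moving frees the storage on the source
```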
Figure 8 shows the evolution of the performance parameters when the arrival rate is varied. Figures 8(a) and 8(b) show the evolution of the resource utilization, in terms of the number of CPUs used (called CPU load when given in percentage) and the amount of storage area occupied; Figure 8(c) shows the evolution of the throughput time. We observe that when the CPU load is less than 80%, the throughput time is around 200 time units for all the arrival rates. When the arrival rate is around 8/100, the resource load stabilizes at around 100%, but the throughput time starts to increase swiftly.
[Fig. 10. Bottlenecks found with the Performance Analysis with Petri Nets plugin and the Fuzzy Miner]
[Fig. 11. Dot Plot plugin showing the transfer event frequency]
To find the bottlenecks we do a more detailed inspection using ProM. We first apply the Performance Sequence Diagram plugin, which gives us the result shown in Figure 9. The chart represents the individual execution patterns for the case of the highest arrival rate. We observe that the execution time for filtering (including the queueing time) is higher than for the other jobs. As the resource occupation is very high, newly arrived cases wait a long time to be scheduled. Patterns 2 and 3 are the cancellation executions. The execution time for canceled jobs is higher than for those with a normal execution (Pattern 1). This is because the middleware cancels jobs based on a time-out mechanism. Similar conclusions can be drawn by using the Performance Analysis with Petri Nets plugin and the Fuzzy Miner plugin, as seen in Figure 10. Using the Dot Plot plugin (Figure 11) we observe that when the arrival rate is 1/10, the frequency of data transfers is significantly higher than for the lower arrival rates. As this arrival rate is very high, after a job is finished the next job of the same case is unlikely to be scheduled on the same resource.
In our second experiment we change the data transfer strategy, and no longer replicate the data but move it. Figure 12 shows the confidence intervals for the two strategies when the arrival rate is 1/10. With the new strategy the storage area occupation is decreased by half, and there is a slight improvement in the throughput time.
4 Conclusions
In this paper, we presented a reference model for grid architectures in terms of colored Petri nets, motivated by the absence of a good conceptual definition of the grid. Our model is formal and offers a good understanding of the main parts of the grid, their behavior and their interactions. To show that the model is not only suitable for definition purposes, we conducted a simulation experiment. Under the assumption that the grid is used for process mining applications, we compared the performance of two strategies that differ in the way they handle data transfer.
The grid model is the starting point for developing both an experimental simulation framework and a real grid architecture to support process mining experiments.
References
1. Grid architecture,
http://gridcafe.web.cern.ch/gridcafe/gridatwork/architecture.html