
CMS Data Challenges. The Nature of The Problem. What Is GMA ? and What Is R-GMA ? Performance Test Description Performance Test Results Conclusions


• CMS data challenges: the nature of the problem
• What is GMA?
• And what is R-GMA?
• Performance test description
• Performance test results
• Conclusions
• As part of the preparations for data taking, CMS is performing DATA CHALLENGES.
• A large number of simulated events is needed to optimise the detectors and prepare the software.
• Enormous processing requirements

BUT
• each event is independent of all the others
• each event can be generated on a machine without any interaction with any other
Work is split between farms.

How to handle the book-keeping?
• A database, automatically updated
• Implemented via a job wrapper, BOSS
• Output to <stdout> and <stderr> is intercepted and the information is recorded in a MySQL production database.
• Event generation and job accounting are decoupled
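
The wrapper idea above can be sketched in a few lines. This is only an illustration of intercepting a job's output and recording it in a database, not the BOSS implementation: the function name `run_and_record`, the table `job_output`, and the use of SQLite in place of the MySQL production database are all assumptions made for the example.

```python
import sqlite3
import subprocess
import sys


def run_and_record(job_id, command, db):
    """Run `command`, capture its output, and store each line in `db`.

    Hypothetical helper: the table layout is an assumption for this sketch.
    """
    db.execute(
        "CREATE TABLE IF NOT EXISTS job_output "
        "(job_id TEXT, stream TEXT, line TEXT)"
    )
    # Intercept both streams instead of letting them reach the terminal.
    result = subprocess.run(command, capture_output=True, text=True)
    for stream, text in (("stdout", result.stdout), ("stderr", result.stderr)):
        for line in text.splitlines():
            db.execute(
                "INSERT INTO job_output VALUES (?, ?, ?)",
                (job_id, stream, line),
            )
    db.commit()
    return result.returncode


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")  # stand-in for the production database
    code = run_and_record(
        "job-1",
        [sys.executable, "-c", "print('events generated: 100')"],
        db,
    )
    print(code, db.execute("SELECT stream, line FROM job_output").fetchall())
```

Because the job itself only writes to its output streams, the event generation code needs no knowledge of the accounting database, which is the decoupling the slide refers to.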
[Diagrams: two layouts showing worker nodes (WN), a submission machine (UI) and a database machine.]
[Diagram: GMA. The Producer registers with the Registry (directory services). The Consumer asks the Registry to locate a producer and receives the address of the producer. Data then passes between Producer and Consumer.]
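
The producer/registry/consumer interaction can be modelled as a toy in-process example. The class and function names, the address strings, and the `network` dictionary standing in for real connections are all hypothetical; this is a sketch of the pattern, not of any GMA implementation.

```python
class Registry:
    """Directory service: maps a table name to producer addresses."""

    def __init__(self):
        self._producers = {}

    def register(self, table, address):
        self._producers.setdefault(table, []).append(address)

    def locate(self, table):
        return list(self._producers.get(table, []))


class Producer:
    """Holds data locally and announces itself to the registry."""

    def __init__(self, address, table, registry):
        self.address = address
        self.rows = []
        registry.register(table, address)  # register producer

    def publish(self, row):
        self.rows.append(row)


def consume(table, registry, network):
    """Locate producers via the registry, then read from each one directly.

    `network` maps an address to a Producer (a stand-in for real connections).
    """
    rows = []
    for address in registry.locate(table):
        rows.extend(network[address].rows)  # data does not pass through the registry
    return rows
```

In this sketch the registry only ever stores addresses; the data itself stays with the producers until a consumer asks for it.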
Developed for the E(uropean) D(ata) G(rid).
Extends the GMA in the following ways:
1. Introduces a time stamp on the data.
   • Can be used for information and monitoring
2. A relational implementation
3. Hides the registry behind the API
Each Virtual Organisation appears to have one RDBMS.
The user interface to R-GMA is via SQL statements
(not all SQL statements and structures are supported).
• Information is advertised via a table create
• Information is published via insert
• Information is read via select … from table
• The first read request registers the consumer as interested in this data.
• Relational queries are supported
NOTE: SQL is the interface; it should not be supposed that an actual database lies behind it.
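
The three statement shapes listed above (create, insert, select) look roughly as follows. SQLite is used here purely as a convenient way to run the SQL; it is not the R-GMA API, and the table `job_status`, its columns, and the explicit time-stamp column are assumptions made for the illustration.

```python
import sqlite3
import time

# Stand-in only: per the note above, R-GMA need not have a database behind it.
conn = sqlite3.connect(":memory:")

# Advertise: declare the table that will carry the information.
conn.execute("CREATE TABLE job_status (job_id TEXT, message TEXT, ts REAL)")

# Publish: each tuple is inserted (the time stamp is explicit in this sketch).
conn.execute(
    "INSERT INTO job_status VALUES (?, ?, ?)", ("job-1", "started", time.time())
)
conn.execute(
    "INSERT INTO job_status VALUES (?, ?, ?)", ("job-2", "started", time.time())
)

# Read: a relational query over the published tuples.
rows = conn.execute(
    "SELECT job_id, message FROM job_status WHERE job_id = ?", ("job-1",)
).fetchall()
print(rows)
```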
R-GMA can be dropped into the framework with very little disruption:
1. Set-up calls for MySQL are replaced by those for R-GMA producers.
2. An archiver (joint consumer/producer) runs on a single machine, collects the data from all the running jobs and writes it to a local database (and possibly republishes it).
The data can then be queried either by direct MySQL calls or via an R-GMA consumer (a distributed database has been created).
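
A minimal sketch of the archiver role, assuming tuples from the running jobs arrive on an in-process queue. The function name `archive`, the table layout, the optional `republish` callback, and the use of SQLite instead of MySQL are all hypothetical choices for the example.

```python
import queue
import sqlite3


def archive(incoming, db, republish=None):
    """Drain `incoming` into a local table; optionally republish each tuple.

    `incoming` is a queue of (job_id, message) tuples (consumer half);
    `republish` is an optional callable (producer half).
    """
    db.execute("CREATE TABLE IF NOT EXISTS archive (job_id TEXT, message TEXT)")
    count = 0
    while True:
        try:
            row = incoming.get_nowait()
        except queue.Empty:
            break  # nothing left to collect for now
        db.execute("INSERT INTO archive VALUES (?, ?)", row)
        if republish is not None:
            republish(row)
        count += 1
    db.commit()
    return count
```

Once the tuples are in the local table they can be read with ordinary SQL, which corresponds to the "direct MySQL calls" route described above.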
[Diagram: BOSS jobs and R-GMA components linked by LAN and WAN connections to the database.]
• The architecture of GMA clearly provides a putative solution to the wide-area monitoring problem.

BUT
• Does a specific implementation provide a practical solution?
• Before entrusting CMS production to R-GMA, we must be confident that it will perform.
• At what load will it fail, and why?


Mean message length: 35 chars.
Multi-threaded job:
• Each thread produces messages (length 35 chars, suitable distribution).
• The distribution of thread starting times can be altered.
• One machine delivers the R-GMA load of a farm.
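
A load generator of this kind might look like the sketch below. It simplifies the description: every message is exactly 35 characters rather than drawn from a distribution, start times are drawn uniformly, and `publish` is a caller-supplied callback standing in for an R-GMA producer call. All names here are hypothetical.

```python
import random
import string
import threading
import time

MESSAGE_LENGTH = 35  # from the test description; fixed here for simplicity


def make_message(rng):
    """Return a random message of MESSAGE_LENGTH characters."""
    return "".join(rng.choice(string.ascii_letters) for _ in range(MESSAGE_LENGTH))


def simulated_job(job_id, n_messages, start_delay, publish, seed):
    """One thread plays the part of one job on the farm."""
    rng = random.Random(seed)
    time.sleep(start_delay)  # staggered start
    for _ in range(n_messages):
        publish(job_id, make_message(rng))


def run_load(n_jobs, n_messages, max_start_delay, publish, seed=0):
    """Start n_jobs threads with uniformly distributed start delays and wait for them."""
    rng = random.Random(seed)
    threads = [
        threading.Thread(
            target=simulated_job,
            args=(i, n_messages, rng.uniform(0, max_start_delay), publish, seed + i + 1),
        )
        for i in range(n_jobs)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Changing how `start_delay` is drawn is the knob that corresponds to altering the distribution of thread starting times.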

[Diagram: an R-GMA servlet and an R-GMA consumer.]
One machine per grid cluster, providing loads greater than those of the cluster.
[Diagram: four R-GMA servlets and one R-GMA consumer.]
R-GMA can survive loads of around 20% of the current CMS requirements and does provide a grid method for monitoring. Overloading it by a factor of 2 in the number of jobs causes problems after about five minutes of running.
We believe these instabilities are soluble.

When production starts in earnest we will compare reality with our model.
