Introduction To Parallel Programming
2009 2
Parallel (Computing)
► Execution of several activities at the same time.
2 multiplications at the same time on 2 different processes,
Printing a file on two printers at the same time.
2009 3
Why Parallel Computing?
► Save time - wall clock time
► Solve larger problems
► Parallel nature of the problem, so parallel
models fit it best
► Provide concurrency (do multiple things at the
same time)
► Taking advantage of non-local resources
► Cost savings
► Overcoming memory constraints
► Can be made highly fault-tolerant (replication)
2009 4
What applications?
Traditional HPC vs. Enterprise Applications
2009 5
How to parallelize?
► 3 steps:
2009 6
Additional definitions
2009 7
Parallelism vs Distribution vs
Concurrency
► Parallelism sometimes proceeds from distribution:
Problem-domain parallelism
E.g.: collaborative computing
► Distribution sometimes proceeds from parallelism:
Solution-domain parallelism
E.g.: parallel computing on clusters
► Parallelism leads naturally to concurrency:
Several processes trying to print a file on a single printer
2009 8
Levels of Parallelism
Hardware
► Bit-level parallelism
Hardware solution
based on increasing processor word size
4 bits in the ‘70s, 64 bits nowadays
Focus on hardware capabilities for structuring
► Instruction-level parallelism
A goal of compiler and processor designers
Micro-architectural techniques
Instruction pipelining, superscalar execution, out-of-order execution,
register renaming
Focus on program instructions for structuring
2009 9
Levels of Parallelism
Software
► Data parallelism (loop-level)
Distribution of data (lines, records, data structures, …) across several computing entities
Each entity works in parallel on its own local structure, together processing the original data
Focus on the data for structuring
► Task Parallelism
Task decomposition into sub-tasks
Shared memory between tasks or
Communication between tasks through messages
Focus on tasks (activities, threads) for structuring
2009 10
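For illustration, a minimal C/OpenMP sketch of the data-parallel case: the loop iterations are distributed over the available threads, each thread working on its own part of the arrays.

#include <stdio.h>
#include <omp.h>

#define N 1000000

int main(void) {
    static double a[N], b[N];

    /* Data parallelism: the same operation is applied to different
       chunks of the arrays, one chunk per thread. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++) {
        a[i] = 2.0 * b[i] + 1.0;
    }

    printf("a[0] = %f, threads available: %d\n", a[0], omp_get_max_threads());
    return 0;
}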
Performance?
► Performance as Time
Time spent between the start and the end of a
computation
► Performance as rate
MIPS (millions of instructions per second)
Not equivalent on all architectures
► Peak Performance
Maximal Performance of a Resource (theoretical)
Real code achieves only a fraction of the peak
performance
2009 11
Code Performance
► How to make code go fast: “High Performance”
► Performance conflicts with
Correctness
By trying to write fast code, one can break it
Readability
Multiplication/division by 2 versus bit shifting
Fast code requires more lines
Modularity can hurt performance
– Abstract design
Portability
Code that is fast on machine A can be slow on machine B
At the extreme, highly optimized code is not portable at all,
and in fact is done in hardware.
2009 12
Speedup
[Figure: speedup vs. number of processors, showing super-linear, linear, and sub-linear speedup curves]
2009 13
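For reference, the usual definitions behind this figure: Speedup(p) = T(1) / T(p), where T(p) is the execution time on p processors, and Efficiency(p) = Speedup(p) / p. Linear speedup means Speedup(p) = p, i.e. efficiency 1; sub-linear means less, super-linear means more.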
Super Linear Speedup
► Rare
► Some reasons for speedup > p (efficiency > 1)
Parallel computer has p times as much RAM so
higher fraction of program memory in RAM instead
of disk
An important reason for using parallel computers
Parallel computer is solving slightly different, easier
problem, or providing slightly different answer
In developing parallel program a better algorithm
was discovered, older serial algorithm was not best
possible
2009 14
Amdahl’s Law
► Amdahl [1967] noted: given a program,
let f be fraction of time spent on operations that must
be performed serially.
► Then for p processors,
Speedup(p) ≤ 1/(f + (1 − f)/p)
► Thus no matter how many processors are used
Speedup ≤ 1/f
► Unfortunately, f was typically 10–20%
► Useful rule of thumb :
If maximal possible speedup is S, then S processors
run at about 50% efficiency.
2009 15
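For illustration, a small C sketch that evaluates this bound for a given serial fraction f:

#include <stdio.h>

/* Amdahl's Law: upper bound on the speedup with serial fraction f on p processors. */
static double amdahl_speedup(double f, int p) {
    return 1.0 / (f + (1.0 - f) / p);
}

int main(void) {
    double f = 0.10;                          /* 10% serial fraction */
    int procs[] = {1, 2, 4, 8, 16, 64, 1024};

    for (int i = 0; i < (int)(sizeof procs / sizeof procs[0]); i++)
        printf("p = %4d  ->  speedup <= %.2f\n", procs[i], amdahl_speedup(f, procs[i]));

    printf("asymptotic limit: 1/f = %.1f\n", 1.0 / f);
    return 0;
}

With f = 10%, even 1024 processors stay below a speedup of 10, which is exactly the Speedup ≤ 1/f limit.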
Maximal Possible Speedup
2009 16
Another View of Amdahl’s Law
2009 18
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
P2P
Clouds
► Conclusion
2009 19
Michael Flynn’s Taxonomy
Classification of computer architectures
along two dimensions: instruction streams and data streams
2009 20
Single Instruction Single Data
Stream
A single processor executes a single instruction
stream
Data stored in a single memory
Corresponds to the Von Neumann architecture
[Diagram: one computing unit processing a single data stream (D) under a single instruction stream]
2009 21
Single Instruction Multiple Data
Streams
► Vector processors
Instructions executed over vectors of data
► Parallel SIMD
Synchronous execution of the same instruction
[Diagram: one computing unit applying the same instruction stream to multiple data streams D1, D2, …, Dn]
2009 22
Cray-1 vector machine, 1970s:
64-bit CPU, 8 MB RAM, 166 MFlops; weighed 5.5 tons
2009 23
Cray X1E - 2005
1020 CPUs at 1 GHz, 4080 GB RAM, 18 TFlops,
rank 72 in the Top500
2009 24
Multiple Instructions Single Data
Streams
Few examples of this architecture class (systolic arrays):
Cryptography algorithms
3D ray-tracing engines
[Diagram: several computing units applying different instructions to the same data stream]
2009 25
Multiple Instructions Multiple
Data Streams
► Distributed systems are MIMD architectures
► Either exploiting a single shared memory space
or a distributed memory space.
[Diagram: several computing units, each with its own instruction stream and data stream, attached to memory]
2009 26
Sharing Memory or not
2009 27
Multiple Instructions Multiple
Data Streams
Shared memory: multiple CPUs work on a single shared memory
[Diagram: computing units, each with its own instruction stream, accessing one shared memory over a memory bus]
2009 28
Symmetric Multi Processing
System
► SMP machine
Multiple CPUs
A single memory control
Uniform Memory Access
One cabinet:
• 1024 PowerPC processors at 700 MHz
• 256 GB RAM (up to 2 GB per node)
• 5.7 teraflops of processing power
• IBM version of a Linux kernel on processing nodes
• Novell Linux on management nodes
2009 30
IBM Blue Gene/L SuperComputer
► Maximum size of 65,536 compute nodes
2007: up to 1,000 TFlops
2009 31
Shared Memory, Conclusion
► Advantages
Global address space provides a user-friendly programming perspective on memory
Data sharing between tasks is fast and uniform, since all CPUs access the same memory
► Disadvantages
Limited scalability: adding CPUs increases traffic on the path between CPUs and the shared memory
The programmer is responsible for the synchronization that ensures correct access to shared data
2009 32
MIMD, Distributed Memory
► Requires a communication network to connect the per-processor memories
[Diagram: computing units, each with its own memory and instruction stream, connected through a network]
2009 33
Distributed Memory, Conclusion
► Advantages:
Memory is scalable with number of processors. Increase
the number of processors and the size of memory
increases proportionately.
Each processor can rapidly access its own memory
without interference and without the overhead incurred
with trying to maintain cache coherency.
Cost effectiveness: can use commodity, off-the-shelf
processors and networking.
► Disadvantages:
The programmer is responsible for many of the details
associated with data communication between
processors.
It may be difficult to map existing data structures, based
on global memory, to this memory organization.
2009 34
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Grids
P2P
Clouds
► Conclusion
2009 35
Parallel Programming Models
► several parallel programming models in
common use:
Threads (POSIX)
Shared Memory (OpenMP)
Message Passing (MPI)
Data Parallel (Fortran)
Hybrid (MPI + POSIX threads)
2009 36
Issues When Parallelizing
► Common issue: Partitioning
Data decomposition
Functional decomposition
► Two possible outcomes
Embarrassingly Parallel
Solving many similar but independent tasks: parameter sweeps
Communicating Parallel Computing
Solving a task by simultaneous use of multiple
processors, all elements (intensively)
communicating
2009 37
Communicating Tasks
► Cost of communications
► Latency vs. Bandwidth
► Visibility of communications
► Synchronous vs. asynchronous
communications
► Scope of communications
Point-to-point
Collective
► Efficiency of communications
2009 38
Data Dependencies
► A dependence exists between program statements
when the order of statement execution affects the
results of the program.
► A data dependence results from multiple uses of
the same location(s) in storage by different tasks.
► Dependencies are one of the primary inhibitors to
parallelism.
► How to handle data dependencies:
Distributed memory: communicate the required data at synchronization points.
Shared memory: synchronize read/write operations between tasks.
2009 39
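For illustration, a short C sketch: the first loop carries a dependence between iterations (iteration i reads what iteration i-1 wrote) and cannot be parallelized as written, while the second loop has fully independent iterations.

#include <stdio.h>

#define N 1000

int main(void) {
    double a[N], b[N];
    for (int i = 0; i < N; i++) { a[i] = i; b[i] = 2.0 * i; }

    /* Loop-carried dependence: the execution order of the iterations matters. */
    for (int i = 1; i < N; i++)
        a[i] = a[i - 1] + b[i];

    /* No dependence between iterations: each i touches only its own elements,
       so the iterations could run in parallel (e.g. with #pragma omp parallel for). */
    for (int i = 0; i < N; i++)
        b[i] = 2.0 * b[i];

    printf("a[N-1] = %f, b[N-1] = %f\n", a[N - 1], b[N - 1]);
    return 0;
}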
The Memory Bottleneck
► The memory is a very common bottleneck that
programmers often don’t think about
When you look at code, you often pay more attention to
computation
a[i] = b[j] + c[k]
Accessing the three arrays takes more time than performing the addition
For the code above, memory is the bottleneck on most machines!
2009 40
Memory and parallel programs
► Principle of locality: make sure that concurrent
processes spend most of their time working on
their own data in their own memory
Place data near computation
Avoid modifying shared data
Access data in order and reuse
Avoid indirection and linked data-structures
Partition program into independent, balanced
computations
Avoid adaptive and dynamic computations
Avoid synchronization and minimize inter-process
communications
► Locality is what makes efficient parallel
programming painful
As a programmer you must constantly have a mental
picture of where all the data is with respect to where the
computation is taking place
2009 41
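For illustration of “access data in order”, a short C sketch: a C matrix is stored row by row, so the first traversal reuses cache lines while the second keeps missing the cache.

#include <stdio.h>

#define N 2048

static double m[N][N];

int main(void) {
    double sum = 0.0;

    /* Good locality: the inner loop walks consecutive memory addresses. */
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += m[i][j];

    /* Poor locality: consecutive accesses are N * sizeof(double) bytes apart,
       so almost every access misses the cache. */
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += m[i][j];

    printf("sum = %f\n", sum);
    return 0;
}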
Duality: Copying vs. Sharing
2009 42
Classification Extension
► Single Program, Multiple Data streams (SPMD)
Multiple autonomous processors simultaneously execute the same program on different data, but at independent points, rather than in the lockstep that SIMD imposes
Typical of MPI applications, e.g. weather forecasting
2009 43
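For illustration, a minimal SPMD sketch in C with MPI: every process runs the same program and uses its rank to select its own slice of the work.

#include <stdio.h>
#include <mpi.h>

#define N 1000000

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* who am I?       */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many of us? */

    /* Same program everywhere, but each rank works on its own slice. */
    int chunk = N / size;
    int begin = rank * chunk;
    int end   = (rank == size - 1) ? N : begin + chunk;

    double local = 0.0;
    for (int i = begin; i < end; i++)
        local += (double)i;

    double total = 0.0;
    MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("sum of 0..%d = %.0f\n", N - 1, total);

    MPI_Finalize();
    return 0;
}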
Architecture to Languages
SMP:
► Shared-Memory Processing
► Symmetric Multi Processing
MPP:
► Message Passing Processing
► Massively Parallel Processing
2009 44
Parallel Programming Models
► Implicit
2009 45
Parallel Programming Models
► Explicit
Programmer is responsible for the parallelization
work:
Task and Data decomposition
Mapping Tasks and Data to resources
Communication or synchronization management
► Several classes:
Control (loops and parallel regions) directives
(Fortran-S, KSR-Fortran, OpenMP)
Data distribution: HPF (historical)
2009 46
Parallel Programming Models
Strategy 1: Automatic parallelization
2009 47
Parallel Programming Models
Strategy 2: Major Recoding
2009 49
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
Using Clouds
► Conclusion
2009 50
Parallel Programming Languages
Goals
► System architecture transparency
► Network communication transparency
► Ease of use
► Fault-tolerance
► Support of heterogeneous systems
► Portability
► High level programming language
► Good scalability
► Some parallelism transparency
2009 51
OpenMP: Shared Memory
Application Programming Interface
2009 52
OpenMP: General Concepts
► An OpenMP program is executed by a single process
► This process activates threads when entering a parallel region
► Each thread executes a task composed of several instructions
► Two kinds of variables:
Private
Shared
[Diagram: a program run by one process; within a parallel region, threads execute sets of instructions with local variables and access shared variables]
2009 53
OpenMP
► The programmer has to introduce OpenMP directives within his code
► When the program is executed, a parallel region is created following the “fork and join” model
[Diagram: fork, parallel region, join]
2009 54
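For illustration, the same fork/join model in a minimal C sketch (the examples on the following slides use Fortran):

#include <stdio.h>
#include <omp.h>

int main(void) {
    printf("before the parallel region: one thread (the master)\n");

    /* fork: a team of threads is created for the parallel region */
    #pragma omp parallel
    {
        printf("inside: thread %d of %d\n",
               omp_get_thread_num(), omp_get_num_threads());
    }   /* join: the threads synchronize and only the master continues */

    printf("after the parallel region: back to one thread\n");
    return 0;
}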
OpenMP: General Concepts
► An OpenMP program is an alternation of sequential and parallel regions
► A sequential region is always executed by the master task
► A parallel region can be executed by several tasks at the same time
[Diagram: tasks 0–3 over time, alternating sequential and parallel regions]
2009 55
OpenMP: General Concepts
[Example: a program alternating sequential statements (X = a + b), subroutine calls executed concurrently (call sub(..)), and a parallel do loop]
2009 56
OpenMP: General Concepts
► A task is assigned to a processor by the operating system
[Diagram: the task manager mapping tasks onto processors]
2009 57
OpenMP Basics: Parallel region
► Inside a parallel region, variables are shared by default

program parallel
use OMP_LIB
implicit none
real :: a
logical :: p
a = 9999.
!$OMP PARALLEL
print *, "A value is: ", a    ! every thread sees the same shared a
!$OMP END PARALLEL
end program parallel
2009 58
OpenMP Basics: Parallel region
► By using the DEFAULT clause one can change the default status of a variable within a parallel region
► If a variable has a private status (PRIVATE), an instance of it (with an undefined value) will exist in the stack of each task

program parallel
use OMP_LIB
implicit none
real :: a
a = 9999.
!$OMP PARALLEL DEFAULT(PRIVATE)
a = a + 10.
print *, "A value is: ", a
!$OMP END PARALLEL
end program parallel
2009 59
OpenMP Basics:
Synchronizations
► The BARRIER directive synchronizes all threads within a parallel region

program parallel
implicit none
real, allocatable, dimension(:) :: a, b
integer :: n, i
n = 5
!$OMP PARALLEL
!$OMP BARRIER    ! all threads wait here before continuing
!$OMP END PARALLEL
end program parallel
OpenMP Is Not:
► Meant for distributed-memory parallel systems
► Necessarily implemented identically by all
vendors
► Guaranteed to make the most efficient use of
shared memory
2009 61
MPI, Message Passing Interface
► Library specification for message-passing
► Proposed as a standard
► High performance on both massively parallel
machines and on workstation clusters
► Supplies many communication variations and
optimized functions for a wide range of needs
► Helps the production of portable code, for:
a distributed-memory multiprocessor machine
a shared-memory multiprocessor machine
a cluster of workstations
2009 62
MPI, Message Passing Interface
► MPI is a specification, not an implementation
MPI has Language Independent Specifications (LIS)
for the function calls and language bindings
► Implementations for
C, C++, Fortran
Python
Java
2009 63
MPI, Message Passing Interface
► MPI is a collection of functions, handling:
Communication contexts
Point to Point communications
Blocking
Non blocking
Synchronous or Asynchronous.
Collective Communications
Data Templates (MPI Datatype)
Virtual Topologies
Parallel I/O
Dynamic management of processes (spawn,
semaphores, critical sections…)
Remote Direct Memory Access (high throughput, low
latency)
2009 64
MPI Basics
► The overwhelmingly most frequently used MPI commands are variants of:
MPI_SEND() to send data
MPI_RECV() to receive it
► There are several blocking, synchronous, and non-blocking varieties.
[Diagram: Node 1 and Node 2 each run the program and compute; Node 2 waits for data; the data of Node 1 is transferred to Node 2]
2009 65
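For illustration, a minimal C sketch using the blocking variants: rank 0 sends an array to rank 1 (run with at least two processes, e.g. mpirun -np 2 ./a.out).

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank;
    double data[4] = {1.0, 2.0, 3.0, 4.0};

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        /* Blocking send: returns once the buffer can safely be reused. */
        MPI_Send(data, 4, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Blocking receive: waits until the data has arrived. */
        MPI_Recv(data, 4, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        printf("rank 1 received %.1f %.1f %.1f %.1f\n",
               data[0], data[1], data[2], data[3]);
    }

    MPI_Finalize();
    return 0;
}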
MPI Principles Behind
► Design to optimize:
Copy, or Latency
2009 66
Message Passing Interface
► Difficulties:
The application is now viewed as a graph of communicating processes, and each process is:
Written in a standard sequential language (Fortran, C, C++)
All variables are private (no shared memory) and local to
each process
Data exchanges (communications) between processes are
explicit : call of functions or subroutines
2009 68
Main MPI problems for
Modern Parallel Computing
► Too Primitive (no Vertical Frameworks)
► Too static in design
► Too complex interface (API)
More than 200 primitives and 80 constants
► Too many specific primitives to be adaptive
Send, Bsend, Rsend, Ssend, Ibsend, etc.
► Typeless (Message Passing rather than RMI)
► Manual management of complex data structures
2009 69
Languages, Conclusion
► Programs remain too static in design
They do not offer a way to use new resources that appear at runtime
► Bound to a given distributed system (cluster)
Hard to cross system boundaries
2009 70
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
P2P
Clouds
► Conclusion
2009 71
The Grid Concept
Rationale: computer power is like electricity; it can hardly be stored if not used
Solution: one vast computational resource, with
1. Global management,
2. Mutual sharing of the resource
2009 72
Original Grid Computing
A Grid is a Distributed System
2009 74
Grid
Multiple Instructions, Multiple Data Streams: multiple heterogeneous computers, each with its own memory, connected through a network
[Diagram: computing units with local memory and shared-memory nodes linked by a network]
2009 75
Grid Computing: Fundamentals
Why use Grid Computing?
► Optimizing the use of resources
2009 77
Grid Computing: Fundamentals
How to use Grid Computing?
► Parallel computation
Communication issues
Scalability issues
The scalability of a system decreases when the amount of
communication increases
2009 78
Grid Computing: Fundamentals
How to use Grid Computing?
► Virtualizing resources and organizations for
collaboration
2009 79
Grid Computing
Different kinds of Grids
► Computing Grid:
Aggregate computing power
► Information Grid:
Knowledge sharing
Remote access to Data owned by others
► Storage Grid:
Large scale storage
Can be internal to a company
2009 80
The multiple GRIDs
► Scientific Grids :
Parallel machines, Clusters
Large equipments: Telescopes, Particle accelerators, etc.
► Enterprise Grids :
Data, Integration: Web Services
Remote connection, Security
► Internet Grids (miscalled P2P grid):
Home PC: Internet Grid (e.g. SETI@HOME)
► Intranet Desktop Grids
Desktop office PCs: Desktop Intranet Grid
2009 81
Top 500
http://www.top500.org
2009 83
Typology of Big Machines
2009 85
Top 500: Architectures
2009 86
Top 500: Architectures
2009 87
Top 500: Applications
2009 88
Top 500: Interconnect Trend
2009 89
Top 500: Operating Systems
2009 90
Grid, Conclusion
► The goal of presenting one vast computational resource is not completely reached
Still a system with boundaries and limitations
► Only a single grid instance can be seen as a computational resource
► Using different grid instances is not transparent
Need for virtualization at the middleware level
► Too static a design from the application's point of view
A Grid is not meant to adapt itself to an application
2009 91
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
P2P
Using Clouds
► Conclusion
2009 92
The Globus Toolkit
► A Grid development environment
Develop new OGSA-compliant Web Services
Develop applications using Java or C/C++ Grid APIs
Secure applications using basic security mechanisms
► A set of basic Grid services
Job submission/management
File transfer (individual, queued)
Database access
Data management (replication, metadata)
Monitoring/Indexing system information
► Tools and Examples
► The prerequisites for many Grid community tools
2009 93
The Globus Toolkit
2009 94
The Globus Toolkit
► Areas of Competence
2009 95
The Globus Toolkit
GRAM - Basic Job Submission and Control
► A uniform service interface for remote job submission and control
2009 96
How To Use the Globus Toolkit
► By itself, the Toolkit has surprisingly limited end-
user value.
There’s very little user interface material there.
You can’t just give it to end users (scientists, engineers,
marketing specialists) and tell them to do something useful!
► The Globus Toolkit is useful to system integrators.
You’ll need to have a specific application or system in mind.
You’ll need to have the right expertise.
You’ll need to set up prerequisite hardware/software.
You’ll need to have a plan…
2009 97
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
P2P
Clouds
► Conclusion
2009 98
Peer to Peer
► What is a P2P system?
A system where all participants are equals
A system which uses the resources of the enterprise, of the
Internet
► Structured
Peers are associated using an algorithm (Distributed Hash
Table) and the placement of resources is controlled
► Unstructured
Peers are “randomly” associated and the resources
randomly distributed
► P2P deals with 2 resources
Files/Data : P2P File Sharing
CPUs : Edge Computing or Global Computing
2009 99
P2P Architectures and
techniques
Boinc (*@home)
► “An open source platform for volunteer
computing”
► Internet Computing
► Master-Slave applications
Servers have tasks to be performed
Clients connect to servers to get work
No client-to-client communication
2009 100
P2P Architectures and
techniques
Boinc (*@home)
http://boinc.berkeley.edu
2009 101
P2P Architectures and
techniques
Condor
► Workload management system for compute-
intensive jobs
► Provides
Job queuing mechanism
Scheduling policy
Resource monitoring and management
► Matchmaking
A file indicates resources available
When new resources are needed:
Condor dynamically provides the corresponding resources
http://www.cs.wisc.edu/condor/
2009 102
JXTA (Juxtapose)
► Open source p2p protocol specification
► Started by Sun Microsystems in 2001
2009 103
P2P, Conclusion
► Resources’ pool size is dynamic
Can adapt to application needs
Best effort most of the time, QoS needed
► Resources are volatile
Need for fault-tolerant applications
► No real industrial vendors
2009 104
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
P2P
Clouds
► Conclusion
2009 105
Cloud Computing
Cloud computing is a label for the subset of grid computing
that includes utility computing and other approaches to the
use of shared computing resources (Wikipedia)
2009 106
Some Clouds
► Peer to Peer File sharing: Bit torrent
► Web based Applications:
Google Apps
Facebook
► New Microsoft OS with cloud computing
applications
► Web Based Operating Systems
http://icloud.com/
2009 107
Cloud Computing
► Perceived benefits
Easy to deploy
Cheap
Pay per use model
Outsourcing, reduce in-house costs
Infinite capacities (storage, computation, …)
2009 109
Hype vs Reality
Hype: All of corporate computing will move to the cloud.
Reality: Low-priority business tasks will constitute the bulk of migration out of internal data centers.

Hype: The economics are vastly superior.
Reality: Cloud computing is not yet more efficient than the best enterprise IT departments.

Hype: Mainstream enterprises are using it.
Reality: Most current users are Web 2.0-type companies (early adopters).

Hype: It will drive IT capital expenditures to zero.
Reality: It can reduce start-up costs (particularly hardware) for new companies and projects.

Hype: It will result in an IT infrastructure that a business unit can provision with a credit card.
Reality: It still requires a savvy IT administrator, developer, or both.

Source: CFO magazine
2009 110
Cloud, conclusion
► Another step towards the integration of grid computing within applications
► Possible to adapt resources to applications
► Several vendors exist, and a market exists
Amazon EC2, Flexiscale, GoGrid, Joyent, …
2009 111
Traditional Parallel Computing &
HPC Solutions
► Parallel Computing
Principles
Parallel Computer Architectures
Parallel Programming Models
Parallel Programming Languages
► Grid Computing
Multiple Infrastructures
Using Grids
P2P
Clouds
► Conclusion
2009 112
The Weaknesses and Strengths
of Distributed Computing
► In any form of computing, there is always a
tradeoff in advantages and disadvantages
2009 113
The Weaknesses and Strengths
of Distributed Computing
► Disadvantages of distributed computing:
Multiple Points of Failures: the failure of one or
more participating computers, or one or more
network links, can spell trouble.
Security Concerns: In a distributed system, there
are more opportunities for unauthorized access.
Malicious worms, viruses, etc.
Personal Identity theft – social, medical, …
Lack of interoperability between Grid Systems
2009 114
Solutions for Parallel and Distributed
Processing (a few of them…)
2009 115
Software Shared Memory
► Emulate a distributed shared memory at
software level
write(key, value), read(key) and
take(key)
2009 116
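For illustration, a rough single-process sketch in C of such an interface (the names space_write/space_get are hypothetical, and a real system would distribute the store over the network): read returns a copy and leaves the value in place, take removes it.

#include <stdio.h>
#include <string.h>
#include <pthread.h>

#define MAX_ENTRIES 64

struct entry { char key[32]; int value; int used; };
static struct entry space[MAX_ENTRIES];
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* write(key, value): insert the value, or overwrite an existing key */
static void space_write(const char *key, int value) {
    int slot = -1;
    pthread_mutex_lock(&lock);
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (space[i].used && strcmp(space[i].key, key) == 0) { slot = i; break; }
        if (!space[i].used && slot < 0) slot = i;   /* remember first free slot */
    }
    if (slot >= 0) {
        strncpy(space[slot].key, key, sizeof space[slot].key - 1);
        space[slot].key[sizeof space[slot].key - 1] = '\0';
        space[slot].value = value;
        space[slot].used  = 1;
    }
    pthread_mutex_unlock(&lock);
}

/* read(key): non-destructive lookup; take(key): removes the entry as well */
static int space_get(const char *key, int *out, int remove) {
    int found = 0;
    pthread_mutex_lock(&lock);
    for (int i = 0; i < MAX_ENTRIES; i++) {
        if (space[i].used && strcmp(space[i].key, key) == 0) {
            *out = space[i].value;
            if (remove) space[i].used = 0;
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&lock);
    return found;
}

int main(void) {
    int v;
    space_write("answer", 42);
    if (space_get("answer", &v, 0)) printf("read -> %d\n", v);   /* value stays   */
    if (space_get("answer", &v, 1)) printf("take -> %d\n", v);   /* value removed */
    if (!space_get("answer", &v, 0)) printf("read -> not found\n");
    return 0;
}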
Communicating processes
► Enable communication between remote
processes
Message passing : send(…),receive(…),…
RPC : func(…),object.foo(…)
2009 117
Implicit Programming-based
► Parallelism is predefined in the solution
2009 118
Implicit GUI-based
► Tasks are third party applications
Parallelism can be deduced from…
Parameters (parameter sweeping)
Tasks flow
2009 120
Conclusion
The need to unify Distribution and Multi-Core
Seamless transition: Sequential → Multithreaded → Distributed
2009 121
General Tradeoff:
ProActive Design Decision
2009 122
Conclusion
Abstracting Away Architecture
2009 123
Parallelism: Problem / Solution
Embarrassingly Parallel Applications:
• Independent tasks: Master/Slave package
  e.g. Monte Carlo simulation (in financial math, non-linear physics, ...)
• Dynamic generation of tasks: High-Level Patterns (Skeleton Framework)
  e.g. post-production
Highly Communicating Applications:
• Dynamic, numerical: OO SPMD
  e.g. electro-magnetism, vibro-acoustics, fluid dynamics
• Unstructured: Basic API with Active Objects and Groups
  e.g. EDA (Electronic Design Automation for ICs), N-body
2009 124
Conclusion
Various Applications with Different Needs
► A set of parallel programming frameworks in Java
Active Objects (Actors)
Master/Worker
Branch & Bound
SPMD
Skeletons
Event Programming
Matlab, Scilab
A component framework as a reference implementation of
the GCM
Legacy Code Wrapping, Grid Enabling
2009 125
Conclusion
Local Machine, Enterprise Servers, Enterprise
Grids, SaaS-Clouds
► Resource Management
Still the need for In-Enterprise sharing (vs. SaaS, Cloud)
– Meta-Scheduler / Resource Manager (RM) for:
– Dynamic scheduling and resource sharing
– Various Data Spaces, File Transfer
2009 126
Conclusion
Needs for Tools for Parallel Programming
► Parallel Programming needs Tools:
Understand
Analyze
Debug
Optimize
2009 127
Solutions for Parallel and Distributed
Processing (a few of them…)
EXPLICIT DISTRIBUTION AND PARALLELISM: Shared Memory, Communicating Processes
IMPLICIT DISTRIBUTION AND PARALLELISM: Programming-based, GUI-based
2009 128
Backed up by
2009 129