Intro Parallel Programming Paradigms

The document provides an overview of parallel programming paradigms, focusing on performance determinants such as CPU speed, data movement, and workload distribution. It discusses various architectures, including distributed and shared memory, and highlights the importance of MPI (Message Passing Interface) for process communication in parallel computing. Additionally, it emphasizes the need for application-specific solutions and the significance of benchmarking and understanding hardware properties.

Overview on Parallel Programming Paradigms
Ivan Girotto – igirotto@ictp.it
Information & Communication Technology Section (ICTS)
International Centre for Theoretical Physics (ICTP)
What Determines Performance?
• How fast is my CPU?
• How fast can I move data around?
• How well can I split work into pieces?
  – Very application specific: never assume that a good solution for one problem is as good a solution for another
  – Always run benchmarks to understand the requirements of your applications and the properties of your hardware
  – Respect Amdahl's law

Parallel Architectures
• Distributed Memory: multiple nodes, each with its own memory and CPUs, connected through a network
• Shared Memory: multiple CPUs attached to a single shared memory
[Diagram: distributed-memory nodes linked by a network vs. CPUs sharing one memory.]
Multiple Socket CPUs

Paradigm at Shared Memory /1
[Diagram: three threads, each with its own program counter (PC) and private data, all accessing a common region of shared data.]
Paradigm at Shared Memory /2
• Usually indicated as Multithreading Programming
• Commonly implemented in scientific computing using the OpenMP standard (directive based)
• Thread management overhead
• Limited scalability
• Write access to shared data can easily lead to race conditions and incorrect data (see the sketch below)
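For illustration only, a minimal OpenMP sketch in C (not part of the original slides; the file name and compiler flag are assumptions): the reduction clause gives every thread a private partial sum, avoiding the race condition that direct updates of a shared variable would cause.

/* sum.c - hypothetical example; compile with: gcc -fopenmp sum.c */
#include <stdio.h>
#include <omp.h>

int main(void) {
    const int n = 1000000;
    double sum = 0.0;

    /* each thread accumulates a private copy of sum; OpenMP combines them */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; i++)
        sum += 1.0 / (i + 1);

    printf("max threads: %d, sum = %f\n", omp_get_max_threads(), sum);
    return 0;
}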

Parallel Programming Paradigms
• MPI (Message Passing Interface)
  – A standard defined for portable message passing
  – It is available in the form of a library which includes interfaces for expressing the data exchange among processes
  – A framework is provided for spawning the independent processes (i.e., mpirun)
  – Process communication is via the network
  – It works on both shared and distributed memory architectures
  – Ideal for distributing memory among compute nodes (a minimal example is sketched below)
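As a hedged illustration (not taken from the slides), this is the skeleton nearly every MPI program starts from: each process queries its own rank and the total number of processes.

/* hello_mpi.c - illustrative sketch; build with: mpicc hello_mpi.c -o hello_mpi
 * run with: mpirun -np 4 ./hello_mpi */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    int rank, size;

    MPI_Init(&argc, &argv);                 /* start the MPI environment     */
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* unique ID of this process     */
    MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total number of processes     */

    printf("Hello from rank %d of %d\n", rank, size);

    MPI_Finalize();                         /* shut down the MPI environment */
    return 0;
}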

MPI Program Design
• Multiple, separate processes (local and/or remote) run concurrently; they are coordinated and exchange data through "messages" => a "share nothing" parallelization
• Best for coarse-grained parallelization: distribute large data sets; replicate small data
• Minimize communication, or overlap communication and computing for efficiency => Amdahl's law (a non-blocking sketch follows)
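One common way to overlap communication and computing is with non-blocking point-to-point calls; a hedged, illustrative sketch (the function and variable names are assumptions, not from the slides):

/* overlap.c - illustrative only: start a halo exchange, compute on interior
 * data while the messages are in flight, wait only when the halo is needed */
#include <mpi.h>

void exchange_and_compute(double *halo_out, double *halo_in, int n,
                          int neighbour, double *interior, int m) {
    MPI_Request reqs[2];

    MPI_Irecv(halo_in,  n, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(halo_out, n, MPI_DOUBLE, neighbour, 0, MPI_COMM_WORLD, &reqs[1]);

    for (int i = 0; i < m; i++)    /* useful work that does not need the halo */
        interior[i] *= 2.0;

    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);   /* halo data is now valid    */
}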
What is MPI?
• A standard, i.e. there is a document describing how the API (constants & subroutines) are named and should behave; multiple "levels": MPI-1 (basic), MPI-2 (advanced), MPI-3 (new)
• A library or API to hide the details of low-level communication hardware and how to use it
• Implemented by multiple vendors
• Open source and commercial versions
• Vendor specific versions for certain hardware
• Not binary compatible between implementations

Programming Parallel Paradigms
• Are the tools we use to express the parallelism for a given architecture
• They differ in how programmers can manage and define key features like:
  – parallel regions
  – concurrency
  – process communication
  – synchronism
MPI Inter-Process Communications
• MPI on multi-core CPUs, 1 MPI process per core: stresses the network and stresses the OS
• Many MPI codes (QE) are based on ALLTOALL / MPI_BCAST: messages = processes × processes
• We need to exploit the hierarchy: re-design applications to mix message passing and multi-threading
[Diagram: nodes connected by a network, one MPI process per core.]
The Hybrid Mode
[Diagram: four nodes connected by a network, illustrating the hybrid layout of message passing between nodes and multi-threading within a node.]
The Intel Xeon E5-2665 (Sandy Bridge-EP, 2.4 GHz), ~8 GBytes of memory:

mpirun -np 8 pw-gpu.x -inp input file


The Intel Xeon E5-2665 (Sandy Bridge-EP, 2.4 GHz), ~8 GBytes of memory:

mpirun -np 1 pw-gpu.x -inp input file


The Intel Xeon E5-2665 (Sandy Bridge-EP, 2.4 GHz), ~8 GBytes of memory:
export OMP_NUM_THREADS=4
export OPENBLAS_NUM_THREADS=$OMP_NUM_THREADS
mpirun -np 2 pw-gpu.x -inp input file
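These three launch lines appear to contrast a pure-MPI run (8 ranks), a serial run (1 rank) and a hybrid run (2 ranks × 4 OpenMP threads) on the same node. As an assumed illustration of what a hybrid code looks like inside (not taken from the slides):

/* hybrid.c - hypothetical MPI + OpenMP sketch; each rank spawns
 * OMP_NUM_THREADS threads; only the master thread calls MPI (FUNNELED) */
#include <stdio.h>
#include <mpi.h>
#include <omp.h>

int main(int argc, char **argv) {
    int provided, rank;

    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    printf("rank %d, thread %d of %d\n",
           rank, omp_get_thread_num(), omp_get_num_threads());

    MPI_Finalize();
    return 0;
}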

Workload Management: system level, high-throughput

Python: ensemble simulations, workflows

MPI: domain partition

OpenMP: node-level shared memory

CUDA/OpenCL/OpenACC: floating point accelerators

Type of Parallelism
• Functional (or task) parallelism: different people are performing different tasks at the same time

• Data parallelism: different people are performing the same task, but on different, equivalent and independent objects (see the sketch below)
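As a hedged illustration of data parallelism in the SPMD style used later in these slides (the function and variable names are my own, not from the original): each process works out which slice of the data it owns from its rank and the total number of processes.

/* block.c - illustrative partition of n items over `size` processes;
 * rank and size would come from MPI_Comm_rank / MPI_Comm_size          */
void my_block(int n, int rank, int size, int *first, int *last) {
    int base = n / size;              /* minimum items per process        */
    int rest = n % size;              /* leftover items spread over ranks */
    *first = rank * base + (rank < rest ? rank : rest);
    *last  = *first + base + (rank < rest ? 1 : 0);   /* exclusive end    */
}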

Process Interactions
• The effective speed-up obtained by the parallelization depends on the amount of overhead we introduce in making the algorithm parallel
• There are mainly two key sources of overhead:
  1. Time spent in inter-process interactions (communication)
  2. Time some process may spend being idle (synchronization)

Effect of Load-Unbalancing
[Diagram: processes reach the synchronization point ("all here?") at different times, so the faster ones sit idle waiting for the slowest.]

Mapping and Synchronization

Amdahl's Law
In a massively parallel context, an upper limit for the scalability of parallel applications is determined by the fraction of the overall execution time spent in non-scalable operations (Amdahl's law).

The maximum speedup tends to 1 / (1 − P), where P is the parallel fraction.

Example: 1,000,000 cores, P = 0.999999 (serial fraction = 0.000001).
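Spelling the arithmetic out with the full Amdahl expression (my own worked numbers, not on the slide):

% Amdahl's law for parallel fraction P on p processing elements
\[
  S(p) = \frac{1}{(1-P) + P/p}, \qquad
  S(10^{6}) = \frac{1}{10^{-6} + 0.999999\cdot 10^{-6}} \approx 5\times 10^{5}
\]
% i.e. even with a serial fraction of only 0.000001, one million cores
% deliver roughly half of the asymptotic limit 1/(1-P) = 10^6.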

How Do We Evaluate the Improvement?
• We want to estimate the amount of the introduced overhead => T_o = n_pes * T_P − T_S (with T_S the serial time and T_P the parallel time on n_pes processing elements)
• But to quantify the improvement we use the term Speedup:

  S_P = T_S / T_P
Speedup  

Efficiency
• Only embarrassingly parallel algorithms can obtain an ideal Speedup
• The Efficiency is a measure of the fraction of time for which a processing element is usefully employed (a short worked example follows):

  E_P = S_P / p
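A worked example with made-up numbers: if a serial run takes T_S = 120 s and the same job on p = 8 processes takes T_P = 20 s, then S_P = 120 / 20 = 6 and E_P = 6 / 8 = 0.75, i.e. each process is usefully employed for 75% of the time; the remaining 25% is lost to communication and idle waiting.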
   
Efficiency  

Amdahl's Law And Real Life
• The speedup of a parallel program is limited by the sequential fraction of the program
• This assumes perfect scaling and no overhead

Scaling - QE-CP on Fermi BGQ @ CINECA

Easy Parallel Computing
• Farming, embarrassingly parallel
  – Executing multiple instances of the same program with different inputs/initial conditions
  – Reading large binary files by splitting the workload among processes
  – Searching elements in large data-sets
  – Other parallel executions of embarrassingly parallel problems (no communication among tasks)

• Ensemble simulations (weather forecast)

• Parameter space (find the best wing shape)

Single Program on Multiple Data
• Running the same program (set of instructions) on different data
• Same model adopted by the MPI library
• A parallel tool is needed to handle the different processes working in parallel
• The MPI library provides the mpirun application to execute parallel instances of the same program
$ mpirun -np 12 my_program.x

[Diagram: the 12 processes are distributed across the two nodes mynode01 and mynode02.]
[igirotto@mynode01 ~]$ mpirun -np 12 /bin/hostname
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02
mynode01
mynode02

Parallel Operations in Practice
• Reading in parallel and computing in parallel are always allowed
• Parallel writing is extremely dangerous!
• To control the parallel flow, each process should be unique and identifiable (ID)
• The Open MPI implementation of the MPI library provides a series of environment variables defined for each MPI process
OMPI_COMM_WORLD_SIZE - the number of processes in this process' MPI_COMM_WORLD

OMPI_COMM_WORLD_RANK - the MPI rank of this process

OMPI_COMM_WORLD_LOCAL_RANK - the relative rank of this process on this node within its job. For example, if four processes in a job share a node, they will each be given a local rank ranging from 0 to 3.

OMPI_UNIVERSE_SIZE - the number of process slots allocated to this job. Note that this may be different than the number of processes in the job.

OMPI_COMM_WORLD_LOCAL_SIZE - the number of ranks from this job that are running on this node.

OMPI_COMM_WORLD_NODE_RANK - the relative rank of this process on this node looking across ALL jobs.

http://www.open-mpi.org
In  Python  
import os
myid = os.environ['OMPI_COMM_WORLD_RANK']
[...]

In  BASH  
#!/bin/bash
myid=${OMPI_COMM_WORLD_RANK}
[...]

[igirotto@mynode01 ~]$ mpirun ./myprogram.[py/sh...]

Possible Applications
• Executing multiple instances of the same program with different inputs/initial conditions
• Reading large binary files by splitting the workload among processes
• Searching elements in large data-sets
• Other parallel executions of embarrassingly parallel problems (no communication among tasks)
Conclusions
• Task Farming is a simple model to parallelize simple problems that can be divided into independent tasks
• The mpirun application helps to easily run multiple processes, and includes the environment setting
• Load balancing remains a main problem, but moving from serial to parallel processing can substantially speed up simulation time
Task Farming
• Many independent programs (tasks) running at once
  – each task can be serial or parallel
  – "independent" means they don't communicate directly
  – processes possibly driven by the mpirun framework

[igirotto@localhost]$ more my_shell_wrapper.sh


#!/bin/bash
#example for the OpenMPI implementation
./prog.x --input input_${OMPI_COMM_WORLD_RANK}.dat

[igirotto@localhost]$ mpirun -np 400 ./my_shell_wrapper.sh

Master/Slave
[Diagram: a master process coordinating four worker processes (W1–W4).]
Parallel I/O
[Diagram: processes P0–P4 sharing a single file system and a single I/O bandwidth.]
Parallel I/O
[Diagram: processes P0–P3 each with its own I/O bandwidth to a separate file system.]
Parallel I/O
[Diagram: processes P0–P3 performing I/O through MPI I/O and parallel I/O libraries (HDF5, NetCDF, etc.) on top of a parallel file system.]

What If You Want to Learn How to Program All This?!

• Introductory School on Parallel Programming and Parallel Architecture for High Performance Computing | (smr 2877)
• 3 October 2016 - 14 October 2016

What If You Want to Master All This?!


