Shared Memory Programming - OpenMP
References
• Michael J. Quinn. Parallel Computing: Theory and Practice. McGraw-Hill.
• Albert Y. Zomaya. Parallel and Distributed Computing Handbook. McGraw-Hill.
• Ian Foster. Designing and Building Parallel Programs. Addison-Wesley.
• Ananth Grama, Anshul Gupta, George Karypis, Vipin Kumar. Introduction to Parallel Computing, Second Edition. Addison-Wesley.
• Joseph JaJa. An Introduction to Parallel Algorithms. Addison-Wesley.
• Nguyễn Đức Nghĩa. Tính toán song song (Parallel Computing). Hà Nội, 2003.

4.1 Shared Memory

Shared Memory Architecture
• All processors have access to one global memory
• All processors share the same address space
• The system runs a single copy of the OS
• Processors communicate by reading/writing to the
global memory
• Examples: multiprocessor PCs (Intel P4), Sun Fire
15K, NEC SX-7, Fujitsu PrimePower, IBM p690,
SGI Origin 3000.

Shared Memory Systems
[Figure: a shared-memory system, programmed with OpenMP or Pthreads]
Shared Memory Systems (2)
• All processors may access the whole main memory
• Uniform Memory Access (UMA): memory access time is uniform
• Non-Uniform Memory Access (NUMA): memory access time is non-uniform
Clusters of SMPs
[Figure: a cluster of SMP nodes, programmed with MPI or hybrid MPI + OpenMP]
4.2 Multithreaded Programming

Shared Memory Programming
• Communication is implicitly specified
• Focus on constructs for expressing concurrency and synchronization
• Minimize data-sharing overheads
Process and thread
• A process is an instance of a
computer program
• Information present in a process includes:
• Text: the machine code
• Data: global variables
• Stack: local variables
• Program counter (PC): a pointer to the instruction to be executed
Multi-threading
• The process contains several
concurrent execution flows
(threads)
• Each thread has its own
program counter (PC)
• Each thread has its own
private stack (variables local
to the thread)
• The instructions executed by
a thread can access:
• the process global
memory (data)
• the thread local stack

OpenMP vs. POSIX Threads
• POSIX threads is the other widely used shared-memory programming API.
• Fairly widely available, usually quite simple to implement on top of
OS kernel threads.
• Lower level of abstraction than OpenMP
• library routines only, no directives
• more flexible, but harder to implement and maintain
• OpenMP can be implemented on top of POSIX threads
• Not much difference in availability
• not that many OpenMP C++ implementations
• no standard Fortran interface for POSIX threads

4.3 OpenMP

What is OpenMP?
• What does OpenMP stand for?
• Open specifications for Multi Processing, developed collaboratively by
interested parties from the hardware and software industry,
government, and academia.
• OpenMP is an Application Program Interface (API)
that may be used to explicitly direct multi-threaded,
shared-memory parallelism.
• API components: Compiler Directives, Runtime Library
Routines, and Environment Variables
• OpenMP is a directive-based method to invoke parallel
computations on shared-memory multiprocessors

What is OpenMP?
• OpenMP API is specified for C/C++ and Fortran.
• OpenMP is not intrusive to the original serial code:
instructions appear as comment statements in Fortran
and as pragmas in C/C++.
• OpenMP website: http://www.openmp.org
• Materials in this lecture are taken from various OpenMP
tutorials in the website and other places.

OpenMP Usage
• Applications
• Applications with intense computational needs
• From video games to big science & engineering
• Programmer Accessibility
• From very early programmers in school to scientists to
parallel computing experts
• Available to millions of programmers
• In every major (Fortran & C/C++) compiler

Programming Model
• Fork-Join Parallelism:
• Master thread spawns a team of threads as needed.
• Parallelism is added incrementally: i.e. the sequential
program evolves into a parallel program.

OpenMP Syntax
• Most of the constructs in OpenMP are compiler
directives or pragmas.
• For C and C++, the pragmas take the form:
• #pragma omp construct [clause [clause]…]
• For Fortran, the directives take one of the forms:
• C$OMP construct [clause [clause]…]
• !$OMP construct [clause [clause]…]
• *$OMP construct [clause [clause]…]
• Since the constructs are directives, an OpenMP
program can be compiled by compilers that don’t
support OpenMP.
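Because a directive is just a pragma, a compiler without OpenMP support simply ignores it and produces a correct serial program. A minimal sketch (not from the slides) of guarding runtime-library calls with the standard _OPENMP macro:

#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>                /* runtime library exists only when OpenMP is enabled */
#endif

int main(void)
{
    #pragma omp parallel        /* ignored by compilers without OpenMP support */
    {
        int id = 0;             /* serial fallback: a single "thread" 0 */
    #ifdef _OPENMP
        id = omp_get_thread_num();
    #endif
        printf("hello from thread %d\n", id);
    }
    return 0;
}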

OpenMP: How is OpenMP Typically Used?
• OpenMP is usually used to parallelize loops:
• Find your most time consuming loops.
• Split them up between threads.
Split up this loop between multiple threads:

Sequential program:
int main() {
    double Res[1000];
    for (int i = 0; i < 1000; i++) { do_huge_comp(Res[i]); }
}

Parallel program:
int main() {
    double Res[1000];
    #pragma omp parallel for
    for (int i = 0; i < 1000; i++) { do_huge_comp(Res[i]); }
}
How to compile and run OpenMP
programs?
• GCC 4.2 and later support OpenMP
• gcc -fopenmp a.c
• Try example1.c
• To run: ./a.out
• To change the number of threads:
• setenv OMP_NUM_THREADS 4 (tcsh) or
export OMP_NUM_THREADS=4 (bash)
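The example1.c mentioned above is not reproduced in these notes; a minimal stand-in (hypothetical, assuming the usual hello-world structure) that can be built and run exactly as described:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    #pragma omp parallel                      /* fork a team of threads */
    {
        int id = omp_get_thread_num();        /* private to each thread */
        printf("Hello from thread %d of %d\n",
               id, omp_get_num_threads());
    }
    return 0;                                 /* implicit join at the end of the region */
}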

OpenMP:
How do Threads Interact?
• OpenMP is a shared memory model.
• Threads communicate by sharing variables.
• Unintended sharing of data can lead to race conditions:
• race condition: when the program’s outcome changes as the
threads are scheduled differently.
• To control race conditions:
• Use synchronization to protect data conflicts.
• Synchronization is expensive so:
• Change how data is stored to minimize the need for
synchronization.
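For instance (a minimal sketch, not taken from the slides), incrementing a shared counter from several threads is a race; serializing the update removes it, at the cost of synchronization:

#include <stdio.h>

int main(void)
{
    int count = 0;

    #pragma omp parallel
    count++;                      /* RACE: concurrent updates to the shared count can be lost */

    printf("racy count = %d\n", count);

    count = 0;
    #pragma omp parallel
    {
        #pragma omp critical      /* serialize the update: one thread at a time */
        count++;
    }
    printf("safe count = %d\n", count);
    return 0;
}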

OpenMP Constructs
• OpenMP’s constructs fall into 5 categories:
• Parallel Regions
• Worksharing
• Data Environment
• Synchronization
• Runtime functions/environment variables
• OpenMP is basically the same between Fortran and
C/C++

OpenMP: Parallel Regions
• You create threads in OpenMP with the “omp parallel” pragma.
• For example, To create a 4-thread Parallel region:

double A[1000];
omp_set_num_threads(4);
#pragma omp parallel
{
    int ID = omp_get_thread_num();
    pooh(ID, A);
}

• Each thread redundantly executes the code within the structured block
• Each thread calls pooh(ID,A) for ID = 0 to 3

Parallel Region

OpenMP: Work-Sharing Constructs
• The “for” Work-Sharing construct splits up loop iterations
among the threads in a team

#pragma omp parallel
#pragma omp for
for (i = 0; i < N; i++) {
    NEAT_STUFF(i);
}

• By default, there is a barrier at the end of the "omp for".
  Use the "nowait" clause to turn off the barrier (see the sketch below).
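As a sketch (the array and function names here are placeholders, not from the slides), nowait lets threads move on to the next loop without waiting for the rest of the team:

#pragma omp parallel
{
    #pragma omp for nowait        /* no barrier after this loop */
    for (int i = 0; i < N; i++)
        a[i] = work_on_a(i);

    #pragma omp for               /* implicit barrier at the end of this one */
    for (int i = 0; i < N; i++)
        b[i] = work_on_b(i);
}

This is only safe because the second loop does not read results produced by other threads in the first loop.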
Work Sharing Constructs
A motivating example

Sequential code:
for (i = 0; i < N; i++) { a[i] = a[i] + b[i]; }

OpenMP parallel region (work divided by hand):
#pragma omp parallel
{
    int id, i, Nthrds, istart, iend;
    id = omp_get_thread_num();
    Nthrds = omp_get_num_threads();
    istart = id * N / Nthrds;
    iend = (id + 1) * N / Nthrds;
    for (i = istart; i < iend; i++) { a[i] = a[i] + b[i]; }
}

OpenMP parallel region with a work-sharing for construct:
#pragma omp parallel
#pragma omp for schedule(static)
for (i = 0; i < N; i++) { a[i] = a[i] + b[i]; }
OpenMP For Construct:
The Schedule Clause
• The schedule clause affects how loop iterations are mapped onto
threads
schedule(static [,chunk])
• Deal-out blocks of iterations of size “chunk” to each thread.
schedule(dynamic[,chunk])
• Each thread grabs “chunk” iterations off a queue until all iterations
have been handled.
schedule(guided[,chunk])
• Threads dynamically grab blocks of iterations. The size of the block
starts large and shrinks down to size “chunk” as the calculation
proceeds.
schedule(runtime)
• Schedule and chunk size taken from the OMP_SCHEDULE
environment variable.
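In C the clause is attached to the loop directive in the same way; a brief sketch (N and process() are placeholders, not from the slides):

#pragma omp parallel for schedule(dynamic, 4)   /* chunks of 4 iterations handed out on demand */
for (int i = 0; i < N; i++)
    process(i);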
static Scheduling
• Iterations are divided into chunks
of size chunk, and the chunks are
assigned to the threads in the
team in a round-robin fashion in
the order of the thread number
• It is the default schedule and the
default chunk is approximately
Niter /Nthreads

• For example:
!$omp parallel do &
!$omp schedule(static,3)

dynamic Scheduling
• Iterations are distributed to
threads in the team in chunks as
the threads request them. Each
thread executes a chunk of
iterations, then requests another
chunk, until no chunks remain to
be distributed.
• The default chunk is 1

• For example:
!$omp parallel do &
!$omp schedule(dynamic,1)

guided Scheduling
• Iterations are assigned to threads
in the team in chunks as the
executing threads request them.
Each thread executes a chunk of
iterations, then requests another
chunk, until no chunks remain to
be assigned. The chunk size starts large and
decreases toward the value of chunk
• The default value of chunk is 1

• For example:
!$omp parallel do &
!$omp schedule(guided,1)

runtime and auto Scheduling
• runtime: the iteration scheduling scheme is set at run time
through the environment variable OMP_SCHEDULE
• For example:
!$omp parallel do &
!$omp schedule(runtime)
• the scheduling scheme can be changed without recompiling the
program by changing the environment variable OMP_SCHEDULE,
for example: setenv OMP_SCHEDULE "dynamic,50"
• Mainly useful for experimentation during parallelization
• auto: the scheduling decision is delegated to the
compiler and/or runtime system

Scheduling experiment
[Figure: thread assignment (threads 0-3) across iterations 0-1000]
Different scheduling for a 1000-iteration loop with 4 threads: guided
(top), dynamic (middle), static (bottom)
Sections: Work-Sharing Constructs
• The Sections work-sharing construct gives a
different structured block to each thread.
#pragma omp parallel
#pragma omp sections
{
    X_calculation();
#pragma omp section
    y_calculation();
#pragma omp section
    z_calculation();
}

• By default, there is a barrier at the end of the "omp sections".
  Use the "nowait" clause to turn off the barrier.
Data-sharing attributes
• In a parallel construct the data-sharing attributes are
implicitly determined by the default clause, if present
• if no default clause is present, they are shared
• Certain variables have predetermined data-sharing attributes
• Variables with automatic storage duration that are declared in a
scope inside a construct are private
• Objects with dynamic storage duration are shared
• The loop iteration variable(s) in the associated for-loop(s) of a
for construct is (are) private
• A loop iteration variable for a sequential loop in a parallel
construct is private in the innermost such construct that
encloses the loop (Fortran only)
• Variables with static storage duration that are declared in a
scope inside the construct are shared
• ...

Data-sharing attributes clauses
• Explicitly determined data-sharing attributes are those that are
referenced in a given construct and are listed in a data-sharing
attribute clause
• shared(list): there is only one instance of the objects in the
list, accessible by all threads in the team
• private(list): each thread has a copy of the variables in
the list
• firstprivate(list): same as private but all variables in
the list are initialized with the value that the original object had
before entering the parallel construct
• lastprivate(list): same as private but the thread that
executes the sequentially last iteration or section updates the
value of the objects in the list
• The default clause sets the implicit default
• default(none|shared) in C/C++
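A brief sketch (variable names are illustrative, not from the slides) of combining these clauses; with default(none) every variable referenced in the region must be listed explicitly:

int n = 100, i;
int a[100];
#pragma omp parallel for default(none) shared(n, a) private(i)
for (i = 0; i < n; i++)
    a[i] = i * i;      /* each thread writes distinct elements of the shared array */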

Parallel programming using OpenMP
Data sharing

firstprivate
Particular case of private.
Each private copy is initialized with the value of the variable
of the master thread.

Example
void f() {
    int x = 17;
    #pragma omp parallel for firstprivate(x)
    for (long i = 0; i < maxval; ++i) {
        x += i;                        // x is initially 17
    }
    std::cout << x << std::endl;       // x == 17
}

lastprivate
Pass the value of the private variable of the last sequential
iteration to the global variable.

Example
void f() {
    int x = 17;
    #pragma omp parallel for firstprivate(x) lastprivate(x)
    for (long i = 0; i < maxval; ++i) {
        x += i;                        // x is initially 17
    }
    std::cout << x << std::endl;       // x has its value from iteration i == maxval - 1
}

The threadprivate directive
C/C++
#pragma omp threadprivate(list)

• Is a declarative directive
• Is used to create private copies of
• file-scope, namespace-scope or static variables in C/C++
• Follows the variable declaration in the same program unit
• Initial data are undefined, unless the copyin clause is used
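A minimal sketch (the variable name counter is illustrative, not from the slides) of declaring a threadprivate variable and initializing it with copyin:

#include <omp.h>

static int counter = 0;                    /* file-scope / static variable */
#pragma omp threadprivate(counter)

void run(void)
{
    counter = 10;                          /* value set in the master thread */
    #pragma omp parallel copyin(counter)   /* each thread's copy starts at 10 */
    {
        counter += omp_get_thread_num();   /* each thread updates its own copy */
    }
}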

Data Environment:
Default Storage Attributes
• Shared Memory programming model:
• Most variables are shared by default
• Global variables are SHARED among threads
• Fortran: COMMON blocks, SAVE variables, MODULE
variables
• C: File scope variables, static
• But not everything is shared...
• Stack variables in sub-programs called from parallel
regions are PRIVATE
• Automatic variables within a statement block are
PRIVATE.
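An illustrative sketch (not from the slides) of these defaults:

#include <omp.h>

double A[100];                         /* file-scope variable: shared */

void work(int id) {
    double tmp = 0.0;                  /* stack variable in a called sub-program: private */
    A[id] = tmp + id;
}

void caller(void) {
    #pragma omp parallel
    {
        int id = omp_get_thread_num(); /* automatic variable within the block: private */
        work(id);
    }
}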

Private Clause
• private(var) creates a local copy of var for each
thread.
– The value is uninitialized
– Private copy is not storage associated with the
original
void wrong() {
    int IS = 0;
    #pragma omp parallel for private(IS)
    for (int J = 1; J < 1000; J++)
        IS = IS + J;
    printf("%i", IS);   /* IS was uninitialized inside the loop and is undefined here */
}
OpenMP: Reduction
• Another clause that affects the way variables are shared:
• reduction (op : list)
• The variables in “list” must be shared in the enclosing
parallel region.
• Inside a parallel or a worksharing construct:
• A local copy of each list variable is made and initialized
according to the "op" (e.g. 0 for "+")
• Each thread updates its local copy using "op"
• Local copies are reduced into a single global copy at the
end of the construct.

OpenMP:
A Reduction Example
#include <omp.h>
#define NUM_THREADS 2
double func(int i);
int main()
{
    int i;
    double ZZ, sum = 0.0;
    omp_set_num_threads(NUM_THREADS);
    #pragma omp parallel for reduction(+:sum) private(ZZ)
    for (i = 0; i < 1000; i++) {
        ZZ = func(i);
        sum = sum + ZZ;
    }
    return 0;
}

OpenMP: Synchronization
• OpenMP has the following constructs to support
synchronization:
• barrier
• critical section
• atomic
• flush
• ordered
• single
• master

Synchronization

atomic construct
• The atomic construct applies only to statements that update
the value of a variable
• Ensures that no other thread updates the variable between
reading and writing
• The allowed instructions differ between Fortran and C/C++
• Refer to the OpenMP specifications
• It is a special lightweight form of a critical section
• Only read/write are serialized, and only if two or more threads
access the same memory address

C/C++
#pragma omp atomic [clause]
<statement>

atomic Examples
C/C++
#pragma omp atomic update
x += n*mass;    // default: update

#pragma omp atomic read
v = x;          // read atomically

#pragma omp atomic write
x = n*mass;     // write atomically

#pragma omp atomic capture
v = x++;        // capture x in v and update x atomically
Critical section
• Only one thread at a time can enter a critical section
• This loop cannot be parallelized safely if sum is shared:
for (i = 0; i < N; i++) {
    ……
    sum += A[i];
    ……
}
• Fix:
for (i = 0; i < N; i++) {
    ……
    #pragma omp critical
    {
        sum += A[i];
    }
    ……
}
Master directive
• The master construct denotes a structured block that is only
executed by the master thread. The other threads just skip it
(no implied barriers or flushes).
#pragma omp parallel private (tmp)
{
do_many_things();
#pragma omp master
{ exchange_boundaries(); }
#pragma omp barrier
do_many_other_things();
}

Single directive
• The single construct denotes a block of code that is executed by
only one thread.
• A barrier and a flush are implied at the end of the single block.

#pragma omp parallel private (tmp)
{
do_many_things();
#pragma omp single
{ exchange_boundaries(); }
do_many_other_things();
}

Ordering
• An ordered region is executed in sequential order.

#pragma omp parallel
{
    #pragma omp for ordered reduction(+:res)
    for (int i = 0; i < max; ++i) {
        double tmp = f(i);
        #pragma omp ordered
        res += g(tmp);
    }
}

Simple locks
• Locks in the OpenMP library.
• Also nested locks.
omp_lock_t l;
omp_init_lock(&l);
#pragma omp parallel
{
    int id = omp_get_thread_num();
    double x = f(id);
    omp_set_lock(&l);                          // only one thread prints at a time
    cout << "ID=" << id << " x= " << x << endl;
    omp_unset_lock(&l);
}
omp_destroy_lock(&l);

OpenMP: Library routines
• Lock routines
• omp_init_lock(), omp_set_lock(), omp_unset_lock(), omp_test_lock(), omp_destroy_lock()
• Runtime environment routines:
• Modify/Check the number of threads
• omp_set_num_threads(), omp_get_num_threads(),
omp_get_thread_num(), omp_get_max_threads()
• Turn on/off nesting and dynamic mode
• omp_set_nested(), omp_set_dynamic(), omp_get_nested(),
omp_get_dynamic()
• Are we in a parallel region?
• omp_in_parallel()
• How many processors in the system?
• omp_get_num_procs()
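A brief sketch (not from the slides) showing a few of these calls together:

#include <stdio.h>
#include <omp.h>

int main(void)
{
    printf("procs=%d, max threads=%d, in parallel? %d\n",
           omp_get_num_procs(), omp_get_max_threads(), omp_in_parallel());

    omp_set_num_threads(4);
    #pragma omp parallel
    {
        if (omp_get_thread_num() == 0)        /* let one thread report the team size */
            printf("team of %d threads, in parallel? %d\n",
                   omp_get_num_threads(), omp_in_parallel());
    }
    return 0;
}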

OpenMP: Environment Variables
• OMP_NUM_THREADS
• bash:
• export OMP_NUM_THREADS=2
• csh:
• setenv OMP_NUM_THREADS 4

CODE. Matrix-vector multiply using a parallel loop and
critical directive

/*** Spawn a parallel region explicitly scoping all variables ***/
#pragma omp parallel shared(a,b,c,nthreads,chunk) private(tid,i,j,k)
{
    tid = omp_get_thread_num();
    #pragma omp for schedule(static, chunk)
    for (i = 0; i < NRA; i++)
    {
        printf("thread=%d did row=%d\n", tid, i);
        for (j = 0; j < NCB; j++)
            for (k = 0; k < NCA; k++)
                c[i][j] += a[i][k] * b[k][j];
    }
}

Ex.Travelling Salesman Problem
• The map is represented as a graph with nodes
representing cities and edges representing the
distances between cities.
• A special node (city) is the starting point of the
tour.
• The travelling salesman problem is to find the cycle
through all nodes, returning to the starting point, with the
smallest total distance.
• This is a well known NP-complete problem.

CODE. Sequential TSP
init_q(); init_best();
while ((p = dequeue()) != NULL) {
    for each expansion of p by one city {
        q = addcity(p);
        if (complete(q)) { update_best(q); }
        else enqueue(q);
    }
}

CODE. OpenMP TSP
do_work() {
    while ((p = dequeue()) != NULL) {
        for each expansion of p by one city {
            q = addcity(p);
            if (complete(q)) { update_best(q); }
            else enqueue(q);
        }
    }
}
main() {
    init_q(); init_best();
    #pragma omp parallel for
    for (i = 0; i < NPROCS; i++)
        do_work();
}
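Note that in the parallel version the queue and the best tour are shared, so dequeue/enqueue and update_best need protection. A minimal sketch of guarding the best-tour update with a critical section (best_length and candidate_length are illustrative names, not from the slides):

/* shared best-tour length */
double best_length = 1e30;

void update_best(double candidate_length)
{
    #pragma omp critical(best)
    {
        if (candidate_length < best_length)   /* re-check inside the critical section */
            best_length = candidate_length;
    }
}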

Summary
• OpenMP provides a compact, yet powerful
programming model for shared memory programming
• It is very easy to use OpenMP to create parallel programs.
• OpenMP preserves the sequential version of the
program
• Developing an OpenMP program:
• Start from a sequential program
• Identify the code segment that takes most of the time.
• Determine whether the important loops can be parallelized
• The loops may have critical sections, reduction variables, etc
• Determine the shared and private variables.
• Add directives

Conclusion
• OpenMP is successful in small-to-medium SMP
systems
• Multiple cores/CPUs dominate the future computer
architectures; OpenMP would be the major parallel
programming language in these architectures.
• Simple: everybody can learn it in 2 weeks
• Not so simple: don't stop learning! Keep learning it
for better performance

OpenMP discussion
• Ease of use
• OpenMP takes care of the thread maintenance.
• Big improvement over pthread.
• Synchronization
• Much higher-level constructs (critical section, barrier).
• Big improvement over pthread.

• OpenMP is easy to use!!

OpenMP discussion
• Expressiveness
• Data parallelism:
• MM and SOR
• Fits nicely in the paradigm
• Task parallelism:
• TSP
• Somewhat awkward. Use OpenMP constructs to create threads.
OpenMP is not much different from pthread.

OpenMP discussion
• Exposing architecture features (performance):
• Not much, similar to the pthread approach
• Assumption: dividing job into threads = improved performance.
• How valid is this assumption in reality?
• Overheads, contentions, synchronizations, etc
• This is one weak point for OpenMP: the performance of
an OpenMP program is somewhat hard to understand.

OpenMP final thoughts
• Main issues with OpenMP: performance
• Is there any obvious way to solve this?
• Exposing more architecture features?
• Is the performance issue more related to the fundamental
way that we write parallel program?
• OpenMP programs begin with sequential programs.
• May need to find a new way to write efficient parallel programs
in order to really solve the problem.

Thank you for your attention!
