
PDC


OpenMP is directive-based.

#pragma introduces a compiler directive.

#pragma omp parallel [clause list]


-> Creates a group (team) of threads.
-> The number of threads can be set in three ways:
1) In the directive itself, with the num_threads clause
2) With the environment variable OMP_NUM_THREADS
3) At runtime, with OpenMP functions such as omp_set_num_threads()
-> The clause list specifies "conditional parallelization", "no. of
threads", and "data handling".

The main thread that encounters this directive becomes the master of this group and
has thread ID 0 within the group.
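As a rough illustration of the parallel directive, the sketch below opens a parallel region, lets every thread print its ID, and has the master (ID 0) record the team size. The helper name team_size and its parameters are made up for this example. The #ifdef fallback is only there so the file still builds when OpenMP is not enabled.

```c
#include <assert.h>
#include <stdio.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: runs one parallel region and returns the team size.
   *seen_id0 is set to 1 if a thread with ID 0 took part in the region. */
int team_size(int requested, int *seen_id0)
{
    int size = 0;
    int id0 = 0;

    assert(requested > 0);
    #pragma omp parallel num_threads(requested) shared(size, id0)
    {
        int id = omp_get_thread_num();   /* private: declared inside the region */
        if (id == 0) {                   /* only the master thread writes */
            size = omp_get_num_threads();
            id0 = 1;
        }
        printf("hello from thread %d\n", id);
    }
    *seen_id0 = id0;
    return size;
}
```

With GCC this would typically be built with -fopenmp. The runtime may grant fewer threads than requested, so the returned size should be treated as "at most requested".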

Conditional Parallelization: if (condition) parallelizes the region (creates
threads) only when the condition is true; otherwise no extra threads are created
and the region runs serially on the encountering thread.
Degree of Concurrency/No. of threads: num_threads(integer) requests the number of
threads created by the parallel directive.
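The two clauses can be combined on one directive. The following is a minimal sketch, assuming a made-up helper threads_used whose parameters n and threshold stand in for a problem size and a cutoff below which parallelizing is not worthwhile.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallback so the sketch also builds without OpenMP enabled. */
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: returns the size of the team that ran the region. */
int threads_used(int n, int threshold)
{
    int used = 1;

    assert(n >= 0 && threshold >= 0);
    /* Extra threads are created only when the if() condition is true;
       num_threads(4) asks for a team of four in that case. */
    #pragma omp parallel if(n > threshold) num_threads(4)
    {
        #pragma omp single
        used = omp_get_num_threads();   /* one thread records the team size */
    }
    return used;
}
```

When the condition is false, the region still executes, but with a team of one thread, so threads_used(10, 1000) returns 1.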
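Data Handling:
1) private(var list): each thread gets its own uninitialized local copy (the
value is not passed in from the caller thread).
2) firstprivate(var list): private, but each copy is initialized with the value
the variable had in the caller thread.
3) shared(var list): one copy shared among all threads.
4) lastprivate(var list): each thread gets an uninitialized local copy, and after
the construct the original variable holds the value from the sequentially last
iteration (or last section), not from whichever thread finishes last. It is used
on loop and sections constructs rather than on a bare parallel directive.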

default(none) forces us to specify the data handling explicitly for every variable
that we use inside the region.
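The data-handling clauses can be seen together in one loop. This is only a sketch: the function data_clause_demo and its variable names are invented for illustration, and a combined parallel for is used because lastprivate applies to loops, not to a plain parallel region.

```c
#include <assert.h>

/* Hypothetical demo of the data-sharing clauses on a parallel loop.
   Returns the accumulated total and stores the lastprivate value in *last_out. */
int data_clause_demo(int n, int *last_out)
{
    int offset = 10;    /* firstprivate: each thread's copy starts at 10 */
    int scratch = -1;   /* private: each thread's copy starts uninitialized */
    int last = -1;      /* lastprivate: ends up with the value from iteration n-1 */
    int total = 0;      /* shared: a single instance, updated atomically */

    assert(n > 0);
    /* default(none) means every variable used in the loop must be listed. */
    #pragma omp parallel for default(none) shared(total, n) \
            private(scratch) firstprivate(offset) lastprivate(last)
    for (int i = 0; i < n; i++) {
        scratch = i + offset;   /* written before it is read */
        last = i;
        #pragma omp atomic
        total += scratch;
    }
    *last_out = last;
    return total;
}
```

For n = 8 the total is (0 + 1 + ... + 7) + 8 * 10 = 108 and the lastprivate value is 7, regardless of how many threads run the loop.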

Scheduling
The iterations of a loop are divided into chunks, and the type of scheduling
defines how these chunks are assigned to threads.
A chunk is a contiguous range of iterations n-m (for example 0-4).
With static scheduling and no chunk size given, the chunks are approximately equal
in size and each thread gets at most one chunk.
--> Static assigns the chunks in a round-robin manner, in order of thread ID. (Each
thread gets the same number of iterations, except that some get fewer when the
iterations do not divide evenly.)
--> Dynamic assigns the chunks in a first-come first-served manner: as threads
finish their work, they ask for more until no chunks remain.
--> Guided assigns in a first-come first-served manner, but the chunk size starts
large and then shrinks down to the provided size.
--> Auto leaves the decision to the compiler or runtime.
--> Runtime means the scheduling type and chunk size are chosen at runtime (for
example through the OMP_SCHEDULE environment variable).
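To make the round-robin behaviour of static scheduling concrete, the sketch below records which thread executes each iteration. The function record_static_owners and its parameters are hypothetical names chosen for this example. The #ifdef fallback only exists so the code builds without OpenMP.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks so the sketch also builds without OpenMP enabled. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: runs n iterations with schedule(static, chunk),
   stores the ID of the executing thread in owner[i], returns the team size. */
int record_static_owners(int *owner, int n, int chunk)
{
    int nthreads = 1;

    assert(n >= 0 && chunk > 0);
    #pragma omp parallel num_threads(4)
    {
        #pragma omp single
        nthreads = omp_get_num_threads();

        /* Chunk k (iterations k*chunk .. k*chunk+chunk-1) goes to
           thread k % nthreads. */
        #pragma omp for schedule(static, chunk)
        for (int i = 0; i < n; i++)
            owner[i] = omp_get_thread_num();
    }
    return nthreads;
}
```

With 16 iterations, a chunk size of 2, and a team of 4, the owners come out as 0 0 1 1 2 2 3 3 0 0 1 1 2 2 3 3. Replacing the clause with schedule(dynamic, chunk) or schedule(guided, chunk) makes the assignment depend on which thread becomes free first, so the pattern is no longer predictable.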

For iterations that take roughly equal time, Static is usually the best because it
has the least overhead.
For iterations that vary in time, Dynamic usually works best.

Dynamic Overhead: After each chunk (each single iteration with the default chunk
size of 1), a thread must stop and receive a new value of the loop variable to use
for its next iteration.

If we increase the chunk size, dynamic behaves more like static and threads have to
request new work less often.
This is where guided works better, because its chunk size is large initially.
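The trade-off can be tried on a loop whose iterations take different amounts of time. In this sketch the workload function work, and the wrappers run_dynamic and run_guided, are all made up for illustration: iteration i simply does about i steps of arithmetic.

```c
#include <assert.h>

/* Hypothetical uneven workload: iteration i costs roughly i steps
   and returns i*(i+1)/2. */
static long work(int i)
{
    long acc = 0;
    for (int k = 0; k <= i; k++)
        acc += k;
    return acc;
}

/* Dynamic: a thread fetches `chunk` iterations each time it runs out of work. */
long run_dynamic(int n, int chunk)
{
    long total = 0;

    assert(n >= 0 && chunk > 0);
    #pragma omp parallel for schedule(dynamic, chunk)
    for (int i = 0; i < n; i++) {
        long r = work(i);
        #pragma omp atomic
        total += r;
    }
    return total;
}

/* Guided: chunks start large and shrink, but never below `min_chunk`. */
long run_guided(int n, int min_chunk)
{
    long total = 0;

    assert(n >= 0 && min_chunk > 0);
    #pragma omp parallel for schedule(guided, min_chunk)
    for (int i = 0; i < n; i++) {
        long r = work(i);
        #pragma omp atomic
        total += r;
    }
    return total;
}
```

All variants compute the same total; only the number of scheduling requests differs. run_dynamic(n, 1) asks for new work after every iteration, while a larger chunk or the guided schedule asks less often, which is the overhead the notes above describe.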
