Lecture Week - 2 General Parallelism Terms
Lecture 3:
General Parallelism Terms
Farhad M. Riaz
Farhad.Muhammad@numl.edu.pk
Shared vs Distributed Memory
• Shared Memory
• Distributed Memory
Some General Parallel Terminology
Supercomputing / High Performance Computing (HPC)
– Using the world's fastest and largest computers to solve large
problems
Node
– A standalone "computer in a box". Usually comprised of multiple
CPUs/processors/cores, memory, network interfaces, etc. Nodes
are networked together to comprise a supercomputer.
CPU / Socket / Processor / Core
– In the past, a CPU (Central Processing Unit) was a singular
execution component
– Then, multiple CPUs were incorporated into a node
– Then, individual CPUs were subdivided into multiple "cores", each
being a unique execution unit
– CPUs with multiple cores are sometimes called "sockets"
– The result is a node with multiple CPUs, each containing multiple
cores
Some General Parallel Terminology
Task
– A logically discrete section of computational work. A task is typically a
program or program-like set of instructions that is executed by a
processor. A parallel program consists of multiple tasks running on
multiple processors.
Pipelining
– Breaking a task into steps performed by different processor units, with
inputs streaming through, much like an assembly line; a type of
parallel computing.
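The slides give no code, but a rough two-stage pipeline can be sketched in C with POSIX threads and semaphores (the API choice and all names here are mine, not from the lecture): stage 1 streams items through a one-slot buffer into stage 2, so both stages work concurrently on different items, like stations on an assembly line.

#include <stdio.h>
#include <pthread.h>
#include <semaphore.h>

#define N 8

static int handoff;        /* one-slot buffer connecting the two stages */
static sem_t slot_empty;   /* posted when stage 2 has consumed the slot */
static sem_t slot_full;    /* posted when stage 1 has filled the slot   */

/* Stage 1: produce/transform each input item and pass it downstream. */
static void *stage1(void *arg) {
    (void)arg;
    for (int i = 0; i < N; i++) {
        int item = i * i;             /* stand-in for real stage-1 work */
        sem_wait(&slot_empty);        /* wait until the slot is free    */
        handoff = item;
        sem_post(&slot_full);         /* hand the item to stage 2       */
    }
    return NULL;
}

/* Stage 2: consume items as they stream out of stage 1. */
static void *stage2(void *arg) {
    (void)arg;
    long sum = 0;
    for (int i = 0; i < N; i++) {
        sem_wait(&slot_full);         /* wait for the next item         */
        sum += handoff;               /* stand-in for real stage-2 work */
        sem_post(&slot_empty);        /* free the slot for stage 1      */
    }
    printf("pipeline result: %ld\n", sum);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    sem_init(&slot_empty, 0, 1);      /* the slot starts empty */
    sem_init(&slot_full, 0, 0);
    pthread_create(&t1, NULL, stage1, NULL);
    pthread_create(&t2, NULL, stage2, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return 0;
}

(Compile with gcc -pthread; this is only an illustration of the streaming idea, not a tuned pipeline.)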
Shared Memory
– From a strictly hardware point of view, describes a computer
architecture where all processors have direct (usually bus based)
access to common physical memory. In a programming sense, it
describes a model where parallel tasks all have the same "picture" of
memory and can directly address and access the same logical
memory locations regardless of where the physical memory actually
exists.
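A minimal sketch of the shared-memory programming model, assuming OpenMP in C (the API choice is mine, not from the slides): every thread reads and writes the same data array in a single address space, and a reduction combines the shared results.

#include <stdio.h>
#include <omp.h>

#define N 1000

int main(void) {
    double data[N];                 /* one array, visible to every thread */
    double sum = 0.0;

    /* All threads share 'data': each thread fills a disjoint part of it,
       then the reduction combines the partial sums into 'sum'. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++) {
        data[i] = i * 0.5;          /* same logical memory for all threads */
        sum += data[i];
    }

    printf("sum = %f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}

(Compile with gcc -fopenmp.)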
Some General Parallel Terminology
Symmetric Multi-Processor (SMP)
– Shared memory hardware architecture where multiple processors
share a single address space and have equal access to all resources.
Distributed Memory
– In hardware, refers to network based memory access for physical
memory that is not common. As a programming model, tasks can only
logically "see" local machine memory and must use communications
to access memory on other machines where other tasks are
executing.
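A minimal sketch of the distributed-memory model, assuming MPI in C (the library choice is mine, not from the slides): each rank owns a private copy of x in its own address space and cannot read another rank's copy without explicit communication.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* 'x' lives in this process's private memory: every rank has its own
       copy, and no rank can see another rank's copy directly. */
    int x = rank * 100;

    printf("rank %d sees only its local x = %d\n", rank, x);

    MPI_Finalize();
    return 0;
}

(Compile with mpicc and launch with mpirun -np 4, for example.)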
Communications
– Parallel tasks typically need to exchange data. This can be
accomplished in several ways, such as through a shared memory bus or
over a network; however, the actual event of data exchange is
commonly referred to as communications regardless of the method
employed.
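Continuing the MPI assumption above, a minimal point-to-point communication sketch: rank 0 explicitly ships a value to rank 1, which is the "event of data exchange" described here (run with at least two ranks).

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv) {
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        int payload = 42;
        /* Explicit communication: rank 0 sends its data to rank 1. */
        MPI_Send(&payload, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        int received;
        MPI_Recv(&received, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d from rank 0\n", received);
    }

    MPI_Finalize();
    return 0;
}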
Some General Parallel Terminology
Synchronization
– The coordination of parallel tasks in real time, very often associated with communications.
– Often implemented by establishing a synchronization point within an application where a
task may not proceed further until another task(s) reaches the same or logically equivalent
point.
– Synchronization usually involves waiting by at least one task and can therefore cause a
parallel application's wall clock execution time to increase.
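A minimal sketch of a synchronization point, again assuming OpenMP in C (my choice of API): no thread continues past the barrier until every thread in the team has reached it, so faster threads wait there, which is exactly where the extra wall-clock time comes from.

#include <stdio.h>
#include <omp.h>

int main(void) {
    #pragma omp parallel
    {
        int tid = omp_get_thread_num();

        printf("thread %d: finished phase 1\n", tid);

        /* Synchronization point: every thread must arrive here before
           any thread is allowed to start phase 2. */
        #pragma omp barrier

        printf("thread %d: starting phase 2\n", tid);
    }
    return 0;
}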
Granularity
– Granularity is a qualitative measure of the ratio of computation to communication
– Coarse: relatively large amounts of computational work are done between communication
events
– Fine: relatively small amounts of computational work are done between communication
events
Observed Speedup
– Observed speedup of a code which has been parallelized, defined as:
wall-clock time of serial execution
-----------------------------------
wall-clock time of parallel execution
– One of the simplest and most widely used indicators for a parallel program's performance.
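A small sketch of how observed speedup might be measured in practice, assuming OpenMP in C (the timing calls and workload are illustrative, not from the slides): run the same work serially and in parallel, then divide the two wall-clock times.

#include <stdio.h>
#include <omp.h>

#define N 100000000

/* Sum N terms, either serially or with an OpenMP parallel loop. */
static double work(int use_threads) {
    double sum = 0.0;
    #pragma omp parallel for reduction(+:sum) if(use_threads)
    for (long i = 0; i < N; i++)
        sum += 1.0 / (i + 1.0);
    return sum;
}

int main(void) {
    double t0 = omp_get_wtime();
    work(0);                                 /* serial run   */
    double t_serial = omp_get_wtime() - t0;

    t0 = omp_get_wtime();
    work(1);                                 /* parallel run */
    double t_parallel = omp_get_wtime() - t0;

    /* Observed speedup = serial wall-clock time / parallel wall-clock time. */
    printf("serial %.3fs, parallel %.3fs, speedup %.2fx\n",
           t_serial, t_parallel, t_serial / t_parallel);
    return 0;
}

For instance, if the serial run takes 100 s and the parallel run takes 20 s, the observed speedup is 5x.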
Some General Parallel Terminology
Parallel Overhead
– The amount of time required to coordinate parallel tasks, as opposed to
doing useful work. Parallel overhead can include factors such as:
Task start-up time
Synchronizations
Data communications
Software overhead imposed by parallel languages, libraries, operating
system, etc.
Task termination time
Massively Parallel
– Refers to the hardware that comprises a given parallel system - having
many processing elements. The meaning of "many" keeps increasing,
but currently, the largest parallel computers are comprised of
processing elements numbering in the hundreds of thousands
Embarrassingly Parallel
– Solving many similar, but independent tasks simultaneously; little to no
need for coordination between the tasks
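A minimal sketch of an embarrassingly parallel loop, again assuming OpenMP in C: every iteration is independent of the others, so the work splits across threads with no communication or coordination beyond the final join.

#include <stdio.h>
#include <math.h>
#include <omp.h>

#define N 16

int main(void) {
    double result[N];

    /* Each iteration touches only its own element: no shared updates,
       no ordering constraints, no communication between iterations. */
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        result[i] = sqrt((double)i) * 3.0;   /* stand-in for an independent task */

    for (int i = 0; i < N; i++)
        printf("result[%d] = %f\n", i, result[i]);
    return 0;
}

(Compile with gcc -fopenmp -lm.)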
Some General Parallel Terminology
Scalability
– Refers to a parallel system's (hardware and/or software) ability to
demonstrate a proportionate increase in parallel speedup with the
addition of more resources. Factors that contribute to scalability include:
Hardware - particularly memory-CPU bandwidths and network
communication properties
Application algorithm
Parallel overhead
Characteristics of your specific application
Why is Parallel Computing Necessary?
• Rise of multi-core computing machines
• Under-utilization of resources
• Hyperthreading, introduced by Intel
Example