Slides Chapter 2 - Parallel Programming Platforms
Hardware
System Software
Application Software
Parallel Algorithms
Logical Organization
The user's view of the machine as it is presented via its system software
Physical Organization
The actual hardware architecture
Control Mechanism
SISD/SIMD/MIMD/MISD
Communication Model
Shared-Address Space
Message-Passing
UMA/NUMA/ccNUMA
Physical Organization
PRAM Models
EREW/ERCW/CREW/CRCW
Concurrent writes are resolved via Common/Arbitrary/Priority/Sum
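The four policies named above for resolving concurrent writes (common, arbitrary, priority, sum) can be sketched as a single function. This is only an illustration: the function name `resolve_crcw`, the `(processor_id, value)` representation, and the choice that the lowest processor id wins under the priority policy are assumptions, not something the slides specify.

```python
import random

def resolve_crcw(writes, policy, rng=None):
    """Resolve concurrent writes to one memory cell in a single PRAM step.

    writes: list of (processor_id, value) pairs issued in the same step.
    policy: 'common', 'arbitrary', 'priority', or 'sum'.
    """
    if not writes:
        raise ValueError("no writes to resolve")
    values = [value for _, value in writes]
    if policy == "common":
        # The write is legal only if every processor writes the same value.
        if len(set(values)) != 1:
            raise ValueError("common CRCW requires identical values")
        return values[0]
    if policy == "arbitrary":
        # Any one write succeeds; which one is left unspecified.
        return (rng or random).choice(values)
    if policy == "priority":
        # Assumption: the lowest processor id has the highest priority.
        return min(writes, key=lambda w: w[0])[1]
    if policy == "sum":
        # The cell receives the sum of all written values.
        return sum(values)
    raise ValueError(f"unknown policy: {policy}")
```

For example, with writes `[(2, 5), (0, 7), (1, 3)]` the priority policy stores 7 (processor 0 wins under the assumed ordering) and the sum policy stores 15.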
Physical Organization
Static
direct network
distributed-memory systems
Dynamic
indirect network
The network consists of switching elements that the various processors attach to
shared-memory systems
Diameter
Connectivity
Measures the multiplicity of paths
The minimum number of arcs that must be removed to break the network into two disconnected networks
Bisection width
The minimum number of arcs that must be removed to partition the network into two equal halves
Bisection bandwidth
Applies to networks with weighted arcs; the weights correspond to the link width (how much data a link can transfer)
The minimum volume of communication allowed between any two halves of the network
Cost
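To make these metrics concrete, the sketch below computes diameter, bisection width, and cost (taken here to be the number of links, which is an assumption about how the slides measure cost) for two small networks. The helper names are hypothetical, the graphs are assumed to be connected, and the bisection search is brute force, so it is only practical for a handful of nodes.

```python
from collections import deque
from itertools import combinations

def ring(p):
    """Adjacency sets of a p-node ring."""
    return {i: {(i - 1) % p, (i + 1) % p} for i in range(p)}

def hypercube(d):
    """Adjacency sets of a d-dimensional hypercube (2**d nodes)."""
    return {i: {i ^ (1 << k) for k in range(d)} for i in range(1 << d)}

def diameter(adj):
    """Largest shortest-path distance between any two nodes (BFS from each node)."""
    def eccentricity(src):
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        return max(dist.values())
    return max(eccentricity(s) for s in adj)

def bisection_width(adj):
    """Fewest links cut by any split into two equal halves (exhaustive search)."""
    nodes = sorted(adj)
    best = None
    for part in combinations(nodes, len(nodes) // 2):
        side = set(part)
        cut = sum(1 for u in side for v in adj[u] if v not in side)
        best = cut if best is None else min(best, cut)
    return best

def cost(adj):
    """Number of links; each link appears in two adjacency sets."""
    return sum(len(neighbours) for neighbours in adj.values()) // 2
```

On 8 nodes, the ring has diameter 4, bisection width 2, and 8 links, while the 3-dimensional hypercube has diameter 3, bisection width 4, and 12 links.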
Network Topologies
Bus-Based Networks
Shared medium
Network Topologies
Crossbar Networks
Switch-based
Network Topologies
Pass-through
Cross-over
Network Topologies
Cartesian Topologies
Network Topologies
Hypercubes
Network Topologies
Trees
Topology Embeddings
Useful in the early days of parallel computing, when topology-specific algorithms were being developed
dilation
maximum number of links an edge is mapped to
congestion
maximum number of edges mapped on a single link
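As an illustration of the two quality measures above (the longest path a single edge is stretched over, usually called dilation, and the congestion), the sketch below embeds a ring of 2^d nodes into a d-dimensional hypercube and measures both. The reflected Gray code mapping is a standard choice for this embedding but is not taken from the slides, and the function names and the low-to-high bit order used to lay out paths are assumptions.

```python
from collections import Counter

def gray(i):
    """Reflected Gray code: consecutive values differ in exactly one bit."""
    return i ^ (i >> 1)

def hypercube_path(src, dst):
    """Links on a shortest hypercube path, fixing differing bits from low to high."""
    links, cur, diff, k = [], src, src ^ dst, 0
    while diff:
        if diff & 1:
            nxt = cur ^ (1 << k)
            links.append(frozenset((cur, nxt)))
            cur = nxt
        diff >>= 1
        k += 1
    return links

def dilation_and_congestion(d, mapping):
    """Measure how a ring of 2**d nodes sits in a d-dimensional hypercube.

    mapping[i] is the hypercube node that hosts ring node i.
    """
    p = 1 << d
    load = Counter()      # ring edges carried by each hypercube link
    dilation = 0
    for i in range(p):
        links = hypercube_path(mapping[i], mapping[(i + 1) % p])
        dilation = max(dilation, len(links))
        load.update(links)
    return dilation, max(load.values())

gray_mapping = {i: gray(i) for i in range(8)}
identity_mapping = {i: i for i in range(8)}
```

With the Gray code mapping every ring edge lands on its own hypercube link, so both dilation and congestion are 1. The identity mapping stretches some ring edges (for example 3 to 4, i.e. 011 to 100) over three links.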
Routing Mechanisms
Routing: the algorithm used to determine the path that a message will take to go from the source to the destination
Dimension-ordered routing
There is a predefined ordering of the dimensions
Messages are routed along the dimensions in that order until they cannot move any further
Example on a hypercube: 010 -> 011 -> 111
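The dimension-by-dimension rule above can be sketched for two topologies: a hypercube, where it is commonly called E-cube routing, and a 2D mesh, where it is commonly called XY routing. The function names are hypothetical, and the dimension order (least significant bit first, X before Y) is an assumption chosen to match the hop sequence 010, 011, 111.

```python
def ecube_route(src, dst, dims):
    """Nodes visited from src to dst in a hypercube with `dims` dimensions."""
    path = [src]
    cur = src
    for k in range(dims):              # predefined order: dimension 0, 1, 2, ...
        if (cur ^ dst) & (1 << k):     # labels differ in dimension k
            cur ^= 1 << k              # cross the link along that dimension
            path.append(cur)
    return path

def xy_route(src, dst):
    """Nodes visited in a 2D mesh without wraparound: all X moves, then all Y moves."""
    (x, y), (dst_x, dst_y) = src, dst
    path = [(x, y)]
    while x != dst_x:                  # move along X until it cannot move further
        x += 1 if dst_x > x else -1
        path.append((x, y))
    while y != dst_y:                  # then move along Y
        y += 1 if dst_y > y else -1
        path.append((x, y))
    return path
```

Calling `ecube_route(0b010, 0b111, 3)` yields the nodes 010, 011, 111. Because every message corrects dimensions in the same order, the path between a given source and destination is always the same.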
Physical Organization
A certain level of consistency must be maintained for multiple copies of the same data
Required to ensure proper semantics and correct program execution
serializability
Two general protocols for enforcing it: invalidate and update
Invalidate/Update Protocols
Classical trade-off between communication overhead (updates) and idling (stalling in invalidates)
Additional problems with false sharing
Existing schemes are based on the invalidate protocol
A number of approaches have been developed for maintaining the state/ownership of the shared data
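The trade-off between update traffic and invalidate stalls can be seen in a toy model of a single shared variable. This is a deliberately simplified sketch, not a real coherence protocol: the class name, the write-through memory, and the two counters are assumptions made for illustration.

```python
class CoherentVariable:
    """One shared variable cached by several processors."""

    def __init__(self, n_procs, protocol):
        if protocol not in ("invalidate", "update"):
            raise ValueError("protocol must be 'invalidate' or 'update'")
        self.protocol = protocol
        self.memory = 0
        self.cache = [None] * n_procs   # None means no valid copy
        self.messages = 0               # coherence messages sent to other caches
        self.misses = 0                 # reads that had to fetch (stall)

    def read(self, p):
        if self.cache[p] is None:
            self.misses += 1
            self.cache[p] = self.memory
        return self.cache[p]

    def write(self, p, value):
        self.memory = value             # write-through keeps the sketch short
        self.cache[p] = value
        for q in range(len(self.cache)):
            if q != p and self.cache[q] is not None:
                self.messages += 1
                # Update pushes the new value; invalidate drops the copy.
                self.cache[q] = value if self.protocol == "update" else None

def run(protocol):
    """Both processors read, P0 writes ten times, then P1 reads once."""
    v = CoherentVariable(2, protocol)
    v.read(0)
    v.read(1)
    for i in range(10):
        v.write(0, i)
    v.read(1)
    return v.messages, v.misses
```

In this scenario the invalidate protocol sends a single message but P1 stalls on one extra miss, while the update protocol avoids the miss at the price of ten messages.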
Message-Passing Systems
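The communication cost is determined by three parameters
startup time: ts
add headers/trailer, error-correction, execute the routing algorithm, establish the connection between source & destination
per-hop time: th
time to travel between two directly connected nodes (node latency)
per-word transfer time: tw
1/channel-width
Simplified cost model for an m-word message crossing l links: tcomm = ts + l·th + m·tw ≈ ts + m·tw
In general true because ts is much larger than th, and for most of the algorithms that we will study m·tw is much larger than l·th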
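A quick numerical check shows why the per-hop term can usually be dropped. The sketch assumes the usual cut-through cost expression ts + l*th + m*tw for an m-word message crossing l links. The function names and all parameter values are hypothetical, chosen only so that ts is much larger than th.

```python
def comm_time(m, l, ts, th, tw):
    """Startup time plus per-hop time over l links plus per-word time for m words."""
    return ts + l * th + m * tw

def comm_time_simplified(m, ts, tw):
    """Same model with the per-hop term dropped."""
    return ts + m * tw

# Hypothetical values in arbitrary time units, for illustration only.
ts, th, tw = 100.0, 1.0, 0.5
m, l = 1000, 4

full = comm_time(m, l, ts, th, tw)          # 100 + 4 + 500
approx = comm_time_simplified(m, ts, tw)    # 100 + 500
relative_error = (full - approx) / full
```

With these values the full model gives 604 and the simplified one 600, a difference of under one percent. The gap grows only when messages are very short or paths are very long.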