Parallel
Unit-I:
Sequential model, need for an alternative model, parallel computational models such as PRAM, LMCC, Hypercube,
Cube-Connected Cycles, Butterfly, Perfect Shuffle Computers, Tree model, Pyramid model, Fully Connected model, PRAM-CREW and EREW models, simulation of one model by another.
Unit-II:
Performance measures of parallel algorithms, speed-up and efficiency of parallel algorithms, cost-optimality, examples
illustrating cost-optimal algorithms such as summation and Min/Max on various models.
Unit-III:
Parallel Sorting Networks, Parallel Merging Algorithms on CREW/EREW/MCC, Parallel Sorting Networks on
CREW/EREW/MCC, linear array.
Unit-IV:
Parallel Searching Algorithms, Kth element, Kth element in X+Y on PRAM, Parallel Matrix Transposition and
Multiplication Algorithms on PRAM and MCC, Vector-Matrix Multiplication, Solution of Linear Equations, Root finding.
Unit-V:
Graph Algorithms - Connected Graphs, search and traversal, Combinatorial Algorithms - Permutations, Combinations,
Derangements.
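The Unit-II performance measures (speed-up, efficiency, cost) can be made concrete with a small step-counting simulation of parallel summation. This is only an illustrative sketch, not taken from the listed references: the function name parallel_sum_steps is hypothetical, each addition is assumed to take one time unit, and communication is assumed to be free.

```python
import math

def parallel_sum_steps(values, p):
    """Simulate summing `values` with p processors.

    Returns (total, parallel_steps), where parallel_steps counts
    unit-time addition steps on the critical path.
    """
    n = len(values)
    block = math.ceil(n / p)
    # Phase 1: each processor adds its own block sequentially.
    partial = [sum(values[i:i + block]) for i in range(0, n, block)]
    steps = block - 1
    # Phase 2: tree reduction, halving the number of partial sums per step.
    while len(partial) > 1:
        partial = [sum(partial[i:i + 2]) for i in range(0, len(partial), 2)]
        steps += 1
    return partial[0], steps

def measures(n, p):
    """Return (speedup, efficiency, cost) for summing n numbers on p processors."""
    _, t_p = parallel_sum_steps(list(range(n)), p)
    t_1 = n - 1                 # sequential summation needs n - 1 additions
    speedup = t_1 / t_p
    return speedup, speedup / p, p * t_p

if __name__ == "__main__":
    for p in (4, 16):
        print(p, measures(16, p))
```

Under these assumptions, using fewer processors than elements gives a lower cost (processors times steps) than using one processor per element, which is the idea behind cost-optimal summation.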
References:
1. M.J. Quinn, Designing Efficient Algorithms for Parallel Computers, McGraw-Hill.
2. S.G. Akl, The Design and Analysis of Parallel Algorithms.
3. S.G. Akl, Parallel Sorting Algorithms, Academic Press.
COURSE PLAN
PARALLEL ALGORITHMS
ECS-073
Course description: This course is about one (and perhaps the most fundamental) aspect of parallelism, namely, parallel
algorithms. A parallel algorithm is a solution method for a given problem destined to be performed on a parallel computer. In
order to properly design such algorithms, one needs to have a clear understanding of the model of computation underlying
the parallel computer.
Topic
Lecture-1
1. Sequential
model
2. Desirable
Properties
For Parallel
Algorithms
Lecture-2
1. Need of
alternative
model
2. Parallel
computational models
Lecture-3
Parallel
computational models
Knowledge Input
Unit-I
Concept Input
Supportive Aid
References
Difference between
sequential model and
parallel model
Discussion and
Example
Notes
Working function of
these models
Discussion and
Example
S.G. Akl,
Design
and
Analysis of
Parallel
Algorithms
, Notes
Asymptotic Notation
Cube Connected
Cycle
Butterfly
Perfect Shuffle
Computers
Tree model
Working function of
these models
Discussion and
Example
S.G. Akl,
Design
and
Analysis of
Parallel
Algorithms
, Notes
Basic introduction to
different types of
models
Lecture-4
Designing
Algorithms
Topic
Lecture-1
Performance
Measures of
Parallel
Algorithms
Lecture-2
Performance
Measures of
Parallel
Algorithms
Pyramid model
Fully Connected
model
Incremental Approach
PRAM-CREW
EREW models
Discussion and
Examples
Design &
Analysis of
Parallel
Algorithms
Supportive Aid
References
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Other Measures
Area
Length
Period
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Knowledge Input
Lecture-3
Expressing
Algorithms
Lecture-4
Lower Bound
Lecture-5
Examples
illustrating
cost-optimal
algorithms
Lecture-6
An Algorithm
For Parallel
Selection
Informal language to
describe parallel
algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Linear Order
Rank
Selection
Complexity
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Broadcasting a
Datum
Computing All
Sums
Procedure
Analysis
Unit-III
Topic
Lecture-1
Merging
Lecture-2
Merging On
The CREW
Model
Lecture-3
1. Merging
On The
EREW Model
2. A Better
Algorithm
For The
EREW Model
Lecture-4
Sorting
Knowledge Input
Concept Input
Supportive Aid
References
Introduction
A Network For
Merging
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Introduction
A Network For
Sorting
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Sequential Merging
Parallel Merging
practitioners, as sorting
data is at the heart of many
computations. It also has a
rich theory:
Lecture-5
Sorting On A
Linear Array
Lecture-6
1. Sorting On
The CRCW
Model
2. Sorting On
The CREW
Model
Lecture-7
Sorting On
The EREW
Model
Topic
Lecture-1
Searching
Lecture-2
Searching A
Random
Sequence
Lecture-3
ODD-EVEN
TRANSPOSITION
MERGE SPLIT
CRCW SORT
CREW SORT
Simulating
Procedure CREW
SORT
Sorting by Conflict-Free Merging
Sorting by Selection
Knowledge Input
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Supportive Aid
References
Unit-IV
Concept Input
1. Introduction
2. Searching A Sorted
Sequence
EREW
Searching
CREW
Searching
CRCW
Searching
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Searching on SM SIMD
Computers
Searching on a Tree
Searching on a Mesh
We begin by studying
parallel search algorithms
for shared-memory SIMD
computers. We then show
how the power of this
model is not really needed
for the search problem
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Generating
Permutations
and
Combinations
1. Introduction
In this lecture we
describe a number of
parallel algorithms for the
two fundamental problems
of generating permutations
and combinations.
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Lecture-4
Sequential
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Adapting a Sequential
Algorithm
An Adaptive
Permutation Generator
Parallel Permutation
Generator for Few
Processors
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
A Fast Combination
Generator
An Adaptive
Combination Generator
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Introduction
Problems involving
matrices arise in a
multitude of numerical and
nonnumerical
contexts.
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Mesh Transpose
Shuffle Transpose
EREW Transpose
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Mesh Multiplication
Cube Multiplication
CRCW Multiplication
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Linear Array
Multiplication
Tree Multiplication
Convolution
We study it
separately in order to
demonstrate the use of two
interconnection networks in
Discussion and
Examples
Design
and
Analysis of
Parallel
Lecture-5
Generating
Permutation
In Parallel
Lecture-6
Generating
Combination
In Parallel
Lecture-7
Matrix
Operations
Lecture-8
Transposition
Lecture-9
Matrix-by-Matrix
Multiplication
Lecture-10
Matrix-by-Vector
Multiplication
Generating
Permutations
Lexicographically
Numbering
Permutations
Generating
Combinations
Lexicographically
Numbering
Combinations
performing matrix
operations, namely, the
linear (or one-dimensional)
array and the tree.
Algorithms
Lecture-11
Numerical
Problems
Lecture-12
Finding
Roots Of
Nonlinear
Equations
1. Introduction
2. Solving Systems Of
Linear Equations
An SIMD Algorithm
An MIMD Algorithm
An SIMD
Algorithm
An MIMD
Algorithm
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design
and
Analysis of
Parallel
Algorithms
Unit-V
Topic
Lecture-1
Graph
Theory
Lecture-2
Computing
The
Connectivity
Matrix
Lecture-3
Finding
Connected
Components
Lecture-4
All-Pairs
Shortest
Paths
Knowledge Input
Concept Input
Supportive Aid
References
Introduction
Definition
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
CUBE
CONNECTIVITY
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
CUBE COMPONENTS
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
CUBE SHORTEST
PATHS
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
Lecture-5
Computing
The
Minimum
Spanning
Tree
Lecture-6
Traversing
Combinatorial Spaces
Lecture-7
Basic Design
Principles
Lecture-8
The
Algorithm
Lecture-9
Analysis And
Examples
EREW MST
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
Introduction
Sequential Tree
Traversal
Many combinatorial
problems can be solved by
generating and searching a
special graph known as a
state-space graph. This
method, aptly called
state-space traversal, differs from
the searching algorithms
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
Procedures and
Processes
Semaphores
Score Tables
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
Parallel Cutoffs
Storage Requirements
Discussion and
Examples
Design and
Analysis of
Parallel
Algorithms
Books Recommended:
1. M.J. Quinn, Designing Efficient Algorithms for Parallel Computers, McGraw-Hill.
2. S.G. Akl, The Design and Analysis of Parallel Algorithms.
3. S.G. Akl, Parallel Sorting Algorithms, Academic Press.
TUTORIAL SHEET
PARALLEL ALGORITHMS
ECS-073
TUTORIAL SHEET 1
TUTORIAL SHEET 4
1. In steps 1 and 2 of procedure SEQUENTIAL SELECT, a simple sequential algorithm is required for sorting short
sequences. Describe one such algorithm.
2. Show that 2-D mesh with an odd number of rows, an odd number of columns.
3. Prove that a complete binary tree with height n can be embedded with dilation 1 in an (n+2)-dimensional
hypercube.
4. Show that, in general, any (r, s)-merging network must require Ω(s log r) comparators when r < s.
5. The sequence of comparisons in the odd-even merging network can be viewed as a parallel algorithm. Describe an
implementation of that algorithm on an SIMD computer where the processors are connected to form a linear array.
The two input sequences to be merged initially occupy processors P_1 to P_r and P_(r+1) to P_(r+s), respectively. When the
algorithm terminates, P_i should contain the ith smallest element of the output sequence.
6. Establish the correctness of procedure EREW MERGE.
7. Establish the correctness of procedure TWO-SEQUENCE MEDIAN.
8. Develop a parallel merging algorithm for the CRCW model.
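As background for question 5, the comparison pattern of odd-even merging can be sketched sequentially. The code below is a minimal illustration under stated assumptions, not a procedure from the references: the function name odd_even_merge is hypothetical, and both inputs are assumed to be sorted and of the same power-of-two length. The two recursive merges are independent, and so are all compare-exchange operations in the final loop, which is where the parallelism comes from.

```python
def odd_even_merge(a, b):
    """Merge two sorted lists of equal power-of-two length
    using the odd-even merging comparison pattern."""
    if len(a) == 1:
        # Base case: a single comparator.
        return [min(a[0], b[0]), max(a[0], b[0])]
    # Merge the elements at even positions and at odd positions separately.
    c = odd_even_merge(a[0::2], b[0::2])
    d = odd_even_merge(a[1::2], b[1::2])
    # Final stage: compare-exchange c[i] with d[i - 1]; the first element
    # of c and the last element of d are already in place.
    out = [c[0]]
    for i in range(1, len(c)):
        out.append(min(c[i], d[i - 1]))
        out.append(max(c[i], d[i - 1]))
    out.append(d[-1])
    return out

if __name__ == "__main__":
    print(odd_even_merge([1, 4, 6, 9], [2, 3, 7, 8]))
```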
TUTORIAL SHEET 5
1. Write an O(log n) matrix multiplication algorithm for the CREW PRAM model. Assume that n = 2^k, where k is a
positive integer.
2. What is
for the hypercube SIMD model?
3. Determine the processor efficiency of the hypercube SIMD matrix multiplication algorithm as a function of the
matrix dimension n.
a j i 0
not needed?
n / log p
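One possible way to think about logarithmic-time matrix multiplication on a CREW PRAM is sketched below as a sequential simulation. It is an assumption-based illustration, not the solution from the references: the function name crew_matmul is hypothetical, one virtual processor per triple (i, j, k) is assumed (n^3 in total), and n is assumed to be a power of two. The loops inside each round stand in for work that the processors would do simultaneously.

```python
def crew_matmul(a, b):
    """Simulate an n^3-processor CREW PRAM matrix product.

    Returns (product, rounds), where rounds is the number of parallel
    addition rounds after the single multiplication step.
    """
    n = len(a)
    # Step 1: processor (i, j, k) computes one product. Several processors
    # read the same a[i][k] or b[k][j], hence concurrent reads.
    t = [[[a[i][k] * b[k][j] for k in range(n)]
          for j in range(n)] for i in range(n)]
    # Step 2: pairwise (tree) summation along k; each round halves the
    # number of live partial sums, and every write goes to a distinct cell.
    rounds = 0
    stride = 1
    while stride < n:
        for i in range(n):
            for j in range(n):
                for k in range(0, n, 2 * stride):
                    t[i][j][k] += t[i][j][k + stride]
        stride *= 2
        rounds += 1
    return [[t[i][j][0] for j in range(n)] for i in range(n)], rounds

if __name__ == "__main__":
    print(crew_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

The number of rounds equals log2(n), which is the source of the logarithmic running time under these assumptions.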
TUTORIAL SHEET 7
1. Design an algorithm for sorting on the pyramid machine.
2. Implement the idea of sorting by enumeration on a cube-connected SIMD computer and analyze the running time of
your implementation.
3. Derive an algorithm for sorting by enumeration on the EREW model. The algorithm should use n^(1+1/k) processors,
where k is an arbitrary integer, and run in O(k log n) time.
4. Let the elements of the sequence S to be sorted belong to the set {0, 1, ..., m - 1}. A sorting algorithm known as
sorting by bucketing first distributes the elements among a number of buckets that are then sorted individually.
Show that sorting can be completed in O(log n) time on the EREW model using n processors and O(mn) memory
locations.
5. Which of the following sequences are bitonic sequences?
a) 2,3
b) 8,1
c) 2,5,3
d) 6,2,6,9,7
e) 3,3,4,5,2
f) 1,3,6,4,7,9
g) 8,4,2,1,2,5,7,9
h) 1,9,7,3,2,5
6. Prove or disprove: All sequences containing fewer than four elements are bitonic sequences.
7. How many shuffle-exchange steps does Stone's bitonic sorter require for n values, where n = 2^k
and each step uses n/2 comparators?
8. What is the worst-case time complexity of the parallel quicksort algorithm?
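Answers to question 5 can be checked mechanically. The sketch below assumes the common definition of a bitonic sequence (a sequence that first does not decrease and then does not increase, or any cyclic shift of such a sequence, with ties allowed); if the course text uses a stricter definition, the check must be adapted. The function name is_bitonic is hypothetical.

```python
def is_bitonic(seq):
    """Return True if seq is bitonic under the cyclic-shift definition.

    Walk around the sequence as a cycle, record whether each step goes
    up or down (ignoring equal neighbours), and count how often the
    direction changes. A bitonic sequence changes direction at most twice.
    """
    n = len(seq)
    signs = []
    for i in range(n):
        d = seq[(i + 1) % n] - seq[i]
        if d != 0:
            signs.append(1 if d > 0 else -1)
    m = len(signs)
    changes = sum(1 for i in range(m) if signs[i] != signs[(i + 1) % m])
    return changes <= 2

if __name__ == "__main__":
    for s in ([2, 5, 3], [1, 3, 6, 4, 7, 9], [8, 4, 2, 1, 2, 5, 7, 9]):
        print(s, is_bitonic(s))
```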
TUTORIAL SHEET 8
1. Show that Ω(log n) is a lower bound on the number of steps required to search a sorted sequence of n elements on
an EREW SM SIMD computer with n processors.
2. Consider a tree-connected SIMD computer where each node contains a record (not just the leaves). Describe
algorithms for querying and maintaining such a file of records.
3. Can the transpose of an n x n matrix be obtained on an interconnection network, other than the perfect shuffle, in
O(log n) time?
4. Is there an interconnection network capable of simulating procedure EREW TRANSPOSE in constant time?
5. Design an algorithm for multiplying two n x n matrices on a cube with n^2 processors in O(n) time.
6. Show that procedure CUBE CONNECTIVITY is not cost optimal. Can the procedure's cost be reduced?
7. Derive a parallel algorithm to compute the connectivity matrix of an n-vertex graph in O(n) time on an n x n
mesh-connected SIMD computer.
8. An articulation point of a connected undirected graph G is a vertex whose removal splits G into two or more
connected components. Design a parallel algorithm to determine all the articulation points of a given graph.
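For questions 6 and 7, it may help to see what a connectivity matrix is and how repeated Boolean matrix squaring produces it. The following is a sequential sketch under stated assumptions and is not procedure CUBE CONNECTIVITY itself: the function name connectivity_matrix is hypothetical, and the graph is assumed to be given as an n x n adjacency matrix of 0/1 entries. Each squaring is a Boolean matrix product, which is the step a parallel algorithm would distribute across processors.

```python
def connectivity_matrix(adj):
    """Return c where c[i][j] is True iff a path (possibly of length 0)
    leads from vertex i to vertex j."""
    n = len(adj)
    # Start with paths of length at most 1 (edges plus the diagonal).
    c = [[bool(adj[i][j]) or i == j for j in range(n)] for i in range(n)]
    reach = 1
    # Each Boolean squaring doubles the maximum path length covered;
    # paths of length n - 1 suffice, so about log2(n) squarings are needed.
    while reach < n - 1:
        c = [[any(c[i][k] and c[k][j] for k in range(n))
              for j in range(n)] for i in range(n)]
        reach *= 2
    return c

if __name__ == "__main__":
    # Undirected path 0 - 1 - 2, with vertex 3 isolated.
    g = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 0],
         [0, 0, 0, 0]]
    for row in connectivity_matrix(g):
        print([int(x) for x in row])
```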