K. Erciyes
Guide
to Graph
Algorithms
Sequential, Parallel and Distributed
Texts in Computer Science
Series editors
David Gries, Dept of Computer Science, Cornell University, Ithaca, New York,
USA
Orit Hazzan, Faculty of Education in Technology and Science, Technion—Israel
Institute of Technology, Haifa, Israel
Fred B. Schneider, Dept of Computer Science, Cornell University, Ithaca,
New York, USA
More information about this series at http://www.springer.com/series/3191
K. Erciyes
International Computer Institute
Ege University
Izmir
Turkey
This Springer imprint is published by the registered company Springer International Publishing AG
part of Springer Nature
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
To the memories of Semra, Seyhun, Şebnem
and Hakan
Preface
Graphs are key data structures for the analysis of various types of networks such as
mobile ad hoc networks, the Internet, and complex networks such as social net-
works and biological networks. Study of graph algorithms is needed to efficiently
solve a variety of problems in such networks efficiently. This study is commonly centered
around three fundamental paradigms: sequential, parallel, and distributed graph
algorithms. Sequential algorithms in general assume a single flow of control, and
such methods are well established. For intractable graph problems that cannot be
solved sequentially in polynomial time, approximation algorithms with proven
approximation ratios to the optimum solution can be used. Many times, however,
no approximation algorithm is known to date and the only choice is to use heuristics,
which are common-sense rules shown experimentally to work for a wide range of
inputs. The algorithm designer is frequently confronted with the task of knowing
what to search for, and what road to follow if an exact solution does not exist. The
first aim of this book is to provide a comprehensive and in-depth analysis of
sequential graph algorithms and to guide the designer on how to approach a typical
hard problem, showing how to select an appropriate heuristic, which is needed in
many cases.
Parallel algorithms are needed to provide speed-up in the running of graph
algorithms. Shared memory parallel algorithms synchronize using a common
memory and distributed memory parallel algorithms communicate by
message-passing only. Distributed graph (or network) algorithms are aware of the
network topology and can be used for various network-related tasks such as routing.
Although distributed algorithms is the common term covering both distributed
memory and distributed graph algorithms, in this book we will refer to shared
memory and distributed memory parallel graph algorithms as parallel graph
algorithms, and to distributed graph or network algorithms as distributed algorithms.
The design and analysis of parallel and distributed algorithms, as well as sequential
algorithms for graphs, will be the subject of this book.
A second and a fundamental goal of this book is to unify these three seemingly
different methods of graph algorithms where applicable. For example, the minimum
spanning tree (MST) problem can be solved by four classical sequential algorithms:
Boruvka’s, Prim’s, Kruskal’s, and Reverse-Delete algorithms all with similar
complexities. A parallel MST algorithm will attempt to find the MST of a large
network using a number of processors, solely to obtain a speedup. In a distributed
MST algorithm, each processor is a node of the network graph and participates in
finding the MST of the network. We will describe and compare all three paradigms
for this and many other well-known graph problems by looking at the same problem
from three different angles, which we believe will help to understand the problem
better and form a unifying view.
A third and an important goal of this work will be the conversions between
sequential, shared, and distributed memory parallel and distributed algorithms for
graphs. This process is not commonly implemented in literature although there are
opportunities in many cases. We will exemplify this concept with the maximal
weighted matching algorithm in graphs. The sequential approximation algorithm for
this purpose with a 0.5 ratio has a time complexity of O(m log m), with m being the
number of edges. Preis provided a faster localized sequential algorithm based on the
first algorithm with a better O(m) complexity. Later, Hoepman provided a distributed
version of Preis' algorithm. More recently, Manne provided a sequential form of
Hoepman's distributed graph algorithm and parallelized this algorithm. The sequence
of methods employed has been sequential → sequential → distributed → sequential
→ parallel for this graph problem. Although this example shows a rather long
transformation sequence, the sequential → parallel and sequential → distributed
conversions are commonly followed by researchers, mostly by common sense.
Conversion between parallel graph algorithms ↔ distributed graph algorithms is
very seldom practiced. Our aim will be to lay down the foundations of these
transformations between paradigms to convert an algorithm in one domain to
another. This may be difficult for some types of algorithms, but graph algorithms
are a good premise.
As more advanced technologies are developed, we are confronted with the
analysis of big data of complex networks which have tens of thousands of nodes
and hundreds of thousands of edges. We also provide a part on algorithms for big
data analysis of complex networks such as the Internet, social networks, and bio-
logical networks in the cell. To summarize, we have the following goals in this
book:
• A comprehensive and detailed study of the fundamental principles of sequential
  graph algorithms and approaches for NP-hard problems, including approximation
  algorithms and heuristics.
• A comparative analysis of sequential, parallel and distributed graph algorithms
including algorithms for big data.
• Study of conversion principles between the three methods.
There are three parts in the book; we provide a brief background on graphs,
sequential, parallel, and distributed graph algorithms in the first part. The second
part forms the core of the book with a detailed analysis of sequential, parallel, and
distributed algorithms for fundamental graph problems. In the last part, our focus is
on algebraic and dynamic graph algorithms and graph algorithms for very large
networks, which are commonly implemented using heuristics rather than exact
solutions.
We review theory as much as needed for the design of sequential, parallel, and
distributed graph algorithms and our emphasis for many problems is on imple-
mentation details in full. Our study of sequential graph algorithms throughout the
book is comprehensive, however, we provide a comparative analysis of sequential
algorithms only with the fundamental parallel and distributed graph algorithms. We
kept the layout of each chapter as homogeneous as possible by first describing the
problem informally and then providing the basic theoretical background. We then
describe fundamental algorithms by first describing the main idea of an algorithm;
then giving its pseudocode; showing an example implementation and finally the
analysis of its correctness and complexities. This algorithm template is repeated for
all algorithms except the ones that have complex structures and phases in which
case we describe the general idea and the operation of the algorithm.
The intended audience for this book is senior/graduate students of computer
science, electrical and electronic engineering, bioinformatics, and any researcher or
person with a background in discrete mathematics, basic graph theory, and algo-
rithms. There is a Web page for the book to keep errata and other material at: http://
ube.ege.edu.tr/~erciyes/GGA/.
I would like to thank senior/graduate students at Ege University, University of
California Davis, California State University San Marcos, and senior/graduate
students at Izmir University who have taken the distributed algorithms and complex
networks courses, sometimes under slightly different names, for their valuable
feedback when parts of the material covered in the book were presented during
lectures. I would also like to thank Springer editors Wayne Wheeler and Simon
Rees for their help and their faith in another book project I have proposed.
Contents

1 Introduction
  1.1 Graphs
  1.2 Graph Algorithms
    1.2.1 Sequential Graph Algorithms
    1.2.2 Parallel Graph Algorithms
    1.2.3 Distributed Graph Algorithms
    1.2.4 Algorithms for Large Graphs
  1.3 Challenges in Graph Algorithms
  1.4 Outline of the Book
Part I Fundamentals
2 Introduction to Graphs
  2.1 Introduction
  2.2 Notations and Definitions
    2.2.1 Vertex Degrees
    2.2.2 Subgraphs
    2.2.3 Isomorphism
  2.3 Graph Operations
    2.3.1 Union and Intersection
    2.3.2 Cartesian Product
  2.4 Types of Graphs
    2.4.1 Complete Graphs
    2.4.2 Directed Graphs
    2.4.3 Weighted Graphs
    2.4.4 Bipartite Graphs
    2.4.5 Regular Graphs
    2.4.6 Line Graphs
  2.5 Walks, Paths, and Cycles
    2.5.1 Connectivity and Distance
  2.6 Graph Representations
    2.6.1 Adjacency Matrix
    2.6.2 Adjacency List
    2.6.3 Incidence Matrix
  2.7 Trees
  2.8 Graphs and Matrices
    2.8.1 Eigenvalues
    2.8.2 Laplacian Matrix
  2.9 Chapter Notes
  References
3 Graph Algorithms
  3.1 Introduction
  3.2 Basics
    3.2.1 Structures
    3.2.2 Procedures and Functions
  3.3 Asymptotic Analysis
  3.4 Recursive Algorithms and Recurrences
  3.5 Proving Correctness of Algorithms
    3.5.1 Contraposition
    3.5.2 Contradiction
    3.5.3 Induction
    3.5.4 Strong Induction
    3.5.5 Loop Invariants
  3.6 Reductions
    3.6.1 Difficult Graph Problems
    3.6.2 Independent Set to Vertex Cover Reduction
  3.7 NP-Completeness
    3.7.1 Complexity Classes
    3.7.2 The First NP-Hard Problem: Satisfiability
  3.8 Coping with NP-Completeness
    3.8.1 Randomized Algorithms
    3.8.2 Approximation Algorithms
    3.8.3 Backtracking
    3.8.4 Branch and Bound
  3.9 Major Design Methods
    3.9.1 Greedy Algorithms
    3.9.2 Divide and Conquer
    3.9.3 Dynamic Programming
  3.10 Chapter Notes
  References
4 Parallel Graph Algorithms
  4.1 Introduction
  4.2 Concepts and Terminology
  4.3 Parallel Architectures
    4.3.1 Shared Memory Architectures
    4.3.2 Distributed Memory Architectures
Abstract
Graphs are discrete structures that are frequently used to model many real-world
problems such as communication networks, social networks, and biological net-
works. We present sequential, parallel, and distributed graph algorithm concepts,
challenges in graph algorithms, and the outline of the book in this chapter.
1.1 Graphs
Graphs are discrete structures that are frequently used to model many real-world
problems such as communication networks, social networks, and biological net-
works. A graph consists of vertices and edges connecting these vertices. A graph
is shown as G = (V, E) where V is the set of vertices and E is the set of edges it
has. Figure 1.1 shows an example graph with vertices and edges between them, with
V = {a, b, c, d, e} and E = {(a, b), (a, e), (a, d), (b, c), (b, d), (b, e), (c, d), (d, e)},
(a, b) denoting the edge between vertices a and b, for example.
Graphs have numerous applications in areas including computer science, scientific
computing, chemistry, and sociology, since they are simple yet effective structures to model real-life
phenomena. A vertex of a graph represents some entity such as a person in a social
network or a protein in a biological network. An edge in such a network corre-
sponds to a social interaction such as friendship in a social network or a biochemical
interaction between two proteins in the cell.
Study of graphs has both theoretical and practical implications. In this chapter,
we describe the main goal of the book, which is to provide a unified view of graph
algorithms in terms of sequential, parallel, and distributed graph algorithms.
Consider, as a running example, a graph representing a communication network in
which an integer value for each edge may denote the cost of sending a message over
that edge. Commonly, the edge values are called the weights of the edges.
is to design a sequential algorithm to find the edge with the maximum value. This
may have some practical usage, as we may need to find the highest cost edge in the
network to avoid that link while some communication or transportation is done.
Let us form a distance matrix D for this graph, which has entries d(i, j) showing
the weight of the edge between vertices v_i and v_j, as below:

         ⎡  0   7   0   0   0   0   0   1 ⎤
         ⎢  7   0   4   0   0   0   0   9 ⎥
         ⎢  0   4   0   2   5   6  14   1 ⎥
         ⎢  0   0   2   0   3  10   0   0 ⎥
    D =  ⎢  0   0   5   3   0  13   0   0 ⎥
         ⎢  0   7   6  10  13   0   4  12 ⎥
         ⎢  0   7  14   0   0   4   0   8 ⎥
         ⎣ 11   9   1   0   0  12   8   0 ⎦
If two vertices v_i and v_j are not connected, we insert a zero for that entry in D.
We can have a simple sequential algorithm that finds the largest value in each row of
this matrix first and then the maximum value of all these largest values in the second
step, as shown in Algorithm 1.1.
This algorithm requires 8 comparisons for each row for a total of 64 comparisons.
It needs n² comparisons in general for a graph with n vertices.
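To make the two-step scan concrete, here is a minimal Python sketch of the idea behind Algorithm 1.1; the function name and the small example matrix are illustrative assumptions, not taken from the book.

def max_weight_edge(D):
    # Step 1: find the largest value in each row of the distance matrix.
    # Step 2: take the maximum of these row maxima.
    # A zero entry means the two vertices are not connected.
    row_maxima = [max(row) for row in D]
    return max(row_maxima)

# Illustrative 4-vertex weighted graph
D = [[0, 7, 0, 1],
     [7, 0, 4, 9],
     [0, 4, 0, 2],
     [1, 9, 2, 0]]
print(max_weight_edge(D))   # prints 9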
Parallel graph algorithms, like all parallel algorithms, aim for improved performance.
This way of speeding up programs is needed especially for very large graphs
representing complex networks, such as biological or social networks, which consist
of a huge number of nodes and edges. We have a number of processors working in
parallel on the same problem, and the results are commonly gathered at a single
processor for output. Parallel algorithms may synchronize and communicate through
shared memory, or they may run as distributed memory algorithms communicating
by the transfer of messages only. The latter mode of communication is the more
common practice in parallel computing because it can be realized on general network
architectures.
We can attempt to parallelize an existing sequential algorithm or design a new
parallel algorithm from scratch. A common approach in parallel computing is the
partitioning of data across a number of processors so that each computing element
works on a particular partition. Another fundamental approach is the partitioning of
computation across the processors, as we will investigate in Chap. 4. We will see that
for some graph problems it is difficult to partition either the data or the computation.
Let us reconsider the sequential algorithm in the previous section and attempt to
parallelize it using data partitioning. Since graph data is represented by the distance
matrix, the first thing to consider would be the partitioning of this matrix. Indeed,
row-wise or column-wise partitioning of a matrix representing a graph is commonly
used in parallel graph algorithms. Let us have a controlling processor, which we will
call the supervisor or the root, and two worker processors to do the actual work. This
mode of operation, sometimes called the supervisor/worker model, is also a common
practice in the design of parallel algorithms. Processors are commonly referred to as
processes, since the actual processor may also be doing some other work. We now
have three processes p0, p1, and p2, where p0 is the supervisor. The process p0 has
the distance matrix initially; it partitions the rows and sends the first half, rows 1 to 4,
to p1 and rows 5 to 8 to p2, as shown below:
         ⎡  0   7   0   0   0   0   0   1 ⎤
         ⎢  7   0   4   0   0   0   0   9 ⎥   rows 1–4 → p1
         ⎢  0   4   0   2   5   6  14   1 ⎥
         ⎢  0   0   2   0   3  10   0   0 ⎥
    D =  ⎢  0   0   5   3   0  13   0   0 ⎥
         ⎢  0   7   6  10  13   0   4  12 ⎥   rows 5–8 → p2
         ⎢  0   7  14   0   0   4   0   8 ⎥
         ⎣ 11   9   1   0   0  12   8   0 ⎦
Each worker now finds the heaviest edge incident to the vertices in the rows it is
assigned, using the sequential algorithm described, and sends this result to the super-
visor p0, which finds the maximum of these two values and outputs it. A more general
form of this algorithm with k worker processes is shown in Algorithm 1.2. Since data
is partitioned between two processes now, we would expect a significant decrease in
the running time.
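A minimal sketch of this supervisor/worker scheme, using Python's multiprocessing module as a stand-in for true message passing, is given below; the function names and the use of Pool are assumptions and do not transcribe Algorithm 1.2.

from multiprocessing import Pool

def row_block_max(rows):
    # Worker task: largest entry in the block of rows assigned to this worker.
    return max(max(row) for row in rows)

def parallel_max_edge(D, k=2):
    # Supervisor: partition the distance matrix row-wise into k blocks,
    # let k workers compute local maxima, then reduce the results at the root.
    n = len(D)
    chunk = (n + k - 1) // k
    blocks = [D[i:i + chunk] for i in range(0, n, chunk)]
    with Pool(processes=k) as pool:
        local_maxima = pool.map(row_block_max, blocks)   # scatter and compute
    return max(local_maxima)                             # gather and reduce

if __name__ == "__main__":
    D = [[0, 7, 0, 1], [7, 0, 4, 9], [0, 4, 0, 2], [1, 9, 2, 0]]
    print(parallel_max_edge(D, k=2))   # prints 9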
Designing parallel graph algorithms may not be as trivial as in this example, and
in general we need more sophisticated methods. The operation we need to perform
may depend largely on what was done before, which means significant communication
and synchronization may be needed between the workers. Inter-process communication
across the network connecting the computational nodes is costly, and we may end up
designing a parallel graph algorithm that is not efficient.
In the distributed version of our sample maximum weight edge finding algorithm,
the computational nodes of a computer network are the vertices of the graph, and
our aim is that each node in the network modeled by the graph should know the
largest edge weight of the graph in the end. We will attempt to solve this problem
using rounds for the synchronization of the nodes. Each node starts a round, performs
some function in the round, and does not start the next round until all other nodes
have also finished executing the round. This model is widely used for distributed
algorithms, as we will describe in Chap. 5, and there is no central control other
than the synchronization of the rounds. Each node starts by broadcasting the largest
weight of the edges it is incident to, to all of its neighbors, and receiving the largest
weight values from its neighbors. In the following rounds, a node broadcasts the
largest weight it has seen so far, and after a certain number of rounds the largest value
will have been propagated to all nodes of the graph, as shown in Algorithm 1.3. The
required number of rounds is the diameter of the graph, which is the maximum
distance, in number of edges on a shortest path, between any two vertices.
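The round-based operation can be simulated sequentially to see how the largest weight spreads; the following Python sketch is an assumption about the structure of Algorithm 1.3, not a transcription of it.

def flood_max_weight(adj, weight, diameter):
    # adj[u] lists the neighbors of node u; weight[(u, v)] is the weight of edge (u, v).
    # Each node starts with the largest weight of its incident edges. In every round
    # it sends its current best value to all neighbors and keeps the maximum of what
    # it receives. After 'diameter' rounds every node holds the global maximum.
    best = {u: max(weight[(u, v)] for v in adj[u]) for u in adj}
    for _ in range(diameter):
        msgs = {u: [best[v] for v in adj[u]] for u in adj}   # all messages of the round
        for u in adj:
            best[u] = max(best[u], max(msgs[u]))             # receive and update
    return best

# Tiny example: a path a-b-c with edge weights 3 and 8; the diameter is 2
adj = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
w = {('a', 'b'): 3, ('b', 'a'): 3, ('b', 'c'): 8, ('c', 'b'): 8}
print(flood_max_weight(adj, w, diameter=2))   # every node ends with 8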
We now can see fundamental differences between parallel and distributed graph
algorithms using this example as follows.
• Parallel graph algorithms are needed mainly for the speedup they provide. There
are a number of processing elements that work in parallel which cooperate to
finish an overall task. The main relation between the number of processes and
the size of the graph is that we would prefer to use more processes for large
graphs. We assume in general that each processing element can communicate with
every other, although there are special parallel computing architectures in which
processors communicate over a specific topology, such as the hypercube.
• In distributed graph algorithms, computational nodes are the vertices of the graph
under consideration and communicate only with their neighbors to solve a problem
related to the network represented by the graph. Note that the number of processes
equals the number of vertices of the graph for these algorithms.
One important goal of this book is to provide a unified view of graph algorithms from
these three different angles. There are cases where we may want to solve a network
problem in a parallel processing environment; for example, all shortest paths between
any two nodes in the network may need to be stored in a central server to be transferred
to individual nodes or used for statistical purposes. In this case, we run a parallel
algorithm for the network using a number of processing elements. In a network setting,
however, we need each node itself to work out the shortest paths from it to all other
nodes.
A general approach is to derive parallel and distributed graph algorithms from a
sequential one, but for some problems there are also ways of converting a parallel
graph algorithm to a distributed one or vice versa. For the example problem we have,
we can assign each row of the distance matrix D to a single process. This way,
each process can be represented by a network node, provided that it communicates
with its neighbors only. Conversions such as these are useful in many cases since we
do not need to design a new algorithm from scratch.
Technological advancements in the last few decades have resulted in the avail-
ability of data of very large networks. These networks are commonly called complex
networks and consist of tens of thousands of nodes and hundreds of thousands of links
between the nodes. One such type of network is the biological networks within the
cells of living organisms. A protein–protein interaction (PPI) network is a biological
network formed by interacting proteins outside the nucleus in the cell.
A social network consisting of individuals interacting over the Internet may again
be a very large network. These complex networks can be modeled by graphs, with
vertices representing the nodes and edges the interactions between the nodes, like any
other network. However, these networks differ from small networks modeled by
graphs in a few respects. First of all, they have very small diameters, meaning the
shortest distance between any two vertices is small compared to their sizes.
For example, various PPI networks consisting of thousands of nodes are found to
have a diameter of only a few units. Similarly, social networks and technological
networks such as the Internet also have small diameters. This is known as the
small-world property. Second, empirical studies suggest these networks have very
few nodes with a very high number of connections, while most of the other nodes have
few connections to neighbors. This so-called scale-free property is again exhibited by
most complex networks. Lastly, the sheer size of these networks requires
efficient algorithms for their analysis. In summary, we need efficient and possibly
parallel algorithms that exploit various properties such as the small-world and
scale-free features of these networks.
A fundamental challenge is that polynomial-time algorithms are not known for the
majority of problems related to graphs. The algorithms at hand typically have
exponential time complexities, which means that even for moderate-size graphs the
execution times are significant. For example, assume an algorithm A to solve a graph
problem P has time complexity 2^n, n being the number of vertices in the graph. We
can see that A may have poor performance even for graphs with n > 20 vertices. We
then have the following choices: search for an approximation algorithm with a proven
approximation ratio, or resort to heuristics that are shown experimentally to work well
for most inputs.
There are other methods, such as backtracking and branch-and-bound, which search
only a subset of the search space and therefore have lower time complexities.
However, these approaches can be applied to only a subset of problems and are not
general. Let us illustrate these concepts with an example. A clique in a graph is a
subgraph such that each vertex in this subgraph has connections to all other vertices
in the subgraph, as shown in Fig. 1.3. Finding cliques in a graph has many implications
as cliques exhibit dense regions of activity. Finding the largest clique of a graph G with
n vertices, which is the clique with the maximum number of vertices in the graph,
cannot be performed in polynomial time unless P = NP. A brute force algorithm, which
is typically the first algorithm that comes to mind, will enumerate all 2^n subgraphs of G
and check the clique condition from the largest to the smallest. Instead of searching for
an approximation algorithm, we could do the following by intuition: start with the
vertex that has the highest number of connections, called its degree; check whether all
of its neighbors have the same number of connections, and if they all do, then we have a
clique. If this fails, continue with the next highest degree vertex. This heuristic may
work fine, but in general we need to show experimentally that a heuristic works for
most input variations, for example for 90% of them; an algorithm that works well for
only 60% of diverse inputs would not be a favorable heuristic.
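A common greedy heuristic in this spirit, though not exactly the intuitive rule described above, is to scan the vertices in nonincreasing degree order and keep a vertex only if it is adjacent to every vertex kept so far; the Python sketch below is illustrative and returns some clique, not necessarily a maximum one.

def greedy_clique(adj):
    # Process vertices by nonincreasing degree; add a vertex to the current
    # clique only if it is adjacent to all vertices already chosen.
    order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    clique = []
    for v in order:
        if all(u in adj[v] for u in clique):
            clique.append(v)
    return clique

# Example: a triangle {a, b, c} plus a pendant vertex d attached to c
adj = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'c'}}
print(greedy_clique(adj))   # ['c', 'a', 'b']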
• Performance: Even with polynomial time graph algorithms, the size of the graph
may restrict their use on large graphs. Recent interest in large graphs representing
large real-life networks demands high-performance algorithms which are com-
monly realized by parallel computing. Biological networks and social networks
are examples of such networks. Therefore, there is a need for efficient parallel
algorithms for these large graphs. However, some graph problems are difficult to
parallelize due to the structure of the procedures used.
• Distribution: Several large real networks are distributed in the sense that each node of
the network is an autonomous computing element. The Internet, the Web, mobile
ad hoc networks, and wireless sensor networks are examples of such networks
which can be termed as computer networks in general. These networks can again
be modeled conveniently by graphs. However, the nodes of the network now
actively participate in the execution of the graph algorithm. This type of algorithms
is termed distributed algorithms.
The main goal of this book is the study of graph algorithms from three angles:
sequential, parallel, and distributed algorithms. We think this approach will provide
a better understanding of the problem at hand and its solution by also showing its
possible application areas. We will be as comprehensive as possible in the study of
sequential graph algorithms but will only present representative graph algorithms for
parallel and distributed cases. We will see some graph problems have complicated
parallel algorithmic solutions reported in research studies and we will provide a
contemporary research survey of the topics in these cases.
• Fundamentals: This part has four chapters; the first chapter contains a dense review
of basic graph theory concepts. Some of these concepts are detailed in individual
chapters. We then describe sequential, parallel, and distributed graph algorithms
in sequence in three chapters. In each chapter, we first provide the main concepts
about the algorithm method and then provide a number of examples on graphs
using the method mentioned. For example, in the sequential algorithm methods,
we give a greedy graph algorithm while describing greedy algorithms. This part
basically forms the background for Parts II and III.
• Basic Graph Algorithms: This part contains the core material of the book. We look
at the main topics in graph theory in each chapter, which are trees and graph tra-
versals; weighted graphs; connectivity; matching; subgraphs; and coloring. Here,
we leave out some theoretical topics of graph theory which do not have signifi-
cant algorithmic content. The topics we investigate in the book lend themselves to algorithmic meth-
ods conveniently and we start each chapter with brief theoretical background for
algorithmic analysis. In other words, our treatment of related graph theoretical
concepts is not comprehensive as our main goal is the study of graph algorithms
rather than graph theory on its own. In each chapter, we first describe sequential
algorithms and this part is one place in the book that we try to be as compre-
hensive as possible by describing most of the well-established algorithms of the
topic. We then provide only sample parallel and distributed algorithms on the
topic investigated. These are typically one or two well-known algorithms rather
than a comprehensive list. In some cases, the parallel or distributed algorithms at
hand are complicated. For such problems, we give a survey of algorithms with
short descriptions.
• Advanced Topics: In this part of the book, we present more recent and advanced
topics in graph algorithms than those of Part II, starting with algebraic and dynamic
graph algorithms. Algebraic graph algorithms commonly make use of the matrices
associated with a graph and operations on them while solving a graph problem.
Dynamic graphs represent real networks where edges are inserted and removed
from a graph over time. Algorithms for such graphs, called dynamic graph algo-
rithms, aim to provide solutions in a shorter time than running the static algorithm
from scratch.
Large graphs representing real-life networks such as biological and social net-
works tend to have interesting and unexpected properties, as we have outlined.
Study of such graphs has recently become a major research direction in network
science. We therefore considered it appropriate to dedicate two chapters of the
book to this purpose. Algorithms for these large graphs have somewhat
different goals, and community detection, which is finding dense regions in these
graphs, has become one of the main topics of research. We first provide a chapter
on the general description and analysis of these large graphs along with algorithms
to compute some of their important parameters. We then review basic complex
network algorithms.
We conclude this chapter by emphasizing the main goals of the book once more.
First, it would be proper to state what this book is not. This book is not intended as
a graph theory book, or a parallel computing book or a distributed algorithms book
on graphs. We assume basic familiarity with these areas although we provide a brief
and dense review of these topics as related to graph problems in Part I. We describe
basic graph theory including the notation and basic theorems related to the topic at
the beginning of each chapter. Our emphasis is again on graph theory that is related
to the graph algorithm we intend to review. We try to be as comprehensive as possible
in the analysis of sequential graph algorithms but we review only exemplary parallel
and distributed graph algorithms. Our main focus is guiding the reader through graph
algorithms by investigating and studying the same problem from three different
views: a thorough sequential treatment, and typical parallel and distributed algorithmic
approaches. Such an approach is effective and beneficial not only because it helps to
understand the problem at hand better, but also because it is often possible to convert
from one approach to another, saving a significant amount of time compared to
designing a completely new algorithm.
Part I
Fundamentals
2 Introduction to Graphs
Abstract
Graphs are used to model many applications with vertices of a graph representing
the objects or nodes and the edges showing the connections between the nodes.
We review notations used for graphs, basic definitions, vertex degrees, subgraphs,
graph isomorphism, graph operations, directed graphs, distance, graph represen-
tations, and matrices related to graphs in this chapter.
2.1 Introduction
A graph is a set of points and a set of lines in a plane or a 3-D space. A graph can be
formally defined as follows.

Definition 2.1 (graph) A graph G = (V, E) consists of a set of vertices V and a set
of edges E, where each edge is associated with a pair of vertices.

The vertex set consists of vertices, also called nodes, and an edge in the edge set
is incident between two vertices called its endpoints. The vertex set of a graph G
is shown as V (G) and the edge set as E(G). We will use V for V (G) and E for
E(G) when the graph under consideration is known. A trivial graph has one vertex
and no edges. A null graph has an empty vertex set and an empty edge set. A graph
is called finite if both V (G) and E(G) are finite. We will consider only simple and
finite graphs in this book, unless stated otherwise. The number of vertices of a graph
G is called its order and we will use the literal n for this parameter. The number of
edges of G is called its size and we will show this parameter by the literal m. An
edge of a graph G between its vertices u and v is commonly shown as (u, v), uv or
sometimes {u, v}; we will adopt the first one. The vertices at the ends of an edge
are called its endpoints or end vertices or simply ends. For an edge (u, v) between
vertices u and v, we say u and v are incident to the edge (u, v).
Definition 2.2 (self-loop, multiple edge) A self-loop is an edge with the same end-
points. Multiple edges have the same pair of endpoints.
An edge that is not a self-loop is called a proper edge. A simple graph does not
have any self-loops or multiple edges. A graph containing multiple edges is called a
multigraph. An underlying graph of a multigraph is obtained by substituting a single
edge for each multiple edge. An example multigraph is depicted in Fig. 2.1.
Informally, we have the same vertex set in the complement of a graph G but only
have edges that do not exist in G. Complements of two graphs are shown in Fig. 2.2.
Definition 2.4 (degree of a vertex) The sum of the number of proper edges and twice
the number of self-loops incident on a vertex v of a graph G is called its degree and
is shown by deg(v).
A vertex that has a degree of 0 is called an isolated vertex and a vertex of degree
1 is called a pendant vertex. The minimum degree of a graph G is denoted by δ(G)
and the maximum degree by Δ(G). The following relation between the degree of a
vertex v in G and these parameters holds:

    δ(G) ≤ deg(v) ≤ Δ(G)                         (2.1)
Since the maximum number of edges in a simple undirected graph is n(n − 1)/2, for
any such graph,

    0 ≤ m ≤ n(n − 1)/2 = C(n, 2)

We can, therefore, conclude that there are at most 2^C(n,2) possible simple undirected
graphs having n vertices. The first theorem of graph theory, which is commonly
referred to as the handshaking lemma, is as follows.
Theorem 2.1 (Euler) The sum of the degrees of a simple undirected graph G =
(V, E) is twice the number of its edges, as shown below:

    Σ_{v∈V} deg(v) = 2m                          (2.2)
Proof is trivial as each edge is counted twice to find the sum. A vertex in a graph
with n vertices can have a maximum degree of n − 1. Hence, the sum of the degrees
in a complete graph, where every vertex is connected to all others, is n(n − 1), and
the total number of edges is n(n − 1)/2 in such a graph. In a meeting of n people, if
everyone shook hands with everyone else, the total number of handshakes would be
n(n − 1)/2, hence the name of the lemma. The average degree of a graph is

    (Σ_{v∈V} deg(v)) / n = 2m/n                  (2.3)
A vertex is called odd or even depending on whether its degree is odd or even. A
direct consequence of Theorem 2.1 is that every graph has an even number of
odd-degree vertices, which can be shown as follows.

Proof The vertices of a graph G = (V, E) may be divided into the even-degree (v_e)
and odd-degree (v_o) vertices. The sum of degrees can then be stated as

    Σ_{v∈V} deg(v) = Σ_{v_e∈V} deg(v_e) + Σ_{v_o∈V} deg(v_o)

Since the sum is even by Theorem 2.1, the sum over the odd-degree vertices should
also be even, which means there must be an even number of odd-degree vertices.
Theorem 2.2 Every graph with at least two vertices has at least two vertices that
have the same degree.
Proof We will prove this theorem using contradiction. Suppose there is no such
graph. For a graph with n vertices, this implies the vertex degrees are unique, ranging
from 0 to n − 1. However, we cannot have a vertex u with degree n − 1 and a vertex v
with degree 0 in the same graph G, as the former implies u is connected to all other
vertices in G, including v, which is a contradiction.
This theorem can be put into practice in a gathering of people where some know
each other and the rest are not acquainted. If persons are represented by the vertices of
a graph and an edge joins two individuals who know each other, we can say there are
at least two persons that have the same number of acquaintances in the meeting.
Definition 2.5 (degree sequence) The degree sequence of a graph G is the list of
the degrees of its vertices in nondecreasing or nonincreasing, more commonly in
nonincreasing order. The degree sequence of a digraph is the list consisting of its
in-degree, out-degree pairs.
The degree sequence of the graph in Fig. 2.1 is {2, 3, 3, 4} for vertices a, d, c,
b in sequence. Given a degree sequence D = (d1 , d2 , . . . , dn ), which consists of
a finite set of nonnegative integers, D is called graphical if it represents a degree
sequence of some graph G. We may need to check whether a given degree sequence
is graphical.
The condition deg(v) ≤ n − 1, ∀v ∈ V, is the first necessary condition, and
Σ_{v∈V} deg(v) should also be even. However, these conditions are necessary but not
sufficient, and an efficient method is given by the theorem first proved by Havel [7]
and later by Hakimi using a more complicated method [5].
The theorem states that a nonincreasing sequence D = (d1, d2, . . . , dn) is graphical
if and only if the sequence obtained by deleting d1 and subtracting 1 from the next d1
elements of D is graphical. This means that if we come across a degree sequence
which is graphical during this process, the initial degree sequence is graphical. Let us
see the application of this theorem to a degree sequence by analyzing the graph of
Fig. 2.3a.
The degree sequence for this graph is {4, 3, 3, 3, 2, 1}. We can now iterate as
follows, starting with the initial sequence. Deleting 4 and subtracting 1 from the first
4 of the remaining elements gives

{2, 2, 2, 1, 1}

and repeating the same step on this sequence gives {1, 1, 1, 1}.
Fig. 2.3 a A sample graph to implement the Havel–Hakimi theorem b A graph representing the
graphical sequence {1, 1, 1, 1} c A graph representing the graphical sequence {0, 1, 1}
The last sequence is graphical since it can be realized as shown in Fig. 2.3b. This
theorem can be conveniently implemented using a recursive algorithm due to its
recursive structure.
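A recursive Havel–Hakimi test might look like the following Python sketch; the function name is an illustrative assumption.

def is_graphical(seq):
    # Havel-Hakimi test: decide whether a nonnegative integer sequence is the
    # degree sequence of some simple graph.
    seq = sorted(seq, reverse=True)
    if not seq or seq[0] == 0:          # an all-zero (or empty) sequence is graphical
        return all(d == 0 for d in seq)
    d, rest = seq[0], seq[1:]
    if d > len(rest):                   # not enough other vertices to attach to
        return False
    reduced = [x - 1 for x in rest[:d]] + rest[d:]   # delete d, subtract 1 from next d entries
    if min(reduced) < 0:
        return False
    return is_graphical(reduced)

print(is_graphical([4, 3, 3, 3, 2, 1]))   # True, the sequence of Fig. 2.3a
print(is_graphical([3, 3, 1, 1]))         # False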
2.2.2 Subgraphs
In many cases, we would be interested in part of a graph rather than the graph as a
whole. A subgraph G' of a graph G has a subset of the vertices of G and a subset of its
edges. We may need to search for a subgraph of a graph that meets some condition;
for example, our aim may be to find dense subgraphs, which may indicate an increased
relatedness or activity in that part of the network represented by the graph.
Fig. 2.5 Obtaining a regular graph a The graph b The first iteration c The 3-regular graph obtained
in the second iteration, shown by dashed lines
The subgraph G − e is obtained by deleting the edge e from G. The subgraph of G
induced by the vertex set V' is shown by G[V']. The subgraph G[V \ V'] is denoted by
G − V'.
Vertices in a regular graph all have the same degree. For a graph G, we can obtain
a regular graph H which contains G as an induced subgraph. We simply duplicate
G next to itself and join each corresponding pair of vertices by an edge if the vertex
does not have a degree of Δ(G), as shown in Fig. 2.5. If the new graph is not
Δ(G)-regular, we continue this process of duplication until a regular graph is
obtained. This result is due to König, who stated that for every graph G of maximum
degree r, there exists an r-regular graph that contains G as an induced subgraph.
2.2.3 Isomorphism
Two graphs are isomorphic if there is a one-to-one correspondence between their
vertex sets that preserves adjacency; under such a mapping, corresponding vertices
have the same degrees. Thus, we can say that the number of vertices, the number of
edges, and the degree sequences are isomorphism invariants, that is, they do not
change in isomorphic graphs.
We may need to generate new graphs from a set of input graphs by using certain
operations. These operations include the union, intersection, join, and product of two
graphs, as described below.
Definition 2.8 (union and intersection of two graphs) The union of two graphs
G1 = (V1, E1) and G2 = (V2, E2) is a graph G3 = (V3, E3) in which V3 = V1 ∪ V2
and E3 = E1 ∪ E2. This operation is shown as G3 = G1 ∪ G2. The intersection of
two graphs G1 = (V1, E1) and G2 = (V2, E2) is a graph G3 = (V3, E3) in which
V3 = V1 ∩ V2 and E3 = E1 ∩ E2. This is shown as G3 = G1 ∩ G2.

Definition 2.9 (join of two graphs) The join of two graphs G1 = (V1, E1) and
G2 = (V2, E2) is a graph G3 = (V3, E3) in which V3 = V1 ∪ V2 and E3 =
E1 ∪ E2 ∪ {(u, v) : u ∈ V1 and v ∈ V2}. This operation is shown as G3 = G1 ∨ G2.

The join operation of two graphs creates new edges between each pair of vertices,
one from each of the two graphs. Figure 2.8 displays the join of two graphs. All of the
union, intersection, and join operations are commutative, that is, G1 ∪ G2 = G2 ∪ G1,
G1 ∩ G2 = G2 ∩ G1, and G1 ∨ G2 = G2 ∨ G1.
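A minimal sketch of these operations on graphs stored as (vertex set, edge set) pairs follows; the representation and function names are assumptions made for illustration.

def union(G1, G2):
    (V1, E1), (V2, E2) = G1, G2
    return V1 | V2, E1 | E2

def intersection(G1, G2):
    (V1, E1), (V2, E2) = G1, G2
    return V1 & V2, E1 & E2

def join(G1, G2):
    # Join: the union plus an edge between every vertex of G1 and every vertex of G2.
    (V1, E1), (V2, E2) = G1, G2
    cross = {(u, v) for u in V1 for v in V2}
    return V1 | V2, E1 | E2 | cross

G1 = ({'a', 'b'}, {('a', 'b')})
G2 = ({'b', 'c'}, {('b', 'c')})
print(union(G1, G2))         # three vertices, two edges
print(intersection(G1, G2))  # the single shared vertex b, no edges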
Fig. 2.7 Union and intersection of two graphs. The graph in c is the union of the graphs in a and
b, and the graph in d is their intersection
Definition 2.10 (cartesian product) The cartesian product, or simply the product, of
two graphs G1 = (V1, E1) and G2 = (V2, E2), shown by G1 □ G2 or G1 × G2, is a
graph G3 = (V3, E3) in which V3 = V1 × V2, and an edge ((u_i, v_j), (u_p, v_q)) is in
G1 × G2 if one of the following conditions holds:

1. i = p and (v_j, v_q) ∈ E2
2. j = q and (u_i, u_p) ∈ E1.
Fig. 2.9 a The product K2 × P4 b The hypercube of dimension 3
Informally, the vertices we have in the product are the cartesian product of the vertex
sets of the graphs, and hence each represents two vertices, one from each graph.
Figure 2.9a displays the product of the complete graph K2 and the path graph with 4
vertices, P4. The graph product is useful in various cases; for example, the hypercube
of dimension n, Qn, is a special graph that is the graph product of K2 by itself n times.
It can be described recursively as Qn = K2 × Qn−1. A hypercube of dimension 3 is
depicted in Fig. 2.9b.
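The product and the recursive construction of the hypercube can be sketched directly from Definition 2.10; the Python code below is illustrative and counts the vertices and edges of Q3 as a check.

def cartesian_product(G1, G2):
    # Vertices are pairs; (u1, v1)-(u2, v2) is an edge iff the first coordinates are
    # equal and the second coordinates are adjacent, or vice versa.
    (V1, E1), (V2, E2) = G1, G2
    V = {(u, v) for u in V1 for v in V2}
    E = {((u, v1), (u, v2)) for u in V1 for (v1, v2) in E2}
    E |= {((u1, v), (u2, v)) for (u1, u2) in E1 for v in V2}
    return V, E

def hypercube(n):
    # Q_n = K_2 x Q_{n-1}, with Q_1 = K_2
    K2 = ({0, 1}, {(0, 1)})
    Q = K2
    for _ in range(n - 1):
        Q = cartesian_product(K2, Q)
    return Q

V, E = hypercube(3)
print(len(V), len(E))   # 8 vertices, 12 edges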
We review the main types of graphs that have various applications in this section.
Definition 2.11 (complete graph) In a complete simple graph G(V, E), each vertex
v ∈ V is connected to all other vertices in V .
Fig. 2.10 The complete graphs K1, K2, K3, K4, and K5
A directed edge, or an arc, has an orientation from its tail endpoint to its head
endpoint, shown by an arrow. Directed graphs consist of directed edges.
The sum of the in-degrees of the vertices in a digraph is equal to the sum of the
out-degrees, which are both equal to the number of edges. A directed
graph that has no cycles is called a directed acyclic graph (DAG).
We have considered unweighted graphs up to this point. Weighted graphs have edges
and vertices labeled with real numbers representing weights.
Weighted graphs find many real applications; for example, the weight of an edge
(u, v) may represent the cost of moving from u to v, as in a roadway, or the cost of
sending a message between two routers u and v in a computer network. The weight of
a vertex v may be associated with a capacity stored at v, which may be used to
represent a property such as the storage volume of a router in a computer network.
Fig. 2.13 a A 0-regular graph, b A 1-regular graph, c 2-regular graphs, d A 3-regular graph
In a regular graph, each vertex has the same degree. Each vertex of a k-regular graph
has a degree of k. Every complete graph with k vertices is a (k − 1)-regular graph, but
the converse does not hold. For example, a d-dimensional hypercube is a d-regular
graph but it is not a complete graph. Examples of regular graphs are shown in
Fig. 2.13. Any single n-cycle graph is a 2-regular graph. Any regular graph with
odd-degree vertices must have an even number of such vertices so that the sum of the
degrees is even.
We need a few definitions to specify traversing the edges and vertices of a graph.

Definition 2.16 (walk) A walk W of a graph G is an alternating sequence of vertices
and edges v0, e1, v1, e2, . . . , en, vn in which each edge ei joins the vertices vi−1 and vi.
The vertex v0 is called the initial vertex and vn is called the terminating vertex of the
walk W.
In a directed graph, a directed walk can be defined similarly. The length of a walk
W is the number of edges (arcs in digraphs) included in it. A walk can have repeated
edges and vertices. A walk is closed if it starts and ends at the same vertex and open
otherwise. A walk is shown in Fig. 2.15.
A graph is connected if there is a walk between any pair of its vertices. Connectivity
is an important concept that finds many applications in computer networks and we
will review algorithms for connectivity in Chap. 8.
Definition 2.17 (trail) A trail is a walk that does not have any repeated edges.
Definition 2.18 (path) A path is a trail that does not have any repeated vertices with
the exception of initial and terminal vertices.
In other words, a path does not contain any repeated edges or vertices. Paths are
shown by the vertices only. For example, (i, a, b, h, g) is a path in Fig. 2.15.
Definition 2.19 (cycle) A closed path which starts and ends at the same vertex is
called a cycle.
The length of a cycle can be an odd integer in which case it is called an odd cycle.
Otherwise, it is called an even cycle.
Definition 2.20 (circuit) A closed trail which starts and ends at the same vertex is
called a circuit.
A trail in Fig. 2.15 is (h, e9, b, e6, f, e5, e). When e11 and h are added to this trail,
it becomes a cycle. An Eulerian tour is a closed Eulerian trail, and an Eulerian graph
is a graph that has an Eulerian tour. The number of edges contained in a cycle is
called its length l, and the cycle is shown as Cl. For example, C3 is a triangle.
Definition 2.21 (Hamiltonian Cycle) A cycle that includes all of the vertices in
a graph is called a Hamiltonian cycle and such a graph is called Hamiltonian. A
Hamiltonian path of a graph G passes through every vertex of G.
Definition 2.22 (distance) The distance d(u, v) between two vertices u and v in a
(directed) graph G is the length of the shortest path between them.
In a directed graph, the distance from u to v is the length of the shortest directed path
from u to v. In an undirected simple (weighted) graph G(V, E, w), the following can
be stated:

1. d(u, v) = d(v, u)
2. d(u, v) ≤ d(u, w) + d(w, v), ∀w ∈ V
The eccentricity of a vertex v in a connected graph G is its maximum distance to any
other vertex of G. The maximum eccentricity over all vertices is called the diameter,
and the minimum value of this parameter is called the radius of the graph. A vertex v
with minimum eccentricity in a connected graph G is called a central vertex of G.
Finding a central vertex of a graph has practical implications; for example, we may
want to place a resource center at a central location in a geographical area where cities
are represented by the vertices of a graph and the roads by its edges. There may be
more than one central vertex.
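For an unweighted graph, the eccentricities, diameter, radius, and a central vertex can be computed with breadth-first searches from every vertex; the Python sketch below assumes a connected graph stored as an adjacency dictionary and is only illustrative.

from collections import deque

def bfs_distances(adj, s):
    # Hop distances from s in an unweighted graph.
    dist = {s: 0}
    q = deque([s])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def eccentricities(adj):
    # Eccentricity of u: its maximum distance to any other vertex.
    return {u: max(bfs_distances(adj, u).values()) for u in adj}

adj = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}   # a path on 4 vertices
ecc = eccentricities(adj)
print(max(ecc.values()))       # diameter: 3
print(min(ecc.values()))       # radius: 2
print(min(ecc, key=ecc.get))   # a central vertex: 'b' (or 'c')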
An adjacency list of a simple graph (or a digraph) is an array of lists, with each list
representing a vertex and its (out)neighbors in a linked list. The end of each list is
marked by a NULL pointer. The adjacency list of a graph is depicted in Fig. 2.16c.
The adjacency list requires O(n + m) space. For sparse graphs, the adjacency list
Fig. 2.16 a A digraph, b Its adjacency matrix, c Its adjacency list. The end of the list entries are
marked by a backslash

Adjacency matrix (b):
        1  2  3  4  5
    1   0  0  1  1  1
    2   1  0  1  0  0
    3   0  0  0  1  0
    4   0  0  0  0  0
    5   0  0  1  1  0

Adjacency list (c):
    1: 3, 5, 4
    2: 1, 3
    3: 4
    4: (empty)
    5: 4, 3
is preferred due to the dependence of its space on the number of vertices and edges.
For dense graphs, the adjacency matrix, which requires O(n²) space, is commonly
used since checking the existence of an edge in this matrix can be done in constant
time. With the adjacency list of a graph, the time required for the same operation is O(n).
In a digraph G = (V, E), the predecessor list Pv ⊆ V of a vertex v is defined as
follows.
Pv = {u ∈ V : (u, v) ∈ E}
and the successor list of v, Sv ⊆ V is,
Sv = {u ∈ V : (v, u) ∈ E}
The predecessor and successor lists of the vertices of the graph of Fig. 2.16a are
listed in Table 2.2.
In the edge list representation of a graph, all of its edges are included in the list.
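The three representations can be related with a few lines of Python; the edge list below is read off the digraph of Fig. 2.16a, and building the successor and predecessor lists from it is the illustrative part.

edges = [(1, 3), (1, 4), (1, 5), (2, 1), (2, 3), (3, 4), (5, 3), (5, 4)]   # edge list
n = 5

# adjacency matrix: entry A[u][v] = 1 if the arc (u, v) exists
A = [[0] * (n + 1) for _ in range(n + 1)]
for u, v in edges:
    A[u][v] = 1

# successor (out-neighbor) and predecessor (in-neighbor) lists
succ = {v: [] for v in range(1, n + 1)}
pred = {v: [] for v in range(1, n + 1)}
for u, v in edges:
    succ[u].append(v)
    pred[v].append(u)

print(succ[1])   # [3, 4, 5]
print(pred[4])   # [1, 3, 5]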
2.7 Trees
A graph is called a tree if it is connected and does not contain any cycles. The
following statements equally define a tree T with n vertices:

• T is connected and has no cycles.
• T is connected and has n − 1 edges.
• There is exactly one path between any pair of vertices of T.
In a rooted tree T , there is a special vertex r called the root and every other vertex
of T has a directed path to r ; the tree is unrooted otherwise. A rooted tree is depicted
in Fig. 2.17. A spanning tree of a graph G is a tree that covers all vertices of G.
The spectral analysis of graphs involves operations on the matrices related to graphs.

2.8.1 Eigenvalues

For an n × n matrix A and a nonzero vector x, consider the equation

    Ax = λx                                      (2.4)

If we can find values of x and λ for which this equation holds, λ is called an eigenvalue
of A and x an eigenvector of A. There will in general be a number of eigenvalues and
a set of eigenvectors corresponding to these eigenvalues. Rewriting Eq. 2.4,

    Ax − λx = 0                                  (2.5)
    (A − λI)x = 0

which has a nonzero solution x only when

    det(A − λI) = 0                              (2.6)

For an n × n matrix A, det(A − λI) is called the characteristic polynomial, which has
a degree of n and therefore has n roots. Solving this polynomial provides us with the
eigenvalues, and substituting these in Eq. 2.4 provides the eigenvectors of matrix A.
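As a concrete computation, NumPy's eigenvalue routine can be applied to a small adjacency matrix; the triangle below is an illustrative example, not one taken from the text.

import numpy as np

# adjacency matrix of a triangle (the 3-cycle C3)
A = np.array([[0, 1, 1],
              [1, 0, 1],
              [1, 1, 0]], dtype=float)

eigvals, eigvecs = np.linalg.eig(A)        # roots of det(A - lambda*I) = 0
print(np.round(np.sort(eigvals.real), 3))  # [-1. -1.  2.]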
Definition 2.24 (degree matrix) The degree matrix D of a graph G is the diagonal
matrix with elements d1 , . . . , dn where di is the degree of vertex i.
The entries of the normalized Laplacian matrix L can then be specified as below:

    L_ij = 1                      if i = j
    L_ij = −1 / √(d_i d_j)        if i ≠ j and i and j are neighbors
    L_ij = 0                      otherwise.
The Laplacian matrix and the adjacency matrix of a graph G are commonly used to
analyze the spectral properties of G and to design algebraic graph algorithms that
solve various graph problems.
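A short sketch of how the degree matrix, the Laplacian L = D − A, and the normalized Laplacian are formed from an adjacency matrix is given below; the example graph is illustrative and assumed to have no isolated vertices so that all degrees are nonzero.

import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

deg = A.sum(axis=1)                  # vertex degrees
D = np.diag(deg)                     # degree matrix
L = D - A                            # Laplacian matrix

D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
L_norm = D_inv_sqrt @ L @ D_inv_sqrt     # 1 on the diagonal, -1/sqrt(di*dj) for neighbors

print(np.round(np.linalg.eigvalsh(L), 3))   # the smallest Laplacian eigenvalue is always 0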
We have reviewed the basic concepts in graph theory leaving the study of some of
the related background including trees, connectivity, matching, network flows and
coloring to Part II when we discuss algorithms for these problems. The main graph
theory background is presented in a number of books including books by Harary
[6], Bondy and Murty [1], and West [8]. Graph theory with applications is studied
in a book edited by Gross et al. [4]. Algorithmic graph theory focuses more on
the algorithmic aspects of graph theory, and books available on this topic include
those by Gibbons [2] and Golumbic [3].
Exercises

1. Show that the relation between the size and the order of a simple graph is given
   by m ≤ n(n − 1)/2 and decide when equality holds.
2. Find the order of a 4-regular graph that has a size of 28.
3. Show that the minimum and maximum degrees of a graph G are related by the
   inequality δ(G) ≤ 2m/n ≤ Δ(G).
4. Show that for a regular bipartite graph G = (V1, V2, E), |V1| = |V2|.
5. Let G = (V1, V2, E) be a bipartite graph with vertex partitions V1 and V2. Show
   that

       Σ_{u∈V1} deg(u) = Σ_{v∈V2} deg(v)
References
1. Bondy AB, Murty USR (2008) Graph theory. Graduate texts in mathematics. Springer, Berlin.
1st Corrected edition 2008. 3rd printing 2008 edition (28 Aug 2008). ISBN-10: 1846289696,
ISBN-13: 978-1846289699
2. Gibbons A (1985) Algorithmic graph theory, 1st edn. Cambridge University Press, Cambridge.
ISBN-10: 0521288819, ISBN-13: 978-0521288811
3. Golumbic MC, Rheinboldt W (2004) Algorithmic graph theory and perfect graphs, vol 57,
2nd edn. Annals of discrete mathematics. North Holland, New York. ISBN-10: 0444515305,
ISBN-13: 978-0444515308
4. Gross JL, Yellen J, Zhang P (eds) (2013) Handbook of graph theory, 2nd edn. CRC Press, Boca
Raton
5. Hakimi SL (1962) On the realizability of a set of integers as degrees of the vertices of a graph.
SIAM J Appl Math 10(1962):496–506
6. Harary F (1969) Graph theory. Addison-Wesley series in mathematics. Addison-Wesley,
Reading
7. Havel V (1955) A remark on the existence of finite graphs. Casopis Pest Mat 890(1955):477–480
8. West D (2000) Introduction to graph theory, 2nd edn. PHI learning. Prentice Hall, Englewood
Cliffs
3 Graph Algorithms
Abstract
We review basic algorithm structure and provide a brief and dense review of the
main principles of sequential algorithm design and analysis with focus on graph
algorithms. We then provide a short survey of NP-completeness with example
NP-hard graph problems. Finally, we briefly review the major algorithm design
methods showing their implementations for graph problems.
3.1 Introduction
In this chapter, we review the main principles of sequential algorithm design and
analysis with a focus on graph algorithms. We first describe the basic concepts of
algorithms and then review the mathematics behind the analysis of algorithms. We then
provide a short survey of NP-completeness with a focus on NP-hard graph problems.
When we are dealing with such difficult graph problems, we may use approximation
algorithms, which provide suboptimal solutions with proven approximation ratios.
In many cases, however, our only choice is to use heuristic algorithms that work
for most input combinations, as we describe. Finally, we briefly review the major
algorithm design methods, showing their implementations for graph problems.
3.2 Basics

An algorithm is a finite sequence of instructions to solve a given problem. Informally,
an algorithm has the following basic properties:
• It accepts a set of inputs, processes these inputs, and produces some useful outputs.
• It should provide correct output. Correctness is a fundamental requirement of any
algorithm.
• It should execute a finite number of steps to produce the output. In other words,
we do not want the algorithm to run forever without producing any output. It is
possible to have algorithms running infinitely such as server programs, but these
produce outputs while running.
• An algorithm should be effective. It should perform the required task using a
minimum number of steps. It does not make sense to have an algorithm that runs
for 2 days to estimate the weather for tomorrow, since we will know what it is by
then. Given two algorithms that perform the same task, we would prefer the one
with fewer steps.
When presenting an algorithm in this book and in general, we first need to provide
a simple description of the main idea of the algorithm. We then would need to detail
its operation using pseudocode syntax which shows its running using basic structures
as described next. Pseudocode is the description of an algorithm in a more formal
and structured way than verbally describing it but less formal than a programming
language. We then typically show an example operation of the algorithm in a sample
graph. The second fundamental thing to do is to prove that the algorithm works
correctly, which is self-evident in many cases, trivial in some, and needs rigorous
proof techniques in various others as we describe in this chapter. Finally, we should
present its worst-case analysis which shows its performance as the number of steps
required in the worst case.
3.2.1 Structures
Three fundamental structures used in algorithms are assignments, decisions, and
loops. An assignment gives a value to a variable as shown below:
b ← 3
a ← b^2
Here we assign an integer value of 3 to a variable b and then assign the square of
b to a. The final value of a is 9. Decisions are the key structures in algorithms as in
daily life. We need to branch to some part of the algorithm based on some condition.
The following example uses if...then...else structure to determine the larger one of
two input numbers:
input a,b
if a > b then print a
else if a=b then print ‘‘they are equal’’
else print b
end if
• for loops: The for loops are commonly used when we know how many times the
loop should execute before we start with the loop. There is a loop variable and
test condition. The loop variable is modified at the end of the loop according to
the starting line and tested against the condition specified in the testing line. If
this condition yields a true value, the loop is entered. In the following example, i
is the loop variable and it is incremented by 1 at the end of each loop iteration and
checked against the boundary value of 10. This simple loop prints the squares of
integers between 1 and 10.
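The pseudocode listing referred to here is not reproduced in this excerpt; a minimal equivalent, written in Python for concreteness, would be:

for i in range(1, 11):
    print(i * i)    # prints the squares of the integers 1 through 10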
• while loops: In case we do not know how many times the loop will be executed,
while structure may be used. We have a test condition at the beginning of the loop
and if this succeeds, the loop is entered. The following example illustrates the use
of the while loop where the user enters Q to quit and otherwise the two numbers
given by the user are added. We do not know when the user may want to stop, so
the use of while is appropriate here. Also note that we need two input statements
for control: one outside the loop, executed once, and another one inside the
loop, to be tested iteratively since the check is at the beginning of the loop.
input chr
while chr <> ’Q’
input a,b
print a+b
input chr
end while
• repeat .. until loops: This structure is used in similar situations to while loops when
we do not know how many times the loop will be executed. The main difference is
that we do the test at the end of the loop which means this loop is executed at least
once whereas the while loop may not be executed even once. We will write the
previous example with the repeat...until (or loop...until) structure with a clearly
shorter code.
repeat
input a,b
print a+b
input chr
until chr = ’Q’
We need to assess the running time of algorithms for various input sizes to evaluate
their performances. We can assess the behavior of an algorithm experimentally but
theoretical determination of the required number of steps as a function of input size
is needed in practice. Let us illustrate these concepts by writing an algorithm that
searches for a key integer value in an integer array and returns its first occurrence as
the index of the array as shown in Algorithm 3.2. We want to find out the execution
time of this algorithm as the number of steps executed and if we can find an algorithm
that has less number of steps for the same process, we would prefer that algorithm.
In such analysis, we are not interested in the constant number of steps but rather,
we need to find the number of steps required as the size of the input grows. For
example, initializing the variable i takes constant time but it is performed only once
therefore can be neglected, and this is more meaningful when n 1. The number of
times the loop is executed is important as it affects the performance of the algorithm
significantly. However, the number of steps, say 2 or 3, inside the loop is insignificant
again since 2n or 3n is invariable when n is very large.
When we run this algorithm, it is possible that the key value is found in the first
array entry in which case the algorithm completes in one step. This will be the lowest
running time of the algorithm. In the worst case, we need to check each entry of the
array A, running the loop n times. We are mostly interested in the worst execution
time of an algorithm as this is what can be expected as the worst case.
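Algorithm 3.2 itself is not shown in this excerpt; the following short Python sketch illustrates such a linear search (the function name and the convention of returning −1 when the key is absent are assumptions for illustration):

def linear_search(A, key):
    # scan the array from left to right and return the index
    # of the first occurrence of key, or -1 if key is not present
    for i in range(len(A)):
        if A[i] == key:
            return i
    return -1

In the best case the key is found at the first entry after a single comparison; in the worst case all n entries are examined.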
We need to analyze the running time and space requirement of an algorithm as
functions of the input size. Our interest is to determine the asymptotic behavior of
these functions when input size gets very large. The number of steps required to run
an algorithm is termed its time complexity. This parameter can be specified in three
ways: the best-case, average-case, and worst-case complexities described as follows,
assuming f and g are functions from N to R+ and n is the input size.
The Worst-Case Analysis
The worst running time of an algorithm is f(n) = O(g(n)) if there exist constants
c > 0 and n_0 such that ∀n ≥ n_0, f(n) ≤ c·g(n). This is also called the big-Oh notation and
states that the running time is bounded by a function g(n) multiplied by a constant
when the input size is greater than or equal to a threshold input value. There are
many O(g(n)) functions for f (n) but we search for the smallest possible value to
select a tight upper bound on f (n).
Example 3.1 Let f (n) = 5n + 3 for an algorithm, which means its running time
is this linear function of its input size n. We can have a function g(n) = n^2 and
n_0 = 6, and hence claim cg(n) ≥ f(n), ∀n ≥ n_0. Therefore, 5n + 3 ∈ O(n^2) which
means f(n) has a worst-time complexity of O(n^2). Note that any complexity greater
than n^2, for example, O(n^3), is also a valid worst-time complexity for f(n). In fact
O(n) is a tighter bound for the worst case of this algorithm as this function
approaches n in the limit when n is very large. We would normally guess this and
proceed as follows:
5n + 3 ≤ cn
(c − 5)n ≥ 3
n ≥ 3/(c − 5)
We can select c = 6 and n_0 = 4 for this inequality to hold and hence complete
the proof that 5n + 3 ∈ O(n). As another example, consider the time complexity
of 4 log n + 7. We claim this is O(log n) and need to find c and n_0 values such that
4 log n + 7 ≤ c log n for n ≥ n_0, which holds for c = 12 and n_0 = 2.
Example 3.2 Let f (n) = 3 log n+2 for an algorithm, and let us consider the function
g(n) = log n to be a lower bound on the running time of the algorithm. In this case,
we need to verify 3 log n + 2 ≥ c log n for some constant c for all n ≥ n_0 values for
a threshold n_0 value.
3 log n + 2 ≥ c log n
(3 − c) log n ≥ −2
log n ≥ −2/(3 − c)
and for n_0 = 2 and c = 2, this inequality holds for all n ≥ n_0 and hence 3 log n + 2 ∈
Ω(log n). The key point here was guessing that this function grows at least as fast as log n.
Fig. 3.1 [figure: three panels illustrating f(n) = O(g(n)), f(n) = Ω(g(n)), and f(n) = Θ(g(n)), each with a threshold value n_0]
Theta Notation
Θ(g(n)) is the set of functions that grow at the same rate as g(n) and is considered
a tight bound for f(n). These functions are in both O(g(n)) and Ω(g(n)).
Formally, f(n) ∈ Θ(g(n)) if there exist constants n_0, c_1 and c_2 such that ∀n ≥ n_0,
c_1 g(n) ≤ f(n) ≤ c_2 g(n). The relation between the growth rate of a function f(n)
and O(g(n)), Θ(g(n)), and Ω(g(n)) is depicted in Fig. 3.1.
The Average-Case Analysis
Our aim in determining the average case is to find the expected running time of the
algorithm using a randomly selected input, assuming a probability distribution over
inputs of size n. This method in general is more difficult to assess than the worst or
best cases as it requires probabilistic analysis, but it can provide more meaningful
results. Another point of concern is the memory space needed by an algorithm. This
is specified as the maximum number of bits required by the algorithm and called the
space complexity of the algorithm.
General Rules
O(1) ⊂ O(log n) ⊂ O(n) ⊂ O(n log n) ⊂ O(n^2) ⊂ O(n^3) ⊂ ... ⊂ O(n^k) ⊂ O(2^n) ⊂ O(n!) ⊂ O(n^n)
Although asymptotic analysis shows the running time of the algorithm as the size
of the input is increased to very large values, we may have to work with only small
input sizes, which means low-order terms and constants may not be ignored. Also,
given two algorithms, choice of the one with better average complexity rather than
the one with better worst-case complexity would be more sensible as this would
cover most of the cases.
We have two main methods of algorithm design and implementation: recursion and
iteration. A recursive algorithm is the one that calls itself and these algorithms are
commonly employed to break a large problem into smaller parts, solve the smaller
parts, and then combine the results as we will see in this chapter. Iterative solutions
keep repeating the same procedure until the desired condition is met. Recursive
algorithms commonly provide shorter solutions but they may be more difficult to
design and analyze. Let us consider a recursive function Power to find the nth power
of an integer x. The iterative solution would involve n times multiplication of x
by itself in a for or another type of loop. In the recursive algorithm, we have the
function calling itself with decremented values of n each time until the base case is
encountered when n = 0. The nested calls to this function start returning to the caller
after this point, each time multiplying x with the returned value. The first returned
value from the last call is 1, followed by x, x^2, x^3, ..., x^n as shown in Algorithm 3.3.
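Algorithm 3.3 is not reproduced in this excerpt; a minimal recursive sketch of the Power function in Python, following the description above (the name power is illustrative), is:

def power(x, n):
    # base case: x^0 = 1
    if n == 0:
        return 1
    # recursive case: x^n = x * x^(n-1)
    return x * power(x, n - 1)

Each call decrements n by one, so n + 1 calls are made in total before the base case returns 1.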
T(n) = c_2 + T(n − 1)
T(n) = 2c_2 + T(n − 2)
T(n) = 3c_2 + T(n − 3)
...
T(n) = kc_2 + T(n − k)
When k = n − 1 we reach the base case, so T(n) = (n − 1)c_2 + T(1), which is O(n). As
another example, consider the recurrence
T (n) = 2T (n − 1) + 1
with
T (0) = 0
and guess the solution is T (n) = 2n − 1 simply by looking at the values of this
function for the first few values of n which are 0, 1, 3, 7, 15, 31, and 63 for inputs 0,
1, 2, 3, 4, and 5. Considering the base case, we find it holds.
T(0) = 2^0 − 1 = 0
T(n) = 2T(n − 1) + 1 = 2(2^(n−1) − 1) + 1 = 2^n − 1
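As a quick sanity check of the closed form obtained above, a few lines of Python can compare the recurrence with 2^n − 1 for small n:

def T(n):
    # T(0) = 0, T(n) = 2*T(n-1) + 1
    return 0 if n == 0 else 2 * T(n - 1) + 1

assert all(T(n) == 2 ** n - 1 for n in range(11))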
For divide-and-conquer recurrences of the form T(n) = aT(n/b) + f(n), the master theorem distinguishes three cases:
1. f(n) = O(n^(log_b a − ε)) for some ε > 0: T(n) = Θ(n^(log_b a)).
2. f(n) = Θ(n^(log_b a)): T(n) = Θ(n^(log_b a) log n).
3. f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and a·f(n/b) ≤ c·f(n) for some c < 1 and
∀n > n_0: T(n) = Θ(f(n)).
Example 3.3 If a and b are two even integers, their product ab is even.
Proof We can write a = 2m and b = 2n for some integers m and n since they are
even, and therefore are divisible by 2. The product ab = 2m · 2n = 4mn = 2(2mn)
is an even number since it is divisible by 2.
The proofs may be as simple as in this example. In many cases, however, a proof
involves more sophisticated reasoning to arrive at the conclusion. Let us see
another example of a direct proof: if the sum of two integers a and b is even, then
their difference a − b is also even. Since a + b is even, we can write a + b = 2m for
some integer m, so
b = 2m − a
a − b = a − 2m + a
= 2a − 2m
= 2(a − m)
which shows that the difference is an even number and completes the proof.
3.5.1 Contraposition
In this proof method, instead of proving an implication p → q directly, we prove its
logically equivalent contrapositive ¬q → ¬p.
Example 3.5 For any integer a > 2, if a is a prime number, then a is an odd number.
Proof Let us assume the opposite of the conclusion, a is even. We can then write
a = 2n for some integer n. However, this implies a is divisible by 2, and hence, it
cannot be a prime number which contradicts the premise.
3.5.2 Contradiction
In this proof method, we assume the premise p is true and the conclusion q is not
true (p ∧ ¬q) and try to find a contradiction. This contradiction can be against what
we assumed as the hypothesis or simply something that contradicts what we know to be true,
such as 1 = 0. Finding a contradiction shows (p ∧ ¬q) is false, which means either p is false or ¬q is
false. Since we assume p is true as it is the premise, ¬q must be false, which means
q is true, and that completes the proof.
3.5.3 Induction
In induction, we are given a sequence of propositions in the form P(1), ..., P(n) and
we perform two steps:
1. Basis step: Show that P(1) is true.
2. Induction step: Assuming P(k) is true, show that P(k + 1) is also true.
If these two steps provide true results, we can conclude P(n) is true for any n. It is
one of the most commonly used methods to prove sequential, parallel, and distributed
algorithms.
Example 3.7 Let us illustrate this method by proving that the sum S of the first n
odd numbers 1 + 3 + 5 + ··· + (2n − 1) is n^2.
Proof 1. Basis step: P(1) = 1 = 1^2, so the basis step yields a true answer.
2. Induction step: Assuming P(k) = k^2, we need to show that P(k + 1) = (k + 1)^2.
Since the kth odd number is 2k − 1, the (k + 1)th odd number is 2k + 1 and the
following can be stated:
P(k + 1) = P(k) + (2k + 1) = k^2 + 2k + 1 = (k + 1)^2
A variation called strong induction is useful when P(k + 1) does not depend on P(k) but on some
smaller values of k; in the induction step we assume P(j) is true for all j ≤ k. In fact, the two induction methods are equivalent.
Example 3.8 Every integer greater than 1 is either a prime number or can be written
as the product of prime numbers.
Proof We need to consider the base case and the strong induction case. For the base
case, 2 is itself a prime. For the induction step, assume the claim holds for every
integer j with 2 ≤ j ≤ k. If k + 1 is a prime we are done; otherwise k + 1 = ab for some
integers 2 ≤ a, b ≤ k, and by the strong induction hypothesis each of a and b is a prime
or a product of primes, hence so is k + 1.
An algorithm with a loop starts by initializing some variables based on some inputs,
executes a loop, and produces some output based on the values of its variables. A
loop invariant is an assertion about the value of a variable after each iteration of a
particular loop, and the final value of this variable is used to determine the correctness
of the algorithm.
A precondition is a set of statements that are true before the algorithm executes
which is commonly represented as a set of inputs, and a postcondition is a set of
statements that remain true after the algorithm executes which are the outputs. We
use loop invariants to help us understand why an algorithm is correct. We must show
three things about a loop invariant:
1. Initialization: The loop invariant should be true before the first iteration of the
loop.
2. Maintenance: If the loop invariant is true for the nth iteration, it should be true
for (n+1)th iteration.
3. Termination: The invariant is true when the loop terminates.
The first two properties of a loop invariant assert that the invariant is true before each
loop iteration, similar to the induction method. Initialization is like the base case
of induction, and maintenance is similar to the inductive step. There are no definite
rules to choose a loop invariant and we proceed by common sense in most cases. Let
us consider the following while loop. Our aim is to prove this loop works correctly
and terminates.
a > 0;
b = 0;
while ( a != b)
b = b + 1;
We will choose a ≥ b as the loop invariant L. We need to show the three conditions:
L is true before the loop starts; if L is true before an iteration, it remains true after
the iteration; and lastly, it should establish the postcondition. We can now check
these conditions as follows and can determine that this loop works correctly and
terminates:
1. Initialization: Before the loop, b = 0 and a > 0, so a ≥ b holds.
2. Maintenance: The loop body runs only when a ≠ b; together with the invariant a ≥ b
this means a > b, so after b is incremented by 1 we still have a ≥ b.
3. Termination: b increases by 1 in every iteration while a stays fixed, so the loop exits
after a iterations when a = b; the invariant together with the exit condition establishes
the postcondition b = a.
3.6 Reductions
We may want to prove that some computational problems are difficult to solve. In
order to verify this, we need to show some problem X is at least as hard as a known
problem Y . An elegant way of proving this assertion is to reduce problem Y to X . Let
us assume we have a problem P1 that has an algorithmic solution A1 and a similar
problem P2 that does not have any solution. Similarity may imply we can borrow
some of the solutions found for P1 if we can find a reduction of problem P2 to P1 .
Let us consider the vertex cover problem. A vertex cover (VCOV) of a graph G =
(V, E) is a subset V′ of its vertices such that any edge e ∈ E is incident to at least
one vertex v ∈ V′. Informally, we try to cover all edges of G by the vertices in this
subset. The decision form of this problem (VCOV) asks: Given a graph G = (V, E)
and an integer k, does G have a vertex cover of size at most k? The optimization
VCOV problem is to find the vertex cover with the minimum number of vertices
among all vertex covers of a graph.
An independent set of a graph G is a subset V′ of its vertices such that no vertex
in V′ is adjacent to any other vertex in V′. The decision form of this problem (IND)
seeks to answer the question: Given graph G and an integer k, does G contain an
independent set of at least k vertices? The optimization IND problem is to find the
independent set with the maximum number of vertices among all independent sets
of a graph.
A related graph problem is finding the dominating set (DOM) of a graph G =
(V, E) which is a subset V′ of its vertices such that any v ∈ V \ V′ is adjacent to at
least one vertex in V′. The decision form of DOM seeks to answer the question: Given
graph G and an integer k, does G contain a dominating set of at most k vertices?
These subgraphs are displayed in a sample graph in Fig. 3.2. The optimization DOM
problem is to find the dominating set with the minimum number of vertices among
all dominating sets of a graph. Minimal or maximal versions of all of these problems
are to find vertex subsets that cannot be reduced or enlarged by removal/addition of
any other vertices. We will review these problems in more detail in Chap. 10.
Fig. 3.2 Some difficult graph problems. a A minimum vertex cover, b a maximum independent set,
and c a minimum dominating set. Note that a is also a maximum independent set and a minimum
dominating set, b is also a minimum vertex cover and a minimum dominating set but c is only a
minimum dominating set for this graph
We will show that an independent set of a graph can be reduced to a vertex cover by
first considering the theorem below: a subset S ⊆ V is an independent set of a graph
G = (V, E) if and only if V \ S is a vertex cover of G.
Figure 3.3 shows the equivalence of these two problems in a sample graph. Given
the five vertices in (a) with k = 5, we can see these form an independent set. We
now transform this input to an input of the vertex cover problem in polynomial time,
which are the white vertices in (a). We check whether these form a vertex cover in
(b) again in polynomial time and conclude they do. Our test algorithm simply marks
incident edges to black vertices in O(k) time and checks whether any edges are left
unmarked in the end. All edges are covered by these four vertices in this case. Hence,
we can deduce the five black vertices in (a) are indeed a solution to the independent
set decision problem for this graph. We have shown an example of IND ≤_P VCOV.
Fig. 3.3 A sample graph with an independent set in a and a vertex cover in b shown by black circles
in both cases. The vertex cover in b is formed by the white vertices in a
3.7 NP-Completeness
The problems we face can be classified based on the time it takes to solve them. A
polynomial function O(n^k), with n as the variable and k as the constant, is bounded
by n^k as we saw. The exponential functions refer to functions such as O(2^n) or
O(n^n) which grow very rapidly with the increased input size n. A polynomial-time
algorithm has a running time bounded by a polynomial function of its input, and an
exponential algorithm is the one which does not have a time performance bounded
by a polynomial function of n. A tractable problem is solvable by a polynomial-
time algorithm, and an intractable problem cannot be solved by a polynomial-time
algorithm. Searching for a key value in a list can be performed by a polynomial-time
algorithm as we need to check each entry of a list of size n in n time and listing all
permutations of n numbers is an example of an exponential algorithm. In fact, a
third class of problems has no known polynomial-time algorithms but is not
proven to be intractable either. When we are presented with such a difficult problem,
we can do one of the following: use an approximation algorithm with a proven
approximation ratio, use a heuristic or a randomized algorithm that works well for
most inputs, or prune the search space with methods such as backtracking and
branch and bound, as we review in Sect. 3.8.
The tractable problems have polynomial-time algorithms to solve them. The intractable
problems, on the other hand, can be further divided into two subclasses; the ones
proven to have no polynomial-time algorithms and others that have exponential time
solution algorithms. At this point, we will need to classify the problems based on
the expected output from them as optimization problems or decision problems. In an
optimization problem, we attempt to maximize or minimize a particular objective
function and a decision problem returns a yes or a no as an answer to a given input.
Let us consider the IND problem as both optimization and a decision problem. The
optimization problem asks to find the independent set of a graph G = (V, E) with
the largest order. The decision problem we saw seeks an answer to the question:
Given an integer k ≤ |V |, does G have an independent set with at least k vertices?
Dealing with decision problems is advantageous not only because these problems
are in general easier than their optimization versions but they also may provide a
transfer to the optimization problem. In the IND decision problem, we can try all
possible k values and find the largest one that provides an independent set to find a
solution to the IND optimization problem.
Complexity Class P
The first complexity class we will consider is P which contains problems that can be
solved in polynomial time.
Definition 3.2 (class P) The complexity class P refers to decision problems that can
be solved in polynomial time.
Definition 3.3 (class NP) The complexity class Nondeterministic Polynomial (NP)
is the set of decision problems that can be verified by a polynomial algorithm.
The class NP includes the class P (P ⊆ NP) since all of the problems in P have
certifiers in polynomial time, but whether P = NP has not been determined and
remains a grand challenge in Computer Science. Many problems are difficult to
solve, but an input instance can be verified in polynomial time whether it yields a
solution or not. For example, given a graph G = (V, E) and a certificate S ∈ V ,
we can check in polynomial time whether S is an independent set in G. The certifier
program is shown in Algorithm 3.4 where we simply check whether any two vertices
in S are adjacent. If any such two vertices are found, the answer is NO and the input
is rejected. The algorithm runs two nested loops in O(n 2 ) time, and hence, we have
a polynomial-time certifier which shows IND ∈ NP.
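Algorithm 3.4 is not listed in this excerpt; a minimal Python sketch of such a certifier, assuming the graph is given as a dictionary mapping each vertex to the set of its neighbors, is:

def is_independent_set(adj, S):
    # adj: adjacency sets of the graph; S: the certificate, a collection of vertices
    S = list(S)
    for i in range(len(S)):
        for j in range(i + 1, len(S)):
            if S[j] in adj[S[i]]:     # two certificate vertices are adjacent
                return False          # reject the certificate
    return True                       # no adjacent pair found: S is independent

The two nested loops over the certificate give the O(n^2) bound mentioned above.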
Fig. 3.4 [figure: the relation between the complexity classes P, NP, NP-complete, and NP-hard]
NP-Hard Problems
Definition 3.4 (class NP-Hard) A decision problem Pi is NP-hard if every problem
in NP is polynomial time reducible to Pi . In other words, if we can solve Pi in
polynomial time, we can solve all NP problems in polynomial time.
Figure 3.4 displays the relation between the complexity classes. Class P is con-
tained in class NP as every problem in P has a certifier, and NP-complete problems
are in the intersection of NP problems and NP-hard problems.
The satisfiability problem (SAT) states that given a Boolean formula, is there a way
to assign truth values to the variables in the formula such that the formula evaluates
to true? Let us consider a set of logical variables x1 , x2 , ..., xn each of which
can be true or false. A clause is formed by disjunction of logical variables such as
(x1 ∨ x2 ∨ x3 ). A CNF formula is a conjunction of the clauses as C1 ∧ C2 , ... ∧ Ck
such as below:
The CNF formula is satisfied if every clause in it yields a true value. The sat-
isfiability problem searches for an assignment to variables x1 , x2 , ..., xn such that
the CNF formula is satisfied. The 3-SAT problem requires each clause to be of length 3 over
variables x1 , x2 , ..., xn . The SAT problem has no known polynomial-time algorithm
but we cannot conclude it is intractable either. We can try all combinations of the
input in 2^n time to find the solution. On the other hand, given a candidate assignment, we can
check in polynomial time whether it satisfies the formula, and hence conclude SAT is in NP.
This problem was shown to be NP-hard and therefore to be NP-complete by Cook
in 1971 [3]. We can, therefore, use the 3-SAT problem as a basis to prove other problems
to be NP-complete. The relationships between various problems are depicted
in Fig. 3.5.
Let us show how to reduce the 3-SAT problem to IND problem. In the former, we
know that we have to set at least one term in each clause to be true and we cannot
set both xi and its complement x̄i to be true at the same time.
Fig. 3.5 [figure: the reduction relationships between the problems discussed, including IND and VCOV]
Fig. 3.6 The graph for the 3-SAT formula of Eq. 3.1. The black vertices x1, x̄3 and x̄2 represent
the independent set of this graph which is also the solution to the 3-SAT formula of Eq. 3.1 with x1 = 1,
x2 = 0 and x3 = 0 values
We first draw triangles for each clause
of 3-SAT with each vertex representing the term inside the clause. A true literal
from each clause suffices to obtain a true value for the 3-SAT formula. We add lines
between a term and its inverse as we do not want to include both in the solution. The
graph drawn this way for Eq. 3.1 is shown in Fig. 3.6.
We now claim the following.
Theorem 3.2 The 3-SAT formula F with k clauses is satisfiable if and only if the
graph formed this way has an independent set of size k.
Proof Let the graph formed this way be G = (V, E). If formula F is satisfiable,
we need to have at least one true literal from each clause. We form the vertex set
V′ ⊂ V by selecting a vertex from each triangle, and also by not selecting a variable
x and its complement x̄ at the same time since a variable and its complement cannot
be true at the same time. V′ is an independent set since there are no edges between
the vertices selected. To prove the claim in the reverse direction, let us consider G
has an independent set of size k. The set V′ cannot have two vertices from the same
triangle and it will not have a variable and its complement at the same time since it
is an independent set. Moreover, when we set the literals corresponding to the vertices in V′ to true, we
will have satisfied the SAT formula F. Transformation of the 3-SAT problem to the IND
problem can be performed in polynomial time; hence, we deduce these two problems
are equivalent, and finding a solution to one means a solution to the other one is also
discovered.
Many of the graph optimization problems are NP-hard with no known solutions in
polynomial time. However, there are methods that produce correct solutions with a
specified probability (randomized algorithms); methods whose results are close to the
exact solution within a proven margin (approximation algorithms); and methods that
eliminate some of the unwanted intermediate results to improve performance
(backtracking and branch and bound). These are the topics we review in this section.
Randomized algorithms are frequently used for some of the difficult graph problems
as they are simple and provide efficient solutions. Randomly generated numbers or
random choices are typically used in these algorithms to decide on the courses of
computation. The output from a randomized algorithm and its running time varies for
different inputs and even for the same input. Two classes of randomized algorithms
are Las Vegas and Monte Carlo algorithms. The former always returns a correct
answer but the runtime of such algorithms depends on the random choices made. A
Monte Carlo algorithm, on the other hand, runs for a fixed amount of time but its
answer may or may not be correct.
The average cost of the algorithm over all random choices gives us its expected
bounds and a randomized algorithm is commonly specified in terms of its expected
running time for all inputs. On the other hand, when we say an algorithm runs in
O(x) time with high probability, it means the runtime of this algorithm will not be
above the value of x with high probability. Randomized algorithms are commonly
used in two cases: when an initial random configuration is to be chosen and to decide
on a local solution when there are several options. The randomized choice may be
repeated with different seeds and then the best solution is returned [2].
Karger’s Minimum Cut Algorithm
We will describe how randomization helps to find the minimum cut (mincut hence-
forth) of a graph. Given a graph G = (V, E), finding a mincut of G is to partition
the vertices of the graph into two disjoint sets V1 and V2 such that the number of
edges between V1 and V2 is minimum. There is a solution to this problem using
the maximum flow as we will see in Chap. 8, here we will describe a randomized
algorithm due to Karger [4].
This simple algorithm selects an edge at random, makes a supervertex from the
endpoints of the selected edge using contraction and continues until there are exactly
two supervertices left as shown in Algorithm 3.5. The vertices in each final super-
vertex are the vertices of the partitions.
Contracting two vertices u and v is done as follows: u and v are merged into a single
supervertex; any edges between u and v are removed so that no self-loops are formed; and
every other edge incident to u or v becomes incident to the new supervertex, keeping
parallel edges.
Let us see how this algorithm works in the simple graph of Fig. 3.7. The edges
picked at random are shown inside dashed regions and the final cut consists of three
edges between V1 = {b, h, a} and V2 = {c, f, d, e, g} as shown in (h). This is not
the mincut, however; the minimum cut consists of the edges (b, c) and (g, f ) as depicted
in (i).
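A compact Python sketch of this randomized contraction process (the data structures and the rejection of edges that fall inside a supervertex are implementation choices, not the book's listing):

import random

def karger_min_cut(vertices, edges):
    # group[v] identifies the supervertex currently containing v
    group = {v: v for v in vertices}
    supervertices = set(vertices)
    while len(supervertices) > 2:
        u, v = random.choice(edges)
        gu, gv = group[u], group[v]
        if gu == gv:
            continue                  # edge lies inside a supervertex; pick again
        for w in group:               # contract: merge supervertex gv into gu
            if group[w] == gv:
                group[w] = gu
        supervertices.discard(gv)
    # the cut is formed by the edges whose endpoints are in different supervertices
    return [(u, v) for (u, v) in edges if group[u] != group[v]]

Running the function many times and keeping the smallest cut found corresponds to the repetition argument in the analysis below.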
Karger’s algorithm will find the correct minimum cut if it never selects an edge
that belongs to the minimum cut. In our example, we selected the edge (g, f ) that
belongs to the mincut in Fig. 3.7e deliberately to result in cut that is not minimum.
On the other hand, the mincut edges have the lowest probability to be selected by
this algorithm since there are fewer of them than the edges that do not belong to the
mincut. Before its analysis, let us state a few observations about the mincut of a graph.
Remark 1 The size of a mincut of a graph G is at most the minimum degree δ(G)
of G.
This is valid since the mincut is not larger than any cut of G. Therefore, δ(G) sets
an upper bound on the size of the mincut. Since we cannot determine δ(G) easily,
let us check whether an upper bound in terms of the size and order of G exists.
Theorem 3.3 The size of a mincut of a graph G with n vertices and m edges is at most 2m/n.
Proof Assume the size of the mincut is k. Then, every vertex of G must have at least
a degree of k. Therefore, by Euler theorem (handshaking lemma),
m = (1/2) Σ_{v∈V} deg(v) ≥ (1/2) Σ_{v∈V} k = nk/2    (3.2)
which means k ≤ 2m/n.
Fig. 3.7 [figure: panels (a)–(i) showing the random edge contractions of Karger's algorithm on a
sample graph with vertices a–h; (h) shows the resulting cut and (i) the actual minimum cut]
Proof There are m edges and at most 2m/n are in the mincut by Theorem 3.3, so
P(ε1 ) ≤ (2m/n)/m = 2/n.
Remark 2 Given a graph G with a mincut C, the algorithm must not select any edge
(u, v) ∈ C.
P(ε̄1) = 1 − P(ε1 ) ≥ 1 − 2/n = (n − 2)/n    (3.3)
Let us choose a minimum cut C with size k and find the probability that an edge
(u, v) ∈ C is not contracted by the algorithm which will give us the probability
that the algorithm finds the correct result. We will first evaluate the probability of
selecting an edge (u, v) ∈ C in the first round, P(ε1 ) which is k/m for a mincut with
size k. Therefore,
P(ε1 ) = k/m ≤ k/(nk/2) = 2/n    (3.4)
Let P(εC ) be the probability that the final cut obtained is minimum. This proba-
bility is the product of the probabilities that the first selected edge is
not in the mincut, that the second selected edge is not in the mincut, etc., until the
two last supervertices are formed. A contraction of an edge results in one less vertex in
the new graph. Therefore,
P(εC ) ≥ (1 − 2/n)(1 − 2/(n − 1))(1 − 2/(n − 2))(1 − 2/(n − 3)) ⋯ (1 − 2/3)    (3.5)
Since the numerators and denominators cancel in every two terms except the first two
denominators,
P(εC ) ≥ ((n − 2)/n)((n − 3)/(n − 1))((n − 4)/(n − 2)) ⋯ (1/3)
= 2/(n(n − 1))
Hence, the probability that the algorithm returns the mincut C is at least 2/(n(n − 1)).
We can therefore conclude this algorithm succeeds with probability p ≥ 2/n^2, and
running it O(n^2 log n) times provides a minimum cut with high probability. Using
the adjacency matrix of the graph, we can run each iteration in O(n^2) time, and the
total time is O(n^4 log n).
The approximation ratio of an approximation algorithm A is defined over all problem
instances I by comparing the value A(I) it returns with the optimal value OPT(I):
α_A = max_I  A(I)/OPT(I)    (3.6)
Example 3.9 We had already investigated the vertex cover problem in Sect. 3.6.
Finding the minimum vertex cover which has the least number of vertices among
all vertex covers of a graph is NP-hard. Since our aim is to cover all edges of the
graph by a subset of vertices, we can design an algorithm that picks each edge in
a random order and since we cannot determine which vertex will be used to cover
the edge, we include both ends of the edge in the cover as shown in Algorithm 3.6.
For each selected edge (u, v), we need to delete the edges incident to u or v from the graph
since these edges are covered.
The iterations of this algorithm in a sample graph are shown in Fig. 3.8 which
provides a vertex cover with an approximation ratio of 2.
Theorem 3.4 Algorithm 3.6 provides a vertex cover in O(m) time and the size of
the returned cover MVC is at most 2 |MinVC|, where MinVC is a minimum vertex cover.
Proof Since the algorithm continues until there are no more edges left, every edge
is covered; therefore the output from Seq1_MVC is a vertex cover, obtained in O(m) time.
The set of edges picked by this algorithm is a matching, as the chosen edges are disjoint,
and it is maximal as the addition of another edge is not possible. Any vertex cover,
including a minimum one, must contain at least one endpoint of every matched edge,
so |MinVC| is at least the number of matched edges. Since the algorithm includes two
vertices for each matched edge, the approximation ratio for this algorithm is 2.
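A minimal Python sketch of the matching-based procedure described above (Algorithm 3.6 is not reproduced here; skipping already covered edges plays the role of deleting them):

def approx_vertex_cover(edges):
    # edges: list of (u, v) pairs of an undirected graph
    cover = set()
    for (u, v) in edges:
        if u not in cover and v not in cover:   # edge not yet covered
            cover.add(u)                        # include both endpoints
            cover.add(v)
    return cover

The edges whose endpoints are both added form the maximal matching used in the proof of Theorem 3.4.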
3.8.3 Backtracking
In many cases of algorithm design, we are faced with a search space that grows expo-
nentially with the size of the input. The brute-force or exhaustive search approach
Fig. 3.8 A possible iteration of Algorithm 3.6 in a sample graph, showing the selected edge in bold
and the vertices included at the endpoints of this edge as black at each step, from a–c. The final
vertex cover has six vertices as shown in d and the minimum vertex cover for this graph has three
vertices as shown in e resulting in the worst approximation ratio of 2
searches all available options. Backtracking is a clever way of searching the available
options while looking for a solution. In this method, we look at a partial solution and
if we can pursue it further, we do. Otherwise, we backtrack to the previous state and
proceed from there since proceeding from current state violates the requirements.
This way, we save some of the operations needed from the current state onwards.
The choices that can be made are placed in a state-search tree where nodes except
the leaves correspond to partial solutions and edges are used to expand the partial
solutions. A subtree which does not lead to a solution is not searched and we back-
track to the parent of such a node. Backtracking can be conveniently implemented
by recursion since we need to get back to the previous choice if we find the current
choice does not lead to a solution.
Let us see how this method works using an example. In the subset sum problem,
we are given a set S of n distinct positive integers and are asked to find the subsets of
S whose elements sum to a given integer M; there may be more than one such subset.
For example, if S = {1, 3, 5, 7, 8} and M = 11, then S1 = {1, 3, 7} and S2 = {3, 8} are the
solutions. We have 2^n possible subsets, and a binary tree representing the state-space
tree will have 2^n leaves with one or more leaves providing the solutions if they exist.
Given S = {2, 3, 6, 7} and M = 9, a state-space tree can be formed as shown
in Fig. 3.9. The nodes of the tree show the sum accumulated up to that point from
the root down the tree and we start with 0 sum. We consider each element of the
set S in increasing order and at length i from the root, the element considered is the
ith element of S. At each node, we have the left branch showing the decision if we
include the element and the right branch proceeds to the subtree when we do not
include that element in the search.
Fig. 3.9 [figure: the state-space tree for S = {2, 3, 6, 7} and M = 9, showing the accumulated sum
at each node; the leaves giving the solutions {2, 7} and {3, 6} are marked]
The nodes shown in dashed circles are the points
in the search where we do not need to proceed any further since the requirement can
not be met if we do. For example, when we include the first two elements of the
set, we have 5 as the accumulated sum and adding the third element, 6 will give 11
which is more than M. So we do not take the subtree rooted at left branch of node 5.
Similarly, selecting 3 and 6 gives us the solution and we report it but still backtrack
since all solutions are required. If only one solution was required, we would have
stopped there.
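A short Python sketch of this backtracking search over the state-space tree, assuming the elements of S are positive and reporting every subset that sums to M (names are illustrative):

def subset_sum(S, M):
    solutions = []
    def backtrack(i, chosen, total):
        if total == M:
            solutions.append(list(chosen))   # report a solution and backtrack
            return
        if i == len(S) or total > M:         # prune: M can no longer be reached
            return
        chosen.append(S[i])                  # left branch: include S[i]
        backtrack(i + 1, chosen, total + S[i])
        chosen.pop()                         # right branch: exclude S[i]
        backtrack(i + 1, chosen, total)
    backtrack(0, [], 0)
    return solutions

Called with S = [2, 3, 6, 7] and M = 9, this returns the two solutions [2, 7] and [3, 6] found in Fig. 3.9.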
The branch and bound method is similar to the backtracking approach used for decision
problems, modified for optimization problems. The aim in solving an opti-
mization problem is to maximize or minimize an objective function and the result
found is the optimal solution to the problem. Branch and bound algorithms employ
the state-space trees as in backtracking with the addition of a record best of the best
solution found up to that point in the execution. Moreover, as the execution
progresses, we need a bound next_i on the best value that can be obtained from each
node i of the tree if we continue processing from that node. This way, we can com-
pare these values and if next_i is no better than best, there is no need to process the
subtree rooted at node i.
We will illustrate the general idea of this method by the traveling salesperson
problem (TSP) in which a salesperson starts her journey from a city, visits each city,
and returns to the original city using a minimal total distance. This, in fact, is the
Hamiltonian cycle problem with edge weights. Let G = (V, E) be the undirected
graph that represents the cities and roads between them. For the sample graph of
Fig. 3.10, we can see there are six possible weighted Hamiltonian cycles and only
two provide the optimal route. The routes in both are the same but the order of visits
is reversed. Note that starting from any vertex would provide the same routes. In
fact, there are (n − 1)! possible routes in a fully connected graph with n vertices.
A brute-force approach would start from a vertex a, for example, and search
all possible Hamiltonian cycles using the state-space tree and then record the best
route found. We need to define a lower bound value (lb) for the branch and bound
algorithm. The lb value is calculated as the total sum of the two minimum weight
edges from each vertex divided by two to get the average value. For the graph depicted
in Fig. 3.10, this value is
((2 + 4) + (3 + 1) + (3 + 2) + (4 + 1))/2 = 10
We can now start to build the state-space tree and every time we consider adding
an edge to the existing path, we will modify the values in the lower bound function
as affected by the selection of that edge and calculate a new lower bound. Then,
we will select the edge with the minimum lower bound value among all possible
edges. The state-space tree of the branch and bound algorithm for TSP in the graph
of Fig. 3.10 is depicted in Fig. 3.11. The optimal Hamiltonian cycles are a, c, b, d, a
and a, d, b, c, a. These paths correspond to paths (f) and (g) of Fig. 3.10.
Key to the operation of any branch and bound algorithm is the specification of
the lower bound. When search space is large, we need this parameter to be easily
computed at each step yet to be selective enough to prune the unwanted nodes of
the state-space tree early. For the TSP example, another lower bound is the sum of
the minimum entries in the adjacency matrix A of the graph G. This is a valid lower
bound since we are considering the lightest weight edge from each vertex and we
know we cannot do better than this. For our example of Fig. 3.10, the lower bound
calculated this way is 9. Computing the lower bound in each step would then involve
Fig. 3.11 The state-space tree for the graph of Fig. 3.10. Each tree node has the path and the lower
bound value using this path
deleting the row a and column b from A when the edge (a, b) is considered and then
including the lightest edge from each remaining vertex that is not connected to a or
b to compute the lower bound for including the edge (a, b) in the path.
The greedy method searches for solutions that are locally optimal based on the
current state. It chooses the best alternative available using the currently available
information. A real-life example is the change provided by a cashier in a supermarket.
Commonly, the cashier will select the largest coin possible at each step to give the least
number of coins, which is optimal for some coin systems such as the one used in the U.S.
In many cases, following the locally best solution at each step will not yield an
overall optimal solution. However, in certain problems, the greedy method provides
optimal solutions. Finding shortest paths between the nodes of a weighted graph and
constructing a minimum spanning tree of a graph are examples of problems where the
greedy method can be used efficiently to find optimal solutions. Greedy algorithms can also be used to
find approximate solutions to some problems. We will describe Kruskal’s algorithm
to find the minimum spanning tree (MST) as an example greedy graph algorithm.
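The description of Kruskal's algorithm is not included in this excerpt; a compact Python sketch of the greedy idea (sort edges by weight and add an edge whenever it joins two different components, tracked with a simple union–find structure) is:

def kruskal_mst(vertices, weighted_edges):
    # weighted_edges: list of (weight, u, v) tuples
    parent = {v: v for v in vertices}
    def find(v):                        # representative of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v
    mst = []
    for w, u, v in sorted(weighted_edges):
        ru, rv = find(u), find(v)
        if ru != rv:                    # no cycle: endpoints in different components
            parent[ru] = rv             # union the two components
            mst.append((u, v, w))
    return mst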
The divide and conquer strategy involves dividing a problem instance into several
smaller instances of possibly similar sizes. The smaller instances are then solved
and the solutions to the smaller instances are then typically combined to provide
the overall solution. Solving the smaller instances and combining these solutions is
performed recursively.
[figure: an edge-weighted sample graph on the vertices a, b, c, d, e, g, shown in panels (a)–(e)]
Let us try to analyze the recursive calls to this function. For example, F(9) is
F(8) + F(7); F(8) is F(7) + F(6) and moving in this direction, we will reach the
base values of F(1) and F(0). Note the value of F(7) is calculated twice to find
F(9). This recurrence relation has exponential time complexity; however, we can
see that some of the calls are repeated, for example, F(3) is calculated many times, so we
deduce this is not the best way to solve this problem. The dynamic programming solution
provides a solution with better complexity as we will see next.
In this method, the problem is divided into smaller instances first, the small prob-
lems are solved, and the results are stored to be used in the next stage. It is similar to
divide and conquer method in a way as it recursively divides the instance of the prob-
lem into smaller instances. However, it computes the solution to smaller instances,
records them, and does not recalculate these solutions as in divide and conquer.
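A minimal Python illustration of this idea for the Fibonacci numbers discussed above, storing every computed value so that nothing is recomputed:

def fib(n):
    # bottom-up dynamic programming: each F[i] is computed exactly once
    F = [0, 1] + [0] * max(0, n - 1)
    for i in range(2, n + 1):
        F[i] = F[i - 1] + F[i - 2]
    return F[n]

This runs in O(n) time, in contrast to the exponential number of calls made by the plain recursive version.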
Fig. 3.14 An example running of Bellman–Ford algorithm in a sample graph for the source vertex
g. The first reachable vertices are a, f , and e which are included in the shortest path tree T . This
tree is updated at each iteration when less cost paths compared to the previous ones are found
Example
An undirected and edge-weighted sample graph is depicted in Fig. 3.14 where we
implement this algorithm with source vertex g. The maximum number of changes of
the shortest path for a vertex is n − 1, requiring n − 1 iterations of the outer loop at line
8. Each iteration requires at most m edge checks, resulting in O(nm) time complexity
for this algorithm. We will see a more detailed version of this algorithm that also
provides a tree structure in Chap. 7.
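The detailed listing is deferred to Chap. 7 as noted above; a minimal Python sketch of the relaxation idea, assuming the graph is given as a list of directed (u, v, w) edges (an undirected edge is simply listed in both directions), is:

def bellman_ford(vertices, weighted_edges, source):
    INF = float('inf')
    dist = {v: INF for v in vertices}   # best known distance from source to each vertex
    dist[source] = 0
    for _ in range(len(vertices) - 1):  # n - 1 passes over all edges
        for u, v, w in weighted_edges:
            if dist[u] + w < dist[v]:   # relax edge (u, v)
                dist[v] = dist[u] + w
    return dist

Each of the n − 1 passes examines all m edges, which gives the O(nm) bound stated above.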
Greedy methods attempt to find an optimal global solution by always selecting the locally
optimum solutions. These local choices are based on what is known so
far and, in general, do not lead to an optimal solution. However, we
saw greedy algorithms provide optimal solutions in a few graph problems including
shortest paths and minimum spanning trees. In the divide and conquer method, the
problem at hand is divided into a number of smaller problems which are solved
and solutions are merged to find the final solution. These algorithms often employ
recursion due to the nature of their operation. Dynamic programming also divides the
problem into smaller parts but makes use of the partial solutions found to obtain the
general solution. The background we have reviewed is related mainly to sequential
graph algorithms and we will see that further background and considerations are
needed for parallel and distributed graph algorithms in the next chapters. There are
a number of algorithm books which provide the basic background about algorithms
including the one by Skiena [6], Cormen et al. [1], and by Tardos and Kleinberg [5].
Exercises
1. Work out by proofs the worst-case running times for the following:
a. f(n) = 4n^3 + 5n^2 − 4.
b. f(n) = 2n + n^7 + 23.
c. f(n) = 2n log n + n + 8.
2. Prove 3n^4 is not O(n^3). Note that you need to show there are no valid constant
c and threshold n_0 values for this worst case to hold.
3. Write the pseudocode of a recursive algorithm to find the sum of first n positive
integers. Form and solve the recurrence relation for this algorithm to find its
worst-time complexity.
4. Prove n! ≤ n^n by the induction method.
5. A clique is the subgraph of a graph in which every vertex is adjacent to all other
vertices in the clique. Given a graph G = (V, E) and its subgraph V′, V′ is a
[figures: the sample graphs used by the exercises, including the graph of Fig. 3.18 and the
edge-weighted graph of Fig. 3.19]
11. Find a minimal vertex cover for the sample graph in Fig. 3.18 using the 2-
approximation algorithm. Show each iteration of this algorithm and work out
the approximation achieved by your iterations.
12. Find the MST of the sample graph of Fig. 3.19 using Kruskal’s algorithm by
showing each iteration.
References
1. Cormen TH, Stein C, Rivest RL, Leiserson CE (2009) Introduction to algorithms, 3rd edn. MIT
Press, Cambridge ISBN-13: 978-0262033848
2. Dasgupta S, Papadimitriou CH, Vazirani UV (2006) Algorithms. McGraw-Hill, New York
3. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-
completeness. W.H. Freeman, New York
4. Karger D (1993) Global min-cuts in RNC and other ramifications of a simple mincut algorithm.
In: Proceedings of the 4th annual ACM-SIAM symposium on discrete algorithms
5. Kleinberg J, Tardos E (2005) Algorithm design, 1st edn. Pearson, London ISBN-13: 978-
032129535
6. Skiena S (2008) The algorithm design manual. Springer, Berlin ISBN-10: 1849967202
4 Parallel Graph Algorithms
Abstract
We investigate methods for parallel algorithm design with emphasis on graph
algorithms in this chapter. Shared memory and distributed memory parallel
processing are the two fundamental models at hardware, operating system, pro-
gramming, and algorithmic levels of parallel computation. We review these meth-
ods and describe static and dynamic load balancing in parallel computing systems.
4.1 Introduction
We have reviewed basic concepts in algorithms and main methods to design sequen-
tial algorithms with emphasis on graph algorithms in the previous chapter. Our aim in
this chapter is to investigate methods for parallel algorithm design with emphasis on
graph algorithms again. Parallel processing is commonly used to solve computation-
ally large and data-intensive tasks on a number of computational nodes. The main
goal in using this method is to obtain results much faster than would be acquired by
sequential processing and hence improve the system performance. Parallel process-
ing requires design of efficient parallel algorithms and this is not a trivial task as we
will see.
There are various tasks involved in parallel running of algorithms; we first need
to identify the subtasks that can execute in parallel. For some problems, this step can
be performed conveniently; however, many problems are inherently sequential and
we will see a number of graph algorithms fall in this category. Assuming a task T
can be divided into n parallel subtasks t1 , ..., tn , the next problem is to assign these
subtasks to processors of the parallel system. This process is called mapping and is
denoted as a function from the task set T to the processor set P as M : T → P.
Subtasks may have dependencies so that a subtask t j may not start before a preceding
subtask ti finishes. This is indeed the case when ti sends some data to t j in which
case starting t j without this data would be meaningless. If we know all of the task
dependencies and also characteristics such as the execution times, we can distribute
tasks evenly to the processors before running them and this process is termed static
scheduling. In many cases, we do not have this information beforehand and dynamic
load balancing is used to provide each processor with fair share of the workload at
runtime based on the load variation of the processes.
We can have shared memory parallel processing in which computational nodes
communicate via a shared memory and in this case, the global memory should be
protected against concurrent accesses. In distributed memory parallel processing,
communication and synchronization among parallel tasks are handled by sending
and receiving messages over a communication network without any shared memory.
Parallel programming implicates writing the actual parallel code that will run on the
parallel processors. For shared memory parallel processing, lightweight processes
called threads are widely used and for parallel processing applications in distributed
memory architectures; the Message Passing Interface (MPI) is a commonly imple-
mented interface standard. We will see that we can have different models at hardware,
algorithm, and programming modes. These models are related to each other to some
extent as will be described. For example, message passing model at algorithmic
level requires distributed memory at hardware level which can be implemented by a
message passing programming model that runs the same code with different data or
different codes possibly with different data.
We start this chapter by describing fundamental concepts of parallel processing
followed by the specification of models of parallel computing. We then investigate
parallel algorithm design methods focussing on parallel graph algorithms which
require specific techniques. Static and dynamic load balancing methods to evenly
distribute parallel tasks to processors are outlined and we conclude by illustrating
the parallel programming environments.
• concurrent versus parallel processing: Concurrent tasks may be processed in an interleaved
manner, whereas parallel tasks run at the same physical time. Tasks that communicate using shared memory in a
single-processor system are concurrent but are not parallel. Concurrency is more
general than parallelism and encompasses parallelism.
• fine-grain versus coarse-grain parallelism: When the computation is partitioned
into number of tasks, the size of tasks as well as the size of data they work on affects
their running time. In fine-grain parallelism, tasks communicate and synchronize
frequently and coarse-grain parallelism involves tasks with larger computation
times that communicate and synchronize much less frequently.
• embarrassingly parallel: The parallel computation consists of independent tasks
that have no inter-dependencies in this mode. In other words, there are no prece-
dence relationships or data communications among them. The speedup achieved
by these algorithms may be close to the number of processors used in the parallel
processing system.
• multi-core computing: A multi-core processor contains more than one of the
processing elements called cores. Most contemporary processors are multi-core
and multi-core computing is running programs in parallel on multi-core proces-
sors. The parallel algorithm should make use of the multi-core architecture effec-
tively and the operating system should provide effective scheduling of tasks to
cores in these systems.
• symmetric multiprocessing: A symmetric multiprocessor (SMP) contains a num-
ber of identical processors that communicate via a shared memory. Note that
SMPs are organized on a coarser scale than multi-core processors which contain
cores in a single integrated circuit package.
• multiprocessors: A multiprocessor consists of a number of processors which com-
municate through a shared memory. We typically have a set of microprocessors
connected by a high-speed parallel bus to a global memory. Memory arbitration
at hardware level is needed in these systems.
• multicomputer: Each processor has private memory and typically communicates
with other microcomputers by sending and receiving messages. There is no global
memory in general.
• cluster computing: A cluster is a set of connected computers that communicate
and synchronize using messages over a network to finish a common task. A cluster
is envisioned as a single computer by the user and it acts as a single computer
by the use of suitable software. Note that a cluster is a more abstract view of a
multiprocessor system with software capabilities such as dynamic load balancing.
• Grid computing: A grid is a large number of geographically distributed computers
that work and cooperate to achieve a common goal. Grid computing provides a
platform of parallel computing mostly for embarrassingly parallel applications
due to unpredictable delays in communication.
• Cloud computing: Cloud computing enables sharing of networked computing
resources for various applications using the Internet. It provides delivery of ser-
vices such as online storage, computing power, and specialized user applications
to the user.
A processing unit has a processor, a memory, and an input/output unit in its most
basic form. We need a number of processors that should execute in parallel to perform
subtasks of a larger task. These subtasks need two basic operations: communication
to transfer data produced and synchronization. We can have these operations in shared
memory or distributed memory configurations as described next. A general trend in
parallel computing is to employ general-purpose off-the-shelf processors connected
by a network due to the simplicity and the scalability of such configurations.
In a shared memory architecture as shown in Fig. 4.1a, each processor has some
local memory, and interprocess communication and synchronization are performed
using a shared memory that provides access to all processors. Data is read from and
written to the shared memory locations; however, we need to provide some form of
control on access to this memory to prevent race conditions. We can have a number
of shared memory modules as shown in Fig. 4.1b with the network interface to these
modules providing concurrent accesses to different modules by different processors.
The main advantage of shared memory parallel processors is fast data access
to memory. However, the shared memory should be protected against concurrent
accesses by the parallel tasks and this process should be controlled by the programmer
in many cases. Another major disadvantage of shared memory approach is the limited
number of processors that can be connected due to the bottleneck over the bus while
accessing the shared memory. In conclusion, we can state shared memory systems
are not scalable.
Fig. 4.1 Parallel computing architectures, a a shared memory architecture with a single global
memory, b a shared memory architecture with a number of memory modules, c a distributed
memory architecture, d a distributed and shared memory architecture
In a distributed memory architecture as shown in Fig. 4.1c, each processor has its own local memory and we
can use off-the-shelf computers and connect them using a network to have a parallel
computing system. Access to local memories is faster than global memory access;
however, the algorithm designer should take the responsibility of how and when to
synchronize the processor and transfer of data. The main advantage of distributed
memory systems is their scalability and use of the off-the-shelf computers. Also,
there are no overheads in memory management as in shared memory. As the main
disadvantage, it is the task of the algorithm designer to manage data communication
using messages; and communication over the network is commonly serial which is
slower than the parallel communication of the shared memory system. Also, some
algorithms are based on sharing of global data converting and mapping of which to
distributed memory may not be a trivial task.
In many cases, contemporary parallel computing applications use both shared
memory and distributed memory architectures. The nodes of the parallel computing
system are symmetric multiprocessors (SMPs) that are connected via a network
to other SMPs in many cases. Each SMP node works in shared memory mode to run
its tasks but communicates with other nodes in distributed memory mode as shown
in Fig. 4.1d.
Special architectures provide communication links that can transfer data between
multiple source–destination pair of nodes in parallel. In a hypercube, processors are
connected as the vertices of a cube as shown in Fig. 4.2a for a hypercube of dimension 4. The largest distance between any two processors is log n in a hypercube with n processors, and a hypercube of dimension d has n = 2^d processors. Each node has an
integer label which has a difference of one bit from any of its neighbors providing
a convenient way of detecting neighbors while designing parallel algorithms on the
hypercube.
A linear array consists of processors connected in a line, each having a left and a right neighbor except the first and the last processor, as depicted in Fig. 4.2b. A ring network has processors connected in a cycle as in Fig. 4.2c. The mesh architecture has a 2-D array of processors connected as a matrix, and the balanced tree architecture is a tree with processors as nodes, each of which has two children except the leaves, as shown in Fig. 4.2d, e. Some parallel computers, including the Cray T3D, SGI machines, and the IBM Blue Gene, have mesh structures.
Fig. 4.2 Special parallel processing architectures, a a hypercube of dimension 4, b a linear array, c a ring, d a mesh of two dimensions, e a balanced binary tree
4.4 Models
We need a model of parallel computing which specifies what can be done in parallel
and how these operations can be performed. Two basic models based on architectural
constraints are described next.
The parallel random access memory (PRAM) model extends the basic RAM model to parallel computing. The processors are identical, each with some local memory, and there is a global shared memory used for communication and synchronization. Therefore, it assumes the shared memory architecture described in Sect. 4.3.1. Processors work synchronously using a global clock, and at each time unit a processor can read from a global or local memory location, execute a single RAM operation, and write to one global or local memory location. PRAM models are classified according to the read and write access rights to the global memory as follows:

• Exclusive read, exclusive write (EREW): concurrent accesses to the same global memory location are not allowed, either for reading or for writing.
• Concurrent read, exclusive write (CREW): concurrent reads of the same location are allowed, but writes must be exclusive.
• Concurrent read, concurrent write (CRCW): concurrent reads and concurrent writes to the same location are allowed, with some rule to resolve the value written by concurrent writes.
Fig. 4.3 Blocking and non-blocking communication modes, a a non-blocking send and a blocking receive, b a non-blocking send and a non-blocking receive. Blocked times are shown by gray rectangles. Network delays cause the duration between sending and receiving of a message
4.5 Analysis of Parallel Algorithms
We need to assess the efficiency of a parallel algorithm to decide its goodness. The
running time of an algorithm, the number of processors it uses, and its cost are used
to determine the efficiency of a parallel algorithm. The running time of a parallel
algorithm, T_p, can be specified as

T_p = t_fin − t_st

where t_st is the start time of the algorithm on the first (earliest) processor and t_fin is the finishing time of the algorithm on the last (latest) processor. The depth D_p
of a parallel algorithm is the largest number of dependent steps performed by the
algorithm. The dependency in this context means a step cannot be performed before
a previous one finishes since it needs the output from this previous step. If T_p is the worst-case running time of a particular algorithm A for a problem Q using p identical processors and T_s is the worst-case running time of the fastest known sequential algorithm to solve Q, the speedup S_p is defined as below:

S_p = T_s / T_p     (4.1)
We need the speedup to be as large as possible for efficiency. The parallel processing time T_p increases with increased interprocess communication costs, resulting in a lower speedup. Placing parallel tasks in fewer processors to reduce network traffic decreases parallelism, and these two contradicting approaches should be considered carefully while allocating parallel tasks to processors. Efficiency of a parallel algorithm is defined as

E_p = S_p / p     (4.2)
A parallel algorithm is said to be scalable if its efficiency remains almost constant
when both the number of processors and the size of the problem are increased.
Efficiency of a parallel algorithm is between 0 and 1. A program that is scalable
with speedup approaching p has efficiency approaching 1. Let us analyze adding n numbers using k processors, assuming n/k elements are distributed to each processor. Each p_i, 0 ≤ i < k, finds its local sum in Θ(n/k) time. Then, the partial sums are added in log(k) time by k processors, resulting in a total parallel time T_p = Θ(n/k + log k). The sequential algorithm has a time complexity of T_s = Θ(n). Therefore, the efficiency of this algorithm is

E_p = T_s / (k T_p) = n / (n + k log k)     (4.3)
For the efficiency to remain at a fixed value c, from Eq. 4.3 we need c = n/(n + k log k), which gives

n = (c / (1 − c)) k log k     (4.4)

Listing the efficiency values against the size of the problem n and the number of processors k, we can see that the efficiency is 80 % for the (n, k) values (64, 4), (192, 8), and (512, 16) [6]; the other efficiency values, up to a maximum of (512, 32), are lower. Total cost, or simply the cost, of a parallel algorithm is the collective time taken by all processors to solve the problem. The cost on p parallel computers is the time spent multiplied by the number of processors, pT_p; since this cost cannot be lower than the sequential time T_s, we have

T_p ≥ T_s / p
which is called the work law. We are interested in the number of steps taken rather
than the physical time duration of the parallel algorithm. A parallel algorithm is cost-optimal if the total work done W is

W = pT_p = Θ(T_s)

in other words, when its cost is similar to the cost of the best-known sequential algorithm for the same problem. The parallelism achieved, P, can be specified in terms of these parameters as below:

P = W / S

where S denotes the span of the algorithm, that is, its depth D_p.
Let us illustrate these concepts with another parallel algorithm to add 8 numbers; this time each processor performs one addition only. A possible implementation of this algorithm using four processors is shown in Fig. 4.4. We can see that the number of dependent steps, which is the depth of this algorithm, is 3 and the total work done is 7, as 4, 2, and 1 additions are done in steps 1, 2, and 3. Such problems are termed circuit problems, which include finding minimum/maximum values in an array; their depth is log n as can be seen, and the total work done in these algorithms is n − 1 (Fig. 4.4).
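To make the depth and work figures concrete, the short C program below simulates this pairwise addition scheme sequentially and counts the additions per level; it is only a sketch for illustration (the input values are arbitrary), not a parallel implementation. Running it prints depth 3 and work 7, matching the values above.

#include <stdio.h>

int main(void)
{  double a[8] = {3, 1, 4, 1, 5, 9, 2, 6};   /* sample input values (arbitrary) */
   int active = 8, depth = 0, work = 0, i;

   while (active > 1) {                      /* one iteration per level of the addition tree */
      for (i = 0; i < active / 2; i++) {     /* these additions are independent and could   */
         a[i] = a[2*i] + a[2*i + 1];         /* be performed by separate processors          */
         work++;
      }
      active /= 2;
      depth++;
   }
   printf("sum = %g, depth = %d, work = %d\n", a[0], depth, work);  /* 3 levels, 7 additions */
   return 0;
}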
4.6 Basic Communication Modes
Processes residing on different computing nodes of the parallel system need to com-
municate to finish an overall task. We can distinguish the two basic modes of com-
munication between the processes: either all processes involved in the parallel task
communicate or a group of processes communicate with each other. Viewed from
another angle, there is also the architecture of the hardware that needs to be considered when specifying these communication modes. The following are the basic communication operations between all n processes of the system or the group [6].
Fig. 4.5 All-to-all communication in a 3-D hypercube. The first data exchange between neighbors is in the x direction, then in the y, and finally in the z direction, in a, b, and c consecutively
In all-to-all broadcast, we want each node to have all messages m_i, 0 ≤ i < n, with n = 2^d, at the end of the algorithm. We can have all neighbors exchange their messages along the x-axis in the first step, then exchange the obtained results along the y-axis, and finally along the z-axis in the last step for a 3-D hypercube, as shown in Fig. 4.5. Note that the size of the messages transmitted is doubled at each step and the total number of steps is d.
The problem here is to write an algorithm that provides the data transfer specified above. We can make use of the labeling of the nodes in a hypercube, in which the label of a node differs in one bit from the label of any neighbor. The least significant bit (LSB) difference of a node from a neighbor is in the x direction, the second LSB difference is in the y direction, and so on. We can therefore take the bitwise exclusive-OR of the identifier of a node with a single-bit mask to find the identifier of its neighbor. For example, node 5 (0101B), when XORed with 0001B, results in 4, which is the neighbor of node 5 in the x direction. Algorithm 4.1 makes use of this property and selects neighbors (neigh) for transfer which have a 1-bit difference at each iteration. The total number of steps is the dimension d of the hypercube.
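Since Algorithm 4.1 itself is not reproduced here, the following is a sketch of the same idea written with MPI (introduced in Sect. 4.10): at step k, each process exchanges its accumulated buffer with the neighbor whose rank differs in bit k. The dimension D, the message length, and the message contents are assumptions made for the example.

#include <mpi.h>
#include <string.h>
#include <stdio.h>

#define D 3                                 /* hypercube dimension; 2^D processes assumed */
#define MSG_LEN 8                           /* length of each node's own message (assumption) */

int main(int argc, char *argv[])
{  int rank, size, k;
   char buf[(1 << D) * MSG_LEN];            /* accumulates all messages received so far */
   char recv_buf[(1 << D) * MSG_LEN];
   int len = MSG_LEN;                       /* current number of valid bytes in buf */
   MPI_Status status;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   MPI_Comm_size(MPI_COMM_WORLD, &size);
   if (size != (1 << D)) { MPI_Finalize(); return 1; }   /* sketch requires exactly 2^D processes */

   snprintf(buf, MSG_LEN, "m%d", rank);     /* this node's own message m_rank */

   for (k = 0; k < D; k++) {                /* one exchange step per dimension */
      int neigh = rank ^ (1 << k);          /* neighbor: label differs in bit k */
      MPI_Sendrecv(buf, len, MPI_CHAR, neigh, 0,
                   recv_buf, len, MPI_CHAR, neigh, 0,
                   MPI_COMM_WORLD, &status);
      memcpy(buf + len, recv_buf, len);     /* append what the neighbor had accumulated */
      len *= 2;                             /* message size doubles at each step */
   }
   /* buf now holds all 2^D messages at every node */
   MPI_Finalize();
   return 0;
}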
4.7 Parallel Algorithm Design Methods
We can employ one of the following strategies when designing a parallel algorithm: modify an existing sequential algorithm by detecting subtasks that can be performed in parallel, which is by far the most commonly used approach; design a parallel algorithm from scratch; or start the same sequential algorithm on a number of processors but with different, possibly random, initial conditions, where the first one that finishes becomes the winner. Foster proposed a four-step design approach for parallel processing, stated below [4]. We will look into these steps in more detail in the following sections.
1. Partitioning: Data, the overall task, or both can be partitioned into a number of smaller parts to be assigned to processes. Partitioning of data is called data or domain decomposition and partitioning of code is termed functional decomposition.
2. Communication: The amount of data and the sending and receiving parallel sub-
tasks are determined in this step.
3. Agglomeration: The subtasks determined in the first two steps are arranged into
larger groups with the aim of reducing communication among them.
4. Mapping: The formed groups are allocated to the processors of the parallel system.
When the task graph that depicts subtasks and their communication is constructed, the last two steps of this methodology reduce to the graph partitioning problem, as we will see.
Assume we need to form the product C of two n × n matrices A and B and we partition these matrices into n/2 × n/2 sub-matrices for four processes as follows:

   | C1 C2 |   | A1 A2 |   | B1 B2 |
   | C3 C4 | = | A3 A4 | × | B3 B4 |
The tasks to be performed by each process p_1, ..., p_4 can now be stated as below:

C1 = (A1 × B1) + (A2 × B3) → p1
C2 = (A1 × B2) + (A2 × B4) → p2
C3 = (A3 × B1) + (A4 × B3) → p3
C4 = (A3 × B2) + (A4 × B4) → p4
We can simply distribute the associated partitions of matrices to each process
in the message passing model, for example A1 , A2 , B1 , and B3 to p1 , or have the
processes work on their related partitions in shared memory in the PRAM model.
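As an illustration of the task assigned to p1, the following C sketch computes C1 = (A1 × B1) + (A2 × B3) for blocks stored as (n/2) × (n/2) arrays; the block size and the test values in main are assumptions made for the example.

#include <stdio.h>

#define NB 4                                 /* block dimension n/2, assuming n = 8 */

/* C1 = (A1 x B1) + (A2 x B3): the task assigned to process p1 in the text */
void compute_C1(double A1[NB][NB], double A2[NB][NB],
                double B1[NB][NB], double B3[NB][NB], double C1[NB][NB])
{  int i, j, k;
   for (i = 0; i < NB; i++)
      for (j = 0; j < NB; j++) {
         C1[i][j] = 0.0;
         for (k = 0; k < NB; k++)            /* accumulate both block products */
            C1[i][j] += A1[i][k] * B1[k][j] + A2[i][k] * B3[k][j];
      }
}

int main(void)
{  double A1[NB][NB] = {{0}}, A2[NB][NB] = {{0}};
   double B1[NB][NB] = {{0}}, B3[NB][NB] = {{0}}, C1[NB][NB];
   int i;
   for (i = 0; i < NB; i++) { A1[i][i] = 1.0; B1[i][i] = 2.0; }   /* simple test data */
   compute_C1(A1, A2, B1, B3, C1);
   printf("C1[0][0] = %g\n", C1[0][0]);                           /* prints 2 */
   return 0;
}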
In the supervisor/worker model of parallel processing, we have one process, say
p1 , that has all the inputs which are matrices A and B in this case. This supervisor
node is responsible for the distribution of the initial data to worker processes and
then collecting the results. It may also be involved in computing the results if there
is a load unbalance such that the supervisor remains idle when other processes are
involved in computation. Intuitively, when there is a large number of processes with dense communication needs, the role of the supervisor can be confined to managing the basic dataflow and providing the output. Alternatively, in the fully distributed model,
all processes are equal with input data provided to all. Each node in the network
works in its partition but exchanges messages to have the total result stored in them.
This mode may be used in the first step of a parallel task if total results are needed
by each process as input data to the next step of individual computations. We have
used block partitioning of the input matrices in this example where the matrix is
partitioned into blocks of equal size as shown in Fig. 4.6 for an 8 × 8 matrix where
we have 16 processes, p1 to p16, each having a 2 × 2 partition of the matrix.
Row-wise Array Partitioning
Let us consider a matrix A[n, n] with n rows and n columns. In row-wise partitioning, we simply partition the matrix A into k parts such that p_i gets rows (i − 1)n/k + 1 to in/k. Such a partitioning is depicted in Fig. 4.7a for an 8 × 8 matrix with four processes p_1, p_2, p_3, and p_4.
Column-wise Array Partitioning
In column-wise partitioning of a matrix A[n, n], each process now has n/k consecu-
tive columns of A. Column-wise partitioning of an 8 × 8 matrix with four processes
is shown in Fig. 4.7b. Row-wise and column-wise partitioning of a matrix are called
1-D partitioning and the block partitioning is commonly called 2-D partitioning of
a matrix.
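The index ranges of 1-D partitioning can be computed with a few lines of C; the sketch below prints the row ranges of the four processes for the 8 × 8 matrix of Fig. 4.7a (the same ranges apply to columns in column-wise partitioning).

#include <stdio.h>

int main(void)
{  int n = 8, k = 4, i;                      /* 8x8 matrix, 4 processes, as in Fig. 4.7a */

   for (i = 1; i <= k; i++) {                /* rows assigned to process p_i (1-based) */
      int first = (i - 1) * n / k + 1;
      int last  = i * n / k;
      printf("p%d gets rows %d..%d\n", i, first, last);
   }
   return 0;
}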
To obtain a parallel version of this algorithm, we note that the recursive calls are independent as they operate on different data partitions; hence, we can perform these calls in parallel simply by performing the operations within the else statement between lines 4 and 8 of Algorithm 4.2 in parallel. The recurrence relation for this algorithm in this case is T(n) = T(n/2) + 1, which has the solution T(n) = O(log n).
The divide and conquer method requires the graph to be partitioned into smaller graphs, and this is not an easy task due to the irregular structure of graphs. Partitioning of data for parallel graph algorithms means balanced partitioning of the graph among the processors, which is an NP-hard problem. We need radically different methods for parallel graph algorithms; graph contraction, pointer jumping, and randomization are the three fundamental approaches for this purpose.
A randomized algorithm makes certain decisions based on the result of coin flips
during the execution of the algorithm as we reviewed in Chap. 3. These algorithms
assume any input combination is possible. Two main classes of randomized algo-
rithms are Las Vegas and Monte Carlo algorithms as we have outlined.
Randomized algorithms can be used effectively for parallel solution of various
graph problems. Discovering connected components of a graph, finding maximal
independent sets, and constructing minimum spanning trees of a graph can all be
performed by parallel randomized algorithms as we will see in Part II.
Symmetry breaking in a parallel graph algorithm involves selecting a subset from a large set of independent operations using some property of the graph. For example, finding all candidate vertices for the maximal independent set (MIS) of a graph can be done in parallel. However, we cannot have two adjacent vertices both included in the MIS as this violates the definition of an MIS. A symmetry breaking
procedure may select the vertex with a lower identifier or a lower degree. In general,
symmetry breaking may be employed to correct the output when independent parallel
operations on the vertices or edges of a graph produce a large and possibly incorrect
result.
Given an unweighted undirected graph G = (V, E), the graph partitioning task is dividing the vertex set V into disjoint vertex sets V_1, ..., V_k such that the number of vertices in each partition is approximately equal and the number of edges between the subgraphs induced by the partitions is minimal. This process is depicted in Fig. 4.8, where a graph is partitioned into three balanced subgraphs. Vertices and edges may have weights associated with them representing some physical parameter related to the network represented by the graph. In such a case, our aim in partitioning is to have an approximately equal sum of vertex weights in each partition with a minimum total sum of edge weights between the partitions.
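The two objectives, balance and a small edge cut, are easy to evaluate for a given partition. The C sketch below does this for a small sample graph; the edge list and the partition assignment are made up for illustration.

#include <stdio.h>

int main(void)
{  /* a small sample graph and a 3-way partition; both are made up for illustration */
   int edges[][2] = {{0,1},{1,2},{2,0},{2,3},{3,4},{4,5},{5,3},{1,4}};
   int m = 8, n = 6;
   int part[] = {0, 0, 0, 1, 1, 2};          /* part[v] = index of the partition of vertex v */
   int size[3] = {0, 0, 0}, cut = 0, i;

   for (i = 0; i < n; i++) size[part[i]]++;  /* balance: number of vertices per partition */
   for (i = 0; i < m; i++)                   /* cut: edges whose endpoints are in different parts */
      if (part[edges[i][0]] != part[edges[i][1]]) cut++;

   printf("sizes = %d %d %d, edge cut = %d\n", size[0], size[1], size[2], cut);
   return 0;
}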
In the PRAM and distributed memory models, each process p_i works on its partition.
Assuming work done by a processor is a function of the number of vertices and
edges in its partition, the load is evenly distributed. However, inter-partition edges
and border vertices should be handled with care when obtaining the overall solution to
the problem. The duplicated border vertices in partitions they do not belong to are called ghost vertices, and using these nodes helps to overcome the difficulties encountered at the partition boundaries. A simple and effective algorithm was proposed rather
early in 1970 by Kernighan and Lin to partition a graph recursively [9]. It is basically
used to improve an existing partition by swapping vertices between the partitions to
reduce the cost of inter-partition edges. In the multilevel graph partitioning method, the graph G = (V, E) is coarsened to a smaller graph G' = (V', E') using suitable heuristics, a k-way partition of G' is computed, and the partition obtained is projected back to the original graph G [7]. A parallel formulation of this method that uses maximal matching during the coarsening phase is presented in [8].
The graph contraction method involves obtaining smaller graphs by shrinking the original graph at each step. This scheme is useful in designing efficient parallel graph
algorithms in two respects. It can be conveniently performed in O(log n) steps as
the size of the graph is reduced by a constant factor at each step. Therefore, if we
can find some way of contracting a graph in parallel while simultaneously solving a
graph problem during contraction, then we have a suitable parallel graph algorithm.
Searching for a solution to some graph problems such as minimum spanning trees
during contraction is possible as we will see. Moreover, careful selection of contrac-
tion method maintains basic graph properties, and hence we can solve the problem
on a smaller graph with much ease in parallel and then combine the solutions to find
the solution for the original graph.
Let us assume we have an input graph G and obtain smaller graphs G_1, ..., G_k after contraction. We can solve the problem in parallel in these small graphs and then
merge the solutions. A graph contraction algorithm template shown in Algorithm 4.3
follows a typical recursive algorithm structure with the base case and an inductive
case. When we reach the base case, we start computing the required function on
the small graph and then recurse on this small graph. Note that the vertex partition
should be disjoint.
Fig. 4.9 Contraction of a sample graph. The vertex set in a is partitioned into four subsets and each subset is represented by one of the vertices in the subset to get the contracted graph in b
Let us consider the example in Fig. 4.9 where we partition the vertex set into four subsets, each of which has a representative vertex as shown. Assuming the newly formed graph is small enough, we can now solve the original problem in the small graph, which is the base case, and recurse to obtain the full solution.
A matching in a graph G is a set of edges no two of which share a vertex; in other words, edges in the matching are disjoint with no shared vertices. A maximal matching in a graph G cannot be enlarged by the addition of new edges, and a maximum matching in G is a matching with the maximum size among all matchings in G. We can therefore view the edge partitioning and contraction problem as recursively finding a maximal matching E' in a graph G, contracting the edges in E' to obtain G', continuing with finding a maximal matching in G', and so on. For graph contraction, a sufficiently large matching rather than a maximal matching can be used. We will study the graph matching problem in more detail in Chap. 9.
Fig. 4.10 Edge contraction of a sample graph in a. A maximal matching is found and the graph is contracted to obtain the graph in b. The size of partitions is enlarged as more vertices are included and the label of a partition is the smallest label it has lexicographically. The final graph has two supervertices
Fig. 4.12 Iterations of star contraction of a sample graph. The contracted vertices are shown as enlarged circles and centers as shaded
In star contraction, each vertex flips a coin; if the outcome is heads, the vertex v becomes a center, otherwise it joins a neighboring center. If there is more than one center neighbor of v, it selects one arbitrarily to be its center [1].
Let us consider a linked list of n elements with each element pointing to the next
element in the list. The pointer jumping method enables each element to point to
the end of the list after log n steps. At each step of the algorithm, each element points
to the element pointed by its successor as shown in Algorithm 6.1. This algorithm
can run in parallel as each pointer update can be performed in parallel.
The operation of this algorithm for a linked list of eight elements is shown in
Fig. 4.13. We can use this template for designing parallel graph algorithms that use linked lists, as graphs can be represented by adjacency or edge lists. The pointer jumping method is suitable for the PRAM model with shared memory.
List Ranking
Given a linked list L, finding the distance from each node of L to the terminal node
is called list ranking. Algorithm 6.1 can be modified to compute these distances as
shown in Algorithm 4.5.
Fig. 4.13 Pointer jumping method in a linked list of eight elements. After three steps, all of the elements point to the end of the list. The list ranking algorithm is also depicted in this figure, at the end of which all nodes have distances to the head stored
Line 12 of this algorithm for a node a involves reading the distance of the next node of a, adding this distance to its own, and writing the sum as the new distance of a. These two consecutive operations can be done in constant time only in the CRCW PRAM model. In order to provide an EREW version of this algorithm, we need to replace line 12 by a read line and a write line, given below, both of which should be executed in EREW mode. The time complexity of this algorithm is O(log n).
temp = (a.next).d
a.d = a.d + temp
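A small C simulation of pointer jumping with list ranking is sketched below; the updates inside one step are independent over the elements and could be executed by parallel processors, which is what the PRAM algorithm exploits. The array representation of the list and the NIL marker are assumptions of the sketch.

#include <stdio.h>

#define N 8
#define NIL (-1)                      /* marks the terminal element of the list */

int main(void)
{  int next[N], d[N], nnew[N], dnew[N], i;

   for (i = 0; i < N; i++) {          /* build the list 0 -> 1 -> ... -> 7 */
      next[i] = (i < N - 1) ? i + 1 : NIL;
      d[i] = (i < N - 1) ? 1 : 0;     /* initial distance to the successor */
   }
   while (1) {                         /* each iteration corresponds to one parallel step */
      int changed = 0;
      for (i = 0; i < N; i++) {        /* these updates are independent across elements */
         dnew[i] = d[i];
         nnew[i] = next[i];
         if (next[i] != NIL) {
            dnew[i] = d[i] + d[next[i]];   /* read successor's distance, add to own */
            nnew[i] = next[next[i]];       /* jump: point to successor's successor  */
            changed = 1;
         }
      }
      for (i = 0; i < N; i++) { d[i] = dnew[i]; next[i] = nnew[i]; }
      if (!changed) break;
   }
   for (i = 0; i < N; i++) printf("%d ", d[i]);   /* prints 7 6 5 4 3 2 1 0 after 3 steps */
   printf("\n");
   return 0;
}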
In dynamic load balancing, we allocate the tasks to the processors during the execution of the parallel algorithm. This method is needed when task characteristics are not known a priori. We may have a centralized load balancing scheme in which a central process, commonly called the supervisor, manages load distribution. Whenever a worker becomes idle, it may then be assigned a new task by the supervisor.
Fig. 4.15 Allocation of static tasks using the task dependency graph. We have a graph of nine tasks in a and a possible allocation to three processors is shown in b. The partitioning here attempts to put tasks that communicate heavily to the same processor by also trying to keep the workload in each processor similar
Parallel programming involves writing the actual code that will run on the parallel machine. We will review parallel programming with examples in the shared memory and message passing (distributed memory) models. In both of these models, the computation can be organized as the single-program multiple-data (SPMD) paradigm, in which all processes execute the same code on different data, or as the multiple-program multiple-data (MPMD) model, with each process running a different code on different data.
Operating systems are built around the concept of a process which is the basic unit
of code to be scheduled that has data such as registers, stack, and private memory.
Organizing the main functions of an operating system, which are resource management and providing a convenient user interface, around this concept has many advantages. In the very basic sense, many processes can be scheduled independently in multitasking operating systems, preventing unnecessary waits due to slow input/output devices
such as disks. A process can be in one of the three basic states at any time: running
when it is executing, blocked when it cannot execute due to the unavailability of a
resource, or ready when the only resource it needs is the processor. Using processes
enables switching the processor among different processes. The current environ-
ment of the running process such as its registers, file pointers, and local data is stored
and the saved environment of the new process to run is restored in context switching.
Another problem encountered when using this model of computing is the protection
of the shared memory among processes when data in this area needs to be read or
written. The code of a process or the operating system that performs access to shared
memory with other processes is called the critical section. Although in theory we
can use processes for parallel processing in shared memory environment, two main
difficulties are the costly overhead of context switching and protection of shared
memory segments against concurrent read/write operations by the processes.
4.10.1.1 Threads
Modern operating systems support threads which are lightweight processes within
a process. A thread has program counter, registers, stack, and a small local memory
making the context switching at least an order of magnitude less costly than switching processes.
There is the global area of the process which needs to be protected since threads need
to access this area often. There are two main types of threads: kernel threads and
user threads. A kernel thread is known by the kernel of an operating system and
hence can be scheduled independently contrary to the user threads which are only
identified in user space. The user threads are managed by the runtime, resulting in an order of magnitude decrease in their context switch time compared to kernel threads. However,
a user thread blocked on an input/output operation blocks the whole process since
these threads are not identified by the kernel.
• Thread function: The thread itself is declared as a procedure with input parameters
and also possible return parameters to be invoked by the main thread.
• Thread creation: This system call creates a thread and starts running it.
int pthread_create(&thread_id,&attributes,start_function,
arguments);
where thread_id is the variable in which the created thread identifier will be stored after this system call, certain properties of the thread can be initialized by the attributes variable, start_function is the address of the thread code, and arguments are the variables passed to the created thread.
• Waiting for thread termination: The main thread waits for the threads it has created using this function call:

int pthread_join(pthread_t thread, void **status);

where thread is the identifier of the thread to wait for and status is used for the return status or for passing back a variable to the main thread.
• Thread synchronization: Threads need to synchronize for critical sections and also
for notifying events to each other using data structures such as mutual exclusion
variables, condition variables, and semaphores.
Mutual Exclusion
Protection of global variables using POSIX threads is provided by mutual exclusion
variables. In the sample C code using POSIX threads below, we have two threads
T 1 and T 2, a global shared variable data between them, and a mutual exclusion
variable m which is initialized by the main thread which also activates threads and
finally waits for them to finish. Each thread locks m before entering its critical section, preventing other threads from concurrently executing code protected by m. Upon exit, it unlocks m to enable any other thread to enter its critical section protected by m. The operating system ensures that lock and unlock operations on the mutual exclusion variables are executed atomically and hence cannot be interrupted.
#include <pthread.h>

int data;                        /* shared variable protected by m */
pthread_t thread1, thread2;
pthread_mutex_t m;

void *T1(void *arg) {
   ...
   pthread_mutex_lock(&m);       /* enter critical section */
   data = data + 1;
   pthread_mutex_unlock(&m);     /* leave critical section */
   ...
   return NULL;
}

void *T2(void *arg) {
   ...
   pthread_mutex_lock(&m);
   data = data * 4;
   pthread_mutex_unlock(&m);
   ...
   return NULL;
}

main() {
   pthread_mutex_init(&m, NULL);
   pthread_create(&thread1, NULL, T1, NULL);
   pthread_create(&thread2, NULL, T2, NULL);
   ...
   pthread_join(thread1, NULL);
   pthread_join(thread2, NULL);
}
Synchronization
Threads, as processes, need to synchronize on conditions. Let us assume two threads
one of which produces some data (producer) and needs to inform another thread
that it has finished this task so that the second thread can retrieve and process this
data (consumer). The consumer thread cannot proceed before the producer declares the availability of data using some signaling method. Semaphores are data structures consisting of an integer and, commonly, a queue of processes associated with them. Processes and threads can perform two main atomic actions on semaphores: wait, in which the caller may wait or continue depending on the condition it intends to wait for, and signal, which signals the completion of the waited-for event, possibly freeing a process waiting for that event.
The C code provided below shows how two threads synchronize using two
semaphores sema1 and sema2. We have a thread producer which inputs some data
and writes this data to shared memory location data. It then signals the other thread
consumer which reads this data and processes it. Synchronization is needed so that
unread data is not overwritten by the producer and also data is not read and processed
more than once by the consumer. The semaphore sema1 is initialized to 1 in the main thread, thereby allowing the producer thread, which executes a wait on it, to continue without waiting in the first instance, since it has nothing to wait for initially. The second semaphore sema2 is initialized to 0 since we do not want the consumer to proceed before the producer signals the availability of data. Semaphores can also be used for mutual exclusion, but employing mutual exclusion variables for that purpose should be preferred to enhance the readability of the program and also for performance.
main() {
   sem_init(&sema1, 0, 1);    /* 1: the producer may proceed initially */
   sem_init(&sema2, 0, 0);    /* 0: the consumer waits until data is available */
   pthread_create(&t1, NULL, producer, NULL);
   pthread_create(&t2, NULL, consumer, NULL);
   ...
}
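The producer and consumer thread functions are not shown above; a minimal sketch consistent with the description is given below and, together with the main thread above, forms a complete program. The number of items produced and the "processing" step (a simple printf) are placeholders chosen for the example.

#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

int data;                  /* shared location written by the producer, read by the consumer */
sem_t sema1, sema2;
pthread_t t1, t2;

void *producer(void *arg) {
   int i;
   for (i = 0; i < 10; i++) {
      sem_wait(&sema1);              /* wait until the previous item has been consumed */
      data = i;                      /* "produce": store the next value in shared memory */
      sem_post(&sema2);              /* signal the consumer that data is available */
   }
   return NULL;
}

void *consumer(void *arg) {
   int i;
   for (i = 0; i < 10; i++) {
      sem_wait(&sema2);              /* wait until the producer has written new data */
      printf("consumed %d\n", data); /* "process": simply print the value */
      sem_post(&sema1);              /* allow the producer to overwrite data */
   }
   return NULL;
}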
A multithreaded program using the POSIX thread library (-lpthread) can be compiled and linked as follows in a UNIX environment, for example for a source file named pi.c:

gcc -o pi pi.c -lpthread

The following example approximates PI with multiple threads by computing the area under the curve 4/(1 + x^2) between x = 0 and x = 1.
#include <stdio.h>
#include <pthread.h>
#define n 100
#define n_threads 1024
pthread_t threads[n_threads];
pthread_mutex_t m1;
/******************************************************
thread code to be invoked n_threads times
******************************************************/
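/* The worker code is not reproduced in this listing; the following is a possible
   sketch consistent with the text: thread i integrates 4/(1+x^2) over its own n
   slices of [0,1] and adds its partial area to a global sum protected by m1.
   The accumulator name total_area is an assumption, not taken from the original. */
double total_area = 0.0;

void *worker(void *arg)
{  long id = (long) arg;                          /* thread index passed by main */
   double width = 1.0 / (double)(n * n_threads);  /* n slices per thread over [0,1] */
   double x, my_sum = 0.0;
   int j;
   for (j = 0; j < n; j++) {
      x = width * ((double)(id * n + j) + 0.5);   /* midpoint of the slice */
      my_sum += 4.0 / (1.0 + x * x);
   }
   pthread_mutex_lock(&m1);                       /* protect the shared accumulator */
   total_area += my_sum * width;
   pthread_mutex_unlock(&m1);
   return NULL;
}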
/******************************************************
main thread
******************************************************/
main()
{  int i;
   pthread_mutex_init(&m1, NULL);
   for(i = 0; i < n_threads; i++)
      pthread_create(&threads[i], NULL, worker, (void *)(long) i);
   for(i = 0; i < n_threads; i++)
      pthread_join(threads[i], NULL);
   printf("Approximate PI is: %f\n", total_area);  /* total_area: global sum filled by the workers */
}
Figure 4.16 displays the PI curve between x = 0 and x = 1, where we assume five parallel threads for simplicity. Precision can be improved by increasing the number of threads and/or by calculating the area of a slice using the mean value of its border values, as shown in the figure.
Threads can be used conveniently for parallel processing in modern multi-core
processors. They communicate using shared memory and hence do not need to send
messages thereby preventing data communication delays at the expense of overheads
caused for the protection of shared memory. OpenMP is a widely used parallel
processing platform that uses multithreading [12,15].
Message passing interface (MPI) standard specifies a message passing library of rou-
tines to provide a portable, flexible, and efficient method to write message passing
programs [10,11]. Although it is primarily targeted for distributed memory programs,
the later developments and versions of MPI provide distributed memory, shared
memory, or hybrid implementations. Parallel Virtual Machine (PVM) is another tool
widely used for parallel processing [14]. MPI consists of routines to send and receive data among processes and supports various other modes of communication such as broadcasting a message to all processes in the system or multicasting, in which a message is sent to a group of processes. MPI programs start by initializing the environment, per-
forming parallel computations by sending and receiving messages among processes,
and then terminating. In order to define the set of processes that will run, objects
called communicators are employed. Each process in MPI belongs to a communica-
tor and inside a communicator, each process has a unique identifier called its rank.
The following C routines are used to initialize the parallel computing environment
and then terminate.
• MPI_Init(int *argc, char **argv): Inputs a pointer to the number of arguments and
a pointer to the argument vector. These parameters are passed from the command
line to specify the number of processes to be invoked.
• int MPI_Comm_size(MPI_Comm comm, int *size): The number of parallel processes in the communicator comm is returned in size.
• int MPI_Comm_rank(MPI_Comm comm, int *rank): Returns the rank of a
process in the group comm. The ranks are ordered 0,..., size-1.
• int MPI_Finalize(void): This routine is called by each process before exiting to
clean up the library and terminate.
The main procedures for data transfer are the send and receive routines with many variations. The blocking send and the blocking receive are specified below:

int MPI_Send(void *buf, int count, MPI_Datatype datatype, int dest,
             int tag, MPI_Comm comm);
int MPI_Recv(void *buf, int count, MPI_Datatype datatype, int source,
             int tag, MPI_Comm comm, MPI_Status *status);

where buf is the address of the send/receive buffer, count is the number of elements in the buffer, datatype is the data type of each send/receive buffer element, dest/source is the integer rank of the destination/source, tag is the type of the message, comm is the communicator, and status is the status object to be examined. Note that two processes may use the message tag to perform different actions for different tags. We can now write a simple MPI application of two processes sending and receiving messages in the C programming language as below:
#include <mpi.h>
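/* The rest of mess.c is not reproduced in the original; the following is a minimal
   sketch consistent with the description: rank 0 sends a message and rank 1 receives
   it. The data value and the tag are assumptions made for the example. */
#include <stdio.h>

int main(int argc, char *argv[])
{  int rank, x = 42;                         /* x: arbitrary data to transfer */
   MPI_Status status;

   MPI_Init(&argc, &argv);
   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
   if (rank == 0)                            /* rank 0 is the sender */
      MPI_Send(&x, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
   else if (rank == 1) {                     /* rank 1 is the receiver */
      MPI_Recv(&x, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);
      printf("received %d\n", x);
   }
   MPI_Finalize();
   return 0;
}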
The same code is run on all processors in this SPMD model which is very common
in MPI applications due to the difficulty in writing different codes for different
processes. The instructions to be run by each process are separated by the use of
process identifiers. In this example, rank 0 is the sender and rank 1 is the receiver.
We need to compile and run this code, named mess.c, in a UNIX environment as follows:

mpicc -o mess mess.c
mpirun -np 8 mess

The first line is the compiling and linking command using the mpicc compiler/linker and the second line starts running the executable program with eight processes. Note that this argument of eight processes is passed to the MPI runtime, which MPI_Init uses to initialize the environment and start these identical processes in a hardware environment we do not know. In fact, we could have installed MPI on a single computer and the eight processes could run on the same computer. The following example displays the use of MPI to calculate PI using the same method of finding the area under the curve 4/(1 + x^2) between x = 0 and x = 1 as we did with POSIX threads.
#include <mpi.h>
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[])
{
   int i, my_id, n_slices, n_procs, n_sl;
   double part_area, total_area = 0.0, width, x, my_area;
   MPI_Status status;

   MPI_Init(&argc, &argv);
   MPI_Comm_size(MPI_COMM_WORLD, &n_procs);
   MPI_Comm_rank(MPI_COMM_WORLD, &my_id);

   if (my_id == 0) {                  /* supervisor: distribute work, collect results */
      printf("Enter the number of slices for each process:");
      scanf("%d", &n_slices);
      for (i = 1; i < n_procs; i++)   /* send the slice count to each worker */
         MPI_Send(&n_slices, 1, MPI_INT, i, 0, MPI_COMM_WORLD);
      for (i = 1; i < n_procs; i++) { /* collect the partial areas from the workers */
         MPI_Recv(&part_area, 1, MPI_DOUBLE, MPI_ANY_SOURCE,
                  1, MPI_COMM_WORLD, &status);
         total_area += part_area;
      }
      printf("Approximate PI is: %f\n", total_area);
   }
   else {                             /* worker: integrate its own share of [0,1] */
      MPI_Recv(&n_sl, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
               MPI_STATUS_IGNORE);
      my_area = 0.0;
      width = 1.0 / (double)(n_sl * (n_procs - 1));   /* total slices over all workers */
      for (i = (my_id - 1) * n_sl; i < my_id * n_sl; i++) {
         x = width * ((double)i + 0.5);               /* midpoint of slice i */
         my_area += 4.0 / (1.0 + x * x);
      }
      my_area *= width;
      MPI_Send(&my_area, 1, MPI_DOUBLE, 0, 1, MPI_COMM_WORLD);
   }
   MPI_Finalize();
   return 0;
}
Each worker process computes the area for its share of the interval and sends it to the supervisor, which adds the partial areas and outputs the result. We could have the supervisor also involved in the computation of PI (see Exercise 6).
4.11 Conclusions
We have reviewed the parallel computing fundamental concepts in this chapter with
emphasis on design methods for parallel graph algorithms. Parallel algorithms may
use the shared memory or the message passing model in a general sense. The PRAM
model is an idealistic method to design parallel algorithms in the shared memory
platform; moreover, it provides an abstract model hiding details of implementation
and hence can be used for high-level design and comparison of shared memory
parallel algorithms. Access mode to shared memory is important in this model,
and reads and writes can be performed in concurrent or exclusive modes. Message
passing model is suitable for distributed memory processors which communicate and
synchronize by sending and receiving messages only. Basic communication modes
in a parallel computing system may be classified based on the source and destination
of the data transfer. Grouping the communications under operations such as one-
to-all or all-to-all modes eases the burden of writing a parallel algorithm since we
can simply specify the needed mode rather than writing the actual algorithm for
communications. Design methods for parallel algorithms may be broadly classified
as data or functional decomposition. Data is decomposed into a number of sets to be
processed by parallel processors using the same algorithm in the first and different
tasks are allocated to different processors in the second method.
We saw graphs require special methods of parallel computing and graph contrac-
tion is a commonly used approach to enable parallel graph operations. The graph
under consideration is made smaller at each step, using methods such as edge or
star contraction. Graph contraction can be performed in parallel and also solving the
problem in a smaller graph can be done more conveniently. Some graph problems
can be solved efficiently using sequential algorithms; however, the same problems
do not have simple parallel algorithmic solutions due to the dependencies involved.
Randomization and symmetry breaking methods provide simple and elegant par-
allel graph algorithms for various graph problems such as maximal independent
sets and minimum spanning trees. Data partitioning for graphs frequently involves
dividing the adjacency matrix row-wise, column-wise, or block-wise to a number of
processors. Relations among parallel tasks can be depicted by a task dependency graph, and allocation of these tasks to processors is a variation of the graph partitioning problem.
This approach requires task computation times and their interaction to be known in
advance which may not be realistic in many real-life applications. Dynamic load bal-
ancing involves keeping the loads on processes even at runtime. Finally, we reviewed
two commonly used platforms for shared memory and distributed memory program-
ming: POSIX threads provide a convenient API to implement shared memory parallel
algorithms and MPI is widely used for distributed memory programming.
[Fig. 4.17: HARDWARE level — Shared Memory vs. Distributed Memory; OPERATING SYSTEM level — locks and semaphores vs. messages (send, receive)]
At a more abstract level, we can view the modeling of the whole process of paral-
lel computing at four related levels: hardware, operating system, programming, and
algorithmic levels as shown in Fig. 4.17. In all these layers of design, the main distinc-
tion is whether shared or distributed memory is used. The operating system and middleware provide different services at these levels depending on this distinction. The main problem with
shared memory approach is the protection of memory during concurrent accesses
and this is provided by the operating system constructs such as semaphores and
locks. At the programming level, threads which are lightweight processes are widely
used for shared memory programming. The POSIX thread library provides all neces-
sary routines for thread synchronization and mutual exclusion. The MPI standard is
widely used for distributed memory parallel computing with a wide range of required
message passing procedures. At algorithmic level, the PRAM model which assumes
shared memory is not practical as it assumes infinitely large shared memory and
infinite number of processors; however, it is used to compare various parallel algo-
rithms for the same problem. We will mostly consider distributed memory platforms
in our analysis of parallel graph algorithms in this book to provide implementable
solutions, except in a few places where we describe PRAM algorithms.
Exercises
Fig. 4.20 A unidirectional ring of eight processes used to calculate sum of integers stored at each
node. The integers stored are shown next to nodes and the messages contain the partial sums
transferred
References
1. Blelloch G (2015) Algorithm design: parallel and sequential, draft book
2. http://www.nvidia.com/object/cuda_home_new.html
3. Flynn MJ (1972) Some computer organizations and their effectiveness. IEEE Trans Comput C-21(9):948–960
4. Foster I (1995) Designing and building parallel programs: concepts and tools for parallel soft-
ware engineering. Addison-Wesley, Boston
5. Gantt HL (1910) Work, wages and profit, The engineering magazine, New York
6. Grama A, Karypis G, Kumar V, Gupta A (2003) Introduction to parallel computing, 2nd edn.
Addison-Wesley, Boston
7. Karypis G, Kumar V (1995) Multilevel k-way partitioning scheme for irregular graphs. Tech-
nical Report TR 95-064, Department of Computer Science, University of Minnesota
Abstract
A distributed system consists of a number of computational nodes connected
by a communication network. The nodes of a distributed system cooperate and
communicate to achieve a common goal. We first describe the types of distributed systems and the communication and synchronization methods used in these systems. We then investigate a few fundamental distributed algorithms including spanning
tree construction, broadcast and convergecast operations over a spanning tree, and
leader election.
5.1 Introduction
The topology of a distributed system is commonly dynamic, in which nodes and links may be inserted to or deleted from the network due to failures or movement of the nodes, as in the case of a mobile network. A rescue operation in which the nodes move is an example of such a mobile network.
A distributed system can be conveniently modeled by a graph in which vertices
of the graph represent the computational nodes and an edge between two nodes
represents a communication facility between them. The algorithms running at the
nodes of a graph representing the distributed system are commonly termed distributed
graph or network algorithms. Note that distributed memory-employing algorithms
in a parallel processing environment are also called distributed algorithms in the
literature but in the context of this book, we will use distributed (graph) algorithms to
mean algorithms running in a network represented by a graph. We will see designing
a distributed version of a sequential graph algorithm is not a trivial task. We start
this chapter by describing common distributed system platforms. We then investigate
distributed graph algorithms, classify them, and show the operation of some basic
distributed graph algorithms.
Wireless networks are widely used due to the ease and speed of their deployment. Two types of wireless networks have gained importance recently: mobile ad hoc networks and wireless sensor networks.
A wireless sensor network (WSN) consists of a network of sensors with radio transceivers and controllers. These networks, consisting of physically tiny nodes in most cases, have many applications including environmental control, e-health, and intelligent buildings. A sensor node has very limited power, and sensors are typically controlled by a central node called the sink, which has more computational capabilities. Data recorded by sensor nodes is collected at the sink for further processing. Routing data messages to the sink efficiently using network protocols as well as keeping the network connected are the main issues to be addressed in WSNs. Sensor networks are mostly stationary and require low-power operation, which is more critical here than in MANETs.
A MANET or a WSN can be conveniently modeled by a graph, and problems such as routing and connectivity can then be transferred to the graph domain to be solved with methods developed for graphs. For example, the efficient routing problem can be solved with the aid of methods for finding the shortest distance between two nodes of a weighted graph. However, these problems should now be solved in a distributed manner without any global knowledge, which makes the problem harder than an ordinary graph problem. A node in a graph representing a WSN can only communicate with its neighbors, but we need to reach a global decision using the
collected data from all of the sensors. Figure 5.1 displays a wireless network with
nodes that can transmit and receive radio signals within a radius of r meters. We can
then connect the nodes that are within transmission ranges of each other by an edge
and obtain the graph shown.
5.3 Models
Messages are crucial for the correct operation of a distributed algorithm. We can define the widely accepted message passing model of a distributed system formally as in [1,6].
Finite-state machines (FSMs) are widely used to design algorithms, network protocols, and sequence analysis in bioinformatics. Formally, a deterministic FSM is a quintuple (I, S, S_0, δ, F) where I is the set of inputs, S is the finite set of states, S_0 ∈ S is the initial state, δ : S × I → S is the state transition function, and F ⊆ S is the set of final (accepting) states.
The next state of an FSM is determined by its current state and the input it re-
ceives. The same input may cause different actions in different states. As an everyday
example, let us consider students in a school who for simplicity can have only two
states: in_class or out_class meaning they can be either in the class or out of the
class. When the bell rings in in_class state, it means they can go out and the bell
ringing in out_class state means they should go in the class. An FSM diagram or a
state transition diagram is a visual aid to understand the behavior of an FSM. The
circles in such a diagram denote its states and transitions between states are shown
by directed arcs which are labeled as a/b where a is the set of inputs received and b is
the set of outputs produced when these inputs are received. A double circle denotes
the accept state.
A state table provides an alternative way of representing an FSM. It has the states of the FSM as rows and the inputs as columns, and the elements of the table can be the next FSM state and the actions to be taken when the input is received. In a Moore machine type of FSM, the output is determined by the current state only, whereas in a Mealy machine type of FSM the output depends on the input received as well as the current state.
Example 5.1 We will design a simple FSM for an elevator that can only go to floors
0, 1, and 2. There are two buttons in the elevator: up and down which take the elevator
up and down respectively. We can associate the current state of the elevator with the
floor at which it currently stays; therefore we have three states 0, 1, and 2. At each state, the up or down button can be pressed, represented by two inputs: up by 0 and down by 1. The FSM diagram for this example is shown in Fig. 5.2, which shows all state transitions, considering there will be two inputs at each state. We cannot go down from state 0 and going up from the second floor is also not allowed, which is shown by self-loops at these states.
We can now form the state table for this FSM with entries showing the next state
of the FSM when the input shown in columns is received at state shown in rows
as shown in Table 5.1. This way of expressing an FSM provides a very convenient
way of writing its algorithm. We can form a 2-D array with each element being a
function pointer. We then define functions to be performed for each table entry; for
example, receiving “0” (up) at “1” (first floor) state should cause a transition to state
2 (elevator should move to second floor) which is realized by changing the current
state to “2”. The running of the algorithm is then straightforward; every time an input
is received, we activate the function shown by the FSM table entry as shown by the
C programming language code below.
#include <stdio.h>
#define UP 0
#define DOWN 1
void (*fsm_tab[3][2])();     /* FSM table: one action function per (state, input) pair */
int curr_state;              /* current state: the floor the elevator is at */
int input;
void act00(){curr_state=1;}  /* at floor 0, up pressed: go to floor 1 */
void act01(){curr_state=0;}  /* at floor 0, down pressed: stay at floor 0 */
void act10(){curr_state=2;}
void act11(){curr_state=0;}
void act20(){curr_state=2;}  /* at floor 2, up pressed: stay at floor 2 */
void act21(){curr_state=1;}
main()
{  curr_state=0;             // initialize curr_state
   fsm_tab[0][UP]=act00;     // initialize FSM table
   fsm_tab[0][DOWN]=act01;
   fsm_tab[1][UP]=act10;
   fsm_tab[1][DOWN]=act11;
   fsm_tab[2][UP]=act20;
   fsm_tab[2][DOWN]=act21;
   while (1) {               /* every time an input is received, activate the table entry */
      scanf("%d", &input);
      (*fsm_tab[curr_state][input])();
   }
}
The algorithms that run at the nodes of a distributed system need to synchronize to
accomplish a common goal. This process can be performed at various levels. Let us
see how synchronization can be handled locally at three main levels of the hierarchy: the hardware, the operating system, and the application. At the lowest level, the hardware
may provide synchronization at a certain number of clock ticks periodically. At a
higher level, one of the main tasks of local operating systems at each node is the
synchronization of the processes residing at that node. Moreover, this function can be
extended to processes running at the nodes of the distributed system at the application
level.
However, we need a mechanism to provide synchronization among the nodes
which should be translated to local synchronization mechanisms described above. A
very commonly used method in a distributed system is synchronization via messages.
In this so-called message passing model, each local operating system or middleware
provides two basic primitives; send and receive for sending and receiving messages.
These procedures can be executed in blocking or non-blocking fashion. A blocking
send stops the caller until an acknowledgment from the receiver is received. A block-
ing receive means the receiver should wait until a message is received. The blocking
receive maybe selective in which a message from a particular sender is waited and
execution is resumed only after this happens. It is common practice to employ a
non-blocking send with a blocking receive since the sent message is assumed to be
delivered correctly while the actions of a receiver depend on whether the message is
received and also its contents and thus a blocking receive is frequently used.
Sending and receiving are commonly employed indirectly using data structures
called ports or mailboxes. These are depository places for messages, and placement
or removal of messages can be performed asynchronously from these structures.
In a distributed system, the locally executed send procedure typically deposits the
message in the mailbox of the network process which appends the necessary network
headers, and transfers the message through lower network layer software to the
network. The receiving network process removes network headers and deposits the
message in the mailbox of the receiver which takes it from there as shown in simplified
form in Fig. 5.3. There are three main software modules at each node of a distributed
system: the operating system (OS), the network protocol stack (N/W), and the application (APP), as shown in this figure.
Fig. 5.3 Distributed communication via mailboxes. Process p_i at node i sends a message to process p_j at node j using mailboxes via network processes n_i and n_j
In the synchronous model, a distributed algorithm proceeds in rounds, and each process performs the following steps in a round:

1. send message.
2. receive message.
3. do some computation.
We assume here that a process sends the results of its computation from round k−1 in the kth round. This order is not strict, however; we could have a compute-send-receive sequence, which would mean each process now computes new results in round k based on what it has received in the previous round and sends the new results in the current round. Distributed algorithms that do not have these synchronously executed rounds and work asynchronously are called asynchronous algorithms. Detecting the
termination of distributed algorithms is needed to stop the algorithm when a certain
condition is met and this is not a trivial task. Although starting and ending a round
cause overheads in terms of needed extra messages, designing synchronous distrib-
uted algorithms is more straightforward than asynchronous algorithms in general.
The asynchronous algorithms require more complex control logic and detection of
termination in such algorithms is also more difficult.
Yet another distinction is whether a single initiator starts the distributed algorithm or there is more than one initiator. A single initiator that also controls the overall synchronization of the rounds results in a synchronous single-initiator (SSI) algorithm.
In the case of an SSI algorithm, a previously built spanning tree to transfer control
messages can be conveniently used. Based on foregoing, a possible SSI algorithm
template is sketched in Algorithm 5.1. All processes start the kth round when they
receive the start message over the spanning tree T , which is basically a broadcast
operation over T as we will see shortly. The three actions in the round are sending
results of the previous round to all neighbors, receiving results of the previous round
from neighbors, and preparing new results for the next round. When a process finishes
executing a round, it waits for all of its children in T to finish before it can send
the stop message to its parent. When the root of the spanning tree T receives stop
message from all of its children, the round k is over and the root can now start the
round k + 1. We will use this structure frequently while designing distributed graph
algorithms.
• Time Complexity: Time complexity is the number of steps required for the distrib-
uted algorithm to finish as in a sequential algorithm. For synchronous distributed
algorithms, we would be mostly interested in the number of rounds as time com-
plexity.
• Message Complexity: This parameter is commonly considered as the dominant
cost of a distributed algorithm since it directly shows the utilization of the network
and indicates synchronization costs among the nodes of the network. Transferring
a message over a network is orders of magnitude more costly than doing local
computations.
• Bit Complexity: The length of a message may also affect the performance of
a distributed algorithm, especially if message length increases as the message
traverses the network. For a large network modeled by a graph with many vertices
and edges, bit complexity may be significant which directly affects the network
performance.
• Space Complexity: This is the required storage at a node of the distributed system
for the algorithm under consideration.
We are now ready to design and implement simple distributed graph algorithms.
We will describe sample basic algorithms which follow a logical sequence. The first
algorithm uses a technique named flooding to send a message from a node of the
graph representing the network to all other nodes. We then make use of this algorithm
to build a spanning tree of the network which can be used for efficient broadcast and
convergecast of messages in the network as described next.
Our aim is to send a message from a single node to all nodes in the graph. This operation is called broadcast and has many applications in real networks, for example to inform all nodes of an alarm condition that occurs at a node. In the simplest case, we can have the following rules as a first attempt to solve this problem: the initiator p_i sends the message msg to all of its neighbors, and any node that receives msg sends it to all of its neighbors.
This algorithm works fine and all nodes will receive message msg sent by pi
eventually. However, we can obtain a more efficient algorithm with fewer messages
transferred between the nodes by a simple modification: A node sends msg to its
neighbors only when it receives it for the first time. This way, duplicate transmission
along an edge of the graph in the same direction is prevented. We now need a way
to detect whether a message is received first time or not which can be implemented
simply using a variable such as visited which is false initially and becomes true
when msg arrives for the first time. Nevertheless, this modified algorithm is simple
to implement by an FSM having two states as shown in Fig. 5.4, which will also aid
us to understand the use of FSMs in distributed algorithms.
We can implement this algorithm based on the FSM as shown in Algorithm 5.2.
When the message msg arrives for the first time, the VISITED state is entered and
any further receptions of msg are ignored.
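As a concrete illustration, here is a minimal Python sketch of the improved flooding scheme; it is not the book's Algorithm 5.2, message delivery is simulated with a single FIFO queue, and the function name flood and the adjacency-dictionary representation are choices made only for this example.

```python
from collections import deque

def flood(adj, source, msg):
    """Simulate flooding of msg from source over an undirected graph.

    adj: dict mapping each node to a list of its neighbors.
    A node forwards msg to its neighbors only the first time it receives it,
    mirroring the IDLE -> VISITED transition of the FSM.
    Returns the number of message transfers performed.
    """
    visited = {v: False for v in adj}       # every node starts in the IDLE state
    queue = deque()                         # in-transit messages as (sender, receiver)
    transfers = 0

    visited[source] = True                  # the source enters VISITED and starts the flood
    for w in adj[source]:
        queue.append((source, w))

    while queue:
        u, v = queue.popleft()
        transfers += 1
        if not visited[v]:                  # first reception: forward to all neighbors
            visited[v] = True
            for w in adj[v]:
                queue.append((v, w))
        # otherwise the duplicate copy of msg is simply ignored

    return transfers

# Every node is reached and each edge carries msg at most twice (once per direction).
graph = {'a': ['b', 'c'], 'b': ['a', 'c', 'd'], 'c': ['a', 'b'], 'd': ['b']}
print(flood(graph, 'a', 'alarm'))           # 8 transfers for m = 4 edges
```

Running it on the small example graph delivers msg to every node with at most two transfers per edge, in line with the O(m) message complexity discussed below.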
Analysis
A careful look at this algorithm reveals that each edge of the graph will be traversed
at most twice, once in each direction when both nodes at the ends of an edge start
sending the message msg concurrently. Therefore, message complexity is O(m).
Assuming there is at least one message transfer at each time unit, time taken by this
algorithm is the longest distance between any two vertices of the graph which is its
diameter and thus, time complexity is Θ(diam(G)).
We can design a spanning tree construction of a network using the Flooding algorithm
with few modifications. Building a spanning tree in a network environment means
each node knows its parent and its children in the general sense. We will not attempt
to store all of the tree structure at a special node or at each node of the graph since
parent/children relationship at each node is adequate for transferring messages over
the spanning tree. We have a single initiator as in the Flooding algorithm and this
node becomes the root of the spanning tree to be formed. The first modification we
have is to assign the sender j of the message msg( j) as the parent of the receiver i if
msg( j) is received for the first time. Since we also require the parent to be aware of
its children, node i should send an acknowledgment message ack(i) to j to inform
j of this situation. Otherwise, if node i already has a parent, meaning it has been
visited before, it sends back a negative acknowledgment message nack(i) to node j.
We have, therefore, three types of messages: check, ack, and nack. Determining the types of messages is crucial in the design of distributed graph algorithms; moreover, if we are to use an FSM, the transitions between states are triggered by these messages. Let us modify the FSM of Fig. 5.4 to reflect what we have been discussing. We can see that the states may remain the same, since a node can be either in the IDLE or the VISITED state as before. Based on its state and the type of the message received, a node may need to take different actions. The modified FSM is shown in Fig. 5.5, with the VISITED state having all possible message types as input now.
Fig. 5.6 A spanning tree constructed in a graph using flooding. The branch (g, b) is on the tree but (g, a) is not, since the check message (c) from node a arrives at g later than the c from b and is replied to by a nack (n) message. A similar situation is depicted for branch (e, d), where the message c from node d is replied to by an ack (a) message and (d, e) is included in the tree
We could have easily implemented this algorithm without using an FSM, since whether a node has a parent or not effectively indicates its state as IDLE or VISITED. With this in mind, the algorithm is shown in Algorithm 5.4 as in [2]. However, for complicated distributed algorithms, using FSMs eases the design and implementation.
Analysis
Each edge of the graph will be traversed at least twice by check/ack or check/nack
message pairs and at most four times when two nodes start to send check messages
to each other simultaneously. Therefore, message complexity of this algorithm is
O(m). The depth of the tree constructed will be at most n − 1, assuming a linear
network is built. If there is at least one message transfer per unit time, time complexity
is O(n).
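The following Python sketch simulates the check/ack/nack exchange sequentially with one FIFO queue; it is an illustrative approximation rather than the book's Algorithm 5.4, and the resulting tree is just one of the trees the distributed algorithm could produce, since real message delays may differ.

```python
from collections import deque

def flooding_spanning_tree(adj, root):
    """Simulate spanning tree construction with check/ack/nack messages.

    adj: dict node -> list of neighbors. Message delivery order is modeled
    with a single FIFO queue of (kind, sender, receiver) triples, so the
    tree obtained is one possible outcome. Returns (parent, children).
    """
    parent = {v: None for v in adj}
    children = {v: set() for v in adj}
    queue = deque()

    parent[root] = root                        # the initiator is its own parent
    for w in adj[root]:
        queue.append(('check', root, w))

    while queue:
        kind, u, v = queue.popleft()           # u is the sender, v the receiver
        if kind == 'check':
            if parent[v] is None:              # first check received: u becomes v's parent
                parent[v] = u
                queue.append(('ack', v, u))
                for w in adj[v]:
                    if w != u:
                        queue.append(('check', v, w))
            else:                              # already visited: refuse
                queue.append(('nack', v, u))
        elif kind == 'ack':                    # receiver v records sender u as a child
            children[v].add(u)
        # nack messages need no action in this sketch

    return parent, children

graph = {'a': ['b', 'g'], 'b': ['a', 'c', 'g'], 'c': ['b', 'd'],
         'd': ['c', 'e'], 'e': ['d', 'f'], 'f': ['e', 'g'], 'g': ['a', 'b', 'f']}
print(flooding_spanning_tree(graph, 'a'))
```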
We look into these two operations, broadcast and convergecast, in this section. A related activity is multicast, in which a message is delivered to only a specified subset of processes.
Broadcast over a Spanning Tree
For the broadcast operation, we will assume a graph represents the network of the
distributed system and a spanning tree T is already built by an algorithm similar
to what we have discussed. The broadcast is initiated by a node by sending msg to
all of its children. Any node on the tree T that has children simply forwards msg
to all of its children. Since msg is transferred only over tree edges, the number of
messages will be n − 1 for a graph with n vertices. Time taken will be the depth
of T , assuming concurrent sending of messages at each level. Depth of T can be a
maximum of n − 1 assuming a linear network.
Convergecast over a Spanning Tree
In certain networks, data from all nodes are to be collected at a node with more
capabilities and this special node can then analyze and evaluate all of the data, provide
reports containing statistics which can be transferred to more advanced computation
centers or users for further processing. This situation is commonly encountered
in wireless sensor networks where data sensed needs to go through these steps of
operation. Collecting data is very much simplified when a spanning tree constructed
beforehand is used. In this case, the leaves of the tree send their data to their parents,
the parents combine their own data with those of leaves, and send these to their
parents. An intermediate node may in fact perform some simple operation on data
such as taking average or finding extreme values. This way, data sent upwards in the
tree does not have to get much larger at each level. This process of gathering called
convergecast continues until all data is collected at the special node, commonly
called the sink in sensor networks. Algorithm 5.5 shows the pseudocode for the
convergecast process over a spanning tree. Leaves of the tree start the algorithm and
any intermediate node in the tree should wait until data from all of its children are
received before combining these data with its own to be sent to its parent as realized
at line 8 of the algorithm. The termination condition for the root of the tree is met
when it receives the convergecast messages from all of its children at line 12. For all
others, termination is on line 17 when they send their data to their parents.
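A compact way to see the convergecast logic is the recursive sketch below; it is not the message-passing Algorithm 5.5, and the children dictionary, the per-node data values, and the use of summation as the combining operation are assumptions made only for this example.

```python
def convergecast(children, data, node):
    """Collect data over a spanning tree rooted at 'node'.

    children: dict node -> list of children in the tree.
    data: dict node -> local value held by each node.
    Each node combines its own value with the values reported by its
    children before passing the result to its parent; the combination here
    is a sum, but it could equally be a min, max, or average.
    """
    total = data[node]
    for child in children.get(node, []):
        total += convergecast(children, data, child)   # child reports upward
    return total

# The root (sink) ends up with the aggregate of all node values.
tree = {'g': ['b', 'a', 'f'], 'b': ['c'], 'c': ['d'], 'd': ['e']}
values = {v: 1 for v in 'abcdefg'}
print(convergecast(tree, values, 'g'))   # 7 nodes, so the sum is 7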
Message and time complexities for this algorithm are the same as those of the broadcast algorithm, by similar reasoning. Figure 5.7 shows the operation of the convergecast algorithm using the spanning tree built in Fig. 5.6. The messages are labeled with a pair (a, b), where a is the time frame and b is the duration of the message. We can see that the highest level of the tree finishes convergecast in 5 time units, as this is the longest duration, followed by 6 units at level 2 and 2 units at level 1, for a total of 13 time units.
The main idea of this algorithm is that any node detecting the failure of the current leader initiates the algorithm by sending an election message containing its identifier to its neighbor on its right, assuming a clockwise unidirectional ring. A node that receives this message changes its state to ELECT. If the identifier in the message is greater than its own identifier, it simply passes the election message to its next neighbor. Otherwise, it replaces the identifier in the message with its own, which is greater, and sends the message to its next neighbor. We have two messages in this
example:
• election: Sent by any node that detects leader failure. This message may be sent
by more than one initiator.
• leader: The new leader broadcasts this message to notify all nodes that election
is over.
A process i that receives an election message carrying identifier j acts as follows:

• i > j: Process i replaces j with i in the message and passes it to the next node.
• i < j: Process i simply passes the message to the next node.
• i = j: Process i becomes the leader and sends the leader message to its next
neighbor.
In the last case, the election message originating from node i has returned to
itself meaning it has the highest identifier among all active processes. Basically, the
highest identifier is transferred between all functioning nodes and when the originator
receives its own message, it determines it is the leader and sends the leader message
to its neighbor which is then broadcast to all nodes by neighbor transfers. The FSM
for this algorithm is depicted in Fig. 5.8.
Analysis
The worst case happens when the nodes are ordered from smallest to largest with respect to their identifiers in the clockwise direction and all start the election concurrently in the anticlockwise direction. The message carrying the largest identifier traverses n links, the message carrying the second largest identifier traverses n − 1 links, and in total there will be Σ_{i=1}^{n} i = n(n + 1)/2 messages as shown in Fig. 5.9a.
Fig. 5.9 Ring leader election algorithm: worst and best scenarios. In a, each message by the
originator is tagged with the number of links it travels. For example message originating at node 7
is tagged with 7 since it goes through 7 edges back to node 7. The best case is depicted in b
The best case occurs for a
total of 2n − 1 messages when messages are transmitted in the clockwise direction as in Fig. 5.9b. In this case, even if all nodes start the election concurrently, their messages will be purged by the next nodes, accounting for n − 1 message transfers in total, and only the message of the node with the highest identifier, which is 7 in this case, will traverse the ring all the way back to the originator in n steps. The total number of messages is then 2n − 1, excluding the declaration message sent by the leader.
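The sketch below simulates these rules in Python; the function ring_election and its round-by-round message bookkeeping are illustrative only and do not reproduce the book's pseudocode. It follows the rules stated above: a larger identifier is forwarded, a smaller one is overwritten by the receiver's own identifier, and a node that receives its own identifier becomes the leader.

```python
def ring_election(ids, initiators):
    """Simulate the ring-based leader election described above.

    ids: node identifiers listed in ring order.
    initiators: indices of the nodes that detect the leader failure and start
    an election concurrently. Returns (leader identifier, message count),
    excluding the final leader declaration message.
    """
    n = len(ids)
    messages = 0
    leader = None
    # Each in-transit election message is (position of next receiver, carried id).
    in_transit = [((i + 1) % n, ids[i]) for i in initiators]

    while leader is None:
        next_round = []
        for pos, j in in_transit:
            messages += 1
            i = ids[pos]
            if j > i:                        # pass the larger identifier on
                next_round.append(((pos + 1) % n, j))
            elif j < i:                      # replace it with the node's own identifier
                next_round.append(((pos + 1) % n, i))
            else:                            # the message came back: this node wins
                leader = i
        in_transit = next_round

    return leader, messages

print(ring_election([3, 7, 1, 6, 2, 5, 4], initiators=[2, 4]))   # leader is 7
```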
A synchronous distributed algorithm typically runs in rounds, and the next round is not started until all nodes finish executing the current round. The synchronization at the beginning and end of a round is commonly realized by special messages sent by a special node. Distributed algorithms can be modeled by FSMs, which are mathematical models consisting of states and transitions between states, as we have outlined. We can design a distributed algorithm without an FSM, but for complicated algorithms, FSMs provide a neater algorithm with visual aid that is less error-prone than one that would otherwise involve many decision-making statements.
We then described some sample distributed graph algorithms, which include building a spanning tree of the graph, broadcast and convergecast operations over a spanning tree, and a leader election algorithm to find the new coordinator of nodes in a ring when the leader fails. We need to prove that a distributed algorithm correctly achieves what it is intended for; and the time, message, bit, and space complexities of a distributed algorithm are used to evaluate its performance. In general, message complexity is considered the dominant cost of a distributed algorithm.
Exercises
leaves of the spanning tree once they receive the broadcast message. Write the
pseudocode for this algorithm with comments and work out its time and message
complexities.
4. Show the execution of the ring election algorithm for the nodes shown in
Fig. 5.10. Assume nodes 2 and 5 find concurrently that the leader is not working
and decide to run an election.
5. In a fully connected graph with each node having a unique identifier, the bully algorithm may be used to elect a new leader. A node u that finds the leader is not functioning may start this algorithm by sending an election message to all nodes that have higher identifiers than itself. Any node v that receives this message sends back an ack message to node u, which then leaves the election. Node v now starts an election, and this process continues until there is one winner, which is the active node with the highest identifier. The new leader announces its win by broadcasting a special message to all nodes. Write the pseudocode for this algorithm and find its time and message complexities. Show its operation in a complete graph of 8 nodes where nodes 4 and 6 find simultaneously that leader 8 is down.
References
1. Attiya H, Welch J (2004) Distributed computing: fundamentals, simulations, and advanced
topics, 2nd edn. Wiley, New York
2. Erciyes K (2013) Distributed graph algorithms for computer networks. Springer computer communications and networks series. Springer, Berlin. ISBN-10: 1447151720
3. Foster I, Kesselman C (2004) The grid: blueprint for a new computing infrastructure. Morgan
Kaufmann, San Mateo
4. Mell P, Grance T (2011) The NIST definition of cloud computing. National Institute of Standards and Technology, US Department of Commerce, Special Publication 800-145
5. Tanenbaum AS, Steen MV (2007) Distributed systems, principles and paradigms, 2nd edn.
Pearson-Prentice Hall, Upper Saddle River. ISBN 0-13-239227-5
6. Tel G (2000) Introduction to distributed algorithms, 2nd edn. Cambridge University Press,
Cambridge
Part II
Basic Graph Algorithms
Trees and Graph Traversals
6
Abstract
A tree is a connected acyclic graph and a forest consists of trees. In this chapter, we
first describe the tree structure, algorithms to construct a spanning tree of a graph,
and tree traversal algorithms. Two main methods of graph traversal are depth-first
search and breadth-first search. We review sequential, parallel, and distributed
algorithms for these traversals along with their various applications.
6.1 Introduction
A tree is a connected acyclic graph and a forest consists of trees. Trees find many ap-
plications in computer science and real-life situations. For example, the organization
of a university or any establishment is typically shown by a tree, and family trees
illustrate the parental relationships between the individuals. In computer science,
trees are used for efficient data storage and tree-based algorithms find a wide range
of applications. A spanning tree of a graph is its tree subgraph that includes all of
the vertices of the graph. A graph may have a number of spanning trees. We have
described trees briefly in Chap. 2; now, we provide a more detailed analysis of trees
with related algorithms in this chapter. We start by defining the tree structure and
stating its properties. We then describe algorithms for constructing spanning trees
and tree traversals and briefly review special tree types.
Traversing all vertices or all edges of a graph in some order is required in various
applications, for example, to find all reachable vertices in a graph. The algorithms that
perform traversals may also be used as the building blocks of more complicated graph
algorithms. In an undirected graph traversal, all edges are considered, whereas only the outgoing edges from a node are considered in a directed graph. Two main methods of
graph traversal are the depth-first search and breadth-first search. In the first method,
we start from any vertex of a graph and go as deep as we can by visiting neighbors of
each visited vertex. The breadth-first search involves visiting all neighbors of a vertex
first, then visiting all neighbors of these neighbors and proceeding in this manner
until all vertices are visited. Both these methods produce spanning trees rooted at
the start vertex. We describe sequential, parallel, and distributed algorithms for both
of these approaches with their possible applications in this chapter.
6.2 Trees
A graph is a tree if it is connected and does not contain any cycles. A forest is a graph with no cycles. Every path is a tree, and a tree T is a path if and only if the maximum degree of T is 2. A tree can be rooted or unrooted. A designated node called the root in a rooted tree is at the top of the hierarchy and every other vertex of T has a path to the root; the tree is unrooted otherwise. A binary tree consists of nodes that have at most two children. The following statements equally define a tree T:

• T is connected and contains no cycles.
• Any two vertices of T are connected by a unique simple path.
• T is connected and has n − 1 edges, where n is the number of its vertices.
Definition 6.1 (level) The level of a vertex in a rooted tree is its distance to the root.
The level of a vertex is also called its depth in the tree.
Definition 6.2 (parent, child) A vertex v that is adjacent to vertex u and is the predecessor of u on the path from u to the root is called the parent of u, and u is called a child of v.

A vertex can have only one parent, as having more than one parent would produce a cycle and the resulting structure therefore would not be a tree.
Definition 6.3 (leaf, internal vertex, siblings) A leaf is a vertex of the tree that has
no children. An internal vertex of a tree has a parent and one or more children.
Siblings in a tree have the same parent.
The maximum level of a leaf is the height (or depth) of the tree, as such a leaf is the vertex farthest from the root.
MSTs find numerous applications and we will investigate sequential, parallel, and
distributed algorithms for MSTs in Chap. 8.
Definition 6.5 (m-ary tree (m ≥ 2)) An m-ary tree is a rooted tree in which every
vertex other than the leaves has at most m children. In a binary tree, m = 2.
Definition 6.6 (complete m-ary tree) A complete m-ary tree is an m-ary tree in
which every internal vertex of the tree has exactly m children and all leaves have the
same depth.
Definition 6.7 (ordered tree) In an ordered tree, there is a linear ordering of the
children of each node. This means we can identify the children as first, second, etc.
Definition 6.8 (center of a tree) A center of a tree T is a vertex v for which max_{u∈V} d(u, v) is minimum. A tree has either a single center or two adjacent centers with this property.
We can find center(s) of a tree by recursively removing its leaves until there are
one or two centers left. This procedure is illustrated in Fig. 6.2.
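A short Python sketch of this leaf-removal procedure is given below; the function name tree_centers and the adjacency-set representation are assumptions made for the example.

```python
def tree_centers(adj):
    """Find the center(s) of an unrooted tree by repeatedly removing leaves.

    adj: dict vertex -> set of neighbors describing a tree.
    Returns a sorted list with one or two center vertices.
    """
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    leaves = [v for v in adj if degree[v] <= 1]

    while len(remaining) > 2:
        new_leaves = []
        for leaf in leaves:                 # peel off the current layer of leaves
            remaining.discard(leaf)
            for u in adj[leaf]:
                if u in remaining:
                    degree[u] -= 1
                    if degree[u] == 1:      # u becomes a leaf in the next round
                        new_leaves.append(u)
        leaves = new_leaves

    return sorted(remaining)

# A path a-b-c-d-e-f has the two adjacent centers c and d.
path = {'a': {'b'}, 'b': {'a', 'c'}, 'c': {'b', 'd'},
        'd': {'c', 'e'}, 'e': {'d', 'f'}, 'f': {'e'}}
print(tree_centers(path))   # ['c', 'd']
```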
Theorem 6.1 An undirected graph G is a tree if and only if there is a unique simple
path between any two vertices of G.
Proof We will first assume G is a tree which means it has no cycles. Now, if there
are two simple paths between any vertex pair (u, v) in G, the total path from u to v
Fig. 6.2 Finding two centers of an unrooted tree. The leaves are recursively removed from a–c to
obtain the two centers
and then back to u would form a cycle; however, from the definition of a tree, we know G does not have a cycle, and therefore we have a contradiction. In the other direction of the statement, let us assume G is a graph in which any two vertices are connected by a unique simple path. Then G is connected, since a path exists between every pair of vertices. If G contained a cycle, two vertices on that cycle would be connected by two distinct simple paths, contradicting the uniqueness assumption. Hence G is connected and acyclic, and is therefore a tree.
Theorem 6.2 A tree with n vertices has n − 1 edges.

Proof We will prove this theorem by induction on n. For the base case, when n = 1, the trivial graph has no edges; therefore, the base case holds. Assume as the induction hypothesis that every tree with n vertices has n − 1 edges, and consider a tree T with n + 1 vertices and m edges. Removing a leaf node v together with its incident edge from T leaves a subgraph T' = (V', E'). Since we have not created a cycle in doing so and T' remains connected, T' is also a tree, with n vertices and m − 1 edges. By the induction hypothesis, m − 1 = n − 1, hence m = n, which is one less than the number of vertices of T, completing the induction.
Theorem 6.3 Any connected graph G with n vertices and n − 1 edges is a tree.
Proof We need to show G is acyclic. Let us assume the contrary, that G has at least one cycle, and iteratively remove edges from cycles until we obtain a graph G' which is acyclic and connected and is therefore a tree. We can conclude G' has n − 1 edges by Theorem 6.2. Since we removed at least one edge from G to obtain G', G must have had at least n edges, which contradicts the assumption that G has n − 1 edges.
In the pointer jumping method, each element of a linked structure replaces its pointer, in every step, with the pointer of the element it currently points to. This method can be conveniently used to find the root of a rooted tree as shown in Algorithm 6.1. After O(log d) steps, where d is the depth of the tree, all vertices of the tree point to the root. Note that this method is inherently parallel.
Fig. 6.3 The root of a tree is found in two iterations by all nodes
Finding the root of a tree with depth 3 is shown in Fig. 6.3. It takes two iterations for all nodes to point to the root.
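The following Python sketch, which is not the book's Algorithm 6.1, shows the pointer jumping idea on a parent array; the number of synchronous steps used here is bounded by ⌈log₂ n⌉, which is at least ⌈log₂ d⌉ for a tree of depth d.

```python
import math

def find_roots(parent):
    """Find the root of every vertex in a rooted forest by pointer jumping.

    parent: dict vertex -> its parent (a root points to itself).
    In each step every pointer is replaced by the pointer of the vertex it
    points to; the per-step dictionary comprehension is trivially
    parallelizable, since all updates read only the old pointers.
    """
    ptr = dict(parent)
    steps = max(1, math.ceil(math.log2(max(len(ptr), 2))))
    for _ in range(steps):
        ptr = {v: ptr[ptr[v]] for v in ptr}    # jump: point to the grandparent
    return ptr

# A chain a <- b <- c <- d (depth 3) needs two jumping steps.
print(find_roots({'a': 'a', 'b': 'a', 'c': 'b', 'd': 'c'}))
```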
A spanning tree of a graph G is its subgraph that is a tree and contains all vertices of
G. Every connected and undirected graph has at least one spanning tree. Spanning
trees find various applications such as providing a communication infrastructure in
computer networks and cluster analysis. Let us denote the number of spanning trees
of a graph G by τ(G). Cayley provided a formula for the number of spanning trees of a labeled complete graph K_n, which has a unique identifier for each of its vertices, as follows [4]:

τ(K_n) = n^(n−2)                                  (6.1)

The 3 spanning trees of labeled K_3 and the 16 spanning trees of labeled K_4 are shown in Figs. 6.4 and 6.5.
Fig. 6.6 Steps of Algorithm 6.2 in a sample graph. All of the graph edges, shown in bold, are included in the spanning tree initially. Then, at each iteration starting from a, an edge whose removal does not disconnect the structure is removed, until this is no longer possible
The steps of operation of this algorithm are depicted in Fig. 6.6. We remove an edge in each iteration, and hence the repeat–until loop is executed O(m) times. Alternatively, we could check that the resulting structure after each edge removal has n − 1 edges and is connected. Checking the former is simple, by keeping a counter and decrementing it after each edge deletion. However, checking the connectedness of the graph is not trivial, as we will see in the second part of this chapter. Note that we require both properties by Theorem 6.3. We will see that we can construct spanning trees with special properties in linear time later in this chapter.
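Below is a small Python sketch of this edge-removal strategy; it is not the book's Algorithm 6.2, and connectivity is tested with a throwaway depth-first search, which is exactly the cost that makes this approach relatively expensive.

```python
def spanning_tree_by_removal(vertices, edges):
    """Build a spanning tree by deleting edges that do not disconnect the graph.

    Starts with all edges and keeps removing any edge that lies on a cycle
    (i.e., whose removal leaves the graph connected) until n - 1 edges remain.
    """
    def connected(edge_set):
        adj = {v: [] for v in vertices}
        for u, v in edge_set:
            adj[u].append(v)
            adj[v].append(u)
        seen, stack = set(), [next(iter(vertices))]
        while stack:                        # plain DFS connectivity check
            x = stack.pop()
            if x not in seen:
                seen.add(x)
                stack.extend(adj[x])
        return len(seen) == len(vertices)

    tree = set(edges)
    changed = True
    while changed and len(tree) > len(vertices) - 1:
        changed = False
        for e in list(tree):
            if connected(tree - {e}):       # e is on a cycle, so it is safe to remove
                tree.remove(e)
                changed = True
                break
    return tree

V = {'a', 'b', 'c', 'd'}
E = {('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a'), ('a', 'c')}
print(spanning_tree_by_removal(V, E))       # exactly 3 edges remain
```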
Fig. 6.7 Construction of a spanning tree of a small sample graph using outgoing edge concept.
Selected outgoing edge at each iteration is shown by a dashed line
Correctness is evident since any outgoing edge will not produce a cycle with
edges already included in the tree. Hence, the resulting structure will be cycle free
and therefore a tree. A possible operation of this algorithm in a small sample graph is
shown in Fig. 6.7. The time complexity is O(n), which is the number of times the while loop is executed. Note that we do not need any extra processing, such as checking the tree property or connectivity, as in the previous algorithm. Selecting outgoing edges from the set of vertices already included in the tree will be useful when forming minimum spanning trees, as we will see in the next chapter.
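The growth-by-outgoing-edges idea can be sketched in a few lines of Python, assuming the graph is connected; the function name and representation are illustrative and the sketch does not reproduce the book's pseudocode.

```python
def spanning_tree_by_growth(adj, start):
    """Grow a spanning tree from 'start' by adding one outgoing edge per step.

    An outgoing edge has exactly one endpoint inside the current tree, so
    adding it can never create a cycle. adj: dict vertex -> iterable of
    neighbors; the graph is assumed to be connected. Returns the tree edges.
    """
    in_tree = {start}
    tree_edges = set()
    while len(in_tree) < len(adj):
        for u in list(in_tree):
            v = next((w for w in adj[u] if w not in in_tree), None)
            if v is not None:               # (u, v) is an outgoing edge
                tree_edges.add((u, v))
                in_tree.add(v)
                break
    return tree_edges

graph = {'a': ['b', 'c'], 'b': ['a', 'c', 'd'], 'c': ['a', 'b'], 'd': ['b']}
print(spanning_tree_by_growth(graph, 'a'))
```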
Fig. 6.8 Running of the third spanning tree construction algorithm in a sample graph. Spanning
tree edges are shown in bold and each vertex is a tree initially
The third algorithm starts with each vertex as a separate one-vertex tree and proceeds by merging each tree with a neighboring tree. The operation of this algorithm in a sample graph is depicted in Fig. 6.8.
This method lends itself to parallel processing since independent subtree forma-
tions are possible. In fact, it is also suitable for distributed processing in a network
environment. Each vertex is a network node and requests a merge operation from a neighbor node. We need to be careful not to form cycles when concurrent requests are made by two nodes of the same subtree to two nodes that coexist in another neighboring subtree. This problem can be handled by selecting a leader for each subtree which controls the merge requests by a suitable protocol.
Tree traversal is the process of recursively visiting each node of the tree exactly once.
Traversing trees in some determined sequence is useful in many graph applications.
We can classify tree traversals by the order in which the vertices are visited as preorder
and postorder for general trees.
Preorder Traversal
In preorder traversal of a rooted tree, a vertex is visited before its descendants as
shown in Algorithm 6.4. The time complexity of this traversal is O(n) for a tree with
n vertices.
The preorder traversal of the sample tree of Fig. 6.9 results in vertex processing
sequence of a, b, c, d, e, f, g, h, i, j, k, l, n, o, m.
Postorder Traversal
Postorder traversal of a tree involves visiting a tree vertex after visiting its descendants, as depicted in Algorithm 6.5. The postorder traversal of the sample tree of Fig. 6.9 provides the vertex processing sequence c, f, g, h, e, d, b, j, n, o, l, m, k, i, a.
Since each vertex is visited exactly once, time complexity of this method is O(n).
Definition 6.9 (binary tree) A binary tree is a rooted tree in which every vertex has
at most two children and each child of a vertex v is left child or right child of v.
A complete binary tree of depth d has 2^(d+1) − 1 vertices, which is the maximum order of any binary tree of that depth. Binary trees can be traversed in preorder or postorder as general trees are. An additional traversal method for binary trees is inorder tree traversal.
Inorder Traversal
In this mode of binary tree traversal, the vertices at each left subtree of a vertex
are processed first, the vertex is processed second, and the right subtree vertices
of the vertex are processed finally. The pseudocode for this operation is shown in
Algorithm 6.6. The time complexity of this algorithm is also O(n).
A priority queue is a data structure to store a set of elements each having a value called
a key. This data structure is useful in implementing various graph algorithms as we
will see when reviewing algorithms for weighted graphs in Chap. 7. A min-priority
queue provides the following operations: Insert(S, x), which inserts element x into the queue S; Minimum(S), which returns the element with the smallest key; ExtractMin(S), which removes and returns that element; and DecreaseKey(S, x, k), which decreases the key of element x to the new value k.
When we are dealing with a max-priority queue, the operations in such a queue
are Maximum(S), ExtractMax(S), and IncreaseKey(S,x,k) which find the maximum
element of the queue, extract this value, and increase the value of the key of an
element in turn.
Heap as a Priority Queue
A min-binary heap is a binary tree that is complete except possibly at the last level, in which the keys of the children of any vertex u are greater than or equal to the key of u. Therefore, along each path from the root, keys monotonically increase and the root has the minimum key value, as shown in Fig. 6.12. We can also have a max-binary heap in which the key values decrease from the root downward.
All of the priority queue operations defined above can be implemented with heaps
as HeapInsert, HeapExtract, HeapMin, and HeapDecrease procedures in O(log n)
time and building the binary heap takes O(n) time. A detailed description of the
heap structure is provided in [5].
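As an illustration, the sketch below builds a min-priority queue on top of Python's heapq module; DecreaseKey is realized with a common lazy-deletion trick rather than the in-place sift-up of a textbook heap, so it approximates the interface rather than the implementation in [5].

```python
import heapq

class MinPriorityQueue:
    """A min-priority queue built on a binary heap (Python's heapq)."""
    def __init__(self):
        self.heap = []            # heap entries are (key, item) pairs
        self.key = {}             # current best key of each item

    def insert(self, item, key):
        self.key[item] = key
        heapq.heappush(self.heap, (key, item))

    def decrease_key(self, item, key):
        # Lazy decrease: push the smaller key and skip stale entries later.
        if key < self.key.get(item, float('inf')):
            self.key[item] = key
            heapq.heappush(self.heap, (key, item))

    def extract_min(self):
        while self.heap:
            key, item = heapq.heappop(self.heap)
            if self.key.get(item) == key:      # skip stale (lazily deleted) entries
                del self.key[item]
                return item, key
        raise IndexError("extract_min from empty queue")

q = MinPriorityQueue()
q.insert('a', 8); q.insert('b', 5); q.insert('c', 10)
q.decrease_key('c', 2)
print(q.extract_min())   # ('c', 2)
```

All operations run in O(log n) amortized time, matching the heap-based bounds quoted above.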
Depth-first search (DFS) is a basic method to traverse all nodes of a graph. It basically
traverses a graph by going as deep as possible from a given vertex and hence the
name. We explore a path as far as we can go by marking the vertices we visit along the path, and when we cannot go any further, either because we encounter a vertex of degree 1 or because all neighbors of the current vertex are visited, we return to where we came from. This method can be best described by a person in a maze carrying a chalk and a string. Each room (vertex) has a number of doors (neighbors), and the person, after entering a room, selects one of the unmarked doors, marks it with the chalk, and goes through that door to another room. If the room has no unmarked doors (all neighbors visited) or no doors other than the one she came from (a vertex with one edge), she returns to where she came from. The string is used to keep track of where she came from. The DFS algorithm has a recursive and an iterative version, both described below. We assume the graph is connected; if it is not, the DFS algorithm is performed on each component of the graph, and this version of the algorithm is called DFS_Forest.
We select any vertex u of the graph G and this vertex is the root of the DFS tree to be
formed. We then select an edge (u, v) that is incident to u. This edge is a tree edge
and is included in the DFS tree T , and the vertex u is the parent of vertex v in T .
We proceed in this manner by always selecting unexplored edges that are incident
to vertices that are not visited. If all the edges incident on a vertex u are explored
meaning all of its neighbors are visited, we return to the parent of u and continue the
search from there. When we find an unexplored edge (u, v) incident on a vertex v,
the edge (u, v) is traversed, it is included in T , and u becomes the parent of v as in the
root case. If v has been visited before, and (u, v) is unexplored, (u, v) is a non-tree
edge and is not included in T . The pseudocode of an algorithm that performs the
described procedure is given in Algorithm 6.7. We also record the visit times for
each vertex; the first time of visit when a vertex is discovered is in d[v] and the final
time when we return from the recursive call to v is stored in f [v].
The output of the algorithm is a DFS tree stored in the array Pred, which shows the predecessor of each vertex. There are a few things to note about this algorithm, as follows.
• We can select the neighbors of the visited vertex arbitrarily or using some ordering in line 17. If vertices are labeled with unique integers, the ordering can be linear, from the smallest to the largest vertex identifier. When vertices are labeled with letters, we can make a lexicographically first choice. As a consequence, the DFS tree obtained as the result of this algorithm is not unique, since the order of selection of unexplored edges affects the structure of the tree.
• If the adjacency matrix representation of the graph is used, we need to check the entire row that belongs to vertex u in lines 17 and 18, n times for a graph with n vertices. Using an adjacency list means the checking in these lines will be done at most Δ(G) times; hence, we can deduce that using an adjacency list is a better choice for this algorithm than using an adjacency matrix.
• The D F S procedure terminates when it returns to the root vertex it is called from.
The algorithm terminates when the D F S procedure is run on all components of
the graph.
• The first visit time d[v] of a vertex v is called the depth-first number of vertex v and corresponds to the number given to it during the preorder traversal of the tree formed. We will see that the first and last visit times of vertices can be used for various DFS applications.
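The recursive procedure with discovery and finish times can be sketched in Python as follows; this is not the book's Algorithm 6.7, and the names dfs_forest, pred, d, and f are chosen only to mirror the discussion above.

```python
def dfs_forest(adj):
    """Recursive DFS over all components, recording predecessors and visit times.

    adj: dict vertex -> list of neighbors. Returns (pred, d, f) where d[v] and
    f[v] are the discovery and finish times of v and pred[v] is its parent in
    the DFS forest (None for the root of each DFS tree).
    """
    pred = {v: None for v in adj}
    d, f = {}, {}
    time = 0

    def dfs(u):
        nonlocal time
        time += 1
        d[u] = time                     # first (discovery) visit time
        for v in adj[u]:
            if v not in d:              # unvisited neighbor: (u, v) is a tree edge
                pred[v] = u
                dfs(v)
        time += 1
        f[u] = time                     # finish time, set when returning from u

    for v in adj:                       # restart DFS in every component
        if v not in d:
            dfs(v)
    return pred, d, f

graph = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a'], 'd': ['b'], 'e': ['f'], 'f': ['e']}
print(dfs_forest(graph))
```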
Analysis
The edges included in the DFS tree form a directed spanning tree of G. This is true since we never form a cycle, as we never select an edge between two marked vertices (lines 18 and 19), and all of the vertices are marked and included in the DFS tree in the end. The DFS procedure is invoked once for each of the n vertices. Each activation of this procedure involves checking the corresponding row of the adjacency matrix, which has n entries; the time taken using the adjacency matrix is therefore Θ(n²). Using the adjacency list means checking each edge of G twice, once from each of its endpoints, plus the time taken for initialization, resulting in a total time of Θ(n + m).
Example
The running of the DFS_Forest algorithm on a sample graph with two components is shown in Fig. 6.13. We can see that two DFS trees, which are directed spanning trees of the two components, are formed.
For any two vertices u and v on the DFS tree formed with u as the ancestor of v, the interval [d(v), f(v)] is contained in the interval [d(u), f(u)]. With respect to the DFS tree formed, the edges of the graph can be classified as follows.
Fig. 6.13 Running of the DFS_Recursive_Forest algorithm in a sample disconnected graph with first and last visit times shown next to the vertices. The source vertices are i and d in the components and the arrows point to parents in the tree
• tree edges: These are the edges of the tree formed. An edge (u, v) belongs to the DFS tree if DFS(u) calls DFS(v).
• back edges: When vertex u of an edge (u, v) is a descendant of vertex v in the tree other than one of its children, (u, v) is a back edge.
• front (forward) edges: When vertex u of an edge (u, v) is an ancestor of vertex v in the tree other than its parent, (u, v) is a front edge.
• cross edges: Any edge that is not a tree, back, or front edge is called a cross edge.
There is also the following relationship between the discovery times d and finish times f of vertices in a graph G, commonly called the parenthesis theorem: for any two vertices u and v, exactly one of the following holds; the intervals [d(u), f(u)] and [d(v), f(v)] are entirely disjoint, or [d(u), f(u)] is contained in [d(v), f(v)] and u is a descendant of v in the DFS tree, or [d(v), f(v)] is contained in [d(u), f(u)] and v is a descendant of u. Other orderings of discovery and finish times are not possible.
Figure 6.14 displays these edges in a sample graph. Discovering a back edge on
a DFS tree helps us to discover various properties of a graph. Let us modify the
DFS procedure in Algorithm 6.7 for a directed graph to classify these edges using
the discovery and finish times of their endpoints described above. We will use three
colors for each vertex in this implementation: a vertex is white when it is unexplored,
it is gray when it is explored but not finished, and it is black when it is finished. The
array color is initialized to white for all vertices and holds the color of each vertex.
The pseudocode for the modified DFS is shown in Algorithm 6.8 [5].
We search all of the neighbors of a vertex u that the procedure takes as input;
if we encounter an edge with an endpoint v that is completed before, then (u, v) is
a forward edge if d[u] < d[v] else it is a cross edge. The vertex v may be gray
meaning it is explored but not finished in which case (u, v) is a back edge, otherwise
it is a tree edge. In an undirected graph, there are no forward or cross edges; hence,
we only need to check the color of the neighbor vertex v of the vertex u; if it is gray,
we have a back edge and otherwise v is white and (u, v) is a tree edge. The running
time for this edge classification algorithm is Θ(n + m) since we have only added
constant time operations to the DFS procedure in Algorithm 6.7.
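A possible Python rendering of this color-based classification for digraphs is sketched below; it is not the book's Algorithm 6.8, and the small example graph is invented for illustration.

```python
WHITE, GRAY, BLACK = 0, 1, 2

def classify_edges(adj):
    """Classify the edges of a digraph during DFS using vertex colors.

    adj: dict vertex -> list of out-neighbors. Returns a dict mapping each
    edge (u, v) to 'tree', 'back', 'forward' or 'cross', following the rules
    described above for directed graphs.
    """
    color = {v: WHITE for v in adj}
    d, f, kind = {}, {}, {}
    time = 0

    def dfs(u):
        nonlocal time
        time += 1; d[u] = time; color[u] = GRAY
        for v in adj[u]:
            if color[v] == WHITE:
                kind[(u, v)] = 'tree'
                dfs(v)
            elif color[v] == GRAY:               # v is an ancestor still being explored
                kind[(u, v)] = 'back'
            else:                                # v is finished (BLACK)
                kind[(u, v)] = 'forward' if d[u] < d[v] else 'cross'
        time += 1; f[u] = time; color[u] = BLACK

    for v in adj:
        if color[v] == WHITE:
            dfs(v)
    return kind

digraph = {'a': ['b'], 'b': ['c', 'd'], 'c': ['a'], 'd': [], 'e': ['d']}
print(classify_edges(digraph))
# (a, b), (b, c) and (b, d) are tree edges, (c, a) is a back edge, (e, d) is a cross edge
```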
When we know the recursion depth of a graph is very large, for example more than a few thousand, we can replace recursive calls with a stack and obtain a non-recursive DFS algorithm. The iterative DFS algorithm shown in Algorithm 6.9 starts from the source vertex, and at each iteration the neighbors of the vertex u under consideration are pushed onto a stack S. This way, visiting all neighbors of u is guaranteed. Once this step is finished, a vertex w is popped from S, marked as visited, and its parent is set to u. This step is then repeated for vertex w. Note that this algorithm performs the recursive visiting of vertices using the stack S, which keeps track of vertices seen but not yet processed. Differently from the recursive DFS algorithm, the last vertex pushed onto the stack is processed first. We have the same time complexity of Θ(n + m).
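A minimal Python sketch of the stack-based version is shown below (again, not the book's Algorithm 6.9); pushing (parent, vertex) pairs keeps the predecessor information without recursion.

```python
def dfs_iterative(adj, source):
    """Iterative DFS from 'source' using an explicit stack instead of recursion.

    Neighbors of the current vertex are pushed onto the stack, so the most
    recently pushed vertex is processed first. Returns the predecessor map
    describing the DFS tree of the component containing 'source'.
    """
    pred = {source: None}
    visited = set()
    stack = [(None, source)]           # (parent, vertex) pairs waiting to be processed

    while stack:
        parent, u = stack.pop()
        if u in visited:
            continue
        visited.add(u)
        pred[u] = parent
        for v in adj[u]:
            if v not in visited:
                stack.append((u, v))
    return pred

graph = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a', 'd'], 'd': ['b', 'c']}
print(dfs_iterative(graph, 'a'))
```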
Due to the nature of its execution, DFS algorithm is difficult to parallelize. A simple
way to provide parallel processing is to divide the search space among processors.
However, this static allocation commonly results in poor load balance since the size of
the subtrees may vary significantly. Search space in general is formed dynamically
and is difficult to estimate beforehand. Dynamic load balancing for parallel DFS
processing may then be used.
A simple dynamic load balancing for parallel DFS may work as follows. A process
pi works on a given search space and when it finishes its work, it requests work from
other processes. This can be handled by a central process or in a truly distributed
manner with no central control. In terms of implementation, the whole search space
may be given to a single process and all other processes may be given empty search
spaces initially as described in [8]. The search space is then divided among processes
when they request work.
In a network setting, our aim is to have the nodes of the network cooperate to find the
DFS of the whole network. We may use the DFS tree formed for various applications
such as finding connected nodes in such a distributed environment. The DFS tree
information may be gathered at the root which may then transfer the connected
vertices in the network to a management utility which can take remedy actions if
there are unconnected nodes. There are various algorithms for this purpose and we
will review a basic one that imitates the sequential algorithm we have seen, using
a special message called the token. This special message provides a single point of
execution which is the holder of the token. Any node that possesses the token can
run the algorithm while the others stay idle. The token serves a second purpose; it
holds the identifiers of the nodes that are visited to prevent visiting them again.
This algorithm, called Token_DFS, which operates using the same principle as the DFS procedure, is shown in pseudocode in Algorithm 6.10 [6]. We now have a root node which starts the algorithm, corresponding to the vertex where we would start the sequential algorithm. A node receiving the token for the first time records the sender as its
parent. It then checks whether it has an unvisited neighbor by comparing the list
contained in the token with its neighbors. If such a neighbor exists, it sends the token
to that node. Otherwise, the token is returned to the parent, which basically imitates the return from the recursive procedure in the sequential algorithm. Correct termination of a distributed algorithm is a fundamental problem in a distributed setting. Any node other than the root terminates when it returns the token to its parent. The root has a different termination condition: it stops when the token is returned to it and it has no more unvisited neighbors.
Analysis
Theorem 6.4 The Token_DFS algorithm correctly constructs a DFS tree in 2n − 2 time units using 2n − 2 messages.

Proof Since the operation is basically the same as that of the sequential DFS algorithm, the DFS tree will be constructed correctly, that is, each node is visited in DFS order and a tree without any cycles is formed. The DFS tree constructed will have n − 1 edges, since any tree with n vertices has n − 1 edges, and only the edges of this tree are traversed, each twice, once in each direction, resulting in a total of 2n − 2 token transfers among the nodes and hence 2n − 2 messages. The non-tree edges are not traversed, since we always search unvisited nodes. There is a single activity at any time, dictated by the possession of the token, and each message transfer takes a single time unit, resulting in 2n − 2 time units.
Fig. 6.15 Running of the token-based DFS algorithm in a sample graph. The contents of the token are shown only when there is a change. The directed tree is rooted at vertex a and the arrows show the sequence of execution
We can use the DFS algorithm to test the connectivity of undirected or directed graphs, to detect cycles in undirected or directed graphs, and to find a topological order of a directed acyclic graph.
Cycle detection relies on the observation that there will not be any back edges, that is, edges (u, v) with v an ancestor of u in the tree, in acyclic graphs. We can make use of the following theorem to detect a back edge.

Theorem 6.5 An edge (u, v) encountered during a DFS of a graph G is a back edge if and only if d(v) < d(u) < f(u) < f(v).
Proof If (u, v) is a back edge, it connects vertex u to its ancestor v; hence, vertex
u is a descendant of vertex v. By the parenthesis theorem, the interval [d(u), f (u)]
is a subinterval of [d(v), f (v)] so the forward direction of the theorem holds. For
the reverse direction, let us consider the parenthesis theorem again. When d(v) <
d(u) < f (u) < f (v), vertex u is a descendant of v. This means edge (u, v) connects
a vertex to its ancestor and hence it is a back edge.
Fig. 6.16 Iterations of the simple algorithm for topological ordering in a sample graph. The selected vertex with no incoming edges at each iteration is shown inside a dashed circle. The ordering formed is {a, f, b, c, e, d} in the order of deleted vertices
A Simple Algorithm
We can have a simple algorithm for topological ordering as follows. We first find a vertex v with in-degree 0. There is always such a vertex in a DAG since it is acyclic. If there is more than one such vertex, an arbitrary selection is made. This vertex is placed in the ordered output list, and v together with all of its outgoing edges is deleted from the graph. This process is repeated until there are no vertices left. Correctness is ensured since, if deleting the outgoing edges of a vertex v placed in the list leaves a vertex u with no incoming edges, then we know v ≺ u. Algorithm 6.11 shows this routine in pseudocode, where L is the ordered output list. The operation of this algorithm is shown in Fig. 6.16.
Analysis
In terms of implementation, we can have an array A of adjacency lists of the graph and also an array D showing the in-degree of each vertex, as shown below for the example graph of Fig. 6.17. We need to check each entry of D for a 0 value, and if there is more than one such vertex, we select one arbitrarily. We then remove this vertex from the graph by inserting a −1 in the in-degree array D, deleting its adjacency list from A, and decrementing the in-degree of each of its outgoing neighbors. This process is repeated until no vertices are left.
Vertex   D(0)   D(1)   D(2)   D(3)   D(4)   D(5)
a          0     -1     -1     -1     -1     -1
b          2      1      0     -1     -1     -1
c          2      1      1      0      0     -1
d          3      3      3      2      1      0
e          2      2      1      0     -1     -1
f          1      0     -1     -1     -1     -1
Fig. 6.17 The values of array D during iterations of the simple topological ordering algorithm for
the graph of Fig. 6.17
Initializing the in-degree array D takes O(m) time; searching the array for a zero entry takes O(n) time per iteration, resulting in O(n²) time over all iterations. Reducing the in-degrees of the neighbors of the deleted vertices takes O(m) time in total, resulting in a total runtime of O(n² + m) for this algorithm.
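The simple algorithm is essentially Kahn's method; a Python sketch is given below (not the book's Algorithm 6.11), using a queue of in-degree-0 vertices instead of rescanning the whole array D at each step, which brings the running time down to O(n + m).

```python
from collections import deque

def topological_order(adj):
    """Topological sort of a DAG by repeatedly removing in-degree-0 vertices.

    adj: dict vertex -> list of out-neighbors. The in-degree table is kept up
    to date as vertices are deleted. Raises ValueError if a cycle exists.
    """
    indeg = {v: 0 for v in adj}
    for u in adj:
        for v in adj[u]:
            indeg[v] += 1

    ready = deque(v for v in adj if indeg[v] == 0)   # vertices with no incoming edges
    order = []
    while ready:
        u = ready.popleft()
        order.append(u)
        for v in adj[u]:                             # delete u's outgoing edges
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)

    if len(order) != len(adj):
        raise ValueError("graph has a cycle, no topological order exists")
    return order

dag = {'a': ['b', 'c'], 'f': ['b', 'e'], 'b': ['c', 'd'],
       'c': ['d'], 'e': ['c', 'd'], 'd': []}
print(topological_order(dag))   # e.g. ['a', 'f', 'b', 'e', 'c', 'd']
```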
DFS-Based Algorithm
We can make use of the edge properties discovered during the recursive DFS algorithm to perform topological ordering. When we run the recursive DFS algorithm on a DAG G, every edge (u, v) of G has f(v) < f(u), since there are no back edges in G. This provides us with the necessary information to form the topological order of G: simply list the vertices of G from the largest finish time to the smallest. We can obtain this list by adding the identity of each vertex to the front of a list L when we finish with it; when DFS is completed, the list L contains the topological ordering of the vertices. The total time taken is therefore Θ(n + m), as in the recursive DFS. Figure 6.18 displays a DFS tree obtained in the same graph of Fig. 6.17, and sorting the finish times of the vertices provides the same topological ordering as the previous algorithm in this graph.
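A Python sketch of this finish-time ordering is given below; prepending each vertex to the list when its DFS call returns yields the order from largest to smallest finish time, as described above.

```python
def dfs_topological_order(adj):
    """Topological sort of a DAG by listing vertices in decreasing finish time.

    adj: dict vertex -> list of out-neighbors. Each vertex is prepended to the
    output list when its DFS call finishes, so the final list is ordered from
    the largest finish time to the smallest.
    """
    visited = set()
    order = []

    def dfs(u):
        visited.add(u)
        for v in adj[u]:
            if v not in visited:
                dfs(v)
        order.insert(0, u)              # finished: add u to the front of the list

    for v in adj:
        if v not in visited:
            dfs(v)
    return order

dag = {'a': ['b', 'c'], 'f': ['b', 'e'], 'b': ['c', 'd'],
       'c': ['d'], 'e': ['c', 'd'], 'd': []}
print(dfs_topological_order(dag))
```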
Analysis
Theorem 6.6 The DFS-based topological sort algorithm correctly provides a topo-
logical sort of an acyclic digraph G = (V, E).
Proof We need to show that for any directed edge (u, v) ∈ E, f(v) < f(u) holds, which establishes the correctness of this algorithm. For any edge (u, v) considered during the DFS, vertex v cannot be an ancestor of vertex u on the DFS tree since G is acyclic, so (u, v) cannot be a back edge. In terms of our color coding of vertices, when (u, v) is examined, vertex v cannot be gray; it is either white (unexplored) or black (finished). If vertex v is white, it will be processed and become black before u becomes black, and hence f(v) < f(u). If it is black, this means it has already been processed and its finishing time f(v) is determined, and vertex u will have a greater finish time than v in this case as well.
The main idea of the breadth-first search (BFS) method is to visit all neighbors of a vertex before visiting other vertices, hence the name. Starting from the source vertex s, we first visit all neighbors of s in an arbitrary order. These neighbor vertices, N(s), all have a distance of unity from the vertex s after the visit. We then visit the neighbors of the vertices in N(s), which are labeled with a distance of 2 from vertex s. This process continues until all vertices are visited.
We can implement this algorithm by inserting the adjacent vertices of the currently visited vertex in a queue, then removing them from the queue one by one and repeating the process. We need to keep track of the visited vertices, as in the DFS algorithms, to prevent a vertex from being visited again. Algorithm 6.12 shows one way of implementing the described procedure.
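A straightforward Python sketch of this queue-based procedure is given below (it is not the book's Algorithm 6.12); distances are initialized to infinity so that an infinite distance doubles as the "not yet visited" mark.

```python
from collections import deque

def bfs(adj, source):
    """BFS from 'source', computing distances and the BFS tree.

    adj: dict vertex -> list of neighbors. Returns (dist, pred); dist[v] is
    the number of edges on a shortest path from source to v in an unweighted
    graph, and pred[v] is the parent of v in the BFS tree.
    """
    dist = {v: float('inf') for v in adj}    # infinity marks unvisited vertices
    pred = {v: None for v in adj}
    dist[source] = 0
    queue = deque([source])

    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if dist[v] == float('inf'):      # first time v is reached
                dist[v] = dist[u] + 1
                pred[v] = u
                queue.append(v)
    return dist, pred

graph = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a', 'd'],
         'd': ['b', 'c', 'e'], 'e': ['d']}
print(bfs(graph, 'a'))
```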
Analysis
The distance and predecessor initialization for vertices takes O(n) time. Each vertex
is enqueued at most once and each edge is explored at most twice, once for each
vertex incident to it. Total time spent to construct the BFS tree is therefore O(n + m).
Example
The running of BFS algorithm on a sample graph is shown in Fig. 6.19. The BFS
tree constructed shows the shortest paths from the root vertex a.
Properties
The properties of the BFS graph traversal are as follows.
• We do not need a separate check of whether a vertex is visited or not as in the DFS algorithm, since a distance value of infinity shows that the vertex is not yet visited. All vertices will be processed at the end of the algorithm and therefore the output is a spanning tree of G.
• The order we select and hence enqueue the neighbors of v in line 12 of the
algorithm affects the structure of the BFS tree obtained and therefore this tree is
not unique.
• The BFS algorithm partitions the edges of an undirected graph into tree edges and
back edges.
When we view the BFS procedure, we can see all the vertices at the same level can
be processed in parallel. For example, we can have a parallel loop to explore all of
the neighbor vertices of vertices at level i to find the vertices at level i + 1. However,
we can have a situation in which two vertices u and v at level i both attempt to set the level of an unexplored neighbor vertex w to i + 1 and to set themselves as the parent of w. However, this apparent race condition does not cause any problem, as it does not matter which vertex sets the level of w to i + 1, and it is also immaterial whether u or v is the parent of w. Moreover, we need to synchronize all of the processes at
level i to ensure they all finish before starting the parallel processing at level i + 1
to find vertices at level i + 2.
This level-synchronous approach is implemented in various parallel BFS algo-
rithms as described in [3]. A PRAM-based approach may work similar to what we
have described providing parallel loop processing by a number of processing units
and atomic level updates with barrier synchronization between level processing.
The total execution time based on this model would then be O(diam(G)) since the
number of levels would not exceed the diameter of the graph G. Various parallel
algorithms, whether shared or distributed memory, adapt this strategy with possibly
added load balancing during parallel processing of the exploration loop. A fine-grain
parallel BFS algorithm running on shared memory Cray MTA-2 system is reported
in [2]. In a distributed memory parallel processing system, partitioning of the graph
to processing nodes is commonly pursued. A distributed memory parallel algorithm
using 2-D graph partitioning is implemented in BlueGene/L in [12].
• round: Sent by root r at the beginning of each round over the partial tree T . Each
node on T broadcasts round to its children.
• probe: Sent by the leaves of T to all neighbor nodes except the parent.
• ack: A non-tree node responds to probe message by an ack message. It marks the
sender of probe as its parent and the receiver of ack marks the sender as one of
its children. If there are more than one probe messages received concurrently, the
receiving node arbitrarily picks one of them.
• nack: If the receiver of a probe message already has a parent assigned, it sends the sender a nack message.
• upcast: Sent by leaves of T to their parents to inform the round is over.
Once a new layer is formed, the leaves from the previous round collect all ack and
nack messages and start a convergecast operation described in Sect. 5.6.3 over the
edges of T . When the root receives upcast messages from all of its children, it can
start the next round. Algorithm 6.13 shows round k of the distributed synchronous
BFS algorithm run by any node except the root. The set childs is the set of children
for the root and intermediate nodes; others are the set of neighbors of a leaf node
that are not its children and the set collected is used to keep track of which children
have sent an upcast message to an intermediate node.
We can see this algorithm works rather asynchronously in one synchronous round.
We do not pay attention to the order of messages received although we know each
node on T will first receive the round message and it will act differently depending
on whether it is a leaf or an intermediate node. A leaf node searches for neighbors to
be included in the next layer and an intermediate node simply acts as a gateway by
sending the round message to its children. Upon completion of the round, an interme-
diate node collects upcast messages from its children and sends an upcast message
to its parent. We check whether a leaf node has received ack or nack messages from
all of its neighbors; or a non-leaf node has received upcast messages from all of its
children to decide if the round is over. When this is decided, an upcast message is
sent to the parent.
Fig. 6.20 Running of Algorithm 6.13 on a sample graph for four rounds. Only probe messages ( p)
that are acknowledged with ack (a) messages are shown
tree edges. Note that nodes a and f send probe messages to each other in the next
round (round 2) which are rejected. The number of rounds started by the node g is
4 which is the diameter of the graph.
Analysis
Proof At each step of the algorithm, only leaves at layer k of the partial tree formed
will be sending probe messages to form layer k + 1 leaves which would be enlarging
the partial BFS tree one layer. Hence, BFS tree property is obeyed to form the final
BFS tree.
At each step k of the algorithm, the time spent is proportional to the current level k, as messages are broadcast and convergecast through k layers. There can be at most diam(G) levels, and hence totaling gives

Σ_{k=1}^{diam(G)} k = O(diam²(G))
It can be seen that this process eventually builds a BFS tree starting from the root. The termination condition is related to traversing the longest shortest path between any two nodes, which is the diameter of the graph G. Therefore, each node should wait for at most diam(G) time steps. Unlike the synchronous algorithm, this time we do not have an easy solution for termination, since we do not know the diameter a priori. We can include a time-to-live field in each message which is initialized to an upper bound on the diameter value and is decremented at each reception at a node. When this field becomes zero, the message is no longer transmitted to neighbors.
Proof After diam(G) steps, all nodes will have received a layer(diam(G) − 1) message and will have set their distances accordingly, and hence the BFS tree will be constructed. The time needed is the diameter of the network, to reach the farthest node from the root node; hence the time complexity is O(diam(G)). The longest path in the network will have a length of n − 1, so a node may need to change its distance value up to n − 2 times and will send at most n · deg(v) messages, resulting in the total number of messages below [11].

Σ_{v∈V} n · deg(v) = O(nm)
We can find whether a graph G is connected or not using this algorithm, as with the DFS algorithm: if all of the vertices of G are processed at the end, then G is connected. Finding the shortest paths from the root vertex to all other vertices in an undirected, unweighted graph is also provided by this method. The BFS algorithm in an unweighted simple graph provides the distance between a vertex v and the source vertex; this distance is simply the level of v in the BFS tree formed.
The BFS algorithm can also be used to test whether a graph G is bipartite, by coloring vertices at alternating levels with two colors; if G is bipartite, there will not be any edge of G that has two vertices of the same color at its endpoints. Specifically, we have the following steps for this algorithm.

1. Input: G = (V, E)
2. Output: Evaluate G as bipartite or non-bipartite
3. Select s ∈ V and set color(s) ← white
4. Run the modified BFS algorithm starting from vertex s ∈ V: whenever a vertex u is processed, color each newly discovered neighbor with the color opposite to that of u; if a neighbor of u is found to have the same color as u, report that G is not bipartite.
Figure 6.21 shows a sample graph partitioned into two sets by the BFS algorithm
and we can see it is not bipartite as there is an edge joining two vertices of the same
color.
This algorithm works correctly with the following reasoning to show only one of
the cases is valid.
• If there is not an edge between two vertices in the same layer, vertices in adjacent
layers will be colored with opposite colors. Therefore, G is bipartite in this case.
• Let us assume there is an edge (u, v) between vertices u and v of the same layer
L j . Let w be the least common ancestor of u and v in the BFS tree T at layer L i .
Then, w − u − v − w is a cycle of the graph having length 2( j − i) + 1 which is
an odd cycle. Hence, G is not bipartite.
BFS is accomplished in O(n + m) time and scanning the edges can be performed
in O(m) time resulting in a total time of O(n + m) for this algorithm.
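The two-coloring test can be sketched in Python as follows, assuming a connected input graph; the colors 'white' and 'gray' follow the figure, and the example graphs are invented for illustration.

```python
from collections import deque

def is_bipartite(adj, source):
    """Check bipartiteness of a connected graph with a 2-coloring BFS.

    Vertices at even and odd levels receive the colors 'white' and 'gray';
    the graph is bipartite iff no edge joins two vertices of the same color.
    """
    color = {source: 'white'}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in color:                   # assign the opposite color
                color[v] = 'gray' if color[u] == 'white' else 'white'
                queue.append(v)
            elif color[v] == color[u]:           # monochromatic edge: odd cycle
                return False
    return True

square = {'a': ['b', 'd'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['a', 'c']}
triangle = {'a': ['b', 'c'], 'b': ['a', 'c'], 'c': ['a', 'b']}
print(is_bipartite(square, 'a'), is_bipartite(triangle, 'a'))   # True False
```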
Fig. 6.21 A graph that is partitioned into white and gray vertices by the BFS algorithm from a
source vertex a. Levels of vertices are shown next to them. The edge enclosed in the dashed ellipse
is between two vertices that are gray, therefore this graph is not bipartite
We described the tree structure in graphs in the first part of this chapter. Trees have numerous applications in computer science and in real life. We reviewed algorithms to construct spanning trees of a graph and then tree traversal algorithms.
In the second part of the chapter, we reviewed two fundamental graph traversal
methods: DFS and BFS. We saw DFS can be implemented as an effective recursive
algorithm with O(n + m) time complexity and it also has an iterative version using
a stack with the same time complexity. It can be used to test connectivity and to
find the number of components of a graph. An important DFS application is to find the topological order of a directed acyclic graph in which vertices have precedence relationships. The DFS algorithm is difficult to parallelize due to the sequential and dependent nature of execution between its steps. However, the graph contraction method, in which a coarser graph is obtained from the graph of the previous step, can be used for parallel DFS construction.
of a communication network cooperating to construct this tree. We can convert the
sequential DFS algorithm to a distributed one using a special message called token
between the nodes. Any node that possesses the token is allowed to execute, and
hence we have in fact a sequential algorithm running in a distributed manner. There
are various other distributed DFS algorithms which achieve better parallelism at the
expense of increased number of messages as we have reviewed.
The BFS algorithm visits vertices of a graph using layer-by-layer search and it can
be used to find distances from a source vertex to all other vertices in an undirected,
unweighted graph. For a weighted graph, we need to modify this algorithm to find
distances as we will see in the next chapter. BFS algorithm can be used to test
bipartiteness of an undirected graph, as we saw. The parallel version of the BFS algorithm uses graph contraction as in the parallel DFS algorithm. There are a few distributed BFS algorithms, and one such algorithm works synchronously in rounds under the control of a special node called the supervisor. This node enlarges the BFS tree layer by layer at each round and in fact imitates the sequential BFS algorithm in a distributed
setting. Other than solving explicit problems such as connectivity, topological order,
and bipartiteness, these two basic methods of graph traversals provide building blocks
of various more complex graph algorithms as we will see. The implementations of
these algorithms in directed graphs are similar, and we should consider only the
outgoing edges from a vertex in a digraph.
Exercises
1. Write the pseudocode of the recursive tree center finding algorithm of Sect. 6.2
and show step-by-step execution of this algorithm in the sample tree depicted in
Fig. 6.22.
2. Construct a possible spanning tree of the graph depicted in Fig. 6.23 using the
second spanning tree algorithm of Sect. 6.2.4.2.
3. Design and form the pseudocode of a distributed algorithm that forms a span-
ning tree based on the third algorithm of Sect. 6.2.4.3 for spanning tree construc-
tion. Show a possible running of this algorithm in the network graph shown in
Fig. 6.24.
4. Work out a possible DFS tree rooted at vertex a in the digraph of Fig. 6.25 by
showing the discovery and finish times for each vertex. Show also the tree edges,
forward edges, back edges, and cross edges in the graph.
5. The token-based DFS algorithm is to be executed in the sample graph of Fig. 6.26.
Work out a possible DFS tree rooted at vertex a. Show the iterations of the
algorithm in this graph with the contents of the token.
6. Write the pseudocode of the DFS-based cycle detection algorithm that uses
discovery times and finish times of vertices. Show the step-by-step running of
this algorithm in the graph of Fig. 6.27.
7. Work out the topological order of vertices in the DAG of Fig. 6.28 using both
the simple algorithm and the DFS-based algorithm.
8. Find the BFS tree rooted at vertex g in Fig. 6.29 by showing the levels for each
vertex.
9. Design the distributed synchronous BFS algorithm of Sect. 6.13 (Algorithm 6.13)
with FSMs. Draw the FSM diagram and write the pseudocode for this algorithm.
10. Modify Algorithm 6.13 such that termination using a special terminate message
upcast by leaves is used.
References
1. Awerbuch B (1985) A new distributed depth-first search algorithm. Inf Process Lett 20:147–150
2. Bader DA, Madduri K (2006) Designing multithreaded algorithms for breadth-first search and
st-connectivity on the Cray MTA-2. In: Proceedings of the 35th international conference on
parallel processing (ICPP 2006), pp 523–530
3. Buluc A, Madduri K (2011) Parallel breadth-first search on distributed memory systems. In:
International conference for high performance computing, networking, storage and analysis
(SC’11), Article 65
4. Cayley A (1857) On the theory of analytical forms called trees. Philos Mag 4(13):172–176
5. Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms, 3rd edn. MIT
Press, Cambridge, Chapter 22
6. Erciyes K (2013) Distributed graph algorithms for computer networks, Chaps. 4 and 5. Springer
Computer Communications and Networks series. Springer, Berlin. ISBN-10: 1447151720
7. Even S, Tarjan RE (1975) Network flow and testing graph connectivity. SIAM J Comput
4(4):507–518
8. Grama A, Gupta A, Karypis G, Kumar V (2003) Introduction to parallel computing, 2nd edn.
Chapter 11. Addison Wesley, Boston
9. Hopcroft JE, Karp RM (1973) An n^(5/2) algorithm for maximum matching in bipartite graphs.
SIAM J Comput 2(4):225–231
10. Hopcroft JE, Karp RM (1974) Efficient planarity testing. J ACM 21(4):549–568
11. Peleg D (2000) Distributed computing: a locality-sensitive approach. SIAM monographs on
discrete mathematics and applications, Chapter 5
12. Yoo A, Chow E, Henderson K, McLendon W, Hendrickson B, Catalyurek UV (2005) A
scalable distributed parallel breadth-first search algorithm on BlueGene/L. In: Proceedings of
the ACM/IEEE conference on high performance computing (SC2005)
Weighted Graphs
7
Abstract
A weighted graph can have weights associated with its edges or its vertices. The
weight on an edge typically denotes the cost of traversing that edge and the weight
of a vertex commonly shows its capacity to perform some function. In this chapter,
we review sequential, parallel, and distributed algorithms for weighted graphs
for two specific tasks: the minimum spanning tree problem and the shortest path
problem.
7.1 Introduction
A graph can have weights associated with its edges or its vertices. The weight on an
edge typically denotes the cost of traversing that edge and the weight of a vertex
commonly shows its capacity to perform some function. Our aim in this chapter is to
review algorithms for weighted graphs for two specific tasks: the minimum spanning
tree problem and the shortest path problem.
A tree is a connected graph with no cycles and a spanning tree of a connected
graph is a tree that includes all nodes of the graph as we reviewed in the previous
chapter. A minimum spanning tree (MST) of a weighted, undirected, and connected
graph is the spanning tree with the minimum total cost of edges among all spanning
trees of that graph. There can be more than one MST in a graph if edge weights are
not distinct. MSTs find a wide range of applications such as connecting a number
of cities, components or other objects. In general, our aim in search of an MST of a
graph is to use a minimum amount of roads, wires, or any other connecting medium
to connect the objects under consideration. MSTs are also used for clustering of
large networks consisting of tens of thousands of nodes and hundreds of thousands
of edges such as biological networks. Removing a number of heaviest weight edges
results in clusters in such networks.
In this section, we will first describe the four main sequential algorithms to construct
MSTs of weighted graphs. We will then investigate ways of obtaining parallel algorithms
from these algorithms, followed by the illustration of a distributed algorithm
that can be used to find the MST of a computer network. We will also consider
conversion between parallel and distributed algorithms for this problem.
7.2.1 Background
Given a weighted, undirected, and connected graph G = (V, E, w), we are searching
for the MST T ⊆ G such that w(T ) given below is minimized.
w(T) = Σ_{(u,v)∈T} w(u, v)    (7.1)
In search for a solution to this problem, we will consider a few seemingly reasonable
heuristics. First of all, we do not want heavy-weight edges in the MST and attempt
to include as many light edges as possible. We also need to prevent cycles as a tree is
required. Lastly, the bridges of a graph are to be included in the MST since excluding
these edges leaves the MST disconnected. Two rules defined below will help to form
MSTs.
Proof Consider an MST T that contains the set A and does not contain the edge
(u, v). Then there must be a path p in T that connects the vertex u to the vertex v since
the MST T must cover all vertices. Let us combine the edge (u, v) with the path p
to form a cycle C in G. The edge (u, v) is across the cut, which means there must
be at least one edge (w, z) ∈ C that also goes through the cut. Let us now replace (w, z)
with (u, v) to form a new tree T′ = T ∪ {(u, v)} − {(w, z)}. T′ is a spanning tree of G
since we added one edge and removed one edge, resulting in n − 1 edges. Moreover,
w(T′) = w(T) + w(u, v) − w(w, z) ≤ w(T) since w(u, v) ≤ w(w, z), which means
T′ that contains the edge (u, v) is an MST of G.
The cut property is useful in forming an MST of a graph. Any least weight edge
across any cut of the graph can be included in the MST until we have n−1 edges which
means the formed tree is an MST. The cycle property is also a useful characteristic
of an MST.
Theorem 7.1 (Cycle property) Let C be any cycle in G and (u, v) be the maximum
weight edge in C. There is no MST of G that contains (u, v).
Proof Let T be an MST of G and assume the contrary, that T contains (u, v). Deleting
(u, v) from T results in two subgraphs with vertex sets VT and V − VT. The cycle C
has another edge (w, z) ≠ (u, v) that has exactly one endpoint in VT, and w(w, z) <
w(u, v) since the edge (u, v) is the maximum weight edge of C. Form a new tree T′ =
T − {(u, v)} ∪ {(w, z)}. The total weight of T′ is less than the total weight of T, which
is a contradiction.
Proof Let T be an MST of G. For each edge (u, v) ∈ T, the graph T′ = (V, T \ {(u, v)})
has two connected components, say P and Q. The edge (u, v) is the only edge of
T across the cut between P and Q and it is the unique least weight edge between
these two sets by the cut property and because the edges of G have distinct weights.
Therefore, every MST of G must contain (u, v), and if we consider all edges of T,
every MST of G must contain all edges of T. Hence every MST is equal to T.
We can have a generic algorithm to build the MST of a graph G as follows. We start
with an empty tree T = ∅ and repeatedly add to T safe edges, that is, edges that belong to an MST
of G. The fundamental algorithms to build the MST of a weighted graph using this
method are due to Prim, Kruskal, and Boruvka, as we will review next.
Prim's algorithm selects, at each step, a safe edge to be included in the set A, which holds a subset
of the edges of an MST of G. This algorithm maintains the edges in A as a single tree. It starts from an arbitrary vertex s and
includes it in the MST. Then at each step of the algorithm, the minimum weight
outgoing edge (MWOE) (u, v) from the current tree fragment T, such that u ∈ T
and v ∈ G \ T, is found and added to the tree T. If T is an MST fragment of G,
T ∪ {(u, v)} is also an MST fragment of G by the cut property. Proceeding in this manner,
the algorithm finishes when all vertices are included in the final MST as shown in
Algorithm 7.1. Figure 7.1 shows the iterations of Prim's algorithm in a graph.
Analysis
Theorem 7.3 (correctness) Prim’s algorithm provides an MST of the input graph
G = (V, E, w).
Proof The cut property ensures correctness of this algorithm since we always select
the MWOE that is part of the MST by this property. We will show an alternative
proof. In each iteration, we add a vertex v ∈ V − VT that is connected to a vertex
u ∈ VT over the lightest edge (u, v) between the two sets. Since edge (u, v) is
always between two disjoint sets, it cannot form a cycle with vertices of VT , hence
T is always a tree throughout the algorithm running. Also, since VT contains all
vertices of G in the end as tested in line 5, T is a spanning tree of G.
We now need to check whether T is an MST out of all spanning trees of G and we
will do this by induction. Since each vertex must be covered by the MST, the basis
of induction is proven. We now want to show that if Ti−1 is part of the MST, then
adding the MWOE to it gives Ti which is also part of the MST. Assume the contrary, that Ti
is not part of any MST. Let Ti−1 be a partial MST of G and let Ti be obtained by adding the MWOE
(u, v) of Ti−1 to it, according to the rule of Prim's algorithm. We assume (u, v) ∉ Tn−1,
which is the final MST built. In this case, there is another edge
(w, z) in the cutset between Ti−1 and the rest of the graph that is part of Tn−1. Moreover, (u, v) and
(w, z) are edges of a cycle. Deleting (w, z) from Tn−1 and adding (u, v) results in another
spanning tree T′ of G which has a total weight not larger than that of Tn−1 since w(u, v) ≤ w(w, z).
This means Ti is contained in an MST of G, which contradicts our initial assumption.
Fig. 7.1 Running of Prim’s MST algorithm in a small graph starting from vertex a
Proof The main operation performed by this algorithm is the selection of the
MWOE at each iteration in line 6 of Algorithm 7.1. An array d can hold the minimum
distance to the current tree VT for each vertex v ∈ V − VT; let us call this set V̄T. We can then find
the minimum value vertex v of this array to include it in VT together with the corresponding
edge (u, v) in ET. We also need to update the entries in this array as the inclusion of
the new node v may result in a change of the values of its neighbors. We need to check
n − 1 vertices to find the minimum value of d in the first iteration, n − 2 in the second,
and one fewer at each subsequent step, resulting in O(n²) steps. Updating the d values
requires checking the neighbors of node v at each step, which sums to the total degree
of all nodes, 2m. The total time taken is therefore O(m + n²). We may use a heap data
structure to store the values of V̄T as we will show in detail in the next section. Finding
the minimum value in the heap can then be done in O(log n) time at each step. Updating
the d values for the neighbors of v requires a further deg(v) log n at each step. Summing
these two operations over n steps results in O(n log n + m log n). For a dense graph,
we can assume m ≫ n and the resulting time is O(m log n).
Implementation
Finding the MWOE from the MST fragment is key to the operation of this algo-
rithm. We will use a min-priority queue based on a key attribute as was described in
Sect. 6.2.7. We define key(v) of a vertex v to be the minimum weight of any edge
connecting v to a vertex in the tree which is initialized to ∞ since we do not have a
tree at start. The queue Q contains all of the vertices of the graph initially and we
extract a vertex with the minimum key value from Q at each iteration of the while
loop. We also assign the parent of each vertex v in P[v] to form the tree structure
during iterations. The procedure ExtractMin(Q) removes the element with the low-
est key from Q. Hence, we invoke this procedure to find MWOE until Q becomes
empty. The pseudocode for this algorithm is shown in Algorithm 7.2.
Fig. 7.2 Running of Kruskal’s MST algorithm for the same graph of Fig. 7.1. The same MST is
obtained as edge weights are distinct
Implementation
Based on these operations, we can restructure Kruskal’s algorithm as shown in
Algorithm 7.4. All vertices of the graph are single-vertex components of the forest at first. The
edges are sorted and inserted into the queue Q; we then test whether the endpoints
of an edge (u, v) dequeued from Q are in the same tree. If they are, we know that
adding (u, v) to the MST T would create a cycle and we discard this edge. Otherwise,
the trees of u and v are merged by the Union operation to form a new tree.
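The following Python sketch illustrates this restructured version under the assumption that vertices are numbered 0..n−1 and edges are given as (weight, u, v) tuples; the find and union helpers implement the union-find forest with path compression and union by rank (an illustrative sketch, not the book's Algorithm 7.4):

def kruskal_mst(n, edges):
    """n: number of vertices labeled 0..n-1; edges: list of (w, u, v) tuples.
    Returns the list of MST edges in the order they are accepted."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):                       # find the root (component representative) of x
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    def union(x, y):                   # merge the trees of x and y; False if already joined
        rx, ry = find(x), find(y)
        if rx == ry:
            return False
        if rank[rx] < rank[ry]:
            rx, ry = ry, rx
        parent[ry] = rx
        if rank[rx] == rank[ry]:
            rank[rx] += 1
        return True

    mst = []
    for w, u, v in sorted(edges):      # process edges in nondecreasing weight order
        if union(u, v):                # endpoints in different trees: the edge is safe
            mst.append((u, v, w))
    return mst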
Analysis
Theorem 7.5 (correctness) Kruskal’s algorithm provides an MST of the input graph
G(V, E, w).
Proof At each iteration, we add a vertex v ∉ VT that is connected to a vertex u ∈ VT
over the cheapest edge (u, v) between the two sets. Since edge (u, v) is always
between two disjoint sets, it cannot form a cycle with vertices of VT , hence T is
always a tree throughout the algorithm running. Also, since VT contains all vertices
of G as tested in line 5, T is a spanning tree of G. We now need to prove T is an
MST of G. During the ith iteration of the algorithm, let Ai be the set of edges accepted so far,
which is a subset of the final MST T∗. Note that unlike in Prim's algorithm, Ai may consist of several
disjoint trees. There will not be any edge in Ai that has a greater weight than any edge in
E \ Ai, simply because we include low weight edges in Ai starting from the lowest
one. This means if the new edge (u, v) to be added creates a cycle with the existing
edges in Ai , it is the highest weight edge in that cycle. By the cycle property, we are
rejecting an edge that does not belong to the MST. On the other hand, whenever we
accept an edge, it belongs to the MST by the cut property.
Theorem 7.6 (correctness) Reverse edge deletion algorithm provides an MST of the
input graph G = (V, E, w).
Proof This algorithm produces a spanning tree since the resulting structure is connected and does not
contain any cycles; we repeatedly delete the heaviest weight edge that lies on a cycle, the removal
of which does not disconnect the graph. We will show the resulting spanning tree
T is an MST of G as follows. Let (u, v) be the edge removed during an iteration
of the algorithm. Before removal, it must have been on a cycle C as otherwise such
removal would disconnect G. Since it is the first edge encountered on C, it is the
heaviest weight edge on the cycle C. By the cycle property, the edge (u, v) does not
belong to any MST of G. Therefore, this algorithm results in an MST of G since it
removes edges that cannot be contained in any MST of the graph G.
The weights of the edges of the graph G can be sorted in O(m log m) time, which is
O(m log n). The main problem with this algorithm is the
testing of the connectedness of the graph. This can be performed by the DFS or the
BFS algorithm in O(n + m) time after each edge removal, resulting in O(nm + m²)
time. The total time taken is then O(m log m + nm + m²). It is shown in [18] that removing
an edge, checking the connectivity after removal, and reinserting the edge if the graph
is disconnected can be performed in O(log n (log log n)³) time per operation.
Fig. 7.3 Running of reverse edge deletion algorithm for the same graph of Fig. 7.1. The same MST
is obtained as in Prim’s and Kruskal’s algorithms as edge weights are distinct
Fig. 7.4 Running of Boruvka’s MST algorithm in the graph G of Fig. 7.1. The first step in a divides
the graph into two components C1 = G[a, f ] and C2 = G[b, c, d, e] with the lightest weight
incident edges included in MST as shown in bold. The lightest edge between these components
is ( f, d) which is used to merge them and this edge becomes part of the MST as shown in b.
Arrows show the vertex that the lightest edges are incident. The same MST as in Prim’s, Kruskal’s
algorithms and the reverse-delete algorithms is obtained as edge weights are distinct
The algorithm continues in this manner until there is only one component which contains all of the
vertices, providing a spanning tree, and the selected edges are the edges of the MST as shown in
Algorithm 7.5.
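As an illustration of this component-merging idea, the following Python sketch (not the book's Algorithm 7.5) repeatedly selects the lightest outgoing edge of every component and contracts the components joined by the selected edges; distinct edge weights are assumed:

def boruvka_mst(n, edges):
    """n: vertices 0..n-1; edges: list of (u, v, w) with distinct weights.
    Returns the list of MST edges."""
    comp = list(range(n))                      # comp[v] = representative of v's component

    def find(x):
        while comp[x] != x:
            comp[x] = comp[comp[x]]
            x = comp[x]
        return x

    mst, num_comp = [], n
    while num_comp > 1:
        cheapest = {}                          # component root -> its lightest outgoing edge
        for u, v, w in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for r in (ru, rv):
                if r not in cheapest or w < cheapest[r][2]:
                    cheapest[r] = (u, v, w)
        if not cheapest:                       # no outgoing edges left: graph is disconnected
            break
        for u, v, w in set(cheapest.values()):
            ru, rv = find(u), find(v)
            if ru != rv:
                comp[rv] = ru                  # contract: merge the two components
                mst.append((u, v, w))
                num_comp -= 1
    return mst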
Analysis
Lemma 7.1 Suppose edge weights of the graph G = (V, E, w) are distinct. Let ev
be the least weight edge incident to a vertex v ∈ V . The MST T of G contains every
such edge ev .
Proof Let us assume the MST T, which is unique due to distinct edge weights, does
not contain some edge ev = (u, v). Adding the edge (u, v) to T creates a cycle. Let
(v, w) be another edge of this cycle incident on v; then w(u, v) < w(v, w) since (u, v) is the
lightest edge incident on v. If we delete (v, w) from T and add (u, v) to T, we obtain
T′ = T − {(v, w)} ∪ {(u, v)}, which is still a tree having n − 1 edges and has a total
weight smaller than the weight of T, resulting in a contradiction.
Theorem 7.7 (correctness) Boruvka’s algorithm produces the MST of the input
graph G = (V, E, w) that has distinct edge weights.
Proof The final component T is a tree since we join two tree components by exactly
one edge preventing cycles at each iteration. This component T includes all of the
vertices of G as we continue until there is one component, it is a spanning tree of
G. Edges to be included in the tree T at each iteration are part of the MST of G by
Lemma 7.1 hence the resulting tree T is the MST of G.
Note that we required the edge weights to be distinct to select a unique least weight
edge incident to a vertex. This restriction can be relaxed by varying the weights of
equal-weight edges slightly to have unique edge weights.
Proof Each step of the algorithm reduces the number of vertices (components) by at least a factor
of 2 and therefore the total number of steps is O(log n). Each step requires O(m) time
for contraction, resulting in O(m log n) time.
Out of the four algorithms we have reviewed, Boruvka’s algorithm is most suitable
for parallel processing due to relatively independent contraction operations involved.
We will however first look at ways of parallelizing Prim’s algorithm and then describe
briefly how Boruvka’s algorithm can run on a distributed memory parallel system.
This algorithm is inherently sequential as we search for the MWOE at each itera-
tion. However, searching for MWOE can be done in parallel within a single iteration.
The general idea of the parallel algorithm is to divide the vertices to k processes and
have them find the MWOEs in their partitions. The global MWOE can then be
found by a special process which broadcasts it to all others for local updates as in
lines 14–16 of the sequential algorithm. In the implementation, we have k processes
p0 , . . . , pk−1 with p0 as the supervisor. We divide the vertices into k subsets where
each process pi gets n/k vertices in the set Vi . The array d is 1-D block partitioned
to k processes and the weighted adjacency matrix A is also column partitioned to k
processes. Each process then finds the minimum value of array d in its partition by
using matrix A and the global minimum is computed using the all-to-one reduction
at the root process p0 which then broadcasts it to all processes. The processes update
their distances and form their partition of d for the nodes they are responsible for. This
process continues until the array d has no elements left, meaning all nodes are in VT.
The pseudocode for this algorithm is depicted in Algorithm 7.6.
We will show the implementation of this parallel algorithm using four processes
p0 , . . . , p3 for the same graph we have used to demonstrate sequential algorithms.
The weighted adjacency matrix A, sometimes called the distance matrix, is formed and
partitioned as follows:
          p1            p2            p3
        a     b   |   c     d   |   e     f
   a    0    10   |  11     5   |   9     1
   b   10     0   |   4     8   |   ∞     ∞
   c   11     4   |   0     2   |   ∞     ∞
   d    5     8   |   2     0   |   6     3
   e    9     ∞   |   ∞     6   |   0     7
   f    1     ∞   |   ∞     3   |   7     0
The initial d array, holding the current distance of each remaining vertex to the partial tree rooted at vertex a, is then:

        b     c     d     e     f
  d:   10    11     5     9     1
The root process gathers the minimum values of 10, 5, and 1 from processes p1,
p2 and p3 respectively and determines the global minimum value of 1 between nodes
a and f. It then broadcasts this edge, which is included in the tree; node f is removed
from the remaining vertex set and all of its incident edges are tested to obtain the new d as below:
        b     c     d     e
  d:   10    11     3     9
This time node d is broadcast to all processes, removed from the remaining set, and d is
updated to yield (8, 2, 6) for nodes b, c, and e respectively. Three more rounds
of the parallel algorithm provide the same MST found by the other methods.
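The communication details of Algorithm 7.6 are not reproduced here, but the partitioned computation can be emulated sequentially in Python to make the data flow concrete: each of k hypothetical processes owns a slice of the d array, reports the minimum of its slice, and the supervisor reduces these local minima to the global MWOE before every process updates its own slice (an illustrative sketch under these assumptions, not the book's pseudocode):

def partitioned_prim_rounds(vertices, W, k):
    """Emulates the 1-D partitioned Prim scheme: vertices are split into k slices,
    W[u][v] is the edge weight (float('inf') if no edge). Starts from vertices[0].
    Returns the MST edges in the order they are found."""
    s = vertices[0]
    d = {v: W[s][v] for v in vertices if v != s}     # current best distance to the tree
    via = {v: s for v in d}                          # tree endpoint realizing d[v]
    slices = [vertices[i::k] for i in range(k)]      # each "process" owns one slice
    mst = []
    while d:
        # each process finds the minimum of d restricted to its own slice
        local_mins = []
        for sl in slices:
            owned = [(d[v], v) for v in sl if v in d]
            if owned:
                local_mins.append(min(owned))
        dist, v = min(local_mins)                    # all-to-one reduction at the supervisor
        mst.append((via[v], v, dist))
        del d[v]                                     # v joins the tree; broadcast is implicit here
        for u in d:                                  # every process relaxes its own entries
            if W[v][u] < d[u]:
                d[u], via[u] = W[v][u], v
    return mst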
Analysis
Each process pi finds the minimum value and performs the updates in Θ(n/k) time
and the total time is Θ(n²/k) for n rounds. It takes Θ(log k) time to perform the one-to-all
communication in each round, resulting in Θ(n log k) total time for communication.
The total time taken is

TP = Θ(n²/k) + Θ(n log k)    (7.2)
Since the sequential time is Θ(m log n) for Prim's algorithm, the speedup obtained
is

S = Θ(m log n) / (Θ(n²/k) + Θ(n log k))    (7.3)
Merging edges in step 2 can be performed using Kruskal’s algorithm. Figure 7.6
shows the operation of this parallel algorithm in a small graph where we partition the
adjacency matrix to four processes. Computing the edges of the MST in each partition
takes O(n²/k) time and there are O(log k) merging operations, each with a cost of
O(n² log k), and each process sends O(n) edges in one merge, resulting in a total
parallel time of O(n²/k) + O(n² log k) [14].
Various parallel MST algorithms are based on Boruvka’s algorithm. One such
approach is reported in [4] where the resulting super vertices after contraction consist
of trees of vertices. Neighborhood information is kept in edge lists, one for each
vertex. There are at most n − 1 elements in each edge list. The steps of the parallel
Boruvka’s algorithm in this study consists of the following steps:
Fig. 7.6 Running of parallel Kruskal’s algorithm in a sample graph. There are 4 processes and the
adjacency matrix of the graph is partitioned to four processes p0 , p1 , p2 and p3 as shown by dashed
lines
1. Choose lightest edge: The edge list of each vertex is searched to find the lightest
weight edge incident to that vertex to form components. Cycles are removed to
have a tree for each component.
2. Find root: Each vertex finds the root of the tree it belongs to using the pointer
jumping method.
3. Rename vertices: Each process pi , 1 ≤ i ≤ k determines the new name of each
vertex listed in its edge lists.
4. Merge: The edge lists in each component are merged to the root edge list to shrink
it into a single super-vertex.
5. Clean up: Each process pi runs the sequential MST algorithm in its edge list.
The parallel running time for this algorithm is given as Θ((ts + tw)(m log n)/p),
which results in a speedup comparable to the number of parallel processes, but the
constant (ts + tw) may be very large [4]. Contraction can be performed using the
edge or star contraction methods we have reviewed in Chap. 3 to obtain a parallel
Boruvka's algorithm [1].
In the distributed version of this problem, we are interested in finding the MST of a
network in which every node is involved in the construction. We will consider each
of the three sequential algorithms for this purpose. As a first attempt, investigation of
Prim’s algorithm reveals it is basically sequential in nature. However, the synchro-
nous single initiator (SSI) model of distributed processing may be convenient for this
purpose. We will build and use the MST for proper transfer of messages between the
root and other nodes in the tree. The processing is performed in synchronous rounds
in this model and we have a root process a which initiates each round. In the first
round, it includes the lightest edge incident to it in the MST. In each round thereafter,
the root solicits the MWOE of each leaf of the partial T which are convergecast to
the root. It then finds the smallest of MWOEs received from children and broadcasts
this to the members of T which can update their states. In essence, we are processing
the graph exactly as in the sequential algorithm but since we do not know the global
MWOE beforehand, the special process root has to receive all candidates from each
leaf of T and determine the lightest edge. We will describe a possible implementation
of this idea similar to [7,15]. The messages needed are as follows.
• start: This is sent by the root to its children in each round. It has a dual purpose;
initiation of a new round k and carrying the MWOE (u, v) determined in round
k-1 of the partial tree T .
• reply: This is the convergecast message from leaves to the root. At each intermedi-
ate node, the MWOEs of children are gathered, compared with the own MWOE,
and the smallest of them is sent by the reply message to the parent.
• check: This message is needed to avoid cycles. The newly added node v sends it
to neighbors to check the ones already in T and for such a node u, edge (u, v) is
marked as internal so it will not be considered as MWOE of v in future rounds.
• status: This message is returned by a node u as a reply to check message from v
and contains information about whether u ∈ T or not.
Algorithm 7.9 shows one way of implementing the procedure we have described.
The synchronization is provided by messages only and the root starts the next round
only after all convergecast messages from its children are received. It selects the
lightest edge (u, v) with v as the new vertex and sends it to the nodes of the partial T
in the next round. Any node x that has an edge to vertex v marks this edge (x, v) as
internal to prevent cycles in T . The vertex v checks whether its neighbors are in T or
not. This is again needed to prevent cycles as a neighbor may have become part of T .
The leaves start the convergecast process which ends at root with MWOEs received
from children.
The running of this algorithm in a small network graph is depicted in Fig. 7.7
where the building of the MST is completed in seven rounds.
This algorithm correctly finds the MST of a graph as it mimics the sequential
Prim algorithm in a distributed setting. Each step k of the algorithm requires O(k)
time and messages. The time and message complexities are therefore both O(n²).
Looking at the other sequential algorithms, Kruskal's algorithm is difficult to implement
by the nodes in the network as it requires a global ordering of the weights of
the edges. However, Boruvka's algorithm involves independent steps. Since each graph
vertex is now a node in the network, finding the lightest incident edge in the initial
phase can be done by each node in a single step. We need, however, to find ways
to contract and manage the contracted nodes. A simple yet effective approach is to
elect a leader for each contracted component which can find the MWOE of the nodes in
its component and ask the component at the other end of this MWOE for
a merge operation. The leader may be the lowest identifier node or the newest node in
the component. The leaders of the components then communicate with neighboring
leaders and decide on the lightest edge between them. A similar method is employed
in the algorithm of Gallager, Humblet, and Spira to find the MST of a network [8].
The nature of Boruvka's algorithm lends itself to distributed processing conveniently;
however, the leader election step has to be performed.
Fig. 7.7 Running of distributed Prim’s MST algorithm in a graph from vertex a for seven rounds.
The gray nodes show the leaves of the tree formed. The downcasting of the probe message is in the
reverse direction of arrows of bold lines from parent to children and upcasting of ack message is in
the direction of arrows from children to parent
In various real-life situations, we may be interested in finding the shortest path, that
is, the path with minimum total weight among all paths between two vertices.
For example, the shortest traveling route between two cities may be required. We
review algorithms for this problem in this section. In all of these algorithms, we will
employ a technique called relaxation which can be described as follows. Given a
weighted graph G = (V, E, w), we want to find shortest paths from a source vertex
s ∈ V to all other vertices in the graph G. We define a distance value dv which
shows the best estimate of the current distance of v to s and a predecessor vertex
u of v which is its parent in the tree formed rooted at the source vertex s. We will
be forming a spanning tree T rooted at s at the end of a shortest path algorithm,
sometimes referred to as shortest path tree in which the sum of weights of a path
from s to a vertex v in this tree will be minimum among all possible paths from s
to v. Distance of each vertex from the source vertex is set to infinity and its parent
is undefined initially. Relaxation then involves checking whether a path to a vertex v
through a neighbor vertex u is shorter than its current distance estimate; if it is, the
distance of v is updated to go through that neighbor vertex u and its parent is set to
u, as shown in the following steps performed for each vertex v. Note that we need to add the
weight of the edge between these two vertices to the distance of vertex u to obtain the
candidate distance.
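Written as code, the relaxation step may be sketched as follows (a minimal Python illustration, not the book's pseudocode; dist and parent are assumed to be dictionaries holding the current estimates and predecessors described above):

def relax(u, v, w_uv, dist, parent):
    """Relax edge (u, v): if reaching v through u is shorter than the current
    estimate dist[v], update the estimate and record u as v's predecessor."""
    if dist[u] + w_uv < dist[v]:
        dist[v] = dist[u] + w_uv
        parent[v] = u
        return True          # the estimate improved
    return False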
Another issue of concern is whether the graph has negative weights and/or negative
cycles.
In the more general case, we can search shortest paths from a single vertex to all
other vertices which is called the single source shortest path (SSSP) problem for
which we review a fundamental algorithm due to Dijkstra.
The algorithm starts by including the source vertex s in the visited vertices set. Thereafter at each iteration, the distance value and predecessor of any
neighbor u of the newly included vertex v are updated if the distance through v is
smaller than the current distance. This algorithm processes all vertices and eventually
forms a spanning tree rooted at the source vertex s. We have a vertex set S of vertices
to be processed, an array D with D[i] showing the current shortest distance
of vertex i to the source vertex s, and another array P with P[i] showing the
current predecessor of vertex i along this path, as shown in Algorithm 7.3.
The running of this algorithm is depicted in Fig. 7.8. The source vertex is f and
the nearest vertex to f is a which is included in the searched vertices. Then all
neighbors of a, which are b and e, are checked to see whether they have a shorter distance to
the source vertex f through a. Since these vertices had infinite distances initially,
their distances are modified to smaller values through a. Then we find vertex e has
the smallest distance value, include it in the searched vertices, and update the distance
values of its neighbors. Note that vertex b has a smaller distance value of 7 through
e and therefore its distance is updated and its predecessor becomes e. This process
continues until we have searched all vertices, which is performed by removing the minimum
distance vertex v from the initial vertex set V at each iteration.
Analysis
Theorem 7.9 (correctness) For each vertex v ∈ S at any time during Dijkstra’s
shortest path algorithm execution, the path Ps,v obtained by the algorithm is the
shortest path between the source vertex s and the vertex v.
Fig. 7.8 Running of Dijkstra’s SSSP algorithm in a sample directed graph from the source vertex
f . At each iteration, the vertex with the minimum distance value shown by a large circle is selected
We will have proved the correctness of the algorithm by proving this theorem
since the set S will contain all of the vertices of the graph at the end of the algorithm.
Proof We will use induction for the proof as in [11]. For the base case, d(s) = 0
and S = {s} when |S| = 1 and hence Ps,s is the shortest path.
Let us consider adding a vertex v ∉ S to S when the size of S is k, and let u ∈ S
be a neighbor vertex of v on the shortest path Ps,v from the source vertex s to vertex v.
Consider any arbitrary path P from s to v. Our hypothesis is that the total weight
of this path is at least as high as the total weight of Ps,v . Let vertex a be the last
vertex on P just before it leaves S and the vertex b ∈ {V \ S} be the first vertex that
is the neighbor of vertex a on this path as depicted in Fig. 7.9. We know the total
weight of path Ps,u , w(Ps,u ), is the minimum distance to vertex u from the source
vertex s by the inductive hypothesis. If w(a, b) < w(u, v) the algorithm would have
selected the edge (a, b) rather than the edge (u, v). Therefore, w(P) ≥ w(Ps,v ) which
means the path Ps,v found during the (k + 1)th iteration of the algorithm is the shortest path from
vertex s to vertex v. Note that we have relied on the nonexistence of negative weight
edges.
We need to run the while loop of the algorithm O(n) times for n vertices since we
process a single vertex at each iteration. We also need to find the smallest distance
of unprocessed vertices to the source vertex s in O(m) time since we may need to
consider all edges to find the minimum value. Hence the time complexity of this
algorithm is O(nm) in this straightforward implementation.
We can improve the performance of this algorithm by using a priority queue.
In this case, we will use three priority queue operations; Insert, ExtractMin, and
DecreaseKey. We need to insert all vertices in the queue Q by the Insert operation,
find the minimum value of the queue by the ExtractMin operation and DecreaseKey
operation during relaxation where we update distance values of the neighbors of
the selected vertex. When a binary min-heap is used as the priority queue, time to
construct the queue takes O(n) time. We need n ExtractMin operations for n vertices
each with O(log n) time and O(m) steps of relaxation using DecreaseKey during
relaxation each with O(log n) time. Hence, total time taken is O((n + m) log n). A
Fibonacci heap that has an amortized O(log n) time for ExtractMin operation and
O(1) amortized time for DecreaseKey operation can be used instead of the binary
min-heap. In this implementation, the time complexity is reduced to O(n log n + m).
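The priority queue version can be sketched in Python as follows (an illustration only, not the book's Algorithm 7.3; as in the Prim sketch earlier, stale heap entries are skipped instead of performing an explicit DecreaseKey):

import heapq

def dijkstra_sssp(adj, s):
    """adj: dict u -> list of (v, w) pairs with nonnegative weights.
    Returns (dist, parent) for shortest paths from source s."""
    dist = {v: float('inf') for v in adj}
    parent = {v: None for v in adj}
    dist[s] = 0
    heap = [(0, s)]
    done = set()
    while heap:
        d_u, u = heapq.heappop(heap)       # ExtractMin
        if u in done:                      # stale entry, vertex already settled
            continue
        done.add(u)
        for v, w in adj[u]:                # relax all outgoing edges of u
            if d_u + w < dist[v]:
                dist[v] = d_u + w
                parent[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, parent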
Analysis
The following lemma helps to prove correctness of this algorithm [12]:
Fig. 7.10 Running of Bellman–Ford algorithm from the source vertex g in 4 iterations. The visited
vertices at each iteration are shown in gray and the current distance value of a vertex is shown next
to it
• Base case: The distance labels of every vertex other than the source vertex s are
infinite when the algorithm starts. After iteration i = 1, only the neighbors
of vertex s have a finite distance label, with d(s, v) = w(s, v) for all v ∈ N(s). Thus,
all of these neighbors have the shortest distance to s over paths of at most k = 1 edges.
• Inductive case: Let us consider the kth iteration, and assume the theorem holds for
all i < k. Let P be a shortest s − vk path with at most k edges and let (vk−1, vk)
be the last edge of P. By Lemma 7.2, the part of path P up to the vertex vk−1
(Ps,vk−1) is a shortest path between s and vk−1; and by the induction hypothesis,
dist(s, vk−1) ≤ w(E(Ps,vk−1)) after the (k − 1)th iteration. After the kth iteration,
we have dist(s, vk) ≤ dist(s, vk−1) + w(vk−1, vk) ≤ w(E(P)).
The above reasoning is valid when there are no negative cycles in the graph.
We now want to prove by contradiction that this algorithm returns false when
there is a negative cycle. Let us assume the graph G contains a negative cycle C =
{v0, v1, . . . , vk, v0} such that Σ_{i=0}^{k} w(vi, vi+1) < 0, with vk+1 = v0, and that the algorithm
returns true. There is a path from the source vertex s to v1 and to all other vertices
of C; let d(vi) be the distance obtained in the first part of the algorithm using
relaxation. Since we assumed the algorithm returns true without detecting negative
cycles, d(vi+1) ≤ d(vi) + w(vi, vi+1) for i = 1, . . . , k. When we sum these inequalities over all vertices
in the cycle, we obtain
Σ_{i=1}^{k} d(vi+1) ≤ Σ_{i=1}^{k} (d(vi) + w(vi, vi+1))

Σ_{i=1}^{k} d(vi+1) ≤ Σ_{i=1}^{k} d(vi) + Σ_{i=1}^{k} w(vi, vi+1)

Since we sum over the cycle C, Σ_{i=1}^{k} d(vi+1) = Σ_{i=1}^{k} d(vi), and canceling these terms in the above inequality results in the following:

0 ≤ Σ_{i=1}^{k} w(vi, vi+1)

which contradicts the assumption that the cycle C has negative total weight. Hence the algorithm returns false when a negative cycle exists.
Proof We need to have n − 1 iterations of the outer for loop to consider the longest
path in a graph since there may be n − 1 changes of the distance of a vertex over this
longest path. There may be at most m edge checks at each iteration of the inner
loop at line 7 and hence, the total time complexity of this algorithm is O(nm). It
is, therefore, a slower algorithm than Dijkstra's SSSP algorithm; however, it allows
negative weight edges which may be needed in real-life applications.
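A short Python sketch of the Bellman–Ford procedure with the negative cycle check described above is given below (illustrative only, not the book's pseudocode; vertices are assumed to be numbered 0..n−1 and edges given as (u, v, w) tuples):

def bellman_ford(n, edges, s):
    """Returns (dist, parent, ok) where ok is False if a reachable negative cycle exists."""
    INF = float('inf')
    dist = [INF] * n
    parent = [None] * n
    dist[s] = 0
    for _ in range(n - 1):                 # n-1 passes suffice for any shortest path
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                parent[v] = u
                changed = True
        if not changed:                    # early exit when no distance improved
            break
    # one extra pass: any further improvement implies a negative cycle
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return dist, parent, False
    return dist, parent, True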
Proof Since the distributed algorithm has the same logic as the sequential algorithm,
we can conclude each node finds its distance to a source node correctly. We have
noted that the root needs to execute n − 1 rounds to take the longest path in the
network into account, hence time complexity in rounds for this algorithm is O(n).
Each edge in the network is used to send update messages in both directions in
each round, resulting in a total of 2m messages per round. Total number of messages
exchanged will, therefore, be O(nm).
In a more general case, we may need to discover shortest paths from all vertices to
all other vertices in the graph, which is called the all-pairs shortest paths (APSP)
problem. As a first approach, we can run Dijkstra's SSSP algorithm for each vertex
of the graph, resulting in O(n² log n) time complexity. When the graph has edges
with negative weights, we cannot use Dijkstra's algorithm, and using the Bellman–Ford
algorithm for this purpose yields a time complexity of O(n²m) when running
it for n vertices. We will search for algorithms with better performance when dealing
with negative weight edges, and one such algorithm is the Floyd–Warshall algorithm described
in the next section.
Correctness follows from the relaxation rule as we always improve the shortest
paths using all possible pivots. We have three nested loops, each running n times,
resulting in a time complexity of O(n³) for this algorithm, with the initialization
taking O(n²) time. A small example graph is depicted in Fig. 7.11.
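The triple loop can be sketched directly in Python (an illustrative sketch, not the book's pseudocode; W is assumed to be an n × n weight matrix with zeros on the diagonal and float('inf') for missing edges):

def floyd_warshall(W):
    """Returns the matrix of all-pairs shortest path distances D(n)."""
    n = len(W)
    D = [row[:] for row in W]              # D(0) is the weight matrix itself
    for k in range(n):                     # allow vertex k as an intermediate (pivot) vertex
        for i in range(n):
            for j in range(n):
                if D[i][k] + D[k][j] < D[i][j]:
                    D[i][j] = D[i][k] + D[k][j]
    return D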
The contents of the distance matrix initially (D(0)) and after the first iteration (D(1)) are shown
below, followed by the matrices after the second and fourth iterations and the predecessor matrix P;
the modified entries of an iteration are those improved through the pivot vertex of that iteration.

D(0):
        1    2    3    4    5    6
   1    0    1    ∞    ∞    6    2
   2    1    0    2    1    4    ∞
   3    ∞    2    0    6    ∞    ∞
   4    ∞    1    6    0    2    ∞
   5    6    4    ∞    2    0    9
   6    2    ∞    ∞    ∞    9    0

D(1):
        1    2    3    4    5    6
   1    0    1    ∞    ∞    6    2
   2    1    0    2    1    4    3
   3    ∞    2    0    6    ∞    ∞
   4    ∞    1    6    0    2    ∞
   5    6    4    ∞    2    0    8
   6    2    3    ∞    ∞    8    0

D(2):
        1    2    3    4    5    6
   1    0    1    3    2    5    2
   2    1    0    2    1    4    3
   3    3    2    0    3    6    6
   4    2    1    3    0    2    4
   5    5    4    6    2    0    7
   6    2    3    6    4    7    0

D(4):
        1    2    3    4    5    6
   1    0    1    3    2    5    2
   2    1    0    2    1    4    4
   3    3    2    0    3    6    6
   4    2    1    3    0    2    4
   5    5    4    6    2    0    6
   6    2    4    6    4    6    0

P:
        1    2    3    4    5    6
   1    0    2    2    2    2    6
   2    1    0    3    4    4    1
   3    2    2    0    2    2    2
   4    2    2    2    0    5    2
   5    2    4    4    4    0    4
   6    1    1    1    1    1    0
S = Θ(n³) / Θ(n²) = n,    E = Θ(1)    (7.5)
If the number of processes k is smaller than the number of nodes, this algorithm
has good performance, otherwise, it will scale poorly.
Computation-Partitioned SSSP Paths
We can have the SSSP algorithm running on a number of parallel processes when
k > n as follows. Assuming we have k processes available for parallel computation,
we assign k/n processes to each vertex and then run these k/n processes in parallel for each
vertex as described in Sect. 7.3.1.3 when parallelizing Dijkstra's SSSP algorithm. In
other words, we have n parallel SSSP computations, each of which is handled by k/n
processes.
TP = Θ(n³/k) + Θ(n log k)

S = Θ(n³) / (Θ(n³/k) + Θ(n log k)),    E = 1 / (1 + Θ((k log k)/n²))    (7.6)
Fig. 7.12 2-D partitioning of a 16 × 16 D matrix for a graph with 16 vertices to 16 processes
in the PFW_APSP algorithm. The process p2,3 for example, has upper left corner coordinates (5,
9) and lower right coordinates (8, 12). This process needs all matrix entries held by processes
p2,1, p2,2, p2,4 in its row and by processes p1,3, p3,3, p4,3 in its column to be able to compute its
subblock D values for the current iteration
Algorithm 7.14 PFW_APSP
1: Input: my subblock of D(0), my portion of the distance matrix
2: Output: my subblock of D(n), the shortest path values for my subblock
3:
4: for k = 1 to n do    {update distances using pivot k}
5:    broadcast my segment of D(k−1) to all processes in my row
6:    broadcast my segment of D(k−1) to all processes in my column
7:    receive D(k−1) values from processes in my row and column
8:    compute D(k) for my subblock
9: end for
Each process pi,j holds n/√p elements of the kth row or column, which are
broadcast in Θ((n log p)/√p) time. The synchronization in line 7 requires Θ(log p)
time and the computation of the n²/p values assigned to a process requires Θ(n²/p) time,
resulting in a total parallel processing time of

TP = Θ(n³/p) + Θ((n²/√p) log p)    (7.7)

We know that the sequential algorithm has a time complexity of Θ(n³), therefore the
speedup S can be stated as follows:

S = Θ(n³) / (Θ(n³/p) + Θ((n²/√p) log p))    (7.8)
E = 1 / (1 + Θ((√p log p)/n))    (7.9)

From the speedup and efficiency equations, we can conclude that this algorithm can
employ O(n²/log² n) processes. The synchronization step can be omitted to obtain a
faster, pipelined version of the 2-D algorithm with efficiency 1/(1 + Θ(p/n²)) [10].
• There is a special node called the root which initiates each round.
• A spanning tree is built beforehand to send and receive control messages such as
broadcast round and convergecast round_over messages.
• Nodes have unique integer identifiers in the range 1, . . . , n.
• The root sends round number r in each round which is interpreted as the parameter
k in the sequential algorithm. Any node that finds its identifier equal to r will
broadcast its D values for all other nodes to compare.
Figure 7.13 displays the running of this algorithm in a small network. Broadcasting
of the Dk vector is the main bottleneck in this algorithm. We can have the node r send its
D vector to the root, which then broadcasts this vector to all nodes over the spanning
tree. Toueg provided an asynchronous version of this algorithm by reducing the set
of nodes that should receive the Dk values, with a time complexity of O(n²) and a
message complexity of O(nm) [19].
Fig. 7.13 Running of DFW_APSP algorithm in a small network. The current vectors at each node
are shown next to them. At iterations k = 2 and k = 4, there are no changes to previous distance
values and these are not shown. After five iterations, all of the shortest paths are determined
Exercises
1. Find the MST of the sample graph of Fig. 7.14 using Prim’s MST algorithm.
2. Work out the MST of the graph depicted in Fig. 7.15 using both Kruskal’s and
Boruvka’s MST algorithms and show they both result in the same MST of this
graph since edge weights are distinct.
3. Work out single source shortest paths from vertex a of the digraph depicted in
Fig. 7.16 using Dijkstra_SSSP algorithm by showing each iteration.
4. Write the pseudocode of parallel Dijkstra_SSSP algorithm and work out its
efficiency.
5. Construct single source shortest paths from vertex a of the digraph depicted in
Fig. 7.17 using BF_SSSP algorithm by showing each iteration.
6. Construct all-pairs shortest paths of the digraph depicted in Fig. 7.18 using
FW_APSP algorithm by showing each iteration.
7. Form the distance matrix D for the graph of Fig. 7.19 and provide a 2-D parti-
tioning of this matrix to 4 processes. Show the data sent by each process during
parallel running of Floyd–Warshall algorithm for the first two iterations. Work
out the final D values after k iterations.
8. Modify the distributed APSP algorithm DFW_APSP pseudocode so that the root
node may also execute this code, by showing the starting and ending of each round.
References
1. Acar UA, Blelloch GE (2017) Algorithm design: parallel and sequential. Chapter 18, Draft book,
Carnegie Mellon University, Dept. of Computer Science. https://www.parallel-algorithms-
book.com/
2. Bellman R (1958) On a routing problem. Q Appl Math 16:87–90
3. Boruvka O (1926) About a certain minimal problem. Práce Mor. Přírodověd. Spol. v Brně III (in
Czech, German summary), 3:37–58
4. Chung S, Condon A (1996) Parallel implementation of Boruvka’s minimum spanning tree
algorithm. Technical report 1297, Computer Sciences Dept., University of Wisconsin
5. Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms, 3rd edn. MIT
Press, Cambridge, Chapter 23
6. Dijkstra EW (1959) A note on two problems in connexion with graphs. Numerische Mathematik
1:269–271
7. Erciyes K (2013) Distributed graph algorithms for computer networks. Chapter 6, Springer
computer communications and networks series. Springer, Berlin
8. Gallager RG, Humblet PA, Spira PM (1983) A distributed algorithm for minimum-weight
spanning trees. ACM Trans Program Lang Syst 5(1):66–77
9. Graham RL, Hell P (1985) On the history of the minimum spanning tree problem. Ann Hist
Comput 7(1):43–57
10. Grama A, Gupta A, Karypis G, Kumar V (2003) Introduction to parallel computing, 2nd edn.
Chapter 10. Addison Wesley, Boston
11. Kleinberg J, Tardos E (2005) Algorithm design. Chapter 4. Pearson Int. Ed. ISBN-13:
978-0321295354, ISBN-10: 0321295358
12. Korte B, Vygen J (2008) Combinatorial optimization: theory and algorithms. Chapter 7, 4th
edn. Springer, Berlin
13. Kruskal JB (1956) On the shortest spanning subtree of a graph and the traveling salesman
problem. Proc Am Math Soc 7:48–50
14. Loncar V, Skrbic S, Bala A (2013) Parallelization of minimum spanning tree algorithms using
distributed memory architectures. Transaction on engineering technology. Special volume of
the world congress on engineering, pp 543–554
Connectivity
8
Abstract
An undirected graph is connected if there is a path between any pair of its vertices.
In a digraph, connectivity implies there is a path between any two of its vertices
in both directions. We start this chapter by defining the parameters of vertex and
edge connectivity. We continue by describing algorithms to find cut-vertices and
bridges of undirected graphs. We then review algorithms to find blocks of graphs
and strongly connected components of digraphs. We describe the relationship
between connectivity, network flows, and matching, and review sequential,
parallel and distributed algorithms for all of the mentioned topics.
8.1 Introduction
Connectivity is a fundamental concept in graph theory which has both theoretical and
practical implications. An undirected graph is connected if there is a path between
any pair of its vertices. In a digraph, connectivity implies there is a path between any
two of its vertices in both directions. In practice, the study of connectivity is needed
for reliable communication networks as connectivity has to be provided in loss of
edges (links) or vertices (routers) in these networks. A cut-vertex of a graph G is a
special vertex in G removal of which disconnects G. Similarly, removing an edge
called bridge of a connected graph G disconnects G. It would be of interest to detect
such parts of networks to enhance connectivity around these regions by supplying
additional communication devices and links.
We start this chapter by formally defining the parameters of vertex and edge
connectivity. We continue by describing algorithms to find cut-vertices and bridges
of undirected graphs. Blocks are maximal connected components of a graph without
a cut-vertex and we review algorithms to find blocks of graphs. We then review
strongly connected components of digraphs along with algorithms to discover them.
Connectivity is related to network flows and matching as we will see. Our main goal
in a flow network is to find the maximum flow from a source node to a destination
node and we show an algorithm to find maximum flow may be used to find how well
connected a graph is. A matching of a graph is a set of its disjoint edges and we
will see in the next chapter a flow algorithm can be employed to find a maximum
matching of a bipartite graph. We also provide parallel and distributed algorithms
for most of the topics discussed.
8.2 Theory
When a vertex-cut consists of a single vertex, this vertex is called a cut-vertex (or an
articulation point) of G. A complete graph Kn of order n does not have a cut-vertex since
there is no single vertex whose removal disconnects such a graph. An edge-cut of a
graph can be defined similarly as follows.
0 ≤ κ(G) ≤ n − 1 (8.1)
Definition 8.6 (edge connectivity) The edge-connectivity λ(G) of a connected graph
G is the minimum number of edges whose removal results in a disconnected graph.
8.2.2 Blocks
A block of a graph G is a maximal set of edges such that any two edges in the set lie on a
common cycle; equivalently, a block is a maximal connected subgraph of G that has no cut-vertex
of its own. Every graph is a union of its blocks. A block B of a graph G may contain cut-vertices of
G although it cannot have a cut-vertex of its own. An edge is a block of a graph G if and only if it
is a bridge of G. Therefore, each edge of a tree is a block and every isolated vertex of a graph is a
block. In summary, the blocks of a graph consist of all of its biconnected components,
all bridges, and all isolated vertices. Blocks of a sample disconnected and undirected
graph are shown in Fig. 8.2.
We need to define disjoint paths between two vertices of a graph before stating
Menger’s theorems for connectivity.
Fig. 8.2 Blocks of an undirected graph shown encircled. The bold vertex, for example, is a cut-
vertex of the graph but not a cut-vertex of the block it belongs to
Definition 8.8 (edge connectivity of two vertices) Let u and v be two distinct vertices
of an undirected graph G. The edge connectivity of vertices u and v, λ(u, v), is the
least number of edges that are to be deleted from G to have u and v disconnected.
Definition 8.9 (vertex connectivity of two vertices) Let u and v be two distinct ver-
tices of an undirected graph G. The vertex connectivity of vertices u and v, κ(u, v),
is the least number of vertices selected from V − {u, v} that are to be deleted from G
to have u and v disconnected. We can immediately see that the vertex connectivity
of the graph G is the minimum of κ(u, v) for each pair of vertices u and v.
Definition 8.10 (vertex disjoint paths) Collection of paths between the two vertices
u and v of a graph G are called vertex disjoint (independent) if they do not share any
vertices other than u and v. The greatest number of independent paths between the
two vertices u and v is denoted as κ(u, v).
Definition 8.11 (edge-disjoint paths) Collection of paths between the two vertices
u and v of a graph G are called edge disjoint (edge-independent) if they do not share
any edges. The greatest number of edge-independent paths between the two vertices
u and v is denoted as λ(u, v).
We will now state Menger's theorems, without proving them, which provide necessary
and sufficient conditions for a graph to be k-connected or k-edge connected.
Theorem 8.2 (Menger’s Theorem, vertex version) Let κ(u, v) be the maximum num-
ber of vertex disjoint paths between the vertices u and v. A graph is k-connected if
and only if each vertex pair in the graph is connected by at least k disjoint paths.
Theorem 8.3 (Menger’s Theorem, edge version) Let λ(u, v) be the maximum num-
ber of edge disjoint paths between the vertices u and v. A graph is k-edge-connected
if and only if each vertex pair in the graph is connected by at least k edge-disjoint
paths.
path to the source vertex, forming a spanning tree rooted at the source in the end.
Time spent will be O(n + m) as in the DFS or BFS algorithm. We can find the
connected components of an undirected graph using the DFS algorithm with a simple
modification: run DFS on the graph to get a forest; each tree in the forest, formed by a
call from the main program, is then a connected component, as shown in the modified
version of DFS in Algorithm 8.1. Every time a return is performed from the DFS
procedure, all of the vertices in that component have been visited. We can actually
label the vertices with the components they are in by defining an array label[1 . . . n],
where label[i] shows the number of the component vertex i belongs to. The time taken to
find the components of the graph G using the DFS algorithm is O(n + m).
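A possible realization of this labeling in Python is sketched below (illustrative only, not the book's Algorithm 8.1; an iterative DFS with an explicit stack is used to avoid deep recursion):

def label_components(adj):
    """adj: dict u -> list of neighbors (undirected). Returns a dict label
    where label[v] is the number of the connected component v belongs to."""
    label = {}
    comp = 0
    for s in adj:
        if s in label:
            continue
        comp += 1                      # a new call from the 'main program': new component
        stack = [s]                    # iterative DFS
        label[s] = comp
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in label:
                    label[v] = comp
                    stack.append(v)
    return label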
connected and hence, a deficit network. We need to find such articulation points in
networks to provide additional links around them to make the network more robust
to failures. We will first describe a naive algorithm to find articulation points of an
undirected graph and then a DFS-based algorithm with better time complexity.
Remark 4 A vertex w in a DFS tree of a graph G cannot be an articulation point if there is a back
edge (u, v) with u in the subtree of w and v an ancestor of w, as removal of w does not leave G
disconnected.
We can see this is valid as (u, v) still keeps the graph G connected when w is removed; conversely,
removal of a vertex w whose subtree has no such back edge leaves G disconnected and
therefore, w is an articulation point. We now have a property to classify vertices:
any vertex that does not have a back edge from a vertex in its subtree to one of its
ancestors in a DFS tree is an articulation point. The root r of the DFS tree needs
special treatment as it has no back edges but it can still be an articulation point if and
only if it has more than one child. In such a case, if any two vertices in the subtrees of different children of the root were connected by a non-tree edge, they would be in the same subtree. Therefore, when the root r has more than one child, removal of r will leave the graph G disconnected.
Remark 5 The root vertex of a DFS tree of a graph G is an articulation point if and
only if it has more than one child.
The value low(v) of a vertex v is defined as the minimum of the following:
1. num(v) (Rule 1)
2. the lowest num(u) among all back edges (v, u) (Rule 2)
3. the lowest low(u) among all tree edges (v, u) (Rule 3)
Any vertex v other than the root in the DFS tree is an articulation point if and only if low(u) ≥ num(v) for some child u of v, meaning there are no back edges from any vertex in the subtree of u to any ancestor of v. The root is an articulation point if and only if it has more than one child. We can now structure an algorithm based on the foregoing as shown in Algorithm 8.3. The procedure assign_num is basically a DFS algorithm which also assigns the num values to vertices as they are visited. The second procedure check_AP finds the low values of the vertices by applying the rules above, tests the articulation point condition, and includes the vertices that satisfy this condition in the set of articulation points.
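A recursive Python sketch of this num/low computation follows. It is an illustrative rendering of the idea rather than the book's Algorithm 8.3, and it assumes an adjacency-list representation of the undirected graph.

import sys

def articulation_points(adj):
    """Return the set of articulation points of an undirected graph.
    adj: dict vertex -> list of neighbors."""
    sys.setrecursionlimit(10000)
    num, low, aps = {}, {}, set()
    counter = [0]

    def dfs(v, parent):
        counter[0] += 1
        num[v] = low[v] = counter[0]
        children = 0
        for u in adj[v]:
            if u not in num:                      # tree edge (v, u)
                children += 1
                dfs(u, v)
                low[v] = min(low[v], low[u])      # Rule 3
                if parent is not None and low[u] >= num[v]:
                    aps.add(v)                    # non-root condition
            elif u != parent:                     # back edge (v, u)
                low[v] = min(low[v], num[u])      # Rule 2
        if parent is None and children > 1:
            aps.add(v)                            # root condition

    for v in adj:
        if v not in num:
            dfs(v, None)
    return aps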
Running of this algorithm is shown in Fig. 8.3. A simple graph with two articula-
tion points b and d is given in (a). We form a DFS tree shown in (b) for this graph and
label every vertex v with num(v) and low(v) as described. For example, vertex g is labeled 7, 1 since it is discovered last in the DFS and the back edge (g, b) connects it to vertex b, which has a num value of 1; therefore, its low value is set to 1. We find that d has a descendant vertex e whose low value of 4 is equal to the num value of d; therefore, vertex d is an articulation point. Vertex b is an articulation
point simply because it has more than one child. Other possible DFS trees rooted at
vertices c and e are shown in (c) and (d) of the same figure. We find vertices b and d
are again articulation points in both with the same reasoning as above. The runtime
of this algorithm is simply the time it takes for DFS which is O(n + m).
Fig. 8.3 Running of the DFS-based AP algorithm in a sample graph for three different DFS trees formed. The articulation points found are vertices b and d in all cases, shown in double circles
Hopcroft–Tarjan Algorithm
Hopcroft and Tarjan provided an algorithm to find blocks of a graph using DFS as
in the articulation point algorithm [10]. The main idea of this algorithm is the key
observation that the blocks are separated by the articulation points of the graph. Note that an articulation point of a graph is not an articulation point of any block it belongs to, since a block does not contain an articulation point of its own. We can, therefore, discover the articulation points of the graph, and all vertices between any two articulation points form a block. The algorithm uses this fact and operates similarly to the DFS-based articulation point finding algorithm, with the exception that we push visited edges onto a stack until we discover such a cut-vertex, at which point we pop edges from the stack and place their endpoints into a block data structure. The variables num and low are used as in the articulation point algorithm, and the algorithm consists of the following steps.
1. Start DFS from an arbitrary vertex s of the graph G = (V, E). Set counter ← 1
and num(s) ← 1, low(s) ← 1.
2. Perform DFS as usual and whenever a neighbor vertex v of the vertex u under
consideration is encountered, check the edge (u, v).
a. The vertex v is discovered for the first time and thus (u, v) is a tree edge.
Increment counter and set num(v) = counter , low(v) = num(v). Push the
edge (u, v) onto stack S.
b. The vertex v has been visited before and num(v) < num(u). Therefore the
edge (u, v) is a back edge. Set low(u) = min{low(u), num(v)}. Push the
edge (u, v) onto stack S.
c. The vertex v has been visited before with num(v) > num(u). Thus, the edge
(u, v) is a forward edge. This is valid only when G is a digraph. Since the
edge (u, v) has already been processed in this case, we do nothing.
3. Upon backtracking from a vertex v that was searched by using the edge (u, v), set
low(u) = min{low(u), low(v)}. If low(v) ≥ num(u), vertex u is an articulation
point as in Algorithm 8.3. In this case, pop all edges from the stack S up to and
including the edge (u, v). The vertices incident to these edges will form a block
of G.
4. When a return from the source vertex s is performed, pop all remaining edges
from the stack S and include all incident vertices on these edges in a single block.
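A compact Python sketch of these steps is given below. It follows the same num/low and edge-stack idea, but it is an illustrative version rather than the original code of [10], and it assumes the graph is given as a dictionary of adjacency lists.

def blocks(adj):
    """Return the blocks (biconnected components) of an undirected graph
    as lists of edges. adj: dict vertex -> list of neighbors."""
    num, low = {}, {}
    counter = [0]
    stack, result = [], []

    def dfs(u, parent):
        counter[0] += 1
        num[u] = low[u] = counter[0]
        for v in adj[u]:
            if v not in num:                       # tree edge
                stack.append((u, v))
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] >= num[u]:               # u separates a block
                    block = []
                    while True:                    # pop up to and including (u, v)
                        e = stack.pop()
                        block.append(e)
                        if e == (u, v):
                            break
                    result.append(block)
            elif v != parent and num[v] < num[u]:  # back edge
                stack.append((u, v))
                low[u] = min(low[u], num[v])

    for s in adj:
        if s not in num:
            dfs(s, None)
    return result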
• Removing an edge that is part of a cycle of a graph G does not disconnect G and
hence, an edge (u, v) is a bridge if and only if (u, v) is not contained in any cycle.
• Consider a bridge (u, v) of a graph G. The vertices u or v are articulation points
of G if they have a degree greater than 1.
We can apply the same strategy to find bridges of an undirected graph G = (V, E)
as we did in finding the cut-vertices; remove each edge one-by-one and check the
connectivity of the graph using DFS or BFS after each removal. If G becomes
disconnected after removing an edge e, this edge e is a bridge (cut-edge) of G. We
need to execute the loop for each edge for a total of m times and checking connectivity
takes O(n + m) time by DFS or BFS resulting in a total time of O(m(n + m)) for
this algorithm. Again, this method is not favorable for large graphs and we look for algorithms with better performance.
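A direct Python rendering of this brute-force strategy is sketched below for illustration only; the connectivity test uses BFS and the graph is assumed to be a dictionary of adjacency lists.

from collections import deque

def is_connected(adj, skip_edge=None):
    """BFS connectivity test, optionally ignoring one undirected edge."""
    start = next(iter(adj))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if skip_edge in ((u, v), (v, u)):
                continue
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)

def bridges_naive(adj):
    """Return all bridges by removing each edge and re-testing connectivity."""
    edges = {tuple(sorted((u, v))) for u in adj for v in adj[u]}
    return [e for e in edges if not is_connected(adj, skip_edge=e)]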
1. Perform a DFS of the graph G = (V, E) from any vertex of G to obtain the DFS
tree T and label each vertex v with num(v) with respect to its first visit time.
2. For each vertex v ∈ V, compute ND(v), low(v), and high(v) and check the bridge condition described below.
We need to test this condition for every vertex in the DFS tree. In order to do so, we
will run the DFS algorithm in the graph G and record the discovery time (num(v))
for each vertex v. Then, we compute the values of N D(v), low(v) and high(v) for
each vertex and check the bridge condition. Running of this algorithm in a small
graph is shown in Fig. 8.5. The edges (b, c) and (d, e) shown in bold are the two
bridges of the graph in (a) as can be seen. We run the DFS algorithm and compute
the num, ND, low, and high values for each vertex as shown next to each vertex in the DFS tree in (b). Then, we check the bridge condition for every vertex v incident on an edge (u, v): low(v) = num(v) and high(v) < num(v) + ND(v). Only vertices c
and e satisfy this condition and hence (b, c) and (d, e) are the bridges of this graph.
The running time is simply the time for the DFS which is O(n + m).
A digraph can be used to model a finite-state machine and strong connectivity in such
a digraph implies recovery from a malfunction state as there is always a path from
every state to another. We may want to find strongly connected people in a social
network who are close friends to analyze such a network. We can check whether a
digraph G = (V, E) is strongly connected or not by selecting an arbitrary vertex
v, running DFS (or BFS), reversing the direction of edges to obtain the transpose
graph G T and then running DFS (or BFS) from that vertex in G T again. If the
visited vertices in both directions equal V, then G is strongly connected. This method is sufficient since any vertex u can then reach any vertex w via v. If the digraph is not strongly connected, this algorithm determines the strongly connected component containing the start vertex v. This component, called Vc, consists of the common vertices visited during the DFS or BFS of G and of G T, as shown in Algorithm 8.5.
Since we run DFS or BFS in both directions, the time required for this algorithm is
O(n + m).
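A short sketch of this check in Python might look as follows; it mirrors the idea behind Algorithm 8.5 rather than its exact pseudocode, and it assumes the digraph is stored as a dictionary of successor lists.

def reachable(adj, s):
    """Vertices reachable from s by a DFS on a digraph given as dict of lists."""
    seen, stack = {s}, [s]
    while stack:
        u = stack.pop()
        for v in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

def is_strongly_connected(adj):
    v = next(iter(adj))
    radj = {u: [] for u in adj}          # transpose graph: reverse every arc
    for u in adj:
        for w in adj[u]:
            radj.setdefault(w, []).append(u)
    return (len(reachable(adj, v)) == len(adj) and
            len(reachable(radj, v)) == len(adj))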
Decomposing a digraph into its SCCs is useful in various algorithms as it allows independent runs of the algorithm on each SCC, thereby allowing parallel processing. There are two fundamental algorithms to detect SCCs in a digraph, due to Tarjan [13] and to Kosaraju (described in Aho et al. [1]). Both algorithms make use of DFS; Tarjan's algorithm works with a single DFS pass while Kosaraju's algorithm requires two DFS passes, but it is simpler to implement than Tarjan's algorithm.
We observe that the contracted digraph G SCC , commonly called component graph
of G, has no cycles, in other words, it is a directed acyclic graph. If there was a cycle
between SCCs, they could be contracted into a larger SCC. Kosaraju's algorithm is based on the idea that the same SCCs exist in a graph G and its transpose G T. We show the high-level description of this algorithm in Algorithm 8.7. It consists of two phases; we first perform a DFS on G to form a DFS forest and place the vertices on a stack in order of their finish times during the DFS. In the second phase, we remove a vertex from the stack and perform a DFS from it on the transpose graph G T. The second pass, in effect, visits the vertices of G SCC one component at a time. When this search ends, we have all
the vertices of an SCC visited. We then continue with the next unvisited vertex from the stack until all vertices are visited and assigned to SCCs.
A sample digraph and the operation of the first phase of this algorithm is depicted
in Fig. 8.6.
The second phase of the algorithm pops vertices from the stack and performs a
DFS on these vertices to obtain SCCs as shown in Fig. 8.7.
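A compact Python sketch of the two phases follows; it is an illustrative version of Kosaraju's algorithm using recursive DFS and assumes the digraph is a dictionary of successor lists.

import sys

def kosaraju_scc(adj):
    """Return the SCCs of a digraph (dict vertex -> list of successors)."""
    sys.setrecursionlimit(10000)
    order, seen = [], set()

    def dfs1(u):
        seen.add(u)
        for v in adj.get(u, []):
            if v not in seen:
                dfs1(v)
        order.append(u)                 # finished: push onto the "stack"

    for u in adj:
        if u not in seen:
            dfs1(u)

    radj = {u: [] for u in adj}         # transpose graph
    for u in adj:
        for v in adj[u]:
            radj.setdefault(v, []).append(u)

    sccs, assigned = [], set()

    def dfs2(u, comp):
        assigned.add(u)
        comp.append(u)
        for v in radj.get(u, []):
            if v not in assigned:
                dfs2(v, comp)

    for u in reversed(order):           # pop vertices by decreasing finish time
        if u not in assigned:
            comp = []
            dfs2(u, comp)
            sccs.append(comp)
    return sccs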
Analysis
Given a digraph G = (V, E) with two distinct SCCs C1 and C2 , consider an edge
(u, v) ∈ E with u ∈ C1 and v ∈ C2 . We then have the following observation.
Remark 6 It can be shown that fin(C1) > fin(C2). Similarly, if (u, v) is an edge in G T with u ∈ C1 and v ∈ C2, then fin(C2) > fin(C1) [3].
Proof We will prove the correctness of this algorithm by induction. Let k denote the
number of trees formed when DFS is called on G T . When k = 0, the base case holds,
and assume the first k − 1 trees obtained this way are SCCs of the graph G. Let u be
the root of the kth tree and a member of the SCC C1 of G. For any undiscovered SCC
C x at step k, f in(C1 ) > f in(C x ) and all other vertices of C1 will be descendants of
u in the discovered DFS tree. Any edge that is leaving C1 in G T should be directed
to SCCs already discovered by the above remark. Therefore, all descendants of u
will be only in the SCC C1 and no other SCCs of G T [3]. The time complexity of
this algorithm is O(n + m) since it involves two DFS calls, first one in G and the
second one in G T .
starting from the vertex with the largest finish time. Each component discovered by this DFS is an SCC.
Finding the vertex and edge connectivity numbers of computer networks provides us with vital information on how reliable they are. Clearly, the larger the connectivity number is, the more robust the network is. We will see efficient ways of finding the vertex and edge connectivity of a graph when we review network flows in the next section. Here we briefly review a brute-force algorithm that finds the vertex connectivity and then a brute-force algorithm for the edge connectivity.
As a first attempt, we can implement the following brute-force strategy. We first find all subsets of the vertices of the network graph, sort these in increasing order of size, then remove each subset from the graph starting with the smallest subsets and check the connectivity of the graph using the BFS (or DFS) algorithm, as shown in the pseudocode of Algorithm 8.8. The BFS_Conn procedure checks the connectivity of the graph using the BFS algorithm and returns true if the graph is connected and false otherwise.
This brute-force method will provide the exact k value for the graph. Although it will work, the major problem in practice with this algorithm is its exponential time complexity. For a graph with n vertices, the number of subsets in its power set is 2^n, resulting in 2^n iterations of the for loop. The BFS algorithm within each loop
iteration also has O(n + m) time complexity, resulting in a total time complexity of O(2^n (n + m)), which is unacceptable even for moderate-size graphs.
The same method can be applied to find the edge connectivity of a graph G, this time by forming the power set of the edges of G. The size of the power set is 2^m in this case and hence, the total time needed is O(2^m (n + m)).
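For very small graphs this brute-force search can be written directly; the Python sketch below is illustrative only and implements the vertex version, removing vertex subsets in increasing order of size and returning the size of the smallest disconnecting set, which is the k value.

from itertools import combinations
from collections import deque

def connected_after_removal(adj, removed):
    """BFS connectivity of the graph induced on V - removed."""
    remaining = [v for v in adj if v not in removed]
    if len(remaining) <= 1:
        return True
    start = remaining[0]
    seen, queue = {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(remaining)

def vertex_connectivity_bruteforce(adj):
    n = len(adj)
    for k in range(n - 1):                      # try subsets of increasing size
        for subset in combinations(adj, k):
            if not connected_after_removal(adj, set(subset)):
                return k                        # smallest disconnecting set found
    return n - 1                                # complete graph: kappa = n - 1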
In many cases, we are interested in finding whether any two vertices of a graph are connected, that is, whether there is a path between these two vertices. The transitive closure graph derived from the original graph provides this information. The connectivity matrix of the graph, defined below, provides a suitable representation of a graph that is equivalent to its transitive closure.
Thus, finding the transitive closure of a graph is reduced to working out its connectivity matrix. We can set 0 instead of ∞ in C for vertices that are not neighbors to obtain the matrix A, and compute the powers of A, say A^k, using matrix multiplication with logical or and logical and in place of scalar addition and multiplication, to obtain the connectivity of vertices that are k hops apart. Since the longest path in a graph may be of length n − 1 at most, A^{n−1} will be equal to C. Alternatively, running the BFS algorithm from each vertex will provide C, or running the Floyd–Warshall algorithm on the adjacency matrix of the graph with all edge weights set to 1 will also result in the connectivity matrix in O(n^3) time.
Warshall’s algorithm can be implemented to find the connectivity matrix C. We have
a directed graph G = (V, E) with an adjacency matrix A[n, n], where A[i, j] = 1 if
(i, j) ∈ E, and compute the matrix C, where C[i, j] = 1 if there is a path of length
greater than or equal to 1 from i to j as shown in Algorithm 8.9. This algorithm has
Θ(n^3) time complexity due to its three nested loops.
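A direct Python rendering of this triple loop is shown below; it is an illustrative sketch of the idea behind Algorithm 8.9 rather than its exact pseudocode.

def connectivity_matrix(A):
    """Warshall-style transitive closure of a digraph.
    A: n x n 0/1 adjacency matrix (list of lists).
    Returns C with C[i][j] = 1 if there is a path of length >= 1 from i to j."""
    n = len(A)
    C = [row[:] for row in A]                 # work on a copy of A
    for k in range(n):                        # allow k as an intermediate vertex
        for i in range(n):
            if C[i][k]:
                for j in range(n):
                    if C[k][j]:
                        C[i][j] = 1
    return C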
Connectivity search that makes use of the network flow algorithms is based on finding
the edge or vertex connectivity values between every pair of vertices of a graph. Once
we have these values, the edge or vertex connectivity is assigned to the minimum
value of the computed values. We will first review the network flow method with two basic algorithms to compute the flow in a network and then describe algorithms to find the vertex and edge connectivity using this method.
Let us assume a directed graph G = (V, E) in the usual sense. We will form a flow
network as follows. Each edge e ∈ E of G is assigned a nonnegative integer called
the capacity c(e) and we have a source vertex s and a sink vertex t. A flow f (u, v)
through the edge (u, v) in this network satisfies the following:
• Capacity Constraint: 0 ≤ f (u, v) ≤ c(u, v), which means the flow through an edge may not exceed the capacity assigned to that edge.
• Flow Conservation: ∀u ∈ V − {s, t}, ∑_{v∈V} f (v, u) = ∑_{v∈V} f (u, v)
In other words, the flow into the vertex u equals the flow out of u for any vertex
u except the source vertex s and the sink vertex t.
That is, the flow value of a flow network is the difference between the sum of flows out of the source vertex s and the sum of flows into s. The maximum flow problem is to find a
flow with a maximum value in a flow network. An example flow network is depicted
in Fig. 8.9.
8.4.1.1 Cuts
Definition 8.18 (cut) A cut (S, T ) of a flow network divides the nodes of the network into two disjoint sets S and T such that s ∈ S and t ∈ T. The capacity of a cut, c(S, T ), is defined as ∑_{e∈[S,T]} c(e).
An edge (u, v) with u ∈ S and v ∈ T is called a forward edge of the cut (S, T ).
When v ∈ S and u ∈ T , the edge (u, v) is said to be a backward edge. The flow
across a cut (S, T ) is the difference between the sum of the flows in forward edges
and the sum of the flows in backward edges. The cut shown by a dashed curve in Fig. 8.9 has a flow value of 5 across it. Given a flow network G and any cut (S, T ) of G, the following remarks can be shown.
Remark 7 The value of flow f in G is equal to the value across the cut (S, T ).
Remark 8 The value of flow f across the cut (S, T ) does not exceed the capacity of
the cut (S, T ).
In other words, the residual capacity of an edge (u, v) is the amount of flow that can
be pushed through (u, v) and the residual capacity of the edge (v, u) is the flow that
is used. When we are updating a residual network after a flow change, we should
always modify these values so that flow conservation at a node of the network is
obeyed. For example, if we increase the flow through an edge (u, v) by 3 units, then we should increase the residual capacity of the edge (v, u) by 3 units to maintain the flow network property. Also, flow used by the edge (v, u) may be returned
if it will cause a larger network flow by doing so. In summary, G f has edges that may
be utilized to have more flows through them. The residual network of the network
of Fig. 8.9 is shown in Fig. 8.10.
where the residual capacity c f (P) of a path P is the minimum residual capacity of
its edges. We can push a maximum additional flow through P by the value of c f (P)
as otherwise, we will be violating the capacity constraint in the residual network.
Theorem 8.5 A flow is maximum if and only if there are no augmenting paths from
source s to sink t. The value of this flow | f | = c(S, T ) for some cut (S, T ) of G.
Proof If there is an augmenting path from s to t, then we can increase the value of the flow f through this path; therefore, f would not be maximum.
Note that when a flow f x is pushed through an edge (u, v) for the first time, we
need to form a new edge (v, u) with label f x if such an edge does not exist. Figure 8.11
displays the operation of this algorithm in a small network. We start with 0 flow and
the sum of all possible flows to any node u is equal to the sum of all possible flows
from u. We have an arbitrary cut in this network with a value of 7 as shown and this
value is the maximum flow to be attained in this network as shown by the max-flow
min-cut theorem. We then search for augmenting paths and, whenever such a path p is found, the residual capacities of all edges of this path are decreased by the minimum residual capacity along p and the flow value is increased by this amount. We find the edge with the minimum value in the augmenting path s − a − b − c − t is (s, a) with the value of 3. Hence, the flow f is set to 3, the residual graph is updated to obtain the graph in (c), and proceeding in this manner, we reach the final residual graph in (f) after four iterations, which has no augmenting paths, and we stop with a final f value of 7, equal to the value of the cut.
Analysis
Since flow values are integers, we will be incrementing the flow value | f ∗ | times at
most where f ∗ is the maximum flow value, since flow value is incremented by one
at each step in the worst case. Each augmenting path can be found by the DFS or the
BFS algorithm in O(n + m) time resulting in a total time of O(| f ∗ |(n + m)) time.
When the choice of the augmenting paths is made arbitrarily and | f ∗ | is large, the time complexity of this algorithm may be high.
The minimum length path p can be determined in O(n + m) time using BFS.
Having found the shortest path p, we can augment f in O(n) and update of G f takes
also O(n) time resulting in ≈ O(m) time for one iteration of the while loop. There
are O(nm) iterations, resulting in a total time of O(m^2 n) [2].
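A compact Python sketch of this shortest-augmenting-path (Edmonds–Karp) scheme follows; it is illustrative code assuming integer capacities stored in a dictionary of dictionaries, not a production implementation.

from collections import deque

def edmonds_karp(cap, s, t):
    """Maximum flow value from s to t.
    cap: dict u -> dict v -> capacity of arc (u, v); missing arcs have capacity 0."""
    # residual capacities, with reverse arcs initialized to 0
    res = {u: dict(cap[u]) for u in cap}
    for u in cap:
        for v in cap[u]:
            res.setdefault(v, {}).setdefault(u, 0)

    flow = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:       # BFS for a shortest augmenting path
            u = queue.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:                    # no augmenting path left
            return flow
        v, bottleneck = t, float('inf')        # bottleneck along the path
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, res[u][v])
            v = u
        v = t                                  # update residual capacities
        while parent[v] is not None:
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck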
The edge connectivity λ(u, v) of two vertices u and v of a simple graph G is the least number of edges whose deletion leaves u and v disconnected. In an undirected
graph, λ(u, v) = λ(v, u) and in the case of a directed graph this equality may not
hold. We can see that when an undirected graph G is not trivial, the edge connectivity
of G, λ(G), is the minimum value of λ(u, v) for each pair of unordered vertices u and
v. For a digraph G , λ(G ) is the minimum value of λ(u, v) for each pair of ordered
vertices u and v.
With this background, we can compute the value of λ(G) for an undirected or
a directed graph if we have a method to find the connectivity values for each pair
of vertices. This method is in fact based on the maximum flow algorithm we have
reviewed above. Even provided an algorithm based on the maximum flow to com-
pute λ(u, v) for each pair of vertices and the graph edge connectivity is simply the
minimum of all the values computed [6]. The algorithm to find λ(u, v) consists of
the following steps [4,6].
a. Replace each edge (x, y) ∈ E with arcs (x, y) and (y, x).
b. Designate u as the source and v as the sink vertex.
c. Assign a capacity of 1 to each arc.
d. Compute the maximum flow from u to v in this network; its value is λ(u, v).
We have n(n − 1)/2 unordered pairs in an undirected graph and n(n − 1) ordered pairs in a digraph to check; hence, we need to call the above procedure that many times. It was shown in [6] that the time complexity of this algorithm is O(nm).
Let us consider the graph G of Fig. 8.12. The edge-cut C shown in dashed lines separates the vertices into the two subsets G1 and G2. If C is a minimum edge-cut and we select a single vertex a ∈ G1 and compute λ(a, v) ∀v ∈ G2, it can be seen that λ(G) is the minimum of these values.
We can, therefore, have an algorithm consisting of the following steps [5]:
Once a spanning tree is constructed using Algorithm 8.12, the graph edge con-
nectivity can be computed by the following algorithm.
The vertex connectivity κ(u, v) of two vertices u and v of a simple graph G = (V, E) is the least number of vertices whose deletion leaves u and v disconnected. When (u, v) ∈ E, κ(u, v) = n − 1. The method employed to find the vertex connectivity is similar to the edge connectivity computation. We can find the vertex connectivity values of all vertex pairs in G using the maximum flow method and assign the minimum of these values as the vertex connectivity of G, as shown in [6]. This algorithm, which we will call Even's algorithm, consists of the following steps.
The graph G′ obtained this way will have 2n vertices and 2n + m edges. Figure 8.13 displays the directed graph G′ obtained from an undirected graph G using this procedure. The arcs in G′ are labeled with unity capacities and the maximum flow in this network from vertex u2 to v1 is computed. It was shown in [6] that the time complexity of the above algorithm is O(mn^{2/3}).
Even and Tarjan showed that we do not need to find κ(u, v) for each pair of vertices u and v; we need only compute these values for a subset V′ ⊂ V of vertices with |V′| = κ + 1, updating the minimum value of κ(u, v) as we proceed, as shown in Algorithm 8.14 [7].
We may need to find whether a graph representing a network is connected and its
connectivity parameters. We present algorithms for this purpose in this section.
We can obtain the connectivity matrix C from the adjacency matrix A of a graph
G by multiplying A by itself n − 1 times, in other words, taking the (n − 1)th power of A. However, we need to perform the required addition and multiplication in the
usual matrix multiplication as Boolean addition (logical or operation) and Boolean
multiplication (logical and operation). Before performing the Boolean multiplication,
we need to generate matrix B which differs from the adjacency matrix A with all
diagonal elements as 1’s instead of 0’s. This matrix B now has 1’s for all paths in G
that have 0 or 1 length. Multiplying B by itself provides B^2, which shows paths of length 2 or less, and in general, B^k contains 1's for paths of length k or less between any two vertices.
The maximum path length in a graph G with n vertices can be n − 1 and hence, we need to find B^{n−1}. The required number of Boolean multiplications using repeated squaring is then about log(n − 1). For example, to find B^8, we need to find B × B to yield B^2, then B^2 × B^2 to yield B^4, and finally B^4 × B^4 to yield B^8.
Now, checking the entry C[i, j] shows whether there is a path from vertex i to
j. We can see that the vertex b can reach all other vertices whereas it cannot be
reached by any other vertex as evident from the graph. In order to parallelize this
algorithm, we can use any of the parallel matrix multiplication procedures such as
the one described in Sect. 4.7.1 using Boolean multiplication and addition instead
of multiplication and addition of real numbers.
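A sequential Python sketch of this repeated Boolean squaring is given below for illustration; a parallel version would distribute the Boolean matrix product among processes as discussed above.

def bool_mult(X, Y):
    """Boolean matrix product: logical and/or instead of * and +."""
    n = len(X)
    Z = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if X[i][k]:
                for j in range(n):
                    if Y[k][j]:
                        Z[i][j] = 1
    return Z

def connectivity_by_squaring(A):
    """Compute C = B^(n-1) where B is A with 1's on the diagonal."""
    n = len(A)
    B = [[1 if i == j else A[i][j] for j in range(n)] for i in range(n)]
    power = 1
    while power < n - 1:            # repeated squaring: about log2(n-1) products
        B = bool_mult(B, B)
        power *= 2
    return B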
We have seen how to find connected components of an undirected graph in Sect. 8.3.1.
We will now describe two algorithms to find the components of an undirected graph
in parallel.
Row i of D has the names of the vertices that vertex i is connected to, which are in fact in the same component as i since there is a path between each of them and vertex i. We can now assign a vertex vi to component k if k is the smallest index for which D[i, k] ≠ 0. The parallel formulation of this algorithm has three steps as follows.
T_P = Θ(n^2 / k) + Θ(n log k)        (8.4)
Fig. 8.15 A sample graph with eight vertices (a) and its adjacency matrix A partitioned row-wise between two processes p0 and p1 (b)
Fig. 8.16 Partitioning of the sample graph of Fig. 8.15 to two processes
S = Θ(n^2) / (Θ(n^2 / k) + Θ(n log k))        (8.5)
E = 1 / (1 + Θ((k log k) / n))        (8.6)
The classical algorithms of Tarjan and later Kosaraju to find SCCs of a digraph are
difficult to parallelize due to the inherently sequential operation of the DFS algorithm
employed in both. A parallel algorithm called divide and conquer strong components
(DCSC) algorithm by Fleischer et al. [8] uses a different approach by partitioning the
digraph into three disjoint subgraphs and processing these subgraphs recursively in
parallel. We will briefly describe this parallel algorithm as it can be used in practice
with some modifications.
Given a digraph G = (V, E), the descendants Desc(G, v) of a vertex v are the
vertices in G that are reachable from v, including v itself. The predecessors Pred(G, v) of a vertex v can be defined similarly as the set of vertices from which the vertex v is reachable. The remaining vertices in graph G are called the remainder, denoted by Rem(G, v) = V \ (Desc(G, v) ∪ Pred(G, v)). It is shown that the SCC containing v is Desc(G, v) ∩ Pred(G, v), and any SCC of G is a subset of Desc(G, v), Pred(G, v), or Rem(G, v). The designed
algorithm makes use of this property by first selecting a random vertex v, finding its
predecessor and descendant sets and then finding the SCC that contains this vertex.
It then recurses on the remaining vertices in parallel as shown in Algorithm 8.15.
It was shown in [8] this algorithm has an expected time complexity of O(n log n)
in the serial case. Later on, McLendon et al. extended this algorithm by a simple
modification to improve performance [12].
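The sketch below shows a sequential Python skeleton of this divide-and-conquer idea for illustration; in the parallel algorithm the three recursive calls would run concurrently, and the digraph is assumed to be a dictionary of successor lists.

def reach(v, adj, allowed):
    """Vertices of 'allowed' reachable from v using arcs of adj."""
    seen, stack = {v}, [v]
    while stack:
        u = stack.pop()
        for w in adj.get(u, []):
            if w in allowed and w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def dcsc_scc(vertices, adj, radj, out):
    """Divide-and-conquer SCC decomposition restricted to 'vertices'."""
    if not vertices:
        return
    v = next(iter(vertices))                        # pick a pivot vertex
    desc = reach(v, adj, vertices)                  # Desc(G, v) within the subgraph
    pred = reach(v, radj, vertices)                 # Pred(G, v) within the subgraph
    scc = desc & pred                               # the SCC containing v
    out.append(scc)
    # recurse on the three disjoint remainders
    dcsc_scc(desc - scc, adj, radj, out)
    dcsc_scc(pred - scc, adj, radj, out)
    dcsc_scc(vertices - desc - pred, adj, radj, out)

def strongly_connected_components(adj):
    radj = {u: [] for u in adj}                     # transpose graph
    for u in adj:
        for w in adj[u]:
            radj.setdefault(w, []).append(u)
    out = []
    dcsc_scc(set(adj), adj, radj, out)
    return out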
In a network environment, our aim is to have each node of the network find out the
connectivity values of the graph that represents the network.
We know that the minimum degree δ(G) of a graph G is an upper bound on the value of the connectivity κ(G) since we can isolate a minimum-degree vertex by removing all edges incident to it to have G disconnected. This concept can be used in a distributed setting by nodes exchanging their degrees to estimate δ(G). Three localized distributed algorithms to determine the value of κ(G), say k, that work with neighbor knowledge only are proposed by Jorgic et al. [16]. In the first algorithm, called local neighbor discovery (LND), each node discovers its degree di by first sending hello messages to its neighbors and counting the responses from them. Each node then exchanges this degree information with its neighbors. Repeating this process p times results in the degrees of nodes being transferred to all nodes within p hops of them. Nodes can then simply sort the degrees they have received and take the lowest value as the estimate of k. The pseudocode of a possible implementation is shown in Algorithm 8.16; although the original algorithm uses the time-to-live field of the message, which is initialized to p, we provide an SSI algorithm version that works in p rounds to achieve the same function.
All of the edges of the graph will be traversed in both directions in each round, so
there will be O(pm) messages in total. Although this message complexity may look favorable, a high value of p is needed to estimate k more accurately. Moreover, δ(G) is an upper bound on the value of k, so the actual value may be much lower. In the second algorithm, called local subgraph connectivity detection (LSCD) and proposed by the same authors, a further test is made to determine whether the subgraph of p-hop neighbors of a given
node is k-connected. A node v determines that the graph is k-connected when both
of the following conditions are satisfied:
The third algorithm searches for critical nodes, the removal of which will disconnect the graph.
network. We described a linear time DFS-based algorithm that makes use of the
simple property that any vertex on a DFS tree of a graph that does not have a back
edge from its subtree to its ancestors is an articulation point.
A block of a graph is a maximal connected subgraph without any articulation
points. We reviewed two linear-time algorithms to identify blocks in a graph. We can
find connectivity and strong connectivity of a graph in parallel as demonstrated by
two algorithms. Finally, we described a heuristic distributed algorithm that estimates
the vertex connectivity of a network.
Exercises
1. Show the articulation points, bridges, and the blocks in the undirected graph of
Fig. 8.18.
2. Write the pseudocode of the DFS-based articulation point search algorithm as
one main procedure. Identify articulation points in the sample undirected graph
of Fig. 8.19 using the DFS-based algorithm. Show the low and num values of
vertices at each iteration.
3. Find the bridges of the same graph of Fig. 8.19 using Tarjan’s algorithm.
4. Work out the blocks of the sample graph depicted in Fig. 8.20 using the Hopcroft–Tarjan algorithm by showing the iterations of the algorithm.
5. Find the SCCs of the digraph in Fig. 8.21 using Kosaraju's algorithm. Show the
contents of the stack and the DFS trees formed.
6. Find the maximum flow in the network of Fig. 8.22 using the Ford–Fulkerson
algorithm by showing all iterations of the algorithm.
References
1. Aho AV, Hopcroft JE, Ullman JD (1983) Data Structures and Algorithms. Addison-Wesley
2. Akl SG (1989) The design and analysis of parallel algorithms. Prentice Hall, Englewood Cliffs
3. Cormen TH, Leiserson CE, Rivest RL, Stein C (2009) Introduction to algorithms, 3rd edn. MIT
Press, Cambridge
4. Esfahanian AH (1988) On the evolution of connectivity algorithms. In: Wilson R, Beineke L
(eds) Selected topics in graph theory. Cambridge University Press, Cambridge
5. Esfahanian AH, Hakimi SL (1984) On computing the connectivities of graphs and digraphs.
Networks 355–366
6. Even S (1979) Graph algorithms, Computer Science Press (Computer software engineering
series). ISBN-10: 0914894218, ISBN-13: 978-0914894216
7. Even S, Tarjan RE (1975) Network flow and testing graph connectivity. SIAM J Comput 4:507–
518
8. Fleischer L, Hendrickson B, Pinar A (2000) On identifying strongly connected components in
parallel. Parallel and distributed processing, pp 505–511
9. Grama A, Karypis G, Kumar V, Gupta A (2003) Introduction to parallel computing, 2nd edn.
Addison-Wesley, New York
10. Hopcroft J, Tarjan R (1973) Algorithm 447: efficient algorithms for graph manipulation. Commun ACM 16(6):372–378
11. Matula DW (1987) Determining edge connectivity in O(mn). In: Proceedings, 28th symposium
on foundations of computer science, pp 249–251
12. McLendon W III, Hendrickson B, Plimpton SJ, Rauchwerger L (2005) Finding strongly connected components in distributed graphs. J Parallel Distrib Comput 65(8):901–910
13. Tarjan RE (1972) Depth first search and linear graph algorithms. SIAM J Comput 1(2):146–160
14. Tarjan RE (1974) A note on finding the bridges of a graph. Inf Process Lett 2(6):160–161
15. Whitney H (1932) Congruent graphs and the connectivity of graphs. Am J Math 54:150–168
Matching
9
Abstract
A matching of a graph is a subset of edges that do not share any endpoints. Match-
ing can be used in many applications including channel frequency assignment in
radio networks, graph partitioning, and clustering. In an unweighted graph, max-
imum matching of a graph is the set of edges that has the maximum cardinality
among all matchings in that graph. In an edge-weighted weighted graph, our aim
is to find a matching with the maximum (or minimum) total weight. Finding a
maximum (weighted) matching in an unweighted or weighted graph is one of the
rare graph problems that can be solved in polynomial time. We review sequential,
parallel, and distributed algorithms for unweighted and weighted general graphs
and bipartite graphs in this chapter.
9.1 Introduction
time. However, there are various approximation algorithms to improve the runtime of matching. Also, approximation algorithms turn out to be easier to implement, with significantly fewer lines of code than exact algorithms. Unweighted or weighted matching in bipartite graphs can be treated separately from general graph matching, as the structure of bipartite graphs can be exploited to design conceptually different algorithms than in the general case.
In this chapter, we review the matching problem in general graphs and bipartite
graphs for both unweighted and weighted cases. We describe sequential, parallel,
and distributed algorithms for these graphs.
9.2 Theory
Fig. 9.1 a A MM of size 3, b A MaxMM with size 4 of a sample graph. Path (e, a, g, f, b, c) is
an augmenting path in (a) and path (g, f, b, c) is an alternating path in (b). There is no augmenting
path in (b) since the matching is maximum. We can also see that matching in (b) is perfect as each
vertex is saturated
Augmenting and alternating paths are displayed in Fig. 9.1. An augmenting path
of a matching M that contains k edges contains exactly k + 1 edges that are not in M
for a total of 2k + 1 edges. We try to find augmenting paths since we can increase the size of a matching that has an augmenting path by exchanging the matched edges of the path with the unmatched edges of the path.
Definition 9.3 (alternating tree) An alternating tree is rooted at a free vertex and
each path of this tree is an alternating path.
For example, the tree rooted at vertex e in Fig. 9.1b is an alternating tree with
branches e, a, b, c, e, g, f and e, a, b, e, d.
This means we need to find edges that are present in only one of the input graphs
and include the vertices incident on those edges. Two graphs and their symmetric difference are shown in Fig. 9.2.
M′ = M ⊕ P = (M − P) ∪ (P − M)
Now, to prove this lemma, we see that P has an odd number of edges and its edges alternate between edges in M and edges not in M. Vertices not on the path P have the same set of neighbors in M′ as in M, and vertices on P have exactly one neighbor in M′; therefore M′ is also a matching of G.
This process is called augmenting the matching M and can be used in a number
of maximum matching algorithms. The matching with the augmenting path in Fig. 9.3a is augmented to obtain a matching of larger size in (b).
We can now state an important theorem by Berge which forms the basis of a few fundamental matching algorithms [3].
Theorem 9.1 (Berge) A matching M is maximum if and only if there are no aug-
menting paths with respect to M.
more maximum matching edges than M edges. This path P begins and ends with a
maximum matching edge, therefore it is an augmenting path of M, meaning M can
be enlarged using P. This contradicts the first assumption that M has no augmenting
paths.
1. M ←Ø
2. while ∃ an augmenting path P with respect to M
3. M←M⊕P
4. end while
5. return M
We need a method to find the augmenting path P and we will see it is more
convenient to have different procedures for bipartite graphs and general graphs in
the next sections.
In the matching of a weighted graph, our aim is to find the matching with the
maximum or minimum total weight. A maximum weighted matching (MaxWM)
of a weighted graph G = (V, E, w) with w : E → R+ has the maximum total weight among all weighted maximal matchings of G, where the weight of a matching M is defined as w(M) = ∑_{e∈M} w(e). Similarly, the minimum weighted matching (MinWM) of G has the least total weight among all weighted maximal matchings of G. By a maximal weighted matching (MWM) of a weighted graph G, we mean a weighted matching of G which cannot be enlarged by a new edge. We will see that both MaxWM and MinWM have practical applications. Figure 9.4 displays a MWM and a MaxWM of a sample graph.
Fig. 9.4 a A MWM of total weight 13, b A MaxWM with total weight 29 of a sample graph
|N (S)| = |T | which means |N (S)| < |S| and therefore a contradiction. We can now
conclude that no such vertex exists and hence every vertex in S is saturated.
Theorem 9.3 (König 1931) For any bipartite graph G = (A ∪ B, E), the maximum
size of a matching α(G) is equal to the minimum size of a vertex cover β(G) [16].
We will use the approach of enlarging the matching with augmenting paths to find a maximum matching in an unweighted bipartite graph. In order to do so, let us consider a bipartite graph G = (A ∪ B, E) and construct a digraph G′ = (A ∪ B, E′) where each edge e ∈ E is directed from A to B if e ∉ M and from B to A if e ∈ M. We can see that there is an augmenting path with respect to matching M in G if and only if there is a directed path from an unmatched vertex in A to an unmatched vertex in B in the graph G′. Let us now rewrite the generic matching algorithm formally as shown in Algorithm 9.1 with the procedure Find_AP to find the augmenting paths.
Proof The upper bound on the size of the matching is n/2 and at each step we can only extend the current matching by 1, resulting in O(n) iterations of the loop that calls Find_AP. An augmenting path can be found in O(m) time by searching all of the edges in the worst case. The total time needed is therefore O(nm).
We still have not shown how to search for an augmenting path in G′. One way of achieving this is by adding a source vertex s to the left of the bipartite graph and
Fig. 9.6 Running of MaxM_BPG1 in a small bipartite graph G = (A ∪ B, E). The initial arbitrarily
selected matching M in a is shown in bold. Using this matching an augmenting path shown in dashed
lines starting from a bold vertex in A and ending at a bold vertex in B is shown in b which is XORed
with M to obtain the matching in c. With this matching, another augmenting path, shown in dashed lines in d, is found and used to form the matching in e, in which yet another augmenting path shown in dashed lines is discovered. The final matching obtained by XORing this path with the current matching does not have any augmenting paths as all vertices are now saturated after 3 steps
Fig. 9.7 Running of FindBFS_AP in the graph of Fig. 9.6. The two existing matched edges are
made directed from B to A and all other edges are directed from A to B. A vertex s is added to the
left with directed edges to all unmatched vertices in A and directed edges from unmatched vertices
in B to the new vertex t. We can now run BFS from s to t to find the shortest path shown as dashed
lines in b of the figure. We now augment along this path to obtain the augmenting path shown in c, and the final maximum matching is shown in d, which is different from the maximum matching found in Fig. 9.6 but has the same size
connecting it by directed edges to all free vertices of A. A sink vertex t is also added, with directed edges from all of the free vertices in B to t, as shown in Fig. 9.7. We then run BFS from s and return the shortest path as shown
in procedure FindBFS_AP in Algorithm 9.2. The shortest path starting from s and
ending in t will be an augmenting path as it goes through free vertices in A and B
and it has to go through some matched edges to be able to return to B. The running
time for BFS is O(n + m) ≈ O(m) in a dense graph and hence the total time of
Algorithm 9.1 using BFS-based approach is O(nm) since the size of the matching
can be at most n/2.
The operation of this algorithm is shown in Fig. 9.7 on the same bipartite graph of Fig. 9.6 with a different initial matching.
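A compact Python sketch of the augmenting-path idea for bipartite graphs follows. It uses a DFS-based search for augmenting paths, a common variant of the same idea, instead of the explicit s–t construction, and it assumes the graph is given as a dictionary mapping each vertex of A to its neighbors in B.

def max_bipartite_matching(adj_A):
    """Maximum matching of a bipartite graph.
    adj_A: dict mapping each vertex of A to the list of its neighbors in B.
    Returns a dict match_B mapping matched B-vertices to their A-partner."""
    match_B = {}                              # b -> a currently matched to b

    def try_augment(a, visited):
        for b in adj_A[a]:
            if b in visited:
                continue
            visited.add(b)
            # b is free, or the A-vertex matched to b can be re-matched elsewhere
            if b not in match_B or try_augment(match_B[b], visited):
                match_B[b] = a
                return True
        return False

    for a in adj_A:                           # try to saturate each A-vertex
        try_augment(a, set())
    return match_B

# Example
g = {'a': [1, 2], 'b': [1], 'c': [2, 3]}
print(max_bipartite_matching(g))              # {1: 'b', 2: 'a', 3: 'c'}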
Fig. 9.8 The flow-based re-construction of an example bipartite graph in a is shown in b. The
maximum flow edges which correspond to maximum matching edges are shown in bold
conservation law. Each vertex v ∈ B has exactly one outgoing edge that can pass
flow to vertex t and hence, at most one of its incoming edges can carry the maximum
flow, again by the flow conservation law. This is to say the edges of the maximum
flow are disjoint resulting in a matching. Therefore, each u ∈ A will be matched
with at most one vertex of B forming a maximum matching in G. When we have a
flow of value k, this corresponds to a matching of size k with the same set of edges
of G. Since we attempt to maximize the flow, we are maximizing the matching.
The time complexity of this algorithm is the same as that of Ford–Fulkerson, which is O(kC), where k is the number of edges the flow may run through and C is the maximum flow value in the graph, which is at most |A| = n. The number of edges in the newly formed graph G′ is 2n + m since we added 2n new edges incident to the vertices s and t. The complexity of this algorithm is therefore O(n^2 + nm).
The Hopcroft–Karp algorithm also makes use of augmenting paths while finding
the maximum matching in a bipartite graph. This algorithm however searches many
paths simultaneously rather than one by one as in the previous algorithm and brings down the time complexity to O(√n · m) [14]. The working principle of this algorithm
is based on the following lemma.
Proof All of the vertices in the edge set E′ obtained by M ⊕ M ∗ have a maximum degree of 2, meaning the connected components of the subgraph G′ induced by E′ are simple paths and cycles. Let us consider paths and cycles separately in G′. Each cycle has the same number of edges in M ∗ as in M; however, each M-augmenting path has exactly one less edge in M than in M ∗. In M ⊕ M ∗, we have exactly k more edges in M ∗ than edges in M. Therefore, G′ contains k vertex disjoint augmenting paths of M.
The algorithm consists of a number of phases and all possible vertex disjoint
augmenting paths are searched in each phase. The symmetric difference of the union
of all of these paths with the existing matching is computed to yield the new matching
as shown in the high-level description of the algorithm in Algorithm 9.3.
Finding the disjoint augmenting paths of a bipartite graph G = (A ∪ B, E) in each phase can be done by a modified BFS algorithm as follows. The BFS algorithm is
run for each unmatched vertex v ∈ A to form layers starting at v using alternating
paths of unmatched and matched edges to form an alternating edge tree rooted at v.
The BFS algorithm stops when one or more unmatched vertices in B are reached
274 9 Matching
since we are looking for shortest augmenting paths. A path reaching an unmatched
vertex in B will be an augmenting path since it started from an unmatched vertex and
traversed alternating edges. All of the unmatched vertices reached in B are stored in
the set F. After this first part of the phase is over, a modified DFS algorithm is run
for each vertex in F until an augmenting path ending at a free vertex in A is found.
The modified DFS algorithm should run through alternating edges to discover an augmenting path. Each discovered path Px is added to the set P′, the vertices in Px are removed from the BFS tree together with any orphan vertices, and this procedure is repeated for the other free vertices in B. At the end of each phase, the new matching is formed by XORing the set P′ with the existing matching M. The detailed version of this algorithm is depicted in Algorithm 9.4.
Correctness of the algorithm is evident since the BFS and DFS algorithms discover
augmenting paths based on their operations. Also, since we delete each augmenting path found during DFS from the BFS trees, the paths discovered are disjoint, making it possible to include their union in the matching at once. Running of this algorithm
in a bipartite graph G = (A ∪ B, E) is depicted in Fig. 9.9 with A = {a, b, c, d, e}
Fig. 9.9 Running of Hopcroft–Karp algorithm in a small bipartite graph. Augmenting paths in BFS
trees are enclosed in dashed lines
and B = {1, 2, 3, 4, 5}. The first iteration of the algorithm starts with M = Ø and all of the vertices unmatched. The BFS from all of the vertices in A ends at the free vertices in B, which results in a BFS tree that is the same as the original graph and is not shown. Therefore, the free vertex set F contains {1, 2, 3, 4} at the end of BFS, and we stop BFS at the first layer since we have reached free vertices in B. We now run DFS from each of the vertices in F to find paths to be included in P′. We should delete the edges and vertices found in each path along with any remaining orphan
vertices before searching the next path. We have selected the matching shown in (a)
by always opting for the first free vertex in A from left while running DFS. In the
second phase, we run the BFS from the free vertices d and e in A to obtain the BFS
trees shown in (a) rooted at vertices d and e which end at free vertices 4 and 5 in B.
Note that vertex 3 is not a free vertex and we need not run DFS from there but we had
to stop at layer 3 since we reached free vertices in B. Running DFS from vertex 4 in the first tree and vertex 5 in the second one results in the final maximum matching of the graph with size 5 in two phases, which is a perfect matching since each vertex is matched, as shown in (b). If we had selected the augmenting path from vertex 5 in the BFS tree on the left, we would have edges (5, c) and (2, d) matched and would need a third phase that would select the augmenting path (e, 3), (3, b), (b, 5), (5, c), (c, 4);
however, we would arrive at the same maximum matching.
Analysis
We will first state a lemma to aid the analysis of the complexity of this algorithm.
Lemma 9.3 If the shortest augmenting path with respect to a matching M in a graph G has l edges, then the size of the maximum matching in G is at most
|M| + |V|/(l + 1)
Proof Each phase of the algorithm increases the length of the shortest augmenting path by at least one. Therefore, the length of the shortest augmenting path after √n iterations will be at least √n + 1. There will then be at most |V|/(√n + 1) ≤ √n augmenting paths left and hence, the algorithm will run for another √n iterations at most. The total number of loop executions will therefore be 2√n. Each iteration of the while loop requires O(m) time due to the BFS and DFS algorithms, making the time complexity of this algorithm O(√n · m).
The first sequential algorithm is a greedy one that selects legal edges iteratively and the second algorithm finds a MaxM in polynomial time.
This process is repeated until there are no edges left. The operation of this algorithm is depicted in Fig. 9.10. The greedy algorithm is correct since we never select any adjacent edges to be included in M (matching rule), as these are deleted from the graph, and we continue until the graph becomes empty, meaning there can be no more edges added to M (MM rule). The number of iterations of the while loop is bounded by the number of edges and hence the time complexity of this algorithm is O(m).
Fig. 9.10 Three iterations of the greedy matching algorithm in a sample graph results in MM of
cardinality 3 as shown in a, b and c. Matching edges are shown in bold and the deleted edges are
shown as dashed. A MaxM of the same graph is displayed in d
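A few lines of Python illustrate this greedy strategy; this is a sketch that assumes the graph is given as a list of edges.

def greedy_maximal_matching(edges):
    """Maximal matching of a graph given as a list of edges (u, v)."""
    matched_vertices = set()
    matching = []
    for u, v in edges:                 # consider edges in the given (arbitrary) order
        if u not in matched_vertices and v not in matched_vertices:
            matching.append((u, v))    # edge is legal: neither endpoint is matched
            matched_vertices.update((u, v))
    return matching

print(greedy_maximal_matching([('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')]))
# [('a', 'b'), ('c', 'd')]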
Fig. 9.11 The flower, blossom and stem of a sample graph and contracting and uncontracting of
the blossom
to find the augmenting path that has a and j as endpoints by ending with edge (i, e)
or edge ( f, e) instead. We need a way to find augmenting paths in the presence of
such odd alternating cycles.
Edmonds presented a polynomial-time algorithm to overcome this difficulty in unweighted general graphs [7]. This algorithm improves the current matching by finding
augmenting paths in the graph as in other matching algorithms but by also taking
care of odd alternating cycles. The general idea of this algorithm is to detect such
cycles and remove them by shrinking them to super nodes and then carry on search of
augmenting paths. A blossom in a graph G is an odd cycle consisting of 2k + 1 edges
with exactly k edges belonging to the current matching M as shown in Fig. 9.11a
where the blossom consists of vertices e, f , g, h, i and the stem is an even-length
alternating path of vertices a, b, c, d and e, starting from a free vertex and ending
at the base (or the tip) of the blossom. The base vertex of the blossom is connected
to the stem and is both part of the stem and the blossom. The stem and the blossom
form the flower. The essence of this algorithm relies on the following theorem which
we state without proof.
1. If P starting from a free vertex u in G goes through b B and ends at a free vertex
v in G , then P is replaced by a path u → (x → ... → y) such that the edges in
the blossom included in P are alternating.
2. If P starting from a free vertex u in G ends at b B , the path u → v B is replaced
by the path u → (x → ... → y) such that path P = u → y is alternating and y
is a free vertex.
The contraction of the blossom of the graph G in Fig. 9.11a to get G′ is depicted in (b), where we have an augmenting path and no more blossoms; therefore, we can uncontract the blossom in (c) to mark the augmenting path shown by dashed
lines. As this path runs through the blossom B, we select alternating edges in B
to complete the augmenting path that ends at vertex j. Finally, we form the new
matching M ← M ⊕ P with size 5 which in fact is maximum for this graph as there
are no blossoms or augmenting paths left. Another example when the alternating
path ends in a blossom is shown in Fig. 9.12. We apply the same strategy: shrink the blossom B to obtain G′ first in (b), search for an augmenting path in G′, and when such a path P finishing at B is found as shown in (c), unshrink B and mark the edges of the augmenting path inside the blossom accordingly in (c). Finally, perform
M ← M ⊕ P to obtain the matching M of size 5 in (d) which is maximum as there
are no other blossoms or augmenting paths.
As we have seen in these examples, there are three possibilities while searching
for an augmenting path in the graph G:
A more detailed example with two nested blossoms is shown in Fig. 9.13. We start a BFS from a free vertex a and label vertices as inner and outer, corresponding to odd and even levels respectively. Vertices c and e are both outer vertices; therefore a blossom is detected and this is contracted to vertex B1 in the new graph G′. We find
another blossom (B2) in G′ and this is contracted to give the new graph G″. Note that between the formation of G′ and G″, we have not encountered an augmenting path, otherwise we would have uncontracted B1 in G′ to get a new matching. We find an augmenting path in G″ and therefore uncontract the blossoms and mark the augmenting path P through them so that it alternates. Finally, the new matching M is formed by M ← M ⊕ P.
Analysis
The algorithm is based on Berge's theorem; it attempts to find an augmenting path in a general graph and, when such a path is found, enlarges the matching along it. It only remains to show that contracting and uncontracting blossoms do not disturb the augmenting paths found.
Proof There will be at most n augmentations and there will be at most n/2 blossom shrinkings between any two augmentations. The alternating tree can be constructed in O(m) time, therefore the total time taken is O(n^2 m).
An improvement of the running time of this algorithm to O(√n · m) was provided by Micali and Vazirani [19] and a complete proof was given in [25].
Fig. 9.14 The FSM of the matching algorithm for node i with a neighbor node j
by the definition of edge coloring and then continue with color class 2. We should
include an edge in this class only if it is not adjacent to any previously matched edge.
A distributed algorithm based on this observation is proposed in [12] to find a
maximal matching in a network which is already edge colored with k colors. It is
an SSI algorithm working in rounds under the control of a root node. There are k
rounds starting with round 1 and at round r, any node that has an incident edge (u, v) colored with r checks whether it can include (u, v) in the matching legally. That is, there must be no other edges adjacent to (u, v) that were included in the matching in the previous rounds. We will sketch a possible implementation of this algorithm as in [9] but by using an FSM. There are three states of a node as follows, also shown in Fig. 9.14.
• UNMATCHED: Initially, all nodes are in UNMATCHED state which means they
can compete to be a matched node.
• MATCHED: Any node that has an edge incident to it which is determined to be a matching edge enters this state.
• NEIGH_MATCHED: When a node has a neighbor that is MATCHED, it is
assigned to this state.
The pseudocode for a single round of this distributed algorithm for a node i is
shown in Algorithm 9.7.
The operation of this algorithm in a small sample network is depicted in Fig. 9.15.
This algorithm correctly finds a maximal matching in a network since we obey the
matching rule in each round by not considering adjacent edges of the matched edges
and also, we continue until each color class is considered and thus the matching is
maximal. There will be a total of k rounds for a k-edge-colored network and each
edge will be traversed at most once by the match, unmatch or neigh_match messages
and thus the total number of messages transferred is O(km).
Fig. 9.15 Running of Algorithm 9.7 in a sample network. The first and second rounds are shown in
a and b respectively. In only two rounds a maximal matching of size 5 is obtained. The maximum
matching of size 7 for this network is shown in c
We can implement a greedy strategy in which we always select the available edge with the greatest weight. We need to sort the edges by weight initially
and then check availability. The running time of this algorithm is dominated by the
sorting operation and hence we need O(m log m) and the approximation factor is
1/2 [22].
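A sketch of this greedy strategy in Python is shown below for illustration; the graph is assumed to be given as a list of weighted edges.

def greedy_weighted_matching(weighted_edges):
    """1/2-approximate maximum weighted matching.
    weighted_edges: list of (u, v, w) tuples."""
    matched = set()
    matching, total = [], 0
    # consider edges in decreasing order of weight: O(m log m) for the sort
    for u, v, w in sorted(weighted_edges, key=lambda e: e[2], reverse=True):
        if u not in matched and v not in matched:
            matching.append((u, v, w))
            matched.update((u, v))
            total += w
    return matching, total

print(greedy_weighted_matching([('a', 'b', 12), ('b', 'c', 9), ('c', 'd', 8), ('a', 'd', 6)]))
# ([('a', 'b', 12), ('c', 'd', 8)], 20)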
The Hungarian method, so named by its developer Kuhn since it relies on the earlier ideas of two Hungarian mathematicians, König and Egerváry, finds the maximum weighted matching in a weighted complete bipartite graph with bipartite vertex sets of the same size in polynomial time [17]. This method solves the assignment problem, which aims to assign objects such as machines, people, or processors to tasks by finding a minimum or maximum weighted matching in such a graph. Let us assume we are
Fig. 9.16 Running of the greedy algorithm to find MWM in a weighted bipartite graph with weights shown next to edges. The largest weight available edge is selected in each step to obtain the final matching of total weight 26 shown with bold lines in c in 3 iterations
Let us assume we are given a set of people and a set of tasks these people can perform, which form the two vertex sets of the bipartite graph, respectively. The weight on an edge (u, v) shows the time required by person u to perform task v, and our aim is to get all tasks done in the minimum total time. Instead of people and tasks, we could have the processors of a multiprocessor system and the software modules to run on them. We will describe this method in two equivalent forms: using the cost matrix, and as a graph-theoretic method (the Kuhn–Munkres algorithm). These two approaches have the same time complexity.
• If a number is added to or subtracted from all of the entries of any one row or column of a cost matrix Ci to obtain a cost matrix Ci+1, then an optimal assignment for the cost matrix Ci+1 is also an optimal assignment for the cost matrix Ci.
• An assignment of the elements ai of A to the elements bj of B that uses only entries with cij = 0 is optimal. In other words, if we can reduce the cost of assigning each element of A to some element of B to zero, this assignment is optimal.
1. Reduce rows: Subtract the least value of each row from all of the entries in that row.
2. Reduce columns: Subtract the least value of each column from all of the entries in that column.
3. Cover zeros: Cover all of the zero entries using a minimum number of horizontal and vertical lines.
4. If the number of lines is n, go to step 6.
5. Else find the smallest uncovered element x. Subtract x from all of the uncovered elements of C and add x to the elements that are at the intersections of the covering lines of step 3. Go to step 1.
6. Assignment: Select a row or column with only one zero and assign. If none is found, select arbitrarily. Select the remaining assignments so that no two tasks are assigned to the same person.
A small sketch of the first two reduction steps is given after this list.
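The sketch below applies steps 1 and 2 to the example cost matrix used in the next pages with NumPy; the covering and adjustment steps (3–5) are omitted for brevity, and the function name is ours.

```python
import numpy as np

def reduce_rows_and_columns(cost):
    """Steps 1 and 2 of the Hungarian method: subtract row minima, then column minima."""
    reduced = cost - cost.min(axis=1, keepdims=True)        # reduce rows
    reduced = reduced - reduced.min(axis=0, keepdims=True)  # reduce columns
    return reduced

cost = np.array([[ 8, 2, 5, 7, 9],
                 [12, 1, 6, 4, 7],
                 [ 9, 3, 8, 9, 5],
                 [ 7, 4, 9, 3, 6],
                 [ 5, 3, 4, 1, 2]])
print(reduce_rows_and_columns(cost))   # every row and every column now contains a zero
```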
Fig. 9.17 A sample weighted complete bipartite graph to test the Hungarian algorithm
Let us see the operation of this algorithm through an example shown in Fig. 9.17a.
The cost matrix C for this graph is given below; the first two steps of the algorithm, which reduce the rows and then the columns, result in the matrices shown.

Original cost matrix (row minima in parentheses):

      1   2   3   4   5
A     8   2   5   7   9   (2)
B    12   1   6   4   7   (1)
C     9   3   8   9   5   (3)
D     7   4   9   3   6   (3)
E     5   3   4   1   2   (1)

After row reduction (column minima shown below the matrix):

      1   2   3   4   5
A     6   0   3   5   7
B    11   0   5   3   6
C     6   0   5   6   2
D     4   1   6   0   3
E     4   2   3   0   1
      4   0   3   0   1

After column reduction:

      1   2   3   4   5
A     2   0   0   5   6
B     7   0   2   3   5
C     2   0   2   6   1
D     0   1   3   0   2
E     0   2   0   0   0
We now cover the zeros of this matrix with a minimum number of lines; columns 2 and 3 together with rows D and E cover all of the zero entries. Since the number of covering lines is 4, which is less than 5, we need to continue with the algorithm. We select the lowest uncovered value, which is 1, subtract 1 from all of the uncovered values, and add 1 to the entries at the intersections of the covering lines to obtain the matrix below.
      1   2   3   4   5
A     1   0   0   4   5
B     6   0   2   2   4
C     1   0   2   5   0
D     0   2   4   0   2
E     0   3   1   0   0
We find that the zeros of this matrix can no longer be covered with fewer than 5 lines, which is the number of vertices in each part of the bipartite graph, therefore we stop and move on to the assignment step. We search for rows with a single 0 first, as this means that person can only do the task whose column contains that 0. Person B has such a property and we assign task 2 to her and delete the task 2 column as this task cannot be assigned to another person. Person A has zeros at tasks 2 and 3 but since task 2 is already assigned, we have to assign task 3 to her and delete column 3 from the matrix. Similarly, person C is assigned task 5, and persons D and E, which share zeros in columns 1 and 4, are assigned tasks 1 and 4 respectively (the opposite choice gives the same total). The assignments at the 0 locations are marked in the final reduced matrix below and the corresponding entries of the original cost matrix are marked next to it. The total time taken is calculated as 5 + 1 + 5 + 7 + 1 = 19 units from the original cost matrix. This matching is depicted in the bipartite graph of Fig. 9.17b.
Final reduced matrix (left) and original cost matrix (right), with the assigned entries in brackets:

      1   2   3   4   5              1   2   3   4   5
A     1   0  [0]  4   5            A  8   2  [5]  7   9
B     6  [0]  2   2   4            B 12  [1]  6   4   7
C     1   0   2   5  [0]           C  9   3   8   9  [5]
D    [0]  2   4   0   2            D [7]  4   9   3   6
E     0   3   1  [0]  0            E  5   3   4  [1]  2
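The result of this example can be checked with an off-the-shelf assignment solver. The snippet below uses SciPy's linear_sum_assignment, an exact solver for the assignment problem, to confirm that the minimum total time for the cost matrix above is 19; since ties exist, the specific optimal assignment it reports may differ from the one chosen by hand.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

cost = np.array([[ 8, 2, 5, 7, 9],
                 [12, 1, 6, 4, 7],
                 [ 9, 3, 8, 9, 5],
                 [ 7, 4, 9, 3, 6],
                 [ 5, 3, 4, 1, 2]])
rows, cols = linear_sum_assignment(cost)   # minimum cost assignment
print(list(zip("ABCDE", cols + 1)))        # e.g. [('A', 3), ('B', 2), ('C', 5), ('D', 1), ('E', 4)]
print(cost[rows, cols].sum())              # 19
```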
Definition 9.7 (equality graph) The equality graph of a graph G = (V, E) with respect to a labeling function l is the graph Gl = (V, El) with El = {(u, v) ∈ E : l(u) + l(v) = w(u, v)}.
Fig. 9.18 a A legally labeled weighted bipartite graph, b Its equality graph
This way, we make sure the labeling rule is obeyed and an initial equality graph is
obtained. A bipartite graph that is labeled accordingly and its equality graph are
depicted in Fig. 9.18. Note that when the bipartite graph is not fully connected, we
need to append edges with 0 weights.
The following theorem due to Kuhn and later Munkres provides the basis for this
graph-theoretic assignment algorithm.
Thus, finding the maximum weight matching in the original graph G is reduced
to finding a perfect matching of the equality graph G l . We can now form the steps
of the algorithm based on this theorem.
Finding the new labeling l′ is crucial in the operation of this algorithm. For a legal labeling l of the graph G, let us first define the neighborhood relations of a vertex in Gl and the set S,

αl = min{l(u) + l(v) − w(u, v) : u ∈ S, v ∉ T}        (9.8)

Now, the improved labeling l′ for any vertex of G can be specified in terms of the previous labeling l using αl as follows.

        ⎧ l(x) − αl   if x ∈ S
l′(x) = ⎨ l(x) + αl   if x ∈ T                        (9.9)
        ⎩ l(x)        otherwise
We can now write the pseudocode of the Kuhn-Munkres algorithm as shown in
Algorithm 9.1.
An example operation of this algorithm in a small weighted bipartite graph is
depicted in Fig. 9.19.
Analysis
There are n phases of the algorithm and at each phase the size of the matching is
incremented by 1. Initial slack calculation takes O(n²) time. When a vertex moves
The matching problem in weighted bipartite graphs can be solved efficiently using the auction method, which is based on game theory. An auction in everyday life involves an auctioneer opening the bidding and bidders submitting bids, and the object under consideration is acquired by the bidder that offers the highest price. Auction algorithms are based on this principle, in which a bipartite graph G = (A ∪ B, E) is considered with vertex set A as buyers and B as objects [4].
Fig. 9.19 Running of Kuhn-Munkres algorithm in a small weighted bipartite graph. The edges of
the matching obtained at each iteration are shown in bold
Each object j has a price pj associated with it, and the weight w(i, j) of an edge between a bidder i and an object j shows the amount that bidder i values object j; in other words, it is the value of the object as seen by buyer i. The algorithm consists of the bidding phase and the assignment phase. Each object can be sold to only one person and each person can buy only one object.
For each object, a buyer has a benefit and a price to be the owner of that object. The profit of an object for a buyer is the difference between its benefit and its price. Algorithm 9.10 displays the pseudocode for the sequential auction algorithm as adapted from [24]. At each iteration, an unassigned buyer from the set A is selected, then an object with the maximum profit for that buyer is found. The object yielding the second highest profit is also computed and the bid is computed as the difference of these two best profits. The object that provides the best profit is then assigned to the buyer in the assignment phase. The price of the object is then increased by the bid plus a small value ε, which may be initialized to ε = 1/(n + 1). Iterations continue until each buyer is assigned to an object.
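A minimal sketch of this bidding and assignment loop is given below. It follows the standard auction scheme rather than the exact pseudocode of Algorithm 9.10, and the 4 x 4 benefit matrix is made up for the example.

```python
from collections import deque

def auction_matching(benefit, eps=None):
    """Auction algorithm sketch: benefit[i][j] is the value of object j to buyer i.
    Returns owner[j], the buyer assigned to object j."""
    n = len(benefit)
    eps = eps if eps is not None else 1.0 / (n + 1)   # small increment, e.g. 1/(n+1)
    prices = [0.0] * n
    owner = [None] * n
    unassigned = deque(range(n))
    while unassigned:
        i = unassigned.popleft()                      # bidding phase
        profit = [benefit[i][j] - prices[j] for j in range(n)]
        best = max(range(n), key=lambda j: profit[j])
        second = max(profit[j] for j in range(n) if j != best)
        prices[best] += profit[best] - second + eps   # raise the price by the bid
        if owner[best] is not None:                   # assignment phase: displace previous owner
            unassigned.append(owner[best])
        owner[best] = i
    return owner

benefit = [[9, 5, 6, 2],
           [7, 8, 3, 4],
           [6, 2, 9, 5],
           [3, 4, 2, 7]]
print(auction_matching(benefit))   # [0, 1, 2, 3]: buyer i wins object i in this example
```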
We will show the operation of this algorithm using the weighted bipartite graph of Fig. 9.19 as an example in Fig. 9.20. We have four buyers 1, 2, 3 and 4 and four objects A, B, C and D with initial prices all set to zero and ε = 0. We start with the lowest index buyer 1, whose highest profit of 8 is at object C and whose second highest profit of 5 is at object A. The bid is therefore 8 − 5 = 3 for object C as shown. Buyer 1 is assigned to this object and we start the second iteration with buyer 2. Similarly, buyer 2 is assigned to object A with bid 3, which is the difference of its two best profits as shown in (b), and buyer 3 is assigned to object B with the bid 2 as depicted in (c). We have a different situation in (d) where buyer 4 can bid 7, which is the difference between its two best profits, for object B. This bid is higher than the current price of 2 for object B and therefore we release buyer 3 from object B and assign buyer 4 to this object. Finally, buyer 3 is re-assigned, this time to object D, as shown in (d), which yields the maximum weighted matching for this bipartite graph.
Recently, it was shown in [21] that the expected time complexity of the auction algorithm for random bipartite graphs, where each edge is independently selected with probability p ≥ c log n/n with c > 1, is O(n log² n/(np)). Also in this study, the expected time complexity of this algorithm in a shared memory parallel system with O(log n) processors is shown to be O(n log n) (Fig. 9.20).
Fig. 9.20 A sample weighted bipartite graph to test the Auction algorithm. We have the same
maximum matching as in Fig. 9.19
Preis came up with a greedy weighted matching algorithm that has better performance
than the global greedy algorithm [22]. The idea of this algorithm is to select the locally
heaviest edges rather than a globally maximum weight one. A locally heaviest edge
is an edge with the largest weight among all of its adjacent edges. Selection of a locally heaviest edge is done by arbitrarily choosing an edge (u, v); if an adjacent edge with a larger weight is found, the search moves to that edge, and this continues until a locally heaviest edge is reached. The operation of this algorithm is depicted in Algorithm 9.11. We can see that the local operations are independent and, for this reason, this approach is suitable for distributed and also parallel matching.
The iterations of this algorithm are illustrated in Fig. 9.21. The time complexity of this algorithm is O(m) with an approximation ratio of 1/2 [22].
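The idea of walking to heavier adjacent edges can be sketched as follows; this simple version does not reproduce the linear-time bookkeeping of the original algorithm, and the small weighted graph at the end is made up, not the graph of Fig. 9.21.

```python
def locally_heaviest_matching(edges):
    """Preis-style sketch: repeatedly locate a locally heaviest edge, add it to the
    matching, and delete all edges adjacent to it.

    edges: dict edge_id -> (weight, u, v)."""
    adj = {}
    for eid, (w, u, v) in edges.items():
        adj.setdefault(u, set()).add(eid)
        adj.setdefault(v, set()).add(eid)
    remaining = set(edges)
    matching = []
    while remaining:
        eid = next(iter(remaining))                    # start from an arbitrary remaining edge
        while True:                                    # move to a heavier adjacent edge if any
            w, u, v = edges[eid]
            heavier = [f for x in (u, v) for f in adj[x] if edges[f][0] > w]
            if not heavier:
                break                                  # eid is locally heaviest
            eid = heavier[0]
        w, u, v = edges[eid]
        matching.append((u, v, w))
        for x in (u, v):                               # delete the matched edge and its neighbors
            for f in list(adj[x]):
                remaining.discard(f)
                _, fu, fv = edges[f]
                adj[fu].discard(f)
                adj[fv].discard(f)
    return matching

edges = {0: (12, 'a', 'b'), 1: (4, 'b', 'c'), 2: (15, 'd', 'e'),
         3: (7, 'c', 'd'), 4: (9, 'a', 'i'), 5: (2, 'c', 'g')}
print(locally_heaviest_matching(edges))   # locally heaviest edges, e.g. of weights 15, 12, 2
```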
Fig. 9.21 The operation of Preis’ algorithm in a sample graph. The first selected edge is (d, e) but
an adjacent edge (e, f ) has a greater weight so (e, f ) is checked and found to be locally heaviest
and included in the matching M in a. All adjacent edges to (e, f ) are removed from the graph to
obtain the subgraph in b. This time edges (i, h), then (h, b) and then (b, a) are selected in sequence
to find the locally heaviest edge (b, a) which is added to M in b. The third iteration selects (c, d)
and (c, g) in turn to add (c, g) to M. The last edge to add is (i, h) as shown in d and the final
matching with a total weight 39 is shown in e
edge adjacent to this edge and hence it can be included in the MWM. There are
two message types, request and drop; a node u that finds (u, v) is the heaviest edge incident to it sends a request to the neighbor node v. If node v also finds that (u, v) is the heaviest weight edge incident to it, it replies with a request message and (u, v) is included in the MWM as shown in Algorithm 9.12.
Analysis
An edge of the network graph may be traversed by at most two messages, either by req messages from the two nodes at its endpoints or by a req and a drop message. Therefore, the total number of messages exchanged is at most 2m. Since this algorithm imitates the sequential Preis algorithm, the matching produced is the same as that of the global heaviest matching algorithm, with the same approximation ratio of 1/2 [13].
In search of a parallel algorithm for the matching problem, we can partition the graph
and distribute the vertices to processors. Each process then performs the following
for its partition of the graph.
We need to be careful while considering the border vertices in the partitions. This
can be handled by the introduction of ghost vertices which are the non-member
vertices that are connected to the border vertices of a partition. In this case, when a
border vertex v is matched in a partition i, the process pi responsible for the partition
i should inform processes p j which hold ghost vertices that are neighbors of v of
the matching.
Parallelizing Hoepman’s Algorithm
Manne et al. described the similarity between Hoepman’s algorithm and Luby’s parallel algorithm for building an independent set of a graph, which we will describe in Chap. 10.
results can then be gathered at a root process which merges them to find the global
matching. A recent survey of parallel algorithms for maximum matching in bipartite
graphs is provided in [2].
In many cases, approximation matching algorithms turn out to be faster at the expense of returning an approximate solution rather than an exact one. For very large graphs, they may be preferable as the running times involved may otherwise be very high. We have also described how a series of conversions from one type of algorithm to another can lead to efficient solutions. The algorithm that sorts edges and then includes legal edges in the matching has O(m log m) complexity due to the sorting process and has an approximation ratio of 1/2. Preis came up with the idea of selecting locally heaviest edges, which are independent of each other, to achieve a better time complexity of O(m). Hoepman later provided a distributed version of this algorithm with the same approximation ratio, as we reviewed. Finally, Manne et al. presented a parallel approximation matching algorithm based on Hoepman’s work. The sequence of development here is a sequential algorithm, an improved sequential algorithm, a distributed algorithm derived from the improved sequential algorithm, and a distributed memory parallel algorithm that builds upon the distributed algorithm. This path, although with fewer steps here, is commonly followed in various graph problems as we saw.
Distributed matching algorithms need careful consideration as matching of an
edge incident to a node in a network requires notification to two-hop neighbors since
they will be affected. Matching has numerous applications and hence there is a need for parallel and distributed algorithms with better performance.
Exercises
1. Given the graph of Fig. 9.22 with initial matching shown in bold, find augmenting
paths iteratively to obtain a maximum matching for this graph.
2. Work out the maximum matching in the bipartite graph of Fig. 9.23 using the
augmenting path algorithm.
3. Find the maximum matching in the bipartite graph of Fig. 9.24 using the Hopcroft–
Karp algorithm showing the BFS trees constructed in all iterations.
4. Determine the maximum matching in the graph of Fig. 9.24 this time using the
maximum flow method of Ford–Fulkerson algorithm.
5. A multiprocessor system has 5 computers P1 , . . . , P5 that should finish 5 tasks
1, . . . , 5. The time to finish tasks for each processor is given in the below cost
matrix. Work out the minimum time to finish all tasks by these 5 processors using
        1   2   3   4   5
    P1  8   1   4   3   2
    P2  2   5   9   6   4
C = P3  6   2   3   4   5                    (9.10)
    P4  1   4   7   9   3
    P5  5   0   8   1   2
References
1. Avis D (1983) A survey of heuristics for the weighted matching problem. Networks 13:475–493
2. Azad A, Halappanavar M, Rajamanickam S, Boman EG, Khan AM, Pothen A (2012) Multi-
threaded algorithms for maximum matching in bipartite graphs. IPDPS 2012:860–872
3. Berge C (1957) Two theorems in graph theory. Proc Natl Acad Sci USA 43:842–844
4. Bertsekas DP, Tsitsiklis JN (1989) Parallel and distributed computation: numerical methods.
Prentice-Hall, Englewood Cliffs
5. Bertsekas DP, Castanon DA (1991) Parallel synchronous and asynchronous implementations
of the auction algorithm. Parallel Comput 17:707–732
6. Bus L, Tvrdik P (2009) Towards auction algorithms for large dense assignment problems.
Comput Optim Appl 43(3):411–436
7. Edmonds J (1965) Paths, trees and flowers. Can J Math 17:449–467
8. Erciyes K (2015) Distributed and sequential algorithms for bioinformatics. Springer computa-
tional biology series. Springer, Cham
9. Erciyes K (2015) Distributed graph algorithms for computer networks. Springer computer and
communications series. Springer, London
10. Gabow HN (1976) An efficient implementation of Edmonds’ algorithm for maximum matching
on graphs. J Assoc Comput Mach 23:221–234
11. Hausmann D, Korte B (1978) K-greedy algorithms for independence systems. Z Oper Res
22(1):219–228
12. Hirvonen J, Suomela J (2012) Distributed maximal matching: greedy is optimal. In: Kowalski D,
Panconesi A (eds) PODC12. Proceedings of 2012 ACM symposium on principles of distributed
computing, Madeira, Portugal, 16–18 July 2012
13. Hoepman JH (2004) Simple distributed weighted matchings. Technical report, Nijmegen insti-
tute for computing and information sciences (NIII)
14. Hopcroft JE, Karp RM (1973) An O(n^2.5) algorithm for maximum matching in bipartite graphs.
SIAM J Comput 2:225–231
15. Karypis G, Kumar V (1998) A parallel algorithm for multilevel graph partitioning and sparse
matrix ordering. J Parallel Distrib Comput 48(1):71–95
16. König D (1931) Graphen und matrizen. Math. Lapok 38:116–119
17. Kuhn HW (1955) The Hungarian method for the assignment problem. Nav Res Logist Q 2:83–
97
18. Manne F, Bisseling RH (2007) A parallel approximation algorithm for the weighted maximum
matching problem. In: Wyrzykowski R, Karczewski K, Dongarra J, Wasniewski J (eds) Pro-
ceedings of seventh international conference on parallel processing and applied mathematics
(PPAM 2007). LNCS, vol 4967. Springer, Berlin, pp 708–717
19. Micali S, Vazirani V (1980) An O(√|V| · |E|) algorithm for finding maximum matching in general graphs. In: Proceedings of 21st annual symposium on foundations of computer science, IEEE, pp 17–27
20. Munkres J (1957) Algorithms for the assignment and transportation problems. J Soc Ind Appl
Math 5(1):32–38
21. Naparstek O, Leshem A (2014) Expected time complexity of the auction algorithm and the push
relabel algorithm for maximal bipartite matching on random graphs. Random Struct Algorithms
48:384–395
22. Preis R (1999) Linear time 1/2-approximation algorithm for maximum weighted matching in
general graphs. In: Meinel C, Tison S (eds) Symposium on theoretical aspects of computer
science (STACS) 1999. LNCS, vol 1563, Springer, Berlin, 259–269
23. Riedy J (2010) Making static pivoting scalable and dependable. Ph.D. thesis, EECS Department,
University of California, Berkeley
24. Sathe M (2012) Parallel graph algorithms for finding weighted matchings and subgraphs in
computational science. Ph.D. thesis, University of Basel
25. Vazirani VV (1994) A theory of alternating paths and blossoms for proving correctness of the O(√V E) general graph maximum matching algorithm. Combinatorica 14(1):71–109
Independence, Domination,
and Vertex Cover 10
Abstract
Subgraphs of a graph may have some special properties and detecting these sub-
graphs may be useful for various applications. In this chapter, we study theory and
sequential, parallel, and distributed algorithms for three such special subgraphs:
independent sets, dominating sets, and vertex cover.
10.1 Introduction
An independent set of a graph is a subset of its vertices such that no vertex in this
set is adjacent to any other vertex contained in this set. We can formally define the
independent set as follows.
Fig. 10.1 a An MIS with order 3. b A MaxIS with order 4 of a sample graph. The vertices in the
independent sets are shown in bold
problem which means they can be reduced to each other in polynomial time stated
as follows:
We will review four sequential algorithms to find the MIS of a graph, starting with a random greedy one. The second algorithm uses a heuristic, the third one considers labels of vertices while selecting members of the MIS, and the fourth algorithm is a general method that finds an independent set at each step.
Fig. 10.3 Running of Seq_MIS1 in a sample graph. The first three iterations are shown in a–c;
and the last two iterations are shown in d. The independent set vertices are shown in bold and the
deleted neighbor vertices in gray with the deleted edges marked with dashed lines
this set should be adjacent. It is also maximal since we proceed until there are no
vertices left and hence cannot enlarge I any further. This algorithm requires O(n)
steps as we may end up selecting a vertex and its single neighbor repeatedly as in the
case of a linear network. The running of this algorithm in a sample graph is shown
in Fig. 10.3.
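A sequential sketch of this greedy approach is given below; the adjacency structure and the function name are ours.

```python
def greedy_mis(adj):
    """Pick any remaining vertex, add it to the independent set, and delete it
    together with all of its neighbors; repeat until no vertex remains."""
    remaining = set(adj)
    mis = set()
    while remaining:
        v = min(remaining)              # any remaining vertex works; min keeps it deterministic
        mis.add(v)
        remaining -= adj[v] | {v}       # remove v and its neighbors
    return mis

adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}}   # a 5-cycle
print(greedy_mis(adj))   # {0, 2} is a maximal independent set of the cycle
```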
Theorem 10.1 LDFA provides an MIS I such that |I | ≥ n/(Δ(G)+1) where Δ(G)
is the maximum degree of the graph.
Fig. 10.4 Running of LDFA in the same graph of Fig. 10.3. The final MIS is shown in h
these labels as shown in Algorithm 10.2. In this case, we know which vertex to select at each iteration and this algorithm is called the Lexicographically First MIS algorithm (LFA). However, this algorithm does not improve the runtime of the previous one as it has a similar greedy approach to the first one, with a time complexity of O(n + m) since we check the neighbors of each vertex.
Clearly, the choice of the independent set at line 6 determines the performance of
this algorithm. We will see that randomization in this selection provides algorithms
with good performances (Fig. 10.5).
Fig. 10.5 Finding MIS of a sample graph by selecting an IS at each step. The independent set
vertices are shown in bold and the deleted neighbor vertices in gray with the deleted edges marked
with dashed lines
Running of this algorithm in a sample graph is depicted in Fig. 10.6 for a sample
graph colored with four colors. Time complexity of this algorithm is O(km) since
we need to run the for loop O(k) times and we may need to check all of the edges
at each run. We also need the time C to color vertices of the graph G and thus total
time is C + O(km).
by Luby in 1986 to find MIS of a graph which proceeds as follows [7]. Each vertex v
is marked with probability 1/(2d(v)) in parallel, where d(v) is the degree of v, to be
included in the independent set or not. This marking may produce edges with both
endpoints marked to be in the MIS since the assignment is done in parallel and independently, and hence corrections are needed. The next step identifies such edges and for each such edge (u, v) with both u and v marked, the vertex with the higher
degree is selected and in the case of a tie, vertex identifiers are used to select only
one of such vertices. The selected vertices and their neighbors are then deleted from
the graph and this process is repeated until the graph becomes empty as shown in
Algorithm 10.5.
The selection of the nodes to be included in the independent set can be implemented on an EREW-PRAM using O(m) processors with each execution taking
O(log n) time. It can be shown that the execution of the while loop is O(log n)
times [7] resulting in a total time of O(log2 n) for this algorithm. We will use this
algorithm as the basis of a distributed algorithm as described in the next section.
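A sequential simulation of these rounds is sketched below; the marking probability, the conflict rule favoring the higher degree endpoint, and the deletion of selected vertices and their neighbors follow the description above, but the data structures and names are ours.

```python
import random

def luby_mis(adj, seed=1):
    """Round-based simulation of Luby's randomized MIS algorithm."""
    random.seed(seed)
    adj = {v: set(ns) for v, ns in adj.items()}        # working copy
    mis = set()
    while adj:
        deg = {v: len(ns) for v, ns in adj.items()}
        marked = {v for v in adj                        # mark with probability 1/(2 d(v))
                  if deg[v] == 0 or random.random() < 1.0 / (2 * deg[v])}
        for v in list(marked):                          # unmark the lower (degree, id) endpoint
            if any(u in marked and (deg[u], u) > (deg[v], v) for u in adj[v]):
                marked.discard(v)
        mis |= marked
        removed = set(marked)
        for v in marked:                                # delete selected vertices and neighbors
            removed |= adj[v]
        adj = {v: ns - removed for v, ns in adj.items() if v not in removed}
    return mis

adj = {0: {1, 4}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3, 0}, 5: set()}
print(luby_mis(adj))   # a maximal independent set of the graph, e.g. {1, 3, 5}
```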
We can form a shared memory parallel version of Luby’s algorithm as described
in [3]. We have three vectors of n elements for n vertices; C, I , and R. I [i] shows
whether vertex i is included in the MIS or not, C[i] displays whether vertex i is a
candidate to be included with 0 meaning it is either in MIS or a neighbor of a vertex
in the MIS and therefore cannot be included, and finally R[i] holds the generated
random number for vertex i at each iteration. The vector C is initialized to all 1s, meaning all vertices are candidates, and I is initialized to all 0s since no MIS member
is determined. The vectors are 1-D partitioned among k processes and each process
performs the following steps at each iteration until MIS is found which is determined
by all entries of vector C becoming 0 as shown in Algorithm 10.6 for parallel process
i for a total number of k processes. This algorithm requires synchronization at each
step, otherwise its performance is similar to Algorithm 10.5 when synchronization
is not considered.
We can use various heuristics to design distributed algorithms for the MIS problem
in a network setting. We describe three distributed MIS algorithms in this section; the
first algorithm uses identifiers of nodes to break symmetries and the second algorithm
is a distributed version of Luby’s parallel algorithm with the third one being another
randomized distributed algorithm with a better performance. Although the last two
algorithms have similar structures, we show two common ways of implementation,
using finite-state machines in the first one and a more straightforward approach in
the second one by showing control messages explicitly.
Fig. 10.7 Execution of Dist_MIS1 in a sample graph with unique node identifiers. The in_mis
messages are shown by arrows, the MIS nodes are in black, and the neighbor nodes that enter
NONMIS state are shown by double circles. In only two rounds shown in a and b, the MIS is
formed
A node that is included in the MIS should not participate in the algorithm in further
rounds. This is accomplished by changing its state and informing its neighbors by
in_mis message so they should as well remain inactive in further rounds. Any node
that is adjacent to a neighboring node of an MIS node is informed by the neighbor by
the neigh_mis message so that it is dropped from the active neighbors list. Operation
of this algorithm in a sample network is shown in Fig. 10.7.
The time complexity of this algorithm is O(n) as there can be purely sequential
operation as in a linear network with increasing/decreasing identifiers or a general
network with neighbors that have increasing or decreasing identifiers in a sequence.
The number of messages transmitted is proportional to m. This algorithm may there-
fore turn out to be slow.
A dominating set of a graph is a subset of its vertices such that every vertex is either
in this set or adjacent to a vertex in it. We can formally define this set as follows:
Fig. 10.8 a A connected DS with order 4. b An unconnected MDS with order 3 of a sample graph.
The vertices in the dominating sets are shown in black
A minimum dominating set (MinDS) of a graph is the set with the minimum order among all dominating sets of that graph. The cardinality of the MinDS of a graph G is called the domination number γ(G) of G. A minimal dominating set (MDS) of a graph does not contain any other dominating set of that graph as a proper subset as shown in Fig. 10.8. In other words, removing a vertex from such a set will destroy the dominating set property of this set. In a connected dominating set (CDS), there is a path between each pair of vertices in the dominating set, consisting only of dominating set vertices. Formally, ∀u, v ∈ D, there is a path u, x1, . . . , xk, v such that x1, . . . , xk ∈ D.
A k-dominating set D of a graph G = (V, E) consists of vertices such that every v ∈ V − D is adjacent to at least k elements of D. A k-distance dominating set, which is sometimes confused with the k-dominating set concept, is a set of vertices D such that every vertex of the graph is within distance at most k of at least one vertex of D. Note that the latter definition loosens the general dominating set definition.
Finding MinDS of a graph is NP-hard [2] and we are mostly interested in find-
ing minimal dominating sets of graphs when we review sequential, parallel, and
distributed algorithms for this purpose in this section.
For the design of a greedy algorithm, we will use coloring of vertices such that
vertices in MDS will be shown in black, their dominated neighbors are colored gray
and any other vertex in the graph is white with all vertices initialized to white. The
span of a vertex v is the number of white neighbors it has including itself if it is
white. The heuristic we will use is to always select a white or a gray vertex with the
highest span in the graph. Since our aim is to find a MDS, we are trying to cover as
many white vertices as possible at each step with this heuristic. The pseudocode for
this algorithm, called Span_MDS, is depicted in Algorithm 10.10.
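The heuristic can be sketched sequentially as follows; the graph and the helper names are ours, and white, gray, and black follow the coloring described above.

```python
def span_mds(adj):
    """Greedy dominating set: repeatedly color black a non-black vertex with the
    largest span (number of white vertices in its closed neighborhood)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {v: WHITE for v in adj}
    dom = set()
    while any(c == WHITE for c in color.values()):
        def span(v):
            return sum(1 for u in adj[v] | {v} if color[u] == WHITE)
        v = max((u for u in adj if color[u] != BLACK), key=span)
        color[v] = BLACK                   # v joins the dominating set
        dom.add(v)
        for u in adj[v]:
            if color[u] == WHITE:
                color[u] = GRAY            # neighbors of v are now dominated
    return dom

# A star centered at 0 with a pendant path 3-5
adj = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0, 5}, 4: {0}, 5: {3}}
print(span_mds(adj))   # {0, 3}: vertex 0 dominates 1-4 and vertex 3 dominates 5
```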
Figure 10.9 displays operation of this algorithm in a sample graph where the vertex
with the highest span is colored black at each iteration. We can see that after three
iterations, there are no white vertices left and the algorithm terminates. If we always
select a gray vertex with the highest span at each iteration after the first one, we have
a connected DS as shown in Fig. 10.9d.
Fig. 10.9 Running of MDS_Span in a sample graph. The three iterations are shown in a–c with
the dominating set vertices in black and the dominated vertices in gray. A connected DS is shown
in d for the same graph with the iteration step numbers displayed next to included vertices
n vertices using separate paths, this algorithm may start by coloring u or v black and
then attempt to color all of the intermediate vertices black, resulting in n − 1 steps
where coloring of u and v black suffices to form a DS in two steps.
Fig. 10.10 Running of Guha–Khuller’s first algorithm in a sample graph. Selected MCDS vertices
are colored in black and their neighbors are shown in gray. The selected vertex or vertex pairs are
shown inside dashed regions. The MCDS is formed after four iterations of the algorithm
Fig. 10.11 Running of Guha–Khuller’s second algorithm in a sample graph. The selected vertex at each iteration is shown inside a dashed circle and is colored black in the next iteration. Note that we could have opted to include the white vertex in the MDS in c since doing so also results in one less piece. Also, we could have selected the gray vertex next to the white one in d to result in one less step. The unconnected MDS in e is connected in f. The MCDS is formed after six iterations of the algorithm
using a Steiner tree based algorithm in the second phase. The output MCDS has an
approximation ratio of 3 + ln Δ [4].
vertex with the highest degree between the MIS vertices or using a Steiner tree based
algorithm as in Guha–Khuller second algorithm.
A simple distributed algorithm based on the span of a node in its two-hop neighbor-
hood in the network can be formed as follows. Each node exchanges its span with
all of its two-hop neighbors and if it has the highest span among all of these neigh-
bors, it enters the MDS. The algorithm is executed by a node i until it has no white neighbors, that is, nodes which are neither MDS nor dominated nodes, as shown in Algorithm 10.11.
It can be shown that this algorithm computes a MDS with ln Δ + 2 approximation
ratio [12]. The number of rounds needed is O(n) since there will be at least one new
node entering the dominating set at each round.
A vertex cover or a cover of a graph is a subset of its vertices such that every edge is
incident to at least one vertex in this subset. Vertex cover has numerous applications,
such as placing stores in a region so that every road leads to at least one store, in
bioinformatics [10] and in chemistry [9]. We can define this set property formally as
follows:
A minimum vertex cover (MinVC) of a graph is the set with the minimum order
vertex cover among all vertex covers of that graph. A minimal vertex cover (MVC)
of a graph does not contain any other vertex covers of that graph as depicted in
Fig. 10.12. In other words, removing a vertex from a MVC will destroy the vertex
Fig. 10.12 a An unconnected MVC with order 5. b A connected MinVC with order 4 of a sample
graph. The vertices in the vertex covers are shown in gray. The vertex cover in b is connected
cover property of this set. Finding MinVC of a graph is NP-hard [2] and commonly,
our goal is to find a minimal vertex cover of a graph.
We had stated this before in Chap. 3 as an example of a reduction and it would be appropriate to restate it here. A set V′ is a vertex cover of a graph G = (V, E) if and only if V \ V′ is an independent set of G. Since V′ is a vertex cover, every edge (u, v) has at least one endpoint in V′. If both vertices u and v are in V \ V′, then (u, v) ∉ E, as otherwise V′ would not be a vertex cover since it would not contain a vertex incident to the edge (u, v). Therefore, V \ V′ is an independent set of G. We also saw that I is an independent set of a graph G if and only if I is a clique in the complement graph of G. Thus, all three problems of independent set, clique, and vertex cover are equivalent.
Vertices of a graph may have weights associated with them depicting some phys-
ical property such as the capacity of a router in a computer network. In such a case,
our aim is to find a vertex cover with the minimal total weight. The minimum weight
vertex cover (MinWVC) is the vertex cover with minimum total weight among all
weighted vertex covers of a graph. Viewed from another perspective, we may need to find a minimal connected vertex cover (MCVC), which is to say that there is a path between each pair of vertices in the cover consisting of a subset of vertices in
this set only. We will review sequential, parallel, and distributed algorithms to find
unweighted and weighted minimal vertex covers in this section.
Unweighted vertex cover algorithms assume the vertices of the graph have no weights
(or sometimes unity weights) assigned to them.
current highest degree at each iteration since deleting a vertex and its incident edges
will cause a decrease in the degrees of its active neighbors. However, this approach
does not yield a fixed approximation ratio. In fact, the approximation ratio is Θ(log n)
which is not favorable as it depends on the number of vertices.
We have already reviewed how to compute the vertex cover of a graph from
a matching which yielded a constant approximation ratio of 2 as an example of
an approximation algorithm (See Sect. 3.8.2). In this algorithm, we find a maximal
matching of a graph by selecting an arbitrary legal edge and including both ends of the
selected edge in the MVC. The selected edge and its adjacent edges are deleted from
the graph and the process is repeated until there are no edges left. As for a parallel
vertex cover algorithm, we can always use a parallel maximal matching algorithm
by including both endpoints of matching edges in the vertex cover consequently.
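The matching-based 2-approximation can be written in a few lines; the edge list below is a made-up example.

```python
def vertex_cover_from_matching(edges):
    """2-approximation: both endpoints of every edge of a maximal matching."""
    cover = set()
    for u, v in edges:                            # edges in arbitrary order
        if u not in cover and v not in cover:     # (u, v) can still join the matching
            cover.update((u, v))
    return cover

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]  # a 5-cycle
print(vertex_cover_from_matching(edges))          # {0, 1, 2, 3} covers every edge
```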
Fig. 10.13 A sample bipartite graph to test minimum vertex cover algorithm from maximum
matching. The shaded vertices in a are visited by DFS from vertex d and the dark vertices in b are
contained in the minimum vertex cover
Proof We will first prove V∗ is a vertex cover, then that it is a minimum one. Let us assume V∗ is not a vertex cover. Then ∃(u, v) ∈ E with u ∈ A and v ∈ B such that u ∉ V∗ and v ∉ V∗. Since V∗ = ((A \ S) ∪ T), we must have u ∈ S and v ∈ B \ T. We have two possibilities in such a case:
We can conclude that each vertex of the set V ∗ is incident to a matched edge by
the first two observations and two endpoints of a matched edge are not both included
in V ∗ by the last observation.
Fig. 10.14 Execution of the greedy distributed vertex cover algorithm in a sample undirected graph.
Nodes included in the cover are shown in bold with deleted edges as dashed at each iteration
Proof In the last iteration of the algorithm, all vertices that have degree ≥ 1 will be
included in the vertex cover, therefore all of the edges of the graph will be removed
with at least one of their incident vertices included in the vertex cover, hence the
algorithm correctly constructs a vertex cover.
Fig. 10.15 Running of Parnas–Ron algorithm in the same sample graph of Fig. 10.14
The number of iterations is at most log Δ(G) since after log Δ(G) iterations, the remaining vertices in the graph will have degree 0. At each iteration, there are at most 2|V∗| new vertices that are added from G − V∗ to VC. At the beginning of the ith iteration, the degree of each vertex is at most Δ(G)/2^(i−1). Therefore, the number of edges between V∗ and G − V∗ is at most |V∗| · Δ(G)/2^(i−1). Let xi be the number of vertices in G − V∗ of degree at least Δ(G)/2^i at the beginning of the ith iteration. Hence, xi · Δ(G)/2^i ≤ |V∗| · Δ(G)/2^(i−1); therefore, xi ≤ 2|V∗|. Since we have at most log Δ(G) iterations, it follows that the total number of vertices included in VC is at most 2|V∗| · log Δ(G) [8].
When vertices have weights, we search for a minimal weighted vertex cover. Finding the vertex cover with the minimum total weight is NP-hard, like most of the problems we have studied. The Pricing algorithm is a sequential approximation algorithm for the MWVC problem, described next together with parallel and distributed algorithms for the MWVC problem.
and the sum of prices assigned to the edges that are incident to a vertex u should not exceed the weight wu of the vertex. Formally,

∑ e=(u,v) pe ≤ wu   for every vertex u ∈ V
When the sum of the prices of the edges incident to a vertex equals its weight, the vertex is said to be tight. A possible implementation of this algorithm is shown in Algorithm 10.14 where each vertex v ∈ V has a capacity cv which is initialized to its weight and the active edge set S is initialized to E. The algorithm inspects each edge euv ∈ S and, if neither u nor v is tight, it charges the edge with the lower of their remaining capacities. When the capacity of u or v becomes 0, it is labeled as a tight node and included in the MWVC V′. The algorithm stops when each edge has a tight vertex incident to at least one of its endpoints, meaning all of the edges are covered by tight vertices on one or both ends. The execution of this algorithm is shown in Fig. 10.16.
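The charging scheme can be sketched as follows; the vertex names, weights, and edges are made up, and the function is only an illustration of the idea, not Algorithm 10.14 itself.

```python
def pricing_mwvc(weights, edges):
    """Pricing sketch for minimum weight vertex cover (2-approximation).

    weights: dict vertex -> weight; edges: iterable of (u, v)."""
    capacity = dict(weights)                 # remaining capacity = weight minus prices charged
    tight = set()
    for u, v in edges:
        if u in tight or v in tight:         # edge already covered by a tight vertex
            continue
        charge = min(capacity[u], capacity[v])   # charge the edge the lower remaining capacity
        capacity[u] -= charge
        capacity[v] -= charge
        for x in (u, v):
            if capacity[x] == 0:             # vertex became tight: it enters the cover
                tight.add(x)
    return tight

weights = {'a': 4, 'b': 3, 'c': 5, 'd': 2}
edges = [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(pricing_mwvc(weights, edges))          # {'b', 'd'}, a cover of total weight 5
```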
Proof The algorithm correctly constructs a vertex cover since we continue processing edges until every edge has a tight vertex on at least one of its endpoints, therefore all of the edges are covered. At least one new vertex becomes tight at each charging iteration, resulting in O(n) time complexity.
Let V′ be the set of all tight vertices at the end of the algorithm and V∗ the set of minimum vertex cover vertices. We need to show w(V′) ≤ 2w(V∗). Since all vertices in V′ are tight, we can write the following:

w(V′) = ∑ v∈V′ wv = ∑ v∈V′ ∑ e=(u,v) pe ≤ ∑ v∈V ∑ e=(u,v) pe     (10.1)
Fig. 10.16 Running of the Pricing_MWVC algorithm in a sample graph. The weights of vertices are
shown inside them and tight vertices are shown with double circles. After five iterations, MWVC
is formed as shown in f with a total weight of 18
vertex cover results to form the final minimal vertex cover of the graph. Each process finds the minimal vertex cover in its partition by including the higher identifier neighbors of border vertices in its partition and sends the partial minimal vertex cover to the root.
We have reviewed three problems in graphs; the maximum independent set, the
minimum dominating set, and the minimum vertex cover problems. All of these
problems are NP-hard and the algorithms proposed in the literature are approximation
or heuristic algorithms that find maximal or minimal solutions rather than maximum
or minimum order subgraphs. We can employ greedy algorithms commonly by the
use of some heuristic but algorithms with better performances are frequently sought.
As we have noted in the highest degree first algorithm to find a minimal vertex cover,
a seemingly natural heuristic may not provide a constant approximation ratio.
We saw that the independent set, clique, and vertex cover problems are computationally equivalent; a vertex set V′ of a graph G = (V, E) is a vertex cover of G if and only if V \ V′ is an independent set of G. Also, a set V′ is a clique in the complement of G if and only if V′ is an independent set of G. A dominating set may be connected, and an independent set may be used as the first step of forming a connected dominating set. A vertex cover may also be connected, and vertices may have weights associated with them depicting some physical parameter attributed to the nodes of the system that the graph represents. The weighted versions of these problems commonly require different considerations than the unweighted ones.
The parallel algorithms for these problems are scarce and with the recent advance-
ments resulting in the availability of data for very large real networks, there is an
increasing need for parallel algorithms. In some cases, fast, efficient deterministic
parallel algorithms have been developed for these problems but these algorithms may
be quite complicated. For parallel computation of MDS, we can use a similar algo-
rithm as the one used for vertex cover by partitioning the graph to a set of processes
each of which runs an MDS algorithm in its partition.
Distributed network algorithms for these problems have been investigated more thoroughly, as we have noted. In many cases, these algorithms are derived from sequential ones; however, there is still a need for algorithms with better performance.
Exercises
1. Propose a heuristic to find the IS in Algorithm 10.2 and implement this method
with Algorithm 10.2 to find the MIS of Fig. 10.19.
2. In order to find the MaxIS of a rooted tree, we can include all leaves of the tree in the MaxIS and move upwards in the tree, alternately excluding one level from the MaxIS and including the next level in it.
3. Implement Guha–Khuller first algorithm to find the MDS in the example graph
of Fig. 10.20.
4. Implement the distributed span algorithm to find the MDS in the example graph
of Fig. 10.21.
5. Find the MVC of the example bipartite graph of Fig. 10.22 using matching. Show
all iterations of the algorithm.
6. Work out the MWVC in the sample graph of Fig. 10.23 using Pricing algorithm.
7. Implement the greedy distributed weighted vertex cover algorithm to find the
MWVC in the graph shown in Fig. 10.24, where weights of vertices are shown
inside them.
References
1. Erciyes K (2013) Distributed graph algorithms for computer networks (Chap. 10). Computer
communications and networks series, Springer, Berlin. ISBN 978-1-4471-5172-2
2. Garey MR, Johnson DS (1978) Computers and intractability: a guide to the theory of NP-
completeness. Freeman, New York
3. Grama A, Gupta A, Karypis G, Kumar V (2003) Introduction to parallel computing (Chapter
10), 2nd edn. Addison Wesley, Reading
4. Guha S, Khuller S (1998) Approximation algorithms for connected dominating sets. Algorith-
mica 20(4):374–387
5. König D (1931) Graphen und Matrizen. Math Lapok 38:116–119
6. Koufogiannakis C, Young N (2009) Distributed and parallel algorithms for weighted vertex
cover and other covering problems. In: The 28th ACM SIGACT-SIGOPS symposium on prin-
ciples of distributed computing (PODC 2009)
7. Luby M (1986) A simple parallel algorithm for the maximal independent set problem. SIAM
J Comput 15(4):1036–1053
8. Parnas M, Ron D (2007) Approximating the minimum vertex cover in sublinear time and a
connection to distributed algorithms. Theor Comput Sci 381(1):183–196
9. Rhodes N, Willett P, Calvet A, Dunbar JB, Humblet C (2003) Clip: similarity searching of 3d
databases using clique detection. J Chem Inf Comput Sci 43(2):443–448
10. Samudrala R, Moult J (2006) A graph-theoretic algorithm for comparative modeling of protein
structure. J Mol Biol 279(1):287–302
11. Stein C (2012) IEOR 8100. Lecture notes. Columbia University
12. Wattenhofer R (2016) Principles of distributed computing (Chapter 7). Class notes. ETH Zurich
Coloring
11
Abstract
Coloring in a graph refers either to vertex coloring, edge coloring or both in which
case it is called total coloring. Each vertex is assigned a color from a set of colors
such that no two adjacent vertices have the same color in vertex coloring. Edge
coloring is the process of assigning colors to the edges of a graph such that no
two edges incident to the same vertex are assigned the same color. We review
sequential, parallel, and distributed algorithms for vertex and edge coloring in
this chapter.
11.1 Introduction
Coloring in a graph refers either to vertex coloring, edge coloring or both in which
case it is called total coloring. Each vertex is assigned a color from a set of colors such
that no two adjacent vertices have the same color in vertex coloring. This method has
many applications including channel frequency assignment and scheduling of jobs.
Assignment of frequency channels to radio stations may be modeled by coloring of a
graph with the vertices representing radio stations and an edge connects two stations
if they are within interference distance to each other. Different colors in this case
correspond to different broadcast frequencies. As a scheduling example, we may
need to assign final exams in a university so that no student takes two exams at the
same time. We can represent each exam by a vertex in a graph and an edge connects
two vertices a and b if a student is taking both final exams a and b. If the colors of
vertices represent time slots for final exams, our aim is to color each vertex of the
graph such that two adjacent vertices receive a different color, meaning a student
taking both exams will attend them in different time slots. Edge coloring is the
process of assigning colors to the edges of a graph such that no two edges incident to
the same vertex are assigned the same color. Edge coloring may be used in planning
a timetable for teachers to teach courses in a school using the minimum amount of course time. A bipartite graph with teacher and course partitions of vertices can be formed, and we search for a minimal edge coloring of this graph. Each color class can then be scheduled in one time slot, so the number of colors used gives the total time needed to teach all of the courses. Time division multiple access network
time spent in teaching all of the courses. Time division multiple access network
communication protocols for sensor networks may be realized using edge coloring,
representing each time slot with a color [7].
The main goal of any coloring method is to use a minimum possible number of
colors. Since this is an NP-hard problem for vertex and edge coloring [10], various
heuristics are commonly employed. Parallel vertex coloring algorithms attempt to
concurrently color different regions of the graph under consideration by a number
of processes to achieve speedup. A distributed graph coloring algorithm on the other
hand, is executed by each node of the network graph so that each node determines
its color in the end.
We can have parallel and distributed edge coloring algorithms as in vertex coloring.
Total graph coloring can be achieved by both coloring vertices and edges of a graph.
We start with the vertex coloring problem in this chapter by describing sequential,
parallel and distributed algorithms for this task and continue with algorithms for
edge coloring.
A vertex coloring of a graph is the assignment of colors to its vertices such that no
two adjacent vertices have the same color. It can be formally defined as follows.
For a graph with n vertices, the set C with n elements will provide its coloring,
however, our aim is to find the minimum number of colors in the optimization version
of the vertex coloring problem. The decision version of this problem seeks an answer to the question: “Can we color the vertices of a graph with at most k colors?”
We will consider connected and simple graphs for this problem. A k-coloring of a
graph G is the coloring of G using k colors. In a proper vertex coloring of a graph,
adjacent vertices are colored with distinct colors. The vertices of the same color in
a graph form a color class.
Remark 9 Any subgraph H of a graph G can be colored with at most as many colors as G, that is, χ(H) ≤ χ(G).
While coloring G we will have colored all of its subgraphs, and it is possible that its subgraphs can be colored with fewer colors.
Remark 10 The chromatic number χ(Kn) of the complete graph Kn with n vertices is n.
This is valid since all vertices are connected to all other vertices in Kn and hence we need n distinct colors to color such a graph.
Remark 11 The chromatic number χ (G) of a star graph Sn with n vertices is 2 since
we can color all of the vertices connected to the center with the same color and the
center with another color.
• A bipartite graph has no odd length cycles and hence can be colored with 2 colors. We can color a bipartite graph by running the BFS algorithm of Chap. 7 and coloring vertices at odd levels with color 1 and vertices at even levels with color 2 (a sketch of this approach is given after this list).
• Since a tree is a bipartite graph, we can color a tree with two colors.
• A cycle graph with an even number of vertices is a bipartite graph and therefore,
we can color such a graph with two colors Fig. 11.1a.
• A cycle graph with an odd number of vertices is not a bipartite graph. We can
color such a graph with n vertices using a total of three colors; two colors for
n − 1 vertices and a third color for the nth vertex as shown in Fig. 11.1b.
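The BFS-based 2-coloring of a connected bipartite graph mentioned in the first item above can be sketched as follows; the graph and the function name are ours.

```python
from collections import deque

def two_color_bipartite(adj, source):
    """Color a connected bipartite graph with 2 colors by BFS levels:
    color 1 at even levels, color 2 at odd levels."""
    color = {source: 1}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for u in adj[v]:
            if u not in color:
                color[u] = 3 - color[v]      # alternate between colors 1 and 2
                queue.append(u)
    return color

adj = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}   # a path a-b-c-d
print(two_color_bipartite(adj, 'a'))   # {'a': 1, 'b': 2, 'c': 1, 'd': 2}
```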
Theorem 11.1 (Brooks’ Theorem [4]) For a connected graph G that is neither a complete graph nor an odd cycle,

χ(G) ≤ Δ(G)     (11.1)

Lovasz gave a short and simple algorithmic proof of this theorem by considering three cases [15].
For every graph, the weaker bound χ(G) ≤ Δ(G) + 1 holds, and for the complete graph Kn this bound is attained since Δ(Kn) + 1 = n = χ(Kn). We will see that the greedy algorithm presented in the next section uses at most this many colors. Brooks showed that the equality χ(G) = Δ(G) + 1 holds only for odd cycles and complete graphs. However, this upper bound on the chromatic number of a graph may turn out to be very far from the real value, as in the star graph Sn with n vertices, where Δ(Sn) = n − 1 and χ(Sn) = 2.
Remark 14 If a graph G can be partitioned into k disjoint independent sets but not
less, then χ (G) = k.
Each color class Ii of a proper coloring is an independent set, hence

|Ii| ≤ α(G)

where α(G) is the maximum independence number of G. Since the n vertices of G are partitioned into χ(G) such color classes,

χ(G) ≥ n/α(G)     (11.2)
A clique of a graph G is a complete subgraph of G. The clique number ω(G) of a graph G is the order of its largest clique. There is a relation between a clique and an independent set of a graph G, as we have noted in Sect. 10.2.1: a subset V′ of the vertices of G = (V, E) is a clique of G if and only if V′ is an independent set in the complement of G.
Theorem 11.2 Let ω(G) be the clique number of graph G, that is, it is the order of
the largest clique of G. Then,
χ (G) ≥ ω(G) (11.3)
Proof Since all of the vertices of a clique are adjacent to each other, each such vertex must be colored with a different color. Therefore, the order of the maximum clique in a graph G sets a lower bound on the chromatic number of G.
Since coloring the vertices of a graph with its chromatic number of colors is an
NP-hard problem, various heuristics are proposed in the literature that approximate
the number of colors to χ (G). In this section, we first present a greedy coloring
algorithm template and then review four algorithms using different heuristics which
are the random algorithm, the first-fit algorithm, the largest-degree-first algorithm and
the saturation-based-ordering algorithm. All of these algorithms may be classified
as greedy approaches as they select the vertex that best meets the required criteria at
each iteration.
Fig. 11.2 Random selection; a A random coloring heuristic by Algorithm 11.1 selects vertices
b-g-e-a-d-h-i-c in sequence and uses four colors shown next to vertices as integers; b The largest-
degree-first heuristic uses three colors for this graph, irrespective of the choice of vertices when
degrees are the same
the selection of the vertex to be colored. In the simplest form, a vertex is selected ran-
domly and the smallest available color is assigned to this vertex in random selection.
The operation of this algorithm in a simple graph is depicted in Fig. 11.2a.
• Identifier-based Algorithm: In this case, vertices of the graph are numbered from 1 to n to yield a vertex set V = {v1, . . . , vn} and the vertices are colored in this sequence obeying the rules of coloring, that is, each vertex is colored with the minimum possible color that does not conflict with its neighbors. This algorithm is also called the first-fit algorithm and uses at most 2χ(G) colors on the average [11]. It is simple and fast in the general sense but can yield an approximation ratio of n/4 in some special graphs [12], requiring O(m) running time (a sketch of this first-fit rule is given after this list).
• Largest-Degree-First (LDF) Algorithm: It makes sense to color the large degree
vertices first to have a less number of colors since the low-degree vertices can
usually be colored in a more flexible way as proposed in [20]. The operation of
this algorithm is shown in the graph of Fig. 11.2b and we can see it results in one
less color than the greedy approach. This algorithm can be implemented to have
O(m) time complexity (Fig. 11.3).
• Saturation-Degree-Ordering (SDO) Algorithm: A further refinement to the LDF
algorithm can be provided as follows [3]. The saturation degree s(v) of a vertex v
is defined as the number of distinct colors currently assigned to its neighbors. This
parameter is dynamic and a greedy algorithm based on the saturation degrees can
be designed to always select the vertex with the highest value of this parameter.
In case of ties, the vertex with the highest degree is selected, which means we search for the largest value of the pair (s(v), deg(v)) over all vertices v ∈ V that are not assigned a color. Note that such an algorithm will start with the largest degree vertex v of the graph and assign the minimum color to v, and will continue with
the largest degree neighbor vertex of v. The operation of SDO algorithm is shown
in Fig. 11.3. This algorithm has O(n 2 ) time complexity [3].
• Incident-Degree-Ordering Algorithm: This heuristic is a modified form of the
SDO. The incident degree of a vertex is the number of its colored neighbors. Note
that the colors of neighbors need not be distinct as in the saturation-based heuristic.
The vertex that has the highest incident degree is selected at each iteration of the
algorithm [6]. Vertex identifiers are used in case of ties as in the saturation degree
algorithm. It is a linear time algorithm running in O(m) time.
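A sketch of the first-fit rule, with the largest-degree-first ordering as the default, is given below; the graph and the names are ours.

```python
def greedy_coloring(adj, order=None):
    """First-fit coloring: visit vertices in the given order and give each the
    smallest color not used by an already colored neighbor."""
    if order is None:                       # default: largest-degree-first ordering
        order = sorted(adj, key=lambda v: -len(adj[v]))
    color = {}
    for v in order:
        used = {color[u] for u in adj[v] if u in color}
        c = 1
        while c in used:                    # smallest available color
            c += 1
        color[v] = c
    return color

adj = {0: {1, 2, 4}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {0, 3}}   # 5-cycle with chord 0-2
print(greedy_coloring(adj))   # uses at most Δ(G) + 1 colors; 3 colors here
```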
Fig. 11.4 Maximal independent set-based vertex coloring of the same graph of Fig. 11.2; a, b, and
c displays the iterations of the algorithm, where a new color is assigned to each new independent
set as shown by different patterns; d shows the final coloring of the graph
There are only a few algorithms for coloring the vertices of a graph in parallel. We describe independent-set-based algorithms and the identifier-based algorithm for this purpose next.
We have seen how an independent set can be constructed in parallel using Luby's Monte Carlo method in Sect. 10.2.3. In that algorithm, vertices are assigned a random permutation of 1, . . . , n at each iteration and a vertex whose value is a local minimum enters the independent set; these values are used to break symmetries. We can simply run this algorithm repeatedly and color the vertices of each independent set found with a new color.
Jones–Plassmann Algorithm
In another and more recent approach, Jones and Plassmann presented an independent-set-based
parallel graph coloring algorithm (JP_Color) [13]. Their approach differs
from Luby's algorithm in that the random numbers are assigned only once at the
beginning of the algorithm and do not change. Also, colors are assigned to the
independent set vertices individually; that is, each vertex
in the set is colored with the minimum color that does not exist among its neighbors,
as shown in Algorithm 11.4. Each vertex v is initially assigned a random number which is
its weight w(v). If the weight of a vertex v is greater than all of the weights
assigned to its neighbors, then v is added to the independent set I, with ties broken
by unique vertex identifiers. This step is performed in parallel and the elements of
the set I are also colored in parallel with legal colors. Different from Algorithm 11.3,
the independent set I formed at each step need not be an MIS as in Luby's method, and
the vertices of I may be colored with different colors. Jones and Plassmann showed
that the expected runtime of this algorithm on bounded-degree graphs using the PRAM
model is O(log n / log log n) [13].
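As a rough illustration of the Jones–Plassmann idea, the following sequential Python sketch simulates the rounds; in each round the uncolored vertices whose weights are local maxima among their uncolored neighbors form an independent set and would be colored concurrently in a real parallel implementation. The dict-of-neighbor-sets representation and the names are assumptions of this sketch.

```python
import random

# A sequential simulation (a sketch, not the authors' code) of the Jones-Plassmann idea:
# each vertex gets one random weight; in every round, vertices that are local maxima
# among their uncolored neighbors are colored with the smallest legal color.

def jones_plassmann(graph, seed=0):
    rng = random.Random(seed)
    weight = {v: (rng.random(), v) for v in graph}    # ties broken by vertex identifier
    color = {}
    uncolored = set(graph)
    while uncolored:
        # independent set of current local maxima (could be colored in parallel)
        winners = [v for v in uncolored
                   if all(weight[v] > weight[u] for u in graph[v] if u in uncolored)]
        for v in winners:
            used = {color[u] for u in graph[v] if u in color}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        uncolored -= set(winners)
    return color
```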
The parallel Largest-Degree-First (PLDF) algorithm has a similar structure to
JP_Color with the difference that the assigned weights are the degrees of the
vertices and ties are broken by selecting a random number. A parallel vertex
coloring method using graph partitioning is described in [9]. An experimental study
reported in [1] compares the parallel independent set, Jones–Plassmann, and LDF
algorithms for parallel vertex coloring.
Proof The initial coloring of vertices is legal. We need to show that the newly formed
colors are also legal. Let us consider two cases.
• Case 1: The successor of vertex u, vertex v, chooses the same index. Since the bit values
at that index are different, they are assigned different colors.
• Case 2: When they choose different indexes, the resulting colors are different.
Definition 11.3 Starting with n, log* n is the number of times the logarithm to base
2 is applied until a number smaller than or equal to 2 is reached. That is, log* n =
min{i : log^(i) n ≤ 2}. In other words, ∀n ≤ 2, log* n = 1 and ∀n > 2, log* n =
1 + log*(log n). For example, log* 65536 = 1 + log* 16 = 2 + log* 4 = 3 + log* 2 = 4.
Proof Let n_j be the maximum number of bits used by the color c_v of a vertex v after
iteration j and let n_0 = log n be the number of bits used for the initial coloring of
nodes. We can state n_{j+1} ≤ ⌈log n_j⌉ + 1 ≤ log n_j + 2. We can continue to find
n_1 ≤ log n_0 + 2 and n_2 ≤ log(log n_0 + 2) + 2 ≤ log log n_0 + 3 when log n_0 ≥ 2.
We can see that for j = 1, 2, . . . with log^(j) n_0 ≥ 3, n_j ≤ log^(j) n_0 + 3. Hence, when
the number of iterations j = log* n_0, n_j ≤ 5. Since n_0 = log n, the number of
bits for c_v is at most 5. The number of bits after two more iterations is reduced to
3 and hence the number of possible colors becomes 8. Another iteration makes the
palette size 6, as the first part of the color has 3 possible values [2].
Note that this algorithm can only reduce the number of colors to 6. For example, let
us assume node u is the predecessor of vertex v with φ(v) = 101_B and φ(u) = 011_B;
the index is 001_B and node v will set its color to 011_B, which is the same as the color
value of u.
In a distributed setting, our aim is to have each node in the network assigned a legal
color. We will first present a simple color reduction algorithm using identifiers of
nodes as initial colors, then a synchronous algorithm that breaks symmetries
using the identifiers of nodes, and finally an algorithm to color the nodes of a tree in this
section.
Theorem 11.4 Algorithm 11.6 provides legal coloring of a network with Δ+1 colors
in n − Δ + 1 time using O(Δn) messages.
Fig. 11.5 Distributed coloring of a sample graph using Algorithm 11.6. Rounds 6, 7, and 8 provide
a legal coloring with 8 (Δ + 1) colors
Proof Since the initial coloring is legal and the color-changing nodes perform legal
coloring, that is, each selects a color not used by its neighbors, the final coloring is legal
and uses Δ + 1 colors for n > Δ + 1 in n − Δ + 1 rounds. The only messages
sent in a round are by the node i that has an identifier equal to the round number, and
therefore there are O(Δ) messages per round. The total number of messages exchanged
will then be O(Δn).
The operation of this algorithm is depicted in Fig. 11.6 for the same sample graph
of Fig. 11.5. This graph is colored with three legal colors in four rounds. We can
apply a different criterion, such as the degrees of vertices or a random number picked
between 0 and 1, to break symmetries instead of vertex identifiers, resulting in a very
similarly structured algorithm.
Theorem 11.5 Algorithm 11.7 correctly colors the nodes of a network in O(n) time
with at most Δ + 1 colors.
Proof Correctness is evident since the coloring rule is applied at each step. As in other
rank-based greedy distributed algorithms we have reviewed, the number of rounds
required may be as high as the number of vertices n, as in the case of a network where
neighbor identifiers are sorted. The maximum number of colors used will be Δ + 1 as
there will always be a free color in that range.
Fig. 11.6 Distributed coloring of a sample graph using Algorithm 11.7. In four rounds, three colors
are assigned to eight nodes
two colors suffice to color the whole tree. The steps of this algorithm for node i are
as follows (a sketch in code is given after the steps).
1. if i = root then
2.   ci ← 0
3.   send color(ci) to children
4. else receive color(c)
5.   ci ← 1 − c
6.   if i is not a leaf then
7.     send color(ci) to children
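A minimal sketch of these steps, assuming the tree is given as a dict mapping each node to its list of children (an assumption of this illustration), simulates the downward flow of color messages with a breadth-first traversal.

```python
from collections import deque

# A sketch of the tree 2-coloring above: the root takes color 0 and every node
# takes the complement of the color received from its parent.

def color_tree(children, root):
    color = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for child in children.get(v, []):
            color[child] = 1 - color[v]      # receive color(c) and set 1 - c
            queue.append(child)
    return color

# Example: print(color_tree({'r': ['a', 'b'], 'a': ['c']}, 'r'))
# -> {'r': 0, 'a': 1, 'b': 1, 'c': 0}
```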
In the edge coloring of a graph, edges are assigned colors such that adjacent edges,
that is, edges sharing an endpoint, have different colors; such a coloring is called a
proper edge coloring. The edge coloring problem is to find the minimum number of colors
needed to color the edges of a graph. In the
decision version of this problem, we try to find an answer to whether the edges of
a graph can be colored with at most k different colors. A graph is said to be k-edge
colorable if there is a coloring φ : E → C such that |C| = k. The edge chromatic
number or the chromatic index χ′(G) of a graph G is the minimum value of k such
that G is k-edge colorable. In other words, G is k-edge chromatic if χ′(G) = k.
Proof Let v be the vertex with the maximum degree in G. Every edge incident to
v must be colored with a different color, therefore χ′(G) must be at least equal to
Δ(G).
Strong lower and upper bounds exist for edge coloring of graphs. Vizing has
shown that for every nonempty simple graph G with no multiple edges and no loops
[19],
χ′(G) ≤ 1 + Δ(G)   (11.5)
Based on Eqs. 11.4 and 11.5, Δ(G) ≤ χ′(G) ≤ Δ(G) + 1,
which means for every nonempty simple graph G, either χ′(G) = Δ(G) or χ′(G) =
1 + Δ(G). The simple graphs G where χ′(G) = Δ(G) are called Class 1 graphs
and graphs which have χ′(G) = Δ(G) + 1 are called Class 2 graphs.
Remark 16 A star graph S_n has n − 1 edges. Since all of these edges are adjacent to
each other, χ′(S_n) = n − 1. Therefore, S_n is a Class 1 graph.
m = |E(G)| = ∑_{i=1}^{k} |E_i| ≤ k α′(G)   (11.8)
hence, χ′(G) ≥ m / α′(G).
The performance of this algorithm clearly depends on how finding the maximal
matching is performed. The operation of this algorithm is depicted in Fig. 11.9, where
the MM edges of a sample graph found at each step are colored with a new color.
The greedy algorithm for edge coloring of a graph can be sketched similarly to the
greedy vertex coloring algorithms we have seen. An uncolored edge e is picked randomly and colored
with the minimum legal color that does not conflict with the colors already assigned to the
edges adjacent to e. This edge is then removed from the graph and the
process is repeated until there are no uncolored edges left. Algorithm 11.9
displays the pseudocode for this algorithm.
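A hedged Python sketch of this greedy edge coloring follows; the edge-list representation and the names are assumptions of this illustration, not the book's Algorithm 11.9.

```python
# A sketch of greedy edge coloring: pick an uncolored edge and give it the smallest
# color not used by any edge sharing an endpoint with it.

def greedy_edge_coloring(edges):
    """edges: iterable of (u, v) pairs of an undirected simple graph."""
    color = {}                       # edge (as frozenset) -> color
    incident = {}                    # vertex -> set of colors on its incident edges
    for u, v in edges:
        used = incident.get(u, set()) | incident.get(v, set())
        c = 0
        while c in used:             # smallest legal color
            c += 1
        color[frozenset((u, v))] = c
        incident.setdefault(u, set()).add(c)
        incident.setdefault(v, set()).add(c)
    return color
```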
The colors required for this algorithm can be as high as 2Δ − 1, as shown in
Fig. 11.10, where two vertices with degree Δ are connected by an edge that we
want to color. The only available color in this case is 2Δ − 1.
Fig. 11.9 Operation of Algorithm 11.9 in a small sample graph. Selected edges at each iteration
are shown in bold with an assigned color number next to the edges. We have colored the edges of
this graph with Δ + 1 = 6 colors
Fig. 11.10 Two vertices with Δ degrees are colored with all available colors and the only remaining
color for the edge between them is 2Δ − 1
Fig. 11.11 Operation of Algorithm 11.9 in a small sample graph. Selected edges at each iteration
are shown in bold with assigned color number next to the edges. We could color this graph with Δ
colors
The edges of a bipartite graph can be colored with Δ colors as proved by König [14],
which means
χ′(G) = Δ(G)   (11.9)
Finding a perfect matching in a k-regular bipartite graph takes O(km) time (see
Chap. 9), hence Algorithm 11.10 provides edge coloring of a bipartite graph with a
maximum degree Δ in O(Δm) time. The execution of this simple algorithm in a
small bipartite graph is shown in Fig. 11.12.
Edges of the same color in a properly colored graph G form a matching of G, as we have
seen. We reviewed in Chap. 9 how to perform parallel matching
of a graph, and hence we can use these algorithms to perform parallel edge coloring.
Algorithm 11.11 displays the pseudocode of such an algorithm, which finds an MM,
removes its edges from the graph, and assigns a color to the edges of the maximal
matching at each step.
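The idea can be sketched sequentially as below; in a parallel setting the maximal matching step would be replaced by one of the parallel matching algorithms of Chap. 9. The simple greedy maximal matching used here is only a placeholder, and all names are assumptions of the sketch.

```python
# A sequential sketch of matching-based edge coloring: repeatedly extract a maximal
# matching from the remaining edges and give all of its edges a new color.

def maximal_matching(edges):
    matched, matching = set(), []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

def edge_color_by_matchings(edges):
    remaining = list(edges)
    color, c = {}, 0
    while remaining:
        mm = maximal_matching(remaining)
        for e in mm:
            color[frozenset(e)] = c              # one new color per matching
        remaining = [e for e in remaining if frozenset(e) not in color]
        c += 1
    return color
```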
Fig. 11.12 Coloring edges of a bipartite graph. The graph in a is made full-bipartite by adding
edges to obtain the graph in b. After three iterations of disjoint perfect matchings, the original graph
is edge colored with Δ = 4 colors as shown in g
Fig. 11.13 Edge coloring of K 5 . Disjoint maximal matchings are selected at each iteration with
each matching assigned a new color. The final colored graph shown in e has 5 distinct colors
In a network setting, each node should act independently to color the edges incident to
it. We need a way to break symmetries in a distributed environment, as is commonly
done. Let us define a heuristic in which the node with the maximum degree in its
neighborhood decides which color is assigned to its incident edge. Different from the
distributed vertex coloring problem, we need to consider the case when a node with a
lower degree receives requests from two maximum-degree neighbors. In such a case,
we will assume it selects the one with the larger degree. The messages to be used in the
algorithm we propose are as follows.
We assume each node is aware of its neighbors and hence its degree initially. The
variable list curr_neighs holds the identifiers of the currently active neighbors of a node
i such that ∀j ∈ curr_neighs, the edge (i, j) is not yet colored. The algorithm starts
by each node exchanging its degree with its neighbors. Thereafter, as long as there are
uncolored edges incident to a node i, it compares its degree with those of its current neighbors.
If it finds that it has the maximum degree among them, then node i broadcasts a propose
message to all of its active neighbors. A neighbor that receives more than one propose
message responds to the sender with the largest degree with an ack message, as shown
in a coarse sketch of this algorithm in Algorithm 11.12.
The operation of this algorithm in a small sample network is depicted in Fig. 11.15,
where nodes with the highest degrees in their neighborhoods decide to color edges
incident to them by proposing to their neighbors. The maximum number of colors used will be
2Δ(G) − 1, since there will always be a free color in this range to be proposed by the highest
degree nodes. Grable et al. proposed a synchronous randomized distributed edge
coloring algorithm in which each edge (u, v) picks a random color c from its palette
of (1 + ε) max{deg(u), deg(v)} colors in each round [10]. If color c is not in conflict
with any of the selected colors of neighboring edges, c is determined to be the color of the
edge (u, v). It is shown that this algorithm colors the edges of a graph with (1 + ε)Δ(G)
colors for any ε > 0 in O(log log n) rounds for graphs with sufficiently large degrees.
Vertex coloring is the process of coloring the vertices of a graph such that no two
adjacent vertices have the same color. It can form the basis of more complicated
graph algorithms and also has various practical applications such as assigning
channel frequencies in wireless networks. We reviewed theoretical aspects of vertex
coloring and then described sequential, parallel, and distributed algorithms for this
purpose. Coloring the vertices of a graph with its chromatic number of colors is NP-hard, but
coloring with Δ + 1 colors is straightforward.
A simple parallel algorithm makes use of the independent set property where no
two vertices in such a set are adjacent. We can, therefore, assign the same color to
the vertices of an independent set. We can find an independent set in parallel as was
shown in Chap. 10, hence, this algorithm can be used to assign colors in parallel.
There are various distributed vertex coloring algorithms. We described a basic rank-
based algorithm which is slow as its execution time in rounds is dependent on the
number of nodes in the network and another simple algorithm to color trees.
Edge coloring refers to assigning colors to the edges of a graph such that no
two edges with common endpoints receive the same color. We described a simple
sequential algorithm and parallel and distributed algorithms for edge coloring. Edge
coloring can make use of matching algorithms since edges of the same class in a
graph, that is edges with the same color, constitute a matching in that graph. This
way, we can perform edge coloring in parallel using a parallel maximal matching
algorithm as we reviewed. Distributed algorithms for edge coloring require careful
consideration as we need to check colors of edges incident to neighbors of a node v
when coloring an edge incident to v. Total graph coloring requires both the vertices
and edges of a graph to be colored.
Exercises
1. Color the vertices of the graph in Fig. 11.16 using the greedy algorithm that always
selects the uncolored vertex with the lowest index.
2. Partition the sample graph of Fig. 11.17 into all possible disjoint maximal inde-
pendent sets and color each set with a new color to find the vertex coloring of this
graph.
3. Find the vertex coloring of the graph shown in Fig. 11.18 using the greedy dis-
tributed method of Algorithm 11.6. Assign identifiers to nodes arbitrarily.
4. Color the edges of the bipartite graph shown in Fig. 11.19 by first forming a Δ-
regular bipartite graph and then finding disjoint maximal matchings of this graph
and coloring edges of each matching with a new color.
5. Use the disjoint maximal matching method to color the edges of the graph depicted
in Fig. 11.20.
6. Work out the edge coloring of the graph shown in Fig. 11.21 using the greedy
distributed method of Algorithm 11.12.
References
1. Allwright JR, Bordawekar R, Coddington PD, Dincer K, Martin CL (1995) A comparison of
parallel graph coloring algorithms. Technical report SCCS-666, Northeast Parallel Architecture
Center, Syracuse University
2. Barenboim L, Elkin M (2013) Distributed graph coloring. Monograph, Ben Gurion University
of the Negev
3. Brelaz D (1979) New methods to color the vertices of a graph. Commun ACM 22(4):251–256
4. Brooks RL (1941) On colouring the nodes of a network. Proc Camb Philos Soc Math Phys Sci
37:194–197
5. Cole R, Vishkin U (1986) Deterministic coin tossing with applications to optimal parallel list
ranking. Inf Control 70(1):32–53
6. Coleman TF, More JJ (1983) Estimation of sparse Jacobian matrices and graph coloring prob-
lems. SIAM J Numer Anal 20(1):187–209
7. Gandham S, Dawande M, Prakash R (2005) Link scheduling in sensor networks: distributed
edge coloring revisited. In: Proceedings of the 24th INFOCOM, vol 4, pp 2492–2501
8. Garey MR, Johnson DS (1979) Computers and intractability. W.H. Freeman, New York
9. Gebremedhin AH (1999) Parallel graph coloring. MS thesis, Department of Informatics Uni-
versity of Bergen Norway
10. Grable D, Panconesi A (1997) Nearly optimal distributed edge-coloring in O(log log n) rounds.
Random Struct Algorithms 10(3):385–405
11. Grimmet GR, McDiarmid CJH (1975) On coloring random graphs. Math Proc Camb Philos
Soc 77:313–324
12. Halldorsson MM (1991) Frugal methods for the independent set and graph coloring problems.
Ph.D. thesis, The State University of New Jersey, New Brunswick, New Jersey, October 1991
13. Jones MT, Plassmann PE (1993) A parallel graph coloring heuristic. SIAM J Sci Comput
14(3):654–669
14. König D (1931) Graphen und Matrizen. Math Lapok 38:116–119
15. Lovasz L (1975) Three short proofs in graph theory. J Comb Theory Ser B 19:269–271
16. Luby M (1986) A simple parallel algorithm for the maximal independent set problem. SIAM
J Comput 15(4):1036–1055
17. Matula DW, Marble G, Isaacson JD (1972) Graph coloring algorithms. Academic Press, New
York
References 365
18. Nishizeki T, Terada O, Leven D (1983) Algorithms for edge-coloring of graphs. Tohoku Uni-
versity, Electrical Communications Department, Technical report, TRECIS 83001
19. Vizing VG (1964) On an estimate of the chromatic class of a p-graph. Diskret Anal 3:25–30
(in Russian)
20. Welsh DJA, Powell MB (1967) An upper bound for the chromatic number of a graph and its
application to timetabling problems. Comput J 10:85–86
Part III
Advanced Topics
Algebraic and Dynamic Graph Algorithms
12
Abstract
Algebraic graph theory is the study of algebraic methods to solve graph problems.
We review algebraic solutions to the main graph problems in the first part of this
chapter. Many real-life networks are represented by dynamic graphs in which
new vertices/edges may be inserted and some vertices/edges may be deleted as
time progresses. We describe a few dynamic graph problems that can be solved by
dynamic graph algorithms, and finally we give a brief description of the methods
used in dynamic algebraic graph algorithms, which solve problems on dynamic graphs
using linear algebraic techniques.
12.1 Introduction
Algebraic graph theory is the study of algebraic methods to solve graph problems.
Linear algebra and group theory are the two areas of algebra most commonly referred to
when dealing with graphs. Algebraic graph algorithms using linear algebra commonly
make use of the matrices associated with a graph to solve various problems in
graphs. Using this approach to design graph algorithms for many of the problems
we have seen has a number of benefits. First of all, we can use various existing matrix
operations for this task, which results in simpler algorithms that can in general be converted
to executable code with ease. As another advantage, parallel matrix operations
and software environments for them are readily available, making parallel
implementation of these tasks simpler.
Our purpose in the first part of this chapter is to introduce this paradigm, and
give examples of solving some of the graph problems we have investigated in Part
II of this book. We start with a short review of matrices that are used to represent
graphs. We then review algebraic graph algorithms for some graph problems, which
include graph traversals, shortest paths from a single source, all-pairs shortest paths,
• The column sums of Q(G) are zero; therefore, the rows of Q(G) are linearly dependent.
• The rank of Q(G) is n − 1 for a connected graph G.
• The rank of Q(G) is n − k for a graph with k components.
Adjacency Matrix
We have used the adjacency matrix A(G) of a graph G in various algorithms. Let
us briefly review the basic algebraic properties of A(G) with regard to graphs. The entry
a_ij of A(G) is equal to 0 if vertices v_i and v_j are not adjacent and is 1 if they are
neighbors. An entry a_ii is 0 in A(G) and hence A(G) has zeros on its diagonal. We
observe the following properties of A(G).
Eigenvalues
Let us consider the equation Ax = λx where A is a non-singular square matrix, x is
a vector, and λ is a constant. When a vector is multiplied by a matrix, its direction
in general changes. Some vectors such as x are special since they do not change direction.
These vectors are called eigenvectors of the matrix A and the number λ is called an
eigenvalue of A. When A is the identity matrix, all vectors are eigenvectors of A
with all eigenvalues being 1. Let us rewrite the equation Ax = λx:

Ax − λx = 0   (12.1)
(A − λI)x = 0,

therefore, for a nonzero vector x,

det(A − λI) = 0.   (12.2)

The polynomial det(A − λI) is called the characteristic polynomial of A.
This provides us the method to find the eigenvalues and eigenvectors of a graph by
implementing the following steps.
Laplacian Matrix
The Laplacian matrix L(G) of an undirected unweighted graph G without multiple
edges is defined as
L(G) = D(G) − A(G),   (12.3)
where D is the degree matrix with entry d_ii equal to the degree of vertex i and all other
elements equal to 0, and A(G) is the adjacency matrix of G. The off-diagonal entries of L
are therefore −1 if vertex i is adjacent to vertex j and 0 otherwise. The Laplacian L(G) of a
graph G is related to its incidence matrix Q(G) as L(G) = Q(G)Q^T(G),
where Q^T(G) is the transpose of the incidence matrix. The normalized Laplacian 𝓛
of G is defined as 𝓛 = D^{−1/2} L(G) D^{−1/2}.
The set of all eigenvalues of the Laplacian matrix of a graph G is called the
Laplacian spectrum or just the spectrum of G. We will see shortly that the eigenvalues
of the Laplacian matrix provide vital information about the connectivity
of a graph.
Algebraic graph algorithms employ various operations using the three main matrices
associated with a graph: its adjacency matrix A, which is sparse in general; its
incidence matrix Q; and the graph Laplacian L. We provide algebraic algorithms for
sample graph problems in this section.
12.3.1 Connectivity
The second smallest eigenvalue of the Laplacian matrix of a graph G, called the
Fiedler value, provides information on how well G is connected. This value is greater
than 0 if and only if G is connected. Moreover, the larger this value is, the better connected
G is, and the number of 0s among the Laplacian eigenvalues of a graph G is the
number of connected components of G. The Fiedler value of a graph G, shown by
α(G), is called the algebraic connectivity of G and has been used in numerous applications
involving spectral graph theory and combinatorial optimization problems. Let
κ(G) denote the vertex connectivity of G. Fiedler showed that [9]
α(G) ≤ κ(G).
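A small numerical sketch of this connection, using numpy (a choice of this illustration, not part of the text), builds the Laplacian and reads off the number of components and the Fiedler value from its spectrum.

```python
import numpy as np

# A sketch: build L = D - A for an undirected graph and inspect its eigenvalues.
# The number of (near-)zero eigenvalues equals the number of connected components,
# and the second smallest eigenvalue is the Fiedler value (algebraic connectivity).

def laplacian(n, edges):
    L = np.zeros((n, n))
    for u, v in edges:                   # vertices numbered 0..n-1
        L[u, u] += 1
        L[v, v] += 1
        L[u, v] -= 1
        L[v, u] -= 1
    return L

def algebraic_connectivity(n, edges, tol=1e-9):
    eig = np.sort(np.linalg.eigvalsh(laplacian(n, edges)))
    components = int(np.sum(eig < tol))
    return eig[1], components            # Fiedler value, number of components

# Example: a path on 4 vertices is connected, so its Fiedler value is positive.
# print(algebraic_connectivity(4, [(0, 1), (1, 2), (2, 3)]))
```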
The Laplacian matrix L(G) of a graph G can also be used to enumerate the
spanning trees of the graph G according to the theorem below.
C =
[1 1 1 0 0 0 0 0]
[1 1 1 0 0 0 0 0]
[1 1 1 0 0 0 0 0]
[1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 0]
[1 1 1 1 1 1 1 1]

C^T =
[1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1]
[1 1 1 1 1 1 1 1]
[0 0 0 1 1 1 1 1]
[0 0 0 1 1 1 1 1]
[0 0 0 1 1 1 1 1]
[0 0 0 1 1 1 1 1]
[0 0 0 0 0 0 0 1]
Performing the logical and of these two matrices gives us the new matrix C′. Any
element i in this matrix is in the same SCC as an element j if and only if C′[i, j] = 1.
For this example of a directed graph, we can see that the SCCs are {1, 2, 3}, {4, 5, 6,
7}, and {8}.
C′ = C ∧ C^T =
[1 1 1 0 0 0 0 0]
[1 1 1 0 0 0 0 0]
[1 1 1 0 0 0 0 0]
[0 0 0 1 1 1 1 0]
[0 0 0 1 1 1 1 0]
[0 0 0 1 1 1 1 0]
[0 0 0 1 1 1 1 0]
[0 0 0 0 0 0 0 1]
The connectivity matrix C can be formed by successively multiplying the adjacency
matrix A of the directed graph to obtain A^{n−1}, since the longest path in a graph
can be no longer than n − 1 edges. For example, for a graph with 12 vertices, we need to
form C = A^{11}, which can be obtained by A^2 = A × A; A^3 = A^2 × A; A^4 = A^2 × A^2;
A^8 = A^4 × A^4; and A^{11} = A^8 × A^3, for a total of five matrix multiplications, using
logical and and logical or operations instead of scalar multiplication and addition. In general
we need O(log n) matrix multiplications, and since an n × n matrix multiplication requires
O(n^ω) operations, the complexity of this step is O(n^ω log n). Taking the transpose
of C and forming C ∧ C^T both take O(n^2) time, therefore the total time complexity is
O(n^ω log n) with ω < 2.376. Tarjan's and Kosaraju's SCC detection algorithms both
use DFS and hence have a better performance of O(n + m). However, parallelizing
DFS is difficult as discussed in Chap. 6, whereas matrix multiplication can be parallelized
simply by distributing the rows or columns of the matrices to a set of processes as we
saw in Chap. 4.
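A compact sketch of this algebraic SCC method is given below; it forms the reachability matrix by repeated boolean squaring of (A ∨ I), which covers all paths of length up to n − 1, and intersects it with its transpose. The use of numpy and the function name are assumptions of this illustration.

```python
import numpy as np

# A sketch of SCC detection via boolean matrix powers: C = reachability matrix,
# then i and j are in the same SCC exactly when C[i, j] and C[j, i] are both 1.

def strongly_connected_components(A):
    n = len(A)
    C = (np.array(A) + np.eye(n)) > 0           # reachability within at most 1 step
    steps = 1
    while steps < n - 1:                         # repeated squaring up to n - 1 steps
        M = C.astype(np.int64)
        C = (M @ M) > 0                          # boolean "multiplication"
        steps *= 2
    same = np.logical_and(C, C.T)                # mutual reachability
    seen, sccs = set(), []
    for i in range(n):
        if i not in seen:
            comp = {j for j in range(n) if same[i, j]}
            seen |= comp
            sccs.append(sorted(comp))
    return sccs
```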
In breadth-first search (BFS), we explored vertices that are k hops away from a
given source vertex before exploring the ones that are k + 1 hops away, as discussed
in Chap. 6. This resulted in shortest distance paths in an undirected unweighted
graph. Let us consider the sparse adjacency matrix A of an undirected unweighted
graph G = (V, E). We have a sparse vector X that marks the source vertex position;
for example, X[3] = 1 with all other elements 0 means vertex 3 is the source.
Multiplying A^T by X gives us the vector that has 1s at all neighbors of vertex 3.
Multiplying the product again by A^T provides the neighbors that are two hops away, and
so on. We can now sketch an algebraic algorithm using this property as shown in
Algorithm 12.4. We have the matrix A and vector X as input and we want to form the
n × n matrix N, which shows the vertices that are at distance i in its ith row. We
need to provide a simple modification since the result of the multiplication shows all
vertices that are at most i hops away.
Let us investigate how this algorithm works for the sample graph of Fig. 12.2. The
transpose of the adjacency matrix A is itself since graph G is undirected.
The matrices A and X formed for source vertex 7 in this graph and the resulting
neighbor matrix N [1, ∗] are as follows. We show the full matrix for comparison but
only its ith row shown in bold is modified in ith iteration.
A^T =
[0 1 0 0 0 1 1]
[1 0 1 1 1 1 0]
[0 1 0 0 0 0 0]
[0 1 0 0 0 0 0]
[0 1 0 0 0 1 0]
[1 1 0 0 1 0 1]
[1 0 0 0 0 1 0]

X^(1) = (0 0 0 0 0 0 1)^T, and A^T × X^(1) = (1 0 0 0 0 1 0)^T, which becomes the first
(bold) row of N^(1); all other rows of N^(1) are zero.
The second iteration of the for loop results in the following. Note that the result
of the multiplication is the vector (1 1 0 0 1 1 1) and we subtract the previously reached
vertices, the source vertex 7 and the vector (1 0 0 0 0 1 0), from this product to obtain
(0 1 0 0 1 0 0), which becomes the second row of N.
X^(2) = (1 0 0 0 0 1 0)^T and A^T × X^(2) = (1 1 0 0 1 1 1)^T; after removing the previously
reached vertices, the second row of N^(2) becomes (0 1 0 0 1 0 0) while its first row remains
(1 0 0 0 0 1 0).
The final value of N at third iteration is shown below. The first row has 1s at
immediate neighbors of vertex 7, the second row has 1s at two-hop neighbors and
the third row shows three-hop neighbors. Since the diameter of the graph is 3, we
can stop at the third iteration.
N^(3) =
[1 0 0 0 0 1 0]
[0 1 0 0 1 0 0]
[0 0 1 1 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
[0 0 0 0 0 0 0]
We need diam(G) iterations of the for loop and we also need to perform Θ(n^2)
multiplications at each iteration, resulting in Θ(n^2 diam(G)) time complexity for this
algorithm. We can immediately see that this algebraic approach can be parallelized
conveniently by 1-D partitioning of A^T and X and distributing these to a number of
processes.
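The algebraic BFS just described can be sketched as follows; it is a sequential illustration only, with numpy arrays standing in for the sparse matrix and vector of Algorithm 12.4, and the names are assumptions.

```python
import numpy as np

# A sketch of algebraic BFS: repeatedly multiply A^T by the frontier vector and remove
# vertices reached earlier; row i of N then holds the vertices exactly i + 1 hops away.

def algebraic_bfs(A, source):
    A = np.array(A)
    n = A.shape[0]
    x = np.zeros(n)
    x[source] = 1
    visited = x.copy()
    N = np.zeros((n, n))
    for i in range(n):
        x = (A.T @ x > 0).astype(float)      # vertices at most one hop from the frontier
        x = np.clip(x - visited, 0, 1)       # keep only newly reached vertices
        if not x.any():                      # stop once the diameter is reached
            break
        N[i] = x
        visited = np.clip(visited + x, 0, 1)
    return N
```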
We look at the algebraic versions of two main algorithms for shortest paths in this section:
the Bellman–Ford SSSP algorithm and the Floyd–Warshall APSP algorithm.
The Bellman–Ford algorithm relaxes the edges of the graph at each step until the shortest
distances are found. Relaxation of an edge (u, v) is stated as setting
d(v) = min{d(v), d(u) + w(u, v)}. An algebraic formulation of this algorithm uses the
sparse adjacency matrix A[n, n] and a vector D[n], which in the end holds the shortest
distance d(i) from s for all i ∈ V, as shown in Algorithm 12.2.
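A minimal sketch of this algebraic view follows: one "min-plus" product of the weight matrix with the distance vector relaxes every edge once, and n − 1 repetitions give the shortest distances. The matrix W and the function name are assumptions of this illustration.

```python
import math

# A sketch of algebraic Bellman-Ford: W[u][v] is the edge weight (math.inf if no edge),
# and each iteration performs one min-plus matrix-vector product, relaxing all edges.

def bellman_ford_minplus(W, source):
    n = len(W)
    D = [math.inf] * n
    D[source] = 0
    for _ in range(n - 1):
        D = [min(D[v], min(D[u] + W[u][v] for u in range(n)))
             for v in range(n)]
    return D
```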
Definition 12.1 (Tutte matrix) The Tutte matrix T(G) of an undirected simple graph
G = (V, E) is an n × n matrix with elements

T_ij = x_ij if (i, j) ∈ E and i < j,
T_ij = −x_ij if (i, j) ∈ E and i > j,
T_ij = 0 if (i, j) ∉ E,
where the x_ij are formal variables. Tutte proved an important relation between the Tutte
matrix and perfect matchings of a graph [24].
Theorem 12.1 (Tutte) Let G = (V, E) be an undirected simple graph with Tutte
matrix T. Then, G has a perfect matching if and only if det(T) is not identically zero.
The Tutte matrix consists of variables and its determinant is a polynomial in these variables.
This determinant should not be identically zero for the graph to have a
perfect matching. We could, therefore, form the Tutte matrix T, compute its determinant,
and check whether it is the zero polynomial. However, computing this determinant
symbolically may take exponential time. Lovász provided a randomized algorithm to test
the perfect matching condition of a graph by substituting for each variable of the Tutte
matrix T a value from a polynomially large set of integers and then checking whether
T is non-singular [14]. Lovász also showed that the rank of the Tutte matrix of a graph G
provides the size of the maximum matching of G [15], as shown by the following theorem.
Theorem 12.2 (Lovász) Let G = (V, E) be an undirected simple graph with Tutte
matrix T and let k be the size of the maximum matching of G. Then rank(T) = 2k.
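A hedged sketch of the randomized test follows: random integers are substituted for the variables x_ij and the resulting matrix is tested for non-singularity. The use of numpy's rank computation and the numeric tolerances are choices of this illustration, not the algorithm as published.

```python
import random
import numpy as np

# A sketch of the Lovasz-style randomized test: substitute random integers for the
# variables of the Tutte matrix and check non-singularity. A non-singular substitution
# certifies a perfect matching; a singular result is wrong only with small probability.

def probably_has_perfect_matching(n, edges, trials=3, field=1000):
    for _ in range(trials):
        T = np.zeros((n, n))
        for i, j in edges:                          # vertices numbered 0..n-1
            x = random.randint(1, field)
            T[min(i, j), max(i, j)] = x
            T[max(i, j), min(i, j)] = -x
        if np.linalg.matrix_rank(T) == n:           # det(T) != 0 for this substitution
            return True
    return False                                    # probably no perfect matching
```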
Rabin–Vazirani Algorithm
Once we know that the graph G has a perfect matching, we need an algorithm to
find this matching. Let us assume an edge (u, v) belongs to a perfect matching
in G. The subgraph obtained by removing (u, v) and all of its adjacent edges from G
also has a perfect matching. Hence, if we know how to find an edge e of a perfect matching,
we can recursively build a perfect matching of G. Rabin and Vazirani found that the
inverse T^{-1} of the Tutte matrix provides this information, as shown by the following
theorem.
Rabin and Vazirani showed that n/2 matrix inversions are sufficient, as each inversion
provides one edge of the matching. Matrix inversion, which takes O(n^ω) time, dominates
the time taken by the algorithm, and each trial to find the perfect matching
takes O(n^{ω+1}) time [19].
We have seen static graph algorithms that provide an output of some function on
the graph data structure up to now. However, graphs that represent many real-life
networks are not static and go through modifications in time. A dynamic graph G
may evolve with time due to changes in G such as insertion or removal of edges.
Dynamic graphs represent many real-life networks, for example, the Internet, protein
interaction networks, and social networks, in which such changes occur frequently.
A dynamic graph algorithm allows the following operations on dynamic graphs:
• query: We evaluate a certain property of the graph G. For example, "Is the graph
connected?"
• insert: An edge or an isolated vertex is inserted to the graph.
• delete: An edge or an isolated vertex is deleted from the graph.
The two latter operations are commonly called update procedures. We can perform
these operations using the static graph algorithms we have seen up to now from
scratch for the modified graph. However, the main goal of a dynamic graph algorithm
is to provide more efficient solutions for these operations than the static algorithms.
These algorithms are classified as follows:
• Fully dynamic: Insertions and deletion of edges and vertices are allowed.
• Incremental: Only insertions of edges and vertices are allowed.
• Decremental: Only deletion of edges and vertices are allowed.
The last two types of algorithms are named partial dynamic graph algorithms.
Queries are allowed in all of the algorithms described. Intuitively, answering a query
in a dynamic graph in general is simpler than performing an update operation.
Another distinction is between undirected and directed graphs. A dynamic graph
operation whether a query or an update is generally more difficult in a directed graph
than an undirected graph. The task of a dynamic graph algorithm remains to provide
a better performance than its static counterpart. We will see the design of clever data
structures is crucial when forming dynamic algorithms.
The fully dynamic connectivity algorithm in an undirected graph allows insertion
and deletion of edges, and enables queries such as "is the graph connected?" or "are vertices
u and v in the same component?”. In the fully dynamic minimum spanning tree
problem, we maintain a forest of minimum spanning trees when edges are inserted,
deleted, and weights of edges change. The main problems in directed graphs are
dynamic transitive closure and dynamic shortest paths. In the first problem, we keep
information to evaluate whether a vertex v is reachable from a vertex u when edges are
deleted and inserted. The shortest path problem involves providing and maintaining
information about shortest paths when edges are inserted and deleted in a dynamic
environment.
We start this section by first defining some methods to be used in designing efficient
dynamic graph algorithms for undirected and directed graphs and then provide a brief
survey of algorithms for two representative dynamic graph problems; connectivity
and matching.
12.4.1 Methods
The methods for undirected graphs and directed graphs differ significantly. We will
classify these methods as described in [7] for undirected and directed graphs.
G having common vertices in S with high probability. Using this property, we can
find a long path using short searches to design algorithms for transitive closure and
shortest paths [5,7].
12.4.2 Connectivity
In the dynamic connectivity problem, we need to test the connectivity of the graph
when there are queries and updates. Typical queries would be testing whether graph G
is connected (connected(G)) and are vertices u and v connected (connected(u, v)).
For the update problem, we would need to perform these queries when an edge
(u, v) is inserted or deleted from graph. We will consider two cases separately as
their implementations are very different; the incremental and decremental dynamic
connectivity.
Let us recall the union-find data structure we have reviewed in Chap. 7. This
structure maintains disjoint groups of data items, with each group having a representative.
It supports two operations: find(x) returns the representative of the set that
x belongs to, and union(x, y) merges the groups of x and y. We can check whether
two data items are in the same group by testing whether find returns the same representative
for both. We can also unite two groups by the union operation. The union-find data structure can be
implemented so that each operation takes amortized O(α(n)) time, where α(n) is the inverse of the
very fast-growing Ackermann function [22].
We can see that this data structure is adequate for a dynamic incremental connectivity
algorithm. We can perform a DFS in the graph and store each tree
of the forest as a group, with the root of the tree being its representative.
Each query can be realized by find(x) operations, which output the root of the tree
that contains x, and an insert(u, v) operation is realized by the union operation,
which merges the two trees if u and v are in different trees.
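A compact sketch of such an incremental connectivity structure, with path compression and union by size, is shown below; the class and method names are assumptions of this illustration.

```python
# A sketch of incremental dynamic connectivity with union-find: insert(u, v) merges
# components and connected(u, v) answers a query in near-constant amortized time.
# Edge deletions are not supported by this structure.

class IncrementalConnectivity:
    def __init__(self, n):
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]   # path halving
            x = self.parent[x]
        return x

    def insert(self, u, v):                                 # union of the two components
        ru, rv = self.find(u), self.find(v)
        if ru == rv:
            return
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        self.size[ru] += self.size[rv]

    def connected(self, u, v):
        return self.find(u) == self.find(v)
```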
Fig. 12.3 An Euler tour of a tree; the tour visits the vertices in the order
a b c b d b a e f e g e h i h j h e f e a. Visit times for the vertex h are shown
problem. An update time of O(√m) using clusters was presented in [10], which was
improved to O(√n) using the method of sparsification in [8]. A randomized algo-
rithm with amortized O(log2 n) expected time per operation was proposed in [11].
A deterministic algorithm with amortized O(log2 n) per operation was presented in
[12] and a randomized algorithm with expected O(log n(log log n)3 ) amortized time
per operation is described in [23]. We will take a closer look at a novel data structure
called Euler tour tree that can be used for dynamic connectivity problem.
Euler Tour Trees
The Euler tour tree (ETT) was presented in [11] to store information about dynamic
graphs. An Euler tour of a graph G is a path that traverses each edge of G exactly
once. A tree does not have an Euler tour; in order to realize an Euler tour of a tree
T, each edge is considered bidirectional, hence each edge is traversed twice, and
the tour starts and ends at the root vertex [11]. An Euler tour tree associates a
weighted or unweighted key with each vertex. An ETT is basically a balanced binary
tree of an Euler tour of the tree T. We can think of an Euler tour of a tree T as a
depth-first traversal of T. An Euler tour of a sample tree is shown in Fig. 12.3, where
the BST stores the vertices in the order of their visit times and each vertex in the tree
holds pointers to the vertices in the BST showing their first and last visit times.
The main idea of the connectivity algorithm based on ETTs is to store the Euler
tour of a tree instead of storing the tree. Edge insertions or deletions can be performed
by modifying Euler trees of the forest. Testing whether two vertices are connected
can be done by checking if these vertices are in the same ETT. The following main
operations are provided in an ETT:
• FindRoot(v): Finds the root of the tree that contains vertex v. Since the root is
visited as the first element and the last element of the tree, the minimum or the
maximum element of the tree is returned.
Previous ETT: a b c b d b a e f e g e h i h j h e f e a
New ETT: a b c b d b a e f e g e f e a
Fig. 12.4 Operation of the Cut(h) procedure on the sample graph of Fig. 12.3
• Cut(v): The subtree rooted at vertex v is cut from the tree that contains it. This
can be implemented by dividing the BST into three segments: the segment before
the first visit to v, the segment between the first and last visits to v, and the segment
after the last visit to v. The second segment is shown as the bold box in Fig. 12.4.
The first segment contains the Euler tour of the tree before reaching vertex v, the second
the Euler tour of the subtree rooted at v, and the third the Euler tour of the tree after v is
visited for the last time. We can now merge the first and third segments to perform the cut
operation. This procedure is illustrated in Fig. 12.4 for vertex h. Note that one occurrence of
vertex e has to be deleted from the BST.
• Link(u,v): The subtree rooted at vertex u is connected as a child of vertex v. We
divide the BST into two segments: the left segment S_l is from the beginning until
just before the last visit to v, and the second segment S_r is the rest of the BST. The
ETT of the forest is then ETT_F = {S_l ∪ v ∪ ETT_u ∪ S_r}, where ETT_u is the
ETT of the tree rooted at the vertex u.
There are various other procedures to modify the ETTs as described in [11]. In
order to answer the connected(u,v) query, the roots of the ETTs that contain these
vertices are found by the FindRoot procedure and checked whether they are the
same. Insertion and deletion are performed by the reconstruction of Euler tours with
changes in only O(log n) vertices of the balanced BST. Queries can be answered
in O(log n/ log log n) time and insertions take O(log2 n/ log log n) time using this
method [11].
We will take a closer look at a deterministic algorithm that works on the described
logic to update the maximal matching of a graph.
Neiman and Solomon Algorithm
Neiman and Solomon presented a deterministic algorithm to maintain a maximal matching
of a graph that runs in O(√m) update time with a 3/2-approximation [17]. The main
idea of this algorithm is to consider three cases when adding an edge (u, v) to the
graph G. If both endpoints of (u, v) are free, then (u, v) is added to the existing
matching. If both u and v are matched, then the matching is not changed. When one
endpoint of the edge is matched and the other is not, the neighbors of vertices u and v are
searched. When a matched edge (u, v) is deleted from the graph, the neighbors of
u and v are checked to see if an edge (u, w) or (v, y) or both can be added to the
matching. The algorithm works in rounds and the three invariants to be maintained
at the ith round of the algorithm are as follows.

1. The degree deg(v) of a free vertex v that can be matched is at all times at most
√(2(m_i + n)).
2. For a free vertex v, deg(v) ≤ √(2m_i). When a high-degree vertex u becomes free
and all of its neighbors are matched, a surrogate is searched for in place of u: u is
matched to a neighbor v whose current mate v′ has deg(v′) ≤ √(2m). Then u and
v are matched and the low-degree vertex v′ becomes free.
3. M is maximal.
Algorithm 12.6 displays the pseudocode of the insert procedure of this algorithm. We have
a procedure called Surrogate that is called when one endpoint of the edge
to be inserted is matched and the other is free. In this case, adding the edge (u, v)
may result in an augmenting path, which means the maximal matching M can be
enlarged.
For the deletion of an edge (u, v), two cases are considered. In the first case,
(u, v) ∉ M, and we simply remove the edge (u, v) from the graph without changing the
matching M. In the second case, (u, v) ∈ M, and the edge (u, v) is deleted from the
matching M. This may result in new augmenting paths of length at most 3 which
start either at u or at v. In this case, we check whether there is a free vertex w that is
a neighbor of vertex u or v, in which case the edge (u, w) or (v, w) is added to the matching M.
Furthermore, the degree of the vertex u under consideration is checked; the two cases are
deg(u) ≤ √(2m) and deg(u) > √(2m). In the first case, u may become free, but
a search for an augmenting path is carried out. When deg(u) > √(2m), u is not allowed
to be free since its degree is high or it has no free neighbors; in this case, the procedure
Surrogate is called to find a surrogate vertex that may become free instead of
vertex u.
A recent work on dynamic deterministic approximate maximum matching with a
worst-case update time of O(log^3 n) is presented in [3], and randomized 2-approximate
matching algorithms are reported in [18,21].
We will then survey dynamic algebraic graph algorithms for two main problems we
have been investigating in this chapter; connectivity and matching.
A dynamic matrix operation can be defined as the procedure that performs a matrix
function such as finding determinant or inverse of a matrix when a change such
as the contents of a row or column occurs. This procedure should implement the
required function without having to run the static counterpart from scratch, and
therefore should provide a better performance. The following matrix operations can
be performed dynamically as described in [20].
• determinant of a matrix
• adjoint of a matrix
• inverse of a matrix
• matrix rank
• characteristic polynomial of a matrix
• linear system of equations
12.5.2 Connectivity
of C shows the number of paths of length k or less between the vertices i and j.
If all entries of A^k are positive, then the graph is connected.
Finding the determinant of a matrix can be performed dynamically [20] and hence
we have a dynamic method to find the connectivity using the Laplacian matrix. Sim-
ilarly, matrix multiplication of dynamic matrices can be performed to result in a
dynamic connectivity method using the adjacency matrix. Maintaining connectiv-
ity in a distributed system consisting of many autonomous computing elements has
a number of applications. For example, providing connectivity in a mobile robot
network is needed for the coordination of the robots and maintaining positive defi-
niteness of positive entries of C is sufficient to provide connectivity in such a network
[25].
We described the Rabin–Vazirani algorithm [19] for perfect matching in Sect. 12.3.5.
The random adjacency matrix A(G) of the graph G, that is, the Tutte matrix T with random
values substituted for its variables, is created first in this algorithm. Its inverse A^{-1}(G) is
then computed and an allowed edge e is found; this edge and its endpoints are removed from G
and a new Tutte matrix T is created for the new graph. This loop continues until G becomes
empty. The time to compute the matrix T^{-1}(G) is O(n^ω), resulting in O(n^{ω+1}) time in total.
Mucha and Sankowski found that computing the inverse of the Tutte matrix in each
iteration is not necessary, since the Tutte matrix at the (r+1)th iteration, T_{r+1}, is T_r with the two
rows and columns corresponding to i and j deleted [16]. The following theorem was used
to form a relation between T_{r+1}^{-1} and T_r^{-1}.
Using this theorem, we can compute the inverse of a matrix dynamically after
removing a row and a column, without having to compute the inverse from scratch. We
consider the columns as variables and the rows as equations, and the described
procedure eliminates the first variable using the first equation. Mucha and Sankowski
provided an O(n^3) algorithm, shown in Algorithm 12.7, that finds the perfect matching
of a simple undirected graph based on the Rabin–Vazirani algorithm using the method
we have described [16].
We have reviewed first algebraic, then dynamic graph algorithms, and finally dynamic
algebraic graph algorithms in this chapter. A class of algebraic graph algorithms
relies heavily on matrices related to a graph and operations on them to solve a graph
problem. We can see the already available matrix library functions can be used
for such problems whether in sequential or parallel operations. The algebraic graph
algorithms provided solutions which may have worse performances than the classical
counterparts, but they are much easier to parallelize than the classical ones. Matrix
multiplication is frequently used to solve various graph problems as we have seen.
Since this basic matrix operation can be parallelized conveniently, we can deduce that
these problems can be parallelized more easily than the traditional graph algorithms
described until now. Dynamic graph algorithms provide more efficient solutions
than the static ones when there is a change in the structure of the graph. Designing
sophisticated data structures is crucial for dynamic graph algorithms since updates
and queries depend largely on the data structures used. We reviewed basic methods for
undirected graphs which include sparsification, randomization, and clustering; and
directed graphs with reachability trees, matrix data structures, locally shortest paths,
and long paths. We then reviewed algorithms for dynamic connectivity and dynamic
matching. Our last topic of review was the dynamic algebraic graph algorithms which
work on dynamic graphs using methods of linear algebra. We looked at two main
problems again; connectivity and matching. We saw how a simple modification to
Rabin–Vazirani algebraic matching algorithm using a basic method again from linear
algebra led to a more efficient solution.
These topics are relatively more recent areas of study than the static graph algo-
rithms we have seen until now and they have been the focus of many recent studies.
Our main goal in the analysis of these topics is to provide a general survey with exam-
ples to give some idea on the related concepts rather than being comprehensive. A
good review of algebraic graph algorithms is provided in [13]. A detailed survey of
dynamic graphs and dynamic graph algorithms is provided in [6,7] and algebraic
theory related to graphs is presented in [1]. A thorough analysis of dynamic matrix
operations for some graph problems is provided in [20].
Exercises
1. Find the Laplacian matrix for sample graph shown in Fig. 12.5 and work out the
eigenvalues and eigenvectors of this graph.
2. Work out the algebraic BFS algorithm in the sample graph depicted in Fig. 12.6
for source vertex 2. Show the contents of neighborhood matrix N at each iteration.
Describe how to run this algorithm in parallel using two processes p0 and p1 using
this graph as an example.
3. Discover the SCCs of the directed graph depicted in Fig. 12.7 using its connec-
tivity matrix and its transpose.
4. Form the union-find data structures as trees for the graph of Fig. 12.8. Show the
operation of f ind(c), connected(d, f ), and insert (e, k) using this data structure
on this graph.
5. Sketch a parallel version of the Rabin–Vazirani algorithm for perfect matching to
run using k processes p_0 , . . . , p_{k−1} on distributed memory computers. Write the
pseudocode for a process by showing the interprocess communication explicitly.
You can assume a master/slave or a fully distributed model of a parallel processing
system.
References
1. Bapat RB (2014) Graphs and matrices (Universitext), 2nd edn. Springer, Berlin (Chapters 3
and 4)
2. Baswana S, Gupta M, Sen S (2011) Fully dynamic maximal matching in O(log n) update time.
In: 52nd annual IEEE symposium on foundations of computer science FOCS 2011, pp 383–392
3. Bhattacharya S, Henzinger M, Nanongkai D (2017) Fully dynamic approximate maximum
matching and minimum vertex cover in O(log 3 n) worst case update time. In: 28th ACM SIAM
symposium on discrete algorithms (SODA17), pp 470–489
4. Demetrescu C, Italiano GF (2004) A new approach to dynamic all pairs shortest paths. J Assoc
Comput Mach (JACM) 51(6):968–992
5. Demetrescu C, Italiano GF (2006) Fully dynamic all pairs shortest paths with real edge weights.
J Comput Syst Sci 72(5):813–837
6. Demetrescu C, Finocchi I, Italiano GF (2004) Dynamic graphs. Handbook of data structures and
applications. Computer and information science series. Chapman and Hall/CRC, Boca Raton
(Sect. 36)
7. Demetrescu C, Finocchi I, Italiano GF (2013) Dynamic graph algorithms, 2nd edn. Handbook
of graph theory. Chapman and Hall/CRC, Boca Raton (Sect. 10-2)
8. Eppstein D, Galil Z, Italiano GF, Nissenzweig A (1997) Sparsification a technique for speeding
up dynamic graph algorithms. J Assoc Comput Mach 44:669–696
9. Fiedler M (1973) Algebraic connectivity of graphs. Czechoslovak Math J 23:298–305
10. Frederickson GN (1985) Data structures for on-line updating of minimum spanning trees, with
applications. SIAM J Comput 14(4):781–798
11. Henzinger MR, King V (1999) Randomized fully dynamic graph algorithms with polylogarith-
mic time per operation. J ACM 46(4):502–516
References 393
12. Holm J, de Lichtenberg K, Thorup M (2001) Poly-logarithmic deterministic fully dynamic algo-
rithms for connectivity, minimum spanning tree, 2-edge, and biconnectivity. J Assoc Comput
Mach 48(4):723–760
13. Kepner J, Gilbert J (eds) (2011) Graph algorithms in the language of linear algebra. SIAM
14. Lovász L (1979) On determinants, matchings, and random algorithms. In: Budach L (ed) Fun-
damentals of computing theory. Akademia-Verlag, Berlin
15. Lovász L, Plummer M (1986) Matching theory. Academic Press, Budapest
16. Mucha M, Sankowski P (2006) Maximum matchings in planar graphs via Gaussian elimination.
Algorithmica 45(1):3–20
17. Neiman O, Solomon S (2013) Simple deterministic algorithms for fully dynamic maximal
matching. In: Proceedings of the ACM symposium on theory of computing (STOC’13), pp
745–754
18. Onak K, Rubinfeld R (2010) Maintaining a large matching and a small vertex cover. In: Pro-
ceedings of the ACM symposium on theory of computing (STOC’10), pp 457–464
19. Rabin MO, Vazirani VV (1989) Maximum matchings in general graphs through randomization.
J Algorithms 10(4):557–567
20. Sankowski P (2005) Algebraic graph algorithms. Ph.D. thesis, Warsaw University, Faculty of
Mathematics, Information and Mechanics
21. Solomon S (2016) Fully dynamic maximal matching in constant update time. In: Proceedings
of FOCS, pp 325–334
22. Tarjan R (1975) Efficiency of a good but not linear set union algorithm. J ACM 22(2):215–225
23. Thorup M (2000) Near-optimal fully-dynamic graph connectivity. In: Proceedings of the thirty-
second annual ACM symposium on Theory of computing. ACM Press, pp 343–350
24. Tutte WT (1947) The factorization of linear graphs. J Lond Math Soc s1–22(2):107–111
25. Zavlanos MM, Pappas GJ (2005) Controlling connectivity of dynamic graphs. In: Proceedings
of the 44th IEEE conference on decision and control and European control conference, Seville,
Spain, December 2005, pp 6388–6393
Analysis of Large Graphs
13
Abstract
Analysis of large graphs requires the introduction of new parameters and methods
that are conceptually different from the ones used for relatively smaller graphs. We describe
new parameters and methods for the analysis of these graphs and also describe
various models to represent them in this chapter. Two widely used models for the
large graphs representing real networks are the small-world and scale-free models.
The former means that the average distance between any two nodes in such graphs
is small; in the latter, only a few nodes with high degrees exist, with the majority of the
nodes having low degrees.
13.1 Introduction
Large graphs consist of thousands of vertices and tens of thousands of edges. Analysis
of these graphs requires introduction of new parameters and methods conceptually
different than the ones we have reviewed up to this point. Global description of large
graphs is very difficult due to the large sizes involved. One way of tackling this
problem is to select a sample and representative subgraph of a given graph, analyze
it, and extend the results obtained to the whole graph. However, sample selection is
a problem on its own and reliability of extrapolating the analysis results is another
issue to be considered. Alternatively, and more commonly, we can analyze the local
properties of vertices and edges in these graphs and use the results obtained to have
some idea on the overall structure of the graph.
We start this chapter by defining some new parameters for large graph analysis.
Real large networks represented by large graphs have some interesting properties.
These networks, commonly called complex networks, exhibit small-world and scale-
free structures. The former means the distance between any two nodes in these net-
works is small compared to the number of nodes they have and the scale-free property
is depicted by the existences of few very high-degree nodes and many low-degree
nodes. These attributes are not found in random networks and we provide a brief
review of these real-life network models. Main types of complex networks are the
technological networks, biological networks, and social networks as we will analyze
in the next chapter. We describe the centrality concept, which assigns importance
values to vertices and edges based on their usage in a network, and we review
the main algorithms to assess centralities in networks. Clustering is the grouping of
similar objects using some similarity measure. Graph clustering is the process of de-
tecting dense regions of a given graph. We define parameters to assess the quality of
the output of any clustering method and review few basic algorithms to find clusters
and cluster-like structures in graphs in the last part of this chapter.
Analysis of large graphs is difficult due to the amount of data to represent them.
We can however analyze local properties in these graphs with the aim of deducing
their global properties. We need to define some new parameters for the assessment
of global properties of large graphs as described in the next sections.
The degree distribution of a graph shows the fraction of its vertices that have each degree value:

$P(k) = \frac{n_k}{n}, \qquad (13.1)$
where nk is the number of vertices with degree k and n is the number of vertices.
Plotting of P(k) against the degrees provides visual analysis of the distribution of
the vertex degrees of the graph. For a random graph, we expect the degree distrib-
ution to be binomial with peak around the average degree of the graph. For graphs
representing many real-life networks, we see rather interesting distributions that are radically different from the binomial. The degree distribution of a simple graph is depicted in Fig. 13.1.
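Since Eq. 13.1 only requires counting how many vertices have each degree, the degree distribution can be computed in a single pass over the adjacency lists. The following minimal Python sketch (the function name and the toy example are illustrative, not from the text) assumes the graph is stored as a dictionary of adjacency lists:

    from collections import Counter

    def degree_distribution(adj):
        """Return P(k) = n_k / n for a graph given as adjacency lists."""
        n = len(adj)
        counts = Counter(len(neighbors) for neighbors in adj.values())  # n_k for each k
        return {k: nk / n for k, nk in counts.items()}

    # Example: a path a-b-c has degree sequence 1, 2, 1, so P(1) = 2/3 and P(2) = 1/3
    print(degree_distribution({'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}))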
In an assortative network, a node has a high probability of being neighbor of a
node with similar degree. For example, nodes of a social network are persons and
this property is exhibited in such networks since a person with many friends has a
high chance of having another person with many friends as a friend, as in the case of
celebrities who know each other. In disassortative networks, a high-degree vertex is
commonly attached to a vertex with a low degree, as in the case of biological networks
such as the protein interaction networks.
The density of a graph G, ρ(G), is defined as the ratio of the number of its existing edges to the maximum possible number of edges that can exist, as follows:
$\rho(G) = \frac{2m}{n(n-1)} \qquad (13.2)$
In an undirected graph, the sum of the degrees is equal to 2m by the handshake theorem (see Chap. 2), so the average degree of a graph, deg(G), which is the sum of all degrees divided by the number of its vertices, equals 2m/n. Substitution in Eq. 13.2 yields
$\rho(G) = \frac{deg(G)}{n-1} \approx \frac{deg(G)}{n} \quad \text{when } n \text{ is large} \qquad (13.3)$
In a dense graph, ρ(G) is stable when n is increased to very large values and
ρ(G) approaches 0 with large values of n in sparse graphs. The average degree of
the graph in Fig. 13.1 is 38/13 = 2.9.
The clustering coefficient of a vertex v, CC(v), is defined as follows:

$CC(v) = \frac{2 n_v}{|N(v)|\,(|N(v)|-1)}, \qquad (13.4)$
where nv is the number of edges between the neighbors of vertex v. This parameter
basically shows how well the neighbors of a vertex are connected and hence their
tendency to form a clique. In a social network, for example, the clustering coefficient of a person evaluates how many of that person's friends are friends of each other. The average clustering coefficient of a graph G, CC(G), is the mean
value of all of the clustering coefficients of vertices as follows:
$CC(G) = \frac{1}{n} \sum_{v \in V} CC(v), \qquad (13.5)$
where n is the number of nodes that have a degree of two or more. For a vertex v with
a degree less than two, CC(v) is sometimes considered one or zero and in this case,
the denominator in the above equation can be taken as n. If this parameter is high in
a graph G, we can deduce that the vertices of G are well connected and therefore G
is a dense graph. Clustering coefficients of vertices of a sample graph are depicted
in Fig. 13.2.
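A direct way to evaluate Eqs. 13.4 and 13.5 is to count, for every vertex, the edges among its neighbors. The sketch below is a minimal illustration under the assumption that the graph is stored as a dictionary mapping each vertex to a set of neighbors; vertices of degree less than two are given a coefficient of zero, one of the two conventions mentioned above.

    def clustering_coefficient(adj, v):
        """CC(v) = 2 * n_v / (|N(v)| (|N(v)| - 1)) for an undirected simple graph."""
        neighbors = adj[v]
        d = len(neighbors)
        if d < 2:
            return 0.0                      # convention: CC of low-degree vertices is 0
        # count edges among the neighbors of v (each counted once)
        nv = sum(1 for u in neighbors for w in adj[u] if w in neighbors and u < w)
        return 2.0 * nv / (d * (d - 1))

    def average_clustering(adj):
        """CC(G): the mean clustering coefficient over all vertices."""
        return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)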
The transitivity T (G) of a graph G = (V, E) as proposed in [12] assesses how well
neighbors of the vertices of a graph are connected. Let a triangle subgraph of a graph
G be G t = (Vt , Et ) with Vt = {v1 , v2 , v3 } and Et = {(v1 , v2 ), (v2 , v3 ), (v1 , v3 )}. A
triplet is a three-vertex subgraph G r = (Vr , Er ) with Vr = {v1 , v2 , v3 } and Er =
{(v1 , v2 ), (v2 , v3 )} with v2 in the middle. Each triangle contains three triplets; the
transitivity of a graph can now be defined as follows:
$T(G) = \frac{3 \times \text{number of triangles}}{\text{number of connected triplets}} \qquad (13.6)$
A simple four-vertex graph is depicted in Fig. 13.3. The clustering coefficients are given next to the vertices. The graph clustering coefficient is the average of these values, yielding (2 + 4/3)/4 = 5/6. There are two triangles in this graph and eight triplets,
counting three triplets per triangle and including {d, a, b} and {a, b, c} triplets giving
a total of eight triplets. Hence, the transitivity of this graph is 3 × 2/8 = 0.75.
One way of assessing similarity between the two vertices of a graph is to find the
number of their common neighbors. Matching index defined below is used to quantify
this similarity.
Definition 13.3 (matching index) The matching index of two vertices u and v in
a graph G is defined as the ratio of the number of their common neighbors to the
number of the union of their neighbors.
In Fig. 13.2, the matching index of vertices b and f , mbf , is 0.33 since the union of
their neighbor set is N(b) ∪ N(f ) = {a, c, d, e, g, h} with a size of 6 and they have
two common neighbors, c and g. The vertices in a graph may be far apart especially
in the case of very large networks in which case common neighbors may not exist.
In such a case, we propose the k-hop matching index parameter which is basically
the ratio of common neighbors of two vertices in k-hop neighborhood to all of their
neighbors in such a neighborhood. We can evaluate this parameter sequentially, and
also in a distributed setting. The following distributed algorithm steps for a node
i of a distributed system can be employed for this purpose. The algorithm can be
implemented in k synchronous rounds under the control of a supervisor using the SSI model.
1. degs[1] ← deg(i)
2. for j = 1 to k
3.     send degs[j] to N(i)
4.     receive degs[j + 1] from N(i)
5. end for
6. comm ← common neighbors in degs[1..k]
7. all ← neighbors in degs[1..k]
8. mi ← |comm| / |all|
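A sequential version of the same idea can be sketched as follows; the function names are illustrative and not from the text. For each of the two vertices, all vertices reachable within k hops are collected by a breadth-first search, and the ratio of the common neighborhood to the union of the neighborhoods is returned.

    from collections import deque

    def k_hop_neighborhood(adj, v, k):
        """All vertices (excluding v itself) within k hops of v."""
        dist = {v: 0}
        q = deque([v])
        while q:
            u = q.popleft()
            if dist[u] == k:
                continue
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        return set(dist) - {v}

    def k_hop_matching_index(adj, u, v, k):
        nu, nv = k_hop_neighborhood(adj, u, k), k_hop_neighborhood(adj, v, k)
        union = nu | nv
        return len(nu & nv) / len(union) if union else 0.0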
A random network model assumes that the edges between nodes are inserted ran-
domly. In the basic random graph model, G(n, p), proposed by Erdos–Renyi, there
are n vertices and each edge is successively added between two vertices with prob-
ability p independently [7,8]. In order to generate a random network based on this G(n, p) model, each of the n(n − 1)/2 vertex pairs is considered in turn and an edge is added between the pair with probability p.
We will obtain a different random network with the same values of n and p for each generation. The degree distribution in random networks is binomial, centered around the average degree, and these networks also exhibit a small average clustering coefficient and a low diameter with respect to the number of nodes.
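A minimal Python sketch of this generation procedure is given below; the function name is illustrative, and vertices are assumed to be labeled 0..n−1.

    import random

    def gnp_random_graph(n, p, seed=None):
        """Generate an Erdos-Renyi G(n, p) graph as a dictionary of adjacency sets."""
        rng = random.Random(seed)
        adj = {v: set() for v in range(n)}
        for u in range(n):
            for v in range(u + 1, n):       # each unordered pair is considered once
                if rng.random() < p:        # edge (u, v) is added with probability p
                    adj[u].add(v)
                    adj[v].add(u)
        return adj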
A small-world network has a small diameter compared to its size, which means reaching any node from any other node in such a network takes only a small number of hops. This property is observed in many real large networks such as
social networks and biological networks. This is a useful property in large networks
as fast communication between any two nodes is possible. The small-world property
is characterized by a low value of average path length l defined as the mean of
distances between all pairs of n nodes in a graph G = (V, E) as below:
$l = \frac{1}{n(n-1)} \sum_{u,v \in V,\; u \neq v} d(u,v) \qquad (13.7)$
1. Growth: A new vertex v is generated and connected to one of the existing vertices
with the following rule.
2. Preferential attachment: Vertex v is attached to one of the existing vertices, say
u, with a probability related to the degree of vertex u.
Starting with a small number of vertices, a scale-free graph is obtained when these two steps are repeated a sufficient number of times, as proven by Barabasi and Albert [1].
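A minimal sketch of this growth process is shown below, assuming each new vertex attaches to m existing vertices chosen with probability proportional to their current degrees; m and the function name are illustrative parameters, not from the text, and m ≥ 2 is assumed.

    import random

    def barabasi_albert(n, m, seed=None):
        """Grow a scale-free graph by preferential attachment, starting from a clique of m vertices."""
        rng = random.Random(seed)
        adj = {v: set(u for u in range(m) if u != v) for v in range(m)}   # initial seed graph
        ends = [u for v in adj for u in adj[v]]       # a vertex appears once per incident edge end
        for v in range(m, n):
            adj[v] = set()
            targets = set()
            while len(targets) < m:                   # sample existing vertices by degree
                targets.add(rng.choice(ends))
            for u in targets:
                adj[v].add(u); adj[u].add(v)
                ends.extend((u, v))                   # update the degree-weighted list
        return adj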
Let us define the clustering function C(k) of a graph G as the average clustering
coefficient of the nodes with degree k. In various biological networks, $C(k) \approx k^{-1}$,
showing the clustering coefficient parameter is higher in lower degree nodes than
the higher degree nodes. This basically means higher degree nodes have neighbors
that are less connected than neighbors of lower degree nodes. This new model was
named hierarchical networks in which low-degree nodes are densely clustered and
these regions are connected by high-degree nodes [6,15].
13.4 Centrality
Given the adjacency matrix A of a graph G = (V, E), we can form the degree centrality vector DC, which has the centrality value DC(i) for vertex i, as follows:

$DC = A \times \mathbf{1} \qquad (13.8)$

where $\mathbf{1}$ is the n-dimensional all-ones vector.
In a distributed setting, we can have a root process pi gathering all of the degree
values of nodes over a previously constructed spanning tree T using the convergecast
operation with the SSI model. This process can then compute the average graph
degree deg(G) and the centrality vector DC. Upon reception of a start message from
the root over the tree T , the role of nodes is to simply convergecast their degrees over
the edges of T to the root.
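Equation 13.8 is simply a matrix–vector product with the all-ones vector, that is, a row sum of the adjacency matrix. A minimal sketch (the example matrix is illustrative):

    def degree_centrality(A):
        """DC = A x 1: the i-th entry is the degree of vertex i (A is an n x n 0/1 matrix)."""
        return [sum(row) for row in A]

    # Example: a triangle on three vertices has DC = [2, 2, 2]
    A = [[0, 1, 1],
         [1, 0, 1],
         [1, 1, 0]]
    print(degree_centrality(A))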
The closeness centrality of a vertex v, CC(v), is the reciprocal of the sum of its distances to all other vertices:

$CC(v) = \frac{1}{\sum_{u \in V} d(u,v)} \qquad (13.9)$
We sum the distance of every vertex to vertex v and take the reciprocal of this
value. A possible way to evaluate the closeness centralities of all vertices in a graph
is then to compute APSP routes in a graph using a modified version of Dijkstra’s
algorithm we saw in Chap. 7 for this purpose. We need to add the calculation of
the sums of distances while assigning a vertex to the set of decided routes (See
Exercise 4). In a parallel or distributed setting, we can use the algorithms described in
Chap. 7 with the modification described (See Exercise 5). If the graph is unweighted,
the BFS algorithm can be used with a similar modification. Figure 13.4 depicts a
sample undirected weighted graph from which the closeness centrality values can
be computed using the shortest paths.
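For an unweighted graph, the modification mentioned above reduces to a BFS from each vertex that accumulates the distances. A minimal sketch, assuming a connected graph stored as adjacency lists (for weighted graphs, the BFS would be replaced by Dijkstra's algorithm as described above):

    from collections import deque

    def closeness_centrality(adj, v):
        """CC(v) = 1 / (sum of distances from v to all other vertices), unweighted connected graph."""
        dist = {v: 0}
        q = deque([v])
        while q:
            u = q.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    q.append(w)
        return 1.0 / sum(dist.values())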
The betweenness centrality of a vertex v, BC(v), is defined as follows:

$BC(v) = \sum_{s \neq t \neq v} \frac{\sigma_{st}(v)}{\sigma_{st}}, \qquad (13.10)$
with σst (v) as the total number of shortest paths between vertices s and t that run
through vertex v, and σst as the total number of shortest paths between vertices s and
t. For an unweighted graph, we can simply run the BFS algorithm for every vertex,
count the number of shortest path through each vertex, and divide this number by
the total number of shortest paths in the graph. For a weighted graph, we need to run
an APSP algorithm. The vertex betweenness values of a sample graph are depicted in
Fig. 13.5.
We can count the number of shortest paths through each vertex excluding the
starting and ending vertices to find 3, 0, 7, 0, 0, and 9 for vertices a, b, c, d, e, and f, respectively. There are a total of 15 shortest paths between all vertex pairs and
the vertex betweenness values for these vertices are then 0.2, 0, 0.47, 0, 0, and 0.6,
respectively, in lexicographical order. We can conclude vertex f is the most influential
vertex since the largest number of paths pass through it which can in fact be detected
visually.
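For unweighted graphs, the BFS-based counting described above is usually organized as Brandes' dependency-accumulation formulation, which computes the same values without enumerating the paths explicitly. The sketch below returns unnormalized vertex betweenness values; they can be divided by the total number of shortest paths, as in the example above, if normalized values are required.

    from collections import deque

    def betweenness(adj):
        """Brandes' algorithm for vertex betweenness in an undirected, unweighted graph."""
        bc = {v: 0.0 for v in adj}
        for s in adj:
            sigma = {v: 0 for v in adj}; sigma[s] = 1        # number of shortest s-v paths
            dist = {v: -1 for v in adj}; dist[s] = 0
            pred = {v: [] for v in adj}
            order = []                                       # vertices in non-decreasing distance
            q = deque([s])
            while q:
                v = q.popleft()
                order.append(v)
                for w in adj[v]:
                    if dist[w] < 0:                          # w discovered for the first time
                        dist[w] = dist[v] + 1
                        q.append(w)
                    if dist[w] == dist[v] + 1:               # edge (v, w) lies on a shortest path
                        sigma[w] += sigma[v]
                        pred[w].append(v)
            delta = {v: 0.0 for v in adj}                    # dependency accumulation
            for w in reversed(order):
                for v in pred[w]:
                    delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
                if w != s:
                    bc[w] += delta[w]
        # each undirected shortest path is counted from both endpoints as sources
        return {v: b / 2 for v, b in bc.items()}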
Similarly, the betweenness centrality of an edge e, BC(e), is defined as follows:

$BC(e) = \sum_{s \neq t} \frac{\sigma_{st}(e)}{\sigma_{st}} \qquad (13.11)$

where $\sigma_{st}(e)$ is the number of shortest paths between vertices s and t that pass through edge e.
We will now describe an algorithm due to Newman and Girvan [11] to compute
edge betweenness centrality values of edges in a graph. This algorithm has two parts:
a vertex weight assignment phase and an edge weight assignment phase. Vertices are labeled
with the number of shortest paths that go through them first and then this information
is used to find edge weights to yield edge centrality values later. The first part of the
algorithm is depicted in Algorithm 13.1 which works for a source node s that has a
distance of 0 and a weight of 1 initially [6]. The vertex weight assignment procedure
then iteratively assigns distance values to the vertices as in a BFS algorithm with
additional vertex weight values. However, if a vertex u has been visited before and has a distance one more than the distance of its ancestor v, its weight is made equal to the sum of its previous weight and the weight of its ancestor. This is needed since an
alternative shortest path to the source vertex s is discovered through the ancestor
vertex v. We need to repeat this procedure for all vertices with each one as the source
vertex in an iteration and the vertex weight of a vertex v is the sum of all of the values
assigned to it at each run of the procedure.
The second procedure of the algorithm uses the vertex weights assigned to denote
edge weights. It starts by assigning weights to edges that end up in leaf vertices as
the ratio of the vertex weight of its ancestor to the weight of itself. We are basically
denoting weights that show the percentage of shortest paths to the leaf edges. Then,
as we move upward in the tree toward the source vertex s, each edge (u, v) is assigned
a weight that is the sum of all edge weights below (u, v) multiplied by the ratio of
the weight of vertex u to the weight of vertex v. Again the sum of all shortest paths
through edge (u, v) is scaled to give the percentage of shortest paths through that
edge as shown in Algorithm 13.2. This process has to be repeated for all vertices
considering each of them as the source and the edge betweenness value of an edge is
determined as the sum of all values found for that edge. Total time needed is O(nm)
considering n vertices. The edge betweenness values of the edges of a sample graph
for a single source vertex are depicted in Fig. 13.6.
The eigenvalue centrality $x_i$ of a vertex i is defined in terms of the centralities of its neighbors as

$x_i = \frac{1}{\lambda} \sum_{j \in N(i)} x_j = \frac{1}{\lambda} \sum_{j \in V} a_{ij} x_j, \qquad (13.12)$
where N(i) is the set of neighbors of node i and aij is the ij-th entry of the adjacency
matrix A of the graph G = (V, E), and λ is a constant. We can rewrite this equation
in matrix notation as follows:
Ax = λx, (13.13)
which turns out to be the eigenvalue equation of the matrix A. There will be n
eigenvalues and corresponding eigenvectors associated with the adjacency matrix
A. The eigenvalue centrality values of vertices are determined by the eigenvector
corresponding to the largest eigenvalue of A, as shown by the Perron–Frobenius theorem
[14]. We can find the eigenvalue centralities of the vertices in a graph G = (V, E) by computing the principal eigenvector of A, for example with the power iteration sketched below:
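The following minimal Python sketch applies power iteration to the adjacency matrix, assuming a connected graph; the parameters and function name are illustrative. Starting from a uniform vector, the vector is repeatedly multiplied by A and normalized until the values stop changing.

    def eigenvector_centrality(A, iterations=100, tol=1e-6):
        """Power iteration for the principal eigenvector of the adjacency matrix A."""
        n = len(A)
        x = [1.0 / n] * n
        for _ in range(iterations):
            y = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]   # y = A x
            norm = sum(abs(v) for v in y) or 1.0
            y = [v / norm for v in y]
            if max(abs(y[i] - x[i]) for i in range(n)) < tol:               # converged
                return y
            x = y
        return x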
We will now look at ways of finding dense subgraphs of a given graph. These sub-
graphs, often termed clusters, indicate a region of dense activity in the network
represented by the graph. In the extreme case, a clique is a subgraph that is fully
connected. However, clique-like or cliquish structures are more commonly encoun-
tered in practice. We will review methods to find cliques and these structures in this
section.
13.5.1 Cliques
A subgraph in which every vertex is a neighbor to all others in this subgraph is called a
clique. A clique of a graph G is a densely connected region in G which may indicate
a special kind of activity in the network that is represented by G, for example, a
clique of friends in a social network. Therefore, detecting cliques is a commonly
required task in such graphs representing real-life phenomena. In real-life networks such as protein interaction networks of the cell and social networks, one more often finds clique-like structures than full cliques, and hence we present algorithms to discover such structures in graphs.
A maximal clique of a graph G is a clique that is not a subset of a larger clique. The maximum clique of a graph G is the clique of G with the largest order among all cliques of G, as shown in Fig. 13.7. The order of the maximum clique of G is denoted by ω(G) and is called the clique number of G. Finding the maximum clique in a graph is called the maximum clique problem and is NP-hard [9].
Fig. 13.7 Cliques of a sample graph; A = {a, b, k} is a clique but not maximal as it is included in the larger clique B = {a, b, k, i, j}, which is also the maximum clique of the graph. C = {c, d, f , g} and D = {d, e, f } are maximal cliques as they are not subsets of larger cliques
Every k-club of a graph is also its k-clique. However, not every k-clique of a graph
is its k-club since a k-clique may contain vertices outside of $G_c$. The maximum k-club
problem is to find the largest k-club of a graph and is NP-hard.
The pseudocode for this algorithm is shown in Algorithm 13.3 [6]. For R to be a maximal clique, we need the set P to be empty so that R cannot be extended, and X to be empty to ensure that the clique is not included in another clique. This condition is
checked at each recursive call to the procedure first. Otherwise, a recursive call is
made that adds a vertex in P to R and its neighbors in P and X. Experimentally,
the time complexity was found to be $O(4^n)$ and a second version using pivoting resulted in a time complexity of $O(3.14^n)$ [5]. Various parallel implementations of this algorithm exist: using MPI in [10], using thread pools in Java in [3], and on a Cray XT supercomputer in [17].
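A compact recursive Python sketch of the basic (non-pivoting) procedure is given below; R is the clique being grown, P the candidate vertices, and X the already-processed vertices, as in the description above, and the graph is assumed to be stored as a dictionary of adjacency sets.

    def bron_kerbosch(R, P, X, adj, cliques):
        """Report every maximal clique of the graph given by adjacency sets adj."""
        if not P and not X:
            cliques.append(set(R))     # R cannot be extended and is not contained in another clique
            return
        for v in list(P):
            bron_kerbosch(R | {v}, P & adj[v], X & adj[v], adj, cliques)
            P.remove(v)
            X.add(v)

    # Usage: all maximal cliques of a graph stored as a dict of adjacency sets
    # cliques = []; bron_kerbosch(set(), set(adj), set(), adj, cliques)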
Fig. 13.9 Cores of a sample graph with two components. The 1-, 2-, 3-, and 4-cores are encircled and the core values of vertices are shown with filled colors
13.5.2 k-cores
Let us see the step-by-step operation of this algorithm in the sample graph of
Fig. 13.10. The contents of the sorted queue along with the removed and labeled
vertices are shown in Table 13.1 for running the algorithm in this simple graph.
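Since the text of the core decomposition algorithm itself is not reproduced above, the following sketch illustrates the standard peeling procedure in the spirit of [2]: the vertex of minimum remaining degree is repeatedly removed and its core value recorded. The bucket-based implementation of [2] achieves O(m) time; the simple version below is quadratic but easier to follow.

    def core_numbers(adj):
        """Core value of every vertex of an undirected graph given as adjacency sets."""
        deg = {v: len(adj[v]) for v in adj}
        core = {}
        remaining = set(adj)
        k = 0
        while remaining:
            v = min(remaining, key=lambda u: deg[u])   # vertex of minimum remaining degree
            k = max(k, deg[v])                         # current core level
            core[v] = k
            remaining.remove(v)
            for w in adj[v]:
                if w in remaining:
                    deg[w] -= 1
        return core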
13.5.3 Clustering
Clustering is the process of grouping of similar objects based on some metric. When
the nodes of a graph are used to represent objects and edges depict their relations,
this process is equivalent to finding dense subgraphs of the graph representing the
network. In this case, the aim of a clustering algorithm is to divide the graph into
subgraphs such that density within a subgraph is maximized with minimum number
of edges among clusters. When edges are weighted, we need to maximize the total weight of edges within a cluster and minimize the total weight of edges among clusters.
We can have a vertex belonging to more than one cluster in graph clustering and
a vertex may belong to only one cluster in graph partitioning. We need to assess the
quality of the clusters obtained after using a clustering method. A convenient way
to achieve this is to compare the densities of clusters and the average density of the
graph. The density ρ(G) of an unweighted, undirected simple graph G is the ratio of the number of its edges to the maximum possible number of edges in G, that is, $\rho(G) = \frac{2m}{n(n-1)}$.
The edges inside a cluster are called internal edges of the cluster and the edges
connecting the cluster vertices to the other vertices of the graph are called external
edges. We can examine whether a vertex v is appropriately placed in a cluster by
examining the ratio of the number of internal edges incident to v to the number of
external edges incident to it. We can now define the intracluster density of a cluster $C_i$ as the ratio of the number of its internal edges to the maximum possible number of edges in $C_i$ as follows [16]:

$deg_{int}(C_i) = \frac{2\,|E_{int}(C_i)|}{|C_i|\,(|C_i|-1)} = \frac{\sum_{v \in C_i} deg_{int}(v)}{|C_i|\,(|C_i|-1)}, \qquad (13.14)$

where $E_{int}(C_i)$ is the set of internal edges of $C_i$ and $deg_{int}(v)$ is the number of internal edges incident to v.
The intracluster density of the whole graph is defined as the average of all intracluster densities as follows [6]:

$deg_{int}(G) = \frac{1}{k} \sum_{i=1}^{k} deg_{int}(C_i), \qquad (13.15)$
where k is the number of clusters obtained. Clearly, we would want $deg_{int}(G)$ to be as high as possible compared with the graph density for proper clustering. A sample
graph divided into three clusters is depicted in Fig. 13.11 with calculated intracluster
densities.
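A small sketch evaluating Eqs. 13.14 and 13.15 for a given division into clusters is shown below; the graph is assumed to be stored as adjacency sets and the clusters as a list of vertex sets, and the function names are illustrative.

    def intracluster_density(adj, cluster):
        """Ratio of internal edges of the cluster to the maximum possible number of edges."""
        c = len(cluster)
        if c < 2:
            return 0.0
        internal = sum(1 for v in cluster for u in adj[v] if u in cluster) // 2
        return 2.0 * internal / (c * (c - 1))

    def graph_intracluster_density(adj, clusters):
        """Average intracluster density over all k clusters (Eq. 13.15)."""
        return sum(intracluster_density(adj, c) for c in clusters) / len(clusters)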
We can now define the intercluster density $deg_{ext}(G)$ as the ratio of the number of intercluster edges to the maximum possible number of edges between the clusters [16]; we need this parameter to be significantly lower than the graph density for a good quality clustering.
The intercluster density of the sample graph in Fig. 13.11 is 0.092, which is significantly lower than the graph density of 0.28, and hence we can consider the clustering in this graph favorable. Similar cluster parameters for edge-weighted graphs can
be defined. Let us first consider the density of an edge-weighted graph which is the
ratio of the sum of edge weights to the maximum possible number of edges as below:
$\rho(G(V,E,w)) = \frac{2 \sum_{(u,v) \in E} w(u,v)}{n(n-1)} \qquad (13.17)$
The intracluster densities now are formed as the ratio of the sum of edge weights
in a cluster to the maximum possible number of edges in that cluster and the graph
intracluster density is the average value of all the clusters contained in the graph.
In both unweighted and weighted graphs, a good clustering should result in average
graph intracluster values which are significantly higher than the graph densities. The
intercluster density in the case of a weighted graph is the ratio of the sum of weights
of all edges between each pair of clusters to the maximum possible number of edges
between clusters.
Analysis of large graphs as a whole is difficult due to their huge sizes. As one
approach to overcome this problem to some extent, local properties around vertices
can be assessed and global properties may then be approximated using these local
properties. For example, the degree of a vertex and its clustering coefficient are
local properties; degree distribution and the average clustering coefficient are global
properties of a graph that give some insight on its overall structure. We reviewed
these parameters along with the matching index of two vertices and the density of a
graph in the first part of this chapter.
References
1. Barabasi A-L, Albert R (1999) Emergence of scaling in random networks. Science 286:509–512
2. Batagelj V, Zaversnik M (2003) An O(m) algorithm for cores decomposition of networks. CoRR
(Computing research repository), arXiv:0310049
3. Blaar H, Karnstedt M, Lange T, Winter R (2005) Possibilities to solve the clique problem by
thread parallelism using task pools. In: Proceedings of the 19th IEEE international parallel and
distributed processing symposium (IPDPS05)Workshop 5 Volume 06 in Germany
4. Boccaletti S, Latorab V, Morenod Y, Chavez M, Hwang D-U (2006) Complex networks: struc-
ture and dynamics. Phys Rep 424:175–308
5. Bron C, Kerbosch J (1973) Algorithm 457: finding all cliques of an undirected graph. Commun
ACM 16(9):575–577
6. Erciyes K (2015) Distributed and sequential algorithms for bioinformatics. Springer, Berlin
(chapters 10–11)
7. Erdos P, Renyi A (1959) On random graphs. Publicationes Mathematicae 6:290–297
8. Erdos P, Renyi A (1960) On the evolution of random graphs. Publ Math Inst Hung Acad Sci
5:17–61
9. Garey MR, Johnson DS (1979) Computers and intractability: a guide to the theory of NP-
completeness. W H Freeman and company, New York
10. Jaber K, Rashid NA, Abdullah R (2009) The parallel maximal cliques algorithm for protein sequence clustering. Am J Appl Sci 6:1368–1372
11. Newman MEJ, Girvan M (2004) Finding and evaluating community structure in networks. Phys
Rev E 69:026113
12. Newman MEJ, Strogatz SH, Watts DJ (2002) Random graph models of social networks. Proc Natl Acad Sci USA 99:2566–2572
13. Özgür A, Vu T, Erkan G, Radev DR (2008) Identifying gene-disease associations using centrality on a literature mined gene-interaction network. Bioinformatics 24(13):277–285
14. Perron O (1907) Zur Theorie der Matrices. Mathematische Annalen 64(2):248–263
15. Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabási AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297:1551–1555
16. Schaeffer SE (2007) Graph clustering. Comput Sci Rev 1:27–64
17. Schmidt MC, Samatova NF, Thomas K, Park B-H (2009) A scalable, parallel algorithm for maximal clique enumeration. J Parallel Distrib Comput 69:417–428
18. Watts DJ, Strogatz SH (1998) Collective dynamics of small-world networks. Nature 393:440–442
Complex Networks
14
Abstract
Complex networks consist of tens of thousands of nodes and hundreds of thou-
sands of edges connecting these nodes. The graphs used to model these networks
are large and special methods are commonly needed for the analysis of these
networks. The main complex networks, which are biological networks, social networks, technological networks, and information networks, are reviewed in this chapter with brief descriptions of the algorithms needed to solve some problems in these networks.
14.1 Introduction
Biological networks are the networks of organisms with nodes representing biolog-
ical entities and edges showing the interactions among them. The cell is the basic
biological unit of all organisms. Biological networks can be classified as networks
within the cell and networks outside the cell at a more macroscopic level. Cells are
mainly of two types: eukaryotic cells which have nuclei carrying the genetic infor-
mation in chromosomes and prokaryotic cells which do not have nuclei. A gene is
the basic unit of hereditary and consists of a string of nucleotides which are small
molecules that make deoxyribonucleic acid (DNA) in a double helix structure.
Genes are decoded to make amino acids which are chained to make proteins, which are large molecules outside the nucleus of the cell carrying out all vital functions related
to life. This process is called the central dogma of life. Proteins are large molecules
and their main functionality depends on their amino acid sequences, their 3-D shape
and also the interaction with other proteins. These interactions can be represented by
graph edges to obtain protein–protein interaction (PPI) networks with proteins as the
nodes of the network. Figure 14.1 displays the PPI network of T. pallidum. Other main
networks in the cell are the gene regulation networks (GRNs) formed by interacting proteins and genes, and the metabolic networks representing biochemical reactions in the
cell to generate metabolism [8]. Other biological networks outside the cell include
brain networks, neural networks, phylogenetic networks and the food web [8]. The
main problems encountered in biological networks are clustering, network motif
search and network alignment.
14.2.1 Clustering
In MST-based clustering, removing the heaviest edge of a minimum spanning tree (MST) of the graph splits it into two clusters. This process may be repeated until a certain cluster quality criterion is met, as sketched below.
Note that MST is computed only once and heaviest edges are removed iteratively.
Labeling of the newly formed clusters can be done simply by the BFS algorithm in
linear times for the two clusters. We may remove a number of edges at each step from
the MST that are more than a threshold distance apart. Computation of MST can be
performed in parallel using any of the algorithms described in Chap. 7.
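A minimal sequential sketch of this MST-based clustering is given below: the MST is computed once with Kruskal's algorithm and its heaviest edges are then removed to leave the desired number of connected components. The function names are illustrative, and the graph is assumed to be connected with vertices labeled 0..n−1.

    def mst_clustering(n, edges, k):
        """Split a weighted graph into k clusters by removing the k-1 heaviest MST edges.
        edges: list of (weight, u, v) tuples."""
        parent = list(range(n))
        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]
                x = parent[x]
            return x
        mst = []
        for w, u, v in sorted(edges):              # Kruskal: scan edges by increasing weight
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                mst.append((w, u, v))
        kept = sorted(mst)[: len(mst) - (k - 1)]   # drop the k-1 heaviest MST edges
        parent = list(range(n))                    # relabel: components of the remaining forest
        for w, u, v in kept:
            parent[find(u)] = find(v)
        labels = {}
        return [labels.setdefault(find(v), len(labels)) for v in range(n)]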
1. Motif search: Either all instances of a motif are searched or sampling is used
where only a sample subgraph is searched and the results are projected to the
whole graph. For very large graphs, the latter is commonly employed.
2. Isomorphic classes: Different looking subgraphs may be isomorphic, therefore
they need to be grouped together for correct processing.
3. Statistical significance: We need to determine the statistical significance of dis-
covered subgraphs. This process is usually performed by generating a set of ran-
dom graphs similar in structure to the target graph and performing the search in
these graphs. The results obtained in two cases can then be compared to determine
whether the found subgraphs are actual motifs.
edges depict the relationships. Social networks have small-world and scale-free
properties like other complex networks.
• Triadic closure: Let us assume A and B are two friends in a social network. There
is a good chance that a friend of A (or B) who does not know B (or A) will become friends with that person in the future. This property is called triadic closure, as a triangle will eventually be formed among the three persons by the addition of an edge between the two people who have not met before.
• Homophily: The homophily property observed in dynamic social networks is that individuals or groups have a tendency to form relationships with other individuals or groups similar to themselves. The similarity could be age, philosophy, or something else in common.
• Relations: In a friendship social network, we can label edges between two persons as positive (+) meaning they like each other or negative (−) showing they dislike each other, assuming these relations are symmetric. In a small social network with a triangle structure of three people A, B and C, we can have four cases: all three relations are positive; all three are negative; two relations are positive and one is negative; or one relation is positive and two are negative.
The first case and the last case are balanced relations while the second and third
are not. With three people who all dislike each other, there is a tendency for two
to become friends against the third one and hence this case is unbalanced. Also,
the third case implies two persons who do not want to be together but want to
be together with a third person causing again an unbalanced situation. The last
case is balanced as there is no conflict. In graph-theoretical terms, this means any triangle with one or three positive relations is balanced and any triangle with zero or two positive edges is unbalanced. We can now find all triangles in a given social
network with assigned relations and if all of these are balanced, the whole social
network is a balanced network, otherwise even if there exists one unbalanced
triangle, the network is said to be unbalanced.
• Structural balance: A general method to determine whether a social network is
balanced or not was proposed by Harary [13] in the Balance Theorem.
Theorem 14.1 A complete social network is balanced when all pairs of its nodes
have positive relations with each other or when its nodes can be partitioned into
two sets V1 and V2 such that all nodes within these groups are friends with all other
nodes in their groups and each node of one group has negative relations with all
other nodes in the other group.
A balanced social network according to this theorem is depicted in Fig. 14.3. It can
be seen that any triangle which has two nodes in one group and one in the other has
just one + label which means these triangles are balanced. The remaining triangles
are all embedded in each group and have + labels on their edges, therefore they are
also balanced resulting in a balanced network.
Detecting communities, which are dense regions of activity in social networks, has many implications; for example, we can analyze these clusters of persons or groups to understand their behavior. We will review two algorithms for this purpose which are implemented in social networks.
Edge Betweenness-Based Algorithm
The edge-betweenness value of an edge e in a graph G is the fraction of all-pairs shortest paths that pass through e. Intuitively, edges with high values have a greater probability of joining dense regions of the graph than edges with lower values. In the extreme case, a bridge, removal of which disconnects a graph, has a very high edge-betweenness value. Based on this observation, Girvan et al. proposed an algorithm to
detect clusters in a complex network represented by a graph G = (V, E) consisting
of the following steps with a similar structure to MST-based clustering [11]:
1. repeat
2.     compute the edge-betweenness value σ_xy for each edge (x, y) of graph G
3.     e_uv ← the edge with maximum σ value
4.     G ← G − {e_uv}
5. until a quality criterion is met
The most time consuming step of this algorithm is the calculation of the edge-betweenness values, and hence it has low performance for graphs which have more than a few thousand nodes. This method is also used for detecting clusters in biological networks.
Let $e_{ii}$ be the fraction of edges that have both endpoints in cluster i and $a_i$ the fraction of edge endpoints that are attached to vertices in cluster i. The modularity Q of a division into k clusters is then

$Q = \sum_{i=1}^{k} (e_{ii} - a_i^2). \qquad (14.1)$
Using this equation, we are in fact evaluating the difference in probabilities of an
edge being in module i and that a random edge would fall in module i and summing
these values for all clusters. When the percentage of edges within clusters is much higher than the ones with one end in a cluster (inter-cluster edges), we expect a high value of Q; in fact, the value of Q approaches unity when there are only a few edges between clusters. A clustering algorithm based on the modularity concept can then be formed such that, at each step, the two clusters whose merge increases modularity the most are combined [19]. This algorithm provides clusters as a dendrogram which can be used to obtain the required number of clusters. The time complexity of this algorithm is O((m + n)n), or O(n²) on sparse graphs [19].
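Given a candidate division into clusters, Q of Eq. 14.1 can be evaluated directly from the edge list; the sketch below assumes an undirected graph given as a list of edges and a dictionary mapping each vertex to its cluster label (names are illustrative).

    def modularity(edges, cluster):
        """Q = sum_i (e_ii - a_i^2); edges is a list of (u, v), cluster maps vertex -> label."""
        m = len(edges)
        labels = set(cluster.values())
        e = {i: 0.0 for i in labels}   # e_ii: fraction of edges with both ends in cluster i
        a = {i: 0.0 for i in labels}   # a_i: fraction of edge endpoints in cluster i
        for u, v in edges:
            if cluster[u] == cluster[v]:
                e[cluster[u]] += 1.0 / m
            a[cluster[u]] += 0.5 / m
            a[cluster[v]] += 0.5 / m
        return sum(e[i] - a[i] ** 2 for i in labels)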
14.4 Ad Hoc Wireless Networks
Fig. 14.4 A MANET with three nodes a, b and c. They are all within range r of each other in a, and node c moves to a new position which is out of range of node a in b, causing the edge (a, c) to be deleted from the graph and the edge (b, c) to be modified
14.4.1 k-Connectivity
Connectivity in a graph G = (V, E) means there is a path between any two nodes u and v in G, as we reviewed in Chap. 8. Connectivity is needed in any computer network for the transfer of information between each pair of nodes, but this problem is more prominent in a wireless network. For example, moving nodes in a MANET may disrupt connectivity easily, and a sensor node that runs out of battery power may cause disconnection in such a network. The probabilities of these events are much higher than the failure of a router in a wired network.
Fig. 14.5 A WSN with a sink node. Transmission ranges of all nodes are shown

Let us recall the k-connectivity problem: A network is k-connected if there are at least k disjoint paths between any two of its nodes. We can deduce that a k-connected network remains connected, and therefore functional, under the failure of up to k − 1 nodes; at least k node failures are needed to disconnect it. Clearly, the higher the value of k, the more strongly connected the network becomes. Hence, we can say a network with a higher k-connectivity value is more reliable and fault tolerant against node failures than a network with a lower k value. In general, we have three main problems related to
k-connectivity in an ad hoc wireless network [1]:
In search of a solution to the first problem, the neighbors of a node or the radio
power of nodes are increased iteratively in various research studies. Even when this
first step is accomplished, we need to monitor and determine the value of k to take
remedial action when this falls below a desired value. When this happens, we can
place new nodes in a WSN or move mobile nodes to new locations in a MANET to
increase the value of k. We have already reviewed algorithms to determine the value
of k in Chap. 8.
Fig. 14.6 An example backbone in a wireless ad hoc network. Clusters C1 , C2 , C3 and C4 are
shown inside dashed circles and CHs are shown in bold. Node a in C4 sends a message to its CH
which forwards the message along the backbone until the cluster of b which is C3 is found
14.4.2.1 Algorithms
Gerla and Tsai proposed a clustering algorithm using the identifiers of nodes in
wireless ad hoc networks [10]. A node in such a network broadcasts periodically the
identifiers of the neighbors in its transmission range in its unit disk graph. After the
broadcast, it listens to the medium for a while and does one of the following:
• A node that does not hear a node with a lower identifier than itself after a timeout decides to be a CH and broadcasts this condition.
• The lowest identifier neighbor that a node hears is assigned as its CH, unless that
node voluntarily gives up its position as a CH.
• A node that hears the declaration of two or more CHs assigns itself as a gateway
bridging two clusters.
As can be seen, the symmetry breaking condition is the choice of the node with
the lowest identifier in the transmission range and hence the name of the algorithm.
The node that hears all higher identifier nodes becomes the CH and broadcasts itself
as the CH. Nodes that hear the CH declaration message become part of the cluster
managed by that CH. A final correction in the algorithm involves selecting a node as
a gateway when it hears two nodes as CHs. This simple algorithm creates clusters in
linear time, however, a low identifier node joining a cluster results in reorganization
of a cluster which may be costly.
The authors proposed another algorithm to form clusters called highest connec-
tivity cluster algorithm which aims to select the node with the highest degree as the
Fig. 14.7 a Lowest identifier algorithm, b Highest connectivity algorithm implementations. CHs
are shown in bold and gateway nodes are gray
CH [10]. The following rules are applied in this algorithm after a node broadcasts
the list of nodes it can hear including itself:
This time, nodes with higher degrees are selected assuming they can access the
nodes in their clusters easily. In both algorithms, no two CHs will be adjacent to
each other and in a cluster the distance between any two nodes is at most two hops.
Figure 14.7 displays clusters formed by both algorithms. The total number of messages communicated in the first algorithm is 2n, as each node broadcasts one message (update) to its neighbors and another message (i_am_chead or ordinary) to inform whether it is a CH or an ordinary node.
Instead of grouping the wireless ad hoc network into clusters with CHs and then
forming a backbone with these CHs, we can use connected dominating sets as the
backbone. A dominating set D of a graph G = (V, E) is a subset of its vertices
such that ∀v ∈ V , either v ∈ D or v is adjacent to a vertex u ∈ D. In a connected
dominating set (CDS), there is a path between any two vertices in this set. Recall that finding the minimum order connected or unconnected dominating set is NP-hard (see Chap. 3), and we need to form a CDS to have correct operation of the backbone. Otherwise, we need to insert vertices between the elements of the dominating set. We
also form clusters this way by denoting each element of D as CH and any neighbor
connected to such a CH becomes the member of the cluster of this CH. For example,
the CHs in Fig. 14.6 form a 3-hop dominating set with shown clusters around them.
We can build a maximal independent set (MIS) of the graph and then connect the
nodes in the MIS to obtain a CDS. We will describe a direct algorithm that finds the
CDS in linear time in the next section. An evident requirement is that the backbone
nodes should be connected and that every node should be in the backbone or a
neighbor to a backbone node.
The first rule means a node that finds that all of its neighbors are covered by a neighbor with a lower identifier removes itself from the CDS. We have two nodes covering the
same neighbors in this case and one of them can be removed and identifiers are used
to break the symmetry. The second rule removes a node from CDS if its neighbors
are covered by the union of the neighbors of two higher-identifier nodes in the CDS. In this case, we are looking at the union of nodes which may be partly covered by two
neighbor nodes. As each node sends exactly two broadcast messages in a wireless ad
hoc network in this implementation, total number of messages transmitted is 2n and
it can be used conveniently in a MANET due to its low maintenance requirements.
When a node moves away, only its neighbors need to update their states. Figure 14.8
shows an example network where a minimal CDS is formed in two steps.
Cokuslu and Erciyes modified this algorithm by incorporating the degrees of
the nodes as well as their identifiers while pruning in the second step [4]. They
compared their algorithm with Wu’s algorithm and showed experimentally that it
provides significantly smaller MCDSs. Das et al. [5] provided two algorithms that
are the distributed versions of Guha–Khuller algorithms we have seen in Chap. 10.
In the first algorithm, nodes are assigned weights as their effective degrees which is
the number of their non-CDS neighbors. Initially a small dominating set C is formed
which may have several disconnected components. The forest consisting of the edges
{v1 , v2 } where v1 ∈ C and v2 ∈ N (v1 ) is then connected in the second stage using
a distributed minimum spanning tree (MST) algorithm. The CDS consists of the
Fig. 14.8 Implementation of Pruning-based Algorithm in a small graph. Nodes that mark them-
selves black since they have two directly unconnected neighbors are shown in a and implementing
Rule 1 and Rule 2 in nodes 5 and 3 respectively results in the smaller CDS shown in b
non-leaf nodes of the MST formed. This algorithm provides an approximation ratio of $2H(\Delta) + 1$ in $O((n + |C|)\Delta)$ time using $O(n|C| + m + n \log n)$ messages [5].
1. Leader Election: A spanning tree is constructed rooted at the leader and nodes
notify leader that this phase is over.
2. Level Calculation: The leader starts this phase by sending a level 0 message
and then each node increases the level received from parent and transfers level
message to children if they exist. A convergecast operation by complete messages
to the root concludes this phase.
3. Color Marking: The nodes in the MIS are colored black and all other nodes are
colored gray at the end of this phase. The dominator message is sent by a node
that marks itself black and the dominatee message is sent by a node that marks itself gray.
Initially, all nodes are white and the algorithm is executed according to the fol-
lowing rules [2]:
1. A white node which receives a dominator message for the first time marks itself gray and broadcasts a dominatee message to inform that it has been dominated.
2. A white node that has received dominatee messages from all of the neighbors
with lower ranks marks itself black, sends a dominator message to all of its
neighbors and assigns its parent in T as its dominator.
3. A gray node that receives a dominator message for the first time from a child node in T which has never sent a dominatee message changes its color to black and sends a dominator message to all of its neighbors.
4. Whenever a black node finds that all of its neighbors are black and have lower
ranks than itself, it changes its color to gray and sends dominatee message to all
of its neighbors.
Rule 1 ensures that if the neighbor of a white node is included in the CDS, it
colors itself gray to be a neighbor node of a CDS node. In Rule 2, if all the lower
rank neighbors are gray, a node may be assigned as a CDS node. The second phase
finishes when the leaves of the tree are marked. At the end of the first two phases,
an MIS is formed and the nodes in this set are connected using invite and join
messages. This algorithm has time complexity of O(n), message complexity of
O(n log n) and the resulting CDS has a size of at most 8|MinCDS| + 1 [2].
The Internet is the largest computer network in the world and consists of billions of
devices including personal computers, servers, mobile phones connected by various
networks. The Internet is a complex network exhibiting small-world and scale-free
properties as discovered experimentally [3]. The average distance between any two
nodes is between 3 and 9 and a small fraction of nodes in the Internet have very
high number of connections with most of the nodes having low degrees. In fact, the
average degree of nodes in the Internet is between 2 and 8 [3].
Routing in the Internet is needed to find efficient paths from a source to many desti-
nations in the network. A routing protocol specifies a set of rules for efficient data
transfer between sources and destinations. There are various choices for Internet
routing protocols; the information can be stored globally or decentralized. We have
all routers having complete topology information in the former and a router is aware
of its neighbors and the link costs to these neighbors in the latter. The network may
be static with routes changing slowly or dynamic with frequent route changes. Addi-
tionally, the routing may be sensitive to load in the network or not. Two representative
routing algorithms in the Internet are the distance vector and link state algorithms.
In both of these algorithms, it is assumed that the router is aware of the address of its
neighbors and the cost reaching those neighbors. We will see the routing problem in
the Internet can be solved efficiently with graph algorithms.
The distance vector routing (DVR) protocol uses a local distributed and a dynamic
algorithm that adapts to changes and link failures. It is based on the Bellman–Ford
dynamic shortest path algorithm we reviewed in Chap. 3. The main idea of this
algorithm is the diffusion of the shortest paths to neighbors. Each node i periodically
sends its shortest distances to all other nodes in a message update including vector
length[1..n] with entry length[j] showing its distance to node j. When a node
receives these vectors from the neighbors, it updates its shortest paths to all other
nodes in the network based on the values in length. When a node i receives an
update message, it checks the entries in length vector and if there is a shorter route
to a destination j in length than its own, it updates its local routing table with that of
the one in length. The count-to-infinity problem is encountered in this protocol when
a node becomes isolated due to a link failure or breaks down and all nodes start to
increase their distances to this node.
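The per-node update of the distance vector protocol is essentially a Bellman–Ford relaxation over the vector received from a neighbor; a minimal sketch is given below (the function and parameter names are illustrative, and next_hop records the neighbor giving the best known route).

    def dv_update(my_dist, my_next_hop, neighbor, link_cost, neighbor_dist):
        """Merge the distance vector received from a neighbor into the local routing table.
        my_dist, neighbor_dist: dicts mapping destination -> distance; link_cost: cost to the neighbor."""
        changed = False
        for dest, d in neighbor_dist.items():
            candidate = link_cost + d                      # cost of reaching dest via this neighbor
            if candidate < my_dist.get(dest, float('inf')):
                my_dist[dest] = candidate
                my_next_hop[dest] = neighbor
                changed = True
        return changed                                     # True if an update should be broadcast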
The link state protocol uses a global distributed algorithm in which each router is
aware of the entire network topology and computes the shortest paths to all other
nodes individually using Dijkstra's single source shortest path (SSSP) algorithm we saw in Chap. 7. The network information is transferred by periodic link state packets (LSPs) that include the costs of reaching neighbors, a sequence number, and a time-to-live field which is decremented at each hop. Nodes gather the information
flooded through the network and use it to compute routes using the SSSP algorithm.
The local routing tables need to be large as whole network information is to be stored.
Up to this point, we have assumed all routers are in a flat network structure which is
not realistic. Hierarchical routing in the Internet is based on hierarchical placement
of routers which is more reasonable than storing routing information for millions
of destinations at a single node. Routers are clustered into autonomous systems
(ASs) and routers in the same AS use the same routing protocol whereas routers
in different ASs can run different routing protocols. There is an inter-AS routing
protocol for data transfer among ASs. If a packet received by a router is destined for a destination in the same AS, the shortest route computed by the intra-AS routing algorithm is used. Otherwise, the packet is transferred to one of the gateway routers
to be delivered to the destination AS. The required destination AS is computed by
the inter-AS routing protocol. The standard inter-AS protocol of the Internet is the
Border Gateway Protocol (BGP) [21].
14.6 The Web as an Information Network
The world wide web (WWW), or Web, is an information network formed by the
references in Web documents. We can think of the Web as a higher level structure
over the Internet which consists of Web pages with links to each other. Such an
organization of Web pages can be conveniently modeled by a digraph, commonly
denoted as the Web graph. We can then search for solutions to problems encountered there, such as finding the most relevant page to a query, using this digraph. The hypertext transfer protocol (http) is used for communication between the Web clients and
Web servers and the links between Web pages are called hyperlinks.
The Web graph is very dynamic with numerous nodes (pages) being added and
deleted at any time. It is a complex network consisting of millions of nodes bearing
the commonly found complex network properties such as small-world and scale-free structures. In other words, there are only a few hops between any two documents on
the Web and only a small percentage of the nodes have very high degrees with most
of the nodes having small degrees. The Web graph was found experimentally to have
a very large strongly connected component called the giant component (GC) with
other nodes grouped as follows:
• IN: This is the set of nodes that have directed links to the GC.
• OUT : Nodes that have directed links from the GC form this component.
• Tendrils: A tendril has Web pages connected to either IN or OUT but are not part
of IN, OUT or the GC.
• Disconnected nodes: Pages that cannot be accessed from any component.
• Authority Update Rule: The authority score of a page is the sum of the hub scores
of all pages pointing to it.
• Hub Update Rule: The hub score of a page is the sum of the authority scores of
all pages that it points to.
With these rules, we give more importance to authorities that are pointed by more
hubs than others. Moreover, if a hub has pointed to authorities which have been
pointed by many hubs, its importance is also raised. This feedback structure may
be repeated in a loop to determine the importance of authorities which can then be
presented to the user with respect to their priorities. A one-step implementation of
this algorithm is shown in Fig. 14.9.
A possible pseudocode of this algorithm is depicted in Algorithm 14.1 in which
the hub and authority scores of the pages are initialized to 1 and the above rules are
applied iteratively [7]. The final scores at each iteration are calculated by dividing each score value by the sum of the scores. It was shown in [16] that the scores for hub and authority pages converge as the number of iterations goes to infinity.
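A minimal Python sketch of the iteration is given below, assuming the query subgraph is given as a dictionary mapping each page to the set of pages it points to; all names and the iteration count are illustrative.

    def hits(out_links, iterations=20):
        """Iteratively apply the authority and hub update rules and normalize the scores."""
        pages = set(out_links) | {q for targets in out_links.values() for q in targets}
        hub = {p: 1.0 for p in pages}
        auth = {p: 1.0 for p in pages}
        for _ in range(iterations):
            # authority update: sum of hub scores of the pages pointing to p
            auth = {p: sum(hub[q] for q in pages if p in out_links.get(q, ())) for p in pages}
            # hub update: sum of authority scores of the pages p points to
            hub = {p: sum(auth[q] for q in out_links.get(p, ())) for p in pages}
            sa, sh = sum(auth.values()) or 1.0, sum(hub.values()) or 1.0
            auth = {p: v / sa for p, v in auth.items()}     # normalize by the sum of scores
            hub = {p: v / sh for p, v in hub.items()}
        return hub, auth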
The page rank value measures the importance of a page in the Web graph based on the number of pages that reference it. It is basically a score for a page based on the votes it receives from other pages. This is a sensible
metric for the importance of a page since the relative importance and popularity of
a page increases by the number of pages referencing it. Page rank can be considered
as a fluid that runs through the network accumulating at important nodes. The page
rank algorithm to find importance of pages in a Web graph assigns ranks of the pages
in the Web graph such that the total page rank value in the network remains constant.
It initially assigns rank values of 1/n to each page in an n node network as shown in
Algorithm 14.1. The current rank value of a page is evenly distributed to its outgoing
links and then the new page rank values are calculated as the sum of the weights of the incoming links of pages. Execution of this algorithm for k steps results in more
refined results for page rank values as in the Authority and Hubs algorithm and the
page rank values converge as k → ∞.
Finding the initial edge weights using this algorithm in a small Web graph with
five nodes is depicted in Fig. 14.10.
Running of the Page Rank algorithm for the graph of Fig. 14.10 for the first three
iterations is shown in Table 14.1. We can see page 3 has the least rank as it is pointed
by only one page and page 4 has slightly higher rank than others as it is the only page
pointed by 3 pages. A page that does not point to many pages as other pages but has
many input edges may have high scores after a significant number of iterations. This
situation is corrected by the introduction of damping factor d which is used to scale
down page rank values by (1 − d)/n [16].
Table 14.1 Page Rank values of the Web nodes of Fig. 14.10

    Vertex                  1      2      3      4      5
    n_out                   3      2      2      2      1
    k = 1: weight/edge      0.067  0.1    0.1    0.1    0.2
           rank             0.2    0.3    0.067  0.267  0.167
    k = 2: weight/edge      0.067  0.15   0.034  0.134  0.167
           rank             0.284  0.201  0.034  0.251  0.201
    k = 3: weight/edge      0.095  0.101  0.017  0.067  0.167
           rank             0.168  0.184  0.095  0.213  0.162
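The iteration described above, including the damping factor d, can be sketched as follows; out_links maps each page to the pages it points to, and the names and defaults are illustrative. Pages with no outgoing links would otherwise lose their rank; a common remedy, assumed here, is to spread their rank evenly over all pages.

    def pagerank(out_links, d=0.85, iterations=50):
        pages = set(out_links) | {q for targets in out_links.values() for q in targets}
        n = len(pages)
        rank = {p: 1.0 / n for p in pages}             # initial rank of every page is 1/n
        for _ in range(iterations):
            new = {p: (1.0 - d) / n for p in pages}
            for p in pages:
                targets = out_links.get(p, ())
                if targets:
                    share = rank[p] / len(targets)     # rank is divided evenly over outgoing links
                    for q in targets:
                        new[q] += d * share
                else:                                  # dangling page: spread its rank evenly
                    for q in pages:
                        new[q] += d * rank[p] / n
            rank = new
        return rank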
14.7 Chapter Notes
We have described and reviewed fundamental complex networks which are biological
networks, social networks, technological networks and information networks. All of
these networks exhibit small-world and scale-free properties of complex networks.
We then took a closer look at some of the representative examples for these networks.
PPI networks are biological networks that exist outside the nuclei in the cell and
three main problems encountered in these structures are detecting clusters, finding
network motifs and aligning two networks. Clusters or dense subgraphs in these
networks may indicate some dense activity in this region and we reviewed two
algorithms for the purpose of discovering clusters. Network motifs are repeating
subgraphs and finding them is another important task and research area in PPI net-
works and other biological networks. These structures are assumed to have some
basic functionality and are considered to be the basic building blocks of organisms.
Moreover, finding similar network motifs in two or more organisms may indicate
common ancestry. Alignment of two or more networks shows their similarities and is
frequently used to compare various networks.
Social networks are formed by individuals or groups of individuals and finding
communities which are closely related groups provides insight to the structure of a
social network. We reviewed two main algorithms for this purpose and also described
relations, and balanced and unbalanced social networks. Wireless ad hoc networks
are technological networks like the Internet. Two main types of these networks are
the mobile ad hoc networks and wireless sensor networks. We described various
clustering algorithms in these networks in detail which are all distributed algorithms
executed by individual nodes of the network. We also reviewed main routing algo-
rithms in the Internet which are extensions of the routing algorithms described in
Chap. 7. We lastly reviewed the Web which is an information network and analyzed
two algorithms to attribute importance to Web pages for efficient Web queries. The
first algorithm, called HITS, divides the nodes into hubs and authorities during a query and assigns scores to these nodes based on the scores of the nodes they point to and are
pointed by. The PageRank algorithm is more general and finds wide use in the Web.
We can say clustering is a fundamental research area in all of these networks
and heuristics are widely used to find the dense regions in the large graphs that
represent these networks. There is not a single algorithm that fits all of the needs of an application, and the experimental results obtained, along with its complexity, are commonly accepted as the measure of the goodness of an algorithm. Parallel and distributed
algorithms are needed in all areas of research in these complex networks. A survey of
parallel clustering algorithms with newly proposed ones is given in [8]. Distributed
clustering algorithms in wireless ad hoc networks have been studied extensively but
the same is not valid for clustering in other complex networks. There are only few
studies for parallel network motif search and parallel network alignment which are
potential research areas.
Exercises
1. Work out the MST of the graph shown in Fig. 14.11 using any algorithm. Then,
divide this graph into three clusters by removing the heaviest two edges from the
MST. Find the total cost of inter-cluster edges and the cost of edges within the
clusters and compute cluster quality.
2. Find whether the social network depicted in the labeled graph of Fig. 14.12 is
balanced or not by checking every triangle. Suggest what to do to make this
network balanced.
3. Work out the scores of hub and authority pages of the Web query graph of
Fig. 14.13 for three iterations of the HITS algorithm. Determine whether there is
a convergence of score values or not.
4. A Web graph is given in Fig. 14.14. Implement the PageRank algorithm in this
graph for four iterations by showing the page rank scores for each page.
References
1. Akram VK, Dagdeviren O (2015) On k-connectivity problems in distributed systems. In: Advanced methods for complex network analysis. IGI Global
2. Alzoubi KM, Wan P-J, Frieder O (2002) New distributed algorithm for connected dominating
set in wireless ad hoc networks. In: Proceedings of 35th Hawaii international conference on
system sciences, Big Island, Hawaii
3. Caldarelli G, Vespignani A (2007) Large scale structure and dynamics of complex networks:
from information technology to finance and natural science. Complex Systems and Interdisci-
plinary Science. World Scientific Publishing Company. Chapter 8, ISBN-13: 978-9812706645
4. Cokuslu D, Erciyes K, Dagdeviren O (2006) A dominating set based clustering algorithm for
mobile ad hoc networks. Int Conf Comput Sci 1:571–578
5. Das B, Bharghavan V (1997) Routing in ad-hoc networks using minimum connected dominating sets. In: IEEE international conference on communications (ICC97), vol 1, pp 376–380
6. Dongen SV (2000) Graph clustering by flow simulation. Ph.D. Thesis, University of Utrecht,
The Netherlands
7. Erciyes K (2014) The Internet and the Web. In: Complex networks: an algorithmic perspective.
CRC Press. ISBN-10: 1466571667, ISBN-13: 978-1466571662
8. Erciyes K (2015) Distributed and sequential algorithms for Bioinformatics, Springer, Berlin
(Chaps. 10 and 11)
9. Fiedler M (1973) Algebraic connectivity of graphs. Czechoslov Math J 23:298–305
10. Gerla M, Tsai JTC (1995) Multicluster, mobile, multimedia radio network. Wirel Netw 1:255–
265
11. Girvan M, Newman MEJ (2002) Community structure in social and biological networks. PNAS
99:7821–7826
12. Hagen L, Kahng AB (1992) New spectral methods for ratio cut partitioning and clustering.
IEEE Trans Comput Aided Des Integr Circuits Syst 11(9):1074–1085
13. Harary F (1953) On the notion of balance of a signed graph. Mich Math J 2(2):143–146
14. International Organization for Standardization (1989-11-15) ISO/IEC 7498-4:1989 – Informa-
tion technology – open systems interconnection – basic reference model: naming and address-
ing. ISO Standards Maintenance Portal. ISO Central Secretariat. Retrieved 17 Aug 2015
15. Jorgic M, Goel N, Kalaichevan K, Nayak A, Stojmenovic I (2007) Localized detection of
k-connectivity in wireless ad hoc, actuator and sensor networks. In: Proceedings of 16th inter-
national conference on computer communications and networks (ICCCN 2007), pp 33–38
16. Kleinberg J (1999) Authoritative sources in a hyperlinked environment. J ACM 46(5):604–632
17. Lin CR, Gerla M (1997) Adaptive clustering for mobile wireless networks. IEEE J Sel Areas
Commun 15(1):1265–1275
18. Mount DM (2004) Bioinformatics: sequence and genome analysis, 2nd edn. Cold Spring Harbor
Laboratory Press, NY. ISBN 0-87969-608-7
19. Newman M (2003) Fast algorithm for detecting community structure in networks. Phys Rev E
69:066133
20. Olman V, Mao F, Wu H, Xu Y (2009) Parallel clustering algorithm for large data sets with
applications in bioinformatics. IEEE/ACM Trans Comput Biol Bioinform 6:344–352
21. RFC 4271 - A Border Gateway Protocol 4 (BGP-4). www.ietf.org
22. Titz B, Rajagopala SV, Goll J, Hauser R, McKevitt MT, Palzkill T, Uetz P (2008) The binary
protein interactome of Treponema pallidum, the syphilis spirochete. PLOS ONE 3(5):e2292
23. Wu J, Li H (1999) On calculating connected dominating set for efficient routing in ad hoc
wireless networks. In: Proceedings of the third international workshop on discrete algorithms
and methods for mobile computing and communications, pp 7–14
15 Epilogue
15.1 Introduction
We know by now that most of the problems encountered in the graph world are NP-hard, except for a few such as the matching problem, in which we searched for a set of disjoint edges of maximum size. Any new problem we face will probably not have a solution in polynomial time. We will attempt to specify the steps to follow in such a case.
• An in-depth understanding of the basic graph algorithms is very helpful. These algorithms, such as the DFS and BFS algorithms, may be used as building blocks to solve a more complicated problem. In many cases, a modified form of a basic algorithm can be used. We saw how the simple DFS algorithm, with some modifications, can be used for various problems such as finding articulation points, bridges, strongly connected components, and the blocks of a graph.
• When dealing with an NP-hard graph problem, we can search for an approximation algorithm if one exists. In many cases, however, approximation algorithms are rare and attempting to design a new one is not a trivial task. After all, if one can come up with a new approximation algorithm that has a better proven approximation ratio than existing ones, this can be published in an article. In some cases, we may opt to use an approximation algorithm that has a slightly worse approximation ratio than the best available one, due to the complexity of implementing the better algorithm. For example, finding a vertex cover of a graph using a maximal matching is a simple algorithm with an approximation ratio of 2 (see the sketch after this list).
• In the more common case, the use of heuristics is unavoidable. The choice of a heuristic largely depends on the nature of the problem at hand. When we are to design a new graph algorithm or modify an existing one, we have only a few properties to begin with. Especially when altering an existing algorithm for our purpose, the degree of a node can be incorporated to break symmetries or to directly select a node to work on. The degrees of neighbors or of k-hop neighbors can also be used. We can define new parameters that use degrees, neighbors, and their relationships. The clustering coefficient of a vertex, for example, shows how well connected the neighbors of that vertex are.
• For problems on large graphs, we may opt for an approximation algorithm that runs faster than an exact polynomial-time algorithm, due to the high execution times involved. Moreover, parallelization is always helpful to improve performance.
• A computer network consists of autonomous nodes that function independently. For network problems, we need efficient distributed algorithms that are executed by the network nodes. These algorithms commonly use neighbor information to find local solutions, which are then combined into a global solution to the problem.
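As referenced in the second item above, the following is a minimal sketch of the matching-based vertex cover approximation: it repeatedly picks an uncovered edge and adds both endpoints to the cover, which yields a maximal matching and a cover at most twice the optimum. The edge-list representation and the function name are illustrative assumptions.

# 2-approximation for vertex cover via a maximal matching (sketch).
# edges is an iterable of (u, v) pairs describing an undirected graph.
def vertex_cover_2approx(edges):
    cover = set()
    for u, v in edges:
        # If the edge is not yet covered, add both endpoints;
        # the edges chosen this way form a maximal matching.
        if u not in cover and v not in cover:
            cover.add(u)
            cover.add(v)
    return cover

# Example: for the path a-b-c-d this returns {a, b, c, d},
# within twice the optimum cover {b, c}.
# print(vertex_cover_2approx([('a', 'b'), ('b', 'c'), ('c', 'd')]))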
15.3 Are Large Graph Algorithms Different?
We can use any of the algorithms we have developed in Part II for large graphs. However, even a linear-time algorithm may be problematic due to the size of these large graphs. Employment of the following techniques is frequently needed for such large graphs.
• Use of heuristics: Heuristics are commonly used in solving NP-hard graph problems, as we noted. In some cases, one may opt for a heuristic algorithm with, for example, O(n) complexity over a deterministic algorithm with O(n²) complexity. For a large graph, this improvement in performance may be significant enough to decide on the heuristic solution.
• Scalable parallel algorithms: Parallel algorithms are needed in the analysis of the large graphs representing complex networks due to the magnitude of these graphs. This method is a necessity rather than a choice in such implementations.
• Distributed algorithms: In the case of large computer networks such as the Internet or a wireless sensor network, distributed algorithms are needed.
We have three modes for graph algorithms: sequential, parallel, and distributed as
emphasized throughout the book. We may then have the following possible conver-
sions:
15.5 Implementation
Once we have some way of solving the problem, we need to consider the implementation choices. We saw three ways of implementation that consider the data flow in an algorithm for a given problem: sequential, parallel, or distributed. From another perspective, the data structures used and the environment in which the algorithm runs also shape the implementation choice.
We have devoted most of the book to sequential graph algorithms, with sample parallel and distributed algorithms for specific graph problems. The size of the problem and the environment in which it is implemented are crucial in deciding whether we should look for a parallel or a distributed algorithm. For a large graph representing a complex network, parallel algorithms are commonly required to provide efficiency. On the other hand, if we are to solve a network problem in which the network nodes participate in finding the solution, we need to search for a distributed algorithm for the task at hand. In many cases, these boundaries are not so clear. For example, we may have a wireless sensor network with a cluster of computing nodes used as the sink and hundreds of sensing nodes. A problem such as routing, or a more complicated one, can then be handled by the nodes performing some local operation and sending their data to the sink using a distributed algorithm; the sink finds the solution efficiently using parallel computing and then sends the result back to the individual nodes using the distributed algorithm.
As a general rule, we can say that parallel computing is required for solving problems related to complex networks represented by large graphs. These problems, such as network motif search or network alignment, are NP-hard in many cases, which requires the use of heuristics, and even such implementations take considerable time due to the huge size of the graph. The speedup obtained by such a parallel algorithm is the ratio of the sequential time to the parallel computing time, and the efficiency is defined as the speedup divided by the number of processing elements used.
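Written compactly, with T_s the sequential running time, T_p the parallel running time, and p the number of processing elements, these two measures are:

  S = \frac{T_s}{T_p}, \qquad E = \frac{S}{p} = \frac{T_s}{p \, T_p}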
When we are dealing with a problem in which the network nodes represent the graph vertices, we should look for efficient distributed algorithms. We saw that single-initiator synchronous distributed algorithms are frequently used in such cases due to their relative ease of implementation. The number of rounds and the total number of messages exchanged until the algorithm terminates provide a good indication of its performance.
The main part of this book, including Parts I and II and most of Part III, is dedicated to graph algorithms that can be considered classical in the sense that traditional algorithmic techniques such as greedy, dynamic programming, and divide and conquer methods are employed. We reviewed the alternative and relatively more recent method of algebraic graph algorithm design in Chap. 12. This method makes use of the main matrices associated with a graph: the adjacency matrix, the incidence matrix, and the Laplacian matrix. The review of solutions to a few graph problems showed that the performance of such algorithms is commonly inferior to their classical counterparts. However, the
algebraic method provides simpler algorithms and, more importantly, a vast library of matrix operations in sequential and parallel form is readily available for use in such algorithms. For example, if the algebraic algorithm involves matrix multiplication, we already have methods to perform this operation in parallel by row, column, or block partitioning. Therefore, we do not need to spend much time finding a parallel algorithm for the problem studied if we can express the algebraic solution using basic matrix operations.
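As an illustration, here is a minimal sketch of matrix multiplication parallelized by row partitioning, using Python's multiprocessing module; the partitioning into one block of rows per worker and the function names are illustrative assumptions.

# Row-partitioned parallel matrix multiplication (sketch).
from multiprocessing import Pool

def multiply_rows(args):
    rows, B = args                        # rows: a block of rows of A
    n_cols = len(B[0])
    n_inner = len(B)
    return [[sum(row[k] * B[k][j] for k in range(n_inner))
             for j in range(n_cols)] for row in rows]

def parallel_matmul(A, B, workers=4):
    # Split the rows of A into one contiguous block per worker.
    block = max(1, (len(A) + workers - 1) // workers)
    blocks = [A[i:i + block] for i in range(0, len(A), block)]
    with Pool(processes=workers) as pool:
        partial = pool.map(multiply_rows, [(blk, B) for blk in blocks])
    return [row for part in partial for row in part]   # concatenate the row blocks

if __name__ == "__main__":
    A = [[1, 2], [-1, 0]]
    B = [[3, 2], [1, 1]]
    print(parallel_matmul(A, B, workers=2))   # [[5, 4], [-3, -2]]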
Dynamic graph algorithms are a necessity rather than a choice since real networks
are almost always dynamic with frequent addition and deletion of edges. Examples
of these networks are the Internet, the Web, social networks, and biological networks
of the cell. Instead of running a known static (classical or algebraic) algorithm on the modified network from scratch, it is sensible to design algorithms that make use of the existing network information and solve the problem faster. We noted that the main challenge in the design of dynamic graph algorithms lies in designing clever data structures so that modifications can be handled quickly.
Lastly, we investigated dynamic algebraic graph algorithms, which are an even less investigated area than algebraic and dynamic graph algorithms. When we have an algebraic method to solve a graph problem, there is the possibility of obtaining a dynamic version of this technique by using some known result from linear algebra and also by using dynamic matrix library operations. We saw such a procedure in dynamic algebraic matching, where an algebraic solution to this problem was combined with dynamic matrix operations and a theorem from linear algebra.
In conclusion, graph algorithm design using the traditional approaches will be used for small to moderate size problems. Moreover, these approaches commonly provide the basic design methods to be used in algebraic or dynamic algorithms for graphs. However, when we are searching for a solution in a large graph, parallel processing is needed, and such an operation can be handled more conveniently by algebraic graph algorithms. Dynamic graph algorithms are needed for dynamic networks for better performance. In many cases, dynamic networks such as protein interaction networks, social networks, and the Internet are large. We can therefore conclude that dynamic algebraic graph algorithms will continue to be an active research area in the future.
15.6 A Case Study: Backbone Construction in WSNs
Let us elaborate on a case study to illustrate the guidelines we have expressed so far. We need to cluster the nodes of a WSN for the general benefits obtained from such a
process. Electing a clusterhead (CH) eases various tasks such as routing since the
CH may perform these tasks on behalf of the nodes in its cluster. This hierarchi-
cal structure is clearly useful in managing any establishment including countries.
However, we need a spanning tree in the WSN to broadcast various commands from the root and also to aggregate the sensor data at the root. We may use two distinct algorithms for these purposes, but a closer look reveals that the two tasks can be performed by one algorithm in a more efficient way. We will call this process spanning tree-based clustering.
Fig. 15.1 Cluster formation in a sample WSN with a sink node. The maximum hop count is 2 and CHs are shown in gray with clusters in dashed regions
The general idea of the new algorithm is to build clusters and
the spanning tree simultaneously. Each cluster will be a subtree in the spanning tree.
We now search for a spanning tree algorithm and see whether we can modify it for our purpose. We saw how to build a spanning tree using flooding in Chap. 5. Erciyes et al. presented an algorithm that builds a spanning tree and forms clusters simultaneously using flooding [3]. The main idea of this algorithm is to keep a record of the depth of the spanning tree obtained during the iterations and to assign the nodes of the tree within every d hops to a cluster.
The root node starts the algorithm by sending the first probe message, and the nodes receiving this message for the first time mark the sender as their parent and send back an ack message; otherwise, the sender is replied to with a nack message, as in the original flooding algorithm. Additionally, the depth of a subtree cluster is set prior to execution in the variable max_depth, and at every message reception the variable count is incremented and checked against max_depth. When this limit is reached, the end of the cluster is marked and another cluster is started, as shown in Algorithm 15.1 [2]. An unvisited node that receives a probe message with 0 in its count field becomes the CH of its subtree. All nodes other than the root are classified as CH, ordinary, or leaf at the end of the algorithm. Clusters formed with this algorithm in a sample WSN are shown in Fig. 15.1.
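A minimal sequential sketch of this idea is given below. It simulates the flooding by a breadth-first traversal from the root and opens a new cluster, with a new CH, whenever the hop count within the current cluster reaches max_depth. It only illustrates the principle and is not the distributed Algorithm 15.1 itself; the graph representation and names are assumptions.

# Spanning tree-based clustering (sequential sketch of the idea).
# graph: dict mapping each node to its list of neighbors.
from collections import deque

def cluster_spanning_tree(graph, root, max_depth=2):
    parent = {root: None}
    cluster = {root: root}        # the root acts as CH of its own cluster
    count = {root: 0}             # hops accumulated within the current cluster
    queue = deque([root])
    while queue:
        u = queue.popleft()
        for v in graph[u]:
            if v not in parent:   # first probe received: u becomes v's parent
                parent[v] = u
                if count[u] == max_depth:
                    cluster[v] = v           # start a new cluster; v is its CH
                    count[v] = 0
                else:
                    cluster[v] = cluster[u]  # stay in the parent's cluster
                    count[v] = count[u] + 1
                queue.append(v)
    return parent, cluster

# Example: a path s-a-b-c-d with max_depth 2 gives clusters {s, a, b} and {c, d}.
# g = {'s': ['a'], 'a': ['s', 'b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}
# print(cluster_spanning_tree(g, 's'))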
The time complexity of the algorithm is O(d), where d is the diameter of the graph, and its message complexity is O(m).
Proof The diameter of the graph is an upper bound on the time required by the algorithm, as it is the largest distance between any two vertices. Since each edge will be traversed twice, by either a probe/ack or a probe/nack message pair, the message complexity of the algorithm is O(m).
Banerjee and Khuller [1] also proposed a protocol based on a spanning tree by
grouping branches of a spanning tree into clusters of an approximate target size.
15.7 Conclusions
We can briefly summarize the steps to follow when we need to decide on an algorithm
for the graph problem we have:
1. Check first whether a basic graph algorithm such as BFS or DFS can be used, possibly with modifications, for the problem at hand; we saw, for example, that BFS can be used effectively to find the edge betweenness values in a graph and DFS for various connectivity problems. If this is not possible, we can search for an approximation algorithm that works in linear time. If we cannot find an appropriate algorithm, our best choice will be to use some heuristics that give good results most of the time.
2. If the graph is large, it is always worthwhile to attempt to parallelize the algorithm. In this case, we saw that partitioning the adjacency matrix of the graph yields feasible solutions in many cases. We can also partition the graph by first contracting its vertices to obtain a simpler graph and then partitioning this simpler graph. For such graphs, using algebraic graph algorithms eases parallelization since parallel matrix operations are already available.
3. If we need to design a distributed algorithm for a computer network such as a WSN, we may attempt to convert a sequential algorithm to a distributed one as a general approach. This may not be a trivial task, especially if the sequential algorithm relies heavily on global data, since distributed algorithms commonly work using local data around the nodes (a small simulation sketch is given after this list).
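As a small illustration of the local, round-based style mentioned in item 3, the following sketch simulates a synchronous distributed computation in which every node exchanges values with its neighbors only, here to learn the smallest identifier in the network. The simulation harness and names are illustrative assumptions.

# Simulated synchronous rounds: every node keeps only local state and
# communicates only with its neighbors; after roughly diameter-many rounds
# all nodes know the minimum identifier in the network.
def min_id_flooding(graph):
    # graph: dict mapping node id -> list of neighbor ids
    known = {v: v for v in graph}            # each node initially knows only its own id
    rounds = 0
    changed = True
    while changed:                           # one iteration = one synchronous round
        changed = False
        new_known = dict(known)
        for v, neighbors in graph.items():
            smallest = min([known[u] for u in neighbors] + [known[v]])
            if smallest < new_known[v]:
                new_known[v] = smallest
                changed = True
        known = new_known
        rounds += 1
    return known, rounds

# Example: a ring of 5 nodes; every node ends up knowing node 0.
# ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
# print(min_id_flooding(ring))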
References
1. Banerjee S, Khuller S (2000) A clustering scheme for hierarchical routing in wireless networks.
Technical report CS-TR-4103, University of Maryland, College Park
2. Erciyes K (2013) Distributed graph algorithms for computer networks. Computer communica-
tions and networks series. Springer, Berlin, pp 247–248. ISBN 978-1-4471-5172-2
3. Erciyes K, Ozsoyeller D, Dagdeviren O (2008) Distributed algorithms to form cluster based
spanning trees in wireless sensor networks. ICCS 2008. LNCS. Springer, Berlin, pp 519–528
Appendix A: Pseudocode Conventions
A.1 Introduction
Here we describe the pseudocode conventions used throughout the book. We follow mainstream adaptations such as those in [1, 2]. The main points to be emphasized are as follows:
• Every algorithm starts with the declaration of its input and the output produced
by it.
• The lines of the algorithm are numbered for reference.
• We use indentations to show blocks which are executed within control structures.
• A procedure that is used from the main body of the algorithm is shown explicitly.
The data structures, control structures, and distributed algorithm structures are
described in the next sections.
A set of vertices which will contain integer values is declared and initialized as
empty.
The assignment in the above examples is performed by using the ← operator. The
value on the right of this operator is evaluated and assigned to the variable on the
left in the usual sense. Sometimes, we have two or more short expressions which are
placed in the same line of the algorithm separated by semicolons as follows. Note
that a statement line does not end with a semicolon.
i ← 5; j ← 8
Adding an element x to the set A is performed using the union operator:
A ← A ∪ {x}
and removing an element v from the set A is performed using the set minus (\) operator as follows:
A ← A \ {v}
Table A.3 shows the set operations used in the text with their meanings.
Control structures are used to alter the flow of execution. Selection and repetition are the two main modes of control, as described below.
Selection
Selecting one of a few alternative flows is commonly performed by the if-then-else construct. The Boolean expression after the if statement is evaluated, and the branch after then is taken if this expression yields a true value. We can specify an else block for the alternative flow taken when the expression yields a false value. An example is depicted in Algorithm A.7, where we want to test which of two given integers a and b is greater than the other, or whether they are equal. We see that line 7 is executed in this example.
Repetition
We use the loop constructs for, while, and repeat..until to execute a statement or a block of statements a number of times. The for-do loop is commonly used when the number of iterations is known beforehand. The example shown in Algorithm A.3 finds the sum of the elements of a matrix with integer elements.
When we are dealing with sets and do not know the size of the set, the for all loop can conveniently be used. Commonly, we arbitrarily select an element of the set and perform an operation on it, as shown in Algorithm A.4, where we simply output each element of a set S consisting of integers.
There are cases when we want to enter a loop based on a condition. The while loop can be used for such implementations; this type of loop may be entered zero or more times based on the evaluation of a Boolean expression, as shown in Algorithm A.5, where the sum of the numbers entered is calculated until 99 is entered. Note that 99 may be entered as the first input, causing no execution of the block inside the loop.
The last loop structure we use in the algorithms is the repeat..until loop, where the decision to execute the loop again is made after the loop body has run. This type of loop is used when we know the loop is to run at least once, as shown in Algorithm A.6, where we implement the above example of adding the numbers entered. Note that we do not need the input statement before the loop this time since we know the loop body will execute at least once.
References
1. Cormen TH, Leiserson CE, Rivest RL, Stein C (2001) Introduction to algorithms. MIT Press,
Cambridge
2. Erciyes K (2013) Distributed graph algorithms for computer networks. Springer, Berlin
Appendix B: Linear Algebra Review
B.1 Introduction
A graph can be represented by its adjacency matrix or its incidence matrix. The Laplacian matrix of a graph provides information about its spectral properties. Algebraic graph theory is based on applying algebraic methods to graph problems, and commonly the matrices associated with a graph are used for this purpose. Linear algebra is the branch of mathematics that deals with vectors and matrices. We provide a very brief and partial review of linear algebra, sufficient to serve as background for the spectral graph properties and algebraic graph algorithms described in the book.
• diagonal matrix: All entries except the diagonal values of this matrix are 0. For a 4 × 4 matrix:

  A = \begin{pmatrix} a_{1,1} & 0 & 0 & 0 \\ 0 & a_{2,2} & 0 & 0 \\ 0 & 0 & a_{3,3} & 0 \\ 0 & 0 & 0 & a_{4,4} \end{pmatrix}
• identity matrix: The identity matrix I is a diagonal matrix with all diagonal values equal to unity. I_4 is shown below:

  I_4 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}
• symmetric matrix: The values symmetric to the diagonal are equal in this matrix, which means this matrix is equal to its transpose.

  A = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\ a_{1,2} & a_{2,2} & a_{2,3} & a_{2,4} \\ a_{1,3} & a_{2,3} & a_{3,3} & a_{3,4} \\ a_{1,4} & a_{2,4} & a_{3,4} & a_{4,4} \end{pmatrix}
• addition: Two matrices can be added if they have the same dimensions. The corresponding entries of the matrices are then added to form the sum matrix. The addition of two 3 × 3 matrices A and B to obtain matrix C = A + B is as below:

  \begin{pmatrix} c_{1,1} & c_{1,2} & c_{1,3} \\ c_{2,1} & c_{2,2} & c_{2,3} \\ c_{3,1} & c_{3,2} & c_{3,3} \end{pmatrix} = \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{pmatrix} + \begin{pmatrix} b_{1,1} & b_{1,2} & b_{1,3} \\ b_{2,1} & b_{2,2} & b_{2,3} \\ b_{3,1} & b_{3,2} & b_{3,3} \end{pmatrix}

  = \begin{pmatrix} a_{1,1}+b_{1,1} & a_{1,2}+b_{1,2} & a_{1,3}+b_{1,3} \\ a_{2,1}+b_{2,1} & a_{2,2}+b_{2,2} & a_{2,3}+b_{2,3} \\ a_{3,1}+b_{3,1} & a_{3,2}+b_{3,2} & a_{3,3}+b_{3,3} \end{pmatrix}
• multiplication with a scalar: A matrix is multiplied by a scalar k by multiplying each of its entries by k. Multiplication of a 3 × 3 matrix A by a scalar k to obtain matrix C = k · A is shown below:

  C = k \cdot \begin{pmatrix} a_{1,1} & a_{1,2} & a_{1,3} \\ a_{2,1} & a_{2,2} & a_{2,3} \\ a_{3,1} & a_{3,2} & a_{3,3} \end{pmatrix} = \begin{pmatrix} k \cdot a_{1,1} & k \cdot a_{1,2} & k \cdot a_{1,3} \\ k \cdot a_{2,1} & k \cdot a_{2,2} & k \cdot a_{2,3} \\ k \cdot a_{3,1} & k \cdot a_{3,2} & k \cdot a_{3,3} \end{pmatrix}
• matrix multiplication: Two matrices A and B can be multiplied to give C = A × B when the number of columns of A equals the number of rows of B; the entry c_{i,j} is the sum of the products of the entries of row i of A with the corresponding entries of column j of B. For two 2 × 2 matrices:

  C = A \times B = \begin{pmatrix} a_{1,1} & a_{1,2} \\ a_{2,1} & a_{2,2} \end{pmatrix} \times \begin{pmatrix} b_{1,1} & b_{1,2} \\ b_{2,1} & b_{2,2} \end{pmatrix} = \begin{pmatrix} a_{1,1}b_{1,1}+a_{1,2}b_{2,1} & a_{1,1}b_{1,2}+a_{1,2}b_{2,2} \\ a_{2,1}b_{1,1}+a_{2,2}b_{2,1} & a_{2,1}b_{1,2}+a_{2,2}b_{2,2} \end{pmatrix}

  For example,

  \begin{pmatrix} 5 & 4 \\ -3 & -2 \end{pmatrix} = \begin{pmatrix} 1 & 2 \\ -1 & 0 \end{pmatrix} \times \begin{pmatrix} 3 & 2 \\ 1 & 1 \end{pmatrix}

  Multiplication of a matrix by the identity matrix of suitable size does not change it:

  A I_n = A = I_m A
For an n × n matrix A, if there exists an n × n matrix B such that

  AB = BA = I

then the matrix B is called the inverse of A, written A^{-1}, and A is called a non-singular matrix. When such an inverse matrix cannot be determined, the matrix A is called singular. A square matrix is singular if and only if its determinant is 0.
Determinant of a Matrix
The determinant of a square matrix A, shown as det(A) or |A|, is used in various operations including finding the inverse of A. The determinant of a 2 × 2 matrix A is calculated as follows:

  |A| = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc
The determinant of a 3 × 3 matrix A can be found by selecting a row or a column and multiplying each element of the selected row or column, with alternating signs, by the determinant of the submatrix obtained by deleting the row and the column of that element, as below:

  |A| = \begin{vmatrix} a & b & c \\ d & e & f \\ g & h & i \end{vmatrix} = a\begin{vmatrix} e & f \\ h & i \end{vmatrix} - b\begin{vmatrix} d & f \\ g & i \end{vmatrix} + c\begin{vmatrix} d & e \\ g & h \end{vmatrix}
The inverse of a non-singular matrix A can then be obtained from its cofactor matrix C as

  A^{-1} = \frac{1}{\det(A)} C^T
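For readers who want to experiment, the following is a minimal sketch of the cofactor expansion for determinants and the cofactor-based inverse described above; the function names are illustrative, and exact rational arithmetic is used via Python's fractions module.

# Determinant by cofactor expansion along the first row, and
# inverse via the transposed cofactor matrix (sketch, small matrices only).
from fractions import Fraction

def det(M):
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]   # delete row 0 and column j
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def inverse(M):
    n = len(M)
    d = Fraction(det(M))
    if d == 0:
        raise ValueError("matrix is singular")
    # Cofactor C[i][j] = (-1)^(i+j) times the minor obtained by deleting row i, column j.
    cof = [[(-1) ** (i + j) * det([r[:j] + r[j + 1:] for k, r in enumerate(M) if k != i])
            for j in range(n)] for i in range(n)]
    # A^{-1} = (1/det(A)) * C^T
    return [[Fraction(cof[j][i]) / d for j in range(n)] for i in range(n)]

# Example: print(inverse([[2, 3], [1, 2]]))  # [[2, -3], [-1, 2]]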
The following properties of matrix operations are valid, assuming the matrices are of appropriate sizes:
• A + B = B + A.
• A(B + C) = AB + AC.
• (A^T)^T = A.
• (A + B)^T = A^T + B^T.
• (AB)^T = B^T A^T.
• (AB)^{-1} = B^{-1} A^{-1}.
• k(A + B) = kA + kB for a scalar k.
A system of linear equations of the form shown below can be solved using matrix operations:

  a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n = b_1
  a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n = b_2
  \vdots
  a_{m,1}x_1 + a_{m,2}x_2 + \dots + a_{m,n}x_n = b_m

In matrix form this is written as Ax = b:

  \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{pmatrix}

When A is square and non-singular, the solution is

  x = A^{-1} b
Let us consider the following system of linear equations with two variables x_1 and x_2:

  2x_1 + 3x_2 = 4
  x_1 + 2x_2 = 3

We can write this system in matrix notation as Ax = b as follows:

  \begin{pmatrix} 2 & 3 \\ 1 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 4 \\ 3 \end{pmatrix}

We can now compute A^{-1} and then x = A^{-1} b as below:

  \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} 2 & -3 \\ -1 & 2 \end{pmatrix} \begin{pmatrix} 4 \\ 3 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \end{pmatrix}

to yield the values x_1 = -1 and x_2 = 2.
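The same computation can be checked with a few lines of NumPy; this is only a verification sketch and assumes NumPy is available.

# Verifying the worked example with NumPy.
import numpy as np

A = np.array([[2.0, 3.0], [1.0, 2.0]])
b = np.array([4.0, 3.0])

print(np.linalg.det(A))       # approximately 1.0: A is non-singular
print(np.linalg.inv(A))       # [[ 2. -3.] [-1.  2.]]
print(np.linalg.solve(A, b))  # [-1.  2.], i.e., x1 = -1, x2 = 2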
Gaussian Elimination
Another method to solve a system of linear equations is to first form the matrix equation Ax = b as before. We then form the augmented matrix as below:

  A^G = \begin{pmatrix} a_{1,1} & a_{1,2} & \cdots & a_{1,n} & | & b_1 \\ a_{2,1} & a_{2,2} & \cdots & a_{2,n} & | & b_2 \\ \vdots & \vdots & \ddots & \vdots & | & \vdots \\ a_{m,1} & a_{m,2} & \cdots & a_{m,n} & | & b_m \end{pmatrix}
Next, the augmented matrix A^G is transformed into an upper triangular matrix A^U using elementary row operations. We then solve for the last unknown and use its value to obtain the remaining unknowns, one at a time, using backward substitution. Let us consider the following system of equations with three variables x_1, x_2, and x_3:

  x_1 + x_2 - x_3 = -2
  3x_1 - 2x_2 + x_3 = 7
  2x_1 - x_2 + 3x_3 = 9
Multiplying the first row by −2 and adding it to the third row yields the first matrix below; multiplying the first row by −3 and adding it to the second row then yields the second matrix. The final upper triangular matrix, obtained by multiplying the second row by −3/5 and adding it to the third row, is the last one:

  \begin{pmatrix} 1 & 1 & -1 & | & -2 \\ 3 & -2 & 1 & | & 7 \\ 0 & -3 & 5 & | & 13 \end{pmatrix} \rightarrow \begin{pmatrix} 1 & 1 & -1 & | & -2 \\ 0 & -5 & 4 & | & 13 \\ 0 & -3 & 5 & | & 13 \end{pmatrix} \rightarrow \begin{pmatrix} 1 & 1 & -1 & | & -2 \\ 0 & -5 & 4 & | & 13 \\ 0 & 0 & 13/5 & | & 26/5 \end{pmatrix}

Backward substitution now gives x_3 = 2 from the last row, then x_2 = -1 from the second row, and finally x_1 = 1 from the first row.
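A minimal sketch of this procedure in code is given below; it performs forward elimination without pivoting (sufficient for the small example above) followed by backward substitution, and the function name is an illustrative assumption.

# Gaussian elimination with backward substitution (sketch, no pivoting).
from fractions import Fraction

def gauss_solve(A, b):
    n = len(A)
    # Build the augmented matrix with exact rational arithmetic.
    M = [[Fraction(A[i][j]) for j in range(n)] + [Fraction(b[i])] for i in range(n)]
    # Forward elimination: zero out the entries below each pivot.
    for k in range(n - 1):
        for i in range(k + 1, n):
            factor = M[i][k] / M[k][k]
            for j in range(k, n + 1):
                M[i][j] -= factor * M[k][j]
    # Backward substitution.
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

# Example from the text: x1 = 1, x2 = -1, x3 = 2.
# print(gauss_solve([[1, 1, -1], [3, -2, 1], [2, -1, 3]], [-2, 7, 9]))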
Index
A
structures, 40
Algebraic graph algorithm, 369, 372 Alternating path, 265
BFS, 374 Approximation algorithm, 61
connectivity, 372 Articulation point, 222, 227
matching, 378 Auction algorithm, 292
Rabin–Vazirani algorithm, 380 Auction-based algorithm
matrices, 370 parallel, 295
minimum spanning tree, 377 Augmenting path, 265
shortest path, 376
Algorithm, 37 B
algebraic, 370 Backtracking, 62
approximation, 61 Batagelj and Zaversnik algorithm, 410
asymptotic analysis, 41 Bellman–Ford algorithm, 71, 203, 207, 376
complexity class, 53 Berge’s theorem, 266
NP, 54 BFS, 374
NP-complete, 55 Biconnected graph, 222
NP-hard, 55 Biological network, 418
P, 54 Bipartite vertex cover, 324
divide and conquer, 68 Bipartite weighted matching, 285
dynamic programming, 70 Block, 224
graph decomposition, 230
minimum spanning tree, 67 Hopcroft–Tarjan algorithm, 231
HITS, 438 Boruvka’s algorithm, 187, 194
minimal dominating set, 318 Branch and bound, 64
minimum spanning tree, 67 Bridge, 232
NP-completeness, 53 Tarjan’s bridge algorithm, 235
proof, 46 Bron and Kerbosch algorithm, 409
contradiction, 47
contrapositive, 47 C
direct, 46 Centrality, 401
induction, 48 closeness, 402
loop invariant, 49 degree, 401, 402
strong induction, 48 edge betweenness, 403
randomized, 58 eigenvalue, 406
Karger’s algorithm, 58 vertex betweenness, 403
recursive, 44 Clique, 306, 407
reductions, 50 Bron and Kerbosch algorithm, 409
F I
Finite-state machine, 120 Independent set, 52, 306
Flow-based matching, 272 Luby’s algorithm, 311
Floyd–Warshall algorithm, 208, 211, 213, Induction, 48
377 Internet, 434
Ford–Fulkerson algorithm, 246 routing, 435
hierarchical, 435
G link state, 435
Graph, 1
adjacency list, 30 K
adjacency matrix, 30 k-connectivity, 223, 258
algorithm, 2 König’s theorem, 269
distributed, 5 Kosaraju’s SCC algorithm, 237
large graph, 7 Kruskal’s algorithm, 183, 194
parallel, 4 Kuhn-Munkres algorithm, 289
sequential, 2
bipartite, 26 L
cartesian product, 23 Large graph, 395
complete, 24 analysis, 395
degree sequence, 19 Batagelj and Zaversnik algorithm, 410
directed, 25 centrality, 401
eigenvalue, 33 closeness, 402
incidence matrix, 32 degree, 401
intersection, 22 edge betweenness, 403
isomorphism, 21 eigenvalue, 406
Laplacian matrix, 34 vertex betweenness, 403
large graph, 7 clique, 407
line, 27 Bron and Kerbosch algorithm, 409
regular, 27 clustering, 412
subgraph, 20 degree distribution, 396
types, 24 density, 397
union, 22 k-cores, 410
vertex degree, 17 matching index, 399
weighted, 26 network models, 400
Graph algorithm, 37 random networks, 400
dynamic, 380 scale-free networks, 401
Graph matrices, 370 small world, 400
adjacency, 370 Loop invariant, 49
incidence, 370 Luby’s algorithm, 311
Laplacian, 371
Graph traversal, 139 M
breadth-first search, 163 MANET, 119
depth-first search, 150 Matching, 263, 378, 386, 389
Guha–Khuller algorithm, 320 algebraic, 378
Rabin–Vazirani algorithm, 380
H unweighted, 264
HITS algorithm, 436 bipartite, 268, 269
470 Index