Alg 07
Alg 07
Alg 07
7 Directed Graphs
Graphs or, more specifically, directed graphs are generalisations of the tree data structure. They can
represent a large number of real world problems – in particular, those that are concerned with networks
of interconnected objects. For example:
• Minimisation of costs of communications networks.
• Designing the layout of connections on a circuit board.
• Generating efficient assembly code for evaluating expressions.
• Flowcharts.
• Road or rail maps.
25 18
30 35
A 21
49 B
23
25
39
Suppose we want to find the shortest route between two towns on a map. We can draw a graph, such as
the one above, in which the nodes (or vertices) of the graph represent towns and the edges (or arcs)
represent roads between them. We can label each edge with the length or the road – the distance
between towns. A classic graph theoretic problem is to find the shortest path between two nodes, say A
and B, which could represent the shortest distance by road from town A to town B. It might equally
represent the cheapest flight from airport A to airport B if each node represents an airport, the edges are
the available flights, and the edge labels are the costs of those flights.
There are many special forms of graph and many algorithms for solving particular problems on them.
Many of these algorithms have very complex proofs of correctness. Many problems involving graphs
remain unsolved or only have very inefficient solutions. They form the core of “Theoretical Computer
Science”. We shall only scratch the surface here and look at:
• some definitions concerning graphs and their properties,
• some ways to implement graphs, and
• the solution to some simple problems involving graphs.
7.1 Definitions
• A directed graph (or digraph, or just graph) is a set of vertices, V, together with a set of ordered
pairs, E, of edges. Thus we write that a graph, G = <V, E>.
99
D0b Algorithmics: Data Structures and Data Types
• Each edge consists of two vertices in V and is represented diagrammatically by an arrow from the
first vertex to the second.
Here is an example of a graph with four vertices in V and four edges in E:
a
V = {a, b, c, d}
E = {{a, a}, {a, d}, {c, d}, {d, c}}
b c
This definition permits self-loops, i.e., edges of the form {v, v}, that begin and end at the same place.
Parallel edges, i.e., two identical edges in E, are prohibited however.
• A path is a sequence of vertices (v1, v2, …, vk), k ≥ 1, such that {vi, vi+1} ∈ E for all 1≤ i < k.
• A proper path is a path for which k ≥ 2.
100
D0b Algorithmics: Data Structures and Data Types
• If (v1, v2, …, vk) is a path, then we say that vk is reachable from v1. Every vertex is reachable from
itself.
• Two vertices v and w are strongly connected if w is reachable from v and v is reachable from w.
• A path (v1, v2, …, vk) is a cycle if v1 = vk.
• A path (v1, v2, …, vk) is simple if all vertices, v1, …, vk-1 are distinct. v1 can be equal (or not equal)
to vk. If v1 = vk then the path is a simple cycle.
• The length of a path is the number of edges it has.
The graph shown above has the following set of simple cycles:
{a}, {a, a}, {c, d, c}, {b}, {c}, {d}, and {d, c, d}.
The proper, simple cycles are:
{a, a}, {c, d, c} and {d, c, d}.
The only strongly connected pair of vertices is c and d.
We can think of a tree as a special type of graph <V, E> with a distinguished vertex called the root,
such that for every vertex v ∈V there is exactly one path (root, …, v). This is equivalent to saying that
each vertex has only one predecessor (c.f., parent in tree language).
• A subgraph of a directed graph G = <V, E> is a graph G’ = <V’, E’> such that V’⊂V and E’⊂E.
• A spanning tree of a digraph G = <V, E> is a subgraph T = <V’, E’> of G such that T is a rooted
tree and V’ = V.
Not every graph has a spanning tree – the graph used as an example above does not for example, since
there are no paths between vertex b and the other vertices.
Undirected graphs differ from directed graphs in that every edge connects two vertices in both
directions. An undirected graph can be defined in two ways:
1. A new object, like a graph, but with unordered edges. In-degree and out-degree of vertices are
replaced by just degree, which is the number of edges that have a particular vertex as an endpoint.
The road map example at the start of section 7 is an undirected graph (the roads are assumed to be 2
way streets). The degree of vertex A in that example is 2, while the degree of vertex B is 3.
2. A graph that satisfies the following: if (v1, v2) ∈E, then (v2, v1) ∈E. I.e., edges are always added in
pairs so that the two vertices in the edge are strongly connected.
101
D0b Algorithmics: Data Structures and Data Types
7.2 Implementation
There is a wide diversity of applications of graphs, which all require different operations to be
performed on the graph. In much the same way as trees, it is not sensible to try to define a graph ADT
that supports all these operations. In most cases the majority of these operations will not be used so the
class that implements such an ADT would be overcomplicated and cumbersome. Instead, we shall just
look at some data structures that can be used to represent graphs. For an actual implementation, a good
way to proceed is to define a minimal DiGraph class with basic low-level operations for constructing
graphs, such as insertion and deletion of vertices and edges. This class can then be extended to include
more sophisticated operations and algorithms that are required for particular applications.
There are many ways of representing graphs, but they are generally all variants of two methods:
• Adjacency Lists
• Adjacency Matrices
We will look at both of these graph data structures.
vlist
a
a c
null
b c
null d
c d b
null
d c
null null
102
D0b Algorithmics: Data Structures and Data Types
For each vertex, v, we keep a linked list of each w such that {v, w}∈E. We also store the vertices in a
linked list. The diagram above shows a digraph and the corresponding adjacency list data structure.
To implement a digraph in this way in Java, we use two member classes, Vertex and Edge – notice
that the structure above is built up from two slightly different objects, the link list nodes on the left that
represent the vertices and the nodes of the adjacency lists, which represent edges. Here is an outline of
the class:
public class DiGraphAL {
//Single protected data member which points to the top of the
//list of vertices.
protected Vertex vlist;
public DiGraphAL() {
vlist = null;
}
/**
* Insterts a vertex with label l into the graph.
*/
public void insertVertex(String l) {
103
D0b Algorithmics: Data Structures and Data Types
/**
* Inserts an edge from the vertex with label
* pred to that with label succ.
*/
public void insertEdge(String pred, String succ) {
Vertex v = vlist;
while(!v.label.equals(pred)) {
v = v.next;
}
The code above is incomplete and various checks should be included in the insert methods. The delete
methods have been omitted too, but these are important in a proper implementation. We add new
vertices at the start of the linked list of vertices (the easiest place to insert into a linked list) and,
similarly, we add new edges at the start of the edge lists.
Notice that the list of vertices is essentially a LUT. To perform an operation on a particular vertex, such
as add an edge, we need to chain through the list of vertices comparing vertex labels until we find the
one we are looking for. We are using a linear search strategy on a LUT whose entries consist of keys
that are the vertex labels and values that are lists of edges of that vertex.
If the edges have labels as well as the vertices, we can include an extra label field in the edge class.
We would also have to include an extra argument to the insertEdge method allowing the value of
the label to be set when the edge is created, for example:
public void insertEdge(String pred, String succ, int label) {
…
}
104
D0b Algorithmics: Data Structures and Data Types
a c null
b null
c d b null
d a null
We can simplify the general adjacency list data structure if we have information about the graph in
advance. Firstly, if we know how many vertices there are, we can use an array of vertices rather than a
linked list. Each element of the array contains a vertex and references a linked list of edges, as shown
above.
The Vertex class no longer requires the link field and so consists of just a label and a reference to an
Edge object.
If, in addition to the number of vertices, we also know how many edges there are in advance for each
vertex, then the adjacency lists themselves can also be replaced by arrays. For each slot in the array of
vertices, we store an array of edge objects each containing the label of a successor of that vertex and
possibly a label for the edge itself.
a c
b null
c d b
d a
We can also improve the implementation of the data structure by using a more efficient LUT, such as a
hash table, to locate the vertices in the graph.
105
D0b Algorithmics: Data Structures and Data Types
If we know an upper bound on the number of vertices, |V|, in a graph, then we can use a |V| x |V| matrix
(a 2D array) of Boolean values to store the graph:
a b c d
a F F T F TRUE if (v, w) ∈ E
A ( v, w ) =
b F F F F FALSE otherwise
c F T F T
d T F F F
If the edges are labelled, we can store the labels in the matrix rather than Boolean values. We need to
choose some “null” value to indicate that no edge is present, e.g., an empty string or a negative number.
We store the vertex labels in a LUT with the label as key and the index of the adjacency matrix as
value. In the example above, the LUT would contain the entries: {a, 0}, {b, 1}, {c, 2}, {d, 3}.
Here is an outline implementation of a digraph using an adjacency matrix:
import java.util.Hashtable;
//Initialise hashtable
vertLUT = new Hashtable(maxVerts*3/2);
nextVertIndex = 0;
}
106
D0b Algorithmics: Data Structures and Data Types
aMatrix[pIndex][sIndex] = true;
}
}
Exercise: can you see any problems with using a hashtable to store the vertex labels in this way? Think
about how you might implement a toString method for this class, for example. A solution might be to
use an additional protected data item, which is an array storing the vertex labels in order.
The operation of testing for the existence of an edge is generally a little more efficient with an
adjacency matrix, but at the expense of larger storage requirements.
The presence of cycles introduces a lot of complication in processing graphs, because of the danger of
infinite loops. Directed acyclic graphs (DAGs) are digraphs in which there are no cycles and these are
of special interest. They arise whenever activities need to be performed in some order. That order may
not be unique, but some actions need to be performed before others.
An example of this is time-tabling a set of courses in which there are prerequisites. We can express the
prerequisites of each course clearly and compactly using a DAG:
The labels of the nodes in the graph below are the names of courses and the edges indicate that one
course (the predecessor) is a prerequisite for another (the successor).
107
D0b Algorithmics: Data Structures and Data Types
B41 B10b
B10a B224
A topological ordering of the vertices of a DAG is a way to visit all the vertices that satisfies these
prerequisites.
Specifically, we want to find a sequence (v1, v2, …, vn) such that if there is an edge <vj, vk > then j < k.
108
D0b Algorithmics: Data Structures and Data Types
in-deg. 1 in-deg. 0
b b
c c c
in-deg. 1 in-deg. 0 in-deg. 0
l = {} l = {a} l = {a, b}
Traversal of a graph is similar to the idea of tree traversal. It means visiting every vertex in the graph
once and performing some operation at that vertex. There are two important ways of traversing graphs.
As usual, our “visits” will constitute a print operation, but this could be replaced by anything else. Here
is an outline of a depth first traversal of a digraph. These could be included as methods of a DiGraph
class.
public void depthFirstPrint() {
boolean visited[] = new boolean[number of vertices];
foreach vertex v in V {
if(!visited[indexOf(v)]) {
traverse(v, visited);
}
}
}
109
D0b Algorithmics: Data Structures and Data Types
if(!visited[indexOf(w)]) {
traverse(w, visited);
}
}
}
Note that we can “visit” the vertices in pre or post order. The algorithm above uses pre-order, since we
visit each vertex before traversing. We could move the System.out.println(v.label) after
the loop over the edges out of the vertex, to make it post-order.
Here is a graph showing the order of traversal using the pre-order depth first traversal above:
2 3
4
1 6 5
7 8
Post-order traversal visits the vertices in reverse topological order. Exercise: try this by hand on the
graph above starting with the left-most vertex.
foreach vertex v in V {
if(!visited[indexOf(v)]) {
while(!q.empty()) {
110
D0b Algorithmics: Data Structures and Data Types
2 5
8
1 3 6
4 7
In this final section, we look at some other algorithms for solving common problems concerning
graphs.
A natural question to ask about directed graphs in which the edges are labelled with costs is: what is the
shortest path from one vertex to another, i.e., the path that incurs the least total cost.
b
2 5
3
a 6
c
10
4 6 2
1
e 2 112 d
D0b Algorithmics: Data Structures and Data Types
Suppose we want to get from vertex b to vertex c in the labelled digraph above.
Path: {(b,c)} = cost of 5
Path: {(b, d), (d, c)} = cost of 3 + cost of 1 = cost of 4.
Path: {(b, a), (a, c)} = cost of 2 + cost of 6 = cost of 8.
There are other even longer paths, the “shortest” path is the one with the smallest cost, which in this
case is the second path above, which is not the most direct path.
Evaluation of the shortest path involves incrementally adding vertices to a set S of vertices whose
shortest path from S is known. As a corollary, we find the shortest path to all the other vertices too.
Suppose we want to find the shortest paths from vertex a to all the others. A set S contains vertices
whose shortest path from a are known (indicated by the boolean array distanceFound in the
algorithm below). The distance is stored in an array of distances, distance.
We consider every vertex w for which the distance is not yet known, but for which it can be
determined, as a candidate to add to the set S.
This candidate shortest path must be the one which minimises distance(a, v) + cost(v, w), over the set
containing each v already in S.
Here is an outline of the algorithm:
public shortestPath(Vertex v1, Vertex v2)
boolean[] distanceFound = new boolean[number of vertices];
float[] distance = new float[number of vertices];
Vertex v, w;
float min;
float infinity = some very large number;
v = v1;
//Main loop.
for(int i=0; i<no. of verts; i++) {
min = infinity;
//Find element of V\S with smallest distance.
foreach w in V {
if(!distanceFound[indexOf(w)] &&
distance[indexOf(w)] < min) {
v = w;
min = distance[indexOf(w)];
}
113
D0b Algorithmics: Data Structures and Data Types
//Add to set S
distanceFound[indexOf(v)] = true;
//Update distances
foreach w for which (v, w) is in E {
if(min + cost(v,w) < distance[indexOf(w)]) {
distance[indexOf(w)] = min + cost(v,w);
}
}
}
return distance[indexOf(v2)];
}
This is a version of Dijkstra’s algorithm. It provides the cost of the shortest path from the specified
vertex to all the others. No computational effort can be saved by just computing the cost to one,
specific other vertex. Note that if you need to get the actual paths themselves, an array of predecessors
needs to be maintained through the algorithm.
A more detailed description of shortest path algorithms can be found in Kruse, et al; Aho, et al; and
Kingston.
114
This document was created with Win2PDF available at http://www.daneprairie.com.
The unregistered version of Win2PDF is for evaluation or non-commercial use only.