Graphs Using Adjacency Matrix and List

Lecture 23

Representing Graphs

15-122: Principles of Imperative Computation (Fall 2018)


Frank Pfenning, André Platzer, Rob Simmons,
Penny Anderson, Iliano Cervesato

In this lecture we introduce graphs. Graphs provide a uniform model for
many structures, for example, maps with distances or Facebook relationships.
Algorithms on graphs are therefore important to many applications.
They will be a central subject in the algorithms courses later in the
curriculum; here we only provide a very basic foundation for graph algorithms.
With respect to our learning goals we will look at the following notions.

Computational Thinking: We get a taste of the use of graphs in computer
science. We note that some graphs are represented explicitly while
others are kept implicit.

Algorithms and Data Structures: We see two basic ways to represent graphs:
using adjacency matrices and by means of adjacency lists.

Programming: We use linked lists to give an adjacency list implementation
of graphs.

LECTURE NOTES
© Carnegie Mellon University 2018

1 Undirected Graphs
We start with undirected graphs which consist of a set V of vertices (also
called nodes) and a set E of edges, each connecting two different vertices.
The following is a simple example of an undirected graph with 5 vertices
(A, B, C, D, E) and 6 edges (AB, BC, CD, AE, BE, CE):

We don’t distinguish between the edge AB and the edge BA because
we’re treating graphs as undirected. There are many ways of defining
graphs with slight variations. Because we specified above that each edge
connects two different vertices, no graph in this course can have an edge
from a vertex back to itself (a self-loop).

2 Implicit Graphs
There are many, many different ways to represent graphs. In some
applications they are never explicitly constructed but remain implicit in the
way the problem is solved. The game of Lights Out is one example of
a situation that implicitly describes an undirected graph. Lights Out is an
electronic game consisting of a grid of lights, usually 5 by 5. The lights are
initially set in some pattern of on and off, and the objective of the game
is to turn all the lights off. The player interacts with the game by touching
a light, which toggles its state and the state of all its cardinally adjacent
neighbors (up, down, left, right).

We can think of Lights Out as an implicit graph with 2^25 vertices, one for
every possible configuration of the 5x5 Lights Out board, and an edge between
two vertices if we can transition from one board to another with a single
button press. If we transition from one board to another by pressing a
button, we can return to the first board by pressing the same button. Therefore
the graph is undirected.

Each of the 2^25 vertices is therefore connected to 25 different edges (one
per button), giving us 25 × 2^25 / 2 total edges in this graph. We divide
by 2 because going to a node and coming back from it are expressed by the
same edge. But because the graph is implicit in the description of the
Lights Out game, we don’t have to actually store all 2^25 (about 33.5
million) vertices and roughly 419 million edges in memory to understand
Lights Out.
An advantage to thinking about Lights Out as a graph is that we can
think about the game in terms of graph algorithms. Asking whether we
can get all the lights out for a given board is asking whether the vertex
representing our starting board is connected to the board with all the lights
out by a series of edges: a path. We’ll talk more about this graph reachability
question in the next lecture.
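
To make the idea of an implicit graph concrete, here is a minimal sketch (not part of the course code) of how the neighbors of a vertex could be computed on demand instead of being stored. It assumes a board is packed into the low 25 bits of a uint32_t; the names board, press, lights_out_vertices and lights_out_edges are hypothetical and chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SIDE 5  // the board is SIDE x SIDE

// Assumed encoding: bit (r * SIDE + c) is 1 exactly when the light
// in row r, column c is on.
typedef uint32_t board;

// Follow one edge of the implicit graph: press the button at (r, c),
// toggling that light and its up/down/left/right neighbors.
board press(board b, int r, int c) {
  assert(0 <= r && r < SIDE && 0 <= c && c < SIDE);
  b ^= (board)1 << (r * SIDE + c);
  if (r > 0)        b ^= (board)1 << ((r - 1) * SIDE + c);
  if (r < SIDE - 1) b ^= (board)1 << ((r + 1) * SIDE + c);
  if (c > 0)        b ^= (board)1 << (r * SIDE + (c - 1));
  if (c < SIDE - 1) b ^= (board)1 << (r * SIDE + (c + 1));
  return b;
}

// Number of vertices: one per on/off configuration, i.e. 2^25.
uint64_t lights_out_vertices(void) {
  return (uint64_t)1 << (SIDE * SIDE);
}

// Number of edges: 25 per vertex, each shared by its two endpoints.
uint64_t lights_out_edges(void) {
  return (uint64_t)(SIDE * SIDE) * lights_out_vertices() / 2;
}
```

Pressing the same button twice gives back the original board, that is, press(press(b, r, c), r, c) == b for any board b, which is the reason the graph is undirected.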

3 Explicit Graphs and a Graph Interface


Sometimes we do want to represent a graph as an explicit set of edges and
vertices and in that case we need a graph datatype. In the C code that
follows, we’ll refer to our vertices with unsigned integers. The minimal
interface for graphs in Figure 1 allows us to create and free graphs, check
whether an edge exists in the graph, add a new edge to the graph, and get
and free a list of the neighbors of a node.
We use the C0 notation for contracts on the interface functions here.
Even though C compilers do not recognize the @requires contract and
will simply discard it as a comment, the contract still serves an important
role for the programmer reading the program. For the graph interface, we
decide that it does not make sense to add an edge into a graph when that
edge is already there, hence the last precondition of graph_addedge. A
neighbor list is just a linked list of vertices.

#include <stdbool.h>   // for bool

typedef unsigned int vertex;
typedef struct graph_header *graph_t;

graph_t graph_new(unsigned int numvert);
//@ensures \result != NULL;

void graph_free(graph_t G);
//@requires G != NULL;

unsigned int graph_size(graph_t G);
//@requires G != NULL;

bool graph_hasedge(graph_t G, vertex v, vertex w);
//@requires G != NULL;
//@requires v < graph_size(G) && w < graph_size(G);

void graph_addedge(graph_t G, vertex v, vertex w);
//@requires G != NULL;
//@requires v < graph_size(G) && w < graph_size(G);
//@requires v != w && !graph_hasedge(G, v, w);

typedef struct vert_list_node vert_list;
struct vert_list_node {
  vertex vert;
  vert_list *next;
};

vert_list* graph_get_neighbors(graph_t G, vertex v);
//@requires G != NULL;
//@requires v < graph_size(G);

void graph_free_neighbors(vert_list* neighbors);

Figure 1: A simple graph interface (graph.h)

With this minimal interface, we can create a graph for what will be our
running example (letting A = 0, B = 1, and so on):

graph_t G = graph_new(5);
graph_addedge(G, 0, 1); // AB
graph_addedge(G, 1, 2); // BC
graph_addedge(G, 2, 3); // CD
graph_addedge(G, 0, 4); // AE
graph_addedge(G, 1, 4); // BE
graph_addedge(G, 2, 4); // CE
We could implement the graph interface in Figure 1 in a number of
ways. In the simplest form, a graph with e edges can be represented as
a linked list or array of edges. In the linked list implementation, it takes
O(1) time to add an edge to the graph with graph_addedge, because it
can be added to the front of the linked list. Finding whether an edge
exists in a graph with e edges might require traversing the whole linked
list, so graph_hasedge is an O(e) operation. Getting the neighbors of a
node would also take O(e), since every edge has to be examined.
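
The costs just described can be seen in a small sketch of the linked-list-of-edges idea. This is an illustration under assumptions, not the course's code: the names edge_list, edgelist_addedge, edgelist_hasedge and edgelist_free are hypothetical, and plain malloc with an assert stands in for the course's xmalloc.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef unsigned int vertex;

typedef struct edge_node edge_list;
struct edge_node {
  vertex v;          // one endpoint
  vertex w;          // the other endpoint
  edge_list *next;
};

// O(1): allocate one node and put it at the front of the list.
// Returns the new head of the list.
edge_list *edgelist_addedge(edge_list *E, vertex v, vertex w) {
  edge_list *node = malloc(sizeof(edge_list));
  assert(node != NULL);  // assumption: abort on allocation failure
  node->v = v;
  node->w = w;
  node->next = E;
  return node;
}

// O(e): in the worst case we walk the entire list.
// The graph is undirected, so (v, w) and (w, v) are the same edge.
bool edgelist_hasedge(edge_list *E, vertex v, vertex w) {
  for (edge_list *p = E; p != NULL; p = p->next) {
    if ((p->v == v && p->w == w) || (p->v == w && p->w == v)) return true;
  }
  return false;
}

// O(e): free every node of the list.
void edgelist_free(edge_list *E) {
  while (E != NULL) {
    edge_list *next = E->next;
    free(E);
    E = next;
  }
}
```

Collecting the neighbors of a vertex in this representation would need the same full traversal as edgelist_hasedge, which is where its O(e) cost comes from.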
Hashtables and balanced binary search trees would be our standard
tools in this class for representing sets of edges more efficiently. Instead
of taking that route, we will discuss two classic structures for directly
representing graphs.

4 Adjacency Matrices
One simple way is to represent the graph as a two-dimensional array that
describes its edge relation as follows.

         A (0)  B (1)  C (2)  D (3)  E (4)
A (0)      .      ✓      .      .      ✓
B (1)      ✓      .      ✓      .      ✓
C (2)      .      ✓      .      ✓      ✓
D (3)      .      .      ✓      .      .
E (4)      ✓      ✓      ✓      .      .

There is a checkmark in the cell at row v and column v′ exactly when there
is an edge between nodes v and v′. This representation of a graph is called
an adjacency matrix, because it is a matrix that stores which nodes are
neighbors.
We can check if there is an edge from B (= 1) to D (= 3) by looking for
a checkmark in row 1, column 3. In an undirected graph, the top-right
half of this two-dimensional array will be a mirror image of the bottom-
left, because the edge relation is symmetric. Because we disallowed edges
between a node and itself, there are no checkmarks on the main diagonal
of this matrix.
The adjacency matrix representation requires a lot of space: for a graph
with v vertices we must allocate space in O(v^2). However, the benefit of the
adjacency matrix representation is that adding an edge (graph_addedge)
and checking for the existence of an edge (graph_hasedge) are both O(1)
operations.
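
The following sketch illustrates why both operations are constant time: each one reads or writes a fixed number of array cells. It is a partial illustration under assumptions rather than an implementation of the full interface: the names mgraph, mgraph_new, mgraph_hasedge, mgraph_addedge and mgraph_free are hypothetical, the matrix is stored as one flat row-major array, and plain malloc/calloc with assert stand in for the course's allocation helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef unsigned int vertex;

typedef struct matrix_graph mgraph;
struct matrix_graph {
  unsigned int size;  // number of vertices
  bool *adj;          // size * size cells; cell (v, w) is adj[v * size + w]
};

// Allocates O(v^2) space; calloc zeroes it, so no edges exist initially.
mgraph *mgraph_new(unsigned int size) {
  assert(size > 0);   // assumption made to keep the sketch simple
  mgraph *G = malloc(sizeof(mgraph));
  assert(G != NULL);
  G->size = size;
  G->adj = calloc((size_t)size * size, sizeof(bool));
  assert(G->adj != NULL);
  return G;
}

// O(1): a single array lookup.
bool mgraph_hasedge(mgraph *G, vertex v, vertex w) {
  assert(G != NULL && v < G->size && w < G->size);
  return G->adj[(size_t)v * G->size + w];
}

// O(1): two array writes keep the matrix symmetric.
void mgraph_addedge(mgraph *G, vertex v, vertex w) {
  assert(G != NULL && v < G->size && w < G->size && v != w);
  G->adj[(size_t)v * G->size + w] = true;
  G->adj[(size_t)w * G->size + v] = true;
}

void mgraph_free(mgraph *G) {
  assert(G != NULL);
  free(G->adj);
  free(G);
}
```

Writing both cells in mgraph_addedge is what makes the top-right half of the matrix mirror the bottom-left half.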
Are the space requirements for adjacency matrices (requires space in
O(v^2)) worse than the space requirements for storing all the edges in a
linked list (requires space in O(e))? That depends on the relationship
between v, the number of vertices, and e, the number of edges. A graph with
v vertices has between 0 and (v choose 2) = v(v−1)/2 edges. If most of the
edges exist, so that the number of edges is proportional to v^2, we say the
graph is dense. For a dense graph, O(e) = O(v^2), and so adjacency matrices
are a good representation strategy for dense graphs, because in big-O terms
they don’t take up more space than storing all the edges in a linked list,
and operations are much faster.

5 Adjacency Lists
If a graph is not dense, then we say the graph is sparse. The other classic
representation of graphs, adjacency lists, can be a good representation of
sparse graphs.
In an adjacency list representation, we have a one-dimensional array
that looks much like a hash table. Each vertex has a spot in the array, and
each spot in the array contains a linked list of all the other vertices
connected to that vertex. Our running example would look like this as an
adjacency list:

Adjacency lists require O(v + e) space to represent a graph with v vertices
and e edges: we have to allocate a single array of length v and then
allocate two list entries per edge. The complexity class O(v + e) is often
written as O(max(v, e)) (we leave it as an exercise to check that these two
classes are equivalent) and therefore this is the notation we will typically
use. Adding an edge is still constant time, but lookup (graph_hasedge)
now takes time in O(min(v, e)), since min(v − 1, e) is the maximum length
of any single adjacency list. Finding the neighbors of a node is immediate
with an adjacency list representation, as we simply return the adjacency list
of that node; this has cost O(1). This is in contrast with the adjacency
matrix representation, where we are forced to check every value on the row
of the matrix corresponding to that node.
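
For readers who want to check their answer to the exercise on O(v + e) and O(max(v, e)), one possible argument rests on a pair of inequalities that hold for all non-negative v and e:

```latex
\max(v, e) \;\le\; v + e \;\le\; 2 \cdot \max(v, e)
```

The left inequality shows that any function in O(max(v, e)) is also in O(v + e). The right inequality shows the converse, because the constant factor 2 is absorbed by big-O.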

The following table summarizes and compares the asymptotic cost
associated with the adjacency matrix and adjacency list implementations of a
graph, under the assumptions used in this chapter.

                       Adjacency Matrix    Adjacency List
Space                  O(v^2)              O(max(v, e))
graph_hasedge          O(1)                O(min(v, e))
graph_addedge          O(1)                O(1)
graph_get_neighbors    O(v)                O(1)

The cost of graph_hasedge can be reduced by storing the neighbors of each
node not in a linked list but in a more search-efficient data structure, for
example an AVL tree or a hash set. Of course, doing so requires additional
space, something that may not be desirable in some applications. It also
comes at the expense of graph_get_neighbors.

6 Adjacency List Implementation


The header for a graph is a struct with two fields: the first is an unsigned
integer representing the actual size, and the second is an array of adjacency
lists. We use the vertex list from the graph interface as our adjacency list.
typedef vert_list adjlist;

typedef struct graph_header graph;
struct graph_header {
  unsigned int size;   // number of vertices
  adjlist **adj;       // array of `size` adjacency lists
};
We leave it as an exercise to the reader to define the representation
functions
bool is_vertex(graph *G, vertex v);
bool is_graph(graph *G);
that check that a vertex is valid for a given graph and that a graph itself is
valid.
We can allocate the struct for a new graph using xmalloc, since we’re
going to have to initialize both its fields anyway. But we definitely want to
allocate the array of adjacency lists using xcalloc to make sure that it is
initialized to an array full of NULL values: empty adjacency lists.
graph *graph_new(unsigned int size) {
  graph *G = xmalloc(sizeof(graph));
  G->size = size;
  G->adj = xcalloc(size, sizeof(adjlist*));  // every list starts out NULL
  ENSURES(is_graph(G));
  return G;
}
Given two vertices, we have to search through the whole adjacency list
of one vertex to see if it contains the other vertex. This is what gives the
operation a running time in O(min(v, e)).
bool graph_hasedge(graph *G, vertex v, vertex w) {
  REQUIRES(is_graph(G) && is_vertex(G, v) && is_vertex(G, w));

  // Walk v's adjacency list looking for w
  for (adjlist *L = G->adj[v]; L != NULL; L = L->next) {
    if (L->vert == w) return true;
  }
  return false;
}
Because we assume an edge must not already exist when we add it to
the graph, we can add an edge in constant time:
void graph_addedge(graph *G, vertex v, vertex w) {
  REQUIRES(is_graph(G) && is_vertex(G, v) && is_vertex(G, w));
  REQUIRES(v != w && !graph_hasedge(G, v, w));

  adjlist *L;

  L = xmalloc(sizeof(adjlist));  // add w as a neighbor of v
  L->vert = w;
  L->next = G->adj[v];
  G->adj[v] = L;

  L = xmalloc(sizeof(adjlist));  // add v as a neighbor of w
  L->vert = v;
  L->next = G->adj[w];
  G->adj[w] = L;

  ENSURES(is_graph(G));
  ENSURES(graph_hasedge(G, v, w));
}
Finding the neighbors of a vertex is just a matter of returning its adja-
cency list.
vert_list *graph_get_neighbors(graph *G, vertex v) {
  REQUIRES(is_graph(G) && is_vertex(G, v));
  return G->adj[v];  // an alias into the graph, not a copy
}
It is tempting to implement the operation graph_free_neighbors so
that it frees every node in its input neighbor list. But this would destroy
our representation, since graph_get_neighbors returned an alias into the
adjacency list representation of the graph. Instead, we shall define this
function so that it does nothing:
void graph_free_neighbors(vert_list *L) {
  (void)L;
}
Here, (void)L serves the purpose of fooling the compiler into believing
that this function is using the variable L. Without it, our standard
compilation flags would cause it to report an error.
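
The interface in Figure 1 also declares graph_free, whose implementation is not shown in this section. One possible sketch for the adjacency list representation follows. It is an assumption-laden illustration: it repeats the type definitions so that it compiles on its own, it omits the is_graph contract, and the helper adjlist_free (which returns the number of nodes it freed, purely to make the sketch easy to check) is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int vertex;

typedef struct vert_list_node vert_list;
struct vert_list_node {
  vertex vert;
  vert_list *next;
};
typedef vert_list adjlist;

typedef struct graph_header graph;
struct graph_header {
  unsigned int size;
  adjlist **adj;
};

// Free every node of one adjacency list; returns how many were freed.
size_t adjlist_free(adjlist *L) {
  size_t count = 0;
  while (L != NULL) {
    adjlist *next = L->next;  // save the link before freeing the node
    free(L);
    L = next;
    count++;
  }
  return count;
}

// Free the lists, then the array that held them, then the header.
void graph_free(graph *G) {
  assert(G != NULL);
  for (unsigned int v = 0; v < G->size; v++) {
    adjlist_free(G->adj[v]);
  }
  free(G->adj);
  free(G);
}
```

Under these assumptions graph_free visits each of the v array slots and each of the 2e list nodes once, so its cost is O(v + e).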

7 Iterating through a Graph


To gain practice with our graph interface, we write a function
that prints all the edges in a graph. Give it a try before reading the
solution below. This function has the following prototype:
void graph_print(graph_t G);
Our implementation is as follows:


void graph_print(graph_t G) {
  for (vertex v = 0; v < graph_size(G); v++) {
    printf("Vertices connected to %u: ", v);
    vert_list *nbors = graph_get_neighbors(G, v);
    for (vert_list *p = nbors; p != NULL; p = p->next) {
      vertex w = p->vert;  // w is a neighbor of v
      printf(" %u,", w);
    }
    graph_free_neighbors(nbors);
    printf("\n");
  }
}
The outer loop examines all the vertices in the graph. For each of them, we
compute its neighbor list and then go through it in the inner loop to print
its elements. We call the function graph_free_neighbors to dispose of the
neighbor list once we are done with it. In our adjacency list representation,
this call does nothing, but this won’t be the case for, say, an adjacency
matrix representation, which typically has to build a fresh neighbor list on
every call to graph_get_neighbors. Not calling it may cause a memory leak
depending on which implementation we use.
It is interesting to analyze the complexity of graph_print on a graph
containing v vertices and e edges. The outer loop runs v times. Inside this
loop, the following operations take place:
• Some print statements that we may assume have cost O(1).

• A call to graph_get_neighbors, whose cost is constant in the adjacency
  list representation but O(v) in the adjacency matrix representation.
  Up to this point in the code, the cost of our function is O(v) in
  the former representation and O(v^2) in the latter.

• The inner loop, whose body performs constant cost operations. In
  isolation, the body of this loop runs O(v) times since each vertex can
  have up to v − 1 neighbors. Thus, a naive analysis gives us an O(v^2)
  worst case complexity for graph_print with both representations up
  to this point in the code.
  However, each neighbor corresponds to an edge in the graph. Therefore,
  the body of the inner loop will be executed exactly 2e times
  total over an entire run of graph_print: each edge is examined
  twice, once from each of its endpoints. Thus the inner loop has cost
  O(e) overall. Adding this to our tally, the cost of graph_print to
  this point in our analysis is O(max(v, e)), which we recall is the
  common way of writing O(v + e), in the adjacency list representation,
  and O(max(v^2, e)) for the adjacency matrix representation. Since
  e ∈ O(v^2) for any graph, the latter expression simplifies to O(v^2).

• A call to graph_free_neighbors. This has constant cost in the adjacency
  list representation. By the same reasoning we just performed
  for the inner loop, this will have cost O(e) overall in the adjacency
  matrix representation.

Summarizing, our analysis tells us that graph_print has cost O(max(v, e))
in the adjacency list representation and O(v^2) with the adjacency matrix
representation. For a dense graph, where e is proportional to v^2, these two
expressions are equivalent. For a sparse graph, the former can be
significantly cheaper.
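
The observation that the inner loop body runs exactly 2e times can be turned into a small client function that counts the edges of a graph using only the interface. The sketch below is an illustration under assumptions: graph_count_edges and add_neighbor are hypothetical names, and the compact adjacency list implementation above it is a stand-in (with assert replacing the course's contracts and allocation helpers) included only so that the block compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

typedef unsigned int vertex;
typedef struct vert_list_node vert_list;
struct vert_list_node { vertex vert; vert_list *next; };

typedef struct graph_header *graph_t;
struct graph_header { unsigned int size; vert_list **adj; };

// --- Compact stand-in for the adjacency list implementation ---
graph_t graph_new(unsigned int numvert) {
  assert(numvert > 0);  // assumption made to keep the sketch simple
  graph_t G = malloc(sizeof(struct graph_header));
  assert(G != NULL);
  G->size = numvert;
  G->adj = calloc(numvert, sizeof(vert_list*));
  assert(G->adj != NULL);
  return G;
}

unsigned int graph_size(graph_t G) { return G->size; }

void add_neighbor(graph_t G, vertex v, vertex w) {
  vert_list *L = malloc(sizeof(vert_list));
  assert(L != NULL);
  L->vert = w;
  L->next = G->adj[v];
  G->adj[v] = L;
}

void graph_addedge(graph_t G, vertex v, vertex w) {
  assert(v < G->size && w < G->size && v != w);
  add_neighbor(G, v, w);
  add_neighbor(G, w, v);
}

vert_list *graph_get_neighbors(graph_t G, vertex v) { return G->adj[v]; }
void graph_free_neighbors(vert_list *L) { (void)L; }

// --- The client function: uses only the interface ---
// Every edge shows up in two neighbor lists, so the total number of
// list entries visited is 2e; halving it gives the edge count.
unsigned int graph_count_edges(graph_t G) {
  unsigned int visits = 0;
  for (vertex v = 0; v < graph_size(G); v++) {
    vert_list *nbors = graph_get_neighbors(G, v);
    for (vert_list *p = nbors; p != NULL; p = p->next) {
      visits++;
    }
    graph_free_neighbors(nbors);
  }
  return visits / 2;
}
```

With this stand-in implementation, graph_count_edges has the same loop structure as graph_print and therefore the same O(max(v, e)) cost.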

Exercises
Exercise 1. Define the representation functions is_graph and is_vertex (and
any other you may need) used in the contracts of the adjacency list
implementation in Section 6 of the graph interface of Section 3.

Exercise 2. Give an implementation of the graph interface in Section 3 based on
adjacency matrices. Make sure to provide adequate representation functions.
