Ads 3 Part 1
Data Structures
SYLLABUS
Basic Concepts, Storage representation, Adjacency matrix,
adjacency list, adjacency multi list, inverse adjacency list.
Traversals-depth first and breadth first, Introduction to Greedy
Strategy, Minimum spanning Tree, Greedy algorithms for
computing minimum spanning tree - Prim's and Kruskal's
algorithms, Dijkstra's single-source shortest path, Topological
ordering.
Case study- Data structure used in Webgraph and Google map.
WHAT IS A GRAPH?
A data structure that consists of a set of nodes (vertices)
and a set of edges that relate the nodes to each other
The set of edges describes relationships among the vertices
An edge is a pair of vertices.
If two vertices are connected by an edge, then we say
these two vertices are adjacent
A graph may not have an edge from a vertex to itself, i.e. <v, v> (no self-loops)
A graph may not have multiple occurrences of the same edge
Example
[Figure: undirected graph with 5 vertices (A, B, C, D, E) and 7 edges, with the vertices and edges labelled]
Maximum number of edges in an undirected graph with n vertices: n(n-1)/2
If a graph has exactly n(n-1)/2 edges, then it is a complete graph
DIRECTED VS. UNDIRECTED GRAPHS
When the edges in a graph have a direction, the graph is
called directed (or digraph)
If the graph is directed, the order of the vertices in each edge is important
• The degree of a vertex is the number of
edges incident to that vertex
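• For directed graph,
• the in-degree of a vertex v is the number of edges that have v as the head (edges entering v)
• the out-degree of a vertex v is the number of edges that have v as the tail (edges leaving v)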
GRAPH TERMINOLOGY
MORE TERMINOLOGY
simple path: no repeated vertices
cycle: simple path, except that the last vertex is the same as the first vertex
[Figure: example graph on vertices a, b, c, d, e, with the simple path b e c and the cycle a c d a marked]
EVEN MORE TERMINOLOGY
• connected graph: any two vertices are connected by some path
[Figure: two example graphs, one labelled "connected" and one labelled "not connected"]
• subgraph: subset of vertices and edges forming a graph
• connected component: maximal connected subgraph. E.g., the graph below has 3 connected components.
[Figure: graph with 3 connected components]
GRAPH TERMINOLOGY (CONT.)
What is the number of edges in a complete
directed graph with N vertices?
N * (N-1)
What is the number of edges in a complete
undirected graph with N vertices?
N * (N-1) / 2
Weighted graph: a graph in which each edge
carries a value
ABSTRACT DATA TYPE
objects: a nonempty set of vertices and a set of
undirected edges, where each edge is a pair of vertices
Graph DeleteVertex(graph, v)::= return a graph in
which v and all edges incident to it are removed
GRAPH TRAVERSAL
Use the same depth-first and breadth-first traversal algorithms
seen for the binary trees.
Differences between graph and tree traversals:
1) Tree traversal always visits all the nodes in the tree
2) Graph traversal visits all the nodes in the graph only when it is
connected. Otherwise it visits only a subset of the nodes. This subset
is called the connected component of the start vertex.
Recursive and iterative implementations of the algorithms.
Iterative: Use a stack for the depth-first search (dfs)
Use a queue for the breadth-first search (bfs)
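A single traversal reaches only one connected component, so repeating it from every vertex that has not been reached yet lists all components. The sketch below is one possible way to do this for an undirected graph; the function name connected_components and the dictionary-of-lists input are illustrative assumptions, not part of the slides.

```python
from collections import deque


def connected_components(adj):
    """Return the connected components of an undirected graph.

    adj is assumed to map every vertex to a list of its neighbours.
    """
    seen = set()
    components = []
    for start in adj:
        if start in seen:
            continue                      # already part of an earlier component
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:                      # breadth-first traversal from start
            v = queue.popleft()
            component.append(v)
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(component)
    return components


# Hypothetical graph with three components
adj = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"], "E": []}
print(connected_components(adj))  # [['A', 'B'], ['C', 'D'], ['E']]
```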
GRAPH IMPLEMENTATION
Array-based implementation
⚫ A 1D array is used to represent the vertices
⚫ A 2D array (adjacency matrix) is used to represent the edges
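As a minimal sketch of the array-based idea, the class below keeps a 1D list of vertex labels and a 2D 0/1 matrix for an undirected, unweighted graph. The class name MatrixGraph and its methods are hypothetical names chosen for illustration.

```python
class MatrixGraph:
    """Undirected graph: a 1D list of vertices plus a 2D adjacency matrix."""

    def __init__(self, labels):
        self.labels = list(labels)                         # 1D array of vertices
        self.index = {v: i for i, v in enumerate(self.labels)}
        n = len(self.labels)
        self.matrix = [[0] * n for _ in range(n)]          # 2D array of edges

    def add_edge(self, u, v):
        i, j = self.index[u], self.index[v]
        self.matrix[i][j] = 1
        self.matrix[j][i] = 1                              # symmetric: undirected

    def has_edge(self, u, v):
        # Constant-time adjacency test: a single matrix lookup
        return self.matrix[self.index[u]][self.index[v]] == 1

    def degree(self, v):
        # Number of edges incident to v = number of 1s in v's row
        return sum(self.matrix[self.index[v]])


g = MatrixGraph("ABCD")
g.add_edge("A", "B")
g.add_edge("A", "C")
print(g.has_edge("B", "A"))  # True
print(g.degree("A"))         # 2
```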
ARRAY-BASED IMPLEMENTATION
GRAPH IMPLEMENTATION (CONT.)
Linked-list implementation
⚫ A 1D array is used to represent the vertices
⚫ A list is used for each vertex v which contains the
vertices which are adjacent from v (adjacency list)
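The linked-list idea can be sketched with one list per vertex. The example below assumes a directed graph and uses Python lists in place of linked lists; ListGraph and its method names are illustrative, not taken from the slides.

```python
class ListGraph:
    """Directed graph: each vertex maps to the list of vertices adjacent from it."""

    def __init__(self):
        self.adj = {}

    def add_edge(self, u, v):
        self.adj.setdefault(u, []).append(v)   # edge <u, v>: v is adjacent from u
        self.adj.setdefault(v, [])             # make sure v is also a known vertex

    def neighbors(self, v):
        return list(self.adj.get(v, []))

    def out_degree(self, v):
        return len(self.adj.get(v, []))

    def in_degree(self, v):
        # Requires a scan of every list; an inverse adjacency list avoids this
        return sum(targets.count(v) for targets in self.adj.values())


g = ListGraph()
g.add_edge("A", "B")
g.add_edge("A", "C")
g.add_edge("C", "B")
print(g.neighbors("A"))   # ['B', 'C']
print(g.in_degree("B"))   # 2
print(g.out_degree("B"))  # 0
```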
LINKED-LIST IMPLEMENTATION
ADJACENCY MATRIX VS. ADJACENCY LIST REPRESENTATION
Adjacency matrix
⚫ Good for dense graphs
⚫ Memory requirement: O(V²), one entry for every possible pair of vertices
⚫ Connectivity between two vertices can be tested quickly (constant time)
Adjacency list
⚫ Good for sparse graphs
⚫ Memory requirement: O(V + E), entries only for the edges that exist
⚫ Vertices adjacent to another vertex can be found quickly
GRAPH SEARCHING
DEPTH-FIRST-SEARCH (DFS)
What is the idea behind DFS?
⚫ Travel as far as you can down a path
⚫ Back up as little as possible when you reach a "dead end"
(i.e., next vertex has been "marked" or there is no next
vertex)
DFS can be implemented efficiently using a
stack
DEPTH-FIRST-SEARCH (DFS) (CONT.)
Set found to false
stack.Push(startVertex)
DO
    stack.Pop(vertex)
    IF vertex == endVertex
        Set found to true
    ELSE IF vertex is not marked
        Mark vertex
        Push all unmarked adjacent vertices onto stack
WHILE !stack.IsEmpty() AND !found
IF(!found)
    Write "Path does not exist"
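The stack-based search can be sketched in Python as follows. This is one possible rendering, assuming the graph is given as a dictionary of adjacency lists; the function name dfs_path_exists and the sample graph are illustrative. Vertices are marked when popped so that cycles cannot cause an endless loop.

```python
def dfs_path_exists(adj, start, end):
    """Iterative depth-first search; returns True if end is reachable from start."""
    marked = set()
    stack = [start]
    while stack:
        vertex = stack.pop()
        if vertex == end:
            return True
        if vertex in marked:
            continue                      # dead end: already explored
        marked.add(vertex)
        for neighbor in adj.get(vertex, []):
            if neighbor not in marked:
                stack.append(neighbor)
    return False                          # "Path does not exist"


# Hypothetical directed graph containing the cycle A -> B -> D -> A
adj = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["A"], "E": ["A"]}
print(dfs_path_exists(adj, "A", "D"))  # True
print(dfs_path_exists(adj, "A", "E"))  # False
```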
WALK-THROUGH
[Figure: DFS walk-through on an example graph with vertices A to H, tracked with a Visited Array]
BREADTH-FIRST-SEARCHING (BFS)
Can be used to attempt to visit all nodes of a graph in a
systematic manner
Works with directed and undirected graphs
Works with weighted and unweighted graphs
Steps
Breadth-first search starts with a given node
Then visits the nodes adjacent to it in some specified order (e.g.,
alphabetical)
BREADTH-FIRST-SEARCHING (BFS)
(CONT.)
BFS can be implemented efficiently using a queue
Set found to false
queue.Enqueue(startVertex)
Mark startVertex
DO
    queue.Dequeue(vertex)
    IF vertex == endVertex
        Set found to true
    ELSE
        Enqueue all unmarked adjacent vertices onto queue and mark them
WHILE !queue.IsEmpty() AND !found
IF(!found)
    Write "Path does not exist"
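A queue-based traversal can be sketched in Python as shown below. It assumes a dictionary of adjacency lists and visits neighbours in alphabetical order; the name bfs_order and the sample graph are illustrative. Vertices are marked when they are enqueued, so no vertex enters the queue twice.

```python
from collections import deque


def bfs_order(adj, start):
    """Return the vertices reachable from start in breadth-first order."""
    enqueued = {start}
    queue = deque([start])
    order = []
    while queue:
        vertex = queue.popleft()          # dequeue and visit
        order.append(vertex)
        for neighbor in sorted(adj.get(vertex, [])):
            if neighbor not in enqueued:  # enqueue only unenqueued neighbours
                enqueued.add(neighbor)
                queue.append(neighbor)
    return order


# Hypothetical graph
adj = {"A": ["C", "B"], "B": ["D"], "C": ["D"], "D": []}
print(bfs_order(adj, "A"))  # ['A', 'B', 'C', 'D']
```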
Walk-Through
[Figure: example graph with vertices A to H, shown next to the Enqueued Array and the queue Q]
The frames of the walk-through, which starts from vertex D, are summarized below. A vertex is ticked (√) in the Enqueued Array as soon as it enters the queue.

Step  Action                        Q after step  Nodes visited
0     Enqueue start vertex D        D             (none)
1     Dequeue D, enqueue C, E, F    C E F         D
2     Dequeue C                     E F           D, C
3     Dequeue E, enqueue G          F G           D, C, E
4     Dequeue F                     G             D, C, E, F
5     Dequeue G, enqueue H          H             D, C, E, F, G
6     Dequeue H, enqueue A, B       A B           D, C, E, F, G, H
7     Dequeue A                     B             D, C, E, F, G, H, A
8     Dequeue B                     empty         D, C, E, F, G, H, A, B

At each step: dequeue a vertex, visit it, and enqueue the unenqueued nodes adjacent to it.
Q empty. Algorithm done.
DEFINITION
A Minimum Spanning Tree (MST) is a subgraph of an
undirected graph such that the subgraph spans (includes)
all nodes, is connected, is acyclic, and has minimum
total edge weight
MINIMUM SPANNING TREES
Prim’s Algorithm
• Focuses on nodes
Kruskal’s Algorithm
• Focuses on edges, rather than nodes
Minimum Spanning Tree is concerned with connected undirected
graphs.
Example of Minimum Spanning Tree
• The following figure shows a graph G1 together with its three possible
minimum spanning trees.
[Figure: graph G1 on vertices a, b, c, d, e, f, followed by its three spanning trees on the same vertices]
WHAT IS A MINIMUM-COST SPANNING
TREE
• Minimum-Cost spanning tree is concerned with edge-weighted connected
undirected graphs.
• For an edge-weighted, connected, undirected graph, G, the total cost of G
is the sum of the weights on all its edges.
• A minimum-cost spanning tree for G is a minimum spanning tree of G that
has the least total cost.
CONSTRUCTING MINIMUM SPANNING TREE
ALGORITHM CHARACTERISTICS
PRIM’S ALGORITHM
Prim’s algorithm finds a minimum cost spanning tree by selecting edges
from the graph one-by-one as follows:
It starts with a tree, T, consisting of the starting vertex, x.
Then, it adds the shortest edge emanating from x that connects T to the rest
of the graph.
It then repeats the process with the enlarged tree: at every step it adds the
shortest edge that connects a vertex in T to a vertex not yet in T, until T
spans all vertices.
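The procedure can be sketched in Python with a priority queue of candidate edges. This is a minimal sketch, not the slides' own implementation: the function name prim_mst is hypothetical, and the edge list contains only the weights that appear in the dv/pv updates of this section's walk-through, so the original figure may contain further edges.

```python
import heapq


def prim_mst(adj, start):
    """Return (total cost, tree edges) of a minimum spanning tree grown from start.

    adj is assumed to map each vertex to a list of (neighbour, weight) pairs.
    """
    in_tree = {start}
    tree_edges = []
    total = 0
    heap = [(w, start, v) for v, w in adj[start]]   # edges leaving the tree
    heapq.heapify(heap)
    while heap and len(in_tree) < len(adj):
        w, u, v = heapq.heappop(heap)               # shortest candidate edge
        if v in in_tree:
            continue                                # both ends already in T
        in_tree.add(v)
        tree_edges.append((u, v, w))
        total += w
        for x, wx in adj[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return total, tree_edges


# Edge weights read off the walk-through table (assumed to be a partial edge set)
edges = [("D", "C", 3), ("D", "E", 25), ("D", "F", 18), ("D", "G", 2),
         ("G", "E", 7), ("G", "H", 3), ("C", "B", 4), ("C", "F", 3),
         ("F", "A", 10), ("F", "E", 2), ("H", "A", 4)]
adj = {}
for u, v, w in edges:
    adj.setdefault(u, []).append((v, w))
    adj.setdefault(v, []).append((u, w))

total, tree = prim_mst(adj, "D")
print(total)  # 21
```

A heap of candidate edges replaces the linear scan for the minimum dv; the tree cost is the same either way, although ties between equal weights may be broken in a different order.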
[Figure: weighted undirected graph with vertices A to H, shown next to a table with one row per vertex]
Table columns: K (T once the vertex has been selected, False otherwise), dv (Distance of Vertex), pv (Previous Vertex).
Initialize the table: for every vertex, K = False, dv = ∞, pv = −.
Start with any node, say D (K = T, dv = 0, pv = −). Then alternate two steps: update distances of adjacent, unselected nodes, and select node with minimum distance.

Step  Node selected  dv  pv  Updates to adjacent, unselected nodes (dv, pv)
1     D              0   −   C (3, D), E (25, D), F (18, D), G (2, D)
2     G              2   D   E (7, G), H (3, G)
3     C              3   D   B (4, C), F (3, C)
4     F              3   C   A (10, F), E (2, F)
5     E              2   F   Table entries unchanged
6     H              3   G   A (4, H)
7     A              4   H   Table entries unchanged
8     B              4   C   (all nodes selected)

Final table:

Vertex  K  dv  pv
A       T  4   H
B       T  4   C
C       T  3   D
D       T  0   −
E       T  2   F
F       T  3   C
G       T  2   D
H       T  3   G

Cost of Minimum Spanning Tree = Σ dv = 21
Done
KRUSKAL'S ALGORITHM
Kruskal’s algorithm also finds the minimum cost spanning tree of a graph
by adding edges one-by-one, in order of increasing weight, skipping any
edge that would create a cycle.
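Kruskal's edge-by-edge selection can be sketched in Python as below. The cycle test uses a union-find (disjoint-set) structure, which is a common choice but an assumption here, since the slides do not say how cycles are detected. The function name kruskal_mst is hypothetical; the edge list is the sorted list from this section's walk-through.

```python
def kruskal_mst(vertices, edges):
    """Return the edges of a minimum spanning tree as (u, v, weight) tuples.

    edges is a list of (weight, u, v) tuples for an undirected graph.
    """
    parent = {v: v for v in vertices}

    def find(v):
        # Follow parent links to the representative of v's component
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving keeps trees shallow
            v = parent[v]
        return v

    accepted = []
    for w, u, v in sorted(edges, key=lambda e: e[0]):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:                # different components: no cycle
            parent[root_u] = root_v
            accepted.append((u, v, w))
            if len(accepted) == len(vertices) - 1:
                break                       # |V| - 1 edges selected
    return accepted


edges = [(1, "D", "E"), (2, "D", "G"), (3, "E", "G"), (3, "C", "D"),
         (3, "G", "H"), (3, "C", "F"), (4, "B", "C"), (4, "B", "E"),
         (4, "B", "F"), (4, "B", "H"), (5, "A", "H"), (6, "D", "F"),
         (8, "A", "B"), (10, "A", "F")]
tree = kruskal_mst("ABCDEFGH", edges)
print(sum(w for _, _, w in tree))  # 21
```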
Walk-Through
Consider an undirected, weighted graph
[Figure: weighted undirected graph with vertices A to H]
Sort the edges by increasing edge weight (dv), then select the first |V|–1 edges which do not generate a cycle. Here |V| = 8, so 7 edges are selected.

Edge   dv  Selected
(D,E)  1   √
(D,G)  2   √
(E,G)  3   χ  (accepting edge (E,G) would create a cycle)
(C,D)  3   √
(G,H)  3   √
(C,F)  3   √
(B,C)  4   √
(B,E)  4   χ
(B,F)  4   χ
(B,H)  4   χ
(A,H)  5   √
(D,F)  6   not considered
(A,B)  8   not considered
(A,F)  10  not considered

Done
SINGLE-SOURCE SHORTEST-PATH
PROBLEM
There are multiple paths from a source vertex to a destination
vertex
Shortest path: the path whose total weight (i.e., sum of edge
weights) is minimum
Examples:
⚫ Austin->Houston->Atlanta->Washington: 1560 miles
⚫ Austin->Dallas->Denver->Atlanta->Washington: 2980 miles
Common algorithms: Dijkstra's algorithm, Bellman-Ford
algorithm
Dijkstra's algorithm was developed by the Dutch
computer scientist Edsger W. Dijkstra in 1956.
It is used to find the shortest path from a node/vertex
(source node) to any (or every) other node/vertex
(destination nodes) in a graph. It assumes that all edge weights are non-negative.
APPLICATIONS
WORKING OF DIJKSTRA'S SHORTEST
PATH FIRST ALGORITHM
1. Convert any problem to its graph equivalent representation.
2. Maintain a list of unvisited vertices. Assign a vertex as “source” and also allocate a
maximum possible cost (infinity) to every other vertex. The cost of the source to itself will
be zero as it actually takes nothing to go to itself.
3. In every step of the algorithm, it tries to minimize the cost for each vertex.
4. For every unvisited neighbor (V2, V3) of the current vertex (V1) calculate the new cost from
V1.
5. The new cost of V2 is calculated as :
Minimum( existing cost of V2 , (sum of cost of V1 + the cost of edge from V1 to V2) )
6. When all the neighbors of the current node are visited and cost has been calculated, mark the
current node V1 as visited and remove it from the unvisited list.
7. Select next vertex with smallest cost from the unvisited list and repeat from step 4.
8. The algorithm finally ends when there are no unvisited nodes left
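The steps above can be sketched in Python as follows. Instead of scanning the unvisited list for the smallest cost, this sketch uses a binary heap (heapq), which is a substitution made here for brevity. The function name dijkstra, the dictionary-based graph format and the sample graph are all illustrative assumptions.

```python
import heapq


def dijkstra(adj, source):
    """Return a dict with the minimum cost from source to every vertex.

    adj is assumed to map each vertex to a list of (neighbour, weight) pairs
    with non-negative weights.
    """
    cost = {v: float("inf") for v in adj}   # step 2: infinity everywhere...
    cost[source] = 0                        # ...except the source itself
    visited = set()
    heap = [(0, source)]
    while heap:
        c, v = heapq.heappop(heap)          # step 7: unvisited vertex of smallest cost
        if v in visited:
            continue                        # stale heap entry
        visited.add(v)                      # step 6: mark as visited
        for u, w in adj[v]:                 # steps 4-5: relax each unvisited neighbour
            if u not in visited and c + w < cost[u]:
                cost[u] = c + w
                heapq.heappush(heap, (cost[u], u))
    return cost


# Hypothetical directed graph
adj = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
print(dijkstra(adj, "A"))  # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```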
DIJKSTRA'S ALGORITHM
EXAMPLE
1. Assign cost of 0 to source vertex and ∞ (infinity) to all other vertices
as shown in the image below.
2. Maintain a list of unvisited vertices.
3. Add all the vertices to the unvisited list.
Calculate minimum cost for neighbors of selected source
Select next vertex with smallest cost from the unvisited list.
Choose the unvisited vertex with minimum cost (here, it would be C) and
consider all its unvisited neighbors (A,E and D) and calculate the minimum
cost for them. Once this is done, mark C as visited.
Repeat steps for all the remaining unvisited nodes
PSEUDO CODE OF DIJKSTRA
ALGORITHM
Dijkstra_Algorithm(source, G):
    parameters: source node --> source, graph --> G
    return: list of costs from source to all other nodes --> cost

    unvisited_list = []    // List of unvisited vertices
    cost = []
    cost[source] = 0       // Distance (cost) from source to source will be 0
    for each vertex v in G:
        // Assign cost as INFINITY to all other vertices
        if v ≠ source:
            cost[v] = INFINITY
        // All nodes pushed to unvisited_list initially
        add v to unvisited_list

    // Main loop
    while unvisited_list is not empty:
        v = vertex in unvisited_list with minimum cost[v]
        remove v from unvisited_list
        for each unvisited neighbor u of v:
            cost[u] = Minimum(cost[u], cost[v] + weight(v, u))
    return cost
COMPLEXITY ANALYSIS OF DIJKSTRA
ALGORITHM
UNIT III – CONCLUDED
THANK YOU!