Unit-5 - Oops
A graph can be defined as a group of vertices and edges, where the edges connect pairs of vertices.
Unlike a tree, a graph may contain cycles, and its vertices (nodes) can have any relationship among
them rather than a strict parent-child relationship.
Definition
A graph G can be defined as an ordered pair G(V, E) where V(G) represents the set of vertices and E(G)
represents the set of edges which are used to connect these vertices.
A Graph G(V, E) with 5 vertices (A, B, C, D, E) and six edges ((A,B), (B,C), (C,E), (E,D), (D,B),
(D,A)) is shown in the following figure.
A graph can be directed or undirected. In an undirected graph, edges have no direction associated
with them.
The graph shown in the above figure is undirected, since its edges carry no direction.
If an edge exists between vertex A and B then the vertices can be traversed from B to A as well as A
to B.
In a directed graph, edges form an ordered pair. Edges represent a specific path from some vertex A to
another vertex B. Node A is called initial node while node B is called terminal node.
Graph Terminology
Path
A path can be defined as the sequence of nodes that are followed in order to reach some terminal node V from
the initial node U.
Cycle
A cycle can be defined as the path which has no repeated edges or vertices except the first and last vertices.
Connected Graph
A connected graph is the one in which some path exists between every two vertices (u, v) in V. There are no
isolated nodes in connected graph.
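As a small illustration of these terms, the sketch below stores the five-vertex example graph from the definition as a set of neighbours per vertex and checks whether it is connected. The function names are hypothetical, and the extra vertex F in the second call is an assumption added only to show a disconnected case.

```python
from collections import deque

# Edge list taken from the definition example: 5 vertices, 6 undirected edges.
VERTICES = ["A", "B", "C", "D", "E"]
EDGES = [("A", "B"), ("B", "C"), ("C", "E"), ("E", "D"), ("D", "B"), ("D", "A")]


def build_adjacency(vertices, edges):
    """Return a dict mapping each vertex to the set of its neighbours."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)  # undirected: store the edge in both directions
        adjacency[v].add(u)
    return adjacency


def is_connected(vertices, edges):
    """True if some path exists between every pair of vertices."""
    if not vertices:
        return True
    adjacency = build_adjacency(vertices, edges)
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    # Connected exactly when the traversal reached every vertex.
    return len(seen) == len(vertices)


print(is_connected(VERTICES, EDGES))          # the example graph
print(is_connected(VERTICES + ["F"], EDGES))  # F is an isolated node (assumed)
```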
Graph Representations
In graph theory, a graph representation is a technique for storing a graph in the memory of a computer.
To represent a graph, we just need the set of vertices and, for each vertex, the neighbors of the vertex
(the vertices which are directly connected to it by an edge).
If it is a weighted graph, then a weight will also be associated with each edge.
There are different ways to represent a graph, and the best choice depends on the density of its edges,
the type of operations to be performed and ease of use.
Adjacency Matrix
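An adjacency matrix for a graph on n vertices is an n x n matrix in which the entry in row i and column j
is 1 if there is an edge from vertex i to vertex j, and 0 otherwise.
Note that even if a graph on 100 vertices contains only 1 edge, we still need a 100x100 matrix with
lots of zeroes.
For a weighted graph, instead of 1s and 0s, we can store the weight of each edge.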
Example
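A minimal sketch of an adjacency matrix, using the five-vertex example graph from the definition with every edge given weight 1. The function name and the `directed` parameter are hypothetical choices made for illustration.

```python
def adjacency_matrix(vertices, edges, directed=False):
    """Build an n x n matrix; entry [i][j] holds the edge weight, or 0 if no edge."""
    index = {v: i for i, v in enumerate(vertices)}
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v, weight in edges:
        matrix[index[u]][index[v]] = weight
        if not directed:
            # An undirected edge appears symmetrically in the matrix.
            matrix[index[v]][index[u]] = weight
    return matrix


vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B", 1), ("B", "C", 1), ("C", "E", 1),
         ("E", "D", 1), ("D", "B", 1), ("D", "A", 1)]

matrix = adjacency_matrix(vertices, edges)
for label, row in zip(vertices, matrix):
    print(label, row)
```

Using 0 for "no edge" only works when 0 is not a valid weight; a weighted graph that allows zero-weight edges would need a different marker such as infinity.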
2. BFS ALGORITHM
Breadth-first search (BFS) is a graph traversal algorithm that starts traversing the graph from the root
node and explores all of its neighboring nodes.
Then, it moves to the nearest of those nodes and explores their unexplored neighbors, continuing level by level.
While using BFS for traversal, any node in the graph can be considered as the root node.
BFS puts every vertex of the graph into two categories - visited and non-visited.
It selects a single node in a graph and, after that, visits all the nodes adjacent to the selected node.
Applications of BFS algorithm
BFS can be used to find the neighboring locations from a given source location.
In a peer-to-peer network, BFS algorithm can be used as a traversal method to find all the neighboring
nodes. Most torrent clients, such as BitTorrent, uTorrent, etc. employ this process to find "seeds" and
"peers" in the network.
BFS can be used in web crawlers to create web page indexes. It is one of the main algorithms that can
be used to index web pages. It starts traversing from the source page and follows the links associated
with the page. Here, every web page is considered as a node in the graph.
BFS is used to determine the shortest path (by number of edges) and a minimum spanning tree in an unweighted graph.
BFS is also used in Cheney's algorithm for copying garbage collection.
It can be used in the Ford-Fulkerson method to find augmenting paths when computing the maximum flow in a flow network.
Algorithm
The steps involved in the BFS algorithm to explore a graph are given as follows -
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Enqueue the starting node A and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until QUEUE is empty
Step 4: Dequeue a node N. Process it and set its STATUS = 3 (processed state).
Step 5: Enqueue all the neighbours of N that are in the ready state (whose STATUS = 1) and set
their STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
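The status-based procedure above can be sketched in Python as follows. The constant names and the small example graph are assumptions made for illustration; the three STATUS values are the ones used in the steps.

```python
from collections import deque

READY, WAITING, PROCESSED = 1, 2, 3


def bfs(graph, start):
    """Return the nodes reachable from start in the order BFS processes them."""
    status = {node: READY for node in graph}   # every node begins in the ready state
    queue = deque([start])
    status[start] = WAITING
    order = []
    while queue:                               # loop until the queue is empty
        node = queue.popleft()                 # dequeue and process a node
        order.append(node)
        status[node] = PROCESSED
        for neighbour in graph[node]:
            if status[neighbour] == READY:     # enqueue only nodes not seen before
                queue.append(neighbour)
                status[neighbour] = WAITING
    return order


# Hypothetical directed graph used only for illustration.
example = {1: [2, 3], 2: [4], 3: [4], 4: []}
print(bfs(example, 1))
```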
In the above graph, minimum path 'P' can be found by using the BFS that will start from Node A and end at
Node E.
The algorithm uses two queues, namely QUEUE1 and QUEUE2. QUEUE1 holds all the nodes that are to be
processed, while QUEUE2 holds all the nodes that are processed and deleted from QUEUE1.
Step 1 - First, add node A to queue1 and NULL to queue2.
1. QUEUE1 = {A}
2. QUEUE2 = {NULL}
Step 2 - Now, delete node A from queue1 and add it into queue2. Insert all neighbors of node A to queue1.
1. QUEUE1 = {B, D}
2. QUEUE2 = {A}
Step 3 - Now, delete node B from queue1 and add it into queue2. Insert all neighbors of node B to queue1.
1. QUEUE1 = {D, C, F}
2. QUEUE2 = {A, B}
Step 4 - Now, delete node D from queue1 and add it into queue2. Insert all neighbors of node D to queue1.
The only neighbor of node D is F. Since F is already inserted, it will not be inserted again.
1. QUEUE1 = {C, F}
2. QUEUE2 = {A, B, D}
Step 5 - Delete node C from queue1 and add it into queue2. Insert all neighbors of node C to queue1.
1. QUEUE1 = {F, E, G}
2. QUEUE2 = {A, B, D, C}
Step 6 - Delete node F from queue1 and add it into queue2. Insert all neighbors of node F to queue1. Since all
the neighbors of node F are already present, we will not insert them again.
1. QUEUE1 = {E, G}
2. QUEUE2 = {A, B, D, C, F}
Step 7 - Delete node E from queue1 and add it into queue2. Since all of its neighbors have already been added,
we will not insert them again. The target node E has now been moved into queue2, so the search can stop.
1. QUEUE1 = {G}
2. QUEUE2 = {A, B, D, C, F, E}
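The trace above can be reproduced with a short sketch. The adjacency list contains only the edges that the trace shows being followed (the original figure may contain more), and the `parent` dictionary is an addition, not part of the trace, used to recover the path from A to E.

```python
# Only edges visible in the trace are listed; other edges in the figure are unknown.
GRAPH = {
    "A": ["B", "D"],
    "B": ["C", "F"],
    "C": ["E", "G"],
    "D": ["F"],
    "E": [],
    "F": [],
    "G": [],
}


def bfs_path(graph, source, target):
    """Return (processed order, path with fewest edges), or (order, None)."""
    queue1 = [source]          # nodes waiting to be processed
    queue2 = []                # nodes already processed
    parent = {source: None}    # also records which nodes were already inserted
    while queue1:
        node = queue1.pop(0)   # a deque would be faster; a list mirrors the trace
        queue2.append(node)
        if node == target:
            path = []
            while node is not None:   # walk back from target to source
                path.append(node)
                node = parent[node]
            return queue2, path[::-1]
        for neighbour in graph[node]:
            if neighbour not in parent:
                parent[neighbour] = node
                queue1.append(neighbour)
    return queue2, None


order, path = bfs_path(GRAPH, "A", "E")
print(order)
print(path)
```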
3. DFS ALGORITHM
Depth-first search (DFS) is a recursive algorithm that searches all the vertices of a tree or graph data structure.
The DFS algorithm starts with the initial node of graph G and goes deeper until it
finds the goal node or a node with no children.
A stack data structure can be used to implement the DFS algorithm.
The step by step process to implement the DFS traversal is given as follows -
1. First, create a stack large enough to hold all the vertices in the graph.
2. Now, choose any vertex as the starting point of traversal, mark it as visited, and push that vertex onto the stack.
3. After that, push a non-visited vertex (adjacent to the vertex on the top of the stack) onto the top of the
stack and mark it as visited.
4. Repeat step 3 until no non-visited vertex is adjacent to the vertex on the stack's top.
5. When no such vertex is left, go back by popping a vertex from the stack.
6. Repeat steps 3, 4, and 5 until the stack is empty.
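The go-deeper-then-backtrack behaviour in these steps is what recursion provides implicitly, with the call stack playing the role of the explicit stack. A minimal recursive sketch, assuming a small hypothetical graph:

```python
def dfs_recursive(graph, start):
    """Return vertices in the order a recursive depth-first search visits them."""
    seen = set()
    order = []

    def visit(node):
        seen.add(node)
        order.append(node)
        for neighbour in graph[node]:
            if neighbour not in seen:
                visit(neighbour)   # go deeper before trying the next neighbour
        # Returning from visit() corresponds to popping the vertex (backtracking).

    visit(start)
    return order


# Hypothetical directed graph used only for illustration.
example = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
print(dfs_recursive(example, "A"))
```

For very deep graphs the recursive form can hit Python's recursion limit, which is one reason to prefer an explicit stack.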
Algorithm
Step 1: SET STATUS = 1 (ready state) for each node in G
Step 2: Push the starting node A on the stack and set its STATUS = 2 (waiting state)
Step 3: Repeat Steps 4 and 5 until STACK is empty
Step 4: Pop the top node N. Process it and set its STATUS = 3 (processed state)
Step 5: Push on the stack all the neighbors of N that are in the ready state (whose STATUS = 1) and set their
STATUS = 2 (waiting state)
[END OF LOOP]
Step 6: EXIT
Step 1 - First, push H onto the stack.
1. STACK: H
Step 2 - POP the top element from the stack, i.e., H, and print it. Now, PUSH all the neighbors of H onto the
stack that are in ready state.
1. Print: H
2. STACK: A
Step 3 - POP the top element from the stack, i.e., A, and print it. Now, PUSH all the neighbors of A onto the
stack that are in ready state.
1. Print: A
2. STACK: B, D
Step 4 - POP the top element from the stack, i.e., D, and print it. Now, PUSH all the neighbors of D onto the
stack that are in ready state.
1. Print: D
2. STACK: B, F
Step 5 - POP the top element from the stack, i.e., F, and print it. Now, PUSH all the neighbors of F onto the
stack that are in ready state.
1. Print: F
2. STACK: B
Step 6 - POP the top element from the stack, i.e., B, and print it. Now, PUSH all the neighbors of B onto the
stack that are in ready state.
1. Print: B
2. STACK: C
Step 7 - POP the top element from the stack, i.e., C, and print it. Now, PUSH all the neighbors of C onto the
stack that are in ready state.
1. Print: C
2. STACK: E, G
Step 8 - POP the top element from the stack, i.e., G and PUSH all the neighbors of G onto the stack that are in
ready state.
1. Print: G
2. STACK: E
Step 9 - POP the top element from the stack, i.e., E and PUSH all the neighbors of E onto the stack that are in
ready state.
1. Print: E
2. STACK:
All the graph nodes have been traversed, and the stack is empty.
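The stack trace above can be reproduced with the sketch below. The adjacency list contains only the edges that the trace shows being followed (the original figure may contain more), and the constant names are hypothetical.

```python
READY, WAITING, PROCESSED = 1, 2, 3

# Only edges visible in the trace are listed; other edges in the figure are unknown.
GRAPH = {
    "H": ["A"],
    "A": ["B", "D"],
    "D": ["F"],
    "F": [],
    "B": ["C"],
    "C": ["E", "G"],
    "G": [],
    "E": [],
}


def dfs(graph, start):
    """Return the order in which nodes are popped (printed) from the stack."""
    status = {node: READY for node in graph}
    stack = [start]
    status[start] = WAITING
    order = []
    while stack:
        node = stack.pop()                  # pop the top node and process it
        order.append(node)
        status[node] = PROCESSED
        for neighbour in graph[node]:
            if status[neighbour] == READY:  # push only neighbours in the ready state
                stack.append(neighbour)
                status[neighbour] = WAITING
    return order


print(dfs(GRAPH, "H"))
```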
The time complexity of the DFS algorithm is O(V+E), where V is the number of vertices and E is the number
of edges in the graph. The space complexity of the DFS algorithm is O(V).
4. TOPOLOGICAL SORT
Topological Sort of a directed acyclic graph (DAG) is a linear ordering of its vertices such that if
there is an edge going from vertex ‘u’ to vertex ‘v’, then ‘u’ comes before ‘v’ in the ordering.
Topological Sorting is possible if and only if the graph is a Directed Acyclic Graph.
There may exist multiple different topological orderings for a given directed acyclic graph.
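For example, for one DAG on the six vertices 1 to 6 (figure not reproduced here), the following topological
orderings are possible:
123456
123465
132456
132465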
Applications of topological sort include:
Instruction Scheduling
Data Serialization
Problem-01:
Find the number of different topological orderings possible for the given graph-
Solution-
The topological orderings of the above graph are found in the following steps-
Step-01:
Write in-degree of each vertex-
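Step-02:
Vertex-A has the least in-degree. So, remove vertex-A and its associated edges.
Then, update the in-degree of other vertices.
Step-03:
Vertex-B now has the least in-degree. So, remove vertex-B and its associated edges.
Then, update the in-degree of other vertices.
Step-04:
There are two vertices with the least in-degree (C and D). So, following 2 cases are possible-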
In case-01,
Remove vertex-C and its associated edges.
Then, update the in-degree of other vertices.
In case-02,
Remove vertex-D and its associated edges.
Then, update the in-degree of other vertices.
Step-05:
Now, the above two cases are continued separately in the similar manner.
In case-01,
Remove vertex-D since it has the least in-degree.
Then, remove the remaining vertex-E.
In case-02,
Remove vertex-C since it has the least in-degree.
Then, remove the remaining vertex-E.
Conclusion-
For the given graph, following 2 different topological orderings are possible-
ABCDE
ABDCE
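The in-degree procedure used in this solution can be sketched as a small backtracking search that tries every vertex whose in-degree is currently 0. The figure is not available, so the edge set below (A to B, B to C, B to D, C to E, D to E) is an assumption chosen to be consistent with the two orderings in the conclusion; the function name is hypothetical.

```python
def all_topological_orders(graph):
    """Return every topological ordering of a DAG given as {vertex: [successors]}."""
    in_degree = {v: 0 for v in graph}
    for v in graph:
        for w in graph[v]:
            in_degree[w] += 1

    results = []
    order = []

    def extend():
        if len(order) == len(graph):
            results.append("".join(order))
            return
        for v in sorted(graph):
            if in_degree[v] == 0 and v not in order:
                # "Remove" v: place it next and lower its successors' in-degrees.
                order.append(v)
                for w in graph[v]:
                    in_degree[w] -= 1
                extend()
                # Undo the removal so that the other cases can be explored.
                for w in graph[v]:
                    in_degree[w] += 1
                order.pop()

    extend()
    return results


# Assumed edges, consistent with the orderings ABCDE and ABDCE.
DAG = {"A": ["B"], "B": ["C", "D"], "C": ["E"], "D": ["E"], "E": []}
print(all_topological_orders(DAG))
```

If the input contains a cycle, no vertex with in-degree 0 remains at some point and the function returns an empty list.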
5. MINIMUM SPANNING TREE
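A spanning tree of a connected, undirected graph is a subgraph that is a tree and connects all the vertices.
The cost of a spanning tree is the sum of the weights of all the edges in the tree.
A graph can have many spanning trees.
The minimum spanning tree is the spanning tree whose cost is minimum among all the spanning trees.
Minimum spanning trees have direct application in the design of networks.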
It is used in algorithms approximating the travelling salesman problem, multi-terminal minimum cut
problem and minimum-cost weighted perfect matching.
Applications are:
1. Cluster Analysis
2. Handwriting recognition
3. Image segmentation
There are two famous algorithms for finding the Minimum Spanning Tree:
KRUSKAL’S ALGORITHM
PRIM’S ALGORITHM
Kruskal’s Algorithm builds the spanning tree by adding edges one by one into a growing spanning
tree.
Kruskal's algorithm follows the greedy approach: in each iteration it finds the edge which has the least
weight and adds it to the growing spanning tree.
Algorithm Steps:
Sort the edges by weight and consider them one by one, from the edge with the smallest weight to the edge
with the largest weight.
Only add an edge if it doesn't form a cycle, i.e., if it connects two components that are not yet connected.
To decide this, we must check whether the two endpoints of the edge are already connected.
This could be done using DFS, which starts from the first vertex and then checks whether the second vertex is
visited or not. But running DFS for every edge makes the time complexity large, as each run is of the order
O(V+E), where V is the number of vertices and E is the number of edges. So the best solution is "Disjoint Sets":
Disjoint sets are sets whose intersection is the empty set, which means that they don't have any element in
common.
In Kruskal’s algorithm, at each iteration we will select the edge with the lowest weight.
Start with the lowest weighted edge first, i.e., the edge with weight 1.
After that we will select the second lowest weighted edge, i.e., the edge with weight 2.
Notice these two edges are totally disjoint. Now, the next edge will be the third lowest weighted edge,
i.e., the edge with weight 3, which connects the two disjoint pieces of the graph.
Now, we are not allowed to pick the edge with weight 4, because that would create a cycle and we can’t have
any cycles.
So we will select the fifth lowest weighted edge, i.e., the edge with weight 5.
Now the other two edges will create cycles, so we will ignore them. In the end, we end up with a
minimum spanning tree with total cost 11 ( = 1 + 2 + 3 + 5).
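A minimal sketch of Kruskal's algorithm with a disjoint-set structure. The figure is not available, so the edge list below is a hypothetical graph chosen so that the selections mirror the walkthrough (weights 1, 2, 3 and 5 are taken, while 4 and the two heaviest edges are rejected); it is not necessarily the graph from the figure.

```python
def find(parent, x):
    """Return the representative of x's set, shortening the path as we go."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def kruskal(num_vertices, edges):
    """Return (total cost, chosen edges) for edges given as (weight, u, v)."""
    parent = list(range(num_vertices))   # each vertex starts in its own set
    total, chosen = 0, []
    for weight, u, v in sorted(edges):   # smallest weight first
        root_u, root_v = find(parent, u), find(parent, v)
        if root_u != root_v:             # different components: no cycle is formed
            parent[root_u] = root_v      # merge the two sets
            total += weight
            chosen.append((weight, u, v))
    return total, chosen


# Hypothetical graph on vertices 0..4, edges written as (weight, u, v).
EDGES = [(1, 0, 1), (2, 2, 3), (3, 1, 2), (4, 0, 2),
         (5, 3, 4), (6, 1, 3), (7, 2, 4)]

total, chosen = kruskal(5, EDGES)
print(total, chosen)
```

Each cycle check is now a pair of `find` calls instead of a full DFS, which is the reason for preferring disjoint sets here.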
Prim’s Algorithm also uses the greedy approach to find the minimum spanning tree.
In Prim’s Algorithm we grow the spanning tree from a starting position.
Unlike Kruskal's, where we add an edge, in Prim's we add a vertex to the growing spanning tree.
Algorithm Steps:
Maintain two disjoint sets of vertices: one containing the vertices that are in the growing spanning tree
and the other containing the vertices that are not.
Select the cheapest vertex that is connected to the growing spanning tree but is not yet in it,
and add it into the growing spanning tree. This can be done using a Priority Queue.
Insert the vertices that are connected to the growing spanning tree into the Priority Queue.
To avoid cycles, mark the nodes which have already been selected and insert only unmarked nodes into the
Priority Queue.
In each iteration we will mark a new vertex that is adjacent to one that we have already marked.
As a greedy algorithm, Prim’s algorithm will select the cheapest edge and mark the vertex. So we first
choose the edge with weight 1. In the next iteration we have three options, edges with weight 2, 3
and 4. So, we will select the edge with weight 2 and mark the vertex. Now again we have three
options, edges with weight 3, 4 and 5.
But we can’t choose the edge with weight 3, as it would create a cycle. So we will select the edge with
weight 4, and we end up with the minimum spanning tree of total cost 7 ( = 1 + 2 + 4).
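A minimal sketch of Prim's algorithm using Python's `heapq` module as the priority queue. The figure is not available, so the four-vertex graph below is a hypothetical one chosen so that the choices mirror the walkthrough (take 1, then 2, skip 3 because it closes a cycle, take 4).

```python
import heapq


def prim(graph, start):
    """Return the total cost of a minimum spanning tree grown from start."""
    visited = set()            # vertices already in the growing spanning tree
    total = 0
    heap = [(0, start)]        # (weight of the connecting edge, vertex)
    while heap:
        weight, node = heapq.heappop(heap)   # cheapest candidate first
        if node in visited:
            continue           # both endpoints are in the tree: edge would form a cycle
        visited.add(node)
        total += weight
        for neighbour, edge_weight in graph[node]:
            if neighbour not in visited:
                heapq.heappush(heap, (edge_weight, neighbour))
    return total


# Hypothetical undirected graph: vertex -> list of (neighbour, weight).
GRAPH = {
    0: [(1, 1), (2, 3), (3, 4)],
    1: [(0, 1), (2, 2)],
    2: [(1, 2), (0, 3), (3, 5)],
    3: [(0, 4), (2, 5)],
}

print(prim(GRAPH, 0))
```

This is the "lazy" variant: stale entries stay in the queue and are skipped when popped, which keeps the code short at the cost of a slightly larger heap.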
6. DIJKSTRA ALGORITHM
Dijkstra's Algorithm basically starts at the node that you choose (the source node) and it analyzes the
graph to find the shortest path between that node and all the other nodes in the graph.
The algorithm keeps track of the currently known shortest distance from each node to the source node
and it updates these values if it finds a shorter path.
Once the algorithm has found the shortest path between the source node and another node, that node is
marked as "visited" and added to the path.
The process continues until all the nodes in the graph have been added to the path. This way, we have
a path that connects the source node to all other nodes following the shortest path possible to reach
each node.
Let 'x' be the vertex that has just been selected and 'y' a vertex directly reachable from it. The distance
to 'y' is updated using the rule below, where d(v) is the currently known shortest distance from the source
to v and c(x, y) is the weight of the edge from x to y:
if d(x) + c(x, y) < d(y), then d(y) = d(x) + c(x, y)
Vertex A is the source vertex, so its entry is filled with 0, while the entries of the other vertices are
filled with ∞. That is, the distance from the source vertex to itself is 0, and the initial distance from the
source vertex to every other vertex is ∞.
We solve this problem using the table below:
      A    B    C    D    E
      0    ∞    ∞    ∞    ∞
Since 0 is the minimum value in the above table, we select vertex A and write it to the left of that row, as
shown below:
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
As we can observe in the above graph, there are two vertices directly reachable from vertex A,
i.e., B and C.
Vertex A is not directly connected to vertex E, because that edge goes from E to A.
Calculate the two distances, i.e., from A to B and from A to C, using the formula given above. This gives
d(B) = 0 + 10 = 10 and d(C) = 0 + 5 = 5 (a dash marks a vertex whose distance is already final).
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
      -    10   5    ∞    ∞
As we can observe in the second row, 5 is the lowest value, so vertex C is selected and written to the left
of that row.
Since vertex C is selected, we consider all the direct paths from vertex C. The direct paths from
vertex C are C to B, C to D, and C to E.
First, we consider vertex B. We calculate the distance to B through C. Consider vertex C as 'x' and vertex B
as 'y'.
d(C) + c(C, B) < d(B)
(5 + 3) < 10
8 < 10
Since 8 is less than 10, we update d(B) from 10 to 8. A new row is inserted in which the value 8 is added
under the B column.
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
C     -    10   5    ∞    ∞
      -    8    -
Next, we consider vertex D. We calculate the distance to D through C. Consider vertex C as 'x' and vertex D as 'y'.
d(C) + c(C, D) < d(D)
(5 + 9) < ∞
14 < ∞
Since 14 is less than infinity, we update d(D) from ∞ to 14. The value 14 is added under the D
column.
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
C     -    10   5    ∞    ∞
      -    8    -    14
Next, we consider vertex E. We calculate the distance to E through C. Consider vertex C as 'x' and vertex E as 'y'.
d(C) + c(C, E) < d(E)
(5 + 2) < ∞
7 < ∞
Since 7 is less than infinity, we update d(E) from ∞ to 7. The value 7 is added under the E
column.
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
C     -    10   5    ∞    ∞
      -    8    -    14   7
As we can observe in the above table, 7 is the minimum value among 8, 14, and 7. Therefore, vertex
E is selected and added on the left, as shown in the table below:
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
C     -    10   5    ∞    ∞
E     -    8    -    14   7
Since vertex E is selected, we consider all the direct paths from vertex E. The direct paths from
vertex E are E to A and E to D. Since vertex A has already been selected, we do not consider the path from E to A.
For vertex D, consider vertex E as 'x' and vertex D as 'y'.
d(E) + c(E, D) < d(D)
(7 + 6) < 14
13 < 14
Since 13 is less than 14, we update d(D) from 14 to 13. The value 13 is added under the D
column.
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
C     -    10   5    ∞    ∞
E     -    8    -    14   7
      -    8    -    13   -
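The value 8 is the minimum among 8 and 13. Therefore, vertex B is selected. The only direct path from B is B to D.
d(B) + c(B, D) < d(D)
(8 + 1) < 13
9 < 13
Since 9 is less than 13, we update d(D) from 13 to 9. The value 9 is added under the D column, and D,
the only remaining vertex, is selected last.
      A    B    C    D    E
A     0    ∞    ∞    ∞    ∞
C     -    10   5    ∞    ∞
E     -    8    -    14   7
B     -    8    -    13   -
D     -    -    -    9    -
The shortest distances from A are therefore B = 8, C = 5, D = 9 and E = 7.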
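The worked example above can be checked with a short sketch that uses Python's `heapq` module as the priority queue. The edge weights are the ones used in the calculations (A to B = 10, A to C = 5, C to B = 3, C to D = 9, C to E = 2, E to D = 6, B to D = 1). The text mentions an edge from E to A without giving its weight, so the value 7 below is an assumption, and any edges leaving D are unknown and omitted.

```python
import heapq

GRAPH = {
    "A": {"B": 10, "C": 5},
    "B": {"D": 1},
    "C": {"B": 3, "D": 9, "E": 2},
    "D": {},                      # edges out of D are not given in the text
    "E": {"A": 7, "D": 6},        # weight of E -> A is assumed
}


def dijkstra(graph, source):
    """Return (shortest distances from source, order in which vertices are selected)."""
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    selected = []
    done = set()
    heap = [(0, source)]
    while heap:
        d, x = heapq.heappop(heap)
        if x in done:
            continue              # stale entry: x was already selected with a smaller distance
        done.add(x)
        selected.append(x)
        for y, cost in graph[x].items():
            # Relaxation: if d(x) + c(x, y) < d(y), then d(y) = d(x) + c(x, y)
            if d + cost < dist[y]:
                dist[y] = d + cost
                heapq.heappush(heap, (dist[y], y))
    return dist, selected


dist, selected = dijkstra(GRAPH, "A")
print(dist)
print(selected)
```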
7. Floyd- Warshall Algorithm
Floyd-Warshall Algorithm is an algorithm for finding the shortest path between all the pairs of vertices in a weighted graph.
This algorithm works for both the directed and undirected weighted graphs.
It does not work for the graphs with negative cycles (where the sum of the edges in a cycle is
negative).
A weighted graph is a graph in which each edge has a numerical value associated with it.
This algorithm follows the dynamic programming approach to find the shortest paths.
Initial graph
Follow the steps below to find the shortest path between all the pairs of vertices.
1. Create a matrix A0 of dimension n*n where n is the number of vertices. The rows and the columns are
indexed as i and j respectively, where i and j are vertices of the graph.
Each cell A[i][j] is filled with the distance from the ith vertex to the jth vertex. If there is no direct
edge from the ith vertex to the jth vertex, the cell is left as infinity.
2. Fill each cell with the distance between the ith and the jth vertex.
3. Now, create a matrix A1 using matrix A0. The elements in the first column and the first row are left as
they are. The remaining cells are filled in the following way.
Let k be the intermediate vertex in the shortest path from source to destination. In this step, k is the
first vertex. A[i][j] is filled with (A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).
That is, if the direct distance from the source to the destination is greater than the path through the
vertex k, then the cell is filled with A[i][k] + A[k][j].
In this step, k is vertex 1. We calculate the distance from the source vertex to the destination vertex through
this vertex k.
For example: for A1[2, 4], the direct distance from vertex 2 to 4 is 4, and the distance from
vertex 2 to 4 through vertex 1 (i.e. from vertex 2 to 1 and from vertex 1 to 4) is 7. Since 4 < 7, A1[2,
4] is filled with 4.
4. Similarly, A2 is created using A1. The elements in the second column and the second row are left as
they are. In this step, k is the second vertex (i.e. vertex 2). The remaining steps are the same as in step
3.
Calculate the distance from the source vertex to the destination vertex through vertex 2.
5. Similarly, A3 is created using A2, with vertex 3 as the intermediate vertex k.
6. Similarly, A4 is created using A3, with vertex 4 as the intermediate vertex k. Once every vertex has been
used as k, the matrix holds the shortest distance between each pair of vertices.
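The matrix updates described in these steps can be sketched as three nested loops. The figure is not available, so the starting matrix below is an assumption: it is a four-vertex directed graph chosen to agree with the one example in the text (direct distance from vertex 2 to 4 is 4, and 2 to 1 to 4 is 2 + 5 = 7). Indices in the code are 0-based, so vertex 1 is row 0.

```python
INF = float("inf")


def floyd_warshall(matrix):
    """Return the matrix of shortest distances between every pair of vertices."""
    n = len(matrix)
    dist = [row[:] for row in matrix]     # copy so the input (A0) is left unchanged
    for k in range(n):                    # k is the intermediate vertex
        for i in range(n):
            for j in range(n):
                # Keep the path through k only if it is shorter than the current one.
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


# Assumed A0: INF means there is no direct edge.
A0 = [
    [0,   3,   INF, 5],
    [2,   0,   INF, 4],
    [INF, 1,   0,   INF],
    [INF, INF, 2,   0],
]

result = floyd_warshall(A0)
for row in result:
    print(row)
```

Updating a single matrix in place gives the same result as building A1, A2 and so on separately, because row k and column k do not change during iteration k when there are no negative cycles.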