BCS304-DSA-MODULE 4
MODULE-4
TREES
TOPICS:- Terminology, Binary Trees, Properties of Binary trees, Array and linked
Representation of Binary Trees, Binary Tree Traversals - Inorder, postorder, preorder;
Additional Binary tree operations. Threaded binary trees, Binary Search Trees – Definition,
Insertion, Deletion, Traversal, Searching, Application of Trees-Evaluation of Expression,
Programming Examples.
Text 1: Ch 5: 5.1 –5.5, 5.7
Text 2: Ch 7: 7.1 – 7.9
INTRODUCTION
A tree is a non-linear, hierarchical data structure consisting of a collection of nodes, where
each node of the tree stores a value and a list of references to other nodes (its “children”).
4.1 DEFINITION
A tree consists of a finite set of elements called nodes and a finite set of directed
lines called branches that connect the nodes. The number of branches associated with a
node is the degree of the node. When a branch is directed toward the node, it is an indegree
branch. When a branch is directed away from the node, it is an outdegree branch.
• Internal Node: A node that is not a root or a leaf is known as an internal node. For
e.g., B, E, F, C, H, I, J
• Parent and Child: The subtrees of a node A are the children of A. A is the parent of its
children (OR) a node is a parent if it has successor nodes i.e. if it has an outdegree
greater than zero. Conversely, a node with predecessor is a child. For e.g., children of
D are H, I and J. Parent of D is A.
• Siblings/Brothers/Sisters: Children of same parent are called siblings. For e.g., H, I
and J are siblings.
• Ancestor: An ancestor is any node in the path from the root to the node. For e.g.,
ancestors of M are A, D and H.
• Descendant: A descendant is any node in the path below the parent node, i.e. all nodes
in the paths from a given node to a leaf are descendants of that node.
• Path: A sequence of nodes in which each node is adjacent to the next one. Every node
in the tree can be reached by following a unique path starting from the root. The length
of a path is the number of edges in the path, i.e. 1 less than the number of nodes in it.
• Level: the level of a node is its distance from the root. If a node is at level 'l', then its
children are at level 'l+1'.
• Height or depth of a tree is defined as maximum level of any node in the tree. For
e.g., Height of given tree = 4.
• A Sub tree is any connected structure below the root.
• A tree is a set of nodes that either: Is empty , or Has a designated node, called the root,
from which hierarchically descend zero or more subtrees, which are also trees.
Binary search tree (BST), which may sometimes also be called an ordered or sorted binary
tree, is a node-based binary tree data structure.
BST Properties:
– The left subtree of a node contains only nodes with keys less than the node's key.
– The right subtree of a node contains only nodes with keys greater than the node's key.
– Both the left and right subtrees must also be binary search trees.
Note: Every node in the tree should satisfy this condition.
Example 1: Assume 40, 20, 10, 50, 65, 45, 30 are inserted in order for BST
construction.
Step 1: Initially root is NULL, therefore node 40 becomes the root node.
Step 2: Node 20 is less than the root node 40, therefore node 20 is inserted as the left
child of node 40.
Step 3: Node 10 is less than node 40, so insertion is done in the left subtree of node 40.
Node 10 is compared with node 20. As node 10 is smaller, it is inserted as the left child of
node 20.
Step 4: Node 50 is greater than node 40, so it is inserted as the right child of node 40.
Step 5: Node 65 is greater than node 40, so it is compared with the right subtree of node
40. Node 65 is compared with node 50. As node 65 is greater, it is inserted as the right
child of node 50.
Step 6: Node 45 is greater than the root node 40, so the comparison continues recursively in
the right subtree of node 40. Node 45 is compared with node 50 and inserted as the left child
of node 50.
Step 7: Similarly, node 30 is compared with node 40, then compared with node 20 and
finally inserted as the right child of node 20.
Example 2: Construct BST for the given numbers- 50, 70, 60, 20, 90, 10, 40, 100
Example 3: Construct BST for the given numbers- 50, 15, 62, 5, 20, 58, 91, 3, 8, 37, 60, 24
Binary search trees can be implemented using linked lists or arrays. This section starts
with the linked list implementation; the array implementation is discussed later.
Linked list implementation: In an array representation, inserting or deleting a node in the
middle of a tree requires moving potentially many nodes to reflect the change in their level
numbers. This problem is overcome easily by using a linked representation, in which each
node of the BST is a doubly linked node with one pointer to each of its subtrees.
Here each node has 3 fields.
– Info is used to store the actual data.
– lptr is used to store the address of left sub tree.
– rptr is used to store the address of right sub tree
Note: If lptr is NULL then there is no left sub tree, if rptr is NULL then there is no right sub
tree.
Structure of node:
struct bst
{
    int info;
    struct bst *lptr, *rptr;
};
typedef struct bst node;
Figure: root holds the address 1000 of a single node whose fields are lptr = 0 (NULL), info = 40, rptr = 0 (NULL).
Insertion into BST: Assume data 40, 20, 10, 50, 65, 45, 30 to be inserted into BST
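node* insert(node *root)
{
    node *new1, *cur = root, *prev = NULL;
    new1 = (node *)malloc(sizeof(node));
    if (new1 == NULL)                /* allocation failed: tree is unchanged */
        return root;
    printf("\nEnter The Element ");
    scanf("%d", &new1->info);
    new1->lptr = new1->rptr = NULL;
    if (root == NULL)                /* empty tree: new node becomes the root */
        return new1;
    while (cur != NULL)              /* walk down to the insertion point */
    {
        prev = cur;
        cur = new1->info < cur->info ? cur->lptr : cur->rptr;
    }
    if (new1->info < prev->info)     /* attach the new node as a child of prev */
        prev->lptr = new1;
    else
        prev->rptr = new1;
    return root;
}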
The order in which the nodes of a linear list are visited in a traversal is clearly from
first to last. However, there is no such "natural" linear order for the nodes of a tree. Thus,
different orderings are used for traversal in different cases. Traversing a BT/BST involves
– Visiting the root node
– Traversing its left and right sub trees.
We designate the task of visiting the root node as Rt, traversing the left subtree as Ls and
traversing the right subtree as Rs. Different tree traversal techniques are :
– In order Traversal -- Ls Rt Rs
– Preorder Traversal -- Rt Ls Rs
– Post order Traversal -- Ls Rs Rt
Inorder Traversal: If the root is visited between traversing the left and right subtrees, it
is called inorder traversal.
Preorder Traversal: If the root is visited before traversing the subtree, it is called the
preorder traversal.
• Process the root node
• Traverse the Left sub tree in preorder recursively.
• Traverse the Right sub tree in preorder recursively
Preorder traversal for the tree given in figure 3: 40 20 10 30 50 45 65
Post order Traversal: If the root is visited after traversing the subtrees, it is called post
order traversal.
• Traverse the Left sub tree in postorder recursively.
• Traverse the Right sub tree in postorder recursively
• Process the root node
Implementation of postorder traversal:
void postorder(node *root)
{
if( root != NULL )
{
postorder(root->lptr);
postorder(root->rptr);
printf("%d ",root->info);
}
}
Postorder traversal for the tree given in figure 3: 10 30 20 45 65 50 40
Example 2: Write down tree traversals for the below given tree
Preorder: abdgecf
Inorder: dgbeafc
Postorder: gdebfca
}
if(key > root->info )
return search(root->rptr,key);
else
return search(root->lptr,key);
}
return (0);
}
Case 1: Tree is empty (or the data was not found)
if (root == NULL)
{
    printf("tree is empty\n");
    return NULL;
}
Case 2: Data is in the left subtree
if (data < root->info)
{
    root->lptr = Delete(root->lptr, data);
    return(root);
}
Case 3: Data is in the right subtree
if (data > root->info)
{
    root->rptr = Delete(root->rptr, data);
    return(root);
}
Case 4: Data is present but no children
if (root->lptr == NULL && root->rptr == NULL)
{
    printf("deleted data %d",root->info);
    free(root);
    root = NULL;
    return(root);
}
Case 5: Data is present but no right subtree
if (root->rptr == NULL)
{
    temp = root->lptr; // the left child takes the place of the deleted node
    printf("deleted data %d",root->info);
    free(root);
    return(temp);
}
Case 6: Data is present but no left subtree
if (root->lptr == NULL)
{
    temp = root->rptr; // the right child takes the place of the deleted node
    printf("deleted data %d",root->info);
    free(root);
    return(temp);
}
Case 7: Both left and right subtrees are present
// Find the min element in the right subtree, place it in the root node
// and call the delete function for the right subtree.
min = FindMin(root->rptr);
root->info = min;
root->rptr = Delete(root->rptr, min);
return(root);
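A one-dimensional array can be used to store the nodes of a binary search tree. The root node
is stored at index 0, and the left and right children of the node at index i are stored at
indices 2i+1 and 2i+2. Consider the tree shown below; the array representation of this tree is
given in figure 5.
Index i: 0  1  2  3  4  5  6  7 ... 19
a[i]:    15 10 20 8  12 16 25 (remaining entries unused)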
#include<stdio.h>
#include<stdlib.h>
#define SIZE 20
int main(void)
{
int a[SIZE], item, ch, i;
for(i = 0; i< SIZE; i++) a[i] = 0; /* 0 marks an empty slot */
for(;;)
{
printf("1. Insert. 2. Inorder 3. Preorder 4. Postorder 5. Exit\n");
printf("Enter your choice\n");
scanf("%d",&ch);
switch(ch)
{
case 1: printf("Enter the Item\n");
scanf("%d",&item);
insert(item,a); break;
case 2: if( a[0] == 0)
printf("list is empty\n");
else
{
printf("Inorder is ");
inorder(a, 0);
}
break;
A selection tree is a form of complete binary tree in which each node denotes a player. The
last level has n nodes (external nodes) used to represent all the players, and the remaining
n-1 nodes (internal nodes) represent either the winner or the loser of the match between their
children. It is also referred to as a tournament tree.
Dept.of ISE, RNSIT 2023-24 Page 16
DATA STRUCTURES AND APPLICATIONS [BCS304]
Some of the important properties of the selection tree are listed below:
a. The value of every internal node is always equal to one of its children.
b. A tournament tree can have holes. A tournament tree with fewer than 2^(n+1) - 1 nodes
contains holes. A hole represents the absence of a player or team and can be
anywhere in the tree.
c. Every node in a tournament tree is linked to its predecessor and successor, and unique
paths exist between all nodes.
d. It is a type of binary heap (min or max heap). Or we can say it is the application of
heaps.
e. The root node represents the winner of the tournament.
f. To find the winner or loser (best player) of the match, we need N-1 comparisons.
There exist a loser and a winner in every match. So, there are two methods to represent both
ideas:
In a selection tree, when the internal nodes represent the winner of the match, the tree
obtained is referred to as the winner tree. Each internal node stores either the smallest or
greatest of its children, depending on the winning criteria. When the winner is the smaller
value, the winner tree is referred to as the minimum winner tree, and when the winner is
the larger value, the winner tree is referred to as the maximum winner tree.
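The tournament's winner is always the smallest or the greatest of all the players or values and,
being stored at the root, can be found in O(1). Building the winner tree takes N-1 comparisons,
i.e. O(N) time, where N represents the number of players; after one player's value changes, the
matches on the path from that leaf to the root can be replayed in O(log N).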
What is a loser tree?
In a tournament tree, when the internal nodes are used to represent the loser of the match
between two, then the tree obtained is referred to as the loser tree. When the loser is the
smaller value then the loser tree is referred to as the minimum loser tree, and when the loser
is the larger value, then the loser tree is referred to as the maximum loser tree.
The same idea is also applied here: the loser (or parent) is always equal to one of its
children, and the overall loser is always the greatest or smallest of all the players and can
be found in O(1). Also, the time needed to create a loser tree is O(N), where N is the number
of players, because N-1 matches have to be played.
Here, you can see that the trees in the sample are not linked to one another. An empty graph
and a single tree are other examples of a forest data structure.
Applications of Forest data structure
• Social networking websites
Tree and graph data structures are used by social networking services (like
Facebook, LinkedIn, Twitter, etc.) to describe their data. Adding two individuals
as friends builds a forest of two people.
• Big data web scrapers
The main page serves as the root node, and the consecutive hyperlinks from that
page serve as the nodes for the remainder of the Tree in a website's organizational
structure, which resembles a tree. When Web scrapers collect information from
several similar websites, they display it as a forest of trees.
• Operating system storage
You would be able to view different discs in the system, such as C drive (C:), D
drive (D:), etc., if you were using a Windows-based operating system. Each drive
may be compared to a distinct tree, while the entirety of the storage can be compared
to a forest.
TRANSFORMING A FOREST INTO A BINARY TREE
If T1, T2, ..., Tn is a forest of trees, then the binary tree corresponding to this forest,
denoted by B(T1, T2, ..., Tn),
• is empty if n = 0
• has root equal to root(T1)
• has left subtree equal to B(T11, T12, ..., T1m), where T11, T12, ..., T1m are the subtrees of root(T1)
• has right subtree B(T2, ..., Tn)
FOREST TRAVERSALS
There are 3 forest traversal techniques namely: preorder, inorder and postorder traversal.
Preorder traversal of forest F can be recursively defined as follows
• If F is empty then return.
• Visit the root of the first tree of F.
• Traverse the subtrees of the first tree in forest preorder.
• Traverse the remaining trees of F in forest preorder.
sets.
Array representation : The same 3 sets S1, S2 and S3 can be represented in the
form of array as shown below
Find(i):Find the set containing the element i. For example, 3 is in set S3 and 8 in set S1.
int find1(int i)
{
for(; parent[i]>=0; i=parent[i]);
return i;
}
void union1(int i, int j)
{
parent[i]= j;
}
If the number of nodes in tree i is less than the number in tree j then make j the parent
of i; otherwise make i the parent of j (Figure 5.41 & Program 5.17).
The number of distinct binary trees is equal to the number of distinct inorder permutations obtainable
from binary trees having the preorder permutation 1, 2, ..., n (Figure 5.49).
If we start with the numbers 1, 2 and 3, then the possible permutations obtainable by a stack are
(1,2,3), (1,3,2), (2,1,3), (2,3,1) and (3,2,1).
INTRODUCTION: GRAPH
• A graph G(V,E) consists of a set of vertices V and a set of edges E.
V is a finite set of vertices.
E is a set of pairs of vertices; these pairs are called edges. V(G) and E(G) represent the sets
of vertices and edges, respectively, of graph G (Figure 1).
GRAPH REPRESENTATIONS
Three commonly used representations are:
1) Adjacency matrices,
2) Adjacency lists and
3) Adjacency Multilists
1. Adjacency Matrix
• Let G=(V, E) be a graph with n vertices, n >= 1.
• The adjacency matrix of G is a two-dimensional n*n array with the property that a[i][j]=1 iff
the edge (i,j) is in E(G). a[i][j]=0 if there is no such edge in G (Figure 6).
3) Adjacency Multilist
In the adjacency list representation, an edge in an undirected graph is represented by two nodes.
Adjacency multilists are lists in which nodes may be shared among several lists: each edge is
represented by exactly one node, and that node is shared by the lists of its two end vertices.
Each edge node has the following fields:
M – mark field that records whether the edge has been visited or not
Vertex1 – starting vertex of edge (m,n) = m
Vertex2 – ending vertex of edge (m,n) = n
List1 – link to the next edge node incident on vertex1
List2 – link to the next edge node incident on vertex2
ADT OF GRAPH
GRAPH TRAVERSALS
DFS and BFS are common methods of graph traversal, which is the process of visiting every
vertex of a graph.
5. We choose B, mark it as visited and push it onto the stack. Here B does not have
any unvisited adjacent node, so we pop B from the stack.
DFS will visit the child vertices before visiting siblings using this algorithm:
• Mark the starting node/arbitrary vertex of the graph as visited and push it onto the stack
• While the stack is not empty
o Peek at the top node on the stack
o If there is an unvisited child of that node, mark the child as visited and push
the child node onto the stack
o Else pop the top node off the stack
Note: 1. On each iteration, the algorithm proceeds to an unvisited vertex that is adjacent to the one
it is currently in.
2. The process continues until a dead end (a vertex with no unvisited adjacent vertices) is
encountered.
3. At a dead end, the algorithm backs up one edge to the vertex it came from and tries to
continue visiting unvisited vertices from there.
4. The algorithm eventually halts after backing up to the starting vertex, with the latter
being a dead end.
Usage of Stack: Push a vertex onto the stack when the vertex is reached for the first time. Pop a
vertex when it becomes a dead end.
As C does not have any unvisited adjacent node so we keep popping the stack until we find a
node that has an unvisited adjacent node.
Application Of Depth-First Search Algorithm
A DFS traversal of an unweighted, connected graph produces a spanning tree; because all edges
count equally, it is also a minimum spanning tree.
Procedure:
1. Create a recursive function that takes arguments as a boolean array visited of size V and
index u denoting the current node (when it will be called initially, u will be 0 or any other
user-defined value).
2. Mark the current node as visited, i.e. visited[u] = True.
4. Search for all the vertices v adjacent to node u and identify the unvisited ones.
As soon as an unvisited vertex v adjacent to u is found, make a DFS call with index v.
For DFS we have used an auxiliary boolean array of size V where V is the number of vertices
present in the given graph G. The visited array will keep information about which vertices have
been already visited and which are not.
DFS(visited, u):
    visited[u] = True
    Print u
    for each vertex v adjacent to u:
        if (visited[v] == False)
            DFS(visited, v)
2. The queue is initialized with the traversal's starting vertex, which is marked as visited.
3. On each iteration, the algorithm identifies the unvisited vertices that are adjacent to the
front vertex, marks them as visited and adds them to the queue; after that, the front vertex is
removed from the queue.
Algorithm
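BFS(v)
// Input: starting vertex v (a is the adjacency matrix, q the queue with front f and rear r)
// Output: visited vertices, marked in visited[]
{
    for(i = 1; i <= n; i++)
        if(a[v][i] == 1 && visited[i] == 0)
            q[++r] = i;          // enqueue every unvisited neighbour of v
    if(f <= r)                   // queue is not empty
    {   visited[q[f]] = 1;
        BFS(q[f++]);             // continue from the front vertex
    }
}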
Breadth First Search of a graph
At this stage, we are left with no unmarked (unvisited) nodes, but as per the algorithm we keep
dequeuing until the queue is empty. When the queue is empty, the traversal is over.
Breadth-First Search finds applications in the following areas:
1. Minimum spanning tree for unweighted graphs: In Breadth-First Search we can reach
any other vertex from a given source vertex with the minimum number of edges, and this
principle can be used to find a spanning tree that covers all vertices along shortest paths
from the source.
2. Peer to peer networking: In Peer to peer networking, to find the neighbouring peer from
any other peer, the Breadth-First Search is used.
3. Crawlers in search engines: Search engines need to crawl the internet. To do so, they can
start from any source page, follow the links contained in that page in the Breadth-First
Search manner, and therefore explore other pages.
4. GPS navigation systems: To find locations within a given radius from any source person,
we can find all neighbouring locations using the Breadth-First Search, and keep on exploring
until those are within the K radius.
5. Broadcasting in networks: While broadcasting from any source, we find all its neighbouring
peers and continue broadcasting to them, and so on.
6. Path Finding: To find if there is a path between 2 vertices, we can take any vertex as a
source, and keep on traversing until we reach the destination vertex. If we explore all
vertices reachable from the source and could not find the destination vertex, then that means
there is no path between these 2 vertices.
7. Finding all reachable nodes from a given vertex: All vertices which are reachable from a
given vertex can be found using the BFS approach, even in a disconnected graph. The vertices
marked as visited in the visited array after the BFS is complete are exactly those
reachable vertices.
Connected components:
DFS or BFS functions can be used to check if the graph is connected or not.
Note: DFS function in lab program 11 is used to check if the graph is connected or not.
Spanning Trees:
A spanning tree is any tree that consists solely of edges in G and that includes all the vertices in G.
Biconnected Components:
An articulation point is a vertex v of G such that the deletion of v, together with all edges incident
on v, produces a graph G’ that has at least 2 connected components. Ex: 1, 3, 5 and 7 are the
articulation points in the following graph.
A biconnected graph is a connected graph that has no articulation points.
Design, Develop and Implement a Program in C for the following operations on Graph(G) of Cities
a. Create a Graph of N cities using Adjacency Matrix.
b. Print all the nodes reachable from a given starting node in a digraph using BFS method
c. Check whether a given graph is connected or not using DFS method
void dfs(int v)
{ int i;
reach[v]=1;
for(i=1;i<=n;i++)
{ if(a[v][i] && !reach[i])
{ printf("\n %d->%d",v,i);
count++;
dfs(i);
}
}
}
void main()
{ int v, choice;
printf("\n Enter the number of vertices:");
scanf("%d",&n);
for(i=1;i<=n;i++)
{ q[i]=0;
visited[i]=0;
reach[i]=0;
}
printf("\n Enter graph data in matrix form:\n");
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
scanf("%d",&a[i][j]);
for(;;)
{ printf("\n 1.BFS\n 2.DFS\n 3.Exit\n");
scanf("%d",&choice);
switch(choice)
{ case 1:
    printf("\n Enter the starting vertex:");
    scanf("%d",&v);
    if( v < 1 || v > n )            /* validate v before traversing */
        printf("\n Bfs is not possible");
    else
    {   bfs(v);
        printf("\n The nodes which are reachable from %d:\n",v);
        for(i=1; i<=n; i++)
            if( visited[i] )
                printf("%d\t",i);
    } break;
  case 2:
    count = 0;                      /* reset so that DFS can be repeated */
    for(i=1; i<=n; i++)
        reach[i] = 0;
    dfs(1);
    if(count==n-1)
        printf("\n Graph is connected");
    else
        printf("\n Graph is not connected");
    break;
  case 3: exit(0);
  default: printf("Invalid Choice\n"); exit(0); } }}
S. No. | Breadth First Search (BFS) | Depth First Search (DFS)
5. | Some Applications: | Some Applications:
   | Finding all connected components in a graph. | Topological sorting.
   | Finding the shortest path between two nodes. | Finding connected components.
   | Finding all nodes within one connected component. | Solving puzzles such as maze.
   | Testing a graph for bipartiteness. | Finding strongly connected components.
   | | Finding articulation points (cut vertices) of the graph.