Unit 4 Notes
Here,
Node A is the root node
B is the parent of D and E
D and E are siblings
D, E, F and G are the leaf nodes
A and B are the ancestors of E
Here, the root node is A. All the nodes on the left of A are a part of the left
subtree whereas all the nodes on the right of A are a part of the right subtree.
Thus, according to preorder traversal, we first visit the root node, so A is
printed first, and then we move to the left subtree.
B is the root node of the left subtree, so B is printed next, and then we visit
the left and right nodes of B. In this manner, we traverse the whole left
subtree and then move to the right subtree. Thus, the order of visiting the
nodes will be A→B→C→D→E→F→G→H→I.
Algorithm for Preorder Traversal
o For all nodes of the tree:
Step 1: Visit the root node.
Step 2: Traverse left subtree recursively.
Step 3: Traverse right subtree recursively.
Pseudo-code for Preorder Traversal
void Preorder(struct node* ptr)
{
    if(ptr != NULL)
    {
        printf("%d", ptr->data);
        Preorder(ptr->left);
        Preorder(ptr->right);
    }
}
Uses of Preorder Traversal
o If we want to create a copy of a tree, we make use of preorder
traversal.
o Preorder traversal helps to give a prefix expression for the
expression tree.
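The tree-copy use mentioned above can be sketched in C as follows. This is only a minimal illustration: the names `new_node` and `copy_tree` are made up for this example, and the `struct node` layout (an `int` payload with `left` and `right` pointers) is assumed from the pseudo-code rather than given in these notes.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Allocate a leaf node; returns NULL if malloc fails. */
struct node *new_node(int data)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->left = NULL;
        n->right = NULL;
    }
    return n;
}

/* Copy a tree in preorder: root first, then left subtree, then right subtree.
   For brevity, nodes already copied are not freed if a later malloc fails. */
struct node *copy_tree(const struct node *ptr)
{
    if (ptr == NULL)
        return NULL;
    struct node *copy = new_node(ptr->data);   /* visit the root */
    if (copy == NULL)
        return NULL;
    copy->left = copy_tree(ptr->left);         /* then the left subtree */
    copy->right = copy_tree(ptr->right);       /* then the right subtree */
    return copy;
}
```

The root has to be created before its children can be attached to it, which is why the preorder sequence is a natural fit for copying.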
Inorder Traversal
In an inorder traversal, we first traverse the left subtree in an inorder
manner, then visit the root node, and then traverse the right subtree in an
inorder manner.
Consider the following tree:
In this case, as we visit the left subtree first, we get the node with the value
30 first, then 20 and then 40. After that, we will visit the root node and print
it. Then comes the turn of the right subtree. We will traverse the right
subtree in a similar manner. Thus, after performing the inorder traversal, the
order of nodes will be 30→20→40→10→50→70→60→80.
Algorithm for Inorder Traversal
o For all nodes of the tree:
Step 1: Traverse left subtree recursively.
Step 2: Visit the root node.
Step 3: Traverse right subtree recursively.
Pseudo-code for Inorder Traversal
void Inorder(struct node* ptr)
{
    if(ptr != NULL)
    {
        Inorder(ptr->left);
        printf("%d", ptr->data);
        Inorder(ptr->right);
    }
}
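To check an inorder sequence in a program instead of reading printed output, the same recursion can write into an array. The sketch below is illustrative only: `inorder_collect` and its parameters are hypothetical names, and the `struct node` layout is assumed from the pseudo-code.

```c
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Append the inorder sequence of the tree rooted at ptr to out[],
   starting at index *count. The caller must make out[] large enough. */
void inorder_collect(const struct node *ptr, int out[], size_t *count)
{
    if (ptr != NULL) {
        inorder_collect(ptr->left, out, count);    /* left subtree */
        out[(*count)++] = ptr->data;               /* root */
        inorder_collect(ptr->right, out, count);   /* right subtree */
    }
}
```

For a root 10 whose left child 20 has children 30 and 40 (the left part of the tree discussed above), the array receives 30, 20, 40, 10.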
Uses of Inorder Traversal
o In a binary search tree, it visits the nodes in ascending order of their
keys.
o It helps to get the infix expression in an expression tree.
Postorder Traversal
Postorder traversal is a kind of traversal in which we first traverse the left
subtree in a postorder manner, then traverse the right subtree in a postorder
manner and at the end visit the root node.
For example, in the following tree:
The postorder traversal will be 7→5→4→20→60→30→10.
Algorithm for Postorder Traversal
o For all nodes of the tree:
Step 1: Traverse left subtree recursively.
Step 2: Traverse right subtree recursively.
Step 3: Visit the root node.
Pseudo-code for Postorder Traversal
void Postorder(struct node* ptr)
{
    if(ptr != NULL)
    {
        Postorder(ptr->left);
        Postorder(ptr->right);
        printf("%d", ptr->data);
    }
}
Uses of Postorder Traversal
o It helps to delete the tree.
o It helps to get the postfix expression in an expression tree.
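Deleting a tree is a good illustration of why the root is visited last. The following is a minimal sketch, not a definitive implementation: `delete_tree` is a hypothetical name, the nodes are assumed to have been allocated with `malloc`, and the return value (the number of nodes freed) is added only so the behaviour can be checked.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Free every node in postorder and return how many nodes were freed.
   A node is released only after both of its subtrees, so no child
   pointer is ever read from memory that has already been freed. */
int delete_tree(struct node *ptr)
{
    if (ptr == NULL)
        return 0;
    int freed = delete_tree(ptr->left);    /* left subtree first */
    freed += delete_tree(ptr->right);      /* then right subtree */
    free(ptr);                             /* root last */
    return freed + 1;
}
```

Freeing the root first would lose the only pointers to its children, so a preorder deletion would need to save them beforehand.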
4.4 EXPRESSION TREES
An expression tree is a tree used to represent an expression. In this tree, the
internal nodes always denote the operators and the leaf nodes always denote the
operands.
For example, expression tree for 3 + ((5+9)*2) would be:
o Next, a '+' symbol is read, so two pointers to trees are popped, a new tree
is formed, and a pointer to it is pushed onto the stack.
o Next, 'c' is read; we create a one-node tree and push a pointer to it onto
the stack.
o Finally, the last symbol '*' is read; we pop two tree pointers and form a
new tree with '*' as root, and a pointer to the final tree remains on
the stack.
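The stack-based construction described above can be sketched in C. Several things here are assumptions made for the example: operands are single digits (so the tree can also be evaluated), the four operators are `+ - * /`, the stack holds at most 64 pointers, and the names `enode`, `build_from_postfix` and `evaluate` are hypothetical.

```c
#include <ctype.h>
#include <stdlib.h>

#define MAX_STACK 64

struct enode {
    char symbol;            /* operator, or a single-digit operand */
    struct enode *left;
    struct enode *right;
};

static int is_operator(char c)
{
    return c == '+' || c == '-' || c == '*' || c == '/';
}

/* Build an expression tree from a postfix string such as "359+2*+".
   Returns NULL on malformed input or allocation failure
   (partially built trees are not freed in this sketch). */
struct enode *build_from_postfix(const char *postfix)
{
    struct enode *stack[MAX_STACK];
    int top = 0;

    for (const char *p = postfix; *p != '\0'; p++) {
        int operand = isdigit((unsigned char)*p);
        if (!operand && !is_operator(*p))
            return NULL;                    /* unexpected symbol */
        if (operand ? top == MAX_STACK : top < 2)
            return NULL;                    /* stack full, or operator lacks operands */
        struct enode *n = malloc(sizeof *n);
        if (n == NULL)
            return NULL;
        n->symbol = *p;
        n->left = n->right = NULL;
        if (!operand) {                     /* operator: pop right, then left */
            n->right = stack[--top];
            n->left = stack[--top];
        }
        stack[top++] = n;                   /* push pointer to the new tree */
    }
    return top == 1 ? stack[0] : NULL;      /* exactly one tree must remain */
}

/* Evaluate the tree in postorder; n must be a non-NULL tree built above. */
int evaluate(const struct enode *n)
{
    if (n->left == NULL)                    /* leaf: operand */
        return n->symbol - '0';
    int a = evaluate(n->left);
    int b = evaluate(n->right);
    switch (n->symbol) {
    case '+': return a + b;
    case '-': return a - b;
    case '*': return a * b;
    default:  return a / b;                 /* '/': divisor assumed nonzero */
    }
}
```

The postfix form of 3 + ((5+9)*2) is 3 5 9 + 2 * +, so `build_from_postfix("359+2*+")` yields a tree with '+' at the root, and `evaluate` returns 31 for it.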
Step 6 - Insert 55
o 55 is larger than 45 and smaller than 79, so it will be inserted as the left
subtree of 79.
Step 7 - Insert 12
o 12 is smaller than 45 and 15 but greater than 10, so it will be inserted as the
right subtree of 10.
Step 8 - Insert 20
o 20 is smaller than 45 but greater than 15, so it will be inserted as the right
subtree of 15.
Step 9 - Insert 50
o 50 is greater than 45 but smaller than 79 and 55. So, it will be inserted as a
left subtree of 55.
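The insertion steps above follow one rule: compare with the current node, go left if smaller and right if larger, and attach the key where the path ends. A minimal recursive sketch is given below. The name `bst_insert` and the choice to ignore duplicate keys are assumptions of this example, and since the earlier steps are not shown here, the order in which 45, 15, 79 and 10 are inserted in the usage note is assumed too.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Insert key into the BST rooted at root and return the root of the
   resulting tree. Duplicate keys are ignored. If malloc fails, the
   tree is left unchanged. */
struct node *bst_insert(struct node *root, int key)
{
    if (root == NULL) {                     /* end of the path: attach here */
        struct node *n = malloc(sizeof *n);
        if (n != NULL) {
            n->data = key;
            n->left = NULL;
            n->right = NULL;
        }
        return n;
    }
    if (key < root->data)
        root->left = bst_insert(root->left, key);
    else if (key > root->data)
        root->right = bst_insert(root->right, key);
    return root;
}
```

Inserting 45, 15, 79, 10 and then 55, 12, 20, 50 places 55 as the left child of 79, 12 as the right child of 10, 20 as the right child of 15, and 50 as the left child of 55, matching Steps 6 to 9.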
4.5.3.1.2 Algorithm to search an element in Binary search tree
Search (root, item)
Step 1 - if (root = NULL) or (item = root → data)
return root
else if (item < root → data)
return Search(root → left, item)
else
return Search(root → right, item)
END if
Step 2 - END
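The search can be written in C as sketched below. This version is iterative rather than recursive, which is a design choice of the example and not something these notes prescribe. The name `bst_search` is hypothetical, and the `struct node` layout is assumed from the traversal pseudo-code.

```c
#include <stddef.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Return the node holding item, or NULL if item is not in the tree.
   The NULL test comes first, so root->data is never read through a
   null pointer. */
const struct node *bst_search(const struct node *root, int item)
{
    while (root != NULL && root->data != item)
        root = (item < root->data) ? root->left : root->right;
    return root;
}
```

Each comparison discards one whole subtree, so the number of steps is bounded by the height of the tree.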
4.6 HASHING
Hashing in the data structure is a technique of mapping a large chunk of data
into small tables using a hashing function. The hashing function is also known
as a message digest function, mainly in cryptography. Hashing lets us quickly
locate a specific item in a collection of similar items.
It uses hash tables to store the data in an array format. Each value in the array
has been assigned a unique index number. Hash tables use a technique to
generate these unique index numbers for each value stored in an array format.
This technique is called the hash technique.
To retrieve an item, you only need to compute its index rather than search
through the data. With this index, you can go directly to the item instead of
scanning the entire list. Indexing also helps in insert operations when you
need to insert data at a specific location. With a good hash function, updating
and retrieving data takes roughly constant time on average, no matter how big
or small the table is.
The hash table is basically an array of elements, and the hash technique of
search is performed on a part of the item, i.e. the key. Each key is mapped to
a number in the range from 0 to table size - 1.
Hashing in data structure is a two-step process:
o The hash function converts the item into a small integer or hash value. This
integer is used as an index to store the original data.
o It stores the data in a hash table. You can use a hash key to locate data
quickly.
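The two steps can be sketched in C. The example assumes non-negative integer keys and the modulo hash function (key % table size) that is used later in these notes. The names `hash_index` and `store` are hypothetical, and collisions are deliberately not handled here.

```c
#include <stddef.h>

/* Step 1: map a key to an index in the range 0 .. table_size - 1.
   table_size must be nonzero. */
size_t hash_index(unsigned int key, size_t table_size)
{
    return key % table_size;
}

/* Step 2: store the key at the computed index.
   No collision handling: an earlier key in that slot is overwritten. */
void store(unsigned int table[], size_t table_size, unsigned int key)
{
    table[hash_index(key, table_size)] = key;
}
```

With a table of size 6, `hash_index(65, 6)` is 5, so `store` writes 65 into slot 5 without looking at any other slot.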
4.6.1 Examples
In schools, the teacher assigns a unique roll number to each student. Later, the
teacher uses that roll number to retrieve information about that student.
A library has a very large number of books. The librarian assigns a unique number
to each book. This unique number helps in identifying the position of the books
on the bookshelf.
The lookup cost is that of scanning all the entries of the selected linked list
for the required key. If the keys are uniformly distributed, then the average
lookup cost will be the average number of keys per linked list.
Step 2: Now we will insert all the keys in the hash table one by one. The first
key to be inserted is 24. It will map to bucket number 0, which is calculated by
using the hash function 24%6=0.
Step 3: The next key that needs to be inserted is 75. It will map to bucket
number 3 because 75%6=3. So insert it into bucket number 3.
Step 4: The next key is 65. It will map to bucket number 5 because 65%6=5. So,
insert it to bucket number 5.
Step 5: Now the next key is 81. Its bucket number will be 81%6=3. But bucket 3
is already occupied by key 75. So the separate chaining method will handle the
collision by adding 81 to a linked list at bucket 3.
Step 6: Now the next key is 42. Its bucket number will be 42%6=0. But bucket 0
is already occupied by key 24. So the separate chaining method will again handle
the collision by adding 42 to a linked list at bucket 0.
Step 7: Now the last key to be inserted is 63. It will map to bucket number
63%6=3. Since bucket 3 is already occupied, a collision occurs, but the separate
chaining method will handle it by adding 63 to the linked list at bucket 3.
In this way the separate chaining method is used as the collision resolution
technique.
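The walkthrough above can be reproduced with a small C sketch. It uses the same 6 buckets and the same hash function (key % 6). The names `chained_table`, `chain_insert`, `chain_length` and `chain_contains` are hypothetical, and inserting at the head of each list is a simplification chosen for this example.

```c
#include <stdlib.h>

#define BUCKETS 6

struct entry {
    int key;
    struct entry *next;
};

struct chained_table {
    struct entry *bucket[BUCKETS];   /* each bucket is the head of a linked list */
};

/* Insert key (assumed non-negative) at the head of its bucket's list.
   Returns 0 on success, -1 if malloc fails. */
int chain_insert(struct chained_table *t, int key)
{
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    int b = key % BUCKETS;
    e->key = key;
    e->next = t->bucket[b];          /* a collision simply lengthens the list */
    t->bucket[b] = e;
    return 0;
}

/* Number of keys chained in bucket b. */
int chain_length(const struct chained_table *t, int b)
{
    int n = 0;
    for (const struct entry *e = t->bucket[b]; e != NULL; e = e->next)
        n++;
    return n;
}

/* Return 1 if key is present, 0 otherwise. Only the single bucket
   selected by the hash function is scanned. */
int chain_contains(const struct chained_table *t, int key)
{
    for (const struct entry *e = t->bucket[key % BUCKETS]; e != NULL; e = e->next)
        if (e->key == key)
            return 1;
    return 0;
}
```

After inserting 24, 75, 65, 81, 42 and 63, bucket 0 holds two keys, bucket 3 holds three keys and bucket 5 holds one, as in Steps 2 to 7.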
4.10.1 Solution
Although quadratic probing eliminates primary clustering, it still has a
problem.
When two keys hash to the same location, they will probe the same sequence of
alternative locations. This may cause secondary clustering. To avoid secondary
clustering, the double hashing method was created, in which a second hash
function determines the probe step at the cost of extra multiplications and
divisions.
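The difference between the two probe sequences can be seen in a short C sketch. All concrete values are assumptions chosen for illustration: a table of size 11, a second hash of the common form R - (key % R) with R = 7, and the hypothetical function names `quadratic_probe`, `hash2` and `double_hash_probe`. Keys are assumed to be non-negative.

```c
#define TABLE_SIZE 11      /* assumed prime table size */
#define SECOND_PRIME 7     /* assumed prime smaller than TABLE_SIZE */

/* i-th location tried by quadratic probing. It depends only on the
   home slot, so keys with the same home slot follow the same path. */
int quadratic_probe(int key, int i)
{
    return (key % TABLE_SIZE + i * i) % TABLE_SIZE;
}

/* Second hash function: always in 1 .. SECOND_PRIME, never 0,
   so every probe moves to a different slot. */
int hash2(int key)
{
    return SECOND_PRIME - (key % SECOND_PRIME);
}

/* i-th location tried by double hashing: the step size comes from hash2. */
int double_hash_probe(int key, int i)
{
    return (key % TABLE_SIZE + i * hash2(key)) % TABLE_SIZE;
}
```

Keys 22 and 33 both have home slot 0. Under quadratic probing both try slots 0, 1, 4. Under double hashing, 22 has step 6 and tries 0, 6, 1, while 33 has step 2 and tries 0, 2, 4, so the two keys no longer compete for the same alternative locations.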
The problem with linear probing is primary clustering. This means that even if
the table is relatively empty, any key that hashes into a cluster requires
several attempts to resolve the collision because it has to cross over the block
of occupied cells. These blocks of occupied cells form the primary clusters. If
a key falls into a cluster, we cannot predict the number of attempts needed to
resolve the collision. These long probe paths affect the performance of the hash
table.
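Primary clustering can be observed by counting the attempts each insertion needs. The sketch below is illustrative only: it assumes a table of size 10, non-negative keys, the hash function key % 10 and -1 as the marker for an empty cell, and `linear_insert` is a hypothetical name.

```c
#define TABLE_SIZE 10
#define EMPTY (-1)

/* Insert key with linear probing. Returns the number of cells
   examined, or 0 if the table is full. */
int linear_insert(int table[TABLE_SIZE], int key)
{
    int home = key % TABLE_SIZE;
    for (int attempt = 0; attempt < TABLE_SIZE; attempt++) {
        int slot = (home + attempt) % TABLE_SIZE;   /* next cell, wrapping around */
        if (table[slot] == EMPTY) {
            table[slot] = key;
            return attempt + 1;
        }
    }
    return 0;
}
```

Starting from an empty table, inserting 10, 20 and 30 takes 1, 2 and 3 attempts and fills cells 0 to 2. Inserting 11 afterwards takes 3 attempts even though no earlier key hashed to cell 1, because it has to cross the cluster that has already formed.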
4.13 RE-HASHING
Rehashing is the process of recalculating the hash code of already stored
entries (key-value pairs) in order to move them to a bigger hash map when the
load threshold is reached or crossed.
If you are going to store a really large number of elements in the HashTable,
then it is usually better to create the HashTable with sufficient capacity
upfront, as this is more efficient than letting it perform automatic rehashing
repeatedly.
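A rehash can be sketched in C as follows. Everything specific in this example is an assumption: the table stores non-negative integer keys only (no values), collisions are resolved with linear probing, the threshold is a load factor of 0.75, and the capacity doubles on each rehash. The names `hash_table`, `table_init` and `table_insert` are hypothetical, and duplicate keys are not checked.

```c
#include <stdlib.h>

#define EMPTY (-1)

struct hash_table {
    int *slots;          /* EMPTY marks a free slot */
    size_t capacity;
    size_t count;
};

/* Place key using linear probing; the caller guarantees a free slot. */
static void place(int *slots, size_t capacity, int key)
{
    size_t i = (size_t)key % capacity;
    while (slots[i] != EMPTY)
        i = (i + 1) % capacity;
    slots[i] = key;
}

/* Create an empty table with the given capacity (> 0). Returns 0 on success. */
int table_init(struct hash_table *t, size_t capacity)
{
    t->slots = malloc(capacity * sizeof *t->slots);
    if (t->slots == NULL)
        return -1;
    for (size_t i = 0; i < capacity; i++)
        t->slots[i] = EMPTY;
    t->capacity = capacity;
    t->count = 0;
    return 0;
}

/* Rehash: allocate a table twice as large and re-insert every key,
   because key % capacity changes when the capacity changes. */
static int rehash(struct hash_table *t)
{
    size_t new_capacity = t->capacity * 2;
    int *new_slots = malloc(new_capacity * sizeof *new_slots);
    if (new_slots == NULL)
        return -1;
    for (size_t i = 0; i < new_capacity; i++)
        new_slots[i] = EMPTY;
    for (size_t i = 0; i < t->capacity; i++)
        if (t->slots[i] != EMPTY)
            place(new_slots, new_capacity, t->slots[i]);
    free(t->slots);
    t->slots = new_slots;
    t->capacity = new_capacity;
    return 0;
}

/* Insert key, rehashing first if the insert would push the load
   factor above 0.75. Returns 0 on success, -1 on allocation failure. */
int table_insert(struct hash_table *t, int key)
{
    if ((t->count + 1) * 4 > t->capacity * 3 && rehash(t) != 0)
        return -1;
    place(t->slots, t->capacity, key);
    t->count++;
    return 0;
}
```

Starting with a capacity of 4, the keys 24, 75 and 65 fit without resizing. Inserting 81 would exceed the threshold, so the table first grows to 8 slots and the three stored keys are moved to the positions given by key % 8. Choosing a large enough initial capacity in `table_init` avoids these repeated moves.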