Chapter 1
The way data are organized in a computer's memory is called a data structure, and the
sequence of computational steps used to solve a problem is called an algorithm. Therefore, a
program is nothing but data structures plus algorithms.
Given a problem, the first step in solving it is obtaining one's own abstract view, or
model, of the problem. This process of modeling is called abstraction.
The model defines an abstract view of the problem. This implies that the model focuses only on
problem-related details, and that the programmer tries to define the properties of the problem.
With abstraction you create a well-defined entity that can be properly handled. These entities
define the data structure of the program.
An entity with the properties just described is called an abstract data type (ADT).
A data structure is a language construct that the programmer has defined in order to implement
an abstract data type.
There are lots of formalized and standard Abstract data types such as Stacks, Queues, Trees, etc.
1.1.2. Abstraction
Abstraction is a process of classifying characteristics as relevant and irrelevant for the particular
purpose at hand and ignoring the irrelevant ones.
How do data structures model the world or some part of the world?
• The value held by a data structure represents some specific characteristic of the world
• The characteristic being modeled restricts the possible values held by a data structure
• The characteristic being modeled restricts the possible operations to be performed on the
data structure.
Note: Notice the relation between characteristic, value, and data structures
1.2. Algorithms
An algorithm is a well-defined computational procedure that takes some value or a set of values
as input and produces some value or a set of values as output. Data structures model the static
part of the world. They are unchanging while the world is changing. In order to model the
dynamic part of the world we need to work with algorithms. Algorithms are the dynamic part of
a program’s world model.
An algorithm transforms data structures from one state to another state in two ways:
• An algorithm may change the value held by a data structure
• An algorithm may change the data structure itself
The quality of a data structure is related to its ability to successfully model the characteristics of
the world. Similarly, the quality of an algorithm is related to its ability to successfully simulate
the changes in the world.
However, independent of any particular world model, the quality of data structure and
algorithms is determined by their ability to work together well. Generally speaking, correct data
structures lead to simple and efficient algorithms and correct algorithms lead to accurate and
efficient data structures.
Algorithm analysis refers to the process of determining the amount of computing time and
storage space required by different algorithms. In other words, it’s a process of predicting the
resource requirement of algorithms in a given environment.
In order to solve a problem, there are many possible algorithms. One has to be able to choose
the best algorithm for the problem at hand using some scientific method. To classify some data
structures and algorithms as good, we need precise ways of analyzing them in terms of resource
requirement. The main resources are:
• Running Time
• Memory Usage
• Communication Bandwidth
Running time is usually treated as the most important since computational time is the most
precious resource in most problem domains.
The goal is to have a meaningful measure that permits comparison of algorithms independent of
operating platform.
There are two things to consider:
• Time Complexity: Determine the approximate number of operations required to solve a
problem of size n.
• Space Complexity: Determine the approximate memory required to solve a problem of
size n.
• Algorithm Analysis: Analysis of the algorithm or data structure to produce a function
T(n) that describes the algorithm in terms of the operations performed in order to
measure the complexity of the algorithm.
• Order of Magnitude Analysis: Analysis of the function T (n) to determine the general
complexity category to which it belongs.
There is no generally accepted set of rules for algorithm analysis. However, an exact count of
operations is commonly used.
Examples:
1. int count(){
       int k=0, n;
       cout<< "Enter an integer";
       cin>>n;
       for (int i=0;i<n;i++)
           k=k+1;
       return 0;
   }
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int k=0
1 for the output statement.
1 for the input statement.
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+1+1+(1+n+1+n)+2n+1 = 4n+6 = O(n)
2. int total(int n)
{
int sum=0;
for (int i=1;i<=n;i++)
sum=sum+1;
return sum;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int sum=0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 2 units for an assignment, and an addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+ (1+n+1+n)+2n+1 = 4n+4 = O(n)
3. void func()
   {
       int x=0;
       int i=0;
       int j=1;
       int n;
       cout<< "Enter an Integer value";
       cin>>n;
       while (i<n){
           x++;
           i++;
       }
       while (j<n)
       {
           j++;
       }
   }
In the first while loop:
n+1 tests
n loops of 2 units for the two increment (addition) operations
In the second while loop:
n tests
n-1 increments
-------------------------------------------------------------------
T (n)= 1+1+1+1+1+n+1+2n+n+n-1 = 5n+5 = O(n)
4. int sum (int n)
{
int partial_sum = 0;
for (int i = 1; i <= n; i++)
partial_sum = partial_sum +(i * i * i);
return partial_sum;
}
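Example 4 above is left without an operation count. Following the counting convention of the earlier examples (one unit each for an assignment, a test, an increment, an addition, and a multiplication), a plausible count is:

Time Units to Compute
-------------------------------------------------
1 for the assignment statement: int partial_sum = 0
In the for loop:
1 assignment, n+1 tests, and n increments.
n loops of 4 units for an assignment, an addition, and two multiplications.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+(1+n+1+n)+4n+1 = 6n+4 = O(n)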
Example:
1. int doubler (int n)
{
int res = 0;
for (int i = 1; i <= n; i+=2)
res = res + i;
return res;
}
Time Units to Compute
-------------------------------------------------
1 for the assignment.
In loop:
1 assignment, (n/2)+1 tests,
n/2 increments.
n/2 loops of 2 units for an assignment and addition.
1 for the return statement.
-------------------------------------------------------------------
T (n)= 1+(1+n/2+1+n/2)+n+1 = 2n+4 = O(n)
for (int i = 1; i <= N; i++) {
    sum = sum + i;
}
∑ (i=1 to N) 1 = N
• Suppose we count the number of additions that are done. There is 1 addition per iteration of
the loop, hence N additions in total.
for (int i = 1; i <= N; i++) {
    for (int j = 1; j <= M; j++) {
        sum = sum + i + j;
    }
}
∑ (i=1 to N) ∑ (j=1 to M) 2 = ∑ (i=1 to N) 2M = 2MN
Again, count the number of additions. The outer summation is for the outer for loop.
Conditionals: Formally
• If (test) s1 else s2: Compute the maximum of the running time for s1 and s2.
if (test == 1) {
    for (int i = 1; i <= N; i++) {
        sum = sum + i;
    }
}
else {
    for (int i = 1; i <= N; i++) {
        for (int j = 1; j <= N; j++) {
            sum = sum + i + j;
        }
    }
}
max( ∑ (i=1 to N) 1 , ∑ (i=1 to N) ∑ (j=1 to N) 2 ) = max(N, 2N²) = 2N²
Example:
Suppose we have hardware capable of executing 10⁶ instructions per second. How long would it
take to execute an algorithm whose complexity function is T(n) = 2n² on an input of size n = 10⁸?
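The notes leave the computation to the reader. A sketch of the answer, with the figures from the statement wrapped in a small helper function (the function name is our own):

```cpp
#include <cassert>

// Seconds needed to execute T(n) = 2*n*n operations on a machine
// running `speed` instructions per second.
double running_time_seconds(double n, double speed) {
    return 2.0 * n * n / speed;
}
```

For n = 10⁸ and speed = 10⁶, this gives 2×10¹⁶ / 10⁶ = 2×10¹⁰ seconds, roughly 634 years - which is why a quadratic algorithm is hopeless at this input size.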
Exercise 1
Determine the run time equation and complexity of each of the following code segments.
3. int k=0;
for (int i=0; i<n; i++)
for (int j=i; j<n; j++)
k++;
What is the value of k when n is equal to 20?
4. int k=0;
for (int i=1; i<n; i*=2)
for(int j=1; j<n; j++)
k++;
What is the value of k when n is equal to 20?
5. int x=0;
for(int i=1; i<n; i=i+5)
x++;
What is the value of x when n=25?
6. int x=0;
for(int k=n; k>=n/3; k=k-5)
x++;
What is the value of x when n=25?
7. int x=0;
for (int i=1; i<n; i=i+5)
for (int k=n; k>=n/3; k=k-5)
x++;
What is the value of x when n=25?
8. int x=0;
for(int i=1; i<n; i=i+5)
for(int j=0; j<i; j++)
for(int k=n; k>=n/2; k=k-3)
x++;
What is the correct big-Oh Notation for the above code segment?
In order to determine the running time of an algorithm it is possible to define three functions
Tbest(n), Tavg(n) and Tworst(n) as the best, the average and the worst case running time of the
algorithm respectively.
Average Case (Tavg): The amount of time the algorithm takes on an "average" set of inputs.
Worst Case (Tworst): The amount of time the algorithm takes on the worst possible set of inputs.
Best Case (Tbest): The amount of time the algorithm takes on the smallest possible set of inputs.
We are interested in the worst-case time, since it provides a bound for all input – this is called
the “Big-Oh” estimate.
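As a concrete illustration (not in the original text), a sequential search over an array exhibits all three cases. The function below, with a name of our choosing, counts the element comparisons performed:

```cpp
#include <cassert>

// Count the comparisons a sequential search makes while looking for
// target in list[0..n-1]: 1 in the best case (first element matches),
// n in the worst case (target is last or absent).
int search_comparisons(const int list[], int n, int target) {
    int comparisons = 0;
    for (int i = 0; i < n; i++) {
        comparisons++;
        if (list[i] == target)
            break;   // found: stop searching
    }
    return comparisons;
}
```

Here Tbest(n) = 1, Tworst(n) = n, and the Big-Oh estimate O(n) comes from the worst case.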
There are five notations used to describe a running time function. These are:
• Big-Oh Notation (O)
• Big-Omega Notation (Ω)
• Theta Notation (Θ)
• Little-o Notation (o)
• Little-Omega Notation (ω)
We use O-notation to give an upper bound on a function, to within a constant factor. Since
O-notation describes an upper bound, when we use it to bound the worst-case running time of an
algorithm, by implication we also bound the running time of the algorithm on arbitrary inputs as
well.
Formal Definition: f(n)= O(g(n)) if there exist constants c, k ∊ ℛ+ such that for all n≥ k,
f(n)≤c.g(n).
Examples: The following points are facts that you can use for Big-Oh problems:
• 1 <= n for all n >= 1
• n <= n² for all n >= 1
• 2ⁿ <= n! for all n >= 4
• log₂n <= n for all n >= 2
• n <= n log₂n for all n >= 2
Exercise:
f(n) = (3/2)n² + (5/2)n − 3
Show that f(n) = O(n²)
Solution:
For n >= 1, (5/2)n <= (5/2)n², and −3 <= 3n². Thus,
(3/2)n² + (5/2)n − 3 <= (3/2)n² + (5/2)n² + 3n²
(3/2)n² + (5/2)n − 3 <= 8n²
(3/2)n² + (5/2)n − 3 = O(n²)    (c=8, k=1)
Diagrammatically, the function f(n) and its big-Oh bound g(n) can be drawn on the x-y plane (the
y coordinate is time, the x coordinate is the size of the input) as follows. Generally,
f(n) = O(g(n)) means that the growth rate of f(n) is less than or equal to that of g(n).
Typical Orders
Here is a table of some typical cases. This uses logarithms to base 2, but these are simply
proportional to logarithms in other bases.
Big-O expresses an upper bound on the growth rate of a function, for sufficiently large values
of n. An upper bound is the best algorithmic solution that has been found for a problem. “What
is the best that we know we can do?”
Running times of different algorithms on a 1GHz (10⁹ clock cycles per second) computer are
indicated in the following table. The table assumes one operation per clock cycle, which is
optimistic (operations usually take more than one clock cycle).
1.4.1.1. Big-O Theorems
For all the following theorems, assume that f(n) is a function of n and that k is an arbitrary
constant.
Theorem 1: k is O(1)
Theorem 2: A polynomial is O(the term containing the highest power of n).
Polynomial’s growth rate is determined by the leading term
• If f(n) is a polynomial of degree d, then f(n) is O(nd)
In general, f(n) is big-O of the dominant term of f(n).
Theorem 3: k*f(n) is O(f(n))
Constant factors may be ignored
E.g. f(n) =7n4+3n2+5n+1000 is O(n4)
Formal Definition: A function f(n) = Ω(g(n)) if there exist constants c, k ∊ ℛ+ such that
f(n) >= c.g(n) for all n >= k.
f(n) = Ω(g(n)) means that f(n) is greater than or equal to some constant multiple of g(n) for all
values of n greater than or equal to some k.
In simple terms, f(n) = Ω(g(n)) means that the growth rate of f(n) is greater than or equal to
that of g(n).
Fig Big Omega growth
Formal Definition: A function f(n) is Θ(g(n)) if it is both O(g(n)) and Ω(g(n)). In other words,
there exist constants c1, c2, and k > 0 such that
c1.g(n) <= f(n) <= c2.g(n) for all n >= k
If f(n) = Θ(g(n)), then g(n) is an asymptotically tight bound for f(n).
In simple terms, f(n) = Θ(g(n)) means that f(n) and g(n) have the same rate of growth.
Example:
1. If f(n) = 2n+1, then f(n) = Θ(n)
2. If f(n) = 2n², then
f(n) = O(n⁴)
f(n) = O(n³)
f(n) = O(n²)
Fig Theta growth
All these are technically correct, but the last expression is the best and tightest one. Since 2n²
and n² have the same growth rate, it can be written as f(n) = Θ(n²).
Example:
1. Show that f(n) = 10n³ + 5n² + 17 is Θ(n³)
10n³ <= f(n) <= 10n³ + 5n³ + 17n³
10n³ <= f(n) <= 32n³
f(n) = Θ(n³)    (c1=10, c2=32, k=1)
2. Show that f(n) = 5n log n + 10n is Θ(n log n)
5n log n <= f(n) <= 5n log n + 10n log n
5n log n <= f(n) <= 15n log n
f(n) = Θ(n log n)    (c1=5, c2=15, k=2)
Exercise 2
Show that f(n) = 10n³ − 5n² is Θ(n³)
f(n)=o(g(n)) means for all c>0 there exists some k>0 such that f(n)<c.g(n) for all n>=k.
Informally, f(n)=o(g(n)) means f(n) becomes insignificant relative to g(n) as n approaches
infinity.
Little-omega (ω) notation is to big-omega (Ω) notation as little-o notation is to Big-Oh notation.
We use ω notation to denote a lower bound that is not asymptotically tight.
Formal Definition: f(n) = ω(g(n)) if for all constants c > 0 there exists a constant k > 0 such
that 0 <= c.g(n) < f(n) for all n >= k.
Example: 2n² = ω(n), but 2n² is not ω(n²).
Transitivity
• if f(n) = Θ(g(n)) and g(n) = Θ(h(n)) then f(n) = Θ(h(n)),
• if f(n) = O(g(n)) and g(n) = O(h(n)) then f(n) = O(h(n)),
• if f(n) = Ω(g(n)) and g(n) = Ω(h(n)) then f(n) = Ω (h(n)),
• if f(n) = o(g(n)) and g(n) = o(h(n)) then f(n) = o(h(n)), and
• if f(n) = ω (g(n)) and g(n) = ω(h(n)) then f(n) = ω (h(n)).
Symmetry
• f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n)).
Transpose symmetry
• f(n) = O(g(n)) if and only if g(n) = Ω(f(n)),
• f(n) = o(g(n)) if and only if g(n) = ω(f(n)).
Reflexivity
• f(n) = Θ(f(n)),
• f(n) = O(f(n)),
• f(n) = Ω(f(n)).
Function      Name
c             Constant
log n         Logarithmic
log²n         Log-squared
n             Linear
n log n       Linear-log
n²            Quadratic
n³            Cubic
nᵏ            Polynomial
2ⁿ            Exponential
n!            Factorial
Fig algorithm growth rate comparison
Exercise 3
Show that:
1. ∑ (i=1 to n) ∑ (j=1 to n) 2 = ∑ (i=1 to n) 2n = 2n²
2. ∑ (i=1 to n) ∑ (j=1 to i) 1 = ∑ (i=1 to n) i = n(n + 1)/2
3. ∑ (i=1 to n) ∑ (j=i to n) 1 = ∑ (i=1 to n) (n − i + 1) = n(n + 1)/2
4. ∑ (i=1 to log₂n) ∑ (j=1 to n−1) 1 = ∑ (i=1 to log₂n) (n − 1) = (n − 1) log₂n
5. ∑ (k=1 to n/5) 1 = n/5
6. ∑ (k=n/5 to n/3) 1 = 2n/15
7. ∑ (i=1 to n/5) ∑ (k=n/5 to n/3) 1 = ∑ (i=1 to n/5) 2n/15 = 2n²/75
8. ∑ (i=1 to n/5) ∑ (j=1 to n/3) ∑ (k=n/2 to …) 1 = …
Chapter 2
2.1. Searching
Searching is a process of looking for a specific element in a list of items or determining that the
item is not in the list. There are two simple searching algorithms:
• Sequential Search, and
• Binary Search
Pseudocode
Loop through the array starting at the first element until the value of target matches one of the
array elements.
Time is proportional to the size of input (n) and we call this time complexity O(n).
Example Implementation:
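The implementation figure is missing from this copy. A minimal sketch matching the pseudocode above (the function name and the -1 "not found" convention are our assumptions):

```cpp
#include <cassert>

// Sequential search: return the index of target in list[0..n-1],
// or -1 if target is not in the list.
int sequential_search(const int list[], int n, int target) {
    for (int i = 0; i < n; i++) {
        if (list[i] == target)
            return i;   // match found: stop immediately
    }
    return -1;          // reached the end without a match
}
```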
The computational time for this algorithm is proportional to log₂n. Therefore the time
complexity is O(log n).
Find 22 in the following array.
Example Implementation:
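This implementation figure is also missing. A standard iterative binary search over a sorted array (again a sketch, with our own naming) would be:

```cpp
#include <cassert>

// Binary search: list[0..n-1] must be sorted in increasing order.
// Each iteration halves the search range, hence O(log n) time.
// Returns the index of target, or -1 if it is not present.
int binary_search(const int list[], int n, int target) {
    int low = 0, high = n - 1;
    while (low <= high) {
        int mid = low + (high - low) / 2;  // safer than (low+high)/2
        if (list[mid] == target)
            return mid;
        else if (list[mid] < target)
            low = mid + 1;    // discard the left half
        else
            high = mid - 1;   // discard the right half
    }
    return -1;
}
```

On a hypothetical sorted array 2 7 13 22 35 41, searching for 22 probes index 2 (value 13), then index 4 (value 35), then index 3, where 22 is found.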
Sorting is one of the most important operations performed by computers. Sorting is a process of
reordering a list of items in either increasing or decreasing order. The following are simple
sorting algorithms used to sort small-sized lists.
• Insertion Sort
• Selection Sort
• Bubble Sort
The insertion sort works just like its name suggests - it inserts each item into its proper place in
the final list. The simplest implementation of this requires two list structures - the source list
and the list into which sorted items are inserted. To save memory, most implementations use an
in-place sort that works by moving the current item past the already sorted items and repeatedly
swapping it with the preceding item until it is in place.
It's the most instinctive type of sorting algorithm. The approach is the same approach that you
use for sorting a set of cards in your hand. While playing cards, you pick up a card, start at the
beginning of your hand and find the place to insert the new card, insert it and move all the
others up one place.
Basic Idea:
Find the location for an element and move all others up, and insert the element.
The process involved in insertion sort is as follows:
1. The left most value can be said to be sorted relative to itself. Thus, we don’t need to do
anything.
2. Check to see if the second value is smaller than the first one. If it is, swap these two
values. The first two values are now relatively sorted.
3. Next, we need to insert the third value into the relatively sorted portion so that after
insertion, the portion will still be relatively sorted.
4. Remove the third value first. Slide the second value to make room for insertion. Insert
the value in the appropriate position.
5. Now the first three are relatively sorted.
6. Do the same for the remaining items in the list.
Example trace on the list 8 34 32 51 64 21 (the sorted portion grows from the left):
8 34 32 51 64 21
8 32 34 51 64 21
…
after the final pass: 8 21 32 34 51 64
Implementation
void insertion_sort(int list[])
{
int temp;
for(int i = 1; i < n; i++)
{
temp = list[i];
for(int j = i; j > 0 && temp < list[j-1]; j--) //work backwards to find where temp should go
{
list[j] = list[j-1];
list[j-1] = temp;
}
}
}
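The version above relies on a global n, as elsewhere in these notes. A self-contained variant that takes the size as a parameter (a sketch, not the notes' own code) can be tried out as follows:

```cpp
#include <cassert>

// Insertion sort with the list size passed in explicitly.
void insertion_sort_n(int list[], int n) {
    for (int i = 1; i < n; i++) {
        int temp = list[i];
        int j = i;
        while (j > 0 && temp < list[j - 1]) {
            list[j] = list[j - 1];   // shift larger elements one place up
            j--;
        }
        list[j] = temp;              // drop temp into its proper place
    }
}
```

Sorting the list 8 34 32 51 64 21 with this variant yields 8 21 32 34 51 64, matching the trace above.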
Analysis
How many comparisons?
1+2+3+…+(n-1)= O(n2)
How many swaps?
1+2+3+…+(n-1)= O(n2)
How much space?
In-place algorithm
Selection sort repeatedly selects the smallest remaining element and swaps it into the next
position of the sorted portion. Example, after pass 5: 8 21 32 34 51 64
Implementation:
void selection_sort(int list[])
{
    int i, j, smallest, temp;
    for(i = 0; i < n; i++)
    {
        smallest = i;
        for(j = i+1; j < n; j++)
        {
            if(list[j] < list[smallest])
                smallest = j;
        }
        temp = list[smallest];   // swap the smallest with the first unsorted element
        list[smallest] = list[i];
        list[i] = temp;
    }
}
Analysis
How many comparisons?
(n-1)+(n-2)+…+1= O(n2)
How many swaps?
n=O(n)
How much space?
In-place algorithm
Bubble sort is the simplest algorithm to implement and the slowest algorithm on very large
inputs.
Basic Idea:
• Loop through array from i=0 to n and swap adjacent elements if they are out of order.
Example trace on the list 8 34 21 64 51 32:
after i=1: 8 34 21 64 32 51
…
Implementation:
void bubble_sort(int list[])
{
    int i, j, temp;
    for(i = 0; i < n; i++)
    {
        for(j = n-1; j > i; j--)
        {
            if(list[j] < list[j-1])
            {
                temp = list[j];        // swap adjacent elements that are out of order
                list[j] = list[j-1];
                list[j-1] = temp;
            }
        }
    }
}
General Comments
Each of these algorithms requires n-1 passes: each pass places one item in its correct place. The
ith pass makes either i or n - i comparisons and moves. So the total work is proportional to
(n-1)+(n-2)+…+1, or O(n²). Thus these algorithms are only suitable for small problems where
their simple code makes them faster than the more complex code of the O(n log n) algorithms. As
a rule of thumb, expect to find an O(n log n) algorithm faster for n > 10 - but the exact value
depends very much on individual machines!
Empirically it’s known that Insertion sort is over twice as fast as the bubble sort and is just as
easy to implement as the selection sort. In short, there really isn't any reason to use the selection
sort - use the insertion sort instead.
If you really want to use the selection sort for some reason, try to avoid sorting lists of more
than 1000 items with it, or repetitively sorting lists of more than a couple hundred items.
Chapter 3
3.1. Structures
Structures are aggregate data types built using elements of primitive data types. Structures are
defined using the struct keyword:
E.g. struct Time{
int hour;
int minute;
int second;
};
The struct keyword creates a new user-defined data type that is used to declare variables of an
aggregate data type. Structure variables are declared like variables of other types.
Syntax: struct <structure tag> <variable name>;
E.g. struct Time timeObject;
struct Time *timeptr;
Members of a structure can be accessed through a pointer as (*timeptr).hour; the parentheses
are required since (*) has lower precedence than (.).
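A small sketch (reusing the Time structure from above, with a helper function of our own) of the two equivalent ways to reach a member through a pointer:

```cpp
#include <cassert>

struct Time {
    int hour;
    int minute;
    int second;
};

// Reads the hour member through a pointer in both notations.
int hour_of(struct Time *timeptr) {
    int h1 = (*timeptr).hour;  // parentheses required: * binds less tightly than .
    int h2 = timeptr->hour;    // arrow notation, the usual shorthand
    assert(h1 == h2);          // both refer to the same member
    return h1;
}
```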
An array may seem natural for storing a list, but it has problems.
Deletion problem - suppose you want to delete 7 from the following array. This requires moving
all array elements found after 7 one step down to lower array index.
Insertion problem - suppose we need to insert 79 near the front of the array. This requires
moving all array elements one index up to create room for the new item, 79.
Memory usage problem – shortage or wastage of memory. Suppose this array is for storing a
list:
int myArray[200];
When the program runs, if there are only a few list items, then we have been wasteful with
memory allocation. If the list has 201 or more items, then our program cannot accommodate the
list. Instead of compile-time memory allocation, run-time memory allocation might be better.
An array uses compile-time memory allocation, unlike a linked list, which uses run-time memory
allocation.
A linked list is a data structure that is built from structures and pointers. It forms a chain of
"nodes" with pointers representing the links of the chain and holding the entire thing together. A
linked list can be represented by a diagram like this one:
This linked list has four nodes in it, each with a link to the next node in the series. The last node
has a link to the special value NULL, which any pointer (whatever its type) can point to, to
show that it is the last link in the chain. There is also another special pointer, called Start (also
called head), which points to the first link in the chain so that we can keep track of it.
The key part of a linked list is a structure, which holds the data for each node (the name,
address, age or whatever for the items in the list), and, most importantly, a pointer to the next
node. Here we have given the structure of a typical node:
struct Person
{
    char name[20];   // name of up to 20 letters
    int age;
    float height;    // in metres
    Person *next;    // pointer to next node
};
struct Person *start_ptr = NULL;
The important part of the structure is the line before the closing curly brackets. This gives a
pointer to the next node in the list. This is the only case in C++ where you are allowed to refer
to a data type (in this case Person) before you have even finished defining it!
We have also declared a pointer called start_ptr that will permanently point to the start of the
list. To start with, there are no nodes in the list, which is why start_ptr is set to NULL.
Firstly, we declare the space for a new node and assign a temporary pointer to it. This is done
using the new statement as follows:
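The snippet itself appears to have been lost in this copy; judging from the add_node_at_end() function later in this section, it is presumably the two commented lines inside this small helper (the wrapper function is ours, added only so the fragment is self-contained):

```cpp
#include <cstddef>

struct Person {
    char name[20];
    int age;
    float height;
    Person *next;
};

Person *new_node() {
    Person *temp;        // a temporary pointer
    temp = new Person;   // reserve space for a new node on the heap
    return temp;
}
```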
We can refer to the new node as *temp, i.e. "the node that temp points to". When the fields of
this structure are referred to, brackets can be put round the *temp part, as otherwise the compiler
will think we are trying to refer to the fields of the pointer. Alternatively, we can use the arrow
pointer notation.
Having declared the node, we ask the user to fill in the details of the person, i.e. the name, age,
address or whatever:
cout << "Please enter the name of the person: ";
cin >> temp->name;
cout << "Please enter the age of the person : ";
cin >> temp->age;
cout << "Please enter the height of the person : ";
cin >> temp->height;
temp->next = NULL;
The last line sets the pointer from this node to the next to NULL, indicating that this node, when
it is inserted in the list, will be the last node. Having set up the information, we have to decide
what to do with the pointers. Of course, if the list is empty to start with, there's no problem - just
set the Start pointer to point to this node (i.e. set it to the same value as temp):
if (start_ptr == NULL)
    start_ptr = temp;
Otherwise, if the list is not empty, we work our way along it to find the last node:
temp2 = start_ptr;
// We know this is not NULL - list not empty!
while (temp2->next != NULL)
{
    temp2 = temp2->next;   // Move to next link in chain
}
The loop will terminate when temp2 points to the last node in the chain, and it knows when this
happened because the next pointer in that node will point to NULL. When it has found it, it sets
the pointer from that last node to point to the node we have just declared:
temp2->next = temp;
The link temp2->next in this diagram is the link joining the last two nodes. The full code for
adding a node at the end of the list is shown below, in its own little function:
void add_node_at_end ()
{
    Person *temp, *temp2;   // Temporary pointers
    // Reserve space for new node and fill it with data
    temp = new Person;
    cout << "Please enter the name of the person: ";
    cin >> temp->name;
    cout << "Please enter the age of the person : ";
    cin >> temp->age;
    cout << "Please enter the height of the person : ";
    cin >> temp->height;
    temp->next = NULL;
    // Set up the link to this node
    if (start_ptr == NULL)
        start_ptr = temp;
    else
    {
        temp2 = start_ptr;
        while (temp2->next != NULL)
            temp2 = temp2->next;
        temp2->next = temp;
    }
}
Remember that start_ptr (head) should point to the first node in the linked list. Hence, when you
add a node at the beginning, you have to reposition head to the new node. What do you think
would happen if start_ptr does not point to the first node in the list?
void add_in_middle()
{
    Person *trav;
    trav = head;
    while(trav != NULL)
    {
        if(strcmp(trav->name, "Sue")==0)
            break;
        trav = trav->next;
    }
    if(trav != NULL)   // searched node found
    {
        // temp points to a new node, created and filled in as before
        temp->next = trav->next;
        trav->next = temp;
    }
}
Caution
There is a possibility that you may lose the right half of the list if you are not careful while
adding to the middle. This happens if you don't first make temp point to the node after the node
with the value "Sue".
Having added one or more nodes, we need to display the list of nodes on the screen. This is
comparatively easy to do. Here is the method:
1. Set a temporary pointer to point to the same thing as the start pointer.
2. If the pointer points to NULL, display the message "End of list" and stop.
3. Otherwise, display the details of the node pointed to by the temporary pointer.
4. Make the temporary pointer point to the same thing as the next pointer of the node it is
currently pointing to.
5. Jump back to step 2.
The temporary pointer moves along the list, displaying the details of the nodes it comes across.
At each stage, it can get hold of the next node in the list by using the next pointer of the node it
is currently pointing to. Here is the C++ code that does the job:
void display()
{
    temp = start_ptr;
    do
    {
        if (temp == NULL)
            cout << "End of list" << endl;
        else
        {
            // Display details for what temp points to
            cout << "Name : " << temp->name << endl;
            cout << "Age : " << temp->age << endl;
            cout << "Height : " << temp->height << endl;
            cout << endl;   // blank line
            // Move to the next node (if one exists)
            temp = temp->next;
        }
    }
    while (temp != NULL);
}
Check through this code, matching it to the method listed above. It helps if you draw a diagram
on paper of a linked list and work through the code using the diagram.
One thing you may need to do is to navigate through the list, with a pointer that moves
backwards and forwards through the list, like an index pointer in an array. This is certainly
necessary when you want to insert or delete a node from somewhere inside the list, as you will
need to specify the position.
We will call the mobile pointer current. First of all, it is declared, and set to the same value as
the start_ptr pointer:
Person *current;
current = start_ptr;
Notice that you don't need to set current equal to the address of the start pointer, as they are
both pointers. The statement above makes them both point to the same thing:
It's easy to get the current pointer to point to the next node in the list i.e. move from left to right
along the list. If you want to move current along one node, use the next field of the node that it
is pointing to at the moment:
current = current->next;
In fact, we had better check that it isn't pointing to the last item in the list. If it is, then there is
no next node to move to:
if (current->next == NULL)
cout << "You are at the end of the list." << endl;
else
current = current->next;
Moving the current pointer back one step is a little harder. This is because we have no way of
moving back a step automatically from the current node. The only way to find the node before
the current one is to start at the beginning, work our way through and stop when we find the
node before the one we are considering at the moment. We can tell when this happens, as the
next pointer from that node will point to exactly the same place in memory as the current pointer
(i.e. the current node).
Fig Finding the node before the current one
First of all, we had better check to see if the current node is also the first one. If it is, then there
is no "previous" node to point to. If not, check through all the nodes in turn until we detect that
we are just behind the current one:
if (current == start_ptr)
cout << "You are at the start of the list" << endl;
else
{
Person *previous;   // Declare the pointer
previous = start_ptr;
while (previous->next != current)
    previous = previous->next;   // move along until just behind current
current = previous;   // step back: current now points to the previous node
}
The else clause translates as follows: declare a temporary pointer (for use in this else clause
only). Set it equal to the start pointer. All the time that it is not pointing to the node before the
current node, move it along the line. Once the previous node has been found, the current pointer
is set to that node - i.e. it moves back along the list.
Now that you have the facility to move back and forth, you need to do something with it.
Firstly, let's see if we can alter/modify the details for that particular node in the list:
cout << "Please enter the new name of the person: ";
cin >> current->name;
cout << "Please enter the new age of the person : ";
cin >> current->age;
cout << "Please enter the new height of the person : ";
cin >> current->height;
The next easiest thing to do is to delete a node from the list directly after the current position.
We have to use a temporary pointer to point to the node to be deleted. Once this node has been
"anchored", the pointers to the remaining nodes can be readjusted before the node on death row
is deleted. Here is the sequence of actions:
1. Firstly, the temporary pointer is assigned to the node after the current one. This is the
node to be deleted:
2. Now the pointer from the current node is made to leap-frog the next node and point to
the one after that:
Here is the code for deleting the node. It includes a test at the start to test whether the current
node is the last one in the list:
if (current->next == NULL)
cout << "There is no node after current" << endl;
else
{
Person *temp;
temp = current->next;
current->next = temp->next; // could be NULL
delete temp;
}
Here is the code to add a node after the current one. This is done similarly, but we haven't
illustrated it with diagrams:
if (current->next == NULL)
add_node_at_end();
else
{
Person *temp = new Person;
get_details(temp);
//Make the new node point to the same thing as the current node
temp->next = current->next;
//Make the current node point to the new link in the chain
current->next = temp;
}
We have assumed that the function add_node_at_end() is the routine for adding the node to the
end of the list that we created near the top of this section. This routine is called if the current
pointer is the last one in the list, so the new one would be added on to the end.
Similarly, the routine get_details(temp) is a routine that reads in the details for the new node,
similar to the one defined just above.
When a node is deleted, the space that it took up should be reclaimed. Otherwise the computer
will eventually run out of memory space. This is done with the delete instruction. To delete the
first node, we first tag it with a temporary pointer (so that we can refer to it even when the start
pointer has been reassigned), then move the start pointer to the next node in the chain and delete
the tagged node:
{
    Person *temp;
    temp = start_ptr;
    start_ptr = start_ptr->next;   // second node in chain
    delete temp;
}
Deleting a node from the end of the list is harder, as the temporary pointer must find where the
end of the list is by hopping along from the start. This is done using code that is almost identical
to that used to insert a node at the end of the list. It is necessary to maintain two temporary
pointers, temp1 and temp2. The pointer temp1 will point to the last node in the list and temp2 will
point to the previous node. We have to keep track of both as it is necessary to delete the last
node and immediately afterwards, to set the next pointer of the previous node to NULL (it is
now the new last node).
1. Look at the start pointer. If it is NULL, then the list is empty, so print out a "No nodes to
delete" message.
2. Make temp1 point to whatever the start pointer iis pointing to.
3. If the next pointer of what temp1 indicates is NULL, then we've found the last node of
the list, so jump to step 7.
4. Make another pointer, temp2, point to the current node in the list.
5. Make temp1 point to the next item in the list.
6. Go to step 3.
7. If you get this far, then the temporary pointer, temp1, should point to the last item in the
list and the other temporary pointer, temp2, should point to the last-but-one
one item.
8. Delete the node pointed to by temp1.
9. Mark the next pointer of the node pointed to by temp2 as NULL - it is the new last node.
Let's try it with a rough drawing. This is always a good idea when you are trying to understand
an abstract data type. Suppose we want to delete the last node from this list:
Firstly, the start pointer doesn't point to NULL, so we don't have to display an "Empty list, wise
guy!" message. Let's get straight on with step 2 - set the pointer temp1 to the same as the start
pointer:
The next pointer from this node isn't NULL, so we haven't found the end node. Instead, we set
the pointer temp2 to the same node as temp1.
Going back to step 3, we see that temp1 still doesn't point to the last node in the list, so we
make temp2 point to what temp1 points to.
Eventually, this goes on until temp1 really is pointing to the last node in the list, with temp2
pointing to the penultimate node:
Now we have reached step 8. The next thing to do is to delete the node pointed to by temp1.
We suppose you want some code for all that! All right then ....
void delete_end_node()
{
Person *temp1, *temp2;
if (start_ptr == NULL)
cout << "The list is empty!" << endl;
else
{
temp1 = start_ptr;
while (temp1->next != NULL)
{
temp2 = temp1;
temp1 = temp1->next;
}
delete temp1;
temp2->next = NULL;
}
}
The code seems a lot shorter than the explanation!
Now, the sharp-witted amongst you will have spotted a problem. If the list only contains one
node, the code above will malfunction. This is because the function goes as far as the
temp1=start_ptr statement, but never gets as far as setting up temp2. The code above has to be
adapted so that if the first node is also the last (has a NULL next pointer), then it is deleted and
the start_ptr pointer is assigned to NULL. In this case, there is no need for the pointer temp2:
void delete_end_node()
{
Person *temp1, *temp2;
if (start_ptr == NULL)
cout << "The list is empty!" << endl;
else {
temp1 = start_ptr;
if (temp1->next == NULL) // This part is new!
{
delete temp1;
start_ptr = NULL;
}
else {
while (temp1->next != NULL)
{
temp2 = temp1;
temp1 = temp1->next;
}
delete temp1;
temp2->next = NULL;
}
}
}
That sounds even harder than a linked list! Well, if you've mastered how to do singly linked
lists, then it shouldn't be much of a leap to doubly linked lists.
A doubly linked list is one where there are links from each node in both directions:
You will notice that each node in the list has two pointers, one to the next node and one to the
previous one - again, the ends of the list are defined by NULL pointers. Also there is no pointer
to the start of the list. Instead, there is simply a pointer to some position in the list that can be
moved left or right.
The reason we needed a start pointer in the ordinary linked list is because, having moved on
from one node to another, we can't easily move back, so without the start pointer, we would lose
track of all the nodes in the list that we have already passed. With the doubly linked list, we can
move the current pointer backwards and forwards at will.
struct Person
{
char name[20]; // the array size is an assumption; the original declaration was lost
Person *next; // pointer to the next node in the list
Person *prev; // pointer to the previous node in the list
};
Person *current;
current = new Person;
strcpy(current->name, "Fred");
current->next = NULL;
current->prev = NULL;
We have also included some code to declare the first node and set its pointers to NULL. It gives
the following situation:
We still need to consider the directions 'forward' and 'backward', so in this case, we will need to
define functions to add a node to the start of the list (left-most position) and the end of the list
(right-most position).
// Declare a new node and link it in
Person *temp2;
temp2 = new Person;
strcpy(temp2->name, new_name); // store new name in the node
temp2->prev = NULL; // This is the new start of the list
temp2->next = temp; // Links to current list
temp->prev = temp2;
}
void add_node_at_end ()
{
// Declare a temporary pointer and move it to the end
Person *temp = current;
while (temp->next != NULL)
temp = temp->next;
Here, the new name is passed to the appropriate function as a parameter. We'll go through the
function for adding a node to the right-most end of the list. The method is similar for adding a
node at the other end. Firstly, a temporary pointer is set up and is made to march along the list
until it points to the last node in the list.
After that, a new node is declared, and the name is copied into it. The next pointer of this new
node is set to NULL to indicate that this node will be the new end of the list. The prev pointer of
the new node is linked into the last node of the existing list. The next pointer of the current end
of the list is set to the new node.
singly linked lists as well as doubly linked lists. This clearly affects some of the tests, but the
structure is popular in some applications.
Chapter 4
Stacks
A simple data structure, in which insertion and deletion occur at the same end, is called a
stack. It is a LIFO (Last In First Out) structure.
The operations of insertion and deletion are called PUSH and POP.
Push - push (put) item onto stack
Pop - pop (get) item from stack
Initial Stack      Push(8)         Pop

                   TOS=> 8
TOS=> 4                  4         TOS=> 4
      1                  1               1
      3                  3               3
      6                  6               6
Our Purpose:
To develop a stack implementation that does not tie us to a particular data type or to a particular
implementation.
Implementation:
Stacks can be implemented as:
• an array (contiguous list) and
• a linked list.
We want a set of operations that will work with either type of implementation: i.e. the method of
implementation is hidden and can be changed without affecting the programs that use them.
Push()
{
if there is room {
put an item on the top of the stack
}
else {
give an error message
}
}
Pop()
{
if stack not empty {
return the value of the top item
remove the top item from the stack
}
else {
give an error message
}
}
CreateStack()
{
remove existing items from the stack
initialise the stack to empty
}
Algorithm:
Step-1: Increment the Stack TOP by 1. Check whether it is less than the upper limit of
the stack. If it is less than the upper limit go to step-2, else report "Stack Overflow"
Step-2: Put the new element at the position pointed to by the TOP
Implementation:
static int stack[UPPERLIMIT];
int top = -1;
..
void push(int item)
{
if(top < UPPERLIMIT-1)
{
top = top + 1; /*step-1*/
stack[top] = item; /*step-2*/
}
else
cout<<"Stack Overflow";
}
Note: In array implementation, we have taken TOP = -1 to signify the empty stack, as this
simplifies the implementation.
POP is the synonym for delete when it comes to a Stack. So, if you're using an array as the stack,
remember that you'll return an error message, "Stack underflow", if an attempt is made to pop
an item from an empty Stack.
Algorithm
Step-1: If the Stack is empty then give the alert "Stack underflow" and quit; or else go to step-2
Step-2: a) Hold the value for the element pointed by the TOP
b) Put a NULL value instead
c) Decrement the TOP by 1
Implementation:
static int stack[UPPERLIMIT];
int top=-1;
..
..
main()
{
..
..
popped_val = pop();
..
..
}
int pop()
{
int del_val = 0;
if(top == -1)
cout<<"Stack underflow"; /*step-1*/
else
{
del_val = stack[top]; /*step-2*/
stack[top] = NULL;
top = top -1;
}
return(del_val);
}
Note: - Step-2:(b) signifies that the respective element has been deleted.
It’s very similar to the insertion operation in a dynamic singly linked list. The only difference is
that here you'll add the new element only at the end of the list, which means addition can
happen only from the TOP. Since a dynamic list is used for the stack, the Stack is also dynamic,
meaning it has no preset upper limit. So, we don't have to check for the Overflow condition at
all!
Algorithm
Step-1: If the Stack is empty go to step-2, or else go to step-3
Step-2: Create the new element, make "stack" and "top" point to it, and quit
Step-3: Create the new element and make the TOP most element point to it
Step-4: Make the new element your TOP most element
Implementation:
struct node{
int item;
struct node *next;
};
struct node *stack = NULL; /*stack is initially empty*/
struct node *top = stack;
main()
{
..
..
push(item);
..
}
push(int item)
{
struct node *newnode;
if(stack == NULL) /*step-1*/
{
newnode = new node; /*step-2*/
newnode -> item = item;
newnode -> next = NULL;
stack = newnode;
top = stack;
}
else
{
newnode = new node; /*step-3*/
newnode -> item = item;
newnode -> next = NULL;
top ->next = newnode;
top = newnode; /*step-4*/
}
}
Suppose you have only one element left in the Stack. Then we won't make use of "target";
rather, we'll take the help of our "bottom" pointer. See how...
Algorithm:
Step-1: If the Stack is empty then give an alert message "Stack Underflow" and quit; or else
proceed
Step-2: If there is only one element left go to step-3 or else step-4
Step-3: Free that element and make the "stack", "top" and "bottom" pointers point to NULL and
quit
Step-4: Make "target" point to just one element before the TOP; free the TOP most element;
make "target" as your TOP most element
Implementation:
struct node
{
int nodeval;
struct node *next;
};
struct node *stack = NULL; /*stack is initially empty*/
struct node *top = stack;
struct node *bottom = stack;
main()
{
int newvalue, delval;
..
push(newvalue);
..
delval = pop(); /*POP returns the deleted value from the stack*/
}
int pop( )
{
int pop_val = 0;
struct node *target = stack;
if(stack == NULL) /*step-1*/
cout<<"Stack Underflow";
else
{
if(top == bottom) /*step-2*/
{
pop_val = top -> nodeval; /*step-3*/
delete top;
stack = NULL;
top = bottom = stack;
}
else /*step-4*/
{
while(target->next != top)
target = target->next;
pop_val = top->nodeval;
delete top;
top = target;
top->next = NULL;
}
}
return(pop_val);
}
Question:
Can we develop a method of evaluating arithmetic expressions without having to ‘look
ahead’ or ‘look back’? Consider the quadratic formula:
x = (-b+(b^2-4*a*c)^0.5)/(2*a)
In its current form we cannot solve the formula without considering the ordering of the
parentheses, i.e. we solve the innermost parentheses first and then work outwards, also
considering operator precedence. Although we do this naturally, developing an algorithm to do
the same is possible but complex and inefficient. Instead...
Computers solve arithmetic expressions by restructuring them so the order of each calculation is
embedded in the expression. Once converted an expression can then be solved in one pass.
Types of Expression
The normal (or human) way of expressing mathematical expressions is called infix form, e.g.
4+5*5. However, there are other ways of representing the same expression, either by writing all
operators before their operands or after them,
e.g. 4 5 5 * + or + 4 * 5 5
This method is called Polish Notation because it was discovered by the Polish
mathematician Jan Lukasiewicz.
When the operators are written before their operands, it is called the prefix form
e.g. + 4 * 5 5
When the operators come after their operands, it is called postfix form (suffix form or reverse
polish notation)
e.g. 4 5 5 * +
For now, consider postfix notation as a way of redistributing operators in an expression so that
their operation is delayed until the correct time.
Notice that the order of the operands remains the same, but the operators are redistributed in a
non-obvious way (an algorithm to convert infix to postfix can be derived).
Purpose
The reason for using postfix notation is that a fairly simple algorithm exists to evaluate such
expressions based on using a stack.
Postfix Evaluation
Consider the postfix expression:
6 5 2 3 + 8 * + 3 + *
Algorithm
initialise stack to empty;
while (not end of postfix expression)
{
get next postfix item;
if(item is value)
push it onto the stack;
else if(item is binary operator)
{
pop the stack to x;
pop the stack to y;
perform y operator x;
push the results onto the stack;
}
else if (item is unary operator)
{
pop the stack to x;
perform operator(x);
push the results onto the stack
}
}
Unary operators: unary minus, square root, sin, cos, exp, etc.,
So for 6 5 2 3 + 8 * + 3 + *
First, the values 6, 5, 2 and 3 are read and pushed onto the stack:
TOS=> 3
2
5
6
Next a '+' is read (a binary operator), so 3 and 2 are popped from the stack and their sum
5 is pushed onto the stack:
TOS=> 5
5
6
Next an '8' is pushed onto the stack:

TOS=> 8
      5
      5
      6

Then '*' is read, so 8 and 5 are popped and their product 40 is pushed:

TOS=> 40
      5
      6

Then '+' is read, so 40 and 5 are popped and 45 is pushed, after which 3 is pushed:

TOS=> 3
      45
      6

Then '+' is read, so 3 and 45 are popped and 48 is pushed:

TOS=> 48
      6

Finally '*' is read, so 48 and 6 are popped and 288 is pushed:

TOS=> 288
Now there are no more items and there is a single value on the stack, representing the final
answer 288.
Note the answer was found with a single traversal of the postfix expression, with the stack being
used as a kind of memory storing values that are waiting for their operands.
Infix to Postfix Conversion
Algorithm
initialize stack and postfix output to empty;
while(not end of infix expression)
{
get next infix item
if(item is value)
append item to postfix output
else if(item == ‘(‘)
push item onto stack
else if(item == ‘)’)
{
pop stack to x
while(x != ‘(‘)
append x to postfix output & pop stack to x
}
else {
while(precedence(stack top) >= precedence(item))
pop stack to x & append x to postfix output
push item onto stack
}
}
while(stack not empty)
pop stack to x and append x to postfix output
The algorithm immediately passes values (operands) to the postfix expression, but remembers
(saves) operators on the stack until their right-hand operands are fully translated.
Example: convert the infix expression a+b*c+(d*e+f)*g to postfix form.

Token   Stack (TOS first)   Postfix output
a       (empty)             a
+       +                   a
b       +                   ab
*       * +                 ab
c       * +                 abc
+       +                   abc*+
(       ( +                 abc*+
d       ( +                 abc*+d
*       * ( +               abc*+d
e       * ( +               abc*+de
+       + ( +               abc*+de*
f       + ( +               abc*+de*f
)       +                   abc*+de*f+
*       * +                 abc*+de*f+
g       * +                 abc*+de*f+g
(end)   (empty)             abc*+de*f+g*+
If these arguments are stored in a fixed memory area then the function cannot be called
recursively since the 1st return address would be overwritten by the 2nd return address before
the first was used:
10 call function abc(); /* retadrs = 11 */
11 continue;
...
90 function abc;
91 code;
92 if (expression)
93 call function abc(); /* retadrs = 94 */
94 code
95 return /* to retadrs */
A stack allows a new instance of retadrs for each call to the function. Recursive calls on the
function are limited only by the extent of the stack.
10 call function abc(); /* retadrs1 = 11 */
11 continue;
...
90 function abc;
91 code;
92 if (expression)
93 call function abc(); /* retadrs2 = 94 */
94 code
95 return /* to the most recently saved retadrs */
The C++ run-time system keeps track of the chain of active functions with a stack. When a
function is called, the run-time system pushes on the stack a frame containing:
• Local variables and return value
• Program counter, keeping track of the statement being executed
When a function returns, its frame is popped from the stack and control is passed to the function
on top of the stack.
main()
{
int i = 5;
foo(i);
int x = i+2;
}
foo(int j)
{
int k;
k = j+1;
bar(k);
k++;
}
bar(int m)
{
…
}
Chapter 5
Queue
A queue is a data structure that has access to its data at the front and rear.
• operates on a FIFO (First In First Out) basis.
• uses two pointers/indices to keep track of information/data.
• has two basic operations:
o enqueue - inserting data at the rear of the queue
o dequeue - removing data at the front of the queue
Front                            Rear
dequeue <-- | | | | | | <-- enqueue
Example:
Analysis:
Consider the following structure: int Num[MAX_SIZE];
We need to have two integer variables that tell:
- the index of the front element
- the index of the rear element
We also need an integer variable that tells:
- the total number of data in the queue
REAR < MAX_SIZE-1 ?
Yes: - Increment REAR
     - Store the data in Num[REAR]
     - Increment QUEUESIZE
     - FRONT == -1 ?
         Yes: - Increment FRONT
No: - Queue Overflow
Implementation:
const int MAX_SIZE=100;
int Num[MAX_SIZE];
int Front =-1, Rear =-1;
int QueueSize = 0;
void enqueue(int x)
{
if(Rear < MAX_SIZE-1)
{
Rear++;
Num[Rear]=x;
QueueSize++;
if(Front == -1)
Front++;
}
else
cout<<"Queue Overflow";
}
int dequeue()
{
int x;
if(QueueSize>0)
{
x=Num[Front];
Front++;
QueueSize--;
}
else
cout<<"Queue Underflow";
return(x);
}
A problem with simple arrays is that we run out of space even if the queue never reaches the size
of the array. Thus, simulated circular arrays (in which freed spaces are re-used to store data) can
be used to solve this problem.
The circular array implementation of a queue with MAX_SIZE can be simulated as follows:
(figure: the MAX_SIZE array slots, indexed 0 to MAX_SIZE-1, arranged in a ring so that slot 0
follows slot MAX_SIZE-1)
Analysis:
Consider the following structure: int Num[MAX_SIZE];
We need to have two integer variables that tell:
- the index of the front element
- the index of the rear element
We also need an integer variable that tells:
- the total number of data in the queue
int FRONT =-1,REAR =-1;
int QUEUESIZE=0;
Implementation:
const int MAX_SIZE=100;
int Num[MAX_SIZE];
int Front =-1, Rear =-1;
int QueueSize = 0;
void enqueue(int x)
{
if(QueueSize<MAX_SIZE)
{
Rear++;
if(Rear == MAX_SIZE)
Rear=0;
Num[Rear]=x;
QueueSize++;
if(Front == -1)
Front++;
}
else
cout<<"Queue Overflow";
}
int dequeue()
{
int x;
if(QueueSize>0)
{
x=Num[Front];
Front++;
if(Front == MAX_SIZE)
Front = 0;
QueueSize--;
}
else
cout<<"Queue Underflow";
return(x);
}
Example: Consider the following queue of persons, where females have higher priority
than males (gender is the key used to give priority).
Thus, in the above example the implementation of the dequeue operation needs to be
modified.
Example: The following two queues can be created from the above priority queue.
Aster Meron Abebe Alemu Belay Kedir Yonas
Female Female Male Male Male Male Male
Algorithm:
create empty females and males queue
while (PriorityQueue is not empty)
{
Data=DequeuePriorityQueue(); // delete data at the front
if(gender of Data is Female)
EnqueueFemale(Data);
else
EnqueueMale(Data);
}
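The demerging algorithm above can be sketched with std::queue. Representing a person as a (name, gender) pair is an assumption made for illustration; the original notes leave the queue element type abstract:

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <utility>

// A person is modeled here as a (name, gender) pair; gender is 'F' or 'M'.
typedef std::pair<std::string, char> PersonRec;

// Demerge the priority queue into a females queue and a males queue,
// following the algorithm above.
void demerge(std::queue<PersonRec> &priority,
             std::queue<PersonRec> &females,
             std::queue<PersonRec> &males)
{
    while (!priority.empty())
    {
        PersonRec data = priority.front();   // delete data at the front
        priority.pop();
        if (data.second == 'F')
            females.push(data);              // EnqueueFemale(Data)
        else
            males.push(data);                // EnqueueMale(Data)
    }
}
```

Because each queue preserves arrival order, the females and males come out in the same relative order they had in the original priority queue.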
Example: The following two queues (females queue has higher priority than the males
queue) can be merged to create a priority queue.
Aster Meron Abebe Alemu Belay Kedir Yonas
Female Female Male Male Male Male Male
Algorithm:
create an empty priority queue
while (FemalesQueue is not empty)
{
Data=DequeueFemale(); // delete data at the front of the females queue
EnqueuePriorityQueue(Data);
}
while (MalesQueue is not empty)
{
Data=DequeueMale(); // delete data at the front of the males queue
EnqueuePriorityQueue(Data);
}
5.6. Application of Queues
i. Print server- maintains a queue of print jobs
Print()
{
EnqueuePrintQueue(Document)
}
EndOfPrint()
{
DequeuePrintQueue()
}
Chapter 6
Trees
A tree is a set of nodes and edges that connect pairs of nodes. It is an abstract model of a
hierarchical structure. A rooted tree has the following structure:
• One node is distinguished as the root.
• Every node C except the root is connected from exactly one other node P. P is C's parent,
and C is one of P's children.
• There is a unique path from the root to each node.
• The number of edges in a path is the length of the path.
(figures: an example rooted tree, and the subtree rooted at F whose children are H, I and J and
whose leaves are K, L and M)
Binary tree: a tree in which each node has at most two children called left child and right child.
Full binary tree: a binary tree where each node has either 0 or 2 children.
Balanced binary tree: a binary tree where each node except the leaf nodes has left and right
children and all the leaves are at the same level.
Complete binary tree: a binary tree in which the length from the root to any leaf node is either h
or h-1 where h is the height of the tree. The deepest level should also be
filled from left to right.
Binary search tree (ordered binary tree): a binary tree that may be empty, but if it is not empty it
satisfies the following.
• Every node has a key and no two elements have the same key.
• The keys in the right subtree are larger than the keys in the root.
• The keys in the left subtree are smaller than the keys in the root.
• The left and the right subtrees are also binary search trees.
10
6 15
4 8 14 18
7 12 16 19
11 13
struct DataModel
{
Declaration of data fields
DataModel * Left, *Right;
};
DataModel *RootDataModelPtr=NULL;
struct Node
{
int Num;
Node * Left, *Right;
};
Node *RootNodePtr=NULL;
6.3.1. Insertion
When a node is inserted the definition of binary search tree should be preserved. Suppose there
is a binary search tree whose root node is pointed by RootNodePtr and we want to insert a node
(that stores 17) pointed by InsNodePtr.
(figure: InsertBST(RootNodePtr, InsNodePtr) inserts the node containing 17 into the tree above;
following the search path 10, 15, 18, 16, the new node becomes the right child of 16)
Function call:
if(RootNodePtr == NULL)
RootNodePtr=InsNodePtr;
else
InsertBST(RootNodePtr, InsNodePtr);
Implementation:
void InsertBST(Node *RNP, Node *INP)
{
//RNP=RootNodePtr and INP=InsNodePtr
int Inserted = 0;
while(Inserted == 0)
{
if(RNP->Num > INP->Num)
{
if(RNP->Left == NULL)
{
RNP->Left = INP;
Inserted=1;
}
else
RNP = RNP->Left;
}
else
{
if(RNP->Right == NULL)
{
RNP->Right = INP;
Inserted=1;
}
else
RNP = RNP->Right;
}
}
}
The insertion can also be done recursively:
void InsertBST(Node *RNP, Node *INP)
{
if(RNP->Num > INP->Num)
{
if(RNP->Left == NULL)
RNP->Left = INP;
else
InsertBST(RNP->Left, INP);
}
else
{
if(RNP->Right == NULL)
RNP->Right = INP;
else
InsertBST(RNP->Right, INP);
}
}
6.3.2. Traversing
A binary search tree can be traversed in three ways.
a. Preorder traversal - traversing the tree in the order of parent, left and right.
b. Inorder traversal - traversing the tree in the order of left, parent and right.
c. Postorder traversal - traversing the tree in the order of left, right and parent.
Example:
RootNodePtr
10
6 15
4 8 14 18
7 12 16 19
11 13 17
Preorder traversal - 10, 6, 4, 8, 7, 15, 14, 12, 11, 13, 18, 16, 17, 19
Inorder traversal - 4, 6, 7, 8, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19
==> Used to display nodes in ascending order.
Postorder traversal - 4, 7, 8, 6, 11, 13, 12, 14, 17, 16, 19, 18, 15, 10
Function calls:
Preorder(RootNodePtr);
Inorder(RootNodePtr);
Postorder(RootNodePtr);
Implementation:
void Preorder (Node *CurrNodePtr)
{
if(CurrNodePtr != NULL)
{
cout<< CurrNodePtr->Num; // or any operation on the node
Preorder(CurrNodePtr->Left);
Preorder(CurrNodePtr->Right);
}
}
void Inorder (Node *CurrNodePtr)
{
if(CurrNodePtr != NULL)
{
Inorder(CurrNodePtr->Left);
cout<< CurrNodePtr->Num; // or any operation on the node
Inorder(CurrNodePtr->Right);
}
}
void Postorder (Node *CurrNodePtr)
{
if(CurrNodePtr != NULL)
{
Postorder(CurrNodePtr->Left);
Postorder(CurrNodePtr->Right);
cout<< CurrNodePtr->Num; // or any operation on the node
}
}
Example:
(figure: the expression tree with root +, left subtree for A - B * C and right subtree for D + E / F)

Preorder traversal - + - A * B C + D / E F ==> Prefix notation
Inorder traversal - A - B * C + D + E / F ==> Infix notation
Postorder traversal - A B C * - D E F / + + ==> Postfix notation
Creating Expression Tree
We read our expression one symbol at a time. If the symbol is an operand, we create a one-node
tree and push a pointer to it onto a stack. If the symbol is an operator, we pop pointers to two
trees T1 and T2 from the stack (T1 is popped first) and form a new tree whose root is the operator
and whose left and right children point to T2 and T1 respectively. A pointer to this new tree is
then pushed onto the stack.
The first two symbols are operands, so we create one-node trees and push pointers to them onto
a stack (see figure a below).
a) b) c)
Next, a '+' is read, so two pointers to trees are popped, a new tree is formed, and a pointer to it is
pushed onto the stack (see figure b above).
Next, c, d, and e are read, and for each a one-node tree is created and a pointer to the
corresponding tree is pushed onto the stack (see figure c above).
Now a '+' is read, so two trees are merged (see figure a below).
a) b)
Continuing, a '*' is read, so we pop two tree pointers and form a new tree with a '*' as root (see
figure b above).
Finally, the last symbol ‘*’ is read, two trees are merged, and a pointer to the final tree is left on
the stack.
6.3.4. Searching
To search a node (whose Num value is Number) in a binary search tree (whose root node is
pointed by RootNodePtr), one of the three traversal methods can be used.
Function call:
ElementExists = SearchBST (RootNodePtr, Number);
//ElementExists is a Boolean variable defined as: bool ElementExists = false;
Implementation:
bool SearchBST (Node *RNP, int x)
{
if(RNP == NULL)
return(false);
else if(RNP->Num == x)
return(true);
else if(RNP->Num > x)
return(SearchBST(RNP->Left, x));
else
return(SearchBST(RNP->Right, x));
}
When we search an element in a binary search tree, sometimes it may be necessary for the
SearchBST function to return a pointer that points to the node containing the element searched.
Accordingly, the function has to be modified as follows.
Function call:
//SearchedNodePtr is a pointer variable defined as: Node *SearchedNodePtr = NULL;
SearchedNodePtr = SearchBST (RootNodePtr, Number);
Implementation:
Node *SearchBST (Node *RNP, int x)
{
if((RNP == NULL) || (RNP->Num == x))
return (RNP);
else if(RNP->Num > x)
return (SearchBST(RNP->Left, x));
else
return (SearchBST (RNP->Right, x));
}
6.3.5. Deletion
To delete a node (whose Num value is N) from a binary search tree (whose root node is pointed
to by RootNodePtr), four cases should be considered. When a node is deleted, the definition of
a binary search tree should be preserved.
10
6 14
3 8 12 18
2 4 7 9 11 13 16 19
1 5 15 17
(figure: Delete 7 - node 7 is a leaf, so it is removed and the left pointer of its parent 8 is set to
NULL)
• If the deleted node is the left child of its parent and the deleted node has only the left child,
the left child of the deleted node is made the left child of the parent of the deleted node.
• If the deleted node is the left child of its parent and the deleted node has only the right child,
the right child of the deleted node is made the left child of the parent of the deleted node.
• If the deleted node is the right child of its parent and the node to be deleted has only the left
child, the left child of the deleted node is made the right child of the parent of the deleted
node.
• If the deleted node is the right child of its parent and the deleted node has only the right
child, the right child of the deleted node is made the right child of the parent of the deleted
node.
(figure: Delete 2 - node 2 has only a left child, 1, so 1 is made the left child of 2's parent, 3)
81
Approach 2: Deletion by copying- the following is done
• Copy the node containing the largest element in the left (or the smallest element in the right)
to the node containing the element to be deleted
• Delete the copied node
(figure: Delete 2 by copying - the largest element in 2's left subtree, 1, is copied onto 2 and the
node that held 1 is deleted)
• If the deleted node is the right child of its parent, one of the following is done
o The left child of the deleted node is made the right child of the parent of the deleted
node, and
o The right child of the deleted node is made the right child of the node containing largest
element in the left of the deleted node
OR
o The right child of the deleted node is made the right child of the parent of the deleted
node, and
o The left child of the deleted node is made the left child of the node containing smallest
element in the right of the deleted node
82
(figure: Delete 6 - the right child 8 replaces 6, and 6's left child 3 becomes the left child of 7,
the smallest element in 6's right subtree)
(figure: Delete 6 - the left child 3 replaces 6, and 6's right child 8 becomes the right child of 5,
the largest element in 6's left subtree)
83
(figure: Delete 6 by copying - the largest element in 6's left subtree, 5, is copied onto 6 and the
node that held 5 is deleted)
(figure: Delete 6 by copying - the smallest element in 6's right subtree, 7, is copied onto 6 and
the node that held 7 is deleted)
When the node to be deleted is the root node, one of the following is done:
• If the tree has only one node, the root node pointer is made to point to nothing (NULL)
• If the root node has a left child
o the root node pointer is made to point to the left child
o the right child of the root node is made the right child of the node containing the largest
element in the left of the root node
• If the root node has a right child
o the root node pointer is made to point to the right child
o the left child of the root node is made the left child of the node containing the smallest
element in the right of the root node
(figures: Delete 10 - either the root pointer is moved to the left child 6 and the right subtree
rooted at 14 is made the right child of 9, the largest element in the left subtree; or the root
pointer is moved to the right child 14 and the left subtree rooted at 6 is made the left child of
11, the smallest element in the right subtree)
(figures: Delete 10 by copying - either the largest element in the left subtree, 9, or the smallest
element in the right subtree, 11, is copied onto the root and the copied node is deleted)
Function call:
if ((RootNodePtr->Left==NULL)&&( RootNodePtr->Right==NULL) && (RootNodePtr->Num==N))
{
//the node to be deleted is the root node having no child
delete RootNodePtr;
RootNodePtr = NULL;
}
else
DeleteBST(RootNodePtr, RootNodePtr, N);
Implementation:
void DeleteBST(Node *RNP, Node *PDNP, int x)
{
Node *DNP; // a pointer that points to the currently deleted node
// PDNP is a pointer that points to the parent node of currently deleted node
if(RNP == NULL)
cout<<"Data not found\n";
else if (RNP->Num>x)
DeleteBST(RNP->Left, RNP, x);// delete the element in the left subtree
else if(RNP->Num<x)
DeleteBST(RNP->Right, RNP, x);// delete the element in the right subtree
else
{
DNP = RNP;
if((DNP->Left == NULL) && (DNP->Right == NULL))
{
if (PDNP->Left==DNP)
PDNP->Left=NULL;
else
PDNP->Right=NULL;
delete DNP;
}
else
{
if(DNP->Left != NULL) //find the maximum in the left
{
PDNP = DNP;
DNP = DNP->Left;
while(DNP->Right != NULL)
{
PDNP=DNP;
DNP=DNP->Right;
}
RNP->Num=DNP->Num;
DeleteBST(DNP,PDNP,DNP->Num);
}
else //find the minimum in the right
{
PDNP=DNP;
DNP=DNP->Right;
while(DNP->Left != NULL)
{
PDNP=DNP;
DNP=DNP->Left;
}
RNP->Num = DNP->Num;
DeleteBST(DNP,PDNP,DNP->Num);
}
}
}
}
Chapter 7
Shell sort is an improvement of insertion sort. It was developed by Donald Shell in 1959.
Insertion sort works best when the array elements are already in a reasonable order. Thus, shell
sort first creates this reasonable order.
Algorithm:
1. Choose a gap gk between compared elements.
2. Generate a sequence (called the increment sequence) gk, gk-1, ..., g2, g1, where for each
increment gi the list satisfies A[j] <= A[j+gi] for 0 <= j <= n-1-gi and k >= i >= 1.
It is advisable to choose gk = n/2 and gi-1 = gi/2 for k >= i >= 1. After the pass with increment gi
is done, the list is said to be gi-sorted. Shell sorting is complete when the list is 1-sorted (the
final pass being an ordinary insertion sort) and A[j] <= A[j+1] for 0 <= j <= n-2. Time
complexity is O(n^(3/2)).
5 8 2 4 1 3 9 7 6 0
Sort (5, 3) 3 8 2 4 1 5 9 7 6 0
Sort (8, 9) 3 8 2 4 1 5 9 7 6 0
Sort (2, 7) 3 8 2 4 1 5 9 7 6 0
Sort (4, 6) 3 8 2 4 1 5 9 7 6 0
Sort (1, 0) 3 8 2 4 0 5 9 7 6 1
5- sorted list 3 8 2 4 0 5 9 7 6 1
Choose g2 =3
Sort (3, 4, 9, 1) 1 8 2 3 0 5 4 7 6 9
Sort (8, 0, 7) 1 0 2 3 7 5 4 8 6 9
Sort (2, 5, 6) 1 0 2 3 7 5 4 8 6 9
3- sorted list 1 0 2 3 7 5 4 8 6 9
Sort (1, 0, 2, 3, 7, 5, 4, 8, 6, 9) 0 1 2 3 4 5 6 7 8 9
1- sorted (shell sorted) list 0 1 2 3 4 5 6 7 8 9
The code:
void shellsort(input_type a[], int n)
{
int i, j, increment;
input_type tmp;
for( increment = n/2; increment > 0; increment /= 2 )
{
for( i = increment+1; i<=n; i++ )
{
tmp = a[i];
for( j = i; j > increment; j -= increment )
if( tmp < a[j-increment] )
a[j] = a[j-increment];
else
break;
a[j] = tmp;
}
}
}
Algorithm:
1. Choose a pivot value (mostly the first element is taken as the pivot value)
2. Position the pivot element and partition the list so that:
the left part has items less than or equal to the pivot value
the right part has items greater than or equal to the pivot value
3. Recursively sort the left part
4. Recursively sort the right part
Example: Sort the following list using the quick sort algorithm.

5 8 2 4 1 3 9 7 6 0

(figure: with 5 as the pivot, the list is partitioned into 0 3 2 4 1 5 9 7 6 8; the left part
0 3 2 4 1 and the right part 9 7 6 8 are then sorted recursively)
The following algorithm can be used to position a pivot value and create partition.
Quick sort can also be done in a slightly different way. We choose element in the center of the
array as the pivot value. Putting pointers on the left and right positions of the array:
• Increment left until you find a value greater than pivot
• Decrement right until you find a value less than pivot
• Then swap the two values
The following code can be used to do the sorting
int partition(int a[], int left, int right, int pivotIndex)
{
int pivot = a[pivotIndex];
do
{
while (a[left] < pivot)
left++;
while (a[right] > pivot)
right--;
if (left < right && a[left] != a[right])
swap(a[left], a[right]);
else
return right;
}while (left < right);
return right;
}
void quicksort(int a[], int left, int right)
{
if (left < right)
{
int pivot = (left + right) / 2; // middle
int pivotNew = partition(a, left, right, pivot);
quicksort(a, left, pivotNew - 1);
quicksort(a, pivotNew + 1, right);
}
}
Algorithm:
1. Construct a binary tree
• The root node corresponds to Data[0].
• If we consider the index associated with a particular node to be i, then the left child
of this node corresponds to the element with index 2*i+1 and the right child
corresponds to the element with index 2*i+2. If any or both of these elements do not
exist in the array, then the corresponding child node does not exist either.
2. Construct the heap tree from initial binary tree using "adjust" process.
3. Sort by swapping the root value with the lowest, right-most value, deleting the
lowest, right-most value, and inserting the deleted value in the array in its proper position.
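The index arithmetic in step 1 can be captured by small helper functions (a sketch using the 0-based indexing described above; the names are illustrative, and note that the heapsort listing later in this section uses 1-based indexing instead, where node i has children 2*i and 2*i+1):

```cpp
// 0-based complete-tree index arithmetic: the root is Data[0], and node i has
// children at 2*i+1 and 2*i+2 and parent at (i-1)/2.
int leftChild(int i)  { return 2 * i + 1; }
int rightChild(int i) { return 2 * i + 2; }
int parent(int i)     { return (i - 1) / 2; }

// A child exists only if its index is below the number of elements n.
bool exists(int i, int n) { return i < n; }
```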
Example: sort the following list using the heap sort algorithm.
5 8 2 4 1 3 9 7 6 0
[Figure: the initial binary tree built from the array (root 5) and the heap tree obtained from it by the "adjust" process (root 9).]
Swap the root node with the lowest, right-most node and delete the lowest, right-most value;
insert the deleted value in the array in its proper position; adjust the heap tree; and repeat
this process until the tree is empty.
[Figure: the sequence of heap trees after each swap-and-delete step, together with the growing sorted portion of the array: 8 9; 7 8 9; 6 7 8 9; 5 6 7 8 9; 4 5 6 7 8 9; 3 4 5 6 7 8 9; 2 3 4 5 6 7 8 9; 1 2 3 4 5 6 7 8 9; and finally 0 1 2 3 4 5 6 7 8 9.]
The code:
void heapsort( input_type a[], unsigned int n )
{
    int i;
    for( i = n/2; i > 0; i-- )      /* build_heap */
        perc_down( a, i, n );
    for( i = n; i >= 2; i-- )
    {
        swap( &a[1], &a[i] );       /* delete_max */
        perc_down( a, 1, i-1 );
    }
}
/* Percolate a[i] down to its proper place in the heap a[1..n].
   Note: this routine uses 1-based indexing, so a[0] is unused. */
void perc_down( input_type a[], unsigned int i, unsigned int n )
{
    unsigned int child;
    input_type tmp;
    for( tmp = a[i]; i*2 <= n; i = child )
    {
        child = i*2;
        if( ( child != n ) && ( a[child+1] > a[child] ) )
            child++;
        if( tmp < a[child] )
            a[i] = a[child];
        else
            break;
    }
    a[i] = tmp;
}
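The listing above can be collected into a self-contained sketch. Here input_type is assumed to be int, and the pointer-based helper (named swap_vals to avoid colliding with std::swap) matches the call swap(&a[1], &a[i]) in the text. The heap occupies a[1..n], so a[0] is unused:

```cpp
typedef int input_type;   // assumption: the keys are ints

// Pointer-based swap, matching the call swap(&a[1], &a[i]) above.
static void swap_vals(input_type *x, input_type *y)
{
    input_type t = *x; *x = *y; *y = t;
}

// Percolate a[i] down to its proper place in the heap a[1..n] (1-based).
void perc_down(input_type a[], unsigned int i, unsigned int n)
{
    unsigned int child;
    input_type tmp;
    for (tmp = a[i]; i * 2 <= n; i = child)
    {
        child = i * 2;
        if (child != n && a[child + 1] > a[child])
            child++;
        if (tmp < a[child])
            a[i] = a[child];
        else
            break;
    }
    a[i] = tmp;
}

void heapsort(input_type a[], unsigned int n)
{
    for (unsigned int i = n / 2; i > 0; i--)   // build_heap
        perc_down(a, i, n);
    for (unsigned int i = n; i >= 2; i--)
    {
        swap_vals(&a[1], &a[i]);               // delete_max
        perc_down(a, 1, i - 1);
    }
}
```

For the example array, storing 5 8 2 4 1 3 9 7 6 0 in a[1..10] and calling heapsort(a, 10) leaves a[1..10] = 0 1 2 3 4 5 6 7 8 9.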
Merge Sort
The fundamental operation in this algorithm is merging two sorted lists. Because the lists are
sorted, this can be done in one pass through the input, if the output is put in a third list. Suppose
we have two input arrays called a and b, an output array c, and three counters, aptr, bptr, and
cptr, which are initially set to the beginning of their respective arrays.
First, a comparison is done between 1 and 2; 1 is added to c, and then 13 and 2 are compared,
and so on. Eventually 26 is added to c, and the a array is exhausted; the remainder of the b
array is then copied to c.
Algorithm:
1. Divide the array into two halves.
2. Recursively sort the first n/2 items.
3. Recursively sort the last n/2 items.
4. Merge sorted items (using an auxiliary array).
Example: sort the following list using the merge sort algorithm.
5 8 2 4 1 3 9 7 6 0
Division phase:
5 8 2 4 1 | 3 9 7 6 0
5 8 | 2 4 1 | 3 9 | 7 6 0
5 | 8 | 2 | 4 1 | 3 | 9 | 7 | 6 0
Sorting and merging phase:
5 8 | 2 | 1 4 | 3 9 | 7 | 0 6
5 8 | 1 2 4 | 3 9 | 0 6 7
1 2 4 5 8 | 0 3 6 7 9
0 1 2 3 4 5 6 7 8 9
The code:
void merge( int a[], int tmp_array[], int left_pos, int right_pos, int right_end )
{
    int i, left_end, num_elements, tmp_pos;
    left_end = right_pos - 1;
    tmp_pos = left_pos;
    num_elements = right_end - left_pos + 1;
    while( ( left_pos <= left_end ) && ( right_pos <= right_end ) ) /* main loop */
    {
        if( a[left_pos] <= a[right_pos] )
            tmp_array[tmp_pos++] = a[left_pos++];
        else
            tmp_array[tmp_pos++] = a[right_pos++];
    }
    while( left_pos <= left_end )       /* copy rest of first half */
        tmp_array[tmp_pos++] = a[left_pos++];
    while( right_pos <= right_end )     /* copy rest of second half */
        tmp_array[tmp_pos++] = a[right_pos++];
    for( i = 0; i < num_elements; i++, right_end-- ) /* copy tmp_array back */
        a[right_end] = tmp_array[right_end];
}
void m_sort( int a[], int tmp_array[], int left, int right )
{
    if( left < right )
    {
        int center = ( left + right ) / 2;
        m_sort( a, tmp_array, left, center );
        m_sort( a, tmp_array, center + 1, right );
        merge( a, tmp_array, left, center + 1, right );
    }
}
int merge_sort( int a[], int n )
{
    int *tmp_array = new( std::nothrow ) int[n];
    if( tmp_array == NULL )
    {
        cout<<"No space for temporary array!!!";
        return 0;
    }
    m_sort( a, tmp_array, 0, n - 1 );
    delete [] tmp_array;
    return 1;
}
Chapter 8
Hashing
A hash table is a data structure that stores data and allows insertions, lookups, and deletions to
be performed in constant, O(1), time. Hash tables provide a means for rapid searching for a
record with a given key value and adapt well to the insertion and deletion of records. Hashing
is a technique used for performing insertions, deletions, and finds in constant time. The aim of
hashing is to map an extremely large key space onto a reasonably small range of integers such
that it is unlikely that two keys are mapped onto the same integer.
Direct addressing is a simple technique that works well when the universe U of keys is
reasonably small. Suppose that an application needs a dynamic set in which each element has a
key drawn from the universe U = {0, 1, . . . , m - 1}, where m is not too large. We shall assume
that no two elements have the same key.
Hash tables are implemented using an array. To map a key to an entry, we first map the key to an
integer and then use the integer as an index into the array A. The function that computes the
index from keys is called a hash function.
Hash Function
A hash function is a function which, when applied to a key, produces an integer which can be
used as an address in a hash table.
Hashing saves items in a key-indexed table (the index is a function of the key). A hash function
maps a key to an index in the hash table:
h: U → {0, 1, …, m-1}, where m is the table size and |U| = n
Given an item x with key k, x is stored at location h(k).
The ideal function, termed as perfect hash function, would distribute all elements across the
buckets such that no collisions ever occur.
Generally, the following hashing functions could be used: division method, and multiplication
method.
Division Method
Once we have a key k represented as an integer, one of the simplest hashing methods is to map
it into one of m positions in a table by taking the remainder of k divided by m. This is called the
division method. Formally stated:
h(k) = k mod m
Using this method, if the table has m = 1699 positions, and we hash the key k = 25,657, the hash
coding is 25,657 mod 1699 = 172. Typically, we should avoid values for m that are powers of 2,
because if m = 2^p, then h becomes just the p lowest-order bits of k. Usually we choose m to be
a prime number not too close to an exact power of 2, while considering storage constraints and
the load factor.
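The division method amounts to a single line of code (a sketch; hash_division is an illustrative name):

```cpp
// Division-method hash: h(k) = k mod m.  m should be a prime not too close
// to a power of two, as discussed above.
unsigned int hash_division(unsigned int k, unsigned int m)
{
    return k % m;
}
```

For the worked example above, hash_division(25657, 1699) returns 172.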
Multiplication Method
An alternative to the division method is to multiply the integer key k by a constant A in the
range 0 < A < 1; extract the fractional part; multiply this value by the number of positions in the
table, m; and take the floor of the result. Typically, A is chosen to be 0.618, which is the square
root of 5, minus 1, all divided by 2. This method is called the multiplication method. Formally
stated:
h(k) = ⌊m (kA mod 1)⌋, where A ≈ (√5 – 1) / 2 ≈ 0.618
An advantage to this method is that m, the number of positions in the table, is not as critical as
in the division method. For example, if the table contains m = 2000 positions, and we hash the
key k = 6341, the hash coding is (2000)((6341)(0.618) mod 1) = (2000)(3918.738 mod 1) =
(2000)(0.738) = 1476.
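The multiplication method can be sketched as follows (hash_multiplication is an illustrative name; A = 0.618 follows the text):

```cpp
#include <cmath>   // std::fmod, std::floor

// Multiplication-method hash: h(k) = floor(m * (k*A mod 1)), with the
// constant A = 0.618 used in the text.
unsigned int hash_multiplication(unsigned int k, unsigned int m)
{
    const double A = 0.618;                   // ~ (sqrt(5) - 1) / 2
    double frac = std::fmod(k * A, 1.0);      // fractional part of k*A
    return (unsigned int)std::floor(m * frac);
}
```

hash_multiplication(6341, 2000) reproduces the worked example above (floating-point rounding can shift the last digit, since 0.618 is not exactly representable in binary).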
If the input keys are integers, then simply returning key mod HASH_SIZE is generally a
reasonable strategy, unless key happens to have some undesirable properties. In this case, the
choice of hash function needs to be carefully considered. For instance, if the table size is 10 and
the keys all end in zero, then the standard hash function is obviously a bad choice.
For many reasons, and to avoid situations like the one above, it is usually a good idea to ensure
that the table size is prime. When the input keys are random integers, then this function is not
only very simple to compute but also distributes the keys evenly.
Usually, the keys are strings. In this case, the hash function needs to be chosen carefully. One
option is to add up the ASCII values of the characters in the string.
The hash function just described is simple to implement and computes an answer
quickly. However, if the table size is large, the function does not distribute the keys well. For
instance, suppose that HASH_SIZE = 10,007 (a prime number). If all the keys are eight or
fewer characters long, since a char has an integer value that is always at most 127, the hash
function can only assume values between 0 and 1016, which is 127 * 8. This is clearly not an
equitable distribution!
A second approach assumes the key has at least two characters plus the NULL terminator and
combines only the first three characters, weighting them by 27 (the number of letters in the
English alphabet, plus the blank) and 729, which is 27^2. If these characters are
random, and the table size is 10,007, as before, then we would expect a reasonably equitable
distribution. This function, although easily computable, is also not appropriate if the hash table
is reasonably large.
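The three-character function described above can be sketched as follows (a reconstruction; the name hash3 is illustrative, and HASH_SIZE = 10,007 is the prime table size from the text):

```cpp
const unsigned int HASH_SIZE = 10007;   // the prime table size from the text

// Sketch of the function described above: it combines only the first three
// characters of the key, weighting them by powers of 27 (letters plus the
// blank); 729 = 27 * 27.
unsigned int hash3(const char *key)
{
    return (key[0] + 27 * key[1] + 729 * key[2]) % HASH_SIZE;
}
```

Because only the first three characters matter, keys such as "abcdef" and "abcxyz" collide, which is why this function is inappropriate for large tables.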
Example: a full program that uses a hash table to store student marks using the ID as key
#include <iostream>
using std::cout; using std::cin;

const unsigned int SIZE = 11;
int mark[SIZE];

int hash( char *key )
{
    unsigned int hash_val = 0;
    while( *key != '\0' )
    {
        hash_val += *key++;
    }
    return( hash_val % SIZE );
}

int search( char *searched )
{
    int index = hash( searched );
    cout<<"\nID: "<<searched;
    cout<<"\nMark: "<<mark[index];
    return 1;
}

int main()
{
    int index = 0;
    int mval = 0;
    char ID[15];
    for( int i = 0; i < 5; i++ )
    {
        cout<<"ID: "; cin>>ID;
        cout<<"Mark: "; cin>>mval;
        index = hash( ID );
        mark[index] = mval;
    }
    char searchedID[30];
    cout<<"\nEnter searched student ID: ";
    cin>>searchedID;
    search( searchedID );
    return 0;
}
Collision Handling
When the set K of keys stored in a hash table is much smaller than the universe U of all possible
keys, a hash table requires much less storage than a direct-address table.
What do we do if more than one key hashes to the same value? This is called a collision. If we
cannot define a perfect hashing function, we must deal with collisions. A collision occurs when
multiple keys map onto the same table index.
When multiple keys map to the same integer, elements with different keys may be stored in the
same slot of the hash table. It is clear that when the hash function is used to locate a potential
match, it will be necessary to compare the key of that element with the search key. But there
may be more than one element which should be stored in a single slot of the table.
There are many ways to handle collisions. These include chaining, double hashing, linear
probing, quadratic probing, random probing, etc.
The dictionary operations on a hash table are easy to implement when collisions are resolved by
chaining.
Linear Probing
Linear probing checks cells sequentially (with wraparound) in search of an empty cell if
collision occurs. For linear probing it is a bad idea to let the hash table get nearly full, because
performance degrades.
If the hash table is not full, attempt to store the key in the next array element; i.e. if h(k) = t, then
check t+1, t+2, t+3, … until you find an empty slot.
Linear probing resolves collisions by simply checking the next slot, i.e. if a collision occurred in
slot j, the next slot to check would be slot j + 1. More formally, linear probing uses the hash
function
h(k, i) = (h(k) + i) mod m for i = 0, 1,...,m-1
Insert: start with the location where the key hashed and do a sequential search for an empty
slot.
Search: start with the location where the key hashed and do a sequential search until you
either find the key (success) or find an empty slot (failure).
Delete: (lazy deletion) follow same route but mark slot as DELETED rather than EMPTY,
otherwise subsequent searches will fail.
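The three operations above can be sketched as follows (a minimal sketch assuming integer keys, a table of size m = 10, the division-method hash, and illustrative sentinel values EMPTY and DELETED for the slot states):

```cpp
const int M = 10;                  // table size (illustrative)
const int EMPTY = -1, DELETED = -2;
int table_[M] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY,
                 EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

int h(int k) { return k % M; }     // division-method hash

// Insert: sequential search (with wraparound) for an empty slot.
bool insertKey(int k)
{
    for (int i = 0; i < M; i++)
    {
        int j = (h(k) + i) % M;    // h(k,i) = (h(k) + i) mod m
        if (table_[j] == EMPTY || table_[j] == DELETED)
        {
            table_[j] = k;
            return true;
        }
    }
    return false;                  // table is full
}

// Search: probe until the key (success) or an EMPTY slot (failure) is found.
int searchKey(int k)
{
    for (int i = 0; i < M; i++)
    {
        int j = (h(k) + i) % M;
        if (table_[j] == k) return j;
        if (table_[j] == EMPTY) return -1;
    }
    return -1;
}

// Lazy deletion: mark the slot DELETED rather than EMPTY, so that later
// searches continue probing past it.
bool deleteKey(int k)
{
    int j = searchKey(k);
    if (j < 0) return false;
    table_[j] = DELETED;
    return true;
}
```

Inserting 89, 18, and 49 as in the example below places 89 in slot 9 and 18 in slot 8, and wraps 49 around to slot 0.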
Example: insert the values 89, 18, 49, 58, 69 into hash table using linear probing. Use division
method for hash function.
The first collision occurs when 49 is inserted; it is put in the next available spot, namely spot 0,
which is open. 58 collides with 18, 89, and then 49 before an empty cell is found three away.
The collision for 69 is handled in a similar manner.
As long as the table is big enough, a free cell can always be found, but the time to do so can get
quite large. Worse, even if the table is relatively empty, blocks of occupied cells start forming.
This effect, known as primary clustering, means that any key that hashes into the cluster will
require several attempts to resolve the collision, and then it will add to the cluster.
Quadratic Probing
Quadratic probing is a collision resolution method that eliminates the primary clustering
problem of linear probing. Probe the table at slots (h(k) + i^2) mod m for i = 0, 1, 2, 3, . . . , m - 1
if the current location to which the key hashed is occupied. The hashing function is
h(k, i) = (h(k) + i^2) mod m
Quadratic probing is better than linear probing but may result in secondary clustering: if
h(k1)=h(k2) the probing sequences for k1 and k2 are exactly the same.
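The quadratic probe sequence can be written as a one-line helper (a sketch; hk is the initial hash value h(k), and m is usually chosen prime):

```cpp
// Quadratic probing: the i-th probe for a key with hash value hk examines
// slot (hk + i*i) mod m.
int quadraticProbe(int hk, int i, int m)
{
    return (hk + i * i) % m;
}
```

With hk = 9 and m = 10, the probes visit slots 9, 0, 3, and so on.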
Random Probing
Use a pseudorandom number generator to obtain the sequence of slots that are probed.
Double Hashing
Double hashing resolves collisions by using a second hash function, h2, to determine which slot
to try next if a collision occurs. The probing function is
h(k, i) = (h(k) + i*h2(k)) mod m
Example: Use double hashing to store 18, 41, 22, 44, 59, 32, 31, 73 with:
m = 13
h(k) = k mod m
h2(k) = 7 - (k mod 7)
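The example can be checked with a small sketch (illustrative names; the probes follow h(k, i) = (h(k) + i·h2(k)) mod m with m = 13):

```cpp
const int M = 13;
const int EMPTY = -1;
int slots[M] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY,
                EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

int h(int k)  { return k % M; }
int h2(int k) { return 7 - (k % 7); }   // second hash; never zero

// Insert k using double hashing; returns the slot used, or -1 if the
// table is full.
int insertDouble(int k)
{
    for (int i = 0; i < M; i++)
    {
        int j = (h(k) + i * h2(k)) % M;
        if (slots[j] == EMPTY)
        {
            slots[j] = k;
            return j;
        }
    }
    return -1;
}
```

Inserting 18, 41, 22, 44, 59, 32, 31, 73 in order fills slots 5, 2, 9, 10, 7, 6, 0, and 8 respectively: 44 and 31 collide with 18 at slot 5 and are displaced by their h2 step sizes.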
Performance:
• Much better than linear or quadratic probing.
• Does not suffer from clustering.
• But requires computation of a second hash function.
Application Areas
Hashing is used in many areas. A few are:
• Compilers use hash tables for symbol storage.
• The Linux Kernel uses hash tables to manage memory pages and buffers.
• High speed routing tables use hash tables.
• Database systems use hash tables.