Data Structures & Algorithms
Data Structure
Array
Linked List
Stack
Queue
Binary Tree
Binary Search Tree
Algorithms
Analysis of algorithms
Searching algorithms
Sorting algorithms
Array
An array is a collection of items stored at contiguous memory locations. The
idea is to store multiple items of the same type together. This makes it easier
to calculate the position of each element by simply adding an offset to a base
value, i.e., the memory location of the first element of the array (generally
denoted by the name of the array).
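The base-plus-offset arithmetic can be made concrete with a small sketch (the base address and element size here are illustrative assumptions; Java does not expose real memory addresses):

```java
// Conceptual address calculation for arr[i]: with a base address and a
// fixed element size, the i-th element lives at base + i * size, which
// is why indexed access takes constant time.
class ArrayOffset {
    public static void main(String[] args) {
        long base = 1000;   // assumed base address (illustrative only)
        int elemSize = 4;   // bytes per int
        int i = 3;
        long address = base + (long) i * elemSize;
        System.out.println(address);  // 1000 + 3 * 4 = 1012
    }
}
```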
class RotateArray {
    /* Function to left rotate arr[] of size n by d */
    void leftRotate(int arr[], int d, int n)
    {
        for (int i = 0; i < d; i++)
            leftRotatebyOne(arr, n);
    }

    /* Function to left rotate arr[] of size n by one */
    void leftRotatebyOne(int arr[], int n)
    {
        int i, temp;
        temp = arr[0];
        for (i = 0; i < n - 1; i++)
            arr[i] = arr[i + 1];
        arr[i] = temp;
    }

    /* Utility function to print an array */
    void printArray(int arr[], int n)
    {
        for (int i = 0; i < n; i++)
            System.out.print(arr[i] + " ");
    }

    // Driver program to test above functions
    public static void main(String[] args)
    {
        RotateArray rotate = new RotateArray();
        int arr[] = { 1, 2, 3, 4, 5, 6, 7 };
        rotate.leftRotate(arr, 2, 7);
        rotate.printArray(arr, 7);
    }
}
Output:
3 4 5 6 7 1 2
Time complexity: O(n * d)
Auxiliary Space: O(1)
Linked List
In simple words, a linked list consists of nodes, where each node contains a data field and a reference (link) to the next node in the list. There are three common types:
1. Singly Linked List
2. Doubly Linked List
3. Circular Linked List
Why Linked List?
Arrays can be used to store linear data of similar types, but arrays have the following limitations:
1) The size of an array is fixed, so we must know the upper limit on the number of elements in advance. Also, the allocated memory is generally equal to that upper limit irrespective of actual usage.
2) Inserting a new element into an array is expensive, because room has to be made for the new element by shifting existing elements.
Write a function to count the number of nodes in a given singly linked list.
For example, the function should return 5 for linked list 1->3->1->2->1.
// Java program to count the number of nodes in a linked list

/* Linked list Node */
class Node
{
    int data;
    Node next;
    Node(int d) { data = d; next = null; }
}

// Linked List class
class LinkedList
{
    Node head; // head of list

    /* Inserts a new Node at front of the list. */
    public void push(int new_data)
    {
        /* 1 & 2: Allocate the Node &
                  put in the data */
        Node new_node = new Node(new_data);

        /* 3. Make next of new Node as head */
        new_node.next = head;

        /* 4. Move the head to point to new Node */
        head = new_node;
    }

    /* Returns count of nodes in linked list */
    public int getCount()
    {
        Node temp = head;
        int count = 0;
        while (temp != null)
        {
            count++;
            temp = temp.next;
        }
        return count;
    }

    /* Driver program to test above functions. Ideally
       this function should be in a separate user class.
       It is kept here to keep code compact. */
    public static void main(String[] args)
    {
        /* Start with the empty list */
        LinkedList llist = new LinkedList();
        llist.push(1);
        llist.push(3);
        llist.push(1);
        llist.push(2);
        llist.push(1);
        System.out.println("Count of nodes is " +
                           llist.getCount());
    }
}
Output:
Count of nodes is 5
Applications of Queue:
A queue is used when things don't have to be processed immediately, but have to be processed in First In, First Out order, as in Breadth First Search. This property of a queue makes it useful in the following kinds of scenarios:
1) When a resource is shared among multiple consumers. Examples include CPU scheduling and disk scheduling.
2) When data is transferred asynchronously (data not necessarily received at the same rate as it is sent) between two processes. Examples include IO buffers, pipes, and file IO.
Array implementation of Queue
To implement a queue, we need to keep track of two indices, front and rear. We enqueue an item at the rear and dequeue an item from the front. If we simply increment the front and rear indices, the front may eventually run off the end of the array. The solution is to advance front and rear in a circular manner (modulo the capacity).
// Java program for array implementation of queue

// A class to represent a queue
class Queue
{
    int front, rear, size;
    int capacity;
    int array[];

    public Queue(int capacity)
    {
        this.capacity = capacity;
        front = this.size = 0;
        rear = capacity - 1;
        array = new int[this.capacity];
    }

    // Queue is full when size becomes equal to the capacity
    boolean isFull(Queue queue)
    {
        return (queue.size == queue.capacity);
    }

    // Queue is empty when size is 0
    boolean isEmpty(Queue queue)
    {
        return (queue.size == 0);
    }

    // Method to add an item to the queue.
    // It changes rear and size.
    void enqueue(int item)
    {
        if (isFull(this))
            return;
        this.rear = (this.rear + 1) % this.capacity;
        this.array[this.rear] = item;
        this.size = this.size + 1;
        System.out.println(item + " enqueued to queue");
    }

    // Method to remove an item from the queue.
    // It changes front and size.
    int dequeue()
    {
        if (isEmpty(this))
            return Integer.MIN_VALUE;
        int item = this.array[this.front];
        this.front = (this.front + 1) % this.capacity;
        this.size = this.size - 1;
        return item;
    }

    // Method to get the front of the queue
    int front()
    {
        if (isEmpty(this))
            return Integer.MIN_VALUE;
        return this.array[this.front];
    }

    // Method to get the rear of the queue
    int rear()
    {
        if (isEmpty(this))
            return Integer.MIN_VALUE;
        return this.array[this.rear];
    }
}

// Driver class
public class Test
{
    public static void main(String[] args)
    {
        Queue queue = new Queue(1000);
        queue.enqueue(10);
        queue.enqueue(20);
        queue.enqueue(30);
        queue.enqueue(40);
        System.out.println(queue.dequeue() +
                           " dequeued from queue\n");
        System.out.println("Front item is " +
                           queue.front());
        System.out.println("Rear item is " +
                           queue.rear());
    }
}

// This code is contributed by Gaurav Miglani
Output:
10 enqueued to queue
20 enqueued to queue
30 enqueued to queue
40 enqueued to queue
10 dequeued from queue
Front item is 20
Rear item is 40
Time Complexity: All operations (enqueue(), dequeue(), isFull(), isEmpty(), front() and rear()) run in O(1); there is no loop in any of them.
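In production Java code, the standard library's java.util.ArrayDeque provides the same O(1) queue operations without a hand-rolled fixed-capacity array; a brief usage sketch:

```java
import java.util.ArrayDeque;

// ArrayDeque used as a FIFO queue: add() enqueues at the rear,
// poll() dequeues from the front, peek() inspects the front.
class DequeDemo {
    public static void main(String[] args) {
        ArrayDeque<Integer> queue = new ArrayDeque<>();
        queue.add(10);
        queue.add(20);
        queue.add(30);
        System.out.println(queue.poll());  // 10 (removed from front)
        System.out.println(queue.peek());  // 20 (now at front)
    }
}
```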
Binary Tree
Why Tree Data Structure?
1. One reason to use trees is to store information that naturally forms a hierarchy, such as a file system:

        file system
        -----------
             /    <-- root
           /   \
         ...   home
              /    \
          ugrad   course
           /     /   |   \
         ...  cs101 cs112 cs113
2. Trees (with some ordering, e.g., BST) provide moderate access/search (quicker than Linked Lists and slower than arrays).
3. Trees provide moderate insertion/deletion (quicker than arrays and slower than unordered Linked Lists).
4. Like Linked Lists and unlike arrays, trees don't have an upper limit on the number of nodes, as nodes are linked using pointers.
Main applications of trees include:
1. Manipulate hierarchical data.
2. Make information easy to search (see tree traversal).
3. Manipulate sorted lists of data.
4. As a workflow for compositing digital images for visual effects.
5. Router algorithms
6. Form of a multi-stage decision-making (see business chess).
Binary Tree: A tree whose elements have at most 2 children is called a
binary tree. Since each element in a binary tree can have only 2 children, we
typically name them the left and right child.
Binary Tree Representation: A tree is represented by a pointer (reference) to the topmost node in the tree. If the tree is empty, the root is null.
A tree node contains the following parts:
1. Data
2. Pointer to left child
3. Pointer to right child
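In Java, which the code elsewhere in this document uses, such a node can be sketched as follows (the class name TreeNode and the demo in main are illustrative, not from the original):

```java
// A binary tree node: data plus references to left and right children.
class TreeNode {
    int data;
    TreeNode left, right;

    TreeNode(int data) {
        this.data = data;
        left = right = null; // a new node starts as a leaf
    }

    public static void main(String[] args) {
        // Build a tiny tree:    1
        //                      / \
        //                     2   3
        TreeNode root = new TreeNode(1);
        root.left = new TreeNode(2);
        root.right = new TreeNode(3);
        System.out.println(root.left.data + " " + root.data + " " + root.right.data);
    }
}
```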
3) In a Binary Tree with N nodes, the minimum possible height or minimum number of levels is ⌈log₂(N+1)⌉.
This can be directly derived from point 2 above. If we consider the convention where the height of a leaf node is 0, then the formula for minimum possible height becomes ⌈log₂(N+1)⌉ − 1.
4) A Binary Tree with L leaves has at least ⌈log₂L⌉ + 1 levels.
A binary tree has the maximum number of leaves (and the minimum number of levels) when all levels are fully filled. Let all leaves be at level l; then the following holds for the number of leaves L:
L ≤ 2^(l−1)   [From Point 1]
so l ≥ ⌈log₂L⌉ + 1
where l is the minimum number of levels.
5) In a Binary Tree where every node has 0 or 2 children, the number of leaf nodes is always one more than the number of nodes with two children:
L = T + 1
where L = number of leaf nodes and T = number of internal nodes with two children.
Following are examples of binary trees in which every node has 0 or 2 children:

               18
             /    \
           15      30
          /  \    /  \
        40    50 100   40

               18
             /    \
           15      20
          /  \
        40    50
             /  \
           30    50

               18
             /    \
           40      30
                  /  \
                100    40
Complete Binary Tree: A Binary Tree is a complete binary tree if all levels are completely filled except possibly the last level, and the last level has all keys as far left as possible.
Following are examples of complete binary trees:

               18
             /    \
           15      30
          /  \    /  \
        40    50 100   40

               18
             /    \
           15      30
          /  \    /  \
        40    50 100   40
       /  \   /
      8    7 9

A practical example of a complete binary tree is the binary heap.
Perfect Binary Tree: A binary tree is a perfect binary tree if all internal nodes have two children and all leaves are at the same level.
Following are examples of perfect binary trees:

               18
             /    \
           15      30
          /  \    /  \
        40    50 100   40

               18
             /    \
           15      30

A perfect binary tree of height h (where height is the number of nodes on the path from the root to a leaf) has 2^h − 1 nodes.
An example of a perfect binary tree is the set of ancestors in a family tree: keep a person at the root, parents as children, parents of parents as their children, and so on.
A degenerate (or pathological) tree is a tree where every internal node has exactly one child. Such trees are, performance-wise, the same as a linked list.

      10
     /
    20
      \
       30
         \
          40
Binary Search Tree
A Binary Search Tree is a node-based binary tree data structure with the following properties:
1. The left subtree of a node contains only nodes with keys less than the node's key.
2. The right subtree of a node contains only nodes with keys greater than the node's key.
3. The left and right subtrees must each also be a binary search tree.
Searching a key
To search for a given key in a Binary Search Tree, we first compare it with the root. If the key is present at the root, we return the root. If the key is greater than the root's key, we recur for the right subtree of the root node; otherwise we recur for the left subtree.
Illustration of searching for a key:
1. Start from the root.
2. Compare the element to be searched with the root; if it is less than the root, recurse left, else recurse right.
3. If the element is found anywhere, return true, else return false.
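The complete program later in this section implements insert and delete but not search; following the steps above, a recursive search can be sketched like this (the class and method names are assumptions):

```java
class BSTSearch {
    static class Node {
        int key;
        Node left, right;
        Node(int item) { key = item; left = right = null; }
    }

    // Returns the node containing key, or null if the key is absent.
    static Node search(Node root, int key) {
        // Base cases: empty subtree, or key found at root.
        if (root == null || root.key == key)
            return root;
        // Key greater than root's key: it can only be in the right subtree.
        if (key > root.key)
            return search(root.right, key);
        // Otherwise recurse into the left subtree.
        return search(root.left, key);
    }

    public static void main(String[] args) {
        /* Build the tree:   8
                            / \
                           3   10
                          / \
                         1   6   */
        Node root = new Node(8);
        root.left = new Node(3);
        root.right = new Node(10);
        root.left.left = new Node(1);
        root.left.right = new Node(6);
        System.out.println(search(root, 6) != null);  // true
        System.out.println(search(root, 7) != null);  // false
    }
}
```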
Insertion of a key
A new key is always inserted at a leaf. We search for the key from the root until we hit a leaf node; the new node is then added as a child of that leaf.

         100                          100
        /   \        Insert 40       /   \
      20     500   --------->      20     500
     /  \                         /  \
   10    30                     10    30
                                       \
                                        40
Deletion of a key
When we delete a node with two children, we copy its inorder successor into the node and then delete the inorder successor. For example:

     50                          60
    /  \      delete(50)        /  \
  40    70    --------->      40    70
       /  \                          \
     60    80                         80

The important thing to note is that the inorder successor is needed only when the right child is not empty. In this case, the inorder successor can be obtained by finding the minimum value in the right child of the node.
// Java program to demonstrate delete operation in binary search tree
class BinarySearchTree
{
    /* Class containing left and right child of current node and key value */
    class Node
    {
        int key;
        Node left, right;

        public Node(int item)
        {
            key = item;
            left = right = null;
        }
    }

    // Root of BST
    Node root;

    // Constructor
    BinarySearchTree()
    {
        root = null;
    }

    // This method mainly calls deleteRec()
    void deleteKey(int key)
    {
        root = deleteRec(root, key);
    }

    /* A recursive function to delete a key from the BST */
    Node deleteRec(Node root, int key)
    {
        /* Base case: the tree is empty */
        if (root == null) return root;

        /* Otherwise, recur down the tree */
        if (key < root.key)
            root.left = deleteRec(root.left, key);
        else if (key > root.key)
            root.right = deleteRec(root.right, key);

        // If key is same as root's key, then this is
        // the node to be deleted
        else
        {
            // Node with only one child or no child
            if (root.left == null)
                return root.right;
            else if (root.right == null)
                return root.left;

            // Node with two children: get the inorder successor
            // (smallest in the right subtree)
            root.key = minValue(root.right);

            // Delete the inorder successor
            root.right = deleteRec(root.right, root.key);
        }
        return root;
    }

    int minValue(Node root)
    {
        int minv = root.key;
        while (root.left != null)
        {
            minv = root.left.key;
            root = root.left;
        }
        return minv;
    }

    // This method mainly calls insertRec()
    void insert(int key)
    {
        root = insertRec(root, key);
    }

    /* A recursive function to insert a new key in the BST */
    Node insertRec(Node root, int key)
    {
        /* If the tree is empty, return a new node */
        if (root == null)
        {
            root = new Node(key);
            return root;
        }

        /* Otherwise, recur down the tree */
        if (key < root.key)
            root.left = insertRec(root.left, key);
        else if (key > root.key)
            root.right = insertRec(root.right, key);

        /* Return the (unchanged) node pointer */
        return root;
    }

    // This method mainly calls inorderRec()
    void inorder()
    {
        inorderRec(root);
    }

    // A utility function to do inorder traversal of the BST
    void inorderRec(Node root)
    {
        if (root != null)
        {
            inorderRec(root.left);
            System.out.print(root.key + " ");
            inorderRec(root.right);
        }
    }

    // Driver program to test above functions
    public static void main(String[] args)
    {
        BinarySearchTree tree = new BinarySearchTree();

        /* Let us create following BST
                50
             /     \
            30      70
           /  \    /  \
          20   40 60   80 */
        tree.insert(50);
        tree.insert(30);
        tree.insert(20);
        tree.insert(40);
        tree.insert(70);
        tree.insert(60);
        tree.insert(80);

        System.out.println("Inorder traversal of the given tree");
        tree.inorder();

        System.out.println("\nDelete 20");
        tree.deleteKey(20);
        System.out.println("Inorder traversal of the modified tree");
        tree.inorder();

        System.out.println("\nDelete 30");
        tree.deleteKey(30);
        System.out.println("Inorder traversal of the modified tree");
        tree.inorder();

        System.out.println("\nDelete 50");
        tree.deleteKey(50);
        System.out.println("Inorder traversal of the modified tree");
        tree.inorder();
    }
}
Output:
Inorder traversal of the given tree
20 30 40 50 60 70 80
Delete 20
Inorder traversal of the modified tree
30 40 50 60 70 80
Delete 30
Inorder traversal of the modified tree
40 50 60 70 80
Delete 50
Inorder traversal of the modified tree
40 60 70 80
Algorithms
Analysis of Algorithms | Set 1 (Asymptotic Analysis)
Why performance analysis?
There are many important things that should be taken care of, like user friendliness, modularity, security, maintainability, etc. Why worry about performance?
The answer is simple: we can have all the above things only if we have performance. So performance is like a currency with which we can buy all the above things. Another reason for studying performance is that speed is fun!
To summarize, performance == scale. Imagine a text editor that can load 1000 pages but can spell check only one page per minute, or an image editor that takes an hour to rotate your image 90 degrees left. If a software feature cannot cope with the scale of tasks users need to perform, it is as good as dead.
Given two algorithms for a task, how do we find out which one is better?
One naive way is to implement both algorithms, run the two programs on your computer for different inputs, and see which one takes less time. There are many problems with this approach to the analysis of algorithms:
1) It might be possible that for some inputs the first algorithm performs better than the second, while for other inputs the second performs better.
2) It might also be possible that for some inputs the first algorithm performs better on one machine, while the second works better on another machine for some other inputs.
Asymptotic Analysis is the big idea that handles the above issues in analyzing algorithms. In Asymptotic Analysis, we evaluate the performance of an algorithm in terms of input size (we don't measure the actual running time). We calculate how the time (or space) taken by an algorithm increases with the input size.
For example, let us consider the search problem (searching for a given item) in a sorted array. One way to search is Linear Search (order of growth is linear) and the other is Binary Search (order of growth is logarithmic). To understand how Asymptotic Analysis solves the problems mentioned above, suppose we run Linear Search on a fast computer and Binary Search on a slow computer. For small values of the input array size n, the fast computer may take less time. But after a certain input size, Binary Search will definitely start taking less time than Linear Search, even though it is being run on the slower machine. The reason is that the order of growth of Binary Search with respect to input size is logarithmic while that of Linear Search is linear. So the machine-dependent constants can always be ignored beyond a certain input size.
Output:
30 is present at index 2
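The linear search code that produced the output above is not present in this copy; a minimal reconstruction (the method name search and the array contents are assumptions, chosen to be consistent with the surrounding text and output) is:

```java
// Linear search: scan arr left to right, comparing each element with x;
// return its index, or -1 if x is not present.
class LinearSearch {
    static int search(int arr[], int n, int x) {
        for (int i = 0; i < n; i++)
            if (arr[i] == x)
                return i;
        return -1;
    }

    public static void main(String[] args) {
        int arr[] = { 10, 20, 30, 40, 50 };
        int x = 30;
        int result = search(arr, arr.length, x);
        System.out.println(result == -1 ? "Element not present"
                                        : x + " is present at index " + result);
    }
}
```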
Worst Case Analysis (Usually Done)
In worst case analysis, we calculate an upper bound on the running time of an algorithm. We must know the case that causes the maximum number of operations to be executed. For linear search, the worst case happens when the element to be searched (x in the above code) is not present in the array. When x is not present, the search() function compares it with all the elements of arr[] one by one. Therefore, the worst case time complexity of linear search would be Θ(n).
Average Case Analysis (Sometimes Done)
In average case analysis, we take all possible inputs, calculate the computing time for each of them, sum all the calculated values, and divide the sum by the total number of inputs. We must know (or predict) the distribution of cases. For the linear search problem, let us assume that all cases are uniformly distributed (including the case of x not being present in the array). So we sum all the cases and divide the sum by (n+1). The average case time complexity is then:

Average Case Time = (Σ θ(i) for i = 1 to n+1) / (n+1)
                  = θ((n+1)(n+2)/2) / (n+1)
                  = Θ(n)
Best Case Analysis (Bogus)
In best case analysis, we calculate a lower bound on the running time of an algorithm. We must know the case that causes the minimum number of operations to be executed. In the linear search problem, the best case occurs when x is present at the first location. The number of operations in the best case is constant (not dependent on n), so the time complexity in the best case would be Θ(1).
Most of the time, we do worst case analysis to analyze algorithms. In worst case analysis, we guarantee an upper bound on the running time of an algorithm, which is useful information.
Average case analysis is not easy to do in most practical cases and is rarely done, since we must know (or predict) the mathematical distribution of all possible inputs.
Analysis of Algorithms | Set 3 (Asymptotic Notations)
We have discussed Asymptotic Analysis, and the Worst, Average and Best Cases of algorithms. The main idea of asymptotic analysis is to have a measure of the efficiency of algorithms that doesn't depend on machine-specific constants and doesn't require algorithms to be implemented and the time taken by programs to be compared. Asymptotic notations are mathematical tools to represent the time complexity of algorithms for asymptotic analysis. The following three asymptotic notations are mostly used to represent the time complexity of algorithms:
1) Θ (Theta) notation: bounds a function from above and below, so it defines exact asymptotic behavior.
2) O (Big O) notation: gives an asymptotic upper bound; it bounds a function only from above.
3) Ω (Omega) notation: gives an asymptotic lower bound.
Analysis of Loops
1) O(1): The time complexity of a set of statements is O(1) if it contains no loop, or only loops that run a constant number of times:

// Here c is a constant
for (int i = 1; i <= c; i++) {
    // some O(1) expressions
}
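The numbering below jumps to case 3, so the single-loop case appears to be missing from this copy; presumably it stated that a loop whose variable is incremented or decremented by a constant amount has O(n) time complexity, as in this sketch (the counter is only there to make the iteration count observable):

```java
// O(n): the loop variable is incremented by a constant (here 1),
// so the body executes n times.
class LinearLoop {
    public static void main(String[] args) {
        int n = 5, count = 0;
        for (int i = 1; i <= n; i++) {
            count++; // some O(1) expressions
        }
        System.out.println(count);  // n = 5 iterations
    }
}
```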
3) O(n^c): The time complexity of nested loops is equal to the number of times the innermost statement is executed. For example, the following sample loops have O(n²) time complexity.
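The sample loops referred to above are missing from this copy; a doubly nested loop of the following shape executes its innermost statement n * n times (a sketch; the counter is only there to make the count observable):

```java
// O(n^2): for each of the n values of i, the inner loop runs n times,
// so the innermost statement executes n * n times in total.
class NestedLoops {
    public static void main(String[] args) {
        int n = 4, count = 0;
        for (int i = 0; i < n; i++) {
            for (int j = 0; j < n; j++) {
                count++; // some O(1) expressions
            }
        }
        System.out.println(count);  // n * n = 16
    }
}
```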
How do we calculate time complexity when there are many if/else statements inside loops?
As discussed earlier, worst case time complexity is the most useful among best, average and worst, so we consider the worst case: we evaluate the situation in which the values in the if-else conditions cause the maximum number of statements to be executed.
For example, consider the linear search function, where we consider the case when the element is present at the end or not present at all.
Analysis of Algorithms | Set 4 (Solving Recurrences)
We have discussed the analysis of loops above. Many algorithms are recursive in nature. When we analyze them, we get a recurrence relation for the time complexity: the running time on an input of size n expressed as a function of n and the running time on inputs of smaller sizes. For example, in Merge Sort, to sort a given array we divide it into two halves, recursively sort the two halves, and finally merge the results. The time complexity of Merge Sort can be written as T(n) = 2T(n/2) + cn. There are many other such algorithms, like Binary Search and Tower of Hanoi.
There are mainly three ways of solving recurrences.
1) Substitution Method: We make a guess for the solution and then use mathematical induction to prove the guess correct or incorrect.
For example, consider the recurrence T(n) = 2T(n/2) + n. We guess the solution as T(n) = O(n log n), so we need to prove that T(n) ≤ cn log n for some constant c. We can assume that this holds for values smaller than n (logs are base 2):

T(n) = 2T(n/2) + n
     ≤ 2 (c (n/2) log(n/2)) + n
     = cn log(n/2) + n
     = cn log n − cn log 2 + n
     = cn log n − cn + n
     ≤ cn log n          (for c ≥ 1)
2) Recurrence Tree Method: In this method, we draw a recurrence tree and calculate the time taken by every level of the tree; finally, we sum the work done at all levels. For example, consider the recurrence T(n) = T(n/4) + T(n/2) + cn². Expanding it one level gives:

                cn²
              /     \
        T(n/4)       T(n/2)

Expanding the subproblems one more level:

                cn²
              /      \
       c(n²)/16      c(n²)/4
        /    \        /     \
   T(n/16) T(n/8)  T(n/8)  T(n/4)

Breaking it down further gives us the following:

                cn²
              /      \
       c(n²)/16      c(n²)/4
        /    \        /     \
c(n²)/256 c(n²)/64 c(n²)/64 c(n²)/16
   / \      / \      / \      / \

To find T(n), we sum the nodes level by level: cn² + (5/16)cn² + (25/256)cn² + ... This is a geometric series with ratio 5/16, so its sum is bounded by a constant times cn², giving T(n) = O(n²).
3) Master Method:
The Master Method is a direct way to get the solution. It works only for the following type of recurrence, or for recurrences that can be transformed into this type:
T(n) = aT(n/b) + f(n), where a ≥ 1 and b > 1
There are three cases:
1. If f(n) = Θ(n^c) where c < log_b(a), then T(n) = Θ(n^(log_b a)).
2. If f(n) = Θ(n^c) where c = log_b(a), then T(n) = Θ(n^c log n).
3. If f(n) = Θ(n^c) where c > log_b(a), then T(n) = Θ(f(n)).
How does this work?
The Master Method is mainly derived from the recurrence tree method. If we draw the recurrence tree of T(n) = aT(n/b) + f(n), we can see that the work done at the root is f(n) and the work done at all leaves is Θ(n^(log_b a)). The height of the recurrence tree is log_b(n).
In the recurrence tree method, we calculate the total work done. If the work done at the leaves is polynomially more, then the leaves are the dominant part, and our result becomes the work done at the leaves (Case 1). If the work done at the leaves and the root is asymptotically the same, then our result becomes the height multiplied by the work done at any level (Case 2). If the work done at the root is asymptotically more, then our result becomes the work done at the root (Case 3).
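As a worked check, the Merge Sort recurrence quoted earlier fits Case 2:

    T(n) = 2T(n/2) + cn    gives a = 2, b = 2, f(n) = Θ(n^1), so c = 1
    log_b(a) = log_2(2) = 1 = c, so Case 2 applies
    T(n) = Θ(n^1 log n) = Θ(n log n)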
What does ‘Space Complexity’ mean?
Space Complexity:
The term Space Complexity is misused for Auxiliary Space at many
places. Following are the correct definitions of Auxiliary Space and Space
Complexity.
Auxiliary Space is the extra space or temporary space used by an algorithm.
Space Complexity of an algorithm is total space taken by the algorithm with
respect to the input size. Space complexity includes both Auxiliary space and
space used by input.
Searching Algorithms
Linear Search
Binary Search
// Java implementation of recursive Binary Search
class BinarySearch {

    // Returns index of x if it is present in arr[l..r],
    // else returns -1
    int binarySearch(int arr[], int l, int r, int x)
    {
        if (r >= l) {
            int mid = l + (r - l) / 2;

            // If the element is present at the middle itself
            if (arr[mid] == x)
                return mid;

            // If the element is smaller than mid, then
            // it can only be present in the left subarray
            if (arr[mid] > x)
                return binarySearch(arr, l, mid - 1, x);

            // Else the element can only be present
            // in the right subarray
            return binarySearch(arr, mid + 1, r, x);
        }

        // We reach here when the element is not present in the array
        return -1;
    }

    // Driver method to test above
    public static void main(String args[])
    {
        BinarySearch ob = new BinarySearch();
        int arr[] = { 2, 3, 4, 10, 40 };
        int n = arr.length;
        int x = 10;
        int result = ob.binarySearch(arr, 0, n - 1, x);
        if (result == -1)
            System.out.println("Element not present");
        else
            System.out.println("Element found at index " + result);
    }
}
Output:
Element found at index 3
Time Complexity:
The time complexity of Binary Search can be written as
T(n) = T(n/2) + c
This recurrence solves to Θ(log n) (Master Method, Case 2, with a = 1, b = 2 and f(n) = Θ(1)).
Sorting Algorithms
Bubble Sort
Bubble Sort is the simplest sorting algorithm: it works by repeatedly swapping adjacent elements if they are in the wrong order.
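No bubble sort implementation survives in this copy; below is a minimal standard sketch in Java (the class name and the input array are assumptions, with the array chosen to match the output that follows):

```java
// Bubble sort: repeatedly step through the array, swapping adjacent
// elements that are out of order; after pass i, the last i elements
// are in their final sorted positions.
class BubbleSort {
    static void bubbleSort(int arr[]) {
        int n = arr.length;
        for (int i = 0; i < n - 1; i++)
            for (int j = 0; j < n - i - 1; j++)
                if (arr[j] > arr[j + 1]) {
                    // Swap the out-of-order neighbors
                    int temp = arr[j];
                    arr[j] = arr[j + 1];
                    arr[j + 1] = temp;
                }
    }

    public static void main(String[] args) {
        int arr[] = { 64, 34, 25, 12, 22, 11, 90 };
        bubbleSort(arr);
        System.out.println("Sorted array:");
        StringBuilder sb = new StringBuilder();
        for (int v : arr)
            sb.append(v).append(" ");
        System.out.println(sb.toString().trim());
    }
}
```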
Output:
Sorted array:
11 12 22 25 34 64 90