Introduction To Algorithms
Complexity Analysis
Algorithms are an essential aspect of data structures: data structures are implemented using
algorithms. An algorithm is a procedure that you can write as a C function or program, or in any
other language. An algorithm states explicitly how the data will be manipulated.
Algorithm Efficiency
Some algorithms are more efficient than others. We would prefer to choose an efficient algorithm,
so it would be nice to have metrics for comparing algorithm efficiency.
The complexity of an algorithm is a function describing the efficiency of the algorithm in terms
of the amount of data the algorithm must process. Usually there are natural units for the domain
and range of this function. There are two main complexity measures of the efficiency of an
algorithm:
Time complexity is a function describing the amount of time an algorithm takes in terms
of the amount of input to the algorithm. "Time" can mean the number of memory
accesses performed, the number of comparisons between integers, the number of times
some inner loop is executed, or some other natural unit related to the amount of real time
the algorithm will take. We try to keep this idea of time separate from "wall clock" time,
since many factors unrelated to the algorithm itself can affect the real time (like the
language used, type of computing hardware, proficiency of the programmer, optimization
in the compiler, etc.). It turns out that, if we choose the units wisely, all of the other stuff
doesn't matter and we can get an independent measure of the efficiency of the algorithm.
Ο Notation
Ω Notation
θ Notation
Big Oh Notation, Ο
The notation Ο(n) is the formal way to express the upper bound of an algorithm's running time.
It measures the worst case time complexity or the longest amount of time an algorithm can
possibly take to complete.
Ο(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≤ c·f(n) for all n > n0 }
Omega Notation, Ω
The notation Ω(n) is the formal way to express the lower bound of an algorithm's running time.
It measures the best case time complexity, or the minimum amount of time an algorithm can
possibly take to complete.
For example, for a function f(n):
Ω(f(n)) = { g(n) : there exist c > 0 and n0 such that g(n) ≥ c·f(n) for all n > n0 }
Theta Notation, θ
The notation θ(n) is the formal way to express both the lower bound and the upper bound of an
algorithm's running time. It is represented as follows −
θ(f(n)) = { g(n) : g(n) = Ο(f(n)) and g(n) = Ω(f(n)) }
Some common complexity classes, from slowest-growing to fastest-growing:
constant − Ο(1)
logarithmic − Ο(log n)
linear − Ο(n)
quadratic − Ο(n^2)
cubic − Ο(n^3)
polynomial − n^Ο(1)
exponential − 2^Ο(n)
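These definitions can be checked on a concrete function. For f(n) = 3n^2 + 5n, explicit constants witness both bounds:

```latex
% Upper bound: for all n \ge 1, 5n \le 5n^2, hence
3n^2 + 5n \le 8n^2 \quad (c = 8,\ n_0 = 1) \;\Rightarrow\; 3n^2 + 5n \in O(n^2)
% Lower bound: for all n \ge 1,
3n^2 + 5n \ge 3n^2 \quad (c = 3,\ n_0 = 1) \;\Rightarrow\; 3n^2 + 5n \in \Omega(n^2)
% Both bounds together give 3n^2 + 5n \in \Theta(n^2).
```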
Recursion
The process in which a function calls itself directly or indirectly is called recursion, and the
corresponding function is called a recursive function. Certain problems can be solved quite
easily using a recursive algorithm. Examples of such problems are Towers of Hanoi (TOH),
Inorder/Preorder/Postorder Tree Traversals, DFS of a Graph, etc.
In a recursive program, the solution to the base case is provided, and the solution of a bigger
problem is expressed in terms of smaller problems.
int fact(int n)
{
    if (n <= 1) // base case
        return 1;
    else
        return n * fact(n - 1);
}
In the above example, the base case n <= 1 is defined, and a larger value can be solved by
reducing it to smaller values until the base case is reached.
A function fun is called direct recursive if it calls the same function fun. A function fun is called
indirect recursive if it calls another function, say fun_new, and fun_new calls fun directly or
indirectly. The difference between direct and indirect recursion has been illustrated in Table 1.
void directRecFun()
{
// Some code...
directRecFun();
// Some code...
}
void indirectRecFun1()
{
// Some code...
indirectRecFun2();
// Some code...
}
void indirectRecFun2()
{
// Some code...
indirectRecFun1();
// Some code...
}
What is the time complexity of the following code?
int a = 0, b = 0;
for (i = 0; i < N; i++) {
    a = a + rand();
}
for (j = 0; j < M; j++) {
    b = b + rand();
}
Answer: O(N + M)
Explanation: The first loop runs N times and the second runs M times; the loops run one
after the other, so the costs add.
What is the time complexity of the following code?
int a = 0;
for (i = 0; i < N; i++) {
    for (j = N; j > i; j--) {
        a = a + i + j;
    }
}
Answer: O(N*N)
Explanation: The inner statement runs a total of
N + (N - 1) + (N - 2) + ... + 1 + 0
= N * (N + 1) / 2
= 1/2 * N^2 + 1/2 * N
times, i.e. O(N^2) times.
What is the time complexity of the following code?
int i, j, k = 0;
for (i = n / 2; i <= n; i++) {
    for (j = 2; j <= n; j = j * 2) {
        k = k + n / 2;
    }
}
Answer: O(n log n)
Explanation: Notice that j keeps doubling until it exceeds n. The number of times we can
double a number until it reaches n is log(n).
Let's take examples:
for n = 16, j = 2, 4, 8, 16
for n = 32, j = 2, 4, 8, 16, 32
So, j runs for O(log n) steps, and i runs for n/2 steps.
Total steps = O(n/2 * log(n)) = O(n log n)
What is the time complexity of the following code?
int a = 0, i = N;
while (i > 0) {
    a += i;
    i /= 2;
}
Answer: O(log N)
Explanation: i is halved on every iteration, so we have to find the smallest x such that
N / 2^x < 1, which gives x = log(N) iterations.
Sorting Algorithms
A sorting algorithm is used to rearrange the elements of a given array or list according to a
comparison operator on the elements. The comparison operator is used to decide the
new order of the elements in the respective data structure.
HeapSort
Heap sort is a comparison-based sorting technique based on the Binary Heap data
structure. It is similar to selection sort: we first find the maximum element and place
it at the end, then repeat the same process for the remaining elements.
A Binary Heap is a Complete Binary Tree where items are stored in a special order such that the
value in a parent node is greater (or smaller) than the values in its two children nodes. The
former is called a max heap and the latter a min heap. The heap can be represented by a
binary tree or an array.
Searching Algorithms
Searching algorithms fall into two broad types:
1. Sequential Search: the list or array is traversed sequentially and every element is checked.
For example: Linear Search.
2. Interval Search: these algorithms are specifically designed for searching in sorted data
structures. They are much more efficient than Linear Search as they repeatedly target the
center of the search structure and divide the search space in half. For example: Binary
Search.
1. Greedy Algorithms
An algorithm is designed to achieve an optimum solution for a given problem. In the greedy
approach, decisions are made from the given solution domain: being greedy, the choice that
seems to provide an optimum benefit at each step is chosen.
Greedy algorithms try to find a localized optimum solution, which may eventually lead to a
globally optimized solution. In general, however, greedy algorithms do not guarantee globally
optimized solutions.
Examples
Most networking algorithms use the greedy approach. A classic example is Kruskal's Minimum
Spanning Tree (MST) algorithm. The greedy choice is to pick the smallest-weight edge that
does not cause a cycle in the MST constructed so far. Let us understand it with an example:
Consider the below input graph.
The graph contains 9 vertices and 14 edges, so the minimum spanning tree formed will
have (9 - 1) = 8 edges.
After sorting:
Weight  Src  Dest
1       7    6
2       8    2
2       6    5
4       0    1
4       2    5
6       8    6
7       2    3
7       7    8
8       0    7
8       1    2
9       3    4
10      5    4
11      1    7
14      3    5
Now pick all edges one by one from the sorted list of edges:
1. Pick edge 7-6: No cycle is formed, include it.
2. Pick edge 8-2: No cycle is formed, include it.
3. Pick edge 6-5: No cycle is formed, include it.
4. Pick edge 0-1: No cycle is formed, include it.
5. Pick edge 2-5: No cycle is formed, include it.
6. Pick edge 8-6: Since including this edge results in cycle, discard it.
7. Pick edge 2-3: No cycle is formed, include it.
8. Pick edge 7-8: Since including this edge results in cycle, discard it.
9. Pick edge 0-7: No cycle is formed, include it.
10. Pick edge 1-2: Since including this edge results in cycle, discard it.
11. Pick edge 3-4: No cycle is formed, include it.
Since the number of edges included equals (V – 1), the algorithm stops here.
2. Divide And Conquer Approach (Merge Sort)
In the divide and conquer approach, the problem at hand is divided into smaller sub-problems
and each sub-problem is solved independently. When we keep on dividing the sub-problems into
even smaller sub-problems, we eventually reach a stage where no more division is possible.
These "atomic", smallest-possible sub-problems are solved, and the solutions of all sub-problems
are finally merged to obtain the solution of the original problem.
Merge Sort is a Divide and Conquer algorithm. It divides the input array into two halves, calls
itself for the two halves, and then merges the two sorted halves. The merge() function is used for
merging two halves: merge(arr, l, m, r) is the key process that assumes that arr[l..m]
and arr[m+1..r] are sorted and merges the two sorted sub-arrays into one. See the following C
implementation for details.
MergeSort(arr[], l, r)
If r > l
    1. Find the middle point to divide the array into two halves:
           middle m = (l + r) / 2
    2. Call mergeSort(arr, l, m)
    3. Call mergeSort(arr, m + 1, r)
    4. Call merge(arr, l, m, r)
The following diagram from Wikipedia shows the complete merge sort process for an example
array {38, 27, 43, 3, 9, 82, 10}. If we take a closer look at the diagram, we can see that the array
is recursively divided into two halves till the size becomes 1. Once the size becomes 1, the merge
process comes into action and starts merging arrays back till the complete array is merged.
3. Dynamic Programming (Shortest Path Algorithm)
Bellman-Ford Algorithm
Given a graph and a source vertex src in the graph, find the shortest paths from src to all vertices
in the given graph. The graph may contain negative weight edges. Dijkstra's algorithm doesn't
work for graphs with negative weight edges, but Bellman-Ford does. Bellman-Ford is also
simpler than Dijkstra and suits distributed systems well. However, the time complexity of
Bellman-Ford is O(VE), which is more than Dijkstra's.
Algorithm
Following are the detailed steps.
Input: Graph and a source vertex src
Output: Shortest distance to all vertices from src. If there is a negative weight cycle, then
shortest distances are not calculated, negative weight cycle is reported.
1) Initialize distances from the source to all vertices as infinite and the distance to the source
itself as 0. Create an array dist[] of size |V| with all values as infinite except dist[src], where
src is the source vertex.
2) Calculate shortest distances. Do the following |V|-1 times, where |V| is the number of
vertices in the given graph:
    a) For each edge u-v:
        If dist[v] > dist[u] + weight of edge uv, then update dist[v]:
            dist[v] = dist[u] + weight of edge uv
3) Report if there is a negative weight cycle in the graph. For each edge u-v:
    If dist[v] > dist[u] + weight of edge uv, then "Graph contains negative weight cycle"
The idea of step 3 is that step 2 guarantees the shortest distances if the graph doesn't contain a
negative weight cycle. If we iterate through all edges one more time and get a shorter path for
any vertex, then there is a negative weight cycle.
Example:
Let the given source vertex be 0. Initialize all distances as infinite, except the distance to
the source itself. The total number of vertices in the graph is 5, so all edges must be
processed 4 times.
Let all edges be processed in the following order: (B,E), (D,B), (B,D), (A,B), (A,C), (D,C),
(B,C), (E,D). We get the following distances when all edges are processed the first time. The
first row shows the initial distances. The second row shows the distances when edges (B,E),
(D,B), (B,D) and (A,B) are processed. The third row shows the distances when (A,C) is
processed. The fourth row shows the distances when (D,C), (B,C) and (E,D) are processed.
The first iteration guarantees all shortest paths which are at most 1 edge long. We get the
following distances when all edges are processed a second time (the last row shows the
final values).
The second iteration guarantees all shortest paths which are at most 2 edges long. The
algorithm processes all edges 2 more times, but the distances are already minimized after
the second iteration, so the third and fourth iterations don't update the distances.
4. Backtracking Algorithms
Backtracking is a technique for finding the solution to a problem where the solution depends on
the previous steps taken. For example, in a maze problem, the solution depends on all the steps
you take one by one. If any of those steps is wrong, it will not lead us to the solution. In a maze,
we first choose a path and continue moving along it. But once we understand that the particular
path is incorrect, we just come back and change it. This is what backtracking basically is.
In backtracking, we first take a step and then check whether this step is correct, i.e., whether it
will lead to a correct answer or not. If it doesn't, we come back and change our first step. In
general, this is accomplished by recursion. Thus, in backtracking, we first start with a partial
sub-solution of the problem (which may or may not lead us to the solution) and then check if we
can proceed further with this sub-solution. If not, we come back, change the sub-solution, and
continue again.
One of the most common examples of backtracking is arranging N queens on an NxN
chessboard such that no queen can attack any other queen. A queen can attack horizontally,
vertically, or diagonally. The solution to this problem is attempted in a similar way: we first
place the first queen anywhere arbitrarily and then place the next queen in any of the safe places.
We continue this process until the number of unplaced queens becomes zero (a solution is found)
or no safe place is left. If no safe place is left, we change the position of the previously
placed queen.
The above picture shows an NxN chessboard and we have to place N queens on it. So, we will
start by placing the first queen.
Now, the second step is to place the second queen in a safe position and then the third queen.
Now, you can see that there is no safe place where we can put the last queen. So, we will just
change the position of the previous queen. And this is backtracking.
Also, there is no other position where we can place the third queen so we will go back one more
step and change the position of the second queen.
We then place the third queen again in a safe position, and so on, until we find a solution.
We will continue this process and finally we will get the solution as shown below.
Now that you have understood backtracking, let us code the above problem of placing N
queens on an NxN chessboard using the backtracking method.
Topological Sorting (UNIT 4 graphs)
Topological sorting for Directed Acyclic Graph (DAG) is a linear ordering of vertices such that
for every directed edge uv, vertex u comes before v in the ordering. Topological Sorting for a
graph is not possible if the graph is not a DAG.
For example, a topological sorting of the following graph is “5 4 2 3 1 0”. There can be more
than one topological sorting for a graph. For example, another topological sorting of the
following graph is “4 5 2 3 1 0”. The first vertex in topological sorting is always a vertex with in-
degree as 0 (a vertex with no incoming edges).
In topological sorting, we use a temporary stack. We don't print a vertex immediately; we first
recursively call topological sorting for all its adjacent vertices, then push it onto the stack.
Finally, we print the contents of the stack. Note that a vertex is pushed onto the stack only when
all of its adjacent vertices (and their adjacent vertices, and so on) are already in the stack.
Union-Find (Detecting a Cycle in an Undirected Graph)
For each edge, make subsets using both vertices of the edge. If both vertices are already in the
same subset, a cycle is found.
Initially, all slots of the parent array are initialized to -1 (meaning there is only one item in
every subset).
Vertex:  0  1  2
Parent: -1 -1 -1
Now process all edges one by one.
Edge 0-1: Find the subsets in which vertices 0 and 1 are. Since they are in different subsets, we
take the union of them. For taking the union, either make node 0 as parent of node 1 or vice-
versa.
Vertex:  0  1  2    <----- 1 is made parent of 0 (1 is now representative of subset {0, 1})
Parent:  1 -1 -1
Edge 1-2: 1 is in subset 1 and 2 is in subset 2. So, take union.
Vertex:  0  1  2    <----- 2 is made parent of 1 (2 is now representative of subset {0, 1, 2})
Parent:  1  2 -1
Edge 0-2: 0 is in subset 2 and 2 is also in subset 2. Hence, including this edge forms a cycle.
How is the subset of 0 the same as that of 2? Following the parent pointers:
0 -> 1 -> 2 // 1 is the parent of 0, and 2 is the parent of 1, so both have representative 2
https://youtu.be/mHz-mx-8lJ8