Analysis and Design of Algorithms PDF
LECTURE NOTES ON ANALYSIS AND DESIGN OF ALGORITHMS
Department of MCA
jntuworldupdates.org Specworld.in
Smartzworld.com Smartworld.asia
Text Book:
1. Introduction to Algorithms, 2/e, T. H. Cormen, C. E. Leiserson, R. L. Rivest and C. Stein, PHI Pvt. Ltd. / Pearson Education.
Reference Books:
1. Algorithm Design: Foundations, Analysis and Internet Examples, M. T. Goodrich and R. Tamassia, John Wiley and Sons.
Course outcomes:
1. To be able to analyze the correctness and running time of the basic algorithms for classic problems in various domains, and to apply these algorithms and design techniques to advanced data structures.
2. To be able to analyze the complexities of various problems in different domains, and to demonstrate how the algorithms are used in different problem domains.
3. To be able to design efficient algorithms using standard algorithm-design techniques, and to demonstrate a number of standard algorithms for problems in fundamental areas of computer science and engineering such as sorting, searching, and problems involving graphs.
Contents

Module I: Basic Techniques
Module II: Data Structures
Module III: Optimization Problems; String Matching; Graph Algorithms
Module IV: Spanning Trees; Max-Flow; NP-Completeness
Algorithm:-
Informally, an algorithm is any well-defined computational procedure that takes some value, or set of values, as input and produces some value, or set of values, as output.
Running time:-
The running time of an algorithm on a particular input is the number of primitive
operations or steps executed.
When we look at input sizes large enough to make only the order of growth of the running time relevant, we are studying the asymptotic efficiency of algorithms.
Asymptotic notation:-
They are used to describe the asymptotic running time of an algorithm. They are defined in terms of functions whose domains are the set of natural numbers N = {0, 1, 2, ...}. Such notations are convenient for describing the worst-case running-time function, which is defined only on integer input sizes. There are five notations:
• (Theta)θ-notation
• (big-oh)O-notation
• (big-omega)Ω-notation
• (small-oh)o-notation
• (small-omega)ω-notation
(Theta)θ-notation:-
This notation asymptotically bounds a function from above and below. For a given function g(n), we denote by θ(g(n)) the set of functions
θ(g(n)) = {f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1·g(n) ≤ f(n) ≤ c2·g(n) for all n ≥ n0}.
For all values of n to the right of n0, the value of f(n) lies at or above c1·g(n) and at or below c2·g(n). We say that g(n) is an asymptotically tight bound for f(n), where c1·g(n) is the lower bound and c2·g(n) is the upper bound. The definition of θ(g(n)) requires every member f(n) of θ(g(n)) to be asymptotically non-negative.
(Big-oh)O-notation:-
This notation is used when we have only an asymptotic upper bound. For a given function g(n), we denote by O(g(n)) the set of functions
O(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n0}.
(Big-omega)Ω-notation:-
It provides an asymptotic lower bound. For a given function g(n), we denote by Ω(g(n)) the set of functions
Ω(g(n)) = {f(n): there exist positive constants c and n0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n0}.
In these notes, O- and θ-notation are typically used when describing average- and worst-case running times, and Ω-notation when describing best-case running times.
(small-oh)o-notation:-
The asymptotic upper bound provided by O-notation may or may not be asymptotically tight.
Example: the bound 2n² = O(n²) is asymptotically tight, but the bound 2n = O(n²) is not.
We use small-oh notation to denote an upper bound that is not asymptotically tight. We define o(g(n)) as the set of functions
o(g(n)) = {f(n): for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n0}.
(small-omega)ω-notation:-
We use this notation to denote a lower bound that is not asymptotically tight. We define ω(g(n)) as the set
ω(g(n)) = {f(n): for any positive constant c > 0, there exists a constant n0 > 0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n0}.
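These definitions can be checked numerically for a concrete pair of functions. The sketch below tests the θ-definition for f(n) = 2n² + 3n against g(n) = n², using the witnesses c1 = 2, c2 = 3, n0 = 3, which were chosen by hand (the witnesses are assumptions the code verifies, not values it derives):

```python
# Checking the theta-notation inequality 0 <= c1*g(n) <= f(n) <= c2*g(n)
# for f(n) = 2n^2 + 3n and g(n) = n^2 over a finite range of n.
def f(n):
    return 2 * n * n + 3 * n

def g(n):
    return n * n

c1, c2, n0 = 2, 3, 3          # hand-picked witnesses
for n in range(n0, 1000):
    assert 0 <= c1 * g(n) <= f(n) <= c2 * g(n)
print("f(n) = Theta(n^2) holds on the tested range")
```

A finite check like this is only a sanity test, not a proof; the algebraic argument (3n² ≥ 2n² + 3n exactly when n ≥ 3) is what establishes the bound for all n ≥ n0.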
Order of growth / rate of growth:-
This refers to the change in the running time of an algorithm as the input size increases. For example, the common growth rates arranged in increasing order are: 1, lg n, n, n·lg n, n², n³, 2ⁿ, n!.
Amortized analysis:-
Here the time required to perform a sequence of data-structure operations is averaged over all the operations performed; probability is not involved. The three common techniques for amortized analysis are:
• Aggregate analysis
• Accounting method
• Potential method
General analysis:-
Here an upper bound T(n) on the total cost of a sequence of n operations is determined. The average cost per operation is then T(n)/n.
Two stack operations are PUSH(S, x), which pushes x onto stack S, and POP(S), which pops the topmost element of the stack. A third operation, MULTIPOP(S, k), pops the top k items from the stack (or the whole stack if it holds fewer than k items):

MULTIPOP(S, k)
  while not STACK-EMPTY(S) and k ≠ 0
    do POP(S)
       k ← k − 1
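The three stack operations can be sketched in Python as follows (the list-backed `Stack` class and its method names are illustrative, not part of the original pseudocode):

```python
# A list-backed stack supporting the three operations analysed above.
class Stack:
    def __init__(self):
        self.items = []

    def push(self, x):            # PUSH(S, x): actual cost 1
        self.items.append(x)

    def pop(self):                # POP(S): actual cost 1
        return self.items.pop()

    def multipop(self, k):        # MULTIPOP(S, k): pops min(k, |S|) items
        popped = []
        while self.items and k != 0:
            popped.append(self.pop())
            k -= 1
        return popped

s = Stack()
for x in range(5):
    s.push(x)
print(s.multipop(3))   # -> [4, 3, 2]
print(s.items)         # -> [0, 1]
```

Note that MULTIPOP's actual cost is min(k, s), where s is the stack size, which is what makes the amortized argument below interesting.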
Aggregate analysis:-
Using this we can get a better upper bound that considers the entire sequence of n operations. Although a single MULTIPOP operation can be expensive, any sequence of n PUSH, POP, and MULTIPOP operations on an initially empty stack costs at most O(n), because each object can be popped at most once for each time it is pushed.
For any value of n, any sequence of n PUSH, POP, and MULTIPOP operations takes a total of O(n) time, so the average cost of an operation is O(n)/n = O(1). This equals the amortized cost of each operation.
Accounting method:-
In this method we assign different charges to different operations, with some operations charged more or less than they actually cost. The amount we charge an operation is called its amortized cost. When an operation's amortized cost exceeds its actual cost, the difference is assigned to specific objects in the data structure as credit.
Example (s denotes the number of items on the stack):

Operation   Actual cost   Amortized cost
PUSH        1             2
POP         1             0
MULTIPOP    min(k, s)     0
This credit can be used later to help pay for operations whose amortized cost is less than their actual cost. If we denote the actual cost of the ith operation by ci and the amortized cost of the ith operation by ĉi, then we require

Σ ĉi ≥ Σ ci   (summing over i = 1, ..., n)

for all sequences of n operations.
The total credit stored in the data structure is the difference between the total
amortized cost and the total actual cost.
As the total amortized cost is an upper bound on the total actual cost, the total credit associated with the data structure must be non-negative at all times.
Potential method:-
Instead of representing prepaid work as credit stored with specific objects in the data structure, this method represents the prepaid work as "potential energy", or just "potential", that can be released to pay for future operations. The potential is associated with the data structure as a whole rather than with specific objects within it.
We start with an initial data structure D0 on which n operations are performed. For each i = 1, 2, ..., n, let ci be the actual cost of the ith operation and Di be the data structure that results after applying the ith operation to Di−1. A potential function Φ maps each data structure Di to a real number Φ(Di). The amortized cost ĉi of the ith operation is

ĉi = ci + Φ(Di) − Φ(Di−1),

and the total amortized cost of the n operations telescopes:

Σ ĉi = Σ (ci + Φ(Di) − Φ(Di−1)) = Σ ci + Φ(Dn) − Φ(D0)   (sums over i = 1, ..., n).
Recurrences:-
When an algorithm contains a recursive call to itself, its running time can be described by a recurrence. A recurrence is an equation or inequality that describes a function in terms of its value on smaller inputs.
There are three methods to solve a recurrence: the substitution method, the recursion-tree method, and the master method.
Master theorem:-
It provides a "cookbook" method for solving recurrences of the form

T(n) = a·T(n/b) + f(n),

where a ≥ 1 and b > 1 are constants and f(n) is an asymptotically positive function. This equation describes the running time of an algorithm that divides a problem of size n into a sub problems, each of size n/b. The cost of dividing the problem and combining the results of the sub problems is given by the function f(n). Then T(n) can be bounded asymptotically as follows:
Case-1
If f(n) = O(n^(log_b a − ε)) for some constant ε > 0, then T(n) = θ(n^(log_b a)).
Case-2
If f(n) = θ(n^(log_b a)), then T(n) = θ(n^(log_b a) · lg n).
Case-3
If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a·f(n/b) ≤ c·f(n) for some constant c < 1 and all sufficiently large n, then T(n) = θ(f(n)).
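As a sketch, the case analysis can be mechanized for the common special form f(n) = n^k (this restricted form, and the function name `master`, are assumptions for illustration; the theorem itself covers more general f):

```python
import math

# Classify T(n) = a*T(n/b) + n**k into the three master-theorem cases.
# For f(n) = n^k the regularity condition of case 3 holds automatically.
def master(a, b, k):
    log_ba = math.log(a, b)
    if k < log_ba:
        return f"Theta(n^{log_ba:g})"        # case 1: leaves dominate
    if k == log_ba:
        return f"Theta(n^{log_ba:g} lg n)"   # case 2: balanced levels
    return f"Theta(n^{k})"                   # case 3: root dominates

print(master(2, 2, 1))   # merge sort, T(n) = 2T(n/2) + n
print(master(4, 2, 1))   # T(n) = 4T(n/2) + n
print(master(1, 2, 1))   # T(n) = T(n/2) + n
```

For merge sort this reports Theta(n^1 lg n), i.e. the familiar θ(n lg n). Comparing k against log_b a with floats is adequate for a teaching sketch but would need care (e.g. rational arithmetic) in a robust tool.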
Substitution method:-
It consists of two steps: first, guess the form of the solution; second, use mathematical induction to find the constants and show that the solution works.
Example:
T(1) = 3
T(n) = 2T(n/2) + 5

1. Guess: T(n) = O(n), i.e. 0 ≤ T(n) ≤ cn for some positive constant c.
2. Inductive hypothesis: assume 0 ≤ T(k/2) ≤ c(k/2).
3. Substitution: show T(k) = 2T(k/2) + 5 ≤ ck.
   T(k) = 2T(k/2) + 5
        ≤ 2[c(k/2)] + 5     (by the inductive hypothesis)
        = ck + 5.
   To conclude T(k) ≤ ck we would need 5 ≤ 0, which is false, so the induction fails. The guess is not wrong, but it must be strengthened: guess instead that T(n) ≤ cn − 5. Then T(k) = 2T(k/2) + 5 ≤ 2[c(k/2) − 5] + 5 = ck − 5, and the induction goes through (the base case T(1) = 3 ≤ c − 5 holds for c ≥ 8).
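The strengthened guess can be sanity-checked numerically. In the sketch below, the floor-division rounding of n/2 and the constant c = 8 are assumptions made for the check:

```python
from functools import lru_cache

# T(1) = 3, T(n) = 2*T(floor(n/2)) + 5; verify T(n) <= 8n - 5 numerically.
@lru_cache(maxsize=None)
def T(n):
    return 3 if n == 1 else 2 * T(n // 2) + 5

c = 8
assert all(T(n) <= c * n - 5 for n in range(1, 10_000))
print("T(n) <= 8n - 5 holds on the tested range")
```

As with any finite check, this only corroborates the induction; the algebra above is the actual proof.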
Divide-and-conquer algorithms involve three steps at each level of the recursion:
• Divide the problem into a number of sub problems.
• Conquer the sub problems by solving them recursively; if the sub problem sizes are small enough, solve them directly.
• Combine the solutions to the sub problems into the solution for the original problem.
The running time satisfies a recurrence of the form
T(n) = θ(1) if n ≤ c, and T(n) = a·T(n/b) + D(n) + C(n) otherwise,
where D(n) is the cost of dividing and C(n) the cost of combining.
Dynamic programming:-
Divide-and-conquer algorithms partition the problem into independent sub problems, solve the sub problems recursively, and then combine their solutions to solve the original problem. Dynamic programming, in contrast, is applicable when the sub problems are not independent, that is, when sub problems share sub-sub problems.
A dynamic-programming algorithm solves every sub-sub problem just once and saves its answer in a table, avoiding the work of re-computation. It applies to optimization problems in which a set of choices must be made in order to arrive at an optimal solution. The development of a dynamic-programming algorithm can be broken into a sequence of four steps:
1. Characterize the structure of an optimal solution.
2. Recursively define the value of an optimal solution.
3. Compute the value of an optimal solution, typically in a bottom-up fashion.
4. Construct an optimal solution from the computed information.
• Optimal substructure
A problem exhibits optimal substructure if an optimal solution to the
problem contains within it optimal solution to sub problems
m[i, j] = 0                                                   if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + p(i−1)·p(k)·p(j) }   if i < j
PRINT-OPTIMAL-PARENS(s, i, j)
  if i = j
    then print "A"i
    else print "("
         PRINT-OPTIMAL-PARENS(s, i, s[i, j])
         PRINT-OPTIMAL-PARENS(s, s[i, j] + 1, j)
         print ")"
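The m[i, j] recurrence and the parenthesization printer can be sketched bottom-up in Python (the dimension list used in the demo is an illustrative example, not from the original notes):

```python
import sys

# Bottom-up matrix-chain order: dims[i-1] x dims[i] is the shape of A_i.
def matrix_chain_order(dims):
    n = len(dims) - 1                          # number of matrices
    m = [[0] * (n + 1) for _ in range(n + 1)]  # m[i][j]: min scalar mults
    s = [[0] * (n + 1) for _ in range(n + 1)]  # s[i][j]: optimal split k
    for length in range(2, n + 1):             # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = sys.maxsize
            for k in range(i, j):
                q = m[i][k] + m[k + 1][j] + dims[i - 1] * dims[k] * dims[j]
                if q < m[i][j]:
                    m[i][j], s[i][j] = q, k
    return m, s

def print_parens(s, i, j):
    # PRINT-OPTIMAL-PARENS, returning the string instead of printing.
    if i == j:
        return f"A{i}"
    return "(" + print_parens(s, i, s[i][j]) + print_parens(s, s[i][j] + 1, j) + ")"

m, s = matrix_chain_order([10, 100, 5, 50])  # A1:10x100, A2:100x5, A3:5x50
print(m[1][3])                # -> 7500 scalar multiplications
print(print_parens(s, 1, 3))  # -> ((A1A2)A3)
```

For these dimensions, multiplying (A1A2) first costs 10·100·5 + 10·5·50 = 7500, while A1(A2A3) would cost 75000, so the table correctly picks the former.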
Let c[i, j] be the length of an LCS of the prefixes Xi and Yj. The optimal substructure of the LCS problem gives the recursive formula
c[i, j] = 0                              if i = 0 or j = 0,
c[i, j] = c[i−1, j−1] + 1                if i, j > 0 and xi = yj,
c[i, j] = max(c[i−1, j], c[i, j−1])      if i, j > 0 and xi ≠ yj.
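The c[i, j] recurrence translates directly into a bottom-up table computation; the sample strings in the demo are a common textbook example, not taken from these notes:

```python
# Bottom-up LCS length following the c[i, j] recurrence above.
def lcs_length(x, y):
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]   # row/col 0 stay 0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:            # xi == yj
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]

print(lcs_length("ABCBDAB", "BDCABA"))   # -> 4 (e.g. "BCBA")
```

The table takes θ(mn) time and space; an optimal subsequence itself can be recovered by walking the table back from c[m][n].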
d0 represents all values less than k1, and dn represents all values greater than kn. For each dummy key di, we have a probability qi that a search will correspond to di. Each key ki is an internal node, and each dummy key di is a leaf. Every search is either successful (finding some key ki) or unsuccessful (finding some dummy key di).
The e[i, j] values give the expected cost of searching an optimal BST. The recursive formula is given by
e[i, j] = q(i−1)                                                     if j = i − 1,
e[i, j] = min over i ≤ r ≤ j of { e[i, r−1] + e[r+1, j] + w(i, j) }  if i ≤ j,
where w(i, j) = Σ pl (for l = i, ..., j) + Σ ql (for l = i−1, ..., j).
Greedy algorithm:-
A greedy algorithm always makes the choice that looks best at the
moment. That is it makes a locally optimal choice in the hope that this choice will lead to a
globally optimal solution. These algorithms do not always yield optimal solution.
1. Optimal substructure:- a problem exhibits optimal sub structure if an optimal solution to the
problem contains within it optimal solution to sub problems.
2. Greedy choice property:- a globally optimal solution can be arrived at by making a locally
optimal greedy choice. When we are considering which choice to make we make the choice
that looks best in the current problem without considering results from sub problems.
• Subset paradigm
The greedy method suggests that one can devise an algorithm that works in stages, considering one input at a time. At each stage, a decision is made regarding whether a particular input is in an optimal solution. This is done by considering the inputs in an order determined by some selection procedure.
If the inclusion of the next input into the partially constructed optimal
solution will result in an infeasible solution then this input is not added to the partial solution
otherwise it is added. This version of the greedy technique is called the subset paradigm.
Example: knapsack problem
• Ordering paradigm
For problems that do not call for the selection of an optimal subset in the
greedy method we make decisions by considering the inputs in some order. Each decision is
made using an optimization criterion that can be computed using decisions already made.
This version of greedy method is called ordering paradigm.
Example: single source shortest path problem
Huffman codes:-
Huffman codes are a widely used and very effective technique for compressing data. We consider the data to be a sequence of characters. Huffman's greedy algorithm uses a table of the frequencies of occurrence of the characters to build up an optimal way of representing each character as a binary string.
Example:
Let’s say you have a set of numbers and their frequency of use and want to create a
Huffman encoding for them:
FREQUENCY VALUE
--------- -----
5 1
7 2
10 3
15 4
20 5
45 6
Creating a Huffman tree is simple. Sort this list by frequency and make the two lowest elements into leaves, creating a parent node with a frequency that is the sum of the two lower elements' frequencies:
12:*
/ \
5:1 7:2
The two elements are removed from the list and the new parent node, with frequency
12, is inserted into the list by frequency. So now the list, sorted by frequency, is:
10:3
12:*
15:4
20:5
45:6
You then repeat the loop, combining the two lowest elements. This results in:
22:*
/ \
10:3 12:*
/ \
5:1 7:2
And the list is now:
15:4
20:5
22:*
45:6
You repeat until there is only one element left in the list.
35:*
/ \
15:4 20:5
22:*
35:*
45:6
57:*
___/ \___
/ \
22:* 35:*
/ \ / \
10:3 12:* 15:4 20:5
/ \
5:1 7:2
45:6
57:*
102:*
__________________/ \__
/ \
57:* 45:6
___/ \___
/ \
22:* 35:*
/ \ / \
10:3 12:* 15:4 20:5
/ \
5:1 7:2
Decoding a Huffman encoding is just as easy: as you read bits in from your input stream, you traverse the tree beginning at the root, taking the left-hand path if you read a 0 and the right-hand path if you read a 1. When you hit a leaf, you have found the code.
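The construction walked through above can be sketched with a min-heap ordered by frequency. The frequency table is the one from the example; the tie-breaking counter and 0-for-left / 1-for-right convention are implementation choices, not part of the algorithm itself:

```python
import heapq
from itertools import count

# Build a Huffman code by repeatedly merging the two lowest-frequency nodes.
def huffman_codes(freqs):
    tick = count()                                # tie-breaker for the heap
    heap = [(f, next(tick), sym) for sym, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)         # two lowest frequencies
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tick), (left, right)))
    codes = {}
    def walk(node, code):
        if isinstance(node, tuple):               # internal node
            walk(node[0], code + "0")             # left edge reads 0
            walk(node[1], code + "1")             # right edge reads 1
        else:
            codes[node] = code or "0"
        return codes
    return walk(heap[0][2], "")

codes = huffman_codes({1: 5, 2: 7, 3: 10, 4: 15, 5: 20, 6: 45})
for sym in sorted(codes):
    print(sym, codes[sym])
```

For this table the merges are exactly those in the walkthrough (12, 22, 35, 57, 102); value 6 gets a 1-bit code and the total weighted code length is 228 bits.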
Backtracking:-
Many problems which deal with searching for a set of solutions, or which ask for an optimal solution satisfying some constraints, can be solved using the backtracking formulation. The name "backtrack" was first coined by D. H. Lehmer in the 1950s.
In many applications of the backtrack method, the desired solution is expressible as an n-tuple (x1, x2, ..., xn), where the xi are chosen from some finite set Si. Often the problem to be solved calls for finding one vector that maximizes or minimizes a criterion function P(x1, x2, ..., xn).
The basic idea of a backtracking algorithm is to build up the solution vector one component at a time and to use modified criterion functions to test whether the vector being formed has any chance of success. The major advantage is that if it is realised that a partial vector can in no way lead to an optimal solution, then the rest of the test vectors can be ignored completely.
Example: N-Queens Problem
Given an N x N sized chess board, the objective is to place N queens on the board so that no queen is in danger, i.e., no two queens share a row, column, or diagonal.
Backtracking prunes an entire subtree if its root node is not a viable partial solution; the algorithm will "backtrack" up the tree to search for other possible solutions.
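A minimal backtracking sketch for N-queens follows the scheme above: extend the solution vector one row at a time and prune any partial vector that already contains a conflict (the representation, one queen per row with `cols[r]` giving its column, is an implementation choice):

```python
# Backtracking N-queens: cols[r] = column of the queen placed in row r.
def solve_n_queens(n):
    solutions = []
    cols = []

    def safe(r, c):
        # conflict if same column, or same diagonal (|dc| == |dr|)
        return all(c != cc and abs(c - cc) != r - rr
                   for rr, cc in enumerate(cols))

    def place(r):
        if r == n:
            solutions.append(tuple(cols))
            return
        for c in range(n):
            if safe(r, c):
                cols.append(c)    # extend the partial vector
                place(r + 1)
                cols.pop()        # backtrack and try the next column
    place(0)
    return solutions

print(len(solve_n_queens(8)))    # -> 92 solutions on the 8x8 board
```

The `safe` test is the "modified criterion function": it rejects a partial vector as soon as any pair of queens attacks, so whole subtrees of the search tree are never generated.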
Module- 2
Least-cost search: Here a cost function is used to select the next live node; the node with the least cost-function value is selected. Bounding functions are used to help avoid the generation of subtrees that do not contain an answer node.
In both LIFO and FIFO branch and bound, the selection rule for the next E-node is rigid and, in a sense, blind: it gives no preference to a node that has a very good chance of getting the search to an answer node quickly. An intelligent ranking of live nodes can be based on either:
1. The number of nodes in the subtree rooted at x that need to be generated before an answer node is generated, or
2. The number of levels the nearest answer node is from x.
Let ĝ(x) be an estimate of the additional effort needed to reach an answer node from x. Node x is assigned a rank using a function Ĉ(·) such that
Ĉ(x) = f(h(x)) + ĝ(x),
where h(x) is the cost of reaching x from the root and f(·) is any non-decreasing function.
A search strategy that uses such a cost function to select the next E-node would always choose the live node with the least value of Ĉ(x). Such a search strategy is called LC-search (least-cost search).
Let us consider an example: the 15-puzzle. The start state and goal state are as shown below.
Start state:        Goal state:
 1  2  3  4          1  2  3  4
 5  6  .  8          5  6  7  8
 9 10  7 11          9 10 11 12
13 14 15 12         13 14 15  .
Solution:

Step 1:             Step 2:
 1  2  3  4          1  2  3  4
 5  6  .  8          5  6  7  8
 9 10  7 11          9 10  . 11
13 14 15 12         13 14 15 12

Step 3:             Step 4 (goal state):
 1  2  3  4          1  2  3  4
 5  6  7  8          5  6  7  8
 9 10 11  .          9 10 11 12
13 14 15 12         13 14 15  .
Randomization: -
A randomized algorithm is an algorithm that employs a degree of randomness as part
of its logic. The algorithm typically uses uniformly random bits as an auxiliary input to guide its
behaviour, in the hope of achieving good performance in the "average case" over all possible
choices of random bits. Formally, the algorithm's performance will be a random
variable determined by the random bits; thus either the running time, or the output (or both)
are random variables. An example of a randomized algorithm is quicksort.
Quicksort:-
Quicksort is a familiar, commonly used algorithm in which randomness can be
useful. Any deterministic version of this algorithm requires O(n2) time to sort n numbers for
some well-defined class of degenerate inputs (such as an already sorted array), with the
specific class of inputs that generate this behaviour defined by the protocol for pivot
selection. However, if the algorithm selects pivot elements uniformly at random, it has a
provably high probability of finishing in O(n log n) time regardless of the characteristics of
the input.
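A randomized quicksort can be sketched as below. This version returns a new list rather than sorting in place (an implementation choice made for brevity; the classical in-place variant randomizes the pivot inside a partition routine instead):

```python
import random

# Randomized quicksort: a uniformly random pivot makes the O(n^2)
# worst case improbable for every fixed input, sorted inputs included.
def quicksort(a):
    if len(a) <= 1:
        return a
    pivot = random.choice(a)                   # random pivot selection
    less = [x for x in a if x < pivot]
    equal = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    return quicksort(less) + equal + quicksort(greater)

print(quicksort([5, 3, 8, 1, 9, 2, 7]))   # -> [1, 2, 3, 5, 7, 8, 9]
```

With the random pivot, the expected running time is O(n log n) for every input; no adversary can pick a "degenerate" input in advance, because the bad behaviour now depends on the random bits rather than on the input.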
Data Structure:-
Heap Sort:-
If you have values in a heap and remove them one at a time, they come out in (reverse) sorted order. Since removal from a heap has worst-case complexity O(log n), it takes O(n log n) to remove n values, which come out sorted.
There are a few things we want to achieve to make this work well. If we achieve them all, then we have a worst-case O(n log n) sort that does not use extra memory; this is the best, theoretically, for a comparison sort. The sort works in three steps:
1. Build a heap from the data, in place.
2. Swap the root (the largest value) with the last item in the heap, shrinking the heap by one.
3. Let the new root percolate down to restore the heap order.
You repeat steps 2 & 3 until you finish all the data.
You could do step 1 by inserting the items one at a time into the heap:
• This would be O(n log n). It turns out we can build the heap in O(n). This does not change the overall complexity but is more efficient.
• You would have to modify the normal heap implementation to avoid needing a
second array.
Instead we will enter all values and make it into a heap in one pass.
As with other heap operations, we first make it a complete binary tree and then fix up so the
ordering is correct. We have already seen that there is a relationship between a complete
binary tree and an array.
It will work by letting the smaller values percolate down the tree.
To make it into a heap, you use an algorithm that fixes the lower part of the tree and works its way toward the root:
• Go from the lowest, rightmost parent (non-leaf) and proceed to the left. When one level is finished, go to the next level up, starting again from the right.
• At each node, percolate the item down to its proper place in this part of the subtree, i.e., the subheap.
This example has very few swaps. In some cases you have to percolate a value down by
swapping it with several children.
The Weiss book has the details showing that this build step is worst-case O(n) complexity. It isn't O(n log n) because each step costs only the log of the height of the subtree currently being considered, and most of the nodes root subtrees with a small height; for example, about half the nodes have no children (they are leaves).
Now that we have a heap, we just remove the items one after another. The only new twist here is to keep the removed item in the space of the original array. To do this, you swap the largest item (at the root) with the last item (at the lower right of the heap). Then let the new value at the root percolate down to where it belongs, and repeat with each new root value.
Heap Complexity:-
The part just shown is very similar to removal from a heap, which is O(log n), and you do it n − 1 times, so it is O(n log n). The later steps are cheaper but, for the reverse reason from the building of the heap, most cost log n, so it is O(n log n) overall for this part. The build part was O(n), so it does not dominate. For the whole heap sort you get O(n log n).
Thus, we have finally achieved a comparison sort that uses no extra memory and is
O(nlog(n)) in the worst case.
In many cases people still use quicksort because it uses little extra memory and is usually O(n log n). Quicksort runs faster than heap sort in practice, and its O(n²) worst case is rarely seen.
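The whole procedure described above, an O(n) bottom-up build followed by n − 1 swap-and-percolate steps, can be sketched in place as follows (function names are illustrative):

```python
# In-place heapsort: max-heap in an array, children of i at 2i+1 and 2i+2.
def percolate_down(a, i, size):
    while True:
        child = 2 * i + 1
        if child >= size:
            return
        if child + 1 < size and a[child + 1] > a[child]:
            child += 1                       # take the larger child
        if a[i] >= a[child]:
            return                           # heap order restored
        a[i], a[child] = a[child], a[i]
        i = child

def heapsort(a):
    n = len(a)
    for i in range(n // 2 - 1, -1, -1):      # build: lowest parents first
        percolate_down(a, i, n)
    for end in range(n - 1, 0, -1):          # step 2 & 3, repeated
        a[0], a[end] = a[end], a[0]          # root (max) goes to the end
        percolate_down(a, 0, end)            # re-fix the shrunken heap
    return a

print(heapsort([5, 13, 2, 25, 7, 17, 20, 8, 4]))
```

The removed maxima accumulate at the tail of the same array, which is exactly the "keep the removed item in the space of the original array" trick: no second array is needed.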
Search Tree:-
Search tree is a tree data structure used for locating specific values from within a set.
In order for a tree to function as a search tree, the key for each node must be greater than any
keys in subtrees on the left and less than any keys in subtrees on the right.
The advantage of search trees is their efficient search time given the tree is
reasonably balanced, which is to say the leaves at either end are of comparable depths.
Various search-tree data structures exist, several of which also allow efficient insertion and
deletion of elements, which operations then have to maintain tree balance.
Dijkstra’s algorithm:-
Dijkstra’s algorithm is very similar to Prim’s algorithm for minimum spanning tree.
Like Prim’s MST, we generate a SPT (shortest path tree) with given source as root. We
maintain two sets, one set contains vertices included in shortest path tree, other set includes
vertices not yet included in shortest path tree. At every step of the algorithm, we find a
vertex which is in the other set (set of not yet included) and has minimum distance from
source.
Below are the detailed steps used in Dijkstra’s algorithm to find the shortest path
from a single source vertex to all other vertices in the given graph.
Algorithm:-
1) Create a set sptSet (shortest path tree set) that keeps track of vertices included in the shortest
path tree, i.e., whose minimum distance from source is calculated and finalized. Initially, this
set is empty.
2) Assign a distance value to all vertices in the input graph. Initialize all distance values as
INFINITE. Assign distance value as 0 for the source vertex so that it is picked first.
3) While sptSet doesn’t include all vertices
….a) Pick a vertex u which is not there in sptSet and has minimum distance value.
….b) Include u to sptSet.
….c) Update distance value of all adjacent vertices of u. To update the distance values,
iterate through all adjacent vertices. For every adjacent vertex v, if sum of distance value of
u (from source) and weight of edge u-v, is less than the distance value of v, then update the
distance value of v.
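The steps above can be sketched in Python. This version uses a binary heap to pick the minimum-distance vertex instead of the linear scan described in step 3a (an implementation choice); the adjacency-dict format and the small demo graph are assumptions for illustration:

```python
import heapq

# Dijkstra with a binary heap. graph[u] = [(v, weight), ...], weights >= 0.
def dijkstra(graph, source):
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    done = set()                         # plays the role of sptSet
    while heap:
        d, u = heapq.heappop(heap)       # vertex with minimum distance value
        if u in done:
            continue                     # stale heap entry, skip it
        done.add(u)
        for v, w in graph[u]:            # update (relax) neighbours of u
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist

g = {"a": [("b", 4), ("c", 8)],
     "b": [("c", 3), ("d", 5)],
     "c": [("d", 2)],
     "d": []}
print(dijkstra(g, "a"))   # -> {'a': 0, 'b': 4, 'c': 7, 'd': 9}
```

With the heap, each edge relaxation costs O(log V), giving O(E log V) overall, versus O(V²) for the linear-scan version; both are correct for non-negative edge weights.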
The set sptSet is initially empty, and the distances assigned to the vertices are {0, INF, INF, INF, INF, INF, INF, INF}, where INF indicates infinity. Now pick the vertex with minimum distance value: vertex 0 is picked, so sptSet becomes {0}. After including 0 in sptSet, update the distance values of its adjacent vertices. The adjacent vertices of 0 are 1 and 7; their distance values are updated to 4 and 8.
Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET). The vertex 1 is picked and added to sptSet. So sptSet now becomes {0, 1}. Update
the distance values of adjacent vertices of 1. The distance value of vertex 2 becomes 12.
Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET). Vertex 7 is picked. So sptSet now becomes {0, 1, 7}. Update the distance values of
adjacent vertices of 7. The distance value of vertex 6 and 8 becomes finite (15 and 9
respectively).
Pick the vertex with minimum distance value and not already included in SPT (not in
sptSET). Vertex 6 is picked. So sptSet now becomes {0, 1, 7, 6}. Update the distance values
of adjacent vertices of 6. The distance value of vertex 5 and 8 are updated.
We repeat the above steps until sptSet includes all vertices of the given graph. Finally, we get the shortest path tree (SPT).
Floyd-warshall algorithm:-
This algorithm computes the transitive closure of a directed graph: which vertices can reach which. It considers each vertex k in turn and applies the rule
t(k)[i, j] = t(k−1)[i, j] OR (t(k−1)[i, k] AND t(k−1)[k, j]),
i.e., i reaches j using intermediate vertices numbered at most k if it already did so with intermediates below k, or if it reaches k and k reaches j. The rule is applied n times, each time considering a new vertex through which possible paths may go; at the end, all paths have been discovered.
So we have V = {1, 2, 3, 4, 5, 6} and E = {(1, 2), (1, 3), (2, 4), (2, 5), (3, 1), (3, 6), (4, 6), (4, 3), (6, 5)}. Here is the adjacency matrix and the corresponding t(0) (the adjacency matrix with 1s on the diagonal, since every vertex reaches itself):

down = "from", across = "to"

Adjacency:         t(0):
   1 2 3 4 5 6        1 2 3 4 5 6
1  0 1 1 0 0 0     1  1 1 1 0 0 0
2  0 0 0 1 1 0     2  0 1 0 1 1 0
3  1 0 0 0 0 1     3  1 0 1 0 0 1
4  0 0 1 0 0 1     4  0 0 1 1 0 1
5  0 0 0 0 0 0     5  0 0 0 0 1 0
6  0 0 0 0 1 0     6  0 0 0 0 1 1
Now let's look at what happens as we let k go from 1 to 6:

k = 1: add (3, 2); go from 3 through 1 to 2.

t(1):
   1 2 3 4 5 6
1  1 1 1 0 0 0
2  0 1 0 1 1 0
3  1 1 1 0 0 1
4  0 0 1 1 0 1
5  0 0 0 0 1 0
6  0 0 0 0 1 1

k = 2: add (1, 4); go from 1 through 2 to 4.
       add (1, 5); go from 1 through 2 to 5.
       add (3, 4); go from 3 through 2 to 4.
       add (3, 5); go from 3 through 2 to 5.

t(2):
   1 2 3 4 5 6
1  1 1 1 1 1 0
2  0 1 0 1 1 0
3  1 1 1 1 1 1
4  0 0 1 1 0 1
5  0 0 0 0 1 0
6  0 0 0 0 1 1

k = 3: add (1, 6); go from 1 through 3 to 6.
       add (4, 1); go from 4 through 3 to 1.
       add (4, 2); go from 4 through 3 to 2.
       add (4, 5); go from 4 through 3 to 5.

t(3):
   1 2 3 4 5 6
1  1 1 1 1 1 1
2  0 1 0 1 1 0
3  1 1 1 1 1 1
4  1 1 1 1 1 1
5  0 0 0 0 1 0
6  0 0 0 0 1 1
k = 4: add (2, 1); go from 2 through 4 to 1.
       add (2, 3); go from 2 through 4 to 3.
       add (2, 6); go from 2 through 4 to 6.

t(4):
   1 2 3 4 5 6
1  1 1 1 1 1 1
2  1 1 1 1 1 1
3  1 1 1 1 1 1
4  1 1 1 1 1 1
5  0 0 0 0 1 0
6  0 0 0 0 1 1

k = 5: nothing new is added.

t(5):
   1 2 3 4 5 6
1  1 1 1 1 1 1
2  1 1 1 1 1 1
3  1 1 1 1 1 1
4  1 1 1 1 1 1
5  0 0 0 0 1 0
6  0 0 0 0 1 1

k = 6: nothing new is added.

t(6):
   1 2 3 4 5 6
1  1 1 1 1 1 1
2  1 1 1 1 1 1
3  1 1 1 1 1 1
4  1 1 1 1 1 1
5  0 0 0 0 1 0
6  0 0 0 0 1 1
At the end, the transitive closure is a graph with a complete subgraph (a clique) involving vertices 1, 2, 3, and 4. You can get to 5 from everywhere, but you can get nowhere from 5. You can get to 6 from everywhere except 5, and from 6 only to 5.
Analysis: this algorithm has three nested loops containing a θ(1) core, so it takes θ(n³) time.
What about storage? It might seem that with all these matrices we would need θ(n³) storage; however, at any point in the algorithm we only need the last two matrices computed, so we can re-use the storage from the other matrices, bringing the storage complexity down to θ(n²).
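The computation above can be sketched in a few lines. In fact a single matrix updated in place suffices here (going even below the two-matrix bound mentioned above), because setting t[i][k] or t[k][j] early can only add paths that are valid anyway:

```python
# Transitive closure by the k-loop rule, updating one boolean matrix in place.
def transitive_closure(adj):
    n = len(adj)
    t = [[adj[i][j] or i == j for j in range(n)] for i in range(n)]  # t(0)
    for k in range(n):                 # allow k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                t[i][j] = t[i][j] or (t[i][k] and t[k][j])
    return t

# The 6-vertex example graph from the text (vertices renumbered 0..5).
edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 1), (3, 6), (4, 6), (4, 3), (6, 5)]
adj = [[False] * 6 for _ in range(6)]
for u, v in edges:
    adj[u - 1][v - 1] = True
t = transitive_closure(adj)
for row in t:
    print("".join("1" if x else "0" for x in row))
```

Running this on the example graph reproduces the final t(6) matrix: rows 1 to 4 are all 1s, row 5 is 000010, and row 6 is 000011.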
extract a card, shift the remaining cards, and then insert the extracted card in the correct place. This process is repeated until all the cards are in the correct sequence. Both average and worst-case time is O(n²).
Shell Sort:-
Shell sort, developed by Donald L. Shell, is a non-stable in-place sort. Shell sort improves on the efficiency of insertion sort by quickly shifting values to their destination. Average sort time is O(n^1.25), while worst-case time is O(n^1.5).
Quicksort:-
Although the shell sort algorithm is significantly better than insertion sort, there is still room for improvement. One of the most popular sorting algorithms is quicksort. Quicksort executes in O(n lg n) on average, and O(n²) in the worst case. However, with proper precautions, worst-case behaviour is very unlikely. Quicksort is a non-stable sort. It is not an in-place sort, as stack space is required.
Searching:-
Hash Tables:-
Hash tables are a simple and effective method to implement dictionaries. Average time to search for an element is O(1), while worst-case time is O(n).
Binary Search Trees:-
In the Introduction, we used the binary search algorithm to find data stored in an array. This method is very effective, as each iteration reduces the number of items to search by one-half. However, since data was stored in an array, insertions and deletions were not efficient. Binary search trees store data in nodes that are linked in a tree-like fashion. For randomly inserted data, search time is O(lg n). Worst-case behaviour occurs when ordered data is inserted; in this case the search time is O(n).
Module III
Optimization Problem:-
An optimization problem is the problem of finding the best solution from all feasible solutions. Optimization problems can be divided into two categories depending on whether the variables are continuous or discrete. An optimization problem with discrete variables is known as a combinatorial optimization problem. Formally, an optimization problem can be given as a quadruple (I, f, m, g), where:
• I is a set of instances;
• given an instance x ∈ I, f(x) is the set of feasible solutions;
• given an instance x and a feasible solution y of x, m(x, y) denotes the measure of y, which is usually a positive real;
• g is the goal function, and is either min or max.
The goal is then to find, for some instance x, an optimal solution, that is, a feasible solution y with
m(x, y) = g { m(x, y′) : y′ ∈ f(x) }.
One classical problem is computing the convex hull (the smallest enclosing convex polygon) of a set of points. Each input object is represented as a set of points {p1, p2, ...}, where pi = (xi, yi) and xi, yi ∈ R, R = the set of real numbers.
Figure 35.1 (a) The cross product of vectors p1 and p2 is the signed area of the parallelogram.
(b) The lightly shaded region contains vectors that are clockwise from p. The darkly shaded
region contains vectors that are counterclockwise from p.
The cross product p1 × p2 can be computed as the determinant of a matrix:

p1 × p2 = det | x1  x2 |
              | y1  y2 |  = x1·y2 − x2·y1 = −(p2 × p1).

Actually, the cross product is a three-dimensional concept. It is a vector that is perpendicular to both p1 and p2 according to the "right-hand rule" and whose magnitude is |x1y2 − x2y1|. In this chapter, however, it will prove convenient to treat the cross product simply as the value x1y2 − x2y1.
If p1 X p2 is positive, then p1 is clockwise from p2 with respect to the origin (0, 0); if this
cross product is negative, then p1 is counter clockwise from p2. Figure (b) shows the
clockwise and counter clockwise regions relative to a vector p. A boundary condition arises
if the cross product is zero; in this case, the vectors are collinear, pointing in either the same
or opposite directions.
(p1 - p0) x (p2 - p0) = (x1 - x0) (y2 - y0) - (x2 - x0) (y1 - y0).
Two consecutive line segments turn left or right at point p1. Equivalently, we
want a method to determine which way a given angle ∠p0p1p2 turns. Cross products allow
us to answer this question without computing the angle. As shown in Figure 35.2, we simply
check whether the directed segment p0p2 is clockwise or counterclockwise relative to the
directed segment p0p1. To do this, we compute the cross product (p2 − p0) × (p1 − p0). If the
sign of this cross product is negative, then p0p2 is counterclockwise with respect to p0p1,
and thus we make a left turn at p1. A positive cross product indicates a clockwise orientation
and a right turn. A cross product of 0 means that the points p0, p1, and p2 are collinear.
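The turn test above can be sketched in Python. This is an illustrative helper, not from the notes; points are assumed to be (x, y) pairs, and the cross product is taken in the order of the formula displayed earlier, so a positive sign corresponds to a left turn:

```python
def cross(p0, p1, p2):
    """Cross product (p1 - p0) x (p2 - p0), as in the formula above."""
    return (p1[0] - p0[0]) * (p2[1] - p0[1]) - (p2[0] - p0[0]) * (p1[1] - p0[1])

def turn(p0, p1, p2):
    """Classify the turn made at p1 when walking p0 -> p1 -> p2."""
    c = cross(p0, p1, p2)
    if c > 0:
        return "left"        # p2 lies counterclockwise from segment p0p1
    if c < 0:
        return "right"       # p2 lies clockwise from segment p0p1
    return "collinear"       # p0, p1, p2 lie on one line
```

No angles or divisions are computed, which is exactly the point of the cross-product test.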
Figure 35.2 Using the cross product to determine how consecutive line segments
turn at point p1. We check whether the directed segment p0p2 is clockwise or counterclockwise
relative to the directed segment p0p1. (a) If counterclockwise, the points make a left turn. (b) If
clockwise, they make a right turn.
We use a two-stage process to determine whether two line segments intersect. The first stage
is quick rejection: the line segments cannot intersect if their bounding boxes do not intersect.
The bounding box of a geometric figure is the smallest rectangle that contains the figure and
whose sides are parallel to the x-axis and y-axis. The bounding box of line segment p1p2 is
the rectangle with lower-left point (min(x1, x2), min(y1, y2)) and upper-right point
(max(x1, x2), max(y1, y2)). Two bounding boxes intersect if and only if the conjunction of
four coordinate comparisons is true: the rectangles must intersect in both dimensions. The
first two comparisons determine whether the rectangles intersect in x; the second two
comparisons determine whether the rectangles intersect in y.
The second stage in determining whether two line segments intersect decides whether
each segment "straddles" the line containing the other. A segment p1p2 straddles a line if
point p1 lies on one side of the line and point p2 lies on the other side. A boundary case
arises if p1 or p2 lies directly on the line. Two line segments intersect if and only
if they pass the quick-rejection test and each segment straddles the line containing the other.
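The straddle test can be coded directly with cross products; an illustrative Python sketch follows. Here the boundary cases (an endpoint lying on the other segment's line) are handled with an on-segment check, which also makes a separate bounding-box stage unnecessary for correctness:

```python
def direction(pi, pj, pk):
    """Cross product (pk - pi) x (pj - pi)."""
    return (pk[0] - pi[0]) * (pj[1] - pi[1]) - (pj[0] - pi[0]) * (pk[1] - pi[1])

def on_segment(pi, pj, pk):
    """True if pk lies on segment pi-pj, assuming pi, pj, pk are collinear."""
    return (min(pi[0], pj[0]) <= pk[0] <= max(pi[0], pj[0]) and
            min(pi[1], pj[1]) <= pk[1] <= max(pi[1], pj[1]))

def segments_intersect(p1, p2, p3, p4):
    d1 = direction(p3, p4, p1)
    d2 = direction(p3, p4, p2)
    d3 = direction(p1, p2, p3)
    d4 = direction(p1, p2, p4)
    # each segment properly straddles the line containing the other
    if ((d1 > 0 and d2 < 0) or (d1 < 0 and d2 > 0)) and \
       ((d3 > 0 and d4 < 0) or (d3 < 0 and d4 > 0)):
        return True
    # boundary cases: an endpoint lies on the other segment
    if d1 == 0 and on_segment(p3, p4, p1): return True
    if d2 == 0 and on_segment(p3, p4, p2): return True
    if d3 == 0 and on_segment(p1, p2, p3): return True
    if d4 == 0 and on_segment(p1, p2, p4): return True
    return False
```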
Ordering segments:-
Since we assume that there are no vertical segments, any input segment that
intersects a given vertical sweep line intersects it at a single point. We can thus order the
segments that intersect a vertical sweep line according to the y-coordinates of the points of
intersection.
For any given x, the relation ">x" is a total order on segments that intersect the sweep
line at x. The order may differ for differing values of x, however, as segments enter and
leave the ordering. A segment enters the ordering when its left endpoint is encountered by
the sweep, and it leaves the ordering when its right endpoint is encountered.
When two segments intersect, their positions in the total order are reversed. Sweep lines
v and w are to the left and right, respectively, of the point of intersection of segments e
and f, and we have e >v f and f >w e. Note that because we assume that no three
segments intersect at the same point, there must be some vertical sweep line x for which
the intersecting segments e and f are consecutive in the total order >x. Any sweep line
that passes through the shaded region of Figure (b), such as z, has e and f consecutive
in its total order.
Graham's scan:-
Graham's scan is a method of computing the convex hull of a finite set of points in
the plane with time complexity O(n log n). It is named after Ronald Graham, who published
the original algorithm in 1972. The algorithm finds all vertices of the convex hull ordered
along its boundary.
The algorithm proceeds by considering each of the points in the sorted array in
sequence. For each point, it is determined whether moving from the two previously
considered points to this point is a "left turn" or a "right turn". If it is a "right turn", this
means that the second-to-last point is not part of the convex hull and should be removed
from consideration. This process is continued for as long as the set of the last three points is
a "right turn". As soon as a "left turn" is encountered, the algorithm moves on to the next
point in the sorted array. (If at any stage the three points are collinear, one may opt either to
discard the point or to report it, since in some applications it is required to find all points on
the boundary of the convex hull.)
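The sort-then-scan idea above can be coded compactly with the closely related monotone-chain variant, which sorts by coordinates instead of by polar angle but uses the same left/right-turn discipline. This is an illustrative sketch, not the exact Graham procedure from the text:

```python
def convex_hull(points):
    """Convex hull via monotone chain; returns hull vertices counterclockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # cross product (a - o) x (b - o); > 0 means a left turn at a
        return (a[0] - o[0]) * (b[1] - o[1]) - (b[0] - o[0]) * (a[1] - o[1])

    lower = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()            # not a left turn: second-to-last point removed
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]  # concatenate, dropping duplicated endpoints
```

The pop-until-left-turn loop is exactly the "remove the second-to-last point on a right turn" step of Graham's scan.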
String Matching:-
A string matching algorithm searches for occurrences of a particular pattern in a string
sequence. In the string matching problem, we assume that the text is an array T[1..n] of
length n and the pattern is an array P[1..m] of length m ≤ n.
Naive Algorithm:-
Algorithm Naive-String-Matcher(T, P)
n = length[T]; m = length[P];
for s = 0 to n−m do
if P[1..m] = = T[s+1..s+m] then
print shift s;
end if;
end for;
End algorithm.
Complexity: O((n-m+1)m)
A special note: we allow O(k+1) type notation in order to avoid O(0) term, rather, we want
to have O(1) (constant time) in such a boundary situation.
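A direct Python transcription of the naive matcher (0-based shifts; illustrative, not from the notes):

```python
def naive_match(T, P):
    """Return every shift s (0-based) at which pattern P occurs in text T."""
    n, m = len(T), len(P)
    shifts = []
    for s in range(n - m + 1):       # n - m + 1 candidate shifts
        if T[s:s + m] == P:          # character-by-character comparison, O(m)
            shifts.append(s)
    return shifts
```

The slice comparison hides the inner O(m) loop, giving the O((n−m+1)m) bound stated above.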
Rabin-Karp Algorithm:-
General formula: t_{s+1} = d(t_s − d^(m−1)·T[s+1]) + T[s+m+1], in radix d, where t_s is the
number corresponding to the substring T[s+1..s+m]. Note, m is the size of P.
The first-pass scheme: (1) pre-process to compute the (n−m+1) numbers for the shifts of T
and one number for P, (2) compare the number for P with those computed on T. Input: text
string T, pattern string P to search for, the radix d to be used (= |∑|, for alphabet ∑), and a
prime q.
However, if the translated numbers are large (i.e., m is large), then even the number
matching could be O(m). The worst case occurs when every shift is successful (a "valid
shift"), e.g., T = a^n and P = a^m. For that case, the complexity is O(nm), as before.
But in fact, for c hits, the cost is O((n−m+1) + cm) = O(n+m) for a small c, as is expected
in real life.
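A runnable sketch of the Rabin-Karp idea follows (0-based shifts; the radix d = 256 and the prime q = 101 are illustrative choices, not prescribed by the notes):

```python
def rabin_karp(T, P, d=256, q=101):
    """Return every 0-based shift at which P occurs in T."""
    n, m = len(T), len(P)
    if m > n:
        return []
    h = pow(d, m - 1, q)               # d^(m-1) mod q, drops the leading digit
    p = t = 0
    for i in range(m):                 # preprocessing: numbers for P and T[0..m-1]
        p = (d * p + ord(P[i])) % q
        t = (d * t + ord(T[i])) % q
    shifts = []
    for s in range(n - m + 1):
        if p == t and T[s:s + m] == P:  # verify to rule out spurious hash hits
            shifts.append(s)
        if s < n - m:                   # rolling update: t_{s+1} from t_s
            t = (d * (t - ord(T[s]) * h) + ord(T[s + m])) % q
    return shifts
```

The rolling update line is the general formula above, taken modulo q so each step is O(1).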
String-matching automata:-
We shall assume that P is a given fixed pattern string; for brevity, we shall not indicate the
dependence upon P in our notation.
Knuth-Morris-Pratt Algorithm:-
The Knuth–Morris–Pratt string searching algorithm (or KMP algorithm) searches for
occurrences of a "word" W within a main "text string" S by employing the observation that
when a mismatch occurs, the word itself embodies sufficient information to determine where
the next match could begin, thus bypassing re-examination of previously matched characters.
Thus, for P = ababababca, the prefix P6 = ababab has longest proper prefix that is also a
suffix abab, so Pi(6) = 4.
An array Pi[1..m] is first computed for the whole pattern P, giving Pi[1] through Pi[10]
above.
The array Pi actually holds a chain for transitions, e.g., Pi[8] = 6, Pi[6]=4, …,
Algorithm KMP-Matcher(T, P)
n = length[T]; m = length[P];
Pi = Compute-Prefix-Function(P);
q = 0; // number of characters matched so far
for i = 1 to n do
while (q > 0 && P[q+1] =/= T[i]) do
q = Pi[q];
end while;
if (P[q+1] = = T[i]) then
q = q+1;
end if;
if (q = = m) then
print shift i−m;
q = Pi[q]; // old matched part is preserved, & reused in the next iteration
end if;
end for;
End algorithm.
Algorithm Compute-Prefix-Function(P)
m = length[P];
Pi[1] = 0;
k = 0;
for q = 2 to m do
while (k>0 && P[k+1] =/= P[q]) do // loop breaks with k=0 or next if succeeding
k = Pi[k];
end while;
if (P[k+1] = = P[q]) then // check if the next pointed character extends previously
identified symmetry
k = k+1;
end if;
Pi[q] = k;
end for;
return Pi;
End algorithm.
In reality the inner while loop runs only a few times, as such symmetry is usually not
prevalent. Without any symmetry the transition quickly jumps to q = 0; e.g., for P = acgt,
every Pi value is 0.
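The same pair of procedures can be written in 0-based Python, where pi[q−1] plays the role of Pi[q] above (an illustrative sketch):

```python
def compute_prefix(P):
    """0-based prefix function: pi[q] is the length of the longest proper
    prefix of P[:q+1] that is also a suffix of it."""
    m = len(P)
    pi = [0] * m
    k = 0
    for q in range(1, m):
        while k > 0 and P[k] != P[q]:
            k = pi[k - 1]            # fall back along the chain of borders
        if P[k] == P[q]:
            k += 1                   # next character extends the symmetry
        pi[q] = k
    return pi

def kmp_match(T, P):
    """Return every 0-based shift at which P occurs in T."""
    pi = compute_prefix(P)
    q = 0                            # number of characters matched so far
    shifts = []
    for i, c in enumerate(T):
        while q > 0 and P[q] != c:
            q = pi[q - 1]
        if P[q] == c:
            q += 1
        if q == len(P):
            shifts.append(i - len(P) + 1)
            q = pi[q - 1]            # reuse the matched part for the next shift
    return shifts
```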
Breadth-first search (BFS):-
Algorithm:-
• The algorithm uses a queue data structure to store intermediate results as it traverses
the graph, as follows:
1. Enqueue the root node.
2. Dequeue a node and examine it.
o If the element sought is found in this node, quit the search and return a result.
o Otherwise enqueue any successors (the direct child nodes) that have not yet
been discovered.
3. If the queue is empty, every node on the graph has been examined – quit the search
and return "not found".
4. If the queue is not empty, repeat from Step 2.
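The queue-based traversal can be sketched in Python. Here the graph is an adjacency-list dict (an assumed representation), and instead of searching for a particular element the function records each vertex's level, i.e., its edge distance from the source:

```python
from collections import deque

def bfs(adj, s):
    """Return the level (edge distance from s) of every vertex reachable from s."""
    dist = {s: 0}                       # discovered vertices and their levels
    q = deque([s])
    while q:
        u = q.popleft()                 # dequeue and examine
        for v in adj[u]:
            if v not in dist:           # not yet discovered ("white")
                dist[v] = dist[u] + 1
                q.append(v)             # enqueue successor
    return dist
```

Run on the CLRS sample graph discussed below, it reports w and r at level 1, then t, x, v at level 2, then u and y at level 3, matching the walkthrough.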
Example: The following figure (from CLRS) illustrates the progress of breadth-first
search on the undirected sample graph.
a. After initialization (paint every vertex white, set d[u] to infinity for each vertex u, and
set the parent of every vertex to be NIL), the source vertex is discovered in line 5. Lines
8-9 initialize Q to contain just the source vertex s.
b. The algorithm discovers all vertices 1 edge from s, i.e., all vertices (w and r) at level 1.
d. The algorithm discovers all vertices 2 edges from s, i.e., all vertices (t, x, and v) at
level 2.
g. The algorithm discovers all vertices 3 edges from s, i.e., all vertices (u and y) at level 3.
i. The algorithm terminates when every vertex has been fully explored.
Depth-first search(DFS):-
Algorithm:-
In DFS, each vertex has three possible colors representing its state: white (unvisited), gray
(discovered, but with unexplored out-edges remaining), and black (finished).
NB. For most algorithms the boolean classification unvisited/visited is quite enough, but we
show the general case here.
Initially all vertices are white (unvisited). DFS starts at an arbitrary vertex, marks it
visited, and recursively visits any adjacent unvisited vertex, backtracking when a vertex has
no unvisited neighbors left.
Example. Traverse a graph shown below, using DFS. Start from a vertex with number 1.
Source graph.
As you can see from the example, DFS doesn't go through all edges. The vertices and
edges which depth-first search has visited form a tree. This tree contains all vertices of the
graph (if it is connected) and is called a spanning tree of the graph. This tree corresponds
exactly to the recursive calls of DFS.
If a graph is disconnected, DFS won't visit all of its vertices. For details, see the
connected-components algorithm.
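An iterative Python sketch of DFS, with an explicit stack standing in for the recursion (illustrative; returns the order in which vertices are first visited):

```python
def dfs(adj, start):
    """Depth-first traversal; returns vertices in first-visit order."""
    visited, order = set(), []
    stack = [start]
    while stack:
        u = stack.pop()
        if u in visited:
            continue
        visited.add(u)                   # white -> visited
        order.append(u)
        for v in reversed(adj[u]):       # reversed so neighbors pop in list order
            if v not in visited:
                stack.append(v)
    return order
```

The set of (parent, child) pairs along which vertices are first discovered is exactly the DFS spanning tree described above.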
Module IV
Spanning tree:-
A spanning tree for a graph G is a sub-graph of G which is a tree that includes every
vertex of G. A spanning tree of a graph G is a "maximal" tree contained in the graph G.
When you have a spanning tree T for a graph G, you cannot add another edge of G to T
without producing a circuit.
Example:
Consider the following graph, G, representing pairs of people (A, B, C, D and E) who
are acquainted with each other.
Kruskal's Algorithm:-
Find the minimal spanning tree for the following connected weighted graph G. The
starting point of Kruskal's Algorithm is to make an "edge" list, in which the edges are listed
in order of increasing weights.
Kruskal's Algorithm for finding minimum spanning trees for weighted graphs (Epp's
version) is then:
Input: G, a connected weighted graph with n vertices.
Algorithm Body: (Build a sub-graph T of G to consist of all of the vertices of G with edges
added in order of increasing weight. At each stage, let m be the number of edges of T.)
1. Initialise T to have all of the vertices of G and no edges.
2. Let E be the set of all edges of G and let m = 0. (pre-condition: G is connected.)
3. While (m < n − 1)
a. Find an edge e in E of least weight.
b. Delete e from E.
c. If addition of e to the edge set of T does not produce a circuit, then add e to the
edge set of T and set m = m + 1.
End While (post-condition: T is a minimum spanning tree for G.)
Output: T (a graph)
End Algorithm
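The steps above can be sketched in Python, with a union-find structure standing in for the circuit test (an illustrative version; vertices are assumed to be numbered 0..n−1 and edges given as (weight, u, v) triples):

```python
def kruskal(n, edges):
    """Minimum spanning tree; returns (total weight, list of chosen edges)."""
    parent = list(range(n))

    def find(x):                        # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):       # edge list in order of increasing weight
        ru, rv = find(u), find(v)
        if ru != rv:                    # adding (u, v) produces no circuit
            parent[ru] = rv
            total += w
            chosen.append((u, v))
    return total, chosen
```

Two endpoints with the same union-find root are already connected in T, so adding that edge would close a circuit; this replaces the explicit circuit check in step 3c.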
Prim's algorithm:-
• Input: A non-empty connected weighted graph with vertices V and edges E (the
weights can be negative).
• Initialize: Vnew = {x}, where x is an arbitrary node (starting point) from V; Enew = {}
• Repeat until Vnew = V:
o Choose an edge (u, v) with minimal weight such that u is in Vnew and v is not
(if there are multiple edges with the same weight, any of them may be picked)
o Add v to Vnew, and (u, v) to Enew
• Output: Vnew and Enew describe a minimal spanning tree
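The loop above can be sketched with a heap supplying the minimal-weight crossing edge. This is illustrative; the adjacency representation (vertex → list of (weight, neighbor) pairs) is an assumption, and only the total MST weight is returned:

```python
import heapq

def prim(adj, start):
    """Total weight of a minimum spanning tree, grown from `start`."""
    visited = {start}                    # Vnew
    heap = list(adj[start])              # candidate edges leaving the tree
    heapq.heapify(heap)
    total = 0
    while heap and len(visited) < len(adj):
        w, v = heapq.heappop(heap)       # minimal-weight edge out of Vnew
        if v in visited:
            continue                     # both endpoints already in the tree
        visited.add(v)
        total += w
        for edge in adj[v]:
            if edge[1] not in visited:
                heapq.heappush(heap, edge)
    return total
```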
Dijkstra's Algorithm:-
Dijkstra's algorithm (named after its discoverer, E. W. Dijkstra) solves the problem of
finding the shortest path from a point in a graph (the source) to a destination. It turns out that
one can find the shortest paths from a given source to all points in a graph in the same time;
hence this problem is sometimes called the single-source shortest paths problem.
The somewhat unexpected result that all the paths can be found as easily as one
further demonstrates the value of reading the literature on algorithms!
This problem is related to the spanning tree one. The graph representing all the paths
from one vertex to all the others must be a spanning tree - it must include all vertices. There
will also be no cycles as a cycle would define more than one path from the selected vertex to
at least one other vertex. For a graph,
The algorithm maintains a set S of vertices whose shortest distance from the source is
already known. At each step we:
• Sort the vertices in V−S according to the current best estimate of their distance
from the source,
• Add u, the closest vertex in V−S, to S,
• Relax all the vertices still in V−S that are connected to u.
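The three steps can be sketched in Python with a priority queue taking the place of the explicit sort (illustrative; assumes non-negative edge weights, which Dijkstra's algorithm requires, and an adjacency dict of (weight, neighbor) pairs):

```python
import heapq

def dijkstra(adj, s):
    """Shortest-path distances from s to every reachable vertex."""
    dist = {s: 0}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)       # closest vertex not yet finalized
        if d > dist.get(u, float('inf')):
            continue                     # stale heap entry, skip
        for w, v in adj[u]:
            nd = d + w                   # relax edge (u, v)
            if nd < dist.get(v, float('inf')):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist
```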
Maximum flow:-
We can also interpret a directed graph as a flow network and use it to answer
questions about material flows. Consider a material flowing through a system from a source,
where the material is produced, to a sink, where it is consumed. The source produces the
material at some steady rate and the sink consumes it at the same rate.
The flow of the material at any point in the system is the rate at which the
material moves. Flow networks can be used to model liquids flowing through pipes, parts
moving through assembly lines, current through electrical networks, and information through
communication networks.
Flow conservation property:-The rate at which a material enters a vertex must equal to the
rate at which it leaves the vertex. This is called as the flow conservation property.
Maximum flow problem:- Here we wish to compute the greatest rate at which material can
be shipped from the source to the sink without violating any capacity constraint.
Definition of flow: Let G = (V, E) be a flow network with a capacity function c. Let ‘s’ be the
source of the network and ‘t’ be the sink.
A flow in G is a real-valued function f: V×V → R, where R is the set of real numbers, that
satisfies the following three properties:
1. Capacity constraint property:-It says that the flow from one vertex to another must
not exceed the given capacity.
For all u, v∈V, we require f(u, v) ≤ c(u, v)
2. Skew symmetry property:- It says that the flow from a vertex u to a vertex v is the
negative of the flow in the reverse direction.
For all u, v ∈ V, we require f(u, v) = −f(v, u)
3. Flow conservation property:- It says that the total flow out of a vertex other than the
source or sink is zero.
For all u ∈ V − {s, t}, we require ∑v∈V f(u, v) = 0
Given a graph which represents a flow network where every edge has a capacity,
and two vertices, source ‘s’ and sink ‘t’, find the maximum possible flow
from s to t with the following constraints:
a) The flow on an edge does not exceed the capacity of that edge.
b) Incoming flow is equal to outgoing flow for every vertex except s and t.
Ford-Fulkerson Algorithm:-
Augmenting path:-
Given a flow network G = (V, E) and a flow f, an augmenting path p is a
simple path from s to t in the residual network Gf. Each edge (u, v) on an augmenting path
admits some additional positive flow from u to v without violating the capacity constraint on
the edge. The residual capacity of the path p is given by:
cf(p) = min { cf(u, v) : (u, v) is on p }
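A compact sketch of the Ford-Fulkerson method with BFS path selection (the Edmonds-Karp variant) follows. The nested-dict capacity representation is an assumption for illustration:

```python
from collections import deque

def max_flow(capacity, s, t):
    """Maximum s-t flow; capacity is a dict u -> dict v -> edge capacity."""
    # residual capacities, with reverse edges initialized to 0
    res = {u: dict(nbrs) for u, nbrs in capacity.items()}
    for u, nbrs in capacity.items():
        for v in nbrs:
            res.setdefault(v, {}).setdefault(u, 0)
    flow = 0
    while True:
        parent = {s: None}
        q = deque([s])
        while q and t not in parent:        # BFS for an augmenting path in Gf
            u = q.popleft()
            for v, c in res[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    q.append(v)
        if t not in parent:
            return flow                     # no augmenting path remains
        bottleneck, v = float('inf'), t     # residual capacity cf(p): the
        while parent[v] is not None:        # minimum along the path
            u = parent[v]
            bottleneck = min(bottleneck, res[u][v])
            v = u
        v = t
        while parent[v] is not None:        # augment along the path
            u = parent[v]
            res[u][v] -= bottleneck
            res[v][u] += bottleneck
            v = u
        flow += bottleneck
```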
Example: quicksort's running time is O(n²), which is polynomial in n, so quicksort is a
polynomial-time algorithm. Problems are divided into the following classes:
• P-class
• NP-class
• NPC-class
P-class:
The class P consists of those problems that are solvable in polynomial time,
that is, in time O(n^k) for some constant k, where n is the input size.
NP-class:
The class NP consists of those problems that are verifiable in polynomial time;
that is, given a certificate of a solution, we could verify that the certificate is correct in time
polynomial in the size of the input to the problem.
NPC-class:
The class NP-complete consists of those problems that are in NP and are as
hard as any problem in NP. No polynomial-time algorithm has yet been discovered for any
NP-complete problem, and if any NP-complete problem could be solved in polynomial time,
then every problem in NP could be.
Steps
Example: if a sub circuit always produces 0 then that sub circuit can be replaced by a simpler
sub circuit that omits all logic gates and provides the constant value 0 as its output.
The three basic logic gates that we use in this problem are:
AND gate
This gate's output is 1 if all of its inputs are 1, and 0 otherwise.
OR gate
This gate's output is 1 if at least one of its inputs is 1, and 0 otherwise.
NOT gate
It takes a single binary input, either 0 or 1, and produces a binary output whose
value is the opposite of the input value.
Example:
Approximation algorithms :-
Many problems of practical significance are NP-complete but are too
important to abandon merely because obtaining an optimal solution is intractable. If a
problem is NP-complete, we are unlikely to find a polynomial-time algorithm for solving it
exactly, but there are ways around this:
I. If the actual inputs are small an algorithm with exponential running time may
be perfectly satisfactory.
II. We may be able to isolate important special cases that are solvable in
polynomial time.
III. It may be possible to find near-optimal solutions in polynomial time, either in
the worst case or on average.
For a maximization problem, the cost C of the returned solution satisfies C ≤ C*, where
C* is the optimal cost, so (C/C*) ≤ 1 and therefore (C*/C) ≥ 1 is used as the
approximation ratio.
Approximation scheme:
We call an approximation scheme a polynomial-time approximation scheme if, for any
fixed ε > 0, the scheme runs in time polynomial in the size n of its input.
Algorithm Approx-Vertex-Cover(G)
1. c ← {}
2. E` ← E[G]
3. while E` is not empty do
4. Let (u, v) be an arbitrary edge of E`
5. c ← c U {u, v}
6. Remove from E` every edge incident on either u or v
7. return c
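The pseudocode above translates directly into Python (illustrative; edges are given as vertex pairs):

```python
def approx_vertex_cover(edges):
    """2-approximation for vertex cover: repeatedly pick an arbitrary
    remaining edge, add both endpoints, and delete every incident edge."""
    cover = set()
    remaining = list(edges)
    while remaining:
        u, v = remaining[0]                 # arbitrary edge of E`
        cover.update((u, v))
        remaining = [(a, b) for a, b in remaining
                     if a not in cover and b not in cover]
    return cover
```

Each picked edge must have at least one endpoint in any optimal cover, and the picked edges share no endpoints, so the returned cover is at most twice the optimal size.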
Example