Design and Analysis of Algorithms (DAA) Notes
INTRODUCTION
1.1 Notion of Algorithm
1.2 Review of Asymptotic Notation
An algorithm is composed of a finite set of steps, each of which may require one or more op-
erations. The possibility of a computer carrying out these operations necessitates that certain
constraints be placed on the type of operations an algorithm can include. The fourth criterion
for algorithms we assume in this book is that they terminate after a finite number of opera-
tions.
Criterion 5 requires that each operation be effective: each step must be such that it can, at least
in principle, be done by a person using pencil and paper in a finite amount of time. Performing
arithmetic on integers is an example of an effective operation, but arithmetic with real numbers is
not, since some values may be expressible only by an infinitely long decimal expansion. Adding
two such numbers would violate the effectiveness property.
• Algorithms that are definite and effective are also called computational procedures.
• The same algorithm can be represented in several ways
• There may be several algorithms to solve the same problem
• Different ideas can lead to algorithms of different speeds
Example:
Problem: GCD of two numbers m, n
Input specification: Two inputs, nonnegative, not both zero
Euclid's algorithm
gcd(m, n) = gcd(n, m mod n)
Repeat until m mod n = 0, since gcd(m, 0) = m
Another representation of the same algorithm
Euclid's algorithm
Step 1: If n = 0, return the value of m as the answer and stop; otherwise proceed to Step 2
Step 2: Divide m by n and assign the value of the remainder to r
Step 3: Assign the value of n to m and the value of r to n. Go to Step 1.
Another algorithm to solve the same problem (consecutive integer checking)
Step 1: Assign the value of min(m, n) to t
Step 2: Divide m by t. If the remainder is 0, go to Step 3; else go to Step 4
Step 3: Divide n by t. If the remainder is 0, return the value of t as the answer and stop; otherwise proceed to Step 4
Step 4: Decrease the value of t by 1. Go to Step 2
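The descriptions above can be turned into code directly. A minimal Python sketch of both approaches (assuming nonnegative integers m and n, not both zero; the function names are ours):

def gcd_euclid(m, n):
    # gcd(m, n) = gcd(n, m mod n); stop when m mod n = 0, since gcd(m, 0) = m
    while n != 0:
        m, n = n, m % n
    return m

def gcd_consecutive(m, n):
    # try t = min(m, n), min(m, n) - 1, ... until t divides both m and n
    # (assumes m and n are both positive)
    t = min(m, n)
    while m % t != 0 or n % t != 0:
        t -= 1
    return t

print(gcd_euclid(60, 24), gcd_consecutive(60, 24))  # 12 12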
• Worst-case efficiency: Efficiency (number of times the basic operation will be executed) for
the worst case input of size n. i.e. The algorithm runs the longest among all possible inputs of
size n.
• Best-case efficiency: Efficiency (number of times the basic operation will be executed) for the
best case input of size n. i.e. The algorithm runs the fastest among all possible inputs of size n.
• Average-case efficiency: Average number of times the basic operation is executed over all
possible (random) instances of the input of size n. NOTE: this is NOT the average of the
worst and best cases.
Asymptotic Notations
Asymptotic notation is a way of comparing functions that ignores constant factors and small input
sizes. Three notations used to compare orders of growth of an algorithm‘s basic operation count are:
O, Ω, Θ notations
Big Oh- O notation
Definition:
A function t(n) is said to be in O(g(n)), denoted t(n)=O(g(n)), if t(n) is bounded above by some
constant multiple of g(n) for all large n, i.e., if there exist some positive constant c and some
nonnegative integer n0 such that
t(n) ≤ cg(n) for all n ≥ n0
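For example, 100n + 5 ∈ O(n²): taking c = 105 and n0 = 1 works, since 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for all n ≥ 1.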
Basic efficiency classes, in increasing order of growth (decreasing time efficiency):
n log n — n-log-n
n² — quadratic
n³ — cubic
2ⁿ — exponential
n! — factorial (lowest time efficiency; the slowest algorithms)
Example: Find the number of binary digits in the binary representation of a positive
decimal integer
ALGORITHM BinRec (n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n's binary representation
if n == 1
return 1
else
return BinRec (⌊n/2⌋) + 1
Analysis:
1. Input size: given number = n
2. Basic operation: addition
3. NO best, worst, average cases.
4. Let A(n) denote the number of additions.
A(n) = A(⌊n/2⌋) + 1 for n > 1
A(1) = 0 (initial condition)
where A(⌊n/2⌋) is the cost of computing BinRec(⌊n/2⌋)
and 1 is the addition that increases the returned value.
5. Solve the recurrence:
A(n) = A(⌊n/2⌋) + 1 for n > 1
Assume n = 2^k (smoothness rule):
A(2^k) = A(2^(k-1)) + 1 for k > 0; A(2^0) = 0
Solving using the backward substitution method:
A(2^k) = A(2^(k-1)) + 1
= [A(2^(k-2)) + 1] + 1 = A(2^(k-2)) + 2
= [A(2^(k-3)) + 1] + 2 = A(2^(k-3)) + 3
…
After i substitutions we have
A(2^k) = A(2^(k-i)) + i
When i = k, we have
A(2^k) = A(2^0) + k = k, since A(2^0) = 0
Since n = 2^k, hence k = log₂ n
A(n) = log₂ n
A(n) = Θ(log n)
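A direct Python rendering of BinRec (a sketch; integer division plays the role of ⌊n/2⌋):

def bin_rec(n):
    # number of binary digits in the binary representation of n >= 1
    if n == 1:
        return 1
    return bin_rec(n // 2) + 1

print(bin_rec(8), bin_rec(15))  # 4 4  (1000 and 1111 both have 4 bits)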
Selection Sort
Thus, selection sort is a Θ(n²) algorithm on all inputs. The number of key swaps is only Θ(n)
or, more precisely, n − 1 (one for each repetition of the i loop). This property distinguishes selection sort
positively from many other sorting algorithms.
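A minimal Python sketch of selection sort, illustrating these counts — Θ(n²) comparisons but only n − 1 swaps:

def selection_sort(a):
    n = len(a)
    for i in range(n - 1):
        m = i                        # index of the smallest element in a[i..n-1]
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        a[i], a[m] = a[m], a[i]      # one swap per pass: n - 1 swaps in total
    return a

print(selection_sort([89, 45, 68, 90, 29, 34, 17]))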
Bubble Sort
Compare adjacent elements of the list and exchange them if they are out of order. Then we
repeat the process. By doing it repeatedly, we end up "bubbling up" the largest element to the last
position on the list.
ALGORITHM BubbleSort(A[0..n − 1])
//The algorithm sorts array A[0..n − 1] by bubble sort
//Input: An array A[0..n − 1] of orderable elements
//Output: Array A[0..n − 1] sorted in ascending order
for i ← 0 to n − 2 do
for j ← 0 to n − 2 − i do
if A[j + 1] < A[j]
swap A[j] and A[j + 1]
Example
The first 2 passes of bubble sort on the list 89, 45, 68, 90, 29, 34, 17. A new line is shown after
a swap of two elements is done. The elements to the right of the vertical bar are in their final positions
and are not considered in subsequent iterations of the algorithm
The number of key swaps depends on the input. For the worst case of decreasing arrays, it is
the same as the number of key comparisons.
Observation: if a pass through the list makes no exchanges, the list has been sorted and we can
stop the algorithm. Though this improved version runs faster on some inputs, it is still in O(n²) in the worst
and average cases. Bubble sort is not very good for large inputs; however, it is very
simple to code. A sketch with this early-exit improvement follows.
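def bubble_sort(a):
    n = len(a)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if a[j + 1] < a[j]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swapped = True
        if not swapped:              # a full pass with no exchanges: sorted, stop
            break
    return a

print(bubble_sort([89, 45, 68, 90, 29, 34, 17]))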
General Lesson From Brute Force Approach
A first application of the brute-force approach often results in an algorithm that can be
improved with a modest amount of effort.
1.4 Sequential Search and Brute Force String Matching.
Sequential Search
Sequential search compares successive elements of a given list with a given search key until either
a match is encountered (successful search) or the list is exhausted without finding a match
(unsuccessful search).
ALGORITHM SequentialSearch2(A [0..n], K)
//The algorithm implements sequential search with a search key as a sentinel
//Input: An array A of n elements and a search key K
//Output: The position of the first element in A[0..n - 1] whose value is
// equal to K or -1 if no such element is found
A[n] ← K
i ← 0
while A[i] ≠ K do
i ← i + 1
if i < n return i
else return −1
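A Python sketch of the sentinel version (copying the list here to avoid mutating the caller's input):

def sequential_search_sentinel(a, key):
    n = len(a)
    a = a + [key]                    # sentinel: the loop needs no bounds check
    i = 0
    while a[i] != key:
        i += 1
    return i if i < n else -1

print(sequential_search_sentinel([4, 8, 15, 16, 23], 16))   # 3
print(sequential_search_sentinel([4, 8, 15, 16, 23], 42))   # -1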
Brute Force String Matching
The brute-force algorithm shifts the pattern almost always after a single character comparison. In the
worst case, however, the algorithm may have to make all m comparisons before shifting the pattern, and this can
happen for each of the n − m + 1 tries. Thus, in the worst case, the algorithm is in Θ(nm).
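A sketch of the brute-force matcher in Python (m character comparisons per alignment in the worst case, n − m + 1 alignments):

def brute_force_match(text, pattern):
    n, m = len(text), len(pattern)
    for i in range(n - m + 1):       # try every alignment of the pattern
        j = 0
        while j < m and pattern[j] == text[i + j]:
            j += 1
        if j == m:
            return i                 # match starts at position i
    return -1

print(brute_force_match("NOBODY_NOTICED_HIM", "NOT"))  # 7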
UNIT - 2
DIVIDE & CONQUER
[Figure: a problem of size n is divided into two sub-problems of smaller size; each sub-problem is solved, and the solutions to the sub-problems are combined into a solution to the original problem.]
NOTE:
The base case for the recursion is a sub-problem of constant size.
The general divide-and-conquer recurrence is T(n) = aT(n/b) + f(n).
Therefore, the order of growth of T(n) depends on the values of the constants a and b and
on the order of growth of the function f(n).
Master theorem
Theorem: If f(n) ∈ Θ(n^d) with d ≥ 0 in the recurrence equation
T(n) = aT(n/b) + f(n),
then
T(n) = Θ(n^d) if a < b^d
T(n) = Θ(n^d log n) if a = b^d
T(n) = Θ(n^(log_b a)) if a > b^d
Example: T(n) = 2T(n/2) + 1
Here a = 2, b = 2, and f(n) = 1 ∈ Θ(n^0), so d = 0.
Therefore:
a > b^d, i.e., 2 > 2^0 = 1
Case 3 of the master theorem holds. Therefore:
T(n) ∈ Θ(n^(log_b a))
∈ Θ(n^(log₂ 2))
∈ Θ(n)
1.3Binary Search
Description:
Binary search is a dichotomic divide-and-conquer search algorithm. It inspects the middle
element of the sorted list. If it is equal to the sought value, then the position has been found.
Otherwise, if the key is less than the middle element, do a binary search on the first half,
else on the second half.
Algorithm:
The algorithm can be implemented recursively or non-recursively. The iterative version:
l ← 0
r ← n − 1
while l ≤ r do
m ← ⌊(l + r) / 2⌋
if key == A[m]
return m
else
if key < A[m]
r ← m − 1
else
l ← m + 1
return −1
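The same non-recursive algorithm in Python (a sketch):

def binary_search(a, key):
    l, r = 0, len(a) - 1
    while l <= r:
        m = (l + r) // 2             # middle element of the current range
        if key == a[m]:
            return m
        elif key < a[m]:
            r = m - 1                # continue in the first half
        else:
            l = m + 1                # continue in the second half
    return -1

print(binary_search([3, 14, 27, 31, 39, 42, 55, 70], 31))  # 3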
Analysis:
• Input size: Array size, n
• Basic operation: key comparison
• Depends on the input:
Best case – key matched with the mid element on the first comparison
Worst case – key not in the list, or key found only after the maximum number of comparisons
• Let C(n) denote the number of times the basic operation is executed. Then
Cworst(n) = worst-case efficiency. Since after each comparison the algorithm
divides the problem into half the size, we have
Cworst(n) = Cworst(n/2) + 1 for n > 1
C(1) = 1
• Solving the recurrence equation using the master theorem, to give the number of
times the search key is compared with an element in the array, we have:
C(n) = C(n/2) + 1
a = 1
b = 2
f(n) = 1 ∈ Θ(n^0); d = 0
Since a = b^d (1 = 2^0), case 2 holds:
C(n) = Θ(n^d log n)
= Θ(n^0 log n)
= Θ(log n)
Applications of binary search:
• Number guessing game
• Word lists/search dictionary etc
Advantages:
• Efficient on very large lists
• Can be implemented iteratively or recursively
Limitations:
• Interacts poorly with the memory hierarchy
• Requires the given list to be sorted
• Due to random access of list elements, needs arrays instead of linked lists
1.4Merge Sort
Definition:
Merge sort is a sort algorithm that splits the items to be sorted into two groups,
recursively sorts each group, and merges them into a final sorted sequence.
Features:
• Is a comparison based algorithm
• Is a stable algorithm
• Is a perfect example of divide & conquer algorithm design strategy
• It was invented by John von Neumann
Algorithm:
ALGORITHM Mergesort(A[0… n − 1])
//sorts array A[0… n − 1] by recursive merge sort
if n > 1
copy A[0… (n/2 − 1)] to B[0… (n/2 − 1)]
copy A[n/2… n − 1] to C[0… (n/2 − 1)]
Mergesort ( B[0… (n/2 − 1)] )
Mergesort ( C[0… (n/2 − 1)] )
Merge ( B, C, A )
ALGORITHM Merge ( B[0… p-1], C[0… q-1], A[0… p+q-1] )
//merges two sorted arrays into one sorted array
//i/p: arrays B, C, both sorted
//o/p: Sorted array A of elements from B & C
i ← 0
j ← 0
k ← 0
while i < p and j < q do
if B[i] ≤ C[j]
A[k] ← B[i]
i ← i + 1
else
A[k] ← C[j]
j ← j + 1
k ← k + 1
if i == p
copy C[j… q − 1] to A[k… (p + q − 1)]
else
copy B[i… p − 1] to A[k… (p + q − 1)]
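A compact Python sketch of Mergesort and Merge combined (this version returns a new list rather than sorting in place):

def merge_sort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    b = merge_sort(a[:mid])          # sort the first half
    c = merge_sort(a[mid:])          # sort the second half
    out, i, j = [], 0, 0
    while i < len(b) and j < len(c): # merge the two sorted halves
        if b[i] <= c[j]:
            out.append(b[i]); i += 1
        else:
            out.append(c[j]); j += 1
    out.extend(b[i:])                # copy whatever remains
    out.extend(c[j:])
    return out

print(merge_sort([6, 3, 7, 8, 2, 4, 5, 1]))  # [1, 2, 3, 4, 5, 6, 7, 8]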
Example:
Apply merge sort for the following list of elements: 6, 3, 7, 8, 2, 4, 5, 1
Split: 6 3 7 8 2 4 5 1
→ 6 3 7 8 | 2 4 5 1
→ 6 3 | 7 8 | 2 4 | 5 1
→ 6 | 3 | 7 | 8 | 2 | 4 | 5 | 1
Merge: 3 6 | 7 8 | 2 4 | 1 5
→ 3 6 7 8 | 1 2 4 5
→ 1 2 3 4 5 6 7 8
Analysis:
• Input size: Array size, n
• Basic operation: key comparison
• Best, worst, average case exists:
Worst case: During key comparison, neither of the two arrays becomes empty
before the other one contains just one element.
• Let C(n) denote the number of times the basic operation is executed. Then
C(n) = 2C(n/2) + Cmerge(n) for n > 1
C(1) = 0
where Cmerge(n) is the number of key comparisons made during the merging stage.
In the worst case:
Cmerge(n) = n − 1 (one comparison per element placed, until one array runs out)
Cmerge(1) = 0
• Solving the recurrence equation using the master theorem:
C(n) = 2C(n/2) + n − 1 for n > 1
C(1) = 0
Here a = 2
b = 2
f(n) = n; d = 1
Therefore a = b^d (2 = 2^1), and case 2 holds:
C(n) = Θ(n^d log n)
= Θ(n^1 log n)
= Θ(n log n)
Advantages:
• Number of comparisons performed is nearly optimal.
• Mergesort will never degrade to O(n²)
• It can be applied to files of any size
Limitations:
• Uses O(n) additional memory.
1.6 Quick Sort and its performance
Definition:
Quick sort is a well-known sorting algorithm, based on the divide & conquer approach. The
steps are:
1. Pick an element called pivot from the list
2. Reorder the list so that all elements which are less than the pivot come before the
pivot and all elements greater than pivot come after it. After this partitioning, the
pivot is in its final position. This is called the partition operation
3. Recursively sort the sub-list of lesser elements and sub-list of greater elements.
Features:
• Developed by C.A.R. Hoare
• Efficient algorithm
• NOT stable sort
• Significantly faster in practice, than other algorithms
ALGORITHM Quicksort (A[ l …r ])
//sorts by quick sort
//i/p: A sub-array A[l..r] of A[0..n-1],defined by its left and right indices l and r
//o/p: The sub-array A[l..r], sorted in ascending order
if l < r
s ← Partition (A[l..r]) // s is a split position
Quicksort(A[l..s − 1])
Quicksort(A[s + 1..r])
ALGORITHM Partition (A[l ..r])
//Partitions a sub-array by using its first element as a pivot
//i/p: A sub-array A[l..r] of A[0..n-1], defined by its left and right indices l and r (l < r)
//o/p: A partition of A[l..r], with the split position returned as this function‘s value
p ← A[l]
i ← l
j ← r + 1
repeat
repeat i ← i + 1 until A[i] ≥ p //left-right scan
repeat j ← j − 1 until A[j] ≤ p //right-left scan
if i < j //need to continue with the scan
swap(A[i], A[j])
until i ≥ j //no need to scan
swap(A[l], A[j])
return j
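A Python sketch of quick sort with the first element as pivot, following the Partition pseudocode above:

def partition(a, l, r):
    p = a[l]                         # pivot: first element of the sub-array
    i, j = l, r + 1
    while True:
        i += 1
        while i < r and a[i] < p:    # left-right scan
            i += 1
        j -= 1
        while a[j] > p:              # right-left scan
            j -= 1
        if i >= j:                   # scans have crossed: stop
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]          # put the pivot in its final position
    return j

def quicksort(a, l=0, r=None):
    if r is None:
        r = len(a) - 1
    if l < r:
        s = partition(a, l, r)       # s is a split position
        quicksort(a, l, s - 1)
        quicksort(a, s + 1, r)
    return a

print(quicksort([5, 3, 1, 9, 8, 2, 4, 7]))  # [1, 2, 3, 4, 5, 7, 8, 9]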
Example: Sort by quick sort the following list: 5, 3, 1, 9, 8, 2, 4, 7, show recursion tree.
Analysis:
• Input size: Array size, n
• Basic operation: key comparison
• Best, worst, average case exists:
Best case: when the partition happens in the middle of the array each time.
Worst case: when the input is already sorted. During key comparisons, one half is
empty, while the remaining n − 1 elements are on the other partition.
• Let C(n) denote the number of times the basic operation is executed in the worst case.
Then
C(n) = C(n − 1) + (n + 1) for n > 1 (2 sub-problems of size 0 and n − 1 respectively)
C(1) = 1
which solves to C(n) ∈ Θ(n²).
Best case:
C(n) = 2C(n/2) + Θ(n) (2 sub-problems of size n/2 each), which solves to Θ(n log n).
NOTE:
The quick sort efficiency in the average case is Θ(n log n) on random input.
UNIT - 3
THE GREEDY METHOD
3.1 The General Method
3.2 Knapsack Problem
3.3 Job Sequencing with Deadlines
3.4 Minimum-Cost Spanning Trees
3.5 Prim's Algorithm
3.6 Kruskal’s Algorithm
3.7 Single Source Shortest Paths.
The method:
• Applicable to optimization problems ONLY
• Constructs a solution through a sequence of steps
• Each step expands a partially constructed solution so far, until a complete solution
to the problem is reached.
On each step, the choice made must be
• Feasible: it has to satisfy the problem‘s constraints
• Locally optimal: it has to be the best local choice among all feasible choices
available on that step
• Irrevocable: Once made, it cannot be changed on subsequent steps of the
algorithm
NOTE:
• Greedy method works best when applied to problems with the greedy-choice
property
• A globally-optimal solution can always be found by a series of local
improvements from a starting configuration.
∴ The time needed by the job sequencing algorithm JS is O(sn); since s ≤ n, the worst-case time is O(n²).
If di = n − i + 1 for 1 ≤ i ≤ n, JS takes Θ(n²) time.
D and J need Θ(s) amount of space.
3.4 Minimum-Cost Spanning Trees
Spanning Tree
Spanning tree is a connected acyclic sub-graph (tree) of the given graph (G) that includes
all of G‘s vertices
[Figure: a weighted connected graph and its spanning trees.]
Definition:
MST of a weighted, connected graph G is defined as: A spanning tree of G with
minimum total weight.
Example: Consider the example of spanning tree:
For the given graph there are three possible spanning trees. Among them the spanning
tree with the minimum weight 6 is the MST for the given graph
Algorithm:
ALGORITHM Prim (G)
//Prim‘s algorithm for constructing a MST
//Input: A weighted connected graph G = { V, E }
//Output: ET the set of edges composing a MST of G
// the set of tree vertices can be initialized with any vertex
VT ← { v0 }
ET ← Ø
for i ← 1 to |V| − 1 do
Find a minimum-weight edge e* = (v*, u*) among all the edges (v, u) such
that v is in VT and u is in V − VT
VT ← VT ∪ { u* }
ET ← ET ∪ { e* }
return ET
STEP 1: Start with a tree, T0, consisting of one vertex
STEP 2: "Grow" the tree one vertex/edge at a time
• Construct a series of expanding sub-trees T1, T2, … Tn−1
• At each stage construct Ti+1 from Ti by adding the minimum-weight edge
connecting a vertex in the tree (Ti) to one vertex not yet in the tree, chosen from
"fringe" edges (this is the "greedy" step!)
The algorithm stops when all vertices are included.
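A Python sketch of Prim's algorithm using a binary heap as the priority queue (the graph below is the one from the example that follows):

import heapq

def prim(graph, start):
    # graph: dict vertex -> list of (neighbor, weight); returns the MST edges
    visited = {start}
    fringe = [(w, start, u) for u, w in graph[start]]
    heapq.heapify(fringe)
    mst = []
    while fringe and len(visited) < len(graph):
        w, v, u = heapq.heappop(fringe)      # minimum-weight fringe edge
        if u in visited:
            continue
        visited.add(u)
        mst.append((v, u, w))
        for x, wx in graph[u]:
            if x not in visited:
                heapq.heappush(fringe, (wx, u, x))
    return mst

g = {'a': [('b', 3), ('e', 6), ('f', 5)],
     'b': [('a', 3), ('c', 1), ('f', 4)],
     'c': [('b', 1), ('d', 6), ('f', 4)],
     'd': [('c', 6), ('e', 8), ('f', 5)],
     'e': [('a', 6), ('d', 8), ('f', 2)],
     'f': [('a', 5), ('b', 4), ('c', 4), ('d', 5), ('e', 2)]}
print(prim(g, 'a'))   # MST edges of total weight 15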
Example:
Apply Prim‘s algorithm for the following graph to find MST.
[Graph: vertices a, b, c, d, e, f with edge weights ab = 3, ae = 6, af = 5, bc = 1, bf = 4, cd = 6, cf = 4, df = 5, de = 8, ef = 2.]
Solution:
Tree vertices | Remaining vertices (nearest tree vertex, distance) | Selected
a(−, −) | b(a, 3), c(−, ∞), d(−, ∞), e(a, 6), f(a, 5) | b(a, 3)
b(a, 3) | c(b, 1), d(−, ∞), e(a, 6), f(b, 4) | c(b, 1)
c(b, 1) | d(c, 6), e(a, 6), f(b, 4) | f(b, 4)
f(b, 4) | d(f, 5), e(f, 2) | e(f, 2)
e(f, 2) | d(f, 5) | d(f, 5)
The MST consists of edges ab, bc, bf, ef, df with total weight 3 + 1 + 4 + 2 + 5 = 15.
Efficiency:
Efficiency of Prim‘s algorithm is based on data structure used to store priority queue.
• Unordered array: efficiency Θ(n²)
• Binary min-heap: for a graph with n nodes and m edges, efficiency O((n + m) log n)
Conclusion:
• Prim's algorithm is a "vertex-based algorithm"
• Prim's algorithm needs a priority queue for locating the nearest vertex
The choice of priority queue matters in a Prim implementation:
o Array — optimal for dense graphs
o Binary heap — better for sparse graphs
o Fibonacci heap — best in theory, but not in practice
3.6 Kruskal’s Algorithm
Algorithm:
The method:
STEP 1: Sort the edges by increasing weight
STEP 2: Start with a forest having |V| trees (each vertex is its own tree)
STEP 3: The number of trees is reduced by ONE at every inclusion of an edge
At each stage:
• Among the edges not yet included, select the one with minimum
weight AND which does not form a cycle
• The edge reduces the number of trees by one by combining two trees of
the forest
The algorithm stops when |V| − 1 edges are included in the MST, i.e., when the number of
trees in the forest is reduced to ONE. A union-find sketch follows.
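def kruskal(n_vertices, edges):
    # edges: list of (weight, u, v) tuples
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:        # follow parent links to the tree root
            x = parent[x]
        return x
    def union(u, v):
        ru, rv = find(u), find(v)
        if ru == rv:
            return False             # u and v already in one tree: cycle
        parent[ru] = rv              # combine two trees of the forest
        return True
    mst = []
    for w, u, v in sorted(edges):    # STEP 1: sort edges by weight
        if union(u, v):
            mst.append((u, v, w))
        if len(mst) == n_vertices - 1:
            break                    # |V| - 1 edges included: stop
    return mst

edges = [(3,'a','b'), (5,'a','f'), (6,'a','e'), (1,'b','c'), (4,'b','f'),
         (4,'c','f'), (6,'c','d'), (5,'d','f'), (8,'d','e'), (2,'e','f')]
print(kruskal(6, edges))   # bc, ef, ab, bf, df -- total weight 15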
Example:
Apply Kruskal‘s algorithm for the following graph to find MST.
[Graph: the same weighted graph as in the Prim example above — vertices a–f, edges ab = 3, af = 5, ae = 6, bc = 1, bf = 4, cf = 4, cd = 6, df = 5, de = 8, ef = 2.]
Solution:
The list of edges is:
Edge:   ab af ae bc bf cf cd df de ef
Weight:  3  5  6  1  4  4  6  5  8  2
Sort the edges in ascending order:
Edge:   bc ef ab bf cf af df ae cd de
Weight:  1  2  3  4  4  5  5  6  6  8
Edge | Weight | Insertion status | Insertion order
bc | 1 | YES | 1
ef | 2 | YES | 2
ab | 3 | YES | 3
bf | 4 | YES | 4
cf | 4 | NO (would form cycle b–c–f) | –
af | 5 | NO (would form cycle a–b–f) | –
df | 5 | YES | 5
The algorithm stops as |V| − 1 = 5 edges are included in the MST (total weight 15).
Efficiency:
Efficiency of Kruskal‘s algorithm is based on the time needed for sorting the edge
weights of a given graph.
• With an efficient sorting algorithm: Efficiency: Θ(|E| log |E| )
Conclusion:
• Kruskal's algorithm is an "edge-based algorithm"
• Prim's algorithm with a heap is faster than Kruskal's algorithm
3.7 Single Source Shortest Paths.
VT ← Ø
for i ← 0 to |V| − 1 do
u* ← DeleteMin(Q)
//expanding the tree, choosing the locally best vertex
VT ← VT ∪ {u*}
for every vertex u in V − VT that is adjacent to u* do
if Du* + w(u*, u) < Du
Du ← Du* + w(u*, u); Pu ← u*
Decrease(Q, u, Du)
The method
Dijkstra‘s algorithm solves the single source shortest path problem in 2 stages.
Stage 1: A greedy algorithm computes the shortest distance from source to all other
nodes in the graph and saves in a data structure.
Stage 2 : Uses the data structure for finding a shortest path from source to any vertex v.
• At each step, and for each vertex x, keep track of a “distance” D(x)
and a directed path P(x) from root to vertex x of length D(x).
• Scan first from the root and take initial paths P( r, x ) = ( r, x ) with
D(x) = w( rx ) when rx is an edge,
D(x) = ∞ when rx is not an edge.
For each temporary vertex y distinct from x, set
D(y) = min{ D(y), D(x) + w(xy) }
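A Python sketch of both stages, with a min-heap priority queue (distances computed in stage 1; paths recoverable from the penultimate-vertex map):

import heapq

def dijkstra(graph, source):
    # graph: dict vertex -> list of (neighbor, weight), nonnegative weights
    dist = {v: float('inf') for v in graph}
    prev = {v: None for v in graph}       # penultimate vertex on best path
    dist[source] = 0
    pq = [(0, source)]
    done = set()
    while pq:
        d, u = heapq.heappop(pq)          # locally best (closest) vertex
        if u in done:
            continue
        done.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:           # relax edge (u, v)
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(pq, (dist[v], v))
    return dist, prev

g = {'a': [('b', 3), ('e', 6), ('f', 5)],
     'b': [('a', 3), ('c', 1), ('f', 4)],
     'c': [('b', 1), ('d', 6), ('f', 4)],
     'd': [('c', 6), ('e', 8), ('f', 5)],
     'e': [('a', 6), ('d', 8), ('f', 2)],
     'f': [('a', 5), ('b', 4), ('c', 4), ('d', 5), ('e', 2)]}
dist, prev = dijkstra(g, 'a')
print(dist)   # {'a': 0, 'b': 3, 'c': 4, 'd': 10, 'e': 6, 'f': 5}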
Example:
Apply Dijkstra‘s algorithm to find Single source shortest paths with vertex a as the
source.
[Graph: the same weighted graph as in the Prim/Kruskal examples — vertices a–f, edges ab = 3, af = 5, ae = 6, bc = 1, bf = 4, cf = 4, cd = 6, df = 5, de = 8, ef = 2.]
Solution:
Length Dv of the shortest path from source s = a to every vertex v in V, and penultimate vertex Pv:
Da = 0, Pa = null
Db = ∞, Pb = null
Dc = ∞, Pc = null
Dd = ∞, Pd = null
De = ∞, Pe = null
Df = ∞, Pf = null
Tree vertex | Remaining vertices (distance, via) | Selected
a(−, 0) | b(a, 3), c(−, ∞), d(−, ∞), e(a, 6), f(a, 5) | b(a, 3)
b(a, 3) | c(b, 3 + 1 = 4), d(−, ∞), e(a, 6), f(a, 5) | c(b, 4)
c(b, 4) | d(c, 4 + 6 = 10), e(a, 6), f(a, 5) | f(a, 5)
f(a, 5) | d(c, 10), e(a, 6) | e(a, 6)
e(a, 6) | d(c, 10) | d(c, 10)
The algorithm stops: no edges remain to scan. The resulting shortest paths from a:
Da = 0, Pa = a; Db = 3, path [a, b]; Dc = 4, path [a, b, c]; Dd = 10, path [a, b, c, d]; De = 6, path [a, e]; Df = 5, path [a, f]
Conclusion:
• Doesn't work with negative weights
• Applicable to both undirected and directed graphs
• With an unordered array as the priority queue: efficiency Θ(n²)
• With a min-heap as the priority queue: efficiency O(m log n)
UNIT - 4
Dynamic Programming
4.1 The General Method
4.2 Warshall’s Algorithm
4.3 Floyd’s Algorithm for the All-Pairs Shortest Paths Problem
4.4 Single-Source Shortest Paths
4.5 General Weights 0/1 Knapsack
4.6 The Traveling Salesperson problem.
4.1 The General Method
Definition
Dynamic programming (DP) is a general algorithm design technique for solving
problems with overlapping sub-problems. The technique was invented by the American
mathematician Richard Bellman in the 1950s.
Key Idea
The key idea is to save answers to overlapping smaller sub-problems to avoid re-
computation.
Dynamic Programming Properties
• An instance is solved using the solutions for smaller instances.
• The solutions for a smaller instance might be needed multiple times, so store their
results in a table.
• Thus each smaller instance is solved only once.
• Additional space is used to save time.
Dynamic Programming vs. Divide & Conquer
LIKE divide & conquer, dynamic programming solves problems by combining solutions
to sub-problems. UNLIKE divide & conquer, sub-problems are NOT independent in
dynamic programming.
• Top-down algorithms: logically progress from the initial instance down to the
smallest sub-instances via intermediate sub-instances.
• Bottom-up algorithms: the smallest sub-problems are explicitly solved first, and
their results are used to construct solutions to progressively larger sub-instances.
Dynamic Programming vs. Divide & Conquer: EXAMPLE
Computing Fibonacci Numbers
F(n) = 0 if n = 0
F(n) = 1 if n = 1
F(n) = F(n − 1) + F(n − 2) if n > 1
Algorithm F(n)
// Computes the nth Fibonacci number recursively by using its definition
// Input: A non-negative integer n
// Output: The nth Fibonacci number
if n==0 || n==1 then
return n
else
return F(n-1) + F(n-2)
[Recursion tree: F(n) branches into F(n − 1) and F(n − 2), which recompute the same overlapping sub-problems many times.]
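Saving the answers of the overlapping sub-problems turns the exponential recursion into a linear one. A minimal Python sketch:

def fib(n, memo={0: 0, 1: 1}):
    # table of already-solved sub-problems avoids re-computation
    if n not in memo:
        memo[n] = fib(n - 1) + fib(n - 2)
    return memo[n]

print([fib(i) for i in range(10)])   # [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]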
Example (Warshall's algorithm — computing the transitive closure of a digraph):
[Digraph with vertices A, B, C, D; edges A→C, B→A, B→D, D→B.]
Solution:
R(0) =
      A B C D
A     0 0 1 0
B     1 0 0 1
C     0 0 0 0
D     0 1 0 0
R(0), k = 1 (vertex 1 = A can be an intermediate node):
R1[2,3] = R0[2,3] OR (R0[2,1] AND R0[1,3]) = 0 OR (1 AND 1) = 1
R(1) =
      A B C D
A     0 0 1 0
B     1 0 1 1
C     0 0 0 0
D     0 1 0 0
R(1), k = 2 (vertices {1, 2} can be intermediate nodes):
R2[4,1] = R1[4,1] OR (R1[4,2] AND R1[2,1]) = 0 OR (1 AND 1) = 1
R2[4,3] = R1[4,3] OR (R1[4,2] AND R1[2,3]) = 0 OR (1 AND 1) = 1
R2[4,4] = R1[4,4] OR (R1[4,2] AND R1[2,4]) = 0 OR (1 AND 1) = 1
R(2) =
      A B C D
A     0 0 1 0
B     1 0 1 1
C     0 0 0 0
D     1 1 1 1
R(2), k = 3 (vertices {1, 2, 3} can be intermediate nodes): NO CHANGE
R(3) = R(2)
R(3), k = 4 (vertices {1, 2, 3, 4} can be intermediate nodes):
R4[2,2] = R3[2,2] OR (R3[2,4] AND R3[4,2]) = 0 OR (1 AND 1) = 1
R(4) =
      A B C D
A     0 0 1 0
B     1 1 1 1
C     0 0 0 0
D     1 1 1 1
R(4) is the TRANSITIVE CLOSURE.
Efficiency:
• Time efficiency is Θ(n³)
• Space efficiency: requires extra space for separate matrices recording
intermediate results of the algorithm.
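A Python sketch of Warshall's algorithm on the example digraph above:

def warshall(adj):
    # adj: n x n 0/1 adjacency matrix; returns the transitive closure
    n = len(adj)
    r = [row[:] for row in adj]
    for k in range(n):               # allow vertex k as an intermediate node
        for i in range(n):
            for j in range(n):
                r[i][j] = r[i][j] or (r[i][k] and r[k][j])
    return r

adj = [[0, 0, 1, 0],   # A -> C
       [1, 0, 0, 1],   # B -> A, B -> D
       [0, 0, 0, 0],   # C has no outgoing edges
       [0, 1, 0, 0]]   # D -> B
for row in warshall(adj):
    print(row)          # rows A..D of the transitive closure shown above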
4.3 Floyd's Algorithm for the All-Pairs Shortest Paths Problem
Some useful definitions:
• Weighted Graph: Each edge has a weight (associated numerical value). Edge
weights may represent costs, distance/lengths, capacities, etc. depending on the
problem.
• Weight matrix: W(i, j) is
o 0 if i = j
o ∞ if there is no edge between i and j
o the weight of the edge if there is an edge between i and j
Problem statement:
Given a weighted graph G(V, E), the all-pairs shortest paths problem is to find the
shortest path between every pair of vertices (vi, vj) ∈ V.
Solution:
A number of algorithms are known for solving All pairs shortest path problem
• Matrix multiplication based algorithm
• Dijkstra's algorithm
• Bellman-Ford algorithm
• Floyd's algorithm
Underlying idea of Floyd's algorithm:
• Let W denote the initial weight matrix.
• Let D^(k)[i, j] denote the cost of the shortest path from i to j whose intermediate vertices
are a subset of {1, 2, …, k}.
• Recursive definition
Case 1:
A shortest path from vi to vj restricted to using only vertices from {v1, v2, …, vk}
as intermediate vertices does not use vk. Then
D^(k)[i, j] = D^(k−1)[i, j].
Case 2:
A shortest path from vi to vj restricted to using only vertices from {v1, v2, …, vk}
as intermediate vertices does use vk. Then
D^(k)[i, j] = D^(k−1)[i, k] + D^(k−1)[k, j].
We conclude:
D^(k)[i, j] = min { D^(k−1)[i, j], D^(k−1)[i, k] + D^(k−1)[k, j] }
Algorithm:
Algorithm Floyd(W[1..n, 1..n])
// Implements Floyd's algorithm
// Input: Weight matrix W
// Output: Distance matrix of the shortest paths' lengths
D ← W
for k ← 1 to n do
for i ← 1 to n do
for j ← 1 to n do
D[i, j] ← min { D[i, j], D[i, k] + D[k, j] }
return D
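The same triple loop in Python (a sketch; ∞ is represented by float('inf')):

INF = float('inf')

def floyd(w):
    n = len(w)
    d = [row[:] for row in w]        # D <- W
    for k in range(n):
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

w = [[0, 2, 5],        # A
     [4, 0, INF],      # B
     [INF, 3, 0]]      # C
for row in floyd(w):
    print(row)          # [0, 2, 5], [4, 0, 9], [7, 3, 0]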
Example:
Find the all-pairs shortest paths for the given weighted connected graph using Floyd's
algorithm.
[Digraph with vertices A, B, C; edges A→B = 2, A→C = 5, B→A = 4, C→B = 3.]
Solution:
D(0) =
      A  B  C
A     0  2  5
B     4  0  ∞
C     ∞  3  0
k = 1 (paths through A): B→C improves to 4 + 5 = 9.
D(1) =
      A  B  C
A     0  2  5
B     4  0  9
C     ∞  3  0
k = 2 (paths through B): C→A improves to 3 + 4 = 7.
D(2) =
      A  B  C
A     0  2  5
B     4  0  9
C     7  3  0
k = 3 (paths through C): no further improvement, so D(3) = D(2) is the final distance matrix.
4.4 0/1 Knapsack Problem & Memory Function
Definition:
Given a set of n items of known weights w1,…,wn and values v1,…,vn and a knapsack
of capacity W, the problem is to find the most valuable subset of the items that fit into the
knapsack.
Knapsack problem is an OPTIMIZATION PROBLEM
Step 1:
Identify the smaller sub-problems. If items are labeled 1..n, then a sub-problem would be
to find an optimal solution for Sk = {items labeled 1, 2, .. k}
Step 2:
Recursively define the value of an optimal solution in terms of solutions to smaller
problems.
Initial conditions:
V[ 0, j ] = 0 for j ≥ 0
V[ i, 0 ] = 0 for i ≥ 0
Recursive step:
V[i, j] = max { V[i−1, j], vi + V[i−1, j − wi] } if j − wi ≥ 0
V[i, j] = V[i−1, j] if j − wi < 0
Step 3:
Bottom up computation using iteration
Question:
Apply the bottom-up dynamic programming algorithm to the following instance of the
knapsack problem (item data as used in the computation below):
Item:    1  2  3  4
Weight:  2  3  4  5
Value:   3  4  5  6
Capacity W = 5
Solution:
Using the dynamic programming approach, we have:
Step | Calculation
1 | Initial conditions: V[0, j] = 0 for j ≥ 0; V[i, 0] = 0 for i ≥ 0
2 | w1 = 2, available capacity j = 1: w1 > j, case 1 holds: V[1,1] = V[0,1] = 0
3 | w1 = 2, j = 2: j − w1 ≥ 0, case 2 holds: V[1,2] = max{ V[0,2], 3 + V[0,0] } = max{0, 3} = 3
4 | w1 = 2, j = 3, 4, 5: case 2 holds: V[1,j] = max{ V[0,j], 3 + V[0,j−2] } = 3
5 | w2 = 3, j = 1: w2 > j, case 1 holds: V[2,1] = V[1,1] = 0
6 | w2 = 3, j = 2: w2 > j, case 1 holds: V[2,2] = V[1,2] = 3
7 | w2 = 3, j = 3: case 2 holds: V[2,3] = max{ V[1,3], 4 + V[1,0] } = max{3, 4} = 4
8 | w2 = 3, j = 4: case 2 holds: V[2,4] = max{ V[1,4], 4 + V[1,1] } = max{3, 4} = 4
9 | w2 = 3, j = 5: case 2 holds: V[2,5] = max{ V[1,5], 4 + V[1,2] } = max{3, 7} = 7
10 | w3 = 4, j = 1, 2, 3: w3 > j, case 1 holds: V[3,j] = V[2,j]
11 | w3 = 4, j = 4: case 2 holds: V[3,4] = max{ V[2,4], 5 + V[2,0] } = max{4, 5} = 5
12 | w3 = 4, j = 5: case 2 holds: V[3,5] = max{ V[2,5], 5 + V[2,1] } = max{7, 5} = 7
13 | w4 = 5, j = 1, 2, 3, 4: w4 > j, case 1 holds: V[4,j] = V[3,j]
14 | w4 = 5, j = 5: case 2 holds: V[4,5] = max{ V[3,5], 6 + V[3,0] } = max{7, 6} = 7
The completed table:
V[i, j] | j=0  1  2  3  4  5
i=0     |   0  0  0  0  0  0
i=1     |   0  0  3  3  3  3
i=2     |   0  0  3  4  4  7
i=3     |   0  0  3  4  5  7
i=4     |   0  0  3  4  5  7
The maximal value is V[4, 5] = 7.
To find the composition of the optimal subset, trace back from V[4, 5]:
V[4, 5] = V[3, 5] → ITEM 4 NOT included in the subset
V[3, 5] = V[2, 5] → ITEM 3 NOT included in the subset
V[2, 5] ≠ V[1, 5] → ITEM 2 included in the subset; remaining capacity 5 − 3 = 2
V[1, 2] ≠ V[0, 2] → ITEM 1 included in the subset
The optimal subset is {item 1, item 2}, with total weight 5 and total value 7.
Efficiency:
• Running time of Knapsack problem using dynamic programming algorithm is:
O( n * W )
• Time needed to find the composition of an optimal solution is: O( n + W )
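A bottom-up Python sketch for the instance above (weights 2, 3, 4, 5; values 3, 4, 5, 6; W = 5):

def knapsack(weights, values, W):
    n = len(weights)
    V = [[0] * (W + 1) for _ in range(n + 1)]   # initial conditions: zeros
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            if j - weights[i - 1] >= 0:         # item i fits: take the max
                V[i][j] = max(V[i - 1][j],
                              values[i - 1] + V[i - 1][j - weights[i - 1]])
            else:                               # item i does not fit
                V[i][j] = V[i - 1][j]
    return V

V = knapsack([2, 3, 4, 5], [3, 4, 5, 6], 5)
print(V[4][5])   # 7  (items 1 and 2: weight 2 + 3 = 5, value 3 + 4 = 7)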
Memory function
The method:
• Uses top-down manner.
• Maintains table as in bottom-up approach.
• Initially, all the table entries are initialized with a special "null" symbol to
indicate that they have not yet been calculated.
• Whenever a new value needs to be calculated, the method checks the
corresponding entry in the table first:
• If the entry is NOT "null", it is simply retrieved from the table.
• Otherwise, it is computed by the recursive call, whose result is then recorded in
the table.
Algorithm:
Algorithm MFKnap(i, j)
// Uses as global variables the input arrays Weights[1..n], Values[1..n],
// and table V[0..n, 0..W] whose entries are initialized with 0 in row 0
// and column 0 and with −1 ("null") everywhere else
if V[i, j] < 0
if j < Weights[i]
value ← MFKnap(i − 1, j)
else
value ← max { MFKnap(i − 1, j), Values[i] + MFKnap(i − 1, j − Weights[i]) }
V[i, j] ← value
return V[i, j]
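The same method in Python (a sketch; the nested function plays the role of the global table access):

def mf_knapsack(weights, values, W):
    n = len(weights)
    # row 0 and column 0 hold 0; every other entry starts as -1 ("null")
    V = [[0] * (W + 1)] + [[0] + [-1] * W for _ in range(n)]
    def mfk(i, j):
        if V[i][j] < 0:                         # not yet calculated
            if j < weights[i - 1]:
                V[i][j] = mfk(i - 1, j)
            else:
                V[i][j] = max(mfk(i - 1, j),
                              values[i - 1] + mfk(i - 1, j - weights[i - 1]))
        return V[i][j]                          # retrieved or just computed
    return mfk(n, W)

print(mf_knapsack([2, 3, 4, 5], [3, 4, 5, 6], 5))   # 7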
Example:
Apply the memory function method to the same instance of the knapsack problem
(weights 2, 3, 4, 5; values 3, 4, 5, 6; capacity W = 5).
Solution:
Using the memory function approach, we have:
1. Initially, all nontrivial table entries are initialized with the "null" symbol (here the value −1); row 0 and column 0 hold 0.
2. MFKnap(4, 5) calls MFKnap(3, 5) and 6 + MFKnap(3, 0); MFKnap(3, 5) calls MFKnap(2, 5) and 5 + MFKnap(2, 1); MFKnap(2, 5) calls MFKnap(1, 5) and 4 + MFKnap(1, 2).
3. MFKnap(1, 5) = max{ MFKnap(0, 5), 3 + MFKnap(0, 3) } = max{0, 3 + 0} = 3, recorded as V[1, 5] = 3. Similarly MFKnap(1, 2) = max{ MFKnap(0, 2), 3 + MFKnap(0, 0) } = 3, recorded as V[1, 2] = 3.
4. Then V[2, 5] = max{3, 4 + 3} = 7. The chain MFKnap(2, 1) → MFKnap(1, 1) → MFKnap(0, 1) records V[1, 1] = 0 and V[2, 1] = 0, so V[3, 5] = max{7, 5 + 0} = 7 and finally V[4, 5] = max{7, 6 + 0} = 7.
The final table (−1 marks entries that were never computed):
V[i, j] | j=0  1  2  3  4  5
i=0     |   0  0  0  0  0  0
i=1     |   0  0  3 −1 −1  3
i=2     |   0  0 −1 −1 −1  7
i=3     |   0 −1 −1 −1 −1  7
i=4     |   0 −1 −1 −1 −1  7
Efficiency:
• Time efficiency same as bottom up algorithm: O( n * W ) + O( n + W )
• Just a constant factor gain by using memory function
• Less space efficient than a space efficient version of a bottom-up algorithm
UNIT-5
DECREASE-AND-CONQUER APPROACHES, SPACE-TIME TRADEOFFS
5.1 INTRODUCTION:
Decrease & conquer is a general algorithm design strategy based on exploiting the
relationship between a solution to a given instance of a problem and a solution to a
smaller instance of the same problem. The exploitation can be either top-down
(recursive) or bottom-up (non-recursive).
Insertion Sort
Description:
Insertion sort is an application of the decrease & conquer technique (the decrease-by-one,
i.e., decrease-by-a-constant, variation). It is a comparison-based
sort in which the sorted array is built up one entry at a time.
Algorithm:
ALGORITHM Insertionsort(A [0 … n-1] )
//sorts a given array by insertion sort
//i/p: Array A[0…n-1]
//o/p: sorted array A[0…n-1] in ascending order
for i ← 1 to n − 1 do
V ← A[i]
j ← i − 1
while j ≥ 0 and A[j] > V do
A[j + 1] ← A[j]
j ← j − 1
A[j + 1] ← V
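The algorithm in Python (a sketch):

def insertion_sort(a):
    for i in range(1, len(a)):
        v, j = a[i], i - 1
        while j >= 0 and a[j] > v:   # shift larger elements one step right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v                 # insert the saved element in its place
    return a

print(insertion_sort([89, 45, 68, 90, 29, 34, 17]))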
Analysis:
• Input size: Array size, n
• Basic operation: key comparison
• Best, worst, average case exists
Best case: when input is a sorted array in ascending order:
Worst case: when input is a sorted array in descending order:
• Let Cworst(n) be the number of key comparisons in the worst case. Then
Cworst(n) = Σ (i = 1 to n − 1) i = n(n − 1)/2 ∈ Θ(n²)
Example:
Sort the following list of elements using insertion sort:
89, 45, 68, 90, 29, 34, 17
89 | 45 68 90 29 34 17
45 89 | 68 90 29 34 17
45 68 89 | 90 29 34 17
45 68 89 90 | 29 34 17
29 45 68 89 90 | 34 17
29 34 45 68 89 90 | 17
17 29 34 45 68 89 90
Advantages of insertion sort:
• Simple implementation. There are three variations
o Left to right scan
o Right to left scan
o Binary insertion sort
• Efficient on small lists of elements and on almost-sorted lists
• Running time is linear in the best case
• Is a stable algorithm
• Is an in-place algorithm
5.3 DEPTH-FIRST SEARCH (DFS) AND BREADTH-FIRST SEARCH (BFS)
DFS and BFS are two graph traversing algorithms and follow decrease and conquer
approach – decrease by one variation to traverse the graph
Algorithm:
ALGORITHM DFS (G)
//implements DFS traversal of a given graph
//i/p: Graph G = { V, E}
//o/p: DFS tree
mark each vertex in V with 0 as a mark of being "unvisited"
count ← 0
for each vertex v in V do
if v is marked with 0
dfs(v)
dfs(v)
count ← count + 1
mark v with count
for each vertex w in V adjacent to v do
if w is marked with 0
dfs(w)
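A Python sketch of DFS on an adjacency-list graph (the mark dictionary plays the role of the visit counter):

def dfs(graph):
    # graph: dict vertex -> list of neighbors; returns vertices in visit order
    count = 0
    mark = {v: 0 for v in graph}
    order = []
    def visit(v):
        nonlocal count
        count += 1
        mark[v] = count
        order.append(v)
        for w in graph[v]:
            if mark[w] == 0:
                visit(w)             # recursion acts as the explicit stack
    for v in graph:
        if mark[v] == 0:             # restart once per connected component
            visit(v)
    return order

g = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a'], 'd': ['b']}
print(dfs(g))   # ['a', 'b', 'd', 'c']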
Example:
Starting at vertex A traverse the following graph using DFS traversal method:
[Graph: vertices A, B, C, D (top row) and E, F, G, H (bottom row), with edges as in the original figure.]
Solution:
Stack trace (the first number in parentheses is the order pushed, the second the order popped):
1. Start at A: push A — A(1)
2. Visit B: push B — B(2)
3. Visit F: push F — F(3)
4. Visit E: push E — E(4)
5. NO unvisited adjacent vertex for E; backtrack: pop E — E(4, 1)
6. NO unvisited adjacent vertex for F; backtrack: pop F — F(3, 2)
7. Visit G: push G — G(5)
8. Visit C: push C — C(6)
9. Visit D: push D — D(7)
10. Visit H: push H — H(8)
11. NO unvisited adjacent vertex for H; backtrack: pop H — H(8, 3)
12. NO unvisited adjacent vertex for D; backtrack: pop D — D(7, 4)
13. NO unvisited adjacent vertex for C; backtrack: pop C — C(6, 5)
14. NO unvisited adjacent vertex for G; backtrack: pop G — G(5, 6)
15. NO unvisited adjacent vertex for B; backtrack: pop B — B(2, 7)
16. NO unvisited adjacent vertex for A; backtrack: pop A — A(1, 8)
The stack becomes empty. The algorithm stops, as all the nodes in the given graph are visited.
[DFS tree figure omitted.]
Applications of DFS:
• The two orderings are advantageous for various applications like topological
sorting, etc
• To check connectivity of a graph (number of times stack becomes empty tells the
number of components in the graph)
• To check if a graph is acyclic. (no back edges indicates no cycle)
• To find articulation point in a graph
Efficiency:
• Depends on the graph representation:
o Adjacency matrix: Θ(n²)
o Adjacency list: Θ(n + e)
Breadth-first search (BFS)
Description:
• BFS starts visiting vertices of a graph at an arbitrary vertex by marking it as
visited.
• It visits the graph's vertices by moving across to all the neighbors of the last visited vertex
• Instead of a stack, BFS uses a queue
• Similar to level-by-level tree traversal
• "Redraws" the graph in tree-like fashion (with tree edges and cross edges
for an undirected graph)
Algorithm:
ALGORITHM BFS (G)
//implements BFS traversal of a given graph
//i/p: Graph G = { V, E}
//o/p: BFS tree/forest
Mark each vertex in V with 0 as a mark of being "unvisited"
count ← 0
for each vertex v in V do
if v is marked with 0
bfs(v)
bfs(v)
count ← count + 1
mark v with count and initialize a queue with v
while the queue is NOT empty do
for each vertex w in V adjacent to the front vertex v do
if w is marked with 0
count ← count + 1
mark w with count
add w to the queue
remove vertex v from the front of the queue
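The same traversal in Python with an explicit queue (a sketch):

from collections import deque

def bfs(graph):
    count = 0
    mark = {v: 0 for v in graph}
    order = []
    for s in graph:
        if mark[s]:
            continue                 # restart once per connected component
        count += 1; mark[s] = count; order.append(s)
        q = deque([s])
        while q:
            v = q.popleft()
            for w in graph[v]:
                if mark[w] == 0:     # mark and enqueue unvisited neighbors
                    count += 1; mark[w] = count; order.append(w)
                    q.append(w)
    return order

g = {'a': ['b', 'c'], 'b': ['a', 'd'], 'c': ['a'], 'd': ['b']}
print(bfs(g))   # ['a', 'b', 'c', 'd']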
Example:
Starting at vertex A traverse the following graph using BFS traversal method:
[Graph: vertices A, B, C, D (top row) and E, F, G, H (bottom row), with edges as in the original figure.]
Solution:
Queue trace (numbers in parentheses give the order in which vertices are marked):
1. Start at A: mark A — A(1); queue: A
2. Dequeue A; insert its unvisited neighbors B, E — B(2), E(3); queue: B, E
3. Dequeue B; insert F, G — F(4), G(5); queue: E, F, G
4. Dequeue E; NO unvisited adjacent vertex; queue: F, G
5. Dequeue F; NO unvisited adjacent vertex; queue: G
6. Dequeue G; insert C, H — C(6), H(7); queue: C, H
7. Dequeue C; insert D — D(8); queue: H, D
8. Dequeue H; NO unvisited adjacent vertex; queue: D
9. Dequeue D; NO unvisited adjacent vertex; queue empty
The queue becomes empty. The algorithm stops, as all the nodes in the given graph are visited.
[BFS tree: A at the root; B, E at level 1; F, G at level 2; C, H at level 3; D at level 4.]
Applications of BFS:
• To check connectivity of a graph (number of times queue becomes empty tells the
number of components in the graph)
• To check if a graph is acyclic. (no cross edges indicates no cycle)
• To find minimum edge path in a graph
Efficiency:
• Depends on the graph representation:
o Array : Θ(n2)
o List: Θ(n + e)
Topological Sorting
NOTE: There is no solution for topological sorting if there is a cycle in the digraph
[the digraph MUST be a DAG]
DFS Method:
• Perform a DFS traversal and note the order in which vertices become dead ends
(popped off the stack)
• Reversing this order yields the topological sorting
Example:
Apply DFS – based algorithm to solve the topological sorting problem for the given
graph:
[Digraph: C1 → C3, C2 → C3, C3 → C4, C3 → C5, C4 → C5 (course prerequisite example).]
Solution (DFS stack trace):
1. Start at C1: push C1 — C1(1)
2. Visit C3: push C3 — C3(2)
3. Visit C4: push C4 — C4(3)
4. Visit C5: push C5 — C5(4)
5. NO unvisited adjacent vertex for C5; pop C5 — C5(4, 1)
6. NO unvisited adjacent vertex for C4; pop C4 — C4(3, 2)
7. NO unvisited adjacent vertex for C3; pop C3 — C3(2, 3)
8. NO unvisited adjacent vertex for C1; pop C1 — C1(1, 4)
The stack becomes empty, but there is a node which is unvisited; therefore start the
DFS again, arbitrarily selecting an unvisited node (C2) as the source:
9. Push C2 — C2(5); NO unvisited adjacent vertex for C2; pop C2 — C2(5, 5)
The stack becomes empty and NO unvisited node is left, therefore the algorithm
stops. The popping-off order is:
C5, C4, C3, C1, C2
Topologically sorted list (reverse of the pop order): C2, C1, C3, C4, C5
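A Python sketch of the DFS-based method on the course digraph (edge lists assumed as in the figure above):

def topological_sort(digraph):
    # reverse of the order in which vertices become dead ends (pop order)
    visited, popped = set(), []
    def visit(v):
        visited.add(v)
        for w in digraph[v]:
            if w not in visited:
                visit(w)
        popped.append(v)             # v is a dead end now
    for v in digraph:
        if v not in visited:
            visit(v)
    return popped[::-1]              # reverse the popping-off order

dag = {'C1': ['C3'], 'C2': ['C3'], 'C3': ['C4', 'C5'],
       'C4': ['C5'], 'C5': []}
print(topological_sort(dag))   # ['C2', 'C1', 'C3', 'C4', 'C5']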
Example:
Apply Source removal – based algorithm to solve the topological sorting problem for the
given graph:
[Digraph: the same course prerequisite example as above — C1 → C3, C2 → C3, C3 → C4, C3 → C5, C4 → C5.]
Solution (repeatedly delete a source, i.e., a vertex with no incoming edges, along with its outgoing edges):
Delete C1 → then delete C2 → C3 becomes a source; delete C3 → C4 becomes a source; delete C4 → C5 becomes a source; delete C5.
The topological order obtained is: C1, C2, C3, C4, C5
5.5 SPACE-TIME TRADEOFFS:
Introduction
Two varieties of space-for-time algorithms:
• input enhancement — preprocess the input (or its part) to store some info
to be used later in solving the problem
• counting sorts
• string searching algorithms
• prestructuring — preprocess the input to make accessing its elements easier
(e.g., hashing, indexing with B-trees)
5.7 INPUT ENHANCEMENT IN STRING MATCHING.
Horspool’s Algorithm
A simplified version of Boyer-Moore algorithm: preprocesses pattern to
generate a shift table that determines how much to shift the patter when a
mismatch occurs. Always makes a shift based on the text‘s character c aligned
with the last compared (mismatched) character in the pattern according to the shift
table‘s entry for c
void horspoolInitocc()
{
    /* builds the occurrence/shift table occ[] for pattern p of length m;
       the loop body here is reconstructed (assumes globals: p, m, occ[],
       alphabetsize, matching the search routine below) */
    int j;
    char a;
    for (a = 0; a < alphabetsize; a++)
        occ[a] = -1;                 /* default: character not in pattern */
    for (j = 0; j < m - 1; j++)
    {
        a = p[j];
        occ[a] = j;                  /* rightmost occurrence in p[0..m-2] */
    }
}
void horspoolSearch()
{
    /* searches for pattern p (length m) in text t (length n) using occ[] */
    int i = 0, j;
    while (i <= n - m)
    {
        j = m - 1;
        while (j >= 0 && p[j] == t[i + j]) j--;   /* compare right to left */
        if (j < 0) report(i);                     /* full match at position i */
        i += m - 1;
        i -= occ[t[i]];                           /* shift by the table entry */
    }
}
Time complexity
• For random text, the average number of comparisons for one text character is between 1/σ and 2/(σ + 1), where σ is the size of the alphabet.
6.1 LOWER-BOUND ARGUMENTS
6.2 DECISION TREES
6.3 P, NP, AND NP-COMPLETE PROBLEMS
Objectives
We now move into the third and final major theme for this course.
1. Tools for analyzing algorithms.
2. Design strategies for designing algorithms.
3. Identifying and coping with the limitations of algorithms.
Efficiency of an algorithm
• By establishing the asymptotic efficiency class
• The efficiency class for selection sort (quadratic) is lower. Does this mean that
selection sort is a "better" algorithm?
– Such a comparison is like comparing "apples" to "oranges"
• By analyzing how efficient a particular algorithm is compared to other algorithms for
the same problem
– It is desirable to know the best possible efficiency any algorithm solving
This problem may have – establishing a lower bound
Lower bound: an estimate on a minimum amount of work needed to solve a given problem
Examples:
• number of comparisons needed to find the largest element in a set of n numbers
• number of comparisons needed to sort an array of size n
• number of comparisons necessary for searching in a sorted array
• number of multiplications needed to multiply two n-by-n matrices
Lower bound can be
– an exact count
– an efficiency class (Ω)
• Tight lower bound: there exists an algorithm with the same efficiency as the lower
bound
Problem | Lower bound | Tightness
sorting | Ω(n log n) | yes
searching in a sorted array | Ω(log n) | yes
element uniqueness | Ω(n log n) | yes
n-digit integer multiplication | Ω(n) | unknown
multiplication of n-by-n matrices | Ω(n²) | unknown
Deriving a Lower Bound from Decision Trees
• How does such a tree help us find lower bounds?
– There must be at least one leaf for each correct output.
– The tree must be tall enough to have that many leaves.
• In a binary tree with l leaves and height h,
h ≥ log2 l
Decision Tree and Sorting Algorithms
Decision-tree example
Decision-tree model
A decision tree can model the execution of any comparison sort:
• One tree for each input size n.
• View the algorithm as splitting whenever it compares two elements.
• The tree contains the comparisons along all possible instruction traces.
• The running time of the algorithm = the length of the path taken.
• Worst-case running time = height of tree.
Decision Tree Model
• In the insertion sort example, the decision tree reveals all possible key comparison
sequences for 3 distinct numbers.
• There are exactly 3!=6 possible output sequences.
• Different comparison sorts should generate different decision trees.
• It should be clear that, in theory, we should be able to draw a decision tree for ANY
comparison sort algorithm.
• Given a particular input sequence, the path from the root to a leaf traces a particular
key comparison sequence performed by that comparison sort.
- The length of that path represents the number of key comparisons performed by
the sorting algorithm.
• When we come to a leaf, the sorting algorithm has determined the sorted order.
• Notice that a correct sorting algorithm should be able to produce EVERY possible
sorted order.
• Since there are n! possible sorted orders, there are n! leaves in the decision tree.
• Given a decision tree, the height of the tree represents the longest length of a root-to-leaf
path.
• It follows that the height of the decision tree represents the largest number of key
comparisons, which is the worst-case running time of the sorting algorithm.
"Any comparison based sorting algorithm takes Ω(n log n) to sort a list of n distinct
elements in the worst case."
– any comparison sort ← model by a decision tree
– worst-case running time ← the height of decision tree
―Any comparison based sorting algorithm takes Ω(n logn) to sort a list of n distinct
elements in the worst-case.‖
• We want to find a lower bound (Ω) on the height of a binary tree that has n! leaves.
What is the minimum height of a binary tree that has n! leaves?
• The height is minimized when the binary tree is complete (recall the definition of a complete tree).
• Hence the minimum (lower bound) height is Θ(log₂(n!)).
• log₂(n!)
= log₂(n) + log₂(n − 1) + … + log₂(n/2) + …
≥ (n/2) log₂(n/2) = (n/2) log₂(n) − n/2
So log₂(n!) = Ω(n log n).
• It follows that the height of a binary tree which has n! leaves is at least Ω(n log n), so the
worst-case running time is at least Ω(n log n).
• Putting everything together, we have
"Any comparison based sorting algorithm takes Ω(n log n) to sort a list of n
distinct elements in the worst case."
Adversary Arguments
Adversary argument: a method of proving a lower bound by playing role of adversary that makes
algorithm work the hardest by adjusting input
Example: "Guessing" a number between 1 and n with yes/no questions
Adversary: Puts the number in the larger of the two subsets generated by the last question
6.3 CLASS P
P: the class of decision problems that are solvable in O(p(n)) time, where p(n) is a polynomial of
problem‘s input size n
Examples:
• searching
• element uniqueness
• graph connectivity
• graph acyclicity
• primality testing (finally proved in 2002)
6.4 CLASS NP
NP (nondeterministic polynomial): class of decision problems whose proposed solutions can be
verified in polynomial time = solvable by a nondeterministic polynomial algorithm
A nondeterministic polynomial algorithm is an abstract two-stage procedure that:
• generates a random string purported to solve the problem
• checks whether this solution is correct in polynomial time
By definition, it solves the problem if it‘s capable of generating and verifying a solution on one
of its tries
Why this definition?
• It led to the development of the rich theory called "computational complexity"
What problems are in NP?
• Hamiltonian circuit existence
• Partition problem: Is it possible to partition a set of n integers into two disjoint subsets
with the same sum?
• Decision versions of TSP, knapsack problem, graph coloring, and many other
combinatorial optimization problems. (Few exceptions include: MST, shortest paths)
• All the problems in P can also be solved in this manner (but no guessing is necessary), so
we have:
P ⊆ NP
• Big question: P = NP ?
P = NP ?
• One of the most important unsolved problems in computer science is whether or not
P = NP.
– If P = NP, then a ridiculous number of problems currently believed to be very
difficult will turn out to have efficient algorithms.
– If P ≠ NP, then those problems definitely do not have polynomial time solutions.
• Most computer scientists suspect that P ≠ NP. These suspicions are based partly on the
idea of NP-completeness.
[Figure: a candidate problem is proved NP-complete by reducing a known NP-complete problem to it.]
Example (SAT): given a Boolean formula in CNF over the variables x1 … x9, of the form
(x1 ∨ x2 ∨ x3 ∨ x4) ∧ (x5 ∨ x6 ∨ x7) ∧ x8 ∧ x9
is it possible to assign values to the inputs x1 … x9 so that the formula evaluates to TRUE?
- If the answer is YES with a proof (i.e., an assignment of input values), then we can check the
proof in polynomial time (SAT is in NP)
- We may not be able to check a NO answer in polynomial time (nobody really knows)
• NP-hard
- A problem is NP-hard iff a polynomial-time algorithm for it implies a polynomial-
time algorithm for every problem in NP
- NP-hard problems are at least as hard as NP problems
• NP-complete
- A problem is NP-complete if it is NP-hard and is an element of NP (NP-easy)
• To prove that a decision problem C (e.g., the decision version of an optimization problem)
is NP-complete, show that:
– C is in NP
– Some known NP-hard (or NP-complete) problem reduces to C in polynomial time (≤p C)
– Thus a proof must show these two conditions being satisfied
Examples
• Longest path problem: (similar to Shortest path problem, which requires polynomial
time) suspected to require exponential time, since there is no known polynomial
algorithm.
• Hamiltonian Cycle problem: Traverses all vertices exactly once and form a cycle.
Reduction
• P1: an unknown problem (easy/hard?)
• P2: known to be difficult
If we can easily transform instances of P2 and solve P2 using P1 as a subroutine, then P1 is at least as difficult as P2.
The inputs for P1 must be created in polynomial time.
* P1 is definitely difficult, because you know you cannot solve P2 in polynomial time unless you
use a component that is also difficult (it cannot be the mapping, since the mapping is known to be
polynomial)
Decision Problems
Represent problem as a decision with a boolean output
– Easier to solve when comparing to other problems
– Hence all problems are converted to decision problems.
P = {all decision problems that can be solved in polynomial time}
NP = {all decision problems where a proposed solution can be verified in polynomial time}
NP-complete: the subset of NP containing the "hardest problems"
Alternative Representation
• Every element p in P1 can map to an element q in P2 such that p is true (decision
problem) if and only if q is also true.
• Must find a mapping for such true elements in P1 and P2, as well as for false elements.
• Ensure that mapping can be done in polynomial time.
• *Note: P1 is unknown, P2 is difficult
Cook's Theorem
• Stephen Cook (Turing award winner) found the first NP-complete problem, SAT
(satisfiability, often presented in its 3SAT form).
Basically a problem from logic.
Generally described using a Boolean formula.
A Boolean formula involves AND, OR, NOT operators and some variables.
Ex: (x or y) and (x or z), where x, y, z are Boolean variables.
Problem definition: Given a Boolean formula of m clauses, each containing n
Boolean variables, can you assign some values to these variables so that the
formula evaluates to TRUE?
Boolean formula: (x ∨ y ∨ z̄) ∧ (x̄ ∨ y ∨ z)
A brute-force approach tries all 2ⁿ assignments — an exponential set of possible
solutions; SAT is an NP-complete problem.
• Having one definite NP-complete problem means others can also be proven NP-
complete, using reduction.
Unit 7
COPING WITH LIMITATIONS OF ALGORITHMIC POWER
7.1 Backtracking: n - Queens problem,
7.2 Hamiltonian Circuit Problem,
7.3 Subset –Sum Problem.
7.4 Branch-and-Bound: Assignment Problem,
7.5 Knapsack Problem,
7.6 Traveling Salesperson Problem.
7.7 Approximation Algorithms for NP-Hard Problems – Traveling Salesperson Problem,
Knapsack Problem
Introduction
Tackling Difficult Combinatorial Problems
• There are two principal approaches to tackling difficult combinatorial problems (NP-hard
problems):
• Use a strategy that guarantees solving the problem exactly but doesn‘t guarantee to find a
solution in polynomial time
• Use an approximation algorithm that can find an approximate (sub-optimal) solution in
polynomial time
7.1 Backtracking
• Suppose you have to make a series of decisions, among various choices, where
– You don‘t have enough information to know what to choose
– Each decision leads to a new set of choices
– Some sequence of choices (possibly more than one) may be a solution to your
problem
• Backtracking is a methodical way of trying out various sequences of decisions, until you
find one that "works"
Backtracking : A Scenario
Example:
n-Queens Problem
Place n queens on an n-by-n chess board so that no two of them are in the same row, column, or
diagonal
State-Space Tree of the 4-Queens Problem
7.1.1N-Queens Problem:
• The object is to place queens on a chess board in such a way that no queen can capture
another one in a single move
– Recall that a queen can move horizontally, vertically, or diagonally an infinite distance
• This implies that no two queens can be on the same row, column, or diagonal
– We usually want to know how many different placements there are
4-Queens
• Let's take a look at the simple problem of placing 4 queens on a 4x4 board
• The brute-force solution is to place the first queen, then the second, third, and fourth
– After all are placed, we determine if they are placed legally
• There are 16 spots for the first queen, 15 for the second, etc.
– Leading to 16 × 15 × 14 × 13 = 43,680 different combinations
• Obviously this isn't a good way to solve the problem
• First let's use the fact that no two queens can be in the same column to help us
– That means we get to place a queen in each column
• So we can place the first queen into the first column, the second into the second, etc.
• This cuts down on the amount of work
– Now there are 4 spots for the first queen, 4 spots for the second, etc.
• 4 × 4 × 4 × 4 = 256 different combinations
• However, we can still do better, because as we place each queen we can look at the
previous queens we have placed to make sure our new queen is not in the same row or
diagonal as a previously placed queen
• Then we could use a greedy-like strategy to select the next valid position for each column
– As you walk through the maze of positions you have to make a series of choices
– If one of your choices leads to a dead end, you need to back up to the last choice
you made and take a different route
• That is, you need to change one of your earlier selections
– Eventually you will find your way out of the maze
• This type of problem is often viewed as a state-space tree
– A tree of all the states that the problem can be in
• We start with an empty board state at the root and try to work our way down to a
leaf node
– Leaf nodes are completed boards
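A Python backtracking sketch for n-queens that places one queen per row and checks columns and diagonals against earlier placements:

def n_queens(n):
    solutions = []
    cols = []                        # cols[r] = column of the queen in row r
    def safe(r, c):
        for r0, c0 in enumerate(cols):
            if c0 == c or abs(r - r0) == abs(c - c0):
                return False         # same column or same diagonal
        return True
    def place(r):
        if r == n:                   # all rows filled: a completed board
            solutions.append(cols[:])
            return
        for c in range(n):
            if safe(r, c):
                cols.append(c)
                place(r + 1)
                cols.pop()           # backtrack: undo the choice
    place(0)
    return solutions

print(n_queens(4))   # [[1, 3, 0, 2], [2, 0, 3, 1]] -- the two 4-queens boards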
Background
• NP-complete problem:
– Most difficult problems in NP (non- deterministic polynomial time)
• A decision problem D is NP-complete if it is complete for NP, meaning that:
– it is in NP
– it is NP-hard (every other problem in NP is reducible to it.)
• As they grow large, we are not able to solve them in a reasonable time (polynomial time)
Alternative Definition
• An NP problem such as Hamiltonian Cycle:
– No polynomial-time algorithm is known for solving it
– Given a solution, it is easy to verify in polynomial time
[Figure: graph with vertices a, b, c, d, e, f for the Hamiltonian circuit example.]
[Figure: state-space tree for the subset-sum instance w = {3, 5, 6, 7}, S = 15. Each level decides "with wi" / "without wi"; nodes show the running sum. Nodes are pruned (×) when sum + remaining weight < 15 (e.g., 0 + 13 < 15, 3 + 7 < 15, 5 + 7 < 15, 8 with nothing left < 15) or when sum + next weight > 15 (e.g., 14 + 7 > 15, 9 + 7 > 15, 11 + 7 > 15). The path with 3, with 5, without 6, with 7 reaches 3 + 5 + 7 = 15: the solution.]
7.3 SUBSET –SUM PROBLEM.
• Problem: Given n positive integers w1, ... wn and a positive integer S. Find all subsets
of w1, ... wn that sum to S.
• Example:
n=3, S=6, and w1=2, w2=4, w3=6
• Solutions:
{2,4} and {6}
• We will assume a binary state space tree.
• The nodes at depth 1 are for including (yes, no) item 1, the nodes at depth 2 are for
item 2, etc.
• The left branch includes wi, and the right branch excludes wi.
• The nodes contain the sum of the weights included so far
Backtracking algorithm
void checknode (node v) {
node u
if (promising ( v ))
if (aSolutionAt( v ))
write the solution
else //expand the node
for ( each child u of v )
checknode ( u )
}
Checknode
• Checknode uses the functions:
– promising(v) which checks that the partial solution represented by v can lead to
the required solution
– aSolutionAt(v) which checks whether the partial solution represented by node v
solves the problem.
Sum of subsets – when is a node "promising"?
• Consider a node at depth i
• weightSoFar = weight of the node, i.e., the sum of the numbers included in the partial solution
the node represents
• totalPossibleLeft = weight of the remaining items i+1 to n (for a node at depth i)
• A node at depth i is non-promising
if (weightSoFar + totalPossibleLeft < S)
or (weightSoFar + w[i+1] > S)
• To be able to use this "promising function" the wi must be sorted in non-decreasing order
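A Python sketch of the backtracking search with this promising test (weights sorted non-decreasing, as required):

def subset_sum(w, S):
    w = sorted(w)                    # needed for the overshoot test
    solutions = []
    def expand(i, chosen, so_far, left):
        if so_far == S:              # a solution at this node
            solutions.append(chosen[:])
            return
        if i == len(w):
            return
        # non-promising: not enough weight left, or the next item overshoots
        if so_far + left < S or so_far + w[i] > S:
            return
        chosen.append(w[i])          # left branch: include w[i]
        expand(i + 1, chosen, so_far + w[i], left - w[i])
        chosen.pop()                 # right branch: exclude w[i]
        expand(i + 1, chosen, so_far, left - w[i])
    expand(0, [], 0, sum(w))
    return solutions

print(subset_sum([2, 4, 6], 6))   # [[2, 4], [6]]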
Bounding
• A bound on a node is a guarantee that any solution obtained from expanding the node
will be:
– Greater than some number (lower bound)
– Or less than some number (upper bound)
• If we are looking for a minimal optimum, as we are in weighted graph coloring, then we
need a lower bound
– For example, if the best solution we have found so far has a cost of 12 and the
lower bound on a node is 15, then there is no point in expanding the node
• The node cannot lead to anything better than 15
• We can compute a lower bound for weighted graph coloring in the following way:
– The actual cost of getting to the node
– Plus a bound on the future cost
• Minimum weight color × number of nodes still to color
– That is, the future cost cannot be any better than this
• Recall that we could either perform a depth-first or a breadth-first search
– Without bounding, it didn‘t matter which one we used because we had to expand
the entire tree to find the optimal solution
– Does it matter with bounding?
• Hint: think about when you can prune via bounding
• We prune (via bounding) when:
(currentBestSolutionCost <= nodeBound)
• This tells us that we get more pruning if:
– The currentBestSolution is low
– And the nodeBound is high
• So we want to find a low solution quickly and we want the highest possible lower bound