Iiyr It - Cs8451 Daa - 5 Units Notes
OBJECTIVES:
The student should be made to:
Learn the algorithm analysis techniques.
Become familiar with the different algorithm design techniques.
Understand the limitations of Algorithm power.
UNIT I INTRODUCTION 9
Notion of an Algorithm – Fundamentals of Algorithmic Problem Solving – Important
Problem Types – Fundamentals of the Analysis of Algorithm Efficiency – Analysis
Framework – Asymptotic Notations and its properties – Mathematical analysis for
Recursive and Non-recursive algorithms.
TEXT BOOK:
1. Anany Levitin, “Introduction to the Design and Analysis of Algorithms”, Third Edition,
Pearson Education, 2012.
REFERENCES:
1. Thomas H.Cormen, Charles E.Leiserson, Ronald L. Rivest and Clifford Stein,
“Introduction to Algorithms”, Third Edition, PHI Learning Private Limited, 2012.
2. Alfred V. Aho, John E. Hopcroft and Jeffrey D. Ullman, “Data Structures and
Algorithms”, Pearson Education, Reprint 2006.
3. Donald E. Knuth, “The Art of Computer Programming”, Volumes 1& 3 Pearson
Education, 2009.
4. Steven S. Skiena, “The Algorithm Design Manual”, Second Edition, Springer, 2008.
5. http://nptel.ac.in/
CS8451 __ Design and Analysis of Algorithms _ Unit I ______1.1
UNIT I INTRODUCTION
Problem to be solved
Algorithm
An algorithm is a step-by-step procedure that takes an input and solves a problem in a finite amount of time,
producing the required output.
Characteristics of an algorithm:
Input: Zero or more quantities are externally supplied.
Output: At least one quantity is produced.
Definiteness: Each instruction is clear and unambiguous.
Finiteness: If the instructions of an algorithm are traced, then in all cases the algorithm must
terminate after a finite number of steps.
Efficiency: Every instruction must be very basic and must run in a short time.
PREPARED BY :Mrs.M.Hema,AP/IT,EEC
Example:
The greatest common divisor (GCD) of two nonnegative integers m and n (not both zero),
denoted gcd(m, n), is defined as the largest integer that divides both m and n evenly, i.e., with a
remainder of zero.
Euclid’s algorithm is based on applying repeatedly the equality gcd(m, n) = gcd(n, m mod n),
where m mod n is the remainder of the division of m by n, until m mod n is equal to 0. Since
gcd(m, 0) = m, the last value of m is also the greatest common divisor of the initial m and n.
For example, gcd(60, 24) can be computed as follows: gcd(60, 24) = gcd(24, 12) = gcd(12, 0) = 12.
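The repeated application of gcd(m, n) = gcd(n, m mod n) can be sketched in Python as follows. This is a minimal illustration, not part of the original notes; the function name gcd is chosen here for convenience.

```python
def gcd(m: int, n: int) -> int:
    """Euclid's algorithm for nonnegative m and n, not both zero."""
    while n != 0:
        # Replace (m, n) by (n, m mod n) until the remainder becomes 0.
        m, n = n, m % n
    return m  # gcd(m, 0) = m


print(gcd(60, 24))  # 12, matching gcd(60, 24) = gcd(24, 12) = gcd(12, 0)
```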
approximate answer. E.g., extracting square roots, solving nonlinear equations, and
evaluating definite integrals.
(c) Algorithm Design Techniques
An algorithm design technique (or “strategy” or “paradigm”) is a general
approach to solving problems algorithmically that is applicable to a variety of
problems from different areas of computing.
Algorithms + Data Structures = Programs
Although algorithms and data structures are independent notions, they are combined
to develop a program. Hence a proper data structure must be chosen
before the algorithm is designed.
An algorithm can be implemented only when both the algorithm and its data
structures have been specified.
An algorithmic strategy (technique, paradigm) is a general approach by which
many problems can be solved algorithmically. E.g., Brute Force, Divide and
Conquer, Dynamic Programming, Greedy Technique and so on.
(iii) Methods of Specifying an Algorithm
Algorithm Specification
An algorithm can be specified in natural language, in pseudocode or as a flowchart. Natural language and pseudocode are the two
options that are most widely used nowadays.
a. Natural Language
It is very simple and easy to specify an algorithm using natural language. However, a
natural-language specification is often unclear, because the resulting description tends to be brief
and ambiguous.
Example: An algorithm to perform addition of two numbers.
Step 1: Read the first number, say a.
Step 2: Read the second number, say b.
Step 3: Add the above two numbers and store the result in c.
Step 4: Display the result from c.
Such a specification creates difficulty while actually implementing it. Hence many programmers
prefer to have specification of algorithm by means of Pseudocode.
b. Pseudocode
Pseudocode is a mixture of a natural language and programming language constructs.
Pseudocode is usually more precise than natural language.
In pseudocode, a left arrow “←” denotes assignment, two slashes “//” introduce a comment, and
if conditions, for loops and while loops are written much as in a programming language.
ALGORITHM Sum(a,b)
//Problem Description: This algorithm performs addition of two numbers
//Input: Two integers a and b
//Output: Addition of two integers
c←a+b
return c
This specification is more useful because it can be implemented directly in any programming language.
c. Flowchart
In the earlier days of computing, the dominant method for specifying algorithms was the flowchart,
but this representation technique has proved to be inconvenient.
A flowchart is a graphical representation of an algorithm. It is a method of expressing an algorithm by
a collection of connected geometric shapes containing descriptions of the algorithm’s steps.
FIGURE 1.4 Flowchart symbols (transition / assignment, condition / decision, flow connectivity) and an example flowchart for the addition of two integers a and b.
Once an algorithm has been specified, its correctness must be proved:
the algorithm must yield the required result for every legitimate input in a finite amount of
time.
For example, the correctness of Euclid’s algorithm for computing the greatest
common divisor stems from the correctness of the equality gcd(m, n) = gcd(n, m mod n).
A common technique for proving correctness is to use mathematical induction because an
algorithm’s iterations provide a natural sequence of steps needed for such proofs.
The notion of correctness for approximation algorithms is less straightforward than it is for
exact algorithms. The error produced by the algorithm should not exceed a predefined
limit.
(v) Analyzing an Algorithm
The most important property of an algorithm is its efficiency. There are two kinds of algorithm
efficiency:
Time efficiency, indicating how fast the algorithm runs, and
Space efficiency, indicating how much extra memory it uses.
The efficiency of an algorithm is determined by measuring both time efficiency and space
efficiency.
So factors to analyze an algorithm are:
Time efficiency of an algorithm
Space efficiency of an algorithm
Simplicity of an algorithm
Generality of an algorithm
(vi) Coding an Algorithm
An algorithm is coded (implemented) in a suitable programming language
such as C, C++ or Java.
The transition from an algorithm to a program can be done incorrectly or very
inefficiently. The algorithm must therefore be implemented correctly, and its power
should not be reduced by an inefficient implementation.
Standard tricks like computing a loop’s invariant (an expression that does not change its
value) outside the loop, collecting common subexpressions, replacing expensive
operations by cheap ones, selection of programming language and so on should be known to
the programmer.
Typically, such improvements can speed up a program only by a constant factor, whereas a
better algorithm can make a difference in running time by orders of magnitude. But once
an algorithm is selected, a 10–50% speedup may be worth an effort.
It is essential to write optimized (efficient) code, which also reduces the burden on the
compiler.
(i) Sorting
The sorting problem is to rearrange the items of a given list in nondecreasing
(ascending) order.
Sorting can be done on numbers, characters, strings or records.
For example, student records can be sorted in alphabetical order of names, by student number or by student
grade-point average. Such a specially chosen piece of information is called a key.
A sorting algorithm is said to be in-place if it does not require extra memory, except possibly for a few memory units. E.g., Quick sort.
A sorting algorithm is called stable if it preserves the relative order of any two equal
elements in its input.
(ii) Searching
The searching problem deals with finding a given value, called a search key, in a given set.
E.g., ordinary linear search and the faster binary search.
(iii) String processing
A string is a sequence of characters from an alphabet.
Strings of particular interest are text strings, which comprise letters, numbers, and special characters; bit strings, which comprise zeros
and ones; and gene sequences, which can be modeled by strings of characters from the four-character
alphabet {A, C, G, T}. Gene sequences are very useful in bioinformatics.
Searching for a given word in a text is called string matching.
(iv) Graph problems
A graph is a collection of points called vertices, some of which are connected by line
segments called edges.
Some of the graph problems are graph traversal, shortest paths, topological sorting,
the traveling salesman problem and the graph-coloring problem.
(v) Combinatorial problems
These are problems that ask, explicitly or implicitly, to find a combinatorial object such as a
permutation, a combination, or a subset that satisfies certain constraints.
A desired combinatorial object may also be required to have some additional property such
as a maximum value or a minimum cost.
In practice, combinatorial problems are among the most difficult problems in computing.
The traveling salesman problem and the graph coloring problem are examples of
combinatorial problems.
(vi) Geometric problems
Geometric algorithms deal with geometric objects such as points, lines, and polygons.
Geometric algorithms are used in computer graphics, robotics, and tomography.
The closest-pair problem and the convex-hull problem come under this category.
(vii) Numerical problems
Numerical problems are problems that involve mathematical equations, systems of
equations, computing definite integrals, evaluating functions, and so on.
The majority of such mathematical problems can be solved only approximately.
TABLE 1.1 Values (approximate) of several functions important for analysis of algorithms
n      √n        log2 n   n      n log2 n   n^2       n^3       2^n         n!
1      1         0        1      0          1         1         2           1
2      1.4       1        2      2          4         8         4           2
4      2         2        4      8          16        64        16          24
8      2.8       3        8      2.4•10^1   64        5.1•10^2  2.6•10^2    4.0•10^4
10     3.2       3.3      10     3.3•10^1   10^2      10^3      10^3        3.6•10^6
16     4         4        16     6.4•10^1   2.6•10^2  4.1•10^3  6.5•10^4    2.1•10^13
10^2   10        6.6      10^2   6.6•10^2   10^4      10^6      1.3•10^30   9.3•10^157
10^3   31        10       10^3   1.0•10^4   10^6      10^9      (very big computation)
10^4   10^2      13       10^4   1.3•10^5   10^8      10^12     (very big computation)
10^5   3.2•10^2  17       10^5   1.7•10^6   10^10     10^15     (very big computation)
10^6   10^3      20       10^6   2.0•10^7   10^12     10^18     (very big computation)
In the worst case, either no element of the list matches the search key or the first matching element is
the last one on the list. In the best case, the first element of the list matches the search key.
Worst-case efficiency
The worst-case efficiency of an algorithm is its efficiency for the worst-case input of size n,
i.e., an input of size n for which the algorithm runs the longest among all possible inputs of that size.
For such an input of size n, the running time is Cworst(n) = n.
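The best and worst cases described above can be observed by counting key comparisons. The sketch below assumes the passage refers to sequential (linear) search; the function name sequential_search and the returned comparison count are illustrative choices, not part of the original notes.

```python
def sequential_search(items, key):
    """Return (index of key or -1, number of key comparisons made)."""
    comparisons = 0
    for i, item in enumerate(items):
        comparisons += 1
        if item == key:
            return i, comparisons
    return -1, comparisons


data = [3, 1, 4]
print(sequential_search(data, 3))  # best case: (0, 1)
print(sequential_search(data, 4))  # worst case, match at the end: (2, 3)
print(sequential_search(data, 9))  # worst case, no match: (-1, 3)
```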
Yet another type of efficiency is called amortized efficiency. It applies not to a single
run of an algorithm but rather to a sequence of operations performed on the same data structure.
Example 1:
where g(n) = n^2.
0.
Where t(n) and g(n) are nonnegative functions defined on the set of natural numbers.
O = Asymptotic upper bound = Useful for worst case analysis = Loose bound
= 105n
i.e., 100n + 5 ≤ 105n
i.e., t(n) ≤ cg(n)
(1/2)n(n − 1) = (1/2)n^2 − (1/2)n ≤ (1/2)n^2 for all n ≥ 0.
(1/2)n(n − 1) = (1/2)n^2 − (1/2)n ≥ (1/2)n^2 − [(1/2)n][(1/2)n] for all n ≥ 2,
i.e., (1/2)n(n − 1) ≥ (1/4)n^2.
Hence, (1/4)n^2 ≤ (1/2)n(n − 1) ≤ (1/2)n^2, where c2 = 1/4, c1 = 1/2 and n0 = 2,
so in asymptotic notation (1/2)n(n − 1) ∈ Θ(n^2).
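Since t1(n) ∈ O(g1(n)), there exist a positive constant c1 and a nonnegative integer n1 such that t1(n) ≤ c1g1(n) for all n ≥ n1.
Similarly, since t2(n) ∈ O(g2(n)), t2(n) ≤ c2g2(n) for all n ≥ n2.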
Let us denote c3 = max{c1, c2} and consider n ≥ max{n1, n2} so that we can use
both inequalities. Adding them yields the following:
t1(n) + t2(n) ≤ c1g1(n) + c2g2(n)
≤ c3g1(n) + c3g2(n)
= c3[g1(n) + g2(n)]
≤ 2c3 max{g1(n), g2(n)}.
Hence, t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}), with the constants c and n0 required by the
definition of O being 2c3 = 2 max{c1, c2} and max{n1, n2}, respectively.
The property implies that the algorithm’s overall efficiency will be determined by the part
with a higher order of growth, i.e., its least efficient part:
if t1(n) ∈ O(g1(n)) and t2(n) ∈ O(g2(n)), then t1(n) + t2(n) ∈ O(max{g1(n), g2(n)}).
Summation formulas
EXAMPLE 1: Compute the factorial function F(n) = n! for an arbitrary nonnegative integer n.
Since n! = 1 • . . . • (n − 1) • n = (n − 1)! • n for n ≥ 1, and 0! = 1 by definition, we can compute
F(n) = F(n − 1) • n with the following recursive algorithm. (ND 2015)
ALGORITHM F(n)
//Computes n! recursively
//Input: A nonnegative integer n
//Output: The value of n!
if n = 0 return 1
else return F(n − 1) * n
Algorithm analysis
For simplicity, we consider n itself as an indicator of this algorithm’s input size.
The basic operation of the algorithm is multiplication, whose number of executions we
denote M(n). Since the function F(n) is computed according to the formula F(n) = F(n − 1) • n for n > 0,
the number of multiplications M(n) needed to compute it must satisfy the equality
M(n) = M(n − 1) + 1 for n > 0.
Here M(n − 1) multiplications are spent to compute F(n − 1), and one more multiplication is
needed to multiply the result by n.
Recurrence relations
The last equation defines the sequence M(n) that we need to find. This equation defines
M(n) not explicitly, i.e., as a function of n, but implicitly as a function of its value at another point, namely n − 1. Such equations are called recurrence relations or recurrences.
To solve the recurrence relation means to find an explicit formula for M(n) in terms of n only.
To determine a solution uniquely, we need an initial condition that tells us the value with
which the sequence starts. We can obtain this value by inspecting the condition that makes the
algorithm stop its recursive calls:
if n = 0 return 1.
This tells us two things. First, since the calls stop when n = 0, the smallest value of n for
which this algorithm is executed and hence M(n) defined is 0. Second, by inspecting the pseudocode’s
exiting line, we can see that when n = 0, the algorithm performs no multiplications.
Thus, the recurrence relation and initial condition for the algorithm’s number of multiplications
M(n) are:
M(n) = M(n − 1) + 1 for n > 0,
M(0) = 0.
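The recurrence can be checked empirically by counting multiplications in a direct Python transcription of the recursive algorithm. This is a sketch for illustration only: the name factorial_with_count and the returned counter are assumptions introduced here. The count it reports agrees with the closed form M(n) = n, which follows from adding 1 at each of the n recursive steps starting from M(0) = 0.

```python
def factorial_with_count(n: int):
    """Return (n!, number of multiplications performed)."""
    if n == 0:
        return 1, 0                      # M(0) = 0: no multiplications
    value, mults = factorial_with_count(n - 1)
    return value * n, mults + 1          # M(n) = M(n - 1) + 1


print(factorial_with_count(5))  # (120, 5)
```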
ALGORITHM TOH(n, A, C, B)
//Moves n disks from the source peg to the destination peg recursively
//Input: n disks on source peg A, destination peg C, and auxiliary peg B
//Output: Disks moved to the destination in the same order as on the source
if n = 1
    Move disk from A to C
else
    TOH(n − 1, A, B, C)    //move top n − 1 disks from A to B using C
    Move disk n from A to C
    TOH(n − 1, B, C, A)    //move the n − 1 disks from B to C using A
Algorithm analysis
The number of moves M(n) depends on n only, and we get the following recurrence
equation for it: M(n) = M(n − 1) + 1+ M(n − 1) for n > 1.
With the obvious initial condition M(1) = 1, we have the following recurrence relation for the
number of moves M(n):
M(n) = 2M(n − 1) + 1 for n > 1,
M(1) = 1.
We solve this recurrence by the same method of backward substitutions:
M(n) = 2M(n − 1) + 1                                  sub. M(n − 1) = 2M(n − 2) + 1
= 2[2M(n − 2) + 1] + 1
= 2^2 M(n − 2) + 2 + 1                                sub. M(n − 2) = 2M(n − 3) + 1
= 2^2 [2M(n − 3) + 1] + 2 + 1
= 2^3 M(n − 3) + 2^2 + 2 + 1                          sub. M(n − 3) = 2M(n − 4) + 1
= 2^4 M(n − 4) + 2^3 + 2^2 + 2 + 1
…
= 2^i M(n − i) + 2^(i−1) + 2^(i−2) + . . . + 2 + 1 = 2^i M(n − i) + 2^i − 1.
…
Since the initial condition is specified for n = 1, which is achieved for i = n − 1,
M(n) = 2^(n−1) M(n − (n − 1)) + 2^(n−1) − 1 = 2^(n−1) M(1) + 2^(n−1) − 1 = 2^(n−1) + 2^(n−1) − 1 = 2^n − 1.
Thus, we have an exponential-time algorithm.
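The result M(n) = 2^n − 1 can be confirmed by recording the moves of a recursive Tower of Hanoi routine. The following Python sketch is illustrative; the function name hanoi, the peg labels and the moves list are assumptions made for this example.

```python
def hanoi(n, source, target, spare, moves):
    """Append to `moves` the (from, to) peg pairs that transfer n disks."""
    if n == 1:
        moves.append((source, target))
        return
    hanoi(n - 1, source, spare, target, moves)   # M(n - 1) moves
    moves.append((source, target))               # 1 move of the largest disk
    hanoi(n - 1, spare, target, source, moves)   # M(n - 1) moves


for n in range(1, 6):
    moves = []
    hanoi(n, "A", "C", "B", moves)
    print(n, len(moves), 2 ** n - 1)  # the last two columns agree
```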
EXAMPLE 3: An investigation of a recursive version of the algorithm which finds the number of
binary digits in the binary representation of a positive decimal integer.
ALGORITHM BinRec(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n’s binary representation
if n = 1 return 1
else return BinRec(⌊n/2⌋) + 1
Algorithm analysis
The number of additions made in computing BinRec(⌊n/2⌋) is A(⌊n/2⌋), plus one more addition is made by the algorithm to increase the returned value by 1. This leads to the recurrence A(n) = A(⌊n/2⌋) + 1 for n > 1.
Since the recursive calls end when n is equal to 1 and there are no additions made
then, the initial condition is A(1) = 0.
For n = 2^k, backward substitutions give:
A(2^k) = A(2^(k−1)) + 1                          substitute A(2^(k−1)) = A(2^(k−2)) + 1
= [A(2^(k−2)) + 1] + 1 = A(2^(k−2)) + 2          substitute A(2^(k−2)) = A(2^(k−3)) + 1
= [A(2^(k−3)) + 1] + 2 = A(2^(k−3)) + 3
...
= A(2^(k−i)) + i
...
= A(2^(k−k)) + k.
Thus, we end up with A(2^k) = A(1) + k = k, or, after returning to the original variable n = 2^k and
hence k = log2 n,
A(n) = log2 n ∈ Θ(log n).
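A quick way to check A(n) = log2 n for powers of two is to count the additions in a Python version of BinRec. This is a hedged sketch: the name bin_rec and the returned addition count are introduced here for illustration, and integer division n // 2 plays the role of ⌊n/2⌋.

```python
def bin_rec(n: int):
    """Return (number of binary digits of n, number of additions made)."""
    if n == 1:
        return 1, 0                       # A(1) = 0
    digits, additions = bin_rec(n // 2)   # n // 2 is the floor of n / 2
    return digits + 1, additions + 1      # A(n) = A(n // 2) + 1


print(bin_rec(8))   # (4, 3): 8 is 1000 in binary, and log2 8 = 3
print(bin_rec(16))  # (5, 4)
```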
EXAMPLE 1: Consider the problem of finding the value of the largest element in a list of n
numbers. Assume that the list is implemented as an array for simplicity.
ALGORITHM MaxElement(A[0..n − 1])
//Determines the value of the largest element in a given array
//Input: An array A[0..n − 1] of real numbers
//Output: The value of the largest element in A
maxval ← A[0]
for i ← 1 to n − 1 do
    if A[i] > maxval
        maxval ← A[i]
return maxval
Algorithm analysis
The measure of an input’s size here is the number of elements in the array, i.e., n.
There are two operations in the for loop’s body:
o The comparison A[i]> maxval and
o The assignment maxval←A[i].
The comparison is considered the algorithm’s basic operation, because it
is executed on each repetition of the loop, whereas the assignment is not.
The number of comparisons will be the same for all arrays of size n; therefore, there is no
need to distinguish among the worst, average, and best cases here.
Let C(n) denote the number of times this comparison is executed. The algorithm makes
one comparison on each execution of the loop, which is repeated for each value of the
loop’s variable i within the bounds 1 and n − 1, inclusive. Therefore, the sum for C(n) is
calculated as follows:
C(n) = Σ_{i=1}^{n−1} 1,
i.e., 1 is summed up n − 1 times:
C(n) = Σ_{i=1}^{n−1} 1 = n − 1 ∈ Θ(n).
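The count C(n) = n − 1 can be observed directly by instrumenting the algorithm. The Python sketch below mirrors the MaxElement pseudocode; the comparison counter and the name max_element are additions made here for illustration.

```python
def max_element(a):
    """Return (largest value in a, number of comparisons A[i] > maxval)."""
    maxval = a[0]
    comparisons = 0
    for i in range(1, len(a)):
        comparisons += 1          # the basic operation
        if a[i] > maxval:
            maxval = a[i]
    return maxval, comparisons


print(max_element([3, 9, 2, 7]))  # (9, 3): n = 4, so n - 1 = 3 comparisons
```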
EXAMPLE 2: Consider the element uniqueness problem: check whether all the Elements in a
given array of n elements are distinct.
ALGORITHM UniqueElements(A[0..n − 1])
//Determines whether all the elements in a given array are distinct
//Input: An array A[0..n − 1]
//Output: Returns “true” if all the elements in A are distinct and “false” otherwise
for i ← 0 to n − 2 do
    for j ← i + 1 to n − 1 do
        if A[i] = A[j] return false
return true
Algorithm analysis
The natural measure of the input’s size here is again n (the number of elements in the array).
Since the innermost loop contains a single operation (the comparison of two elements), we
should consider it as the algorithm’s basic operation.
The number of element comparisons depends not only on n but also on whether there are
equal elements in the array and, if there are, which array positions they occupy. We will
limit our investigation to the worst case only.
One comparison is made for each repetition of the innermost loop, i.e., for each value of the
loop variable j between its limits i + 1 and n − 1; this is repeated for each value of the
outer loop, i.e., for each value of the loop variable i between its limits 0 and n − 2.
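Summing one comparison over both loops gives Cworst(n) = Σ_{i=0}^{n−2} Σ_{j=i+1}^{n−1} 1 = (n − 1)n/2, which is reached when all elements are distinct or only the last two are equal. The Python sketch below counts comparisons to illustrate this; the name unique_elements and the counter are assumptions introduced for the example.

```python
def unique_elements(a):
    """Return (True if all elements are distinct, comparisons made)."""
    comparisons = 0
    n = len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            comparisons += 1
            if a[i] == a[j]:
                return False, comparisons
    return True, comparisons


print(unique_elements([1, 2, 3, 4]))  # (True, 6): worst case, 4 * 3 / 2
print(unique_elements([1, 2, 3, 3]))  # (False, 6): worst case as well
print(unique_elements([1, 1, 2]))     # (False, 1): early exit
```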
EXAMPLE 3: Consider matrix multiplication. Given two n × n matrices A and B, find the time
efficiency of the definition-based algorithm for computing their product C = AB. By definition, C
is an n × n matrix whose elements are computed as the scalar (dot) products of the rows of matrix A
and the columns of matrix B:
C[i, j] = A[i, 0]B[0, j] + . . . + A[i, k]B[k, j] + . . . + A[i, n − 1]B[n − 1, j] for every pair of
indices 0 ≤ i, j ≤ n − 1.
The total number of multiplications M(n) is expressed by the following triple sum:
M(n) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} Σ_{k=0}^{n−1} 1.
Now, we can compute this sum by using formula (S1) and rule (R1):
M(n) = Σ_{i=0}^{n−1} Σ_{j=0}^{n−1} n = Σ_{i=0}^{n−1} n^2 = n^3.
To estimate the running time of the algorithm on a particular machine, we can use the product
T(n) ≈ c_m M(n) = c_m n^3, where c_m is the time of one multiplication on that machine.
If we also consider the time spent on the additions, then the total time on the machine is
T(n) ≈ c_m n^3 + c_a n^3 = (c_m + c_a) n^3, where c_a is the time of one addition.
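The definition-based algorithm performs one multiplication in the body of three nested loops of n iterations each, i.e., n^3 multiplications in total. The Python sketch below illustrates this by counting them; the name matrix_multiply and the counter are assumptions made for this example, and matrices are represented as lists of rows.

```python
def matrix_multiply(a, b):
    """Multiply two n x n matrices; return (product, multiplications)."""
    n = len(a)
    c = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                c[i][j] += a[i][k] * b[k][j]   # the basic operation
                mults += 1
    return c, mults


product, mults = matrix_multiply([[1, 2], [3, 4]], [[5, 6], [7, 8]])
print(product)  # [[19, 22], [43, 50]]
print(mults)    # 8, i.e., 2 ** 3
```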
EXAMPLE 4 The following algorithm finds the number of binary digits in the binary
representation of a positive decimal integer. (MJ 2015)
ALGORITHM Binary(n)
//Input: A positive decimal integer n
//Output: The number of binary digits in n’s binary representation
count ← 1
while n > 1 do
    count ← count + 1
    n ← ⌊n/2⌋
return count
Algorithm analysis
The input’s size is n.
The loop variable takes on only a few values between its lower and upper limits.
Since the value of n is about halved on each repetition of the loop, the answer should be
about log2 n.
The exact formula for the number of times the comparison n > 1 is executed is ⌊log2 n⌋ + 1.
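The count of ⌊log2 n⌋ + 1 evaluations of the test n > 1 can be checked with a short Python version of the loop. This is an illustrative sketch; the name binary_digit_count and the explicit comparison counter are assumptions introduced here.

```python
def binary_digit_count(n: int):
    """Return (number of binary digits of n, evaluations of n > 1)."""
    count = 1
    comparisons = 1        # the first evaluation of n > 1
    while n > 1:
        count += 1
        n //= 2            # floor of n / 2
        comparisons += 1   # n > 1 is evaluated again before the next pass
    return count, comparisons


print(binary_digit_count(8))  # (4, 4): floor(log2 8) + 1 = 4
print(binary_digit_count(1))  # (1, 1)
```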
CS8451 __ Design and Analysis of Algorithms _ Unit II _____2.1
17 29 34 45 68 89 | 90
The sorting of list 89, 45, 68, 90, 29, 34, 17 is illustrated with the selection sort algorithm.
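The analysis of selection sort is straightforward. The input size is given by the number of
elements n; the basic operation is the key comparison. The number of times it is
executed depends only on the array size and is given by the following sum:
C(n) = Σ_{i=0}^{n−2} Σ_{j=i+1}^{n−1} 1 = Σ_{i=0}^{n−2} [(n − 1) − (i + 1) + 1] = Σ_{i=0}^{n−2} (n − 1 − i) = (n − 1)n/2.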
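As a check on the comparison count (n − 1)n/2, here is a minimal Python sketch of selection sort with a comparison counter. The pseudocode of the algorithm is not reproduced in this part of the notes, so the code follows the standard formulation (select the smallest remaining element and swap it into position i); the name selection_sort and the counter are assumptions.

```python
def selection_sort(a):
    """Return (sorted copy of a, number of key comparisons)."""
    a = list(a)
    n = len(a)
    comparisons = 0
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            comparisons += 1
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]
    return a, comparisons


print(selection_sort([89, 45, 68, 90, 29, 34, 17]))
# ([17, 29, 34, 45, 68, 89, 90], 21): n = 7, so (n - 1) * n / 2 = 21
```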
Bubble Sort
The bubble sorting algorithm is to compare adjacent elements of the list and exchange them
if they are out of order. By doing it repeatedly, we end up “bubbling up” the largest element to the
last position on the list. The next pass bubbles up the second largest element, and so on, until after n − 1 passes the list is sorted. Pass i (0 ≤ i ≤ n − 2) of bubble sort can be represented by the following:
A0, . . . , Aj ↔? Aj+1, . . . , An−i−1 | An−i ≤ . . . ≤ An−1
Here ↔? marks a pair that is compared and swapped if out of order, and the elements to the right of the bar are already in their final positions.
ALGORITHM BubbleSort(A[0..n − 1])
//Sorts a given array by bubble sort
//Input: An array A[0..n − 1] of orderable elements
//Output: Array A[0..n − 1] sorted in nondecreasing order
for i ← 0 to n − 2 do
    for j ← 0 to n − 2 − i do
        if A[j + 1] < A[j] swap A[j] and A[j + 1]
The action of the algorithm on the list 89, 45, 68, 90, 29, 34, 17 is illustrated as an example.
The number of key comparisons for the bubble-sort version given above is the same for all arrays
of size n; it is obtained by a sum that is almost identical to the sum for selection sort:
C(n) = Σ_{i=0}^{n−2} Σ_{j=0}^{n−2−i} 1 = Σ_{i=0}^{n−2} [(n − 2 − i) − 0 + 1] = Σ_{i=0}^{n−2} (n − 1 − i) = (n − 1)n/2 ∈ Θ(n^2).
The number of key swaps, however, depends on the input. In the worst case of decreasing
arrays, it is the same as the number of key comparisons:
Sworst(n) = C(n) = (n − 1)n/2 ∈ Θ(n^2).
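The difference between the fixed comparison count and the input-dependent swap count can be seen by instrumenting the BubbleSort pseudocode. The Python sketch below is illustrative; the name bubble_sort and the two counters are assumptions introduced here.

```python
def bubble_sort(a):
    """Return (sorted copy of a, key comparisons, key swaps)."""
    a = list(a)
    n = len(a)
    comparisons = swaps = 0
    for i in range(n - 1):
        for j in range(n - 1 - i):        # j = 0, ..., n - 2 - i
            comparisons += 1
            if a[j + 1] < a[j]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return a, comparisons, swaps


print(bubble_sort([5, 4, 3, 2, 1]))  # ([1, 2, 3, 4, 5], 10, 10): worst case
print(bubble_sort([1, 2, 3, 4, 5]))  # ([1, 2, 3, 4, 5], 10, 0): no swaps
```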
Closest-Pair Problem
The closest-pair problem calls for finding the two closest points in a set of n points. It is the simplest of
a variety of problems in computational geometry that deal with proximity of points in the plane or
higher-dimensional spaces.
Consider the two-dimensional case of the closest-pair problem. The points are specified in a standard fashion by their (x, y)
Cartesian coordinates, and the distance between two points pi(xi, yi) and pj(xj, yj) is the standard Euclidean distance
d(pi, pj) = sqrt((xi − xj)^2 + (yi − yj)^2).
The following algorithm computes the distance between each pair of distinct points and
finds a pair with the smallest distance.
ALGORITHM BruteForceClosestPair(P)
//Finds distance between two closest points in the plane by brute force
//Input: A list P of n (n ≥ 2) points p1(x1, y1), . . . , pn(xn, yn)
//Output: The distance between the closest pair of points
d ← ∞
for i ← 1 to n − 1 do
    for j ← i + 1 to n do
        d ← min(d, sqrt((xi − xj)^2 + (yi − yj)^2)) //sqrt is square root
return d
The basic operation of the algorithm will be squaring a number. The number of times it will
be executed can be computed as follows:
C(n) = Σ_{i=1}^{n−1} Σ_{j=i+1}^{n} 2
= 2 Σ_{i=1}^{n−1} (n − i)
= 2[(n − 1) + (n − 2) + . . . + 1]
= (n − 1)n ∈ Θ(n^2).
Of course, speeding up the innermost loop of the algorithm could only decrease the algorithm’s
running time by a constant factor, but it cannot improve its asymptotic efficiency class.
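The brute-force closest-pair algorithm translates almost line by line into Python. The sketch below is an illustration only; the function name and the representation of points as (x, y) tuples are assumptions made for this example.

```python
import math


def brute_force_closest_pair(points):
    """Return the smallest distance between any two of the given points."""
    d = math.inf
    n = len(points)
    for i in range(n - 1):
        for j in range(i + 1, n):
            (xi, yi), (xj, yj) = points[i], points[j]
            d = min(d, math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2))
    return d


print(brute_force_closest_pair([(0, 0), (3, 4)]))  # 5.0
```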
Convex-Hull Problem
Convex Set
A set of points (finite or infinite) in the plane is called convex if for any two points p and q
in the set, the entire line segment with the endpoints at p and q belongs to the set.
FIGURE 2.1 (a) Convex sets. (b) Sets that are not convex.
All the sets depicted in Figure 2.1 (a) are convex, and so are a straight line, a triangle, a
rectangle, and, more generally, any convex polygon, a circle, and the entire plane.
On the other hand, the sets depicted in Figure 2.1 (b), any finite set of two or more distinct
points, the boundary of any convex polygon, and a circumference are examples of sets that are not
convex.
Imagine the points as nails driven into a board. Take a rubber band and stretch it to include all the nails, then let it snap into place. The
convex hull is the area bounded by the snapped rubber band as shown in Figure 2.2.
Convex hull
The convex hull of a set S of points is the smallest convex set containing S. (The “smallest”
requirement means that the convex hull of S must be a subset of any convex set containing S.)
If S is convex, its convex hull is obviously S itself. If S is a set of two points, its convex
hull is the line segment connecting these points. If S is a set of three points not on the same line, its
convex hull is the triangle with the vertices at the three points given; if the three points do lie on the
same line, the convex hull is the line segment with its endpoints at the two points that are farthest
apart. For an example of the convex hull for a larger set, see Figure 2.3.
THEOREM
The convex hull of any set S of n>2 points not all on the same line is a convex polygon
with the vertices at some of the points of S. (If all the points do lie on the same line, the polygon
degenerates to a line segment but still with the endpoints at two points of S.)
FIGURE 2.3 The convex hull for this set of eight points is the convex polygon with vertices at p1,
p5, p6, p7, and p3.
The convex-hull problem is the problem of constructing the convex hull for a given set
S of n points. To solve it, we need to find the points that will serve as the vertices of the polygon in
question. Mathematicians call the vertices of such a polygon “extreme points.” By definition, an
extreme point of a convex set is a point of this set that is not a middle point of any line
segment with endpoints in the set. For example, the extreme points of a triangle are its three
vertices, the extreme points of a circle are all the points of its circumference, and the extreme points
of the convex hull of the set of eight points in Figure 2.3 are p1, p5, p6, p7, and p3.
Application
Extreme points have several special properties that other points of a convex set do not have. One of
them is exploited by the simplex method, an algorithm that solves linear programming problems.
We are interested in extreme points because their identification solves the convex-hull
problem. Actually, to solve this problem completely, we need to know a bit more than just which of the n
points of a given set are extreme points of the set’s convex hull: we also need to know which pairs of
points need to be connected to form the boundary of the convex hull. Note that this issue can also
be addressed by listing the extreme points in a clockwise or a counterclockwise order.
We can solve the convex-hull problem in a brute-force manner. The convex-hull problem is
one with no obvious algorithmic solution. Nevertheless, there is a simple but inefficient algorithm that is based on
the following observation about line segments making up the boundary of a convex hull: a line segment
connecting two points pi and pj of a set of n points is a part of the convex hull’s boundary
if and only if all the other points of the set lie on the same side of the straight line through these two
points. Repeating this test for every pair of points yields a list of line segments that make up the
convex hull’s boundary.
Facts
A few elementary facts from analytical geometry are needed to implement the above algorithm.
First, the straight line through two points (x1, y1), (x2, y2) in the coordinate plane can be
defined by the equation ax + by = c,where a = y2 − y1, b = x1 − x2, c = x1y2 − y1x2.
Second, such a line divides the plane into two half-planes: for all the points in one of them, ax + by
> c, while for all the points in the other, ax + by < c. (For the points on the line itself, of course, ax
+ by = c.) Thus, to check whether certain points lie on the same side of the line, we can simply
check whether the expression ax + by − c has the same sign for each of these points.
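The sign test on ax + by − c can be turned into a brute-force routine that reports the boundary segments of the convex hull. The Python sketch below is a simplified illustration: the function name is hypothetical, points are (x, y) tuples, and it assumes the points are distinct with no three of them on one line, so the expression is never zero for a third point.

```python
def brute_force_hull_edges(points):
    """Return the pairs of points whose segment lies on the hull boundary.

    Assumes distinct points with no three collinear.
    """
    edges = []
    n = len(points)
    for i in range(n - 1):
        for j in range(i + 1, n):
            (x1, y1), (x2, y2) = points[i], points[j]
            a, b, c = y2 - y1, x1 - x2, x1 * y2 - y1 * x2
            # Signs of ax + by - c for all the other points.
            signs = {
                a * x + b * y - c > 0
                for k, (x, y) in enumerate(points)
                if k != i and k != j
            }
            if len(signs) <= 1:          # all on the same side of the line
                edges.append((points[i], points[j]))
    return edges


square_with_inner_point = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 1)]
print(brute_force_hull_edges(square_with_inner_point))
# the four sides of the square; (2, 1) is not an extreme point
```

Each of the n(n − 1)/2 pairs is tested against the other n − 2 points, so this sketch runs in O(n^3) time.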
The problem can be conveniently modeled by a weighted graph, with the graph’s vertices
representing the cities and the edge weights specifying the distances. Then the problem can be stated as
the problem of finding the shortest Hamiltonian circuit of the graph. (A Hamiltonian circuit is
defined as a cycle that passes through all the vertices of the graph exactly once).
A Hamiltonian circuit can also be defined as a sequence of n + 1 adjacent vertices vi0, vi1, . . . , vin−1, vi0,
where the first vertex of the sequence is the same as the last one and all the other n − 1
vertices are distinct. All circuits start and end at one particular vertex. Figure 2.4 presents a small
instance of the problem and its solution by this method.
Tour                              Length
a ---> b ---> c ---> d ---> a     l = 2 + 8 + 1 + 7 = 18
a ---> b ---> d ---> c ---> a     l = 2 + 3 + 1 + 5 = 11   optimal
a ---> c ---> b ---> d ---> a     l = 5 + 8 + 3 + 7 = 23
a ---> c ---> d ---> b ---> a     l = 5 + 1 + 3 + 2 = 11   optimal
a ---> d ---> b ---> c ---> a     l = 7 + 3 + 8 + 5 = 23
a ---> d ---> c ---> b ---> a     l = 7 + 1 + 8 + 2 = 18
FIGURE 2.4 Solution to a small instance of the traveling salesman problem by exhaustive search.
Time efficiency
We can get all the tours by generating all the permutations of the n − 1 intermediate
cities for a fixed starting city, i.e., (n − 1)! permutations.
Consider two intermediate vertices, say, b and c, and then only permutations in which b
precedes c. (This trick implicitly defines a tour’s direction.)
An inspection of Figure 2.4 reveals three pairs of tours that differ only by their
direction. Hence, we could cut the number of vertex permutations by half, because the
total length of a cycle is the same in both directions.
The total number of permutations needed is still (n − 1)!/2, which makes the exhaustive-search
approach impractical for large n. It is useful only for very small values of n.
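Exhaustive search for the instance of Figure 2.4 can be sketched in Python with itertools.permutations. The edge weights below are read off the tour lengths listed in the figure (a-b = 2, a-c = 5, a-d = 7, b-c = 8, b-d = 3, c-d = 1); the function name tsp_exhaustive and the dictionary representation of the graph are assumptions made for this example. The sketch generates all (n − 1)! permutations and does not apply the halving trick.

```python
from itertools import permutations


def tsp_exhaustive(dist, start):
    """Return (shortest tour, its length) by trying every permutation."""
    cities = [c for c in dist if c != start]
    best_tour, best_length = None, float("inf")
    for perm in permutations(cities):
        tour = (start, *perm, start)
        length = sum(dist[u][v] for u, v in zip(tour, tour[1:]))
        if length < best_length:
            best_tour, best_length = tour, length
    return best_tour, best_length


dist = {
    "a": {"b": 2, "c": 5, "d": 7},
    "b": {"a": 2, "c": 8, "d": 3},
    "c": {"a": 5, "b": 8, "d": 1},
    "d": {"a": 7, "b": 3, "c": 1},
}
print(tsp_exhaustive(dist, "a"))  # (('a', 'b', 'd', 'c', 'a'), 11)
```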
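For the knapsack problem (given n items of known weights and values and a knapsack of
capacity W, find the most valuable subset of the items that fits into the knapsack), the
exhaustive-search approach leads to generating all the subsets of the set of n items given,
computing the total weight of each subset in order to identify feasible subsets (i.e., the ones
with the total weight not exceeding the knapsack capacity), and finding a subset of the largest
value among them. Since a set of n items has 2^n subsets, the search is in Ω(2^n).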
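A minimal Python sketch of this subset enumeration is shown below. The function name knapsack_exhaustive is hypothetical, and the data in the usage line is the four-item instance tabulated in Table 3.3 of these notes (weights 2, 1, 3, 2; values 12, 10, 20, 15; capacity 5).

```python
from itertools import combinations

def knapsack_exhaustive(weights, values, capacity):
    """Try all 2^n subsets; return (best value, 1-based item numbers)."""
    n = len(weights)
    best_value, best_subset = 0, ()
    for size in range(n + 1):
        for subset in combinations(range(n), size):
            weight = sum(weights[i] for i in subset)
            if weight > capacity:              # infeasible subset, skip it
                continue
            value = sum(values[i] for i in subset)
            if value > best_value:
                best_value = value
                best_subset = tuple(i + 1 for i in subset)
    return best_value, best_subset

print(knapsack_exhaustive([2, 1, 3, 2], [12, 10, 20, 15], 5))
```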
Note: Exhaustive search of both the traveling salesman and knapsack problems leads to algorithms
that are extremely inefficient on every input. In fact, these two problems are the best-known
examples of NP-hard problems. No polynomial-time algorithm is known for any NP-hard problem.
Moreover, most computer scientists believe that such algorithms do not exist. Some sophisticated
approaches, such as backtracking and branch-and-bound, enable us to solve some but not all
instances of these problems in less than exponential time. Alternatively, we can use one of many
approximation algorithms.
2.6 ASSIGNMENT PROBLEM
There are n people who need to be assigned to execute n jobs, one person per job. (That is,
each person is assigned to exactly one job and each job is assigned to exactly one person.) The cost
that would accrue if the ith person is assigned to the jth job is a known quantity C[i, j] for each pair
i, j = 1, 2, . . . , n. The problem is to find an assignment with the minimum total cost.
FIGURE 2.8 First few iterations of solving a small instance of the assignment problem by
exhaustive search.
Since the number of permutations to be considered for the general case of the assignment
problem is n!, exhaustive search is impractical for all but very small instances of the problem.
Fortunately, there is a much more efficient algorithm for this problem called the Hungarian
method.
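The exhaustive search over all n! assignments can be sketched as below. This is only an illustration under stated assumptions: the 4 × 4 cost matrix COST is an illustrative instance that does not come from these notes, and assignment_exhaustive is a hypothetical name.

```python
from itertools import permutations

def assignment_exhaustive(cost):
    """Return (min total cost, jobs) where jobs[i] is the job given to person i."""
    n = len(cost)
    best_cost, best_jobs = float("inf"), None
    for jobs in permutations(range(n)):        # n! candidate assignments
        total = sum(cost[person][job] for person, job in enumerate(jobs))
        if total < best_cost:
            best_cost, best_jobs = total, jobs
    return best_cost, best_jobs

# Illustrative cost matrix (an assumption, not taken from the notes):
# COST[i][j] is the cost of assigning person i to job j (0-based).
COST = [
    [9, 2, 7, 8],
    [6, 4, 3, 7],
    [5, 8, 1, 8],
    [7, 6, 9, 4],
]
print(assignment_exhaustive(COST))
```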
The divide-and-conquer technique is shown in Figure 2.9, which depicts the case of
dividing a problem of size n into two smaller subproblems. The subproblems are then solved
separately. Finally, a solution to the original problem is obtained by combining the solutions
to the subproblems.
2.8 MERGE SORT
Mergesort is based on the divide-and-conquer technique. It sorts a given array A[0..n − 1] by
dividing it into two halves A[0..⌊n/2⌋ − 1] and A[⌊n/2⌋..n − 1], sorting each of them recursively,
and then merging the two smaller sorted arrays into a single sorted one. The halves are copied
into auxiliary arrays B and C, which are sorted by the recursive calls
Mergesort(B[0..⌊n/2⌋ − 1])
Mergesort(C[0..⌈n/2⌉ − 1])
and then merged back into A.
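As a rough illustration of the scheme described above, here is a short Python sketch. It returns a new sorted list rather than sorting the array in place, which is a simplification; the names merge and mergesort are hypothetical.

```python
def merge(b, c):
    """Merge two sorted lists into one sorted list."""
    merged, i, j = [], 0, 0
    while i < len(b) and j < len(c):
        if b[i] <= c[j]:
            merged.append(b[i])
            i += 1
        else:
            merged.append(c[j])
            j += 1
    merged.extend(b[i:])       # at most one of these two tails is nonempty
    merged.extend(c[j:])
    return merged

def mergesort(a):
    """Return a sorted copy of list a (divide, sort halves, merge)."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2          # floor(n/2), the size of the first half
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))

print(mergesort([8, 3, 2, 9, 7, 1, 5, 4]))
```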
For large n, the number of comparisons made by mergesort in the average case turns out
to be about 0.25n less than in the worst case and hence is also in Θ(n log n).
Two ideas lead to useful variations of mergesort. First, the algorithm can be implemented
bottom up by merging pairs of the array’s elements, then merging the sorted pairs, and so on.
This avoids the time and space overhead of using a stack to handle recursive calls. Second, we
can divide a list to be sorted in more than two parts, sort each recursively, and then merge them
together. This scheme, which is particularly useful for sorting files residing on secondary memory
devices, is called multiway mergesort.
After a partition is achieved, A[s] is in its final position in the sorted array, so we can sort
the two subarrays to the left and to the right of A[s] independently. No work is required to
combine the solutions to the subproblems.
Here is pseudocode of quicksort. The initial call is Quicksort(A[0..n − 1]), and HoarePartition
is used as the partition algorithm.
ALGORITHM Quicksort(A[l..r])
//Sorts a subarray by quicksort
//Input: Subarray of array A[0..n − 1], defined by its left and right indices l and r
//Output: Subarray A[l..r] sorted in nondecreasing order
if l < r
    s ← HoarePartition(A[l..r]) //s is a split position
    Quicksort(A[l..s − 1])
    Quicksort(A[s + 1..r])
ALGORITHM HoarePartition(A[l..r])
//Partitions a subarray by Hoare’s algorithm, using the first element as a pivot
//Input: Subarray of array A[0..n − 1], defined by its left and right indices l and r (l < r)
//Output: Partition of A[l..r], with the split position returned as this function’s value
p ← A[l]
i ← l; j ← r + 1
repeat
    repeat i ← i + 1 until A[i] ≥ p
    repeat j ← j − 1 until A[j] ≤ p
    swap(A[i], A[j])
until i ≥ j
swap(A[i], A[j]) //undo last swap when i ≥ j
swap(A[l], A[j])
return j
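The two pseudocode routines can be sketched in Python roughly as follows. This sketch departs from the pseudocode in two labeled ways: it adds a bounds check so that index i never runs past r, and it leaves the loop before the final swap instead of undoing it afterwards. The function names are hypothetical.

```python
def hoare_partition(a, l, r):
    """Partition a[l..r] around the pivot a[l]; return the split position."""
    p = a[l]
    i, j = l, r + 1
    while True:
        i += 1
        while i < r and a[i] < p:      # bounds check added: i never passes r
            i += 1
        j -= 1
        while a[j] > p:                # stops at l at the latest, as a[l] == p
            j -= 1
        if i >= j:                     # scans crossed: exit before swapping
            break
        a[i], a[j] = a[j], a[i]
    a[l], a[j] = a[j], a[l]            # put the pivot into its final position
    return j

def quicksort(a, l=0, r=None):
    """Sort list a in place in nondecreasing order."""
    if r is None:
        r = len(a) - 1
    if l < r:
        s = hoare_partition(a, l, r)   # s is a split position
        quicksort(a, l, s - 1)
        quicksort(a, s + 1, r)

data = [5, 3, 1, 9, 8, 2, 4, 7]
quicksort(data)
print(data)
```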
FIGURE 2.11 Example of quicksort operation of Array with pivots shown in bold.
FIGURE 2.12 Tree of recursive calls to Quicksort with input values l and r of subarray bounds
and split position s of a partition obtained.
The number of key comparisons in the worst case is
$C_{worst}(n) = (n + 1) + n + \cdots + 3 = \frac{(n + 1)(n + 2)}{2} - 3 \in \Theta(n^2)$.
Binary search is a remarkably efficient algorithm for searching in a sorted array (say, A). It
works by comparing a search key K with the array’s middle element A[m]. If they match, the
algorithm stops; otherwise, the same operation is repeated recursively for the first half of the array
if K < A[m], and for the second half if K > A[m].
Though binary search is clearly based on a recursive idea, it can be easily implemented as a
nonrecursive algorithm, too. Here is pseudocode of this nonrecursive version.
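A minimal Python sketch of the nonrecursive version is given below; the function name binary_search and the convention of returning −1 for an unsuccessful search are assumptions of this sketch. The usage line searches for K = 70 in the 13-element array used in the worked example of this section.

```python
def binary_search(a, key):
    """Return an index of `key` in sorted list `a`, or -1 if it is absent."""
    l, r = 0, len(a) - 1
    while l <= r:
        m = (l + r) // 2           # index of the middle element
        if key == a[m]:
            return m
        elif key < a[m]:
            r = m - 1              # continue in the first half
        else:
            l = m + 1              # continue in the second half
    return -1

A = [3, 14, 27, 31, 39, 42, 55, 70, 74, 81, 85, 93, 98]
print(binary_search(A, 70))
```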
The standard way to analyze the efficiency of binary search is to count the number of times
the search key is compared with an element of the array. We assume three-way comparisons: after
one comparison of K with A[m], the algorithm can determine whether K is smaller than, equal to,
or larger than A[m].
As an example, let us apply binary search to searching for K = 70 in the following array. The
iterations of the algorithm are given in the following table:
index         0   1   2   3   4   5   6   7   8   9  10  11  12
value         3  14  27  31  39  42  55  70  74  81  85  93  98
iteration 1   l                       m                       r
iteration 2                               l       m           r
iteration 3                             l,m   r
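The worst-case inputs include all arrays that do not contain a given search key, as well as
some successful searches. Since after one comparison the algorithm faces the same situation but for
an array half the size, the number of key comparisons in the worst case, Cworst(n), satisfies the
recurrence relation
$C_{worst}(n) = C_{worst}(\lfloor n/2 \rfloor) + 1$ for $n > 1$, $C_{worst}(1) = 1$.
For $n = 2^k$, its solution is
$C_{worst}(2^k) = k + 1 = \log_2 n + 1$,
and for an arbitrary positive integer n,
$C_{worst}(n) = \lfloor \log_2 n \rfloor + 1 = \lceil \log_2 (n + 1) \rceil$.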
First, the worst-case time efficiency of binary search is in Θ(log n).
Second, since the algorithm simply reduces the size of the remaining array by about half on
each iteration, the number of such iterations needed to reduce the initial size n to the final
size 1 has to be about log2 n.
Third, the logarithmic function grows so slowly that its values remain small even for very
large values of n.
The average number of key comparisons is only slightly smaller than that in the worst case:
Cavg(n) ≈ log2 n.
The average number of comparisons in a successful search is
Cavg(n) ≈ log2 n − 1,
and the average number of comparisons in an unsuccessful search is
Cavg(n) ≈ log2(n + 1).
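To demonstrate the basic idea of multiplying large integers, consider two two-digit integers,
23 and 14. These numbers can be represented as
$23 = 2 \cdot 10^1 + 3 \cdot 10^0$ and $14 = 1 \cdot 10^1 + 4 \cdot 10^0$.
Multiplying them gives
$23 * 14 = (2 \cdot 10^1 + 3 \cdot 10^0) * (1 \cdot 10^1 + 4 \cdot 10^0) = (2 * 1)10^2 + (2 * 4 + 3 * 1)10^1 + (3 * 4)10^0 = 322$.
The middle term can be computed with just one digit multiplication by reusing the products
(2 ∗ 1) and (3 ∗ 4), which need to be computed anyway:
$2 * 4 + 3 * 1 = (2 + 3) * (1 + 4) - (2 * 1) - (3 * 4)$.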
For any pair of two-digit numbers $a = a_1 a_0$ and $b = b_1 b_0$, their product c can be computed by
the formula
$c = a * b = c_2 10^2 + c_1 10^1 + c_0$,
where
$c_2 = a_1 * b_1$ is the product of their first digits,
$c_0 = a_0 * b_0$ is the product of their second digits,
$c_1 = (a_1 + a_0) * (b_1 + b_0) - (c_2 + c_0)$ is the product of the sum of the a’s digits and the
sum of the b’s digits minus the sum of $c_2$ and $c_0$.
Now we apply this trick to multiplying two n-digit integers a and b where n is a positive
even number. Let us divide both numbers in the middle to take advantage of the divide-and-conquer
technique. We denote the first half of the a’s digits by $a_1$ and the second half by $a_0$; for b,
the notations are $b_1$ and $b_0$, respectively. In these notations, $a = a_1 a_0$ implies that
$a = a_1 10^{n/2} + a_0$ and $b = b_1 b_0$ implies that $b = b_1 10^{n/2} + b_0$. Therefore, taking
advantage of the same trick we used for two-digit numbers, we get
$c = a * b = (a_1 10^{n/2} + a_0) * (b_1 10^{n/2} + b_0)$
$= (a_1 * b_1) 10^{n} + (a_1 * b_0 + a_0 * b_1) 10^{n/2} + (a_0 * b_0)$
$= c_2 10^{n} + c_1 10^{n/2} + c_0$,
where
$c_2 = a_1 * b_1$ is the product of their first halves,
$c_0 = a_0 * b_0$ is the product of their second halves,
$c_1 = (a_1 + a_0) * (b_1 + b_0) - (c_2 + c_0)$ is the product of the sum of the a’s halves and the
sum of the b’s halves minus the sum of $c_2$ and $c_0$.
If n/2 is even, we can apply the same method for computing the products c2 , c0, and c1.
Thus, if n is a power of 2, we have a recursive algorithm for computing the product of two n-digit
integers. In its pure form, the recursion is stopped when n becomes 1. It can also be stopped when
we deem n small enough to multiply the numbers of that size directly.
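The recursive scheme can be sketched in Python as follows. This is a hedged illustration, not the notes' own algorithm text: it works on nonnegative Python integers, stops the recursion as soon as either factor has a single digit, and splits at ⌊n/2⌋ digits so that n need not be a power of 2. The name karatsuba is a common label for this method and is used here as a hypothetical function name.

```python
def karatsuba(a, b):
    """Multiply nonnegative integers with three recursive products per level."""
    if a < 10 or b < 10:               # a single-digit factor: multiply directly
        return a * b
    n = max(len(str(a)), len(str(b)))  # number of decimal digits
    p = 10 ** (n // 2)                 # split position 10^(n/2)
    a1, a0 = divmod(a, p)              # a = a1 * p + a0
    b1, b0 = divmod(b, p)              # b = b1 * p + b0
    c2 = karatsuba(a1, b1)             # product of the first halves
    c0 = karatsuba(a0, b0)             # product of the second halves
    c1 = karatsuba(a1 + a0, b1 + b0) - (c2 + c0)
    return c2 * p * p + c1 * p + c0

print(karatsuba(23, 14))
```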
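Let M(n) be the number of digit multiplications made by this algorithm. Since multiplication of
n-digit numbers requires three multiplications of n/2-digit numbers, M(n) = 3M(n/2) for n > 1,
M(1) = 1. Solving it by backward substitutions for $n = 2^k$ yields
$M(2^k) = 3M(2^{k-1})$
$= 3[3M(2^{k-2})] = 3^2 M(2^{k-2})$
$= \cdots = 3^i M(2^{k-i})$
$= \cdots = 3^k M(2^{k-k}) = 3^k$.
Since $k = \log_2 n$,
$M(n) = 3^{\log_2 n} = n^{\log_2 3} \approx n^{1.585}$.
(On the last step, we took advantage of the following property of logarithms: $a^{\log_b c} = c^{\log_b a}$.)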
Let A(n) be the number of digit additions and subtractions executed by the above algorithm
in multiplying two n-digit decimal integers. Besides 3A(n/2) of these operations needed to compute
the three products of n/2-digit numbers, the above formulas require five additions and one
subtraction. Hence, we have the recurrence
A(n) = 3A(n/2) + cn for n > 1, A(1) = 1.
By the Master Theorem, we obtain $A(n) \in \Theta(n^{\log_2 3})$, which means that the total number
of additions and subtractions has the same asymptotic order of growth as the number of
multiplications.
Here A, B, and C are n × n matrices, each divided into four n/2 × n/2 submatrices (A00, A01, A10,
A11, and similarly for B and C). The value C00 can be computed either as A00 * B00 + A01 * B10 or
as M1 + M4 − M5 + M7, where M1, M4, M5, and M7 are found by Strassen’s formulas, with the numbers
replaced by the corresponding submatrices. The seven products of n/2 × n/2 matrices are computed
recursively by Strassen’s matrix multiplication algorithm.
Let M(n) be the number of multiplications made by Strassen’s algorithm in multiplying two n × n
matrices, where n is a power of 2: M(n) = 7M(n/2) for n > 1, M(1) = 1. Since $n = 2^k$,
$M(2^k) = 7M(2^{k-1})$
$= 7[7M(2^{k-2})] = 7^2 M(2^{k-2})$
$= \cdots = 7^i M(2^{k-i})$
$= \cdots = 7^k M(2^{k-k}) = 7^k M(2^0) = 7^k M(1) = 7^k$ (since M(1) = 1).
Since $k = \log_2 n$,
$M(n) = 7^{\log_2 n} = n^{\log_2 7} \approx n^{2.807}$,
which is smaller than $n^3$ required by the brute-force algorithm.
Since this savings in the number of multiplications was achieved at the expense of making
extra additions, we must check the number of additions A(n) made by Strassen’s algorithm. To
multiply two matrices of order n>1, the algorithm needs to multiply seven matrices of order n/2 and
make 18 additions/subtractions of matrices of size n/2; when n = 1, no additions are made since two
numbers are simply multiplied. These observations yield the following recurrence relation:
$A(n) = 7A(n/2) + 18(n/2)^2$ for $n > 1$, $A(1) = 0$.
By the closed-form solution to this recurrence and the Master Theorem, $A(n) \in \Theta(n^{\log_2 7})$,
which is a better efficiency class than $\Theta(n^3)$ of the brute-force method.
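The recursion can be sketched in Python as shown below. The notes give only the expression C00 = M1 + M4 − M5 + M7; the remaining product and combination formulas used here are the standard Strassen formulas and should be read as an assumption of this sketch, as should the helper names add, sub, split, and strassen. The sketch assumes square matrices whose order is a power of 2, stored as lists of lists.

```python
def add(x, y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(x, y)]

def sub(x, y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(x, y)]

def split(m):
    """Return the four n/2 x n/2 blocks m00, m01, m10, m11 of matrix m."""
    h = len(m) // 2
    return ([r[:h] for r in m[:h]], [r[h:] for r in m[:h]],
            [r[:h] for r in m[h:]], [r[h:] for r in m[h:]])

def strassen(a, b):
    """Multiply two n x n matrices (n a power of 2) with seven recursive products."""
    if len(a) == 1:                    # 1 x 1 case: a single multiplication
        return [[a[0][0] * b[0][0]]]
    a00, a01, a10, a11 = split(a)
    b00, b01, b10, b11 = split(b)
    m1 = strassen(add(a00, a11), add(b00, b11))
    m2 = strassen(add(a10, a11), b00)
    m3 = strassen(a00, sub(b01, b11))
    m4 = strassen(a11, sub(b10, b00))
    m5 = strassen(add(a00, a01), b11)
    m6 = strassen(sub(a10, a00), add(b00, b01))
    m7 = strassen(sub(a01, a11), add(b10, b11))
    c00 = add(sub(add(m1, m4), m5), m7)
    c01 = add(m3, m5)
    c10 = add(m2, m4)
    c11 = add(sub(add(m1, m3), m2), m6)
    top = [r + s for r, s in zip(c00, c01)]       # glue blocks side by side
    bottom = [r + s for r, s in zip(c10, c11)]
    return top + bottom

print(strassen([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each level performs 7 recursive products and 18 block additions or subtractions, matching the recurrences for M(n) and A(n).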
Example: Multiply two matrices A and B by Strassen’s matrix multiplication algorithm, dividing
them into the submatrices A00, A01, A10, A11 and B00, B01, B10, B11 and combining the seven
products into C = A * B.
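The points are divided into two subsets Pl and Pr by a vertical line through the median of
their x coordinates, so that ⌈n/2⌉ points lie to the left of or on the line itself, and ⌊n/2⌋
points lie to the right of or on the line. Then we can solve the closest-pair problem recursively
for subsets Pl and Pr. Let dl and dr be the smallest distances between pairs of points in Pl
and Pr, respectively, and let d = min{dl, dr}.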
FIGURE 2.13 (a) Idea of the divide-and-conquer algorithm for the closest-pair problem.
(b) Rectangle that may contain points closer than dmin to point p.
Note that d is not necessarily the smallest distance between all the point pairs because
points of a closer pair can lie on the opposite sides of the separating line. Therefore, as a step
combining the solutions to the smaller subproblems, we need to examine such points. Obviously,
we can limit our attention to the points inside the symmetric vertical strip of width 2d around the
separating line, since the distance between any other pair of points is at least d (Figure 2.13a).
Let S be the list of points inside the strip of width 2d around the separating line, obtained
from Q and hence ordered in nondecreasing order of their y coordinate. We will scan this list,
updating the information about dmin, the minimum distance seen so far, if we encounter a closer
pair of points. Initially, dmin = d, and subsequently dmin ≤ d. Let p(x, y) be a point on this list.
For a point p′(x′, y′) to have a chance to be closer to p than dmin, the point must follow p on list S
and the difference between their y coordinates must be less than dmin.
Geometrically, this means that p′ must belong to the rectangle shown in Figure 2.13b. The
principal insight exploited by the algorithm is the observation that the rectangle can contain just a
few such points, because the points in each half (left and right) of the rectangle must be at least
distance d apart. It is easy to prove that the total number of such points in the rectangle, including p,
does not exceed 8. A more careful analysis reduces this number to 6. Thus, the algorithm can
consider no more than five next points following p on the list S, before moving up to the next point.
Here is pseudocode of the algorithm. It compares squared distances in order to avoid computing
square roots inside the innermost loop.
ALGORITHM EfficientClosestPair(P, Q)
//Solves the closest-pair problem by divide-and-conquer
//Input: An array P of n ≥ 2 points in the Cartesian plane sorted in nondecreasing
//       order of their x coordinates and an array Q of the same points sorted in
//       nondecreasing order of the y coordinates
//Output: Euclidean distance between the closest pair of points
if n ≤ 3
    return the minimal distance found by the brute-force algorithm
else
    copy the first ⌈n/2⌉ points of P to array Pl
    copy the same ⌈n/2⌉ points from Q to array Ql
    copy the remaining ⌊n/2⌋ points of P to array Pr
    copy the same ⌊n/2⌋ points from Q to array Qr
    dl ← EfficientClosestPair(Pl, Ql)
    dr ← EfficientClosestPair(Pr, Qr)
    d ← min{dl, dr}
    m ← P[⌈n/2⌉ − 1].x
    copy all the points of Q for which |x − m| < d into array S[0..num − 1]
    dminsq ← d^2
    for i ← 0 to num − 2 do
        k ← i + 1
        while k ≤ num − 1 and (S[k].y − S[i].y)^2 < dminsq
            dminsq ← min((S[k].x − S[i].x)^2 + (S[k].y − S[i].y)^2, dminsq)
            k ← k + 1
    return sqrt(dminsq)
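A Python sketch along the lines of the pseudocode is given below. It is an illustration under stated assumptions: points are distinct (x, y) tuples, the recursion carries squared distances and takes the square root only once at the end, and the helper names dist2, _closest_sq, and closest_pair are hypothetical.

```python
import math
from itertools import combinations

def dist2(p, q):
    """Squared Euclidean distance between points p and q."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2

def _closest_sq(P, Q):
    """P sorted by x, Q the same points sorted by y; returns squared distance."""
    n = len(P)
    if n <= 3:                                   # brute force for tiny inputs
        return min(dist2(p, q) for p, q in combinations(P, 2))
    mid = (n + 1) // 2                           # ceil(n/2) points go left
    Pl, Pr = P[:mid], P[mid:]
    left = set(Pl)                               # assumes distinct points
    Ql = [p for p in Q if p in left]
    Qr = [p for p in Q if p not in left]
    dsq = min(_closest_sq(Pl, Ql), _closest_sq(Pr, Qr))
    m = P[mid - 1][0]                            # x coordinate of dividing line
    S = [p for p in Q if (p[0] - m) ** 2 < dsq]  # strip |x - m| < d, y-ordered
    for i in range(len(S) - 1):
        k = i + 1
        while k < len(S) and (S[k][1] - S[i][1]) ** 2 < dsq:
            dsq = min(dsq, dist2(S[k], S[i]))
            k += 1
    return dsq

def closest_pair(points):
    """Distance between the closest pair among n >= 2 distinct points."""
    P = sorted(points)                           # by x, ties by y
    Q = sorted(P, key=lambda p: (p[1], p[0]))    # by y, ties by x
    return math.sqrt(_closest_sq(P, Q))

PTS = [(2, 3), (12, 30), (40, 50), (5, 1), (12, 10), (3, 4), (7, 7), (20, 20), (21, 22)]
print(closest_pair(PTS))
```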
The algorithm spends linear time both for dividing the problem into two problems half the
size and combining the obtained solutions. Therefore, assuming as usual that n is a power of 2, we
have the following recurrence for the running time of the algorithm:
T(n) = 2T(n/2) + f(n),
where f(n) ∈ Θ(n). Applying the Master Theorem (with a = 2, b = 2, and d = 1), we get
T(n) ∈ Θ(n log n). The necessity to presort input points does not change the overall efficiency
class if sorting is done by an O(n log n) algorithm such as mergesort. In fact, this is the best
efficiency class one can achieve, because it has been proved that any algorithm for this problem
must be in Ω(n log n) under some natural assumptions about operations an algorithm can perform.
Convex-Hull Problem
The convex-hull problem is to find the smallest convex polygon that contains n given
points in the plane. We consider here a divide-and-conquer algorithm called quickhull because of
its resemblance to quicksort.
Let S be a set of n > 1 points p1(x1, y1), . . . , pn(xn, yn) in the Cartesian plane. We assume that
the points are sorted in nondecreasing order of their x coordinates, with ties resolved by increasing
order of the y coordinates of the points involved. It is not difficult to prove the geometrically
obvious fact that the leftmost point p1 and the rightmost point pn are two distinct extreme points of
the set’s convex hull (Figure 2.14). The straight line through p1 and pn, directed from p1 to pn,
separates the other points of S into two sets: S1 is the set of points to the left of this line, and
S2 is the set of points to the right of this line.
The boundary of the convex hull of S is made up of two polygonal chains: an “upper”
boundary and a “lower” boundary. The “upper” boundary, called the upper hull, is a sequence of
line segments with vertices at p1, some of the points in S1 (if S1 is not empty) and pn. The
“lower” boundary, called the lower hull, is a sequence of line segments with vertices at p1, some
of the points in S2 (if S2 is not empty) and pn. The convex hull of the entire set S is composed of
the upper and lower hulls. The lower hull can be constructed in the same manner as the upper hull.
If S1 is empty, the upper hull is simply the line segment with the endpoints at p1 and pn. If S1 is
not empty, the algorithm identifies point pmax in S1, which is the farthest from the line p1pn
(Figure 2.15). If there is a tie, the point that maximizes the angle ∠pmax p1 pn can be selected.
(Note that point pmax maximizes the area of the triangle with two vertices at p1 and pn and the third
one at some other point of S1.) Then the algorithm identifies all the points of set S1 that are to the
left of the line p1pmax; these are the points that will make up the set S1,1. The points of S1 to the
left of the line pmaxpn will make up the set S1,2. It is not difficult to prove the following:
pmax is a vertex of the upper hull.
The points inside the triangle p1pmaxpn cannot be vertices of the upper hull (and hence can be
eliminated from further consideration).
There are no points to the left of both lines p1pmax and pmaxpn.
Therefore, the algorithm can continue constructing the upper hulls of p1 ∪ S1,1 ∪ pmax and
pmax ∪ S1,2 ∪ pn recursively and then simply concatenate them to get the upper hull of the entire
set p1 ∪ S1 ∪ pn.
Now we have to figure out how the algorithm’s geometric operations can be actually
implemented. Fortunately, we can take advantage of the following very useful fact from analytical
geometry: if q1(x1, y1), q2(x2, y2), and q3(x3, y3) are three arbitrary points in the Cartesian plane,
then the area of the triangle q1q2q3 is equal to one-half of the magnitude of the determinant
$\begin{vmatrix} x_1 & y_1 & 1 \\ x_2 & y_2 & 1 \\ x_3 & y_3 & 1 \end{vmatrix} = x_1 y_2 + x_3 y_1 + x_2 y_3 - x_3 y_2 - x_2 y_1 - x_1 y_3$,
while the sign of this expression is positive if and only if the point q3 = (x3, y3) is to the left of the
line q1q2. Using this formula, we can check in constant time whether a point lies to the left of the
line determined by two other points as well as find the distance from the point to the line.
Quickhull has the same Θ(n^2) worst-case efficiency as quicksort. In the average case, a
much better performance is expected.
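The quickhull idea can be sketched in Python using the determinant test for "to the left of". This is a hedged sketch, not a definitive implementation: points are (x, y) tuples, ties for the farthest point are broken by taking the first such point in sorted order rather than by the angle rule, and the names signed_area2, _hull_side, and quickhull are hypothetical.

```python
def signed_area2(q1, q2, q3):
    """Determinant value: positive iff q3 is to the left of the line q1 -> q2."""
    return (q1[0] * q2[1] + q3[0] * q1[1] + q2[0] * q3[1]
            - q3[0] * q2[1] - q2[0] * q1[1] - q1[0] * q3[1])

def _hull_side(p1, pn, pts):
    """Hull vertices strictly between p1 and pn among pts left of p1 -> pn."""
    if not pts:
        return []
    # The farthest point from line p1pn maximizes the triangle area.
    pmax = max(pts, key=lambda q: signed_area2(p1, pn, q))
    s11 = [q for q in pts if signed_area2(p1, pmax, q) > 0]
    s12 = [q for q in pts if signed_area2(pmax, pn, q) > 0]
    return _hull_side(p1, pmax, s11) + [pmax] + _hull_side(pmax, pn, s12)

def quickhull(points):
    """Return hull vertices in clockwise order, starting at the leftmost point."""
    pts = sorted(set(points))          # by x, ties by y
    if len(pts) < 3:
        return pts
    p1, pn = pts[0], pts[-1]
    upper = [q for q in pts if signed_area2(p1, pn, q) > 0]
    lower = [q for q in pts if signed_area2(pn, p1, q) > 0]
    return [p1] + _hull_side(p1, pn, upper) + [pn] + _hull_side(pn, p1, lower)

print(quickhull([(0, 0), (4, 0), (4, 4), (0, 4), (1, 1), (2, 2), (3, 1)]))
```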
CS8451 Design and Analysis of Algorithms Unit III 3.1
The cost of the algorithm is the cost of filling out the table, with addition as the basic operation.
Because k ≤ n, the sum needs to be split into two parts: only half of the table (a triangle) needs to
be filled out for the rows i ≤ k, while for the remaining rows the table is filled out across the
entire row (a rectangle).
A(n, k) = sum for the upper triangle + sum for the lower rectangle
$= \sum_{i=1}^{k} \sum_{j=1}^{i-1} 1 + \sum_{i=k+1}^{n} \sum_{j=1}^{k} 1$
$= \sum_{i=1}^{k} (i - 1) + \sum_{i=k+1}^{n} k$
$= \frac{(k - 1)k}{2} + k(n - k) \in \Theta(nk)$
Time efficiency: Θ(nk)
Space efficiency: Θ(nk)
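The table-filling computation analyzed above can be sketched in Python as follows. The sketch assumes 0 ≤ k ≤ n and uses the hypothetical name binomial; each table entry that is not on the border costs one addition.

```python
def binomial(n, k):
    """Compute C(n, k) by filling an (n + 1) x (k + 1) table row by row."""
    C = [[0] * (k + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for j in range(min(i, k) + 1):
            if j == 0 or j == i:
                C[i][j] = 1                              # border of the triangle
            else:
                C[i][j] = C[i - 1][j - 1] + C[i - 1][j]  # one addition
    return C[n][k]

print([binomial(7, j) for j in range(8)])
```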
Using an identity called Pascal’s Formula, a recursive formulation for the binomial coefficient
looks like this:
C(n, k) = C(n − 1, k − 1) + C(n − 1, k) for n > k > 0, C(n, 0) = C(n, n) = 1.
This construction forms Pascal’s triangle: each number in the triangle is the sum of the two
numbers directly above it.
Example:
$(x+y)^7 = 1 x^7 y^0 + 7 x^6 y^1 + 21 x^5 y^2 + 35 x^4 y^3 + 35 x^3 y^4 + 21 x^2 y^5 + 7 x^1 y^6 + 1 x^0 y^7$
$= x^7 + 7x^6 y + 21x^5 y^2 + 35x^4 y^3 + 35x^3 y^4 + 21x^2 y^5 + 7xy^6 + y^7$
FIGURE 3.1 (a) Digraph. (b) Its adjacency matrix. (c) Its transitive closure.
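The transitive closure of a digraph can be generated with the help of depth-first search or
breadth-first search: a traversal started at the ith vertex finds all the vertices reachable from
it and hence the ith row of the closure. Doing such a traversal for every vertex as a starting point
yields the transitive closure in its entirety.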
Warshall’s algorithm constructs the transitive closure through a series of n × n boolean
matrices: $R^{(0)}, \ldots, R^{(k-1)}, R^{(k)}, \ldots, R^{(n)}$.
The element $r_{ij}^{(k)}$ in the ith row and jth column of matrix $R^{(k)}$ (i, j = 1, 2, . . . , n,
k = 0, 1, . . . , n) is equal to 1 if and only if there exists a directed path of a positive length from
the ith vertex to the jth vertex with each intermediate vertex, if any, numbered not higher than k.
Steps to compute $R^{(0)}, \ldots, R^{(k-1)}, R^{(k)}, \ldots, R^{(n)}$:
The series starts with $R^{(0)}$, which does not allow any intermediate vertices in its
paths; hence, $R^{(0)}$ is nothing other than the adjacency matrix of the digraph.
$R^{(1)}$ contains the information about paths that can use the first vertex as
intermediate. Thus, it may contain more 1’s than $R^{(0)}$.
In general, each subsequent matrix in the series has one more vertex to use as
intermediate for its paths than its predecessor.
The last matrix in the series, $R^{(n)}$, reflects paths that can use all n vertices of the
digraph as intermediate and hence is nothing other than the digraph’s
transitive closure.
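All the elements of each matrix $R^{(k)}$ are computed from its immediate predecessor $R^{(k-1)}$.
Let $r_{ij}^{(k)}$, the element in the ith row and jth column of matrix $R^{(k)}$, be equal to 1. This
means that there exists a path from the ith vertex vi to the jth vertex vj with each intermediate
vertex numbered not higher than k.
Two situations are possible. In the first, the list of intermediate vertices does not contain the kth
vertex, so the path uses intermediate vertices numbered not higher than k − 1 and hence
$r_{ij}^{(k-1)} = 1$. In the second, the path does contain the kth vertex vk and can be represented as
vi, vertices numbered ≤ k − 1, vk, vertices numbered ≤ k − 1, vj.
The first part of this representation means that there exists a path from vi to vk with each
intermediate vertex numbered not higher than k − 1 (hence, $r_{ik}^{(k-1)} = 1$), and the second part
means that there exists a path from vk to vj with each intermediate vertex numbered not higher than
k − 1 (hence, $r_{kj}^{(k-1)} = 1$).
Thus the following formula generates the elements of matrix $R^{(k)}$ from the elements of matrix
$R^{(k-1)}$:
$r_{ij}^{(k)} = r_{ij}^{(k-1)}$ or $\left(r_{ik}^{(k-1)} \text{ and } r_{kj}^{(k-1)}\right)$.
Warshall’s algorithm’s time efficiency is Θ(n^3). Its space efficiency is Θ(n^2), i.e., the matrix size.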
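The computation of the matrix series can be sketched in Python as follows. The 4-vertex digraph in the usage lines (edges a→b, b→d, d→a, d→c) is an illustrative instance assumed for this sketch, and warshall is a hypothetical function name.

```python
def warshall(adj):
    """Return the transitive closure of a digraph given by a 0/1 adjacency matrix."""
    n = len(adj)
    r = [row[:] for row in adj]            # R(0) is the adjacency matrix
    for k in range(n):                     # now vertex k may be an intermediate
        r = [[r[i][j] or (r[i][k] and r[k][j]) for j in range(n)]
             for i in range(n)]            # R(k) is built from R(k-1)
    return r

# Illustrative digraph (an assumption): a->b, b->d, d->a, d->c; order a, b, c, d.
ADJ = [
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 0],
]
for row in warshall(ADJ):
    print(row)
```

Building a fresh matrix for every k mirrors the series of matrices in the text; updating a single matrix in place is a common space-saving variant.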
FIGURE 3.3 Application of Warshall’s algorithm to the digraph shown. New 1’s are in bold.
The series starts with $D^{(0)}$, which does not allow any intermediate vertices in its
paths; hence, $D^{(0)}$ is simply the weight matrix of the graph.
As in Warshall’s algorithm, we can compute all the elements of each matrix $D^{(k)}$
from its immediate predecessor $D^{(k-1)}$.
The last matrix in the series, $D^{(n)}$, contains the lengths of the shortest paths among
all paths that can use all n vertices as intermediate and hence is nothing other than
the distance matrix.
Let $d_{ij}^{(k)}$ be the element in the ith row and the jth column of matrix $D^{(k)}$. This means that
$d_{ij}^{(k)}$ is equal to the length of the shortest path among all paths from the ith vertex vi to the
jth vertex vj with their intermediate vertices numbered not higher than k.
The length of the shortest path can be computed by the following recurrence:
$d_{ij}^{(k)} = \min\{d_{ij}^{(k-1)},\ d_{ik}^{(k-1)} + d_{kj}^{(k-1)}\}$ for $k \ge 1$, $d_{ij}^{(0)} = w_{ij}$.
Floyd’s algorithm’s time efficiency is Θ(n^3). Its space efficiency is Θ(n^2), i.e., the matrix size.
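The recurrence can be sketched in Python as follows. The weight matrix W is an illustrative 4-vertex instance assumed for this sketch (edges a→c of weight 3, b→a of 2, c→b of 7, c→d of 1, d→a of 6), with INF marking a missing edge; the name floyd is hypothetical.

```python
INF = float("inf")

def floyd(w):
    """Return the distance matrix for the weight matrix w (INF = no edge)."""
    n = len(w)
    d = [row[:] for row in w]              # D(0) is the weight matrix
    for k in range(n):                     # now vertex k may be an intermediate
        for i in range(n):
            for j in range(n):
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

# Illustrative weight matrix (an assumption); vertex order a, b, c, d.
W = [
    [0, INF, 3, INF],
    [2, 0, INF, INF],
    [INF, 7, 0, 1],
    [6, INF, INF, 0],
]
for row in floyd(W):
    print(row)
```

The sketch overwrites one matrix in place, which is safe because row k and column k do not change during iteration k.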
FIGURE 3.5 Application of Floyd’s algorithm to the digraph shown. Updated elements are
shown in bold.
FIGURE 3.6 Two out of 14 possible binary search trees with keys A, B, C, and D.
Consider four keys A, B, C, and D to be searched for with probabilities 0.1, 0.2, 0.4, and
0.3, respectively. Figure 3.6 depicts two out of 14 possible binary search trees containing these
keys.
The average number of comparisons in a successful search in the first of these trees is
0.1 · 1 + 0.2 · 2 + 0.4 · 3 + 0.3 · 4 = 2.9, and for the second one it is
0.1 · 2 + 0.2 · 1 + 0.4 · 2 + 0.3 · 3 = 2.1.
Neither of these two trees is optimal.
The total number of binary search trees with n keys is equal to the nth Catalan number,
c(n) = (2n)! / ((n + 1)! n!).
Let a1, . . . , an be distinct keys ordered from the smallest to the largest and let p1, . . . ,
pn be the probabilities of searching for them. Let C(i, j) be the smallest average number of
comparisons made in a successful search in a binary search tree $T_i^j$ made up of keys ai, . . . , aj,
where i, j are some integer indices, 1 ≤ i ≤ j ≤ n.
FIGURE 3.7 Binary search tree (BST) with root ak and two optimal binary search subtrees
$T_i^{k-1}$ and $T_{k+1}^{j}$.
Consider all possible ways to choose a root ak among the keys ai, . . . , aj. For such a binary
search tree (Figure 3.7), the root contains key ak, the left subtree $T_i^{k-1}$ contains keys
ai, . . . , ak−1 optimally arranged, and the right subtree $T_{k+1}^{j}$ contains keys ak+1, . . . , aj
also optimally arranged.
If we count tree levels starting with 1 to make the comparison numbers equal the keys’
levels, the following recurrence relation is obtained:
$C(i, j) = \min_{i \le k \le j} \{C(i, k - 1) + C(k + 1, j)\} + \sum_{s=i}^{j} p_s$ for $1 \le i \le j \le n$,
with C(i, i − 1) = 0 for 1 ≤ i ≤ n + 1 and C(i, i) = pi for 1 ≤ i ≤ n.
FIGURE 3.8 Table of the dynamic programming algorithm for constructing an optimal binary
search tree.
The two-dimensional table in Figure 3.8 shows the values needed for computing C(i, j).
They are in row i and the columns to the left of column j and in column j and the rows below row i.
The arrows point to the pairs of entries whose sums are computed in order to find the smallest one
to be recorded as the value of C(i, j). This suggests filling the table along its diagonals, starting with
all zeros on the main diagonal and given probabilities pi, 1 ≤ i ≤ n, right above it and moving
toward the upper right corner.
EXAMPLE: Let us illustrate the algorithm by applying it to the four-key set we used at the
beginning of this section:
key A B C D
probability 0.1 0.2 0.4 0.3
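The initial tables contain C(i, i − 1) = 0, C(i, i) = pi, and R(i, i) = i. Let us compute C(1, 2):
k = 1: C(1, 0) + C(2, 2) + (p1 + p2) = 0 + 0.2 + 0.3 = 0.5
k = 2: C(1, 1) + C(3, 2) + (p1 + p2) = 0.1 + 0 + 0.3 = 0.4
so C(1, 2) = min{0.5, 0.4} = 0.4.
Thus, out of two possible binary trees containing the first two keys, A and B, the root of the
optimal tree has index 2 (i.e., it contains B), and the average number of comparisons in a successful
search in this tree is 0.4.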
We arrive at the following final tables:
Thus, the average number of key comparisons in the optimal tree is equal to 1.7. Since R(1,
4) = 3, the root of the optimal tree contains the third key, i.e., C. Its left subtree is made up of keys
A and B, and its right subtree contains just key D. To find the specific structure of these subtrees,
we find first their roots by consulting the root table again as follows. Since R(1, 2) = 2, the root of
the optimal tree containing A and B is B, with A being its left child (and the root of the one node
tree: R(1, 1) = 1). Since R(4, 4) = 4, the root of this one-node optimal tree is its only key D. Figure
3.10 presents the optimal tree in its entirety.
FIGURE 3.10 Optimal binary search tree for the above example.
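The table-filling procedure for this example can be sketched in Python as follows. It is a minimal sketch: the tables are indexed from 1 as in the text, padded with an extra row and column so that C(i, i − 1) = 0 is available, and the name optimal_bst is hypothetical.

```python
def optimal_bst(p):
    """Return the main table C and root table R for search probabilities p."""
    n = len(p)
    C = [[0.0] * (n + 2) for _ in range(n + 2)]   # C[i][i-1] = 0 by default
    R = [[0] * (n + 2) for _ in range(n + 2)]
    for i in range(1, n + 1):
        C[i][i] = p[i - 1]
        R[i][i] = i
    for d in range(1, n):                         # diagonal number
        for i in range(1, n - d + 1):
            j = i + d
            best, best_k = float("inf"), i
            for k in range(i, j + 1):             # try every key as the root
                cost = C[i][k - 1] + C[k + 1][j]
                if cost < best:
                    best, best_k = cost, k
            C[i][j] = best + sum(p[i - 1:j])      # add p_i + ... + p_j
            R[i][j] = best_k
    return C, R

C, R = optimal_bst([0.1, 0.2, 0.4, 0.3])          # keys A, B, C, D
print(round(C[1][4], 2), R[1][4])
```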
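Thus, the value of an optimal solution among all feasible subsets of the first i items is the
maximum of these two values. Of course, if the ith item does not fit into the knapsack, the value of
an optimal subset selected from the first i items is the same as the value of an optimal subset
selected from the first i − 1 items. These observations lead to the following recurrence, where
F(i, j) is the value of an optimal solution for the first i items and knapsack capacity j:
$F(i, j) = \max\{F(i - 1, j),\ v_i + F(i - 1, j - w_i)\}$ if $j - w_i \ge 0$,
$F(i, j) = F(i - 1, j)$ if $j - w_i < 0$,
with the initial conditions F(0, j) = 0 for j ≥ 0 and F(i, 0) = 0 for i ≥ 0.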
The maximal value is F(4, 5) = $37. We can find the composition of an optimal subset by
backtracing the computations of this entry in the table (backtracing recovers the actual optimal
subset, i.e., the solution). Since F(4, 5) > F(3, 5), item 4 has to be included in an optimal solution
along with an optimal subset for filling 5 − 2 = 3 remaining units of the knapsack capacity. The
value of the latter is F(3, 3). Since F(3, 3) = F(2, 3), item 3 need not be in an optimal subset.
Since F(2, 3) > F(1, 3), item 2 is a part of an optimal selection, which leaves element F(1, 3 − 1)
to specify its remaining composition. Similarly, since F(1, 2) > F(0, 2), item 1 is the final part of
the optimal solution {item 1, item 2, item 4}.
Table 3.3 Solving an instance of the knapsack problem by the dynamic programming algorithm.
                        Capacity j
                   i    0    1    2    3    4    5
                   0    0    0    0    0    0    0
w1 = 2, v1 = 12    1    0    0   12   12   12   12
w2 = 1, v2 = 10    2    0   10   12   22   22   22
w3 = 3, v3 = 20    3    0   10   12   22   30   32
w4 = 2, v4 = 15    4    0   10   15   25   30   37
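The bottom-up table filling and the backtracing step can be sketched together in Python. This is a minimal sketch with a hypothetical function name knapsack_dp; the usage line uses the instance of Table 3.3.

```python
def knapsack_dp(weights, values, capacity):
    """Return (optimal value, sorted 1-based item numbers of an optimal subset)."""
    n = len(weights)
    F = [[0] * (capacity + 1) for _ in range(n + 1)]   # row 0 and column 0 are 0
    for i in range(1, n + 1):
        w, v = weights[i - 1], values[i - 1]
        for j in range(1, capacity + 1):
            F[i][j] = F[i - 1][j]                      # item i is left out
            if j >= w:                                 # item i fits
                F[i][j] = max(F[i][j], v + F[i - 1][j - w])
    items, j = [], capacity
    for i in range(n, 0, -1):                          # backtracing
        if F[i][j] > F[i - 1][j]:                      # item i must be included
            items.append(i)
            j -= weights[i - 1]
    return F[n][capacity], sorted(items)

print(knapsack_dp([2, 1, 3, 2], [12, 10, 20, 15], 5))
```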
Memory Functions
The direct top-down approach to finding a solution to such a recurrence leads to an
algorithm that solves common subproblems more than once and hence is very inefficient.
The bottom-up approach fills a table with solutions to all smaller subproblems, but each of them
is solved only once. An unsatisfying aspect of this approach is that solutions to some of these
smaller subproblems are often not necessary for getting a solution to the problem given.
Since this drawback is not present in the top-down approach, it is natural to try to combine
the strengths of the top-down and bottom-up approaches. The goal is to get a method that solves
only subproblems that are necessary and does so only once. Such a method exists; it is based on
using memory functions.
This method solves a given problem in the top-down manner but, in addition, maintains a
table of the kind that would have been used by a bottom-up dynamic programming algorithm.
Initially, all the table’s entries are initialized with a special “null” symbol to indicate that
they have not yet been calculated. Thereafter, whenever a new value needs to be calculated, the
method checks the corresponding entry in the table first: if this entry is not “null,” it is simply
retrieved from the table; otherwise, it is computed by the recursive call whose result is then
recorded in the table.
The following algorithm implements this idea for the knapsack problem. After initializing
the table, the recursive function needs to be called with i = n (the number of items) and j = W (the
knapsack capacity).
ALGORITHM MFKnapsack(i, j)
//Implements the memory function method for the knapsack problem
//Input: A nonnegative integer i indicating the number of the first items being considered
//       and a nonnegative integer j indicating the knapsack capacity
//Output: The value of an optimal feasible subset of the first i items
//Note: Uses as global variables input arrays Weights[1..n], Values[1..n],
//      and table F[0..n, 0..W] whose entries are initialized with −1’s except for
//      row 0 and column 0 initialized with 0’s
if F[i, j] < 0
    if j < Weights[i]
        value ← MFKnapsack(i − 1, j)
    else
        value ← max(MFKnapsack(i − 1, j),
                    Values[i] + MFKnapsack(i − 1, j − Weights[i]))
    F[i, j] ← value
return F[i, j]
EXAMPLE 2 Let us apply the memory function method to the instance considered in Example 1.
                        Capacity j
                   i    0    1    2    3    4    5
                   0    0    0    0    0    0    0
w1 = 2, v1 = 12    1    0    0   12   12   12   12
w2 = 1, v2 = 10    2    0    -   12   22    -   22
w3 = 3, v3 = 20    3    0    -    -   22    -   32
w4 = 2, v4 = 15    4    0    -    -    -    -   37
Only 11 out of 20 nontrivial values (i.e., not those in row 0 or in column 0) have been
computed. Just one nontrivial entry, F(1, 2), is retrieved rather than being recomputed. For larger
instances, the proportion of such entries can be significantly larger.
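The memory function method can be sketched in Python as follows. Instead of global variables, this sketch keeps the table F inside an enclosing function, and it additionally counts how many nontrivial entries get computed; both choices, and the name mf_knapsack, are assumptions of the sketch rather than part of the pseudocode.

```python
def mf_knapsack(weights, values, capacity):
    """Return (optimal value, number of nontrivial table entries computed)."""
    n = len(weights)
    # Row 0 and column 0 hold 0's; every other entry starts as -1 ("null").
    F = [[0] * (capacity + 1)] + [[0] + [-1] * capacity for _ in range(n)]
    computed = 0

    def solve(i, j):
        nonlocal computed
        if F[i][j] < 0:                        # not calculated yet
            w, v = weights[i - 1], values[i - 1]
            if j < w:
                value = solve(i - 1, j)
            else:
                value = max(solve(i - 1, j), v + solve(i - 1, j - w))
            F[i][j] = value                    # record the result in the table
            computed += 1
        return F[i][j]

    return solve(n, capacity), computed

print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))
```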
Two classic algorithms for the minimum spanning tree problem are Prim’s algorithm and
Kruskal’s algorithm. They solve the same problem by applying the greedy approach in two
different ways, and both of them always yield an optimal solution.
Another classic greedy algorithm is Dijkstra’s algorithm, which solves the shortest-path problem
in a weighted graph. Huffman codes are an important data compression method that can be
interpreted as an application of the greedy technique.
There are three common ways to prove that a greedy algorithm yields an optimal solution. The
first way is by mathematical induction.
The second way to prove optimality of a greedy algorithm is to show that on each step it does
at least as well as any other algorithm could in advancing toward the problem’s goal.
Example: Find the minimum number of moves needed for a chess knight to go from one corner of a
100 × 100 board to the diagonally opposite corner. (The knight’s moves are L-shaped jumps:
two squares horizontally or vertically followed by one square in the perpendicular direction.)
A greedy solution is clear here: jump as close to the goal as possible on each move. Thus, if
its start and finish squares are (1, 1) and (100, 100), respectively, a sequence of 66 moves such as (1, 1)
− (3, 2) − (4, 4) − . . . − (97, 97) − (99, 98) − (100, 100) solves the problem. (The number k of
two-move advances can be obtained from the equation 1 + 3k = 100, which gives k = 33.)
Why is this a minimum-move solution? Because if we measure the distance to the goal by
the Manhattan distance, which is the sum of the difference between the row numbers and the
difference between the column numbers of the two squares in question, the greedy algorithm decreases
it by 3 on each move, which is the most any knight move can achieve. Since the initial Manhattan
distance is 99 + 99 = 198, at least 198/3 = 66 moves are needed.
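The count of 66 moves can also be checked by brute force. The sketch below is not part of the greedy argument; it simply runs a breadth-first search over the board squares (0-based coordinates and the function name `knight_distance` are assumptions made for illustration).

```python
from collections import deque


def knight_distance(n):
    """Fewest knight moves from corner (0, 0) to corner (n-1, n-1) of an n x n board."""
    jumps = [(1, 2), (2, 1), (-1, 2), (-2, 1), (1, -2), (2, -1), (-1, -2), (-2, -1)]
    dist = {(0, 0): 0}
    queue = deque([(0, 0)])
    while queue:
        r, c = queue.popleft()
        if (r, c) == (n - 1, n - 1):
            return dist[(r, c)]
        for dr, dc in jumps:
            nxt = (r + dr, c + dc)
            # Stay on the board and visit every square at most once.
            if 0 <= nxt[0] < n and 0 <= nxt[1] < n and nxt not in dist:
                dist[nxt] = dist[(r, c)] + 1
                queue.append(nxt)
    return None  # target unreachable (happens only on very small boards)


moves = knight_distance(100)
print(moves)
```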
The third way is simply to show that the final result obtained by a greedy algorithm is
optimal based on the algorithm’s output rather than the way it operates.
Example: Consider the problem of placing the maximum number of chips on an 8 × 8 board so
that no two chips are placed on the same square or on squares adjacent vertically, horizontally, or diagonally.
FIGURE 3.12 (a) Placement of 16 chips on non-adjacent squares. (b) Partition of the board
proving impossibility of placing more than 16 chips.
Partition the board into sixteen 2 × 2 squares, as in Figure 3.12b.
It is impossible to place more than one chip in each of these squares, which implies that the
total number of nonadjacent chips on the board cannot exceed 16.
A spanning tree of an undirected connected graph is its connected acyclic subgraph (i.e.,
a tree) that contains all the vertices of the graph. If such a graph has weights assigned to its edges, a
minimum spanning tree is its spanning tree of the smallest weight, where the weight of a tree
is defined as the sum of the weights on all its edges. The minimum spanning tree problem is
the problem of finding a minimum spanning tree for a given weighted connected graph.
FIGURE 3.13 Graph and its spanning trees, with T1 being the minimum spanning tree.
ALGORITHM Prim(G)
//Prim’s algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = (V, E)
//Output: ET, the set of edges composing a minimum spanning tree of G
VT ← {v0} //the set of tree vertices can be initialized with any vertex
ET ← ∅
for i ← 1 to |V| − 1 do
    find a minimum-weight edge e∗ = (v∗, u∗) among all the edges (v, u)
        such that v is in VT and u is in V − VT
    VT ← VT ∪ {u∗}
    ET ← ET ∪ {e∗}
return ET
If a graph is represented by its adjacency lists and the priority queue is implemented as a
min-heap, the running time of the algorithm is O(|E| log |V |) in a connected graph, where |V| − 1≤
|E|.
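A minimal Python sketch of Prim's algorithm with a min-heap follows. It is an illustration under stated assumptions: the six-vertex graph is a hypothetical example (not necessarily the graph of the figures), and instead of decreasing priorities in place the heap simply keeps candidate edges and skips those whose far endpoint is already in the tree.

```python
import heapq


def prim(graph, start):
    """graph: dict vertex -> list of (weight, neighbour). Returns the list of tree edges."""
    in_tree = {start}
    tree_edges = []
    fringe = [(w, start, u) for w, u in graph[start]]  # candidate edges (weight, v, u)
    heapq.heapify(fringe)
    while fringe and len(in_tree) < len(graph):
        w, v, u = heapq.heappop(fringe)  # minimum-weight candidate edge
        if u in in_tree:
            continue  # both endpoints already in the tree: skip
        in_tree.add(u)
        tree_edges.append((v, u, w))
        for w2, x in graph[u]:
            if x not in in_tree:
                heapq.heappush(fringe, (w2, u, x))
    return tree_edges


# Hypothetical weighted undirected graph, given as (u, v, weight) triples.
edges = [("a", "b", 3), ("a", "e", 6), ("a", "f", 5), ("b", "c", 1), ("b", "f", 4),
         ("c", "d", 6), ("c", "f", 4), ("d", "e", 8), ("d", "f", 5), ("e", "f", 2)]
graph = {}
for u, v, w in edges:
    graph.setdefault(u, []).append((w, v))
    graph.setdefault(v, []).append((w, u))

mst = prim(graph, "a")
mst_weight = sum(w for _, _, w in mst)
print(mst, mst_weight)
```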
FIGURE 3.14 Application of Prim’s algorithm. The parenthesized labels of a vertex in the middle
column indicate the nearest tree vertex and edge weight; selected vertices and edges are in bold.
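ALGORITHM Kruskal(G)
//Kruskal’s algorithm for constructing a minimum spanning tree
//Input: A weighted connected graph G = (V, E)
//Output: ET, the set of edges composing a minimum spanning tree of G
sort E in nondecreasing order of the edge weights
ET ← ∅; ecounter ← 0 //initialize the set of tree edges and its size
k ← 0 //initialize the number of processed edges
while ecounter < |V| − 1 do
    k ← k + 1
    if ET ∪ {ek} is acyclic
        ET ← ET ∪ {ek}; ecounter ← ecounter + 1
return ET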
The initial forest consists of |V | trivial trees, each comprising a single vertex of the graph.
The final forest consists of a single tree, which is a minimum spanning tree of the graph. On each iteration,
the algorithm takes the next edge (u, v) from the sorted list of the graph’s edges, finds the
trees containing the vertices u and v, and, if these trees are not the same, unites them in a larger tree
by adding the edge (u, v).
Fortunately, there are efficient algorithms for maintaining such a forest, including the crucial check for
whether two vertices belong to the same tree. They are called union-find algorithms. With an
efficient union-find algorithm, the running time of Kruskal’s algorithm will be O(|E| log |E|).
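The forest-merging process described above can be sketched in Python as follows. This is an illustrative sketch only: the union-find structure is a simple parent dictionary with path halving (one of several possible variants), and the edge list is a hypothetical example graph.

```python
def kruskal(vertices, edges):
    """edges: list of (weight, u, v). Returns the list of tree edges (u, v, weight)."""
    parent = {v: v for v in vertices}  # every vertex starts as its own trivial tree

    def find(v):
        # Follow parent links to the root of v's tree, halving the path on the way.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for w, u, v in sorted(edges):  # edges in nondecreasing order of weight
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # different trees: the edge creates no cycle
            parent[root_u] = root_v  # unite the two trees
            tree.append((u, v, w))
            if len(tree) == len(vertices) - 1:
                break
    return tree


# Hypothetical weighted undirected graph, given as (weight, u, v) triples.
edge_list = [(3, "a", "b"), (6, "a", "e"), (5, "a", "f"), (1, "b", "c"), (4, "b", "f"),
             (6, "c", "d"), (4, "c", "f"), (8, "d", "e"), (5, "d", "f"), (2, "e", "f")]
tree = kruskal("abcdef", edge_list)
tree_weight = sum(w for _, _, w in tree)
print(tree, tree_weight)
```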
FIGURE 3.15 Application of Kruskal’s algorithm. Selected edges are shown in bold.
ALGORITHM Dijkstra(G, s)
//Dijkstra’s algorithm for single-source shortest paths
//Input: A weighted connected graph G = (V, E) with nonnegative weights and its vertex s
//Output: The length dv of a shortest path from s to v and its penultimate vertex pv for every
//        vertex v in V
Initialize(Q) //initialize priority queue to empty
for every vertex v in V
    dv ← ∞; pv ← null
    Insert(Q, v, dv) //initialize vertex priority in the priority queue
ds ← 0; Decrease(Q, s, ds) //update priority of s with ds
VT ← ∅
for i ← 0 to |V| − 1 do
    u∗ ← DeleteMin(Q) //delete the minimum priority element
    VT ← VT ∪ {u∗}
    for every vertex u in V − VT that is adjacent to u∗ do
        if du∗ + w(u∗, u) < du
            du ← du∗ + w(u∗, u); pu ← u∗
            Decrease(Q, u, du)
The time efficiency of Dijkstra’s algorithm depends on the data structures used for
implementing the priority queue and for representing an input graph itself. It is in Θ(|V|^2) for
graphs represented by their weight matrix and the priority queue implemented as an unordered
array. For graphs represented by their adjacency lists and the priority queue implemented as a
min-heap, it is in O(|E| log |V|).
FIGURE 3.16 Application of Dijkstra’s algorithm. The next closest vertex is shown in bold
The shortest paths (identified by following nonnumeric labels backward from a destination
vertex in the left column to the source) and their lengths (given by numeric labels of the tree
vertices) are as follows:
From a to b : a − b of length 3
From a to d : a − b − d of length 5
From a to c : a − b − c of length 7
From a to e : a − b − d − e of length 9
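The following Python sketch shows one way to obtain such distances and paths with a min-heap. Two assumptions are made loudly here: the edge weights below are a hypothetical graph chosen to be consistent with the path lengths listed above (the figure itself is not reproduced in these notes), and the heap uses "lazy deletion" of outdated entries instead of the Decrease operation of the pseudocode.

```python
import heapq


def dijkstra(graph, source):
    """graph: dict vertex -> list of (neighbour, weight). Returns (dist, prev)."""
    dist = {v: float("inf") for v in graph}
    prev = {v: None for v in graph}  # penultimate vertex on a shortest path
    dist[source] = 0
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # outdated heap entry (lazy deletion)
        done.add(u)
        for v, w in graph[u]:
            if d + w < dist[v]:
                dist[v], prev[v] = d + w, u
                heapq.heappush(heap, (dist[v], v))
    return dist, prev


def path_to(prev, target):
    """Rebuild the path by following penultimate-vertex labels backward."""
    path = []
    while target is not None:
        path.append(target)
        target = prev[target]
    return path[::-1]


# Hypothetical undirected graph consistent with the lengths listed above.
undirected = [("a", "b", 3), ("a", "d", 7), ("b", "c", 4), ("b", "d", 2),
              ("c", "d", 5), ("c", "e", 6), ("d", "e", 4)]
graph = {v: [] for v in "abcde"}
for u, v, w in undirected:
    graph[u].append((v, w))
    graph[v].append((u, w))

dist, prev = dijkstra(graph, "a")
print(dist, path_to(prev, "e"))
```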
Huffman’s algorithm
Step 1 Initialize n one-node trees and label them with the symbols of the alphabet given.
Record the frequency of each symbol in its tree’s root to indicate the tree’s weight.
(More generally, the weight of a tree will be equal to the sum of the frequencies in
the tree’s leaves.)
Step 2 Repeat the following operation until a single tree is obtained. Find two trees with
the smallest weight (ties can be broken arbitrarily). Make them the left and right
subtree of a new tree and record the sum of their weights in the root of the new tree
as its weight.
A tree constructed by the above algorithm is called a Huffman tree. The code it defines
(label each left edge 0 and each right edge 1, and read a symbol’s codeword along the path
from the root to the symbol’s leaf) is called a Huffman code.
EXAMPLE Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence
frequencies in a text made up of these symbols:
symbol A B C D _
frequency 0.35 0.1 0.2 0.2 0.15
The Huffman tree construction for this input is shown in Figure 3.18
symbol A B C D _
frequency 0.35 0.1 0.2 0.2 0.15
codeword 11 100 00 01 101
With the occurrence frequencies given and the codeword lengths obtained, the average number of bits per
symbol in this code is 2 · 0.35 + 3 · 0.1 + 2 · 0.2 + 2 · 0.2 + 3 · 0.15 = 2.25.
Had we used a fixed-length encoding for the same alphabet, we would have to use at least 3 bits per symbol.
Thus, for this toy example, Huffman’s code achieves the compression ratio (a standard measure of a
compression algorithm’s effectiveness) of (3 − 2.25) / 3 · 100% = 25%. In other words, Huffman’s encoding
of the text will use 25% less memory than its fixed-length encoding.
Running time is O(n log n), as each priority queue operation takes time O( log n).
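Huffman's algorithm can be sketched in Python with a min-heap of trees. This is an illustrative sketch, not the notes' implementation: trees are represented as nested tuples, and a running counter is an assumed tie-breaker so that the heap never has to compare two trees. Because ties and the 0/1 labelling of edges are arbitrary, the codewords produced may differ from those in the table, but the codeword lengths and the average of 2.25 bits per symbol are the same.

```python
import heapq
from itertools import count


def huffman_codes(freq):
    """freq: dict symbol -> frequency. Returns dict symbol -> codeword (string of 0/1)."""
    tie = count()  # tie-breaker so heap entries never compare trees directly
    heap = [(w, next(tie), sym) for sym, w in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # two trees of smallest weight
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next(tie), (left, right)))

    codes = {}

    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left subtree, right subtree)
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:  # leaf: a symbol
            codes[node] = prefix or "0"

    walk(heap[0][2], "")
    return codes


freq = {"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}
codes = huffman_codes(freq)
average_bits = sum(freq[s] * len(codes[s]) for s in freq)
print(codes, average_bits)
```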
Example
maximize 3x + 5y
subject to x + y ≤ 4
x + 3y ≤ 6
x ≥ 0, y ≥ 0
Feasible region is the set of points defined by the constraints
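[Figure: the feasible region in the x-y plane, bounded by the lines x + y = 4 and x + 3y = 6 and the
coordinate axes; its extreme points are (0, 0), (4, 0), (3, 1), and (0, 2).]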
Geometric solution
[Figure: the same feasible region with the level lines 3x + 5y = 10, 3x + 5y = 14, and 3x + 5y = 20;
the line 3x + 5y = 14 passes through the extreme point (3, 1).]
Optimal solution: x = 3, y = 1
Extreme Point Theorem Any LP problem with a nonempty bounded feasible region has an
optimal solution; moreover, an optimal solution can always be found at an extreme point of the
problem's feasible region.
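For the small example above, the theorem suggests a direct check: evaluate the objective function at each extreme point of the feasible region and take the best. The Python sketch below does exactly that; the list of extreme points is taken from the example, while the helper names `objective` and `feasible` are assumptions made for illustration.

```python
# Extreme points of the feasible region of the example problem.
extreme_points = [(0, 0), (4, 0), (3, 1), (0, 2)]


def objective(x, y):
    return 3 * x + 5 * y


def feasible(x, y):
    return x + y <= 4 and x + 3 * y <= 6 and x >= 0 and y >= 0


# Evaluate the objective at every (feasible) extreme point and keep the largest value.
best_point = max((p for p in extreme_points if feasible(*p)), key=lambda p: objective(*p))
best_value = objective(*best_point)
print(best_point, best_value)
```

This brute-force check is practical only for tiny problems, since the number of extreme points can grow exponentially with the problem size; the simplex method examines them selectively.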
Example
maximize   3x + 5y                 maximize   3x + 5y + 0u + 0v
subject to x + y ≤ 4               subject to x + y + u = 4
           x + 3y ≤ 6                         x + 3y + v = 6
           x ≥ 0, y ≥ 0                       x ≥ 0, y ≥ 0, u ≥ 0, v ≥ 0
Variables u and v, which transform the inequality constraints into equality constraints, are called slack
variables.
Simplex Tableau
maximize z = 3x + 5y + 0u + 0v
subject to x+ y+ u =4
x + 3y + v =6
x≥0, y≥0, u≥0, v≥0
       x    y    u    v
u      1    1    1    0  |  4
v      1    3    0    1  |  6
      −3   −5    0    0  |  0    (objective row)
basic variables = u,v
basic feasible solution = (0, 0, 4, 6)
value of z at (0, 0, 4, 6) =0
Step 3 [Find departing (leaving) variable] For each positive entry in the pivot column, calculate the
θ-ratio by dividing that row's entry in the rightmost column (solution) by its entry in the
pivot column. (If there are no positive entries in the pivot column then stop: the problem is
unbounded.) Find the row with the smallest θ-ratio, mark this row to indicate the departing
variable and the pivot row.
Step 4 [Form the next tableau] Divide all the entries in the pivot row by its entry in the pivot
column. Subtract from each of the other rows, including the objective row, the new pivot
row multiplied by the entry in the pivot column of the row in question. Replace the label of
the pivot row by the variable's name of the pivot column and go back to Step 1.
Initial tableau:
       x    y    u     v
u      1    1    1     0   |  4
v      1    3    0     1   |  6
      −3   −5    0     0   |  0
Final tableau:
       x    y    u     v
x      1    0   3/2  −1/2  |  3
y      0    1  −1/2   1/2  |  1
       0    0    2     1   | 14
basic feasible solution (3, 1, 0, 0), z = 14
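The tableau steps can be sketched in Python with exact fractions. This is a simplified illustration under assumptions stated here: it handles only problems of the form "maximize c·x subject to Ax ≤ b, x ≥ 0" with b ≥ 0, it picks the most negative objective-row entry as the pivot column, it does not guard against cycling, and the function name `simplex_max` is hypothetical.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0 (b >= 0). Returns (x, z)."""
    m, n = len(A), len(c)
    # Constraint rows [A | I | b]; objective row [-c | 0 | 0].
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(i == k)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    T.append([Fraction(-v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]  # slack variables are basic initially
    while True:
        # Entering variable: most negative entry of the objective row.
        col = min(range(n + m), key=lambda j: T[-1][j])
        if T[-1][col] >= 0:
            break  # no negative entries: the current solution is optimal
        # Departing variable: smallest theta-ratio over positive pivot-column entries.
        rows = [i for i in range(m) if T[i][col] > 0]
        if not rows:
            raise ValueError("the problem is unbounded")
        row = min(rows, key=lambda i: T[i][-1] / T[i][col])
        # Form the next tableau.
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                factor = T[i][col]
                T[i] = [a - factor * p for a, p in zip(T[i], T[row])]
        basis[row] = col
    x = [Fraction(0)] * n
    for i, j in enumerate(basis):
        if j < n:
            x[j] = T[i][-1]
    return x, T[-1][-1]


x_opt, z_opt = simplex_max([3, 5], [[1, 1], [1, 3]], [4, 6])
print(x_opt, z_opt)
```

On the example above the sketch passes through the same extreme points as the tableaus and ends at (3, 1) with z = 14.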
Example 1:
Use the simplex method to solve the farmer’s problem given below.
A farmer has a 320-acre farm on which she plants two crops: corn and soybeans. For each
acre of corn planted, her expenses are $50, and for each acre of soybeans planted, her expenses are
$100. Each acre of corn requires 100 bushels of storage and yields a profit of $60; each acre of
soybeans requires 40 bushels of storage and yields a profit of $90. If the total amount of storage
space available is 19,200 bushels and the farmer has only $20,000 on hand, how many acres of
each crop should she plant in order to maximize her profit? What will her profit be if she follows
this strategy?
Solution
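Linear Programming Problem Formulation
                    Corn    Soybean    Total
Expenses            $50     $100       $20,000
Storage (bushels)   100     40         19,200
Profit              60      90         Maximize profit
Let x be the number of acres of corn and y the number of acres of soybeans. Then the problem is
maximize z = 60x + 90y
subject to 50x + 100y ≤ 20,000
           100x + 40y ≤ 19,200
           x ≥ 0, y ≥ 0
The 320-acre size of the farm gives the additional constraint x + y ≤ 320, but this constraint is redundant:
every point satisfying the expense and storage constraints has x + y ≤ 270.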
Pivot row:
Replace the leaving variable in the Basic column with the entering variable.
New Pivot Row = Current Pivot Row / Pivot Element
All other rows, including z:
New Row = Current Row – (Its Pivot column coefficient) * New Pivot Row
Row y
New Pivot Row = Current Pivot Row / Pivot Element
= (0, 50, 100, 1, 0, 20000) / 100
= (0, 1/2, 1, 1/100, 0, 200)
Row s2
New Row = Current Row – (Its Pivot column coefficient) * New Pivot Row
= (0, 100, 40, 0, 1, 19200) – (40) * (0, 1/2, 1, 1/100, 0, 200)
= (0, 80, 0, –4/10, 1, 11200)
Iteration II
Basic    z     x     y     s1       s2     Solution
y        0    1/2    1    1/100     0      200       (NPR, the new pivot row)
s2       0    80     0    –4/10     1      11200
z        1    –15    0    9/10      0      18000
Row x
New Pivot Row = Current Pivot Row / Pivot Element
= (0, 80, 0, –4/10, 1, 11200) / 80
= (0, 1, 0, –1/200, 1/80, 140)
Row y
New Row = Current Row – (Its Pivot column coefficient) * New Pivot Row
= (0, 1/2, 1, 1/100, 0, 200) – (1/2) * (0, 1, 0, –1/200, 1/80, 140)
= (0, 0, 1, 1/80, –1/160, 130)
Row z
New Row = Current Row – (Its Pivot column coefficient) * New Pivot Row
= (1, –15, 0, 9/10, 0, 18000) – (–15) * (0, 1, 0, –1/200, 1/80, 140)
= (1, 0, 0, 33/40, 15/80, 20100)
Iteration III
Basic    z     x     y     s1       s2      Solution
y        0     0     1    1/80     –1/160   130
x        0     1     0    –1/200   1/80     140
z        1     0     0    33/40    15/80    20100
All the entries in the z row are now nonnegative, so this solution is optimal: the farmer should plant
x = 140 acres of corn and y = 130 acres of soybeans, for a maximum profit of z = $20,100.
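Primal
Maximize $z = \sum_{j=1}^{n} c_j x_j$
subject to: $\sum_{j=1}^{n} a_{ij} x_j \le b_i$, $i = 1, 2, \ldots, m$,
$x_j \ge 0$, $j = 1, 2, \ldots, n$.
Dual
Minimize $z' = \sum_{i=1}^{m} b_i y_i$
subject to: $\sum_{i=1}^{m} a_{ij} y_i \ge c_j$, $j = 1, 2, \ldots, n$,
$y_i \ge 0$, $i = 1, 2, \ldots, m$.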
Definition of a Flow
A flow is an assignment of real numbers xij to edges (i, j) of a given network that satisfies the
following:
• flow-conservation requirements: the total amount of material entering an intermediate vertex must
be equal to the total amount of the material leaving the vertex;
• capacity constraints: 0 ≤ xij ≤ uij for every edge (i, j) ∈ E.
Augmenting path: 1 → 2 → 3 → 6
(Edge labels in the figures have the form xij/uij.)
Augmenting path: 1 → 4 → 3 ← 2 → 5 → 6
Example 1 (maximum flow)
Example 2
Augmenting alternately along the paths 1 → 2 → 4 → 3 and 1 → 4 ← 2 → 3 increases the flow value by 1
each time: V = 1, V = 2, …, V = 2U.
Requires 2U iterations to reach the maximum flow of value 2U.
Shortest-Augmenting-Path Algorithm
Generate augmenting path with the least number of edges by BFS as follows.
Starting at the source, perform BFS traversal by marking new (unlabeled) vertices with two labels:
• first label – indicates the amount of additional flow that can be brought from the
source to the vertex being labeled
• second label – indicates the vertex from which the vertex being labeled was reached,
with “+” or “–” added to the second label to indicate whether the vertex was
reached via a forward or backward edge
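Vertex labeling
The source is always labeled with ∞, −.
All other vertices are labeled as follows:
o If unlabeled vertex j is connected to the front vertex i of the traversal queue by a
directed edge from i to j with positive unused capacity rij = uij − xij (forward edge),
vertex j is labeled with lj, i+, where lj = min{li, rij}.
o If unlabeled vertex j is connected to the front vertex i of the traversal queue by a
directed edge from j to i with positive flow xji (backward edge),
vertex j is labeled with lj, i−, where lj = min{li, xji}.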
Queue: 1 2 4 3 5 6
Augment the flow by 2 (the sink’s first label) along the path 1 → 2 → 3 → 6.
Queue: 1 4 3 2 5 6
Augment the flow by 1 (the sink’s first label) along the path 1 → 4 → 3 ← 2 → 5 → 6.
Queue: 1 4
No augmenting path exists (the sink cannot be labeled), so the current flow is maximum.
ALGORITHM ShortestAugmentingPath(G)
//Implements the shortest-augmenting-path algorithm
//Input: A network with single source 1, single sink n, and positive integer capacities uij on
//       its edges (i, j)
//Output: A maximum flow x
assign xij = 0 to every edge (i, j) in the network
label the source with ∞, − and add the source to the empty queue Q
while not Empty(Q) do
    i ← Front(Q); Dequeue(Q)
    for every edge from i to j do //forward edges
        if j is unlabeled
            rij ← uij − xij
            if rij > 0
                lj ← min{li, rij}; label j with lj, i+
                Enqueue(Q, j)
    for every edge from j to i do //backward edges
        if j is unlabeled
            if xji > 0
                lj ← min{li, xji}; label j with lj, i−
                Enqueue(Q, j)
    if the sink has been labeled
        //augment along the augmenting path found
        j ← n //start at the sink and move backwards using second labels
        while j ≠ 1 //the source hasn’t been reached
            if the second label of vertex j is i+
                xij ← xij + ln
            else //the second label of vertex j is i−
                xji ← xji − ln
            j ← i; i ← the vertex indicated by i’s second label
        erase all vertex labels except the ones of the source
        reinitialize Q with the source
return x //the current flow is maximum
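A compact Python sketch of the same idea is given below. It deviates from the pseudocode in one labeled way: instead of storing flows and two-part vertex labels, it keeps residual capacities (a backward edge appears as a reverse residual edge) and a parent map built by BFS. The six-vertex network and its capacities are assumptions chosen to be consistent with the two augmentations (by 2 and by 1) traced above; the second network is the 2U example with U = 1000.

```python
from collections import deque


def max_flow(n, edges, source, sink):
    """edges: list of (i, j, capacity) with vertices numbered 1..n. Returns the flow value."""
    # residual[i][j] = remaining capacity of edge i -> j in the residual network.
    residual = [dict() for _ in range(n + 1)]
    for i, j, u in edges:
        residual[i][j] = residual[i].get(j, 0) + u
        residual[j].setdefault(i, 0)  # reverse edge plays the role of a backward edge
    total = 0
    while True:
        # BFS from the source finds an augmenting path with the fewest edges.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            i = queue.popleft()
            for j, r in residual[i].items():
                if r > 0 and j not in parent:
                    parent[j] = i
                    queue.append(j)
        if sink not in parent:
            return total  # no augmenting path: the current flow is maximum
        # Bottleneck of the path, then augment along it.
        delta, j = float("inf"), sink
        while parent[j] is not None:
            delta = min(delta, residual[parent[j]][j])
            j = parent[j]
        j = sink
        while parent[j] is not None:
            i = parent[j]
            residual[i][j] -= delta
            residual[j][i] += delta
            j = i
        total += delta


# Assumed capacities for the six-vertex example network.
network = [(1, 2, 2), (1, 4, 3), (2, 3, 5), (2, 5, 3), (3, 6, 2), (4, 3, 1), (5, 6, 4)]
flow_value = max_flow(6, network, 1, 6)

# The network of Example 2 with U = 1000: BFS needs only two augmentations here.
U = 1000
bad_case = [(1, 2, U), (1, 4, U), (2, 4, 1), (2, 3, U), (4, 3, U)]
bad_case_value = max_flow(4, bad_case, 1, 3)
print(flow_value, bad_case_value)
```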
Time Efficiency
• The number of augmenting paths needed by the shortest-augmenting-path algorithm
never exceeds nm/2, where n and m are the number of vertices and edges, respectively.
• Since the time required to find a shortest augmenting path by breadth-first search is in
O(n + m) = O(m) for networks represented by their adjacency lists, the time efficiency of
the shortest-augmenting-path algorithm is in O(nm^2) for this representation.
• More efficient algorithms have been found that can run in close to O(nm) time, but
these algorithms don’t fall into the iterative-improvement paradigm.
A graph is bipartite if and only if it does not have a cycle of an odd length.
A bipartite graph is 2-colorable: the vertices can be colored in two colors so that every edge
has its vertices colored differently
Matching in a Graph
A matching in a graph is a subset of its edges with the property that no two edges share a
vertex
A maximum (or maximum cardinality) matching is a matching with the largest number of edges
• always exists
• not always unique
Augmentation along 3, 8, 4, 9, 5, 10
Matching on the right is maximum (perfect matching).
Theorem: A matching M is maximum if and only if there exists no augmenting path with
respect to M.
Augmenting Path Method (template)
• Start with some initial matching (e.g., the empty set).
• Find an augmenting path and augment the current matching along that path (e.g., using a
breadth-first-search-like method).
• When no augmenting path can be found, terminate and return the last matching, which is
maximum.
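The template above can be sketched in Python for bipartite graphs. Note the substitution: this sketch searches for augmenting paths depth-first (a common textbook variant) rather than with the breadth-first-like method mentioned in the template, and the small graph is a hypothetical example.

```python
def max_bipartite_matching(adj):
    """adj: dict left vertex -> list of right vertices. Returns dict right -> matched left."""
    match = {}  # current matching, stored from the right side

    def try_augment(u, seen):
        # Look for an augmenting path that starts at the free left vertex u.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            # v is free, or the left vertex matched to v can be re-matched elsewhere.
            if v not in match or try_augment(match[v], seen):
                match[v] = u
                return True
        return False

    for u in adj:  # start from the empty matching and augment once per left vertex
        try_augment(u, set())
    return match


# Hypothetical bipartite graph: left vertices 1..3, right vertices a..c.
adjacency = {1: ["a", "b"], 2: ["a"], 3: ["b", "c"]}
matching = max_bipartite_matching(adjacency)
print(matching)
```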
ranking matrix
         Ann    Lea    Sue
Bob      2,3    1,2    3,3
Jim      3,1    1,3    2,1
Tom      3,2    2,1    1,2
Data for an instance of the stable marriage problem. (a) Men’s preference lists; (b) women’s preference lists. (c)
Ranking matrix (with the boxed cells composing an unstable matching).
Example
Free men: Bob, Jim, Tom
The difficulty in determining a lower bound lies in deciding which part of an input must be
processed by any algorithm solving the problem. For example, searching for an element of a given
value in a sorted array does not require processing all its elements.
Information-Theoretic Arguments
The information-theoretic approach seeks to establish a lower bound based on the
amount of information an algorithm has to produce.
Consider as an example the “game of guessing a number”, the well-known game of deducing
a positive integer between 1 and n selected by somebody by asking that person questions with
yes/no answers. The amount of uncertainty that any algorithm solving this problem has to resolve
can be measured by ⌈log2 n⌉, the number of bits needed to specify a particular number among the
n possibilities. Each answer to a question yields at most one bit of information. For example:
1. Is the first bit zero? → No→ first bit is 1
2. Is the second bit zero? → Yes→ second bit is 0
3. Is the third bit zero? → Yes→ third bit is 0
4. Is the fourth bit zero? → Yes → fourth bit is 0
The number in binary is 1000, i.e., 8 in decimal.
The above approach is called the information-theoretic argument because of its
connection to information theory. It is useful for finding information-theoretic lower
bounds for many problems involving comparisons, including sorting and searching.
Its underlying idea can be realized through the mechanism of decision trees.
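The bound ⌈log2 n⌉ for the guessing game can be checked with a short Python sketch. It is only an illustration: the questions asked here are of the assumed form "is the number greater than mid?", and `(n - 1).bit_length()` is used as an exact integer way of computing ⌈log2 n⌉.

```python
def guess(n, secret):
    """Deduce secret in 1..n with yes/no questions; returns (number found, questions asked)."""
    lo, hi, asked = 1, n, 0
    while lo < hi:
        mid = (lo + hi) // 2
        asked += 1  # question: "is the number greater than mid?"
        if secret > mid:
            lo = mid + 1
        else:
            hi = mid
    return lo, asked


def worst_case_questions(n):
    """Largest number of questions over all possible secrets."""
    return max(guess(n, s)[1] for s in range(1, n + 1))


print(worst_case_questions(16), worst_case_questions(1000))
```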
Adversary Arguments
An adversary argument is a method of proving a lower bound by playing the role of an adversary
(opponent) who makes the algorithm work as hard as possible by adjusting the input consistently
with the answers already given.
Consider again the game of guessing a number between 1 and n by asking a person
(the adversary) questions with yes/no answers. The adversary answers each question so that at least
one-half of the numbers still consistent with all the answers remain. If an algorithm stops before the
size of this set is reduced to 1, the adversary can exhibit a number that the algorithm failed to identify.
Any algorithm needs ⌈log2 n⌉ iterations to shrink an n-element set to a one-element set by
halving and rounding up the size of the remaining set. Hence, at least ⌈log2 n⌉ questions need to be
asked by any algorithm in the worst case. This example illustrates the adversary method for
establishing lower bounds.
Consider the problem of merging two sorted lists of size n, a1 < a2 < . . . < an and
b1 < b2 < . . . < bn, into a single sorted list of size 2n. For simplicity, we assume that all the a’s and
b’s are distinct, which gives the problem a unique solution.
Merging is done by repeatedly comparing the first elements in the remaining lists and
outputting the smaller among them. The number of key comparisons in the worst
case for this algorithm is 2n − 1, and an adversary argument shows that 2n − 1 is also a lower
bound for any comparison-based merging algorithm.
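The worst-case count of 2n − 1 comparisons can be observed with a short Python sketch of the merging procedure. The function name `merge_count` is an assumption; the interleaved input below is one example of a worst case.

```python
def merge_count(a, b):
    """Merge two sorted lists; returns (merged list, number of key comparisons)."""
    merged, i, j, comparisons = [], 0, 0, 0
    while i < len(a) and j < len(b):
        comparisons += 1  # compare the first elements of the remaining lists
        if a[i] < b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])  # one list is exhausted: copy the rest without comparisons
    merged.extend(b[j:])
    return merged, comparisons


# Interleaved lists force a comparison before every output except the last one.
result, worst = merge_count([1, 3, 5, 7], [2, 4, 6, 8])
print(result, worst)
```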
Problem Reduction
Problem reduction is a method in which a difficult problem P is reduced to
another problem Q that can be solved by a known algorithm.
A similar reduction idea can be used for finding a lower bound. To show that problem P is
at least as hard as another problem Q with a known lower bound, we need to reduce Q to P (not P
to Q!). In other words, we should show that an arbitrary instance of problem Q can be transformed
to an instance of problem P, so any algorithm solving P would solve Q as well. Then a lower bound
for Q will be a lower bound for P. Table 5.1 lists several important problems that are often used for
this purpose.
TABLE 5.1 Problems often used for establishing lower bounds by problem reduction
Problem                              Lower bound    Tightness
sorting                              Ω(n log n)     yes
searching in a sorted array          Ω(log n)       yes
element uniqueness problem           Ω(n log n)     yes
multiplication of n-digit integers   Ω(n)           unknown
multiplication of n × n matrices     Ω(n^2)         unknown
Consider the Euclidean minimum spanning tree problem as an example of establishing a
lower bound by reduction:
Given n points in the Cartesian plane, construct a tree of minimum total length whose
vertices are the given points. As a problem with a known lower bound, we use the element
uniqueness problem.
We can transform any set x1, x2, . . . , xn of n real numbers into a set of n points in the
Cartesian plane by simply adding 0 as the points’ y coordinate: (x1, 0), (x2, 0), . . . , (xn, 0). Let T
be a minimum spanning tree found for this set of points. Since T must contain a shortest edge,
checking whether T contains a zero length edge will answer the question about uniqueness of the
given numbers. This reduction implies that Ω(n log n) is a lower bound for the Euclidean
minimum spanning tree problem, too.
Note: Limitations of algorithm can be studied by obtaining lower bound efficiency.
Consider a binary decision tree with height h and n leaves; then h ≥ ⌈log2 n⌉. Indeed, a
binary tree of height h with the largest number of leaves has all its leaves on the last level, so the
largest number of leaves in such a tree is 2^h. In other words, 2^h ≥ n, which puts a lower bound on
the heights of binary decision trees. Hence ⌈log2 n⌉ is a lower bound on the worst-case number
of comparisons made by any comparison-based algorithm for the problem; it is called the
information-theoretic lower bound.
FIGURE 5.2 Decision tree for the three-element selection sort.
A triple above a node indicates the state of the array being sorted. Note two redundant
comparisons b <a with a single possible outcome because of the results of some previously made
comparisons.
For the three-element insertion sort whose decision tree is given in Figure 5.3, the average number
of comparisons is (2 + 3 + 3 + 2 + 3 + 3)/6 = 2 2/3 ≈ 2.67. Under the standard assumption that all n!
outcomes of sorting are equally likely, the following lower bound on the average number of
comparisons Cavg made by any comparison-based algorithm in sorting an n-element list has been proved:
Cavg(n) ≥ log2 n!.
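Both numbers above can be reproduced with a small Python sketch. It is an illustration under assumptions: `sorting_lower_bound` computes ⌈log2 n!⌉ exactly with integer arithmetic, and `insertion_sort_comparisons` counts key comparisons for a straightforward insertion sort, which may differ in detail from the version behind Figure 5.3 but gives the same counts for three elements.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def sorting_lower_bound(n):
    """ceil(log2 n!), the information-theoretic worst-case bound for sorting n items."""
    return (factorial(n) - 1).bit_length()


def insertion_sort_comparisons(items):
    """Number of key comparisons made by insertion sort on the given sequence."""
    a, count = list(items), 0
    for i in range(1, len(a)):
        v, j = a[i], i - 1
        while j >= 0:
            count += 1  # one key comparison: a[j] versus v
            if a[j] <= v:
                break
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = v
    return count


counts = [insertion_sort_comparisons(p) for p in permutations([1, 2, 3])]
average = Fraction(sum(counts), len(counts))
print(sorted(counts), average, sorting_lower_bound(3), sorting_lower_bound(10))
```

For n = 3 the average 8/3 ≈ 2.67 indeed exceeds log2 3! ≈ 2.58.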
Decision tree is a convenient model of algorithms involving comparisons in which
internal nodes represent comparisons
leaves represent outcomes (or input cases)
FIGURE 5.4 Ternary decision tree for binary search in a four-element array.
FIGURE 5.5 Binary decision tree for binary search in a four-element array.
As a comparison of the decision trees in the figures above illustrates, the binary decision tree is
simply the ternary decision tree with all the middle subtrees eliminated. Applying the inequality
h ≥ ⌈log2 l⌉ (l being the number of leaves) to such binary decision trees, which have at least n + 1
leaves, immediately yields Cworst(n) ≥ ⌈log2(n + 1)⌉.
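Binary search attains this bound, which can be checked with a short Python sketch. Each loop iteration is counted as one three-way comparison, as in the ternary decision tree; the function names are assumptions, and `n.bit_length()` is used as an exact way of computing ⌈log2(n + 1)⌉ for n ≥ 1.

```python
def binary_search(a, key):
    """Returns (index of key in sorted list a or -1, number of three-way comparisons)."""
    lo, hi, comparisons = 0, len(a) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        comparisons += 1  # one three-way comparison of key with a[mid]
        if key == a[mid]:
            return mid, comparisons
        if key < a[mid]:
            hi = mid - 1
        else:
            lo = mid + 1
    return -1, comparisons


def worst_case(n):
    """Largest comparison count over successful and unsuccessful searches."""
    a = list(range(0, 2 * n, 2))  # even keys; odd keys fall into the gaps
    return max(binary_search(a, k)[1] for k in range(-1, 2 * n + 1))


print(worst_case(4), worst_case(1000))
```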
Definition: Class P is a class of decision problems that can be solved in polynomial time by
deterministic algorithms. This class of problems is called polynomial class.
Informally, the problems that can be solved in polynomial time form the set that computer
science theoreticians call P. A more formal definition includes in P only decision problems,
which are problems with yes/no answers.
In other words, P is the class of decision problems that are solvable in O(p(n)) time, where p(n)
is a polynomial of the problem’s input size n.
Examples:
Searching
Element uniqueness
Graph connectivity
Graph acyclicity
Primality testing (finally proved in 2002)
The restriction of P to decision problems can be justified by the following reasons.
First, it is sensible to exclude problems not solvable in polynomial time
because of their exponentially large output. e.g., generating subsets of a given set
or all the permutations of n distinct items.
Second, many important problems that are not decision problems in their
most natural formulation can be reduced to a series of decision problems that are
easier to study. For example, instead of asking about the minimum number of colors
needed to color the vertices of a graph so that no two adjacent vertices are colored
the same color, we can ask whether there exists such a coloring of the graph’s vertices
with no more than m colors for m = 1, 2, . . . . (The latter is called the m-coloring problem.)
Not every decision problem can be solved in polynomial time. Some
decision problems cannot be solved at all by any algorithm. Such problems are
called undecidable, as opposed to decidable problems, which can be solved by
an algorithm. The halting problem is a famous example of an undecidable problem.
Non polynomial-time algorithm: There are many important problems, however, for
which no polynomial-time algorithm has been found.
Hamiltonian circuit problem: Determine whether a given graph has a
Hamiltonian circuit—a path that starts and ends at the same vertex and passes
through all the other vertices exactly once.
Traveling salesman problem: Find the shortest tour through n cities with
known positive integer distances between them (find the shortest Hamiltonian
circuit in a complete graph with positive integer weights).
Definition: Class NP is the class of decision problems that can be solved by nondeterministic
polynomial algorithms. This class of problems is called nondeterministic polynomial.
Most decision problems are in NP. First of all, this class includes all the problems in P:
P ⊆ NP
This is true because, if a problem is in P, we can use the deterministic polynomial time
algorithm that solves it in the verification-stage of a nondeterministic algorithm that simply ignores
string S generated in its nondeterministic (“guessing”) stage. But NP also contains the Hamiltonian
circuit problem, the partition problem, decision versions of the traveling salesman, the knapsack,
graph coloring, and many hundreds of other difficult combinatorial optimization problems. The halting
problem, on the other hand, is among the rare examples of decision problems that are known not to
be in NP.
Note that P = NP would imply that each of many hundreds of difficult combinatorial
decision problems can be solved by a polynomial-time algorithm.
The fact that closely related decision problems are polynomially reducible to each other is not very
surprising. For example, the Hamiltonian circuit problem is polynomially
reducible to the decision version of the traveling salesman problem.
[Figure: the class of NP problems, each of which is polynomially reducible to an NP-complete problem.]
Proof:
We can map a graph G of a given instance of the Hamiltonian circuit problem to a complete
weighted graph G' representing an instance of the traveling salesman problem by assigning 1 as the
weight to each edge in G and adding an edge of weight 2 between any pair of nonadjacent vertices
in G. As the upper bound m on the Hamiltonian circuit length, we take m = n, where n is the
number of vertices in G (and G' ). Obviously, this transformation can be done in polynomial time.
Let G be a yes instance of the Hamiltonian circuit problem. Then G has a Hamiltonian
circuit, and its image in G' will have length n, making the image a yes instance of the decision
traveling salesman problem.
Conversely, if we have a Hamiltonian circuit of the length not larger than n in G', then its
length must be exactly n and hence the circuit must be made up of edges present in G, making the
inverse image of the yes instance of the decision traveling salesman problem be a yes instance of
the Hamiltonian circuit problem.
This completes the proof.
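The transformation used in the proof can be sketched in Python. This is an illustration for tiny graphs only: `hc_to_tsp` builds the weights of G' exactly as described (weight 1 for edges of G, weight 2 otherwise, bound m = n), while `tsp_decision` is an assumed brute-force checker that tries all tours and is exponential in n. Vertices are numbered from 0, and at least three vertices are assumed.

```python
from itertools import permutations


def hc_to_tsp(n, edges):
    """Map a Hamiltonian-circuit instance (n vertices, edge list) to (weight matrix, bound m)."""
    edge_set = {frozenset(e) for e in edges}
    w = [[0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            if u != v:
                # Weight 1 for an edge of G, weight 2 for a pair of nonadjacent vertices.
                w[u][v] = 1 if frozenset((u, v)) in edge_set else 2
    return w, n  # the bound on the tour length is m = n


def tsp_decision(w, m):
    """Brute force: is there a tour through all vertices of total length <= m?"""
    n = len(w)
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        if sum(w[a][b] for a, b in zip(tour, tour[1:])) <= m:
            return True
    return False


cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]  # has a Hamiltonian circuit
star = [(0, 1), (0, 2), (0, 3)]           # has none
print(tsp_decision(*hc_to_tsp(4, cycle)), tsp_decision(*hc_to_tsp(4, star)))
```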
Truth table of the expression (x1 ∨ ¬x2 ∨ ¬x3) & (¬x1 ∨ x2) & (¬x1 ∨ ¬x2 ∨ ¬x3):
x1   x2   x3   ¬x1   ¬x2   ¬x3   x1∨¬x2∨¬x3   ¬x1∨x2   ¬x1∨¬x2∨¬x3   expression
T    T    T    F     F     F     T            T        F             F
T    T    F    F     F     T     T            T        T             T
T    F    T    F     T     F     T            F        T             F
T    F    F    F     T     T     T            F        T             F
F    T    T    T     F     F     F            T        T             F
F    T    F    T     F     T     T            T        T             T
F    F    T    T     T     F     T            T        T             T
F    F    F    T     T     T     T            T        T             T
The CNF-satisfiability problem deals with boolean expressions. Each boolean expression can be represented in conjunctive normal form,
such as the following expression involving three boolean variables x1, x2, and x3 and their negations denoted ¬x1, ¬x2, and ¬x3, respectively:
(x1 ∨ ¬x2 ∨ ¬x3) & (¬x1 ∨ x2) & (¬x1 ∨ ¬x2 ∨ ¬x3)
The CNF-satisfiability problem asks whether or not one can assign values true and false to
variables of a given boolean expression in its CNF form to make the entire expression true. (It is
easy to see that this can be done for the above formula: if x1 = true, x2 = true, and x3 = false, the
entire expression is true.)
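For a formula with only three variables, satisfiability can be checked by trying all eight assignments. The Python sketch below does this for the expression (x1 ∨ ¬x2 ∨ ¬x3) & (¬x1 ∨ x2) & (¬x1 ∨ ¬x2 ∨ ¬x3), which is assumed here to be the formula under discussion because it agrees with the truth table and with the satisfying assignment mentioned above. The encoding of literals as signed integers is also an assumption made for illustration. Exhaustive search takes time exponential in the number of variables, so it says nothing about solving the problem in polynomial time.

```python
from itertools import product

# Each clause is a list of literals: k stands for x_k, and -k for "not x_k".
cnf = [[1, -2, -3], [-1, 2], [-1, -2, -3]]


def satisfies(assignment, clauses):
    """True if every clause contains at least one literal made true by the assignment."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)


def satisfying_assignments(clauses, num_vars):
    """Exhaustive search over all 2^num_vars truth assignments."""
    result = []
    for values in product([True, False], repeat=num_vars):
        assignment = dict(enumerate(values, start=1))  # variable index -> truth value
        if satisfies(assignment, clauses):
            result.append(values)
    return result


solutions = satisfying_assignments(cnf, 3)
print(solutions)
```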
Since the Cook-Levin discovery of the first known NP-complete problems, computer
scientists have found many hundreds, if not thousands, of other examples. In particular, the well-
known problems (or their decision versions) mentioned above—Hamiltonian circuit, traveling
salesman, partition, bin packing, and graph coloring—are all NP-complete. It is known, however,
that if P ≠ NP there must exist NP problems that neither are in P nor are NP-complete.
[Figure: proving NP-completeness by reducing a known NP-complete problem to the candidate for
NP-completeness.]
There are some problems that are difficult to solve algorithmically. At the same time, some of
them are so important that we must try to solve them by some other technique. Two algorithm design
techniques, backtracking and branch-and-bound, often make it possible to solve at least some large
instances of difficult combinatorial problems.
Both backtracking and branch-and-bound are based on the construction of a state-space tree
whose nodes reflect specific choices made for a solution’s components. Both techniques terminate
a node as soon as it can be guaranteed that no solution to the problem can be obtained by
considering choices that correspond to the node’s descendants.
We consider branch-and-bound algorithms for the assignment, knapsack, and traveling salesman
problems, and a few approximation algorithms for the traveling salesman and knapsack problems.
There are also three classic methods for approximate root finding: the bisection method, the
method of false position, and Newton’s method.
Branch-and-Bound
Assignment Problem
Knapsack Problem
Traveling Salesman Problem
Approximation Algorithms for NP-Hard Problems
Approximation Algorithms for the Traveling Salesman Problem
Approximation Algorithms for the Knapsack Problem
Algorithms for Solving Nonlinear Equations
Bisection Method
False Position Method
Newton’s Method
5.6 BACKTRACKING
Backtracking is a more intelligent variation of the exhaustive-search approach.
The principal idea is to construct solutions one component at a time and evaluate such
partially constructed candidates as follows.
If a partially constructed solution can be developed further without violating the
problem’s constraints, it is done by taking the first remaining legitimate option for the
next component.
If there is no legitimate option for the next component, no alternatives for any remaining
component need to be considered. In this case, the algorithm backtracks to replace the
last component of the partially constructed solution with its next option.
It is convenient to implement this kind of processing by constructing a tree of choices
being made, called the state-space tree.
Its root represents an initial state before the search for a solution begins.
The nodes of the first level in the tree represent the choices made for the first component
of a solution, the nodes of the second level represent the choices for the second
component, and so on.
A node in a state-space tree is said to be promising if it corresponds to a partially
constructed solution that may still lead to a complete solution; otherwise, it is called
nonpromising.
Leaves represent either nonpromising dead ends or complete solutions found by the
algorithm. In the majority of cases, a state-space tree for a backtracking algorithm is
constructed in the manner of depth-first search.
If the current node is promising, its child is generated by adding the first remaining
legitimate option for the next component of a solution, and the processing moves to this
child. If the current node turns out to be nonpromising, the algorithm backtracks to the
node’s parent to consider the next possible option for its last component; if there is
no such option, it backtracks one more level up the tree, and so on.
Finally, if the algorithm reaches a complete solution to the problem, it either stops (if
just one solution is required) or continues searching for other possible solutions.
Backtracking techniques are applied to solve the following problems
n-Queens Problem
Hamiltonian Circuit Problem
Subset-Sum Problem
Step 2: Place queen 1 in the first possible position of its row, which is in column 1 of row 1.
    1 2 3 4
1   Q . . .
2   . . . .
3   . . . .
4   . . . .
Step 3: Place queen 2, after trying unsuccessfully columns 1 and 2, in the first acceptable position
for it, which is square (2, 3), the square in row 2 and column 3.
    1 2 3 4
1   Q . . .
2   . . Q .
3   . . . .
4   . . . .
Step 4: This proves to be a dead end because there is no acceptable position for queen 3. So, the
algorithm backtracks and puts queen 2 in the next possible position at (2, 4).
    1 2 3 4
1   Q . . .
2   . . . Q
3   . . . .
4   . . . .
Step 5: Then queen 3 is placed at (3, 2), which proves to be another dead end.
    1 2 3 4
1   Q . . .
2   . . . Q
3   . Q . .
4   . . . .
Step 6: The algorithm then backtracks all the way to queen 1 and moves it to (1, 2).
    1 2 3 4
1   . Q . .
2   . . . .
3   . . . .
4   . . . .
Steps 7 to 9: Queen 2 then goes to (2, 4), queen 3 to (3, 1), and queen 4 to (4, 3). This is a
solution to the problem.
    1 2 3 4
1   . Q . .
2   . . . Q
3   Q . . .
4   . . Q .
FIGURE 5.10 State-space tree of solving the four-queens problem by backtracking. × denotes an
unsuccessful attempt to place a queen.
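The walkthrough above can be sketched in a few lines of Python. This is a minimal illustration rather than the notes' own algorithm: the function name solve_n_queens and the list-of-columns representation are assumptions made for this sketch. Rows are filled top to bottom, columns are tried left to right, and a queen is removed again whenever the rows below it cannot be completed.

```python
def solve_n_queens(n):
    """Return the first solution found by backtracking as a list of
    1-based column numbers (one per row), or None if none exists."""
    cols = []  # cols[r] is the column of the queen already placed in row r

    def is_safe(row, col):
        # A square is acceptable if no earlier queen shares its column
        # or one of its diagonals.
        for r, c in enumerate(cols):
            if c == col or abs(c - col) == abs(r - row):
                return False
        return True

    def place(row):
        if row == n:                    # every row has a queen: solution
            return True
        for col in range(1, n + 1):     # try columns left to right
            if is_safe(row, col):
                cols.append(col)
                if place(row + 1):
                    return True
                cols.pop()              # dead end below: backtrack
        return False                    # no acceptable column in this row

    return cols if place(0) else None


print(solve_n_queens(4))  # [2, 4, 1, 3]
```

For n = 4 the search visits the same dead ends as the steps above and ends with queens at (1, 2), (2, 4), (3, 1), and (4, 3).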
Let us consider the problem of finding a Hamiltonian circuit in the graph in Figure 5.13.
FIGURE 5.13 Graph.
Solution:
Assume that if a Hamiltonian circuit exists, it starts at vertex a. Accordingly, we make vertex a the root of the state-space tree as in Figure 5.14.
In a graph G, a Hamiltonian cycle begins at some vertex V1 ∈ G, and the vertices are visited
only once in the order V1, V2, . . . , Vn, Vn+1 (the Vi are distinct except for V1 and Vn+1, which are
equal).
The first component of our future solution, if it exists, is a first intermediate vertex of a
Hamiltonian circuit to be constructed. Using the alphabet order to break the three-way tie
among the vertices adjacent to a, we select vertex b. From b, the algorithm proceeds to c,
then to d, then to e, and finally to f, which proves to be a dead end.
So the algorithm backtracks from f to e, then to d, and then to c, which provides the first
alternative for the algorithm to pursue.
Going from c to e eventually proves useless, and the algorithm has to backtrack from e to c
and then to b. From there, it goes to the vertices f , e, c, and d, from which it can
legitimately return to a, yielding the Hamiltonian circuit a, b, f , e, c, d, a. If we wanted to
find another Hamiltonian circuit, we could continue this process by backtracking from the
leaf of the solution found.
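The search just described can be sketched as follows. The figure is not reproduced in these notes, so the adjacency sets in GRAPH are an assumption chosen to be consistent with the narrative (a has three neighbours, f is a dead end after a, b, c, d, e); the function name hamiltonian_circuit is likewise hypothetical.

```python
# Assumed adjacency sets, consistent with the walkthrough (figure not shown).
GRAPH = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "f"},
    "c": {"a", "b", "d", "e"},
    "d": {"a", "c", "e"},
    "e": {"c", "d", "f"},
    "f": {"b", "e"},
}


def hamiltonian_circuit(graph, start):
    """Return the first Hamiltonian circuit found by backtracking,
    as a list of vertices beginning and ending at start, or None."""
    path = [start]
    visited = {start}

    def extend():
        last = path[-1]
        if len(path) == len(graph):
            # All vertices used: a circuit exists only if we can return.
            return start in graph[last]
        for v in sorted(graph[last]):   # alphabetical tie-breaking
            if v not in visited:
                path.append(v)
                visited.add(v)
                if extend():
                    return True
                path.pop()              # dead end: backtrack
                visited.remove(v)
        return False

    return path + [start] if extend() else None


print(hamiltonian_circuit(GRAPH, "a"))
```

With the assumed graph, the search abandons a, b, c, d, e, f and the branch through c to e before returning the circuit a, b, f, e, c, d, a.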
It is convenient to sort the set’s elements in increasing order. So, we will assume that
a1< a2 < . . . < an.
FIGURE 5.15 Complete state-space tree of the backtracking algorithm applied to the instance
A = {3, 5, 6, 7} and d = 15 of the subset-sum problem. The number inside a node is the sum
of the elements already included in the subsets represented by the node. The inequality below a leaf
indicates the reason for its termination.
Example:
The state-space tree can be constructed as a binary tree like that in Figure 5.15 for the
instance A = {3, 5, 6, 7} and d = 15.
The root of the tree represents the starting point, with no decisions about the given elements
made as yet.
Its left and right children represent, respectively, inclusion and exclusion of a1 in a set being
sought. Similarly, going to the left from a node of the first level corresponds to inclusion of
a2 while going to the right corresponds to its exclusion, and so on.
Thus, a path from the root to a node on the ith level of the tree indicates which of the first i
numbers have been included in the subsets represented by that node.
We record the value of s, the sum of these numbers, in the node.
If s is equal to d, we have a solution to the problem. We can either report this result and stop or,
if all the solutions need to be found, continue by backtracking to the node’s parent.
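If s is not equal to d, we can terminate the node as nonpromising if either of the following
two inequalities holds:
$s + a_{i+1} > d$ (the sum s is too large),
$s + \sum_{j=i+1}^{n} a_j < d$ (the sum s is too small).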
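A minimal sketch of this search is given below, assuming positive integers and a hypothetical function name subset_sum. A node is cut off when adding the next element would overshoot d or when even all remaining elements together cannot reach d; the left (inclusion) branch is explored before the right (exclusion) branch.

```python
def subset_sum(a, d):
    """Return the first subset of a (positive numbers) that sums to d,
    found by backtracking, or None if there is no such subset."""
    a = sorted(a)                       # a1 < a2 < ... < an
    n = len(a)
    suffix = [0] * (n + 1)              # suffix[i] = a[i] + ... + a[n-1]
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + a[i]

    chosen = []

    def visit(i, s):
        # i elements have been decided; s is the sum of those included.
        if s == d:
            return True
        if i == n:
            return False
        if s + a[i] > d:                # the sum is too large
            return False
        if s + suffix[i] < d:           # the sum is too small
            return False
        chosen.append(a[i])             # left child: include a[i]
        if visit(i + 1, s + a[i]):
            return True
        chosen.pop()                    # backtrack
        return visit(i + 1, s)          # right child: exclude a[i]

    return list(chosen) if visit(0, 0) else None


print(subset_sum([3, 5, 6, 7], 15))  # [3, 5, 7]
```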
General Remarks
From a more general perspective, most backtracking algorithms fit the following description.
An output of a backtracking algorithm can be thought of as an n-tuple (x1, x2, . . . , xn) where each
coordinate xi is an element of some finite linearly ordered set Si. For example, for the n-queens
problem, each Si is the set of integers (column numbers) 1 through n.
Backtrack(X[1..i + 1])
If this information is available, we can compare a node’s bound value with the value of
the best solution seen so far. If the bound value is not better than the value of the best solution seen
so far—i.e., not smaller for a minimization problem and not larger for a maximization problem—
the node is nonpromising and can be terminated (some people say the branch is “pruned”). Indeed,
no solution obtained from it can yield a better solution than the one already available. This is the
principal idea of the branch-and-bound technique.
In general, we terminate a search path at the current node in a state-space tree of a branch-
and-bound algorithm for any one of the following three reasons:
1. The value of the node’s bound is not better than the value of the best solution seen so far.
2. The node represents no feasible solutions because the constraints of the problem are already
violated.
3. The subset of feasible solutions represented by the node consists of a single point (and
hence no further choices can be made)—in this case, we compare the value of the objective
function for this feasible solution with that of the best solution seen so far and update the
latter with the former if the new solution is better.
We have to find a lower bound on the cost of an optimal selection without actually solving
the problem. We can do this by several methods. For example, it is clear that the cost of any
solution, including an optimal one, cannot be smaller than the sum of the smallest elements in each
of the matrix’s rows. For the instance here, this sum is 2 + 3+ 1+ 4 = 10. It is important to stress
that this is not the cost of any legitimate selection (3 and 1 came from the same column of the
matrix); it is just a lower bound on the cost of any legitimate selection. We can and will apply the
same thinking to partially constructed solutions. For example, for any legitimate selection that
selects 9 from the first row, the lower bound will be 9 + 3 + 1+ 4 = 17.
It is sensible to consider a node with the best bound as most promising, although this does
not, of course, preclude the possibility that an optimal solution will ultimately belong to a different
branch of the state-space tree. This variation of the strategy is called the best-first branch-and-
bound.
The lower-bound value for the root, denoted lb, is 10. The nodes on the first level of the tree
correspond to selections of an element in the first row of the matrix, i.e., a job for person a as
shown in Figure 5.15.
FIGURE 5.15 Levels 0 and 1 of the state-space tree for the instance of the assignment problem
being solved with the best-first branch-and-bound algorithm. The number above a node shows the
order in which the node was generated. A node’s fields indicate the job number assigned to person
a and the lower bound value, lb, for this node.
So we have four live leaves (promising leaves are also called live), nodes 1 through 4,
that may contain an optimal solution. The most promising of them is node 2 because it has the
smallest lower-bound value. Following our best-first search strategy, we branch out from that node
first by considering the three different ways of selecting an element from the second row and not in
the second column, i.e., the three different jobs that can be assigned to person b (Figure 5.16).
FIGURE 5.16 Levels 0, 1, and 2 of the state-space tree for the instance of the assignment problem
being solved with the best-first branch-and-bound algorithm.
Of the six live leaves—nodes 1, 3, 4, 5, 6, and 7—that may contain an optimal solution, we again
choose the one with the smallest lower bound, node 5. First, we consider selecting the third column’s
element from c’s row (i.e., assigning person c to job 3); this leaves us with no choice but to select the
element from the fourth column of d’s row (assigning person d to job 4). This yields
leaf 8 (Figure 5.17), which corresponds to the feasible solution {a→2, b→1, c→3, d →4} with the
total cost of 13. Its sibling, node 9, corresponds to the feasible solution {a→2, b→1, c→4, d →3}
with the total cost of 25. Since its cost is larger than the cost of the solution represented by leaf 8,
node 9 is simply terminated. (Of course, if its cost were smaller than 13, we would have to replace
the information about the best solution seen so far with the data provided by this node.)
FIGURE 5.17 Complete state-space tree for the instance of the assignment problem solved with
the best-first branch-and-bound algorithm.
Now, as we inspect each of the live leaves of the last state-space tree—nodes 1, 3, 4, 6, and
7 in Figure 5.17—we discover that their lower-bound values are not smaller than 13, the value of
the best selection seen so far (leaf 8). Hence, we terminate all of them and recognize the solution
represented by leaf 8 as the optimal solution to the problem.
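The best-first strategy traced above can be sketched with a priority queue keyed on the lower bound. The cost matrix is not reproduced in these notes, so COST below is an assumption chosen to agree with every number quoted in the text (row minima 2, 3, 1, 4; solutions of cost 13 and 25); the function name is also hypothetical. The bound used is the cost of the jobs already assigned plus, for each remaining person, the cheapest job in a column not yet taken.

```python
import heapq

# Assumed cost matrix (rows: persons a..d, columns: jobs 1..4).
COST = [
    [9, 2, 7, 8],
    [6, 4, 3, 7],
    [5, 8, 1, 8],
    [7, 6, 9, 4],
]


def assignment_branch_and_bound(cost):
    """Return (optimal cost, assignment), where assignment[r] is the
    0-based job index given to person r."""
    n = len(cost)

    def lower_bound(assigned):
        used = set(assigned)
        total = sum(cost[r][c] for r, c in enumerate(assigned))
        for r in range(len(assigned), n):
            total += min(cost[r][c] for c in range(n) if c not in used)
        return total

    best_cost, best = float("inf"), None
    heap = [(lower_bound(()), ())]          # live nodes ordered by bound
    while heap:
        lb, assigned = heapq.heappop(heap)  # most promising live node
        if lb >= best_cost:
            break                           # no live node can do better
        if len(assigned) == n:              # complete selection: lb is exact
            best_cost, best = lb, assigned
            continue
        for c in range(n):
            if c not in assigned:
                child = assigned + (c,)
                child_lb = lower_bound(child)
                if child_lb < best_cost:
                    heapq.heappush(heap, (child_lb, child))
    return best_cost, best


print(assignment_branch_and_bound(COST))  # (13, (1, 0, 2, 3))
```

The result (1, 0, 2, 3) reads as a→2, b→1, c→3, d→4 in the 1-based job numbering of the text.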
FIGURE 5.18 State-space tree of the best-first branch-and-bound algorithm for the instance of the
knapsack problem.
At the root of the state-space tree (see Figure 5.18), no items have been selected as yet.
Hence, both the total weight of the items already selected w and their total value v are equal to 0.
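The value of the upper bound, computed as ub = v + (W − w)(v_{i+1}/w_{i+1}) where W is the knapsack's capacity and v_{i+1}/w_{i+1} is the best value-to-weight ratio among the remaining items, is $100. Node 1, the left child of the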
root, represents the subsets that include item 1. The total weight and value of the items already
included are 4 and $40, respectively; the value of the upper bound is 40 + (10 − 4) * 6 = $76. Node
2 represents the subsets that do not include item 1. Accordingly, w = 0, v = $0, and ub = 0 + (10 −
0) * 6 = $60. Since node 1 has a larger upper bound than the upper bound of node 2, it is more
promising for this maximization problem, and we branch from node 1 first. Its children—nodes 3
and 4—represent subsets with item 1 and with and without item 2, respectively.
Since the total weight w of every subset represented by node 3 exceeds the knapsack’s capacity,
node 3 can be terminated immediately. Node 4 has the same values of w and v as its parent; the
upper bound ub is equal to 40 + (10 − 4) * 5 = $70. Selecting node 4 over node 2 for the next
branching (why?), we get nodes 5 and 6 by respectively including and excluding item 3. The total
weights and values as well as the upper bounds for these nodes are computed in the same way as
for the preceding nodes. Branching from node 5 yields node 7, which represents no feasible
solutions, and node 8, which represents just a single subset {1, 3} of value $65. The remaining live
nodes 2 and 6 have smaller upper-bound values than the value of the solution represented by node
8. Hence, both can be terminated making the subset {1, 3} of node 8 the optimal solution to the
problem.
Solving the knapsack problem by a branch-and-bound algorithm has a rather unusual characteristic.
Typically, internal nodes of a state-space tree do not define a point of the problem’s search space, because
some of the solution’s components remain undefined. For the knapsack problem, however, every node
of the tree represents a subset of the items given, so we can update the information about the best
subset seen so far after generating each new node in the tree. If we had done this for
the instance investigated above, we could have terminated nodes 2 and 6 before node 8 was
generated because they both are inferior to the subset of value $65 of node 5.
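A best-first sketch of this computation follows. The item table is not reproduced in these notes; ITEMS is an assumption consistent with the bounds quoted above ($100, $76, $60, $70) and the optimal value of $65, and the function name is hypothetical. The sketch updates the best subset as soon as a feasible node is generated, which is the refinement described in the last paragraph.

```python
import heapq

# Assumed items as (weight, value); capacity 10 as in the text.
ITEMS = [(4, 40), (7, 42), (5, 25), (3, 12)]


def knapsack_branch_and_bound(items, capacity):
    """Return (best value, chosen items) for the 0/1 knapsack problem."""
    # Order items by value-to-weight ratio, best first.
    items = sorted(items, key=lambda wv: wv[1] / wv[0], reverse=True)
    n = len(items)

    def upper_bound(i, w, v):
        # ub = v + (W - w) * (best remaining value-to-weight ratio)
        if i < n:
            wi, vi = items[i]
            return v + (capacity - w) * vi / wi
        return v

    best_value, best_subset = 0, ()
    # Max-heap on the upper bound (negated for heapq's min-heap).
    heap = [(-upper_bound(0, 0, 0), 0, 0, 0, ())]
    while heap:
        neg_ub, i, w, v, chosen = heapq.heappop(heap)
        if -neg_ub <= best_value or i == n:
            continue                    # cannot beat the best seen so far
        wi, vi = items[i]
        if w + wi <= capacity:          # left child: include item i
            nw, nv, nc = w + wi, v + vi, chosen + (items[i],)
            if nv > best_value:         # every node is a feasible subset
                best_value, best_subset = nv, nc
            heapq.heappush(heap, (-upper_bound(i + 1, nw, nv), i + 1, nw, nv, nc))
        # right child: exclude item i
        heapq.heappush(heap, (-upper_bound(i + 1, w, v), i + 1, w, v, chosen))
    return best_value, list(best_subset)


print(knapsack_branch_and_bound(ITEMS, 10))  # (65, [(4, 40), (5, 25)])
```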
FIGURE 5.19 (a)Weighted graph. (b) State-space tree of the branch-and-bound algorithm to find a
shortest Hamiltonian circuit in this graph. The list of vertices in a node specifies a beginning part of
the Hamiltonian circuits represented by the node.
For example, for the instance in Figure 5.19a, the above formula yields
$lb = \lceil [(1 + 3) + (3 + 6) + (1 + 2) + (3 + 4) + (2 + 3)]/2 \rceil = 14.$
Moreover, for any subset of tours that must include particular edges of a given graph, we
can modify the lower bound accordingly. For example, for all the Hamiltonian circuits of the graph in
Figure 5.19a that must include edge (a, d), we get the following lower bound by summing up the lengths
of the two shortest edges incident with each of the vertices, with the required inclusion of edges (a,
d) and (d, a):
$\lceil [(1 + 5) + (3 + 6) + (1 + 2) + (3 + 5) + (2 + 3)]/2 \rceil = 16.$
We now apply the branch-and-bound algorithm, with the bounding function given by the
formula above, to find the shortest Hamiltonian circuit for the graph in Figure 5.19a. To reduce the
amount of potential work, we take advantage of two observations. First, without loss of generality,
we can consider only tours that start at a. Second, because our graph is undirected, we can generate
only tours in which b is visited before c. In addition, after visiting n − 1 = 4 cities, a tour has no
choice but to visit the remaining unvisited city and return to the starting one. The state-space tree
tracing the algorithm’s application is given in Figure 5.19b.
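The bounding function alone can be sketched as below. The weighted graph is not reproduced in these notes, so the edge weights in EDGES are an assumption chosen to give the bounds 14 and 16 quoted above; build_dist and tsp_lower_bound are hypothetical helper names. For each vertex the sketch adds the lengths of its two shortest incident edges, using any required edge first, and then halves the total and rounds up.

```python
import math

# Assumed edge weights for a 5-city instance (figure not shown).
EDGES = {
    ("a", "b"): 3, ("a", "c"): 1, ("a", "d"): 5, ("a", "e"): 8,
    ("b", "c"): 6, ("b", "d"): 7, ("b", "e"): 9,
    ("c", "d"): 4, ("c", "e"): 2, ("d", "e"): 3,
}


def build_dist(edges):
    """Turn an undirected edge dict into a symmetric dict of dicts."""
    dist = {}
    for (u, v), w in edges.items():
        dist.setdefault(u, {})[v] = w
        dist.setdefault(v, {})[u] = w
    return dist


def tsp_lower_bound(dist, required=()):
    """Lower bound on tour length: half the sum, over all vertices, of
    the two cheapest incident edges, forcing edges listed in required.
    Assumes at most two required edges meet at any vertex."""
    total = 0
    for v in dist:
        # Neighbours that v must be joined to because of required edges.
        forced = [u for e in required if v in e for u in e if u != v]
        others = sorted(w for u, w in dist[v].items() if u not in forced)
        picked = [dist[v][u] for u in forced] + others[: 2 - len(forced)]
        total += sum(picked)
    return math.ceil(total / 2)


DIST = build_dist(EDGES)
print(tsp_lower_bound(DIST))                # 14
print(tsp_lower_bound(DIST, [("a", "d")]))  # 16
```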
Approximation algorithms run the gamut in level of sophistication; most of them are based on
some problem-specific heuristic. A heuristic is a common-sense rule drawn from experience
rather than from a mathematically proved assertion. For example, going to the nearest unvisited city
in the travelling salesman problem is a good illustration of this notion.
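We can quantify the accuracy of an approximate solution sa to a problem of minimizing some
function f by the size of the relative error of this approximation,
$re(s_a) = \frac{f(s_a) - f(s^*)}{f(s^*)},$
where s* is an exact solution to the problem.
Alternatively, since $re(s_a) = f(s_a)/f(s^*) - 1$, we can simply use the accuracy ratio
$r(s_a) = \frac{f(s_a)}{f(s^*)}$
as a measure of accuracy of sa. For maximization problems, the accuracy ratio is usually computed as
$r(s_a) = \frac{f(s^*)}{f(s_a)}$
to make this ratio greater than or equal to 1, as it is for minimization problems. Obviously, the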
closer r(sa) is to 1, the better the approximate solution is. For most instances, however, we cannot
compute the accuracy ratio, because we typically do not know f (s*), the true optimal value of the
objective function. Therefore, our hope should lie in obtaining a good upper bound on the values of
r(sa). This leads to the following definitions.
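A polynomial-time approximation algorithm is said to be a c-approximation algorithm, where
c ≥ 1, if the accuracy ratio of the approximation it produces does not exceed c for any instance of
the problem in question: r(sa) ≤ c.
The best (i.e., the smallest) value of c for which this inequality holds for all instances of the
problem is called the performance ratio of the algorithm and denoted RA.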
The performance ratio serves as the principal metric indicating the quality of the
approximation algorithm. We would like to have approximation algorithms with RA as close to 1
as possible. Unfortunately, as we shall see, some approximation algorithms have infinitely large
performance ratios (RA = ∞). This does not necessarily rule out using such algorithms, but it does
call for a cautious treatment of their outputs.
Greedy Algorithms for the TSP
The simplest approximation algorithms for the traveling salesman problem are based on the
greedy technique. We will discuss here two such algorithms.
1. Nearest-neighbor algorithm
2. Multifragment-heuristic algorithm
NEAREST-NEIGHBOR ALGORITHM
The following well-known greedy algorithm is based on the nearest-neighbor heuristic:
always go next to the nearest unvisited city.
Step 1 Choose an arbitrary city as the start.
Step 2 Repeat the following operation until all the cities have been visited: go to the unvisited city
nearest the one visited last (ties can be broken arbitrarily).
Step 3 Return to the starting city.
EXAMPLE 1 For the instance represented by the graph in Figure 5.20, with a as the starting
vertex, the nearest-neighbor algorithm yields the tour (Hamiltonian circuit) sa: a − b − c − d − a of
length 10.
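The optimal tour s* has length 8, so the accuracy ratio of this approximation is
$r(s_a) = f(s_a)/f(s^*) = 10/8 = 1.25$
(i.e., tour sa is 25% longer than the optimal tour s*).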
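The three steps and the accuracy ratio of Example 1 can be checked with the sketch below. The graph of Figure 5.20 is not reproduced in these notes, so the distances in DIST are an assumption chosen to give a nearest-neighbor tour of length 10 and an optimal tour of length 8; the function names are hypothetical. The optimal length is found by exhaustive search, which is only practical for very small instances.

```python
from itertools import permutations

# Assumed symmetric distances for a 4-city instance (figure not shown).
DIST = {
    "a": {"b": 1, "c": 3, "d": 6},
    "b": {"a": 1, "c": 2, "d": 3},
    "c": {"a": 3, "b": 2, "d": 1},
    "d": {"a": 6, "b": 3, "c": 1},
}


def tour_length(dist, tour):
    """Total length of a closed tour given as a list of vertices."""
    return sum(dist[u][v] for u, v in zip(tour, tour[1:]))


def nearest_neighbor_tour(dist, start):
    tour = [start]                          # Step 1: starting city
    unvisited = set(dist) - {start}
    while unvisited:                        # Step 2: nearest unvisited city
        last = tour[-1]
        nxt = min(sorted(unvisited), key=lambda v: dist[last][v])
        tour.append(nxt)
        unvisited.remove(nxt)
    tour.append(start)                      # Step 3: return to the start
    return tour


def optimal_tour_length(dist, start):
    """Exhaustive search over all tours; for tiny instances only."""
    others = [v for v in dist if v != start]
    return min(tour_length(dist, [start, *p, start]) for p in permutations(others))


sa = nearest_neighbor_tour(DIST, "a")
ratio = tour_length(DIST, sa) / optimal_tour_length(DIST, "a")
print(sa, tour_length(DIST, sa), ratio)  # ['a', 'b', 'c', 'd', 'a'] 10 1.25
```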
Multifragment-heuristic algorithm
Another natural greedy algorithm for the traveling salesman problem considers it as the
problem of finding a minimum-weight collection of edges in a given complete weighted graph so
that all the vertices have degree 2.
Step 1 Sort the edges in increasing order of their weights. (Ties can be broken arbitrarily.)
Initialize the set of tour edges to be constructed to the empty set.
Step 2 Repeat this step n times, where n is the number of cities in the instance being
solved: add the next edge on the sorted edge list to the set of tour edges, provided
this addition does not create a vertex of degree 3 or a cycle of length less than n;
otherwise, skip the edge.
Step 3 Return the set of tour edges.
As an example, applying the algorithm to the graph in Figure 5.20 yields {(a, b), (c, d), (b,
c), (a, d)}. This set of edges forms the same tour as the one produced by the nearest-neighbor
algorithm. In general, the multifragment-heuristic algorithm tends to produce significantly better
tours than the nearest-neighbor algorithm, as we are going to see from the experimental data quoted
at the end of this section. But the performance ratio of the multifragment-heuristic algorithm is also
unbounded, of course.
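A sketch of the multifragment heuristic is given below, under the same assumed 4-city distances as might underlie Figure 5.20 (the figure is not shown in these notes, and all names are hypothetical). Two details are design choices of this sketch: the loop runs until n edges have been accepted, and a union-find structure detects cycles, with a cycle allowed only for the final edge, when it necessarily has length n.

```python
# Assumed symmetric distances for a 4-city instance (figure not shown).
DIST = {
    "a": {"b": 1, "c": 3, "d": 6},
    "b": {"a": 1, "c": 2, "d": 3},
    "c": {"a": 3, "b": 2, "d": 1},
    "d": {"a": 6, "b": 3, "c": 1},
}


def multifragment_tour(dist):
    """Return the list of tour edges chosen by the multifragment heuristic."""
    vertices = sorted(dist)
    n = len(vertices)
    # Step 1: sort the edges by weight (ties broken alphabetically).
    edges = sorted(
        (dist[u][v], u, v)
        for i, u in enumerate(vertices)
        for v in vertices[i + 1:]
    )
    parent = {v: v for v in vertices}       # union-find forest of fragments

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    degree = {v: 0 for v in vertices}
    tour_edges = []
    # Step 2: accept edges that keep all degrees <= 2 and close no short cycle.
    for w, u, v in edges:
        if len(tour_edges) == n:
            break
        if degree[u] == 2 or degree[v] == 2:
            continue                        # would create a vertex of degree 3
        ru, rv = find(u), find(v)
        if ru == rv and len(tour_edges) < n - 1:
            continue                        # would create a cycle shorter than n
        parent[ru] = rv
        degree[u] += 1
        degree[v] += 1
        tour_edges.append((u, v))
    return tour_edges                       # Step 3


print(multifragment_tour(DIST))
# [('a', 'b'), ('c', 'd'), ('b', 'c'), ('a', 'd')]
```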
There is, however, a very important subset of instances, called Euclidean, for which we
can make a nontrivial assertion about the accuracy of both the nearest-neighbor and
multifragment-heuristic algorithms. These are the instances in which intercity distances satisfy
the following natural conditions:
triangle inequality: $d[i, j] \le d[i, k] + d[k, j]$ for any triple of cities i, j, and k (the distance
between cities i and j cannot exceed the length of a two-leg path from i to some
intermediate city k to j)
symmetry: $d[i, j] = d[j, i]$ for any pair of cities i and j (the distance from i to j is the same
as the distance from j to i)
MINIMUM-SPANNING-TREE–BASED ALGORITHMS
There are approximation algorithms for the travelling salesman problem that exploit a
connection between Hamiltonian circuits and spanning trees of the same graph. Since removing an
edge from a Hamiltonian circuit yields a spanning tree, we can expect that the structure of a
minimum spanning tree provides a good basis for constructing a shortest tour approximation. Here
is an algorithm that implements this idea in a rather straightforward fashion.
Twice-around-the-tree algorithm
Step 1 Construct a minimum spanning tree of the graph corresponding to a given instance
of the traveling salesman problem.
Step 2 Starting at an arbitrary vertex, perform a walk around the minimum spanning tree
recording all the vertices passed by. (This can be done by a DFS traversal.)
Step 3 Scan the vertex list obtained in Step 2 and eliminate from it all repeated
occurrences of the same vertex except the starting one at the end of the list. (This
step is equivalent to making shortcuts in the walk.) The vertices remaining on the list
will form a Hamiltonian circuit, which is the output of the algorithm.
EXAMPLE 2 Let us apply this algorithm to the graph in Figure 5.21a. The minimum spanning tree
of this graph is made up of edges (a, b), (b, c), (b, d), and (d, e) (Figure 5.21b). A twice-around-the-
tree walk that starts and ends at a is a, b, c, b, d, e, d, b, a. Eliminating the second b (a shortcut
from c to d), the second d, and the third b (a shortcut from e to a) yields the Hamiltonian circuit a,
b, c, d, e, a of length 39.
FIGURE 5.21 Illustration of the twice-around-the-tree algorithm. (a) Graph. (b) Walk around the
minimum spanning tree with the shortcuts.
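The three steps can be sketched as follows. The graph of Figure 5.21a is not reproduced in these notes, so the weights in DIST are hypothetical, chosen only so that the minimum spanning tree consists of (a, b), (b, c), (b, d), (d, e) and the resulting tour has length 39 as in Example 2. The sketch builds the tree with Prim's algorithm and then lists the vertices in depth-first preorder, which gives the same list as walking around the tree and deleting repeated vertices.

```python
import heapq

# Hypothetical symmetric weights, consistent with Example 2 (figure not shown).
DIST = {
    "a": {"b": 4, "c": 8, "d": 12, "e": 11},
    "b": {"a": 4, "c": 6, "d": 8, "e": 9},
    "c": {"a": 8, "b": 6, "d": 11, "e": 10},
    "d": {"a": 12, "b": 8, "c": 11, "e": 7},
    "e": {"a": 11, "b": 9, "c": 10, "d": 7},
}


def twice_around_the_tree(dist, start):
    """Return (tour, length) produced by the twice-around-the-tree algorithm."""
    # Step 1: minimum spanning tree by Prim's algorithm.
    tree = {v: [] for v in dist}
    in_tree = {start}
    heap = [(w, start, v) for v, w in dist[start].items()]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(dist):
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        tree[u].append(v)
        tree[v].append(u)
        for x, wx in dist[v].items():
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))

    # Steps 2 and 3: DFS preorder = walk around the tree with shortcuts.
    tour, seen, stack = [], set(), [start]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        tour.append(v)
        stack.extend(sorted(tree[v], reverse=True))  # visit alphabetically
    tour.append(start)                               # close the circuit
    length = sum(dist[p][q] for p, q in zip(tour, tour[1:]))
    return tour, length


print(twice_around_the_tree(DIST, "a"))
# (['a', 'b', 'c', 'd', 'e', 'a'], 39)
```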
EXAMPLE 1 Let us consider the instance of the knapsack problem with the knapsack capacity 10
and the item information as follows:
Computing the value-to-weight ratios and sorting the items in nonincreasing order of these
efficiency ratios yields
The greedy algorithm will select the first item of weight 4, skip the next item of weight 7,
select the next item of weight 5, and skip the last item of weight 3. The solution obtained happens
to be optimal for this instance. The total value of the items in the knapsack is $65.
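The greedy selection of Example 1 can be sketched as below. The item tables are not reproduced in these notes: the weights 7, 3, 4, 5 follow from the text, while the values in ITEMS are an assumption consistent with the total of $65; the function name is hypothetical.

```python
# Assumed items as (weight, value); capacity 10 as in Example 1.
ITEMS = [(7, 42), (3, 12), (4, 40), (5, 25)]


def greedy_knapsack(items, capacity):
    """Greedy approximation for the discrete knapsack problem:
    scan items by nonincreasing value-to-weight ratio and take
    every item that still fits."""
    order = sorted(items, key=lambda wv: wv[1] / wv[0], reverse=True)
    chosen, remaining = [], capacity
    for weight, value in order:
        if weight <= remaining:     # item fits: take it
            chosen.append((weight, value))
            remaining -= weight
        # otherwise skip it and try the next item
    return sum(v for _, v in chosen), chosen


print(greedy_knapsack(ITEMS, 10))  # (65, [(4, 40), (5, 25)])
```

The greedy result is not optimal in general; for this instance it happens to coincide with the optimum.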
entirety, take it and proceed to the next item; otherwise, take its largest fraction to
fill the knapsack to its full capacity and stop.
For each of those subsets, it needs O(n) time to determine the subset’s possible extension.
Thus, the algorithm’s efficiency is in $O(kn^{k+1})$. Note that although it is polynomial in n, the time
efficiency of Sahni’s scheme is exponential in k. More sophisticated approximation schemes, called
fully polynomial schemes, do not have this shortcoming.