
Algorithm Analysis

Truong Dinh Huy, Ph.D.


Background
Suppose we have two algorithms, how can we tell which is better?
• We could implement both algorithms and run them both
  – Expensive and error prone
• Preferably, we should analyze them mathematically
  – Algorithm analysis
Analysis of Algorithms
• Efficiency measures
  – how long the program runs → time complexity
  – how much memory it uses → space complexity
• For today, we'll focus on time complexity only
Algorithm Analysis
• In general, we will always analyze algorithms with respect to one or more
  variables of the input data. In this lecture, we focus on one variable.
• Given an algorithm:
  – We need to describe its running time mathematically as a function T(N)
    (time complexity), where N is the input size.
  – We need to do this in a machine-independent way:
    • We count the number of abstract simple steps
    • Not physical runtime in seconds
    • Not every machine instruction
Time Complexity: Sum of Array

  int sum_array( int *array, int N ) {
      int sum;                          // no time
      sum = 0;                          // 1
      for ( int i = 0; i < N; ++i ) {   // 1 (init) + N+1 (tests) + N (increments)
          sum = sum + array[i];         // N additions + N assignments
      }
      return sum;                       // 1
  }

Time complexity: T(N) = 1 + 1 + (N+1) + N + 2N + 1 = 4N + 4
Asymptotic Analysis
• Complexity as a function of input size n
  T(n) = 4n + 5
  T(n) = 0.5 n log n - 2n + 7
  T(n) = 2^n + n^3 + 3n
• What happens as n grows?

Rate of Growth
• Most algorithms are fast for small n
  – Time difference too small to be noticeable
  – External things dominate (OS, disk I/O, …)
• n is typically large in practice
  – Databases, internet, graphics, …
• Time difference really shows up as n grows!

Quadratic Growth
Consider the two functions
  f(n) = n^2 and g(n) = n^2 - 3n + 2
Around n = 0, they look very different.

Quadratic Growth
Yet on the range n = [0, 1000], they are (relatively) indistinguishable.

Quadratic Growth
The absolute difference is large, for example,
  f(1000) = 1 000 000
  g(1000) = 997 002
but the relative difference is very small:
  (f(1000) - g(1000)) / f(1000) = 0.002998 ≈ 0.3%
and this difference goes to zero as n → ∞.

Polynomial Growth
To demonstrate with another example, consider
  f(n) = n^6 and g(n) = n^6 - 23n^5 + 193n^4 - 729n^3 + 1206n^2 - 648n
Around n = 0, they are very different.

Polynomial Growth
Still, around n = 1000, the relative difference is less than 3%.
Comparison of two functions

Which is better: 50N^2 + 31N^3 + 24N + 15 or 3N^2 + N + 21 + 4·3^N?
The answer depends on the value of N:

   N   50N^2 + 31N^3 + 24N + 15   3N^2 + N + 21 + 4·3^N
   1              120                        37
   2              511                        71
   3             1374                       159
   4             2895                       397
   5             5260                      1073
   6             8655                      3051
   7            13266                      8923
   8            19279                     26465
   9            26880                     79005
  10            36255                    236527
What Happened?

   N   3N^2 + N + 21 + 4·3^N     4·3^N    % of total
   1             37                 12        32.4
   2             71                 36        50.7
   3            159                108        67.9
   4            397                324        81.6
   5           1073                972        90.6
   6           3051               2916        95.6
   7           8923               8748        98.0
   8          26465              26244        99.2
   9          79005              78732        99.7
  10         236527             236196        99.9

– One term dominated the sum

As N Grows, Some Terms Dominate

  Function     N=10     N=100     N=1000     N=10000     N=100000
  log_2 N         3         6          9          13           16
  N              10       100       1000       10000       100000
  N log_2 N      30       664       9965       10^5         10^6
  N^2          10^2      10^4       10^6       10^8        10^10
  N^3          10^3      10^6       10^9      10^12        10^15
  2^N          10^3     10^30     10^301    10^3010     10^30103
Order of Magnitude Analysis
• Measure speed with respect to the part of the sum that grows quickest:
  50N^2 + 31N^3 + 24N + 15
  3N^2 + N + 21 + 4·3^N
• Ordering:
  1 < log_2 N < N < N log_2 N < N^2 < N^3 < 2^N < 3^N
Order of Magnitude Analysis (cont)
Furthermore, simply ignore any constant in front of a term and report the
general class of the term:
  31N^3 grows proportionally to N^3
  4·3^N grows proportionally to 3^N
  15 N log_2 N grows proportionally to N log_2 N
Thus:
  50N^2 + 31N^3 + 24N + 15 grows proportionally to N^3
  3N^2 + N + 21 + 4·3^N grows proportionally to 3^N
When comparing algorithms, determine formulas to count the operation(s) of
interest, then compare the dominant terms of the formulas.
Obtaining Asymptotic Bounds
• Eliminate low-order terms
  – 4n + 5 → 4n
  – 0.5 n log n - 2n + 7 → 0.5 n log n
  – 2^n + n^3 + 3n → 2^n
• Eliminate coefficients
  – 4n → n
  – 0.5 n log n → n log n
  – n log (n^2) = 2 n log n → n log n
Big O Notation
• Algorithm A requires time proportional to f(N): the algorithm is said to
  be of order f(N), or O(f(N))
• Definition: an algorithm is said to take time proportional to O(f(N)) if
  there is some constant C such that, for all but a finite number of values
  of N, the time taken by the algorithm is less than C·f(N)
• T(n) = O(f(n)): growth rate of T(n) ≤ that of f(n)
  – ∃ constants c and n0 s.t. T(n) ≤ c·f(n) ∀n ≥ n0
  – Or: if lim (n→∞) T(n)/f(n) exists and is finite, then T(n) is O(f(n))

Examples:
  50N^2 + 31N^3 + 24N + 15 is O(N^3)
  3N^2 + N + 21 + 4·3^N is O(3^N)
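To make the definition concrete, here is a quick sketch (not from the
slides) that checks the first example against the definition, with assumed
witnesses c = 120 and n0 = 1: since 50 + 31 + 24 + 15 = 120, we have
T(N) ≤ 120·N^3 for every N ≥ 1.

  #include <cstdio>

  // T(N) = 50N^2 + 31N^3 + 24N + 15
  long long T(long long n) { return 50*n*n + 31*n*n*n + 24*n + 15; }

  int main() {
      // Hypothetical witnesses for the Big O definition: c = 120, n0 = 1.
      const long long c = 120, n0 = 1;
      for (long long n = n0; n <= 1000; ++n)
          if (T(n) > c * n*n*n)
              printf("counterexample at n = %lld\n", n);  // never triggers
      printf("T(n) <= %lld * n^3 held for all n in [1, 1000]\n", c);
      return 0;
  }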
Big O Notation (2)
• If an algorithm is O(f(N)), f(N) is said to be the growth-rate function
  of the algorithm.
• Or: f(N) is an upper bound on T(N):
  T(N) = 2N^2  =>  T(N) = O(N^2) = O(N^3) = O(N^4)
• But O(N^2) is the best answer. The answer should be as tight (good) as
  possible.
Other Terminologies
• T(n) = Ω(f(n)): growth rate of T(n) ≥ that of f(n)
  – ∃ constants c and n0 s.t. T(n) ≥ c·f(n) ∀n ≥ n0
• T(n) = Θ(f(n)): growth rate of T(n) = that of f(n)
  – T(n) = O(f(n)) and T(n) = Ω(f(n))
• T(n) = o(f(n)): growth rate of T(n) < that of f(n)
  – T(n) = O(f(n)) and T(n) ≠ Θ(f(n))
• T(n) = ω(f(n)): growth rate of T(n) > that of f(n)
  – T(n) = Ω(f(n)) and T(n) ≠ Θ(f(n))
Typical Growth Rates
– c          Constant
– log N      Logarithmic
– log^k N    Poly-log (k is a constant)
– N          Linear
– N log N    Log-linear
– N^2        Quadratic
– N^3        Cubic
– N^k        Polynomial (k is a constant)
– C^N        Exponential (C is a constant)
Types of Analysis
Three orthogonal axes:
– bound flavor
  • upper bound (O, o)
  • lower bound (Ω, ω)
  • asymptotically tight (Θ)
– analysis case
  • worst case (adversary)
  • average case
  • best case
  • "common" case
– analysis quality
  • loose bound (most true analyses)
  • tight bound (no better bound which is asymptotically different)
Analyzing Code
• General guidelines
  – simple C++ operations: constant time
  – consecutive statements: sum of times per statement
  – conditionals: time of the condition plus sum of the branches
  – loops: sum over iterations
  – function calls: cost of the function body
Simple loop
• Rule 1 (single loops): the running time of a loop is, at most, the
  running time of the statements inside the loop (including tests)
  multiplied by the number of iterations.
Ex:
  for i = 1 to n do
      k++
Simple loops (2)
• Rule 2 (nested loops): analyze from the inside out. The total running
  time is the running time of the innermost statement multiplied by the
  product of the sizes of all the loops (see the sketch below).
  for i = 1 to n do
      for j = 1 to n do
          sum = sum + 1
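A minimal sketch (not on the slides) that checks both rules by counting how
many times each loop body runs; the counter names are illustrative:

  #include <cstdio>

  int main() {
      const int n = 100;

      long long count1 = 0;           // Rule 1: single loop
      for (int i = 1; i <= n; ++i)
          ++count1;                   // body runs n times -> O(n)

      long long count2 = 0;           // Rule 2: nested loops
      for (int i = 1; i <= n; ++i)
          for (int j = 1; j <= n; ++j)
              ++count2;               // body runs n*n times -> O(n^2)

      printf("single loop: %lld (n = %d)\n", count1, n);        // 100
      printf("nested loops: %lld (n^2 = %d)\n", count2, n * n); // 10000
      return 0;
  }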
Simple loops (3)
Ex1:
  for i = 1 to n do
      for j = i to n do
          sum = sum + 1
  T(n) = ?

Ex2:
  for i = 1 to n do
      for j = i to n do
          for k = i to j do
              sum = sum + 1
  T(n) = ?
Quiz
Conditions
• Worst-case running time: the test, plus either the then part or the else
  part (whichever is larger).
• Conditional:
  if C then S1 else S2
  T(n) <= time of C + max(time of S1, time of S2)
       <= time of C + time of S1 + time of S2
• Ex:
  if (N == 0) return 0;
  else {
      for (i = 0; i < N; i++) sum++;
  }
Consecutive statements

T(N)= ?
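The code for this slide did not survive extraction. A typical stand-in
example (assumed, in the spirit of the loop rules above) is a simple loop
followed by a nested loop; for consecutive statements the times add, so the
nested loop dominates:

  // Assumed example: consecutive statements, analyzed by summing costs.
  void consecutive(int a[], int N) {
      for (int i = 0; i < N; i++)        // first statement: O(N)
          a[i] = 0;
      for (int i = 0; i < N; i++)        // second statement: O(N^2)
          for (int j = 0; j < N; j++)
              a[i] += a[j] + i + j;
  }                                      // total: O(N) + O(N^2) = O(N^2)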
Logarithms in running time
• Example 1:
  for (i = N; i >= 1; )
      i = i / 2;
  T(N) = O(logN)

• Example 2: Greatest common divisor
  long gcd( int m, int n )
  {
      while( n != 0 )
      {
          int rem = m % n;
          m = n;
          n = rem;
      }
      return m;
  }
  After any two iterations, the remainder is at most half of its original
  value, so the while loop runs O(logN) times.
Recursion
• Recursion
  – Function calls itself
  – Some cases are very difficult to analyze
• Example: Factorial
• Iterative version: O(N)
• Recursive version:
    fac(n)
        if n = 0 return 1
        else return n * fac(n - 1)

  T(0) = 1
  T(n) <= c + T(n - 1) if n > 0
Example: Factorial
Analysis by the substitution method:
  T(n) <= c + c + T(n - 2)          (by substitution)
  T(n) <= c + c + c + T(n - 3)      (by substitution, again)
  T(n) <= kc + T(n - k)             (extrapolating, 0 < k <= n)
  T(n) <= nc + T(0) = nc + b        (setting k = n)

• T(n) = O(n) (the same as the iterative version)
Bad example of Recursion: Fibonacci

T(N) = T(N-1) + T(N-2) + 2

(3/2)^N <= T(N) < (5/3)^N
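For reference, a sketch of the naive recursive Fibonacci that this
recurrence describes (assumed; the slide gives only the recurrence): each
call spawns two more calls plus a constant amount of work, which is where
T(N) = T(N-1) + T(N-2) + 2 comes from.

  // Naive recursive Fibonacci: exponential time, since fib(n) recomputes
  // the same subproblems over and over.
  long long fib(int n) {
      if (n <= 1)
          return n;                     // base cases: fib(0)=0, fib(1)=1
      return fib(n - 1) + fib(n - 2);   // two recursive calls + an addition
  }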


Good example of Recursion: 3^N
• Iterative version: O(N)
• Using recursion, we can achieve a faster program: O(logN)

  Exponentiator (n)  // computes 3^n for integer n >= 1
      if (n == 1)
          return 3;                       // base case: 3^1 = 3
      else if (n % 2 == 0) {
          x = Exponentiator (n / 2);
          return x * x;                   // 3^n = (3^(n/2))^2
      } else {
          x = Exponentiator ((n - 1) / 2);
          return 3 * x * x;               // 3^n = 3 * (3^((n-1)/2))^2
      }
Proof
• T(N) = T(N/2) + c and T(1) = 1
• T(N) = T(N/4) + c + c
• …
• T(N) = T(N/2^i) + c + … + c   (i terms of c)
• T(N) = T(1) + c·logN = O(logN)
Quiz
• Analysis by the substitution method:

• T(N) = 2T(N/2) + 1

• T(N) = 2T(N/2) + N
Master Theorem for Divide and Conquer

  T(n) = a·T(n/b) + Θ(n^k · log^p n)
  where a >= 1, b > 1, k >= 0 and p is a real number.

Case 1: a > b^k, then T(n) = Θ(n^(log_b a))
Case 2: a = b^k:
  if p > -1 then T(n) = Θ(n^(log_b a) · log^(p+1) n)
  if p = -1 then T(n) = Θ(n^(log_b a) · loglog n)
  if p < -1 then T(n) = Θ(n^(log_b a))
Case 3: a < b^k:
  if p >= 0 then T(n) = Θ(n^k · log^p n)
  if p < 0 then T(n) = O(n^k)
Examples
Admissible equations:

Example 1: T(n) = 9T(n/3) + n         T(n) = ?

Example 2: T(n) = T(2n/3) + 1         T(n) = ?

Example 3: T(n) = 3T(n/4) + n·logn    T(n) = ?

Example 4: T(n) = 2T(n/2) + n/logn    T(n) = ?
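As a worked check (not on the slide), Example 1 has a = 9, b = 3, k = 1,
p = 0; since a = 9 > b^k = 3, Case 1 applies and
T(n) = Θ(n^(log_3 9)) = Θ(n^2).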

Inadmissible equations:

Example 5: T(n) = 2^n · T(n/2) + 1    (a = 2^n is not a constant)

Example 6: T(n) = T(n/2) - n^2·logn   (the driving function is negative)


Master Theorem for Subtraction and Conquer

  T(n) = a·T(n - b) + f(n)
  for some constants a > 0, b > 0, k >= 0, and function f(n).

If f(n) is in O(n^k), then:
  T(n) = O(n^k)            if a < 1
  T(n) = O(n^(k+1))        if a = 1
  T(n) = O(n^k · a^(n/b))  if a > 1

For example, the factorial recurrence T(n) = T(n-1) + c has a = 1, b = 1,
k = 0, giving T(n) = O(n), matching the substitution analysis above.
Maximum Subsequence Problem
• Given an array of N integers (possibly negative), find the maximum sum of
  the contiguous elements between the i-th and j-th positions.
• For example, given -2, 11, -4, 13, -5, -2, the answer is 20
  (from A[1] to A[3]).
Algorithm 1
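The slide's code did not survive extraction; a sketch of the usual
brute-force version (assumed) tries every pair (i, j) and sums the elements
in between, which is exactly what the analysis below counts:

  // Brute force: for each pair (i, j), sum a[i..j] from scratch.
  // Three nested loops give the O(N^3) behavior analyzed below.
  int maxSubSum1(const int a[], int N) {
      int maxSum = 0;
      for (int i = 0; i < N; ++i)
          for (int j = i; j < N; ++j) {
              int thisSum = 0;
              for (int k = i; k <= j; ++k)
                  thisSum += a[k];
              if (thisSum > maxSum)
                  maxSum = thisSum;
          }
      return maxSum;
  }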
Analysis
• Inner loops: for a fixed i, the work is
  Σ (j = i to N-1) of (j - i + 1) = (N - i + 1)(N - i)/2
• Outer loop:
  Σ (i = 0 to N-1) of (N - i + 1)(N - i)/2 = (N^3 + 3N^2 + 2N)/6
• Overall: O(N^3)
Algorithm 2
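The code for Algorithm 2 also did not survive extraction; the usual O(N^2)
improvement (assumed) drops the innermost loop by extending a running sum:

  // O(N^2): reuse the running sum of a[i..j-1] when extending to a[i..j],
  // removing the innermost loop of Algorithm 1.
  int maxSubSum2(const int a[], int N) {
      int maxSum = 0;
      for (int i = 0; i < N; ++i) {
          int thisSum = 0;
          for (int j = i; j < N; ++j) {
              thisSum += a[j];           // sum of a[i..j] in O(1) per step
              if (thisSum > maxSum)
                  maxSum = thisSum;
          }
      }
      return maxSum;
  }

Dropping the recomputation of the inner sum is what removes one factor of N.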
Divide and Conquer
1. Break a big problem into two small sub-problems
2. Solve each of them efficiently
3. Combine the two solutions
Algorithm 3: Divide and conquer
• Divide the array into two parts: a left part and a right part.
• The max. subsequence lies completely in the left part, completely in the
  right part, or spans the middle.
• If it spans the middle, then it consists of the max subsequence in the
  left part ending at its last element, plus the max subsequence in the
  right part starting from the center (see the sketch below).
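A sketch of the recursive algorithm (assumed; the slides describe only the
idea), following the three cases above; an empty subsequence counts as 0:

  #include <algorithm>  // std::max

  // Max contiguous-subsequence sum in a[left..right].
  int maxSumRec(const int a[], int left, int right) {
      if (left == right)                               // base: one element
          return a[left] > 0 ? a[left] : 0;
      int center = (left + right) / 2;
      int maxLeft  = maxSumRec(a, left, center);       // case 1: all left
      int maxRight = maxSumRec(a, center + 1, right);  // case 2: all right

      // Case 3: spans the middle. Best sum ending exactly at the center...
      int maxLeftBorder = 0;
      for (int i = center, sum = 0; i >= left; --i) {
          sum += a[i];
          maxLeftBorder = std::max(maxLeftBorder, sum);
      }
      // ...plus the best sum starting just right of the center.
      int maxRightBorder = 0;
      for (int j = center + 1, sum = 0; j <= right; ++j) {
          sum += a[j];
          maxRightBorder = std::max(maxRightBorder, sum);
      }
      return std::max({maxLeft, maxRight, maxLeftBorder + maxRightBorder});
  }

Called as maxSumRec(a, 0, N-1). The two border scans make the combine step
O(N), which is where T(n) = 2T(n/2) + cn below comes from.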
Divide and conquer
  4  -3  5  -2  |  -1  2  6  -2

• Max subsequence sum for the first half = 6 (4 - 3 + 5)
• Max subsequence sum for the second half = 8 (2 + 6)
• Max subsequence sum for the first half ending at its last element = 4
  (4 - 3 + 5 - 2)
• Max subsequence sum for the second half starting at its first element = 7
  (-1 + 2 + 6)
• Max subsequence sum spanning the middle = 4 + 7 = 11
• The max subsequence sum is 11, and the best subarray includes elements
  from both halves.
Complexity Analysis
T(n) = 2T(n/2) + cn = O(n·logn) (Master Theorem)

Proof by substitution:
  T(n) = 2T(n/2) + cn
       = 2(2T(n/4) + cn/2) + cn
       = 4T(n/4) + 2cn
       = 8T(n/8) + 3cn
       = …
       = 2^i · T(n/2^i) + i·cn
       = …  (we reach the base case when n = 2^i, i.e., i = logn)
       = n·T(1) + cn·logn = O(n·logn)
Algorithm 4

O(n) complexity
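The slide's code did not survive extraction; a sketch of the usual
linear-scan solution (assumed): keep a running sum and reset it to zero
whenever it goes negative, since a negative prefix can never start the best
subsequence.

  // O(N): single pass over the array.
  int maxSubSum4(const int a[], int N) {
      int maxSum = 0, thisSum = 0;
      for (int j = 0; j < N; ++j) {
          thisSum += a[j];
          if (thisSum > maxSum)
              maxSum = thisSum;      // new best subsequence ends at j
          else if (thisSum < 0)
              thisSum = 0;           // a negative prefix is never useful
      }
      return maxSum;
  }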
Summary
• Describe the running time of an algorithm as a mathematical function of
  the input size.
• Terminologies: O(·), Ω(·), Θ(·).
• How to analyze a program.
• Example: Maximum Subsequence Problem.
• Next week: Sorting.
