CSE373: Design and Analysis of Algorithms
Analysis of Algorithms
Algorithm
In simple terms, an algorithm is a series of instructions to
solve a problem (complete a task).
Real-life examples:
• Explain how to compute the GCD to an 8-year-old child
• Explain how to tie a tie
Analysis:
• Estimate the cost of an algorithm in terms of resources (e.g.
memory, processor, bandwidth, etc.) and performance (time).
• Prove why a certain algorithm works (achieves its goal).
External motivation
• To be able to be hired by a top software company
• To be able to achieve accolades in programming contests
• To be able to do research on design and analysis of algorithms in
theoretical CS, Data Mining, AI, etc.
[Figure: directed graph on vertices a, b, c, d; the paths from a to d are
a → b → d, a → c → d, and a → b → c → d]
Common Types of Computational Problems
• Counting Problem: Compute the number of solutions to a
given search problem
• 3-coloring counting problem: given an undirected graph G = (V, E)
compute the number of ways we can color the graph using 3 colors:
{1,2,3} (while coloring we have to ensure that no two adjacent vertices
have the same color).
• Path counting problem: Compute the number of possible paths from u
to v in a given graph G (a minimal sketch follows below).
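As referenced above, here is a minimal sketch (our addition, not from the slides) of the path counting problem: a recursive depth-first search that counts the simple paths from u to v. The 4-vertex graph mirrors the figure above; identifiers such as count_paths are our own.

    /* Counts simple paths (no repeated vertex) from u to v in a small
       graph stored as an adjacency matrix. Vertices: 0=a, 1=b, 2=c, 3=d. */
    #include <stdio.h>

    #define N 4

    int adj[N][N] = {
        {0, 1, 1, 0},   /* a -> b, a -> c */
        {0, 0, 1, 1},   /* b -> c, b -> d */
        {0, 0, 0, 1},   /* c -> d         */
        {0, 0, 0, 0}
    };
    int visited[N];

    int count_paths(int u, int v)
    {
        if (u == v)
            return 1;
        visited[u] = 1;
        int total = 0;
        for (int w = 0; w < N; w++)
            if (adj[u][w] && !visited[w])
                total += count_paths(w, v);
        visited[u] = 0;   /* backtrack so other paths may reuse u */
        return total;
    }

    int main(void)
    {
        printf("paths from a to d: %d\n", count_paths(0, 3)); /* prints 3 */
        return 0;
    }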
Common Types of Computational Problems
• Optimization Problems: Find the best possible solution among
the set of all possible solutions to a search problem
• Graph coloring optimization problem: given an undirected
graph G = (V, E), what is the minimum number of colors
required to assign "colors" to each vertex in such a way that
no two adjacent vertices have the same color?
• Path optimization problem: Find the shortest path(s) from u
to v in a given graph G (see the BFS sketch after the figure below).
[Figure: the same graph on a, b, c, d. Find all the shortest paths
from a to d: a → b → d and a → c → d]
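Below is a minimal sketch (our addition, not part of the slides) of the path optimization problem: breadth-first search finds the length of a shortest path in an unweighted graph. The 4-vertex graph and the labels a..d mirror the figure; the function name bfs_dist is our own.

    #include <stdio.h>

    #define N 4  /* 0=a, 1=b, 2=c, 3=d */

    int adj[N][N] = {
        {0, 1, 1, 0},
        {0, 0, 1, 1},
        {0, 0, 0, 1},
        {0, 0, 0, 0}
    };

    /* Returns the number of edges on a shortest u-to-v path, or -1 if none. */
    int bfs_dist(int u, int v)
    {
        int dist[N], queue[N], head = 0, tail = 0;
        for (int i = 0; i < N; i++) dist[i] = -1;
        dist[u] = 0;
        queue[tail++] = u;            /* each vertex is enqueued at most once */
        while (head < tail) {
            int x = queue[head++];
            if (x == v) return dist[x];
            for (int w = 0; w < N; w++)
                if (adj[x][w] && dist[w] == -1) {
                    dist[w] = dist[x] + 1;
                    queue[tail++] = w;
                }
        }
        return -1;
    }

    int main(void)
    {
        printf("shortest a-to-d path length: %d\n", bfs_dist(0, 3)); /* 2 */
        return 0;
    }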
Complexity of Computational Problems
Computational Problems can be categorized into different
classes based on their complexity, such as:
• Unsolvable problems: problems that can’t be solved by
anyone or any machine ever
• Class P problems: problems that are “efficiently/easily
solvable”
• Class NP problems: problems whose outputs can be
“efficiently/easily verified”
• Class NPC: class NP problems that are “hardest to solve”
….
Insertion Sort: pseudocode vs. C

INSERTION-SORT (A, n)  ⊳ A[1 . . n]      for (j = 2; j <= n; j++) {
  for j ← 2 to n                            key = A[j];
    do key ← A[ j ]                         i = j - 1;
       i ← j – 1                            while (i > 0 && A[i] > key) {
       while i > 0 and A[i] > key             A[i+1] = A[i];
         do A[i+1] ← A[i]                     i = i - 1;
            i ← i – 1                       } //while
       A[i+1] ← key                         A[i+1] = key;
                                          } //for

In the pseudocode, indentation/spacing determines where the
algorithm/loop/if/else-body ends (in C, the closing brace } does).
Insertion Sort Simulation (A = [7, 4, 9, 5, 1])

INSERTION-SORT (A, n)  ⊳ A[1 . . n]
  for j ← 2 to n
    do ⊳ Insert A[ j ] into the sorted subarray A[1..j-1]
       ⊳ in such a position that A[1..j] becomes sorted
       key ← A[ j ]
       i ← j – 1
       while i > 0 and A[i] > key
         do A[i+1] ← A[i]
            i ← i – 1
       A[i+1] ← key

Simulation snapshots (indices 1 2 3 4 5):
  j=2, key=4:  7 4 9 5 1  →  4 7 9 5 1
  j=3, key=9:  4 7 9 5 1  →  4 7 9 5 1
  j=4, key=5:  4 7 9 5 1  →  4 5 7 9 1
  j=5, key=1:  4 5 7 9 1  →  1 4 5 7 9

Loop Invariant: At the beginning of each iteration
of the for loop, A[1..j-1] is already sorted
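A direct C translation of the pseudocode above (a sketch of ours, using 0-based indexing instead of the slides' 1-based A[1..n]):

    #include <stdio.h>

    void insertion_sort(int A[], int n)
    {
        for (int j = 1; j < n; j++) {     /* pseudocode: for j <- 2 to n */
            int key = A[j];
            int i = j - 1;
            /* shift elements of the sorted prefix that are greater than key */
            while (i >= 0 && A[i] > key) {
                A[i + 1] = A[i];
                i = i - 1;
            }
            A[i + 1] = key;
        }
    }

    int main(void)
    {
        int A[] = {7, 4, 9, 5, 1};        /* the array simulated above */
        insertion_sort(A, 5);
        for (int k = 0; k < 5; k++)
            printf("%d ", A[k]);          /* prints: 1 4 5 7 9 */
        printf("\n");
        return 0;
    }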
Why analyze algorithms?
• To have an estimate of how much time an algorithm may take to finish for a
given input size (running-time analysis, a.k.a. time-complexity analysis).
• Sometimes, instead of running time, we are interested in how much
memory/space the algorithm may consume while it runs (space complexity).
• It enables us to compare two algorithms.
• What do we actually mean by running-time analysis?
• To determine at what rate the running time increases as the problem size increases.
• The size of the problem can be a range of things depending on the problem at hand, such as:
• size of an array -- e.g. for sorting/searching in an array
• number of bits in the binary representation of the input number -- e.g. for computing the bitwise
not of an input number
• ….
Informal Notion of Running Time
• Express runtime as a function of the input size n (i.e., as a function f(n)) in order
to understand how f(n) grows with n, and
• count only the most significant term of f(n) and ignore everything else (because
the rest won't affect the running time much for very large values of n).
Thus the running times (also called time complexities) of the programs of the previous
slide become:
f(N) = c1·N ≤ N·(some constant)
g(N) = (c1+c2+c3)·N + (c1+c2) ≤ (2c1+2c2+c3)·N = N·(some constant), for N ≥ 1
Thus both these functions are bounded (from above) by some constant multiple of
N, and as such both have the same upper bound: O(N). This means that the running
time of each of these algorithms is always less than or equal to a constant multiple of
N; we ignore the values of constants in the Big-Oh notation, i.e., we never write
O(543N) [it is actually O(N)] or O(65N^2+34N+7) [it is actually O(N^2)].
[Figure: running time vs. increasing n for fA(n) = 30n+8 and fB(n) = n^2+1;
for large n, fB outgrows fA regardless of the constants]
Growth of Functions
[Complexity graphs: successive plots compare log(n), n log(n), n^2, n^3, and n^10;
a log-scale plot compares 1.1^n, n^10, n^20, 2^n, 3^n, and n^n]
Asymptotic Notations
• O-notation (Big Oh)
f(n) is O(g(n)) if there exist constants c > 0 and n0 > 0 such that
f(n) ≤ c·g(n) for all n ≥ n0.
This means that f(n) is bounded from above by a constant multiple of g(n).
Examples:
T(n) = 3n^2 + 10n lg n + 8 is O(n^2), and also O(n^2 lg n), O(n^3), O(n^4), …
(the latter are loose upper bounds)
T'(n) = 52n^2 + 3n^2 lg n + 8 is O(n^2 lg n), and also O(n^3), O(n^4), …
(the latter are loose upper bounds)
[Figure: Big-Oh visualization]
Asymptotic Notations
• Ω-notation (Big Omega)
f(n) is Ω(g(n)) if there exist constants c > 0 and n0 > 0 such that
f(n) ≥ c·g(n) for all n ≥ n0.
This means that f(n) is bounded from below by a constant multiple of g(n).
Examples:
T(n) = 3n^2 + 10n lg n + 8 is Ω(n^2), and also Ω(n lg n), Ω(n), Ω(1), …
(the latter are loose lower bounds)
Asymptotic Notations
• Θ-notation (Big Theta)
f(n) is Θ(g(n)) if f(n) is both O(g(n)) and Ω(g(n)).
This means that f(n) is bounded from above and below by constant multiples of g(n),
i.e., f(n) is roughly proportional to g(n).
Examples:
T(n) = 3n^2 + 10n lg n + 8 is Θ(n^2)
T'(n) = 52n^2 + 3n^2 lg n + 8 is Θ(n^2 lg n)
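As a concrete check of the first Θ example (the constants below are our own choice, not from the slides): for all n ≥ 2 we have lg n ≤ n and 8 ≤ 2n^2, so
3n^2 ≤ 3n^2 + 10n lg n + 8 ≤ 3n^2 + 10n^2 + 2n^2 = 15n^2.
Taking c = 3 in the Ω definition and c = 15, n0 = 2 in the O definition therefore witnesses T(n) = Θ(n^2).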
Some Examples
Determine the time complexity for the following algorithm.
count = 0;      //c1
i = 0;          //c1
while(i < n){   //(n+1)c2
    count++;    //nc3
    i++;        //nc3
}
Usually, when the code is very simple (like above), we simply say that the runtime of
this code is O(n) without doing the straightforward mathematical analysis (here the
total cost is 2c1 + (n+1)·c2 + 2n·c3, a linear function of n, hence O(n)). When the
code is not so simple, we do the actual analysis to compute the time complexity.
Some Examples
Determine the time complexity for the following algorithm.
count = 0;
for(i=0; i<10000; i++)   Θ(1)
    count++;
(The loop always runs exactly 10000 times, a constant independent of any input size n.)
Some Examples
Determine the time complexity for the following algorithm.
count = 0;
for(i=0; i<n; i++)   Θ(n)
    count++;
Some Examples
Determine the time complexity for the following algorithm.
sum = 0;
for(i=0; i<n; i++)
    for(j=0; j<n; j++)   Θ(n^2)
        sum += arr[i][j];
Some Examples
Determine the time complexity for the following algorithm.
sum = 0;
for(i=1; i<=n; i=i*2)   Θ(lg n)
    sum += i;
Why? i doubles every iteration, so the loop runs Θ(lg n) times (prove it!).
Some Examples
Determine the time complexity for the following algorithm.
sum = 0;
for(i=1; i<=n; i=i*2)
    for(j=0; j<i; j++)
        sum += i*j;
The outer loop runs Θ(lg n) times, and for each iteration of the outer loop
the inner loop runs i times; summing over i = 1, 2, 4, …, the total work is
1 + 2 + 4 + … + 2^⌊lg n⌋ < 2n, i.e., Θ(n). (Bounding each inner loop by n
gives only the looser bound O(n lg n).)
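A quick empirical check (our own, not from the slides) of the Θ(n) claim: count how many times the innermost statement executes and compare the count against 2n and n·lg n.

    #include <stdio.h>
    #include <math.h>

    int main(void)
    {
        for (int n = 16; n <= 1 << 20; n *= 16) {
            long long count = 0;
            for (long long i = 1; i <= n; i = i * 2)
                for (long long j = 0; j < i; j++)
                    count++;          /* stands in for: sum += i*j; */
            printf("n=%8d  count=%10lld  2n=%10d  n*lg(n)=%12.0f\n",
                   n, count, 2 * n, n * log2((double)n));
        }
        return 0;
    }

For each n the printed count stays below 2n, well under n·lg n.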
char someString[10];
gets(someString);               // unsafe in real code; fgets is preferred
int t = strlen(someString);     Θ(n)
for(i=0; i<t; i++)
    someString[i] -= 32;
This example shows that a badly implemented algorithm may have greater time
complexity than a more efficient implementation: if the loop condition were
i < strlen(someString), strlen would re-scan the string on every iteration
and the loop would become O(n^2), whereas computing t once keeps it Θ(n).
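A hedged sketch (our addition; function names are our own) of that contrast:

    #include <stdio.h>
    #include <string.h>

    void upcase_slow(char *s)   /* O(n^2): strlen is O(n) and runs n times */
    {
        for (size_t i = 0; i < strlen(s); i++)
            if (s[i] >= 'a' && s[i] <= 'z')
                s[i] -= 32;
    }

    void upcase_fast(char *s)   /* Theta(n): length computed once */
    {
        size_t t = strlen(s);
        for (size_t i = 0; i < t; i++)
            if (s[i] >= 'a' && s[i] <= 'z')
                s[i] -= 32;
    }

    int main(void)
    {
        char a[] = "alibaba", b[] = "alibaba";
        upcase_slow(a);
        upcase_fast(b);
        printf("%s %s\n", a, b);   /* prints: ALIBABA ALIBABA */
        return 0;
    }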
So far, we have ALWAYS been able to determine the time complexity of an
algorithm from the input size alone. But is input size enough to
determine time complexity unambiguously?
int find_a(char *str)
{
    int i;
    for (i = 0; str[i]; i++)
    {
        if (str[i] == 'a')
            return i;
    }
    return -1;
}
What is the time complexity of the above algorithm?
It depends on the input. Consider the inputs (str) "alibi" and "never":
• Θ(1) for the best possible input, which starts with 'a', e.g. when str = "alibaba"
• Θ(n) for the worst possible input, which doesn't contain any 'a', e.g. when str = "nitin"
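A tiny self-contained driver (our addition) exercising find_a on best- and worst-case style inputs:

    #include <stdio.h>

    int find_a(char *str)
    {
        for (int i = 0; str[i]; i++)
            if (str[i] == 'a')
                return i;
        return -1;
    }

    int main(void)
    {
        printf("%d\n", find_a("alibaba"));  /* 0: found at once   -> Theta(1) */
        printf("%d\n", find_a("nitin"));    /* -1: scans all of it -> Theta(n) */
        return 0;
    }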
Types of Time Complexity Analysis
So how does the running time vary with respect to various inputs?
Three scenarios:
Best case
Worst case
Average case
Types of Time Complexity Analysis
• Worst-case: (usually done)
• Running time on the worst possible input
• Best-case: (bogus)
• Running time on the best possible input
• Average-case: (sometimes done)
• Running time averaged over all possible inputs, assuming some
distribution of inputs
How can you arrange the input numbers so that the insertion sort algorithm
below becomes most inefficient (worst case)?
How can you arrange the input numbers so that it becomes most
efficient (best case)?
Insertion Sort: Running Time
Statement                                   cost    times
INSERTION-SORT (A, n)  ⊳ A[1 . . n]
  for j ← 2 to n                            c1      n
    do key ← A[ j ]                         c2      n − 1
       i ← j – 1                            c3      n − 1
       while i > 0 and A[i] > key           c4      Σ_{j=2..n} t_j
         do A[i+1] ← A[i]                   c5      Σ_{j=2..n} (t_j − 1)
            i ← i – 1                       c6      Σ_{j=2..n} (t_j − 1)
       A[i+1] ← key                         c7      n − 1

T(n) = c1·n + c2·(n−1) + c3·(n−1) + c4·Σ_{j=2..n} t_j
       + c5·Σ_{j=2..n} (t_j − 1) + c6·Σ_{j=2..n} (t_j − 1) + c7·(n−1)

Here t_j = the number of times the condition of the while loop is tested for
the current value of j.
How can we simplify T(n)? Hint: compute the value of T(n) in the best/worst case.
Insertion Sort: Running Time (worst and best case)
In the worst case (when the input is reverse-sorted), in each iteration of the
for loop all j−1 elements of A[1..j−1] need to be right-shifted, i.e.,
t_j = (j−1) + 1 = j [1 is added to represent the last test].
Putting this into the equation above, we get T(n) = An^2 + Bn + C → O(n^2),
where A, B, C are constants.
If you are asked to compute the worst-case time of Insertion-Sort, just say
that the while loop runs j times for the worst possible input, i.e., a
reverse-sorted array (explain why), and then compute T(n) from
Σ_{j=2..n} j = n(n+1)/2 − 1, which is O(n^2). You don't really need to show
such a detailed calculation as shown here and in the book.
What is T(n) in the best case (when the input numbers are already sorted)?
Polynomial & non-polynomial time algorithms
• Polynomial-time algorithm: an algorithm whose worst-
case running time is polynomial
• E.g.: Linear Search (in an unsorted array): O(n),
Binary Search (in a sorted array): O(lg n),
Insertion Sort: O(n^2), etc.
• Non-polynomial-time algorithm: an algorithm whose
worst-case running time is not polynomial
• Examples: an algorithm to enumerate and print all possible
orderings of n persons: O(n!), an algorithm to enumerate
and print all possible binary strings of length n: O(2^n)
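A minimal sketch (our addition) of the second non-polynomial example: enumerating and printing all binary strings of length n. The recursion makes the Θ(2^n) behavior visible, since there are 2^n strings to print.

    #include <stdio.h>

    void print_binary_strings(char *buf, int pos, int n)
    {
        if (pos == n) {                       /* one full string built */
            buf[n] = '\0';
            printf("%s\n", buf);
            return;
        }
        buf[pos] = '0';                       /* branch on each position: */
        print_binary_strings(buf, pos + 1, n);
        buf[pos] = '1';                       /* two choices -> 2^n leaves */
        print_binary_strings(buf, pos + 1, n);
    }

    int main(void)
    {
        char buf[8];
        print_binary_strings(buf, 0, 3);      /* 000, 001, ..., 111 */
        return 0;
    }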
Amortized Analysis (Section 17.1)
• Amortized running time: the average time taken by an operation in a
sequence of operations on a given data structure (it doesn't give the time
of a single operation).
• Example: consider the MultipoppableStack data structure; it
supports 3 operations on a stack:
– PUSH(x) -> takes O(1) time
– POP() -> takes O(1) time
– MULTIPOP(k) //pops the top k items from the stack
  while not STACK-EMPTY() and k > 0
    do POP()
       k = k − 1
• Let's consider a sequence of n PUSH, POP & MULTIPOP operations.
• An example of such a sequence of n = 9 operations is: <PUSH, PUSH,
POP, PUSH, PUSH, MULTIPOP(2), PUSH, PUSH, MULTIPOP(3)>
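A minimal C sketch (our addition; the fixed capacity and identifier names are our assumptions) of the MultipoppableStack described above, driven by the n = 9 sequence from the slide:

    #include <stdio.h>

    #define CAP 100

    typedef struct { int a[CAP]; int top; } MPStack;  /* top = # of items */

    int  stack_empty(MPStack *s)  { return s->top == 0; }
    void push(MPStack *s, int x)  { s->a[s->top++] = x; }    /* O(1) */
    int  pop(MPStack *s)          { return s->a[--s->top]; } /* O(1) */

    void multipop(MPStack *s, int k)  /* pops top k items (or until empty) */
    {
        while (!stack_empty(s) && k > 0) {
            pop(s);
            k = k - 1;
        }
    }

    int main(void)
    {
        MPStack s = { .top = 0 };
        push(&s, 1); push(&s, 2); pop(&s);
        push(&s, 3); push(&s, 4); multipop(&s, 2);
        push(&s, 5); push(&s, 6); multipop(&s, 3);
        printf("items left: %d\n", s.top);   /* prints: 0 */
        return 0;
    }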
Example of Amortized Analysis (Contd.)
• MULTIPOP is just a number of POP calls, so we only need to count
the total number of PUSHes and POPs in the sequence.
• Each object can be POPped at most once for each time it is
PUSHed. Therefore
total # of POPs (including POP calls inside MULTIPOP)
≤ total # of PUSHes ≤ total # of operations = n.
Therefore the total # of PUSH + POP calls ≤ 2n.
• Each PUSH/POP operation takes O(1) time, so the total time taken by
the sequence of operations is ≤ 2n·O(1), i.e., O(n).
• There are n operations in the sequence, so on average each
operation takes O(n)/n, which is O(1).
∴ The amortized running time of an operation on MultipoppableStack is O(1)
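To see the ≤ 2n bound concretely, one can instrument such a stack and count every elementary push/pop across a random operation sequence; the sketch below (our own, with arbitrary parameters) does that:

    #include <stdio.h>
    #include <stdlib.h>

    #define CAP 100000

    int stack[CAP], top = 0, ops = 0;  /* ops counts elementary pushes+pops */

    int main(void)
    {
        srand(42);
        int n = 100000;                          /* length of the op sequence */
        for (int t = 0; t < n; t++) {
            int r = rand() % 3;
            if (r == 0 && top < CAP)      { stack[top++] = t; ops++; } /* PUSH */
            else if (r == 1 && top > 0)   { top--; ops++; }            /* POP  */
            else {                                                     /* MULTIPOP(k) */
                int k = rand() % 5;
                while (top > 0 && k > 0) { top--; ops++; k--; }
            }
        }
        printf("n=%d  elementary ops=%d  (bound 2n=%d)\n", n, ops, 2 * n);
        return 0;
    }

Whatever the random sequence, the printed count of elementary operations never exceeds 2n, matching the argument above.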