DSweek3 Algo
Outline
• Algorithms
• Importance of Algorithms
• Algorithms as a Technology
• Asymptotic Growth
Algorithms
• An algorithm is a computational procedure that
– may take some value (or set of values) as input and
– produces some value (or set of values) as output.
• An algorithm must also satisfy
– Finiteness: it terminates after a finite number of steps
– Definiteness: each step is precisely and unambiguously specified
– Effectiveness: each step is basic enough to be carried out exactly
Strategy
• Use the techniques of design, analysis and
experimentation
• Design: create algorithms
• Analysis: examine algorithms and problems
mathematically
– "Algorithm A1 is more efficient than Algorithm A2"
– "Problem P1 is not solvable; problem P2 is solvable"
– "Problem P3 is solvable but intractable" (i.e., requires
too much time/resources to be of practical value)
• Experimentation: implement systems and study
the resulting behavior
Describing Algorithms
• Format
– Title
– Input
– Output
– Body
• Write in pseudocode or actual code
• Use of indentation
• Keywords: If-then-else, while, return
• Special symbols (next page)
Describing Algorithms
• Special symbols:
– "←" (assignment) vs "=" (checking for equality)
– "[ ]" (array indexing, normally starting at 1)
– "▹" or "//" (comments)
– "∅" or "{ }" (empty set)
– "∃" (there exists; for some), "∀" (for all)
– "⌊ ⌋" (floor function), "⌈ ⌉" (ceiling function), "| |" (absolute value, or set cardinality)
– Other math symbols: ≤, ≥, ≠, ∈, ∑, ∞
Describing Algorithms
ALGORITHM MaxArrayElement
INPUT: Array A of n integers
OUTPUT: Integer k such that A[i] = k for some i ∈ {1, 2, …, n}
and A[j] ≤ k ∀ j ∈ {1, 2, …, n}
max ← A[1]
For i ← 2 to n
If A[i] > max then
max ← A[i]
Return max
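The pseudocode above can be sketched as runnable Python. Note that Python lists are 0-indexed, unlike the slides' 1-indexed arrays; this is an illustrative translation, not code from the slides:

```python
def max_array_element(a):
    """Return the maximum element of a non-empty list of integers."""
    max_val = a[0]                 # max <- A[1]
    for i in range(1, len(a)):     # For i <- 2 to n
        if a[i] > max_val:         # If A[i] > max then
            max_val = a[i]         #     max <- A[i]
    return max_val                 # Return max

print(max_array_element([3, 7, 2, 9, 4]))  # prints 9
```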
Try This!
• Write an algorithm that returns the third
maximum element of an array
ALGORITHM: ThirdMax
INPUT: Array A of n distinct integers
OUTPUT: // How do we state this?
// code here…
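One possible answer to the exercise (not from the slides): the output can be stated as "the integer k in A such that exactly two elements of A are greater than k." A single-pass Python sketch, assuming n ≥ 3 and distinct integers as the INPUT line requires:

```python
def third_max(a):
    """Return the third-largest element of a list of distinct integers.

    Tracks the top three values seen so far in one pass.
    Assumes len(a) >= 3.
    """
    first = second = third = float('-inf')
    for x in a:
        if x > first:
            first, second, third = x, first, second
        elif x > second:
            second, third = x, second
        elif x > third:
            third = x
    return third

print(third_max([5, 1, 9, 3, 7]))  # top three are 9, 7, 5 -> prints 5
```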
Importance of Algorithms
• The Human Genome Project
involves sophisticated
algorithms, enabling it to
• identify all the
(approx. 100,000) genes in
human DNA
• determine the sequences of
(approx. 3 billion) chemical
base pairs that make up the
human DNA
• store all of the derived
information in databases
• develop tools for data analysis
Importance of Algorithms
[Figures: run-time comparison of a first vs. a second implementation (artima.com)]
Example: Bubble Sort
• Bubble Sort requires (n² – n) / 2 "steps"
• Suppose the actual numbers of instructions on different CPUs are
– Pentium III CPU: 62 * (n² – n) / 2
– Pentium IV CPU: 56 * (n² – n) / 2
– Motorola CPU: 84 * (n² – n) / 2
• Some Observations
– As n increases, the values of the other terms become insignificant
• Pentium III CPU: 31n²
• Pentium IV CPU: 28n²
• Motorola CPU: 42n²
– As processors change, the coefficients change but not the exponent of n
• We say Bubble Sort runs in O(n²) time
• How about if we measure speed?
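The (n² – n) / 2 step count can be checked empirically. A minimal sketch of a standard bubble sort (an assumed implementation, not code from the slides) that counts comparisons:

```python
def bubble_sort(a):
    """Sort the list a in place; return the number of comparisons made."""
    n = len(a)
    comparisons = 0
    for i in range(n - 1):            # each pass bubbles one element into place
        for j in range(n - 1 - i):
            comparisons += 1
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return comparisons

n = 10
steps = bubble_sort(list(range(n, 0, -1)))   # worst case: reversed input
print(steps, (n * n - n) // 2)               # prints 45 45
```

The comparison count (n – 1) + (n – 2) + … + 1 = (n² – n) / 2 holds for every input, which is why the slide treats it as the step count.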
Big Oh Notation
• Asymptotic Efficiency
• f(n) is O(g(n))
– if there exist constants c > 0 and n0 ≥ 1
– such that f(n) ≤ c·g(n)
– for all n ≥ n0.
[Figure: the dominant term — c·g(n) bounds f(n) from above for all n ≥ n0]
Big Oh Notation
• Prove that:
– 5n + 3 is O(n)
– 3n² + 2n + 35 is O(n²)
– 4 + 6 log n is O(log n)
• Goal: find a pair (c, n0) such that f(n) ≤ c·g(n)
for all n ≥ n0.
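As a worked example of the first claim (my own derivation, following the definition on the previous slide): for all n ≥ 1 we have 3 ≤ 3n, so

```latex
5n + 3 \;\le\; 5n + 3n \;=\; 8n \qquad \text{for all } n \ge 1,
```

and the pair (c, n0) = (8, 1) witnesses that 5n + 3 is O(n). The same trick — bounding each lower-order term by the dominant term — handles the other two claims.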
Big Oh Notation
• Some rules of thumb can be applied to determine
the running time of some algorithms
– Straightforward (Pseudo) Code
– Loops
– If-Else Statements
– Recursive Functions
Big Oh Notation
• O(n):
For i ← 1 to n
sum ← sum + i
• O(n²):
For i ← 1 to n
For j ← i to n
sum ← sum + j
• O(1):
return n * (n + 1) / 2
• O(log n):
While n > 1
count ← count + 1
n ← n / 2
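The O(log n) loop can be checked directly: with integer division, the halving loop runs exactly ⌊log₂ n⌋ times. A quick sketch (not from the slides):

```python
import math

def halving_count(n):
    """Count iterations of: While n > 1 { count <- count + 1; n <- n / 2 }."""
    count = 0
    while n > 1:
        count += 1
        n = n // 2        # integer halving, as in the pseudocode
    return count

for n in (2, 16, 1000, 10**6):
    print(n, halving_count(n), math.floor(math.log2(n)))  # last two columns match
```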
Asymptotic Notation
• Big-Oh Notation
– O(g(n)) is actually a class of functions.
– O(g(n)) = { f(n) | ∃ c, n0 such that 0 ≤ f(n) ≤ c·g(n) ∀ n ≥ n0 }
• O(n²) = { 1.3n² + 35, 4007n² + 11.74, … }
– By its definition, Big-Oh describes a function that is an
"upper bound"
• 3n + 5 is O(n²)
• 3n + 5 is O(n²⁰¹⁰)
• Big-Omega Notation: (a "lower bound" notation)
– Ω(g(n)) = { f(n) | ∃ c, n0 such that 0 ≤ c·g(n) ≤ f(n) ∀ n ≥ n0 }
• 5n⁴ + 99 is Ω(n⁴)
• 5n⁴ + 99 is Ω(n)
• 5n⁴ + 99 is NOT Ω(n⁵)
Asymptotic Notation
• Big-Theta Notation
– f(n) is Θ(g(n)) if and only if f(n) is both O(g(n)) and
Ω(g(n)).
– 3n + 5 is Θ(n)
– 3n + 5 is NOT Θ(n²)
– 3n + 5 is NOT Θ(1)
Asymptotic Notation
• Small Oh (an upper bound that is not tight)
– o(g(n)) = { f(n) | ∀ c > 0, ∃ n0 such that 0 ≤ f(n) <
c·g(n) ∀ n ≥ n0 }
– 3n + 5 is o(n²)
– 3n + 5 is NOT o(n)
• Small Omega
– f(n) is ω(g(n)) if and only if g(n) is o(f(n))
– 3n + 5 is NOT ω(n²)
– 3n + 5 is NOT ω(n)
– 3n + 5 IS ω(1)
• How about Small Theta???
Final Remarks
• Big-Oh is the most often used among the 5 asymptotic
notations
– Easier to prove than Big-Theta
– Info on an upper bound of required resources is often more
important than a lower bound
• Common Big-Oh classes:
– O(1) ⊂ O(log n) ⊂ O(√n) ⊂ O(n) ⊂ O(n log n) ⊂ O(n²)
⊂ O(nᶜ) ⊂ O(2ⁿ) ⊂ O(n!) ⊂ O(nⁿ)
• Although Big-Oh describes an upper bound, it is OK to say
– This algorithm has a worst-case run-time of O(nᶜ).
– This algorithm has a best-case run-time of O(nᶜ).
– This algorithm has an average run-time of O(nᶜ).
References
• Cormen, Thomas, et al., Introduction to
Algorithms (2nd edition), The MIT Press, 2001.
• Zelle, John M., Python Programming: An
Introduction to Computer Science, Franklin,
Beedle & Associates, 2004.