Introduction to Algorithms
6.046J/18.401J

LECTURE 1: Analysis of Algorithms
Insertion sort
Asymptotic analysis
Merge sort
Recurrences

September 7, 2005

Course information
1. Staff
2. Distance learning
3. Prerequisites
4. Lectures
5. Recitations
6. Handouts
7. Textbook
8. Course website
9. Extra help
10. Registration
11. Problem sets
12. Describing algorithms
13. Grading policy
14. Collaboration policy
Analysis of algorithms
The theoretical study of computer-program performance and resource usage.

What's more important than performance?
modularity
user-friendliness
correctness
programmer time
maintainability
simplicity
functionality
extensibility
robustness
reliability
Insertion sort
Pseudocode (input: array A[1 . . n]):

INSERTION-SORT (A, n)
    for j ← 2 to n
        do key ← A[j]
           i ← j − 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i − 1
           A[i+1] ← key

[Figure: A[1 . . j−1] is already sorted; key = A[j] is inserted into its correct position.]
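As a concrete illustration, here is a direct Python transcription of the pseudocode above (a sketch; the function name and 0-based indexing are mine, since the slides use 1-indexed arrays):

def insertion_sort(a):
    """Sort list a in place; mirrors INSERTION-SORT(A, n) with 0-based indexing."""
    for j in range(1, len(a)):          # pseudocode: for j <- 2 to n
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:    # shift larger elements right
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                  # insert key into the hole
    return a

assert insertion_sort([8, 2, 4, 9, 3, 6]) == [2, 3, 4, 6, 8, 9]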
[Figure: step-by-step animation of INSERTION-SORT on the example array 8 2 4 9 3 6.]
Running time
The running time depends on the input: an already sorted sequence is easier to sort.
Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones.
Generally, we seek upper bounds on the running time, because everybody likes a guarantee.
Kinds of analyses
Worst-case: (usually)
T(n) = maximum time of algorithm on any input of size n.
Average-case: (sometimes)
T(n) = expected time of algorithm over all inputs of size n.
Needs an assumption about the statistical distribution of inputs.
Best-case: (bogus)
Cheat with a slow algorithm that works fast on some input.
Machine-independent time
What is insertion sort's worst-case time?
It depends on the speed of our computer:
relative speed (on the same machine),
absolute speed (on different machines).
BIG IDEA: Ignore machine-dependent constants. Look at the growth of T(n) as n → ∞.
"Asymptotic analysis"
Θ-notation
Math: Θ(g(n)) = { f(n) : there exist positive constants c₁, c₂, and n₀ such that 0 ≤ c₁ g(n) ≤ f(n) ≤ c₂ g(n) for all n ≥ n₀ }.
Engineering: Drop low-order terms; ignore leading constants.
Example: 3n³ + 90n² − 5n + 6046 = Θ(n³)
Asymptotic performance
When n gets large enough, a Θ(n²) algorithm always beats a Θ(n³) algorithm.
[Figure: T(n) versus n; the two curves cross at some n₀.]
We shouldn't ignore asymptotically slower algorithms, however. Real-world design situations often call for a careful balancing of engineering objectives. Asymptotic analysis is a useful tool to help to structure our thinking.
Insertion sort analysis
Worst case: input reverse sorted.
T(n) = Σ_{j=2}^{n} Θ(j) = Θ(n²)   [arithmetic series]
Merge sort
MERGE-SORT A[1 . . n]
1. If n = 1, done.
2. Recursively sort A[1 . . ⌈n/2⌉] and A[⌈n/2⌉+1 . . n].
3. Merge the two sorted lists.
Key subroutine: MERGE
[Figure: step-by-step animation of MERGE on the sorted lists 20 13 7 2 and 12 11 9 1; at each step the smaller of the two front elements (here 1, 2, 7, 9, 11, 12, ...) is output.]
Time = Θ(n) to merge a total of n elements (linear time).
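A compact Python sketch of MERGE-SORT and the linear-time MERGE (my own transcription; the slides give only pseudocode):

def merge(left, right):
    """Merge two sorted lists in Theta(len(left) + len(right)) time."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]    # one side is exhausted

def merge_sort(a):
    if len(a) <= 1:                      # step 1: if n = 1, done
        return a
    mid = (len(a) + 1) // 2              # step 2: sort the two halves
    return merge(merge_sort(a[:mid]), merge_sort(a[mid:]))  # step 3: merge

assert merge_sort([20, 13, 7, 2, 12, 11, 9, 1]) == [1, 2, 7, 9, 11, 12, 13, 20]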
Introduction to Algorithms
L1.39
(1) if n = 1;
2T(n/2) + (n) if n > 1.
September 7, 2005
Introduction to Algorithms
L1.40
Recursion tree
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.

[Figure: the recursion tree unfolds level by level. The root costs cn; its two children cost cn/2 each; the four grandchildren cost cn/4 each; and so on down to Θ(1) leaves.]

Height h = lg n.
Each level sums to cn.
#leaves = n, contributing Θ(n).
Total = cn · lg n + Θ(n) = Θ(n lg n).
Conclusions
Θ(n lg n) grows more slowly than Θ(n²).
Therefore, merge sort asymptotically beats insertion sort in the worst case.
In practice, merge sort beats insertion sort for n > 30 or so.
Go test it out for yourself!
Introduction to Algorithms
6.046J/18.401J

LECTURE 2: Asymptotic Notation
O-, Ω-, and Θ-notation
Recurrences
Substitution method
Iterating the recurrence
Recursion tree
Master method

Prof. Erik Demaine
September 12, 2005
Asymptotic notation
O-notation (upper bounds):
We write f(n) = O(g(n)) if there exist constants c > 0, n₀ > 0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀.
EXAMPLE: 2n² = O(n³)   (c = 1, n₀ = 2)
Note: O(g(n)) is really a set of functions, not values, and "=" here is a funny, one-way equality: 2n² ∈ O(n³), but we never write O(n³) = 2n².
Macro substitution
Convention: A set in a formula represents an anonymous function in the set.
EXAMPLE: f(n) = n³ + O(n²)
means f(n) = n³ + h(n) for some h(n) ∈ O(n²).
EXAMPLE: n² + O(n) = O(n²)
means for any f(n) ∈ O(n): n² + f(n) = h(n) for some h(n) ∈ O(n²).
Ω-notation (lower bounds)
We write f(n) = Ω(g(n)) if there exist constants c > 0, n₀ > 0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀.
EXAMPLE: √n = Ω(lg n)   (c = 1, n₀ = 16)

Θ-notation (tight bounds)
Θ(g(n)) = O(g(n)) ∩ Ω(g(n))
EXAMPLE: (1/2)n² − 2n = Θ(n²)
o-notation and ω-notation
O-notation and Ω-notation are like ≤ and ≥; o-notation and ω-notation are like < and >.

o(g(n)) = { f(n) : for any constant c > 0, there is a constant n₀ > 0 such that 0 ≤ f(n) < c·g(n) for all n ≥ n₀ }
EXAMPLE: 2n² = o(n³)   (n₀ = 2/c)

ω(g(n)) = { f(n) : for any constant c > 0, there is a constant n₀ > 0 such that 0 ≤ c·g(n) < f(n) for all n ≥ n₀ }
EXAMPLE: √n = ω(lg n)   (n₀ = 1 + 1/c)
Solving recurrences
The analysis of merge sort from Lecture 1 required us to solve a recurrence.
Solving recurrences is like solving integrals, differential equations, etc.: learn a few tricks.
Lecture 3: Applications of recurrences to divide-and-conquer algorithms.
Substitution method
The most general method:
1. Guess the form of the solution.
2. Verify by induction.
3. Solve for constants.

EXAMPLE: T(n) = 4T(n/2) + n
[Assume that T(1) = Θ(1).]
Guess O(n³). (Prove O and Ω separately.)
Assume that T(k) ≤ ck³ for k < n.
Prove T(n) ≤ cn³ by induction.
Example of substitution
T(n) = 4T(n/2) + n
     ≤ 4c(n/2)³ + n
     = (c/2)n³ + n
     = cn³ − ((c/2)n³ − n)   (desired − residual)
     ≤ cn³                    (desired)
whenever (c/2)n³ − n ≥ 0, for example, if c ≥ 2 and n ≥ 1.
Example (continued)
We must also handle the initial conditions, that is, ground the induction with base cases.
Base: T(n) = Θ(1) for all n < n₀, where n₀ is a suitable constant.
For 1 ≤ n < n₀, we have Θ(1) ≤ cn³, if we pick c big enough.
This bound is not tight! (A tighter O(n²) bound can be proved by subtracting a lower-order term: guess T(n) ≤ c₁n² − c₂n.)
Recursion-tree method
A recursion tree models the costs (time) of a recursive execution of an algorithm.
The recursion-tree method can be unreliable, just like any method that uses ellipses (...).
The recursion-tree method promotes intuition, however.
The recursion-tree method is good for generating guesses for the substitution method.
Example of recursion tree
Solve T(n) = T(n/4) + T(n/2) + n².

[Figure: the root costs n²; its children cost (n/4)² and (n/2)²; the next level costs (n/16)², (n/8)², (n/8)², (n/4)²; and so on down to Θ(1) leaves.]

Level sums: n², (5/16)n², (25/256)n², ...
Total = n² (1 + 5/16 + (5/16)² + (5/16)³ + ⋯)   [geometric series]
      = Θ(n²).

The master method
The master method applies to recurrences of the form T(n) = a T(n/b) + f(n), where a ≥ 1, b > 1, and f is asymptotically positive. Compare f(n) with n^{log_b a}:
CASE 1: f(n) = O(n^{log_b a − ε}) for some constant ε > 0 ⇒ T(n) = Θ(n^{log_b a}).
CASE 2: f(n) = Θ(n^{log_b a} lg^k n) for some constant k ≥ 0 ⇒ T(n) = Θ(n^{log_b a} lg^{k+1} n).
CASE 3: f(n) = Ω(n^{log_b a + ε}) for some constant ε > 0, and f satisfies the regularity condition a·f(n/b) ≤ c·f(n) for some constant c < 1 ⇒ T(n) = Θ(f(n)).
Examples
EX. T(n) = 4T(n/2) + n
a = 4, b = 2 ⇒ n^{log_b a} = n²; f(n) = n.
CASE 1: f(n) = O(n^{2−ε}) for ε = 1.
∴ T(n) = Θ(n²).

EX. T(n) = 4T(n/2) + n²
a = 4, b = 2 ⇒ n^{log_b a} = n²; f(n) = n².
CASE 2: f(n) = Θ(n² lg⁰ n), that is, k = 0.
∴ T(n) = Θ(n² lg n).
Examples
EX. T(n) = 4T(n/2) + n³
a = 4, b = 2 ⇒ n^{log_b a} = n²; f(n) = n³.
CASE 3: f(n) = Ω(n^{2+ε}) for ε = 1, and 4(n/2)³ ≤ cn³ (regularity condition) for c = 1/2.
∴ T(n) = Θ(n³).

EX. T(n) = 4T(n/2) + n²/lg n
a = 4, b = 2 ⇒ n^{log_b a} = n²; f(n) = n²/lg n.
The master method does not apply. In particular, for every constant ε > 0, we have n^ε = ω(lg n).
Idea of master theorem
Recursion tree for T(n) = a T(n/b) + f(n):

[Figure: the root costs f(n); its a children cost f(n/b) each, for a level total of a·f(n/b); the a² grandchildren cost f(n/b²) each, totaling a²·f(n/b²); the tree has height h = log_b n and #leaves = a^h = a^{log_b n} = n^{log_b a}, each costing Θ(1), for a leaf total of Θ(n^{log_b a}).]

CASE 1: The weight increases geometrically from the root to the leaves. The leaves hold a constant fraction of the total weight. ⇒ Θ(n^{log_b a})
CASE 2: (k = 0) The weight is approximately the same on each of the log_b n levels. ⇒ Θ(n^{log_b a} lg n)
CASE 3: The weight decreases geometrically from the root to the leaves. The root holds a constant fraction of the total weight. ⇒ Θ(f(n))
Appendix: geometric series
1 + x + x² + ⋯ + xⁿ = (1 − x^{n+1}) / (1 − x)   for x ≠ 1
1 + x + x² + ⋯ = 1 / (1 − x)                     for |x| < 1
Introduction to Algorithms
6.046J/18.401J

LECTURE 3: Divide and Conquer
Binary search
Powering a number
Fibonacci numbers
Matrix multiplication
Strassen's algorithm
VLSI tree layout

Prof. Erik D. Demaine
September 14, 2005
The divide-and-conquer design paradigm
1. Divide the problem (instance) into subproblems.
2. Conquer the subproblems by solving them recursively.
3. Combine subproblem solutions.
Merge sort (divide and conquer)
1. Divide: Trivial.
2. Conquer: Recursively sort 2 subarrays.
3. Combine: Linear-time merge.

T(n) = 2 T(n/2) + Θ(n)
(# subproblems = 2; subproblem size = n/2; Θ(n) work dividing and combining)
Master method: a = 2, b = 2 ⇒ n^{log_b a} = n; CASE 2 (k = 0) ⇒ T(n) = Θ(n lg n).
Binary search
Find an element in a sorted array:
1. Divide: Check middle element.
2. Conquer: Recursively search 1 subarray.
3. Combine: Trivial.

Example: Find 9 in the sorted array 3 5 7 8 9 12 15. Compare 9 with the middle element 8; since 9 > 8, recurse on the right half 9 12 15; compare with 12; since 9 < 12, recurse on the left; found.

Recurrence: T(n) = 1 T(n/2) + Θ(1)
(# subproblems = 1; subproblem size = n/2; Θ(1) work dividing and combining)
n^{log_b a} = n⁰ = 1; CASE 2 (k = 0) ⇒ T(n) = Θ(lg n).
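An iterative Python sketch of the search (my own transcription; returns the index of the target or −1):

def binary_search(a, target):
    """Search sorted list a for target in Theta(lg n) time."""
    lo, hi = 0, len(a) - 1
    while lo <= hi:
        mid = (lo + hi) // 2        # divide: check the middle element
        if a[mid] == target:
            return mid
        elif a[mid] < target:       # conquer: recurse on one subarray
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                       # combine: trivial

assert binary_search([3, 5, 7, 8, 9, 12, 15], 9) == 4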
Powering a number
Problem: Compute aⁿ, where n ∈ N.
Naive algorithm: Θ(n).
Divide-and-conquer algorithm (recursive squaring):
aⁿ = a^{n/2} · a^{n/2}               if n is even;
     a^{(n−1)/2} · a^{(n−1)/2} · a    if n is odd.
T(n) = T(n/2) + Θ(1)  ⇒  T(n) = Θ(lg n).
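A recursive-squaring sketch in Python (function name mine; Python's built-in pow does the same job):

def power(a, n):
    """Compute a**n with Theta(lg n) multiplications by recursive squaring."""
    if n == 0:
        return 1
    half = power(a, n // 2)     # one recursive call, result reused twice
    if n % 2 == 0:
        return half * half
    return half * half * a

assert power(3, 10) == 3 ** 10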
Fibonacci numbers
Recursive definition:
F₀ = 0;  F₁ = 1;  Fₙ = Fₙ₋₁ + Fₙ₋₂ for n ≥ 2.
0 1 1 2 3 5 8 13 21 34 ...
Naive recursive algorithm: Ω(φⁿ) (exponential time), where φ = (1 + √5)/2 is the golden ratio.
Computing Fibonacci numbers
Bottom-up: Compute F₀, F₁, F₂, ..., Fₙ in order, forming each number by summing the two previous. Running time: Θ(n).
Naive recursive squaring: Fₙ = φⁿ/√5 rounded to the nearest integer. Recursive squaring: Θ(lg n) time. This method is unreliable, since floating-point arithmetic is prone to round-off errors.
Recursive squaring
Theorem:
[ F_{n+1}  F_n     ]   [ 1 1 ]ⁿ
[ F_n      F_{n−1} ] = [ 1 0 ]
Algorithm: Recursive squaring of the matrix. Time = Θ(lg n).
Proof of theorem (induction on n).
Base (n = 1):
[ F₂ F₁ ]   [ 1 1 ]¹
[ F₁ F₀ ] = [ 1 0 ]
Inductive step (n ≥ 2):
[ F_{n+1}  F_n     ]   [ F_n      F_{n−1} ] [ 1 1 ]   [ 1 1 ]^{n−1} [ 1 1 ]   [ 1 1 ]ⁿ
[ F_n      F_{n−1} ] = [ F_{n−1}  F_{n−2} ]·[ 1 0 ] = [ 1 0 ]      ·[ 1 0 ] = [ 1 0 ]
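A Python sketch of the Θ(lg n) Fibonacci computation via 2×2 matrix powering (exact integer arithmetic, so no round-off; helper names are mine):

def mat_mul(X, Y):
    """Multiply 2x2 matrices given as ((a,b),(c,d))."""
    return ((X[0][0]*Y[0][0] + X[0][1]*Y[1][0], X[0][0]*Y[0][1] + X[0][1]*Y[1][1]),
            (X[1][0]*Y[0][0] + X[1][1]*Y[1][0], X[1][0]*Y[0][1] + X[1][1]*Y[1][1]))

def mat_pow(X, n):
    """Raise 2x2 matrix X to the n-th power (n >= 1) by recursive squaring."""
    if n == 1:
        return X
    half = mat_pow(X, n // 2)
    sq = mat_mul(half, half)
    return mat_mul(sq, X) if n % 2 else sq

def fib(n):
    if n == 0:
        return 0
    return mat_pow(((1, 1), (1, 0)), n)[0][1]   # entry F_n of [[1,1],[1,0]]^n

assert [fib(i) for i in range(10)] == [0, 1, 1, 2, 3, 5, 8, 13, 21, 34]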
Matrix multiplication
Input: A = [a_ij], B = [b_ij].    i, j = 1, 2, ..., n.
Output: C = [c_ij] = A · B.
c_ij = Σ_{k=1}^{n} a_ik · b_kj
Standard algorithm
for i ← 1 to n
    do for j ← 1 to n
        do c_ij ← 0
           for k ← 1 to n
               do c_ij ← c_ij + a_ik · b_kj
Running time = Θ(n³)
Divide-and-conquer algorithm
IDEA: n×n matrix = 2×2 matrix of (n/2)×(n/2) submatrices:
[ r s ]   [ a b ] [ e f ]
[ t u ] = [ c d ]·[ g h ]        C = A · B
r = ae + bg
s = af + bh
t = ce + dg
u = cf + dh
8 recursive mults of (n/2)×(n/2) submatrices; 4 adds of (n/2)×(n/2) submatrices.

Analysis of D&C algorithm
T(n) = 8 T(n/2) + Θ(n²)
(# submatrices = 8; submatrix size = n/2; Θ(n²) work adding submatrices)
n^{log_b a} = n^{log₂ 8} = n³ ⇒ CASE 1 ⇒ T(n) = Θ(n³).
No better than the ordinary algorithm.
Strassen's idea
Multiply 2×2 matrices with only 7 recursive mults.
P₁ = a · (f − h)
P₂ = (a + b) · h
P₃ = (c + d) · e
P₄ = d · (g − e)
P₅ = (a + d) · (e + h)
P₆ = (b − d) · (g + h)
P₇ = (a − c) · (e + f)

r = P₅ + P₄ − P₂ + P₆
s = P₁ + P₂
t = P₃ + P₄
u = P₅ + P₁ − P₃ − P₇

7 mults, 18 adds/subs.
Note: No reliance on commutativity of multiplication!

Check: r = P₅ + P₄ − P₂ + P₆
         = (a + d)(e + h) + d(g − e) − (a + b)h + (b − d)(g + h)
         = ae + ah + de + dh + dg − de − ah − bh + bg + bh − dg − dh
         = ae + bg.  ✓
Strassen's algorithm
1. Divide: Partition A and B into (n/2)×(n/2) submatrices. Form terms to be multiplied using + and −.
2. Conquer: Perform 7 multiplications of (n/2)×(n/2) submatrices recursively.
3. Combine: Form C using + and − on (n/2)×(n/2) submatrices.
T(n) = 7 T(n/2) + Θ(n²)
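A minimal NumPy sketch of Strassen's recursion, assuming n is a power of 2 (padding, cutoffs to the standard algorithm, and other tuning are omitted; this is an illustration, not a tuned implementation):

import numpy as np

def strassen(A, B):
    """Multiply square matrices A, B (n a power of 2) with 7 recursive mults."""
    n = A.shape[0]
    if n == 1:
        return A * B
    m = n // 2
    a, b, c, d = A[:m, :m], A[:m, m:], A[m:, :m], A[m:, m:]
    e, f, g, h = B[:m, :m], B[:m, m:], B[m:, :m], B[m:, m:]
    P1 = strassen(a, f - h)
    P2 = strassen(a + b, h)
    P3 = strassen(c + d, e)
    P4 = strassen(d, g - e)
    P5 = strassen(a + d, e + h)
    P6 = strassen(b - d, g + h)
    P7 = strassen(a - c, e + f)
    r = P5 + P4 - P2 + P6
    s = P1 + P2
    t = P3 + P4
    u = P5 + P1 - P3 - P7
    return np.block([[r, s], [t, u]])

A = np.random.randint(0, 10, (4, 4)); B = np.random.randint(0, 10, (4, 4))
assert np.array_equal(strassen(A, B), A @ B)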
Analysis of Strassen
T(n) = 7 T(n/2) + Θ(n²)
n^{log_b a} = n^{log₂ 7} ≈ n^{2.81} ⇒ CASE 1 ⇒ T(n) = Θ(n^{lg 7}).
The number 2.81 may not seem much smaller than 3, but because the difference is in the exponent, the impact on running time is significant. In fact, Strassen's algorithm beats the ordinary algorithm on today's machines for n ≥ 32 or so.
Best to date (of theoretical interest only): Θ(n^{2.376...}).
VLSI layout
Problem: Embed a complete binary tree with n leaves in a grid using minimal area.

Naive embedding: width W(n), height H(n).
H(n) = H(n/2) + Θ(1) = Θ(lg n)
W(n) = 2 W(n/2) + Θ(1) = Θ(n)
Area = Θ(n lg n)
H-tree embedding
Embed the tree recursively as an "H", with side length L(n):
L(n) = 2 L(n/4) + Θ(1) = Θ(√n)
Area = Θ(n)
Conclusion
Divide and conquer is just one of several powerful techniques for algorithm design.
Divide-and-conquer algorithms can be analyzed using recurrences and the master method (so practice this math).
The divide-and-conquer strategy often leads to efficient algorithms.
Introduction to Algorithms
6.046J/18.401J

LECTURE 4: Quicksort
Divide and conquer
Partitioning
Worst-case analysis
Intuition
Randomized quicksort
Analysis

Prof. Charles E. Leiserson
September 21, 2005
Quicksort
Proposed by C.A.R. Hoare in 1962.
Divide-and-conquer algorithm.
Sorts "in place" (like insertion sort, but not like merge sort).
Very practical (with tuning).

Divide and conquer
Quicksort an n-element array:
1. Divide: Partition the array into two subarrays around a pivot x such that elements in the lower subarray ≤ x ≤ elements in the upper subarray.
2. Conquer: Recursively sort the two subarrays.
3. Combine: Trivial.
Key: Linear-time partitioning subroutine.
Partitioning subroutine
PARTITION(A, p, q)    ⊳ A[p . . q]
    x ← A[p]          ⊳ pivot = A[p]
    i ← p
    for j ← p + 1 to q
        do if A[j] ≤ x
             then i ← i + 1
                  exchange A[i] ↔ A[j]
    exchange A[p] ↔ A[i]
    return i

Running time = O(n) for n elements.
Invariant: A[p] = x; A[p+1 . . i] ≤ x; A[i+1 . . j−1] ≥ x; A[j . . q] unknown.
Example of partitioning
Partition the array 6 10 13 5 8 3 2 11 around the pivot x = 6:
[Figure: step-by-step animation. Each A[j] ≤ 6 is swapped into the growing lower region; after the loop the array is 6 5 3 2 8 13 10 11, and the final exchange of the pivot with A[i] yields 2 5 3 6 8 13 10 11, with 6 in its final position.]

Pseudocode for quicksort
QUICKSORT(A, p, r)
    if p < r
       then q ← PARTITION(A, p, r)
            QUICKSORT(A, p, q−1)
            QUICKSORT(A, q+1, r)
Initial call: QUICKSORT(A, 1, n)
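A Python transcription of PARTITION and QUICKSORT (0-indexed; names mine):

def partition(a, p, q):
    """Partition a[p..q] around pivot a[p]; return the pivot's final index."""
    x = a[p]
    i = p
    for j in range(p + 1, q + 1):
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]        # put the pivot between the two regions
    return i

def quicksort(a, p=0, r=None):
    if r is None:
        r = len(a) - 1
    if p < r:
        q = partition(a, p, r)
        quicksort(a, p, q - 1)
        quicksort(a, q + 1, r)
    return a

assert quicksort([6, 10, 13, 5, 8, 3, 2, 11]) == [2, 3, 5, 6, 8, 10, 11, 13]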
Analysis of quicksort
Assume all input elements are distinct.
In practice, there are better partitioning algorithms for when duplicate input elements may exist.
Let T(n) = worst-case running time on an array of n elements.

Worst case of quicksort
Input sorted or reverse sorted.
Partition around min or max element.
One side of partition always has no elements.
T(n) = T(0) + T(n−1) + Θ(n)
     = Θ(1) + T(n−1) + Θ(n)
     = T(n−1) + Θ(n)
     = Θ(n²)   (arithmetic series)
[Figure: worst-case recursion tree for T(n) = T(0) + T(n−1) + cn: a path of costs cn, c(n−1), c(n−2), ..., each node with a Θ(1) leaf hanging off. Height = n; total = Θ(n)·Θ(1) + c·Σ_{k=1}^{n} k = Θ(n²).]
Best-case analysis
(For intuition only!)
If we're lucky, PARTITION splits the array evenly:
T(n) = 2T(n/2) + Θ(n) = Θ(n lg n)   (same as merge sort)
What if the split is always 1/10 : 9/10?
T(n) = T(n/10) + T(9n/10) + Θ(n)

[Figure: recursion tree for T(n) = T(n/10) + T(9n/10) + cn. Every full level sums to cn; the shallow side bottoms out after log₁₀ n levels, the deep side after log_{10/9} n levels; there are O(n) leaves.]

cn log₁₀ n ≤ T(n) ≤ cn log_{10/9} n + O(n)
⇒ T(n) = Θ(n lg n)   Lucky!
More intuition
Suppose we alternate lucky, unlucky, lucky, unlucky, lucky, ....
L(n) = 2U(n/2) + Θ(n)    lucky
U(n) = L(n − 1) + Θ(n)   unlucky
Solving:
L(n) = 2(L(n/2 − 1) + Θ(n/2)) + Θ(n)
     = 2L(n/2 − 1) + Θ(n)
     = Θ(n lg n)   Lucky!
How can we make sure we are usually lucky?
Randomized quicksort
IDEA: Partition around a random element.
Running time is independent of the input order.
No assumptions need to be made about the input distribution.
No specific input elicits the worst-case behavior.
The worst case is determined only by the output of a random-number generator.
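The only change from the deterministic sketch earlier is swapping a randomly chosen element into the pivot position before partitioning (a self-contained sketch; names mine):

import random

def rand_partition(a, p, q):
    """Exchange a[p] with a random element of a[p..q], then partition around it."""
    k = random.randint(p, q)       # random pivot choice
    a[p], a[k] = a[k], a[p]
    x, i = a[p], p
    for j in range(p + 1, q + 1):  # same loop as deterministic PARTITION
        if a[j] <= x:
            i += 1
            a[i], a[j] = a[j], a[i]
    a[p], a[i] = a[i], a[p]
    return i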
Randomized quicksort analysis
Let T(n) = the random variable for the running time of randomized quicksort on an input of size n, assuming random numbers are independent.
For k = 0, 1, ..., n−1, define the indicator random variable
X_k = 1 if PARTITION generates a k : n−k−1 split, 0 otherwise.
E[X_k] = Pr{X_k = 1} = 1/n, since all splits are equally likely, assuming elements are distinct.

T(n) = Σ_{k=0}^{n−1} X_k (T(k) + T(n−k−1) + Θ(n))
Calculating expectation
E[T(n)] = E[ Σ_{k=0}^{n−1} X_k (T(k) + T(n−k−1) + Θ(n)) ]
        = Σ_{k=0}^{n−1} E[ X_k (T(k) + T(n−k−1) + Θ(n)) ]      (linearity of expectation)
        = Σ_{k=0}^{n−1} E[X_k] · E[T(k) + T(n−k−1) + Θ(n)]     (independence of X_k from other random choices)
        = (1/n) Σ_{k=0}^{n−1} E[T(k)] + (1/n) Σ_{k=0}^{n−1} E[T(n−k−1)] + (1/n) Σ_{k=0}^{n−1} Θ(n)
        = (2/n) Σ_{k=1}^{n−1} E[T(k)] + Θ(n)                    (the two summations have identical terms)
Hairy recurrence
E[T(n)] = (2/n) Σ_{k=2}^{n−1} E[T(k)] + Θ(n)
(the k = 0, 1 terms can be absorbed into the Θ(n).)
Prove: E[T(n)] ≤ a·n lg n for some constant a > 0.
Use fact: Σ_{k=2}^{n−1} k lg k ≤ (1/2) n² lg n − (1/8) n²   (exercise).
Substitution method
E[T(n)] ≤ (2/n) Σ_{k=2}^{n−1} a·k lg k + Θ(n)         (substitute inductive hypothesis)
        ≤ (2a/n) ((1/2) n² lg n − (1/8) n²) + Θ(n)     (use fact)
        = a·n lg n − (an/4 − Θ(n))                     (desired − residual)
        ≤ a·n lg n,
if a is chosen large enough so that an/4 dominates the Θ(n).
Quicksort in practice
Quicksort is a great general-purpose sorting algorithm.
Quicksort is typically over twice as fast as merge sort.
Quicksort can benefit substantially from code tuning.
Quicksort behaves well even with caching and virtual memory.
Introduction to Algorithms
6.046J/18.401J

LECTURE 5: Sorting Lower Bounds
Decision trees
Linear-Time Sorting
Counting sort
Radix sort
Appendix: Punched cards

Prof. Erik Demaine
September 26, 2005

How fast can we sort?
All the sorting algorithms we have seen so far are comparison sorts: they only use comparisons to determine the relative order of elements (e.g., insertion sort, merge sort, quicksort, heapsort). The best worst-case running time we've seen so far is O(n lg n). Is O(n lg n) the best we can do? Decision trees can help us answer this question.
Decision-tree example
Sort ⟨a₁, a₂, ..., aₙ⟩.
[Figure: decision tree for n = 3. The root compares 1:2; each internal node k:l compares a_k with a_l, branching on the outcome; the leaves are the six permutations 123, 132, 312, 213, 231, 321.]
Example: sort ⟨a₁, a₂, a₃⟩ = ⟨9, 4, 6⟩: 9 > 4, then 9 > 6, then 4 ≤ 6, reaching the leaf 231, i.e., a₂ ≤ a₃ ≤ a₁, the sorted order 4 6 9.
Each leaf contains a permutation ⟨π(1), π(2), ..., π(n)⟩ to indicate that the ordering a_{π(1)} ≤ a_{π(2)} ≤ ⋯ ≤ a_{π(n)} has been established.
Decision-tree model
A decision tree can model the execution of any comparison sort:
One tree for each input size n.
View the algorithm as splitting whenever it compares two elements.
The tree contains the comparisons along all possible instruction traces.
The running time of the algorithm = the length of the path taken.
Worst-case running time = height of tree.

Lower bound for decision-tree sorting
Theorem. Any decision tree that can sort n elements must have height Ω(n lg n).
Proof. The tree must contain ≥ n! leaves, since there are n! possible permutations. A height-h binary tree has ≤ 2^h leaves. Thus n! ≤ 2^h, so h ≥ lg(n!) ≥ lg((n/e)ⁿ) (Stirling) = n lg n − n lg e = Ω(n lg n).
Corollary. Merge sort and heapsort are asymptotically optimal comparison sorts.
Counting sort
Input: A[1 . . n], where A[j] ∈ {1, 2, ..., k}. Output: B[1 . . n], sorted. Auxiliary storage: C[1 . . k].

for i ← 1 to k
    do C[i] ← 0
for j ← 1 to n
    do C[A[j]] ← C[A[j]] + 1     ⊳ C[i] = |{key = i}|
for i ← 2 to k
    do C[i] ← C[i] + C[i−1]      ⊳ C[i] = |{key ≤ i}|
for j ← n downto 1
    do B[C[A[j]]] ← A[j]
       C[A[j]] ← C[A[j]] − 1
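A direct Python sketch (1-based keys as in the pseudocode; list names mirror the slides):

def counting_sort(A, k):
    """Stably sort list A whose keys are in {1, ..., k}; Theta(n + k) time."""
    n = len(A)
    C = [0] * (k + 1)               # C[i] counts keys equal to i
    for j in range(n):
        C[A[j]] += 1
    for i in range(2, k + 1):       # prefix sums: C[i] = |{key <= i}|
        C[i] += C[i - 1]
    B = [0] * n
    for j in range(n - 1, -1, -1):  # scan right-to-left for stability
        C[A[j]] -= 1
        B[C[A[j]]] = A[j]           # C[A[j]] is now a 0-based position
    return B

assert counting_sort([4, 1, 3, 4, 3], 4) == [1, 3, 3, 4, 4]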
Counting-sort example
A = ⟨4, 1, 3, 4, 3⟩, k = 4.
Loop 1 initializes C = ⟨0, 0, 0, 0⟩.
Loop 2 counts the keys: C = ⟨1, 0, 2, 2⟩.
Loop 3 forms prefix sums: C′ = ⟨1, 1, 3, 5⟩.
Loop 4 scans A from right to left, placing each A[j] at position C[A[j]] of B and decrementing C[A[j]]: B = ⟨1, 3, 3, 4, 4⟩.
Analysis
Θ(k): for i ← 1 to k do C[i] ← 0
Θ(n): for j ← 1 to n do C[A[j]] ← C[A[j]] + 1
Θ(k): for i ← 2 to k do C[i] ← C[i] + C[i−1]
Θ(n): for j ← n downto 1 do B[C[A[j]]] ← A[j]; C[A[j]] ← C[A[j]] − 1
Total: Θ(n + k)
Running time
If k = O(n), then counting sort takes Θ(n) time.
But, sorting takes Ω(n lg n) time! Where's the fallacy?
Answer: Comparison sorting takes Ω(n lg n) time. Counting sort is not a comparison sort. In fact, not a single comparison between elements occurs!
Stable sorting
Counting sort is a stable sort: it preserves the input order among equal elements.
A = ⟨4, 1, 3, 4, 3⟩ → B = ⟨1, 3, 3, 4, 4⟩ (the two 3s, and the two 4s, appear in B in the same order as in A).
Radix sort
Origin: Herman Hollerith's card-sorting machine for the 1890 U.S. Census. (See Appendix.)
Digit-by-digit sort.
Hollerith's original (bad) idea: sort on most-significant digit first.
Good idea: Sort on least-significant digit first with auxiliary stable sort.
Operation of radix sort
Sort by ones digit, then tens, then hundreds, with a stable sort at each pass:

329    720    720    329
457    355    329    355
657    436    436    436
839    457    839    457
436    657    355    657
720    329    457    720
355    839    657    839

Correctness of radix sort (induction on digit position t): assume the numbers are sorted by their low-order t − 1 digits. After sorting on digit t, two numbers that differ in digit t are correctly ordered, and two numbers equal in digit t keep their order, and hence stay correctly ordered, by stability.
Analysis of radix sort
Recall: Counting sort takes Θ(n + k) time to sort n numbers in the range from 0 to k − 1.
If each b-bit word is broken into r-bit pieces, each pass of counting sort takes Θ(n + 2^r) time. Since there are b/r passes, we have
T(n, b) = Θ((b/r)(n + 2^r)).

Choosing r
Minimize T(n, b): we don't want 2^r ≫ n, and there's no harm asymptotically in choosing r as large as possible subject to this constraint. Choosing r = lg n implies T(n, b) = Θ(bn / lg n).
For numbers in the range from 0 to n^d − 1, we have b = d lg n ⇒ radix sort runs in Θ(dn) time.
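A Python sketch of LSD radix sort over r-bit digits, using a stable counting pass per digit (names mine):

def radix_sort(A, b, r):
    """Sort non-negative b-bit integers by r-bit digits, least significant first."""
    for shift in range(0, b, r):                  # b/r stable passes
        buckets = [[] for _ in range(1 << r)]     # counting sort by this digit
        for x in A:
            buckets[(x >> shift) & ((1 << r) - 1)].append(x)
        A = [x for bucket in buckets for x in bucket]   # stable concatenation
    return A

assert radix_sort([329, 457, 657, 839, 436, 720, 355], b=10, r=4) == \
       sorted([329, 457, 657, 839, 436, 720, 355])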
Conclusions
In practice, radix sort is fast for large inputs, as well as simple to code and maintain.
Example (32-bit numbers):
At most 3 passes when sorting ≥ 2000 numbers.
Merge sort and quicksort do at least ⌈lg 2000⌉ = 11 passes.
Downside: Unlike quicksort, radix sort displays little locality of reference, and thus a well-tuned quicksort fares better on modern processors, which feature steep memory hierarchies.
Appendix: Punched-card technology
Herman Hollerith (1860-1929)
Punched cards
Hollerith's tabulating system
Operation of the sorter
Origin of radix sort
Modern IBM card
Web resources on punched-card technology

Herman Hollerith (1860-1929)
The 1880 U.S. Census took almost 10 years to process.
While a lecturer at MIT, Hollerith prototyped punched-card technology.
His machines, including a "card sorter," allowed the 1890 census total to be reported in 6 weeks.
He founded the Tabulating Machine Company in 1896, which merged with other companies in 1911, eventually becoming International Business Machines (IBM) in 1924.

Punched cards
Punched card = data record. Hole = value. Algorithm = machine + human operator.
[Image removed due to copyright restrictions: replica of a punch card from the 1900 U.S. census. [Howells 2000]]

Hollerith's tabulating system
Pantograph card punch
Hand-press reader
Dial counters
Sorting box
[Figure: "Hollerith's tabulating system, punch card" in a genealogy article on the Internet.]

[Figure: a modern IBM card, produced by the WWW Virtual Punch-Card Server.]
Introduction to Algorithms
6.046J/18.401J

LECTURE 6: Order Statistics
Randomized divide and conquer
Analysis of expected time
Worst-case linear-time order statistics
Analysis

Prof. Erik Demaine
September 28, 2005
Order statistics
Select the ith smallest of n elements (the element with rank i).
i = 1: minimum; i = n: maximum; i = ⌊(n+1)/2⌋ or ⌈(n+1)/2⌉: median.
Naive algorithm: Sort and index ith element.
Worst-case running time = Θ(n lg n) + Θ(1) = Θ(n lg n), using merge sort or heapsort (not quicksort).

Randomized divide-and-conquer algorithm
RAND-SELECT(A, p, q, i)    ⊳ ith smallest of A[p . . q]
    if p = q then return A[p]
    r ← RAND-PARTITION(A, p, q)
    k ← r − p + 1              ⊳ k = rank(A[r])
    if i = k then return A[r]
    if i < k
       then return RAND-SELECT(A, p, r − 1, i)
       else return RAND-SELECT(A, r + 1, q, i − k)
Example
Select the i = 7th smallest of 6 10 13 5 8 3 2 11.
Partition around the pivot x = 6:
2 5 3 | 6 | 8 13 10 11,   k = 4.
Since i = 7 > k = 4, recursively select the (7 − 4) = 3rd smallest element of the upper part 8 13 10 11.
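A self-contained Python sketch of RAND-SELECT with a random pivot (names mine; 1-based rank i):

import random

def rand_select(a, p, q, i):
    """Return the i-th smallest (1-based) element of a[p..q], in expected O(n)."""
    if p == q:
        return a[p]
    k0 = random.randint(p, q)          # random pivot, as in randomized quicksort
    a[p], a[k0] = a[k0], a[p]
    x, r = a[p], p
    for j in range(p + 1, q + 1):      # PARTITION around x
        if a[j] <= x:
            r += 1
            a[r], a[j] = a[j], a[r]
    a[p], a[r] = a[r], a[p]
    k = r - p + 1                      # rank of the pivot within a[p..q]
    if i == k:
        return a[r]
    if i < k:
        return rand_select(a, p, r - 1, i)
    return rand_select(a, r + 1, q, i - k)

arr = [6, 10, 13, 5, 8, 3, 2, 11]
assert rand_select(arr, 0, len(arr) - 1, 7) == 11   # 7th smallest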
Analysis (continued)
To obtain an upper bound, assume that the ith element always falls in the larger side of the partition:
T(n) = T(max{0, n−1}) + Θ(n)   if 0 : n−1 split,
       T(max{1, n−2}) + Θ(n)   if 1 : n−2 split,
       ⋮
       T(max{n−1, 0}) + Θ(n)   if n−1 : 0 split,
     = Σ_{k=0}^{n−1} X_k (T(max{k, n−k−1}) + Θ(n)).
Calculating expectation
E[T(n)] = E[ Σ_{k=0}^{n−1} X_k (T(max{k, n−k−1}) + Θ(n)) ]
        = Σ_{k=0}^{n−1} E[ X_k (T(max{k, n−k−1}) + Θ(n)) ]     (linearity of expectation)
        = Σ_{k=0}^{n−1} E[X_k] · E[T(max{k, n−k−1}) + Θ(n)]    (independence)
        = (1/n) Σ_{k=0}^{n−1} E[T(max{k, n−k−1})] + Θ(n)
        ≤ (2/n) Σ_{k=⌊n/2⌋}^{n−1} E[T(k)] + Θ(n)               (upper terms appear twice)
Hairy recurrence
(But not quite as hairy as the quicksort one.)
Prove: E[T(n)] ≤ cn for some constant c > 0.
Use fact: Σ_{k=⌊n/2⌋}^{n−1} k ≤ (3/8) n²   (exercise).
Substitution method
E[T(n)] ≤ (2/n) Σ_{k=⌊n/2⌋}^{n−1} ck + Θ(n)    (substitute inductive hypothesis)
        ≤ (2c/n)(3/8) n² + Θ(n)                 (use fact)
        = cn − (cn/4 − Θ(n))                    (desired − residual)
        ≤ cn,
if c is chosen large enough so that cn/4 dominates the Θ(n).
Summary of randomized order-statistic selection
Works fast: linear expected time.
Excellent algorithm in practice.
But, the worst case is very bad: Θ(n²).
Q. Is there an algorithm that runs in linear time in the worst case?
A. Yes, due to Blum, Floyd, Pratt, Rivest, and Tarjan [1973].
IDEA: Generate a good pivot recursively.
Worst-case linear-time order statistics
SELECT(i, n)
1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
3. Partition around the pivot x. Let k = rank(x).
4. if i = k then return x
   elseif i < k then recursively SELECT the ith smallest element in the lower part
   else recursively SELECT the (i−k)th smallest element in the upper part
(Step 4 is the same as in RAND-SELECT.)

Analysis
[Figure: the elements arranged in columns of 5, each column sorted, columns ordered by their medians. At least half the ⌊n/5⌋ group medians are ≤ x, and each such median has 2 more elements ≤ x in its group, so at least 3⌊n/10⌋ elements are ≤ x ("lesser" side). Symmetrically, at least 3⌊n/10⌋ elements are ≥ x ("greater" side).]
Minor simplification
For n ≥ 50, we have 3⌊n/10⌋ ≥ n/4.
Therefore, for n ≥ 50 the recursive call to SELECT in Step 4 is executed recursively on ≤ 3n/4 elements.
Thus, the recurrence for running time can assume that Step 4 takes time T(3n/4) in the worst case.
For n < 50, we know that the worst-case time is T(n) = Θ(1).
Developing the recurrence
SELECT(i, n)
Θ(n):    1. Divide the n elements into groups of 5. Find the median of each 5-element group by rote.
T(n/5):  2. Recursively SELECT the median x of the ⌊n/5⌋ group medians to be the pivot.
Θ(n):    3. Partition around the pivot x. Let k = rank(x).
T(3n/4): 4. if i = k then return x; elseif i < k then recursively SELECT the ith smallest element in the lower part; else recursively SELECT the (i−k)th smallest element in the upper part.

Solving the recurrence
T(n) = T(n/5) + T(3n/4) + Θ(n)
Substitution: T(n) ≤ cn.
T(n) ≤ cn/5 + 3cn/4 + Θ(n)
     = (19/20) cn + Θ(n)
     = cn − (cn/20 − Θ(n))
     ≤ cn,
if c is chosen large enough to handle both the Θ(n) and the initial conditions.
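A Python sketch of the median-of-medians SELECT (groups of 5; the base-case threshold and list-based partitioning favor clarity over the constant factor, and elements are assumed distinct as in the lecture):

def select(a, i):
    """Return the i-th smallest (1-based) element of list a in worst-case O(n)."""
    if len(a) <= 50:                      # base case: constant size, just sort
        return sorted(a)[i - 1]
    groups = [a[j:j + 5] for j in range(0, len(a), 5)]
    medians = [sorted(g)[len(g) // 2] for g in groups]   # median of each group
    x = select(medians, (len(medians) + 1) // 2)         # pivot = median of medians
    lower = [y for y in a if y < x]
    upper = [y for y in a if y > x]
    k = len(lower) + 1                    # rank of x (elements assumed distinct)
    if i == k:
        return x
    if i < k:
        return select(lower, i)
    return select(upper, i - k)

assert select(list(range(100, 0, -1)), 7) == 7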
Conclusions
Since the work at each level of recursion is a constant fraction (19/20) smaller, the work per level is a geometric series dominated by the linear work at the root.
In practice, this algorithm runs slowly, because the constant in front of n is large.
The randomized algorithm is far more practical.
Exercise: Why not divide into groups of 3?
Introduction to Algorithms
6.046J/18.401J

LECTURE 7: Hashing I
Direct-access tables
Resolving collisions by chaining
Choosing hash functions
Open addressing

Prof. Charles E. Leiserson
October 3, 2005
Symbol-table problem
Symbol table S holding n records; each record x has a key key[x] and other fields containing satellite data.
Operations on S:
INSERT(S, x)
DELETE(S, x)
SEARCH(S, k)
How should the data structure S be organized?
Direct-access table
IDEA: Suppose that the keys are drawn from the set U ⊆ {0, 1, ..., m−1}, and keys are distinct. Set up an array T[0 . . m−1]:
T[k] = x    if x ∈ K and key[x] = k,
       NIL  otherwise.
Then, operations take Θ(1) time.
Problem: The range of keys can be large: 64-bit numbers (which represent 18,446,744,073,709,551,616 different keys), character strings (even larger!).
Hash functions
Solution: Use a hash function h to map the universe U of all keys into {0, 1, ..., m−1}.
[Figure: keys k₁, ..., k₅ in S ⊆ U map to slots of table T; here h(k₂) = h(k₅).]
When a record to be inserted maps to an already occupied slot in T, a collision occurs.
Resolving collisions by chaining
Link records in the same slot into a list.
[Figure: keys 49, 86, 52 all hash to slot i and are chained together: h(49) = h(86) = h(52) = i.]
Worst case: Every key hashes to the same slot. Access time = Θ(n) if |S| = n.

Average-case analysis of chaining
We make the assumption of simple uniform hashing: each key k ∈ S is equally likely to be hashed to any slot of T, independent of other keys. Define the load factor of T to be α = n/m = average number of keys per slot.
Search cost
The expected time for an unsuccessful search for a record with a given key is Θ(1 + α):
Θ(1) to apply the hash function and access the slot, plus Θ(α) to search the list.
Expected search time = Θ(1) if α = O(1), or equivalently, if n = O(m).
A successful search has the same asymptotic bound, but a rigorous argument is a little more complicated. (See textbook.)
Division method
Assume all keys are integers, and define h(k) = k mod m.
Deficiency: Don't pick an m that has a small divisor d. A preponderance of keys that are congruent modulo d can adversely affect uniformity.
Extreme deficiency: If m = 2^r, then the hash doesn't even depend on all the bits of k: if k = 1011000111011010₂ and r = 6, then h(k) = 011010₂.
Multiplication method
Assume that all keys are integers, m = 2^r, and our computer has w-bit words. Define
h(k) = (A·k mod 2^w) rsh (w − r),
where rsh is the bitwise right-shift operator and A is an odd integer in the range 2^{w−1} < A < 2^w.
Don't pick A too close to 2^{w−1} or 2^w.
Multiplication modulo 2^w is fast compared to division.
The rsh operator is fast.

Multiplication method example
h(k) = (A·k mod 2^w) rsh (w − r)
Suppose that m = 8 = 2³ and that our computer has w = 7-bit words:
  1011001 = A
× 1101011 = k
A·k = 10010100110011₂; A·k mod 2⁷ = 0110011₂; shifting right by w − r = 4 gives h(k) = 011₂.
[Figure: "modular wheel" interpretation of multiplication modulo 2^w: A, 2A, 3A, ... wrap around a wheel of 2^w positions, and the top r bits of the low word pick one of m sectors.]
Resolving collisions by open addressing
No storage is used outside of the hash table itself.
Insertion systematically probes the table until an empty slot is found.
The hash function depends on both the key and the probe number: h : U × {0, 1, ..., m−1} → {0, 1, ..., m−1}.
The probe sequence ⟨h(k,0), h(k,1), ..., h(k,m−1)⟩ should be a permutation of {0, 1, ..., m−1}.
The table may fill up, and deletion is difficult (but not impossible).

Example of open addressing
[Figure: table T containing 586, 133, 204, 481. Inserting key 496: successive probes hit occupied slots (collisions, e.g., with 204 and 586) until an empty slot is found, where 496 is inserted. Search uses the same probe sequence: it is successful if it finds the key, and unsuccessful if it finds an empty slot.]
Probing strategies
Linear probing: Given an ordinary hash function h′(k), linear probing uses the hash function h(k,i) = (h′(k) + i) mod m. This method, though simple, suffers from primary clustering, where long runs of occupied slots build up, increasing the average search time. Moreover, the long runs of occupied slots tend to get longer.

Double hashing: Given two ordinary hash functions h₁(k) and h₂(k), double hashing uses the hash function h(k,i) = (h₁(k) + i·h₂(k)) mod m. This method generally produces excellent results, but h₂(k) must be relatively prime to m. One way is to make m a power of 2 and design h₂(k) to produce only odd numbers.
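A small Python sketch of insertion with double hashing (a toy table, not production code; h1 and h2 here are arbitrary illustrative choices, with h2 forced odd so it is relatively prime to the power-of-2 table size):

class OpenAddressTable:
    """Toy open-addressing hash table with double hashing; m a power of 2."""
    def __init__(self, m=8):
        self.m = m
        self.slots = [None] * m
    def _h(self, k, i):
        h1 = k % self.m
        h2 = (k // self.m) | 1          # force h2 odd => relatively prime to m
        return (h1 + i * h2) % self.m
    def insert(self, k):
        for i in range(self.m):         # probe until an empty slot is found
            j = self._h(k, i)
            if self.slots[j] is None:
                self.slots[j] = k
                return j
        raise RuntimeError("table is full")

t = OpenAddressTable()
for key in (586, 133, 204, 481, 496):
    t.insert(key)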
Analysis of open addressing
We make the assumption of uniform hashing: each key is equally likely to have any one of the m! permutations as its probe sequence.
Theorem. Given an open-addressed hash table with load factor α = n/m < 1, the expected number of probes in an unsuccessful search is at most 1/(1−α).

Proof of the theorem
At least one probe is always necessary.
With probability n/m, the first probe hits an occupied slot, and a second probe is necessary.
With probability (n−1)/(m−1), the second probe hits an occupied slot, and a third probe is necessary. And so on.
Observe that (n−i)/(m−i) < n/m = α for i = 1, 2, ..., n.

Proof (continued)
Therefore, the expected number of probes is
1 + (n/m)(1 + ((n−1)/(m−1))(1 + ((n−2)/(m−2))(⋯(1 + 1/(m−n+1))⋯)))
≤ 1 + α(1 + α(1 + α(⋯(1 + α)⋯)))
≤ 1 + α + α² + α³ + ⋯
= Σ_{i=0}^{∞} α^i
= 1/(1−α).  ∎

Implications: If α is constant, then accessing an open-addressed hash table takes constant time. For example, if the table is half full (α = 0.5), the expected number of probes is 1/(1−0.5) = 2; if it is 90% full, the expected number is 1/(1−0.9) = 10.
Introduction to Algorithms
6.046J/18.401J

LECTURE 8: Hashing II
Universal hashing
Universality theorem
Constructing a set of universal hash functions
Perfect hashing

Prof. Charles E. Leiserson
October 5, 2005
A weakness of hashing
Problem: For any fixed hash function h, a set of keys exists that causes the average access time of a hash table to skyrocket: an adversary can pick all keys from {k ∈ U : h(k) = i} for some slot i.
IDEA: Choose the hash function at random, independently of the keys. Even if an adversary can see your code, they cannot find a bad set of keys, since they don't know exactly which hash function will be chosen.

Universal hashing
Definition. Let U be a universe of keys, and let H be a finite collection of hash functions, each mapping U to {0, 1, ..., m−1}. We say H is universal if for all x, y ∈ U, where x ≠ y, we have |{h ∈ H : h(x) = h(y)}| ≤ |H|/m.
That is, the chance of a collision between x and y is ≤ 1/m if we choose h randomly from H.
Universality is good
Theorem. Let h be a hash function chosen (uniformly) at random from a universal set H of hash functions. Suppose h is used to hash n arbitrary keys into the m slots of a table T. Then, for a given key x, E[#collisions with x] < n/m = α.

Proof of theorem
Proof. Let C_x be the random variable denoting the total number of collisions of keys in T with x, and let c_xy = 1 if h(x) = h(y), 0 otherwise.
Note: E[c_xy] ≤ 1/m and C_x = Σ_{y ∈ T−{x}} c_xy.
Proof (continued)
E[C_x] = E[ Σ_{y ∈ T−{x}} c_xy ]       (take expectation of both sides)
       = Σ_{y ∈ T−{x}} E[c_xy]         (linearity of expectation)
       ≤ Σ_{y ∈ T−{x}} 1/m             (E[c_xy] ≤ 1/m)
       = (n−1)/m.                       (algebra)  ∎
Constructing a set of universal hash functions
Let m be prime. Decompose key k into r + 1 digits, each with value in the set {0, 1, ..., m−1}. That is, let k = ⟨k₀, k₁, ..., k_r⟩, where 0 ≤ k_i < m.
Randomized strategy: Pick a = ⟨a₀, a₁, ..., a_r⟩ where each a_i is chosen randomly from {0, 1, ..., m−1}.
Define h_a(k) = Σ_{i=0}^{r} a_i k_i mod m.   (Dot product, modulo m.)
How big is H = {h_a}? |H| = m^{r+1}.   REMEMBER THIS!
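A Python sketch of the dot-product family (the prime m and digit count r are illustrative choices of mine; a key is split into base-m digits):

import random

M = 97                      # a prime table size (illustrative)
R = 3                       # keys have r + 1 = 4 base-m digits

def digits(k, m=M, r=R):
    """Decompose key k into r+1 base-m digits <k0, ..., kr>."""
    return [(k // m**i) % m for i in range(r + 1)]

def make_hash(m=M, r=R):
    """Draw h_a at random from the universal family H = {h_a}."""
    a = [random.randrange(m) for _ in range(r + 1)]
    return lambda k: sum(ai * ki for ai, ki in zip(a, digits(k, m, r))) % m

h = make_hash()
assert 0 <= h(123456) < M   # two fixed keys collide with prob. <= 1/m over the draw of h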
Universality of dot-product hash functions
Theorem. The set H = {h_a} is universal.
Proof. Suppose that x = ⟨x₀, x₁, ..., x_r⟩ and y = ⟨y₀, y₁, ..., y_r⟩ are distinct keys. Thus, they differ in at least one digit position, wlog position 0. For how many h_a ∈ H do x and y collide?
We must have h_a(x) = h_a(y), which implies that
Σ_{i=0}^{r} a_i x_i ≡ Σ_{i=0}^{r} a_i y_i   (mod m).
Equivalently, we have
Σ_{i=0}^{r} a_i (x_i − y_i) ≡ 0   (mod m),
or
a₀(x₀ − y₀) + Σ_{i=1}^{r} a_i (x_i − y_i) ≡ 0   (mod m),
which implies that
a₀(x₀ − y₀) ≡ − Σ_{i=1}^{r} a_i (x_i − y_i)   (mod m).

Fact from number theory
Theorem. Let m be prime. For any z ∈ Z_m such that z ≠ 0, there exists a unique z⁻¹ ∈ Z_m such that z · z⁻¹ ≡ 1 (mod m).
Example: m = 7.
z:    1 2 3 4 5 6
z⁻¹:  1 4 5 2 3 6
Back to the proof
We have a₀(x₀ − y₀) ≡ − Σ_{i=1}^{r} a_i (x_i − y_i) (mod m), and since x₀ ≠ y₀, an inverse (x₀ − y₀)⁻¹ must exist, which implies that
a₀ ≡ (− Σ_{i=1}^{r} a_i (x_i − y_i)) · (x₀ − y₀)⁻¹   (mod m).
Thus, for any choices of a₁, a₂, ..., a_r, exactly one choice of a₀ causes x and y to collide.
Q. How many h_a's cause x and y to collide?
A. There are m choices for each of a₁, a₂, ..., a_r, but once these are chosen, exactly one choice for a₀ causes x and y to collide, namely the one above. Thus, the number of h_a's that cause x and y to collide is m^r · 1 = m^r = m^{r+1}/m = |H|/m.  ∎
Perfect hashing
Given a set of n keys, construct a static hash table of size m = O(n) such that SEARCH takes Θ(1) time in the worst case.
IDEA: Two-level scheme with universal hashing at both levels. No collisions at level 2!
[Figure: a level-1 table T of m = n slots; slot i stores the parameters of a level-2 universal hash function into a table S_i of size n_i², where n_i keys hash to slot i. E.g., keys 14 and 27 both hash to slot 1 at level 1 and are stored collision-free in S₁.]
Collisions at level 2
Theorem. Let H be a class of universal hash functions for a table of size m = n². Then, if we use a random h ∈ H to hash n keys into the table, the expected number of collisions is at most 1/2.
Proof. By the definition of universality, the probability that 2 given keys in the table collide under h is 1/m = 1/n². Since there are (n choose 2) pairs of keys that can possibly collide, the expected number of collisions is
(n choose 2) · 1/n² = (n(n−1)/2) · 1/n² < 1/2.  ∎

No collisions at level 2
Corollary. The probability of no collisions is at least 1/2.
Proof. Markov's inequality says that for any nonnegative random variable X, we have Pr{X ≥ t} ≤ E[X]/t. Applying this inequality with t = 1, Pr{≥ 1 collision} ≤ 1/2.  ∎
Thus, just by testing random hash functions in H, we'll quickly find one that works.
Analysis of storage
For the level-1 hash table T, choose m = n, and let n_i be the random variable for the number of keys that hash to slot i in T. By using n_i² slots for the level-2 hash table S_i, the expected total storage required for the two-level scheme is
E[ Σ_{i=0}^{m−1} Θ(n_i²) ] = Θ(n),
since the analysis is identical to the analysis of the expected running time of bucket sort.
Introduction to Algorithms
6.046J/18.401J

LECTURE 9: Randomly built binary search trees
Expected node depth
Analyzing height
Convexity lemma
Jensen's inequality
Exponential height
Post mortem

Prof. Erik Demaine
October 17, 2005
Binary-search-tree sort
T ← ∅   (create an empty BST)
for i = 1 to n
    do TREE-INSERT(T, A[i])
Perform an inorder tree walk of T.

Example: A = [3 1 8 2 6 7 5]
[Figure: the resulting BST has root 3, with left child 1 (whose right child is 2) and right child 8 (whose left child is 6, with children 5 and 7).]
Tree walk: 1 2 3 5 6 7 8.

Analysis of BST sort
BST sort performs the same comparisons as quicksort, but in a different order! The expected time to build the tree is asymptotically the same as the running time of quicksort.
Node depth
The depth of a node = the number of comparisons made during TREE-INSERT. Assuming all input permutations are equally likely, we have
Average node depth = (1/n) E[ Σ_{i=1}^{n} (#comparisons to insert node i) ]
                   = (1/n) O(n lg n)        (quicksort analysis)
                   = O(lg n).
Expected tree height
But, average node depth of a randomly built BST = O(lg n) does not necessarily mean that its expected height is also O(lg n) (although it is).
Example: a tree on n nodes consisting of a balanced part of depth lg n with a path of length √n hanging off one leaf has
Ave. depth ≤ (1/n)(n · lg n + √n · √n/2) = O(lg n), yet height h = √n.
The rest of the lecture analyzes the height of a randomly built BST: prove Jensen's inequality f(E[X]) ≤ E[f(X)] for convex f, analyze the exponential height Yₙ = 2^{Xₙ}, and conclude E[Xₙ] = O(lg n).
Convex functions
A function f : R → R is convex if for all α, β ≥ 0 such that α + β = 1, we have
f(αx + βy) ≤ α f(x) + β f(y)
for all x, y ∈ R.
[Figure: the chord from (x, f(x)) to (y, f(y)) lies above the graph of f; in particular f(αx + βy) ≤ α f(x) + β f(y).]
Convexity lemma
Lemma. Let f : R → R be a convex function, and let α₁, α₂, ..., αₙ be nonnegative real numbers such that Σ_k α_k = 1. Then, for any real numbers x₁, x₂, ..., xₙ, we have
f( Σ_{k=1}^{n} α_k x_k ) ≤ Σ_{k=1}^{n} α_k f(x_k).
Proof. By induction on n. For n = 1, we have α₁ = 1, and hence f(α₁x₁) ≤ α₁f(x₁) trivially.
Proof (continued)
Inductive step:
f( Σ_{k=1}^{n} α_k x_k ) = f( αₙxₙ + (1 − αₙ) Σ_{k=1}^{n−1} (α_k/(1 − αₙ)) x_k )          (algebra)
  ≤ αₙ f(xₙ) + (1 − αₙ) f( Σ_{k=1}^{n−1} (α_k/(1 − αₙ)) x_k )                             (convexity)
  ≤ αₙ f(xₙ) + (1 − αₙ) Σ_{k=1}^{n−1} (α_k/(1 − αₙ)) f(x_k)                               (induction)
  = Σ_{k=1}^{n} α_k f(x_k).                                                                (algebra)  ∎
Convexity lemma: infinite case
Lemma. Let f : R → R be a convex function, and let α₁, α₂, ... be nonnegative real numbers such that Σ_k α_k = 1. Then, for any real numbers x₁, x₂, ..., we have
f( Σ_{k=1}^{∞} α_k x_k ) ≤ Σ_{k=1}^{∞} α_k f(x_k),
assuming that these summations exist.
Proof. Apply the finite case to the normalized partial sums, f( Σ_{k=1}^{n} (α_k / Σ_{i=1}^{n} α_i) x_k ) ≤ Σ_{k=1}^{n} (α_k / Σ_{i=1}^{n} α_i) f(x_k), and take the limit as n → ∞ (the normalizers Σ_{i=1}^{n} α_i → 1).
Jensen's inequality
Lemma. Let f be a convex function, and let X be a random variable. Then, f(E[X]) ≤ E[f(X)].
Proof.
f(E[X]) = f( Σ_{k=−∞}^{∞} k · Pr{X = k} )      (definition of expectation)
        ≤ Σ_{k=−∞}^{∞} f(k) · Pr{X = k}        (convexity lemma, infinite case)
        = E[f(X)].  ∎
Analysis of BST height
Let Xₙ be the random variable for the height of a randomly built binary search tree on n nodes, and let Yₙ = 2^{Xₙ} be its exponential height.
If the root of the tree has rank k, then Xₙ = 1 + max{X_{k−1}, X_{n−k}}, since each of the left and right subtrees of the root is randomly built. Hence, we have Yₙ = 2 · max{Y_{k−1}, Y_{n−k}}.
Define the indicator random variable Z_{nk} as Z_{nk} = 1 if the root has rank k, 0 otherwise.
Thus, Pr{Z_{nk} = 1} = E[Z_{nk}] = 1/n, and
Yₙ = Σ_{k=1}^{n} Z_{nk} (2 · max{Y_{k−1}, Y_{n−k}}).
Exponential height recurrence
E[Yₙ] = E[ Σ_{k=1}^{n} Z_{nk} (2 · max{Y_{k−1}, Y_{n−k}}) ]
      = Σ_{k=1}^{n} E[ Z_{nk} (2 · max{Y_{k−1}, Y_{n−k}}) ]     (linearity of expectation)
      = 2 Σ_{k=1}^{n} E[Z_{nk}] · E[max{Y_{k−1}, Y_{n−k}}]       (independence of the rank of the root from the ranks of subtree roots)
      ≤ (2/n) Σ_{k=1}^{n} E[Y_{k−1} + Y_{n−k}]                   (max{a, b} ≤ a + b)
      = (4/n) Σ_{k=0}^{n−1} E[Y_k].                              (each term appears twice; reindex)
Solving the recurrence
Use substitution to prove E[Yₙ] ≤ cn³ for some positive constant c, which we can pick sufficiently large to handle the initial conditions.
E[Yₙ] = (4/n) Σ_{k=0}^{n−1} E[Y_k]
      ≤ (4/n) Σ_{k=0}^{n−1} ck³           (substitution)
      ≤ (4c/n) ∫₀ⁿ x³ dx                   (integral method)
      = (4c/n)(n⁴/4)
      = cn³.                                (algebra)
The grand finale
Putting it all together:
2^{E[Xₙ]} ≤ E[2^{Xₙ}]      (Jensen's inequality, since f(x) = 2ˣ is convex)
          = E[Yₙ]
          ≤ cn³.
Taking the lg of both sides yields E[Xₙ] ≤ 3 lg n + O(1).
Post mortem
Q. Does the analysis have to be this hard?
Q. Why bother with analyzing exponential height?
Q. Why not just develop the recurrence on Xₙ = 1 + max{X_{k−1}, X_{n−k}} directly?
Thought exercises
See what happens when you try to do the analysis on Xₙ directly.
Try to understand better why the proof uses an exponential. Will a quadratic do?
See if you can find a simpler argument. (This argument is a little simpler than the one in the book; I hope it's correct!)
Introduction to Algorithms
6.046J/18.401J

LECTURE 10: Balanced Search Trees
Red-black trees
Height of a red-black tree
Rotations
Insertion

October 19, 2005

Balanced search trees
Balanced search tree: a search-tree data structure for which a height of O(lg n) is guaranteed when implementing a dynamic set of n items.
Examples:
AVL trees
2-3 trees
2-3-4 trees
B-trees
Red-black trees
Red-black trees
This data structure requires an extra one-bit color field in each node.
Red-black properties:
1. Every node is either red or black.
2. The root and leaves (NILs) are black.
3. If a node is red, then its parent is black.
4. All simple paths from any node x to a descendant leaf have the same number of black nodes = black-height(x).
Example of a red-black tree
[Figure: a red-black tree with black root 18 and red children 10 and 22; 10 has black children 8 and 11, 22 has black NIL and black child 26; all leaves are NILs. Height h = 4; black-height of the root bh = 2. Each of properties 1-4 can be checked on the figure in turn.]
Height of a red-black tree
Theorem. A red-black tree with n keys has height h ≤ 2 lg(n + 1).
INTUITION: Merge each red node into its black parent. This gives a 2-3-4 tree in which every internal node has 2 to 4 children, and all leaves have the same depth h′, the black-height of the root.
Proof (sketch).
The number of leaves in each tree is n + 1, and a 2-3-4 tree of height h′ has at least 2^{h′} leaves, so
n + 1 ≥ 2^{h′}.
We have h′ ≥ h/2, since at most half the nodes on any root-to-leaf path are red.
Hence, lg(n + 1) ≥ h′ ≥ h/2, and therefore h ≤ 2 lg(n + 1).  ∎
Query operations
Corollary. The queries SEARCH, MIN, MAX, SUCCESSOR, and PREDECESSOR all run in O(lg n) time on a red-black tree with n nodes.

Modifying operations
The operations INSERT and DELETE cause modifications to the red-black tree:
the operation itself,
color changes,
restructuring the links of the tree via rotations.
Rotations
[Figure: RIGHT-ROTATE(B) turns a node B with left child A and subtrees α, β, γ into a node A with right child B; LEFT-ROTATE(A) undoes it.]
Rotations maintain the inorder ordering of keys: a ∈ α, b ∈ β, c ∈ γ ⇒ a ≤ A ≤ b ≤ B ≤ c.
A rotation can be performed in O(1) time.
Insertion into a red-black tree
IDEA: Insert x in the tree. Color x red. Only red-black property 3 might be violated. Move the violation up the tree by recoloring until it can be fixed with rotations and recoloring.

Example: insert x = 15 into the tree with root 7, children 3 and 18; 18 has children 10 and 22; 10 has children 8 and 11; 22 has child 26.
Insert 15 as a red child of 11: 11 and 15 are both red, violating property 3.
Recolor, moving the violation up the tree.
RIGHT-ROTATE(18).
LEFT-ROTATE(7) and recolor. Done: 10 becomes the root, with subtrees rooted at 7 (children 3 and 8) and 18 (children 11, with right child 15, and 22, with child 26).
Pseudocode
RB-INSERT(T, x)
    TREE-INSERT(T, x)
    color[x] ← RED    ⊳ only RB property 3 can be violated
    while x ≠ root[T] and color[p[x]] = RED
        do if p[x] = left[p[p[x]]]
             then y ← right[p[p[x]]]    ⊳ y = aunt/uncle of x
                  if color[y] = RED
                     then ⟨Case 1⟩
                     else if x = right[p[x]]
                             then ⟨Case 2⟩    ⊳ Case 2 falls into Case 3
                          ⟨Case 3⟩
             else ⟨"then" clause with "left" and "right" swapped⟩
    color[root[T]] ← BLACK

Graphical notation
Let a triangle denote a subtree with a black root. All triangles in the case figures below have the same black-height.
Case 1
(x's uncle y is red.)
Recolor: make x's parent and uncle black and x's grandparent C red, then push the violation up the tree by setting new x ← C. (Or, the children of A are swapped; the same recoloring works.)

Case 2
(x's uncle y is black, and x is a right child.)
LEFT-ROTATE(A), where A = p[x]: transform to Case 3.

Case 3
(x's uncle y is black, and x is a left child.)
RIGHT-ROTATE(C), where C = p[p[x]], and recolor. Done! No more violations of RB property 3 are possible.

Analysis
Go up the tree performing Case 1, which only recolors nodes. If Case 2 or Case 3 occurs, perform 1 or 2 rotations, and terminate.
Running time: O(lg n) with O(1) rotations.
RB-DELETE has the same asymptotic running time; the case analysis takes more care (see textbook).
Introduction to Algorithms
6.046J/18.401J

LECTURE 11: Augmenting Data Structures
Dynamic order statistics
Methodology
Interval trees

Prof. Charles E. Leiserson
October 24, 2005

Dynamic order statistics
OS-SELECT(i, S): returns the ith smallest element in the dynamic set S.
OS-RANK(x, S): returns the rank of x ∈ S in the sorted order of S's elements.
IDEA: Use a red-black tree for the set S, but in each node x keep the size of the subtree rooted at x in a field size[x], alongside key[x].
Example of an OS-tree
[Figure: keys with subtree sizes (key/size): root M/9; left child C/5 and right child P/3; C's children A/1 and F/3; F's children D/1 and H/1; P's children N/1 and Q/1.]
Note: size[x] = size[left[x]] + size[right[x]] + 1.
Selection
Implementation trick: Use a sentinel (dummy record) for NIL such that size[NIL] = 0.
OS-SELECT(x, i)    ⊳ ith smallest element in the subtree rooted at x
    k ← size[left[x]] + 1    ⊳ k = rank(x)
    if i = k then return x
    if i < k
       then return OS-SELECT(left[x], i)
       else return OS-SELECT(right[x], i − k)
(OS-RANK is in the textbook.)
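A Python sketch over a plain size-augmented BST node (balancing omitted; field names mirror the slides):

class Node:
    """BST node augmented with the size of its subtree."""
    def __init__(self, key, left=None, right=None):
        self.key, self.left, self.right = key, left, right
        self.size = 1 + (left.size if left else 0) + (right.size if right else 0)

def os_select(x, i):
    """Return the i-th smallest key in the subtree rooted at x."""
    k = (x.left.size if x.left else 0) + 1   # k = rank(x) in its subtree
    if i == k:
        return x.key
    if i < k:
        return os_select(x.left, i)
    return os_select(x.right, i - k)

# Tree for keys 1..7, balanced by hand:
t = Node(4, Node(2, Node(1), Node(3)), Node(6, Node(5), Node(7)))
assert os_select(t, 5) == 5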
Example
OS-SELECT(root, 5) on the OS-tree above:
At M: k = size[left] + 1 = 6; i = 5 < 6, go left.
At C: k = 2; i = 5 > 2, go right with i = 5 − 2 = 3.
At F: k = 2; i = 3 > 2, go right with i = 3 − 2 = 1.
At H: k = 1 = i; return H.
Running time = O(h) = O(lg n) for red-black trees.
Data structure maintenance
Q. Why store subtree sizes rather than the ranks themselves?
A. Ranks are hard to maintain when the tree is modified; subtree sizes change in only O(lg n) places per update.

Example of insertion
INSERT("K"): on the downward search path, increment the subtree size of each visited node: M 9→10, C 5→6, F 3→4, H 1→2, and attach K with size 1.
Handling rebalancing
Don't forget that RB-INSERT and RB-DELETE may also need to modify the red-black tree in order to maintain balance.
Recolorings: no effect on subtree sizes.
Rotations: fix up subtree sizes in O(1) time.
[Figure: rotating E (size 16, with left child C of size 11) rightward makes C the subtree root; only the two rotated nodes' sizes change: C 11→16 and E 16→8.]
∴ RB-INSERT and RB-DELETE still run in O(lg n) time.
Data-structure augmentation
Methodology: (e.g., order-statistics trees)
1. Choose an underlying data structure (red-black trees).
2. Determine additional information to be stored in the data structure (subtree sizes).
3. Verify that this information can be maintained for modifying operations (RB-INSERT, RB-DELETE; don't forget rotations).
4. Develop new dynamic-set operations that use the information (OS-SELECT and OS-RANK).
These steps are guidelines, not rigid rules.
Interval trees
Goal: To maintain a dynamic set of intervals, such as time intervals.
Example: i = [7, 10]; low[i] = 7 is the left endpoint, high[i] = 10 is the right endpoint.
[Figure: a set of intervals on a number line, e.g., [4,8], [5,11], [7,10], [15,18], [17,19], [22,23].]
Query: For a given query interval i, find an interval in the set that overlaps i.

Following the methodology
1. Choose an underlying data structure: red-black tree keyed on the low (left) endpoint.
2. Determine additional information to be stored: in each node x, store m[x], the largest value in the subtree rooted at x:
m[x] = max{ high[int[x]], m[left[x]], m[right[x]] }.
Modifying operations
3. Verify that this information can be maintained for modifying operations.
INSERT: Fix m's on the way down.
Rotations: fixup = O(1) time per rotation.
[Figure: after a rotation between nodes [11,15] and [6,20], only the m fields of the two rotated nodes must be recomputed from their children (values 30, 19, 14 in the example).]
∴ INSERT takes O(lg n) time.
New operations
4. Develop new dynamic-set operations that use the information.
INTERVAL-SEARCH(i)
    x ← root
    while x ≠ NIL and (low[i] > high[int[x]] or low[int[x]] > high[i])
        do ⊳ i and int[x] don't overlap
           if left[x] ≠ NIL and low[i] ≤ m[left[x]]
              then x ← left[x]
              else x ← right[x]
    return x
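A Python sketch over an m-augmented BST node (balancing omitted; returns a node whose interval overlaps [lo, hi], or None):

class INode:
    """Interval-tree node: interval (lo, hi), keyed on lo, augmented with m."""
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi, self.left, self.right = lo, hi, left, right
        self.m = max(hi,
                     left.m if left else float("-inf"),
                     right.m if right else float("-inf"))

def interval_search(x, lo, hi):
    """Find a node in the tree rooted at x whose interval overlaps [lo, hi]."""
    while x is not None and (lo > x.hi or x.lo > hi):   # no overlap with int[x]
        if x.left is not None and lo <= x.left.m:
            x = x.left
        else:
            x = x.right
    return x

# The tree from the examples below: root [17,19]; left [5,11] with children
# [4,8] (right child [7,10]) and [15,18]; right child [22,23].
t = INode(17, 19,
          INode(5, 11, INode(4, 8, None, INode(7, 10)), INode(15, 18)),
          INode(22, 23))
found = interval_search(t, 14, 16)
assert (found.lo, found.hi) == (15, 18)
assert interval_search(t, 12, 14) is None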
Example 1: INTERVAL-SEARCH([14,16])
[Figure: tree with root [17,19] (m = 23); left child [5,11] (m = 18) with children [4,8] (m = 10, right child [7,10]) and [15,18] (m = 18); right child [22,23] (m = 23).]
x ← root: [14,16] and [17,19] don't overlap; 14 ≤ m[left[x]] = 18 ⇒ x ← left[x].
At [5,11]: [14,16] and [5,11] don't overlap; 14 > m[left[x]] = 10 ⇒ x ← right[x].
At [15,18]: [14,16] and [15,18] overlap ⇒ return x.

Example 2: INTERVAL-SEARCH([12,14])
x ← root: [12,14] and [17,19] don't overlap; 12 ≤ m[left[x]] = 18 ⇒ x ← left[x].
At [5,11]: no overlap; 12 > m[left[x]] = 10 ⇒ x ← right[x].
At [15,18]: no overlap; left[x] = NIL ⇒ x ← right[x].
x = NIL: no interval that overlaps [12,14] exists.
Analysis
Time = O(h) = O(lg n), since INTERVAL-SEARCH
does constant work at each level as it follows a
simple path down the tree.
List all overlapping intervals:
Search, list, delete, repeat.
Insert them all again at the end.
Time = O(k lg n), where k is the total number of
overlapping intervals.
This is an output-sensitive bound.
Best algorithm to date: O(k + lg n).
Correctness
Theorem. Let L be the set of intervals in the left subtree of node x, and let R be the set of intervals in x's right subtree.
If the search goes right, then {i′ ∈ L : i′ overlaps i} = ∅.
If the search goes left, then {i′ ∈ L : i′ overlaps i} = ∅ ⇒ {i′ ∈ R : i′ overlaps i} = ∅.
In other words, it's always safe to take only 1 of the 2 children: we'll either find something, or nothing was to be found.
Correctness proof
Proof. Suppose first that the search goes right. If left[x] = NIL, then we're done, since L = ∅. Otherwise, the code dictates that we must have low[i] > m[left[x]]. The value m[left[x]] corresponds to the high endpoint of some interval j ∈ L, and no other interval in L can have a larger high endpoint than high[j].
[Figure: interval j lies entirely to the left of i, since high[j] = m[left[x]] < low[i].]
Therefore, {i′ ∈ L : i′ overlaps i} = ∅.
Proof (continued)
Suppose that the search goes left, and assume that {i′ ∈ L : i′ overlaps i} = ∅. Then, the code dictates that low[i] ≤ m[left[x]] = high[j] for some j ∈ L. Since j ∈ L, it does not overlap i, and hence high[i] < low[j]. But the binary-search-tree property implies that for all i′ ∈ R, we have low[j] ≤ low[i′]. But then {i′ ∈ R : i′ overlaps i} = ∅.
[Figure: i lies entirely to the left of j, and every interval i′ ∈ R begins at or after low[j].]
Introduction to Algorithms
6.046J/18.401J
LECTURE 12
Skip Lists
Data structure
Randomized insertion
With-high-probability bound
Analysis
Coin flipping
Prof. Erik D. Demaine
Skip lists
Simple randomized dynamic search structure
Invented by William Pugh in 1989
Easy to implement
[Figure: a single sorted linked list containing 14, 23, 34, 42, 50, 59, 66, 72, 79; a search must scan it from the front.]

[Figure: a two-level structure: a second, sparser sorted list sits above the full list, linked to it at shared keys. To search, walk the top list until the next key would overshoot (searching for 59: too far, since 59 < 72), then drop down and finish the search in the bottom list.]
[Figure: the two-level structure annotated with √n: with about √n evenly spaced keys in the top list, a search walks roughly √n nodes in each level, for Θ(√n) total.]
lg n linked lists
lg n sorted linked lists are like a binary tree
(in fact, level-linked B+-tree; see Problem Set 5)
[Figure: lg n levels of sorted lists over the keys 14, 23, 34, 42, 50, 59, 66, 72, 79; each level skips every other element of the level below: top level 14, 79; next 14, 50, 79; next 14, 34, 50, 66, 79; bottom level all keys.]
Skip lists
Ideal skip list is this lg n linked list structure
Skip list data structure maintains roughly this
structure subject to updates (insert/delete)
[Figure: the ideal lg n-level structure from the previous slide.]
INSERT(x)
To insert an element x into a skip list:
SEARCH(x) to see where x fits in bottom list
Always insert into bottom list
INVARIANT: Bottom list contains all elements
Insert into some of the lists above
QUESTION: To which other lists should we add x?
INSERT(x)
QUESTION: To which other lists should we add x?
IDEA: Flip a (fair) coin; if HEADS,
promote x to next level up and flip again
Probability of promotion to next level = 1/2
On average: half of the elements appear only in the bottom list, a quarter are promoted one level, an eighth two levels, and so on, since each promotion happens with probability 1/2.
[Figure: a small skip list during an insertion; the new key is spliced into the bottom list and into each higher list to which it is promoted.]
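A minimal Python sketch of the randomized insertion just described, assuming a levels-of-linked-lists representation with a −∞ head sentinel; the class and variable names are illustrative.

import random

class SkipNode:
    def __init__(self, key, height):
        self.key = key
        self.next = [None] * height        # next[i] = successor on level i

class SkipList:
    MAX_LEVEL = 32

    def __init__(self):
        # Sentinel head holding -infinity, present on every level.
        self.head = SkipNode(float("-inf"), self.MAX_LEVEL)

    def insert(self, key):
        # Flip a fair coin to choose the promotion height (always >= 1).
        height = 1
        while height < self.MAX_LEVEL and random.random() < 0.5:
            height += 1
        node = SkipNode(key, height)
        x = self.head
        for level in range(self.MAX_LEVEL - 1, -1, -1):
            while x.next[level] is not None and x.next[level].key < key:
                x = x.next[level]          # walk right on this level
            if level < height:             # splice into the lists x reaches
                node.next[level] = x.next[level]
                x.next[level] = node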
Skip lists
A skip list is the result of insertions (and deletions) from an initially empty structure (containing just the sentinel −∞).
INSERT(x) uses random coin flips to decide promotion level.
DELETE(x) removes x from all lists containing it.
How good are skip lists? (speed/balance)
INTUITIVELY: Pretty good on average.
CLAIM: Really, really good, almost always.
With-high-probability theorem
THEOREM: With high probability, every search in an n-element skip list costs O(lg n).
INFORMALLY: Event E occurs with high probability (w.h.p.) if, for any α ≥ 1, there is an appropriate choice of constants for which E occurs with probability at least 1 − O(1/n^α). In fact, the constant in the O(lg n) bound depends on α.
IDEA: We can make the error probability O(1/n^α) very small by setting α large, e.g., α = 100. Almost certainly, the bound then remains true for the entire execution of a polynomial-time algorithm.
Analysis warmup
LEMMA: With high probability, an n-element skip list has O(lg n) levels.
PROOF: Error probability for having at most c lg n levels
  = Pr{more than c lg n levels}
  ≤ n · Pr{element x is promoted at least c lg n times}   (by Boole's inequality)
  = n · (1/2)^{c lg n}
  = n · (1/n^c)
  = 1/n^{c−1}
This error probability is polynomially small, i.e., at most n^{−α} for α = c − 1. We can make α arbitrarily large by choosing the constant c in the O(lg n) bound accordingly.
Proof of theorem
THEOREM: With high probability, every search in an n-element skip list costs O(lg n).
COOL IDEA: Analyze the search backwards, from leaf to root. The search starts [ends] at a leaf (a node in the bottom level). At each node visited:
If the node wasn't promoted higher (it got TAILS here), then we go [came from] left.
If the node was promoted higher (it got HEADS here), then we go [came from] up.
PROOF: The backward search makes up and left moves until it reaches the root (or −∞). The number of up moves is less than the number of levels, which is at most c lg n w.h.p. (by the lemma). So, w.h.p., the total number of moves is at most the number of times we need to flip a coin to get c lg n HEADs.
Coin flipping
Pr{at most c lg n HEADs in 10 c lg n coin flips}
  ≤ C(10 c lg n, c lg n) · (1/2)^{9 c lg n}     (choose which flips come up HEADS; overestimate by ignoring the order of the TAILs)
  ≤ (10e)^{c lg n} · 2^{−9 c lg n}               (since C(a, b) ≤ (ea/b)^b)
  = 2^{lg(10e) · c lg n} · 2^{−9 c lg n}
  = 2^{−[9 − lg(10e)] c lg n}
  = 1/n^α   for α = [9 − lg(10e)] c.
So 10 c lg n flips yield at least c lg n HEADs w.h.p., and hence every search costs O(lg n) w.h.p.
Introduction to Algorithms
6.046J/18.401J
LECTURE 13
Amortized Analysis
Dynamic tables
Aggregate method
Accounting method
Potential method
Prof. Charles E. Leiserson
Dynamic tables
[Figure: a table-doubling sequence. A size-1 table overflows on the second INSERT, so a size-2 table is allocated and item 1 is copied over; the size-2 table overflows on the third INSERT, giving size 4; the fifth INSERT overflows again, giving size 8, after which items 5 through 7 fit without further growth.]
Worst-case analysis
Consider a sequence of n insertions. The worst-case time to execute one insertion is Θ(n). Therefore, the worst-case time for n insertions is n · Θ(n) = Θ(n²).
WRONG! In fact, the worst-case cost for n insertions is only Θ(n) ≪ Θ(n²).
Let's see why.
Tighter analysis
Let ci = the cost of the ith insertion
       = i if i − 1 is an exact power of 2,
         1 otherwise.

i      1  2  3  4  5  6  7  8  9  10
sizei  1  2  4  4  8  8  8  8  16 16
ci     1  2  3  1  5  1  1  1  9  1
Cost of n insertions = Σ_{i=1}^{n} ci
                     ≤ n + Σ_{j=0}^{⌊lg(n−1)⌋} 2^j
                     ≤ 3n
                     = Θ(n).
Thus, the average cost of each dynamic-table operation is Θ(n)/n = Θ(1).
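A quick empirical check of this Θ(1) amortized bound, using a toy doubling table (illustrative only; Python's built-in list already grows this way):

class DynamicTable:
    def __init__(self):
        self.capacity, self.items = 1, []

    def insert(self, x):
        cost = 1                          # pay 1 to write the new item
        if len(self.items) == self.capacity:
            self.capacity *= 2            # overflow: double the table...
            cost += len(self.items)       # ...and pay to move every old item
        self.items.append(x)
        return cost

table, total, n = DynamicTable(), 0, 1000
for i in range(n):
    total += table.insert(i)
print(total / n)   # stays below 3, matching the sum above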
Amortized analysis
An amortized analysis is any strategy for
analyzing a sequence of operations to
show that the average cost per operation is
small, even though a single operation
within the sequence might be expensive.
Even though we're taking averages, however, probability is not involved!
An amortized analysis guarantees the
average performance of each operation in
the worst case.
Accounting method
Charge the ith operation a fictitious amortized cost ĉi, where $1 pays for 1 unit of work (i.e., time). This fee is consumed to perform the operation. Any amount not immediately consumed is stored in the bank for use by subsequent operations. The bank balance must not go negative! We must ensure that
  Σ_{i=1}^{n} ci ≤ Σ_{i=1}^{n} ĉi
for all n. Thus, the total amortized costs provide an upper bound on the total true costs.
Accounting analysis of dynamic tables
Charge an amortized cost of ĉi = $3 for the ith insertion.
$1 pays for the immediate insertion.
$2 is stored for later table doubling.
When the table doubles, $1 pays to move a recent item, and $1 pays to move an old item.
[Figure: a table whose recently inserted half carries $2 of credit per item; on overflow, those credits pay to move every item into the doubled table, leaving all credits at $0, after which subsequent insertions rebuild them.]
Accounting analysis (continued)
Key invariant: Bank balance never drops below 0. Thus, the sum of the amortized costs provides an upper bound on the sum of the true costs.

i      1   2  3  4  5  6  7  8  9  10
sizei  1   2  4  4  8  8  8  8  16 16
ci     1   2  3  1  5  1  1  1  9  1
ĉi     2*  3  3  3  3  3  3  3  3  3
banki  1   2  2  4  2  4  6  8  2  4

*Okay, so I lied. The first operation costs only $2, not $3.
Potential method
IDEA: View the bank account as the potential energy (à la physics) of the dynamic set.
Framework:
Start with an initial data structure D0.
Operation i transforms Di−1 to Di.
The cost of operation i is ci.
Define a potential function Φ : {Di} → R such that Φ(D0) = 0 and Φ(Di) ≥ 0 for all i.
The amortized cost ĉi with respect to Φ is defined to be ĉi = ci + Φ(Di) − Φ(Di−1).
Understanding potentials
  ĉi = ci + Φ(Di) − Φ(Di−1), where Φ(Di) − Φ(Di−1) is the potential difference ΔΦi.
The amortized costs bound the true costs: summing both sides,
  Σ_{i=1}^{n} ĉi = Σ_{i=1}^{n} (ci + Φ(Di) − Φ(Di−1))
                = Σ_{i=1}^{n} ci + Φ(Dn) − Φ(D0)      (the sum telescopes)
                ≥ Σ_{i=1}^{n} ci.
Potential analysis of table doubling
Define Φ(Di) = 2i − 2^{⌈lg i⌉} (with 2^{⌈lg 0⌉} = 0). For example, after i = 6 insertions into a table of size 8, Φ = 2·6 − 2³ = 4, matching the $4 of banked credit in the accounting method.
The amortized cost of the ith insertion is ĉi = ci + Φ(Di) − Φ(Di−1).
Calculation
Case 1: i − 1 is an exact power of 2.
  ĉi = i + 2 − 2^{⌈lg i⌉} + 2^{⌈lg(i−1)⌉}
     = i + 2 − 2(i − 1) + (i − 1)
     = i + 2 − 2i + 2 + i − 1
     = 3
Case 2: i − 1 is not an exact power of 2.
  ĉi = 1 + 2 − 2^{⌈lg i⌉} + 2^{⌈lg(i−1)⌉}
     = 3     (since 2^{⌈lg i⌉} = 2^{⌈lg(i−1)⌉})
Therefore, n insertions cost Θ(n) in the worst case.
Exercise: Fix the bug in this analysis to show that the amortized cost of the first insertion is only 2.
Conclusions
Amortized costs can provide a clean abstraction
of data-structure performance.
Any of the analysis methods can be used when
an amortized analysis is called for, but each
method has some situations where it is arguably
the simplest or most precise.
Different schemes may work for assigning
amortized costs in the accounting method, or
potentials in the potential method, sometimes
yielding radically different bounds.
Introduction to Algorithms
6.046J/18.401J
LECTURE 14
Competitive Analysis
Self-organizing lists
Move-to-front heuristic
Competitive analysis of
MTF
Self-organizing lists
List L of n elements
The operation ACCESS(x) costs rankL(x) =
distance of x from the head of L.
L can be reordered by transposing adjacent
elements at a cost of 1.
Example:
L: 12, 3, 50, 14, 17, 4
Accessing element 14 costs rankL(14) = 4. Transposing the adjacent elements 3 and 50 costs 1 and yields L: 12, 50, 3, 14, 17, 4.
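A minimal Python sketch of this cost model together with the move-to-front heuristic defined below; the list contents are the example above.

def access_mtf(L, x):
    """Access x in list L, move it to the front, and return MTF's cost."""
    r = L.index(x) + 1      # rank_L(x): distance of x from the head
    L.remove(x)             # splice x out...
    L.insert(0, x)          # ...and reinsert at the front (r - 1 transposes)
    return 2 * r            # cost r to access plus ~r to transpose

L = [12, 3, 50, 14, 17, 4]
print(access_mtf(L, 14))    # 8: rank-4 access plus the move to the front
print(L)                    # [14, 12, 3, 50, 17, 4]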
On-line setting: the sequence S of accesses is revealed one operation at a time, and the list may be reordered between accesses.
Average-case analysis: if element x is accessed with probability p(x), the expected access cost of a fixed list L is
  E[C_A(S)] = Σ_{x∈L} p(x) · rank_L(x),
which is minimized when L is sorted in decreasing order of access probability.
The move-to-front (MTF) heuristic: after accessing x, move x to the head of the list by transposes, at total cost 2 · rank_L(x).
Competitive analysis
Definition. An on-line algorithm A is α-competitive if there exists a constant k such that, for any sequence S of operations,
  C_A(S) ≤ α · C_OPT(S) + k,
where OPT is the optimal off-line algorithm ("God's algorithm").
MTF is O(1)-competitive
Theorem. MTF is 4-competitive for self-organizing lists.
Proof. Let Li be MTF's list after the ith access, and let Li* be OPT's list after the ith access. Let
  ci = MTF's cost for the ith operation = 2 · rank_{Li−1}(x) if it accesses x;
  ci* = OPT's cost for the ith operation = rank_{Li−1*}(x) + ti,
where ti is the number of transposes that OPT performs.
Potential function
Define the potential function Φ : {Li} → R by
  Φ(Li) = 2 · |{(x, y) : x precedes y in Li and y precedes x in Li*}|
        = 2 · (# inversions).
Example. Li = E, C, A, D, B and Li* = C, A, B, D, E. The inversions are (E,C), (E,A), (E,D), (E,B), and (D,B), so Φ(Li) = 2 · 5 = 10.
Note that
Φ(Li) ≥ 0 for i = 0, 1, …;
Φ(L0) = 0 if MTF and OPT start with the same list.
How much does Φ change from 1 transpose? A transpose creates or destroys exactly 1 inversion, so ΔΦ = ±2.
What happens on an access to x? Partition the other elements by their position relative to x in the two lists:
  Li−1  = [ A ∪ B ]  x  [ C ∪ D ]
  Li−1* = [ A ∪ C ]  x  [ B ∪ D ]
Here A contains the elements preceding x in both lists, B those preceding x only in Li−1, C those preceding x only in Li−1*, and D those following x in both. Then
  r  = rank_{Li−1}(x)  = |A| + |B| + 1,
  r* = rank_{Li−1*}(x) = |A| + |C| + 1.
When MTF moves x to the front, it creates |A| inversions and destroys |B| inversions, and each of OPT's ti transposes changes the inversion count by at most 1, so
  Φ(Li) − Φ(Li−1) ≤ 2(|A| − |B| + ti).
Amortized cost
The amortized cost for the ith operation of MTF with respect to Φ is
  ĉi = ci + Φ(Li) − Φ(Li−1)
     ≤ 2r + 2(|A| − |B| + ti)
     = 2r + 2(|A| − (r − 1 − |A|) + ti)      (since r = |A| + |B| + 1)
     = 2r + 4|A| − 2r + 2 + 2ti
     = 4|A| + 2 + 2ti
     ≤ 4(r* + ti)                            (since r* = |A| + |C| + 1 ≥ |A| + 1)
     = 4ci*.
Summing over the whole sequence S:
  C_MTF(S) = Σ_{i=1}^{|S|} ci
           = Σ_{i=1}^{|S|} ( ĉi + Φ(Li−1) − Φ(Li) )
           ≤ Σ_{i=1}^{|S|} 4ci* + Φ(L0) − Φ(L_{|S|})
           ≤ 4 · C_OPT(S),
since Φ(L0) = 0 and Φ(L_{|S|}) ≥ 0.
Addendum
If we count transpositions that move x toward the front as free (modeling splicing x in and out of L in constant time), then MTF is 2-competitive.
What if L0 ≠ L0*? Then Φ(L0) might be Θ(n²) in the worst case. Thus, C_MTF(S) ≤ 4·C_OPT(S) + Θ(n²), which is still 4-competitive, since n² is constant as |S| → ∞.
Introduction to Algorithms
6.046J/18.401J
LECTURE 15
Dynamic Programming
Longest common
subsequence
Optimal substructure
Overlapping subproblems
Dynamic programming
Design technique, like divide-and-conquer.
Example: Longest Common Subsequence (LCS)
Given two sequences x[1 . . m] and y[1 . . n], find a longest subsequence common to them both ("a", not "the": it need not be unique).
  x: A B C B D A B
  y: B D C A B A
  BCBA = LCS(x, y)     (functional notation, but LCS is not a function)
Recursive formulation
Define c[i, j] = | LCS(x[1 . . i], y[1 . . j]) |; then c[m, n] = | LCS(x, y) |.
Theorem.
  c[i, j] = c[i−1, j−1] + 1                 if x[i] = y[j],
            max{ c[i−1, j], c[i, j−1] }     otherwise.
Proof. Case x[i] = y[j]:
[Figure: the prefixes x[1 . . i] and y[1 . . j] with equal last symbols; let z[1 . . k] = LCS(x[1 . . i], y[1 . . j]), where c[i, j] = k. Then z[k] = x[i] = y[j], since otherwise z could be extended.]
Proof (continued)
Claim: z[1 . . k−1] = LCS(x[1 . . i−1], y[1 . . j−1]).
Suppose w is a longer CS of x[1 . . i−1] and y[1 . . j−1], that is, |w| > k−1. Then, cut and paste: w ‖ z[k] (w concatenated with z[k]) is a common subsequence of x[1 . . i] and y[1 . . j] with |w ‖ z[k]| > k. Contradiction, proving the claim.
Thus, c[i−1, j−1] = k−1, which implies that c[i, j] = c[i−1, j−1] + 1.
Other cases are similar.
Dynamic-programming
hallmark #1
Optimal substructure
An optimal solution to a problem
(instance) contains optimal
solutions to subproblems.
If z = LCS(x, y), then any prefix of z is
an LCS of a prefix of x and a prefix of y.
Recursion tree
m = 3, n = 4:
[Figure: the recursion tree for c[3,4]; each node (i, j) recurses on (i−1, j) and (i, j−1). The height is m + n, so the tree potentially does exponential work, yet the same subproblems, such as (2,3) and (1,3), appear over and over.]
Dynamic-programming
hallmark #2
Overlapping subproblems
A recursive solution contains a
small number of distinct
subproblems repeated many times.
The number of distinct LCS subproblems for
two strings of lengths m and n is only mn.
Memoization algorithm
Memoization: After computing a solution to a subproblem, store it in a table. Subsequent calls check the table to avoid redoing work.
LCS(x, y, i, j)
  if c[i, j] = NIL
     then if x[i] = y[j]
            then c[i, j] ← LCS(x, y, i−1, j−1) + 1       ⊳ same as before
            else c[i, j] ← max{ LCS(x, y, i−1, j),
                                LCS(x, y, i, j−1) }
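A minimal Python sketch of the memoized recurrence, using functools.lru_cache as the table; index 0 stands for the empty prefix, giving the base case.

from functools import lru_cache

def lcs_length(x, y):
    @lru_cache(maxsize=None)          # the memo table: c[i, j]
    def c(i, j):
        if i == 0 or j == 0:          # empty prefix: LCS has length 0
            return 0
        if x[i - 1] == y[j - 1]:      # x[i] = y[j] in the slides' notation
            return c(i - 1, j - 1) + 1
        return max(c(i - 1, j), c(i, j - 1))
    return c(len(x), len(y))

assert lcs_length("ABCBDAB", "BDCABA") == 4   # e.g. BCBA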
Dynamic-programming algorithm
IDEA: Compute the table bottom-up. Time = Θ(mn).

        A  B  C  B  D  A  B
     0  0  0  0  0  0  0  0
  B  0  0  1  1  1  1  1  1
  D  0  0  1  1  1  2  2  2
  C  0  0  1  2  2  2  2  2
  A  0  1  1  2  2  2  3  3
  B  0  1  2  2  3  3  3  4
  A  0  1  2  2  3  3  4  4

Reconstruct an LCS by tracing backwards through the table.
Space = Θ(mn). Exercise: O(min{m, n}) space.
Introduction to Algorithms
6.046J/18.401J
LECTURE 16
Greedy Algorithms (and
Graphs)
Graph representation
Minimum spanning trees
Optimal substructure
Greedy choice
Prims greedy MST
algorithm
Prof. Charles E. Leiserson
Graphs (review)
Definition. A directed graph (digraph) G = (V, E) is an ordered pair consisting of
a set V of vertices (singular: vertex),
a set E ⊆ V × V of edges.
In an undirected graph G = (V, E), the edge set E consists of unordered pairs of vertices.
In either case, we have |E| = O(V²). Moreover, if G is connected, then |E| ≥ |V| − 1, which implies that lg|E| = Θ(lg V).
(Review CLRS, Appendix B.)
Adjacency-matrix representation
The adjacency matrix of a graph G = (V, E), where V = {1, 2, …, n}, is the matrix A[1 . . n, 1 . . n] given by
  A[i, j] = 1 if (i, j) ∈ E,
            0 if (i, j) ∉ E.
[Figure: a 4-vertex digraph with edges (1,2), (1,3), (2,3), (4,3).]
  A  1 2 3 4
  1  0 1 1 0
  2  0 0 1 0
  3  0 0 0 0
  4  0 0 1 0
Θ(V²) storage: a dense representation.
Adjacency-list representation
An adjacency list of a vertex v ∈ V is the list Adj[v] of vertices adjacent to v.
[Figure: the same 4-vertex digraph.]
  Adj[1] = {2, 3}
  Adj[2] = {3}
  Adj[3] = {}
  Adj[4] = {3}
For undirected graphs, |Adj[v]| = degree(v). For digraphs, |Adj[v]| = out-degree(v).
Handshaking Lemma: Σ_{v∈V} degree(v) = 2|E| for undirected graphs ⇒ adjacency lists use Θ(V + E) storage, a sparse representation (for either type of graph).
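A minimal Python sketch of this representation, using a dictionary of lists for the example digraph above:

from collections import defaultdict

adj = defaultdict(list)
for u, v in [(1, 2), (1, 3), (2, 3), (4, 3)]:   # edges of the example digraph
    adj[u].append(v)                            # directed: record u -> v only

print(sorted(adj[1]))   # [2, 3], matching Adj[1] above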
Minimum spanning trees
Input: a connected, undirected graph G = (V, E) with edge-weight function w : E → R. (For simplicity, assume that all edge weights are distinct.)
Output: a spanning tree T (a tree that connects all vertices) of minimum weight w(T) = Σ_{(u,v)∈T} w(u, v).
Example of MST
[Figure: a weighted graph with edge weights 3, 5, 6, 8, 9, 10, 12, 14, 15; the edges of the minimum spanning tree are highlighted.]
Optimal substructure
MST T (the other edges of G are not shown). Remove any edge (u, v) ∈ T; then T is partitioned into two subtrees T1 and T2, with u ∈ T1 and v ∈ T2.
[Figure: the tree T split into subtrees T1 and T2 by removing the edge (u, v).]
Theorem. The subtree T1 is an MST of G1 = (V1, E1), the subgraph of G induced by the vertices of T1. Proof sketch: cut and paste: w(T) = w(u, v) + w(T1) + w(T2), so a cheaper spanning tree of G1 would yield a spanning tree cheaper than T.
Greedy choice
Theorem. Let T be the MST of G = (V, E), and let A ⊆ V. Suppose that (u, v) ∈ E is the least-weight edge connecting A to V − A. Then (u, v) ∈ T.
Proof of theorem
Proof. Suppose (u, v) ∉ T. Cut and paste.
[Figure: the tree T with the cut (A, V − A); (u, v) is the least-weight edge connecting A to V − A. T contains a unique path from u to v, and that path must cross the cut at some edge (u′, v′). Swapping gives the tree T′ = T − {(u′, v′)} ∪ {(u, v)} with w(T′) < w(T), contradicting the minimality of T.]
Prim's algorithm
IDEA: Maintain V − A as a priority queue Q. Key each vertex in Q with the weight of the least-weight edge connecting it to a vertex in A.
Q ← V
key[v] ← ∞ for all v ∈ V
key[s] ← 0 for some arbitrary s ∈ V
while Q ≠ ∅
    do u ← EXTRACT-MIN(Q)
       for each v ∈ Adj[u]
           do if v ∈ Q and w(u, v) < key[v]
                 then key[v] ← w(u, v)       ⊳ DECREASE-KEY
                      π[v] ← u
At the end, {(v, π[v])} forms the MST.
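A minimal Python sketch of Prim's algorithm, assuming a binary heap with lazy deletion in place of an explicit DECREASE-KEY; the graph encoding is illustrative: a dict mapping each vertex to a list of (neighbor, weight) pairs, with both directions listed for an undirected graph.

import heapq

def prim_mst(graph, s):
    key = {v: float("inf") for v in graph}
    parent = {v: None for v in graph}
    key[s] = 0
    in_tree = set()
    pq = [(0, s)]                      # lazy deletion stands in for DECREASE-KEY
    while pq:
        _, u = heapq.heappop(pq)
        if u in in_tree:
            continue                   # stale entry: a smaller key was pushed
        in_tree.add(u)
        for v, w in graph[u]:
            if v not in in_tree and w < key[v]:
                key[v] = w             # a cheaper edge into v was found
                parent[v] = u
                heapq.heappush(pq, (w, v))
    return parent                      # the edges (v, parent[v]) form the MST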
Example of Prim's algorithm
[Figure: a step-by-step run of Prim's algorithm on a weighted graph, starting from key[s] = 0. Each EXTRACT-MIN moves the minimum-key vertex into the tree, and the keys of its neighbors still in Q are decreased (for instance ∞ → 7, 10, 15 in the first round, later 12 → 5 and 14 → 3), until Q is empty.]
Analysis of Prim
Θ(V) total for the initialization (Q ← V; key[v] ← ∞ for all v ∈ V; key[s] ← 0).
The while loop runs |V| times, each iteration performing one EXTRACT-MIN.
The inner for loop runs degree(u) times per iteration, each test being an implicit DECREASE-KEY.
Handshaking Lemma ⇒ Θ(E) implicit DECREASE-KEYs.
Time = Θ(V) · T_EXTRACT-MIN + Θ(E) · T_DECREASE-KEY

Q               T_EXTRACT-MIN       T_DECREASE-KEY     Total
array           O(V)                O(1)               O(V²)
binary heap     O(lg V)             O(lg V)            O(E lg V)
Fibonacci heap  O(lg V) amortized   O(1) amortized     O(E + V lg V) worst case
MST algorithms
Kruskal's algorithm (see CLRS):
Uses the disjoint-set data structure (Lecture 10).
Running time = O(E lg V).
Best to date:
Karger, Klein, and Tarjan [1993].
Randomized algorithm.
O(V + E) expected time.
Introduction to Algorithms
6.046J/18.401J
LECTURE 17
Shortest Paths I
Properties of shortest paths
Dijkstras algorithm
Correctness
Analysis
Breadth-first search
Prof. Erik Demaine
Paths in graphs
Consider a digraph G = (V, E) with edge-weight function w : E → R. The weight of path p = v1 → v2 → ⋯ → vk is defined to be
  w(p) = Σ_{i=1}^{k−1} w(vi, vi+1).
Example:
[Figure: a path v1 → v2 → v3 → v4 → v5 whose edge weights sum to w(p) = −2.]
Shortest paths
A shortest path from u to v is a path of minimum weight from u to v. The shortest-path weight from u to v is defined as
  δ(u, v) = min{ w(p) : p is a path from u to v }.
Note: δ(u, v) = ∞ if no path from u to v exists.
Optimal substructure
Theorem. A subpath of a shortest path is a shortest path.
[Figure: cut and paste: if some subpath had a cheaper alternative, splicing it in would yield a shorter overall path.]
Triangle inequality
Theorem. For all u, v, x ∈ V, we have δ(u, v) ≤ δ(u, x) + δ(x, v).
Proof.
[Figure: the shortest path from u to v, of weight δ(u, v), can be no heavier than the concatenation of the shortest paths from u to x and from x to v, of total weight δ(u, x) + δ(x, v).]
Well-definedness of shortest paths
If a graph G contains a negative-weight cycle, then some shortest paths may not exist.
Example:
[Figure: u and v joined through a cycle of total weight < 0; going around the cycle repeatedly drives the path weight toward −∞.]
Dijkstra's algorithm
d[s] ← 0
for each v ∈ V − {s}
    do d[v] ← ∞
S ← ∅
Q ← V                          ⊳ Q is a priority queue maintaining V − S
while Q ≠ ∅
    do u ← EXTRACT-MIN(Q)
       S ← S ∪ {u}
       for each v ∈ Adj[u]
           do if d[v] > d[u] + w(u, v)                ⊳ relaxation step
                 then d[v] ← d[u] + w(u, v)           ⊳ implicit DECREASE-KEY
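A minimal Python sketch of Dijkstra's algorithm with a binary heap and lazy deletion; the example graph matches the run shown in the slides below.

import heapq

def dijkstra(graph, s):
    d = {v: float("inf") for v in graph}
    d[s] = 0
    done = set()                       # the set S of finished vertices
    pq = [(0, s)]
    while pq:
        du, u = heapq.heappop(pq)
        if u in done:
            continue                   # stale queue entry
        done.add(u)
        for v, w in graph[u]:
            if d[v] > du + w:          # relaxation step
                d[v] = du + w
                heapq.heappush(pq, (d[v], v))
    return d

graph = {"A": [("B", 10), ("C", 3)],
         "B": [("C", 1), ("D", 2)],
         "C": [("B", 4), ("D", 8), ("E", 2)],
         "D": [("E", 7)],
         "E": [("D", 9)]}
print(dijkstra(graph, "A"))   # {'A': 0, 'B': 7, 'C': 3, 'D': 9, 'E': 5}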
Example of Dijkstra's algorithm
Graph with nonnegative edge weights:
[Figure: vertices A through E with edges A→B (10), A→C (3), B→C (1), B→D (2), C→B (4), C→D (8), C→E (2), D→E (7), E→D (9).]
[Figure: the run of Dijkstra's algorithm from source A. The table Q records tentative distances d, and S grows by one vertex per EXTRACT-MIN:
Initialize: d = (A:0, B:∞, C:∞, D:∞, E:∞), S = {}.
Extract A and relax its edges: B ← 10, C ← 3. S = {A}.
Extract C and relax: B ← 7, D ← 11, E ← 5. S = {A, C}.
Extract E and relax: D stays 11. S = {A, C, E}.
Extract B and relax: D ← 9. S = {A, C, E, B}.
Extract D. S = {A, C, E, B, D}.]
Correctness Part I
Lemma. Initializing d[s] ← 0 and d[v] ← ∞ for all v ∈ V − {s} establishes d[v] ≥ δ(s, v) for all v ∈ V, and this invariant is maintained over any sequence of relaxation steps.
Proof. Suppose not. Let v be the first vertex for which d[v] < δ(s, v), and let u be the vertex that caused d[v] to change: d[v] = d[u] + w(u, v). Then
  d[v] < δ(s, v)                 (supposition)
       ≤ δ(s, u) + δ(u, v)       (triangle inequality)
       ≤ δ(s, u) + w(u, v)       (sh. path ≤ specific path)
       ≤ d[u] + w(u, v)          (v is the first violation)
Contradiction.
Correctness Part II
Lemma. Let u be v's predecessor on a shortest path from s to v. Then, if d[u] = δ(s, u) and edge (u, v) is relaxed, we have d[v] = δ(s, v) after the relaxation.
Proof. Observe that δ(s, v) = δ(s, u) + w(u, v). Suppose that d[v] > δ(s, v) before the relaxation. (Otherwise, we're done.) Then the test d[v] > d[u] + w(u, v) succeeds, because d[v] > δ(s, v) = δ(s, u) + w(u, v) = d[u] + w(u, v), and the algorithm sets d[v] = d[u] + w(u, v) = δ(s, v).
Correctness Part III
[Figure: the moment just before u is added to S. A shortest path from s to u must leave S somewhere; let (x, y) be the first edge on it with x ∈ S and y ∉ S. Since d[x] = δ(s, x), relaxing (x, y) set d[y] = δ(s, y) ≤ δ(s, u) ≤ d[u]; but EXTRACT-MIN chose u, so d[u] ≤ d[y], whence d[u] = δ(s, u).]
Analysis of Dijkstra
while Q ≠ ∅
    do u ← EXTRACT-MIN(Q)
       S ← S ∪ {u}
       for each v ∈ Adj[u]
           do if d[v] > d[u] + w(u, v)
                 then d[v] ← d[u] + w(u, v)
The while loop runs |V| times, each iteration performing one EXTRACT-MIN; the inner for loop runs degree(u) times per iteration, each test a potential implicit DECREASE-KEY. Handshaking Lemma ⇒ Θ(E) implicit DECREASE-KEYs.
Analysis of Dijkstra (continued)
Time = Θ(V) · T_EXTRACT-MIN + Θ(E) · T_DECREASE-KEY

Q               T_EXTRACT-MIN       T_DECREASE-KEY     Total
array           O(V)                O(1)               O(V²)
binary heap     O(lg V)             O(lg V)            O(E lg V)
Fibonacci heap  O(lg V) amortized   O(1) amortized     O(E + V lg V) worst case
Unweighted graphs
Suppose that w(u, v) = 1 for all (u, v) ∈ E. Can Dijkstra's algorithm be improved? Use a simple FIFO queue instead of a priority queue.
Breadth-first search:
while Q ≠ ∅
    do u ← DEQUEUE(Q)
       for each v ∈ Adj[u]
           do if d[v] = ∞
                 then d[v] ← d[u] + 1
                      ENQUEUE(Q, v)
Analysis: each vertex is enqueued at most once and each edge is examined at most once, so Time = O(V + E).
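A minimal Python sketch of BFS distances, assuming 'adj' maps each vertex to a list of its neighbors:

from collections import deque

def bfs_distances(adj, s):
    d = {v: float("inf") for v in adj}
    d[s] = 0
    q = deque([s])                     # FIFO queue replaces the priority queue
    while q:
        u = q.popleft()
        for v in adj[u]:
            if d[v] == float("inf"):   # first visit fixes the distance
                d[v] = d[u] + 1
                q.append(v)
    return d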
Example of breadth-first search
[Figure: BFS from vertex a on a 9-vertex graph with vertices a through i. Distances are discovered in waves: d[a] = 0; d[b] = d[d] = 1; d[c] = d[e] = 2; d[g] = d[i] = 3; d[f] = d[h] = 4. The queue, in order of first insertion, is a, b, d, c, e, g, i, f, h; each vertex enters the queue exactly once.]
Correctness of BFS
Key idea: The FIFO Q in breadth-first search mimics the priority queue Q in Dijkstra.
Invariant: v comes after u in Q implies that d[v] = d[u] or d[v] = d[u] + 1.
Introduction to Algorithms
6.046J/18.401J
LECTURE 18
Shortest Paths II
Bellman-Ford algorithm
Linear programming and
difference constraints
VLSI layout compaction
Negative-weight cycles
Recall: If a graph G = (V, E) contains a negative-weight cycle, then some shortest paths may not exist.
[Figure: u and v joined through a cycle of total weight < 0.]
Bellman-Ford algorithm
d[s] ← 0
for each v ∈ V − {s}
    do d[v] ← ∞            ⊳ initialization
for i ← 1 to |V| − 1
    do for each edge (u, v) ∈ E
           do if d[v] > d[u] + w(u, v)
                 then d[v] ← d[u] + w(u, v)     ⊳ relaxation step
for each edge (u, v) ∈ E
    do if d[v] > d[u] + w(u, v)
          then report that a negative-weight cycle exists
At the end, d[v] = δ(s, v), if no negative-weight cycles exist. Time = O(VE).
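A minimal Python sketch of Bellman-Ford, assuming the graph is given as a list of (u, v, w) edge triples:

def bellman_ford(vertices, edges, s):
    d = {v: float("inf") for v in vertices}
    d[s] = 0
    for _ in range(len(vertices) - 1):         # |V| - 1 relaxation passes
        for u, v, w in edges:
            if d[u] + w < d[v]:
                d[v] = d[u] + w                # relaxation step
    for u, v, w in edges:                      # one extra pass detects cycles
        if d[u] + w < d[v]:
            raise ValueError("negative-weight cycle reachable from s")
    return d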
Example of Bellman-Ford
[Figure: a run on a 5-vertex graph A through E containing negative edge weights. Initialization sets d[A] = 0 and d = ∞ elsewhere; each pass relaxes every edge in a fixed order, and the labels decrease monotonically (for example, d[B] settles at −1, d[C] at 2, d[D] at −2, and d[E] at 1) until no relaxation changes anything.]
Correctness
Theorem. If G = (V, E) contains no negative-weight cycles, then after the Bellman-Ford algorithm executes, d[v] = δ(s, v) for all v ∈ V.
Proof. Let v ∈ V be any vertex, and consider a shortest path p from s to v with the minimum number of edges:
  p: s = v0 → v1 → v2 → ⋯ → vk = v.
Since p is a shortest path, each of its prefixes is a shortest path to its endpoint. Initially, d[v0] = 0 = δ(s, v0), and d[v0] is unchanged by subsequent relaxations. After 1 pass through E, we have d[v1] = δ(s, v1); after 2 passes, d[v2] = δ(s, v2); and in general, after k passes, d[vk] = δ(s, vk). Since G contains no negative-weight cycles, p is simple, so k ≤ |V| − 1.
Detection of negative-weight cycles
Corollary. If a value d[v] fails to converge after |V| − 1 passes, there exists a negative-weight cycle in G reachable from s.
Linear programming
Let A be an m × n matrix, b an m-vector, and c an n-vector. Find an n-vector x that maximizes cᵀx subject to Ax ≤ b, or determine that no such solution exists.
[Figure: the block shapes of A (m × n), x, and b in Ax ≤ b, with objective cᵀx.]
Linear-programming algorithms
Algorithms for the general problem:
Simplex methods: practical, but worst-case exponential time.
Interior-point methods: polynomial time, and competitive with simplex in practice.
Feasibility problem: No optimization criterion. Just find x such that Ax ≤ b. In general, just as hard as ordinary LP.
Systems of difference constraints
A special case of LP feasibility: each row of A contains one 1 and one −1, with all other entries 0, so every constraint is a difference constraint of the form xj − xi ≤ wij.
Construct the constraint graph: one vertex vi per unknown xi, and for each constraint xj − xi ≤ wij an edge (vi, vj) of weight wij. (The A matrix has dimensions |E| × |V|.)
Unsatisfiable constraints
Theorem. If the constraint graph contains a negative-weight cycle, then the system of differences is unsatisfiable.
Proof. Suppose that the negative-weight cycle is v1 → v2 → ⋯ → vk → v1. Then, we have
  x2 − x1 ≤ w12
  x3 − x2 ≤ w23
     ⋮
  xk − xk−1 ≤ wk−1,k
  x1 − xk ≤ wk1
Summing, the left-hand sides telescope to 0, while the right-hand side is the weight of the cycle, which is < 0. Therefore, no values for the xi can satisfy the constraints.
Satisfying the constraints
Theorem. Suppose the constraint graph contains no negative-weight cycle. Then the system of differences is satisfiable.
Proof. Add a new vertex s with a weight-0 edge to every other vertex.
[Figure: s connected by weight-0 edges to v1, v3, v4, v7, v9, and so on.]
Note: No negative-weight cycles are introduced ⇒ shortest paths from s exist.
Proof (continued)
Claim: The assignment xi = δ(s, vi) solves the constraints.
Consider any constraint xj − xi ≤ wij, and consider the shortest paths from s to vj and vi. By the triangle inequality, δ(s, vj) ≤ δ(s, vi) + wij, that is, xj ≤ xi + wij, so the constraint is satisfied.
VLSI layout compaction
[Figure: two chip features with x-coordinates x1 and x2 that must stay at least a minimum separation apart; the constraint has the form x2 − x1 ≥ d + λ.]
Bellman-Ford minimizes maxi{xi} − mini{xi}, which compacts the layout in the x-dimension.
Introduction to Algorithms
6.046J/18.401J
LECTURE 19
Shortest Paths III
All-pairs shortest paths
Matrix-multiplication
algorithm
Floyd-Warshall algorithm
Johnsons algorithm
Prof. Charles E. Leiserson
Shortest paths
Single-source shortest paths:
Nonnegative edge weights: Dijkstra's algorithm, O(E + V lg V).
General: Bellman-Ford, O(VE).
DAG: one pass of Bellman-Ford in topologically sorted order, O(V + E).
All-pairs shortest paths: run a single-source algorithm from every vertex, or do better with the three algorithms of this lecture.
Dynamic programming
Consider the n × n adjacency matrix A = (aij) of the digraph, and define
  dij(m) = weight of a shortest path from i to j that uses at most m edges.
Claim: We have
  dij(0) = 0 if i = j, ∞ if i ≠ j;
and for m = 1, 2, …, n − 1,
  dij(m) = mink { dik(m−1) + akj }.
Proof of claim
[Figure: a shortest path from i to j using at most m edges decomposes into a shortest path from i to some vertex k using at most m − 1 edges, followed by the single edge (k, j); minimizing over all k gives the recurrence.]
Relaxation!
for k ← 1 to n
    do if dij > dik + akj
          then dij ← dik + akj
Matrix multiplication
Compute C = A · B, where C, A, and B are n × n matrices:
  cij = Σ_{k=1}^{n} aik · bkj,
which takes n³ scalar multiplications. Now replace (+, ·) with (min, +):
  cij = mink { aik + bkj }.
Then dij(m) = mink { dik(m−1) + akj } is exactly this "multiplication": D(m) = D(m−1) ⊗ A, whose identity is the matrix with 0 on the diagonal and ∞ everywhere else:
  D(0) = (dij(0)).
Matrix multiplication (continued)
The (min, +) multiplication is associative, and with the real numbers it forms an algebraic structure called a closed semiring. Consequently, we can compute
  D(1) = D(0) ⊗ A = A¹
  D(2) = D(1) ⊗ A = A²
   ⋮
  D(n−1) = D(n−2) ⊗ A = A^{n−1},
yielding D(n−1) = (δ(i, j)). Time = Θ(n · n³) = Θ(n⁴). No better than n runs of Bellman-Ford.
Improved matrix-multiplication algorithm
Repeated squaring: A^{2k} = A^k ⊗ A^k.
Compute A², A⁴, …, A^{2^{⌈lg(n−1)⌉}} using O(lg n) squarings.
Note: A^{n−1} = A^n = A^{n+1} = ⋯.
Time = Θ(n³ lg n).
To detect negative-weight cycles, check the diagonal for negative values in O(n) additional time.
Floyd-Warshall algorithm
Also dynamic programming, but faster!
Define cij(k) = weight of a shortest path from i to j with intermediate vertices belonging to the set {1, 2, …, k}.
[Figure: a path from i to j whose intermediate vertices all lie in {1, …, k}.]
Thus, δ(i, j) = cij(n), and cij(0) = aij.
Floyd-Warshall recurrence
  cij(k) = min { cij(k−1), cik(k−1) + ckj(k−1) }
[Figure: either the shortest path from i to j avoids vertex k, or it passes through k, splitting into an i-to-k piece and a k-to-j piece whose intermediate vertices lie in {1, …, k−1}.]
Pseudocode for Floyd-Warshall:
for k ← 1 to n
    do for i ← 1 to n
           do for j ← 1 to n
                  do if cij > cik + ckj
                        then cij ← cik + ckj       ⊳ relaxation
Notes:
Okay to omit superscripts, since extra relaxations can't hurt.
Runs in Θ(n³) time.
Simple to code.
Efficient in practice.
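A minimal Python sketch of Floyd-Warshall on a dense weight matrix, assuming a[i][j] holds the edge weight (float('inf') for absent edges) and a[i][i] = 0:

def floyd_warshall(a):
    n = len(a)
    c = [row[:] for row in a]              # c starts as the adjacency matrix
    for k in range(n):                     # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                if c[i][k] + c[k][j] < c[i][j]:
                    c[i][j] = c[i][k] + c[k][j]   # relaxation
    return c                               # c[i][j] = delta(i, j)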
Transitive closure of a directed graph
Compute tij = 1 if there exists a path from i to j, 0 otherwise.
IDEA: Use Floyd-Warshall, but with (∨, ∧) in place of (min, +):
  tij(k) = tij(k−1) ∨ (tik(k−1) ∧ tkj(k−1)).
Time = Θ(n³).
Graph reweighting
Theorem. Given a function h : V → R, reweight each edge (u, v) ∈ E by wh(u, v) = w(u, v) + h(u) − h(v). Then, for any two vertices, all paths between them are reweighted by the same amount.
Proof. Let p = v1 → v2 → ⋯ → vk be a path in G. We have
  wh(p) = Σ_{i=1}^{k−1} wh(vi, vi+1)
        = Σ_{i=1}^{k−1} ( w(vi, vi+1) + h(vi) − h(vi+1) )
        = Σ_{i=1}^{k−1} w(vi, vi+1) + h(v1) − h(vk)        (the sum telescopes)
        = w(p) + h(v1) − h(vk).
Same amount for every path from v1 to vk!
Johnson's algorithm
1. Find a function h : V → R such that wh(u, v) ≥ 0 for all (u, v) ∈ E by using Bellman-Ford to solve the difference constraints h(v) − h(u) ≤ w(u, v), or determine that a negative-weight cycle exists. Time = O(VE).
2. Run Dijkstra's algorithm using wh from each vertex u ∈ V to compute δh(u, v) for all v ∈ V. Time = O(VE + V² lg V).
3. For each (u, v) ∈ V × V, compute δ(u, v) = δh(u, v) − h(u) + h(v). Time = O(V²).
Total time = O(VE + V² lg V).
Introduction to Algorithms
6.046J/18.401J
LECTURE 20
Quiz 2 Review
6.046 Staff
Introduction to Algorithms
6.046J/18.401J
LECTURE 21
Take-Home Quiz
Instructions
Academic honesty
Strategies for doing well
Take-home quiz
The take-home quiz contains 5 problems worth
25 points each, for a total of 125 points.
1 easy
2 moderate
1 hard
1 very hard
End of quiz
Your exam is due between 10:00 and
11:00 A.M. on Monday, November 22,
2004.
Late exams will not be accepted unless
you obtain a Dean's Excuse or make
prior arrangements with your
recitation instructor.
You must hand in your own exam in
person.
Planning
The quiz should take you about 12 hours to
do, but you have five days in which to do it.
Plan your time wisely. Do not overwork,
and get enough sleep.
Ample partial credit will be given for good
solutions, especially if they are well written.
The better your asymptotic running-time
bounds, the higher your score.
Bonus points will be given for exceptionally
efficient or elegant solutions.
Format
Each problem should be answered on
a separate sheet (or sheets) of 3-hole
punched paper.
Mark the top of each problem with
your name,
6.046J/18.410J,
the problem number,
your recitation time,
and your TA.
Executive summary
Your solution to a problem should start with
a topic paragraph that provides an executive
summary of your solution.
This executive summary should describe
the problem you are solving,
the techniques you use to solve it,
any important assumptions you make, and
the running time your algorithm achieves.
Solutions
Write up your solutions cleanly and concisely
to maximize the chance that we understand
them.
Be explicit about running time and algorithms.
For example, don't just say you sort n numbers,
state that you are using heapsort, which sorts the n
numbers in O(n lg n) time in the worst case.
Solutions
Give examples, and draw figures.
Provide succinct and convincing arguments
for the correctness of your solutions.
Do not regurgitate material presented in class.
Cite algorithms and theorems from CLRS,
lecture, and recitation to simplify your
solutions.
Assumptions
Part of the goal of this exam is to test
engineering common sense.
If you find that a question is unclear or
ambiguous, make reasonable assumptions
in order to solve the problem.
State clearly in your write-up what
assumptions you have made.
Be careful what you assume, however,
because you will receive little credit if you
make a strong assumption that renders a
problem trivial.
Bugs, etc.
If you think that you've found a bug, please send
email to 6.046 course staff.
Corrections and clarifications will be sent to the
class via email.
Check your email daily to avoid missing
potentially important announcements.
If you did not receive an email last night
reminding you about Quiz 2, then you are not on
the class email list. Please let your recitation
instructor know immediately.
Academic honesty
This quiz is limited open book.
You may use
your course notes,
the CLRS textbook,
lecture videos,
basic reference materials such as dictionaries,
and
any of the handouts posted on the course web
page.
No other sources whatsoever may be consulted!
Academic honesty
For example, you may not use notes or solutions
from other times that this course or other related
courses have been taught, or materials on the
Web.
These materials will not help you, but you may not
use them anyhow.
Academic honesty
If at any time you feel that you may have
violated this policy, it is imperative that you
contact the course staff immediately.
It will be much the worse for you if third parties
divulge your indiscretion.
If you have any questions about what resources
may or may not be used during the quiz, send
email to 6.046 course staff.
[Slide: results of an anonymous class poll on academic honesty. First question: 76 No, 1 Yes, 1 Abstain. Second question: 72 None; 2 answered "3 people compared answers"; 1 answered "Suspect 2, but don't know"; 1 answered "Either 0 or 2"; 1 Abstain; 1 answered "10" (the cheater).]
Reread instructions
Test-taking strategies
Manage your time.
Manage your psyche.
Brainstorm.
Write-up early and often.
Brainstorm
Get an upper bound, even if it is loose.
Look for analogies with problems you've seen.
Exploit special structure.
Solve a simpler problem.
Draw diagrams.
Contemplate.
Be wary of self-imposed constraints; think outside the box.
Work out small examples, and abstract.
Understand things in two ways: sanity checks.
Positive attitude