Module 1 Dynamic Programming

The document discusses dynamic programming and its application to finding the longest common subsequence (LCS) between two strings. Dynamic programming is an optimization technique that works bottom-up by breaking down a problem into overlapping subproblems and storing the results to avoid recomputing them. To find the LCS of two strings X and Y, the algorithm uses dynamic programming to fill a 2D table where each entry c[i,j] represents the length of the LCS of the prefixes Xi and Yj. The table is filled recursively based on whether the current characters match.

Uploaded by

Badri

Module-1: Dynamic programming

–Example:
• Longest Common Subsequence - LCS

1
Dynamic programming

It is used when the solution can be recursively described in terms of solutions to subproblems (optimal substructure)
Algorithm finds solutions to subproblems and stores them in memory for later use
More efficient than “brute-force methods”, which solve the same subproblems over and over again

2
Optimal Substructure Property
Definition
– If S is an optimal solution to a problem, then the components of S are optimal solutions to subproblems
Examples:
– True for knapsack
– True for coin-changing
– True for single-source shortest path
– Not true for longest-simple-path

3
Dynamic Programming

Works “bottom-up”
– Finds solutions to small sub-problems first
– Stores them
– Combines them somehow to find a solution to a slightly larger subproblem
Compare to greedy approach
– Also requires optimal substructure
– But greedy makes a choice first, then solves the subproblem that remains

4
Dynamic Programming

5
Overlapping sub-problems - Example

6
Remember Fibonacci numbers?
Recursive code:

#include <assert.h>

long fib(int n) {
    assert(n >= 0);
    if (n == 0) return 0;
    if (n == 1) return 1;
    return fib(n - 1) + fib(n - 2);
}
What’s the problem?
– Repeatedly solves the same sub problems
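To see the repetition concretely, the sketch below adds a call counter to the same recursion. The names fib_counted and fib_calls are illustrative and do not come from the slides.

```c
#include <assert.h>

/* Number of invocations so far; reset it before each experiment. */
static long fib_calls = 0;

/* Same recursion as fib(), but counts every call it makes. */
long fib_counted(int n) {
    assert(n >= 0);
    fib_calls++;
    if (n == 0) return 0;
    if (n == 1) return 1;
    return fib_counted(n - 1) + fib_counted(n - 2);
}
```

With fib_calls reset to 0, fib_counted(10) returns 55 after 177 calls, although there are only 11 distinct subproblems fib(0) .. fib(10).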

7
Memoization
Before talking about dynamic programming, another general technique: Memoization
– uses a memory function
Simple idea:
– Calculate and store solutions to subproblems
– Before solving it (again), look to see if you’ve remembered it
8
Memoization
Use a Table abstract data type
– Lookup key: whatever identifies a subproblem
– Value stored: the solution
Could be an array/vector
– E.g. for Fibonacci, store fib(n) using index n
– Need to initialize the array
Could use a map / hash-table

9
Memoization and Fibonacci

Before the recursive code below is called, results[] must be initialized so that all values are -1

long fib_mem(int n, long results[]) {
    if (results[n] != -1)
        return results[n];          // return stored value
    long val;
    if (n == 0 || n == 1)
        val = n;                    // base cases: fib(0) = 0, fib(1) = 1
    else
        val = fib_mem(n - 1, results) + fib_mem(n - 2, results);
    results[n] = val;               // store calculated value
    return val;
}
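The initialization step can be wrapped in a small driver. The sketch below repeats fib_mem so that it compiles on its own; the driver name fib_with_table and its -1 error return are assumptions made for illustration, not part of the slides.

```c
#include <assert.h>
#include <stdlib.h>

/* Memoized Fibonacci from the slide: results[k] == -1 means "not computed yet". */
long fib_mem(int n, long results[]) {
    if (results[n] != -1)
        return results[n];
    long val;
    if (n == 0 || n == 1)
        val = n;
    else
        val = fib_mem(n - 1, results) + fib_mem(n - 2, results);
    results[n] = val;
    return val;
}

/* Hypothetical driver: allocates the table, marks every entry as unknown,
   runs fib_mem, and frees the table. Returns -1 if allocation fails. */
long fib_with_table(int n) {
    assert(n >= 0);
    long *results = malloc((size_t)(n + 1) * sizeof *results);
    if (results == NULL)
        return -1;
    for (int i = 0; i <= n; i++)
        results[i] = -1;
    long val = fib_mem(n, results);
    free(results);
    return val;
}
```

A caller then only writes fib_with_table(40) and never touches the table directly.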

10
Observations on fib_mem()
Same elegant top-down, recursive approach based on definition
– Without repeated subproblems
Memory function: a function that remembers
– Save time by using extra space
Can show this runs in Θ(n)

11
General Strategy of Dynamic Programming
1. Structure: What’s the structure of an optimal solution in terms of solutions to its subproblems?
2. Give a recursive definition of an optimal solution in terms of optimal solutions to smaller problems
– Usually using min or max
3. Use a data structure (often a table) to store smaller solutions in a bottom-up fashion
– Optimal value found in the table
4. (If needed) Reconstruct the optimal solution
– I.e. what produced the optimal value

12
Dyn. Prog. vs. Divide and Conquer

Remember D & C?
– Divide into subproblems. Solve each. Combine.
Good when subproblems do not overlap, when they’re independent
– No need to repeat them
Divide and conquer: top-down
Dynamic programming: bottom-up
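As a contrast to the top-down fib_mem, a bottom-up version might look like the sketch below. The function name fib_bottom_up and the -1 return on allocation failure are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Bottom-up Fibonacci: fill table[0..n] from the smallest subproblem upward,
   so every entry is computed exactly once and no recursion is needed. */
long fib_bottom_up(int n) {
    assert(n >= 0);
    if (n < 2)
        return n;
    long *table = malloc((size_t)(n + 1) * sizeof *table);
    if (table == NULL)
        return -1;
    table[0] = 0;
    table[1] = 1;
    for (int i = 2; i <= n; i++)
        table[i] = table[i - 1] + table[i - 2];
    long result = table[n];
    free(table);
    return result;
}
```

The loop visits the subproblems in increasing order of n, which is the same order in which the LCS table is filled later in this module.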

13
Longest Common Subsequence (LCS)
Application: comparison of two DNA strings
Ex: X = {A B C B D A B}, Y = {B D C A B A}
Longest Common Subsequence: B C B A
X = A B   C   B D A B
Y =   B D C A B   A
Brute force algorithm would compare each subsequence of X with the symbols in Y

Let S1 = {B, C, D, A, A, C, D} and S2 = {A, C, D, B, A, C}. Then the common subsequences are {B, C}, {C, D, A, C}, {D, A, C}, {A, A, C}, {A, C}, {C, D}, ...
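Checking whether one string is a subsequence of another takes a single left-to-right scan. The helper below is a minimal sketch; the name is_subsequence is an assumption and does not appear in the slides.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Returns true if every symbol of s appears in t in the same order
   (not necessarily contiguously). */
bool is_subsequence(const char *s, const char *t) {
    assert(s != NULL && t != NULL);
    while (*s != '\0' && *t != '\0') {
        if (*s == *t)
            s++;      /* matched this symbol of s; look for the next one */
        t++;
    }
    return *s == '\0';
}
```

For the strings above, is_subsequence("CDAC", "BCDAACD") and is_subsequence("CDAC", "ACDBAC") both hold, so {C, D, A, C} is a common subsequence of S1 and S2.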

14
LCS Algorithm
If |X| = m, |Y| = n, then there are 2^m subsequences of X; we must compare each with Y (n comparisons)
So the running time of the brute-force algorithm is O(n 2^m)
Notice that the LCS problem has optimal substructure: solutions of subproblems are parts of the final solution.
Subproblems: “find LCS of pairs of prefixes of X and Y”
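The brute-force idea can be sketched by enumerating every subset of positions of X with a bitmask. This is an illustrative sketch only: the name lcs_brute_force is an assumption, and the bitmask limits X to fewer than 31 symbols.

```c
#include <assert.h>
#include <string.h>

/* Brute-force LCS length: try every subset of X's positions (2^m bitmasks)
   and check the selected subsequence against Y with one scan, O(m + n) each. */
int lcs_brute_force(const char *x, const char *y) {
    assert(x != NULL && y != NULL);
    size_t m = strlen(x), n = strlen(y);
    assert(m < 31);  /* keep the bitmask within an unsigned long */
    int best = 0;
    for (unsigned long mask = 0; mask < (1UL << m); mask++) {
        size_t j = 0;   /* next unread position in y */
        int len = 0;    /* selected symbols matched so far */
        int ok = 1;     /* does the selected subsequence occur in y? */
        for (size_t i = 0; i < m && ok; i++) {
            if (!(mask & (1UL << i)))
                continue;                    /* x[i] is not selected */
            while (j < n && y[j] != x[i])
                j++;
            if (j == n) {
                ok = 0;                      /* ran out of y */
            } else {
                j++;
                len++;
            }
        }
        if (ok && len > best)
            best = len;
    }
    return best;
}
```

It is only practical for short X, since the outer loop already runs 2^m times.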
15
LCS Algorithm
First we’ll find the length of LCS. Later we’ll modify the algorithm to find LCS itself.
Define Xi, Yj to be the prefixes of X and Y of length i and j respectively
Define c[i,j] to be the length of LCS of Xi and Yj
Then the length of LCS of X and Y will be c[m,n]
$$
c[i,j] = \begin{cases} c[i-1,j-1] + 1 & \text{if } x[i] = y[j], \\ \max(c[i,j-1],\ c[i-1,j]) & \text{otherwise} \end{cases}
$$
16
LCS recursive solution
$$
c[i,j] = \begin{cases} c[i-1,j-1] + 1 & \text{if } x[i] = y[j], \\ \max(c[i,j-1],\ c[i-1,j]) & \text{otherwise} \end{cases}
$$
We start with i = j = 0 (empty prefixes of x and y)
Since X0 and Y0 are empty strings, their LCS is always empty (i.e. c[0,0] = 0)
LCS of empty string and any other string is empty, so for every i and j: c[0, j] = c[i,0] = 0
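The recurrence and its base cases translate almost directly into a memoized recursive function. The sketch below is one possible reading; the names lcs_rec and lcs_length_top_down, the flat memo array, and the -1 "unknown" marker are assumptions. C strings are 0-based, so the slides' x[i] becomes x[i - 1].

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* c[i,j] by the recurrence; memo[i * cols + j] == -1 means "not computed yet". */
static int lcs_rec(const char *x, const char *y, int i, int j,
                   int *memo, int cols) {
    if (i == 0 || j == 0)
        return 0;                              /* base case: an empty prefix */
    int *slot = &memo[i * cols + j];
    if (*slot != -1)
        return *slot;                          /* already solved */
    int val;
    if (x[i - 1] == y[j - 1]) {                /* slides' x[i] = y[j] */
        val = lcs_rec(x, y, i - 1, j - 1, memo, cols) + 1;
    } else {
        int left = lcs_rec(x, y, i, j - 1, memo, cols);
        int up = lcs_rec(x, y, i - 1, j, memo, cols);
        val = left > up ? left : up;
    }
    *slot = val;
    return val;
}

/* Returns the LCS length of x and y, or -1 if the memo cannot be allocated. */
int lcs_length_top_down(const char *x, const char *y) {
    assert(x != NULL && y != NULL);
    int m = (int)strlen(x), n = (int)strlen(y), cols = n + 1;
    size_t cells = (size_t)(m + 1) * (size_t)cols;
    int *memo = malloc(cells * sizeof *memo);
    if (memo == NULL)
        return -1;
    for (size_t k = 0; k < cells; k++)
        memo[k] = -1;
    int result = lcs_rec(x, y, m, n, memo, cols);
    free(memo);
    return result;
}
```

This is the top-down counterpart of the table-filling algorithm that follows; both evaluate the same recurrence.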
17
LCS recursive solution
$$
c[i,j] = \begin{cases} c[i-1,j-1] + 1 & \text{if } x[i] = y[j], \\ \max(c[i,j-1],\ c[i-1,j]) & \text{otherwise} \end{cases}
$$
When we calculate c[i,j], we consider two cases:
First case: x[i] = y[j]: one more symbol in strings X and Y matches, so the length of LCS of Xi and Yj equals the length of LCS of the smaller strings Xi-1 and Yj-1, plus 1
18
LCS recursive solution
$$
c[i,j] = \begin{cases} c[i-1,j-1] + 1 & \text{if } x[i] = y[j], \\ \max(c[i,j-1],\ c[i-1,j]) & \text{otherwise} \end{cases}
$$
Second case: x[i] != y[j]
As symbols don’t match, our solution is not improved, and the length of LCS(Xi, Yj) is the same as before (i.e. the maximum of LCS(Xi, Yj-1) and LCS(Xi-1, Yj))

19
LCS Length Algorithm
LCS-Length(X, Y)
m = length(X) // get the # of symbols in X
n = length(Y) // get the # of symbols in Y
for i = 0 to m c[i,0] = 0 // special case: Y0
for j = 0 to n c[0,j] = 0 // special case: X0
for i = 1 to m // for all Xi
    for j = 1 to n // for all Yj
        if ( x[i] == y[j] )
            c[i,j] = c[i-1,j-1] + 1
        else c[i,j] = max( c[i-1,j], c[i,j-1] )
return c[m,n] // return LCS length for X and Y
20
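The LCS-Length pseudocode can be written in C roughly as follows. This is a sketch under a few assumptions: the function name lcs_length, a flat row-major array standing in for c[i,j], and a -1 return if the table cannot be allocated. C strings are 0-based, so the slides' x[i] becomes x[i - 1].

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of the LCS of x and y, filled bottom-up.
   The table is stored row-major: c[i,j] lives at c[i * cols + j]. */
int lcs_length(const char *x, const char *y) {
    assert(x != NULL && y != NULL);
    int m = (int)strlen(x);
    int n = (int)strlen(y);
    int cols = n + 1;
    /* calloc zero-fills, which sets row 0 and column 0 (the empty prefixes). */
    int *c = calloc((size_t)(m + 1) * (size_t)cols, sizeof *c);
    if (c == NULL)
        return -1;
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (x[i - 1] == y[j - 1]) {
                c[i * cols + j] = c[(i - 1) * cols + (j - 1)] + 1;
            } else {
                int up = c[(i - 1) * cols + j];
                int left = c[i * cols + (j - 1)];
                c[i * cols + j] = up > left ? up : left;
            }
        }
    }
    int result = c[m * cols + n];
    free(c);
    return result;
}
```

For X = ABCB and Y = BDCAB, the example worked through on the following slides, lcs_length returns 3.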
LCS Example
We’ll see how LCS algorithm works on the
following example:
X = ABCB
Y = BDCAB

What is the Longest Common Subsequence of X and Y?
LCS(X, Y) = BCB
X = A B   C   B
Y =   B D C A B
21
LCS Example (0)

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0
1   A
2   B
3   C
4   B

X = ABCB; m = |X| = 4
Y = BDCAB; n = |Y| = 5
Allocate array c[0..4, 0..5] (m+1 = 5 rows, n+1 = 6 columns)
22
LCS Example (1)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0
2   B     0
3   C     0
4   B     0

for i = 0 to m c[i,0] = 0
for j = 0 to n c[0,j] = 0
23
LCS Example (2)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0
2   B     0
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

24
LCS Example (3)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0
2   B     0
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

25
LCS Example (4)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1
2   B     0
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

26
LCS Example (5)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

27
LCS Example (6)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

28
LCS Example (7)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

29
LCS Example (8)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

30
LCS Example (10)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

31
LCS Example (11)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

32
LCS Example (12)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2   2   2
4   B     0

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

33
LCS Example (13)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2   2   2
4   B     0   1

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

34
LCS Example (14)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2   2   2
4   B     0   1   1   2   2

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

35
LCS Example (15)
X = ABCB, Y = BDCAB

j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2   2   2
4   B     0   1   1   2   2   3

if ( x[i] == y[j] )
    c[i,j] = c[i-1,j-1] + 1
else c[i,j] = max( c[i-1,j], c[i,j-1] )

36
LCS Algorithm Running Time

LCS algorithm calculates the value of each entry of the array c[0..m, 0..n]
So what is the running time?
O(m*n), since each c[i,j] is calculated in constant time, and there are (m+1)*(n+1) elements in the array
37
How to find actual LCS
So far, we have just found the length of LCS, but not LCS itself.
We want to modify this algorithm to make it output Longest Common Subsequence of X and Y
Each c[i,j] depends on c[i-1,j] and c[i,j-1] or c[i-1, j-1]
For each c[i,j] we can say how it was acquired. For example, here c[i,j] = c[i-1,j-1] + 1 = 2 + 1 = 3:

2   2
2   3

38
How to find actual LCS - continued
Remember that
$$
c[i,j] = \begin{cases} c[i-1,j-1] + 1 & \text{if } x[i] = y[j], \\ \max(c[i,j-1],\ c[i-1,j]) & \text{otherwise} \end{cases}
$$
So we can start from c[m,n] and go backwards
Look first to see if 2nd case above was true
If not, then c[i,j] = c[i-1, j-1]+1, so remember x[i] (because x[i] is a part of LCS)
When i=0 or j=0 (i.e. we reached the beginning), output remembered letters in reverse order
39
Algorithm to find actual LCS
Here’s a recursive algorithm to do this (x is indexed from 1, as in the recurrence):

LCS_print(x, m, n, c) {
    if (m == 0 || n == 0)                // reached the beginning: stop
        return;
    if (c[m][n] == c[m-1][n])            // go up?
        LCS_print(x, m-1, n, c);
    else if (c[m][n] == c[m][n-1])       // go left?
        LCS_print(x, m, n-1, c);
    else {                               // it was a match!
        LCS_print(x, m-1, n-1, c);
        print(x[m]);                     // print after recursive call
    }
}
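A self-contained C variant of the same traceback is sketched below. It differs from LCS_print in two ways that are choices of this sketch, not of the slides: it walks back iteratively rather than recursively, and it writes the result into a caller-supplied buffer instead of printing. The name lcs_string and its -1 error return are assumptions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Writes one LCS of x and y into out (NUL-terminated) and returns its length.
   Returns -1 if allocation fails or out_size is too small. */
int lcs_string(const char *x, const char *y, char *out, size_t out_size) {
    assert(x != NULL && y != NULL && out != NULL);
    int m = (int)strlen(x), n = (int)strlen(y), cols = n + 1;
    int *c = calloc((size_t)(m + 1) * (size_t)cols, sizeof *c);
    if (c == NULL)
        return -1;
    /* Fill the length table exactly as in LCS-Length. */
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            int up = c[(i - 1) * cols + j];
            int left = c[i * cols + (j - 1)];
            c[i * cols + j] = (x[i - 1] == y[j - 1])
                ? c[(i - 1) * cols + (j - 1)] + 1
                : (up > left ? up : left);
        }
    }
    int len = c[m * cols + n];
    if ((size_t)len + 1 > out_size) {
        free(c);
        return -1;
    }
    /* Walk back from c[m,n]. Matches are found last-to-first, so fill out[]
       from the right instead of reversing afterwards. */
    out[len] = '\0';
    int i = m, j = n, k = len;
    while (i > 0 && j > 0) {
        if (c[i * cols + j] == c[(i - 1) * cols + j]) {
            i--;                               /* go up */
        } else if (c[i * cols + j] == c[i * cols + (j - 1)]) {
            j--;                               /* go left */
        } else {
            out[--k] = x[i - 1];               /* it was a match */
            i--;
            j--;
        }
    }
    free(c);
    return len;
}
```

With a 16-byte buffer, lcs_string("ABCB", "BDCAB", buf, sizeof buf) returns 3 and leaves "BCB" in buf.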

40
Finding LCS
j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2   2   2
4   B     0   1   1   2   2   3

41
Finding LCS (2)
j:        0   1   2   3   4   5
Yj:           B   D   C   A   B
i   Xi
0         0   0   0   0   0   0
1   A     0   0   0   0   1   1
2   B     0   1   1   1   1   2
3   C     0   1   1   2   2   2
4   B     0   1   1   2   2   3

LCS (reversed order): B C B
LCS (straight order): B C B
(this string turned out to be a palindrome)
42
Review: Dynamic programming
DP is a method for solving certain kinds of problems
DP can be applied when the solution of a problem includes solutions to subproblems
We need to find a recursive formula for the solution
We can recursively solve subproblems, starting from the trivial case, and save their solutions in memory
In the end we’ll get the solution of the whole problem
43
Properties of a problem that can be solved with dynamic programming

Simple Subproblems
– We should be able to break the original problem into smaller subproblems that have the same structure
Optimal Substructure of the problems
– The solution to the problem must be a composition of subproblem solutions
Subproblem Overlap
– Optimal solutions to unrelated subproblems can contain subproblems in common
44
Review: Longest Common Subsequence (LCS)
Problem: how to find the longest pattern of characters that is common to two text strings X and Y
Dynamic programming algorithm: solve subproblems until we get the final solution
Subproblem: first find the LCS of prefixes of X and Y.
This problem has optimal substructure: LCS of two prefixes is always a part of LCS of bigger strings
45
Conclusion
Dynamic programming is a useful technique for solving certain kinds of problems
When the solution can be recursively described in terms of partial solutions, we can store these partial solutions and re-use them as necessary
Running time (Dynamic Programming algorithm vs. naive algorithm):
– LCS: O(m*n) vs. O(n * 2^m)
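For a rough sense of scale, the two bounds can be evaluated for concrete sizes. The figures below ignore constant factors and only compare the growth terms; the first line uses the sizes from the worked example (m = 4, n = 5), and the second uses m = n = 20 as an arbitrary illustration.

```latex
\begin{align*}
m = 4,\ n = 5:   &\quad m \cdot n = 20,  &\quad n \cdot 2^{m} &= 5 \cdot 16 = 80 \\
m = 20,\ n = 20: &\quad m \cdot n = 400, &\quad n \cdot 2^{m} &= 20 \cdot 1048576 = 20971520
\end{align*}
```

The gap is a factor of 4 for the small example and more than 50,000 for strings of length 20.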

46
