04 - OR2 - Dynamic Programming
Dynamic Programming
• Dynamic programming is a quantitative analysis technique
that has been applied to large, complex problems that
involve a sequence of decisions.
• Dynamic programming divides problems into a number of
decision stages; the outcome of a decision at one stage
affects the decision at each of the next stages.
• The technique is useful in a large number of multiperiod
business problems, such as:
– smoothing production and employment
– allocating capital funds
– allocating salespeople to marketing areas
– evaluating investment opportunities.
16/09/2021
Dynamic Programming
• Dynamic programming differs from linear programming in
two ways.
– First, there is no algorithm (like the simplex method) that can be
programmed to solve all problems.
Dynamic programming is, instead, a technique that allows us to break
up difficult problems into a sequence of easier subproblems, which
are then evaluated by stages.
– Second, linear programming is a method that gives single-stage (one
time period) solutions.
Dynamic programming, by contrast, can determine the optimal
solution over, say, a one-year time horizon by breaking the problem into
12 smaller one-month problems and solving each of these
optimally. Hence, it uses a multistage approach.
Shortest Path
[Figure: network diagram with nodes grouped into Stage 1, Stage 2, Stage 3 and Stage 4]
S(A,B)=2
S(A,C)=4
S(A,D)=3
----------------------
(A..D)-F: 4
1. (A..E)-I: 7+4=11
2. (A..F)-I: 4+3=7 (minimum)
3. (A..G)-I: 8+3=11
----------------------
(A..F)-I: 7
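The labelling steps above can be sketched as a forward pass in Python. This is a minimal illustration, not the slides' own procedure: the name `forward_labels` is hypothetical, and only the arc costs A-B, A-C, A-D, D-F, E-I, F-I and G-I are confirmed by the text. The remaining costs are assumed from the classic stagecoach example that this network appears to follow.

```python
# Arc costs. Confirmed by the text: A-B, A-C, A-D, D-F, E-I, F-I, G-I.
# All other values are ASSUMED (classic stagecoach network).
COSTS = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


def forward_labels(costs, start):
    """Return shortest distances from `start` and the best predecessor of each node."""
    dist = {start: 0}
    pred = {}
    # The dict lists nodes stage by stage, so a node's label is final
    # before its outgoing arcs are scanned.
    for node, arcs in costs.items():
        for nxt, c in arcs.items():
            candidate = dist[node] + c
            if nxt not in dist or candidate < dist[nxt]:
                dist[nxt] = candidate
                pred[nxt] = node
    return dist, pred


dist, pred = forward_labels(COSTS, "A")
print(dist["F"], pred["F"])  # 4 D
print(dist["I"], pred["I"])  # 7 F
print(dist["J"])             # 11
```

Under these assumed costs the labels agree with the values shown above: 4 for F (reached through D) and 7 for I (reached through F).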
Optimal Solution
End  Stage-4  Stage-3  Stage-2  Stage-1  Length of route
J    H        E        C        A        11
J    H        E        D        A        11
J    I        F        D        A        11
[Figure: network diagram with the shortest distance from A marked at each node, ending with 11 at J]
Characteristics of Dynamic Programming
1. The problem can be divided into stages, with a policy
decision required at each stage.
2. Each stage has a number of states associated with the
beginning of that stage.
3. The effect of the policy decision at each stage is to
transform the current state to a state associated with the
beginning of the next stage (possibly according to a
probability distribution).
4. The solution procedure is designed to find an optimal policy
for the overall problem, i.e., a prescription of the optimal
policy decision at each stage for each of the possible states.
5. Given the current state, an optimal policy for the remaining
stages is independent of the policy decisions adopted in
previous stages. Therefore, the optimal immediate decision
depends on only the current state and not on how you got
there. This is the principle of optimality for dynamic
programming.
6. The solution procedure begins by finding the optimal policy
for the last stage.
7. A recursive relationship that identifies the optimal policy
for stage n, given the optimal policy for stage n + 1, is
available.
• The precise form of the recursive relationship differs
somewhat among dynamic programming problems.
However, notation analogous to the following is generally used:
– $N$ = number of stages.
– $n$ = label for current stage ($n = 1, 2, \dots, N$).
– $s_n$ = current state for stage $n$.
– $x_n$ = decision variable for stage $n$.
– $x_n^*$ = optimal value of $x_n$ (given $s_n$).
– $f_n(s_n, x_n)$ = contribution of stages $n, n+1, \dots, N$ to the objective function if
the system starts in state $s_n$ at stage $n$, the immediate decision is $x_n$, and
optimal decisions are made thereafter.
– $f_n^*(s_n) = f_n(s_n, x_n^*)$.
• The recursive relationship will always be of the form:
$f_n^*(s_n) = \max_{x_n} \{ f_n(s_n, x_n) \}$ or $f_n^*(s_n) = \min_{x_n} \{ f_n(s_n, x_n) \}$
where $f_n(s_n, x_n)$ would be written in terms of $s_n$, $x_n$, $f_{n+1}^*(s_{n+1})$, and
probably some measure of the immediate contribution of $x_n$ to the
objective function.
It is the inclusion of $f_{n+1}^*(s_{n+1})$ on the right-hand side, so that $f_n^*(s_n)$ is
defined in terms of $f_{n+1}^*(s_{n+1})$, that makes the expression for $f_n^*(s_n)$ a
recursive relationship.
The recursive relationship keeps recurring as we move backward stage by
stage. When the current stage number $n$ is decreased by 1, the new
$f_n^*(s_n)$ function is derived by using the $f_{n+1}^*(s_{n+1})$ function that was just
derived during the preceding iteration, and then this process keeps
repeating.
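For problems whose objective is a sum of stage costs, as in the shortest-path example, the recursion typically takes the additive form sketched below. This is an illustration under that assumption, not the general case: the immediate-cost symbol $c_n(s_n, x_n)$ and the boundary condition $f_{N+1}^* = 0$ are notation introduced here, and $s_{n+1}$ denotes the state that results from taking decision $x_n$ in state $s_n$.

```latex
\begin{aligned}
f_n(s_n, x_n) &= c_n(s_n, x_n) + f_{n+1}^*(s_{n+1}), \\
f_n^*(s_n)    &= \min_{x_n} f_n(s_n, x_n), \qquad n = N, N-1, \dots, 1, \\
f_{N+1}^*(s_{N+1}) &= 0 .
\end{aligned}
```

For a maximization problem, the minimum is replaced by a maximum and the rest is unchanged.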
8. When we use this recursive relationship, the solution
procedure starts at the end and moves backward stage by
stage, each time finding the optimal policy for that stage,
until it finds the optimal policy starting at the initial stage.
This optimal policy immediately yields an optimal solution
for the entire problem, namely, $x_1^*$ for the initial state $s_1$,
then $x_2^*$ for the resulting state $s_2$, then $x_3^*$ for the resulting
state $s_3$, and so forth to $x_N^*$ for the resulting state $s_N$.
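The procedure in items 6 to 8 can be sketched generically in Python. This is a minimal illustration under the assumption that the objective is additive over stages. The names `solve_backward`, `trace_policy`, `decisions`, `cost` and `transition` are hypothetical, and the network used to exercise it takes its unconfirmed arc costs from the classic stagecoach example.

```python
def solve_backward(states, decisions, cost, transition, minimize=True):
    """Backward recursion for an additive objective.

    states[n-1]         : list of states at stage n (n = 1..N)
    decisions(n, s)     : feasible decisions x_n in state s
    cost(n, s, x)       : immediate contribution of x
    transition(n, s, x) : resulting state s_{n+1}
    Returns f_star[n][s] and x_star[n][s], both 1-based in n.
    """
    pick = min if minimize else max
    N = len(states)
    f_star, x_star = {}, {}
    for n in range(N, 0, -1):               # last stage first
        f_star[n], x_star[n] = {}, {}
        for s in states[n - 1]:
            def value(x):
                # Nothing remains after stage N, so the future term is 0 there.
                future = f_star[n + 1][transition(n, s, x)] if n < N else 0
                return cost(n, s, x) + future
            best = pick(decisions(n, s), key=value)
            f_star[n][s], x_star[n][s] = value(best), best
    return f_star, x_star


def trace_policy(x_star, transition, s1):
    """Read off x_1*, x_2*, ... by moving forward from the initial state."""
    s, plan = s1, []
    for n in sorted(x_star):
        x = x_star[n][s]
        plan.append(x)
        s = transition(n, s, x)
    return plan


# ASSUMED arc costs (classic stagecoach network); the decision is the next node.
COSTS = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6}, "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4}, "F": {"H": 6, "I": 3}, "G": {"H": 3, "I": 3},
    "H": {"J": 3}, "I": {"J": 4},
}
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]

f_star, x_star = solve_backward(
    STAGES,
    decisions=lambda n, s: list(COSTS[s]),
    cost=lambda n, s, x: COSTS[s][x],
    transition=lambda n, s, x: x,
)
plan = trace_policy(x_star, lambda n, s, x: x, "A")
print(f_star[1]["A"], plan)  # 11 ['C', 'E', 'H', 'J']
```

Ties are broken by taking the first optimal decision, so this sketch returns one optimal policy rather than all of them.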
Backward Approach
• Stage 4: $f_4^*(s) = c_{s x_4}$
s    $f_4^*(s)$    $x_4^*$
H    3             J
I    4             J
• Stage 3: $f_3^*(s) = \min_{x_3} \{ c_{s x_3} + f_4^*(x_3) \}$
• Stage 2: $f_2^*(s) = \min_{x_2} \{ c_{s x_2} + f_3^*(x_2) \}$
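The stage tables for this backward pass can be reproduced with a short bottom-up loop. This is a sketch under stated assumptions: the function name `backward_tables` is hypothetical, and only the H-J and I-J costs (the Stage 4 table) and the A-B, A-C, A-D, D-F, E-I, F-I and G-I costs are confirmed by the text. The other arc costs are assumed from the classic stagecoach example.

```python
# Arc costs c[s][x]; values not confirmed by the text are ASSUMED.
COSTS = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}
STAGES = [["A"], ["B", "C", "D"], ["E", "F", "G"], ["H", "I"]]  # stages 1..4


def backward_tables(costs, stages, destination):
    """Compute f*(s) and every optimal decision x* for each state, last stage first."""
    f = {destination: 0}   # nothing left to travel once at the destination
    best = {}              # state -> list of optimal next nodes (ties kept)
    for states in reversed(stages):
        for s in states:
            totals = {x: c + f[x] for x, c in costs[s].items()}
            f[s] = min(totals.values())
            best[s] = [x for x, t in totals.items() if t == f[s]]
    return f, best


f, best = backward_tables(COSTS, STAGES, "J")
for n, states in reversed(list(enumerate(STAGES, start=1))):
    for s in states:
        print(f"Stage {n}: f*({s}) = {f[s]}, x* = {' or '.join(best[s])}")
```

Under the assumed costs, the Stage 4 rows match the table above (3 for H, 4 for I), and Stage 1 gives 11 for A with C or D as the optimal first decision.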
Optimal Solution
Start  x1  x2  x3  x4  Length of route
A      C   E   H   J   11
A      D   E   H   J   11
A      D   F   I   J   11
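The three tied routes in this table can be enumerated by following every optimal decision rather than just the first one. This is a minimal top-down sketch using memoization: the names `f_star` and `optimal_routes` are hypothetical, and arc costs not confirmed by the text are assumed from the classic stagecoach example.

```python
from functools import lru_cache

# Arc costs; values not confirmed by the text are ASSUMED.
COSTS = {
    "A": {"B": 2, "C": 4, "D": 3},
    "B": {"E": 7, "F": 4, "G": 6},
    "C": {"E": 3, "F": 2, "G": 4},
    "D": {"E": 4, "F": 1, "G": 5},
    "E": {"H": 1, "I": 4},
    "F": {"H": 6, "I": 3},
    "G": {"H": 3, "I": 3},
    "H": {"J": 3},
    "I": {"J": 4},
}


@lru_cache(maxsize=None)
def f_star(s):
    """Minimum remaining distance from state s to the destination J."""
    if s == "J":
        return 0
    return min(c + f_star(x) for x, c in COSTS[s].items())


def optimal_routes(s):
    """Yield every minimum-length route from s to J, following all ties."""
    if s == "J":
        yield [s]
        return
    for x, c in COSTS[s].items():
        if c + f_star(x) == f_star(s):   # x is an optimal decision in state s
            for rest in optimal_routes(x):
                yield [s] + rest


routes = list(optimal_routes("A"))
for route in routes:
    print(" -> ".join(route), f_star("A"))
# A -> C -> E -> H -> J 11
# A -> D -> E -> H -> J 11
# A -> D -> F -> I -> J 11
```

The recursion relies on the principle of optimality: a decision is kept only if its immediate cost plus the optimal remaining distance equals the optimal value for the current state.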