Dynamic Programming 2

This document discusses deterministic and probabilistic dynamic programming. Deterministic dynamic programming models decision processes where the next state is fully determined by the current state and action. Probabilistic dynamic programming accounts for uncertainty, where the next state depends on the current state and action according to known probabilities. Examples are provided to illustrate how to formulate problems, define value and policy functions recursively, and solve for the optimal policy using backward induction.


Dynamic Programming

(Part 2)
TI 2102
Optimization Mathematics
Deterministic Dynamic Programming
 Deterministic dynamic programming can be described diagrammatically as shown below.
 Making policy decision xn then moves the process to some state sn+1 at stage n + 1.
 The contribution thereafter to the objective function under an optimal policy has been previously calculated to be f*n+1(sn+1).
 The policy decision xn also makes some contribution to the objective function.
 Combining these two quantities in an appropriate way provides fn(sn, xn), the contribution of stages n onward to the objective function.
 Optimizing with respect to xn then gives fn*(sn) = fn(sn, xn*).
 After xn* and fn*(sn) are found for each possible value of sn, the solution procedure is ready to move back one stage.
 One way of categorizing deterministic dynamic programming problems is by the form of the objective function.
 Another categorization is in terms of the nature of the set of states for the respective stages.
 In particular, states sn might be representable by a discrete state variable (as for the stagecoach problem) or by a continuous state variable, or perhaps a state vector (more than one variable) is required.
 Several examples are presented to illustrate these various possibilities.
 More importantly, they illustrate that these apparently major differences are actually quite inconsequential (except in terms of computational difficulty) because the underlying basic structure always remains the same.
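The backward pass described in the bullets above can be sketched as a small Python routine. This is a generic sketch, not from the slides: it assumes the common additive case, where fn(sn, xn) = stage cost + f*n+1(next state), and the toy states, transition, and costs in the example are hypothetical values chosen only to make the sketch runnable.

```python
def backward_induction(stages, states, decisions, transition, cost):
    """Return (f_star, x_star): optimal value and decision for each (stage, state)."""
    f_star = {(stages, s): 0.0 for s in states}   # boundary: nothing after the last stage
    x_star = {}
    for n in range(stages - 1, -1, -1):           # move back one stage at a time
        for s in states:
            best_val, best_x = float("inf"), None
            for x in decisions(n, s):
                # fn(sn, xn) = immediate cost + previously computed f*n+1(sn+1)
                val = cost(n, s, x) + f_star[(n + 1, transition(n, s, x))]
                if val < best_val:
                    best_val, best_x = val, x
            f_star[(n, s)] = best_val             # fn*(sn) = fn(sn, xn*)
            x_star[(n, s)] = best_x
    return f_star, x_star

# Toy instance (hypothetical data): 3 stages, states 0..3, decisions {0, 1},
# transition s -> (s + x) % 4, stage cost x + 1.
f_star, x_star = backward_induction(
    stages=3,
    states=range(4),
    decisions=lambda n, s: (0, 1),
    transition=lambda n, s, x: (s + x) % 4,
    cost=lambda n, s, x: x + 1,
)
```

Because the callbacks carry all problem-specific detail, the same routine covers discrete states, continuous-state discretizations, or state vectors, which is the point made above about the underlying structure staying the same.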
Deterministic Dynamic Programming: An Example
Distributing Scientists to Research Teams
A government space project is conducting research on a certain engineering problem that must be solved before people can fly safely to Mars. Three research teams are currently trying three different approaches for solving this problem. The estimate has been made that, under present circumstances, the probability that the respective teams—call them 1, 2, and 3—will not succeed is 0.40, 0.60, and 0.80, respectively. Thus, the current probability that all three teams will fail is (0.40)(0.60)(0.80) = 0.192. Because the objective is to minimize the probability of failure, two more top scientists have been assigned to the project.
Only integer numbers of scientists are considered because each new scientist will need to devote full attention to one team. The problem is to determine how to allocate the two additional scientists to minimize the probability that all three teams will fail.
Problem Formulation
 In this case, stage n (n = 1, 2, 3) corresponds to research team n, and the state sn is the number of new scientists still available for allocation to the remaining teams.
 The decision variables xn (n = 1, 2, 3) are the number of additional scientists allocated to team n.
 Let pi(xi) denote the probability of failure for team i if it is assigned xi additional scientists, as given in the previous table.
 If we let ∏ denote multiplication, the government’s objective is to choose x1, x2, x3 so as to

Minimize p1(x1) p2(x2) p3(x3),

subject to:

x1 + x2 + x3 = 2,

where the xi are nonnegative integers.

 Consequently, fn(sn, xn) for this problem is

fn(sn, xn) = pn(xn) · min [pn+1(xn+1) · · · p3(x3)],

where the minimum is taken over xn+1, . . . , x3 such that

xn+1 + · · · + x3 = sn - xn.
 For n = 1, 2, 3. Thus,

fn*(sn) = min over xn = 0, 1, . . . , sn of fn(sn, xn),

where

fn(sn, xn) = pn(xn) · f*n+1(sn - xn),

with f4* defined to be 1.

 The recursive relationship relating the f1*, f2*, and f3* functions in this case is

fn*(sn) = min over xn = 0, 1, . . . , sn of pn(xn) · f*n+1(sn - xn),   for n = 1, 2;

when n = 3,

f3*(s3) = p3(s3), with x3* = s3.
Solution Procedure
Stage 3 (n = 3)

Stage 2 (n = 2)
Stage 1 (n = 1)

 The optimal solution must have x1* = 1, which makes s2 = 2 - 1 = 1, so that x2* = 0, which makes s3 = 1 - 0 = 1, so that x3* = 1.
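The stage-by-stage tables can be reproduced with a short recursion. One caveat: only the zero-allocation failure probabilities (0.40, 0.60, 0.80) are stated in the text, so the other two columns of the `p` table below are assumed values used purely to make the sketch runnable; the recursion itself is the one formulated above.

```python
# Failure probabilities p_i(x_i) for x_i = 0, 1, 2 additional scientists.
# Only the x_i = 0 column (0.40, 0.60, 0.80) appears in the text; the other
# columns are hypothetical values assumed for illustration.
p = {
    1: [0.40, 0.20, 0.15],
    2: [0.60, 0.40, 0.20],
    3: [0.80, 0.50, 0.30],
}

def f_star(n, s):
    """Minimum probability that teams n..3 all fail, with s scientists left."""
    if n == 4:
        return 1.0, ()                       # boundary condition: f4* = 1
    best_val, best_plan = float("inf"), ()
    for x in range(s + 1):                   # try every allocation to team n
        tail_val, tail_plan = f_star(n + 1, s - x)
        val = p[n][x] * tail_val             # fn(sn, xn) = pn(xn) * f*n+1(sn - xn)
        if val < best_val:
            best_val, best_plan = val, (x,) + tail_plan
    return best_val, best_plan

value, plan = f_star(1, 2)                   # two extra scientists, three teams
```

Under the assumed table this recovers the allocation x1* = 1, x2* = 0, x3* = 1 stated above, with a minimum failure probability of 0.060; with a different completion of the table the numbers would of course differ.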
Mid Exercise Example
(The model and the answers for Stages 3, 2, and 1 appeared here as worked tables.)
Another Exercise Example
(The formulation and the answers for Stages 4 through 1 appeared here as worked tables.)
Probabilistic Dynamic Programming
 Probabilistic dynamic programming differs from deterministic dynamic programming in that the state at the next stage is not completely determined by the state and policy decision at the current stage.
 There is a probability distribution for what the next state will be.
 However, this probability distribution still is completely determined by the state and policy decision at the current stage.
 The resulting basic structure for probabilistic dynamic programming is described diagrammatically in the next figure.
 Let S denote the number of possible states at stage n + 1 and label these states on the right side as 1, 2, . . . , S.
 The system goes to state i with probability pi (i = 1, 2, . . . , S) given state sn and decision xn at stage n.
 If the system goes to state i, Ci is the contribution of stage n to the objective function.
 Because of the probabilistic structure, the relationship between fn(sn, xn) and the f*n+1(sn+1) necessarily is somewhat more complicated than that for deterministic dynamic programming.
 To illustrate, suppose that the objective is to minimize the expected sum of the contributions from the individual stages.
 In this case, fn(sn, xn) represents the minimum expected sum from stage n onward, given that the state and policy decision at stage n are sn and xn, respectively.
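A minimal sketch of this expected-value recursion, fn(sn, xn) = Σi pi · (Ci + f*n+1(i)), might look as follows. The `outcomes` callback plays the role of the probabilities pi and contributions Ci; all the concrete data in the toy instance are hypothetical illustration values.

```python
# Backward induction for a probabilistic DP minimizing an expected sum.
# Only the recursion fn(sn, xn) = sum_i p_i * (C_i + f*n+1(i)) follows the
# structure described in the text; all data below are hypothetical.

def expected_sum_dp(stages, states, decisions, outcomes):
    """outcomes(n, s, x) returns (probability, contribution, next_state) triples."""
    f_star = {s: 0.0 for s in states}          # terminal values: nothing after stage N
    for n in range(stages, 0, -1):             # stages N, N-1, ..., 1
        f_star = {
            s: min(
                sum(p * (c + f_star[nxt]) for p, c, nxt in outcomes(n, s, x))
                for x in decisions(n, s)
            )
            for s in states
        }
    return f_star                              # f1*(s) for every starting state s

# Toy instance: decision 0 moves to state 0 or 1 with probability 0.5 each
# (contributions 2 and 0); decision 1 moves to state 0 for sure (contribution 0.9).
def outcomes(n, s, x):
    return [(0.5, 2.0, 0), (0.5, 0.0, 1)] if x == 0 else [(1.0, 0.9, 0)]

f1 = expected_sum_dp(
    stages=2,
    states=(0, 1),
    decisions=lambda n, s: (0, 1),
    outcomes=outcomes,
)
```

The only change from the deterministic case is that the candidate value is an expectation over next states rather than a single term, exactly as the bullets above describe.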
Probabilistic Dynamic Programming: An Example
An enterprising young statistician believes that she has developed a system for winning a popular Las Vegas game. Her colleagues do not believe that her system works, so they have made a large bet with her that if she starts with three chips, she will not have at least five chips after three plays of the game. Each play of the game involves betting any desired number of available chips and then either winning or losing this number of chips. The statistician believes that her system will give her a probability of 2/3 of winning a given play of the game. Assuming the statistician is correct, we now use dynamic programming to determine her optimal policy regarding how many chips to bet (if any) at each of the three plays of the game. The decision at each play should take into account the results of earlier plays. The objective is to maximize the probability of winning her bet with her colleagues.
Problem Formulation
The dynamic programming formulation for this problem is
Stage n = nth play of game (n = 1, 2, 3),
xn = number of chips to bet at stage n,
State sn = number of chips in hand to begin stage n.
Because the objective is to maximize the probability that the statistician will win her bet, the objective function to be maximized at each stage must be the probability of finishing the three plays with at least five chips.
Then we can describe these functions as:
 fn(sn, xn) = probability of finishing three plays with at least five chips, given that the statistician starts stage n in state sn, makes immediate decision xn, and makes optimal decisions thereafter.
 fn*(sn) = max over xn = 0, 1, . . . , sn of fn(sn, xn).
 If she loses, the state at the next stage will be sn - xn, and the probability of finishing with at least five chips will then be f*n+1(sn - xn).
 If she wins the next play instead, the state will become sn + xn, and the corresponding probability will be f*n+1(sn + xn).
 Because the assumed probability of winning a given play is 2/3, it now follows that

fn(sn, xn) = (1/3) f*n+1(sn - xn) + (2/3) f*n+1(sn + xn).
 Therefore, the recursive relationship for this problem is

fn*(sn) = max over xn = 0, 1, . . . , sn of [(1/3) f*n+1(sn - xn) + (2/3) f*n+1(sn + xn)],

for n = 1, 2, 3, with f4*(s4) as just defined: 0 if s4 < 5 and 1 if s4 ≥ 5.
 Solution Procedure
Stage 3 (n=3)

Stage 2 (n=2)
Stage 1 (n=1)

The optimal policy prescribes the bet at each play as a function of the outcomes of the earlier plays; it gives a probability of 20/27 of winning her bet with her colleagues.
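The 20/27 figure can be verified with a direct implementation of the recursive relationship above, using exact rational arithmetic. The win probability, boundary condition, and state space are exactly those given in the text.

```python
from fractions import Fraction
from functools import lru_cache

WIN = Fraction(2, 3)        # probability of winning a single play (from the problem)
PLAYS = 3                   # number of plays of the game
GOAL = 5                    # she wins the bet by finishing with at least 5 chips

@lru_cache(maxsize=None)
def f_star(n, s):
    """Maximum probability of finishing with at least GOAL chips,
    starting play n with s chips, under optimal betting."""
    if n == PLAYS + 1:                      # boundary: f4*(s4) = 1 if s4 >= 5, else 0
        return Fraction(int(s >= GOAL))
    return max(
        (1 - WIN) * f_star(n + 1, s - x) + WIN * f_star(n + 1, s + x)
        for x in range(s + 1)               # bet any number of available chips
    )

prob = f_star(1, 3)                         # start play 1 with three chips
```

Using `Fraction` rather than floats means the result comes out as exactly 20/27 rather than a rounded decimal.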


Another Example
(The formulation and the stage-by-stage solution for Stages 3, 2, and 1 appeared here as worked tables.)
Conclusion
 Dynamic programming is a very useful technique for making a sequence of interrelated decisions.
 It requires formulating an appropriate recursive relationship for each individual problem.
 However, it provides great computational savings over exhaustive enumeration when finding the best combination of decisions, especially for large problems.
 Dynamic programming also lays the groundwork for further problems that can only be solved with Markov chains.
End of Topic
TI 2102
Optimization Mathematics
