Chapter 3: Optimal Control
Conceptual Idea

• Optimal control uses optimization to compute the control action from the reference and the measurements.
• In practice the controller runs on a computer, so this type of control is typically implemented in discrete time.
Problem Formulation (1)

• The control problem is transformed into an optimization problem:

(1) The control goals are translated into an objective function J that depends on the states and controls over a time horizon N.
– The objective function can combine several objectives: reference tracking is the main one, but others can be added, such as minimizing the control effort (energy) required by the controller or obtaining smooth control actions.

(2) The model and the physical limits of the states and controls are the constraints of the optimization problem.

• Optimal control can be applied to multivariable linear and non-linear systems.

• The theory of optimal control can be developed in continuous time or in discrete time.
• In this course we focus on the discrete-time formulation because it is the one most used in practice and because it fits the optimization theory presented in the first part of the course:

$$\dot{x} = f_c(x(t),u(t)) \;\Rightarrow\; x(k+1) = f(x(k),u(k))$$
Problem Formulation (2)

• There are many methods for discretizing a continuous-time system (e.g., Euler, Runge-Kutta).

• Let's consider the simplest one, based on the Euler method, which approximates the derivative over one sampling period $T_s$:

$$\dot{x}(t) \approx \frac{x(k+1) - x(k)}{T_s}$$

• Then, the non-linear continuous-time model of the system can be expressed in discrete time as follows:

$$\dot{x} = f_c(x(t),u(t)) \;\Rightarrow\; x(k+1) = f(x(k),u(k)), \qquad f(x(k),u(k)) = x(k) + T_s\, f_c(x(k),u(k))$$
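The Euler discretization above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides: the function names, the sampling period `ts` and the first-order plant `dx/dt = -x + u` are assumptions chosen only to make the example runnable.

```python
def euler_step(fc, x, u, ts):
    """One forward-Euler step: x(k+1) = x(k) + ts * fc(x(k), u(k))."""
    return x + ts * fc(x, u)


def simulate(fc, x0, u_seq, ts):
    """Apply euler_step along an input sequence and return [x(0), ..., x(N)]."""
    xs = [x0]
    for u in u_seq:
        xs.append(euler_step(fc, xs[-1], u, ts))
    return xs


# Hypothetical first-order plant dx/dt = -x + u (an assumption, not from the text)
def plant(x, u):
    return -x + u


trajectory = simulate(plant, x0=1.0, u_seq=[0.0, 0.0, 0.0], ts=0.1)
print(trajectory)
```

A smaller `ts` brings the discrete trajectory closer to the continuous one, at the price of more steps per unit of time.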
Problem Formulation (3)

• The control problem is transformed into an optimization problem:

(1) The control goals are translated into an objective function J that depends on the states and controls over a time horizon N.
(2) The model and the physical limits of the states and controls are the constraints of the optimization problem.

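$$
\begin{aligned}
\min_{u(0),\ldots,u(N-1)} \quad & \alpha_e \sum_{k=1}^{N} J_e(k) + \alpha_u \sum_{k=0}^{N-1} J_u(k) \\
\text{subject to:} \quad & x(k+1) = f(x(k),u(k)), && k = 0,\ldots,N-1 \\
& x(0) = x_0 \ \text{(known)} \\
& x(k) \in [\underline{x},\overline{x}], && k = 1,\ldots,N \\
& u(k) \in [\underline{u},\overline{u}], && k = 0,\ldots,N-1
\end{aligned}
$$

where $J_e(k)$ and $J_u(k)$ are the state-related (e.g., tracking error) and control-related terms of the objective, and $\alpha_e$, $\alpha_u$ are their weights.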
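To make the structure of this problem concrete, the sketch below evaluates the objective for one candidate input sequence and checks the bounds. It is only an illustration under stated assumptions: `rollout_cost`, `within_bounds`, the stage functions `stage_e` and `stage_u` and the weights `w_e`, `w_u` are hypothetical names standing for the two terms of the objective and their weights, and scalar states and inputs are assumed.

```python
def rollout_cost(f, x0, u_seq, stage_e, stage_u, w_e=1.0, w_u=1.0):
    """Simulate x(k+1) = f(x(k), u(k)) and accumulate the weighted objective."""
    x = x0
    xs = [x0]
    cost = 0.0
    for u in u_seq:                  # k = 0, ..., N-1
        cost += w_u * stage_u(u)     # control term at step k
        x = f(x, u)                  # model constraint gives x(k+1)
        xs.append(x)
        cost += w_e * stage_e(x)     # state term at step k+1 = 1, ..., N
    return cost, xs


def within_bounds(values, lower, upper):
    """True if every value lies in the closed interval [lower, upper]."""
    return all(lower <= v <= upper for v in values)


# Hypothetical scalar model and quadratic stage costs (assumptions)
cost, xs = rollout_cost(lambda x, u: 0.5 * x + u, 1.0, [0.0, 0.0],
                        stage_e=lambda x: x * x, stage_u=lambda u: u * u)
feasible = within_bounds(xs[1:], 0.0, 1.0) and within_bounds([0.0, 0.0], -1.0, 1.0)
```

A numerical solver searches over `u_seq` for the sequence with the lowest cost among those that pass the bound checks.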
Problem Solution

• There are two ways to solve the optimization problem associated with the optimal control problem:

(1) Analytically, using the theory learned in the first part of the course.
(2) Numerically, using numerical solvers such as the ones available in Matlab.

• For linear systems, both approaches are possible when the physical constraints on states and inputs are neglected.

• For non-linear systems, an analytical solution is almost never available, so in practice only the numerical approach is possible.
Problem Formulation: The Linear Case (LQR)

• When the system to be controlled can be formulated or approximated using a linear model, it can be represented in the standard linear state-space form which, after discretization, becomes:

$$\dot{x} = A_c x(t) + B_c u(t) \;\Rightarrow\; x(k+1) = A x(k) + B u(k)$$

• Then, the previous optimization problem can be reformulated in the following way:

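$$
\begin{aligned}
\min_{u(0),\ldots,u(N-1)} \quad & \alpha_e \sum_{k=1}^{N} J_e(k) + \alpha_u \sum_{k=0}^{N-1} J_u(k) \\
\text{subject to:} \quad & x(k+1) = A x(k) + B u(k), && k = 0,\ldots,N-1 \\
& x(0) = x_0 \ \text{(known)} \\
& x(k) \in [\underline{x},\overline{x}], && k = 1,\ldots,N \\
& u(k) \in [\underline{u},\overline{u}], && k = 0,\ldots,N-1
\end{aligned}
$$

where $\alpha_e$ and $\alpha_u$ weight the two terms of the objective.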
Numerical Solution of LQR (2)

Environment    Software/Toolbox/Package
Standalone     AMPL, GAMS (~1990)
Matlab         YALMIP, CVX (~2000)
Python         PuLP, CVXPy
Julia          JuMP, Convex.jl

[Figure: user, modeling environment and solver modules (internal and external); solvers named in the figure: SeDuMi, SDPA, Gurobi, CPLEX]
Numerical Solution of LQR (4)

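• The numerical solution of the optimal control problem can be obtained using an optimization modeling language such as YALMIP:

% Assumes A, B, Q, R, N, nx, nu and the initial state x0 are already defined
u = sdpvar(repmat(nu,1,N),repmat(1,1,N));        % cell array u{1},...,u{N}
x = sdpvar(repmat(nx,1,N+1),repmat(1,1,N+1));    % cell array x{1},...,x{N+1}
constraints = [x{1} == x0];                      % known initial state
objective = 0;
for k = 1:N
    objective = objective + x{k}'*Q*x{k} + u{k}'*R*u{k};      % stage cost
    constraints = [constraints, x{k+1} == A*x{k} + B*u{k}];   % model
end
options = sdpsettings('solver','quadprog');
optimize(constraints, objective, options);       % solve the resulting QP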

Analytical Solution of LQR

• The analytical solution of the optimal control problem can be done using several approaches:

1. Lagrange (or substitution) method

2. Dynamic programming

• To obtain the analytical solution, the physical constraints are neglected and the objective function is expressed in vector/matrix form as follows:

$$
\begin{aligned}
\min_{u(0),\ldots,u(N-1)} \quad & \frac{1}{2}x^T(N)Sx(N) + \frac{1}{2}\sum_{k=0}^{N-1}\left[x^T(k)Qx(k) + u^T(k)Ru(k)\right] \\
\text{subject to:} \quad & x(k+1) = Ax(k) + Bu(k), \quad k = 0,\ldots,N-1 \\
& x(0) = x_0 \ \text{(known)}
\end{aligned}
$$
Analytical Solution of LQR: Lagrange Method (1)

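• To solve the LQR problem using the Lagrange method, a new objective function (the Lagrangian) is created by adjoining the model equations to J with the multipliers $\lambda(k)$:

$$L = J + \sum_{k=0}^{N-1}\lambda^T(k+1)\left[-x(k+1) + Ax(k) + Bu(k)\right]$$

where:

$$J = \frac{1}{2}x^T(N)Sx(N) + \frac{1}{2}\sum_{k=0}^{N-1}\left[x^T(k)Qx(k) + u^T(k)Ru(k)\right]$$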

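• To obtain the analytical solution, the gradient of the Lagrangian is set to zero: $\nabla L = 0$

• This implies the following partial derivatives:

$$
\begin{aligned}
\frac{\partial L}{\partial x(k)} &= 0, && k = 1,2,\ldots,N \\
\frac{\partial L}{\partial u(k)} &= 0, && k = 0,1,\ldots,N-1 \\
\frac{\partial L}{\partial \lambda(k)} &= 0, && k = 1,2,\ldots,N
\end{aligned}
$$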
Analytical Solution of LQR: Lagrange Method (2)

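• Evaluating the previous partial derivatives leads to:

$$
\begin{aligned}
\frac{\partial L}{\partial x(k)}&:\quad Qx(k) + A^T\lambda(k+1) - \lambda(k) = 0, && k = 1,2,\ldots,N-1 \\
\frac{\partial L}{\partial x(N)}&:\quad Sx(N) - \lambda(N) = 0 \\
\frac{\partial L}{\partial u(k)}&:\quad Ru(k) + B^T\lambda(k+1) = 0, && k = 0,1,\ldots,N-1 \\
\frac{\partial L}{\partial \lambda(k)}&:\quad -x(k) + Ax(k-1) + Bu(k-1) = 0, && k = 1,2,\ldots,N
\end{aligned}
$$

• After some algebraic manipulation of these equations, we obtain the analytical solution.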
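The manipulation can be made explicit. The following is a sketch of the usual route, assuming the linear ansatz $\lambda(k) = P(k)x(k)$ (suggested by the terminal condition $\lambda(N) = Sx(N)$) and that $A$ and $R$ are invertible:

```latex
\begin{aligned}
&\text{Stationarity conditions:}\\
&\quad Qx(k) + A^T\lambda(k+1) - \lambda(k) = 0,\qquad
       Ru(k) + B^T\lambda(k+1) = 0,\qquad
       \lambda(N) = Sx(N)\\[4pt]
&\text{Ansatz } \lambda(k) = P(k)x(k) \ \Rightarrow\ P(N) = S\\[4pt]
&u(k) = -R^{-1}B^T P(k+1)x(k+1)\\
&x(k+1) = Ax(k) - BR^{-1}B^T P(k+1)x(k+1)
 \ \Rightarrow\ x(k+1) = \left[I + BR^{-1}B^T P(k+1)\right]^{-1}Ax(k)\\[4pt]
&P(k)x(k) = Qx(k) + A^T P(k+1)x(k+1)
 \ \Rightarrow\ P(k) = Q + A^T P(k+1)\left[I + BR^{-1}B^T P(k+1)\right]^{-1}A\\[4pt]
&A^T\lambda(k+1) = \left[P(k) - Q\right]x(k)
 \ \Rightarrow\ u(k) = -R^{-1}B^T (A^T)^{-1}\left[P(k) - Q\right]x(k)
\end{aligned}
```

The last two lines are the backward recursion for $P(k)$ and the state-feedback form of the optimal control.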

Analytical Solution of LQR: Lagrange Method (3)

Optimal Control

The optimal control obtained from the analytical solution is a state-feedback control law (A is assumed invertible):

$$u(k) = -K(k)x(k)$$

$$K(k) = R^{-1}B^T(A^T)^{-1}\left[P(k) - Q\right]$$

Riccati Equation

$$P(k) = Q + A^TP(k+1)\left[I + BR^{-1}B^TP(k+1)\right]^{-1}A$$

where $P(N) = S$

Optimal Objective Function

$$J_{opt} = \frac{1}{2}x^T(0)P(0)x(0)$$
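For a scalar system the backward recursion can be sketched in a few lines of Python. This is a minimal sketch, not the course implementation: `riccati_backward` is an illustrative name, the lower-case arguments `a, b, q, r, s` stand for scalar A, B, Q, R, S, and `a != 0` is assumed so that the gain formula is defined.

```python
def riccati_backward(a, b, q, r, s, n):
    """Scalar Riccati recursion with P(N) = s.

    P(k) = q + a^2 P(k+1) / (1 + b^2 P(k+1) / r),  k = N-1, ..., 0
    K(k) = (b / (r a)) (P(k) - q),                 k = 0, ..., N-1
    """
    p = [0.0] * (n + 1)
    p[n] = s
    for k in range(n - 1, -1, -1):        # sweep backwards in time
        p[k] = q + a * a * p[k + 1] / (1.0 + b * b * p[k + 1] / r)
    gains = [b / (r * a) * (p[k] - q) for k in range(n)]
    return p, gains
```

The control at step k is then `u = -gains[k] * x`, and `0.5 * p[0] * x0 ** 2` gives the optimal cost for an initial state `x0`.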
Approximate Analytical Solution: Steady State Approximation

If the horizon N is long enough, the Riccati equation reaches a steady-state solution:

$$P(k+1) = P(k) = P_{ss}$$

Optimal Control

The optimal control obtained from the analytical solution is a state-feedback control law with a constant gain:

$$u(k) = -K_{ss}x(k)$$

$$K_{ss} = R^{-1}B^T(A^T)^{-1}\left[P_{ss} - Q\right]$$

Riccati Equation

$$P_{ss} = Q + A^TP_{ss}\left[I + BR^{-1}B^TP_{ss}\right]^{-1}A$$
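One simple way to approximate the steady-state value is to iterate the Riccati recursion until it stops changing. The sketch below does this for a scalar system; the function name, the tolerance and the iteration limit are assumptions made for illustration.

```python
def riccati_steady_state(a, b, q, r, tol=1e-12, max_iter=10_000):
    """Iterate P <- q + a^2 P / (1 + b^2 P / r) until successive values agree."""
    p = q
    for _ in range(max_iter):
        p_next = q + a * a * p / (1.0 + b * b * p / r)
        if abs(p_next - p) < tol:
            return p_next
        p = p_next
    raise RuntimeError("Riccati iteration did not converge")
```

For example, `riccati_steady_state(2.0, 1.0, 1.0, 1.0)` approaches the positive root of `P**2 - 4*P - 1 = 0`, which is the steady-state equation for those values.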

Analytical Solution of LQR: Example (1)

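• Solve the following optimal control problem analytically:

$$
\begin{aligned}
\min \quad & \frac{1}{2}x^2(10) + \frac{1}{2}\sum_{k=0}^{9}\left[x^2(k) + u^2(k)\right] \\
\text{subject to:} \quad & x(k+1) = 0.3679\,x(k) + 0.6321\,u(k) \\
& x(0) = 1
\end{aligned}
$$

• Analysing the problem, we can see that N = 10, A = 0.3679, B = 0.6321 and S = Q = R = 1.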

Analytical Solution of LQR: Example (2)

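• First, we solve the Riccati equation backwards from k = N, with $A^2 = 0.1354$ and $BR^{-1}B = 0.3996$:

$$
\begin{aligned}
P(N) &= P(10) = S = 1 \\
P(9) &= 1 + 0.1354 \cdot 1 \cdot (1 + 0.3996 \cdot 1)^{-1} = 1.0967 \\
P(8) &= 1 + 0.1354 \cdot 1.0967 \cdot (1 + 0.3996 \cdot 1.0967)^{-1} = 1.1032 \\
P(7) &= 1 + 0.1354 \cdot 1.1032 \cdot (1 + 0.3996 \cdot 1.1032)^{-1} = 1.1036 \\
P(6) &= 1 + 0.1354 \cdot 1.1036 \cdot (1 + 0.3996 \cdot 1.1036)^{-1} = 1.1037 \\
P(k) &\approx 1.1037, \quad k = 5,4,\ldots,0
\end{aligned}
$$

• This equation could be solved approximately using the steady-state approximation:

$$P_{ss} = 1 + 0.1354\,P_{ss}(1 + 0.3996\,P_{ss})^{-1} \;\Rightarrow\; 0.3996\,P_{ss}^2 + 0.4650\,P_{ss} - 1 = 0$$

$$P_{ss} = 1.1037 \quad \text{or} \quad P_{ss} = -2.2674$$

Only the positive root is admissible, so $P_{ss} = 1.1037$.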
Analytical Solution of LQR: Example (3)

• After solving the Riccati equation, we can determine the optimal controller gain:

$$
\begin{aligned}
K(N) &= K(10) = 1 \cdot 0.6321 \cdot 0.3679^{-1}\left[P(10) - 1\right] = 0 \\
K(9) &= 1 \cdot 0.6321 \cdot 0.3679^{-1}\left[P(9) - 1\right] = 0.1662 \\
K(8) &= 1 \cdot 0.6321 \cdot 0.3679^{-1}\left[P(8) - 1\right] = 0.1773 \\
K(7) &= 1 \cdot 0.6321 \cdot 0.3679^{-1}\left[P(7) - 1\right] = 0.1780 \\
K(k) &\approx 0.1781, \quad k = 6,5,\ldots,0
\end{aligned}
$$

• This gain can be calculated approximately using the steady-state approximation:

$$K_{ss} = 1 \cdot 0.6321 \cdot 0.3679^{-1}\left[P_{ss} - 1\right] = 0.1781$$

Analytical Solution of LQR: Example (4)

• The optimal sequence of states can be found as follows:

$$
\begin{aligned}
x(1) &= 0.3679\,x(0) + 0.6321\,u(0) = \left[0.3679 - 0.6321\,K(0)\right]x(0) = 0.2553 \\
x(2) &= \left[0.3679 - 0.6321\,K(1)\right]x(1) = 0.0652 \\
x(3) &= \left[0.3679 - 0.6321\,K(2)\right]x(2) = 0.0166 \\
x(4) &= \left[0.3679 - 0.6321\,K(3)\right]x(3) = 0.00424 \\
x(k) &\approx 0, \quad k = 5,\ldots,10
\end{aligned}
$$

• The optimal sequence of controls can be found as follows:

$$
\begin{aligned}
u(0) &= -K(0)x(0) = -0.1781 \\
u(1) &= -K(1)x(1) = -0.0455 \\
u(2) &= -K(2)x(2) = -0.0116 \\
u(3) &= -K(3)x(3) = -0.00296 \\
u(4) &= -K(4)x(4) = -0.000756 \\
u(k) &\approx 0, \quad k = 5,\ldots,9
\end{aligned}
$$
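The whole example can be reproduced numerically. The sketch below is an illustration only: `solve_example` is a hypothetical helper that chains the backward Riccati recursion, the gain formula and a forward simulation, using the rounded coefficients 0.3679 and 0.6321 as defaults.

```python
def solve_example(a=0.3679, b=0.6321, q=1.0, r=1.0, s=1.0, n=10, x0=1.0):
    # Backward Riccati recursion with P(N) = S
    p = [0.0] * (n + 1)
    p[n] = s
    for k in range(n - 1, -1, -1):
        p[k] = q + a * a * p[k + 1] / (1.0 + b * b * p[k + 1] / r)
    # Time-varying gains K(k) = R^-1 B A^-1 [P(k) - Q]
    gains = [b / (r * a) * (p[k] - q) for k in range(n)]
    # Forward simulation with u(k) = -K(k) x(k)
    xs, us = [x0], []
    for k in range(n):
        us.append(-gains[k] * xs[k])
        xs.append(a * xs[k] + b * us[k])
    # Objective with the 1/2 factors used in the problem statement
    cost = 0.5 * s * xs[n] ** 2 + 0.5 * sum(
        q * x * x + r * u * u for x, u in zip(xs[:n], us))
    return p, gains, xs, us, cost


p, gains, xs, us, cost = solve_example()
print(round(xs[1], 4), round(us[0], 4), round(cost, 4))
```

The simulated cost should agree with `0.5 * p[0] * x0 ** 2` up to rounding, which is a convenient consistency check on the recursion.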
Analytical Solution of LQR: Example (5)

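• The optimal value of the objective function is obtained as follows:

$$J_{opt} = \frac{1}{2}x^T(0)P(0)x(0) = 0.5518$$

• If we instead use the steady-state value $P_{ss}$ in place of $P(0)$ in the previous formula, we get an approximation that here coincides with the optimal value to four decimals, because $P(0)$ has already converged to $P_{ss}$:

$$J_{opt} \approx \frac{1}{2}x^T(0)P_{ss}x(0) = 0.5518$$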
Analytical Solution of LQR: Linear Matrix Inequalities (1)

• The steady-state solution of the LQR problem leads to a controller

$$u(k) = -K_{ss}x(k)$$

such that

$$\sum_{k=0}^{\infty}\left[x^T(k)Qx(k) + u^T(k)Ru(k)\right] = x^T(0)P_{ss}x(0)$$

• On the other hand, defining

$$V(x(k)) = x^T(k)P_{ss}x(k)$$

it follows

$$V(x(k+1)) - V(x(k)) \le -x^T(k)Qx(k) - u^T(k)Ru(k)$$
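The link between the decrease condition on $V$ and the cost can be seen by summing it over time. The following is a short sketch of that telescoping argument, assuming $P_{ss}$ is positive semidefinite (so that $V \ge 0$); the finite upper limit $M$ is an auxiliary symbol introduced here:

```latex
\begin{aligned}
\sum_{k=0}^{M-1}\left[V(x(k+1)) - V(x(k))\right] &= V(x(M)) - V(x(0))
  \le -\sum_{k=0}^{M-1}\left[x^T(k)Qx(k) + u^T(k)Ru(k)\right]\\
\Rightarrow\quad \sum_{k=0}^{M-1}\left[x^T(k)Qx(k) + u^T(k)Ru(k)\right]
  &\le V(x(0)) - V(x(M)) \le x^T(0)P_{ss}x(0)
\end{aligned}
```

Letting $M \to \infty$ shows that $x^T(0)P_{ss}x(0)$ bounds the infinite-horizon cost from above, with equality for the LQR solution.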

Analytical Solution of LQR: Linear Matrix Inequalities (2)

• Then, taking into account that

$$x(k+1) = (A - BK_{ss})x(k)$$

it follows

$$x^T(k)(A - BK_{ss})^TP_{ss}(A - BK_{ss})x(k) - x^T(k)P_{ss}x(k) \le -x^T(k)Qx(k) - x^T(k)K_{ss}^TRK_{ss}x(k)$$

• Since this must hold for every $x(k)$, it leads to the following matrix inequality

$$(A - BK_{ss})^TP_{ss}(A - BK_{ss}) - P_{ss} + Q + K_{ss}^TRK_{ss} \le 0$$

• Pre- and post-multiplying by $Y = P_{ss}^{-1}$ and introducing the changes of variables $W = K_{ss}Y$ and $Q = H^TH$:

$$(AY - BW)^TY^{-1}(AY - BW) - Y + YH^THY + W^TRW \le 0$$
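As a quick sanity check, the steady-state LQR solution should satisfy the matrix inequality with equality, both before and after the change of variables. The scalar sketch below verifies this numerically; the function names are illustrative and the scalar setting (with `a != 0`) is an assumption made to keep the example dependency-free.

```python
import math


def steady_state_lqr(a, b, q, r):
    """Positive root of the scalar steady-state Riccati equation and its gain."""
    c = b * b / r
    beta = 1.0 - a * a - q * c          # c P^2 + beta P - q = 0
    p = (-beta + math.sqrt(beta * beta + 4.0 * c * q)) / (2.0 * c)
    k = b / (r * a) * (p - q)
    return p, k


def inequality_residuals(a, b, q, r):
    """Left-hand sides of the inequality in (P, K) and in (Y, W) variables."""
    p, k = steady_state_lqr(a, b, q, r)
    original = (a - b * k) ** 2 * p - p + q + k * r * k
    y, w, h = 1.0 / p, k / p, math.sqrt(q)   # Y = P^-1, W = K Y, Q = H^T H
    transformed = (a * y - b * w) ** 2 / y - y + y * h * h * y + w * r * w
    return original, transformed


print(inequality_residuals(0.3679, 0.6321, 1.0, 1.0))
```

Both residuals should be zero up to floating-point error; any other pair satisfying the inequality strictly gives a negative value.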
Analytical Solution of LQR: Linear Matrix Inequalities (3)

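• Finally, after some algebraic manipulations (Schur complement), the following LMI can be obtained

$$
\begin{bmatrix}
-Y & YA^T - W^TB^T & YH^T & W^T \\
AY - BW & -Y & 0 & 0 \\
HY & 0 & -I & 0 \\
W & 0 & 0 & -R^{-1}
\end{bmatrix} \le 0
$$

• The value $K_{ss} = WY^{-1}$ can be obtained by bounding the cost $J_{opt} = x_0^TPx_0$ with a scalar $\gamma$ such that $P \le \gamma I$ which, with $Y = P^{-1}$, is equivalent to

$$
\begin{bmatrix} \gamma I & I \\ I & Y \end{bmatrix} \ge 0
$$

• This leads to the following optimization problem

$$
\begin{aligned}
\min_{\gamma, W, Y} \quad & \gamma \\
\text{subject to:} \quad &
\begin{bmatrix} \gamma I & I \\ I & Y \end{bmatrix} \ge 0 \\
&
\begin{bmatrix}
-Y & YA^T - W^TB^T & YH^T & W^T \\
AY - BW & -Y & 0 & 0 \\
HY & 0 & -I & 0 \\
W & 0 & 0 & -R^{-1}
\end{bmatrix} \le 0
\end{aligned}
$$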
Analytical Solution of LQR: Linear Matrix Inequalities (4)

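• This LMI problem can be solved with YALMIP and the SeDuMi solver as follows:

% Assumes A, B, R, nx, nu are defined and H satisfies Q = H'*H (e.g., H = chol(Q))
Y = sdpvar(nx,nx);                 % symmetric by default
W = sdpvar(nu,nx,'full');
gamma = sdpvar(1,1);
I = eye(nx);
constraints = [[gamma*I I; I Y] >= 0];
constraints = [constraints, [-Y, Y*A'-W'*B', Y*H', W'; ...
    A*Y-B*W, -Y, zeros(nx,nx), zeros(nx,nu); ...
    H*Y, zeros(nx,nx), -I, zeros(nx,nu); ...
    W, zeros(nu,nx), zeros(nu,nx), -inv(R)] <= 0];
optimize(constraints, gamma, sdpsettings('solver','sedumi'));
Kss = value(W)*inv(value(Y));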
Analytical Solution of LQR: Dynamic Programming (1)

• The LQR problem

$$
\begin{aligned}
\min_{u(0),\ldots,u(N-1)} \quad & x^T(N)Sx(N) + \sum_{k=0}^{N-1}\left[x^T(k)Qx(k) + u^T(k)Ru(k)\right] \\
\text{subject to:} \quad & x(k+1) = Ax(k) + Bu(k), \quad k = 0,\ldots,N-1 \\
& x(0) = x_0 \ \text{(known)}
\end{aligned}
$$

can be solved using dynamic programming as presented in the first part of the course.
• Dynamic programming is a recursive approach based on the principle of decomposing the optimization problem into subproblems.
• Defining

$$F(k) = x^T(k)Qx(k) + u^T(k)Ru(k), \qquad J(N) = \sum_{k=0}^{N}F(k)$$
Analytical Solution of LQR: Dynamic Programming (2)

• Consider the following optimization problem

$$S(1) = F(N) = x^T(N)Qx(N) + u^T(N)Ru(N)$$

• The optimal value can be obtained by

$$\frac{\partial S(1)}{\partial u(N)} = 0 \;\Rightarrow\; u_{opt}(N) = 0 \;\Rightarrow\; S_{opt}(1) = x^T(N)Qx(N)$$
Analytical Solution of LQR: Dynamic Programming (3)

• Then, we consider the following optimization problem

$$
\begin{aligned}
S(2) = S_{opt}(1) + F(N-1) = {} & (Ax(N-1) + Bu(N-1))^TQ(Ax(N-1) + Bu(N-1)) \\
& + x^T(N-1)Qx(N-1) + u^T(N-1)Ru(N-1)
\end{aligned}
$$

• The optimal value can be obtained by

$$\frac{\partial S(2)}{\partial u(N-1)} = 0 \;\Rightarrow\; u_{opt}(N-1) = -K(N-1)x(N-1) \;\Rightarrow\; S_{opt}(2) = x^T(N-1)P(N-1)x(N-1)$$

$$K(N-1) = R^{-1}B^T(A^T)^{-1}\left[P(N-1) - Q\right]$$

$$P(N-1) = Q + A^TP(N)\left[I + BR^{-1}B^TP(N)\right]^{-1}A$$
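The minimization behind this step can be written out explicitly. The sketch below assumes $P(N) = Q$ (which follows from $S_{opt}(1) = x^T(N)Qx(N)$), symmetric $Q$ and $R$, and invertible $A$ and $R$; it shows that the gain obtained by direct differentiation coincides with the expression for $K(N-1)$ given above:

```latex
\begin{aligned}
\frac{\partial S(2)}{\partial u(N-1)}
 &= 2B^TQ\left[Ax(N-1) + Bu(N-1)\right] + 2Ru(N-1) = 0\\
\Rightarrow\quad u_{opt}(N-1) &= -\left(R + B^TQB\right)^{-1}B^TQA\,x(N-1)\\[4pt]
R^{-1}B^T(A^T)^{-1}\left[P(N-1) - Q\right]
 &= R^{-1}B^TP(N)\left[I + BR^{-1}B^TP(N)\right]^{-1}A\\
 &= \left[I + R^{-1}B^TP(N)B\right]^{-1}R^{-1}B^TP(N)A\\
 &= \left(R + B^TP(N)B\right)^{-1}B^TP(N)A
\end{aligned}
```

With $P(N) = Q$ the last line is exactly the gain found in the second line, so both forms of $K(N-1)$ agree.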

Analytical Solution of LQR: Dynamic Programming (4)

• Recursive implementation using Matlab

function [x,u,P,K]=dynamic_lqr(k,x,u,P,K,A,B,R,Q)
if k==1
x(:,,k)=[1,1]; % initial conditions

%Invoking the function for k-1
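The recursive idea can also be sketched in Python for a scalar system. This is not a translation of the Matlab listing above, only a minimal illustration under stated assumptions: `dp_lqr` is a hypothetical name, the terminal weight equals `q` (as in the derivation above), and `a != 0`.

```python
def dp_lqr(k, n, a, b, q, r):
    """Return P(k) and the gains [K(k), ..., K(N-1)] by recursing up to stage N."""
    if k == n:
        # Last stage: u_opt(N) = 0 and the cost-to-go is q * x(N)^2
        return q, []
    # Solve the subproblem for stage k+1 first, then step back to stage k
    p_next, later_gains = dp_lqr(k + 1, n, a, b, q, r)
    p = q + a * a * p_next / (1.0 + b * b * p_next / r)
    gain = b / (r * a) * (p - q)
    return p, [gain] + later_gains


p0, gains = dp_lqr(0, 2, a=2.0, b=1.0, q=1.0, r=1.0)
print(p0, gains)
```

Each call handles one stage and delegates the remaining horizon to the next call, which mirrors the decomposition into subproblems.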

Analytical Solution of LQR: Dynamic Programming Example (1)

• Let us consider the following LQR problem:

$$
\begin{aligned}
\min \quad & \sum_{k=0}^{2}\left[x^2(k) + u^2(k)\right] \\
\text{subject to:} \quad & x(k+1) = 2x(k) + u(k) \\
& x(0) = 1
\end{aligned}
$$

• First consider the following optimization problem: $S(1) = F(2) = x^2(2) + u^2(2)$

$$\frac{\partial S(1)}{\partial u(2)} = 2u(2) = 0 \;\Rightarrow\; u_{opt}(2) = 0 \;\Rightarrow\; S_{opt}(1) = x^2(2)$$

• Next, we consider the following optimization problem: $S(2) = S_{opt}(1) + x^2(1) + u^2(1)$, with $x(2) = 2x(1) + u(1)$

$$\frac{\partial S(2)}{\partial u(1)} = 2\left[2x(1) + u(1)\right] + 2u(1) = 0 \;\Rightarrow\; u_{opt}(1) = -x(1) \;\Rightarrow\; S_{opt}(2) = 3x^2(1)$$
Analytical Solution of LQR: Dynamic Programming Example (2)

• Finally, let us consider the following optimization problem: $S(3) = S_{opt}(2) + x^2(0) + u^2(0)$, with $x(1) = 2x(0) + u(0)$

$$\frac{\partial S(3)}{\partial u(0)} = 6\left[2x(0) + u(0)\right] + 2u(0) = 0 \;\Rightarrow\; u_{opt}(0) = -1.5\,x(0) \;\Rightarrow\; S_{opt}(3) = 4x^2(0)$$

• Since $x(0) = 1$: $u_{opt}(0) = -1.5\,x(0) = -1.5 \;\Rightarrow\; x_{opt}(1) = 2x(0) + u_{opt}(0) = 0.5$

• Then: $x_{opt}(1) = 0.5 \;\Rightarrow\; u_{opt}(1) = -x_{opt}(1) = -0.5 \;\Rightarrow\; x_{opt}(2) = 2x_{opt}(1) + u_{opt}(1) = 0.5$
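The result can be checked by brute force: evaluate the cost at the control sequence found above and at nearby sequences. The sketch below is only a sanity check under stated assumptions; the perturbation size `step` and the names `total_cost`, `u_opt` and `perturbed` are chosen for illustration.

```python
def total_cost(u_seq, x0=1.0):
    """Cost sum_{k=0}^{2} [x(k)^2 + u(k)^2] for x(k+1) = 2 x(k) + u(k)."""
    x, cost = x0, 0.0
    for u in u_seq:
        cost += x * x + u * u
        x = 2.0 * x + u
    return cost


u_opt = [-1.5, -0.5, 0.0]
best = total_cost(u_opt)

# Perturb each control in turn; no perturbed sequence should beat the optimum
step = 0.1
perturbed = []
for i in range(3):
    for delta in (-step, step):
        trial = list(u_opt)
        trial[i] += delta
        perturbed.append(total_cost(trial))

print(best, min(perturbed))
```

The cost at `u_opt` matches $S_{opt}(3) = 4x^2(0)$ with $x(0) = 1$, and every perturbed sequence is more expensive, as expected for a strictly convex quadratic problem.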

