Calculus of Variations and Optimal Control
Preface

This pamphlet on calculus of variations and optimal control theory contains the most important results in the subject, treated largely in order of urgency. Familiarity with linear algebra and real analysis is assumed. It is desirable, although not mandatory, that the reader has also had a course on differential equations. I would greatly appreciate receiving information about any errors noticed by the readers. I am thankful to Dr. Sara Maad from the University of Virginia, U.S.A., for several useful discussions.

Amol Sasane
6 September, 2004.
Contents

1 Introduction
1.1 Control theory
1.2 Underdetermined differential equations
1.3 Questions in control theory
1.4 Appendix: the matrix exponential

2 The optimal control problem
2.1 Introduction
2.2 Examples of optimal control problems
2.3 Functionals
2.4 Statement of the optimal control problem

3 Calculus of variations
3.1 Introduction
3.2 The brachistochrone problem
3.3 Functionals and functions of n variables
3.4 Normed linear spaces and the derivative of a functional
3.5 The simplest variational problem
3.6 Free boundary conditions
3.7 Generalization

4 Optimal control
4.1 The simplest optimal control problem
4.2 The Hamiltonian and Pontryagin minimum principle
4.3 The general case
4.4 Controllability

5 Dynamic programming
5.1 The optimality principle
5.2 Bellman's equation

Bibliography

Index
Chapter 1
Introduction
1.1
Control theory
Control theory is application-oriented mathematics that deals with the basic principles underlying the analysis and design of (control) systems. Systems can be engineering systems (air conditioner, aircraft, CD player, etcetera), economic systems, biological systems and so on. To control means that one has to influence the behaviour of the system in a desirable way: for example, in the case of an air conditioner, the aim is to control the temperature of a room and maintain it at a desired level, while in the case of an aircraft, we wish to control its altitude at each point of time so that it follows a desired trajectory.
1.2 Underdetermined differential equations

The basic objects of study are underdetermined differential equations. This means that there is some freedom in the choice of the variables satisfying the differential equation. An example of an underdetermined algebraic equation is x + u = 10, where x, u are positive integers. There is freedom in choosing, say u, and once u is chosen, then x is uniquely determined. In the same manner, consider the differential equation

$$\frac{dx}{dt}(t) = f(x(t), u(t)), \quad x(t_i) = x_i, \quad t \geq t_i, \tag{1.1}$$
written out componentwise,

$$\frac{dx_1}{dt}(t) = f_1(x(t), u(t)), \quad \dots, \quad \frac{dx_n}{dt}(t) = f_n(x(t), u(t)),$$

where f_1, ..., f_n denote the components of f. In (1.1), u is the free variable, called the input, which is usually assumed to be piecewise continuous¹. Let the class of R^m-valued piecewise continuous functions be denoted by U. Under some regularity conditions on the function f : R^n × R^m → R^n, there exists a unique solution to the differential equation (1.1) for every initial condition x_i ∈ R^n and every piecewise continuous input u:
Theorem 1.2.1 Suppose that f is continuous in both variables. If there exist K > 0, r > 0 and t_f > t_i such that

$$\|f(x_2, u(t)) - f(x_1, u(t))\| \leq K \|x_2 - x_1\| \tag{1.2}$$

for all x_1, x_2 ∈ B(x_i, r) = {x ∈ R^n | ‖x − x_i‖ ≤ r} and for all t ∈ [t_i, t_f], then (1.1) has a unique solution x(·) in the interval [t_i, t_m], for some t_m > t_i. Furthermore, this solution depends continuously on x_i for fixed t and u.
Remarks.

1. Continuous dependence on the initial condition is very important, since some inaccuracy is always present in practical situations. We need to know that if the initial conditions are slightly changed, the solution of the differential equation will change only slightly. Otherwise, slight inaccuracies could yield very different solutions.

2. x is called the state and (1.1) is called the state equation.

3. Condition (1.2) is called the Lipschitz condition.
The above theorem guarantees that a solution exists and that it is unique, but it does not give
any insight into the size of the time interval on which the solutions exist. The following theorem
sheds some light on this.
Theorem 1.2.2 Let r > 0 and dene Br = {u U | u(t) r for all t}. Suppose that f is
continuously dierentiable in both variables. For every xi Rn , there exists a unique tm (xi )
(ti , +] such that for every u Br , (1.1) has a unique solution x() in [ti , tm (xi )).
For our purposes, a control system is an equation of the type (1.1), with input u and state x. Once the input u and the initial state x(t_i) = x_i are specified, the state x is determined. So one can think of a control system as a box which, given the input u and initial state x(t_i) = x_i, manufactures the state according to the law (1.1); see Figure 1.1. If the function f is linear, that is, if f(x, u) = Ax + Bu for some A ∈ R^{n×n} and B ∈ R^{n×m}, then the control system is said to be linear.
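The "box" point of view lends itself to a quick numerical experiment (not part of the original text). The sketch below, in which the concrete choices of f, u and x_i are illustrative assumptions, produces the state from a given input and initial state by integrating (1.1):

```python
# A small numerical illustration of the control system as a "box": given an
# input u and an initial state x(t_i) = x_i, the state x is manufactured by
# the law (1.1).  The concrete f, u and x_i below are assumed examples.
import numpy as np
from scipy.integrate import solve_ivp

def f(x, u):                 # an example right hand side, f(x, u) = -x + u
    return -x + u

def u(t):                    # an example piecewise continuous input
    return 1.0 if t < 1.0 else 0.0

ti, tf, xi = 0.0, 3.0, 0.0
sol = solve_ivp(lambda t, x: f(x, u(t)), (ti, tf), [xi], max_step=0.01)
print("x(tf) ≈", sol.y[0, -1])
```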
[Figure 1.1: The control system as a box (the "plant"): given the input u and the initial state x(t_i) = x_i, it manufactures the state x according to the law ẋ(t) = f(x(t), u(t)).]

Exercises.

1. Consider the linear control system

$$\frac{dx}{dt}(t) = Ax(t) + Bu(t), \quad x(t_i) = x_i, \quad t \geq t_i. \tag{1.3}$$

Show that its unique solution is given by

$$x(t) = e^{(t-t_i)A} x_i + e^{tA} \int_{t_i}^{t} e^{-\tau A} B u(\tau)\, d\tau.$$

(The matrix exponential e^{tA} is introduced in Section 1.4.)

2. Solve the differential equation

$$\frac{dp}{dt}(t) = (p(t))^2 - 1, \quad t \in [0, 1], \quad p(1) = 0.$$

¹By an R^m-valued piecewise continuous function on an interval [a, b], we mean a function f : [a, b] → R^m such that there exist finitely many points t_1, ..., t_k ∈ [a, b] such that f is continuous on each of the intervals (a, t_1), (t_1, t_2), ..., (t_{k−1}, t_k), (t_k, b), the left- and right-hand limits lim_{t↑t_l} f(t) and lim_{t↓t_l} f(t) exist for all l ∈ {1, ..., k}, and lim_{t↓a} f(t) and lim_{t↑b} f(t) exist.
A characteristic of underdetermined equations is that one can choose the free variable in such a way that some desirable effect is produced on the other, dependent variable. For example, if with our algebraic equation x + u = 10 we wish to make x < 5, then we can achieve this by choosing the free variable u to be strictly larger than 5. Control theory is all about doing similar things with differential equations of the type (1.1). The state variables x comprise the to-be-controlled variables, which depend on the free variables u, the inputs. For example, in the case of an aircraft, the speed, altitude and so on are the to-be-controlled variables, while the angle of the wing flaps, the speed of the propeller and so on, which the pilot can specify, are the inputs.
1.3 Questions in control theory

In control theory, the following two questions are basic:
1. How do we choose the control inputs to achieve regulation of the state variables?
For instance, we might want the state x to track some desired reference state xr , and there
must be stability under external disturbances. For example, a thermostat is a device in
an air conditioner that changes the input in such a way that the temperature tracks a
constant reference temperature and there is stability despite external disturbances (doors
being opened or closed, change in the number of people in the room, activity in the kitchen
etcetera): if the temperature in the room goes above the reference value, then the thermostat
(which is a bimetallic strip) bends and closes the circuit so that electricity flows and the air
conditioner produces a cooling action; on the other hand if the temperature in the room
drops below the reference value, the bimetallic strip bends the other way hence breaking the
circuit and the air conditioner produces no further cooling. These problems of regulation
are mostly the domain of control theory for engineering systems. In economic systems, one
is furthermore interested in extreme performances of control systems. This naturally brings
us to the other important question in control theory, which is the realm of optimal control
theory.
2. How do we control optimally?
Tools from calculus of variations are employed here. These questions of optimality arise naturally. For example, in the case of an aircraft, we are not just interested in flying from one place to another, but we would also like to do so in a way such that the total travel time is minimized or the fuel consumption is minimized. With our algebraic equation x + u = 10, in which we want x < 5, suppose that furthermore we wish to do so in a manner such that u is the least possible integer. Then the only possible choice of the input u is 6. Optimal control addresses similar questions with differential equations of the type (1.1), together with a performance index functional, which is a function that measures optimality.
This course is about the basic principles behind optimal control theory.
1.4 Appendix: the matrix exponential

In this appendix, we introduce the exponential of a matrix, which is useful for obtaining explicit solutions to the linear control system (1.3) in Exercise 1 of Section 1.2. We begin with a few preliminaries concerning vector-valued functions.
With a slight abuse of notation, a vector-valued function x(t) is a vector whose entries are functions of t. Similarly, a matrix-valued function A(t) is a matrix whose entries are functions:

$$x(t) = \begin{pmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{pmatrix}, \qquad A(t) = \begin{pmatrix} a_{11}(t) & \dots & a_{1n}(t) \\ \vdots & & \vdots \\ a_{m1}(t) & \dots & a_{mn}(t) \end{pmatrix}.$$
The calculus operations of taking limits, differentiating, and so on are extended to vector-valued and matrix-valued functions by performing the operations on each entry separately. Thus by definition,

$$\lim_{t \to t_0} x(t) = \begin{pmatrix} \lim_{t \to t_0} x_1(t) \\ \vdots \\ \lim_{t \to t_0} x_n(t) \end{pmatrix}.$$
So this limit exists iff lim_{t→t_0} x_i(t) exists for all i ∈ {1, ..., n}. Similarly, the derivative of a vector-valued or matrix-valued function is the function obtained by differentiating each entry separately:

$$\frac{dx}{dt}(t) = \begin{pmatrix} x_1'(t) \\ \vdots \\ x_n'(t) \end{pmatrix}, \qquad \frac{dA}{dt}(t) = \begin{pmatrix} a_{11}'(t) & \dots & a_{1n}'(t) \\ \vdots & & \vdots \\ a_{m1}'(t) & \dots & a_{mn}'(t) \end{pmatrix},$$
where x_i'(t) is the derivative of x_i(t), and so on. So dx/dt is defined iff each of the functions x_i(t) is differentiable. The derivative can also be described in vector notation, as

$$\frac{dx}{dt}(t) = \lim_{h \to 0} \frac{x(t+h) - x(t)}{h}. \tag{1.4}$$

Here x(t+h) − x(t) is computed by vector addition, and the h in the denominator stands for scalar multiplication by h^{-1}. The limit is obtained by evaluating the limit of each entry separately, as above. So the entries of (1.4) are the derivatives x_i'(t). The same is true for matrix-valued functions.
A system of homogeneous, first-order, linear constant-coefficient differential equations is a matrix equation of the form

$$\frac{dx}{dt}(t) = Ax(t), \tag{1.5}$$

where A is an n × n real matrix and x(t) is an n-dimensional vector-valued function. Writing out such a system, we obtain a system of n differential equations,

$$\frac{dx_1}{dt}(t) = a_{11}x_1(t) + \dots + a_{1n}x_n(t), \quad \dots, \quad \frac{dx_n}{dt}(t) = a_{n1}x_1(t) + \dots + a_{nn}x_n(t).$$
The x_i(t) are unknown functions, and the a_{ij} are scalars. For example, if we substitute the matrix

$$A = \begin{pmatrix} 3 & 2 \\ 1 & 4 \end{pmatrix}$$

for A, (1.5) becomes a system of two equations in two unknowns:

$$\frac{dx_1}{dt}(t) = 3x_1(t) + 2x_2(t), \qquad \frac{dx_2}{dt}(t) = x_1(t) + 4x_2(t).$$
Now consider the case when the matrix A is simply a scalar. We learn in calculus that the solutions to the first-order scalar linear differential equation

$$\frac{dx}{dt}(t) = a x(t)$$

are x(t) = c e^{ta}, c being an arbitrary constant. Indeed, c e^{ta} obviously solves this equation. To show that every solution has this form, let x(t) be an arbitrary differentiable function which is a solution. We differentiate e^{−ta} x(t) using the product rule:

$$\frac{d}{dt}\left(e^{-ta} x(t)\right) = -a e^{-ta} x(t) + e^{-ta}\, a x(t) = 0.$$
Thus e^{−ta} x(t) is a constant, say c, and x(t) = c e^{ta}. Now suppose that, analogous to

$$e^a = 1 + a + \frac{a^2}{2!} + \frac{a^3}{3!} + \dots, \quad a \in \mathbb{R},$$

we define

$$e^A = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \dots, \quad A \in \mathbb{R}^{n \times n}. \tag{1.6}$$

Later in this section, we study this matrix exponential, and use the matrix-valued function

$$e^{tA} = I + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \dots$$

(where t is a variable scalar) to solve (1.5). We begin by stating the following result, which shows that the series in (1.6) converges for any given square matrix A.
Theorem 1.4.1 The series (1.6) converges for any given square matrix A.
We have collected the proofs together at the end of this section in order to not break up the
discussion.
Since matrix multiplication is relatively complicated, it isn't easy to write down the matrix entries of e^A directly. In particular, the entries of e^A are usually not obtained by exponentiating the entries of A. However, one case in which the exponential is easily computed is when A is a diagonal matrix, say with diagonal entries λ_i. Inspection of the series shows that e^A is also diagonal in this case and that its diagonal entries are e^{λ_i}.
The exponential of a matrix A can also be determined when A is diagonalizable, that is, whenever we know a matrix P such that P^{−1}AP is a diagonal matrix D. Then A = PDP^{−1}, and using (PDP^{−1})^k = PD^kP^{−1}, we obtain

$$e^A = I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \dots = PIP^{-1} + PDP^{-1} + \frac{1}{2!}PD^2P^{-1} + \frac{1}{3!}PD^3P^{-1} + \dots$$
$$= P\left(I + D + \frac{1}{2!}D^2 + \frac{1}{3!}D^3 + \dots\right)P^{-1} = P e^D P^{-1} = P \begin{pmatrix} e^{\lambda_1} & & 0 \\ & \ddots & \\ 0 & & e^{\lambda_n} \end{pmatrix} P^{-1},$$

where λ_1, ..., λ_n denote the diagonal entries of D.
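The identity e^A = P e^D P^{−1} can be checked numerically; the following sketch (an addition for illustration, using the 2 × 2 example matrix from above) compares it against scipy's matrix exponential:

```python
# A quick numerical check of e^A = P e^D P^{-1} for a diagonalizable matrix,
# compared against scipy's expm.  The matrix A is the example used earlier.
import numpy as np
from scipy.linalg import expm

A = np.array([[3.0, 2.0], [1.0, 4.0]])
lam, P = np.linalg.eig(A)                  # A = P diag(lam) P^{-1}
E = P @ np.diag(np.exp(lam)) @ np.linalg.inv(P)
print(np.allclose(E, expm(A)))             # True
```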
For real numbers a and b, we have the identity

$$e^{a+b} = 1 + \frac{(a+b)}{1!} + \frac{(a+b)^2}{2!} + \dots = \left(1 + \frac{a}{1!} + \frac{a^2}{2!} + \dots\right)\left(1 + \frac{b}{1!} + \frac{b^2}{2!} + \dots\right) = e^a e^b. \tag{1.7}$$

We cannot substitute matrices into this identity because the commutative law is needed to obtain equality of the two series. For instance, the quadratic terms of (1.7), computed without the commutative law, are ½(a² + ab + ba + b²) and ½a² + ab + ½b². They are not equal unless ab = ba.
So there is no reason to expect e^{A+B} to equal e^A e^B in general. However, if two matrices A and B happen to commute, the formal identity can be applied.

Theorem 1.4.2 If A, B ∈ R^{n×n} commute (that is, AB = BA), then e^{A+B} = e^A e^B.

The proof is at the end of this section. Note that the above implies that e^A is always invertible, and in fact its inverse is e^{−A}: indeed, I = e^{A−A} = e^A e^{−A}.
Exercises.

1. Give an example of 2 × 2 matrices A and B such that e^{A+B} ≠ e^A e^B.

2. Compute e^A, where

$$A = \begin{pmatrix} 2 & 3 \\ 0 & 2 \end{pmatrix}. \qquad \text{Hint: } A = 2I + \begin{pmatrix} 0 & 3 \\ 0 & 0 \end{pmatrix}.$$
We now come to the main result relating the matrix exponential to differential equations. Given an n × n matrix A, we consider the exponential e^{tA}, t being a variable scalar, as a matrix-valued function:

$$e^{tA} = I + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \dots$$
Theorem 1.4.3 e^{tA} is a differentiable matrix-valued function of t, and its derivative is A e^{tA}.
The proof is at the end of the section.
Theorem 1.4.4 (Product rule.) Let A(t) and B(t) be differentiable matrix-valued functions of t, of suitable sizes so that their product is defined. Then the matrix product A(t)B(t) is differentiable, and its derivative is

$$\frac{d}{dt}\left(A(t)B(t)\right) = \frac{dA}{dt}(t)\, B(t) + A(t)\, \frac{dB}{dt}(t).$$
The proof is left as an exercise.
Theorem 1.4.5 The first-order linear differential equation

$$\frac{dx}{dt}(t) = Ax(t), \quad t \geq t_i, \quad x(t_i) = x_i$$

has the unique solution x(t) = e^{(t−t_i)A} x_i.

Proof By Theorems 1.4.3 and 1.4.4, the function e^{(t−t_i)A} x_i solves the equation and satisfies the initial condition. Conversely, if x is any solution, then differentiating e^{−(t−t_i)A} x(t) with the product rule gives −A e^{−(t−t_i)A} x(t) + e^{−(t−t_i)A} A x(t) = 0, so that e^{−(t−t_i)A} x(t) is the constant x_i, and hence x(t) = e^{(t−t_i)A} x_i.
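Theorem 1.4.5 is easy to test numerically; the sketch below (illustrative data, not from the text) compares the closed-form solution e^{(t−t_i)A} x_i against a direct numerical integration of ẋ = Ax:

```python
# A numerical sketch of Theorem 1.4.5: the solution of x' = Ax, x(t_i) = x_i,
# is x(t) = e^{(t - t_i) A} x_i.  The data below are assumed examples.
import numpy as np
from scipy.linalg import expm
from scipy.integrate import solve_ivp

A = np.array([[3.0, 2.0], [1.0, 4.0]])
ti, t, xi = 0.0, 1.0, np.array([1.0, -1.0])

x_formula = expm((t - ti) * A) @ xi
x_numeric = solve_ivp(lambda s, x: A @ x, (ti, t), xi, rtol=1e-10).y[:, -1]
print(np.allclose(x_formula, x_numeric))   # True
```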
Chapter 2

The optimal control problem

2.1 Introduction
Optimal control theory is about controlling the given system in some best way. The optimal control strategy will depend on what is defined as the best way. This is usually specified in terms of a performance index functional.

As a simple example, consider the problem of a rocket launching a satellite into an orbit about the earth. An associated optimal control problem is to choose the controls (the thrust attitude angle and the rate of emission of the exhaust gases) so that the rocket takes the satellite into its prescribed orbit with minimum expenditure of fuel or in minimum time.

We first look at a number of specific examples that motivate the general form for optimal control problems, and having seen these, we give the statement of the optimal control problem that we study in these notes in Section 2.4.
2.2 Examples of optimal control problems

Example. (Economic growth.) Consider an economy whose output is divided between consumption and investment,

$$Y(t) = C(t) + I(t),$$

where C and I are the rates of consumption and investment, respectively. The investment is used to increase the capital stock and replace machinery, that is,

$$I(t) = \frac{dK}{dt}(t) + \delta K(t),$$

where K denotes the capital stock and δ > 0 is the rate of depreciation. Output is determined by the capital per worker k = K/L, where L denotes the labour force, through a production function φ: the output per worker is Y/L = φ(k); see Figure 2.1.

[Figure 2.1: The production function φ.]

Writing c = C/L for the consumption rate per worker, and using

$$\frac{d}{dt}\left(\frac{K}{L}\right) = \frac{1}{L}\frac{dK}{dt} - \frac{K}{L}\cdot\frac{1}{L}\frac{dL}{dt},$$

it follows that

$$\varphi(k) = c + \frac{dk}{dt} + \frac{1}{L}\frac{dL}{dt}\, k + \delta k.$$

Assuming that labour grows exponentially, that is, L(t) = L_0 e^{λt}, we have

$$\frac{dk}{dt}(t) = \varphi(k(t)) - (\lambda + \delta)k(t) - c(t),$$

which is the governing equation of this economic growth model. The consumption rate per worker, namely c, is the control input for this problem.
The central planner's problem is to choose c on a time interval [0, T] in some best way. But what are the desired economic objectives that define this best way? One method of quantifying the best way is to introduce a utility function U, which is a measure of the value attached to the consumption. The function U normally satisfies U′′(c) < 0, which means that a fixed increment in consumption will be valued increasingly highly with decreasing consumption level. This is illustrated in Figure 2.2. We also need to optimize consumption for [0, T], but with some discounting for future time. So the central planner wishes to maximize the welfare integral

$$W(c) = \int_0^T e^{-\rho t}\, U(c(t))\, dt,$$

where ρ is known as the discount rate, which is a measure of preference for earlier rather than later consumption. If ρ = 0, then there is no time discounting and consumption is valued equally at all times; as ρ increases, so does the discounting of consumption and utility at future times. The mathematical problem has now been reduced to finding the optimal consumption path {c(t), t ∈ [0, T]}, which maximizes W subject to the constraint

$$\frac{dk}{dt}(t) = \varphi(k(t)) - (\lambda + \delta)k(t) - c(t), \quad t \in [0, T], \quad k(0) = k_0.$$

[Figure 2.2: The utility function U.]
Example. (Exploited populations.) Many resources are to some extent renewable (for example, fish populations, grazing land, forests) and a vital problem is their optimal management. With no harvesting, the resource population x is assumed to obey a growth law of the form

$$\frac{dx}{dt}(t) = \varphi(x(t)). \tag{2.1}$$

Suppose now that the population is harvested with effort e(t), so that the harvest rate is e(t)x(t), that the harvest is sold at a price p per unit, and that the cost per unit effort is c. With future revenue discounted at the rate ρ, the problem is to choose the harvesting effort e so as to maximize

$$\int_0^T e^{-\rho t}\left(p\, e(t)x(t) - c\, e(t)\right) dt,$$

subject to

$$\frac{dx}{dt}(t) = \varphi(x(t)) - e(t)x(t),$$

and the initial condition x(0) = x_0.
2.3 Functionals

The examples from the previous section involve finding extremum values of integrals subject to a differential equation constraint. These integrals are particular examples of a functional. A functional is a correspondence which assigns a definite real number to each function belonging to some class. Thus, one might say that a functional is a kind of function, where the independent variable is itself a function.

Examples. The following are examples of functionals:

1. Consider the set of all rectifiable plane curves¹. A definite number associated with each such curve is, for instance, its length. Thus the length of a curve is a functional defined on the set of rectifiable curves.

2. Let x be an arbitrary continuously differentiable function defined on [t_i, t_f]. Then the formula

$$I(x) = \int_{t_i}^{t_f} \left(\frac{dx}{dt}(t)\right)^2 dt$$

defines a functional on the set of all such functions x.

3. More generally, given a continuous function F of three variables, the formula

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), t\right) dt$$

defines a functional on the set of continuously differentiable functions on [t_i, t_f].

4. Given functions f and F and an initial state x_i, the formula

$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t), u(t), t)\, dt,$$

where x denotes the state corresponding to the input u via (1.1), defines a functional on the class of inputs u.

¹In analysis, the length of a curve is defined as the limiting length of a polygonal line inscribed in the curve (that is, with vertices lying on the curve) as the maximum length of the chords forming the polygonal line goes to zero. If this limit exists and is finite, then the curve is said to be rectifiable.
2.4 Statement of the optimal control problem

The examples discussed in Section 2.2 can be put in the following form. As mentioned in the introduction, we assume that the state of the system satisfies the coupled first order differential equations

$$\frac{dx_1}{dt}(t) = f_1(x_1(t), \dots, x_n(t), u_1(t), \dots, u_m(t)), \quad \dots, \quad \frac{dx_n}{dt}(t) = f_n(x_1(t), \dots, x_n(t), u_1(t), \dots, u_m(t))$$

on [t_i, t_f], where the m variables u_1, ..., u_m form the control input vector u. We can conveniently write the system of equations above in the form

$$\frac{dx}{dt}(t) = f(x(t), u(t)), \quad x(t_i) = x_i, \quad t \in [t_i, t_f].$$

We assume that u ∈ (C[t_i, t_f])^m, that is, each component of u is a continuous function on [t_i, t_f]. It is also assumed that f_1, ..., f_n possess partial derivatives with respect to x_k, 1 ≤ k ≤ n, and u_l, 1 ≤ l ≤ m, and that these are continuous. (So f is continuously differentiable in both variables.) The initial value of x is specified (x_i at time t_i), which means that specifying u(t) for t ∈ [t_i, t_f] determines x (see Theorem 1.2.1).

The basic optimal control problem is to choose the control u ∈ (C[t_i, t_f])^m such that:

1. The state x is transferred from x_i to a state at terminal time t_f where some (or all or none) of the state variable components are specified; for example, without loss of generality², x(t_f)_k is specified for k ∈ {1, ..., r}.

2. The functional

$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t), u(t), t)\, dt$$

is minimized³.

²Otherwise, renumber the state components so that this holds.
³Or maximized: maximizing I_{x_i} is the same as minimizing −I_{x_i}.
A function u* that minimizes the functional I_{x_i} is called an optimal control, the corresponding state x* is called the optimal state, and the pair (x*, u*) is called an optimal trajectory. Using the notation above, we can identify the optimal control problem examples listed in Section 2.2. For the economic growth problem,

$$n = 1, \quad m = 1, \quad x = k, \quad u = c, \quad f(x, u) = \varphi(x) - (\lambda + \delta)x - u, \quad F(x, u, t) = -e^{-\rho t}\, U(u)$$

(the minus sign in F turns the maximization of the welfare integral into a minimization).
Chapter 3
Calculus of variations
3.1
Introduction
Before we attempt solving the optimal control problem described in Section 2.4 of Chapter 2, that is, an extremum problem for a functional of the type described in item 4 of Section 2.3, we consider the following simpler problem in this chapter: we would like to find extremal curves x for a functional of the type described in item 3 of Section 2.3. This is simpler since there is no differential equation constraint.

In order to solve this problem, we first make the problem more abstract by considering the problem of finding extremal points x* ∈ X for a functional I : X → R, where X is a normed linear space. (The notion of a normed linear space is introduced in Section 3.4.) We develop a calculus for solving such problems. This situation is entirely analogous to the problem of finding extremal points for a differentiable function f : R → R:
Consider for example the quadratic function f(x) = ax² + bx + c. Suppose that one wants to know the points x at which f assumes a maximum or a minimum. We know that if f has a maximum or a minimum at the point x*, then the derivative of the function must be zero at that point: f′(x*) = 0. See Figure 3.1. So one can proceed as follows. First find the expression for the derivative: f′(x) = 2ax + b. Next solve for the unknown x* in the equation f′(x*) = 0, that is,

$$2ax^* + b = 0, \tag{3.1}$$

and so we find that a candidate for the point x* which minimizes or maximizes f is x* = −b/(2a), which is obtained by solving the algebraic equation (3.1) above.

[Figure 3.1: The derivative of f vanishes at the extremal point x*.]
We mimic this procedure for functionals: the derivative of a functional must likewise vanish at extremal points. We define the derivative of a functional I : X → R in Section 3.4, and also prove Theorem 3.4.2, which says that this derivative must vanish at an extremal point x* ∈ X. In the remainder of the chapter, we apply Theorem 3.4.2 to the concrete case where X comprises continuously differentiable functions, and I is a functional of the form

$$I(x) = \int_{t_i}^{t_f} F(x(t), x'(t), t)\, dt. \tag{3.2}$$

We find the derivative of such a functional, and equating it to zero, we obtain a necessary condition that an extremal curve should satisfy: instead of an algebraic equation (3.1), we now obtain a differential equation, called the Euler-Lagrange equation, given by (3.9). Continuously differentiable solutions x of this differential equation are then candidates which maximize or minimize the functional I. Historically speaking, such optimization problems arising from physics gave birth to the subject of calculus of variations. We begin this chapter with the discussion of one such milestone problem, called the brachistochrone problem (brachistos = shortest, chronos = time).
3.2 The brachistochrone problem
The calculus of variations originated from a problem posed by the Swiss mathematician Johann Bernoulli (1667-1748). He required the form of the curve joining two fixed points A and B in a vertical plane such that a body sliding down the curve (under gravity and no friction) travels from A to B in minimum time. This problem does not have a trivial solution; the straight line from A to B is not the solution (this is also intuitively clear, since if the slope is high at the beginning, the body picks up a high velocity, and so it's plausible that the travel time could be reduced), and it can be verified experimentally by sliding beads down wires in appropriate shapes.
To pose the problem in mathematical terms, we introduce coordinates as shown in Figure 3.2, so that A is the point (0, 0) and B corresponds to (x_0, y_0), with the y axis pointing in the direction of gravity.

[Figure 3.2: The curve from A = (0, 0) to B = (x_0, y_0), with y measured in the direction of gravity.]

Assuming that the particle is released from rest at A, conservation of energy gives

$$\frac{1}{2}mv^2 - mgy = 0, \tag{3.3}$$

where we have taken the zero potential energy level at y = 0, and where v denotes the speed of the particle. Thus the speed is given by

$$v = \frac{ds}{dt} = \sqrt{2gy}, \tag{3.4}$$
where s denotes arc length along the curve. From Figure 3.3, we see that an element of arc length, Δs, is given approximately by ((Δx)² + (Δy)²)^{1/2}.

[Figure 3.3: Element of arc length.]

Hence the time of descent is given by

$$T = \int_{\text{curve}} \frac{ds}{\sqrt{2gy}} = \frac{1}{\sqrt{2g}} \int_0^{y_0} \left(\frac{1 + \left(\frac{dx}{dy}\right)^2}{y}\right)^{1/2} dy.$$

Our problem is to find the path {x(y), y ∈ [0, y_0]}, satisfying x(0) = 0 and x(y_0) = x_0, which minimizes T.
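Before any theory, the functional T can simply be evaluated for candidate paths. The sketch below (an added illustration; the values of g, x_0 and y_0 are assumptions) computes the descent time along the straight line from A to B by numerical quadrature, a benchmark the optimal curve must beat:

```python
# Evaluate the descent-time functional T for the straight line from A = (0,0)
# to B = (x0, y0); along this path dx/dy is the constant x0/y0.
import numpy as np
from scipy.integrate import quad

g, x0, y0 = 9.81, 1.0, 1.0
dxdy = x0 / y0
T_line, _ = quad(lambda y: np.sqrt((1.0 + dxdy**2) / y), 0.0, y0)
T_line /= np.sqrt(2.0 * g)
print("T along the straight line ≈", T_line)   # ≈ 0.639 s for these values
```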
3.3 Functionals and functions of n variables

To understand the basic meaning of the problems and methods of the calculus of variations, it is important to see how they are related to problems of the study of functions of n real variables. Thus, consider a functional of the form

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), t\right) dt, \quad x(t_i) = x_i, \quad x(t_f) = x_f.$$

Here each curve x is assigned a certain number. To find a related function of the sort considered in classical analysis, we may proceed as follows. Using the points
$$t_i = t_0, \; t_1, \; \dots, \; t_n, \; t_{n+1} = t_f,$$

we divide the interval [t_i, t_f] into n + 1 equal parts. Then we replace the curve {x(t), t ∈ [t_i, t_f]} by the polygonal line joining the points

$$(t_0, x_i), \; (t_1, x(t_1)), \; \dots, \; (t_n, x(t_n)), \; (t_{n+1}, x_f),$$

and we approximate the functional I at x by the sum

$$I_n(x_1, \dots, x_n) = \sum_{k=1}^{n+1} F\left(x_k, \frac{x_k - x_{k-1}}{h_k}, t_k\right) h_k, \tag{3.5}$$

where x_k = x(t_k) and h_k = t_k − t_{k−1}. Each polygonal line is uniquely determined by the ordinates x_1, ..., x_n of its vertices (recall that x_0 = x_i and x_{n+1} = x_f are fixed), and the sum (3.5) is therefore a function of the n variables x_1, ..., x_n. Thus, as an approximation, we can regard the variational problem as the problem of finding the extrema of the function I_n(x_1, ..., x_n).
In solving variational problems, Euler made extensive use of this method of finite differences. By replacing smooth curves by polygonal lines, he reduced the problem of finding extrema of a functional to the problem of finding extrema of a function of n variables, and then he obtained exact solutions by passing to the limit as n → ∞. In this sense, functionals can be regarded as functions of infinitely many variables (that is, the infinitely many values of x(t) at different points), and the calculus of variations can be regarded as the corresponding analog of differential calculus of functions of n real variables.
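Euler's method of finite differences is easy to try on a computer. The following sketch (an added illustration; the choice F = (x′)² and the boundary data are assumptions) minimizes the discretized sum I_n over the interior ordinates and recovers the straight line that the Euler-Lagrange equation predicts:

```python
# Euler's method of finite differences: approximate the extremal of
# I(x) = \int_0^1 (x'(t))^2 dt, x(0) = 0, x(1) = 1, by minimizing I_n
# over the ordinates x_1, ..., x_n of a polygonal line.
import numpy as np
from scipy.optimize import minimize

n = 19
t = np.linspace(0.0, 1.0, n + 2)           # t_0, ..., t_{n+1}
h = t[1] - t[0]
x0, xf = 0.0, 1.0

def I_n(interior):
    x = np.concatenate(([x0], interior, [xf]))
    slopes = np.diff(x) / h                # (x_k - x_{k-1}) / h_k
    return np.sum(slopes**2 * h)           # F(x, x', t) = (x')^2

res = minimize(I_n, np.zeros(n))
# The Euler-Lagrange equation gives the straight line x(t) = t:
print(np.allclose(res.x, t[1:-1], atol=1e-4))   # True
```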
3.4 Normed linear spaces and the derivative of a functional

If the functional has the form

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), t\right) dt,$$

then it is natural to regard the functional as defined on the set of all functions with a continuous first derivative.
The concept of continuity plays an important role for functionals, just as it does for the ordinary functions considered in classical analysis. In order to formulate this concept for functionals,
we must somehow introduce a notion of closeness for elements in a function space. This is most
conveniently done by introducing the concept of the norm of a function, analogous to the concept
of the distance between a point in Euclidean space and the origin. Although in what follows we
shall always be concerned with function spaces, it will be most convenient to introduce the concept
of a norm in a more general and abstract form, by introducing the concept of a normed linear
space.
By a linear space (or vector space) over R, we mean a set X together with the operations of addition + : X × X → X and scalar multiplication · : R × X → X that satisfy the following:

1. x_1 + (x_2 + x_3) = (x_1 + x_2) + x_3 for all x_1, x_2, x_3 ∈ X.
2. There exists an element, denoted by 0 (called the zero element), such that x + 0 = 0 + x = x for all x ∈ X.
3. For every x ∈ X, there exists an element, denoted by −x, such that x + (−x) = (−x) + x = 0.
4. x_1 + x_2 = x_2 + x_1 for all x_1, x_2 ∈ X.
5. 1 · x = x for all x ∈ X.
6. α · (β · x) = (αβ) · x for all α, β ∈ R and for all x ∈ X.
7. (α + β) · x = α · x + β · x for all α, β ∈ R and for all x ∈ X.
8. α · (x_1 + x_2) = α · x_1 + α · x_2 for all α ∈ R and for all x_1, x_2 ∈ X.
A linear functional L : X → R is a map that satisfies

1. L(x_1 + x_2) = L(x_1) + L(x_2) for all x_1, x_2 ∈ X.
2. L(α · x) = α L(x) for all α ∈ R and for all x ∈ X.
The set ker(L) = {x ∈ X | L(x) = 0} is called the kernel of the linear functional L.

Exercise. (∗) If L_1, L_2 are linear functionals defined on X such that ker(L_1) ⊂ ker(L_2), then prove that there exists a constant α ∈ R such that L_2(x) = α L_1(x) for all x ∈ X.
Hint: The case when L_1 = 0 is trivial. For the other case, first prove that if ker(L_1) ≠ X, then there exists an x_0 ∈ X such that X = ker(L_1) + [x_0], where [x_0] denotes the linear span of x_0. What is L_2(x) for x ∈ X?
A linear space over R is said to be normed if there exists a function ‖·‖ : X → [0, ∞) (called a norm) such that:

1. ‖x‖ = 0 iff x = 0.
2. ‖α · x‖ = |α| ‖x‖ for all α ∈ R and for all x ∈ X.
3. ‖x_1 + x_2‖ ≤ ‖x_1‖ + ‖x_2‖ for all x_1, x_2 ∈ X. (Triangle inequality.)
In a normed linear space, we can talk about distances between elements, by defining the distance between x_1 and x_2 to be the quantity ‖x_1 − x_2‖. In this manner, a normed linear space becomes a metric space. Recall that a metric space is a set X together with a function d : X × X → R, called the distance, that satisfies

1. d(x, y) ≥ 0 for all x, y in X, and d(x, y) = 0 iff x = y.
2. d(x, y) = d(y, x) for all x, y in X.
3. d(x, z) ≤ d(x, y) + d(y, z) for all x, y, z in X.

Exercise. Let (X, ‖·‖) be a normed linear space. Prove that (X, d) is a metric space, where d : X × X → [0, ∞) is defined by d(x_1, x_2) = ‖x_1 − x_2‖, x_1, x_2 ∈ X.
The elements of a normed linear space can be objects of any kind, for example, numbers, matrices, functions, etcetera. The following normed spaces are important for our subsequent purposes:

1. C[t_i, t_f]. The space C[t_i, t_f] consists of all continuous functions x(·) defined on the closed interval [t_i, t_f]. By addition of elements of C[t_i, t_f], we mean pointwise addition of functions: for x_1, x_2 ∈ C[t_i, t_f], (x_1 + x_2)(t) = x_1(t) + x_2(t) for all t ∈ [t_i, t_f]. Scalar multiplication is defined as follows: (α · x)(t) = α x(t) for all t ∈ [t_i, t_f]. The norm is defined as the maximum of the absolute value:

$$\|x\| = \max_{t \in [t_i, t_f]} |x(t)|.$$

Thus in the space C[t_i, t_f], the distance between the function x and the function x* does not exceed ε if the graph of the function x lies inside a strip of width 2ε bordering the graph of the function x*, as shown in Figure 3.4.
[Figure 3.4: An ε-strip around the graph of x*.]

2. C¹[t_i, t_f]. The space C¹[t_i, t_f] consists of all continuously differentiable functions x(·) defined on [t_i, t_f], with addition and scalar multiplication as above, and with the norm

$$\|x\| = \max_{t \in [t_i, t_f]} |x(t)| + \max_{t \in [t_i, t_f]} \left|\frac{dx}{dt}(t)\right|.$$

Exercise. Show that the arc length functional

$$I(x) = \int_{t_i}^{t_f} \sqrt{1 + (x'(t))^2}\, dt,$$

defined on C¹[t_i, t_f], is not continuous with respect to the norm of C[t_i, t_f].
Hint: One might proceed as follows: consider the curves x_ε(t) = ε² sin(t/ε³) for ε > 0, and prove that ‖x_ε‖ → 0 in C[t_i, t_f], while I(x_ε) → ∞ as ε → 0.
At first it might seem that the space C[t_i, t_f] (which is strictly larger than C¹[t_i, t_f]) would be adequate for the study of variational problems. However, this is not true. In fact, a functional of the basic form

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), t\right) dt$$

is in general continuous only if we interpret closeness of functions as closeness in the space C¹[t_i, t_f]. For example, arc length is continuous if we use the norm in C¹[t_i, t_f], but not¹ continuous if we use the norm in C[t_i, t_f]. Since we want to be able to use ordinary analytic operations such as passage to the limit, then, given a functional, it is reasonable to choose a function space such that the functional is continuous.
So far we have talked about linear spaces and functionals defined on them. However, in many variational problems, we have to deal with functionals defined on sets of functions which do not form linear spaces. In fact, the set of functions satisfying the constraints of a given variational problem, called the admissible functions, is in general not a linear space. For example, the admissible curves for the brachistochrone problem are the smooth plane curves passing through two fixed points, and the sum of two such curves does not in general pass through the two points. Nevertheless, the concept of a normed linear space and the related concepts of the distance between functions, continuity of functionals, etcetera, play an important role in the calculus of variations. A similar situation is encountered in elementary analysis, where, in dealing with functions of n variables, it is convenient to use the concept of the n-dimensional Euclidean space R^n, even though the domain of definition of a function may not be a linear subspace of R^n.

Next we introduce the concept of the (Fréchet) derivative of a functional, analogous to the concept of the derivative of a function of n variables. This concept will then be used to find extrema of functionals.
Recall that for a function f : R → R, the derivative at a point x* is the approximation of f around x* by an affine linear map (see Figure 3.5):

$$f(x^* + h) = f(x^*) + f'(x^*)h + \varepsilon(h)|h|,$$

with ε(h) → 0 as |h| → 0. Here the derivative f′(x*) : R → R is simply the linear map of multiplication by the number f′(x*). Similarly, in the case of a functional I : R^n → R, the derivative at a point x* is a linear map I′(x*) : R^n → R such that

$$I(x^* + h) = I(x^*) + (I'(x^*))(h) + \varepsilon(h)\|h\|,$$

with ε(h) → 0 as ‖h‖ → 0. A linear map L : R^n → R is always continuous. But this is not true in general if R^n is replaced by an infinite dimensional normed linear space X. So while generalizing

¹For every curve, we can find another curve arbitrarily close to the first in the sense of the norm of C[t_i, t_f], whose length differs from that of the first curve by a factor of 10, say.
the notion of the derivative to a functional I : X → R, we specify continuity of the linear map as well. This motivates the following definition. Let X be a normed linear space. Then a map L : X → R is said to be a continuous linear functional if it is linear and continuous.
Exercises.

1. Let L : X → R be a linear functional on a normed linear space X. Prove that the following are equivalent:
(a) L is continuous.
(b) L is continuous at 0.
(c) There exists an M > 0 such that |L(x)| ≤ M ‖x‖ for all x ∈ X.
Hint: The implication (a)⇒(b) follows from the definition, and (c)⇒(a) is easy to prove using the estimate in (c). For (b)⇒(c), choose δ > 0 such that |L(x)| < 1 whenever ‖x‖ ≤ δ, take M > 1/δ, and consider separately the cases x = 0 and x ≠ 0. In the latter case, note that with x_1 := (δ/‖x‖) x, there holds that ‖x_1‖ ≤ δ.
Remark. Thus in the case of linear functionals, remarkably, continuity is equivalent to continuity at only one point, and this is furthermore equivalent to proving an estimate of the type given in item (c).
2. Let t_m ∈ [t_i, t_f]. Prove that the map L : C[t_i, t_f] → R given by L(x) = x(t_m) is a continuous linear functional.

3. Let φ, ψ ∈ C[t_i, t_f]. Prove that the map L : C¹[t_i, t_f] → R given by

$$L(x) = \int_{t_i}^{t_f} \left(\varphi(t)\, x(t) + \psi(t)\, \frac{dx}{dt}(t)\right) dt$$

is a continuous linear functional.
We are now ready to define the derivative of a functional. Let X be a normed linear space and I : X → R be a functional. Then I is said to be (Fréchet) differentiable at x* (∈ X) if there exist a continuous linear functional, denoted by I′(x*), and a map ε : X → R such that

$$I(x^* + h) = I(x^*) + (I'(x^*))(h) + \varepsilon(h)\|h\| \quad \text{for all } h \in X,$$

and ε(h) → 0 as ‖h‖ → 0. Then I′(x*) is called the (Fréchet) derivative of I at x*. If I is differentiable at every point x ∈ X, then it is simply said to be differentiable.

Theorem 3.4.1 The derivative of a differentiable functional I : X → R at a point x* (∈ X) is unique.
Proof First we note that if L : X → R is a linear functional and if

$$\frac{L(h)}{\|h\|} \to 0 \text{ as } h \to 0, \tag{3.7}$$

then L = 0. For if L(h_0) ≠ 0 for some nonzero h_0 ∈ X, then defining h_n = (1/n) h_0, we see that h_n → 0 as n → ∞, but

$$\lim_{n \to \infty} \frac{L(h_n)}{\|h_n\|} = \frac{L(h_0)}{\|h_0\|} \neq 0,$$

contradicting (3.7). Now if L_1 and L_2 are both derivatives of I at x*, then the linear functional L_1 − L_2 satisfies (3.7), and so L_1 − L_2 = 0, that is, the derivative is unique.
Theorem 3.4.2 Let I : X → R be a functional that is differentiable at x* ∈ X. If I has a local extremum at x* (that is, there exists an r > 0 such that I(x*) ≤ I(x) for all x with ‖x − x*‖ < r, or I(x*) ≥ I(x) for all such x), then I′(x*) = 0.

Proof Suppose that I has a local minimum at x*, and suppose that [I′(x*)](h_0) ≠ 0 for some h_0 ∈ X; replacing h_0 by −h_0 if necessary, we may assume that [I′(x*)](h_0) = −|[I′(x*)](h_0)| < 0. Define h_n = (1/n) h_0. We note that h_n → 0 as n → ∞, and so with N chosen large enough, we have ‖h_n‖ < r for all n > N. It follows that for n > N,

$$0 \leq \frac{I(x^* + h_n) - I(x^*)}{\|h_n\|} = -\frac{|[I'(x^*)](h_0)|}{\|h_0\|} + \varepsilon(h_n).$$

Letting n → ∞, we obtain 0 ≤ −|[I′(x*)](h_0)|/‖h_0‖ < 0, a contradiction. The case of a local maximum is analogous.
3.5 The simplest variational problem

The simplest variational problem is the following: among all curves x ∈ C¹[t_i, t_f] satisfying the boundary conditions x(t_i) = x_i and x(t_f) = x_f (see Figure 3.6), find the one at which a functional of the form (3.2) has an extremum.

[Figure 3.6: Possible paths joining the two fixed points (t_i, x_i) and (t_f, x_f).]

We now use the necessary condition for an extremum (established in Theorem 3.4.2) to solve the simplest variational problem described above. This will enable us to solve the brachistochrone problem from Section 3.2.
Theorem 3.5.1 Let I be a functional of the form

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), t\right) dt,$$

where F(x, x′, t) is a function with continuous first and second partial derivatives with respect to (x, x′, t), and x ranges over C¹[t_i, t_f] with x(t_i) = x_i and x(t_f) = x_f. If I has an extremum at x*, then x* satisfies the Euler-Lagrange equation:

$$\frac{\partial F}{\partial x}\left(x^*(t), \frac{dx^*}{dt}(t), t\right) - \frac{d}{dt}\left[\frac{\partial F}{\partial x'}\left(x^*(t), \frac{dx^*}{dt}(t), t\right)\right] = 0, \quad t \in [t_i, t_f]. \tag{3.9}$$
(In classical notation, (3.9) is abbreviated as F_x − (d/dt) F_{x′} = 0.)

Proof
Step 1. Let X denote the linear space of all h ∈ C¹[t_i, t_f] with h(t_i) = h(t_f) = 0, equipped with the C¹ norm. Defining Ĩ(h) = I(x* + h), we note that Ĩ : X → R has an extremum at 0. It follows from Theorem 3.4.2 that Ĩ′(0) = 0. Note that by the 0 in the right hand side of the equality, we mean the zero functional, namely the continuous linear map from X to R which is defined by h ↦ 0 for all h ∈ X.
Step 2. We now calculate Ĩ′(0). We have

$$\tilde{I}(h) - \tilde{I}(0) = \int_{t_i}^{t_f} \left[F\left(x^*(t) + h(t), {x^*}'(t) + h'(t), t\right) - F\left(x^*(t), {x^*}'(t), t\right)\right] dt.$$
Recall from Taylor's theorem that if F possesses continuous partial derivatives of order 2 in some neighbourhood N of (x_0, x_0′, t_0), then for all (x, x′, t) ∈ N, there exists a Θ ∈ [0, 1] such that

$$F(x, x', t) = F(x_0, x_0', t_0) + \left[(x - x_0)\frac{\partial}{\partial x} + (x' - x_0')\frac{\partial}{\partial x'} + (t - t_0)\frac{\partial}{\partial t}\right] F \Big|_{(x_0, x_0', t_0)}$$
$$+ \frac{1}{2!}\left[(x - x_0)\frac{\partial}{\partial x} + (x' - x_0')\frac{\partial}{\partial x'} + (t - t_0)\frac{\partial}{\partial t}\right]^2 F \Big|_{(x_0, x_0', t_0) + \Theta\left((x, x', t) - (x_0, x_0', t_0)\right)}.$$
Hence for h ∈ X with ‖h‖ small enough,

$$\tilde{I}(h) - \tilde{I}(0) = \int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x^*(t), {x^*}'(t), t)\, h(t) + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\, h'(t)\right] dt$$
$$+ \frac{1}{2!}\int_{t_i}^{t_f} \left[h(t)\frac{\partial}{\partial x} + h'(t)\frac{\partial}{\partial x'}\right]^2 F \Big|_{(x^*(t) + \Theta(t)h(t),\; {x^*}'(t) + \Theta(t)h'(t),\; t)}\, dt.$$

The second integral is at most M‖h‖² in absolute value, for some constant M depending only on the second order partial derivatives of F near the curve, and so Ĩ′(0) is the map

$$h \mapsto \int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x^*(t), {x^*}'(t), t)\, h(t) + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\, h'(t)\right] dt. \tag{3.10}$$
Step 3. Next we show that if the map in (3.10) is the zero map, then (3.9) holds. Define

$$A(t) = \int_{t_i}^{t} \frac{\partial F}{\partial x}(x^*(\tau), {x^*}'(\tau), \tau)\, d\tau.$$

Integrating by parts, we find that

$$\int_{t_i}^{t_f} \frac{\partial F}{\partial x}(x^*(t), {x^*}'(t), t)\, h(t)\, dt = -\int_{t_i}^{t_f} A(t)\, h'(t)\, dt,$$

and so from (3.10), it follows that Ĩ′(0) = 0 implies that

$$\int_{t_i}^{t_f} \left[-A(t) + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\right] h'(t)\, dt = 0 \quad \text{for all } h \in X.$$
Step 4. Finally, we complete the proof by proving the following.

Lemma 3.5.2 If K ∈ C[t_i, t_f] and

$$\int_{t_i}^{t_f} K(t)\, h'(t)\, dt = 0$$

for all h ∈ C¹[t_i, t_f] with h(t_i) = h(t_f) = 0, then there exists a constant k such that K(t) = k for all t ∈ [t_i, t_f].

Proof Let k be the constant defined by the condition

$$\int_{t_i}^{t_f} [K(t) - k]\, dt = 0,$$

and let

$$h(t) = \int_{t_i}^{t} [K(\tau) - k]\, d\tau,$$

so that h ∈ C¹[t_i, t_f] and h(t_i) = h(t_f) = 0. Then

$$\int_{t_i}^{t_f} [K(t) - k]^2\, dt = \int_{t_i}^{t_f} [K(t) - k]\, h'(t)\, dt = \int_{t_i}^{t_f} K(t)\, h'(t)\, dt - k\left[h(t_f) - h(t_i)\right] = 0,$$

and since the integrand [K(t) − k]² is continuous and nonnegative, K(t) = k for all t ∈ [t_i, t_f].

Applying Lemma 3.5.2 to the conclusion of Step 3, with K(t) = −A(t) + (∂F/∂x′)(x*(t), x*′(t), t), we obtain

$$-A(t) + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t) = k \quad \text{for all } t \in [t_i, t_f].$$

Differentiating with respect to t, we obtain (3.9). This completes the proof of Theorem 3.5.1.
Since the Euler-Lagrange equation is in general a second order differential equation, its solution will in general depend on two arbitrary constants, which are determined from the boundary conditions x(t_i) = x_i and x(t_f) = x_f. The problem usually considered in the theory of differential equations is that of finding a solution which is defined in the neighbourhood of some point and satisfies given initial conditions (Cauchy's problem). However, in solving the Euler-Lagrange equation, we are looking for a solution which is defined over all of some fixed region and satisfies given boundary conditions. Therefore, the question of whether or not a certain variational problem has a solution does not just reduce to the usual existence theorems for differential equations.
Note that the Euler-Lagrange equation is only a necessary condition for the existence of an extremum. This is analogous to the case of f : R → R given by f(x) = x³, for which f′(0) = 0, although f clearly does not have a minimum or maximum at 0; see Figure 3.7, and also Exercise 1 below. However, in many cases, the Euler-Lagrange equation by itself is enough to give a complete solution of the problem. In fact, the existence of an extremum is sometimes clear from the context of the problem. For example, in the brachistochrone problem, it is clear from the physical meaning. Similarly, in the problem of finding the curve with the shortest distance between two given points, this is clear from the geometric meaning. If in such scenarios there exists only one critical curve² satisfying the boundary conditions of the problem, then this critical curve must a fortiori be the curve for which the extremum is achieved.

[Figure 3.7: The graph of y = x³: the derivative vanishes at 0, although it is not a point at which the function has a maximum or a minimum.]
The Euler-Lagrange equation is in general a second order differential equation, but in some special cases it can be reduced to a first order differential equation, or its solution can be obtained entirely by evaluating integrals. We indicate some special cases in Exercise 2 below, where in each instance F is independent of one of its arguments.
Exercises.

1. Consider the functional I given by

$$I(x) = \int_0^1 (x(t))^3\, dt,$$

defined for all x ∈ C¹[0, 1] with x(0) = 0 = x(1). Using Theorem 3.5.1, find the critical curve x* for this functional. Is this a curve which maximizes or minimizes the functional I?

2. Prove that:

(a) If F does not depend on x, then the Euler-Lagrange equation becomes

$$\frac{\partial F}{\partial x'}(x(t), x'(t), t) = c,$$

where c is a constant.

(b) If F does not depend on x′, then the Euler-Lagrange equation becomes

$$\frac{\partial F}{\partial x}(x(t), x'(t), t) = 0.$$

(c) If F does not depend on t and if x ∈ C²[t_i, t_f], then the Euler-Lagrange equation becomes

$$F(x(t), x'(t), t) - x'(t)\, \frac{\partial F}{\partial x'}(x(t), x'(t), t) = c,$$

where c is a constant.
Hint: What is d/dt [F(x(t), x′(t), t) − x′(t) (∂F/∂x′)(x(t), x′(t), t)]?

²The solutions of the Euler-Lagrange equation are called critical curves.
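The identity behind Exercise 2(c) can be checked symbolically. The sketch below (an added illustration; the integrand F = x′² − x² is an assumed example) verifies that d/dt [F − x′ F_{x′}] equals x′ times the Euler-Lagrange expression, so it vanishes along every solution of the Euler-Lagrange equation:

```python
# Symbolic check of Exercise 2(c) for the assumed example F = x'^2 - x^2.
import sympy as sp
from sympy.calculus.euler import euler_equations

t = sp.symbols('t')
x = sp.Function('x')
xp = sp.Derivative(x(t), t)
F = xp**2 - x(t)**2

# Euler-Lagrange expression F_x - d/dt F_{x'}, via sympy's helper:
eq = euler_equations(F, x(t), t)[0]
print(eq)                                    # Eq(-2*x(t) - 2*Derivative(x(t), (t, 2)), 0)

# d/dt [F - x' F_{x'}] equals x' times the Euler-Lagrange expression:
beltrami = sp.diff(F - xp * sp.diff(F, xp), t)
print(sp.simplify(beltrami - xp * eq.lhs))   # 0
```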
Example. (Brachistochrone problem, continued.) Determine the minimum value of the functional

$$I(x) = \frac{1}{\sqrt{2g}} \int_0^{y_0} \left(\frac{1 + (x'(y))^2}{y}\right)^{1/2} dy,$$

where x(0) = 0 and x(y_0) = x_0. Here³

$$F(x, x', t) = \left(\frac{1 + (x')^2}{t}\right)^{1/2}$$

does not depend on x, and so, by Exercise 2(a) above, the Euler-Lagrange equation reduces to (d/dy)[∂F/∂x′] = 0. Integrating with respect to y, we obtain

$$\frac{x'(y)}{\sqrt{y}\,\sqrt{1 + (x'(y))^2}} = c,$$

where c is a constant. It can be shown that the general solution of this differential equation is given by

$$x(\theta) = \frac{1}{2c^2}(\theta - \sin\theta) + \tilde{c}, \qquad y(\theta) = \frac{1}{2c^2}(1 - \cos\theta),$$

where c̃ is another constant. The constants are chosen so that the curve passes through the points (0, 0) and (x_0, y_0). This curve is known as a cycloid, and in fact it is the curve described by a point P on a circle that rolls without slipping on the x axis, in such a way that P passes through (x_0, y_0); see Figure 3.8.

[Figure 3.8: The cycloid through (0, 0) and (x_0, y_0).]

³Strictly speaking, the F here does not satisfy the demands made in Theorem 3.5.1. Notwithstanding this fact, with some additional argument, the solution given here can be fully justified.
Example. Among all the curves joining two given points (x_0, y_0) and (x_1, y_1), find the one which generates the surface of minimum area when rotated about the x axis. The area of the surface of revolution generated by rotating the curve y about the x axis is

$$S(y) = 2\pi \int_{x_0}^{x_1} y(x)\sqrt{1 + (y'(x))^2}\, dx.$$

Since the integrand does not depend explicitly on x, the Euler-Lagrange equation becomes (Exercise 2(c) above)

$$F(y(x), y'(x), x) - y'(x)\, \frac{\partial F}{\partial y'}(y(x), y'(x), x) = c,$$

that is,

$$y\sqrt{1 + (y')^2} - \frac{y\,(y')^2}{\sqrt{1 + (y')^2}} = c, \quad \text{or} \quad \frac{y}{\sqrt{1 + (y')^2}} = c,$$

so that y = c√(1 + (y′)²), and it can be shown that this differential equation has the general solution

$$y(x) = c \cosh\left(\frac{x + c_1}{c}\right). \tag{3.11}$$
This curve is called a catenary. The values of the arbitrary constants c and c_1 are determined by the conditions y(x_0) = y_0 and y(x_1) = y_1. It can be shown that the following three cases are possible, depending on the positions of the points (x_0, y_0) and (x_1, y_1):

1. If a single curve of the form (3.11) passes through the points (x_0, y_0) and (x_1, y_1), then this curve is the solution of the problem; see Figure 3.9.

[Figure 3.9: A single catenary through (x_0, y_0) and (x_1, y_1).]

2. If two critical curves of the form (3.11) can be drawn through the points, then one of them gives the minimum area, while the other does not.

3. If no curve of the form (3.11) passes through the points, then there is no surface of smallest area within the class of smooth surfaces of revolution; see Figure 3.10.

[Figure 3.10: Another configuration of the points (x_0, y_0) and (x_1, y_1).]

Exercises. Find critical curves of the following functionals:

1. I(x) = ∫_0^1 x′(t) dt.
2. I(x) = ∫_0^1 x(t) x′(t) dt.
3. I(x) = ∫_0^1 (x(t) + t x′(t)) dt. (Compare the third exercise in Section 3.4.)
4. I(x) = ∫_1^2 (x′(t))³ / t² dt.
5. I(x) = ∫_0^1 (2t x(t) − (x′(t))² + 3x′(t)(x(t))²) dt, where the values of x(0) and x(1) are specified.
6. I(x) = ∫_0^1 (2(x(t))³ + 3t² x′(t)) dt, where x(0) = 0 and the value of x(1) is specified.

7. A strip-mining company intends to remove all of the iron ore from a region that contains an estimated Q tons over a fixed time interval [0, T]. As it is extracted, they will sell it for processing at a net price per ton of

$$p(x(t), x'(t)) = P - \alpha x(t) - \beta x'(t)$$

for positive constants P, α, and β, where x(t) denotes the total tonnage sold by time t. (This pricing model allows the cost of mining to increase with the extent of the mined region and speed of production.)

(a) If the company wishes to maximize its total profit given by

$$I(x) = \int_0^T p(x(t), x'(t))\, x'(t)\, dt, \quad x(0) = 0, \quad x(T) = Q, \tag{3.12}$$

find the critical mining path x*.
3.6 Free boundary conditions

Besides the simplest variational problem considered in the previous section, we now consider the variational problem with free boundary conditions (see Figure 3.11): Let F(x, x′, t) be a function with continuous first and second partial derivatives with respect to (x, x′, t). Then find x ∈ C¹[t_i, t_f] which is an extremum for the functional

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), t\right) dt, \tag{3.13}$$

where no boundary conditions are imposed on x.

[Figure 3.11: Curves with both end points free.]

Theorem 3.6.1 Let I be a functional of the form (3.13). If I has an extremum at x* ∈ C¹[t_i, t_f], then x* satisfies the Euler-Lagrange equation

$$\frac{\partial F}{\partial x}\left(x^*(t), \frac{dx^*}{dt}(t), t\right) - \frac{d}{dt}\left[\frac{\partial F}{\partial x'}\left(x^*(t), \frac{dx^*}{dt}(t), t\right)\right] = 0, \quad t \in [t_i, t_f], \tag{3.14}$$

together with the transversality conditions

$$\frac{\partial F}{\partial x'}\left(x^*(t), \frac{dx^*}{dt}(t), t\right)\Big|_{t = t_i} = 0 \quad \text{and} \quad \frac{\partial F}{\partial x'}\left(x^*(t), \frac{dx^*}{dt}(t), t\right)\Big|_{t = t_f} = 0. \tag{3.15}$$
Proof
Step 1. We take X = C¹[t_i, t_f] and compute I′(x*). Proceeding as in the proof of Theorem 3.5.1, it is easy to see that

$$(I'(x^*))(h) = \int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x^*(t), {x^*}'(t), t)\, h(t) + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\, h'(t)\right] dt, \quad h \in C^1[t_i, t_f].$$

Theorem 3.4.2 implies that this linear functional must be the zero map, that is, (I′(x*))(h) = 0 for all h ∈ C¹[t_i, t_f]. In particular, it is also zero for all h in C¹[t_i, t_f] such that h(t_i) = h(t_f) = 0. But recall that in Step 3 and Step 4 of the proof of Theorem 3.5.1, we proved that if

$$\int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x^*(t), {x^*}'(t), t)\, h(t) + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\, h'(t)\right] dt = 0 \tag{3.16}$$

for all h in C¹[t_i, t_f] such that h(t_i) = h(t_f) = 0, then the Euler-Lagrange equation (3.14) holds.
Step 2. Integration by parts in (3.16) now gives

$$(I'(x^*))(h) = \int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x^*(t), {x^*}'(t), t) - \frac{d}{dt}\frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\right] h(t)\, dt + \left[\frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\, h(t)\right]_{t = t_i}^{t = t_f} \tag{3.17}$$
$$= 0 + \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\Big|_{t = t_f}\, h(t_f) - \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\Big|_{t = t_i}\, h(t_i).$$

The integral in (3.17) vanishes since we have shown in Step 1 above that (3.14) holds. Thus the condition I′(x*) = 0 now takes the form

$$\frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\Big|_{t = t_f}\, h(t_f) - \frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\Big|_{t = t_i}\, h(t_i) = 0,$$

from which (3.15) follows, since h is arbitrary. This completes the proof.
Exercises.

1. Find all curves y = y(x) which have minimum length between the line x = 0 and the line x = 1.

2. Find critical curves for the following functional, when the values of x are free at the end points:

$$I(x) = \int_0^1 \left(\frac{1}{2}(x'(t))^2 + x(t)x'(t) + x'(t) + x(t)\right) dt.$$
Similarly, we can also consider the mixed case (see Figure 3.12), when one end of the curve is fixed, say x(t_i) = x_i, and the other end is free. Then it can be shown that the curve x* satisfies the Euler-Lagrange equation, the transversality condition

$$\frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t)\Big|_{t = t_f} = 0$$

at the free end point, and x(t_i) = x_i serves as the other boundary condition. We can summarize the results as follows: critical curves for (3.13) satisfy the Euler-Lagrange equation (3.14), and moreover

$$\frac{\partial F}{\partial x'}(x^*(t), {x^*}'(t), t) = 0 \quad \text{at each free end point.}$$
Exercises.

1. Find the curve y = y(x) which has minimum length between (0, 0) and the line x = 1.

2. Find critical curves for the following functionals:

(a) I(x) = ∫_0^{π/2} ((x(t))² − (x′(t))²) dt, where x(0) = 0 and x(π/2) is free.
(b) I(x) = ∫_0^{π/2} ((x(t))² − (x′(t))²) dt, where x(0) = 1 and x(π/2) is free.
(c) I(x) = ∫_0^1 cos(x′(t)) dt, where x(0) = 0 and x(1) is free.

[Figure 3.12: The mixed case: one end point fixed, the other free.]
3.7 Generalization

The results in this chapter can be generalized to the case when the integrand F depends on more than one unknown function: if we wish to find extremum values of the functional

$$I(x_1, \dots, x_n) = \int_{t_i}^{t_f} F\left(x_1(t), \dots, x_n(t), \frac{dx_1}{dt}(t), \dots, \frac{dx_n}{dt}(t), t\right) dt,$$

where F(x_1, ..., x_n, x_1′, ..., x_n′, t) is a function with continuous partial derivatives of order ≤ 2, and x_1, ..., x_n are independent functions of the variable t, then following a similar analysis as before, we obtain n Euler-Lagrange equations to be satisfied by the optimal curve, that is,

$$\frac{\partial F}{\partial x_k}\left(x_1(t), \dots, x_n(t), x_1'(t), \dots, x_n'(t), t\right) - \frac{d}{dt}\left[\frac{\partial F}{\partial x_k'}\left(x_1(t), \dots, x_n(t), x_1'(t), \dots, x_n'(t), t\right)\right] = 0,$$

for t ∈ [t_i, t_f], k ∈ {1, ..., n}. Also, at any end point where x_k is free,

$$\frac{\partial F}{\partial x_k'}\left(x_1(t), \dots, x_n(t), \frac{dx_1}{dt}(t), \dots, \frac{dx_n}{dt}(t), t\right) = 0.$$

Exercise. Find critical curves of the functional

$$I(x_1, x_2) = \int_0^{\pi/2} \left((x_1'(t))^2 + (x_2'(t))^2 + 2x_1(t)x_2(t)\right) dt,$$

where x_1(0) = 0, x_1(π/2) = 1, x_2(0) = 0, x_2(π/2) = 1.
Remark. Note that with the above result, we can also solve the problem of finding extremal curves for a functional of the type

$$I(x) = \int_{t_i}^{t_f} F\left(x(t), \frac{dx}{dt}(t), \dots, \frac{d^n x}{dt^n}(t), t\right) dt,$$

over all (sufficiently differentiable) curves x defined on an interval [t_i, t_f], taking values in R. Indeed, we may introduce the auxiliary functions

$$x_1(t) = x(t), \quad x_2(t) = \frac{dx}{dt}(t), \quad \dots, \quad x_n(t) = \frac{d^{n-1}x}{dt^{n-1}}(t), \quad t \in [t_i, t_f],$$

and consider the problem of finding extremal curves for the new functional Ĩ defined by

$$\tilde{I}(x_1, \dots, x_n) = \int_{t_i}^{t_f} F\left(x_1(t), x_2(t), \dots, x_n(t), \frac{dx_n}{dt}(t), t\right) dt.$$

Using the result mentioned in this section, we can then solve this problem. Note that we eliminated higher order derivatives at the price of converting the scalar function into a vector-valued function. Since we can always do this, this is in fact one of the reasons for considering functionals of the type (3.2), in which no higher order derivatives occur.
Chapter 4

Optimal control

4.1 The simplest optimal control problem

Theorem 4.1.1 Let F(x, u, t) and f(x, u) be continuously differentiable functions of each of their arguments. Suppose that u* ∈ C[t_i, t_f] is an optimal control for the functional

$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t), u(t), t)\, dt, \tag{4.1}$$

subject to

$$\frac{dx}{dt}(t) = f(x(t), u(t)), \quad t \in [t_i, t_f], \quad x(t_i) = x_i,$$

with the final state x(t_f) free. If x* denotes the state corresponding to the input u*, then there exists a p* ∈ C¹[t_i, t_f] such that

$$\frac{\partial F}{\partial x}(x^*(t), u^*(t), t) + p^*(t)\frac{\partial f}{\partial x}(x^*(t), u^*(t)) = -\frac{dp^*}{dt}(t), \quad t \in [t_i, t_f], \quad p^*(t_f) = 0, \tag{4.2}$$

$$\frac{\partial F}{\partial u}(x^*(t), u^*(t), t) + p^*(t)\frac{\partial f}{\partial u}(x^*(t), u^*(t)) = 0, \quad t \in [t_i, t_f]. \tag{4.3}$$
Proof
Step 1. We perturb the optimal input u* and use the necessary condition that the derivative must vanish at extremal points (now simply for a function from R to R!) to obtain a certain condition, given by equation (4.6) below.

Let ξ_2 ∈ C[t_i, t_f] be such that ξ_2(t_i) = ξ_2(t_f) = 0. Define u_ε(t) = u*(t) + ε ξ_2(t), ε ∈ R. Then from Theorem 1.2.2, for all ε such that |ε| < ε_0, with ε_0 small enough, there exists a unique x_ε(·) satisfying

$$\frac{dx_\varepsilon}{dt}(t) = f(x_\varepsilon(t), u_\varepsilon(t)), \quad t \in [t_i, t_f], \quad x_\varepsilon(t_i) = x_i. \tag{4.4}$$

Define ξ_{1,ε} ∈ C¹[t_i, t_f] by

$$\xi_{1,\varepsilon}(t) = \begin{cases} \dfrac{x_\varepsilon(t) - x^*(t)}{\varepsilon} & \text{if } \varepsilon \neq 0, \\[2mm] \displaystyle\lim_{\varepsilon' \to 0} \dfrac{x_{\varepsilon'}(t) - x^*(t)}{\varepsilon'} & \text{if } \varepsilon = 0, \end{cases}$$

and write ξ_1 := ξ_{1,0}; note that ξ_1(t_i) = 0. Set

$$I_2(\varepsilon) = I_{x_i}(u_\varepsilon) = \int_{t_i}^{t_f} F(x_\varepsilon(t), u_\varepsilon(t), t)\, dt.$$

It thus follows that I_2 : (−ε_0, ε_0) → R is differentiable (differentiation under the integral sign can be justified!), and from the hypothesis that u* is optimal for I_{x_i}, it follows that the function I_2 has an extremum at ε = 0. As a consequence of the necessity of the condition that the derivative must vanish at extremal points, there must hold that (dI_2/dε)(0) = 0. But we have

$$\frac{dI_2}{d\varepsilon}(\varepsilon) = \int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x_\varepsilon(t), u_\varepsilon(t), t)\, \xi_{1,\varepsilon}(t) + \frac{\partial F}{\partial u}(x_\varepsilon(t), u_\varepsilon(t), t)\, \xi_2(t)\right] dt, \tag{4.5}$$

and so we obtain

$$\int_{t_i}^{t_f} \left[\frac{\partial F}{\partial x}(x^*(t), u^*(t), t)\, \xi_1(t) + \frac{\partial F}{\partial u}(x^*(t), u^*(t), t)\, \xi_2(t)\right] dt = 0. \tag{4.6}$$
Step 2. We now introduce a function p in order to rewrite (4.6) in a different manner, which will eventually help us to conclude (4.2) and (4.3).

Differentiating (4.4) with respect to ε at ε = 0, we see that ξ_1 satisfies the linearized state equation

$$\frac{d\xi_1}{dt}(t) = \frac{\partial f}{\partial x}(x^*(t), u^*(t))\, \xi_1(t) + \frac{\partial f}{\partial u}(x^*(t), u^*(t))\, \xi_2(t), \quad t \in [t_i, t_f], \quad \xi_1(t_i) = 0.$$

Let p ∈ C¹[t_i, t_f] be an as yet unspecified function. Multiplying the linearized equation by p, we have that for all t ∈ [t_i, t_f], there holds

$$p(t)\left[\frac{\partial f}{\partial x}(x^*(t), u^*(t))\, \xi_1(t) + \frac{\partial f}{\partial u}(x^*(t), u^*(t))\, \xi_2(t) - \frac{d\xi_1}{dt}(t)\right] = 0. \tag{4.7}$$

Thus adding the left hand side of (4.7) to the integrand in (4.6) does not change the integral. Consequently,

$$\int_{t_i}^{t_f} \left\{\left[\frac{\partial F}{\partial x}(x^*(t), u^*(t), t) + p(t)\frac{\partial f}{\partial x}(x^*(t), u^*(t))\right] \xi_1(t) + \left[\frac{\partial F}{\partial u}(x^*(t), u^*(t), t) + p(t)\frac{\partial f}{\partial u}(x^*(t), u^*(t))\right] \xi_2(t) - p(t)\frac{d\xi_1}{dt}(t)\right\} dt = 0.$$
Integrating the term −p(t)(dξ_1/dt)(t) by parts, we hence obtain

$$\int_{t_i}^{t_f} \left\{\left[\frac{\partial F}{\partial x}(x^*(t), u^*(t), t) + p(t)\frac{\partial f}{\partial x}(x^*(t), u^*(t)) + \frac{dp}{dt}(t)\right] \xi_1(t) + \left[\frac{\partial F}{\partial u}(x^*(t), u^*(t), t) + p(t)\frac{\partial f}{\partial u}(x^*(t), u^*(t))\right] \xi_2(t)\right\} dt - \Big[p(t)\,\xi_1(t)\Big]_{t = t_i}^{t = t_f} = 0. \tag{4.8}$$
Step 3. In this final step, we choose the right p: one which makes the first summand in the integrand appearing in (4.8) vanish (in other words, a solution of the differential equation in (4.2)), and we impose a boundary condition on this special p (denoted by p*) in such a manner that the boundary term in (4.8) also disappears. With this choice of p, (4.8) allows one to conclude that (4.3) holds too!

Now choose p = p*, where p* is such that

$$\frac{\partial F}{\partial x}(x^*(t), u^*(t), t) + p^*(t)\frac{\partial f}{\partial x}(x^*(t), u^*(t)) + \frac{dp^*}{dt}(t) = 0, \quad t \in [t_i, t_f], \quad p^*(t_f) = 0. \tag{4.9}$$

Such a p* exists: the linear differential equation (4.9) has the explicit solution

$$p^*(t) = \int_t^{t_f} \frac{\partial F}{\partial x}(x^*(s), u^*(s), s)\; e^{\int_t^s \frac{\partial f}{\partial x}(x^*(\tau),\, u^*(\tau))\, d\tau}\, ds.$$

With this choice, the boundary term in (4.8) vanishes as well (since p*(t_f) = 0 and ξ_1(t_i) = 0), and so (4.8) reduces to

$$\int_{t_i}^{t_f} \left[\frac{\partial F}{\partial u}(x^*(t), u^*(t), t) + p^*(t)\frac{\partial f}{\partial u}(x^*(t), u^*(t))\right] \xi_2(t)\, dt = 0$$

for every ξ_2 ∈ C[t_i, t_f] with ξ_2(t_i) = ξ_2(t_f) = 0. By a standard argument (compare Lemma 3.5.2), this implies (4.3). Thus (4.2) and (4.3) hold, and the proof is complete.

Remark. Define the new functional Ĩ by

$$\tilde{I}(x, u) = \int_{t_i}^{t_f} \left[F(x(t), u(t), t) + p^*(t)\left(\frac{dx}{dt}(t) - f(x(t), u(t))\right)\right] dt.$$

Then it can be shown that (4.2) and (4.3) imply that Ĩ′(x*, u*) = 0. This is known as the relative stationarity condition. It is analogous to the Lagrange multiplier theorem encountered in constrained optimization problems in finite dimensions: a necessary condition for an extremum subject to a constraint is the stationarity of a Lagrangian, and here p* plays the role of the Lagrange multiplier.
4.2 The Hamiltonian and Pontryagin minimum principle

Introduce the function

$$H(p, x, u, t) = F(x, u, t) + p\, f(x, u). \tag{4.10}$$

H is called the Hamiltonian, and Theorem 4.1.1 can equivalently be expressed in the following form.

Theorem 4.2.1 Let F(x, u, t) and f(x, u) be continuously differentiable functions of each of their arguments. If u* ∈ C[t_i, t_f] is an optimal control for the functional

$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t), u(t), t)\, dt,$$

subject to (dx/dt)(t) = f(x(t), u(t)), t ∈ [t_i, t_f], x(t_i) = x_i (with x(t_f) free), and x* denotes the corresponding state, then there exists a p* ∈ C¹[t_i, t_f] such that

$$\frac{\partial H}{\partial x}(p^*(t), x^*(t), u^*(t), t) = -\frac{dp^*}{dt}(t), \quad t \in [t_i, t_f], \quad p^*(t_f) = 0, \tag{4.11}$$

$$\frac{\partial H}{\partial u}(p^*(t), x^*(t), u^*(t), t) = 0, \quad t \in [t_i, t_f]. \tag{4.12}$$
Note that the differential equation ẋ* = f(x*, u*) with x*(t_i) = x_i can be expressed in terms of the Hamiltonian as follows:

$$\frac{\partial H}{\partial p}(p^*(t), x^*(t), u^*(t), t) = \frac{dx^*}{dt}(t), \quad t \in [t_i, t_f], \quad x^*(t_i) = x_i. \tag{4.13}$$

The equations (4.11) and (4.13) resemble the equations arising in Hamiltonian mechanics, and these equations together are said to comprise a Hamiltonian differential system. The function p* is called the co-state, and (4.11) is called the adjoint differential equation. This analogy with Hamiltonian mechanics was responsible for the original motivation of the Pontryagin minimum principle, which we state below without proof.
Theorem 4.2.2 (Pontryagin minimum principle.) Let F(x, u, t) and f(x, u) be continuously differentiable functions of each of their arguments. If u* ∈ C[t_i, t_f] is an optimal control for the functional

$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t), u(t), t)\, dt,$$

subject to (dx/dt)(t) = f(x(t), u(t)), t ∈ [t_i, t_f], x(t_i) = x_i, and x* denotes the corresponding state, then there exists a p* ∈ C¹[t_i, t_f] satisfying the adjoint equation

$$\frac{\partial H}{\partial x}(p^*(t), x^*(t), u^*(t), t) = -\frac{dp^*}{dt}(t), \quad t \in [t_i, t_f], \quad p^*(t_f) = 0,$$

and such that for all t ∈ [t_i, t_f],

$$H(p^*(t), x^*(t), u^*(t), t) = \min_{v} H(p^*(t), x^*(t), v, t) \tag{4.14}$$

holds.

The fact that the optimal input u* minimizes the Hamiltonian (equation (4.14)) is known as the Pontryagin minimum principle. Equation (4.12) is then a corollary of this result.
Exercises.

1. Find a critical control of the functional

$$I_{x_0}(u) = \int_0^1 \left((x(t))^2 + (u(t))^2\right) dt$$

subject to ẋ(t) = u(t), t ∈ [0, 1], x(0) = x_0.

2. For T > 0, find a critical control u_T* of the functional

$$I_{x_0}(u) = \frac{1}{2}\int_0^T \left(3(x(t))^2 + (u(t))^2\right) dt$$

subject to ẋ(t) = x(t) + u(t), t ∈ [0, T], x(0) = x_0. Show that there exists a constant k such that

$$\lim_{T \to \infty} u_T^*(t) = k \lim_{T \to \infty} x_T^*(t),$$

where x_T* denotes the state corresponding to u_T*.
Example. (Economic growth, continued.) The problem is to choose the consumption path c ∈ C[0, T] which maximizes the welfare integral

$$W_{k_0}(c) = \int_0^T e^{-\rho t}\, U(c(t))\, dt$$

subject to

$$\frac{dk}{dt}(t) = \varphi(k(t)) - (\lambda + \delta)k(t) - c(t), \quad t \in [0, T], \quad k(0) = k_0,$$

k being the capital per worker, φ the production function, λ and δ positive constants, and ρ the discount factor. The Hamiltonian is given by

$$H(p, k, c, t) = e^{-\rho t}\, U(c) + p\left(\varphi(k) - (\lambda + \delta)k - c\right).$$
From Theorem 4.2.1, it follows that any optimal control input c* and the corresponding state k* satisfy

$$\frac{\partial H}{\partial c}(p^*(t), k^*(t), c^*(t), t) = 0, \quad t \in [0, T],$$

that is,

$$e^{-\rho t}\, \frac{dU}{dc}(c^*(t)) - p^*(t) = 0, \quad t \in [0, T]. \tag{4.15}$$

The adjoint equation is

$$\frac{\partial H}{\partial k}(p^*(t), k^*(t), c^*(t), t) = -\frac{dp^*}{dt}(t), \quad t \in [0, T], \quad p^*(T) = 0,$$

that is,

$$p^*(t)\left[\frac{d\varphi}{dk}(k^*(t)) - (\lambda + \delta)\right] = -\frac{dp^*}{dt}(t), \quad t \in [0, T], \quad p^*(T) = 0.$$

Differentiating (4.15) with respect to t and eliminating p* between the two equations, we obtain

$$\frac{dc^*}{dt}(t) = \frac{\dfrac{dU}{dc}(c^*(t))\left[\dfrac{d\varphi}{dk}(k^*(t)) - (\lambda + \delta + \rho)\right]}{-\dfrac{d^2U}{dc^2}(c^*(t))}, \quad t \in [0, T], \tag{4.16}$$

while evaluating (4.15) at t = T (where p*(T) = 0) gives

$$\frac{dU}{dc}(c^*(T)) = 0.$$
So the equations governing the optimal path are the following coupled, first order, nonlinear differential equations:

$$\frac{dk^*}{dt}(t) = \varphi(k^*(t)) - (\lambda + \delta)k^*(t) - c^*(t), \quad t \in [0, T], \quad k^*(0) = k_0,$$

$$\frac{dc^*}{dt}(t) = \frac{\dfrac{dU}{dc}(c^*(t))\left[\dfrac{d\varphi}{dk}(k^*(t)) - (\lambda + \delta + \rho)\right]}{-\dfrac{d^2U}{dc^2}(c^*(t))}, \quad t \in [0, T], \quad \frac{dU}{dc}(c^*(T)) = 0.$$

In general, it is not possible to solve these equations analytically, and instead one finds an approximate solution numerically on a computer.
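The following sketch shows one way such a numerical solution can be computed; the utility U(c) = c − c²/2, the production function φ(k) = √k, and all parameter values are illustrative assumptions, not from the text. With this U, the terminal condition dU/dc(c*(T)) = 0 reads c*(T) = 1, and the pair of equations becomes a two-point boundary value problem:

```python
# Solve the optimal-path equations numerically as a two-point boundary value
# problem (assumed U(c) = c - c^2/2, so U' = 1 - c and U'' = -1; phi(k) = sqrt(k)).
import numpy as np
from scipy.integrate import solve_bvp

lam, delta, rho = 0.05, 0.10, 0.03        # lambda, delta, rho (assumed values)
k0, T = 1.0, 10.0

def phi(k):  return np.sqrt(np.clip(k, 1e-9, None))
def dphi(k): return 0.5 / np.sqrt(np.clip(k, 1e-9, None))

def rhs(t, y):
    k, c = y
    dk = phi(k) - (lam + delta) * k - c
    # dc/dt = U'(c) [phi'(k) - (lam+delta+rho)] / (-U''(c)):
    dc = (1.0 - c) * (dphi(k) - (lam + delta + rho))
    return np.vstack([dk, dc])

def bc(ya, yb):
    return np.array([ya[0] - k0,          # k(0) = k0
                     yb[1] - 1.0])        # U'(c(T)) = 0, i.e. c(T) = 1

t = np.linspace(0.0, T, 200)
guess = np.vstack([np.full_like(t, k0), np.full_like(t, 0.5)])
sol = solve_bvp(rhs, bc, t, guess)
print("converged:", sol.status == 0, "  c*(0) ≈", float(sol.sol(0.0)[1]))
```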
4.3 The general case

In the general case when x(t) ∈ R^n and u(t) ∈ R^m, Theorem 4.2.1 holds with p* now being a function taking its values in R^n:

Theorem 4.3.1 Let F(x, u, t) and f(x, u) be continuously differentiable functions of each of their arguments. If u* ∈ (C[t_i, t_f])^m is an optimal control for the functional

$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t), u(t), t)\, dt,$$

subject to (dx/dt)(t) = f(x(t), u(t)), t ∈ [t_i, t_f], x(t_i) = x_i (with x(t_f) free), and x* denotes the corresponding state, then there exists a p* ∈ (C¹[t_i, t_f])^n such that

$$\frac{\partial H}{\partial x}(p^*(t), x^*(t), u^*(t), t) = -\frac{dp^*}{dt}(t), \quad t \in [t_i, t_f], \quad p^*(t_f) = 0,$$

$$\frac{\partial H}{\partial u}(p^*(t), x^*(t), u^*(t), t) = 0, \quad t \in [t_i, t_f],$$

where now H(p, x, u, t) = F(x, u, t) + p^⊤ f(x, u).
Example. (Linear systems and the Riccati equation.) Let A ∈ R^{n×n}, B ∈ R^{n×m}, Q ∈ R^{n×n} such that Q = Q^⊤ ≥ 0, and R ∈ R^{m×m} such that R = R^⊤ > 0. We wish to find² optimal controls for the functional

$$I_{x_i}(u) = \int_{t_i}^{t_f} \frac{1}{2}\left[x(t)^\top Q x(t) + u(t)^\top R u(t)\right] dt$$

subject to the differential equation

$$\frac{dx}{dt}(t) = Ax(t) + Bu(t), \quad t \in [t_i, t_f], \quad x(t_i) = x_i.$$

²This problem is known as the linear quadratic (LQ) regulator problem.

The Hamiltonian is

$$H(p, x, u, t) = \frac{1}{2}\left[x^\top Q x + u^\top R u\right] + p^\top \left[Ax + Bu\right].$$

From Theorem 4.3.1, it follows that any optimal input u* and the corresponding state x* satisfy

$$\frac{\partial H}{\partial u}(p^*(t), x^*(t), u^*(t), t) = 0,$$

that is, u*(t)^⊤ R + p*(t)^⊤ B = 0. Thus u*(t) = −R^{−1}B^⊤ p*(t). The adjoint equation is

$$\frac{\partial H}{\partial x}(p^*(t), x^*(t), u^*(t), t) = -\frac{dp^*}{dt}(t), \quad t \in [t_i, t_f], \quad p^*(t_f) = 0,$$

that is, x*(t)^⊤ Q + p*(t)^⊤ A = −(dp*/dt)(t)^⊤. So we have

$$\frac{dp^*}{dt}(t) = -A^\top p^*(t) - Q x^*(t), \quad t \in [t_i, t_f], \quad p^*(t_f) = 0.$$

Consequently,

$$\frac{d}{dt}\begin{pmatrix} x^*(t) \\ p^*(t) \end{pmatrix} = \begin{pmatrix} A & -BR^{-1}B^\top \\ -Q & -A^\top \end{pmatrix}\begin{pmatrix} x^*(t) \\ p^*(t) \end{pmatrix}, \quad t \in [t_i, t_f], \quad x^*(t_i) = x_i, \quad p^*(t_f) = 0. \tag{4.17}$$

This is a linear, time-invariant differential equation in (x*, p*). If we only had to deal with initial boundary conditions exclusively, or final boundary conditions exclusively, then we could easily solve (4.17). However, here we have combined initial and final conditions, and so it is not clear how we could solve (4.17). It is unclear if (4.17) has a solution at all! We now prove the following.

Theorem 4.3.2 Let P be a solution of the following Riccati equation:

$$\frac{dP}{dt}(t) = -P(t)A - A^\top P(t) + P(t)BR^{-1}B^\top P(t) - Q, \quad t \in [t_i, t_f], \quad P(t_f) = 0.$$

Let x* be the solution of

$$\frac{dx^*}{dt}(t) = \left[A - BR^{-1}B^\top P(t)\right] x^*(t), \quad t \in [t_i, t_f], \quad x^*(t_i) = x_i,$$

and let p*(t) = P(t)x*(t). Then (x*, p*) above is the unique solution of (4.17).
Proof We have

$$\frac{d}{dt}\begin{pmatrix} x^*(t) \\ p^*(t) \end{pmatrix} = \begin{pmatrix} \dfrac{dx^*}{dt}(t) \\[2mm] \dfrac{dP}{dt}(t)\,x^*(t) + P(t)\,\dfrac{dx^*}{dt}(t) \end{pmatrix} = \begin{pmatrix} Ax^*(t) - BR^{-1}B^\top p^*(t) \\ -Qx^*(t) - A^\top p^*(t) \end{pmatrix} = \begin{pmatrix} A & -BR^{-1}B^\top \\ -Q & -A^\top \end{pmatrix}\begin{pmatrix} x^*(t) \\ p^*(t) \end{pmatrix},$$

where the Riccati equation was used to simplify the second row. Furthermore, x* and p* satisfy x*(t_i) = x_i and p*(t_f) = P(t_f)x*(t_f) = 0 · x*(t_f) = 0. So the pair (x*, p*) satisfies (4.17).

The uniqueness can be shown as follows. If (x_1, p_1) and (x_2, p_2) satisfy (4.17), then x̃ = x_1 − x_2 and p̃ = p_1 − p_2 satisfy

$$\frac{d}{dt}\begin{pmatrix} \tilde{x}(t) \\ \tilde{p}(t) \end{pmatrix} = \begin{pmatrix} A & -BR^{-1}B^\top \\ -Q & -A^\top \end{pmatrix}\begin{pmatrix} \tilde{x}(t) \\ \tilde{p}(t) \end{pmatrix}, \quad t \in [t_i, t_f], \quad \tilde{x}(t_i) = 0, \quad \tilde{p}(t_f) = 0. \tag{4.18}$$
This implies that

$$0 = \tilde{p}(t_f)^\top \tilde{x}(t_f) - \tilde{p}(t_i)^\top \tilde{x}(t_i) = \int_{t_i}^{t_f} \frac{d}{dt}\left[\tilde{p}(t)^\top \tilde{x}(t)\right] dt = \int_{t_i}^{t_f} \left[\frac{d\tilde{p}}{dt}(t)^\top \tilde{x}(t) + \tilde{p}(t)^\top \frac{d\tilde{x}}{dt}(t)\right] dt$$
$$= \int_{t_i}^{t_f} \left[\left(-Q\tilde{x}(t) - A^\top \tilde{p}(t)\right)^\top \tilde{x}(t) + \tilde{p}(t)^\top \left(A\tilde{x}(t) - BR^{-1}B^\top \tilde{p}(t)\right)\right] dt = -\int_{t_i}^{t_f} \left[\tilde{x}(t)^\top Q \tilde{x}(t) + \tilde{p}(t)^\top BR^{-1}B^\top \tilde{p}(t)\right] dt.$$

Consequently, Q x̃(t) = 0 and R^{−1}B^⊤ p̃(t) = 0 for all t ∈ [t_i, t_f]. From (4.18), we obtain

$$\frac{d\tilde{x}}{dt}(t) = A\tilde{x}(t), \quad t \in [t_i, t_f], \quad \tilde{x}(t_i) = 0, \quad \text{and} \quad \frac{d\tilde{p}}{dt}(t) = -A^\top \tilde{p}(t), \quad t \in [t_i, t_f], \quad \tilde{p}(t_f) = 0,$$

and so, by the uniqueness of solutions (Theorem 1.4.5), x̃ = 0 and p̃ = 0. This completes the proof.

Note that the optimal control is then given in feedback form:

$$u^*(t) = -R^{-1}B^\top p^*(t) = -R^{-1}B^\top P(t)\, x^*(t), \quad t \in [t_i, t_f],$$

that is, the optimal input at time t is obtained from the current state x*(t) via multiplication by the time-varying gain −R^{−1}B^⊤P(t); see Figure 4.1.
[Figure 4.1: The closed loop system: the plant ẋ = Ax + Bu in feedback with the controller u(t) = −R^{−1}B^⊤P(t)x(t).]
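Theorem 4.3.2 also gives a practical recipe: integrate the Riccati equation backwards from P(t_f) = 0, then run the closed-loop system. The sketch below (an added illustration; the matrices A, B, Q, R and the horizon are assumed data) implements exactly that:

```python
# Finite-horizon LQ regulator via Theorem 4.3.2: integrate the Riccati
# equation backwards from P(tf) = 0, then simulate x' = (A - B R^{-1} B^T P(t)) x.
import numpy as np
from scipy.integrate import solve_ivp

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # a double integrator (assumed example)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
ti, tf = 0.0, 5.0
xi = np.array([1.0, 0.0])

def riccati(t, p_flat):
    P = p_flat.reshape(2, 2)
    dP = -P @ A - A.T @ P + P @ B @ np.linalg.solve(R, B.T) @ P - Q
    return dP.ravel()

# Integrate backwards in time via the substitution s = tf - t:
back = solve_ivp(lambda s, p: -riccati(tf - s, p), (0.0, tf - ti),
                 np.zeros(4), dense_output=True)

def P_of_t(t):
    return back.sol(tf - t).reshape(2, 2)

def closed_loop(t, x):
    K = np.linalg.solve(R, B.T @ P_of_t(t))   # gain R^{-1} B^T P(t)
    return (A - B @ K) @ x

traj = solve_ivp(closed_loop, (ti, tf), xi, dense_output=True)
print("x*(tf) ≈", traj.y[:, -1])
```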
4.4 Controllability

In many optimization problems, in addition to minimizing the cost, one may also have to satisfy a condition on the final state x(t_f); for instance, one may wish to drive the state to zero. This brings us naturally to the notion of controllability. For the sake of simplicity, we restrict ourselves to linear systems:

$$\frac{dx}{dt}(t) = Ax(t) + Bu(t), \quad t \geq t_i. \tag{4.19}$$

The system (4.19) is said to be controllable if for every pair of vectors x_i, x_f in R^n, there exist a t_f > t_i and a control u ∈ (C[t_i, t_f])^m such that the solution x of (4.19) with x(t_i) = x_i satisfies x(t_f) = x_f. Controllability means that any state can be driven to any other state using an appropriate control.
Example. (A controllable system.) Consider the system
$$\dot{x}(t) = u(t), \quad t \geq t_i,$$
so that $A = 0$, $B = 1$. Given $x_i, x_f \in \mathbb{R}$ and any $t_f > t_i$, define $u \in C[t_i,t_f]$ to be the constant function
$$u(t) = \frac{x_f - x_i}{t_f - t_i}, \quad t \in [t_i,t_f].$$
Then
$$x(t_f) = x(t_i) + \int_{t_i}^{t_f} \dot{x}(\tau)\,d\tau = x_i + \int_{t_i}^{t_f} u(\tau)\,d\tau = x_i + \frac{x_f - x_i}{t_f - t_i}(t_f - t_i) = x_f.$$
Note that
$$\operatorname{rank}\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} \overset{(n=1)}{=} \operatorname{rank} B = \operatorname{rank} 1 = 1 = n,$$
the dimension of the state space.
Example. (An uncontrollable system.) Consider the system
$$\dot{x}_1(t) = x_1(t) + u(t), \tag{4.20}$$
$$\dot{x}_2(t) = x_2(t), \tag{4.21}$$
for $t \geq t_i$,
so that
$$A = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ 0 \end{bmatrix}.$$
The equation (4.21) implies that $x_2(t) = e^{t-t_i}x_2(t_i)$, and so if $x_2(t_i) > 0$, then $x_2(t) > 0$ for all $t \geq t_i$. So a final state with negative $x_2$-component is never reachable by any control. Note that
$$\operatorname{rank}\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} \overset{(n=2)}{=} \operatorname{rank}\begin{bmatrix} B & AB \end{bmatrix} = \operatorname{rank}\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix} = 1 \neq 2 = n,$$
the dimension of the state space ($\mathbb{R}^2$).
In fact, these two examples illustrate the following algebraic characterization of controllability.

Theorem 4.4.1 The system (4.19) is controllable if and only if $\operatorname{rank}\begin{bmatrix} B & AB & \cdots & A^{n-1}B \end{bmatrix} = n$, the dimension of the state space.
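Assuming this rank characterization, both examples above can be checked mechanically; the helper `controllable` below is my own small utility (not a library function), built on standard NumPy calls:

    # Kalman rank test: form [B  AB ... A^{n-1}B] and check its rank.
    import numpy as np

    def controllable(A, B):
        n = A.shape[0]
        K = np.hstack([np.linalg.matrix_power(A, k) @ B for k in range(n)])
        return np.linalg.matrix_rank(K) == n

    print(controllable(np.array([[0.0]]), np.array([[1.0]])))    # True
    print(controllable(np.eye(2), np.array([[1.0], [0.0]])))     # False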
Exercises.

1. For what values of $\alpha$ is the system (4.19) controllable, if
$$A = \begin{bmatrix} 2 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 \\ \alpha \end{bmatrix}?$$
2. (∗) Let $A \in \mathbb{R}^{n\times n}$ and $B \in \mathbb{R}^{n\times 1}$. Prove that the system (4.19) is controllable if and only if every matrix commuting with $A$ is a polynomial in $A$.
The following theorem tells us how we can calculate the optimal control when $x(t_f)$ is specified, in the case of controllable linear systems.

Theorem 4.4.2 Suppose that the system
$$\dot{x}(t) = Ax(t) + Bu(t), \quad t \geq t_i,$$
is controllable. Let $F(x,u,t)$ be a continuously differentiable function of each of its arguments. If $u^* \in (C[t_i,t_f])^m$ is an optimal control for the functional
$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t),u(t),t)\,dt,$$
subject to $\dot{x}(t) = Ax(t) + Bu(t)$, $x(t_i) = x_i$, with $r$ of the $n$ components of the final state $x(t_f)$ prescribed, and if $x^*$ denotes the corresponding optimal state, then there exists a $p^*$ such that
$$\frac{\partial H}{\partial x}(p^*(t), x^*(t), u^*(t), t) = -\dot{p}^*(t), \qquad \frac{\partial H}{\partial u}(p^*(t), x^*(t), u^*(t), t) = 0, \qquad t \in [t_i,t_f],$$
where $H(p,x,u,t) = F(x,u,t) + p^\top(Ax + Bu)$, and $p_j^*(t_f) = 0$ for every index $j$ for which $x_j(t_f)$ is not prescribed.
We will not prove this theorem. Note that for a differential equation to have a unique solution, there should be neither too few nor too many initial and final conditions imposed on that solution. Intuitively, one expects as many conditions as there are differential equations. In Theorem 4.4.2, we have, in total, $2n$ differential equations (for $x^*$ and $p^*$), and we also have the right number of conditions: $n + r$ for $x^*$ ($n$ initial and $r$ final), and $n - r$ for $p^*$.
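One standard numerical approach to such mixed boundary conditions is collocation. The sketch below is my own toy problem (not one of the exercises that follow): minimize $\int_0^1 u^2\,dt$ subject to $\dot{x} = x + u$, $x(0) = 1$, $x(1) = 0$, so that $H = u^2 + p(x+u)$ gives $u = -p/2$, $\dot{x} = x - p/2$, $\dot{p} = -p$.

    # Solve the mixed-boundary Hamiltonian system with SciPy's BVP solver.
    import numpy as np
    from scipy.integrate import solve_bvp

    def rhs(t, y):            # y[0] = x, y[1] = p
        return np.vstack([y[0] - y[1]/2, -y[1]])

    def bc(ya, yb):           # n + r = 2 conditions on x, n - r = 0 on p
        return np.array([ya[0] - 1.0, yb[0]])

    t = np.linspace(0.0, 1.0, 11)
    sol = solve_bvp(rhs, bc, t, np.zeros((2, len(t))))
    u = -sol.sol(t)[1]/2      # the critical control along the grid
    print(sol.status, u[0])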
Exercises.

1. Find a critical control for the functional
$$I(u) = \int_0^1 (u(t))^2\,dt$$
subject to $\dot{x}(t) = 2x(t) + u(t)$, $t \in [0,1]$, $x(0) = 1$ and $x(1) = 0$. Is this control unique?
2. Find a critical control for the functional
$$I(u) = \int (u(t))^2\,dt$$
subject to $\dot{x}(t) = \cdots$
4. Find a critical control for the functional
$$I(u) = \int_0^1 \frac{(u(t))^2}{2}\,dt$$
subject to
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = x_2(t) + u(t), \qquad t \in [0,1],$$
and
$$x_1(0) = 1, \quad x_1(1) = 0, \quad x_2(0) = 1, \quad x_2(1) = 0.$$
5. (Higher order differential equation constraint.) Find a critical control for the functional
$$I(u) = \int_0^T \frac{(u(t))^2}{2}\,dt$$
subject to
$$\ddot{y}(t) + \dot{y}(t) = u(t), \quad t \in [0,T], \quad y(0) = y_0, \quad \dot{y}(0) = v_0, \quad y(T) = \dot{y}(T) = 0.$$
Hint: Introduce the state variables $x_1(t) = y(t)$, $x_2(t) = \dot{y}(t)$; the constraint then becomes a first-order system, as sketched below.
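Under the reading $\ddot{y}(t) + \dot{y}(t) = u(t)$ of the constraint adopted above (an assumption, since the original derivative dots were garbled), the hint yields the first-order system
$$\dot{x}_1(t) = x_2(t), \qquad \dot{x}_2(t) = u(t) - x_2(t), \qquad t \in [0,T],$$
with boundary conditions $x_1(0) = y_0$, $x_2(0) = v_0$, $x_1(T) = x_2(T) = 0$, which is of the form treated in this section.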
Chapter 5

5.1 The optimality principle
The underlying idea of the optimality principle is extremely simple: roughly speaking, it says that any part of an optimal trajectory is optimal.

We denote the class of piecewise continuous $\mathbb{R}^m$-valued functions on $[t_i,t_f]$ by $\mathcal{U}[t_i,t_f]$.
Theorem 5.1.1 (Optimality principle.) Let $F(x,u,t)$ and $f(x,u)$ be continuously differentiable functions of each of their arguments. Let $u^* \in \mathcal{U}[t_i,t_f]$ be an optimal control for the functional
$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t),u(t),t)\,dt,$$
subject to
$$\dot{x}(t) = f(x(t),u(t)), \quad t \in [t_i,t_f], \quad x(t_i) = x_i. \tag{5.1}$$
Let $x^*$ be the corresponding optimal state. If $t_m \in [t_i,t_f)$, then the restriction of $u^*$ to $[t_m,t_f]$ is an optimal control for the functional
$$I_{x^*(t_m)}(u) = \int_{t_m}^{t_f} F(x(t),u(t),t)\,dt,$$
subject to
$$\dot{x}(t) = f(x(t),u(t)), \quad t \in [t_m,t_f], \quad x(t_m) = x^*(t_m). \tag{5.2}$$
Moreover,
$$\min_{\substack{u \in \mathcal{U}[t_i,t_f] \\ \text{subject to } (5.1)}} I_{x_i}(u) = \int_{t_i}^{t_m} F(x^*(t),u^*(t),t)\,dt + \min_{\substack{u \in \mathcal{U}[t_m,t_f] \\ \text{subject to } (5.2)}} I_{x^*(t_m)}(u). \tag{5.3}$$
Proof We have
$$I_{x_i}(u^*) = \int_{t_i}^{t_f} F(x^*(t),u^*(t),t)\,dt = \int_{t_i}^{t_m} F(x^*(t),u^*(t),t)\,dt + \int_{t_m}^{t_f} F(x^*(t),u^*(t),t)\,dt. \tag{5.4}$$
The solution of (5.2) corresponding to the restriction $u^*|_{[t_m,t_f]}$ is simply the restriction of $x^*$ to $[t_m,t_f]$. Thus the second term in (5.4) is the cost $I_{x^*(t_m)}\!\left(u^*|_{[t_m,t_f]}\right)$ subject to (5.2).
Suppose that there exists a $\hat{u} \in \mathcal{U}[t_m,t_f]$ such that
$$\int_{t_m}^{t_f} F(\hat{x}(t),\hat{u}(t),t)\,dt = I_{x^*(t_m)}(\hat{u}) < I_{x^*(t_m)}\!\left(u^*|_{[t_m,t_f]}\right) = \int_{t_m}^{t_f} F(x^*(t),u^*(t),t)\,dt,$$
where $\hat{x}$ is the solution to (5.2) corresponding to $\hat{u}$. Define $u \in \mathcal{U}[t_i,t_f]$ by
$$u(t) = \begin{cases} u^*(t) & \text{for } t \in [t_i,t_m), \\ \hat{u}(t) & \text{for } t \in [t_m,t_f], \end{cases}$$
and let $x$ be the corresponding solution to (5.1). From Theorem 1.2.1, it follows that
$$x|_{[t_i,t_m]} = x^*|_{[t_i,t_m]}.$$
Hence we have
$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t),u(t),t)\,dt = \int_{t_i}^{t_m} F(x^*(t),u^*(t),t)\,dt + \int_{t_m}^{t_f} F(\hat{x}(t),\hat{u}(t),t)\,dt$$
$$< \int_{t_i}^{t_m} F(x^*(t),u^*(t),t)\,dt + \int_{t_m}^{t_f} F(x^*(t),u^*(t),t)\,dt = I_{x_i}(u^*),$$
which contradicts the optimality of $u^*$. This proves that an optimal control for $I_{x^*(t_m)}$ subject to (5.2) exists and is given by the restriction of $u^*$ to $[t_m,t_f]$. From (5.4), it then follows that (5.3) holds.
Note that we have shown that
$$\min_{\substack{u \in \mathcal{U}[t_m,t_f] \\ \text{subject to } (5.2)}} I_{x^*(t_m)}(u) = I_{x^*(t_m)}\!\left(u^*|_{[t_m,t_f]}\right).$$
So the theorem above says that if you are on an optimal trajectory, then the best thing you can do is to stay on that trajectory. See Figure 5.1.
Figure 5.1: Any part of an optimal trajectory is optimal: starting from $x_i$ at $t_i$, the optimal trajectory follows $u^*|_{[t_i,t_m]}$ up to $x^*(t_m)$ and $u^*|_{[t_m,t_f]}$ thereafter; no control $\hat{u}$ on $[t_m,t_f]$ does better from $x^*(t_m)$.
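The optimality principle is perhaps most transparent in a discrete setting. The sketch below is my own illustration (a stage graph with random costs, not an example from the text): backward induction computes the minimal cost-to-go, and the resulting policy is optimal from any intermediate node onward, mirroring Theorem 5.1.1.

    # Shortest paths on a stage graph by backward induction.
    import numpy as np

    rng = np.random.default_rng(0)
    stages, width = 5, 3
    cost = rng.integers(1, 10, size=(stages, width, width))  # cost[k][i][j]

    # V[k][i] = minimal cost-to-go from node i at stage k to the final stage.
    V = np.zeros((stages + 1, width))
    policy = np.zeros((stages, width), dtype=int)
    for k in range(stages - 1, -1, -1):
        for i in range(width):
            totals = cost[k][i] + V[k + 1]
            policy[k][i] = np.argmin(totals)
            V[k][i] = totals[policy[k][i]]

    # Following `policy` from any (k, i) is optimal for the remaining
    # stages: the discrete statement of the optimality principle.
    print(V[0])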
5.2 Bellman's equation
In this section we will prove Theorem 5.2.1 below, which gives a sufficient condition for the existence of an optimal control in terms of the existence of an appropriate solution to Bellman's equation (5.6). First, however, we give a heuristic argument that leads to Bellman's equation: we do not begin by asking when the optimal control problem has a solution, but rather assume that it is solvable and study the so-called value function, which will lead us to Bellman's equation.
Let $t_m \in [t_i,t_f)$. Define the value function $V : \mathbb{R}^n \times [t_i,t_f] \to \mathbb{R}$ by
$$V(x_m,t_m) = \min_{\substack{u \in \mathcal{U}[t_m,t_f] \\ \dot{x}(t) = f(x(t),u(t)),\; x(t_m) = x_m}} \int_{t_m}^{t_f} F(x(t),u(t),t)\,dt. \tag{5.5}$$
Let $u^*$ be an optimal control achieving the minimum in (5.5), and let $x^*$ be the corresponding state. By the optimality principle, for small $\delta > 0$,
$$V(x^*(t_m),t_m) = \int_{t_m}^{t_m+\delta} F(x^*(t),u^*(t),t)\,dt + V(x^*(t_m+\delta),\,t_m+\delta).$$
Consequently,
$$V(x^*(t_m+\delta),\,t_m+\delta) - V(x^*(t_m),t_m) = -\int_{t_m}^{t_m+\delta} F(x^*(t),u^*(t),t)\,dt.$$
It is tempting to divide by $\delta$ on both sides and let $\delta$ tend to $0$. Formally, the left hand side would become
$$\frac{\partial V}{\partial t}(x^*(t_m),t_m) + \frac{\partial V}{\partial x}(x^*(t_m),t_m)\,f(x^*(t_m),u^*(t_m)),$$
while the right hand side would become $-F(x^*(t_m),u^*(t_m),t_m)$. Thus we would obtain the equation
$$\frac{\partial V}{\partial t}(x^*(t_m),t_m) + \frac{\partial V}{\partial x}(x^*(t_m),t_m)\,f(x^*(t_m),u^*(t_m)) + F(x^*(t_m),u^*(t_m),t_m) = 0.$$
This motivates the following result.
Theorem 5.2.1 Let $F(x,u,t)$ and $f(x,u)$ be continuously differentiable functions of each of their arguments. Suppose that there exists a function $W : \mathbb{R}^n \times [t_i,t_f] \to \mathbb{R}$ such that:

1. $W$ is continuous on $\mathbb{R}^n \times [t_i,t_f]$.
2. $W$ is continuously differentiable in $\mathbb{R}^n \times (t_i,t_f)$.
3. $W$ satisfies Bellman's equation
$$\frac{\partial W}{\partial t}(x,t) + \min_{u \in \mathbb{R}^m}\left[\frac{\partial W}{\partial x}(x,t)\,f(x,u) + F(x,u,t)\right] = 0, \quad (x,t) \in \mathbb{R}^n \times (t_i,t_f). \tag{5.6}$$
4. $W(x,t_f) = 0$ for all $x \in \mathbb{R}^n$.

Then the following hold:

1. For every $t_m \in [t_i,t_f)$, every $x_m \in \mathbb{R}^n$, and every $u \in \mathcal{U}[t_m,t_f]$,
$$\int_{t_m}^{t_f} F(x(t),u(t),t)\,dt \geq W(x_m,t_m),$$
where $x$ is the solution of $\dot{x}(t) = f(x(t),u(t))$, $t \in [t_m,t_f]$, $x(t_m) = x_m$, corresponding to $u$.

2. Suppose that there exists a function $\varphi : \mathbb{R}^n \times [t_i,t_f] \to \mathbb{R}^m$ such that:
(a) For all $(x,t) \in \mathbb{R}^n \times (t_i,t_f)$, the minimum in (5.6) is attained at $u = \varphi(x,t)$:
$$\frac{\partial W}{\partial x}(x,t)\,f(x,\varphi(x,t)) + F(x,\varphi(x,t),t) = \min_{u \in \mathbb{R}^m}\left[\frac{\partial W}{\partial x}(x,t)\,f(x,u) + F(x,u,t)\right].$$
(b) The equation
$$\dot{x}(t) = f(x(t),\varphi(x(t),t)), \quad t \in [t_i,t_f], \quad x(t_i) = x_i,$$
has a solution $x^*$.
Then $u^*$ defined by $u^*(t) = \varphi(x^*(t),t)$, $t \in [t_i,t_f]$, is an optimal control for the functional
$$I_{x_i}(u) = \int_{t_i}^{t_f} F(x(t),u(t),t)\,dt,$$
subject to (5.1), and
$$I_{x_i}(u^*) = W(x_i,t_i). \tag{5.7}$$

3. Let $\varphi$ be the function from part 2. If for every $t_m \in [t_i,t_f)$ and every $x_m \in \mathbb{R}^n$, the equation
$$\dot{x}(t) = f(x(t),\varphi(x(t),t)), \quad t \in [t_m,t_f], \quad x(t_m) = x_m,$$
has a solution, then $W$ coincides with the value function $V$ defined in (5.5).
Proof 1. Let $u \in \mathcal{U}[t_m,t_f]$, and let $x$ be the corresponding solution with $x(t_m) = x_m$. We have
$$\int_{t_m}^{t_f} F(x(t),u(t),t)\,dt = \int_{t_m}^{t_f} \left[\frac{\partial W}{\partial x}(x(t),t)\,f(x(t),u(t)) + F(x(t),u(t),t)\right]dt - \int_{t_m}^{t_f} \frac{\partial W}{\partial x}(x(t),t)\,f(x(t),u(t))\,dt$$
$$\geq \int_{t_m}^{t_f} \min_{u \in \mathbb{R}^m}\left[\frac{\partial W}{\partial x}(x(t),t)\,f(x(t),u) + F(x(t),u,t)\right]dt - \int_{t_m}^{t_f} \frac{\partial W}{\partial x}(x(t),t)\,f(x(t),u(t))\,dt$$
$$= \int_{t_m}^{t_f} \left[-\frac{\partial W}{\partial t}(x(t),t) - \frac{\partial W}{\partial x}(x(t),t)\,f(x(t),u(t))\right]dt = -\int_{t_m}^{t_f} \frac{d}{dt}\left[W(x(\cdot),\cdot)\right](t)\,dt$$
$$= -W(x(t_f),t_f) + W(x(t_m),t_m) = W(x_m,t_m),$$
where the last equality uses $W(x,t_f) = 0$.
2. Let $x^*$ be a solution of $\dot{x}(t) = f(x(t),\varphi(x(t),t))$, $t \in [t_i,t_f]$, $x(t_i) = x_i$, and put $u^*(t) = \varphi(x^*(t),t)$. Then
$$I_{x_i}(u^*) = \int_{t_i}^{t_f} F(x^*(t),u^*(t),t)\,dt$$
$$= \int_{t_i}^{t_f} \left[\frac{\partial W}{\partial x}(x^*(t),t)\,f(x^*(t),\varphi(x^*(t),t)) + F(x^*(t),\varphi(x^*(t),t),t)\right]dt - \int_{t_i}^{t_f} \frac{\partial W}{\partial x}(x^*(t),t)\,f(x^*(t),\varphi(x^*(t),t))\,dt$$
$$= \int_{t_i}^{t_f} \min_{u \in \mathbb{R}^m}\left[\frac{\partial W}{\partial x}(x^*(t),t)\,f(x^*(t),u) + F(x^*(t),u,t)\right]dt - \int_{t_i}^{t_f} \frac{\partial W}{\partial x}(x^*(t),t)\,f(x^*(t),\varphi(x^*(t),t))\,dt$$
$$= \int_{t_i}^{t_f} \left[-\frac{\partial W}{\partial t}(x^*(t),t) - \frac{\partial W}{\partial x}(x^*(t),t)\,f(x^*(t),\varphi(x^*(t),t))\right]dt = -\int_{t_i}^{t_f} \frac{d}{dt}\left[W(x^*(\cdot),\cdot)\right](t)\,dt = W(x_i,t_i).$$
By part 1, $I_{x_i}(u) \geq W(x_i,t_i) = I_{x_i}(u^*)$ for every $u \in \mathcal{U}[t_i,t_f]$, so $u^*$ is optimal and (5.7) holds.
3. Applying part 2 on the interval $[t_m,t_f]$ with initial state $x_m$, we obtain
$$W(x_m,t_m) = \min_{\substack{u \in \mathcal{U}[t_m,t_f] \\ \dot{x}(t)=f(x(t),u(t)),\; x(t_m)=x_m}} \int_{t_m}^{t_f} F(x(t),u(t),t)\,dt = V(x_m,t_m).$$
In the following example, we show how Theorem 5.2.1 can be used to calculate an optimal
control.
Example. Consider the following linear system with state space $\mathbb{R}$:
$$\dot{x}(t) = u(t), \quad t \in [0,1], \quad x(0) = x_0,$$
with the cost functional
$$I_{x_0}(u) = \int_0^1 \left[(x(t))^2 + (u(t))^2\right]dt.$$
Bellman's equation becomes
$$\frac{\partial W}{\partial t}(x,t) + \min_{u \in \mathbb{R}}\left[\frac{\partial W}{\partial x}(x,t)\,u + x^2 + u^2\right] = 0, \quad (x,t) \in \mathbb{R} \times (0,1), \quad W(x,1) = 0.$$
It is easy to see that the minimum in the above is attained at
$$u = \varphi(x,t) = -\frac{1}{2}\frac{\partial W}{\partial x}(x,t).$$
Thus we obtain
$$\frac{\partial W}{\partial t}(x,t) + x^2 - \frac{1}{4}\left(\frac{\partial W}{\partial x}(x,t)\right)^2 = 0, \quad (x,t) \in \mathbb{R} \times (0,1), \quad W(x,1) = 0.$$
This equation has the solution
$$W(x,t) = x^2\,\frac{e^{1-t} - e^{t-1}}{e^{1-t} + e^{t-1}}.$$
Writing
$$p(t) = \frac{e^{1-t} - e^{t-1}}{e^{1-t} + e^{t-1}} \;(= \tanh(1-t)),$$
we have $\varphi(x,t) = -p(t)x$, and the equation $\dot{x}(t) = \varphi(x(t),t) = -p(t)x(t)$ has the solution
$$x^*(t) = x_0\,e^{-\int_0^t p(\tau)\,d\tau}. \tag{5.8}$$
We note that all the conditions from Theorem 5.2.1 are satisfied, so the optimization problem is solvable. The optimal input is given by
$$u^*(t) = \varphi(x^*(t),t) = -p(t)\,x^*(t),$$
where $x^*$ is the optimal state given by (5.8). Note that the optimal control is given in the form of state feedback.

… $\int_0^1 (x(t))^2\,dt$ …
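As a numerical sanity check (my own, not part of the text): substituting $W(x,t) = p(t)x^2$ into the partial differential equation above yields $\dot{p}(t) = (p(t))^2 - 1$, $p(1) = 0$, which $p(t) = \tanh(1-t)$ indeed solves; the closed loop $\dot{x} = -p(t)x$ then produces the optimal state of (5.8):

    import numpy as np
    from scipy.integrate import solve_ivp

    # Integrate pdot = p^2 - 1 backwards from p(1) = 0 and compare.
    sol = solve_ivp(lambda t, p: p**2 - 1, [1.0, 0.0], [0.0],
                    dense_output=True, rtol=1e-10)
    t = np.linspace(0.0, 1.0, 5)
    print(np.max(np.abs(sol.sol(t)[0] - np.tanh(1 - t))))   # ~ 1e-10

    # The closed loop xdot = -p(t) x gives the optimal state (5.8).
    x = solve_ivp(lambda s, y: -np.tanh(1 - s) * y, [0.0, 1.0], [1.0],
                  rtol=1e-10)
    print(x.y[0, -1])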
Mock examination
MA305
Control Theory
(Half Unit)
Suitable for all candidates
Instructions to candidates
Time allowed: 2 hours
This examination paper contains 6 questions. You may attempt as many questions as you wish,
but only your best 4 questions will count towards the final mark. All questions carry equal
numbers of marks.
Please write your answers in dark ink (preferably black or blue) only.
Calculators are not allowed in this exam.
You are supplied with: Answer Booklet
1. … Is the functional defined by
$$\int_0^1 \cdots\,dt$$
continuous at 0? Here the linear space $C[0,1]$ is equipped with the usual norm
$$\|x\| = \sup_{t \in [0,1]} |x(t)|.$$
… HINT: Use the Cauchy–Schwarz inequality with $x_1(t) = e^{2t}$ and $x_2(t) = e^{2t}\,\frac{d}{dt}x(t)$.

(c) Using the Euler–Lagrange equation, find a critical curve $x^*$ for $I$, and find $I(x^*)$. Using part (b), show that $x^*$ indeed minimizes the functional $I$.
$$I(u) = \int_0^1 \left[3(x(t))^2 + (u(t))^2\right]dt \tag{1}$$
subject to
$$\cdots \tag{2}$$
(a) Using the Hamiltonian method, write down the equations that govern an optimal control for (1) subject to (2).
(b) Find an equation that describes the evolution of the state $x^*$ and the co-state $p^*$, and specify the boundary conditions $x^*(0)$ and $p^*(1)$.
(c) Solve the equations from part (b) and determine a critical control.
…
$$I(u) = \int \left[3(x(t))^2 + (u(t))^2\right]dt \tag{5}$$
subject to
$$\cdots \tag{6}$$