Optimality Conditions
Marı́a M. Seron
September 2004
1 Unconstrained Optimisation
Local and Global Minima
Descent Direction
Necessary Conditions for a Minimum
Necessary and Sufficient Conditions for a Minimum
2 Constrained Optimisation
Geometric Necessary Optimality Conditions
Problems with Inequality and Equality Constraints
The Fritz John Necessary Conditions
Karush–Kuhn–Tucker Necessary Conditions
Karush–Kuhn–Tucker Sufficient Conditions
Quadratic Programs
minimise f(x),   (1)
subject to:
x ∈ S.

[Figure: a function f over a set S, illustrating global minima.]
Proof.
By the differentiability of f at x̄, we have

(f(x̄ + λd) − f(x̄))/λ = ∇f(x̄)d + ‖d‖ α(x̄, λd),

where α(x̄, λd) → 0 as λ → 0.
Since ∇f (x̄ )d < 0 and α(x̄ , λd ) → 0 as λ → 0, there exists a δ > 0
such that the right hand side above is negative for all λ ∈ (0, δ).
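The descent-direction property can be checked numerically. A minimal sketch, with a hypothetical objective f(x) = x₁² + 2x₂² that is not from the notes:

```python
import numpy as np

# Hypothetical example (not from the notes): f(x) = x1^2 + 2*x2^2.
f = lambda x: x[0]**2 + 2 * x[1]**2
grad_f = lambda x: np.array([2 * x[0], 4 * x[1]])

x_bar = np.array([1.0, 1.0])
d = np.array([-1.0, 0.0])          # grad_f(x_bar) @ d = -2 < 0

assert grad_f(x_bar) @ d < 0       # d is a descent direction at x_bar

# f decreases along d for all sufficiently small lambda in (0, delta)
for lam in [1e-1, 1e-2, 1e-3]:
    assert f(x_bar + lam * d) < f(x_bar)
```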
Proof.
Suppose that ∇f(x̄) ≠ 0. Then, letting d = −∇f(x̄), we get
∇f(x̄)d = −‖∇f(x̄)‖² < 0, so d is a descent direction, contradicting
the local minimality of x̄. Hence ∇f(x̄) = 0, and the second-order
expansion of f at x̄ gives

(f(x̄ + λd) − f(x̄))/λ² = ½ d H(x̄)d + ‖d‖² α(x̄, λd),   (3)

where α(x̄, λd) → 0 as λ → 0.
Since x̄ is a local minimum, f (x̄ + λd ) ≥ f (x̄ ) for sufficiently small λ .
From (3), ½ d H(x̄)d + ‖d‖² α(x̄, λd) ≥ 0 for sufficiently small λ. By
taking the limit as λ → 0, it follows that d H(x̄)d ≥ 0; and, hence,
H (x̄ ) is positive semidefinite.
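Both necessary conditions can be verified at a candidate point. A sketch with the hypothetical function f(x) = x₁² + x₂⁴, which has a minimum at the origin where the Hessian is positive semidefinite but not positive definite:

```python
import numpy as np

# Hypothetical example: f(x) = x1^2 + x2^4 has a minimum at the origin.
grad_f = lambda x: np.array([2 * x[0], 4 * x[1]**3])
hess_f = lambda x: np.array([[2.0, 0.0], [0.0, 12 * x[1]**2]])

x_bar = np.zeros(2)

assert np.allclose(grad_f(x_bar), 0)      # first-order: gradient vanishes

eigvals = np.linalg.eigvalsh(hess_f(x_bar))
assert np.all(eigvals >= -1e-12)          # second-order: H(x_bar) is PSD
```

Here the eigenvalues are 0 and 2, so H(x̄) is only semidefinite: the necessary conditions hold even though they would not certify a minimum on their own.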
Centre for Complex Dynamic
Systems and Control
Necessary and Sufficient Conditions for a Minimum
minimise f (x ), (4)
subject to:
x ∈ S.
[Figure: a function f over an interval S, with points A–E marking local minima and the global minimum.]
minimise f (x ), (5)
subject to:
x ∈ S.
minimise f (x ), (6)
subject to:
x∈S
F = {d : f (x̄ + λd ) < f (x̄ ) for all λ ∈ (0, δ) for some δ > 0}.
[Figure: the set S with the cone of feasible directions D and the cone of improving directions F at x̄; f decreases along directions in F.]
Hence, we have

F0 ⊆ F ⊆ F00,   (9)

where

F0 ≜ {d : ∇f(x̄)d < 0},   F00 ≜ {d ≠ 0 : ∇f(x̄)d ≤ 0}.
[Figure: the cones F0 and F00 at x̄; f decreases along directions in F0.]
If x̄ is a local minimum, then

F0 ∩ D = ∅.   (10)
Proof.
Suppose, by contradiction, that there exists a vector d ∈ F0 ∩ D.
Since d ∈ F0, then, by Theorem 2.1 (Descent Direction), there
exists a δ1 > 0 such that f(x̄ + λd) < f(x̄) for all λ ∈ (0, δ1). Since
d ∈ D, there exists a δ2 > 0 such that x̄ + λd ∈ S for all λ ∈ (0, δ2).
Hence, points of S arbitrarily close to x̄ have a smaller objective
value, contradicting the local minimality of x̄.
S = {x ∈ X : gi(x) ≤ 0, i = 1, …, m, hi(x) = 0, i = 1, …, ℓ},

minimise f(x),
subject to:
gi(x) ≤ 0 for i = 1, …, m,   (13)
hi(x) = 0 for i = 1, …, ℓ,
x ∈ X.
Let I ≜ {i : gi(x̄) = 0} be the index set of the active inequality
constraints at x̄. Then

G0 ⊆ D ⊆ G00.   (14)
Algebraic Description of the Cone of Feasible Directions
G0 ⊆ D ⊆ G00,

where

G0 ≜ {d : ∇gi(x̄)d < 0 for i ∈ I},
G00 ≜ {d ≠ 0 : ∇gi(x̄)d ≤ 0 for i ∈ I}.
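The inclusion G0 ⊆ D can be checked on a small example. A sketch with the hypothetical constraints g1(x) = x1 + x2 − 1 ≤ 0 (active at x̄) and g2(x) = −x1 ≤ 0 (inactive), which are not from the notes:

```python
import numpy as np

# Hypothetical constraints: g1(x) = x1 + x2 - 1 <= 0 (active at x_bar),
# g2(x) = -x1 <= 0 (inactive).  At x_bar = (0.5, 0.5), I = {1}.
x_bar = np.array([0.5, 0.5])
grad_g1 = np.array([1.0, 1.0])

d = np.array([-1.0, 0.0])
assert grad_g1 @ d < 0                      # d belongs to G0

# x_bar + lam*d stays feasible for small lam > 0, i.e. d belongs to D
for lam in [1e-1, 1e-2, 1e-3]:
    x = x_bar + lam * d
    assert x[0] + x[1] - 1 <= 0 and -x[0] <= 0
```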
[Figure: the set S and the cone of feasible directions D at x̄.]
Proof.
(Only for inequality constraints.) Let x̄ be a local minimum. We
then have the following implications from (10) and (14):

x̄ is a local minimum ⟹ F0 ∩ D = ∅ ⟹ F0 ∩ G0 = ∅.

[Figure: contours of f, with ∇f(x̄) and the cones F0 and G0 at x̄ ∈ S; f decreases along directions in F0.]
u0 ∇f(x̄) + ∑_{i∈I} ui ∇gi(x̄) + ∑_{i=1}^{ℓ} vi ∇hi(x̄) = 0,   (16)
u0, ui ≥ 0 for i ∈ I,
{u0, ui, i ∈ I, v1, …, vℓ} not all zero.
The Fritz John Necessary Conditions
F0 ∩ G0 ∩ H0 = ∅. (18)
Let A1 be the matrix whose rows are ∇f (x̄ ) and ∇gi (x̄ ) for i ∈ I, and
let A2 be the matrix whose rows are ∇hi (x̄ ) for i = 1, . . . , `. Then,
(18) is satisfied if and only if the following system is inconsistent:
A1 d < 0,
A2 d = 0.
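As an aside, consistency of the system A1 d < 0, A2 d = 0 can be decided with a small linear program: maximise t subject to A1 d + t·1 ≤ 0, A2 d = 0, 0 ≤ t ≤ 1; the system has a solution if and only if the optimal t is positive. A sketch using scipy, with hypothetical matrices:

```python
import numpy as np
from scipy.optimize import linprog

def strict_system_consistent(A1, A2):
    """True iff there exists d with A1 d < 0 (strict) and A2 d = 0."""
    m, n = A1.shape
    # Variables z = (d, t): minimise -t s.t. A1 d + t*1 <= 0, A2 d = 0, 0<=t<=1.
    c = np.concatenate([np.zeros(n), [-1.0]])
    A_ub = np.hstack([A1, np.ones((m, 1))])
    b_ub = np.zeros(m)
    A_eq = np.hstack([A2, np.zeros((A2.shape[0], 1))])
    b_eq = np.zeros(A2.shape[0])
    bounds = [(None, None)] * n + [(0.0, 1.0)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.status == 0 and -res.fun > 1e-9

# Hypothetical data: with A1 = I, d = (-1, -1) satisfies A1 d < 0 and
# A2 = [[1, -1]] keeps it, but A2 = [[1, 0]] forces d1 = 0 and kills row 1.
A1 = np.array([[1.0, 0.0], [0.0, 1.0]])
assert strict_system_consistent(A1, np.array([[1.0, -1.0]]))
assert not strict_system_consistent(A1, np.array([[1.0, 0.0]]))
```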
Proof (continued):
Now consider the following two sets:
S1 = {(z1 , z2 ) : z1 = A1 d , z2 = A2 d , d ∈ Rn },
S2 = {(z1 , z2 ) : z1 < 0, z2 = 0}.
Note that S1 and S2 are nonempty convex sets and, since the
system A1 d < 0, A2 d = 0 has no solution, S1 ∩ S2 = ∅.
Proof (continued):
Hence
p1 A1 d + p2 A2 d ≥ p1 z1 + p2 z2 ,
for each d ∈ Rn and (z1, z2) ∈ cl S2 = {(z1, z2) : z1 ≤ 0, z2 = 0}.
Proof (continued):
Summarising, we have found a nonzero vector p = (p1 , p2 ) with
p1 ≥ 0 such that A1 p1 + A2 p2 = 0, where A1 is the matrix whose
rows are ∇f (x̄ ) and ∇gi (x̄ ) for i ∈ I, and A2 is the matrix whose
rows are ∇hi (x̄ ) for i = 1, . . . , `.
where
∇g (x̄ ) is the m × n Jacobian matrix whose i th row is ∇gi (x̄ ),
∇h (x̄ ) is the ` × n Jacobian matrix whose i th row is ∇hi (x̄ ),
g(x̄) is the m-vector whose ith component is gi(x̄).
Any point x̄ for which there exist Lagrange multipliers such that the
FJ conditions are satisfied is called an FJ point.
The constraint set S is

S = {x ∈ R² : g1(x) ≤ 0, g2(x) ≤ 0, g3(x) ≤ 0}.

Consider the feasible point x̄.

[Figure: S bounded by the curves g1(x) = 0, g2(x) = 0 and g3(x) = 0, with the gradients ∇g1, ∇g2, ∇g3, ∇f and −∇f drawn at x̄.]
Consider the gradients of the active constraints at x̄, ∇g1(x̄) and ∇g2(x̄).

[Figure: the same set S, with ∇g1(x̄) and ∇g2(x̄) drawn at x̄.]
For the given contours of the objective function f, we have that
u0(−∇f(x̄)) is in the cone spanned by ∇g1(x̄) and ∇g2(x̄) with u0 > 0.

[Figure: contours of f, with −∇f(x̄) lying in the cone spanned by ∇g1(x̄) and ∇g2(x̄) at x̄.]
The FJ conditions are

u0 ∇f(x̄)ᵀ + ∇g(x̄)ᵀ u = 0,
uᵀ g(x̄) = 0,
(u0, u) ≥ (0, 0),
(u0, u) ≠ (0, 0).

x̄ is an FJ point with u0 > 0. It is also a local minimum.

[Figure: −∇f(x̄) in the cone spanned by the active constraint gradients at x̄.]
x̄ is an FJ point with u0 = 0.
It is also a local minimum.

[Figure: a point x̄ on g1(x) = 0 and g3(x) = 0 where the FJ conditions hold with u0 = 0; f decreases near x̄.]
x̄ is an FJ point with u0 = 0.
It is also a local maximum.

[Figure: the same construction at a point x̄ where the FJ conditions hold with u0 = 0, even though x̄ is a local maximum of f over S.]
∇f(x̄) + ∑_{i∈I} ui ∇gi(x̄) + ∑_{i=1}^{ℓ} vi ∇hi(x̄) = 0,   (20)
ui ≥ 0 for i ∈ I.
û0 ∇f(x̄) + ∑_{i∈I} ûi ∇gi(x̄) + ∑_{i=1}^{ℓ} v̂i ∇hi(x̄) = 0,   (22)
û0, ûi ≥ 0 for i ∈ I.
where
∇g (x̄ ) is the m × n Jacobian matrix whose i th row is ∇gi (x̄ ),
∇h (x̄ ) is the ` × n Jacobian matrix whose i th row is ∇hi (x̄ ),
g(x̄) is the m-vector whose ith component is gi(x̄).
Any point x̄ for which there exist Lagrange multipliers that satisfy
the KKT conditions (23) is called a KKT point.
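Verifying that a candidate point is a KKT point is a direct calculation. A sketch for the hypothetical problem minimise x₁² + x₂² subject to 1 − x₁ − x₂ ≤ 0, with candidate x̄ = (0.5, 0.5) and multiplier u = 1 (not an example from the notes):

```python
import numpy as np

# Hypothetical problem: minimise x1^2 + x2^2 s.t. g(x) = 1 - x1 - x2 <= 0.
x_bar = np.array([0.5, 0.5])
u = 1.0

grad_f = 2 * x_bar                   # gradient of the objective at x_bar
grad_g = np.array([-1.0, -1.0])      # gradient of the constraint
g = 1 - x_bar[0] - x_bar[1]

assert abs(g) <= 1e-12                        # primal feasibility (active)
assert u >= 0                                 # dual feasibility
assert np.allclose(grad_f + u * grad_g, 0)    # stationarity
assert abs(u * g) <= 1e-12                    # complementary slackness
```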
[Figure: the set S with ∇g1(x̄), ∇g3(x̄), ∇f(x̄) and −∇f(x̄) at x̄, illustrating when x̄ is a KKT point.]
When the constraints are linear, the KKT conditions are always
necessary optimality conditions, irrespective of the objective
function.
Let J = {i : v̄i > 0} and K = {i : v̄i < 0}. Further, suppose that f is
pseudoconvex at x̄, gi is quasiconvex at x̄ for i ∈ I, hi is
quasiconvex at x̄ for i ∈ J, and hi is quasiconcave at x̄ (that is, −hi
is quasiconvex at x̄) for i ∈ K . Then x̄ is a global optimal solution to
problem (13).
minimise ½ xᵀHx + xᵀc,   (25)
subject to:
AI x ≤ bI,
AE x = bE,
The KKT conditions (23) for the QP problem defined in (25) are:

PF: AI x̄ ≤ bI,
    AE x̄ = bE,
DF: H x̄ + c + AIᵀ u + AEᵀ v = 0,   (26)
    u ≥ 0,
CS: uᵀ(AI x̄ − bI) = 0,
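When there are no inequality rows, the conditions (26) reduce to one linear system in (x, v), which can be solved directly. A sketch with hypothetical data H = 2I, c = (−2, −4), and the single equality constraint x₁ + x₂ = 1:

```python
import numpy as np

# Equality-constrained QP: minimise 0.5 x'Hx + x'c s.t. A_E x = b_E.
# Hypothetical data (not from the notes).
H = np.array([[2.0, 0.0], [0.0, 2.0]])
c = np.array([-2.0, -4.0])
A_E = np.array([[1.0, 1.0]])
b_E = np.array([1.0])

n, l = H.shape[0], A_E.shape[0]
# KKT system: [H  A_E'; A_E  0] [x; v] = [-c; b_E]
K = np.block([[H, A_E.T], [A_E, np.zeros((l, l))]])
rhs = np.concatenate([-c, b_E])
sol = np.linalg.solve(K, rhs)
x_bar, v = sol[:n], sol[n:]

assert np.allclose(H @ x_bar + c + A_E.T @ v, 0)   # dual feasibility (DF)
assert np.allclose(A_E @ x_bar, b_E)               # primal feasibility (PF)
```

Since H is positive definite here, the unique solution of this linear system is the global minimiser of the QP.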