
CME307/MS&E311: Optimization Lecture Note #07

Optimality Conditions for General Constrained Optimization

Yinyu Ye
Department of Management Science and Engineering
Stanford University
Stanford, CA 94305, U.S.A.

http://www.stanford.edu/˜yyye
Chapter 11.1-8


KKT Optimality Condition Illustration in One-Dimension

Figure 1: Global and Local Minimizers of a One-Variable Function on the Interval [a, e] (the horizontal axis marks the points a, b, c, d, e).


Consider a differentiable function f of one variable defined on an interval F = [a, e]. If an interior point x̄ is a local/global minimizer, then f′(x̄) = 0; if the left endpoint x̄ = a is a local minimizer, then f′(a) ≥ 0; and if the right endpoint x̄ = e is a local minimizer, then f′(e) ≤ 0. The first-order necessary condition (FONC) summarizes the three cases by a unified set of optimality/complementarity slackness conditions:

a ≤ x ≤ e,  f′(x) = y_a + y_e,  y_a ≥ 0,  y_e ≤ 0,  y_a(x − a) = 0,  y_e(x − e) = 0.

If f′(x̄) = 0, then for x̄ to be a local minimizer it is also necessary that f(x) be locally convex at x̄. How can we tell whether the function is locally convex at the solution? It is necessary that f′′(x̄) ≥ 0, which is called the second-order necessary condition (SONC); we will explore it further below.

These conditions are still not, in general, sufficient: they do not by themselves distinguish between local minimizers, local maximizers, and saddle points.

If the second-order sufficient condition (SOSC), f′′(x̄) > 0, is satisfied, or the function is strictly locally convex at x̄, then x̄ is a local minimizer.

Thus, if the function is convex everywhere, the first-order necessary condition is already sufficient.
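As a small side illustration (not part of the original notes), the unified conditions above can be checked mechanically at a candidate point; the sketch below, in Python, uses the fact that complementarity forces y_e = 0 at the left endpoint and y_a = 0 at the right endpoint.

```python
def fonc_1d(fprime, x, a, e, tol=1e-8):
    """Check the unified first-order condition for min f over [a, e] at the point x."""
    if abs(x - a) < tol:             # left endpoint: y_e = 0, so we need f'(a) >= 0
        return fprime(x) >= -tol
    if abs(x - e) < tol:             # right endpoint: y_a = 0, so we need f'(e) <= 0
        return fprime(x) <= tol
    return abs(fprime(x)) < tol      # interior point: f'(x) = 0

# f(x) = x^2 on [-1, 2]: the interior stationary point passes, the left endpoint does not.
print(fonc_1d(lambda x: 2 * x, x=0.0, a=-1.0, e=2.0))   # True
print(fonc_1d(lambda x: 2 * x, x=-1.0, a=-1.0, e=2.0))  # False, since f'(-1) = -2 < 0
```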


Second-Order Optimality Condition for Unconstrained Optimization

Theorem 1 (First-Order Necessary Condition) Let f(x) be a C¹ function where x ∈ R^n. Then, if x̄ is a minimizer, necessarily ∇f(x̄) = 0.

Theorem 2 (Second-Order Necessary Condition) Let f(x) be a C² function where x ∈ R^n. Then, if x̄ is a minimizer, necessarily

∇f(x̄) = 0 and ∇²f(x̄) ≽ 0.

Furthermore, if ∇²f(x̄) ≻ 0, then the condition becomes sufficient.
The proofs are based on the second-order Taylor expansion at x̄: if these conditions are not satisfied, then one can find a descent direction d and a small constant ᾱ > 0 such that

f(x̄ + αd) < f(x̄), ∀ 0 < α ≤ ᾱ.

For example, if ∇f(x̄) = 0 and ∇²f(x̄) ̸≽ 0, an eigenvector of a negative eigenvalue of the Hessian is a descent direction from x̄.

Again, they may still not be sufficient, e.g., f(x) = x³.
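To make the second-order test concrete (a hypothetical numeric sketch, not from the notes), one can inspect the eigenvalues of the Hessian at a stationary point:

```python
import numpy as np

def classify_stationary_point(hessian):
    """Classify a stationary point (where the gradient vanishes) by its Hessian."""
    eig = np.linalg.eigvalsh(hessian)
    if np.all(eig > 0):
        return "SOSC holds: local minimizer"
    if np.all(eig >= 0):
        return "SONC holds but inconclusive (compare f(x) = x^3 at x = 0)"
    return "not a local minimizer: a negative-eigenvalue eigenvector is a descent direction"

# Saddle point of f(x1, x2) = x1^2 - x2^2 at the origin:
print(classify_stationary_point(np.array([[2.0, 0.0], [0.0, -2.0]])))
```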


General Constrained Optimization

(GCO) min f (x)


s.t. h(x) = 0 ∈ Rm ,
c(x) ≥ 0 ∈ Rp .
We have dealt with the cases where the feasible region is a convex polyhedron and/or can be represented by nonlinear convex cones intersected with linear equality constraints.

We now study the case where the only assumption is that all functions are in C¹ (and later C²), whether convex or nonconvex.

We again establish optimality conditions to qualify/verify local optimizers. These conditions reveal the qualitative structure of (local) optimizers and lead to quantitative algorithms for numerically finding a local optimizer or a KKT solution.

The main proof idea is that if x̄ is a local minimizer of (GCO), then it must be a local minimizer of the problem in which the constraints are linearized using the first-order Taylor expansion.


Hypersurface and Implicit Function Theorem

Consider the (intersection) of Hypersurfaces (vs. Hyperplanes):

{x ∈ Rn : h(x) = 0 ∈ Rm , m ≤ n}

When the functions hi(x) are C¹ functions, we say the surface is smooth.

A point x̄ on the surface is called a regular point if ∇h(x̄) has rank m, that is, the rows (the gradient vectors of the hi(·) at x̄) are linearly independent. For example, (0; 0) is not a regular point of

{(x1; x2) ∈ R² : x1² + (x2 − 1)² − 1 = 0, x1² + (x2 + 1)² − 1 = 0}.
Based on the Implicit Function Theorem (Appendix A of the Text), if x̄ is a regular point and m < n, then for every d ∈ Tx̄ = {z : ∇h(x̄)z = 0} there exists a curve x(t) on the hypersurface, parametrized by a scalar t in a sufficiently small interval [−a, a], such that

h(x(t)) = 0, x(0) = x̄, ẋ(0) = d.

Tx̄ is called the tangent-space or tangent-plane of the constraints at x̄.
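For instance (a hypothetical numeric sketch, not from the notes), regularity of the two-circle example above can be checked by computing the rank of the constraint Jacobian:

```python
import numpy as np

# h1(x) = x1^2 + (x2 - 1)^2 - 1,  h2(x) = x1^2 + (x2 + 1)^2 - 1
def constraint_jacobian(x1, x2):
    return np.array([[2 * x1, 2 * (x2 - 1)],
                     [2 * x1, 2 * (x2 + 1)]])

J = constraint_jacobian(0.0, 0.0)
print(np.linalg.matrix_rank(J))   # 1 < m = 2, so (0; 0) is not a regular point
```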


Figure 2: Tangent Plane on a Hypersurface at Point x∗



First-Order Necessary Conditions for Constrained Optimization I

Lemma 1 Let x̄ be a feasible solution and a regular point of the hypersurface

{x : h(x) = 0, ci(x) = 0, i ∈ Ax̄},

where the active-constraint set is Ax̄ = {i : ci(x̄) = 0}. If x̄ is a (local) minimizer of (GCO), then there is no d satisfying the linear constraints

∇f(x̄)d < 0,
∇h(x̄)d = 0 ∈ R^m,                (1)
∇ci(x̄)d ≥ 0, ∀ i ∈ Ax̄.

This lemma was proved earlier for linear constraints, in which case d is a feasible direction; it needs more work otherwise, since there may be no feasible direction when the constraints are nonlinear.

x̄ being a regular point is often referred to as a Constraint Qualification condition.


Proof
Suppose we have a d̄ that satisfies all the linear constraints in (1). Then ∇f(x̄)d̄ < 0, so d̄ is a descent-direction vector. Denote by A^d_x̄ (⊂ Ax̄) the set of inequalities in (1) that are active at d̄. Then x̄ remains a regular point of the hypersurface

{x : h(x) = 0, ci(x) = 0, i ∈ A^d_x̄}.


Thus, there is a curve x(t) such that

h(x(t)) = 0, ci(x(t)) = 0, i ∈ A^d_x̄, x(0) = x̄, ẋ(0) = d̄,

for t ∈ [0, a] with a sufficiently small positive constant a.

Also, ∇ci(x̄)d̄ > 0 for all i ∉ A^d_x̄ with i ∈ Ax̄, and ci(x̄) > 0 for all i ∉ Ax̄. Then, from Taylor's theorem, ci(x(t)) > 0 for all i ∉ A^d_x̄, so that x(t) is a feasible curve for the original (GCO) problem for t ∈ [0, a]. Thus, x̄ must also be a local minimizer along the curve x(t).

Let ϕ(t) = f(x(t)). Then t = 0 must be a local minimizer of ϕ(t) for 0 ≤ t ≤ a, so that

0 ≤ ϕ′(0) = ∇f(x(0))ẋ(0) = ∇f(x̄)d̄ < 0, a contradiction.


First-Order Necessary Conditions for Constrained Optimization II

Theorem 3 (First-Order or KKT Optimality Condition) Let x̄ be a (local) minimizer of (GCO) that is also a regular point of {x : h(x) = 0, ci(x) = 0, i ∈ Ax̄}. Then, for some multipliers (ȳ, s̄ ≥ 0),

∇f(x̄) = ȳᵀ∇h(x̄) + s̄ᵀ∇c(x̄)                (2)

and (complementarity slackness)

s̄i ci(x̄) = 0, ∀ i.

The proof is again based on the theorem of the alternative (Farkas' lemma). The complementarity slackness condition follows since ci(x̄) = 0 for all i ∈ Ax̄, and for i ∉ Ax̄ we simply set s̄i = 0.

A solution satisfying these conditions is called a KKT point or KKT solution of (GCO): any local minimizer x̄ that is also a regular point must be a KKT solution, but the converse may not be true.


KKT via the Lagrangian Function

It is more convenient to introduce the Lagrangian Function associated with generally constrained
optimization:
L(x, y, s) = f (x) − yT h(x) − sT c(x),
where multipliers y of the equality constraints are “free” and s ≥ 0 for the “greater or equal to” inequality
constraints, so that the KKT condition (2) can be written as

∇x L(x̄, ȳ, s̄) = 0.

The Lagrangian function can be viewed as aggregating the original objective function with penalty terms on constraint violations.

In theory, one can adjust the penalty multipliers (y, s ≥ 0) to repeatedly solve the following so-called
Lagrangian Relaxation Problem:

(LRP ) minx L(x, y, s).
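As a toy illustration (a hypothetical example, not from the notes), fixing the multiplier at the right value and minimizing the Lagrangian recovers the constrained minimizer; consider min x1² + x2² s.t. x1 + x2 = 1, whose KKT multiplier is y = 1:

```python
import numpy as np
from scipy.optimize import minimize

# L(x, y) for min x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0
def lagrangian(x, y):
    return x @ x - y * (x[0] + x[1] - 1.0)

y = 1.0                                            # the "right" multiplier for this problem
res = minimize(lambda x: lagrangian(x, y), x0=np.zeros(2))
print(res.x)                                       # approx. [0.5, 0.5]: feasible and optimal
```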


Constraint Qualification and the KKT Theorem

One condition under which a local minimizer x̄ must be a KKT solution is the constraint qualification: x̄ is a regular point of the constraints. Otherwise, a local minimizer may not be a KKT solution: consider x̄ = (0; 0) for the convex nonlinearly-constrained problem

min x1, s.t. x1² + (x2 − 1)² − 1 ≤ 0, x1² + (x2 + 1)² − 1 ≤ 0.

On the other hand, even when the regular point condition does not hold, the KKT theorem may still be true:

min x2, s.t. x1² + (x2 − 1)² − 1 ≤ 0, x1² + (x2 + 1)² − 1 ≤ 0;

that is, x̄ = (0; 0) is a KKT solution of the latter problem.


Therefore, finding a KKT solution is a plausible way to find a local minimizer.


Summary Theorem of KKT Conditions for GCO

We now consider optimality conditions for problems having three types of inequalities:

(GCO)   min  f(x)
        s.t. ci(x) (≤, =, ≥) 0, i = 1, ..., m.   (Original Problem Constraints (OPC))

For any feasible point x of (GCO) define the active constraint set by Ax = {i : ci (x) = 0}.
Let x̄ be a local minimizer of (GCO) that is a regular point of the hypersurface of the active constraints. Then there exist multipliers ȳ such that

∇f(x̄) = ȳᵀ∇c(x̄),                    (Lagrangian Derivative Conditions (LDC))
ȳi (≤, 'free', ≥) 0, i = 1, ..., m,   (Multiplier Sign Constraints (MSC))
ȳi ci(x̄) = 0, i = 1, ..., m.          (Complementarity Slackness Conditions (CSC))

The complete First-Order KKT Conditions consist of these four parts!
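To make the four parts concrete (a hypothetical numeric sketch, not from the notes), consider min x1 + x2 subject to the single "≥" constraint c(x) = 1 − ∥x∥² ≥ 0, whose minimizer is x̄ = −(1/√2, 1/√2) with multiplier ȳ = 1/√2:

```python
import numpy as np

x = np.array([-1.0, -1.0]) / np.sqrt(2.0)     # candidate minimizer
y = 1.0 / np.sqrt(2.0)                        # candidate multiplier for the ">=" constraint

grad_f = np.array([1.0, 1.0])                 # objective gradient
c      = 1.0 - x @ x                          # constraint value
grad_c = -2.0 * x                             # constraint gradient

print("OPC:", c >= -1e-12)                            # feasibility
print("LDC:", np.allclose(grad_f, y * grad_c))        # Lagrangian derivative condition
print("MSC:", y >= 0)                                 # multiplier sign for a ">=" constraint
print("CSC:", abs(y * c) < 1e-12)                     # complementarity slackness
```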


Recall SOCP Relaxation of Sensor Network Localization

Given ak ∈ R² and Euclidean distances dk, k = 1, 2, 3, find x ∈ R² such that

min_x  0ᵀx,
s.t.   ∥x − ak∥² − dk² ≤ 0, k = 1, 2, 3.

L(x, y) = 0ᵀx − Σ_{k=1}^{3} yk (∥x − ak∥² − dk²),

0 = Σ_{k=1}^{3} yk (x − ak),                  (LDC)
yk ≤ 0, k = 1, 2, 3,                          (MSC)
yk (∥x − ak∥² − dk²) = 0, k = 1, 2, 3.        (CSC)


Arrow-Debreu’s Exchange Market with Linear Economy

Each trader i, equipped with a goods-bundle vector wi, trades with others to maximize his/her individual utility function. An equilibrium price is an assignment of prices to goods such that, when every trader sells his/her own goods bundle and buys a maximal bundle of goods, the market clears. Thus, trader i's optimization problem, for given prices pj, j ∈ G, is

maximize    uiᵀxi := Σ_{j∈G} uij xij
subject to  pᵀxi := Σ_{j∈G} pj xij ≤ pᵀwi,
            xij ≥ 0, ∀ j.

Then, an equilibrium price vector is one for which there exist maximizers x(p)i such that

Σ_i x(p)ij = Σ_i wij, ∀ j.


Example of Arrow-Debreu’s Model

Traders 1 and 2 have goods bundles

w1 = (1; 0),   w2 = (0; 1).

Their optimization problems for given prices px, py are:

Trader 1:  max 2x1 + y1,  s.t. px·x1 + py·y1 ≤ px,  x1, y1 ≥ 0;
Trader 2:  max 3x2 + y2,  s.t. px·x2 + py·y2 ≤ py,  x2, y2 ≥ 0.

One can normalize the prices p such that one of them equals 1. This would be one of the problems in
HW2.
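For orientation only (a hypothetical numeric sketch with arbitrarily chosen prices, not a solution to the homework and not an equilibrium claim), trader 1's problem for fixed prices is a small linear program that can be solved with scipy.optimize.linprog:

```python
from scipy.optimize import linprog

px, py = 1.0, 1.0                      # illustrative prices only
# maximize 2*x1 + y1  <=>  minimize -(2*x1 + y1)
res = linprog(c=[-2.0, -1.0],
              A_ub=[[px, py]],          # budget: px*x1 + py*y1 <= px*1 + py*0
              b_ub=[px])                # variables are nonnegative by default
print(res.x)                            # [1.0, 0.0]: trader 1 spends the whole budget on good x
```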


Equilibrium conditions of the Arrow-Debreu market

Similarly, the necessary and sufficient equilibrium conditions of the Arrow-Debreu market are

pj ≥ uij · (pᵀwi)/(uiᵀxi), ∀ i, j,
Σ_i xij = Σ_i wij, ∀ j,
pj > 0, xi ≥ 0, ∀ i, j;

where the budget for trader i is replaced by pᵀwi. Again, the nonlinear inequality can be rewritten as

log(uiᵀxi) + log(pj) − log(pᵀwi) ≥ log(uij), ∀ i, j with uij > 0.


Let yj = log(pj), i.e., pj = e^{yj}, for all j. Then these inequalities become

log(uiᵀxi) + yj − log(Σ_j wij e^{yj}) ≥ log(uij), ∀ i, j with uij > 0.

Note that the function on the left is concave in xi and yj .

Theorem 4 The equilibrium set of the Arrow-Debreu market is convex in the allocations and the logarithms of the prices.


Exchange Markets with Other Economies

Cobb-Douglas Utility:
ui(xi) = Π_{j∈G} xij^{uij},  xij ≥ 0.

Leontief Utility:
ui(xi) = min_{j∈G} { xij / uij },  xij ≥ 0.
Again, the equilibrium price vector is the one such that there are maximizers to clear the market.


Example of Geometric Optimization

Consider the Geometric Optimization Problem


min_x  Σ_{i=1}^{m} ai ( Π_{j=1}^{n} xj^{uij} )
s.t.   Π_{j=1}^{n} xj^{ckj} = bk, k = 1, ..., K,
       xj > 0, ∀ j,

where the coefficients ai ≥ 0 ∀i and bk > 0 ∀k .

An example:

min_{x,y,z}  xy + yz + zx
s.t.         xyz = 1,
             (x, y, z) > 0.


Convexification of Geometric Optimization

Let yj = log(xj) so that xj = e^{yj}. Then the problem becomes

min_y  Σ_{i=1}^{m} ai e^{Σ_{j=1}^{n} uij yj}
s.t.   Σ_{j=1}^{n} ckj yj = log(bk), k = 1, ..., K,
       yj free, ∀ j.

This is a convex objective function with linear constraints!

min_{u,v,w}  e^{u+v} + e^{v+w} + e^{w+u}
s.t.         u + v + w = 0,
             (u, v, w) free.
Now the KKT solution suffices!
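As a quick numeric check (a hypothetical sketch, not from the notes), one can solve the convexified example by eliminating the linear constraint (w = −u − v) and mapping back to the original variables:

```python
import numpy as np
from scipy.optimize import minimize

def obj(t):
    u, v = t
    w = -u - v                      # enforce u + v + w = 0 by elimination
    return np.exp(u + v) + np.exp(v + w) + np.exp(w + u)

res = minimize(obj, x0=np.array([0.3, -0.7]))
u, v = res.x
w = -u - v
x, y, z = np.exp(u), np.exp(v), np.exp(w)
print(x, y, z, x * y + y * z + z * x)   # approx. 1, 1, 1 with objective value 3
```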


Second-Order Necessary Conditions for Constrained Optimization

Now in addition we assume all functions are in C 2 , that is, twice continuously differentiable. Recall the
tangent linear sub-space at x̄:

Tx̄ := {z : ∇h(x̄)z = 0, ∇ci (x̄)z = 0 ∀i ∈ Ax̄ }.

Theorem 5 Let x̄ be a (local) minimizer of (GCO) and a regular point of hypersurface


{x : h(x) = 0, ci (x) = 0, i ∈ Ax̄ }, and let ȳ, s̄ denote Lagrange multipliers such that (x̄, ȳ, s̄)
satisfies the (first-order) KKT conditions of (GCO). Then, it is necessary to have

dT ∇2x L(x̄, ȳ, s̄)d ≥ 0 ∀ d ∈ Tx̄ .

The Hessian of the Lagrangian function needs to be positive semidefinite on the tangent space.


Proof
The proof reduces to the one-dimensional case by considering the objective function ϕ(t) = f(x(t)) along a feasible curve x(t) on the surface of ALL active constraints. Since 0 is a (local) minimizer of ϕ(t) in an interval [−a, a] for a sufficiently small a > 0, we must have ϕ′(0) = 0, so that

0 ≤ ϕ′′(t)|_{t=0} = ẋ(0)ᵀ∇²f(x̄)ẋ(0) + ∇f(x̄)ẍ(0) = dᵀ∇²f(x̄)d + ∇f(x̄)ẍ(0).


Let all active constraints (including the equality ones) be written as h(x) = 0. Differentiating the equation ȳᵀh(x(t)) = Σ_i ȳi hi(x(t)) = 0 twice, we obtain

0 = ẋ(0)ᵀ[Σ_i ȳi ∇²hi(x̄)]ẋ(0) + ȳᵀ∇h(x̄)ẍ(0) = dᵀ[Σ_i ȳi ∇²hi(x̄)]d + ȳᵀ∇h(x̄)ẍ(0).

Subtracting the second expression from the first on both sides and using the FONC (∇f(x̄) = ȳᵀ∇h(x̄)):

0 ≤ dᵀ∇²f(x̄)d − dᵀ[Σ_i ȳi ∇²hi(x̄)]d + ∇f(x̄)ẍ(0) − ȳᵀ∇h(x̄)ẍ(0)
  = dᵀ∇²f(x̄)d − dᵀ[Σ_i ȳi ∇²hi(x̄)]d
  = dᵀ∇²_x L(x̄, ȳ, s̄)d.


Note that this inequality holds for every d ∈ Tx̄ .


Second-Order Sufficient Conditions for GCO

Theorem 6 Let x̄ be a regular point of (GCO) with equality constraints only, and let ȳ be the Lagrange multipliers such that (x̄, ȳ) satisfies the (first-order) KKT conditions of (GCO). If, in addition,

dᵀ∇²_x L(x̄, ȳ)d > 0  ∀ 0 ≠ d ∈ Tx̄,

then x̄ is a local minimizer of (GCO).

See the proof in Chapter 11.5 of LY.

The SOSC for general (GCO) is proved in Chapter 11.8 of LY.


min (x1)² + (x2)²  s.t.  (x1)²/4 + (x2)² − 1 = 0

Figure 3: FONC and SONC for Constrained Minimization


L(x1, x2, y) = (x1)² + (x2)² − y(−(x1)²/4 − (x2)² + 1),

∇x L(x1, x2, y) = (2x1(1 + y/4), 2x2(1 + y)),

∇²x L(x1, x2, y) = diag( 2(1 + y/4), 2(1 + y) ),

Tx := {(z1, z2) : (x1/4)z1 + x2 z2 = 0}.


We see that there are two possible values for y, either −4 or −1, which lead to a total of four KKT points:

(x1; x2; y) = (2; 0; −4), (−2; 0; −4), (0; 1; −1), and (0; −1; −1).


Consider the first KKT point:


 
∇²x L(2, 0, −4) = diag(0, −6),   Tx̄ = {(z1, z2) : z1 = 0}.

Then the Hessian is not positive semidefinite on Tx̄, since

dᵀ∇²x L(2, 0, −4)d = −6d2² < 0 for all 0 ≠ d ∈ Tx̄.

Consider the third KKT point:


 
∇²x L(0, 1, −1) = diag(3/2, 0),   Tx̄ = {(z1, z2) : z2 = 0}.

Then the Hessian is positive definite on Tx̄, since

dᵀ∇²x L(0, 1, −1)d = (3/2)d1² > 0, ∀ 0 ≠ d ∈ Tx̄.

This would be sufficient for the third KKT solution to be a local minimizer.
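The same check can be scripted (a hypothetical numeric sketch, not from the notes), evaluating the curvature of the Lagrangian along the tangent direction at each KKT point:

```python
import numpy as np

# min x1^2 + x2^2  s.t.  x1^2/4 + x2^2 - 1 = 0
def hess_L(y):
    return np.diag([2.0 * (1.0 + y / 4.0), 2.0 * (1.0 + y)])

kkt_points = [(( 2.0, 0.0), -4.0), ((-2.0, 0.0), -4.0),
              (( 0.0, 1.0), -1.0), (( 0.0, -1.0), -1.0)]

for (x1, x2), y in kkt_points:
    grad_h = np.array([x1 / 2.0, 2.0 * x2])     # gradient of the constraint
    d = np.array([-grad_h[1], grad_h[0]])        # spans the tangent space in R^2
    print((x1, x2), y, "curvature on T:", d @ hess_L(y) @ d)
# (+-2, 0): negative curvature -> not local minimizers; (0, +-1): positive -> local minimizers
```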


Test Positive Semidefiniteness in a Subspace

In the second-order test, we typically want to know whether or not

dᵀQd ≥ 0, ∀ d such that Ad = 0,

for a given symmetric matrix Q and a rectangular matrix A. (In this case, the subspace is the null space of matrix A.) This test itself might be a nonconvex optimization problem.

But it is known that (assuming A has full row rank, so that AAᵀ is invertible) d is in the null space of matrix A if and only if

d = (I − Aᵀ(AAᵀ)⁻¹A)u = PA u

for some vector u ∈ R^n, where PA is called the projection matrix of A. Thus, the test becomes whether or not

uᵀ PA Q PA u ≥ 0, ∀ u ∈ R^n,

that is, we just need to test positive semidefiniteness of PA Q PA as usual.
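A small numeric sketch of this test (hypothetical, not from the notes):

```python
import numpy as np

def psd_on_nullspace(Q, A, tol=1e-10):
    """Test d^T Q d >= 0 for all d with Ad = 0, assuming A has full row rank."""
    n = A.shape[1]
    P = np.eye(n) - A.T @ np.linalg.solve(A @ A.T, A)   # projection onto the null space of A
    return bool(np.all(np.linalg.eigvalsh(P @ Q @ P) >= -tol))

Q = np.array([[1.0, 0.0], [0.0, -1.0]])   # indefinite on R^2 ...
A = np.array([[0.0, 1.0]])                # ... but PSD on the null space {d : d2 = 0}
print(psd_on_nullspace(Q, A))             # True
```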


Spherical Constrained Nonconvex Quadratic Optimization

(SCQP)  min  xᵀQx + cᵀx
        s.t. ∥x∥² (≤, =) 1.

Theorem 7 The FONC and SONC, that is, the following conditions on x together with the multiplier y,

∥x∥² (≤, =) 1,          (OPC)
2Qx + c − 2yx = 0,      (LDC)
y (≤, 'free') 0,        (MSC)
y(1 − ∥x∥²) = 0,        (CSC)
(Q − yI) ≽ 0,           (SOC)

are necessary and sufficient for x to be a global minimizer of (SCQP).
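These conditions suggest a simple computational scheme for the equality-constrained case (a minimal sketch, assuming the "easy case" where c has a nonzero component along the eigenvector of the smallest eigenvalue; this is illustrative, not the course's prescribed algorithm): pick y below λ_min(Q) so that Q − yI ≽ 0, solve 2(Q − yI)x = −c from the LDC, and bisect on y until ∥x∥ = 1.

```python
import numpy as np

def scqp_equality(Q, c, iters=200):
    """Sketch: min x^T Q x + c^T x  s.t.  ||x|| = 1, via 2(Q - y I)x = -c with Q - y I >= 0."""
    lam_min = np.linalg.eigvalsh(Q).min()
    lo, hi = lam_min - 1e6, lam_min - 1e-12       # search y strictly below lambda_min(Q)
    for _ in range(iters):                        # ||x(y)|| increases as y approaches lambda_min
        y = 0.5 * (lo + hi)
        x = np.linalg.solve(2.0 * (Q - y * np.eye(len(c))), -c)
        lo, hi = (y, hi) if np.linalg.norm(x) < 1.0 else (lo, y)
    return x, y

Q = np.array([[1.0, 0.0], [0.0, -2.0]])           # indefinite Q: a nonconvex problem
c = np.array([1.0, 1.0])
x, y = scqp_equality(Q, c)
print(x, y, np.linalg.norm(x))                    # x on the unit sphere, y <= lambda_min(Q)
```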
