Optimality Conditions For General Constrained Optimization
CME307/MS&E311: Optimization Lecture Note #07
Yinyu Ye
Department of Management Science and Engineering
Stanford University
Stanford, CA 94305, U.S.A.
http://www.stanford.edu/~yyye
Chapter 11.1-8
[Figure: a one-dimensional function f(x) over the interval [a, e], with candidate points a, b, c, d, e marked on the x-axis.]
For the one-dimensional problem of minimizing f(x) over a ≤ x ≤ e, the first-order conditions are:

a ≤ x ≤ e, f′(x) = y_a + y_e, y_a ≥ 0, y_e ≤ 0, y_a(x − a) = 0, y_e(x − e) = 0.
If f′(x̄) = 0, then it is also necessary that f(x) be locally convex at x̄ for x̄ to be a local minimizer.

How can one tell whether the function is locally convex at the solution? It is necessary that f″(x̄) ≥ 0, which is called the second-order necessary condition (SONC); we will explore it further.

These conditions are still not, in general, sufficient: they do not distinguish among local minimizers, local maximizers, and saddle points.

If the second-order sufficient condition (SOSC), f″(x̄) > 0, is satisfied, or the function is strictly locally convex, then x̄ is a local minimizer.

Thus, if the function is convex everywhere, the first-order necessary condition is already sufficient.
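A minimal sketch that classifies a candidate point by these first- and second-order conditions (the test function, interval, and tolerance below are illustrative choices, not from the notes):

```python
def check_1d_conditions(fp, fpp, x, a, e, tol=1e-8):
    """Classify a candidate x in [a, e] using f' and f'' (fp, fpp)."""
    if x - a > tol and e - x > tol:            # interior point: y_a = y_e = 0
        if abs(fp(x)) > tol:
            return "not a KKT point"
        return "local minimizer (SOSC holds)" if fpp(x) > tol else "KKT point, SOSC fails"
    if abs(x - a) <= tol:                      # lower bound active: f'(a) = y_a >= 0
        return "KKT point at a" if fp(x) >= -tol else "not a KKT point"
    return "KKT point at e" if fp(x) <= tol else "not a KKT point"  # f'(e) = y_e <= 0

# Illustrative instance: f(x) = (x - 1)^2 on [0, 3]
fp, fpp = lambda x: 2.0 * (x - 1.0), lambda x: 2.0
print(check_1d_conditions(fp, fpp, 1.0, 0.0, 3.0))  # interior minimizer, SOSC holds
print(check_1d_conditions(fp, fpp, 0.0, 0.0, 3.0))  # f'(0) = -2 < 0: not a KKT point
```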
We now study the case where the only assumption is that all functions are in C¹ (and later C²), either convex or nonconvex.

We again establish optimality conditions to qualify/verify any local optimizer. These conditions give us qualitative structures of (local) optimizers and lead to quantitative algorithms to numerically find a local optimizer or a KKT solution.

The main proof idea is that if x̄ is a local minimizer of (GCO), then it must be a local minimizer of the problem where the constraints are linearized using the first-order Taylor expansion.
Consider the hypersurface

{x ∈ Rⁿ : h(x) = 0 ∈ Rᵐ, m ≤ n}.

For a point x̄ on the surface, we call it a regular point if ∇h(x̄) has rank m, that is, the rows (the gradient vectors of the hᵢ(·) at x̄) are linearly independent. For example, (0; 0) is not a regular point of

{(x₁; x₂) ∈ R² : x₁² + (x₂ − 1)² − 1 = 0, x₁² + (x₂ + 1)² − 1 = 0},

since the two gradients there, (0; −2) and (0; 2), are linearly dependent.
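Regularity can be checked numerically by computing the rank of the Jacobian ∇h(x̄); a quick check for this two-circle example:

```python
import numpy as np

# Gradients of h1 = x1^2 + (x2 - 1)^2 - 1 and h2 = x1^2 + (x2 + 1)^2 - 1
def grad_h(x1, x2):
    return np.array([[2 * x1, 2 * (x2 - 1)],
                     [2 * x1, 2 * (x2 + 1)]])

J = grad_h(0.0, 0.0)
print(np.linalg.matrix_rank(J))  # 1 < m = 2, so (0; 0) is not a regular point
```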
Based on the Implicit Function Theorem (Appendix A of the Text), if x̄ is a regular point and m < n, then for every d ∈ T_x̄ = {z : ∇h(x̄)z = 0} there exists a curve x(t) on the hypersurface, parametrized by a scalar t in a sufficiently small interval [−a, a], such that h(x(t)) = 0, x(0) = x̄, and ẋ(0) = d.
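As a small illustration, the tangent subspace T_x̄ can be computed numerically as the null space of ∇h(x̄); the surface and point below are illustrative choices, not from the notes:

```python
import numpy as np
from scipy.linalg import null_space

# Unit sphere h(x) = x1^2 + x2^2 + x3^2 - 1 at xbar = (1, 0, 0)
xbar = np.array([1.0, 0.0, 0.0])
grad_h = 2 * xbar.reshape(1, 3)   # 1 x 3 Jacobian of h at xbar
T = null_space(grad_h)            # orthonormal basis of T_xbar = {d : grad_h d = 0}
print(T)                          # columns span {e2, e3}, the tangent plane at xbar
```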
Define the active-constraint set Ax̄ = {i : cᵢ(x̄) = 0}. If x̄ is a (local) minimizer of (GCO), then there must be no d satisfying the linearized constraints:

∇f(x̄)d < 0,
∇h(x̄)d = 0 ∈ Rᵐ,    (1)
∇cᵢ(x̄)d ≥ 0, ∀i ∈ Ax̄.

This lemma was proved when the constraints are linear, in which case d is a feasible direction; it needs more work otherwise, since there may be no feasible direction when the constraints are nonlinear.
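Whether system (1) has a solution can be checked by a small linear program: minimize ∇f(x̄)d over the remaining linearized constraints, with box bounds that merely normalize d. A sketch with scipy (the helper name, bounds, and illustrative gradients are my own choices):

```python
import numpy as np
from scipy.optimize import linprog

def descent_direction_exists(grad_f, grad_h, grad_c_active, tol=1e-9):
    """Return True iff some d satisfies grad_f@d < 0, grad_h@d = 0, grad_c_i@d >= 0.

    The box bounds |d_j| <= 1 only normalize d; the cone is scale-invariant."""
    n = grad_f.size
    res = linprog(c=grad_f,
                  A_ub=-grad_c_active, b_ub=np.zeros(grad_c_active.shape[0]),
                  A_eq=grad_h, b_eq=np.zeros(grad_h.shape[0]),
                  bounds=[(-1.0, 1.0)] * n)
    return res.status == 0 and res.fun < -tol

# Illustrative gradients at xbar
gh = np.array([[0.0, 1.0]])   # equality row: forces d2 = 0
gc = np.array([[1.0, 0.0]])   # active inequality row: requires d1 >= 0
print(descent_direction_exists(np.array([1.0, 0.0]), gh, gc))   # False: no descent direction
print(descent_direction_exists(np.array([-1.0, 0.0]), gh, gc))  # True: d = (1, 0) works
```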
Proof
Suppose we have a d̄ satisfying all the linear constraints in (1). Then ∇f(x̄)d̄ < 0, so that d̄ is a descent-direction vector. Denote by A^d̄_x̄ (⊂ Ax̄) the active-constraint set at d̄ among the linear inequalities in (1), that is, the indices with ∇cᵢ(x̄)d̄ = 0. Then x̄ remains a regular point of the hypersurface

{x : h(x) = 0, cᵢ(x) = 0, i ∈ A^d̄_x̄},

so there is a feasible curve x(t) on this hypersurface with x(0) = x̄ and ẋ(0) = d̄.

Let ϕ(t) = f(x(t)). Then t = 0 must be a local minimizer of ϕ(t) for 0 ≤ t ≤ a, so that

0 ≤ ϕ′(0) = ∇f(x(0))ẋ(0) = ∇f(x̄)d̄ < 0, ⇒ a contradiction.
Theorem 3 (First-Order or KKT Optimality Condition) Let x̄ be a (local) minimizer of (GCO) and a regular point of {x : h(x) = 0, cᵢ(x) = 0, i ∈ Ax̄}. Then, for some multipliers (ȳ, s̄ ≥ 0),

∇f(x̄) = ȳᵀ∇h(x̄) + s̄ᵀ∇c(x̄), s̄ᵀc(x̄) = 0.    (2)

The proof is again based on the Alternative System Theory or Farkas Lemma. The complementarity slackness condition comes from cᵢ(x̄) = 0 for all i ∈ Ax̄; for i ∉ Ax̄ we simply set s̄ᵢ = 0.

A solution that satisfies these conditions is called a KKT point or solution of (GCO): any local minimizer x̄, if it is also a regular point, must be a KKT solution; but the reverse may not be true.
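Given the constraint gradients at a candidate x̄, one can attempt to recover the multipliers of (2) numerically. A sketch (the function name and the tiny instance are assumptions for illustration):

```python
import numpy as np

def kkt_multipliers(grad_f, grad_h, grad_c_active, tol=1e-8):
    """Solve grad_f = y^T grad_h + s^T grad_c over the active rows and check s >= 0.

    Inactive constraints get s_i = 0 by complementarity, so they are omitted;
    at a regular point the stacked gradients are independent and (y, s) is unique."""
    M = np.vstack([grad_h, grad_c_active])             # (m + |A|) x n
    lam, *_ = np.linalg.lstsq(M.T, grad_f, rcond=None)
    residual = np.linalg.norm(M.T @ lam - grad_f)
    y, s = lam[:grad_h.shape[0]], lam[grad_h.shape[0]:]
    return (residual <= tol and np.all(s >= -tol)), y, s

# Illustrative instance: min x1 + x2 s.t. x1 = 0, x2 >= 0, at xbar = (0, 0)
ok, y, s = kkt_multipliers(np.array([1.0, 1.0]),
                           np.array([[1.0, 0.0]]),     # grad h
                           np.array([[0.0, 1.0]]))     # grad of the active c
print(ok, y, s)  # True [1.] [1.] -- xbar is a KKT point
```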
It is more convenient to introduce the Lagrangian function associated with generally constrained optimization:

L(x, y, s) = f(x) − yᵀh(x) − sᵀc(x),

where the multipliers y of the equality constraints are "free" and s ≥ 0 for the "greater than or equal to" inequality constraints, so that the KKT condition (2) can be written as ∇ₓL(x̄, ȳ, s̄) = 0.

The Lagrangian function can be viewed as an aggregate of the original objective function plus penalty terms on constraint violations.

In theory, one can adjust the penalty multipliers (y, s ≥ 0) and repeatedly solve the following so-called Lagrangian Relaxation Problem:

(LRP)  minₓ L(x, y, s).
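A minimal sketch of one such relaxation step for fixed multipliers (the instance f, h, c and the multiplier values below are illustrative assumptions, not from the notes):

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 2.0)**2 + (x[1] - 1.0)**2
h = lambda x: np.array([x[0] + x[1] - 1.0])   # equality constraint h(x) = 0
c = lambda x: np.array([x[0]])                # inequality constraint c(x) >= 0

def lagrangian_relaxation(y, s, x0=np.zeros(2)):
    """Minimize L(x, y, s) = f(x) - y^T h(x) - s^T c(x) for fixed (y, s >= 0)."""
    L = lambda x: f(x) - y @ h(x) - s @ c(x)
    return minimize(L, x0).x

print(lagrangian_relaxation(y=np.array([-1.0]), s=np.array([0.0])))  # ~ (1.5, 0.5)
```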
One condition under which a local minimizer x̄ must always be a KKT solution is the constraint qualification: x̄ is a regular point of the constraints. Otherwise, a local minimizer may not be a KKT solution: consider x̄ = (0; 0) of a convex nonlinearly-constrained problem.

On the other hand, even when the regular point condition does not hold, the KKT theorem may still hold true:
We now consider optimality conditions for problems having three types of inequalities:

(GCO)  min f(x)
       s.t. cᵢ(x) (≤, =, ≥) 0, i = 1, ..., m.    (Original Problem Constraints (OPC))

For any feasible point x of (GCO), define the active-constraint set Ax = {i : cᵢ(x) = 0}.

Let x̄ be a local minimizer of (GCO) and a regular point on the hypersurface of the active constraints. Then there exist multipliers ȳ such that

∇f(x̄) = Σᵢ ȳᵢ∇cᵢ(x̄) and ȳᵢcᵢ(x̄) = 0 ∀i,

where ȳᵢ ≤ 0 for "≤" constraints, ȳᵢ is free for "=" constraints, and ȳᵢ ≥ 0 for "≥" constraints.
minₓ 0ᵀx
s.t. ∥x − aₖ∥² − dₖ² ≤ 0, k = 1, 2, 3,

L(x, y) = 0ᵀx − Σₖ₌₁³ yₖ(∥x − aₖ∥² − dₖ²),

0 = Σₖ₌₁³ yₖ(x − aₖ),    (LDC)
yₖ ≤ 0, k = 1, 2, 3.    (MSC)
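A quick numeric verification of (LDC) and (MSC) on an illustrative instance (the anchors aₖ, radii dₖ, and the candidate pair (x, y) are my own choices):

```python
import numpy as np

a = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]])   # anchor points a_k
d = np.array([np.sqrt(2.0), np.sqrt(2.0), 2.0])      # ranges d_k; x = (1, 1) is on all circles
x = np.array([1.0, 1.0])
y = np.array([-1.0, -1.0, -1.0])                     # candidate multipliers

feasible = np.allclose(np.sum((x - a)**2, axis=1), d**2)    # all three constraints active
ldc = np.allclose((y[:, None] * (x - a)).sum(axis=0), 0.0)  # sum_k y_k (x - a_k) = 0
msc = np.all(y <= 0)
print(feasible, ldc, msc)  # True True True -> (x, y) is a KKT pair
```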
Each trader i, equipped with a good-bundle vector wᵢ, trades with others to maximize his/her individual utility function. An equilibrium price is an assignment of prices to goods such that, when every trader sells his/her own good bundle and buys a utility-maximizing bundle of goods, the market clears. Thus, trader i's optimization problem, for given prices pⱼ, j ∈ G, is

maximize   uᵢᵀxᵢ := Σⱼ∈G uᵢⱼxᵢⱼ
subject to pᵀxᵢ := Σⱼ∈G pⱼxᵢⱼ ≤ pᵀwᵢ,
           xᵢⱼ ≥ 0, ∀j.

Then, the equilibrium price vector is one such that there are maximizers x(p)ᵢ's satisfying

Σᵢ x(p)ᵢⱼ = Σᵢ wᵢⱼ, ∀j.
One can normalize the prices p such that one of them equals 1. This would be one of the problems in
HW2.
Similarly, the necessary and sufficient equilibrium conditions of the Arrow-Debreu market are

pⱼ ≥ uᵢⱼ · (pᵀwᵢ)/(uᵢᵀxᵢ), ∀i, j,
Σᵢ xᵢⱼ = Σᵢ wᵢⱼ, ∀j,
pⱼ > 0, xᵢⱼ ≥ 0, ∀i, j;

where the budget for trader i is replaced by pᵀwᵢ. Again, the nonlinear inequality can be rewritten in terms of the logarithms of the prices.
Theorem 4 The equilibrium set of the Arrow-Debreu market is convex in the allocations and the logarithms of the prices.
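These conditions are straightforward to verify numerically. A sketch that checks them for a tiny two-trader, two-good linear market (all data below are illustrative assumptions):

```python
import numpy as np

def is_ad_equilibrium(u, w, p, x, tol=1e-8):
    """Check p_j >= u_ij * (p^T w_i)/(u_i^T x_i) for all i, j; clearing; p > 0; x >= 0."""
    budgets = w @ p                        # p^T w_i for each trader i
    utils = np.sum(u * x, axis=1)          # u_i^T x_i for each trader i
    price_cond = np.all(p[None, :] >= u * (budgets / utils)[:, None] - tol)
    clearing = np.allclose(x.sum(axis=0), w.sum(axis=0))
    return price_cond and clearing and np.all(p > 0) and np.all(x >= -tol)

u = np.array([[2.0, 1.0], [1.0, 2.0]])    # linear utilities
w = np.array([[1.0, 0.0], [0.0, 1.0]])    # endowments
p = np.array([1.0, 1.0])                  # candidate equilibrium prices
x = np.array([[1.0, 0.0], [0.0, 1.0]])    # each trader keeps the good he/she likes most
print(is_ad_equilibrium(u, w, p, x))      # True
```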
Cobb-Douglas Utility:

uᵢ(xᵢ) = ∏ⱼ∈G xᵢⱼ^uᵢⱼ, xᵢⱼ ≥ 0.

Leontief Utility:

uᵢ(xᵢ) = minⱼ∈G {xᵢⱼ/uᵢⱼ}, xᵢⱼ ≥ 0.
Again, the equilibrium price vector is one for which there are individual maximizers that clear the market.
min_{x,y,z}  xy + yz + zx
s.t.  xyz = 1,
      (x, y, z) > 0.
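A quick symbolic check of the KKT conditions for this example, using sympy (one expects the symmetric solution x = y = z = 1 with multiplier λ = 2):

```python
import sympy as sp

x, y, z, lam = sp.symbols('x y z lam', positive=True)
L = x*y + y*z + z*x - lam*(x*y*z - 1)              # Lagrangian of the example
stationarity = [sp.diff(L, v) for v in (x, y, z)]  # gradient with respect to x, y, z
sol = sp.solve(stationarity + [x*y*z - 1], [x, y, z, lam], dict=True)
print(sol)  # [{lam: 2, x: 1, y: 1, z: 1}] -- the symmetric KKT point
```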
s.t.  Σⱼ₌₁ⁿ cₖⱼyⱼ = log(bₖ), k = 1, ..., K,
      yⱼ free ∀j.
Now, in addition, we assume all functions are in C², that is, twice continuously differentiable. Recall the tangent linear subspace at x̄:

T_x̄ = {d : ∇h(x̄)d = 0}.

The Hessian of the Lagrangian function needs to be positive semidefinite on this tangent subspace: dᵀ∇²ₓL(x̄, ȳ)d ≥ 0 for all d ∈ T_x̄.
Proof
The proof reduces to the one-dimensional case by considering the objective function ϕ(t) = f(x(t)) for a feasible curve x(t) on the surface of ALL active constraints, with x(0) = x̄ and ẋ(0) = d ∈ T_x̄. Since 0 is a (local) minimizer of ϕ(t) in an interval [−a, a] for a sufficiently small a > 0, we must have ϕ′(0) = 0, so that

0 ≤ ϕ″(0) = dᵀ∇²f(x̄)d + ∇f(x̄)ẍ(0).

Differentiating ȳᵀh(x(t)) = 0 twice at t = 0 gives

0 = dᵀ[Σᵢ ȳᵢ∇²hᵢ(x̄)]d + ȳᵀ∇h(x̄)ẍ(0).

Subtract the second expression from the first on both sides and use the FONC ∇f(x̄) = ȳᵀ∇h(x̄):

0 ≤ dᵀ∇²f(x̄)d − dᵀ[Σᵢ ȳᵢ∇²hᵢ(x̄)]d + ∇f(x̄)ẍ(0) − ȳᵀ∇h(x̄)ẍ(0)
  = dᵀ∇²f(x̄)d − dᵀ[Σᵢ ȳᵢ∇²hᵢ(x̄)]d.
Theorem 6 Let x̄ be a regular point of (GCO) with equality constraints only, and let ȳ be the Lagrange multipliers such that (x̄, ȳ) satisfies the (first-order) KKT conditions of (GCO). Then, if in addition

dᵀ∇²ₓL(x̄, ȳ)d > 0, ∀d ∈ T_x̄, d ≠ 0,

x̄ is a strict local minimizer of (GCO).
∇²ₓL(x₁, x₂, y) = diag( 2(1 + y/4), 2(1 + y) ).
This would be sufficient for the third KKT solution to be a local minimizer.
dᵀQd ≥ 0, ∀d s.t. Ad = 0,

for a given symmetric matrix Q and a rectangular matrix A. (In this case, the subspace is the null space of matrix A.) This test itself might be a nonconvex optimization problem. However, every d with Ad = 0 can be written as

d = P_A u, where P_A = I − Aᵀ(AAᵀ)⁻¹A,

for some vector u ∈ Rⁿ, where P_A is called the projection matrix of A. Thus, the test becomes whether or not

uᵀP_A Q P_A u ≥ 0, ∀u ∈ Rⁿ,

that is, we just need to test the positive semidefiniteness of P_A Q P_A as usual.
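A minimal sketch of this projected test (assuming A has full row rank, so AAᵀ is invertible):

```python
import numpy as np

def psd_on_nullspace(Q, A, tol=1e-10):
    """Test d^T Q d >= 0 for all d with A d = 0, via P_A = I - A^T (A A^T)^{-1} A."""
    n = Q.shape[0]
    P = np.eye(n) - A.T @ np.linalg.solve(A @ A.T, A)  # projection onto null(A)
    eigvals = np.linalg.eigvalsh(P @ Q @ P)            # symmetric, so use eigvalsh
    return bool(np.all(eigvals >= -tol))

# Q is indefinite on R^2 but positive semidefinite on the null space of A
Q = np.array([[1.0, 0.0], [0.0, -1.0]])
A = np.array([[0.0, 1.0]])         # null space = span{(1, 0)}
print(psd_on_nullspace(Q, A))      # True: d = (t, 0) gives d^T Q d = t^2 >= 0
```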
(SCQP ) min xT Qx + cT x
s.t. ∥x∥2 (≤, =) 1.
Theorem 7 The FONC and SONC, that is, the following conditions on x together with the multiplier y, are necessary and sufficient for finding the global minimizer of (SCQP).