KKT Conditions and Duality: March 23, 2012

Tutorial Example

Want to solve this constrained optimization problem

$$\min_{x \in \mathbb{R}^2} f(x) = \min_{x \in \mathbb{R}^2} 0.4\,(x_1^2 + x_2^2)$$

subject to

$$g(x) = 2 - x_1 - x_2 \le 0$$

Tutorial example - Cost function

[Figure: iso-contours of $f(x) = 0.4\,(x_1^2 + x_2^2)$, with axes $x_1$ and $x_2$.]

Tutorial example - Constraint

[Figure: iso-contours of $f(x)$ with the feasible region $g(x) = 2 - x_1 - x_2 \le 0$ marked, with axes $x_1$ and $x_2$.]

Solve this problem with Lagrange Multipliers
Can solve this constrained optimization with Lagrange multipliers:

L(x, λ) = f (x) + λ g(x)

Solution:
The Lagrangian is

$$L(x, \lambda) = 0.4\,x_1^2 + 0.4\,x_2^2 + \lambda\,(2 - x_1 - x_2)$$

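The KKT conditions say that at an optimum $\lambda^* \ge 0$ and, with the constraint active at the optimum,

$$\begin{aligned}
\frac{\partial L(x^*, \lambda^*)}{\partial x_1} &= 0.8\,x_1^* - \lambda^* = 0 \\
\frac{\partial L(x^*, \lambda^*)}{\partial x_2} &= 0.8\,x_2^* - \lambda^* = 0 \\
\frac{\partial L(x^*, \lambda^*)}{\partial \lambda} &= 2 - x_1^* - x_2^* = 0
\end{aligned}$$
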
Solution ctd:
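Find $(x_1^*, x_2^*, \lambda^*)$ which fulfill these simultaneous equations. The first two equations imply

$$x_1^* = \tfrac{5}{4}\lambda^*, \qquad x_2^* = \tfrac{5}{4}\lambda^*$$

Substituting these into the last equation we get

$$8 - 5\lambda^* - 5\lambda^* = 0 \implies \lambda^* = \tfrac{4}{5} \quad \leftarrow \text{greater than } 0$$

and in turn this means

$$x_1^* = \tfrac{5}{4}\lambda^* = 1, \qquad x_2^* = \tfrac{5}{4}\lambda^* = 1$$
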
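The three KKT equations above are linear in (x1, x2, λ), so they can also be solved mechanically. The following is a minimal sketch in Python, not part of the original slides. The helper name solve_linear and the use of exact fractions are illustrative choices, and the third row assumes the constraint is active.

```python
from fractions import Fraction

def solve_linear(A, b):
    """Solve A x = b by Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(A)
    # Build the augmented matrix [A | b] using exact fractions.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Pick a row at or below the diagonal with a non-zero pivot.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

# Unknowns ordered as (x1, x2, lam).
A = [
    [Fraction(4, 5), 0, -1],  # 0.8 x1 - lam = 0
    [0, Fraction(4, 5), -1],  # 0.8 x2 - lam = 0
    [1, 1, 0],                # x1 + x2 = 2 (constraint assumed active)
]
b = [0, 0, 2]

x1, x2, lam = solve_linear(A, b)
print(x1, x2, lam)
```

The multiplier returned is non-negative, which is the remaining KKT requirement that the linear solve itself does not enforce.
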
Solve this particular problem in another way
Alternate solution:
Construct the Lagrangian dual function

$$q(\lambda) = \min_{x} L(x, \lambda) = \min_{x} \big(f(x) + \lambda\, g(x)\big)$$

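Find the optimal value of x with respect to $L(x, \lambda)$ in terms of the Lagrange multiplier:

$$x_1^* = \tfrac{5}{4}\lambda, \qquad x_2^* = \tfrac{5}{4}\lambda$$

Substitute back into the expression of $L(x, \lambda)$ to get

$$q(\lambda) = \tfrac{5}{4}\lambda^2 + \lambda\left(2 - \tfrac{5}{4}\lambda - \tfrac{5}{4}\lambda\right)$$
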
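Find λ ≥ 0 which maximizes q(λ). Luckily, in this case the unconstrained maximum of q(λ) already satisfies λ ≥ 0, so it is also the constrained optimum:

$$\frac{\partial q(\lambda)}{\partial \lambda} = -\tfrac{5}{2}\lambda + 2 = 0 \implies \lambda^* = \tfrac{4}{5} \implies x_1^* = x_2^* = 1$$
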
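The dual route can also be followed numerically. The sketch below is an illustration, not the slides' own method. It evaluates q(λ) using the inner minimiser x1 = x2 = (5/4)λ derived above and maximises it with a ternary search, which is valid because q is concave. The search interval [0, 10] and the function name maximise_concave are assumptions.

```python
def f(x1, x2):
    return 0.4 * (x1**2 + x2**2)

def g(x1, x2):
    return 2.0 - x1 - x2

def q(lam):
    """Dual function, using the inner minimiser x1 = x2 = 5/4 * lam from the text."""
    x = 1.25 * lam
    return f(x, x) + lam * g(x, x)

def maximise_concave(func, lo, hi, iters=200):
    """Ternary search for the maximiser of a concave function on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if func(m1) < func(m2):
            lo = m1   # the maximiser lies to the right of m1
        else:
            hi = m2   # the maximiser lies to the left of m2
    return 0.5 * (lo + hi)

# The lower bound 0 enforces lam >= 0; the upper bound 10 is an assumption.
lam_star = maximise_concave(q, 0.0, 10.0)
x_star = 1.25 * lam_star
print(lam_star, x_star, q(lam_star))
```

Up to floating-point error, the result should agree with the closed-form solution above.
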
Solve the same problem in another way

The Primal Problem

$$\min_{x \in \mathbb{R}^2} f(x) \quad \text{subject to} \quad g(x) \le 0$$

The Lagrangian Dual Problem

$$\max_{\lambda \in \mathbb{R}} q(\lambda) \quad \text{subject to} \quad \lambda \ge 0$$

where

$$q(\lambda) = \min_{x \in \mathbb{R}^2} \big(f(x) + \lambda\, g(x)\big)$$

is referred to as the Lagrangian dual function.


The general statement

In general we will have multiple inequality and equality constraints.


The statement of the Primal Problem is

$$\min_{x \in X} f(x)$$

subject to

g(x) ≤ 0 and h(x) = 0


While the Lagrangian Dual Problem is

$$\max_{\lambda, \mu} q(\lambda, \mu) \quad \text{subject to} \quad \lambda \ge 0$$

where

$$q(\lambda, \mu) = \min_{x} \left\{ f(x) + \lambda^t g(x) + \mu^t h(x) \right\}$$

is the Lagrangian dual function.


Why ??

This dual approach is not guaranteed to succeed. However,

• It does succeed for a certain class of problems (see the Strong Duality theorem below).
• In these cases it often leads to a simpler optimization problem.
• This is particularly so when the dimension of x is much larger than the number of constraints.
• The expression of x∗ in terms of the Lagrange multipliers may give some insight into the optimal solution, e.g. the optimal separating hyper-plane found by the SVM.
We will now focus on the geometry of the dual solution...


Geometry of the Dual Problem
Map the original problem
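[Figure: on the left, the plane with axes $x_1$ and $x_2$; on the right, the set G in the (y, z) plane, where each x maps to the point (g(x), f(x)).]

• Map each point $x \in \mathbb{R}^2$ to $(g(x), f(x)) \in \mathbb{R}^2$.
• This map defines the set
  $G = \{(y, z) \mid y = g(x),\ z = f(x) \text{ for some } x \in \mathbb{R}^2\}$.
• Note: $L(x, \lambda) = z + \lambda y$ for some z and y.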
$G \subset \mathbb{R}^2$ is the image of $\mathbb{R}^2$ under the (g, f) map. In this space only points with y ≤ 0 correspond to feasible points.


The Primal Problem

[Figure: the set G in the (y, z) plane with the optimal point (y∗, z∗) marked.]

• The primal problem consists in finding a point in G with y ≤ 0 that has minimum ordinate z.
• In the figure this optimal point is (y∗, z∗).

Visualization of the Lagrangian

[Figure: the set G in the (y, z) plane with the point (y∗, z∗) and a line z + λy = α.]

• Given a λ ≥ 0, the Lagrangian is given by
  L(x, λ) = f(x) + λ g(x) = z + λ y
  with (y, z) ∈ G.
• Note z + λy = α is the equation of a straight line with slope −λ that intercepts the z-axis at α.

Visualization of the Lagrangian Dual function

[Figure: the set G in the (y, z) plane with the point (y∗, z∗) and the line z + λy = q(λ), whose intercept on the z-axis is q(λ).]

For a given λ ≥ 0 the Lagrangian dual sub-problem is to find

$$\min_{(y, z) \in G} (z + \lambda y)$$

• Move the line z + λy = α in the direction (−λ, −1) while remaining in contact with G.
• The last intercept on the z-axis obtained this way is the value of q(λ) corresponding to the given λ ≥ 0.

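The geometric construction can be mimicked numerically for the tutorial example by sampling the set G and taking the smallest intercept directly. This is only a sketch: the grid range [-3, 3] and step 0.05 are assumptions, and the name q_from_G is hypothetical.

```python
def f(x1, x2):
    return 0.4 * (x1**2 + x2**2)

def g(x1, x2):
    return 2.0 - x1 - x2

# Sample x on a regular grid over [-3, 3]^2 (range and step are assumptions).
steps = [i * 0.05 - 3.0 for i in range(121)]

# Image of the grid under the (g, f) map: a finite sample of G as points (y, z).
G = [(g(a, b), f(a, b)) for a in steps for b in steps]

def q_from_G(lam):
    """Smallest intercept alpha for which the line z + lam * y = alpha touches G."""
    return min(z + lam * y for (y, z) in G)

# Primal problem: lowest ordinate z among sampled points with y <= 0.
# A small tolerance absorbs floating-point error on the boundary y = 0.
z_star = min(z for (y, z) in G if y <= 1e-9)

for lam in (0.0, 0.4, 0.8, 1.2):
    print(lam, q_from_G(lam), z_star)
```

Every printed intercept should lie at or below the sampled primal optimum, with the largest value at λ = 0.8.
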
Solving the Dual Problem

[Figure: the set G in the (y, z) plane with the point (y∗, z∗), a line z + λy = q(λ) and the optimal line z + λ∗y = q(λ∗).]

Finally we want to find the dual optimum

$$\max_{\lambda \ge 0} q(\lambda)$$

• That is, the line with slope −λ with maximal intercept, q(λ), on the z-axis.
• This line has slope −λ∗ and gives the dual optimal value q(λ∗).
• For this problem the optimal dual objective q(λ∗) equals the optimal primal objective z∗.
• In such cases, there is no duality gap (strong duality).
Properties of the Lagrangian Dual Function
q(λ) is concave

Theorem
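Let $D_q = \{\lambda \mid q(\lambda) > -\infty\}$. Then $q(\lambda)$ is a concave function on $D_q$.

Proof.
For any $x \in X$, $\lambda_1, \lambda_2 \in D_q$ and $\alpha \in (0, 1)$

$$\begin{aligned}
L(x, \alpha\lambda_1 + (1-\alpha)\lambda_2) &= f(x) + (\alpha\lambda_1 + (1-\alpha)\lambda_2)^t g(x) \\
&= \alpha\,(f(x) + \lambda_1^t g(x)) + (1-\alpha)\,(f(x) + \lambda_2^t g(x)) \\
&= \alpha\, L(x, \lambda_1) + (1-\alpha)\, L(x, \lambda_2).
\end{aligned}$$

Take the min on both sides

$$\begin{aligned}
\min_{x \in X} \{L(x, \alpha\lambda_1 + (1-\alpha)\lambda_2)\} &= \min_{x \in X} \{\alpha L(x, \lambda_1) + (1-\alpha) L(x, \lambda_2)\} \\
&\ge \alpha \min_{x \in X} \{L(x, \lambda_1)\} + (1-\alpha) \min_{x \in X} \{L(x, \lambda_2)\}
\end{aligned}$$

Therefore

$$q(\alpha\lambda_1 + (1-\alpha)\lambda_2) \ge \alpha\, q(\lambda_1) + (1-\alpha)\, q(\lambda_2)$$

This implies that q is concave over $D_q$.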


The set of Lagrange Multipliers is convex

Theorem
Let Dq = {λ | q(λ) > −∞}. This constraint ensures valid Lagrange
Multipliers exist. Then Dq is a convex set.

Proof.
Let λ1 , λ2 ∈ Dq . Therefore q(λ1 ) > −∞ and q(λ2 ) > −∞. Let
α ∈ (0, 1), then as q is concave

q(α λ1 + (1 − α) λ2 ) ≥ α q(λ1 ) + (1 − α) q(λ2 ) > −∞

and this implies

α λ1 + (1 − α) λ2 ∈ Dq

Hence Dq is a convex set.


Significance of these results

• The dual function is always concave, irrespective of the primal problem.
• Therefore maximizing the dual function is a convex optimization problem.
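The concavity result can be checked numerically even when the primal objective is non-convex. The sketch below is an illustration only: the functions f and g, the sampled set X = [-2, 2] and the range of λ are all hypothetical choices, not taken from the slides.

```python
def f(x):
    # A non-convex objective, chosen only for illustration.
    return x**4 - 3.0 * x**2 + x

def g(x):
    # Constraint g(x) <= 0, also a hypothetical choice.
    return 0.5 - x

# X = [-2, 2] sampled on a grid.
xs = [i * 0.01 - 2.0 for i in range(401)]

def q(lam):
    """Dual function on the sampled X: a pointwise minimum of affine functions of lam."""
    return min(f(x) + lam * g(x) for x in xs)

# Test values of lam in [0, 10].
lams = [0.25 * k for k in range(41)]

# Midpoint concavity: q((a + b) / 2) >= (q(a) + q(b)) / 2 for every pair (a, b).
violations = [
    (a, b)
    for a in lams
    for b in lams
    if q(0.5 * (a + b)) < 0.5 * (q(a) + q(b)) - 1e-9
]
print(len(violations))
```

Because q is a minimum over functions that are affine in λ, no violations are expected regardless of the shape of f.
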
Weak Duality
Theorem (Weak Duality)


Let x be a feasible solution, x ∈ X , g(x) ≤ 0 and h(x) = 0, to the
primal problem P . Let (λ, µ) be a feasible solution, λ ≥ 0, to the
dual problem D. Then

f (x) ≥ q(λ, µ)
Proof of the Weak Duality Theorem.


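Remember

$$q(\lambda, \mu) = \inf\Big\{ f(x) + \sum_{i=1}^{m} \lambda_i g_i(x) + \sum_{i=1}^{l} \mu_i h_i(x) : x \in X_F \Big\}$$

Then we have

$$\begin{aligned}
q(\lambda, \mu) &= \inf\{ f(\tilde{x}) + \lambda^t g(\tilde{x}) + \mu^t h(\tilde{x}) : \tilde{x} \in X_F \} \\
&\le f(x) + \lambda^t g(x) + \mu^t h(x) \\
&\le f(x)
\end{aligned}$$

where the last inequality holds because λ ≥ 0, g(x) ≤ 0 and h(x) = 0, and the result follows.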
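Weak duality can be spot-checked on the tutorial example by sampling. The sketch below assumes the closed form q(λ) = 2λ − (5/4)λ², obtained by expanding the expression for q(λ) given in the alternate solution. The sampling ranges and the random seed are arbitrary choices.

```python
import random

def f(x1, x2):
    return 0.4 * (x1**2 + x2**2)

def g(x1, x2):
    return 2.0 - x1 - x2

def q(lam):
    # Expanded form of 5/4 lam^2 + lam (2 - 5/4 lam - 5/4 lam).
    return 2.0 * lam - 1.25 * lam**2

random.seed(0)  # fixed seed so that the run is reproducible
gaps = []
for _ in range(1000):
    x1 = random.uniform(-5.0, 5.0)
    x2 = random.uniform(-5.0, 5.0)
    lam = random.uniform(0.0, 5.0)   # dual feasible: lam >= 0
    if g(x1, x2) <= 0:               # keep only primal feasible points
        gaps.append(f(x1, x2) - q(lam))

print(len(gaps), min(gaps))
```

Each stored value is f(x) − q(λ) for a feasible pair, so the theorem predicts that none of them is negative.
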


Corollary
Let

$$f^* = \inf\{f(x) : x \in X,\ g(x) \le 0,\ h(x) = 0\}$$

$$q^* = \sup\{q(\lambda, \mu) : \lambda \ge 0\}$$

then

$$q^* \le f^*$$

• Thus the optimal value of the primal problem ≥ optimal value of the dual problem.
• If optimal value of the primal problem > optimal value of the dual problem, then there exists a duality gap.
Example with a Duality Gap
Example with a non-convex objective function
[Figure: a non-convex 1D objective f(x), with the feasible region defined by g(x) ≤ 0 marked.]

• Consider the constrained optimization of this 1D non-convex objective function.
• Let's visualize $G = \{(y, z) \mid \exists x \in \mathbb{R} \text{ s.t. } y = g(x),\ z = f(x)\}$ and its dual solution...
Dual Solution ≤ Primal Solution: Have a Duality Gap
[Figure: the set G with axis z, showing the optimal primal objective, the optimal dual objective and the duality gap between them.]

• Above is the geometric interpretation of the primal and dual problems.
• Note there exists a duality gap due to the nonconvexity of the set G.
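A duality gap can be reproduced with a small concrete problem. The slides do not specify the function they plot, so the example below is a hypothetical stand-in: f(x) = −x² on X = [−1, 2] with g(x) = x − 1 ≤ 0. The grids over x and λ are assumptions as well.

```python
def f(x):
    # Non-convex (concave) objective; a hypothetical example.
    return -x**2

def g(x):
    # Feasible region: x <= 1.
    return x - 1.0

# X = [-1, 2] sampled on a grid that contains -1, 1 and 2 exactly.
xs = [-1.0 + 3.0 * i / 300 for i in range(301)]

# Candidate multipliers lam in [0, 5].
lams = [i / 100 for i in range(501)]

# Primal optimum: minimise f over the feasible part of X.
f_star = min(f(x) for x in xs if g(x) <= 0)

# Dual optimum: maximise over lam the minimum over x of the Lagrangian.
q_star = max(min(f(x) + lam * g(x) for x in xs) for lam in lams)

gap = f_star - q_star
print(f_star, q_star, gap)
```

For this choice the Lagrangian is concave in x, so its minimum over X sits at an endpoint, and the best lower bound the dual can give stays strictly below the primal optimum.
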
Strong Duality
When does Dual Solution = Primal Solution?

The Strong Duality Theorem states that if some suitable convexity conditions are satisfied, then there is no duality gap between the primal and dual optimisation problems.
Theorem (Strong Duality)


Let
• X be a non-empty convex set in $\mathbb{R}^n$,
• $f : X \to \mathbb{R}$ and each $g_i : \mathbb{R}^n \to \mathbb{R}$ $(i = 1, \dots, m)$ be convex,
• each $h_i : \mathbb{R}^n \to \mathbb{R}$ $(i = 1, \dots, l)$ be affine.
If
• there exists $\hat{x} \in X$ such that $g(\hat{x}) < 0$ and
• $0 \in \operatorname{int}(h(X))$ where $h(X) = \{h(x) : x \in X\}$,
then

$$\inf\{f(x) : x \in X,\ g(x) \le 0,\ h(x) = 0\} = \sup\{q(\lambda, \mu) : \lambda \ge 0\}$$

where $q(\lambda, \mu) = \inf\{f(x) + \lambda^t g(x) + \mu^t h(x) : x \in X\}$.


Theorem (Strong Duality ctd)


Furthermore, if

$$\inf\{f(x) : x \in X,\ g(x) \le 0,\ h(x) = 0\} > -\infty$$

then

$$\sup\{q(\lambda, \mu) : \lambda \ge 0\}$$

is achieved at $(\lambda^*, \mu^*)$ with $\lambda^* \ge 0$. If the inf is achieved at $x^*$ then

$$(\lambda^*)^t g(x^*) = 0$$
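For the tutorial example the hypotheses and conclusions of the theorem can be verified with exact arithmetic. This is a sketch under stated assumptions: the strictly feasible point x̂ = (2, 2) is a choice made here, and x∗ = (1, 1), λ∗ = 4/5 are the values computed earlier in the slides.

```python
from fractions import Fraction

def f(x1, x2):
    return Fraction(2, 5) * (x1**2 + x2**2)

def g(x1, x2):
    return 2 - x1 - x2

def q(lam):
    """Dual function, using the inner minimiser x1 = x2 = 5/4 * lam from the text."""
    x = Fraction(5, 4) * lam
    return f(x, x) + lam * g(x, x)

x_hat = (Fraction(2), Fraction(2))    # assumed strictly feasible point
x_star = (Fraction(1), Fraction(1))   # primal optimum from the tutorial example
lam_star = Fraction(4, 5)             # dual optimum from the tutorial example

slater_ok = g(*x_hat) < 0                   # there is a point with g < 0
duality_gap = f(*x_star) - q(lam_star)      # should vanish under strong duality
slackness = lam_star * g(*x_star)           # complementary slackness product

print(slater_ok, duality_gap, slackness)
```

Since f is convex and g is affine, the theorem applies, and both the gap and the slackness product come out as exactly zero.
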
