
GENERAL ANALYSIS OF MAXIMA/MINIMA IN CONSTRAINED OPTIMIZATION PROBLEMS

1. STATEMENT OF THE PROBLEM


Consider the problem defined by
\[
\max_{x} f(x) \quad \text{subject to} \quad g(x) = 0
\]
where g(x) = 0 denotes an m × 1 vector of constraints, m < n. We can also write this as
\[
\max_{x_1, x_2, \ldots, x_n} f(x_1, x_2, \ldots, x_n)
\]
subject to
\[
\begin{aligned}
g_1(x_1, x_2, \ldots, x_n) &= 0 \\
g_2(x_1, x_2, \ldots, x_n) &= 0 \\
&\;\;\vdots \\
g_m(x_1, x_2, \ldots, x_n) &= 0
\end{aligned} \tag{1}
\]
The solution can be obtained using the Lagrangian function
\[
L(x; \lambda) = f(x) - \lambda' g(x)
= f(x_1, x_2, \ldots, x_n) - \lambda_1 g_1(x) - \lambda_2 g_2(x) - \cdots - \lambda_m g_m(x) \tag{2}
\]
where λ′ = (λ1, λ2, . . . , λm). Notice that the gradient of L will involve a set of derivatives, i.e.
\[
\nabla_x L = \nabla_x f(x) - \left[\frac{\partial g}{\partial x}\right]\lambda
\]
where
\[
\frac{\partial g}{\partial x} = J_g =
\begin{bmatrix}
\frac{\partial g_1(x^*)}{\partial x_1} & \frac{\partial g_2(x^*)}{\partial x_1} & \ldots & \frac{\partial g_m(x^*)}{\partial x_1} \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_2} & \frac{\partial g_2(x^*)}{\partial x_2} & \ldots & \frac{\partial g_m(x^*)}{\partial x_2} \\[1ex]
\vdots & \vdots & \ddots & \vdots \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_n} & \frac{\partial g_2(x^*)}{\partial x_n} & \ldots & \frac{\partial g_m(x^*)}{\partial x_n}
\end{bmatrix} \tag{3}
\]
Date: October 7, 2004.

There will be one equation for each x. There will also be equations involving the derivatives of L with respect to each λ.

2. NECESSARY CONDITIONS FOR AN EXTREME POINT


The necessary conditions for an extremum of f with the equality constraints g(x) = 0
are that

∇L(x∗, λ∗) = 0 (4)

where it is implicit that the gradient in (4) is with respect to both x and λ.
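To make the necessary conditions concrete, here is a minimal computational sketch (not part of the original derivation; it assumes the sympy library and borrows problem (i) from section 3.7 as the example): it forms the Lagrangian (2) and solves the system (4).

```python
# A minimal sketch, assuming sympy is available: form the Lagrangian for
# opt x1*x2 subject to x1 + x2 = 6 (problem (i) of section 3.7) and solve
# the necessary conditions by setting every first partial of L to zero.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
f = x1 * x2                  # objective
g = x1 + x2 - 6              # equality constraint, written as g(x) = 0
L = f - lam * g              # L(x; lambda) = f(x) - lambda' g(x), as in (2)

foc = [sp.diff(L, v) for v in (x1, x2, lam)]   # gradient in x and lambda
print(sp.solve(foc, (x1, x2, lam), dict=True)) # [{lam: 3, x1: 3, x2: 3}]
```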

3. SUFFICIENT CONDITIONS FOR AN EXTREME POINT


3.1. Statement of Conditions. Let f, g1, . . . , gm be twice continuously differentiable real-valued functions on Rn. If there exist vectors x∗ ∈ Rn, λ∗ ∈ Rm such that

∇L(x∗, λ∗) = 0     (5)

and for every non-zero vector z ∈ Rn satisfying

z′∇gi(x∗) = 0,  i = 1, . . . , m     (6)

it follows that

z′∇²ₓL(x∗, λ∗)z > 0,     (7)

then f has a strict local minimum at x∗, subject to gi(x) = 0, i = 1, . . . , m. If the inequality in (7) is reversed, then f has a strict local maximum at x∗. The idea is that if equation 5 holds and equation 7 holds for all vectors satisfying equation 6, then f will have a strict local minimum at x∗.

3.2. Checking the Sufficient Conditions. These conditions for a maximum or minimum can be stated in terms of the Hessian of the Lagrangian function (or bordered Hessian). Let f, g1, . . . , gm be twice continuously differentiable real-valued functions. If there exist vectors x∗ ∈ Rn, λ∗ ∈ Rm, such that

∇L(x∗, λ∗) = 0     (8)

and if
\[
(-1)^m \det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_p} & \frac{\partial g_1(x^*)}{\partial x_1} & \cdots & \frac{\partial g_m(x^*)}{\partial x_1} \\[1ex]
\vdots & \ddots & \vdots & \vdots & & \vdots \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_p \partial x_1} & \cdots & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_p \partial x_p} & \frac{\partial g_1(x^*)}{\partial x_p} & \cdots & \frac{\partial g_m(x^*)}{\partial x_p} \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_1} & \cdots & \frac{\partial g_1(x^*)}{\partial x_p} & 0 & \cdots & 0 \\[1ex]
\vdots & & \vdots & \vdots & \ddots & \vdots \\[1ex]
\frac{\partial g_m(x^*)}{\partial x_1} & \cdots & \frac{\partial g_m(x^*)}{\partial x_p} & 0 & \cdots & 0
\end{bmatrix} > 0 \tag{9}
\]
for p = m + 1, . . . , n, then f has a strict local minimum at x∗, such that

gi(x∗) = 0,  i = 1, . . . , m.     (10)
We check the determinants in (9) starting with the one that has m + 1 elements in each
row and column of the Hessian and m+1 elements in each row or column of the derivative
of a given constraint with respect to x. Note that m does not change as we check the various
determinants so that they will all be of the same sign for a given m.
If there exist vectors x∗ ∈ Rn, λ∗ ∈ Rm, such that

∇L(x∗, λ∗) = 0     (11)

and if
\[
(-1)^p \det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \cdots & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_p} & \frac{\partial g_1(x^*)}{\partial x_1} & \cdots & \frac{\partial g_m(x^*)}{\partial x_1} \\[1ex]
\vdots & \ddots & \vdots & \vdots & & \vdots \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_p \partial x_1} & \cdots & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_p \partial x_p} & \frac{\partial g_1(x^*)}{\partial x_p} & \cdots & \frac{\partial g_m(x^*)}{\partial x_p} \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_1} & \cdots & \frac{\partial g_1(x^*)}{\partial x_p} & 0 & \cdots & 0 \\[1ex]
\vdots & & \vdots & \vdots & \ddots & \vdots \\[1ex]
\frac{\partial g_m(x^*)}{\partial x_1} & \cdots & \frac{\partial g_m(x^*)}{\partial x_p} & 0 & \cdots & 0
\end{bmatrix} > 0 \tag{12}
\]
for p = m + 1, . . . , n, then f has a strict local maximum at x∗, such that

gi(x∗) = 0,  i = 1, . . . , m.     (13)

We check the determinants in (12) starting with the one that has m + 1 elements in
each row and column of the Hessian and m + 1 elements in each row or column of the
derivative of a given constraint with respect to x. Note that p changes as we check the
various determinants so that they will alternate in sign for a given m.
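The determinant checks in (9) and (12) can be mechanized. The helper below is a sketch (the function names are ours, not from the text, and it assumes numpy): it builds each bordered minor for p = m + 1, . . . , n and tests the two sign patterns.

```python
# A sketch, assuming numpy: build the bordered minors of (9)/(12) and test
# the sign patterns. H is the n x n Hessian of L in x at (x*, lam*); Jg is
# the n x m Jacobian of the constraints at x*, one column per constraint as in (3).
import numpy as np

def bordered_minors(H, Jg):
    n, m = Jg.shape
    minors = []
    for p in range(m + 1, n + 1):
        top = np.hstack([H[:p, :p], Jg[:p, :]])              # p x (p + m)
        bottom = np.hstack([Jg[:p, :].T, np.zeros((m, m))])  # m x (p + m)
        minors.append((p, np.linalg.det(np.vstack([top, bottom]))))
    return minors

def classify(H, Jg):
    m = Jg.shape[1]
    minors = bordered_minors(H, Jg)
    if all((-1) ** m * d > 0 for _, d in minors):
        return 'strict local minimum'     # condition (9)
    if all((-1) ** p * d > 0 for p, d in minors):
        return 'strict local maximum'     # condition (12)
    return 'inconclusive'
```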
Consider the case where n = 2 and m = 1. Note that the first matrix we check has p = m + 1 = 2. Then the condition for a minimum is
\[
(-1)\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} > 0 \tag{14}
\]
This, of course, implies
\[
\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} < 0 \tag{15}
\]

The condition for a maximum is
\[
(-1)^2\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} > 0 \tag{16}
\]
This, of course, implies
\[
\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} > 0 \tag{17}
\]

Also consider the case where n = 3 and m = 1. We start with p = m + 1 = 2 and continue until p = n. Then the condition for a minimum is
\[
(-1)\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} > 0
\quad\text{and}\quad
(-1)\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_3} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_3} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_3 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_3 \partial x_2} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_3 \partial x_3} & \frac{\partial g(x^*)}{\partial x_3} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & \frac{\partial g(x^*)}{\partial x_3} & 0
\end{bmatrix} > 0 \tag{18}
\]

The condition for a maximum is
\[
(-1)^2\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} > 0
\quad\text{and}\quad
(-1)^3\det
\begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_3} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_3} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_3 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_3 \partial x_2} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_3 \partial x_3} & \frac{\partial g(x^*)}{\partial x_3} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & \frac{\partial g(x^*)}{\partial x_3} & 0
\end{bmatrix} > 0 \tag{19}
\]
3.3. Sufficient Condition for a Maximum and Minimum and Positive and Negative Definite Quadratic Forms. Note that at the optimum, equation 6 is just linear in the sense that the derivatives ∂gi(x∗)/∂xj are fixed numbers at the point x∗, and we can write equation 6 as
\[
z' J_g = 0
\]
\[
(z_1\; z_2\; \ldots\; z_n)
\begin{bmatrix}
\frac{\partial g_1(x^*)}{\partial x_1} & \frac{\partial g_2(x^*)}{\partial x_1} & \ldots & \frac{\partial g_m(x^*)}{\partial x_1} \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_2} & \frac{\partial g_2(x^*)}{\partial x_2} & \ldots & \frac{\partial g_m(x^*)}{\partial x_2} \\[1ex]
\vdots & \vdots & \ddots & \vdots \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_n} & \frac{\partial g_2(x^*)}{\partial x_n} & \ldots & \frac{\partial g_m(x^*)}{\partial x_n}
\end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \tag{20}
\]
where Jg is the matrix [∂gi(x∗)/∂xj], with a column of Jg for each constraint and a row for each x variable we are considering. This then implies that the sufficient condition for a strict local maximum of the function f is that each bordered minor |HB| has the same sign as (−1)^p, that is, the last n − m leading principal minors of HB alternate in sign on the constraint set denoted by equation 6. This is the same as the condition that the quadratic form z′∇²ₓL(x∗, λ∗)z be negative definite on the constraint set

z′∇gi(x∗) = 0,  i = 1, . . . , m     (21)

If |HB| and these last n − m leading principal minors all have the same sign as (−1)^m, then z′∇²ₓL(x∗, λ∗)z is positive definite on the constraint set z′∇gi(x∗) = 0, i = 1, . . . , m, and the function has a strict local minimum at the point x∗.

If both of these conditions are violated by non-zero leading principal minors, then the quadratic form is indefinite on the constraint set and we cannot determine whether the function has a maximum or a minimum.
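Equivalently, one can check the sign of z′∇²ₓL z directly on the constraint set by restricting the Hessian to the null space of the constraint gradients. A numerical sketch (assuming scipy; the function name is ours):

```python
# A sketch, assuming scipy: test definiteness of z' H z on the constraint
# set (21). Jg is n x m as in the text (a column per constraint), so the
# constraint set {z : z' Jg = 0} is the null space of Jg'.
import numpy as np
from scipy.linalg import null_space

def definiteness_on_constraints(H, Jg):
    Z = null_space(Jg.T)                    # columns span {z : Jg' z = 0}
    eig = np.linalg.eigvalsh(Z.T @ H @ Z)   # restricted quadratic form
    if np.all(eig > 0):
        return 'positive definite: strict local minimum'
    if np.all(eig < 0):
        return 'negative definite: strict local maximum'
    return 'indefinite or semidefinite: no conclusion'
```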

3.4. Example 1: Minimizing Cost Subject to an Output Constraint. Consider a production function given by

y = 20x1 − x1² + 15x2 − x2²     (22)

Let the prices of x1 and x2 be 10 and 5 respectively with an output constraint of 55. Then to minimize the cost of producing 55 units of output given these prices we set up the following Lagrangian

L = 10x1 + 5x2 − λ(20x1 − x1² + 15x2 − x2² − 55)

∂L/∂x1 = 10 − λ(20 − 2x1) = 0
∂L/∂x2 = 5 − λ(15 − 2x2) = 0     (23)
∂L/∂λ = (−1)(20x1 − x1² + 15x2 − x2² − 55) = 0
If we take the ratio of the first two first order conditions we obtain

10/5 = 2 = (20 − 2x1)/(15 − 2x2)
⇒ 30 − 4x2 = 20 − 2x1
⇒ 10 − 4x2 = −2x1     (24)
⇒ x1 = 2x2 − 5
Now plug this into the negative of the last first order condition to obtain

20(2x2 − 5) − (2x2 − 5)² + 15x2 − x2² − 55 = 0     (25)


Multiplying out and solving for x2 will give

40x2 − 100 − (4x2² − 20x2 + 25) + 15x2 − x2² − 55 = 0
⇒ 40x2 − 100 − 4x2² + 20x2 − 25 + 15x2 − x2² − 55 = 0
⇒ −5x2² + 75x2 − 180 = 0     (26)
⇒ 5x2² − 75x2 + 180 = 0
⇒ x2² − 15x2 + 36 = 0

Now solve this quadratic equation for x2 as follows

x2 = (15 ± √(225 − 4(36)))/2
   = (15 ± √81)/2     (27)
   = 12 or 3

Therefore,

x1 = 2x2 − 5
   = 19 or 1     (28)

The Lagrangian multiplier λ can be obtained by solving the first equation that was obtained by differentiating L with respect to x1

10 − λ(20 − 2(19)) = 0
⇒ λ = −5/9
     (29)
10 − λ(20 − 2(1)) = 0
⇒ λ = 5/9
To check for a maximum or minimum we set up the bordered Hessian as in equations 14–17. The bordered Hessian in this case is
\[
H_B = \begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} \tag{30}
\]
We only need to compute one determinant. We compute the various elements of the bordered Hessian as follows

L = 10x1 + 5x2 − λ(20x1 − x1² + 15x2 − x2² − 55)

∂L/∂x1 = 10 − λ(20 − 2x1)
∂L/∂x2 = 5 − λ(15 − 2x2)
∂²L/∂x1∂x1 = 2λ
∂²L/∂x1∂x2 = 0     (31)
∂²L/∂x2∂x2 = 2λ
∂g/∂x1 = 20 − 2x1
∂g/∂x2 = 15 − 2x2

Consider first the point (19, 12, −5/9). The bordered Hessian is given by
\[
H_B = \begin{bmatrix}
2\lambda & 0 & 20 - 2x_1 \\
0 & 2\lambda & 15 - 2x_2 \\
20 - 2x_1 & 15 - 2x_2 & 0
\end{bmatrix},
\qquad x_1 = 19,\; x_2 = 12,\; \lambda = -\tfrac{5}{9}
\]
\[
H_B = \begin{bmatrix}
-\tfrac{10}{9} & 0 & -18 \\
0 & -\tfrac{10}{9} & -9 \\
-18 & -9 & 0
\end{bmatrix} \tag{32}
\]
The determinant of the bordered Hessian is
\[
\begin{aligned}
|H_B| &= (-1)^2\left(-\frac{10}{9}\right)
\begin{vmatrix} -\frac{10}{9} & -9 \\ -9 & 0 \end{vmatrix}
+ (-1)^3 (0)
\begin{vmatrix} 0 & -9 \\ -18 & 0 \end{vmatrix}
+ (-1)^4 (-18)
\begin{vmatrix} 0 & -\frac{10}{9} \\ -18 & -9 \end{vmatrix} \\
&= \left(-\frac{10}{9}\right)(-81) + 0 + (-18)(-20) \\
&= 90 + 360 = 450
\end{aligned} \tag{33}
\]
Here p = 2 so the condition for a maximum is that (−1)²|HB| > 0, so this point is a relative maximum.

Now consider the other point, (1, 3, 5/9). The bordered Hessian is given by
\[
H_B = \begin{bmatrix}
2\lambda & 0 & 20 - 2x_1 \\
0 & 2\lambda & 15 - 2x_2 \\
20 - 2x_1 & 15 - 2x_2 & 0
\end{bmatrix},
\qquad x_1 = 1,\; x_2 = 3,\; \lambda = \tfrac{5}{9}
\]
\[
H_B = \begin{bmatrix}
\tfrac{10}{9} & 0 & 18 \\
0 & \tfrac{10}{9} & 9 \\
18 & 9 & 0
\end{bmatrix} \tag{34}
\]
The determinant of the bordered Hessian is
\[
\begin{aligned}
|H_B| &= (-1)^2\left(\frac{10}{9}\right)
\begin{vmatrix} \frac{10}{9} & 9 \\ 9 & 0 \end{vmatrix}
+ (-1)^3 (0)
\begin{vmatrix} 0 & 9 \\ 18 & 0 \end{vmatrix}
+ (-1)^4 (18)
\begin{vmatrix} 0 & \frac{10}{9} \\ 18 & 9 \end{vmatrix} \\
&= \left(\frac{10}{9}\right)(-81) + 0 + (18)(-20) \\
&= -90 - 360 = -450
\end{aligned} \tag{35}
\]
The condition for a minimum is that (−1)¹|HB| > 0, so this point is a relative minimum. The minimum cost is obtained by substituting into the cost expression to obtain

C = 10(1) + 5(3) = 25     (36)
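The arithmetic of this example is easy to verify by machine. A sketch (assuming sympy) that reproduces both critical points and both determinants:

```python
# A check of Example 1, assuming sympy: solve the first order conditions (23)
# and evaluate the bordered Hessian determinant at each critical point.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
g = 20*x1 - x1**2 + 15*x2 - x2**2 - 55        # output constraint, g = 0
L = 10*x1 + 5*x2 - lam*g                      # cost minus lam times constraint

sols = sp.solve([sp.diff(L, v) for v in (x1, x2, lam)], (x1, x2, lam), dict=True)
HB = sp.Matrix([[sp.diff(L, x1, x1), sp.diff(L, x1, x2), sp.diff(g, x1)],
                [sp.diff(L, x2, x1), sp.diff(L, x2, x2), sp.diff(g, x2)],
                [sp.diff(g, x1),     sp.diff(g, x2),     0]])
for s in sols:
    print(s, HB.subs(s).det())   # (19, 12, -5/9) gives 450; (1, 3, 5/9) gives -450
```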

3.5. Example 2: Maximizing Output Subject to a Cost Constraint. Consider a production function given by

y = 30x1 + 12x2 − x1² + x1x2 − x2²     (37)

Let the prices of x1 and x2 be 10 and 4 respectively with a cost constraint of $260. Then to maximize output with a cost of $260 given these prices we set up the following Lagrangian

L = 30x1 + 12x2 − x1² + x1x2 − x2² − λ(10x1 + 4x2 − 260)

∂L/∂x1 = 30 − 2x1 + x2 − 10λ = 0
∂L/∂x2 = 12 + x1 − 2x2 − 4λ = 0     (38)
∂L/∂λ = −10x1 − 4x2 + 260 = 0

If we take the ratio of the first two first order conditions we obtain

10/4 = 2.5 = (30 − 2x1 + x2)/(12 + x1 − 2x2)
⇒ 30 + 2.5x1 − 5x2 = 30 − 2x1 + x2
⇒ 4.5x1 = 6x2     (39)
⇒ x1 = (4/3)x2

Now plug this value for x1 into the negative of the last first order condition to obtain

10x1 + 4x2 − 260 = 0
⇒ (10)(4/3)x2 + 4x2 − 260 = 0
⇒ (40/3)x2 + 4x2 = 260
⇒ (52/3)x2 = 260     (40)
⇒ x2 = 15
⇒ x1 = (4/3)(15) = 20

We can also find the maximum y by substituting in for x1 and x2.

y = 30x1 + 12x2 − x1² + x1x2 − x2²
  = (30)(20) + (12)(15) − (20)² + (20)(15) − (15)²
  = 600 + 180 − 400 + 300 − 225     (41)
  = 455

The Lagrangian multiplier λ can be obtained by solving the first equation that was obtained by differentiating L with respect to x1

30 − 2x1 + x2 − 10λ = 0
⇒ 30 − 2(20) + (15) − 10λ = 0
⇒ 30 − 40 + 15 − 10λ = 0     (42)
⇒ 5 = 10λ
⇒ λ = 1/2

To check for a maximum or minimum we set up the bordered Hessian as in equations 14–17 where p = 2 and m = 1. The bordered Hessian in this case is
\[
H_B = \begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} \tag{43}
\]

We compute the various elements of the bordered Hessian as follows

L = 30x1 + 12x2 − x1² + x1x2 − x2² − λ(10x1 + 4x2 − 260)

∂L/∂x1 = 30 − 2x1 + x2 − 10λ
∂L/∂x2 = 12 + x1 − 2x2 − 4λ
∂²L/∂x1∂x1 = −2
∂²L/∂x1∂x2 = 1     (44)
∂²L/∂x2∂x2 = −2
∂g/∂x1 = 10
∂g/∂x2 = 4
The derivatives are all constants. The bordered Hessian is given by
\[
H_B = \begin{bmatrix} -2 & 1 & 10 \\ 1 & -2 & 4 \\ 10 & 4 & 0 \end{bmatrix} \tag{45}
\]
The determinant of the bordered Hessian is
\[
\begin{aligned}
|H_B| &= (-1)^2(-2)\begin{vmatrix} -2 & 4 \\ 4 & 0 \end{vmatrix}
+ (-1)^3(1)\begin{vmatrix} 1 & 4 \\ 10 & 0 \end{vmatrix}
+ (-1)^4(10)\begin{vmatrix} 1 & -2 \\ 10 & 4 \end{vmatrix} \\
&= (-2)(-16) - (-40) + (10)(24) \\
&= 32 + 40 + 240 = 312
\end{aligned} \tag{46}
\]
The condition for a maximum is that (−1)²|HB| > 0, so this point is a relative maximum.
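Since all the entries in (45) are constants, a one-line numeric check suffices. A sketch (assuming numpy):

```python
# A quick numeric check of (45)-(46), assuming numpy.
import numpy as np

HB = np.array([[-2., 1., 10.],
               [ 1., -2., 4.],
               [10., 4., 0.]])
det = np.linalg.det(HB)
print(round(det))             # 312
print((-1)**2 * det > 0)      # True: the maximum condition (-1)^p |HB| > 0 holds
```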
3.6. Example 3: Maximizing Utility Subject to an Income Constraint. Consider a utility function given by

u = x1^{α1} x2^{α2}

Now maximize this function subject to the constraint that

w1x1 + w2x2 = c0

Set up the Lagrangian problem:

L = x1^{α1} x2^{α2} − λ[w1x1 + w2x2 − c0]

The first order conditions are

∂L/∂x1 = α1 x1^{α1−1} x2^{α2} − λw1 = 0
∂L/∂x2 = α2 x1^{α1} x2^{α2−1} − λw2 = 0
∂L/∂λ = −w1x1 − w2x2 + c0 = 0

Taking the ratio of the 1st and 2nd equations we obtain

w1/w2 = (α1x2)/(α2x1)

We can now solve the equation for the 2nd quantity as a function of the 1st input quantity and the prices. Doing so we obtain

x2 = (α2x1w1)/(α1w2)
α1 w2

Now substituting in the income equation we obtain

w1x1 + w2x2 = c0
⇒ w1x1 + w2(α2x1w1/(α1w2)) = c0
⇒ w1x1 + (α2w1/α1)x1 = c0
⇒ x1(w1 + α2w1/α1) = c0
⇒ x1w1(1 + α2/α1) = c0
⇒ x1w1((α1 + α2)/α1) = c0
⇒ x1 = (c0/w1)(α1/(α1 + α2))

We can now get x2 by substitution

x2 = x1(α2w1/(α1w2))
   = (c0/w1)(α1/(α1 + α2))(α2w1/(α1w2))
   = (c0/w2)(α2/(α1 + α2))

We can find the value of the optimal u by substitution
\[
\begin{aligned}
u &= x_1^{\alpha_1} x_2^{\alpha_2} \\
  &= \left[\frac{c_0}{w_1}\,\frac{\alpha_1}{\alpha_1+\alpha_2}\right]^{\alpha_1}
     \left[\frac{c_0}{w_2}\,\frac{\alpha_2}{\alpha_1+\alpha_2}\right]^{\alpha_2} \\
  &= c_0^{\alpha_1+\alpha_2}\, w_1^{-\alpha_1} w_2^{-\alpha_2}\,
     \alpha_1^{\alpha_1} \alpha_2^{\alpha_2}\, (\alpha_1+\alpha_2)^{-\alpha_1-\alpha_2}
\end{aligned}
\]
This can also be written
\[
u = \left[\frac{\alpha_1}{\alpha_1+\alpha_2}\right]^{\alpha_1}
    \left[\frac{\alpha_2}{\alpha_1+\alpha_2}\right]^{\alpha_2}
    \left[\frac{c_0}{w_1}\right]^{\alpha_1}
    \left[\frac{c_0}{w_2}\right]^{\alpha_2}
\]

For future reference note that the derivative of the optimal u with respect to c0 is given by
\[
\begin{aligned}
u &= c_0^{\alpha_1+\alpha_2}\, w_1^{-\alpha_1} w_2^{-\alpha_2}\,
     \alpha_1^{\alpha_1} \alpha_2^{\alpha_2}\, (\alpha_1+\alpha_2)^{-\alpha_1-\alpha_2} \\
\frac{\partial u}{\partial c_0}
  &= (\alpha_1+\alpha_2)\, c_0^{\alpha_1+\alpha_2-1}\, w_1^{-\alpha_1} w_2^{-\alpha_2}\,
     \alpha_1^{\alpha_1} \alpha_2^{\alpha_2}\, (\alpha_1+\alpha_2)^{-\alpha_1-\alpha_2} \\
  &= c_0^{\alpha_1+\alpha_2-1}\, w_1^{-\alpha_1} w_2^{-\alpha_2}\,
     \alpha_1^{\alpha_1} \alpha_2^{\alpha_2}\, (\alpha_1+\alpha_2)^{1-\alpha_1-\alpha_2}
\end{aligned}
\]

We obtain λ by substituting in either the first or second equation as follows
\[
\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2} - \lambda w_1 = 0
\;\Rightarrow\;
\lambda = \frac{\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2}}{w_1},
\qquad
\alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1} - \lambda w_2 = 0
\;\Rightarrow\;
\lambda = \frac{\alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1}}{w_2}
\]
If we now substitute for x1 and x2, we obtain
\[
\begin{aligned}
\lambda &= \frac{\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2}}{w_1},
\qquad
x_1 = \frac{c_0}{w_1}\,\frac{\alpha_1}{\alpha_1+\alpha_2},
\qquad
x_2 = \frac{c_0}{w_2}\,\frac{\alpha_2}{\alpha_1+\alpha_2} \\
\Rightarrow\;
\lambda &= \frac{\alpha_1 \left[\frac{c_0}{w_1}\,\frac{\alpha_1}{\alpha_1+\alpha_2}\right]^{\alpha_1-1}
                 \left[\frac{c_0}{w_2}\,\frac{\alpha_2}{\alpha_1+\alpha_2}\right]^{\alpha_2}}{w_1} \\
  &= c_0^{\alpha_1+\alpha_2-1}\, w_1^{-\alpha_1} w_2^{-\alpha_2}\,
     \alpha_1^{\alpha_1} \alpha_2^{\alpha_2}\, (\alpha_1+\alpha_2)^{1-\alpha_1-\alpha_2}
\end{aligned}
\]
Thus λ is equal to the derivative of the optimal u with respect to c0.
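This envelope property is easy to confirm symbolically. A sketch (assuming sympy):

```python
# A symbolic check, assuming sympy: the multiplier lambda equals the
# derivative of the optimal utility with respect to income c0.
import sympy as sp

a1, a2, w1, w2, c0 = sp.symbols('alpha1 alpha2 w1 w2 c0', positive=True)
x1 = (c0 / w1) * a1 / (a1 + a2)        # demand for x1 derived above
x2 = (c0 / w2) * a2 / (a1 + a2)        # demand for x2 derived above
u_opt = x1**a1 * x2**a2                # optimal utility
lam = a1 * x1**(a1 - 1) * x2**a2 / w1  # lambda from the first order condition

print(sp.simplify(sp.diff(u_opt, c0) - lam))   # 0
```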

To check for a maximum or minimum we set up the bordered Hessian as in equations 14–17 where p = 2 and m = 1. The bordered Hessian in this case is
\[
H_B = \begin{bmatrix}
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_1 \partial x_2} & \frac{\partial g(x^*)}{\partial x_1} \\[1ex]
\frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_1} & \frac{\partial^2 L(x^*,\lambda^*)}{\partial x_2 \partial x_2} & \frac{\partial g(x^*)}{\partial x_2} \\[1ex]
\frac{\partial g(x^*)}{\partial x_1} & \frac{\partial g(x^*)}{\partial x_2} & 0
\end{bmatrix} \tag{47}
\]
We compute the various elements of the bordered Hessian as follows
\[
\begin{aligned}
L &= x_1^{\alpha_1} x_2^{\alpha_2} - \lambda[w_1 x_1 + w_2 x_2 - c_0] \\
\frac{\partial L}{\partial x_1} &= \alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2} - \lambda w_1 \\
\frac{\partial L}{\partial x_2} &= \alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1} - \lambda w_2 \\
\frac{\partial^2 L}{\partial x_1^2} &= \alpha_1(\alpha_1-1)\, x_1^{\alpha_1-2} x_2^{\alpha_2} \\
\frac{\partial^2 L}{\partial x_1 \partial x_2} &= \alpha_1 \alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} \\
\frac{\partial^2 L}{\partial x_2^2} &= \alpha_2(\alpha_2-1)\, x_1^{\alpha_1} x_2^{\alpha_2-2} \\
\frac{\partial g}{\partial x_1} &= w_1 \\
\frac{\partial g}{\partial x_2} &= w_2
\end{aligned}
\]
The derivatives of the constraint are constants. The bordered Hessian is given by
\[
H_B = \begin{bmatrix}
\alpha_1(\alpha_1-1)\, x_1^{\alpha_1-2} x_2^{\alpha_2} & \alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} & w_1 \\[1ex]
\alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} & \alpha_2(\alpha_2-1)\, x_1^{\alpha_1} x_2^{\alpha_2-2} & w_2 \\[1ex]
w_1 & w_2 & 0
\end{bmatrix} \tag{48}
\]
To find the determinant of the bordered Hessian, expand along the third row as follows
\[
\begin{aligned}
|H_B| &= (-1)^4 w_1
\begin{vmatrix}
\alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} & w_1 \\
\alpha_2(\alpha_2-1)\, x_1^{\alpha_1} x_2^{\alpha_2-2} & w_2
\end{vmatrix}
+ (-1)^5 w_2
\begin{vmatrix}
\alpha_1(\alpha_1-1)\, x_1^{\alpha_1-2} x_2^{\alpha_2} & w_1 \\
\alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} & w_2
\end{vmatrix}
+ 0 \\
&= w_1 w_2\, \alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1}
 - w_1^2\, \alpha_2(\alpha_2-1)\, x_1^{\alpha_1} x_2^{\alpha_2-2} \\
&\qquad - w_2^2\, \alpha_1(\alpha_1-1)\, x_1^{\alpha_1-2} x_2^{\alpha_2}
 + w_1 w_2\, \alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} \\
&= 2 w_1 w_2\, \alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1}
 - w_1^2\, \alpha_2(\alpha_2-1)\, x_1^{\alpha_1} x_2^{\alpha_2-2}
 - w_2^2\, \alpha_1(\alpha_1-1)\, x_1^{\alpha_1-2} x_2^{\alpha_2}
\end{aligned} \tag{49}
\]
For a maximum we want this expression to be positive. Rewriting it we obtain
\[
2 w_1 w_2\, \alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1}
 - w_1^2\, \alpha_2(\alpha_2-1)\, x_1^{\alpha_1} x_2^{\alpha_2-2}
 - w_2^2\, \alpha_1(\alpha_1-1)\, x_1^{\alpha_1-2} x_2^{\alpha_2} > 0 \tag{50}
\]
We can also write it in the following convenient way
\[
\begin{aligned}
&2 w_1 w_2\, \alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1} \\
&+ \alpha_2 w_1^2\, x_1^{\alpha_1} x_2^{\alpha_2-2} - \alpha_2^2 w_1^2\, x_1^{\alpha_1} x_2^{\alpha_2-2} \\
&+ \alpha_1 w_2^2\, x_1^{\alpha_1-2} x_2^{\alpha_2} - \alpha_1^2 w_2^2\, x_1^{\alpha_1-2} x_2^{\alpha_2} > 0
\end{aligned} \tag{51}
\]
To eliminate the prices we can substitute from the first-order conditions.
\[
w_1 = \frac{\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2}}{\lambda},
\qquad
w_2 = \frac{\alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1}}{\lambda}
\]
This then gives
\[
\begin{aligned}
&2\alpha_1\alpha_2\, x_1^{\alpha_1-1} x_2^{\alpha_2-1}
\left(\frac{\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2}}{\lambda}\right)
\left(\frac{\alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1}}{\lambda}\right) \\
&+ \alpha_2 \left(\frac{\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2}}{\lambda}\right)^2 x_1^{\alpha_1} x_2^{\alpha_2-2}
 - \alpha_2^2 \left(\frac{\alpha_1 x_1^{\alpha_1-1} x_2^{\alpha_2}}{\lambda}\right)^2 x_1^{\alpha_1} x_2^{\alpha_2-2} \\
&+ \alpha_1 \left(\frac{\alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1}}{\lambda}\right)^2 x_1^{\alpha_1-2} x_2^{\alpha_2}
 - \alpha_1^2 \left(\frac{\alpha_2 x_1^{\alpha_1} x_2^{\alpha_2-1}}{\lambda}\right)^2 x_1^{\alpha_1-2} x_2^{\alpha_2} > 0
\end{aligned} \tag{52}
\]

Multiply both sides by λ² and combine terms to obtain
\[
\begin{aligned}
&2\alpha_1^2\alpha_2^2\, x_1^{3\alpha_1-2} x_2^{3\alpha_2-2} \\
&+ \alpha_1^2\alpha_2\, x_1^{3\alpha_1-2} x_2^{3\alpha_2-2} - \alpha_1^2\alpha_2^2\, x_1^{3\alpha_1-2} x_2^{3\alpha_2-2} \\
&+ \alpha_1\alpha_2^2\, x_1^{3\alpha_1-2} x_2^{3\alpha_2-2} - \alpha_1^2\alpha_2^2\, x_1^{3\alpha_1-2} x_2^{3\alpha_2-2} > 0
\end{aligned} \tag{53}
\]
Now factor out x1^{3α1−2} x2^{3α2−2} to obtain
\[
\begin{aligned}
&x_1^{3\alpha_1-2} x_2^{3\alpha_2-2}
\left(2\alpha_1^2\alpha_2^2 + \alpha_1^2\alpha_2 - \alpha_1^2\alpha_2^2 + \alpha_1\alpha_2^2 - \alpha_1^2\alpha_2^2\right) > 0 \\
&\Rightarrow\; x_1^{3\alpha_1-2} x_2^{3\alpha_2-2}\left(\alpha_1^2\alpha_2 + \alpha_1\alpha_2^2\right) > 0
\end{aligned} \tag{54}
\]
With positive values for x1 and x2 the whole expression will be positive if the last term in parentheses is positive. Then rewrite this expression as

α1²α2 + α1α2² > 0     (55)

Now divide both sides by α1²α2² (which is positive) to obtain

1/α2 + 1/α1 > 0     (56)

This always holds for positive α1 and α2, so the critical point is a maximum.
3.7. Some More Example Problems. (A worked sketch of problem (iii) follows the list.)

(i) opt_{x1,x2} [x1x2] s.t. x1 + x2 = 6
(ii) opt_{x1,x2} [x1x2 + 2x1] s.t. 4x1 + 2x2 = 60
(iii) opt_{x1,x2} [x1² + x2²] s.t. x1 + 2x2 = 20
(iv) opt_{x1,x2} [x1x2] s.t. x1² + 4x2² = 1
(v) opt_{x1,x2} [x1^{1/4} x2^{1/2}] s.t. 2x1 + 8x2 = 60
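As an illustration, here is a sketch (assuming sympy) of problem (iii): it solves the first order conditions and classifies the critical point with the bordered Hessian, as in the examples above.

```python
# A sketch of problem (iii), assuming sympy: opt x1^2 + x2^2 s.t. x1 + 2*x2 = 20.
import sympy as sp

x1, x2, lam = sp.symbols('x1 x2 lam', real=True)
g = x1 + 2*x2 - 20
L = x1**2 + x2**2 - lam*g

sol = sp.solve([sp.diff(L, v) for v in (x1, x2, lam)], (x1, x2, lam), dict=True)[0]
print(sol)                    # {x1: 4, x2: 8, lam: 8}

HB = sp.Matrix([[2, 0, 1],    # Hessian of L bordered by dg/dx, as in (30)
                [0, 2, 2],
                [1, 2, 0]])
print(HB.det())               # -10, so (-1)^1 |HB| > 0: a strict local minimum
```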

4. THE IMPLICIT FUNCTION THEOREM

4.1. Statement of Theorem. We are often interested in solving an implicit system of m equations in m + p variables for the first m variables, say x1, x2, . . . , xm, in terms of the remaining p variables. We typically relabel the variables xm+1, xm+2, . . . , xm+p as y1, y2, . . . , yp. We are frequently interested in the derivatives ∂xi/∂yj, where it is implicit that all other xk and all yℓ are held constant. The conditions guaranteeing that we can solve for m of the variables in terms of the remaining p variables, along with a formula for computing the derivatives, are given by the implicit function theorem.

Theorem 1 (Implicit Function Theorem). Suppose that φi are real-valued functions defined on a domain D and continuously differentiable on an open set D1 ⊂ D ⊂ Rm+p, where p > 0 and

φi(x1⁰, x2⁰, . . . , xm⁰, y1⁰, y2⁰, . . . , yp⁰) = φi(x⁰, y⁰) = 0,  i = 1, 2, . . . , m,  (x⁰, y⁰) ∈ D1.     (57)

Assume the Jacobian matrix [∂φi(x⁰, y⁰)/∂xj] has rank m. Then there exists a neighborhood N_δ(x⁰, y⁰) ⊂ D1, an open set D2 ⊂ Rp containing y⁰, and real-valued functions ψk, k = 1, 2, . . . , m, continuously differentiable on D2, such that the following conditions are satisfied:

xk⁰ = ψk(y⁰),  k = 1, 2, . . . , m.     (58)

For every y ∈ D2, we have

φi(ψ1(y), ψ2(y), . . . , ψm(y), y1, y2, . . . , yp) ≡ 0,  i = 1, 2, . . . , m,
or     (59)
φi(ψ(y), y) ≡ 0,  i = 1, 2, . . . , m.

We also have that for all (x, y) ∈ N_δ(x⁰, y⁰), the Jacobian matrix [∂φi(x, y)/∂xj] has rank m. Furthermore for y ∈ D2, the partial derivatives of ψ(y) are the solutions of the set of linear equations
\[
\sum_{k=1}^{m} \frac{\partial \phi_i(\psi(y), y)}{\partial x_k}\,\frac{\partial \psi_k(y)}{\partial y_j}
= -\frac{\partial \phi_i(\psi(y), y)}{\partial y_j}, \qquad i = 1, 2, \ldots, m \tag{60}
\]

4.2. Example with one equation and three variables. Consider one implicit equation with three variables.

φ(x1⁰, x2⁰, y⁰) = 0     (61)

The implicit function theorem says that we can solve equation 61 for x1⁰ as a function of x2⁰ and y⁰, i.e.,

x1⁰ = ψ1(x2⁰, y⁰)     (62)

and that

φ(ψ1(x2, y), x2, y) = 0     (63)

The theorem then says that
\[
\begin{aligned}
\frac{\partial \phi(\psi_1(x_2, y), x_2, y)}{\partial x_1}\,\frac{\partial \psi_1}{\partial x_2}
 &= -\frac{\partial \phi(\psi_1(x_2, y), x_2, y)}{\partial x_2} \\
\Rightarrow\;
\frac{\partial \phi(\psi_1(x_2, y), x_2, y)}{\partial x_1}\,\frac{\partial x_1(x_2, y)}{\partial x_2}
 &= -\frac{\partial \phi(\psi_1(x_2, y), x_2, y)}{\partial x_2} \\
\Rightarrow\;
\frac{\partial x_1(x_2, y)}{\partial x_2}
 &= -\,\frac{\partial \phi(\psi_1(x_2, y), x_2, y)/\partial x_2}{\partial \phi(\psi_1(x_2, y), x_2, y)/\partial x_1}
\end{aligned} \tag{64}
\]
Consider the following example.

φ(x1⁰, x2⁰, y⁰) = 0
y⁰ − f(x1⁰, x2⁰) = 0     (65)

The theorem says that we can solve the equation for x1⁰.

x1⁰ = ψ1(x2⁰, y⁰)     (66)

It is also true that

φ(ψ1(x2, y), x2, y) = 0
y − f(ψ1(x2, y), x2) = 0     (67)

Now compute the relevant derivatives
\[
\begin{aligned}
\frac{\partial \phi(\psi_1(x_2, y), x_2, y)}{\partial x_1} &= -\frac{\partial f(\psi_1(x_2, y), x_2)}{\partial x_1} \\
\frac{\partial \phi(\psi_1(x_2, y), x_2, y)}{\partial x_2} &= -\frac{\partial f(\psi_1(x_2, y), x_2)}{\partial x_2}
\end{aligned} \tag{68}
\]
The theorem then says that
\[
\begin{aligned}
\frac{\partial x_1(x_2, y)}{\partial x_2}
 &= -\left[\frac{\partial \phi(\psi_1(x_2, y), x_2, y)/\partial x_2}{\partial \phi(\psi_1(x_2, y), x_2, y)/\partial x_1}\right] \\
 &= -\left[\frac{-\,\partial f(\psi_1(x_2, y), x_2)/\partial x_2}{-\,\partial f(\psi_1(x_2, y), x_2)/\partial x_1}\right] \\
 &= -\,\frac{\partial f(\psi_1(x_2, y), x_2)/\partial x_2}{\partial f(\psi_1(x_2, y), x_2)/\partial x_1}
\end{aligned} \tag{69}
\]
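A quick symbolic check of (69) on a concrete function (a sketch assuming sympy; the particular f is our own hypothetical choice):

```python
# A sketch, assuming sympy, with the hypothetical choice f = x1**2 * x2:
# the implicit derivative of x1 with respect to x2 holding y = f(x1, x2) fixed.
import sympy as sp

x1, x2, y = sp.symbols('x1 x2 y', positive=True)
phi = y - x1**2 * x2
dx1_dx2 = -sp.diff(phi, x2) / sp.diff(phi, x1)
print(sp.simplify(dx1_dx2))                       # -x1/(2*x2), i.e. -f2/f1

# Cross-check: solve phi = 0 for x1 explicitly and differentiate directly.
x1_explicit = sp.solve(phi, x1)[0]                # sqrt(y/x2)
diff_check = sp.diff(x1_explicit, x2) - dx1_dx2.subs(x1, x1_explicit)
print(sp.simplify(diff_check))                    # 0
```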

4.3. Example with two equations and three variables. Consider the following system of equations

φ1(x1, x2, y) = 3x1 + 2x2 + 4y = 0
φ2(x1, x2, y) = 4x1 + x2 + y = 0     (70)

The Jacobian is given by
\[
\begin{bmatrix}
\frac{\partial \phi_1}{\partial x_1} & \frac{\partial \phi_1}{\partial x_2} \\[1ex]
\frac{\partial \phi_2}{\partial x_1} & \frac{\partial \phi_2}{\partial x_2}
\end{bmatrix}
= \begin{bmatrix} 3 & 2 \\ 4 & 1 \end{bmatrix} \tag{71}
\]

We can solve system 70 for x1 and x2 as functions of y. Move y to the right hand side in each equation.

3x1 + 2x2 = −4y     (72a)
4x1 + x2 = −y     (72b)

Now solve equation 72b for x2

x2 = −y − 4x1     (73)

Substitute the solution to equation 73 into equation 72a and simplify

3x1 + 2(−y − 4x1) = −4y
⇒ 3x1 − 2y − 8x1 = −4y
⇒ −5x1 = −2y     (74)
⇒ x1 = (2/5)y = ψ1(y)

Substitute the solution to equation 74 into equation 73 and simplify

x2 = −y − 4((2/5)y)
⇒ x2 = −(5/5)y − (8/5)y     (75)
⇒ x2 = −(13/5)y = ψ2(y)
If we substitute these expressions for x1 and x2 into equation 70 we obtain

φ1((2/5)y, −(13/5)y, y) = 3((2/5)y) + 2(−(13/5)y) + 4y
                        = (6/5)y − (26/5)y + (20/5)y     (76)
                        = −(20/5)y + (20/5)y = 0

and

φ2((2/5)y, −(13/5)y, y) = 4((2/5)y) + (−(13/5)y) + y
                        = (8/5)y − (13/5)y + (5/5)y     (77)
                        = (13/5)y − (13/5)y = 0
Furthermore

∂ψ1/∂y = 2/5
∂ψ2/∂y = −13/5     (78)
We can solve for these partial derivatives using equation 60 as follows

(∂φ1/∂x1)(∂ψ1/∂y) + (∂φ1/∂x2)(∂ψ2/∂y) = −∂φ1/∂y     (79a)
(∂φ2/∂x1)(∂ψ1/∂y) + (∂φ2/∂x2)(∂ψ2/∂y) = −∂φ2/∂y     (79b)

Now substitute in the derivatives of φ1 and φ2 with respect to x1, x2, and y.

3(∂ψ1/∂y) + 2(∂ψ2/∂y) = −4     (80a)
4(∂ψ1/∂y) + 1(∂ψ2/∂y) = −1     (80b)

Solve equation 80b for ∂ψ2/∂y

∂ψ2/∂y = −1 − 4(∂ψ1/∂y)     (81)

Now substitute the answer from equation 81 into equation 80a

3(∂ψ1/∂y) + 2(−1 − 4(∂ψ1/∂y)) = −4
⇒ 3(∂ψ1/∂y) − 2 − 8(∂ψ1/∂y) = −4
⇒ −5(∂ψ1/∂y) = −2     (82)
⇒ ∂ψ1/∂y = 2/5
If we substitute equation 82 into equation 81 we obtain

∂ψ2/∂y = −1 − 4(∂ψ1/∂y)
⇒ ∂ψ2/∂y = −1 − 4(2/5)     (83)
         = −5/5 − 8/5 = −13/5
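The same derivatives drop out of one linear solve, which is exactly the system (60). A sketch (assuming numpy):

```python
# A direct check of (79)-(83), assuming numpy: the derivatives of (psi1, psi2)
# with respect to y solve J * dpsi/dy = -dphi/dy, the linear system (60).
import numpy as np

J = np.array([[3., 2.],        # [dphi_i/dx_j] from (71)
              [4., 1.]])
rhs = -np.array([4., 1.])      # -[dphi_i/dy]
print(np.linalg.solve(J, rhs)) # [ 0.4 -2.6], i.e. 2/5 and -13/5
```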
5. FORMAL ANALYSIS OF LAGRANGIAN MULTIPLIERS AND EQUALITY CONSTRAINED PROBLEMS

5.1. Definition of the Lagrangian. Consider a function of n variables denoted f(x) = f(x1, x2, . . . , xn). Suppose x∗ minimizes f(x) for all x ∈ N_δ(x∗) that satisfy

gi(x) = 0,  i = 1, . . . , m

Assume the Jacobian matrix (J) of the constraint equations gi(x∗) has rank m. Then:
\[
\nabla f(x^*) = \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) \tag{84}
\]
In other words the gradient of f at x∗ is a linear combination of the gradients of gi at x∗ with weights λi∗. For later reference note that the Jacobian can be written
\[
J_g =
\begin{bmatrix}
\frac{\partial g_1(x^*)}{\partial x_1} & \frac{\partial g_2(x^*)}{\partial x_1} & \ldots & \frac{\partial g_m(x^*)}{\partial x_1} \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_2} & \frac{\partial g_2(x^*)}{\partial x_2} & \ldots & \frac{\partial g_m(x^*)}{\partial x_2} \\[1ex]
\vdots & \vdots & \ddots & \vdots \\[1ex]
\frac{\partial g_1(x^*)}{\partial x_n} & \frac{\partial g_2(x^*)}{\partial x_n} & \ldots & \frac{\partial g_m(x^*)}{\partial x_n}
\end{bmatrix} \tag{85}
\]
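Condition (84) is also easy to test numerically at a candidate point: regress ∇f on the columns of Jg and look at the residual. A sketch (assuming numpy; the particular f, g, and x∗ are our own hypothetical example):

```python
# A numerical sketch of (84), assuming numpy: minimize f = x1^2 + x2^2 + x3^2
# subject to g = x1 + x2 + x3 - 3 = 0, whose solution is x* = (1, 1, 1).
# At x*, grad f must lie in the column span of the Jacobian Jg.
import numpy as np

x_star = np.array([1., 1., 1.])
grad_f = 2 * x_star                 # gradient of f at x*
Jg = np.ones((3, 1))                # n x m Jacobian as in (85), m = 1
lam, *_ = np.linalg.lstsq(Jg, grad_f, rcond=None)
print(lam)                                    # [2.]
print(np.linalg.norm(Jg @ lam - grad_f))      # 0.0: (84) holds with lambda* = 2
```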
Proof:
By suitable rearrangement of the rows we can always assume the m × m matrix formed from the first m rows of the Jacobian [∂gi(x∗)/∂xj] is non-singular. Therefore the set of linear equations
\[
\sum_{i=1}^{m} \lambda_i \frac{\partial g_i(x^*)}{\partial x_j} = \frac{\partial f(x^*)}{\partial x_j}, \qquad j = 1, \ldots, m \tag{86}
\]
will have a unique solution λ∗. In matrix notation we can write equation 86 as

Jλ = ∇f

If J is invertible, we can solve the system for λ. Therefore (84) is true for the first m elements of ∇f(x∗).
We must show (84) is also true for the last n − m elements. Let x̃ = (xm+1, xm+2, . . . , xn). Then by using the implicit function theorem we can solve for the first m x's in terms of the remaining x's, or x̃.

xj∗ = hj(x̃∗),  j = 1, . . . , m     (87)

We can define f(x∗) as

f(x∗) = f(h1(x̃∗), h2(x̃∗), . . . , hm(x̃∗), x∗m+1, . . . , x∗n)     (88)

Since we are at a minimum, we know that the first partial derivatives of f with respect to xm+1, xm+2, . . . , xn must vanish at x∗, i.e.

∂f(x∗)/∂xj = 0,  j = m + 1, . . . , n
Totally differentiating (88) we obtain
\[
\frac{\partial f(x^*)}{\partial x_j}
= \sum_{k=1}^{m} \frac{\partial f(x^*)}{\partial x_k}\,\frac{\partial h_k(\tilde{x}^*)}{\partial x_j}
+ \frac{\partial f(x^*)}{\partial x_j} = 0, \qquad j = m+1, \ldots, n \tag{89}
\]
by the implicit function theorem. We can also use the implicit function theorem to find the derivative of the ith constraint with respect to the jth variable, where the jth variable goes from m + 1 to n. Applying the theorem to

gi(x∗) = gi(h1(x̃∗), h2(x̃∗), . . . , hm(x̃∗), x∗m+1, . . . , x∗n) = 0

we obtain
\[
\sum_{k=1}^{m} \frac{\partial g_i(x^*)}{\partial x_k}\,\frac{\partial h_k(\tilde{x}^*)}{\partial x_j}
= -\frac{\partial g_i(x^*)}{\partial x_j}, \qquad i = 1, \ldots, m \tag{90}
\]
Now multiply each side of (90) by λi∗ and add them up.
\[
\sum_{i=1}^{m}\sum_{k=1}^{m} \lambda_i^* \frac{\partial g_i(x^*)}{\partial x_k}\,\frac{\partial h_k(\tilde{x}^*)}{\partial x_j}
+ \sum_{i=1}^{m} \lambda_i^* \frac{\partial g_i(x^*)}{\partial x_j} = 0, \qquad j = m+1, \ldots, n \tag{91}
\]
Now subtract (91) from (89) to obtain:
\[
\sum_{k=1}^{m}\left[\frac{\partial f(x^*)}{\partial x_k} - \sum_{i=1}^{m} \lambda_i^* \frac{\partial g_i(x^*)}{\partial x_k}\right]\frac{\partial h_k(\tilde{x}^*)}{\partial x_j}
+ \frac{\partial f(x^*)}{\partial x_j} - \sum_{i=1}^{m} \lambda_i^* \frac{\partial g_i(x^*)}{\partial x_j} = 0, \qquad j = m+1, \ldots, n \tag{92}
\]
The bracketed term is zero from (86) so that
\[
\frac{\partial f(x^*)}{\partial x_j} - \sum_{i=1}^{m} \lambda_i^* \frac{\partial g_i(x^*)}{\partial x_j} = 0, \qquad j = m+1, \ldots, n \tag{93}
\]
Since (86) implies this is true for j = 1, . . . , m, we know it is true for j = 1, 2, . . . , n and we are finished.
The λi are called Lagrange multipliers and the expression
\[
L(x, \lambda) = f(x) - \sum_{i=1}^{m} \lambda_i g_i(x) \tag{94}
\]
is called the Lagrangian function.
5.2. Proof of Necessary Conditions. The necessary conditions for an extreme point are
\[
\nabla L(x^*, \lambda^*) = \nabla f(x^*) - J_g(x^*)\lambda^* = 0
\;\Rightarrow\;
\frac{\partial f(x^*)}{\partial x_j} - \sum_{i=1}^{m} \lambda_i^* \frac{\partial g_i(x^*)}{\partial x_j} = 0, \qquad j = 1, \ldots, n \tag{95}
\]
This is obvious from (84) and (94).
5.3. Proof of Sufficient Conditions. The sufficient conditions are repeated here for convenience.
Let f, g1, . . . , gm be twice continuously differentiable real-valued functions on Rn. If there exist vectors x∗ ∈ Rn, λ∗ ∈ Rm such that

∇L(x∗, λ∗) = 0     (5)

and for every non-zero vector z ∈ Rn satisfying

z′∇gi(x∗) = 0,  i = 1, . . . , m     (6)

it follows that

z′∇²ₓL(x∗, λ∗)z > 0     (7)

then f has a strict local minimum at x∗, subject to gi(x) = 0, i = 1, . . . , m. If the inequality in (7) is reversed, then f has a strict local maximum at x∗.
Proof:
Assume x∗ is not a strict local minimum. Then there exists a neighborhood N_δ(x∗) and a sequence {zᵏ}, zᵏ ∈ N_δ(x∗), zᵏ ≠ x∗, converging to x∗ such that for every zᵏ in the sequence

gi(zᵏ) = 0,  i = 1, . . . , m     (96)

f(x∗) ≥ f(zᵏ)     (97)

This simply says that since x∗ is not the minimum value subject to the constraints, there exists a sequence of values in the neighborhood of x∗ that satisfies the constraints and has an objective function value less than or equal to f(x∗).
The proof will require the mean value theorem, which is repeated here for completeness.

Mean Value Theorem

Theorem 2. Let f be defined on an open subset Ω of Rn and have values in R¹. Suppose the set Ω contains the points a, b and the line segment S joining them, and that f is differentiable at every point of this segment. Then there exists a point c on S such that
\[
\begin{aligned}
f(b) - f(a) &= \nabla f(c)'(b - a) \\
&= \frac{\partial f(c)}{\partial x_1}(b_1 - a_1) + \frac{\partial f(c)}{\partial x_2}(b_2 - a_2) + \cdots + \frac{\partial f(c)}{\partial x_n}(b_n - a_n)
\end{aligned} \tag{98}
\]
where b is the vector (b1, b2, . . . , bn) and a is the vector (a1, a2, . . . , an).
Now let yᵏ and zᵏ be vectors in Rn and let zᵏ = x∗ + θᵏyᵏ where θᵏ > 0 and ‖yᵏ‖ = 1, so that zᵏ − x∗ = θᵏyᵏ. The sequence {θᵏ, yᵏ} has a subsequence that converges to (0, ȳ) where ‖ȳ‖ = 1. Now if we use the mean value theorem we obtain for each k in this subsequence

gi(zᵏ) − gi(x∗) = θᵏ yᵏ′∇gi(x∗ + γᵢᵏθᵏyᵏ) = 0,  i = 1, . . . , m     (99)

where γᵢᵏ is a number between 0 and 1 and gi is the ith constraint. The expression is equal to zero because we assume that the constraint is satisfied at the optimal point and at the point zᵏ, so the left hand side of (98) vanishes.
Expression 99 follows from the mean value theorem because zᵏ − x∗ = θᵏyᵏ and, with γᵢᵏ between zero and one, x∗ + γᵢᵏθᵏyᵏ lies between zᵏ = x∗ + θᵏyᵏ and x∗.
If we use the mean value theorem to evaluate f(zᵏ) we obtain

f(zᵏ) − f(x∗) = θᵏ yᵏ′∇f(x∗ + ηᵏθᵏyᵏ) ≤ 0     (100)

where 0 < ηᵏ < 1. This is less than or equal to zero by our assumption in equation 97.
If we divide (99) and (100) by θᵏ and take the limit as k → ∞ we obtain
\[
\lim_{k \to \infty}\left[y^{k\prime}\nabla g_i(x^* + \gamma_i^k \theta^k y^k)\right] = \bar{y}'\nabla g_i(x^*) = 0, \qquad i = 1, 2, \ldots, m \tag{101}
\]
\[
\lim_{k \to \infty}\left[y^{k\prime}\nabla f(x^* + \eta^k \theta^k y^k)\right] = \bar{y}'\nabla f(x^*) \leq 0 \tag{102}
\]
Now remember from Taylor's theorem that we can write the Lagrangian in (94) as
\[
\begin{aligned}
L(z^k, \lambda^*) &= L(x^*, \lambda^*) + (z^k - x^*)'\nabla_x L(x^*, \lambda^*)
+ \frac{1}{2}(z^k - x^*)'\nabla_x^2 L(x^* + \beta^k \theta^k y^k, \lambda^*)(z^k - x^*) \\
&= L(x^*, \lambda^*) + \theta^k y^{k\prime}\nabla_x L(x^*, \lambda^*)
+ \frac{1}{2}\theta^{k2}\, y^{k\prime}\nabla_x^2 L(x^* + \beta^k \theta^k y^k, \lambda^*)\, y^k
\end{aligned} \tag{103}
\]
where 0 < βᵏ < 1.

Now note that
\[
L(z^k, \lambda^*) = f(z^k) - \sum_{i=1}^{m} \lambda_i g_i(z^k),
\qquad
L(x^*, \lambda^*) = f(x^*) - \sum_{i=1}^{m} \lambda_i g_i(x^*)
\]
and that at the optimum or at the assumed point zᵏ, gi(·) = 0.
Also ∇L(x∗, λ∗) = 0 at the optimum, so the second term on the right hand side of (103) is zero. Move the first term to the left hand side to obtain
\[
L(z^k, \lambda^*) - L(x^*, \lambda^*) = \frac{1}{2}\theta^{k2}\, y^{k\prime}\nabla_x^2 L(x^* + \beta^k \theta^k y^k, \lambda^*)\, y^k \tag{104}
\]
Because we assumed f(x∗) ≥ f(zᵏ) in (97) and that g(·) is zero at both x∗ and zᵏ, it is clear that

L(zᵏ, λ∗) − L(x∗, λ∗) ≤ 0     (105)

Therefore,
\[
\frac{1}{2}\theta^{k2}\, y^{k\prime}\nabla_x^2 L(x^* + \beta^k \theta^k y^k, \lambda^*)\, y^k \leq 0 \tag{106}
\]
Divide both sides by ½θᵏ² to obtain

yᵏ′∇²ₓL(x∗ + βᵏθᵏyᵏ, λ∗)yᵏ ≤ 0     (107)

Now take the limit as k → ∞ to obtain

ȳ′∇²ₓL(x∗, λ∗)ȳ ≤ 0     (108)


We are finished since ȳ ≠ 0 and, by equation 101,

ȳ′∇gi(x∗) = 0,  i = 1, 2, . . . , m     (109)

That is, if x∗ is not a strict local minimum, then we have a non-zero vector ȳ satisfying (109) with ȳ′∇²ₓL(x∗, λ∗)ȳ ≤ 0. This contradicts the hypothesis that (7) holds for every such vector, so x∗ must be a strict local minimum.
