Lecture 6

1) The document presents two theorems for determining if a function is convex: (1) The first order condition states that a function f(x) is convex if f(y) ≥ f(x) + ∇f(x)ᵀ(y − x) for all x, y in the domain. (2) The second order condition states that a function is convex if the Hessian matrix ∇²f(x) is positive semidefinite for all x in the domain. 2) Examples are given to illustrate the second order condition, including quadratic functions where the Hessian is constant and positive semidefinite, and the loss function of linear regression where the Hessian is the transpose of the design matrix times itself,

Uploaded by

Tấn Long Lê

SYS 6003: Optimization Fall 2015

Lecture 6
Instructor: Quanquan Gu Date: Sep 12th

The following theorem provides a necessary and sufficient condition for verifying that a
function is convex.

Theorem 1 (First order condition for convex functions) Suppose $f : \mathbb{R}^d \to \mathbb{R}$ is a
continuously differentiable function over its convex domain $\operatorname{dom} f$. Then $f$ is convex if
and only if

\[ f(y) \ge f(x) + \nabla f(x)^\top (y - x) \tag{1} \]

for all $x, y \in \operatorname{dom} f$.
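Inequality (1) can be spot-checked numerically. The following is a minimal sketch, not part of the lecture: the test functions (log-sum-exp as a convex example, a sum of sines as a non-convex one), the sampling box, and all helper names are assumptions chosen for illustration. A passing sampled check is evidence of convexity, not a proof.

```python
import math
import random


def log_sum_exp(x):
    # Convex test function (illustrative assumption): log(sum_i exp(x_i)).
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))


def softmax(x):
    # Gradient of log_sum_exp.
    m = max(x)
    w = [math.exp(xi - m) for xi in x]
    s = sum(w)
    return [wi / s for wi in w]


def sum_sin(x):
    # Non-convex test function (illustrative assumption).
    return sum(math.sin(xi) for xi in x)


def grad_sum_sin(x):
    return [math.cos(xi) for xi in x]


def first_order_gap(f, grad_f, x, y):
    """Return f(y) - f(x) - grad_f(x)^T (y - x); nonnegative when (1) holds."""
    g = grad_f(x)
    return f(y) - f(x) - sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))


def holds_on_samples(f, grad_f, dim=3, trials=1000, tol=1e-9, seed=0):
    """Check inequality (1) on random pairs (x, y) drawn from [-3, 3]^dim."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-3, 3) for _ in range(dim)]
        y = [rng.uniform(-3, 3) for _ in range(dim)]
        if first_order_gap(f, grad_f, x, y) < -tol:
            return False
    return True


if __name__ == "__main__":
    print("log-sum-exp satisfies (1) on samples:", holds_on_samples(log_sum_exp, softmax))
    print("sum of sines satisfies (1) on samples:", holds_on_samples(sum_sin, grad_sum_sin))
```

Geometrically, the gap computed by `first_order_gap` is the vertical distance between the function and its tangent plane at $x$, evaluated at $y$.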

Proof: We first prove the forward direction “⇒”.

Suppose that $f$ is convex. Then for any $x, y \in \operatorname{dom} f$ and any $\alpha \in (0, 1]$, we have
$f(\alpha y + (1 - \alpha)x) \le \alpha f(y) + (1 - \alpha) f(x)$. Dividing by $\alpha > 0$ and rearranging leads to

\[ f(y) \ge \frac{f(\alpha y + (1 - \alpha)x) - (1 - \alpha) f(x)}{\alpha} = f(x) + \frac{f(\alpha y + (1 - \alpha)x) - f(x)}{\alpha}. \tag{2} \]

By Taylor expansion, we have

\[ f(x + \alpha(y - x)) = f(x) + \alpha \nabla f(x)^\top (y - x) + o(\alpha), \]

where $o(\alpha)$ means $\lim_{\alpha \to 0} o(\alpha)/\alpha = 0$. Also note that $x + \alpha(y - x) = \alpha y + (1 - \alpha)x$. Thus,
it follows from (2) that

\[ f(y) \ge f(x) + \frac{\alpha \nabla f(x)^\top (y - x) + o(\alpha)}{\alpha} = f(x) + \nabla f(x)^\top (y - x) + \frac{o(\alpha)}{\alpha}, \]

which immediately leads to (1) by taking $\alpha \to 0$.
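The limiting step in this direction of the proof can be observed numerically: the difference quotient in (2) approaches $\nabla f(x)^\top (y - x)$ as $\alpha \to 0$ while staying below $f(y) - f(x)$. The sketch below assumes a hypothetical smooth convex function $f(x) = e^{x_1} + x_2^2$ and hypothetical points $x$, $y$; none of these come from the lecture.

```python
import math


def f(x):
    # Hypothetical smooth convex function: f(x) = exp(x1) + x2^2.
    return math.exp(x[0]) + x[1] ** 2


def grad_f(x):
    return [math.exp(x[0]), 2.0 * x[1]]


def difference_quotient(x, y, alpha):
    """(f(x + alpha (y - x)) - f(x)) / alpha, the quotient appearing in (2)."""
    z = [xi + alpha * (yi - xi) for xi, yi in zip(x, y)]
    return (f(z) - f(x)) / alpha


def directional_derivative(x, y):
    """grad f(x)^T (y - x), the limit of the quotient as alpha -> 0."""
    return sum(g * (yi - xi) for g, xi, yi in zip(grad_f(x), x, y))


if __name__ == "__main__":
    x, y = [0.0, 1.0], [1.0, -1.0]
    target = directional_derivative(x, y)
    for alpha in (1.0, 0.1, 0.01, 0.001):
        q = difference_quotient(x, y, alpha)
        # The error q - target shrinks with alpha, and q never exceeds f(y) - f(x).
        print(alpha, q, q - target, q <= f(y) - f(x))
```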
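Now we prove the backward direction “⇐”.
We want to show that, for any $x, y \in \operatorname{dom} f$ and any $\alpha \in [0, 1]$,

\[ f(\alpha x + (1 - \alpha)y) \le \alpha f(x) + (1 - \alpha) f(y). \]

Let $z \equiv \alpha x + (1 - \alpha)y$. Since $\operatorname{dom} f$ is a convex set, we have $z \in \operatorname{dom} f$. Applying (1) at the
point $z$, first with $x$ and then with $y$, we have

\[ f(x) \ge f(z) + \nabla f(z)^\top (x - z), \tag{3} \]

\[ f(y) \ge f(z) + \nabla f(z)^\top (y - z). \tag{4} \]

Now multiply inequality (3) by $\alpha$ and inequality (4) by $(1 - \alpha)$, and add them to obtain

\[ \alpha f(x) + (1 - \alpha) f(y) \ge \alpha f(z) + \alpha \nabla f(z)^\top (x - z) + (1 - \alpha) f(z) + (1 - \alpha) \nabla f(z)^\top (y - z). \]

The right-hand side is equal to

\[
\begin{aligned}
f(z) + \nabla f(z)^\top \big(\alpha (x - z) + (1 - \alpha)(y - z)\big) &= f(z) + \nabla f(z)^\top (\alpha x + (1 - \alpha) y - z) \\
&= f(z) + \nabla f(z)^\top (z - z) \\
&= f(z).
\end{aligned}
\]

Since $f(z) = f(\alpha x + (1 - \alpha)y)$, we conclude that $f(\alpha x + (1 - \alpha)y) \le \alpha f(x) + (1 - \alpha) f(y)$, so $f$ is convex.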

Figure 1: Illustrating the 1st Order Condition for Convex Functions

To prove that a function is convex, we can use the definition directly, but that is
sometimes tedious. In the following, we introduce a second-order necessary and sufficient
condition for convex functions, which often provides an easier way to prove that a function is convex.

Theorem 2 (Second order condition for convex functions) Suppose $f : \mathbb{R}^d \to \mathbb{R}$ is
twice continuously differentiable over its convex domain $\operatorname{dom} f$. Then $f$ is convex if and only
if

\[ \nabla^2 f(x) \succeq 0 \quad \text{for all } x \in \operatorname{dom} f. \]

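Proof: By the mean value theorem, applied in the form of a second-order Taylor expansion
with exact remainder, for any $x, y \in \operatorname{dom} f$ we have

\[ f(y) = f(x) + \nabla f(x)^\top (y - x) + \frac{1}{2} (y - x)^\top \nabla^2 f(z) (y - x), \tag{5} \]

where $z = \alpha x + (1 - \alpha) y$ for some $\alpha \in [0, 1]$. Note that since $\operatorname{dom} f$ is convex, we have $z \in \operatorname{dom} f$.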
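We first prove the forward direction “⇒”:
Since $f$ is convex, by the first order condition, we have for any $x, y \in \operatorname{dom} f$,

\[ f(y) \ge f(x) + \nabla f(x)^\top (y - x). \tag{6} \]

Therefore, by combining (5) and (6), we have

\[ \frac{1}{2} (y - x)^\top \nabla^2 f(z) (y - x) \ge 0. \]

Now write $y = x + t v$ for a direction $v \in \mathbb{R}^d$ and $t > 0$ small enough that $y \in \operatorname{dom} f$. Dividing the
inequality above by $t^2/2$ gives $v^\top \nabla^2 f(z) v \ge 0$. Let $t \to 0$, so that $y \to x$ and hence $z \to x$. By the
continuity of $\nabla^2 f$, we then have

\[ v^\top \nabla^2 f(x) v \ge 0. \]

Since $v$ and $x$ are arbitrary, it follows that $\nabla^2 f(x) \succeq 0$.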


We now prove the backward direction “⇐”:
Consider any $x, y \in \operatorname{dom} f$, and let $z = \alpha x + (1 - \alpha) y$ with $\alpha \in [0, 1]$ be the point appearing in (5).
Since $\operatorname{dom} f$ is a convex set, we have $z \in \operatorname{dom} f$. Since $\nabla^2 f(x) \succeq 0$ for all $x \in \operatorname{dom} f$, we have
$\nabla^2 f(z) \succeq 0$. By the definition of positive semidefiniteness, we then know

\[ (y - x)^\top \nabla^2 f(z) (y - x) \ge 0. \tag{7} \]

Therefore, by combining (5) and (7), we have $f(y) \ge f(x) + \nabla f(x)^\top (y - x)$. By the
first-order condition for convex functions, $f$ is convex.
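As a quick one-dimensional sanity check of the second-order condition (an illustration that is not taken from the lecture), compare $f(x) = x^4$ with $g(x) = x^3$. In one dimension the Hessian is just the second derivative, so the condition reduces to a sign check:

```latex
\[
f(x) = x^4: \quad f''(x) = 12x^2 \ge 0 \ \text{for all } x \in \mathbb{R}
\quad\Longrightarrow\quad f \text{ is convex on } \mathbb{R},
\]
\[
g(x) = x^3: \quad g''(x) = 6x < 0 \ \text{for } x < 0
\quad\Longrightarrow\quad g \text{ is not convex on } \mathbb{R}.
\]
```

Note that $f''(0) = 0$, so the condition only asks for positive semidefiniteness, not strict positivity. Note also that the domain matters: restricted to the convex domain $[0, \infty)$, $g''(x) = 6x \ge 0$ everywhere, so $g$ is convex there.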

Now we illustrate the application of the second-order condition for convex functions with
several examples.

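Example 1 (Quadratic Function) $f(x) = \frac{1}{2} x^\top P x + q^\top x + r$, where $P \in \mathbb{R}^{d \times d}$, $P \succeq 0$, $q \in \mathbb{R}^d$, $r \in \mathbb{R}$, $x \in \mathbb{R}^d$.
$f(x)$ is convex, since

\[ \nabla^2 f(x) = P \succeq 0. \]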
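For a small quadratic, the condition $P \succeq 0$ can be checked directly from the eigenvalues. The sketch below is an illustration under stated assumptions: it handles only symmetric $2 \times 2$ matrices (so that the Hessian is exactly $P$ and the eigenvalues have a closed form), and the example matrices and helper names are hypothetical. It also cross-checks the definition of convexity on random samples.

```python
import math
import random


def quad(P, q, r, x):
    """f(x) = 1/2 x^T P x + q^T x + r for a 2x2 matrix P."""
    Px = [sum(P[i][j] * x[j] for j in range(2)) for i in range(2)]
    return 0.5 * sum(x[i] * Px[i] for i in range(2)) + sum(q[i] * x[i] for i in range(2)) + r


def min_eigenvalue_sym2(P):
    """Smallest eigenvalue of a symmetric 2x2 matrix [[a, b], [b, c]]."""
    a, b, c = P[0][0], P[0][1], P[1][1]
    return 0.5 * (a + c) - math.sqrt(0.25 * (a - c) ** 2 + b ** 2)


def is_psd_sym2(P, tol=1e-12):
    """P is positive semidefinite iff its smallest eigenvalue is >= 0 (up to tol)."""
    return min_eigenvalue_sym2(P) >= -tol


def convex_on_samples(P, q, r, trials=500, seed=0):
    """Check f(a x + (1-a) y) <= a f(x) + (1-a) f(y) on random samples."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(2)]
        y = [rng.uniform(-5, 5) for _ in range(2)]
        a = rng.random()
        z = [a * xi + (1 - a) * yi for xi, yi in zip(x, y)]
        if quad(P, q, r, z) > a * quad(P, q, r, x) + (1 - a) * quad(P, q, r, y) + 1e-9:
            return False
    return True


if __name__ == "__main__":
    q, r = [1.0, -2.0], 0.5
    for P in ([[2.0, 1.0], [1.0, 2.0]], [[1.0, 0.0], [0.0, -1.0]]):
        print(P, "PSD:", is_psd_sym2(P), "convex on samples:", convex_on_samples(P, q, r))
```

The linear and constant terms $q^\top x + r$ do not affect the Hessian, which is why only $P$ enters the eigenvalue check.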

Example 2 (Loss function of Linear Regression)

\[ f(x) = \frac{1}{2} \|Ax - b\|_2^2, \]

where $A \in \mathbb{R}^{n \times d}$, $x \in \mathbb{R}^d$, $b \in \mathbb{R}^n$. $f(x)$ is convex, since

\[ \nabla^2 f(x) = A^\top A \succeq 0, \]
