
Homework 1


Mathematics for AI: AIT2005

Submit your work on https://courses.uet.vnu.edu.vn/.

1 Convex sets

a) Show that a polyhedron {x ∈ R^n : Ax ≤ b}, for some A ∈ R^{m×n}, b ∈ R^m, is both convex and
closed.

b) Show that if S_i ⊆ R^n, i ∈ I, is a collection of convex sets, then their intersection ∩_{i∈I} S_i is also
convex. Show that the same statement holds if we replace “convex” with “closed”.
c) Give an example of a closed set in R2 whose convex hull is not closed.

2 Convex functions

a) Prove that the entropy function, defined as

       f(x) = −∑_{i=1}^n x_i log(x_i),

   with dom(f) = {x ∈ R^n_{++} : ∑_{i=1}^n x_i = 1}, is strictly concave.
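Before attempting the proof, it can help to see the claim hold numerically. The sketch below (an illustrative check, not a proof) samples random pairs of points on the simplex and verifies the strict midpoint-concavity inequality f((x + y)/2) > (f(x) + f(y))/2:

```python
import numpy as np

rng = np.random.default_rng(0)

def entropy(x):
    # f(x) = -sum_i x_i log(x_i), defined on the open probability simplex
    return -np.sum(x * np.log(x))

# Strict midpoint concavity: for distinct x, y on the simplex,
# f((x + y)/2) should strictly exceed (f(x) + f(y))/2.
for _ in range(1000):
    x = rng.random(5); x /= x.sum()
    y = rng.random(5); y /= y.sum()
    assert entropy((x + y) / 2) > (entropy(x) + entropy(y)) / 2
print("midpoint concavity held on all sampled pairs")
```

Such a check cannot replace the Hessian-based argument the exercise asks for, but it is a quick way to catch sign errors before writing it up.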
b) Let f be twice differentiable, with dom(f ) convex. Prove that f is convex if and only if

(∇f(x) − ∇f(y))^T (x − y) ≥ 0,

for all x, y. This property is called monotonicity of the gradient ∇f .
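Gradient monotonicity is easy to observe empirically for a concrete convex function. The sketch below (an informal check using log-sum-exp, a standard smooth convex function not taken from this handout) evaluates the inner product above at random pairs of points:

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_logsumexp(x):
    # Gradient of the convex function f(x) = log(sum_i exp(x_i)),
    # computed stably by shifting by the max.
    e = np.exp(x - x.max())
    return e / e.sum()

# Monotonicity of the gradient: (grad f(x) - grad f(y))^T (x - y) >= 0.
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    assert (grad_logsumexp(x) - grad_logsumexp(y)) @ (x - y) >= -1e-12
print("gradient monotonicity held for log-sum-exp on all samples")
```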

c) Prove that the maximum of a convex function over a bounded polyhedron must occur at one of
the vertices. Hint: you may use the fact that a bounded polyhedron can be represented as the
convex hull of its vertices.

3 Partial optimization with ℓ2 penalties
Consider the problem

       min_{β, σ≥0}  f(β) + (λ/2) ∑_{i=1}^n g(β_i, σ_i),                (1)

for some convex f with domain R^n, λ ≥ 0, and

       g(x, y) = { x²/y + y   if y > 0
                 { 0          if x = 0, y = 0
                 { ∞          otherwise.

In other words, the problem (1) is just the weighted ℓ2 penalized problem

       min_{β, σ≥0}  f(β) + (λ/2) ∑_{i=1}^n (β_i²/σ_i + σ_i),

but being careful to treat the ith term in the sum as zero when β_i = σ_i = 0.

a) Prove that g is convex. Hence argue that (1) is a convex problem. Note that this means we can
perform partial optimization in (1) and expect it to return another convex problem. Hint: use the
definition of convexity.
b) Argue that min_{y≥0} g(x, y) = 2|x|.
c) Argue that minimizing over σ ≥ 0 in (1) gives the ℓ1 penalized problem

       min_β  f(β) + λ∥β∥_1.
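For part b), the key identity is that for x ≠ 0 the function x²/y + y over y > 0 is minimized at y = |x| with value 2|x| (by the AM-GM inequality). The grid search below is an informal numeric check of that identity, not part of the required argument:

```python
import numpy as np

# For x != 0, g(x, y) = x**2 / y + y over y > 0 is minimized at
# y = |x| (AM-GM), with minimum value 2|x|. Check on a fine grid.
ys = np.linspace(1e-4, 10.0, 200000)
for x in (-3.0, -0.5, 0.25, 2.0):
    vals = x**2 / ys + ys
    assert abs(vals.min() - 2 * abs(x)) < 1e-3
print("min over y of x^2/y + y matches 2|x| on the grid")
```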

4 Lipschitz gradients and strong convexity


Let f be convex and twice continuously differentiable.

a) Show that the following statements are equivalent.


   i. ∇f is Lipschitz with constant L;
  ii. (∇f(x) − ∇f(y))^T (x − y) ≤ L∥x − y∥₂² for all x, y;
 iii. ∇²f(x) ⪯ LI for all x;
  iv. f(y) ≤ f(x) + ∇f(x)^T (y − x) + (L/2)∥y − x∥₂² for all x, y.

Your solution should have 5 parts, where you prove i ⇒ ii, ii ⇒ iii, iii ⇒ iv, iv ⇒ ii, and iii ⇒ i.

b) Show that the following statements are equivalent.


   i. f is strongly convex with constant m;
  ii. (∇f(x) − ∇f(y))^T (x − y) ≥ m∥x − y∥₂² for all x, y;
 iii. ∇²f(x) ⪰ mI for all x;
  iv. f(y) ≥ f(x) + ∇f(x)^T (y − x) + (m/2)∥y − x∥₂² for all x, y.

Your solution should have 4 parts, where you prove i ⇒ ii, ii ⇒ iii, iii ⇒ iv, and iv ⇒ i.

5 Solving optimization problems with CVX (18 points)
“CVX is a fantastic framework for disciplined convex programming: it’s rarely the fastest tool for the
job, but it’s widely applicable, and so it’s a great tool to be comfortable with. In this exercise we will
set up the CVX environment and solve a convex optimization problem.
Generally speaking, for homeworks in this class, your solution to programming-based problems
should include plots and whatever explanation is necessary to answer the questions asked. In addition,
your full code should be submitted as an appendix to the homework document.
CVX variants are available for each of the major numerical programming languages. There are
some minor syntactic and functional differences between the variants but all provide essentially the
same functionality. Download the CVX variant of your choosing:
• Matlab: http://cvxr.com/cvx/

• Python: http://www.cvxpy.org/

• R: https://cvxr.rbind.io

• Julia: https://github.com/JuliaOpt/Convex.jl
and consult the documentation to understand the basic functionality. Make sure that you can solve the
least squares problem min_β ∥y − Xβ∥₂² for an arbitrary vector y and matrix X. Check your answer
by comparing with the closed-form solution (X^T X)^{−1} X^T y.”

Given labels y ∈ {−1, 1}^n, and a feature matrix X ∈ R^{n×p} with rows x_1, . . . , x_n, recall the support
vector machine (SVM) problem

       min_{β, β_0, ξ}  (1/2)∥β∥₂² + C ∑_{i=1}^n ξ_i
       subject to       ξ_i ≥ 0, i = 1, . . . , n,
                        y_i(x_i^T β + β_0) ≥ 1 − ξ_i, i = 1, . . . , n.

i. Load the training data in xy train.csv. This is a matrix of n = 200 rows and 3 columns. The
   first two columns give the p = 2 features, and the third column gives the labels. Using
   CVX, solve the SVM problem with C = 1. Report the optimal criterion value, and the optimal
   coefficients β ∈ R^2 and intercept β_0 ∈ R.

ii. Investigate many values of the cost parameter C = 2^a, as a varies from −5 to 5. For each one,
    solve the SVM problem, form the decision boundary, and calculate the misclassification error on
    the test data in xy test.csv. Make a plot of misclassification error (y-axis) versus C (x-axis,
    which you will probably want to put on a log scale).
