Additional Exercises for Convex Optimization
This is a collection of additional exercises, meant to supplement those found in the book Convex
Optimization, by Stephen Boyd and Lieven Vandenberghe. These exercises were used in several
courses on convex optimization, EE364a (Stanford), EE236b (UCLA), or 6.975 (MIT), usually for
homework, but sometimes as exam questions. Some of the exercises were originally written for the
book, but didn’t make the final cut.
Many of the exercises include a computational component using one of the software packages
for convex optimization: CVXPY (Python), Convex.jl (Julia), CVX (Matlab), or CVXR (R).
We refer to these collectively as CVX*. The files required for these exercises can be found at
https://github.com/cvxgrp/cvxbook_additional_exercises. From 2023 on, new problems
use Python only.
You are free to use these exercises any way you like (for example in a course you teach), provided
you acknowledge the source. In turn, we gratefully acknowledge the teaching assistants (and in
some cases, students) who have helped us develop and debug these exercises. Pablo Parrilo helped
develop some of the exercises that were originally used in MIT 6.975, Sanjay Lall and John Duchi
developed some other problems when they taught EE364a, and the instructors of EE364a during
summer quarters developed others.
We’ll update this document as new exercises become available, so the exercise numbers and
sections will occasionally change. We have categorized the exercises into sections that follow the
book chapters, as well as various additional application areas. Some exercises fit into more than
one section, or don’t fit well into any section, so we have just arbitrarily assigned these.
Course instructors can obtain solutions to these exercises by email to us. Please tell us the
course you are teaching and give its URL.
Contents
1 Introduction
2 Convex sets
3 Convex functions
5 Duality
7 Statistical estimation
8 Geometry
14 Circuits
17 Finance
1 Introduction
1.1 Convex optimization. Are the following statements true or false?
(a) Local optimization can be quite useful in some contexts, and therefore is widely used.
(b) Local optimization is currently illegal in 17 states.
(c) Local optimization can’t guarantee finding a (global) solution, and so is not widely used.
1.2 Device sizing. In a device sizing problem the goal is to minimize power consumption subject to the
total area not exceeding 50, as well as some timing and manufacturing constraints. Four candidate
designs meet the timing and manufacturing constraints, and have power and area listed in the table
below.
Design   Power   Area
A        10      50
B        8       55
C        10      45
D        11      50
Are the statements below true or false?
1.3 Computation time. Very roughly, how long would it take to solve a linear program with 100
variables and 1000 constraints on a computer capable of carrying out 10 Gflops/sec (i.e., 10^10
floating-point operations per second)?
(a) Microseconds.
(b) Milliseconds.
(c) Seconds.
(d) Minutes.
2 Convex sets
2.1 Affine set. Show that the set {Ax + b | F x = g} is affine. Here A ∈ Rm×n , b ∈ Rm , F ∈ Rp×n ,
and g ∈ Rp .
2.2 Set distributive characterization of convexity [Rockafellar]. Show that C ⊆ Rn is convex if and
only if (α + β)C = αC + βC for all nonnegative α, β. Here we use standard notation for scalar-set
multiplication and set addition, i.e., αC = {ac | c ∈ C} and A + B = {a + b | a ∈ A, b ∈ B}.
2.3 Shapley-Folkman theorem. First we define a measure of non-convexity for a set A ⊆ Rn , denoted
δ(A), defined as
δ(A) = sup_{u ∈ conv A} dist(u, A),
where dist(u, A) = inf{∥u − v∥2 | v ∈ A}. In words, δ(A) is the maximum distance between a point
in the convex hull of A and its closest point in A. Note that δ(A) = 0 if and only if the closure of
A is convex. Sometimes δ(A) is referred to as the distance to convexity (of the set A).
As a simple example, suppose n = 1 and A = {−1, 1}, so conv A = [−1, 1]. We have δ(A) = 1; the
point 0 is the point in conv A farthest from A (with distance 1).
Now we get to the Shapley-Folkman theorem. Let C ⊆ Rn , and define
Sk = (1/k)(C + · · · + C),
where the (set) sum involves k copies of C. You can think of Sk as the average (in the set
sense) of k copies of C; elements of Sk consist of averages of k elements of C. We observe that
conv Sk = conv C, i.e., Sk has the same convex hull as C. The Shapley-Folkman theorem states
that
lim_{k→∞} δ(Sk ) = 0.
You can think of the Shapley-Folkman theorem as a kind of central limit theorem for sets; roughly
speaking, averages of k copies of a non-convex set become convex in the limit. It is not too hard
to prove the Shapley-Folkman theorem, but we won’t do that in this exercise.
(a) Consider the specific case C = {−1, 1} ⊂ R. Find S2 and S3 , and then work out what Sk
is, and evaluate δ(Sk ). Verify that δ(Sk ) → 0 as k → ∞. Draw a picture that shows Sk for
k = 4, its convex hull (which is [−1, 1]), and show a point in conv S4 that is farthest from S4 .
(b) Repeat for C = [−1, −1/2] ∪ [1/2, 1].
Note. We are not asking for formal arguments for your expressions for Sk .
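If you want to check your expressions numerically, the following sketch (not part of the exercise; the helper names S_k and delta are ours) estimates δ(Sk ) for C = {−1, 1} on a fine grid over the convex hull.

```python
import itertools

import numpy as np

def S_k(C, k):
    """The set S_k = (1/k)(C + ... + C) for a finite C in R, as a sorted array."""
    sums = {sum(combo) for combo in itertools.product(C, repeat=k)}
    return np.array(sorted(s / k for s in sums))

def delta(points, lo, hi, grid=10001):
    """Approximate the distance to convexity: the largest distance from a point
    of conv A = [lo, hi] to the nearest point of A (A given as a 1-D array)."""
    u = np.linspace(lo, hi, grid)
    return np.min(np.abs(u[:, None] - points[None, :]), axis=1).max()

C = (-1.0, 1.0)
deltas = [delta(S_k(C, k), -1.0, 1.0) for k in range(1, 9)]
print([round(d, 4) for d in deltas])  # approximately 1, 1/2, 1/3, ..., 1/8
```

The printed values decrease toward 0, consistent with the Shapley-Folkman theorem.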
2.4 Let S = {α ∈ R3 | α1 + α2 e−t + α3 e−2t ≤ 1.1 for t ≥ 1}. Is S affine, a halfspace, a convex cone, a
convex set, or none of these? (For each one, you can respond true or false.)
2.5 Convex and conic hull. Let C = {(1, 0), (1, 1), (−1, −1), (0, 0)}. Are the following statements true
or false?
(c) (0, 1/3) is in the conic hull of C.
2.7 A set of matrices. Let C = {A ∈ Sn | xT Ax ≥ 0 for all x ⪰ 1}. Are the following statements true
or false?
2.9 Define the square S = {x ∈ R2 | 0 ≤ xi ≤ 1, i = 1, 2}, and the disk D = {x ∈ R2 | ∥x∥2 ≤ 1}. Are
the following statements true or false?
(a) S ∩ D is convex.
(b) S ∪ D is convex.
(c) S \ D is convex.
p(t) = a1 + a2 t + · · · + ak t^{k−1} ,
convex?
with domains dom ϕ = {x | c^T x + d > 0}, dom ψ = {y | g^T y + h > 0}. We associate with ϕ and
ψ the matrices
⎡ A    b ⎤        ⎡ E    f ⎤
⎣ c^T  d ⎦ ,      ⎣ g^T  h ⎦
respectively.
Now consider the composition Γ of ψ and ϕ, i.e., Γ(x) = ψ(ϕ(x)), with domain
dom Γ = {x ∈ dom ϕ | ϕ(x) ∈ dom ψ}.
Show that Γ is linear-fractional, and that the matrix associated with it is the product
⎡ E    f ⎤ ⎡ A    b ⎤
⎣ g^T  h ⎦ ⎣ c^T  d ⎦ .
conv S ∩ conv T ̸= ∅.
The result in (b) is easily rephrased in a more general form, known as Helly’s theorem. Let S1 , . . . ,
Sm be a collection of m convex sets in Rn . Suppose the intersection of every k ≤ n + 1 sets from
the collection is nonempty. Then the intersection of all sets S1 , . . . , Sm is nonempty.
2.13 Portfolio weight constraints. We consider a portfolio of n assets specified by a holdings vector
h ∈ Rn+ , with hi denoting the USD value invested or held in asset i, for i = 1, . . . , n. The total
portfolio value (in USD) is 1T h, which we assume is positive; since h ⪰ 0, this is the same as h ̸= 0.
The portfolio weights are defined as w = h/1T h, so wi is the fraction of the total portfolio value that is
held in asset i. Since h ⪰ 0, we have w ⪰ 0 and 1T w = 1, i.e., w lies in the probability simplex.
Constraints on the portfolio are typically specified in terms of the weights, as a convex set W ⊆ Rn .
The associated holdings constraint set is
H = {h ∈ Rn+ | h ̸= 0, h/(1T h) ∈ W }.
Is H convex? If so, briefly explain why. If not, give a simple and specific counterexample, i.e., a
specific convex weight set W and the associated nonconvex holdings set H.
2.14 Generalized inequality. Let K = {(x1 , x2 ) | 0 ≤ x1 ≤ x2 }. Are the following statements true or
false?
2.15 Minimal and minimum elements. Consider the set S = {(0, 2), (1, 1), (2, 3), (1, 2), (4, 0)}. Are
the following statements true or false?
Here, minimum and minimal are with respect to the nonnegative orthant K = R2+ .
2.17 Dual cones in R2 . Describe the dual cone for each of the following cones.
(a) K = {0}.
(b) K = R2 .
(c) K = {(x1 , x2 ) | |x1 | ≤ x2 }.
(d) K = {(x1 , x2 ) | x1 + x2 = 0}.
{(x, y, z) | x ≤ 0, y = 0, z ≥ 0}.
(This makes no difference, since the dual of a cone is equal to the dual of its closure.)
2.19 Dual of intersection of cones. Let C and D be closed convex cones in Rn . In this problem we will
show that
(C ∩ D)∗ = C ∗ + D∗
when C ∗ + D∗ is closed. Here, + denotes set addition, i.e., C ∗ + D∗ is the set {u + v | u ∈ C ∗ , v ∈
D∗ }. In other words, the dual of the intersection of two closed convex cones is the sum of the
dual cones. (A sufficient condition for C ∗ + D∗ to be closed is that C ∩ int D ̸= ∅. The general
statement is that (C ∩ D)∗ = cl(C ∗ + D∗ ), and that the closure is unnecessary if C ∩ int D ̸= ∅,
but we won’t ask you to show this.)
(C ∩ D)∗ ⊆ C ∗ + D∗ ⇐⇒ C ∩ D ⊇ (C ∗ + D∗ )∗ .
V ∗ = {AT v | v ⪰ 0}.
3 Convex functions
3.1 Curvature of some functions. Determine the curvature of the functions below. Your responses
can be: affine, convex, concave, and none (meaning, neither convex nor concave). Give a brief
justification.
(a) f (x) = min{2, x, √x}, with dom f = R+ .
(b) f (x) = x^3 , with dom f = R.
(c) f (x) = x^3 , with dom f = R++ .
(d) f (x, y) = x min{y, 2}, with dom f = R2+ .
(e) f (x, y) = (√x + √y)^2 , with dom f = R2+ .
(f) f (θ) = log det θ − tr(Sθ), with dom f = Sn++ , and S ≻ 0.
3.2 Convex and concave functions. Determine the curvature of each of the functions f below (i.e.,
convex, concave, or neither). Give a brief justification.
3.3 Battery energy. The state of a chemical battery (such as a Lithium ion battery) is characterized
by its terminal voltage v ≥ 0 (in V, volts) and its charge q ≥ 0 (in C, coulombs). These are related
by v = ϕ(q), where ϕ : R → R is an increasing differentiable function with ϕ(0) = 0.
The energy (in J, Joules) stored in the battery is a function of its charge q, given by
E(q) = ∫_0^q ϕ(r) dr.
What can you say about the function E, with no further information? Is E convex, concave, or
neither? Briefly justify your answer.
Remarks.
• You don’t need to know any electrical or chemical engineering to answer this question.
• In EE dialect, the battery model above is a nonlinear capacitor.
3.4 Coordinate convexity. Consider a function f : R2 → R. Suppose that for each x ∈ R, f (x, y) is a
convex function of y, and for each y ∈ R, f (x, y) is a convex function of x. (You might call such a
function coordinate convex.)
Is f convex? If so, give a very brief justification. If not, give a simple counter-example, i.e., a
specific function f that satisfies the conditions above, but isn’t convex.
3.5 Dotsort function. For x ∈ Rn , we let S(x) ∈ Rn denote the entries of x sorted in decreasing
order, i.e., S(x) = (x[1] , x[2] , . . . , x[n] ), where x[i] is the ith largest entry of x. We define the dotsort
function f : Rn × Rn → R as f (x, y) = S(x)T S(y). The function f is called the dotsort function
since it is the dot product of its vector arguments, sorted. Show that f is convex in x for fixed y.
(It’s also convex in y with x fixed, which means the function is bi-convex.)
Hint. You can use without proof the so-called re-arrangement inequality, which states that for any
a, b ∈ Rn , aT b ≤ S(a)T S(b). In words: the maximum value of the inner product of two vectors,
over all permutations of the entries, is obtained when both are sorted.
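A quick numerical check of the claim (no substitute for the proof): with y fixed, the dotsort function should satisfy the midpoint convexity inequality in x on random samples. The helper name dotsort is ours.

```python
import numpy as np

def dotsort(x, y):
    """f(x, y) = S(x)^T S(y): inner product of the two vectors sorted decreasingly."""
    return np.dot(np.sort(x)[::-1], np.sort(y)[::-1])

rng = np.random.default_rng(0)
n = 6
for _ in range(1000):
    y = rng.normal(size=n)
    x1, x2 = rng.normal(size=n), rng.normal(size=n)
    # midpoint convexity in x, with y held fixed
    lhs = dotsort((x1 + x2) / 2, y)
    rhs = (dotsort(x1, y) + dotsort(x2, y)) / 2
    assert lhs <= rhs + 1e-9
```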
3.6 Curvature of some order statistics. For x ∈ Rn , with n > 1, x[k] denotes the kth largest entry
of x, for k = 1, . . . , n, so, for example, x[1] = maxi=1,...,n xi and x[n] = mini=1,...,n xi . Functions
that depend on these sorted values are called order statistics or order functions. Determine the
curvature of the order statistics below, from the choices convex, concave, or neither. For each
function, explain why the function has the curvature you claim. If you say it is neither convex nor
concave, give a counterexample showing it is not convex, and a counterexample showing it is not
concave. All functions below have domain Rn .
Remark. For the functions defined in (d)–(f), you might find slightly different definitions in the
literature. Please use the formulas above to answer each question.
3.7 Determinant characterization of convex function. Show that a function f : R → R is convex if and
only if dom f is convex and
      ⎡  1     1     1   ⎤
  det ⎢  x     y     z   ⎥ ≥ 0
      ⎣ f(x)  f(y)  f(z) ⎦
for all x, y, z ∈ dom f with x < y < z.
3.8 Reverse Jensen inequality. Suppose f is convex, λ1 > 0, λi ≤ 0 for i = 2, . . . , n, and λ1 + · · · + λn = 1,
and let x1 , . . . , xn ∈ dom f . Show that the inequality
f (λ1 x1 + · · · + λn xn ) ≥ λ1 f (x1 ) + · · · + λn f (xn )
always holds. Hints. Draw a picture for the n = 2 case first. For the general case, express x1 as a
convex combination of λ1 x1 + · · · + λn xn and x2 , . . . , xn , and use Jensen’s inequality.
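The inequality can be spot-checked numerically before proving it. Here is a sketch with f = |·| (any convex f works) and randomly generated coefficients satisfying the hypotheses.

```python
import numpy as np

rng = np.random.default_rng(1)
f = np.abs    # an example convex f : R -> R

for _ in range(1000):
    n = 4
    neg = -rng.uniform(0.0, 1.0, size=n - 1)        # lam_i <= 0 for i >= 2
    lam = np.concatenate(([1.0 - neg.sum()], neg))  # lam_1 > 0 and sum(lam) = 1
    x = rng.normal(size=n)
    # reverse Jensen: f(sum lam_i x_i) >= sum lam_i f(x_i)
    assert f(lam @ x) >= lam @ f(x) - 1e-9
```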
3.9 Convexity of nonsymmetric matrix fractional function. Consider the function f : Rn×n × Rn → R,
defined by
f (X, y) = y T X −1 y, dom f = {(X, y) | X + X T ≻ 0}.
When this function is restricted to X ∈ Sn , it is convex.
Is f convex? If so, prove it. If not, give a (simple) counterexample.
3.10 Convexity of some sets. Determine if each set is convex.
(a) {P ∈ Rn×n | xT P x ≥ 0 for all x ⪰ 0}.
(b) {(c0 , c1 , c2 ) ∈ R3 | c0 = 1, |c0 + c1 t + c2 t^2 | ≤ 1 for all − 1 ≤ t ≤ 1}.
(c) {(u, v) ∈ R2 | cos(u + v) ≥ √2/2, u^2 + v^2 ≤ π^2 /4}. Hint: cos(π/4) = √2/2.
(d) {x ∈ Rn | x^T A^{−1} x ≥ 0}, where A ≺ 0.
3.11 Maximum of a convex function over a polyhedron. Show that the maximum of a convex function f
over the polyhedron P = conv{v1 , . . . , vk } is achieved at one of its vertices, i.e.,
sup_{x∈P} f (x) = max_{i=1,...,k} f (vi ).
(A stronger statement is: the maximum of a convex function over a closed bounded convex set is
achieved at an extreme point, i.e., a point in the set that is not a convex combination of any other
points in the set.) Hint. Assume the statement is false, and use Jensen’s inequality.
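A numerical illustration (not a proof): sample random points of P = conv{v1 , . . . , v5 } and check that an example convex f never exceeds its maximum over the vertices.

```python
import numpy as np

rng = np.random.default_rng(2)
f = lambda x: np.linalg.norm(x - np.array([0.3, -0.7]))  # an example convex f

V = rng.normal(size=(5, 2))            # vertices v_1, ..., v_5 in R^2
vert_max = max(f(v) for v in V)        # max of f over the vertices

for _ in range(2000):
    theta = rng.dirichlet(np.ones(5))  # random convex-combination weights
    x = theta @ V                      # a point of P = conv{v_1, ..., v_5}
    assert f(x) <= vert_max + 1e-9
```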
3.12 Show that the function
f (x) = ∥Ax − b∥2^2 / (1 − x^T x), dom f = {x | ∥x∥2 < 1},
is convex.
3.13 Convex hull of functions. Suppose g and h are convex functions, bounded below, with dom g =
dom h = Rn . The convex hull function of g and h is defined as
f (x) = inf {θg(y) + (1 − θ)h(z) | θy + (1 − θ)z = x, 0 ≤ θ ≤ 1} ,
where the infimum is over θ, y, z. Show that the convex hull function f is convex. Describe epi f
in terms of epi g and epi h.
3.14 Weighted geometric mean. The geometric mean f (x) = (∏_k x_k)^{1/n} with dom f = Rn++ is concave,
as shown on page 74 of the book. Extend the proof to show that
f (x) = ∏_{k=1}^n x_k^{α_k} , dom f = Rn++
is concave, where αk are nonnegative numbers with Σ_{k=1}^n αk ≤ 1.
3.15 A quadratic-over-linear composition theorem. Suppose that f : Rn → R is nonnegative and convex,
and g : Rn → R is positive and concave. Show that the function f 2 /g, with domain dom f ∩dom g,
is convex.
3.16 Circularly symmetric convex functions. Suppose f : Rn → R is convex and symmetric with respect
to orthogonal transformations, i.e., f (x) depends only on ∥x∥2 . Show that f must have the form
f (x) = ϕ(∥x∥2 ), where ϕ : R → R is nondecreasing and convex, with dom ϕ = R. (Conversely,
any function of this form is symmetric and convex, so this form characterizes such functions.)
3.17 [Roberts and Varberg] Suppose λ1 , . . . , λn are positive. Show that the function f : Rn → R, given
by
f (x) = ∏_{i=1}^n (1 − e^{−x_i})^{λ_i} ,
is concave on
dom f = { x ∈ Rn++ | Σ_{i=1}^n λi e^{−x_i} ≤ 1 }.
(a) For fixed w ∈ Rn++ , what is the curvature of g(x) = f (w, x)? Is it convex, concave, both (i.e.,
affine), or neither?
(b) For fixed x ∈ Rn , what is the curvature of h(w) = f (w, x)? Is it convex, concave, both (i.e.,
affine), or neither?
for x ∈ dom g.
(b) The function
f (x) = max {∥AP x − b∥ | P is a permutation matrix}
with A ∈ Rm×n , b ∈ Rm .
3.20 Let f, g : Rn → R and ϕ : R → R be given functions. Determine if each statement is true or false.
(c) If f, g are concave and positive, then √(f (x)g(x)) is concave.
3.21 Curvature of some functions. Determine the curvature of the functions below. Your responses can
be: affine, convex, concave, and none (meaning, neither convex nor concave).
(a) f (u, v) = uv, with dom f = R2 .
(b) f (x, u, v) = log(v − xT x/u), with dom f = {(x, u, v) | uv > xT x, u > 0}.
(c) the ‘exponential barrier’ for a polyhedron,
f (x) = Σ_{i=1}^m exp( 1/(bi − ai^T x) ),
with dom f = {x | ai^T x < bi , i = 1, . . . , m}.
(a) Mean. E x.
(b) Second moment. E x^2 .
(c) Third moment. E x^3 .
(d) Variance. var(x) = E(x − E x)^2 .
3.23 Distances between probability distributions on a finite set. We describe a probability distribution
on n outcomes as a vector p ∈ Rn+ with 1T p = 1, with pi the probability of event i, for i = 1, . . . , n.
Suppose p and q are two such probability distributions. There are several ways to define a distance
or deviation d(p, q) between p and q. Show that each of the metrics below is a convex function of
(p, q).
(a) Maximum probability distance. The maximum probability distance is defined as
d^mp (p, q) = max { |prob(S; p) − prob(S; q)| | S ⊆ {1, . . . , n} },
where prob(S; p) is the probability of the event S under the distribution p, i.e., prob(S; p) =
Σ_{i∈S} pi . Since there are 2^n subsets of {1, . . . , n}, the maximum above is over 2^n numbers. In
words, d^mp (p, q) is the maximum difference in probability assigned to any set of outcomes by
the distributions p and q.
In addition to showing that d^mp is convex, express it in a simple explicit form involving ∥p−q∥1 .
(b) Hellinger distance. The Hellinger distance is defined as
d^he (p, q) = Σ_{i=1}^n (√p_i − √q_i)^2 .
Remark. There are many others, for example the usual ℓ2 -norm ∥p − q∥2 or the Kullback-Leibler
divergence
d^kl (p, q) = Σ_{i=1}^n pi log(pi /qi ),
which are also convex in (p, q). (See Convex Optimization, Example 3.19.)
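As a sanity check of joint convexity (here for the Hellinger distance), one can test the midpoint inequality on random pairs of distributions:

```python
import numpy as np

rng = np.random.default_rng(4)

def hellinger(p, q):
    """d_he(p, q) = sum_i (sqrt(p_i) - sqrt(q_i))^2."""
    return np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)

n = 4
for _ in range(1000):
    p1, p2, q1, q2 = rng.dirichlet(np.ones(n), size=4)   # random distributions
    # joint (midpoint) convexity in (p, q)
    mid = hellinger((p1 + p2) / 2, (q1 + q2) / 2)
    assert mid <= (hellinger(p1, q1) + hellinger(p2, q2)) / 2 + 1e-9
```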
3.24 Some functions of graph weights. Consider a connected weighted graph G = (V, E), with weights
we ∈ R+ for e ∈ E.
(a) Distance between two sets of vertices. Let S ⊂ V , T ⊂ V be disjoint sets of vertices. The
distance between S and T , denoted dist(S, T ), is defined as the minimum of the sum of edge
weights over any path that starts in S and ends in T .
Considered as a function of the edge weights w ∈ R^{|E|}_+ , is dist(S, T ) convex, concave, or neither
of these?
(b) Optimal value of traveling salesman problem. A tour is a path that includes each vertex in
the graph exactly once. The traveling salesman problem is to find a tour that minimizes total
edge weight along the tour. Its optimal value, denoted T ⋆ , is the minimum of the total edge
weight among all tours.
Considered as a function of the edge weights w ∈ R^{|E|}_+ , is T ⋆ convex, concave, or neither of these?
3.25 Symmetric convex functions of eigenvalues. Suppose f : Rn → R is convex and symmetric, i.e.,
f (P x) = f (x) for any permutation matrix P . In this exercise you will show that the function
f (λ(X)), where λ(X) denotes the vector of eigenvalues of X, is convex on Sn .
(a) A square matrix S is doubly stochastic if its elements are nonnegative and all row sums and
column sums are equal to one. It can be shown that every doubly stochastic matrix is a convex
combination of permutation matrices.
Show that if f is convex and symmetric and S is doubly stochastic, then
f (Sx) ≤ f (x).
(c) Use the results in parts (a) and (b) to show that if f is convex and symmetric and X ∈ Sn ,
then
f (λ(X)) = sup f (diag(V T XV ))
V ∈V
where V is the set of n × n orthogonal matrices. Show that this implies that f (λ(X)) is convex
in X.
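The identity in part (c) can be probed numerically with f = max: for random orthogonal V the value f (diag(V^T XV )) never exceeds f (λ(X)), with equality when V diagonalizes X. This is an illustration, not a proof.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
A = rng.normal(size=(n, n))
X = (A + A.T) / 2                       # a random symmetric matrix
lam_max = np.linalg.eigvalsh(X)[-1]     # with f = max, f(lambda(X)) = lam_max

# f(diag(V^T X V)) <= f(lambda(X)) for every orthogonal V ...
for _ in range(500):
    V, _ = np.linalg.qr(rng.normal(size=(n, n)))   # random orthogonal matrix
    assert np.max(np.diag(V.T @ X @ V)) <= lam_max + 1e-9

# ... with equality when the columns of V are eigenvectors of X
w, U = np.linalg.eigh(X)
assert abs(np.max(np.diag(U.T @ X @ U)) - lam_max) < 1e-9
```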
3.26 Functions of the eigenvalues of symmetric matrices. Use the results of exercise 3.25 or 3.27 to give
one-line proofs of the following.
(a) The maximum eigenvalue λ1 (X) is convex in X ∈ Sn .
(b) The minimum eigenvalue λn (X) is concave in X ∈ Sn .
(c) The trace inverse tr(X −1 ) is convex on Sn++ .
(d) The geometric mean (det X)^{1/n} is concave on Sn++ . (See page 74 of the book.)
(e) The log determinant log det X is concave on Sn++ .
(f) The sum of the k largest eigenvalues Σ_{i=1}^k λi (X) is convex in X ∈ Sn .
(g) All Ky Fan p-norms, defined by the usual p ≥ 1 norm on the eigenvalues, ∥X∥ = ∥λ(X)∥p , are
convex in X.
3.27 Symmetric convex matrix functions. We call a function f : Rn → R symmetric if f (x) = f (P x)
for all permutation matrices P , i.e., matrices P that satisfy P ∈ {0, 1}n×n , P 1 = 1, and P T 1 = 1.
We call a function f : Sn → R unitarily invariant if for all orthogonal Q ∈ Rn×n , i.e., QT Q = I,
we have f (X) = f (QXQT ). For a matrix X ∈ Sn , let λ(X) ∈ Rn be the vector of its eigenvalues.
In this exercise you will prove the following result: if F : Sn → R is convex and unitarily invariant,
there is a symmetric f : Rn → R such that F (X) = f (λ(X)). Conversely, if f : Rn → R is
symmetric and convex, then the function F (X) = f (λ(X)) is convex and unitarily invariant. (See
also Exercise 3.25 for a different proof.)
(a) Show that F : Sn → R is unitarily invariant if and only if there exists a symmetric f : Rn → R
such that F (X) = f (λ(X)).
(b) For any symmetric function g : Rn → R, define the matricization gsy (X) = g(λ(X)). Let
f : Rn → R be convex and symmetric. Use the result of Exercise 12.3 (Von Neumann’s trace
inequality) to show that the convex conjugate of fsy is the matricization of f ∗ , that is,
(fsy )∗ (Y ) = sup{tr(XY ) − fsy (X) | X ∈ Sn } = (f ∗ )sy (Y ).
(c) Use the result of exercise 3.39(d) in the book to show that if f : Rn → R is closed convex and
symmetric, then fsy (X) = f (λ(X)) is also closed convex and unitarily invariant.
(d) Show that if F : Sn → R is closed convex and unitarily invariant, then F (X) = f (λ(X)) for
some symmetric convex f . Hint. This is the easy part. Consider diagonal matrices.
(e) A subgradient of a convex function f : Rn → R at the point x ∈ dom f is a vector g ∈ Rn
such that f (y) ≥ f (x) + g T (y − x) for all y ∈ Rn . Show that if f is symmetric and convex,
then
fsy (Y ) ≥ fsy (X) + tr(G(Y − X))
for all matrices G ∈ Sn of the form G = U diag(g)U T , where g ∈ ∂f (x) and X = U diag(x)U T .
That is, the subgradients of f determine the subgradients of fsy .
3.28 Cube of convex and concave functions. For each of the following, determine if the function f is
convex, concave, or neither. ‘Convex’ means that f must be convex, with no further assumptions.
3.29 Square and reciprocal of convex and concave functions. For each of the following, determine if the
function f is convex, concave, or neither.
3.30 Curvature of functions. Are the following functions affine, convex, concave, quasi-convex, quasi-
linear, or none of these?
3.31 Square of sum of square roots. You know that the square root of the sum of squares of the entries
of a vector is a convex function: its Euclidean norm. Here we consider a similar function, with the
roles reversed: the square of the sum of the square roots,
f (x) = (√x1 + · · · + √xn )^2 , dom f = Rn+ .
3.32 A general vector composition rule. Consider the composition
f (x) = h(g1 (x), g2 (x), . . . , gk (x)),
where h : Rk → R is convex, and gi : Rn → R. Suppose that for each i, one of the following holds:
• h is nondecreasing in its ith argument, and gi is convex;
• h is nonincreasing in its ith argument, and gi is concave;
• gi is affine.
Show that f is convex. This composition rule subsumes all the ones given in the book, and is
the one used in software systems that are based on disciplined convex programming (DCP) such
as CVX*. You can assume that dom h = Rk ; the result also holds in the general case when the
monotonicity conditions listed above are imposed on h̃, the extended-valued extension of h.
3.33 Continued fraction function. Show that the function
f (x) = 1/(x1 − 1/(x2 − 1/(x3 − 1/x4 ))),
defined where every denominator is positive, is convex and decreasing. (There is nothing special
about n = 4 here; the same holds for any number of variables.)
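A numerical spot check of both claims (convexity and monotonicity), on a box where all denominators are safely positive; cont_frac is our helper name.

```python
import numpy as np

def cont_frac(x):
    """f(x) = 1/(x1 - 1/(x2 - 1/(x3 - 1/x4))); assumes all denominators > 0."""
    acc = 0.0
    for xi in reversed(x):
        acc = 1.0 / (xi - acc)
    return acc

rng = np.random.default_rng(6)
for _ in range(1000):
    x1, x2 = rng.uniform(2.0, 4.0, size=(2, 4))   # every denominator exceeds 1 here
    # midpoint convexity
    assert cont_frac((x1 + x2) / 2) <= (cont_frac(x1) + cont_frac(x2)) / 2 + 1e-9
    # decreasing: increasing coordinates can only decrease f
    assert cont_frac(np.maximum(x1, x2)) <= cont_frac(x1) + 1e-9
```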
3.34 Convexity properties of quadratic norm. Consider the function f (x, P ) = (xT P x)1/2 , where x ∈ Rn
and P ∈ Sn++ .
(a) Suppose we fix P and consider f as a function of x. Is it convex, concave, affine, or neither?
(b) Suppose we fix x and consider f a function of P . Is it convex, concave, affine, or neither?
3.35 Explain why the following functions are convex. In each problem, x is an n-vector.
3.36 Pre-composition with a linear fractional mapping. Suppose f : Rm → R is convex, and A ∈ Rm×n ,
b ∈ Rm , c ∈ Rn , and d ∈ R. Show that g : Rn → R, defined by
g(x) = f ((Ax + b)/(c^T x + d)), dom g = {x | c^T x + d > 0},
is convex.
3.37 Logarithmic barrier for the second-order cone. The function f (x, t) = − log(t2 −xT x), with dom f =
{(x, t) ∈ Rn × R | t > ∥x∥2 } (i.e., the interior of the second-order cone), is called the logarithmic
barrier function for the second-order cone. There are several ways to show that f is convex,
for example by evaluating the Hessian and demonstrating that it is positive semidefinite. In this
exercise you establish convexity of f using a relatively painless method, leveraging some composition
rules and known convexity of a few other functions.
(a) Explain why t − (1/t)x^T x is a concave function on dom f . Hint. Use convexity of the quadratic-
over-linear function.
(b) From this, show that − log(t − (1/t)x^T x) is a convex function on dom f .
(c) From this, show that f is convex.
3.38 Circularly symmetric Huber function. The scalar Huber function is defined as
fhub (x) = (1/2)x^2 for |x| ≤ 1, and fhub (x) = |x| − 1/2 for |x| > 1.
This convex function comes up in several applications, including robust estimation. This prob-
lem concerns generalizations of the Huber function to Rn . One generalization to Rn is given
by fhub (x1 ) + · · · + fhub (xn ), but this function is not circularly symmetric, i.e., invariant under
transformation of x by an orthogonal matrix. A generalization to Rn that is circularly symmetric
is
fcshub (x) = fhub (∥x∥2 ) = (1/2)∥x∥2^2 for ∥x∥2 ≤ 1, and ∥x∥2 − 1/2 for ∥x∥2 > 1.
(The subscript stands for ‘circularly symmetric Huber function’.) Show that fcshub is convex. Find
the conjugate function (fcshub )∗ .
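Before computing the conjugate, it may help to have a numerical implementation. The sketch below (helper names ours) checks midpoint convexity and invariance under an orthogonal change of variables.

```python
import numpy as np

def huber_scalar(x):
    """The scalar Huber function f_hub."""
    x = abs(x)
    return 0.5 * x ** 2 if x <= 1 else x - 0.5

def cshub(x):
    """f_cshub(x) = f_hub(||x||_2)."""
    return huber_scalar(np.linalg.norm(x))

rng = np.random.default_rng(7)
for _ in range(1000):
    x1, x2 = 2.0 * rng.normal(size=(2, 3))
    # midpoint convexity
    assert cshub((x1 + x2) / 2) <= (cshub(x1) + cshub(x2)) / 2 + 1e-9
    # circular symmetry: reversing and negating coordinates is orthogonal
    assert abs(cshub(x1) - cshub(-x1[::-1])) < 1e-12
```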
3.39 Monotone extension of a convex function. Suppose f : Rn → R is convex. Recall that a function
h : Rn → R is monotone nondecreasing if h(x) ≥ h(y) whenever x ⪰ y. The monotone extension
of f is defined as
g(x) = inf_{z⪰0} f (x + z).
(We will assume that g(x) > −∞.) Show that g is convex and monotone nondecreasing, and
satisfies g(x) ≤ f (x) for all x. Show that if h is any other convex function that satisfies these
properties, then h(x) ≤ g(x) for all x. Thus, g is the maximum convex monotone underestimator
of f .
Remark. For simple functions (say, on R) it is easy to work out what g is, given f . On Rn , it
can be very difficult to work out an explicit expression for g. However, systems such as CVX* can
immediately handle functions such as g, defined by partial minimization.
3.40 Robust regression loss. Let θ ∈ Rn denote the parameters in a statistical model. We have training
data x1 , . . . , xN ∈ Rd , and a loss function ℓ(θ; x) which is convex in θ for any x. (Smaller values of
ℓ(θ; x) mean a better fit.) The loss on the training data is given by
L(θ) = Σ_{i=1}^N ℓ(θ; xi ).
Suppose that k = 0.9N is an integer (i.e., N is a multiple of 10). The robust loss is
Lrob (θ) = ϕ(ℓ(θ; x1 ), . . . , ℓ(θ; xN )),
where ϕ is the sum of the largest k entries of its argument. The robust loss is the sum of the losses
on the worst 90% of the training data.
Is Lrob a convex function of θ?
Remarks. We’ll cover these topics later. You don’t need to know anything about statistics or
machine learning to answer this question.
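You can form a conjecture experimentally before answering. The sketch below builds a robust loss for toy scalar quadratic losses (ϕ implemented by sorting; all names ours) and tests the midpoint inequality on random pairs.

```python
import numpy as np

def phi(v, k):
    """Sum of the k largest entries of v."""
    return np.sort(v)[-k:].sum()

# toy setup: theta in R, convex losses ell(theta; x_i) = (theta - x_i)^2
rng = np.random.default_rng(8)
xs = rng.normal(size=10)   # N = 10 training points
k = 9                      # k = 0.9 N

def L_rob(theta):
    return phi((theta - xs) ** 2, k)

for _ in range(1000):
    t1, t2 = 3.0 * rng.normal(size=2)
    assert L_rob((t1 + t2) / 2) <= (L_rob(t1) + L_rob(t2)) / 2 + 1e-9
```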
3.41 Monotonicity of the extended-value function is necessary in the composition rule. Consider the
composition f = h ◦ g,
We consider the specific function h : R → R, defined as h(u) = u, with dom h = R+ . Find a
specific convex function g : R → R, with dom g = R, for which f is not convex. Briefly (in one or
two sentences), justify your answer.
(a) The difference between the maximum and minimum value of a polynomial on a given interval
[a, b], as a function of its coefficients:
3.44 Convexity of products of powers. This problem concerns the product of powers function f : Rn++ →
R given by f (x) = x1^{θ1} · · · xn^{θn} , where θ ∈ Rn is a vector of powers. We are interested in finding
values of θ for which f is convex or concave. You already know a few, for example when n = 2 and
θ = (2, −1), f is convex (the quadratic-over-linear function), and when θ = (1/n)1, f is concave
(the geometric mean). Of course, if n = 1, f is convex when θ ≥ 1 or θ ≤ 0, and concave when
0 ≤ θ ≤ 1.
Show each of the statements below. We will not read long or complicated proofs, or ones that
involve Hessians. We are looking for short, snappy ones, that (where possible) use composition
rules, perspective, partial minimization, or other operations, together with known convex or concave
functions, such as the ones listed in the previous paragraph. Feel free to use the results of earlier
statements in later ones.
(a) When n = 2, θ ⪰ 0, and 1T θ = 1, f is concave. (This function is called the weighted geometric
mean.)
(b) When θ ⪰ 0 and 1T θ = 1, f is concave. (This is the same as part (a), but here it is for general
n.)
(c) When θ ⪰ 0 and 1T θ ≤ 1, f is concave.
(d) When θ ⪯ 0, f is convex.
(e) When 1T θ = 1 and exactly one of the elements of θ is positive, f is convex.
(f) When 1T θ ≥ 1 and exactly one of the elements of θ is positive, f is convex.
Remark. Parts (c), (d), and (f) exactly characterize the cases when f is either convex or concave.
That is, if none of these conditions on θ hold, f is neither convex nor concave. Your teaching staff
has, however, kindly refrained from asking you to show this.
3.45 Fuel use as function of distance and speed. A vehicle uses fuel at a rate f (s), which is a function
of the vehicle speed s. We assume that f : R → R is a positive increasing convex function, with
dom f = R+ . The physical units of s are m/s (meters per second), and the physical units of f (s)
are kg/s (kilograms per second).
(a) Let g(d, t) be the total fuel used (in kg) when the vehicle moves a distance d ≥ 0 (in meters)
in time t > 0 (in seconds) at a constant speed. Show that g is convex.
(b) Let h(d) be the minimum fuel used (in kg) to move a distance d (in m) at a constant speed s
(in m/s). Show that h is convex.
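For part (a), the total fuel at constant speed s = d/t is g(d, t) = t f (d/t), directly from the definitions. Here is a numerical check of joint convexity with the example rate f (s) = 1 + s^2 (our choice, satisfying the stated assumptions on f ):

```python
import numpy as np

f = lambda s: 1.0 + s ** 2   # example rate: positive, increasing, convex on s >= 0

def g(d, t):
    """Total fuel (kg) to cover distance d in time t at the constant speed s = d/t."""
    return t * f(d / t)

rng = np.random.default_rng(9)
for _ in range(1000):
    d1, d2 = rng.uniform(0.0, 10.0, size=2)
    t1, t2 = rng.uniform(0.5, 10.0, size=2)
    # midpoint convexity in (d, t)
    mid = g((d1 + d2) / 2, (t1 + t2) / 2)
    assert mid <= (g(d1, t1) + g(d2, t2)) / 2 + 1e-9
```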
3.46 Perspective of the perspective. In this exercise we explore what happens when you take the per-
spective of a convex function twice. Suppose f : Rn → R is convex. Let g : Rn × R++ → R be
its perspective function, which is also convex. Let h : Rn × R++ × R++ → R be the perspective
function of g, which is also convex. What can you say about h? Be as specific as you can be. For
example, is h related to f or g in some simple way?
3.47 Perspective of log determinant. Show that f (X, t) = nt log t−t log det X, with dom f = Sn++ ×R++ ,
is convex in (X, t). Use this to show that
3.48 A perspective composition rule [Maréchal]. Let f : Rn → R be a convex function with f (0) ≤ 0.
(a) Show that the perspective tf (x/t), with domain {(x, t) | t > 0, x/t ∈ dom f }, is nonincreasing
as a function of t.
(b) Let g be concave and positive on its domain. Show that the function
h(x) = g(x) f (x/g(x)), dom h = {x ∈ dom g | x/g(x) ∈ dom f },
is convex.
(c) As an example, show that
h(x) = x^T x / (∏_{k=1}^n xk )^{1/n} , dom h = Rn++ ,
is convex.
3.49 Show that the function
f (X) = λ1 (X)^{α+1} / λn (X)^α , dom f = Sn++ ,
is convex for α ≥ 0, where λ1 (X) is the largest eigenvalue of X and λn (X) the smallest eigenvalue.
Hint. Show that the epigraph of f is a convex set. Note that f (X) ≤ t if and only if X ≻ 0,
t > 0, and λ1 (X)^{α+1} /t ≤ λn (X)^α .
3.50 DCP rules. The function f (x, y) = −1/(xy) with dom f = R2++ is concave. Briefly explain how to
represent it, using disciplined convex programming (DCP), limited to the atoms 1/u, √(uv), √v, u^2 ,
u^2 /v, addition, subtraction, and scalar multiplication. Justify any statement about the curvature,
monotonicity, or other properties of the functions you use. Assume these atoms take their usual
domains (e.g., √u has domain u ≥ 0), and that DCP is sign-sensitive (e.g., u^2 /v is increasing in u
when u ≥ 0).
3.51 DCP rules. The function f (x, y) = √(1 + x^4 /y), with dom f = R × R++ , is convex. Use disciplined
convex programming (DCP) to express f so that it is DCP convex. You can use any of the following
atoms
You may also use addition, subtraction, scalar multiplication, and any constant functions. Assume
that DCP is sign-sensitive, e.g., square(u) is known to be increasing in u for u ≥ 0.
3.52 DCP compliance. Determine if each expression below is (sign-sensitive) DCP compliant, and if it
is, state whether it is affine, convex, or concave.
3.53 Inverse of product. The function f (x, y) = 1/(xy) with x, y ∈ R, dom f = R2++ , is convex. How do
we represent it using disciplined convex programming (DCP), and the functions 1/u, √(uv), √u, u^2 ,
u^2 /v, addition, subtraction, and scalar multiplication? (These functions have the obvious domains,
and you can assume a sign-sensitive version of DCP, e.g., u^2 /v increasing in u for u ≥ 0.) Hint.
There are several ways to represent f using the atoms given above.
3.54 DCP representation of inverse product. The function f (x, y) = 1/(xy), with dom f = R2++ , is convex.
CVXPY includes an atom for it, called inv_prod(). Here we ask you to implement or express this
function using other atoms and of course the DCP rules. The atoms you can use are
as well as affine operations like sum, difference, matrix multiply, slicing, and stacking. You can
assume sign-dependent monotonicity, e.g., square is known to be decreasing if its argument is
nonpositive.
3.55 DCP representation of cube-over-linear function. The function f(x, y) = x^3/y, with dom f =
R_+ × R_{++}, is convex. Here we ask you to implement or express this function using a restricted list
of atoms and of course the DCP rules. The atoms you can use are
as well as affine operations like sum, difference, matrix multiply, slicing, and stacking. You can
assume sign-dependent monotonicity, e.g., square is known to be decreasing if its argument is
nonpositive.
3.56 Disciplined convex programming. For each of the following functions, provide an equivalent DCP
expression. The atoms you are allowed to use are square, inv_pos, sqrt, quad_over_lin, geo_mean,
norm1, log_det, norm2, norm_inf, pos, max, min, log, exp, log_sum_exp, power, abs, inv_prod, as
well as constants, and affine operations like sum, difference, matrix multiply, slicing, and stacking.
(We accept pseudocode that uses the atoms above; your solutions do not have to use the exact
syntax of Python or dcp.stanford.edu.)
You can assume sign-dependent monotonicity, e.g., square is known to be decreasing if its argument
is nonpositive. You do not need to justify your answer.
(a) f(x, y) = (√x − √y)^2, with dom f = R^2_{++}.
(b) f(x, y) = log(1 + e^{max(x,y)}), with dom f = R^2.
(c) f(x, y) = x^3 y^{−1/2}, with dom f = R^2_{++}.
(d) f(x) = x^2 for x ≥ 0, and f(x) = −x for x < 0, with dom f = R.
3.57 CVX implementation of a concave function. Consider the concave function f : R → R defined by
f(x) = (x + 1)/2 for x > 1, and f(x) = √x for 0 ≤ x ≤ 1,
with dom f = R_+. Give a CVX implementation of f, via a partially specified optimization problem.
Check your implementation by maximizing f(x) + f(a − x) for several interesting values of a (say,
a = −1, a = 1, and a = 3).
3.58 Conjugate of pinball loss function. The pinball loss function f : R → R has the form
f(x) = −ax for x ≤ 0, and f(x) = (1 − a)x for x > 0,
where a ∈ [0, 1] is a parameter. (The pinball loss is used for quantile regression, but that's not
relevant for this problem.)
What is the conjugate of the pinball loss? That is, what is f ∗ (y)? Be sure to specify its domain if
it is not all of R.
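A derived expression for f* can be sanity-checked numerically by approximating the supremum in the definition of the conjugate on a finite grid. The Python sketch below is not part of the exercise; the parameter value a = 0.3 and the sample points are arbitrary choices.

```python
import numpy as np

a = 0.3                                   # arbitrary parameter value in [0, 1]

def f(x):
    # pinball loss
    return np.where(x <= 0, -a * x, (1 - a) * x)

def f_conj(y, grid):
    # f*(y) = sup_x (y x - f(x)), approximated over a finite grid
    return float(np.max(y * grid - f(grid)))

grid = np.linspace(-100, 100, 200001)
# Evaluate the approximate conjugate at a few points and compare with
# whatever closed-form expression you derived.
print(f_conj(0.0, grid))   # stays bounded as the grid radius grows
print(f_conj(2.0, grid))   # grows with the grid radius
```

Points y at which the grid value keeps growing as the grid radius increases lie outside dom f*.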
3.59 Conjugate of composition of convex and linear function. Suppose A ∈ R^{m×n} with rank A = m,
and g is defined as g(x) = f(Ax), where f : R^m → R is convex. Show that
g*(y) = f*(A† y) for y ∈ R(A^T), with g*(y) = ∞ for y ∉ R(A^T),
where A† = (AA^T)^{−1}A is the pseudo-inverse of A. (This generalizes the formula given on page 95
for the case when A is square and invertible.)
3.60 Majorization. Define C as the set of all permutations of a given n-vector a, i.e., the set of vectors
(aπ1 , aπ2 , . . . , aπn ) where (π1 , π2 , . . . , πn ) is one of the n! permutations of (1, 2, . . . , n).
where s_k denotes the function s_k(x) = x_[1] + x_[2] + · · · + x_[k] (the sum of the k largest entries
of x). When these inequalities hold, we say the vector a majorizes the vector x.
(c) Conclude from this that the conjugate of S_C is given by
S_C^*(x) = 0 if x is majorized by a, and S_C^*(x) = +∞ otherwise.
Since S_C^* is the indicator function of the convex hull of C, this establishes the following result:
x is a convex combination of the permutations of a if and only if a majorizes x.
3.61 Infimal convolution. Let f_1, . . . , f_m be convex functions on R^n. Their infimal convolution, denoted
g = f_1 ⋄ · · · ⋄ f_m (several other notations are also used), is defined as
g(x) = inf { f_1(x_1) + · · · + f_m(x_m) | x_1 + · · · + x_m = x },
with the natural domain (i.e., defined by g(x) < ∞). In one simple interpretation, f_i(x_i) is the cost
for the ith firm to produce a mix of products given by xi ; g(x) is then the optimal cost obtained
if the firms can freely exchange products to produce, all together, the mix given by x. (The name
‘convolution’ presumably comes from the observation that if we replace the sum above with the
product, and the infimum above with integration, then we obtain the normal convolution.)
3.62 Huber penalty. The infimal convolution of two functions f and g on R^n is defined as
(f ⋄ g)(x) = inf_y (f(y) + g(x − y))
(see exercise 3.61). Show that the infimal convolution of f(x) = ∥x∥_1 and g(x) = (1/2)∥x∥_2^2, i.e.,
the function
h(x) = inf_y (f(y) + g(x − y)) = inf_y (∥y∥_1 + (1/2)∥x − y∥_2^2),
is the Huber penalty (with threshold 1)
h(x) = Σ_{i=1}^{n} φ(x_i), where φ(u) = u^2/2 for |u| ≤ 1, and φ(u) = |u| − 1/2 for |u| > 1.
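The identity can be spot-checked numerically in one dimension, by approximating the inner infimum on a dense grid. This is an illustrative check, not part of the exercise:

```python
import numpy as np

def phi(u):
    # Huber penalty with threshold 1
    return np.where(np.abs(u) <= 1, u**2 / 2, np.abs(u) - 0.5)

ys = np.linspace(-5.0, 5.0, 100001)       # grid for the inner minimization
for x in np.linspace(-3.0, 3.0, 61):
    # inf_y ( |y| + (1/2)(x - y)^2 ), approximated over the grid
    h = float(np.min(np.abs(ys) + 0.5 * (x - ys)**2))
    assert abs(h - float(phi(x))) < 1e-6  # matches the Huber penalty
```

The same check in R^n separates across coordinates, which is consistent with h being a sum of scalar penalties.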
3.63 Suppose the function h : R → R is convex, nondecreasing, with dom h = R, and h(t) = h(0) for
t ≤ 0.
3.64 Let h : Rn → R be a convex function, nondecreasing in each of its n arguments, and with
domain Rn .
(c) As an example, take n = 1, h(u) = exp(u+ ), and f (x) = exp |x|. Find the conjugates of h and
f , and verify that f ∗ (y) = h∗ (|y|).
3.65 Conjugate of the positive part function. Let f(x) = (x)_+ = max{0, x} for x ∈ R. (This function
has various names, such as the positive part of x, or ReLU for Rectified Linear Unit in the context
of neural networks.) What is the conjugate of f? That is, what is f*(y)? Be sure to specify its
domain.
3.68 Scalar valued linear fractional functions. A function f : Rn → R is called linear fractional if it has
the form f (x) = (aT x + b)/(cT x + d), with dom f = {x | cT x + d > 0}. When is a linear fractional
function convex? When is a linear fractional function quasiconvex?
3.69 For x ∈ R^n, we define f(x) = min{ k | Σ_{i=1}^{k} |x_i| > 1 }, with f(x) = ∞ if Σ_{i=1}^{n} |x_i| ≤ 1. Is f
quasiconvex, quasiconcave, both, or neither?
3.70 Fractional or relative error. The fractional or relative error between two positive numbers u, v is
defined as
E(u, v) = |u − v| / min{u, v}.
Are the statements below true or false?
3.71 Continuous ranking probability loss function. The continuous ranking probability score (CRPS) is a
function that expresses how well a cumulative distribution function (CDF) F fits a single observed
value y ∈ R. The CRPS is given by
∫_{−∞}^{y} F(u)^2 du + ∫_{y}^{∞} (1 − F(u))^2 du.
The CRPS is widely used in weather forecasting. (A more commonly used loss function for a dis-
tribution and an observed value is the negative log-likelihood − log F ′ (y), when F is differentiable.)
Suppose we parametrize F as F(u) = Σ_{j=1}^{m} a_j F_j(u), where F_j : R → R are given basis CDFs, and
a = (a_1, . . . , a_m) are coefficients which satisfy a ⪰ 0, 1^T a = 1 (which ensures that F is a CDF).
Let C : Rm → R denote the CRPS as a function of the coefficients a ∈ Rm , with dom C = {a |
a ⪰ 0, 1T a = 1}. (We consider the basis CDFs Fj and the observed value y ∈ R as fixed.)
Is each of the following statements true or false?
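The function C can be explored numerically. The sketch below uses a hypothetical basis of three Gaussian CDFs (an arbitrary choice, not given in the exercise), evaluates C(a) with a trapezoid rule on a truncated interval, and spot-checks midpoint convexity along one segment of the simplex; since F is affine in a and the integrands are squares, C is in fact a convex quadratic in a.

```python
import numpy as np
from math import erf

mus = [-1.0, 0.0, 2.0]                    # hypothetical basis CDF means
y = 0.5                                   # observed value

def F(u, a):
    # mixture CDF F(u) = sum_j a_j Phi(u - mu_j), Phi the standard normal CDF
    Phi = np.array([0.5 * (1 + erf((u - m) / np.sqrt(2))) for m in mus])
    return float(a @ Phi)

def trapezoid(g, u):
    return float(np.sum((g[1:] + g[:-1]) * np.diff(u) / 2))

def crps(a, lo=-20.0, hi=20.0, n=4000):
    # trapezoid-rule approximation of the two CRPS integrals
    u1, u2 = np.linspace(lo, y, n), np.linspace(y, hi, n)
    g1 = np.array([F(u, a)**2 for u in u1])
    g2 = np.array([(1 - F(u, a))**2 for u in u2])
    return trapezoid(g1, u1) + trapezoid(g2, u2)

a1, a2 = np.array([0.7, 0.2, 0.1]), np.array([0.1, 0.3, 0.6])
# midpoint convexity should hold up to quadrature error
assert crps((a1 + a2) / 2) <= (crps(a1) + crps(a2)) / 2 + 1e-9
```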
(a) f(x) = (x^2 + 2)/(x + 2), with dom f = (−∞, −2), is convex.
(b) f(x) = 1/(1 − x^2), with dom f = (−1, 1), is convex.
(c) f(x) = 1/(1 − x^2), with dom f = (−1, 1), is log-convex.
(d) f (x) = cosh x = (ex + e−x )/2 is convex.
(e) f (x) = cosh x is log-concave.
(f) f (x) = cosh x is log-convex.
3.73 Relations among convexity, quasi-convexity, and log-convexity. Are the following statements true
or false? As usual, ‘true’ means that it holds with no additional assumptions.
3.74 Functions of a random variable with log-concave density. Suppose the random variable X on Rn
has log-concave density, and let Y = g(X), where g : Rn → R. For each of the following statements,
either give a counterexample, or show that the statement is true.
(b) If g is convex, then prob(Y ≤ a) is a log-concave function of a.
(c) If g is concave, then E ((Y − a)+ ) is a convex and log-concave function of a. (This quantity is
called the tail expectation of Y ; you can assume it exists. We define (s)+ as (s)+ = max{s, 0}.)
3.75 CDF of the maximum of a vector random variable with log-concave density. Let X be an Rn -valued
random variable, with log-concave probability density function p. Define the scalar random variable
Y = maxi Xi , which has cumulative distribution function ϕ(a) = prob(Y ≤ a). Determine whether
ϕ must be a log-concave function, given only the assumptions above. If it must be log-concave, give
a brief justification. Otherwise, provide a (very) simple counterexample. (We will deduct points for
overly complicated solutions.) Please note. The coordinates Xi need not be independent random
variables.
3.76 Functions of a log-concave scalar random variable. Suppose the random variable X defined on R+
has a log-concave and decreasing density pX .
(a) Square root. Let Y = √X. Does Y have a log-concave density? Either show that it does,
or give a counterexample, i.e., a specific log-concave decreasing density for X for which the
density of Y is not log-concave.
(b) Square. Let Z = X 2 . Does Z have a log-concave density? Either show that it does, or give a
counterexample, i.e., a specific log-concave decreasing density for X for which the density of
Z is not log-concave.
3.77 Tail bounds for log-concave densities. When X ∼ N (0, 1) and a > 0, a well-known upper bound
on prob(X ≥ a) is prob(X ≥ a) ≤ φ(a)/a, where φ is the Gaussian density. In this exercise
we explore a generalization of this bound to vector random variables and non-Gaussian, but log-
concave, distributions.
Let X ∈ Rn be a random variable with log-concave differentiable probability density function
p : Rn → R+ . We can express p as p(x) = exp(−ψ(x)), where ψ : Rn → R is convex and
differentiable.
(a) Tail bound. Suppose that ∇p(a) ≺ 0 (which is the same as ∇ψ(a) ≻ 0). Show that
prob(X ⪰ a) ≤ p(a) ( ∏_{i=1}^{n} (∇ψ(a))_i )^{−1}.
We expect a solution based on ideas from this course, without reference to other tail bounds
you might know about. Remark. When X ∼ N (0, 1), this recovers the well-known tail bound
mentioned above.
Hints.
• Start with a basic inequality involving ψ(x), ψ(a), and ∇ψ(a), and from this obtain an
upper bound on p(x).
• Recall that ∫_{x⪰a} f_1(x_1) · · · f_n(x_n) dx = ∏_{i=1}^{n} ∫_{x_i ≥ a_i} f_i(x_i) dx_i.
(b) Evaluate the upper bound for the specific case n = 2, X ∼ N(0, Σ), with
Σ = [1 ρ; ρ 1],  ρ = 0.5,  a = (3, 3).
We estimated prob(X ⪰ a) (using a Monte Carlo method) as 8.2 × 10−5 ; compare this to the
upper bound.
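A possible numerical comparison, sketched with NumPy in place of CVX*: evaluate the bound from part (a) for this Gaussian instance, and set it against a Monte Carlo estimate of the tail probability.

```python
import numpy as np

rho = 0.5
Sigma = np.array([[1.0, rho], [rho, 1.0]])
a = np.array([3.0, 3.0])
Sinv = np.linalg.inv(Sigma)

# p(x) = exp(-psi(x)), psi(x) = (1/2) x^T Sigma^{-1} x + log(2 pi) + (1/2) log det Sigma
p_a = np.exp(-0.5 * a @ Sinv @ a) / (2 * np.pi * np.sqrt(np.linalg.det(Sigma)))
grad_psi = Sinv @ a                      # gradient of psi at a; componentwise positive here
bound = p_a / np.prod(grad_psi)          # the tail bound of part (a)

rng = np.random.default_rng(0)
X = rng.multivariate_normal(np.zeros(2), Sigma, size=2_000_000)
mc = float(np.mean(np.all(X >= a, axis=1)))   # Monte Carlo estimate of prob(X >= a)

print(f"bound = {bound:.3e}, Monte Carlo estimate = {mc:.3e}")
assert np.all(grad_psi > 0) and mc <= bound
```

The computed bound is a little above 1e-4, consistent with it dominating the Monte Carlo estimate of about 8.2e-5.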
3.78 Convex function of a random vector with log-concave density. Let X be a random variable on Rn
with log-concave density p. Let Y = f (X), where f : Rn → R is convex. The CDF of Y is
F (a) = prob(Y ≤ a).
Is F log-concave (with no further assumptions)? If yes, give a justification. If no, give a simple
and specific counterexample.
Is p log-concave? That is, does the Gumbel distribution have log-concave density?
3.80 A composition rule for log-log convex functions. A function f : Rn++ → R++ is called log-log
convex if F (u) = log f (eu ) is convex, where the exponentiation is applied elementwise. Similarly,
f is log-log concave if F is concave, and it is log-log affine if F is affine. For example, posynomials
are log-log convex and monomials are log-log affine.
It turns out that log-log convex functions obey a composition rule, analogous to the one for convex
functions. Suppose
f (x) = h(g1 (x), g2 (x), . . . , gk (x)),
where h : Rk++ → R++ is log-log convex, and gi : Rn++ → R++ . Suppose that for each i, one of
the following holds:
Show that f is log-log convex. (This composition rule is the basis of disciplined geometric program-
ming, which is implemented in CVXPY.)
3.81 Generalization of the convexity of log det X −1 . Let P ∈ Rn×m have rank m. In this problem we
show that the function f : Sn → R, with dom f = Sn++ , and
is convex. To prove this, we assume (without loss of generality) that P has the form
P = [I; 0],
where I denotes the m × m identity matrix.
(a) Let Y and Z be symmetric matrices with 0 ≺ Y ⪯ Z. Show that det Y ≤ det Z.
(b) Let X ∈ S^n_{++}, partitioned as
X = [X_{11} X_{12}; X_{12}^T X_{22}],
with X11 ∈ Sm . Show that the optimization problem
Hence, the convex function f defined in part (c) can also be expressed as f (X) = log det(P T X −1 P ).
Hint. Use the formula for the inverse of a symmetric block matrix:
[A B; B^T C]^{−1} = [0 0; 0 C^{−1}] + [−I; C^{−1}B^T] (A − BC^{−1}B^T)^{−1} [−I; C^{−1}B^T]^T.
4 Convex optimization problems
4.1 Minimizing a function over the probability simplex. Find simple necessary and sufficient conditions
for x ∈ Rn to minimize a differentiable convex function f over the probability simplex {x | 1T x =
1, x ⪰ 0}.
4.2 ‘Hello World’ in CVX*. Use CVX* to verify the optimal values you obtained (analytically) for
exercise 4.1 in Convex Optimization.
4.3 Formulating constraints in CVX*. Below we give several convex constraints on scalar variables x,
y, and z. Express each one as a set of valid constraints in CVX*. (Directly expressing them in
CVX* will lead to invalid constraints.) You can also introduce additional variables, if needed.
Check your reformulations by creating a small problem that includes these constraints, and solving
it using CVX*. Your test problem doesn’t have to be feasible; it’s enough to verify that CVX*
processes your constraints without error.
4.4 Optimal activity levels. Solve the optimal activity level problem described in exercise 4.17 in Convex
Optimization, for the instance with problem data
A = [1 2 0 1; 0 0 3 1; 0 3 1 1; 2 1 2 5; 1 0 3 2],  cmax = (100, 100, 100, 100, 100),
p = (3, 2, 7, 6),  pdisc = (2, 1, 4, 2),  q = (4, 10, 5, 10).
You can do this by forming the LP you found in your solution of exercise 4.17, or more directly,
using CVX*. Give the optimal activity levels, the revenue generated by each one, and the total
revenue generated by the optimal solution. Also, give the average price per unit for each activity
level, i.e., the ratio of the revenue associated with an activity, to the activity level. (These numbers
should be between the basic and discounted prices for each activity.) Give a very brief story
explaining, or at least commenting on, the solution you find.
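One way to form the LP, sketched here with scipy.optimize.linprog in place of CVX*: the revenue from activity j is the piecewise-linear concave function r_j(x_j) = min(p_j x_j, p_j q_j + pdisc_j (x_j − q_j)), which can be handled with auxiliary variables u_j ≤ r_j(x_j).

```python
import numpy as np
from scipy.optimize import linprog

A = np.array([[1, 2, 0, 1], [0, 0, 3, 1], [0, 3, 1, 1],
              [2, 1, 2, 5], [1, 0, 3, 2]], dtype=float)
cmax = 100 * np.ones(5)
p = np.array([3.0, 2.0, 7.0, 6.0])
pdisc = np.array([2.0, 1.0, 4.0, 2.0])
q = np.array([4.0, 10.0, 5.0, 10.0])
n = 4

# variables z = (x, u); maximize 1^T u, i.e., minimize -1^T u
c = np.concatenate([np.zeros(n), -np.ones(n)])
A_ub = np.vstack([
    np.hstack([A, np.zeros((5, n))]),          # resource limits: A x <= cmax
    np.hstack([-np.diag(p), np.eye(n)]),       # u_j <= p_j x_j
    np.hstack([-np.diag(pdisc), np.eye(n)]),   # u_j <= p_j q_j + pdisc_j (x_j - q_j)
])
b_ub = np.concatenate([cmax, np.zeros(n), (p - pdisc) * q])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (2 * n))
x, u = res.x[:n], res.x[n:]
print("activity levels x:", np.round(x, 2))
print("revenue per activity:", np.round(u, 2), " total:", round(-res.fun, 2))
print("average price per unit:", np.round(u / np.maximum(x, 1e-9), 2))
```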
4.5 Minimizing the ratio of convex and concave piecewise-linear functions. We consider the problem
minimize  ( max_{i=1,...,m} (a_i^T x + b_i) ) / ( min_{i=1,...,p} (c_i^T x + d_i) )
subject to  F x ⪯ g,
with variable x ∈ Rn . We assume that cTi x+di > 0 and maxi=1,...,m (aTi x+bi ) ≥ 0 for all x satisfying
F x ⪯ g, and that the feasible set is nonempty and bounded. This problem is quasiconvex, and can
be solved using bisection, with each iteration involving a feasibility LP. Show how the problem can
be solved by solving one LP, using a trick similar to one described in §4.3.2.
4.6 The illumination problem. In lecture 1 we encountered the function
4.7 Schur complements and LMI representation. Recognizing Schur complements (see §A5.5) often
helps to represent nonlinear convex constraints as linear matrix inequalities (LMIs). Consider the
function
f (x) = (Ax + b)T (P0 + x1 P1 + · · · + xn Pn )−1 (Ax + b)
where A ∈ Rm×n , b ∈ Rm , and Pi = PiT ∈ Rm×m , with domain
dom f = {x ∈ Rn | P0 + x1 P1 + · · · + xn Pn ≻ 0}.
This is the composition of the matrix fractional function and an affine mapping, and so is convex.
Give an LMI representation of epi f . That is, find a symmetric matrix F (x, t), affine in (x, t), for
which
x ∈ dom f, f (x) ≤ t ⇐⇒ F (x, t) ⪰ 0.
Remark. LMI representations, such as the one you found in this exercise, can be directly used in
CVX*.
4.8 Complex least-norm problem. We consider the complex least ℓ_p-norm problem
minimize  ∥x∥_p
subject to  Ax = b,
where A ∈ C^{m×n}, b ∈ C^m, and the variable is x ∈ C^n, for p ≥ 1, with ∥x∥_∞ = max_{i=1,...,n} |x_i|.
We assume A is full rank, and m < n.
(a) Formulate the complex least ℓ2 -norm problem as a least ℓ2 -norm problem with real problem
data and variable. Hint. Use z = (ℜx, ℑx) ∈ R2n as the variable.
(b) Formulate the complex least ℓ∞ -norm problem as an SOCP.
(c) Solve a random instance of both problems with m = 30 and n = 100. To generate the
matrix A, you can use the Matlab command A = randn(m,n) + i*randn(m,n). Similarly,
use b = randn(m,1) + i*randn(m,1) to generate the vector b. Use the Matlab command
scatter to plot the optimal solutions of the two problems on the complex plane, and comment
(briefly) on what you observe. You can solve the problems using the CVX functions norm(x,2)
and norm(x,inf), which are overloaded to handle complex arguments. To utilize this feature,
you will need to declare variables to be complex in the variable statement. (In particular,
you do not have to manually form or solve the SOCP from part (b).)
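For part (a), the equivalence between the complex problem and its real reformulation can be checked numerically in the ℓ_2 case, where both least-norm solutions are available in closed form. An illustrative NumPy sketch on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 100
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
b = rng.standard_normal(m) + 1j * rng.standard_normal(m)

# complex least l2-norm solution: x = A^H (A A^H)^{-1} b
x = A.conj().T @ np.linalg.solve(A @ A.conj().T, b)

# real formulation with variable z = (Re x, Im x):
#   [Re A  -Im A] z = Re b
#   [Im A   Re A] z = Im b
Ar = np.block([[A.real, -A.imag], [A.imag, A.real]])
br = np.concatenate([b.real, b.imag])
z = Ar.T @ np.linalg.solve(Ar @ Ar.T, br)

assert np.allclose(A @ x, b)                              # feasibility
assert np.allclose(Ar @ z, br)
assert abs(np.linalg.norm(x) - np.linalg.norm(z)) < 1e-8  # same optimal value
```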
4.9 Linear programming with random cost vector. We consider the linear program
minimize cT x
subject to Ax ⪯ b.
Here, however, the cost vector c is random, normally distributed with mean E c = c0 and covariance
E(c − c0 )(c − c0 )T = Σ. (A, b, and x are deterministic.) Thus, for a given x ∈ Rn , the cost cT x is
a (scalar) Gaussian variable.
We can attach several different meanings to the goal ‘minimize cT x’; we explore some of these
below.
E c^T x + γ var(c^T x)    (1)
of the expected value E c^T x and the variance var(c^T x) = E(c^T x)^2 − (E c^T x)^2. This is called
the ‘risk-sensitive cost’, and the parameter γ ≥ 0 is called the risk-aversion parameter, since
it sets the relative values of cost variance and expected value. (For γ > 0, we are willing to
tradeoff an increase in expected cost for a decrease in cost variance.) How would you minimize
the risk-sensitive cost? Is this problem a convex optimization problem? Be as specific as you
can.
(c) We can also minimize the risk-sensitive cost, but with γ < 0. This is called ‘risk-seeking’. Is
this problem a convex optimization problem?
(d) Another way to deal with the randomness in the cost cT x is to formulate the problem as
minimize β
subject to prob(cT x ≥ β) ≤ α
Ax ⪯ b.
Here, α is a fixed parameter, which corresponds roughly to the reliability we require, and
might typically have a value of 0.01. Is this problem a convex optimization problem? Be as
specific as you can. Can you obtain risk-seeking by choice of α? Explain.
4.10 Formulate the following optimization problems as semidefinite programs. The variable is x ∈ Rn ;
F (x) is defined as
F (x) = F0 + x1 F1 + x2 F2 + · · · + xn Fn
with Fi ∈ Sm . The domain of f in each subproblem is dom f = {x ∈ Rn | F (x) ≻ 0}.
(c) Minimize f(x) = sup_{∥c∥_2 ≤ 1} c^T F(x)^{−1} c.
(d) Minimize f (x) = E(cT F (x)−1 c) where c is a random vector with mean E c = c̄ and covariance
E(c − c̄)(c − c̄)T = S.
4.11 A matrix fractional function [Ando]. Show that X = B^T A^{−1} B solves the SDP
minimize  tr X
subject to  [A B; B^T X] ⪰ 0,
4.12 Trace of harmonic mean of matrices [Ando]. The matrix H(A, B) = 2(A^{−1} + B^{−1})^{−1} is known as
the harmonic mean of positive definite matrices A and B. Show that X = (1/2)H(A, B) solves the
SDP
maximize  tr X
subject to  [X X; X X] ⪯ [A 0; 0 B],
with variable X ∈ S^n. The matrices A ∈ S^n_{++} and B ∈ S^n_{++} are given. Conclude that the function
tr (A^{−1} + B^{−1})^{−1}, with domain S^n_{++} × S^n_{++}, is concave.
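A quick numerical spot check (not a proof) that X = (1/2)H(A, B) = (A^{−1} + B^{−1})^{−1} is feasible for this SDP, on a random positive definite instance:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M1 = rng.standard_normal((n, n)); A = M1 @ M1.T + n * np.eye(n)   # random PD matrices
M2 = rng.standard_normal((n, n)); B = M2 @ M2.T + n * np.eye(n)

X = np.linalg.inv(np.linalg.inv(A) + np.linalg.inv(B))  # (1/2) H(A, B)
# the SDP constraint [X X; X X] <= [A 0; 0 B] is the same as this block being PSD:
block = np.block([[A - X, -X], [-X, B - X]])
eigs = np.linalg.eigvalsh(block)
assert eigs.min() >= -1e-9      # feasible (up to roundoff)
```

The smallest eigenvalue comes out essentially zero, i.e., the constraint is tight at this X, consistent with it being optimal.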
4.13 Trace of geometric mean of matrices [Ando]. The matrix
G(A, B) = A^{1/2} (A^{−1/2} B A^{−1/2})^{1/2} A^{1/2}
is known as the geometric mean of positive definite matrices A and B. Show that X = G(A, B)
solves the SDP
maximize  tr X
subject to  [A X; X B] ⪰ 0.
The variable is X ∈ Sn . The matrices A ∈ Sn++ and B ∈ Sn++ are given.
Conclude that the function tr G(A, B) is concave, for A, B positive definite.
Hint. The symmetric matrix square root is monotone: if U and V are positive semidefinite with
U ⪯ V then U 1/2 ⪯ V 1/2 .
4.14 Transforming a standard form convex problem to conic form. In this problem we show that any
convex problem can be cast in conic form, provided some technical conditions hold. We start with
a standard form convex problem with linear objective (without loss of generality):
minimize cT x
subject to fi (x) ≤ 0, i = 1, . . . , m,
Ax = b,
where fi : Rn → R are convex, and x ∈ Rn is the variable. For simplicity, we will assume that
dom fi = Rn for each i.
Now introduce a new scalar variable t ∈ R and form the convex problem
minimize cT x
subject to tfi (x/t) ≤ 0, i = 1, . . . , m,
Ax = b, t = 1.
Define
K = cl{(x, t) ∈ Rn+1 | tfi (x/t) ≤ 0, i = 1, . . . , m, t > 0}.
Then our original problem can be expressed as
minimize cT x
subject to (x, t) ∈ K,
Ax = b, t = 1.
(a) Show that K is a convex cone. (It is closed by definition, since we take the closure.)
(b) Suppose the original problem is strictly feasible, i.e., there exists a point x̄ with f_i(x̄) < 0,
i = 1, . . . , m. (This is called Slater's condition.) Show that K has nonempty interior.
(c) Suppose that the inequalities define a bounded set, i.e., {x | fi (x) ≤ 0, i = 1, . . . , m} is
bounded. Show that K is pointed.
4.15 Exploring nearly optimal points. An optimization algorithm will find an optimal point for a problem,
provided the problem is feasible. It is often useful to explore the set of nearly optimal points. When
a problem has a ‘strong minimum’, the set of nearly optimal points is small; all such points are close
to the original optimal point found. At the other extreme, a problem can have a ‘soft minimum’,
which means that there are many points, some quite far from the original optimal point found, that
are feasible and have nearly optimal objective value. In this problem you will use a typical method
to explore the set of nearly optimal points.
We start by finding the optimal value p⋆ of the given problem
minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, . . . , m
hi (x) = 0, i = 1, . . . , p,
as well as an optimal point x⋆ ∈ Rn . We then pick a small positive number ϵ, and a vector c ∈ Rn ,
and solve the problem
minimize cT x
subject to fi (x) ≤ 0, i = 1, . . . , m
hi (x) = 0, i = 1, . . . , p
f0 (x) ≤ p⋆ + ϵ.
Note that any feasible point for this problem is ϵ-suboptimal for the original problem. Solving this
problem multiple times, with different c’s, will generate (perhaps different) ϵ-suboptimal points. If
the problem has a strong minimum, these points will all be close to each other; if the problem has
a weak minimum, they can be quite different.
There are different strategies for choosing c in these experiments. The simplest is to choose the
c’s randomly; another method is to choose c to have the form ±ei , for i = 1, . . . , n. (This method
gives the ‘range’ of each component of x, over the ϵ-suboptimal set.)
You will carry out this method for the following problem, to determine whether it has a strong
minimum or a weak minimum. You can generate the vectors c randomly, with enough samples for
you to come to your conclusion. You can pick ϵ = 0.01p⋆ , which means that we are considering the
set of 1% suboptimal points.
The problem is a minimum fuel optimal control problem for a vehicle moving in R^2. The position
at time kh is given by p(k) ∈ R^2, and the velocity by v(k) ∈ R^2, for k = 1, . . . , K. Here h > 0 is
the sampling period. These are related by the equations
p(k + 1) = p(k) + h v(k),    v(k + 1) = (1 − α) v(k) + (h/m) f(k),    k = 1, . . . , K − 1,
where f(k) ∈ R^2 is the force applied to the vehicle at time kh, m > 0 is the vehicle mass, and
α ∈ (0, 1) models drag on the vehicle; in the absence of any other force, the vehicle velocity decreases
by the factor 1 − α in each discretized time interval. (These formulas are approximations of more
accurate formulas that involve matrix exponentials.)
The force comes from two thrusters, and from gravity:
f(k) = u_1(k) (cos θ_1, sin θ_1) + u_2(k) (cos θ_2, sin θ_2) + (0, −mg),    k = 1, . . . , K − 1.
Here u1 (k) ∈ R and u2 (k) ∈ R are the (nonnegative) thruster force magnitudes, θ1 and θ2 are the
directions of the thrust forces, and g = 10 is the constant acceleration due to gravity.
The total fuel use is
F = Σ_{k=1}^{K−1} (u_1(k) + u_2(k)).
(These state that at time hk_i, the vehicle must pass through the location w_i ∈ R^2.) In addition,
we require that the vehicle should remain in a square operating region,
∥p(k)∥∞ ≤ P max , k = 1, . . . , K.
Both parts of this problem concern the specific problem instance with data given in thrusters_data.*.
(a) Find an optimal trajectory, and the associated minimum fuel use p⋆ . Plot the trajectory p(k)
in R2 (i.e., in the p1 , p2 plane). Verify that it passes through the way-points.
(b) Generate several 1% suboptimal trajectories using the general method described above, and
plot the associated trajectories in R2 . Would you say this problem has a strong minimum, or
a weak minimum?
4.16 Minimum fuel optimal control. Solve the minimum fuel optimal control problem described in
exercise 4.16 of Convex Optimization, for the instance with problem data
A = [−1 0.4 0.8; 1 0 0; 0 1 0],  b = (1, 0, 0.3),  x_des = (7, 2, −6),  N = 30.
You can do this by forming the LP you found in your solution of exercise 4.16, or more directly
using CVX*. Plot the actuator signal u(t) as a function of time t.
4.17 Heuristic suboptimal solution for Boolean LP. This exercise builds on exercises 4.15 and 5.13 in
Convex Optimization, which involve the Boolean LP
minimize cT x
subject to Ax ⪯ b
xi ∈ {0, 1}, i = 1, . . . , n,
with optimal value p⋆ . Let xrlx be a solution of the LP relaxation
minimize cT x
subject to Ax ⪯ b
0 ⪯ x ⪯ 1,
so L = cT xrlx is a lower bound on p⋆ . The relaxed solution xrlx can also be used to guess a Boolean
point x̂, by rounding its entries, based on a threshold t ∈ [0, 1]:
x̂_i = 1 if x_i^{rlx} ≥ t, and x̂_i = 0 otherwise,
for i = 1, . . . , n. Evidently x̂ is Boolean (i.e., has entries in {0, 1}). If it is feasible for the Boolean
LP, i.e., if Ax̂ ⪯ b, then it can be considered a guess at a good, if not optimal, point for the Boolean
LP. Its objective value, U = cT x̂, is an upper bound on p⋆ . If U and L are close, then x̂ is nearly
optimal; specifically, x̂ cannot be more than (U − L)-suboptimal for the Boolean LP.
This rounding need not work; indeed, it can happen that for all threshold values, x̂ is infeasible.
But for some problem instances, it can work well.
Of course, there are many variations on this simple scheme for (possibly) constructing a feasible,
good point from xrlx .
Finally, we get to the problem. Generate problem data using the following Python code:
import numpy as np
np.random.seed(0)
m, n = 300, 100
A = np.random.rand(m, n)
b = A.dot(np.ones(n)) / 2
c = -np.random.rand(n)
You can think of xi as a job we either accept or decline, and −ci as the (positive) revenue we
generate if we accept job i. We can think of Ax ⪯ b as a set of limits on m resources. Aij , which
is positive, is the amount of resource i consumed if we accept job j; bi , which is positive, is the
amount of resource i available.
Find a solution of the relaxed LP and examine its entries. Note the associated lower bound L.
Carry out threshold rounding for (say) 100 values of t, uniformly spaced over [0, 1]. For each value
of t, note the objective value cT x̂ and the maximum constraint violation maxi (Ax̂ − b)i . Plot the
objective value and the maximum violation versus t. Be sure to indicate on the plot the values of
t for which x̂ is feasible, and those for which it is not.
Find a value of t for which x̂ is feasible, and gives minimum objective value, and note the associated
upper bound U . Give the gap U − L between the upper bound on p⋆ and the lower bound on p⋆ .
4.18 Optimal operation of a hybrid vehicle. Solve the instance of the hybrid vehicle operation problem de-
scribed in exercise 4.65 in Convex Optimization, with problem data given in the file hybrid_veh_data.*,
and fuel use function F(p) = p + γp^2 (for p ≥ 0).
Hint. You will actually formulate and solve a relaxation of the original problem. You may find that
some of the equality constraints you relaxed to inequality constraints do not hold for the solution
found. This is not an error: it just means that there is no incentive (in terms of the objective) for
the inequality to be tight. You can fix this in (at least) two ways. One is to go back and adjust
certain variables, without affecting the objective and maintaining feasibility, so that the relaxed
constraints hold with equality. Another simple method is to add to the objective a term of the
form
ϵ Σ_{t=1}^{T} max{0, −P_mg(t)},
where ϵ is small and positive. This makes it more attractive to use the brakes to extract power
from the wheels, even when the battery is (or will be) full (which removes any fuel incentive).
Find the optimal fuel consumption, and compare to the fuel consumption with a non-hybrid ver-
sion of the same vehicle (i.e., one without a battery). Plot the braking power, engine power,
motor/generator power, and battery energy versus time.
How would you use optimal dual variables for this problem to find ∂F_total/∂E_batt^max, i.e., the partial
derivative of optimal fuel consumption with respect to battery capacity? (You can just assume
that this partial derivative exists.) You do not have to give a long derivation or proof; you can
just state how you would find this derivative from optimal dual variables for the problem. Verify
your method numerically, by changing the battery capacity a small amount and re-running the
optimization, and comparing this to the prediction made using dual variables.
4.19 Optimal vehicle speed scheduling. A vehicle (say, an airplane) travels along a fixed path of n
segments, between n + 1 waypoints labeled 0, . . . , n. Segment i starts at waypoint i − 1 and
terminates at waypoint i. The vehicle starts at time t = 0 at waypoint 0. It travels over each
segment at a constant (nonnegative) speed; si is the speed on segment i. We have lower and upper
limits on the speeds: smin ⪯ s ⪯ smax . The vehicle does not stop at the waypoints; it simply
proceeds to the next segment. The travel distance of segment i is di (which is positive), so the
travel time over segment i is di /si . We let τi , i = 1, . . . , n, denote the time at which the vehicle
arrives at waypoint i. The vehicle is required to arrive at waypoint i, for i = 1, . . . , n, between
times τimin and τimax , which are given. The vehicle consumes fuel over segment i at a rate that
depends on its speed, Φ(si ), where Φ is positive, increasing, and convex, and has units of kg/s.
You are given the data d (segment travel distances), smin and smax (speed bounds), τ min and τ max
(waypoint arrival time bounds), and the fuel use function Φ : R → R. You are to choose the speeds
s1 , . . . , sn so as to minimize the total fuel consumed in kg.
(a) Show how to pose this as a convex optimization problem. If you introduce new variables, or
change variables, you must explain how to recover the optimal speeds from the solution of
your problem. If convexity of the objective or any constraint function in your formulation is
not obvious, explain why it is convex.
(b) Carry out the method of part (a) on the problem instance with data in
veh_speed_sched_data.*. Use the fuel use function Φ(si ) = as2i + bsi + c (the parameters
a, b, and c are defined in the data file). What is the optimal fuel consumption? Plot the
optimal speed versus segment, using the matlab command stairs or the function step from
matplotlib in Python and Julia to better show constant speed over the segments.
(a) Use the observation at the beginning of exercise 4.26 in Convex Optimization to express the
constraint
y ≤ √(z_1 z_2),  y, z_1, z_2 ≥ 0,
with variables y, z1 , z2 , as a second-order cone constraint. Then extend your result to the
constraint
y ≤ (z1 z2 · · · zn )1/n , y ≥ 0, z ⪰ 0,
where n is a positive integer, and the variables are y ∈ R and z ∈ Rn . First assume that n is
a power of two, and then generalize your formulation to arbitrary positive integers.
(b) Express the constraint
f (x) ≤ t
as a second-order cone constraint, for the following two convex functions f :
f(x) = x^α for x ≥ 0, and f(x) = 0 for x < 0,
where α is rational and greater than or equal to one, and
(c) Formulate the norm approximation problem
minimize ∥Ax − b∥_p
as a second-order cone program, where p is a rational number greater than or equal to one.
The variable in the optimization problem is x ∈ Rn . The matrix A ∈ Rm×n and the vector
b ∈ Rm are given. For an m-vector y, the norm ∥y∥p is defined as
∥y∥_p = ( Σ_{k=1}^{m} |y_k|^p )^{1/p}
when p ≥ 1.
4.21 Linear optimization over the complement of a convex set. Suppose C ⊆ Rn+ is a closed bounded
convex set with 0 ∈ C, and c ∈ Rn+ . We define
4.23 Positive nonconvex QCQP. We consider a (possibly nonconvex) QCQP, with nonnegative variable
x ∈ Rn ,
minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, . . . , m
x ⪰ 0,
where fi (x) = (1/2)xT Pi x + qiT x + ri , with Pi ∈ Sn , qi ∈ Rn , and ri ∈ R, for i = 0, . . . , m. We do
not assume that Pi ⪰ 0, so this need not be a convex problem.
Suppose that qi ⪯ 0, and Pi have nonpositive off-diagonal entries, i.e., they satisfy
(Pi )jk ≤ 0, j ̸= k, j, k = 1, . . . , n,
4.24 Affine policy. We consider a family of LPs, parametrized by the random variable u, which is
uniformly distributed on U = [−1, 1]p ,
minimize cT x
subject to Ax ⪯ b(u),
The variables here are x0 and K. The expectation in the objective is over u, and the constraint
requires that Axaff (u) ⪯ b(u) hold almost surely.
(a) Explain how to find optimal values of x0 and K by solving a standard explicit convex op-
timization problem (i.e., one that does not involve an expectation or an infinite number of
constraints, as the one above does.) The numbers of variables or constraints in your formula-
tion should not grow exponentially with the problem dimensions n, p, or m.
(b) Carry out your method on the data given in affine_pol_data.m. To evaluate your affine
policy, generate 100 independent samples of u, and for each value, compute the objective
value of the affine policy, cT xaff (u), and of the optimal policy, cT x⋆ (u). Scatter plot the
objective value of the affine policy (y-axis) versus the objective value of the optimal policy
(x-axis), and include the line y = x on the plot. Report the average values of cT xaff (u) and
cT x⋆ (u) over your samples. (These are estimates of E cT xaff (u) and E cT x⋆ (u). The first
number, by the way, can be found exactly.)
4.25 Probability bounds. Consider random variables X1 , X2 , X3 , X4 that take values in {0, 1}. We are
given the following marginal and conditional probabilities:
prob(X1 = 1) = 0.9,
prob(X2 = 1) = 0.9,
prob(X3 = 1) = 0.1,
prob(X1 = 1, X4 = 0 | X3 = 1) = 0.7,
prob(X4 = 1 | X2 = 1, X3 = 0) = 0.6.
Explain how to find the minimum and maximum possible values of prob(X4 = 1), over all (joint)
probability distributions consistent with the given data. Find these values and report them.
Hint. In CVXPY, you can create a multidimensional array of variables; for example,
cp.Variable((2, 2, 2, 2)) declares a 4-dimensional array of variables, with each of the four
indices taking the values 0 or 1. The function cp.sum takes an axis argument that allows you to
sum along selected dimensions. For example, cp.sum(x, axis=(0, 3)) creates a (2, 2) expression
by summing along the first and last dimensions.
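The axis semantics mirror NumPy's, so they can be checked numerically first. In the sketch below (our own stand-in data, not the exercise's variable), a uniform joint distribution plays the role of the 4-dimensional CVXPY variable:

```python
import numpy as np

# Stand-in for the joint distribution over (X1, X2, X3, X4): entry
# p[i, j, k, l] plays the role of prob(X1=i, X2=j, X3=k, X4=l).
p = np.full((2, 2, 2, 2), 1 / 16)  # uniform joint distribution

# Summing over axes 0 and 3 marginalizes out X1 and X4, leaving the
# (2, 2) joint marginal of (X2, X3); cp.sum follows the same convention.
marg_23 = p.sum(axis=(0, 3))
print(marg_23.shape)  # (2, 2); each entry is 0.25 here

# A given quantity such as prob(X3 = 1) is a sum over all other indices:
prob_x3_1 = p[:, :, 1, :].sum()
print(prob_x3_1)  # 0.5
```

The same slicing-and-summing pattern expresses each of the given marginal and conditional probabilities as linear constraints on the CVXPY variable.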
4.26 Robust quadratic programming. In this problem, we consider a robust variation of the (convex)
quadratic program
minimize (1/2)xT P x + q T x + r
subject to Ax ⪯ b.
For simplicity we assume that only the matrix P is subject to errors, and the other parameters (q,
r, A, b) are exactly known. The robust quadratic program is defined as
4.27 Smallest confidence ellipsoid. Suppose the random variable X on Rn has log-concave density p.
Formulate the following problem as a convex optimization problem: Find an ellipsoid E that satisfies
prob(X ∈ E) ≥ 0.95 and is smallest, in the sense of minimizing the sum of the squares of its semi-
axis lengths. You do not need to worry about how to solve the resulting convex optimization
problem; it is enough to formulate the smallest confidence ellipsoid problem as the problem of
minimizing a convex function over a convex set involving the parameters that define E.
4.28 Stochastic optimization via Monte Carlo sampling. In (convex) stochastic optimization, the goal
is to minimize a cost function of the form F (x) = E f (x, ω), where ω is a random variable on Ω,
and f : Rn × Ω → R is convex in its first argument for each ω ∈ Ω. (For simplicity we consider
the unconstrained problem; it is not hard to include constraints.) Evidently F is convex. Let p⋆
denote the optimal value, i.e., p⋆ = inf x F (x) (which we assume is finite).
In a few very simple cases we can work out what F is analytically, but in general this is not possible.
Moreover in many applications, we do not know the distribution of ω; we only have access to an
oracle that can generate independent samples from the distribution.
A standard method for approximately solving the stochastic optimization problem is based on
Monte Carlo sampling. We first generate N independent samples, ω1 , . . . , ωN , and form the empir-
ical expectation
F̂(x) = (1/N) Σ_{i=1}^N f(x, ωi).
This is a random function, since it depends on the particular samples drawn. For each x, we
have E F̂ (x) = F (x), and also E(F̂ (x) − F (x))2 ∝ 1/N . Roughly speaking, for N large enough,
F̂ (x) ≈ F (x).
To (approximately) minimize F , we instead minimize F̂ (x). The minimizer, x̂⋆ , and the optimal
value p̂⋆ = F̂ (x̂⋆ ), are also random variables. The hope is that for N large enough, we have p̂⋆ ≈ p⋆ .
(In practice, stochastic optimization via Monte Carlo sampling works very well, even when N is
not that big.)
One way to check the result of Monte Carlo sampling is to carry it out multiple times. We repeatedly
generate different batches of samples, and for each batch, we find x̂⋆ and p̂⋆ . If the values of p̂⋆ are
near each other, it’s reasonable to believe that we have (approximately) minimized F . If they are
not, it means our value of N is too small.
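A minimal numerical illustration of the procedure, on a toy instance of our own choosing: f(x, ω) = (x − ω)² with ω standard normal, so F(x) = x² + 1 and p⋆ = 1, attained at x = 0. Here F̂ is minimized at the sample mean, and p̂⋆ is the (biased) sample variance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy instance: f(x, w) = (x - w)^2 with w ~ N(0, 1), so
# F(x) = E f(x, w) = x^2 + 1 and p_star = 1 (attained at x = 0).
p_star = 1.0
N, batches = 50, 1000

p_hats = []
for _ in range(batches):
    w = rng.standard_normal(N)
    x_hat = w.mean()                   # minimizer of F_hat for this f
    p_hat = np.mean((x_hat - w) ** 2)  # p_hat = F_hat(x_hat)
    p_hats.append(p_hat)

# The batch values cluster, and their average sits below p_star = 1
# (for this toy instance E p_hat = (N - 1)/N = 0.98).
print(np.mean(p_hats))
```

Repeating over many batches, the p̂⋆ values are close to each other and, on average, slightly below p⋆, consistent with the inequality you are asked to show next.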
Show that E p̂⋆ ≤ p⋆ .
This inequality implies that if we repeatedly use Monte Carlo sampling and the values of p̂⋆ that
we get are all very close, then they are (likely) close to p⋆ .
Hint. Show that for any function G : Rn × Ω → R (convex or not in its first argument), and any
random variable ω on Ω, we have
minimize f0 (x)
subject to fi (x) ≤ 0 holds for at least k values of i,
with variable x ∈ Rn , where the objective f0 and the constraint functions fi , i = 1, . . . , m (with
m ≥ k), are convex. Here we require that only k of the constraints hold, instead of all m of them.
In general this is a hard combinatorial problem; the brute force solution is to solve all (m choose k)
convex problems obtained by choosing subsets of k constraints to impose, and selecting one with
smallest objective value.
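To get a sense of why brute force is impractical at the sizes used later in this problem (m = 70, k = 58), we can count the subproblems:

```python
import math

# Number of k-subsets of the m constraints, i.e., the number of convex
# problems the brute-force approach would have to solve.
m, k = 70, 58
num_subproblems = math.comb(m, k)
print(num_subproblems)  # about 10^13 subproblems
```

Even at a millisecond per solve, enumerating all subsets would take centuries, which is what motivates the heuristic below.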
In this problem we explore a convex restriction that can be an effective heuristic for the problem.
(a) Suppose λ > 0. Show that the constraint
Σ_{i=1}^m (1 + λfi(x))+ ≤ m − k
guarantees that fi(x) ≤ 0 holds for at least k values of i. ((u)+ means max{u, 0}.)
Hint. For each u ∈ R, (1 + λu)+ ≥ 1(u > 0), where 1(u > 0) = 1 for u > 0, and 1(u > 0) = 0
for u ≤ 0.
(b) Consider the problem
minimize f0(x)
subject to Σ_{i=1}^m (1 + λfi(x))+ ≤ m − k
λ > 0,
with variables x and λ. This is a restriction of the original problem: If (x, λ) are feasible for
it, then x is feasible for the original problem. Show how to solve this problem using convex
optimization. (This may involve a change of variables.)
(c) Apply the method of part (b) to the problem instance
minimize cT x
subject to aTi x ≤ bi holds for at least k values of i,
with m = 70, k = 58, and n = 12. The vectors b, c and the matrix A with rows aTi are given
in the file satisfy_some_constraints_data.*.
Report the optimal value of λ, the objective value, and the actual number of constraints that
are satisfied (which should be larger than or equal to k). To determine if a constraint is
satisfied, you can use the tolerance aTi x − bi ≤ ϵfeas , with ϵfeas = 10−5 .
A standard trick is to take this tentative solution, choose the k constraints with the smallest
values of fi (x), and then minimize f0 (x) subject to these k constraints (i.e., ignoring the other
m − k constraints). This improves the objective value over the one found using the restriction.
Carry this out for the problem instance, and report the objective value obtained.
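The satisfied-constraint count in part (c) takes only a few lines of NumPy. In this sketch, A, b, and x are small synthetic placeholders, not the data from the exercise file:

```python
import numpy as np

def num_satisfied(A, b, x, eps_feas=1e-5):
    """Count constraints a_i^T x <= b_i satisfied within tolerance eps_feas."""
    return int(np.sum(A @ x - b <= eps_feas))

# Small synthetic check (not the exercise data):
A = np.array([[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]])
b = np.array([1.0, -1.0, 0.0])
x = np.array([0.5, 0.5])
print(num_satisfied(A, b, x))  # 2: rows 0 and 2 hold, row 1 does not
```

The same helper, applied to the solution of the restriction and then to the polished solution from the "standard trick", gives the counts you are asked to report.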
minimize tr(CX)
subject to tr(Ai X) = bi , i = 1, . . . , m (2)
X ⪰ 0.
A matrix X̂ is called an extreme point of the feasible set of (2) if X̂ is feasible and if the only matrix
V ∈ Sn that satisfies the conditions
tr (Ai V ) = 0, i = 1, . . . , m, X̂ + V ⪰ 0, X̂ − V ⪰ 0 (3)
is V = 0. In this problem we work out a bound on the rank of extreme points.
(a) Suppose X̂ is feasible for (2) and has rank r. Define an eigenvalue decomposition
X̂ = [ Q1 Q2 ] [ Λ1 0; 0 0 ] [ Q1 Q2 ]T,
where [ Q1 Q2 ] is orthogonal, Q1 has r columns, and Λ1 is a diagonal r × r matrix with
positive diagonal elements. Show that V satisfies (3) if and only if it can be expressed as
V = Q1 Y QT1 where Y ∈ Sr satisfies
tr (QT1 Ai Q1 Y ) = 0, i = 1, . . . , m, Λ1 + Y ⪰ 0, Λ1 − Y ⪰ 0.
(b) Show that if r(r + 1)/2 > m, then X̂ is not an extreme point.
(c) Interpret the SDP (2) as a relaxation of the non-convex QCQP
minimize xT Cx
subject to xT Ai x = bi , i = 1, . . . , m.
What are the implications of part b for the exactness of the relaxation? (You can assume
that the feasible set of (2) is non-empty and bounded. Under this assumption, the SDP is
guaranteed to have optimal solutions that are extreme points of the feasible set.)
4.31 Exact relaxation of a rank constrained problem. Consider the following optimization problem, which
we shall call problem A.
minimize tr(AP )
subject to rank(P ) = k,
λi (P ) ∈ {0, 1} for all i = 1, . . . , n
Here the variable P ∈ Sn, and λi (P ) is its ith largest eigenvalue. We are given A ∈ Rn×n and k ∈ Z
with k > 0. Neither of the two constraints is convex.
Problem B is the following semidefinite program, with the same problem data.
minimize tr(AP )
subject to tr(P ) = k,
0 ⪯ P ⪯ I,
(a) Show that problem B is a relaxation of problem A. That is, show that if P is feasible for
problem A then it is also feasible for problem B.
(b) Consider the problem
minimize cT x
subject to Σi xi = k
0 ≤ xi ≤ 1,
where the variable is x ∈ Rn , and c ∈ Rn is given. Explain why there always exists an optimal
x for which all components are integers.
(c) Using the previous result, show that problem B is a tight relaxation of problem A. Specifically,
show that there is an optimal solution P to problem B for which λi (P ) ∈ {0, 1} for all i.
The optimization variable is an n-vector x. The n-vector c and the matrices Ai , B ∈ Sp are given.
The norm ∥Ui ∥2 is the matrix norm. (For symmetric matrices ∥U ∥2 = maxk |λk (U )|.)
(a) Show that x satisfies the constraint in the problem if and only if
∥x∥1 ≤ λmin(B − Σ_{i=1}^n xi Ai).
The right-hand side is the smallest eigenvalue of B − Σi xi Ai.
(b) Use the observation of part (a) to formulate the robust optimization problem as an SDP.
4.33 For a symmetric n × n matrix A we define f (A) as the optimal value of the semidefinite program
minimize tr X + tr Y
subject to [ X A; A Y + I ] ⪰ 0
Y ⪰ 0,
with variables X ∈ Sn and Y ∈ Sn .
where λ1 (A), . . . , λn (A) are the eigenvalues of A. Give an explicit formula for ϕ.
4.34 In this problem we generalize the optimality condition in §4.2.3 of Convex Optimization (page 4-9
of the slides). We consider an optimization problem
x̂ ∈ dom f ∩ dom g, ∇f (x̂)T (y − x̂) + g(y) − g(x̂) ≥ 0 for all y ∈ dom g. (4)
(a) Show that (4) is a necessary condition for x̂ to be locally optimal.
(b) Assume f is convex. Show that (4) is also sufficient for x̂ to be optimal.
(c) Take g(x) = ∥x∥1 . Show that (4) reduces to the following: x̂ ∈ dom f and for each i = 1, . . . , n,
∂f(x̂)/∂xi = −1 if x̂i > 0,    |∂f(x̂)/∂xi| ≤ 1 if x̂i = 0,    ∂f(x̂)/∂xi = 1 if x̂i < 0.
4.35 Robust piecewise-linear optimization. Consider the robust piecewise-linear minimization problem
minimize max_{i=1,...,m} sup_{ai ∈ Ai} (aTi x + bi)
with variable x ∈ Rn . For each of the following definitions of Ai , formulate the problem as an
SOCP.
(a) Each set Ai is a Euclidean ball
Ai = {ai | ∥ai − ci ∥2 ≤ ri }.
The vectors ci and positive scalars ri are given.
(b) Each set Ai is the union of pi Euclidean balls
Ai = ∪_{j=1,...,pi} {ai | ∥ai − cij∥2 ≤ rij}.
The vectors cij and positive scalars rij are given. We assume the sets Ai have nonempty
interior.
4.36 Risk-sensitive linear programming. We revisit the linear programming problem with random cost
(page 154 of Convex Optimization). The goal is to give an interpretation to the LP
minimize cT x
subject to Gx ⪯ h
when the cost vector c is random. For simplicity, we assume a discrete distribution with m possible
values: c takes the value ci with probability pi , for i = 1, . . . , m. The vectors ci represent different
scenarios or cases, each occurring with probability pi . The mean and covariance matrix are denoted
by
c̄ = Σ_{i=1}^m pi ci,    Σ = Σ_{i=1}^m pi (ci − c̄)(ci − c̄)T.
In this exercise, we formulate the problem as
minimize (1/γ) log E e^{γcT x} = (1/γ) log Σ_{i=1}^m pi e^{γcTi x}    (5)
subject to Gx ⪯ h.
We will see that the parameter γ controls the risk sensitivity of the optimization model. To simplify
some notation, we define the function fγ : Rm → R with
fγ(y) = (1/γ) log Σ_{i=1}^m pi e^{γyi}
(where pi > 0 and Σi pi = 1). With this notation, problem (5) can be written as
minimize fγ(Cx)
subject to Gx ⪯ h,    (6)
(a) Show that
lim_{γ→∞} fγ(Cx) = max_{i=1,...,m} cTi x,    lim_{γ→−∞} fγ(Cx) = min_{i=1,...,m} cTi x,    lim_{γ→0} fγ(Cx) = c̄T x.
In (5) these three values of γ correspond to extreme pessimism (minimizing the worst case
maxi cTi x), extreme optimism (minimizing the best case mini cTi x), and a risk-neutral attitude
(minimizing the average case c̄T x).
(b) Next we examine the effect of choosing γ positive or negative. Show that
Hence, if γ > 0, the variability of cT x around its mean increases the cost function; if γ < 0,
it decreases it. In problem (5), choosing γ > 0 makes the optimization strategy risk-averse;
choosing γ < 0 makes it risk-seeking. We also note that the objective function is convex if
γ > 0 and concave if γ < 0.
(c) Finally, we relate (5) to the QP formulation on page 155 of Convex Optimization. We make
a quadratic approximation of the function fγ (y) around the vector ŷ = (c̄T x)1. Verify that
minimize x1^2
subject to x1 ≤ −1, x1^2 + x2^2 ≤ 2,
with variable x = (x1 , x2 ). Determine whether each of the following statements is true or false.
(a) The point (−1, 1) is a solution.
(b) The optimal value is 1.
(c) The problem is convex.
(d) The problem has multiple solutions.
4.38 Feasibility and optimal value. Consider an optimization problem in which we seek to minimize the
objective. We let p⋆ denote the optimal value of the problem. Which of the following statements
are true?
4.39 Scalarizing a bi-criterion problem using the max function. Consider the bi-criterion optimization
problem
minimize (f (x), g(x)),
with variable x. Suppose x̃ is the unique minimizer of max{f (x), g(x)}. Is x̃ Pareto optimal?
Either explain why it is, or give a counterexample.
minimize cT x
subject to Ax = b, x ⪰ 0,
with variable x ∈ Rn , with data A ∈ Rm×n , b ∈ Rm , and c ∈ Rn . We denote the optimal value as
f (A, b, c). This can be +∞ (if the LP is infeasible) or −∞ (if it is unbounded below).
(a) Suppose we fix A and b at values for which the LP is feasible. What can you say about the
curvature of the mapping from c to f (A, b, c)? Is it convex? Concave? Neither?
(b) Suppose we fix A and c, with c ⪰ 0 (which implies the LP is not unbounded below). What
can you say about the curvature of the mapping from b to f (A, b, c)? Is it convex? Concave?
Neither?
4.41 Convex-concave procedure. This exercise is about a very simple but powerful heuristic for approxi-
mately solving problems of the form
with variable x ∈ Rn , where fi are convex, i = 0, . . . , m, but g is concave. The objective term g
makes this problem not convex.
We will assume that g is differentiable. We denote the linearization or first order Taylor expansion
of g at the point z as
ĝ(x; z) = g(z) + ∇g(z)T (x − z)
(considered a function of x).
The convex-concave procedure is an iterative algorithm where xk+1 is a solution of the problem
minimize f0 (x) + ĝ(x; xk )
subject to fi (x) ≤ 0, i = 1, . . . , m
Ax = b.
In words: Each new iterate is the solution of the original problem, with the concave term replaced
by its linearization at the previous iterate. (We assume that these problems have solutions.) This
algorithm need not solve the original nonconvex problem globally, but it often finds a very good
approximate solution.
(a) Explain why the problem solved in each iteration of the convex-concave procedure is convex.
(b) Show that f0 (xk+1 )+g(xk+1 ) ≤ f0 (xk )+g(xk ). This means that the convex-concave procedure
is a descent method, i.e., the objective decreases in each iteration.
(c) Consider the problem
minimize −xT P x
subject to ∥x∥2 ≤ 1,
where P ∈ Sn++ . (This problem has a simple analytical solution in terms of the eigenvectors
of P .) Explain how to obtain the update in the convex-concave procedure. (For this simple
problem, the update has an analytical solution.)
Generate a random instance of the problem (i.e., choose P ∈ Sn++ ) and run the convex-concave
procedure from several different starting points. Plot the objective value versus iteration for
each run, along with a horizontal line that shows the (globally) optimal value. How well does
the convex-concave procedure work for this problem?
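As a sanity check of parts (a) and (b), the procedure can be run in a few lines of NumPy on a one-dimensional instance of our own choosing (not part (c)'s problem): f0(x) = (x − 1)², g(x) = −x², with feasible set [−2, 2]. For this instance each linearized subproblem reduces to projecting 1 + xk onto the interval:

```python
import numpy as np

# Toy instance (our choice, not from the exercise): minimize
#   f0(x) + g(x) = (x - 1)^2 - x^2 = 1 - 2x  over  x in [-2, 2],
# where f0 is convex and g(x) = -x^2 is concave.
f0 = lambda x: (x - 1) ** 2
g = lambda x: -x ** 2

x = 0.0
vals = [f0(x) + g(x)]
for _ in range(10):
    # Linearized subproblem: minimize (x - 1)^2 - 2*x_k*x over [-2, 2];
    # its closed-form solution is the projection of 1 + x_k onto [-2, 2].
    x = np.clip(1.0 + x, -2.0, 2.0)
    vals.append(f0(x) + g(x))

print(x)     # 2.0, the global minimizer of 1 - 2x on [-2, 2]
print(vals)  # nonincreasing, ending at -3.0
```

The recorded objective values decrease monotonically, as the descent property of part (b) predicts, and here the procedure happens to reach the global minimum.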
4.42 A simple GP. Use CVXPY to solve an instance of the GP described in the book in exercise 4.30,
with data
α1 = 0.2, α2 = 0.1, α3 = 0.3, α4 = 1, Cmax = 60,
and
Tmin = 10, Tmax = 40, rmin = 35, rmax = 80, wmin = 3, wmax = 4.
Give the optimal cost, and optimal values of the variables T , r, and w.
You do not need to express the problem as a canonical GP (i.e., with righthand sides all one), or
convert the GP to a convex problem. In CVXPY, you’ll use disciplined geometric programming
(DGP), as described in cvxpy.org/tutorial/dgp/.
4.43 Minimum fuel regulation cost. Consider the linear dynamical system
where xt ∈ Rn is the state and ut ∈ Rm is the input at time t, and the matrices A and B are given.
Consider the function ϕ : Rn → R, where ϕ(z) is the optimal value of the problem
with variables x1 , . . . , xT and u1 , . . . , uT −1 . This problem is called the minimum fuel regulation
problem, since the objective is a basic model of fuel use, and regulation refers to finding a sequence
of inputs that results in the state being zero at t = T . We will assume that the problem above is
feasible for any z ∈ Rn . Roughly speaking, ϕ(z) gives the minimum fuel required to move the state
from x1 = z to xT = 0.
Is ϕ convex, concave, affine, or neither?
5 Duality
5.1 Numerical perturbation analysis example. Consider the quadratic program
minimize x1^2 + 2x2^2 − x1 x2 − x1
subject to x1 + 2x2 ≤ u1
x1 − 4x2 ≤ u2 ,
5x1 + 76x2 ≤ 1,
with variables x1 , x2 , and parameters u1 , u2 .
(a) Solve this QP, for parameter values u1 = −2, u2 = −3, to find optimal primal variable values
x⋆1 and x⋆2 , and optimal dual variable values λ⋆1 , λ⋆2 and λ⋆3 . Let p⋆ denote the optimal objective
value. Verify that the KKT conditions hold for the optimal primal and dual variables you
found (within reasonable numerical accuracy).
Hint: Check the documentation or users’ guides for CVXPY to find out how to retrieve
optimal dual variables.
(b) We will now solve some perturbed versions of the QP, with
u1 = −2 + δ1 , u2 = −3 + δ2 ,
where δ1 and δ2 each take values from {−0.1, 0, 0.1}. (There are a total of nine such combi-
nations, including the original problem with δ1 = δ2 = 0.) For each combination of δ1 and δ2 ,
make a prediction p⋆pred of the optimal value of the perturbed QP, and compare it to p⋆exact ,
the exact optimal value of the perturbed QP (obtained by solving the perturbed QP). Put
your results in the two righthand columns in a table with the form shown below. Check that
the inequality p⋆pred ≤ p⋆exact holds.
δ1 δ2 p⋆pred p⋆exact
0 0
0 −0.1
0 0.1
−0.1 0
−0.1 −0.1
−0.1 0.1
0.1 0
0.1 −0.1
0.1 0.1
bounds on the covariance of the random variables yi = ATi z ∈ Rki, which is ATi XAi. The problem
is to find the covariance matrix X of z that is consistent with the known upper bounds on the
covariance of the yi and has the largest volume confidence ellipsoid.
Derive the Lagrange dual of this problem. Be sure to state what the dual variables are (e.g.,
vectors, scalars, matrices), any constraints they must satisfy, and what the dual function is. If
the dual function has any implicit equality constraints, make them explicit. You can assume that
Σ_{i=1}^m Ai AiT ≻ 0, which implies the feasible set of the original problem is bounded.
What can you say about the optimal duality gap for this problem?
This is a convex function, jointly in x and y. In the following problem we calculate the vector x
that minimizes the relative entropy with a given vector y, subject to equality constraints on x:
minimize Σ_{k=1}^n xk log(xk/yk)
subject to Ax = b
1T x = 1.
The optimization variable is x ∈ Rn . The domain of the objective function is Rn++ . The parameters
y ∈ Rn++ , A ∈ Rm×n , and b ∈ Rm are given.
Derive the Lagrange dual of this problem and simplify it to get
maximize bT z − log Σ_{k=1}^n yk e^{aTk z},
where ak denotes the kth column of A.
5.4 Source localization from range measurements [Beck, Stoica, and Li]. A signal emitted by a source
at an unknown position x ∈ Rn (n = 2 or n = 3) is received by m sensors at known positions y1 ,
. . . , ym ∈ Rn . From the strength of the received signals, we can obtain noisy estimates dk of the
distances ∥x − yk ∥2 . We are interested in estimating the source position x based on the measured
distances dk .
In the following problem the error between the squares of the actual and observed distances is
minimized:
minimize f0(x) = Σ_{k=1}^m (∥x − yk∥2^2 − dk^2)^2.
The variables are x ∈ Rn , t ∈ R. Although this problem is not convex, it can be shown that
strong duality holds. (It is a variation on the problem discussed on page 229 and in exercise 5.29
of Convex Optimization.)
Solve (7) for an example with m = 5,
y1 = (1.8, 2.5), y2 = (2.0, 1.7), y3 = (1.5, 1.5), y4 = (1.5, 2.0), y5 = (2.5, 1.5),
and
d = (2.00, 1.24, 0.59, 1.31, 1.44).
The figure shows some contour lines of the cost function f0 , with the positions yk indicated by
circles.
To solve the problem, you can note that x⋆ is easily obtained from the KKT conditions for (7) if
the optimal multiplier ν ⋆ for the equality constraint is known. You can use one of the following
two methods to find ν ⋆ .
• Derive the dual problem, express it as an SDP, and solve it using CVX.
• Reduce the KKT conditions to a nonlinear equation in ν, and pick the correct solution (simi-
larly as in exercise 5.29 of Convex Optimization).
5.5 Projection on the ℓ1 ball. Consider the problem of projecting a point a ∈ Rn on the unit ball in
ℓ1 -norm:
minimize (1/2)∥x − a∥22
subject to ∥x∥1 ≤ 1.
Derive the dual problem and describe an efficient method for solving it. Explain how you can
obtain the optimal x from the solution of the dual problem.
5.6 A nonconvex problem with strong duality. On page 229 of Convex Optimization, we consider the
problem
minimize f(x) = xT Ax + 2bT x    (8)
subject to xT x ≤ 1
with variable x ∈ Rn , and data A ∈ Sn , b ∈ Rn . We do not assume that A is positive semidefinite,
and therefore the problem is not necessarily convex. In this exercise we show that x is (globally)
optimal if and only if there exists a λ such that
From this we will develop an efficient method for finding the global solution. The conditions (9)
are the KKT conditions for (8) with the inequality A + λI ⪰ 0 added.
(a) Show that if x and λ satisfy (9), then f (x) = inf x̃ L(x̃, λ) = g(λ), where L is the Lagrangian
of the problem and g is the dual function. Therefore strong duality holds, and x is globally
optimal.
(b) Next we show that the conditions (9) are also necessary. Assume that x is globally optimal
for (8). We distinguish two cases.
(i) ∥x∥2 < 1. Show that (9) holds with λ = 0.
(ii) ∥x∥2 = 1. First prove that (A + λI)x = −b for some λ ≥ 0. (In other words, the negative
gradient −(Ax + b) of the objective function is normal to the unit sphere at x, and points
away from the origin.) You can show this by contradiction: if the condition does not
hold, then there exists a direction v with v T x < 0 and v T (Ax + b) < 0. Show that
f (x + tv) < f (x) for small positive t.
It remains to show that A + λI ⪰ 0. If not, there exists a w with wT (A + λI)w < 0, and
without loss of generality we can assume that wT x ̸= 0. Show that the point y = x + tw
with t = −2wT x/wT w satisfies ∥y∥2 = 1 and f (y) < f (x).
(c) The optimality conditions (9) can be used to derive a simple algorithm for (8). Using the
eigenvalue decomposition A = Σ_{i=1}^n αi qi qiT of A, we make a change of variables yi = qiT x,
and write (8) as
minimize Σ_{i=1}^n αi yi^2 + 2 Σ_{i=1}^n βi yi
subject to yT y ≤ 1,
where βi = qiT b. The transformed optimality conditions (9) are
5.7 Connection between perturbed optimal cost and Lagrange dual functions. In this exercise we explore
the connection between the optimal cost, as a function of perturbations to the righthand sides of
the constraints,
p⋆ (u) = inf{f0 (x) | ∃x ∈ D, fi (x) ≤ ui , i = 1, . . . , m},
(as in §5.6), and the Lagrange dual function
with domain restricted to λ ⪰ 0. We assume the problem is convex. We consider a problem with
inequality constraints only, for simplicity.
We have seen several connections between p⋆ and g:
• Slater’s condition and strong duality. Slater’s condition is: there exists u ≺ 0 for which
p⋆ (u) < ∞. Strong duality (which follows) is: p⋆ (0) = supλ g(λ). (Note that we include the
condition λ ⪰ 0 in the domain of g.)
• A global inequality. We have p⋆ (u) ≥ p⋆ (0) − λ⋆T u, for any u, where λ⋆ maximizes g.
• Local sensitivity analysis. If p⋆ is differentiable at 0, then we have ∇p⋆ (0) = −λ⋆ , where λ⋆
maximizes g.
In fact the two functions are closely related by conjugation. Show that
p⋆(u) = (−g)∗(−u).
Here (−g)∗ is the conjugate of the function −g. You can show this for u ∈ int dom p⋆.
Hint. Consider the problem
minimize f0 (x)
subject to f˜i (x) = fi (x) − ui ≤ 0, i = 1, . . . , m.
Verify that Slater’s condition holds for this problem, for u ∈ int dom p⋆ .
5.8 Exact penalty method for SDP. Consider the pair of primal and dual SDPs
minimize f0 (x)
(10)
subject to fi (x) ≤ 0, i = 1, . . . , m,
where α > 0, is convex. Suppose x̃ minimizes ϕ. Show how to find from x̃ a feasible point for the
dual of (10). Find the corresponding lower bound on the optimal value of (10).
5.10 Boolean least-squares. We consider the non-convex least-squares approximation problem with
Boolean constraints
minimize ∥Ax − b∥22
(11)
subject to xk^2 = 1, k = 1, . . . , n,
where A ∈ Rm×n and b ∈ Rm . We assume that rank(A) = n, i.e., AT A is nonsingular.
One possible application of this problem is as follows. A signal x̂ ∈ {−1, 1}n is sent over a noisy
channel, and received as b = Ax̂ + v where v ∼ N (0, σ 2 I) is Gaussian noise. The solution of (11)
is the maximum likelihood estimate of the input signal x̂, based on the received signal b.
at the optimum of (12), then the relaxation is exact, i.e., the optimal values of problems (11)
and (12) are equal, and the optimal solution z of (12) is optimal for (11). This suggests a
heuristic for rounding the solution of the SDP (12) to a feasible solution of (11), if (13) does
not hold. We compute the eigenvalue decomposition
[ Z z; zT 1 ] = Σ_{i=1}^{n+1} λi [ vi; ti ] [ vi; ti ]T.
(Here we assume the eigenvalues are sorted in decreasing order.) Then we take x = sign(v1)
as our guess of a good solution of (11).
(c) We can also give a probabilistic interpretation of the relaxation (12). Suppose we interpret z
and Z as the first and second moments of a random vector v ∈ Rn (i.e., z = E v, Z = E vv T ).
Show that (12) is equivalent to the problem
E v = z, E vv T = Z (for example, the Gaussian distribution N (z, Z − zz T )). We generate a
number of samples ṽ from the distribution and round them to feasible solutions x = sign(ṽ).
We keep the solution with the lowest objective value as our guess of the optimal solution
of (11).
(d) Solve the dual problem (12) using CVX. Generate problem instances using the Matlab code
randn(’state’,0)
m = 50;
n = 40;
A = randn(m,n);
xhat = sign(randn(n,1));
b = A*xhat + s*randn(m,1);
for four values of the noise level s: s = 0.5, s = 1, s = 2, s = 3. For each problem instance,
compute suboptimal feasible solutions x using the following heuristics and compare the
results.
(i) x(a) = sign(xls ) where xls is the solution of the least-squares problem
(ii) x(b) = sign(z) where z is the optimal value of the variable z in the SDP (12).
(iii) x(c) is computed from a rank-one approximation of the optimal solution of (12), as ex-
plained in part (b) above.
(iv) x(d) is computed by rounding 100 samples of N (z, Z −zz T ), as explained in part (c) above.
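If you are working in Python rather than Matlab, a roughly equivalent instance generator is sketched below. Note that NumPy's random stream differs from Matlab's randn('state',0), so these instances will not match the Matlab data bit-for-bit:

```python
import numpy as np

rng = np.random.default_rng(0)  # plays the role of randn('state',0)

m, n = 50, 40
s = 0.5  # noise level; repeat for s in (0.5, 1, 2, 3)

A = rng.standard_normal((m, n))
xhat = np.sign(rng.standard_normal(n))  # entries in {-1, +1} (a.s.)
b = A @ xhat + s * rng.standard_normal(m)

print(A.shape, xhat.shape, b.shape)  # (50, 40) (40,) (50,)
```

Each heuristic solution x can then be scored by ∥Ax − b∥2² and compared against the others, as the problem asks.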
minimize f0 (x)
(14)
subject to fi (x) ≤ 0, i = 1, . . . , m.
(b) Let p⋆ denote the optimal value of (14) (so the optimal value of (15) is exp p⋆ ). From λ we
obtain the bound
p⋆ ≥ g(λ),
where g is the dual function for (14). From λ̃ we obtain the bound exp p⋆ ≥ g̃(λ̃), where g̃ is
the dual function for (15). This can be expressed as
p⋆ ≥ log g̃(λ̃).
How do these bounds compare? Are they the same, or is one better than the other?
5.12 Variable bounds and dual feasibility. In many problems the constraints include variable bounds, as
in
minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, . . . , m (16)
li ≤ xi ≤ ui , i = 1, . . . , n.
Let µ ∈ Rn+ be the Lagrange multipliers associated with the constraints xi ≤ ui, and let ν ∈ Rn+
be the Lagrange multipliers associated with the constraints li ≤ xi. Thus the Lagrangian is
L(x, λ, µ, ν) = f0(x) + Σ_{i=1}^m λi fi(x) + µT (x − u) + ν T (l − x).
(a) Show that for any x ∈ Rn and any λ, we can choose µ ⪰ 0 and ν ⪰ 0 so that x minimizes
L(x, λ, µ, ν). In particular, it is very easy to find dual feasible points.
(b) Construct a dual feasible point (λ, µ, ν) by applying the method you found in part (a) with
x = (l + u)/2 and λ = 0. From this dual feasible point you get a lower bound on f ⋆ . Show
that this lower bound can be expressed as
5.13 Deducing costs from samples of optimal decision. A system (such as a firm or an organism) chooses
a vector of values x as a solution of the LP
minimize cT x
subject to Ax ⪰ b,
where x(j) is an optimal point for the LP, with b = b(j) . (The solution of an LP need not be unique;
all we say here is that x(j) is an optimal solution.) Roughly speaking, we have samples of optimal
decisions, for different values of requirements.
You do not know the cost vector c. Your job is to compute the tightest possible bounds on the
costs ci from the given data. More specifically, you are to find ci^max and ci^min, the maximum and
minimum possible values for ci, consistent with the given data.
Note that if x is optimal for the LP for a given c, then it is also optimal if c is scaled by any positive
factor. To normalize c, then, we will assume that c1 = 1. Thus, we can interpret ci as the relative
cost of activity i, compared to activity 1.
5.15 State and solve the optimality conditions for the problem
minimize log det ( [ X1 X2; X2T X3 ]^{−1} )
subject to tr X1 = α
tr X2 = β
tr X3 = γ.
5.16 Consider the optimization problem
with domain Sn++ and variable X ∈ Sn . The matrix S ∈ Sn is given. Show that the optimal Xopt
satisfies
(Xopt^{−1})ij = Sij ,    |i − j| ≤ 1.
5.17 We denote by f (A) the sum of the largest r eigenvalues of a symmetric matrix A ∈ Sn (with
1 ≤ r ≤ n), i.e.,
f(A) = Σ_{k=1}^r λk(A),
maximize tr(AX)
subject to tr X = r
0 ⪯ X ⪯ I,
minimize f (A(x)),
minimize f0 (x)
(17)
subject to fi (x) ≤ 0, i = 1, . . . , m
with dual
maximize g(λ)
(18)
subject to λ ⪰ 0.
We assume that Slater’s condition holds, so we have strong duality and the dual optimum is
attained. For simplicity we will assume that there is a unique dual optimal solution λ⋆ .
For fixed t > 0, consider the unconstrained minimization problem
(b) We can express (19) as
minimize f0(x) + ty
subject to fi(x) ≤ y, i = 1, . . . , m    (20)
0 ≤ y
(The second term in (19) is called a penalty function for the constraints in (17). It is zero if x is
feasible, and adds a penalty to the cost function when x is infeasible. The penalty function is called
exact because for t large enough, the solution of the unconstrained problem (19) is also a solution
of (17).)
5.19 Infimal convolution. Let f1 , . . . , fm be convex functions on Rn . Their infimal convolution, denoted
g = f1 ⋄ · · · ⋄ fm (several other notations are also used), is defined as
with the natural domain (i.e., defined by g(x) < ∞). In one simple interpretation, fi (xi ) is the cost
for the ith firm to produce a mix of products given by xi ; g(x) is then the optimal cost obtained
if the firms can freely exchange products to produce, all together, the mix given by x. (The name
‘convolution’ presumably comes from the observation that if we replace the sum above with the
product, and the infimum above with integration, then we obtain the normal convolution.)
[Figure: graph of the penalty function φ(u), plotted for −1 ≤ u ≤ 1.]
5.21 Robust LP with polyhedral cost uncertainty. We consider a robust linear programming problem,
with polyhedral uncertainty in the cost:
    minimize    sup_{c∈C} c^T x
    subject to  Ax ⪰ b,
with variable x ∈ Rn , where C = {c | F c ⪯ g}. You can think of x as the quantities of n products
to buy (or sell, when xi < 0), Ax ⪰ b as constraints, requirements, or limits on the available
quantities, and C as giving our knowledge or assumptions about the product prices at the time we
place the order. The objective is then the worst possible (i.e., largest) cost, given the quantities x,
consistent with our knowledge of the prices.
In this exercise, you will work out a tractable method for solving this problem. You can assume
that C ̸= ∅, and the inequalities Ax ⪰ b are feasible.
(a) Let f (x) = supc∈C cT x be the objective in the problem above. Explain why f is convex.
(b) Find the dual of the problem
maximize cT x
subject to F c ⪯ g,
with variable c. (The problem data are x, F , and g.) Explain why the optimal value of the
dual is f (x).
This shows that you can express the worst-case cost supc∈C cT x as the optimal value of a
convex minimization problem, in which x, F , and g appear as data.
(c) Use the expression for f (x) found in part (b) in the original problem, to obtain a single LP
equivalent to the original robust LP.
Hint. Use the rule expressed roughly as: min-min is the same as min. More precisely the
problem
minimize inf v∈V f (u, v)
subject to u ∈ U,
with variable u (and v being a dummy variable), is equivalent to the problem
minimize f (u, v)
subject to u ∈ U, v ∈ V,
with variables u and v.
(d) Carry out the method found in part (c) to solve a robust LP with the data below. In Python:
import numpy as np
np.random.seed(10)
(m, n) = (30, 10)
A = np.random.rand(m, n)
b = np.random.rand(m, 1)
c_nom = np.ones((n, 1)) + np.random.rand(n, 1)
Then, use C described as follows. Each ci deviates no more than 25% from its nominal value,
i.e., 0.75 cnom ⪯ c ⪯ 1.25 cnom, and the average of c does not deviate more than 10% from the
average of the nominal values, i.e., 0.9 (1^T cnom)/n ≤ (1^T c)/n ≤ 1.1 (1^T cnom)/n.
Compare the worst-case cost f(x) and the nominal cost cnom^T x for x optimal for the robust
problem, and for x optimal for the nominal problem, i.e., with objective cnom^T x. Compare the
values and make a brief comment.
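As a sanity check on parts (b) and (c), note that the inner maximization is itself just an LP in c, so for any fixed x the worst-case cost f(x) can be evaluated directly with an LP solver. The sketch below uses scipy.optimize.linprog as a stand-in for the CVX* formulation; the box-shaped set C and the point x are made-up toys, not the data above:

```python
import numpy as np
from scipy.optimize import linprog

def worst_case_cost(x, F, g):
    """f(x) = sup {c^T x : F c <= g}, evaluated by solving the LP in c.
    linprog minimizes, so we minimize -x^T c and negate the optimal value."""
    res = linprog(-x, A_ub=F, b_ub=g, bounds=(None, None))
    return -res.fun

# made-up toy uncertainty set: the box 0.75 <= c_i <= 1.25, written as F c <= g
n = 2
F = np.vstack([np.eye(n), -np.eye(n)])
g = np.concatenate([1.25 * np.ones(n), -0.75 * np.ones(n)])
x = np.array([1.0, -1.0])
print(worst_case_cost(x, F, g))  # worst c = (1.25, 0.75), giving 0.5
```

For x = (1, −1) the maximizing c pushes the first coordinate to its upper limit and the second to its lower limit, consistent with the "largest cost" interpretation.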
5.22 Diagonal scaling with prescribed column and row sums [Marshall and Olkin]. Let A be an n × n
matrix with positive entries, and let c and d be positive n-vectors that satisfy 1T c = 1T d = 1.
Consider the geometric program
    minimize    x^T A y
    subject to  ∏_{i=1}^n xi^{ci} = 1
                ∏_{j=1}^n yj^{dj} = 1,
with variables x, y ∈ Rn (and implicit constraints x ≻ 0, y ≻ 0). Write this geometric program
in convex form and derive the optimality conditions. Show that if x and y are optimal, then the
matrix
    B = (1 / (x^T A y)) diag(x) A diag(y)
satisfies B1 = c and B T 1 = d.
5.23 [Schoenberg] Suppose m balls in Rn , with centers ai and radii ri , have a nonempty intersection.
We define y to be a point in the intersection, so
∥y − ai ∥2 ≤ ri , i = 1, . . . , m. (21)
Suppose we move the centers to new positions bi in such a way that the distances between the
centers do not increase:
∥bi − bj ∥2 ≤ ∥ai − aj ∥2 , i, j = 1, . . . , m. (22)
We will prove that the intersection of the translated balls is nonempty, i.e., there exists a point x
with ∥x − bi ∥2 ≤ ri , i = 1, . . . , m. To show this we prove that the optimal value of
    minimize    t
    subject to  ∥x − bi∥2^2 ≤ ri^2 + t,  i = 1, . . . , m,        (23)

with variables x and t, is less than or equal to zero.
(a) Show that (22) implies that
t − (x − bi )T (x − bj ) ≤ −(y − ai )T (y − aj ) for i, j ∈ I,
5.24 Controlling a switched linear system via duality. We consider a discrete-time dynamical system
with state xt ∈ Rn . The state propagates according to the recursion
xt+1 = At xt , t = 0, 1, . . . , T − 1,
where the matrices At are to be chosen from a finite set A = {A(1) , . . . , A(K) } in order to control the
state xt over a finite time horizon of length T. More formally, the switched-linear control problem
is

    minimize    Σ_{t=1}^T f(xt)
    subject to  xt+1 = A^{(ut)} xt,  t = 0, . . . , T − 1.
The problem variables are xt ∈ Rn , for t = 1, . . . , T , and ut ∈ {1, . . . , K}, for t = 0, . . . , T − 1.
We assume the initial state, x0 ∈ Rn is a problem parameter (i.e., is known and fixed). You may
assume the function f is convex, though it isn’t necessary for this problem.
Note that, to find a feasible point, we take any sequence u0 , . . . , uT −1 ∈ {1, . . . , K}; we then
generate a feasible point according to the recursion
xt+1 = A(ut ) xt , t = 0, 1, . . . , T − 1.
The switched-linear control problem is not convex, and is hard to solve globally. Instead, we
consider a heuristic based on Lagrange duality.
(a) Find the dual of the switched-linear control problem explicitly in terms of x0 , A(1) , . . . , A(K) ,
the function f , and its conjugate f ∗ . Your formulation cannot involve a number of constraints
or objective terms that is exponential in K or T . (This includes minimization or maximization
with an exponential number of terms.)
(b) Given optimal dual variables ν1⋆ , . . . , νT⋆ corresponding to the T constraints of the switched-
linear control problem, a heuristic to choose ut is to minimize the Lagrangian using these
optimal dual variables:
Given the optimal dual variables, show (explicitly) how to find ũ0 , . . . , ũT −1 .
(c) Consider the case f(x) = (1/2) x^T Q x, with Q ∈ Sn++. For the data given in sw_lin_ctrl_data.*,
solve the dual problem and report its optimal value d⋆ , which is a lower bound on p⋆ . (As a
courtesy, we also included p⋆ in the data file, so you can check your bound.)
(d) Using the same data as in part (c), carry out the heuristic method of part (b) to compute
ũ0 , . . . , ũT −1 . Use these values to generate a feasible point. Report the value of the objective
at this feasible point, which is an upper bound on p⋆ .
5.25 [Friedland and Karlin] Let A be an n × n matrix with positive entries, and let u and v be two
positive n-vectors. Show that one can compute positive diagonal matrices D1 and D2 that satisfy
(24) by the following method. Define αi = ui vi for i = 1, . . . , n, and solve the optimization problem

    minimize    ∏_{i=1}^n ( Σ_{j=1}^n Aij xj )^{αi}
    subject to  ∏_{i=1}^n xi^{αi} = 1.                (25)
The first equality in (24) follows immediately from the expressions of D1 and D2 . To show the
second equality in (24), express (25) as a convex optimization problem and derive the optimality
conditions.
Here h is the function

    h(u) =
        (u − 1)^2/2    u ≥ 1
        0              otherwise.
Derive the Lagrange dual of the equivalent problem
    minimize    Σ_{i=1}^m h(∥yi∥2) − c^T x
    subject to  Ai x + bi − yi = 0,  i = 1, . . . , m,
with variables x ∈ Rn and yi ∈ R3 for i = 1, . . . , m.
This optimization problem describes the equilibrium of a structure consisting of m elastic cables
suspended between different points or nodes. Some of the nodes are anchored, other nodes are free.
The variable x contains the displacements of the free nodes. The vector c specifies the external
forces applied to the nodes. The norm ∥Ai x + bi ∥2 is the distance between the endpoints of the ith
cable as a function of the node displacements. The ith term in the sum in the cost function is the
potential energy stored in the ith cable, assuming its undeformed length is one.
5.28 Robust least squares with polyhedral uncertainty. We consider a robust least-squares problem
    minimize    Σ_{i=1}^m sup_{ai∈Pi} (ai^T x − bi)^2
(b) What does the result of part (a) imply about the convexity properties of f (A)?
(c) Derive the Lagrange dual of the SDP in part (a). Use the dual problem to give an SDP
formulation of the problem
minimize f (A0 + x1 A1 + · · · + xp Ap )
has a finite optimal value p⋆ and a dual optimal solution z ⋆ that satisfies z ⋆ ⪯ 1. Let q ⋆ be
the optimal value of (26). Show that
    p⋆ ≤ q⋆ ≤ p⋆ + (m log 2)/µ.
5.31 In this problem, r is an integer between 1 and n, and ∥x∥ denotes the norm

    ∥x∥ = Σ_{i=1}^r |x|[i],

the sum of the r largest absolute values of the components of x (with |x|[i] the ith largest of the
numbers |x1|, . . . , |xn|).
(a) Explain why ∥x∥ is the optimal value of the optimization problem
maximize xT y
subject to ∥y∥∞ ≤ 1
∥y∥1 ≤ r.
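Part (a) can be checked numerically: linearizing the ℓ1 constraint with an elementwise bound t ≥ |y| turns the maximization into an LP, whose optimal value should equal the sum of the r largest |xi|. A sketch with scipy.optimize.linprog (the helper name is our own, not part of the exercise):

```python
import numpy as np
from scipy.optimize import linprog

def r_largest_norm_lp(x, r):
    """Value of: maximize x^T y  s.t. ||y||_inf <= 1, ||y||_1 <= r,
    with t >= |y| introduced elementwise to linearize the l1 constraint."""
    n = len(x)
    I = np.eye(n)
    # variables z = (y, t): y - t <= 0, -y - t <= 0, sum(t) <= r, 0 <= t <= 1
    A_ub = np.block([[I, -I], [-I, -I],
                     [np.zeros((1, n)), np.ones((1, n))]])
    b_ub = np.concatenate([np.zeros(2 * n), [float(r)]])
    obj = np.concatenate([-x, np.zeros(n)])       # maximize x^T y
    bnds = [(None, None)] * n + [(0, 1)] * n
    return -linprog(obj, A_ub=A_ub, b_ub=b_ub, bounds=bnds).fun

x = np.array([3.0, -1.0, 0.5, -2.0])
print(r_largest_norm_lp(x, 2))  # |3| + |-2| = 5.0
```

The maximizing y places sign(xi) on the r entries of largest magnitude and zero elsewhere.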
(b) From part (a), −∥x∥ is the optimal value of the convex optimization problem
minimize f (y)
subject to ∥y∥1 ≤ r
as a quadratic program.
5.32 Consider the following optimization problem with two variables x1 and x2 :
    minimize    x1
    subject to  (x1^2 + x2^2)^{1/2} ≤ x2
                −x1 ≤ 1.
5.33 Derive the Lagrange duals of the following two problems, which involve the penalty function

    ϕ(u) =
        0           u ≤ 0
        u^2/2       0 < u ≤ 1
        u − 1/2     u > 1.
(a)
    minimize    Σ_{i=1}^m ϕ(yi)
    subject to  Ax + b = y
(b)
minimize ϕ(∥y∥2 )
subject to Ax + b = y.
5.34 Some standard duals. Give (Lagrange) dual problems for the following convex optimization prob-
lems.
(a)
    minimize    (1/2) x^T P x
    subject to  Ax = b,

where P ⪰ 0 but may be singular.
(b)
    minimize    (1/2) ∥Ax − b∥2^2 + λ∥x∥1.
5.35 In this problem we show that the functions

    g(x) = ( ∏_{i=1}^n xi )^{1/n}    and    gk(x) = ( ∏_{i=n−k+1}^n x[i] )^{1/k}
(where we recall the notation that x[i] is the ith largest component of x ∈ Rn , so that x[1] ≥ x[2] ≥
· · · ≥ x[n] ) are both concave on Rn++ .
(c) Explain (in one sentence) why parts (a) and (b) imply that g and gk are concave over their
domains Rn++ .
5.36 Sensitivity analysis. Consider the convex optimization problem
minimize f0 (x)
subject to f1 (x) ≤ s, Ax = b,
with variable x ∈ Rn, parametrized by the real number s. We assume that strong duality
holds for some nominal value s = snom. Let λ⋆ be an optimal dual variable (Lagrange multiplier)
associated with the constraint f1 (x) ≤ snom . Below we consider scenarios in which we change the
value of s below or above the nominal value snom , and then solve the modified problem. We are
interested in the optimal objective value of this modified problem, compared to the original one
above.
For each of the following, choose the best response. (Please note that the words were carefully
chosen.)
5.37 Consider a convex optimization problem

    minimize    f0(x)
    subject to  fi(x) ≤ 0,  i = 1, . . . , m
                Ax = b,

with variable x ∈ Rn, that satisfies Slater's constraint qualification. Determine whether each of the
statements below is true or false. True means it holds with no further assumptions.
(a) The primal and dual problems have the same objective value.
(b) The primal problem has a unique solution.
(c) The dual problem is not unbounded.
(d) Suppose x⋆ is optimal, with f1 (x⋆ ) = −0.2. Then for every dual optimal point (λ⋆ , ν ⋆ ), we
have λ⋆1 = 0.
6 Approximation and fitting
6.1 Three measures of the spread of a group of numbers. For x ∈ Rn , we define three functions that
measure the spread or width of the set of its elements (or coefficients). The first function is the
spread, defined as
    ϕsprd(x) = max_{i=1,...,n} xi − min_{i=1,...,n} xi.
This is the width of the smallest interval that contains all the elements of x.
The second function is the standard deviation, defined as
    ϕstdev(x) = ( (1/n) Σ_{i=1}^n xi^2 − ( (1/n) Σ_{i=1}^n xi )^2 )^{1/2}.
This is the statistical standard deviation of a random variable that takes the values x1 , . . . , xn , each
with probability 1/n.
The third function is the average absolute deviation from the median of the values:
    ϕaamd(x) = (1/n) Σ_{i=1}^n |xi − med(x)|,
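For concreteness, the three spread measures can be written directly in NumPy (a small illustration with our own helper names, not part of the exercise):

```python
import numpy as np

def spread(x):
    """Width of the smallest interval containing all entries of x."""
    return x.max() - x.min()

def stdev(x):
    """Statistical standard deviation with each value weighted 1/n."""
    return np.sqrt(np.mean(x ** 2) - np.mean(x) ** 2)

def aamd(x):
    """Average absolute deviation from the median."""
    return np.mean(np.abs(x - np.median(x)))

x = np.array([1.0, 2.0, 4.0])
print(spread(x), stdev(x), aamd(x))  # 3.0, sqrt(14)/3 ~ 1.247, 1.0
```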
6.3 Approximation with trigonometric polynomials. Suppose y : R → R is a 2π-periodic function. We
will approximate y with the trigonometric polynomial
    f(t) = Σ_{k=0}^K ak cos(kt) + Σ_{k=1}^K bk sin(kt).
We consider two approximations: one that minimizes the L2 -norm of the error, defined as
    ∥f − y∥2 = ( ∫_{−π}^{π} (f(t) − y(t))^2 dt )^{1/2},
(A standard rule of thumb is to take N at least 10 times larger than K.) The L1 approximation
(or really, an approximation of the L1 approximation) can now be found by solving the (finite-
dimensional) convex problem, which can be converted to an LP.
We consider a specific case, where y is a 2π-periodic square-wave, defined for −π ≤ t ≤ π as
    y(t) =
        1    |t| ≤ π/2
        0    otherwise.
(The graph of y over a few cycles explains the name ‘square-wave’.)
Find the optimal L2 approximation and (discretized) L1 optimal approximation for K = 10. You
can find the L2 optimal approximation analytically, or by solving a least-squares problem associated
with the discretized version of the problem. Since y is even, you can take the sine coefficients in
your approximations to be zero. Show y and the two approximations on a single plot.
In addition, plot a histogram of the residuals (i.e., the numbers f (ti )−y(ti )) for the two approxima-
tions. Use the same horizontal axis range, so the two residual distributions can easily be compared.
Make some brief comments about what you see.
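For the L2 part, discretizing and using least squares takes only a few lines of NumPy. The sketch below (our own notation; the L1 part would need an LP instead) fits the cosine coefficients of the square wave, and the leading coefficients come out near the analytic Fourier values 1/2 and 2/π:

```python
import numpy as np

K, N = 10, 200                                # rule of thumb: N at least 10x K
t = np.linspace(-np.pi, np.pi, N, endpoint=False)
y = (np.abs(t) <= np.pi / 2).astype(float)    # the 2*pi-periodic square wave
Phi = np.cos(np.outer(t, np.arange(K + 1)))   # cosine basis (y is even), N x (K+1)
a = np.linalg.lstsq(Phi, y, rcond=None)[0]    # discretized L2 fit
print(a[:3])  # close to the Fourier coefficients 1/2, 2/pi, 0
```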
6.4 Penalty function approximation. We consider the approximation problem
minimize ϕ(Ax − b)
where A ∈ Rm×n and b ∈ Rm , the variable is x ∈ Rn , and ϕ : Rm → R is a convex penalty
function that measures the quality of the approximation Ax ≈ b. We will consider the following
choices of penalty function:
(a) Euclidean norm.
    ϕ(y) = ∥y∥2 = ( Σ_{k=1}^m yk^2 )^{1/2}.
(b) ℓ1 -norm.
    ϕ(y) = ∥y∥1 = Σ_{k=1}^m |yk|.
where |y|[1] , |y|[2] , |y|[3] , . . . , denote the absolute values of the components of y sorted in
decreasing order.
(d) A piecewise-linear penalty.
    ϕ(y) = Σ_{k=1}^m h(yk),    h(u) =
        0             |u| ≤ 0.2
        |u| − 0.2     0.2 ≤ |u| ≤ 0.3
        2|u| − 0.5    |u| ≥ 0.3.
(e) Huber penalty, with M = 0.2.
(f) Log-barrier penalty.
    ϕ(y) = Σ_{k=1}^m h(yk),    h(u) = −log(1 − u^2),    dom h = {u | |u| < 1}.
m = 200;
n = 100;
A = randn(m,n);
b = randn(m,1);
b = b/(1.01*max(abs(b)));
(The normalization of b ensures that the domain of ϕ(Ax − b) is nonempty if we use the log-barrier
penalty.) To compare the results, plot a histogram of the vector of residuals y = Ax − b, for each
of the solutions x, using the Matlab command
hist(A*x-b,m/2);
Some additional hints and remarks for the individual problems:
(a) This problem can be solved using least-squares (x=A\b).
(b) Use the CVX function norm(y,1).
(c) Use the CVX function norm_largest().
(d) Use CVX, with the overloaded max(), abs(), and sum() functions.
(e) Use the CVX function huber().
(f) The current version of CVX handles the logarithm using an iterative procedure, which is slow
and not entirely reliable. However, you can reformulate this problem as
    maximize    ( ∏_{k=1}^m (1 − (Ax − b)k)(1 + (Ax − b)k) )^{1/(2m)},
and use the CVX function geo_mean().
6.5 ℓ1.5 optimization. Optimization and approximation methods that use both an ℓ2 -norm (or its
square) and an ℓ1 -norm are currently very popular in statistics, machine learning, and signal and
image processing. Examples include Huber estimation, LASSO, basis pursuit, SVM, various ℓ1 -
regularized classification methods, total variation de-noising, etc. Very roughly, an ℓ2 -norm cor-
responds to Euclidean distance (squared), or the negative log-likelihood function for a Gaussian;
in contrast the ℓ1 -norm gives ‘robust’ approximation, i.e., reduced sensitivity to outliers, and also
tends to yield sparse solutions (of whatever the argument of the norm is). (All of this is just
background; you don’t need to know any of this to solve the problem.)
In this problem we study a natural method for blending the two norms, by using the ℓ1.5 -norm,
defined as

    ∥z∥1.5 = ( Σ_{i=1}^k |zi|^{3/2} )^{2/3}

for z ∈ Rk. We will consider the simplest approximation or regression problem:
minimize ∥Ax − b∥1.5 ,
with variable x ∈ Rn, and problem data A ∈ Rm×n and b ∈ Rm. We will assume that m > n and
that A is full rank (i.e., rank n). The hope is that this ℓ1.5 -optimal approximation problem should
share some of the good features of ℓ2 and ℓ1 approximation.
(a) Give optimality conditions for this problem. Try to make these as simple as possible.
(b) Explain how to formulate the ℓ1.5 -norm approximation problem as an SDP. (Your SDP can
include linear equality and inequality constraints.)
(c) Solve the specific numerical instance generated by the following code:
randn(’state’,0);
A=randn(100,30);
b=randn(100,1);
Numerically verify the optimality conditions. Give a histogram of the residuals, and repeat
for the ℓ2 -norm and ℓ1 -norm approximations. You can use any method you like to solve the
problem (but of course you must explain how you did it); in particular, you do not need to
use the SDP formulation found in part (b).
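If a modeling tool is not at hand, one heuristic way to obtain a candidate ℓ1.5 solution (useful for checking the optimality conditions of part (a)) is iteratively reweighted least squares. This is a sketch under our own assumptions, not the intended CVX*/SDP solution, and it comes with no convergence guarantee:

```python
import numpy as np

def l15_irls(A, b, iters=100, eps=1e-8):
    """Heuristic for minimize ||Ax - b||_1.5 via iteratively reweighted LS:
    sum_i |r_i|^{3/2} = sum_i w_i r_i^2 with w_i = |r_i|^{-1/2}, so we refit
    a weighted least-squares problem with frozen weights and repeat."""
    x = np.linalg.lstsq(A, b, rcond=None)[0]       # start from the l2 solution
    for _ in range(iters):
        sw = (np.abs(A @ x - b) + eps) ** (-0.25)  # square roots of the weights
        x = np.linalg.lstsq(A * sw[:, None], sw * b, rcond=None)[0]
    return x

rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 30)), rng.standard_normal(100)
x15 = l15_irls(A, b)
obj = lambda x: np.sum(np.abs(A @ x - b) ** 1.5) ** (2 / 3)
xls = np.linalg.lstsq(A, b, rcond=None)[0]
print(obj(x15), obj(xls))  # the l1.5 objective at the IRLS point vs. the LS point
```

Each reweighted step minimizes a quadratic majorant of the ℓ1.5 objective, so the objective should not increase from the least-squares starting point.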
6.6 Total variation image interpolation. A grayscale image is represented as an m × n matrix of
intensities U^orig. You are given the values U^orig_ij, for (i, j) ∈ K, where K ⊂ {1, . . . , m} × {1, . . . , n}.
Your job is to interpolate the image, by guessing the missing values. The reconstructed image
will be represented by U ∈ Rm×n, where U satisfies the interpolation conditions Uij = U^orig_ij for
(i, j) ∈ K.
The reconstruction is found by minimizing a roughness measure subject to the interpolation con-
ditions. One common roughness measure is the ℓ2 variation (squared),
    Σ_{i=2}^m Σ_{j=1}^n (Uij − Ui−1,j)^2 + Σ_{i=1}^m Σ_{j=2}^n (Uij − Ui,j−1)^2.
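The ℓ2 variation is easy to evaluate with np.diff; a tiny sketch (our own helper, not part of the exercise):

```python
import numpy as np

def l2_variation(U):
    """Squared l2 variation: sums of squared differences between vertically
    and horizontally adjacent entries of the image U."""
    return np.sum(np.diff(U, axis=0) ** 2) + np.sum(np.diff(U, axis=1) ** 2)

U = np.array([[0.0, 1.0],
              [2.0, 3.0]])
print(l2_variation(U))  # (2-0)^2+(3-1)^2 + (1-0)^2+(3-2)^2 = 10.0
```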
6.7 Piecewise-linear fitting. In many applications some function in the model is not given by a formula,
but instead as tabulated data. The tabulated data could come from empirical measurements,
historical data, numerically evaluating some complex expression or solving some problem, for a set
of values of the argument. For use in a convex optimization model, we then have to fit these data
with a convex function that is compatible with the solver or other system that we use. In this
problem we explore a very simple problem of this general type.
Suppose we are given the data (xi , yi ), i = 1, . . . , m, with xi , yi ∈ R. We will assume that xi are
sorted, i.e., x1 < x2 < · · · < xm . Let a0 < a1 < a2 < · · · < aK be a set of fixed knot points, with
a0 ≤ x1 and aK ≥ xm . Explain how to find the convex piecewise linear function f , defined over
[a0 , aK ], with knot points ai , that minimizes the least-squares fitting criterion
m
X
(f (xi ) − yi )2 .
i=1
You must explain what the variables are and how they parametrize f , and how you ensure convexity
of f .
Hints. One method to solve this problem is based on the Lagrange basis, f0 , . . . , fK , which are the
piecewise linear functions that satisfy
fj (ai ) = δij , i, j = 0, . . . , K.
Another method is based on defining f (x) = αi x + βi , for x ∈ (ai−1 , ai ]. You then have to add
conditions on the parameters αi and βi to ensure that f is continuous and convex.
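The first hint can be made concrete with np.interp, which evaluates exactly these piecewise-linear "hat" functions (the helper name and toy knots below are our own):

```python
import numpy as np

def lagrange_basis(a, x):
    """Matrix F with F[i, j] = f_j(x_i), where f_j is the piecewise-linear
    hat function on knot points a satisfying f_j(a_i) = delta_ij.
    A pwl function with knot values z is then evaluated as F @ z."""
    K = len(a) - 1
    return np.column_stack(
        [np.interp(x, a, np.eye(K + 1)[j]) for j in range(K + 1)])

a = np.array([0.0, 0.5, 1.0])
F = lagrange_basis(a, np.array([0.25, 0.5]))
print(F)  # [[0.5, 0.5, 0.0], [0.0, 1.0, 0.0]]
```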
Apply your method to the data in the file pwl_fit_data.m, which contains data with xj ∈ [0, 1].
Find the best affine fit (which corresponds to a = (0, 1)), and the best piecewise-linear convex
function fit for 1, 2, and 3 internal knot points, evenly spaced in [0, 1]. (For example, for 3 internal
knot points we have a0 = 0, a1 = 0.25, a2 = 0.50, a3 = 0.75, a4 = 1.) Give the least-squares
fitting cost for each one. Plot the data and the piecewise-linear fits found. Express each function
in the form
    f(x) = max_{i=1,...,K} (αi x + βi).
(In this form the function is easily incorporated into an optimization problem.)
6.8 Least-squares fitting with convex splines. A cubic spline (or fourth-order spline) with breakpoints
α0 , α1 , . . . , αM (that satisfy α0 < α1 < · · · < αM ) is a piecewise-polynomial function with the
following properties:
The figure shows an example of a cubic spline f (t) with M = 10 segments and breakpoints α0 = 0,
α1 = 1, . . . , α10 = 10.
and that there exist efficient algorithms for computing g(t) = (g1 (t), . . . , gM +3 (t)). The next figure
shows the 13 B-splines for the breakpoints 0, 1, . . . , 10.
In this exercise we study the problem of fitting a cubic spline to a set of data points, subject to the
constraint that the spline is a convex function. Specifically, the breakpoints α0 , . . . , αM are fixed,
and we are given N data points (tk , yk ) with tk ∈ [α0 , αM ]. We are asked to find the convex cubic
spline f (t) that minimizes the least-squares criterion
N
X
(f (tk ) − yk )2 .
k=1
We will use B-splines to parametrize f , so the variables in the problem are the coefficients x in
f (t) = xT g(t). The problem can then be written as
    minimize    Σ_{k=1}^N ( x^T g(tk) − yk )^2
    subject to  x^T g(t) is convex in t on [α0, αM].        (27)
6.9 Robust least-squares with interval coefficient matrix. An interval matrix in Rm×n is a matrix whose
entries are intervals:

    A = { A ∈ Rm×n | |Aij − Āij| ≤ Rij, i = 1, . . . , m, j = 1, . . . , n }.

The matrix Ā ∈ Rm×n is called the nominal value or center value, and R ∈ Rm×n, which is
elementwise nonnegative, is called the radius.
The robust least-squares problem, with interval matrix, is

    minimize    sup_{A∈A} ∥Ax − b∥2,

with optimization variable x ∈ Rn. The problem data are A (i.e., Ā and R) and b ∈ Rm. The
objective, as a function of x, is called the worst-case residual norm. The robust least-squares
problem is evidently a convex optimization problem.
(a) Formulate the interval matrix robust least-squares problem as a standard optimization prob-
lem, e.g., a QP, SOCP, or SDP. You can introduce new variables if needed. Your reformulation
should have a number of variables and constraints that grows linearly with m and n, and not
exponentially.
(b) Consider the specific problem instance with m = 4, n = 3,
    A = [ 60 ± 0.05   45 ± 0.05    −8 ± 0.05         [ −6 ]
          90 ± 0.05   30 ± 0.05   −30 ± 0.05     b = [ −3 ]
           0 ± 0.05   −8 ± 0.05    −4 ± 0.05 ],      [ 18 ].
(The first part of each entry in A gives Āij ; the second gives Rij , which are all 0.05 here.) Find
the solution xls of the nominal problem (i.e., minimize ∥Āx − b∥2 ), and robust least-squares
solution xrls . For each of these, find the nominal residual norm, and also the worst-case residual
norm. Make sure the results make sense.
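One useful check for part (b): the residual norm is convex in the perturbation, so the worst case over the interval matrix is attained at a vertex (ΔA)ij = ±Rij. Sampling random vertices therefore gives a lower bound on the worst-case residual norm. A sketch with made-up toy data, not the instance above:

```python
import numpy as np

def sampled_wc_residual(Abar, R, b, x, trials=300, seed=0):
    """Lower bound on sup over the interval matrix of ||Ax - b||_2, by
    sampling sign-vertex matrices A = Abar + S*R with S in {-1,+1}^{m x n}.
    (The sup is attained at a vertex, by convexity in the perturbation.)"""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        S = rng.choice([-1.0, 1.0], size=Abar.shape)
        best = max(best, np.linalg.norm((Abar + S * R) @ x - b))
    return best

# 2x2 toy check: worst case is ||(1.2, 1.2)||_2 = 1.2*sqrt(2)
Abar, R = np.eye(2), 0.1 * np.ones((2, 2))
b, x = np.zeros(2), np.ones(2)
print(sampled_wc_residual(Abar, R, b, x))
```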
6.10 Identifying a sparse linear dynamical system. A linear dynamical system has the form

    x(t + 1) = Ax(t) + Bu(t) + w(t),

where x(t) ∈ Rn is the state, u(t) ∈ Rm is the input signal, and w(t) ∈ Rn is the process noise,
at time t. We assume the process noises are IID N (0, W ), where W ≻ 0 is the covariance matrix.
The matrix A ∈ Rn×n is called the dynamics matrix or the state transition matrix, and the matrix
B ∈ Rn×m is called the input matrix.
You are given accurate measurements of the state and input signal, i.e., x(1), . . . , x(T ),
u(1), . . . , u(T − 1), and W is known. Your job is to find a state transition matrix Â and input
matrix B̂ from these
data, that are plausible, and in addition are sparse, i.e., have many zero entries. (The sparser the
better.)
By doing this, you are effectively estimating the structure of the dynamical system, i.e., you are
determining which components of x(t) and u(t) affect which components of x(t + 1). In some
applications, this structure might be more interesting than the actual values of the (nonzero)
coefficients in  and B̂.
By plausible, we mean that
    Σ_{t=1}^{T−1} ∥ W^{−1/2} ( x(t + 1) − Âx(t) − B̂u(t) ) ∥2^2 ≤ n(T − 1) + 2 (2n(T − 1))^{1/2}.
(You can just take this as our definition of plausible. But to explain this choice, we note that when
Â = A and B̂ = B, the left-hand side is χ2 distributed, with n(T − 1) degrees of freedom, and so has
mean n(T − 1) and standard deviation (2n(T − 1))^{1/2}. Thus, the constraint above states that the
LHS does not exceed the mean by more than 2 standard deviations.)
(a) Describe a method for finding  and B̂, based on convex optimization.
We are looking for a very simple method, that involves solving one convex optimization
problem. (There are many extensions of this basic method, that would improve the simple
method, i.e., yield sparser  and B̂ that are still plausible. We’re not asking you to describe
or implement any of these.)
(b) Carry out your method on the data found in sparse_lds_data.m. Give the values of  and
B̂ that you find, and verify that they are plausible.
In the data file, we give you the true values of A and B, so you can evaluate the performance
of your method. (Needless to say, you are not allowed to use these values when forming  and
B̂.) Using these true values, give the number of false positives and false negatives in both Â
and B̂. A false positive in Â, for example, is an entry that is nonzero, while the corresponding
entry in A is zero. A false negative is an entry of  that is zero, while the corresponding
entry of A is nonzero. To judge whether an entry of  (or B̂) is nonzero, you can use the test
|Âij | ≥ 0.01 (or |B̂ij | ≥ 0.01).
6.11 Measurement with bounded errors. A series of K measurements y1 , . . . , yK ∈ Rp , are taken in order
to estimate an unknown vector x ∈ Rq . The measurements are related to the unknown vector x by
yi = Ax + vi , where vi is a measurement noise that satisfies ∥vi ∥∞ ≤ α but is otherwise unknown.
(In other words, the entries of v1 , . . . , vK are no larger than α.) The matrix A and the measurement
noise norm bound α are known. Let X denote the set of vectors x that are consistent with the
observations y1 , . . . , yK , i.e., the set of x that could have resulted in the measurements made. Is X
convex?
Now we will examine what happens when the measurements are occasionally in error, i.e., for a few
i we have no relation between x and yi . More precisely suppose that Ifault is a subset of {1, . . . , K},
and that yi = Ax + vi with ∥vi ∥∞ ≤ α (as above) for i ̸∈ Ifault , but for i ∈ Ifault , there is no relation
between x and yi . The set Ifault is the set of times of the faulty measurements.
Suppose you know that Ifault has at most J elements, i.e., out of K measurements, at most J are
faulty. You do not know Ifault ; you know only a bound on its cardinality (size). For what values of
J is X, the set of x consistent with the measurements, convex?
6.12 Least-squares with some permuted measurements. We want to estimate a vector x ∈ Rn , given
some linear measurements of x corrupted with Gaussian noise. Here’s the catch: some of the
measurements have been permuted.
More precisely, our measurement vector y ∈ Rm has the form
y = P (Ax + v),
where vi are IID N (0, 1) measurement noises, x ∈ Rn is the vector of parameters we wish to
estimate, and P ∈ Rm×m is a permutation matrix. (This means that each row and column of P
has exactly one entry equal to one, and the remaining m − 1 entries zero.) We assume that m > n
and that at most k of the measurements are permuted; i.e., P ei ̸= ei for no more than k indices i.
We are interested in the case when k < m (e.g. k = 0.4m); that is, only some of the measurements
have been permuted. We want to estimate x and P .
Once we make a guess P̂ for P , we can get the maximum likelihood estimate of x by minimizing
∥Ax − P̂ T y∥2 . The residual Ax̂ − P̂ T y is then our guess of what v is, and should be consistent with
being a sample of a N (0, I) vector.
In principle, we can find the maximum likelihood estimate of x and P by solving a set of
(m choose k)(k! − 1) least-squares problems, and choosing one that has minimum residual. But this is not practical unless
m and k are both very small.
Describe a heuristic method for approximately solving this problem, using convex optimization.
(There are many different approaches which work quite well.)
You might find the following fact useful. The solution to

    minimize    ∥Ax − P^T y∥2,

over P ∈ Rm×m a permutation matrix, is the permutation that matches the smallest entry in y
with the smallest entry in Ax, does the same for the second smallest entries, and so forth.
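The matching fact can be verified directly in NumPy: build the sorted-order matching via argsort and compare against brute force over all permutations (a small self-contained check, not part of the solution method):

```python
import numpy as np
from itertools import permutations

def best_match_perm(z, y):
    """Index array sigma minimizing ||z - y[sigma]||_2 over all permutations:
    pair the k-th smallest entry of z with the k-th smallest entry of y."""
    sigma = np.empty(len(z), dtype=int)
    sigma[np.argsort(z)] = np.argsort(y)
    return sigma

rng = np.random.default_rng(0)
z, y = rng.standard_normal(6), rng.standard_normal(6)
sigma = best_match_perm(z, y)
best = np.linalg.norm(z - y[sigma])
# brute-force confirmation over all 6! = 720 permutations
assert all(best <= np.linalg.norm(z - y[list(p)]) + 1e-9
           for p in permutations(range(6)))
print(best)
```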
Carry out your method on the data in ls_perm_meas_data.*. Give your estimate of the permuted
indices. The data file includes the true permutation matrix and value of x (which of course you
cannot use in forming your estimate). Compare the estimate of x you get after your guessed
permutation with the estimate obtained assuming P = I.
Remark. This problem comes up in several applications. In target tracking, we get multiple noisy
measurements of a set of targets, and then guess which targets are the same in the different sets of
measurements. If some of our guesses are wrong (i.e., our target association is wrong) we have the
present problem. In vision systems the problem arises when we have multiple camera views of a
scene, which give us noisy measurements of a set of features. A feature correspondence algorithm
guesses which features in one view correspond to features in other views. If we make some feature
correspondence errors, we have the present problem.
6.13 Fitting with censored data. In some experiments there are two kinds of measurements or data
available: The usual ones, in which you get a number (say), and censored data, in which you don’t
get the specific number, but are told something about it, such as a lower bound. A classic example
is a study of lifetimes of a set of subjects (say, laboratory mice). For those who have died by the end
of data collection, we get the lifetime. For those who have not died by the end of data collection,
we do not have the lifetime, but we do have a lower bound, i.e., the length of the study. These are
the censored data values.
We wish to fit a set of data points,

    (x(1), y(1)), . . . , (x(K), y(K)),

with x(k) ∈ Rn and y(k) ∈ R, with a linear model of the form y ≈ c^T x. The vector c ∈ Rn is the
model parameter, which we want to choose. We will use a least-squares criterion, i.e., choose c to
minimize

    J = Σ_{k=1}^K ( y(k) − c^T x(k) )^2.
Here is the tricky part: some of the values of y (k) are censored; for these entries, we have only a
(given) lower bound. We will re-order the data so that y (1) , . . . , y (M ) are given (i.e., uncensored),
while y (M +1) , . . . , y (K) are all censored, i.e., unknown, but larger than D, a given number. All the
values of x(k) are known.
(a) Explain how to find c (the model parameter) and y (M +1) , . . . , y (K) (the censored data values)
that minimize J.
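One way to carry out part (a), sketched here under our own assumptions rather than as the intended formulation: treat the censored values as extra variables bounded below by D, which gives a bound-constrained linear least-squares problem solvable with scipy.optimize.lsq_linear. The toy data are made up:

```python
import numpy as np
from scipy.optimize import lsq_linear

# hypothetical toy instance: scalar model y ~ c*x, third sample censored at D = 5
X = np.array([1.0, 2.0, 3.0])       # x^(k)
y_unc = np.array([2.0, 4.0])        # uncensored values (first two samples)
D = 5.0                             # y^(3) is only known to satisfy y^(3) >= D

# variables z = (c, w), with w the censored value, fit jointly; rows of the
# residual: c*x1 - y1, c*x2 - y2, c*x3 - w
A = np.array([[1.0, 0.0],
              [2.0, 0.0],
              [3.0, -1.0]])
b = np.array([y_unc[0], y_unc[1], 0.0])
res = lsq_linear(A, b, bounds=([-np.inf, D], [np.inf, np.inf]))
c_hat, w_hat = res.x
print(c_hat, w_hat)  # c near 2, censored value recovered near 6
```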
(b) Carry out the method of part (a) on the data values in cens_fit_data.*. Report ĉ, the value
of c found using this method.
Also find ĉls , the least-squares estimate of c obtained by simply ignoring the censored data
samples, i.e., the least-squares estimate based on the data (x(1), y(1)), . . . , (x(M), y(M)).
The data file contains ctrue , the true value of c, in the vector c_true. Use this to give the two
relative errors
    ∥ctrue − ĉ∥2 / ∥ctrue∥2    and    ∥ctrue − ĉls∥2 / ∥ctrue∥2.
6.14 Spectrum analysis with quantized measurements. A sample is made up of n compounds, in quantities
qi ≥ 0, for i = 1, . . . , n. Each compound has a (nonnegative) spectrum, which we represent as a
vector s(i) ∈ Rm+, for i = 1, . . . , n. (Precisely what s(i) means won't matter to us.) The spectrum
of the sample is given by s = Σ_{i=1}^n qi s(i). We can write this more compactly as s = Sq, where
S ∈ Rm×n is a matrix whose columns are s(1) , . . . , s(n) .
Measurement of the spectrum of the sample gives us an interval for each spectrum value, i.e.,
l, u ∈ Rm+ for which
li ≤ si ≤ ui , i = 1, . . . , m.
(We don’t directly get s.) This occurs, for example, if our measurements are quantized.
Given l and u (and S), we cannot in general deduce q exactly. Instead, we ask you to do the
following. For each compound i, find the range of possible values for qi consistent with the spectrum
measurements. We will denote these ranges as qi ∈ [qimin , qimax ]. Your job is to find qimin and qimax .
Note that if qimin is large, we can confidently conclude that there is a significant amount of compound
i in the sample. If qimax is small, we can confidently conclude that there is not much of compound
i in the sample.
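Each bound qimin, qimax is itself the optimal value of an LP over the polyhedron {q ⪰ 0 | l ⪯ Sq ⪯ u}. A sketch with scipy.optimize.linprog on a made-up two-compound instance (helper name is our own):

```python
import numpy as np
from scipy.optimize import linprog

def q_range(S, l, u, i):
    """[q_i^min, q_i^max] over {q >= 0 : l <= S q <= u}, via two LPs.
    linprog's default bounds already impose q >= 0."""
    m, n = S.shape
    A_ub = np.vstack([S, -S])
    b_ub = np.concatenate([u, -l])
    e = np.zeros(n)
    e[i] = 1.0
    lo = linprog(e, A_ub=A_ub, b_ub=b_ub).fun        # minimize q_i
    hi = -linprog(-e, A_ub=A_ub, b_ub=b_ub).fun      # maximize q_i
    return lo, hi

# toy instance: two compounds with orthogonal spectra
S = np.eye(2)
l, u = np.array([0.5, 0.0]), np.array([1.5, 0.2])
print(q_range(S, l, u, 0))  # (0.5, 1.5): compound 0 is confidently present
```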
6.15 Learning a quadratic pseudo-metric from distance measurements. We are given a set of N pairs of
points in Rn , x1 , . . . , xN , and y1 , . . . , yN , together with a set of distances d1 , . . . , dN > 0.
The goal is to find (or estimate or learn) a quadratic pseudo-metric d,
    d(x, y) = ( (x − y)^T P (x − y) )^{1/2},
with P ∈ Sn+ , which approximates the given distances, i.e., d(xi , yi ) ≈ di . (The pseudo-metric d is
a metric only when P ≻ 0; when P ⪰ 0 is singular, it is a pseudo-metric.)
To do this, we will choose P ∈ Sn+ that minimizes the mean squared error objective
    (1/N) Σ_{i=1}^N ( di − d(xi, yi) )^2.
(a) Explain how to find P using convex or quasiconvex optimization. If you cannot find an exact
formulation (i.e., one that is guaranteed to minimize the total squared error objective), give
a formulation that approximately minimizes the given objective, subject to the constraints.
(b) Carry out the method of part (a) with the data given in quad_metric_data.m. The columns
of the matrices X and Y are the points xi and yi ; the row vector d gives the distances di . Give
the optimal mean squared distance error.
We also provide a test set, with data X_test, Y_test, and d_test. Report the mean squared
distance error on the test set (using the metric found using the data set above).
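For reference, the pseudo-metric and the mean squared error objective can be evaluated directly (made-up P and data; this evaluates the objective being minimized, and is not the fitting method itself):

```python
import numpy as np

def quad_metric(P, x, y):
    """d(x, y) = ((x - y)^T P (x - y))^{1/2}, for P positive semidefinite."""
    d = x - y
    return np.sqrt(d @ P @ d)

np.random.seed(1)
n, N = 3, 100
A = np.random.randn(n, n)
P = A @ A.T                                  # a (random) positive semidefinite P
X = np.random.randn(N, n)
Y = np.random.randn(N, n)
d_given = np.random.rand(N)                  # hypothetical given distances d_i

# mean squared error objective (1/N) sum_i (d_i - d(x_i, y_i))^2
mse = np.mean([(d_given[i] - quad_metric(P, X[i], Y[i]))**2 for i in range(N)])
```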
6.16 Polynomial approximation of inverse using eigenvalue information. We seek a polynomial of degree
k, p(a) = c0 + c1 a + c2 a2 + · · · + ck ak , for which
p(A) = c0 I + c1 A + c2 A2 + · · · + ck Ak
is an approximate inverse of the nonsingular matrix A, for all A ∈ A ⊂ Rn×n . When x̂ = p(A)b
is used as an approximate solution of the linear equation Ax = b, the associated residual norm is
∥A(p(A)b) − b∥2 . We will judge our polynomial (i.e., the coefficients c0 , . . . , ck ) by the worst case
residual over A ∈ A and b in the unit ball,
Rwc = sup { ∥A(p(A)b) − b∥2 : A ∈ A, ∥b∥2 ≤ 1 } .
The set of matrices we take is A = {A ∈ Sn | σ(A) ⊆ Ω}, where σ(A) is the set of eigenvalues of A
(i.e., its spectrum), and Ω ⊂ R is a union of a set of intervals (that do not contain 0).
(a) Explain how to find coefficients c⋆0 , . . . , c⋆k that minimize Rwc . Your solution can involve ex-
pressions that involve the supremum of a polynomial (with scalar argument) over an interval.
(b) Carry out your method for k = 4 and Ω = [−0.6, −0.3] ∪ [0.7, 1.8]. You can replace the
supremum of a polynomial over Ω by a maximum over uniformly spaced (within each interval)
points in Ω, with spacing 0.01. Give the optimal value Rwc⋆ and the optimal coefficients
c⋆ = (c⋆0 , . . . , c⋆k ).
• The approximate inverse p(A)b would be computed recursively, requiring the multiplication
of A with a vector k times.
• This approximate inverse could be used as a preconditioner for an iterative method.
• The Cayley-Hamilton theorem tells us that the inverse of any (invertible) matrix is a polyno-
mial of degree n − 1 of the matrix. Our hope here, however, is to get a single polynomial, of
relatively low degree, that serves as an approximate inverse for many different matrices.
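The first remark can be made concrete: p(A)b is computable with k matrix-vector products via a Horner-style recursion (a sketch with made-up data):

```python
import numpy as np

def poly_apply(c, A, b):
    """Compute p(A) b = (c0 I + c1 A + ... + ck A^k) b using k mat-vec products.
    Horner recursion: x = c_k b, then x = A x + c_j b for j = k-1, ..., 0."""
    x = c[-1] * b
    for coef in reversed(c[:-1]):
        x = A @ x + coef * b
    return x

np.random.seed(0)
n = 5
A = np.random.randn(n, n)
b = np.random.randn(n)
c = [0.5, -0.2, 0.1]                     # hypothetical coefficients c0, c1, c2
xhat = poly_apply(c, A, b)

# same result as forming p(A) explicitly (which we avoid in practice)
pA = c[0] * np.eye(n) + c[1] * A + c[2] * (A @ A)
```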
6.17 Fitting a generalized additive regression model. A generalized additive model has the form
f (x) = α + ∑_{j=1}^n fj (xj ),
for x ∈ Rn , where α ∈ R is the offset, and fj : R → R, with fj (0) = 0. The functions fj are called
the regressor functions. When each fj is linear, i.e., has the form wj xj , the generalized additive
model is the same as the standard (linear) regression model. Roughly speaking, a generalized
additive model takes into account nonlinearities in each regressor xj , but not nonlinear interactions
among the regressors. To visualize a generalized additive model, it is common to plot each regressor
function (when n is not too large).
We will restrict the functions fj to be piecewise-affine, with given knot points p1 < · · · < pK . This
means that fj is affine on the intervals (−∞, p1 ], [p1 , p2 ], . . . , [pK−1 , pK ], [pK , ∞), and continuous at
p1 , . . . , pK . Let C denote the total (absolute value of) change in slope across all regressor functions
and all knot points. The value C is a measure of nonlinearity of the regressor functions; when
C = 0, the generalized additive model reduces to a linear regression model.
Now suppose we observe samples or data (x(1) , y (1) ), . . . , (x(N ) , y (N ) ) ∈ Rn × R, and wish to fit
a generalized additive model to the data. We choose the offset and the regressor functions to
minimize
(1/N ) ∑_{i=1}^N (y (i) − f (x(i) ))2 + λC,
where λ > 0 is a regularization parameter. (The first term is the mean-square error.)
Hints.
• You can represent each regressor function fj as a linear combination of the basis functions
b0 (u) = u and bk (u) = (u − pk )+ − (−pk )+ for k = 1, 2, . . . , K, where (a)+ = max{a, 0}.
• You might find the matrix XX = [b0 (X) b1 (X) · · · bK (X)] useful.
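The hint’s basis functions can be sketched as follows (hypothetical knot points); note that bk (0) = 0 for every k, so any linear combination automatically satisfies fj (0) = 0:

```python
import numpy as np

def basis(u, p):
    """Columns b0(u) = u and bk(u) = (u - p_k)_+ - (-p_k)_+ , k = 1, ..., K."""
    u = np.asarray(u, dtype=float)
    cols = [u] + [np.maximum(u - pk, 0) - np.maximum(-pk, 0) for pk in p]
    return np.stack(cols, axis=-1)

p = [-1.0, 0.5, 2.0]                     # hypothetical knot points p1 < ... < pK
B = basis(np.array([0.0, 1.0, 3.0]), p)  # 3 points, K + 1 = 4 basis values each
```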
6.18 Multi-label support vector machine. The basic SVM described in the book is used for classification
of data with two labels. In this problem we explore an extension of SVM that can be used to carry
out classification of data with more than two labels. Our data consists of pairs (xi , yi ) ∈ Rn ×
{1, . . . , K}, i = 1, . . . , m, where xi is the feature vector and yi is the label of the ith data point. (So
the labels can take the values 1, . . . , K.) Our classifier will use K affine functions, fk (x) = aTk x + bk ,
k = 1, . . . , K, which we also collect into an affine function from Rn into RK as f (x) = Ax + b. (The
rows of A are aTk .) Given feature vector x, we guess the label ŷ = argmaxk fk (x). We assume that
exact ties never occur, or if they do, an arbitrary choice can be made. Note that if a multiple of 1
is added to b, the classifier does not change. Thus, without loss of generality, we can assume that
1T b = 0.
To correctly classify the data examples, we need fyi (xi ) > maxk̸=yi fk (xi ) for all i. This is a set of
homogeneous strict inequalities in ak and bk , which are feasible if and only if the set of nonstrict
inequalities fyi (xi ) ≥ 1 + maxk̸=yi fk (xi ) are feasible. This motivates the loss function
L(A, b) = ∑_{i=1}^m (1 + max_{k̸=yi} fk (xi ) − fyi (xi ))+ ,
where (u)+ = max{u, 0}. The multi-label SVM chooses A and b to minimize
L(A, b) + µ∥A∥2F ,
where µ > 0 is a regularization parameter.
(a) Show how to find A and b using convex optimization. Be sure to justify any changes of
variables or reformulation (if needed), and convexity of the objective and constraints in your
formulation.
(b) Carry out multi-label SVM on the data given in multi_label_svm_data.m. Use the data
given in X and y to fit the SVM model, for a range of values of µ. This data set includes an
additional set of data, Xtest and ytest, that you can use to test the SVM models. Plot the
test set classification error rate (i.e., the fraction of data examples in the test set for which
ŷ ̸= y) versus µ.
You don’t need to try more than 10 or 20 values of µ, and we suggest choosing them uniformly
on a log scale, from (say) 10−2 to 102 .
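For concreteness, the loss L(A, b) and the classification rule ŷ = argmaxk fk (x) can be evaluated in numpy as follows (hypothetical data; labels are 0-indexed here, unlike the 1, . . . , K convention above):

```python
import numpy as np

def ml_svm_loss(A, b, X, y):
    """L(A, b) = sum_i (1 + max_{k != y_i} f_k(x_i) - f_{y_i}(x_i))_+,
    with f(x) = A x + b and 0-indexed labels y_i in {0, ..., K-1}."""
    F = X @ A.T + b                      # m x K matrix with entries f_k(x_i)
    m = X.shape[0]
    correct = F[np.arange(m), y]
    F_other = F.copy()
    F_other[np.arange(m), y] = -np.inf   # exclude k = y_i from the max
    return float(np.maximum(1 + F_other.max(axis=1) - correct, 0).sum())

np.random.seed(0)
K, n, m = 3, 4, 20
A = np.random.randn(K, n)
b = np.random.randn(K)
b -= b.mean()                            # without loss of generality, 1^T b = 0
X = np.random.randn(m, n)
y = np.random.randint(0, K, m)
yhat = np.argmax(X @ A.T + b, axis=1)    # predicted labels
```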
6.19 Colorization with total variation regularization. An m × n color image is represented as three matrices
of intensities R, G, B ∈ Rm×n , with entries in [0, 1], representing the red, green, and blue pixel
intensities, respectively. A color image is converted to a monochrome image, represented as one
matrix M ∈ Rm×n , using
M = 0.299R + 0.587G + 0.114B.
(These weights come from different perceived brightness of the three primary colors.)
In colorization, we are given M , the monochrome version of an image, and the color values of some
of the pixels; we are to guess its color version, i.e., the matrices R, G, B. Of course that’s a very
underdetermined problem. A very simple technique is to minimize the total variation of (R, G, B),
defined as
tv(R, G, B) = ∑_{i=1}^{m−1} ∑_{j=1}^{n−1} ∥(Rij − Ri,j+1 , Gij − Gi,j+1 , Bij − Bi,j+1 , Rij − Ri+1,j , Gij − Gi+1,j , Bij − Bi+1,j )∥2 ,
subject to consistency with the given monochrome image, the known ranges of the entries of
(R, G, B) (i.e., in [0, 1]), and the given color entries. Note that the sum above is of the norm
of 6-vectors, and not the norm-squared. (The 6-vector is an approximation of the spatial gradient
of (R, G, B).)
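For reference, a plain numpy evaluation of this total variation (a sum of 6-vector norms, not norms squared) might look as follows; this is a sketch, not the provided tv.m / tv.jl:

```python
import numpy as np

def tv(R, G, B):
    """Sum over i, j of the 2-norm of the 6-vector of horizontal and
    vertical first differences of (R, G, B)."""
    dh = [M[:-1, :-1] - M[:-1, 1:] for M in (R, G, B)]   # Xij - Xi,j+1
    dv = [M[:-1, :-1] - M[1:, :-1] for M in (R, G, B)]   # Xij - Xi+1,j
    D = np.stack(dh + dv, axis=0)                         # 6 x (m-1) x (n-1)
    return np.sqrt((D**2).sum(axis=0)).sum()
```

A constant image has zero total variation, and a single unit jump contributes one to the sum.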
Carry out this method on the data given in image_colorization_data.*. The file loads flower.png
and provides the monochrome version of the image, M, along with vectors of known color intensities,
R_known, G_known, and B_known, and known_ind, the indices of the pixels with known values. If R
denotes the red channel of an image, then R(known_ind) returns the known red color intensities in
Matlab, and R[known_ind] returns the same in Python and Julia. The file also creates an image,
flower_given.png, that is monochrome, with the known pixels colored.
The tv function, invoked as tv(R,G,B), gives the total variation. CVXPY has the tv function
built-in, but CVX and CVX.jl do not, so we have provided the files tv.m and tv.jl which contain
implementations for you to use.
In Python and Julia we have also provided the function save_img(filename,R,G,B) which writes
the image defined by the matrices R, G, B, to the file filename. To view an image in Matlab use
the imshow function.
The problem instance is a small image, 75 × 75, so the solve time is reasonable, say, under ten
seconds or so in CVX or CVXPY, and around 60 seconds in Julia.
Report your optimal objective value and, if you have access to a color printer, attach your recon-
structed image. If you don’t have access to a color printer, it’s OK to just give the optimal objective
value.
6.20 Recovering latent periodic signals. First, a definition: a signal x ∈ Rn is p-periodic with p < n if
xi+p = xi for i = 1, . . . , n − p.
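This definition is easy to check numerically; a one-line test in numpy:

```python
import numpy as np

def is_periodic(x, p):
    """True if x_{i+p} = x_i for i = 1, ..., n - p, i.e., x is p-periodic."""
    return bool(np.allclose(x[p:], x[:-p]))

x = np.tile([1.0, 2.0, 3.0], 4)   # a 3-periodic signal of length n = 12
```

Note that a 3-periodic signal is also 6-periodic, but not 2-periodic.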
In this problem, we consider a noisy, measured signal y ∈ Rn which is (approximately) the sum
of a several periodic signals, with unknown periods. Given only the noisy signal y, our task is to
recover these latent periodic signals. In particular, y is given as
y = v + ∑_{p∈P} x(p) ,
where v ∈ Rn is a (small) random noise term, and x(p) is a p-periodic signal. The set P ⊂
{1, . . . , pmax } contains the periods of the latent periodic signals that compose y.
If P were known, we could approximately recover the latent periodic signals x(p) using, say, least
squares. Because P is not known, we instead propose to recover the latent periodic signals x(p) by
solving the following optimization problem:
The variables are ŷ and x̂(p) , for p = 1, . . . , pmax . The first sum in the objective penalizes the
squared deviation of the measured signal y from our estimate ŷ, and the second sum is a heuristic
for producing vectors x̂(p) that contain only zeros. The weight vector w ⪰ 0 is increasing in its
indices, which encodes our desire that the latent periodic signals have small period.
(a) Explain how to solve the given optimization problem using convex optimization, and how to
use it to (approximately) recover the set P and the latent periodic signals x(p) , for p ∈ P.
(b) The file periodic_signals_data.* contains a signal y, as well as a weight vector w. Return
your best guess of the set P. Plot the measured signal y, as well as the different periodic
components that (approximately) compose it. (Use separate graphs for each signal, so you
should have |P| + 1 graphs.)
6.21 Rank one nonnegative matrix approximation. We are given some entries of an m × n matrix A with
positive entries, and wish to approximate it as the outer product of vectors x and y with positive
entries, i.e., xy T . We will use the average relative deviation between the entries of A and xy T as
our approximation criterion,
(1/mn) ∑_{i=1}^m ∑_{j=1}^n R(Aij , xi yj ),
If we scale x by the positive number α, and y by 1/α, the outer product (αx)(y/α)T is the same
as xy T , so we will normalize x as 1T x = 1.
The data in the problem consists of some of the values of A. Specifically, we are given Aij for
(i, j) ∈ Ω ⊆ {1, . . . , m} × {1, . . . , n}. Thus, your goal is to find x ∈ Rm++ (which satisfies 1T x = 1),
y ∈ Rn++ , and Aij > 0 for (i, j) ̸∈ Ω, to minimize the average relative deviation between the entries
of A and xy T .
(a) Explain how to solve this problem using convex or quasiconvex optimization.
(b) Solve the problem for the data given in rank_one_nmf_data.*. This includes a matrix A, and
a set of indexes Omega for the given entries. (The other entries of A are filled in with zeros.)
Report the optimal average relative deviation between A and xy T . Give your values for x1 ,
y1 , and A11 = x1 y1 .
6.22 Total variation de-mosaicing. A color image is represented by three m × n matrices R, G, and B that
give the red, green, and blue pixel intensities. A camera sensor, however, measures only one of the
color intensities at each pixel. The pattern of pixel sensor colors varies, but most of the patterns
have twice as many green sensor pixels as red or blue. A common arrangement repeats the 2 × 2
block
R G
G B
(assuming m and n are even).
De-mosaicing is the process of guessing, or interpolating, the missing color values at each pixel. The
sensors give us mn entries in the matrices R, G, and B; in de-mosaicing, we guess the remaining
2mn entries in the matrices.
First we describe a very basic method of de-mosaicing. For each 2 × 2 block of pixels we have the
4 intensity values
Ri,j      Gi,j+1
Gi+1,j    Bi+1,j+1 .
We use the value Ri,j as the red value for the other three pixels, and we do the same for the blue
value Bi+1,j+1 . For guessing the green values at i, j and i + 1, j + 1, we simply use the average of
the two measured green values, (Gi,j+1 + Gi+1,j )/2.
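The basic method can be sketched directly in numpy (assuming the R G / G B pattern above and even m, n; made-up intensities):

```python
import numpy as np

def demosaic_simple(raw):
    """Basic de-mosaicing of an m x n raw image with 2x2 blocks [[R, G], [G, B]]:
    copy R and B across each block, and average the two measured G values."""
    m, n = raw.shape
    R = np.empty((m, n))
    G = np.empty((m, n))
    B = np.empty((m, n))
    for i in range(0, m, 2):
        for j in range(0, n, 2):
            R[i:i+2, j:j+2] = raw[i, j]
            B[i:i+2, j:j+2] = raw[i+1, j+1]
            G[i:i+2, j:j+2] = (raw[i, j+1] + raw[i+1, j]) / 2
    return R, G, B
```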
A more sophisticated method relies on convex optimization. You choose the unknown pixel values
in R, G, and B to minimize the total variation of the color image, defined as
∑_{i=1}^{m−1} ∑_{j=1}^{n−1} ∥(Ri,j − Ri,j+1 , Gi,j − Gi,j+1 , Bi,j − Bi,j+1 , Ri+1,j − Ri,j , Gi+1,j − Gi,j , Bi+1,j − Bi,j )∥2 .
Note that the norms in the sum here are not squared. The argument of the norms is a vector in
R6 , an estimate of the spatial gradient of the RGB values.
We have provided you with several files in the data directory. Three images are given (in png for-
mat): demosaic_raw.png, which contains the raw or mosaic image to de-mosaic, demosaic_original.png,
which contains the original image from which the raw image was constructed, and demosaic_simple.png,
which is the image de-mosaiced by the simple method described above. Remember that the raw
image, and any reconstructed de-mosaiced image, have only one third the information of the origi-
nal, so we cannot expect them to look as good as the original. You don’t need the original or basic
de-mosaiced image files to solve the problem; they are given only so you can look at them to see
what they are. You should zoom in while viewing the raw image and the basic de-mosaic version,
so you can see the pattern of 2 × 2 blocks in the first, and the simple de-mosaic method in the
second.
The tv function, invoked as tv(R,G,B), gives the total variation. CVXPY has the tv function
built-in, but CVX and CVX.jl do not, so we have provided the files tv.m and tv.jl which contain
implementations for you to use.
The file demosaic_data.* constructs arrays R_mask, G_mask, and B_mask, which contain the in-
dices of pixels whose values we know in the original image, the number of rows and columns in
the image, m, n respectively, and arrays R_raw, B_raw, G_raw, which contain the known values
of each color at each pixel, filled in with zeroes for the unknown values. So if R is an m × n
matrix variable, the constraint R[R_mask]==R_raw[R_mask] in Julia and Python will impose the
constraint that it agrees with the given red pixel values; in Matlab, the constraint can be ex-
pressed as R(R_mask)==R_raw(R_mask). This file also contains a save_image method, which takes
three arguments, R, G, B arrays (that you’ve reconstructed) and saves the file under the name
output_image.png. To see the image in Matlab, use the imshow function.
Report the optimal value of total variation, and attach the de-mosaiced image. (If you don’t have
access to a color printer, you can submit a monochrome version. Print it large enough that we can
see it, say, at least half the page width wide.)
Hint. Your solution code should take less than 10 seconds or so to run in Python and Matlab, but
up to a minute or so in Julia. You might get a warning about an inaccurate solution, but you can
ignore it.
6.23 Fitting with a nonnegative combination of vectors from ellipsoids. You are given ellipsoids E1 , . . . , En ⊂
Rk , and the vector b ∈ Rk . Explain how to use convex optimization to choose ai ∈ Ei , i = 1, . . . , n,
and nonnegative x1 , . . . , xn ∈ R, that minimize
∥ ∑_{i=1}^n xi ai − b ∥2 .
You can use any parametrization of the ellipsoids you like, for example,
Ei = {a | ∥Pi a + qi ∥2 ≤ 1} ,
or
Ei = {Pi u + qi | ∥u∥2 ≤ 1} ,
or
Ei = {a | (a − ci )T Pi−1 (a − ci ) ≤ 1} ,
in variables x ∈ Rn and w ∈ Rm , where we assume we know the signal x has ℓ2 -norm bounded
by r, which seeks the x satisfying the constraints while simultaneously minimizing some measure
∑_{i=1}^m ϕ(wi ) of the amount of noise. This is obviously a non-convex problem—the equality con-
straints are quadratic—but we can often effectively approximate its solutions by lifting it into a
higher-dimensional space. We do so by taking the dual of the dual of the problem.
(a) By introducing variables νi ∈ R for the equality constraints and λ ≥ 0 for the constraint
∥x∥22 ≤ r2 , show that a dual to problem (28) is the semidefinite program
maximize   − ∑_{i=1}^m ϕ∗ (νi ) + ν T b − λr2
subject to λI − ∑_{i=1}^m νi ai aTi ⪰ 0,                (29)
           λ≥0
in variables ν ∈ Rm and λ.
(b) Introducing the dual variable X ∈ Sn+ for the semidefinite constraint, show that a dual to the
problem (29) is
minimize   ∑_{i=1}^m ϕ(bi − aTi Xai )
subject to X ⪰ 0, tr X ≤ r2 ,
where X ∈ Sn is the variable.
(c) Suppose that X ⋆ is optimal for the problem in part (b) and X ⋆ has rank 1, i.e., X ⋆ = x⋆ (x⋆ )T
for some vector x⋆ ∈ Rn . What does that tell you about problem (28)?
(d) Let ϕ(t) = |t| be the absolute value. Generate data according to the following process, which
we write in Julia notation:
m = 40;
n = 6;
A = randn(m, n);
xtrue = randn(n);
b = (A * xtrue) .* (A * xtrue);
b[1:10] .= 1000;
(This means that we generate a matrix A ∈ Rm×n with i.i.d. N (0, 1) entries, set xtrue ∈ Rn to
be a random vector with i.i.d. N (0, 1) entries, set b = (Axtrue )2 elementwise, and then corrupt
the first 10 entries of b to satisfy bi = 1000.) Using CVX*, solve the SDP in part (b) for 25
different random realizations of the problem data.
The numerical rank of a symmetric matrix X ∈ Sn at tolerance ϵ > 0 is the number of
eigenvalues λi of X with |λi | > ϵ. Plot a histogram of the numerical ranks at tolerance
ϵ = 10−2 of your solutions.
(e) Given a positive semidefinite matrix X with spectral decomposition X = ∑_{i=1}^n λi vi viT , the
best rank-1 approximation to X is λ1 v1 v1T . Thus, given a solution X ⋆ = ∑_{i=1}^n λi vi viT to part
(b), we approximate xtrue by x̂ = √λ1 v1 . For your code in part (d), how frequently do you
(effectively) recover xtrue ? Note that we may not correctly recover the sign of x, so we measure
the error by min{∥x̂ − xtrue ∥2 , ∥x̂ + xtrue ∥2 }.
6.25 Implementing the asymmetric Huber function in CVX*. We define the asymmetric Huber function
ϕ : R → R as
ϕ(u) =  M− (−2u − M− ),  u < −M−
        u2 ,             −M− ≤ u ≤ M+
        M+ (2u − M+ ),   u > M+ ,
where M− > 0 and M+ > 0 are parameters, the negative and positive thresholds, respectively.
This function is the same as the standard Huber function with threshold M > 0,
h(u) =  u2 ,            |u| ≤ M
        M (2|u| − M ),  |u| > M,
when M− = M+ = M .
The standard Huber function is implemented in CVX* as an atom. The asymmetric Huber function
is not.
Explain how to implement the asymmetric Huber function in CVX* using standard operations
and functions, including the standard Huber function, satisfying the DCP (disciplined convex
programming) rules. You may use any atom in CVX* except the atoms huber_circ, huber_pos,
and berhu. Your solution should be very short and should include an explanation of how the two
thresholds M− and M+ come into your implementation. Verify your implementation by plotting
it over the range [−3, 3] with M− = 1 and M+ = 2.
Hints. Some of the following might be helpful: Pre-composing the standard Huber function with
an affine function of u; adding an affine function of u; scaling.
Remark. The standard Huber function is used as a penalty function in regression when the data
includes outliers, with M interpreted roughly as the threshold in residuals between a valid sample
and an outlier sample. The asymmetric Huber function can be used when the outliers might have
different negative and positive thresholds.
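For checking a CVX* implementation, here is a direct (non-DCP) numpy evaluation of ϕ, together with the standard Huber function; by construction the two agree when M− = M+ = M:

```python
import numpy as np

def asym_huber(u, M_minus, M_plus):
    """phi(u) = M-(-2u - M-) for u < -M-, u^2 for -M- <= u <= M+,
    and M+(2u - M+) for u > M+."""
    u = np.asarray(u, dtype=float)
    return np.where(u < -M_minus, M_minus * (-2 * u - M_minus),
                    np.where(u > M_plus, M_plus * (2 * u - M_plus), u**2))

def huber(u, M):
    """Standard Huber: u^2 for |u| <= M, M(2|u| - M) otherwise."""
    u = np.asarray(u, dtype=float)
    return np.where(np.abs(u) <= M, u**2, M * (2 * np.abs(u) - M))
```

Both pieces match at the thresholds (e.g., at u = M+ both formulas give M+2), so ϕ is continuous.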
6.26 Deconvolution of a known filter. In a (simplified) imaging system, instead of observing a true image,
one observes an image with slight blurring (or other aberrations due to sensor error), and wishes
to recover the true image. We let Z ∈ Rd×d be the true image, which is unobserved, and Y ∈ Rd×d
be the observed image. Here, Y = F ∗ Z is the convolution of Z with a known filter F (this is the
point spread function), with entries
Ykl = (F ∗ Z)(k, l) = ∑_{i,j=1}^d Fi,j Zk−i,l−j .
For those indices (k − i, l − j) out of the range {1, . . . , d}2 , we define Zk−i,l−j = 0. Let m = d2 be
the number of measurements we take. If we let z = vec(Z), that is, the vectorized version of Z,
and y = vec(Y ), then there is a matrix A ∈ Rm×m such that
y = Az.
(You do not need to know what the matrix A is or precisely what vec does.) Sensor failures at
some pixels (k, l) mean that instead of observing Ykl = (F ∗ Z)(k, l) we observe Ykl = 0.
A vectorized image z ∈ Rm is represented in an overcomplete basis B ∈ Rm×n , n > m, so there
exist vectors x ∈ Rn such that z = Bx. Given Y with y = vec(Y ), the image deconvolution
problem is to find x minimizing an objective f (x) while reconstructing the observed image, i.e.
satisfying y = ABx.
Formulate the following as convex optimization problems:
Solve your optimization problems from parts (a), (b), and (c) on the data in deconvolution_data.*.
In the file we have defined a vector y ∈ Rm and filter matrix A ∈ Rm×m , with zeroed-out entries
indicating sensor failures, as well as a basis matrix B ∈ {−1, 0, 1}m×n .
(d) For each of (a)–(c), display the estimated true image z = Bx⋆ that is reconstructed from x⋆ .
In addition, display the initial sensed image y. Explain your results in one or two sentences.
Note. You can view an image Z ∈ Rd×d from a vector z ∈ Rm with m = d2 by reshaping and
displaying. A few commands to view z ∈ Rm as an image follow.
minimize ∥x∥p
subject to Ax = b,
with variable x ∈ Rn , where A ∈ Rm×n , with m ≪ n. (And of course, p ∈ [1, ∞].) Determine if
the statements below are reasonable or unreasonable.
6.28 Predicting complete rankings. A (complete) ranking of K items consists of an ordering of the items
from rank 1 to rank K. For example, these could be K candidates, ranked from 1 (best) to K
(worst), or the order in which K horses cross the finish line in a race. We represent a ranking of
K items as a vector π ∈ RK , with πi the rank of item i. In the vector π, the numbers 1, . . . , K
each appear exactly once (so it can also be considered a permutation), so there are K! different
rankings. We will let P ⊂ RK denote the set of all K! rankings.
For example with K = 3, (2, 3, 1) and (1, 3, 2) are two of the six possible rankings. In the first
ranking, item 1 has rank 2, whereas in the second ranking, item 1 has rank 1. Both rankings agree
that item 2 has rank 3.
There are many ways to assign a distance between two rankings π and σ, but we will use a simple
one, (1/2)∥π − σ∥1 . This distance is zero if and only if π = σ, and one if and only if π and σ
assign the same rank to all items except two, whose ranks are off by one. The maximum possible
distance is K 2 /4 for K even and (K 2 − 1)/4 for K odd, achieved by, e.g., π = (1, 2, . . . , K) and
σ = (K, K − 1, . . . , 1). The average distance between two randomly chosen rankings is (K 2 − 1)/6.
(These observations are not relevant for this problem, but only meant to give you an idea of the
range and scale of the distance between rankings.)
We wish to build a predictor of an outcome which is a ranking, based on a vector of features. We
denote the predictor as P : Rd → P, where P (x) is the ranking we predict when the feature vector
is x ∈ Rd . We will judge a predictor by the average distance between the true ranking and the
predicted one, on a test set of data (xitest , πitest ), i = 1, . . . , N test (that presumably was not used to
fit the predictor),
(1/(2N test )) ∑_{i=1}^{N test} ∥πitest − P (xitest )∥1 .
We refer to this quantity as the average test error of the predictor. (The smaller this is, the better
the predictor performs on the test data set.)
We will consider a simple predictor of the form P (x) = Π(θx), where θ ∈ RK×d is the predictor
coefficient matrix, and Π : RK → P is Euclidean projection onto P. (We will describe this
projection in more detail below, but for now we note that if there are multiple rankings that are
closest to θx, we arbitrarily choose one.)
We choose the predictor parameter matrix θ to minimize
(1/(2N )) ∑_{i=1}^N ∥πi − θxi ∥1 ,
where (xi , πi ), i = 1, . . . , N , is some given training data. (Note that this objective would become
the average distance between the true and predicted rankings if we replace θxi with Π(θxi ), but
then the objective is no longer convex.)
Projection onto rankings. You can use the following, without deriving or justifying it. The projec-
tion π = Π(y) is the vector of rank orders of the entries of y in nondecreasing order. For example
with y = (1.1, −0.3, 0.5, 0.4), we have Π(y) = (4, 1, 3, 2), since the first entry of y is the largest (i.e.,
has rank 4), the second entry of y is the smallest (i.e., has rank 1), and so on. So we can compute
Π(y) by sorting the entries of y (breaking any ties arbitrarily), keeping track of the sort ordering.
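A standard way to compute Π is a double argsort, which reproduces the example above (ties broken arbitrarily by the sort order); this is a sketch, not the provided Pi():

```python
import numpy as np

def proj_ranking(y):
    """Pi(y): rank orders of the entries of y; the smallest entry gets rank 1."""
    return np.argsort(np.argsort(y)) + 1

pi = proj_ranking(np.array([1.1, -0.3, 0.5, 0.4]))   # -> [4, 1, 3, 2]
```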
Explain how to fit the predictor using the training data with convex optimization.
The data file ranking_est_data.* contains functions that generate synthetic training and test
data, as well as a function that implements Π. The data are in the matrices X_train, pi_train,
X_test, pi_test, and the projection Π is given in Pi(). Fit the predictor using the training data,
and give the average distance between the true and predicted ranking on both the training and test
data sets.
6.29 Robust logistic regression. We are given a data set xi ∈ Rd , yi ∈ {−1, 1}, i = 1, . . . , n. We seek a
prediction model ŷ = sign(θT x), where θ ∈ Rd is the model parameter. In logistic regression, θ is
chosen as the minimizer of the logistic loss
ℓ(θ) = ∑_{i=1}^n log(1 + exp(−yi θT xi )).
(We assume a minimizer exists.) The worst-case logistic loss, with perturbation bound ϵ > 0, is
ℓwc (θ) = ∑_{i=1}^n sup_{∥δi ∥∞ ≤ϵ} log(1 + exp(−yi θT (xi + δi ))).
In words: we perturb each feature vector’s entries by up to ϵ in such a way as to make the logistic
loss as large as possible. Each term is convex, since it is the supremum of a family of convex
functions of θ, and so ℓwc (θ) is a convex function of θ.
In robust logistic regression, we choose θ to minimize ℓwc (θ). (Here too we assume a minimizer
exists.)
(a) Explain how to carry out robust logistic regression by solving a single convex optimization
problem in disciplined convex programming (DCP) form. Justify any change of variables or
introduction of new variables. Explain why solving the problem you propose also solves the
robust logistic regression problem.
Hint: log(1 + exp(u)) is monotonic in u.
(b) Fit a logistic regression model (i.e., minimize ℓ(θ)), and also a robust logistic regression model
(i.e., minimize ℓwc (θ)), using the data given in rob_logistic_reg_data.py. The xi s are
provided as the rows of an n × d matrix named X. The yi s are provided as the entries of an
n-vector named y. The file also contains a test data set, X_test, y_test. Give the test error
rate (i.e., fraction of test set data points for which ŷ ̸= y) for the logistic regression and robust
logistic regression models.
6.30 Asymmetric least squares. We consider the problem of choosing x ∈ Rn to minimize f (Ax − b),
where A ∈ Rm×n , b ∈ Rm , and f is the asymmetric square penalty function
f (r) = ∑_{i=1}^m ϕ(ri ),    ϕ(ri ) =  ri2 ,   ri ≤ 0
                                       κri2 ,  ri > 0,
where κ > 0 is a parameter. Note that when κ = 1, this reduces to simple ordinary least squares.
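For reference, here is a direct numpy evaluation of f, useful for checking a CVXPY formulation; the sign-split identity in the comment is one possible (assumed) route toward a DCP form:

```python
import numpy as np

def asym_ls_objective(r, kappa):
    """f(r) = sum_i phi(r_i), with phi(r_i) = r_i^2 for r_i <= 0
    and kappa * r_i^2 for r_i > 0."""
    r = np.asarray(r, dtype=float)
    return float(np.sum(np.where(r > 0, kappa * r**2, r**2)))

# equivalently, splitting residuals by sign:
# f(r) = sum(np.minimum(r, 0)**2) + kappa * sum(np.maximum(r, 0)**2)
```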
(a) Explain how to express this problem in DCP compatible form, using the standard set of
atoms. You can assume sign-sensitive DCP, which keeps track of monotonicity of arguments
depending on the sign.
(b) Solve the asymmetric least squares problem with data given in asymm_ls_data.py, for κ = 0.1,
κ = 1 (which is ordinary least squares), and κ = 10. Plot the histogram of residuals in each
of these cases, and make a brief comment on what you observe.
7 Statistical estimation
7.1 Maximum likelihood estimation of x and noise mean and covariance. Consider the maximum
likelihood estimation problem with the linear measurement model
yi = aTi x + vi , i = 1, . . . , m.
The vector x ∈ Rn is a vector of unknown parameters, yi are the measurement values, and vi are
independent and identically distributed measurement errors.
In this problem we make the assumption that the normalized probability density function of the
errors is given (normalized to have zero mean and unit variance), but not their mean and variance.
In other words, the density of the measurement errors vi is
p(z) = (1/σ) f ((z − µ)/σ),
where f is a given, normalized density. The parameters µ and σ are the mean and standard
deviation of the distribution p, and are not known.
The maximum likelihood estimates of x, µ, σ are the maximizers of the log-likelihood function
∑_{i=1}^m log p(yi − aTi x) = −m log σ + ∑_{i=1}^m log f ((yi − aTi x − µ)/σ),
where y is the observed value. Show that if f is log-concave, then the maximum likelihood estimates
of x, µ, σ can be determined by solving a convex optimization problem.
7.2 Mean and covariance estimation with conditional independence constraints. Let X ∈ Rn be a
Gaussian random variable with density
p(x) = (2π)−n/2 (det S)−1/2 exp(−(x − a)T S −1 (x − a)/2).
The conditional density of a subvector (Xi , Xj ) ∈ R2 of X, given the remaining variables, is also
Gaussian, and its covariance matrix Rij is equal to the Schur complement of the 2 × 2 submatrix
Sii Sij
Sij Sjj
in the covariance matrix S. The variables Xi , Xj are called conditionally independent if the
covariance matrix Rij of their conditional distribution is diagonal.
Formulate the following problem as a convex optimization problem. We are given N independent
samples y1 , . . . , yN ∈ Rn of X. We are also given a list N ⊆ {1, . . . , n} × {1, . . . , n} of pairs of
conditionally independent variables: (i, j) ∈ N means Xi and Xj are conditionally independent.
The problem is to compute the maximum likelihood estimate of the mean a and the covariance
matrix S, subject to the constraint that Xi and Xj are conditionally independent for (i, j) ∈ N .
7.3 Maximum likelihood estimation for exponential family. A probability distribution or density on a
set D, parametrized by θ ∈ Rn , is called an exponential family if it has the form
pθ (x) = a(θ) exp(θT c(x))
for x ∈ D, where c : D → Rn , and a(θ) is a normalizing function. Here we interpret pθ (x) as a
density function when D is a continuous set, and a probability distribution when D is discrete.
Thus we have
a(θ) = (∫D exp(θT c(x)) dx)−1
when pθ is a density, and
a(θ) = (∑_{x∈D} exp(θT c(x)))−1
when pθ represents a distribution. We consider only values of θ for which the integral or sum above
is finite. Many families of distributions have this form, for appropriate choice of the parameter θ
and function c.
(a) When c(x) = x and D = Rn+ , what is the associated family of densities? What is the set of
valid values of θ?
(b) Consider the case with D = {0, 1}, with c(0) = 0, c(1) = 1. What is the associated exponential
family of distributions? What are the valid values of the parameter θ ∈ R?
(c) Explain how to represent the normal family N (µ, Σ) as an exponential family. Hint. Use pa-
rameter (z, Y ) = (Σ−1 µ, Σ−1 ). With this parameter, θT c(x) has the form z T c1 (x)+tr Y C2 (x),
where C2 (x) ∈ Sn .
(d) Log-likelihood function. Show that for any x ∈ D, the log-likelihood function log pθ (x) is
concave in θ. This means that maximum-likelihood estimation for an exponential family leads
to a convex optimization problem. You don’t have to give a formal proof of concavity of
log pθ (x) in the general case: You can just consider the case when D is finite, and state that
the other cases (discrete but infinite D, continuous D) can be handled by taking limits of finite
sums.
(e) Optimality condition for ML estimation. Let ℓθ (x1 , . . . , xK ) be the log-likelihood function for
K IID samples, x1 , . . . , xK , from the distribution or density pθ . Assuming log pθ is differentiable
in θ, show that
(1/K)∇θ ℓθ (x1 , . . . , xK ) = (1/K) ∑_{i=1}^K c(xi ) − Eθ c(x).
(The subscript under E means the expectation under the distribution or density pθ .)
Interpretation. The ML estimate of θ is characterized by the empirical mean of c(x) being
equal to the expected value of c(x), under the density or distribution pθ . (We assume here
that the maximizer of ℓ is characterized by the gradient vanishing.)
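The identity in part (e) can be checked numerically on a small finite D. The sketch below (all data made up) compares a finite-difference gradient of (1/K)ℓθ with the empirical-mean-minus-expectation expression:

```python
import numpy as np

# Numerical check of the ML optimality identity on a small finite D.
# All data here are made-up illustrative values.
rng = np.random.default_rng(1)
n, D, K = 2, 5, 4
c = rng.standard_normal((D, n))          # c(x), one row per outcome
theta = rng.standard_normal(n)
xs = rng.integers(0, D, size=K)          # K "observed" outcomes

def loglik(th):
    logZ = np.log(np.sum(np.exp(c @ th)))        # -log a(theta)
    return sum(c[x] @ th - logZ for x in xs)

# finite-difference gradient of (1/K) * loglik
eps, g = 1e-6, np.zeros(n)
for j in range(n):
    e = np.zeros(n); e[j] = eps
    g[j] = (loglik(theta + e) - loglik(theta - e)) / (2 * eps * K)

p = np.exp(c @ theta); p /= p.sum()              # p_theta over D
rhs = c[xs].mean(axis=0) - p @ c                 # (1/K) sum c(x_i) - E_theta c(x)
assert np.allclose(g, rhs, atol=1e-5)
```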
7.4 Maximum likelihood prediction of team ability. A set of n teams compete in a tournament. We
model each team’s ability by a number aj ∈ [0, 1], j = 1, . . . , n. When teams j and k play each
other, the probability that team j wins is equal to prob(aj − ak + v > 0), where v ∼ N (0, σ 2 ).
You are given the outcome of m past games. These are organized as
(j (i) , k (i) , y (i) ), i = 1, . . . , m,
meaning that game i was played between teams j (i) and k (i) ; y (i) = 1 means that team j (i) won,
while y (i) = −1 means that team k (i) won. (We assume there are no ties.)
(a) Formulate the problem of finding the maximum likelihood estimate of team abilities, â ∈ Rn ,
given the outcomes, as a convex optimization problem. You will find the game incidence
matrix A ∈ Rm×n , defined as
Ail = y (i) if l = j (i) , Ail = −y (i) if l = k (i) , and Ail = 0 otherwise,
useful.
The prior constraints âi ∈ [0, 1] should be included in the problem formulation. Also, we
note that if a constant is added to all team abilities, there is no change in the probabilities of
game outcomes. This means that â is determined only up to a constant, like a potential. But
this doesn’t affect the ML estimation problem, or any subsequent predictions made using the
estimated parameters.
(b) Find â for the team data given in team_data.m, in the matrix train. (This matrix gives the
outcomes for a tournament in which each team plays each other team once.) You may find
the CVX function log_normcdf helpful for this problem.
You can form A using the commands
A = sparse(1:m,train(:,1),train(:,3),m,n) + ...
sparse(1:m,train(:,2),-train(:,3),m,n);
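For those working in Python, a numpy analogue of the Matlab commands above might look as follows; here train is assumed to be an m × 3 integer array with rows (j (i) , k (i) , y (i) ) and 1-based team indices, and the small array shown is made-up data for illustration:

```python
import numpy as np

# numpy analogue of the Matlab sparse() commands for forming the game
# incidence matrix A.  The small `train` array is made-up illustrative data.
train = np.array([[1, 2,  1],
                  [2, 3, -1],
                  [1, 3,  1]])
m, n = train.shape[0], 3
A = np.zeros((m, n))
A[np.arange(m), train[:, 0] - 1] = train[:, 2]    # +y_i in column j_i
A[np.arange(m), train[:, 1] - 1] = -train[:, 2]   # -y_i in column k_i

assert np.array_equal(A, np.array([[1., -1., 0.],
                                   [0., -1., 1.],
                                   [1.,  0., -1.]]))
```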
(c) Use the maximum likelihood estimate â found in part (b) to predict the outcomes of next
year’s tournament games, given in the matrix test, using ŷ (i) = sign(âj (i) − âk(i) ). Compare
these predictions with the actual outcomes, given in the third column of test. Give the
fraction of correctly predicted outcomes.
The games played in train and test are the same, so another, simpler method for predicting
the outcomes in test is to just assume the team that won last year’s match will also win this
year’s match. Give the percentage of correctly predicted outcomes using this simple method.
7.5 Estimating a vector with unknown measurement nonlinearity. (A specific instance of exercise 7.9
in Convex Optimization.) We want to estimate a vector x ∈ Rn , given some measurements
yi = ϕ(aTi x + vi ), i = 1, . . . , m.
Here ai ∈ Rn are known, vi are IID N (0, σ 2 ) random noises, and ϕ : R → R is an unknown
monotonic increasing function, known to satisfy
α ≤ ϕ′ (u) ≤ β,
for all u. (Here α and β are known positive constants, with α < β.) We want to find a maximum
likelihood estimate of x and ϕ, given yi . (We also know ai , σ, α, and β.)
This sounds like an infinite-dimensional problem, since one of the parameters we are estimating is a
function. In fact, we only need to know the m numbers zi = ϕ−1 (yi ), i = 1, . . . , m. So by estimating
ϕ we really mean estimating the m numbers z1 , . . . , zm . (These numbers are not arbitrary; they
must be consistent with the prior information α ≤ ϕ′ (u) ≤ β for all u.)
(a) Explain how to find a maximum likelihood estimate of x and ϕ (i.e., z1 , . . . , zm ) using convex
optimization.
(b) Carry out your method on the data given in nonlin_meas_data.*, which includes a matrix
A ∈ Rm×n , with rows aT1 , . . . , aTm . Give x̂ml , the maximum likelihood estimate of x. Plot your
estimated function ϕ̂ml . (You can do this by plotting (ẑml )i versus yi , with yi on the vertical
axis and (ẑml )i on the horizontal axis.)
Hint. You can assume the measurements are numbered so that yi are sorted in nondecreasing order,
i.e., y1 ≤ y2 ≤ · · · ≤ ym . (The data given in the problem instance for part (b) is given in this
order.)
7.6 Maximum likelihood estimation of an increasing nonnegative signal. We wish to estimate a scalar
signal x(t), for t = 1, 2, . . . , N , which is known to be nonnegative and monotonically nondecreasing:
0 ≤ x(1) ≤ x(2) ≤ · · · ≤ x(N ).
This occurs in many practical problems. For example, x(t) might be a measure of wear or dete-
rioration, that can only get worse, or stay the same, as time t increases. We are also given that
x(t) = 0 for t ≤ 0.
We are given a noise-corrupted moving average of x, given by
y(t) = Σ_{τ =1}^{k} h(τ )x(t − τ ) + v(t), t = 2, . . . , N + 1,
(a) Show how to formulate the problem of finding the maximum likelihood estimate of x, given
y, taking into account the prior assumption that x is nonnegative and monotonically nonde-
creasing, as a convex optimization problem. Be sure to indicate what the problem variables
are, and what the problem data are.
(b) We now consider a specific instance of the problem, with problem data (i.e., N , k, h, and y)
given in the file ml_estim_incr_signal_data.*. (This file contains the true signal xtrue,
which of course you cannot use in creating your estimate.) Find the maximum likelihood
estimate x̂ml , and plot it, along with the true signal. Also find and plot the maximum likelihood
estimate x̂ml,free not taking into account the signal nonnegativity and monotonicity.
Hints.
• Matlab: The function conv (convolution) is overloaded to work with CVX.
• Python: Numpy has a function convolve which performs convolution. CVXPY has conv
which does the same thing for variables.
• Julia: The function conv is overloaded to work with Convex.jl.
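To illustrate the convolution structure of the measurement model, the sketch below (with made-up h and x) checks that the noise-free moving average equals the leading entries of the full convolution h ∗ x:

```python
import numpy as np

# With x(t) = 0 for t <= 0, the noise-free moving average
# y(t) = sum_tau h(tau) x(t - tau), t = 2, ..., N+1, is given by the
# first N entries of the full convolution h * x.  h and x are made-up data.
N, k = 6, 3
x = np.array([1.0, 2.0, 0.5, 3.0, 1.5, 2.5])   # x(1), ..., x(N)
h = np.array([0.5, 0.3, 0.2])                  # h(1), ..., h(k)

y_clean = np.convolve(h, x)[:N]                # y(2), ..., y(N+1), no noise

# direct check of the first two entries:
assert np.isclose(y_clean[0], h[0] * x[0])               # y(2) = h(1) x(1)
assert np.isclose(y_clean[1], h[0] * x[1] + h[1] * x[0]) # y(3)
```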
7.7 Relaxed and discrete A-optimal experiment design. This problem concerns the A-optimal experi-
ment design problem, described on page 387, with data generated as follows.
n = 5; % dimension of parameters to be estimated
p = 20; % number of available types of measurements
m = 30; % total number of measurements to be carried out
randn('state', 0);
V=randn(n,p); % columns are vi, the possible measurement vectors
with variable λ ∈ Rp . Find the optimal point λ⋆ and the associated optimal value of the relaxed
problem. This optimal value is a lower bound on the optimal value of the discrete A-optimal
experiment design problem,
minimize tr( ( Σ_{i=1}^{p} mi vi viT )−1 )
subject to m1 + · · · + mp = m, mi ∈ {0, . . . , m}, i = 1, . . . , p,
with variables m1 , . . . , mp . To get a suboptimal point for this discrete problem, round the entries
in mλ⋆ to obtain integers m̂i . If needed, adjust these by hand or some other method to ensure that
they sum to m, and compute the objective value obtained. This is, of course, an upper bound on
the optimal value of the discrete problem. Give the gap between this upper bound and the lower
bound obtained from the relaxed problem. Note that the two objective values can be interpreted
as mean-square estimation error E ∥x̂ − x∥22 .
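The rounding step described above might be sketched in Python as follows; lambda_star is a made-up feasible point standing in for the relaxed solution λ⋆:

```python
import numpy as np

# Sketch of the rounding heuristic: scale the relaxed solution by m, round,
# then adjust entries so the integers sum to m.  lambda_star is a made-up
# feasible point (nonnegative entries summing to 1), not a computed optimum.
m = 30
lambda_star = np.array([0.31, 0.24, 0.18, 0.15, 0.12])
mhat = np.rint(m * lambda_star).astype(int)

# fix up the sum by adjusting the entries with the largest rounding error
while mhat.sum() != m:
    err = m * lambda_star - mhat
    if mhat.sum() < m:
        mhat[np.argmax(err)] += 1
    else:
        mhat[np.argmin(err)] -= 1

assert mhat.sum() == m and np.all(mhat >= 0)
```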
7.8 Optimal detector design. We adopt here the notation of §7.3 of the book. Explain how to design a
(possibly randomized) detector that minimizes the worst-case probability of our estimate being off
by more than one,
Pwc = max_θ prob(|θ̂ − θ| ≥ 2).
7.9 Experiment design with condition number objective. Explain how to solve the experiment design
problem (§7.5) with the condition number cond(E) of E (the error covariance matrix) as the
objective to be minimized.
7.10 Worst-case probability of loss. Two investments are made, with random returns R1 and R2 . The
total return for the two investments is R1 + R2 , and the probability of a loss (including breaking
even, i.e., R1 + R2 = 0) is ploss = prob(R1 + R2 ≤ 0). The goal is to find the worst-case (i.e.,
maximum possible) value of ploss , consistent with the following information. Both R1 and R2 have
Gaussian marginal distributions, with known means µ1 and µ2 and known standard deviations σ1
and σ2 . In addition, it is known that R1 and R2 are correlated with correlation coefficient ρ, i.e.,
Your job is to find the worst-case ploss over any joint distribution of R1 and R2 consistent with the
given marginals and correlation coefficient.
We will consider the specific case with data
We can compare the results to the case when R1 and R2 are jointly Gaussian. In this case we have
which for the data given above gives ploss = 0.050. Your job is to see how much larger ploss can
possibly be.
This is an infinite-dimensional optimization problem, since you must maximize ploss over an infinite-
dimensional set of joint distributions. To (approximately) solve it, we discretize the values that R1
and R2 can take on, to n = 100 values r1 , . . . , rn , uniformly spaced from r1 = −30 to rn = +70.
We use the discretized marginals p(1) and p(2) for R1 and R2 , given by
p_i^{(k)} = prob(Rk = ri ) = exp(−(ri − µk )2 /(2σk2 )) / Σ_{j=1}^{n} exp(−(rj − µk )2 /(2σk2 )),
for k = 1, 2, i = 1, . . . , n.
Formulate the (discretized) problem as a convex optimization problem, and solve it. Report the
maximum value of ploss you find. Plot the joint distribution that yields the maximum value of ploss
using the Matlab commands mesh and contour.
Remark. You might be surprised at both the maximum value of ploss , and the joint distribution
that achieves it.
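Constructing the discretized marginals is mechanical; the Python sketch below uses placeholder values for µk and σk (not the problem data) and checks that each p(k) is a valid distribution:

```python
import numpy as np

# Building the discretized marginals p^(k) defined above.  The means and
# standard deviations (mu, sigma) are placeholders, not the problem data.
n = 100
r = np.linspace(-30, 70, n)
mu = np.array([9.0, 20.0])       # hypothetical mu_1, mu_2
sigma = np.array([6.0, 17.0])    # hypothetical sigma_1, sigma_2

P = np.exp(-(r[None, :] - mu[:, None])**2 / (2 * sigma[:, None]**2))
P /= P.sum(axis=1, keepdims=True)    # rows are p^(1) and p^(2)

assert np.allclose(P.sum(axis=1), 1.0)
assert np.all(P > 0)
```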
7.12 Cox proportional hazards model. Let T be a continuous random variable taking on values in R+ .
We can think of T as modeling an event that takes place at some unknown future time, such as
the death of a living person or a machine failure.
The survival function is S(t) = prob(T ≥ t), which satisfies S(0) = 1, S ′ (t) ≤ 0, and limt→∞ S(t) =
0. The hazard rate is given by λ(t) = −S ′ (t)/S(t) ∈ R+ , and has the following interpretation: For
small δ > 0, λ(t)δ is approximately the probability of the event occurring in [t, t + δ], given that it
has not occurred up to time t. The survival function can be expressed in terms of the hazard rate:
S(t) = exp( − ∫_0^t λ(τ ) dτ ).
(The hazard rate must have infinite integral over [0, ∞).)
The Cox proportional hazards model gives the hazard rate as a function of some features or
explanatory variables (assumed constant in time) x ∈ Rn . In particular, λ is given by
λ(t) = λ0 (t) exp(wT x),
where λ0 (which is nonnegative, with infinite integral) is called the baseline hazard rate, and w ∈ Rn
is a vector of model parameters. (The name derives from the fact that λ(t) is proportional to
exp(wi xi ), for each i.)
Now suppose that we have observed a set of independent samples, with event times tj and feature
values xj , for j = 1, . . . , N . In other words, we observe that the event with features xj occurred at
time tj . You can assume that the baseline hazard rate λ0 is known. Show that maximum likelihood
estimation of the parameter w is a convex optimization problem.
Remarks. Regularization is typically included in Cox proportional hazards fitting; for example,
adding ℓ1 regularization yields a sparse model, which selects the features to be used. The basic
Cox proportional hazards model described here is readily extended to include discrete times of the
event, censored measurements (which means that we only observe T to be in an interval), and the
effects of features that can vary with time.
7.13 Maximum likelihood estimation for an affinely transformed distribution. Let z be a random variable
on Rn with density pz (u) = exp −ϕ(∥u∥2 ), where ϕ : R → R is convex and increasing. Examples of
such distributions include the standard normal N (0, σ 2 I), with ϕ(u) = (u)2+ + α, and the multivari-
able Laplacian distribution, with ϕ(u) = (u)+ + β, where α and β are normalizing constants, and
(a)+ = max{a, 0}. Now let x be the random variable x = Az + b, where A ∈ Rn×n is nonsingular.
The distribution of x is parametrized by A and b.
Suppose x1 , . . . , xN are independent samples from the distribution of x. Explain how to find a
maximum likelihood estimate of A and b using convex optimization. If you make any further
assumptions about A and b (beyond invertibility of A), you must justify them.
Hint. The density of x = Az + b is given by
px (v) = (1/| det A|) pz (A−1 (v − b)).
7.14 A simple MAP problem. We seek to estimate a point x ∈ R2+ , with exponential prior density
p(x) = exp −(x1 + x2 ), based on the measurements
y1 = x1 + v1 , y2 = x2 + v2 , y3 = x1 − x2 + v3 ,
where v1 , v2 , v3 are IID N (0, 1) random variables (also independent of x). A naïve estimate of x is
given by x̂naive = (y1 , y2 ).
(a) Explain how to find the MAP estimate of x, given the observations y1 , y2 , y3 .
(b) Generate 100 random instances of x and y, from the given distributions. For each instance,
find the MAP estimate x̂map and the naïve estimate x̂naive . Give a scatter plot of the MAP
estimation error, i.e., x̂map − x, and another scatter plot of the naïve estimation error, x̂naive − x.
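Generating the random instances for part (b) is straightforward; the prior density exp −(x1 + x2 ) on R2+ is the product of two unit-mean exponential distributions. The sketch below generates instances and forms the naïve estimate (the MAP estimate itself requires a convex solver such as CVXPY and is not shown):

```python
import numpy as np

# Generate random instances (x, y) for part (b) and form the naive estimate.
# The prior exp(-(x1 + x2)) on R^2_+ factors into two Exp(1) distributions.
rng = np.random.default_rng(2)
K = 100
x = rng.exponential(scale=1.0, size=(K, 2))    # prior samples, x >= 0
v = rng.standard_normal((K, 3))                # IID N(0, 1) noise
y = np.stack([x[:, 0] + v[:, 0],
              x[:, 1] + v[:, 1],
              x[:, 0] - x[:, 1] + v[:, 2]], axis=1)
x_naive = y[:, :2]                             # naive estimate (y1, y2)
err = x_naive - x                              # naive estimation error

assert y.shape == (K, 3) and err.shape == (K, 2)
```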
7.15 Minimum possible maximum correlation. Let Z be a random variable taking values in Rn , and let
Σ ∈ Sn++ be its covariance matrix. We do not know Σ, but we do know the variance of m linear
functions of Z. Specifically, we are given nonzero vectors a1 , . . . , am ∈ Rn and σ1 , . . . , σm > 0 for
which
var(aTi Z) = σi2 , i = 1, . . . , m.
For i ̸= j the correlation of Zi and Zj is defined to be
ρij = Σij / (Σii Σjj )^{1/2} .
Let ρmax = maxi̸=j |ρij | be the maximum (absolute value) of the correlation among entries of Z. If
ρmax is large, then at least two components of Z are highly correlated (or anticorrelated).
(a) Explain how to find the smallest value of ρmax that is consistent with the given information,
using convex or quasiconvex optimization. If your formulation involves a change of variables
or other transformation, justify it.
(b) The file correlation_bounds_data.* contains σ1 , . . . , σm and the matrix A with columns
a1 , . . . , am . Find the minimum value of ρmax that is consistent with this data. Report your
minimum value of ρmax , and give a corresponding covariance matrix Σ that achieves this value.
You can report the minimum value of ρmax to an accuracy of 0.01.
7.16 Direct standardization. Consider a random variable (x, y) ∈ Rn ×R, and N samples (x1 , y1 ), . . . , (xN , yN ) ∈
Rn × R, which we will use to estimate the (marginal) distribution of y. If the given samples were
chosen according to the joint distribution of (x, y), a reasonable estimate for the distribution of y
would be the uniform empirical distribution, which takes on values y1 , . . . , yN each with probabil-
ity 1/N . (If y is Boolean, i.e., y ∈ {0, 1}, we are using the fraction of samples with y = 1 as our
estimate of prob(y = 1).)
The bad news is that the samples (x1 , y1 ), . . . , (xN , yN ) ∈ Rn × R were not chosen from the
distribution of (x, y), but instead from another (unknown, but presumably similar) distribution.
The good news is that we know E x, the expected value of x. We will use our knowledge of E x,
together with the samples, to estimate the distribution of y. Direct standardization replaces the
uniform empirical distribution with a weighted one, which takes on values yi with probability πi ,
where π ⪰ 0, 1T π = 1. The weights or sample probabilities π are found by maximizing the entropy
−Σ_{i=1}^{N} πi log πi , subject to the requirement that the weighted sample expected value of x
matches the known expected value of x under the distribution, E x. This can be expressed as
Σ_{i=1}^{N} πi xi = E x. (Both xi and E x are known.)
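By Lagrange duality the maximum-entropy weights have the form πi ∝ exp(ν T xi ), with ν chosen so the moment constraint holds. The sketch below recovers ν by gradient descent on the dual for made-up scalar samples; in practice one would simply hand the primal problem to CVXPY:

```python
import numpy as np

# Dual view of the entropy maximization: pi_i is proportional to exp(nu * x_i),
# where nu is found by gradient descent on the smooth convex dual.
# x and b below are made-up scalar data, not the problem data.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])   # hypothetical samples x_i
b = 1.5                                    # hypothetical known E x

nu = 0.0
for _ in range(2000):
    w = np.exp(nu * x)
    pi = w / w.sum()
    nu -= 0.1 * (pi @ x - b)               # dual gradient step: E_pi x - b

assert np.isclose(pi @ x, b, atol=1e-6)    # moment constraint holds
assert np.isclose(pi.sum(), 1.0) and np.all(pi > 0)
```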
overall population is known to have equal numbers of females and males, but in the sample
population the male : female proportions are 0.7 : 0.3.
(c) The data in direct_std_data.* contain the samples x(i) and y (i) , as well as E x. Find the
weights π ⋆ , and report the weighted empirical distribution. On the same plot, compare the
cumulative distributions of
• the uniform empirical distribution,
• the weighted empirical distribution using π ⋆ , and
• the true distribution of y.
The true and empirical distributions are provided in the data file. (For example, the 20
elements of p_true give prob(y = 1) up to prob(y = 20), in order.)
Note: Julia users might want to use the ECOS solver, by including using ECOS, and solving
by using solve!(prob, ECOSSolver()).
Note: You don’t need to know this to solve the problem, but the data for part (c) are real. The
random variable x is a vector of a student’s gender, age, and mother’s and father’s educational
attainment, and y is the student’s score on a standardized test.
7.17 Maximum likelihood estimation of a discrete log-concave distribution. Suppose random variable
X ∈ {1, . . . , n} has unknown probability mass function p ∈ Rn , where prob(X = k) = pk ,
k = 1, . . . , n. Suppose we know that the probability mass function is log-concave, which means
prob(X = k) ≥ ( prob(X = k − 1) prob(X = k + 1) )^{1/2} , k = 2, . . . , n − 1.
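As a concrete check of this condition, the binomial distribution is a standard example of a log-concave probability mass function:

```python
import numpy as np
from math import comb

# Check the log-concavity condition on a binomial pmf, a standard example
# of a log-concave distribution on n points.
n = 10
p = np.array([comb(n - 1, k) * 0.4**k * 0.6**(n - 1 - k) for k in range(n)])

# prob(X = k)^2 >= prob(X = k-1) * prob(X = k+1), for interior k
assert np.all(p[1:-1]**2 >= p[:-2] * p[2:] - 1e-12)
assert np.isclose(p.sum(), 1.0)
```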
(a) Explain how to compute a maximum likelihood estimate of the log-concave probability mass
function p, given the N observations described above.
(b) Carry out your procedure on the data found in logccv_mle_data.*. Plot the empirical
probability mass function (which is the maximum likelihood estimate without the log-concave
assumption), your maximum likelihood estimate (with the log-concave assumption), and the
true probability mass function found in the data file. Comment briefly on the result.
7.18 Maximum likelihood estimation of a log-concave distribution. We have a random variable X which
takes values in {1, . . . , n}. It has a distribution p ∈ Rn , with prob(X = i) = pi . However, we
do not know p, and would like to determine it based on N independent P samples of X. In those
N samples, let mi denote the number of samples for which X = i, so i mi = N . The likelihood
function is then
l(p) = Π_{i=1}^{n} p_i^{m_i} .
(b) We have n = 13 and observe
m = (1, 5, 6, 15, 18, 20, 22, 11, 22, 8, 9, 4, 2).
Carry out your method from part (a) on this data. Plot mi /N (the empirical distribution)
and your estimate of p.
7.19 Rank one nonnegative matrix approximation. We are given some entries of an m × n matrix A with
positive entries, and wish to approximate it as the outer product of vectors x and y with positive
entries, i.e., xy T . We will use the average relative deviation between the entries of A and xy T as
our approximation criterion,
(1/(mn)) Σ_{i=1}^{m} Σ_{j=1}^{n} R(Aij , xi yj ),
If we scale x by the positive number α, and y by 1/α, the outer product (αx)(y/α)T is the same
as xy T , so we will normalize x as 1T x = 1.
The data in the problem consists of some of the values of A. Specifically, we are given Aij for
(i, j) ∈ Ω ⊆ {1, . . . , m} × {1, . . . , n}. Thus, your goal is to find x ∈ Rm++ (which satisfies 1T x = 1),
y ∈ Rn++ , and Aij > 0 for (i, j) ̸∈ Ω, to minimize the average relative deviation between the entries
of A and xy T .
(a) Explain how to solve this problem using convex or quasiconvex optimization.
(b) Solve the problem for the data given in rank_one_nmf_data.*. This includes a matrix A, and
a set of indexes Omega for the given entries. (The other entries of A are filled in with zeros.)
Report the optimal average relative deviation between A and xy T . Give your values for x1 ,
y1 , and A11 = x1 y1 .
7.20 Transforming to a normal distribution. We are given n samples xi ∈ R from an unknown distri-
bution. We seek an increasing piecewise-affine function φ : R → R for which yi = φ(xi ) has a
distribution close to N (0, 1). In other words, the nonlinear transformation x 7→ y = φ(x) (approx-
imately) transforms the given distribution to a standard normal distribution.
You can assume that the samples are distinct and sorted, i.e., x1 < x2 < · · · < xn , and therefore
we also have y1 < y2 < · · · < yn . The empirical CDF (cumulative distribution function) of yi is the
piecewise-constant function F : R → R given by
F (z) = 0 for z < y1 ,    F (z) = k/n for yk ≤ z < yk+1 , k = 1, . . . , n − 1,    F (z) = 1 for z ≥ yn .
The Kolmogorov-Smirnov distance between the empirical distribution of yi and the standard normal
distribution is given by
D = sup |F (z) − Φ(z)|,
z
where Φ is the CDF of an N (0, 1) random variable. We will use D as our measure of how close the
transformed distribution is to normal. Note that D can be as small as 1/(2n) (but no smaller), by
choosing yi = Φ−1 ((i − 1/2)/n).
Note that D only depends on the n numbers y1 , . . . , yn . From these numbers we extend φ to a
function on R using linear interpolation between these values, and extending outside the interval
[x1 , xn ] using the same slopes as the first and last segments, respectively. So y1 , . . . , yn determine
φ.
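The two ingredients above, evaluating φ by linear interpolation with end-slope extension, and computing D against the normal CDF, can be sketched as follows (the knot values x, y are made up; Φ is computed from the error function):

```python
import numpy as np
from math import erf, sqrt

def phi_eval(x, y, t):
    """Piecewise-linear phi through (x_i, y_i), extended by the end slopes."""
    t = np.asarray(t, dtype=float)
    out = np.interp(t, x, y)
    s0 = (y[1] - y[0]) / (x[1] - x[0])          # first-segment slope
    s1 = (y[-1] - y[-2]) / (x[-1] - x[-2])      # last-segment slope
    out = np.where(t < x[0], y[0] + s0 * (t - x[0]), out)
    out = np.where(t > x[-1], y[-1] + s1 * (t - x[-1]), out)
    return out

def ks_distance(y):
    """D = sup_z |F(z) - Phi(z)| for the empirical CDF F of sorted y.

    Against a continuous increasing Phi, the supremum is attained just
    before or at one of the jump points y_k.
    """
    n = len(y)
    Phi = np.array([0.5 * (1 + erf(v / sqrt(2))) for v in y])
    lo = np.abs(Phi - np.arange(n) / n)          # just before each jump
    hi = np.abs(Phi - np.arange(1, n + 1) / n)   # just after each jump
    return max(lo.max(), hi.max())

# made-up knot values, sorted
x = np.array([-2.0, -0.5, 0.3, 1.8])
y = np.array([-1.5, -0.3, 0.4, 1.2])
assert np.isclose(phi_eval(x, y, np.array([-0.5]))[0], -0.3)   # hits a knot
assert np.isclose(phi_eval(x, y, np.array([-3.0]))[0], -2.3)   # left extension
assert ks_distance(y) >= 1 / (2 * len(y)) - 1e-12              # lower bound on D
```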
Our regularization (measure of complexity) of φ is
R = Σ_{i=2}^{n−1} | (yi+1 − yi )/(xi+1 − xi ) − (yi − yi−1 )/(xi − xi−1 ) | .
This is the sum of the absolute values of the change in slope of φ. Note that R = 0 if and only if
φ has no kinks, i.e., is affine.
We will choose yi (which defines φ) by minimizing R, subject to D ≤ Dmax , where Dmax ≥ 1/(2n)
is a parameter. It can be shown that the condition yi < yi+1 will hold automatically; but if you are
nervous about this, you are welcome to add the constraint yi + ϵ ≤ yi+1 , where ϵ is a small positive
number.
(a) Explain how to solve this problem using convex or quasiconvex optimization. If your formu-
lation involves a change of variables or other transformation, justify it.
(b) The file transform_to_normal_data.* contains the vector x (in sorted order) and its length
n. Use the method of part (a) to find the optimal φ (i.e., y) for Dmax = 0.05. Plot the empirical
CDF of the original data x and the normal CDF Φ on one plot, the empirical CDF of the
transformed data y and the normal CDF Φ on another plot, and the optimal transformation
φ on a third plot. Report the optimal value of R.
Hints. In Python and Julia, you should use the (default) ECOS solver to avoid warnings about
inaccurate solutions. You can evaluate the normal CDF Φ using normcdf.m/norminv.m (Mat-
lab), scipy.stats.norm.cdf/ppf (Python), or normcdf/norminvcdf in StatsFuns.jl (Julia).
To plot the empirical CDFs of x and y, you are welcome to use the basic plot functions, which
connect adjacent points with lines. But if you’d like to create step function style plots, you
can use ecdf.m (Matlab), matplotlib.pyplot.step (Python), or step in PyPlot.jl (Julia).
7.21 ARX model with sparse excitation. Consider a time series y = (y1 , . . . , yT ). The auto-regressive
with excitation (ARX) model has the form
yt+1 = β1 yt + · · · + βM yt−M +1 + xt+1 , t = M, . . . , T − 1,
where β ∈ RM are the coefficients, and xM +1 , . . . , xT is the excitation or input signal. Neither β
nor x ∈ RT are known. (The excitation values x1 , . . . , xM do not enter the model.)
(a) The classical assumption is that xt are IID N (0, σ 2 ) random variables. Explain how to find
the maximum likelihood estimate of β ∈ RM , given y.
(b) Now assume that the excitation signal x is sparse. Suggest a simple method, based on convex
optimization, for estimating β. Remark. This is a common model of various phenomena. In
one example y is an acoustic signal of a voiced phoneme, and x is the glottal excitation. And
no, you do not need to know this.
(c) Apply the methods of parts (a) and (b) to the signal given in arx_fit_data.*, with M = 10
and T = 200. The data file also contains the “true” coefficient β true from which the data is
generated. Compare the two estimates of β with the true value, by plotting all three.
Hint. You may use the fact that x can be expressed in terms of the convolution of b = (1, −β)
and y, defined as
(b ∗ y)i = Σ_{j=1}^{min{i,M +1}} bj yi−j+1 , i = 1, . . . , T + M.
j=1
The function conv(b,y) is overloaded to work with CVX*. (Warning: b ∗ y is not x; but x
can be written in terms of b ∗ y.)
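The hint can be verified numerically: simulating an ARX recursion (with made-up β and x) and convolving b = (1, −β) with y recovers the excitation for t ≥ M:

```python
import numpy as np

# Check that convolving b = (1, -beta) with y recovers the excitation x
# for t >= M.  beta, x, and the initial values of y are made-up data.
rng = np.random.default_rng(3)
M, T = 3, 40
beta = np.array([0.4, -0.2, 0.1])       # hypothetical ARX coefficients
x = rng.standard_normal(T)              # excitation (x_1..x_M unused)
y = np.zeros(T)
y[:M] = rng.standard_normal(M)          # arbitrary initial values
for t in range(M, T):                   # y_{t+1} = sum_j beta_j y_{t-j+1} + x_{t+1}
    y[t] = beta @ y[t - M:t][::-1] + x[t]

b = np.concatenate(([1.0], -beta))
conv = np.convolve(b, y)                # full convolution, length T + M
assert np.allclose(conv[M:T], x[M:T])   # recovers the excitation
```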
7.22 Blending overlapping covariance matrices. We consider the problem of constructing a covariance
matrix R ∈ Sn+ from two (not necessarily consistent) estimates of submatrices S and T . We order
the indices in the underlying random variable so that the first n1 entries correspond to those
in the first submatrix but not the second, the next n2 entries correspond to the entries in both
submatrices, and the last n3 entries are those in the second submatrix but not the first. We have
n1 + n2 + n3 = n, and we assume all three are positive. We partition the matrix R as
R = [ R11 R12 R13 ; R12T R22 R23 ; R13T R23T R33 ].
We wish to choose R ∈ Sn+ so that
R(1) = [ R11 R12 ; R12T R22 ] ≈ S = [ S11 S12 ; S12T S22 ]
and
R(2) = [ R22 R23 ; R23T R33 ] ≈ T = [ T22 T23 ; T23T T33 ].
(Note the non-standard labeling of the block indices in T .) You can assume that S ∈ S^{n1 +n2}_+
and T ∈ S^{n2 +n3}_+ are given.
Roughly speaking, your job is to guess the six submatrices Rij for i ≤ j. For four of these,
R11 , R12 , R23 , and R33 , you have only one piece of data to work with, i.e., S11 , S12 , T23 , and T33 ,
respectively. For one of them, R22 , you have two pieces of data to work with, i.e., S22 and T22 . For
one submatrix, R13 , you have no pieces of data to work with.
(a) A simple method. Based on the given data S and T , our guess of R is
R11 = S11 , R12 = S12 , R13 = 0,
R22 = (1/2)(S22 + T22 ), R23 = T23 , R33 = T33 .
For the four submatrices for which you have only one piece of data, we simply use that data
as our guess. For the one submatrix for which we have two pieces of data, we average the two
values. For the one submatrix for which we have no data, we guess the zero matrix.
Show by a specific numerical example that this simple method can yield an unacceptable value
of R. (No, we will not be more specific about what we mean by this; part of the problem
is to figure out what we mean. Also, we will deduct points from examples that are more
complicated than they need to be.)
(b) Convex optimization to the rescue. Suppose we choose R by solving the convex optimization
problem
minimize ∥R(1) − S∥2F + ∥R(2) − T ∥2F + ∥R13 ∥2F
subject to R ⪰ 0.
Here the variable is R ∈ Sn , and ∥U ∥F = (tr(U T U ))1/2 is the Frobenius norm of a matrix.
Let Rsim be the estimate of R obtained using the simple method in part (a). Show that if
Rsim ⪰ 0, then it is the solution of this problem.
(c) Apply the method described in part (b) to the specific numerical example you provided in
part (a), and check (numerically) that the result R⋆ is now acceptable.
7.23 Fitting a periodic Poisson distribution to data. We model the (random) number of times that some
type of event occurs in each hour of the day as independent Poisson variables, with
prob(k events occur) = e^{−λt} λt^k / k! , k = 0, 1, . . . ,
with parameter λt ≥ 0, t = 1, . . . , 24. (For λt = 0, k = 0 events occur with probability one.) Here
t denotes the hour, with t = 1 corresponding to the hour from midnight to 1AM, and t = 24 the
hour between 11PM and midnight. (This is the periodic Poisson distribution in the title.) The
parameter λt is the expected value of the number of events that occur in hour t; it can be thought
of as the rate of occurrence of the events in hour t.
Over one day we observe the numbers of events N1 , . . . , N24 .
(a) Maximum likelihood estimate of parameters. What is the maximum likelihood estimate of the
parameters λ1 , . . . , λ24 ? Hint. There is a simple analytical solution. You should consider the
cases Nt > 0 and Nt = 0 separately.
(b) Regularized maximum likelihood estimate of parameters. In many applications it is reasonable
to assume that λt varies smoothly over the day; for example, the rate of occurrence of events for
3PM–4PM is not too different from the rate of occurrence for 4PM–5PM. To obtain a smooth
estimate of λt we maximize the log likelihood minus the regularization term
ρ ( Σ_{t=1}^{23} (λt+1 − λt )2 + (λ1 − λ24 )2 ) ,
where ρ ≥ 0. Explain how to find the values λ1 , . . . , λ24 using convex optimization. If you
change variables, explain.
(c) What happens as ρ → ∞? You can give a very short answer, with an informal argument.
Hint. As in part (a), there is a simple analytical solution.
(d) Numerical example. Over one day, we observe
N = (0, 4, 2, 2, 3, 0, 4, 5, 6, 6, 4, 1, 4, 4, 0, 1, 3, 4, 2, 0, 3, 2, 0, 1).
Find the regularized maximum likelihood parameters for ρ ∈ {0.1, 1, 10, 100} using CVX*,
and plot λt versus t for each value of ρ.
(e) Choosing the hyper-parameter value by out-of-sample test. One way to choose the value of ρ
is to see which of the models found in part (d) has the highest log likelihood on a test set,
i.e., another day’s data, that was not used to create the model. For each of the 4 values of
the parameters you estimated in part (d), evaluate the log likelihood of another day’s number
of events,
N test = (0, 1, 3, 2, 3, 1, 4, 5, 3, 1, 4, 3, 5, 5, 2, 1, 1, 1, 2, 0, 1, 2, 1, 0).
Which hyper-parameter value ρ would you choose?
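Evaluating the log-likelihood on held-out counts is mechanical; a helper might look as follows (the constant rates used here are a made-up stand-in for the estimates from part (d)):

```python
import numpy as np
from math import lgamma

# Log-likelihood of observed hourly counts N_t under Poisson rates lambda_t:
# sum_t ( N_t log(lambda_t) - lambda_t - log(N_t!) ), for rates > 0.
def poisson_loglik(counts, rates):
    counts = np.asarray(counts, dtype=float)
    rates = np.asarray(rates, dtype=float)
    return float(np.sum(counts * np.log(rates) - rates
                        - np.array([lgamma(c + 1) for c in counts])))

N_test = np.array([0, 1, 3, 2, 3, 1, 4, 5, 3, 1, 4, 3,
                   5, 5, 2, 1, 1, 1, 2, 0, 1, 2, 1, 0])
rates = np.full(24, N_test.mean())   # hypothetical constant-rate model
ll = poisson_loglik(N_test, rates)
assert ll < 0.0                      # log of a product of pmf values < 1
```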
7.24 Morphing between two discrete distributions. Consider two distributions for a random variable that
takes values in {1, 2, . . . , n}, given by q, r ∈ Rn , with q ⪰ 0, 1T q = 1, and r ⪰ 0, 1T r = 1. We
seek a sequence of distributions p(i) , i = 1, . . . , N , that ‘morph’ between q and r. This means that
p(1) = q, p(N ) = r, and p(i+1) is close to p(i) for i = 1, . . . , (N − 1), in some sense. Specifically we
will minimize
Σ_{i=1}^{N −1} d(p(i) , p(i+1) ),
where d is a distance function between distributions; several choices are described below.
(a) Euclidean morphing. What is the solution when the distance function is the sum of squares,
dsq (u, v) = ∥u − v∥22 ? The solution is simple; you can just give it without justification.
(b) Hellinger morphing. Now we use the Hellinger distance function
dhel (u, v) = Σ_{i=1}^{n} ( √ui − √vi )2 .
Explain how to solve the Hellinger morphing problem using convex optimization.
(c) Kolmogorov morphing. Now we use the Kolmogorov distance function
dkol (u, v) = max_{i=1,...,n} | Σ_{j=1}^{i} uj − Σ_{j=1}^{i} vj | ,
which is the ℓ∞ distance between the respective cumulative distributions (using the order
of the outcomes). Explain how to solve the Kolmogorov morphing problem using convex
optimization.
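The three distance functions can be computed directly; the sketch below uses made-up distributions q and r, not the data from morphing_data.*:

```python
import numpy as np

# The three distance functions from parts (a)-(c), for distributions on
# {1, ..., n}.  q and r below are small made-up distributions.
def d_sq(u, v):
    return np.sum((u - v)**2)

def d_hel(u, v):
    return np.sum((np.sqrt(u) - np.sqrt(v))**2)

def d_kol(u, v):
    # l_infty distance between the cumulative distributions
    return np.max(np.abs(np.cumsum(u) - np.cumsum(v)))

q = np.array([0.5, 0.3, 0.2])
r = np.array([0.2, 0.3, 0.5])
assert d_sq(q, q) == 0 and d_hel(q, q) == 0 and d_kol(q, q) == 0
assert np.isclose(d_kol(q, r), 0.3)
```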
(d) Find the Euclidean, Hellinger, and Kolmogorov morphings for N = 10, n = 100. Use q and r
provided in morphing_data.*. Plot each p(i) versus n. Produce one figure for each choice of
distance function.
Note. In Python and Julia, you should use the ECOS solver.
7.25 Constrained maximum likelihood estimation of mean and covariance. You are given some indepen-
dent samples x1 , . . . , xN ∈ Rn from a Gaussian distribution N (µ, Σ).
(a) Explain how to find the maximum-likelihood estimate of µ and Σ, subject to the constraint
that Σ−1 µ ⪰ 0, using convex optimization. You must fully justify any change of variables.
Finance interpretation. (Not needed to solve the problem.) Suppose x ∼ N (µ, Σ) is the return
of n assets. The portfolio vector h that maximizes the risk-adjusted return µT h−γhT Σh, where
γ > 0 is the risk aversion parameter, is h = (1/2γ)Σ−1 µ. So the constraint in the problem
above is that the optimal portfolio has nonnegative entries, i.e., is a long-only portfolio. The
constrained maximum-likelihood estimate finds the maximum likelihood mean and covariance
of the return distribution, subject to the constraint that the associated optimal portfolio is
long-only.
Probability interpretation. (Not needed to solve the problem.) The constraint Σ−1 µ ⪰ 0 is the
same as ∇p(0) ⪰ 0, where p is the density of the N (µ, Σ) distribution. In other words, at 0,
the density is nondecreasing in each coordinate.
(b) Use your method on the data in long_only_ml_data.*. The data file also contains the ‘true’
mean and covariance, from which the data are generated. (Of course in any practical
application, you would not know these.) Report the ℓ2 distance between your estimated mean
and the true mean, and also the Frobenius norm of the difference between your estimated
covariance and the true covariance.
Repeat for the maximum likelihood estimate of µ and Σ without the constraint Σ−1 µ ⪰ 0.
(That is, find the maximum likelihood estimates and give the distances to the true mean and
covariance.) Hint. The unconstrained maximum likelihood estimates are the empirical mean
and covariance of the data, when the empirical covariance is positive definite.
7.26 c-Optimal experiment design [Harman and Jurı́k]. The optimization problem
minimize cT (A diag(x)AT )−1 c
subject to x ⪰ 0, 1T x = 1,                  (30)
with variable x ∈ Rn , where c is a nonzero m-vector and A an m × n matrix, is known in statistics
as the c-optimal experiment design problem. Here we show that it can be reformulated as an LP.
Since x ⪰ 0, the matrix A diag(x)AT is at least positive semidefinite. If it is not positive definite,
we interpret the cost function as cT (A diag(x)AT )† c if c is in the range of A diag(x)AT , and as
+∞ otherwise.
Hint. To show the equivalence between (30) and (31), assume x is fixed in (31), with x ⪰ 0 and
c in the range of A diag(x)AT . The minimization problem over y is an equality-constrained
quadratic optimization problem. Use the optimality conditions to find the optimal y as a
function of x.
(b) Use the result of part (a) to show that the solution of (30) is xk = |ŷk |/∥ŷ∥1 , where ŷ is the
solution of
minimize   ∥y∥21
subject to Ay + c = 0.                           (32)
This can be further reduced to an LP.
Hint. Follow a similar idea as in the hint of part (a), but now assume y is fixed in (31) and
optimize over x.
where λ > 0 is a given hyper-parameter. (Note that log at+1 − log at can be interpreted as the
fractional change in the scaling parameter from t to t + 1.)
(a) Show how to solve this fitting problem using convex or quasiconvex optimization. Fully justify
any changes of variables, or relaxations, that your method uses.
(b) Carry out your method on the data given in covar_series_data.*, for the three hyper-
parameter values λ = 0.01, λ = 1, λ = 100. (This gives three different estimates of the scale
factor time series.) Plot these three estimates versus t.
(c) Validation. The data covar_series_data.* contains another time series y1 , . . . , yT from the
same source. Evaluate the negative log likelihood of your three models obtained in part (b)
on this validation data set. Which of the three hyper-parameter values achieves the smallest
negative log-likelihood?
7.28 Elliptical distributions. An elliptical distribution on Rn has a probability density function of the
form
p(x) = Cg((x − µ)T Σ−1 (x − µ)),
where µ ∈ Rn and Σ ∈ Sn++ are parameters, and g : R → R is an unnormalized density function.
The constant C normalizes the density:
C = ( ∫ g((x − µ)T Σ−1 (x − µ)) dx )−1 .
(We assume g is such that the integral is finite for any choice of Σ ∈ Sn++ and µ ∈ Rn .) When
g(u) = exp(−u), the elliptical distribution reduces to a Gaussian.
You are given independent samples x1 , . . . , xN ∈ Rn from an elliptical distribution. Suppose that
g is log-concave and decreasing. Explain how to find the maximum-likelihood estimate of µ and Σ
using convex optimization.
Hint. Define z = Σ−1/2 (x − µ). Then
C = det Σ−1/2 ( ∫ g(zT z) dz )−1
  = det Σ−1/2 ( ∫ g(u2 ) nV un−1 du )−1 ,
where u = ∥z∥2 and V = π n/2 /Γ(n/2 + 1) is the hyper-volume of the unit sphere in Rn .
7.29 Maximum likelihood prediction of team ability. (A more CVX-friendly tweak of problem 7.4.) A
set of n teams compete in a tournament. We model each team’s ability by a number aj ∈ [0, 1],
j = 1, . . . , n. When teams j and k play each other, the probability that team j wins is equal to
prob(aj − ak + v > 0), where v is a symmetric random variable with density
p(v) = 2σ−1 /(ev/σ + e−v/σ )2 ,
where σ controls the standard deviation of v. For this question, you will likely find it useful that
the cumulative distribution function (CDF) of v is
F (t) = ∫ t−∞ p(v) dv = et/σ /(et/σ + e−t/σ ).
You are given the outcomes of m past games. These are organized as a matrix with rows
(j (i) , k (i) , y (i) ), i = 1, . . . , m,
meaning that game i was played between teams j (i) and k (i) ; y (i) = 1 means that team j (i) won,
while y (i) = −1 means that team k (i) won. (We assume there are no ties.)
(a) Formulate the problem of finding the maximum likelihood estimate of team abilities, â ∈ Rn ,
given the outcomes, as a convex optimization problem. You will find the game incidence
matrix A ∈ Rm×n , defined as
Ail = y (i) if l = j (i) ,   Ail = −y (i) if l = k (i) ,   Ail = 0 otherwise,
useful.
The prior constraints âi ∈ [0, 1] should be included in the problem formulation. Also, we
note that if a constant is added to all team abilities, there is no change in the probabilities of
game outcomes. This means that â is determined only up to a constant, like a potential. But
this doesn’t affect the ML estimation problem, or any subsequent predictions made using the
estimated parameters.
(b) Find â for the team data given in team_data.jl, in the matrix train. (This matrix gives the
outcomes for a tournament in which each team plays each other team once.)
You can form A using the commands
using SparseArrays;
A1 = sparse(1:m, train[:, 1], train[:,3], m, n);
A2 = sparse(1:m, train[:, 2], -train[:,3], m, n);
A = A1 + A2;
(c) Use the maximum likelihood estimate â found in part (b) to predict the outcomes of next
year’s tournament games, given in the matrix test, using ŷ (i) = sign(âj (i) − âk(i) ). Compare
these predictions with the actual outcomes, given in the third column of test. Give the
fraction of correctly predicted outcomes.
The games played in train and test are the same, so another, simpler method for predicting
the outcomes in test is to just assume the team that won last year’s match will also win this
year’s match. Give the percentage of correctly predicted outcomes using this simple method.
7.30 Duals of some multiclass classification problems. In the k-class multiclass classification problem,
we are given data pairs {xi , yi } ∈ Rn × {1, . . . , k}, where xi ∈ Rn are covariates with which we
wish to predict the label yi ∈ {1, . . . , k}, and i = 1, . . . , m. A typical approach is to seek a classifier
represented by a matrix Θ ∈ Rk×n whose rows are θ1T , . . . , θkT ,
and find Θ so that θyTi xi ≫ θjT xi for all j ̸= yi and i = 1, . . . , m, that is, the correct label is assigned
a much higher score than incorrect labels. In this problem, you will compute duals for different
approaches for the multiclass classification problem.
(a) For a set of scores z1 , . . . , zk ∈ R and a label y ∈ {1, . . . , k}, the multiclass logistic loss is
ℓmc (z, y) = log Σkj=1 exp(zj − zy ) ,
which is convex in z. Show that if zy ≤ zj for some index j ̸= y, ℓmc (z, y) ≥ log 2, while
inf z ℓmc (z, y) = 0.
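Numerical check. (Not needed for the proof.) The two claims are easy to illustrate; the helper below is our own, using logsumexp for stability:

```python
import numpy as np
from scipy.special import logsumexp

def ell_mc(z, y):
    """Multiclass logistic loss: log sum_j exp(z_j - z_y)."""
    z = np.asarray(z, dtype=float)
    return logsumexp(z - z[y])

# If z_y <= z_j for some j != y, the terms j and y alone already give >= log 2.
loss_bad = ell_mc([1.0, 2.0, 0.0], y=0)        # here z_0 <= z_1
# Driving z_y far above the rest sends the loss toward its infimum, 0.
loss_good = ell_mc([100.0, 0.0, 0.0], y=0)
```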
(b) Let ∥ · ∥ be an arbitrary norm on matrices and λ ≥ 0. Give a dual problem for the regularized
multiclass logistic regression problem
minimize Σmi=1 ℓmc (Θxi , yi ) + λ∥Θ∥.
Your dual may depend on the dual norm ∥·∥∗ of ∥·∥, but all other functions should be explicit.
(c) Instead of a general matrix norm, it is often useful to regularize individual rows of Θ. Give a
dual problem for
minimize Σmi=1 ℓmc (Θxi , yi ) + λ Σkj=1 ∥θj ∥.
Your dual may depend on the dual norm ∥·∥∗ of ∥·∥, but all other functions should be explicit.
(d) Instead of the multiclass logistic loss, it is sometimes useful to have a loss that decomposes
across all indices. Let ϕ : R → R+ be a non-increasing convex function. Give a dual problem
for
minimize Σmi=1 Σkj=1 ϕ(xTi θj − xTi θyi ) + λ∥Θ∥.
Your dual problem should be written in terms of the conjugates ϕ∗ and the dual norm ∥ · ∥∗ .
7.31 The kernel trick. Consider a binary classification problem with data in pairs (xi , yi ) ∈ Rn ×{−1, 1},
where we represent a classifier mapping x ∈ Rn to {±1} by θ ∈ Rn , predicting ŷ = sign(xT θ). Given
m observations (xi , yi ), i = 1, . . . , m, we solve the convex optimization problem
minimize Σmi=1 f (yi xTi θ) + (λ/2)∥θ∥22 .                    (33)
In many scenarios, it is useful to consider functions of x, including (but not limited to) polynomials,
Fourier transforms, or other nonlinearities. In this case, for a feature mapping φ : Rn → RN we
instead predict with θ ∈ RN and use ŷ = sign(θT φ(x)). For example, if
φ(x) = (1, x1 , . . . , xn , x21 , x1 x2 , x1 x3 , . . . , x1 xn , x2 x1 , . . . , xn−1 xn , x2n ) ∈ R1+n+n2 ,
we can represent all quadratic functions of the input vector x ∈ Rn . Instead of solving problem (33),
we wish to find a classifier based on a (nonlinear) transformation of the xi vectors, replacing xi by
φ(xi ) ∈ RN , that is, we wish to solve
minimize Σmi=1 f (yi φ(xi )T θ) + (λ/2)∥θ∥22 .                (34)
By part (b), the solution to this must satisfy θ⋆ = Σmi=1 νi φ(xi ) for some ν ∈ Rm , and therefore
we may classify a new instance x by evaluating
φ(x)T θ⋆ = Σmi=1 νi φ(x)T φ(xi ).
The kernel trick works as follows. In statistical machine learning parlance, a kernel function is
a symmetric function K : Rn × Rn → R that can be written as K(x, z) = φ(x)T φ(z) for some
mapping φ : Rn → RN , where N may even be infinite. In many cases, the inner products φ(x)T φ(z)
arising in problem (34) may be efficiently evaluated by such a kernel, even when N is large. These mappings
can allow one to introduce nonlinearities in classification rules that are quite effective.
For completeness, we enumerate a few different kernel functions to highlight why these ideas may
be important. For k ∈ Z+ and x ∈ Rn define the tensor
X = x⊗k ∈ Rn × Rn × · · · × Rn (k times)
to have entries Xi1 ,...,ik = xi1 xi2 · · · xik , and let vec(X) ∈ Rnk be the vectorized version of X, that
is, we simply stack all entries of X on one another. (We say x⊗0 = 1.) Then for any x, z ∈ Rn and
degree d ∈ Z+ we have
(1 + xT z)d = φ(x)T φ(z), where φ(x) stacks the vectors C(d, k)1/2 vec(x⊗k ) for k = 0, . . . , d,
and C(d, k) denotes the binomial coefficient.
Note that the dimension of φ(x) is Σdi=0 ni = (nd+1 − 1)/(n − 1), and the structure of φ shows
that the kernel K(x, z) = (1 + xT z)d can be evaluated in O(n) operations, even though the
dimension of φ(x) grows exponentially in d.
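Numerical check. (Not needed to solve the problem.) The polynomial-kernel identity above can be confirmed directly; phi here is our own brute-force construction of the tensor feature map:

```python
import numpy as np
from math import comb

def phi(x, d):
    """Stack sqrt(C(d, k)) * vec(x tensor-power k) for k = 0, ..., d."""
    feats = []
    for k in range(d + 1):
        X = np.ones(())                        # x^{tensor 0} = 1 (scalar)
        for _ in range(k):
            X = np.multiply.outer(X, x)        # build the k-fold tensor
        feats.append(np.sqrt(comb(d, k)) * np.ravel(X))
    return np.concatenate(feats)

rng = np.random.default_rng(1)
n, d = 4, 3
x, z = rng.normal(size=n), rng.normal(size=n)
lhs = (1 + x @ z) ** d                         # O(n) kernel evaluation
rhs = phi(x, d) @ phi(z, d)                    # O(n^d)-dimensional inner product
```

The two sides agree, and the length of phi(x) matches the dimension formula in the note above.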
7.32 Maximum likelihood estimation with conditional independence priors. Let X ∈ Rn be a zero mean
Gaussian random vector with covariance matrix Σ ∈ Sn++ , i.e., with density
p(x) = (2π)−n/2 (det Σ)−1/2 exp(−xT Σ−1 x/2).
Given IID observations x(1) , . . . , x(m) of X, we seek Σ̂, the maximum likelihood estimate of Σ subject
to the conditional independence constraints. You can assume that the empirical covariance Y =
(1/m) Σmk=1 x(k) (x(k) )T is positive definite. (The empirical covariance Y is the maximum likelihood
estimate of Σ without the conditional independence constraints.)
(a) Explain how to solve this problem using convex optimization. If your method involves a change
of variables, be sure to explain how to recover Σ̂ from a solution of your problem.
(b) Solve the instance of the problem with data given in mle_cond_ind_data.* and prior condi-
tional independence information
Give the maximum likelihood estimate Σ̂ and verify numerically that the conditional indepen-
dence constraint is satisfied.
The data file includes the ‘true’ covariance matrix Σtrue , stored in Sigma true. (Of course,
in any practical application you would not have Σtrue .) Give ∥Σ̂ − Σtrue ∥F . Compare with
∥Y − Σtrue ∥F , the estimation error obtained without the conditional independence constraints.
7.33 Maximum entropy distribution with quartile constraints. Let X be a random variable on [−1, 1]
with mean µ, variance σ 2 , 25th percentile q25 , median q50 , and 75th percentile q75 . Your goal is to
find a density function f : [−1, 1] → R+ that maximizes the entropy,
H(f ) = − ∫ 1−1 f (x) log f (x) dx,
subject to the given mean, variance, and quartiles. (In the formula above, we define f (x) log f (x)
as 0 when f (x) = 0.)
You will work with a discretized version of the problem. We take xi = −1 + 2i/N , for i = 1, . . . , N ,
where N is the number of points in the discretization. You will specify the density f by its values
at xi , pi = f (xi ), i = 1, . . . , N , so the density function f is given by a vector p ∈ RN . You can
use the Riemann sum approximation of any integral, i.e., for any function g : [−1, 1] → R you can
replace
∫ 1−1 g(x) dx
with the approximation
(2/N ) ΣNi=1 g(xi ).
(a) Solve the problem with N = 300 and the given data
Plot the resulting probability distribution and its cumulative distribution function. Add dots
to the CDF plot at (−0.3, 0.25), (−0.05, 0.5), and (0.1, 0.75). (Your CDF should pass through
these points.)
(b) Repeat part (a) without the quartile constraints, i.e., find the maximum entropy density
subject only to the mean and variance given in part (a). Hint: The maximum entropy
distribution on R with mean µ and variance σ 2 is the Gaussian distribution N (µ, σ 2 ). Your
distribution should look like N (µ, σ 2 ), truncated to [−1, 1].
7.34 Time varying regression model. We are given data of the form (x, t), where x ∈ Rn is a vector of
features, and t ∈ {1, . . . , T } is a time stamp (which you can also consider to be a feature). We are
also given the corresponding label or outcome y ∈ R. The data set has the form
((xi , ti ), yi ), i = 1, . . . , m.
We allow for the possibility that multiple data points can have the same time stamp. We also allow
for the possibility that for some values of t, we have no data points.
We will fit the data with a time-varying regression model, ŷ = θtT x, where θt ∈ Rn is the regression
parameter vector for time period t. Unless we have many data points for each time stamp, this
model will likely be very overfit.
To combat this, we add regularization that encourages θt and θt+1 to be near each other. In other
words, we want the regression model to vary smoothly over time. We choose θ1 , . . . , θT to minimize
Σmi=1 p(ŷi − yi ) + λ ΣT−1 t=1 q(θt+1 − θt ),
In this problem, the fi are closed convex functions.
(a) Formulate the problem of minimizing h(x) as a convex optimization problem. You may intro-
duce new variables.
(b) Suppose the loss for task i is the squared error
fi (x) = (1/2)(aTi x − bi )2 ,
where ai ∈ Rn , bi ∈ R. Formulate minimizing h as a quadratic program with the single
variable x ∈ Rn . Be as explicit as you can.
7.36 Let x(1) , . . . , x(N ) be independent samples from an N (µ, Σ) distribution, where it is known that
Σmin ⪯ Σ ⪯ Σmax , where Σmin and Σmax are given positive definite matrices. Roughly speaking,
we are given lower and upper bounds on the covariance matrix.
Explain how to find the maximum likelihood estimate of µ and Σ (including the constraint on Σ) using
convex optimization. Explain any change of variables you use.
7.37 Estimating mixture coefficients. We are given N IID samples x1 , . . . , xN ∈ Rm from a distribution
with mixture density
p(x; λ) = Σkj=1 λj pj (x),
where λ ∈ Rk+ , with 1T λ = 1, are the mixture coefficients, and p1 , . . . , pk are given densities on
Rm .
(a) Explain how to use convex optimization to find the maximum likelihood estimate of the
mixture coefficients λml ∈ Rk+ . (You can assume that the maximum likelihood problem is well
posed, i.e., there is an optimal λml ∈ Rk+ .) If you change variables, or form a relaxation, be
sure to fully justify it.
Note. We will not accept methods or algorithms from other courses or fields, even if they
work.
(b) The data files mixture_coeffs_data.* contain code that generates N = 100 samples from a
mixture of k = 3 distributions on R,
with mixture coefficients λtrue = (0.3, 0.5, 0.2). The first distribution is Gaussian with mean 3
and variance 4; the second is a uniform distribution on [−1, 2], and the third is a Laplace or
double-sided exponential distribution with mean −2 and shape parameter 3, which has density
p(x) = (1/6) exp(−|x + 2|/3). The data file contains code for evaluating the density values at the
sample points, i.e., pj (xi ), j = 1, 2, 3 and i = 1, . . . , N .
Carry out the method of part (a) on this data. Compare the ML estimate of the mixture
coefficients with their true values. Plot the true and estimated mixture densities on the same
plot. The data file also contains code for these plots; you just have to plug in your λml . (Of
course, in any real problem you would not have a ‘true’ distribution.)
7.38 Bounding the median. We consider a random variable X on [−3, 3], with moments
E X = 0, E X 2 = 1, E X 3 = 1.
The first two state that X is standardized; the last tells us that X has significant positive skew.
Our goal is to find the range of possible values of med(X), its median or 50th percentile, over all
distributions consistent with the three moment constraints above.
We consider a simple discretization of this problem, where X takes on N values uniformly spaced
on [−3, 3], i.e., xk = −3 + (k − 1)(6/(N − 1)), k = 1, . . . , N . We take the median to be med(X) =
min{xk | prob(X ≤ xk ) ≥ 1/2}.
(a) Explain how to use convex optimization, or quasiconvex optimization, to find the range of
possible values of med(X).
(b) Solve the problem for N = 300. Give the minimum possible value and maximum possible
values of med(X), and plot the associated cumulative distribution functions.
7.39 Linear classifier for one Gaussian versus others. We consider m independent Gaussian random
variables in Rn , Zi ∼ N (µi , Σi ), i = 1, . . . , m. Our goal is to find a halfspace H = {x | cT x+d ≥ 0},
with c ̸= 0, that separates Z1 from Z2 , . . . , Zm in the sense that prob(Z1 ∈ H) is large, while
prob(Zi ∈ H) is small for i = 2, . . . , m. We do this by solving the problem
maximize prob(Z1 ∈ H)
subject to prob(Zi ∈ H) ≤ η, i = 2, . . . , m,
where η ∈ (0, 1/2) is a given upper limit. You can assume that there exists a feasible hyperplane
with prob(Z1 ∈ H) ≥ 1/2. The variables are the vector c and constant d that define H. (The
solution is not unique, since we can multiply c and d by any positive scale factor without affecting
H.)
Remark. (Not needed to solve the problem.) For m = 2, the classifier found by this method is
the same as Fisher’s linear discriminant analysis (LDA). So the method described above can be
considered a generalization of LDA to multiple Gaussians in one of the classes.
(a) Explain how to use convex or quasiconvex optimization to solve the problem above. Justify
any relaxations or change of variables.
(b) Solve the problem instance with data given in linear_classifier_gaussian_data.*. Give
the optimal value, and an optimal c and d (which are not unique). The data file contains a
function that will plot the hyperplane you find, along with the one-σ ellipsoids for the Gaussian
distributions. Create this plot and submit it.
Coding help. The normal distribution’s CDF is Φ(t) = prob(N (0, 1) ≤ t). In Python, you
can compute the normal distribution’s inverse CDF Φ−1 via scipy.stats.norm.ppf. In Julia,
you can use quantile.(Normal(),·) while using the packages Random and Distributions.
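More coding help. (Not part of the exercise statement.) The key fact is that for Z ∼ N (µ, Σ), prob(Z ∈ H) = Φ((cT µ + d)/(cT Σc)1/2 ), since cT Z + d is a scalar Gaussian. A Monte Carlo check with made-up numbers:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
mu = np.array([1.0, 0.5])                      # made-up parameters
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])
c = np.array([0.8, -0.4])
d = 0.2

# c^T Z + d ~ N(c^T mu + d, c^T Sigma c), so the halfspace probability is a CDF value.
p_exact = norm.cdf((c @ mu + d) / np.sqrt(c @ Sigma @ c))
Z = rng.multivariate_normal(mu, Sigma, size=200_000)
p_mc = np.mean(Z @ c + d >= 0)
```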
7.40 Estimating a sparse covariance matrix. We are given independent samples xi ∼ N (0, Σ), i =
1, . . . , N , and wish to estimate Σ ∈ Sn++ , taking into account prior information that Σ is sparse.
To do this we minimize the negative log-likelihood of the samples, plus a sparsifying regularizer of
the form
r(Σ) = λ Σi<j |Σij | .
(a) Explain how to approximately solve this estimation problem using the convex-concave proce-
dure. If you use a change of variables or relaxation, explain and justify it. Give the linearization
of the concave term, and the optimization problem you solve to find the update.
(b) Carry out the method of part (a) on the problem instance given in
estim_sparse_cov_data.*, which also gives a specific value for λ. Plot the convergence of
the objective versus iteration for a few different initial guesses Σ1 . Print out your estimate of
the covariance matrix, and use the plotting helper in the data file to plot the true, estimated,
and empirical covariance matrices.
The data is generated from a true distribution Σtrue . Give the error ∥Σtrue − Σ̂∥1 .
7.41 Autoregressive process with Poisson conditionals. We consider a stochastic process X1 , X2 , . . . with
each Xt having Poisson distribution,
prob(Xt = k) = e−λt λkt /k!,   k = 0, 1, 2, . . . ,
where λt > 0 is the rate or mean. In the formula above, we take λ0t = 1 and 0! = 1, so prob(Xt =
0) = e−λt . The process is autoregressive (AR), with
λt = νω Xt−1 , t = 2, 3, . . . ,
where ν and ω are positive parameters that define the process. You can assume that λ1 = ν, which
is the same as assuming that X0 = 0.
This AR process can model excitatory and inhibitory behavior. When ω = 1, the values Xt are
independent and identically distributed Poisson random variables with mean ν. With ω > 1,
whenever Xt > 0, the mean of the following value Xt+1 increases. This is called excitatory, since
it typically leads to bursts of positive values of Xt . (The term self-exciting is sometimes used to
describe such a process.) With ω < 1, whenever Xt > 0, the mean of the following value Xt+1
decreases. This is called inhibitory, since it typically leads to zero values of Xt following a positive
value. (These interpretations are not needed to solve this problem.)
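A short simulation of the process makes the excitatory behavior easy to see; the parameter values below are the ν, ω mentioned in part (b):

```python
import numpy as np

rng = np.random.default_rng(0)
nu, omega, T = 0.3, 1.5, 500       # omega > 1: excitatory
x = np.zeros(T, dtype=int)
lam = nu                            # lambda_1 = nu, i.e., x_0 = 0
for t in range(T):
    x[t] = rng.poisson(lam)         # X_t ~ Poisson(lambda_t)
    lam = nu * omega ** x[t]        # lambda_{t+1} = nu * omega^{x_t}
```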
Suppose we have data x1 , . . . , xT ∈ Z+ (the nonnegative integers) and wish to fit the process
described above to it, i.e., choose ν and ω, using maximum likelihood estimation.
(a) Explain how to do this using convex optimization. If you use a change of variable or introduce
new variables, explain your reasoning.
Remark. This particular problem involves only two variables, ν and ω. We can easily solve
such a problem in practice by just plotting the likelihood function. But simple generalizations
of it, for example increasing the memory of the AR process, leads to a problem with more
than two variables, which is worthy of a convex optimization formulation.
(b) Carry out the method from part (a) on the data given in poisson_ar_data.*. Give your
estimated values of ν and ω. You are welcome to compare these to the true values from
which we generate the data, ν true = 0.3 and ω true = 1.5.
7.42 Computational bounds on the values of a copula. A copula C : [0, 1]d → [0, 1] is the cumulative
distribution function of a random variable X ∈ [0, 1]d , where each marginal Xi has a uniform
distribution. Copulas are used in areas like insurance to estimate the probability of a big loss.
(You don’t need to know this to solve this problem.)
We will focus on the case d = 2, where
C(u) = prob(X1 ≤ u1 , X2 ≤ u2 ).
Here are some examples. If X1 = X2 is uniform on [0, 1], then C(u) = min{u1 , u2 }. If X1 = 1 − X2 ,
with X1 uniform on [0, 1], we have C(u) = max{u1 + u2 − 1, 0}. If X1 and X2 are independent and
uniform on [0, 1], then C(u) = u1 u2 .
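Numerical check. (Not needed to solve the problem.) The three example copulas can be confirmed by simulation; the evaluation point (0.7, 0.6) is our own arbitrary choice:

```python
import numpy as np

rng = np.random.default_rng(0)
M = 500_000
U = rng.uniform(size=M)
V = rng.uniform(size=M)
u1, u2 = 0.7, 0.6

c_comono = np.mean((U <= u1) & (U <= u2))        # X1 = X2: min(u1, u2)
c_counter = np.mean((U <= u1) & (1 - U <= u2))   # X2 = 1 - X1: max(u1 + u2 - 1, 0)
c_indep = np.mean((U <= u1) & (V <= u2))         # independent: u1 * u2
```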
We are given the value of a copula at N points u(i) ∈ [0, 1]2 ,
C(u(i) ) = ci , i = 1, . . . , N.
Our goal is to find the range of possible values of C(0.5, 0.5), over all possible copulas consistent
with these given values. (You can assume that there is at least one copula consistent with the given
values.)
To carry out the computation we will discretize the values of u to an M × M uniform grid,
((i − 1)/(M − 1), (j − 1)/(M − 1)),   i, j = 1, . . . , M.
With this discretization we can represent C as an M × M matrix. You can assume that (0.5, 0.5)
and the given points u(i) are all on this grid. The matrix C represents a copula if and only if the
following three conditions hold:
• C1,i = Ci,1 = 0 for i = 1, . . . , M ,
• CM,i = Ci,M = (i − 1)/(M − 1) for i = 1, . . . , M (uniform marginals), and
• Ci+1,j+1 − Ci,j+1 − Ci+1,j + Ci,j ≥ 0 for i, j = 1, . . . , M − 1.
(The lefthand quantity in the last condition is the probability that X is in the rectangle with lower
and upper limits (i − 1, j − 1)/(M − 1) and (i, j)/(M − 1).)
(a) Explain how to find the minimum and maximum possible values of C(0.5, 0.5) consistent with
the given values, using convex optimization. Explain any change of variables or relaxation you
use.
(b) Carry out the method of part (a) for the specific case M = 101 and
These constraints mean that the copula C agrees with the copula when X1 and X2 are inde-
pendent, at the four given points.
Report the minimum and maximum values of C(0.5, 0.5).
7.43 Consistent decile regression. We model the 9 deciles (i.e., 10%, . . . , 90% quantiles) of the conditional
distribution of an outcome y ∈ R as a function of a feature vector x ∈ Rn , using the regression
model
q̂ = v + θx,
where q̂ ∈ R9 is the vector of our estimates of the 10%, . . . , 90% quantiles, and the regression
model parameters are v ∈ R9 and θ ∈ R9×n .
For given model parameters v and θ and feature vector x, we say that the decile estimates are
consistent if they are in the correct order, i.e., q̂1 ≤ q̂2 ≤ · · · ≤ q̂9 .
(It’s always socially awkward when you estimate the 30% quantile to be smaller than the 20%
quantile.) We can write consistency as Dq̂ ⪰ 0, where D ∈ R8×9 is the first difference matrix, i.e.,
Du = (u2 − u1 , . . . , u9 − u8 ).
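For concreteness, D and its action on a vector can be written in one line of NumPy (indices are 0-based in code):

```python
import numpy as np

# first difference matrix: (D u)_i = u_{i+1} - u_i
D = np.eye(8, 9, k=1) - np.eye(8, 9)
# any sorted q-hat is consistent, i.e., D q-hat >= 0
q = np.sort(np.random.default_rng(0).normal(size=9))
```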
We will assume that the features have been constructed in such a way that xi ∈ [−1, 1] always
holds, i.e., ∥x∥∞ ≤ 1. We will require that for any such feature vector x, we have consistency, i.e.,
Dq̂ ⪰ 0. This imposes a constraint on (v, θ), which we write as (v, θ) ∈ C. Note that this constraint
requires consistency for all possible feature vectors (that satisfy ∥x∥∞ ≤ 1), not just on the given
data. In particular, C does not depend on the given data.
We will fit the decile regression model to given training data with features and outcomes
x1 , . . . , xN ∈ Rn , y1 , . . . , yN ∈ R.
We choose (v, θ) to minimize the total pinball loss ΣNi=1 Σ9j=1 ϕj ((q̂i )j − yi ), with q̂i = v + θxi ,
where ϕj : R → R are given by
ϕj (u) = { −0.1j u,  u < 0;  (1 − 0.1j)u,  u ≥ 0 } = (1/2)|u| + (1/2 − 0.1j)u,
and θjT is the jth row of θ. This loss is minimized subject to (v, θ) ∈ C.
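Numerical check. (Not needed to solve the problem.) The piecewise and closed-form expressions for ϕj agree:

```python
import numpy as np

def phi_piecewise(u, j):
    # -0.1j u for u < 0, (1 - 0.1j) u for u >= 0
    return np.where(u < 0, -0.1 * j * u, (1 - 0.1 * j) * u)

def phi_closed(u, j):
    # (1/2)|u| + (1/2 - 0.1j) u
    return 0.5 * np.abs(u) + (0.5 - 0.1 * j) * u

u = np.linspace(-2.0, 2.0, 401)
max_gap = max(np.max(np.abs(phi_piecewise(u, j) - phi_closed(u, j)))
              for j in range(1, 10))
```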
(a) Explain how to use convex optimization to carry out the fitting method described above, i.e.,
consistent decile regression. In particular, give an explicit description of C, that can be used
in CVXPY.
(b) Carry out the method of part (a) on the data found in consistent_decile_data.py, where
xTi are given as the rows of an N × n matrix X. Give your estimates v̂ and the first row of θ̂,
θ̂1T (so you don’t have to include the whole matrix). Find the fraction of the data samples i
for which yi ≤ (q̂i )j , where q̂i = v̂ + θ̂xi is the vector of decile predictions for data sample i,
and (q̂i )j is the prediction of the jth decile on the ith data sample. We expect the fraction of
samples for which yi ≤ (q̂i )j holds to be around 0.1j, j = 1, . . . , 9.
(c) Fit the decile regression model without the constraint (v, θ) ∈ C, by simply minimizing the
sum of the pinball losses over the given data. Give the model coefficients ṽ and θ̃1T (the first
row of θ̃) that you find. (This breaks into 9 independent decile regression problems, but you’re
welcome to solve it as one.)
Find a feature vector xinc , with ∥xinc ∥∞ ≤ 1, for which this decile regression model is incon-
sistent, i.e., we have Dq̃ = D(ṽ + θ̃xinc ) ̸⪰ 0. (Verify that this holds for the xinc you find.)
The feature vector xinc need not be one of the given data points; it can be any vector that
satisfies ∥x∥∞ ≤ 1.
7.44 Fitting a K-Markov chain to a sequence of distributions. We have a sequence of probability distri-
butions π1 , . . . , πT , with πt ∈ Rn+ and 1T πt = 1. We wish to fit a K-Markov model to these, of
the form
π̂t+1 = A1 πt + · · · + AK πt−K+1 , t = K, . . . , T − 1,
where A1 , . . . , AK ∈ Rn×n are the model parameters we are to choose. We will use an average ℓ1
loss, ∥π̂t+1 − πt+1 ∥1 , over the data set to choose the model coefficients A1 , . . . , AK , i.e., we will
minimize
(1/(T − K)) ΣT−1 t=K ∥π̂t+1 − πt+1 ∥1 .
(There are many other choices for loss function, such as mean square error or average KL diver-
gence.)
A basic requirement on A1 , . . . , AK is that π̂t+1 is a probability distribution (i.e., has nonnegative
entries and sums to one) for any probability distributions πt , . . . , πt−K+1 . The condition that
1T π̂t+1 = 1 whenever 1T πt = · · · = 1T πt−K+1 = 1 is equivalent to
ATk 1 = αk 1, k = 1, . . . , K,
for some αk satisfying α1 + · · · + αK = 1. A sufficient condition for the entries of π̂t+1 to be
nonnegative is that the entries of A1 , . . . , AK are nonnegative. We will impose these two conditions
on A1 , . . . , AK . Note that when K = 1, these conditions reduce to AT1 1 = 1 and (A1 )ij ≥ 0, i.e., A1
is a stochastic matrix.
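Numerical check. (Not needed to solve the problem.) Matrices built to satisfy the two conditions do map distributions to distributions; the construction below is our own:

```python
import numpy as np

rng = np.random.default_rng(0)
n, K = 4, 2
alpha = np.array([0.6, 0.4])                   # alpha_1 + alpha_2 = 1

A = []
for k in range(K):
    M = rng.uniform(size=(n, n))               # nonnegative entries
    A.append(alpha[k] * M / M.sum(axis=0))     # columns sum to alpha_k: A_k^T 1 = alpha_k 1

pis = [rng.dirichlet(np.ones(n)) for _ in range(K)]   # arbitrary distributions
pi_hat = sum(A[k] @ pis[k] for k in range(K))  # predicted distribution
```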
(a) Explain how to solve the fitting problem using convex optimization. If you change variables
or form a relaxation, explain.
(b) Carry out the fitting for the problem instance with n = 4, K = 2, T = 100, and data π1 , . . . , πT
given in fit_k_markov_data.py as the n×T matrix Pi_train. Create a stackplot of the abso-
lute deviations between your predictions and the true data via plot_prediction_error(hat_Pi, Pi)
for both the train and test data. We also provide a test data set in Pi_test, of the same size.
Give the optimal value obtained and the coefficient matrices (to two significant figures). Eval-
uate the K-Markov model on the test data set, and report the average ℓ1 loss function on the
test data.
7.45 Minimax, maximin, and minimum average error detectors. We consider the optimal detector design
problem as in §7.3 of the book. Suppose X is a random variable with values in {1, . . . , n}, with a
distribution that depends on a parameter θ ∈ {1, . . . , m}. The m possible distributions of X are
represented by a matrix P ∈ Rn×m , with the jth column of P representing the distribution of X
when θ = j. Our goal is to design a (possibly) randomized detector, given by a matrix T ∈ Rm×n ,
with entries
tik = prob(θ̂ = i | X = k), i = 1, . . . , m, k = 1, . . . , n.
Evidently, we must have tik ≥ 0 and Σi tik = 1. Note that θ̂ represents an estimated value of θ, as
in statistics convention.
We let D = T P ∈ Rm×m denote the detection probability matrix. Its off-diagonal entries Dij ,
i ̸= j are the error probabilities. The diagonal entry Dii is the accuracy of detecting θ = i,
Dii = prob(θ̂ = i | θ = i).
The following problems concern the specific problem instance with probabilities given in the
minimax_detector_data.py. This file includes code that plots the m different distributions, as
well as a detector matrix T (which you will solve for) as a stack plot. We consider three different
detector design problems.
(a) Minimax detector. Minimize the maximum error probability maxi̸=j Dij .
(b) Maximin detector. Maximize the minimum detection accuracy mini Dii .
(c) Minimum average error detector. Minimize the average error (1/(m2 − m)) Σi̸=j Dij .
For each case, give the optimal value. Also give an optimal detection probability matrix D⋆ , and
a stack plot of an optimal detector T ⋆ .
Remarks.
7.46 Metric learning. We consider a set of N items, labeled i = 1, . . . , N , each one associated with a
feature vector xi ∈ Rn . We are given data that tells us that some pairs of items are similar, and
some pairs are dissimilar. These are given as two sets of pairs of indices, S and D, where {i, j} ∈ S
means that items i and j are similar, and {i, j} ∈ D means that items i and j are dissimilar. You
can assume that S ∩ D = ∅, i.e., no pairs are both similar and dissimilar, and you can assume that
the xi are distinct.
We consider a quadratic metric of the form d(x, y) = (x − y)T P (x − y), where P ∈ Sn+ . The matrix
P parametrizes the metric. Roughly speaking, we wish to choose P so that d(xi , xj ) is small when
{i, j} ∈ S, and large when {i, j} ∈ D.
We consider one simple formulation of this problem. We create a loss function of the form
L(P ) = Σ{i,j}∈S d(xi , xj ) + Σ{i,j}∈D 1/d(xi , xj ).
(In the righthand sum you can interpret 1/d(xi , xj ) as ∞ if d(xi , xj ) = 0.) We find (or learn) P by
minimizing this loss, subject to P ⪰ 0.
The learned metric matrix P can be used to classify items with feature vectors x and y as similar,
if d(x, y) ≤ 1, and dissimilar if d(x, y) > 1. The error rate is the fraction of pairs in S ∪ D that are
misclassified, i.e., {i, j} ∈ S but d(xi , xj ) > 1 or {i, j} ∈ D but d(xi , xj ) ≤ 1.
(a) Explain how to find P using convex optimization. Justify any change of variables or other
problem transformations.
(b) Carry out the method of part (a) on the data found in metric_learning_data.py. Report
the optimal value of the objective.
(c) Report the error rate of the learned metric on the training data S and D, and also on a set of
test data S test and Dtest given in the data file.
7.47 Conditional Gaussian regression. We wish to fit a model of the form y ∼ N (µ(x), σ 2 (x)), where
y ∈ R is an outcome, given a feature vector x ∈ Rd . We will assume that the features all lie in the
interval [−1, 1], i.e., ∥x∥∞ ≤ 1. We will use the parametrization ω(x) = 1/σ(x), κ(x) = µ(x)/σ(x). We
focus on linear regression, which means that ω(x) = αT x + u and κ(x) = β T x + v, where α ∈ Rd ,
β ∈ Rd are vectors of parameters, and u ∈ R, v ∈ R are offsets that together define our predictor.
We require that ω ≥ 0 for all x with ∥x∥∞ ≤ 1 (and not just the training data). We are given
training data x1 , . . . , xN ∈ Rd and y1 , . . . , yN ∈ R.
(a) Formulate the problem of finding the maximum likelihood estimate of α, β, u, v given the
training data as a convex optimization problem.
(b) Carry out the method of part (a) on the data found in cond_gauss_reg_data.py. What are
optimal α⋆ , β ⋆ , u⋆ , v ⋆ ? Report the mean and variance of the conditional distribution of the
first outcome y1 given x1 . Round reported values to two decimal places.
7.48 Maximum entropy correction and completion of a correlation matrix. We consider a correlation
matrix C ∈ Sn++ , which means Cii = 1, i = 1, . . . , n. We are given only some of the off-diagonal
entries of C, and some of these data can be wrong. We denote these given values as C^g_ij , (i, j) ∈ G,
where G ⊆ {1, . . . , n}^2 is the set of indices of given off-diagonal values. Since C is symmetric, we
have (i, j) ∈ G whenever (j, i) ∈ G, and since only off-diagonal entries are given, we have (i, i) ∉ G.
We will estimate C by modifying some of the entries of C^g , which incurs a cost

    q(C) = ∑_{(i,j)∈G} |Cij − C^g_ij |.
In addition we wish to maximize log det C (which is the entropy of a Gaussian distribution with
covariance matrix C, up to a constant), which imposes the implicit constraint that C ≻ 0. We
minimize the weighted combination − log det C + λq(C), where λ is a positive hyperparameter.
(a) Explain how to estimate C using convex optimization. Justify any change of variables.
(b) Carry out the method of part (a) on the data
             1    −0.5    ?      ?     0.7
            −0.5    1     ?    −0.6    0.8
    C^g =    ?      ?     1     0.3     ?   ,
             ?    −0.6   0.3     1      ?
            0.7    0.8    ?      ?      1

where ? means (i, j) ∉ G. Use λ = 10. Report each entry of C ⋆ . Round reported values to
two decimal places. How many of the given off-diagonal entries of C g were modified?
7.49 Control variable selection via SDP. Let x ∼ N (0, Σ) be a multivariate Gaussian random variable,
with a known Σ ∈ Sn++ . We want to create control variables x̃ ∼ N (0, Σ̃), and consider the joint
variable χ = (x, x̃) ∈ R2n . Here, χ ∼ N (0, Λ) is jointly Gaussian, with Λ ∈ S2n++ . We wish to
find Λ such that the following properties hold:
Remark. This procedure is also known as the model-x Gaussian knockoffs procedure. Using prop-
erties of conditional Gaussians, we can sample knockoffs x̃ for observed data x. These knockoffs
can be used for evaluating feature importances (even when valid p-values cannot be obtained) and
controlling the false discovery rate. You do not need to know this for the problem.
(a) Explain how you can find Λ by solving a semidefinite program (SDP).
(b) Carry out the procedure outlined in (a) for the problem instance with n = 3 and
          4.9  −3.8   1.4
    Σ =  −3.8   3.8  −1.9 .
          1.4  −1.9   2.5
8 Geometry
8.1 Efficiency of maximum volume inscribed ellipsoid. In this problem we prove the following geo-
metrical result. Suppose C is a polyhedron in Rn , symmetric about the origin, and described
as
C = {x | − 1 ≤ aTi x ≤ 1, i = 1, . . . , p}.
Let
E = {x | xT Q−1 x ≤ 1},
with Q ∈ Sn++ , be the maximum volume ellipsoid with center at the origin, inscribed in C. Then
the ellipsoid

    √n E = {x | xT Q−1 x ≤ n}

(i.e., the ellipsoid E, scaled by a factor √n about the origin) contains C.
8.2 Euclidean distance matrices. A matrix X ∈ Sn is a Euclidean distance matrix if its elements xij
can be expressed as
xij = ∥pi − pj ∥22 , i, j = 1, . . . , n,
for some vectors p1 , . . . , pn (of arbitrary dimension). In this exercise we prove several classical
characterizations of Euclidean distance matrices, derived by I. Schoenberg in the 1930s.
(b) Show that the set of Euclidean distance matrices is a convex cone.
(c) Show that X is a Euclidean distance matrix if and only if
    diag(X) = 0,    X22 − X21 1T − 1 X21T ⪯ 0.        (37)

The subscripts refer to the partitioning

    X = [ x11  X21T
          X21  X22  ].
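The "only if" direction of this characterization is easy to check numerically. The sketch below builds a Euclidean distance matrix from randomly generated points and verifies condition (37); the variable names are our own:

```python
import numpy as np

rng = np.random.default_rng(0)
P = rng.standard_normal((5, 3))  # 5 points p_i in R^3
# X_ij = ||p_i - p_j||_2^2
X = ((P[:, None, :] - P[None, :, :]) ** 2).sum(axis=2)

assert np.allclose(np.diag(X), 0)
X21 = X[1:, 0]    # first column, below the diagonal entry x_11
X22 = X[1:, 1:]
one = np.ones(len(X21))
M = X22 - np.outer(X21, one) - np.outer(one, X21)
# condition (37): M must be negative semidefinite
assert np.linalg.eigvalsh(M).max() <= 1e-9
```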
8.3 Minimum total covering ball volume. We consider a collection of n points with locations x1 , . . . , xn ∈
Rk . We are also given a set of m groups or subsets of these points, G1 , . . . , Gm ⊆ {1, . . . , n}. For
each group, let Vi be the volume of the smallest Euclidean ball that contains the points in group
Gi . (The volume of a Euclidean ball of radius r in Rk is ak rk , where ak is a known constant that
is positive but otherwise irrelevant here.) We let V = V1 + · · · + Vm be the total volume of these
minimal covering balls.
The points xk+1 , . . . , xn are fixed (i.e., they are problem data). The variables to be chosen are
x1 , . . . , xk . Formulate the problem of choosing x1 , . . . , xk , in order to minimize the total minimal
covering ball volume V , as a convex optimization problem. Be sure to explain any new variables
you introduce, and to justify the convexity of your objective and inequality constraint functions.
8.4 Maximum-margin multiclass classification. In an m-category pattern classification problem, we are
given m sets Ci ⊆ Rn . Set Ci contains Ni examples of feature vectors in class i. The learning
problem is to find a decision function f : Rn → {1, 2, . . . , m} that maps each training example to
its class, and also generalizes reliably to feature vectors that are not included in the training sets
Ci .
(a) A common type of decision function for two-way classification is
    f (x) = 1 if aT x + b > 0,    f (x) = 2 if aT x + b < 0.
In the simplest form, finding f is equivalent to solving a feasibility problem: find a and b such
that
aT x + b > 0 if x ∈ C1
aT x + b < 0 if x ∈ C2 .
Since these strict inequalities are homogeneous in a and b, they are feasible if and only if the
nonstrict inequalities
aT x + b ≥ 1 if x ∈ C1
aT x + b ≤ −1 if x ∈ C2
are feasible. This is a feasibility problem with N1 + N2 linear inequalities in n + 1 variables a,
b.
As an extension that improves the robustness (i.e., generalization capability) of the classifier,
we can impose the condition that the decision function f classifies all points in a neighborhood
of C1 and C2 correctly, and we can maximize the size of the neighborhood. This problem can
be expressed as
maximize t
subject to aT x + b > 0 if dist(x, C1 ) ≤ t,
aT x + b < 0 if dist(x, C2 ) ≤ t,
where dist(x, C) = miny∈C ∥x − y∥2 .
This is illustrated in the figure. The centers of the shaded disks form the set C1 . The centers
of the other disks form the set C2 . The set of points at a distance less than t from Ci is the
union of disks with radius t and center in Ci . The hyperplane in the figure separates the two
expanded sets. We are interested in expanding the circles as much as possible, until the two
expanded sets are no longer separable by a hyperplane.
Since the constraints are homogeneous in a, b, we can again replace them with nonstrict
inequalities
maximize t
subject to aT x + b ≥ 1 if dist(x, C1 ) ≤ t, (39)
aT x + b ≤ −1 if dist(x, C2 ) ≤ t.
The variables are a, b, and t.
(b) Next we consider an extension to more than two classes. If m > 2 we can use a decision
function
f (x) = argmax (aTi x + bi ),
i=1,...,m
n
parameterized by m vectors ai ∈ R and m scalars bi . To find f , we can solve a feasibility
problem: find ai , bi , such that
or, equivalently,
maximize t
subject to aTi x + bi ≥ 1 + maxj≠i (aTj x + bj ) if dist(x, Ci ) ≤ t, (40)
i = 1, . . . , m.
Formulate the optimization problems (39) and (40) as SOCPs (if possible), or as quasiconvex
optimization problems involving SOCP feasibility problems (otherwise).
fi (z) = aTi z − bi , i = 1, 2, 3,
In words: f1 is the largest of the three functions on the x data points, f2 is the largest of the three
functions on the y data points, f3 is the largest of the three functions on the z data points. We
can give a simple geometric interpretation: The functions f1 , f2 , and f3 partition Rn into three
regions,
defined by where each function is the largest of the three. Our goal is to find functions with
x(j) ∈ R1 , y (j) ∈ R2 , and z (j) ∈ R3 .
Pose this as a convex optimization problem. You may not use strict inequalities in your formulation.
Solve the specific instance of the 3-way separation problem given in sep3way_data.m, with the
columns of the matrices X, Y and Z giving the x(j) , j = 1, . . . , N , y (j) , j = 1, . . . , M and z (j) , j =
1, . . . , P . To save you the trouble of plotting data points and separation boundaries, we have
included the plotting code in sep3way_data.m. (Note that a1, a2, a3, b1 and b2 contain arbitrary
numbers; you should compute the correct values using CVX.)
8.6 Feature selection and sparse linear separation. Suppose x(1) , . . . , x(N ) and y (1) , . . . , y (M ) are two
given nonempty collections or classes of vectors in Rn that can be (strictly) separated by a hyper-
plane, i.e., there exist a ∈ Rn and b ∈ R such that

    aT x(i) − b ≥ 1, i = 1, . . . , N,        aT y (i) − b ≤ −1, i = 1, . . . , M

(strict separation can always be normalized to this form by scaling a and b). This means the two
classes are (weakly) separated by the slab
S = {z | |aT z − b| ≤ 1},
which has thickness 2/∥a∥2 . You can think of the components of x(i) and y (i) as features; a and b
define an affine function that combines the features and allows us to distinguish the two classes.
To find the thickest slab that separates the two classes, we can solve the QP
minimize ∥a∥2
subject to aT x(i) − b ≥ 1, i = 1, . . . , N
aT y (i) − b ≤ −1, i = 1, . . . , M,
with variables a ∈ Rn and b ∈ R. (This is equivalent to the problem given in (8.23), p424, §8.6.1;
see also exercise 8.23.)
In this problem we seek (a, b) that separate the two classes with a thick slab, and for which a is sparse,
i.e., there are many j with aj = 0. Note that if aj = 0, the affine function aT z − b does not depend
on zj , i.e., the jth feature is not used to carry out classification. So a sparse a corresponds to a
classification function that is parsimonious; it depends on just a few features. So our goal is to find
an affine classification function that gives a thick separating slab, and also uses as few features as
possible to carry out the classification.
This is in general a hard combinatorial (bi-criterion) optimization problem, so we use the standard
heuristic of solving
minimize ∥a∥2 + λ∥a∥1
subject to aT x(i) − b ≥ 1, i = 1, . . . , N
aT y (i) − b ≤ −1, i = 1, . . . , M,
where λ ≥ 0 is a weight parameter that controls the trade-off between separating slab thickness and
(indirectly, through the ℓ1 norm) sparsity of a.
Get the data in sp_ln_sp_data.m, which gives x(i) and y (i) as the columns of matrices X and Y,
respectively. Find the thickness of the maximum thickness separating slab. Solve the problem above
for 100 or so values of λ over an appropriate range (we recommend log spacing). For each value,
record the separation slab thickness 2/∥a∥2 and card(a), the cardinality of a (i.e., the number
of nonzero entries). In computing the cardinality, you can count an entry aj of a as zero if it
satisfies |aj | ≤ 10−4 . Plot these data with slab thickness on the vertical axis and cardinality on the
horizontal axis.
Use this data to choose a set of 10 features out of the 50 in the data. Give the indices of the features
you choose. You may have several choices of sets of features here; you can just choose one. Then
find the maximum thickness separating slab that uses only the chosen features. (This is standard
practice: once you’ve chosen the features you’re going to use, you optimize again, using only those
features, and without the ℓ1 regularization.)
8.7 Thickest slab separating two sets. We are given two sets in Rn : a polyhedron

    C1 = {x | Cx ⪯ d},

defined by a matrix C ∈ Rm×n and a vector d ∈ Rm , and an ellipsoid

    C2 = {P u + q | ∥u∥2 ≤ 1},

defined by a matrix P ∈ Rn×n and a vector q ∈ Rn . We assume that the sets are nonempty and
that they do not intersect. We are interested in the optimization problem
with variable a ∈ Rn .
Explain how you would solve this problem. You can answer the question by reducing the problem
to a standard problem class (LP, QP, SOCP, SDP, . . . ), or by describing an algorithm to solve it.
Remark. The geometrical interpretation is as follows. If we choose
    b = (1/2) ( inf_{x∈C1} aT x + sup_{x∈C2} aT x ),

then the hyperplane {x | aT x = b} lies midway between the two sets.
8.8 Bounding object position from multiple camera views. A small object is located at unknown position
x ∈ R3 , and viewed by a set of m cameras. Our goal is to find a box in R3 ,
B = {z ∈ R3 | l ⪯ z ⪯ u},
for which we can guarantee x ∈ B. We want the smallest possible such bounding box. (Although
it doesn’t matter, we can use volume to judge ‘smallest’ among boxes.)
Now we describe the cameras. The object at location x ∈ R3 creates an image on the image plane
of camera i at location
    vi = (Ai x + bi )/(cTi x + di ) ∈ R2 .
The matrices Ai ∈ R2×3 , vectors bi ∈ R2 and ci ∈ R3 , and real numbers di ∈ R are known, and
depend on the camera positions and orientations. We assume that cTi x + di > 0. The 3 × 4 matrix
    Pi = [ Ai bi ; cTi di ]

is called the camera matrix (for camera i). It is often (but not always) the case that the first 3
columns of Pi (i.e., Ai stacked above cTi ) form an orthogonal matrix, in which case the camera is
called orthographic.
We do not have direct access to the image point vi ; we only know the (square) pixel that it lies in.
In other words, the camera gives us a measurement v̂i (the center of the pixel that the image point
lies in); we are guaranteed that
∥vi − v̂i ∥∞ ≤ ρi /2,
where ρi is the pixel width (and height) of camera i. (We know nothing else about vi ; it could be
any point in this pixel.)
Given the data Ai , bi , ci , di , v̂i , ρi , we are to find the smallest box B (i.e., find the vectors l and
u) that is guaranteed to contain x. In other words, find the smallest box in R3 that contains all
points consistent with the observations from the camera.
(a) Explain how to solve this using convex or quasiconvex optimization. You must explain any
transformations you use, any new variables you introduce, etc. If the convexity or quasicon-
vexity of any function in your formulation isn’t obvious, be sure to justify it.
(b) Solve the specific problem instance given in the file camera_data.m. Be sure that your final
numerical answer (i.e., l and u) stands out.
8.9 Triangulation from multiple camera views. A projective camera can be described by a linear-
fractional function f : R3 → R2 ,
    f (x) = (Ax + b)/(cT x + d),    dom f = {x | cT x + d > 0},

with

    rank([A; cT ]) = 3,

where [A; cT ] denotes the matrix A stacked above the row vector cT .
The domain of f consists of the points in front of the camera.
Before stating the problem, we give some background and interpretation, most of which will not
be needed for the actual problem.
The 3 × 4 matrix

    P = [ A b ; cT d ]
is called the camera matrix and has rank 3. Since f is invariant with respect to a scaling of P , we
can normalize the parameters and assume, for example, that ∥c∥2 = 1. The numerator cT x + d is
then the distance of x to the plane {z | cT z + d = 0}. This plane is called the principal plane. The
point

    xc = −[A; cT ]−1 (b, d)

(the unique solution of Axc + b = 0 and cT xc + d = 0)
lies in the principal plane and is called the camera center. The ray {xc + θc | θ ≥ 0}, which is
perpendicular to the principal plane, is the principal axis. We will define the image plane as the
plane parallel to the principal plane, at a unit distance from it along the principal axis.
The point x′ in the figure is the intersection of the image plane and the line through the camera
center and x, and is given by
    x′ = xc + (x − xc )/(cT (x − xc )).
Using the definition of xc we can write f (x) as
    f (x) = (1/(cT (x − xc ))) A(x − xc ) = A(x′ − xc ) = Ax′ + b.
This shows that the mapping f (x) can be interpreted as a projection of x on the image plane to
get x′ , followed by an affine transformation of x′ . We can interpret f (x) as the point x′ expressed
in some two-dimensional coordinate system attached to the image plane.
In this exercise we consider the problem of determining the position of a point x ∈ R3 from its
image in N cameras. Each of the cameras is characterized by a known linear-fractional mapping
fk and camera matrix Pk :
    fk (x) = (Ak x + bk )/(cTk x + dk ),    Pk = [ Ak bk ; cTk dk ],    k = 1, . . . , N.
The image of the point x in camera k is denoted y (k) ∈ R2 . Due to camera imperfections and
calibration errors, we do not expect the equations fk (x) = y (k) , k = 1, . . . , N , to be exactly
solvable. To estimate the point x we therefore minimize the maximum error in the N equations by
solving
    minimize g(x) = maxk=1,...,N ∥fk (x) − y (k) ∥2 .        (41)
(a) Show that (41) is a quasiconvex optimization problem. The variable in the problem is x ∈ R3 .
The functions fk (i.e., the parameters Ak , bk , ck , dk ) and the vectors y (k) are given.
(b) Solve the following instance of (41) using CVX (and bisection): N = 4,
         1 0 0 0          1  0 0  0
    P1 = 0 1 0 0 ,   P2 = 0  0 1  0 ,
         0 0 1 0          0 −1 0 10

          1  1 1 −10          0  1 1  0
    P3 = −1  1 1   0 ,   P4 = 0 −1 1  0 ,
         −1 −1 1  10         −1  0 0 10

    y (1) = (0.98, 0.93),   y (2) = (1.01, 1.01),   y (3) = (0.95, 1.05),   y (4) = (2.04, 0.00).
You can terminate the bisection when a point is found with accuracy g(x) − p⋆ ≤ 10−4 , where
p⋆ is the optimal value of (41).
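The bisection loop itself is independent of the particular feasibility subproblem. A generic sketch (the feasibility oracle here is a stand-in; in part (b) it would be an SOCP feasibility problem solved with CVX*):

```python
def bisect(feasible, lo, hi, tol=1e-4):
    """Quasiconvex minimization by bisection: feasible(t) returns True
    iff the sublevel set {x | g(x) <= t} is nonempty."""
    while hi - lo > tol:
        t = 0.5 * (lo + hi)
        if feasible(t):
            hi = t   # optimal value is at most t
        else:
            lo = t   # optimal value exceeds t
    return hi

# toy example: g(x) = |x - 3| over R, so p* = 0 and feasible(t) iff t >= 0
print(bisect(lambda t: t >= 0, -1.0, 1.0))  # ~0 (within tol)
```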
8.10 Projection onto the probability simplex. In this problem you will work out a simple method for
finding the Euclidean projection y of x ∈ Rn onto the probability simplex P = {z | z ⪰ 0, 1T z = 1}.
Hints. Consider the problem of minimizing (1/2)∥y − x∥22 subject to y ⪰ 0, 1T y = 1. Form the
partial Lagrangian
L(y, ν) = (1/2)∥y − x∥22 + ν(1T y − 1),
leaving the constraint y ⪰ 0 implicit. Show that y = (x − ν1)+ minimizes L(y, ν) over y ⪰ 0.
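Following the hint through gives a well-known sort-based algorithm: find ν so that 1T (x − ν1)+ = 1, which takes O(n log n) after sorting. A numpy sketch (assuming the standard outcome of the hint, not a full derivation):

```python
import numpy as np

def project_simplex(x):
    """Euclidean projection of x onto {z | z >= 0, 1^T z = 1}."""
    n = len(x)
    u = np.sort(x)[::-1]                       # sorted in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, n + 1)
    rho = np.max(np.nonzero(u > css / k)[0])   # last index with a positive entry
    nu = css[rho] / (rho + 1)
    return np.maximum(x - nu, 0.0)
```

For example, project_simplex(np.array([2.0, 0.0])) returns [1.0, 0.0].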
8.11 Conformal mapping via convex optimization. Suppose that Ω is a closed bounded region in C with
no holes (i.e., it is simply connected). The Riemann mapping theorem states that there exists a
conformal mapping φ from Ω onto D = {z ∈ C | |z| ≤ 1}, the unit disk in the complex plane.
(This means that φ is an analytic function, and maps Ω one-to-one onto D.)
One proof of the Riemann mapping theorem is based on an infinite dimensional optimization
problem. We choose a point a ∈ int Ω (the interior of Ω). Among all analytic functions that map
∂Ω (the boundary of Ω) into D, we choose one that maximizes the magnitude of the derivative at
a. Amazingly, it can be shown that this function is a conformal mapping of Ω onto D.
We can use this theorem to construct an approximate conformal mapping, by sampling the bound-
ary of Ω, and by restricting the optimization to a finite-dimensional subspace of analytic functions.
Let b1 , . . . , bN be a set of points in ∂Ω (meant to be a sampling of the boundary). We will search
only over polynomials of degree up to n,

    φ̂(z) = α1 z^n + α2 z^{n−1} + · · · + αn z + αn+1 ,

where α1 , . . . , αn+1 ∈ C. With these approximations, we obtain the problem
(a) Explain how to solve the problem above via convex or quasiconvex optimization.
(b) Carry out your method on the problem instance given in conf_map_data.m. This file defines
the boundary points bi and plots them. It also contains code that will plot φ̂(bi ), the boundary
of the mapped region, once you provide the values of αj ; these points should be very close to
the boundary of the unit disk. (Please turn in this plot, and give us the values of αj that you
find.) The function polyval may be helpful.
Remarks.
• We’ve been a little informal in our mathematics here, but it won’t matter.
• You do not need to know any complex analysis to solve this problem; we’ve told you everything
you need to know.
• A basic result from complex analysis tells us that φ̂ is one-to-one if and only if the image of
the boundary does not ‘loop over’ itself. (We mention this just for fun; we’re not asking you
to verify that the φ̂ you find is one-to-one.)
8.12 Fitting a vector field to given directions. This problem concerns a vector field on Rn , i.e., a function
F : Rn → Rn . We are given the direction of the vector field at points x(1) , . . . , x(N ) ∈ Rn ,
    q (i) = F (x(i) )/∥F (x(i) )∥2 ,    i = 1, . . . , N.
(These directions might be obtained, for example, from samples of trajectories of the differential
equation ż = F (z).) The goal is to fit these samples with a vector field of the form
F̂ = α1 F1 + · · · + αm Fm ,
where ∠(z, w) = cos−1 ((z T w)/(∥z∥2 ∥w∥2 )) denotes the angle between nonzero vectors z and w. We
are only interested in the case when J is smaller than π/2.
(a) Explain how to choose α so as to minimize J using convex optimization. Your method can
involve solving multiple convex problems. Be sure to explain how you handle the constraints
F̂ (x(i) ) ̸= 0.
(b) Use your method to solve the problem instance with data given in vfield_fit_data.m, with
an affine vector field fit, i.e., F̂ (z) = Az + b. (The matrix A and vector b are the parameters
α above.) Give your answer to the nearest degree, as in ‘20◦ < J ⋆ ≤ 21◦ ’.
This file also contains code that plots the vector field directions, and also (but commented
out) the directions of the vector field fit, F̂ (x(i) )/∥F̂ (x(i) )∥2 . Create this plot, with your fitted
vector field.
8.13 Robust minimum volume covering ellipsoid. Suppose z is a point in Rn and E is an ellipsoid in Rn
with center c. The Mahalanobis distance of the point to the ellipsoid center is defined as
which is the factor by which we need to scale the ellipsoid about its center so that z is on its
boundary. We have z ∈ E if and only if M (z, E) ≤ 1. We can use (M (z, E) − 1)+ as a measure of
the Mahalanobis distance of the point z to the ellipsoid E.
Now we can describe the problem. We are given m points x1 , . . . , xm ∈ Rn . The goal is to find the
optimal trade-off between the volume of the ellipsoid E and the total Mahalanobis distance of the
points to the ellipsoid, i.e.,
    ∑_{i=1}^m (M (xi , E) − 1)+ .
Note that this can be considered a robust version of finding the smallest volume ellipsoid that
covers a set of points, since here we allow one or more points to be outside the ellipsoid.
(a) Explain how to solve this problem. You must say clearly what your variables are, what problem
you solve, and why the problem is convex.
(b) Carry out your method on the data given in rob_min_vol_ellips_data.m. Plot the optimal
trade-off curve of ellipsoid volume versus total Mahalanobis distance. For some selected points
on the trade-off curve, plot the ellipsoid and the points (which are in R2 ). We are only
interested in the region of the curve where the ellipsoid volume is within a factor of ten (say)
of the minimum volume ellipsoid that covers all the points.
Important. Depending on how you formulate the problem, you might encounter problems that
are unbounded below, or where CVX encounters numerical difficulty. Just avoid these by
appropriate choice of parameter.
Very important. If you use Matlab version 7.0 (which is filled with bugs) you might find that
functions involving determinants don’t work in CVX. If you use this version of Matlab, then
you must download the file blkdiag.m on the course website and put it in your Matlab path
before the default version (which has a bug).
8.14 Isoperimetric problem. We consider the problem of choosing a curve in a two-dimensional plane
that encloses as much area as possible between itself and the x-axis, subject to constraints. For
simplicity we will consider only curves of the form
C = {(x, y) | y = f (x)},
where f : [0, a] → R. This assumes that for each x-value, there can only be a single y-value, which
need not be the case for general curves. We require that at the end points (which are given), the
curve returns to the x-axis, so f (0) = 0, and f (a) = 0. In addition, the length of the curve cannot
exceed a budget L, so we must have
    ∫_0^a √(1 + f ′ (x)2 ) dx ≤ L.
To pose this as a finite dimensional optimization problem, we discretize over the x-values. Specif-
ically, we take xi = h(i − 1), i = 1, . . . , N + 1, where h = a/N is the discretization step size, and
we let yi = f (xi ). Thus our objective becomes
    h ∑_{i=1}^N yi ,
In addition to these constraints, we will also require that our curve passes through a set of pre-
specified points. Let F ⊆ {1, . . . , N + 1} be an index set. For j ∈ F, we require yj = yjfixed , where
y fixed ∈ RN +1 (the entries of y fixed whose indices are not in F can be ignored). Finally, we add a
constraint on maximum curvature,
−C ≤ (yi+2 − 2yi+1 + yi )/h2 ≤ C, i = 1, . . . , N − 1.
Explain how to find the curve, i.e., y1 , . . . , yN +1 , that maximizes the area enclosed subject to these
constraints, using convex optimization. Carry out your method on the problem instance with data
given in iso_perim_data.m. Report the optimal area enclosed, and use the commented out code
in the data file to plot your curve.
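For checking a candidate discretized curve against these constraints, the discretized area, arc length, and curvature can be computed directly. A sketch (the names and the piecewise-linear length approximation are our own choices):

```python
import numpy as np

def curve_metrics(y, h):
    """Discretized area, arc length, and curvature for y_1, ..., y_{N+1} on a uniform grid."""
    area = h * y[:-1].sum()                        # h * sum_{i=1}^{N} y_i
    length = np.sqrt(h**2 + np.diff(y)**2).sum()   # piecewise-linear arc length
    curvature = np.abs(np.diff(y, 2)) / h**2       # |(y_{i+2} - 2 y_{i+1} + y_i)/h^2|
    return area, length, curvature
```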
Remark (for your amusement only). The isoperimetric problem is an ancient problem in mathe-
matics with a history dating all the way back to the tragedy of queen Dido and the founding of
Carthage. The story (mainly the account of the poet Virgil in his epic poem the Aeneid) goes
that Dido was a princess forced to flee her home after her brother murdered her husband. She
travels across the Mediterranean and arrives on the shores of what is today modern Tunisia. The
natives weren’t very happy about the newcomers, but Dido was able to negotiate with the local
King: in return for her fortune, the King promised to cede her as much land as she could mark out
with the skin of a bull.
The king thought he was getting a good deal, but Dido outmatched him in mathematical skill. She
broke down the skin into thin pieces of leather and sewed them into a long piece of string. Then,
taking the seashore as an edge, she laid the string in a semicircle, carving out a piece of land
larger than anyone imagined; and on this land, the ancient city of Carthage was born. When the
king saw what she had done, he was so impressed by Dido’s talent that he asked her to marry him.
Dido refused, so the king built a university in the hope that he could find another woman with
similar talent.
8.15 Dual of maximum volume ellipsoid problem. Consider the problem of computing the maximum
volume ellipsoid inscribed in a nonempty bounded polyhedron
C = {x | aTi x ≤ bi , i = 1, . . . , m}.
Parametrizing the ellipsoid as E = {Bu + d | ∥u∥2 ≤ 1}, with B ∈ Sn++ and d ∈ Rn , the optimal
ellipsoid can be found by solving the convex optimization problem
minimize − log det B
subject to ∥Bai ∥2 + aTi d ≤ bi , i = 1, . . . , m
with variables B ∈ Sn , d ∈ Rn . Derive the Lagrange dual of the equivalent problem
minimize − log det B
subject to ∥yi ∥2 + aTi d ≤ bi , i = 1, . . . , m
Bai = yi , i = 1, . . . , m
with variables B ∈ Sn , d ∈ Rn , yi ∈ Rn , i = 1, . . . , m.
8.16 Fitting a sphere to data. Consider the problem of fitting a sphere {x ∈ Rn | ∥x − c∥2 = r} to m
points u1 , . . . , um ∈ Rn , by minimizing the loss function
    ∑_{i=1}^m ( ∥ui − c∥22 − r2 )2

over the variables c ∈ Rn , r ∈ R.
(a) Explain how to solve this problem using convex optimization. Hint. Consider the change of
variables from (c, r) to (c, t), with t = r2 − ∥c∥22 . You’ll need to argue that you can recover r⋆
from t⋆ once you solve the problem with these transformed variables.
(b) Use your method to solve the problem instance with data given in the file sphere_fit_data.*,
with n = 2. This file creates ui as a 2 × m matrix U. Plot the fitted circle and the data points.
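With the change of variables from the hint, each residual ∥ui ∥22 − 2uiT c − t is affine in (c, t), so the problem reduces to ordinary least squares. A numpy sketch (the function name is ours, not the data-file interface):

```python
import numpy as np

def fit_sphere(U):
    """Fit {x : ||x - c||_2 = r} to the rows of U by least squares in (c, t)."""
    m, n = U.shape
    # residual ||u_i||^2 - 2 u_i^T c - t, with t = r^2 - ||c||^2
    A = np.hstack([2 * U, np.ones((m, 1))])
    rhs = (U ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, rhs, rcond=None)
    c, t = sol[:n], sol[n]
    r = np.sqrt(t + c @ c)  # recover r from t = r^2 - ||c||^2
    return c, r
```

On exact circle data this recovers the true center and radius.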
8.17 The polar of a set C ⊆ Rn is defined as
C ◦ = {x | uT x ≤ 1 ∀u ∈ C}.
C1 = {u ∈ Rn | A1 u ⪯ b1 }, C2 = {v ∈ Rn | A2 v ⪯ b2 }
with A1 ∈ Rm1 ×n , A2 ∈ Rm2 ×n , b1 ∈ Rm1 , b2 ∈ Rm2 . Formulate the problem of finding the
Euclidean distance between C1◦ and C2◦ ,
minimize ∥x1 − x2 ∥22
subject to x1 ∈ C1◦
x2 ∈ C2◦ ,
as a QP. Your formulation should be efficient, i.e., the dimensions of the QP (number of
variables and constraints) should be linear in m1 , m2 , n. (In particular, formulations that
require enumerating the extreme points of C1 and C2 are to be avoided.)
8.18 Polyhedral cone questions. You are given matrices A ∈ Rn×k and B ∈ Rn×p .
Explain how to solve the following two problems using convex optimization. Your solution can
involve solving multiple convex problems, as long as the number of such problems is no more than
linear in the dimensions n, k, p.
(a) How would you determine whether ARk+ ⊆ BRp+ ? This means that every nonnegative linear
combination of the columns of A can be expressed as a nonnegative linear combination of the
columns of B.
(b) How would you determine whether ARk+ = Rn ? This means that every vector in Rn can be
expressed as a nonnegative linear combination of the columns of A.
with Ai ∈ Rn×n and bi ∈ Rn . Consider the problem of projecting a point a ∈ Rn on the convex
hull of the union of the ellipsoids:
minimize ∥x − a∥2
subject to x ∈ conv(E1 ∪ · · · ∪ Em ).
8.20 Bregman divergences. Let f : Rn → R be strictly convex and differentiable. Then the Bregman
divergence associated with f is the function Df : Rn × Rn → R given by

    Df (x, y) = f (x) − f (y) − ∇f (y)T (x − y).
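As a quick sanity check of the definition Df (x, y) = f (x) − f (y) − ∇f (y)T (x − y): taking f (x) = ∥x∥22 gives Df (x, y) = ∥x − y∥22 , the squared Euclidean distance. A numerical check (the helper below is ours):

```python
import numpy as np

def bregman(f, grad_f, x, y):
    # D_f(x, y) = f(x) - f(y) - grad f(y)^T (x - y)
    return f(x) - f(y) - grad_f(y) @ (x - y)

f = lambda x: x @ x   # f(x) = ||x||_2^2
g = lambda x: 2 * x   # its gradient

x = np.array([1.0, 2.0])
y = np.array([0.0, -1.0])
assert np.isclose(bregman(f, g, x, y), np.sum((x - y) ** 2))
```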
8.21 Ellipsoidal peeling. In this problem, you will implement an outlier identification technique using
Löwner-John ellipsoids. Given a set of points D = {x1 , . . . , xN } in Rn , the goal is to identify a
set O ⊆ D that are anomalous in some sense. Roughly speaking, we think of an outlier as a point
that is far away from most of the points, so we would like the points in D \ O to be relatively close
together, and to be relatively far apart from the points in O.
We describe a heuristic technique for identifying O. We start with O = ∅ and find the minimum
volume (Löwner-John) ellipsoid E containing all xi ∉ O (which is all xi in the first step). Each
iteration, we flag (i.e., add to O) the point that corresponds to the largest dual variable for the
constraint xi ∈ E; this point will be one of the points on the boundary of E, and intuitively, it will
be the one for which the constraint is ‘most’ binding. We then plot vol E (on a log scale) versus
card O and hope that we see a sharp drop in the curve. We use the value of O after the drop.
The hope is that after removing a relatively small number of points, the volume of the minimum
volume ellipsoid containing the remaining points will be much smaller than the minimum volume
ellipsoid for D, which means the removed points are far away from the others.
For example, suppose we have 100 points that lie in the unit ball and 3 points with (Euclidean)
norm 1000. Intuitively, it is clear that it is reasonable to consider the three large points outliers.
The minimum volume ellipsoid of all 103 points will have very large volume. The three points will
be the first ones removed, and as soon as they are, the volume of the ellipsoid will drop
dramatically and be on the order of the volume of the unit ball.
Run 6 iterations of the algorithm on the data given in ellip_anomaly_data.m. Plot vol E (on a
log scale) versus card O. In addition, on a single plot, plot all the ellipses found with the function
ellipse_draw(A,b) along with the outliers (in red) and the remaining points (in blue).
Of course, we have chosen an example in R2 so the ellipses can be plotted, but one can detect
outliers in R2 simply by inspection. In dimension much higher than 3, however, detecting outliers
by plotting will become substantially more difficult, while the same algorithm can be used.
Note. In CVX, you should use det_rootn (which is SDP-representable and handled exactly) instead
of log_det (which is handled using an inefficient iterative procedure).
8.22 Urban planning. An urban planner would like to choose the location x ∈ R2 for a new warehouse.
This should be close to n distribution centers located at y1 , . . . , yn ∈ R2 . The objective is to
minimize the worst-case distance, i.e., solve

    minimize maxi=1,...,n ∥x − yi ∥2 ,

with variable x ∈ R2 .
8.23 Optimizing a set of disks. A disk D ⊂ R2 is parametrized by its center c ∈ R2 and its radius r ≥ 0,
with the form D = {x | ∥x − c∥2 ≤ r}. (We allow r = 0, in which case the disk reduces to a single
point {c}.) The goal is to choose a set of n disks D1 , . . . , Dn (i.e., specify their centers and radii),
to minimize an objective subject to some constraints.
One constraint is that the first k disks are fixed, i.e.,
    ci = ci^fix ,    ri = ri^fix ,    i = 1, . . . , k,
(a) Explain how to solve these two problems using convex optimization.
(b) Solve both problems for the problem data given in disks_data.*. Give the optimal total
area, and the optimal total perimeter. Plot the two optimal disk arrangements, using the
code included in the data file. Give a very brief comment on the results, especially the
distribution of disk radii each problem obtains.
8.24 Minimum-volume ellipsoid around intersection of two ellipsoids. We work out a simple method for
computing the minimum-volume ellipsoid
E0 = {x | xT Ax ≤ 1}
that covers the intersection of two ellipsoids E1 and E2 centered at the origin. The problem can be
written as an optimization problem

minimize   log det A^{−1}
subject to E1 ∩ E2 ⊆ E0 .        (42)
The variable is the matrix A ∈ Sn . As usual, we define the domain of log det A−1 as Sn++ . It can
be shown that it is sufficient to consider the problem for the special case

E1 = {x | x^T x ≤ 1},   E2 = {x | x^T diag(d) x ≤ 1},        (43)

where d ≻ 0.
(a) We first show that the optimal A is diagonal. Suppose A is feasible for (42), with E1 and E2
defined in (43).
(i) Verify that diag(s)A diag(s), where s is an n-vector with elements si = ±1, is also
feasible.
(ii) Show that the matrix

B = (1/2^n) Σ_{s_i ∈ {−1,+1}, i=1,...,n} diag(s) A diag(s)

is also feasible for (42). The sum is over all n-vectors s with elements s_i = ±1.
(iii) Show that log det B −1 ≤ log det A−1 .
Since B is diagonal (each off-diagonal element Bij is the sum of terms ±Aij , with an equal
number of positive and negative signs in the sum) and the cost function in (42) is strictly
convex, this implies that the optimal solution is diagonal.
(b) Hence, we can take A = diag(a) and write the problem as

minimize   − Σ_{i=1}^n log a_i
subject to x^T diag(a) x ≤ 1 for all x ∈ E1 ∩ E2 .        (44)
Show that the constraint in (44) holds if and only if there exist scalars λ and µ that satisfy
λ + µ ≤ 1, λ1 + µd ⪰ a, λ ≥ 0, µ ≥ 0. (45)
(c) Replace the constraint in (44) by the equivalent set of constraints (45) and simplify the prob-
lem. Show that the optimal a is given by a = 1 + µ(d − 1) where µ is the solution of a simple
convex optimization problem with one variable,

minimize   − Σ_{i=1}^n log(1 + µ(d_i − 1))
subject to 0 ≤ µ ≤ 1.
8.25 Probabilistic centers. Let C ⊂ Rn be a bounded convex set. For a vector w ∈ Rn , the distance of
x to the boundary of C in the direction w is

d_C(x, w) = sup { t ≥ 0 | x + tw ∈ C } .

Now let w be a random vector, and let

f(x) = E d_C(x, w)

be the expected distance. (We could easily define this for discrete random variables w as well.)
The p-centers of C are all points x⋆ ∈ C minimizing f (x), i.e., satisfying f (x⋆ ) = inf x∈C f (x).
(a) Let w1 , . . . , wm ∈ Rn and consider the empirical average
f_m(x) = (1/m) Σ_{i=1}^m d_C(x, w_i).        (46)
C = {x ∈ Rn | Ax ⪯ b}
8.26 Some problems involving a polyhedron and a point. Let P ⊂ Rn be a polyhedron described by a
set of (a modest number of) linear inequalities, and a a point in Rn . Are the following problems
easy or hard? (Easy means the solution can be found by solving one or a modest number of convex
optimization problems, and here, we take ‘hard’ to mean not ‘easy’. We are not asking for a detailed
justification when you choose ‘hard’.)
8.27 Minimum volume ellipsoid that contains points and is inside a polyhedron. We seek the minimum
volume ellipsoid in Rn , centered at 0, that contains the points x1 , . . . , xK ∈ Rn , and is itself
contained in (i.e., a subset of) a polyhedron P = {x | Ax ⪯ b}, where A ∈ Rm×n and b ∈ Rm .
This combines two of the extremal volume problems studied in the book. The data are the points
xi , and the matrix A and vector b that define the polyhedron. You can assume that b ⪰ 0, which
means 0 ∈ P.
Explain how to use convex optimization to find this ellipsoid, or to determine that no such ellipsoid
exists. Be sure to explain how you parametrize the ellipsoid, how the constraints on the ellipsoid
are expressed in your problem, and why the problem you propose is convex.
9 Unconstrained minimization
9.1 Gradient descent and nondifferentiable functions.
x_1^{(k)} = γ ( (γ−1)/(γ+1) )^k ,        x_2^{(k)} = − ( (γ−1)/(γ+1) )^k .
Therefore x(k) converges to (0, 0). However, this is not the optimum, since f is unbounded
below.
9.2 Suggestions for exercise 9.30 in Convex Optimization. We recommend the following to generate a
problem instance:
n = 100;
m = 200;
randn(’state’,1);
A=randn(m,n);
Of course, you should try out your code with different dimensions, and different data as well.
In all cases, be sure that your line search first finds a step length for which the tentative point is
in dom f ; if you attempt to evaluate f outside its domain, you’ll get complex numbers, and you’ll
never recover.
To find expressions for ∇f (x) and ∇2 f (x), use the chain rule (see Appendix A.4); if you attempt
to compute ∂ 2 f (x)/∂xi ∂xj , you will be sorry.
To compute the Newton step, you can use vnt=-H\g.
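In Python, the analogous Newton step is np.linalg.solve(H, -g). A minimal sketch of a damped Newton iteration with the domain-aware backtracking line search described above follows; since exercise 9.30's f is not reproduced here, it uses the analytic-centering objective f(x) = −Σ_i log(b_i − a_i^T x) on a randomly generated bounded polyhedron as a hypothetical stand-in. Returning +∞ outside dom f avoids the complex-number trap mentioned above:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 200, 100
G = rng.standard_normal((m, n))
z = rng.standard_normal(n)
u = 1.0 + rng.random(m)
A = np.vstack([G, -G])                  # stacking +-G keeps the polyhedron bounded
b = np.concatenate([G @ z + u, -G @ z + u])

def f(x):
    s = b - A @ x
    return np.inf if np.any(s <= 0) else -np.log(s).sum()

x = z.copy()                            # strictly feasible starting point
alpha, beta = 0.1, 0.5
for _ in range(50):
    s = b - A @ x
    g = A.T @ (1.0 / s)                               # gradient
    H = A.T @ ((1.0 / s ** 2)[:, None] * A)           # Hessian
    v = np.linalg.solve(H, -g)                        # Newton step: Python analog of -H\g
    lam2 = -g @ v                                     # squared Newton decrement
    if lam2 / 2 <= 1e-10:
        break
    t = 1.0
    while f(x + t * v) > f(x) + alpha * t * (g @ v):  # inf return value handles dom f
        t *= beta
    x = x + t * v
```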
9.3 Suggestions for exercise 9.31 in Convex Optimization. For 9.31a, you should try out N = 1,
N = 15, and N = 30. You might as well compute and store the Cholesky factorization of the
Hessian, and then back solve to get the search directions, even though you won’t really see any
speedup in Matlab for such a small problem. After you evaluate the Hessian, you can find the
Cholesky factorization as L=chol(H,’lower’). You can then compute a search step as -L’\(L\g),
where g is the gradient at the current point. Matlab will do the right thing, i.e., it will first solve
L\g using forward substitution, and then it will solve -L’\(L\g) using backward substitution. Each
substitution is order n2 .
To fairly compare the convergence of the three methods (i.e., N = 1, N = 15, N = 30), the
horizontal axis should show the approximate total number of flops required, and not the number
of iterations. You can compute the approximate number of flops using n3 /3 for each factorization,
and 2n2 for each solve (where each ‘solve’ involves a forward substitution step and a backward
substitution step).
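In Python, the factor-once pattern uses scipy.linalg.cho_factor and cho_solve. The sketch below (with a stand-in positive definite H) factors once at roughly n^3/3 flops and back-solves at roughly 2n^2 flops per right-hand side:

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve

rng = np.random.default_rng(0)
n = 300
M = rng.standard_normal((n, n))
H = M @ M.T + n * np.eye(n)        # stand-in positive definite Hessian
g = rng.standard_normal(n)

cf = cho_factor(H)                 # ~ n^3/3 flops, done once
v = cho_solve(cf, -g)              # ~ 2 n^2 flops per solve; reuse cf for N solves
```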
9.4 Efficient numerical method for a regularized least-squares problem. We consider a regularized least
squares problem with smoothing,
minimize   Σ_{i=1}^k (a_i^T x − b_i)^2 + δ Σ_{i=1}^{n−1} (x_i − x_{i+1})^2 + η Σ_{i=1}^n x_i^2 ,

with variable x ∈ R^n, where δ, η > 0 are given.
(a) Express the optimality conditions for this problem as a set of linear equations involving x.
(These are called the normal equations.)
(b) Now assume that k ≪ n. Describe an efficient method to solve the normal equations found in
part (a). Give an approximate flop count for a general method that does not exploit structure,
and also for your efficient method.
(c) A numerical instance. In this part you will try out your efficient method. We’ll choose k = 100
and n = 4000, and δ = η = 1. First, randomly generate A and b with these dimensions. Form
the normal equations as in part (a), and solve them using a generic method. Next, write
(short) code implementing your efficient method, and run it on your problem instance. Verify
that the solutions found by the two methods are nearly the same, and also that your efficient
method is much faster than the generic one.
Note: You’ll need to know some things about Matlab to be sure you get the speedup from the
efficient method. Your method should involve solving linear equations with tridiagonal coefficient
matrix. In this case, both the factorization and the back substitution can be carried out very
efficiently. The Matlab documentation says that banded matrices are recognized and exploited,
when solving equations, but we found this wasn’t always the case. To be sure Matlab knows your
matrix is tridiagonal, you can declare the matrix as sparse, using spdiags, which can be used to
create a tridiagonal matrix. You could also create the tridiagonal matrix conventionally, and then
convert the resulting matrix to a sparse one using sparse.
One other thing you need to know. Suppose you need to solve a group of linear equations with the
same coefficient matrix, i.e., you need to compute F −1 a1 , ..., F −1 am , where F is invertible and ai
are column vectors. By concatenating columns, this can be expressed as a single matrix
[ F^{−1} a_1  · · ·  F^{−1} a_m ] = F^{−1} [ a_1  · · ·  a_m ] .
To compute this matrix using Matlab, you should collect the righthand sides into one matrix (as
above) and use Matlab’s backslash operator: F\A. This will do the right thing: factor the matrix
F once, and carry out multiple back substitutions for the righthand sides.
In Python, np.linalg.solve is unable to recognize banded matrices, and will therefore take a long
time to solve the resulting system of equations. Instead, you can use scipy.linalg.solve_banded
with (l, u) = (1, 1) for a tridiagonal matrix to solve it more efficiently. In Julia, you can use
spdiagm in the SparseArrays package to generate banded matrices; Julia will solve these efficiently.
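As a sketch of one way to exploit the structure in part (b) (an assumption; the intended method may differ), note that the normal-equations matrix is a tridiagonal matrix T = δ D^T D + η I plus the rank-k term A^T A, so the matrix inversion lemma reduces the work to banded solves plus a small k × k system:

```python
import numpy as np
from scipy.linalg import solve_banded

rng = np.random.default_rng(0)
k, n, delta, eta = 5, 200, 1.0, 1.0
A = rng.standard_normal((k, n))
b = rng.standard_normal(k)

# banded storage of the tridiagonal T = delta * D^T D + eta * I
main = eta + delta * np.r_[1.0, 2.0 * np.ones(n - 2), 1.0]
off = -delta * np.ones(n - 1)
ab = np.zeros((3, n))
ab[0, 1:] = off          # superdiagonal
ab[1, :] = main          # main diagonal
ab[2, :-1] = off         # subdiagonal

# solve (T + A^T A) x = A^T b via the matrix inversion lemma;
# each banded solve is O(n), and the dense work is only k x k
rhs = A.T @ b
Y = solve_banded((1, 1), ab, np.column_stack([rhs, A.T]))
Tc, TAt = Y[:, 0], Y[:, 1:]
S = np.eye(k) + A @ TAt  # small capacitance matrix
x = Tc - TAt @ np.linalg.solve(S, A @ Tc)
```

The banded solves cost O(n) each, so the total is O(k^2 n), versus O(n^3) for a dense solve of the normal equations.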
9.5 Newton method for approximate total variation de-noising. Total variation de-noising is based on
the bi-criterion problem with the two objectives
∥x − x^cor∥_2 ,        ϕ_tv(x) = Σ_{i=1}^{n−1} |x_{i+1} − x_i| .
Here xcor ∈ Rn is the (given) corrupted signal, x ∈ Rn is the de-noised signal to be computed,
and ϕtv is the total variation function. This bi-criterion problem can be formulated as an SOCP,
or, by squaring the first objective, as a QP. In this problem we consider a method used to approx-
imately formulate the total variation de-noising problem as an unconstrained problem with twice
differentiable objective, for which Newton’s method can be used.
We first observe that the Pareto optimal points for the bi-criterion total variation de-noising problem
can be found as the minimizers of the function

∥x − x^cor∥_2^2 + µ ϕ_tv(x),

where µ ≥ 0 is a parameter. (Note that the Euclidean norm term has been squared here, and so is
twice differentiable.) In approximate total variation de-noising, we substitute a twice differentiable
approximation of the total variation function,
ϕ_atv(x) = Σ_{i=1}^{n−1} ( √(ϵ^2 + (x_{i+1} − x_i)^2) − ϵ ),
for the total variation function ϕ_tv. Here ϵ > 0 is a parameter that controls the level of approximation.
In approximate total variation de-noising, we use Newton’s method to minimize

∥x − x^cor∥_2^2 + µ ϕ_atv(x).
9.6 Derive the Newton equation for the unconstrained minimization problem
minimize   (1/2) x^T x + log Σ_{i=1}^m exp(a_i^T x + b_i).
Give an efficient method for solving the Newton system, assuming the matrix A ∈ Rm×n (with
rows aTi ) is dense with m ≪ n. Give an approximate flop count of your method.
9.7 Estimation of a vector from one-bit measurements. A system of m sensors is used to estimate an
unknown parameter x ∈ Rn . Each sensor makes a noisy measurement of some linear combination
of the unknown parameters, and quantizes the measured value to one bit: it returns +1 if the
measured value exceeds a certain threshold, and −1 otherwise. In other words, the output of
sensor i is given by
y_i = sign(a_i^T x + v_i − b_i), i.e., y_i = +1 if a_i^T x + v_i ≥ b_i and y_i = −1 if a_i^T x + v_i < b_i,
where ai and bi are known, and vi is measurement error. We assume that the measurement errors
vi are independent random variables with a zero-mean unit-variance Gaussian distribution (i.e.,
with probability density ϕ(v) = (1/√(2π)) e^{−v^2/2}). As a consequence, the sensor outputs yi are
random variables with possible values ±1. We will denote prob(yi = 1) as Pi (x) to emphasize that
it is a function of the unknown parameter x:
P_i(x) = prob(y_i = 1) = prob(a_i^T x + v_i ≥ b_i) = (1/√(2π)) ∫_{b_i − a_i^T x}^{∞} e^{−t^2/2} dt,

1 − P_i(x) = prob(y_i = −1) = prob(a_i^T x + v_i < b_i) = (1/√(2π)) ∫_{−∞}^{b_i − a_i^T x} e^{−t^2/2} dt.
The problem is to estimate x, based on observed values ȳ1 , ȳ2 , . . . , ȳm of the m sensor outputs.
We will apply the maximum likelihood (ML) principle to determine an estimate x̂. In maximum
likelihood estimation, we calculate x̂ by maximizing the log-likelihood function
l(x) = log ( ∏_{ȳ_i = 1} P_i(x) ∏_{ȳ_i = −1} (1 − P_i(x)) ) = Σ_{ȳ_i = 1} log P_i(x) + Σ_{ȳ_i = −1} log(1 − P_i(x)).
9.8 Functions with bounded Newton decrement. Let f : Rn → R be a convex function with ∇2 f (x) ≻ 0
for all x ∈ dom f and Newton decrement bounded by a positive constant c:
λ(x)2 ≤ c ∀x ∈ dom f.
9.9 Monotone convergence of Newton’s method. Suppose f : R → R is strongly convex and smooth,
and in addition, f ′′′ ≤ 0. Let x⋆ minimize f , and suppose Newton’s method is initialized with
x(0) < x⋆ . Show that the iterates x(k) converge to x⋆ monotonically, and that a backtracking line
search always takes a step size of one, i.e., t(k) = 1.
(a) In descent methods, the particular choice of search direction does not matter so much.
(b) In descent methods, the particular choice of line search does not matter so much.
(c) When the gradient method is started from a point near the solution, it will converge very
quickly.
(d) When Newton’s method is started from a point near the solution, it will converge very quickly.
(e) Newton’s method with step size h = 1 always works; damping (i.e., using h < 1) is used only
to improve the speed of convergence.
(f) Using the gradient method to minimize f (T y), where T y = x and T is nonsingular, can greatly
improve the convergence speed when T is chosen appropriately.
(g) Using Newton’s method to minimize f (T y), where T y = x and T is nonsingular, can greatly
improve the convergence speed when T is chosen appropriately.
9.11 Self-concordance. Determine whether the following statements are true or false.
over x ∈ Rn , where w ≻ 0. We wish to solve it to high accuracy, i.e., we seek a point x for which
∇f (x) is very small. Determine whether the following statements are true or false, with a short
justification or explanation.
(a) Newton’s method would probably require fewer iterations than the gradient method.
(b) But each iteration of Newton’s method would be far more costly than an iteration of the
gradient method.
(c) So it’s not clear which method would be better for this problem.
9.13 Newton’s method in machine learning problems. Newton’s method is seldom used to solve large
unconstrained problems with smooth objective that arise in machine learning. Choose the most
reasonable explanation from among the ones below.
10 Equality constrained minimization
10.1 A characterization of the Newton decrement. Let f : Rn → R be convex and twice differentiable,
and let A be a p × n-matrix with rank p. Suppose x̂ is feasible for the equality constrained problem
minimize f (x)
subject to Ax = b.
Recall that the Newton step ∆x at x̂ can be computed from the linear equations
[ ∇^2 f(x̂)   A^T ] [ ∆x ]   [ −∇f(x̂) ]
[     A        0  ] [ u  ] = [    0    ] ,
and that the Newton decrement λ(x̂) is defined as
λ(x̂) = (−∇f (x̂)T ∆x)1/2 = (∆xT ∇2 f (x̂)∆x)1/2 .
Assume the coefficient matrix in the linear equations above is nonsingular and that λ(x̂) is positive.
Express the solution y of the optimization problem
minimize ∇f (x̂)T y
subject to Ay = 0
y T ∇2 f (x̂)y ≤ 1
in terms of the Newton step ∆x and the Newton decrement λ(x̂).
10.2 We consider the equality constrained problem
minimize tr(CX) − log det X
subject to diag(X) = 1.
The variable is the matrix X ∈ Sn . The domain of the objective function is Sn++ . The matrix
C ∈ Sn is a problem parameter. This problem is similar to the analytic centering problem discussed
in lecture 11 (p.18–19) and pages 553-555 of the textbook. The differences are the extra linear term
tr(CX) in the objective, and the special form of the equality constraints. (Note that the equality
constraints can be written as tr(Ai X) = 1 with Ai = ei eTi , a matrix of zeros except for the i, i
element, which is equal to one.)
(b) The Newton step ∆X at a feasible X is defined as the solution of the Newton equations
X −1 ∆XX −1 + diag(w) = −C + X −1 , diag(∆X) = 0,
with variables ∆X ∈ Sn , w ∈ Rn . (Note the two meanings of the diag function: diag(w) is
the diagonal matrix with the vector w on its diagonal; diag(∆X) is the vector of the diagonal
elements of ∆X.) Eliminating ∆X from the first equation gives an equation
diag(X diag(w)X) = 1 − diag(XCX).
This is a set of n linear equations in n variables, so it can be written as Hw = g. Give a
simple expression for the coefficients of the matrix H.
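A candidate expression for H can be sanity-checked numerically before proving it. The sketch below (an illustration, not a derivation) tests the Hadamard-square candidate H_ij = X_ij^2 against the identity diag(X diag(w) X) = Hw on random data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
M = rng.standard_normal((n, n))
X = M @ M.T + n * np.eye(n)        # stand-in feasible (positive definite) X
w = rng.standard_normal(n)
lhs = np.diag(X @ np.diag(w) @ X)  # left-hand side of the Newton equations, as a function of w
H = X * X                          # candidate: H_ij = X_ij^2 (elementwise square)
```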
(c) Implement the feasible Newton method in Matlab. You can use X = I as starting point. The
code should terminate when λ(X)2 /2 ≤ 10−6 , where λ(X) is the Newton decrement.
You can use the Cholesky factorization to evaluate the cost function: if X = LL^T, where L is
lower triangular with positive diagonal, then log det X = 2 Σ_i log L_ii.
To ensure that the iterates remain feasible, the line search has to consist of two phases. Starting
at t = 1, you first need to backtrack until X + t∆X ≻ 0. Then you continue the backtracking
until the condition of sufficient decrease
is satisfied. To check that a matrix X + t∆X is positive definite, you can use the Cholesky
factorization with two output arguments ([R, p] = chol(A) returns p > 0 if A is not positive
definite).
Test your code on randomly generated problems of sizes n = 10, . . . , 100 (for example, using
n = 100; C = randn(n); C = C + C’).
10.3 Estimation of a vector from one-bit measurements. A system of m sensors is used to estimate an
unknown parameter x ∈ Rn . Each sensor makes a noisy measurement of some linear combination
of the unknown parameters, and quantizes the measured value to one bit: it returns +1 if the
measured value exceeds a certain threshold, and −1 otherwise. In other words, the output of
sensor i is given by
y_i = sign(a_i^T x + v_i − b_i), i.e., y_i = +1 if a_i^T x + v_i ≥ b_i and y_i = −1 if a_i^T x + v_i < b_i,
where ai and bi are known, and vi is measurement error. We assume that the measurement errors
vi are independent random variables with a zero-mean unit-variance Gaussian distribution (i.e.,
with probability density ϕ(v) = (1/√(2π)) e^{−v^2/2}). As a consequence, the sensor outputs yi are
random variables with possible values ±1. We will denote prob(yi = 1) as Pi (x) to emphasize that
it is a function of the unknown parameter x:
P_i(x) = prob(y_i = 1) = prob(a_i^T x + v_i ≥ b_i) = (1/√(2π)) ∫_{b_i − a_i^T x}^{∞} e^{−t^2/2} dt,

1 − P_i(x) = prob(y_i = −1) = prob(a_i^T x + v_i < b_i) = (1/√(2π)) ∫_{−∞}^{b_i − a_i^T x} e^{−t^2/2} dt.
2π −∞
The problem is to estimate x, based on observed values ȳ1 , ȳ2 , . . . , ȳm of the m sensor outputs.
We will apply the maximum likelihood (ML) principle to determine an estimate x̂. In maximum
likelihood estimation, we calculate x̂ by maximizing the log-likelihood function
l(x) = log ( ∏_{ȳ_i = 1} P_i(x) ∏_{ȳ_i = −1} (1 − P_i(x)) ) = Σ_{ȳ_i = 1} log P_i(x) + Σ_{ȳ_i = −1} log(1 − P_i(x)).
(a) Show that the ML estimation problem

maximize   l(x)

is a convex optimization problem. The variable is x. The measured vector ȳ, and the parameters ai and bi, are given.
(b) Solve the ML estimation problem with data defined in one_bit_meas_data.m, using Newton’s
method with backtracking line search. This file will define a matrix A (with rows aTi ), a vector
b, and a vector ȳ with elements ±1.
Remark. The Matlab functions erfc and erfcx are useful to evaluate the following functions:
(1/√(2π)) ∫_{−∞}^{u} e^{−t^2/2} dt = (1/2) erfc(−u/√2),        (1/√(2π)) ∫_{u}^{∞} e^{−t^2/2} dt = (1/2) erfc(u/√2),

(1/√(2π)) e^{u^2/2} ∫_{−∞}^{u} e^{−t^2/2} dt = (1/2) erfcx(−u/√2),        (1/√(2π)) e^{u^2/2} ∫_{u}^{∞} e^{−t^2/2} dt = (1/2) erfcx(u/√2).
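In Python, scipy.special provides the same functions (erfc, erfcx), plus log_ndtr, which evaluates log Φ(u) directly and is stable far into the tail. Since the integrals above give P_i(x) = Φ(a_i^T x − b_i), the log-likelihood terms can be computed with log_ndtr applied to ±(a_i^T x − b_i). A small sketch:

```python
import numpy as np
from scipy.special import erfc, log_ndtr, ndtr

u = np.array([-8.0, -1.0, 0.0, 2.0])
phi_u = ndtr(u)                           # Phi(u), the standard normal CDF
phi_u_erfc = 0.5 * erfc(-u / np.sqrt(2))  # same quantity via erfc
log_phi = log_ndtr(u)                     # log Phi(u), stable deep in the tail
```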
10.4 Infeasible start Newton method for LP centering problem. Implement the infeasible start Newton
method for solving the centering problem arising in the standard form LP,
minimize   c^T x − Σ_{i=1}^n log x_i
subject to Ax = b,
with variable x. The data are A ∈ Rm×n , with m < n, c ∈ Rn , and b ∈ Rm . You can assume that
A is full rank. This problem cannot be solved when it is infeasible or unbounded below.
Your code should accept A, b, c, and x0 , and return x⋆ , the primal optimal point, ν ⋆ , a dual optimal
point, and the number of Newton steps executed. The initial point x(0) must satisfy x(0) ≻ 0, but
it need not satisfy the equality constraints.
Use the block elimination method to compute the Newton step. (You can also compute the Newton
step via the KKT system, and compare the result to the Newton step computed via block elimi-
nation. The two steps should be close, but if any xi is very small, you might get a warning about
the condition number of the KKT matrix.)
Plot ∥r(x, ν)∥2 , the norm of the concatenated primal and dual residuals, versus iteration k for
various problem data and initial points, to verify that your implementation achieves quadratic
convergence. As stopping criterion, you can use ∥r(x, ν)∥2 ≤ 10−6 (which means the problem was
solved) or some maximum number of iterations (say, 50) was reached, which means it was not
solved (likely because the problem is either infeasible or unbounded below).
For a fixed problem instance, experiment with varying the algorithm parameters α and β, observing
the effect on the total number of Newton steps required.
To generate problem data (i.e., A, b, c, x0 ) that are feasible, you can first generate A, then random
positive vector p, and set b = Ap. You can be sure that the problem is not unbounded by making
one row of A have positive entries. You may also want to check that A is full rank.
Test the behavior of your implementation on data instances that are not feasible, and also ones
that are unbounded below.
Hints.
• You can use np.linalg.solve for solving the system of equations AH −1 AT w = h − AH −1 g
for w in the computation of the step direction (see Lecture 10).
• Your backtracking line search won’t work correctly if x + t∆x ∉ dom f; look at the last
paragraph on p. 465 of the textbook for guidance on how to ensure x + t∆x ∈ dom f.
• Your dual variable ν can be initialized to any value.
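The method can be sketched in NumPy as follows; the data generation follows the suggestions above, and all parameter choices here are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 30, 80
A = rng.standard_normal((m, n))
A[0] = np.abs(A[0]) + 0.1          # a positive row keeps the problem bounded below
b = A @ (1.0 + rng.random(n))      # b = A p with p > 0, so the LP is strictly feasible
c = rng.standard_normal(n)

x, nu = np.ones(n), np.zeros(m)    # x0 > 0, but A x0 = b need not hold
alpha, beta = 0.01, 0.5

def residual(x, nu):
    # concatenated dual and primal residuals for f(x) = c^T x - sum(log x)
    return np.concatenate([c - 1.0 / x + A.T @ nu, A @ x - b])

for k in range(100):
    r = residual(x, nu)
    if np.linalg.norm(r) <= 1e-6:
        break
    rd, rp = r[:n], r[n:]
    Hinv = x ** 2                                  # Hessian inverse is diagonal
    S = (A * Hinv) @ A.T                           # A H^{-1} A^T (block elimination)
    dnu = np.linalg.solve(S, rp - A @ (Hinv * rd))
    dx = -Hinv * (rd + A.T @ dnu)
    t = 1.0
    while np.any(x + t * dx <= 0):                 # first get back into dom f
        t *= beta
    while np.linalg.norm(residual(x + t * dx, nu + t * dnu)) > (1 - alpha * t) * np.linalg.norm(r):
        t *= beta
    x, nu = x + t * dx, nu + t * dnu
```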
11 Interior-point methods
11.1 Dual feasible point from analytic center. We consider the problem
minimize f0 (x)
(47)
subject to fi (x) ≤ 0, i = 1, . . . , m,
where the functions fi are convex and differentiable. For u > p⋆ , define xac (u) as the analytic
center of the inequalities
f0 (x) ≤ u, fi (x) ≤ 0, i = 1, . . . , m,
i.e.,

x_ac(u) = argmin ( − log(u − f_0(x)) − Σ_{i=1}^m log(−f_i(x)) ) .
Show that λ ∈ Rm , defined by
λ_i = ( u − f_0(x_ac(u)) ) / ( −f_i(x_ac(u)) ) ,   i = 1, . . . , m,
is dual feasible for the problem above. Express the corresponding dual objective value in terms of
u, xac (u) and the problem parameters.
11.2 Efficient solution of Newton equations. Explain how you would solve the Newton equations in the
barrier method applied to the quadratic program
minimize (1/2)xT x + cT x
subject to Ax ⪯ b
where A ∈ Rm×n is dense. Distinguish two cases, m ≫ n and n ≫ m, and give the most efficient
method in each case.
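For the n ≫ m case, one natural approach (an assumption about the intended answer) is the matrix inversion lemma, which reduces the Newton system for the barrier subproblem, minimize t((1/2)x^T x + c^T x) − Σ_i log(b_i − a_i^T x), to an m × m solve. A sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, t = 20, 500, 10.0                # the n >> m case
A = rng.standard_normal((m, n))
x = 0.01 * rng.standard_normal(n)
b = A @ x + 1.0 + rng.random(m)        # x is strictly feasible
c = rng.standard_normal(n)

s = b - A @ x                          # slacks
g = t * (x + c) + A.T @ (1.0 / s)      # gradient of t((1/2)x^T x + c^T x) - sum(log s)
# Newton step via the matrix inversion lemma: O(m^2 n) instead of O(n^3)
K = np.diag(t * s ** 2) + A @ A.T      # small m x m system
v = -(g - A.T @ np.linalg.solve(K, A @ g)) / t
```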
11.3 Efficient solution of Newton equations. Describe an efficient method for solving the Newton equa-
tion in the barrier method for the quadratic program
minimize (1/2)(x − a)T P −1 (x − a)
subject to 0 ⪯ x ⪯ 1,
with variable x ∈ Rn . The matrix P ∈ Sn and the vector a ∈ Rn are given.
Assume that the matrix P is large, positive definite, and sparse, and that P −1 is dense. ‘Efficient’
means that the complexity of the method should be much less than O(n3 ).
11.4 Dual feasible point from incomplete centering. Consider the SDP
minimize 1T x
subject to W + diag(x) ⪰ 0,
with variable x ∈ Rn , and its dual
maximize − tr W Z
subject to Zii = 1, i = 1, . . . , n
Z ⪰ 0,
with variable Z ∈ Sn . (These problems arise in a relaxation of the two-way partitioning problem,
described on page 219; see also exercises 5.39 and 11.23.)
Standard results for the barrier method tell us that when x is on the central path, i.e., minimizes
the function
ϕ(x) = t1T x + log det(W + diag(x))−1
for some parameter t > 0, the matrix
Z = (1/t) (W + diag(x))^{−1}
is dual feasible, with objective value − tr W Z = 1T x − n/t.
Now suppose that x is strictly feasible, but not necessarily on the central path. (For example, x
might be the result of using Newton’s method to minimize ϕ, but with early termination.) Then
the matrix Z defined above will not be dual feasible. In this problem we will show how to construct
a dual feasible Ẑ (which agrees with Z as given above when x is on the central path), from any
point x that is near the central path. Define X = W + diag(x), and let v = −∇2 ϕ(x)−1 ∇ϕ(x) be
the Newton step for the function ϕ defined above. Define
Ẑ = (1/t) ( X^{−1} − X^{−1} diag(v) X^{−1} ) .
t
(a) Verify that when x is on the central path, we have Ẑ = Z.
(b) Show that Ẑii = 1, for i = 1, . . . , n.
(c) Let λ(x) = ( ∇ϕ(x)^T ∇^2 ϕ(x)^{−1} ∇ϕ(x) )^{1/2} be the Newton decrement at x. Show that
(d) Show that λ(x) < 1 implies that Ẑ ≻ 0. Thus, when x is near the central path (meaning,
λ(x) < 1), Ẑ is dual feasible.
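Part (b)’s identity can be checked numerically (a sanity check, not a proof), using the standard matrix-calculus facts ∇ϕ(x) = t1 − diag(X^{−1}) and ∇^2 ϕ(x) = X^{−1} ∘ X^{−1} (the Hadamard square):

```python
import numpy as np

rng = np.random.default_rng(0)
n, t = 6, 2.0
W = rng.standard_normal((n, n))
W = (W + W.T) / 2
x = (np.abs(np.linalg.eigvalsh(W)).max() + 1.0) * np.ones(n) + rng.random(n)
X = W + np.diag(x)                      # X > 0 since min(x) > max |eig(W)|
Xi = np.linalg.inv(X)
g = t * np.ones(n) - np.diag(Xi)        # gradient of phi
H = Xi * Xi                             # Hessian of phi: H_ij = (X^{-1})_ij^2
v = np.linalg.solve(H, -g)              # Newton step for phi
Zhat = (Xi - Xi @ np.diag(v) @ Xi) / t
```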
11.5 Standard form LP barrier method. In the following three parts of this exercise, you will implement
a barrier method for solving the standard form LP
minimize cT x
subject to Ax = b, x ⪰ 0,
with variable x ∈ Rn , where A ∈ Rm×n , with m < n. Throughout these exercises we will assume
that A is full rank, and the sublevel sets {x | Ax = b, x ⪰ 0, cT x ≤ γ} are all bounded. (If this is
not the case, the centering problem is unbounded below.)
(a) Centering step. Implement Newton’s method for solving the centering problem

minimize   c^T x − Σ_{i=1}^n log x_i
subject to Ax = b,

with variable x ∈ R^n, given a strictly feasible starting point x0.
Use the block elimination method to compute the Newton step. (You can also compute the
Newton step via the KKT system, and compare the result to the Newton step computed via
block elimination. The two steps should be close, but if any xi is very small, you might get a
warning about the condition number of the KKT matrix.)
Plot λ2 /2 versus iteration k, for various problem data and initial points, to verify that your
implementation gives asymptotic quadratic convergence. As stopping criterion, you can use
λ2 /2 ≤ 10−6 . Experiment with varying the algorithm parameters α and β, observing the effect
on the total number of Newton steps required, for a fixed problem instance. Check that your
computed x⋆ and ν ⋆ (nearly) satisfy the KKT conditions.
To generate some random problem data (i.e., A, b, c, x0 ), we recommend the following ap-
proach. First, generate A randomly. (You might want to check that it has full rank.) Then
generate a random positive vector x0 , and take b = Ax0 . (This ensures that x0 is strictly
feasible.) The parameter c can be chosen randomly. To be sure the sublevel sets are bounded,
you can add a row to A with all positive elements. If you want to be able to repeat a run with
the same problem data, be sure to set the state for the uniform and normal random number
generators.
Here are some hints that may be useful.
• We recommend computing λ2 using the formula λ2 = −∆xTnt ∇f (x). You don’t really need
λ for anything; you can work with λ2 instead. (This is important for reasons described
below.)
• There can be small numerical errors in the Newton step ∆xnt that you compute. When
x is nearly optimal, the computed value of λ2 , i.e., λ2 = −∆xTnt ∇f (x), can actually be
(slightly) negative. If you take the square root to get λ, you’ll get a complex number,
and you’ll never recover. Moreover, your line search will never exit. However, this only
happens when x is nearly optimal. So if you exit on the condition λ2 /2 ≤ 10−6 , everything
will be fine, even when the computed value of λ2 is negative.
• For the line search, you must first multiply the step size t by β until x + t∆xnt is feasible
(i.e., strictly positive). If you don’t, when you evaluate f you’ll be taking the logarithm
of negative numbers, and you’ll never recover.
(b) LP solver with strictly feasible starting point. Using the centering code from part (a), imple-
ment a barrier method to solve the standard form LP
minimize cT x
subject to Ax = b, x ⪰ 0,
with variable x ∈ Rn , given a strictly feasible starting point x0 . Your LP solver should take
as argument A, b, c, and x0 , and return x⋆ .
You can terminate your barrier method when the duality gap, as measured by n/t, is smaller
than 10−3 . (If you make the tolerance much smaller, you might run into some numerical
trouble.) Check your LP solver against the solution found by CVX*, for several problem
instances.
The comments in part (a) on how to generate random data hold here too.
Experiment with the parameter µ to see the effect on the number of Newton steps per centering
step, and the total number of Newton steps required to solve the problem.
Plot the progress of the algorithm, for a problem instance with n = 500 and m = 100, showing
duality gap (on a log scale) on the vertical axis, versus the cumulative total number of Newton
steps (on a linear scale) on the horizontal axis.
Your algorithm should return a 2 × k matrix history (where k is the total number of centering
steps), whose first row contains the number of Newton steps required for each centering step,
and whose second row shows the duality gap at the end of each centering step. In order to
get a plot that looks like the ones in the book (e.g., figure 11.4, page 572), you should use the
following code:
[xx, yy] = stairs(cumsum(history(1,:)),history(2,:));
semilogy(xx,yy);
(c) LP solver. Using the code from part (b), implement a general standard form LP solver, that
takes arguments A, b, c, determines (strict) feasibility, and returns an optimal point if the
problem is (strictly) feasible.
You will need to implement a phase I method, that determines whether the problem is strictly
feasible, and if so, finds a strictly feasible point, which can then be fed to the code from
part (b). In fact, you can use the code from part (b) to implement the phase I method.
To find a strictly feasible initial point x0 , we solve the phase I problem
minimize t
subject to Ax = b
x ⪰ (1 − t)1, t ≥ 0,
with variables x and t. If we can find a feasible (x, t), with t < 1, then x is strictly feasible for
the original problem. The converse is also true, so the original LP is strictly feasible if and
only if t⋆ < 1, where t⋆ is the optimal value of the phase I problem.
We can initialize x and t for the phase I problem with any x0 satisfying Ax0 = b, and
t0 = 2 − mini x0i . (Here we can assume that min x0i ≤ 0; otherwise x0 is already a strictly
feasible point, and we are done.) You can use a change of variable z = x+(t−1)1 to transform
the phase I problem into the form in part (b).
Check your LP solver against CVX* on several numerical examples, including both feasible
and infeasible instances.
11.6 Primal and dual feasible points in the barrier method for LP. Consider a standard form LP and its
dual
minimize   c^T x                    maximize   b^T y
subject to Ax = b, x ⪰ 0            subject to A^T y ⪯ c,
with A ∈ Rm×n and rank(A) = m. In the barrier method the (feasible) Newton method is applied
to the equality constrained problem
minimize tcT x + ϕ(x)
subject to Ax = b,
where t > 0 and ϕ(x) = − Σ_{i=1}^n log x_i . The Newton equation at a strictly feasible x̂ is given by
[ ∇^2 ϕ(x̂)   A^T ] [ ∆x ]   [ −tc − ∇ϕ(x̂) ]
[     A        0  ] [ w  ] = [       0       ] .
Suppose λ(x̂) ≤ 1 where λ(x̂) is the Newton decrement at x̂.
with variables x and y. Derive the Lagrange dual of this problem and express it in terms of
the dual function g(λ) in (48).
(b) Suppose the feasible set of the dual problem in (48) contains strictly positive λ. Show that
the centering problem (49) is bounded below for any positive t.
11.8 Standard form LP barrier method with infeasible start Newton method. Implement the barrier
method for the standard form LP,
minimize cT x
subject to Ax = b, x ⪰ 0,
with variable x ∈ Rn , where A ∈ Rm×n , with m < n, with A full rank. (Your method will of
course fail if the problem is not strictly feasible, or if it is unbounded.)
Use the centering code that you developed in exercise 10.4. Your LP solver should take as argument
A, b, c, and return primal and dual optimal points x⋆ , ν ⋆ , and λ⋆ .
You can terminate your barrier method when the duality gap, as measured by n/t, is smaller than
10−3 . (If you make the tolerance much smaller, you might run into some numerical trouble.)
Check your LP solver against the solution found by CVX* for several problem instances. The
comments in exercise 10.4 on how to generate random data hold here too.
Experiment with the parameter µ to see the effect on the number of Newton steps per centering
step, and the total number of Newton steps required to solve the problem.
Plot the progress of the algorithm, for a problem instance with n = 500 and m = 100, showing
duality gap (on a log scale) on the vertical axis, versus the cumulative total number of Newton
steps (on a linear scale) on the horizontal axis.
Your algorithm should return a 2 × k matrix history (where k is the total number of centering
steps), whose first row contains the number of Newton steps required for each centering step, and
whose second row shows the duality gap at the end of each centering step. In order to get a plot
that looks like the ones in the book (e.g., figure 11.4, page 572), in Julia, with PyPlot you can use
the following:
using PyPlot
step(cumsum(history[1,:]’),history[2,:]’)
yscale("log")
minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, . . . , m,
(a) Show that the Karush-Kuhn-Tucker (KKT) conditions are equivalent to the two conditions
λ_i = max { λ_i + f_i(x), 0 } ,   i = 1, . . . , m,        ∇f_0(x) + Σ_{i=1}^m λ_i ∇f_i(x) = 0.
(b) The function max {u, 0} can be approximated by the smooth function (u + √(u^2 + 4t))/2, where
t is a positive constant that determines the quality of the approximation.
(Figure: the smooth approximation (u + √(u^2 + 4t))/2 of max{u, 0}; at u = 0 the approximation has value √t.)
If we use this approximation in the KKT conditions of part (a), we obtain

    λ_i = ( λ_i + f_i(x) + √( (λ_i + f_i(x))^2 + 4t ) ) / 2,  i = 1, . . . , m,    ∇f_0(x) + Σ_{i=1}^m λ_i ∇f_i(x) = 0.    (50)
Solving this set of nonlinear equations for a small value of t gives an approximate solution of the KKT conditions. Show that this is another interpretation of the central path: if x, λ satisfy (50), then x minimizes f_0(x) − t Σ_{i=1}^m log(−f_i(x)) and λ_i = −t/f_i(x).
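Both claims are easy to sanity-check numerically before attempting the derivation. A small plain-Python check (the values of t and f are chosen only for illustration):

```python
import math

def smooth_max(u, t):
    """Smooth approximation of max(u, 0); the error vanishes as t -> 0."""
    return (u + math.sqrt(u * u + 4 * t)) / 2

# Approximation quality: at u = 0 the value is exactly sqrt(t),
# and away from 0 it is close to max(u, 0).
assert abs(smooth_max(0.0, 1e-6) - 1e-3) < 1e-15
assert abs(smooth_max(2.0, 1e-6) - 2.0) < 1e-6
assert abs(smooth_max(-2.0, 1e-6)) < 1e-6

# Central path identity: with f < 0 and lam = -t/f, lam is a fixed point of
# lam = smooth_max(lam + f, t), which is the first equation in (50).
f, t = -0.3, 1e-2
lam = -t / f
assert abs(lam - smooth_max(lam + f, t)) < 1e-12
```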
11.11 Consider the quadratic program that arises in the Markowitz portfolio selection problem:
minimize xT P x + q T x
subject to x ⪰ 0
1T x = 1.
The variable is x ∈ Rn . We assume P is positive definite.
(a) The barrier method applied to this problem requires the repeated solution of equality-constrained optimization problems

    minimize    t(x^T P x + q^T x) − Σ_{k=1}^n log x_k
    subject to  1^T x = 1,

where t is a positive constant. Give the set of linear equations that defines the Newton step ∆x.
(b) Suppose P = F F T + D where D is positive diagonal and F is an n × p matrix with n ≫ p.
Describe an efficient method for solving the Newton equation in part (a). How does the
complexity of your method depend on n? Is it a linear, quadratic, or cubic function?
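One way to bring the complexity down is the matrix inversion lemma (Sherman-Morrison-Woodbury): a solve with diag(d) + F F^T reduces to a small p × p system plus O(np) work. A rank-one (p = 1) sketch in plain Python, with illustrative data; the general case replaces the scalar divisor by a p × p solve:

```python
def solve_diag_plus_rank1(d, f, b):
    """Solve (diag(d) + f f^T) x = b via the matrix inversion lemma in O(n).
    With an n-by-p factor F the same formula needs one p-by-p solve: O(n p^2)."""
    db = [bi / di for bi, di in zip(b, d)]          # D^{-1} b
    df = [fi / di for fi, di in zip(f, d)]          # D^{-1} f
    coef = sum(fi * v for fi, v in zip(f, db)) / (1 + sum(fi * v for fi, v in zip(f, df)))
    return [u - coef * v for u, v in zip(db, df)]

d, f, b = [2.0, 4.0, 5.0], [1.0, 2.0, 3.0], [1.0, 1.0, 1.0]
x = solve_diag_plus_rank1(d, f, b)
for i in range(3):                                  # residual check
    assert abs(d[i] * x[i] + f[i] * sum(f[j] * x[j] for j in range(3)) - b[i]) < 1e-12
```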
11.12 Self-concordance and a logarithmic barrier for the exponential cone. Let C ⊂ R^n_+ be an open convex set and f : C → R be three times continuously differentiable and convex. Recall the notation that if for a vector v ∈ R^n we define g(t) = f(x + tv) for t ∈ R, then

    ∇^3 f(x)[v, v, v] = g′′′(0) = lim_{t→0} ( v^T ∇^2 f(x + tv) v − v^T ∇^2 f(x) v ) / t.
Assume the following condition, which generalizes inequality (9.43) in the book:

    |∇^3 f(x)[v, v, v]| ≤ 3 v^T ∇^2 f(x) v ( Σ_{i=1}^n v_i^2 / x_i^2 )^{1/2}.    (51)
It turns out that ψ is a barrier for K, meaning that ∥∇ψ(x, y, z)∥ → +∞ as (x, y, z) → bd K,
and that it is the optimal (in a sense we will not make precise) such self-concordant barrier; such
logarithmically-homogeneous barriers are useful for bounding the complexity of Newton methods
for solving convex problems.
11.13 Educational testing problem. The educational testing problem (ETP) has the following form:
maximize 1T x
subject to Σ − diag(x) ⪰ 0 (52)
x ⪰ 0,
with variable x ∈ Rn ; the problem data is Σ ∈ Sn++ .
(a) Dual of the ETP. Form the Lagrangian, and derive an explicit expression for the dual function
g. Show that the Lagrange dual can be simplified as
minimize tr ΣZ
subject to Z ⪰ 0 (53)
Zii ≥ 1, i = 1, . . . , n,
with variable Z ∈ Sn . Note that Z = I is dual feasible. What is the corresponding bound on
the optimal value of the ETP? Can you derive this bound directly (i.e., without any duality
theory)?
(b) Central path for ETP. We will use the standard logarithmic barrier for the positive definite
cone, i.e., log det X −1 for X ∈ Sn++ . Derive and simplify the conditions under which a strictly
feasible x is on the central path. Show explicitly how to find a matrix Z that is feasible for
the dual (53), given a central point x⋆ (t).
(c) Barrier method for ETP. Write code that solves the ETP, with a guaranteed accuracy of 1%,
by which we mean that on exit your solution must satisfy
p⋆ − 1T x ≤ 0.01p⋆ ,
where p⋆ is the optimal value of (52).
Along with your code, give formulas for the gradient and Hessian of barrier and related func-
tions, how you find an initial strictly feasible x, what starting value you use for t, and how
your stopping criterion guarantees the required 1% accuracy. You can use a backtracking line
search.
Test your code on a variety of simple instances (diagonal, 2 × 2, . . . ). Then test it on some
larger problems, with random (positive definite symmetric) Σ.
For a moderate sized problem (say, n = 30), experiment with the effect of µ on the total
number of Newton steps required to solve the problem.
Hint. The gradient and Hessian of the logarithmic barrier ϕ(x) = log det F(x)^{-1} for the linear matrix inequality

    F(x) = F_0 + Σ_{i=1}^m x_i F_i ⪰ 0,    F_i = F_i^T ∈ R^{n×n},  i = 0, . . . , m,

are given by

    ∇ϕ(x)_i = − tr(F_i F(x)^{-1}),    ∇^2 ϕ(x)_{ij} = tr(F_i F(x)^{-1} F_j F(x)^{-1}),    i, j = 1, . . . , m.
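The gradient formula in the hint can be spot-checked by finite differences on a tiny instance (2 × 2, one variable, illustrative data; plain Python):

```python
import math

F0 = [[3.0, 0.5], [0.5, 2.0]]
F1 = [[1.0, 0.2], [0.2, 0.5]]

def F(x):
    return [[F0[i][j] + x * F1[i][j] for j in range(2)] for i in range(2)]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def phi(x):                      # phi(x) = log det F(x)^{-1} = -log det F(x)
    return -math.log(det(F(x)))

x0, h = 0.3, 1e-6
M = F(x0)
dM = det(M)
Minv = [[M[1][1] / dM, -M[0][1] / dM], [-M[1][0] / dM, M[0][0] / dM]]
grad = -sum(F1[i][j] * Minv[j][i] for i in range(2) for j in range(2))  # -tr(F1 F(x)^{-1})
fd = (phi(x0 + h) - phi(x0 - h)) / (2 * h)
assert abs(grad - fd) < 1e-6     # formula agrees with central differences
```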
12 Mathematical background
12.1 Some famous inequalities. The Cauchy-Schwarz inequality states that

    |a^T b| ≤ ∥a∥_2 ∥b∥_2    for all a, b ∈ R^n.

Recall that the harmonic mean of a positive n-vector x is

    ( (1/n) Σ_{k=1}^n 1/x_k )^{-1}.

Use the Cauchy-Schwarz inequality to show that the arithmetic mean (Σ_k x_k)/n of a positive n-vector is greater than or equal to its harmonic mean.
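A quick numerical sanity check of the claim on random positive vectors (of course, this is not a proof):

```python
import random

random.seed(0)
for _ in range(100):
    x = [random.uniform(0.1, 10.0) for _ in range(8)]
    arithmetic = sum(x) / len(x)
    harmonic = len(x) / sum(1.0 / v for v in x)
    assert arithmetic >= harmonic      # AM >= HM for positive vectors
```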
12.2 Schur complements. Consider a symmetric matrix X partitioned as

    X = [ A    B
          B^T  C ],

where A ∈ R^{k×k}. If det A ̸= 0, the matrix S = C − B^T A^{-1} B is called the Schur complement of A in X. Schur complements arise in many situations and appear in many important formulas and theorems. For example, we have det X = det A det S. (You don’t have to prove this.)
(a) The Schur complement arises when you minimize a quadratic form over some of the variables.
Let f (u, v) = (u, v)T X(u, v), where u ∈ Rk . Let g(v) be the minimum value of f over u, i.e.,
g(v) = inf u f (u, v). Of course g(v) can be −∞.
Show that if A ≻ 0, we have g(v) = v T Sv.
(b) The Schur complement arises in several characterizations of positive definiteness or semidefi-
niteness of a block matrix. As examples we have the following three theorems:
• X ≻ 0 if and only if A ≻ 0 and S ≻ 0.
• If A ≻ 0, then X ⪰ 0 if and only if S ⪰ 0.
• X ⪰ 0 if and only if A ⪰ 0, B T (I − AA† ) = 0 and C − B T A† B ⪰ 0, where A† is the
pseudo-inverse of A. (C − B T A† B serves as a generalization of the Schur complement in
the case where A is positive semidefinite but singular.)
Prove one of these theorems. (You can choose which one.)
(a) The set P of doubly stochastic matrices (often called the Birkhoff polytope) consists of the matrices P ∈ R^{n×n}_+ satisfying P1 = 1 and P^T 1 = 1. For a vector x ∈ R^n, we let x_[i] denote the ith largest component of x. Show that for any two vectors x, y ∈ R^n,

    x^T P y ≤ Σ_{i=1}^n x_[i] y_[i].

Hint. You may use the result that if S_n = {A ∈ {0, 1}^{n×n} | A1 = 1, A^T 1 = 1} is the set of permutation matrices, then P = conv{S_n}, that is, the Birkhoff polytope is the convex hull of S_n.
(b) Prove Von Neumann’s trace inequality, that is, that for any matrices X, Y ∈ S^n,

    tr(XY) ≤ Σ_{i=1}^n λ_i(X) λ_i(Y),    (54)

where λ_i(X), λ_i(Y) are the eigenvalues of X, Y in sorted order. Hint. Argue that it is no loss of generality to assume X is diagonal, and use that you may write Y = U Λ U^T for an orthogonal U and diagonal Λ.
(c) Give a sufficient condition for equality in Eq. (54).
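A numerical spot check of (54) on random symmetric 2 × 2 matrices, using the closed-form eigenvalues (again a check, not a proof; plain Python):

```python
import math, random

def eig2(M):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, c]], decreasing order."""
    a, b, c = M[0][0], M[0][1], M[1][1]
    m, r = (a + c) / 2, math.hypot((a - c) / 2, b)
    return m + r, m - r

random.seed(1)
for _ in range(200):
    X = [[random.gauss(0, 1), 0.0], [0.0, random.gauss(0, 1)]]
    X[0][1] = X[1][0] = random.gauss(0, 1)
    Y = [[random.gauss(0, 1), 0.0], [0.0, random.gauss(0, 1)]]
    Y[0][1] = Y[1][0] = random.gauss(0, 1)
    tr_xy = sum(X[i][j] * Y[j][i] for i in range(2) for j in range(2))
    lx, ly = eig2(X), eig2(Y)
    assert tr_xy <= lx[0] * ly[0] + lx[1] * ly[1] + 1e-9
```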
12.4 Von Neumann’s trace inequality revisited. In this question, you prove the full version of Von Neumann’s trace inequality, that is, that for matrices X, Y ∈ R^{n×m},

    tr(X^T Y) ≤ Σ_{i=1}^{min{n,m}} σ_i(X) σ_i(Y),    (55)

where σ_i(X), σ_i(Y) are the singular values of X, Y in sorted (decreasing) order. That is, the dual norm to the ℓ_2-operator norm is the trace norm (or sum of singular values).
13 Numerical linear algebra
13.1 Time to solve one or multiple sets of linear equations.
(a) About how long does it take a 10 Gflop/s computer to solve a system of 100 linear equations
(with 100 variables)? Choose one below.
• Ten microseconds.
• One hundred microseconds.
• One millisecond.
• Ten milliseconds.
• One hundred milliseconds.
• One second.
(b) About how long does it take a 10 Gflop/s computer to solve 10 systems of 100 linear equations,
with the same coefficient matrix but 10 different righthand sides?
• Ten microseconds.
• One hundred microseconds.
• One millisecond.
• Ten milliseconds.
• One hundred milliseconds.
• One second.
(a) Algorithm flop counts allow for very accurate prediction of running time on a given computer.
(b) Since matrix multiplication is associative, the flop count for multiplying three or more matrices
doesn’t depend on the order in which you multiply them.
(c) Suppose A ∈ Rn×n is lower triangular. The flop count for computing Ab is the same order as
the flop count for computing A−1 b.
13.3 Ridge regression. Suppose A ∈ Rm×n , and we need to compute x that minimizes ∥Ax − b∥22 +
(ρ/2)∥x∥22 , where ρ > 0. (This is the problem of using ridge regression to fit a regression model,
but that doesn’t matter here.)
(a) Tall matrix. For m ≥ n, the flop count (order) of a good method is (choose one)
• m3 .
• m2 n.
• mn2 .
• n3 .
(b) Wide matrix. For m ≤ n, the flop count (order) of a good method is (choose one)
• m3 .
• m2 n.
• mn2 .
• n3 .
14 Circuits
14.1 Interconnect sizing. In this problem we will size the interconnecting wires of the simple circuit
shown below, with one voltage source driving three different capacitive loads Cload1 , Cload2 , and
Cload3 .
(Figure: the circuit, with one voltage source driving the loads Cload1, Cload2, and Cload3 through the wire segments.)
We divide the wires into 6 segments of fixed length li ; our variables will be the widths wi of the
segments. (The height of the wires is related to the particular IC technology process, and is fixed.)
The total area used by the wires is, of course,
X
A= wi li .
i
We’ll take the lengths to be one, for simplicity. The wire widths must be between a minimum and
maximum allowable value:
Wmin ≤ wi ≤ Wmax .
For our specific problem, we’ll take Wmin = 0.1 and Wmax = 10.
Each of the wire segments will be modeled by a simple RC circuit, with the resistance
inversely proportional to the width of the wire and the capacitance proportional to the width. (A
far better model uses an extra constant term in the capacitance, but this complicates the equations.)
The capacitance and resistance of the ith segment is thus
Ci = k0 wi , Ri = ρ/wi ,
where k0 and ρ are positive constants, which we take to be one for simplicity. We also have
Cload1 = 1.5, Cload2 = 1, and Cload3 = 5.
Using the RC model for the wire segments yields the circuit shown below.
(Figure: the RC circuit model; segment i contributes Ri and Ci, and the loads add to the segment capacitances as C3 + Cload1, C5 + Cload2, and C6 + Cload3.)
We will use the Elmore delay to model the delay from the source to each of the loads; let T1, T2, and T3 denote the Elmore delays to loads 1, 2, and 3, respectively. The critical delay of the circuit is

    T = max{T1, T2, T3}.
(a) Explain how to find the optimal trade-off curve between area A and delay T .
(b) Optimal area-delay sizing. For the specific problem parameters given, plot the area-delay
trade-off curve, together with the individual Elmore delays. Comment on the results you
obtain.
(c) The simple method. Plot the area-delay trade-off obtained when you assign all wire widths
to be the same width (which varies between Wmin and Wmax ). Compare this curve to the
optimal one, obtained in part (b). How much better does the optimal method do than the
simple method? Note: for a large circuit, say with 1000 wires to size, the difference is far
larger.
For this problem you can use CVX in GP mode. We’ve also made available the function
elm_del_example.m, which evaluates the three delays, given the widths of the wires.
14.2 Optimal sizing of power and ground trees. We consider a system or VLSI device with many sub-
systems or subcircuits, each of which needs one or more power supply voltages. In this problem we
consider the case where the power supply network has a tree topology with the power supply (or
external pin connection) at the root. Each node of the tree is connected to some subcircuit that
draws power.
We model the power supply as a constant voltage source with value V . The m subcircuits are
modeled as current sources that draw currents i1 (t), . . . , im (t) from the node (to ground) (see the
figure below).
(Figure: the supply network; the source V at the root feeds the nodes through resistors R1, . . . , R6, and the current sources i1(t), . . . , i6(t) draw current at the nodes.)

Each current is the sum of a DC and an AC term, i_k(t) = i_k^dc + i_k^ac(t), where i_k^dc is the DC current draw (which is a positive constant), and i_k^ac(t) is the AC draw (which has zero average value). We characterize the AC current draw by its RMS value, defined as
    RMS(i_k^ac) = ( lim_{T→∞} (1/T) ∫_0^T i_k^ac(t)^2 dt )^{1/2}.
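In discrete time the same quantity is just the root mean square of the samples; for example, a sinusoid of amplitude A has RMS value A/√2. A small plain-Python check:

```python
import math

def rms(samples):
    """Discrete analogue of the RMS definition above."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

N = 100000
sine = [math.sin(2 * math.pi * k / N) for k in range(N)]
assert abs(rms(sine) - 1 / math.sqrt(2)) < 1e-6   # amplitude 1 -> RMS 1/sqrt(2)
```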
For each subcircuit we are given maximum values for the DC and RMS AC currents draws, i.e.,
constants I_k^dc and I_k^ac such that

    0 ≤ i_k^dc ≤ I_k^dc,    RMS(i_k^ac) ≤ I_k^ac.    (56)
The n wires that form the distribution network are modeled as resistors Rk (which, presumably, have
small value). (Since the circuit has a tree topology, we can use the following labeling convention:
node k and the current source ik (t) are immediately following resistor Rk .) The resistance of the
wires is given by
Ri = αli /wi ,
where α is a constant and li are the lengths of the wires, which are known and fixed. The variables
in the problem are the width of the wires, w1 , . . . , wn . Obviously by making the wires very wide,
the resistances become very low, and we have a nearly ideal power network. The purpose of this
problem is to optimally select wire widths, to minimize area while meeting certain specifications.
Note that in this problem we ignore dynamics, i.e., we do not model the capacitance or inductance
of the wires.
As a result of the current draws and the nonzero resistance of the wires, the voltage at node k
(which supplies subcircuit k) has a DC value less than the supply voltage, and also an AC voltage
(which is called power supply ripple or noise). By superposition these two effects can be analyzed
separately.
• The DC voltage drop V − v_k^dc at node k is equal to the sum of the voltage drops across wires on the (unique) path from node k to the root. It can be expressed as

    V − v_k^dc = Σ_{j=1}^m i_j^dc Σ_{i∈N(j,k)} R_i,    (57)

where N(j,k) consists of the indices of the branches upstream from nodes j and k, i.e., i ∈ N(j,k) if and only if R_i is in the path from node j to the root and in the path from node k to the root.
• The power supply noise at a node can be found as follows. The AC voltage at node k is equal to

    v_k^ac(t) = − Σ_{j=1}^m i_j^ac(t) Σ_{i∈N(j,k)} R_i.
We assume the AC current draws are independent, so the RMS value of v_k^ac(t) is given by the square root of the sum of the squares of the RMS values of the ripple due to each node, i.e.,

    RMS(v_k^ac) = ( Σ_{j=1}^m ( RMS(i_j^ac) Σ_{i∈N(j,k)} R_i )^2 )^{1/2}.    (58)
The problem is to choose wire widths w_i that minimize the total wire area Σ_{k=1}^n w_k l_k, subject to the following specifications:

• maximum allowable DC voltage drop at each node:

    V − v_k^dc ≤ V_max^dc,  k = 1, . . . , m,    (59)

where V − v_k^dc is given by (57), and V_max^dc is a given constant.
• maximum allowable power supply noise at each node:

    RMS(v_k^ac) ≤ V_max^ac,  k = 1, . . . , m,    (60)
• maximum allowable DC current density in each wire:

    ( Σ_{j∈M(k)} i_j^dc ) / w_k ≤ ρ_max,  k = 1, . . . , n,    (62)

where M(k) is the set of all indices of nodes downstream from resistor k, i.e., j ∈ M(k) if and only if R_k is in the path from node j to the root, and ρ_max is a given constant.
• maximum allowable total DC power dissipation in supply network:

    Σ_{k=1}^n R_k ( Σ_{j∈M(k)} i_j^dc )^2 ≤ P_max.    (63)
These specifications must be satisfied for all possible ik (t) that satisfy (56).
Formulate this as a convex optimization problem in the standard form
minimize f0 (x)
subject to fi (x) ≤ 0, i = 1, . . . , p
Ax = b.
You may introduce new variables, or use a change of variables, but you must say very clearly
• what the optimization variable x is, and how it corresponds to the problem variables w (i.e.,
is x equal to w, does it include auxiliary variables, . . . ?)
• what the objective f0 and the constraint functions fi are, and how they relate to the objectives
and specifications of the problem description
• why the objective and constraint functions are convex
• what A and b are (if applicable).
14.3 Optimal amplifier gains. We consider a system of n amplifiers connected (for simplicity) in a chain,
as shown below. The variables that we will optimize over are the gains a1 , . . . , an > 0 of the
amplifiers. The first specification is that the overall gain of the system, i.e., the product a1 · · · an ,
is equal to Atot , which is given.
(Figure: a chain of n amplifiers with gains a1, a2, . . . , an.)
We are concerned about two effects: noise generated by the amplifiers, and amplifier overload.
These effects are modeled as follows.
We first describe how the noise depends on the amplifier gains. Let Ni denote the noise level (RMS,
or root-mean-square) at the output of the ith amplifier. These are given recursively as
    N_0 = 0,    N_i = a_i ( N_{i-1}^2 + α_i^2 )^{1/2},  i = 1, . . . , n,
where αi > 0 (which is given) is the (‘input-referred’) RMS noise level of the ith amplifier. The
output noise level Nout of the system is given by Nout = Nn , i.e., the noise level of the last amplifier.
Evidently Nout depends on the gains a1 , . . . , an .
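The recursion is straightforward to evaluate numerically. The sketch below assumes the reading N_i = a_i (N_{i-1}^2 + α_i^2)^{1/2}, i.e., α_i is input-referred, so each stage amplifies both the incoming noise and its own contribution:

```python
import math

def output_noise(a, alpha):
    """N_out = N_n from N_0 = 0, N_i = a_i * sqrt(N_{i-1}**2 + alpha_i**2)."""
    N = 0.0
    for ai, al in zip(a, alpha):
        N = ai * math.sqrt(N * N + al * al)
    return N

# One stage with gain 2 and input-referred noise 3 gives output noise 2*3 = 6.
assert output_noise([2.0], [3.0]) == 6.0
```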
Now we describe the amplifier overload limits. Si will denote the signal level at the output of the
ith amplifier. These signal levels are related by
S0 = Sin , Si = ai Si−1 , i = 1, . . . , n,
where Sin > 0 is the input signal level. Each amplifier has a maximum allowable output level
Mi > 0 (which is given). (If this level is exceeded the amplifier will distort the signal.) Thus we
have the constraints Si ≤ Mi , for i = 1, . . . , n. (We can ignore the noise in the overload condition,
since the signal levels are much larger than the noise levels.)
The maximum output signal level Smax is defined as the maximum value of Sn , over all input signal
levels Sin that respect the overload constraints Si ≤ Mi . Of course Smax ≤ Mn , but it can be
smaller, depending on the gains a1 , . . . , an .
The dynamic range D of the system is defined as D = Smax /Nout . Evidently it is a (rather
complicated) function of the amplifier gains a1 , . . . , an .
The goal is to choose the gains a_i to maximize the dynamic range D, subject to the constraint Π_i a_i = A_tot and upper bounds on the amplifier gains, a_i ≤ A_i^max (which are given).
Explain how to solve this problem as a convex (or quasiconvex) optimization problem. If you intro-
duce new variables, or transform the variables, explain. Clearly give the objective and inequality
constraint functions, explaining why they are convex if it is not obvious. If your problem involves
equality constraints, give them explicitly.
Carry out your method on the specific instance with n = 4, and data
    A_tot = 10000,
    α = (10^{-5}, 10^{-2}, 10^{-2}, 10^{-2}),
    M = (0.1, 5, 10, 10),
    A^max = (40, 40, 40, 20).
W min ≤ wi ≤ W max , i = 1, . . . , n,
where W min and W max are given (positive) values. (You can assume there are no other constraints
on w.) The design is judged by three objectives, each of which we would like to be small: the
circuit power P (w), the circuit delay D(w), and the total circuit area A(w). These three objectives
are (complicated) posynomial functions of w.
You do not know the functions P , D, or A. (That is, you do not know the coefficients or exponents
in the posynomial expressions.) You do know a set of k designs, given by w(1) , . . . , w(k) ∈ Rn , and
their associated objective values
You can assume that these designs satisfy the width limits. The goal is to find a design w that
satisfies the width limits, and the design specifications
Hints/comments.
• You do not need to know anything about circuit design to solve this problem.
• See the title of this problem.
14.5 Solving nonlinear circuit equations using convex optimization. An electrical circuit consists of b
two-terminal devices (or branches) connected to n nodes, plus a so-called ground node. The goal is
to compute several sets of physical quantities that characterize the circuit operation. The vector of
branch voltages is v ∈ Rb , where vj is the voltage appearing across device j. The vector of branch
currents is i ∈ Rb , where ij is the current flowing through device j. (The symbol i, which is often
used to denote an index, is unfortunately the standard symbol used to denote current.) The vector
of node potentials is e ∈ Rn , where ek is the potential of node k with respect to the ground node.
(The ground node has potential zero by definition.)
The circuit variables v, i, and e satisfy several physical laws. Kirchhoff’s current law (KCL) can
be expressed as Ai = 0, and Kirchhoff’s voltage law (KVL) can be expressed as v = AT e, where
A ∈ R^{n×b} is the reduced incidence matrix, which describes the circuit topology:

    A_kj = −1 if branch j enters node k;  +1 if branch j leaves node k;  0 otherwise,
for k = 1, . . . , n, j = 1, . . . , b. (KCL states that current is conserved at each node, and KVL states
that the voltage across each branch is the difference of the potentials of the nodes it is connected
to.)
The branch voltages and currents are related by
vj = ϕj (ij ), j = 1, . . . , b,
where ϕj is a given function that depends on the type of device j. We will assume that these
functions are continuous and nondecreasing. We give a few examples. If device j is a resistor with
resistance Rj > 0, we have ϕj (ij ) = Rj ij (which is called Ohm’s law). If device j is a voltage
source with voltage Vj and internal resistance rj > 0, we have ϕj (ij ) = Vj + rj ij . And for a more
interesting example, if device j is a diode, we have ϕj (ij ) = VT log(1 + ij /IS ), where IS and VT are
known positive constants.
(a) Find a method to solve the circuit equations, i.e., find v, i, and e that satisfy KCL, KVL,
and the branch equations, that relies on convex optimization. State the optimization problem
clearly, indicating what the variables are. Be sure to explain how solving the convex optimiza-
tion problem you propose leads to choices of the circuit variables that satisfy all of the circuit
equations. You can assume that no pathologies occur in the problem that you propose, for
example, it is feasible, a suitable constraint qualification holds, and so on.
Hint. You might find the function ψ : R^b → R,

    ψ(i_1, . . . , i_b) = Σ_{j=1}^b ∫_0^{i_j} ϕ_j(u_j) du_j,
useful.
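To see why, note that ∇ψ(i) = (ϕ_1(i_1), . . . , ϕ_b(i_b)). A two-branch sanity check, with a non-ideal source in series with a resistor (KCL forces i_1 = i_2 = I, and minimizing ψ along that line gives the operating point in closed form; the values are illustrative):

```python
# phi1(i) = V + r*i (source), phi2(i) = R*i (resistor); on the KCL line
# i1 = i2 = I we have psi(I) = V*I + r*I**2/2 + R*I**2/2, minimized where
# psi'(I) = V + (r + R)*I = 0.
V, r, R = 1000.0, 1.0, 100.0
I = -V / (r + R)
v1, v2 = V + r * I, R * I        # branch voltages from the device laws
assert abs(v1 + v2) < 1e-9       # KVL around the loop holds automatically
```

The sign of I only reflects the chosen reference direction for the branch currents.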
(b) Consider the circuit shown in the diagram below. Device 1 is a voltage source with parameters
V1 = 1000, r1 = 1. Devices 2 and 5 are resistors with resistance R2 = 1000, and R5 = 100
respectively. Devices 3 and 4 are identical diodes with parameters VT = 26, IS = 1. (The
units are mV, mA, and Ω.)
The nodes are labeled N1 , N2 , and N3 ; the ground node is at the bottom. The incidence
matrix A is
    A = [ 1  1  0  0  0
          0 −1  1  1  0
          0  0  0 −1  1 ].
(The reference direction for each edge is down or to the right.)
Use the method in part (a) to compute v, i, and e. Verify that all the circuit equations hold.
(Figure: the circuit; the source V1 connects the ground node to N1, R2 connects N1 and N2, D3 connects N2 to ground, D4 connects N2 and N3, and R5 connects N3 to ground.)
15 Signal processing and communications
15.1 FIR low-pass filter design. Consider the (symmetric, linear phase) finite impulse response (FIR)
filter described by its frequency response
    H(ω) = a_0 + Σ_{k=1}^N a_k cos(kω),
where ω ∈ [0, π] is the frequency. The design variables in our problems are the real coefficients
a = (a0 , . . . , aN ) ∈ RN +1 , where N is called the order or length of the FIR filter. In this problem
we will explore the design of a low-pass filter, with specifications:
• For 0 ≤ ω ≤ π/3, 0.89 ≤ H(ω) ≤ 1.12, i.e., the filter has about ±1dB ripple in the ‘passband’
[0, π/3].
• For ωc ≤ ω ≤ π, |H(ω)| ≤ α. In other words, the filter achieves an attenuation given by α in
the ‘stopband’ [ωc , π]. Here ωc is called the filter ‘cutoff frequency’.
(It is called a low-pass filter since low frequencies are allowed to pass, but frequencies above the
cutoff frequency are attenuated.) These specifications are depicted graphically in the figure below.
(Figure: the specifications; H(ω) must lie between 0.89 and 1.12 for ω in the passband [0, π/3], and between −α and α for ω in the stopband [ωc, π].)
For parts (a)–(c), explain how to formulate the given problem as a convex or quasiconvex optimiza-
tion problem.
(a) Maximum stopband attenuation. We fix ωc and N , and wish to maximize the stopband atten-
uation, i.e., minimize α.
(b) Minimum transition band. We fix N and α, and want to minimize ωc , i.e., we set the stopband
attenuation and filter length, and wish to minimize the ‘transition’ band (between π/3 and
ωc ).
(c) Shortest length filter. We fix ωc and α, and wish to find the smallest N that can meet the
specifications, i.e., we seek the shortest length FIR filter that can meet the specifications.
(d) Numerical filter design. Use CVX to find the shortest length filter that satisfies the filter
specifications with
ωc = 0.4π, α = 0.0316.
(The attenuation corresponds to −30dB.) For this subproblem, you may sample the constraints
in frequency, which means the following. Choose K large (say, 500; an old rule of thumb is that
K should be at least 15N ), and set ωk = kπ/K, k = 0, . . . , K. Then replace the specifications
with
• For k with 0 ≤ ωk ≤ π/3, 0.89 ≤ H(ωk ) ≤ 1.12.
• For k with ωc ≤ ωk ≤ π, |H(ωk )| ≤ α.
Plot H(ω) versus ω for your design.
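A small helper for evaluating H at the sample frequencies, which you could use to check a candidate coefficient vector against the sampled constraints (plain Python):

```python
import math

def H(a, w):
    """H(w) = a0 + sum_{k>=1} a_k cos(k w) for coefficient vector a = (a0, ..., aN)."""
    return a[0] + sum(ak * math.cos(k * w) for k, ak in enumerate(a[1:], start=1))

# Example: a = (0.5, 0.5) gives H(0) = 1 and H(pi) = 0.
assert abs(H([0.5, 0.5], 0.0) - 1.0) < 1e-12
assert abs(H([0.5, 0.5], math.pi)) < 1e-12
```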
15.2 SINR maximization. Solve the following instance of problem 4.20: We have n = 5 transmitters,
grouped into two groups: {1, 2} and {3, 4, 5}. The maximum power for each transmitter is 3, the
total power limit for the first group is 4, and the total power limit for the second group is 6. The
noise σ is equal to 0.5 and the limit on total received power is 5 for each receiver. Finally, the path
gain matrix is given by
    G = [ 1.0 0.1 0.2 0.1 0.0
          0.1 1.0 0.1 0.1 0.0
          0.2 0.1 2.0 0.2 0.2
          0.1 0.1 0.2 1.0 0.1
          0.0 0.0 0.2 0.1 1.0 ].
Find the transmitter powers p1 , . . . , p5 that maximize the minimum SINR ratio over all receivers.
Also report the maximum SINR value. Solving the problem to an accuracy of 0.05 (in SINR) is
fine.
Hint. When implementing a bisection method in CVX, you will need to check feasibility of a convex problem. You can do this using strcmpi(cvx_status, 'Solved').
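The bisection loop itself is solver-independent; a generic skeleton (the lambda below is a toy stand-in for the CVX feasibility check the hint describes):

```python
def bisect_max(feasible, lo, hi, tol=0.05):
    """Largest t with feasible(t) true, to within tol, assuming feasible(lo)
    holds and feasible(hi) fails (the usual quasiconvex bisection setup)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo

# Toy oracle: feasible iff t <= 3; bisection recovers 3 to within tol.
t_star = bisect_max(lambda t: t <= 3.0, 0.0, 10.0)
assert 3.0 - 0.05 <= t_star <= 3.0
```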
15.3 Power control for sum rate maximization in interference channel. We consider the optimization
problem
    maximize    Σ_{i=1}^n log( 1 + p_i / ( Σ_{j≠i} A_ij p_j + v_i ) )
    subject to  Σ_{i=1}^n p_i = 1
                p_i ≥ 0,  i = 1, . . . , n,
with variables p ∈ Rn . The problem data are the matrix A ∈ Rn×n and the vector v ∈ Rn .
We assume A and v are componentwise nonnegative (Aij ≥ 0 and vi ≥ 0), and that the diagonal
elements of A are equal to one. If the off-diagonal elements of A are zero (A = I), the problem
has a simple solution, given by the waterfilling method. We are interested in the case where the
off-diagonal elements are nonzero.
We can give the following interpretation of the problem, which is not needed below. The variables
in the problem are the transmission powers in a communications system. We limit the total power
to one (for simplicity; we could have used any other number). The ith term in the objective is
the Shannon capacity of the ith channel; the fraction in the argument of the log is the signal to
interference plus noise ratio.
We can express the problem as

    maximize    Σ_{i=1}^n log( Σ_{j=1}^n B_ij p_j / ( Σ_{j=1}^n B_ij p_j − p_i ) )
    subject to  Σ_{i=1}^n p_i = 1                                                (64)
                p_i ≥ 0,  i = 1, . . . , n,
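The equivalence of the two forms can be confirmed numerically. The check below takes B = A + v1^T, a natural choice (hypothesized here, since the text defining B is not shown) under which Σ_j B_ij p_j = Σ_j A_ij p_j + v_i whenever 1^T p = 1, and A_ii = 1 then makes the two objectives agree:

```python
import math, random

random.seed(3)
n = 4
A = [[1.0 if i == j else 0.5 * random.random() for j in range(n)] for i in range(n)]
v = [random.random() for _ in range(n)]
p = [random.random() + 0.1 for _ in range(n)]
s = sum(p)
p = [pi / s for pi in p]                       # feasible point: 1^T p = 1, p >= 0
B = [[A[i][j] + v[i] for j in range(n)] for i in range(n)]   # B = A + v 1^T

obj_orig = sum(
    math.log(1 + p[i] / (sum(A[i][j] * p[j] for j in range(n) if j != i) + v[i]))
    for i in range(n))
Bp = [sum(B[i][j] * p[j] for j in range(n)) for i in range(n)]
obj_64 = sum(math.log(Bp[i] / (Bp[i] - p[i])) for i in range(n))
assert abs(obj_orig - obj_64) < 1e-9           # the two objectives agree
```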
15.4 Radio-relay station placement and power allocation. Radio relay stations are to be located at posi-
tions x1 , . . . , xn ∈ R2 , and transmit at power p1 , . . . , pn ≥ 0. In this problem we will consider the
problem of simultaneously deciding on good locations and operating powers for the relay stations.
The received signal power Sij at relay station i from relay station j is proportional to the transmit
power and inversely proportional to the distance, i.e.,
    S_ij = α p_j / ∥x_i − x_j∥_2,
Sij ≥ βRij ,
where β > 0 is a known constant. (In other words, the minimum allowable received signal power
is proportional to the signal bit rate or bandwidth.)
The relay station positions xr+1 , . . . , xn are fixed, i.e., problem parameters. The problem variables
are x1 , . . . , xr and p1 , . . . , pn . The goal is to choose the variables to minimize the total transmit
power, i.e., p1 + · · · + pn .
Explain how to solve this problem as a convex or quasiconvex optimization problem. If you intro-
duce new variables, or transform the variables, explain. Clearly give the objective and inequality
constraint functions, explaining why they are convex. If your problem involves equality constraints,
express them using an affine function.
15.5 Power allocation with coherent combining receivers. In this problem we consider a variation on
the power allocation problem described on pages 4-13 and 4-14 of the notes. In that problem we
have m transmitters, each of which transmits (broadcasts) to n receivers, so the total number of
receivers is mn. In this problem we have the converse: multiple transmitters send a signal to each
receiver.
More specifically we have m receivers labeled 1, . . . , m, and mn transmitters labeled (j, k), j =
1, . . . , m, k = 1, . . . , n. The transmitters (i, 1), . . . , (i, n) all transmit the same message to the
receiver i, for i = 1, . . . , m.
Transmitter (j, k) operates at power pjk , which must satisfy 0 ≤ pjk ≤ Pmax , where Pmax is a given
maximum allowable transmitter power.
The path gain from transmitter (j, k) to receiver i is Aijk > 0 (which are given and known). Thus
the power received at receiver i from transmitter (j, k) is given by Aijk pjk .
For i ̸= j, the received power Aijk pjk represents an interference signal. The total interference-plus-
noise power at receiver i is given by
    I_i = Σ_{j≠i, k=1,...,n} A_ijk p_jk + σ,
where σ > 0 is the known, given (self) noise power of the receivers. Note that the powers of the
interference and noise signals add to give the total interference-plus-noise power.
The receivers use coherent detection and combining of the desired message signals, which means
the effective received signal power at receiver i is given by
    S_i = ( Σ_{k=1,...,n} (A_iik p_ik)^{1/2} )^2.
(Thus, the amplitudes of the desired signals add to give the effective signal amplitude.)
The total signal to interference-plus-noise ratio (SINR) for receiver i is given by γi = Si /Ii .
The problem is to choose transmitter powers pjk that maximize the minimum SINR mini γi , subject
to the power limits.
Explain in detail how to solve this problem using convex or quasiconvex optimization. If you
transform the problem by using a different set of variables, explain completely. Identify the objective
function, and all constraint functions, indicating if they are convex or quasiconvex, etc.
15.6 Antenna array weight design. We consider an array of n omnidirectional antennas in a plane, at
positions (xk , yk ), k = 1, . . . , n.
A unit plane wave with frequency ω is incident from an angle θ. This incident wave induces in the kth antenna element a (complex) signal exp(i(x_k cos θ + y_k sin θ − ωt)), where i = √−1. (For
simplicity we assume that the spatial units are normalized so that the wave number is one, i.e., the
wavelength is λ = 2π.) This signal is demodulated, i.e., multiplied by eiωt , to obtain the baseband
signal (complex number) exp(i(xk cos θ + yk sin θ)). The baseband signals of the n antennas are
combined linearly to form the output of the antenna array
    G(θ) = Σ_{k=1}^n w_k e^{i(x_k cos θ + y_k sin θ)}
         = Σ_{k=1}^n ( w_re,k cos γ_k(θ) − w_im,k sin γ_k(θ) ) + i Σ_{k=1}^n ( w_re,k sin γ_k(θ) + w_im,k cos γ_k(θ) ),
if we define γk (θ) = xk cos θ + yk sin θ. The complex weights in the linear combination,
wk = wre,k + iwim,k , k = 1, . . . , n,
are called the antenna array coefficients or shading coefficients, and will be the design variables
in the problem. For a given set of weights, the combined output G(θ) is a function of the angle
of arrival θ of the plane wave. The design problem is to select weights wi that achieve a desired
directional pattern G(θ).
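For later numerical experiments, G(θ) is cheap to evaluate directly from the complex weights; a minimal plain-Python helper:

```python
import cmath, math

def G(w, xs, ys, theta):
    """Array output G(theta) = sum_k w_k exp(i(x_k cos(theta) + y_k sin(theta)))."""
    return sum(wk * cmath.exp(1j * (x * math.cos(theta) + y * math.sin(theta)))
               for wk, x, y in zip(w, xs, ys))

# A single element at the origin with weight 1 has unit gain in every direction.
assert abs(G([1.0], [0.0], [0.0], 0.7) - 1.0) < 1e-12
```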
We now describe a basic weight design problem. We require unit gain in a target direction θtar ,
i.e., G(θtar ) = 1. We want |G(θ)| small for |θ − θtar | ≥ ∆, where 2∆ is our beamwidth. To do this,
we can minimize
    max{ |G(θ)| : |θ − θtar| ≥ ∆ },
where the maximum is over all θ ∈ [−π, π] with |θ − θtar | ≥ ∆. This number is called the sidelobe
level for the array; our goal is to minimize the sidelobe level. If we achieve a small sidelobe level,
then the array is relatively insensitive to signals arriving from directions more than ∆ away from
the target direction. This results in the optimization problem

    minimize    max{ |G(θ)| : |θ − θtar| ≥ ∆ }
    subject to  G(θtar) = 1,

with w ∈ C^n as variables.
The objective function can be approximated by discretizing the angle of arrival with (say) N values (say, uniformly spaced) θ1, . . . , θN over the interval [−π, π], and replacing the objective with

    max{ |G(θ_k)| : |θ_k − θtar| ≥ ∆ }.
Compute the optimal weights and make a plot of |G(θ)| (on a logarithmic scale) versus θ.
Hint. CVX can directly handle complex variables, and recognizes the modulus abs(x) of a
complex number as a convex function of its real and imaginary parts, so you do not need to
explicitly form the SOCP from part (a). Even more compactly, you can use norm(x,Inf)
with complex argument.
15.7 Power allocation problem with analytic solution. Consider a system of n transmitters and n re-
ceivers. The ith transmitter transmits with power xi , i = 1, . . . , n. The vector x will be the variable
in this problem. The path gain from each transmitter j to each receiver i will be denoted Aij and
is assumed to be known (obviously, Aij ≥ 0, so the matrix A is elementwise nonnegative, and
Aii > 0). The signal received by each receiver i consists of three parts: the desired signal, arriving from transmitter i with power Aii xi ; the interfering signal, arriving from the other transmitters with power ∑_{j≠i} Aij xj ; and noise βi (the βi are positive and known). We are interested in allocating the powers xi in such a way that the signal to noise plus interference ratio at each of the receivers exceeds a level α. (Thus α is the minimum acceptable SNIR for the receivers; a typical value might be around α = 3, i.e., about 5 dB.) In other words, we want to find x ⪰ 0 such that for i = 1, . . . , n

Aii xi ≥ α ( ∑_{j≠i} Aij xj + βi ).
These constraints can be expressed as

x ⪰ 0,  Bx ⪰ αβ, (65)

where β = (β1 , . . . , βn ) and B ∈ Rn×n is defined by Bii = Aii and Bij = −αAij for j ≠ i.
(a) Show that (65) is feasible if and only if B is invertible and z = B −1 1 ⪰ 0 (1 is the vector with
all components 1). Show how to construct a feasible power allocation x from z.
(b) Show how to find the largest possible SNIR, i.e., how to maximize α subject to the existence
of a feasible power allocation.
Hint. You may find the following result useful: for a matrix T with nonnegative elements and a scalar s, the following three conditions are equivalent:
(a) s > ρ(T ), where ρ(T ) = maxi |λi (T )| is the spectral radius of T .
(b) sI − T is nonsingular and the matrix (sI − T )−1 has nonnegative elements.
(c) there exists an x ⪰ 0 with (sI − T )x ≻ 0.
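A small numerical sketch of the construction in part (a), on a made-up 2 × 2 instance. It uses the fact that the SNIR constraints Aii xi ≥ α(∑_{j≠i} Aij xj + βi) rearrange to Bx ⪰ αβ with Bii = Aii and Bij = −αAij for j ≠ i; all numbers below are invented.

```python
# Hypothetical 2-transmitter instance (A, alpha, beta are made up).
A = [[2.0, 0.3], [0.4, 1.5]]
alpha, beta = 1.0, [0.1, 0.1]

# B encodes the SNIR constraints as Bx >= alpha*beta.
B = [[A[0][0], -alpha * A[0][1]],
     [-alpha * A[1][0], A[1][1]]]

# z = B^{-1} 1, via the closed-form 2x2 inverse.
det = B[0][0] * B[1][1] - B[0][1] * B[1][0]
z = [(B[1][1] - B[0][1]) / det, (B[0][0] - B[1][0]) / det]
assert min(z) >= 0  # the feasibility condition of part (a)

# Scale z so that Bx = t * (Bz) = t * 1 >= alpha*beta componentwise.
t = max(alpha * bi for bi in beta)
x = [t * zi for zi in z]
```

Since Bz = 1 by construction, any scaling t ≥ maxi αβi yields a feasible power allocation x = tz.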
15.8 Optimizing rates and time slot fractions. We consider a wireless system that uses time-domain
multiple access (TDMA) to support n communication flows. The flows have (nonnegative) rates
r1 , . . . , rn , given in bits/sec. To support a rate r on flow i requires transmitter power

p = ai (e^{br} − 1),
where b is a (known) positive constant, and ai are (known) positive constants related to the noise
power and gain of receiver i.
TDMA works like this. Time is divided up into periods of some fixed duration T (seconds). Each
of these T -long periods is divided into n time-slots, with durations t1 , . . . , tn , that must satisfy
t1 + · · · + tn = T , ti ≥ 0. In time-slot i, communications flow i is transmitted at an instantaneous
rate r = T ri /ti , so that over each T -long period, T ri bits from flow i are transmitted. The power
required during time-slot i is ai (ebT ri /ti − 1), so the average transmitter power over each T -long
period is
P = (1/T ) ∑_{i=1}^n ai ti (e^{bT ri /ti} − 1).
When ti is zero, we take P = ∞ if ri > 0, and P = 0 if ri = 0. (The latter corresponds to the case
when there is zero flow, and also, zero time allocated to the flow.)
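The average-power expression, including the ti = 0 convention, can be evaluated as follows; the values used are toy numbers, not exercise data.

```python
from math import exp, inf

def avg_power(a, b, T, r, t):
    """P = (1/T) sum_i a_i t_i (exp(b T r_i / t_i) - 1);
    P = inf if some t_i = 0 with r_i > 0 (zero slot but positive flow)."""
    total = 0.0
    for ai, ri, ti in zip(a, r, t):
        if ti == 0:
            if ri > 0:
                return inf
            continue  # r_i = 0: zero flow, zero time, zero power
        total += ai * ti * (exp(b * T * ri / ti) - 1)
    return total / T
```

For example, with a = (1, 1), b = T = 1, and equal rates and slots r = t = (1/2, 1/2), the average power is e − 1.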
The problem is to find rates r ∈ Rn and time-slot durations t ∈ Rn that maximize the log utility
function
U (r) = ∑_{i=1}^n log ri ,

subject to P ≤ P max . (This utility function is often used to ensure ‘fairness’; each communication
flow gets at least some positive rate.) The problem data are ai , b, T and P max ; the variables are ti
and ri .
(a) Formulate this problem as a convex optimization problem. Feel free to introduce new variables,
if needed, or to change variables. Be sure to justify convexity of the objective or constraint
functions in your formulation.
(b) Give the optimality conditions for your formulation. Of course we prefer simpler optimality
conditions to complex ones. Note: We do not expect you to solve the optimality conditions;
you can give them as a set of equations (and possibly inequalities).
Hint. With a log utility function, we cannot have ri = 0, and therefore we cannot have ti = 0;
therefore the constraints ri ≥ 0 and ti ≥ 0 cannot be active or tight. This will allow you to simplify
the optimality conditions.
15.9 Optimal jamming power allocation. A set of n jammers transmit with (nonnegative) powers
p1 , . . . , pn , which are to be chosen subject to the constraints
p ⪰ 0,  F p ⪯ g.

The jamming powers produce an interference power at each of m receivers, di = ∑_{j=1}^n Gij pj ,
where Gij is the (nonnegative) channel gain from jammer j to receiver i.
Receiver i has capacity (in bits/s) given by
Ci = α log(1 + βi /(σi² + di )), i = 1, . . . , m,
where α, βi , and σi are positive constants. (Here βi is proportional to the signal power at receiver i and σi² is the self-noise of receiver i, but you won’t need to know this to solve the problem.)
Explain how to choose p to minimize the sum channel capacity, C = C1 + · · · + Cm , using convex
optimization. (This corresponds to the most effective jamming, given the power constraints.) The
problem data are F , g, G, α, βi , σi .
If you change variables, or transform your problem in any way that is not obvious (for example, you
form a relaxation), you must explain fully how your method works, and why it gives the solution.
If your method relies on any convex functions that we have not encountered before, you must show
that the functions are convex.
Disclaimer. The teaching staff does not endorse jamming, optimal or otherwise.
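A direct evaluation of the sum capacity for given jamming powers looks like this; the instance below (2 jammers, 2 receivers) is made up.

```python
from math import log

def sum_capacity(p, G, alpha, beta, sigma):
    """C = sum_i alpha * log(1 + beta_i / (sigma_i^2 + d_i)),  d = G p."""
    C = 0.0
    for i in range(len(G)):
        d_i = sum(G[i][j] * p[j] for j in range(len(p)))
        C += alpha * log(1 + beta[i] / (sigma[i] ** 2 + d_i))
    return C

# Made-up instance with n = 2 jammers and m = 2 receivers.
G = [[1.0, 0.5], [0.2, 1.0]]
alpha, beta, sigma = 1.0, [1.0, 1.0], [1.0, 1.0]
C_off = sum_capacity([0.0, 0.0], G, alpha, beta, sigma)  # no jamming
C_on = sum_capacity([1.0, 1.0], G, alpha, beta, sigma)   # jamming reduces capacity
assert C_on < C_off
```

The optimization in the exercise chooses p, within the polyhedral constraints, to drive this quantity down.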
15.10 2D filter design. A symmetric convolution kernel with support {−(N − 1), . . . , N − 1}² is characterized by N² coefficients
hkl , k, l = 1, . . . , N.
These coefficients will be our variables. The corresponding 2D frequency response (Fourier transform) H : R2 → R is given by

H(ω1 , ω2 ) = ∑_{k,l=1}^{N} hkl cos((k − 1)ω1 ) cos((l − 1)ω2 ),
where ω1 and ω2 are the frequency variables. Evidently we only need to specify H over the region
[0, π]2 , although it is often plotted over the region [−π, π]2 . (It won’t matter in this problem, but
we should mention that the coefficients hkl above are not exactly the same as the impulse response
coefficients of the filter.)
We will design a 2D filter (i.e., find the coefficients hkl ) to satisfy H(0, 0) = 1 and to minimize the
maximum response R in the rejection region Ωrej ⊂ [0, π]2 ,

R = sup_{(ω1 ,ω2 )∈Ωrej} |H(ω1 , ω2 )|.
15.11 Maximizing log utility in a wireless system with interference. Consider a wireless network consisting
of n data links, labeled 1, . . . , n. Link i transmits with power Pi > 0, and supports a data rate
Ri = log(1 + γi ), where γi is the signal-to-interference-plus-noise ratio (SINR). These SINR ratios
depend on the transmit powers, as described below.
The system is characterized by the link gain matrix G ∈ R++^{n×n} , where Gij is the gain from the
transmitter on link j to the receiver for link i. The received signal power for link i is Gii Pi ; the
noise plus interference power for link i is given by
σi² + ∑_{j≠i} Gij Pj ,
where σi² > 0 is the receiver noise power for link i. The SINR is the ratio

γi = Gii Pi / (σi² + ∑_{j≠i} Gij Pj ).
The problem is to choose the transmit powers P1 , . . . , Pn , subject to 0 < Pi ≤ Pimax , in order to
maximize the log utility function
U (P ) = ∑_{i=1}^n log Ri .
i=1
(This utility function can be argued to yield a fair distribution of rates.) The data are G, σi² , and Pimax .
Formulate this problem as a convex or quasiconvex optimization problem. If you make any trans-
formations or use any steps that are not obvious, explain.
Hints.
• The function log log(1 + ex ) is concave. (If you use this fact, you must show it.)
• You might find the new variables defined by zi = log Pi useful.
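The first hint can be sanity-checked numerically; the snippet below tests midpoint concavity at random sample pairs. This is a spot check, not a proof (the exercise still asks you to show concavity).

```python
import random
from math import exp, log

def f(x):
    """f(x) = log(log(1 + exp(x))); the hint claims this is concave."""
    return log(log(1 + exp(x)))

# Midpoint-concavity spot check on random pairs.
random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    assert f((x + y) / 2) >= (f(x) + f(y)) / 2 - 1e-12
```

Note that log(1 + eˣ) > 0 for all x, so the outer log is always defined.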
15.12 Spectral factorization via semidefinite programming. A Toeplitz matrix is a matrix that has constant values on its diagonals. We use the notation

Tm (x1 , . . . , xm ) =
⎡ x1     x2     x3     · · ·   xm−1   xm   ⎤
⎢ x2     x1     x2     · · ·   xm−2   xm−1 ⎥
⎢ x3     x2     x1     · · ·   xm−3   xm−2 ⎥
⎢  ..     ..     ..      ..     ..     ..  ⎥
⎢ xm−1   xm−2   xm−3   · · ·   x1     x2   ⎥
⎣ xm     xm−1   xm−2   · · ·   x2     x1   ⎦

to denote the symmetric Toeplitz matrix in Sm constructed from x1 , . . . , xm . Consider the
semidefinite program
minimize cT x
subject to Tn (x1 , . . . , xn ) ⪰ e1 eT1 ,
with variable x = (x1 , . . . , xn ), where e1 = (1, 0, . . . , 0).
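A small helper that builds Tm (x1 , . . . , xm ), which may be useful when checking the structure or the results of parts (b) and (c) numerically:

```python
def toeplitz(x):
    """Symmetric Toeplitz matrix T_m(x_1,...,x_m): entry (i, j) is x_{|i-j|+1}."""
    m = len(x)
    return [[x[abs(i - j)] for j in range(m)] for i in range(m)]

T = toeplitz([1.0, 0.5, 0.2])
# First row is (x1, x2, x3); diagonals are constant and the matrix is symmetric.
assert T == [[1.0, 0.5, 0.2], [0.5, 1.0, 0.5], [0.2, 0.5, 1.0]]
```

In CVX* the same indexing pattern can be used to build Tn (x1 , . . . , xn ) from the variable x.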
(a) Derive the dual of the SDP above. Denote the dual variable as Z. (Hence Z ∈ Sn and the
dual constraints include an inequality Z ⪰ 0.)
(b) Show that Tn (x1 , . . . , xn ) ≻ 0 for every feasible x in the SDP above. You can do this by
induction on n.
• For n = 1, the constraint is x1 ≥ 1 which obviously implies x1 > 0.
• In the induction step, assume n ≥ 2 and that Tn−1 (x1 , . . . , xn−1 ) ≻ 0. Use a Schur
complement argument and the Toeplitz structure of Tn to show that Tn (x1 , . . . , xn ) ⪰ e1 eT1
implies Tn (x1 , . . . , xn ) ≻ 0.
(c) Suppose the optimal value of the SDP above is finite and attained, and that Z is dual optimal.
Use the result of part (b) to show that the rank of Z is at most one, i.e., Z can be expressed
as Z = yy T for some n-vector y. Show that y satisfies
15.13 Bandlimited signal recovery from zero-crossings. Let y ∈ Rn denote a bandlimited signal, which
means that it can be expressed as a linear combination of sinusoids with frequencies in a band:
yt = ∑_{j=1}^B ( aj cos((2π/n)(fmin + j − 1)t) + bj sin((2π/n)(fmin + j − 1)t) ),   t = 1, . . . , n,

where fmin is the lowest frequency in the band, B is the bandwidth, and a, b ∈ RB are the cosine and
sine coefficients, respectively. We are given fmin and B, but not the coefficients a, b or the signal y.
We do not know y, but we are given its sign s = sign(y), where st = 1 if yt ≥ 0 and st = −1
if yt < 0. (Up to a change of overall sign, this is the same as knowing the ‘zero-crossings’ of the
signal, i.e., when it changes sign. Hence the name of this problem.)
We seek an estimate ŷ of y that is consistent with the bandlimited assumption and the given signs.
Of course we cannot distinguish y and αy, where α > 0, since both of these signals have the same
sign pattern. Thus, we can only estimate y up to a positive scale factor. To normalize ŷ, we will
require that ∥ŷ∥1 = n, i.e., the average value of |ŷt | is one. Among all ŷ that are consistent with the
bandlimited assumption, the given signs, and the normalization, we choose the one that minimizes
∥ŷ∥2 .
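To see how the data in this problem arise, one can synthesize a bandlimited signal from made-up coefficients and take its signs (the coefficients below are invented; in the exercise a and b are unknown):

```python
from math import cos, sin, pi

def bandlimited(a, b, fmin, n):
    """y_t = sum_{j=1}^B a_j cos((2*pi/n)(fmin+j-1)t) + b_j sin((2*pi/n)(fmin+j-1)t)."""
    return [sum(a[j] * cos(2 * pi * (fmin + j) * t / n)      # python j = 0..B-1
                + b[j] * sin(2 * pi * (fmin + j) * t / n)    # plays the role of j-1
                for j in range(len(a)))
            for t in range(1, n + 1)]

y = bandlimited([1.0, 0.3], [0.2, -0.1], fmin=2, n=32)
s = [1 if yt >= 0 else -1 for yt in y]
# The sign pattern is invariant under positive scaling of y:
assert s == [1 if 5 * yt >= 0 else -1 for yt in y]
```

The last assertion illustrates why the normalization ∥ŷ∥1 = n is needed: the signs alone cannot determine the scale of y.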
(a) Show how to find ŷ using convex or quasiconvex optimization.
(b) Apply your method to the problem instance with data in zero_crossings_data.py. The
data files also include the true signal y (which of course you cannot use to find ŷ). Plot ŷ and
y, and report the relative recovery error, ∥y − ŷ∥2 /∥y∥2 . Give one short sentence commenting
on the quality of the recovery.
si = Gii pi / (σi² + ∑_{j≠i} Gij pj ).
The SINR si determines the data rate Ri (in bits/sec) that receiver i can receive, which has the
form Ri = α log(1 + si ), where α is a known positive constant. We will use system objective
R = mini Ri , i.e., the minimum data rate of any of the n receivers.
We wish to maximize R, while minimizing total system power P = p1 + · · · + pn . This is a
bi-objective problem.
(a) Explain how to compute the optimal trade-off curve of minimum rate R versus total power
P , using convex or quasiconvex optimization. (By computing the curve, we mean computing
a number of Pareto optimal points.) If you change variables in your formulation, be sure to
explain.
Hint. In addition to the usual scalarization, there are many ways to compute Pareto optimal
points for the bi-criterion problem above.
(b) Find and plot the optimal trade-off curve for the problem instance with data given in power_control_data.*,
with total power P on the horizontal axis. (It is enough to compute a few tens of points on
the Pareto curve. Be sure to check that the end-points of the Pareto curve make sense.)
15.15 Sparse blind deconvolution. We are given a time series observation y ∈ RT , and seek a filter (convolution kernel) w ∈ Rk , so that the convolution x = w ∗ y ∈ RT +k−1 is sparse after truncating the first and last k − 1 entries, i.e., xk:T = (xk , xk+1 , . . . , xT ) is sparse. Here ∗ denotes convolution,

xi = ∑_{j=1}^k wj yi−j+1 ,   i = 1, . . . , T + k − 1,

where we take yt = 0 for t outside {1, . . . , T }.
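The convolution and truncation conventions can be checked with a direct plain-Python implementation (in CVX* you would use the overloaded conv instead); the example kernel and signal are made up.

```python
def convolve(w, y):
    """Full convolution x = w * y of length T + k - 1:
    x_i = sum_j w_j y_{i-j+1} (1-indexed), with y_t = 0 outside 1..T."""
    k, T = len(w), len(y)
    x = [0.0] * (T + k - 1)
    for i in range(k):
        for j in range(T):
            x[i + j] += w[i] * y[j]
    return x

# A differencing kernel makes a piecewise-constant signal sparse:
assert convolve([1.0, -1.0], [2.0, 2.0, 2.0, 5.0]) == [2.0, 0.0, 0.0, 3.0, -5.0]
```

After dropping the first and last k − 1 = 1 entries, the result (0, 0, 3) is sparse, which is the behavior the exercise seeks from the optimal w.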
Interpretations. (These are not needed to solve the problem.) In signal processing dialect, we can
say that w is a filter which, when applied to the signal y, results in x, a simpler, sparse signal. As
a second interpretation, we can say that y = w−1 ∗ x, where w−1 is the convolution inverse of w,
defined as
w−1 = F −1 (1/F(w)),
where F is the discrete Fourier transform of length N = T + k and F −1 is its inverse transform. In
this interpretation, we can say that we have decomposed the signal into the convolution of a sparse
signal x and a signal with short (k-long) inverse, w−1 .
Carry out blind deconvolution on the signal given in blind_deconv_data.*. This file also defines
the kernel length k. Plot the optimal w and x, and also the given observation y. Also plot the inverse
kernel w−1 , using the function inverse_ker that we provided in blind_deconv_data.*.
Hint. The function conv(w,y) is overloaded to work with CVX*.
15.16 Recovering a time series corrupted by late reporting. We consider a scalar time series y1 , . . . , yT ,
where yt represents the total value of some quantity over time interval t. We will consider the case
where the quantities are nonnegative, i.e., yt ≥ 0.
Some of the raw data used to create the time series arrives late, causing it to be erroneously included
in the total for the next period. Let lt be the total amount that arrives late, t = 1, . . . , T − 1. That
is, lt is the total amount that should have been reported in period t, but ended up being reported
in period t + 1. With this late reporting, the time series we observe is ỹ1 , . . . , ỹT , with

ỹ1 = y1 − l1 ,   ỹt = yt − lt + lt−1 , t = 2, . . . , T − 1,   ỹT = yT + lT −1 .

We assume that ∑_{t=1}^{T −1} lt ≤ 0.1 ∑_{t=1}^T yt ,
i.e., a total of no more than 10% of the total quantity is reported late.
We observe the corrupted time series ỹ but not the true one y. The goal is to find an estimate ŷ
of the true time series y, which we do by minimizing a convex loss function ℓ : RT → R, where
smaller values of ℓ(ŷ) are more plausible than larger values. (For example, ℓ(ŷ) might be the
negative log-likelihood in a statistical model of y.)
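The late-reporting mechanism (an amount lt simply moves from period t to period t + 1) can be simulated directly; the series and late amounts below are made up. Note that the corruption preserves the total quantity.

```python
def corrupt(y, l):
    """Observed series when amount l_t is reported late (moves from period t to t+1)."""
    T = len(y)
    assert len(l) == T - 1 and min(l) >= 0
    ytil = list(y)
    for t in range(T - 1):
        ytil[t] -= l[t]
        ytil[t + 1] += l[t]
    return ytil

y = [10.0, 12.0, 9.0, 11.0]
l = [2.0, 0.0, 1.0]
ytil = corrupt(y, l)
assert sum(ytil) == sum(y)  # late reporting only shifts quantity between periods
```

This conservation property is one of the structural facts an estimator of y can exploit.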
the sum of squares of the second difference. Plot your estimated time series ŷ, as well as the corrupted ỹ and true y, which is given in the data file as y_true. Report the RMS error between the recovered and true time series, ∥ŷ − y∥2 /√T , and the RMS error between the true and perturbed time series, ∥ỹ − y∥2 /√T . (Of course in any practical application you would not have access to the true time series.)
15.17 Maximizing utility in a wireless network with interference. A wireless network consists of a set of
nodes and a set of m links (between nodes) over which data can be transmitted. There are n routes,
each corresponding to a sequence of links from a source to a destination node. Route j has a data
flow rate fj ∈ R+ (in units of bits per second, say). The goal is to maximize the total utility
U (f ) = ∑_{j=1}^n Uj (fj ),

where Uj are given concave utility functions, and the routes are described by a route matrix R ∈ {0, 1}m×n , with Rij = 1 if route j passes over link i and Rij = 0 otherwise.
The total traffic on a link is the sum of the flows that pass over the link, and can be written as
t = Rf ∈ Rm . The traffic on each link is constrained by the capacity of the link and by interference
from traffic on other links. The capacity constraint is simply t ⪯ c, where c ∈ Rm++ is the vector of
link capacities.
The interference is modeled by rate regions, which are convex regions in which a subset of mutually
interfering traffic values can lie. We will describe the rate regions as a single polyhedron, R = {t |
At ⪯ b}, with A ∈ Rp×m + and b ∈ Rp++ . Each row aTk t ≤ bk specifies a limit on some (nonnegative)
linear combination of the link traffic values. Typically, A is sparse, meaning that each link only
interferes with a few others.
The data in the problem are the utility functions Uj , the route matrix R, the link capacity c, and
the matrix A and vector b that define the rate regions.
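The quantities in the constraints can be computed directly for a candidate flow; the instance below (3 links, 2 routes, one interference row) is made up and is not the data file instance.

```python
def link_traffic(R, f):
    """Total traffic on each link: t = R f."""
    return [sum(Rij * fj for Rij, fj in zip(row, f)) for row in R]

# Hypothetical instance: route 1 uses links 1 and 2, route 2 uses links 2 and 3.
R = [[1, 0], [1, 1], [0, 1]]
f = [2.0, 3.0]
t = link_traffic(R, f)

c = [4.0, 6.0, 4.0]                 # capacity constraint t <= c
assert all(ti <= ci for ti, ci in zip(t, c))
A, b = [[0.5, 0.5, 0.0]], [4.0]     # one interference row a_k^T t <= b_k
assert all(sum(aj * tj for aj, tj in zip(row, t)) <= bk for row, bk in zip(A, b))
```

In the optimization, f is the variable and these two linear checks become the constraints of the convex problem.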
(a) Formulate the problem of finding the flow rates that maximize total network utility, subject
to the network’s interference and capacity constraints, as a convex optimization problem.
(b) Solve the instance of this problem with m = 20, n = 10, p = 8, Uj (x) = √x, with other data given in max_util_wireless_data.py.
given in max_util_wireless_data.py.
Report the optimal flow f ⋆ and the associated utility. List the links that operate at full
capacity, i.e., ti = ci . (You can determine this using a reasonable tolerance, such as ci − ti <
0.001).
(c) For comparison, solve the problem without the interference constraints. Report the optimal
flow and the associated utility. As above, list the links that operate at full capacity.
y = s ∗ a + v,

where ∗ denotes convolution, s ∈ RM is the known standard response of a neuron firing (an action potential obtained from the so-called Hodgkin-Huxley model, but you don’t need to know that), with M its length, a ∈ RN+ is the unknown activation, and v ∈ RT is the unknown noise signal.
The activation signal a is sparse and nonnegative, with at > 0 meaning that a neuron near the
probe fired at time t, with amplitude at . We have T = M + N − 1, the length of the convolution
s ∗ a. Written out explicitly, the model above is

yt = ∑_{τ=1}^M sτ at−τ+1 + vt ,   t = 1, . . . , T,

where we take ai = 0 for i outside {1, . . . , N }.
To estimate the activation we solve the problem

minimize    ∥s ∗ a − y∥2² + λ r(a)
subject to  a ⪰ 0,

where r : RN → R is a convex regularizer function and λ is a positive parameter that scales the regularization. The variable here is a; s and y are given. We let â denote a solution of this problem.
(a) Suggest a regularizer r for which the resulting a would typically be sparse. Then simplify r
using the fact that here we have a ⪰ 0.
(b) Carry out the method using the regularizer suggested in part (a), on the data given in
neural_signal_data.*. This defines synthetic data, which includes the true value atrue ,
which of course you would not have in a real problem. You may use the function visualize_data,
provided with the data, to plot the given y and the true neural signal s ∗ atrue . The data gives
T = 2199 samples, spaced 0.1 milliseconds apart, so the whole recording covers a little more
than 0.2 seconds. The true activation atrue has 10 nonzero entries.
Find â using λ = 2. Using the function visualize_estimate provided with the data, plot the
estimated and true activations, â and atrue , and the estimated and true values of the neural
signal, s ∗ â and s ∗ atrue .
Use the function find_nonzero_entries provided with the data to find the nonzero entries
in â (based on the threshold at > 0.01). How well do the nonzero entries in â and atrue match?
Hint. CVXPY supports convolution, in its conv atom. It returns a T × 1 matrix, so you may
need to use cp.conv(s,a).flatten() to obtain a vector.
Remark. We do not expect the true and estimated nonzero indices to match exactly; in
addition, what is really just one nonzero in the true activation can end up as a few contiguous
or nearby nonzero entries in the estimated activation.
(c) Polishing. Regularization has the effect of making a smaller than it would be without the
regularization. (In statistical estimation this phenomenon is called shrinkage, and is a desirable
feature. In this case, it is not.) A method called polishing can be used to improve the estimate
when this is not desirable.
To do this, we first solve the regularized problem in part (b) above. This gives us T = {τ | âτ > 0.01}, the set of times at which we think a neuron fires (with our threshold). Then we solve the
problem again, with no regularization, adding the explicit constraint that aτ = 0 for τ ̸∈ T .
Carry out this polishing procedure for the â found in part (b) to obtain âpol . Use the function
visualize_polished provided with the data to create the same plots as in part (b), i.e., âpol
and atrue and also s ∗ âpol and s ∗ atrue . Does polishing improve your estimate of the neural
signal?
Remark. A more sophisticated version of the polishing step can include logic that combines
adjacent or nearby nonzero entries in â into just one. Specifically, in this problem, biology
dictates that neurons cannot be activated twice within 1ms intervals, providing a natural
interval to combine adjacent nonzero entries. But we are not asking you to do this.
16 Control and trajectory optimization
16.1 Quickest take-off. This problem concerns the braking and thrust profiles for an airplane during
take-off. For simplicity we will use a discrete-time model. The position (down the runway) and the
velocity in time period t are pt and vt , respectively, for t = 0, 1, . . .. These satisfy p0 = 0, v0 = 0,
and pt+1 = pt + hvt , t = 0, 1, . . ., where h > 0 is the sampling time period. The velocity updates as

vt+1 = (1 − η)vt + h(ft − bt ), t = 0, 1, . . . ,

where η ∈ (0, 1) is a friction or drag parameter, ft is the engine thrust, and bt is the braking force,
at time period t. These must satisfy

0 ≤ bt ≤ B max ,   0 ≤ ft ≤ F max ,   |ft+1 − ft | ≤ S, t = 0, 1, . . . .
Here B max , F max , and S are given parameters. The initial thrust is f0 = 0. The take-off time is
T to = min{t | vt ≥ V to }, where V to is a given take-off velocity. The take-off position is P to = pT to ,
the position of the aircraft at the take-off time. The length of the runway is L > 0, so we must
have P to ≤ L.
(a) Explain how to find the thrust and braking profiles that minimize the take-off time T to ,
respecting all constraints. Your solution can involve solving more than one convex problem,
if necessary.
(b) Solve the quickest take-off problem with data
Plot pt , vt , ft , and bt versus t. Comment on what you see. Report the take-off time and
take-off position for the profile you find.
16.2 Optimal spacecraft landing. We consider the problem of optimizing the thrust profile for a spacecraft
to carry out a landing at a target position. The spacecraft dynamics are
mp̈ = f − mge3 ,
where m > 0 is the spacecraft mass, p(t) ∈ R3 is the spacecraft position, with 0 the target landing
position and p3 (t) representing height, f (t) ∈ R3 is the thrust force, and g > 0 is the gravitational
acceleration. (For simplicity we assume that the spacecraft mass is constant. This is not always
a good assumption, since the mass decreases with fuel use. We will also ignore any atmospheric
friction.) We must have p(T td ) = 0 and ṗ(T td ) = 0, where T td is the touchdown time. The
spacecraft must remain in a region given by

p3 (t) ≥ α ∥(p1 (t), p2 (t))∥2 ,

where α > 0 is a given minimum glide slope. The initial position p(0) and velocity ṗ(0) are given.
The thrust force f (t) is obtained from a single rocket engine on the spacecraft, with a given
maximum thrust; an attitude control system rotates the spacecraft to achieve any desired direction
of thrust. The thrust force is therefore characterized by the constraint ∥f (t)∥2 ≤ F max . The fuel
use rate is proportional to the thrust force magnitude, so the total fuel use is
Z T td
γ∥f (t)∥2 dt,
0
where γ > 0 is the fuel consumption coefficient. The thrust force is discretized in time, i.e., it is
constant over consecutive time periods of length h > 0, with f (t) = fk for t ∈ [(k − 1)h, kh), for
k = 1, . . . , K, where T td = Kh. Therefore we have
vk+1 = vk + (h/m)fk − hge3 , pk+1 = pk + (h/2)(vk + vk+1 ),
where pk denotes p((k−1)h), and vk denotes ṗ((k−1)h). We will work with this discrete-time model.
For simplicity, we will impose the glide slope constraint only at the times t = 0, h, 2h, . . . , Kh.
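The discretized dynamics can be simulated forward for any candidate thrust profile; as a sanity check, hovering thrust fk = (0, 0, mg) exactly cancels gravity and holds position. All numbers below are toy values, not the data file instance.

```python
def simulate(thrusts, p0, v0, h, m, g):
    """Forward-simulate v_{k+1} = v_k + (h/m) f_k - h g e3 and
    p_{k+1} = p_k + (h/2)(v_k + v_{k+1})."""
    p, v = list(p0), list(v0)
    for fk in thrusts:
        v_next = [v[0] + (h / m) * fk[0],
                  v[1] + (h / m) * fk[1],
                  v[2] + (h / m) * fk[2] - h * g]
        p = [p[i] + (h / 2) * (v[i] + v_next[i]) for i in range(3)]
        v = v_next
    return p, v

# Toy numbers: hovering thrust keeps the spacecraft at rest at height 50.
m, g, h = 2.0, 10.0, 0.5
p, v = simulate([(0.0, 0.0, m * g)] * 4, [0.0, 0.0, 50.0], [0.0, 0.0, 0.0], h, m, g)
assert v == [0.0, 0.0, 0.0] and p == [0.0, 0.0, 50.0]
```

In the optimization, the same recursions appear as linear equality constraints linking the variables pk , vk , and fk .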
(a) Minimum fuel descent. Explain how to find the thrust profile f1 , . . . , fK that minimizes fuel
consumption, given the touchdown time T td = Kh and discretization time h.
(b) Minimum time descent. Explain how to find the thrust profile that minimizes the touch-
down time, i.e., K, with h fixed and given. Your method can involve solving several convex
optimization problems.
(c) Carry out the methods described in parts (a) and (b) above on the problem instance with
data given in spacecraft_landing_data.py. Report the optimal total fuel consumption for
part (a), and the minimum touchdown time for part (b). The data files also contain plotting
code (commented out) to help you visualize your solution. Use the code to plot the spacecraft
trajectory and thrust profiles you obtained for parts (a) and (b).
Remarks. If you’d like to see the ideas of this problem in action, watch these videos:
• http://www.youtube.com/watch?v=2t15vP1PyoA
• https://www.youtube.com/watch?v=orUjSkc2pG0
• https://www.youtube.com/watch?v=1B6oiLNyKKI
• https://www.youtube.com/watch?v=ZCBE8ocOkAQ
16.3 Feedback gain optimization. A system (such as an industrial plant) is characterized by y = Gu + v,
where y ∈ Rn is the output, u ∈ Rn is the input, and v ∈ Rn is a disturbance signal. The matrix
G ∈ Rn×n , which is known, is called the system input-output matrix. The input signal u is found
using a linear feedback (control) policy: u = F y, where F ∈ Rn×n is the feedback (gain) matrix,
which is what we need to determine. From the equations given above, we have
y = (I − GF )−1 v, u = F (I − GF )−1 v.
(You can simply assume that I − GF will be invertible.)
The disturbance v is random, with E v = 0, E vv T = σ 2 I, where σ is known. The objective is to
minimize maxi=1,...,n E yi2 , the maximum mean square value of the output components, subject to
the constraint that E u2i ≤ 1, i = 1, . . . , n, i.e., each input component has a mean square value not
exceeding one. The variable to be chosen is the matrix F ∈ Rn×n .
(a) Explain how to use convex (or quasi-convex) optimization to find an optimal feedback gain
matrix. As usual, you must fully explain any change of variables or other transformations you
carry out, and why your formulation solves the problem described above. A few comments:
• You can assume that matrices arising in your change of variables are invertible; you do
not need to worry about the special cases when they are not.
• You can assume that G is invertible if you need to, but we will deduct a few points from
these answers.
(b) Carry out your method for the problem instance with data

σ = 1,   G = ⎡  0.3  −0.1  −0.9 ⎤
             ⎢ −0.6   0.3  −0.3 ⎥
             ⎣ −0.3   0.6   0.2 ⎦ .
16.4 Minimum time speed profile along a road. A vehicle of mass m > 0 moves along a road in R3 , which
is piecewise linear with given knot points p1 , . . . , pN +1 ∈ R3 , starting at p1 and ending at pN +1 . We
let hi = (pi )3 , the z-coordinate of the knot point; these are the heights of the knot points (above
sea-level, say). For your convenience, these knot points are equidistant, i.e., ∥pi+1 − pi ∥2 = d for all
i. (The points give an arc-length parametrization of the road.) We let si > 0 denote the (constant)
vehicle speed as it moves along road segment i, from pi to pi+1 , for i = 1, . . . , N , and sN +1 ≥ 0
denote the vehicle speed after it passes through knot point pN +1 . Our goal is to minimize the total
time to traverse the road, which we denote T .
We let fi ≥ 0 denote the total fuel burnt while traversing the ith segment. This fuel burn is turned
into an increase in vehicle energy given by ηfi , where η > 0 is a constant that includes the engine
efficiency and the energy content of the fuel. While traversing the ith road segment the vehicle is
subject to a drag force, given by CD s2i , where CD > 0 is the coefficient of drag, which results in an
energy loss dCD s2i .
We derive equations that relate these quantities via energy balance:
(1/2)msi+1² + mghi+1 = (1/2)msi² + mghi + ηfi − dCD si² ,   i = 1, . . . , N,
where g = 9.8 is the gravitational acceleration. The lefthand side is the total vehicle energy (kinetic
plus potential) after it passes through knot point pi+1 ; the righthand side is the total vehicle energy
after it passes through knot point pi , plus the energy gain from the fuel burn, minus the energy
lost to drag. To set up the first vehicle speed s1 requires an additional initial fuel burn f0 , with
ηf0 = (1/2)ms1² .
Fuel is also used to power the on-board system of the vehicle. The total fuel used for this purpose is fob , where ηfob = T P , and P > 0 is the (constant) power consumption of the on-board system.
We have a fuel capacity constraint: ∑_{i=0}^N fi + fob ≤ F , where F > 0 is the total initial fuel.
The problem data are m, d, h1 , . . . , hN +1 , η, CD , P , and F . (You don’t need the knot points pi .)
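The energy balance determines si+1 from si and fi , so for given fuel burns the speed profile follows by a forward recursion; the constants below are made up. On a flat segment, burning exactly the drag loss (ηf = dCD s²) keeps the speed constant.

```python
from math import sqrt

def next_speed(s_i, f_i, h_i, h_next, m, g, eta, d, CD):
    """Solve the energy balance
    (1/2) m s_{i+1}^2 + m g h_{i+1} = (1/2) m s_i^2 + m g h_i + eta f_i - d CD s_i^2
    for s_{i+1}."""
    e = 0.5 * m * s_i**2 + m * g * (h_i - h_next) + eta * f_i - d * CD * s_i**2
    assert e >= 0, "fuel burn too small to reach the next knot point"
    return sqrt(2 * e / m)

# Made-up constants; flat segment with eta*f equal to the drag loss d*CD*s^2.
m, g, eta, d, CD = 1000.0, 9.8, 0.4, 50.0, 0.5
f = 50.0 * 0.5 * 10.0**2 / 0.4
assert abs(next_speed(10.0, f, 0.0, 0.0, m, g, eta, d, CD) - 10.0) < 1e-9
```

The total traversal time is then T = ∑_i d/si , which is what the optimization over f0 , . . . , fN minimizes.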
(a) Explain how to find the fuel burn levels f0 , . . . , fN that minimize the time T , subject to the
constraints.
(b) Carry out the method described in part (a) for the problem instance with data given in
min_time_speed_data.m. Give the optimal time T ⋆ , and compare it to the time T unif achieved
if the fuel for propulsion were burned uniformly, i.e., f0 = · · · = fN . For each of these cases,
plot speed versus distance along the road, using the plotting code in the data file as a template.
16.5 Minimum time maneuver for a crane. A crane manipulates a load with mass m > 0 in two
dimensions using two cables attached to the load. The cables maintain angles ±θ with respect to
vertical; one cable pulls up and to the left of the load, and the other pulls up and to the right.
The (scalar) tensions T left and T right in the two cables are independently controllable, from 0 up
to a given maximum tension T max . The total force on the load is
F = T left (− sin θ, cos θ) + T right (sin θ, cos θ) + mg,
where g = (0, −9.8) is the acceleration due to gravity. The acceleration of the load is then F/m.
We approximate the motion of the load using

pi+1 = pi + hvi ,   vi+1 = vi + (h/m)Fi ,
where pi ∈ R2 is the position of the load, vi ∈ R2 is the velocity of the load, and Fi ∈ R2 is the
force on the load, at time t = ih. Here h > 0 is a small (given) time step.
The goal is to move the load, which is initially at rest at position pinit to the position pdes , also at
rest, in minimum time. In other words, we seek the smallest k for which pk = pdes and vk = 0 (starting from p1 = pinit and v1 = 0).
(a) Explain how to solve this problem using convex (or quasiconvex) optimization.
(b) Carry out the method of part (a) for the problem instance with
with time step h = 0.1. Report the minimum time k ⋆ . Plot the tensions versus time, and the
load trajectory, i.e., the points p1 , . . . , pk in R2 . Does the load move along the line segment
between pinit and pdes (i.e., the shortest path from pinit and pdes )? Comment briefly.
16.6 Planning an autonomous lane change. A vehicle is traveling down a highway with two lanes,
separated by L meters. At time t, its position is p(t) = (x(t), y(t)) ∈ R2+ . We require that
y(t) ∈ [0, L], for all t. When y(t) = 0, it means the vehicle is in lane 1, when y(t) = L, it means
the vehicle is in lane 2, and when 0 < y(t) < L, it means the vehicle is passing between lanes.
(Notice that since a lane on a highway has traffic moving in a single direction, we require that x(t)
is nondecreasing in t.)
For simplicity, we discretize the problem. We will consider the position of the vehicle every second,
so pt = (xt , yt ), t = 0, 1, . . . , T , denotes the vehicle’s position from 0 to T seconds (in particular,
this means pt = p(t)). Initially (t = 0), the vehicle lies in lane 1, and we assume x0 = 0. Between t
and t + 1 seconds, we assume the vehicle travels at constant speed, measured in meters per second
(m/s). The speed from time t to t + 1 is simply ∥pt+1 − pt ∥2 . We require that these speeds never
exceed S max (for example, the speed limit plus, say, 4 or 5 m/s).
The goal of this problem is to plan a lane change. In particular, after time T start , the vehicle may
initiate a lane change from lane 1, and by time T end , the vehicle should have fully entered lane 2.
The vehicle should always travel at a speed of at most S max , measured in meters per second.
Additionally, when the vehicle is not allowed to lane change (before T start and after T end ), the
vehicle must be driving with at least a given minimum speed, S min , which is also given. (You may
assume that T start and T end are integers.)
Your goal is to determine the smoothest possible lane change, subject to the constraints described
above. By smooth, we simply mean that you should minimize the total acceleration of the vehicle,
which can be approximated by
∑_{t=1}^{T −1} ∥(pt+1 − pt ) − (pt − pt−1 )∥2² .
(a) Explain how to plan this autonomous lane change using convex or quasiconvex optimization,
given T , T start , T end , S min , S max , and L. If you introduce new variables or make any trans-
formations you must justify them.
(b) Carry out this method on the data below,
T = 30, T start = 15, T end = 20, S min = 25, S max = 35, L = 3.7.
Produce a plot of the speed of the vehicle against time, as well as the position of the vehicle in
R2 for the plan you produce.
Remark. In fact, many highways have lanes separated by 3.7 meters. Additionally, on average,
a lane change for a vehicle on a standard US freeway takes 5 to 6 seconds, and the speed limits
we impose here correspond to a vehicle driving between 55 mph and 75 mph, which isn’t unreasonable for a standard US highway.
16.7 Optimal racing of an energy-limited vehicle. We have an energy-limited vehicle, such as a solar car,
moving along a fixed straight track. We’d like to design a control system to move the vehicle from
the starting point to the finishing point using minimum energy in the time interval [0, T ]. (There
are other related natural formulations of this problem, such as traversing the track in the minimum
time subject to a maximum energy usage. We will not consider these here, but the same techniques
are applicable.)
At time t the car has position x(t) ∈ R, velocity v(t) ∈ R and acceleration a(t) ∈ R. The car
starts with x(0) = 0 and v(0) = 0 and must finish with x(T ) ≥ xfinal .
At time t the kinetic energy of the vehicle is k(t) = (1/2)mv(t)^2 , where m is the mass. Let the power delivered from the battery to the drivetrain be p(t), which is nonnegative (there is no regenerative braking). Then
k̇(t) = p(t) − pbrake (t) − ploss (t)
where pbrake (t) ≥ 0 is an input that the control system (i.e., your optimization) chooses, and losses
due to drag are modeled via
ploss (t) = closs v(t)^3 .
Here closs is a positive constant that depends on the shape of the vehicle and the density of the air.
The vehicle must move according to the following requirements. Tire traction limits acceleration
so that v̇(t) ≤ amax . Note that there is no lower bound on the acceleration. The vehicle cannot
move backwards and must stay within the speed limit, and so 0 ≤ v(t) ≤ v max . The final velocity
of the vehicle must satisfy v(T ) ≤ v final .
We will use period h > 0 and sample position according to xi = x(ih), and similarly for veloc-
ity, acceleration and kinetic energy. The vehicle dynamics ẋ(t) = v(t) and v̇(t) = a(t) are then
discretized according to
x_{i+1} = x_i + (h/2)(v_i + v_{i+1} ),    v_{i+1} = v_i + h a_i ,
and the rate of change of kinetic energy is discretized according to
(1/h)(k_{i+1} − k_i ) = p_i − p_i^brake − p_i^loss .
We would like to minimize the total energy used, which is discretized as
E = h ∑_{i=0}^{n} p_i .
(a) Formulate this problem as an optimization problem with variables pi , xi , vi , ki (and others if
necessary) for i = 0, . . . , n. If this problem is not convex, explain briefly why.
(b) By relaxing the energy constraint k(t) = (1/2)mv(t)^2 to
k(t) ≥ (1/2)mv(t)^2 ,
state a convex optimization problem whose solution provides an optimal trajectory x, v, p,
and k for part (a). Explain why the relaxation is tight. By tight, we mean that the solution
to your problem has the same optimal value as that of part (a).
(c) Carry out your method from part (b). Report the optimal value of the total energy E. Plot
the position x, velocity v and power used p of the vehicle as functions of time.
16.8 Well that was a bit roundabout. You’re late for the last lecture of Convex Optimization and you need to get to the lecture hall. You get on your bike, and proceed directly to class.
At time t the bike has position x(t) ∈ R2 , velocity v(t) ∈ R2 , and acceleration a(t) ∈ R2 . You
start at position x(0) = xinitial and finish at time T with x(T ) = xfinal . Your initial velocity is
v(0) = v initial .
We will use period h > 0 and sample position according to x_i = x(ih), and similarly for velocity and acceleration. Fortunately your bicycle is a point mass, and so the vehicle dynamics are ẋ(t) = v(t) and v̇(t) = a(t). These are then discretized, as in the previous problem, according to
x_{i+1} = x_i + (h/2)(v_i + v_{i+1} ),    v_{i+1} = v_i + h a_i .
Despite your desire to arrive at class on time, you cycle somewhat leisurely, avoiding unnecessary
exertion. So you choose to minimize
J = h ∑_{i=0}^{n} ∥a_i ∥_2^2 ,
where T = nh. We have
x^initial = (−5, 0),    x^final = (6, 1),    v^initial = (2, 0),    T = 12,    h = 0.1.
(a) Find and plot the optimal trajectory of the bicycle. Report the optimal value J.
(b) Unfortunately, somebody has built a roundabout in the way. The roundabout is a disk of
radius 1 centered at the origin
R = {x ∈ R2 | ∥x∥2 ≤ 1}
The constable observing your path advises you that he fears that your trajectory has the
unfortunate property of failing to correctly circumnavigate the roundabout.
Unfortunately, the constraint that you should avoid the roundabout is not convex. After
considering this, you arrive at a new strategy. Let the previous solution be xprev . You
construct a new optimization problem, where at each time step i you add the constraint that
c_i^T x_i ≥ 1, where c_i = x_i^prev /∥x_i^prev ∥_2 .
Give a brief interpretation of these constraints. Solve the optimization problem again, with
these new constraints. Plot the optimal trajectory and report the optimal cost.
(c) Repeat part (b) until the trajectory converges. Plot the final trajectory along with the trajectories from parts (a), (b) and the roundabout R. Note that each optimization only uses
constraints generated by the previous solution. What is the final cost J achieved?
16.9 Path planning with contingencies. A vehicle path down a (straight, for simplicity) road is specified
by a vector p ∈ RN , where pi gives the position perpendicular to the centerline at the point ih
meters down the road, where h > 0 is a given discretization size. (Throughout this problem, indexes
on N -vectors will correspond to positions on the road.) We normalize p so −1 ≤ pi ≤ 1 gives the
road boundaries. (We are modeling the vehicle as a point, by adjusting for its width.) You are
given the initial two positions p1 = a and p2 = b (which give the initial road position and angle),
as well as the final two positions pN −1 = c and pN = d.
You know there may be an obstruction at position i = O. This will require the path to either
go around the obstruction on the left, which requires pO ≥ 0.5, or on the right, which requires
pO ≤ −0.5, or possibly the obstruction will clear, and the obstruction does not place any additional
constraint on the path. These are the three contingencies in the problem title, which we label as
k = 1, 2, 3.
You will plan three paths for these contingencies, p(i) ∈ RN for i = 1, 2, 3. They must each
satisfy the given initial and final two road positions and the constraint of staying within the road
boundaries. Paths p(1) and p(2) must satisfy the (different) obstacle avoidance constraints given
above. Path p(3) does not need to satisfy an avoidance constraint.
Now we add a twist: You will not learn which of the three contingencies will occur until the vehicle
arrives at position i = S, when the sensors will determine which contingency holds. We model this
with the information constraints (also called causality constraints or non-anticipatory constraints),
p_i^(1) = p_i^(2) = p_i^(3) ,    i = 1, . . . , S,
which state that before you know which contingency holds, the three paths must be the same.
The objective to be minimized is
∑_{k=1}^{3} ∑_{i=2}^{N−1} (p_{i−1}^(k) − 2p_i^(k) + p_{i+1}^(k) )^2 ,
the sum of the squares of the second differences, which gives smooth paths.
16.10 Control with various objectives. We consider a standard optimal control problem, with dynamics
xt+1 = Axt + But , t = 0, 1, . . . , T − 1. Here xt ∈ Rn is the state, and ut ∈ Rm is the control or
input, at time period t, A ∈ Rn×n is the dynamics matrix, and B ∈ Rn×m is the input matrix.
We are given the initial state, x0 = xinit , and we require that the final state be zero, xT = 0. (In
applications, the state 0 corresponds to some desirable state.) Your job is to choose the sequence
of inputs u0 , . . . , uT −1 that minimize an objective. Values for xinit , A, B, and T are given in
various_obj_regulator_data.*.
We consider various objectives, all of which measure the size of the inputs (or, in control dialect,
the control effort).
(a) Sum of squares of 2-norms: ∑_{t=0}^{T−1} ∥u_t ∥_2^2 . This is the traditional objective.
(b) Sum of 2-norms: ∑_{t=0}^{T−1} ∥u_t ∥_2 .
(c) Max of 2-norms: max_{t=0,...,T−1} ∥u_t ∥_2 .
(d) Sum of 1-norms: ∑_{t=0}^{T−1} ∥u_t ∥_1 . In some applications this is an approximation of the fuel use.
For each objective, plot (the components of) optimal input, as well as ∥ut ∥2 , versus t. Make a
very brief comment on each plot of optimal control inputs, explaining why you might expect what
happened.
16.11 Multi-period liability clearing. We consider a financial system with n financial entities or agents,
such as banks, who make payments to each other over discrete time periods t = 1, 2, . . .. We let
ct ∈ Rn+ denote the cash held at the beginning of time period t, where (ct )i is the amount held by
the ith entity in dollars.
We let L_t ∈ R_+^{n×n} denote the liabilities between the entities at the beginning of time period t, where (L_t)_{ij} is the amount in dollars that entity i owes entity j. You can assume that (L_t)_{ii} = 0, i.e., the entities do not owe anything to themselves. Note that L_t 1 is the vector of total liabilities of the entities, i.e., the total amounts owed to other entities, and L_t^T 1 is the vector of total amounts owed to the entities by others, at time period t.
We let P_t ∈ R_+^{n×n} denote the amounts paid between the entities during time period t, where (P_t)_{ij} is the amount, in dollars, paid from entity i to entity j. Thus P_t 1 is the vector of total cash payments made by the entities to others in period t (i.e., (P_t 1)_i is the total payment from entity i to all other entities), and P_t^T 1 is the vector of total cash received by the entities from others in period t.
The liabilities and cash follow the dynamics
Lt+1 = Lt − Pt , t = 1, 2, . . . ,
ct+1 = ct − Pt 1 + PtT 1, t = 1, 2, . . . .
Each entity cannot pay more than the cash that it has on hand, so we have the constraint
Pt 1 ⪯ ct , t = 1, 2, . . . .
We are given the initial cash held c1 and the initial liabilities L1 . You can assume that for each
entity, the cash held plus the cash owed to it are at least as much as the amount it owes, i.e.,
c_1 − L_1 1 + L_1^T 1 ⪰ 0.
(a) Minimum time to clear the liabilities. Explain how to find the minimum T for which there is a
feasible sequence of payments P1 , . . . , PT −1 that results in LT = 0. (Reducing the liabilities to
zero is called clearing them.) Your method can involve solving a reasonable number of convex
problems.
(b) Carry out the method of part (a) on the data given in clearing_data.*.
16.12 Lyapunov analysis of a dynamical system. We consider a discrete-time time-varying linear dynami-
cal system with state xt ∈ Rn . The state propagates according to the linear recursion xt+1 = At xt ,
for t = 0, 1, . . ., where the matrices At are unknown but satisfy At ∈ A = {A(1) , . . . , A(K) }, where
A(1) , . . . , A(K) are known. (In computer science, this would be called a non-deterministic linear
automaton.) We call the sequence x0 , x1 , . . . a trajectory of the system. There are infinitely many
trajectories, one for each sequence A0 , A1 , . . ..
The Lyapunov exponent κ of the system is defined as
κ = sup_{A_0 ,A_1 ,...} lim sup_{t→∞} ∥x_t ∥_2^{1/t} .
(If you don’t know what sup and lim sup mean, you can replace them with max and lim, respec-
tively.) Roughly speaking, this means that all trajectories grow no faster than κt . When κ < 1,
the system is called exponentially stable.
It is a hard problem to determine the Lyapunov exponent of the system, or whether the system is
exponentially stable, given the data A(1) , . . . , A(K) . In this problem we explore a powerful method
for computing an upper bound on the Lyapunov exponent.
(a) Let P ∈ S_{++}^n , and define V(x) = x^T P x. Suppose that, for some γ > 0, V(Ax) ≤ γ^2 V(x) holds for all x ∈ R^n and all A ∈ A. Show that κ ≤ γ. Thus γ is an upper bound on the Lyapunov exponent κ. (The function V is called a quadratic Lyapunov function for the system.)
(b) Explain how to use convex or quasiconvex optimization to find a matrix P ∈ Sn++ with the
smallest value of γ, i.e., with the best upper bound on κ. You must justify your formulation.
(c) Carry out the method of part (b) for the specific problem with data given in lyap_exp_bound_data.m.
Report the best upper bound on κ, to a tolerance of 0.01. The data A(i) are given as a cell
array; A{i} gives A(i) .
(d) Approximate worst-case trajectory simulation. The quadratic Lyapunov function found in part (c) can be used to generate sequences of A_t that tend to result in large values of ∥x_t ∥_2^{1/t} . Start from a random vector x_0 . At each t, generate x_{t+1} by choosing A_t = A^(i) that maximizes V(A^(i) x_t ), where P is computed from part (c). Do this for 50 time steps, and generate 5 such trajectories. Plot ∥x_t ∥_2^{1/t} and γ against t to verify that the bound you obtained in the previous part is valid. Report the lower bound on the Lyapunov exponent that the trajectories suggest.
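Part (d) requires only simulation, no optimization; a NumPy sketch follows, with made-up matrices A^(1), A^(2) and P = I standing in for the actual data and the P found in part (c).

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up stand-ins for the data and for the Lyapunov matrix from part (c).
A_list = [np.array([[0.9, 0.3], [0.0, 0.8]]),
          np.array([[0.8, 0.0], [0.4, 0.9]])]
P = np.eye(2)

def V(z):
    return z @ P @ z

x = rng.standard_normal(2)
norms = []
for t in range(1, 51):
    x = max((A @ x for A in A_list), key=V)   # greedy: maximize V(A x)
    norms.append(np.linalg.norm(x) ** (1.0 / t))
kappa_lb = max(norms[-10:])   # crude lower-bound estimate from the tail
```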
16.13 Optimal evacuation planning. We consider the problem of evacuating people from a dangerous area
in a way that minimizes risk exposure. We model the area as a connected graph with n nodes and
m edges; people can assemble or collect at the nodes, and travel between nodes (in either direction)
over the edges. We let qt ∈ Rn+ denote the vector of the numbers of people at the nodes, in time
period t, for t = 1, . . . , T , where T is the number of periods we consider. (We will consider the
entries of qt as real numbers, not integers.) The initial population distribution q1 is given. The
nodes have capacity constraints, given by qt ⪯ Q, where Q ∈ Rn+ is the vector of node capacities.
We use the incidence matrix A ∈ R^{n×m} to describe the graph. We assign an arbitrary reference direction to each edge, and take A_{ij} = +1 if edge j enters node i, −1 if edge j exits node i, and 0 otherwise.
The flow of people along the edges in time period t is given by f_t ∈ R^m , for t = 1, . . . , T − 1, so that the dynamics are q_{t+1} = q_t + A f_t . Positive flow denotes movement in the direction of the edge; negative flow denotes population flow in the reverse direction. Each edge has a capacity, i.e., |f_t | ⪯ F , where F ∈ R_+^m is the vector of edge capacities, and |f_t | denotes the elementwise absolute value of f_t .
An evacuation plan is a sequence q1 , q2 , . . . , qT and f1 , f2 , . . . , fT −1 obeying the constraints above.
The goal is to find an evacuation plan that minimizes the total risk exposure, defined as
R_tot = ∑_{t=1}^{T} (r^T q_t + s^T q_t^2 ) + ∑_{t=1}^{T−1} (r̃^T |f_t | + s̃^T f_t^2 ),
where r, s ∈ R_+^n are given vectors of risk exposure coefficients associated with the nodes, and r̃, s̃ ∈ R_+^m are given vectors of risk exposure coefficients associated with the edges. The notation q_t^2 and f_t^2 refers to the elementwise squares of the vectors. Roughly speaking, the risk exposure is a quadratic function of the occupancy of a node, or the (absolute value of the) flow of people along an edge. The linear terms can be interpreted as the risk exposure per person; the quadratic terms can be interpreted as the additional risk associated with crowding.
A subset of nodes have zero risk (ri = si = 0), and are designated as safe nodes. The population is
considered evacuated at time t if r^T q_t + s^T q_t^2 = 0. The evacuation time t_evac of an evacuation plan
is the smallest such t. We will assume that T is sufficiently large and that the total capacity of the
safe nodes exceeds the total initial population, so evacuation is possible.
Use CVX* to find an optimal evacuation plan for the problem instance with data given in opt_evac_data.*.
(We display the graph below, with safe nodes denoted as squares.)
[Figure: the evacuation graph, with numbered nodes and edges; the safe nodes are drawn as squares.]
Plot the node and edge components of the risk versus time. (For t = T , you can take the edge risk to be zero.) Plot the node occupancies q_t and edge flows f_t versus time. Briefly comment on the results you see. Give the evacuation time t_evac (considering any r^T q_t + s^T q_t^2 ≤ 10^{−4} to be zero).
Hint. With CVXPY, use the ECOS solver with p.solve(solver=cvxpy.ECOS).
16.14 Dual of an optimal control problem. We consider the optimal control problem
minimize    ∑_{t=1}^{T} (1/2)∥u_t ∥^2
subject to  x_{t+1} = Ax_t + Bu_t ,  t = 1, . . . , T
            x_1 = x^init ,  x_{T+1} = x^term ,
with variables x1 , . . . , xT +1 ∈ Rn (the state trajectory) and u1 , . . . , uT ∈ Rm (the input or control
trajectory). The matrices A ∈ Rn×n and B ∈ Rn×m are given, as are the initial state xinit ∈ Rn
and final or terminal state xterm ∈ Rn . The norm appearing in the objective is an arbitrary norm
on Rm , with dual norm denoted ∥ · ∥∗ .
We will use ν1 , . . . , νT ∈ Rn as the dual variables associated with the dynamics equality constraints
xt+1 = Axt + But , t = 1, . . . , T . We associate the dual variable ν0 ∈ Rn with the initial state
constraint x1 = xinit , and the dual variable νT +1 ∈ Rn with the terminal state constraint xT +1 =
xterm .
(a) Give the Lagrangian for this optimal control problem, using the dual variables ν0 , . . . , νT +1
described above.
(b) Derive the dual function g. Be sure to specify its domain, i.e., conditions on the dual variables
under which g(ν_0 , . . . , ν_{T+1} ) > −∞. Hint. The conjugate of (1/2)∥ · ∥^2 is (1/2)∥ · ∥_*^2 .
(c) Give the Lagrange dual of the optimal control problem. Express the implicit constraints in g
(i.e., its domain) as explicit constraints.
16.15 Robust state feedback control. We consider a continuous time linear dynamical system with state x(t) ∈ R^n and input u(t) ∈ R^m for t ∈ R_+ and dynamics
ẋ(t) = A(t)x(t) + Bu(t),
where A(t) ∈ conv{A_1 , . . . , A_k }, with A_1 , . . . , A_k ∈ R^{n×n} and B ∈ R^{n×m} given. (The dot means
the derivative with respect to time t.) Note that the trajectory is not fully specified since we do
not know what A(t) is, other than it lies in the convex hull of A1 , . . . , Ak . (This is sometimes called
a differential inclusion, since the state derivative ẋ lies in a given set.)
We will use linear state feedback, u(t) = Kx(t), where the matrix K ∈ R^{m×n} is the state feedback gain. The so-called closed-loop dynamics is then
ẋ(t) = (A(t) + BK)x(t).
We seek K for which all trajectories of the closed-loop system converge to zero exponentially with
rate ω > 0, i.e., ∥x(t)∥2 ≤ M e−ωt , where M can depend on x(0). The rate ω is given.
A sufficient condition for this can be derived using Lyapunov theory. If there exists P ∈ Sn++ that
satisfies
(A_i + BK)^T P + P(A_i + BK) + ωP ⪯ 0,    i = 1, . . . , k,
then all trajectories of the closed-loop system converge to zero exponentially with rate ω. (Showing
this is not hard, but not part of this problem.) In addition, we impose a limit on the condition
number of P , κ(P ) ≤ κmax , where κmax > 1 is given and κ(P ) = λmax (P )/λmin (P ).
The data in the problem are A1 , . . . , Ak , B, κmax , and ω. You are to find K and P . (This is a
feasibility problem which can have multiple solutions; you only need to find one.)
(a) Explain how to find K and P using convex optimization. Justify any change of variables or
other problem transformations. Explain how you handle the constraints P ≻ 0 and κ(P ) ≤
κmax . You can assume that the problem is feasible.
(b) Carry out the method of part (a) with the data given in rob_state_fdbk_data.py. Give K
and P .
In addition, to verify that the conditions above hold, give:
• the minimum eigenvalue of P (to verify that P ≻ 0);
• the condition number of P (to verify it does not exceed κmax ); and
• the maximum eigenvalues of the k matrices (A_i + BK)^T P + P(A_i + BK) + ωP (to verify
that the matrix inequalities hold).
Remark. You don’t need to know anything about differential equations, control systems, or Lya-
punov theory to solve this problem.
16.16 Flight simulator. A flight simulator moves in a bounded region, attempting to recreate the pilot’s
feel of the motion of an airplane. In this problem we consider a simple 1D version of this problem.
(Real flight simulators involve 6 degrees of freedom.)
Let p^ref ∈ R^T denote the (1D) position of the airplane at time periods t = 1, . . . , T , where time periods are spaced h > 0 seconds apart. The velocity is given by v^ref ∈ R^{T−1} ,
v_t^ref = (p_{t+1}^ref − p_t^ref )/h,    t = 1, . . . , T − 1,
the acceleration a^ref ∈ R^{T−2} is
a_t^ref = (v_{t+1}^ref − v_t^ref )/h,    t = 1, . . . , T − 2,
and the jerk (third derivative) j^ref ∈ R^{T−3} is
j_t^ref = (a_{t+1}^ref − a_t^ref )/h,    t = 1, . . . , T − 3.
You will choose the position of the simulator, p ∈ R^T , which must satisfy ∥p∥_∞ ≤ P^max , where P^max is given. We let v, a, and j denote the velocity, acceleration, and jerk of the simulator, defined in the same way as above for the reference. A person’s sense of motion is sensitive to jerk, which motivates the simple least squares penalty
∑_{t=1}^{T−3} (j_t − j_t^ref )^2 .
(a) Explain how to solve this problem using convex optimization, i.e., find the simulator position
trajectory p ∈ RT that minimizes the penalty described above subject to the constraints
described above.
(b) Solve the problem with data given in flight_simulator_data.py. Plot the optimal simulator
position p⋆ , and compare it to the reference position pref . Plot the jerk of the simulator and
the reference on the same plot.
16.17 Approximate worst-case initial state in optimal control. Consider the convex optimal control problem
minimize    ∑_{t=1}^{T} g(x_t , u_t )
subject to  x_{t+1} = Ax_t + Bu_t ,  t = 1, . . . , T − 1,  x_1 = z
            u_t ∈ U,  t = 1, . . . , T,
with variables x1 , . . . , xT ∈ Rn (the state trajectory) and u1 , . . . , uT ∈ Rm (the input trajectory).
The stage cost g : Rn × Rm → R is convex, and the input constraint set U ⊂ Rm is convex.
The matrices A ∈ Rn×n and B ∈ Rn×m are given. We let F : Rn → R be the optimal cost
of the control problem as a function of the initial state z, i.e., F (z) is the optimal value of the
control problem. (This function is also called the value function or Bellman function in dynamic
programming.) You can assume that F is differentiable.
We want to approximately solve the problem of finding the worst initial state in some given convex
set C ⊂ Rn , i.e., we wish to solve the problem
maximize F (z)
subject to z ∈ C,
with variable z ∈ Rn .
You can assume that the input constraint set U and the set of possible initial states C are described via sets of convex constraints.
(a) Explain how to use the convex-concave procedure, outlined in additional exercise A4.44, to
approximately solve the worst-case initial state problem above. Be sure to explain how you
compute the gradient of F .
(b) Carry out the method of part (a) on the problem instance with n = 3, m = 1, T = 50,
and
A = [0.98, 0.06, −0.12; −0.02, 0.95, 0.28; 0.14, −0.27, 0.94],    B = [0.3; 0.6; 0.1].
For each of five random initializations of z, plot F (z k ) versus k, where k denotes iteration
of the convex-concave procedure. Report the worst-case cost (the maximum of the five final
values of your convex-concave method), and the associated worst-case initial condition.
Plot the state and input trajectories of the optimal trajectory for the (approximate) worst-case
initial state you find. You can give the state coordinates (xt )1 , (xt )2 , (xt )3 , and the input ut ,
versus t, on the same plot.
16.18 Smoothest ride through a set of green lights. We will design the smoothest trajectory of a car
moving along a road, making a sequence of green lights, with given initial and terminal conditions.
Let pt ∈ R denote the position along the road at time period t = 0, 1, . . . , T . We are given the
initial condition p0 = 0 and the terminal condition pT = L, where L is the total length of the route
along the road.
We define the speed as st = pt+1 − pt , t = 0, . . . , T − 1, the acceleration as at = st+1 − st ,
t = 0, . . . , T −2, and the jerk (third derivative in continuous time) as jt = at+1 −at , t = 0, . . . , T −3.
Our objective is to minimize the mean square jerk of the trajectory, given by
J = (1/(T − 2)) ∑_{t=0}^{T−3} j_t^2 .
We are given a minimum and maximum speed, i.e., S min ≤ st ≤ S max , t = 1, . . . , T − 1. We have
S min > 0, which means the car always moves forward, and never backs up. We assume that the
car starts at rest, i.e., s0 = 0.
The twist is that the car must pass through a given set of K green lights. Each one is specified by
lk ∈ (0, L), the distance along the road to the stoplight, the time it turns green gk ∈ {0, . . . , T },
and the time it turns red rk ∈ {0, . . . , T } (both gk and rk are integers). We have rk > gk , with
rk − gk the total time the kth stoplight is green.
Now we explain what it means to make a green light. Note that pt is an increasing sequence which
starts at 0 and ends at L. Let us extend the sequence p to a piecewise linear function p̃ of the
continuous variable τ ∈ [0, T ]. Thus p̃τ is an increasing continuous function with p̃0 = 0 and
p̃T = L. Let Tk be the unique τ ∈ (0, T ) for which p̃τ = lk , i.e., the (continuous time) moment
when the car arrives at the kth light. To say that we make the lights means
gk ≤ Tk ≤ rk , k = 1, . . . , K,
i.e., the car arrives at the light no earlier than the light turns green, and no later than when it
turns red.
This is illustrated in the plot below. In this problem instance, we are given T = 300 s, K = 5, L =
3000 m, S min = 4 m/s, S max = 16 m/s, l = (300, 825, 1620, 1900, 2800), g = (10, 50, 100, 200, 240),
and r = (40, 80, 130, 230, 270). The plot shows the position of the car versus time.
Remark. Note that for this example, the car doesn’t minimize the jerk, but rather takes a subop-
timal path to make the lights. It also violates the maximum speed constraint. Such a path is not
smooth, and would result in an uncomfortable ride for the passengers.
[Figure: position of the car (meters) versus time (seconds) for the example trajectory, with the light positions l_1 , . . . , l_5 , the arrival times T_1 , . . . , T_5 , and the green intervals [g_k , r_k ] marked.]
(a) Explain how to transform this problem to a convex or quasiconvex one. Justify any change of variables
or rewriting of the constraints. Be sure to identify the variables in your formulation.
(b) Carry out the solution with the given problem instance. Report the mean square jerk of the
optimal trajectory. Round reported values to four decimal places. The code smooth_ride_plot.py generates a plot of the position versus time for you if you provide p⋆ . Briefly compare the
plot with the nonoptimal one above.
16.19 Minimizing the number of actuators used in a control problem. We consider a standard control
problem with state xt ∈ Rn , t = 0, . . . , T , input ut ∈ Rm , t = 0, . . . , T − 1, and dynamics
xt+1 = Axt + But for t = 0, . . . , T − 1. The dynamics matrix A ∈ Rn×n and the input matrix
B ∈ Rn×m are given. We are given the initial state x0 and the terminal state xT , and seek
u0 , . . . , uT −1 with ∥ut ∥∞ ≤ 1. We call such an input sequence feasible.
We say that actuator i is not used if (ut )i = 0, t = 0, . . . , T − 1. The problem is to determine a
feasible input trajectory that minimizes the number of actuators used.
(a) Explain how to solve this problem exactly using convex optimization. Your method can involve
solving a modest number of convex optimization problems. You can assume that m is small
enough that 2m is considered modest.
(b) Carry out the procedure in part (a) with the following data: T = 20, n = 3, m = 4,
x_0 = (0, 0, 0),    x_T = (−4, 8, −2),
and
A = [1, 0.1, 0.0; 0.0, 0.9, −0.1; −0.1, 0, 0.9],    B = [0.9, −0.4, 0.0, 0.5; 2.0, 0.7, 0.3, −0.4; 0.0, 0.2, −0.3, 1.7].
Report the minimum number of actuators used, and a set of indices of actuators that achieves
this minimum number.
17 Finance
17.1 Transaction cost. Consider a market for some asset or commodity, which we assume is infinitely
divisible, i.e., can be bought or sold in quantities of shares that are real numbers (as opposed to
integers). The order book at some time consists of a set of offers to sell or buy the asset, at a given
price, up to a given quantity of shares. The N offers to sell the asset have positive prices per share
p_1^sell , . . . , p_N^sell , sorted in increasing order, in positive share quantities q_1^sell , . . . , q_N^sell . The M offers to buy the asset have positive prices p_1^buy , . . . , p_M^buy , sorted in decreasing order, and positive quantities q_1^buy , . . . , q_M^buy . The price p_1^sell is called the (current) ask price for the asset; p_1^buy is the bid price for the asset. The ask price is larger than the bid price; the difference is called the spread. The average of the ask and bid prices is called the mid-price, denoted p^mid .
Now suppose that you want to purchase q > 0 shares of the asset, where q ≤ q_1^sell + · · · + q_N^sell , i.e., your purchase quantity does not exceed the total amount of the asset currently offered for sale. Your purchase proceeds as follows. Suppose that
q_1^sell + · · · + q_{k−1}^sell < q ≤ q_1^sell + · · · + q_k^sell .
You buy q_1^sell shares at the price p_1^sell , q_2^sell shares at the price p_2^sell , and so on, finally buying the remaining q − (q_1^sell + · · · + q_{k−1}^sell ) shares at the price p_k^sell . Let A denote the total amount you pay for the q shares.
Roughly speaking, you work your way through the offers in the order book, starting from the lowest (ask) price, and working your way up the order book until you fill the order. We define the transaction cost as
T(q) = A − p^mid q.
This is the difference between what you pay, and what you would have paid had you been able to
purchase the shares at the mid-price. It is always positive.
We handle the case of selling the asset in a similar way. Here we take q < 0 to mean that we sell −q shares of the asset. You sell shares at the bid price, up to the quantity q_1^buy (or −q, whichever is smaller); if needed, you sell shares at the price p_2^buy , and so on, until all −q shares are sold. Here we assume that −q ≤ q_1^buy + · · · + q_M^buy , i.e., you are not selling more shares than the total quantity of offers to buy. Let A denote the amount you receive from the sale. Here we define the transaction cost as
T(q) = −p^mid q − A,
the difference between the amount you would have received had you sold the shares at the mid-price, and the amount you received. It is always positive. We set T(0) = 0.
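The buy-side transaction cost is straightforward to compute directly; a plain-Python sketch with a made-up toy order book (the function name and data are our own):

```python
def buy_cost(q, sell_prices, sell_qtys, p_mid):
    """Amount paid A, and transaction cost T(q) = A - p_mid * q, for buying
    q > 0 shares by walking up the sell side of the order book."""
    A, remaining = 0.0, q
    for price, qty in zip(sell_prices, sell_qtys):
        fill = min(remaining, qty)
        A += price * fill
        remaining -= fill
        if remaining == 0:
            break
    assert remaining == 0, "order exceeds the total quantity offered for sale"
    return A, A - p_mid * q

# Toy order book: asks at 10 and 11 (5 shares each), best bid 9.
p_mid = (10.0 + 9.0) / 2
A, T_q = buy_cost(7.0, [10.0, 11.0], [5.0, 5.0], p_mid)   # pay 5*10 + 2*11 = 72
```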
17.2 Risk-return trade-off in portfolio optimization. We consider the portfolio risk-return trade-off prob-
lem of page 185, with the following data:
p̄ = (0.12, 0.10, 0.07, 0.03),    Σ = [0.0064, 0.0008, −0.0011, 0; 0.0008, 0.0025, 0, 0; −0.0011, 0, 0.0004, 0; 0, 0, 0, 0].
(a) Solve the problem
minimize    −p̄^T x + µ x^T Σx
subject to  1^T x = 1,  x ⪰ 0,
for a large number of positive values of µ (for example, 100 values logarithmically spaced between 1 and 10^7 ). Plot the optimal values of the expected return p̄^T x versus the standard deviation (x^T Σx)^{1/2} . Also make an area plot of the optimal portfolios x versus the standard deviation (as in figure 4.12).
(b) Assume the price change vector p is a Gaussian random variable, with mean p̄ and covariance
Σ. Formulate the problem
maximize p̄T x
subject to prob(pT x ≤ 0) ≤ η
1T x = 1, x ⪰ 0,
as a convex optimization problem, where η < 1/2 is a parameter. In this problem we maximize
the expected return subject to a constraint on the probability of a negative return. Solve the
problem for a large number of values of η between 10−4 and 10−1 , and plot the optimal values
of p̄T x versus η. Also make an area plot of the optimal portfolios x versus η.
Hint: The Matlab functions erfc and erfcinv can be used to evaluate
Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t^2 /2} dt.
17.3 Simple portfolio optimization. We consider a portfolio optimization problem as described on pages
155 and 185–186 of Convex Optimization, with data that can be found in the file simple_portfolio_data.py.
(a) Find minimum-risk portfolios with the same expected return as the uniform portfolio (x =
(1/n)1), with risk measured by portfolio return variance, and the following portfolio con-
straints (in addition to 1T x = 1):
• No (additional) constraints.
• Long-only: x ⪰ 0.
• Limit on total short position: 1T (x− ) ≤ 0.5, where (x− )i = max{−xi , 0}.
Compare the optimal risk in these portfolios with each other and the uniform portfolio.
(b) Plot the optimal risk-return trade-off curves for the long-only portfolio, and for total short-
position limited to 0.5, in the same figure. Follow the style of figure 4.12 (top), with horizontal
axis showing standard deviation of portfolio return (which is (xT Σx)1/2 ), and vertical axis
showing mean return.
17.4 Bounding portfolio risk with incomplete covariance information. Consider the following instance of
the problem described in §4.6, on pages 171–173 of Convex Optimization. We suppose that Σ_ii , which
are the squares of the price volatilities of the assets, are known. For the off-diagonal entries of Σ,
all we know is the sign (or, in some cases, nothing at all). For example, we might be given that
Σ12 ≥ 0, Σ23 ≤ 0, etc. This means that we do not know the correlation between p1 and p2 , but we
do know that they are nonnegatively correlated (i.e., the prices of assets 1 and 2 tend to rise or
fall together).
Compute σ_wc^2 , the worst-case variance of the portfolio return, for the specific case
x = (0.1, 0.2, −0.05, 0.1),    Σ = [0.2, +, +, ±; +, 0.1, −, −; +, −, 0.3, +; ±, −, +, 0.1],
where a “+” entry means that the element is nonnegative, a “−” means the entry is nonpositive,
and “±” means we don’t know anything about the entry. (The negative value in x represents a
short position: you sold stocks that you didn’t have, but must produce at the end of the investment
period.) In addition to σ_wc^2 , give the covariance matrix Σ_wc associated with the maximum risk. Compare the worst-case risk with the risk obtained when Σ is diagonal.
17.5 Log-optimal investment strategy. In this problem you will solve a specific instance of the log-optimal
investment problem described in exercise 4.60, with n = 5 assets and m = 10 possible outcomes in
each period. The problem data are defined in log_opt_invest.*, with the rows of the matrix P
giving the asset return vectors pTj . The outcomes are equiprobable, i.e., we have πj = 1/m. Each
column of the matrix P gives the return of the associated asset in the different possible outcomes.
You can examine the columns to get an idea of the types of assets. For example, the last asset gives
a fixed and certain return of 1%; the first asset is a very risky one, with occasional large return,
and (more often) substantial loss.
Find the log-optimal investment strategy x⋆, and its associated long term growth rate R_lt⋆. Compare
this to the long term growth rate obtained with a uniform allocation strategy, i.e., x = (1/n)1, and
also with a pure investment in each asset.
For the optimal investment strategy, and also the uniform investment strategy, plot 10 sample
trajectories of the accumulated wealth, i.e., W (T ) = W (0) ∏_{t=1}^T λ(t), for T = 0, . . . , 200, with
initial wealth W (0) = 1.
207
To save you the trouble of figuring out how to simulate the wealth trajectories or plot them nicely,
we’ve included the simulation and plotting code in log_opt_invest.*; you just have to add the
code needed to find x⋆ .
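For readers who want to see the shape of that computation, here is a minimal numpy sketch of the wealth recursion, with made-up stand-in data (the actual data and plotting code live in log_opt_invest.*):

```python
import numpy as np

# Toy stand-in data: m equiprobable outcomes, rows of P are the return vectors p_j^T.
rng = np.random.default_rng(0)
m, n = 10, 5
P = rng.uniform(0.7, 1.3, size=(m, n))
P[:, -1] = 1.01                     # last asset: fixed, certain 1% return
x = np.ones(n) / n                  # a candidate allocation (here: uniform)

# One wealth trajectory W(T) = W(0) * prod_{t=1}^T lambda(t), where
# lambda(t) = p_j^T x and outcome j is drawn uniformly each period.
T = 200
outcomes = rng.integers(0, m, size=T)
lam = P[outcomes] @ x               # per-period wealth growth factors
W = np.concatenate([[1.0], np.cumprod(lam)])   # W(0) = 1, ..., W(T)

# Empirical growth rate along this trajectory, and its exact value E log(p_j^T x)
R_lt = np.mean(np.log(lam))
R_exact = np.mean(np.log(P @ x))
```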
(a) Show that the optimality conditions for the log-optimal investment problem described in
exercise 4.60 can be expressed as: 1T x = 1, x ⪰ 0, and for each i,
x_i > 0 ⇒ ∑_{j=1}^m π_j p_ij /(p_j^T x) = 1,        x_i = 0 ⇒ ∑_{j=1}^m π_j p_ij /(p_j^T x) ≤ 1.
We can interpret this as follows. p_ij /(p_j^T x) is a random variable, which gives the ratio of the
investment gain with asset i only, to the investment gain with our mixed portfolio x. The
optimality condition is that, for each asset we invest in, the expected value of this ratio is one,
and for each asset we do not invest in, the expected value cannot exceed one. Very roughly
speaking, this means our portfolio does as well as any of the assets that we choose to invest
in, and cannot do worse than any assets that we do not invest in.
Hint. You can start from the simple criterion given in §4.2.3 or the KKT conditions.
(b) In this part we will derive the dual of the log-optimal investment problem. We start by writing
the problem as
minimize   −∑_{j=1}^m π_j log y_j
subject to  y = P^T x,  x ⪰ 0,  1^T x = 1.
Here, P has columns p1 , . . . , pm , and we have introduced new variables y1 , . . . , ym , with
the implicit constraint y ≻ 0. We will associate dual variables ν, λ and ν0 with the constraints
y = P T x, x ⪰ 0, and 1T x = 1, respectively. Defining ν̃j = νj /ν0 for j = 1, . . . , m, show that
the dual problem can be written as
maximize    ∑_{j=1}^m π_j log(ν̃_j /π_j )
subject to  P ν̃ ⪯ 1,
with variable ν̃. The objective here is the (negative) Kullback-Leibler divergence between the
given distribution π and the dual variable ν̃.
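A quick numeric sanity check of weak duality for this primal-dual pair, on random stand-in data: any dual-feasible value must lower-bound any primal-feasible value.

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 5, 10
P = rng.uniform(0.5, 1.5, size=(n, m))   # columns p_1, ..., p_m (all positive)
pi = np.ones(m) / m                      # outcome probabilities

# A feasible primal point: any x on the simplex (then y = P^T x > 0).
x = rng.dirichlet(np.ones(n))
primal = -pi @ np.log(P.T @ x)

# A feasible dual point: scale pi down until P nu_tilde <= 1 elementwise.
nu = pi / np.max(P @ pi)
dual = pi @ np.log(nu / pi)

# Weak duality: dual value <= optimal value <= primal value.
print(dual, "<=", primal)
```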
17.7 Arbitrage and theorems of alternatives. Consider an event (for example, a sports game, political
elections, the evolution of the stock market over a certain period) with m possible outcomes.
Suppose that n wagers on the outcome are possible. If we bet an amount xj on wager j, and the
outcome of the event is i (i = 1, . . . , m), then our return will be equal to rij xj . The return rij xj is
the net gain: we pay xj initially, and receive (1 + rij )xj if the outcome of the event is i. We allow
the bets xj to be positive, negative, or zero. The interpretation of a negative bet is as follows. If
xj < 0, then initially we receive an amount of money |xj |, with an obligation to pay (1 + rij )|xj | if
outcome i occurs. In that case, we lose rij |xj |, i.e., our net gain is rij xj (a negative number).
We call the matrix R ∈ Rm×n with elements rij the return matrix. A betting strategy is a vector
x ∈ Rn , with components xj the amounts we bet on each wager. If we use a betting strategy
x, our total return in the event of outcome i is equal to ∑_{j=1}^n rij xj , i.e., the ith component of the
vector Rx.
Country Odds Country Odds
Holland 3.5 Czech Republic 17.0
Italy 5.0 Romania 18.0
Spain 5.5 Yugoslavia 20.0
France 6.5 Portugal 20.0
Germany 7.0 Norway 20.0
England 10.0 Denmark 33.0
Belgium 14.0 Turkey 50.0
Sweden 16.0 Slovenia 80.0
(a) The arbitrage theorem. Suppose you are given a return matrix R. Prove the following theorem:
there is a betting strategy x ∈ Rn for which
Rx ≻ 0
if and only if there exists no vector p ∈ Rm that satisfies
RT p = 0,  p ⪰ 0,  p ̸= 0.
We can interpret this theorem as follows. If Rx ≻ 0, then the betting strategy x guarantees a
positive return for all possible outcomes, i.e., it is a sure-win betting scheme. In economics,
we say there is an arbitrage opportunity.
If we normalize the vector p in the second condition, so that 1T p = 1, we can interpret it as
a probability vector on the outcomes. The condition RT p = 0 means that
pT Rx = 0
for all x, i.e., the expected return is zero for all betting strategies. In economics, p is called a
risk neutral probability.
We can therefore rephrase the arbitrage theorem as follows: There is no sure-win betting
strategy (or arbitrage opportunity) if and only if there is a probability vector on the outcomes
that makes all bets fair (i.e., the expected gain is zero).
(b) Betting. In a simple application, we have exactly as many wagers as there are outcomes
(n = m). Wager i is to bet that the outcome will be i. The returns are usually expressed as
odds. For example, suppose that a bookmaker accepts bets on the result of the 2000 European
soccer championships. If the odds against Belgium winning are 14 to one, and we bet $100 on
Belgium, then we win $1400 if they win the tournament, and we lose $100 otherwise.
In general, if we have m possible outcomes, and the odds against outcome i are λi to one,
then the return matrix R ∈ Rm×m is given by
rij = λi if j = i
rij = −1 otherwise.
Show that there is no sure-win betting scheme (or arbitrage opportunity) if and only if
∑_{i=1}^m 1/(1 + λi ) = 1.
In fact, you can verify that if this equality is not satisfied, then the betting strategy
xi = (1/(1 + λi )) / (1 − ∑_{i=1}^m 1/(1 + λi )),    i = 1, . . . , m,
always results in a profit.
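Using the odds from the table above, a short numpy check of this claim: with the xi given by that formula, the return is exactly 1 in every outcome whenever the sum differs from 1.

```python
import numpy as np

# Odds lambda_i from the table above (Holland 3.5, ..., Slovenia 80).
lam = np.array([3.5, 5.0, 5.5, 6.5, 7.0, 10.0, 14.0, 16.0, 17.0, 18.0,
                20.0, 20.0, 20.0, 33.0, 50.0, 80.0])
m = lam.size

# Return matrix for odds betting: r_ii = lambda_i, r_ij = -1 for j != i.
R = -np.ones((m, m)) + np.diag(lam + 1)

S = np.sum(1 / (1 + lam))
print("sum of 1/(1+lambda_i):", S)      # approx 1.33 > 1: the bookmaker's margin

# When S != 1, the strategy below returns exactly 1 in every outcome.
x = (1 / (1 + lam)) / (1 - S)
print("returns per outcome:", R @ x)    # all equal to 1
```

Since S > 1 here, the xi are all negative: the sure win would require accepting these bets rather than placing them, which the model permits (bets may be negative) but a real bookmaker does not.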
(c) Option pricing. We apply the theorem of part (a) to a simple option pricing problem. Consider
a single investment period and a stock with initial price S. At the end of the period the stock
price is either Su (the stock goes up) or Sd (the stock goes down), where u > 1 > d. We can
borrow and lend money at a risk-free interest rate r, with d < 1 + r < u. If we buy one share of
the stock now and sell it at the end of the period, the present value of our net gain is
Su/(1 + r) − S = S (u − 1 − r)/(1 + r)
if the stock goes up, and
Sd/(1 + r) − S = S (d − 1 − r)/(1 + r)
if the stock goes down.
We can also buy options, at a unit price of C. An option gives us the right to purchase one
stock at a fixed price K at the end of the period. Whether we exercise the option or not
depends on the price of the stock at the end of the period. If the stock price S̄ at the end of
the period is greater than K, we exercise the option, buy the stock and sell it immediately,
so we receive an amount S̄ − K. If the stock price S̄ is less than K, we do not exercise the
option and receive nothing. Combining both cases, we can say that the value of the option at
the end of the period is max{0, S̄ − K}, and the present value is max{0, S̄ − K}/(1 + r). If
we pay a price C per option, then our return is
(1/(1 + r)) max{0, S̄ − K} − C
per option.
We can summarize the situation with the return matrix
R = [ (u − 1 − r)/(1 + r)    max{0, Su − K}/((1 + r)C) − 1
      (d − 1 − r)/(1 + r)    max{0, Sd − K}/((1 + r)C) − 1 ].
The elements of the first row are the (present values of the) returns in the event that the stock
price goes up. The second row gives the returns in the event that the stock price goes down.
The first column gives the returns per unit investment in the stock. The second column gives
the returns per unit investment in the option.
In this simple example the arbitrage theorem allows us to determine the price of the option,
given the other information S, K, u, d, and r. Show that if there is no arbitrage, then the
price of the option C must be equal to
(1/(1 + r)) (p max{0, Su − K} + (1 − p) max{0, Sd − K}),
where
p = (1 + r − d)/(u − d).
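A worked numeric instance of this formula, with made-up values S = 1, K = 1, u = 1.2, d = 0.9, r = 0.05 (these numbers are not from the exercise); the check below confirms that at this price C the risk-neutral expected return of both columns of R is zero.

```python
import numpy as np

S, K, u, d, r = 1.0, 1.0, 1.2, 0.9, 0.05   # hypothetical data

p = (1 + r - d) / (u - d)                  # risk-neutral probability
C = (p * max(0, S*u - K) + (1 - p) * max(0, S*d - K)) / (1 + r)

# Return matrix from the text: columns are stock and option, rows up/down.
R = np.array([[(u - 1 - r)/(1 + r), max(0, S*u - K)/((1 + r)*C) - 1],
              [(d - 1 - r)/(1 + r), max(0, S*d - K)/((1 + r)*C) - 1]])
prob = np.array([p, 1 - p])
print("C =", C)           # approx 0.0952
print(prob @ R)           # risk-neutral expected returns: approx [0, 0]
```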
17.8 Log-optimal investment. We consider an instance of the log-optimal investment problem described
in exercise 4.60 of Convex Optimization. In this exercise, however, we allow x, the allocation vector,
to have negative components. Investing a negative amount xi W (t) in an asset is called shorting
the asset. It means you borrow the asset, sell it for |xi W (t)|, and have an obligation to purchase it
back later and return it to the lender.
We assume that the elements rij of R are all positive, which implies that the log-optimal
investment problem is feasible.
(a) Show the following property: if there exists a v ∈ Rn with
1T v = 0,    RT v ⪰ 0,    RT v ̸= 0,    (66)
then the log-optimal investment problem is unbounded (assuming that the probabilities pj are
all positive).
(b) Derive a Lagrange dual of the log-optimal investment problem (or an equivalent problem of
your choice). Use the Lagrange dual to show that the condition in part a is also necessary for
unboundedness. In other words, the log-optimal investment problem is bounded if and only
if there does not exist a v satisfying (66).
(c) Consider the following small example. We have four scenarios and three investment options.
The return vectors for the four scenarios are
2 2 0.5 0.5
r1 = 1.3 , r2 = 0.5 , r3 = 1.3 , r4 = 0.5 .
1 1 1 1
The probabilities of the four scenarios are p = (1/3, 1/6, 1/3, 1/6).
The interpretation is as follows. We can invest in two stocks. The first stock doubles in value
in each period with a probability 1/2, or decreases by 50% with a probability 1/2. The second
stock either increases by 30% with a probability 2/3, or decreases by 50% with a probability
1/3. The fluctuations in the two stocks are independent, so we have four scenarios: both stocks
go up (probability 2/6), stock 1 goes up and stock 2 goes down (probability 1/6), stock 1 goes
down and stock 2 goes up (probability 1/3), both stocks go down (probability 1/6). The
fractions of our capital we invest in stocks 1 and 2 are denoted by x1 and x2 , respectively.
The rest of our capital, x3 = 1 − x1 − x2 is not invested.
What is the expected growth rate of the log-optimal strategy x? Compare with the strategies
(x1 , x2 , x3 ) = (1, 0, 0), (x1 , x2 , x3 ) = (0, 1, 0) and (x1 , x2 , x3 ) = (1/2, 1/2, 0). (Obviously the
expected growth rate for (x1 , x2 , x3 ) = (0, 0, 1) is zero.)
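The growth-rate comparisons requested here can be computed directly; the scenario probabilities below follow from the interpretation above (2/6, 1/6, 1/3, 1/6).

```python
import numpy as np

# Return vectors for the four scenarios (rows), three investment options (columns).
R = np.array([[2.0, 1.3, 1.0],
              [2.0, 0.5, 1.0],
              [0.5, 1.3, 1.0],
              [0.5, 0.5, 1.0]])
p = np.array([1/3, 1/6, 1/3, 1/6])   # both up, 1 up/2 down, 1 down/2 up, both down

def growth_rate(x):
    """Expected long-term growth rate E log(r^T x)."""
    return p @ np.log(R @ x)

for x in [(1, 0, 0), (0, 1, 0), (0.5, 0.5, 0), (0, 0, 1)]:
    print(x, growth_rate(np.array(x, dtype=float)))
```

Note that the pure stock-1 strategy has growth rate exactly zero (half the time it doubles, half the time it halves), yet the 50/50 mix has a strictly positive growth rate.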
Remark. The figure below shows a simulation that compares three investment strategies over
200 periods. The solid line shows the log-optimal investment strategy. The dashed lines show
the growth for strategies x = (1, 0, 0), (0, 1, 0), and (0, 0, 1).
[Figure: accumulated wealth W (t) versus period t, on a logarithmic vertical axis from 10^−8 to 10^6 , for 0 ≤ t ≤ 250.]
17.9 Maximizing house profit in a gamble and imputed probabilities. A set of n participants bet on
which one of m outcomes, labeled 1, . . . , m, will occur. Participant i offers to purchase up to qi > 0
gambling contracts, at price pi > 0, that the true outcome will be in the set Si ⊂ {1, . . . , m}. The
house then sells her xi contracts, with 0 ≤ xi ≤ qi . If the true outcome j is in Si , then participant
i receives $1 per contract, i.e., xi . Otherwise, she loses, and receives nothing. The house collects a
total of x1 p1 + · · · + xn pn , and pays out an amount that depends on the outcome j,
∑_{i: j∈Si } xi .
(a) Optimal house strategy. How should the house decide on x so that its worst-case profit (over the
possible outcomes) is maximized? (The house determines x after examining all the participant
offers.)
(b) Imputed probabilities. Suppose x⋆ maximizes the worst-case house profit. Show that there
exists a probability distribution π on the possible outcomes (i.e., π ∈ Rm_+ , 1T π = 1) for which
x⋆ also maximizes the expected house profit. Explain how to find π.
Hint. Formulate the problem in part (a) as an LP; you can construct π from optimal dual
variables for this LP.
Remark. Given π, the ‘fair’ price for offer i is p_i^fair = ∑_{j∈Si } πj . All offers with pi > p_i^fair will
be completely filled (i.e., xi = qi ); all offers with pi < p_i^fair will be rejected (i.e., xi = 0).
Remark. This exercise shows how the probabilities of outcomes (e.g., elections) can be guessed
from the offers of a set of gamblers.
(c) Numerical example. Carry out your method on the simple example below with n = 5 partici-
pants, m = 5 possible outcomes, and participant offers
Participant i pi qi Si
1 0.50 10 {1,2}
2 0.60 5 {4}
3 0.60 5 {1,4,5}
4 0.60 20 {2,5}
5 0.20 10 {3}
Compare the optimal worst-case house profit with the worst-case house profit, if all offers were
accepted (i.e., xi = qi ). Find the imputed probabilities.
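As a partial illustration of part (c) (evaluation only, not the LP of part (a)), the worst-case profit of the all-accept strategy x = q can be computed directly from the table:

```python
import numpy as np

# Participant offers from the table above.
p = np.array([0.50, 0.60, 0.60, 0.60, 0.20])   # prices
q = np.array([10, 5, 5, 20, 10], dtype=float)  # quantity limits
S = [{1, 2}, {4}, {1, 4, 5}, {2, 5}, {3}]      # winning sets S_i
m = 5

# A[j, i] = 1 if outcome j+1 pays participant i ($1 per contract).
A = np.array([[1.0 if j + 1 in S[i] else 0.0 for i in range(5)]
              for j in range(m)])

def worst_case_profit(x):
    """Revenue p^T x minus the largest payout over the m outcomes."""
    return p @ x - np.max(A @ x)

print("worst-case profit, all offers accepted:", worst_case_profit(q))
```

Accepting everything collects 25 in revenue but pays out 30 if outcome 2 occurs, so the worst-case profit is −5; the optimal x of part (a) does better.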
17.10 Optimal investment to fund an expense stream. An organization (such as a municipality) knows
its operating expenses over the next T periods, denoted E1 , . . . , ET . (Normally these are positive;
but we can have negative Et , which corresponds to income.) These expenses will be funded by a
combination of investment income, from a mixture of bonds purchased at t = 0, and a cash account.
The bonds generate investment income, denoted I1 , . . . , IT . The cash balance is denoted B0 , . . . , BT ,
where B0 ≥ 0 is the amount of the initial deposit into the cash account. We can have Bt < 0 for
t = 1, . . . , T , which represents borrowing.
After paying for the expenses using investment income and cash, in period t, we are left with
Bt − Et + It in cash. If this amount is positive, it earns interest at the rate r+ > 0; if it is negative,
we must pay interest at rate r− , where r− ≥ r+ . Thus the expenses, investment income, and cash
balances are linked as follows:
Bt+1 = { (1 + r+ )(Bt − Et + It ),   Bt − Et + It ≥ 0
       { (1 + r− )(Bt − Et + It ),   Bt − Et + It < 0,
for t = 1, . . . , T − 1. We take B1 = (1 + r+ )B0 , and we require that BT − ET + IT = 0, which
means the final cash balance, plus income, exactly covers the final expense.
The initial investment will be a mixture of bonds, labeled 1, . . . , n. Bond i has a price Pi > 0,
a coupon payment Ci > 0, and a maturity Mi ∈ {1, . . . , T }. Bond i generates an income stream
given by
a_t^(i) = { Ci ,       t < Mi
          { Ci + 1,    t = Mi
          { 0,         t > Mi ,
for t = 1, . . . , T . If xi is the number of units of bond i purchased (at t = 0), the total investment
cash flow is
It = x1 a_t^(1) + · · · + xn a_t^(n) ,    t = 1, . . . , T.
We will require xi ≥ 0. (The xi can be fractional; they do not need to be integers.)
The total initial investment required to purchase the bonds, and fund the initial cash balance at
t = 0, is x1 P1 + · · · + xn Pn + B0 .
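The bond income streams defined above assemble into a T × n matrix, so that I = Ax; here is a small made-up example (the data file opt_funding_data.m is not reproduced here):

```python
import numpy as np

def income_streams(C, M, T):
    """T x n matrix whose (t, i) entry is a_t^(i): coupon C_i while t < M_i,
    C_i + 1 (coupon plus unit face value) at t = M_i, and 0 after maturity."""
    C, M = np.asarray(C, dtype=float), np.asarray(M)
    t = np.arange(1, T + 1)[:, None]          # periods 1, ..., T
    return np.where(t < M, C, 0.0) + np.where(t == M, C + 1.0, 0.0)

# A small made-up example: two bonds over T = 4 periods.
A = income_streams(C=[0.05, 0.03], M=[3, 4], T=4)
x = np.array([2.0, 1.0])                      # units of each bond purchased
I = A @ x                                     # total income I_1, ..., I_T
print(I)
```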
(a) Explain how to choose x and B0 to minimize the total initial investment required to fund the
expense stream.
(b) Solve the problem instance given in opt_funding_data.m. Give optimal values of x and B0 .
Give the optimal total initial investment, and compare it to the initial investment required if
no bonds were purchased (which would mean that all the expenses were funded from the cash
account). Plot the cash balance (versus period) with optimal bond investment, and with no
bond investment.
17.11 Planning production with uncertain demand. You must order (nonnegative) amounts r1 , . . . , rm of
raw materials, which are needed to manufacture (nonnegative) quantities q1 , . . . , qn of n different
products. To manufacture one unit of product j requires at least Aij units of raw material i, so
we must have r ⪰ Aq. (We will assume that Aij are nonnegative.) The per-unit cost of the raw
materials is given by c ∈ Rm_+ , so the total raw material cost is cT r.
The (nonnegative) demand for product j is denoted dj ; the number of units of product j sold is
sj = min{qj , dj }. (When qj > dj , qj − dj is the amount of product j produced, but not sold; when
dj > qj , dj − qj is the amount of unmet demand.) The revenue from selling the products is pT s,
where p ∈ Rn+ is the vector of product prices. The profit is pT s − cT r. (Both d and q are real
vectors; their entries need not be integers.)
You are given A, c, and p. The product demand, however, is not known. Instead, a set of K
possible demand vectors, d(1) , . . . , d(K) , with associated probabilities π1 , . . . , πK , is given. (These
satisfy 1T π = 1, π ⪰ 0.)
You will explore two different optimization problems that arise in choosing r and q (the variables).
I. Choose r and q ahead of time. You must choose r and q, knowing only the data listed
above. (In other words, you must order the raw materials, and commit to producing the chosen
quantities of products, before you know the product demand.) The objective is to maximize the
expected profit.
II. Choose r ahead of time, and q after d is known. You must choose r, knowing only the
data listed above. Some time after you have chosen r, the demand will become known to you.
This means that you will find out which of the K demand vectors is the true demand. Once you
know this, you must choose the quantities to be manufactured. (In other words, you must order
the raw materials before the product demand is known; but you can choose the mix of products to
manufacture after you have learned the true product demand.) The objective is to maximize the
expected profit.
(a) Explain how to formulate each of these problems as a convex optimization problem. Clearly
state what the variables are in the problem, what the constraints are, and describe the roles
of any auxiliary variables or constraints you introduce.
(b) Carry out the methods from part (a) on the problem instance with numerical data given in
planning_data.m. This file will define A, D, K, c, m, n, p and pi. The K columns of D are the
possible demand vectors. For both of the problems described above, give the optimal value of
r, and the expected profit.
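For a fixed (r, q), the expected profit in formulation I is straightforward to evaluate; the tiny instance below is made up (planning_data.m is not reproduced here):

```python
import numpy as np

def expected_profit(r, q, D, prob, A, c, p):
    """Expected profit E[p^T s - c^T r] for fixed r and q, where the columns
    of D are the K possible demand vectors and s = min(q, d) elementwise."""
    assert np.all(r >= A @ q - 1e-9)         # raw materials must cover production
    S = np.minimum(q[:, None], D)            # n x K matrix of units sold
    return prob @ (p @ S) - c @ r

# Tiny made-up instance: m = 2 raw materials, n = 2 products, K = 2 scenarios.
A = np.array([[1.0, 2.0], [0.5, 1.0]])
c = np.array([0.1, 0.2])
p = np.array([1.0, 2.5])
D = np.array([[4.0, 2.0], [1.0, 3.0]])       # columns: demand scenarios
prob = np.array([0.5, 0.5])
q = np.array([3.0, 2.0])
r = A @ q                                    # order exactly what production needs
print(expected_profit(r, q, D, prob, A, c, p))
```

Problem I maximizes this quantity over (r, q); problem II lets q depend on the realized scenario, so it can only do better.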
17.12 Gini coefficient of inequality. Let x1 , . . . , xn be a set of nonnegative numbers with positive sum,
which typically represent the wealth or income of n individuals in some group. The Lorentz curve
is a plot of the fraction fi of total wealth held by the i poorest individuals,
fi = (1/1T x) ∑_{j=1}^i x(j) ,    i = 0, . . . , n,
versus i/n, where x(j) denotes the jth smallest of the numbers {x1 , . . . , xn }, and we take f0 = 0.
The Lorentz curve starts at (0, 0) and ends at (1, 1). Interpreted as a continuous curve (as, say,
n → ∞) the Lorentz curve is convex and increasing, and lies on or below the straight line joining
the endpoints. The curve coincides with this straight line, i.e., fi = (i/n), if and only if the wealth
is distributed equally, i.e., the xi are all equal.
The Gini coefficient is defined as twice the area between the straight line corresponding to uniform
wealth distribution and the Lorentz curve:
G(x) = (2/n) ∑_{i=1}^n ((i/n) − fi ).
The Gini coefficient is used as a measure of wealth or income inequality: It ranges between 0 (for
equal distribution of wealth) and 1 − 1/n (when one individual holds all wealth).
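The definitions above translate directly into code; the two checks below exercise the stated extremes (G = 0 for equal wealth, G = 1 − 1/n when one individual holds everything):

```python
import numpy as np

def gini(x):
    """Gini coefficient G(x) = (2/n) * sum_i ((i/n) - f_i), with f_i the
    fraction of total wealth held by the i poorest individuals."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    f = np.cumsum(x) / np.sum(x)        # f_1, ..., f_n (f_0 = 0 contributes nothing)
    i = np.arange(1, n + 1)
    return (2.0 / n) * np.sum(i / n - f)

print(gini([1, 1, 1, 1]))      # equal wealth: 0
print(gini([0, 0, 0, 5]))      # one individual holds all: 1 - 1/n = 0.75
```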
17.13 Internal rate of return for cash streams with a single initial investment. We use the notation of
example 3.34 in the textbook. Let x ∈ Rn+1 be a cash flow over n periods, with x indexed from 0
to n, where the index denotes period number. We assume that x0 < 0, xj ≥ 0 for j = 1, . . . , n, and
x0 + · · · + xn > 0. This means that there is an initial positive investment; thereafter, only payments
are made, with the total of the payments exceeding the initial investment. (In the more general
setting of example 3.34, we allow additional investments to be made after the initial investment.)
17.14 Efficient solution of basic portfolio optimization problem. This problem concerns the simplest
possible portfolio optimization problem:
maximize µT w − (λ/2)wT Σw
subject to 1T w = 1,
with variable w ∈ Rn (the normalized portfolio, with negative entries meaning short positions),
and data µ (mean return), Σ ∈ Sn++ (return covariance), and λ > 0 (the risk aversion parameter).
The return covariance has the factor form Σ = F QF T + D, where F ∈ Rn×k (with rank k) is
the factor loading matrix, Q ∈ Sk++ is the factor covariance matrix, and D is a diagonal matrix
with positive entries, called the idiosyncratic risk (since it describes the risk of each asset that is
independent of the factors). This form for Σ is referred to as a ‘k-factor risk model’. Some typical
dimensions are n = 2500 (assets) and k = 30 (factors).
(a) What is the flop count for computing the optimal portfolio, if the low-rank plus diagonal
structure of Σ is not exploited? You can assume that λ = 1 (which can be arranged by
absorbing it into Σ).
(b) Explain how to compute the optimal portfolio more efficiently, and give the flop count for your
method. You can assume that k ≪ n. You do not have to give the best method; any method
that has linear complexity in n is fine. You can assume that λ = 1.
Hints. You may want to introduce a new variable y = F T w (which is called the vector of
factor exposures). You may want to work with the matrix
G = [ 1    F
      0   −I ] ∈ R(n+k)×(1+k) ,
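One possible linear-in-n method for part (b), sketched here with the matrix inversion lemma (Woodbury identity) rather than the G-matrix hint, taking λ = 1 and synthetic data; the result is checked against the dense KKT solve. Any method with this structure is acceptable.

```python
import numpy as np

def opt_portfolio_factor(mu, F, Q, d):
    """Solve max mu^T w - (1/2) w^T Sigma w  s.t. 1^T w = 1,
    with Sigma = F Q F^T + diag(d), without forming Sigma."""
    n, k = F.shape
    Dinv = 1.0 / d                                      # O(n)
    M = np.linalg.inv(Q) + F.T @ (Dinv[:, None] * F)    # k x k
    def sigma_inv(v):
        # Woodbury: Sigma^{-1} v = Dinv v - Dinv F M^{-1} F^T Dinv v
        u = Dinv * v
        return u - Dinv * (F @ np.linalg.solve(M, F.T @ u))
    a = sigma_inv(mu)
    b = sigma_inv(np.ones(n))
    nu = (np.sum(a) - 1) / np.sum(b)                    # enforces 1^T w = 1
    return a - nu * b                                   # w = Sigma^{-1}(mu - nu 1)

rng = np.random.default_rng(0)
n, k = 20, 3
F = rng.standard_normal((n, k))
A = rng.standard_normal((k, k)); Q = A @ A.T + np.eye(k)
d = rng.uniform(0.5, 2.0, n)
mu = rng.standard_normal(n)
w = opt_portfolio_factor(mu, F, Q, d)

# Check against the dense KKT system [Sigma, 1; 1^T, 0][w; nu] = [mu; 1].
Sigma = F @ Q @ F.T + np.diag(d)
KKT = np.block([[Sigma, np.ones((n, 1))], [np.ones((1, n)), np.zeros((1, 1))]])
sol = np.linalg.solve(KKT, np.concatenate([mu, [1.0]]))
print(np.allclose(w, sol[:n]))   # True
```

The factor-model solve costs O(nk^2 + k^3) instead of the O(n^3) dense solve.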
(d) Risk return trade-off curve. Now suppose we want to compute the optimal portfolio for M
values of the risk aversion parameter λ. Explain how to do this efficiently, and give the
complexity in terms of M , n, and k. Compare to the complexity of using the method of
part (b) M times. Hint. Show that the optimal portfolio is an affine function of 1/λ.
17.15 Sparse index tracking. The (weekly, say) return of n stocks is given by a random variable r ∈ Rn ,
with mean r̄ and covariance E(r − r̄)(r − r̄)T = Σ ≻ 0. An index (such as S&P 500 or Wilshire
5000) is a weighted sum of these returns, given by z = cT r, where c ∈ Rn+ . (For example, the
vector c is nonzero only for the stocks in the index, and the coefficients ci might be proportional to
some measure of market capitalization of stock i.) We will assume that the index weights c ∈ Rn ,
as well as the return mean and covariance r̄ and Σ, are known and fixed.
Our goal is to find a sparse weight vector w ∈ Rn , which can include negative entries (meaning,
short positions), so that the RMS index tracking error, defined as
E = ( E(z − wT r)^2 / E z^2 )^{1/2} ,
does not exceed 0.10 (i.e., 10%). Of course, taking w = c results in E = 0, but we are interested in
finding a weight vector with (we hope) many fewer nonzero entries than c has.
Remark. This is the idea behind an index fund : You find a sparse portfolio that replicates or tracks
the return of the index (within some error tolerance). Acquiring (and rebalancing) the sparse
tracking portfolio will incur smaller transactions costs than trading in the full index.
(a) Propose a (simple) heuristic method for finding a sparse weight vector w that satisfies E ≤ 0.10.
(b) Carry out your method on the problem instance given in sparse_idx_track_data.m. Give
card(w), the number of nonzero entries in w. (To evaluate card(w), use sum(abs(w)>0.01),
which treats weight components smaller than 0.01 as zero.) (You might want to compare the
index weights and the weights you find by typing [c w]. No need to print or turn in the
resulting output, though.)
17.16 Option price bounds. In this problem we use the methods and results of example 5.10 to give
bounds on the arbitrage-free price of an option. (See exercise 5.38 for a simple version of option
pricing.) We will use all the notation and definitions from example 5.10.
We consider here options on an underlying asset (such as a stock); these have a payoff or value
that depends on S, the value of the underlying asset at the end of the investment period. We
will assume that the underlying asset can only take on m different values, S (1) , . . . , S (m) . These
correspond to the m possible scenarios or outcomes described in example 5.10.
A risk-free asset has value (or payoff) r > 1 in every scenario. (We refer to r − 1 as the risk-free
interest rate.) The value of the underlying asset is simply S.
A put option at strike price K gives the owner the right to sell one unit of the underlying stock
at price K. At the end of the investment period, if the stock is trading at a price S, then the
put option has payoff (K − S)+ = max{0, K − S} (since the option is exercised only if K > S).
Similarly a call option at strike price K gives the buyer the right to buy a unit of stock at price K.
A call option has payoff (S − K)+ = max{0, S − K}.
A collar is an option with payoff
ϕ(S) = min(C, max(F, S)) = { C,   S > C
                           { S,   F ≤ S ≤ C
                           { F,   S < F,
where F is the floor and C is the cap, with 0 < F < C. A collar option limits both the upside and
downside of payoff.
These payoffs, which are functions of S (the price of the underlying stock at the end of the period)
are listed in the table below.
Asset/option Payoff/value
Risk-free asset r
Underlying stock S
Put option with strike price K (K − S)+
Call option with strike price K (S − K)+
Collar with floor F and cap C min(C, max(F, S))
Now we consider a specific problem. The price of the risk-free asset, with r = 1.05, is 1. The price
of the underlying asset is S0 = 1. We will use m = 200 scenarios, with S (i) uniformly spaced from
S (1) = 0.5 to S (200) = 2. The following options are traded on an exchange, with prices listed below.
A collar with floor F = 0.9 and cap C = 1.15 is not traded on an exchange, so we do not know
its market price. We wish to determine what prices for it would be appropriate. Find the range
of prices for this collar, consistent with the absence of arbitrage and the prices of the call and put
options above.
There are 7 assets in total: The risk-free one, the underlying stock, two call options, two put
options, and one collar. We are given the prices of each of these except the last.
17.17 Portfolio optimization with qualitative return forecasts. We consider the risk-return portfolio opti-
mization problem described on pages 155 and 185 of the book, with one twist: We don’t precisely
know the mean return vector p̄. Instead, we have a range of possible values for each asset, i.e., we
have l, u ∈ Rn with l ⪯ p̄ ⪯ u. We use l and u to encode various qualitative forecasts we have
about the mean return vector p̄. For example, l7 = 0.02 and u7 = 0.20 means that we believe the
mean return for asset 7 is between 2% and 20%.
Define the worst-case mean return Rwc , as a function of portfolio vector x, as the worst (minimum)
value of p̄T x, over all p̄ consistent with the given bounds l and u.
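For fixed x, the inner minimization defining Rwc separates across assets (the worst p̄i is li when xi ≥ 0 and ui when xi < 0); the brute-force vertex check below confirms this on a tiny random instance. This is only the evaluation of Rwc, not the optimization asked for in part (a).

```python
import numpy as np
from itertools import product

def R_wc(x, l, u):
    """Worst-case mean return: minimize pbar^T x over l <= pbar <= u,
    term by term."""
    return np.sum(np.minimum(l * x, u * x))

# Brute-force check: the minimum of a linear function over a box is
# attained at one of its 2^n vertices.
rng = np.random.default_rng(3)
n = 3
l = rng.uniform(-0.1, 0.1, n)
u = l + rng.uniform(0.0, 0.2, n)
x = rng.standard_normal(n)
vertex_min = min(np.dot(v, x) for v in product(*zip(l, u)))
print(R_wc(x, l, u), vertex_min)   # equal
```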
(a) Explain how to find a portfolio x that maximizes Rwc , subject to a budget constraint and risk
limit,
1T x = 1,    xT Σx ≤ σ_max^2 ,
where Σ ∈ Sn++ and σmax ∈ R++ are given.
(b) Solve the problem instance given in port_qual_forecasts_data.m. Give the optimal worst-
case mean return achieved by the optimal portfolio x⋆ .
In addition, construct a portfolio xmid that maximizes cT x subject to the budget constraint
and risk limit, where c = (1/2)(l + u). This is the optimal portfolio assuming that the mean
return has the midpoint value of the forecasts. Compare the midpoint mean returns cT xmid
and cT x⋆ , and the worst-case mean returns of xmid and x⋆ .
Briefly comment on the results.
17.18 De-leveraging. We consider a multi-period portfolio optimization problem, with n assets and T
time periods, where xt ∈ Rn gives the holdings (say, in dollars) at time t, with negative entries
denoting, as usual, short positions. For each time period the return vector has mean µ ∈ Rn and
covariance Σ ∈ Sn++ . (These are known.)
The initial portfolio x0 maximizes the risk-adjusted expected return µT x − γxT Σx, where γ > 0,
subject to the leverage limit constraint ∥x∥1 ≤ Linit , where Linit > 0 is the given initial leverage
limit. (There are several different ways to measure leverage; here we use the sum of the total
short and long positions.) The final portfolio xT maximizes the risk-adjusted return, subject to
∥x∥1 ≤ Lnew , where Lnew > 0 is the given final leverage limit (with Lnew < Linit ). This uniquely
determines x0 and xT , since the objective is strictly concave.
The question is how to move from x0 to xT , i.e., how to choose x1 , . . . , xT −1 . We will do this so as
to maximize the objective
J = ∑_{t=1}^T ( µT xt − γxtT Σxt − ϕ(xt − xt−1 ) ),
which is the total risk-adjusted expected return, minus the total transaction cost. The transaction
cost function ϕ has the form
ϕ(u) = ∑_{i=1}^n ( κi |ui | + λi ui^2 ),
where κ ⪰ 0 and λ ⪰ 0 are known parameters. We will require that ∥xt ∥1 ≤ Linit , for t =
1, . . . , T − 1. In other words, the leverage limit is the initial leverage limit up until the deadline T ,
when it drops to the new lower value.
(a) Explain how to find the portfolio sequence x⋆1 , . . . , x⋆T −1 that maximizes J subject to the
leverage limit constraints.
(b) Find the optimal portfolio sequence x⋆t for the problem instance with data given in deleveraging_data.m.
Compare this sequence with two others: x_t^lp = x0 for t = 1, . . . , T − 1 (i.e., one that does all
trading at the last possible period), and the linearly interpolated portfolio sequence
x_t^lin = (1 − t/T )x0 + (t/T )xT ,    t = 1, . . . , T − 1.
For each of these three portfolio sequences, give the objective value obtained, and plot the
risk and transaction cost adjusted return,
and the leverage ∥xt ∥1 , versus t, for t = 0, . . . , T . Also, for each of the three portfolio sequences,
generate a single plot that shows how the holdings (xt )i of the n assets change over time, for
i = 1, . . . , n.
Give a very short (one or two sentence) intuitive explanation of the results.
17.19 Worst-case variance. Suppose Z is a random variable on Rn with covariance matrix Σ ∈ Sn+ .
Let c ∈ Rn . The variance of Y = cT Z is var(Y ) = cT Σc. We define the worst-case variance
of Y , denoted wcvar(Y ), as the maximum possible value of cT Σ̃c, over all Σ̃ ∈ Sn+ that satisfy
Σii = Σ̃ii , i = 1, . . . , n. In other words, the worst-case variance of Y is the maximum possible
variance, if we are allowed to arbitrarily change the correlations between Zi and Zj . Of course we
have wcvar(Y ) ≥ var(Y ).
(a) Find a simple expression for wcvar(Y ) in terms of c and the diagonal entries of Σ. You must
justify your expression.
(b) Portfolio optimization. Explain how to find the portfolio x ∈ Rn that maximizes the expected
return µT x subject to a limit on risk, var(rT x) = xT Σx ≤ R, and a limit on worst-case
risk wcvar(rT x) ≤ Rwc , where R > 0 and Rwc > R are given. Here µ = E r and Σ =
E(r − µ)(r − µ)T are the (given) mean and covariance of the (random) return vector r ∈ Rn .
(c) Carry out the method of part (b) for the problem instance with data given in
wc_risk_portfolio_opt_data.m. Also find the optimal portfolio when the worst-case risk
limit is ignored. Find the expected return and worst-case risk for these two portfolios.
Remark. If a portfolio is highly leveraged, and the correlations in the returns change drastically,
you (the portfolio manager) can be in big trouble, since you are now exposed to much more risk
than you thought you were. And yes, this (almost exactly) has happened.
17.20 Risk budget allocation. Suppose an amount xi > 0 is invested in n assets, labeled i = 1, ..., n, with
asset return covariance matrix Σ ∈ Sn++ . We define the risk of the investments as the standard
deviation of the total return, R(x) = (xT Σx)1/2 .
We define the (relative) risk contribution of asset i (in the portfolio x) as
ρi = xi (Σx)i / (xT Σx),    i = 1, . . . , n.
These sum to one. In the risk budget allocation problem, we are given a desired risk profile
ρ^des ≻ 0 with 1T ρ^des = 1, and seek an investment mix x for which ρi = ρ^des_i for each i. When
ρ^des = (1/n)1, i.e., each asset contributes an equal share of the total risk, the mix is said to
achieve risk parity.
(a) Explain how to solve the risk budget allocation problem using convex optimization.
Hint. Minimize (1/2)xT Σx − ∑_{i=1}^n ρ^des_i log xi .
(b) Find the investment mix that achieves risk parity for the return covariance matrix
Σ = [  6.1   2.9  −0.8   0.1
       2.9   4.3  −0.3   0.9
      −0.8  −0.3   1.2  −0.7
       0.1   0.9  −0.7   2.3 ].
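Following the hint, a minimal sketch for part (b): minimize the hinted objective (here by plain gradient descent with backtracking, standing in for any convex solver), normalize, and check that the relative risk contributions xi (Σx)i /(xT Σx) come out equal:

```python
import numpy as np

Sigma = np.array([[ 6.1,  2.9, -0.8,  0.1],
                  [ 2.9,  4.3, -0.3,  0.9],
                  [-0.8, -0.3,  1.2, -0.7],
                  [ 0.1,  0.9, -0.7,  2.3]])
rho = np.full(4, 0.25)          # risk parity: equal desired risk contributions

# Minimize (1/2) x^T Sigma x - sum_i rho_i log x_i (convex, x > 0 implicit).
x = np.ones(4)
for _ in range(2000):
    g = Sigma @ x - rho / x
    if np.linalg.norm(g) < 1e-12:
        break
    f = 0.5 * x @ Sigma @ x - rho @ np.log(x)
    t = 1.0
    while True:                  # backtracking line search, keeping x > 0
        xn = x - t * g
        if np.all(xn > 0) and (0.5 * xn @ Sigma @ xn - rho @ np.log(xn)
                               <= f - 0.5 * t * (g @ g)):
            break
        t /= 2
    x = xn

x = x / x.sum()                              # normalize to an investment mix
rc = x * (Sigma @ x) / (x @ Sigma @ x)       # relative risk contributions
print("mix:", x)
print("risk contributions:", rc)             # approx (1/4, 1/4, 1/4, 1/4)
```

At the minimizer, (Σx)i = ρ_i^des /xi , so xi (Σx)i = ρ_i^des ; scaling x does not change the relative contributions.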
17.21 Portfolio rebalancing. We consider the problem of rebalancing a portfolio of assets over multiple
periods. We let ht ∈ Rn denote the vector of our dollar value holdings in n assets, at the beginning
of period t, for t = 1, . . . , T , with negative entries meaning short positions. We will work with the
portfolio weight vector, defined as wt = ht /(1T ht ), where we assume that 1T ht > 0, i.e., the total
portfolio value is positive.
The target portfolio weight vector w⋆ is defined as the solution of the problem
maximize    µT w − (γ/2) wT Σw
subject to  1T w = 1,
where w ∈ Rn is the variable, µ is the mean return, Σ ∈ Sn++ is the return covariance, and γ > 0 is
the risk aversion parameter. The data µ, Σ, and γ are given. In words, the target weights maximize
the risk-adjusted expected return.
At the beginning of each period t we are allowed to rebalance the portfolio by buying and selling
assets. We call the post-trade portfolio weights w̃t . They are found by solving the (rebalancing)
problem
maximize µT w − (γ/2) wT Σw − κT |w − wt |
subject to 1T w = 1,
with variable w ∈ Rn , where κ ∈ Rn+ is the vector of (so-called linear) transaction costs for
the assets. (For example, these could model bid/ask spread.) Thus, we choose the post-trade
weights to maximize the risk-adjusted expected return, minus the transactions costs associated
with rebalancing the portfolio. Note that the pre-trade weight vector wt is known at the time we
solve the problem. If we have w̃t = wt , it means that no rebalancing is done at the beginning of
period t; we simply hold our current portfolio. (This happens if wt = w⋆ , for example.)
After holding the rebalanced portfolio over the investment period, the dollar value of our portfolio
becomes ht+1 = diag(rt )h̃t , where rt ∈ Rn++ is the (random) vector of asset returns over period
t, and h̃t is the post-trade portfolio given in dollar values (which you do not need to know). The
next weight vector is then given by
wt+1 = diag(rt ) w̃t / (rtT w̃t ).
(If rtT w̃t ≤ 0, which means our portfolio has negative value after the investment period, we have
gone bust, and all trading stops.) The standard model is that rt are IID random variables with
mean and covariance µ and Σ, but this is not relevant in this problem.
(a) No-trade condition. Show that w̃t = wt is optimal in the rebalancing problem if
γ |Σ(wt − w⋆ )| ⪯ κ
holds, where the absolute value on the left is elementwise.
Interpretation. The lefthand side measures the deviation of wt from the target portfolio w⋆ ;
when this deviation is smaller than the cost of trading, you do not rebalance.
Hint. Find dual variables, that with w = wt satisfy the KKT conditions for the rebalancing
problem.
(b) Starting from w1 = w⋆ , compute a sequence of portfolio weights w̃t for t = 1, . . . , T . For each
t, find w̃t by solving the rebalancing problem (with wt a known constant); then generate a
vector of returns rt (using our supplied function) to compute wt+1 . (The sequence of weights
is random, so the results won’t be the same each time you run your script. But they should
look similar.)
Report the fraction of periods in which the no-trade condition holds and the fraction of periods
in which the solution has only zero (or negligible) trades, defined as ∥w̃t − wt ∥∞ ≤ 10−3 . Plot
the sequence w̃t for t = 1, 2, . . . , T .
The file portf_weight_rebalance_data.* provides the data, a function to generate a (ran-
dom) vector rt of market returns, and the code to plot the sequence w̃t . (The plotting code
also draws a dot for every non-negligible trade.)
Carry this out for two values of κ, κ = κ1 and κ = κ2 . Briefly comment on what you observe.
Hint. In CVXPY we recommend using the solver ECOS. But if you use SCS you should
increase the default accuracy, by passing eps=1e-4 to the cvxpy.Problem.solve() method.
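The no-trade condition of part (a) can be checked numerically. Below is a NumPy sketch with hypothetical two-asset data (not the course data file): w⋆ is computed in closed form from the KKT conditions of the target problem, the condition holds at and near w⋆, and fails for a large deviation.

```python
import numpy as np

# Hypothetical 2-asset data (not from the course data files).
Sigma = np.array([[0.04, 0.01],
                  [0.01, 0.09]])
mu    = np.array([0.10, 0.12])
gamma = 2.0
kappa = np.array([0.002, 0.002])
ones  = np.ones(2)

# Target weights: maximize mu^T w - (gamma/2) w^T Sigma w  s.t.  1^T w = 1.
# KKT: mu - gamma Sigma w - nu 1 = 0, which gives a closed form for w_star.
Si_mu  = np.linalg.solve(Sigma, mu)
Si_one = np.linalg.solve(Sigma, ones)
nu     = (ones @ Si_mu - gamma) / (ones @ Si_one)
w_star = np.linalg.solve(Sigma, mu - nu * ones) / gamma

def no_trade(w_t):
    # Part (a): w_t is kept unchanged if gamma |Sigma (w_t - w_star)| <= kappa.
    return bool(np.all(gamma * np.abs(Sigma @ (w_t - w_star)) <= kappa))
```

Evaluating `no_trade` at wt = w⋆ gives True (the lefthand side is zero), while a perturbation of ±0.1 in the weights violates the condition for this κ.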
17.22 Portfolio optimization using multiple risk models. Let w ∈ Rn be a vector of portfolio weights,
where negative values correspond to short positions, and the weights are normalized such that
1T w = 1. The expected return of the portfolio is µT w, where µ ∈ Rn is the (known) vector
of expected asset returns. As usual we measure the risk of the portfolio using the variance of
the portfolio return. However, in this problem we do not know the covariance matrix Σ of the
asset returns; instead we assume that Σ is one of M (known) covariance matrices Σ(k) ∈ Sn++ ,
k = 1, . . . , M . We can think of the Σ(k) as representing M different risk models, associated with
M different market regimes (say). For a weight vector w, there are M different possible values
of the risk: wT Σ(k) w, k = 1, . . . , M . The worst-case risk, across the different models, is given by
maxk=1,...,M wT Σ(k) w. (This is the same as the worst-case risk over all covariance matrices in the
convex hull of Σ(1) , . . . , Σ(M ) .)
We will choose the portfolio weights in order to maximize the expected return, adjusted by the
worst-case risk, i.e., as the solution w⋆ of the problem
maximize µT w − γ maxk=1,...,M wT Σ(k) w
subject to 1T w = 1,
with variable w, where γ > 0 is a given risk-aversion parameter. We call this the mean-worst-case-
risk portfolio problem.
(a) Show that there exist γ1 , . . . , γM ≥ 0 such that ∑_{k=1}^M γk = γ and the solution w⋆ of the
mean-worst-case-risk portfolio problem is also the solution of the problem

maximize   µT w − ∑_{k=1}^M γk wT Σ(k) w
subject to 1T w = 1,
with variable w.
Remark. The result above has a beautiful interpretation: We can think of the γk as allocating
our total risk aversion γ in the mean-worst-case-risk portfolio problem across the M different
regimes.
Hint. The values γk are not easy to find: you have to solve the mean-worst-case-risk problem
to get them. Thus, this result does not help us solve the mean-worst-case-risk problem; it
simply gives a nice interpretation of its solution.
(b) Find the optimal portfolio weights for the problem instance with data given in multi_risk_portfolio_data.*.
Report the weights and the values of γk , k = 1, . . . , M . Give the M possible values of the risk
associated with your weights, and the worst-case risk.
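The parenthetical claim above (worst-case risk over the M models equals worst-case risk over their convex hull) can be checked numerically. A NumPy sketch with hypothetical random risk models: for any convex combination of the Σ(k), the risk is bounded by the worst case over the vertices, since wT Σw is linear in Σ.

```python
import numpy as np

rng = np.random.default_rng(0)
n, M = 4, 3

# Hypothetical risk models: random positive definite covariance matrices.
Sigmas = []
for _ in range(M):
    A = rng.standard_normal((n, n))
    Sigmas.append(A @ A.T + np.eye(n))

w = rng.random(n)
w /= w.sum()   # normalize so 1^T w = 1

wc_risk = max(w @ S @ w for S in Sigmas)   # worst-case risk over the M models

# Risk over random points of the convex hull of Sigma^(1), ..., Sigma^(M).
hull_risks = []
for _ in range(100):
    theta = rng.random(M)
    theta /= theta.sum()
    S = sum(tk * Sk for tk, Sk in zip(theta, Sigmas))
    hull_risks.append(w @ S @ w)
```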
17.23 Computing market-clearing prices. We consider n commodities or goods, with p ∈ Rn++ the vector
of prices (per unit quantity) of them. The (nonnegative) demand for the products is a function of
the prices, which we denote D : Rn → Rn , so D(p) is the demand when the product prices are
p. The (nonnegative) supply of the products (i.e., the amounts that manufacturers are willing to
produce) is also a function of the prices, which we denote S : Rn → Rn , so S(p) is the supply
when the product prices are p. We say that the market clears if S(p) = D(p), i.e., supply equals
demand, and we refer to p in this case as a set of market-clearing prices.
Elementary economics courses consider the special case n = 1, i.e., a single commodity, so supply
and demand can be plotted (vertically) against the price (on the horizontal axis). It is assumed
that demand decreases with increasing price, and supply increases; the market clearing price can be
found ‘graphically’, as the point where the supply and demand curves intersect. In this problem we
examine some cases in which market-clearing prices (for the general case n > 1) can be computed
using convex optimization.
We assume that the demand function is Hicksian, which means it has the form D(p) = ∇E(p),
where E : Rn → R is a differentiable function that is concave and increasing in each argument,
called the expenditure function. (While not relevant in this problem, Hicksian demand arises from
a model in which consumers make purchases by maximizing a concave utility function.)
We will assume that the producers are independent, so S(p)i = Si (pi ), i = 1, . . . , n, where Si :
R → R is the supply function for good i. We will assume that the supply functions are positive
and increasing on their domain R+ .
(a) Explain how to use convex optimization to find market-clearing prices under the assumptions
given above. (You do not need to worry about technical details like zero prices, or cases in
which there are no market-clearing prices.)
(b) Compute market-clearing prices for the specific case with n = 4,
(b) Compute market-clearing prices for the specific case with n = 4,

E(p) = ( ∏_{i=1}^4 pi )^(1/4) ,
Hint: In CVX and CVXPY, geo_mean gives the geometric mean of the entries of a vector
argument. Julia does not yet have a vector argument geom_mean function, but you can get the
geometric mean of 4 variables a, b, c, d using geomean(geomean(a, b), geomean(c, d)).
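A NumPy sketch of one approach to part (a), with hypothetical linear supply functions Si (pi ) = αi pi (the α values are our own, not the problem data): minimize Φ(p) = ∑i ∫0^{pi} Si (u) du − E(p), whose stationarity condition ∇Φ(p) = S(p) − ∇E(p) = 0 is exactly market clearing. Plain gradient descent on Φ recovers the clearing prices.

```python
import numpy as np

# Hypothetical supply slopes: S_i(p_i) = alpha_i p_i (increasing, positive).
alpha = np.array([1.0, 2.0, 0.5, 1.0])

def E(p):
    # Expenditure function from part (b): geometric mean of the prices.
    return np.prod(p) ** 0.25

def demand(p):
    return E(p) / (4.0 * p)   # Hicksian demand D(p) = grad E(p)

def supply(p):
    return alpha * p

# Gradient descent on Phi(p) = sum_i alpha_i p_i^2 / 2 - E(p);
# grad Phi = S(p) - D(p), so a stationary point clears the market.
p = np.ones(4)
for _ in range(5000):
    p = np.maximum(p - 0.1 * (supply(p) - demand(p)), 1e-3)
```

Under linear supply there is also a closed form: E⋆ = ∏i (4αi )^(−1/4) and pi = (E⋆/(4αi ))^(1/2), which the iteration matches.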
17.24 Funding an expense stream. Your task is to fund an expense stream over n time periods. We
consider an expense stream e ∈ Rn , so that et is our expenditure at time t.
One possibility for funding the expense stream is through our bank account. At time period t,
the account has balance bt and we withdraw an amount wt . (A negative withdrawal represents a
deposit.) The value of our bank account accumulates with an interest rate ρ per time period, less
withdrawals:
bt+1 = (1 + ρ)bt − wt .
We assume the account value must be nonnegative, so that bt ≥ 0 for all t.
We can also use other investments to fund our expense stream, which we purchase at the initial
time period t = 1, and which pay out over the n time periods. The amount each investment type
pays out over the n time periods is given by the payout matrix P , defined so that Ptj is the amount
investment type j pays out at time period t per dollar invested. There are m investment types,
and we purchase xj ≥ 0 dollars of investment type j. In time period t, the total payout of all
investments purchased is therefore given by (P x)t .
In each time period, the sum of the withdrawals and the investment payouts must cover the expense
stream, so that
wt + (P x)t ≥ et
for all t = 1, . . . , n.
The total amount we invest to fund the expense stream is the sum of the initial account balance,
and the sum total of the investments purchased: b1 + 1T x.
(a) Show that the minimum initial investment that funds the expense stream can be found by
solving a convex optimization problem.
(b) Using the data in expense_stream_data.*, carry out your method in part (a). On three
graphs, plot the expense stream, the payouts from the m investment types (so m different
curves), and the bank account balance, all as a function of the time period t. Report the
minimum initial investment, and the initial investment required when no investments are
purchased (so x = 0).
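The problem in part (a) is a linear program. Below is a sketch using scipy.optimize.linprog (rather than CVX*) on a tiny hypothetical instance of our own; the variables are (b, w, x), with the balance recursion as equality constraints and the coverage condition as inequalities.

```python
import numpy as np
from scipy.optimize import linprog

# Tiny hypothetical instance (not the course data): n periods, m investment types.
n, m, rho = 4, 2, 0.01
e = np.array([1.0, 1.0, 1.0, 1.0])    # expense stream
P = np.array([[0.3, 0.0],             # P[t, j]: payout of type j in period t
              [0.3, 0.6],
              [0.3, 0.6],
              [0.3, 0.6]])

# Variables z = (b_1..b_{n+1}, w_1..w_n, x_1..x_m); minimize b_1 + 1^T x.
N = (n + 1) + n + m
c = np.zeros(N); c[0] = 1.0; c[-m:] = 1.0

# Equalities: b_{t+1} - (1+rho) b_t + w_t = 0, for t = 1, ..., n.
A_eq = np.zeros((n, N)); b_eq = np.zeros(n)
for t in range(n):
    A_eq[t, t + 1] = 1.0
    A_eq[t, t] = -(1 + rho)
    A_eq[t, (n + 1) + t] = 1.0

# Coverage inequalities: -w_t - (P x)_t <= -e_t; also b >= 0, x >= 0, w free.
A_ub = np.zeros((n, N)); b_ub = -e
for t in range(n):
    A_ub[t, (n + 1) + t] = -1.0
    A_ub[t, -m:] = -P[t]

bounds = [(0, None)] * (n + 1) + [(None, None)] * n + [(0, None)] * m
res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)

# Same LP with investments disallowed (x = 0), for the comparison in part (b).
bounds0 = [(0, None)] * (n + 1) + [(None, None)] * n + [(0, 0)] * m
res0 = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds0)
```

Since forcing x = 0 only shrinks the feasible set, res.fun is never larger than res0.fun.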
17.25 Yield curve envelope. The term structure of interest rates gives the current value of a future
payment. A $1 payment in period t is worth pt today, with p0 = 1.
The curve pt is called the discount curve. It is often described in terms of the yield curve, defined
as yt = pt^(−1/t) − 1. We will assume that pt is positive, nonincreasing, and satisfies p0 = 1. (The
nonincreasing assumption means that future payments are worth less than current payments.) We
don’t know the discount curve (or equivalently, the yield curve) but we do have some additional
information about it, beyond the assumptions made above, based on the market prices of some
known bonds.
A bond is characterized by a future cash flow, called the coupon payment schedule, given by
c ∈ RT , where ct ≥ 0 is the payment in period t, for t = 1, . . . , T . (Bond payment schedules typically
have the form of a constant payment every month, or quarter, or year, and a large payment on its
maturity date, with no payments after that. But you don’t need to know this for this problem.)
The present (or discounted) value of the bond is cT p. We assume this is the current market price
of the bond, which we know. We have K bonds with known coupon schedules ck and market prices
bk , k = 1, . . . , K. Together with the assumptions above (nonnegativity and monotonicity), this
describes a set of possible discount rates which we denote P ⊂ RT . Thus P is the set of discount
curves that are consistent with the known bond market prices and our assumptions.
Define

dt^max = max_{p∈P} pt ,    dt^min = min_{p∈P} pt ,    t = 0, . . . , T.
These functions, called the upper and lower envelopes of the discount curve, give the range of
possible values of the discount at each time, over all discounts compatible with our assumptions
and the known bond prices. From these we can get the maximum and minimum values of the yield
curve,
yt^max = (dt^min )^(−1/t) − 1,    yt^min = (dt^max )^(−1/t) − 1,    t = 1, . . . , T.
These are called the upper and lower envelope of the yield curve, respectively.
(a) Explain how to find dmax and dmin using convex or quasiconvex optimization, given T , the
coupon payment schedules ck , and known prices, bk , of the K bonds. Your solution may
involve solving a reasonable number of problems.
(b) Solve the problem you formulated using the problem data found in the file yield_curve_data.*.
After computing dmax and dmin , you should plot your lower and upper envelopes against time.
Create one figure for the yield curve, and another for the discount curve. Besides t = 0,
are there any time points at which you can be certain what value of the discount curve (or
equivalently, the yield curve) is? If so, explain briefly.
17.26 Portfolio optimization with a drawdown limit. You are given the time series of the daily share prices
of n assets over T days, p1 , . . . , pT ∈ Rn++ . We consider a buy-and-hold portfolio, consisting of si
shares of asset i (with negative meaning a short position). The value of the portfolio on day t is
given by Vt = pTt s. We will assume that Vt > 0 for all t. We are interested in choosing s ∈ Rn ,
subject to the constraint V1 = B, where B > 0 is the total budget to be invested on day 1. The
objective is to maximize the ending portfolio value VT , subject to the budget constraint above, and
an additional constraint described below related to the maximum drawdown of the portfolio.
The last high or high water mark at time t is Ht = maxτ ≤t Vτ , the maximum portfolio value up to
time t. The drawdown Dt at time t is defined as
Dt = (Ht − Vt )/Ht .
(This is often expressed as a percentage.) Investors get nervous when the drawdown gets too large,
say, more than 10%.
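The high water mark and drawdown are cumulative-maximum computations. A short NumPy sketch with a hypothetical value series:

```python
import numpy as np

# Hypothetical portfolio value series V_1, ..., V_T (positive, as assumed).
V = np.array([1.0, 1.2, 0.9, 1.1, 1.3])

H = np.maximum.accumulate(V)   # high water mark H_t = max over tau <= t of V_tau
D = (H - V) / H                # drawdown D_t = (H_t - V_t) / H_t
max_drawdown = D.max()
```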
(a) Explain how to use convex optimization to find a portfolio s that maximizes final value VT ,
subject to the budget constraint, and Dt ≤ Dmax , where Dmax ∈ (0, 1) is given. You can change
or introduce variables, reformulate the problem, or use quasiconvex optimization. In all cases,
you must explain your method, and establish the convexity (or concavity, or quasiconvexity,
etc.) of any function for which it is not obvious. The number of variables or constraints you
introduce in your formulation should be no more than a small multiple of T .
(b) Find optimal portfolios for the problem data given in drawdown_data.* and asset_prices.csv,
under drawdown limits Dmax ∈ {0.05, 0.10, 0.20}. Report the optimal final value in each case.
Plot the portfolio value Vt versus time for each choice of Dmax .
Remark. The optimization problem described here is not immediately useful, since you obviously
would not know future returns when you choose your portfolio. Instead we are finding what buy
and hold portfolio would have been best over the (past) T periods, had you known future returns.
But if you believe the future looks like the past (which it need not), this portfolio could be a good
choice for the future, too.
17.27 Minimax portfolio optimization. We consider a portfolio optimization problem with n assets held
for a fixed period of time. Let xi denote the amount of asset i held. The price change of asset i
over the period is given by pi . The random vector p has mean µ ∈ Rn+ and covariance Σ ∈ Sn+ .
The risk adjusted return is defined to be
µT x − γxT Σx
where γ ≥ 0 is the risk aversion parameter. Unfortunately we do not know Σ, but instead we know
that Σ ∈ A where
A = {Σ | Σ ⪰ 0, Lij ≤ Σij ≤ Uij for i, j = 1, . . . , n}
We do know the matrices L and U , which are symmetric and provide lower and upper bounds
on the entries of the matrix Σ. (Note the two different types of inequality symbols used in the
definition of A above.) We assume that Σmid = (L + U ) /2 ≻ 0. We require that xi ≥ 0 (so all
positions are long).
We intend to maximize the worst-case risk-adjusted return, by solving the optimization problem
maximize µT x − γ maxΣ∈A xT Σx
subject to x ⪰ 0,
1T x = 1,
(a) Show that maxΣ∈A xT Σx is equal to the optimal value of the following optimization problem
with variables V, W ∈ Rn×n :

minimize   tr(V U − W L)
subject to Vij ≥ 0, i, j = 1, . . . , n,
           Wij ≥ 0, i, j = 1, . . . , n,
           [ V − W   x ]
           [  xT     1 ]  ⪰ 0.
(b) Explain how the worst-case risk-adjusted return problem can be solved using convex optimiza-
tion.
(c) The file minimax_data.m contains L, U and µ. For each γ ∈ {2−4 , 2−3 , . . . , 23 , 24 } apply your
method from part (b). Plot the mean return µT x⋆ of the optimal portfolio versus log2 γ.
Similarly plot the worst-case risk, given by maxΣ∈A (x⋆ )T Σx⋆ , versus log2 γ.
17.28 Maximum Sharpe ratio portfolio. We consider a portfolio optimization problem with portfolio vector
x ∈ Rn , mean return µ ∈ Rn , and return covariance Σ ∈ Sn++ . The ratio of portfolio mean return
µT x to portfolio standard deviation ∥Σ1/2 x∥2 is called the Sharpe ratio of the portfolio. (It is often
modified by subtracting a risk-free return from the mean return.) The Sharpe ratio measures how
much return you get per risk taken on, and is a widely used single metric that combines return and
risk. It is undefined for µT x ≤ 0.
Consider the problem of choosing the portfolio to maximize the Sharpe ratio, subject to the con-
straint 1T x = 1, and the leverage constraint ∥x∥1 ≤ Lmax , where Lmax ≥ 1 is a given leverage limit.
You can assume there is a feasible x with µT x > 0.
(a) Show that the maximum Sharpe ratio problem is quasiconvex in the variable x.
(b) Show how to solve the maximum Sharpe ratio problem by solving one convex optimization
problem. You must fully justify any change of variables or problem transformation.
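As a numerical sanity check (not the transformation the problem asks for): when the leverage constraint is inactive (Lmax large), the scale invariance of the Sharpe ratio means the maximizer on 1T x = 1 is proportional to Σ−1 µ (assuming 1T Σ−1 µ > 0). A NumPy sketch with hypothetical data compares this candidate against random feasible portfolios.

```python
import numpy as np

# Hypothetical data; Sigma is symmetric positive definite.
Sigma = np.array([[0.04, 0.01, 0.00],
                  [0.01, 0.09, 0.02],
                  [0.00, 0.02, 0.16]])
mu = np.array([0.08, 0.10, 0.12])

def sharpe(x):
    return (mu @ x) / np.sqrt(x @ Sigma @ x)

# Candidate maximizer: Sigma^{-1} mu, rescaled so 1^T x = 1.
x_star = np.linalg.solve(Sigma, mu)
x_star /= x_star.sum()   # valid here since 1^T Sigma^{-1} mu > 0

rng = np.random.default_rng(1)
worst_gap = np.inf
for _ in range(1000):
    w = rng.standard_normal(3)
    if abs(w.sum()) < 1e-6:
        continue
    w /= w.sum()
    if mu @ w > 0:   # Sharpe ratio defined only for positive mean return
        worst_gap = min(worst_gap, sharpe(x_star) - sharpe(w))
```

The bound sharpe(x) ≤ (µT Σ−1 µ)^(1/2) is Cauchy–Schwarz applied to Σ^(1/2)x and Σ^(−1/2)µ, and x_star attains it.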
17.29 Post-modern portfolio optimization metrics. Let r ∈ RT denote a time series (say, daily) of in-
vestment returns, i.e., the increase in value divided by initial value. The value of the investment
(typically, a portfolio) is the time series vector v ∈ RT defined by the recursion
vt+1 = vt (1 + rt ), t = 0, . . . , T − 1,
with v0 a given positive initial value. Here we are compounding the investment returns. We will
assume that all returns satisfy rt > −1, which implies that v ≻ 0. We define the high-water value
or last high value as

ht = max_{τ ≤ t} vτ ,    t = 1, . . . , T.
(b) Downside variance. (Minimize.) The downside variance is (1/T ) ∑_t ((rt − µ)− )^2 , where
(u)− = max{−u, 0}, and µ is the mean return. This assesses a penalty for a return below the
average (the ‘downside’), but not for a return above the average.
(c) Maximum drawdown. (Minimize.) The drawdown at period t is defined as dt = (ht − vt )/ht .
The maximum drawdown is defined as maxt dt .
(d) Maximum consecutive days under water. (Minimize.) A time period t is called under water
if vt < ht , i.e., the current value is less than the last high. Maximum consecutive days under
water means just that, i.e., the maximum number of consecutive days under water.
Remark. Many other post-modern metrics can be derived from, or are related to, the ones described
above. Examples include the Sortino, Calmar, and Information ratios. You can thank the EE364a
staff for refraining from asking about these.
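The metrics in parts (b)–(d) are direct vector computations. A NumPy sketch on a hypothetical return series:

```python
import numpy as np

# Hypothetical daily returns; v_0 = 1.
r = np.array([0.0, 0.2, -0.2, 0.0])
T = len(r)
v = np.concatenate(([1.0], np.cumprod(1 + r)))   # compounded values v_0, ..., v_T

mu = r.mean()
downside_var = np.mean(np.maximum(mu - r, 0.0) ** 2)   # (1/T) sum_t ((r_t - mu)_-)^2

h = np.maximum.accumulate(v)[1:]   # high-water value h_t, t = 1, ..., T
d = (h - v[1:]) / h                # drawdown d_t
max_drawdown = d.max()

under = v[1:] < h                  # under water indicator
streak = best = 0
for u in under:
    streak = streak + 1 if u else 0
    best = max(best, streak)
max_days_under_water = best
```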
17.30 Currency exchange. An entity (such as a multinational corporation) holds n = 10 currencies, with
ci^init ≥ 0 denoting the number of units of currency i. The currencies are, in order, USD, EUR, GBP,
CAD, JPY, CNY, RUB, MXN, INR, and BRL. Our goal is to exchange currencies on a market so
that, after the exchanges, we hold at least ci^req units of each currency i.
The exchange rates are given by F ∈ Rn×n , where Fij is the units of currency j it costs to buy one
unit of currency i. We call 1/Fij the bid price for currency j in terms of currency i, and Fji the
ask price for currency j in terms of currency i.
For example, suppose that F12 = 0.88 and F21 = 1.18. This means that it takes 0.88 EUR to buy
one USD, and it takes 1.18 USD to buy one EUR; the bid and ask prices for EUR in USD are
1.1364 USD and 1.1800 USD, respectively.
We will value a set of currency holdings in USD, by valuing each unit of currency j at the geometric
mean of the bid and ask price in USD, (Fj1 /F1j )^(1/2) . In our example above, we would value one
EUR as (1.1364 · 1.1800)^(1/2) = 1.1580 USD.
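The valuation formula can be checked against the numbers in the example above:

```python
import numpy as np

# From the example: F[0, 1] = 0.88 (EUR per USD), F[1, 0] = 1.18 (USD per EUR).
F01, F10 = 0.88, 1.18

bid = 1 / F01                 # bid price of EUR in USD
ask = F10                     # ask price of EUR in USD
value = np.sqrt(F10 / F01)    # geometric-mean valuation sqrt(F_j1 / F_1j)
```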
We let X ∈ Rn×n+ denote the currency exchanges that we carry out, with Xij ≥ 0 the amount
of currency j we exchange on the market for currency i, for which we obtain Xij /Fij of currency
i. (You can assume that Xii = 0.) The total of each currency j that we exchange into other
currencies cannot exceed our initial holdings, cj^init . After the currency exchange, we must end up
with at least ci^req of currency i. (The post-exchange amount we hold of currency i is our original
holding ci^init , minus the total we exchange into other currencies, plus the total amount we obtain
from exchanging other currencies into currency i.)
The cost of the exchanges is the decrease in value between the currency holdings before and after
the exchanges, in USD. The cost can be interpreted as the transaction costs incurred by crossing
the bid-ask spread (i.e., if the bid and the ask were the same, there would be no cost.)
Find the currency exchanges X ⋆ that minimize the currency exchange cost for the data in currency_exchange_data
(These data are based on real exchange rates, but with artificially large spreads, to make sure that
you don’t encounter any numerical issues.) Explain your method, and give the optimal value, i.e.,
the cost obtained.
17.31 Minimizing tax liability. You will liquidate (sell) some stocks that you hold to raise a given amount
of cash C. The stocks are divided into n tax lots; a tax lot is a group of stocks you bought at the
same time. For each tax lot i, you have the cost basis bi > 0, the current market value vi > 0 (both
in $), and its short term / long term status. (Long term means that you acquired the stock in the
tax lot more than one year ago, and short term means that you acquired it less than one year ago.)
We assume that tax lots i = 1, . . . , L are long term, and tax lots i = L + 1, . . . , n are short term.
The goal is to choose how much of each lot to sell. We let si denote the amount of tax lot i we sell
(in $). These must satisfy 0 ≤ si ≤ vi , and we must have 1T s = C.
When vi < bi , the sale is called a loss, and when vi > bi , the sale is called a gain. The amount of
the gain or loss is given by gi = (si /vi )(vi − bi ), with positive values meaning a gain, and negative
values meaning a loss. We define the (net) long and short term gains as
L
X n
X
l s
N = gi , N = gi .
i=1 i=L+1
When N l > 0 (N l < 0), we say that we have had a long term capital gain (loss), and similarly for a
short term gain.
These two net gains determine the total tax liability. The long and short term net gains are taxed
at two different rates, ρl and ρs , respectively, which satisfy 0 < ρl < ρs .
The simplest case is when both net gains are nonnegative, in which case the tax is ρl N l + ρs N s .
Another simple case occurs when both net gains are nonpositive, in which case the tax is zero.
In the case when one of the net gains is positive and the other is negative, you are allowed to use
the net loss in one to offset the net gain in the other, up to the value of the net gain. Specifically,
if N l < 0 (you have a long term loss), the tax is ρs (N s + N l )+ ; if N s < 0 (you have a short term
loss), the tax is ρl (N s + N l )+ . (Here (u)+ = max{u, 0}.) Note that you have zero tax liability if
N s + N l ≤ 0, i.e., the sum of your long and short term net gains is less than or equal to zero.
Apology. Sorry this sounds complicated. In fact, this is a highly simplified version of the way taxes
really work.
Hint. The tax liability is neither a convex nor quasiconvex function of the long and short term net
gains N l and N s .
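The tax rules above can be encoded directly, and the hinted nonconvexity checked numerically (the specific evaluation points are our own illustration):

```python
def tax(Nl, Ns, rho_l=0.2, rho_s=0.3):
    """Tax liability as a function of net long/short term gains (rates from part (b))."""
    pos = lambda u: max(u, 0.0)
    if Nl >= 0 and Ns >= 0:
        return rho_l * Nl + rho_s * Ns   # both net gains taxed at their rates
    if Nl <= 0 and Ns <= 0:
        return 0.0                       # pure losses: no tax
    if Nl < 0:                           # long term loss offsets short term gain
        return rho_s * pos(Ns + Nl)
    return rho_l * pos(Ns + Nl)          # short term loss offsets long term gain

# Along Ns = 100 the function has a concave kink at Nl = 0 (slope drops from
# rho_s to rho_l), so the midpoint value exceeds the average of the endpoints.
mid = tax(0.0, 100.0)
avg = 0.5 * (tax(-50.0, 100.0) + tax(50.0, 100.0))
```

Here mid = 30 while avg = 27.5, confirming that the liability is not convex in (N l , N s ).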
(a) Explain how to find s that minimizes the tax liability, subject to the constraints listed above,
using convex optimization. Your solution can involve solving a modest number of convex
problems.
(b) Suppose you want to raise C = 2300 dollars from n = 10 tax lots, and the cost basis and
values of each lot are given by
b = (400, 80, 400, 200, 400, 400, 80, 400, 100, 500),
v = (500, 100, 500, 200, 700, 300, 120, 300, 150, 600).
Carry out your method on this data with L = 4, ρl = 0.2, and ρs = 0.3. Give optimal values
of si , and the optimal value of the tax liability. Compare this to the tax liability when you
liquidate all tax lots proportionally, i.e., s = (C/1T v)v.
years, the investor puts money into the investment in response to capital calls, up to the amount
of previous commitments. The investor receives money from the investment in later years through
distributions. Examples of alternative investments include private equity, venture capital, and
infrastructure projects. Alternative investments are found in the portfolios of insurance companies,
retirement funds, and university endowments. (‘Alternative’ refers to the investment not being the
more usual stocks, bonds, currencies, and financial derivatives.)
We consider time periods t = 1, . . . , T , which are typically quarters. We first describe some critical
quantities.
The units for all of these is typically millions of USD. Among these quantities, the only ones we
have direct control over are the commitments ct ; the others are functions of these.
A simple dynamical model of these variables is

nt+1 = (1 + r)nt + pt − dt ,    ut+1 = ut − pt + ct ,    t = 1, . . . , T,

where r ≥ 0 is the per-period return, with initial conditions n1 = u1 = 0. (Note that n and u
are (T + 1)-vectors, whereas c, d, and p are T -vectors.) In words: the value of the investment
increases by its return, plus the amount paid in, minus the amount distributed; the total uncalled
commitments is decreased by the capital calls, and increased by new commitments. The calls and
distributions are modeled as
pt = γ call ut , dt = γ dist nt , t = 1, . . . , T,
where γ call ∈ (0, 1) and γ dist ∈ (0, 1) are the call and distribution intensities, respectively. The
parameters r, γ call , and γ dist are given. Your job is to choose the sequence of commitments c =
(c1 , . . . , cT ).
The commitments and the capital calls are limited by ct ≤ cmax and pt ≤ pmax , for t = 1, . . . , T ,
where cmax > 0 and pmax > 0 are given. In addition we have a total budget B > 0 for commitments,
with 1T c ≤ B. Our objective is to minimize
(1/(T + 1)) ∑_{t=1}^{T+1} (nt − ndes )^2 + λ (1/(T − 1)) ∑_{t=1}^{T−1} (ct+1 − ct )^2 ,
where ndes > 0 is a given positive target NAV, and λ > 0 is a parameter. The first term in the
objective is the mean-square tracking error, and the second term, the mean-square difference in
commitments, encourages smooth sequences of commitments.
(a) Optimized commitments. Explain how to solve this problem with convex optimization. Solve
this problem with parameters T = 40 (ten years), r = 0.04 (4% quarterly return),
Plot c, p, d, n, and u versus t. Give the root-mean-square (RMS) tracking error, i.e., the
square root of the mean-square tracking error, for the optimal commitments.
(b) Constant commitment based on steady-state. By solving the dynamics equations with all
quantities constant, we find that css = (γ dist − r)ndes is the value of a constant commitment
(i.e., the same each period) that gives nt = ndes asymptotically, in steady-state. Plot the same
quantities as in part (a) for the constant commitment ct = css for t = 1, . . . , T . Give the RMS
tracking error. Hint. A quick and simple (but not computationally efficient) way to do the
simulation is to modify the code for part (a), adding the constraint that ct = css , t = 1, . . . , T .
Give a very brief description of what you see, comparing the optimal sequence of commitments
found in part (a) and the constant commitments found in part (b).
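The steady-state claim in part (b) is easy to verify by simulating the dynamics (NAV increases by return plus calls minus distributions; uncalled commitments decrease by calls and increase by new commitments). The intensity values below are hypothetical stand-ins, not the problem's parameters:

```python
import numpy as np

# Hypothetical parameters for illustration (not the course values).
r, g_call, g_dist = 0.04, 0.30, 0.10
n_des = 100.0
c_ss = (g_dist - r) * n_des     # constant commitment from part (b)

T = 400
n = np.zeros(T + 1)             # NAV, n_1 = 0
u = np.zeros(T + 1)             # uncalled commitments, u_1 = 0
for t in range(T):
    p = g_call * u[t]           # capital call
    d = g_dist * n[t]           # distribution
    n[t + 1] = (1 + r) * n[t] + p - d
    u[t + 1] = u[t] - p + c_ss
```

With γdist > r the linear dynamics are stable, and nt converges to ndes (and ut to css /γcall).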
17.33 Maximizing diversification ratio. Let x ∈ Rn+ , with 1T x = 1, denote a portfolio of n assets, with
xi the fraction of the total value (assumed positive) invested in asset i. Let Σ ∈ Sn++ denote the
covariance matrix of the asset returns. The diversification ratio of the portfolio is defined as

D(x) = σ T x / (xT Σx)^(1/2) ,

where σi = (Σii )^(1/2) . Note that D is defined for any x ∈ Rn+ with 1T x = 1.
We consider the problem of choosing x to maximize the diversification ratio, subject to limits on
the weights,
maximize D(x)
subject to 1T x = 1, 0 ⪯ x ⪯ M,
where M ≻ 0 is a given vector of maximum allowed weights, with 1T M > 1.
Remark. (The following is not needed to solve the problem, but gives some background.) For any
long-only portfolio x we have D(x) ≥ 1. To see this we note that
xT Σx = ∑_{ij} xi xj σi σj ρij ≤ ∑_{ij} xi xj σi σj = (σ T x)^2 ,
where ρij = Σij /(σi σj ) is the correlation, which satisfies ρij ≤ 1. The smallest possible value
of diversification D(x) = 1 occurs only when x = ek (the kth unit vector), i.e., the portfolio is
concentrated in one asset.
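The remark's inequality D(x) ≥ 1 (with equality at the unit vectors) is easy to confirm numerically on random long-only portfolios; the random covariance below is a hypothetical instance, not the course data:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
A = rng.standard_normal((n, n))
Sigma = A @ A.T + 0.1 * np.eye(n)    # hypothetical positive definite covariance
sigma = np.sqrt(np.diag(Sigma))

def D(x):
    return (sigma @ x) / np.sqrt(x @ Sigma @ x)

# D(x) >= 1 for every long-only portfolio; equality at the unit vectors e_k.
ratios = []
for _ in range(200):
    x = rng.random(n)
    x /= x.sum()
    ratios.append(D(x))
unit_ratios = [D(np.eye(n)[k]) for k in range(n)]
```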
(a) Explain how to use convex optimization to solve the problem. We will give half credit for
a solution that involves solving a quasiconvex optimization problem, and full credit to one
that relies on solving one convex problem. Hints. You may need to change variables to get a
one-convex-problem method. Note also that D(tx) = D(x) for any t > 0.
(b) Use your method from part (a) to solve the problem instance with data given in max_divers_data.*.
Give an optimal x⋆ , and the associated diversification ratio D(x⋆ ).
The (long-only) minimum variance portfolio xmv is the one that minimizes xT Σx subject to
0 ⪯ x ⪯ M , 1T x = 1. Find D(xmv ), and compare it to D(x⋆ ). Compare the maximum
diversification and minimum variance portfolios using a bar plot. (The data file contains
code for creating such plots.)
17.34 Optimal exchange. We consider a market with n (divisible) goods that a set of N agents or
participants can exchange or trade with each other. We let xi ∈ Rn denote the amounts of
goods that agent i takes, with (xi )j < 0 meaning that agent i gives the amount |(xi )j |. We say
that the market clears if x1 + · · · + xN = 0, which means that for each good, the total amount
taken by participants balances the total amount given by other participants. (We assume that
each participant starts with an endowment of goods, which allows them to give some away.) The
particular choice xi = 0, i = 1, . . . , N , means that no goods are exchanged.
Each participant derives a utility Ui (xi ) (in dollars, say) from the level of goods taken (or given) xi .
We will assume that the functions Ui : Rn → R are increasing, strictly concave, and differentiable,
with 0 ∈ dom Ui , i = 1, . . . , N . (Everything can be made to work when they are just concave, but
it gets more complicated.)
Suppose x⋆i , i = 1, . . . , N , maximize the total utility U1 (x1 ) + · · · + UN (xN ) subject to the market
clearing constraint. Unless all of these are zero, we have (by definition)

U1 (x⋆1 ) + · · · + UN (x⋆N ) > U1 (0) + · · · + UN (0),
which means that by optimal trading, the total utility increases. In this exercise we discuss how to
compensate the participants, or, put another way, how to allocate the increase in total utility to
the participants.
To fix the sign convention for the dual variable, we work with the Lagrangian

L(x1 , . . . , xN , ν) = U1 (x1 ) + · · · + UN (xN ) − ν T (x1 + · · · + xN ),

with dual variable ν ∈ Rn , and let p = ν ⋆ denote an optimal dual variable value. (And yes, you do
have strong duality here.)
Not surprisingly, p can be interpreted as a vector of prices for the goods. Below you will show that
p ≻ 0, i.e., the prices for the goods are all positive. The payment by participant i (in dollars) for
participating in the exchange is pT xi . (If this is negative, participant i receives money.) You will
work out various properties of this payment scheme.
(a) Price and marginal utility. Relate p to ∇Ui (x⋆i ). The latter is the marginal utility of the goods
to participant i, at x⋆i . From this relation, conclude that p ≻ 0.
(b) Cash balance. Show that the sum of the payments across the participants is zero. This means
that the total cash paid in by participants balances the total cash paid out to participants. In
other words, the cash payments also clear.
(c) Nash equilibrium. Explain why x⋆i maximizes Ui (xi ) − pT xi , which is the net utility for
participant i. In other words, with the prices of goods fixed at p, each participant maximizes
their net utility with xi = x⋆i . This is called a Nash equilibrium: No participant is incentivized
to change their value of xi from x⋆i .
(d) Everyone does better by trading. Show that for each i, Ui (x⋆i ) − pT x⋆i ≥ Ui (0). (The inequality
is strict when x⋆i ̸= 0.) The lefthand side is the net utility when the participant trades; the
righthand side is the (net) utility when she does not trade.
Your solutions can be brief; we will penalize solutions that are substantially more complicated than
they need to be.
17.35 Worst case bond portfolio value. A portfolio of bonds has a known cash flow or sequence of payments
the holder will receive, given by the vector c = (c1 , . . . , cT ) ∈ RT , where ct ≥ 0 is the cash that
will be received in period t. (These payments include coupon payments and also the principal for
bonds in the portfolio that mature. But you don’t need to know these details.)
The (net present) value of the bond portfolio is given by V = cT p, where p ∈ RT++ is the vector
of discounts for future payments. We interpret pt as the current value of a payment of $1 in
period t. For example, with a constant interest rate r and continuous compounding of interest,
we have pt = exp(−tr). These discount factors are typically specified by the so-called yield curve
y = (y1 , . . . , yT ), where
yt = −(log pt )/t, t = 1, . . . , T.
We interpret yt as the constant, continuously compounded interest rate r which would yield pt =
exp(−tyt ). (If you are curious what a real yield curve looks like, search online for ‘today’s US
treasury yield curve’.)
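The discount–yield relation is easy to try out numerically; the yields and cash flow below are made up purely for illustration (they are not exercise data):

```python
import numpy as np

# Hypothetical yield curve and bond cash flow, to illustrate the relation
# p_t = exp(-t * y_t) and the portfolio value V = c^T p.
T = 5
y = np.array([0.01, 0.012, 0.015, 0.017, 0.02])   # made-up yields
t = np.arange(1, T + 1)
p = np.exp(-t * y)                                 # discount factors
c = np.array([10.0, 10.0, 10.0, 10.0, 110.0])      # made-up cash flow
V = c @ p                                          # net present value

# Recover the yield curve from the discounts: y_t = -(log p_t) / t.
y_back = -np.log(p) / t
```

With positive yields, each discount factor lies in (0, 1), so V is strictly less than the undiscounted sum of the payments.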
We consider the situation where we know the portfolio cash flow c, but not the yield curve y. But
we do have a set of possible yield curves given by Y ⊂ RT , where Y is convex. The worst case
value of the portfolio, over the set of possible yield curves, is defined as
V wc = min{cT p | y ∈ Y}.
(a) Explain how to find V wc using convex optimization. If you change variables or use a relaxation,
explain.
(b) Suppose that Y has a maximum element y max with respect to the nonnegative cone RT+ . (This
does not always happen, of course.) Give a simple expression or formula for V wc in terms of
y max . Justify your answer.
(c) We now consider a particular form of Y, given by yield curves of the form y nom + δ where
y nom ∈ RT is a given nominal yield curve, and δ ∈ RT is a deviation from the nominal yield
curve satisfying
δ1 = 0,    1T δ = 0,    ( Σ_{t=1}^{T−1} (δt+1 − δt )^2 )^{1/2} ≤ ρ,    −κ ≤ δt ≤ κ, t = 1, . . . , T,
17.36 Portfolio optimization with buy/hold/sell recommendations. We consider the problem of choosing
a portfolio of n assets specified by the weight vector w ∈ Rn , with 1T w = 1, with wi the fraction
of the total portfolio value (assumed to be positive) held in asset i, with negative wi meaning
a short position. Markowitz-style portfolio optimization uses the mean return of the assets µ ∈
Rn . (Of course, in practice this is always an estimate or forecast of the mean.) In this problem
we show how to carry out Markowitz-style optimization with a traditional qualitative estimate
of returns, which specifies for each asset whether the investor should buy it (which means the
return is thought to be positive), sell it (which means the return is thought to be negative), or
hold it (which means the return is not clear). To simplify notation, we assume the assets are
sorted with all buy recommendations, then all hold, then all sell, so we can partition the weight
vector as w = (wb , wh , ws ). These subvectors have positive dimensions nb , nh , ns , respectively, with
nb + nh + ns = n.
We pose the portfolio optimization problem as a robust optimization problem, working with the
worst-case portfolio return over asset returns consistent with the buy/hold/sell recommendations.
We translate the recommendations into a set of possible returns
where the subscripts denote the subvectors associated with buy, hold, and sell recommendations.
Here ν > 0 is a parameter that gives the minimum return we expect from a buy recommendation,
with −ν the maximum return we expect for a sell recommendation. We define the worst-case return
as Rwc (w) = minµ∈M µT w, which is evidently a concave function of w. Note that Rwc (w) can be
−∞.
We wish to solve the problem
with variable w, where L ≥ 1 is a leverage limit, Σ ∈ Sn++ is the covariance matrix of the asset
returns, and σ > 0 is a maximum allowed portfolio return standard deviation. The parameters L,
Σ, and σ are given. Since the objective is homogeneous in ν, we can assume that ν = 1. This
problem is convex, but not immediately solvable, since the objective cannot be directly handled by
standard solvers.
(a) Show how to solve the problem using standard solvers, in a form compatible with CVXPY.
Justify any change of variables, or other transformations you use.
(b) Carry out the method in part (a) on the problem instance with data given in buy_hold_sell_data.py.
Give an optimal w⋆ and its associated optimal objective value Rwc (w⋆ ). Give a µwc ∈ M for
which Rwc (w⋆ ) = (µwc )T w⋆ .
(c) A naïve method. A simpler approach to handling buy/hold/sell recommendations is to assume
that all buy assets have return +1, all sell assets have return −1, and all hold assets have
return 0. Solve the problem with this (linear) objective. Give a solution wnaive . What is the
associated worst-case return, Rwc (wnaive )?
17.37 Fundamental theorem of asset pricing. Consider a universe of n assets and m possible scenarios
for future payoffs. A payoff matrix P ∈ Rm×n is defined as follows: Pij is the payoff, per dollar
invested, of asset j in scenario i. For example, if Pij = −0.1, then asset j lost 10% in value in
scenario i. Show that the following systems of equalities and inequalities are strong alternatives:
(a) There exists a vector x ∈ Rn , such that P x ≻ 0.
(b) There exists a vector π ∈ Rm , such that P T π = 0, π ⪰ 0, and 1T π = 1.
In words, (a) says that there is an arbitrage, i.e., an investment strategy given by x (where xj ,
which can be negative, denotes the dollar amount invested in asset j) that makes money in every
scenario. The statement (b) states that there is a (discrete) probability distribution π such that
the expected payoff of every asset is zero under this distribution.
Remarks. The fundamental theorem of asset pricing states that the asset universe is free of ar-
bitrage if and only if there exists a probability distribution π such that the expected payoff of
every asset under this distribution is zero. In finance, such a probability distribution is called a
risk-neutral probability measure or a martingale measure. If π is unique, the market is said to be
complete.
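Both alternatives can be checked numerically as LP feasibility problems. The sketch below uses scipy.optimize.linprog (rather than CVX*) on two tiny hypothetical payoff matrices; by scaling, P x ≻ 0 is feasible exactly when P x ⪰ 1 is:

```python
import numpy as np
from scipy.optimize import linprog

def find_arbitrage(P):
    # P x > 0 has a solution iff P x >= 1 does (scale x up), so we can
    # test the strict system with an ordinary LP feasibility problem.
    m, n = P.shape
    res = linprog(c=np.zeros(n), A_ub=-P, b_ub=-np.ones(m),
                  bounds=[(None, None)] * n, method="highs")
    return res.x if res.status == 0 else None

def find_risk_neutral(P):
    # Look for pi >= 0 with P^T pi = 0 and 1^T pi = 1.
    m, n = P.shape
    A_eq = np.vstack([P.T, np.ones((1, m))])
    b_eq = np.concatenate([np.zeros(n), [1.0]])
    res = linprog(c=np.zeros(m), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * m, method="highs")
    return res.x if res.status == 0 else None

# A made-up market with an arbitrage (both scenarios can be made profitable) ...
P_arb = np.array([[0.10, -0.05],
                  [0.02,  0.03]])
# ... and a made-up market without one (symmetric scenario payoffs).
P_no_arb = np.array([[0.10, -0.10],
                     [-0.10, 0.10]])
```

On these two toy markets exactly one of the two systems is feasible, as the strong-alternatives result predicts.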
17.38 Optimal certificate of deposit investment. You have an initial endowment of cash C1 > 0 that you
will use to pay a series of known liabilities lt ≥ 0 in periods t = 1, . . . , T . We will assume that
C1 ≥ l1 + · · · + lT , i.e., the initial cash can cover all the liabilities.
In any period you can invest a nonnegative amount in CDs (certificates of deposit, also known as
zero-coupon bonds). A CD is characterized by its maturity M ≥ 1 (given in periods) and its (per
period) interest rate R > 0. If you invest an amount z ≥ 0 in such a CD in period t, you pay z
in period t (to fund the CD), and you will receive the amount z(1 + R)^M in period t + M . We say
that the CD matures in period t + M . (Here we assume that t + M ≤ T .)
There are K different CDs available, with maturities and rates given by Mk and Rk , k = 1, . . . , K.
(The maturities are distinct, i.e., there is at most one CD with a given maturity.) Let ztk ≥ 0
denote the amount invested in CD k in period t, with ztk = 0 when t + Mk > T . (That is, we do
not invest in CDs which mature after time period T .) The total amount you must pay in period t
to fund these CDs is
Ft = Σ_{k=1}^{K} ztk .
The amount you receive in period t from previous CDs maturing is
Gt = Σ_{τ +Mk =t} zτ k (1 + Rk )^{Mk} .
The sum here is over all τ ∈ {1, . . . , t − 1} and k for which τ + Mk = t, i.e., over all CDs that
mature in period t.
The cash dynamics are given by
Ct+1 = Ct − lt − Ft + Gt , t = 1, . . . , T.
We require that Ct ≥ 0 for all t. Our objective is to maximize CT +1 , the amount of cash on hand
after paying the last liability.
The variables in the problem are ztk , t = 1, . . . , T − 1. The data is C1 , the liability stream l1 , . . . , lT ,
and the CD maturities and rates Mk , Rk , k = 1, . . . , K.
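The cash-flow bookkeeping above can be sketched directly; the instance below (one CD, four periods) is made up, not the data from cd_invest_data.py:

```python
import numpy as np

def cash_trajectory(C1, l, z, M, R):
    """Simulate C_{t+1} = C_t - l_t - F_t + G_t for t = 1, ..., T.

    l : liabilities, length T (l[t-1] is l_t)
    z : investments, shape (T, K) (z[t-1, k] is z_tk)
    M, R : maturities and per-period rates of the K CDs
    Returns (C_1, ..., C_{T+1}) as an array of length T + 1.
    """
    T, K = z.shape
    C = np.zeros(T + 1)
    C[0] = C1
    for t in range(1, T + 1):
        F = z[t - 1].sum()                 # cash paid in to fund new CDs
        G = 0.0                            # cash received from maturing CDs
        for k in range(K):
            tau = t - M[k]                 # a CD bought in period tau matures now
            if tau >= 1:
                G += z[tau - 1, k] * (1 + R[k]) ** M[k]
        C[t] = C[t - 1] - l[t - 1] - F + G
    return C

# Tiny hypothetical instance: one CD (M = 2, R = 10%), T = 4 periods.
z = np.zeros((4, 1))
z[0, 0] = 10.0                             # invest 10 in period 1
C = cash_trajectory(100.0, np.full(4, 20.0), z, M=[2], R=[0.1])
```

Here the single investment matures in period 3, paying 10(1.1)^2 = 12.1, and the trajectory stays nonnegative throughout.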
(a) Explain how to solve this problem using convex optimization. If you use any change of variables
or a relaxation, explain.
(b) Find optimal CD investments for the problem instance with data in cd_invest_data.py. The
data consists of 120 monthly liabilities over 10 years, and K = 6 CDs with maturities of 1, 3,
6, 12, 24, and 36 months, and interest rates of 1%, 2%, 3%, 4%, 5%, and 6% annually. (These
are converted to per-period, i.e., monthly, interest rates R1 , . . . , R6 in the data file.)
Report the optimal final cash value C⋆_{T+1} for your investments. Plot z⋆_{tk} versus t for each Mk ,
on different plots.
Compare the optimal policy to a baseline policy that invests all cash in each period in the CD
with a maturity of M = 1.
Remarks. (Not needed to solve the problem.) We are making a number of simplifying assumptions
that do not hold in practice. In practice (but not in this simplified problem):
17.39 Optimal purchase schedule. We wish to schedule our purchases of a given (large and positive)
quantity Q (in shares) of some asset over T periods t = 1, . . . , T , with t = 1 being the current
period. We denote the amount we buy in period t as qt , with qt ≥ 0, t = 1, . . . , T . Thus we have
q ⪰ 0 and 1T q = Q. We refer to q ∈ RT as our purchase schedule. We will choose q to trade off risk
and transaction cost, described below. The purchase schedule q = Qe1 , where e1 = (1, 0, . . . , 0),
corresponds to purchasing all shares in the first period. We will work with Q and q given in millions
of shares.
The price (per share) in period t is denoted pt , in USD. The total amount paid, not including
transaction cost (described below), is pT q, in millions of USD. The nominal cost is given by p1 Q; it
is the total cost if you bought all shares in the first period, without transaction cost. The difference
between your cost pT q and the nominal cost p1 Q is δ = pT q − p1 Q. It can be positive (meaning
you paid more than if you had purchased all shares in the first period), or negative.
We are given p1 (the current price), but we do not know p2 , . . . , pT (future prices). Instead we
assume the prices follow a Brownian motion with zero drift, i.e.,
pt = pt−1 + ξt , t = 2, . . . , T,
with ξt ∼ N (0, σ 2 ) IID, where σ, the price volatility, is known and positive (and has units USD).
With this price model we have E δ = 0. We refer to the variance of δ, i.e., R = var(δ) = E δ 2 ,
as the risk of the purchase schedule. Its squareroot is called the volatility of the schedule, and is
given in units millions of USD.
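One way to get an expression for R, which can be checked by simulation, is to write each price in terms of the increments; the schedule and parameters below are made up for illustration:

```python
import numpy as np

# Writing p_t = p_1 + xi_2 + ... + xi_t and collecting the coefficient of
# each xi_s in delta = p^T q - p_1 Q suggests the closed form
#   R = var(delta) = sigma^2 * sum_{s=2}^T (q_s + ... + q_T)^2,
# verified here by Monte Carlo on a hypothetical schedule.
T, sigma, p1 = 6, 0.5, 100.0
q = np.array([3.0, 2.0, 1.0, 1.0, 0.5, 0.5])   # made-up purchase schedule
Q = q.sum()

tails = np.cumsum(q[::-1])[::-1]               # tails[s-1] = q_s + ... + q_T
R_closed = sigma**2 * np.sum(tails[1:] ** 2)

rng = np.random.default_rng(0)
N = 200_000
xi = rng.normal(0.0, sigma, size=(N, T - 1))   # increments xi_2, ..., xi_T
p = p1 + np.hstack([np.zeros((N, 1)), np.cumsum(xi, axis=1)])
delta = p @ q - p1 * Q
R_mc = delta.var()
```

The simulated variance agrees with the closed form to within Monte Carlo error, and the simulated mean of δ is (as claimed) essentially zero.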
When we purchase shares in period t, we pay pt qt , which is the gross cost, plus a transaction cost (or
market impact cost). The transaction cost depends on the market participation πt , defined as πt =
qt /vt , where vt > 0 is the total market volume (number of shares traded by all market participants,
in millions) in period t. We will assume that vt are known. A common model of transaction cost
is σ πt^{1/2} qt , where σ is the price volatility. This is sometimes called the squareroot model, since it
states that the price, including the market impact, increases by an amount proportional to the
squareroot of the market participation. The total transaction cost (in millions of USD) is
C = Σ_{t=1}^{T} σ πt^{1/2} qt .
In parts (a) and (b) below, we expect simple analytical solutions, along with a brief explanation;
we do not need a detailed or formal proof. (Yes, we know that analytical solutions are not the
focus of the class.) Of course you can check your answers to parts (a) and (b) using the code you
develop for part (d).
(a) Minimum risk schedule. Find q mr , the schedule that minimizes R subject to the constraints
described above.
(b) Minimum transaction cost. Find q mc , the schedule that minimizes C subject to the constraints
described above.
(c) Optimal execution problem. In the optimal execution problem we choose q, subject to the
constraints described above, plus a participation limit πt ≤ π max , t = 1, . . . , T , where π max is
given, so as to minimize C + γR, where γ > 0 is a given risk aversion parameter. Explain how
to solve the optimal execution problem using convex optimization.
(d) Carry out the method of part (c) with data in opt_purchase_exec_data.py. (The data are
real; they are for Apple stock (AAPL) for 10 trading days from February 8, 2024, to February
22, 2024. The volumes and desired purchase quantity are in millions of shares, and the price
and price volatility are in USD.) Plot the optimal purchase schedule q⋆ and the associated
market participation π⋆ . Give the optimal volatility √R and transaction cost C, in millions
of USD. Give their normalized values, √R/(p1 Q) and C/(p1 Q), which are unitless and easier
to interpret.
Hint. Under the price model above, the covariance matrix Σ of the prices satisfies Σkl = σ^2 min{k − 1, l − 1},
for k, l = 1, . . . , T . (The first row and column of Σ are zero since p1 is assumed known.) This will
be useful in deriving an expression for R. (We also know methods for deriving an expression for R
that do not use Σ, so do not worry if your solution doesn’t use it.)
17.40 Leverage limit. Let w ∈ Rn , with 1T w = 1, denote the set of weights for a portfolio of n investments,
with wi the fraction of the total portfolio value (assumed to be positive) invested in asset i. When
wi < 0, it means we hold a short position in asset i; when wi > 0, we hold a long position in asset
i. (You do not need to know what these mean.)
The total long weight and total short weight are defined as
L = 1T (w)+ = Σ_{i=1}^{n} max{0, wi },    S = 1T (w)− = Σ_{i=1}^{n} max{0, −wi },
respectively. As a common example, a portfolio with weights w with L(w) = 1.3 and S(w) = 0.3 is
called a 130–30 portfolio.
A leverage limit is a constraint of the form S ≤ ηL, where η ∈ [0, 1) is a parameter. Is a leverage
limit constraint convex (i.e., is the set of weights that satisfy it convex)? If so, explain. If not, give
a specific counterexample. Hint. L − S = 1.
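A quick numerical experiment (not a proof, and not a substitute for answering the question) suggests what to expect: random feasible weight vectors and their convex combinations can be tested directly. The dimension, η, and sampling scheme below are arbitrary:

```python
import numpy as np

# Sample weight vectors on the hyperplane 1^T w = 1, keep those satisfying
# the leverage limit S(w) <= eta * L(w), and test convex combinations.
def long_short(w):
    return np.maximum(w, 0.0).sum(), np.maximum(-w, 0.0).sum()

def feasible(w, eta):
    L, S = long_short(w)
    return S <= eta * L + 1e-12            # small slack for float roundoff

eta = 0.5
rng = np.random.default_rng(1)
pts = []
for _ in range(10000):
    w = rng.normal(1.0 / 6, 0.4, 6)
    w = w - (w.sum() - 1.0) / 6            # project onto 1^T w = 1
    if feasible(w, eta):
        pts.append(w)
    if len(pts) == 20:
        break

all_mid_ok = all(feasible(0.5 * (u + v), eta) for u in pts for v in pts)
```

The hint L − S = 1 also holds for every sampled point, which is the key to deciding the question analytically.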
maximize (rT x + d)/∥Rx + q∥2
subject to Σ_{i=1}^{n} fi (xi ) ≤ b
x ⪰ c.
with βi > |αi |, γi > 0. We assume there exists a feasible x with rT x + d > 0.
Show that this problem can be solved by solving an SOCP (if possible) or a sequence of SOCP
feasibility problems (otherwise).
18 Mechanical and aerospace engineering
18.1 Optimal design of a tensile structure. A tensile structure is modeled as a set of n masses in R2 ,
some of which are fixed, connected by a set of N springs. The masses are in equilibrium, with spring
forces, connection forces for the fixed masses, and gravity balanced. (This equilibrium occurs when
the position of the masses minimizes the total energy, defined below.)
We let (xi , yi ) ∈ R2 denote the position of mass i, and mi > 0 its mass value. The first p masses
are fixed, which means that xi = xi^{fixed} and yi = yi^{fixed} , for i = 1, . . . , p. The gravitational potential
energy of mass i is gmi yi , where g ≈ 9.8 is the gravitational acceleration.
Suppose spring j connects masses r and s. Its elastic potential energy is
Here we arbitrarily choose a head and tail for each spring, but in fact the springs are completely
symmetric, and the choice can be reversed without any effect. (Hopefully you will discover why it
is convenient to use the incidence matrix A to specify the topology of the system.)
The total energy is the sum of the gravitational energies, over all the masses, plus the sum of the
elastic energies, over all springs. The equilibrium positions of the masses are those that minimize
the total energy, subject to the constraints that the first p positions are fixed. (In the equilibrium
positions, the total force on each mass is zero.) We let Emin denote the total energy of the system,
in its equilibrium position. (We assume the energy is bounded below; this occurs if and only if each
mass is connected, through some set of springs with positive stiffness, to a fixed mass.)
The total energy Emin is a measure of the stiffness of the structure, with larger Emin corresponding
to stiffer. (We can think of Emin = −∞ as an infinitely unstiff structure; in this case, at least one
mass is not even supported against gravity.)
18.2 Equilibrium position of a system of springs. We consider a collection of n masses in R2 , with
locations (x1 , y1 ), . . . , (xn , yn ), and masses m1 , . . . , mn . (In other words, the vector x ∈ Rn gives
the x-coordinates, and y ∈ Rn gives the y-coordinates, of the points.) The masses mi are, of course,
positive.
For i = 1, . . . , n − 1, mass i is connected to mass i + 1 by a spring. The potential energy in the ith
spring is a function of the (Euclidean) distance di = ∥(xi , yi ) − (xi+1 , yi+1 )∥2 between the ith and
(i + 1)st masses, given by
Ei = 0 for di < li ,    Ei = (ki /2)(di − li )^2 for di ≥ li ,
where li ≥ 0 is the rest length, and ki > 0 is the stiffness, of the ith spring. The gravitational
potential energy of the ith mass is gmi yi , where g is a positive constant. The total potential energy
of the system is therefore
E = Σ_{i=1}^{n−1} Ei + g mT y.
The locations of the first and last mass are fixed. The equilibrium location of the other masses is
the one that minimizes E.
(a) Show how to find the equilibrium positions of the masses 2, . . . , n−1 using convex optimization.
Be sure to justify convexity of any functions that arise in your formulation (if it is not obvious).
The problem data are mi , ki , li , g, x1 , y1 , xn , and yn .
(b) Carry out your method to find the equilibrium positions for a problem with n = 10, mi = 1,
ki = 10, li = 1, x1 = y1 = 0, xn = yn = 10, with g varying from g = 0 (no gravity) to g = 10
(say). Verify that the results look reasonable. Plot the equilibrium configuration for several
values of g.
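As a sanity check on part (a), the energy can also be minimized with a generic smooth solver instead of CVX*; the instance below (n = 4, g = 0) is made up and much smaller than the one in part (b):

```python
import numpy as np
from scipy.optimize import minimize

# Spring energies (k_i/2) * max(d_i - l_i, 0)^2 are convex in the positions,
# so a local minimizer of the total energy is the global equilibrium.
n, k, l, g = 4, 10.0, 1.0, 0.0
m = np.ones(n)
p_first = np.array([0.0, 0.0])          # fixed first mass
p_last = np.array([3.0, 0.0])           # fixed last mass

def energy(u):
    # u packs the free positions (x_2, y_2, x_3, y_3)
    pos = np.vstack([p_first, u.reshape(n - 2, 2), p_last])
    d = np.linalg.norm(np.diff(pos, axis=0), axis=1)
    spring = 0.5 * k * np.maximum(d - l, 0.0) ** 2
    return spring.sum() + g * (m * pos[:, 1]).sum()

u0 = np.array([0.5, 0.5, 2.5, 0.5])     # initial guess for the free masses
res = minimize(energy, u0, method="L-BFGS-B")
```

Here the rest lengths sum exactly to the distance between the fixed endpoints, so the only zero-energy configuration puts the free masses on the straight line, at (1, 0) and (2, 0); the solver drives the energy to (numerically) zero.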
18.3 Elastic truss design. In this problem we consider a truss structure with m bars connecting a set
of nodes. Various external forces are applied at each node, which cause a (small) displacement in
the node positions. f ∈ Rn will denote the vector of (components of) external forces, and d ∈ Rn
will denote the vector of corresponding node displacements. (By ‘corresponding’ we mean if fi is,
say, the z-coordinate of the external force applied at node k, then di is the z-coordinate of the
displacement of node k.) The vector f is called a loading or load.
The structure is linearly elastic, i.e., we have a linear relation f = Kd between the vector of
external forces f and the node displacements d. The matrix K = K T ≻ 0 is called the stiffness
matrix of the truss. Roughly speaking, the ‘larger’ K is (i.e., the stiffer the truss) the smaller the
node displacement will be for a given loading.
We assume that the geometry (unloaded bar lengths and node positions) of the truss is fixed; we
are to design the cross-sectional areas of the bars. These cross-sectional areas will be the design
variables xi , i = 1, . . . , m. The stiffness matrix K is a linear function of x:
K(x) = x1 K1 + · · · + xm Km ,
where Ki = KiT ⪰ 0 depend on the truss geometry. You can assume these matrices are given or
known. The total weight Wtot of the truss also depends on the bar cross-sectional areas:
Wtot (x) = w1 x1 + · · · + wm xm ,
where wi > 0 are known, given constants (density of the material times the length of bar i). Roughly
speaking, the truss becomes stiffer, but also heavier, when we increase xi ; there is a tradeoff between
stiffness and weight.
Our goal is to design the stiffest truss, subject to bounds on the bar cross-sectional areas and total
truss weight:
l ≤ xi ≤ u, i = 1, . . . , m, Wtot (x) ≤ W,
where l, u, and W are given. You may assume that K(x) ≻ 0 for all feasible vectors x. To obtain
a specific optimization problem, we must say how we will measure the stiffness, and what model of
the loads we will use.
(a) There are several ways to form a scalar measure of how stiff a truss is, for a given load f . In
this problem we will use the elastic stored energy
E(x, f ) = (1/2) f T K(x)−1 f
to measure the stiffness. Maximizing stiffness corresponds to minimizing E(x, f ).
Show that E(x, f ) is a convex function of x on {x | K(x) ≻ 0}.
Hint. Use Schur complements to prove that the epigraph is a convex set.
(b) We can consider several different scenarios that reflect our knowledge about the possible
loadings f that can occur. The simplest is that f is a single, fixed, known loading. In more
sophisticated formulations, the loading f might be a random vector with known distribution,
or known only to lie in some set F, etc.
Show that each of the following four problems is a convex optimization problem, with x as
variable.
• Design for a fixed known loading. The vector f is known and fixed. The design problem
is
minimize E(x, f )
subject to l ≤ xi ≤ u, i = 1, . . . , m
Wtot (x) ≤ W.
• Design for multiple loadings. The vector f can take any of N known values f (i) , i =
1, . . . , N , and we are interested in the worst-case scenario. The design problem is
• Design for worst-case, unknown but bounded load. Here we assume the vector f can take
arbitrary values in a ball B = {f | ∥f ∥2 ≤ α}, for a given value of α. We are interested
in minimizing the worst-case stored energy, i.e.,
• Design for a random load with known statistics. We can also use a stochastic model of the
uncertainty in the load, and model the vector f as a random variable with known mean
and covariance:
E f = f (0) , E(f − f (0) )(f − f (0) )T = Σ.
In this case we would be interested in minimizing the expected stored energy, i.e.,
Hint. If v is a random vector with zero mean and covariance Σ, then E v T Av = E tr Avv T =
tr A E vv T = tr AΣ.
(c) Formulate the four problems in (b) as semidefinite programming problems.
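The expectation identity in the hint is easy to verify by simulation; K, f (0) , and Σ below are randomly generated stand-ins, not exercise data:

```python
import numpy as np

# Check numerically that, for random f with mean f0 and covariance Sigma,
#   E[(1/2) f^T K^{-1} f] = (1/2) f0^T K^{-1} f0 + (1/2) tr(K^{-1} Sigma),
# which follows from the hint E v^T A v = tr(A Sigma) for zero-mean v.
rng = np.random.default_rng(0)
n = 4
B = rng.normal(size=(n, n))
K = B @ B.T + n * np.eye(n)        # a positive definite stand-in for K(x)
f0 = rng.normal(size=n)
Cmat = rng.normal(size=(n, n))
Sigma = Cmat @ Cmat.T              # a valid covariance matrix
Kinv = np.linalg.inv(K)

closed = 0.5 * f0 @ Kinv @ f0 + 0.5 * np.trace(Kinv @ Sigma)

N = 200_000
f = rng.multivariate_normal(f0, Sigma, size=N)
mc = 0.5 * np.einsum("ij,jk,ik->i", f, Kinv, f).mean()
```

The Monte Carlo average matches the closed form to within sampling error, so the stochastic design objective is a deterministic convex function of x.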
18.4 A structural optimization problem [Bazaraa, Sherali, and Shetty]. The figure shows a two-bar truss
with height 2h and width w. The two bars are cylindrical tubes with inner radius r and outer
radius R. We are interested in determining the values of r, R, w, and h that minimize the weight
of the truss subject to a number of constraints. The structure should be strong enough for two
loading scenarios. In the first scenario a vertical force F1 is applied to the node; in the second
scenario the force is horizontal with magnitude F2 .
(Figure: the two-bar truss, with height 2h, width w, tube radii r and R, and applied forces F1 and F2 .)
The weight of the truss is proportional to the total volume of the bars, which is given by
2π(R^2 − r^2 ) √(w^2 + h^2 ).
The maximum force in each bar is equal to the cross-sectional area times the maximum allowable
stress σ (which is a given constant). This gives us the first constraint:
F1 √(w^2 + h^2 ) / (2h) ≤ σπ(R^2 − r^2 ).
The second constraint is that the truss should be strong enough to carry the load F2 . When F2 is
applied, the magnitudes of the forces in two bars are again equal and given by
F2 √(w^2 + h^2 ) / (2w),
which gives us the second constraint:
F2 √(w^2 + h^2 ) / (2w) ≤ σπ(R^2 − r^2 ).
We also impose limits wmin ≤ w ≤ wmax and hmin ≤ h ≤ hmax on the width and the height of the
structure, and limits 1.1r ≤ R ≤ Rmax on the outer radius.
In summary, we obtain the following problem:
minimize 2π(R^2 − r^2 ) √(w^2 + h^2 )
subject to F1 √(w^2 + h^2 ) / (2h) ≤ σπ(R^2 − r^2 )
F2 √(w^2 + h^2 ) / (2w) ≤ σπ(R^2 − r^2 )
wmin ≤ w ≤ wmax
hmin ≤ h ≤ hmax
1.1r ≤ R ≤ Rmax
R > 0, r > 0, w > 0, h > 0.
18.5 Optimizing the inertia matrix of a 2D mass distribution. An object has density ρ(z) at the point
z = (x, y) ∈ R2 , over some region R ⊂ R2 . Its mass m ∈ R and center of gravity c ∈ R2 are given
by
m = ∫_R ρ(z) dx dy,    c = (1/m) ∫_R ρ(z) z dx dy,
and its inertia matrix M ∈ R2×2 is
M = ∫_R ρ(z)(z − c)(z − c)T dx dy.
(You do not need to know the mechanics interpretation of M to solve this problem, but here it is,
for those interested. Suppose we rotate the mass distribution around a line passing through the
center of gravity in the direction q ∈ R2 that lies in the plane where the mass distribution is, at
angular rate ω. Then the total kinetic energy is (ω 2 /2)q T M q.)
The goal is to choose the density ρ, subject to 0 ≤ ρ(z) ≤ ρmax for all z ∈ R, and a fixed total
mass m = mgiven , in order to maximize λmin (M ).
To solve this problem numerically, we will discretize R into N pixels each of area a, with pixel
i having constant density ρi and location (say, of its center) zi ∈ R2 . We will assume that the
integrands above don’t vary too much over the pixels, and from now on use instead the expressions
m = a Σ_{i=1}^{N} ρi ,    c = (a/m) Σ_{i=1}^{N} ρi zi ,    M = a Σ_{i=1}^{N} ρi (zi − c)(zi − c)T .
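The discretized expressions are straightforward to evaluate; the uniform density on a square grid below is a made-up check case (by symmetry, c ≈ 0 and M is a multiple of the identity):

```python
import numpy as np

# Evaluate m, c, M for a uniform density on a symmetric square grid.
side = np.linspace(-1.0, 1.0, 21)
X, Y = np.meshgrid(side, side)
z = np.column_stack([X.ravel(), Y.ravel()])   # pixel centers, N x 2
a = (side[1] - side[0]) ** 2                  # pixel area
rho = np.ones(len(z))                         # uniform density

m = a * rho.sum()
c = (a / m) * rho @ z
d = z - c
M = a * (rho[:, None] * d).T @ d              # sum_i a rho_i (z_i - c)(z_i - c)^T
eig = np.linalg.eigvalsh(M)
```

For this symmetric density the center of gravity is at the origin and M is isotropic, so λmin (M ) = λmax (M ); the optimization in the exercise pushes a general density toward this isotropic situation.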
(a) Explain how to solve the problem using convex (or quasiconvex) optimization.
(b) Carry out your method on the problem instance with data in inertia_dens_data.m. This
file includes code that plots a density. Give the optimal inertia matrix and its eigenvalues,
and plot the optimal density.
18.6 Truss loading analysis. A truss (in 2D, for simplicity) consists of a set of n nodes, with positions
p(1) , . . . , p(n) ∈ R2 , connected by a set of m bars with tensions t1 , . . . , tm ∈ R (tj < 0 means bar j
operates in compression).
Each bar puts a force on the two nodes which it connects. Suppose bar j connects nodes k and l.
The tension in this bar applies a force
(tj /∥p(l) − p(k) ∥2 ) (p(l) − p(k) ) ∈ R2
to node k, and the opposite force to node l. In addition to the forces imparted by the bars, each
node has an external force acting on it. We let f (i) ∈ R2 be the external force acting on node i. For
the truss to be in equilibrium, the total force on each node, i.e., the sum of the external force and
the forces applied by all of the bars that connect to it, must be zero. We refer to this constraint as
force balance.
The tensions have given limits, Tjmin ≤ tj ≤ Tjmax , with Tjmin ≤ 0 and Tjmax ≥ 0, for j = 1, . . . , m.
(For example, if bar j is a cable, then it can only apply a nonnegative tension, so Tjmin = 0, and
we interpret Tjmax as the maximum tension the cable can carry.)
The first p nodes, i = 1, . . . , p, are free, while the remaining n − p nodes, i = p + 1, . . . , n, are
anchored (i.e., attached to a foundation). We will refer to the external forces on the free nodes
as load forces, and external forces at the anchor nodes as anchor forces. The anchor forces are
unconstrained. (More accurately, the foundations at these points are engineered to withstand any
total force that the bars attached to them can deliver.) We will assume that the load forces are just
dead weight, i.e., have the form
f (i) = (0, −wi ), i = 1, . . . , p,
The set of weights w ∈ Rp+ is supportable if there exists a set of tensions t ∈ Rm and anchor forces
f (p+1) , . . . , f (n) that, together with the given load forces, satisfy the force balance equations and
respect the tension limits. (The tensions and anchor forces in a real truss will adjust themselves to
have such values when the load forces are applied.) If there does not exist such a set of tensions
and anchor forces, the set of load forces is said to be unsupportable. (In this case, a real truss will
fail, or collapse, when the load forces are applied.)
Finally, we get to the questions.
(a) Explain how to find the maximum total weight, 1T w, that is supportable by the truss.
(b) Explain how to find the minimum total weight that is not supportable by the truss. (Here we
mean: Find the minimum value of 1T w, for which (1 + ϵ)w is not supportable, for all ϵ > 0.)
(c) Carry out the methods of parts (a) and (b) on the data given in truss_load_data.m. Give
the critical total weights from parts (a) and (b), as well as the individual weight vectors.
Notes.
• In parts (a) and (b), we don’t need a fully formal mathematical justification; a clear argument
or explanation of anything not obvious is fine.
• The force balance equations can be expressed in the compact and convenient form
At + (f load,x , f load,y , f anch ) = 0,
where
f load,x = (f1^(1) , . . . , f1^(p) ) ∈ Rp ,
f load,y = (f2^(1) , . . . , f2^(p) ) ∈ Rp ,
f anch = (f1^(p+1) , . . . , f1^(n) , f2^(p+1) , . . . , f2^(n) ) ∈ R^{2(n−p)} ,
and A ∈ R2n×m is a matrix that can be found from the geometry data (truss topology and
node positions). You may refer to A in your solutions to parts (a) and (b). For part (c), we
have very kindly provided the matrix A for you in the m-file, to save you the time and trouble
of working out the force balance equations from the geometry of the problem.
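Supportability is an LP feasibility problem in the tensions and anchor forces, and part (a) is an LP. The sketch below solves a hand-built two-cable instance (not the truss_load_data.m data) with scipy.optimize.linprog:

```python
import numpy as np
from scipy.optimize import linprog

# One free node at the origin, held by two cables running up to anchors at
# (-1, 1) and (1, 1); cable j pulls the node toward its anchor with force
# t_j times the unit vector.  Maximize the supportable dead weight w.
s = 1.0 / np.sqrt(2.0)
A_eq = np.array([[-s,  s,  0.0],    # x-balance: -t1/sqrt2 + t2/sqrt2 = 0
                 [ s,  s, -1.0]])   # y-balance:  t1/sqrt2 + t2/sqrt2 - w = 0
b_eq = np.zeros(2)
T_max = 1.0                          # tension limits: 0 <= t_j <= T_max
res = linprog(c=[0.0, 0.0, -1.0],    # maximize w  <=>  minimize -w
              A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, T_max), (0, T_max), (0, None)],
              method="highs")
t1, t2, w_max = res.x
```

At the optimum both cables saturate at T_max, giving the maximum supportable weight w_max = √2 · T_max; the full exercise is the same construction with the supplied matrix A in place of the hand-built balance equations.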
18.7 Least-cost road grading. A road is to be built along a given path. We must choose the height of
the roadbed (say, above sea level) along the path, minimizing the total cost of grading, subject to
some constraints. The cost of grading (i.e., moving earth to change the height of the roadbed from
the existing elevation) depends on the difference in height between the roadbed and the existing
elevation. When the roadbed is below the existing elevation it is called a cut; when it is above
it is called a fill. Each of these incurs engineering costs; for example, fill is created in a series of
lifts, each of which involves dumping just a few inches of soil and then compacting it. Deeper cuts
and higher fills require more work to be done on the road shoulders, and possibly, the addition of
reinforced concrete structures to stabilize the earthwork. This explains why the marginal cost of
cuts and fills increases with their depth/height.
We will work with a discrete model, specifying the road height as hi , i = 1, . . . , n, at points equally
spaced a distance d from each other along the given path. These are the variables to be chosen.
(The heights h1 , . . . , hn are called a grading plan.) We are given ei , i = 1, . . . , n, the existing
elevation, at the points. The grading cost is
C = Σ_{i=1}^{n} ( ϕfill ((hi − ei )+ ) + ϕcut ((ei − hi )+ ) ) ,
where ϕfill and ϕcut are the fill and cut cost functions, respectively, and (a)+ = max{a, 0}. The fill
and cut functions are increasing and convex. The goal is to minimize the grading cost C.
The road height is constrained by given limits on the first, second, and third derivatives:
|hi+1 − hi |/d ≤ D(1) ,  i = 1, . . . , n − 1,
|hi+1 − 2hi + hi−1 |/d^2 ≤ D(2) ,  i = 2, . . . , n − 1,
|hi+1 − 3hi + 3hi−1 − hi−2 |/d^3 ≤ D(3) ,  i = 3, . . . , n − 1,
where D(1) is the maximum allowable road slope, D(2) is the maximum allowable curvature, and
D(3) is the maximum allowable third derivative.
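The cost and the derivative limits are easy to evaluate for a candidate plan; the quadratic ϕfill and ϕcut below are made up (the exercise only requires them to be increasing and convex), as are the elevations:

```python
import numpy as np

def grading_cost(h, e, phi_fill, phi_cut):
    # C = sum_i phi_fill((h_i - e_i)_+) + phi_cut((e_i - h_i)_+)
    fill = np.maximum(h - e, 0.0)
    cut = np.maximum(e - h, 0.0)
    return phi_fill(fill).sum() + phi_cut(cut).sum()

def derivative_limits_ok(h, d, D1, D2, D3):
    # First-, second-, and third-difference limits from the exercise.
    ok1 = np.all(np.abs(np.diff(h, 1)) / d <= D1)
    ok2 = np.all(np.abs(np.diff(h, 2)) / d**2 <= D2)
    ok3 = np.all(np.abs(np.diff(h, 3)) / d**3 <= D3)
    return ok1 and ok2 and ok3

e = np.array([2.0, 1.0, 1.5, 2.5])       # existing elevation (hypothetical)
h = np.array([1.8, 1.6, 1.7, 2.0])       # candidate grading plan
C = grading_cost(h, e, lambda u: 2 * u**2, lambda u: u**2)
```

Since each term is a convex increasing function of a convex (plus-part) function of h, C is convex in h, and the derivative limits are linear inequality constraints.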
18.8 Lightest structure that resists a set of loads. We consider a mechanical structure in 2D (for simplic-
ity) which consists of a set of m nodes, with known positions p1 , . . . , pm ∈ R2 , connected by a set
of n bars (also called struts or elements), with cross-sectional areas a1 , . . . , an ∈ R+ , and internal
tensions t1 , . . . , tn ∈ R.
Bar j is connected between nodes rj and sj . (The indices r1 , . . . , rn and s1 , . . . , sn give the structure
topology.) The length of bar j is Lj = ∥prj − psj ∥2 , and the total volume of the bars is
V = Σ_{j=1}^{n} aj Lj . (The total weight is proportional to the total volume.)
Bar j applies a force (tj /Lj )(prj − psj ) ∈ R2 to node sj and the negative of this force to node rj .
Thus, positive tension in a bar pulls its two adjacent nodes towards each other; negative tension
(also called compression) pushes them apart. The ratio of the tension in a bar to its cross-sectional
area is limited by its yield strength, which is symmetric in tension and compression: |tj | ≤ σaj ,
where σ > 0 is a known constant that depends on the material.
The nodes are divided into two groups: free and fixed. We will take nodes 1, . . . , k to be free,
and nodes k + 1, . . . , m to be fixed. Roughly speaking, the fixed nodes are firmly attached to the
ground, or a rigid structure connected to the ground; the free ones are not.
A loading consists of a set of external forces, f1 , . . . , fk ∈ R2 applied to the free nodes. Each free
node must be in equilibrium, which means that the sum of the forces applied to it by the bars and
the external force is zero. The structure can resist a loading (without collapsing) if there exists a
set of bar tensions that satisfy the tension bounds and force equilibrium constraints. (For those
with knowledge of statics, these conditions correspond to a structure made entirely with pin joints.)
Finally, we get to the problem. You are given a set of M loadings, i.e., f1^(i) , . . . , fk^(i) ∈ R2 ,
i = 1, . . . , M . The goal is to find the bar cross-sectional areas that minimize the structure volume
V while resisting all of the given loadings. (Thus, you are to find one set of bar cross-sectional areas,
and M sets of tensions.) Using the problem data provided in lightest_struct_data.m, report V ⋆
and V unif , the smallest feasible structure volume when all bars have the same cross-sectional area.
The node positions are given as a 2 × m matrix P, and the loadings as a 2 × k × M array F. Use
the code included in the data file to visualize the structure with the bar cross-sectional areas that
you find, and provide the plot in your solution.
Hint. You might find the graph incidence matrix A ∈ Rm×n useful. It is defined as Aij = +1 if
i = rj , Aij = −1 if i = sj , and Aij = 0 otherwise.
Remark. You could reasonably ask, ‘Does a mechanical structure really solve a convex optimization
problem to determine whether it should collapse?’. It sounds odd, but the answer is, yes it does.
18.9 Maintaining static balance. In this problem we study a human’s ability to
maintain balance against an applied external force. We will use a planar
(two-dimensional) model to characterize the set of push forces a human
can sustain before he or she is unable to maintain balance. We model
the human as a linkage of 4 body segments, which we consider to be rigid
bodies: the foot, lower leg, upper leg, and pelvis (into which we lump the
upper body). The pose is given by the joint angles, but this won’t matter
in this problem, since we consider a fixed pose. A set of 40 muscles act
on the body segments; each of these develops a (scalar) tension ti that
satisfies 0 ≤ ti ≤ Timax , where Timax is the maximum possible tension for
muscle i. (The maximum muscle tensions depend on the pose, and the
person, but here they are known constants.) An external pushing force
f push ∈ R2 acts on the pelvis. Two (ground contact) forces act on the
foot: f heel ∈ R2 and f toe ∈ R2 . (These are shown at right.) These must
satisfy
|f1heel | ≤ µf2heel , |f1toe | ≤ µf2toe ,
where µ > 0 is the coefficient of friction of the ground. There are also joint
forces that act at the joints between the body segments, and gravity forces
for each body segment, but we won’t need them explicitly in this problem.
To maintain balance, the force and torque balance conditions on each body segment must be satisfied. These
equations can be written out from the geometry of the body (e.g., attachment points for the
muscles) and the pose. They can be reduced to a set of 6 linear equations:

Amusc t + Atoe f toe + Aheel f heel + Apush f push = b,

where t ∈ R40 is the vector of muscle tensions, and Amusc , Atoe , Aheel , and Apush are known matrices
and b ∈ R6 is a known vector. These data depend on the pose, body weight and dimensions, and
muscle lines of action. Fortunately for you, our biomechanics expert Apoorva has worked them
out; you will find them in static_balance_data.* (along with T max and µ).
We say that the push force f push can be resisted if there exist muscle tensions and ground contact
forces that satisfy the constraints above. (This raises a philosophical question: Does a person solve
an optimization to decide whether he or she should lose their balance? In any case, this approach
makes good predictions.)
Find F res ⊂ R2 , the set of push forces that can be resisted. Plot it as a shaded region.
Hints. Show that F res is a convex set. For the given data, 0 ∈ F res . Then for θ = 1◦ , 2◦ , . . . , 360◦ ,
determine the maximum push force, applied in the direction θ, that can be resisted. To make
a filled region on a plot, you can use the command fill() in Matlab. For Python and Ju-
lia, fill() is also available through PyPlot. In Julia, make sure to use the ECOS solver with
solver = ECOSSolver(verbose=false).
Remark. A person can resist a much larger force applied to the hip than you might think.
18.10 Thermodynamic potentials. We consider a mixture of k chemical species. The internal energy of
the mixture is
U (S, V, N1 , . . . , Nk ),
where S is the entropy of the mixture, V is the volume occupied by the mixture, and Ni is the
quantity (in moles) of chemical species i. We assume the function U is convex. (Real internal
energy functions satisfy this and other interesting properties, but we won’t need any others for this
problem.) The enthalpy H, the Helmholtz free energy A, and the Gibbs free energy G are defined as

H(S, P, N1 , . . . , Nk ) = inf_V (U + P V ),
A(T, V, N1 , . . . , Nk ) = inf_S (U − T S),
G(T, P, N1 , . . . , Nk ) = inf_{S,V} (U + P V − T S).
The variables T and P can be interpreted physically as the temperature and pressure of the mixture.
These four functions are called thermodynamic potentials. We refer to the arguments S, V , and
N1 , . . . , Nk as the extensive variables, and the arguments T and P as the intensive variables.
(a) Show that H, A, and G are convex in the extensive variables, when the intensive variables are
fixed.
(b) Show that H, A, and G are concave in the intensive variables, when the extensive variables
are fixed.
(c) We consider a simple reaction involving three species,
2 [species 1] ⇌ [species 2] + [species 3],
carried out at temperature Treact and volume Vreact . The Helmholtz free energy of the mixture
is
A(T, V, N1 , N2 , N3 ) = T Σ_{j=1}^{3} Nj (s0,j − R cj ) + T R Σ_{j=1}^{3} Nj log( (V0 /V ) (T0 /T )^{cj} Nj ),
where R, V0 , T0 , s0,j , and cj , for j = 1, . . . , k, are known, positive constants. The equilibrium
molar quantities N1⋆ , N2⋆ , and N3⋆ of the three species are those that minimize A(Treact , Vreact , N1 , N2 , N3 )
subject to the stoichiometry constraints

N1 = N1,init − 2z,    N2 = N2,init + z,    N3 = N3,init + z,

where Nj,init is the initial quantity of species j, and the variable z gives the amount of the
reaction that has proceeded. For the values of Treact , Vreact , R, V0 , T0 , s0,j , and cj given in
thermo_potentials_data.*, report the equilibrium molar quantities N1⋆ , N2⋆ , and N3⋆ .
Note: Julia users might want the ECOS solver. Include using ECOS, and solve by using
solve!(prob, ECOSSolver()).
18.11 Elastic stored energy in a spring. A spring is a mechanical device that exerts a force F that depends
on its extension x: F = ϕ(x), where ϕ : R → R. The domain dom ϕ is an interval [xmin , xmax ]
containing 0, where xmin (xmax ) is the minimum (maximum) possible extension of the spring. When
x > 0, the spring is said to be extended, and when x < 0, it is said to be in compression. The
force exerted by the spring must be restoring, which means that F ≥ 0 when x ≥ 0, and F ≤ 0
when x ≤ 0. (Our sign convention is that a positive force F opposes a positive extension x.) This
implies that F = 0 when x = 0, i.e., zero force is developed when the spring is not extended or
compressed.
The simplest spring is a Hooke (linear) spring, with ϕ(x) = Kx, where K > 0 is the spring constant.
(The constant 1/K is called the spring compliance.)
A spring is called monotonic if the function ϕ is nondecreasing, i.e., larger extension leads to
a stronger restoring force. Many, but not all, springs are monotonic. A classic example is a
compound bow, which has a force that first increases with x, and then decreases to a small value
at the extension x where it is fully drawn. (This decrease in force from the maximum is called the
let off of the bow.)
The elastic stored energy in the spring is

E(x) = ∫_0^x ϕ(x′ ) dx′ .
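For instance, for a Hooke spring ϕ(x) = Kx the integral gives E(x) = Kx²/2; a quick numerical check of the definition (trapezoid rule, which is exact for a linear ϕ):

```python
import numpy as np

K, x = 2.0, 1.5
phi = lambda xp: K * xp              # Hooke spring force law

xs = np.linspace(0.0, x, 1001)
# Trapezoid rule for E(x) = integral of phi from 0 to x.
E = np.sum(0.5 * (phi(xs[1:]) + phi(xs[:-1])) * np.diff(xs))
print(E)  # K * x**2 / 2 = 2.25
```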
18.12 Quickest take-off. This problem concerns the braking and thrust profiles for an airplane during
take-off. For simplicity we will use a discrete-time model. The position (down the runway) and the
velocity in time period t are pt and vt , respectively, for t = 0, 1, . . .. These satisfy p0 = 0, v0 = 0,
and pt+1 = pt + hvt , t = 0, 1, . . ., where h > 0 is the sampling time period. The velocity updates as

vt+1 = ηvt + h(ft − bt ), t = 0, 1, . . . ,

where η ∈ (0, 1) is a friction or drag parameter, ft is the engine thrust, and bt is the braking force,
at time period t. These must satisfy

0 ≤ ft ≤ F max ,    0 ≤ bt ≤ B max ,    |ft+1 − ft | ≤ S, t = 0, 1, . . . .
Here B max , F max , and S are given parameters. The initial thrust is f0 = 0. The take-off time is
T to = min{t | vt ≥ V to }, where V to is a given take-off velocity. The take-off position is P to = pT to ,
the position of the aircraft at the take-off time. The length of the runway is L > 0, so we must
have P to ≤ L.
(a) Explain how to find the thrust and braking profiles that minimize the take-off time T to ,
respecting all constraints. Your solution can involve solving more than one convex problem,
if necessary.
(b) Solve the quickest take-off problem with data
Plot pt , vt , ft , and bt versus t. Comment on what you see. Report the take-off time and
take-off position for the profile you find.
18.13 Minimum time maneuver for a crane. A crane manipulates a load with mass m > 0 in two
dimensions using two cables attached to the load. The cables maintain angles ±θ with respect to
vertical, as shown below.
The (scalar) tensions T left and T right in the two cables are independently controllable, from 0 up
to a given maximum tension T max . The total force on the load is

F = T left (− sin θ, cos θ) + T right (sin θ, cos θ) + mg,
where g = (0, −9.8) is the acceleration due to gravity. The acceleration of the load is then F/m.
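A quick numeric check of the force expression: with zero tensions F reduces to gravity alone, and with equal tensions the horizontal components cancel.

```python
import numpy as np

m, theta = 1.0, np.deg2rad(30.0)
g = np.array([0.0, -9.8])

def total_force(T_left, T_right):
    # F = T_left*(-sin th, cos th) + T_right*(sin th, cos th) + m*g
    return (T_left * np.array([-np.sin(theta), np.cos(theta)])
            + T_right * np.array([np.sin(theta), np.cos(theta)]) + m * g)

print(total_force(0.0, 0.0))  # gravity only: [0, -9.8]
print(total_force(5.0, 5.0))  # horizontal components cancel
```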
We approximate the motion of the load using

pi+1 = pi + hvi ,    vi+1 = vi + (h/m)Fi , i = 1, 2, . . . ,
where pi ∈ R2 is the position of the load, vi ∈ R2 is the velocity of the load, and Fi ∈ R2 is the
force on the load, at time t = ih. Here h > 0 is a small (given) time step.
The goal is to move the load, which is initially at rest at position pinit , to the position pdes , also at
rest, in minimum time. In other words, we seek the smallest k for which

pk = pdes ,    vk = 0.
(a) Explain how to solve this problem using convex (or quasiconvex) optimization.
(b) Carry out the method of part (a) for the problem instance with
with time step h = 0.1. Report the minimum time k ⋆ . Plot the tensions versus time, and the
load trajectory, i.e., the points p1 , . . . , pk in R2 . Does the load move along the line segment
between pinit and pdes (i.e., the shortest path from pinit to pdes )? Comment briefly.
18.14 Design of an unmanned aerial vehicle. You are tasked with developing the high-level design for
an electric unmanned aerial vehicle (UAV). The goal is to design the least expensive UAV that is
able to complete K missions, labeled k = 1, . . . , K. Mission k involves transporting a payload of
weight Wkpay > 0 (in kilograms) over a distance Dk > 0 (in meters), at a speed Vk > 0 (in meters
per second). These mission quantities are given.
The high-level design consists of choosing the engine weight W eng (in kilograms), the battery weight
W bat (in kilograms), and the wing area S (in m2 ), within the given limits

Wmin^eng ≤ W eng ≤ Wmax^eng ,    Wmin^bat ≤ W bat ≤ Wmax^bat ,    Smin ≤ S ≤ Smax .
(The lower limits are all positive.) We refer to the variables W eng , W bat , and S as the design
variables.
In addition to choosing the design variables, you must choose the power Pk > 0 (in watts) that
flows from the battery to the engine, and the angle of attack αk > 0 (in degrees) of the UAV during
mission k, for k = 1, . . . , K. These must satisfy
0 ≤ Pk ≤ Pmax , 0 ≤ αk ≤ αmax ,
where αmax is given, and Pmax depends on the engine weight as described below. We refer to these
2K variables as the mission variables. The engine weight, battery weight, and wing area are the
same for all k missions; the power and angle of attack can change with the mission.
The weight of the wing W wing (in kilograms) is given by W wing = CW S 1.2 , where CW > 0 is
given. The total weight of the UAV during mission k, denoted Wk , is the sum of the battery weight,
engine weight, wing weight, the payload weight, and a baseline weight W base , which is given. The
total weight depends on the mission, via the payload weight, and so is subscripted by k.
The lift and drag forces acting on the UAV in mission k are
Fklift = (1/2)ρVk2 CL (αk )S,    Fkdrag = (1/2)ρVk2 CD (αk )S
(in newtons), where CL and CD are the lift and drag coefficients as functions of the angle of attack
αk , and ρ > 0 is the (known) air density (in kilograms per cubic meter). We will use the simple
functions
CL (α) = cL α, CD (α) = cD1 + cD0 α2 ,
where cL > 0, cD0 > 0, and cD1 > 0 are given constants.
To maintain steady level flight, the lift must equal the weight, and the drag must equal the thrust
from the propeller, denoted Tk (in newtons), i.e.,
Fklift = Wk , Fkdrag = Tk .
The thrust force, power Pk (in watts), and the UAV speed are related via Pk = Tk Vk . The engine
maximum power is related to its weight by W eng = CP Pmax^0.803 , where CP > 0 is given.
The battery capacity E (in joules) is equal to CE W bat , where CE > 0 is given. The total energy
expended over mission k, with speed Vk , power output Pk , and distance Dk is Pk Dk /Vk . This must
not exceed the battery capacity E.
The overall cost of the UAV is the sum of a design cost and a mission cost. The design cost Cdes ,
which is an approximation of the cost of building the UAV, is given by
which captures our desire that the thrust and angle of attack be small.
In summary, Wmin^eng , Wmax^eng , Wmin^bat , Wmax^bat , Smin , Smax , αmax , W base , CW , cL , cD0 , cD1 , CP , CE , and
ρ are given. Additionally, Dk , Vk , and Wkpay are given for k = 1, . . . , K.
(a) The problem as stated is almost a geometric program (GP). By relaxing two constraints it
becomes a GP, and is therefore readily solved. Identify these constraints and give the relaxed
versions. Briefly explain why the relaxed constraints will be tight at the solution, which means
that by solving the GP, you’ve actually solved the original problem. You do not need to reduce
the relaxed problem to a standard form GP, or the equivalent convex problem; it’s enough to
express it in DGP-compatible form.
(b) Solve the relaxed problem you formulate in part (a) with data given in uav_design_data.py.
Give the optimal costs C⋆des and C⋆mis , and the values of all design and mission variables. Check
that at your solution, the relaxed constraints are tight.
Remarks and hints.
• No, you do not need to be an expert on aeronautics to solve the problem; we’ve given everything
you need.
• It’s tempting to jump in and work out a bunch of algebra by hand. Don’t.
• In CVXPY, you’ll use disciplined geometric programming (DGP), as described in https:
//cvxpy.org/tutorial/dgp/.
This is a highly simplified version of the UAV design problem. More complex versions can
include many other effects, more complicated missions, and more variables, e.g., the altitude
of each mission. You can learn more about this topic at
https://people.eecs.berkeley.edu/~pabbeel/papers/2012_gp_design.pdf.
19 Graphs and networks
19.1 A hypergraph with nodes 1, . . . , m is a set of nonempty subsets of {1, 2, . . . , m}, called edges. An
ordinary graph is a special case in which the edges contain no more than two nodes.
We consider a hypergraph with m nodes and assume coordinate vectors xj ∈ Rp , j = 1, . . . , m, are
associated with the nodes. Some nodes are fixed and their coordinate vectors xj are given. The
other nodes are free, and their coordinate vectors will be the optimization variables in the problem.
The objective is to place the free nodes in such a way that some measure of the physical size of the
nets is small.
As an example application, we can think of the nodes as modules in an integrated circuit, placed
at positions xj ∈ R2 . Every edge is an interconnect network that carries a signal from one module
to one or more other modules.
To define a measure of the size of a net, we store the vectors xj as columns of a matrix X ∈ Rp×m .
For each edge S in the hypergraph, we use XS to denote the p × |S| submatrix of X with the
columns associated with the nodes of S. We define

fS (X) = inf_y ∥XS − y1T ∥

as the size of the edge S, where ∥ · ∥ is a matrix norm, and 1 is a vector of ones of length |S|.
• Sum-row-max norm:

  ∥XS − y1T ∥srm = Σ_{i=1}^{p} max_{j∈S} |xij − yi |
(iii) fS (X) is the square root of the sum of the squares of the Euclidean distances to the mean
of the coordinates of the nodes in S:

fS (X) = ( Σ_{j∈S} ∥xj − x̄∥2^2 )^{1/2} ,    where x̄i = (1/|S|) Σ_{k∈S} xik , i = 1, . . . , p.
(iv) fS (X) is the sum of the ℓ1 -distances to the (coordinate-wise) median of the coordinates
of the nodes in S:

fS (X) = Σ_{j∈S} ∥xj − x̂∥1 ,    where x̂i = median({xik | k ∈ S}), i = 1, . . . , p.
19.2 Let W ∈ Sn be a symmetric matrix with nonnegative elements wij and zero diagonal. We can
interpret W as the representation of a weighted undirected graph with n nodes. If wij = wji > 0,
there is an edge between nodes i and j, with weight wij . If wij = wji = 0 then nodes i and j are
not connected. The Laplacian of the weighted graph is defined as

L(W ) = diag(W 1) − W.

(a) Show that the function f (W ) = nλmax (L(W )) is convex.
(b) Give a simple argument why f (W ) is an upper bound on the optimal value of the combinatorial
optimization problem
maximize y T L(W )y
subject to yi ∈ {−1, 1}, i = 1, . . . , n.
This problem is known as the max-cut problem, for the following reason. Every vector y
with components ±1 can be interpreted as a partition of the nodes of the graph in a set
S = {i | yi = 1} and a set T = {i | yi = −1}. Such a partition is called a cut of the graph.
The objective function in the max-cut problem is
y T L(W )y = Σ_{i≤j} wij (yi − yj )2 .
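This cut identity can be spot-checked numerically, taking L(W ) = diag(W 1) − W (the standard weighted Laplacian):

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.random((5, 5))
W = np.triu(W, 1)
W = W + W.T                          # symmetric, zero diagonal
L = np.diag(W.sum(axis=1)) - W       # weighted Laplacian

y = rng.choice([-1.0, 1.0], size=5)  # a cut, encoded as +/-1 labels
quad = y @ L @ y
cut = sum(W[i, j] * (y[i] - y[j]) ** 2
          for i in range(5) for j in range(i + 1, 5))
print(abs(quad - cut))  # ~0: the two expressions agree
```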
19.3 Utility versus latency trade-off in a network. We consider a network with m edges, labeled 1, . . . , m,
and n flows, labeled 1, . . . , n. Each flow has an associated nonnegative flow rate fj ; each edge or
link has an associated positive capacity ci . Each flow passes over a fixed set of links (its route);
the total traffic ti on link i is the sum of the flow rates over all flows that pass through link i. The
flow routes are described by a routing matrix R ∈ Rm×n , defined as
1 flow j passes through link i
Rij =
0 otherwise.
(multiplied by a constant that doesn’t matter to us). We take di = ∞ for ti = ci . The delay or
latency for flow j, denoted lj , is the sum of the link delays over all links that flow j passes through.
We define the maximum flow latency as
L = max{l1 , . . . , ln }.
(a) Explain how to solve the problem of choosing the interdiction effort vector x ∈ Rm , subject
to the constraints, so as to minimize P max . Partial credit will be given for a method that
involves an enumeration over all possible paths (in the objective or constraints). Hint. For
each node i, let Pi denote the maximum of Π_{k∈P} pk over all paths P from the source node 1
to node i (so P max = Pn ).
(b) Carry out your method on the problem instance given in interdict_alloc_data.m. The data
file contains the data a, xmax , B, and the graph incidence matrix A ∈ Rn×m , where Aij = −1 if
edge j leaves node i, Aij = +1 if edge j enters node i, and Aij = 0 otherwise.
Give P max⋆ , the optimal value of P max , and compare it to the value of P max obtained with
uniform allocation of resources, i.e., with x = (B/m)1.
Hint. Given a vector z ∈ Rn , AT z is the vector of edge differences: (AT z)j = zk − zl if edge j
goes from node l to node k.
The following figure shows the topology of the graph in question, with nodes 1 through 10.
(The data file contains A; this figure, which is not needed to solve the problem, is shown here
so you can visualize the graph.)
19.5 Network sizing. We consider a network with n directed arcs. The flow through arc k is denoted
xk and can be positive, negative, or zero. The flow vector x must satisfy the network constraint
Ax = b where A is the node-arc incidence matrix and b is the external flow supplied to the nodes.
Each arc has a positive capacity or width yk . The quantity |xk |/yk is the flow density in arc k.
The cost of the flow in arc k depends on the flow density and the width of the arc, and is given by
yk ϕk (|xk |/yk ), where ϕk is convex and nondecreasing on R+ .
(a) Define f (y, b) as the optimal value of the network flow optimization problem

minimize Σ_{k=1}^{n} yk ϕk (|xk |/yk )
subject to Ax = b

with variable x, for given values of the arc widths y ≻ 0 and external flows b. Is f a convex
function (jointly in y, b)? Carefully explain your answer.
(b) Suppose b is a discrete random vector with possible values b(1) , . . . , b(m) . The probability that
b = b(j) is πj . Consider the problem of sizing the network (selecting the arc widths yk ) so that
the expected cost is minimized:

minimize g(y) + E f (y, b).
The variable is y. Here g is a convex function, representing the installation cost, and E f (y, b)
is the expected optimal network flow cost
E f (y, b) = Σ_{j=1}^{m} πj f (y, b(j) ),
19.6 Maximizing algebraic connectivity of a graph. Let G = (V, E) be a weighted undirected graph with
n = |V | nodes, m = |E| edges, and weights w1 , . . . , wm ∈ R+ on the edges. If edge k connects
nodes i and j, then define ak ∈ Rn as (ak )i = 1, (ak )j = −1, with other entries zero. The weighted
Laplacian (matrix) of the graph is defined as
L = Σ_{k=1}^{m} wk ak akT = A diag(w)AT ,
where A = [a1 · · · am ] ∈ Rn×m is the incidence matrix of the graph. Nonnegativity of the weights
implies L ⪰ 0.
Denote the eigenvalues of the Laplacian L as
λ1 ≤ λ2 ≤ · · · ≤ λn ,
which are functions of w. The minimum eigenvalue λ1 is always zero, while the second smallest
eigenvalue λ2 is called the algebraic connectivity of G and is a measure of the connectedness of
a graph: The larger λ2 is, the better connected the graph is. It is often used, for example, in
analyzing the robustness of computer networks.
Though not relevant for the rest of the problem, we mention a few other examples of how the
algebraic connectivity can be used. These results, which relate graph-theoretic properties of G
to properties of the spectrum of L, belong to a field called spectral graph theory. For example,
λ2 > 0 if and only if the graph is connected. The eigenvector v2 associated with λ2 is often called
the Fiedler vector and is widely used in a graph partitioning technique called spectral partitioning,
which assigns nodes to one of two groups based on the sign of the relevant component in v2 . Finally,
λ2 is also closely related to a quantity called the isoperimetric number or Cheeger constant of G,
which measures the degree to which a graph has a bottleneck.
The problem is to choose the edge weights w ∈ Rm + , subject to some linear inequalities (and the
nonnegativity constraint) so as to maximize the algebraic connectivity:
maximize λ2
subject to w ⪰ 0, F w ⪯ g,
with variable w ∈ Rm . The problem data are A (which gives the graph topology), and F and g
(which describe the constraints on the weights).
(a) Describe how to solve this problem using convex optimization.
(b) Numerical example. Solve the problem instance given in max_alg_conn_data.m, which uses
F = 1T and g = 1 (so the problem is to allocate a total weight of 1 to the edges of the graph).
Compare the algebraic connectivity for the graph obtained with the optimal weights w⋆ to the
one obtained with wunif = (1/m)1 (i.e., a uniform allocation of weight to the edges).
Use the function plotgraph(A,xy,w) to visualize the weighted graphs, with weight vectors
w⋆ and wunif . You will find that the optimal weight vector w⋆ has some zero entries (which
due to the finite precision of the solver, will appear as small weight values); you may want to
round small values (say, those under 10−4 ) of w⋆ to exactly zero. Use the gplot function to
visualize the original (given) graph, and the subgraph associated with nonzero weights in w⋆ .
Briefly comment on the following (incorrect) intuition: “The more edges a graph has, the more
connected it is, so the optimal weight assignment should make use of all available edges.”
19.7 Graph isomorphism via linear programming. An (undirected) graph with n vertices can be described
by its adjacency matrix A ∈ Sn , given by Aij = 1 if there is an edge between vertices i and j, and
Aij = 0 otherwise.
Two (undirected) graphs are isomorphic if we can permute the vertices of one so it is the same as
the other (i.e., the same pairs of vertices are connected by edges). If we describe them by their
adjacency matrices A and B, isomorphism is equivalent to the existence of a permutation matrix
P ∈ Rn×n such that P AP T = B. (Recall that a matrix P is a permutation matrix if each row and
column has exactly one entry 1, and all other entries 0.) Determining if two graphs are isomorphic,
and if so, finding a suitable permutation matrix P , is called the graph isomorphism problem.
Remarks (not needed to solve the problem). It is not currently known if the graph isomorphism
problem is NP-complete or solvable in polynomial time. The graph isomorphism problem comes
up in several applications, such as determining if two descriptions of a molecule are the same, or
whether the physical layout of an electronic circuit correctly reflects the given circuit schematic
diagram.
(a) Find a set of linear equalities and inequalities on P ∈ Rn×n , that together with the Boolean
constraint Pij ∈ {0, 1}, are necessary and sufficient for P to be a permutation matrix satisfying
P AP T = B. Thus, the graph isomorphism problem is equivalent to a Boolean feasibility LP.
(b) Consider the relaxed version of the Boolean feasibility LP found in part (a), i.e., the LP that
results when the constraints Pij ∈ {0, 1} are replaced with Pij ∈ [0, 1]. When this LP is
infeasible, we can be sure that the two graphs are not isomorphic. If a solution of the LP
is found that satisfies Pij ∈ {0, 1}, then the graphs are isomorphic and we have solved the
graph isomorphism problem. This of course does not always happen, even if the graphs are
isomorphic.
A standard trick to encourage the entries of P to take on the values 0 and 1 is to add a random
linear objective to the relaxed feasibility LP. (This doesn’t change whether the problem is
feasible or not.) In other words, we minimize Σi,j Wij Pij , where Wij are chosen randomly
(say, from N (0, 1)). (This can be repeated with different choices of W .)
Carry out this scheme for the two isomorphic graphs with adjacency matrices A and B given
in graph_isomorphism_data.* to find a permutation matrix P that satisfies P AP T = B.
Report the permutation vector, given by the matrix-vector product P v, where v = (1, 2, . . . , n).
Verify that all the required conditions on P hold. To check that the entries of the solution of
the LP are (close to) {0, 1}, report maxi,j Pij (1 − Pij ). And yes, you might have to try more
than one instance of the randomized method described above before you find a permutation
that establishes isomorphism of the two graphs.
19.8 Flow optimization on a lossy network. We consider a network represented as a directed graph
with n nodes and m edges, with a single commodity flowing across the edges. With each edge we
associate two nonnegative flows, the input flow uj and the output flow vj . We have vj ≤ uj , with
uj − vj interpreted as the loss (of the commodity) on edge j. The relation between the input and
output flows is given by an increasing convex function ϕj : R+ → R+ ∪ {∞}, with uj = ϕj (vj ).
These functions satisfy ϕj (0) = 0 and ϕj (vj ) ≥ vj . We think of uj = ϕj (vj ) as giving the amount of
flow that must go into edge j to achieve a given output flow vj . We interpret ϕj (vj ) = ∞ as meaning
that there is no amount of input flow that can achieve an output flow vj . We write this in compact
vector form as u = ϕ(v), where ϕ : R+^m → (R ∪ {∞})^m is defined as ϕ(v) = (ϕ1 (v1 ), . . . , ϕm (vm )).
An alternative, equivalent characterization is vj = ψj (uj ), where ψj = ϕj^{−1} gives the amount of
output flow we achieve for a given input flow. These functions are increasing and concave, and
satisfy ψj (0) = 0 and ψj (uj ) ≤ uj . We express this in compact vector form as v = ψ(u).
Lossy edges occur in many practical problems, for example power networks, when we model losses
in transmission lines, or financial networks, where there are costs associated with moving money
or some other good across an edge.
Each node has an external source, with flow si into the node. Thus si > 0 means external flow into
the node, and si < 0 means that there is a flow of value −si out of the node.
We have flow conservation at each node in the network. Flow comes into each node from the
external source, and also from the output flows of each edge that is incoming to the node. Flow
comes out of a node from each edge that is outgoing from the node, with the amount equal to the
input flow of that edge. The total incoming and total outgoing flows must match. Let A ∈ Rn×m
denote the incidence matrix of the network, i.e., Aij = 1 if edge j is incoming to node i, Aij = −1
if edge j is outgoing from node i, and Aij = 0 otherwise. Define the matrices Ain = max{A, 0}
(elementwise) and Aout = max{−A, 0}, so A = Ain − Aout . The flow balance equations are then
Ain v + s = Aout u. (We know that the description above is a lot to parse and follow; you can just
use this equation as the flow balance constraint.)
Each node has a cost function associated with its external flow, given by fi (si ). We will assume
that these are convex, and their extended valued extensions are nondecreasing. You can think of
fi (si ) as the cost of injecting flow into the network, when si > 0, and −fi (si ) as the revenue or
utility from extracting −si from the network, when si < 0. The objective we wish to minimize is
the total of these external flow costs, f (s) = Σ_{i=1}^{n} fi (si ).
Explain how to pose the problem of minimizing f (s) subject to the constraints described above,
with variables u ∈ R+^m , v ∈ R+^m , and the external flows s ∈ R^n , as a convex optimization problem.
If you use any relaxation, introduce new variables, or use a change of variables, be sure to justify
it.
19.9 Some functions of graph weights. Consider a connected weighted graph G = (V, E), with weights
we ∈ R+ for e ∈ E.
(a) Distance between two sets of vertices. Let S ⊂ V, T ⊂ V be disjoint sets of vertices. The
distance between S and T , denoted dist(S, T ), is defined as the minimum of the sum of edge
weights over all paths that start in S and end in T .
Considered as a function of edge weights w ∈ R+^{|E|} , is dist(S, T ) convex, concave, or neither
of these?
(b) Optimal value of traveling salesman problem. A tour is a path that includes each vertex in
the graph exactly once. The traveling salesman problem is to find a tour that minimizes total
edge weight along the tour. Its optimal value, denoted T ⋆ , is the minimum of the total edge
weight among all tours.
Considered as a function of edge weights w ∈ R+^{|E|} , is T ⋆ convex, concave, or neither of these?
Please justify your answers. As always, we want the attribute you choose to hold with no further
assumptions.
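For part (a), dist(S, T ) is a minimum of linear functions of w (one per S–T path); a numeric spot-check of midpoint concavity on a toy graph with two candidate paths:

```python
import numpy as np

# Edge-indicator vectors of the two S-T paths in a toy 3-edge graph.
paths = [np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 0.0])]
dist = lambda w: min(path @ w for path in paths)

rng = np.random.default_rng(0)
w1, w2 = rng.random(3), rng.random(3)
lhs = dist(0.5 * (w1 + w2))
rhs = 0.5 * dist(w1) + 0.5 * dist(w2)
print(lhs >= rhs)  # True: a pointwise minimum of linear functions is concave
```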
20 Energy and power
20.1 Power flow optimization with ‘N − 1’ reliability constraint. We model a network of power lines as a
graph with n nodes and m edges. The power flow along line j is denoted pj , which can be positive,
which means power flows along the line in the direction of the edge, or negative, which means power
flows along the line in the direction opposite the edge. (In other words, edge orientation is only
used to determine the direction in which power flow is considered positive.) Each edge can support
power flow in either direction, up to a given maximum capacity Pjmax , i.e., we have |pj | ≤ Pjmax .
Generators are attached to the first k nodes. Generator i provides power gi to the network. These
must satisfy 0 ≤ gi ≤ Gi^max , where Gi^max is a given maximum power available from generator i.
The power generation costs are ci > 0, which are given; the total cost of power generation is cT g.
Electrical loads are connected to the nodes k + 1, . . . , n. We let di ≥ 0 denote the demand at node
k + i, for i = 1, . . . , n − k. We will consider these loads as given. In this simple model we will
neglect all power losses on lines or at nodes. Therefore, power must balance at each node: the total
power flowing into the node must equal the sum of the power flowing out of the node. This power
balance constraint can be expressed as
Ap = (−g, d),

where A is the incidence matrix of the network.
In the basic power flow optimization problem, we choose the generator powers g and the line flow
powers p to minimize the total power generation cost, subject to the constraints listed above.
The (given) problem data are the incidence matrix A, line capacities P max , demands d, maximum
generator powers Gmax , and generator costs c.
In this problem we will add a basic (and widely used) reliability constraint, commonly called an
‘N − 1 constraint’. (N is not a parameter in the problem; ‘N − 1’ just means ‘all-but-one’.) This
states that the system can still operate even if any one power line goes out, by re-routing the line
powers. The case when line j goes out is called ‘failure contingency j’; this corresponds to replacing
P_j^max with 0. The requirement is that there must exist a contingency power flow vector p^(j) that satisfies all the constraints above, with p_j^(j) = 0, using the same given generator powers. (This
corresponds to the idea that power flows can be re-routed quickly, but generator power can only
be changed more slowly.) The ‘N − 1 reliability constraint’ requires that for each line, there is a
contingency power flow vector. The ‘N − 1 reliability constraint’ is (implicitly) a constraint on the
generator powers.
The questions below concern the specific instance of this problem with data given in rel_pwr_flow_data.*. (Executing this file will also generate a figure showing the network you are optimizing.) Especially
for part (b) below, you must explain exactly how you set up the problem as a convex optimization
problem.
(a) Nominal optimization. Find the optimal generator and line power flows for this problem
instance (without the N − 1 reliability constraint). Report the optimal cost and generator
powers. (You do not have to give the power line flows.)
(b) Nominal optimization with N − 1 reliability constraint. Minimize the nominal cost, but you
must choose generator powers that meet the N − 1 reliability requirement as well. Report the
optimal cost and generator powers. (You do not have to give the nominal power line flows, or
any of the contingency flows.)
20.2 Optimal generator dispatch. In the generator dispatch problem, we schedule the electrical output
power of a set of generators over some time interval, to minimize the total cost of generation while
exactly meeting the (assumed known) electrical demand. One challenge in this problem is that the
generators have dynamic constraints, which couple their output powers over time. For example,
every generator has a maximum rate at which its power can be increased or decreased.
We label the generators i = 1, . . . , n, and the time periods t = 1, . . . , T . We let pi,t denote the
(nonnegative) power output of generator i at time interval t. The (positive) electrical demand in
period t is d_t. The total generated power in each period must equal the demand:

    ∑_{i=1}^n p_{i,t} = d_t,    t = 1, . . . , T.
The cost of operating generator i at power output u is ϕi (u), where ϕi is an increasing strictly
convex function. (Assuming the cost is mostly fuel cost, convexity of ϕi says that the thermal
efficiency of the generator decreases as its output power increases.) We will assume these cost
functions are quadratic: ϕi (u) = αi u + βi u2 , with αi and βi positive.
Each generator has a maximum ramp-rate, which limits the amount its power output can change
over one time period:
|pi,t+1 − pi,t | ≤ Ri , i = 1, . . . , n, t = 1, . . . , T − 1.
In addition, changing the power output of generator i from ut to ut+1 incurs an additional cost
ψi (ut+1 − ut ), where ψi is a convex function. (This cost can be a real one, due to increased fuel
use during a change of power, or a fictitious one that accounts for the increased maintenance cost
or decreased lifetime caused by frequent or large changes in power output.) We will use the power
change cost functions ψi (v) = γi |v|, where γi are positive.
Power plants with large capacity (i.e., Pimax ) are typically more efficient (i.e., have smaller αi , βi ),
but have smaller ramp-rate limits, and higher costs associated with changing power levels. Small
gas-turbine plants (‘peakers’) are less efficient, have less capacity, but their power levels can be
rapidly changed.
The total cost of operating the generators is

    C = ∑_{i=1}^n ∑_{t=1}^T ϕ_i(p_{i,t}) + ∑_{i=1}^n ∑_{t=1}^{T−1} ψ_i(p_{i,t+1} − p_{i,t}).
Choosing the generator output schedules to minimize C, while respecting the constraints described
above, is a convex optimization problem. The problem data are d_t (the demands), the generator power limits P_i^min and P_i^max, the ramp-rate limits R_i, and the cost function parameters α_i, β_i, and γ_i. We will assume that the problem is feasible, and that p⋆_{i,t} are the (unique) optimal output powers.
(a) Price decomposition. Show that there are power prices Q1 , . . . , QT for which the following
holds: For each i, p⋆i,t solves the optimization problem
    minimize    ∑_{t=1}^T (ϕ_i(p_{i,t}) − Q_t p_{i,t}) + ∑_{t=1}^{T−1} ψ_i(p_{i,t+1} − p_{i,t})
    subject to  P_i^min ≤ p_{i,t} ≤ P_i^max,  t = 1, . . . , T
                |p_{i,t+1} − p_{i,t}| ≤ R_i,  t = 1, . . . , T − 1.
The objective here is the portion of the objective for generator i, minus the revenue generated
by the sale of power at the prices Qt . Note that this problem involves only generator i; it can
be solved independently of the other generators (once the prices are known). How would you
find the prices Qt ?
You do not have to give a full formal proof, but you must explain your argument fully. You are welcome to use results from the textbook.
(b) Solve the generator dispatch problem with the data given in gen_dispatch_data.m, which
gives (fake, but not unreasonable) demand data for 2 days, at 15 minute intervals. This file
includes code to plot the demand, optimal generator powers, and prices. (You must replace
these variables with their correct values.) Comment on anything you see in your solution
that might at first seem odd. Using the prices found, solve the problems in part (a) for the
generators separately, to be sure they give the optimal powers (up to some small numerical
errors).
Remark. While beyond the scope of this course, we mention that there are very simple price update
mechanisms that adjust the prices in such a way that when the generators independently schedule
themselves using the prices (as described above), we end up with the total power generated in each
period matching the demand, i.e., the optimal solution of the whole (coupled) problem. This gives
a decentralized method for generator dispatch.
20.3 Optimizing a portfolio of energy sources. We have n different energy sources, such as coal-fired
plants, several wind farms, and solar farms. Our job is to size each of these, i.e., to choose its
capacity. We will denote by c_i the capacity of plant i; these must satisfy c_i^min ≤ c_i ≤ c_i^max, where c_i^min and c_i^max are given minimum and maximum values.
Each generation source has a cost to build and operate (including fuel, maintenance, government
subsidies and taxes) over some time period. We lump these costs together, and assume that the
cost is proportional to ci , with (given) coefficient bi . Thus, the total cost to build and operate the
energy sources is bT c (in, say, $/hour).
Each generation source is characterized by an availability ai , which is a random variable with values
in [0, 1]. If source i has capacity ci , then the power available from the plant is ci ai ; the total power
available from the portfolio of energy sources is cT a, which is a random variable. A coal fired plant
has ai = 1 almost always, with ai < 1 when one of its units is down for maintenance. A wind farm,
in contrast, is characterized by strong fluctuations in availability, with a_i = 1 meaning a strong wind
is blowing, and ai = 0 meaning no wind is blowing. A solar farm has ai = 1 only during peak sun
hours, with no cloud cover; at other times (such as night) we have ai = 0.
Energy demand d ∈ R_+ is also modeled as a random variable. The components of a (the availabilities) and d (the demand) are not independent. Whenever the total power available falls short of the demand, the additional needed power is generated by (expensive) peaking power plants at a fixed positive price p. The average cost of energy produced by the peakers is

    E[p(d − c^T a)_+],
where x+ = max{0, x}. This average cost has the same units as the cost bT c to build and operate
the plants.
The objective is to choose c to minimize the overall cost

    C = b^T c + E[p(d − c^T a)_+].
Sample average approximation. To solve this problem, we will minimize a cost function based on a sample average of peaker cost,

    C^sa = b^T c + (1/N) ∑_{j=1}^N p(d^(j) − c^T a^(j))_+,
where (a(j) , d(j) ), j = 1, . . . , N , are (given) samples from the joint distribution of a and d. (These
might be obtained from historical data, weather and demand forecasting, and so on.)
Validation. After finding an optimal value of c, based on the set of samples, you should double
check or validate your choice of c by evaluating the overall cost on another set of (validation)
samples, (ã^(j), d̃^(j)), j = 1, . . . , N^val:

    C^val = b^T c + (1/N^val) ∑_{j=1}^{N^val} p(d̃^(j) − c^T ã^(j))_+.
(These could be another set of historical data, held back for validation purposes.) If C sa ≈ C val ,
our confidence that each of them is approximately the optimal value of C is increased.
Finally we get to the problem. Get the data in energy_portfolio_data.m, which includes the
required problem data, and the samples, which are given as a 1 × N row vector d for the scalars
d(j) , and an n × N matrix A for a(j) . A second set of samples is given for validation, with the names
d_val and A_val.
Carry out the optimization described above. Give the optimal cost obtained, C sa , and compare to
the cost evaluated using the validation data set, C val .
Compare your solution with the following naive (‘certainty-equivalent’) approach: Replace a and
d with their (sample) means, and then solve the resulting optimization problem. Give the optimal
cost obtained, C ce (using the average values of a and d). Is this a lower bound on the optimal value
of the original problem? Now evaluate the cost for these capacities on the validation set, C ce,val .
Make a brief statement.
20.4 Optimizing processor speed. A set of n tasks is to be completed by n processors. The variables
to be chosen are the processor speeds s1 , . . . , sn , which must lie between a given minimum value
smin and a maximum value smax . The computational load of task i is αi , so the time required to
complete task i is τi = αi /si .
The power consumed by processor i is given by pi = f (si ), where f : R → R is positive, increasing,
and convex. Therefore, the total energy consumed is
    E = ∑_{i=1}^n (α_i/s_i) f(s_i).
(Here we ignore the energy used to transfer data between processors, and assume the processors
are powered down when they are not active.)
There is a set of precedence constraints for the tasks, which is a set of m ordered pairs P ⊆
{1, . . . , n} × {1, . . . , n}. If (i, j) ∈ P, then task j cannot start until task i finishes. (This would be
the case, for example, if task j requires data that is computed in task i.) When (i, j) ∈ P, we refer
to task i as a precedent of task j, since it must precede task j. We assume that the precedence
constraints define a directed acyclic graph (DAG), with an edge from i to j if (i, j) ∈ P.
If a task has no precedents, then it starts at time t = 0. Otherwise, each task starts as soon as all
of its precedents have finished. We let T denote the time for all tasks to be completed.
To be sure the precedence constraints are clear, we consider the very small example shown below,
with n = 6 tasks and m = 6 precedence constraints.
P = {(1, 4), (1, 3), (2, 3), (3, 6), (4, 6), (5, 6)}.
[Figure: the precedence DAG on tasks 1–6, with an edge from i to j for each (i, j) ∈ P.]
In this example, tasks 1, 2, and 5 start at time t = 0 (since they have no precedents). Task 1
finishes at t = τ1 , task 2 finishes at t = τ2 , and task 5 finishes at t = τ5 . Task 3 has tasks 1 and 2 as
precedents, so it starts at time t = max{τ1 , τ2 }, and ends τ3 seconds later, at t = max{τ1 , τ2 } + τ3 .
Task 4 completes at time t = τ1 + τ4 . Task 6 starts when tasks 3, 4, and 5 have finished, at time
t = max{max{τ1 , τ2 } + τ3 , τ1 + τ4 , τ5 }. It finishes τ6 seconds later. In this example, task 6 is the
last task to be completed, so we have
T = max{max{τ1 , τ2 } + τ3 , τ1 + τ4 , τ5 } + τ6 .
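To make the precedence semantics concrete, the completion time T can be computed by a longest-path recursion over the DAG. A small sketch using the example's P (the τ values here are made up):

```python
def completion_time(n, P, tau):
    """Time at which all n tasks finish; tau[i-1] is the duration of task i.

    Each task starts when all of its precedents have finished (at t = 0
    if it has none), so finish times satisfy a longest-path recursion.
    """
    preds = {j: [] for j in range(1, n + 1)}
    for i, j in P:
        preds[j].append(i)
    finish = {}
    def f(j):  # finish time of task j (memoized; P is assumed acyclic)
        if j not in finish:
            finish[j] = tau[j - 1] + max((f(i) for i in preds[j]), default=0.0)
        return finish[j]
    return max(f(j) for j in range(1, n + 1))

P = [(1, 4), (1, 3), (2, 3), (3, 6), (4, 6), (5, 6)]
print(completion_time(6, P, [3.0, 1.0, 2.0, 1.0, 5.0, 1.0]))  # → 6.0
```

This matches the formula above: T = max{max{τ1, τ2} + τ3, τ1 + τ4, τ5} + τ6.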
(a) Formulate the problem of choosing processor speeds (between the given limits) to minimize
completion time T , subject to an energy limit E ≤ Emax , as a convex optimization problem.
The data in this problem are P, smin , smax , α1 , . . . , αn , Emax , and the function f . The variables
are s1 , . . . , sn .
Feel free to change variables or to introduce new variables. Be sure to explain clearly why
your formulation of the problem is convex, and why it is equivalent to the problem statement
above.
Important:
• Your formulation must be convex for any function f that is positive, increasing, and
convex. You cannot make any further assumptions about f .
• This problem refers to the general case, not the small example described above.
(b) Consider the specific instance with data given in proc_speed_data.m, and processor power
f (s) = 1 + s + s2 + s3 .
The precedence constraints are given by an m × 2 matrix prec, where m is the number of
precedence constraints, with each row giving one precedence constraint (the first column gives
the precedents).
Plot the optimal trade-off curve of energy E versus time T , over a range of T that extends
from its minimum to its maximum possible value. (These occur when all processors operate at
smax and smin , respectively, since T is monotone nonincreasing in s.) On the same plot, show
the energy-time trade-off obtained when all processors operate at the same speed s̄, which is
varied from smin to smax .
Note: In this part of the problem there is no limit Emax on E as in part (a); you are to find
the optimal trade-off of E versus T .
20.5 Minimum energy processor speed scheduling. A single processor can adjust its speed in each of T
time periods, labeled 1, . . . , T . Its speed in period t will be denoted st , t = 1, . . . , T . The speeds
must lie between given (positive) minimum and maximum values, S min and S max , respectively, and
must satisfy a slew-rate limit, |st+1 − st | ≤ R, t = 1, . . . , T − 1. (That is, R is the maximum allowed
period-to-period change in speed.) The energy consumed by the processor in period t is given by ϕ(s_t), where ϕ : R → R is increasing and convex. The total energy consumed over all the periods is E = ∑_{t=1}^T ϕ(s_t).
The processor must handle n jobs, labeled 1, . . . , n. Each job has an availability time Ai ∈
{1, . . . , T }, and a deadline Di ∈ {1, . . . , T }, with Di ≥ Ai . The processor cannot start work
on job i until period t = Ai , and must complete the job by the end of period Di . Job i involves a
(nonnegative) total work Wi . You can assume that in each time period, there is at least one job
available, i.e., for each t, there is at least one i with Ai ≤ t and Di ≥ t.
In period t, the processor allocates its effort across the n jobs as θt , where 1T θt = 1, θt ⪰ 0. Here
θti (the ith component of θt ) gives the fraction of the processor effort devoted to job i in period t.
Respecting the availability and deadline constraints requires that θti = 0 for t < Ai or t > Di . To
complete the jobs we must have

    ∑_{t=A_i}^{D_i} θ_{ti} s_t ≥ W_i,    i = 1, . . . , n.
(a) Formulate the problem of choosing the speeds s1 , . . . , sT , and the allocations θ1 , . . . , θT , in
order to minimize the total energy E, as a convex optimization problem. The problem data
are S min , S max , R, ϕ, and the job data, Ai , Di , Wi , i = 1, . . . , n. Be sure to justify any change
of variables, or introduction of new variables, that you use in your formulation.
(b) Carry out your method on the problem instance described in proc_sched_data.m, with
quadratic energy function ϕ(st ) = α + βst + γs2t . (The parameters α, β, and γ are given
in the data file.) Executing this file will also give a plot showing the availability times and
deadlines for the jobs.
Give the energy obtained by your speed profile and allocations. Plot these using the command
bar((s*ones(1,n)).*theta,1,'stacked'), where s is the T × 1 vector of speeds, and θ is
the T × n matrix of allocations with components θti . This will show, at each time period, how
much effective speed is allocated to each job. The top of the plot will show the speed st . (You
don’t need to turn in a color version of this plot; B&W is fine.)
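For the Python-based packages, a matplotlib equivalent of the Matlab bar command above might look like the following sketch (the s and theta arrays below are placeholders; in the exercise they come from your solution):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; remove for on-screen plots
import matplotlib.pyplot as plt

def stacked_speed_plot(s, theta):
    """Stacked bars of effective speed theta[t, i] * s[t] per job, per period."""
    T, n = theta.shape
    eff = theta * s[:, None]          # effective speed devoted to each job
    bottom = np.zeros(T)
    fig, ax = plt.subplots()
    for i in range(n):
        ax.bar(np.arange(1, T + 1), eff[:, i], width=1.0, bottom=bottom,
               label=f"job {i + 1}")
        bottom += eff[:, i]           # bar tops trace out the speed s_t
    ax.set_xlabel("t")
    ax.set_ylabel("effective speed")
    return fig, ax

# Placeholder data: 5 periods, 3 jobs, equal effort split.
s = 2.0 * np.ones(5)
theta = np.full((5, 3), 1.0 / 3.0)
fig, ax = stacked_speed_plot(s, theta)
fig.savefig("speeds.png")
```

Because 1^T θ_t = 1, the top of the stack in period t sits at s_t, as in the Matlab plot.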
20.6 AC power flow analysis via convex optimization. This problem concerns an AC (alternating current)
power system consisting of m transmission lines that connect n nodes. We describe the topology
by the node-edge incidence matrix A ∈ R^{n×m}, where

    A_ij =  +1  line j leaves node i
            −1  line j enters node i
             0  otherwise.
The power flow on line j is pj (with positive meaning in the direction of the line as defined in A,
negative meaning power flow in the opposite direction).
Node i has voltage phase angle ϕi , and external power input si . (If a generator is attached to node
i we have si > 0; if a load is attached we have si < 0; if the node has neither, si = 0.) Neglecting
power losses in the lines, and assuming power is conserved at each node, we have Ap = s. (We must
have 1T s = 0, which means that the total power pumped into the network by generators balances
the total power pulled out by the loads.)
The line power flows are a nonlinear function of the difference of the phase angles at the nodes they
connect to:
pj = κj sin(ϕk − ϕl ),
where line j goes from node k to node l. Here κj is a known positive constant (related to the
inductance of the line). We can write this in matrix form as p = diag(κ) sin(AT ϕ), where sin is
applied elementwise.
The DC power flow equations are

    Ap = s,    p = diag(κ) sin(A^T ϕ).
In the power analysis problem, we are given s, and want to find p and ϕ that satisfy these equations.
We are interested in solutions with voltage phase angle differences that are smaller than ±90◦ .
(Under normal conditions, real power lines are never operated with voltage phase angle differences
more than ±20◦ or so.)
You will show that the DC power flow equations can be solved by solving the convex optimization problem

    minimize    ∑_{j=1}^m ψ_j(p_j)
    subject to  Ap = s,

with variable p, where

    ψ_j(u) = ∫_0^u sin^{−1}(v/κ_j) dv = u sin^{−1}(u/κ_j) + κ_j (√(1 − (u/κ_j)²) − 1),

with domain dom ψ_j = (−κ_j, κ_j). (The second expression will be useless in this problem.)
20.7 Power transmission with losses. A power transmission grid is modeled as a set of n nodes and
m directed edges (which represent transmission lines), with topology described by the node-edge
incidence matrix A ∈ R^{n×m}, defined by

    A_ij =  +1  edge j enters node i,
            −1  edge j leaves node i,
             0  otherwise.
We let p_j^in ≥ 0 denote the power that flows into the tail of edge j, and p_j^out ≥ 0 the power that emerges from the head of edge j, for j = 1, . . . , m. Due to transmission losses, the power that flows into each edge is more than the power that emerges:

    p_j^in = p_j^out + α L_j (p_j^out)²/R_j²,    j = 1, . . . , m,

where L_j > 0 is the length of transmission line j, R_j > 0 is the radius of the conductors on line j, and α > 0 is a constant. (The second term on the righthand side above is the transmission line power loss.) In addition, each edge has a maximum allowed input power, which also depends on the conductor radius: p_j^in ≤ σ R_j², j = 1, . . . , m, where σ > 0 is a constant.
Generators are attached to nodes i = 1, . . . , k, and loads are attached to nodes i = k + 1, . . . , n.
We let gi denote the (nonnegative) power injected into node i by its generator, for i = 1, . . . , k. We
let li denote the (nonnegative) power pulled from node i by the load, for i = k + 1, . . . , n. These
load powers are known and fixed.
We must have power balance at each node. For i = 1, . . . , k, the sum of all power entering the node
from incoming transmission lines, plus the power supplied by the generator, must equal the sum of
all power leaving the node on outgoing transmission lines:
    ∑_{j∈E(i)} p_j^out + g_i = ∑_{j∈L(i)} p_j^in,    i = 1, . . . , k,
where E(i) (L(i)) is the set of edge indices for edges entering (leaving) node i. For the load nodes
i = k + 1, . . . , n we have a similar power balance condition:
    ∑_{j∈E(i)} p_j^out = ∑_{j∈L(i)} p_j^in + l_i,    i = k + 1, . . . , n.
Each generator can vary its power g_i over a given range [0, G_i^max], and has an associated cost of generation ϕ_i(g_i), where ϕ_i is convex and strictly increasing, for i = 1, . . . , k.
(a) Minimum total cost of generation. Formulate the problem of choosing generator and edge input
and output powers, so as to minimize the total cost of generation, as a convex optimization
problem. (All other quantities described above are known.) Be sure to explain any additional
variables or terms you introduce, and to justify any transformations you make.
Hint: You may find the matrices A+ = (A)+ and A− = (−A)+ helpful in expressing the power
balance constraints.
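The hint's matrices are just the elementwise positive parts of A and −A. A small numpy sketch, with a made-up incidence matrix:

```python
import numpy as np

# Made-up incidence matrix for 3 nodes, 3 edges (+1 enters, -1 leaves).
A = np.array([[ 1, -1,  0],
              [-1,  0,  1],
              [ 0,  1, -1]])

A_plus  = np.maximum(A, 0)    # (A)_+  : picks out edges entering each node
A_minus = np.maximum(-A, 0)   # (-A)_+ : picks out edges leaving each node
assert (A_plus - A_minus == A).all()

# With these, the node power balances can be written in matrix form:
# A_plus @ p_out  sums p_j^out over edges entering each node, and
# A_minus @ p_in  sums p_j^in  over edges leaving each node.
print(A_plus)
print(A_minus)
```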
(b) Marginal cost of power at load nodes. The (marginal) cost of power at node i, for i = k +
1, . . . , n, is the partial derivative of the minimum total power generation cost, with respect to
varying the load power li . (We will simply assume these partial derivatives exist.) Explain
how to find the marginal cost of power at node i, from your formulation in part (a).
(c) Optimal sizing of lines. Now suppose that you can optimize over generator powers, edge input
and output powers (as above), and the power line radii R_j, j = 1, . . . , m. These must lie between given limits, R_j ∈ [R_j^min, R_j^max] (with R_j^min > 0), and we must respect a total volume constraint on the lines,

    ∑_{j=1}^m L_j R_j² ≤ V^max.
Formulate the problem of choosing generator and edge input and output powers, as well as
power line radii, so as to minimize the total cost of generation, as a convex optimization
problem. (Again, explain anything that is not obvious.)
(d) Numerical example. Using the data given in ptrans_loss_data.m, find the minimum total
generation cost and the marginal cost of power at nodes k + 1, . . . , n, for the case described
in parts (a) and (b) (i.e., using the fixed given radii Rj ), and also for the case described in
part (c), where you are allowed to change the transmission line radii, keeping the same total
volume as the original lines. For the generator costs, use the quadratic functions

    ϕ_i(g_i) = a_i g_i + b_i g_i²,    i = 1, . . . , k.
20.8 Utility/power trade-off in a wireless network. In this problem we explore the trade-off between
total utility and total power usage in a wireless network in which the link transmit powers can
be adjusted. The network consists of a set of nodes and a set of links over which data can be
transmitted. There are n routes, each corresponding to a sequence of links from a source to a
destination node. Route j has a data flow rate f_j ∈ R_+ (in units of bits/sec, say). The total utility (which we want to maximize) is

    U(f) = ∑_{j=1}^n U_j(f_j),

where U_j is the utility function for route j, which we assume is concave and increasing.
The total traffic on a link is the sum of the flows that pass over the link. The traffic (vector) is thus t = Rf ∈ R^m, where R is the routing matrix, with R_ij = 1 if route j passes over link i and R_ij = 0 otherwise. The traffic on each link cannot exceed the capacity of the link, i.e., t ⪯ c, where c ∈ R_+^m is the vector of link capacities.
The link capacities, in turn, are functions of the link transmit powers, given by p ∈ R_+^m, which cannot exceed given limits, i.e., p ⪯ p^max. These are related by

    c_i = α_i log(1 + β_i p_i),
where αi and βi are positive parameters that characterize link i. The second objective (which we
want to minimize) is P = 1T p, the total (transmit) power.
(a) Explain how to find the optimal trade-off curve of total utility and total power, using convex
or quasiconvex optimization.
(b) Plot the optimal trade-off curve for the problem instance with m = 20, n = 10, U_j(x) = √x for j = 1, . . . , n, p_i^max = 10, α_i = β_i = 1 for i = 1, . . . , m, and network topology generated using
rand(’seed’,3);
R = round(rand(m,n));
Your plot should have the total power on the horizontal axis.
20.9 Energy storage trade-offs. We consider the use of a storage device (say, a battery) to reduce the
total cost of electricity consumed over one day. We divide the day into T time periods, and let
pt denote the (positive, time-varying) electricity price, and ut denote the (nonnegative) usage or
consumption, in period t, for t = 1, . . . , T . Without the use of a battery, the total cost is pT u.
Let qt denote the (nonnegative) energy stored in the battery in period t. For simplicity, we neglect
energy loss (although this is easily handled as well), so we have qt+1 = qt + ct , t = 1, . . . , T − 1,
where ct is the charging of the battery in period t; ct < 0 means the battery is discharged. We will
require that q1 = qT + cT , i.e., we finish with the same battery charge that we start with. With
the battery operating, the net consumption in period t is ut + ct ; we require this to be nonnegative
(i.e., we do not pump power back into the grid). The total cost is then pT (u + c).
The battery is characterized by three parameters: The capacity Q, where qt ≤ Q; the maximum
charge rate C, where ct ≤ C; and the maximum discharge rate D, where ct ≥ −D. (The parameters
Q, C, and D are nonnegative.)
(a) Explain how to find the charging profile c ∈ RT (and associated stored energy profile q ∈ RT )
that minimizes the total cost, subject to the constraints.
(b) Solve the problem instance with data p and u given in storage_tradeoff_data.*, Q = 35,
and C = D = 3. Plot ut , pt , ct , and qt versus t.
(c) Storage trade-offs. Plot the minimum total cost versus the storage capacity Q, using p and
u from storage_tradeoff_data.*, and charge/discharge limits C = D = 3. Repeat for
charge/discharge limits C = D = 1. (Put these two trade-off curves on the same plot.) Give
an interpretation of the endpoints of the trade-off curves.
20.10 Cost-comfort trade-off in air conditioning. A heat pump (air conditioner) is used to cool a residence
to temperature Tt in hour t, on a day with outside temperature Ttout , for t = 1, . . . , 24. These
temperatures are given in Kelvin, and we will assume that Ttout ≥ Tt .
A total amount of heat Qt = α(Ttout − Tt ) must be removed from the residence in hour t, where α
is a positive constant (related to the quality of thermal insulation).
The electrical energy required to pump out this heat is given by E_t = Q_t/γ_t, where

    γ_t = η T_t/(T_t^out − T_t)
is the coefficient of performance of the heat pump and η ∈ (0, 1] is the efficiency constant. The
efficiency is typically around 0.6 for a modern unit; the theoretical limit is η = 1. (When Tt = Ttout ,
we take γt = ∞ and Et = 0.)
Electrical energy prices vary with the hour, and are given by P_t > 0 for t = 1, . . . , 24. The total energy cost is C = ∑_t P_t E_t. We will assume that the prices are known.
Discomfort is measured using a piecewise-linear function of temperature,
    D_t = (T_t − T^ideal)_+,

where T^ideal is an ideal temperature, below which there is no discomfort. The total daily discomfort is D = ∑_{t=1}^{24} D_t. You can assume that T^ideal < T_t^out.
To get a point on the optimal cost-comfort trade-off curve, we will minimize C + λD, where λ > 0.
The variables to be chosen are T1 , . . . , T24 ; all other quantities described above are given.
Show that this problem has an analytical solution of the form Tt = ψ(Pt , Ttout ), where ψ : R2 → R.
The function ψ can depend on the constants α, η, T ideal , λ. Give ψ explicitly. You are free (indeed,
encouraged) to check your formula using CVX, with made up values for the constants.
Disclaimer. The focus of this course is not on deriving 19th century pencil and paper solutions to
problems. But every now and then, a practical problem will actually have an analytical solution.
This is one of them.
20.11 Optimal electric motor drive currents. In this problem you will design the drive current waveforms
for an AC (alternating current) electric motor. The motor has a magnetic rotor which spins with
constant angular velocity ω ≥ 0 inside the stationary stator. The stator contains three circuits
(called phase windings) with (vector) current waveform i : R → R3 and (vector) voltage waveform
v : R → R3 , which are 2π-periodic functions of the angular position θ of the rotor. The circuit
dynamics are

    v(θ) = R i(θ) + ωL (d/dθ) i(θ) + ω k(θ),
where R ∈ S3++ is the resistance matrix, L ∈ S3++ is the inductance matrix, and k : R → R3 , a
2π-periodic function of θ, is the back-EMF waveform (which encodes the electromagnetic coupling
between the rotor permanent magnets and the phase windings). The angular velocity ω, the
matrices R and L, and the back-EMF waveform k, are known.
We must have |vi (θ)| ≤ v supply , i = 1, 2, 3, where v supply is the (given) supply voltage. The output
torque of the motor at rotor position θ is τ (θ) = k(θ)T i(θ). We will require the torque to have a
given constant nonnegative value: τ (θ) = τ des for all θ.
The average power loss in the motor is

    P^loss = (1/2π) ∫_0^{2π} i(θ)^T R i(θ) dθ.
The mechanical output power is P^out = ωτ^des, and the motor efficiency is η = P^out/(P^out + P^loss). In a discretized version of the problem, the average power loss is approximated as

    P^loss = (1/N) ∑_{j=1}^N i(jh)^T R i(jh),

where h = 2π/N.
20.12 Decomposing a PV array output time series. We are given a time series p ∈ R_+^T that gives the output power of a photo-voltaic (PV) array in 5-minute intervals, over T = 2016 periods (one week), given in pv_output_data.*. In this problem you will use convex optimization to decompose the time series into three components:
• The clear sky output c ∈ R_+^T, a smooth daily-periodic component, which gives what the PV output would have been without clouds. This signal is 24-hour-periodic, i.e., c_{t+288} = c_t for t = 1, . . . , T − 288. (The clear sky output is zero at night, but we will not use this prior information in our decomposition method.)
• A weather shading loss component s ∈ R_+^T, which gives the loss of power due to clouds. This component satisfies 0 ⪯ s ⪯ c, can change rapidly, and is not periodic.
• A residual r ∈ R^T, which accounts for measurement error, anomalies, and other errors.
20.13 Optimal operation of a microgrid. We consider a small electrical microgrid that consists of a photo-
voltaic (PV) array, a storage device (battery), a load, and a connection to an external grid. We will
optimize the operation of the microgrid over one day, in 15 minute increments, so all powers, and the
battery charge, are represented as vectors in R^96. The load power is p^ld, which is nonnegative and known. The power that we take from the external grid is p^grid; p_i^grid ≥ 0 means we are consuming power from the grid, and p_i^grid < 0 means we are sending power back into the grid, in time period i. The PV array output, which is nonnegative and known, is denoted as p^pv. The battery power is p^batt, with p_i^batt ≥ 0 meaning the battery is discharging, and p_i^batt < 0 meaning the battery is charging. These powers must balance in all periods, i.e., we have

    p^ld = p^grid + p^batt + p^pv.
(This is called the power balance constraint. The lefthand side is the load power, and the righthand
side is the sum of the power coming from the grid, the battery, and the PV array.) All powers are
given in kW.
The battery state of charge is given by q ∈ R^96. It must satisfy 0 ≤ q_i ≤ Q for all i, where Q is the battery capacity (in kWh). The battery dynamics are

    q_{i+1} = q_i − (1/4) p_i^batt,   i = 1, . . . , 95,       q_1 = q_96 − (1/4) p_96^batt.
(The last equation means that we seek a periodic operation of the microgrid.) The battery power must satisfy −C ≤ p_i^batt ≤ D for all i, where C and D are (positive) known maximum charge and maximum discharge rates.
When we buy power (i.e., p_i^grid ≥ 0) we pay for it at the rate of R_i^buy (in $/kWh). When we sell power to the grid (i.e., p_i^grid < 0) we are paid for it at the rate of R_i^sell. These (positive) prices vary with time period, and are known. The total cost of the grid power (in $) is

    (1/4) (R^buy)^T (p^grid)_+ − (1/4) (R^sell)^T (p^grid)_−,
where (p^grid)_+ = max{p^grid, 0} and (p^grid)_− = max{−p^grid, 0} (elementwise). You can assume that R_i^buy > R_i^sell > 0, i.e., in every period, you pay at a higher rate to consume power from the grid than you are paid when you send power back into the grid.
The data for the problem are p^ld, p^pv, Q, C, D, R^buy, and R^sell.
(a) Explain how to find the powers and battery state of charge that minimize the total cost of the
grid power. Carry out your method using the data given in microgrid_data.*. Report the
optimal cost of the grid power. Plot $p^\text{grid}$, $p^\text{ld}$, $p^\text{pv}$, $p^\text{batt}$, and $q$ versus $i$. Note. For CVXPY,
you might need to specify solver=cvx.ECOS when you call the solve() method.
(b) Price and payments. Let ν ∈ R96 denote the optimal dual variable associated with the power
balance constraint. The vector 4ν can be interpreted as the (time-varying) price of electricity
at the microgrid, and is called the locational marginal price (LMP). The LMP is in $/kWh,
and is generally positive; the factor 4 converts between 15 minute power intervals and per kWh
prices. Find and plot the LMP, along with the grid buy and sell prices, versus i. Make a very
brief comment comparing the LMP prices with the buy and sell grid prices. Hint. Depending
on how you express the power balance constraint, your software might return −ν instead of
ν. Feel free to use −4ν instead of ν, or to switch the left-hand and right-hand sides of your
power balance constraint.
(c) The LMPs can be used as a system for payments among the load, the PV array, the battery,
and the grid. The load pays $\nu^T p^\text{ld}$; the PV array is paid $\nu^T p^\text{pv}$; the battery is paid $\nu^T p^\text{batt}$;
and the grid is paid $\nu^T p^\text{grid}$. Note carefully the directions of these payments. Also note that
the battery and grid, whose powers can have either sign, can be paid in some time intervals
and pay in others.
Use this pricing scheme to calculate the LMP payments made by the load, and to the PV
array, the battery, and the grid. If all goes well, these payments will balance, i.e., the load
will pay an amount equal to the sum of the others.
When you execute the script that contains the data, it will create plots showing the various powers
and prices versus time. You are welcome to use these as templates for plotting your results. You
are very welcome to look inside the script to see how the data is generated.
Remark. (Not needed to solve the problem.) The given data is approximately consistent with a
group of ten houses, a common or pooled PV array of around 100 panels, and two Tesla Powerwall
batteries.
20.14 Electric vehicle charging. A group of N electric vehicles need to charge their batteries over the next
T time periods. The charging energy for vehicle $i$ in period $t$ is given by $c_{t,i} \geq 0$, for $t = 1, \ldots, T$
and $i = 1, \ldots, N$. In each time period, the total charging energy over all vehicles cannot exceed
$C^\text{max}$, i.e., $\sum_{i=1}^N c_{t,i} \leq C^\text{max}$ for $t = 1, \ldots, T$.
The state of charge for vehicle $i$ in period $t$ is denoted $q_{t,i} \geq 0$. The charging dynamics is
$$q_{t+1,i} = q_{t,i} + c_{t,i}, \quad t = 1, \ldots, T, \quad i = 1, \ldots, N.$$
Note that $q_{t,i}$ is defined for $t = T + 1$. The initial vehicle charges $q_{1,i}$ are given. The charging
energy and state of charge are given in kWh (kilowatt-hours).
The vehicles have different preferences for how much charge they acquire over time. This is expressed
by a target minimum charge level over time, given by $q^\text{tar}_{t,i} \in \mathbf{R}_+$, $t = 1, \ldots, T + 1$. These
are nondecreasing, i.e., $q^\text{tar}_{t+1,i} \geq q^\text{tar}_{t,i}$ for $t = 1, \ldots, T$, $i = 1, \ldots, N$. The charging shortfall in period
$t$ for vehicle $i$ is given by
$$s_{t,i} = (q^\text{tar}_{t,i} - q_{t,i})_+, \quad t = 1, \ldots, T + 1, \quad i = 1, \ldots, N,$$
where (a)+ = max{a, 0}. Our objective is to minimize the mean square shortfall, given by
$$S = \frac{1}{(T+1)N} \sum_{t=1}^{T+1} \sum_{i=1}^{N} s_{t,i}^2.$$
This is the same as minimizing the root-mean-square (RMS) shortfall, given by $\sqrt{S}$ (which has
units of kWh).
Explain how to solve the problem using convex optimization, and solve the following problem
instance. We have N = 4 vehicles, T = 90 time periods, and C max = 3. The initial charges q1,i are
20, 0, 30, and 25, respectively. The target minimum charge profiles have the form
$$q^\text{tar}_{t,i} = q^\text{des}_i \left( \frac{t}{T+1} \right)^{\gamma_i}, \quad t = 1, \ldots, T + 1, \quad i = 1, \ldots, N,$$
with $\gamma$ values 0.5, 0.3, 2.0, 0.6 and $q^\text{des}$ values 60, 100, 75, 125. Note that $q^\text{des}_i$ gives the final value of
the target minimum charge level for vehicle $i$, and the parameter $\gamma_i$ sets the ‘urgency’ of charging,
with smaller values indicating more urgency, i.e., a target minimum charge value that rises more
quickly.
(With the charges all given in kWh, and the time period 5 minutes, these values are all realistic.
The total charging period is 7.5 hours, and the maximum charging of 3kWh/period corresponds to
a real power of 36kW. And no, you do not need to know or understand this to solve the problem.)
Give the optimal RMS shortfall, i.e., the square root of the optimal objective value. Plot the target
minimum charge values and optimal state of charge for each vehicle, with dashed lines showing the
target and solid lines showing the optimal charge. Plot the optimal charging energies ct,i over time
in a stack plot.
Constant charging. Compare the optimal charging above to a very simple charging policy: charge
each vehicle at a constant energy per period, proportional to $q^\text{des}_i - q_{1,i}$, i.e.,
$$c_{t,i} = \theta_i C^\text{max}, \quad i = 1, \ldots, N, \quad t = 1, \ldots, T,$$
with
$$\theta_i = \frac{q^\text{des}_i - q_{1,i}}{\sum_{j=1}^N (q^\text{des}_j - q_{1,j})}, \quad i = 1, \ldots, N.$$
Give the associated RMS shortfall, and the same plots as above.
Plotting hints. In Python, a basic stack plot is obtained with

import matplotlib.pyplot as plt
plt.stackplot(rr, y.T)

where rr is a range object (like range(a, b)) with len(list(rr)) == n and y is an n×N NumPy
array.
In Julia, a basic stack plot is obtained with
using Plots
areaplot(rr, y)
20.15 The duck curve and storage. The plot below shows the total daily electricity load (or demand) and
the total solar generation for California, averaged over March 2021, in GW (gigawatts). The plot
below that shows the load and also the net load, the load minus the solar generation, which must
be supplied by other generation sources. If you squint your eyes, the load and net load curves look
like a duck, hence the name ‘duck curve’.
[Figure: two plots of power (GW) versus time of day, 00:00 through 23:00. The first shows the load and the solar generation; the second shows the load and the net load.]
The steep rise in net load around 5–7pm can be a challenge for generators, which work better and
more efficiently when generating constant or smoothly varying power. Many generators have a
ramp rate limit, i.e., there is a limit on how fast they can change their power over time. In this
exercise we consider using storage (batteries) to reduce the cost of supplying the net load and to
reduce the maximum ramp rate.
We work with quantities in 15 minute (0.25 hour) intervals over one day, which is 96 periods, with
t = 1 corresponding to the interval 0:00–0:15 and t = 96 corresponding to the interval 23:45–24:00.
The storage device has energy qt , t = 1, . . . , 96, with 0 ≤ qt ≤ Q, where Q = 15 GWh (gigawatt-
hour) is the capacity of the storage. In period t we charge the battery with power ct (in GW),
with negative values of ct denoting discharge. The maximum charge and discharge power is C,
i.e., |ct | ≤ C, t = 1, . . . , 96, with C = 3.75 GW. This means that the battery can be charged from
empty to full, or discharged from full to empty, in 4 hours. The battery energy satisfies
$$q_{t+1} = q_t + 0.25 c_t, \quad t = 1, \ldots, 95, \qquad q_1 = q_{96} + 0.25 c_{96}.$$
(The 0.25 factor gives the energy in GWh for a constant power over a 15 minute period, and the
last constraint requires that the storage device starts and ends with the same energy.)
We denote the load as lt , the solar generation as st , and the net load as nt . These are related as
nt = lt − st , t = 1, . . . , 96. The (non-solar) generation is denoted gt , t = 1, . . . , 96. It must equal
the net load plus the battery charging power, i.e.,
gt = nt + ct , t = 1, . . . , 96.
The variables are qt , ct , and gt , t = 1, . . . , 96. The data are available in duck_curve_data.py.
The objective is the cost of generation, which is quadratic,
$$J^\text{cost} = \sum_{t=1}^{96} \left( a g_t + b g_t^2 \right),$$
with $a = 0.1$ in units of $10^6$ \$/GW and $b = 0.005$ in units of $10^6$ \$/GW$^2$. We also limit the maximum ramp rate, defined
as
$$J^\text{ramp} = 4 \max \{ |g_2 - g_1|, \ldots, |g_{96} - g_{95}|, |g_1 - g_{96}| \},$$
in GW/h. (The factor 4 converts the change in power per 15 minute period to a change per hour.)
Find the charging, storage, and generation profiles that minimize cost, subject to reducing the
maximum ramp rate by a factor of at least two compared to the maximum ramp rate with no
storage. Report the cost and maximum ramp rate with and without storage, to two significant
digits.
Plot the optimal generation and net load versus time on the same plot. Plot the optimal charging
versus time, with two dashed horizontal lines showing the charge limits ±C. Plot the storage energy
versus time, with the storage capacity Q shown as a dashed horizontal line.
Remarks. (Not needed to solve the problem.)
• The storage capacity corresponds to about 10% of California residences having a 15 kWh
(kilowatt-hour) battery such as a Tesla Powerwall or equivalent.
• The formula for cost of generation is roughly accurate.
• By the time 15 GWh of storage could be brought online, there will be even more solar power,
further deepening the duck curve.
20.16 A statistical model of seasonal shading. Let pt ∈ R++ denote the average power obtained from
a photovoltaic (PV) system on day t, t = 1, . . . , T . This power depends on the day of the year,
which affects the sun’s trajectory over the system, as well as whether or not there is shading from
clouds on that day. We let $p^\text{cs}_t \in \mathbf{R}_{++}$ denote the average power that would be obtained from the
system if there were no shading from clouds. (This can be estimated separately, and is called the
clear sky power.) We define the shade fraction as $s_t = p_t / p^\text{cs}_t$, which is the fraction of the clear sky
power that was obtained on day $t$. These numbers lie between zero (full shade) and one (clear sky).
These shade fraction numbers are our given data.
We model $s_t$ as independent random variables with CDF $F(x) = x^{\gamma_t}$ for $x \in [0, 1]$, where $\gamma_t > 0$.
(This is a special case of a beta distribution.) If $\gamma_t = 1$, then the shade fraction on day $t$ is uniform on
$[0, 1]$; if $\gamma_t = 3$, then the shade fraction on day $t$ is more concentrated on high values than low ones.
We will require that γt be 365-periodic, i.e., it repeats every year, which means γt+365 = γt for
t = 1, . . . , T − 365. This means our model is specified by the 365 numbers γ1 , . . . , γ365 . In addition
we assume that $\gamma_t$ varies smoothly over the year, which means that the Dirichlet energy (sum of
differences squared),
$$D(\gamma) = \sum_{t=1}^{365} (\gamma_{t+1} - \gamma_t)^2, \qquad \gamma_{366} = \gamma_1,$$
is small.
We obtain our estimate of γt by minimizing ℓ(γ) + ωD(γ), where ℓ is the negative log-likelihood,
and ω is a positive hyperparameter. Note that here we use the total negative log-likelihood across
the T data points, and not the average negative log-likelihood, which would divide this by T.
(a) Explain how to formulate this as a convex problem. Justify any change of variables.
(b) Carry out the method of part (a) on the data found in seasonal_shading_data.py, using
hyperparameter value ω = 300. This file contains the shade fraction for a residential PV
system in California from July 1, 2015 to June 30, 2018 (with the leap year day February 29th
2016 removed so each year has 365 days). Plot $\gamma_t$ versus $t$ along with $s_t$. Plot the CDF of the
estimated distribution for t = 138 and t = 290 (corresponding to November 15 and April 15).
Briefly discuss the results.
20.17 Constant current constant voltage charging of a Li-Ion battery. We consider the charging of a Li-Ion
battery over the time t = 1, . . . , T + 1 (in minutes). The charge in the battery at the beginning of
time period t is denoted qt , t = 1, . . . , T + 1. We have initial and terminal conditions q1 = Qmin
and qT +1 = Qmax , where Qmin = 960 and Qmax = 6300 are the charge levels corresponding to fully
discharged and fully charged, given in C (Coulombs). We let it , t = 1, . . . , T , denote the charging
current at time t, given in A (Amperes). We require 0 ≤ it ≤ I max , where I max = 1.5 is the
maximum allowed charging current. The charge and current are related as
$$q_{k+1} = q_k + h i_k, \quad k = 1, \ldots, T,$$
where $h = 60$ is the length of one time period, in seconds.
The battery has an open-circuit voltage that depends on the charge in it:
$$v^\text{oc}_t = a + \frac{b}{Q^\text{crit} - q_t}, \quad t = 1, \ldots, T + 1,$$
where $a = 3.40$, $b = 500$, and $Q^\text{crit} = 6925$ are parameters. The open circuit voltage has physical
units of V (Volts), $a$ is in V, $b$ is in J (Joules), and $Q^\text{crit}$ is in C.
The voltage of the battery (not to be confused with its open-circuit voltage) at time $t$ is denoted
$v_t$, $t = 1, \ldots, T$. It is given by $v_t = v^\text{oc}_t + R i_t$, where $R = 0.4$ Ω (Ohm) is the internal resistance. (In
EE dialect, this model of a battery is a nonlinear capacitor in series with a resistance.) The battery
voltage cannot exceed a given maximum value, i.e., $v_t \leq V^\text{max}$, $t = 1, \ldots, T$, with $V^\text{max} = 4.22$ V.
The energy loss in charging the battery is given by
$$L = hR \sum_{t=1}^{T} i_t^2,$$
where $h = 60$ seconds converts power in W (Watts) for one minute into J. The total energy
transferred into the battery in this exercise is $E = 20975$ J. The charging efficiency is then $(E - L)/E$.
(a) Formulate the problem of charging the battery with maximum efficiency as a convex optimiza-
tion problem. The variables are q1 , . . . , qT +1 and i1 , . . . , iT .
(b) Carry out the method of part (a) for three different charging times: T = 120 (fast), T = 180
(normal), and T = 240 (slow). Give the charging efficiency for each of these charging times.
Values should be rounded to two decimal places.
For each charging time, plot the current, charge, and voltage versus time t. (All together 9
plots.) Please use the plotting code in cccv_charging_plot.py. You need to provide $i^\star$, $v^\star$, $q^\star$ for
each charging time.
(c) The industry standard method for charging a Li-Ion battery is called constant current constant
voltage (CCCV) charging. It consists of charging the battery with a constant current until
some specified level of charge is achieved, and then switching to constant voltage charging
until the battery is fully charged.
Give some brief remarks comparing the optimal charging you found above and the industry
standard method.
Remarks.
• Don’t worry about the physical units; we give them because the problem instance is realistic.
• The parameters given above are reasonable values for a so-called 18650 Li-Ion battery, which
is a little bit larger than what you might have in your mobile phone.
• More sophisticated charging methods (e.g., for electric vehicles) take into account battery
temperature, and the problem of charging many different battery cells.
• Please neglect potential solver warnings about numerical accuracy.
21 Miscellaneous applications
21.1 Earth mover’s distance. In this exercise we explore a general method for constructing a distance
between two probability distributions on a finite set, called the earth mover’s distance, Wasserstein
metric, Dobrushin metric, or optimal transport metric. Let x and y be two probability distributions
on {1, . . . , n}, i.e., 1T x = 1T y = 1, x ⪰ 0, y ⪰ 0. We imagine that xi is the amount of earth stored
at location i; our goal is to move the earth between locations to obtain the distribution given by
y. Let Cij be the cost of moving one unit of earth from location j to location i. We assume that
$C_{ii} = 0$, and $C_{ij} = C_{ji} > 0$ for $i \neq j$. (We allow $C_{ij} = \infty$, which means that earth cannot be moved
directly from node $j$ to node $i$.) Let $S_{ij} \geq 0$ denote the amount of earth moved from location $j$
to location $i$. The total cost is $\sum_{i,j=1}^n S_{ij} C_{ij} = \mathop{\bf tr} C^T S$. The shipment matrix $S$ must satisfy the
balance equations,
$$\sum_{j=1}^n S_{ij} = y_i, \quad i = 1, \ldots, n, \qquad \sum_{i=1}^n S_{ij} = x_j, \quad j = 1, \ldots, n,$$
which we can write compactly as S1 = y, S T 1 = x. (The first equation states that the total
amount shipped into location i equals yi ; the second equation states that the total shipped out
from location j is xj .) The earth mover’s distance between x and y, denoted d(x, y), is given by the
minimal cost of earth moving required to transform $x$ to $y$, i.e., the optimal value of the problem
$$\begin{array}{ll} \mbox{minimize} & \mathop{\bf tr} C^T S \\ \mbox{subject to} & S_{ij} \geq 0, \quad i, j = 1, \ldots, n \\ & S\mathbf{1} = y, \quad S^T \mathbf{1} = x, \end{array}$$
with variable $S \in \mathbf{R}^{n \times n}$. By linear programming duality, $d(x, y)$ is also the optimal value of the problem
$$\begin{array}{ll} \mbox{maximize} & \nu^T x + \mu^T y \\ \mbox{subject to} & \nu_i + \mu_j \leq C_{ij}, \quad i, j = 1, \ldots, n, \end{array}$$
with variables ν, µ ∈ Rn .
21.2 Radiation treatment planning. In radiation treatment, radiation is delivered to a patient, with the
goal of killing or damaging the cells in a tumor, while carrying out minimal damage to other tissue.
The radiation is delivered in beams, each of which has a known pattern; the level of each beam can
be adjusted. (In most cases multiple beams are delivered at the same time, in one ‘shot’, with the
treatment organized as a sequence of ‘shots’.) We let bj denote the level of beam j, for j = 1, . . . , n.
These must satisfy 0 ≤ bj ≤ B max , where B max is the maximum possible beam level. The exposure
area is divided into $m$ voxels, labeled $i = 1, \ldots, m$. The dose $d_i$ delivered to voxel $i$ is linear in
the beam levels, i.e., $d_i = \sum_{j=1}^n A_{ij} b_j$. Here $A \in \mathbf{R}^{m \times n}_+$ is a (known) matrix that characterizes the
beam patterns. We now describe a simple radiation treatment planning problem.
A (known) subset of the voxels, T ⊂ {1, . . . , m}, corresponds to the tumor or target region. We
require that a minimum radiation dose Dtarget be administered to each tumor voxel, i.e., di ≥ Dtarget
for i ∈ T . For all other voxels, we would like to have di ≤ Dother , where Dother is a desired maximum
dose for non-target voxels. This is generally not feasible, so instead we settle for minimizing the
penalty
$$E = \sum_{i \notin T} ((d_i - D^\text{other})_+)^2,$$
where $(\cdot)_+$ denotes the nonnegative part. We can interpret $E$ as the sum of the squares of the
nontarget excess doses.
(a) Show that the treatment planning problem is convex. The optimization variable is b ∈ Rn ;
the problem data are B max , A, T , Dtarget , and Dother .
(b) Solve the problem instance with data given in the file treatment_planning_data.m. Here we
have split the matrix A into Atarget, which contains the rows corresponding to the target
voxels, and Aother, which contains the rows corresponding to other voxels. Give the optimal
value. Plot the dose histogram for the target voxels, and also for the other voxels. Make a
brief comment on what you see. Remark. The beam pattern matrix in this problem instance
is randomly generated, but similar results would be obtained with realistic data.
21.3 Flux balance analysis in systems biology. Flux balance analysis is based on a very simple model of
the reactions going on in a cell, keeping track only of the gross rate of consumption and production
of various chemical species within the cell. Based on the known stoichiometry of the reactions, and
known upper bounds on some of the reaction rates, we can compute bounds on the other reaction
rates, or cell growth, for example.
We focus on m metabolites in a cell, labeled M1 , . . . , Mm . There are n reactions going on, labeled
R1 , . . . , Rn , with nonnegative reaction rates v1 , . . . , vn . Each reaction has a (known) stoichiometry,
which tells us the rate of consumption and production of the metabolites per unit of reaction rate.
The stoichiometry data is given by the stoichiometry matrix S ∈ Rm×n , defined as follows: Sij
is the rate of production of Mi due to unit reaction rate vj = 1. Here we consider consumption
of a metabolite as negative production; so Sij = −2, for example, means that reaction Rj causes
metabolite Mi to be consumed at a rate 2vj .
As an example, suppose reaction R1 has the form M1 → M2 + 2M3 . The consumption rate of M1 ,
due to this reaction, is v1 ; the production rate of M2 is v1 ; and the production rate of M3 is 2v1 .
(The reaction R1 has no effect on metabolites M4 , . . . , Mm .) This corresponds to a first column of
S of the form (−1, 1, 2, 0, . . . , 0).
Reactions are also used to model flow of metabolites into and out of the cell. For example, suppose
that reaction R2 corresponds to the flow of metabolite M1 into the cell, with v2 giving the flow
rate. This corresponds to a second column of S of the form (1, 0, . . . , 0).
The last reaction, Rn , corresponds to biomass creation, or cell growth, so the reaction rate vn is
the cell growth rate. The last column of S gives the amounts of metabolites used or created per
unit of cell growth rate.
Since our reactions include metabolites entering or leaving the cell, as well as those converted
to biomass within the cell, we have conservation of the metabolites, which can be expressed as
Sv = 0. In addition, we are given upper limits on some of the reaction rates, which we express as
v ⪯ v max , where we set vjmax = ∞ if no upper limit on reaction rate j is known. The goal is to
find the maximum possible cell growth rate (i.e., largest possible value of vn ) consistent with the
constraints
Sv = 0, v ⪰ 0, v ⪯ v max .
(a) Find the maximum possible cell growth rate G⋆ , as well as optimal Lagrange multipliers for
the reaction rate limits. How sensitive is the maximum growth rate to the various reaction
rate limits?
(b) Essential genes and synthetic lethals. For simplicity, we’ll assume that each reaction is con-
trolled by an associated gene, i.e., gene Gi controls reaction Ri . Knocking out a set of genes
associated with some reactions has the effect of setting the reaction rates (or equivalently, the
associated v max entries) to zero, which of course reduces the maximum possible growth rate.
If the maximum growth rate becomes small enough or zero, it is reasonable to guess that
knocking out the set of genes will kill the cell. An essential gene is one that when knocked
out reduces the maximum growth rate below a given threshold Gmin . (Note that Gn is always
an essential gene.) A synthetic lethal is a pair of non-essential genes that when knocked out
reduces the maximum growth rate below the threshold. Find all essential genes and synthetic
lethals for the given problem instance, using the threshold Gmin = 0.2G⋆ .
21.4 Online advertising displays. When a user goes to a website, one of a set of $n$ ads, labeled $1, \ldots, n$, is
displayed. This is called an impression. We divide some time interval (say, one day) into $T$ periods,
labeled $t = 1, \ldots, T$. Let $N_{it} \geq 0$ denote the number of impressions in period $t$ for which we display
ad $i$. In period $t$ there will be a total of $I_t > 0$ impressions, so we must have $\sum_{i=1}^n N_{it} = I_t$, for
$t = 1, \ldots, T$. (The numbers $I_t$ might be known from past history.) You can treat all these numbers
as real. (This is justified since they are typically very large.)
The revenue for displaying ad $i$ in period $t$ is $R_{it} \geq 0$ per impression. (This might come from click-through
payments, for example.) The total revenue is $\sum_{t=1}^T \sum_{i=1}^n R_{it} N_{it}$. To maximize revenue, we
would simply display the ad with the highest revenue per impression, and no other, in each display
period.
We also have in place a set of m contracts that require us to display certain numbers of ads, or mixes
of ads (say, associated with the products of one company), over certain periods, with a penalty for
any shortfalls. Contract j is characterized by a set of ads Aj ⊆ {1, . . . , n} (while it does not affect
the math, these are often disjoint), a set of periods Tj ⊆ {1, . . . , T }, a target number of impressions
$q_j \geq 0$, and a shortfall penalty rate $p_j > 0$. The shortfall $s_j$ for contract $j$ is
$$s_j = \left( q_j - \sum_{t \in T_j} \sum_{i \in A_j} N_{it} \right)_+,$$
where $(u)_+$ means $\max\{u, 0\}$. (This is the number of impressions by which we fall short of the
target value $q_j$.) Our contracts require a total penalty payment equal to $\sum_{j=1}^m p_j s_j$. Our net profit
is the total revenue minus the total penalty payment.
(a) Explain how to find the display numbers Nit that maximize net profit. The data in this
problem are R ∈ Rn×T , I ∈ RT (here I is the vector of impressions, not the identity matrix),
and the contract data Aj , Tj , qj , and pj , j = 1, . . . , m.
(b) Carry out your method on the problem with data given in ad_disp_data.py. The data $A_j$
and $T_j$, for $j = 1, \ldots, m$, are given by matrices $A^\text{contr} \in \mathbf{R}^{n \times m}$ and $T^\text{contr} \in \mathbf{R}^{T \times m}$, with
$$A^\text{contr}_{ij} = \left\{ \begin{array}{ll} 1 & i \in A_j \\ 0 & \mbox{otherwise,} \end{array} \right. \qquad T^\text{contr}_{tj} = \left\{ \begin{array}{ll} 1 & t \in T_j \\ 0 & \mbox{otherwise.} \end{array} \right.$$
Report the optimal net profit, and the associated revenue and total penalty payment. Give
the same three numbers for the strategy of simply displaying in each period only the ad with
the largest revenue per impression.
21.5 Ranking by aggregating preferences. We have n objects, labeled 1, . . . , n. Our goal is to assign a
real valued rank ri to the objects. A preference is an ordered pair (i, j), meaning that object i is
preferred over object j. The ranking r ∈ Rn and preference (i, j) are consistent if ri ≥ rj + 1.
(This sets the scale of the ranking: a gap of one in ranking is the threshold for preferring one item
over another.) We define the preference violation of preference $(i, j)$ with ranking $r \in \mathbf{R}^n$ as
$$v = r_j + 1 - r_i.$$
We have a set of m preferences among the objects, (i(1) , j (1) ), . . . , (i(m) , j (m) ). (These may come
from several different evaluators of the objects, but this won’t matter here.)
We will select our ranking $r$ as a minimizer of the total preference violation penalty, defined as
$$J = \sum_{k=1}^m \phi(v^{(k)}),$$
where v (k) is the preference violation of (i(k) , j (k) ) with r, and ϕ is a nondecreasing convex penalty
function that satisfies ϕ(u) = 0 for u ≤ 0.
(a) Make a (simple, please) suggestion for ϕ for each of the following two situations:
(i) We don’t mind some small violations, but we really want to avoid large violations.
(ii) We want as many preferences as possible to be consistent with the ranking, but will accept
some (hopefully, few) larger preference violations.
(b) Find the rankings obtained using the penalty functions proposed in part (a), on the data
set found in rank_aggr_data.m. Plot a histogram of preference violations for each case and
briefly comment on the differences between them. Give the number of positive preference
violations for each case. (Use sum(v>0.001) to determine this number.)
Remark. The objects could be candidates for a position, papers at a conference, movies, websites,
courses at a university, and so on. The preferences could arise in several ways. Each of a set of
evaluators provides some preferences, for example by rank ordering a subset of the objects. The
problem can be thought of as aggregating the preferences given by the evaluators, to come up with
a composite ranking.
21.6 Time release formulation. A patient is treated with a drug (say, in pill form) at different times.
Each treatment (or pill) contains (possibly) different amounts of various formulations of the drug.
Each of the formulations, in turn, has a characteristic pattern as to how quickly it releases the
drug into the bloodstream. The goal is to optimize the blend of formulations that go into each
treatment, in order to achieve a desired drug concentration in the bloodstream over time.
We will use discrete time, t = 1, 2, . . . , T , representing hours (say). There will be K treatments,
administered at known times 1 = τ1 < τ2 < · · · < τK < T . We have m drug formulations; each
treatment consists of a mixture of these $m$ formulations. We let $a^{(k)} \in \mathbf{R}^m_+$ denote the amounts of
the $m$ formulations in treatment $k$, for $k = 1, \ldots, K$.
Each formulation $i$ has a time profile $p_i(t) \in \mathbf{R}_+$, for $t = 1, 2, \ldots$. If an amount $a^{(k)}_i$ of formulation
$i$ from treatment $k$ is administered at time $t_0$, the drug concentration in the bloodstream (due to
this formulation) is given by $a^{(k)}_i p_i(t - t_0)$ for $t > t_0$, and 0 for $t \leq t_0$. To simplify notation, we will
define $p_i(t)$ to be zero for $t = 0, -1, -2, \ldots$. We assume the effects of the different formulations and
different treatments are additive, so the total bloodstream drug concentration is given by
$$c(t) = \sum_{k=1}^K \sum_{i=1}^m p_i(t - \tau_k) a^{(k)}_i, \quad t = 1, \ldots, T.$$
(This is just a vector convolution.) Recall that pi (t − τk ) = 0 for t ≤ τk , which means that the
effect of treatment k does not show up until time τk + 1.
We require that $c(t) \leq c^\text{max}$ for $t = 1, \ldots, T$, where $c^\text{max}$ is a given maximum permissible concentration.
We define the therapeutic time $T^\text{ther}$ as
$$T^\text{ther} = \min \{ t \mid c(\tau) \geq c^\text{min} \mbox{ for } \tau = t, \ldots, T \},$$
with T ther = ∞ if c(t) < cmin for t = 1, . . . , T . Here, cmin is the minimum concentration for the drug
to have therapeutic value. Thus, T ther is the first time at which the drug concentration reaches,
and stays above, the minimum therapeutic level.
Finally, we get to the problem. The optimization variables are the treatment formulation vectors
a(1) , . . . , a(K) . There are two objectives: T ther (which we want to be small), and
$$J^\text{ch} = \sum_{k=1}^{K-1} \| a^{(k+1)} - a^{(k)} \|_\infty$$
(which we also want to be small). This second objective is a penalty for changing the formulation
amounts in the treatments.
The rest of the problem concerns the specific instance with data given in the file time_release_form_data.m.
This gives data for T = 168 (one week, starting from 8AM Monday morning), with treatments occurring
3 times each day, at 8AM, 2PM, and 11PM, so we have a total of K = 21 treatments. We
have m = 6 formulations, with profiles of length 96 (i.e., $p_i(t) = 0$ for $t > 96$).
• Explain how to find the optimal trade-off curve of T ther versus J ch . Your method may involve
solving several convex optimization problems.
• Plot the trade-off curve over a reasonable range, and be sure to explain or at least comment
on the endpoints of the trade-off curve.
• Plot the treatment formulation amounts versus k, and the bloodstream concentration versus
t, for the two trade-off curve endpoints, and one corresponding to T ther = 8.
Warning. We’ve found that CVX can experience numerical problems when solving this problem
(depending on how it is formulated). In one case, cvx_status is “Solved/Inaccurate” when in fact
the problem has been solved (just not to the tolerances SeDuMi likes to see). You can ignore this
status, taking it to mean Optimal. You can also try switching to the SDPT3 solver. In any case,
please do not spend much time worrying about, or dealing with, these numerical problems.
21.7 Sizing a gravity feed water supply network. A water supply network connects water supplies (such
as reservoirs) to consumers via a network of pipes. Water flow in the network is due to gravity
(as opposed to pumps, which could also be added to the formulation). The network is composed
of a set of n nodes and m directed edges between pairs of nodes. The first k nodes are supply or
reservoir nodes, and the remaining n − k are consumer nodes. The edges correspond to the pipes
in the water supply network.
We let $f_j \geq 0$ denote the water flow in pipe (edge) $j$, and $h_i$ denote the (known) altitude or height
of node $i$ (say, above sea level). At nodes $i = 1, \ldots, k$, we let $s_i \geq 0$ denote the flow into the network
from the supply. For $i = 1, \ldots, n - k$, we let $c_i \geq 0$ denote the water flow taken out of the network
(by consumers) at node $k + i$. Conservation of flow can be expressed as
$$Af = \left[ \begin{array}{c} -s \\ c \end{array} \right],$$
where $A \in \mathbf{R}^{n \times m}$ is the incidence matrix for the supply network, given by
$$A_{ij} = \left\{ \begin{array}{rl} -1 & \mbox{if edge } j \mbox{ leaves node } i \\ +1 & \mbox{if edge } j \mbox{ enters node } i \\ 0 & \mbox{otherwise.} \end{array} \right.$$
We assume that each edge is oriented from a node of higher altitude to a node of lower altitude; if
edge $j$ goes from node $i$ to node $l$, we have $h_i > h_l$. The pipe flows are determined by
$$f_j = \frac{\alpha \theta_j R_j^2 (h_i - h_l)}{L_j},$$
where edge j goes from node i to node l, α > 0 is a known constant, Lj > 0 is the (known) length
of pipe j, Rj > 0 is the radius of pipe j, and θj ∈ [0, 1] corresponds to the valve opening in pipe j.
Finally, we have a few more constraints. The supply feed rates are limited: we have $s_i \leq S^\text{max}_i$.
The pipe radii are limited: we have $R^\text{min}_j \leq R_j \leq R^\text{max}_j$. (These limits are all known.)
(a) Supportable consumption vectors. Suppose that the pipe radii are fixed and known. We say
that $c \in \mathbf{R}^{n-k}_+$ is supportable if there is a choice of $f$, $s$, and $\theta$ for which all constraints
and conditions above are satisfied. Show that the set of supportable consumption vectors is
a polyhedron, and explain how to determine whether or not a given consumption vector is
supportable.
(b) Optimal pipe sizing. You must select the pipe radii $R_j$ to minimize the cost, which we take to
be (proportional to) the total volume of the pipes, $L_1 R_1^2 + \cdots + L_m R_m^2$, subject to being able to
support a set of consumption vectors, denoted $c^{(1)}, \ldots, c^{(N)}$, which we refer to as consumption
scenarios. (This means that any consumption vector in the convex hull of $\{c^{(1)}, \ldots, c^{(N)}\}$ will
be supportable.) Show how to formulate this as a convex optimization problem. Note. You
are asked to choose one set of pipe radii, and N sets of valve parameters, flow vectors, and
source vectors; one for each consumption scenario.
(c) Solve the instance of the optimal pipe sizing problem with data defined in the file grav_feed_network_data.m,
and report the optimal value and the optimal pipe radii. The columns of the matrix C in the
data file are the consumption vectors c(1) , . . . , c(N ) .
Hint. −A^T h gives a vector containing the height differences across the edges.
21.8 Optimal political positioning. A political constituency is a group of voters with similar views on a
set of political issues. The electorate (i.e., the set of voters in some election) is partitioned (by a
political analyst) into K constituencies, with (nonnegative) populations P1 , . . . , PK . A candidate in
the election has an initial or prior position on each of n issues, but is willing to consider (presumably
small) deviations from her prior positions in order to maximize the total number of votes she will
receive. We let xi ∈ R denote the change in her position on issue i, measured on some appropriate
scale. (You can think of xi < 0 as a move to the ‘left’ and xi > 0 as a move to the ‘right’ on the
issue, if you like.) The vector x ∈ Rn characterizes the changes in her position on all issues; x = 0
represents the prior positions. On each issue she has a limit on how far in each direction she is
willing to move, which we express as l ⪯ x ⪯ u, where l ≺ 0 and u ≻ 0 are given.
The candidate’s position change x affects the fraction of voters in each constituency that will vote
for her. This fraction is modeled as a logistic function,
fk = g(wk^T x + vk),   k = 1, . . . , K.
Here g(z) = 1/(1 + exp(−z)) is the standard logistic function, and wk ∈ Rn and vk ∈ R are given
data that characterize the views of constituency k on the issues. Thus the total number of votes
the candidate will receive is
V = P1 f1 + · · · + PK fK .
The problem is to choose x (subject to the given limits) so as to maximize V . The problem data
are l, u, and Pk , wk , and vk for k = 1, . . . , K.
(a) The general political positioning problem. Show that the objective function V need not be
quasiconcave. (This means that the general optimal political positioning problem is not a
quasiconvex problem, and therefore also not a convex problem.) In other words, choose
problem data for which V is not a quasiconcave function of x.
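One hypothetical counterexample, checked numerically (n = 1 issue, K = 2 equally sized constituencies with opposing views; all numbers are made up):

```python
import math

# Hypothetical instance: w1 = 1, w2 = -1, v1 = v2 = -2, P1 = P2 = 1.
def g(z):
    return 1.0 / (1.0 + math.exp(-z))        # standard logistic function

def V(x):
    return g(x - 2.0) + g(-x - 2.0)

# V exceeds 0.5 at x = 4 and at x = -4, but not at their midpoint x = 0,
# so the superlevel set {x : V(x) >= 0.5} is not convex, and V is not
# quasiconcave.
print(V(4.0), V(-4.0), V(0.0))
```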
(b) The partisan political positioning problem. Now suppose the candidate focuses only on her
core constituencies, i.e., those for which a significant fraction will vote for her. In this case
we interpret the K constituencies as her core constituencies; we assume that vk ≥ 0, which
means that with her prior position x = 0, at least half of each of her core constituencies will
vote for her. We add the constraint that wk^T x + vk ≥ 0 for each k, which means that she will
not take positions that alienate a majority of voters from any of her core constituencies.
Show that the partisan political positioning problem (i.e., maximizing V with the additional
assumptions and constraints) is convex.
(c) Numerical example. Find the optimal positions for the partisan political positioning problem
with data given in opt_pol_pos_data.m. Report the number of votes from each constituency
under the politician’s prior positions (x = 0) and optimal positions, as well as the total number
of votes V in each case.
You may use the function
21.9 Resource allocation in stream processing. A large data center is used to handle a stream of J types
of jobs. The traffic (number of instances per second) of each job type is denoted t ∈ R+^J. Each
instance of each job type (serially) invokes or calls a set of processes. There are P types of processes,
and we describe the job-process relation by the P × J matrix R, with Rpj = 1 if job j invokes
process p, and Rpj = 0 otherwise.
The process loads (number of instances per second) are given by λ = Rt ∈ R^P, i.e., λp is the sum
of the traffic from the jobs that invoke process p.
The latency of a process or job type is the average time that it takes one instance to complete.
These are denoted l^proc ∈ R^P and l^job ∈ R^J, respectively, and are related by l^job = R^T l^proc, i.e.,
lj^job is the sum of the latencies of the processes called by job j. Job latency is important to users, since
lj^job is the average time the data center takes to handle an instance of job type j. We are given a
maximum allowed job latency: l^job ⪯ l^max.
The process latencies depend on the process load and also how much of n different resources are
made available to them. These resources might include, for example, number of cores, disk storage,
and network bandwidth. Here, we represent amounts of these resources as (nonnegative) real
numbers, so xp ∈ R+^n represents the resources allocated to process p. The process latencies are
given by
lp^proc = ψp (xp , λp ),   p = 1, . . . , P,
where ψp : Rn × R → R ∪ {∞} is a known (extended-valued) convex function. These functions are
nonincreasing in their first (vector) arguments, and nondecreasing in their second arguments (i.e.,
more resources or less load cannot increase latency). We interpret ψp (xp , λp ) = ∞ to mean that
the resources given by xp are not sufficient to handle the load λp .
We wish to allocate a total resource amount x^tot ∈ R++^n among the P processes, so we have
x1 + · · · + xP ⪯ x^tot. The goal is to minimize the objective function

    sum_{j=1}^J wj (tj^tar − tj)+,

where tj^tar is the target traffic level for job type j, wj > 0 give the priorities, and (u)+ = max{u, 0}
denotes the nonnegative part. (Thus the objective is a weighted penalty for missing the target job
traffic.) The variables are t ∈ R+^J and xp ∈ R+^n, p = 1, . . . , P. The problem
data are the matrix R, the vectors l^max, x^tot, t^tar, and w, and the functions ψp, p = 1, . . . , P.
21.10 Optimal parimutuel betting. In parimutuel betting, participants bet nonnegative amounts on each
of n outcomes, exactly one of which will actually occur. (For example, the outcome can be which
of n horses wins a race.) The total amount bet by all participants on all outcomes is called the pool
or tote. The house takes a commission from the pool (typically around 20%), and the remaining
pool is divided among those who bet on the outcome that occurs, in proportion to their bets on
the outcome. This problem concerns the choice of the amount to bet on each outcome.
Let xi ≥ 0 denote the amount we bet on outcome i, so the total amount we bet on all outcomes is
1^T x. Let ai > 0 denote the amount bet by all other participants on outcome i, so after the house
commission, the remaining pool is P = (1 − c)(1^T a + 1^T x), where c ∈ (0, 1) is the house commission
rate. Our payoff if outcome i occurs is then

    pi = (xi / (xi + ai)) P.
The goal is to choose x, subject to 1^T x = B (where B is the total amount to be bet, which is
given), so as to maximize the expected utility

    sum_{i=1}^n πi U(pi),

where πi is the probability that outcome i occurs, and U is a concave increasing utility function,
with U(0) = 0. You can assume that ai, πi, c, B, and the function U are known.
(a) Explain how to find an optimal x using convex or quasiconvex optimization. If you use a
change of variables, be sure to explain how your variables are related to x.
(b) Suggest a fast method for computing an optimal x. You can assume that U is strictly concave,
and that scalar optimization problems involving U (such as evaluating the conjugate of −U )
are easily and quickly solved.
Remarks.
• To carry out this betting strategy, you’d need to know ai , and then be the last participant
to place your bets (so that ai don’t subsequently change). You’d also need to know the
probabilities πi . These could be estimated using sophisticated machine learning techniques or
insider information.
• The formulation above assumes that the total amount to bet (i.e., B) is known. If it is not
known, you could solve the problem above for a range of values of B and use the value of B
that yields the largest optimal expected utility.
Here H^nom ∈ S^n is the nominal (unperturbed) Hamiltonian, x ∈ R^k gives the strength or value of
the perturbations, and H1, . . . , Hk ∈ S^n characterize the perturbations. We have limits for each
perturbation, which we express as |xi| ≤ 1, i = 1, . . . , k. The problem is to choose x to maximize
the gap η of the perturbed Hamiltonian, subject to the constraint that the perturbed Hamiltonian
H has the same ground state (up to scaling, of course) as the unperturbed Hamiltonian H^nom. The
problem data are the nominal Hamiltonian matrix H^nom and the perturbation matrices H1, . . . , Hk.
(a) Explain how to formulate this as a convex or quasiconvex optimization problem. If you change
variables, explain the change of variables clearly.
(b) Carry out the method of part (a) for the problem instance with data given in hamiltonian_gap_data.*.
Give the optimal perturbations, and the energy gap for the nominal and perturbed systems.
The data Hi are given as a cell array; H{i} gives Hi .
21.12 Theory-applications split in a course. A professor teaches an advanced course with 20 lectures,
labeled i = 1, . . . , 20. The course involves some interesting theoretical topics, and many practical
applications of the theory. The professor must decide how to split each lecture between theory and
applications. Let Ti and Ai denote the fraction of the ith lecture devoted to theory and applications,
for i = 1, . . . , 20. (We have Ti ≥ 0, Ai ≥ 0, and Ti + Ai = 1.)
A certain amount of theory has to be covered before the applications can be taught. We model
this in a crude way as
A1 + · · · + Ai ≤ ϕ(T1 + · · · + Ti ), i = 1, . . . , 20,
Find (four different) theory-applications splits that maximize the terminal emotional state of the
first group, the terminal emotional state of the second group, the terminal emotional state of the
third group, and, finally, the minimum of the terminal emotional states of all three groups.
For each case, plot Ti and the emotional state si for the three groups, versus i. Report the numerical
values of the terminal emotional states for each group, for each of the four theory-applications splits.
21.13 Optimal material blending. A standard industrial operation is to blend or mix raw materials
(typically fluids such as different grades of crude oil) to create blended materials or products.
This problem addresses optimizing the blending operation. We produce n blended materials from
m raw materials. Each raw and blended material is characterized by a vector that gives the
concentration of each of q constituents (such as different octane hydrocarbons). Let c1, . . . , cm ∈ R+^q
and c̃1, . . . , c̃n ∈ R+^q be the concentration vectors of the raw materials and the blended materials,
respectively. We have 1^T cj = 1^T c̃i = 1 for i = 1, . . . , n and j = 1, . . . , m. The raw material
concentrations are given; the blended product concentrations must lie between some given bounds,
c̃i^min ⪯ c̃i ⪯ c̃i^max.
Each blended material is created by pumping raw materials (continuously) into a vat or container
where they are mixed to produce the blended material (which continuously flows out of the mixing
vat). Let fij ≥ 0 denote the flow of raw material j (say, in kg/s) into the vat for product i, for
i = 1, . . . , n, j = 1, . . . , m. These flows are limited by the total availability of each raw material:
f1j + · · · + fnj ≤ Fj, j = 1, . . . , m, where Fj > 0 is the maximum total flow of raw material j available.
Let f̃i ≥ 0 denote the flow rates of the blended materials. These also have limits: f̃i ≤ F̃i,
i = 1, . . . , n.
The raw and blended material flows are related by the (mass conservation) equations

    sum_{j=1}^m fij cj = f̃i c̃i,   i = 1, . . . , n.

(The lefthand side is the vector of incoming constituent mass flows and the righthand side is the
vector of outgoing constituent mass flows.)
Each raw and blended material has a (positive) price, pj , j = 1, . . . , m (for the raw materials), and
p̃i , i = 1, . . . , n (for the blended materials). We pay for the raw materials, and get paid for the
blended materials. The total profit for the blending process is

    − sum_{i=1}^n sum_{j=1}^m fij pj + sum_{i=1}^n f̃i p̃i.
The goal is to choose the variables fij, f̃i, and c̃i so as to maximize the profit, subject to the
constraints. The problem data are cj, c̃i^min, c̃i^max, Fj, F̃i, pj, and p̃i.
(a) Explain how to solve this problem using convex or quasi-convex optimization. You must justify
any change of variables or problem transformation, and explain how you recover the solution
of the blending problem from the solution of your proposed problem.
(b) Carry out the method of part (a) on the problem instance given in
material_blending_data.*. Report the optimal profit, and the associated values of fij , f˜i ,
and c̃i .
21.14 Ideal preference point. A set of K choices for a decision maker is parametrized by a set of vectors
c(1) , . . . , c(K) ∈ Rn . We will assume that the entries ci of each choice are normalized to lie in the
range [0, 1]. The ideal preference point model posits that there is an ideal choice vector c^ideal with
entries in the range [0, 1]; when the decision maker is asked to choose between two candidate choices
c and c̃, she will choose the one that is closest (in Euclidean norm) to her ideal point. Now suppose
that the decision maker has chosen between all K(K − 1)/2 pairs of given choices c(1) , . . . , c(K) .
The decisions are represented by a list of pairs of integers, where the pair (i, j) means that c(i) is
chosen when given the choices c(i) , c(j) . You are given these vectors and the associated choices.
(a) How would you determine if the decision maker’s choices are consistent with the ideal prefer-
ence point model?
(b) Assuming they are consistent, how would you determine the bounding box of ideal choice
vectors consistent with her decisions? (That is, how would you find the minimum and maximum
values of ci^ideal, for c^ideal consistent with being the ideal preference point.)
(c) Carry out the method of part (b) using the data given in ideal_pref_point_data.*. These
files give the points c(1) , . . . , c(K) and the choices, and include the code for plotting the results.
Report the width and the height of the bounding box and include your plot.
21.15 Matrix equilibration. We say that a matrix is ℓp equilibrated if each of its rows has the same ℓp
norm, and each of its columns has the same ℓp norm. (The row and column ℓp norms are related
by m, n, and p.) Suppose we are given a matrix A ∈ Rm×n . We seek diagonal invertible matrices
D ∈ Rm×m and E ∈ Rn×n for which DAE is ℓp equilibrated.
(a) Explain how to find D and E using convex optimization. (Some matrices cannot be equili-
brated. But you can assume that all entries of A are nonzero, which is enough to guarantee
that it can be equilibrated.)
(b) Equilibrate the matrix A given in the file matrix_equilibration_data.*, with
m = 20, n = 10, p = 2.
Print the row ℓp norms and the column ℓp norms of the equilibrated matrix as vectors to check
that each matches.
Hints.
21.16 Approximations of the PSD cone. A symmetric matrix is positive semidefinite if and only if all
its principal minors are nonnegative. Here we consider approximations of the positive-semidefinite
cone produced by partially relaxing this condition.
Denote by K1,n the cone of matrices whose 1 × 1 principal minors (i.e., diagonal elements) are
nonnegative, so that
K1,n = {X ∈ Sn | Xii ≥ 0 for all i}.
Similarly, denote by K2,n the cone of matrices whose 1 × 1 and 2 × 2 principal minors are nonnegative:

    K2,n = { X ∈ S^n | [ Xii  Xij ; Xij  Xjj ] ⪰ 0 for all i ≠ j },

i.e., the cone of symmetric matrices with positive semidefinite 2 × 2 principal submatrices. These
two cones are convex (and in fact, proper), and satisfy the relation

    K1,n^* ⊆ K2,n^* ⊆ S+^n ⊆ K2,n ⊆ K1,n,

where K1,n^* and K2,n^* are the dual cones of K1,n and K2,n, respectively. (The last two inclusions
are immediate, and the first two inclusions follow from the second bullet on page 53 of the text.)

(a) Give an explicit characterization of K1,n^*.
(b) Give an explicit characterization of K2,n^*.

Hint: You can use the fact that if K = K1 ∩ · · · ∩ Km, then K^* = K1^* + · · · + Km^*.
(c) Consider the problem

        minimize    tr(CX)
        subject to  tr(AX) = b
                    X ∈ K,

    with variable X ∈ S^n. The problem parameters are C ∈ S^n, A ∈ S^n, b ∈ R, and the cone
    K ⊆ S^n. Using the data in psd_cone_approx_data.*, solve this problem five times, each time
    replacing K with one of the five cones K1,n, K2,n, S+^n, K2,n^*, and K1,n^*. Report the five different
    optimal values you obtain.
Note: For parts (a) and (b), the shorter and clearer your description is, the more points you will
receive. At the very least, it should be possible to implement your description in CVX*.
21.17 Equilibrating chemical reactions. A chemical system involving n species eventually reaches
thermodynamic equilibrium. The composition of such a system is described by x ∈ R++^n, where xi
denotes the amount of species i, measured in moles. We will assume for simplicity that all species
stay in the same phase for the entire process (this is usually violated, but is easy enough to deal
with). As time passes, the species in the system react. For example, consider the following chemical
reaction,
N2 + 3 H2 ⇌ 2 NH3.
This means that one mole of N2 and 3 moles of H2 can form 2 moles of NH3 , and vice versa. Such
reactions are termed reversible, because they can proceed in either direction. Hence, we may think
of such a reaction as an equation, N2 + 3 H2 − 2 NH3 = 0. In general, many reactions among the
n species may be possible. We assume for our system, there are m possible reversible reactions,
described by equations
For reaction i, the vector ai = (ai1, . . . , ain) describes the effect of a single reaction on the compo-
sition of the system. In terms of the forward reactions, aij > 0 if species Xj is consumed during
reaction i, aij < 0 if species Xj is produced in reaction i, and aij = 0 if the quantity of Xj is left
unchanged during the ith reaction.
As the system proceeds towards thermodynamic equilibrium, each reaction occurs zi ∈ R times,
i = 1, . . . , m. Thus, zi ai describes the change in composition due to zi instances of reaction i.
The equilibrium composition of the system, xe, minimizes the total free energy of the system. For
simplicity, an assumption that chemists sometimes make is that their system consists of ideal
gases and liquids, in which case the total free energy of the system at composition x is
    G(x) = c^T x + sum_{i=1}^n xi log(xi / 1^T x),
where c ∈ Rn is a given vector determined by system conditions, such as temperature and pressure.
This free energy is often measured in joules (J) or kilojoules (kJ).
Importantly, matter is conserved: the difference between the equilibrium and initial compositions,
xe − x0, must be precisely the change in composition due to the m reactions.
Assuming the reactants in the system behave as ideal gases and liquids, explain how to use convex or
quasiconvex optimization to compute the equilibrium composition of the system, given ai , initial
composition x0 , and free energy parameter c ∈ Rn . You must justify any statement about curvature
or monotonicity that you use. If you make a change of variables, you must justify it as well.
21.18 To randomize or not to randomize. At a start-up, two colleagues Alice and Bob are debating what
ad to run in a new marketing campaign. There are n options to choose from. The start-up has
data on the distribution of revenue xj generated per view for each ad j. The distribution for each
ad is discretized over m possible outcomes for revenue, c1, . . . , cm ∈ R, with c1 < c2 < · · · < cm.
Alice and Bob use a single matrix P ∈ Rm×n to describe all n distributions, with
P (xj = ci ) = Pij .
Alice argues for building a new system to randomly display different ads to visitors, with each ad
j appearing for a fraction θj of the time. Bob disagrees and believes that the old system, which
shows the same ad every time, is sufficient.
Under a randomized policy, the revenue per view xmix follows the discrete distribution p = P θ,
where θ ∈ R+^n, 1^T θ = 1, and P (xmix = ci) = pi. The start-up will lose money on the campaign
if xmix < 0, and will consider the campaign successful if xmix > L. Alice and Bob’s goal is to
maximize the probability of a successful campaign while ensuring the probability of a loss is no
more than β.
Using the data n, m, P, c, beta, and L in rand_policy_data.*, determine whether it is better to
show a single ad or a randomized assortment, and explain why.
21.19 Typesetting TEX. TEX uses a mechanical spring model to determine the spacing before and after
each character, on each line of a document. A line consists of n characters, each with width
wi, i = 1, . . . , n, and n + 1 spaces, before and after each character. The spaces have width si,
i = 1, . . . , n + 1. We will assume that sum_{i=1}^n wi + sum_{i=1}^{n+1} si = W, i.e., the
characters and spaces fill the line. We are to determine the widths si, subject to the line-filling
constraint.
In TEX, spaces are modeled as springs that can be compressed or extended from their natural length
(or in this context, width). This is expressed by an energy associated with the width si , given by
    Ei(si) = (ki^ext / 2)(si − Ni)^2    if si > Ni,
             (ki^comp / 2)(si − Ni)^2   if si ≤ Ni,

where ki^ext, Ni, and ki^comp are given positive parameters. (We have left out a few details.)
We can interpret the parameters as follows. The parameter Ni is the natural space, i.e., the space's
minimum energy width. The parameters ki^ext and ki^comp are the stiffness of the space in extension
and compression, respectively.
The space widths are chosen to minimize the total energy E1(s1) + · · · + En+1(sn+1), subject to
the line-filling constraint.
(a) Explain how to find the space widths s1 , . . . , sn+1 by solving a convex optimization problem.
You do not need to constrain s1 , . . . , sn+1 to be nonnegative.
(b) Write out the KKT conditions for the optimization problem you derived. Use the KKT
conditions to find an analytical solution to the problem.
21.20 Train time-table optimization. We consider a transit system with K trains, denoted k = 1, . . . , K.
Each train k travels over a route, which is a sequence of Sk stops. For simplicity, we will assume
that each train has the same number of stops, S. The train schedule or time-table is given by the
arrival and departure times for each train, at each of the stops on its route. We let Aks ∈ R be the
arrival time of train k at stop s, and Dks ∈ R be the departure time of train k at
stop s, for s = 1, . . . , S. These times are given in minutes from some starting or reference time. The
first station arrival times Ak1 and the last station arrival times AkS are given, for k = 1, . . . , K.
Our goal is to choose the remaining arrival and departure times.
We let dks ∈ R be the distance between stop s + 1 and stop s for train k, for k = 1, . . . , K and
s = 1, . . . , S − 1. There are minimum and maximum speed limits on each travel segment between
stops, v^min and v^max, respectively. In this simple model, you can assume the trains travel at
constant speed between stops.
At each stop, each train must stop for at least τ^min minutes, i.e., Dks − Aks ≥ τ^min for k = 1, . . . , K,
s = 1, . . . , S − 1.
Trains are meant to overlap at various stops called connections. We have C connections, indexed
by c = 1, . . . , C. Each connection consists of a pair of trains and stops; (k, s, k ′ , s′ ) means that the
s stop of train k should connect with the s′ stop of train k ′ . (Presumably this means the trains
stop at the same station, but we’re not keeping track of the stations where the trains stop here.)
The connection time associated with connection c = (k, s, k′, s′) is

    Tc = min{Dks, Dk′s′} − max{Aks, Ak′s′},

which is the time interval during which both trains are at the station. There is a required minimum
connection time, i.e., Tc ≥ T^min.
The objective is to maximize the sum of the logs of the connection times (or equivalently, their
geometric mean), subject to the constraints described above.
(a) Show how to pose this as a convex optimization problem. If you introduce new variables, or
change variables, you must explain how to recover the optimal arrival and departure times
from the solution of your problem.
Hint. The geometric mean may be a more numerically stable objective when optimizing with
CVX*.
(b) Carry out your method on the problem instance described in train_schedule_data.*. Re-
port the optimal objective value, i.e., the sum of the logs of the connection times. Plot the
optimal schedule using the given function scheduleDraw(A, D, C) where A and D are matri-
ces containing the optimal arrival and departure times and C is connections. Also, plot the
histogram of connection times using the given function get_hist(A, D, C).
21.21 Radiation therapy dose scheduling. An oncology patient is given a dose of radiation dt ∈ R+ in
time periods t = 1, . . . , T − 1, with the goal of shrinking a tumor to some specified target size while
minimizing the damage to the patient’s health. We can choose the doses dt , subject to the limit
dt ≤ d^max, where d^max is a given maximum dose. This problem has several names, including dose
scheduling, dose planning, and dose fractionation. (The last name refers to how we break up the
total dose sum_{t=1}^{T−1} dt into the doses delivered in each period.)
We let St ∈ R+ denote the tumor size in period t. The tumor size evolves as
St+1 = α e^(−β dt) St,   t = 1, . . . , T − 1,
where α > 1 is the per-period tumor growth rate with no radiation, and β > 0 is a known constant.
(Since dt ≥ 0 and β > 0, we see that the radiation applied in one period shrinks the tumor in the
next period.) The initial tumor size S1 is given. The goal is to achieve ST ≤ S^tar, where S^tar is a
target final tumor size.
We let Ht ∈ R+ denote some measure of the damage to the patient’s health from the radiation
treatments. It evolves as
Ht+1 = γ e^(δ dt) Ht,   t = 1, . . . , T − 1,
where γ ∈ (0, 1] is the per-period damage recovery rate with no radiation, and δ > 0 is a known
constant. (Since dt ≥ 0 and δ > 0, we see that the radiation applied in one period increases the
damage in the next period.) The initial damage H1 is given.
The goal is to find a series of doses d1 , . . . , dT −1 that satisfies the constraints described above, and
minimizes the maximum damage H^max = max_{t=1,...,T} Ht.
(a) Explain how to solve this problem using convex optimization. If you change variables or form
a relaxation, you must explain and justify it.
(b) Solve the problem with T = 20 and
Report the optimal objective value, i.e., the maximum damage. Plot the dose dt , damage Ht ,
and tumor size St versus t, for an optimal dose plan. Plot the same for the case when no
treatment is given, i.e., dt = 0 for t = 1, . . . , T − 1.
21.22 Optimal policies for and shipments between two blood banks. We consider two blood banks, each
with their own supply of and demand for the four types of blood, O, A, B, and AB, which we label
1, 2, 3, and 4. We let d ∈ R+^4 denote the demand for the 4 blood types at the first bank, and
d̃ ∈ R+^4 denote the demand at the second bank. We let s ∈ R+^4 denote the supply of the 4 blood
types at the first bank, and s̃ ∈ R+^4 denote the supply at the second bank. (These values are given
in units of blood.)
Some blood types can be substituted for others, according to the following list of possible
substitutions.
For example, the demand for type B blood can be satisfied using any combination of type O and
type B blood.
Each bank has a policy which specifies how much of each blood type is used to satisfy the demands
for the different blood types. These are expressed as the matrices B ∈ R+^(4×4) and B̃ ∈ R+^(4×4) for the
first and second banks, respectively. Here Bij denotes the amount of blood type j we use to satisfy
demand for blood type i, at the first bank (and similarly for the second bank). The substitution
list above imposes sparsity constraints on B and B̃. Note that B^T 1 is the vector of total amounts
of blood used, and B1 is the vector of total amount of demand that is satisfied, at the first bank,
and similarly for the second bank.
We can transport blood between the two banks. We let t ∈ R4 denote the amounts of the 4 types
that are sent or transported from the first to the second bank, with ti < 0 meaning that blood of
type i is sent from the second bank to the first. These shipments incur a cost κ∥t∥1 , where κ ∈ R+
is the transport cost per unit. The effect of the shipments is to change the blood supply at the two
banks from s and s̃ to s+ = s − t and s̃+ = s̃ + t, respectively. (The superscript means s+ and s̃+
are the post-shipment supplies at the two banks.) We require that s+ and s̃+ are nonnegative.
Your task is to choose the shipments between banks t, and the policy for each bank B and B̃, in
order to satisfy the demand at each bank (with the post-shipment supplies). You must minimize
the cost, which is the shipment cost plus the total cost of all blood consumed at the two banks,
using the prices p ∈ R++^4. (So for example, p3 is the cost for one unit of type B blood.)
Remark. We consider here a static problem with only two blood banks just to keep things simple. A
more realistic problem formulation would plan over a sequence of time periods, and include aspects
such as shipping time, blood storage life, and donations to the banks over time, as well as more
than two banks.
(a) Explain how to solve this problem using convex optimization. If you change variables or relax
the problem, you must justify and explain it.
(b) Solve the problem instance with κ = 0.5 and the given data
    p = (4, 2, 2, 1),  d = (20, 5, 10, 15),  s = (30, 10, 5, 0),  d̃ = (10, 25, 5, 15),  s̃ = (5, 20, 15, 20).
Report the optimal shipment vector t and the optimal policies B and B̃ for the two banks.
Give the optimal cost.
Verify that the problem is infeasible when shipments are not allowed, i.e., t = 0, and explain
why.
21.23 Combining partial rankings. In the rank aggregation problem, we are given several ordered lists of
items, and want to construct a single list that reflects the orderings in the given lists. Suppose we
have m ordered lists of k indices i ∈ {1, . . . , n}, each of the form
σ j = (ij1 , ij2 , . . . , ijk ), j = 1, . . . , m.
The meaning of the lists is that, according to list j, ij1 is preferred to ij2 , is preferred to ij3 , and so
on. (The lists have the same number of items, k, for notational simplicity. Everything works in the
more realistic case when the lists have different lengths.)
We will search for a set of scores for the items, denoted by s ∈ Rn . This set of scores induces a
ranking of the items, with the item of highest score the first, second highest second, and so on. (If
there are repeated entries in s, the ranking is ambiguous.)
We say that a score s is consistent with a ranking σ^j if s_{i1^j} > s_{i2^j} > · · · > s_{ik^j}.
(a) Finding a consistent score. Explain how to use convex optimization to find a set of scores
(and therefore also a ranking) that is consistent with the given lists, assuming there is one.
Note. No, you cannot solve convex problems with strict inequalities.
(b) Use your method on the data in ranked_lists_data.*. Give the ordering you get. (Be sure
to include your code.)
(c) Finding a score consistent with many of the lists. Suppose there is no consistent score for the
set of lists. Suggest a convex optimization problem that is a heuristic for finding a score that
is consistent with as many of the given lists as possible. Note that this is not the same as
finding a score for which many of the inequalities hold; all k − 1 inequalities in a given list
must hold for the scores to be consistent with the list. Note. We will accept any reasonable
solution; there are several we can think of.
(d) Use your method on the data in ranked lists inconsistent data.*. Give the ordering you
get as well as the number of lists for which your solution is inconsistent (i.e., where at least
one of the k − 1 pairs in the list is mis-ordered).
21.24 Allocating memory. A multicore processor has n cores and m memory blocks. Each core i is
allocated a (nonnegative) amount Mij from memory block j. Core i requires a total of bi memory,
so M 1 = b. Memory block j has a total capacity cj , so we have M T 1 ⪯ c. Our goal is to find a
memory allocation M ∈ R_+^{n×m} that satisfies these requirements and minimizes the cost
∑_{i=1}^n ∑_{j=1}^m (Cij Mij + Dij Mij^2),
where Cij , Dij are given nonnegative cost rates. (Typically Cij and Dij are increasing functions of
the ℓ1 distance between memory block j and core i, but you don’t need to know this to solve the
problem.)
The data in the problem are the core memory requirements bi , the memory block capacities cj , and
the cost rates Cij , Dij . You are to determine an optimal memory allocation M ∈ R_+^{n×m}.
Solve the problem instance with the data given in allocate_memory_data.*. This file also contains
plotting code to show a memory allocation. Use this to plot your optimal allocation.
21.25 Wasserstein midpoint. Let p and q be two probability distributions on {1, . . . , n}, i.e., p, q ⪰ 0,
1T p = 1T q = 1. We associate with each index i = 1, . . . , n a location ai ∈ Rd .
The Wasserstein distance between the two distributions, denoted dW (p, q), is defined as the optimal
value of the problem
minimize    tr(C^T X)
subject to  X1 = q,   X^T 1 = p,
            Xij ≥ 0,   i, j = 1, . . . , n,
with variable X ∈ Rn×n . Here C ∈ Rn×n is defined as Cij = ∥ai − aj∥_2^2, for i, j = 1, . . . , n. (Some
authors consider dW (p, q)^{1/2} to be the Wasserstein distance, but we will use the definition above.)
We interpret the Wasserstein distance as follows. The number Xij denotes the amount of probability
mass we move from j to i, and Cij Xij is the associated cost; the total cost is ∑_{i,j} Cij Xij = tr(C^T X).
The Wasserstein midpoint cW between the two probability distributions p and q is defined as the distribution c that minimizes dW (p, c) + dW (c, q).
(a) Explain how to find the Wasserstein midpoint cW using convex optimization.
(b) Find the Wasserstein midpoint for the two distributions on a k × k grid (so n = k 2 ) shown
below.
[Figure: the two distributions, panels (a) and (b), shown on the k × k grid; both axes run from 0 to 20.]
The two distributions are given as vectors p and q in the file wass_midpoint_data.py. The
cost matrix is given directly as the n × n matrix C; therefore, we do not give A.
Give the optimal objective value dW (p, cW ) + dW (cW , q) (to two significant figures). Plot
the Wasserstein midpoint using the provided function plot_pdfs(p, q, c). Also plot the
(algebraic) midpoint calg = (1/2)(p + q), and make a brief statement comparing them.
Remarks. The Wasserstein distance is also called optimal transport, earth mover’s, or Monge-
Kantorovich distance. In many applications (e.g., with images) it can give a more intuitive and
useful distance than other common ones such as Euclidean.
21.26 Optimal diagonal preconditioner for a PD matrix. Let A ∈ Sn++ . We seek a diagonal matrix D
with positive diagonal entries that minimizes the condition number of DAD,
λmax (DAD)
cond(DAD) = .
λmin (DAD)
We will simply assume that an optimal D exists. Note also that if D is optimal, so is αD for
any α > 0. This optimization problem arises in numerical methods where D represents a diagonal
preconditioner, but you don’t need to know this to solve this problem.
(a) Explain how to use convex optimization to find an optimal diagonal preconditioner for A. You
will receive half credit for a valid quasiconvex formulation.
(b) Carry out the procedure from part (a) for the problem instance with
    ⎡  0.2  −0.2   0.6  −0.6 ⎤
A = ⎢ −0.2   0.4  −1.4   1.3 ⎥ .
    ⎢  0.6  −1.4   5.2  −4.7 ⎥
    ⎣ −0.6   1.3  −4.7   4.4 ⎦
Give the diagonal entries of D⋆ . Report the optimal condition number cond(D⋆ AD⋆ ) and the
condition number of the original matrix, cond(A).