Convexsol 1
Optimization
Chapter 1 Solutions
Dimitri P. Bertsekas
with
1.1
(a) Let x ∈ ∩i∈I Ci and let α be a positive scalar. Since x ∈ Ci for all i ∈ I and
each Ci is a cone, the vector αx belongs to Ci for all i ∈ I. Hence, αx ∈ ∩i∈I Ci ,
showing that ∩i∈I Ci is a cone.
(b) Let x ∈ C1 × C2 and let α be a positive scalar. Then x = (x1 , x2 ) for some
x1 ∈ C1 and x2 ∈ C2 , and since C1 and C2 are cones, it follows that αx1 ∈ C1
and αx2 ∈ C2 . Hence, αx = (αx1 , αx2 ) ∈ C1 × C2 , showing that C1 × C2 is a
cone.
(c) Let x ∈ C1 + C2 and let α be a positive scalar. Then, x = x1 + x2 for some
x1 ∈ C1 and x2 ∈ C2 , and since C1 and C2 are cones, αx1 ∈ C1 and αx2 ∈ C2 .
Hence, αx = αx1 + αx2 ∈ C1 + C2 , showing that C1 + C2 is a cone.
(d) Let x ∈ cl(C) and let α be a positive scalar. Then, there exists a sequence
{xk } ⊂ C such that xk → x, and since C is a cone, αxk ∈ C for all k. Further-
more, αxk → αx, implying that αx ∈ cl(C). Hence, cl(C) is a cone.
(e) First we prove that A·C is a cone, where A is a linear transformation and A·C
is the image of C under A. Let z ∈ A · C and let α be a positive scalar. Then,
Ax = z for some x ∈ C, and since C is a cone, αx ∈ C. Because A(αx) = αz,
the vector αz is in A · C, showing that A · C is a cone.
Next we prove that the inverse image A−1 · C of C under A is a cone. Let
x ∈ A−1 · C and let α be a positive scalar. Then Ax ∈ C, and since C is a cone,
αAx ∈ C. Thus, the vector A(αx) ∈ C, implying that αx ∈ A−1 · C, and showing
that A−1 · C is a cone.
1.3 (Lower Semicontinuity under Composition)
which together with the fact {xk }K → x and the lower semicontinuity of f , yields

f (x) ≤ lim inf_{k→∞, k∈K} f (xk ),

showing that {f (xk )}K → f (x). By our choice of the sequence {xk }K and the
lower semicontinuity of g, it follows that

lim_{k→∞, k∈K} g(f (xk )) = lim inf_{k→∞, k∈K} g(f (xk )) ≥ g(f (x)),

As a counterexample, consider the function f : < 7→ < given by

f (x) = 0 if x ≤ 0, and f (x) = 1 if x > 0,
1.4 (Convexity under Composition)
Let x, y ∈ <n and let α ∈ [0, 1]. By the definitions of h and f , we have

h(αx + (1 − α)y) = g(f (αx + (1 − α)y))
= g(f1 (αx + (1 − α)y), . . . , fm (αx + (1 − α)y))
≤ g(αf1 (x) + (1 − α)f1 (y), . . . , αfm (x) + (1 − α)fm (y))
= g(α(f1 (x), . . . , fm (x)) + (1 − α)(f1 (y), . . . , fm (y)))
≤ αg(f1 (x), . . . , fm (x)) + (1 − α)g(f1 (y), . . . , fm (y))
= αg(f (x)) + (1 − α)g(f (y))
= αh(x) + (1 − α)h(y),

where the first inequality uses the convexity of each fi and the monotonicity of g,
and the second inequality uses the convexity of g.
1.5

for all x = (x1 , . . . , xn ) ∈ X. From this, direct computation shows that for all
z = (z1 , . . . , zn ) ∈ <n and x = (x1 , . . . , xn ) ∈ X, we have

z ′∇2 f1 (x)z = (f1 (x)/n^2 ) [ ( Σ_{i=1}^n zi /xi )^2 − n Σ_{i=1}^n (zi /xi )^2 ].
Note that this quadratic form is nonnegative for all z ∈ <n and x ∈ X, since
f1 (x) < 0, and for any real numbers α1 , . . . , αn , we have
(α1 + · · · + αn )^2 ≤ n(α1^2 + · · · + αn^2 ),

in view of the fact that 2αj αk ≤ αj^2 + αk^2 . Hence, ∇2 f1 (x) is positive semidefinite
for all x ∈ X, and it follows from Prop. 1.2.6(a) that f1 is convex.
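The sign claim above is easy to spot-check numerically from the closed form of the quadratic form. The following sketch (the sample dimension and test points are arbitrary choices, not from the text) evaluates it at random positive x and arbitrary z:

```python
import random

# Quadratic form of the Hessian of f1(x) = -(x1 * ... * xn)^(1/n), using the
# closed form derived above:
#   z' H z = (f1(x)/n^2) [ (sum zi/xi)^2 - n * sum (zi/xi)^2 ]
def quad_form(x, z):
    n = len(x)
    prod = 1.0
    for xi in x:
        prod *= xi
    f1 = -prod ** (1.0 / n)          # f1(x) < 0 on the positive orthant
    s = sum(zi / xi for xi, zi in zip(x, z))
    s2 = sum((zi / xi) ** 2 for xi, zi in zip(x, z))
    return (f1 / n**2) * (s**2 - n * s2)

random.seed(0)
for _ in range(1000):
    x = [random.uniform(0.1, 10.0) for _ in range(5)]   # x in the positive orthant
    z = [random.uniform(-1.0, 1.0) for _ in range(5)]
    assert quad_form(x, z) >= -1e-12                    # nonnegative up to roundoff
```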
(b) We show that the Hessian of f2 is positive semidefinite at all x ∈ <n . Let
β(x) = ex1 + · · · + exn . Then a straightforward calculation yields
z ′∇2 f2 (x)z = (1/(2β(x)^2 )) Σ_{i=1}^n Σ_{j=1}^n e^{xi +xj} (zi − zj )^2 ≥ 0, ∀ z ∈ <n .
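The double-sum-of-squares form of this quadratic form (with a 1/(2β(x)²) normalization) can be verified numerically against the direct expression z′∇²f2(x)z = (1/β)Σ e^{xi} zi² − (1/β²)(Σ e^{xi} zi)²; a small sketch with arbitrary test points:

```python
import math, random

# Hessian quadratic form of f2(x) = ln(e^{x1} + ... + e^{xn}), two ways.
def direct(x, z):
    b = sum(math.exp(xi) for xi in x)
    s1 = sum(math.exp(xi) * zi**2 for xi, zi in zip(x, z))
    s2 = sum(math.exp(xi) * zi for xi, zi in zip(x, z))
    return s1 / b - (s2 / b) ** 2

def double_sum(x, z):
    # sum over all pairs (i, j) of e^{xi+xj} (zi - zj)^2, normalized by 2 beta^2
    b = sum(math.exp(xi) for xi in x)
    total = sum(math.exp(xi + xj) * (zi - zj) ** 2
                for xi, zi in zip(x, z) for xj, zj in zip(x, z))
    return total / (2 * b**2)

random.seed(1)
for _ in range(100):
    x = [random.uniform(-2, 2) for _ in range(4)]
    z = [random.uniform(-1, 1) for _ in range(4)]
    assert abs(direct(x, z) - double_sum(x, z)) < 1e-10  # same value
    assert double_sum(x, z) >= 0.0                       # manifestly nonnegative
```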
In this case, g is convex and monotonically increasing on the set {t | t < 0}, while
h is convex over <n . Using Exercise 1.4, it follows that the function f4 (x) = 1/f (x)
is convex over <n .
(e) The function f5 (x) = αf (x) + β can be viewed as a composition g(f (x)) of
the function g(t) = αt + β, where t ∈ <, and the function f (x) for x ∈ <n . In
this case, g is convex and monotonically increasing over < (since α ≥ 0), while f
is convex over <n . Using Exercise 1.4, it follows that f5 is convex over <n .
(f) The function f6 (x) = e^{βx′Ax} can be viewed as a composition g(f (x)) of the
function g(t) = e^{βt} for t ∈ < and the function f (x) = x′Ax for x ∈ <n . In this
case, g is convex and monotonically increasing over <, while f is convex over <n
(since A is positive semidefinite). Using Exercise 1.4, it follows that f6 is convex
over <n .
(g) This part is straightforward using the definition of a convex function.
1.6

(a) Let x1 , x2 , x3 be three scalars such that x1 < x2 < x3 . Then we can write x2
as a convex combination of x1 and x3 as follows:

x2 = ((x3 − x2 )/(x3 − x1 )) x1 + ((x2 − x1 )/(x3 − x1 )) x3 ,

so that by the convexity of f ,

f (x2 ) ≤ ((x3 − x2 )/(x3 − x1 )) f (x1 ) + ((x2 − x1 )/(x3 − x1 )) f (x3 ).

This relation and the identity

f (x2 ) = ((x3 − x2 )/(x3 − x1 )) f (x2 ) + ((x2 − x1 )/(x3 − x1 )) f (x2 ),
imply that

((x3 − x2 )/(x3 − x1 )) (f (x2 ) − f (x1 )) ≤ ((x2 − x1 )/(x3 − x1 )) (f (x3 ) − f (x2 )).
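The resulting slope inequality, (f(x2) − f(x1))/(x2 − x1) ≤ (f(x3) − f(x2))/(x3 − x2), can be spot-checked for a concrete convex function; f(x) = x² below is an arbitrary illustrative choice:

```python
import random

def f(x):
    return x * x  # a convex function, chosen only for illustration

def slope(a, b):
    # slope of the chord of f between a and b (a != b)
    return (f(b) - f(a)) / (b - a)

random.seed(1)
for _ in range(1000):
    x1, x2, x3 = sorted(random.uniform(-100.0, 100.0) for _ in range(3))
    if x2 - x1 > 1e-6 and x3 - x2 > 1e-6:   # avoid near-coincident points
        assert slope(x1, x2) <= slope(x2, x3) + 1e-9
```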
(b) Let {xk } be an increasing scalar sequence, i.e., x1 < x2 < x3 < · · · . Then,
according to part (a), we have for all k

(f (xk ) − f (xk−1 ))/(xk − xk−1 ) ≤ (f (xk+1 ) − f (xk ))/(xk+1 − xk ). (1.2)

Since the slopes in Eq. (1.2) are nondecreasing, they tend to a limit,

(f (xk ) − f (xk−1 ))/(xk − xk−1 ) → γ, (1.3)

and

(f (xk+1 ) − f (xk ))/(xk+1 − xk ) ≤ γ, ∀ k. (1.4)

Let {yj } be any other increasing scalar sequence. By combining {xk } and {yj }
into a single increasing sequence and using Eq. (1.2), it can be seen that

lim_{j→∞} (f (yj+1 ) − f (yj ))/(yj+1 − yj ) ≤ γ.

Similarly, by exchanging the roles of {xk } and {yj }, we can show that

lim_{j→∞} (f (yj+1 ) − f (yj ))/(yj+1 − yj ) ≥ γ.

Thus the limit in Eq. (1.3) is independent of the choice for {xk }, and Eqs. (1.2)
and (1.4) hold for any increasing scalar sequence {xk }.
We consider separately each of the three possibilities γ < 0, γ = 0, and
γ > 0. First, suppose that γ < 0, and let {xk } be any increasing sequence. By
using Eq. (1.4), we obtain
f (xk ) = Σ_{j=1}^{k−1} [(f (xj+1 ) − f (xj ))/(xj+1 − xj )] (xj+1 − xj ) + f (x1 )
≤ Σ_{j=1}^{k−1} γ(xj+1 − xj ) + f (x1 )
= γ(xk − x1 ) + f (x1 ),
and since γ < 0 and xk → ∞, it follows that f (xk ) → −∞. To show that f
decreases monotonically, pick any x and y with x < y, and consider the sequence
x1 = x, x2 = y, and xk = y + k for all k ≥ 3. By using Eq. (1.4) with k = 1, we
have
(f (y) − f (x))/(y − x) ≤ γ < 0,
so that f (y) − f (x) < 0. Hence f decreases monotonically to −∞, corresponding
to case (1).
Suppose now that γ = 0, and let {xk } be any increasing sequence. Then,
by Eq. (1.4), we have f (xk+1 ) − f (xk ) ≤ 0 for all k. If f (xk+1 ) − f (xk ) < 0 for all
k, then f decreases monotonically. To show this, pick any x and y with x < y,
and consider a new sequence given by y1 = x, y2 = y, and yk = xK+k−3 for all
k ≥ 3, where K is large enough so that y < xK . By using Eqs. (1.2) and (1.4)
with {yk }, we have

(f (y) − f (x))/(y − x) ≤ (f (xK+1 ) − f (xK ))/(xK+1 − xK ) < 0,

implying that f (y) − f (x) < 0. Hence f decreases monotonically, and it may
decrease to −∞ or to a finite value, corresponding to cases (1) or (2), respectively.
If for some K we have f (xK+1 ) − f (xK ) = 0, then by Eqs. (1.2) and (1.4)
where γ = 0, we obtain f (xk ) = f (xK ) for all k ≥ K. To show that f stays at
the value f (xK ) for all x ≥ xK , choose any x such that x > xK , and define {yk }
as y1 = xK , y2 = x, and yk = xN +k−3 for all k ≥ 3, where N is large enough so
that x < xN . By using Eqs. (1.2) and (1.4) with {yk }, we have

(f (x) − f (xK ))/(x − xK ) ≤ (f (xN ) − f (x))/(xN − x) ≤ 0,

so that f (x) ≤ f (xK ) and f (xN ) ≤ f (x). Since f (xK ) = f (xN ), we have
f (x) = f (xK ). Hence f (x) = f (xK ) for all x ≥ xK , corresponding to case (3).
Finally, suppose that γ > 0, and let {xk } be any increasing sequence. Since
(f (xk ) − f (xk−1 ))/(xk − xk−1 ) is nondecreasing and tends to γ [cf. Eqs. (1.3)
and (1.4)], there is a positive integer K and a positive scalar δ with δ < γ such
that

(f (xk ) − f (xk−1 ))/(xk − xk−1 ) ≥ δ, ∀ k ≥ K. (1.5)
Therefore, for all k > K,

f (xk ) = Σ_{j=K}^{k−1} [(f (xj+1 ) − f (xj ))/(xj+1 − xj )] (xj+1 − xj ) + f (xK ) ≥ δ(xk − xK ) + f (xK ),

and since δ > 0 and xk → ∞, it follows that f (xk ) → ∞.
1.7

Thus, dh/dt is nondecreasing on [0, 1], and for any t ∈ (0, 1), we have
(h(t) − h(0))/t = (1/t) ∫_0^t (dh(τ )/dτ ) dτ ≤ dh(t)/dt ≤ (1/(1 − t)) ∫_t^1 (dh(τ )/dτ ) dτ = (h(1) − h(t))/(1 − t).
Equivalently,
th(1) + (1 − t)h(0) ≥ h(t),
and from the definition of h, we obtain
tf (y) + (1 − t)f (x) ≥ f (ty + (1 − t)x).
Since this inequality has been proved for arbitrary t ∈ [0, 1] and x, y ∈ C, we
conclude that f is convex.
1.8 (Characterization of Twice Continuously Differentiable
Convex Functions)
Suppose that f : <n 7→ < is convex over C. We first show that for all x ∈ ri(C)
and y ∈ S, we have y ′∇2 f (x)y ≥ 0. To arrive at a contradiction, assume that
there exists some x̄ ∈ ri(C) such that for some y ∈ S, we have

y ′∇2 f (x̄)y < 0.

Without loss of generality, we may assume that kyk = 1. Using the continuity of
∇2 f , we see that there is an open ball B(x̄, ε) centered at x̄ with radius ε > 0
such that B(x̄, ε) ∩ aff(C) ⊂ C [since x̄ ∈ ri(C)], and

y ′∇2 f (x)y < 0, ∀ x ∈ B(x̄, ε). (1.7)

By the second order expansion, for every α ∈ [0, ε) we have

f (x̄ + αy) = f (x̄) + α∇f (x̄)′ y + (α^2 /2) y ′∇2 f (x̄ + ᾱy)y,

for some ᾱ ∈ [0, α]. Furthermore, k(x̄ + ᾱy) − x̄k ≤ ε [since kyk = 1 and ᾱ < ε].
Hence, from Eq. (1.7), it follows that

f (x̄ + αy) < f (x̄) + α∇f (x̄)′ y, ∀ α ∈ (0, ε).

On the other hand, by the choice of ε and the assumption that y ∈ S, the vectors
x̄ + αy are in C for all α ∈ [0, ε), so the preceding strict inequality contradicts
the convexity of f over C. Hence, we have y ′∇2 f (x)y ≥ 0 for all y ∈ S and all
x ∈ ri(C).
Next, let x be a point in C that is not in the relative interior of C. Then, by
the Line Segment Principle, there is a sequence {xk } ⊂ ri(C) such that xk → x.
As seen above, y ′∇2 f (xk )y ≥ 0 for all y ∈ S and all k, which together with the
continuity of ∇2 f implies that

y ′∇2 f (x)y ≥ 0, ∀ y ∈ S.
for some α ∈ [0, 1]. Since x, z ∈ C, we have that (z − x) ∈ S, and using the
convexity of C and our assumption, it follows that
1.9 (Strong Convexity)
(a) Fix some x, y ∈ <n such that x ≠ y, and define the function h : < 7→ < by
h(t) = f (x + t(y − x)). Consider scalars t and s such that t < s. Using the chain
rule and the equation

(∇f (x) − ∇f (y))′ (x − y) ≥ αkx − yk^2 , ∀ x, y ∈ <n , (1.8)

we obtain

dh(s)/ds − dh(t)/dt = (∇f (x + s(y − x)) − ∇f (x + t(y − x)))′ (y − x) ≥ α(s − t)ky − xk^2 > 0.

Thus, dh/dt is strictly increasing, and for any t ∈ (0, 1), we have
(h(t) − h(0))/t = (1/t) ∫_0^t (dh(τ )/dτ ) dτ < (1/(1 − t)) ∫_t^1 (dh(τ )/dτ ) dτ = (h(1) − h(t))/(1 − t).
f (x + cy) = f (x) + cy ′∇f (x) + (c^2 /2) y ′∇2 f (x + tcy)y,

and

f (x) = f (x + cy) − cy ′∇f (x + cy) + (c^2 /2) y ′∇2 f (x + scy)y,

for some t and s belonging to [0, 1]. Adding these two equations and using Eq.
(1.8), we obtain

(c^2 /2) y ′(∇2 f (x + scy) + ∇2 f (x + tcy))y = (∇f (x + cy) − ∇f (x))′ (cy) ≥ αc^2 kyk^2 .
We divide both sides by c2 and then take the limit as c → 0 to conclude that
y 0 ∇2 f (x)y ≥ αkyk2 . Since this inequality is valid for every y ∈ <n , it follows
that ∇2 f (x) − αI is positive semidefinite.
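As a numeric illustration of Eq. (1.8), consider f(x) = kxk² + ln(1 + e^{x1}) (an example chosen here, not from the text): its Hessian is 2I plus a positive semidefinite term, so the gradient monotonicity inequality should hold with α = 2.

```python
import math, random

# Gradient of f(x) = ||x||^2 + ln(1 + e^{x1}); the second term's gradient is
# the logistic sigmoid in the first coordinate.
def grad(x):
    g = [2.0 * xi for xi in x]
    g[0] += 1.0 / (1.0 + math.exp(-x[0]))   # d/dx1 of ln(1 + e^{x1})
    return g

random.seed(2)
for _ in range(200):
    x = [random.uniform(-5, 5) for _ in range(3)]
    y = [random.uniform(-5, 5) for _ in range(3)]
    gx, gy = grad(x), grad(y)
    lhs = sum((a - b) * (u - v) for a, b, u, v in zip(gx, gy, x, y))
    dist2 = sum((u - v) ** 2 for u, v in zip(x, y))
    assert lhs >= 2.0 * dist2 - 1e-9        # Eq. (1.8) with alpha = 2
```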
For the converse, assume that ∇2 f (x) − αI is positive semidefinite for all
x ∈ <n . Consider the function g : < 7→ < defined by

g(t) = ∇f (tx + (1 − t)y)′ (x − y).
for some t ∈ [0, 1]. On the other hand,

dg(t)/dt = (x − y)′ ∇2 f (tx + (1 − t)y)(x − y) ≥ αkx − yk^2 ,
where the last inequality holds because ∇2 f (tx + (1 − t)y) − αI is positive
semidefinite. Combining the last two relations, it follows that f is strongly convex
coefficient α.
1.10 (Posynomials)
where gk is a posynomial and γk > 0 for all k. Using a change of variables similar
to part (b), we see that we can represent the function f (x) = ln g(y) as
f (x) = Σ_{k=1}^r γk ln exp(Ak x + bk ),
with the matrix Ak and the vector bk being associated with the posynomial gk for
each k. Since f (x) is a linear combination of convex functions with nonnegative
coefficients [part (b)], it follows from Prop. 1.2.4(a) that f (x) is convex.
1.11 (Arithmetic-Geometric Mean Inequality)
Consider the function f (x) = − ln(x). Since ∇2 f (x) = 1/x^2 > 0 for all x > 0, the
function − ln(x) is strictly convex over (0, ∞). Therefore, for all positive scalars
x1 , . . . , xn and α1 , . . . , αn with Σ_{i=1}^n αi = 1, we have

− ln(α1 x1 + · · · + αn xn ) ≤ −α1 ln(x1 ) − · · · − αn ln(xn ),

which is equivalent to
eln(α1 x1 +···+αn xn ) ≥ eα1 ln(x1 )+···+αn ln(xn ) = eα1 ln(x1 ) · · · eαn ln(xn ) ,
or
α1 x1 + · · · + αn xn ≥ x1^{α1} · · · xn^{αn} ,
as desired. Since − ln(x) is strictly convex, the above inequality is satisfied with
equality if and only if the scalars x1 , . . . , xn are all equal.
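The weighted arithmetic–geometric mean inequality, including the equality case, can be spot-checked numerically (the sample sizes and ranges below are arbitrary):

```python
import random

random.seed(3)
for _ in range(1000):
    n = 5
    x = [random.uniform(0.01, 100.0) for _ in range(n)]
    w = [random.random() + 1e-9 for _ in range(n)]
    total = sum(w)
    a = [wi / total for wi in w]              # alpha_1 + ... + alpha_n = 1
    arith = sum(ai * xi for ai, xi in zip(a, x))
    geom = 1.0
    for ai, xi in zip(a, x):
        geom *= xi ** ai
    assert arith >= geom - 1e-9 * max(1.0, geom)

# Equality when all xi are equal:
x = [3.0] * 5
a = [0.2] * 5
assert abs(sum(ai * xi for ai, xi in zip(a, x)) - 3.0) < 1e-12
```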
1.12

where 1/p + 1/q = 1, p > 0, and q > 0. The above relation also holds if u = 0 or
v = 0. By setting u = x^p and v = y^q , we obtain Young's inequality

xy ≤ x^p /p + y^q /q, ∀ x ≥ 0, ∀ y ≥ 0.
To show Hölder's inequality, note that it holds if x1 = · · · = xn = 0 or
y1 = · · · = yn = 0. If x1 , . . . , xn and y1 , . . . , yn are such that (x1 , . . . , xn ) ≠ 0
and (y1 , . . . , yn ) ≠ 0, then by using

x = |xi | / (Σ_{j=1}^n |xj |^p )^{1/p} and y = |yi | / (Σ_{j=1}^n |yj |^q )^{1/q}

in Young's inequality, we obtain for all i,

|xi | |yi | / ((Σ_{j=1}^n |xj |^p )^{1/p} (Σ_{j=1}^n |yj |^q )^{1/q}) ≤ |xi |^p / (p Σ_{j=1}^n |xj |^p ) + |yi |^q / (q Σ_{j=1}^n |yj |^q ).

Adding these inequalities over i = 1, . . . , n, and using 1/p + 1/q = 1, we obtain
Hölder's inequality.
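Hölder's inequality itself is then easy to validate numerically for a sample conjugate pair; p = 3 below is an arbitrary choice, with q determined by 1/p + 1/q = 1:

```python
import random

# Check: sum |xi yi| <= (sum |xi|^p)^(1/p) * (sum |yi|^q)^(1/q)
random.seed(4)
p = 3.0
q = p / (p - 1.0)   # conjugate exponent, so 1/p + 1/q = 1
for _ in range(1000):
    x = [random.uniform(-10, 10) for _ in range(6)]
    y = [random.uniform(-10, 10) for _ in range(6)]
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    rhs = (sum(abs(a) ** p for a in x)) ** (1 / p) * \
          (sum(abs(b) ** q for b in y)) ** (1 / q)
    assert lhs <= rhs + 1e-9 * max(1.0, rhs)
```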
1.13
Let (x, w) and (y, v) be two vectors in epi(f ). Then f (x) ≤ w and f (y) ≤ v,
implying that there exist sequences {(x, wk )} ⊂ C and {(y, v k )} ⊂ C such that
for all k,

wk ≤ w + 1/k, v k ≤ v + 1/k.

By the convexity of C, we have for all α ∈ [0, 1] and all k,

(αx + (1 − α)y, αwk + (1 − α)v k ) ∈ C,
1.14
1.15
for some positive integer m, nonnegative scalars λi , and vectors xi ∈ C. Since
y ≠ 0, we cannot have all λi equal to zero, implying that Σ_{i=1}^m λi > 0. Because
xi ∈ C for all i and C is convex, the vector

x = Σ_{i=1}^m (λi / Σ_{j=1}^m λj ) xi
belongs to C. For this vector, we have
y = (Σ_{i=1}^m λi ) x,

with Σ_{i=1}^m λi > 0, implying that y ∈ ∪_{x∈C} {γx | γ ≥ 0} and showing that
showing that λx ∈ C and that C is a cone. Let x, y ∈ C and let λ ∈ [0, 1]. Then
x = (1/2) x1 + (1/2) x2 ,
By taking the convex hull of both sides in the above inclusion and by using the
convexity of C1 + C2 , we obtain
conv(C1 ∪ C2 ) ⊂ conv(C1 + C2 ) = C1 + C2 .
αC1 ∩ (1 − α)C2 = C1 ∩ C2 .
1.17
α1 Ci1 + · · · + αm Cim ,
with
αi ≥ 0, ∀ i = 1, . . . , m, Σ_{i=1}^m αi = 1,
1.18 (Convex Hulls, Affine Hulls, and Generated Cones)
(a) We first show that X and cl(X) have the same affine hull. Since X ⊂ cl(X),
there holds
aff(X) ⊂ aff(cl(X)).
Conversely, because X ⊂ aff(X) and aff(X) is closed, we have cl(X) ⊂ aff(X),
implying that
aff(cl(X)) ⊂ aff(X).
We now show that X and conv(X) have the same affine hull. By using a
translation argument if necessary, we assume without loss of generality that X
contains the origin, so that both aff(X) and aff(conv(X)) are subspaces. Since
X ⊂ conv(X), evidently aff(X) ⊂ aff(conv(X)). To show the reverse inclusion,
let the dimension of aff(conv(X)) be m, and let x1 , . . . , xm be linearly indepen-
dent vectors in conv(X) that span aff(conv(X)). Then every x ∈ aff(conv(X)) is
a linear combination of the vectors x1 , . . . , xm , i.e., there exist scalars β1 , . . . , βm
such that
x = Σ_{i=1}^m βi xi .
As an example showing that the above inclusion can be strict, consider the
set X = {(1, 1)} in <2 . Then conv(X) = X, so that

aff(conv(X)) = X = {(1, 1)},

and the dimension of conv(X) is zero. On the other hand, cone(X) = {(α, α) |
α ≥ 0}, so that

aff(cone(X)) = {(x1 , x2 ) | x1 = x2 },
and the dimension of cone(X) is one.
(d) In view of parts (a) and (c), it suffices to show that
aff(cone(X)) ⊂ aff(conv(X)) = aff(X).
It is always true that 0 ∈ cone(X), so aff(cone(X)) is a subspace. Let the
dimension of aff(cone(X)) be m, and let x1 , . . . , xm be linearly independent
vectors in cone(X) that span aff(cone(X)). Since every vector in aff(cone(X)) is
a linear combination of x1 , . . . , xm , and since each xi is a nonnegative combination
of some vectors in X, it follows that every vector in aff(cone(X)) is a linear
combination of some vectors in X. In view of the assumption that 0 ∈ conv(X),
the affine hull of conv(X) is a subspace, which implies by part (a) that the affine
hull of X is a subspace. Hence, every vector in aff(cone(X)) belongs to aff(X),
showing that aff(cone(X)) ⊂ aff(X).
1.19
By definition, f (x) is the infimum of the values of w such that (x, w) ∈ C, where
C is the convex hull of the union of nonempty convex sets epi(fi ). We have that
(x, w) ∈ C if and only if (x, w) can be expressed as a convex combination of the
form
(x, w) = Σ_{i∈I} αi (xi , wi ) = (Σ_{i∈I} αi xi , Σ_{i∈I} αi wi ),
where I ⊂ I is a finite set and (xi , wi ) ∈ epi(fi ) for all i ∈ I. Thus, f (x) can be
expressed as
f (x) = inf { Σ_{i∈I} αi wi | (x, w) = Σ_{i∈I} αi (xi , wi ), (xi , wi ) ∈ epi(fi ), αi ≥ 0, ∀ i ∈ I, Σ_{i∈I} αi = 1 }.
Since the set {(xi , fi (xi )) | xi ∈ <n } is contained in epi(fi ), we obtain

f (x) ≤ inf { Σ_{i∈I} αi fi (xi ) | x = Σ_{i∈I} αi xi , xi ∈ <n , αi ≥ 0, ∀ i ∈ I, Σ_{i∈I} αi = 1 }.
On the other hand, by the definition of epi(fi ), for each (xi , wi ) ∈ epi(fi ) we
have wi ≥ fi (xi ), implying that
f (x) ≥ inf { Σ_{i∈I} αi fi (xi ) | x = Σ_{i∈I} αi xi , xi ∈ <n , αi ≥ 0, ∀ i ∈ I, Σ_{i∈I} αi = 1 }.
By combining the last two relations, we obtain
f (x) = inf { Σ_{i∈I} αi fi (xi ) | x = Σ_{i∈I} αi xi , xi ∈ <n , αi ≥ 0, ∀ i ∈ I, Σ_{i∈I} αi = 1 }.
On the other hand, by the definition of epi(f ), for each (xi , wi ) ∈ epi(f ) we have
wi ≥ f (xi ), implying that
F (x) ≥ inf { Σ_i αi f (xi ) | (x, w) = Σ_i αi (xi , wi ), (xi , wi ) ∈ epi(f ), αi ≥ 0, Σ_i αi = 1 }
= inf { Σ_i αi f (xi ) | x = Σ_i αi xi , xi ∈ X, αi ≥ 0, Σ_i αi = 1 },
which combined with the preceding inequality implies the desired relation.
(b) By using part (a), we have for every x ∈ X
F (x) ≤ f (x),

since f (x) corresponds to the value of the function Σ_i αi f (xi ) for the particular
representation of x as a finite convex combination of elements of X, namely
x = 1 · x. Therefore, we have

inf_{x∈conv(X)} F (x) ≤ inf_{x∈X} f (x).
Let f ∗ = inf x∈X f (x). If inf x∈conv(X) F (x) < f ∗ , then there exists z ∈
conv(X) with F (z) < f ∗ . According to part (a), there exist points xi ∈ X and
nonnegative scalars αi with Σ_i αi = 1 such that z = Σ_i αi xi and

F (z) ≤ Σ_i αi f (xi ) < f ∗ ,
implying that
Σ_i αi (f (xi ) − f ∗ ) < 0.
Since each αi is nonnegative, for this inequality to hold, we must have f (xi )−f ∗ <
0 for some i, but this cannot be true because xi ∈ X and f ∗ is the optimal value
of f over X. Therefore,

inf_{x∈conv(X)} F (x) = inf_{x∈X} f (x),

and
F (x) = inf { Σ_i αi c′ xi | Σ_i αi xi = x, xi ∈ X, Σ_i αi = 1, αi ≥ 0 }
= inf { c′ (Σ_i αi xi ) | Σ_i αi xi = x, xi ∈ X, Σ_i αi = 1, αi ≥ 0 }
= c′ x,
showing that
inf_{x∈conv(X)} c′ x = inf_{x∈X} c′ x.
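For a finite set X, the conclusion inf over conv(X) of c′x equals inf over X can be checked by sampling random convex combinations; the particular X and c below are arbitrary:

```python
import random

# Every convex combination of points of X has linear cost >= the best point cost.
random.seed(5)
X = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(8)]
c = (1.3, -0.7)
best = min(c[0] * x1 + c[1] * x2 for (x1, x2) in X)
for _ in range(1000):
    w = [random.random() for _ in X]
    s = sum(w)
    a = [wi / s for wi in w]                 # convex weights summing to 1
    z = (sum(ai * x1 for ai, (x1, _) in zip(a, X)),
         sum(ai * x2 for ai, (_, x2) in zip(a, X)))
    assert c[0] * z[0] + c[1] * z[1] >= best - 1e-9
```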
for some positive scalars α1 , . . . , αm and vectors

x = Σ_{i=1}^k αi xi + Σ_{i=k+1}^m αi yi , 1 = Σ_{i=1}^k αi .

Σ_{i=1}^k λi (xi , 1) + Σ_{i=k+1}^m λi (yi , 0) = 0,
1.23
cl(conv(X)) ⊂ cl(conv(cl(X))) = conv(cl(X)).

Conversely,

conv(cl(X)) ⊂ conv(cl(conv(X))) = cl(conv(X)),
since by Prop. 1.2.1(d), the closure of a convex set is convex. Hence, the result
follows.
1.24 (Radon’s Theorem)
Σ_{i=1}^m λi xi = 0, Σ_{i=1}^m λi = 0.
where
αi = λ∗i / Σ_{k∈I} λ∗k , i ∈ I.
In view of the equations Σ_{i=1}^m λ∗i xi = 0 and Σ_{i=1}^m λ∗i = 0, we also have

x∗ = Σ_{j∈J} αj xj ,
where
αj = −λ∗j / Σ_{k∈J} (−λ∗k ), j ∈ J.
Given four distinct points in the plane (i.e., m = 4 and n = 2), Radon's
Theorem guarantees the existence of a partition into two subsets, the convex
hulls of which intersect. Assuming that no three of the points lie on the same
line, there are two possibilities:
(1) Each set in the partition consists of two points, in which case the convex
hulls intersect and define the diagonals of a quadrilateral.
(2) One set in the partition consists of three points and the other consists of one
point, in which case the triangle formed by the three points must contain
the fourth.
In the case where exactly three of the points lie on a common line segment and
the fourth does not, the triangle formed by the two endpoints of that segment
and the point off the line contains the remaining (middle) point of the segment.
In the case where all four of the points lie on a line segment, the degenerate
triangle formed by three of the points, including the two ends of the line segment,
contains the fourth point.
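Case (2) above can be illustrated numerically: the sketch below (with points chosen for the example, not taken from the text) computes barycentric coordinates and confirms that the triangle of three points contains the fourth.

```python
# Radon partition check for 4 points in the plane: partition {a, b, c} vs {p};
# conv{a,b,c} contains p iff all barycentric coordinates of p are >= 0.
def barycentric(p, a, b, c):
    """Barycentric coordinates of p w.r.t. triangle (a, b, c), via Cramer's rule."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
    l2 = ((p[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (p[1] - a[1])) / det
    l3 = ((b[0] - a[0]) * (p[1] - a[1]) - (p[0] - a[0]) * (b[1] - a[1])) / det
    return (1.0 - l2 - l3, l2, l3)

a, b, c, p = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (1.0, 1.0)
coords = barycentric(p, a, b, c)
print(all(l >= 0 for l in coords))  # True: the convex hulls intersect
```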
Consider the induction argument of the hint, let Bj be defined as in the hint,
and for each j, let xj be a vector in Bj . Since M + 1 ≥ n + 2, we can apply
Radon’s Theorem to the vectors x1 , . . . , xM +1 . Thus, there exist nonempty and
disjoint index subsets I and J such that I ∪ J = {1, . . . , M + 1}, nonnegative
scalars α1 , . . . , αM +1 , and a vector x∗ such that
x∗ = Σ_{i∈I} αi xi = Σ_{j∈J} αj xj , Σ_{i∈I} αi = Σ_{j∈J} αj = 1.
1.26
Assume the contrary, i.e., that for every index set I ⊂ {1, . . . , M } containing
no more than n + 1 indices, we have

inf_{x∈<n} max_{i∈I} fi (x) < f ∗ .
This means that for every such I, the intersection ∩_{i∈I} Xi is nonempty, where

Xi = {x | fi (x) < f ∗ }.

By Helly's Theorem, it follows that the intersection ∩_{i=1}^M Xi is also nonempty,
which contradicts the definition of f ∗ . Note: The result of this exercise relates to
the following question: what is the minimal number of functions fi that we need
to include in the cost function maxi fi (x) in order to attain the optimal value f ∗ ?
According to the result, the number is no more than n + 1. For applications of
this result in structural design and Chebyshev approximation, see Ben-Tal and
Nemirovski [BeN01].
1.27
αf (x) + (1 − α)f (x) ≥ f (αx + (1 − α)x) ≥ γ.
1.28
From Prop. 1.4.5(b), we have that for any vector a ∈ <n , ri(C + a) = ri(C) + a.
Therefore, we can assume without loss of generality that 0 ∈ C, and aff(C)
coincides with S. We need to show that
ri(C) = int(C + S ⊥ ) ∩ C.
Let x ∈ ri(C). By definition, this implies that x ∈ C and there exists some
open ball B(x, ε) centered at x with radius ε > 0 such that

B(x, ε) ∩ S ⊂ C. (1.9)

x + αyS ∈ B(x, ε) ∩ S ⊂ C,

B(x, ε) ∩ S ⊂ C,
1.29
(a) Let C be the given convex set. The convex hull of any subset of C is contained
in C. Therefore, the maximum dimension of the various simplices contained in
C is the largest m for which C contains m + 1 vectors x0 , . . . , xm such that
x1 − x0 , . . . , xm − x0 are linearly independent.
Let K = {x0 , . . . , xm } be such a set with m maximal,
and let aff(K) denote
the affine hull of set K. Then, we have dim aff(K) = m, and since K ⊂ C, it
follows that aff(K) ⊂ aff(C).
We claim that C ⊂ aff(K). To see this, assume that there exists some
x ∈ C, which does not belong to aff(K). This implies that the set {x, x0 , . . . , xm }
is a set of m + 2 vectors in C such that x − x0 , x1 − x0 , . . . , xm − x0 are linearly
independent, contradicting the maximality of m. Hence, we have C ⊂ aff(K),
and it follows that
aff(K) = aff(C),
thereby implying that dim(C) = m.
(b) We first consider the case where C is n-dimensional with n > 0 and show that
the interior of C is not empty. By part (a), an n-dimensional convex set contains
an n-dimensional simplex. We claim that such a simplex S has a nonempty
interior. Indeed, applying an affine transformation if necessary, we can assume
that the vertices of S are the vectors (0, 0, . . . , 0), (1, 0, . . . , 0), . . . , (0, 0, . . . , 1),
i.e.,

S = { (x1 , . . . , xn ) | xi ≥ 0, ∀ i = 1, . . . , n, Σ_{i=1}^n xi ≤ 1 }.
ri(C) = int(C + S ⊥ ) ∩ C,
1.30
(a) Let C1 be the segment {(x1 , x2 ) | 0 ≤ x1 ≤ 1, x2 = 0} and let C2 be the box
{(x1 , x2 ) | 0 ≤ x1 ≤ 1, 0 ≤ x2 ≤ 1}. We have

ri(C1 ) = {(x1 , x2 ) | 0 < x1 < 1, x2 = 0},

ri(C2 ) = {(x1 , x2 ) | 0 < x1 < 1, 0 < x2 < 1}.

ri(C1 ) = ri(C1 ∩ C2 ).
1.31
(a) Let x ∈ ri(C). We will show that for every x̄ ∈ aff(C), there exists a γ > 1
such that x + (γ − 1)(x − x̄) ∈ C. This is true if x̄ = x, so assume that x̄ ≠ x.
Since x ∈ ri(C), there exists ε > 0 such that

{z | kz − xk < ε} ∩ aff(C) ⊂ C.

Choose a point xε ∈ C in the intersection of the ray {x + α(x̄ − x) | α ≥ 0} and
the set {z | kz − xk < ε} ∩ aff(C). Then, for some positive scalar αε ,

x − xε = αε (x − x̄).

Since x ∈ ri(C) and xε ∈ C, by Prop. 1.4.1(c) there exists γ̄ > 1 such that

x + (γ̄ − 1)(x − xε ) ∈ C,

or equivalently,

x + (γ̄ − 1)αε (x − x̄) ∈ C.

The result follows by letting γ = 1 + (γ̄ − 1)αε and noting that γ > 1, since
(γ̄ − 1)αε > 0. The converse assertion follows from the fact C ⊂ aff(C) and
Prop. 1.4.1(c).
(b) The inclusion cone(C) ⊂ aff(C) always holds if 0 ∈ C. To show the reverse
inclusion, we note that by part (a) with x = 0, for every x ∈ aff(C), there exists
γ > 1 such that x̃ = (γ − 1)(−x) ∈ C. By using part (a) again with x = 0, for
x̃ ∈ C ⊂ aff(C), we see that there is γ̃ > 1 such that z = (γ̃ − 1)(−x̃) ∈ C, which
combined with x̃ = (γ − 1)(−x) yields z = (γ̃ − 1)(γ − 1)x ∈ C. Hence
x = (1/((γ̃ − 1)(γ − 1))) z,

with z ∈ C and (γ̃ − 1)(γ − 1) > 0, implying that x ∈ cone(C), and showing that
aff(C) ⊂ cone(C).
(c) This follows by part (b), where C = conv(X), and the fact
cone conv(X) = cone(X)
[Exercise 1.18(b)].
1.32
We have inf_{m≥0} kxm k > 0, since {xk } ⊂ C and C is a compact set not containing
the origin, so that

0 ≤ αk ≤ (sup_{m≥0} kym k) / (inf_{m≥0} kxm k) < ∞, ∀ k.
Thus, the sequence {(αk , xk )} is bounded and has a limit point (α, x) such that
α ≥ 0 and x ∈ C. By taking a subsequence of {(αk , xk )} that converges to (α, x),
and by using the facts yk = αk xk for all k and {yk } → y, we see that y = αx
with α ≥ 0 and x ∈ C. Hence, y ∈ cone(C), showing that cone(C) is closed.
(b) To see that the assertion in part (a) fails when C is unbounded, let C be the
line {(x1 , x2 ) | x1 = 1, x2 ∈ <} in <2 not passing through the origin. Then,
cone(C) is the nonclosed set {(x1 , x2 ) | x1 > 0, x2 ∈ <} ∪ {(0, 0)}.
To see that the assertion in part (a) fails when C contains the origin on its
relative boundary, let C be the closed ball {(x1 , x2 ) | (x1 − 1)^2 + x2^2 ≤ 1} in <2 .
Then, cone(C) is the nonclosed set {(x1 , x2 ) | x1 > 0, x2 ∈ <} ∪ {(0, 0)} (see
Fig. 1.3.2).
(c) Since C is compact, the convex hull of C is compact (cf. Prop. 1.3.2). Because
conv(C) does not contain the origin on its relative boundary, by part (a), the cone
generated by conv(C) is closed. By Exercise 1.18(b), cone(conv(C)) coincides
with cone(C), implying that cone(C) is closed.
1.33
(a) By Prop. 1.4.1(b), the relative interior of a convex set is a convex set. We
only need to show that ri(C) is a cone. Let y ∈ ri(C). Then, y ∈ C and since C
is a cone, αy ∈ C for all α > 0. By the Line Segment Principle, all the points on
the line segment connecting y and αy, except possibly αy, belong to ri(C). Since
this is true for every α > 0, it follows that αy ∈ ri(C) for all α > 0, showing that
ri(C) is a cone.
(b) Consider the linear transformation A that maps (α1 , . . . , αm ) ∈ <m into
Σ_{i=1}^m αi xi ∈ <n . Note that C is the image of the nonempty convex set

{(α1 , . . . , αm ) | α1 ≥ 0, . . . , αm ≥ 0}
1.34
Let T be the linear transformation that maps (x, y) ∈ <n+m into x ∈ <n . Then
it can be seen that
A−1 · C = T · (D ∩ S). (1.10)
The relative interior of D is given by ri(D) = <n × ri(C), and the relative interior
of S is equal to S (since S is a subspace). Hence,
In view of the assumption that A−1 · ri(C) is nonempty, we have that the in-
tersection ri(D) ∩ S is nonempty. Therefore, it follows from Props. 1.4.3(d) and
1.4.5(a) that

ri(T · (D ∩ S)) = T · (ri(D) ∩ S). (1.12)
Combining Eqs. (1.10)-(1.12), we obtain
Since the intersection ri(D) ∩ S is nonempty, it follows from Prop. 1.4.5(a) that
cl(D) ∩ S = cl(D ∩ S). Furthermore, since T is continuous, we obtain
To show the reverse inclusion, cl(A−1 · C) ⊂ A−1 · cl(C), let x be some vector in
cl(A−1 · C). This implies that there exists some sequence {xk } converging to x
such that Axk ∈ C for all k. Since xk converges to x, we have that Axk converges
to Ax, thereby implying that Ax ∈ cl(C), or equivalently, x ∈ A−1 · cl(C).
(a) Let g : <n 7→ [−∞, ∞] be such that g(x) ≤ f (x) for all x ∈ <n . Choose
any x ∈ dom(cl f ). Since epi(cl f ) = cl(epi(f )), we can choose a sequence
{(xk , wk )} ⊂ epi(f ) such that xk → x, wk → (cl f )(x). Since g is lower semicon-
tinuous at x, we have

g(x) ≤ lim inf_{k→∞} g(xk ) ≤ lim inf_{k→∞} f (xk ) ≤ lim inf_{k→∞} wk = (cl f )(x).
Note also that since epi(f ) ⊂ epi(cl f ), we have (cl f )(x) ≤ f (x) for all x ∈ <n .
(b) For the proof of this part and the next, we will use the easily shown fact that
for any convex function f , we have

ri(epi(f )) = {(x, w) | x ∈ ri(dom(f )), f (x) < w}.
Let x ∈ ri(dom(f )), and consider the vertical line L = {(x, w) | w ∈ <}.
Then there exists ŵ such that (x, ŵ) ∈ L ∩ ri(epi(f )). Let w be such that (x, w) ∈
L ∩ cl(epi(f )). Then, by Prop. 1.4.5(a), we have L ∩ cl(epi(f )) = cl(L ∩ epi(f )),
so that (x, w) ∈ cl(L ∩ epi(f )). It follows from the Line Segment Principle that
the vector (x, ŵ + α(w − ŵ)) belongs to epi(f ) for all α ∈ [0, 1). Taking the
limit as α → 1, we see that f (x) ≤ w for all w such that (x, w) ∈ L ∩ cl(epi(f )),
implying that f (x) ≤ (cl f )(x). On the other hand, since epi(f ) ⊂ epi(cl f ), we
have (cl f )(x) ≤ f (x) for all x ∈ <n , so f (x) = (cl f )(x).
We know that a closed convex function that is improper cannot take a finite
value at any point. Since cl f is closed and convex, and takes a finite value at all
points of the nonempty set ri(dom(f )), it follows that cl f must be proper.
(c) Since the function cl f is closed and is majorized by f , we have
(cl f )(y) ≤ lim inf_{α↓0} (cl f )(y + α(x − y)) ≤ lim inf_{α↓0} f (y + α(x − y)).
To show the reverse inequality, let w be such that f (x) < w. Then, (x, w) ∈
ri(epi(f )), while (y, (cl f )(y)) ∈ cl(epi(f )). From the Line Segment Principle, it
follows that

(αx + (1 − α)y, αw + (1 − α)(cl f )(y)) ∈ ri(epi(f )), ∀ α ∈ (0, 1].

Hence,

f (αx + (1 − α)y) < αw + (1 − α)(cl f )(y), ∀ α ∈ (0, 1].
By taking the limit as α → 0, we obtain
lim inf_{α↓0} f (y + α(x − y)) ≤ (cl f )(y),
ri(dom(f )) = ∩_{i=1}^m ri(dom(fi )),

it follows that x ∈ ri(dom(f )). By using part (c), we have for every y ∈ dom(cl f ),

(cl f )(y) = lim_{α↓0} f (y + α(x − y)) = Σ_{i=1}^m lim_{α↓0} fi (y + α(x − y)) = Σ_{i=1}^m (cl fi )(y).
1.36
1.37 (Properties of Cartesian Products)
(a) We first show that the convex hull of X is equal to the Cartesian product of
the convex hulls of the sets Xi , i = 1, . . . , m. Let y be a vector that belongs to
conv(X). Then, by definition, for some k, we have
y = Σ_{i=1}^k αi yi , with αi ≥ 0, i = 1, . . . , k, Σ_{i=1}^k αi = 1,
where yi ∈ X for all i. Since yi ∈ X, we have that yi = (xi1 , . . . , xim ) for all i,
with xi1 ∈ X1 , . . . , xim ∈ Xm . It follows that
y = Σ_{i=1}^k αi (xi1 , . . . , xim ) = (Σ_{i=1}^k αi xi1 , . . . , Σ_{i=1}^k αi xim ),
(x^1_1 , x^2_{r1} , . . . , x^m_{rm−1} ), (x^1_2 , x^2_{r1} , . . . , x^m_{rm−1} ), . . . , (x^1_{k1} , x^2_{r1} , . . . , x^m_{rm−1} ),
for all possible values of r1 , . . . , rm−1 , i.e., we fix all components except the
first one, and vary the first component over all possible x1j ’s used in the convex
combination that yields y1 . Since all these vectors belong to X, their convex
combination given by
(Σ_{j=1}^{k1} α^1_j x^1_j , x^2_{r1} , . . . , x^m_{rm−1} )
belongs to the convex hull of X for all possible values of r1 , . . . , rm−1 . Now,
consider the vectors
(Σ_{j=1}^{k1} α^1_j x^1_j , x^2_1 , . . . , x^m_{rm−1} ), . . . , (Σ_{j=1}^{k1} α^1_j x^1_j , x^2_{k2} , . . . , x^m_{rm−1} ),
i.e., fix all components except the second one, and vary the second component
over all possible x2j ’s used in the convex combination that yields y2 . Since all
these vectors belong to conv(X), their convex combination given by
(Σ_{j=1}^{k1} α^1_j x^1_j , Σ_{j=1}^{k2} α^2_j x^2_j , . . . , x^m_{rm−1} )
belongs to the convex hull of X for all possible values of r2 , . . . , rm−1 . Proceeding
in this way, we see that the vector given by
(Σ_{j=1}^{k1} α^1_j x^1_j , Σ_{j=1}^{k2} α^2_j x^2_j , . . . , Σ_{j=1}^{km} α^m_j x^m_j )
x^i_j ∈ Xj . Thus,

y = Σ_{i=1}^r β^i (x^i_1 , . . . , x^i_m ) = (Σ_{i=1}^r β^i x^i_1 , . . . , Σ_{i=1}^r β^i x^i_m ),

y = (Σ_{j=1}^{r1} β^j_1 x^j_1 , . . . , Σ_{j=1}^{rm} β^j_m x^j_m ).
belong to aff(X), and so does their sum, which is the vector y. Thus, y ∈ aff(X),
concluding the proof.
(b) Assume that y ∈ cone(X). We can represent y as
y = Σ_{i=1}^r αi y^i ,

for some r, where α1 , . . . , αr are nonnegative scalars and y^i ∈ X for all i. Since
y^i ∈ X, we have that y^i = (x^i_1 , . . . , x^i_m ) with x^i_j ∈ Xj . Thus,

y = Σ_{i=1}^r αi (x^i_1 , . . . , x^i_m ) = (Σ_{i=1}^r αi x^i_1 , . . . , Σ_{i=1}^r αi x^i_m ),
where x^j_i ∈ Xi and α^j_i ≥ 0 for each i and j. Since each Xi contains the origin,
we have that the vectors

(Σ_{j=1}^{r1} α^j_1 x^j_1 , 0, . . . , 0), (0, Σ_{j=1}^{r2} α^j_2 x^j_2 , 0, . . . , 0), . . . , (0, . . . , 0, Σ_{j=1}^{rm} α^j_m x^j_m ),
belong to cone(X), and so does their sum, which is the vector y. Thus,
y ∈ cone(X), concluding the proof.
Finally, consider the example where
Let x̄ = (x̄1 , . . . , x̄m ) ∈ ri(X). Then, by Prop. 1.4.1(c), we have that for all
x = (x1 , . . . , xm ) ∈ X, there exists some γ > 1 such that

x̄ + (γ − 1)(x̄ − x) ∈ X.

Equivalently, for all i = 1, . . . , m,

x̄i + (γ − 1)(x̄i − xi ) ∈ Xi ,

which, by Prop. 1.4.1(c), implies that x̄i ∈ ri(Xi ), i.e., x̄ ∈ ri(X1 ) × · · · × ri(Xm ).
Conversely, let x = (x1 , . . . , xm ) ∈ ri(X1 ) × · · · × ri(Xm ). The above argument
can be reversed through the use of Prop. 1.4.1(c), to show that x ∈ ri(X). Hence,
the result follows.
Finally, let us show that
RX = RX1 × · · · × RXm .
Let y = (y1 , . . . , ym ) ∈ RX . By definition, this implies that for all x ∈ X and
α ≥ 0, we have x + αy ∈ X. From this, it follows that for all xi ∈ Xi and α ≥ 0,
xi + αyi ∈ Xi , so that yi ∈ RXi , implying that y ∈ RX1 × · · · × RXm . Conversely,
let y = (y1 , . . . , ym ) ∈ RX1 × · · · × RXm . By definition, for all xi ∈ Xi and α ≥ 0,
we have xi + αyi ∈ Xi . From this, we get for all x ∈ X and α ≥ 0, x + αy ∈ X,
thus showing that y ∈ RX .
1.39 (Recession Cones of Relative Interiors)
where the equalities follow from part (a) and the assumption that C = ri(C).
To see that the inclusion R_C̄ ⊂ R_C can fail when C ≠ ri(C), consider the sets

\[
C = \bigl\{ (x_1, x_2) \mid x_1 \ge 0,\ 0 < x_2 < 1 \bigr\}, \qquad
\bar{C} = \bigl\{ (x_1, x_2) \mid x_1 \ge 0,\ 0 \le x_2 < 1 \bigr\},
\]
1.40
\[
\bar{C}_{k+1} \subset \bar{C}_k, \qquad \forall\, k,
\]

showing that assumption (1) of Prop. 1.5.6 is satisfied. Similarly, since by assumption X_k ∩ C_k is nonempty for all k, we have that, for all k, the set

\[
X \cap \bar{C}_k = X \cap X_k \cap C_k = X_k \cap C_k
\]
is nonempty, showing that assumption (2) is satisfied. Finally, let R denote the set R = ∩_{k=0}^∞ R_{C̄_k}. Since by assumption C̄_k is nonempty for all k, we have, by part (e) of the Recession Cone Theorem, that R_{C̄_k} = R_{X_k} ∩ R_{C_k}, implying that

\[
R = \bigcap_{k=0}^{\infty} R_{\bar{C}_k}
  = \bigcap_{k=0}^{\infty} \bigl( R_{X_k} \cap R_{C_k} \bigr)
  = \Bigl( \bigcap_{k=0}^{\infty} R_{X_k} \Bigr) \cap \Bigl( \bigcap_{k=0}^{\infty} R_{C_k} \Bigr)
  = R_X \cap R_C.
\]

Hence,

\[
R_X \cap R = R_X \cap R_C \subset L_C,
\]

and

\[
R_X \cap R \subset L_C \cap L_X = L,
\]

showing that assumption (3) of Prop. 1.5.6 is satisfied, and thus proving that the intersection X ∩ (∩_{k=0}^∞ C̄_k) is nonempty.
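A condition like assumption (3) cannot simply be dropped: nested nonempty closed sets can have empty intersection when a common recession direction is not a lineality direction. The one-dimensional family below is my own illustration, not taken from the text.

```python
# Hypothetical illustration (mine): C_k = [k, inf) in R.
# Each C_k is closed and nonempty, and the family is nested,
# but the common recession direction +1 is not in the lineality space {0},
# and indeed the intersection over all k is empty.

def in_Ck(x, k):
    return x >= k

# every C_k is nonempty (k itself belongs to C_k) ...
assert all(in_Ck(float(k), k) for k in range(100))

# ... yet any fixed x is eventually excluded: x is not in C_k once k > x
x = 1e6
print(in_Ck(x, int(x) + 1))  # False
```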
1.41
is closed. Since A · C ⊂ A · cl(C) and y ∈ cl(A · C), it follows that y is in the closure of A · cl(C), so that the set

\[
\bar{C}_\epsilon = \mathrm{cl}(C) \cap \bigl\{ x \mid \| Ax - y \| \le \epsilon \bigr\}
\]

is nonempty for every ε > 0. Furthermore, the recession cone of the set {x | ‖Ax − y‖ ≤ ε} coincides with the null space N(A), so that R_{C̄_ε} = R_{cl(C)} ∩ N(A). By assumption we have R_{cl(C)} ∩ N(A) = {0}, and by part (c) of the Recession Cone Theorem (cf. Prop. 1.5.1), it follows that C̄_ε is bounded for every ε > 0. Now, since the sets C̄_ε are nested nonempty compact sets, their intersection ∩_{ε>0} C̄_ε is nonempty. For any x in this intersection, we have x ∈ cl(C) and Ax − y = 0, showing that y ∈ A · cl(C). Hence, cl(A · C) ⊂ A · cl(C). The converse A · cl(C) ⊂ cl(A · C) is clear, since for any x ∈ cl(C) and sequence {x_k} ⊂ C converging to x, we have Ax_k → Ax, showing that Ax ∈ cl(A · C). Therefore,

\[
\mathrm{cl}(A \cdot C) = A \cdot \mathrm{cl}(C). \tag{1.13}
\]
Conversely, let y ∈ R_{A·cl(C)}. We will show that y ∈ A · R_{cl(C)}. This is true if y = 0, so assume that y ≠ 0. By the definition of direction of recession, there is a vector z ∈ A · cl(C) such that z + αy ∈ A · cl(C) for every α ≥ 0. Let x̄ ∈ cl(C) be such that Ax̄ = z, and for every positive integer k, let x_k ∈ cl(C) be such that Ax_k = z + ky. Since y ≠ 0, the sequence {Ax_k} is unbounded, implying that {x_k} is also unbounded (if {x_k} were bounded, then {Ax_k} would be bounded, a contradiction). Because x_k ≠ x̄ for all k, we can define

\[
u_k = \frac{x_k - \bar{x}}{\| x_k - \bar{x} \|}, \qquad \forall\, k.
\]

Let u be a limit point of {u_k}, and note that u ≠ 0. It can be seen that u is a direction of recession of cl(C) [this can be done similarly to the proof of part (c) of the Recession Cone Theorem (cf. Prop. 1.5.1)]. By taking an appropriate subsequence if necessary, we may assume without loss of generality that lim_{k→∞} u_k = u. Then, by the choices of u_k and x_k, we have

\[
Au = \lim_{k \to \infty} A u_k
   = \lim_{k \to \infty} \frac{A x_k - A \bar{x}}{\| x_k - \bar{x} \|}
   = \lim_{k \to \infty} \frac{k\, y}{\| x_k - \bar{x} \|},
\]
and the linear transformation A that maps (x_1, x_2) ∈ ℜ² into x_1 ∈ ℜ. Then, C is closed and its recession cone is

\[
R_C = \bigl\{ (x_1, x_2) \mid x_1 = 0,\ x_2 \ge 0 \bigr\},
\]
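The definition of the set C in this counterexample is not shown in this excerpt. One standard closed convex set with exactly the stated recession cone is the epigraph of x², which I assume here purely for illustration: with A(x_1, x_2) = x_1 we get A·C = ℜ, so R_{A·C} = ℜ, while A·R_C = {0}, and the inclusion R_{A·C} ⊂ A·R_C fails.

```python
# Hypothetical C (my assumption, not necessarily the set intended in the text):
# C = {(x1, x2) | x2 >= x1**2}, a closed convex set with
# R_C = {(0, x2) | x2 >= 0}. With A(x1, x2) = x1, A(C) = R.

def in_C(p):
    x1, x2 = p
    return x2 >= x1 ** 2

x = (2.0, 4.0)
# (0, 1) is a recession direction along this sample point, (1, 0) is not:
print(all(in_C((x[0], x[1] + a)) for a in (0.0, 1.0, 100.0)))  # True
print(all(in_C((x[0] + a, x[1])) for a in (0.0, 1.0, 100.0)))  # False

# every real t lies in A(C), since (t, t**2) is in C:
print(all(in_C((t, t * t)) for t in (-1e6, 0.0, 1e6)))         # True
```

So A·C is all of ℜ and recedes in both directions, even though A maps R_C to {0}.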
1.42
Let S be defined by

\[
S = R_{\mathrm{cl}(C)} \cap N(A),
\]

and note that S is a subspace of L_{cl(C)} by the given assumption. Then, by Lemma 1.5.4, we have

\[
\mathrm{cl}(C) = \bigl( \mathrm{cl}(C) \cap S^{\perp} \bigr) + S,
\]

so that the images of cl(C) and cl(C) ∩ S⊥ under A coincide [since S ⊂ N(A)], i.e.,

\[
A \cdot \mathrm{cl}(C) = A \cdot \bigl( \mathrm{cl}(C) \cap S^{\perp} \bigr). \tag{1.14}
\]

Because A · C ⊂ A · cl(C), we have

\[
\mathrm{cl}(A \cdot C) \subset \mathrm{cl}\bigl( A \cdot \mathrm{cl}(C) \bigr),
\]

which in view of Eq. (1.14) gives

\[
\mathrm{cl}(A \cdot C) \subset \mathrm{cl}\Bigl( A \cdot \bigl( \mathrm{cl}(C) \cap S^{\perp} \bigr) \Bigr).
\]

Define

\[
\bar{C} = \mathrm{cl}(C) \cap S^{\perp},
\]

so that the preceding relation becomes

\[
\mathrm{cl}(A \cdot C) \subset \mathrm{cl}(A \cdot \bar{C}). \tag{1.15}
\]

The recession cone of C̄ is given by

\[
R_{\bar{C}} = R_{\mathrm{cl}(C)} \cap S^{\perp} \tag{1.16}
\]

[cf. part (e) of the Recession Cone Theorem, Prop. 1.5.1], from which, since S = R_{cl(C)} ∩ N(A), we have

\[
R_{\bar{C}} \cap N(A) = S \cap S^{\perp} = \{0\}.
\]
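The decomposition cl(C) = (cl(C) ∩ S⊥) + S used above can be checked on a concrete instance. The set below is my own choice (not from the text): C = ℜ × [0, 1] with S = ℜ × {0}, a subspace of the lineality space, so S⊥ = {0} × ℜ and C ∩ S⊥ = {0} × [0, 1].

```python
# Hypothetical instance (mine): C = R x [0, 1], S = R x {0}.
# Then C ∩ S_perp = {0} x [0, 1], and (C ∩ S_perp) + S should equal C.

def in_C(p):
    return 0 <= p[1] <= 1

def in_decomposition(p):
    # write p = q + s with s = (p[0], 0) in S and q = (0, p[1]);
    # s is always in S, so membership reduces to q in C ∩ S_perp
    q = (0.0, p[1])
    return in_C(q)

for p in [(-5.0, 0.0), (3.0, 1.0), (0.0, 0.5), (2.0, 2.0)]:
    assert in_C(p) == in_decomposition(p)
print("decomposition agrees on samples")
```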
1.43 (Recession Cones of Vector Sums)
1.44
where the Q_{ij} are appropriately defined symmetric positive semidefinite mn × mn matrices and the a_{ij} are appropriately defined vectors in ℜ^{mn}. Hence, the set C is specified by convex quadratic inequalities. Thus, we can use Prop. 1.5.8(c) to assert that the set A · C = C_1 + ⋯ + C_m is closed.
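A closedness result such as Prop. 1.5.8(c) is genuinely needed here, since a vector sum of closed convex sets can fail to be closed. The pair of sets below is my own standard illustration, not taken from the text.

```python
# Hypothetical illustration (mine): C1 = {(x1, x2) | x1 > 0, x2 >= 1/x1}
# and C2 = R x {0} are both closed and convex, but
# C1 + C2 = R x (0, inf) is not closed: (0, eps) is in the sum for
# every eps > 0, while the limit point (0, 0) is not.

def in_sum(p):
    # (s, x2) lies in C1 + C2 iff x2 > 0 (take x1 = 1/x2 in C1)
    return p[1] > 0

print(all(in_sum((0.0, eps)) for eps in (1.0, 1e-3, 1e-9)))  # True
print(in_sum((0.0, 0.0)))                                    # False
```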
Helly’s Theorem implies that the sets C̄_i defined in the hint are nonempty. These sets are also nested and satisfy the assumptions of Props. 1.5.5 and 1.5.6. Therefore, the intersection ∩_{i=1}^∞ C̄_i is nonempty. Since

\[
\bigcap_{i=1}^{\infty} \bar{C}_i \subset \bigcap_{i=1}^{\infty} C_i,
\]