Concave and Convex Functions: 1 Basic Definitions

John Nachbar
Washington University
March 27, 2018
Concave and Convex Functions^1

1 Basic Definitions.
Definition 1. Let C ⊆ R^N be non-empty and convex and let f : C → R.
1. (a) f is concave iff for any a, b ∈ C and any θ ∈ [0, 1],
f (θa + (1 − θ)b) ≥ θf (a) + (1 − θ)f (b);
(b) f is strictly concave iff for any a, b ∈ C and any θ ∈ (0, 1), the above
inequality is strict.
2. (a) f is convex iff for any a, b ∈ C and any θ ∈ [0, 1],
f (θa + (1 − θ)b) ≤ θf (a) + (1 − θ)f (b);
(b) f is strictly convex iff for any a, b ∈ C and any θ ∈ (0, 1), the above
inequality is strict.
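As a rough numerical illustration of Definition 1 (a sketch, not part of the formal development), the concavity inequality can be tested on a finite grid of points and weights. The helper names `concavity_gap` and `looks_concave` are hypothetical, chosen only for this example. Passing the test on a grid is necessary but not sufficient for concavity; a single violation, however, disproves it.

```python
def concavity_gap(f, a, b, theta):
    """Return f(theta*a + (1-theta)*b) - [theta*f(a) + (1-theta)*f(b)].

    A non-negative gap for every sampled (a, b, theta) is consistent with
    concavity; a single negative gap disproves it.
    """
    mix = theta * a + (1 - theta) * b
    return f(mix) - (theta * f(a) + (1 - theta) * f(b))


def looks_concave(f, points, thetas, tol=1e-12):
    """Check the concavity inequality on a finite grid (a necessary test only)."""
    return all(
        concavity_gap(f, a, b, t) >= -tol
        for a in points
        for b in points
        for t in thetas
    )


points = [-2.0, -1.0, 0.0, 0.5, 3.0]
thetas = [0.0, 0.25, 0.5, 0.75, 1.0]

print(looks_concave(lambda x: -x * x, points, thetas))  # -x^2 passes the grid test
print(looks_concave(lambda x: x * x, points, thetas))   # x^2 fails it
```

Replacing f by −f in the same test checks convexity, in line with Theorem 1.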
The following equivalence is immediate from the definitions.
Theorem 1. Let C ⊆ R^N be non-empty and convex and let f : C → R. f is convex
iff −f is concave. f is strictly convex iff −f is strictly concave.
f is both concave and convex iff for any a, b ∈ C and any θ ∈ (0, 1), f(θa +
(1 − θ)b) = θf(a) + (1 − θ)f(b). A function f is affine iff there is a 1 × N matrix
A and a number y* ∈ R such that for all x ∈ C, f(x) = Ax + y*. f is linear iff it is
affine with y* = 0.
Theorem 2. f : R^N → R is affine iff it is both concave and convex.
Proof.
1. ⇒. For any a, b ∈ R^N and any θ ∈ (0, 1), f(θa + (1 − θ)b) = A(θa + (1 − θ)b) + y* =
θ(Aa + y*) + (1 − θ)(Ab + y*) = θf(a) + (1 − θ)f(b), so that f is both concave
and convex.
2. ⇐. Let y ∗ = f (0) and let g(x) = f (x) − y ∗ , so that g(0) = 0. Since f is both
concave and convex, so is g.
^1 This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 License.
• Claim: for any a ∈ R^N and any γ ≥ 0, g(γa) = γg(a).
The claim is trivially true for γ equal to either 0 or 1. Suppose γ ∈ (0, 1).
Then g(γa) = g(γa + (1 − γ)0) = γg(a) + (1 − γ)g(0) = γg(a).
On the other hand, if γ > 1, then 1/γ ∈ (0, 1) and hence g(a) =
g((1/γ)γa + (1 − 1/γ)0) = (1/γ)g(γa) + (1 − 1/γ)g(0) = (1/γ)g(γa).
Multiplying through by γ gives γg(a) = g(γa).
• Claim: for any a, b ∈ R^N, g(a + b) = g(a) + g(b).
g(a + b) = g((1/2)(2a) + (1/2)(2b)) = (1/2)g(2a) + (1/2)g(2b) = g(a) +
g(b), where the last equality comes from the previous claim.
Taking b = −a in the second claim gives g(a) + g(−a) = g(0) = 0, so g(−a) = −g(a).
Combined with the first claim, g(γa) = γg(a) for every γ ∈ R.
Construct A by setting a_n = g(e^n), where e^n = (0, . . . , 0, 1, 0, . . . , 0) is the
coordinate n unit vector. Since x = Σ_n x_n e^n, induction on the second claim
above gives
\[ g(x) = g\Big(\sum_n x_n e^n\Big) = \sum_n g(x_n e^n) = \sum_n x_n g(e^n) = \sum_n x_n a_n = Ax. \]
Finally, f(x) = g(x) + y* = Ax + y*.
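The construction in the ⇐ direction can be sketched numerically: set y* = f(0) and read off each entry of A from the value of g = f − y* at a unit vector. The function `recover_affine` and the sample function `f` below are hypothetical, introduced only for illustration, and the sketch assumes the input really is affine.

```python
def recover_affine(f, n):
    """Recover (A, y_star) with f(x) = A.x + y_star, assuming f is affine on R^n."""
    zero = [0.0] * n
    y_star = f(zero)                 # y* = f(0)
    A = []
    for i in range(n):
        e = [0.0] * n
        e[i] = 1.0                   # coordinate-i unit vector
        A.append(f(e) - y_star)      # a_i = g(e^i), where g = f - y*
    return A, y_star


def f(x):
    # Hypothetical affine function used only for illustration.
    return 2.0 * x[0] - 3.0 * x[1] + 5.0


A, y_star = recover_affine(f, 2)
x = [1.5, -4.0]
reconstructed = sum(a_i * x_i for a_i, x_i in zip(A, x)) + y_star
print(A, y_star, reconstructed, f(x))
```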
For N = 1, the next result says that a function is concave iff, informally, its
slope is weakly decreasing. If the function is differentiable then the implication is
that the derivative is weakly decreasing.
Theorem 3. Let C ⊆ R be an open interval.
1. f : C → R is concave iff for any a, b, c ∈ C, with a < b < c,
\[ \frac{f(b) - f(a)}{b - a} \ge \frac{f(c) - f(b)}{c - b}, \]
and,
\[ \frac{f(b) - f(a)}{b - a} \ge \frac{f(c) - f(a)}{c - a}. \]
For strict concavity, the inequalities are strict.
2. f : C → R is convex iff for any a, b, c ∈ C, with a < b < c,
\[ \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(b)}{c - b}, \]
and,
\[ \frac{f(b) - f(a)}{b - a} \le \frac{f(c) - f(a)}{c - a}. \]
For strict convexity, the inequalities are strict.
Proof. Take any a, b, c ∈ C, a < b < c. Since b − a > 0 and c − b > 0, the first inequality
in part 1 of the theorem holds iff
[f(b) − f(a)](c − b) ≥ [f(c) − f(b)](b − a),
which holds iff (collecting terms in f(b)),
f(b)(c − a) ≥ f(a)(c − b) + f(c)(b − a),
which (since c − a > 0) holds iff
\[ f(b) \ge \frac{c - b}{c - a} f(a) + \frac{b - a}{c - a} f(c). \]
Take θ = (c − b)/(c − a) ∈ (0, 1) and verify that, indeed, b = θa + (1 − θ)c. Then
the last inequality holds since f is concave. Conversely, the preceding argument
shows that if the first inequality in part 1 holds then f is concave (take any a < c, any
θ ∈ (0, 1), and let b = θa + (1 − θ)c). The proofs of the other claims are similar.
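The slope characterization in Theorem 3 is easy to check by hand for a specific function. The following sketch computes the three chord slopes for the concave function √x at a = 1, b = 4, c = 9; the helper names `chord_slope` and `three_point_slopes` are hypothetical.

```python
import math


def chord_slope(f, s, t):
    """Slope of the chord of f between s and t (s != t)."""
    return (f(t) - f(s)) / (t - s)


def three_point_slopes(f, a, b, c):
    """Return the chord slopes over [a, b], [a, c], and [b, c], for a < b < c."""
    return chord_slope(f, a, b), chord_slope(f, a, c), chord_slope(f, b, c)


ab, ac, bc = three_point_slopes(math.sqrt, 1.0, 4.0, 9.0)
print(ab, ac, bc)
```

Here the slope over [a, b] is 1/3, which exceeds both the slope over [b, c] (1/5) and the slope over [a, c] (1/4), as part 1 of the theorem requires for a strictly concave function.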
It is also possible to characterize concavity or convexity of functions in terms of
the convexity of particular sets. Given a function f : C → R, the hypograph of f,
written hyp f, is the set of points that lie on or below the graph of f, while the
epigraph of f, written epi f, is the set of points that lie on or above the graph of
f.^2 Formally,
epi f = {(x, y) ∈ R^{N+1} : x ∈ C, y ≥ f(x)},
hyp f = {(x, y) ∈ R^{N+1} : x ∈ C, y ≤ f(x)}.
Theorem 4. Let C ⊆ R^N be convex and let f : C → R.
1. f is concave iff hyp f is convex.
2. f is convex iff epi f is convex.
Proof. Suppose that f is concave. I will show that hyp f is convex. Take any
z_1, z_2 ∈ hyp f and any θ ∈ [0, 1]. Then there are a, b ∈ C and y_1, y_2 ∈ R, such
that z_1 = (a, y_1), z_2 = (b, y_2), with f(a) ≥ y_1, f(b) ≥ y_2. By concavity of f,
f(θa + (1 − θ)b) ≥ θf(a) + (1 − θ)f(b). Hence f(θa + (1 − θ)b) ≥ θy_1 + (1 − θ)y_2. The
latter says that the point θz_1 + (1 − θ)z_2 = (θa + (1 − θ)b, θy_1 + (1 − θ)y_2) ∈ hyp f,
as was to be shown. The other directions are similar.
^2 "Hypo" means "under" and "epi" means "over." A hypodermic needle goes under your skin,
the top layer of which is your epidermis.
2 Concavity, Convexity, and Continuity.
Theorem 5. Let C ⊆ R^N be non-empty, open and convex and let f : C → R be
either convex or concave. Then f is continuous.
Proof. Let f be concave. Consider first the case N = 1. Theorem 3 implies that
for any a, b, c ∈ C, with a < b < c, the graph of f is sandwiched between the graphs
of two lines through the point (b, f(b)), one line through the points (a, f(a)) and
(b, f(b)) and the other through the points (b, f(b)) and (c, f(c)). Explicitly, Theorem
3 implies that for all x ∈ [a, b],
\[ f(b) - \frac{f(b) - f(a)}{b - a}(b - x) \le f(x) \le f(b) - \frac{f(c) - f(b)}{c - b}(b - x), \tag{1} \]
and for all x ∈ [b, c],
\[ f(b) + \frac{f(b) - f(a)}{b - a}(x - b) \ge f(x) \ge f(b) + \frac{f(c) - f(b)}{c - b}(x - b). \tag{2} \]
These inequalities imply that f is continuous at b, since in each case both bounds
converge to f(b) as x → b.
The argument for general N uses the following additional fact from the 1-dimensional
case. Assume that b lies in the center of the [a, c] line segment, so
that b − a = c − b. If f(a) ≤ f(c) then for any x ∈ [a, c],
\[ |f(b) - f(x)| \le f(b) - f(a). \tag{3} \]
If f(c) ≤ f(a), then the analog of (3) holds with f(b) − f(c) on the right-hand side.
Inequality (3) may be obvious from the "sandwich" characterization above, but
for the sake of completeness, here is a detailed argument. First, note that concavity
implies that f(b) ≥ (1/2)f(a) + (1/2)f(c) ≥ f(a). Next, the first inequality in (1) implies that for all
x ∈ [a, b],
\[ f(b) - f(x) \le \frac{f(b) - f(a)}{b - a}(b - x) \le f(b) - f(a), \tag{4} \]
where the second inequality in (4) follows since f(b) − f(a) ≥ 0 and since (b − x)/(b − a)
is non-negative and has a maximum value of 1 on [a, b]. Similarly, the second
inequality in (1) implies that
\[ -(f(b) - f(x)) \le -\frac{f(c) - f(b)}{c - b}(b - x) \le f(b) - f(a), \tag{5} \]
where the second inequality in (5) follows trivially if −(f(c) − f(b)) ≤ 0, since
f(b) − f(a) ≥ 0; if −(f(c) − f(b)) > 0, then the inequality follows since −f(c) ≤
−f(a), since c − b = b − a, and since (b − x)/(b − a) is non-negative and has a maximum value of 1 on
[a, b]. Similar arguments apply if x ∈ [b, c]. Combining all this gives inequality
(3).
To complete the proof of continuity, take any x* ∈ C and consider the (hyper)cube
formed by the 2N vertices of the form x* + (1/t)e^n and x* − (1/t)e^n, where e^n
is the unit vector for coordinate n and where t ∈ {1, 2, . . . } is large enough that this
cube lies in C. Let v_t be the vertex that minimizes f across the 2N vertices. For
any x in the cube, concavity implies f(v_t) ≤ f(x). Take any x in the cube. Then for
the line segment given by the intersection of the cube with the line through x and
x*, inequality (3) implies, since f(v_t) is less than or equal to the minimum value of
f along this line segment, and since x* lies in the center of this line segment,
|f(x) − f(x*)| ≤ f(x*) − f(v_t). (6)
Since there are only a finite number of coordinates, it is possible to find a subsequence
{v_{t_k}} lying on a single axis. Continuity in the 1-dimensional case, established
above, implies f(v_{t_k}) → f(x*). Inequality (6) then implies continuity at x*.
Finally, for convex f, −f is concave, hence −f is continuous, and f is continuous
iff −f is continuous.
For functions defined on non-open sets, continuity can fail at the boundary. In
particular, if the domain is a closed interval in R, then concave functions can jump
down at end points and convex functions can jump up.
Example 1. Let C = [0, 1] and define
\[ f(x) = \begin{cases} -x^2 & \text{if } x > 0, \\ -1 & \text{if } x = 0. \end{cases} \]
Then f is concave. It is lower semi-continuous on [0, 1] and continuous on (0, 1].
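A quick numerical look at Example 1 (a sketch on a finite grid, so it illustrates rather than proves the claim): the concavity inequality holds at every sampled pair of points, including pairs involving the end point 0, while the function value at 0 sits well below the values immediately to its right. The helper `concavity_gap` is a hypothetical name.

```python
def f(x):
    """Example 1: -x^2 on (0, 1], with the value -1 at x = 0."""
    return -1.0 if x == 0 else -x * x


def concavity_gap(a, b, theta):
    """f at the convex combination minus the same combination of the values."""
    return f(theta * a + (1 - theta) * b) - (theta * f(a) + (1 - theta) * f(b))


points = [0.0, 0.25, 0.5, 1.0]
thetas = [0.25, 0.5, 0.75]
gaps = [concavity_gap(a, b, t) for a in points for b in points for t in thetas]

print(min(gaps) >= 0)     # consistent with concavity on this grid
print(f(0.0), f(1e-9))    # the value at 0 versus a value just to its right
```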

Remark 1. The proof of Theorem 5 makes explicit use of the fact that the domain
is finite dimensional. The theorem does not generalize to domains that are arbitrary
vector metric spaces. In particular, there are infinite dimensional vector space
domains for which even some linear functions (which are both concave and convex)
are not continuous.
3 Concavity, Convexity, and Differentiability.

A differentiable function is concave iff it lies on or below the tangent line (or plane,
for N > 1) at any point.
Theorem 6. Let C ⊆ R^N be non-empty, open and convex and let f : C → R be
differentiable.
1. f is concave iff for any x*, x ∈ C
f(x) ≤ ∇f(x*) · (x − x*) + f(x*). (7)
2. f is convex iff for any x*, x ∈ C
f(x) ≥ ∇f(x*) · (x − x*) + f(x*). (8)
Proof. If f is concave then for any x, x* ∈ C, x ≠ x*, and any θ ∈ (0, 1), f(θx +
(1 − θ)x*) ≥ θf(x) + (1 − θ)f(x*), or, dividing by θ and rearranging,
\[ f(x) - f(x^*) \le \frac{f(x^* + \theta(x - x^*)) - f(x^*)}{\theta}. \]
As θ ↓ 0, the right-hand side converges to the directional derivative ∇f(x*) · (x − x*).
Taking this limit and rearranging yields inequality (7).
Conversely, consider any a, b ∈ C, take any θ ∈ (0, 1), and let x* = θa + (1 − θ)b.
Note that a − x* = −(1 − θ)(b − a) and b − x* = θ(b − a). Therefore, by inequality
(7),
f(a) ≤ ∇f(x*) · [−(1 − θ)(b − a)] + f(x*),
f(b) ≤ ∇f(x*) · [θ(b − a)] + f(x*).
Multiplying the first by θ > 0 and the second by 1 − θ > 0, and adding, yields
θf(a) + (1 − θ)f(b) ≤ f(x*), as was to be shown.
The proofs for inequality (8) are analogous.
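Inequality (7) can be illustrated for N = 1 with f(x) = ln(x), whose derivative is 1/x. The sketch below checks on a small grid that the function lies on or below each of its tangent lines; the helper `tangent_value` is a hypothetical name, and a small tolerance absorbs floating-point error.

```python
import math


def tangent_value(f, df, x_star, x):
    """Value at x of the tangent line to f at x_star: f(x*) + f'(x*)(x - x*)."""
    return f(x_star) + df(x_star) * (x - x_star)


f = math.log
df = lambda x: 1.0 / x   # derivative of ln

grid = [0.1, 0.5, 1.0, 2.0, 10.0]
below_tangent = all(
    f(x) <= tangent_value(f, df, x_star, x) + 1e-12
    for x_star in grid
    for x in grid
)
print(below_tangent)
```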
Remark 2. If f is concave then a version of inequality (7) holds even if f is not
differentiable (e.g., f(x) = −|x|), and analogously for inequality (8) if f is convex.
Explicitly, recall from Theorem 4 that f is concave iff hyp f, the set lying on or
below the graph of f, is convex. Therefore, by the Supporting Hyperplane Theorem
for R^{N+1}, for any x* ∈ C, there is a vector (v, w) ∈ R^{N+1}, v ∈ R^N, w ∈ R, (v, w) ≠ 0,
such that for all (x, y) ∈ hyp f,
(v, w) · (x, y) ≥ (v, w) · (x*, f(x*)). (9)
If w > 0, this inequality will be violated for (x*, y) for any y < 0 of sufficiently large
magnitude. By definition of hyp f, there will be such (x*, y) ∈ hyp f. Therefore,
w ≤ 0. Moreover, if w = 0 (and v ≠ 0 since (v, w) ≠ 0), then inequality (9) will be
violated at any (x, f(x)) with x = x* − γv, with γ > 0 small enough that x ∈ C.
Therefore, w < 0.
Since w < 0, I can assume w = −1: inequality (9) holds for (v, w) iff it holds
for γ(v, w), for any γ > 0; take γ = 1/|w|. With (v, −1) as the supporting vector,
inequality (9) then implies, taking (x, y) = (x, f(x)) and rearranging,
f(x) ≤ v · (x − x*) + f(x*). (10)
If f is differentiable at x*, then ∇f(x*) is the unique v for which (10) holds for all
x. If f is not differentiable at x* then there will be a continuum of vectors, called
subgradients, for which inequality (10) holds.
It is easy to verify that the set of subgradients is closed and convex. In the case
N = 1, a subgradient is just a number and the set of subgradients is particularly easy
to characterize. Explicitly, for any x* ∈ C, Theorem 3 implies that the left-hand
and right-hand derivatives at x*,
\[ \overline{m} = \lim_{x \uparrow x^*} \frac{f(x) - f(x^*)}{x - x^*}, \]
\[ \underline{m} = \lim_{x \downarrow x^*} \frac{f(x) - f(x^*)}{x - x^*}, \]
are well defined even if f is not differentiable at x*, with $\underline{m} \le \overline{m}$ since f is concave.
The set of subgradients at x* is $[\underline{m}, \overline{m}]$; if f is differentiable at x* then
$\underline{m} = \overline{m} = Df(x^*)$.
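For the non-differentiable concave function f(x) = −|x| mentioned in Remark 2, the left-hand and right-hand derivatives at 0 can be approximated by one-sided difference quotients, and candidate subgradients can be tested against inequality (10) on a grid. This is only a sketch: the names `one_sided_slopes` and `is_subgradient` are hypothetical, and a grid test can reject a candidate but cannot fully certify one.

```python
def one_sided_slopes(f, x_star, h=1e-6):
    """Difference quotients approximating the left- and right-hand derivatives."""
    left = (f(x_star - h) - f(x_star)) / (-h)
    right = (f(x_star + h) - f(x_star)) / h
    return left, right


def is_subgradient(f, x_star, m, xs, tol=1e-12):
    """Test inequality (10), f(x) <= m (x - x*) + f(x*), at each sampled x."""
    return all(f(x) <= m * (x - x_star) + f(x_star) + tol for x in xs)


f = lambda x: -abs(x)
left, right = one_sided_slopes(f, 0.0)
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]

print(left, right)                     # left-hand and right-hand slopes at 0
print(is_subgradient(f, 0.0, 0.5, xs))   # 0.5 lies between the two slopes
print(is_subgradient(f, 0.0, 1.5, xs))   # 1.5 lies outside them
```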
Subgradients play an important role in some parts of economic theory, but I will
not be pursuing them here.
For differentiable functions on R (N = 1), Theorem 3 implies that a function
is concave iff its derivative is weakly decreasing. For twice differentiable functions,
the derivative is weakly decreasing iff the second derivative is weakly negative everywhere.
The following result, Theorem 7, records this fact and generalizes it to
N > 1. Recall that a symmetric N × N matrix A is negative semi-definite iff, for
any v ∈ R^N, v′Av ≤ 0. The matrix is negative definite iff, for any v ∈ R^N, v ≠ 0,
v′Av < 0. The definitions of positive semi-definite and positive definite are analogous.
As is standard practice, I write D^2 f(x) for the Hessian, D^2 f(x) = D(∇f)(x),
which is the N × N matrix of second order partial derivatives. By Young's Theorem,
if f is C^2 then D^2 f(x) is symmetric.
Theorem 7. Let C ⊆ R^N be non-empty, open and convex and let f : C → R be C^2.
1. (a) D^2 f(x) is negative semi-definite for every x ∈ C iff f is concave.
(b) If D^2 f(x) is negative definite for every x ∈ C then f is strictly concave.
2. (a) D^2 f(x) is positive semi-definite for every x ∈ C iff f is convex.
(b) If D^2 f(x) is positive definite for every x ∈ C then f is strictly convex.
Proof.
1. N = 1. In this case, D^2 f(x) ∈ R, hence D^2 f(x) is negative semi-definite iff
D^2 f(x) ≤ 0.
Consider first the ⇒ direction of 1(a). If D^2 f(x) ≤ 0 for all x then Df is
weakly decreasing on C, which implies (by the Mean Value Theorem) that for any
a, b, c ∈ C, a < b < c,
\[ \frac{f(b) - f(a)}{b - a} \ge \frac{f(c) - f(b)}{c - b}, \]
which, by Theorem 3, implies that f is concave. The proof of 1(b) is almost
identical, as is the proof of 2(b) and the ⇒ direction of 2(a).
It remains to prove the ⇐ direction of 1(a) and 2(a). Consider the ⇐ direction
of 1(a). I argue by contraposition. Suppose that D^2 f(x*) > 0 for some x* ∈ C.
Since f is C^2, D^2 f(x) > 0 for every x in some open interval containing x*.
Then Theorem 3 implies that f is not concave (it is, in fact, strictly convex)
for x in this interval. The proof of the ⇐ direction of 2(a) is similar.
2. N > 1. Then D^2 f(x) is a symmetric matrix. I will show 1(b). The other
cases are similar.
Suppose, therefore, that D^2 f(x) is negative definite for all x ∈ C. Consider
any a, b ∈ C, b ≠ a, and any θ ∈ (0, 1). Let x_θ = θa + (1 − θ)b. To show 1(b),
I need to show that f(x_θ) > θf(a) + (1 − θ)f(b).
Let g(θ) = b + θ(a − b), let h(θ) = f(g(θ)) = f(b + θ(a − b)), and let v = a − b.
By the N = 1 step above, a sufficient condition for the strict concavity of h is
that D^2 h(θ) < 0. The interpretation is that D^2 h(θ) is the second derivative
of f, evaluated at x_θ, in the direction v = a − b.
By the Chain Rule, for any θ ∈ (0, 1), Dh(θ) = Df(g(θ))Dg(θ) = Df(g(θ))v =
∇f(g(θ)) · v. Also by the Chain Rule (and the symmetry of D^2 f),
D[∇f(g(θ)) · v] = [D^2 f(x_θ)v] · v = v′ D^2 f(x_θ) v.
Hence,
D^2 h(θ) = v′ D^2 f(x_θ) v.
Therefore, if v′ D^2 f(x_θ) v < 0 for every θ ∈ (0, 1) then h is strictly concave on
(0, 1), which implies that h(θ) > θh(1) + (1 − θ)h(0), which implies f(x_θ) >
θf(a) + (1 − θ)f(b), as was to be shown.
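For N = 2 the negative definiteness condition in Theorem 7 can be checked by hand. The sketch below evaluates the quadratic form v′Hv and applies the standard leading-principal-minor test for a symmetric 2 × 2 matrix (H_11 < 0 and det H > 0). The example matrix is the Hessian of the hypothetical function f(x_1, x_2) = −x_1^2 − x_1 x_2 − x_2^2, chosen only for illustration; the function names are likewise hypothetical.

```python
def quad_form(H, v):
    """Return v' H v for a square matrix H (list of rows) and a vector v."""
    n = len(v)
    return sum(v[i] * H[i][j] * v[j] for i in range(n) for j in range(n))


def is_negative_definite_2x2(H):
    """Leading-principal-minor test for a symmetric 2x2 matrix."""
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return H[0][0] < 0 and det > 0


# Hessian of f(x1, x2) = -x1^2 - x1*x2 - x2^2 (constant in x).
H = [[-2.0, -1.0], [-1.0, -2.0]]
print(is_negative_definite_2x2(H))     # negative definite
print(quad_form(H, [1.0, -1.0]))       # second derivative in direction (1, -1)

# A matrix that is negative semi-definite but not negative definite.
H_semi = [[0.0, 0.0], [0.0, -2.0]]
print(is_negative_definite_2x2(H_semi))
```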
Say that f is differentiably strictly concave iff D^2 f(x) is negative definite for
every x. If N = 1 then f is differentiably strictly concave iff D^2 f(x) < 0 for every
x. The definition of differentiable strict convexity is analogous.
Example 2. Consider f : R → R, f(x) = −x^4. f is strictly concave but fails
differentiable strict concavity since D^2 f(0) = 0.
Remark 3. To reiterate a point made in the proof of Theorem 7, if f is C^2 and D^2 f
is negative definite at the point x* then D^2 f(x) is negative definite for every x in
some open ball around x*, and hence f is strictly concave in some open ball around
x*. In words, if f is differentiably strictly concave at a point then it is differentiably
strictly concave near that point.
If, on the other hand, D^2 f is merely negative semi-definite at x* then we cannot
infer anything about the concavity or convexity of f near x*. For example, if
f : R → R is given by f(x) = x^4 then D^2 f(0) = 0, which is negative semi-definite,
but f is not concave; it is, in fact, strictly convex.
4 Facts about Concave and Convex Functions.
Recall that f : R^N → R is affine iff it is of the form f(x) = Ax + b for some 1 × N
matrix A and some point b ∈ R. Geometrically, the graph of a real-valued affine
function is a plane (a line, if the domain is R). An important elementary fact is that
real-valued affine functions are both concave and convex. This is consistent with
the fact that the second derivative of any affine function is the zero matrix.
Showing that other functions are concave or convex typically requires work. For
N = 1, Theorem 7 can be used to show that many standard functions are concave,
strictly concave, and so on.
Example 3. All of the following claims can be verified with a simple calculation.
1. e^x is strictly convex on R,
2. ln(x) is strictly concave on R_{++},
3. 1/x is strictly convex on R_{++} and strictly concave on R_{−−},
4. x^t, where t is an integer greater than 1, is strictly convex on R_+. On R_−, x^t
is strictly convex for t even and strictly concave for t odd.
5. x^α, where α is a real number in (0, 1), is strictly concave on R_+.
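The "simple calculation" in each case is a second derivative. As a numerical companion (a sketch using central finite differences at a few sample points, which suggests but does not prove the sign of the second derivative), several of the claims can be spot-checked as follows. The helper `second_difference` and the step size are assumptions of this example.

```python
import math


def second_difference(f, x, h=1e-3):
    """Central finite-difference estimate of the second derivative of f at x."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)


checks = {
    "exp on R": all(second_difference(math.exp, x) > 0 for x in (-1.0, 0.0, 2.0)),
    "ln on R++": all(second_difference(math.log, x) < 0 for x in (0.5, 1.0, 3.0)),
    "1/x on R++": all(second_difference(lambda t: 1.0 / t, x) > 0 for x in (0.5, 2.0)),
    "1/x on R--": all(second_difference(lambda t: 1.0 / t, x) < 0 for x in (-0.5, -2.0)),
    "x^0.5 on R++": all(second_difference(math.sqrt, x) < 0 for x in (0.5, 2.0)),
}
for name, ok in checks.items():
    print(name, ok)
```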

One can often verify the concavity of other, more complicated functions by
decomposing the functions into simpler pieces. The following results help do this.
Theorem 8. Let C ⊆ R^N be non-empty and convex. Let f : C → R be concave.
Let D be any interval containing f(C) and let g : D → R be concave and weakly
increasing. Then h : C → R defined by h(x) = g(f(x)) is concave. Moreover, if f
is strictly concave and g is strictly increasing then h is strictly concave. Analogous
claims hold for f and g convex (again with g increasing).
Proof. Consider any a, b ∈ C and any θ ∈ [0, 1]. Let x_θ = θa + (1 − θ)b. Since f is
concave,
f(x_θ) ≥ θf(a) + (1 − θ)f(b).
Then
h(x_θ) = g(f(x_θ)) ≥ g(θf(a) + (1 − θ)f(b))
≥ θg(f(a)) + (1 − θ)g(f(b))
= θh(a) + (1 − θ)h(b),
where the first inequality comes from the fact that f is concave and g is weakly
increasing and the second inequality comes from the fact that g is concave. The other
parts of the proof are essentially identical.
Example 4. Let the domain be R_{++}. Consider h(x) = e^{1/x}. Let f(x) = 1/x and let
g(y) = e^y. Then h(x) = g(f(x)). Function f is strictly convex and g is (strictly)
convex and strictly increasing. Therefore, by Theorem 8, h is strictly convex.
It is important in Theorem 8 that g be increasing.
Example 5. Let the domain be R. Consider h(x) = e^{−x^2}. This is just the density of a
normal distribution with mean 0 and variance 1/2, except that it is off by a factor of 1/√π. Let f(x) = e^{x^2} and let
g(y) = 1/y. Then h(x) = g(f(x)). Now, f is convex on R_{++} (indeed, on R) and g
is also convex on R_{++}. The function h is not, however, convex. While it is strictly
convex for |x| sufficiently large, for x near zero it is strictly concave. This does not
contradict Theorem 8 because g here is decreasing.
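The curvature claim in Example 5 can be confirmed with a direct calculation. The following short derivation is a worked check, under the reading h(x) = e^{−x^2}:

```latex
\begin{align*}
h(x)   &= e^{-x^2}, \\
h'(x)  &= -2x\, e^{-x^2}, \\
h''(x) &= (4x^2 - 2)\, e^{-x^2}.
\end{align*}
```

Hence h''(x) < 0 iff |x| < 1/√2 and h''(x) > 0 iff |x| > 1/√2, so h is strictly concave near zero and strictly convex on each of the two tails.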
Theorem 9. Let C ⊆ R be an interval and let f : C → R be concave and strictly
positive for all x ∈ C. Then h : C → R defined by h(x) = 1/f(x) is convex.
Proof. Follows from Theorem 8, the fact that if f is concave then −f is convex,
and the fact that the function g(x) = −1/x is convex and increasing on R_{−−}.
Theorem 10. Let C ⊆ R be an open interval. If f : C → R is strictly increasing
or decreasing then f^{-1} : f(C) → C is well defined. If, in addition, f is concave or
convex, then f(C) is convex and the following hold.
1. If f is concave and strictly increasing then f^{-1} is convex.
2. If f is concave and strictly decreasing then f^{-1} is concave.
3. If f is convex and strictly increasing then f^{-1} is concave.
4. If f is convex and strictly decreasing then f^{-1} is convex.
Proof. For 1, by Theorem 5, f is continuous. It follows (see the notes on connected
sets) that f(C) is an interval, and hence is convex. Let y, ŷ be any two points in
f(C) and let x = f^{-1}(y), x̂ = f^{-1}(ŷ). Take any θ ∈ [0, 1]. Then, since f is concave,
f(θx + (1 − θ)x̂) ≥ θf(x) + (1 − θ)f(x̂)
= θy + (1 − θ)ŷ.
Taking the inverse of both sides yields, since f is strictly increasing (so that f^{-1} is
also strictly increasing), and since x = f^{-1}(y), x̂ = f^{-1}(ŷ),
θf^{-1}(y) + (1 − θ)f^{-1}(ŷ) = θx + (1 − θ)x̂
≥ f^{-1}(θy + (1 − θ)ŷ).
Since y, ŷ, and θ were arbitrary, this implies that f^{-1} is convex. The other results
are similar. Note in particular that if f is concave but strictly decreasing then the
last inequality above flips, and f^{-1} is concave.
Example 6. Let C = (0, ∞) and let f(x) = ln(x). This function is (strictly) concave
and strictly increasing. Its inverse is f^{-1}(y) = e^y, which is (strictly) convex and
strictly increasing.
Example 7. Let C = (−∞, 0) and let f(x) = ln(−x). This function is (strictly)
concave and strictly decreasing. Its inverse is f^{-1}(y) = −e^y, which is (strictly)
concave and strictly decreasing.
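Examples 6 and 7 can be spot-checked numerically. The sketch below compares each inverse at the midpoint of two points with the average of its values at those points: a negative gap is what strict convexity predicts and a positive gap is what strict concavity predicts. The helper `midpoint_gap` and the sample pairs are assumptions of this illustration.

```python
import math


def midpoint_gap(g, y1, y2):
    """g at the midpoint of y1 and y2 minus the average of g(y1) and g(y2)."""
    return g(0.5 * (y1 + y2)) - 0.5 * (g(y1) + g(y2))


inv6 = math.exp                  # inverse of ln(x) on (0, inf), Example 6
inv7 = lambda y: -math.exp(y)    # inverse of ln(-x) on (-inf, 0), Example 7

pairs = [(-1.0, 2.0), (0.0, 3.0), (-2.0, -0.5)]

print(all(midpoint_gap(inv6, y1, y2) < 0 for y1, y2 in pairs))  # convex inverse
print(all(midpoint_gap(inv7, y1, y2) > 0 for y1, y2 in pairs))  # concave inverse
print(math.log(-inv7(1.5)))      # applying f to the inverse recovers about 1.5
```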
Theorem 11. Let C ⊆ R^N be non-empty and convex. Let f_1 : C → R and f_2 :
C → R be concave.
1. The function f_1 + f_2 is concave. Moreover, if either f_1 or f_2 is strictly concave
then f_1 + f_2 is strictly concave.
2. For any r ≥ 0, the function rf_1 is concave. Moreover, if f_1 is strictly concave
then for any r > 0, rf_1 is strictly concave.
Analogous claims hold if f_1, f_2 are convex.
Proof. Omitted. Almost immediate from the definition of concavity.
A special case in which N > 1 is effectively as easy to analyze as N = 1 is when
the function is separable in the sense that it is the sum of univariate functions. For
example, the function h(x_1, x_2) = e^{x_1} + e^{x_2} is separable. More generally, suppose
C_n ⊆ R are intervals and let C = ∏_n C_n. Then h : C → R is separable iff there are
functions f_n : C_n → R such that
\[ h(x) = \sum_{n=1}^{N} f_n(x_n). \]
Theorem 12. Let C_n ⊆ R be intervals, let C = ∏_n C_n, and for each n let f_n :
C_n → R. Define the separable function h : C → R by h(x) = Σ_n f_n(x_n). Then h
is concave iff every f_n is concave and h is strictly concave iff every f_n is strictly
concave. If every f_n is C^2 then h is differentiably strictly concave iff every f_n is
differentiably strictly concave. And analogous statements hold for convexity.
Proof. The claims for concavity and strict concavity are almost immediate. For
differentiable strict concavity, note that
\[ D^2 h(x^*) = \begin{pmatrix}
D^2 f_1(x_1^*) & 0 & \cdots & 0 & 0 \\
0 & D^2 f_2(x_2^*) & \cdots & 0 & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & D^2 f_{N-1}(x_{N-1}^*) & 0 \\
0 & 0 & \cdots & 0 & D^2 f_N(x_N^*)
\end{pmatrix}. \]
This matrix is negative definite iff the diagonal terms are all strictly negative, which
is equivalent to saying that the f_n are all differentiably strictly concave.
Theorem 12 is superficially similar to Theorem 11 but there are important differences,
as illustrated in the following example.
Example 8. Consider h(x_1, x_2) = e^{x_1} + e^{x_2}. Since h is separable, Theorem 12, and
the fact that the second derivative of e^x is always strictly positive, implies that h is
strictly convex.
I can also employ Theorem 11, but I reach the weaker conclusion that h is convex,
rather than strictly convex. Explicitly, I can view e^{x_1} not as a function on R but
as a function on R^2 where the second coordinate is simply ignored. Viewed as a
function on R^2, e^{x_1} is convex (one can use Theorem 8 to show this) but not strictly
convex. Because the e^{x_n} are convex but not strictly convex, Theorem 11 implies
only that h is convex, not strictly convex.
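The distinction in Example 8 can be seen numerically. Viewed as a function on R^2, e^{x_1} is constant along the second coordinate, so the convexity inequality holds with equality between two points that differ only in x_2; for the full sum h the same pair gives a strict inequality. The sketch below uses a hypothetical helper `convexity_gap` and a pair of points chosen only for illustration.

```python
import math


def convexity_gap(f, a, b, theta=0.5):
    """theta*f(a) + (1-theta)*f(b) - f(theta*a + (1-theta)*b).

    The gap is non-negative for convex f and strictly positive, for a != b and
    theta in (0, 1), when f is strictly convex.
    """
    mix = [theta * ai + (1 - theta) * bi for ai, bi in zip(a, b)]
    return theta * f(a) + (1 - theta) * f(b) - f(mix)


first_term = lambda x: math.exp(x[0])            # e^{x1}, ignoring x2
h = lambda x: math.exp(x[0]) + math.exp(x[1])    # the separable sum

a, b = [0.0, 0.0], [0.0, 2.0]   # two points that differ only in x2

print(convexity_gap(first_term, a, b))   # zero gap: not strictly convex on R^2
print(convexity_gap(h, a, b) > 0)        # strict inequality for the sum
```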