Nonlinear Functional Analysis: Gerald Teschl

Nonlinear Functional Analysis
Gerald Teschl
Gerald Teschl
Fakultät für Mathematik
Nordbergstraße 15
Universität Wien
1090 Wien, Austria
E-mail address: Gerald.Teschl@univie.ac.at
URL: http://www.mat.univie.ac.at/~gerald/
1991 Mathematics subject classification. 46-01, 47H10, 47H11, 58Fxx, 76D05
Abstract. This manuscript provides a brief introduction to nonlinear functional
analysis.
We start out with calculus in Banach spaces, review differentiation and integra-
tion, derive the implicit function theorem (using the uniform contraction principle)
and apply the result to prove existence and uniqueness of solutions for ordinary
differential equations in Banach spaces.
Next we introduce the mapping degree in both finite (Brouwer degree) and in-
finite dimensional (Leray-Schauder degree) Banach spaces. Several applications to
game theory, integral equations, and ordinary differential equations are discussed.
As an application we consider partial differential equations and prove existence
and uniqueness for solutions of the stationary Navier-Stokes equation.
Finally, we give a brief discussion of monotone operators.
Keywords and phrases. Mapping degree, fixed-point theorems, differential equations,
Navier–Stokes equation.
Typeset by LATEX and Makeindex.

Version: October 13, 2005
Copyright © 1998-2004 by Gerald Teschl
Preface
The present manuscript was written for my course Nonlinear Functional Analysis
held at the University of Vienna in Summer 1998 and 2001. It is supposed to give
a brief introduction to the field of Nonlinear Functional Analysis with emphasis
on applications and examples. The material covered is highly selective and many
important and interesting topics are not covered.
It is available from
http://www.mat.univie.ac.at/~gerald/ftp/book-nlfa/
Acknowledgments
I’d like to thank Volker Enß for making his lecture notes available to me and
Matthias Hammerl for pointing out errors in previous versions.
Gerald Teschl
Vienna, Austria
February 2001
Contents
Preface
1 Analysis in Banach spaces
1.1 Differentiation and integration in Banach spaces
1.2 Contraction principles
1.3 Ordinary differential equations
2 The Brouwer mapping degree
2.1 Introduction
2.2 Definition of the mapping degree and the determinant formula
2.3 Extension of the determinant formula
2.4 The Brouwer fixed-point theorem
2.5 Kakutani’s fixed-point theorem and applications to game theory
2.6 Further properties of the degree
2.7 The Jordan curve theorem
3 The Leray–Schauder mapping degree
3.1 The mapping degree on finite dimensional Banach spaces
3.2 Compact operators
3.3 The Leray–Schauder mapping degree
3.4 The Leray–Schauder principle and the Schauder fixed-point theorem
3.5 Applications to integral and differential equations
4 The stationary Navier–Stokes equation
4.1 Introduction and motivation
4.2 An insert on Sobolev spaces
4.3 Existence and uniqueness of solutions
5 Monotone operators
5.1 Monotone operators
5.2 The nonlinear Lax–Milgram theorem
5.3 The main theorem of monotone operators
Bibliography
Glossary of notations
Index
Chapter 1
Analysis in Banach spaces
1.1 Differentiation and integration in Banach spaces
We first review some basic facts from calculus in Banach spaces.
Let X and Y be two Banach spaces and denote by C(X, Y ) the set of continuous
functions from X to Y and by L(X, Y ) ⊂ C(X, Y ) the set of (bounded) linear
functions. Let U be an open subset of X. Then a function F : U → Y is called
differentiable at x ∈ U if there exists a linear function dF (x) ∈ L(X, Y ) such that
F (x + u) = F (x) + dF (x) u + o(u), (1.1)
where o, O are the Landau symbols. The linear map dF (x) is called the derivative of
F at x. If F is differentiable for all x ∈ U we call F differentiable. In this case we
get a map
\[ dF : U \to L(X, Y), \qquad x \mapsto dF(x). \tag{1.2} \]
If $dF$ is continuous, we call $F$ continuously differentiable and write $F \in C^1(U, Y)$.
Let $Y = \prod_{j=1}^m Y_j$ and let $F : X \to Y$ be given by $F = (F_1, \dots, F_m)$ with
$F_j : X \to Y_j$. Then $F \in C^1(X, Y)$ if and only if $F_j \in C^1(X, Y_j)$, $1 \le j \le m$, and
in this case $dF = (dF_1, \dots, dF_m)$. Similarly, if $X = \prod_{i=1}^n X_i$, then one can define
the partial derivative $\partial_i F \in L(X_i, Y)$, which is the derivative of $F$ considered as
a function of the $i$-th variable alone (the other variables being fixed). We have
$dF\, v = \sum_{i=1}^n \partial_i F\, v_i$, $v = (v_1, \dots, v_n) \in X$, and $F \in C^1(X, Y)$ if and only if all
partial derivatives exist and are continuous.
In the case of $X = \mathbb{R}^m$ and $Y = \mathbb{R}^n$, the matrix representation of $dF$ with
respect to the canonical basis in $\mathbb{R}^m$ and $\mathbb{R}^n$ is given by the partial derivatives
$\partial_i F_j(x)$ and is called the Jacobi matrix of $F$ at $x$.
We can iterate the procedure of differentiation and write $F \in C^r(U, Y)$, $r \ge 1$,
if the $r$-th derivative of $F$, $d^r F$ (i.e., the derivative of the $(r-1)$-th derivative of
$F$), exists and is continuous. Finally, we set $C^\infty(U, Y) = \bigcap_{r \in \mathbb{N}} C^r(U, Y)$ and, for
notational convenience, $C^0(U, Y) = C(U, Y)$ and $d^0 F = F$.
It is often necessary to equip $C^r(U, Y)$ with a norm. A suitable choice is
\[ |F| = \max_{0 \le j \le r} \sup_{x \in U} |d^j F(x)|. \tag{1.3} \]
The set of all $r$ times continuously differentiable functions for which this norm is
finite forms a Banach space which is denoted by $C_b^r(U, Y)$.
If $F$ is bijective and $F$, $F^{-1}$ are both of class $C^r$, $r \ge 1$, then $F$ is called a
diffeomorphism of class $C^r$.
Note that if $F \in L(X, Y)$, then $dF(x) = F$ (independent of $x$) and $d^r F(x) = 0$,
$r > 1$.
For the composition of mappings we note the following result (which is easy to
prove).
Lemma 1.1 (Chain rule) Let $F \in C^r(X, Y)$ and $G \in C^r(Y, Z)$, $r \ge 1$. Then
$G \circ F \in C^r(X, Z)$ and
\[ d(G \circ F)(x) = dG(F(x)) \circ dF(x), \qquad x \in X. \tag{1.4} \]
In particular, if $\lambda \in Y^*$ is a linear functional, then $d(\lambda \circ F) = d\lambda \circ dF = \lambda \circ dF$.

In addition, we have the following mean value theorem.
Theorem 1.2 (Mean value) Suppose $U \subseteq X$ and $F \in C^1(U, Y)$. If $U$ is convex,
then
\[ |F(x) - F(y)| \le M |x - y|, \qquad M = \max_{0 \le t \le 1} |dF((1-t)x + ty)|. \tag{1.5} \]
Conversely, (for any open $U$) if
\[ |F(x) - F(y)| \le M |x - y|, \qquad x, y \in U, \tag{1.6} \]
then
\[ \sup_{x \in U} |dF(x)| \le M. \tag{1.7} \]
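Proof. Abbreviate $f(t) = F((1-t)x + ty)$, $0 \le t \le 1$, and hence $df(t) =
dF((1-t)x + ty)(y - x)$ implying $|df(t)| \le \tilde{M} = M|x - y|$. For the first part it
suffices to show
\[ \phi(t) = |f(t) - f(0)| - (\tilde{M} + \delta)t \le 0 \tag{1.8} \]
for any $\delta > 0$. Let $t_0 = \max\{t \in [0, 1] \mid \phi(t) \le 0\}$. If $t_0 < 1$ then
\begin{align*}
\phi(t_0 + \varepsilon) &= |f(t_0 + \varepsilon) - f(t_0) + f(t_0) - f(0)| - (\tilde{M} + \delta)(t_0 + \varepsilon) \\
&\le |f(t_0 + \varepsilon) - f(t_0)| - (\tilde{M} + \delta)\varepsilon + \phi(t_0) \\
&\le |df(t_0)\varepsilon + o(\varepsilon)| - (\tilde{M} + \delta)\varepsilon \\
&\le (\tilde{M} + o(1) - \tilde{M} - \delta)\varepsilon = (-\delta + o(1))\varepsilon \le 0, \tag{1.9}
\end{align*}
for $\varepsilon \ge 0$, small enough. Thus $t_0 = 1$.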
To prove the second claim suppose there is an $x_0 \in U$ such that $|dF(x_0)| =
M + \delta$, $\delta > 0$. Then we can find an $e \in X$, $|e| = 1$ such that $|dF(x_0)e| = M + \delta$
and hence
\[ M\varepsilon \ge |F(x_0 + \varepsilon e) - F(x_0)| = |dF(x_0)(\varepsilon e) + o(\varepsilon)| \ge (M + \delta)\varepsilon - |o(\varepsilon)| > M\varepsilon \tag{1.10} \]
since we can assume $|o(\varepsilon)| < \varepsilon\delta$ for $\varepsilon > 0$ small enough, a contradiction. □
As an immediate consequence we obtain
Corollary 1.3 Suppose U is a connected subset of a Banach space X. A mapping

F ∈ C 1 (U, Y ) is constant if and only if dF = 0. In addition, if F1,2 ∈ C 1 (U, Y )
and dF1 = dF2 , then F1 and F2 differ only by a constant.
Next we want to look at higher derivatives more closely. Let $X = \prod_{i=1}^m X_i$,
then $F : X \to Y$ is called multilinear if it is linear with respect to each argument.
It is not hard to see that $F$ is continuous if and only if
\[ |F| = \sup_{x :\, \prod_{i=1}^m |x_i| = 1} |F(x_1, \dots, x_m)| < \infty. \tag{1.11} \]
If we take $n$ copies of the same space, the set of multilinear functions $F : X^n \to Y$
will be denoted by $L^n(X, Y)$. A multilinear function is called symmetric provided
its value remains unchanged if any two arguments are switched. With the norm
from above $L^n(X, Y)$ is a Banach space and in fact there is a canonical isometric
isomorphism between $L^n(X, Y)$ and $L(X, L^{n-1}(X, Y))$ given by $F : (x_1, \dots, x_n) \mapsto
F(x_1, \dots, x_n)$ maps to $x_1 \mapsto F(x_1, \cdot)$. In addition, note that to each $F \in L^n(X, Y)$
we can assign its polar form $F \in C(X, Y)$ using $F(x) = F(x, \dots, x)$, $x \in X$. If $F$
is symmetric it can be reconstructed from its polar form using
\[ F(x_1, \dots, x_n) = \frac{1}{n!}\, \partial_{t_1} \cdots \partial_{t_n} F\Big(\sum_{i=1}^n t_i x_i\Big)\Big|_{t_1 = \cdots = t_n = 0}. \tag{1.12} \]
Moreover, the $r$-th derivative of $F \in C^r(X, Y)$ is symmetric since
\[ d^r F_x(v_1, \dots, v_r) = \partial_{t_1} \cdots \partial_{t_r} F\Big(x + \sum_{i=1}^r t_i v_i\Big)\Big|_{t_1 = \cdots = t_r = 0}, \tag{1.13} \]
where the order of the partial derivatives can be shown to be irrelevant.

Now we turn to integration. We will only consider the case of mappings f :
I → X where I = [a, b] ⊂ R is a compact interval and X is a Banach space. A
function $f : I \to X$ is called simple if the image of $f$ is finite, $f(I) = \{x_i\}_{i=1}^n$,
and if each inverse image $f^{-1}(x_i)$, $1 \le i \le n$, is a Borel set. The set of simple
functions $S(I, X)$ forms a linear space and can be equipped with the sup norm.
The corresponding Banach space obtained after completion is called the set of
regulated functions $R(I, X)$.
Observe that $C(I, X) \subset R(I, X)$. In fact, consider $f_n = \sum_{i=0}^{n-1} f(t_i)\chi_{[t_i, t_{i+1})} \in
S(I, X)$, where $t_i = a + i\,\frac{b-a}{n}$ and $\chi$ is the characteristic function. Since $f \in C(I, X)$
is uniformly continuous, we infer that $f_n$ converges uniformly to $f$.
For $f \in S(I, X)$ we can define a linear map $\int : S(I, X) \to X$ by
\[ \int_a^b f(t)\,dt = \sum_{i=1}^n x_i\, \mu(f^{-1}(x_i)), \tag{1.14} \]
where $\mu$ denotes the Lebesgue measure on $I$. This map satisfies
\[ \Big| \int_a^b f(t)\,dt \Big| \le |f|(b - a) \tag{1.15} \]
and hence it can be extended uniquely to a linear map $\int : R(I, X) \to X$ with the
same norm $(b - a)$. We even have
\[ \Big| \int_a^b f(t)\,dt \Big| \le \int_a^b |f(t)|\,dt. \tag{1.16} \]
In addition, if $\lambda \in X^*$ is a continuous linear functional, then
\[ \lambda\Big(\int_a^b f(t)\,dt\Big) = \int_a^b \lambda(f(t))\,dt, \qquad f \in R(I, X). \tag{1.17} \]
We use the usual conventions $\int_{t_1}^{t_2} f(s)\,ds = \int_a^b \chi_{(t_1, t_2)}(s) f(s)\,ds$ and $\int_{t_2}^{t_1} f(s)\,ds =
-\int_{t_1}^{t_2} f(s)\,ds$.
If $I \subseteq \mathbb{R}$, we have an isomorphism $L(I, X) \equiv X$ and if $F : I \to X$ we will
write $\dot{F}(t)$ instead of $dF(t)$ if we regard $dF(t)$ as an element of $X$. In particular,
if $f \in C(I, X)$, then $F(t) = \int_a^t f(s)\,ds \in C^1(I, X)$ and $\dot{F}(t) = f(t)$ as can be seen
from
\[ \Big| \int_a^{t+\varepsilon} f(s)\,ds - \int_a^t f(s)\,ds - f(t)\varepsilon \Big| = \Big| \int_t^{t+\varepsilon} (f(s) - f(t))\,ds \Big| \le |\varepsilon| \sup_{s \in [t, t+\varepsilon]} |f(s) - f(t)|. \tag{1.18} \]
This even shows that $F(t) = F(a) + \int_a^t \dot{F}(s)\,ds$ for any $F \in C^1(I, X)$.
1.2 Contraction principles

A fixed point of a mapping F : C ⊆ X → C is an element x ∈ C such that
F (x) = x. Moreover, F is called a contraction if there is a contraction constant
θ ∈ [0, 1) such that
|F (x) − F (x̃)| ≤ θ|x − x̃|, x, x̃ ∈ C. (1.19)
Note that a contraction is continuous. We also recall the notation $F^n(x) =
F(F^{n-1}(x))$, $F^0(x) = x$.
Theorem 1.4 (Contraction principle) Let $C$ be a closed subset of a Banach
space $X$ and let $F : C \to C$ be a contraction, then $F$ has a unique fixed point
$\bar{x} \in C$ such that
\[ |F^n(x) - \bar{x}| \le \frac{\theta^n}{1 - \theta}\, |F(x) - x|, \qquad x \in C. \tag{1.20} \]
Proof. If $x = F(x)$ and $\tilde{x} = F(\tilde{x})$, then $|x - \tilde{x}| = |F(x) - F(\tilde{x})| \le \theta |x - \tilde{x}|$
shows that there can be at most one fixed point.
Concerning existence, fix $x_0 \in C$ and consider the sequence $x_n = F^n(x_0)$. We
have
\[ |x_{n+1} - x_n| \le \theta |x_n - x_{n-1}| \le \cdots \le \theta^n |x_1 - x_0| \tag{1.21} \]
and hence by the triangle inequality (for $n > m$)
\[ |x_n - x_m| \le \sum_{j=m+1}^{n} |x_j - x_{j-1}| \le \theta^m \sum_{j=0}^{n-m-1} \theta^j |x_1 - x_0| \le \frac{\theta^m}{1 - \theta}\, |x_1 - x_0|. \tag{1.22} \]
Thus $x_n$ is Cauchy and tends to a limit $\bar{x}$, which lies in $C$ since $C$ is closed. Moreover,
\[ |F(\bar{x}) - \bar{x}| = \lim_{n \to \infty} |x_{n+1} - x_n| = 0 \tag{1.23} \]
shows that $\bar{x}$ is a fixed point and the estimate (1.20) follows after taking the limit
$n \to \infty$ in (1.22). □
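The proof is constructive, and the a priori estimate (1.20) doubles as a stopping rule. The following minimal sketch (not from the original notes) iterates a scalar contraction until the bound θ^n/(1 − θ) |x_1 − x_0| drops below a tolerance. The function name `fixed_point` and the example F = cos on C = [0, 1] with θ = sin(1) are assumptions chosen for illustration; cos maps [0, 1] into itself and |cos'| ≤ sin(1) < 1 there.

```python
import math

def fixed_point(F, x0, theta, tol=1e-12, max_iter=10_000):
    """Iterate x_{n+1} = F(x_n); stop once the a priori bound from (1.20) is below tol."""
    x = F(x0)                      # x_1
    first_step = abs(x - x0)       # |x_1 - x_0|
    for n in range(1, max_iter + 1):
        # At this point x equals x_n, and |x_n - fixed point| <= theta^n/(1-theta) * |x_1 - x_0|.
        bound = theta**n / (1.0 - theta) * first_step
        if bound < tol:
            return x, n, bound
        x = F(x)
    raise RuntimeError("a priori bound did not reach tol within max_iter steps")

# Hypothetical example: F = cos is a contraction on [0, 1] with constant sin(1).
x_bar, steps, bound = fixed_point(math.cos, 0.5, math.sin(1.0))
```

The bound is usually pessimistic, since it only uses the global contraction constant; the actual error is typically much smaller than the returned `bound`.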
Next, we want to investigate how fixed points of contractions vary with respect
to a parameter. Let U ⊆ X, V ⊆ Y be open and consider F : U × V → U . The
mapping F is called a uniform contraction if there is a θ ∈ [0, 1) such that
|F (x, y) − F (x̃, y)| ≤ θ|x − x̃|, x, x̃ ∈ U , y ∈ V. (1.24)
Theorem 1.5 (Uniform contraction principle) Let U , V be open subsets of

Banach spaces X, Y , respectively. Let F : U × V → U be a uniform contraction
and denote by x(y) ∈ U the unique fixed point of F (., y). If F ∈ C r (U × V, U ),
r ≥ 0, then x(.) ∈ C r (V, U ).
Proof. Let us first show that $x(y)$ is continuous. From
\begin{align*}
|x(y + v) - x(y)| &= |F(x(y + v), y + v) - F(x(y), y + v) + F(x(y), y + v) - F(x(y), y)| \\
&\le \theta |x(y + v) - x(y)| + |F(x(y), y + v) - F(x(y), y)| \tag{1.25}
\end{align*}
we infer
\[ |x(y + v) - x(y)| \le \frac{1}{1 - \theta}\, |F(x(y), y + v) - F(x(y), y)| \tag{1.26} \]
and hence $x(y) \in C(V, U)$. Now let $r = 1$ and let us formally differentiate $x(y) =
F(x(y), y)$ with respect to $y$,
\[ d\,x(y) = \partial_x F(x(y), y)\, d\,x(y) + \partial_y F(x(y), y). \tag{1.27} \]
Consider this as a fixed point equation $T(x', y) = x'$, where $T(\cdot, y) : L(Y, X) \to
L(Y, X)$, $x' \mapsto \partial_x F(x(y), y)x' + \partial_y F(x(y), y)$, is a uniform contraction since we have
$|\partial_x F(x(y), y)| \le \theta$ by Theorem 1.2. Hence we get a unique continuous solution
$x'(y)$. It remains to show
\[ x(y + v) - x(y) - x'(y)v = o(v). \tag{1.28} \]
Let us abbreviate $u = x(y + v) - x(y)$, then using (1.27) and the fixed point
property of $x(y)$ we see
\begin{align*}
(1 - \partial_x F(x(y), y))(u - x'(y)v) &= F(x(y) + u, y + v) - F(x(y), y) - \partial_x F(x(y), y)u - \partial_y F(x(y), y)v \\
&= o(u) + o(v) \tag{1.29}
\end{align*}
since $F \in C^1(U \times V, U)$ by assumption. Moreover, $|(1 - \partial_x F(x(y), y))^{-1}| \le (1 - \theta)^{-1}$
and $u = O(v)$ (by (1.26)) implying $u - x'(y)v = o(v)$ as desired.
Finally, suppose that the result holds for some $r - 1 \ge 1$. Thus, if $F$ is
$C^r$, then $x(y)$ is at least $C^{r-1}$ and the fact that $d\,x(y)$ satisfies (1.27) implies
$x(y) \in C^r(V, U)$. □
As an important consequence we obtain the implicit function theorem.
Theorem 1.6 (Implicit function) Let $X$, $Y$, and $Z$ be Banach spaces and let
$U$, $V$ be open subsets of $X$, $Y$, respectively. Let $F \in C^r(U \times V, Z)$, $r \ge 1$, and fix
$(x_0, y_0) \in U \times V$. Suppose $\partial_x F(x_0, y_0) \in L(X, Z)$ is an isomorphism. Then there
exists an open neighborhood $U_1 \times V_1 \subseteq U \times V$ of $(x_0, y_0)$ such that for each $y \in V_1$
there exists a unique point $(\xi(y), y) \in U_1 \times V_1$ satisfying $F(\xi(y), y) = F(x_0, y_0)$.
Moreover, the map $\xi$ is in $C^r(V_1, X)$ and fulfills
\[ d\xi(y) = -(\partial_x F(\xi(y), y))^{-1} \circ \partial_y F(\xi(y), y). \tag{1.30} \]
Proof. Using the shift $F \to F - F(x_0, y_0)$ we can assume $F(x_0, y_0) = 0$.
Next, the fixed points of $G(x, y) = x - (\partial_x F(x_0, y_0))^{-1} F(x, y)$ are the solutions
of $F(x, y) = 0$. The function $G$ has the same smoothness properties as $F$ and
since $|\partial_x G(x_0, y_0)| = 0$, we can find balls $U_1$ and $V_1$ around $x_0$ and $y_0$ such that
$|\partial_x G(x, y)| \le \theta < 1$. Thus $G(\cdot, y)$ is a uniform contraction and in particular,
$G(U_1, y) \subset U_1$, that is, $G : U_1 \times V_1 \to U_1$. The rest follows from the uniform
contraction principle. Formula (1.30) follows from differentiating $F(\xi(y), y) = 0$
using the chain rule. □
Note that our proof is constructive, since it shows that the solution $\xi(y)$ can
be obtained by iterating $x - (\partial_x F(x_0, y_0))^{-1} F(x, y)$.
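The iteration mentioned in this remark can be sketched for a scalar example. The code below is an illustration, not part of the original notes: the function `solve_implicit` and the choice F(x, y) = x² + y² − 1 with base point (x_0, y_0) = (1, 0) are assumptions. Here ∂_x F(1, 0) = 2, so the iterated map is G(x, y) = x − F(x, y)/2, whose x-derivative 1 − x is small near x = 1.

```python
import math

def solve_implicit(F, dxF_inv_at_base, x0, y, tol=1e-13, max_iter=1000):
    """Iterate x -> x - (d_x F(x0, y0))^{-1} F(x, y) with the derivative frozen at the base point."""
    x = x0
    for _ in range(max_iter):
        x_new = x - dxF_inv_at_base * F(x, y)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    raise RuntimeError("iteration did not converge; y may be too far from y0")

# Hypothetical example: the unit circle near (1, 0), where d_x F(1, 0) = 2.
F = lambda x, y: x * x + y * y - 1.0
xi = solve_implicit(F, 0.5, 1.0, 0.3)
```

For y = 0.3 the iteration converges to ξ(y) = sqrt(1 − y²). In contrast to Newton's method the derivative is never updated, which costs speed (linear instead of quadratic convergence) but matches the contraction argument of the proof.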
Moreover, as a corollary of the implicit function theorem we also obtain the
inverse function theorem.
Theorem 1.7 (Inverse function) Suppose $F \in C^r(U, Y)$, $U \subseteq X$, and let $dF(x_0)$
be an isomorphism for some $x_0 \in U$. Then there are neighborhoods $U_1$, $V_1$ of $x_0$,
$F(x_0)$, respectively, such that $F \in C^r(U_1, V_1)$ is a diffeomorphism.
Proof. Apply the implicit function theorem to $G(x, y) = y - F(x)$. □
1.3 Ordinary differential equations

As a first application of the implicit function theorem, we prove (local) existence
and uniqueness for solutions of ordinary differential equations in Banach spaces.
The following lemma will be needed in the proof.
Lemma 1.8 Suppose $I \subseteq \mathbb{R}$ is a compact interval and $f \in C^r(U, Y)$. Then
$f_* \in C^r(C_b(I, U), C_b(I, Y))$, where
\[ (f_* x)(t) = f(x(t)). \tag{1.31} \]
Proof. Fix $x_0 \in C_b(I, U)$ and $\varepsilon > 0$. For each $t \in I$ we have a $\delta(t) > 0$ such
that $|f(x) - f(x_0(t))| \le \varepsilon/2$ for all $x \in U$ with $|x - x_0(t)| \le 2\delta(t)$. The balls
$B_{\delta(t)}(x_0(t))$, $t \in I$, cover the set $\{x_0(t)\}_{t \in I}$ and since $I$ is compact, there is a finite
subcover $B_{\delta(t_j)}(x_0(t_j))$, $1 \le j \le n$. Let $|x - x_0| \le \delta = \min_{1 \le j \le n} \delta(t_j)$. Then
for each $t \in I$ there is a $t_j$ such that $|x_0(t) - x_0(t_j)| \le \delta(t_j)$ and hence $|f(x(t)) -
f(x_0(t))| \le |f(x(t)) - f(x_0(t_j))| + |f(x_0(t_j)) - f(x_0(t))| \le \varepsilon$ since $|x(t) - x_0(t_j)| \le
|x(t) - x_0(t)| + |x_0(t) - x_0(t_j)| \le 2\delta(t_j)$. This settles the case $r = 0$.
Next let us turn to $r = 1$. We claim that $df_*$ is given by $(df_*(x_0)x)(t) =
df(x_0(t))x(t)$. Hence we need to show that for each $\varepsilon > 0$ we can find a $\delta > 0$ such
that
\[ \sup_{t \in I} |f_*(x_0(t) + x(t)) - f_*(x_0(t)) - df(x_0(t))x(t)| \le \varepsilon\delta \tag{1.32} \]
whenever $|x - x_0| \le \delta$. By assumption we have
\[ |f_*(x_0(t) + x(t)) - f_*(x_0(t)) - df(x_0(t))x(t)| \le \varepsilon\delta(t) \tag{1.33} \]
whenever $|x(t) - x_0(t)| \le \delta(t)$. Now argue as before. It remains to show that $df_*$
is continuous. To see this we use the linear map
\[ \lambda : C_b(I, L(X, Y)) \to L(C_b(I, X), C_b(I, Y)), \qquad T \mapsto T_*, \tag{1.34} \]
where $(T_* x)(t) = T(t)x(t)$. Since we have
\[ |T_* x| = \sup_{t \in I} |T(t)x(t)| \le \sup_{t \in I} |T(t)||x(t)| \le |T||x|, \tag{1.35} \]
we infer $|\lambda| \le 1$ and hence $\lambda$ is continuous. Now observe $df_* = \lambda \circ (df)_*$.
The general case $r > 1$ follows from induction. □
Now we come to our existence and uniqueness result for the initial value prob-
lem in Banach spaces.
Theorem 1.9 Let I be an open interval, U an open subset of a Banach space X

and Λ an open subset of another Banach space. Suppose F ∈ C r (I × U × Λ, X),
then the initial value problem
ẋ(t) = F (t, x, λ), x(t0 ) = x0 , (t0 , x0 , λ) ∈ I × U × Λ, (1.36)
has a unique solution x(t, t0 , x0 , λ) ∈ C r (I1 × I2 × U1 × Λ1 , X), where I1,2 , U1 , and

Λ1 are open subsets of I, U , and Λ, respectively. The sets I2 , U1 , and Λ1 can be
chosen to contain any point t0 ∈ I, x0 ∈ U , and λ0 ∈ Λ, respectively.
Proof. If we shift t → t − t0 , x → x − x0 , and hence F → F (. + t0 , . + x0 , λ),

we see that it is no restriction to assume x0 = 0, t0 = 0 and to consider (t0 , x0 )
as part of the parameter λ (i.e., λ → (t0 , x0 , λ)). Moreover, using the standard
transformation ẋ = F (τ, x, λ), τ̇ = 1, we can even assume that F is independent of
t. We will also replace U by a smaller (bounded) subset such that F is uniformly
continuous with respect to x on this subset.
Our goal is to invoke the implicit function theorem. In order to do this we
introduce an additional parameter ε ∈ R and consider
\[ \dot{x} = \varepsilon F(x, \lambda), \qquad x \in D^{r+1} = \{x \in C_b^{r+1}((-1, 1), U) \mid x(0) = 0\}, \tag{1.37} \]
such that we know the solution for ε = 0. The implicit function theorem will show
that solutions still exist as long as ε remains small. At first sight this doesn’t seem
to be good enough for us since our original problem corresponds to ε = 1. But
since ε corresponds to a scaling t → εt, the solution for one ε > 0 suffices. Now
let us turn to the details.
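Our problem (1.37) is equivalent to looking for zeros of the function
\[ G : D^{r+1} \times \Lambda \times (-\varepsilon_0, \varepsilon_0) \to C_b^r((-1, 1), X), \qquad (x, \lambda, \varepsilon) \mapsto \dot{x} - \varepsilon F(x, \lambda). \tag{1.38} \]
Lemma 1.8 ensures that this function is $C^r$. Now fix $\lambda_0$, then $G(0, \lambda_0, 0) = 0$
and $\partial_x G(0, \lambda_0, 0) = T$, where $Tx = \dot{x}$. Since $(T^{-1}x)(t) = \int_0^t x(s)\,ds$ we can
apply the implicit function theorem to conclude that there is a unique solution
$x(\lambda, \varepsilon) \in C^r(\Lambda_1 \times (-\varepsilon_0, \varepsilon_0), D^{r+1})$. In particular, the map $(\lambda, t) \mapsto x(\lambda, \varepsilon)(t/\varepsilon)$ is
in $C^r(\Lambda_1, C^{r+1}((-\varepsilon, \varepsilon), X)) \hookrightarrow C^r(\Lambda \times (-\varepsilon, \varepsilon), X)$. Hence it is the desired solution
of our original problem. □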
Chapter 2
The Brouwer mapping degree
2.1 Introduction
Many applications lead to the problem of finding all zeros of a mapping f : U ⊆
X → X, where X is some (real) Banach space. That is, we are interested in the
solutions of
f (x) = 0, x ∈ U. (2.1)
In most cases it turns out that this is too much to ask for, since determining the
zeros analytically is in general impossible.
Hence one has to ask some weaker questions and hope to find answers for them.
One such question would be "Are there any solutions, respectively, how many are
there?". Luckily, this question allows some progress.
To see how, let us consider the case f ∈ H(C), where H(C) denotes the set of
holomorphic functions on a domain U ⊂ C. Recall the concept of the winding
number from complex analysis. The winding number of a path γ : [0, 1] → C
around a point z0 ∈ C is defined by
\[ n(\gamma, z_0) = \frac{1}{2\pi i} \int_\gamma \frac{dz}{z - z_0} \in \mathbb{Z}. \tag{2.2} \]
It gives the number of times γ encircles z0 taking orientation into account. That
is, encirclings in opposite directions are counted with opposite signs.
In particular, if we pick $f \in H(\mathbb{C})$ one computes (assuming $0 \notin f(\gamma)$)
\[ n(f(\gamma), 0) = \frac{1}{2\pi i} \int_\gamma \frac{f'(z)}{f(z)}\,dz = \sum_k n(\gamma, z_k)\,\alpha_k, \tag{2.3} \]
where $z_k$ denote the zeros of $f$ and $\alpha_k$ their respective multiplicity. Moreover, if $\gamma$
is a Jordan curve encircling a simply connected domain $U \subset \mathbb{C}$, then $n(\gamma, z_k) = 0$
if $z_k \notin U$ and $n(\gamma, z_k) = 1$ if $z_k \in U$. Hence $n(f(\gamma), 0)$ counts the number of zeros
inside $U$.
However, this result is useless unless we have an efficient way of computing
n(f (γ), 0) (which does not involve the knowledge of the zeros zk ). This is our next
task.
Now, let us recall how one would compute complex integrals along complicated
paths. Clearly, one would use homotopy invariance and look for a simpler path
along which the integral can be computed and which is homotopic to the original
one. In particular, if $f : \gamma \to \mathbb{C} \setminus \{0\}$ and $g : \gamma \to \mathbb{C} \setminus \{0\}$ are homotopic, we have
$n(f(\gamma), 0) = n(g(\gamma), 0)$ (which is known as Rouché's theorem).
More explicitly, we need to find a mapping $g$ for which $n(g(\gamma), 0)$ can be computed
and a homotopy $H : [0, 1] \times \gamma \to \mathbb{C} \setminus \{0\}$ such that $H(0, z) = f(z)$ and
$H(1, z) = g(z)$ for $z \in \gamma$. For example, how many zeros of $f(z) = \frac{1}{2} z^6 + z - \frac{1}{3}$ lie
inside the unit circle? Consider $g(z) = z$, then $H(t, z) = (1 - t)f(z) + t\,g(z)$ is the
required homotopy since $|f(z) - g(z)| \le \frac{1}{2} + \frac{1}{3} < 1 = |g(z)|$, $|z| = 1$, implying $H(t, z) \ne 0$ on
$[0, 1] \times \gamma$. Hence $f(z)$ has one zero inside the unit circle.
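The count obtained from Rouché's theorem can be cross-checked numerically by tracking the argument of f along the unit circle. The following is a minimal sketch under stated assumptions, not part of the original notes: the helper `winding_number` is hypothetical, it only handles the unit circle, and it assumes the sampling is fine enough that consecutive values of f differ in argument by less than π.

```python
import cmath
import math

def winding_number(f, n_samples=4096):
    """Winding number of the closed curve t -> f(exp(2*pi*i*t)) around 0.

    Sums the argument increments between consecutive sample points; this is
    only valid if f has no zero on the unit circle and the sampling is fine.
    """
    total = 0.0
    w_prev = f(complex(1.0, 0.0))
    for k in range(1, n_samples + 1):
        z = cmath.exp(2j * math.pi * k / n_samples)
        w = f(z)
        total += cmath.phase(w / w_prev)   # increment of arg f, in (-pi, pi]
        w_prev = w
    return round(total / (2.0 * math.pi))

f = lambda z: 0.5 * z**6 + z - 1.0 / 3.0
zeros_inside = winding_number(f)
```

For the polynomial from the text the result agrees with the homotopy argument: the image curve winds once around the origin, just like the image of g(z) = z.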
Summarizing, given a (sufficiently smooth) domain U with enclosing Jordan
curve ∂U , we have defined a degree deg(f, U, z0 ) = n(f (∂U ), z0 ) = n(f (∂U ) −
z0 , 0) ∈ Z which counts the number of solutions of f (z) = z0 inside U . The
invariance of this degree with respect to certain deformations of f allowed us to
explicitly compute deg(f, U, z0 ) even in nontrivial cases.
Our ultimate goal is to extend this approach to continuous functions $f : \mathbb{R}^n \to
\mathbb{R}^n$. However, such a generalization runs into several problems. First of all, it is
unclear how one should define the multiplicity of a zero. But even more severe is
the fact that the number of zeros is unstable with respect to small perturbations.
For example, consider $f_\varepsilon : [-1, 2] \to \mathbb{R}$, $x \mapsto x^2 - \varepsilon$. Then $f_\varepsilon$ has no zeros for
$\varepsilon < 0$, one zero for $\varepsilon = 0$, two zeros for $0 < \varepsilon \le 1$, one for $1 < \sqrt{\varepsilon} \le 2$, and none
for $\sqrt{\varepsilon} > 2$. This shows the following facts.
1. Zeros with $f' \ne 0$ are stable under small perturbations.
2. The number of zeros can change if two zeros with opposite sign change (i.e.,
opposite signs of $f'$) run into each other.
3. The number of zeros can change if a zero drops over the boundary.
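The instability of the zero count in this example is easy to reproduce. The following small sketch (an illustration, not part of the original notes; `count_zeros` is a hypothetical helper) counts the zeros of x ↦ x² − ε on [−1, 2] from the explicit roots ±sqrt(ε).

```python
import math

def count_zeros(eps, a=-1.0, b=2.0):
    """Number of distinct zeros of x -> x^2 - eps in the closed interval [a, b]."""
    if eps < 0:
        return 0
    # The roots are +sqrt(eps) and -sqrt(eps); the set merges them when eps == 0.
    roots = {math.sqrt(eps), -math.sqrt(eps)}
    return sum(1 for r in roots if a <= r <= b)

counts = [count_zeros(e) for e in (-1.0, 0.0, 0.5, 1.0, 2.25, 4.0, 5.0)]
```

The count jumps at ε = 0, where two zeros with opposite signs of the derivative merge, and again when sqrt(ε) passes 1 and 2, where a zero leaves the interval through the boundary point −1 or 2.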
Hence we see that we cannot expect too much from our degree. In addition, since
it is unclear how it should be defined, we will first require some basic properties a
degree should have and then we will look for functions satisfying these properties.
2.2 Definition of the mapping degree and the determinant formula
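To begin with, let us introduce some useful notation. Throughout this section $U$
will be a bounded open subset of $\mathbb{R}^n$. For $f \in C^1(U, \mathbb{R}^n)$ the Jacobi matrix of $f$
at $x \in U$ is $f'(x) = (\partial_{x_i} f_j(x))_{1 \le i,j \le n}$ and the Jacobi determinant of $f$ at $x \in U$ is
\[ J_f(x) = \det f'(x). \tag{2.4} \]
The set of regular values is
\[ \mathrm{RV}(f) = \{y \in \mathbb{R}^n \mid \forall x \in f^{-1}(y) : J_f(x) \ne 0\}. \tag{2.5} \]
Its complement $\mathrm{CV}(f) = \mathbb{R}^n \setminus \mathrm{RV}(f)$ is called the set of critical values. We set
$C^r(\overline{U}, \mathbb{R}^n) = \{f \in C^r(U, \mathbb{R}^n) \mid d^j f \in C(\overline{U}, \mathbb{R}^n),\ 0 \le j \le r\}$ and
\[ D_y^r(\overline{U}, \mathbb{R}^n) = \{f \in C^r(\overline{U}, \mathbb{R}^n) \mid y \notin f(\partial U)\}, \qquad D_y(\overline{U}, \mathbb{R}^n) = D_y^0(\overline{U}, \mathbb{R}^n), \qquad y \in \mathbb{R}^n. \tag{2.6} \]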
Now that these things are out of the way, we come to the formulation of the
requirements for our degree.
A function deg which assigns each $f \in D_y(\overline{U}, \mathbb{R}^n)$, $y \in \mathbb{R}^n$, a real number
$\deg(f, U, y)$ will be called degree if it satisfies the following conditions.
(D1). $\deg(f, U, y) = \deg(f - y, U, 0)$ (translation invariance).
(D2). $\deg(\mathbb{1}, U, y) = 1$ if $y \in U$ (normalization).
(D3). If $U_{1,2}$ are open, disjoint subsets of $U$ such that $y \notin f(\overline{U} \setminus (U_1 \cup U_2))$, then
$\deg(f, U, y) = \deg(f, U_1, y) + \deg(f, U_2, y)$ (additivity).
(D4). If $H(t) = (1 - t)f + tg \in D_y(\overline{U}, \mathbb{R}^n)$, $t \in [0, 1]$, then $\deg(f, U, y) = \deg(g, U, y)$
(homotopy invariance).
Before we draw some first conclusions from this definition, let us discuss the
properties (D1)–(D4) first. (D1) is natural since $\deg(f, U, y)$ should have something
to do with the solutions of $f(x) = y$, $x \in U$, which is the same as the solutions
of $f(x) - y = 0$, $x \in U$. (D2) is a normalization since any multiple of deg would
also satisfy the other requirements. (D3) is also quite natural since it requires deg
to be additive with respect to components. In addition, it implies that sets where
$f \ne y$ do not contribute. (D4) is not that natural since it already rules out the
case where deg is the cardinality of $f^{-1}(y)$. On the other hand it will give us the
ability to compute $\deg(f, U, y)$ in several cases.
Theorem 2.1 Suppose deg satisfies (D1)–(D4) and let $f, g \in D_y(\overline{U}, \mathbb{R}^n)$, then the
following statements hold.
(i). We have $\deg(f, \emptyset, y) = 0$. Moreover, if $U_i$, $1 \le i \le N$, are disjoint open subsets
of $U$ such that $y \notin f(\overline{U} \setminus \bigcup_{i=1}^N U_i)$, then $\deg(f, U, y) = \sum_{i=1}^N \deg(f, U_i, y)$.
(ii). If $y \notin f(\overline{U})$, then $\deg(f, U, y) = 0$ (but not the other way round). Equivalently,
if $\deg(f, U, y) \ne 0$, then $y \in f(U)$.
(iii). If $|f(x) - g(x)| < \mathrm{dist}(y, f(\partial U))$, $x \in \partial U$, then $\deg(f, U, y) = \deg(g, U, y)$.
In particular, this is true if $f(x) = g(x)$ for $x \in \partial U$.
Proof. For the first part of (i) use (D3) with $U_1 = U$ and $U_2 = \emptyset$. For the
second part use $U_2 = \emptyset$ in (D3) if $N = 1$ and the rest follows from induction. For
(ii) use $N = 1$ and $U_1 = \emptyset$ in (i). For (iii) note that $H(t, x) = (1 - t)f(x) + t\,g(x)$
satisfies $|H(t, x) - y| \ge \mathrm{dist}(y, f(\partial U)) - |f(x) - g(x)|$ for $x$ on the boundary. □
Next we show that (D4) implies several facts that at first sight look much
stronger.
Theorem 2.2 We have that $\deg(\cdot, U, y)$ and $\deg(f, U, \cdot)$ are both continuous. In
fact, we even have
(i). $\deg(\cdot, U, y)$ is constant on each component of $D_y(\overline{U}, \mathbb{R}^n)$.
(ii). $\deg(f, U, \cdot)$ is constant on each component of $\mathbb{R}^n \setminus f(\partial U)$.
Moreover, if $H : [0, 1] \times \overline{U} \to \mathbb{R}^n$ and $y : [0, 1] \to \mathbb{R}^n$ are both continuous such
that $H(t) \in D_{y(t)}(\overline{U}, \mathbb{R}^n)$, $t \in [0, 1]$, then $\deg(H(0), U, y(0)) = \deg(H(1), U, y(1))$.
Proof. For (i) let $C$ be a component of $D_y(\overline{U}, \mathbb{R}^n)$ and let $d_0 \in \deg(C, U, y)$. It
suffices to show that $\deg(\cdot, U, y)$ is locally constant. But if $|g - f| < \mathrm{dist}(y, f(\partial U))$,
then $\deg(f, U, y) = \deg(g, U, y)$ by (D4) since $|H(t) - y| \ge |f - y| - |g - f| >
0$, $H(t) = (1 - t)f + t\,g$. The proof of (ii) is similar. For the remaining part
observe that if $H : [0, 1] \times \overline{U} \to \mathbb{R}^n$, $(t, x) \mapsto H(t, x)$, is continuous, then so
is $H : [0, 1] \to C(\overline{U}, \mathbb{R}^n)$, $t \mapsto H(t)$, since $\overline{U}$ is compact. Hence, if in addition
$H(t) \in D_y(\overline{U}, \mathbb{R}^n)$, then $\deg(H(t), U, y)$ is independent of $t$ and if $y = y(t)$ we can
use $\deg(H(0), U, y(0)) = \deg(H(t) - y(t), U, 0) = \deg(H(1), U, y(1))$. □
Note that this result also shows why $\deg(f, U, y)$ cannot be defined meaningfully
for $y \in f(\partial U)$. Indeed, approaching $y$ from within different components of
$\mathbb{R}^n \setminus f(\partial U)$ will result in different limits in general!
In addition, note that if Q is a closed subset of a locally pathwise connected
space X, then the components of X\Q are open (in the topology of X) and
pathwise connected (the set of points for which a path to a fixed point x0 exists is
both open and closed).
Now let us try to compute deg using its properties. Let us start with a simple
case and suppose $f \in C^1(U, \mathbb{R}^n)$ and $y \notin \mathrm{CV}(f) \cup f(\partial U)$. Without restriction we
consider $y = 0$. In addition, we avoid the trivial case $f^{-1}(y) = \emptyset$. Since the points
of $f^{-1}(0)$ inside $U$ are isolated (use $J_f(x) \ne 0$ and the inverse function theorem)
they can only cluster at the boundary $\partial U$. But this is also impossible since $f$ would
equal $y$ at the limit point on the boundary by continuity. Hence $f^{-1}(0) = \{x^i\}_{i=1}^N$.
Picking sufficiently small neighborhoods $U(x^i)$ around $x^i$ we consequently get
\[ \deg(f, U, 0) = \sum_{i=1}^N \deg(f, U(x^i), 0). \tag{2.7} \]
It suffices to consider one of the zeros, say $x^1$. Moreover, we can even assume
$x^1 = 0$ and $U(x^1) = B_\delta(0)$. Next we replace $f$ by its linear approximation around
0. By the definition of the derivative we have
\[ f(x) = f'(0)x + |x|\,r(x), \qquad r \in C(B_\delta(0), \mathbb{R}^n), \quad r(0) = 0. \tag{2.8} \]
Now consider the homotopy $H(t, x) = f'(0)x + (1 - t)|x|\,r(x)$. In order to conclude
$\deg(f, B_\delta(0), 0) = \deg(f'(0), B_\delta(0), 0)$ we need to show $0 \notin H(t, \partial B_\delta(0))$. Since
$J_f(0) \ne 0$ we can find a constant $\lambda$ such that $|f'(0)x| \ge \lambda|x|$ and since $r(0) = 0$
we can decrease $\delta$ such that $|r| < \lambda$. This implies $|H(t, x)| \ge \big| |f'(0)x| - (1 -
t)|x||r(x)| \big| \ge \lambda\delta - \delta|r| > 0$ for $x \in \partial B_\delta(0)$ as desired.
In order to compute the degree of a nonsingular matrix we need the following
lemma.
Lemma 2.3 Two nonsingular matrices M1,2 ∈ GL(n) are homotopic in GL(n) if
and only if sgn det M1 = sgn det M2 .
Proof. We will show that any given nonsingular matrix M is homotopic to

diag(sgn det M, 1, . . . , 1), where diag(m1 , . . . , mn ) denotes a diagonal matrix with
diagonal entries mi .
In fact, note that adding one row to another and multiplying a row by a positive
constant can be realized by continuous deformations such that all intermediate
matrices are nonsingular. Hence we can reduce $M$ to a diagonal matrix
$\mathrm{diag}(m_1, \dots, m_n)$ with $(m_i)^2 = 1$. Next,
\[ \begin{pmatrix} \pm\cos(\pi t) & \mp\sin(\pi t) \\ \sin(\pi t) & \cos(\pi t) \end{pmatrix} \tag{2.9} \]
shows that $\mathrm{diag}(\pm 1, 1)$ and $\mathrm{diag}(\mp 1, -1)$ are homotopic. Now we apply this result
to all two by two subblocks as follows. For each $i$ starting from $n$ and going down
to 2 transform the subblock $\mathrm{diag}(m_{i-1}, m_i)$ into $\mathrm{diag}(1, 1)$ respectively $\mathrm{diag}(-1, 1)$.
The result is the desired form for $M$.
To conclude the proof note that a continuous deformation within $GL(n)$ cannot
change the sign of the determinant since otherwise the determinant would have to
vanish somewhere in between (i.e., we would leave $GL(n)$). □
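The two by two path (2.9) can be checked numerically: its determinant stays at ±1 for every t, so the path never leaves GL(2), and its endpoints are the two diagonal matrices named in the proof. The sketch below is an illustration only; the helper names `rotation_path` and `det2` are hypothetical.

```python
import math

def rotation_path(t, sign=1):
    """Matrix from (2.9); sign=+1 takes the upper signs, sign=-1 the lower ones."""
    c, s = math.cos(math.pi * t), math.sin(math.pi * t)
    return [[sign * c, -sign * s],
            [s, c]]

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# The determinant equals sign * (cos^2 + sin^2) = sign along the whole path.
dets_plus = [det2(rotation_path(k / 100, +1)) for k in range(101)]
dets_minus = [det2(rotation_path(k / 100, -1)) for k in range(101)]
start, end = rotation_path(0.0, +1), rotation_path(1.0, +1)
```

At t = 0 the matrix is diag(1, 1) and at t = 1 it is diag(−1, −1) (up to rounding), in line with the statement that diag(±1, 1) and diag(∓1, −1) are homotopic in GL(2).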
Using this lemma we can now show the main result of this section.
Theorem 2.4 Suppose $f \in D_y^1(\overline{U}, \mathbb{R}^n)$ and $y \notin \mathrm{CV}(f)$, then a degree satisfying
(D1)–(D4) satisfies
\[ \deg(f, U, y) = \sum_{x \in f^{-1}(y)} \mathrm{sgn}\, J_f(x), \tag{2.10} \]
where the sum is finite and we agree to set $\sum_{x \in \emptyset} = 0$.
Proof. By the previous lemma we obtain
\[ \deg(f'(0), B_\delta(0), 0) = \deg(\mathrm{diag}(\mathrm{sgn}\, J_f(0), 1, \dots, 1), B_\delta(0), 0) \tag{2.11} \]
since $\det M \ne 0$ is equivalent to $Mx \ne 0$ for $x \in \partial B_\delta(0)$. Hence it remains to show
$\deg(f'(0), B_\delta(0), 0) = \mathrm{sgn}\, J_f(0)$.
If $\mathrm{sgn}\, J_f(0) = 1$ this is true by (D2). Otherwise we can replace $f'(0)$ by $M_- =
\mathrm{diag}(-1, 1, \dots, 1)$.
Now let $U_1 = \{x \in \mathbb{R}^n \mid |x_i| < 1,\ 1 \le i \le n\}$, $U_2 = \{x \in \mathbb{R}^n \mid 1 < x_1 < 3,\ |x_i| <
1,\ 2 \le i \le n\}$, $U = \{x \in \mathbb{R}^n \mid -1 < x_1 < 3,\ |x_i| < 1,\ 2 \le i \le n\}$, and abbreviate
$y_0 = (2, 0, \dots, 0)$. On $\overline{U}$ consider two continuous mappings $M_{1,2} : \overline{U} \to \mathbb{R}^n$ such
that $M_1 = M_-$ on $U_1$, $M_1 = \mathbb{1} - y_0$ on $U_2$, and $M_2(x) = (1, x_2, \dots, x_n)$.
Since $M_1(x) = M_2(x)$ for $x \in \partial U$ we infer $\deg(M_1, U, 0) = \deg(M_2, U, 0) = 0$.
Moreover, we have $\deg(M_1, U, 0) = \deg(M_1, U_1, 0) + \deg(M_1, U_2, 0)$ and hence
$\deg(M_-, U_1, 0) = -\deg(\mathbb{1} - y_0, U_2, 0) = -\deg(\mathbb{1}, U_2, y_0) = -1$ as claimed. □
Up to this point we have only shown that a degree (provided there is one at
all) necessarily satisfies (2.10). Once we have shown that regular values are dense,
it will follow that the degree is uniquely determined by (2.10) since the remaining
values follow from point (iv) of Theorem 2.1. On the other hand, we don’t even
know whether a degree exists. Hence we need to show that (2.10) can be extended
to f ∈ Dy (U , Rn ) and that this extension satisfies our requirements (D1)–(D4).
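In one dimension the determinant formula (2.10) reduces to summing sgn f′(x) over the solutions of f(x) = y in U = (a, b). The sketch below illustrates this; the function `degree_1d`, the grid size and the sample maps are assumptions made for the example, not part of the text. It assumes y is a regular value, y ∉ f(∂U), and that the grid is fine enough to separate the solutions.

```python
def sign(v):
    """Sign of a real number as -1, 0 or 1."""
    return (v > 0) - (v < 0)

def degree_1d(f, df, a, b, y, cells=997):
    """Sum of sgn f'(x) over the solutions of f(x) = y in (a, b), cf. (2.10).

    Solutions are located by sign changes on a uniform grid and refined by
    bisection; a solution sitting exactly on a grid point would be missed.
    """
    total = 0
    h = (b - a) / cells
    for k in range(cells):
        lo, hi = a + k * h, a + (k + 1) * h
        if (f(lo) - y) * (f(hi) - y) < 0:          # a solution lies in (lo, hi)
            for _ in range(60):                    # bisection refinement
                mid = 0.5 * (lo + hi)
                if (f(lo) - y) * (f(mid) - y) <= 0:
                    hi = mid
                else:
                    lo = mid
            total += sign(df(0.5 * (lo + hi)))
    return total

# x^3 - x on (-2, 2): zeros -1, 0, 1 with derivative signs +, -, +.
deg_cubic = degree_1d(lambda x: x**3 - x, lambda x: 3 * x**2 - 1, -2.0, 2.0, 0.0)
# x^2 - 1 on (-2, 2): zeros -1, 1 with derivative signs -, +.
deg_square = degree_1d(lambda x: x**2 - 1, lambda x: 2 * x, -2.0, 2.0, 0.0)
```

Here `deg_cubic` is 1 and `deg_square` is 0; the second case shows that a vanishing degree does not exclude solutions.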
2.3 Extension of the determinant formula

Our present objective is to show that the determinant formula (2.10) can be ex-
tended to all f ∈ Dy (U , Rn ). This will be done in two steps, where we will show
that deg(f, U, y) as defined in (2.10) is locally constant with respect to both y
(step one) and f (step two).
Before we work out the technical details for these two steps, we prove that the
set of regular values is dense as a warm up. This is a consequence of a special case
of Sard’s theorem which says that CV(f ) has zero measure.
Lemma 2.5 (Sard) Suppose f ∈ C 1 (U, Rn ), then the Lebesgue measure of CV(f )
is zero.
Proof. Since the claim is easy for linear mappings our strategy is as follows.
We divide U into sufficiently small subsets. Then we replace f by its linear ap-
proximation in each subset and estimate the error.
Let CP(f ) = {x ∈ U |Jf (x) = 0} be the set of critical points of f . We first
pass to cubes which are easier to divide. Let {Qi }i∈N be a countable cover for
U consisting of open cubes such that Q_i ⊂ U. Then it suffices to prove that
f(CP(f) ∩ Q_i) has zero measure since CV(f) = f(CP(f)) = ⋃_i f(CP(f) ∩ Q_i)
(the Q_i's are a cover).
Let Q be any of these cubes and denote by ρ the length of its edges. Fix ε > 0
and divide Q into N^n cubes Q_i of length ρ/N. Since f′(x) is uniformly continuous
on Q we can find an N (independent of i) such that
\[ |f(x) - f(\tilde x) - f'(\tilde x)(x - \tilde x)| \le \int_0^1 |f'(\tilde x + t(x - \tilde x)) - f'(\tilde x)|\,|\tilde x - x|\,dt \le \frac{\varepsilon\rho}{N} \tag{2.12} \]
for x̃, x ∈ Q_i. Now pick a Q_i which contains a critical point x̃_i ∈ CP(f). Without
restriction we assume x̃_i = 0, f(x̃_i) = 0 and set M = f′(x̃_i). By det M = 0 there
is an orthonormal basis {b^i}_{1≤i≤n} of R^n such that b^n is orthogonal to the image of
M. In addition, there is a constant C_1 such that
\[ Q_i \subseteq \Big\{ \sum_{i=1}^{n} \lambda_i b^i \,\Big|\, |\lambda_i| \le C_1 \frac{\rho}{N} \Big\} \]
(e.g., C1 = n2(n/2) ) and hence there is a second constant (again independent of i)
such that
\[ M Q_i \subseteq \Big\{ \sum_{i=1}^{n-1} \lambda_i b^i \,\Big|\, |\lambda_i| \le C_2 \frac{\rho}{N} \Big\} \tag{2.13} \]
(e.g., C_2 = n C_1 max_{x∈Q} |f′(x)|). Next, by our estimate (2.12) we even have
\[ f(Q_i) \subseteq \Big\{ \sum_{i=1}^{n} \lambda_i b^i \,\Big|\, |\lambda_i| \le (C_2 + \varepsilon) \frac{\rho}{N},\ |\lambda_n| \le \varepsilon \frac{\rho}{N} \Big\} \tag{2.14} \]
and hence the measure of f(Q_i) is smaller than C_3 ε / N^n. Since there are at most N^n
such Q_i's, we see that the measure of f(CP(f) ∩ Q) is smaller than C_3 ε. Since ε > 0
was arbitrary, f(CP(f) ∩ Q) has measure zero. □
Having this result out of the way we can come to step one and two from above.
Step 1: Admitting critical values
By (v) of Theorem 2.1, deg(f, U, y) should be constant on each component

of Rn \f (∂U ). Unfortunately, if we connect y and a nearby regular value ỹ by
a path, then there might be some critical values in between. To overcome this
problem we need a definition for deg which works for critical values as well. Let
us try to look for an integral representation. Formally (2.10) can be written as
deg(f, U, y) = ∫_U δ_y(f(x)) J_f(x) dx, where δ_y(.) is the Dirac distribution at y. But
since we don't want to mess with distributions, we replace δ_y(.) by φ_ε(. − y), where
{φ_ε}_{ε>0} is a family of functions such that φ_ε is supported on the ball B_ε(0) of
radius ε around 0 and satisfies ∫_{R^n} φ_ε(x) dx = 1.
Lemma 2.6 Let f ∈ D_y^1(Ū, R^n), y ∉ CV(f). Then
\[ \deg(f, U, y) = \int_U \phi_\varepsilon(f(x) - y)\, J_f(x)\,dx \tag{2.15} \]
for all positive ε smaller than a certain ε_0 depending on f and y. Moreover,
supp(φ_ε(f(.) − y)) ⊂ U for ε < dist(y, f(∂U)).
Proof. If f^{-1}(y) = ∅, we can set ε_0 = dist(y, f(Ū)), implying φ_ε(f(x) − y) = 0
for x ∈ Ū.
If f^{-1}(y) = {x^i}_{1≤i≤N}, we can find an ε_0 > 0 such that f^{-1}(B_{ε_0}(y)) is a union
of disjoint neighborhoods U(x^i) of x^i by the inverse function theorem. Moreover,
after possibly decreasing ε_0 we can assume that f|_{U(x^i)} is a bijection and that J_f(x)
is nonzero on U(x^i). Again φ_ε(f(x) − y) = 0 for x ∈ Ū \ ⋃_{i=1}^N U(x^i) and hence
\begin{align*}
\int_U \phi_\varepsilon(f(x) - y)\, J_f(x)\,dx &= \sum_{i=1}^N \int_{U(x^i)} \phi_\varepsilon(f(x) - y)\, J_f(x)\,dx \\
&= \sum_{i=1}^N \operatorname{sgn}(J_f(x^i)) \int_{B_{\varepsilon_0}(0)} \phi_\varepsilon(\tilde x)\,d\tilde x = \deg(f, U, y), \tag{2.16}
\end{align*}
where we have used the change of variables x̃ = f(x) − y in the second step. □
Our new integral representation makes sense even for critical values. But since
ε depends on y, continuity with respect to y is not clear. This will be shown next
at the expense of requiring f ∈ C 2 rather than f ∈ C 1 .
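The integral representation (2.15) is easy to test numerically in one dimension, where J_f = f′. The sketch below is an illustration under stated assumptions: the tent function `tent` stands in for φ_ε (it is only Lipschitz, which suffices for evaluating the integral), and the map f(x) = x³ − x on U = (−2, 2) and the midpoint rule are choices made for the example. The integral is evaluated at the regular value 0 and at a critical value of f; in both cases it is close to 1.

```python
def tent(s, eps):
    """Stand-in for phi_eps on R: supported in [-eps, eps], integral equal to 1."""
    return max(0.0, 1.0 - abs(s) / eps) / eps

def degree_integral(f, df, a, b, y, eps, n=100_000):
    """Midpoint-rule approximation of the integral in (2.15) for U = (a, b)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        x = a + (k + 0.5) * h
        total += tent(f(x) - y, eps) * df(x)
    return total * h

f = lambda x: x**3 - x
df = lambda x: 3 * x**2 - 1

y_reg = 0.0                      # regular value: zeros -1, 0, 1
y_crit = 2 / (3 * 3 ** 0.5)      # f(-1/sqrt(3)), a critical value of f
eps = 0.5                        # smaller than dist(y, f(boundary)) since f(+-2) = +-6

deg_reg = degree_integral(f, df, -2.0, 2.0, y_reg, eps)
deg_crit = degree_integral(f, df, -2.0, 2.0, y_crit, eps)
```

Both `deg_reg` and `deg_crit` agree with 1 up to quadrature error, which illustrates why the integral is a reasonable candidate for extending the degree to critical values.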
The key idea is to rewrite deg(f, U, y 2 ) − deg(f, U, y 1 ) as an integral over a
divergence (here we will need f ∈ C 2 ) supported in U and then apply Stokes
theorem. For this purpose the following result will be used.
Lemma 2.7 Suppose f ∈ C 2 (U, Rn ) and u ∈ C 1 (Rn , Rn ), then
(div u)(f )Jf = divDf (u), (2.17)
where Df (u)j is the determinant of the matrix obtained from f 0 by replacing the
j-th column by u(f ).
Proof. We compute
\[ \operatorname{div} D_f(u) = \sum_{j=1}^n \partial_{x_j} D_f(u)_j = \sum_{j,k=1}^n D_f(u)_{j,k}, \tag{2.18} \]
where D_f(u)_{j,k} is the determinant of the matrix obtained from the matrix associated
with D_f(u)_j by applying ∂_{x_j} to the k-th column. Since ∂_{x_j}∂_{x_k} f = ∂_{x_k}∂_{x_j} f we
infer D_f(u)_{j,k} = −D_f(u)_{k,j}, j ≠ k, by exchanging the k-th and the j-th column.
Hence
\[ \operatorname{div} D_f(u) = \sum_{i=1}^n D_f(u)_{i,i}. \tag{2.19} \]
Now let J_f^{(i,j)}(x) denote the (i, j) minor of f′(x) and recall \sum_{i=1}^n J_f^{(i,j)} ∂_{x_i} f_k = δ_{j,k} J_f.
Using this to expand the determinant D_f(u)_{i,i} along the i-th column shows
\begin{align*}
\operatorname{div} D_f(u) &= \sum_{i,j=1}^n J_f^{(i,j)}\, \partial_{x_i} u_j(f) = \sum_{i,j=1}^n J_f^{(i,j)} \sum_{k=1}^n (\partial_{x_k} u_j)(f)\, \partial_{x_i} f_k \\
&= \sum_{j,k=1}^n (\partial_{x_k} u_j)(f) \sum_{i=1}^n J_f^{(i,j)}\, \partial_{x_i} f_k = \sum_{j=1}^n (\partial_{x_j} u_j)(f)\, J_f \tag{2.20}
\end{align*}
as required. 2
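The identity (2.17) can be sanity-checked numerically for n = 2. In the sketch below the maps f(x) = (x_1² − x_2, x_1 x_2) and u(y) = (y_1 y_2, y_1 + y_2²) are arbitrary sample choices, not taken from the text; D_f(u)_j is formed by replacing the j-th column of f′ by u(f), and its divergence is approximated by central differences.

```python
def f(x1, x2):
    return (x1 * x1 - x2, x1 * x2)

def jac(x1, x2):
    """Rows are the gradients of f_1 and f_2."""
    return ((2 * x1, -1.0), (x2, x1))

def u(y1, y2):
    return (y1 * y2, y1 + y2 * y2)

def div_u(y1, y2):
    """Exact divergence of u: d/dy1 (y1*y2) + d/dy2 (y1 + y2^2)."""
    return y2 + 2 * y2

def D(x1, x2):
    """(D_f(u)_1, D_f(u)_2): determinant of f'(x) with column j replaced by u(f(x))."""
    (a, b), (c, d) = jac(x1, x2)
    v1, v2 = u(*f(x1, x2))
    return (v1 * d - v2 * b, a * v2 - c * v1)

def div_D(x1, x2, h=1e-5):
    """Central-difference approximation of div D_f(u), the right-hand side of (2.17)."""
    d1 = (D(x1 + h, x2)[0] - D(x1 - h, x2)[0]) / (2 * h)
    d2 = (D(x1, x2 + h)[1] - D(x1, x2 - h)[1]) / (2 * h)
    return d1 + d2

def lhs(x1, x2):
    """(div u)(f(x)) * J_f(x), the left-hand side of (2.17)."""
    (a, b), (c, d) = jac(x1, x2)
    return div_u(*f(x1, x2)) * (a * d - b * c)
```

For these sample maps both sides equal 6 x_1³ x_2 + 3 x_1 x_2², which can also be confirmed by hand.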
Now we can prove
Lemma 2.8 Suppose f ∈ C 2 (U , Rn ). Then deg(f, U, .) is constant in each ball

contained in Rn \f (∂U ), whenever defined.
Proof. Fix ỹ ∈ Rn \f (∂U ) and consider the largest ball Bρ (ỹ), ρ = dist(ỹ, f (∂U ))
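around ỹ contained in R^n \ f(∂U). Pick y^i ∈ B_ρ(ỹ) ∩ RV(f), i = 1, 2, and consider
\[ \deg(f, U, y^2) - \deg(f, U, y^1) = \int_U \big(\phi_\varepsilon(f(x) - y^2) - \phi_\varepsilon(f(x) - y^1)\big) J_f(x)\,dx \tag{2.21} \]
for suitable φ_ε ∈ C^2(R^n, R) and suitable ε > 0. Now observe (summing over j)
\begin{align*}
(\operatorname{div} u)(y) &= \int_0^1 z_j\, \partial_{y_j} \phi(y + t z)\,dt \\
&= \int_0^1 \Big(\frac{d}{dt} \phi(y + t z)\Big)\,dt = \phi_\varepsilon(y - y^2) - \phi_\varepsilon(y - y^1), \tag{2.22}
\end{align*}
where
\[ u(y) = z \int_0^1 \phi(y + t z)\,dt, \qquad \phi(y) = \phi_\varepsilon(y - y^1), \qquad z = y^1 - y^2, \tag{2.23} \]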
and apply the previous lemma to rewrite the integral as ∫_U div D_f(u) dx. Since the
integrand vanishes in a neighborhood of ∂U it is no restriction to assume that ∂U is
smooth such that we can apply Stokes theorem. Hence we have ∫_U div D_f(u) dx =
∫_{∂U} D_f(u) dF = 0 since u is supported inside B_ρ(ỹ) provided ε is small enough
(e.g., ε < ρ − max{|y^i − ỹ|}_{i=1,2}). □
As a consequence we can define
\[ \deg(f, U, y) = \deg(f, U, \tilde y), \qquad y \notin f(\partial U),\ f \in C^2(\overline{U}, \mathbb{R}^n), \tag{2.24} \]
where ỹ is a regular value of f with |ỹ − y| < dist(y, f(∂U)).
Remark 2.9 Let me mention a different approach due to Kronecker. For U with
sufficiently smooth boundary we have
\[ \deg(f, U, 0) = \frac{1}{|S^{n-1}|} \int_{\partial U} D_{\tilde f}(x)\,dF = \frac{1}{|S^{n-1}|} \int_{\partial U} \frac{1}{|f|^n}\, D_f(x)\,dF, \qquad \tilde f = \frac{f}{|f|}, \tag{2.25} \]
for f ∈ C_y^2(Ū, R^n). Explicitly we have
\[ \deg(f, U, 0) = \frac{1}{|S^{n-1}|} \int_{\partial U} \sum_{j=1}^n (-1)^{j-1} \frac{f_j}{|f|^n}\, df_1 \wedge \cdots \wedge df_{j-1} \wedge df_{j+1} \wedge \cdots \wedge df_n. \tag{2.26} \]
Since f̃ : ∂U → S^{n−1} the integrand can also be written as the pull back f̃^* dS of
the canonical surface element dS on S^{n−1}.
This coincides with the boundary value approach for complex functions (note
that holomorphic functions are orientation preserving).
Step 2: Admitting continuous functions
Our final step is to remove the condition f ∈ C 2 . As before we want the degree
to be constant in each ball contained in D_y(Ū, R^n). For example, fix f ∈ D_y(Ū, R^n)
and set ρ = dist(y, f(∂U)) > 0. Choose f^i ∈ C^2(Ū, R^n), i = 1, 2, such that |f^i − f| < ρ,
implying f^i ∈ D_y(Ū, R^n). Then H(t, x) = (1 − t)f^1(x) + t f^2(x) ∈ D_y(Ū, R^n) ∩
C^2(U, R^n), t ∈ [0, 1], and |H(t) − f| < ρ. If we can show that deg(H(t), U, y) is
locally constant with respect to t, then it is continuous with respect to t and hence
constant (since [0, 1] is connected). Consequently we can define
\[ \deg(f, U, y) = \deg(\tilde f, U, y), \qquad f \in D_y(\overline{U}, \mathbb{R}^n), \tag{2.27} \]
where f̃ ∈ C^2(Ū, R^n) with |f̃ − f| < dist(y, f(∂U)).

It remains to show that t 7→ deg(H(t), U, y) is locally constant.
Lemma 2.10 Suppose f ∈ C_y^2(Ū, R^n). Then for each g ∈ C^2(Ū, R^n) there is an
ε > 0 such that deg(f + t g, U, y) = deg(f, U, y) for all t ∈ (−ε, ε).
Proof. If f^{-1}(y) = ∅ the same is true for f + t g if |t| < dist(y, f(Ū))/|g|.
Hence we can exclude this case. For the remaining case we use our usual strategy
of considering y ∈ RV(f) first and then approximating general y by regular ones.
Suppose y ∈ RV(f) and let f^{-1}(y) = {x^i}_{i=1}^N. By the implicit function theorem
we can find disjoint neighborhoods U(x^i) such that there exists a unique solution
x^i(t) ∈ U(x^i) of (f + t g)(x) = y for |t| < ε_1. By reducing U(x^i) if necessary, we
can even assume that the sign of J_{f+t g} is constant on U(x^i). Finally, let ε_2 =
dist(y, f(Ū \ ⋃_{i=1}^N U(x^i)))/|g|. Then |f + t g − y| > 0 on Ū \ ⋃_{i=1}^N U(x^i) for
|t| < ε_2 and ε = min(ε_1, ε_2) is the quantity we are looking for.
It remains to consider the case y ∈ CV(f). Pick a regular value ỹ ∈ B_{ρ/3}(y),
where ρ = dist(y, f(∂U)), implying deg(f, U, y) = deg(f, U, ỹ). Then we can
find an ε̃ > 0 such that deg(f, U, ỹ) = deg(f + t g, U, ỹ) for |t| < ε̃. Setting
ε = min(ε̃, ρ/(3|g|)) we infer |ỹ − (f + t g)(x)| ≥ ρ/3 for x ∈ ∂U, that is |ỹ − y| <
dist(ỹ, (f + t g)(∂U)), and thus deg(f + t g, U, ỹ) = deg(f + t g, U, y). Putting it
all together implies deg(f, U, y) = deg(f + t g, U, y) for |t| < ε as required. □
Now we can finally prove our main theorem.
Theorem 2.11 There is a unique degree deg satisfying (D1)–(D4). Moreover,
deg(., U, y) : D_y(Ū, R^n) → Z is constant on each component and given f ∈
D_y(Ū, R^n) we have
\[ \deg(f, U, y) = \sum_{x \in \tilde f^{-1}(y)} \operatorname{sgn} J_{\tilde f}(x) \tag{2.28} \]
where f̃ ∈ D_y^2(Ū, R^n) is in the same component of D_y(Ū, R^n), say |f − f̃| <
dist(y, f(∂U)), such that y ∈ RV(f̃).
Proof. Our previous considerations show that deg is well-defined and locally
constant with respect to the first argument by construction. Hence deg(., U, y) :
Dy (U , Rn ) → Z is continuous and thus necessarily constant on components since
Z is discrete. (D2) is clear and (D1) is satisfied since it holds for f̃ by construction.
Similarly, taking U_{1,2} as in (D3) we can require |f − f̃| < dist(y, f(Ū \ (U_1 ∪ U_2))).
Then (D3) is satisfied since it also holds for f̃ by construction. Finally, (D4) is a
consequence of continuity. □
To conclude this section, let us give a few simple examples illustrating the use
of the Brouwer degree.
First, let’s investigate the zeros of
f (x1 , x2 ) = (x1 − 2x2 + cos(x1 + x2 ), x2 + 2x1 + sin(x1 + x2 )). (2.29)

Denote the linear part by

g(x1 , x2 ) = (x1 − 2x2 , x2 + 2x1 ). (2.30)
Then we have |g(x)| = √5 |x| and |f(x) − g(x)| = 1 and hence h(t) = (1 − t)g + t f =
g + t(f − g) satisfies |h(t)| ≥ |g| − t|f − g| > 0 for |x| > 1/√5, implying
\[ \deg(f, B_5(0), 0) = \deg(g, B_5(0), 0) = 1. \tag{2.31} \]
Moreover, since J_f(x) = 5 + 3 cos(x_1 + x_2) + sin(x_1 + x_2) > 1 we see that f(x) = 0
has a unique solution in R^2. This solution even has to lie on the circle |x| = 1/√5
since f(x) = 0 implies 1 = |f(x) − g(x)| = |g(x)| = √5 |x|.
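The claims of this example can be reproduced numerically. The sketch below is an illustration only: it locates the zero of (2.29) by Newton's method (the starting point and step count are arbitrary choices) and computes the winding number of f along the circle |x| = 5, which for n = 2 is a standard way to evaluate the degree with respect to 0.

```python
import math

def f(x1, x2):
    s = x1 + x2
    return (x1 - 2 * x2 + math.cos(s), x2 + 2 * x1 + math.sin(s))

def jacobian(x1, x2):
    s = x1 + x2
    return ((1 - math.sin(s), -2 - math.sin(s)),
            (2 + math.cos(s), 1 + math.cos(s)))

def newton(x1, x2, steps=30):
    """Newton iteration for f(x) = 0; the 2x2 system is solved by Cramer's rule."""
    for _ in range(steps):
        (a, b), (c, d) = jacobian(x1, x2)
        det = a * d - b * c                  # this is J_f(x), which is > 1
        r1, r2 = f(x1, x2)
        x1 -= (d * r1 - b * r2) / det
        x2 -= (-c * r1 + a * r2) / det
    return x1, x2

def winding_number(radius, samples=4000):
    """Number of turns of f(x) around 0 as x runs once around the circle |x| = radius."""
    total, prev = 0.0, None
    for k in range(samples + 1):
        th = 2 * math.pi * k / samples
        v1, v2 = f(radius * math.cos(th), radius * math.sin(th))
        ang = math.atan2(v2, v1)
        if prev is not None:
            # wrap the angle increment into [-pi, pi)
            total += (ang - prev + math.pi) % (2 * math.pi) - math.pi
        prev = ang
    return round(total / (2 * math.pi))

root = newton(0.0, 0.0)
radius_of_root = math.hypot(*root)           # should equal 1/sqrt(5)
degree_on_ball = winding_number(5.0)         # should equal 1, cf. (2.31)
```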
Next let us prove the following result which implies the hairy ball (or hedgehog)
theorem.
Theorem 2.12 Suppose U contains the origin and let f : ∂U → R^n \ {0} be continuous.
If n is odd, then there exists an x ∈ ∂U and a λ ≠ 0 such that f(x) = λx.
Proof. By Theorem 2.15 we can assume f ∈ C(Ū, R^n) and since n is odd we
have deg(−𝟙, U, 0) = −1. Now if deg(f, U, 0) ≠ −1, then H(t, x) = (1 − t)f(x) − t x
must have a zero (t_0, x_0) ∈ (0, 1) × ∂U and hence f(x_0) = (t_0/(1 − t_0)) x_0. Otherwise, if
deg(f, U, 0) = −1 we can apply the same argument to H(t, x) = (1 − t)f(x) + t x. □
In particular this result implies that a continuous tangent vector field on the
unit sphere f : S^{n−1} → R^n (with f(x)x = 0 for all x ∈ S^{n−1}) must vanish somewhere
if n is odd. Or, for n = 3, you cannot smoothly comb a hedgehog without leaving
a bald spot or making a parting. It is however possible to comb the hair smoothly
on a torus and that is why the magnetic containers in nuclear fusion are toroidal.
Another simple consequence is the fact that a vector field on Rn , which points
outwards (or inwards) on a sphere, must vanish somewhere inside the sphere.
Theorem 2.13 Suppose f : BR (0) → Rn is continuous and satisfies
f (x)x > 0, |x| = R. (2.32)
Then f (x) vanishes somewhere inside BR (0).
Proof. If f does not vanish, then H(t, x) = (1 − t)x + tf (x) must vanish at
some point (t0 , x0 ) ∈ (0, 1) × ∂BR (0) and thus
0 = H(t0 , x0 )x0 = (1 − t0 )R2 + t0 f (x0 )x0 . (2.33)
But the last part is positive by assumption, a contradiction. 2
2.4 The Brouwer fixed-point theorem

Now we can show that the famous Brouwer fixed-point theorem is a simple conse-
quence of the properties of our degree.
Theorem 2.14 (Brouwer fixed point) Let K be a topological space homeomor-

phic to a compact, convex subset of Rn and let f ∈ C(K, K), then f has at least
one fixed point.
Proof. Clearly we can assume K ⊂ Rn since homeomorphisms preserve fixed

points. Now let's assume K = B̄_r(0). If there is a fixed point on the boundary
∂B_r(0) we are done. Otherwise H(t, x) = x − t f(x) satisfies 0 ∉ H(t, ∂B_r(0))
since |H(t, x)| ≥ |x| − t|f(x)| ≥ (1 − t)r > 0 for 0 ≤ t < 1, while for t = 1 this holds
because f has no fixed point on ∂B_r(0). And the claim follows
from deg(x − f(x), B_r(0), 0) = deg(x, B_r(0), 0) = 1.
Now let K be convex. Then K ⊆ B_ρ(0) for some ρ > 0 and, by Theorem 2.15 below, we can
find a continuous retraction R : R^n → K (i.e., R(x) = x for x ∈ K) and consider
f̃ = f ∘ R ∈ C(B̄_ρ(0), B̄_ρ(0)). By our previous analysis, there is a fixed point
x = f̃(x) ∈ conv(f(K)) ⊆ K. □
Note that any compact, convex subset of a finite dimensional Banach space
(complex or real) is isomorphic to a compact, convex subset of Rn since linear
transformations preserve both properties. In addition, observe that all assumptions
are needed. For example, the map f : R → R, x ↦ x + 1, has no fixed point (R
is homeomorphic to a bounded set but not to a compact one). The same is true
for the map f : ∂B_1(0) → ∂B_1(0), x ↦ −x (∂B_1(0) ⊂ R^n is simply connected for
n ≥ 3 but not homeomorphic to a convex set).
It remains to prove the result from topology needed in the proof of the Brouwer
fixed-point theorem.
Theorem 2.15 Let X and Y be Banach spaces and let K be a closed subset of X.
Then F ∈ C(K, Y) has a continuous extension F̄ ∈ C(X, Y) such that F̄(X) ⊆
conv(F(K)).
Proof. Consider the open cover {B_{ρ(x)}(x)}_{x∈X\K} for X\K, where ρ(x) =
dist(x, K)/2. Choose a (locally finite) partition of unity {φ_λ}_{λ∈Λ} subordinate
to this cover and set
\[ \bar F(x) = \sum_{\lambda \in \Lambda} \phi_\lambda(x) F(x_\lambda) \quad \text{for } x \in X \setminus K, \tag{2.34} \]
(and F̄(x) = F(x) for x ∈ K), where x_λ ∈ K satisfies dist(x_λ, supp φ_λ) ≤ 2 dist(K, supp φ_λ). By construction, F̄

is continuous except for possibly at the boundary of K. Fix x0 ∈ ∂K, ε > 0 and
choose δ > 0 such that |F (x) − F (x0 )| ≤ ε for all x ∈ K with |x − x0 | < 4δ.
We will show that |F̄(x) − F(x_0)| ≤ ε for all x ∈ X with |x − x_0| < δ. Suppose
x ∉ K, then |F̄(x) − F(x_0)| ≤ \sum_{λ∈Λ} φ_λ(x)|F(x_λ) − F(x_0)|. By our construction,
xλ should be close to x for all λ with x ∈ suppφλ since x is close to K. In fact, if
x ∈ suppφλ we have
|x − xλ | ≤ dist(xλ , suppφλ ) + d(suppφλ ) ≤ 2dist(K, suppφλ ) + d(suppφλ ), (2.35)
where d(suppφλ ) = supx,y∈suppφλ |x − y|. Since our partition of unity is subordinate
to the cover {Bρ(x) (x)}x∈X\K we can find a x̃ ∈ X\K such that suppφλ ⊂ Bρ(x̃) (x̃)
and hence d(suppφλ ) ≤ ρ(x̃) ≤ dist(K, suppφλ ). Putting it all together we have
|x − xλ | ≤ 3dist(xλ , suppφλ ) and hence
|x0 − xλ | ≤ |x0 − x| + |x − xλ | ≤ 4dist(xλ , suppφλ ) ≤ 4|x − x0 | ≤ 4δ (2.36)
as expected. By our choice of δ we have |F(x_λ) − F(x_0)| ≤ ε for all λ with
φ_λ(x) ≠ 0. Hence |F̄(x) − F(x_0)| ≤ ε whenever |x − x_0| < δ and we are done. □
Note that the same proof works if X is only a metric space.
Finally, let me remark that the Brouwer fixed point theorem is equivalent to
the fact that there is no continuous retraction R : B1 (0) → ∂B1 (0) (with R(x) = x
for x ∈ ∂B1 (0)) from the unit ball to the unit sphere in Rn .
In fact, if R were such a retraction, −R would have a fixed point x_0 ∈
∂B_1(0) by Brouwer's theorem. But then x_0 = −R(x_0) = −x_0 which is impossible.
Conversely, if a continuous function f : B1 (0) → B1 (0) has no fixed point we can
define a retraction R(x) = f (x) + t(x)(x − f (x)), where t(x) ≥ 0 is chosen such
that |R(x)|2 = 1 (i.e., R(x) lies on the intersection of the line spanned by x, f (x)
with the unit sphere).
Using this equivalence the Brouwer fixed point theorem can also be derived
easily by showing that the homology groups of the unit ball B1 (0) and its boundary
(the unit sphere) differ (see, e.g., [9] for details).
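The formula R(x) = f(x) + t(x)(x − f(x)) amounts to solving a quadratic equation for t(x). The sketch below works in R^2 and is only an illustration: by Brouwer's theorem every continuous self-map of the closed disc has a fixed point, so the sample map `f` (a hypothetical choice, with fixed point 0) can only be used away from its fixed point, where the construction breaks down exactly as the argument predicts.

```python
import math

def f(x):
    """Hypothetical sample self-map of the closed unit disc: rotate by 90 degrees and halve."""
    return (-0.5 * x[1], 0.5 * x[0])

def retract(x):
    """R(x) = f(x) + t (x - f(x)) with t >= 0 chosen such that |R(x)| = 1."""
    fx = f(x)
    d = (x[0] - fx[0], x[1] - fx[1])
    a = d[0] ** 2 + d[1] ** 2
    if a == 0.0:
        raise ValueError("x is a fixed point of f; R(x) is undefined")
    b = 2 * (fx[0] * d[0] + fx[1] * d[1])
    c = fx[0] ** 2 + fx[1] ** 2 - 1.0                    # <= 0 since |f(x)| <= 1
    t = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)    # the nonnegative root
    return (fx[0] + t * d[0], fx[1] + t * d[1])

inner = retract((0.3, -0.4))      # lands on the unit circle
boundary = retract((1.0, 0.0))    # boundary points are left fixed
```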
2.5 Kakutani’s fixed-point theorem and applications to game theory
In this section we want to apply Brouwer’s fixed-point theorem to show the exis-
tence of Nash equilibria for n-person games. As a preparation we extend Brouwer’s
fixed-point theorem to set valued functions. This generalization will be more suit-
able for our purpose.
Denote by CS(K) the set of all nonempty convex subsets of K.
Theorem 2.16 (Kakutani) Suppose K is a compact convex subset of Rn and

f : K → CS(K). If the set
Γ = {(x, y)|y ∈ f (x)} ⊆ K 2 (2.37)
is closed, then there is a point x ∈ K such that x ∈ f (x).
Proof. Our strategy is to apply Brouwer’s theorem, hence we need a function

related to f . For this purpose it is convenient to assume that K is a simplex
\[ K = \langle v_1, \dots, v_m \rangle, \qquad m \le n, \tag{2.38} \]
where v_i are the vertices. If we pick y_i ∈ f(v_i) we could set
\[ f^1(x) = \sum_{i=1}^m \lambda_i y_i, \tag{2.39} \]
where λ_i are the barycentric coordinates of x (i.e., λ_i ≥ 0, \sum_{i=1}^m λ_i = 1 and
x = \sum_{i=1}^m λ_i v_i). By construction, f^1 ∈ C(K, K) and there is a fixed point x^1. But
unless x^1 is one of the vertices, this doesn’t help us too much. So let's choose a
better function as follows. Consider the k-th barycentric subdivision and for each
vertex vi in this subdivision pick an element yi ∈ f (vi ). Now define f k (vi ) = yi
and extend f k to the interior of each subsimplex as before. Hence f k ∈ C(K, K)
and there is a fixed point
\[ x^k = \sum_{i=1}^m \lambda_i^k v_i^k = \sum_{i=1}^m \lambda_i^k y_i^k, \qquad y_i^k = f^k(v_i^k), \tag{2.40} \]
in the subsimplex ⟨v_1^k, ..., v_m^k⟩. Since (x^k, λ_1^k, ..., λ_m^k, y_1^k, ..., y_m^k) ∈ K × [0, 1]^m × K^m we can
assume that this sequence converges to (x^0, λ_1^0, ..., λ_m^0, y_1^0, ..., y_m^0) after passing to
a subsequence. Since the subsimplices shrink to a point, this implies v_i^k → x^0 and
hence y_i^0 ∈ f(x^0) since (v_i^k, y_i^k) ∈ Γ converges to (x^0, y_i^0) ∈ Γ by the closedness assumption.
Now (2.40) tells us
\[ x^0 = \sum_{i=1}^m \lambda_i^0 y_i^0 \in f(x^0) \tag{2.41} \]
since f (x0 ) is convex and the claim holds if K is a simplex.

If K is not a simplex, we can pick a simplex S containing K and proceed as in
the proof of the Brouwer theorem. 2
If f(x) contains precisely one point for all x, then Kakutani’s theorem reduces
to Brouwer’s theorem.
Now we want to see how this applies to game theory.
An n-person game consists of n players who have mi possible actions to choose
from. The set of all possible actions for the i-th player will be denoted by Φi =
{1, . . . , mi }. An element ϕi ∈ Φi is also called a pure strategy for reasons to
become clear in a moment. Once all players have chosen their move ϕi , the payoff
for each player is given by the payoff function
\[ R_i(\varphi), \qquad \varphi = (\varphi_1, \dots, \varphi_n) \in \Phi = \prod_{i=1}^n \Phi_i \tag{2.42} \]
of the i-th player. We will consider the case where the game is repeated a large
number of times and where in each step the players choose their action according to
a fixed strategy. Here a strategy s_i for the i-th player is a probability distribution
on Φ_i, that is, s_i = (s_i^1, ..., s_i^{m_i}) such that s_i^k ≥ 0 and \sum_{k=1}^{m_i} s_i^k = 1. The set
of all possible strategies for the i-th player is denoted by S_i. The number s_i^k is
the probability for the k-th pure strategy to be chosen. Consequently, if s =
(s_1, ..., s_n) ∈ S = \prod_{i=1}^n S_i is a collection of strategies, then the probability that a
given collection of pure strategies gets chosen is
\[ s(\varphi) = \prod_{i=1}^n s_i(\varphi), \qquad s_i(\varphi) = s_i^{k_i}, \quad \varphi = (k_1, \dots, k_n) \in \Phi \tag{2.43} \]
(assuming all players make their choice independently) and the expected payoff
for player i is
\[ R_i(s) = \sum_{\varphi \in \Phi} s(\varphi) R_i(\varphi). \tag{2.44} \]
By construction, Ri (s) is continuous.
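Formulas (2.43) and (2.44) translate directly into a short computation. The sketch below is illustrative: the function name `expected_payoff`, the dictionary layout and the sample game (matching pennies) are assumptions made for the example, and pure strategies are indexed from 0 rather than from 1.

```python
from itertools import product

def expected_payoff(payoff, strategies):
    """Expected payoffs (R_1(s), ..., R_n(s)) as in (2.44).

    payoff maps a tuple of pure strategies (k_1, ..., k_n) to (R_1, ..., R_n);
    strategies[i][k] is the probability s_i^k that player i picks pure strategy k.
    """
    n = len(strategies)
    totals = [0.0] * n
    for phi in product(*(range(len(s)) for s in strategies)):
        prob = 1.0
        for i, k in enumerate(phi):
            prob *= strategies[i][k]          # s(phi) from (2.43)
        for i in range(n):
            totals[i] += prob * payoff[phi][i]
    return totals

# Hypothetical sample game: matching pennies.
pennies = {
    (0, 0): (1, -1), (0, 1): (-1, 1),
    (1, 0): (-1, 1), (1, 1): (1, -1),
}
uniform = expected_payoff(pennies, [(0.5, 0.5), (0.5, 0.5)])
skewed = expected_payoff(pennies, [(1.0, 0.0), (0.25, 0.75)])
```

Since each R_i(s) is a polynomial in the probabilities s_i^k, the continuity claimed in the text is visible from the code: only sums and products of the entries of `strategies` occur.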

The question is of course, what is an optimal strategy for a player? If the
other strategies are known, a best reply of player i against s would be a strategy
s̄_i satisfying
\[ R_i(s \setminus \bar s_i) = \max_{\tilde s_i \in S_i} R_i(s \setminus \tilde s_i). \tag{2.45} \]
Here s\s̃_i denotes the strategy combination obtained from s by replacing s_i by
s̃_i. The set of all best replies against s for the i-th player is denoted by B_i(s).
Explicitly, s̄_i ∈ B_i(s) if and only if s̄_i^k = 0 whenever R_i(s\k) < max_{1≤l≤m_i} R_i(s\l)
(in particular B_i(s) ≠ ∅).
Let s, s̄ ∈ S, we call s̄ a best reply against s if s̄_i is a best reply against s for
all i. The set of all best replies against s is B(s) = \prod_{i=1}^n B_i(s).
A strategy combination s ∈ S is a Nash equilibrium for the game if it is a best
reply against itself, that is,
s ∈ B(s). (2.46)
Or, put differently, s is a Nash equilibrium if no player can increase his payoff by
changing his strategy as long as all others stick to their respective strategies. In
addition, if a player sticks to his equilibrium strategy, he is assured that his payoff
will not decrease no matter what the others do.
To illustrate these concepts, let us consider the famous prisoner's dilemma.
Here we have two players who can choose to defect or cooperate. The payoff is
symmetric for both players and given by the following diagram

    R_1 | d_2   c_2          R_2 | d_2   c_2
    d_1 |  0     2           d_1 |  0    −1          (2.47)
    c_1 | −1     1           c_1 |  2     1

where c_i or d_i means that player i cooperates or defects, respectively. It is easy

to see that the (pure) strategy pair (d1 , d2 ) is the only Nash equilibrium for this
game and that the expected payoff is 0 for both players. Of course, both players
could get the payoff 1 if they both agree to cooperate. But if one would break this
agreement in order to increase his payoff, the other one would get less. Hence it
might be safer to defect.
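The equilibrium claim for the prisoner's dilemma can be verified by brute force. The sketch below enumerates pure strategy pairs only, using the payoffs from (2.47); the function name and data layout are illustrative. Mixed equilibria are not searched, but since defecting strictly dominates cooperating for both players (0 > −1 and 2 > 1), none exist here.

```python
from itertools import product

# Payoffs (R_1, R_2) from (2.47), keyed by (action of player 1, action of player 2).
PAYOFF = {
    ("d", "d"): (0, 0),
    ("d", "c"): (2, -1),
    ("c", "d"): (-1, 2),
    ("c", "c"): (1, 1),
}
ACTIONS = ("d", "c")

def pure_nash_equilibria(payoff, actions):
    """All pure strategy pairs in which neither player gains by deviating alone."""
    result = []
    for a1, a2 in product(actions, repeat=2):
        r1, r2 = payoff[(a1, a2)]
        best1 = all(payoff[(b, a2)][0] <= r1 for b in actions)
        best2 = all(payoff[(a1, b)][1] <= r2 for b in actions)
        if best1 and best2:
            result.append((a1, a2))
    return result

equilibria = pure_nash_equilibria(PAYOFF, ACTIONS)
```

The list `equilibria` contains the single pair ("d", "d"), in agreement with the text.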
Now that we have seen that Nash equilibria are a useful concept, we want to
know when such an equilibrium exists. Luckily we have the following result.
Theorem 2.17 (Nash) Every n-person game has at least one Nash equilibrium.
Proof. The definition of a Nash equilibrium begs us to apply Kakutani’s theorem
to the set valued function s ↦ B(s). First of all, S is compact and convex and
so are the sets B(s). Next, observe that the closedness condition of Kakutani’s
theorem is satisfied since if s^m ∈ S and s̄^m ∈ B(s^m) both converge to s and s̄,
respectively, then (2.45) for s^m, s̄^m,
\[ R_i(s^m \setminus \tilde s_i) \le R_i(s^m \setminus \bar s_i^m), \qquad \tilde s_i \in S_i,\ 1 \le i \le n, \tag{2.48} \]
implies (2.45) for the limits s, s̄,
\[ R_i(s \setminus \tilde s_i) \le R_i(s \setminus \bar s_i), \qquad \tilde s_i \in S_i,\ 1 \le i \le n, \tag{2.49} \]
by continuity of R_i(s). □
2.6 Further properties of the degree

We now prove some additional properties of the mapping degree. The first one will
relate the degree in Rn with the degree in Rm . It will be needed later on to extend
the definition of degree to infinite dimensional spaces. By virtue of the canonical
embedding R^m ↪ R^m × {0} ⊂ R^n we can consider R^m as a subspace of R^n.
Theorem 2.18 (Reduction property) Let f ∈ C(Ū, R^m) and y ∈ R^m \ (𝟙 + f)(∂U), then
\[ \deg(\mathbb{1} + f, U, y) = \deg(\mathbb{1} + f_m, U_m, y), \tag{2.50} \]
where f_m = f|_{U_m} and U_m is the projection of U to R^m.
Proof. Choose an f̃ ∈ C^2(U, R^m) sufficiently close to f such that y ∈ RV(f̃).
Let x ∈ (𝟙 + f̃)^{-1}(y), then x = y − f̃(x) ∈ R^m implies (𝟙 + f̃)^{-1}(y) = (𝟙 + f̃_m)^{-1}(y).
Moreover,
\begin{align*}
J_{\mathbb{1}+\tilde f}(x) = \det(\mathbb{1} + \tilde f')(x) &= \det \begin{pmatrix} \delta_{ij} + \partial_j \tilde f_i(x) & \partial_j \tilde f_i(x) \\ 0 & \delta_{ij} \end{pmatrix} \\
&= \det(\delta_{ij} + \partial_j \tilde f_i) = J_{\mathbb{1}+\tilde f_m}(x) \tag{2.51}
\end{align*}
shows deg(𝟙 + f, U, y) = deg(𝟙 + f̃, U, y) = deg(𝟙 + f̃_m, U_m, y) = deg(𝟙 + f_m, U_m, y)
as desired. □
Let U ⊆ Rn and f ∈ C(U , Rn ) be as usual. By Theorem 2.2 we know that
deg(f, U, y) is the same for every y in a connected component of Rn \f (∂U ). We will
denote these components by Kj and write deg(f, U, y) = deg(f, U, Kj ) if y ∈ Kj .
Theorem 2.19 (Product formula) Let U ⊆ R^n be a bounded and open set and
denote by G_j the connected components of R^n \ f(∂U). If g ∘ f ∈ D_y(U, R^n), then
\[ \deg(g \circ f, U, y) = \sum_j \deg(f, U, G_j) \deg(g, G_j, y), \tag{2.52} \]
where only finitely many terms in the sum are nonzero.
Proof. Since f(Ū) is compact, we can find an r > 0 such that f(Ū) ⊆ B_r(0).
Moreover, since g^{-1}(y) is closed, g^{-1}(y) ∩ B̄_r(0) is compact and hence can be
covered by finitely many components {G_j}_{j=1}^m. In particular, the others will have
deg(f, U, G_k) = 0 and hence only finitely many terms in the above sum are nonzero.
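We begin by computing deg(g ∘ f, U, y) in the case where f, g ∈ C^1 and
y ∉ CV(g ∘ f). Since (g ∘ f)′(x) = g′(f(x)) f′(x) the claim is a straightforward
calculation
\begin{align*}
\deg(g \circ f, U, y) &= \sum_{x \in (g \circ f)^{-1}(y)} \operatorname{sgn}(J_{g \circ f}(x)) \\
&= \sum_{x \in (g \circ f)^{-1}(y)} \operatorname{sgn}(J_g(f(x))) \operatorname{sgn}(J_f(x)) \\
&= \sum_{z \in g^{-1}(y)} \operatorname{sgn}(J_g(z)) \sum_{x \in f^{-1}(z)} \operatorname{sgn}(J_f(x)) \\
&= \sum_{z \in g^{-1}(y)} \operatorname{sgn}(J_g(z)) \deg(f, U, z)
\end{align*}
and, using our cover {G_j}_{j=1}^m,
\begin{align*}
\deg(g \circ f, U, y) &= \sum_{j=1}^m \sum_{z \in g^{-1}(y) \cap G_j} \operatorname{sgn}(J_g(z)) \deg(f, U, z) \\
&= \sum_{j=1}^m \deg(f, U, G_j) \sum_{z \in g^{-1}(y) \cap G_j} \operatorname{sgn}(J_g(z)) \tag{2.53} \\
&= \sum_{j=1}^m \deg(f, U, G_j) \deg(g, G_j, y). \tag{2.54}
\end{align*}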
Moreover, this formula still holds for y ∈ CV(g ◦ f ) and for g ∈ C by construction
of the Brouwer degree. However, the case f ∈ C will need a closer investigation
since the sets Gj depend on f . To overcome this problem we will introduce the
sets
Ll = {z ∈ Rn \f (∂U )| deg(f, U, z) = l}. (2.55)
Observe that L_l, l > 0, must be a union of some sets of {G_j}_{j=1}^m.
Now choose f̃ ∈ C^1 such that |f(x) − f̃(x)| < 2^{-1} dist(g^{-1}(y), f(∂U)) for x ∈
Ū and define G̃_j, L̃_l accordingly. Then we have L_l ∩ g^{-1}(y) = L̃_l ∩ g^{-1}(y) by
Theorem 2.1 (iii). Moreover,
\begin{align*}
\deg(g \circ f, U, y) &= \deg(g \circ \tilde f, U, y) = \sum_j \deg(\tilde f, U, \tilde G_j) \deg(g, \tilde G_j, y) \\
&= \sum_{l > 0} l \deg(g, \tilde L_l, y) = \sum_{l > 0} l \deg(g, L_l, y) \\
&= \sum_j \deg(f, U, G_j) \deg(g, G_j, y) \tag{2.56}
\end{align*}
which proves the claim. 2
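For n = 2 the product formula (2.52) can be tested with winding numbers. The sketch below is an illustration under stated assumptions: R^2 is identified with the complex plane, U is the unit disc, and f(z) = z³, g(z) = z² and the value of y are sample choices. Here f(∂U) is the unit circle, so R^2 \ f(∂U) has the bounded component G_1 (the open disc) and an unbounded component G_2.

```python
import cmath
import math

def winding(h, y, samples=4000):
    """Winding number of h(z) - y around 0 as z runs once around the unit circle."""
    total = 0.0
    prev = cmath.phase(h(1 + 0j) - y)
    for k in range(1, samples + 1):
        z = cmath.exp(2j * math.pi * k / samples)
        ang = cmath.phase(h(z) - y)
        # wrap the angle increment into [-pi, pi)
        total += (ang - prev + math.pi) % (2 * math.pi) - math.pi
        prev = ang
    return round(total / (2 * math.pi))

f = lambda z: z ** 3
g = lambda z: z ** 2
y = 0.3 + 0.1j

deg_composition = winding(lambda z: g(f(z)), y)   # deg(g o f, U, y)
deg_f_inner = winding(f, 0.2)                     # deg(f, U, G_1), evaluated at a point of G_1
deg_f_outer = winding(f, 2.0)                     # deg(f, U, G_2), evaluated at a point of G_2
deg_g_inner = winding(g, y)                       # deg(g, G_1, y), since the boundary of G_1 is the unit circle
```

With these choices the left-hand side of (2.52) is 6 and the right-hand side is 3 · 2 plus a vanishing contribution from the unbounded component.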
2.7 The Jordan curve theorem

In this section we want to show how the product formula (2.52) for the Brouwer
degree can be used to prove the famous Jordan curve theorem which states that
a homeomorphic image of the circle dissects R2 into two components (which nec-
essarily have the image of the circle as common boundary). In fact, we will even
prove a slightly more general result.
Theorem 2.20 Let Cj ⊂ Rn , j = 1, 2, be homeomorphic compact sets. Then

Rn \C1 and Rn \C2 have the same number of connected components.
Proof. Denote the components of Rn \C1 by Hj and those of Rn \C2 by Kj . Let

h : C1 → C2 be a homeomorphism with inverse k : C2 → C1 . By Theorem 2.15 we
can extend both to Rn . Then Theorem 2.1 (iii) and the product formula imply
\[ 1 = \deg(k \circ h, H_j, y) = \sum_l \deg(h, H_j, G_l) \deg(k, G_l, y) \tag{2.57} \]
for any y ∈ H_j. Now we have
\[ \bigcup_i K_i = \mathbb{R}^n \setminus C_2 \subseteq \mathbb{R}^n \setminus h(\partial H_j) = \bigcup_l G_l \tag{2.58} \]
and hence for every i we have K_i ⊆ G_l for some l since components are maximal
connected sets. Let N_l = {i | K_i ⊆ G_l} and observe that we have deg(k, G_l, y) =
\sum_{i∈N_l} deg(k, K_i, y) and deg(h, H_j, G_l) = deg(h, H_j, K_i) for every i ∈ N_l. Therefore,
\[ 1 = \sum_l \sum_{i \in N_l} \deg(h, H_j, K_i) \deg(k, K_i, y) = \sum_i \deg(h, H_j, K_i) \deg(k, K_i, H_j). \tag{2.59} \]
By reversing the role of C_1 and C_2, the same formula holds with H_j and K_i
interchanged.
Hence
\[ \sum_i 1 = \sum_i \sum_j \deg(h, H_j, K_i) \deg(k, K_i, H_j) = \sum_j 1 \tag{2.60} \]
shows that if the number of components of R^n \ C_1 or R^n \ C_2 is finite, then so is
the other and both are equal. Otherwise there is nothing to prove. □
Chapter 3
The Leray–Schauder mapping degree
3.1 The mapping degree on finite dimensional Banach spaces
The objective of this section is to extend the mapping degree from Rn to general
Banach spaces. Naturally, we will first consider the finite dimensional case.
Let X be a (real) Banach space of dimension n and let φ be any isomorphism
between X and Rn . Then, for f ∈ Dy (U , X), U ⊂ X open, y ∈ X, we can define
deg(f, U, y) = deg(φ ◦ f ◦ φ−1 , φ(U ), φ(y)) (3.1)
provided this definition is independent of the basis chosen. To see this let ψ be a
second isomorphism. Then A = ψ ◦ φ−1 ∈ GL(n). Abbreviate f ∗ = φ ◦ f ◦ φ−1 ,
y^* = φ(y) and pick f̃^* ∈ C_y^1(φ(U), R^n) in the same component of D_y(φ(U), R^n)
as f^* such that y^* ∈ RV(f̃^*). Then A ∘ f̃^* ∘ A^{-1} ∈ C_y^1(ψ(U), R^n) is in the same
component of D_y(ψ(U), R^n) as A ∘ f^* ∘ A^{-1} = ψ ∘ f ∘ ψ^{-1} (since A is also a
homeomorphism) and
\[ J_{A \circ \tilde f^* \circ A^{-1}}(A y^*) = \det(A)\, J_{\tilde f^*}(y^*) \det(A^{-1}) = J_{\tilde f^*}(y^*) \tag{3.2} \]
by the chain rule. Thus we have deg(ψ ◦ f ◦ ψ −1 , ψ(U ), ψ(y)) = deg(φ ◦ f ◦
φ−1 , φ(U ), φ(y)) and our definition is independent of the basis chosen. In addition,
it inherits all properties from the mapping degree in Rn . Note also that the re-
duction property holds if Rm is replaced by an arbitrary subspace X1 since we can
always choose φ : X → Rn such that φ(X1 ) = Rm .
Our next aim is to tackle the infinite dimensional case. The general idea is to
approximate F by finite dimensional operators (in the same spirit as we approx-
imated continuous f by smooth functions). To do this we need to know which
operators can be approximated by finite dimensional operators. Hence we have to
recall some basic facts first.
3.2 Compact operators

Let X, Y be Banach spaces and U ⊂ X. An operator F : U ⊂ X → Y is called
finite dimensional if its range is finite dimensional. In addition, it is called compact
if it is continuous and maps bounded sets into relatively compact ones. The set
of all compact operators is denoted by 𝒞(U, Y) and the set of all compact, finite
dimensional operators is denoted by ℱ(U, Y). Both sets are normed linear spaces
and we have ℱ(U, Y) ⊆ 𝒞(U, Y) ⊆ C(U, Y), where C(U, Y) denotes the continuous operators.
If U is compact, then 𝒞(U, Y) = C(U, Y) (since the continuous image of a compact
set is compact) and if dim(Y) < ∞, then ℱ(U, Y) = 𝒞(U, Y). In particular,
if U ⊂ R^n is bounded, then ℱ(Ū, R^n) = 𝒞(Ū, R^n) = C(Ū, R^n).
Now let us collect some results to be needed in the sequel.
Lemma 3.1 If K ⊂ X is compact, then for every ε > 0 there is a finite di-
mensional subspace Xε ⊆ X and a continuous map Pε : K → Xε such that
|Pε (x) − x| ≤ ε for all x ∈ K.
Proof. Pick {x_i}_{i=1}^n ⊆ K such that ⋃_{i=1}^n B_ε(x_i) covers K. Let {φ_i}_{i=1}^n be
a partition of unity (restricted to K) subordinate to {B_ε(x_i)}_{i=1}^n, that is, φ_i ∈
C(K, [0, 1]) with supp(φ_i) ⊂ B_ε(x_i) and \sum_{i=1}^n φ_i(x) = 1, x ∈ K. Set
\[ P_\varepsilon(x) = \sum_{i=1}^n \phi_i(x)\, x_i, \tag{3.3} \]
then
\[ |P_\varepsilon(x) - x| = \Big| \sum_{i=1}^n \phi_i(x)\, x - \sum_{i=1}^n \phi_i(x)\, x_i \Big| \le \sum_{i=1}^n \phi_i(x)\, |x - x_i| \le \varepsilon. \tag{3.4} \]
□
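The map P_ε from (3.3) is straightforward to implement for a concrete compact set. The sketch below is a toy illustration in R^2: the compact set K = {(t, t²) : 0 ≤ t ≤ 1}, the choice of centers and the hat-shaped weights max(0, ε − |x − x_i|) (normalized to sum to one) are assumptions made for the example, not the construction of the text.

```python
import math

def schauder_projection(x, centers, eps):
    """P_eps(x) = sum_i phi_i(x) x_i, with phi_i proportional to max(0, eps - |x - x_i|)."""
    weights = [max(0.0, eps - math.dist(x, c)) for c in centers]
    total = sum(weights)          # positive whenever the eps-balls around the centers cover x
    return tuple(sum(w * c[j] for w, c in zip(weights, centers)) / total
                 for j in range(len(x)))

# Centers on K = {(t, t^2) : 0 <= t <= 1}; consecutive centers are close enough
# that the balls of radius eps around them cover K.
centers = [(i / 20, (i / 20) ** 2) for i in range(21)]
eps = 0.1

samples = [(k / 500, (k / 500) ** 2) for k in range(501)]
errors = [math.dist(schauder_projection(x, centers, eps), x) for x in samples]
```

Each value in `errors` is at most ε, as in (3.4), because P_ε(x) is a convex combination of centers lying within distance ε of x. The range of P_ε lies in the span of the finitely many centers, which is the finite dimensional subspace X_ε of the lemma.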
This lemma enables us to prove the following important result.
Theorem 3.2 Let U be bounded, then the closure of ℱ(U, Y) in C(U, Y) is the set 𝒞(U, Y) of compact operators.
Proof. Suppose F_N ∈ 𝒞(U, Y) converges to F. If F ∉ 𝒞(U, Y) then we can find
a sequence x_n ∈ U such that |F(x_n) − F(x_m)| ≥ ρ > 0 for n ≠ m. If N is so large
that |F − F_N| ≤ ρ/4, then
\begin{align*}
|F_N(x_n) - F_N(x_m)| &\ge |F(x_n) - F(x_m)| - |F_N(x_n) - F(x_n)| - |F_N(x_m) - F(x_m)| \\
&\ge \rho - 2\,\frac{\rho}{4} = \frac{\rho}{2}, \tag{3.5}
\end{align*}
contradicting the compactness of F_N. This contradiction shows that the closure of ℱ(U, Y)
is contained in 𝒞(U, Y). Conversely, let K be the closure of F(U) and choose
P_ε according to Lemma 3.1, then F_ε = P_ε ∘ F ∈ ℱ(U, Y) converges to F. Hence
𝒞(U, Y) is contained in the closure of ℱ(U, Y) and we are done. □
Finally, let us show some interesting properties of mappings 𝟙 + F, where
F ∈ 𝒞(U, X).
Lemma 3.3 Let U be bounded and closed. Suppose F ∈ 𝒞(U, X), then 𝟙 + F is
proper (i.e., inverse images of compact sets are compact) and maps closed subsets
to closed subsets.
Proof. Let A ⊆ U be closed and suppose y_n = (𝟙 + F)(x_n) ∈ (𝟙 + F)(A) converges
to some y. Since y_n − x_n = F(x_n) lies in the relatively compact set F(U), we can
assume that y_n − x_n → z after passing to a subsequence and hence
x_n → x = y − z ∈ A. Since y = x + F(x) ∈ (𝟙 + F)(A) by continuity of F,
(𝟙 + F)(A) is closed.
Next, let K ⊂ X be compact and let {x_n} ⊆ (𝟙 + F)^{-1}(K).
Then we can pass to a subsequence y_{n_m} = x_{n_m} + F(x_{n_m}) such that y_{n_m} → y. As
before this implies x_{n_m} → x (after passing to a further subsequence) and thus
(𝟙 + F)^{-1}(K) is compact. □
Now we are all set for the definition of the Leray–Schauder degree, that is, for
the extension of our degree to infinite dimensional Banach spaces.
3.3 The Leray–Schauder mapping degree

For U ⊂ X we set D_y(Ū, X) = {F ∈ 𝒞(Ū, X) | y ∉ (𝟙 + F)(∂U)} and ℱ_y(Ū, X) =
{F ∈ ℱ(Ū, X) | y ∉ (𝟙 + F)(∂U)}. Note that for F ∈ D_y(Ū, X) we have
dist(y, (𝟙 + F)(∂U)) > 0 since 𝟙 + F maps closed sets to closed sets.
Abbreviate ρ = dist(y, (𝟙 + F)(∂U)) and pick F_1 ∈ ℱ(Ū, X) such that
|F − F_1| < ρ implying F_1 ∈ ℱ_y(Ū, X). Next, let X_1 be a finite dimensional subspace
of X such that F_1(Ū) ⊂ X_1, y ∈ X_1 and set U_1 = U ∩ X_1. Then we have
F_1 ∈ ℱ_y(Ū_1, X_1) and might define
deg(𝟙 + F, U, y) = deg(𝟙 + F_1, U_1, y) (3.6)
provided we show that this definition is independent of F_1 and X_1 (as above).
Pick another operator F_2 ∈ ℱ(Ū, X) such that |F − F_2| < ρ and let X_2 be a
corresponding finite dimensional subspace as above. Consider X_0 = X_1 + X_2,
U_0 = U ∩ X_0, then F_i ∈ ℱ_y(Ū_0, X_0), i = 1, 2, and
deg(𝟙 + F_i, U_0, y) = deg(𝟙 + F_i, U_i, y), i = 1, 2, (3.7)
by the reduction property. Moreover, set H(t) = 𝟙 + (1 − t)F_1 + t F_2 implying
H(t) − 𝟙 ∈ ℱ_y(Ū_0, X_0), t ∈ [0, 1], since |H(t) − (𝟙 + F)| < ρ for t ∈ [0, 1]. Hence
homotopy invariance
deg(𝟙 + F_1, U_0, y) = deg(𝟙 + F_2, U_0, y) (3.8)
shows that (3.6) is independent of F_1, X_1.
Theorem 3.4 Let U be a bounded open subset of a (real) Banach space X and let
F ∈ D_y(Ū, X), y ∈ X. Then the following hold true.
(i). deg(𝟙 + F, U, y) = deg(𝟙 + F − y, U, 0).
(ii). deg(𝟙, U, y) = 1 if y ∈ U.
(iii). If U_1, U_2 are open, disjoint subsets of U such that y ∉ (𝟙 + F)(Ū\(U_1 ∪ U_2)), then
deg(𝟙 + F, U, y) = deg(𝟙 + F, U_1, y) + deg(𝟙 + F, U_2, y).
(iv). If H : [0, 1] × Ū → X and y : [0, 1] → X are both continuous such that H(t) ∈
D_{y(t)}(Ū, X), t ∈ [0, 1], then deg(𝟙 + H(0), U, y(0)) = deg(𝟙 + H(1), U, y(1)).
Proof. Except for (iv) all statements follow easily from the definition of the
degree and the corresponding property for the degree in finite dimensional spaces.
Considering H(t, x) − y(t), we can assume y(t) = 0 by (i). Since H([0, 1] × ∂U)
is relatively compact, we have ρ = dist(0, {x + H(t, x) | t ∈ [0, 1], x ∈ ∂U}) > 0
(argue as in Lemma 3.3). By Theorem 3.2 we can pick
H_1 ∈ ℱ([0, 1] × Ū, X) such that |H(t) − H_1(t)| < ρ, t ∈ [0, 1]. This implies
deg(𝟙 + H(t), U, 0) = deg(𝟙 + H_1(t), U, 0) and the rest follows from Theorem 2.2. □
In addition, Theorem 2.1 and Theorem 2.2 hold for the new situation as well
(no changes are needed in the proofs).
Theorem 3.5 Let F, G ∈ D_y(Ū, X), then the following statements hold.
(i). We have deg(𝟙 + F, ∅, y) = 0. Moreover, if U_i, 1 ≤ i ≤ N, are disjoint
open subsets of U such that y ∉ (𝟙 + F)(Ū \ ⋃_{i=1}^N U_i), then
deg(𝟙 + F, U, y) = ∑_{i=1}^N deg(𝟙 + F, U_i, y).
(ii). If y ∉ (𝟙 + F)(U), then deg(𝟙 + F, U, y) = 0 (but not the other way round).
Equivalently, if deg(𝟙 + F, U, y) ≠ 0, then y ∈ (𝟙 + F)(U).
(iii). If |F(x) − G(x)| < dist(y, (𝟙 + F)(∂U)), x ∈ ∂U, then
deg(𝟙 + F, U, y) = deg(𝟙 + G, U, y). In particular, this is true if F(x) = G(x) for x ∈ ∂U.
(iv). deg(𝟙 + ., U, y) is constant on each component of D_y(Ū, X).
(v). deg(𝟙 + F, U, .) is constant on each component of X\(𝟙 + F)(∂U).
3.4 The Leray–Schauder principle and the Schauder fixed-point theorem
As a first consequence we note the Leray–Schauder principle which says that a
priori estimates yield existence.
Theorem 3.6 (Leray–Schauder principle) Suppose F ∈ 𝒞(X, X) and any solution
x of x = tF(x), t ∈ [0, 1], satisfies the a priori bound |x| ≤ M for some
M > 0, then F has a fixed point.
Proof. Pick ρ > M and observe deg(𝟙 − F, B_ρ(0), 0) = deg(𝟙, B_ρ(0), 0) = 1
using the compact homotopy H(t, x) = −tF(x). Here 0 ∉ (𝟙 + H(t))(∂B_ρ(0)) due to the
a priori bound. Hence 𝟙 − F has a zero in B_ρ(0), that is, F has a fixed point. □
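A one-dimensional toy case may help to see how an a priori bound turns into existence. The sketch below assumes X = R and F(x) = cos(x) (both chosen for illustration only). Every solution of x = t cos(x) satisfies |x| ≤ 1 = M, and on the ball (−ρ, ρ) with ρ = 2 > M the nonvanishing degree shows up as a sign change of x − tF(x), which bisection exploits.

```python
import math

def bisect(g, lo, hi, tol=1e-12):
    """Find a zero of a continuous g with g(lo) < 0 < g(hi) by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

F = math.cos          # assumed example of a compact map on X = R
M, rho = 1.0, 2.0     # a priori bound M and ball radius rho > M

# Solve x = t F(x) along the homotopy t = 0, 0.1, ..., 1.
ts = [k / 10 for k in range(11)]
roots = [bisect(lambda x, t=t: x - t * F(x), -rho, rho) for t in ts]

# The a priori bound holds along the whole homotopy ...
a_priori_ok = all(abs(x) <= M for x in roots)
# ... and the solution for t = 1 is a fixed point of F.
fixed_point = roots[-1]
```

In this example no solution ever reaches the boundary |x| = ρ, which is exactly what the homotopy argument in the proof requires.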
Now we can extend the Brouwer fixed-point theorem to infinite dimensional
spaces as well.
Theorem 3.7 (Schauder fixed point) Let K be a closed, convex, and bounded
subset of a Banach space X. If F ∈ 𝒞(K, K), then F has at least one fixed
point. The result remains valid if K is only homeomorphic to a closed, convex,
and bounded subset.
Proof. Since K is bounded, there is a ρ > 0 such that K ⊆ B_ρ(0). By
Theorem 2.15 we can find a continuous retraction R : X → K (i.e., R(x) = x
for x ∈ K) and consider F̃ = F ∘ R ∈ 𝒞(B̄_ρ(0), B̄_ρ(0)). The compact homotopy
H(t, x) = −tF̃(x) shows that deg(𝟙 − F̃, B_ρ(0), 0) = deg(𝟙, B_ρ(0), 0) = 1. Hence
there is a point x_0 = F̃(x_0) ∈ K. Since F̃(x_0) = F(x_0) for x_0 ∈ K we are done. □
Finally, let us prove another fixed-point theorem which covers several others
as special cases.
Theorem 3.8 Let U ⊂ X be open and bounded and let F ∈ 𝒞(Ū, X). Suppose
there is an x_0 ∈ U such that
F(x) − x_0 ≠ α(x − x_0), x ∈ ∂U, α ∈ (1, ∞). (3.9)
Then F has a fixed point.
Proof. Consider H(t, x) = x − x_0 − t(F(x) − x_0), then we have H(t, x) ≠ 0
for x ∈ ∂U and t ∈ [0, 1) by assumption. If H(1, x) = 0 for some x ∈ ∂U,
then x is a fixed point and we are done. Otherwise we have deg(𝟙 − F, U, 0) =
deg(𝟙 − x_0, U, 0) = deg(𝟙, U, x_0) = 1 and hence F has a fixed point. □
Now we come to the anticipated corollaries.
Corollary 3.9 Let U ⊂ X be open and bounded and let F ∈ 𝒞(Ū, X). Then F
has a fixed point if one of the following conditions holds.
1. U = B_ρ(0) and F(∂U) ⊆ Ū (Rothe).
2. U = B_ρ(0) and |F(x) − x|² ≥ |F(x)|² − |x|² for x ∈ ∂U (Altman).
3. X is a Hilbert space, U = B_ρ(0) and ⟨F(x), x⟩ ≤ |x|² for x ∈ ∂U (Krasnosel'skii).
Proof. In each case we verify (3.9) with x_0 = 0. (1). F(∂U) ⊆ Ū and F(x) = αx
for |x| = ρ implies |α|ρ ≤ ρ and hence (3.9) holds. (2). F(x) = αx for |x| = ρ
implies (α − 1)²ρ² ≥ (α² − 1)ρ² and hence α ≤ 1. (3). Special case of (2) since
|F(x) − x|² = |F(x)|² − 2⟨F(x), x⟩ + |x|². □
3.5 Applications to integral and differential equations
In this section we want to show how our results can be applied to integral and
differential equations. To be able to apply our results we will need to know that
certain integral operators are compact.
Lemma 3.10 Suppose I = [a, b] ⊂ R and f ∈ C(I × I × R^n, R^n), τ ∈ C(I, I),
then
\[ F : C(I, \mathbb{R}^n) \to C(I, \mathbb{R}^n), \qquad x(t) \mapsto F(x)(t) = \int_a^{\tau(t)} f(t, s, x(s))\,ds \tag{3.10} \]
is compact.
Proof. We first need to prove that F is continuous. Fix x0 ∈ C(I, Rn ) and

ε > 0. Set ρ = |x0 | + 1 and abbreviate B = Bρ (0) ⊂ Rn . The function f is
uniformly continuous on Q = I ×I ×B since Q is compact. Hence for ε1 = ε/(b−a)
we can find a δ ∈ (0, 1] such that |f (t, s, x) − f (t, s, y)| ≤ ε1 for |x − y| < δ. But
this implies
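\[
\begin{aligned}
|F(x) - F(x_0)| &= \sup_{t\in I} \left| \int_a^{\tau(t)} \big( f(t, s, x(s)) - f(t, s, x_0(s)) \big)\,ds \right| \\
&\le \sup_{t\in I} \int_a^{\tau(t)} |f(t, s, x(s)) - f(t, s, x_0(s))|\,ds \\
&\le \sup_{t\in I}\, (b - a)\varepsilon_1 = \varepsilon,
\end{aligned} \tag{3.11}
\]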
for |x−x0 | < δ. In other words, F is continuous. Next we note that if U ⊂ C(I, Rn )
is bounded, say |U | < ρ, then
\[ |F(U)| \le \sup_{x\in U,\ t\in I} \left| \int_a^{\tau(t)} f(t, s, x(s))\,ds \right| \le (b - a)M, \tag{3.12} \]
where M = max |f (I, I, Bρ (0))|. Moreover, the family F (U ) is equicontinuous.

Fix ε and ε1 = ε/(2(b − a)), ε2 = ε/(2M ). Since f and τ are uniformly continuous
on I × I × Bρ (0) and I, respectively, we can find a δ > 0 such that |f (t, s, x) −
f (t0 , s, x)| ≤ ε1 and |τ (t)−τ (t0 )| ≤ ε2 for |t−t0 | < δ. Hence we infer for |t−t0 | < δ
\[
\begin{aligned}
|F(x)(t) - F(x)(t_0)| &= \left| \int_a^{\tau(t)} f(t, s, x(s))\,ds - \int_a^{\tau(t_0)} f(t_0, s, x(s))\,ds \right| \\
&\le \int_a^{\tau(t_0)} |f(t, s, x(s)) - f(t_0, s, x(s))|\,ds + \left| \int_{\tau(t_0)}^{\tau(t)} |f(t, s, x(s))|\,ds \right| \\
&\le (b - a)\varepsilon_1 + \varepsilon_2 M = \varepsilon.
\end{aligned} \tag{3.13}
\]
This implies that F(U) is relatively compact by the Arzelà-Ascoli theorem. Thus
F is compact. □
As a first application we use this result to show existence of solutions to integral
equations.
Theorem 3.11 Let F be as in the previous lemma. Then the integral equation
x − λF(x) = y, λ ∈ R, y ∈ C(I, R^n) (3.14)
has at least one solution x ∈ C(I, R^n) if |λ| ≤ ρ/M(ρ), where
M(ρ) = (b − a) max_{(t,s,x) ∈ I×I×B̄_ρ(0)} |f(t, s, x + y(s))| and ρ > 0 is arbitrary.
Proof. Note that, by our assumption on λ, the compact map x ↦ y + λF(x) maps
B̄_ρ(y) into itself. Now apply the Schauder fixed-point theorem. □
This result immediately gives the Peano theorem for ordinary differential equa-
tions.
Theorem 3.12 (Peano) Consider the initial value problem
ẋ = f(t, x), x(t_0) = x_0, (3.15)
where f ∈ C(I × R^n, R^n) and I ⊂ R is an interval containing t_0. Then (3.15) has
at least one local solution x ∈ C^1([t_0 − ε, t_0 + ε], R^n), ε > 0. For example, any
ε satisfying εM(ε, ρ) ≤ ρ, ρ > 0, with M(ε, ρ) = max |f([t_0 − ε, t_0 + ε], B̄_ρ(x_0))|
works. In addition, if M(ε, ρ) ≤ M̃(ε)(1 + ρ), then there exists a global solution.
Proof. For notational simplicity we make the shift t → t − t_0, x → x − x_0,
f(t, x) → f(t + t_0, x + x_0) and assume t_0 = 0, x_0 = 0. In addition, it suffices to
consider t ≥ 0 since t → −t amounts to f → −f.
Now observe that (3.15) is equivalent to
\[ x(t) - \int_0^t f(s, x(s))\,ds = 0, \qquad x \in C([-\varepsilon, \varepsilon], \mathbb{R}^n), \tag{3.16} \]
and the first part follows from our previous theorem. To show the second, fix ε > 0
and assume M(ε, ρ) ≤ M̃(ε)(1 + ρ). Then
\[ |x(t)| \le \int_0^t |f(s, x(s))|\,ds \le \tilde{M}(\varepsilon) \int_0^t (1 + |x(s)|)\,ds \tag{3.17} \]
implies |x(t)| ≤ exp(M̃(ε)ε) by Gronwall's inequality. Hence we have an a priori
bound which implies existence by the Leray–Schauder principle. Since ε was
arbitrary we are done. □
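Peano's theorem asserts existence only; uniqueness can fail when f is merely continuous. A classical illustration (not taken from the text, so the example and the helper name integral_residual are assumptions) is f(t, x) = sqrt(|x|) with x(0) = 0, where both x(t) = 0 and x(t) = t²/4 solve the integral equation for t ≥ 0. The sketch below checks this numerically with the midpoint rule.

```python
import math

def integral_residual(x, f, t, n=1000):
    """Return x(t) - int_0^t f(s, x(s)) ds, using the midpoint rule."""
    h = t / n
    integral = h * sum(f((i + 0.5) * h, x((i + 0.5) * h)) for i in range(n))
    return x(t) - integral

# Assumed example: continuous but not Lipschitz right hand side.
f = lambda t, x: math.sqrt(abs(x))

zero_solution = lambda t: 0.0       # first solution of x' = sqrt|x|, x(0) = 0
parabola = lambda t: t * t / 4.0    # second solution, valid for t >= 0

times = [0.1 * k for k in range(1, 11)]
res_zero = max(abs(integral_residual(zero_solution, f, t)) for t in times)
res_parabola = max(abs(integral_residual(parabola, f, t)) for t in times)
```

Both residuals vanish up to rounding, so the integral equation has at least two solutions through the same initial value.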
Chapter 4
The stationary Navier–Stokes equation
4.1 Introduction and motivation

In this chapter we turn to partial differential equations. In fact, we will only
consider one example, namely the stationary Navier–Stokes equation. Our goal is
to use the Leray–Schauder principle to prove an existence and uniqueness result
for solutions.
Let U (≠ ∅) be an open, bounded, and connected subset of R^3. We assume that
U is filled with an incompressible fluid described by its velocity field v_j(t, x) and its
pressure p(t, x), (t, x) ∈ R × U. The requirement that our fluid is incompressible
implies ∂_j v_j = 0 (we sum over two equal indices from 1 to 3), which follows from
the Gauss theorem since the flux through any closed surface must be zero.
Rather than just writing down the equation, let me give a short physical mo-
tivation. To obtain the equation which governs such a fluid we consider the forces
acting on a small cube spanned by the points (x1 , x2 , x3 ) and (x1 + ∆x1 , x2 +
∆x2 , x3 + ∆x3 ). We have three contributions from outer forces, pressure differ-
ences, and viscosity.
The outer force density (force per volume) will be denoted by Kj and we assume
that it is known (e.g. gravity).
The force from pressure acting on the surface through (x1 , x2 , x3 ) normal to
the x1 -direction is p∆x2 ∆x3 δ1j . The force from pressure acting on the opposite
surface is −(p + ∂1 p∆x1 )∆x2 ∆x3 δ1j . In summary, we obtain
− (∂j p)∆V, (4.1)
where ∆V = ∆x1 ∆x2 ∆x3 .

The viscosity acting on the surface through (x1 , x2 , x3 ) normal to the x1 -
direction is −η∆x2 ∆x3 ∂1 vj by some physical law. Here η > 0 is the viscosity
constant of the fluid. On the opposite surface we have η∆x2 ∆x3 ∂1 (vj + ∂1 vj ∆x1 ).
Adding up the contributions of all surfaces we end up with
η∆V ∂_i ∂_i v_j. (4.2)
Putting it all together we obtain from Newton's law
\[ \rho\,\Delta V\, \frac{d}{dt} v_j(t, x(t)) = \eta\,\Delta V\, \partial_i \partial_i v_j(t, x(t)) - \Delta V\, \partial_j p(t, x(t)) + \Delta V\, K_j(t, x(t)), \tag{4.3} \]
where ρ > 0 is the density of the fluid. Dividing by ∆V and using the chain rule
yields the Navier–Stokes equation
ρ∂t vj = η∂i ∂i vj − ρ(vi ∂i )vj − ∂j p + Kj . (4.4)
Note that it is no restriction to assume ρ = 1.

In what follows we will only consider the stationary Navier–Stokes equation
0 = η∂i ∂i vj − (vi ∂i )vj − ∂j p + Kj . (4.5)
In addition to the incompressibility condition ∂j vj = 0 we also require the bound-

ary condition v|∂U = 0, which follows from experimental observations.
In summary, we consider the problem (4.5) for v in (e.g.) X = {v ∈ C^2(Ū, R^3) |
∂_j v_j = 0 and v|_{∂U} = 0}.
Our strategy is to rewrite the stationary Navier–Stokes equation in integral
form, which is more suitable for our further analysis. For this purpose we need to
introduce some function spaces first.
4.2 An insert on Sobolev spaces

Let U be a bounded open subset of R^n and let L^p(U, R) denote the Lebesgue spaces
of p integrable functions with norm
\[ |u|_p = \left( \int_U |u(x)|^p\,dx \right)^{1/p}. \tag{4.6} \]
In the case p = 2 we even have a scalar product
\[ \langle u, v \rangle_2 = \int_U u(x) v(x)\,dx \tag{4.7} \]
and our aim is to extend this case to include derivatives.
Given the set C^1(U, R) we can consider the scalar product
\[ \langle u, v \rangle_{2,1} = \int_U u(x) v(x)\,dx + \int_U (\partial_j u)(x) (\partial_j v)(x)\,dx. \tag{4.8} \]
Taking the completion with respect to the associated norm we obtain the Sobolev
space H 1 (U, R). Similarly, taking the completion of C01 (U, R) with respect to the
same norm, we obtain the Sobolev space H01 (U, R). Here C0r (U, Y ) denotes the
set of functions in C r (U, Y ) with compact support. This construction of H 1 (U, R)
implies that a sequence uk in C 1 (U, R) converges to u ∈ H 1 (U, R) if and only if uk
and all its first order derivatives ∂j uk converge in L2 (U, R). Hence we can assign
each u ∈ H 1 (U, R) its first order derivatives ∂j u by taking the limits from above. In
order to show that this is a useful generalization of the ordinary derivative, we need
to show that the derivative depends only on the limiting function u ∈ L2 (U, R).
To see this we need the following lemma.
Lemma 4.1 (Integration by parts) Suppose u ∈ H_0^1(U, R) and v ∈ H^1(U, R),
then
\[ \int_U u\, (\partial_j v)\,dx = - \int_U (\partial_j u)\, v\,dx. \tag{4.9} \]
Proof. By continuity it is no restriction to assume u ∈ C01 (U, R) and v ∈

C 1 (U, R). Moreover, we can find a function φ ∈ C01 (U, R) which is 1 on the
support of u. Hence by considering φv we can even assume v ∈ C01 (U, R).
Moreover, we can replace U by a rectangle K containing U and extend u, v to
K by setting it 0 outside U. Now use integration by parts with respect to the j-th
coordinate. □
In particular, this lemma says that if u ∈ H^1(U, R), then
\[ \int_U (\partial_j u)\, \phi\,dx = - \int_U u\, (\partial_j \phi)\,dx, \qquad \phi \in C_0^\infty(U, \mathbb{R}). \tag{4.10} \]
And since C0∞ (U, R) is dense in L2 (U, R), the derivatives are uniquely determined
by u ∈ L2 (U, R) alone. Moreover, if u ∈ C 1 (U, R), then the derivative in the
Sobolev space corresponds to the usual derivative. In summary, H 1 (U, R) is the

space of all functions u ∈ L2 (U, R) which have first order derivatives (in the sense
of distributions, i.e., (4.10)) in L2 (U, R).
Next, we want to consider some additional properties which will be used later
on. First of all, the Poincaré-Friedrichs inequality.
Lemma 4.2 (Poincaré-Friedrichs inequality) Suppose u ∈ H_0^1(U, R), then
\[ \int_U u^2\,dx \le d_j^2 \int_U (\partial_j u)^2\,dx \qquad \text{(no summation over } j\text{)}, \tag{4.11} \]
where d_j = sup{|x_j − y_j| | (x_1, . . . , x_n), (y_1, . . . , y_n) ∈ U}.
Proof. Again we can assume u ∈ C01 (U, R) and we assume j = 1 for notational
convenience. Replace U by a set K = [a, b] × K̃ containing U and extend u to K
by setting it 0 outside U . Then we have
\[
\begin{aligned}
u(x_1, x_2, \dots, x_n)^2 &= \left( \int_a^{x_1} 1 \cdot (\partial_1 u)(\xi, x_2, \dots, x_n)\,d\xi \right)^2 \\
&\le (b - a) \int_a^b (\partial_1 u)^2(\xi, x_2, \dots, x_n)\,d\xi,
\end{aligned} \tag{4.12}
\]
where we have used the Cauchy-Schwarz inequality. Integrating this result over
[a, b] gives
\[ \int_a^b u^2(\xi, x_2, \dots, x_n)\,d\xi \le (b - a)^2 \int_a^b (\partial_1 u)^2(\xi, x_2, \dots, x_n)\,d\xi \tag{4.13} \]
and integrating over K̃ finishes the proof. □
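As a quick sanity check of the inequality in one dimension, one can take U = (0, 1) and u(x) = sin(πx), which vanishes on the boundary. This test function and the helper midpoint_integral are assumptions made for illustration. Here the width of U is b − a = 1, and the two integrals are 1/2 and π²/2.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Approximate the integral of g over [a, b] by the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

a, b = 0.0, 1.0                                    # U = (a, b), width b - a
u = lambda x: math.sin(math.pi * x)                # assumed test function, u(a) = u(b) = 0
du = lambda x: math.pi * math.cos(math.pi * x)     # its derivative

lhs = midpoint_integral(lambda x: u(x) ** 2, a, b)                    # int u^2
rhs = (b - a) ** 2 * midpoint_integral(lambda x: du(x) ** 2, a, b)    # (b-a)^2 int (u')^2
```

The left hand side is well below the right hand side, so the constant (b − a)² obtained from this proof is far from optimal for this particular u.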

Hence, from the view point of Banach spaces, we could also equip H01 (U, R)
with the scalar product
\[ \langle u, v \rangle = \int_U (\partial_j u)(x) (\partial_j v)(x)\,dx. \tag{4.14} \]
This scalar product will be more convenient for our purpose and hence we will use
it from now on. (However, all results stated will hold in either case.) The norm
corresponding to this scalar product will be denoted by |.|.
Next, we want to consider the embedding H_0^1(U, R) ↪ L^2(U, R) a little closer.
This embedding is clearly continuous since by the Poincaré-Friedrichs inequality
we have
\[ |u|_2 \le \frac{d(U)}{\sqrt{n}}\, |u|, \qquad d(U) = \sup\{ |x - y| \,|\, x, y \in U \}. \tag{4.15} \]
Moreover, by a famous result of Rellich, it is even compact. To see this we first
prove the following inequality.
Lemma 4.3 (Poincaré inequality) Let Q ⊂ R^n be a cube with edge length ρ.
Then
\[ \int_Q u^2\,dx \le \frac{1}{\rho^n} \left( \int_Q u\,dx \right)^2 + \frac{n\rho^2}{2} \int_Q (\partial_k u)(\partial_k u)\,dx \tag{4.16} \]
for all u ∈ H^1(Q, R).
Proof. After a scaling we can assume Q = (0, 1)n . Moreover, it suffices to

consider u ∈ C 1 (Q, R).
Now observe
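\[ u(x) - u(\tilde{x}) = \sum_{i=1}^n \int_{x^{i-1}}^{x^i} (\partial_i u)\,dx_i, \tag{4.17} \]
where x^i = (x̃_1, . . . , x̃_i, x_{i+1}, . . . , x_n). Squaring this equation and using
Cauchy–Schwarz on the right hand side we obtain
\[
\begin{aligned}
u(x)^2 - 2u(x)u(\tilde{x}) + u(\tilde{x})^2 &\le \left( \sum_{i=1}^n \int_0^1 |\partial_i u|\,dx_i \right)^2 \le n \sum_{i=1}^n \left( \int_0^1 |\partial_i u|\,dx_i \right)^2 \\
&\le n \sum_{i=1}^n \int_0^1 (\partial_i u)^2\,dx_i.
\end{aligned} \tag{4.18}
\]
Now we integrate over x and x̃, which gives
\[ 2 \int_Q u^2\,dx - 2 \left( \int_Q u\,dx \right)^2 \le n \int_Q (\partial_i u)(\partial_i u)\,dx \tag{4.19} \]
and finishes the proof. □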

Now we are ready to show Rellich’s compactness theorem.
Theorem 4.4 (Rellich's compactness theorem) Let U be a bounded open subset
of R^n. Then the embedding
H_0^1(U, R) ↪ L^2(U, R) (4.20)
is compact.
Proof. Pick a cube Q (with edge length ρ) containing U and a bounded sequence
u^k ∈ H_0^1(U, R). Since bounded sets are weakly compact, it is no restriction to
assume that u^k is weakly convergent in L^2(U, R). By setting u^k(x) = 0 for x ∉ U
we can also assume u^k ∈ H^1(Q, R). Next, subdivide Q into N^n subcubes Q_i with
edge lengths ρ/N. On each subcube (4.16) holds and hence
\[ \int_U u^2\,dx = \int_Q u^2\,dx \le \sum_{i=1}^{N^n} \frac{N^n}{\rho^n} \left( \int_{Q_i} u\,dx \right)^2 + \frac{n\rho^2}{2N^2} \int_U (\partial_k u)(\partial_k u)\,dx \tag{4.21} \]
for all u ∈ H_0^1(U, R). Hence we infer
\[ |u^k - u^\ell|_2^2 \le \sum_{i=1}^{N^n} \frac{N^n}{\rho^n} \left( \int_{Q_i} (u^k - u^\ell)\,dx \right)^2 + \frac{n\rho^2}{2N^2}\, |u^k - u^\ell|^2. \tag{4.22} \]
The last term can be made arbitrarily small by picking N large. The first term
converges to 0 since u^k converges weakly and each summand contains the L^2 scalar
product of u^k − u^ℓ and χ_{Q_i} (the characteristic function of Q_i). □
In addition to this result we will also need the following interpolation inequality.
Lemma 4.5 (Ladyzhenskaya inequality) Let U ⊂ R^3. For all u ∈ H_0^1(U, R)
we have
\[ |u|_4 \le \sqrt[4]{8}\; |u|_2^{1/4}\, |u|^{3/4}. \tag{4.23} \]
Proof. We first prove the case where u ∈ C_0^1(U, R). The key idea is to start with
U ⊂ R^1 and then work one's way up to U ⊂ R^2 and U ⊂ R^3.
If U ⊂ R^1 we have
\[ u(x)^2 = \int^x \partial_1 u^2(x_1)\,dx_1 \le 2 \int |u\, \partial_1 u|\,dx_1 \tag{4.24} \]
and hence
\[ \max_{x \in U} u(x)^2 \le 2 \int |u\, \partial_1 u|\,dx_1. \tag{4.25} \]
Here, if an integration limit is missing, it means that the integral is taken over the
whole support of the function.
If U ⊂ R^2 we have
\[
\begin{aligned}
\iint u^4\,dx_1 dx_2 &\le \int \max_x u(x, x_2)^2\,dx_2 \int \max_y u(x_1, y)^2\,dx_1 \\
&\le 4 \iint |u\, \partial_1 u|\,dx_1 dx_2 \iint |u\, \partial_2 u|\,dx_1 dx_2 \\
&\le 4 \left( \iint u^2\,dx_1 dx_2 \right)^{2/2} \left( \iint (\partial_1 u)^2\,dx_1 dx_2 \right)^{1/2} \left( \iint (\partial_2 u)^2\,dx_1 dx_2 \right)^{1/2} \\
&\le 4 \iint u^2\,dx_1 dx_2 \iint \big( (\partial_1 u)^2 + (\partial_2 u)^2 \big)\,dx_1 dx_2.
\end{aligned} \tag{4.26}
\]
Now let U ⊂ R^3, then
\[
\begin{aligned}
\iiint u^4\,dx_1 dx_2 dx_3 &\le 4 \int dx_3 \iint u^2\,dx_1 dx_2 \iint \big( (\partial_1 u)^2 + (\partial_2 u)^2 \big)\,dx_1 dx_2 \\
&\le 4 \iint \max_z u(x_1, x_2, z)^2\,dx_1 dx_2 \iiint \big( (\partial_1 u)^2 + (\partial_2 u)^2 \big)\,dx_1 dx_2 dx_3 \\
&\le 8 \iiint |u\, \partial_3 u|\,dx_1 dx_2 dx_3 \iiint \big( (\partial_1 u)^2 + (\partial_2 u)^2 \big)\,dx_1 dx_2 dx_3
\end{aligned} \tag{4.27}
\]
and applying Cauchy–Schwarz finishes the proof for u ∈ C_0^1(U, R).
If u ∈ H_0^1(U, R) pick a sequence u_k in C_0^1(U, R) which converges to u in H_0^1(U, R)
and hence in L^2(U, R). By our inequality, this sequence is Cauchy in L^4(U, R)
and converges to a limit v ∈ L^4(U, R). Since
\[ |u|_2 \le \sqrt[4]{|U|}\; |u|_4 \qquad \Big( \int 1 \cdot u^2\,dx \le \sqrt{ \int 1\,dx \int u^4\,dx } \Big), \]
u_k converges to v in L^2(U, R) as well and hence u = v. Now take
the limit in the inequality for u_k. □
As a consequence we obtain
\[ |u|_4 \le \left( \frac{8\, d(U)}{\sqrt{3}} \right)^{1/4} |u|, \qquad U \subset \mathbb{R}^3, \tag{4.28} \]
and
Corollary 4.6 The embedding
H_0^1(U, R) ↪ L^4(U, R), U ⊂ R^3, (4.29)
is compact.
Proof. Let u_k be a bounded sequence in H_0^1(U, R). By Rellich's theorem there
is a subsequence converging in L^2(U, R). By the Ladyzhenskaya inequality this
subsequence converges in L^4(U, R). □
Our analysis clearly extends to functions with values in R^n since H_0^1(U, R^n) =
⊕_{j=1}^n H_0^1(U, R).
4.3 Existence and uniqueness of solutions

Now we come to the reformulation of our original problem (4.5). We pick as
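underlying Hilbert space H_0^1(U, R^3) with scalar product
\[ \langle u, v \rangle = \int_U (\partial_j u_i)(\partial_j v_i)\,dx. \tag{4.30} \]
Let 𝒳 be the closure of X in H_0^1(U, R^3), that is,
\[ \mathcal{X} = \overline{\{ v \in C^2(\overline{U}, \mathbb{R}^3) \,|\, \partial_j v_j = 0 \text{ and } v|_{\partial U} = 0 \}} = \{ v \in H_0^1(U, \mathbb{R}^3) \,|\, \partial_j v_j = 0 \}. \tag{4.31} \]
From now on X stands for this closure 𝒳.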
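Now we multiply (4.5) by w ∈ X and integrate over U
\[ \int_U \big( \eta\, \partial_k \partial_k v_j - (v_k \partial_k) v_j + K_j \big) w_j\,dx = \int_U (\partial_j p)\, w_j\,dx = 0. \tag{4.32} \]
Using integration by parts this can be rewritten as
\[ \int_U \big( \eta\, (\partial_k v_j)(\partial_k w_j) - v_k v_j (\partial_k w_j) - K_j w_j \big)\,dx = 0. \tag{4.33} \]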
Hence if v is a solution of the Navier-Stokes equation, then it is also a solution of
\[ \eta \langle v, w \rangle - a(v, v, w) - \int_U K w\,dx = 0, \qquad \text{for all } w \in X, \tag{4.34} \]
where
\[ a(u, v, w) = \int_U u_k v_j (\partial_k w_j)\,dx. \tag{4.35} \]
In other words, (4.34) represents a necessary solubility condition for the Navier-
Stokes equations. A solution of (4.34) will also be called a weak solution of the
Navier-Stokes equations. If we can show that a weak solution is in C 2 , then we can
read our argument backwards and it will be also a classical solution. However, in
general this might not be true and it will only solve the Navier-Stokes equations in
the sense of distributions. But let us try to show existence of solutions for (4.34)
first.
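For later use we note
\[
\begin{aligned}
a(v, v, v) &= \int_U v_k v_j (\partial_k v_j)\,dx = \frac{1}{2} \int_U v_k\, \partial_k (v_j v_j)\,dx \\
&= -\frac{1}{2} \int_U (v_j v_j)\, \partial_k v_k\,dx = 0, \qquad v \in X.
\end{aligned} \tag{4.36}
\]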
We proceed by studying (4.34). Let K ∈ L^2(U, R^3), then ∫_U Kw dx is a linear
functional on X and hence there is a K̃ ∈ X such that
\[ \int_U K w\,dx = \langle \tilde{K}, w \rangle, \qquad w \in X. \tag{4.37} \]
Moreover, the same is true for the map a(u, v, .), u, v ∈ X , and hence there is an
element B(u, v) ∈ X such that
a(u, v, w) = hB(u, v), wi, w ∈ X. (4.38)
In addition, the map B : X 2 → X is bilinear. In summary we obtain
⟨ηv − B(v, v) − K̃, w⟩ = 0, w ∈ X, (4.39)
and hence
ηv − B(v, v) = K̃. (4.40)
So in order to apply the theory from our previous chapter, we need a Banach space
Y such that X ↪ Y is compact.
Let us pick Y = L^4(U, R^3). Then, applying the Cauchy-Schwarz inequality
twice to each summand in a(u, v, w) we see
\[
\begin{aligned}
|a(u, v, w)| &\le \sum_{j,k} \left( \int_U (u_k v_j)^2\,dx \right)^{1/2} \left( \int_U (\partial_k w_j)^2\,dx \right)^{1/2} \\
&\le |w| \sum_{j,k} \left( \int_U (u_k)^4\,dx \right)^{1/4} \left( \int_U (v_j)^4\,dx \right)^{1/4} = |u|_4\, |v|_4\, |w|.
\end{aligned} \tag{4.41}
\]
Moreover, by Corollary 4.6 the embedding X ↪ Y is compact as required.

Motivated by this analysis we formulate the following theorem.
Theorem 4.7 Let X be a Hilbert space, Y a Banach space, and suppose there is
a compact embedding X ↪ Y. In particular, |u|_Y ≤ β|u|. Let a : X^3 → R be a
multilinear form such that
|a(u, v, w)| ≤ α|u|_Y |v|_Y |w| (4.42)
and a(v, v, v) = 0. Then for any K̃ ∈ X, η > 0 we have a solution v ∈ X to the
problem
η⟨v, w⟩ − a(v, v, w) = ⟨K̃, w⟩, w ∈ X. (4.43)
Moreover, if 2αβ²|K̃| < η² this solution is unique.
Proof. It is no loss to set η = 1. Arguing as before we see that our equation is
equivalent to
v − B(v, v) − K̃ = 0, (4.44)
where our assumption (4.42) implies
|B(u, v)| ≤ α|u|_Y |v|_Y ≤ αβ²|u||v|. (4.45)
Here the second inequality follows since the embedding X ↪ Y is continuous.
Abbreviate F(v) = B(v, v). Observe that F is locally Lipschitz continuous
since if |u|, |v| ≤ ρ we have
|F(u) − F(v)| = |B(u − v, u) + B(v, u − v)| ≤ 2αβρ|u − v|_Y ≤ 2αβ²ρ|u − v|. (4.46)
Moreover, let v_n be a bounded sequence in X. After passing to a subsequence we
can assume that v_n is Cauchy in Y and hence F(v_n) is Cauchy in X by
|F(u) − F(v)| ≤ 2αβρ|u − v|_Y. Thus F : X → X is compact.
Hence all we need to apply the Leray-Schauder principle is an a priori estimate.
Suppose v solves v = tF(v) + tK̃, t ∈ [0, 1], then
⟨v, v⟩ = t a(v, v, v) + t⟨K̃, v⟩ = t⟨K̃, v⟩. (4.47)
Hence |v| ≤ |K̃| is the desired estimate and the Leray-Schauder principle yields
existence of a solution.
Now suppose there are two solutions v_i, i = 1, 2. By our estimate they satisfy
|v_i| ≤ |K̃| and hence |v_1 − v_2| = |F(v_1) − F(v_2)| ≤ 2αβ²|K̃||v_1 − v_2| which is a
contradiction if 2αβ²|K̃| < 1. □
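The structure used in this proof can be imitated in a finite dimensional toy model. The sketch below is an assumption made purely for illustration and has nothing to do with the actual Navier–Stokes operator: X = Y = R^2, η = 1, and B(u, v) = (l · u) Jv with J the rotation by 90 degrees, so that ⟨B(v, v), v⟩ = 0 and |B(u, v)| ≤ |l||u||v|. For small K the map v ↦ K + B(v, v) is a contraction on a small ball, so plain iteration finds the solution of v − B(v, v) = K, and the a priori estimate |v| ≤ |K| can be checked directly.

```python
import math

def B(u, v, l=(1.0, 0.0)):
    """Toy bilinear map B(u, v) = (l . u) J v, J = rotation by 90 degrees.

    Hypothetical stand-in for the map B of the text: it satisfies
    <B(v, v), v> = 0 because J v is orthogonal to v.
    """
    s = l[0] * u[0] + l[1] * u[1]
    return (-s * v[1], s * v[0])

def solve(K, steps=200):
    """Iterate v <- K + B(v, v), the fixed point form of v - B(v, v) = K."""
    v = (0.0, 0.0)
    for _ in range(steps):
        b = B(v, v)
        v = (K[0] + b[0], K[1] + b[1])
    return v

K = (0.2, 0.1)   # small enough that 2 |l| |K| < 1 (uniqueness regime)
v = solve(K)

b = B(v, v)
residual = math.hypot(v[0] - b[0] - K[0], v[1] - b[1] - K[1])
norm_v, norm_K = math.hypot(*v), math.hypot(*K)
```

Here |l| plays the role of αβ², so the chosen K lies in the regime where the solution is unique. The computed solution obeys norm_v ≤ norm_K, as (4.47) predicts.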
Hence we have found a solution v to the generalized problem (4.34). This
solution is unique if
\[ 2 \left( \frac{2\, d(U)}{\sqrt{3}} \right)^{3/2} |K|_2 < \eta^2. \]
Under suitable additional conditions on
the outer forces and the domain, it can be shown that weak solutions are C^2 and
thus also classical solutions. However, this is beyond the scope of this introductory
text.
Chapter 5
Monotone operators
5.1 Monotone operators

The Leray–Schauder theory can only be applied to compact perturbations of
the identity. If F is not compact, we need different tools. In this section we
briefly present another class of operators, namely monotone ones, which allow
some progress.
If F : R → R is continuous and we want F(x) = y to have a unique solution for
every y ∈ R, then F should clearly be strictly monotone increasing (or decreasing)
and satisfy lim_{x→±∞} F(x) = ±∞. Rewriting these conditions slightly such that
they make sense for vector valued functions the analogous result holds.
Lemma 5.1 Suppose F : R^n → R^n is continuous and satisfies
\[ \lim_{|x| \to \infty} \frac{F(x)\, x}{|x|} = \infty. \tag{5.1} \]
Then the equation
F(x) = y (5.2)
has a solution for every y ∈ R^n. If F is strictly monotone
(F(x) − F(y))(x − y) > 0, x ≠ y, (5.3)
then this solution is unique.
Proof. Our first assumption implies that G(x) = F(x) − y satisfies G(x)x =
F(x)x − yx > 0 for |x| sufficiently large. Hence the first claim follows from
Theorem 2.13. The second claim is trivial. □
Now we want to generalize this result to infinite dimensional spaces. Throughout
this chapter, X will be a Hilbert space with scalar product ⟨., ..⟩. An operator
F : X → X is called monotone if
⟨F(x) − F(y), x − y⟩ ≥ 0, x, y ∈ X, (5.4)
strictly monotone if
⟨F(x) − F(y), x − y⟩ > 0, x ≠ y ∈ X, (5.5)
and finally strongly monotone if there is a constant C > 0 such that
⟨F(x) − F(y), x − y⟩ ≥ C|x − y|², x, y ∈ X. (5.6)
Note that the same definitions can be made if X is a Banach space and F :
X → X*.
Observe that if F is strongly monotone, then it automatically satisfies
\[ \lim_{|x| \to \infty} \frac{\langle F(x), x \rangle}{|x|} = \infty. \tag{5.7} \]
(Just take y = 0 in the definition of strong monotonicity.) Hence the following
result is not surprising.
Theorem 5.2 (Zarantonello) Suppose F ∈ C(X, X) is (globally) Lipschitz continuous
and strongly monotone. Then, for each y ∈ X the equation
F(x) = y (5.8)
has a unique solution x ∈ X.
Proof. Set
G(x) = x − t(F(x) − y), t > 0, (5.9)
then F(x) = y is equivalent to the fixed point equation
G(x) = x. (5.10)
It remains to show that G is a contraction. We compute
\[
\begin{aligned}
|G(x) - G(\tilde{x})|^2 &= |x - \tilde{x}|^2 - 2t \langle F(x) - F(\tilde{x}), x - \tilde{x} \rangle + t^2 |F(x) - F(\tilde{x})|^2 \\
&\le \Big( 1 - 2\, \frac{C}{L}\, (Lt) + (Lt)^2 \Big) |x - \tilde{x}|^2,
\end{aligned} \tag{5.11}
\]
where L is a Lipschitz constant for F (i.e., |F(x) − F(x̃)| ≤ L|x − x̃|). Thus, if
t ∈ (0, 2C/L²), G is a contraction and the rest follows from the contraction principle. □
Again observe that our proof is constructive. In fact, the best choice for t is
clearly t = C/L², for which the contraction constant θ = (1 − (C/L)²)^{1/2} is minimal. Then
the sequence
\[ x_{n+1} = x_n - \frac{C}{L^2}\, \big( F(x_n) - y \big), \qquad x_0 \in X, \tag{5.12} \]
converges to the solution.
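The iteration is easy to try out. The following sketch assumes X = R and F(x) = 2x + sin(x), an example chosen here because F' takes values in [1, 3], so one may take C = 1 and L = 3. It uses the step t = C/L², which minimizes the factor 1 − 2Ct + L²t² appearing in (5.11). The function name zarantonello is hypothetical.

```python
import math

def zarantonello(F, y, C, L, x0=0.0, steps=500):
    """Iterate x <- x - t (F(x) - y) with the step t = C / L**2."""
    t = C / L**2
    x = x0
    for _ in range(steps):
        x = x - t * (F(x) - y)
    return x

# Assumed example: F' = 2 + cos(x) lies in [1, 3], hence C = 1 and L = 3.
F = lambda x: 2.0 * x + math.sin(x)
y = 1.0

x = zarantonello(F, y, C=1.0, L=3.0)
residual = abs(F(x) - y)
```

The contraction estimate guarantees convergence from any starting value x0, at the price of a possibly slow rate when C/L is small.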
5.2 The nonlinear Lax–Milgram theorem

As a consequence of the last theorem we obtain a nonlinear version of the Lax–
Milgram theorem. We want to investigate the following problem:
a(x, y) = b(y), for all y ∈ X, (5.13)
where a : X 2 → R and b : X → R. For this equation the following result holds.
Theorem 5.3 (Nonlinear Lax–Milgram theorem) Suppose b ∈ L(X, R) and

a(x, .) ∈ L(X, R), x ∈ X, are linear functionals such that there are positive con-
stants L and C such that for all x, y, z ∈ X we have
a(x, x − y) − a(y, x − y) ≥ C|x − y|2 (5.14)
and
|a(x, z) − a(y, z)| ≤ L|z||x − y|. (5.15)
Then there is a unique x ∈ X such that (5.13) holds.
Proof. By the Riesz theorem there are elements F(x) ∈ X and z ∈ X such that
a(x, y) = b(y) is equivalent to ⟨F(x) − z, y⟩ = 0, y ∈ X, and hence to
F(x) = z. (5.16)
By (5.14) the operator F is strongly monotone. Moreover, by (5.15) we infer
\[ |F(x) - F(y)| = \sup_{\tilde{x} \in X,\, |\tilde{x}| = 1} |\langle F(x) - F(y), \tilde{x} \rangle| \le L |x - y| \tag{5.17} \]
that F is Lipschitz continuous. Now apply Theorem 5.2. □

The special case where a ∈ ℒ²(X, R) is a bounded bilinear form which is
strongly coercive, that is,
a(x, x) ≥ C|x|², x ∈ X, (5.18)
is usually known as (linear) Lax–Milgram theorem.

The typical application of this theorem is the existence of a unique weak solution
of the Dirichlet problem for elliptic equations
\[
\begin{aligned}
-\partial_i A_{ij}(x) \partial_j u(x) + b_j(x) \partial_j u(x) + c(x) u(x) &= f(x), & x &\in U, \\
u(x) &= 0, & x &\in \partial U,
\end{aligned} \tag{5.19}
\]
where U is a bounded open subset of R^n. By elliptic we mean that all coefficients
A, b, c plus the right hand side f are bounded and a_0 > 0, where
\[ a_0 = \inf_{e \in S^n,\, x \in U} e_i A_{ij}(x) e_j, \qquad b_0 = -\inf_{x \in U} b(x), \qquad c_0 = \inf_{x \in U} c(x). \tag{5.20} \]
As in Section 4.3 we pick H_0^1(U, R) with scalar product
\[ \langle u, v \rangle = \int_U (\partial_j u)(\partial_j v)\,dx \tag{5.21} \]
as underlying Hilbert space. Next we multiply (5.19) by v ∈ H_0^1 and integrate over
U
\[ \int_U \big( -\partial_i A_{ij}(x) \partial_j u(x) + b_j(x) \partial_j u(x) + c(x) u(x) \big) v(x)\,dx = \int_U f(x) v(x)\,dx. \tag{5.22} \]
After a partial integration we can write this equation as
a(v, u) = f(v), v ∈ H_0^1, (5.23)
where
\[
\begin{aligned}
a(v, u) &= \int_U \big( \partial_i v(x) A_{ij}(x) \partial_j u(x) + b_j(x) v(x) \partial_j u(x) + c(x) v(x) u(x) \big)\,dx, \\
f(v) &= \int_U f(x) v(x)\,dx.
\end{aligned} \tag{5.24}
\]
We call a solution of (5.23) a weak solution of the elliptic Dirichlet problem
(5.19).
By a simple use of the Cauchy-Schwarz and Poincaré-Friedrichs inequalities we
see that the bilinear form a(u, v) is bounded. To be able to apply the (linear)
Lax–Milgram theorem we need to show that it satisfies a(u, u) ≥ C ∫_U |∂_j u|² dx
for some C > 0.
Using (5.20) we have
\[ a(u, u) \ge \int_U \big( a_0 |\partial_j u|^2 - b_0 |u| |\partial_j u| + c_0 |u|^2 \big)\,dx, \tag{5.25} \]
where −b_0 = inf b(x), c_0 = inf c(x) and we need to control the middle term. If
b_0 ≤ 0 there is nothing to do and it suffices to require c_0 ≥ 0.
If b_0 > 0 we distribute the middle term by means of the elementary inequality
\[ |u| |\partial_j u| \le \frac{\varepsilon}{2} |u|^2 + \frac{1}{2\varepsilon} |\partial_j u|^2 \tag{5.26} \]
which gives
\[ a(u, u) \ge \int_U \Big( \big( a_0 - \frac{b_0}{2\varepsilon} \big) |\partial_j u|^2 + \big( c_0 - \frac{\varepsilon b_0}{2} \big) |u|^2 \Big)\,dx. \tag{5.27} \]
Since we need a_0 − b_0/(2ε) > 0 and c_0 − εb_0/2 ≥ 0, or equivalently
2c_0/b_0 ≥ ε > b_0/(2a_0), we see
that we can apply the Lax–Milgram theorem if 4a_0c_0 > b_0². In summary, we have
proven
Theorem 5.4 The elliptic Dirichlet problem (5.19) has a unique weak solution
u ∈ H_0^1(U, R) if a_0 > 0, b_0 ≤ 0, c_0 ≥ 0 or 4a_0c_0 > b_0².
5.3 The main theorem of monotone operators

Now we return to the investigation of F(x) = y and weaken the conditions of
Theorem 5.2. We will assume that X is a separable Hilbert space and that F :
X → X is a continuous monotone operator satisfying
\[ \lim_{|x| \to \infty} \frac{\langle F(x), x \rangle}{|x|} = \infty. \tag{5.28} \]
In fact, it suffices to assume that F is weakly continuous
\[ \lim_{n \to \infty} \langle F(x_n), y \rangle = \langle F(x), y \rangle, \qquad \text{for all } y \in X \tag{5.29} \]
whenever x_n → x.
The idea is as follows: Start with a finite dimensional subspace X_n ⊂ X and
project the equation F(x) = y to X_n resulting in an equation
F_n(x_n) = y_n, x_n, y_n ∈ X_n. (5.30)
More precisely, let P_n be the (linear) projection onto X_n and set F_n(x_n) = P_n F(x_n),
y_n = P_n y (verify that F_n is continuous and monotone!).
Now Lemma 5.1 ensures that there exists a solution x_n. Now choose the subspaces
X_n such that X_n → X (i.e., X_n ⊂ X_{n+1} and ⋃_{n=1}^∞ X_n is dense). Then our
hope is that x_n converges to a solution x.
This approach is quite common when solving equations in infinite dimensional
spaces and is known as Galerkin approximation. It can often be used for
numerical computations and the right choice of the spaces Xn will have a significant
impact on the quality of the approximation.
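To make the Galerkin idea concrete, here is a minimal numerical sketch in Python. Everything in it is an assumption made for illustration and none of it is taken from the text: the space is ℓ² (truncated to N = 200 coordinates as a stand-in for the full space), the operator is F(x) = Ax + x³ with the cube taken componentwise and A the tridiagonal matrix with 2 on the diagonal and −1/2 next to it, X_n is the span of the first n unit vectors, and the helper names `apply_F`, `jacobian` and `galerkin_solve` are hypothetical. This F is continuous, monotone and satisfies (5.28); the projected equation (5.30) is solved with Newton's method.

```python
import numpy as np


def apply_F(x):
    """Illustrative operator F(x) = A x + x**3 (A tridiagonal: 2 and -0.5).

    For a vector of length n this is exactly F_n = P_n F restricted to
    X_n = span(e_1, ..., e_n), written in coordinates.
    """
    Ax = 2.0 * x
    Ax[:-1] -= 0.5 * x[1:]
    Ax[1:] -= 0.5 * x[:-1]
    return Ax + x**3


def jacobian(x):
    """Derivative of apply_F at x: A + 3 diag(x**2), symmetric positive definite."""
    n = x.size
    A = 2.0 * np.eye(n) - 0.5 * (np.eye(n, k=1) + np.eye(n, k=-1))
    return A + np.diag(3.0 * x**2)


def galerkin_solve(y, n, tol=1e-12, max_iter=50):
    """Solve the projected equation F_n(x_n) = P_n y by Newton's method."""
    y_n = y[:n]                      # coordinates of P_n y
    x = np.zeros(n)
    for _ in range(max_iter):
        residual = apply_F(x) - y_n
        if np.linalg.norm(residual) < tol:
            break
        x = x - np.linalg.solve(jacobian(x), residual)
    return x


N = 200                              # large truncation standing in for X
y = 1.0 / np.arange(1, N + 1)        # square-summable right-hand side
x_ref = galerkin_solve(y, N)         # reference solution

errors = []
for n in (5, 20, 80):
    x_n = galerkin_solve(y, n)
    padded = np.concatenate([x_n, np.zeros(N - n)])   # embed X_n into X
    errors.append(np.linalg.norm(padded - x_ref))
    print(f"n = {n:3d}   distance to reference = {errors[-1]:.3e}")
```

In this example the distance to the reference solution decreases as n grows. How quickly it does so depends on how well the spaces X_n are adapted to the solution, which is the point made above about the choice of X_n.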
So how should we show that $x_n$ converges? First of all observe that our construction of $x_n$ shows that $x_n$ lies in some ball with radius $R_n$, which is chosen such that
\[ \langle F_n(x), x \rangle > |y_n| |x|, \qquad |x| \ge R_n, \; x \in X_n. \tag{5.31} \]
Since $\langle F_n(x), x \rangle = \langle P_n F(x), x \rangle = \langle F(x), P_n x \rangle = \langle F(x), x \rangle$ for $x \in X_n$ we can drop all $n$'s to obtain a constant $R$ which works for all $n$. So the sequence $x_n$ is uniformly bounded,
\[ |x_n| \le R. \tag{5.32} \]
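The step of dropping the $n$'s can be spelled out as follows. This is a sketch of one way to see the bound; it uses $|y_n| = |P_n y| \le |y|$, which holds when $P_n$ is the orthogonal projection (an assumption stated here explicitly).

```latex
% By (5.28) there is a radius R > 0, independent of n, with
\[
\langle F(x), x \rangle > |y|\,|x| \ge |y_n|\,|x| \qquad \text{for } |x| \ge R ,
\]
% so (5.31) holds with R_n = R for every n.
% If a solution of F_n(x_n) = y_n satisfied |x_n| \ge R, then
\[
0 = \langle F_n(x_n) - y_n, x_n \rangle
  \ge \langle F(x_n), x_n \rangle - |y_n|\,|x_n| > 0 ,
\]
% which is a contradiction. Hence |x_n| < R.
```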
Now by a well-known result there exists a weakly convergent subsequence. That is, after dropping some terms, we can assume that there is some $x$ such that $x_n \rightharpoonup x$, that is,
\[ \langle x_n, z \rangle \to \langle x, z \rangle \quad \text{for every } z \in X. \tag{5.33} \]
And it remains to show that x is indeed a solution. This follows from
Lemma 5.5 Suppose $F : X \to X$ is weakly continuous and monotone, then
\[ \langle y - F(z), x - z \rangle \ge 0 \quad \text{for every } z \in X \tag{5.34} \]
implies $F(x) = y$.
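Proof. Choose $z = x \pm t w$ with $t > 0$. Dividing (5.34) by $t$ gives $\mp \langle y - F(x \pm t w), w \rangle \ge 0$, and letting $t \downarrow 0$ we obtain by continuity $\mp \langle y - F(x), w \rangle \ge 0$. Thus $\langle y - F(x), w \rangle = 0$ for every $w$, implying $y - F(x) = 0$. □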
Now we can show
Theorem 5.6 (Browder, Minty) Suppose $F : X \to X$ is weakly continuous, monotone, and satisfies
\[ \lim_{|x| \to \infty} \frac{\langle F(x), x \rangle}{|x|} = \infty. \tag{5.35} \]
Then the equation
\[ F(x) = y \tag{5.36} \]
has a solution for every $y \in X$. If $F$ is strictly monotone then this solution is unique.
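Proof. Recall $y_n = P_n y = F_n(x_n)$. Since $x_n - z \in X_n$ for $z \in X_n$, we have $\langle y - F(z), x_n - z \rangle = \langle y_n - F_n(z), x_n - z \rangle = \langle F_n(x_n) - F_n(z), x_n - z \rangle \ge 0$ for $z \in X_n$. Because the subspaces are nested, for fixed $z \in X_m$ this holds for all $n \ge m$, and taking the limit implies $\langle y - F(z), x - z \rangle \ge 0$ for every $z \in X_\infty = \bigcup_{n=1}^\infty X_n$. Since $X_\infty$ is dense, $\langle y - F(z), x - z \rangle \ge 0$ for every $z \in X$ by continuity and hence $F(x) = y$ by our lemma. If $F$ is strictly monotone and $F(x) = F(\tilde{x}) = y$, then $\langle F(x) - F(\tilde{x}), x - \tilde{x} \rangle = 0$ forces $x = \tilde{x}$. □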
Note that in the infinite dimensional case we need monotonicity even to show
existence. Moreover, this result can be further generalized in two more ways.
First of all, the Hilbert space $X$ can be replaced by a reflexive Banach space if $F : X \to X^*$. The proof is almost identical. Secondly, it suffices if
\[ t \mapsto \langle F(x + t y), z \rangle \tag{5.37} \]
is continuous for $t \in [0, 1]$ and all $x, y, z \in X$, since this condition together with monotonicity can be shown to imply weak continuity.
Glossary of notations
$B_\rho(x)$ . . . ball of radius $\rho$ around $x$
$\mathrm{conv}(.)$ . . . convex hull
$C(U, Y)$ . . . set of continuous functions from $U$ to $Y$
$C^r(U, Y)$ . . . set of $r$ times continuously differentiable functions
$C_0^r(U, Y)$ . . . functions in $C^r$ with compact support
$\mathcal{C}(U, Y)$ . . . set of compact functions from $U$ to $Y$
$\mathrm{CP}(f)$ . . . critical points of $f$
$\mathrm{CS}(K)$ . . . nonempty convex subsets of $K$
$\mathrm{CV}(f)$ . . . critical values of $f$
$\deg(D, f, y)$ . . . mapping degree
$\det$ . . . determinant
$\dim$ . . . dimension of a linear space
$\mathrm{div}$ . . . divergence
$\mathrm{dist}(U, V) = \inf_{(x, y) \in U \times V} |x - y|$ . . . distance of two sets
$D_y^r(U, Y)$ . . . functions in $C^r(\overline{U}, Y)$ which do not attain $y$ on the boundary
$dF$ . . . derivative of $F$
$\mathcal{F}(X, Y)$ . . . set of compact finite dimensional functions
$GL(n)$ . . . general linear group in $n$ dimensions
$H(\mathbb{C})$ . . . set of holomorphic functions
$H^1(U, \mathbb{R}^n)$ . . . Sobolev space
$H_0^1(U, \mathbb{R}^n)$ . . . Sobolev space
$\inf$ . . . infimum
$J_f(x) = \det f'(x)$ . . . Jacobi determinant of $f$ at $x$
$L(X, Y)$ . . . set of bounded linear functions
$L^p(U, \mathbb{R}^n)$ . . . Lebesgue space of $p$ integrable functions
$\max$ . . . maximum
$n(\gamma, z_0)$ . . . winding number
$O(.)$ . . . Landau symbol, $f = O(g)$ iff $\limsup_{x \to x_0} |f(x)/g(x)| < \infty$
$o(.)$ . . . Landau symbol, $f = o(g)$ iff $\lim_{x \to x_0} |f(x)/g(x)| = 0$
$\partial U$ . . . boundary of the set $U$
$\partial_x F(x, y)$ . . . partial derivative with respect to $x$
$\mathrm{RV}(f)$ . . . regular values of $f$
$R(I, X)$ . . . set of regulated functions
$S(I, X)$ . . . set of simple functions
$\mathrm{sgn}$ . . . sign of a number
$\sup$ . . . supremum
$\mathrm{supp}$ . . . support of a function

