Functions of Several Real Variables
Matías Raja
Preface
2 Normed Spaces
2.1 Norms
2.2 Finite-dimensional normed spaces
2.3 Linear operators
2.4 Spaces of functions
2.5 Complements
2.6 Rationale and remarks
2.7 Exercises
4 Differentiable mappings
4.1 The basics
4.2 Partial derivatives
4.3 Second order differentiability and more
4.4 Applications to extrema
4.5 Two applications to Algebra
4.5.1 The Fundamental Theorem of Algebra
4.5.2 Diagonalization of symmetric matrices
4.6 Rationale and remarks
4.7 Exercises
6 Riemann Integral
6.1 Rectangles and partitions
6.2 Integrals on compact rectangles
6.3 Integrability and continuity points
6.4 Integration on general domains
6.5 Iterated integrals
6.6 Improper integrals
6.7 Rationale and remarks
6.8 Exercises
7.5.3 Integrals of Dirichlet
7.6 Rationale and remarks
7.7 Exercises
11 Classic Vector Analysis
11.1 Operations with vectors in R^3
11.2 Differential forms on R^3
11.3 Vector operators
11.4 Newtonian potential
11.5 Harmonic functions
11.6 Vector Analysis in R^2
11.7 Assorted applications
11.7.1 Mechanics
11.7.2 Hydrostatics
11.7.3 Hydrodynamics
11.7.4 Electromagnetic fields
11.8 Rationale and remarks
11.9 Exercises
Bibliography
Preface
These lectures are based on quite a few years of teaching Functions of Sev-
eral Real Variables for the degree in Mathematics at the University of Murcia.
The purpose is to cover not only the usual topics, but a “little more” which
makes the difference with other courses, notably, the inclusion of many related
topics and nontrivial applications.
I have added three appendices at the end. Two of them complement infor-
mation on the most important spaces of functions that appear in the course:
the space of continuous functions over a compact C(K) and the spaces of inte-
grable functions Lp (µ). The results we have included (Stone-Weierstrass the-
orem, relations amongst the types of convergence for integrable functions. . . )
lie in a limbo between Real Analysis and Functional Analysis. The third
appendix, on Mechanics, is an “experiment” based on the fact that it is possible
to obtain Lagrange’s equations of motion from Newton’s laws and the
chain rule of Calculus.
Chapter 1
1.1 Generalities
The basics of metric spaces are presumably known to the students, so we will
get through this first section rather quickly.
c) finite intersections of open sets give open sets.
Statements on metric spaces that can be formulated in terms of open sets (the
topology) are called topological. For instance, we may define convergence of
sequences in a metric space as follows: (xn) ⊂ M converges to x ∈ M if
limn d(xn, x) = 0. The definition apparently relies heavily on the metric;
however, convergence of sequences is actually a topological notion, because it
can be equivalently formulated as: for every open U ∋ x there is nU such that
xn ∈ U whenever n ≥ nU. We say that x is a cluster point of a sequence (xn) if
there is a subsequence (xnk) with limit x. A cluster point of an infinite set is
the limit of a sequence of distinct points from the set.
the continuous function
\[ f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)} \]
satisfies that f(x) ∈ [0, 1], A = f^{-1}(0) and B = f^{-1}(1).
Given two metrics d1 and d2 on the same set M , we say that d1 is finer
than d2 (equivalently, d2 is coarser than d1 ) if any open set with respect to
d2 is also open with respect to d1. Note that this is equivalent to the
continuity of the identity map Id : (M, d1) → (M, d2). The two metrics on M are
said to be equivalent if they produce the same topology, that is, the identity
map is continuous in both directions (a homeomorphism). Given a metric
space (M, d), we may always suppose that the metric is bounded, just by taking
the equivalent metric d1(x, y) = min{1, d(x, y)}.
A very useful operation with metric spaces (and more general topological
spaces) is the product. Let (M1 , d1 ) and (M2 , d2 ) be metric spaces. We can
endow M1 × M2 with the metric defined by
d((x1 , x2 ), (y1 , y2 )) = d1 (x1 , y1 ) + d2 (x2 , y2 ).
That operation can be extended to any finite number of factors. In the particular
case of the product of “copies” of R, it is not difficult to check that the product
metric is equivalent to the Euclidean distance. We may even consider countably
many factors {(Mn, dn)}n∈N. In that case, define the metric by a series
\[ d((x_n), (y_n)) = \sum_{n=1}^{\infty} 2^{-n} d_n(x_n, y_n) \]
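For finitely many copies of R, the equivalence between the product metric and the Euclidean distance mentioned above can be seen from the two-sided estimate d2 ≤ d1 ≤ √n d2, where d1 is the sum metric and d2 the Euclidean one. The following is only a numerical sanity check of that estimate on random points; the function names are illustrative and not part of the text.

```python
import math
import random


def sum_metric(x, y):
    """Product metric on R^n: sum of the coordinatewise distances."""
    return sum(abs(a - b) for a, b in zip(x, y))


def euclidean_metric(x, y):
    """Usual Euclidean distance on R^n."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def check_equivalence(n, trials=1000, seed=0):
    """Check d2 <= d1 <= sqrt(n) * d2 on random pairs of points of R^n."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = [rng.uniform(-10, 10) for _ in range(n)]
        y = [rng.uniform(-10, 10) for _ in range(n)]
        d1, d2 = sum_metric(x, y), euclidean_metric(x, y)
        # Small tolerance to absorb floating point rounding.
        if not (d2 <= d1 + 1e-9 and d1 <= math.sqrt(n) * d2 + 1e-9):
            return False
    return True
```

Both inequalities bound each metric by a constant multiple of the other, so the two metrics have the same convergent sequences and the same open sets.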
1.2 Separability
A subset A ⊂ M is said to be dense if its closure is the whole space, that is,
Ā = M. A metric space is said to be separable if it contains a countable dense
set {xn : n ∈ N}. Note that in such a case, the collection of balls
{B(xn, 1/m) : n, m ∈ N} is a countable base of the topology, that is, every open
set can be expressed as a union of balls from that collection. A metric (or more
generally, topological) space is said to be Lindelöf if every cover of the space
by open sets has a countable subcover. With all these definitions we have the
following.
Proposition 1.2.1. A metric space is separable if and only if it is Lindelöf.
Separability implies that ε-discrete sets are countable. We say that the
metric space M is totally bounded if, for every ε > 0, all the ε-discrete sets
are finite. We will call a maximal ε-discrete set an ε-net.
1.3 Completeness
A sequence (xn) is said to be Cauchy if for every ε > 0 there is N ∈ N such that
d(xn, xm) < ε whenever n, m ≥ N (equivalently, limn,m d(xn, xm) = 0). The
reader can establish these easy facts: every convergent sequence is Cauchy;
a Cauchy sequence with a cluster point must be convergent. A metric space is
said to be complete if every Cauchy sequence is convergent. The notion of
completeness is not topological. Observe that (−π/2, π/2) is not complete with the
usual metric on R, but the metric d(x, y) = | tan x − tan y| makes it complete.
Proof. Observe that the hypothesis implies that (xn) is a Cauchy sequence
for any choice of xn ∈ Fn. If the space is complete then limn xn = x and
clearly $\{x\} = \bigcap_{n=1}^{\infty} F_n$. On the other hand, if M is not complete then there
is a Cauchy sequence (xn) with no limit. Since (xn) cannot have cluster
points, the sets Fn = {xk : k ≥ n} are closed. The Cauchy property implies
that limn diam(Fn) = 0, but we have $\bigcap_{n=1}^{\infty} F_n = \emptyset$.
The following is Banach’s fixed point theorem for contractive mappings.
Then there is a unique point x ∈ M such that f(x) = x (a fixed point for f).
Moreover, whenever x1 ∈ M is chosen, the sequence defined inductively by
xn = f(xn−1) for n ≥ 2 converges to x.
Proof. If y ∈ M is another fixed point for f and y ̸= x, then we have
It just remains to prove that (xn) converges, and this will be done by checking
that (xn) is Cauchy. For that aim, first observe that
Recursively we have
\[ d(x_n, x_{n-1}) \le \lambda^{n-2}\, d(x_2, x_1). \]
The triangle inequality gives us for n > m ≥ 1 that
\[ d(x_n, x_m) \le d(x_n, x_{n-1}) + \dots + d(x_{m+1}, x_m) \le (\lambda^{n-2} + \dots + \lambda^{m-1})\, d(x_2, x_1) \le \frac{\lambda^{m-1}}{1-\lambda}\, d(x_2, x_1). \]
The inequality clearly implies that (xn) is Cauchy.
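The iteration in the theorem is easy to try out. The sketch below is an illustration under assumptions of our own: the space is M = [0, 1] with the usual distance and the map is f = cos, which sends [0, 1] into itself and is contractive there with constant λ = sin 1 < 1 (by the mean value theorem). It also checks the estimate d(xm, x) ≤ λ^{m−1} d(x2, x1)/(1 − λ), obtained by letting n → ∞ in the last inequality of the proof.

```python
import math


def fixed_point_iteration(f, x1, tol=1e-12, max_iter=1000):
    """Iterate x_n = f(x_{n-1}) from x1 until two consecutive terms are tol-close."""
    x = x1
    for steps in range(1, max_iter + 1):
        x_next = f(x)
        if abs(x_next - x) < tol:
            return x_next, steps
        x = x_next
    raise RuntimeError("the iteration did not converge within max_iter steps")


def iterate(f, x1, m):
    """Return the term x_m of the sequence x_1, x_2 = f(x_1), ..."""
    x = x1
    for _ in range(m - 1):
        x = f(x)
    return x


def a_priori_bound(lam, x1, x2, m):
    """Bound lam**(m-1) / (1 - lam) * d(x2, x1) for the distance from x_m to the fixed point."""
    return lam ** (m - 1) / (1.0 - lam) * abs(x2 - x1)
```

For instance, `fixed_point_iteration(math.cos, 1.0)` returns an approximation of the unique solution of cos x = x in [0, 1], and each `iterate(math.cos, 1.0, m)` lies within `a_priori_bound(math.sin(1.0), 1.0, math.cos(1.0), m)` of it.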
1.4 Compactness
Compactness is one of the most important properties in Analysis.
Definition 1.4.1. A topological space is said to be compact if any open cover
has a finite subcover.
Passing to complements, compactness is equivalent to the following property:
a family of closed sets has nonempty intersection whenever it has the finite
intersection property, that is, whenever each of its finite subfamilies has
nonempty intersection. Note as well that compactness is preserved by continuous maps.
Proposition 1.4.2. “Compactness versus countable compactness”.
1. In a compact topological space any infinite subset has a cluster point.
2. If a topological space satisfies that any infinite subset has a cluster point,
then any countable open cover has a finite subcover.
Proof. Suppose that the infinite set A ⊂ X has no cluster points. Then, for
any subset B ⊂ A the set A \ B is closed. Now note that
{A \ B : B ⊂ A is finite}
is a family of closed subsets with the finite intersection property and empty
intersection, which contradicts the compactness of X.
For the second statement it is enough to show that a decreasing sequence (Fn)
of nonempty closed subsets of X has nonempty intersection. Indeed, take a
point xn ∈ Fn. If (xn) is finite, then x = xn for infinitely many n ∈ N and
so $x \in \bigcap_{n=1}^{\infty} F_n$. Otherwise, (xn) is infinite and thus it has a cluster point x
which is a cluster point of any set {xk : k ≥ n} ⊂ Fn. Therefore $x \in \bigcap_{n=1}^{\infty} F_n$.
In any case, the intersection $\bigcap_{n=1}^{\infty} F_n$ is nonempty.
Proposition 1.4.3. For a metric space M , the following are equivalent:
(i) M is compact;
(ii) any sequence in M has a convergent subsequence;
(iii) M is complete and totally bounded.
Proof. (i)⇒(ii) Clearly we may assume that the sequence has infinitely many
distinct terms, and so it has a cluster point as an infinite set. In a metric
space, a cluster point of a sequence is the limit of one of its subsequences.
(ii)⇒(i) Note that any infinite subset of M has a cluster point. Note as well
that the metric space M must be separable since otherwise it would contain
an uncountable metrically discrete subset, and any sequence of distinct points
taken from that set has no convergent subsequence. Hence M is Lindelöf, so any
open cover has a countable subcover. By the previous result, this countable cover
has in turn a finite subcover.
(i)+(ii)⇒(iii) Given ε > 0, then {B(x, ε) : x ∈ M } is an open cover of M ,
which has a finite subcover of the form {B(xk , ε) : 1 ≤ k ≤ n}. Clearly,
{xk : 1 ≤ k ≤ n} is a finite ε-net of M . Given a Cauchy sequence (xn ), it
has a convergent subsequence with limit x ∈ M . That implies that the whole
sequence (xn ) is converging to x.
(iii)⇒(ii) Suppose we are given a sequence (xn) ⊂ M. Since M is covered by
finitely many balls of radius 1/2, there is at least one that contains infinitely
many terms of the sequence (xn), that is, there is an infinite A1 ⊂ N such that
d(xn, xm) ≤ 1 for n, m ∈ A1. With the same argument, we can find an infinite
A2 ⊂ A1 such that d(xn, xm) ≤ 1/2 for n, m ∈ A2. Proceeding in this way, we
obtain infinite sets A1 ⊃ A2 ⊃ · · · ⊃ Ak ⊃ . . . such that if n, m ∈ Ak then
d(xn, xm) ≤ 1/2^{k−1}. Now, we may take inductively n1 < n2 < · · · < nk < . . .
such that nk ∈ Ak. The construction shows that (xnk) is a Cauchy sequence,
which must be convergent by the completeness of M.
Proof. Readers should be able to prove this by themselves.
Proposition 1.5.1. The space (C(K), ∥ · ∥∞ ) is complete.
for every n ≥ N. Fix a neighborhood U ∋ x such that |fN(y) − fN(x)| < ε/3 if
y ∈ U. The triangle inequality gives that |f(y) − f(x)| < ε for y ∈ U. Finally, the
above inequality also gives that ∥fn − f∥∞ ≤ ε/3 for any n ≥ N, which implies
the convergence of (fn) to f in the uniform distance.
Theorem 1.5.2 (Dini). Let (fn) ⊂ C(K) be a sequence of functions that
converges pointwise to some f ∈ C(K). If the sequence (fn) is monotone (increasing
or decreasing), then (fn) converges uniformly to f.
Hint of proof. Assume that (fn ) is decreasing, for instance. Then fix ε > 0
and prove that the sequence of sets
is an open cover of K.
is an open cover of K, so there are points $\{x_k\}_{k=1}^{n}$ such that $K = \bigcup_{k=1}^{n} U_{x_k}$.
Since A is bounded, the set
1.6 Fractals
Let (M, d) be a complete metric space and denote by K(M ) the set of nonempty
compact subsets of M . For A ∈ K(M ) and r > 0 define a closed “neighbour-
hood” as
D[A, r] = {x ∈ M : d(A, x) ≤ r}.
And now define a distance between elements of K(M) by
Theorem 1.6.1. If M is a complete metric space, then (K(M ), d) is complete.
Proof. Consider a Cauchy sequence (An) ⊂ K(M). We claim that for any
choice xn ∈ An, the sequence (xn) has a cluster point. Indeed, fix ε > 0 and let
nε be such that if n ≥ nε then An ⊂ D[Anε, ε], and thus (xn) ⊂ D[Anε, ε] except
for finitely many terms. It is clear that D[Anε, ε] can be covered by finitely many
balls of radius (3/2)ε and infinitely many xn’s are inside one of those balls.
Therefore d(xnk, xnj) ≤ 3ε for some subsequence (xnk). This selection process
applied for ε = 1/m and a further diagonal argument will produce a Cauchy
subsequence of (xn), which converges because M is complete.
Let A ⊂ M be the set of all the cluster points of sequences obtained as before.
Note that if x ∈ A, we can take xn ∈ An such that (xn) converges to x. Note
as well that A has to be closed since any cluster point of A can be reached by
a suitable diagonal choice. Now, for ε > 0 note that A ⊂ D[An, ε] for n large
enough. That implies that A is totally bounded, and therefore A ∈ K(M).
It also implies that A is a “half limit” of (An). In order to complete the proof, we
have to prove that An ⊂ D[A, ε] for n large. If it is not the case, we can take
xn ∈ An such that d(A, xn) ≥ ε for infinitely many n’s. That would produce a
cluster point x such that d(A, x) ≥ ε. On the other hand, by definition of A
we have x ∈ A. The contradiction proves the theorem.
which implies the contractivity of f .
1.8 Exercises
1. Prove that closed balls are closed sets, and open balls are open sets.
2. Prove that uniformly continuous maps between metric spaces preserve
Cauchy sequences.
3. The distance d(A, x) from a point x to a set A ⊂ M in a metric space M
is defined by d(A, x) = inf{d(y, x) : y ∈ A}. Prove the following
statements:
(a) |d(A, x) − d(A, y)| ≤ d(x, y),
(b) Ā = {x ∈ M : d(A, x) = 0},
(c) d(A, x) ≤ d(B, x) for every x ∈ M if and only if B ⊂ Ā.
4. Prove with the help of Dini’s theorem the uniform convergence on
bounded subsets of R of the sequence
\[ f_n(x) = \left(1 + \frac{x}{n}\right)^{n}. \]
6. Prove that separability of a metric space is hereditary to subsets.
Chapter 2
Normed Spaces
2.1 Norms
A basic notion in Analysis is the notion of normed space, which is just a vector
space together with a norm. Let X be a vector space (over a field K, either R
or C). A function ∥ · ∥ : X → [0, +∞) is called a norm if:
1. ∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x, y ∈ X;
2. ∥λx∥ = |λ|∥x∥ for all x ∈ X, λ ∈ K;
3. ∥x∥ = 0 if and only if x = 0.
A norm induces a distance on X by means of d(x, y) = ∥x − y∥, and
that provides a topological structure, as a metric space. There are several
weakenings of the notion of norm which are also interesting, see the section
“Complements”. Sometimes we use (X, ∥ · ∥) to denote a normed space; however,
that is not necessary when the norm we are dealing with is understood.
From now on we will focus on real normed spaces. The notation for open
and closed balls will be the same as in metric spaces; however, we will
distinguish the unit ball
SX := {x ∈ X : ∥x∥ = 1}
is the topological boundary of BX and so the interior of BX is exactly B(0, 1).
That is not generally true in metric spaces.
Two norms ∥ · ∥1 and ∥ · ∥2 on the same vector space are said to be equivalent if
they generate the same topology. A nice consequence of the similarity of balls
is the following characterization of the equivalence of norms.
Proposition 2.1.1. Let X be a vector space and let ∥ · ∥1 and ∥ · ∥2 be two
norms on X. Then the norms are equivalent if and only if there are constants
α, β > 0 such that
α∥x∥1 ≤ ∥x∥2 ≤ β∥x∥1
for all x ∈ X.
The notion of completeness has several nuances. First, a consequence of
the previous proposition.
Corollary 2.1.2. The completeness (or its absence) of a normed space is
invariant under passing to an equivalent norm.
Now we will consider series in normed spaces. As in the real case, a series
$\sum_{n=1}^{\infty} x_n$ is just a symbolic expression. The series is said to be convergent
if the partial sums $s_n = \sum_{k=1}^{n} x_k$ converge to some element in X called the
sum of the series. We say that a series is unconditionally convergent if any
rearrangement of its terms is convergent with the same sum. Finally, a series
$\sum_{n=1}^{\infty} x_n$ is said to be absolutely convergent if $\sum_{n=1}^{\infty} \|x_n\| < +\infty$. Despite the
name, absolute convergence does not always imply convergence: it does when X
is complete, but it may fail otherwise.
Theorem 2.2.1. All the norms on a finite dimensional space X are equivalent.
Proof. We will denote by ∥ · ∥2 the Euclidean norm on X given by the
isomorphism associated to a basis {e1 , . . . , en }. Let ∥ · ∥ be an arbitrary norm
on X. We will show that ∥ · ∥ is continuous as a function on (X, ∥ · ∥2 ). Indeed,
note that
\[ \|x\| = \|\lambda_1 e_1 + \dots + \lambda_n e_n\| \le |\lambda_1|\|e_1\| + \dots + |\lambda_n|\|e_n\| \le (|\lambda_1|^2 + \dots + |\lambda_n|^2)^{1/2} (\|e_1\|^2 + \dots + \|e_n\|^2)^{1/2} = c\|x\|_2 \]
by the Cauchy-Schwarz inequality and taking $c = (\|e_1\|^2 + \dots + \|e_n\|^2)^{1/2}$.
Now, we have
|∥x∥ − ∥y∥| ≤ ∥x − y∥ ≤ c∥x − y∥2
that means that ∥ · ∥ is Lipschitz (with constant c) with respect to ∥ · ∥2 ,
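and thus continuous as wanted. Let α and β be the minimum and maximum,
respectively, of ∥ · ∥ on the set S = {x ∈ X : ∥x∥2 = 1}; they exist because S is
compact in (X, ∥ · ∥2) and ∥ · ∥ is continuous. We have α > 0 since
∥ · ∥ is a norm and S does not contain 0. If x ∈ X \ {0} then x/∥x∥2 ∈ S and
thus
\[ \alpha \le \left\| \frac{x}{\|x\|_2} \right\| \le \beta \]
and so
\[ \alpha\|x\|_2 \le \|x\| \le \beta\|x\|_2 \]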
which is the desired equivalence.
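The constants α and β of the proof can be estimated numerically in a simple case. The sketch below assumes X = R^2 and takes the norm ∥(x, y)∥ = |x| + |y| as the “arbitrary” norm (our choice, for illustration only); it samples the Euclidean unit circle S and returns the smallest and largest sampled values, which approximate α = 1 and β = √2.

```python
import math


def norm1(v):
    """The norm |x| + |y| on R^2, playing the role of the arbitrary norm."""
    return sum(abs(c) for c in v)


def norm2(v):
    """Euclidean norm on R^2."""
    return math.sqrt(sum(c * c for c in v))


def equivalence_constants(norm, samples=3600):
    """Approximate min and max of `norm` over the Euclidean unit circle by sampling."""
    values = []
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        values.append(norm((math.cos(angle), math.sin(angle))))
    return min(values), max(values)
```

With these estimates, any vector v of R^2 should satisfy α·norm2(v) ≤ norm1(v) ≤ β·norm2(v) up to sampling and rounding error.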
Proof. Note that f (z) = ∥z −x∥ is a continuous function on Y whose infimum
can be computed on a bounded subset.
Proof. If X has finite dimension n, then it is isomorphic to Rn , and therefore
the unit ball, as closed bounded set, is compact. On the other hand, if X
has infinite dimension, then BX contains a sequence with no convergent sub-
sequence. In such a case, BX cannot be compact.
Finite dimensional spaces are also characterized by the fact that unconditionally
convergent series are absolutely convergent. One implication is clear;
the other one is the celebrated Dvoretzky-Rogers theorem.
Proposition 2.3.1. Let X, Y be normed spaces (on the same field) and T :
X → Y be linear. Then the following are equivalent:
1. T is continuous;
2. T is continuous at 0;
3. there is c > 0 such that ∥T (x)∥ ≤ c∥x∥ for every x ∈ X;
A similar statement can be proved for bilinear or multilinear maps. The set
of continuous operators from X to Y is denoted L(X, Y). Note that L(X, Y)
becomes a normed space with the norm
∥T ∥ = sup{∥T (x)∥ : x ∈ BX }
This norm inherits the completeness from Y , that is L(X, Y ) is a Banach space
if and only if Y is.
The norm, by its very definition, has the following remarkable property: if
T ∈ L(X, Y) and S ∈ L(Y, Z), then S ◦ T ∈ L(X, Z) and ∥S ◦ T∥ ≤ ∥S∥∥T∥.
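The inequality ∥S ◦ T∥ ≤ ∥S∥∥T∥ can be tested on matrices. The sketch below makes an assumption that is not in the text: X = Y = Z = R^n with the maximum norm, in which case the operator norm of a matrix is its largest absolute row sum (a standard fact). The function names are illustrative.

```python
import random


def op_norm_inf(A):
    """Operator norm of the matrix A on (R^n, max norm): largest absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)


def compose(S, T):
    """Matrix of the composition S∘T, that is, the ordinary matrix product."""
    return [[sum(S[i][k] * T[k][j] for k in range(len(T)))
             for j in range(len(T[0]))]
            for i in range(len(S))]


def check_submultiplicative(n=3, trials=200, seed=0):
    """Check ||S∘T|| <= ||S|| ||T|| on random n x n matrices."""
    rng = random.Random(seed)
    for _ in range(trials):
        S = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        T = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(n)]
        # Small tolerance to absorb floating point rounding.
        if op_norm_inf(compose(S, T)) > op_norm_inf(S) * op_norm_inf(T) + 1e-12:
            return False
    return True
```

Equality can occur: for S = [[1, 2], [3, 4]] and T the matrix swapping the two coordinates, both sides are equal to 7.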
That norm makes ℓ∞(M) a complete normed space and the induced topology
is usually referred to as the topology of uniform convergence. When M has an
additional structure, such as being a metric space, we can study the properties
that are preserved by uniform limits. We already know that this is the case for
continuity. Something more general can be said.
Proposition 2.4.1. Assume that (fn ) ⊂ ℓ∞ (M ) and (xm ) ⊂ M are such that:
1. the limit of (fn ) exists uniformly;
2. limm fn (xm ) exists for every n ∈ N.
Then the following iterated limits exist and satisfy the equality
\[ \lim_n \left(\lim_m f_n(x_m)\right) = \lim_m \left(\lim_n f_n(x_m)\right). \]
A uniform limit of Lipschitz functions need not be Lipschitz, and uniform
convergence gives no control on the Lipschitz constants, as shown by the
sequence fn(x) = n^{-1} sin(n^2 x) on [0, π], which converges uniformly to 0
while its Lipschitz constants tend to infinity. Nevertheless, it is
possible to endow the set of Lipschitz functions with a norm that makes it
complete. Indeed, denote by L(M) the set of Lipschitz functions defined on
the metric space M and fix a point x0 ∈ M. Then the number
\[ \|f\| = |f(x_0)| + \sup\left\{ \frac{|f(x) - f(y)|}{d(x, y)} : x, y \in M,\ x \ne y \right\} \]
defines a norm on L(M) that makes it complete. We leave the proof to the
reader. A variation for differentiable functions is asked among the exercises of
the chapter.
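The behaviour of the sequence fn(x) = n^{-1} sin(n^2 x) on [0, π] can be observed numerically: its uniform norm is 1/n, while its best Lipschitz constant is n (the supremum of |fn′|). The sketch below only estimates both quantities on a grid, so the values are approximations from below; the function names are ours.

```python
import math


def f(n, x):
    """The function f_n(x) = sin(n^2 x) / n."""
    return math.sin(n * n * x) / n


def sup_norm(n, grid=20000):
    """Grid estimate of the uniform norm of f_n on [0, pi]."""
    return max(abs(f(n, math.pi * k / grid)) for k in range(grid + 1))


def lipschitz_estimate(n, grid=20000):
    """Largest difference quotient of f_n between neighbouring grid points of [0, pi]."""
    h = math.pi / grid
    return max(abs(f(n, (k + 1) * h) - f(n, k * h)) / h for k in range(grid))
```

For n = 10 the first estimate is close to 0.1 and the second one close to 10, so the norm of L(M) described above, unlike the uniform norm, does not tend to 0 along this sequence.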
2.5 Complements
In this section we include, without proof, several results that are traditionally
reserved as topics for Functional Analysis, even though some proofs are
accessible at this level.
The topologies defined by a family of seminorms appear quite often and they
will be considered in other chapters, for instance, uniform convergence on com-
pact subsets of Rn .
Theorem 2.5.1 (Hahn-Banach). Let X be a real vector space, Y ⊂ X a
subspace, p a sublinear homogeneous functional defined on X and f a linear
form defined on Y such that f (x) ≤ p(x) for every x ∈ Y . Then there is a
linear form f˜ defined on X such that f˜|Y = f and f˜(x) ≤ p(x) for all x ∈ X.
The formulation with a sublinear functional is the key to prove separation
results for convex sets, which is a matter we are not interested in here.
Nevertheless, if p is of the form c∥ · ∥, we obtain the following consequence.
Theorem 2.5.2 (Hahn-Banach). Let X be a normed space and x ∈ X. There
exists x∗ ∈ X∗ with ∥x∗∥ = 1 such that x∗(x) = ∥x∥.
Note that the result informally says that the dual X ∗ is, at least, as large
as X, which was quite evident for a finite dimensional X. A straightforward
application of that result says that X embeds isometrically into X ∗∗ := (X ∗ )∗ .
The closure of X as a subset of X ∗∗ provides a model for the completion of X.
Corollary 2.5.3. Every normed space can be isometrically embedded into a
complete normed space as a dense subset.
A completion can be built “more directly” as a quotient of the space of
Cauchy sequences.
spaces have a richer theory. In particular, the properties of finite-dimensional
normed spaces (or subspaces) should be remarked.
Some aspects of the convergence of sequences and series of functions are better
understood in the frame of normed spaces because uniform convergence
is convergence in a metric. However, the most important cases, power and
trigonometric series, have a particular treatment in other subjects throughout the degree.
2.7 Exercises
1. Prove that the notions of boundedness, Cauchy sequence and completeness
are invariant under equivalence of norms. Show with an example that
the same does not hold in general metric spaces.
a∥ · ∥2 ≤ ∥ · ∥1 ≤ b∥ · ∥2
6. Define on C[a, b] the norm ∥ · ∥p for p ≥ 1 by the formula
\[ \|f\|_p = \sqrt[p]{\int_a^b |f(t)|^p \, dt} \]
Prove that ∥ · ∥p is actually a norm for p = 1, 2 (the other cases are more
difficult). Show that ∥ · ∥p is not equivalent to ∥ · ∥q if p ≠ q, on C[a, b].
Prove also that for any f ∈ C[a, b]
\[ \lim_{p \to \infty} \|f\|_p = \|f\|_\infty. \]
7. Prove that for every n ∈ N there is a constant Cn such that for all the
n × n matrices with non-negative entries (ai,j) the following inequality is
verified
\[ \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} a_{i,j} \Big)^2 \le C_n \sum_{j=1}^{n} \Big( \sum_{i=1}^{n} a_{i,j} \Big)^2. \]
∥f ∥ = |f (0)| + ∥f ′ ∥∞
defines a norm on C 1 [0, 1]. Show also that C 1 [0, 1] endowed with such a
norm is complete.
9. Let X be a normed space and consider the unit sphere S = {x ∈ X :
∥x∥ = 1}. Show that
d(S, x) = | 1 − ∥x∥ |.
12. The set of real bounded sequences is denoted ℓ∞, and the formula
\[ \|(x_n)_{n=1}^{\infty}\|_\infty = \sup\{|x_n| : n \in \mathbb{N}\} \]
for $(x_n)_{n=1}^{\infty} \in \ell^\infty$ defines a norm. Show that (ℓ∞, ∥ · ∥∞) is complete and
non separable.
13. We say that a function Q : X → R defined on a vector space is a quadratic
form if there exists a bilinear form B : X × X → R such that Q(x) =
B(x, x). Show that B is not determined, in general, by Q. However, if
we ask the bilinear form to be symmetric, that is, B(x, y) = B(y, x) for
all x, y ∈ X, then
\[ B(x, y) = \frac{1}{2}\,\big(Q(x + y) - Q(x) - Q(y)\big). \]
Find the generalization of that result for cubic forms and symmetric
trilinear forms.
\[ f_n(x) = n^p x (1 - x^2)^n \]
18. Find the set where the series
\[ \sum_{n=0}^{\infty} e^{-nx} \]
Chapter 3
coordinate system. That is how we should think of the intrinsic nature
of vector operators (see the chapter on Vector Analysis).
Now, assume you have a function of two variables given by some formula
f (x, y). What is the simplest way to represent it? For pedagogical reasons,
the best answer is the graph, that is, the set
However, in practice these functions occur often and the graph is not advisable
in some cases: think of (x, y) being a geographical position (longitude,
latitude) and the function being the height (above sea level) or the atmospheric
pressure (at ground level). In that case, several curves (level curves)
of the form
{(x, y) : f (x, y) = c}
for some values of the constant c provide a contour map that can be read as
we, supposedly, can read a topographical map. Alternatively, the curves, which
are necessarily discrete, could be replaced by a continuous variation of tone
or colour.
As the representation could be difficult to visualise (the surfaces cover one
another like Russian dolls), it would be convenient to choose just one significant
value of c, for instance, when it reaches a maximum. That is exactly what you
see in the pictures of atomic orbitals in Chemistry books.
An interesting situation is when the function has vector values. For functions
of the form f : R → R^3 the representation is a curve, and we could
think of it as a trajectory, regarding the variable as time. Functions such as
f : R^2 → R^2 or f : R^3 → R^3 may be thought of as deformations of the
space, and we can visualise them by watching how they act on simple
sets of points: curves, simple domains. . . That is what is usually done in Complex
Analysis, since a complex function is, in practice, a function from the plane
to the plane. Another way to think of functions f : R^2 → R^2 is to consider
the domain as composed of points and, on the other hand, the images as vectors.
That is a plane vector field, which can be depicted by choosing a regularly
ordered set of points and drawing an arrow on each of them (usually the arrow
starts at the point) that represents the value of the function. In this way,
wind speed is represented in weather forecasts, for instance.
The ellipse is defined as the curve made up of points from the plane such
that the sum of the distances to two points (foci) is constant. The curve is
symmetric with respect to the line passing through the foci, as well as the
bisector line of the segment joining the foci. When the ellipse is referred to those
axes (X and Y, respectively) the well known Cartesian equation is
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \]
where a is the long semi-axis and b the short one. The equation is particularly
simple because of the good choice of the coordinate frame. Note that the sum
of distances to the foci is 2a and the distance between them is $2c = 2\sqrt{a^2 - b^2}$.
However, if we try to obtain the polar equation straight from the previous one,
the result is
\[ \frac{r^2 \cos^2\theta}{a^2} + \frac{r^2 \sin^2\theta}{b^2} = 1 \]
and so
\[ r = \frac{ab}{\sqrt{b^2 \cos^2\theta + a^2 \sin^2\theta}} \]
which is not especially nice. In order to obtain a better expression, move the
origin to one of the foci (the one on our right). The distance to the focus at
the origin is r. The distance to the other focus is
\[ \sqrt{(x + 2c)^2 + y^2} = \sqrt{(r\cos\theta + 2c)^2 + r^2 \sin^2\theta} = \sqrt{r^2 + 4cr\cos\theta + 4c^2}. \]
Therefore,
\[ \sqrt{r^2 + 4cr\cos\theta + 4c^2} = 2a - r. \]
Squaring we get
\[ r^2 + 4cr\cos\theta + 4c^2 = 4a^2 - 4ar + r^2, \]
thus cr cos θ + ar = a^2 − c^2 = b^2,
and the polar expression now is
\[ r = \frac{b^2}{a + c\cos\theta} = \frac{p}{1 + \epsilon\cos\theta} \]
where p = b^2/a and ϵ = c/a is the eccentricity. This equation, which is common
to all the conics (ϵ = 0 for the circle, ϵ ∈ (0, 1) for the ellipse, ϵ = 1 for the
parabola and ϵ > 1 for the hyperbola), is useful for the description of the
movement of planets.
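The polar equation can be checked against the definition of the ellipse. With the origin at the right focus, the other focus sits at (−2c, 0), so a point with polar coordinates (r, θ) and r = p/(1 + ϵ cos θ) should have distances to the two foci adding up to 2a. The sketch below performs that check numerically; the function names and the sample values a = 5, b = 3 are our own choices.

```python
import math


def polar_radius(a, b, theta):
    """r = p / (1 + eps*cos(theta)) with p = b^2/a and eps = c/a, c = sqrt(a^2 - b^2)."""
    c = math.sqrt(a * a - b * b)
    p, eps = b * b / a, c / a
    return p / (1.0 + eps * math.cos(theta))


def sum_of_focal_distances(a, b, theta):
    """Sum of the distances from the point (r, theta) to the foci (0, 0) and (-2c, 0)."""
    c = math.sqrt(a * a - b * b)
    r = polar_radius(a, b, theta)
    x, y = r * math.cos(theta), r * math.sin(theta)
    return r + math.hypot(x + 2.0 * c, y)
```

For a = 5 and b = 3 (so c = 4) the radius is a − c = 1 at θ = 0 and a + c = 9 at θ = π, and the sum of focal distances stays equal to 2a = 10 for every angle.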
3.2 Topology
The topology required to deal with functions of several variables is exactly the
metric topology, where the metric is induced by any of the usual norms, which
turn out to be equivalent. Moreover, guessing that geometry could be of any
help when dealing with limits or continuity is a wrong idea. You may think
that the limit of a function exists because it exists along all the lines going to
that point (radial limit), and yet the ordinary (topological) limit may
not exist. That is the case of
\[ f(x, y) = \frac{2xy^2}{x^2 + y^4} \]
which has limit 0 at (x, y) = (0, 0) along lines; however, the limit is not zero
along suitable parabolas. In any case, radial limits and other variations
are useful as training exercises and, more interestingly, they relate
limits in two or more variables to the functional limit. Indeed, the existence of
radial limits at a point implies the existence of the ordinary limit if the radial
limit is uniform with respect to the angle.
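The behaviour of f(x, y) = 2xy^2/(x^2 + y^4) near the origin can be explored numerically. Along the line y = mx the values are 2m^2 t/(1 + m^4 t^2), which tend to 0 with t, while along the parabola x = y^2 the function is constantly equal to 1. The sketch below evaluates both restrictions; the helper names are ours.

```python
def f(x, y):
    """The function 2xy^2 / (x^2 + y^4), defined away from the origin."""
    return 2.0 * x * y ** 2 / (x ** 2 + y ** 4)


def along_line(m, t):
    """Value of f at the point (t, m*t) of the line y = m*x."""
    return f(t, m * t)


def along_parabola(t):
    """Value of f at the point (t^2, t) of the parabola x = y^2."""
    return f(t ** 2, t)
```

Evaluating `along_line(m, t)` for small t and several slopes m gives values close to 0, whereas `along_parabola(t)` stays close to 1 however small t is, so the ordinary limit at (0, 0) cannot exist.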
Despite the example, a radially continuous function is not so bad. For
instance, assume that a function is separately continuous, that is, for every
fixed value of all the variables except one, the restricted function is continuous
with respect to the remaining variable. It is possible to prove with the help of
Baire’s theorem that a separately continuous function has a dense set of points
of actual continuity.
3.4 Rationale and remarks
Despite the very general frame with metric and normed spaces, sometimes it is
necessary to point out that the matter is “Functions of several real variables”:
not a few, but not too many.
3.5 Exercises
1. Use polar coordinates to express the set bounded by the triangle with
vertices at (0, 0), (0, 1) and (1, 0).
2. Use polar coordinates to find the equation of a circle that passes through the
origin.
3. Express the set {(x, y, z) : 0 ≤ 2z ≤ 1 − x^2 − y^2} in spherical
coordinates.
4. Find a two variable function whose level curves are the family of circles
that are tangent to the Y axis at the origin.
5. Parameterize the curve resulting from the intersection of these surfaces
with t ∈ R, that is, find two functions f (x, y, z), g(x, y, z) such that the
curve is the set
7. Prove that $f(x, y) = \sqrt[4]{x^2 + y^2}$ does not satisfy a Lipschitz condition and
yet it is uniformly continuous on R^2.
8. Prove the existence and compute the limit
\[ \lim_{(x,y)\to(0,0)} \frac{x^3 + y^3}{x^2 + y^2}. \]
Chapter 4
Differentiable mappings
df (x0 ) := A;
3. the assignment f → df (x0 ) is linear among the maps that are differen-
tiable at x0 ;
4. if f is linear, then df (x) = f at any x ∈ E.
For real valued functions, the geometrical idea behind the notion of
differentiability is that the “graph of the function f is well
approximated by the tangent plane”. However, to be rigorous, the definition
of tangent plane should depend on the notion of differentiability.
Definition 4.1.2. The tangent plane to the graph of f : D ⊂ Rn → R
at (x0, f(x0)) is the set
Proof. Indeed
(g ◦ f )(x) − (g ◦ f )(x0 )
= dg(y0 )(df (x0 )(x − x0 ) + o(∥x − x0 ∥)) + o(∥x − x0 ∥)
= (dg(y0) ◦ df(x0))(x − x0) + o(∥x − x0∥)
which implies the differentiability of the composed map and d(g ◦ f)(x0) =
dg(y0) ◦ df(x0) as desired.
Simple examples such as $f(t) = (\cos t, \sin t, t)$ for $t \in [0, 2\pi]$ show that
the mean value theorem with an equality is no longer true for vector-valued
functions:
\[ f(2\pi) - f(0) \ne f'(t)(2\pi - 0) \]
for all $t \in [0, 2\pi]$. However, we have the following result, which is enough for most
applications.
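The failure of the mean value equality for the helix can be checked numerically. The following is only an illustrative sketch: the function names and the sampling of 1001 points are assumptions made for the example, not part of the text.

```python
import math

def f(t):
    """The helix f(t) = (cos t, sin t, t)."""
    return (math.cos(t), math.sin(t), t)

def f_prime(t):
    """Its derivative f'(t) = (-sin t, cos t, 1)."""
    return (-math.sin(t), math.cos(t), 1.0)

def mean_value_gap(t, a=0.0, b=2 * math.pi):
    """Euclidean norm of f(b) - f(a) - f'(t)(b - a)."""
    fa, fb, d = f(a), f(b), f_prime(t)
    return math.sqrt(sum((fb[i] - fa[i] - d[i] * (b - a)) ** 2 for i in range(3)))

# Sample the interval [0, 2*pi]; the gap never gets close to zero.
samples = [2 * math.pi * k / 1000 for k in range(1001)]
smallest_gap = min(mean_value_gap(t) for t in samples)
```

Here the gap equals the norm of $(2\pi \sin t, -2\pi \cos t, 0)$, so it stays at $2\pi$ for every $t$, which is why no intermediate point can give an equality.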
Theorem 4.1.5. Let E be a normed space and let $f : [a, b] \to E$ and $g :
[a, b] \to \mathbb{R}$ be continuous functions that are differentiable on (a, b) and
satisfy $\|f'(t)\| \le g'(t)$ for all $t \in (a, b)$. Then
\[ \|f(b) - f(a)\| \le g(b) - g(a). \]
Proof. Take ε > 0 and consider the set
I = {t ∈ [a, b] : ∀ a ≤ s ≤ t, ∥f (s) − f (a)∥ ≤ g(s) − g(a) + ε(s − a) + ε}.
By construction and continuity of the functions it is clear that I = [a, s] for
some $a < s \le b$. If $s = b$ for all $\varepsilon > 0$ we are done, so we may assume $s < b$ in
order to get a contradiction. Then there exists a decreasing sequence
$(s_n) \subset (s, b)$ with limit $s$ such that
\[ \|f(s_n) - f(a)\| > g(s_n) - g(a) + \varepsilon(s_n - a) + \varepsilon. \]
Therefore
\[ = \lambda_1 \frac{\partial f}{\partial x_1}(x^0) + \cdots + \lambda_n \frac{\partial f}{\partial x_n}(x^0). \]
If we denote by dxi the linear map λ1 e1 + · · · + λn en → λi , then for any x ∈ Rn
we may write the previous identity as
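\[ df(x^0)(x) = dx_1(x)\,\frac{\partial f}{\partial x_1}(x^0) + \cdots + dx_n(x)\,\frac{\partial f}{\partial x_n}(x^0), \]
which can be rewritten in a more aesthetic way as
\[ df(x^0) = \frac{\partial f}{\partial x_1}(x^0)\,dx_1 + \cdots + \frac{\partial f}{\partial x_n}(x^0)\,dx_n \]
or simply
\[ df = \frac{\partial f}{\partial x_1}\,dx_1 + \cdots + \frac{\partial f}{\partial x_n}\,dx_n, \]
despite the fact that the partial derivatives $\frac{\partial f}{\partial x_i}$ may have vector values. For
real-valued functions, compare with the formula of the tangent plane in terms
of the partial derivatives:
\[ y - f(x^0) = \frac{\partial f}{\partial x_1}(x^0)(x_1 - x_1^0) + \cdots + \frac{\partial f}{\partial x_n}(x^0)(x_n - x_n^0). \]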
In the old times, before the arrival of rigor in Calculus, it was usual to think of
$dx_1, \dots, dx_n$ as infinitesimal increments of the variables. That spirit still
survives in arguments that can be found in some Physics and Engineering books.
The chain rule for the differential implies a chain rule for partial derivatives.
Indeed, assume that the variables $x_1, \dots, x_n$ are replaced by differentiable functions
$X_1(t), \dots, X_n(t)$. Take $X(t) = (X_1(t), \dots, X_n(t))$. Then
\[ \frac{d}{dt}\big(f(X(t))\big) = \frac{\partial f}{\partial x_1}(X(t))\,\frac{dX_1}{dt}(t) + \cdots + \frac{\partial f}{\partial x_n}(X(t))\,\frac{dX_n}{dt}(t). \]
The typical abuse of language and the removal of variables that are obvious lead to
this neat expression
\[ \frac{df}{dt} = \frac{\partial f}{\partial x_1}\frac{dx_1}{dt} + \cdots + \frac{\partial f}{\partial x_n}\frac{dx_n}{dt}, \]
which resembles the expression of the differential above divided by “dt”. If t
were one of several variables, then the chain rule would be expressed
with partial derivatives:
\[ \frac{\partial f}{\partial t} = \frac{\partial f}{\partial x_1}\frac{\partial x_1}{\partial t} + \cdots + \frac{\partial f}{\partial x_n}\frac{\partial x_n}{\partial t}. \]
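The chain rule along a curve can be sanity-checked against a finite difference. This is a minimal sketch; the particular function f, the curve X and the step size are hypothetical choices made only for illustration.

```python
import math

def f(x, y):
    """Sample function f(x, y) = x^2 y + sin y (an arbitrary choice)."""
    return x * x * y + math.sin(y)

def grad_f(x, y):
    """Partial derivatives of f, computed by hand."""
    return (2 * x * y, x * x + math.cos(y))

def X(t):
    """Sample curve X(t) = (cos t, t^2)."""
    return (math.cos(t), t * t)

def dX(t):
    """Derivative of the curve."""
    return (-math.sin(t), 2 * t)

def chain_rule_derivative(t):
    """d/dt f(X(t)) computed as sum of (df/dx_i)(X(t)) * X_i'(t)."""
    fx, fy = grad_f(*X(t))
    dx, dy = dX(t)
    return fx * dx + fy * dy

def numerical_derivative(t, step=1e-6):
    """Central difference approximation of d/dt f(X(t))."""
    return (f(*X(t + step)) - f(*X(t - step))) / (2 * step)
```

For instance, `chain_rule_derivative(0.7)` and `numerical_derivative(0.7)` should agree up to the finite-difference error.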
Let us stress once more that the chain rule is valid provided that the (sec-
ond) function is differentiable. So far we have not provided a differentiability
criterion based on the partial derivatives. The following will fill the gap.
Theorem 4.2.1. Let f : D ⊂ Rn → R be a function such that its first
partial derivatives are defined on a neighbourhood of x0 ∈ D and they are also
continuous at x0 . Then f is differentiable at x0 .
Proof. The idea of the proof is the same in n dimensions as in 2. In order not
to complicate the notation too much, we will assume something in the middle, say
3 dimensions. The point from the hypothesis will be denoted $p = (x_0, y_0, z_0)$.
Fix $\varepsilon > 0$ and let $\delta > 0$ be such that the partial derivatives exist on $B(p, \delta)$ (we
shall consider the Euclidean norm) and their values at points of $B(p, \delta)$ differ by
less than $\varepsilon$ from their values at $(x_0, y_0, z_0)$. Assume that $\|(x, y, z) - p\| < \delta$; then
the four points $(x_0, y_0, z_0)$, $(x, y_0, z_0)$, $(x, y, z_0)$ and $(x, y, z)$ are in the ball, and
so are the segments joining them. We have
\[ f(x, y, z) - f(x_0, y_0, z_0) = \]
\[ f(x, y_0, z_0) - f(x_0, y_0, z_0) + f(x, y, z_0) - f(x, y_0, z_0) + f(x, y, z) - f(x, y, z_0) \]
\[ = \frac{\partial f}{\partial x}(\bar{x}, y_0, z_0)(x - x_0) + \frac{\partial f}{\partial y}(x, \bar{y}, z_0)(y - y_0) + \frac{\partial f}{\partial z}(x, y, \bar{z})(z - z_0), \]
where $\bar{x} \in [x_0, x]$, $\bar{y} \in [y_0, y]$ and $\bar{z} \in [z_0, z]$ are given by the finite increments
theorem. Now
\[ \left| f(x, y, z) - f(p) - \frac{\partial f}{\partial x}(p)(x - x_0) - \frac{\partial f}{\partial y}(p)(y - y_0) - \frac{\partial f}{\partial z}(p)(z - z_0) \right| \]
\[ \le \varepsilon|x - x_0| + \varepsilon|y - y_0| + \varepsilon|z - z_0| \le \sqrt{3}\,\varepsilon\,\|(x, y, z) - p\|. \]
That means f is differentiable at p, as desired.
Corollary 4.2.2. Let f be a function whose first partial derivatives are null on a
connected domain. Then f is constant.
Proof. In that case f is differentiable by the previous theorem and its
differential is null everywhere. Two arbitrary points can be joined by a $C^1$
path $\gamma$. As $f \circ \gamma$ has null derivative, it is constant, and thus the function has
the same value at the endpoints.
We could skip the use of Theorem 4.2.1 in the Corollary by showing that
two points in a connected (open) domain can be joined by a path made of
finitely many segments which are parallel to the axes.
4.3 Second order differentiability and more
Assume that a map f : D ⊂ E → F is differentiable at any point of D. In
such a case we may consider the differential map df : D ⊂ E → L(E, F ). We
may consider the continuity of df with respect to the norm on L(E, F ) and,
moreover, we may consider its further differentiability at some point x0 ∈ D.
In such a case, note that d(df )(x0 ) ∈ L(E, L(E, F )). For simplicity, we have
the identification
L(E, L(E, F )) = B(E × E, F )
where the right-hand side denotes the bilinear maps on E with values in F. Therefore, $d^2 f(x_0) = d(df)(x_0)$
can be interpreted as a bilinear form. The relation of that bilinear form to the
increment of the function is described in the following result.
Theorem 4.3.1. Let $f : D \subset E \to F$ be twice differentiable at $x_0 \in D$. Then
\[ f(x_0 + h) = f(x_0) + df(x_0)(h) + \frac{1}{2}\, d^2 f(x_0)(h, h) + o(\|h\|^2). \]
Proof. By the very definition, given ε > 0 there is δ > 0 such that if ∥h∥ < δ
then
\[ \|df(x_0 + h) - df(x_0) - d^2 f(x_0)(h)\| < \varepsilon\|h\|. \]
The definition of the norm for linear operators implies
for every $v \in E$. Fix $h \in E$ with $\|h\| < \delta$ and consider the functions $h(t) = t^2$
and
\[ g(t) = f(x_0 + th) - f(x_0) - t\,df(x_0)(h) - \frac{t^2}{2}\, d^2 f(x_0)(h, h). \]
The finite increment theorem for two functions says that
\[ \frac{g(1) - g(0)}{h(1) - h(0)} = \frac{g'(\tau)}{h'(\tau)} \]
\[ = (2\tau)^{-1}\,\big|df(x_0 + \tau h)(h) - df(x_0)(h) - d^2 f(x_0)(\tau h, h)\big| < \frac{\varepsilon\|\tau h\|\|h\|}{2\tau} = \frac{\varepsilon\|h\|^2}{2}, \]
which proves the theorem.
is called the Hessian matrix. The Hessian is symmetric under very general
conditions, in fact whenever f is twice differentiable at $x_0$; however, we will prove a
fairly general result with a simpler proof. We will use the more compact notation
$f_x = \frac{\partial f}{\partial x}$, $f_y = \frac{\partial f}{\partial y}$, $f_{xy} = \frac{\partial^2 f}{\partial x \partial y}$ and $f_{yx} = \frac{\partial^2 f}{\partial y \partial x}$ for the statement and the proof
of the following result.
In all that follows we will consider only real-valued functions. Some
results can be extended in an obvious way to functions taking values in a
finite-dimensional space.
Theorem 4.3.2. Let f be a real function defined on a neighbourhood of (x0 , y0 )
such that fx , fy , fxy and fyx are also defined and continuous. Then
Proof. Take h, k small enough and for a function g(x, y) we introduce the
notation
∆x g(x, y) = g(x + h, y) − g(x, y),
∆y g(x, y) = g(x, y + k) − g(x, y).
Note that $\Delta_x(\Delta_y f) = \Delta_y(\Delta_x f)$. Now we will work with one of the terms,
the other one being similar. The finite increments theorem implies
That implies
\[ \lim_{h,k \to 0} \frac{\Delta_x(\Delta_y f)(x_0, y_0)}{hk} = f_{xy}(x_0, y_0). \]
The commutation of the increments claimed at the beginning implies the com-
mutation of the derivatives.
The Taylor formula for several variables. The commutativity of the derivatives
can be extended to orders higher than 2 if the hypothesis is satisfied. In
particular, for a $C^k$ function all the derivatives commute up to order k, where the
order of a derivative means the sum of the orders with respect to each variable. In
order to handle formulae involving complicated derivatives, it is convenient to
introduce a multi-index notation: let $\alpha = (k_1, \dots, k_n)$ be an n-tuple of non-negative
integers and let
\[ k = |\alpha| = k_1 + \cdots + k_n. \]
We will denote
\[ \frac{\partial^k f}{\partial x^\alpha} = \frac{\partial^k f}{\partial x_1^{k_1} \cdots \partial x_n^{k_n}}. \]
The fact that the derivatives are ordered according to the variables implicitly means
that commutation is assumed. For the next result we will also need
factorials, multi-powers and related functions. Put
\[ \alpha! = k_1! \cdots k_n! \]
and
\[ \binom{k}{\alpha} = \frac{k!}{\alpha!}. \]
If $x = (x_1, \dots, x_n)$, then put $x^\alpha = x_1^{k_1} \cdots x_n^{k_n}$. With this notation, we can prove
Newton’s multinomial formula
\[ (x_1 + x_2 + \cdots + x_n)^m = \sum_{|\alpha| = m} \binom{m}{\alpha}\, x^\alpha. \]
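The multinomial formula can be verified with exact integer arithmetic for small cases. The sketch below is only an illustration; the helper names `multi_indices`, `multinomial` and `multinomial_sum` are made up for this example.

```python
from itertools import product
from math import factorial, prod

def multi_indices(n, m):
    """All n-tuples of non-negative integers whose entries add up to m."""
    return [a for a in product(range(m + 1), repeat=n) if sum(a) == m]

def multinomial(m, alpha):
    """The coefficient m! / alpha!, where alpha! = k_1! ... k_n!."""
    return factorial(m) // prod(factorial(k) for k in alpha)

def multinomial_sum(x, m):
    """Right-hand side of the formula: sum over |alpha| = m of (m over alpha) x^alpha."""
    return sum(
        multinomial(m, a) * prod(xi ** k for xi, k in zip(x, a))
        for a in multi_indices(len(x), m)
    )
```

With integer inputs the comparison is exact: `multinomial_sum((1, 2, 3), 4)` should coincide with `(1 + 2 + 3) ** 4`.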
The reasons for that choice will become clear in the proof of the following result.
Theorem 4.3.3. Let f be a $C^{m+1}(\mathbb{R}^n)$ function and let $c > 0$ be a bound for
the absolute value of the derivatives of order $(m + 1)$ on $B(p, r)$. Then for
$x \in B(p, r)$, where the ball is taken with respect to the $\|\cdot\|_1$ norm, we have
\[ |f(x) - T_m(p, x)| \le \frac{c\, r^{m+1}}{(m + 1)!}. \]
\[ h''(t) = \sum_i \sum_j \frac{\partial^2 f}{\partial x_i \partial x_j}(tx_1, \dots, tx_n)\, x_i x_j \]
and so on, following the pattern of the powers of a multinomial. Therefore, we
can gather the terms in this way:
\[ h^{(k)}(t) = \sum_{|\alpha| = k} \binom{k}{\alpha} \frac{\partial^k f}{\partial x^\alpha}(tx_1, \dots, tx_n)\, x^\alpha. \]
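Now we have
\[ h(1) = \sum_{k=0}^{m} \frac{h^{(k)}(0)}{k!} + \frac{h^{(m+1)}(\theta)}{(m + 1)!} \]
where $\theta \in (0, 1)$, by the one-variable Taylor formula with Lagrange remainder.
Clearly
\[ \sum_{k=0}^{m} \frac{h^{(k)}(0)}{k!} = \sum_{|\alpha| \le m} \frac{1}{\alpha!} \frac{\partial^{|\alpha|} f}{\partial x^\alpha}(0)\, x^\alpha \]
and
\[ \left| \frac{h^{(m+1)}(\theta)}{(m + 1)!} \right| = \left| \sum_{|\alpha| = m+1} \frac{1}{\alpha!} \frac{\partial^{m+1} f}{\partial x^\alpha}(\theta x_1, \dots, \theta x_n)\, x^\alpha \right| \]
\[ \le \frac{c}{(m + 1)!} \sum_{|\alpha| = m+1} \binom{m+1}{\alpha} |x|^\alpha = \frac{c}{(m + 1)!} (|x_1| + \cdots + |x_n|)^{m+1} \]
\[ \le \frac{c\, \|x\|_1^{m+1}}{(m + 1)!} \]
as desired.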
In this case the domain is not assumed to be open. Typically, problems about
extrema are posed on a compact domain. The existence of extrema is ensured
by the Weierstrass theorem; however, finding them is a different matter.
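The discussion in the finite-dimensional case is done with the help of the
Hessian matrix
\[ A = \begin{pmatrix}
\frac{\partial^2 f}{\partial x_1^2} & \frac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_1 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\frac{\partial^2 f}{\partial x_n \partial x_1} & \frac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \frac{\partial^2 f}{\partial x_n^2}
\end{pmatrix} \]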
Linear Algebra provides a method, called the Sylvester criterion, to decide whether the
associated quadratic form is positive or negative definite by checking n determinants.
The quadratic form $d^2 f$, and so the Hessian, can also be used to check the
convexity of a $C^2$ function.
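As an illustration of the Sylvester criterion, the sketch below computes the n leading principal minors of a symmetric matrix and reads off definiteness from their signs (all positive: positive definite; signs alternating starting with a negative one: negative definite). The function names are hypothetical and the determinant routine is a plain floating-point Gaussian elimination, so it is only meant for small, well-conditioned examples.

```python
def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap changes the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det

def leading_minors(matrix):
    """Determinants of the upper-left k x k blocks, k = 1, ..., n."""
    return [determinant([row[:k] for row in matrix[:k]])
            for k in range(1, len(matrix) + 1)]

def classify(matrix):
    """Definiteness of a symmetric matrix according to the Sylvester criterion."""
    minors = leading_minors(matrix)
    if all(d > 0 for d in minors):
        return "positive definite"
    if all((d < 0) if k % 2 == 1 else (d > 0)
           for k, d in enumerate(minors, start=1)):
        return "negative definite"
    return "inconclusive"
```

For example, the Hessian `[[2, 1], [1, 2]]` of $x^2 + xy + y^2$ has leading minors 2 and 3, so the critical point at the origin is a strict minimum.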
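\[ \left[ \frac{\partial \phi}{\partial \dot{y}}(f + sh, f' + sh', x)\, h \right]_a^b - \int_a^b \frac{d}{dx}\left( \frac{\partial \phi}{\partial \dot{y}}(f + sh, f' + sh', x) \right) h\, dx \]
\[ = -\int_a^b \frac{d}{dx}\left( \frac{\partial \phi}{\partial \dot{y}}(f + sh, f' + sh', x) \right) h\, dx. \]
Going back to the derivative of the functional, we have
\[ \frac{d}{ds} \int_a^b \phi(f + sh, f' + sh', x)\, dx = \]
\[ \int_a^b \left( \frac{\partial \phi}{\partial y}(f + sh, f' + sh', x) - \frac{d}{dx} \frac{\partial \phi}{\partial \dot{y}}(f + sh, f' + sh', x) \right) h\, dx. \]
For $s = 0$ we get that
\[ \int_a^b \left( \frac{\partial \phi}{\partial y}(f, f', x) - \frac{d}{dx} \frac{\partial \phi}{\partial \dot{y}}(f, f', x) \right) h\, dx = 0 \]
for all the perturbations h satisfying the required assumptions. As it is possible
to take h with arbitrarily small support contained in (a, b), we deduce that
\[ \frac{\partial \phi}{\partial y}(f, f', x) - \frac{d}{dx} \frac{\partial \phi}{\partial \dot{y}}(f, f', x) = 0 \]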
Therefore, the real function f : R2 → R defined by f (x, y) = |p(x+iy)| satisfies
Together with the continuity, that implies the existence of $(x_0, y_0)$ such that
\[ |p(z)|^2 = a_0 \bar{a}_0 + \bar{a}_0 a_n z^n + a_0 \bar{a}_n \bar{z}^n + \dots \]
where the following non-written terms are of degree greater than n. Put a0 a0 =
A(cos α + i sin α) and z = r(cos θ + i sin θ) and note that A ̸= 0. Now we can
use the fact that $|a_0| = m$ and De Moivre’s formula to get
Theorem 4.5.2. Let A be a symmetric matrix. Then there exists an
orthogonal matrix Q such that $Q^t A Q$ is diagonal.
Proof. For an $n \times n$ matrix $B = (b_{ij})$ define $\sigma(B) = \sum_{i \ne j} b_{ij}^2$. The set
of orthogonal $n \times n$ matrices, call it $\Omega$, can be identified with a closed and
bounded subset of $\mathbb{R}^{n^2}$. Therefore $\Omega$ is compact. Assume that the symmetric
matrix A is fixed and consider a function $f : \Omega \to \mathbb{R}$ defined by
That function attains its minimum value at some matrix, say B. We claim
that B is diagonal. The idea is to prove that if B is not diagonal, then we
could obtain a smaller value for σ by applying an orthogonal transformation
to B. Thus, assume bsr = brs ̸= 0 for some r ̸= s. Consider a rotation
\[ \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
4.6 Rationale and remarks
Differentiability is a fundamental notion in Analysis. Students should
get used to all its different versions as well as to the right use of the chain rule. In
practice, people do not differentiate functions; they differentiate one variable with respect
to another. From the formal point of view, that always entails an abuse of language.
Nevertheless, future mathematicians should get used to that usage in order to
communicate with scientists and engineers.
The Taylor polynomial of second degree with vector values and infinitesimal
remainder is presented mostly to show the complexity of the second differential
from the point of view of the operator spaces involved.
4.7 Exercises
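1. Compute at any point the directional derivative of
\[ \ln(e^x + e^y) \]
along directions parallel to the line x = y. Provide a simple explanation
for the result.
2. Calculate
\[ x\frac{\partial u}{\partial x} + y\frac{\partial u}{\partial y} + z\frac{\partial u}{\partial z} \]
for
\[ u = \arctan\frac{y^2 + z^2}{x^2}. \]
3. Consider the function
\[ u = xy\,\frac{x^2 - y^2}{x^2 + y^2}. \]
Compute at (0, 0) the following derivatives:
\[ \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial y}\right) \quad \text{and} \quad \frac{\partial}{\partial y}\left(\frac{\partial u}{\partial x}\right). \]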
4. Consider the function on $\mathbb{R}^2 \setminus \{(0, 0)\}$ defined by
\[ f(x, y) = \frac{x^3 y^3}{x^4 + y^4}. \]
Show that it is possible to extend the function continuously to $\mathbb{R}^2$. Then
study its differentiability.
5. Consider a function F : R2 → R of the form F (x, y) = yf (x) + xg(y)
where f, g : R → R are continuous at 0. Prove that F is differentiable
at (0, 0). Find a reasonable hypothesis to guarantee that F is twice
differentiable at (0, 0).
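6. Let $w = f(x, y, z)$ and $z = g(x, y)$. Then
\[ \frac{\partial w}{\partial x} = \frac{\partial w}{\partial x}\frac{\partial x}{\partial x} + \frac{\partial w}{\partial y}\frac{\partial y}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x} = \frac{\partial w}{\partial x} + \frac{\partial w}{\partial z}\frac{\partial z}{\partial x} \]
since $\frac{\partial x}{\partial x} = 1$ and $\frac{\partial y}{\partial x} = 0$. Therefore $\frac{\partial w}{\partial z}\frac{\partial z}{\partial x} = 0$. Now, assume that
$w = x + y + z$ and $z = x + y$. Then we get $\frac{\partial w}{\partial z} = \frac{\partial z}{\partial x} = 1$ and so 1 = 0. Please
find the mistake.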
7. Find and classify the critical points of $f(x, y) = (x^2 + y^2)\, e^{x^2 - y^2}$.
8. Find the maximum volume of a rectangular parallelepiped contained in
the ellipsoid
\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1. \]
9. Find the minimum-volume ellipsoid
\[ E(a, b, c) = \left\{(x, y, z) : \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} = 1\right\} \]
passing through the point (1, 2, 3).
10. Prove that the maximum value of the function
\[ f(x_1, x_2, \dots, x_n) = x_1^2 x_2^2 \cdots x_n^2 \]
on the sphere $S = \{(x_1, x_2, \dots, x_n) : x_1^2 + x_2^2 + \cdots + x_n^2 = 1\}$ is $n^{-n}$. Deduce,
as an application, the arithmetic-geometric mean inequality: for every
$n \in \mathbb{N}$ and $a_k \ge 0$ with $1 \le k \le n$ we have
\[ \sqrt[n]{a_1 a_2 \cdots a_n} \le \frac{a_1 + a_2 + \cdots + a_n}{n}. \]
2 2
11. Let the function f : R3 → R be defined by f (x, y, z) = xx2 +y z
2 for
$(x, y, z) \ne (0, 0, z)$ and $f(0, 0, z) = 0$. Find $D_v f(1, 1, \sqrt{2})$, where v is a
unit vector tangent at $(1, 1, \sqrt{2})$ to the curve
\[ x^2 + y^2 = 2x; \]
\[ x^2 + y^2 + z^2 = 4. \]
14. Consider the function $\langle x | a \rangle\, e^{-\|x\|^2}$ defined on $\mathbb{R}^n$, and find its maximum
and minimum values.
15. Determine the values of the parameters a, b ∈ R for which the surface
\[ z = e^{ax + y^2} + b\cos(x^2 + y^2) \]
\[ \log(1 + x^2 + y^2 + z^2). \]
18. We say that a function $f : \Omega \to \mathbb{R}$ is analytic on an open domain $\Omega \subset \mathbb{R}^n$
if it admits a power series expansion centred at any point of $\Omega$, that is, for
every $(x_1^0, \dots, x_n^0) \in \Omega$ there are coefficients $(a_{k_1, \dots, k_n})$ such that
\[ f(x_1, \dots, x_n) = \sum_{k_1 = 0}^{\infty} \cdots \sum_{k_n = 0}^{\infty} a_{k_1, \dots, k_n} (x_1 - x_1^0)^{k_1} \cdots (x_n - x_n^0)^{k_n} \]
\[ x^2/16 + y^2/9 \le 1. \]
Chapter 5
The first aim in this chapter is to prove the inverse mapping theorem for
maps defined on subsets of $\mathbb{R}^d$. In this context, the one-variable condition
$f'(x_0) \ne 0$ is replaced by the non-degeneracy of $df(x_0)$, that is, its invertibility
as a map on $\mathbb{R}^d$. For that, we need a couple of lemmata to understand the local
behaviour of a $C^1$ map on $\mathbb{R}^d$. That information will play a crucial role in the
proof of the change of variables theorem for multiple integrals.
We will prove the first auxiliary results in the frame of Banach spaces
because we do not need any special property of Rd . Before stating the first
lemma, let us state this “mean value theorem” for vector valued functions,
which is just a corollary of Theorem 4.1.5:
∥df (x) − I∥ ≤ η
of statement (b).
The other set inclusion is more delicate and requires Banach’s fixed point
theorem. Assume that $y \in B[0, (1 - \eta)r]$. We want to find $x \in B[0, r]$ such
that $y = f(x)$. Observe that such a point x is a fixed point of the map
ϕ(x) := x − f (x) + y
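The fixed-point reformulation can be tried out numerically. The sketch below iterates $\phi(x) = x - f(x) + y$ for a hypothetical planar map f that is close to the identity near the origin (so that $df - I$ has small norm there); the map, the target y and the number of iterations are assumptions chosen only for illustration.

```python
import math

def f(x):
    """Hypothetical map close to the identity near 0, with f(0) = 0."""
    x1, x2 = x
    return (x1 + 0.2 * math.sin(x2), x2 + 0.2 * x1 * x1)

def solve_by_fixed_point(y, iterations=60):
    """Iterate phi(x) = x - f(x) + y starting from x = 0."""
    x = (0.0, 0.0)
    for _ in range(iterations):
        fx = f(x)
        x = (x[0] - fx[0] + y[0], x[1] - fx[1] + y[1])
    return x

y = (0.3, -0.2)
x = solve_by_fixed_point(y)
# How far f(x) is from the target y after the iteration.
residual = max(abs(f(x)[0] - y[0]), abs(f(x)[1] - y[1]))
```

Since $\phi$ is a contraction on a small ball in this example, the iterates converge geometrically and the residual becomes negligible.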
for $\|x\| < \delta$. Put $x = f^{-1}(y)$ and observe that if $y \in f(B(0, \delta))$ we have
where we are using one of the inequalities from (a). Note that the last
inequality holds for every $y \in f(B(0, \delta))$, which is a neighbourhood of 0. As $\varepsilon > 0$ was
arbitrary, that implies the differentiability of $f^{-1}$ at 0 with $d(f^{-1})(0) = I$.
Lemma 5.1.2. Let f : D ⊂ E → E be a differentiable map and let x0 ∈ D
such that df is continuous at x0 and df (x0 ) has a continuous inverse. Then
for every $0 < \eta < 1$ there exists $\delta > 0$ such that $f|_{B[x_0, \delta]}$ is one-to-one, $f^{-1}$ is
differentiable at $f(x_0)$ and
\[ f(x_0) + df(x_0)(B[0, (1 - \eta)r]) \subset f(B[x_0, r]) \subset f(x_0) + df(x_0)(B[0, (1 + \eta)r]) \]
for every $0 \le r \le \delta$. In particular, the image through f of a neighbourhood of
$x_0$ is a neighbourhood of $f(x_0)$. Moreover, $f(U)$ is open whenever $U \subset D$ is
open and $df(x)$ has a continuous inverse at every point $x \in U$.
For the proof we will use the shift map τh (x) = x + h.
Proof. Since f is of class $C^1$, we may restrict our attention to a neighbourhood of
$x_0$ where df is nonsingular. By the previous lemma we may fix neighbourhoods
U of $x_0$ and V of $y_0$ such that f is a bijection of U onto V. For any $x \in U$ the
application of the previous lemma gives us that $f^{-1}$ is differentiable at $f(x)$,
thus $d(f^{-1})(y)$ is defined for every $y \in V$. Before proving that $f^{-1}(y)$ is $C^k$,
note that the map sending the nonsingular linear maps A on $\mathbb{R}^d$ to their inverses
$A^{-1}$ is $C^\infty$. Indeed, use the matrix expression for A and observe that the
coefficients of the matrix of $A^{-1}$ are polynomials in the coefficients of A divided
by the determinant, which is a non-vanishing polynomial in the coefficients of
A. Now we will proceed by induction: if k = 1 then $d(f^{-1})(y) = (df(f^{-1}(y)))^{-1}$
is continuous as a composition of continuous maps and so $f^{-1}$ is $C^1$. Assume
that the theorem is proven for $k - 1$ and f is $C^k$. In such a case we know
that $df(x)$ is $C^{k-1}$ and, by the induction hypothesis, $f^{-1}(y)$ is $C^{k-1}$. Therefore,
$d(f^{-1})(y) = (df(f^{-1}(y)))^{-1}$ is $C^{k-1}$ as a composition of $C^{k-1}$ maps, which means
that $f^{-1}$ is $C^k$.
G(x0 , y0 ) = (x0 , 0), that we may assume of the form U × B(0, δ), with values
in a neighbourhood of (x0 , y0 ), that we may assume of the form U × V . The
condition U × V ⊂ D can be achieved by shrinking U and δ > 0. Note that if
x ∈ U , then G−1 (x, 0) = (x, y) with y ∈ V , and thus F (x, y) = 0. That point
y is unique because G is injective on U × V , therefore we may define f (x) = y.
Now, the map f can be written as the composition of G−1 with a couple of
linear maps, and thus it is C k .
Do not confuse it with the tangent manifold at $x_0$, that is, the affine space
$x_0 + T_{x_0} M$. It is not difficult to prove that, for an $(n - 1)$-dimensional
manifold that can be expressed as the graph of a real function, the tangent manifold
coincides with the tangent plane introduced in Section 4.1.
The previous proposition only gives local information on the set, that is,
being a manifold. Additional properties have to be obtained by different tech-
niques.
Mλ = {(x, y) ∈ R2 : f (x, y) = λ}
\[ s = \inf\{f(x, y) : x^2 + y^2 = r^2\}. \]
The convexity implies that $f(x, y) \ge (s/r)\sqrt{x^2 + y^2}$ for $x^2 + y^2 \ge r^2$. We
easily deduce that $M_\lambda$ is bounded, and thus compact (it is evidently closed).
We also deduce that f is strictly increasing on any line starting at (0, 0). The
mapping $\phi : M_\lambda \to T$ defined by
\[ \phi(x, y) = \frac{(x, y)}{\sqrt{x^2 + y^2}} \]
Elimination of variables. When we have a system of equations like
f (x, y, z) = 0
g(x, y, z) = 0
5.3.1 Lagrange multipliers
Let $f : M \to \mathbb{R}$ be a $C^1$ function defined on a 1-dimensional $C^1$ manifold
$M = \{(x, y) : g(x, y) = 0\}$ (a curve in $\mathbb{R}^2$). We look for the relative extrema
(maximum or minimum) of f on M. Assume $(x_0, y_0) \in M$ is one such
point. We can represent M around $(x_0, y_0) \in M$ by a $C^1$ parameterization
$(x(t), y(t))$ with $x_0 = x(0)$, $y_0 = y(0)$. Necessarily we have
\[ \frac{d}{dt} f(x(t), y(t)) \Big|_{t=0} = 0. \]
\[ \frac{\partial g}{\partial x}(x_0, y_0)\, x'(0) + \frac{\partial g}{\partial y}(x_0, y_0)\, y'(0) = 0. \]
∇f (x0 , y0 ) + λ∇g(x0 , y0 ) = 0.
Note that the argument works the other way around, so the existence of such
a λ implies that (x0 , y0 ) is a critical point of f on M . Consequently, we deduce
that the extrema of f on M are contained among the solutions of the equations
\[ \begin{cases} \nabla f(x, y) + \lambda \nabla g(x, y) = 0, \\ g(x, y) = 0. \end{cases} \tag{5.1} \]
Curiously, solving the system (5.1) is equivalent to searching for the critical
points of the function
\[ F(x, y, \lambda) = f(x, y) + \lambda g(x, y). \]
The new variable $\lambda$ is called a Lagrange multiplier and its introduction reduces
the constrained problem of extrema to an unconstrained problem. That can
be done in similar terms with more variables and constraints, adding one
multiplier for each constraint. For instance, looking for the extrema of $f(x, y, z)$ on
the 1-dimensional manifold $\{(x, y, z) : g(x, y, z) = h(x, y, z) = 0\}$ is equivalent
to investigating the critical points of
\[ F(x, y, z, \lambda, \nu) = f(x, y, z) + \lambda g(x, y, z) + \nu h(x, y, z). \]
Example 5.3.1. The Cobb-Douglas production function (with 3 variables)
is a function that models the profits after an investment in different stages of
the manufacturing of a product: materials, machinery... and maybe technology and
marketing too. The function has the form
\[ f(x, y, z) = c\, x^\alpha y^\beta z^\gamma, \]
where $c, \alpha, \beta, \gamma > 0$ and by homogeneity we should have $\alpha + \beta + \gamma = 1$.
We wish to maximize the production f with a limited budget $x + y + z \le m$.
Obviously, we can restrict ourselves to a budget equal to m. As Lagrange
auxiliary function we can take
\[ F(x, y, z, \lambda) = x^\alpha y^\beta z^\gamma + \lambda(x + y + z). \]
The partial derivatives should be zero:
\[ \alpha x^{\alpha - 1} y^\beta z^\gamma + \lambda = 0 \]
\[ \beta x^\alpha y^{\beta - 1} z^\gamma + \lambda = 0 \]
\[ \gamma x^\alpha y^\beta z^{\gamma - 1} + \lambda = 0 \]
Multiplying by x, y, z respectively and adding we get
\[ (\alpha + \beta + \gamma)\, x^\alpha y^\beta z^\gamma + \lambda(x + y + z) = x^\alpha y^\beta z^\gamma + \lambda m = 0. \]
Using that information in the first equation we get
\[ \alpha x^{\alpha - 1} y^\beta z^\gamma = m^{-1} x^\alpha y^\beta z^\gamma, \]
therefore $x = \alpha m$. Analogously, $y = \beta m$ and $z = \gamma m$.
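The conclusion of the example can be compared with a brute-force search on the budget plane. This is only a sketch under stated assumptions: the exponents, the budget and the grid resolution used below are arbitrary sample values, and the helper names are made up for the illustration.

```python
def production(x, y, z, alpha, beta, gamma, c=1.0):
    """Cobb-Douglas production c * x^alpha * y^beta * z^gamma."""
    return c * x ** alpha * y ** beta * z ** gamma

def lagrange_candidate(alpha, beta, gamma, m):
    """Critical point found with the Lagrange multiplier in the example."""
    return (alpha * m, beta * m, gamma * m)

def best_on_grid(alpha, beta, gamma, m, steps=200):
    """Brute-force maximum over a grid of the budget plane x + y + z = m."""
    best_value, best_point = -1.0, None
    for i in range(1, steps):
        for j in range(1, steps - i):
            x, y = m * i / steps, m * j / steps
            z = m - x - y
            value = production(x, y, z, alpha, beta, gamma)
            if value > best_value:
                best_value, best_point = value, (x, y, z)
    return best_value, best_point

# Sample data (assumed for the illustration): exponents adding up to 1, budget 10.
candidate = lagrange_candidate(0.5, 0.3, 0.2, 10.0)
grid_value, grid_point = best_on_grid(0.5, 0.3, 0.2, 10.0)
```

With these sample values the grid search should land on, or right next to, the candidate point $(\alpha m, \beta m, \gamma m)$.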
5.3.2 Functional dependence
Now we will discuss functional dependence. It is an easy task to check that
the functions $\cos x$ and $\sin x$ are linearly independent. However, they are
algebraically dependent since $\cos^2 x + \sin^2 x = 1$. More generally, we say that the
functions $f_k : D \subset \mathbb{R}^n \to \mathbb{R}$ for $k = 1, \dots, m$ are functionally dependent if
there is a nontrivial $F : \Omega \subset \mathbb{R}^m \to \mathbb{R}$ such that
Here “nontrivial” means that dF has maximal rank. Note that any pair of $C^1$
one-variable functions f and g are functionally dependent on some interval. Indeed,
we may assume that $(f'(x_0), g'(x_0)) \ne (0, 0)$, otherwise both functions are
constant and so dependent. Therefore, one of the functions is locally monotone.
Let us assume it is f, and thus $f^{-1}$ is defined on some neighbourhood. Now
note that $F(f(x), g(x)) = 0$ where $F(u, v) = g(f^{-1}(u)) - v$ is nontrivial
(rank 1). For a pair of functions f and g defined on an open subset of
$\mathbb{R}^2$, functional dependence is locally equivalent to a relation of the form
$g(x, y) = G(f(x, y))$ thanks to the implicit function theorem. Observe that
$\nabla g = G' \nabla f$ at every point and thus
\[ \frac{\partial(f, g)}{\partial(x, y)} = 0, \]
where we are using the standard notation for the Jacobian determinant.
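The vanishing of the Jacobian determinant for a dependent pair can be observed numerically. In the sketch below the Jacobian is approximated by central differences; the sample pair $f = x^2 + y$, $g = e^{f}$ and the evaluation point are assumptions chosen only for illustration.

```python
import math

def jacobian_determinant(f, g, x, y, step=1e-6):
    """Central-difference approximation of d(f, g)/d(x, y) at (x, y)."""
    fx = (f(x + step, y) - f(x - step, y)) / (2 * step)
    fy = (f(x, y + step) - f(x, y - step)) / (2 * step)
    gx = (g(x + step, y) - g(x - step, y)) / (2 * step)
    gy = (g(x, y + step) - g(x, y - step)) / (2 * step)
    return fx * gy - fy * gx

def f(x, y):
    return x * x + y

def g(x, y):
    # g = G(f) with G = exp, so f and g are functionally dependent.
    return math.exp(f(x, y))

dependent = jacobian_determinant(f, g, 0.3, 0.4)
independent = jacobian_determinant(lambda x, y: x, lambda x, y: y, 0.3, 0.4)
```

The first determinant is zero up to rounding, while the pair of coordinate functions gives a determinant close to 1.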
Then the chain rule implies
\[ \nabla f_n = \sum_{k=1}^{n-1} \frac{\partial G}{\partial y_k} \nabla f_k \]
and so the Jacobian vanishes. The converse is a little more technical, thus we
will prove the particular case n = 2, which is enough to show the ideas behind it.
Suppose we are given functions $f(x, y)$ and $g(x, y)$ such that
\[ \frac{\partial(f, g)}{\partial(x, y)} = 0 \]
on $D \subset \mathbb{R}^2$. Consider the system of equations
\[ u - f(x, y) = 0 \]
\[ v - g(x, y) = 0 \]
Assume that the coefficients of $\frac{\partial(f, g)}{\partial(x, y)}$ do not all vanish at once on any open
subset, otherwise all the functions are constant there and so they are functionally
dependent. Without loss of generality we may assume that $\frac{\partial f}{\partial y} \ne 0$ on some
open subset. In that case, we may use the first equation to solve y as a function
of $(x, u)$, that is, $y = \phi(x, u)$. Later we will need the derivative $\frac{\partial \phi}{\partial x}$ expressed
in terms of f. That can be done by implicit differentiation:
\[ 0 = \frac{\partial}{\partial x}\big(u - f(x, \phi(x, u))\big) = -\frac{\partial f}{\partial x} - \frac{\partial f}{\partial y}\frac{\partial \phi}{\partial x}, \]
therefore $\frac{\partial \phi}{\partial x} = -\left(\frac{\partial f}{\partial y}\right)^{-1} \frac{\partial f}{\partial x}$. Consider the composition
5.3.3 Envelope of a family of curves.
Consider a family of curves in $\mathbb{R}^2$ depending on a parameter. The most general
way to express such a family is the implicit form
f (x, y, t) = 0
x cos t + y sin t = 1
Example 5.3.3. Find the envelope of all the trajectories of an object which
is thrown from the same point, at the same speed, and affected only by the
(uniform) gravitational force.
Without loss of generality, the object departs from the origin. We will consider
only the trajectories contained in a vertical plane XY (the spatial case will
follow by symmetry). Let v be the speed, $\theta$ the launch angle and g
the gravitational force per unit of mass. Elementary Newtonian Physics gives
the trajectory as a function of the time t (t = 0 at the moment of departure):
\[ x = (v\cos\theta)\, t, \]
\[ y = (v\sin\theta)\, t - (g/2)\, t^2. \]
The time parameter can be eliminated (put $t = x(v\cos\theta)^{-1}$ in the second
equation):
\[ y = \frac{x v \sin\theta}{v\cos\theta} - \frac{g}{2}\left(\frac{x}{v\cos\theta}\right)^2 = (\tan\theta)\, x - \frac{g}{2v^2}(\tan^2\theta + 1)\, x^2. \]
Therefore, the family of trajectories in terms of the angle $\theta$ is given by
\[ y - (\tan\theta)\, x + \frac{g}{2v^2}(\tan^2\theta + 1)\, x^2 = 0. \]
Differentiation with respect to $\theta$ gives
\[ -(\tan^2\theta + 1)\, x + \frac{g}{2v^2}(2\tan\theta)(\tan^2\theta + 1)\, x^2 = 0. \]
The factor $\tan^2\theta + 1$ can be eliminated, so $x\tan\theta = v^2/g$. Substituting
above produces
\[ 0 = y - \frac{v^2}{g} + \frac{g}{2v^2}\left(\frac{v^2}{g}\right)^2 + \frac{g}{2v^2}\, x^2 = y - \frac{v^2}{2g} + \frac{g}{2v^2}\, x^2. \]
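The envelope obtained in the example can be checked numerically: every trajectory should stay on or below the parabola $y = \frac{v^2}{2g} - \frac{g}{2v^2}x^2$ and touch it at the point where $x\tan\theta = v^2/g$. The sketch below assumes sample values for v and g and uses hypothetical function names.

```python
import math

def trajectory_height(x, theta, v=10.0, g=9.81):
    """Height of the trajectory with launch angle theta above the abscissa x."""
    t = math.tan(theta)
    return t * x - g / (2 * v * v) * (t * t + 1) * x * x

def envelope_height(x, v=10.0, g=9.81):
    """Height of the envelope parabola above the abscissa x."""
    return v * v / (2 * g) - g / (2 * v * v) * x * x

def contact_angle(x, v=10.0, g=9.81):
    """Launch angle whose trajectory touches the envelope above x > 0."""
    return math.atan(v * v / (g * x))

# Largest amount by which a sampled trajectory exceeds the envelope (should be <= 0).
angles = [-1.5 + 3.0 * k / 300 for k in range(301)]
worst_excess = max(
    trajectory_height(x, theta) - envelope_height(x)
    for x in (1.0, 3.0, 7.0)
    for theta in angles
)
```

The difference between the envelope and a trajectory equals $\frac{g}{2v^2}(x\tan\theta - v^2/g)^2$, which explains both the inequality and the contact condition.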
mapping theorem and the implicit function theorem are still valid in the
Banach framework, but the proof requires knowing that the inversion of operators is
a $C^\infty$ mapping, analytic in fact.
There are interesting directions to suggest for a bachelor's thesis (TFG). For instance,
the inverse mapping theorem on $\mathbb{R}^n$ under weaker hypotheses, proved by Saint-Raymond,
or the study of global invertibility following the ideas of Hadamard.
5.5 Exercises
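1. Study the local and global invertibility of the mapping $f : D \subset \mathbb{R}^3 \to \mathbb{R}^3$
defined by
\[ f(x, y, z) = \left( \frac{x}{1 - x - y - z}, \frac{y}{1 - x - y - z}, \frac{z}{1 - x - y - z} \right), \]
where $D = \{(x, y, z) \in \mathbb{R}^3 : x + y + z \ne 1\}$.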
2. Study the local and global invertibility of the mapping f : R2 → R2
defined by f (x, y) = (x2 − y 2 , 2xy).
3. Consider the mapping J : R2 \ {(0, 0)} → R2 defined by means of polar
coordinates on the domain and Cartesian for the image by
(r, θ) → ((r + 1/r) cos θ, (r − 1/r) sin θ).
Prove that every point of $\mathbb{R}^2 \setminus \{(-2, 0), (2, 0)\}$ has exactly two preimages.
Find the maximal regions in R2 where J is a diffeomorphism.
4. Show that the equation $x^2 + xy + y^3 - 11 = 0$ defines y as a function of
x around x = 1, taking the value y = 2. Compute the first and second
derivatives of that function at x = 1.
5. The mapping $f(x, y, z) = (y^3 + z^5, x + z^5, x + y^3)$ is globally invertible on
$\mathbb{R}^3$. Does it satisfy the hypotheses of the inverse mapping theorem at
(0, 0, 0)? What is the relation with the possible differentiability of $f^{-1}$?
6. Consider the mapping $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $f(x, y) = (u, v)$ where
$u = x$, $v = y - x^2$ if $x^2 \le y$, $v = (y^2 - x^2 y)/x^2$ if $0 \le y < x^2$ and
$v(x, y) = -v(x, -y)$ in case that $y < 0$. Prove that f is differentiable at
(0, 0), compute its differential and show that it is one-to-one. Does it satisfy
the hypotheses of the inverse mapping theorem around (0, 0)?
7. Assume that the equation $f(x, y, z) = 0$ defines every variable as a
function of the remaining two. Show that
\[ \frac{\partial x}{\partial y} \frac{\partial y}{\partial z} \frac{\partial z}{\partial x} = -1. \]
8. Find the extreme values of the implicit functions defined by the equation
\[ y^3 - x^2 y + x^3 - 3 = 0. \]
14. Prove that the equation
π cos θ = t θ
has a unique solution θ(t) for t in a neighbourhood of 3/2 and find θ(3/2).
Prove also that θ′ (t) exists on a neighbourhood and compute θ′ (3/2).
15. Prove that the equations
\[ 4x^2 - 3y^2 - z = 0 \]
\[ x^2 + y^2 + z^2 = 24 \]
define a $C^\infty$ curve on a neighbourhood of $(2, -2, 4)$. Find the tangent
line at that point. Show that, actually, the equations define a closed $C^\infty$
curve.
16. Let f : D ⊂ Rn → Rm be a C 1 mapping and x0 ∈ D. Prove that:
(a) if df (x0 ) is one-to-one, then there is a neighbourhood V of x0 such
that f |V is one-to-one;
(b) if df (x0 ) is onto, then there is a neighbourhood V of x0 such that
f (V ) is a neighbourhood of f (x0 ).
17. Check that these functions are functionally dependent and find their
relation
\[ f(x, y) = \frac{x}{y}; \qquad g(x, y) = \frac{x - y}{x + y}. \]
18. Check that these functions are functionally dependent and find their
relation
19. Check that these functions are functionally dependent and find their
relation
21. Consider the set P ⊂ R3 defined by the equations
\[ x^2 + 4y^2 = 16 \]
25. An object is thrown from the origin of R2 with the same speed and
variable direction, and its movement is affected only by its weight so
it follows parabolic trajectories. Find the envelope of the family of all
possible trajectories.
26. Let f (x1 , . . . , xn ) = a1 x1 + · · · + an xn . Find the maximum of f on the
set
\[ B_p^n = \{(x_1, \dots, x_n) : |x_1|^p + \cdots + |x_n|^p \le 1\}. \]
27. Let X be a Banach space and let $L(X)$ denote the continuous linear
operators acting on X with the operator norm.
(a) Let $A \in L(X)$ be such that $\|I - A\| < 1$, where I is the identity map
on X. Show that A is invertible.
(b) Assume that $A \in L(X)$ is invertible. Prove the existence of $\delta > 0$
such that if $B \in L(X)$ with $\|B - A\| < \delta$, then B is invertible.
(c) Deduce that the assignment $A \to A^{-1}$ within the invertible
operators of $L(X)$ is continuous and, moreover, it is $C^\infty$.
Chapter 6
Riemann Integral
The rectangle is non-degenerate if $m(R) > 0$. Clearly, the topological interior
of R is the set
\[ (a_1, b_1) \times \cdots \times (a_d, b_d). \]
Two rectangles $R_1$ and $R_2$ are said to be non-overlapping if they meet only on their
borders.
Any non-degenerate rectangle R can be tiled with smaller non-degenerate
rectangles $\{R_i\}_{i=1}^n$ which are pairwise non-overlapping. To see that, just
consider the rectangles of the form $I_1 \times \cdots \times I_d$ where each $I_k$ is an interval
coming from a finite partition of $[a_k, b_k]$. Then arrange all these rectangles into
a sequence $\{R_i\}_{i=1}^n$. The tiling $\{R_i\}_{i=1}^n$ of R obtained in this way is called a
grill of R. It is not difficult to see that $m(R) = \sum_{i=1}^n m(R_i)$ in this case, but
something more general is true. Given a rectangle R, a collection $\pi = \{R_i\}_{i=1}^n$
is said to be a partition of R if its members are pairwise non-overlapping and $\bigcup_{i=1}^n R_i = R$.
Proposition 6.1.1. If $\{R_i\}_{i=1}^n$ is a partition of a rectangle R, then
\[ m(R) = \sum_{i=1}^n m(R_i). \]
A partition $\pi' = \{R'_j\}_{j=1}^m$ is finer than $\pi = \{R_i\}_{i=1}^n$ if for every $j : 1 \dots m$
there is $i : 1 \dots n$ such that $R'_j \subset R_i$. Observe that in this case we have
\[ R_i = \bigcup \{R'_j : R'_j \subset R_i\}. \]
6.2 Integrals on compact rectangles
Given a bounded function $f : R \to \mathbb{R}$ defined on a rectangle and a partition
$\pi = \{R_i\}_{i=1}^n$ of R, we consider the numbers
\[ L(f, \pi) = \sum_{i=1}^n \inf\{f, R_i\}\, m(R_i) \]
\[ U(f, \pi) = \sum_{i=1}^n \sup\{f, R_i\}\, m(R_i) \]
named lower and upper sums respectively. Observe that for $\pi_1 \le \pi_2$ partitions
of R we always have
The Darboux lower and upper integrals of f (on R) are defined this way:
\[ \underline{\int} f = \sup\{L(f, \pi) : \pi \text{ partition of } R\} \]
\[ \overline{\int} f = \inf\{U(f, \pi) : \pi \text{ partition of } R\}. \]
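Lower and upper sums can be computed explicitly for simple cases. The sketch below works on the unit square with a uniform grid and, as a simplifying assumption that is not in the text, requires f to be non-decreasing in each variable, so that the infimum and supremum on each cell are attained at opposite corners.

```python
def darboux_sums(f, n):
    """Lower and upper sums of f on [0, 1]^2 for the uniform n x n grid.

    Assumes f is non-decreasing in each variable, so that on every cell the
    infimum is attained at the lower-left corner and the supremum at the
    upper-right corner.
    """
    cell_measure = 1.0 / (n * n)
    lower = upper = 0.0
    for i in range(n):
        for j in range(n):
            lower += f(i / n, j / n) * cell_measure
            upper += f((i + 1) / n, (j + 1) / n) * cell_measure
    return lower, upper

# Sample function f(x, y) = x * y, whose integral on the unit square is 1/4.
lower_100, upper_100 = darboux_sums(lambda x, y: x * y, 100)
lower_200, upper_200 = darboux_sums(lambda x, y: x * y, 200)
```

For this sample function the gap between upper and lower sums on the n x n grid is 1/n, and refining the grid raises the lower sum and lowers the upper sum, squeezing the value 1/4.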
Hint of Proof. Just notice that osc(f, Ri ) = sup{f, Ri } − inf{f, Ri }.
The reader who is acquainted with the properties of the Riemann integral
for one-variable functions will not see anything new in the following result.
Proposition 6.2.4. Let $\mathcal{R}(R)$ denote the set of functions which are Riemann
integrable on R. Then
1. $\mathcal{R}(R)$ is a vector space and $\int_R (\alpha f + \beta g) = \alpha \int_R f + \beta \int_R g$ whenever
$f, g \in \mathcal{R}(R)$ and $\alpha, \beta \in \mathbb{R}$.
2. $\mathcal{R}(R)$ is stable under products (so it is an algebra).
3. If $f, g \in \mathcal{R}(R)$ and $f \le g$, then $\int_R f \le \int_R g$.
and Z Z Z Z
αf = α f, αf = α f
R R R R
Integrability of products can be reduced to integrability of squares of positive
functions. In such a case, we have
i) f is Riemann integrable on R;
ii) {x ∈ R : osc(f, x) ≥ δ} is of null content for every δ > 0;
iii) the set of discontinuity points of f is of null measure.
Proof. Note that the equivalence between ii) and iii) is a consequence of this
set equality
\[ \{x \in R : \mathrm{osc}(f, x) > 0\} = \bigcup_{n=1}^{\infty} \{x \in R : \mathrm{osc}(f, x) \ge 1/n\} \]
bearing in mind that the first set consists of the discontinuity points of f and the second
is represented as a union of compact subsets of R.
Suppose that f is Riemann integrable. For $\varepsilon, \delta > 0$, take a partition $\{R_i\}_{i=1}^n$
of R into rectangles such that
\[ \sum_{i=1}^n \mathrm{osc}(f, R_i)\, m(R_i) < \delta\varepsilon. \]
Consider the open set $O = \bigcup_{i=1}^n R_i^\circ$. If $y \in O \cap \{x \in R : \mathrm{osc}(f, x) > \delta\}$, then
$\mathrm{osc}(f, R_i) > \delta$ if $y \in R_i$. Take $N = \{i : 1 \le i \le n, \mathrm{osc}(f, R_i) > \delta\}$ and observe
that
\[ \delta \sum_{i \in N} m(R_i) < \sum_{i \in N} \mathrm{osc}(f, R_i)\, m(R_i) < \delta\varepsilon \]
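\[ \le M \sum_{R_i^\circ \subset O} m(R_i) + \frac{\varepsilon}{m(R)} \sum_{R_i \subset R \setminus O} m(R_i) \le M \frac{\varepsilon}{M} + \frac{\varepsilon}{m(R)}\, m(R) = 2\varepsilon. \]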
It is not difficult to check that the definition is independent of the chosen rectangle
R, and taking R(D). Properties of functions integrable on rectangles
extend naturally to R(D). In a similar fashion, for $f : \mathbb{R}^d \to \mathbb{R}$ with compact
support, that is, if the set $\{x \in \mathbb{R}^d : f(x) \ne 0\}$ is bounded, we may define $\int f$
in terms of integration on rectangles.
Proof. The discontinuities of χA happen exactly at the points of ∂A.
We have defined the Jordan content from the Riemann integral. The other
way around is possible, as the following result shows. The details of the proof
are left to the reader.
Proposition 6.4.3. Let $R \subset \mathbb{R}^d$ be a rectangle.
1. Let f : R → [0, +∞) be a bounded function and consider F = {(x, t) : x ∈
R, 0 ≤ t ≤ f(x)}. Then
\[ \underline{\int_R} f = c_*(F), \qquad \overline{\int_R} f = c^*(F) \]
Proposition 6.4.6. If $\{A_i\}_{i=1}^n \subset \mathbb{R}^d$ is a non-overlapping finite family of Jordan
sets, then its union is Jordan as well and $c\left(\bigcup_{i=1}^n A_i\right) = \sum_{i=1}^n c(A_i)$.
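\[ = \sum_{i \in N} \int_{D_i} |f(t_i) - f| + \sum_{i \notin N} \int_{D_i} |f(t_i) - f| \]
\[ \le M \sum_{i \in N} c(D_i) + \frac{\varepsilon}{2c(D)} \sum_{i \notin N} c(D_i) \le M c(O) + \frac{\varepsilon}{2c(D)}\, c(D) \le \varepsilon \]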
whenever the points ti ∈ Di are chosen.
In fact, the conclusion of the previous statement, suitably reformulated, implies
Riemann integrability. Indeed, if the Riemann sums
\[ \sum_{i=1}^{n} f(t_i)\, c(D_i) \]
have a common limit when the Jordan partition $\{D_i\}_{i=1}^n$ is either refined or
the maximum diameter of its sets goes to zero, then the function f must be
integrable on D.
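\[ c(D_{i,j}) = \frac{r_{i-1} + r_i}{2}\, (r_i - r_{i-1})(\theta_j - \theta_{j-1}) \]
The associated Riemann sum over D with evaluation at the central points is
\[ \sum_{i=1}^{n} \sum_{j=1}^{m} f\left( \frac{r_{i-1} + r_i}{2} \cos\left(\frac{\theta_{j-1} + \theta_j}{2}\right),\ \frac{r_{i-1} + r_i}{2} \sin\left(\frac{\theta_{j-1} + \theta_j}{2}\right) \right) c(D_{i,j}) \]
which approaches $\iint_D f(x, y)\, dx\, dy$. On the other hand, the sum coincides with
\[ \sum_{i=1}^{n} \sum_{j=1}^{m} \tilde{f}\left( \frac{r_{i-1} + r_i}{2},\ \frac{\theta_{j-1} + \theta_j}{2} \right) \frac{r_{i-1} + r_i}{2}\, (r_i - r_{i-1})(\theta_j - \theta_{j-1}) \]
which is a Riemann sum associated to $\iint_E f(r\cos\theta, r\sin\theta)\, r\, dr\, d\theta$. Refining
the partition in the sense of Theorem 6.4.8 gives the equality of the two
integrals of the thesis.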
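The midpoint Riemann sum over polar cells described above can be tried numerically. This is a sketch under stated assumptions: the domain is a full disc centred at the origin, the grid in r and θ is uniform, and the integrand used in the demonstration is a hypothetical choice with a known integral.

```python
import math


def polar_riemann_sum(f, radius, n, m):
    """Midpoint Riemann sum of f over the disc of the given radius, using polar
    cells D_{i,j} of content ((r_{i-1}+r_i)/2)(r_i-r_{i-1})(theta_j-theta_{j-1})."""
    dr = radius / n
    dtheta = 2.0 * math.pi / m
    total = 0.0
    for i in range(1, n + 1):
        r_mid = (i - 0.5) * dr            # (r_{i-1} + r_i) / 2
        content = r_mid * dr * dtheta     # c(D_{i,j}), independent of j
        for j in range(1, m + 1):
            t_mid = (j - 0.5) * dtheta    # (theta_{j-1} + theta_j) / 2
            total += f(r_mid * math.cos(t_mid), r_mid * math.sin(t_mid)) * content
    return total


# Hypothetical test integrand: f(x, y) = x^2 + y^2 on the unit disc, whose
# integral is int_0^{2 pi} int_0^1 r^3 dr dtheta = pi / 2.
approx = polar_riemann_sum(lambda x, y: x * x + y * y, 1.0, 200, 200)
print(approx, math.pi / 2)
```

With the constant function 1 the same sum returns the area of the disc, because the midpoint rule is exact for the linear factor r.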
6.5 Iterated integrals
Until this moment we have not said how Riemann integrals in $\mathbb{R}^d$ are computed.
The idea is to reduce them to iterated integrals in spaces of lower dimension,
which in practice means that everything can be reduced to one-dimensional integrals,
where the calculus of primitive functions is the main computational device.
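that implies
\[ \overline{\int_R} U \le \sum_{i=1}^{n} \sup\{U, R_i\}\, m(R_i) \le \sum_{i=1}^{n} \sum_{j=1}^{m} \sup\{f, R_i \times S_j\}\, m(R_i \times S_j) \]
Taking the infimum over partitions on the right-hand side we get that
\[ \overline{\int_R} U \le \overline{\int_{R \times S}} f = \int_{R \times S} f \]
Altogether this implies that $\underline{\int_R} U = \overline{\int_R} U$, so U is Riemann integrable on R. Similarly,
we have $\underline{\int_R} L = \overline{\int_R} L$ and so the Riemann integrability of L, as well as
the equality with $\int_{R \times S} f$. Now observe that $\int_R (U - L) = 0$ and the function
U − L is positive, so U = L except on a null measure set.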
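Therefore,
\[ \int_0^1 \int_x^1 e^{y^2}\, dy\, dx = \iint_D e^{y^2}\, dx\, dy = \int_0^1 \int_0^y e^{y^2}\, dx\, dy \]
\[ = \int_0^1 \left[ x e^{y^2} \right]_{x=0}^{y} dy = \int_0^1 y e^{y^2}\, dy = \left[ \frac{e^{y^2}}{2} \right]_{y=0}^{1} = \frac{e - 1}{2}. \]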
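The exchange of the order of integration above can be checked numerically. The sketch below uses a composite midpoint rule (an illustrative choice; the helper names are hypothetical) to evaluate both orders and compares them with the closed form (e − 1)/2.

```python
import math


def midpoint(g, a, b, n):
    """Composite midpoint rule for the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))


def inner_then_outer(n):
    """Original order: for each x, integrate e^{y^2} over y in [x, 1]."""
    return midpoint(lambda x: midpoint(lambda y: math.exp(y * y), x, 1.0, n), 0.0, 1.0, n)


def swapped(n):
    """Swapped order: the inner integral over x in [0, y] equals y e^{y^2}."""
    return midpoint(lambda y: y * math.exp(y * y), 0.0, 1.0, n)


exact = (math.e - 1.0) / 2.0
print(inner_then_outer(200), swapped(200), exact)
```

The original order has no elementary primitive for the inner integral, which is why the swap is the practical route by hand; numerically both orders agree.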
6.6 Improper integrals
Interesting integrals do not always satisfy the Riemann requirements: boundedness
of the function or boundedness of the domain. In such a case we have
an improper Riemann integral. The approach is simple. Assume that $\int_D f$ is
improper. If we can take an increasing sequence of bounded (Jordan) domains
$(D_n)$ such that $\bigcup_{n=1}^{\infty} D_n = D$ and $f|_{D_n}$ is bounded (and Riemann integrable,
of course), we may define
\[ \int_D f = \lim_n \int_{D_n} f \]
if the limit exists. That is not totally arbitrary: the way to produce the sequence
$(D_n)$ is standard: on $\mathbb{R}$ we take intervals and on $\mathbb{R}^2$ rectangles or circles,
depending on the geometry of the domain. If the problem is just a singularity
of the function, the domains are obtained by removing a neighbourhood of the
singularity, usually a Euclidean ball centred at the singularity.
Using integration on squares $[-S, S]^2$ we deduce that $J = I^2$. However, the
plane integral can be calculated through circles with the same result. Indeed,
the function is positive and every circle is contained in a square and vice versa.
With the help of polar coordinates we have
\[ I^2 = \lim_{R \to +\infty} \int_0^{2\pi} \int_0^R e^{-r^2} r\, dr\, d\theta = \lim_{R \to +\infty} 2\pi \left[ \frac{-1}{2} e^{-r^2} \right]_{r=0}^{R} = \pi \]
and therefore we get that $I = \sqrt{\pi}$.
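The comparison between squares and circles can be illustrated numerically. The sketch assumes that I denotes the integral of e^{-x^2} over the whole real line (consistent with the value √π obtained above); the function names and the midpoint rule are illustrative choices.

```python
import math


def gaussian_over_disc(R):
    """int_0^{2 pi} int_0^R e^{-r^2} r dr dtheta, evaluated in closed form."""
    return math.pi * (1.0 - math.exp(-R * R))


def gaussian_on_interval(S, n=20000):
    """Midpoint approximation of the integral of e^{-x^2} over [-S, S]."""
    h = 2.0 * S / n
    points = (-S + (k + 0.5) * h for k in range(n))
    return h * sum(math.exp(-(x * x)) for x in points)


for size in (1.0, 2.0, 4.0, 6.0):
    I_S = gaussian_on_interval(size)
    # The disc of radius S lies inside the square [-S, S]^2, which in turn
    # lies inside the disc of radius S * sqrt(2).
    print(size, gaussian_over_disc(size), I_S ** 2, gaussian_over_disc(size * math.sqrt(2)))
```

In each printed row the middle value (the integral over the square) is sandwiched between the two disc integrals, and all three approach π.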
6.8 Exercises
1. Prove that the null content sets in $\mathbb{R}^d$ are stable under finite unions and
closures.
where ⌊·⌋ is the integer part of a real number. Consider also the set
9. A function f : [a, b] → R is said to be a step function if there exists a partition
of [a, b] such that f is constant on the interior of each of the intervals
defined by the partition. A function is said to be ruled if it is a uniform limit
of step functions. Prove the following statements:
(a) Ruled functions have at most countably many discontinuities.
(b) Ruled functions are Riemann integrable on their domains.
(c) A function is ruled if and only if $\lim_{x \to c^+} f(x)$ exists at each c ∈ [a, b)
and $\lim_{x \to c^-} f(x)$ exists at each c ∈ (a, b].
10. Let f : [0, a] → [0, b] be a decreasing continuous bijection. Prove with
the help of a plane integral that
\[ \int_0^a f(x)\, dx = \int_0^b f^{-1}(x)\, dx. \]
Chapter 7
Change of Variables in
Integration
\[ m^*(f(A)) \le \lambda^n m^*(A) \]
Proof. The result is a consequence of three easy observations. Firstly, the
Lebesgue outer measure can be approximated by coverings of balls (actually
cubes) if the norm we are using is ∥·∥∞ (actually, thanks to a result of Vitali we
may use any norm for the same purpose). Indeed, the outer measure is defined
by coverings of generalized rectangles and those rectangles can be arbitrarily
approached by non-overlapping unions of cubes. The second observation is
that for any ball we have
As the notion of null measure does not depend on the norm we have.
Corollary 7.1.2. A locally Lipschitz map $f : D \subset \mathbb{R}^n \to \mathbb{R}^n$ carries Lebesgue
null sets to Lebesgue null sets.
Proof. The domain D can be decomposed in countably many domains where
the restriction of f is Lipschitz, so the previous theorem is applicable.
Now that the measurability of T(A) is not a problem, remember that the
Lebesgue measure is the unique nontrivial translation invariant Borel measure
on $\mathbb{R}^n$, up to a multiplicative positive constant. Therefore, for every linear map T
there is k(T) ≥ 0 such that
Obviously k(T) is the volume of the image of the unit cube under T.
Note also that the constant is multiplicative
Proof. Clearly $k(I) = k(I \cdot I) = k(I)^2$. Since k is not trivial we have k(I) = 1.
If T is invertible we have $k(T)k(T^{-1}) = k(I) = 1$. Therefore k(T) ≠ 0 and
$k(T^{-1}) = k(T)^{-1}$. We deduce that k takes the same value for similar matrices
Consider now the diagonal matrices $D_x$ having 1's on the diagonal except the
first entry which takes the value x ∈ R. Observe that
Now we can achieve the objective stated at the beginning of the section.
Theorem 7.1.4. Given a linear map T : Rn → Rn we have
inverse is continuous, there is 0 < δ < ε/2 such that ∥T − I∥ < δ implies
∥T −1 − I∥ < ε/2. We have ∥T ∥, ∥T −1 ∥ ≤ 2 and
The proof of the formula for the transformation of volumes through linear
maps can also be obtained by geometrical considerations which are especially
clear for $\mathbb{R}^2$: showing that a parallelogram is equivalent to a rectangle by
decomposing it into 2 pieces.
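In the plane the statement that a linear map scales areas by |det T| can be checked by hand or by machine. The following sketch computes the area of the image of the unit square with the shoelace formula and compares it with the determinant; the matrices `T` and `U` are hypothetical examples, and the second printed line illustrates the multiplicativity of the constant.

```python
def apply(T, p):
    """Apply the 2x2 matrix T (nested tuples) to the point p."""
    return (T[0][0] * p[0] + T[0][1] * p[1], T[1][0] * p[0] + T[1][1] * p[1])


def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    s = 0.0
    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:] + vertices[:1]):
        s += x0 * y1 - x1 * y0
    return abs(s) / 2.0


def det(T):
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]


def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
T = ((2.0, 1.0), (0.5, 3.0))    # hypothetical example matrices
U = ((1.0, -4.0), (2.0, 1.0))

image = [apply(T, p) for p in unit_square]
print(polygon_area(image), abs(det(T)))                     # area of T(square) vs |det T|
print(abs(det(matmul(T, U))), abs(det(T)) * abs(det(U)))    # multiplicativity
```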
T (x0 ) + dT (x0 )(B[0, (1 − η)δ]) ⊂ T (B[x0 , δ]) ⊂ T (x0 ) + dT (x0 )(B[0, (1 + η)δ]).
Theorem 7.2.2. Let $R \subset \mathbb{R}^n$ be a compact rectangle and let $T : R \to \mathbb{R}^n$ be
a one-to-one $C^1$ map with dT nonsingular on R. Then T(R) is Jordan
measurable and, for every Riemann integrable function f : T(R) → R, the function
f ◦ T is Riemann integrable on R and
\[ \int_{T(R)} f = \int_R f \circ T\, |\det(dT)| \]
where the determinant is computed for the matrix of dT with respect to the
canonical bases.
Proof. First of all, we may assume that R has nonempty interior. Otherwise
R would have measure 0, and so would its image T(R), so the result would be trivially true. We
may assume that f ≥ 0 as well. By Theorem 5.1.4 we know that the interior
of R is transformed into an open set by T, therefore the boundary of T(R)
is contained in T(∂R) which has null measure. That implies T(R) is Jordan
measurable.
Observe that $\|(dT)^{-1}\|$ is bounded on T(R) which implies that $T^{-1}$ is Lipschitz.
If D is the null measure set of discontinuities of f then $T^{-1}(D)$ is also null.
Since the set of discontinuities of f ◦ T is exactly $T^{-1}(D)$ we get that f ◦ T is
Riemann integrable.
We may set the norm of $\mathbb{R}^n$ to have the unit ball a translation of R. Take
0 < η < 1 and note that now R can be decomposed into $N^n$ non-overlapping
balls of radius 1/N. By the continuity of dT on a larger open set containing R
we may take N large enough to guarantee that the set containment of the
Lemma can be applied with such η to all the balls of radius 1/N. Let $x_k$ with
1 ≤ k ≤ 2N be the centres of the balls covering R and $B_k = B[x_k, \frac{1}{N}]$. We have
now
\[ T(x_k) + dT(x_k)\left( B\left[0, \frac{1 - \eta}{N}\right] \right) \subset T(B_k) \subset T(x_k) + dT(x_k)\left( B\left[0, \frac{1 + \eta}{N}\right] \right). \]
Having in mind that m(L(S)) = | det(L)| m(S) for any linear map L and any
compact rectangle S, we get
\[ (1 - \eta)^n m(B_k)\, |\det(dT(x_k))| \le m(T(B_k)) \le (1 + \eta)^n m(B_k)\, |\det(dT(x_k))|. \]
\[ \le \sum_{k=1}^{2N} f(T(x_k))\, m(T(B_k)) \le (1 + \eta)^n \sum_{k=1}^{2N} f(T(x_k))\, m(B_k)\, |\det(dT(x_k))| \]
The sums are of Riemann type, standard ones at the ends and associated to a
Jordan partition of T(R) in the middle, so letting N go to infinity we
get
\[ (1 - \eta)^n \int_R f \circ T\, |\det(dT)| \le \int_{T(R)} f \le (1 + \eta)^n \int_R f \circ T\, |\det(dT)| \]
Proof. The arguments employed above for the Jordan measurability of T(D)
and the Riemann integrability of f ◦ T can be adapted here with some small
changes. As before T(D) is open and the boundary of T(D) is included in
T(∂D). However, $T^{-1}$ is locally Lipschitz which implies that the set of
discontinuities of f ◦ T is null.
To prove the formula, cover ∂D with a finite union S of compact rectangles whose
volumes sum to less than ε. Then D \ S can be decomposed into a finite union of
non-overlapping rectangles. The previous theorem applied on each rectangle,
having in mind that the images by T of the rectangles are non-overlapping,
gives us
\[ \int_{T(D \setminus S)} f = \int_{D \setminus S} f \circ T\, |\det(dT)|. \]
If M is an upper bound for |f| we have that
\[ \left| \int_{T(D)} f - \int_{T(D \setminus S)} f \right| \le \int_{T(D \cap S)} |f| \le M\, m(S) \]
and
\[ \left| \int_D f \circ T - \int_{D \setminus S} f \circ T \right| \le \int_{D \cap S} |f \circ T| \le M \lambda^n m(S) \]
can be made arbitrarily small which leads to the desired equality.
Corollary 7.2.4. If D is an open Jordan domain and $T : D \to \mathbb{R}^n$ is a $C^1$
map such that T is one-to-one and dT is nonsingular on D, then T(D) is a
Jordan domain and
\[ m(T(D)) = \int_D |\det(dT)|. \]
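The volume formula of the corollary can be tested on a map whose image has a known area. The sketch below is an illustration under assumptions not taken from the text: it uses T(u, v) = (e^u cos v, e^u sin v) on D = (0, 1) × (0, π), which is one-to-one with Jacobian determinant e^{2u}, and whose image is the open upper half of the annulus between the radii 1 and e.

```python
import math


def jacobian_integral(n):
    """Midpoint approximation of the integral of |det(dT)| over
    D = (0, 1) x (0, pi) for T(u, v) = (e^u cos v, e^u sin v),
    whose Jacobian determinant is e^{2u}."""
    du = 1.0 / n
    dv = math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for _ in range(n):
            total += math.exp(2.0 * u) * du * dv  # |det dT(u, v)| does not depend on v
    return total


# T(D) is the open upper half of the annulus 1 < x^2 + y^2 < e^2.
half_annulus_area = math.pi * (math.e ** 2 - 1.0) / 2.0
print(jacobian_integral(200), half_annulus_area)
```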
Proof. First of all note that S is closed. The proof will be by induction on
the dimension n.
Suppose n = 1. It is enough to show that S ∩ (a, b) has null measure. Note that
in this case dT(x) is singular if and only if T′(x) = 0. Given ε > 0, consider
the set
U = {x ∈ (a, b) : |T′(x)| < ε}
Then S ∩ (a, b) ⊂ U and T is ε-Lipschitz on every interval composing U thanks
to the mean value theorem. Now apply Proposition 7.1.1 to obtain that
Obviously Z ⊂ S. If B is an arbitrary closed ball and ε > 0, then it is possible
to cover Z ∩ B with finitely many non-overlapping convex sets such that T
is ε-Lipschitz on each of them thanks to the mean value theorem (in several
variables). Reasoning as in the 1-dimensional case gives $m^*(T(Z \cap B)) \le \varepsilon\, m^*(B)$,
which implies $m^*(T(Z)) = 0$ since B and ε are arbitrary.
The objective now is to show that T(S \ Z) has null measure. Note that it
is enough to show that every $x_0 \in S \setminus Z$ has a neighbourhood U such that
T((S \ Z) ∩ U) is null. As $x_0 \in S \setminus Z$ there are i, j such that $\frac{\partial f_i}{\partial x_j}(x_0) \ne 0$.
Reordering the variables and the coordinate functions we may assume that
$\frac{\partial f_1}{\partial x_1}(x_0) \ne 0$. Consider the map $G(x) = (f_1(x), x_2, \dots, x_n)$ and note that $dG(x_0)$
is not singular. By the inverse mapping theorem there is a neighbourhood U of
$x_0$ and V of $G(x_0)$ such that G is a bijection from U onto V. The composition
$H = T \circ G^{-1}$ defined on V is of the form
\[ P_t = \{y = (y_1, \dots, y_n) \in \mathbb{R}^n : y_1 = t\}. \]
{t} × Ht (A ∩ Pt ) = Pt ∩ H(A)
is an (n − 1)-dimensional null set, and this is true for every t ∈ R. A well known
consequence of Fubini's theorem says that H(A) is an n-dimensional null set.
7.4 Brouwer fixed point theorem
A spectacular application of the change of variables formula is a simple proof
of the topological theorem about fixed points due to Brouwer.
Theorem 7.4.1. A continuous map from $B_{\mathbb{R}^n}$ into itself has a fixed point.
Throughout the section n ∈ N is fixed and we will write $B = B_{\mathbb{R}^n}$ and S = ∂B.
By standard techniques it is easy to prove the equivalence of the fixed point
property (FPP) for B with the nonexistence of a retraction of B onto S, that is,
a continuous map from B onto S that fixes the points of S. That can be done
not only in the category of continuous maps but also for $C^1$ maps, which will be important
for the proof.
Lemma 7.4.2. The FPP of B for C 1 maps implies the FPP of B for contin-
uous maps.
Proof of Theorem 7.4.1. Let P be a $C^1$ retraction of B onto S. We
will arrive at a contradiction after a witty construction. For every t ∈ [0, 1]
take
\[ P_t(x) = (1 - t)x + tP(x) \]
and note that $P_t$ is a $C^1$ map from B into itself that fixes S. We claim that
for t small enough $P_t$ is a homeomorphism onto its image. Indeed, let L be
the Lipschitz constant of P. Then
\[ \|P_t(x) - P_t(y)\| = \|(1 - t)(x - y) + t(P(x) - P(y))\| \]
\[ \ge (1 - t)\|x - y\| - t\|P(x) - P(y)\| \ge (1 - t)\|x - y\| - Lt\|x - y\| \]
\[ \ge (1 - (L + 1)t)\|x - y\|. \]
Therefore, taking $t < (L + 1)^{-1}$ the inverse of $P_t$ is defined and Lipschitz.
Moreover, for t small enough the map and its inverse are open. Indeed, that is
a consequence of the Inverse Mapping Theorem since $\det(dP_t)$ is nearly 1 for t close
to 0. That implies $P_t$ carries S one-to-one onto $\partial P_t(B)$. As $P_t$ fixes S we
deduce that $P_t(B) = B$ for t small enough.
On the other hand, note that $\det(dP_t)$ is a polynomial in t, and so is the function
\[ h(t) = \int_B \det(dP_t(x))\, dx \]
defined for t ∈ [0, 1]. As for t small enough $P_t$ is a diffeomorphism, the change
of variables formula (Theorem 7.2.3) says that h(t) = m(B) > 0. As h is a
polynomial, being constant on an interval implies being constant everywhere.
However, h(1) = 0 because $P_1 = P$ collapses B onto S. That is a contradiction.
Corollary 7.4.3. Any compact set that is homeomorphic to, or a retract of,
a Euclidean ball has the FPP.
\[ = \int \cdots \int_D (f \circ T)(u_1, \dots, u_n) \left| \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)} \right| du_1 \dots du_n \]
which is easy to remember.
where the equality can be justified by the monotone convergence theorem (ap-
plication of Riemann theory needs a more detailed analysis).
Let T be the triangle with vertices (0, 0), (1, 0) and (1, 1). By symmetry we
have
\[ \int_0^1 \int_0^1 \frac{dx\, dy}{1 - xy} = 2 \iint_T \frac{dx\, dy}{1 - xy} \]
Consider now the change of variables given by x = v + u, y = v − u where
(u, v) runs over the triangle D with vertices (0, 0), (1/2, 1/2) and (0, 1). As
the Jacobian is 2 we have
\[ \iint_T \frac{dx\, dy}{1 - xy} = 2 \iint_D \frac{du\, dv}{1 - v^2 + u^2} \]
\[ = \frac{1}{2} \arcsin(v)^2 \Big|_0^{1/2} = \frac{\pi^2}{72} \]
As to the second integral, we have after a first integration that
\[ \int_{1/2}^{1} \frac{1}{\sqrt{1 - v^2}} \arctan\left( \frac{1 - v}{\sqrt{1 - v^2}} \right) \]
Finally,
\[ \sum_{n=1}^{\infty} \frac{1}{n^2} = 4 \left( \frac{\pi^2}{72} + \frac{\pi^2}{36} \right) = \frac{\pi^2}{6} \]
as desired.
where Q stands for the first quadrant and the change of variables is
\[ B(p, q) = \frac{\Gamma(p)\Gamma(q)}{\Gamma(p + q)}. \]
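The identity between the Beta and Gamma functions is easy to test numerically. This sketch assumes the usual definition of B(p, q) as the integral of t^(p-1) (1-t)^(q-1) over [0, 1], which is not shown in the excerpt, and approximates it with a midpoint rule; `math.gamma` is the standard library Gamma function, and the sample exponents are arbitrary.

```python
import math


def beta_midpoint(p, q, n=20000):
    """Midpoint approximation of int_0^1 t^(p-1) (1-t)^(q-1) dt (assumed definition of B)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += t ** (p - 1.0) * (1.0 - t) ** (q - 1.0)
    return h * total


def beta_gamma(p, q):
    """Right-hand side of the identity: Gamma(p) Gamma(q) / Gamma(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)


for p, q in ((2.0, 3.0), (2.5, 1.5), (4.0, 4.0)):
    print(p, q, beta_midpoint(p, q), beta_gamma(p, q))
```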
7.5.3 Integrals of Dirichlet
The transformation of the pyramid
\[ \{(x_1, \dots, x_n) : x_1 \ge 0, \dots, x_n \ge 0,\ x_1 + \cdots + x_n \le 1\} \]
into a cube $[0, 1]^n$ can be performed with this change of variables
\[ \begin{aligned} x_1 + x_2 + \cdots + x_n &= u_1 \\ x_2 + \cdots + x_n &= u_1 u_2 \\ &\ \ \vdots \\ x_n &= u_1 u_2 \cdots u_n \end{aligned} \]
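It may help to solve the system above for the x-variables and to record its Jacobian. The following worked computation is a sketch; the auxiliary sums $s_k$ are notation introduced here and do not appear in the text.

```latex
% Subtracting consecutive equations of the system gives the inverse map:
\[
x_k = u_1 u_2 \cdots u_k \,(1 - u_{k+1}) \quad (1 \le k < n), \qquad
x_n = u_1 u_2 \cdots u_n .
\]
% Write s_k = x_k + \cdots + x_n = u_1 \cdots u_k. The map x \mapsto s is
% triangular with ones on the diagonal, and u \mapsto s is triangular with
% \partial s_k / \partial u_k = u_1 \cdots u_{k-1}, hence
\[
\left| \frac{\partial(x_1, \dots, x_n)}{\partial(u_1, \dots, u_n)} \right|
= \prod_{k=1}^{n} u_1 \cdots u_{k-1}
= u_1^{\,n-1} u_2^{\,n-2} \cdots u_{n-1} .
\]
```

For n = 2 this reads $x_1 = u_1(1 - u_2)$, $x_2 = u_1 u_2$ with Jacobian $u_1$, which is nonzero in the interior of the square.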
\[ D = \{(x_1, \dots, x_n) : x_1 \ge 0, \dots, x_n \ge 0,\ x_1 + \cdots + x_n \le 1\}. \]
Then
\[ \int \cdots \int_D x_1^{p_1 - 1} x_2^{p_2 - 1} \cdots x_n^{p_n - 1} (1 - x_1 - \cdots - x_n)^{p_0 - 1}\, dx_1\, dx_2 \dots dx_n \]
The theorem of Brouwer is usually proved in Topology books with discreti-
sation and combinatorics. The proof using the change of variables is due to
Milnor and Rogers.
7.7 Exercises
1. Find the volume of the body limited by the sphere $x^2 + y^2 + z^2 = 1$ and
the cylinder $x^2 + y^2 = 2x$.
2. Let $D = \{(x, y) : x^2 + y^2 \le 1\}$, and calculate
\[ \iint_D \sqrt{1 + x^2 + y^2}\, dx\, dy. \]
Chapter 8
8.1 Motivation
A moment's reflection on how the notion of area for polygons is treated in
elementary texts shows that the existence of the area and its additive property
are mostly assumed a priori, so the actual task is to compute the areas
of progressively more complicated polygons. In a second stage the area can be
extended to some non-polygons, such as the circle, assuming monotonicity. Well, it
is possible to provide a sound basis for the elementary method: define the area
for triangles, show that it is independent of the position of the triangle, prove
that it is additive within the triangles, and finally extend it to polygons by
decompositions into triangles. . . However, knowing in advance the scope of this
method, we could opt for an easier approach which will lead to the same results.
We will say that a set is elemental if it is a union of finitely many rectangles.
The measure is extended additively to elemental sets using non-overlapping
decompositions into rectangles. The reason
why an elemental set can be reduced to a finite union of non-overlapping
rectangles lies in the fact that the difference of two rectangles is a finite union of
rectangles. Checking that the definition of m does not depend on how the
decomposition is chosen and the additivity of m with respect to finite disjoint
(or non-overlapping) unions offers no challenge.
The method just sketched above, namely the Jordan theory of measure (see
Chapter 7), is somehow related to the Riemann integral. It serves well at an elementary
level but it has many limitations. For instance, sets as simple as the
rational numbers between 0 and 1 are not measurable. Moreover, the approximation
of the circle area from within using polygons can be understood as a
limit process which implies a decomposition of the circle into countably many
rectangles. The measure m should be countably additive as the geometric interpretation
deserves, however it cannot. Otherwise, the set of rationals would
have measure 0, since it is a countable union of points.
8.2 Measures
We need a family of sets where we can perform all the required operations
with a countably additive function. Motivated by the previous section we will
introduce algebras and σ-algebras. An algebra of subsets of a set Ω is a family
$\mathcal{A} \subset \mathcal{P}(\Omega)$ which satisfies
1. $\emptyset, \Omega \in \mathcal{A}$;
2. $A \in \mathcal{A}$ implies $A^c \in \mathcal{A}$;
3. $\bigcup_{k=1}^{n} A_k \in \mathcal{A}$ whenever $A_1, \dots, A_n \in \mathcal{A}$.
3'. $\bigcup_{n=1}^{\infty} A_n \in \Sigma$ whenever $(A_n)_{n=1}^{\infty} \subset \Sigma$.
As we will see, algebras and measures on them appear naturally; however, the
theory works more nicely with σ-additivity on σ-algebras. On the other hand,
building nontrivial σ-additive measures is a delicate task that we will face in later
sections. Here we will study some properties of systems of sets and measures
provided they are given.
Note that all the σ-algebras that one can define on a nonempty set Ω lie between
the smallest one {∅, Ω} and the biggest one P(Ω). Since the intersection
of σ-algebras is again a σ-algebra, given F ⊂ P(Ω) there is a smallest σ-algebra
containing F, called the σ-algebra generated by F and denoted σ(F). This
σ-algebra can actually be built explicitly from F using transfinite induction.
Among the σ-algebras generated by families of sets in a topological space we
will consider the Borel σ-algebra, which is generated by the open (equivalently, closed)
sets, and the Baire σ-algebra, which is the smallest one making the continuous
functions measurable. Borel and Baire sets coincide for a metrizable space but
they are different in general.
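When Ω is finite, the σ-algebra generated by a family can be computed by brute force, with no need for transfinite induction. The sketch below closes a family under complements and pairwise unions until nothing new appears; the function name and the sample family are hypothetical.

```python
from itertools import combinations


def generated_sigma_algebra(omega, family):
    """Smallest family of subsets of the finite set omega that contains
    `family`, the empty set and omega, and is closed under complements and
    unions (finite unions suffice because omega is finite)."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(a) for a in family}
    changed = True
    while changed:
        changed = False
        current = list(sets)
        for a in current:
            if omega - a not in sets:
                sets.add(omega - a)
                changed = True
        for a, b in combinations(current, 2):
            if a | b not in sets:
                sets.add(a | b)
                changed = True
    return sets


sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
print(sorted(sorted(s) for s in sigma))
```

In this example the generated σ-algebra has the three atoms {1}, {2} and {3, 4}, hence 2^3 = 8 members; the points 3 and 4 are never separated.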
1. If A, B ∈ Σ, A ⊂ B and µ(A) < +∞ then µ(B \ A) = µ(B) − µ(A).
\[ \sum_{i} \mu(A_i) - \sum_{i \ne j} \mu(A_i \cap A_j) + \sum_{\#\{i,j,k\}=3} \mu(A_i \cap A_j \cap A_k) - \cdots \pm \mu\Big( \bigcap_{i=1}^{n} A_i \Big). \]
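The alternating sum can be checked with the counting measure on a finite set. This is a minimal sketch assuming the formula expresses the measure of the union and that each unordered subfamily is counted once; the sets in `A` are hypothetical.

```python
from itertools import combinations


def inclusion_exclusion(sets, mu=len):
    """Alternating sum, over all nonempty subfamilies of `sets`, of mu applied
    to their intersection (each unordered subfamily counted once)."""
    total = 0
    for size in range(1, len(sets) + 1):
        sign = 1 if size % 2 == 1 else -1
        for family in combinations(sets, size):
            total += sign * mu(set.intersection(*family))
    return total


A = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]   # hypothetical finite sets
print(inclusion_exclusion(A), len(set.union(*A)))
```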
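Proof. In the first case define sets $B_n = A_n \setminus \bigcup_{k<n} A_k$ and note $A_n = \bigcup_{k=1}^{n} B_k$,
the last union being disjoint. Therefore
\[ \lim_n \mu(A_n) = \lim_n \sum_{k=1}^{n} \mu(B_k) = \sum_{k=1}^{\infty} \mu(B_k) = \mu\Big( \bigcup_{k=1}^{\infty} B_k \Big) = \mu\Big( \bigcup_{k=1}^{\infty} A_k \Big). \]
The second case is a consequence of the first when applied to the sets $C_n = A_1 \setminus A_n$.
Indeed, we have
\[ \mu\Big( \bigcap_{n=1}^{\infty} A_n \Big) = \mu\Big( A_1 \setminus \bigcup_{n=1}^{\infty} C_n \Big) = \mu(A_1) - \mu\Big( \bigcup_{n=1}^{\infty} C_n \Big) \]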
be decomposed into sets of smaller positive measure. Given a measure space
(Ω, Σ, µ) a set A ∈ Σ is called an atom if 0 ∈ {µ(B), µ(A \ B)} whenever
B ∈ Σ with B ⊂ A. We say that two atoms A, B ∈ Σ are equivalent if
µ((A \ B) ∪ (B \ A)) = 0. Sometimes we may require to work with finite
measure sets in order to have a result and then, in a second step, to extend the
result to countable unions of those sets. We say that a measure space (Ω, Σ, µ)
is σ-finite if there exists $(A_n) \subset \Sigma$ with $\mu(A_n) < +\infty$ such that $\Omega = \bigcup_{n=1}^{\infty} A_n$.
Note that our example (Γ, P(Γ), µ) is σ-finite if and only if $\{\gamma : a_\gamma = +\infty\} = \emptyset$
and $\{\gamma : a_\gamma \ne 0\}$ is countable. In particular, the cardinal measure on Γ is
σ-finite if and only if Γ is countable. With the previous definitions we can prove
the following results.
Proposition 8.2.3. A σ-finite measure space (Ω, Σ, µ) has at most countably many
non-equivalent atoms, whose union is called the atomic part, and its
complement is called the atom-free part in case it has positive measure.
Proof. In this case the atoms must have finite measure. A maximal set of
non-equivalent atoms is necessarily at most countable, again by the σ-finiteness.
Indeed, if Ω is decomposed into disjoint parts with finite measure $(\Omega_n)$ and A
is an atom then there is only one n such that $\mu(A \setminus \Omega_n) = 0$, that is $A \subset \Omega_n$
except for a null set. That enforces that given m ∈ N only finitely many
atoms essentially contained in $\Omega_n$ have measure greater than 1/m.
Theorem 8.2.4. Let (Ω, Σ, µ) be an atom-free measure space. Then
{µ(A) : A ∈ Σ} = [0, µ(Ω)].
Proof. The application of Zorn's lemma allows us to find a maximal family
$\mathcal{A}$ of subsets from Σ which is totally ordered and such that $\mu|_{\mathcal{A}}$ is injective. For every
0 < t < µ(Ω) the sets
\[ A_t = \bigcup \{A \in \mathcal{A} : \mu(A) \le t\} \quad \text{and} \quad A^t = \bigcap \{A \in \mathcal{A} : \mu(A) \ge t\} \]
belong to Σ because they equal a countable union and a countable intersection
respectively. Evidently $\mu(A_t) \le t \le \mu(A^t)$. We claim that $\mu(A_t) = \mu(A^t)$.
Otherwise $\mu(A^t \setminus A_t) > 0$ and there is $E \subset A^t \setminus A_t$ such that
$0 < \mu(E) < \mu(A^t \setminus A_t)$. The set $A_t \cup E$ can be added to $\mathcal{A}$ preserving the total ordering and
the injectivity of µ, and therefore violating the maximal property. Now we
have $\mu(A_t) = t$ (actually $A_t = A^t$).
8.3 Construction of measures
The notion of outer measure plays an essential role here. Let Ω be a nonempty
set. A function $\mu^* : \mathcal{P}(\Omega) \to [0, +\infty]$ is called an outer measure if it satisfies:
1. $\mu^*(\emptyset) = 0$;
2. $\mu^*(A) \le \mu^*(B)$ if A ⊂ B;
3. $\mu^*\left( \bigcup_{n=1}^{\infty} A_n \right) \le \sum_{n=1}^{\infty} \mu^*(A_n)$.
Obviously an outer measure need not be a measure in the sense of the previous
section. The idea is that outer measures are easier to define and we will show
that an outer measure behaves as a measure on a “rich” σ-algebra.
It is not very difficult to check that the following function for sets of $\mathbb{R}^n$ is
an outer measure
\[ m^*(A) = \inf\Big\{ \sum_{n=1}^{\infty} m(R_n) : (R_n)_{n=1}^{\infty} \text{ rectangles},\ A \subset \bigcup_{n=1}^{\infty} R_n \Big\}, \]
µ∗ (B) = µ∗ (B ∩ A) + µ∗ (B \ A)
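The measurability condition above can be explored on a tiny example. The following sketch uses a hypothetical outer measure on a three-point set (value 0 on the empty set, 2 on the whole set and 1 on every other subset), which is monotone and subadditive but not additive, and lists the sets that satisfy the condition for every test set B.

```python
from itertools import combinations

OMEGA = frozenset({0, 1, 2})


def subsets(s):
    """All subsets of s, ordered by size."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]


def outer(a):
    """A hypothetical outer measure on OMEGA: 0 on the empty set, 2 on OMEGA,
    and 1 on every other subset. It is monotone and subadditive."""
    if not a:
        return 0
    return 2 if a == OMEGA else 1


def is_measurable(a):
    """Caratheodory condition: outer(B) = outer(B & A) + outer(B - A) for all B."""
    return all(outer(b) == outer(b & a) + outer(b - a) for b in subsets(OMEGA))


print([sorted(a) for a in subsets(OMEGA) if is_measurable(a)])
```

Here only the empty set and the whole space pass the test: for instance, splitting B = {0, 1} with A = {0} gives 1 on the left and 1 + 1 on the right. The σ-algebra of measurable sets can therefore be very small when the outer measure is far from additive.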
Proof. Denote by Σ the family of measurable sets. Clearly ∅, Ω ∈ Σ and
A ∈ Σ if and only if $A^c \in \Sigma$. As to the union of sets, we begin with
the union of two sets: assume $A_1, A_2 \in \Sigma$ and B ⊂ Ω is arbitrary. The
measurability of $A_1$ witnessed by $B \cap (A_1 \cup A_2)$ gives
\[ \mu^*(B \cap (A_1 \cup A_2)) = \mu^*(B \cap (A_1 \cup A_2) \cap A_1) + \mu^*(B \cap (A_1 \cup A_2) \cap A_1^c) \]
\[ = \mu^*(B \cap A_1) + \mu^*(B \cap A_2 \cap A_1^c). \]
And the measurability of $A_2$ witnessed by $B \cap A_1^c$ gives
\[ \mu^*(B \cap A_1^c \cap A_2) + \mu^*(B \cap A_1^c \cap A_2^c) = \mu^*(B \cap A_1^c). \]
Now we have
\[ \mu^*(B \cap (A_1 \cup A_2)) + \mu^*(B \cap (A_1 \cup A_2)^c) = \mu^*(B \cap A_1) + \mu^*(B \cap A_2 \cap A_1^c) + \mu^*(B \cap A_1^c \cap A_2^c) \]
\[ = \mu^*(B \cap A_1) + \mu^*(B \cap A_1^c) = \mu^*(B) \]
which implies the measurability of $A_1 \cup A_2$.
Clearly, that implies that Σ is closed under finite unions of sets, and so it is closed
under finite intersections and differences via complements. Therefore, in order to
show that Σ is closed under countable unions it is enough to consider sequences
of disjoint sets $(A_n) \subset \Sigma$. Firstly we will prove by induction the following
formula
\[ \mu^*(B) = \sum_{k=1}^{n} \mu^*(B \cap A_k) + \mu^*\Big( B \cap \bigcap_{k=1}^{n} A_k^c \Big). \]
Indeed, for n = 1 it is just the measurability of $A_1$. Now, the measurability of
$A_n$ implies
\[ \mu^*\Big( B \cap \bigcap_{k=1}^{n-1} A_k^c \Big) = \mu^*\Big( B \cap \bigcap_{k=1}^{n-1} A_k^c \cap A_n \Big) + \mu^*\Big( B \cap \bigcap_{k=1}^{n} A_k^c \Big) \]
\[ = \mu^*(B \cap A_n) + \mu^*\Big( B \cap \bigcap_{k=1}^{n} A_k^c \Big). \]
If we assume the formula is true for n − 1, the last equality added will imply
the formula is true for n.
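The formula easily implies
\[ \mu^*(B) \ge \sum_{k=1}^{\infty} \mu^*(B \cap A_k) + \mu^*\Big( B \cap \bigcap_{k=1}^{\infty} A_k^c \Big) \ge \mu^*\Big( B \cap \bigcup_{k=1}^{\infty} A_k \Big) + \mu^*\Big( B \cap \Big( \bigcup_{k=1}^{\infty} A_k \Big)^c \Big) \ge \mu^*(B) \]
which gives both the measurability of $\bigcup_{k=1}^{\infty} A_k$ and the σ-additivity of $\mu^*|_\Sigma$.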
for any A ⊂ Ω and let Σ be the σ-algebra of µ∗ -measurable sets. Then we have
A ⊂ Σ and µ∗ |A = µ.
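Proof. Firstly we will show the measurability of any $A \in \mathcal{A}$. Let B ⊂ Ω be
arbitrary. For any cover $(A_n) \subset \mathcal{A}$ of B we have $B \cap A \subset \bigcup_{n=1}^{\infty} (A_n \cap A)$ and
$B \setminus A \subset \bigcup_{n=1}^{\infty} (A_n \setminus A)$, both covers being made of sets from $\mathcal{A}$. We have
\[ \mu^*(B \cap A) + \mu^*(B \setminus A) \le \sum_{n=1}^{\infty} \mu(A_n \cap A) + \sum_{n=1}^{\infty} \mu(A_n \setminus A) = \sum_{n=1}^{\infty} \mu(A_n) \]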
Proposition 8.3.3. Let $\mathcal{R} \subset \mathcal{P}(\Omega)$ be a class of sets and let Σ be the σ-algebra
that it generates. Suppose that:
1. There is a function $\mu : \mathcal{R} \to [0, +\infty]$ which is σ-additive (on $\mathcal{R}$);
2. The formula $\mu(A) = \sum_{i=1}^{n} \mu(R_i)$ using the disjoint decomposition above
extends unambiguously the measure µ to all $\mathcal{A}$.
3. µ is σ-additive on $\mathcal{A}$.
Now we can apply Theorem 8.3.2 in order to finish the proof.
The family R plays the role of the rectangles in the construction of measures
on Rn and that is the reason for the choice of the name rectangular, obviously.
At this point we can resume the construction of the Lebesgue measure. Recall
that we have a finitely additive measure m defined on the algebra generated
by the rectangles and an exterior measure $m^*$ built from countable covers with
rectangles. We can use Theorem 8.3.2 to show that we recover m from $m^*$,
and according to the reduction to rectangular families, we only have to show
that m is σ-additive within the rectangles.
Proof. If one of the dimensions of R collapses to 0 then all the measures are 0
and so the equality holds, so we may assume that all the sides of R have length
greater than 0. The case where R has an infinite edge, so that m(R) = +∞, can be reduced
to the bounded case by intersecting with ∥ · ∥∞-balls centred at the origin. Assume
then R is bounded and closed (the faces have d-dimensional measure 0). Fix
ε > 0 and for every n ∈ N take $B_n$ an open ∥ · ∥∞-ball centred at the origin
such that
\[ m(R_n + B_n) < m(R_n) + 2^{-n} \varepsilon \]
which is possible by the continuous dependence of the measure on the lengths
of the edges. Since the enlarged rectangles $R_n + B_n$ are open and cover R, by compactness there
are finitely many such that $R \subset \bigcup_{k=1}^{n} (R_k + B_k)$. We deduce that
\[ m(R) \le \sum_{k=1}^{n} m(R_k) + \varepsilon \le \sum_{k=1}^{\infty} m(R_k) + \varepsilon \]
measure (even finitely additive) defined on (Ω, Σ) the number
\[ \int s\, d\mu = \sum_{k=1}^{n} a_k\, \mu(A_k) \]
does not depend on the particular expression of s. This is a tedious but
elementary verification based on the algebra structure of Σ and the additivity of
µ. Note that the integral $\int d\mu$ defines a linear operator on the space of simple
functions S that can be naturally extended to any closure of S with respect
to a topology which makes $\int d\mu$ continuous. For instance, the topology of
uniform convergence in case of µ(Ω) < +∞ would do the work. However this
is not the way, and the theory is more powerful if the extension of the integral
is done by monotonicity and the set of integrable functions can be described
in an easier way.
Proposition 8.4.2. Let (Ω, Σ) be a measurable space and let M denote the set
of measurable real functions defined on it and let $M_\infty$ denote the set of measurable
functions valued into $\overline{\mathbb{R}} = [-\infty, +\infty]$ (also with the Borel σ-algebra).
Then:
1. If $f_1, \dots, f_n \in M$ (or $M_\infty$) then $(f_1, \dots, f_n) : \Omega \to \mathbb{R}^n$ (or $\overline{\mathbb{R}}^n$) is
measurable for the Borel σ-algebra on $\mathbb{R}^n$ (resp. $\overline{\mathbb{R}}^n$).
2. M is an algebra, $M_\infty$ is stable under inverses, M and $M_\infty$ are lattices.
3. $M_\infty$ is stable under supremums and infimums of countable sets.
4. $M_\infty$ is stable under lim inf and lim sup of sequences, and thus it is also
stable under limits of pointwise convergent sequences.
5. If f ∈ M is bounded then it can be uniformly approximated by simple
functions.
6. If $f \in M_\infty$ and f ≥ 0 there is an increasing sequence of simple functions
\[ 0 \le s_1 \le s_2 \le \cdots \le f \]
which converges pointwise to f.
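The increasing sequence in item 6 is commonly produced by dyadic truncation. The sketch below implements that standard construction, s_n = min(n, floor(2^n f) / 2^n); it is an assumption that this matches the construction intended in the proof, and the test function is a hypothetical choice.

```python
import math


def simple_approx(f, n):
    """Return s_n = min(n, floor(2^n f) / 2^n). For measurable f >= 0 (possibly
    taking the value +infinity) each s_n takes finitely many values."""
    def s(x):
        value = f(x)
        if value >= n:          # also covers value == +infinity
            return float(n)
        return math.floor(value * 2 ** n) / 2 ** n
    return s


def f(x):
    return x * x                # hypothetical test function on [0, 2]


points = [k / 10 for k in range(21)]
for n in (1, 2, 4, 8):
    s_n = simple_approx(f, n)
    print(n, max(f(x) - s_n(x) for x in points))
```

Once n exceeds the values of f, the gap f − s_n is below 2^(-n) everywhere, which also gives the uniform approximation of bounded functions stated in item 5.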
Proof. (1) In both cases the topologies are generated by rectangles. Note that
\[ (f_1, \dots, f_n)^{-1}([a_1, b_1] \times \cdots \times [a_n, b_n]) = \bigcap_{k=1}^{n} f_k^{-1}([a_k, b_k]) \in \Sigma. \]
8.5 Integration
Now we are ready to define the integral for positive functions. Let (Ω, Σ, µ)
be a measure space. Recall that the integral was already defined for simple
functions in case that µ(Ω) < +∞. If we limit ourselves to positive simple
functions $s = \sum_{k=1}^{n} a_k \chi_{A_k}$ with $a_k \ge 0$ we may remove the finiteness hypothesis
and the formula
\[ \int s\, d\mu = \sum_{k=1}^{n} a_k\, \mu(A_k) \]
will make sense in [0, +∞]. It is easy to see that also in this case the value
does not depend on the particular (positive) representation of s.
We define the integral of f : Ω → [0, +∞] with respect to µ as the value in
[0, +∞] given by
\[ \int f\, d\mu = \sup\Big\{ \int s\, d\mu : 0 \le s \le f,\ s \in S \Big\} \]
and $\int_A f\, d\mu = \int \chi_A f\, d\mu$ for A ∈ Σ. Note that the computation of $\int s\, d\mu$
could need operations involving +∞; however, the limitation to positive values
spares us possible troubles. The very definition implies these almost obvious
properties that we will need later.
Proposition 8.5.1. Under the notation and assumptions above we have:
1. if 0 ≤ f ≤ g are measurable then $\int f\, d\mu \le \int g\, d\mu$;
2. if A, B ∈ Σ, A ⊂ B and f is measurable then $\int_A f\, d\mu \le \int_B f\, d\mu$;
3. if f ≥ 0 is measurable and λ ≥ 0 a real number then $\int \lambda f\, d\mu = \lambda \int f\, d\mu$.
Theorem 8.5.2 (Monotone convergence theorem). Let (Ω, Σ, µ) be a measure
space and let $0 \le f_1 \le f_2 \le \cdots \le f$ be a sequence of measurable functions defined
on Ω with values in [0, +∞] that converges pointwise to f. Then
\[ \lim_n \int f_n\, d\mu = \int f\, d\mu. \]
Proof. The limit on the left-hand side exists in [0, +∞] by monotonicity and
the following inequality is obvious:
\[ \lim_n \int f_n\, d\mu \le \int f\, d\mu. \]
For the converse, fix a simple function s ≤ f and a number λ ∈ (0, 1). Note
that the sequence of measurable sets
\[ A_n = \{x \in \Omega : f_n(x) \ge \lambda s(x)\} \]
is increasing and $\bigcup_{n=1}^{\infty} A_n = \Omega$. We have
\[ \int f_n\, d\mu \ge \int_{A_n} f_n\, d\mu \ge \lambda \int_{A_n} s\, d\mu. \]
Note that $\nu(A) = \int_A s\, d\mu$ defines a positive measure on Σ, so taking limits we
have
\[ \lim_n \int f_n\, d\mu \ge \lambda \lim_n \nu(A_n) = \lambda\, \nu(\Omega) = \lambda \int s\, d\mu. \]
Since λ < 1 is arbitrary, taking into account the definition of the integral we
get
\[ \lim_n \int f_n\, d\mu \ge \int f\, d\mu \]
as wished.
Corollary 8.5.3. If f, g : Ω → [0, +∞] are measurable then
\[ \int (f + g)\, d\mu = \int f\, d\mu + \int g\, d\mu. \]
applying the monotone convergence theorem and having in mind that the
additivity of the integral was established for simple functions.
Corollary 8.5.4. Let (Ω, Σ, µ) be a measure space and let $(f_n)$ be a sequence
of measurable functions valued in [0, +∞]. Then
\[ \int \Big( \sum_{n=1}^{\infty} f_n \Big)\, d\mu = \sum_{n=1}^{\infty} \int f_n\, d\mu. \]
Proof. Just apply the monotone convergence theorem to the increasing sequence
of functions $g_n = \sum_{k=1}^{n} f_k$ whose limit is $\sum_{k=1}^{\infty} f_k$.
Proposition 8.5.5 (Fatou's lemma). Let $(f_n)$ be a sequence of nonnegative
measurable functions. Then
\[ \int \liminf_n f_n\, d\mu \le \liminf_n \int f_n\, d\mu. \]
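The inequality in Fatou's lemma can be strict. A standard illustration, not taken from the text and assuming Lebesgue measure m on [0, 1], is the following:

```latex
\[
f_n = n\, \chi_{(0, 1/n)}, \qquad
\liminf_n f_n(x) = 0 \ \text{ for every } x \in [0, 1], \qquad
\int f_n\, dm = n \cdot \frac{1}{n} = 1,
\]
\[
\int \liminf_n f_n\, dm = 0 \;<\; 1 = \liminf_n \int f_n\, dm .
\]
```

The mass of $f_n$ concentrates on a shrinking interval and is lost in the pointwise limit; no integrable function dominates the whole sequence, since its pointwise supremum is comparable to 1/x near 0.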
Now we are ready to extend the notion of integral to functions that are not necessarily positive.
We say that f : Ω → [−∞, +∞] is integrable if it is measurable and $\int |f|\, d\mu < +\infty$.
In such a case we define the integral of f as the real number
\[ \int f\, d\mu = \int f^+\, d\mu - \int f^-\, d\mu. \]
We will also consider the integrals over sets $\int_A f\, d\mu := \int \chi_A f\, d\mu$. The following
properties are not a surprise.
Proposition 8.5.6. Let $L^1(\mu)$ denote the set of integrable functions defined
on the measure space (Ω, Σ, µ). Then
1. $L^1(\mu)$ is a vector lattice;
2. the integral is a linear functional on $L^1(\mu)$;
3. $\left| \int f\, d\mu \right| \le \int |f|\, d\mu$.
The following result is the key to the versatility of the Lebesgue integral.
Theorem 8.5.7 (Dominated convergence theorem). Let $(f_n) \subset L^1(\mu)$ be a sequence which converges pointwise to f. Assume that there is $g \in L^1(\mu)$ such that $|f_n| \le g$ for all n ∈ N. Then $f \in L^1(\mu)$ and
\[ \lim_n \int f_n \, d\mu = \int f \, d\mu \quad \text{and} \quad \lim_n \int |f_n - f| \, d\mu = 0. \]
Proof. The integrability of f is clear from the inequality |f| ≤ g. Applying Fatou's lemma to the nonnegative sequence $(2g - |f_n - f|)$ we get
\[ \int 2g \, d\mu = \int \lim_n (2g - |f_n - f|) \, d\mu \le \liminf_n \int (2g - |f_n - f|) \, d\mu = \int 2g \, d\mu - \limsup_n \int |f_n - f| \, d\mu. \]
We deduce $\limsup_n \int |f_n - f| \, d\mu = 0$ and thus $\lim_n \int |f_n - f| \, d\mu = 0$, which easily implies the other part of the statement.
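The role of the dominating function g can be seen numerically. In the sketch below (an illustration with assumed examples, not taken from the text) both sequences converge to 0 at almost every point of [0, 1]: x^n is dominated by the constant 1 and its integrals 1/(n+1) tend to 0, whereas n·x·exp(−n·x²) admits no integrable bound near 0 and its integrals (1 − e^{−n})/2 tend to 1/2. The helper `midpoint` is a plain midpoint rule.

```python
import math


def midpoint(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def dominated_integral(n):
    # f_n(x) = x**n, with |f_n| <= 1 on [0, 1]
    return midpoint(lambda x: x ** n, 0.0, 1.0)


def undominated_integral(n):
    # f_n(x) = n x exp(-n x^2): pointwise limit 0, but sup_n f_n is not integrable
    return midpoint(lambda x: n * x * math.exp(-n * x * x), 0.0, 1.0)


for n in (1, 10, 50):
    print(n, dominated_integral(n), undominated_integral(n))
```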
That implies that f coincides with $\underline{f}$ and $\overline{f}$ almost everywhere, which in turn implies the Lebesgue measurability of f and the coincidence of the Riemann and Lebesgue integrals.
Unfortunately there are some important integrals which are not covered by the Lebesgue theory. For instance, the following one exists in the improper Riemann sense but not in the Lebesgue sense:
\[ \int_0^{+\infty} \frac{\sin x}{x} \, dx. \]
The convergence theorems cast some light on the following question: when can we interchange differentiation and integration? That is, when is the following formula true?
\[ \frac{\partial}{\partial y} \int f(x, y) \, d\mu(x) = \int \frac{\partial f}{\partial y}(x, y) \, d\mu(x). \]
If we express the derivative by its very definition at $y_0$ we have
\[ \frac{\partial}{\partial y}\Big|_{y=y_0} \int f(x, y) \, d\mu(x) = \lim_{h \to 0} \int h^{-1} (f(x, y_0 + h) - f(x, y_0)) \, d\mu(x) \]
and this last limit can be written sequentially, taking $h = h_n$ with $\lim_n h_n = 0$, for instance. Thus the question is reduced to computing the limit
\[ \lim_n \int h_n^{-1} (f(x, y_0 + h_n) - f(x, y_0)) \, d\mu(x) = \lim_n \int \frac{\partial f}{\partial y}(x, y_0 + \theta(x, n)) \, d\mu(x) \]
where $|\theta(x, n)| < |h_n|$ is given by the mean value theorem. If the family of functions $\{\frac{\partial f}{\partial y}(x, y) : y\}$ were dominated by a positive integrable function for y in a neighbourhood of $y_0$ we could apply the dominated convergence theorem.
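A quick numerical check of this interchange, under assumptions chosen only for illustration: take µ to be Lebesgue measure on [0, 1] and f(x, y) = cos(xy), whose partial derivative −x·sin(xy) is dominated by the constant 1. Then F(y) = sin(y)/y, and the derivative of F can be compared with the integral of the partial derivative. The function names are hypothetical.

```python
import math


def midpoint(f, a, b, steps=20_000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def F(y):
    # F(y) = integral over [0, 1] of cos(x y) dx
    return midpoint(lambda x: math.cos(x * y), 0.0, 1.0)


def derivative_under_integral(y):
    # integral over [0, 1] of the partial derivative -x sin(x y)
    return midpoint(lambda x: -x * math.sin(x * y), 0.0, 1.0)


def difference_quotient(y, h=1e-4):
    # symmetric difference quotient of F at y
    return (F(y + h) - F(y - h)) / (2 * h)


y0 = 2.0
exact = (y0 * math.cos(y0) - math.sin(y0)) / y0 ** 2   # derivative of sin(y)/y
print(derivative_under_integral(y0), difference_quotient(y0), exact)
```

The three printed numbers should agree to several decimal places.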
The analysis for interesting integrals is sometimes trickier. Let us go back to the improper Riemann non-Lebesgue integral above.
while the last one remains bounded by 2/(πn) for n odd. Taking limits in n, we get that
\[ \int_0^{+\infty} \frac{\sin x}{x} \, dx = \frac{\pi}{2}. \]
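The two behaviours of this integral can be observed numerically. The sketch below (an illustration, with the hypothetical helper `partial_integral` based on a midpoint rule) computes the integrals of sin(x)/x and of |sin(x)|/x over [0, Nπ]: the first approaches π/2, while the second keeps growing, roughly like (2/π)·log N, which is why the function is not Lebesgue integrable.

```python
import math


def partial_integral(upper, absolute=False, steps_per_unit=200):
    """Midpoint-rule approximation of the integral over [0, upper] of
    sin(x)/x, or of |sin(x)|/x when absolute=True."""
    steps = int(upper * steps_per_unit)
    h = upper / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        value = math.sin(x) / x
        total += abs(value) if absolute else value
    return total * h


for N in (10, 100, 200):
    upper = N * math.pi
    print(N, partial_integral(upper), partial_integral(upper, absolute=True))
print("pi/2 =", math.pi / 2)
```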
Proposition 8.6.1. The set of simple functions on finite measure sets is dense
in (L1 (µ), ∥ · ∥1 ).
Proof. Given $f \in L^1(\mu)$ the sequence $f_n = \min\{n, \max\{f, -n\}\}$ is dominated by |f| and converges to f in ∥ · ∥1, so we may assume f is bounded. Now consider the sequence $(f_n)$ where $f_n(x) = f(x)$ if |f(x)| ≥ 1/n and $f_n(x) = 0$ otherwise. This sequence is also dominated by |f| and converges to f pointwise, and so it does for the seminorm ∥ · ∥1. The functions $f_n$ have supports of finite measure, so they can be uniformly approximated by simple functions also with supports of finite measure.
The previous result makes clear that the approximation of integrable functions by others reduces to the approximation of simple functions, and thus to the approximation of characteristic functions. Define a pseudometric on Σ by $d_\mu(A, B) = \mu(A \Delta B)$ where A∆B = (A \ B) ∪ (B \ A) is the symmetric difference. Note that $d_\mu$ is actually the restriction of the seminorm ∥ · ∥1 through characteristic functions: $d_\mu(A, B) = \|\chi_A - \chi_B\|_1$.
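On a finite set with a weighted counting measure the pseudometric can be computed directly. The following sketch uses an assumed toy measure (the dictionary `weight` and the sets are hypothetical) to check that µ(A∆B) agrees with the L1 distance of the characteristic functions, and that two different sets can be at distance 0 when they differ by a null set.

```python
def d_mu(A, B, weight):
    """d_mu(A, B) = mu(A symmetric-difference B) for mu(E) = sum of weight[x] over x in E."""
    return sum(weight[x] for x in A ^ B)


def l1_distance(A, B, weight):
    """||chi_A - chi_B||_1 computed pointwise."""
    return sum(abs((x in A) - (x in B)) * w for x, w in weight.items())


weight = {"a": 1.0, "b": 2.0, "c": 0.0, "d": 0.5}   # the point "c" is a null set
A, B, C = {"a", "b"}, {"b", "c"}, {"a", "b", "c"}

print(d_mu(A, B, weight), l1_distance(A, B, weight))
print(d_mu(A, C, weight))   # zero although A != C: only a pseudometric
```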
Proposition 8.6.2. Let (Ω, Σ, µ) be a finite measure space and assume that Σ is generated by an algebra A. Then A is dense in $(\Sigma, d_\mu)$.
Proof. Consider the set
which implies $d_\mu\big(\bigcup_{k=1}^{\infty} A_k, \bigcup_{k=1}^{n} B_k\big) < \varepsilon$. Since M ⊂ Σ is a σ-algebra that contains A they must be the same.
With similar ideas we can deal with the completion of a measure space.
Proposition 8.6.3. Let (Ω, Σ, µ) be a measure space. There exists a complete measure space $(\Omega, \overline{\Sigma}, \overline{\mu})$ over the same set which is the smallest possible and has the following property: for every $A \in \overline{\Sigma}$ there is B ∈ Σ such that $\overline{\mu}(A \Delta B) = 0$, that is, Σ is dense in $\overline{\Sigma}$ with respect to $d_{\overline{\mu}}$.
Proof. Evidently, a completion of (Ω, Σ, µ) must contain the family of sets
\[ \mathcal{N} = \{M \subset \Omega : \exists N \in \Sigma, \ \mu(N) = 0, \ M \subset N\}. \]
Using the same ideas as in the previous proposition it is possible to prove that
\[ \overline{\Sigma} = \{A \subset \Omega : \exists B \in \Sigma, \ A \Delta B \in \mathcal{N}\} \]
is a σ-algebra and that $\overline{\mu}(A) = \mu(B)$ if $A \Delta B \in \mathcal{N}$ is well defined.
Corollary 8.6.4. Let $(\Omega, \overline{\Sigma}, \overline{\mu})$ be the completion of (Ω, Σ, µ). If f is $\overline{\Sigma}$-measurable, then there is a Σ-measurable function g such that f = g almost everywhere with respect to µ.
Proof. For every t ∈ Q take a set $N_t \in \Sigma$ with $\mu(N_t) = 0$ such that there is $A_t \in \Sigma$ with $A_t \subset \{f \le t\}$ and $\{f \le t\} \setminus A_t \subset N_t$. The set $N = \bigcup_{t \in \mathbb{Q}} N_t$ is null. Define g(x) = f(x) if x ∉ N and g(x) = 0 otherwise. By construction g fulfils the requirements.
satisfy $\mu(A_k) \le 2^{-k}$. Take $A = \bigcap_{k=1}^{\infty} \bigcup_{j \ge k} A_j$ and note that µ(A) = 0. By construction we have for any $x \in A^c$ that $|f_{n_k}(x) - f(x)| \le 1/k$ from a certain k on, and so the theorem is proven.
we may proceed by applying the same ideas of the proof of Proposition 8.6.2.
Since in a metric space every open set is a countable union of closed sets we have the following.
Corollary 8.6.7. Every finite Borel measure in a metrizable space is regular.
Next we consider the possibility of replacing closed sets by compact sets in the inner approximation.
Theorem 8.6.8. Assume that X is separable and completely metrizable and µ is a finite Borel measure on it. Then
for every A ∈ B.
Proof. After the previous proposition it is enough to show the result is true for A closed. Fix ε > 0. For every n ∈ N take a countable cover $(B_{n,m})_{m=1}^{\infty}$ of A by balls of radius less than 1/n. Now fix $m_n$ such that
\[ \mu\Big(A \setminus \bigcup_{m=1}^{m_n} B_{n,m}\Big) < 2^{-n} \varepsilon. \]
Now we have
\[ B = \bigcap_{n=1}^{\infty} \bigcup_{m=1}^{m_n} B_{n,m} \]
The results discussed so far could be adapted for σ-finite measures with some additional hypotheses. For instance, it is easy to see that Theorem 8.6.8 is still true if the space X can be covered by countably many closed sets of finite measure. Let us mention that the result is still true even in the σ-finite case since any Borel subset of the completely metrizable space X can be completely metrized for the relative topology. In any case, for our most important case we have the following.
Theorem 8.6.9. A Lebesgue measurable set of Rd differs from a Borel set in
a null measure set and it is regular for the Lebesgue measure.
Proof. Let $A \subset \mathbb{R}^d$ be a Lebesgue measurable set. Since the Lebesgue outer measure can be computed by open covers we may find a $G_\delta$-set (countable intersection of open sets) E ⊃ A such that m(E) = m(A). That implies E \ A has null measure. In order to prove the regularity it is enough to work with Borel sets. If A were bounded the result could be deduced from Proposition 8.6.7. Otherwise, fix ε > 0 and take the sets $C_n = B(0, n + 1) \setminus B(0, n)$. Find a compact $K_n \subset C_n \cap A$ and an open $U_n \supset C_n \cap A$ such that $m(U_n \setminus K_n) < 2^{-n} \varepsilon$. Obviously $U = \bigcup_{n=1}^{\infty} U_n$ is open, and $F = \bigcup_{n=1}^{\infty} K_n$ is closed. Indeed, a convergent sequence is bounded, so it meets only finitely many $K_n$. Clearly m(U \ F) < ε.
Proof. After Proposition 8.6.1 it is enough to prove the statement for characteristic functions $\chi_A$ with A ∈ B and µ(A) < +∞. Fix ε > 0 and take closed and open sets F ⊂ A ⊂ U such that µ(U) < ∞ and µ(U \ F) < ε. By Urysohn's lemma there is f : X → [0, 1] continuous such that $f|_F = 1$ and $f|_{U^c} = 0$. Note that the support of f has finite measure and $\|\chi_A - f\|_1 < \varepsilon$.
Σ1 ⊗ Σ2 = σ({A × B : A ∈ Σ1 , B ∈ Σ2 }).
$f_n = \mu_2(S_n) \chi_{R_n}$.
For every x ∈ R the sets $\{S_n : x \in R_n\}$ are disjoint and their union is S. Therefore
\[ \mu_2(S) = \sum_{x \in R_n} \mu_2(S_n) = \sum_{n=1}^{\infty} f_n(x) \]
and thus $\sum_{n=1}^{\infty} f_n = \mu_2(S) \chi_R$. The monotone convergence theorem for series gives
\[ \sum_{n=1}^{\infty} \mu_1(R_n) \mu_2(S_n) = \sum_{n=1}^{\infty} \int f_n \, d\mu_1 = \int \mu_2(S) \chi_R \, d\mu_1 = \mu_1(R) \mu_2(S) \]
as wished. If (Ω1 , Σ1 , µ1 ) and (Ω2 , Σ2 , µ2 ) are σ-finite then (Ω1 × Ω2 , Σ1 ⊗
Σ2 , µ1 ⊗ µ2 ) also is. Assume first that µ1 ⊗ µ2 is finite. Proposition 8.6.2
implies that the measure is determined by the value on the algebra generated
by the rectangles. This can be extended to the σ-finite case in an obvious way.
Proof. Consider the class M ⊂ P(Ω1 × Ω2) of sets for which the statement of the theorem is true. Clearly, M contains Σ1 × Σ2 and the sets of the algebra generated by Σ1 × Σ2, because of the reduction to disjoint unions. In order to prove that M actually contains Σ1 ⊗ Σ2 we will use Theorem 8.2.1. Indeed, if $(A_n) \subset \mathcal{M}$ is an increasing sequence, then $f_n(x) = \mu_2((A_n)_x)$ and $g_n(y) = \mu_1((A_n)^y)$ are also increasing, so the monotone convergence theorem applies to get that
\[ \mu\Big(\bigcup_{n=1}^{\infty} A_n\Big) = \lim_n \int f_n \, d\mu_1 = \lim_n \int g_n \, d\mu_2. \]
Note that $\lim_n f_n(x) = \mu_2\big((\bigcup_{n=1}^{\infty} A_n)_x\big)$ and $\lim_n g_n(y) = \mu_1\big((\bigcup_{n=1}^{\infty} A_n)^y\big)$, and so $\bigcup_{n=1}^{\infty} A_n \in \mathcal{M}$. The proof for decreasing sequences is similar, but uses dominated convergence instead, if we assume that the measure is finite. Now, the σ-finite case follows straight: the intersection of M with every finite measure set lies in Σ1 ⊗ Σ2.
After the result for sets we will prove the corresponding one for functions. In order to make the result more powerful, we will consider measurability with respect to the completion of the product measure. In this way, the cross section technique for integration on $\mathbb{R}^d$ will be covered by the result.
Theorem 8.7.3 (Fubini, Tonelli). Suppose that (Ω1, Σ1, µ1) and (Ω2, Σ2, µ2) are complete and σ-finite. Let f : Ω1 × Ω2 → R be measurable with respect to the completion of Σ1 ⊗ Σ2, assume that f is either positive or integrable, and put $f_x(\cdot) = f(x, \cdot)$ and $f^y(\cdot) = f(\cdot, y)$ for x ∈ Ω1 and y ∈ Ω2. Then $\int f_x \, d\mu_2$ and $\int f^y \, d\mu_1$ exist for almost every x and y (with respect to µ1 and µ2), they are measurable on their respective spaces and
\[ \int f \, d(\mu_1 \otimes \mu_2) = \int \Big( \int f_x \, d\mu_2 \Big) d\mu_1 = \int \Big( \int f^y \, d\mu_1 \Big) d\mu_2. \]
Proof. Firstly note that the result is true for characteristic functions of sets from Σ1 ⊗ Σ2, and the result extends to simple functions because the expression is linear. If f were positive and measurable with respect to Σ1 ⊗ Σ2 the result would be a consequence of this observation and the monotone convergence theorem. Obviously the result extends to f measurable with respect to Σ1 ⊗ Σ2 and integrable. Now, if f is measurable with respect to the completion of Σ1 ⊗ Σ2, then there is g which is Σ1 ⊗ Σ2-measurable and coincides with f almost everywhere. The support of |f − g| is contained in a set N ∈ Σ1 ⊗ Σ2 of null measure. Theorem 8.7.2 implies that the set $N_x$ has null measure for almost all x ∈ Ω1. Then f(x, ·) is measurable for those x and coincides with g(x, ·) almost everywhere. A similar reasoning works for f(·, y).
Then, the integral is differentiable with respect to λ and the following equality holds
\[ \frac{\partial}{\partial \lambda} \int f(x, \lambda) \, d\mu = \int \frac{\partial f}{\partial \lambda}(x, \lambda) \, d\mu \]
at the points λ ∈ (a, b) where the second term is continuous.
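Proof. Put $F(\lambda) = \int f(x, \lambda) \, d\mu$. For λ1, λ2 ∈ (a, b) we have
\[ \int_{\lambda_1}^{\lambda_2} \int \frac{\partial f}{\partial \lambda}(x, \lambda) \, d\mu \, d\lambda = \int \int_{\lambda_1}^{\lambda_2} \frac{\partial f}{\partial \lambda}(x, \lambda) \, d\lambda \, d\mu = \int (f(x, \lambda_2) - f(x, \lambda_1)) \, d\mu = F(\lambda_2) - F(\lambda_1). \]
Therefore,
\[ \frac{F(\lambda_2) - F(\lambda_1)}{\lambda_2 - \lambda_1} = \frac{1}{\lambda_2 - \lambda_1} \int_{\lambda_1}^{\lambda_2} \int \frac{\partial f}{\partial \lambda}(x, \lambda) \, d\mu \, d\lambda \]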
whenever $(A_n)_{n=1}^{\infty} \subset \Sigma$ are mutually disjoint. Note that any permutation of the sets in the union on the left-hand side leaves the value unchanged, so the series on the right-hand side has to be unconditionally convergent, which is the same as absolutely convergent for real numbers, namely
\[ \sum_{n=1}^{\infty} |\nu(A_n)| < +\infty \]
whenever $(A_n)_{n=1}^{\infty} \subset \Sigma$ are mutually disjoint.
The first task to do with a signed measure is finding sets where the measure
behaves monotonically. Let us say that A ∈ Σ is positive if ν(B) ≥ 0 for any
B ∈ Σ with B ⊂ A. Analogously negative sets can be defined.
Lemma 8.8.1. Let ν be a signed measure. Then any set A ∈ Σ with ν(A) > 0 contains a positive set P ∈ Σ with ν(P) ≥ ν(A).
Proof. Consider
If $d_1 \ge 0$ the set A is already positive and there is nothing to do. Otherwise $d_1 < 0$. Take a set $B_1$ such that $\nu(B_1) < \max\{d_1/2, -1\}$. Assume the sets $B_1, \dots, B_{n-1}$ are already built and take
\[ d_n = \inf\Big\{\nu(B) : B \in \Sigma, \ B \subset A \setminus \bigcup_{k=1}^{n-1} B_k\Big\}. \]
after the lemma. Then P ∪ A would be positive and ν(P ∪ A) > s, which violates the definition of s. Now, it is clear that if A ∈ Σ is positive then ν(A \ P) = 0 and if A is negative then ν(A \ N) = 0, which implies the uniqueness of the decomposition up to null measure sets.
Corollary 8.8.3. A signed measure is the difference of two positive finite
measures.
Proof. Let (P, N) be the Hahn decomposition of ν. Take $\nu^+(A) = \nu(P \cap A)$ and $\nu^-(A) = -\nu(N \cap A)$. Obviously, we have $\nu = \nu^+ - \nu^-$.
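For a signed measure on a finite set, given by point weights, the Hahn decomposition and the two positive measures of the corollary can be written down explicitly. The sketch below is only an illustration under that assumption; the function `hahn_jordan` and the sample weights are hypothetical.

```python
def hahn_jordan(weights):
    """Hahn decomposition (P, N) and Jordan parts of nu(A) = sum of weights[x] over x in A."""
    P = {x for x, w in weights.items() if w >= 0}   # a positive set
    N = set(weights) - P                            # a negative set

    def nu_plus(A):
        return sum(weights[x] for x in A & P)

    def nu_minus(A):
        return -sum(weights[x] for x in A & N)

    return P, N, nu_plus, nu_minus


weights = {"a": 2.0, "b": -3.0, "c": 0.5}
P, N, nu_plus, nu_minus = hahn_jordan(weights)
A = {"a", "b"}
nu_A = sum(weights[x] for x in A)
print(P, N)
print(nu_A, nu_plus(A), nu_minus(A))   # nu = nu_plus - nu_minus
```

The quantity nu_plus(A) + nu_minus(A) is the value on A of the variation |ν|.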
The fact that |ν|(Ω) < +∞ is usually expressed by saying that ν has finite variation. The formula above is more interesting for vector valued measures (we will skip the definition, but the reader can easily guess it) because it allows one to define a positive measure |ν| that accurately controls ν. However, in the infinite-dimensional case |ν| may fail to be finite.
Proof. Without loss of generality we may assume ν positive. One of the implications is clear. For the converse just assume the ε-δ property is false. Namely, there is some ε > 0 such that for all n ∈ N there is $A_n \in \Sigma$ such that $\mu(A_n) < 2^{-n}$ and $\nu(A_n) > \varepsilon$. Note now that the set
\[ A = \bigcap_{n=1}^{\infty} \bigcup_{k=n}^{\infty} A_k \]
We claim f is the function we are looking for. Note that $\nu_0(A) = \nu(A) - \int_A f \, d\mu$ defines a positive measure which is also absolutely continuous with respect to
µ. Suppose that ν0 (Ω) > 0 in order to get a contradiction. Fix ε > 0 such
that ν0 (Ω) > εµ(Ω). Now applying the Hahn decomposition to ν0 − εµ we get
a positive part P with respect to such a measure. Since (ν0 − εµ)(Ω) > 0 we
get (ν0 − εµ)(P ) > 0 and also
ν0 (A ∩ P ) ≥ εµ(A ∩ P )
8.9 Differentiation
We will develop the Lebesgue theory of differentiation on $\mathbb{R}^d$ endowed with the d-dimensional Lebesgue measure m. The chosen norm on $\mathbb{R}^d$ will not play an essential role; however, it must be fixed from the beginning. Instead of the difference quotients we will use integral averages
\[ A_r(f)(x) = \frac{1}{m(B(x, r))} \int_{B(x, r)} f \, dm \]
for every r > 0 and $f \in L^1(\mu)$. We have to prepare the tools for the main result. The first one actually expresses a general property of the convolution.
Proposition 8.9.1. The average $A_r$ is a norm 1 operator on $L^1(\mu)$.
The previous result (and its proof) can be interpreted in terms of convolu-
tion with a family of kernels.
Proof. Write R for the radius of a ball or the supremum of the radii of a family of balls. The choice of balls will be by induction. Take $B_1 \in \mathcal{F}$ with $R(B_1) \ge 2^{-1} R(\mathcal{F})$. Suppose now that $B_k$ are already chosen for k = 1, . . . , n − 1. Find $B_n \in \mathcal{F}$ such that
\[ R(B_n) \ge \frac{1}{2} R(\{B \in \mathcal{F} : B \cap B_k = \emptyset, \ k = 1, \dots, n - 1\}) \]
if that choice is possible, otherwise the construction stops. In order to show that the sequence satisfies the statement, we may assume $\sum_n m(B_n) < \infty$. Let $B_n'$ be the ball with the same center as $B_n$ and radius 5 times bigger. We claim that the sequence $(B_n)$ meets every set in F. Indeed, take B ∈ F and assume $B_n \cap B = \emptyset$ for every n. That would imply $R(B_{n+1}) \ge R(B)/2$ for every n, and therefore the sequence is infinite and $\sum_n m(B_n) = \infty$, against the previous assumption. Now, since B meets some $B_n$, assume n is minimum. Then we have $R(B_n) \ge R(B)/2$ and so $B \subset B_n'$ (draw a picture). In consequence, $A \subset \bigcup_n B_n'$, and thus
\[ m(A) \le \sum_n m(B_n') = 5^d \sum_n m(B_n) \]
as desired.
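The inductive selection in this proof can be imitated for a finite family of intervals in R. The sketch below is an illustration, not the author's construction: it always picks a ball of largest radius among those disjoint from the previous choices, which in particular satisfies the "at least half of the supremum" rule. The function names and the random test family are hypothetical.

```python
import random


def vitali_select(balls):
    """balls: list of (center, radius) describing intervals in R.
    Greedy choice: scan by decreasing radius and keep every ball that is
    disjoint from the balls kept so far."""
    chosen = []
    for c, r in sorted(balls, key=lambda ball: -ball[1]):
        if all(abs(c - c2) > r + r2 for c2, r2 in chosen):
            chosen.append((c, r))
    return chosen


def covered_by_dilations(balls, chosen, factor=5):
    """Check that every ball lies inside the dilation by `factor` of a chosen ball."""
    return all(
        any(abs(c - c2) + r <= factor * r2 for c2, r2 in chosen)
        for c, r in balls
    )


rng = random.Random(0)
balls = [(rng.uniform(0.0, 10.0), rng.uniform(0.05, 1.0)) for _ in range(200)]
chosen = vitali_select(balls)
print(len(chosen), covered_by_dilations(balls, chosen))
```

With this particular greedy rule the factor 3 already suffices; the factor 5 of the proof leaves room for the weaker selection rule.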
Clearly the radii of the balls are uniformly bounded. Using Vitali's lemma there is a disjoint sequence (maybe finite) in $\{B_x : x \in A\}$ that we denote $(B_n)$. We have
\[ \int_{\bigcup_n B_n} |f| \, dm > \varepsilon \sum_n m(B_n) \ge \varepsilon \, 5^{-d} m(A) \]
which implies the statement.
Theorem 8.9.4. Let $f \in L^1(\mathbb{R}^d)$. Then for almost every point $x \in \mathbb{R}^d$ there exists the limit
\[ \lim_{r \to 0^+} \frac{1}{m(B(x, r))} \int_{B(x, r)} f \, dm = f(x). \]
Proof. Firstly we will show that the convergence of the averages happens with respect to ∥ · ∥1, namely
\[ \lim_{r \to 0^+} \|A_r(f) - f\|_1 = 0. \]
\[ m(\{\operatorname{osc}(f, x) > \varepsilon/2\}) \le \frac{5^d}{\varepsilon} \|f - g\|_1 \]
which implies m({osc(f, x) > ε/2}) = 0. Therefore the integral averages of f converge almost everywhere and the limit must coincide with f almost everywhere by the first part of the proof and Theorem 8.6.5.
It is clear that Theorem 8.9.4 can be extended to functions that are integrable on every bounded open subset of $\mathbb{R}^d$ (locally integrable). In this way the result naturally applies to characteristic functions.
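For the characteristic function of an interval the averages can be computed exactly, which shows why the conclusion only holds almost everywhere. The sketch below (an illustration; the function name is hypothetical) takes A = [0, 1] in dimension one: the averages tend to 1 at interior points, to 0 outside, and stay at 1/2 at the endpoints, which form a null set.

```python
def average_indicator(x, r, a=0.0, b=1.0):
    """A_r(chi_[a, b])(x) in dimension one: m([x - r, x + r] ∩ [a, b]) / (2 r)."""
    overlap = max(0.0, min(x + r, b) - max(x - r, a))
    return overlap / (2 * r)


for x in (0.3, 0.0, -0.1):
    print(x, [average_indicator(x, r) for r in (0.5, 0.1, 0.01)])
```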
Corollary 8.9.5. Given a measurable set A ⊂ Rd the limit
\[ \lim_{r \to 0^+} \frac{m(B(x, r) \cap A)}{m(B(x, r))} \]
\[ \lim_{r \to 0^+} \frac{1}{r} \int_{x-r}^{x} f(t) \, dt \quad \text{and} \quad \lim_{r \to 0^+} \frac{1}{r} \int_{x}^{x+r} f(t) \, dt. \]
It is not difficult to check that all the previous theory can be adapted to these averages even though x is not the central point. As a consequence we have the following.
Corollary 8.9.6. The indefinite integral $F(x) = \int_a^x f(t) \, dt$ of a locally integrable function on R is differentiable almost everywhere and the equality F′(x) = f(x) holds almost everywhere on its domain.
A related important question is to recognise those functions which are indefinite integrals of $L^1(\mathbb{R})$ functions. The key idea is the fact that the measure $\mu(A) = \int_A f \, dm$ is absolutely continuous with respect to the Lebesgue measure on R. We say that a function F defined on an interval (a, b) of R is absolutely continuous if for every ε > 0 there is δ > 0 such that, for any choice of points $a_1 < b_1 \le a_2 < b_2 \le \dots \le a_n < b_n$ in (a, b) with $\sum_{k=1}^{n} (b_k - a_k) < \delta$, we have $\sum_{k=1}^{n} |F(b_k) - F(a_k)| < \varepsilon$. Evidently, an absolutely continuous function is continuous and also it is of bounded variation on bounded intervals.
Theorem 8.9.7. Let F : (a, b) → R be an absolutely continuous function. Then F is differentiable almost everywhere and its derivative F′(x) = f(x) is locally integrable and satisfies
\[ \int_c^d f(x) \, dx = F(d) - F(c) \]
Proof. We may assume that [a, b] is finite and the endpoints belong to the domain. The members of the algebra A generated by the intervals can be represented as a disjoint finite union of intervals. We define a function on A by
\[ \nu(A) = \sum_{k=1}^{n} (F(b_k) - F(a_k)) \]
In the classroom, I like to begin the topic with the “proof” of Pythagoras' Theorem based on different decompositions of a square. In this way, I discuss to what extent we are using an intuitive notion of area and how we could calm the need for rigor. We can devise a “naïve” theory of area for polygons based on equidecomposability and then I would mention the theorems of Bolyai-Gerwien and Hadwiger-Glur. However, the same program cannot be developed in three dimensions because of Dehn's solution to Hilbert's third problem. That shows that integral methods are needed even if we restrict ourselves to volumes of polyhedral bodies.
As to the methods to carry out the topic, I want to point out the struggle to make clear the need for countably additive measures and the meaning of Carathéodory's measurability definition. Our proof of the Radon-Nikodym theorem is constructive, instead of von Neumann's idea using Hilbert space properties of $L^2$. In our vision, some specific properties of function spaces belong to Functional Analysis, so they are relegated to the auxiliary chapter.
8.11 Exercises
1. Use the formula for the measure of a union of sets to deduce the area of
a spherical triangle in terms of its angles.
2. Let $S_n$ be the group of permutations of a set of n elements. Let $F_n$ be the subset of $S_n$ of permutations that fix at least one element. Show the existence and find the value of the limit
\[ \lim_n \frac{\#(F_n)}{n!}. \]
3. Find the values of α for which
\[ \lim_n n^{\alpha} \int_0^1 (1 - x) x^n \cos(\pi x / n) \, dx = 0. \]
5. Let $f(x) = \int_0^1 \frac{\log(1 + xt)}{1 + t^2} \, dt$ defined for x > 0. Show that f(1) = π log(2)/8 and
\[ f'(x) = \frac{\log 4 + \pi x - 4 \log(1 + x)}{4(1 + x^2)}. \]
6. Prove that the function
\[ f(x) = \int_0^{+\infty} \Big( x \sqrt{t} + \frac{\cos(xt)}{\sqrt{t}} \Big) e^{-t} \, dt \]
is defined on R, and that it is continuous and monotone.
7. Let f : R → R be integrable, and for n ∈ N take
\[ f_n(x) = \frac{n}{\pi} \int_{-\infty}^{+\infty} \frac{f(t) \, dt}{1 + n^2 (t - x)^2}. \]
10. Let F ⊂ P(Ω) be a family of sets. Prove that for every A ∈ σ(F) there
is FA ⊂ F countable such that A ∈ σ(FA ).
11. Prove that the cardinality of the Borel sets of R is $\mathfrak{c} = 2^{\mathbb{N}}$ and the cardinality of the Lebesgue measurable sets of R is $2^{\mathfrak{c}}$.
12. Find a set that is not Lebesgue measurable (use an equivalent version of the Axiom of Choice).
13. Prove that there is no probability µ on P(N) such that µ(nN) = 1/n for all n ∈ N.
14. For any set A ⊂ R, let D(A) = {x − y : x, y ∈ A}. Prove that if m(A) > 0 then D(A) is a neighbourhood of 0. Show that the converse does not hold by computing D(T) where T is the ternary Cantor set.
15. Let f : R → R be a first Baire class function, that is, a pointwise limit of continuous functions. Show that $f^{-1}(U)$ is a countable union of closed sets for every open U ⊂ R. Deduce that the indicator function of the rationals is not first Baire class, but it is second Baire class (pointwise limit of first Baire class functions).
16. Let ⟨x⟩ denote the fractional part of a number x ∈ R. Prove that for all α ∈ R \ Q and every f ∈ C[0, 1] we have
\[ \lim_n \frac{1}{n} \sum_{k=1}^{n} f(\langle k\alpha \rangle) = \int_0^1 f(x) \, dx. \]
19. Let f, g : [0, +∞) → R be functions such that f is decreasing with $\lim_{x \to +\infty} f(x) = 0$ and there is M > 0 such that $\left| \int_x^y g(t) \, dt \right| \le M$ for every x, y ∈ [0, +∞). Show that if x < y ∈ [0, +∞), then
\[ \int_x^y f(t) g(t) \, dt \le M f(x). \]
20. Show that the product of two absolutely continuous functions (defined
on the same compact interval) is absolutely continuous too.
Chapter 9
called the variation of f on [a, b]. If $V_a^b(f) < +\infty$ we say that f is of bounded variation. Note that monotone functions are trivially of bounded variation. A less trivial example: the existence and boundedness of the derivative implies bounded variation since
\[ \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| = \sum_{i=1}^{n} |f'(\xi_i)| (x_i - x_{i-1}) \le M(b - a) \]
for some $\xi_i \in (x_{i-1}, x_i)$ and M > 0 being a bound for |f′(x)| on (a, b).
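The sums that define the variation are easy to evaluate on a concrete partition. The sketch below (an illustration with hypothetical helper names) does so for the sine function on [0, 2π]: the sums never exceed the bound M(b − a) = 2π obtained from |f′| ≤ 1, and they reach the value 4 as soon as the partition contains the turning points π/2 and 3π/2.

```python
import math


def variation_sum(f, partition):
    """Sum of |f(x_i) - f(x_{i-1})| over consecutive points of the partition."""
    return sum(abs(f(b) - f(a)) for a, b in zip(partition, partition[1:]))


def uniform_partition(a, b, n):
    return [a + (b - a) * i / n for i in range(n + 1)]


for n in (3, 4, 1000):
    print(n, variation_sum(math.sin, uniform_partition(0.0, 2 * math.pi, n)))
print("bound M(b - a) =", 2 * math.pi)
```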
(a) $|f(b) - f(a)| \le V_a^b(f)$;
(b) if [c, d] ⊂ [a, b] then $V_c^d(f) \le V_a^b(f)$;
(c) if c ∈ [a, b] then $V_a^c(f) + V_c^b(f) = V_a^b(f)$.
Proof. (a) Note that |f (b)−f (a)| is the sum associated to the trivial partition
of [a, b]. Statements (b) and (c) follow just taking suitable partitions of [a, b]
including points a, b.
and thus
\[ V_a^x(f) - f(x) \le V_a^y(f) - f(y) \]
which finishes the proof.
Corollary 9.1.4. A function of bounded variation has at most countably many
discontinuities of jump type.
Now we are interested in knowing if continuity of the function is inherited
by its variation.
Theorem 9.1.5. If f : [a, b] → R is continuous and of bounded variation then $V_a^x(f)$ is continuous for x ∈ [a, b].
Proof. Suppose that $V_a^x(f) \not\to V_a^c(f)$ for some c ∈ [a, b]. We may assume that c > a and x < c, otherwise we could make a similar argument. Therefore there is some η > 0 such that $V_x^c(f) > \eta$ for every x < c. Take $a_1 = a$ and find a partition $(x_i)_{i=0}^{n}$ of $[a_1, c]$ such that
\[ \sum_{i=1}^{n} |f(x_i) - f(x_{i-1})| > \eta. \]
By the continuity of f we may assume that $|f(x_{n-1}) - f(c)| < \eta/2$. Now take $a_2 = x_{n-1} < c$ and observe that $V_{a_1}^{a_2}(f) > \eta/2$. Proceed likewise to find $a_3$ and so an increasing sequence $(a_n)$ such that $V_{a_n}^{a_{n+1}}(f) > \eta/2$. Thus
It is easy to check that a rectifiable curve is rectifiable for any equivalent norm on X, although the length is obviously not invariant.
Note the similarities of the length with the variation. Some of the arguments in the preceding section can be adapted to prove the following properties.
Proposition 9.2.1. Let γ : [a, b] → X be a rectifiable continuous curve. Then
(a) $\|\gamma(b) - \gamma(a)\| \le L_a^b(\gamma)$;
(b) if c ∈ [a, b] then $L_a^c(\gamma) + L_c^b(\gamma) = L_a^b(\gamma)$;
(c) $L_a^t(\gamma)$ is increasing for t ∈ [a, b];
(d) $L_a^t(\gamma)$ is continuous for t ∈ [a, b].
In the following, we will assume that parameterized curves are always continuous. We will begin with the characterization in finite dimensional spaces.
Theorem 9.2.2. A curve $\gamma : [a, b] \to \mathbb{R}^d$ is rectifiable if and only if its coordinate functions are of bounded variation.
Proof. Being rectifiable is independent of the norm on $\mathbb{R}^d$ since all the norms are equivalent. We will use the ∥ · ∥1 norm to prove the equivalence. Write $\gamma(t) = (x_1(t), \dots, x_d(t))$ and observe that
\[ \sum_{i=1}^{n} |x_k(t_i) - x_k(t_{i-1})| \le \sum_{i=1}^{n} \|\gamma(t_i) - \gamma(t_{i-1})\|_1 = \sum_{j=1}^{d} \sum_{i=1}^{n} |x_j(t_i) - x_j(t_{i-1})| \]
where $(t_i)_{i=0}^{n}$ is a partition of [a, b] and k = 1, . . . , d. The first inequality implies that $x_k$ is of bounded variation when γ is rectifiable. The equality on the right-hand side implies that γ is rectifiable if all the functions $x_j$ for j = 1, . . . , d are of bounded variation.
Using a deep result saying that monotone functions have derivative almost
everywhere we can deduce the following corollary.
Corollary 9.2.3. A rectifiable curve γ : [a, b] → Rd has a tangent line at γ(t)
for almost every t ∈ [a, b].
However, this derivative is of little use as it does not show the global behaviour of the curve unless we assume extra regularity. Indeed, recall that there exist non-trivial monotone functions with null derivative at almost every point (Cantor's staircase, e.g.).
We will fix the standard of regularity in order to take advantage of the derivative of the curve. We will say that a curve γ : [a, b] → X is $C^1$ (please note: on [a, b]) if it has a derivative at every point of [a, b], including the endpoints with one-sided derivatives, and the derivative is continuous on [a, b]. It is not difficult to prove that this is equivalent to saying that there exists a $C^1$ extension of γ to an open interval containing [a, b]. We will say that a curve γ : [a, b] → X is piecewise $C^1$ if it is continuous and there exists a finite partition of [a, b] such that γ is $C^1$ on every subinterval of the partition.
Theorem 9.2.4. Let γ : [a, b] → X be a piecewise $C^1$ curve. Then γ is rectifiable and
\[ L_a^b(\gamma) = \int_a^b \|\gamma'(t)\| \, dt. \]
Proof. Firstly we will assume that γ is $C^1$ on [a, b], which implies the uniform continuity of γ′(t) on [a, b]. Given ε > 0 find δ > 0 such that |t − ξ| < δ implies ∥γ′(t) − γ′(ξ)∥ < ε. Take a partition $(t_i)_{i=0}^{n}$ of [a, b] such that $|t_i - t_{i-1}| < \delta$. Using the mean value theorem on the auxiliary function
\[ f(t) = \gamma(t) - \gamma(t_{i-1}) - \gamma'(\xi_i)(t - t_{i-1}) \]
with $\xi_i \in [t_{i-1}, t_i]$ we get that
\[ \|\gamma(t_i) - \gamma(t_{i-1}) - \gamma'(\xi_i)(t_i - t_{i-1})\| = \|f(t_i) - f(t_{i-1})\| \le \sup\{\|f'(t)\| : t \in [t_{i-1}, t_i]\}(t_i - t_{i-1}) \le \varepsilon(t_i - t_{i-1}). \]
Therefore
\[ \big| \, \|\gamma(t_i) - \gamma(t_{i-1})\| - \|\gamma'(\xi_i)\|(t_i - t_{i-1}) \, \big| \le \varepsilon(t_i - t_{i-1}). \]
Using that on every interval of the partition we have
\[ \Big| \sum_{i=1}^{n} \|\gamma(t_i) - \gamma(t_{i-1})\| - \sum_{i=1}^{n} \|\gamma'(\xi_i)\|(t_i - t_{i-1}) \Big| \le \sum_{i=1}^{n} \varepsilon(t_i - t_{i-1}) = \varepsilon(b - a). \]
Since we could take partitions such that the first term approaches the length and the second one the Riemann integral, taking limits we get
\[ \Big| L_a^b(\gamma) - \int_a^b \|\gamma'(t)\| \, dt \Big| \le \varepsilon(b - a), \]
from which it follows that $L_a^b(\gamma) < +\infty$. Now, as ε > 0 was arbitrary we get the equality between both numbers. Finally, the general case with γ being piecewise $C^1$ reduces to the last equality by the additivity of the length and the integral with respect to intervals.
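The statement of Theorem 9.2.4 can be checked numerically on an example. The sketch below (an illustration; the helix and the helper names are assumptions, and the Euclidean norm is used) compares the lengths of inscribed polygons with the value of the integral of ∥γ′(t)∥ for γ(t) = (cos t, sin t, t) on [0, 2π], where ∥γ′(t)∥ = √2 and the integral equals 2π√2.

```python
import math


def polygonal_length(gamma, a, b, n):
    """Length of the polygon inscribed in gamma at the uniform partition with n steps."""
    points = [gamma(a + (b - a) * i / n) for i in range(n + 1)]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def helix(t):
    return (math.cos(t), math.sin(t), t)


exact = 2 * math.pi * math.sqrt(2)   # integral of the constant speed sqrt(2)
for n in (4, 10, 100, 10_000):
    print(n, polygonal_length(helix, 0.0, 2 * math.pi, n), exact)
```

The polygonal lengths increase towards the integral as the partition is refined.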
9.3 Some formulas
Despite the generality of the results of the previous section, the Euclidean norm still plays a fundamental role. If the curve γ is parameterized as (x(t), y(t), z(t)) for t ∈ [a, b], the length is given by
\[ L_a^b(\gamma) = \int_a^b \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2} \, dt. \]
In the important case of the graph y = f(x) of a function, the length for x ∈ [a, b] is
\[ L_a^b = \int_a^b \sqrt{1 + f'(x)^2} \, dx. \]
However, a plane curve could be given in polar form r = ϕ(θ) with θ ∈ [α, β]. We have (locally)
\[ x(\theta) = \varphi(\theta) \cos\theta, \qquad y(\theta) = \varphi(\theta) \sin\theta. \]
Differentiation gives
for a partition $(t_i)_{i=0}^{n}$ of [a, b] and $\xi_i \in [t_{i-1}, t_i]$. For instance, the mass of a curve in terms of its linear density can be obtained this way. The reader acquainted with the Riemann-Stieltjes integral can see that the integral could be expressed as
\[ \int_a^b f(\gamma(t)) \, dL_a^t(\gamma). \]
That implies, in particular, that the existence is guaranteed for f continuous. A direct proof can be obtained by just mimicking the proof of Riemann integrability of continuous functions. We will use the following notation
\[ \int_\gamma f \, d\ell = \lim \sum_{i=1}^{n} f(\gamma(\xi_i)) L_{t_{i-1}}^{t_i}(\gamma) \]
and dℓ is called the “arc element”. It is worth noticing that the same limit, when it exists, can be obtained by the Riemann-like sums
\[ \sum_{i=1}^{n} f(\Xi_i) \|\gamma(t_i) - \gamma(t_{i-1})\| \]
where the point $\Xi_i$ can be taken from $\gamma([t_{i-1}, t_i])$ or from the segment $[\gamma(t_{i-1}), \gamma(t_i)]$ in case f is defined on a neighbourhood of γ([a, b]) where it is uniformly continuous.
Proof. Take a partition $(t_i)_{i=0}^{n}$ of [a, b] and find $\xi_i \in [t_{i-1}, t_i]$ such that
and so
\[ \sum_{i=1}^{n} f(\gamma(\xi_i)) L_{t_{i-1}}^{t_i}(\gamma) = \sum_{i=1}^{n} f(\gamma(\xi_i)) \|\gamma'(\xi_i)\| (t_i - t_{i-1}). \]
Taking limits with respect to the partitions we will get the desired identity.
For the following we will restrict ourselves to the Euclidean norm on $\mathbb{R}^d$ since the scalar product is involved. A very similar notion appears when we wish to formalise the path integral used to compute the work done by a force. Suppose that $f : \gamma([a, b]) \to \mathbb{R}^d$ is continuous and consider the convergence of Riemann-like sums of the form
\[ \sum_{i=1}^{n} f(\gamma(\xi_i)) \cdot (\gamma(t_i) - \gamma(t_{i-1})) \]
where “·” is the scalar product, $(t_i)_{i=0}^{n}$ a partition of [a, b] and $\xi_i \in [t_{i-1}, t_i]$. Again, the existence of this limit, called the line integral, can be proved by standard methods and its value is denoted by
\[ \int_\gamma \vec{f} \cdot d\vec{\ell} \]
(the notation with $d\vec{s}$ is also popular but we will try to avoid it when it could lead to confusion). Note that we could work in a Hilbert space instead of $\mathbb{R}^d$ because of the properties of the scalar product, or more generally, we could assume that f takes values in $X^*$, so we may consider sums of terms $f(\gamma(\xi_i))(\gamma(t_i) - \gamma(t_{i-1}))$, which is actually what appears in the theory of integration of differential forms.
Proof. Given ε > 0 we have established in the proof of Theorem 9.2.4 that
\[ \|\gamma(t_i) - \gamma(t_{i-1}) - \gamma'(\xi_i)(t_i - t_{i-1})\| \le \varepsilon(t_i - t_{i-1}) \]
for a fine enough partition, and so
\[ |f(\gamma(\xi_i)) \cdot (\gamma(t_i) - \gamma(t_{i-1})) - f(\gamma(\xi_i)) \cdot \gamma'(\xi_i)(t_i - t_{i-1})| \le \varepsilon M (t_i - t_{i-1}) \]
where M > 0 is an upper bound for ∥f∥. Summing all the terms we have
\[ \Big| \sum_{i=1}^{n} f(\gamma(\xi_i)) \cdot (\gamma(t_i) - \gamma(t_{i-1})) - \sum_{i=1}^{n} f(\gamma(\xi_i)) \cdot \gamma'(\xi_i)(t_i - t_{i-1}) \Big| \le \varepsilon M (b - a). \]
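The Riemann-like sums that define the line integral can be evaluated directly. The sketch below is an illustration under assumed data: the field f(x, y) = (−y, x), the unit circle γ(t) = (cos t, sin t) on [0, 2π], and midpoints as the points ξ_i. In that case f(γ(t)) · γ′(t) = 1, so the line integral equals 2π, and the sums approach this value. The function names are hypothetical.

```python
import math


def line_integral_sum(field, gamma, a, b, n):
    """Sum of field(gamma(xi_i)) . (gamma(t_i) - gamma(t_{i-1})) over a uniform
    partition of [a, b] with n steps, taking xi_i as the midpoint of each step."""
    total = 0.0
    for i in range(1, n + 1):
        t_prev = a + (b - a) * (i - 1) / n
        t_i = a + (b - a) * i / n
        xi = 0.5 * (t_prev + t_i)
        p, q = gamma(t_prev), gamma(t_i)
        value = field(gamma(xi))
        total += sum(fk * (qk - pk) for fk, pk, qk in zip(value, p, q))
    return total


def circle(t):
    return (math.cos(t), math.sin(t))


def rotation_field(point):
    x, y = point
    return (-y, x)


for n in (4, 100, 1000):
    print(n, line_integral_sum(rotation_field, circle, 0.0, 2 * math.pi, n))
print("2 pi =", 2 * math.pi)
```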
9.5 Alternative parameterizations
A change of parameterization of a curve γ : [a, b] → X can be easily done just by taking an increasing onto function h : [c, d] → [a, b] and considering γ ◦ h. The regularity of the new parameterization depends on the quality of both γ and h. We will try to solve the inverse problem: assume that we have two parameterizations giving the same curve (image and orientation); are these two parameterizations linked by a regular change of variables?
This is the parameterization of γ with respect to the arc length, which is usu-
ally denoted by the choice of the letter s as a variable.
Proof. Under the hypotheses $L_a^t(\gamma)$ is strictly increasing and thus τ is an actual inverse function whose derivative is τ′(s) = 1/∥γ′(τ(s))∥, which is continuous too. Moreover
\[ \tilde{\gamma}'(s) = \gamma'(\tau(s)) \tau'(s) \]
which has norm one by the previous formula.
The answer to the question of the beginning of the section turns out as a
corollary.
Corollary 9.5.2. If γ1 : [a, b] → X and γ2 : [c, d] → X are two C 1 parameter-
izations of the same curve (image and orientation) whose derivatives do not
vanish, then there exists a C 1 increasing bijection h : [a, b] → [c, d] such that
γ1 = γ2 ◦ h.
Proof. We have two re-parameterizations by the arc length γ̃1 = γ1 ◦ τ1 and
γ̃2 = γ2 ◦ τ2 . Note that we must have γ̃1 = γ̃2 and therefore γ1 = γ2 ◦ τ2 ◦ τ1−1 .
Now, if ε > 0 is small enough then F is injective when defined on [0, L] × [−ε, ε] and its image covers γ([0, L]) + B[0, ε] except the ends, which are covered by two semicircles of radius ε, so
The area of F([0, L] × [−ε, ε]) can be computed by integrating the absolute value of the Jacobian of F over [0, L] × [−ε, ε]. Firstly, the computation leads to
\[ \frac{\partial F}{\partial(s, t)} = \begin{vmatrix} x'(s) + t y''(s) & y'(s) \\ y'(s) - t x''(s) & -x'(s) \end{vmatrix} \]
(actually, the term O(ε²) is null). Taking into account the ends of the curve, we still have Area(γ([0, L]) + B[0, ε]) = 2εL + O(ε²). Dividing by 2ε and taking limits we recover L, thus proving the claim.
Any implicitly defined surface can always be parameterized in that way locally. Note that
\[ \frac{\partial \Gamma}{\partial u}(u, v) \quad \text{and} \quad \frac{\partial \Gamma}{\partial v}(u, v) \]
are tangent vectors at the point Γ(u, v). The condition in terms of the vector product “×” means that they generate the tangent plane at that point. Moreover, $\frac{\partial \Gamma}{\partial u}(u, v) \times \frac{\partial \Gamma}{\partial v}(u, v)$ is a normal vector to that plane.
Definition 9.7.1. The area (2-dimensional measure) of a compact subset $S \subset \mathbb{R}^3$ is defined by the limit, whenever it exists, as
The points that are at distance less than or equal to ε from Γ(D), with the exception of the points whose minimum distance is attained at Γ(∂D), are covered with the image of the map
whose diameter does not exceed δ. Now we may assume that ε > 0 is small
enough in order to satisfy ε < δ/3 and that $\|\Gamma(u_1, v_1) - \Gamma(u_2, v_2)\| < 2\varepsilon$ implies
$\|(u_1, v_1) - (u_2, v_2)\| < \delta/3$, which is possible by the uniform continuity of $\Gamma^{-1}$
defined on Γ(D). In order to prove global injectivity assume that $F(u_1, v_1, t_1) = F(u_2, v_2, t_2)$.
Since $|t_1|, |t_2| < \varepsilon$ we have $\|\Gamma(u_1, v_1) - \Gamma(u_2, v_2)\| < 2\varepsilon$ and so
$\|(u_1, v_1, t_1) - (u_2, v_2, t_2)\| < \delta$, which contradicts the local injectivity.
The volume of Γ(D) + B[0, ε] differs from the volume of $F(\Omega_t)$ by $O(t^2)$, as follows
from estimating the volume of those points in Γ(D) + B[0, ε] whose
distance, less than ε, is attained at some point coming from ∂D, which is composed of
finitely many $C^1$ curves.
In order to compute this volume, as F is injective and C 1 , we may use the
change of variable formula
\[
\iiint_{\Omega_t} \left|\frac{\partial F}{\partial(u,v,t)}\right| du\,dv\,dt
\]
Note that the partial derivatives that we need for the computation of the
Jacobian can be expressed in vector notation as
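\[
\frac{\partial\Gamma}{\partial u} + t\,\frac{\partial N}{\partial u}; \qquad
\frac{\partial\Gamma}{\partial v} + t\,\frac{\partial N}{\partial v};
\]
and N. Therefore, the Jacobian can be computed as the mixed product of the
three vectors
\[
\begin{aligned}
&\left[\left(\frac{\partial\Gamma}{\partial u} + t\,\frac{\partial N}{\partial u}\right) \times \left(\frac{\partial\Gamma}{\partial v} + t\,\frac{\partial N}{\partial v}\right)\right] \cdot N \\
&\quad= \left[\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v} + t\,\frac{\partial\Gamma}{\partial u} \times \frac{\partial N}{\partial v} - t\,\frac{\partial\Gamma}{\partial v} \times \frac{\partial N}{\partial u} + t^2\,\frac{\partial N}{\partial u} \times \frac{\partial N}{\partial v}\right] \cdot N \\
&\quad= \left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\| + t f + t^2 g
\end{aligned}
\]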
where f, g are continuous functions on D. Thus
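\[
\mathrm{vol}(F(\Omega_t)) = 2\varepsilon \iint_D \left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\| du\,dv + O(\varepsilon^2)
\]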
Therefore
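\[
\lim_{\varepsilon \to 0^+} \frac{\mathrm{vol}(\Gamma(D) + B[0, \varepsilon])}{2\varepsilon} = \iint_D \left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\| du\,dv
\]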
which completes the proof.
\[
\left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\| = (R + r\cos\phi)\,r.
\]
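which are the coefficients of the so-called first fundamental form in differential
geometry of surfaces. With this notation we have
\[
\left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\|^2 = EG - F^2
\]
and therefore
\[
\mathrm{Area} = \iint_D \left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\| du\,dv = \iint_D \sqrt{EG - F^2}\; du\,dv
\]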
Now we will discuss two important particular cases. The first one is the
form adopted by the integral for a surface given as the graph of a function
z = f (x, y) with (x, y) ∈ D. In this case the parameters are the variables x, y
and Γ(x, y) = (x, y, f (x, y)) and so
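\[
\frac{\partial\Gamma}{\partial x} = \left(1, 0, \frac{\partial f}{\partial x}\right); \qquad
\frac{\partial\Gamma}{\partial y} = \left(0, 1, \frac{\partial f}{\partial y}\right).
\]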
We have then
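\[
E = 1 + \left(\frac{\partial f}{\partial x}\right)^2; \qquad
F = \frac{\partial f}{\partial x}\,\frac{\partial f}{\partial y}; \qquad
G = 1 + \left(\frac{\partial f}{\partial y}\right)^2
\]
and
\[
\mathrm{Area} = \iint_D \sqrt{1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2}\; dx\,dy.
\]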
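As a quick numerical sanity check of the formula for the area of a graph, the following Python sketch approximates the integral by a midpoint rule. The function name `graph_area`, the finite-difference step and the test surface z = 2x + 2y over the unit square are illustrative assumptions, not part of the text. For that plane the integrand is constantly sqrt(1 + 4 + 4) = 3, so the computed area should be close to 3.

```python
import math

def graph_area(f, x0, x1, y0, y1, n=100, h=1e-5):
    """Midpoint-rule approximation of the area of z = f(x, y) over a rectangle.

    The partial derivatives are approximated by central differences
    (an assumption made for illustration; exact derivatives could be used).
    """
    dx = (x1 - x0) / n
    dy = (y1 - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * dx
        for j in range(n):
            y = y0 + (j + 0.5) * dy
            fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
            fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
            # Integrand of the area formula: sqrt(1 + f_x^2 + f_y^2)
            total += math.sqrt(1.0 + fx * fx + fy * fy) * dx * dy
    return total

# Hypothetical test surface: the plane z = 2x + 2y over the unit square.
plane_area = graph_area(lambda x, y: 2 * x + 2 * y, 0.0, 1.0, 0.0, 1.0)
```

The same function can be applied to any smooth f on a rectangle; for non-rectangular domains D one would have to restrict the sum to the midpoints lying in D.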
The second case of surfaces admitting a particular formula for their area is
that of the surfaces of revolution. Assume that we rotate
around the X-axis the graph of a one-variable function y = f (x) ≥ 0 with
x ∈ [a, b]. In such a case, in addition to the variable x ∈ [a, b], we have to
consider a rotation parameter θ ∈ [0, 2π], so the surface can be expressed as
which is the classical first Pappus–Guldin theorem. The reader can check that
the result of Example 9.7.3 can be easily obtained in this way.
For the sake of completeness, let us also comment on the second Pappus–Guldin
theorem. Consider now the vertical coordinate of the center of mass of
the trapezoid
{(x, y) : x ∈ [a, b], y ∈ [0, f (x)]}
that is
\[
CM_Y = \frac{\int_a^b f(x)^2\,dx}{2\int_a^b f(x)\,dx}.
\]
Then the volume generated by the rotation can be calculated as
\[
\mathrm{Volume} = 2\pi\, CM_Y\, \mathrm{Area}
\]
where “Area” is the area of the trapezoid. The formula becomes evident when integrating
through cross sections, since the volume is $\pi\int_a^b f(x)^2\,dx$ and the area is $\int_a^b f(x)\,dx$.
Note that the center of mass involves an integrand of quadratic degree.
That was used by Archimedes in his Method to reduce one degree in “integration”,
reducing in that way, for instance, the computation of the area bounded
by a parabola to a property of the triangle.
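The second Pappus–Guldin formula can be checked numerically on a simple case. The sketch below is only an illustration under stated assumptions: the profile f(x) = x on [0, 1] (which generates a cone of volume π/3), the helper `midpoint_integral` and all variable names are hypothetical choices, not taken from the text.

```python
import math

def midpoint_integral(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    dx = (b - a) / n
    return sum(g(a + (k + 0.5) * dx) for k in range(n)) * dx

def f(x):
    # Hypothetical profile: the segment y = x on [0, 1], which generates a cone.
    return x

a, b = 0.0, 1.0
area = midpoint_integral(f, a, b)                               # area under the graph
cm_y = midpoint_integral(lambda x: f(x) ** 2, a, b) / (2 * area)  # height of the center of mass
volume_pappus = 2 * math.pi * cm_y * area                       # second Pappus-Guldin formula
volume_sections = math.pi * midpoint_integral(lambda x: f(x) ** 2, a, b)  # cross sections
```

Both `volume_pappus` and `volume_sections` approximate π/3, the volume of the cone of height 1 and base radius 1.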
Proof. The surface can be represented as a finite or countable union of non
overlapping parameterized $C^1$ surfaces with boundaries. The formula above
defines a measure on each piece and the sum (series) of all those measures will
be the wanted measure. We have to check that the measure does not depend
on the decomposition into parameterized $C^1$ surfaces nor on the parameterizations.
Suppose that Σ has two different decompositions. The intersection of both
decompositions induces a finer decomposition, at most countable. It is not
difficult to see that the problem of uniqueness for S reduces to checking that it is the
same on each of these pieces.
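Suppose that $\Gamma_1$ and $\Gamma_2$, with domains $D_1$ and $D_2$, parameterize the same piece. Firstly, note that
$h = \Gamma_2^{-1} \circ \Gamma_1$ is an injective $C^1$ map from $D_1$ onto $D_2$. Therefore the theorem of
change of variables is applicable
\[
\iint_{\Gamma_2^{-1}(A)} \left\|\frac{\partial\Gamma_2}{\partial u} \times \frac{\partial\Gamma_2}{\partial v}\right\| du\,dv
= \iint_{(h^{-1}\circ\Gamma_2^{-1})(A)} \left\|\frac{\partial\Gamma_2}{\partial u} \times \frac{\partial\Gamma_2}{\partial v}\right\| \left|\frac{\partial(u,v)}{\partial(s,t)}\right| ds\,dt
\]
where h(s, t) = (u(s, t), v(s, t)). Now we have to express the tangent vectors
in terms of $\Gamma_1$ and the variables s, t. We have $\Gamma_1 = \Gamma_2 \circ h$, thus
\[
\frac{\partial\Gamma_1}{\partial s} = \frac{\partial\Gamma_2}{\partial u}\frac{\partial u}{\partial s} + \frac{\partial\Gamma_2}{\partial v}\frac{\partial v}{\partial s}; \qquad
\frac{\partial\Gamma_1}{\partial t} = \frac{\partial\Gamma_2}{\partial u}\frac{\partial u}{\partial t} + \frac{\partial\Gamma_2}{\partial v}\frac{\partial v}{\partial t}.
\]
The vector product gives
\[
\frac{\partial\Gamma_1}{\partial s} \times \frac{\partial\Gamma_1}{\partial t}
= \left(\frac{\partial u}{\partial s}\frac{\partial v}{\partial t} - \frac{\partial v}{\partial s}\frac{\partial u}{\partial t}\right) \frac{\partial\Gamma_2}{\partial u} \times \frac{\partial\Gamma_2}{\partial v}
= \frac{\partial(u,v)}{\partial(s,t)}\, \frac{\partial\Gamma_2}{\partial u} \times \frac{\partial\Gamma_2}{\partial v}
\]
and thus, since $(h^{-1}\circ\Gamma_2^{-1})(A) = \Gamma_1^{-1}(A)$,
\[
\iint_{\Gamma_2^{-1}(A)} \left\|\frac{\partial\Gamma_2}{\partial u} \times \frac{\partial\Gamma_2}{\partial v}\right\| du\,dv
= \iint_{\Gamma_1^{-1}(A)} \left\|\frac{\partial\Gamma_1}{\partial s} \times \frac{\partial\Gamma_1}{\partial t}\right\| ds\,dt
\]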
for curves. The construction of the measure S by means of an integral, together
with standard methods of measure theory (approximation by simple functions),
implies that for a function f defined on a surface Σ which is integrable with respect to
S we have
\[
\int_\Sigma f\,dS = \sum_n \iint_{D_n} f\circ\Gamma_n \left\|\frac{\partial\Gamma_n}{\partial u} \times \frac{\partial\Gamma_n}{\partial v}\right\| du\,dv,
\]
where $(\Gamma_n, D_n)$ is a finite or countable decomposition of Σ.
However, on many occasions the vector field will be normal to the surface, so it
can be written as $\vec F = f\,\vec N$, where $\vec N$ is a unit normal vector field and f a scalar
function. Let us remark that at any point there are two unit vectors which
are normal to the surface. To fix a continuous normal field $\vec N$ is to choose the
orientation of Γ between the two possible ones for a parameterized surface with
boundary (general surfaces cannot always be oriented, as the Moebius strip shows). In
that case there is a specific notation
\[
\iint_\Sigma f\,\vec N\,dS = \iint_\Sigma f\,d\vec S
\]
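The notation using the oriented element of area $d\vec S$ is very suitable because
in the case of a parameterized surface Γ we have
\[
\iint_\Gamma f\,d\vec S = \iint_D (f\circ\Gamma) \left(\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right) du\,dv
\]
in case the unit normal $\vec N$ points in the same direction as $\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}$. Indeed,
in such a case we have
\[
\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v} = \left\|\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right\| \vec N.
\]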
Finally we will consider the so-called flux of a vector field through a surface.
If the vector field represents the velocities at a given moment of the particles of a
moving fluid, the flux integral computes the volume of fluid
crossing the surface per unit of time at that moment. Let Γ be a parameterized
$C^1$ surface with boundary and assume that there is an orientation given to Γ
which agrees with $\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}$. Then the flux of a field $\vec F$ through Γ is defined
as
\[
\iint_\Gamma \vec F \cdot d\vec S = \iint_\Gamma \vec F \cdot \vec N\, dS
\]
and, obviously, it can be computed by
\[
\iint_\Gamma \vec F \cdot d\vec S = \iint_D \vec F \cdot \left(\frac{\partial\Gamma}{\partial u} \times \frac{\partial\Gamma}{\partial v}\right) du\,dv
\]
and the mixed product inside can be computed directly as the
determinant of the coordinates of the three vectors. The flux integrals can
be interpreted as integrals of differential forms of second degree.
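The computation of a flux through a parameterized surface can be illustrated with a short Python sketch. Everything named here is an assumption made for the example: the field F(x, y, z) = (x, y, z), the unit sphere parameterized by colatitude u and longitude v (for which the vector product of the partial derivatives points outwards), and the helper names `cross`, `dot` and `flux`. For this choice the exact flux is 4π.

```python
import math

def cross(a, b):
    """Vector product of two vectors of R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def flux(F, gamma, gamma_u, gamma_v, u0, u1, v0, v1, n=200):
    """Midpoint-rule approximation of the integral of F . (Gamma_u x Gamma_v) du dv."""
    du = (u1 - u0) / n
    dv = (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            normal = cross(gamma_u(u, v), gamma_v(u, v))
            total += dot(F(gamma(u, v)), normal) * du * dv
    return total

# Hypothetical example: unit sphere, u = colatitude in [0, pi], v = longitude in [0, 2 pi].
def sphere(u, v):
    return (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))

def sphere_u(u, v):
    return (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), -math.sin(u))

def sphere_v(u, v):
    return (-math.sin(u) * math.sin(v), math.sin(u) * math.cos(v), 0.0)

sphere_flux = flux(lambda p: p, sphere, sphere_u, sphere_v,
                   0.0, math.pi, 0.0, 2 * math.pi)
```

Swapping the roles of u and v reverses the vector product and hence the sign of the result, in agreement with the role of the orientation.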
fact of being rectifiable implies that the curve is necessarily continuous except at
countably many points, and continuous rectifiable curves have tangents almost
everywhere.
9.11 Exercises
1. A logarithmic spiral is a curve that can be represented in polar coordinates
by $r = a e^{b\theta}$. Calculate the length of the arc for θ ∈ [0, 2π].
2. Calculate the length of the cardioid, a classic curve whose polar formula
is
r = 1 + cos θ
where θ ∈ [−π, π].
3. Calculate the length of the closed curve
$x = a \sin^3 t; \quad y = a \cos^3 t.$
$x^2 + y^2 + z^2 = 1; \quad x^2 + y^2 - x = 0.$
6. Find the explicit formula, in terms of the first fundamental form, for the
length of a curve γ = T ◦ η contained in a $C^1$ surface
T (u, v) = (x(u, v), y(u, v), z(u, v)), where $\eta : [a, b] \to \mathbb{R}^2$ is
also $C^1$.
7. Calculate the area of the portion of the sphere $x^2 + y^2 + z^2 = 2x$ inside the
cone $x^2 + y^2 = z^2$.
8. Calculate the area of the cone $x^2 + y^2 = z^2$ with z ≥ 0 inside the sphere
$x^2 + y^2 + z^2 = 2ax$.
9. Consider the surface z = Axy with $x^2 + y^2 \le R^2$ where A, R ≥ 0. Estimate
the value of the parameter A for which the area of the surface is twice the area
of its orthogonal projection onto the plane XY .
10. Find the area of the portion of the cone $z = \sqrt{2x^2 + 2y^2}$ below the plane
z = x + 1.
11. Calculate the area of the surface $z = \sqrt{2xy}$ with (x, y) ∈ [0, a] × [0, b].
12. Calculate the area of the piece of paraboloid $y^2 + z^2 = 2px$ limited by
the plane x = a.
13. Calculate the area of the piece of paraboloid $x^2/a + y^2/b = 2z$ inside the
cylinder $x^2/a^2 + y^2/b^2 = 1$.
14. Calculate the area of the cone $z^2 = 2xy$ limited by the planes x = 0,
x = a, y = 0, y = b.
15. Find the area of the surface
z = arcsin(sinh x sinh y)
20. The catenary is the curve with the shape of a hanging chain and it is
modelled by the hyperbolic cosine. The catenoid is the surface generated
by a catenary that rotates around its symmetry axis.
(a) Find the length of an arc of catenary.
(b) Find the area of a catenoid between its vertex and a circular section.
21. Consider the arc of cycloid x(t) = t − sin t, y(t) = 1 − cos t for t ∈ [0, 2π].
(a) Find the length of the arc.
(b) Find the area of the surface generated by the rotation of the arc
around the X axis.
Chapter 10
Now we turn our attention to the finite-dimensional case $X = \mathbb{R}^n$.
If we denote a generic point by $x = (x_1, x_2, \dots, x_n)$ we may consider the linear
form that assigns to x its k-th coordinate, $x \mapsto x_k$, and denote this linear form by
$dx_k$. It turns out that $\{dx_1, dx_2, \dots, dx_n\}$ is a basis of $(\mathbb{R}^n)^*$.
Therefore, any 1-form ω can be expressed in terms of the basis as
\[
\omega = f_1\,dx_1 + f_2\,dx_2 + \dots + f_n\,dx_n
\]
where $f_k$ with k = 1, . . . , n are scalar functions. In particular, we can express the
differential of a scalar function in this way
\[
df = \frac{\partial f}{\partial x_1}\,dx_1 + \frac{\partial f}{\partial x_2}\,dx_2 + \dots + \frac{\partial f}{\partial x_n}\,dx_n.
\]
If n ≤ 3 we will use the variables x, y, z rather than indices.
The first natural question is to what extent the definition depends on
γ as a function rather than on the “shape” γ([a, b]). Actually the integral will
depend on the set γ([a, b]) and on the order in which the points are placed, that is,
the direction in which the curve is traversed. To decide between the two possible ways to traverse
the curve is to fix an orientation. To be more precise, the integral of 1-forms
does not change under reparameterizations which preserve the orientation.
Proposition 10.2.1. Let $\gamma : [a, b] \to D \subset \mathbb{R}^n$ be piecewise $C^1$, ω a continuous
1-form defined on D and $j : [c, d] \to [a, b]$ an increasing piecewise $C^1$ bijection.
Then $\tilde\gamma = \gamma \circ j : [c, d] \to D$ is a piecewise $C^1$ curve and
\[
\int_{\tilde\gamma} \omega = \int_\gamma \omega
\]
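Proof. The following equalities are not affected by the finite set of points where
the derivatives are not defined
\[
\begin{aligned}
\int_{\tilde\gamma} \omega &= \int_c^d \omega(\tilde\gamma(t))(\tilde\gamma'(t))\,dt = \int_c^d \omega(\gamma(j(t)))(\gamma'(j(t))\,j'(t))\,dt \\
&= \int_c^d \omega(\gamma(j(t)))(\gamma'(j(t)))\,j'(t)\,dt = \int_a^b \omega(\gamma(\tau))(\gamma'(\tau))\,d\tau = \int_\gamma \omega
\end{aligned}
\]
where we have used the chain rule, the linearity of the form and the change of variable τ = j(t).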
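The invariance under orientation-preserving reparameterizations can be observed numerically. The following Python sketch is an illustration only: the form ω = −y dx + x dy, the quarter of circle γ(t) = (cos t, sin t) with t ∈ [0, π/2], the change of parameter j(s) = (π/2)s² on [0, 1] and the helper `line_integral` are all assumptions chosen for the example. Derivatives of the curve are approximated by central differences.

```python
import math

def line_integral(omega, gamma, a, b, n=2000, h=1e-6):
    """Midpoint-rule approximation of the integral of p dx + q dy along gamma on [a, b].

    omega(x, y) returns the coefficients (p, q); gamma'(t) is approximated
    by central differences (an assumption made to keep the sketch short).
    """
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        x, y = gamma(t)
        xp, yp = gamma(t + h)
        xm, ym = gamma(t - h)
        vx, vy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
        p, q = omega(x, y)
        total += (p * vx + q * vy) * dt
    return total

def omega(x, y):
    return (-y, x)            # hypothetical 1-form: -y dx + x dy

def gamma(t):
    return (math.cos(t), math.sin(t))

def j(s):
    return (math.pi / 2) * s * s   # increasing C^1 bijection from [0, 1] onto [0, pi/2]

def gamma_tilde(s):
    return gamma(j(s))

integral_gamma = line_integral(omega, gamma, 0.0, math.pi / 2)
integral_gamma_tilde = line_integral(omega, gamma_tilde, 0.0, 1.0)
```

Both values approximate π/2. Replacing j by a decreasing bijection would change the sign of the result.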
The chain rule is also the trick behind the following result.
Proposition 10.2.3. Let $f : D \subset \mathbb{R}^n \to \mathbb{R}$ be $C^1$ and let $\gamma : [a, b] \to D$ be
piecewise $C^1$. Then
\[
\int_\gamma df = f(\gamma(b)) - f(\gamma(a)).
\]
In particular, if γ is closed, that is, γ(a) = γ(b), then $\int_\gamma df = 0$.
Proof. Only one line
\[
\int_\gamma df = \int_a^b df(\gamma(t))(\gamma'(t))\,dt = \int_a^b \frac{d}{dt}\big(f(\gamma(t))\big)\,dt = f(\gamma(b)) - f(\gamma(a)).
\]
Note that saying that the integral depends only on the starting and ending
points (from now on “endpoints”, with a distinction between them) means that
it does not matter how the path joins them, which is actually equivalent
to saying that the integral along a closed path is always 0. We have the following
important result.
Theorem 10.2.4. Let ω be a 1-form defined on $D \subset \mathbb{R}^n$. Then there exists
a $C^1$ function $f : D \to \mathbb{R}$ such that ω = df if and only if $\int_\gamma \omega$ depends only on
the endpoints of γ (equivalently, if $\int_\gamma \omega = 0$ for every closed γ ⊂ D).
Proof. Clearly one implication is a consequence of the previous proposition.
Assume now that the line integral depends only on the endpoints of the curve.
We may assume that D is connected, since for a general open set it is enough
to make the following construction on every connected component. Fix a point
$x_0 \in D$ and for any other point x ∈ D fix a $C^1$ curve $\gamma_x \subset D$ starting at $x_0$ and
ending at x (the parameter interval is not relevant). Define now $f(x) := \int_{\gamma_x} \omega$.
Note that this definition is not ambiguous by the hypothesis of independence
of how the points $x_0$ and x are joined.
Now fix x ∈ D and take δ > 0 small enough to have x + h ∈ D
for every $h \in \mathbb{R}^n$ with ∥h∥ < δ. Observe that
\[
f(x + h) - f(x) = \int_{\gamma_{x+h}} \omega - \int_{\gamma_x} \omega = \int_{\gamma_{x+h} - \gamma_x} \omega
\]
and note that the curve $\gamma_{x+h} - \gamma_x$ joins the points x and x + h, so it can
be replaced by the segment x + th with t ∈ [0, 1]. We have then
\[
f(x + h) - f(x) = \int_0^1 \omega(x + th)(h)\,dt.
\]
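its coefficients satisfy that $f_k = \frac{\partial f}{\partial x_k}$ and therefore
\[
\frac{\partial f_k}{\partial x_j} = \frac{\partial^2 f}{\partial x_k\,\partial x_j} = \frac{\partial^2 f}{\partial x_j\,\partial x_k} = \frac{\partial f_j}{\partial x_k}
\]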
That shows that df = ω as wanted.
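An explicit form for the primitive of ω is given in the proof; however, in
low dimension it is better to proceed in this way: suppose that ω = p dx + q dy
is closed, that is, $\frac{\partial p}{\partial y} = \frac{\partial q}{\partial x}$. Then find a partial primitive g of q with respect to
y, that is, $\frac{\partial g}{\partial y} = q$. The function we are looking for can be written as f = g + h
where h does not depend on y, as
\[
\frac{\partial h}{\partial y} = \frac{\partial f}{\partial y} - \frac{\partial g}{\partial y} = q - q = 0.
\]
Then h(x) is a function of a single variable and
\[
h'(x) = \frac{\partial f}{\partial x} - \frac{\partial g}{\partial x} = p - \frac{\partial g}{\partial x}.
\]
The last term should be a function of x only in order to find h. Actually, it is
\[
\frac{\partial}{\partial y}\left(p - \frac{\partial g}{\partial x}\right) = \frac{\partial p}{\partial y} - \frac{\partial^2 g}{\partial x\,\partial y} = \frac{\partial p}{\partial y} - \frac{\partial q}{\partial x} = 0
\]
by the hypothesis. Now find h and we have f = g + h explicitly.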
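The recipe can be followed on a concrete closed form. The example below is a hypothetical one, chosen only for illustration: ω = (2xy + cos x) dx + (x² + 3y²) dy. A partial primitive of q with respect to y is g = x²y + y³, then h′(x) = p − ∂g/∂x = cos x, so h = sin x and f = x²y + y³ + sin x. The Python sketch checks numerically, by central differences at a few sample points, that the gradient of f matches (p, q).

```python
import math

def p(x, y):
    return 2 * x * y + math.cos(x)

def q(x, y):
    return x * x + 3 * y * y

def g(x, y):
    # Partial primitive of q with respect to y.
    return x * x * y + y ** 3

def h(x):
    # Primitive of h'(x) = p - dg/dx = cos x.
    return math.sin(x)

def f(x, y):
    return g(x, y) + h(x)

def gradient(func, x, y, eps=1e-6):
    """Central-difference approximation of the gradient of func at (x, y)."""
    fx = (func(x + eps, y) - func(x - eps, y)) / (2 * eps)
    fy = (func(x, y + eps) - func(x, y - eps)) / (2 * eps)
    return fx, fy

sample_points = [(0.3, -1.2), (1.5, 0.7), (-2.0, 2.5)]
max_error = 0.0
for x, y in sample_points:
    fx, fy = gradient(f, x, y)
    max_error = max(max_error, abs(fx - p(x, y)), abs(fy - q(x, y)))
```

A small value of `max_error` is consistent with df = ω; it is of course a numerical check at sample points, not a proof.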
Poincaré’s result points out that closed forms are exact on special domains.
If the domain is not star-shaped then the result may fail. Indeed, the form
\[
\frac{-y\,dx}{x^2 + y^2} + \frac{x\,dy}{x^2 + y^2}
\]
on $\mathbb{R}^2 \setminus \{(0, 0)\}$ is closed (easily checkable) and not exact (consider the integral
around the unit circle). The formula f (x, y) = arctan(y/x) provides a primitive
valid on a halfplane that can be extended to any domain of the form $\mathbb{R}^2 \setminus R$
where R is a halfline departing from (0, 0) (f (x, y) being a measure of the
angle between (x, y) and R). The characterization of the domains where every
closed form is exact is actually a topological matter; however, the notions and
proofs involved are beyond the scope of these notes.
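The failure of exactness can be seen numerically. The following Python sketch integrates the form above along two circles; the helper names and the choice of the second circle (centre (3, 0), radius 1, which does not surround the origin) are assumptions made for illustration. The first integral should be close to 2π, the second close to 0.

```python
import math

def angle_form(x, y):
    """Coefficients (p, q) of the form (-y dx + x dy) / (x^2 + y^2)."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circle_integral(cx, cy, radius, n=2000):
    """Midpoint-rule line integral of the form along an anticlockwise circle."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        vx = -radius * math.sin(t)   # derivative of the parameterization
        vy = radius * math.cos(t)
        p, q = angle_form(x, y)
        total += (p * vx + q * vy) * dt
    return total

around_origin = circle_integral(0.0, 0.0, 1.0)      # unit circle around the singularity
away_from_origin = circle_integral(3.0, 0.0, 1.0)   # circle inside a halfplane
```

The nonzero value around the origin shows that no primitive exists on the whole punctured plane, while the vanishing integral on the second circle agrees with the existence of a primitive on a halfplane.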
it is piecewise $C^1$. We will start with simpler elementary domains. We say
that a domain is elemental with respect to the X axis if it is bounded by two
vertical lines x = a and x = b and the graphs of two $C^1$ functions $f, g : [a, b] \to \mathbb{R}$
such that f > g. The boundary ∂D of an elemental domain D is supposed
to be oriented anticlockwise, that is, the graph of f is traversed from right to left
and the segment on x = a is traversed downwards, for instance. A domain elemental
with respect to the Y axis is defined similarly, or we can think of what
a domain elemental with respect to the X axis looks like after a rotation of π/2.
Lemma 10.3.1. Let D be an elemental domain with respect to the X axis and
p(x, y) a $C^1$ function defined on a domain containing D. Then
\[
\int_{\partial D} p\,dx = -\iint_D \frac{\partial p}{\partial y}\,dx\,dy.
\]
Proof. Note that the vertical segments do not contribute to the line integral because
the form does not contain dy. The graphs are parameterized by (x, g(x)) and (x, f (x)),
but the latter must be traversed backwards. Thus
\[
\begin{aligned}
\int_{\partial D} p\,dx &= \int_a^b p(x, g(x))\,dx - \int_a^b p(x, f(x))\,dx \\
&= -\int_a^b \big(p(x, f(x)) - p(x, g(x))\big)\,dx = -\int_a^b \left(\int_{g(x)}^{f(x)} \frac{\partial p}{\partial y}(x, y)\,dy\right) dx \\
&= -\iint_D \frac{\partial p}{\partial y}(x, y)\,dx\,dy
\end{aligned}
\]
as wanted.
Lemma 10.3.2. Let D be an elemental domain with respect to the Y axis and
q(x, y) a $C^1$ function defined on a domain containing D. Then
\[
\int_{\partial D} q\,dy = \iint_D \frac{\partial q}{\partial x}\,dx\,dy.
\]
It is easy to prove that a convex domain with $C^1$ boundary is elemental with
respect to both the X and Y axes, and many other domains can be expressed
as a non overlapping union of domains which are elemental with respect to the
two axes. Putting together the previous lemmas and this observation we have
the following.
Proposition 10.3.3 (Green-Riemann, elemental domains). Let D be a domain
which is elemental with respect to both the X and Y axes, and let ω = p dx+q dy
be a 1-form which is $C^1$ on a domain that contains D. Then
\[
\int_{\partial D} \omega = \int_{\partial D} p\,dx + q\,dy = \iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy.
\]
Moreover, the same formula holds if D is a domain that can be decomposed
into a finite non overlapping union of domains with $C^1$ boundaries which
are elemental with respect to both the X and Y axes.
Proof. The first statement is just the sum of the formulas provided by the
lemmas. For the second statement we have only to remark that the double
integrals are additive for non overlapping unions of domains. Namely, if
$D = \bigcup_{k=1}^n D_k$ then
\[
\sum_{k=1}^n \iint_{D_k} \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy = \iint_{\bigcup_{k=1}^n D_k} \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy.
\]
On the other hand, any $C^1$ non trivial (not reduced to a single point) piece
of curve contained in $\bigcup_{k=1}^n \partial D_k \setminus \partial D$ is contained in a shared boundary
$\partial D_i \cap \partial D_j$ where i ≠ j are unique. This curve does not contribute to $\int_{\partial D}$, since
it is traversed in opposite directions when computing $\int_{\partial D_i}$ and $\int_{\partial D_j}$, and the two
contributions cancel. That can be expressed with the “arithmetics of paths”
as $\sum_{k=1}^n \partial D_k = \partial D$. Therefore
\[
\sum_{k=1}^n \int_{\partial D_k} p\,dx + q\,dy = \int_{\sum_{k=1}^n \partial D_k} p\,dx + q\,dy = \int_{\partial D} p\,dx + q\,dy
\]
The previous result casts more light on the relation between closed and exact
1-forms. Indeed, if the 1-form is closed then the function inside the double
integral vanishes, so the line integral along the boundary of any elemental
domain is zero. Any Jordan closed curve is the boundary of a region. However,
even if such a curve is $C^1$, it may be impossible to decompose the region into
finitely many elemental domains. We will improve the previous proposition for
more general domains.
Theorem 10.3.4 (Green-Riemann, general). Let D be a bounded open domain
with $C^1$ boundary and let p dx + q dy be a 1-form which is $C^1$ on a domain
that includes D. Then
\[
\int_{\partial D} p\,dx + q\,dy = \iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy.
\]
Proof. It is enough to prove the result for 1-forms of type p dx, as we have
seen before. Let $F_1 \subset \partial D$ be the finite set of points where the boundary is not
smooth. Let $K \subset \partial D$ be the compact set of the points of the boundary where
the tangent line is vertical. Indeed, K has closed intersection with each
$C^1$ piece γ, namely $\{\gamma(t) : \gamma'(t)\cdot(1, 0) = 0\}$. The orthogonal projection
$K_X$ of K onto the X axis, which on each piece is $\{\gamma(t)\cdot(1, 0) : \gamma'(t)\cdot(1, 0) = 0\}$,
is also compact and has measure 0 by Morse-Sard’s
theorem. The set $K_X$ can be covered with finitely many open intervals whose
total length can be taken smaller than a number δ that we will specify later. Let
$F_2$ denote the finite set of their endpoints. The vertical lines through the points
of $F = F_1 \cup F_2$ define closed strips $S_k$ with k = 1, . . . , n of two types: strips
of type B (bad) if they contain points of ∂D at which the tangent is vertical;
strips of type A (alright) otherwise. Note that at any point of ∂D ∩ S
with S of type A it is possible to express the boundary locally as the graph of
a function y = f (x). Indeed, the points where the implicit function theorem
is not applicable are contained in the strips of type B. Adding a finite set of
points $F_3$ to F we may moreover assume that any connected part of ∂D ∩ S
for S of type A is a graph of the sort y = f (x). Under the hypotheses, the
number of connected parts of ∂D ∩ S must be finite and thus D ∩ S can be
decomposed into finitely many elemental domains. Therefore
\[
\int_{\partial(D\cap S)} p\,dx = -\iint_{D\cap S} \frac{\partial p}{\partial y}\,dx\,dy
\]
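for any S of type A. Now we are going to deal with the strips of type B.
Take ε > 0. Since the set D is bounded and $\frac{\partial p}{\partial y}$ is continuous we may take δ > 0
small enough to guarantee that
\[
\left|\sum_{k\in B} \iint_{D\cap S_k} \frac{\partial p}{\partial y}\,dx\,dy\right| < \varepsilon.
\]
On the other hand, if the covering of $K_X$ by open intervals is tight enough
we may assume that the first component of the derivative ($\gamma'(t)\cdot(1, 0)$ for γ a $C^1$
piece of ∂D) is smaller than ε/length(∂D). That implies
\[
\left|\sum_{k\in B} \int_{\partial(D\cap S_k)} p\,dx\right| < \varepsilon.
\]
Now we have
\[
\left|\sum_{k\in A\cup B} \iint_{D\cap S_k} \frac{\partial p}{\partial y}\,dx\,dy + \sum_{k\in A\cup B} \int_{\partial(D\cap S_k)} p\,dx\right| < 2\varepsilon.
\]
Since the contributions of the vertical segments added by the strips cancel, we
get
\[
\left|\int_{\partial D} p\,dx + \iint_D \frac{\partial p}{\partial y}\,dx\,dy\right| < 2\varepsilon
\]
and, ε > 0 being arbitrary, we arrive at the formula in the statement.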
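The Green-Riemann formula can be tested numerically on a simple domain. The sketch below is an illustration under stated assumptions: the unit square as domain, the coefficients p = xy and q = x², and the function names are hypothetical choices. Both sides should be close to 1/2.

```python
def p(x, y):
    return x * y

def q(x, y):
    return x * x

def dq_dx(x, y):
    return 2 * x

def dp_dy(x, y):
    return x

def boundary_integral(n=1000):
    """Line integral of p dx + q dy along the boundary of [0, 1]^2, anticlockwise."""
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        total += p(t, 0.0) * dt          # bottom edge (t, 0): dx = dt, dy = 0
        total += q(1.0, t) * dt          # right edge (1, t): dx = 0, dy = dt
        total -= p(1.0 - t, 1.0) * dt    # top edge (1 - t, 1): dx = -dt, dy = 0
        total -= q(0.0, 1.0 - t) * dt    # left edge (0, 1 - t): dx = 0, dy = -dt
    return total

def double_integral(n=200):
    """Midpoint-rule approximation of the double integral of dq/dx - dp/dy."""
    d = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * d
        for j in range(n):
            y = (j + 0.5) * d
            total += (dq_dx(x, y) - dp_dy(x, y)) * d * d
    return total

green_lhs = boundary_integral()
green_rhs = double_integral()
```

The square is elemental with respect to both axes, so this instance is already covered by the proposition for elemental domains.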
Let $A_2(X)$ denote the set of all continuous antisymmetric bilinear forms
on X. This space is naturally endowed with the topology of the supremum
norm (on $B_X \times B_X$). In order to keep coherence with the general theory of
alternating forms, which we are not discussing here, $A_0(X)$ will denote the scalars
and $A_1(X) = X^*$. We will introduce the exterior product in a very restricted
version
\[
\wedge : A_1(X) \times A_1(X) \to A_2(X)
\]
defined as (α ∧ β)(x, y) = α(x)β(y) − α(y)β(x) for $\alpha, \beta \in A_1(X)$. There is no
difficulty in checking that ∧ is bilinear and antisymmetric.
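The definition of the exterior product translates directly into code. The following Python sketch is a minimal illustration in R³; the helper names `linear_form` and `wedge` and the test vectors are assumptions introduced for the example.

```python
def linear_form(coeffs):
    """Linear form on R^n with the given coefficients."""
    return lambda x: sum(c * xi for c, xi in zip(coeffs, x))

def wedge(alpha, beta):
    """Exterior product of two linear forms: (x, y) -> alpha(x) beta(y) - alpha(y) beta(x)."""
    return lambda x, y: alpha(x) * beta(y) - alpha(y) * beta(x)

dx = linear_form((1, 0, 0))
dy = linear_form((0, 1, 0))

u = (1.0, 2.0, 3.0)
v = (4.0, 5.0, 6.0)

dx_dy = wedge(dx, dy)
value = dx_dy(u, v)            # 1*5 - 4*2 = -3
swapped = dx_dy(v, u)          # antisymmetry in the arguments
reversed_factors = wedge(dy, dx)(u, v)   # antisymmetry in the factors
self_wedge = wedge(dx, dx)(u, v)         # alpha ^ alpha vanishes
```

Note that (dx ∧ dy)(u, v) is the determinant of the first two coordinates of u and v, which is the reason why areas of projections appear when 2-forms are integrated.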
or alternatively {dy ∧ dz, dz ∧ dx, dx ∧ dy}.
After the algebraic part we are ready for the analytic definition.
Definition 10.4.1. A differential form ω of degree 2 (also called 2-form) is a
function defined on an open subset of X with values in $A_2(X)$.
As in the case of 1-forms we are interested in 2-forms which are at least
continuous, and often differentiable or with further regularity. This regularity
is revealed by the scalar coefficient functions in the case of finite dimension
\[
\omega = \sum_{1\le i<j\le n} f_{ij}\,dx_i \wedge dx_j.
\]
Given a differentiable 1-form $\omega = \sum_{i=1}^n f_i\,dx_i$, its exterior derivative is the
2-form defined as
\[
d\omega = \sum_{1\le i<j\le n} \left(\frac{\partial f_j}{\partial x_i} - \frac{\partial f_i}{\partial x_j}\right) dx_i \wedge dx_j.
\]
\[
d(f\omega) = df \wedge \omega + f\,d\omega
\]
where the properties of the operation ∧ are used when it comes to simplification.
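Note that with the help of exterior differentiation we can rewrite the
condition of closedness for 1-forms: the 1-form ω is closed if and only if dω = 0.
Now we are going to discuss exactness and closedness for 2-forms. Let ω be
a 2-form. We say that ω is exact if there is a 1-form α such that ω = dα. As
in the case of exact 1-forms, the 2-forms which are exact satisfy a sort of identity
with partial derivatives of the coefficients. Firstly, consider an exact 2-form
in $\mathbb{R}^3$ whose primitive $f_1\,dx_1 + f_2\,dx_2 + f_3\,dx_3$ is $C^2$, so the 2-form can be
written as
\[
\left(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3}\right) dx_2 \wedge dx_3
+ \left(\frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1}\right) dx_3 \wedge dx_1
+ \left(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right) dx_1 \wedge dx_2.
\]
Note now that the following identity holds
\[
\frac{\partial}{\partial x_1}\left(\frac{\partial f_3}{\partial x_2} - \frac{\partial f_2}{\partial x_3}\right)
+ \frac{\partial}{\partial x_2}\left(\frac{\partial f_1}{\partial x_3} - \frac{\partial f_3}{\partial x_1}\right)
+ \frac{\partial}{\partial x_3}\left(\frac{\partial f_2}{\partial x_1} - \frac{\partial f_1}{\partial x_2}\right) = 0.
\]
In other words (and standard notation in $\mathbb{R}^3$), if ω = A dy ∧ dz + B dz ∧ dx +
C dx ∧ dy is exact then
\[
\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z} = 0.
\]
The previous equality satisfied by exact 2-forms can be replaced by $\binom{n}{3}$
equalities in dimension n, which are the key to the next definition. We say that
a 2-form
\[
\omega = \sum_{1\le i<j\le n} f_{ij}\,dx_i \wedge dx_j
\]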
The Lemma of Poincaré is true also for 2-forms, that is, closed 2-forms
defined on star-shaped domains are exact. Instead of proving that we will
provide a method to compute primitives of 2-forms in $\mathbb{R}^3$. Consider the form
ω = A dy ∧ dz + B dz ∧ dx + C dx ∧ dy where A, B, C are functions of x, y, z. The
objective is to eliminate z from both the functions and the basis of 2-forms.
Firstly consider a 1-form α = p dx + q dy (p, q are functions of x, y, z). Its
exterior differential is
\[
d\alpha = \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx \wedge dy - \frac{\partial q}{\partial z}\,dy \wedge dz + \frac{\partial p}{\partial z}\,dz \wedge dx.
\]
Now we are going to compute p, q so that ω − dα only contains the dx ∧ dy term.
For that it is necessary that these two equations be fulfilled
\[
A = -\frac{\partial q}{\partial z}; \qquad B = \frac{\partial p}{\partial z}
\]
which is possible by taking partial primitives. From now on the functions
p, q are supposed known. We have
\[
\omega - d\alpha = \left(C - \frac{\partial q}{\partial x} + \frac{\partial p}{\partial y}\right) dx \wedge dy.
\]
We claim that the function between brackets does not depend on z. We will
compute its partial derivative with respect to z
\[
\frac{\partial}{\partial z}\left(C - \frac{\partial q}{\partial x} + \frac{\partial p}{\partial y}\right)
= \frac{\partial C}{\partial z} - \frac{\partial^2 q}{\partial x\,\partial z} + \frac{\partial^2 p}{\partial y\,\partial z}
= \frac{\partial C}{\partial z} + \frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} = 0
\]
where the hypothesis that ω is closed is used for the first time. Once we know
that the function between brackets does not contain z the problem is reduced
to dimension 2, where finding a primitive is not difficult by partial integration,
as we may assume that the primitive is of the form f (x, y) dx (or g(x, y) dy).
Firstly, note that if we interchange the role of the variables (u, v), taking
Γ̃(v, u) = Γ(u, v) defined on D̃ = {(v, u) : (u, v) ∈ D}, then the value of
the integral is multiplied by −1. Indeed, this is a consequence of the
antisymmetry of ω. This phenomenon is analogous to the change of sign
in the integral of 1-forms when the path is traversed backwards. The underlying
principle is that parameterized surfaces (the ones we are considering) can be given
an “orientation” that plays a role similar to the orientation of curves. In the
case of surfaces embedded in $\mathbb{R}^3$, having an orientation simply means
distinguishing between the two “faces” of the surface, as for instance we can distinguish
between up and down when the surface is given as the graph of a function of
two variables.
Another issue we have to deal with is to prove that the integral
of 2-forms does not depend on the particular choice of the parameterization
but only on the shape Γ(D) of the surface together with the orientation, that is,
a statement similar to Proposition 10.2.1.
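Proposition 10.5.1. Let $\Gamma : D \to \mathbb{R}^n$ be a $C^1$ surface with boundary, ω a
continuous 2-form defined on a set containing Γ(D) and $h : \tilde D \to D$ a $C^1$
bijection with positive Jacobian. Then $\tilde\Gamma = \Gamma \circ h : \tilde D \to \mathbb{R}^n$ is a piecewise $C^1$ surface
and
\[
\int_{\tilde\Gamma} \omega = \int_\Gamma \omega
\]
Proof. Writing the change of variables as h(s, t) = (u(s, t), v(s, t)) and
substituting into the expression of the first integral (some variables are omitted
for the sake of readability) we have
\[
\begin{aligned}
&\iint_{\tilde D} \omega(\tilde\Gamma(s,t))\left(\frac{\partial\tilde\Gamma}{\partial s}(s,t), \frac{\partial\tilde\Gamma}{\partial t}(s,t)\right) ds\,dt \\
&\quad= \iint_{\tilde D} \omega(\tilde\Gamma(s,t))\left(\frac{\partial\Gamma}{\partial u}\frac{\partial u}{\partial s} + \frac{\partial\Gamma}{\partial v}\frac{\partial v}{\partial s},\; \frac{\partial\Gamma}{\partial u}\frac{\partial u}{\partial t} + \frac{\partial\Gamma}{\partial v}\frac{\partial v}{\partial t}\right) ds\,dt \\
&\quad= \iint_{\tilde D} \omega(\tilde\Gamma(s,t))\left(\frac{\partial\Gamma}{\partial u}, \frac{\partial\Gamma}{\partial v}\right)\left(\frac{\partial u}{\partial s}\frac{\partial v}{\partial t} - \frac{\partial v}{\partial s}\frac{\partial u}{\partial t}\right) ds\,dt \\
&\quad= \iint_{\tilde D} \omega(\Gamma(u(s,t), v(s,t)))\left(\frac{\partial\Gamma}{\partial u}(u(s,t), v(s,t)), \frac{\partial\Gamma}{\partial v}(u(s,t), v(s,t))\right) \frac{\partial(u,v)}{\partial(s,t)}\,ds\,dt \\
&\quad= \iint_D \omega(\Gamma(u,v))\left(\frac{\partial\Gamma}{\partial u}(u,v), \frac{\partial\Gamma}{\partial v}(u,v)\right) du\,dv
\end{aligned}
\]
where in the first equality we have used the chain rule for derivatives, in the
second equality the bilinearity and antisymmetry of the 2-form, the third equality
just makes explicit the involved variables and finally the last equality is
due to the change of variables formula for the integral (recall that the Jacobian is positive).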
Now we will consider a more general type of surfaces. We say that a connected
set in $\mathbb{R}^n$ is an oriented piecewise $C^1$ surface if it is the union of the
images of finitely many parameterized $C^1$ surfaces with border, those surfaces
can only intersect at points of their borders, the intersection when it happens
is a non trivial curve, and finally the orientations induced by the parameterizations
on each piece are compatible. This is something with a clear meaning
for surfaces in $\mathbb{R}^3$, thinking of orientation with the help of the normal vector
field. In this 3-dimensional setting there is an important example. Assume
that a compact set with nonempty interior has a boundary which is made up
of finitely many parameterized surfaces (with boundary). Then there is a natural
standard orientation: the normal field points to the exterior of the set.
Thus, in case of an oriented piecewise $C^1$ surface $\Gamma = \Gamma_1 + \dots + \Gamma_m$ where $\Gamma_k$
are parameterized $C^1$ pieces we define
\[
\int_\Gamma \omega = \sum_{k=1}^m \int_{\Gamma_k} \omega
\]
for any 2-form defined on a domain containing Γ. It is not difficult to check that
the definition does not depend on how Γ is decomposed into $C^1$ parameterized
pieces. For instance, the sphere needs such a decomposition and it can be done
in infinitely many ways. Moreover, removing one point of the sphere, the
remainder is a parameterized surface, and removing one point does not matter when
it comes to integration.
Proof. The integral of the 2-form on those parts of ∂E contained in the
“cylinder” is null (from the geometrical point of view the field R dx ∧ dy is
vertical while the normal vectors of the cylinder are horizontal). The upper
and lower parts of the boundary are given by the parameterizations $\Gamma_1(x, y) = (x, y, f(x, y))$
and $\Gamma_2(x, y) = (x, y, g(x, y))$ with (x, y) ∈ D, where the second has
to be reversed in order to agree with the orientation (towards the exterior). As
we have
\[
(dx \wedge dy)\left(\frac{\partial\Gamma_1}{\partial x}, \frac{\partial\Gamma_1}{\partial y}\right) = (dx \wedge dy)\left(\frac{\partial\Gamma_2}{\partial x}, \frac{\partial\Gamma_2}{\partial y}\right) = 1
\]
then
\[
\begin{aligned}
\int_{\partial E} R\,dx \wedge dy &= \iint_D R(x, y, f(x, y))\,dx\,dy - \iint_D R(x, y, g(x, y))\,dx\,dy \\
&= \iint_D \big(R(x, y, f(x, y)) - R(x, y, g(x, y))\big)\,dx\,dy = \iint_D \left(\int_{g(x,y)}^{f(x,y)} \frac{\partial R}{\partial z}\,dz\right) dx\,dy \\
&= \iiint_E \frac{\partial R}{\partial z}\,dx\,dy\,dz
\end{aligned}
\]
as wanted.
Since the analogous results are true for elemental domains with respect to
the Y Z and XZ planes we have the following.
Theorem 10.6.2 (Gauss-Ostrogradsky). Let $E \subset \mathbb{R}^3$ be a domain which is
elemental with respect to the three planes XY , Y Z and XZ and let P dy ∧ dz +
Q dz ∧ dx + R dx ∧ dy be a 2-form which is $C^1$ on a domain containing E. Then
\[
\int_{\partial E} P\,dy \wedge dz + Q\,dz \wedge dx + R\,dx \wedge dy = \iiint_E \left(\frac{\partial P}{\partial x} + \frac{\partial Q}{\partial y} + \frac{\partial R}{\partial z}\right) dx\,dy\,dz
\]
Moreover, the same formula holds if E is a domain that can be decomposed
into a finite non overlapping union of domains with $C^1$ boundaries which
are elemental with respect to the three coordinate planes.
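Proof. The result is just the sum of the three equalities
\[
\int_{\partial E} P\,dy \wedge dz = \iiint_E \frac{\partial P}{\partial x}\,dx\,dy\,dz
\]
\[
\int_{\partial E} Q\,dz \wedge dx = \iiint_E \frac{\partial Q}{\partial y}\,dx\,dy\,dz
\]
\[
\int_{\partial E} R\,dx \wedge dy = \iiint_E \frac{\partial R}{\partial z}\,dx\,dy\,dz
\]
where the last one comes from the previous lemma and the other two are
the analogous ones obtained by switching the coordinate planes.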
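A numerical check on the unit cube illustrates the theorem. The following Python sketch is only an example under stated assumptions: the coefficients P = x²y, Q = yz, R = zx and all function names are hypothetical. On each pair of opposite faces the outward orientation gives the difference of the corresponding coefficient between the two faces. Both sides should be close to 3/2.

```python
def P(x, y, z):
    return x * x * y

def Q(x, y, z):
    return y * z

def R(x, y, z):
    return z * x

def divergence(x, y, z):
    # dP/dx + dQ/dy + dR/dz for the coefficients above
    return 2 * x * y + z + x

def midpoints(n):
    return [(k + 0.5) / n for k in range(n)]

def boundary_flux(n=40):
    """Integral of P dy^dz + Q dz^dx + R dx^dy over the boundary of [0, 1]^3."""
    d = 1.0 / n
    pts = midpoints(n)
    total = 0.0
    for s in pts:
        for t in pts:
            total += (P(1.0, s, t) - P(0.0, s, t)) * d * d   # faces x = 1 and x = 0
            total += (Q(s, 1.0, t) - Q(s, 0.0, t)) * d * d   # faces y = 1 and y = 0
            total += (R(s, t, 1.0) - R(s, t, 0.0)) * d * d   # faces z = 1 and z = 0
    return total

def volume_integral(n=40):
    """Midpoint-rule approximation of the triple integral of the divergence."""
    d = 1.0 / n
    pts = midpoints(n)
    total = 0.0
    for x in pts:
        for y in pts:
            for z in pts:
                total += divergence(x, y, z) * d ** 3
    return total

gauss_lhs = boundary_flux()
gauss_rhs = volume_integral()
```

The cube is elemental with respect to the three coordinate planes, so it falls under the first statement of the theorem.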
Remark 10.6.3. The theorem of Gauss-Ostrogradsky can be proved with a
degree of generality similar to that of the Green-Riemann theorem, but the extra
work is not worth the effort.
Now we will obtain a result which relates the integral of a 1-form along
the relative boundary (the “free points” of the boundaries of the pieces) of an
oriented piecewise $C^1$ surface and the integral of its exterior differential over
that surface. Firstly, given an oriented piecewise $C^1$ surface S we have to assign
an orientation to the relative boundary ∂S. That will be the anticlockwise
orientation when we look at the surface from “above”, that is, from the side
the normal vectors point towards. We will say that a piece of the surface is
flat with respect to the plane XY if it can be represented as the graph of a
function z = f (x, y).
Theorem 10.6.4 (Stokes). Let S be an oriented piecewise $C^2$ surface
and let ω be a 1-form which is $C^1$ on a domain containing S. Then
\[
\int_{\partial S} \omega = \int_S d\omega.
\]
Proof. Decomposing the surface into $C^2$ pieces, we just have to prove the
result for each piece, which is a $C^2$ surface with boundary. Indeed, the surface
integrals are additive and the integrals of the 1-form cancel on the shared
parts of the relative boundary. With the help of the implicit function theorem
we may decompose the surface into smaller flat pieces (remember that the
function representing the surface can be extended smoothly beyond its domain).
Therefore we may assume that S is $C^2$ and flat with respect to XY (the other
two cases are handled likewise). Assume now that S is represented
as z = f (x, y) with (x, y) ∈ D, a domain with $C^1$ boundary. In order to
prove the result we are going to develop both members of the equality. Put
ω = P dx + Q dy + R dz and let (X(t), Y (t)) with t ∈ [a, b] be a parameterization of
the border of D. Thus we have
\[
\begin{aligned}
\int_{\partial S} \omega &= \int_a^b \left[P(\cdot)X'(t) + Q(\cdot)Y'(t) + R(\cdot)\left(\frac{\partial f}{\partial x}(\cdot\cdot)X'(t) + \frac{\partial f}{\partial y}(\cdot\cdot)Y'(t)\right)\right] dt \\
&= \int_a^b \left[\left(P(\cdot) + R(\cdot)\frac{\partial f}{\partial x}(\cdot\cdot)\right)X'(t) + \left(Q(\cdot) + R(\cdot)\frac{\partial f}{\partial y}(\cdot\cdot)\right)Y'(t)\right] dt \\
&= \int_a^b \big[p(\cdot\cdot)X'(t) + q(\cdot\cdot)Y'(t)\big]\,dt = \int_{\partial D} p\,dx + q\,dy
\end{aligned}
\]
where (·) = (X(t), Y (t), f (X(t), Y (t))), (··) = (X(t), Y (t)) and
\[
p(x, y) = P(x, y, f(x, y)) + R(x, y, f(x, y))\,\frac{\partial f}{\partial x}(x, y),
\]
\[
q(x, y) = Q(x, y, f(x, y)) + R(x, y, f(x, y))\,\frac{\partial f}{\partial y}(x, y).
\]
Therefore
\[
\int_{\partial S} \omega = \int_{\partial D} p\,dx + q\,dy = \iint_D \left(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\right) dx\,dy
\]
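where the last equality is thanks to the Green-Riemann formula. Now we have
to compute
\[
\frac{\partial p}{\partial y} = \frac{\partial P}{\partial y} + \frac{\partial P}{\partial z}\frac{\partial f}{\partial y} + \left(\frac{\partial R}{\partial y} + \frac{\partial R}{\partial z}\frac{\partial f}{\partial y}\right)\frac{\partial f}{\partial x} + R\,\frac{\partial^2 f}{\partial x\,\partial y}
\]
\[
\frac{\partial q}{\partial x} = \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial f}{\partial x} + \left(\frac{\partial R}{\partial x} + \frac{\partial R}{\partial z}\frac{\partial f}{\partial x}\right)\frac{\partial f}{\partial y} + R\,\frac{\partial^2 f}{\partial y\,\partial x}
\]
where the variables have been removed for the sake of better readability. The terms
containing $\frac{\partial R}{\partial z}$ and the second derivatives of f cancel in the difference. Therefore
\[
\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = \frac{\partial Q}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial f}{\partial x} + \frac{\partial R}{\partial x}\frac{\partial f}{\partial y} - \frac{\partial P}{\partial y} - \frac{\partial P}{\partial z}\frac{\partial f}{\partial y} - \frac{\partial R}{\partial y}\frac{\partial f}{\partial x} = (*)
\]
Now we are going to compute the surface integral of the statement. Firstly
\[
d\omega = \left(\frac{\partial R}{\partial y} - \frac{\partial Q}{\partial z}\right) dy \wedge dz + \left(\frac{\partial P}{\partial z} - \frac{\partial R}{\partial x}\right) dz \wedge dx + \left(\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y}\right) dx \wedge dy
\]
that implies
\[
d\omega(U, V) = -\frac{\partial R}{\partial y}\frac{\partial f}{\partial x} + \frac{\partial Q}{\partial z}\frac{\partial f}{\partial x} - \frac{\partial P}{\partial z}\frac{\partial f}{\partial y} + \frac{\partial R}{\partial x}\frac{\partial f}{\partial y} + \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = (**)
\]
where $U = (1, 0, \frac{\partial f}{\partial x})$ and $V = (0, 1, \frac{\partial f}{\partial y})$. The equality (∗) = (∗∗) completes the
proof of the theorem.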
195
Remark 10.6.5. The hypothesis $C^2$ in the last theorem contrasts with the $C^1$
assumption in previous results. This is a consequence of the chosen method of
proof, and the result can be proved under more relaxed hypotheses.
Many of the previous results can be expressed in terms of the relation
between the integral of a $(k-1)$-form on the $(k-1)$-dimensional smooth boundary
of a $k$-dimensional object and the integral of its exterior differential, which is a
$k$-form, on the object itself. Now we are ready for this new point of view:
1. Let $\gamma : [a, b] \to \mathbb{R}^n$ be a parameterized piecewise $C^1$ curve (injective). Its
relative boundary is $\partial\gamma = \{\gamma(a), \gamma(b)\}$. The orientation of $\gamma$ induces an
orientation on that two-point set, that is, a distinction. Given a 0-form,
that is, a scalar function $f$, we define $\int_{\partial\gamma} f = f(\gamma(b)) - f(\gamma(a))$. With
this notation Proposition 10.2.3 becomes
\[
\int_\gamma df = \int_{\partial\gamma} f.
\]
4. And, of course, Stokes Theorem 10.6.4 itself follows the same scheme.
The development of the theory is quite standard. Only two comments: the
topological facts are not much stressed (no mention of simply connected
domains, homotopy or homology), and the proof of the Green-Riemann
formula is a real one, meaning that it is not based on an a priori existence of a
nice decomposition of the domain. Such a struggle is not repeated for the $\mathbb{R}^3$
theorems, though.
10.8 Exercises
1. Calculate the integral
\[
\int_\gamma y\,dx - x\,dy
\]
where $\gamma$ is the triangle with vertices $(0, 0)$, $(1, 0)$ and $(0, 1)$, oriented in that
order.
being γ e the triangle with vertices (a, 0, 0), (0, b, 0) and (0, 0, c) with
a, b, c > 0 orientated likewise.
3. Calculate the integral
\[
\int_\gamma z^2\,dx + x^2\,dy + y^2\,dz,
\]
where $\gamma$ is the spherical triangle with vertices $(a, 0, 0)$, $(0, a, 0)$ and $(0, 0, a)$,
on the sphere centred at $(0, 0, 0)$ with radius $a > 0$.
4. Consider the differential form
\[
\omega(x, y, z) = \frac{2x}{z}\,dx + \frac{2y}{z}\,dy + \Big( 1 - \frac{x^2 + y^2}{z^2} \Big)\,dz,
\]
defined on the set
\[
A = \{(x, y, z) \in \mathbb{R}^3 : z \neq 0\}.
\]
Show that it is exact and find all its primitives.
5. Find all the functions $\varphi, \psi \in C^1(\mathbb{R})$ with $\psi(0) = 0$ such that the
differential form
\[
\omega(x, y, z) = (z + z^2)\,dx + \varphi(y)\,\psi(z)\,dy + \big( x + 2z(x + y^2) \big)\,dz,
\]
7. Calculate
\[
\iint_\phi z\,dx \wedge dy,
\]
where $\phi$ is the parameterized surface
\[
\{\phi(u, v) := (u + v,\ u^2 + v^2,\ u - v) : u, v \in [-1, 1]\}.
\]
8. Prove that on an oriented surface $M$ there is a 2-form $\omega$ such that, for
every surface with boundary $N \subset M$, the area of $N$ is the absolute
value of $\int_N \omega$.
9. Given the 1-form on R3
df ∧ dg = λ dx ∧ dy
16. Given the 1-form
ω(x, y, z) = y dx − x dy + dz,
ω = f (x, y) dx ∧ dy + x2 y dy ∧ dz − xy 2 dz ∧ dx.
Chapter 11
There is an obvious analogy with complex numbers. The term on the numer-
ator is called conjugate and the real number on the denominator is the square
of the modulus. As it happens with the modulus of complex numbers, the
modulus is multiplicative.
Now we will consider quaternions with real part zero, which are called
purely imaginary and can be identified with the vectors of $\mathbb{R}^3$. The product of two purely
imaginary quaternions is not purely imaginary in general:
\[
(x_1 i + y_1 j + z_1 k)(x_2 i + y_2 j + z_2 k) =
-(x_1 x_2 + y_1 y_2 + z_1 z_2) + (y_1 z_2 - z_1 y_2)\,i + (z_1 x_2 - x_1 z_2)\,j + (x_1 y_2 - y_1 x_2)\,k
\]
The scalar part of the result, after changing the sign, can be identified with the
Euclidean scalar product of vectors. The vector part of the product is called
the vector product. If $u, v$ are quaternions with null scalar part, their product
can be represented as
\[
uv = -u \cdot v + u \times v,
\]
$u \times v$ being the vector product. Since the modulus is multiplicative we have
\[
|u|^2 |v|^2 = |u \cdot v|^2 + |u \times v|^2
\]
It is not difficult to check that
u · (u × v) = v · (v × u) = 0
which means that u × v is orthogonal to both u and v, so its direction is well
determined in R3 if u and v are independent. We also have a consequence of
the non commutativity: u × v = −v × u.
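The identities above can be checked numerically. The following is a minimal sketch in Python, not part of the original text: the helper names `quat_mul`, `dot` and `cross` and the sample vectors are hypothetical choices made for illustration. Quaternions are stored as tuples (a, b, c, d) standing for a + bi + cj + dk.

```python
import math

def quat_mul(p, q):
    """Hamilton product of two quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def dot(u, v):
    """Euclidean scalar product of two vectors of R^3."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Vector product of two vectors of R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

# Two arbitrary sample vectors, regarded as purely imaginary quaternions.
u = (1.0, 2.0, 3.0)
v = (-2.0, 0.5, 4.0)
prod = quat_mul((0.0, *u), (0.0, *v))

# Scalar part is -u.v and vector part is u x v.
print(prod, -dot(u, v), cross(u, v))
# |u|^2 |v|^2 versus |u.v|^2 + |u x v|^2.
print(dot(u, u) * dot(v, v), dot(u, v) ** 2 + dot(cross(u, v), cross(u, v)))
```

The same helpers can be used to confirm the orthogonality relations and the anticommutativity stated above for any other pair of vectors.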
Some time after the discovery of the quaternions it became clear that, in
order to deal with the Euclidean geometry of $\mathbb{R}^3$, the full
power of their algebraic structure is not necessary. We can work more easily in that frame
just keeping the scalar and vector products once we know their properties.
Let us note that the easiest method to compute the vector product without
appealing to quaternions is the following symbolic determinant
\[
u \times v = \begin{vmatrix} i & j & k \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}
\]
where $u = (u_1, u_2, u_3)$ and $v = (v_1, v_2, v_3)$. Since most of the vectors will be
crowned with a little arrow in the following sections, we will denote from now
on the basis of $\mathbb{R}^3$ derived from the quaternions by $\{\vec{i}, \vec{j}, \vec{k}\}$.
11.2 Differential forms on R3
Throughout this chapter we will consider scalar and vector fields in $\mathbb{R}^3$. These are
simply real functions and functions with values in $\mathbb{R}^3$ defined on some open
domain of $\mathbb{R}^3$, often the whole space. We will follow this terminology (fields)
in order to stress the different nature of the domain, which is made
of points, and the range, which can be made of either numbers or vectors. As
far as vector fields are concerned, it is worth noticing that they can be interpreted both
as differential 1-forms and as 2-forms. Firstly we will establish the identification
between the vectors of $\mathbb{R}^3$ and the alternate forms of degrees 1 and 2 on $\mathbb{R}^3$,
whose respective spaces have dimension 3. Such an identification can be
done with the help of bases, namely
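\[
(a\,dy \wedge dz + b\,dz \wedge dx + c\,dx \wedge dy)(\vec{u}, \vec{v}) =
(a\vec{i} + b\vec{j} + c\vec{k}) \cdot (\vec{u} \times \vec{v}) =
\begin{vmatrix} a & b & c \\ u_1 & u_2 & u_3 \\ v_1 & v_2 & v_3 \end{vmatrix}.
\]
The proof of these equalities can be reduced to checking them on pairs or triplets
of basic vectors thanks to the linearity. For instance, for the second one on the
triplet $(\vec{i}, \vec{j}, \vec{k})$ we have $(dy \wedge dz)(\vec{j}, \vec{k}) = 1$ and $\vec{i} \cdot (\vec{j} \times \vec{k}) = \vec{i} \cdot \vec{i} = 1$.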
1 or 2 defined on a domain of $\mathbb{R}^3$ can be transformed into a vector field $\vec{F}$. If
$\gamma(t)$ is a parameterized curve we have
\[
\int_\gamma \omega = \int_a^b \omega(\gamma(t))(\gamma'(t))\,dt = \int_a^b \vec{F}(\gamma(t)) \cdot \gamma'(t)\,dt = \int_\gamma \vec{F} \cdot d\vec{\ell}.
\]
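Under some regularity hypotheses, there are some operations involving
differentiation that can be performed on scalar and vector fields. These
operations can be labelled with the help of a symbolic operator named nabla
\[
\nabla = \Big( \frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z} \Big) = \frac{\partial}{\partial x}\,\vec{i} + \frac{\partial}{\partial y}\,\vec{j} + \frac{\partial}{\partial z}\,\vec{k}.
\]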
The first operation we will consider is well known: the gradient. Let us
recall that the gradient of a scalar field (function) $f$ is the vector field defined by
\[
\nabla f = \Big( \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}, \frac{\partial f}{\partial z} \Big).
\]
Despite the fact that the gradient is defined in terms of the associated cartesian
coordinates, it has an intrinsic meaning. Indeed, its modulus is the maximum
value of the directional derivative over all the vectors of norm one and, provided
it is not zero, the gradient points in the maximizing direction.
Given a vector field $\vec{F} = (f_1, f_2, f_3)$, its divergence is the scalar field defined
by
\[
\nabla \cdot \vec{F} = \frac{\partial f_1}{\partial x} + \frac{\partial f_2}{\partial y} + \frac{\partial f_3}{\partial z}.
\]
It is not obvious from this definition that the divergence is an intrinsic notion.
That can be deduced by alternative methods, such as a direct computation. It is
easier to remark that if we identify the vector field with a (differential) 2-form,
its divergence can be identified with its exterior differential. Therefore, if we
know that the exterior differential is an intrinsic notion independent of the
coordinate system then the same is true for the divergence. The third method
we will give also provides an interpretation of the divergence. The Gauss-Ostrogradsky
theorem says, with this notation, that if $D \subset \mathbb{R}^3$ is a bounded
open domain and $\vec{F}$ is $C^1$ on a domain which includes $D$ then
\[
\iint_{\partial D} \vec{F} \cdot d\vec{S} = \iiint_D \nabla \cdot \vec{F}\,dV.
\]
For that reason the divergence can be interpreted as the net rate of the flux
leaving/entering a small volume around the point:
\[
\nabla \cdot \vec{F}(x) = \lim_{\varepsilon \to 0^+} \frac{\iint_{\partial B(x,\varepsilon)} \vec{F} \cdot d\vec{S}}{\mathrm{Vol}(B(x, \varepsilon))}.
\]
Suppose that the vector field represents the velocity of a fluid. If the fluid is
incompressible, the net rate of the flux is 0, as the amount of
fluid entering the ball equals the amount leaving it, so the divergence is 0. If the
fluid is not incompressible then the divergence represents variations in density at
that point. In other interpretations of vector fields the divergence represents
a magnitude related to the field that is created/destroyed at the point. For
instance, the divergence of the static electric force field represents the charge
per unit volume.
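The interpretation of the divergence as a flux per unit volume can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it uses a small cube instead of the ball $B(x, \varepsilon)$ of the text, because the flux through plane faces is easier to approximate, and the field `field`, the point `p` and the function names are hypothetical choices.

```python
import math

def divergence_fd(F, p, h=1e-5):
    """Central-difference approximation of the divergence of F at the point p."""
    total = 0.0
    for i in range(3):
        plus, minus = list(p), list(p)
        plus[i] += h
        minus[i] -= h
        total += (F(*plus)[i] - F(*minus)[i]) / (2 * h)
    return total

def flux_per_volume(F, p, eps=1e-2, m=20):
    """Outward flux of F through the cube of half-side eps centred at p, over its volume."""
    step = 2 * eps / m
    flux = 0.0
    for i in range(3):                      # axis normal to a pair of opposite faces
        j, k = (i + 1) % 3, (i + 2) % 3
        for a in range(m):
            for b in range(m):
                q = list(p)
                q[j] = p[j] - eps + (a + 0.5) * step
                q[k] = p[k] - eps + (b + 0.5) * step
                q[i] = p[i] + eps
                outward = F(*q)[i]          # face with outward normal +e_i
                q[i] = p[i] - eps
                inward = F(*q)[i]           # face with outward normal -e_i
                flux += (outward - inward) * step * step
    return flux / (2 * eps) ** 3

def field(x, y, z):
    """Sample field with divergence 2xy + z + sin(x)."""
    return (x * x * y, y * z, math.sin(x) * z)

p = (1.0, 2.0, 0.5)
exact = 2 * p[0] * p[1] + p[2] + math.sin(p[0])
print(exact, divergence_fd(field, p), flux_per_volume(field, p))
```

Shrinking `eps` makes the flux quotient approach the pointwise divergence, in agreement with the limit formula above.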
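Given a vector field $\vec{F} = (f_1, f_2, f_3)$, its rotational or curl is the vector field
defined by
\[
\nabla \times \vec{F} = \Big( \frac{\partial f_3}{\partial y} - \frac{\partial f_2}{\partial z},\ \frac{\partial f_1}{\partial z} - \frac{\partial f_3}{\partial x},\ \frac{\partial f_2}{\partial x} - \frac{\partial f_1}{\partial y} \Big).
\]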
Note that if we identify the vector field $\vec{F}$ with a 1-form $\omega$ then $\nabla \times \vec{F}$ can be
identified with $d\omega$. That shows that the definition of the rotational is intrinsic.
In order to have an interpretation we have to appeal to Stokes' theorem, which
can be rewritten in those terms. Let $\Gamma$ be an oriented parameterized $C^2$ surface
with boundary in $\mathbb{R}^3$; then
\[
\int_{\partial\Gamma} \vec{F} \cdot d\vec{\ell} = \iint_\Gamma \nabla \times \vec{F} \cdot d\vec{S}.
\]
Now we will consider an operator that involves second order derivatives, the
Laplacian, which is actually a combination of gradient and divergence. Given
a twice differentiable scalar field $f$ take
\[
\Delta f = \nabla \cdot (\nabla f) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}.
\]
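The Laplacian is also represented as $\nabla^2 f$. The intrinsicness of the Laplacian is
clear; however, the combination of the interpretations of the gradient and the
divergence does not cast light on what the Laplacian means. For that reason
we are going to build a direct one. Assume, for simplicity, that $f$ has a Taylor development
of second order around $0 = (0, 0, 0)$ of the form
\[
\begin{aligned}
f(x, y, z) = {} & f(0) + \frac{\partial f}{\partial x}(0)\,x + \frac{\partial f}{\partial y}(0)\,y + \frac{\partial f}{\partial z}(0)\,z \\
& + \frac{1}{2}\Big( \frac{\partial^2 f}{\partial x^2}(0)\,x^2 + \frac{\partial^2 f}{\partial y^2}(0)\,y^2 + \frac{\partial^2 f}{\partial z^2}(0)\,z^2 \\
& \qquad + 2\frac{\partial^2 f}{\partial x\,\partial y}(0)\,xy + 2\frac{\partial^2 f}{\partial x\,\partial z}(0)\,xz + 2\frac{\partial^2 f}{\partial y\,\partial z}(0)\,yz \Big) + o(\|(x, y, z)\|^2).
\end{aligned}
\]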
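The integration over the sphere $\partial B(0, \varepsilon)$ leads to the cancellation of the first
order terms and the mixed ones (those containing $xy$, $xz$ and $yz$), so it remains
\[
\iint_{\partial B(0,\varepsilon)} f\,dS = 4\pi\varepsilon^2 f(0) + \frac{4\pi\varepsilon^4}{6}\,\Delta f(0) + o(\varepsilon^4).
\]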
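The trick to compute the integrals easily was the following: by symmetry we
obviously have
\[
\iint_{\partial B(0,\varepsilon)} x^2\,dS = \iint_{\partial B(0,\varepsilon)} y^2\,dS = \iint_{\partial B(0,\varepsilon)} z^2\,dS
\]
but
\[
\iint_{\partial B(0,\varepsilon)} x^2\,dS + \iint_{\partial B(0,\varepsilon)} y^2\,dS + \iint_{\partial B(0,\varepsilon)} z^2\,dS = \iint_{\partial B(0,\varepsilon)} \varepsilon^2\,dS = 4\pi\varepsilon^4.
\]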
Therefore the average of $f(x, y, z) - f(0, 0, 0)$ over the sphere for $\varepsilon > 0$ small
is
\[
\frac{1}{4\pi\varepsilon^2} \iint_{\partial B(0,\varepsilon)} (f - f(0))\,dS = \frac{\varepsilon^2}{6}\,\Delta f(0) + o(\varepsilon^2).
\]
In other words, the Laplacian measures the difference between the value of the
function at a point and its average around the point. The functions whose
Laplacian is null are called harmonic and we will see later that their value at a
given point is actually the average of the values on spheres centred at that point.
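The averaging formula for the Laplacian can be tested numerically. The following sketch is an illustration only: the test function `f`, the radius `eps` and the midpoint quadrature in `sphere_average` are hypothetical choices, and the sphere is parameterized by longitude and latitude.

```python
import math

def sphere_average(g, eps, n=100):
    """Average of g over the sphere of radius eps centred at 0 (midpoint rule)."""
    d_theta = 2 * math.pi / (2 * n)
    d_phi = math.pi / n
    total = 0.0
    for i in range(2 * n):
        theta = (i + 0.5) * d_theta
        for j in range(n):
            phi = -math.pi / 2 + (j + 0.5) * d_phi   # latitude
            x = eps * math.cos(theta) * math.cos(phi)
            y = eps * math.sin(theta) * math.cos(phi)
            z = eps * math.sin(phi)
            total += g(x, y, z) * math.cos(phi)
    # The area element is eps^2 cos(phi) dtheta dphi and the total area is 4 pi eps^2.
    return total * d_theta * d_phi / (4 * math.pi)

def f(x, y, z):
    """Sample function whose Laplacian at the origin equals 5 + 2 = 7."""
    return math.exp(x + 2 * y) + z ** 2

eps = 0.1
laplacian_at_0 = 7.0
average = sphere_average(lambda x, y, z: f(x, y, z) - f(0, 0, 0), eps)
predicted = eps ** 2 / 6 * laplacian_at_0
print(average, predicted)
```

The two printed numbers should agree up to the $o(\varepsilon^2)$ remainder, and the agreement improves as `eps` decreases.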
The last operator we will consider is the Laplacian of a vector field, which
appears in applications to Electromagnetism. If $\vec{F} = (f_1, f_2, f_3)$ then we define
\[
\Delta \vec{F} = (\Delta f_1, \Delta f_2, \Delta f_3).
\]
The fact that this definition is intrinsic is a consequence of the following identity,
whose proof is left to the reader:
\[
\Delta \vec{A} = \nabla(\nabla \cdot \vec{A}) - \nabla \times (\nabla \times \vec{A}).
\]
Note that this function is harmonic except at the singularities $\vec{r}_k$. If we
think of the potential produced by many small charges we arrive naturally
at a generalization of the potential with the help of integration. Let $\mu$ be
a finite signed $\sigma$-additive Borel measure with compact support. Under these
assumptions the potential can be defined for points $\vec{p}$ outside the support of $\mu$ as
\[
\Phi(\vec{p}) = \int \frac{d\mu(\vec{r})}{\|\vec{r} - \vec{p}\|}.
\]
This function is harmonic on the complement of the support of the measure
as the interchange of derivation and integration is not a problem in absence
of singularities. However, the potential Φ could be defined at more points if
the integral is convergent, although we cannot say anything of the regularity
of Φ unless we make some assumptions on the measure µ. As we will see later,
for measures compactly supported which are continuous with respect to the
Lebesgue measure (continuous densities from now on appealing to the physical
origin) the potential Φ is defined everywhere. For not compactly supported
measures, if we impose special decay conditions to µ at infinity we may even
have the potential defined everywhere. Nevertheless some special cases are
treated by analogy with physical situations.
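\[
\|\vec{r} - \vec{p}\|^2 = R^2\cos^2\theta\cos^2\phi + R^2\sin^2\theta\cos^2\phi + (R\sin\phi - z_0)^2
\]
The potential at $\vec{p}$ is given by
\[
\begin{aligned}
\Phi(\vec{p}) &= \int_0^{2\pi}\!\!\int_{-\pi/2}^{\pi/2} \frac{\rho R^2\cos\phi\,d\theta\,d\phi}{\sqrt{R^2 - 2Rz_0\sin\phi + z_0^2}} \\
&= 2\pi\int_{-\pi/2}^{\pi/2} \frac{\rho R^2\cos\phi\,d\phi}{\sqrt{R^2 - 2Rz_0\sin\phi + z_0^2}}
= -\frac{2\pi\rho R}{z_0}\Big[ \sqrt{R^2 - 2Rz_0\sin\phi + z_0^2} \Big]_{-\pi/2}^{\pi/2}
\end{aligned}
\]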
Note that the integral converges at the singular points (the sphere of radius R)
making the potential continuous on all R3 , although not differentiable. More-
over, for exterior points the potential of the charged sphere behaves as if all
the charge (4πR2 ρ) was concentrated at the origin.
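\[
\Phi_s(\vec{p}) =
\begin{cases}
\dfrac{4\pi s^2\rho}{\|\vec{p}\|}\,\delta s & \text{if } 0 \le s \le \min\{\|\vec{p}\|, R\}; \\[2mm]
4\pi s\rho\,\delta s & \text{if } \min\{\|\vec{p}\|, R\} < s \le R.
\end{cases}
\]
In case $\|\vec{p}\| \ge R$ only the first formula is necessary and therefore
\[
\Phi(\vec{p}) = \int_0^R \frac{4\pi s^2\rho}{\|\vec{p}\|}\,ds = \frac{4\pi R^3\rho}{3\|\vec{p}\|}
\]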
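which means that the homogeneously charged ball behaves like a point
charge concentrated at the origin. In case $\|\vec{p}\| < R$ we have
\[
\begin{aligned}
\Phi(\vec{p}) = \int_0^R \Phi_s(\vec{p})\,ds &= \int_0^{\|\vec{p}\|} \frac{4\pi s^2\rho}{\|\vec{p}\|}\,ds + \int_{\|\vec{p}\|}^R 4\pi s\rho\,ds \\
&= \frac{4\pi\|\vec{p}\|^3\rho}{3\|\vec{p}\|} + 2\pi R^2\rho - 2\pi\|\vec{p}\|^2\rho = 2\pi R^2\rho - \frac{2\pi}{3}\|\vec{p}\|^2\rho.
\end{aligned}
\]
Putting both expressions together we have
\[
\Phi(\vec{p}) =
\begin{cases}
2\pi R^2\rho - \dfrac{2\pi\|\vec{p}\|^2\rho}{3} & \text{if } \|\vec{p}\| < R; \\[2mm]
\dfrac{4\pi R^3\rho}{3\|\vec{p}\|} & \text{if } \|\vec{p}\| \ge R.
\end{cases}
\]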
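The closed formulas for the charged sphere and the charged ball can be compared with a direct numerical integration. The sketch below is an illustration under stated assumptions: the function names, the midpoint quadrature and the sample values of `R`, `z0` and `rho` are hypothetical, and the point is taken on the positive $z$ axis as in the computation above.

```python
import math

def shell_potential(radius, z0, rho, n=200):
    """Potential at (0, 0, z0) of a sphere of the given radius with surface density rho."""
    h = math.pi / n
    total = 0.0
    for j in range(n):
        phi = -math.pi / 2 + (j + 0.5) * h          # latitude, midpoint rule
        dist = math.sqrt(radius ** 2 - 2 * radius * z0 * math.sin(phi) + z0 ** 2)
        total += math.cos(phi) / dist
    return 2 * math.pi * rho * radius ** 2 * total * h

def ball_potential(R, z0, rho, n=200):
    """Potential at (0, 0, z0) of a ball of radius R, adding up thin shells."""
    h = R / n
    return sum(shell_potential((k + 0.5) * h, z0, rho, n) for k in range(n)) * h

def ball_closed_form(R, dist, rho):
    """Piecewise formula of the text for a point at distance dist from the centre."""
    if dist < R:
        return 2 * math.pi * R ** 2 * rho - 2 * math.pi * dist ** 2 * rho / 3
    return 4 * math.pi * R ** 3 * rho / (3 * dist)

R, rho = 1.0, 1.0
for z0 in (0.5, 2.0):                               # one interior and one exterior point
    print(z0, ball_potential(R, z0, rho), ball_closed_form(R, z0, rho))
```

For a single sphere, `shell_potential(R, z0, rho)` can be compared in the same way with $4\pi R^2\rho/z_0$ outside and $4\pi R\rho$ inside.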
\[
\Phi(\vec{p}) = \iiint \frac{\rho(\vec{r})\,dV}{\|\vec{r} - \vec{p}\|}
\]
\[
\iiint \frac{|\rho(\vec{r})|\,dV}{\|\vec{r} - \vec{p}\|^2} = \iiint |\rho(\vec{r})|\,\big\|\nabla(\|\vec{r} - \vec{p}\|^{-1})\big\|\,dV
\]
In order to have second order derivatives we need to ask for some regularity of $\rho$. If
$\rho$ were differentiable at some point $\vec{p}$ then it would be possible to compensate
the growth of rate $\varepsilon^{-3}$ near the singularity with the “balanced” difference
$\rho(\vec{r}) - \rho(\vec{p})$, implying that we may replace $\rho$ locally by the constant value
$\rho(\vec{p})$. This is a delicate task that we are not going to detail here; however, the
interpretation is easy: the Laplacian of $\Phi$ at $\vec{p}$ can be calculated by decomposing
the charge into two parts: the part inside a small ball of radius $\varepsilon$, where we may
assume that $\rho$ is constant, and the charge outside the small ball, which produces
a potential whose Laplacian is 0 at $\vec{p}$ since this point is not in the support.
The consequence of that argument, provided the extra regularity of $\rho$, is the
possibility of recovering $\rho$ from the potential it generates:
\[
\Delta\Phi(\vec{p}) = -4\pi\rho(\vec{p}).
\]
This remarkable formula is known as Poisson’s equation. It is natural to think
about the feasibility of the inverse problem: given a function $f$ defined on a domain
$D$ with some regularity hypotheses, is it possible to express $f$ as a potential? In
general the answer is negative. Indeed, the candidate for the
charge is evidently
\[
\rho = \frac{-1}{4\pi}\,\Delta f.
\]
If $f$ is $C^3$ and $D$ is bounded then the potential
\[
\Phi(\vec{p}) = \frac{-1}{4\pi} \iiint_D \frac{\Delta f(\vec{r})\,dV}{\|\vec{r} - \vec{p}\|}
\]
is a function such that $\Delta\Phi = \Delta f$ on $D$, but $\Phi \neq f$ in general. For instance,
if $f(x, y, z) = x^2 + y^2 + z^2$ and $D = B(0, 1)$ the integral formula will produce
the potential of a homogeneously charged ball (as above in this section), which
differs from $f$ by a constant. In general, the difference $\Phi - f$ will be a harmonic
function on $D$. If $f$ and its derivatives satisfy some particular decay conditions
we could enforce the equality as a consequence of the properties of harmonic
functions we are going to study in the next section.
The previous arguments have a nice application. Every $C^2$ vector field can
be decomposed locally as the sum of the gradient of a scalar function and the
rotational of a vector field. Indeed, given $\vec{F}$ take $\rho = \nabla \cdot \vec{F}$. If $D \subset \mathbb{R}^3$ is a
ball (or more generally, a bounded star shaped domain) then we may consider
the potential $\Phi$ generated by the density $\rho$ on $D$ and take $f = -(4\pi)^{-1}\Phi$. Now
we have
\[
\nabla \cdot (\vec{F} - \nabla f) = \rho - \rho = 0.
\]
Therefore, the field $\vec{F} - \nabla f$ is closed when regarded as a 2-form. Then there exists
a primitive $\vec{G}$ on $D$ such that $\nabla \times \vec{G} = \vec{F} - \nabla f$ and thus
\[
\vec{F} = \nabla f + \nabla \times \vec{G}
\]
as we wanted. It is worth noticing that the decomposition above does not make
any sense from the point of view given by the theory of differential forms.
Indeed, we are adding a 1-form and a 2-form.
and the term $\nabla f \cdot \vec{N}$ can be interpreted as a normal derivative, usually denoted
by $\frac{\partial f}{\partial n}$ (in this particular case it is a radial derivative $\frac{\partial f}{\partial r}$). We may parameterize
the sphere $\partial B[\vec{p}, r]$ by means of the unit sphere $\partial B[0, 1]$ as $\vec{p} + r\vec{x}$. In such
a case we have $\vec{x} = \vec{N}$ as well. Since the areas of the spheres differ by a factor $r^2$
we have
\[
\iint_{\partial B[0,1]} \nabla f(\vec{p} + r\vec{x}) \cdot \vec{x}\,dS(\vec{x}) = \frac{1}{r^2} \iint_{\partial B[\vec{p},r]} \nabla f \cdot \vec{N}\,dS = 0.
\]
\[
= \iint_{\partial B[0,1]} \big( f(\vec{p} + R\vec{x}) - f(\vec{p}) \big)\,dS(\vec{x})
\]
which implies after rescaling (integration over the sphere of radius $R$) that
\[
0 = \iint_{\partial B[\vec{p},R]} (f - f(\vec{p}))\,dS = \iint_{\partial B[\vec{p},R]} f\,dS - 4\pi R^2 f(\vec{p})
\]
and so
\[
f(\vec{p}) = \frac{1}{4\pi R^2} \iint_{\partial B[\vec{p},R]} f\,dS.
\]
This remarkable identity is the so called mean value property of harmonic
functions, that is, the value at any point can be expressed as an average of the
values over any sphere around that point. The mean value property is true in
any dimension with the corresponding adaptation. Note that in dimension 1 it
is evident because the harmonic functions are exactly the affine functions, so
the mean value property just says that the value at the middle of a segment is
the arithmetic mean of the values at the endpoints. In dimension 2 the above proof
can be adapted with the use of the Green-Riemann theorem and the ideas to
be discussed in the next section. Nevertheless, in dimension 2 the theory of
harmonic functions has strong bonds with complex analysis which provide
alternative techniques.
We will go on with the 3-dimensional frame to state and prove the results,
although they are valid in any dimension. The mean value property has a
surprising consequence.
Theorem 11.5.1. Let $D \subset \mathbb{R}^3$ be a connected domain and $f$ a harmonic
function defined on $D$. Then
(a) $f$ does not have strict relative extreme values;
(b) if $f$ attains an absolute extreme value on $D$ then $f$ is constant.
Proof. We will argue with maxima, the argument with minima being
similar.
Assume that $f$ has a strict relative maximum at $\vec{p}$. Then there is $\varepsilon > 0$
such that $f|_{\partial B(\vec{p},\varepsilon)} < f(\vec{p})$. By the continuity of $f$ (a strict inequality at a
particular point remains strict after integration) we get that
\[
\frac{1}{4\pi\varepsilon^2} \iint_{\partial B(\vec{p},\varepsilon)} f\,dS < f(\vec{p})
\]
which is a contradiction with the mean value property.
Now, if the function attains a maximum, the previous argument shows that
actually we have $f|_{\partial B(\vec{p},\varepsilon)} = f(\vec{p})$ for any $\vec{p}$ where the maximum is attained
and any $\varepsilon > 0$ such that $B[\vec{p}, \varepsilon] \subset D$. That shows that the set
\[
\{\vec{p} \in D : f(\vec{p}) = \max(f)\}
\]
flux integral on the plane and what is its corresponding “divergence” formula?
In order to answer that question, suppose firstly that the boundary of $D$ is
given by some $C^1$ curve $\vec{\gamma}(s)$ with $s \in [0, L]$ the arc-length. That implies
$\|\vec{\gamma}'(s)\| = 1$. If we put $\vec{\gamma}(s) = (x(s), y(s))$, the integral above over $\partial D$ can be
expressed as
\[
\int_{\partial D} p\,dx + q\,dy = \int_0^L \big( p(x(s), y(s))\,x'(s) + q(x(s), y(s))\,y'(s) \big)\,ds
\]
where $\vec{F} = (p, q)$. Note that the last member can be interpreted as a standard
line integral
\[
\int_0^L \big( -q(x(s), y(s))\,x'(s) + p(x(s), y(s))\,y'(s) \big)\,ds = \int_{\partial D} -q\,dx + p\,dy.
\]
role was played by the function $(x^2 + y^2 + z^2)^{-1/2}$, which is the unique non trivial
harmonic function with spherical symmetry, turning a blind eye to the fact
that it is not defined at 0. The analogous role in $\mathbb{R}^2$ is played by the function
$\phi(x, y) = (-1/2)\log(x^2 + y^2)$, for which
\[
\int_{\partial D} \nabla\phi \cdot d\vec{\nu} = -2\pi
\]
The advantage of this method is that it is often easier to
parameterize the boundary than to express it as a graph. Another choice of
functions $p, q$ which offers some symmetry is the following:
\[
\mathrm{area}(D) = \frac{1}{2} \int_{\partial D} -y\,dx + x\,dy.
\]
In case of not having a natural parameterization of the curve, the part
contained in one of the half-planes $x > 0$ or $x < 0$ can be parameterized taking as parameter $t = y/x$ on
each piece where it can be done uniquely. The relation among the differentials
$dy = t\,dx + x\,dt$ carried to the last area formula gives
\[
\mathrm{area}(D) = \frac{1}{2} \int_{\partial D} -tx\,dx + tx\,dx + x^2\,dt = \frac{1}{2} \int_{\partial D} x^2\,dt
\]
which is a remarkable formula that keeps some resemblance with the well
known polar formula for the area
\[
\mathrm{area}(D) = \frac{1}{2} \int_\alpha^\beta \rho^2\,d\theta
\]
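When the boundary is a closed polygon, the symmetric area formula can be evaluated exactly, since on the segment from $(x_0, y_0)$ to $(x_1, y_1)$ the integral of $-y\,dx + x\,dy$ equals $x_0 y_1 - x_1 y_0$. The sketch below illustrates this; the function name `polygon_area` and the sample curves are hypothetical choices.

```python
import math

def polygon_area(vertices):
    """Signed area (1/2) * integral of (-y dx + x dy) along a closed polygon."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0      # exact contribution of one segment
    return total / 2

# A triangle traversed counterclockwise.
triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
print(polygon_area(triangle))

# An ellipse with semi-axes a and b, approximated by an inscribed polygon.
a, b, n = 2.0, 1.0, 1000
ellipse = [(a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
           for k in range(n)]
print(polygon_area(ellipse), math.pi * a * b)
```

Reversing the order of the vertices changes the sign of the result, which reflects the dependence of the line integral on the orientation of the boundary.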
11.7.1 Mechanics
Assume that a particle of mass $m$ follows a path $\vec{r}(t)$ as a consequence of a
force field $\vec{F}$ acting on it. Newton’s second law says that
\[
\vec{F} = m\,\frac{d^2\vec{r}}{dt^2}.
\]
Assume that the force is conservative, that is, it is the gradient of some scalar
function. Take a function $V$ such that $\vec{F} = -\nabla V$ and call it potential energy.
We already know that
\[
\int_{t_1}^{t_2} \vec{F} \cdot d\vec{r} = V(\vec{r}(t_1)) - V(\vec{r}(t_2))
\]
\[
\frac{d}{dt}\Big( \frac{d\vec{r}}{dt} \cdot \frac{d\vec{r}}{dt} \Big) = \frac{d^2\vec{r}}{dt^2} \cdot \frac{d\vec{r}}{dt} + \frac{d\vec{r}}{dt} \cdot \frac{d^2\vec{r}}{dt^2} = 2\,\frac{d\vec{r}}{dt} \cdot \frac{d^2\vec{r}}{dt^2}.
\]
We will apply that to the line integral above:
\[
\int_{t_1}^{t_2} \vec{F} \cdot d\vec{r} = \int_{t_1}^{t_2} m\,\frac{d^2\vec{r}}{dt^2} \cdot \frac{d\vec{r}}{dt}\,dt = \Big[ \frac{m}{2}\,\frac{d\vec{r}}{dt} \cdot \frac{d\vec{r}}{dt} \Big]_{t_1}^{t_2} = \frac{m v^2(t_2)}{2} - \frac{m v^2(t_1)}{2}
\]
where $v = \|\frac{d\vec{r}}{dt}\|$. The magnitude $mv^2/2$ is called the kinetic energy. Now
from the equality
\[
V(\vec{r}(t_1)) - V(\vec{r}(t_2)) = \frac{m v^2(t_2)}{2} - \frac{m v^2(t_1)}{2}
\]
we get
\[
V(\vec{r}(t_1)) + \frac{m v^2(t_1)}{2} = V(\vec{r}(t_2)) + \frac{m v^2(t_2)}{2}
\]
which is called the law of conservation of energy: the sum of the kinetic and the
potential energy remains constant in time.
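The conservation law can be observed in a numerical simulation. The following is a minimal sketch under stated assumptions: the potential $V(\vec{r}) = -1/\|\vec{r}\|$, the initial data and the velocity-Verlet integrator are hypothetical choices made for illustration, and a numerical scheme only conserves the energy approximately.

```python
import math

def V(r):
    """Sample potential energy V(r) = -1/|r| (an attractive central force)."""
    return -1.0 / math.sqrt(sum(c * c for c in r))

def grad_V(r):
    """Gradient of V, equal to r/|r|^3."""
    d = math.sqrt(sum(c * c for c in r))
    return [c / d ** 3 for c in r]

def total_energy(m, r, v):
    """Potential energy plus kinetic energy m v^2 / 2."""
    return V(r) + 0.5 * m * sum(c * c for c in v)

def simulate(m, r, v, dt, steps):
    """Velocity-Verlet integration of m r'' = -grad V(r); returns the energy at each step."""
    r, v = list(r), list(v)
    acc = [-g / m for g in grad_V(r)]
    energies = [total_energy(m, r, v)]
    for _ in range(steps):
        r = [ri + vi * dt + 0.5 * ai * dt * dt for ri, vi, ai in zip(r, v, acc)]
        new_acc = [-g / m for g in grad_V(r)]
        v = [vi + 0.5 * (ai + bi) * dt for vi, ai, bi in zip(v, acc, new_acc)]
        acc = new_acc
        energies.append(total_energy(m, r, v))
    return energies

energies = simulate(1.0, (1.0, 0.0, 0.0), (0.0, 1.2, 0.0), 1e-3, 5000)
drift = max(abs(e - energies[0]) for e in energies)
print(energies[0], energies[-1], drift)
```

The printed drift stays small along the whole trajectory, while the kinetic and potential energies taken separately vary considerably.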
11.7.2 Hydrostatics
In a fluid at rest the pressure $p$ is a scalar field that at any point represents
the magnitude of the force per unit area applied on a face of a tiny plane
surface, under the assumption that the fluid is removed from the other side.
Experimental knowledge says that the force is always normal to the surface
and its magnitude does not depend on the orientation of the surface at the
same point. Assume that a non porous body $D$ with $C^1$ boundary is subjected
to a pressure field. The total force applied on $D$ is given by the surface integral
\[
-\iint_{\partial D} p\,\vec{N}\,dS = -\iint_{\partial D} p\,d\vec{S}
\]
where the sign “−” is necessary because the pressure by its very definition is
positive and the normal $\vec{N}$ points outwards, while the force is exerted
towards the body. The trick to compute this integral, which is vector-valued,
is to reduce it to a flux integral. Consider the field $\vec{F} = p\,\vec{i}$. Then
\[
\vec{i} \cdot \iint_{\partial D} p\,d\vec{S} = \iint_{\partial D} \vec{F} \cdot d\vec{S} = \iiint_D \nabla \cdot \vec{F}\,dV = \iiint_D \frac{\partial p}{\partial x}\,dV
\]
by Gauss-Ostrogradsky at the last step. That can be done likewise for $\vec{j}$
and $\vec{k}$, with obvious consequences that can be written simultaneously for each
coordinate as the integral of a vector function:
\[
-\iint_{\partial D} p\,\vec{N}\,dS = -\iiint_D \Big( \frac{\partial p}{\partial x}, \frac{\partial p}{\partial y}, \frac{\partial p}{\partial z} \Big)\,dV = -\iiint_D \nabla p\,dV.
\]
where $c$ is a constant (the pressure at $z = 0$), $\rho$ is the density of the fluid (which may
also depend on the point, but we consider it constant at our scale) and $g$ is
the standard acceleration of gravity at ground level. Since we have $\nabla p = -\rho g\,\vec{k}$, the
total force exerted on $D$ is
\[
-\iiint_D \nabla p\,dV = \iiint_D \rho g\,\vec{k}\,dV = \mathrm{vol}(D)\,\rho g\,\vec{k}
\]
that is the so called Archimedes’ principle: the total force is exerted vertically
upwards and is equivalent to the weight of the mass of fluid that the body
$D$ displaces. We may complete this result by calculating the line of action of the
resultant force. Indeed, physical forces are not completely represented
by vectors of $\mathbb{R}^3$. It is necessary to specify the line along which the force acts or,
equivalently, its moment with respect to a given point. The moment of a resultant
force is the sum of the individual moments. Let $\vec{r} = (x, y, z)$ be the position
of a point of $\partial D$. The total moment of the force exerted by the pressure is
\[
-\iint_{\partial D} p\,\vec{r} \times \vec{N}\,dS.
\]
Since the result is a vector we will repeat the previous trick, taking the scalar product
with $\vec{i}$, and so
\[
-\vec{i} \cdot \iint_{\partial D} p\,\vec{r} \times \vec{N}\,dS = -\iint_{\partial D} p\,\vec{i} \cdot (\vec{r} \times \vec{N})\,dS =
-\iint_{\partial D} p\,(\vec{i} \times \vec{r}) \cdot \vec{N}\,dS = -\iint_{\partial D} p\,(\vec{i} \times \vec{r}) \cdot d\vec{S}
\]
which can be calculated by the Gauss-Ostrogradsky theorem. Using that $p = c - \rho g z$
we have
\[
-p\,(\vec{i} \times \vec{r}) = (cz - \rho g z^2)\,\vec{j} + (\rho g y z - c y)\,\vec{k}
\]
and so
\[
\nabla \cdot \big( -p\,(\vec{i} \times \vec{r}) \big) = \rho g y.
\]
Therefore we have
\[
-\vec{i} \cdot \iint_{\partial D} p\,\vec{r} \times \vec{N}\,dS = \rho g \iiint_D y\,dV.
\]
\[
-\vec{k} \cdot \iint_{\partial D} p\,\vec{r} \times \vec{N}\,dS = 0.
\]
Recall that the center of mass of $D$ (with uniform density) is the point of
coordinates
\[
\overrightarrow{CM} = \frac{1}{\mathrm{vol}(D)} \Big( \iiint_D x\,dV,\ \iiint_D y\,dV,\ \iiint_D z\,dV \Big).
\]
The computations above mean that the resultant force $\mathrm{vol}(D)\,\rho g\,\vec{k}$ is exerted along the line passing
through $\overrightarrow{CM}$, exactly like the weight of $D$ if it were filled with fluid. In spite
of the technical difficulty of our calculations, there is a much simpler way
to reach the same conclusions: the volume $D$ filled with fluid would be in
equilibrium, so its weight applied at $\overrightarrow{CM}$ compensates all the external forces
and moments over $\partial D$ exerted by the rest of the fluid.
11.7.3 Hydrodynamics
Assume we have a moving fluid in such a way that at every point we have
a velocity $\vec{v}$, a pressure $p$ and a density $\rho$ that may also depend on time. If
we delimit a region $D$ within the fluid, the conservation of the mass implies
that the mass flux through the boundary $\partial D$ per time unit must balance the
variation of mass inside $D$, that is
\[
\iint_{\partial D} \rho\,\vec{v} \cdot d\vec{S} = -\frac{d}{dt} \iiint_D \rho\,dV = -\iiint_D \frac{\partial\rho}{\partial t}\,dV
\]
where the sign “−” is due to the fact that fluid going out counts positively
and the last equality is just standard derivation of integrals with respect to
parameters. Applying Gauss-Ostrogradsky we have
\[
0 = \iiint_D \nabla \cdot (\rho\,\vec{v})\,dV + \iiint_D \frac{\partial\rho}{\partial t}\,dV = \iiint_D \Big( \nabla \cdot (\rho\,\vec{v}) + \frac{\partial\rho}{\partial t} \Big)\,dV.
\]
As this has to be true for every domain $D$ we deduce
\[
\nabla \cdot (\rho\,\vec{v}) + \frac{\partial\rho}{\partial t} = 0
\]
which is the so called continuity equation of fluids. As we said at the beginning,
this equation just expresses the mass conservation. If the density is constant
(e.g. liquids) we obtain $\nabla \cdot \vec{v} = 0$: the volume entering equals the volume going out.
Assume now that $\partial D$ encompasses a part of the fluid and moves along with it.
Newton’s second and third laws combined say that the acceleration observed
on $D$ (the mass inside) is due to the external forces, notably the effect of
the pressure and the weight if we are studying the problem at ground level.
Assuming that $D$ is small enough to consider $\vec{v}$ homogeneous on it we have
\[
\iiint_D \rho\,\frac{d\vec{v}}{dt}\,dV = -\iint_{\partial D} p\,d\vec{S} + \iiint_D \rho\,\vec{F}\,dV
\]
$\vec{F}$ being a force per mass unit ($\vec{F} = -g\,\vec{k}$ for the ordinary weight, with $\vec{k}$ pointing
upwards as in the previous section). After Gauss-Ostrogradsky and rearranging terms
\[
0 = \iiint_D \rho\,\frac{d\vec{v}}{dt}\,dV + \iiint_D \nabla p\,dV - \iiint_D \rho\,\vec{F}\,dV
\]
where $\nabla p$ comes from our previous study of Archimedes’ principle. Since
the equality has to be true for $D$ arbitrarily small we get
\[
\rho\,\frac{d\vec{v}}{dt} + \nabla p - \rho\,\vec{F} = 0.
\]
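This equation has an important handicap for the applications. In practice it
is easier to determine the velocity at a given point and then its variation in
time $\frac{\partial\vec{v}}{\partial t}$, which is different from $\frac{d\vec{v}}{dt}$, the acceleration of a part
of the moving fluid. In order to obtain the relation between both derivatives
assume $\vec{v} = (v_x, v_y, v_z)$, all of them being functions of $x, y, z, t$. Now for $v_x$ we
have
\[
\begin{aligned}
\frac{dv_x}{dt} &= \frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\frac{dx}{dt} + \frac{\partial v_x}{\partial y}\frac{dy}{dt} + \frac{\partial v_x}{\partial z}\frac{dz}{dt} \\
&= \frac{\partial v_x}{\partial t} + \frac{\partial v_x}{\partial x}\,v_x + \frac{\partial v_x}{\partial y}\,v_y + \frac{\partial v_x}{\partial z}\,v_z = \frac{\partial v_x}{\partial t} + \nabla v_x \cdot \vec{v}.
\end{aligned}
\]
The same can be done for $v_y, v_z$ and putting all together we get
\[
\frac{d\vec{v}}{dt} = \frac{\partial\vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\vec{v}
\]
where the term $\vec{v} \cdot \nabla$ acts like a differential operator on each coordinate of
$\vec{v}$. The previous equation of fluids now takes the form
\[
\frac{\partial\vec{v}}{\partial t} + (\vec{v} \cdot \nabla)\vec{v} + \frac{1}{\rho}\,\nabla p - \vec{F} = 0
\]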
which is known as Euler’s equation. Note that if the fluid is stationary, that is,
the velocity at any point remains constant in time, then $\frac{\partial\vec{v}}{\partial t} = 0$.
Euler’s equation is still quite complicated; however, some reasonable assumptions
can lead to simpler forms. Assuming that the fluid is irrotational, which
means free of whirlpools and, in terms of equations, $\nabla \times \vec{v} = 0$, then
and the same is true for $y, z$. Going back to the place where $(\vec{v} \cdot \nabla)\vec{v}$ first
appeared we get
\[
(\vec{v} \cdot \nabla)\vec{v} = \frac{1}{2}\,\nabla v^2
\]
(remember, under the hypothesis that $\nabla \times \vec{v} = 0$). Using this in Euler’s
equation gives us
\[
\frac{\partial\vec{v}}{\partial t} + \frac{1}{2}\,\nabla v^2 + \frac{1}{\rho}\,\nabla p - \vec{F} = 0
\]
which is still not quite practical. Assume moreover that the fluid is stationary ($\frac{\partial\vec{v}}{\partial t} = 0$),
the external force is conservative ($\vec{F} = -\nabla V$) and the density $\rho$ is constant
(liquid). Then we have
\[
0 = \frac{1}{2}\,\nabla v^2 + \frac{1}{\rho}\,\nabla p + \nabla V = \nabla\Big( \frac{v^2}{2} + \frac{p}{\rho} + V \Big)
\]
which implies the important equation
\[
\frac{v^2}{2} + \frac{p}{\rho} + V = \text{constant}
\]
known as Bernoulli’s equation, which is a form of the law of conservation of
energy for fluids. Note that for $V$ constant, or its variation being negligible with
respect to the pressure (e.g. at ground level the fluid moves approximately
at the same height), an increase of the speed implies a decrease of the
pressure, which is the so called Venturi effect. Introducing the effect of the
viscosity, which implies for instance that the speed of the fluid near immobile objects is
zero, it is possible to obtain a set of formulas which model real fluids much
better: the Navier-Stokes equations.
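The Venturi effect can be quantified with a small computation. The sketch below is an illustration under stated assumptions that go beyond the text: a horizontal pipe ($V$ constant), a uniform speed across each section, and incompressibility expressed as equal volume flow `area1 * v1 == area2 * v2` through two sections. The function name and the sample values (water in SI units) are hypothetical.

```python
import math

def venturi(p1, v1, area1, area2, rho):
    """Speed and pressure in a second section of a horizontal pipe.

    Uses conservation of volume flow and Bernoulli's equation with V constant:
    v1^2/2 + p1/rho == v2^2/2 + p2/rho.
    """
    v2 = v1 * area1 / area2
    p2 = p1 + 0.5 * rho * (v1 ** 2 - v2 ** 2)
    return v2, p2

# Water entering a section with half the area: the speed doubles.
v2, p2 = venturi(p1=200000.0, v1=2.0, area1=0.01, area2=0.005, rho=1000.0)
print(v2, p2)
```

In this example the pressure in the narrow section is lower than `p1`, as the discussion above predicts.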
11.7.4 Electromagnetic fields
The well known Coulomb’s law says that two point charges are repelled (or
attracted if they are of different sign) in empty space with a force
proportional to their magnitudes and inversely proportional to the square of the
distance between them. Following standard conventions the intensity of this
force is written
\[
F = \frac{q_1 q_2}{4\pi\epsilon_0 r^2}
\]
where the constant $\epsilon_0$ depends on the unit system. The air, or matter in
general, between the charges has some effect that could be included in the
formula but that we will not consider. Note that the field produced by a single charge
is of Newtonian type, thus the theory of Newtonian potential can be applied
to study the field produced by a charge density. Indeed, let $\vec{E}$ be the electric
field produced by a continuous electric density $\rho$. Poisson’s equation, after the
correction of the $4\pi$ term, is
\[
\nabla \cdot \vec{E} = \frac{\rho}{\epsilon_0}.
\]
Note that there is no “−” because the repulsion is the effect suffered by a
positive test charge placed in a field produced by a positive density $\rho > 0$.
Static electric fields have a potential that simplifies their description and the
mechanical effects on charges.
However, a main ingredient here is the great mobility of charges, notably
through certain substances called conductors. Therefore, the variation with
time of $\rho$, and so that of $\vec{E}$, must be taken into account. The law of
conservation of charge implies that we may apply the fluid model to the electric
current $\vec{J}$ (flux of charge per time and surface unit) to obtain
\[
-\frac{d}{dt} \iiint_D \rho\,dV = \iint_{\partial D} \vec{J} \cdot d\vec{S}
\]
whose meaning must be obvious at this stage. Commuting derivation and
integral, together with the Gauss-Ostrogradsky theorem and the fact that $D$ is arbitrary,
leads to
\[
-\frac{\partial\rho}{\partial t} = \nabla \cdot \vec{J}.
\]
Note that so far we are considering the charge to be either positive or negative
and the current is interpreted in a positive sense, that is, if a region receives
a positive flux of charge then the charge “increases”. Actually, what moves
inside conductors are electrons, which have negative charge. This fact is not
relevant from the point of view of classical electrodynamics.
When an ordinary conductor is placed into an electric field the charges move
inside and after a while the movement stops because of the electrical resistance.
In the reached equilibrium, the electrostatic potential is constant on the
conductor and so the electric field $\vec{E}$ is normal to the surface of the conductor.
The potential could be computed solving the Laplace equation $\Delta V = 0$ with
the boundary conditions imposed by the charges distributed on the conductors
($V$ is constant on the boundaries). The former argument does not apply to
superconductors, which are substances that under certain conditions (extremely
low temperatures) possess null electrical resistance.
A system of two point charges of equal magnitude and different signs placed
at “short” distance is called a dipole. Far from a dipole the intensity of the
field decreases faster than for a single point charge because there is an almost
complete cancellation of effects: the sum of two nearly opposite vectors with nearly the
same modulus. However, near the dipole things are obviously different, and the
effect of the field on another dipole will include not only attraction/repulsion
but also a torque (rotational force).
This brief discussion on dipoles was aimed at introducing the magnetic force. Magnets behave between them like electric dipoles. The magnetic field is represented by a vector field $\vec B$ whose effect is not only felt on magnetic materials but also on moving electric charges according to the formula
$$ \vec F = q\,\vec v\times\vec B $$
where $q$ and $\vec v$ are respectively the charge and the velocity. Since the magnetic force is perpendicular to the trajectory it does not modify the kinetic energy (the work done by $\vec F$ is 0); however, the magnetic field bends the trajectory and eventually drives trapped ionized moving particles to the poles following a helix path (look for the explanation of the auroras). It is well known that a piece of a magnet is again a magnet with two different poles, so it is impossible to isolate “magnetic monopoles”. That leads us to consider a larger magnet as composed of a density of “magnetic dipoles” instead of charges, and so in any arbitrarily small volume there is always a compensation written as
$$ \nabla\cdot\vec B = 0. $$
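The remark that the magnetic force does no work can be illustrated numerically. The following is only a sketch: the charge, velocity and field values are made-up illustrative numbers, and the helper names are not taken from the text.

```python
def cross(u, v):
    """Cross product of two 3-vectors given as tuples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Euclidean inner product."""
    return sum(a * b for a, b in zip(u, v))

def magnetic_force(q, v, B):
    """F = q v x B for a point charge q moving with velocity v."""
    return tuple(q * c for c in cross(v, B))

q = -1.6e-19                  # illustrative charge (coulombs)
v = (2.0e5, -1.0e5, 3.0e4)    # illustrative velocity (m/s)
B = (0.0, 0.0, 5.0e-5)        # illustrative uniform field (teslas)

F = magnetic_force(q, v, B)
power = dot(F, v)             # rate of work F . v, zero up to rounding
```

Since $\vec F\cdot\vec v$ is the rate at which the force does work, its vanishing is exactly the statement that the kinetic energy is not modified.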
There are equations linking $\vec E$ and $\vec B$, notably when they vary with time. Charges in movement (currents) produce a magnetic field, for instance electromagnets, and variations in the magnetic field induce currents, that is, variations in the electric field, for instance dynamos. Both phenomena are modelled by the laws of Faraday-Henry and Ampère-Maxwell
$$ \nabla\times\vec E = -\frac{\partial\vec B}{\partial t}; $$
$$ \nabla\times\vec B = \mu_0\vec J + \epsilon_0\mu_0\frac{\partial\vec E}{\partial t}. $$
Note that applying the divergence ($\nabla\cdot$) to the first equation says nothing new, whereas for the second we recover the conservation of charge. The set of four equations, these last two together with $\nabla\cdot\vec E = \rho/\epsilon_0$ and $\nabla\cdot\vec B = 0$, is called Maxwell equations, which totally describe the electromagnetic field. The constant $\mu_0$ plays for magnetism in vacuum a role analogous to $\epsilon_0$, and we are deliberately omitting the modifications that happen inside matter.
We will do more tricky manipulations on Maxwell equations. Since $\nabla\cdot\vec B = 0$ there is $\vec A$ such that $\nabla\times\vec A = \vec B$. Now we have
$$ \nabla\times\vec E = -\frac{\partial}{\partial t}(\nabla\times\vec A) = -\nabla\times\frac{\partial\vec A}{\partial t} $$
and so
$$ \nabla\times\Big(\vec E + \frac{\partial\vec A}{\partial t}\Big) = 0. $$
Therefore there exists a function $\phi$ such that
$$ -\nabla\phi = \vec E + \frac{\partial\vec A}{\partial t} $$
that would allow us to consider that $\vec E$ has a scalar potential (like in the static case) but corrected by a term $\frac{\partial\vec A}{\partial t}$ coming from the magnetic counterpart of the field.
Note that the defining property of $\vec A$ is preserved by adding the gradient of a scalar function. Let us assume for instance that $\nabla\cdot\vec A = 0$ (technically that would require a special decay of $\vec B$ at infinity, but we will turn a blind eye to it). Under that hypothesis we would have
$$ \Delta\phi = -\nabla\cdot\vec E = -\frac{\rho}{\epsilon_0}. $$
This is a very nice consequence; however, we have to forget it because there is a choice for $\vec A$ that was better for the development of the theory. The vector potential $\vec A$ can be chosen to satisfy an apparently stranger condition due to Lorenz: take $\vec A$ such that $\nabla\times\vec A = \vec B$ and
$$ \nabla\cdot\vec A + \epsilon_0\mu_0\frac{\partial\phi}{\partial t} = 0. $$
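Working with Lorenz’s condition we have
$$ -\Delta\phi = \nabla\cdot\Big(\vec E + \frac{\partial\vec A}{\partial t}\Big) = \frac{\rho}{\epsilon_0} + \frac{\partial}{\partial t}(\nabla\cdot\vec A) = \frac{\rho}{\epsilon_0} - \epsilon_0\mu_0\frac{\partial^2\phi}{\partial t^2} $$
that can be rewritten as
$$ \Delta\phi - \epsilon_0\mu_0\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\epsilon_0}. $$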
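On the other hand, the Ampère-Maxwell equation can be transformed as
$$ \nabla\times(\nabla\times\vec A) = \mu_0\vec J + \epsilon_0\mu_0\frac{\partial}{\partial t}\Big(-\nabla\phi - \frac{\partial\vec A}{\partial t}\Big) = \mu_0\vec J - \epsilon_0\mu_0\nabla\frac{\partial\phi}{\partial t} - \epsilon_0\mu_0\frac{\partial^2\vec A}{\partial t^2} = \mu_0\vec J + \nabla(\nabla\cdot\vec A) - \epsilon_0\mu_0\frac{\partial^2\vec A}{\partial t^2}. $$
Having in mind that $\Delta\vec A = \nabla(\nabla\cdot\vec A) - \nabla\times(\nabla\times\vec A)$ we have
$$ \Delta\vec A - \epsilon_0\mu_0\frac{\partial^2\vec A}{\partial t^2} = -\mu_0\vec J. $$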
The couple of equations we have just obtained
$$ \Delta\phi - \epsilon_0\mu_0\frac{\partial^2\phi}{\partial t^2} = -\frac{\rho}{\epsilon_0}, $$
$$ \Delta\vec A - \epsilon_0\mu_0\frac{\partial^2\vec A}{\partial t^2} = -\mu_0\vec J $$
provide a simpler and quite symmetric version of Maxwell equations in terms of the potentials $\phi$ and $\vec A$, which are more convenient for theoretical studies. In absence of charges and currents (that is, far away from them in practice), the equations become homogeneous
$$ \Delta\phi - \epsilon_0\mu_0\frac{\partial^2\phi}{\partial t^2} = 0, $$
$$ \Delta\vec A - \epsilon_0\mu_0\frac{\partial^2\vec A}{\partial t^2} = 0. $$
It is a very remarkable fact that they still have nontrivial solutions, which are of wave type. Note that under the same conditions of absence of charges and currents $\vec E$ and $\vec B$ satisfy similar equations that can be obtained more easily from Maxwell equations. The speed of the waves is the number
$$ c = \frac{1}{\sqrt{\epsilon_0\mu_0}} $$
which amazingly coincides with the speed of light. This is even more shocking if we consider that $\epsilon_0$ and $\mu_0$ could be determined working with batteries and wires in a modest laboratory. That led Maxwell to think that light is actually an electromagnetic wave. Moreover, the system of equations can be written as a single equation in $\mathbb R^4$ for the pair $(\vec A,\phi)$ and a 4-dimensional Laplacian-like operator
$$ \Box = \frac{\partial^2}{\partial x^2}+\frac{\partial^2}{\partial y^2}+\frac{\partial^2}{\partial z^2}-\frac{1}{c^2}\frac{\partial^2}{\partial t^2}. $$
All this leads to the Theory of Relativity, but our way ends here.
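The coincidence between $1/\sqrt{\epsilon_0\mu_0}$ and the speed of light is easy to check with a few lines of code. This sketch assumes the CODATA 2018 values of the two constants, which are not quoted in the text.

```python
import math

EPSILON_0 = 8.8541878128e-12   # vacuum permittivity, F/m (CODATA 2018, assumed)
MU_0 = 1.25663706212e-6        # vacuum permeability, N/A^2 (CODATA 2018, assumed)
SPEED_OF_LIGHT = 299_792_458.0 # m/s, exact by definition of the metre

# Speed of the waves predicted by the homogeneous equations.
c = 1.0 / math.sqrt(EPSILON_0 * MU_0)

relative_gap = abs(c / SPEED_OF_LIGHT - 1.0)
print(f"c = {c:.6e} m/s, relative gap = {relative_gap:.1e}")
```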
prescribed Laplacian. The computations are somehow informal but complete.
The problem of uniqueness leads naturally to the study of harmonic functions.
Among the applications we have chosen some basic hydrostatics (the theoretical results are compared to heuristic ones), hydrodynamics and the electromagnetic field, where we get the 4-dimensional form of Maxwell’s equations at the end.
11.9 Exercises
1. Let $f$ and $g$ be scalar functions, $\vec F$ and $\vec G$ be vector fields, all defined on $\mathbb R^3$. Prove the following formulas:
(a) $\nabla(fg) = g\nabla f + f\nabla g$.
(b) $\nabla\cdot(f\vec F) = \nabla f\cdot\vec F + f\,\nabla\cdot\vec F$.
(c) $\nabla\times(f\vec F) = \nabla f\times\vec F + f\,\nabla\times\vec F$.
(d) $\nabla\cdot(\vec F\times\vec G) = (\nabla\times\vec F)\cdot\vec G - \vec F\cdot(\nabla\times\vec G)$.
2. Show that
$$ \Delta\vec F = \nabla(\nabla\cdot\vec F) - \nabla\times(\nabla\times\vec F). $$
7. Assume that the bounded domain D ⊂ R3 has piecewise C 1 border and
F⃗ is a C 2 field defined on R3 . Find the flux of rot(F⃗ ) through ∂D.
8. Consider on R3 the vector field F⃗ = (x3 /a4 , y 3 /b4 , z 3 /c4 ) where a, b, c > 0.
10. Consider the pyramid on $[-1,1]^2$ with vertex at $(0,0,3)$ and let $L$ be the surface made up of the lateral faces with exterior orientation. Let $\vec F = e^{x+y-2z}(1,1,1)$. Compute
$$ \iint_L \vec F\cdot d\vec S. $$
11. Show that the integral for the Newtonian potential generated by a constant linear density $\rho>0$ on the $Z$ axis diverges. However, the analogous integral for the Newtonian force converges on $\mathbb R^3$ except on the $Z$ axis. Find a function $\Phi$ such that $\vec F=\nabla\Phi$ and explain why that is compatible with the first statement of this exercise.
12. Consider the ellipsoid $S$ with formula $\frac{x^2}{a^2}+\frac{y^2}{b^2}+\frac{z^2}{c^2}=1$ and let $\rho(x,y,z)$ be the distance to the origin from the tangent plane to the ellipsoid at $(x,y,z)$. Show that
$$ \iint_S \rho\, dS = 4\pi abc; $$
$$ \iint_S \frac{1}{\rho}\, dS = \frac{4\pi}{3}\,abc\Big(\frac{1}{a^2}+\frac{1}{b^2}+\frac{1}{c^2}\Big). $$
Hint: $1/\rho(x,y,z) = \sqrt{\frac{x^2}{a^4}+\frac{y^2}{b^4}+\frac{z^2}{c^4}}$.
Calculate the flux integral
$$ I = \iint_{\partial D} \nabla f\cdot d\vec S $$
x3 + y 3 = 3xy.
x2 + y 2 + z 2 = 2z.
19. Prove that if $f, g$ are regular enough in $D$ with $D\subset\mathbb R^3$ (or $\mathbb R^2$) open, then
$$ \iiint_D g\,\Delta f\, dV + \iiint_D \nabla g\cdot\nabla f\, dV = \iint_{\partial D} g\,\nabla f\cdot d\vec S. $$
20. Let f : Rn → R be a C 2 function whose level sets coincide with the level
sets of a harmonic function. Show that
$$ \frac{\Delta f}{\|\nabla f\|^2} $$
for every bounded open set D with C 1 border. Prove that f cannot have
strict relative maximums. What about strict minimums?
22. Find the conditions that should be satisfied by the functions $\phi, \psi$ in order for
$$ f(x,y,z) = \phi\big(\sqrt{x^2+y^2}\big)\,\psi(z) $$
to be harmonic. (Hint: use $\phi(r)$ with $r=\sqrt{x^2+y^2}$.)
Chapter 12
Appendix A: The
Stone-Weierstrass theorem
1. ∅, X ∈ τ ;
2. if U, V ∈ τ , then U ∩ V ∈ τ ;
3. if $(U_i)_{i\in I}\subset\tau$, then $\bigcup_{i\in I}U_i\in\tau$.
Note that the family of open sets in a metric space is a topology, so we will
refer to the elements of τ as open sets, and their complements will be called
closed sets. A set X endowed with a topology is called a topological space.
The topological space $(X,\tau)$ is said to be Hausdorff if for every $x,y\in X$ with $x\ne y$ there exist $U,V\in\tau$ with $x\in U$, $y\in V$ and $U\cap V=\emptyset$. Clearly, metric topologies are Hausdorff. Continuity can be defined locally, but we are only interested in global continuity: a mapping $f:(X_1,\tau_1)\to(X_2,\tau_2)$ between topological spaces is continuous if $f^{-1}(V)\in\tau_1$ whenever $V\in\tau_2$.
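For a finite set the axioms of a topology and the Hausdorff property can be tested by brute force. The sketch below is only an illustration: the function names and the three small examples are hypothetical, and on a finite family closure under pairwise unions is enough to get closure under arbitrary unions.

```python
from itertools import combinations

def powerset(X):
    """All subsets of X as frozensets (the discrete topology)."""
    items = sorted(X)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def is_topology(X, tau):
    """Check the three axioms for a finite family tau of subsets of X."""
    tau = {frozenset(U) for U in tau}
    if frozenset() not in tau or frozenset(X) not in tau:
        return False
    return all((U & V) in tau and (U | V) in tau for U in tau for V in tau)

def is_hausdorff(X, tau):
    """Distinct points must be separated by disjoint open sets."""
    tau = [frozenset(U) for U in tau]
    return all(
        any(x in U and y in V and not (U & V) for U in tau for V in tau)
        for x, y in combinations(sorted(X), 2)
    )

X = {1, 2, 3}
discrete = powerset(X)                 # a topology, and Hausdorff
nested = [set(), {1}, X]               # a topology, but not Hausdorff
broken = [set(), {1}, {2}, X]          # not a topology: {1, 2} is missing
```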
A topological space $(X,\tau)$ is said to be compact if for every family $(U_i)_{i\in I}\subset\tau$ with $X=\bigcup_{i\in I}U_i$ there exists $J\subset I$ finite such that $X=\bigcup_{i\in J}U_i$. Compactness is a very important property in Analysis since it works quite well together with continuity.
As in the metric case, for a compact space K, we will denote C(K) the
set of real continuous functions defined on K, eventually endowed with the
supremum norm.
Theorem 12.1.2 (Urysohn). Let K be a compact Hausdorff space and let
A, B ⊂ K be disjoint closed subsets. Then there exists a continuous f : K → [0, 1] such that
f |A = 0 and f |B = 1.
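Proof. Consider a maximal family $\mathcal F$ of open sets such that if $U\in\mathcal F$ then $A\subset U\subset\overline U\subset K\setminus B$ and $\overline U\subset V$ for all $V\in\mathcal F$ with $U\subsetneq V$. The family $\mathcal F$ either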
contains a clopen subset (simultaneously open and closed) or for any U, V ∈ F
with U ⊊ V there is W ∈ F such that U ⊊ W ⊊ V . In the first case the
construction of f is obvious, so we will assume the second case holds. By
induction we may take Ut ∈ F for every dyadic t ∈ [0, 1] in such a way that
t ≤ s implies Ut ⊂ Us . Define now
12.2 Approximation by continuous functions
The results in this section exploit two additional structures on C(K), namely the algebra structure, i.e. C(K) is closed under the standard product of functions, and the lattice structure, which means that C(K) is closed under the operations max and min.
Proposition 12.2.1. A linear subspace $X\subset C(K)$ is dense if and only if there is $\varepsilon\in(0,1)$ such that for every pair of disjoint closed sets $A,B\subset K$ there is $f\in X$ with values in $[-1,1]$ such that $f|_A\le\varepsilon-1$ and $f|_B\ge1-\varepsilon$.
Proof. Assume that $X$ is dense and take a continuous function $g$ such that $g|_A=-1+\varepsilon/2$ is its minimum and $g|_B=1-\varepsilon/2$ is its maximum. Then find $f\in X$ such that $\|f-g\|_\infty<\varepsilon/2$. Clearly $f$ fulfils the required conditions.
The reverse implication is more delicate. Firstly note that by scaling it is enough to prove the approximation by elements from $X$ for functions with values in $[-1,1]$. Consider $g\in C(K)$ with values in $[-1,1]$ and take $A=g^{-1}([-1,-1/3])$ and $B=g^{-1}([1/3,1])$. Apply the hypothesis to find $f\in X$ with its values in $[-1,1]$, $f(A)\subset[-1,\varepsilon-1]$ and $f(B)\subset[1-\varepsilon,1]$. Take $f_1=3^{-1}f$ and set $\lambda=(2+\varepsilon)/3<1$. An elementary computation shows that $\|g-f_1\|_\infty\le\lambda$, that is, $(g-f_1)(K)\subset[-\lambda,\lambda]$. We have now that $\lambda^{-1}(g-f_1)$ is a function taking values in $[-1,1]$, so we can repeat the previous argument to find $f_2$ such that
$$ -\lambda\le\lambda^{-1}(g-f_1)-f_2\le\lambda, $$
which implies
$$ \|g-f_1-\lambda f_2\|_\infty\le\lambda^2. $$
Inductively we can find a sequence $(f_n)$ such that
$$ \|g-f_1-\lambda f_2-\dots-\lambda^{n-1}f_n\|_\infty\le\lambda^n $$
Proof. The linearity and the possibility of adding constants allow us to find a function $f_{x,y}\in X$ such that $f_{x,y}(x)=-2$ and $f_{x,y}(y)=2$ whenever $x,y\in K$ with $x\ne y$. Let $A\subset K$ be closed and $y\in K\setminus A$. The family of sets $(U_x)_{x\in A}$ where $U_x=f_{x,y}^{-1}((-\infty,-1))$ is an open cover of $A$. Let $x_1,\dots,x_n\in A$ be such that the corresponding sets cover $A$. Then
Proof. Take any f ∈ X and let (Pn (t)) be a sequence of polynomials which
converges uniformly to |t| on the set f (K) ⊂ R. Note that Pn ◦ f ∈ X by the
hypotheses, so |f | ∈ X, which implies the lattice property for X. Theorem
12.2.2 says that X is dense in C(K) and so X = C(K).
Proof. Only the last statement needs to be addressed. The fact that the
trigonometric polynomials are an actual algebra follows from these well known
equalities
$$ \cos(\alpha)\cos(\beta) = 2^{-1}(\cos(\alpha+\beta)+\cos(\alpha-\beta)) $$
$$ \sin(\alpha)\sin(\beta) = 2^{-1}(\cos(\alpha-\beta)-\cos(\alpha+\beta)) $$
$$ \sin(\alpha)\cos(\beta) = 2^{-1}(\sin(\alpha+\beta)+\sin(\alpha-\beta)) $$
and separation of points in [0, π] is done just by sin t and cos t.
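The three product-to-sum equalities used above are what turns a product of trigonometric monomials into a sum of them. A quick numerical sanity check on a grid of angles (the grid itself is an arbitrary choice) may look like this:

```python
import math

def product_to_sum_gaps(alpha, beta):
    """Differences between both sides of the three product-to-sum formulas."""
    cc = math.cos(alpha) * math.cos(beta) \
        - 0.5 * (math.cos(alpha + beta) + math.cos(alpha - beta))
    ss = math.sin(alpha) * math.sin(beta) \
        - 0.5 * (math.cos(alpha - beta) - math.cos(alpha + beta))
    sc = math.sin(alpha) * math.cos(beta) \
        - 0.5 * (math.sin(alpha + beta) + math.sin(alpha - beta))
    return cc, ss, sc

# Largest deviation over an arbitrary grid of angle pairs.
worst_gap = max(
    abs(gap)
    for i in range(50)
    for j in range(50)
    for gap in product_to_sum_gaps(0.13 * i, 0.29 * j)
)
```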
Chapter 13
We call that property being essentially bounded. We may complete the scale
of spaces by taking L0 (µ) the set of measurable real-valued functions, that was
named M in the chapter of Measure Theory. Firstly note the following fact.
Proposition 13.1.1. Lp (µ) is a vector space for 0 ≤ p ≤ +∞.
Proof. The case L0 (µ) was already studied in Measure Theory, and L∞ (µ) is
quite obvious. Note that if a, b ≥ 0 then
$$ \Big(\frac{a+b}{2}\Big)^p \le \max\{a^p,b^p\} \le a^p+b^p. $$
Therefore
$$ \int|f+g|^p\,d\mu \le 2^p\int|f|^p\,d\mu + 2^p\int|g|^p\,d\mu < \infty $$
implying ∥f + g∥p ≤ ∥f ∥p + ∥g∥p as desired.
Note that the key inequality for convexity goes the opposite way if $p<1$; however, we have the following.
Proposition 13.1.3. The formula $d(f,g):=\|f-g\|_p^p$ defines a translation-invariant pseudometric on $L^p(\mu)$ for $0<p<1$.
Proof. Note that for $a,b\ge0$ and $0<p\le1$ we have $(a+b)^p\le a^p+b^p$. Therefore
$$ \int|f-g|^p\,d\mu \le \int|f-h|^p\,d\mu + \int|h-g|^p\,d\mu $$
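The difference between the failing triangle inequality for $\|\cdot\|_p$ and the valid one for $d$ when $p<1$ can be seen on the simplest example. The sketch below assumes the counting measure on a two-point set, so that functions are just pairs of numbers; this model and the sample vectors are illustrative choices, not taken from the text.

```python
import random

def p_norm(x, p):
    """(sum |x_i|^p)^(1/p); only a quasi-norm when 0 < p < 1."""
    return sum(abs(c) ** p for c in x) ** (1.0 / p)

def d(x, y, p):
    """d(x, y) = ||x - y||_p^p, the pseudometric of the proposition."""
    return sum(abs(a - b) ** p for a, b in zip(x, y))

p = 0.5
x, y = (1.0, 0.0), (0.0, 1.0)

norm_of_sum = p_norm((x[0] + y[0], x[1] + y[1]), p)   # (1 + 1)^2
sum_of_norms = p_norm(x, p) + p_norm(y, p)            # 1 + 1

# The triangle inequality for d on random triples of vectors.
random.seed(0)
triples = [
    tuple(tuple(random.uniform(-2, 2) for _ in range(2)) for _ in range(3))
    for _ in range(1000)
]
triangle_ok = all(d(f, g, p) <= d(f, h, p) + d(h, g, p) + 1e-12
                  for f, g, h in triples)
```

Here `norm_of_sum` exceeds `sum_of_norms`, so $\|\cdot\|_{1/2}$ is not a norm, while `triangle_ok` records that $d$ passed the triangle inequality on every sampled triple.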
$g_n=\sum_{k=1}^n|f_k|$ which is an increasing sequence of positive functions. The triangle property of the norm implies $\sup_n\|g_n\|_p<\infty$. In particular we have
$$ \int\lim_n g_n^p\,d\mu = \lim_n\int g_n^p\,d\mu < \infty $$
where the last inequality comes from the monotone convergence theorem ap-
plied to the triangle inequality. The case p = ∞ can be handled with the same
standard ideas as the proof of the completeness of ℓ∞ or C(K).
13.2 Convergence
Here we will compare several types of convergence. Firstly we will introduce
the notion of convergence in measure. We say that a sequence $(f_n)\subset L^0(\mu)$ converges in measure to $f$ if
$$ \lim_n\mu(\{|f_n-f|>\varepsilon\})=0 $$
for every $\varepsilon>0$. Clearly, the limit in measure is determined almost everywhere. Note that Chebyshev inequality says that if $f\in L^p(\mu)$ and $\varepsilon>0$ then
$$ \mu(\{|f|\ge\varepsilon\}) \le \varepsilon^{-p}\int|f|^p\,d\mu = \varepsilon^{-p}\|f\|_p^p $$
We also have.
Proof. If (fn ) converges to f almost everywhere then for every ε > 0 we have
$$ \mu\Big(\bigcap_{n=1}^\infty\bigcup_{k\ge n}\{|f_k-f|>\varepsilon\}\Big)=0. $$
as wished.
and thus the d-convergence implies the convergence in measure. On the other hand, if $(f_n)$ converges in measure to 0 the right-hand side of the following formula can be made as small as we wish
$$ \int\min\{|f_n|,1\}\,d\mu \le \varepsilon\mu(\Omega)+\mu(\{|f_n|>\varepsilon\}) $$
The argument in the proof of the following proposition was already used
to prove Theorem 8.6.5.
Proposition 13.2.4. If a sequence is convergent in measure, then it has a subsequence which converges almost everywhere.
Proof. Let $(f_n)$ converge in measure to $f$. Then it is possible to find $n_1$ such that
$$ \mu(\{|f_{n_1}-f|>1\})\le1/2. $$
Inductively it is possible to build an increasing sequence $n_1<n_2<\dots$ such that the sets
$$ A_k=\{|f_{n_k}-f|>1/k\} $$
satisfy $\mu(A_k)\le2^{-k}$. Take $A=\bigcap_{k=1}^\infty\bigcup_{j\ge k}A_j$ and note that $\mu(A)=0$. By construction we have for any $x\in A^c$ that $|f_{n_k}(x)-f(x)|\le1/k$ from a certain $k$ on, and so the proposition is proven.
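A classical example shows why passing to a subsequence is needed. The so-called typewriter sequence on $[0,1)$ (a standard example that is not part of the text above) converges to 0 in measure but at no point, whereas the subsequence with indices $2^k$ converges to 0 at every $x>0$. A minimal sketch:

```python
def typewriter(n, x):
    """Indicator of [j/2^k, (j+1)/2^k) evaluated at x, where n = 2^k + j, n >= 1."""
    k = n.bit_length() - 1
    j = n - (1 << k)
    return 1.0 if j / 2 ** k <= x < (j + 1) / 2 ** k else 0.0

def measure_of_support(n):
    """Lebesgue measure of the set where the n-th function equals 1."""
    return 1.0 / 2 ** (n.bit_length() - 1)

x = 0.3  # an arbitrary sample point

# On every dyadic level exactly one function equals 1 at x: no pointwise limit.
hits_per_level = [
    sum(typewriter(n, x) for n in range(2 ** k, 2 ** (k + 1)))
    for k in range(10)
]

# Along n_k = 2^k the supports shrink to {0}, so the values at x become 0.
subsequence_values = [typewriter(2 ** k, x) for k in range(10)]
```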
if $(\Omega,\Sigma,\mu)$ is a finite measure space the inclusion happens in the reverse way: $L^{p_2}(\mu)\subset L^{p_1}(\mu)$ if $p_1\le p_2$. Indeed, assume $f\in L^{p_2}(\mu)$, then
$$ \int|f|^{p_1}\,d\mu = \int_{|f|\le1}|f|^{p_1}\,d\mu + \int_{|f|>1}|f|^{p_1}\,d\mu \le \mu(\Omega)+\int_{|f|>1}|f|^{p_2}\,d\mu < \infty. $$
The norm of the inclusion can be sharply estimated with the help of Hölder inequality, see next section.
The general case happens to be a blend of the two previous ones, although
we will state the general result under the hypothesis of σ-finiteness.
Proposition 13.3.1. Let $(\Omega,\Sigma,\mu)$ be a σ-finite measure space and $1\le p\le\infty$. Then $L^p(\mu)$ is isometric to a direct sum of $\ell^p(\Gamma)$ and $L^p(\nu)$ where $\Gamma\subset\mathbb N$ and $\nu$ is an atom-free finite measure (possibly void).
Proof. By a result of measure theory we know that $\Omega=\Omega_a\cup\Omega_f$ with $\Omega_a\cap\Omega_f=\emptyset$ where $\Omega_a$ is atomic and $\Omega_f$ is atom-free. Clearly
$$ \int|f|^p\,d\mu = \int_{\Omega_a}|f|^p\,d\mu + \int_{\Omega_f}|f|^p\,d\mu $$
which implies that $L^p(\mu)$ is the $\ell^p$-sum of $L^p(\Omega_a)$ and $L^p(\Omega_f)$. Now, let $(A_\gamma)_{\gamma\in\Gamma}$ be an enumeration of the atoms. The map
$$ T:\ell^p(\Gamma)\to L^p(\Omega_a) $$
defined by $T((x_\gamma))=\sum_\gamma x_\gamma\,\mu(A_\gamma)^{-1/p}\chi_{A_\gamma}$ is an isometry (details are left to the reader). For the atom-free part, if it is not of finite measure already, we may consider a decomposition $\Omega_f=\bigcup_nP_n$ into disjoint sets with $\mu(P_n)=1$. Consider the measure $\nu(A)=\sum_n2^{-n}\mu(P_n\cap A)$, which is finite on $\Omega_f$. The map
$$ S:L^p(\Omega_f)\to L^p(\nu) $$
defined by $S(f)=\sum_n2^{n/p}\chi_{P_n}f$ is an isometry.
13.4 Duality
We will use the following arithmetical inequality: if $1<p,q<\infty$ satisfy $1/p+1/q=1$ and $a,b\ge0$ then
$$ ab\le\frac{a^p}{p}+\frac{b^q}{q} $$
and equality holds only if $a^p=b^q$. The proof of this inequality can be obtained geometrically by interpreting the summands on the right-hand side as areas limited by the curve $y=x^{p-1}$ (or equivalently $x=y^{q-1}$).
Assume $1<p,q<\infty$. We may also assume $\|f\|_p,\|g\|_q>0$, otherwise both members of the inequality turn out to be 0. Consider the norm-one functions $f/\|f\|_p$ and $g/\|g\|_q$ and apply the arithmetic inequality
$$ \frac{|fg|}{\|f\|_p\|g\|_q}\le\frac{|f|^p}{p\|f\|_p^p}+\frac{|g|^q}{q\|g\|_q^q}. $$
Integration gives
$$ \int\frac{|fg|}{\|f\|_p\|g\|_q}\,d\mu\le\int\frac{|f|^p}{p\|f\|_p^p}\,d\mu+\int\frac{|g|^q}{q\|g\|_q^q}\,d\mu=\frac1p+\frac1q=1. $$
Therefore
$$ \int|fg|\,d\mu\le\|f\|_p\|g\|_q $$
as wanted. The statement about when the equality holds follows easily.
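The Hölder inequality and its equality case can be tested on a finite measure space, where integrals are weighted sums. In this sketch the weights, the exponent $p=3$ and the random samples are arbitrary assumptions made for illustration.

```python
import random

def integral(values, weights):
    """Integral of a function on a finite set with point masses `weights`."""
    return sum(w * v for v, w in zip(values, weights))

def lp_norm(values, weights, p):
    """L^p norm with respect to the weighted counting measure."""
    return integral([abs(v) ** p for v in values], weights) ** (1.0 / p)

random.seed(0)
p = 3.0
q = p / (p - 1.0)                       # conjugate exponent, 1/p + 1/q = 1
weights = [random.uniform(0.1, 2.0) for _ in range(20)]

holder_ok = True
for _ in range(1000):
    f = [random.uniform(-5, 5) for _ in weights]
    g = [random.uniform(-5, 5) for _ in weights]
    lhs = integral([abs(a * b) for a, b in zip(f, g)], weights)
    rhs = lp_norm(f, weights, p) * lp_norm(g, weights, q)
    holder_ok = holder_ok and lhs <= rhs * (1.0 + 1e-12)

# Equality case: |g|^q proportional to |f|^p, here g = |f|^(p-1).
f_eq = [random.uniform(-5, 5) for _ in weights]
g_eq = [abs(v) ** (p - 1.0) for v in f_eq]
lhs_eq = integral([abs(a * b) for a, b in zip(f_eq, g_eq)], weights)
rhs_eq = lp_norm(f_eq, weights, p) * lp_norm(g_eq, weights, q)
equality_gap = abs(lhs_eq - rhs_eq) / rhs_eq
```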
With the help of Hölder inequality we can now obtain a precise bound for the inclusion operator between $L^p$-spaces when $\mu(\Omega)<\infty$. Assume $p_1\le p_2$ and $f\in L^{p_2}(\mu)$. Then
$$ \int|f|^{p_1}\,d\mu=\int1\cdot|f|^{p_1}\,d\mu\le\Big(\int1^q\,d\mu\Big)^{1/q}\Big(\int(|f|^{p_1})^{p_2/p_1}\,d\mu\Big)^{p_1/p_2} $$
where $q$ is the conjugate exponent to $p_2/p_1$, that is, $1/q=1-p_1/p_2$, and thus we have
$$ \|f\|_{p_1}\le\mu(\Omega)^{1/p_1-1/p_2}\|f\|_{p_2}. $$
The sharpness of this bound can be tested on the function $f=1$.
Theorem 13.4.2. Let $(\Omega,\Sigma,\mu)$ be a measure space and $1\le p,q\le\infty$. Then the map $J:L^q(\mu)\to L^p(\mu)^*$ defined by $J(g)(f)=\int fg\,d\mu$ for $g\in L^q(\mu)$ and $f\in L^p(\mu)$ is an (injective) isometry. Moreover
1. Lp (µ)∗ = J(Lq (µ)) if 1 < p < ∞;
2. L1 (µ)∗ = J(L∞ (µ)) if µ is σ-finite or a cardinal measure;
as n goes to ∞. That implies
$$ \nu\Big(\bigcup_{k=1}^\infty A_k\Big)=F\big(\chi_{\bigcup_{k=1}^\infty A_k}\big)=\lim_nF\big(\chi_{\bigcup_{k=1}^nA_k}\big)=\lim_n\sum_{k=1}^n\nu(A_k)=\sum_{k=1}^\infty\nu(A_k). $$
The extreme equality extends to simple functions naturally and then to any $f\in L^\infty(\mu)$
$$ F(f)=\int fg\,d\mu $$
because of the uniform denseness of simple functions among the bounded mea-
surable functions. Now we are going to check that g lies actually in Lq (µ).
Indeed, if p = 1 we claim that g is essentially bounded. Otherwise, for every
n it would be possible to find A ∈ Σ with µ(A) > 0 and |g(x)| > n for x ∈ A.
Taking $f=\operatorname{sign}(g)\chi_A$ we will have
$$ |F(f)|=\Big|\int fg\,d\mu\Big|\ge n\,\mu(A)=n\,\|f\|_1 $$
Since the bound does not depend on A, taking An = {|g| ≤ n} and applying
the monotone convergence theorem we get g ∈ Lq (µ) and ∥g∥q ≤ ∥F ∥.
The cases $p=1$ and $p>1$ can be put together in the following way: if $(\Omega,\Sigma,\mu)$ is a general measure space, then for every $A\in\Sigma$ there is $g_A\in L^q(\mu)$ supported by $A$ such that $F(f)=\int fg_A\,d\mu$ for every $f\in L^p(\mu)$ supported by $A$ and $\|g_A\|_q\le\|F\|$. In case $(\Omega,\Sigma,\mu)$ is σ-finite it is clear how to extend the result. Assume $(A_n)$ are disjoint, cover $\Omega$ and have finite measure. Put $g_n=g_{A_n}$ and define $g(x)=g_n(x)$ if $x\in A_n$. The function $g$ is measurable and for every $f\in L^p(\mu)$ and $n\in\mathbb N$ we have
$$ F\big(\chi_{\bigcup_{k=1}^nA_k}f\big)=\sum_{k=1}^nF(\chi_{A_k}f)=\sum_{k=1}^n\int_{A_k}fg\,d\mu=\int_{\bigcup_{k=1}^nA_k}fg\,d\mu $$

$$ \int_{\bigcup_{k=1}^nA_k}|g|^q\,d\mu\le\|F\|^q. $$
13.5 Uniform convexity of Lp(µ) for 1 < p < ∞
We say that a Banach space $X$ is uniformly convex if
$$ \delta_X(t)=1-\sup\Big\{\Big\|\frac{x+y}{2}\Big\|:\ \|x\|=\|y\|=1,\ \|x-y\|\ge t\Big\}>0 $$
for all t ∈ (0, 2]. The function δX (t) is called the modulus of uniform convexity
of X. The main aim now is to prove the uniform convexity of Lp (µ) spaces for
1 < p < ∞. We will distinguish between two cases with different proofs.
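To get a feeling for the modulus of uniform convexity, it can be estimated numerically in the Euclidean plane, where the known closed form is $\delta(t)=1-\sqrt{1-t^2/4}$ (a standard fact about Hilbert spaces, not proved in this text). The sketch fixes $x=(1,0)$, which is enough by rotation invariance, and also shows a pair of vectors witnessing that the supremum norm on $\mathbb R^2$ is not uniformly convex.

```python
import math

def modulus_euclidean_plane(t, samples=20000):
    """Grid estimate of delta(t) for the Euclidean norm on R^2."""
    best = 0.0
    for i in range(samples + 1):
        theta = math.pi * i / samples
        y = (math.cos(theta), math.sin(theta))          # unit vector
        distance = math.hypot(1.0 - y[0], -y[1])        # ||x - y||, x = (1, 0)
        if distance >= t:
            midpoint = math.hypot((1.0 + y[0]) / 2.0, y[1] / 2.0)
            best = max(best, midpoint)
    return 1.0 - best

def closed_form(t):
    """Modulus of uniform convexity of a Hilbert space."""
    return 1.0 - math.sqrt(1.0 - t * t / 4.0)

estimates = {t: modulus_euclidean_plane(t) for t in (0.5, 1.0, 1.5)}

# Supremum norm: x = (1, 1), y = (1, -1) are at distance 2, midpoint has norm 1.
sup_norm = lambda v: max(abs(c) for c in v)
x_inf, y_inf = (1.0, 1.0), (1.0, -1.0)
midpoint_inf = sup_norm(((x_inf[0] + y_inf[0]) / 2, (x_inf[1] + y_inf[1]) / 2))
```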
Theorem 13.5.1. Lp (µ) is uniformly convex for 1 < p < ∞.
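Proof. The norm $\|\cdot\|_p$ in $\mathbb R^2$ is strictly convex for $1<p<\infty$. A compactness argument shows that for every $t>0$
$$ \delta(t)=1-\sup\Big\{\Big|\frac{x+y}{2}\Big|^p:\ |x|^p+|y|^p=1,\ |x-y|\ge t\Big\}>0. $$
Using homogeneity we get that if $|a-b|^p\ge t^p(|a|^p+|b|^p)$ then
$$ \Big|\frac{a+b}{2}\Big|^p\le(1-\delta(t))\,\frac{|a|^p+|b|^p}{2}. $$
Assume $f,g\in L^p(\mu)$ with $\|f\|_p=\|g\|_p=1$ and $\|f-g\|_p\ge t$. Consider the set
$$ A=\Big\{|f-g|^p\ge\frac{t^p}{4}(|f|^p+|g|^p)\Big\}. $$
Note that
$$ \int_{A^c}|f-g|^p\,d\mu\le\frac{t^p}{4}\int_{A^c}(|f|^p+|g|^p)\,d\mu\le\frac{t^p}{2} $$
and therefore
$$ \int_A|f-g|^p\,d\mu\ge\frac{t^p}{2}. $$
We get the following inequality we will use soon
$$ \int_A\frac{|f|^p+|g|^p}{2}\,d\mu\ge\int_A\Big(\frac{|f|+|g|}{2}\Big)^p\,d\mu\ge\int_A\Big|\frac{f-g}{2}\Big|^p\,d\mu\ge\frac{t^p}{2^p\cdot2}. $$
The uniform convexity follows easily now
$$ 1-\Big\|\frac{f+g}{2}\Big\|_p^p\ge\int\Big(\frac{|f|^p+|g|^p}{2}-\Big|\frac{f+g}{2}\Big|^p\Big)\,d\mu\ge\int_A\Big(\frac{|f|^p+|g|^p}{2}-\Big|\frac{f+g}{2}\Big|^p\Big)\,d\mu\ge\delta(t)\int_A\frac{|f|^p+|g|^p}{2}\,d\mu\ge\frac{t^p\delta(t)}{2^{p+1}}. $$
Indeed,
$$ \sup\Big\{\Big\|\frac{f+g}{2}\Big\|_p:\ \|f\|_p=\|g\|_p=1,\ \|f-g\|_p\ge t\Big\}\le\Big(1-\frac{t^p\delta(t)}{2^{p+1}}\Big)^{1/p}<1 $$
as wished.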
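and so
$$ \Big|\frac{a+b}{2}\Big|^p+\Big|\frac{a-b}{2}\Big|^p\le\Big(\frac{a^2+b^2}{2}\Big)^{p/2}\le\frac{|a|^p+|b|^p}{2} $$
because of the convexity of $t\mapsto t^{p/2}$. Now, if $f,g\in L^p(\mu)$ with $\|f\|_p=\|g\|_p=1$ then
$$ \int\Big|\frac{f+g}{2}\Big|^p\,d\mu+\int\Big|\frac{f-g}{2}\Big|^p\,d\mu\le\frac12\Big(\int|f|^p\,d\mu+\int|g|^p\,d\mu\Big)=1 $$
and thus
$$ \Big\|\frac{f+g}{2}\Big\|_p^p\le1-\Big\|\frac{f-g}{2}\Big\|_p^p. $$
It follows
$$ \Big\|\frac{f+g}{2}\Big\|_p\le\Big(1-\Big(\frac t2\Big)^p\Big)^{1/p} $$
if $\|f-g\|_p\ge t$. Therefore $L^p(\mu)$ is uniformly convex with modulus of uniform convexity
$$ \delta_{L^p(\mu)}(t)\ge1-\Big(1-\Big(\frac t2\Big)^p\Big)^{1/p}\sim\frac{t^p}{p\,2^p} $$
when $t\sim0$.
The case $1<p<2$ is trickier and it is known that $\delta_{L^p(\mu)}(t)\sim c_pt^2$.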
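The scalar inequality behind the case $p\ge2$ and the asymptotic behaviour of the resulting bound can both be checked numerically. The exponents and sample ranges in this sketch are arbitrary choices.

```python
import random

def clarkson_gap(a, b, p):
    """(|a|^p + |b|^p)/2 - |(a+b)/2|^p - |(a-b)/2|^p, nonnegative for p >= 2."""
    lhs = abs((a + b) / 2.0) ** p + abs((a - b) / 2.0) ** p
    rhs = (abs(a) ** p + abs(b) ** p) / 2.0
    return rhs - lhs

random.seed(1)
worst_gap = min(
    clarkson_gap(random.uniform(-3, 3), random.uniform(-3, 3), p)
    for p in (2.0, 2.5, 4.0, 7.0)
    for _ in range(2000)
)   # for p = 2 the gap is zero up to rounding (parallelogram law)

def lower_bound(t, p):
    """The bound 1 - (1 - (t/2)^p)^(1/p) for the modulus of L^p, p >= 2."""
    return 1.0 - (1.0 - (t / 2.0) ** p) ** (1.0 / p)

p, t = 3.0, 0.01
ratio = lower_bound(t, p) / (t ** p / (p * 2.0 ** p))   # close to 1 for small t
```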
Chapter 14
Appendix C: Introduction to
Lagrangian and Hamiltonian
mechanics
253
cartesian and generalized coordinates. Put $q=(q_1,\dots,q_n)$ and let $\dot q=(\dot q_1,\dots,\dot q_n)$ be its derivatives with respect to time, which we will call generalized speeds. As we said above
$$ x_i=x_i(q,t) $$
so the derivative with respect to time will be of the form
$$ \dot x_i=\dot x_i(q,\dot q,t). $$
The explicit expression can be found using the chain rule
$$ \dot x_i=\sum_j\frac{\partial x_i}{\partial q_j}\dot q_j+\frac{\partial x_i}{\partial t}. $$
Note the difference between the total derivative with respect to $t$, which concerns an actual movement, and the partial derivative with respect to $t$, which expresses a constraint of the system that depends on time. On the other hand, we may consider theoretical variations of coordinates keeping $t$ constant, which are called virtual displacements.
Lemma 14.1.1. The following relations hold
$$ \frac{\partial\dot x_i}{\partial\dot q_j}=\frac{\partial x_i}{\partial q_j}; $$
$$ \frac{\partial\dot x_i}{\partial q_j}=\frac{d}{dt}\frac{\partial x_i}{\partial q_j}. $$
Proof. We have
$$ \dot x_i=\sum_k\frac{\partial x_i}{\partial q_k}\dot q_k+\frac{\partial x_i}{\partial t} $$
from which the first formula follows trivially. For the second one, just compare these expressions
$$ \frac{\partial\dot x_i}{\partial q_j}=\sum_k\frac{\partial^2x_i}{\partial q_k\partial q_j}\dot q_k+\frac{\partial^2x_i}{\partial t\,\partial q_j} $$
and, by the chain rule,
$$ \frac{d}{dt}\frac{\partial x_i}{\partial q_j}=\sum_k\frac{\partial^2x_i}{\partial q_j\partial q_k}\dot q_k+\frac{\partial^2x_i}{\partial q_j\,\partial t} $$
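Both relations of the lemma can be checked by finite differences on a concrete change of variables. The sketch assumes, purely for illustration, polar coordinates measured from a line rotating with a hypothetical angular speed `OMEGA`, so that $x=r\cos(\theta+\omega t)$ depends explicitly on time; the total derivative `x_dot` is written by hand from the chain rule.

```python
import math

OMEGA = 0.5  # hypothetical angular speed of the rotating reference line

def x(r, theta, t):
    """Cartesian abscissa in terms of the generalized coordinates (r, theta)."""
    return r * math.cos(theta + OMEGA * t)

def x_dot(r, theta, r_dot, theta_dot, t):
    """Total time derivative of x, obtained by hand from the chain rule."""
    angle = theta + OMEGA * t
    return r_dot * math.cos(angle) - r * math.sin(angle) * (theta_dot + OMEGA)

def diff(fun, s, h):
    """Central difference approximation of fun'(s)."""
    return (fun(s + h) - fun(s - h)) / (2.0 * h)

r0, th0, rd, thd, t0 = 1.3, 0.4, 0.2, -0.7, 0.9

# First relation with q_j = theta: d x_dot / d theta_dot = d x / d theta.
lhs1 = diff(lambda w: x_dot(r0, th0, rd, w, t0), thd, 1e-5)
rhs1 = diff(lambda a: x(r0, a, t0), th0, 1e-5)

# Second relation with q_j = theta, along the motion q(t) = q0 + q_dot (t - t0).
def dx_dtheta_along_motion(t):
    r = r0 + rd * (t - t0)
    th = th0 + thd * (t - t0)
    return diff(lambda a: x(r, a, t), th, 1e-5)

lhs2 = diff(lambda a: x_dot(r0, a, rd, thd, t0), th0, 1e-5)
rhs2 = diff(dx_dtheta_along_motion, t0, 1e-3)
```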
14.2 Forces, work and energy
The force acting on a particle is a quantitative description of its interaction with the rest of the universe. We may distinguish between the interaction with other particles of the system and interactions with objects from outside the system, that is, internal forces and external forces. We assume that the principle of superposition holds, that is, that the interactions are additive and so
the total force applied on a particle is the sum of all the individual forces (one
for each interaction) applied on it.
The work done by a force, which only depends on the configuration of the system, along the displacement of a particle is independent of time in the sense that the speed of the particle does not matter for the computation. Therefore, we may consider the work done by forces in virtual displacements. As the force is variable, it is convenient to use a differential expression that physicists like to interpret in terms of infinitesimals. In order to tell them apart from real displacements done as time goes by, we will use $\delta$ instead of $d$ for differentials. The differential of work expressed in cartesian coordinates appears as
$$ \delta W=\sum_if_i\,\delta x_i $$
and then
$$ \delta W=\sum_if_i\Big(\sum_j\frac{\partial x_i}{\partial q_j}\delta q_j\Big)=\sum_j\Big(\sum_if_i\frac{\partial x_i}{\partial q_j}\Big)\delta q_j=\sum_jQ_j\,\delta q_j $$
where $Q_j=\sum_if_i\frac{\partial x_i}{\partial q_j}$ are called the generalized components of the force. One important task is to determine the forces acting on a given system.
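As an illustration of the formula $Q_j=\sum_if_i\,\partial x_i/\partial q_j$, the sketch below computes the generalized components in polar coordinates of a central elastic force in the plane. The force law, the constant `K` and the sample point are assumptions chosen for the example; the partial derivatives are approximated by central differences.

```python
import math

def cartesian(q):
    """Change of variables (r, theta) -> (x, y)."""
    r, theta = q
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian(q, h=1e-6):
    """Matrix of partial derivatives d x_i / d q_j by central differences."""
    rows = []
    for i in range(2):
        row = []
        for j in range(2):
            q_plus, q_minus = list(q), list(q)
            q_plus[j] += h
            q_minus[j] -= h
            row.append((cartesian(q_plus)[i] - cartesian(q_minus)[i]) / (2 * h))
        rows.append(row)
    return rows

def generalized_force(f, q):
    """Q_j = sum_i f_i * d x_i / d q_j."""
    J = jacobian(q)
    return [sum(f[i] * J[i][j] for i in range(2)) for j in range(2)]

K = 3.0                       # hypothetical elastic constant
q = (2.0, 0.7)                # sample point: r = 2, theta = 0.7
px, py = cartesian(q)
f = (-K * px, -K * py)        # central elastic force in cartesian components

Q_r, Q_theta = generalized_force(f, q)   # approximately (-K r, 0)
```

The result agrees with the potential $V=\frac12Kr^2$ of this force: $-\partial V/\partial r=-Kr$ and $-\partial V/\partial\theta=0$.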
In the very important case that the force is conservative (the work done between two points does not depend on the trajectory) the differential form $\delta W$ is exact and there exists a function $V=V(q)$ such that $Q_j=-\frac{\partial V}{\partial q_j}$, where the minus sign is taken for the sake of the physical interpretation of $V$ as potential energy. The equation we will prove for the movement admits more general potentials that could possibly depend on the speed too.
If the masses of the particles are suitably enumerated, the kinetic energy of the system is defined as
$$ T=\frac12\sum_im_i\dot x_i^2. $$
Although the energy depends only on the cartesian speeds, in generalized coordinates it depends on $\{q,\dot q,t\}$. However, if $t$ does not appear explicitly in the change of variables, then $T$ is a quadratic function with respect to $\dot q$. Let us finish with the following easy fact.
Lemma 14.2.1.
$$ m_i\ddot x_i=\frac{d}{dt}\frac{\partial T}{\partial\dot x_i}. $$
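for $1\le j\le n$. Moreover, if the system is holonomic the computation of $Q_j$ does not include the constraint forces.
Proof. Consider the following chain of equalities where the previous lemmata are applied together with a trick based on the derivative of a product
$$ Q_j=\sum_if_i\frac{\partial x_i}{\partial q_j}=\sum_im_i\ddot x_i\frac{\partial x_i}{\partial q_j}=\sum_i\frac{d}{dt}\Big(\frac{\partial T}{\partial\dot x_i}\Big)\frac{\partial x_i}{\partial q_j} $$
$$ =\sum_i\frac{d}{dt}\Big(\frac{\partial T}{\partial\dot x_i}\frac{\partial x_i}{\partial q_j}\Big)-\sum_i\frac{\partial T}{\partial\dot x_i}\frac{d}{dt}\frac{\partial x_i}{\partial q_j} $$
$$ =\sum_i\frac{d}{dt}\Big(\frac{\partial T}{\partial\dot x_i}\frac{\partial\dot x_i}{\partial\dot q_j}\Big)-\sum_i\frac{\partial T}{\partial\dot x_i}\frac{\partial\dot x_i}{\partial q_j}=\frac{d}{dt}\frac{\partial T}{\partial\dot q_j}-\frac{\partial T}{\partial q_j} $$
which gives the desired equality. The observation about constraint forces is a consequence of the fact that their generalized components with respect to such a set of coordinates are 0.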
Theorem 14.4.1. Every mechanical system is characterized by a function $L(q,\dot q,t)$ in such a way that for any initial configuration the trajectories of the system in time satisfy the equations
$$ \frac{d}{dt}\frac{\partial L}{\partial\dot q_j}-\frac{\partial L}{\partial q_j}=0 $$
for $1\le j\le n$, where $n$ is the number of degrees of freedom of the system.
From now on we will assume that the Lagrange equations characterize the evolution of the system once we know its Lagrangian; however, the derivation from Newton’s laws to this form was done under specific hypotheses (holonomy, potentials depending only on positions. . . ).
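The Lagrange equations can be tested numerically on a known motion. The sketch assumes a one-dimensional harmonic oscillator with Lagrangian $L=\frac12m\dot q^2-\frac12kq^2$ and its exact solution $q(t)=a\cos(\omega t)$, $\omega=\sqrt{k/m}$; the constants are hypothetical, and the derivatives of $L$ are approximated by central differences.

```python
import math

M, K, AMPLITUDE = 2.0, 8.0, 1.5       # hypothetical mass, stiffness, amplitude
OMEGA = math.sqrt(K / M)

def lagrangian(q, q_dot, t):
    """L = kinetic energy minus potential energy of the oscillator."""
    return 0.5 * M * q_dot ** 2 - 0.5 * K * q ** 2

def q_of(t):
    return AMPLITUDE * math.cos(OMEGA * t)

def q_dot_of(t):
    return -AMPLITUDE * OMEGA * math.sin(OMEGA * t)

def diff(fun, s, h):
    """Central difference approximation of fun'(s)."""
    return (fun(s + h) - fun(s - h)) / (2.0 * h)

def dL_dq(t):
    return diff(lambda a: lagrangian(a, q_dot_of(t), t), q_of(t), 1e-5)

def dL_dq_dot(t):
    return diff(lambda b: lagrangian(q_of(t), b, t), q_dot_of(t), 1e-5)

def lagrange_residual(t):
    """d/dt (dL/dq_dot) - dL/dq evaluated along the motion."""
    return diff(dL_dq_dot, t, 1e-3) - dL_dq(t)

worst_residual = max(abs(lagrange_residual(0.1 * i)) for i in range(1, 50))
```

The residual is small compared with the size of each term, which is of order $m\,a\,\omega^2$.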
amongst all the other smooth trajectories between the same configurations.
Proof. For simplicity we will assume that the system has only one degree of freedom. Actually, this computation was done in Chapter 4; nevertheless we will repeat it with the notation from Mechanics. From now on $(q,\dot q,t)$ denotes the trajectory followed by the system, so $(q(t_i),\dot q(t_i))=(q_i,\dot q_i)$ for $i=1,2$. In order to show the extremality of the real trajectory we will consider $C^2$ perturbations of the form $h(t)$ such that it and $\dot h(t)$ vanish at $t_1,t_2$. Extremality implies that the directional derivative
$$ \frac{d}{ds}\int_{t_1}^{t_2}L(q+sh,\dot q+s\dot h,t)\,dt $$
must be 0 at $s=0$. The parametric derivation can be performed under the integral sign this way
$$ \int_{t_1}^{t_2}\Big(\frac{\partial L}{\partial q}(q+sh,\dot q+s\dot h,t)\,h+\frac{\partial L}{\partial\dot q}(q+sh,\dot q+s\dot h,t)\,\dot h\Big)\,dt. $$
Integration by parts gives
$$ \int_{t_1}^{t_2}\frac{\partial L}{\partial\dot q}(q+sh,\dot q+s\dot h,t)\,\dot h\,dt=\Big[\frac{\partial L}{\partial\dot q}(q+sh,\dot q+s\dot h,t)\,h\Big]_{t_1}^{t_2}-\int_{t_1}^{t_2}\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot q}(q+sh,\dot q+s\dot h,t)\Big)h\,dt $$
$$ =-\int_{t_1}^{t_2}\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot q}(q+sh,\dot q+s\dot h,t)\Big)h\,dt. $$
Using this information above we get
$$ \frac{d}{ds}\int_{t_1}^{t_2}L(q+sh,\dot q+s\dot h,t)\,dt=\int_{t_1}^{t_2}\Big(\frac{\partial L}{\partial q}(q+sh,\dot q+s\dot h,t)-\frac{d}{dt}\frac{\partial L}{\partial\dot q}(q+sh,\dot q+s\dot h,t)\Big)h\,dt $$
and the vanishing of this at $s=0$ implies
$$ \int_{t_1}^{t_2}\Big(\frac{\partial L}{\partial q}(q,\dot q,t)-\frac{d}{dt}\frac{\partial L}{\partial\dot q}(q,\dot q,t)\Big)h\,dt=0 $$
for all the perturbations $h$ satisfying the required assumptions. As it is possible to take $h$ with support as small as we wish contained in $(t_1,t_2)$ we deduce that
$$ \frac{\partial L}{\partial q}(q,\dot q,t)-\frac{d}{dt}\frac{\partial L}{\partial\dot q}(q,\dot q,t)=0 $$
for $t\in(t_1,t_2)$ as we wanted.
Conservation laws are always a first step for the integration, or at least simplification, of the equations of the movement. The conservation of linear momentum and angular momentum in Newtonian mechanics can be generalized in the following fashion. Associated to a generalized coordinate $q_j$ we will consider the generalized momentum
$$ p_j=\frac{\partial L}{\partial\dot q_j}. $$
We have the following.
Proposition 14.4.3. If the Lagrangian L does not contain explicitly the co-
ordinate qj then pj remains constant along the trajectory of the system.
Proof.
$$ \frac{dp_j}{dt}=\frac{d}{dt}\frac{\partial L}{\partial\dot q_j}=\frac{\partial L}{\partial q_j}=0. $$
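We can prove the following principle, which is somehow more general than the assumptions of the preceding computation.
Theorem 14.5.1. If $L$ does not contain explicitly the time, then the following magnitude remains constant along the trajectories of the system
$$ H=\sum_jp_j\dot q_j-L. $$
Proof.
$$ \frac{dH}{dt}=\sum_j\frac{dp_j}{dt}\dot q_j+\sum_jp_j\ddot q_j-\frac{dL}{dt}= $$
$$ \sum_j\frac{d}{dt}\Big(\frac{\partial L}{\partial\dot q_j}\Big)\dot q_j+\sum_j\frac{\partial L}{\partial\dot q_j}\ddot q_j-\sum_j\frac{\partial L}{\partial q_j}\dot q_j-\sum_j\frac{\partial L}{\partial\dot q_j}\ddot q_j= $$
$$ \sum_j\Big(\frac{d}{dt}\frac{\partial L}{\partial\dot q_j}-\frac{\partial L}{\partial q_j}\Big)\dot q_j=0 $$
as claimed.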
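The conservation of $H$ can be observed on the simplest time-independent Lagrangian. The sketch again assumes a harmonic oscillator $L=\frac12m\dot q^2-\frac12kq^2$ with hypothetical constants and evaluates $H=p\dot q-L$ along the exact motion $q(t)=a\cos(\omega t)$.

```python
import math

M, K, AMPLITUDE = 2.0, 8.0, 1.5       # hypothetical mass, stiffness, amplitude
OMEGA = math.sqrt(K / M)

def hamiltonian_along_motion(t):
    """H = p q_dot - L evaluated on q(t) = a cos(omega t)."""
    q = AMPLITUDE * math.cos(OMEGA * t)
    q_dot = -AMPLITUDE * OMEGA * math.sin(OMEGA * t)
    p = M * q_dot                                  # p = dL/dq_dot
    L = 0.5 * M * q_dot ** 2 - 0.5 * K * q ** 2
    return p * q_dot - L

values = [hamiltonian_along_motion(0.05 * i) for i in range(200)]
spread = max(values) - min(values)                 # zero up to rounding
expected_energy = 0.5 * K * AMPLITUDE ** 2         # total energy k a^2 / 2
```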
If the matrix whose coefficients are
$$ \Big(\frac{\partial^2L}{\partial\dot q_i\,\partial\dot q_j}\Big)_{i,j} $$
has non-null determinant, which is a plausible hypothesis regarded from the point of view of the kinetic energy, then for the Jacobian we have
$$ \frac{\partial(p_1,p_2,\dots,p_n)}{\partial(\dot q_1,\dot q_2,\dots,\dot q_n)}\ne0 $$
because the matrix is the same. That implies the possibility of replacing the set of generalized speeds $\dot q$ by the generalized momenta $p$.
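so
$$ \frac{\partial H}{\partial p_j}=\sum_ip_i\frac{\partial\dot q_i}{\partial p_j}+\dot q_j-\frac{\partial L}{\partial p_j}=\sum_ip_i\frac{\partial\dot q_i}{\partial p_j}+\dot q_j-\sum_i\frac{\partial L}{\partial\dot q_i}\frac{\partial\dot q_i}{\partial p_j}=\frac{dq_j}{dt} $$
as wanted.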
That implies the $2n$-dimensional volume is preserved by the flow of the system (as we did in dimension 3, see Section 11.3), that is, interpreting the Hamiltonian field $\big(\frac{\partial H}{\partial p},-\frac{\partial H}{\partial q}\big)$ as the velocity field of a fluid. That leads to the following result of Liouville.
Theorem 14.5.3. Given a mechanical system, its Hamiltonian flow preserves volumes in the phase space. In particular, given an open set $D$ in $(q,p)$ which is composed of initial states, let $D_t$ be the evolution of those states after a time $t>0$. Then the $2n$-dimensional volume of $D_t$ remains constant and equal to the volume of $D$.
A well-known application is the so-called Poincaré’s recurrence theorem: if the orbits of the system are confined in a bounded set then any initial state will be arbitrarily approximated by the evolution of the system after some time. We will finish with a version of the uncertainty principle for mechanical systems. For simplicity consider only one degree of freedom. Assume that the position $q$ and momentum $p$ are known with some errors $\Delta q(0)$ and $\Delta p(0)$ at the beginning. Then after some time the combined uncertainty does not decrease
$$ \Delta q(t)\Delta p(t)\ge\Delta q(0)\Delta p(0). $$
Indeed, the product of the uncertainties $\Delta q(t)\Delta p(t)$ represents the area of a rectangle that contains the evolution through the flow of the system of the rectangle of sides $\Delta q(0)$ and $\Delta p(0)$ that contains all the possible initial states. It is somehow surprising that Classical Mechanics anticipates Heisenberg’s uncertainty principle.
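Both the preservation of area and the inequality for the uncertainties can be seen explicitly for a harmonic oscillator, whose Hamiltonian flow is linear. The sketch assumes $H=\frac{p^2}{2m}+\frac{kq^2}{2}$ with hypothetical constants and an arbitrary initial rectangle; since the flow is linear, the image of the rectangle is the parallelogram spanned by the images of its vertices.

```python
import math

M, K = 2.0, 8.0                      # hypothetical mass and stiffness
OMEGA = math.sqrt(K / M)

def flow(q, p, t):
    """Exact solution of q' = p/M, p' = -K q starting at (q, p)."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return (q * c + p * s / (M * OMEGA), -M * OMEGA * q * s + p * c)

def polygon_area(points):
    """Shoelace formula for a simple polygon given by its vertices in order."""
    n = len(points)
    twice = sum(points[i][0] * points[(i + 1) % n][1]
                - points[(i + 1) % n][0] * points[i][1] for i in range(n))
    return abs(twice) / 2.0

def bounding_box_area(points):
    """Product of the uncertainties: area of the smallest enclosing rectangle."""
    qs = [pt[0] for pt in points]
    ps = [pt[1] for pt in points]
    return (max(qs) - min(qs)) * (max(ps) - min(ps))

rectangle = [(1.0, 0.5), (1.2, 0.5), (1.2, 0.9), (1.0, 0.9)]  # initial states
initial_area = polygon_area(rectangle)

times = (0.3, 1.0, 2.5)
images = [[flow(q, p, t) for q, p in rectangle] for t in times]
areas = [polygon_area(img) for img in images]
uncertainty_products = [bounding_box_area(img) for img in images]
```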