Measure Intro
Such a statement is definitely not true for the Riemann integral. Indeed if one takes
$f_n(x)$ to be the function on $[0, 1]$ defined by $f_n(x) := 1$ if $x$ is a rational with denominator
$\le n$ and $f_n(x) = 0$ otherwise then each $f_n$ has Riemann integral zero. Furthermore we
have $f_n \to f$ pointwise, where $f$ is the characteristic function of $[0, 1] \cap \mathbf{Q}$. This limit
function is not Riemann integrable.
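The following is a minimal sketch of this example, not part of the original notes; the helper names `support_points` and `upper_sum` are hypothetical. It lists the finitely many points at which $f_n$ equals 1 and computes the upper Riemann sum of $f_n$ on a uniform partition of $[0, 1]$ into $m$ pieces. Each support point meets at most two pieces, so the upper sum is at most twice the number of support points divided by $m$, which tends to zero as $m$ grows. The lower sums are always zero, since every piece contains a point outside the support.

```python
from fractions import Fraction

def support_points(n):
    """Rationals in [0, 1] with denominator at most n (Fraction reduces to lowest terms)."""
    return sorted({Fraction(a, q) for q in range(1, n + 1) for a in range(q + 1)})

def upper_sum(n, m):
    """Upper Riemann sum of f_n over the uniform partition of [0, 1] into m pieces.

    The supremum of f_n on a closed subinterval is 1 if the subinterval
    contains a support point and 0 otherwise.
    """
    pts = support_points(n)
    hit = 0
    for j in range(m):
        lo, hi = Fraction(j, m), Fraction(j + 1, m)
        if any(lo <= p <= hi for p in pts):
            hit += 1
    return Fraction(hit, m)

if __name__ == "__main__":
    for m in (10, 100, 1000):
        print(m, upper_sum(3, m))
```

For fixed $n$ the printed upper sums shrink as the partition is refined, which is consistent with each $f_n$ having Riemann integral zero. No such computation can succeed for the limit function, whose upper sums are all 1 and whose lower sums are all 0.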
Constructing interesting measures. The hardest results in measure theory, slightly
embarrassingly, are required to construct interesting examples of measures. In this
course the underlying set X will have the structure of a compact metric space with
metric d, and we want to be able to compute the volume of the most obvious sets,
namely the open balls B(x, r) := {y ∈ X : d(x, y) < r}. The smallest σ-algebra
containing all the open balls is called the Borel σ-algebra and is generally denoted by
B. Elements of B are called Borel sets; amongst them are countable unions of closed
sets (called $F_\sigma$-sets) and countable intersections of open sets (called $G_\delta$-sets). As we
showed earlier in the course, X is separable. This means that B contains the open sets
of X (in fact this is usually the definition of B).
Theorem 0.2 (Lebesgue measure on $\mathbf{R}/\mathbf{Z}$). There is a unique probability measure $\mu$ on
$\mathbf{R}/\mathbf{Z}$ such that $\mu((a, b)) = b - a$ for all $0 \le a \le b \le 1$.
The existence of Lebesgue measure is a special case of a much more general theorem,
the Riesz Representation theorem.
Theorem 0.3 (Riesz representation theorem). Let $X$ be a compact metric space. Then
the probability measures on the Borel σ-algebra B are in one-to-one correspondence
with positive linear functionals $\Lambda : C(X) \to \mathbf{R}$ with $\Lambda 1 = 1$ via the correspondence
$\Lambda f \leftrightarrow \int f \, d\mu$.
The Riesz representation theorem has two directions. The statement that every
probability measure $\mu$ gives rise to a linear functional $\Lambda$ is relatively straightforward, and
is essentially just the construction of the integral $\int f \, d\mu$. The proof of the other direction
is rather long; this is where the measure µ actually gets constructed. The key idea is to
use the notion of regularity to define $\mu$. One first defines $\mu$ on open sets (and hence, by
complementation, on closed sets) by setting $\mu(U) := \sup\{\Lambda f : f \in C(X),\ 0 \le f \le 1_U\}$.
One then defines $\mu^+$ of an arbitrary subset $A \subseteq X$ to be $\inf\{\mu(U) : A \subseteq U,\ U \text{ open}\}$ and
$\mu^-(A)$ to be $\sup\{\mu(K) : K \subseteq A,\ K \text{ closed}\}$. One then shows that the collection of sets
$A \subseteq X$ for which $\mu^-(A) = \mu^+(A)$ is a σ-algebra B̃ containing B, and that $\mu = \mu^+ = \mu^-$
is a measure on B̃ with the property that $\Lambda f = \int f \, d\mu$ for $f \in C(X)$. There is much to
be checked: it is not an easy theorem.
There is another important respect in which the Riesz representation theorem gives
something stronger than we have stated. The measure µ is actually a measure on a
σ-algebra B̃ which, in general, is strictly larger than the Borel σ-algebra B. It has the
additional property that if A ∈ B̃ and µ(A) = 0 then any subset of A is also in B̃. This
property need not hold for B itself: when X = [−1, 1] for example one may employ
a cardinality argument to establish that there are subsets of the Cantor set on [−1, 1]
which do not lie in B. This is a technical point and we will not need to dwell on it in
the course.
Limits of measures. The Riesz representation theorem tells us that probability measures
on X are in 1-1 correspondence with elements Λ ∈ C(X)∗ which are positive and
normalised so that Λ1 = 1. This is, in particular, a convex subset of C(X)∗ . Write
M(X) for the space of regular Borel probability measures on X. The identification of M(X) with
a convex subset of C(X)∗ makes it much easier to study the former object, and in
particular to discuss limits of measures.
Proof. By the Riesz representation theorem it suffices to prove that the closed unit ball
of $C(X)^*$ is compact in the weak topology: that is, if $\Lambda_n : C(X) \to \mathbf{R}$ are functionals with
$\|\Lambda_n\| \le 1$ then there is some subsequence $(\Lambda_{n_k})_{k=1}^\infty$ which tends weakly to a functional
$\Lambda$ in the sense that $\Lambda_{n_k} f \to \Lambda f$ for all $f \in C(X)$.
The statement that the closed unit ball of C(X)∗ is compact in the weak topology
is known as the Banach-Alaoglu theorem, and it is usually proved via Tychonov’s theorem.
In our setting, where X is a compact metric space, a more direct and vaguely
constructive proof using a diagonalisation argument is possible. One might compare
this with the rather simpler diagonal argument we used to prove that ΛZ is sequentially
compact.
We begin by recalling that C(X) is separable (has a countable dense subset). This
follows from a version of the Stone-Weierstrass theorem.
Take, then, a countable dense collection of functions $f_1, f_2, \dots$ in $C(X)$. Consider
the sequence $\Lambda_1 f_1, \Lambda_2 f_1, \dots$. We may find a subsequence $(n_{1,i})_{i=1}^\infty$ of $\mathbf{N}$ such that the
sequence $(\Lambda_{n_{1,i}} f_1)_{i=1}^\infty$ converges. We may then pass to a further subsequence $(n_{2,i})_{i=1}^\infty$
such that the sequence $(\Lambda_{n_{2,i}} f_2)_{i=1}^\infty$ converges, and so on. Set $n_i := n_{i,i}$. Then the
diagonal sequence $(n_i)_{i=1}^\infty$ has the property that $(\Lambda_{n_i} f_k)_{i=1}^\infty$ converges for all $k$. Define
$\Lambda f_k := \lim_{i \to \infty} \Lambda_{n_i} f_k$ for $k = 1, 2, \dots$. We extend this to a map $\Lambda : C(X) \to \mathbf{R}$ by
defining $\Lambda f := \lim_{j \to \infty} \Lambda f_{k_j}$, for any sequence $(k_j)_{j=1}^\infty$ such that $f_{k_j} \to f$ in $C(X)$.
We claim that $\Lambda \in B_1(C(X)^*)$ and that $\Lambda_{n_i} \to \Lambda$ in the weak topology. There is
much to prove here (for example we have not yet shown that $\Lambda$ is well-defined, still less
that it is a bounded linear functional). This is a somewhat tedious task which we leave
to the reader.
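As a hedged sketch of how the omitted verification might begin (one possible route, not necessarily the argument the notes have in mind), the uniform bound $\|\Lambda_{n_i}\| \le 1$ passes to the limit on the dense family, and this is what makes the extension well-defined. The norm $\|\cdot\|_\infty$ below is the supremum norm on $C(X)$.

```latex
% Step 1: on the dense family, linearity of each \Lambda_{n_i} and the bound
% \|\Lambda_{n_i}\| \le 1 give, for all indices k and l,
\[
  |\Lambda f_k - \Lambda f_l|
  = \lim_{i \to \infty} |\Lambda_{n_i}(f_k - f_l)|
  \le \|f_k - f_l\|_\infty .
\]
% Step 2: if f_{k_j} \to f in C(X), then (\Lambda f_{k_j})_j is Cauchy, so the limit
% defining \Lambda f exists. Interleaving two approximating sequences shows that
% the limit does not depend on the choice of sequence. Passing to the limit in
% Step 1 also gives |\Lambda f_k - \Lambda f| \le \|f_k - f\|_\infty.

% Step 3: weak convergence along the diagonal sequence follows from a
% three-term estimate, valid for every k:
\[
  |\Lambda_{n_i} f - \Lambda f|
  \le |\Lambda_{n_i}(f - f_k)| + |\Lambda_{n_i} f_k - \Lambda f_k| + |\Lambda f_k - \Lambda f|
  \le 2\|f - f_k\|_\infty + |\Lambda_{n_i} f_k - \Lambda f_k| .
\]
% Choose k with \|f - f_k\|_\infty small, then let i \to \infty.
```

Linearity of $\Lambda$ and the bound $\|\Lambda\| \le 1$ can then be checked by passing to the limit in the corresponding statements for the $\Lambda_{n_i}$.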
$L^p$-spaces and $L^p$-norms. One of the most important things that measure theory
allows us to do is to define these spaces of functions. Let $X$ be a compact metric space
with a regular Borel probability measure $\mu$. If $f : X \to \mathbf{C}$ is a function we define
\[ \|f\|_p := \Big( \int |f|^p \, d\mu \Big)^{1/p} \]
for $1 \le p < \infty$ and $\|f\|_\infty$ to be the essential supremum of $f$, that is to say the infimum
of all those numbers $M$ such that $|f(x)| \le M$ outside of a set of measure zero (almost
everywhere).
These objects $\|\cdot\|_p$ satisfy the triangle inequality $\|f + g\|_p \le \|f\|_p + \|g\|_p$ and also the
identity $\|\lambda f\|_p = |\lambda| \|f\|_p$ for complex scalars $\lambda$. This qualifies them as seminorms;
they are not fully-fledged norms because it is possible to have $\|f\|_p = 0$ without $f$ being
zero. This is the case if, and only if, $f$ vanishes almost everywhere. We write $L^p(X)$ for
the space of all measurable functions $f : X \to \mathbf{C}$ with $\|f\|_p < \infty$, quotiented out by the
equivalence relation of being “equal almost everywhere”. One does not introduce any
special notation for these equivalence classes, abusing notation by regarding $L^p(X)$ as
a space of functions.
The $L^p$-norms are nested: $\|f\|_p \le \|f\|_{p'}$ whenever $1 \le p \le p' \le \infty$. This follows from
Hölder’s inequality and it is important here that $\mu$ is a probability measure, that is to
say $\int 1 \, d\mu = 1$.
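The nesting can be checked numerically on a toy example. The sketch below is an illustration only and assumes a finite probability space, which is not the setting of the notes; the function name `lp_norm` and the sample data are hypothetical. One point is given measure zero to show that the essential supremum ignores it.

```python
def lp_norm(values, weights, p):
    """L^p norm of a function on a finite probability space.

    values[i] is the value of f at the i-th point and weights[i] is the mass
    of that point; the weights are assumed non-negative and to sum to 1.
    """
    if p == float("inf"):
        # Essential supremum: points of measure zero are ignored.
        return max(abs(v) for v, w in zip(values, weights) if w > 0)
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

values = [3.0, -1.0, 0.5, 100.0]
weights = [0.5, 0.25, 0.25, 0.0]   # the last point has measure zero
exponents = (1, 2, 4, float("inf"))
norms = [lp_norm(values, weights, p) for p in exponents]

if __name__ == "__main__":
    for p, value in zip(exponents, norms):
        print(p, value)
```

The printed norms increase with $p$. The normalisation matters: for the constant function 1 on a space of total mass 2 one has $\|1\|_1 = 2 > 1 = \|1\|_\infty$, so the inequality can fail when $\mu$ is not a probability measure.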
A very important fact about the Lp (X) spaces is that they are complete (and hence
each one is a Banach space). The truth of this statement is a very important justification
for introducing measures.
A VERY BRIEF REVIEW OF MEASURE THEORY 5
Let us briefly discuss the three principles in turn. A precise version of the first follows
immediately from the definition of a regular measure, viz that $\mu(E) = \inf_{E \subseteq U} \mu(U)$ for
any measurable set $E$, the infimum being taken over all open sets $U$ containing $E$. Thus
for any $\varepsilon > 0$, there is an open set $U$ whose symmetric difference with $E$ has measure
less than $\varepsilon$.
The second point refers to Lusin’s theorem. If f is any measurable function and if
ε > 0, this states that there is a continuous function g ∈ C(X) such that f (x) = g(x)
except on a set of measure at most ε. Note that it does not assert that f is continuous
except on a set of measure ε.
The third point refers to Egorov’s theorem: If $(f_n)$ is a sequence of measurable
functions which converge pointwise on $X$, and if $\varepsilon > 0$, then there is a measurable
set $X' \subseteq X$, $\mu(X \setminus X') \le \varepsilon$, such that the $f_n$ converge uniformly on $X'$.
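A standard illustration, which is not taken from the notes, is $f_n(x) = x^n$ on $X = [0, 1]$ with Lebesgue measure. These functions converge pointwise to the characteristic function of $\{1\}$, but not uniformly on $[0, 1]$. On $X' = [0, 1 - \varepsilon]$, whose complement has measure $\varepsilon$, the convergence is uniform because the supremum of the error is $(1 - \varepsilon)^n$. The sketch below approximates that supremum on a grid; the names `limit` and `sup_error` and the grid size are assumptions made for the illustration.

```python
def limit(x):
    """Pointwise limit of x**n on [0, 1]: the characteristic function of {1}."""
    return 1.0 if x == 1.0 else 0.0

def sup_error(n, eps, grid=10_000):
    """Grid approximation to sup |f_n - f| over X' = [0, 1 - eps], where f_n(x) = x**n."""
    xs = [(1 - eps) * j / grid for j in range(grid + 1)]
    return max(abs(x ** n - limit(x)) for x in xs)

if __name__ == "__main__":
    for n in (10, 50, 200):
        # First column: error on all of [0, 1]; second: error on [0, 0.9].
        print(n, sup_error(n, 0.0), sup_error(n, 0.1))
```

On the whole interval the grid error stays close to 1 for each fixed $n$ once the grid is fine enough, while on $[0, 0.9]$ it decays geometrically.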
Perhaps the most useful application of these principles is the following straightforward
consequence of the second one: the space of continuous functions $C(X)$ is dense in
$L^p(X)$, for all $1 \le p < \infty$.
When the space $X$ has additional structure, one can often pass to an even nicer set
of functions which is dense in $L^p(X)$. If $X$ is a smooth manifold, for example, the space
$C^\infty(X)$ of smooth functions is dense. If $X = \mathbf{R}^d/\mathbf{Z}^d$ is a torus then the trigonometric
polynomials $\sum_{|r| \le R} a_r e^{2\pi i r \cdot \theta}$ are dense (see the example sheet).
Some philosophical remarks by Akshay Venkatesh. I hope that the pleasant properties
of measures presented here, particularly their closure properties under taking limits,
will convince you that they are the “right” objects to consider. Here are some further
remarks, by Akshay Venkatesh, which will make more sense a little later in the course.
What is gained by going through measures? Measures have much better formal properties
than sets. A particularly important difference is that a $T$-invariant probability
measure can be decomposed into “minimal” invariant measures (the ergodic decomposition).
That property does not seem to have a clean analogy at the level of $T$-invariant
closed sets. In particular, although a $T$-invariant closed set always contains a minimal
$T$-invariant closed set, it cannot be decomposed into such sets in any obvious way.
Further reading. I rather like the introduction to Lebesgue measure on R in the book
of Stein and Shakarchi, Real analysis: measure theory, integration and Hilbert spaces.
For a comprehensive introduction to the more general setting that we need here one
might consult Rudin’s “red” book, Real and complex analysis. He works with locally
compact Hausdorff spaces X rather than simply compact metric spaces as we discuss
here. The words of Akshay Venkatesh are taken from his article The work of Einsiedler,
Katok and Lindenstrauss on the Littlewood conjecture, Bull. Amer. Math. Soc. 45
(2008), 117-134.