
Measure Intro

1. Measure theory assigns volumes to subsets of a space X by defining a measure as a function on a σ-algebra of X. A key aspect is defining integrals of measurable functions using measures. 2. The Lebesgue integral is defined by approximating functions with simple functions and taking limits, allowing integration and limits to be compatible. This is important for properties like the Monotone Convergence Theorem. 3. The Riesz Representation Theorem establishes a one-to-one correspondence between probability measures on a compact metric space X and positive linear functionals on continuous functions on X, allowing the construction of interesting measures like Lebesgue measure on the circle.

Uploaded by

Ana Azevedo
Copyright
© © All Rights Reserved
A VERY BRIEF REVIEW OF MEASURE THEORY

A brief philosophical discussion. Measure theory, as much as any branch of mathematics,
is an area where it is important to be acquainted with the basic notions and
statements, but not desperately important to be acquainted with the detailed proofs,
which are often rather unilluminating. One should always have in mind a place where
one could go and look if one ever did need to understand a proof: for me, that place is
Rudin’s Real and Complex Analysis (Rudin’s “red book”).
If one wishes to do ergodic theory it is hopeless to try to pretend that measure and
integration theory do not exist. It is vital to have access to all the limiting processes
that are valid in measure theory but not permissible in weaker setups (such as, for
example, the theory of Riemann integration).
In the course I deal exclusively with compact metric spaces. Measure theory in this
setting most certainly does not convey all of the essence of the theory as a whole; for
example our discussion excludes measure theory on R. However it does allow for the
avoidance of a few technicalities.
What is a measure? A measure is a way of assigning a volume to subsets of X. In most
nontrivial settings one is not allowed to assign a volume to any old subset of X. The
collection of sets that one is allowed to measure must be a σ-algebra: a collection which
contains ∅ and X and which is closed under complementation and taking countable
unions. And, of course, if F is a σ-algebra the measure cannot be an arbitrary function
on F. In this course we will be dealing exclusively with probability measures, which are
functions $\mu : F \to [0, 1]$ such that $\mu(X) = 1$ and
$\mu\bigl(\bigcup_{n=1}^{\infty} A_n\bigr) = \sum_{n=1}^{\infty} \mu(A_n)$ for any
countable collection of disjoint sets $A_n \in F$.
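
To make the definition concrete, here is a minimal sketch of a probability measure on a three-point space. The space, the point masses and all function names are hypothetical choices made for illustration; on a finite set the power set is a σ-algebra and countable additivity reduces to additivity over disjoint pairs, which is what the check below verifies.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical toy space: X = {0, 1, 2} with point masses 1/2, 1/3, 1/6.
WEIGHTS = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6)}
X = frozenset(WEIGHTS)


def mu(A):
    """Measure of a subset A of X: the sum of its point masses."""
    return sum((WEIGHTS[x] for x in A), Fraction(0))


def power_set(S):
    """All subsets of S; on a finite set this is the largest sigma-algebra."""
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def is_additive():
    """Check mu(A | B) == mu(A) + mu(B) for every disjoint pair A, B."""
    F = power_set(X)
    return all(mu(A | B) == mu(A) + mu(B)
               for A in F for B in F if not (A & B))
```

Exact rational arithmetic is used so that the additivity check is an equality test rather than a floating-point comparison.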
What can one do with a measure? By far the most important thing one can do with
a measure is use it to integrate functions f : X → R. One cannot integrate arbitrary
functions – f must be measurable, which means that each level set $f^{-1}([t, \infty))$ lies in
the σ-algebra F. As soon as one has made this definition, there are all sorts of things
one might hope to prove, for example that the sum of two measurable functions or
the composition of a measurable function with a continuous function is again measur-
able. All reasonable such statements are true but the proofs of these and subsequent
straightforward facts occupy a lot of space.
The definition of the integral $\int f \, d\mu$ takes place in several stages. One first defines
it for non-negative simple measurable functions, that is to say measurable functions
$s : X \to \mathbb{R}_{\ge 0}$ with finite range. If $s(x) = \alpha_i$ for $x \in A_i$, $i = 1, \dots, k$, then we define (as
is natural) $\int s \, d\mu := \sum_{i=1}^{k} \alpha_i \mu(A_i)$. If $f$ is nonnegative, one then defines $\int f \, d\mu$ to be
the supremum of all integrals $\int s \, d\mu$, over all simple measurable functions $s : X \to \mathbb{R}_{\ge 0}$
such that $0 \le s \le f$ pointwise. One can modify this in an obvious way for nonpositive
functions $f$, and then define the integral for arbitrary measurable functions by splitting
$f = f^+ + f^-$, where $f^+ := \max(f, 0)$ and $f^- := \min(f, 0)$.
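
The first two stages can be sketched numerically. The example below assumes X = [0, 1] with Lebesgue measure and f(x) = x, neither of which is fixed by the text at this point, and uses the dyadic simple functions s_n(x) = floor(2^n x)/2^n as one convenient family lying below f. Function names are illustrative only.

```python
from fractions import Fraction


def integrate_simple(values_and_measures):
    """Integral of a simple function: the sum of alpha_i * mu(A_i)."""
    return sum((alpha * m for alpha, m in values_and_measures), Fraction(0))


def dyadic_lower_approximation(n):
    """Pairs (alpha_i, mu(A_i)) for s_n(x) = floor(2^n x) / 2^n on [0, 1).

    s_n takes the value k / 2^n on A_k = [k/2^n, (k+1)/2^n), which has
    Lebesgue measure 2^-n, and satisfies 0 <= s_n(x) <= x.
    """
    step = Fraction(1, 2 ** n)
    return [(k * step, step) for k in range(2 ** n)]


def lower_integral(n):
    """Integral of s_n, a lower bound for the integral of f(x) = x."""
    return integrate_simple(dyadic_lower_approximation(n))
```

Here lower_integral(n) equals 1/2 - 1/2^(n+1), so these particular simple functions already have integrals increasing to 1/2, the value of the integral of f.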
What properties does the integral enjoy? There is a huge list of obvious-seeming
properties that the integral enjoys. For example, the map $f \mapsto \int f \, d\mu$ is linear. Perhaps
the most important property, and really the raison d’être for the Lebesgue integral, is
the following result, which states that integration and taking of limits are compatible.
Theorem 0.1 (Monotone convergence theorem). Suppose that $(f_n)_{n=1}^{\infty}$ are measurable
functions and that $0 \le f_1(x) \le f_2(x) \le \dots$ for all $x$. Suppose that $f_n(x) \to f(x)$
pointwise. Then $f$ is measurable and $\lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu$.

Such a statement is definitely not true for the Riemann integral. Indeed if one takes
$f_n(x)$ to be the function on $[0, 1]$ defined by $f_n(x) := 1$ if $x$ is a rational with denominator
$\le n$ and $f_n(x) := 0$ otherwise then each $f_n$ has Riemann integral zero. Furthermore we
have $f_n \to f$ pointwise, where $f$ is the characteristic function of $[0, 1] \cap \mathbb{Q}$. This limit
function is not Riemann integrable.
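
The reason each function in this counterexample has Riemann integral zero is that it is nonzero at only finitely many points. A small sketch, with hypothetical function names, enumerates those points exactly:

```python
from fractions import Fraction


def support_of_f_n(n):
    """Points of [0, 1] where f_n = 1: rationals with denominator <= n."""
    return {Fraction(a, q) for q in range(1, n + 1) for a in range(q + 1)}


def f_n(n, x):
    """Evaluate f_n at a rational point x of [0, 1]."""
    return 1 if x in support_of_f_n(n) else 0
```

The supports are finite and increase with n, so the sequence is monotone as the theorem requires, while their union is all of the rationals in [0, 1].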
Constructing interesting measures. The hardest results in measure theory, slightly
embarrassingly, are required to construct interesting examples of measures. In this
course the underlying set X will have the structure of a compact metric space with
metric d, and we want to be able to compute the volume of the most obvious sets,
namely the open balls B(x, r) := {y ∈ X : d(x, y) < r}. The smallest σ-algebra
containing all the open balls is called the Borel σ-algebra and is generally denoted by
B. Elements of B are called Borel sets; amongst them are countable unions of closed
sets (called $F_\sigma$-sets) and countable intersections of open sets (called $G_\delta$-sets). As we
showed earlier in the course, X is separable. This means that B contains the open sets
of X (in fact this is usually the definition of B).
Theorem 0.2 (Lebesgue measure on R/Z). There is a unique probability measure µ on
R/Z such that $\mu((a, b)) = b - a$ for all $0 \le a \le b \le 1$.

The existence of Lebesgue measure is a special case of a much more general theorem,
the Riesz Representation theorem.
Theorem 0.3 (Riesz representation theorem). Let X be a compact metric space. Then
the probability measures on the Borel σ-algebra B are in one-to-one correspondence
with positive linear functionals $\Lambda : C(X) \to \mathbb{R}$ with $\Lambda 1 = 1$ via the correspondence
$\Lambda f \leftrightarrow \int f \, d\mu$.

Here, C(X) is the space of continuous functions from X to R, and by a positive
linear functional we mean a linear functional with the additional property that $\Lambda f \ge 0$
whenever $f \ge 0$ pointwise.
It is not actually a trivial matter to deduce the existence of Lebesgue measure from
this theorem. In fact the linear functional Λ that is appropriate here is none other
than the Riemann integral, which is of course well-defined and linear when restricted
to continuous functions.
In fact the proof of the Riesz representation theorem and a little extra work shows
that any probability measure on B is automatically regular, which means that
µ(A) = sup{µ(K) : K ⊆ A, K closed} = inf{µ(U ) : A ⊆ U, U open}.
For the purposes of this course it might be better to regard regularity as part of the
definition of a measure: we will not encounter measures which are not regular.
The Riesz representation theorem has two directions. The statement that every
probability measure $\mu$ gives rise to a linear functional $\Lambda$ is relatively straightforward, and
is essentially just the construction of the integral $\int f \, d\mu$. The proof of the other direction
is rather long; this is where the measure $\mu$ actually gets constructed. The key idea is to
use the notion of regularity to define $\mu$. One first defines $\mu$ on open sets (and hence, by
complementation, on closed sets) by setting $\mu(U) := \sup\{\Lambda f : f \in C(X),\ 0 \le f \le 1_U\}$.
One then defines $\mu^+$ of an arbitrary subset $A \subseteq X$ to be $\inf\{\mu(U) : A \subseteq U,\ U \text{ open}\}$ and
$\mu^-(A)$ to be $\sup\{\mu(K) : K \subseteq A,\ K \text{ closed}\}$. One then shows that the collection of sets
$A \subseteq X$ for which $\mu^-(A) = \mu^+(A)$ is a σ-algebra $\tilde{B}$ containing $B$, and that $\mu = \mu^+ = \mu^-$
is a measure on $\tilde{B}$ with the property that $\Lambda f = \int f \, d\mu$ for $f \in C(X)$. There is much to
be checked: it is not an easy theorem.
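
The first step of this construction can be illustrated when Λ is the Riemann integral and U = (a, b) is an interval. The sketch below is an illustration under stated assumptions, not the construction in full: it uses one hypothetical family of test functions, the trapezoids f_m with ramps of width 1/m, and computes Λf_m exactly.

```python
from fractions import Fraction


def trapezoid_integral(a, b, m):
    """Riemann integral of the trapezoid f_m supported in (a, b).

    f_m is 0 outside (a, b), 1 on [a + 1/m, b - 1/m] and linear on the two
    ramps, so 0 <= f_m <= 1_(a,b). Requires 2/m <= b - a.
    """
    ramp = Fraction(1, m)
    if 2 * ramp > b - a:
        raise ValueError("ramps overlap: choose a larger m")
    plateau = (b - a) - 2 * ramp
    return plateau + 2 * (ramp / 2)  # plateau area plus two triangles
```

The value is (b - a) - 1/m. Since any continuous f with 0 ≤ f ≤ 1 on (a, b) and f = 0 elsewhere has integral at most b - a, the supremum over all admissible f is b - a, in agreement with Theorem 0.2.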
There is another important respect in which the Riesz representation theorem gives
something stronger than we have stated. The measure µ is actually a measure on a
σ-algebra B̃ which, in general, is strictly larger than the Borel σ-algebra B. It has the
additional property that if A ∈ B̃ and µ(A) = 0 then any subset of A is also in B̃. This
property need not hold for B itself: when X = [−1, 1] for example one may employ
a cardinality argument to establish that there are subsets of the Cantor set on [−1, 1]
which do not lie in B. This is a technical point and we will not need to dwell on it in
the course.
Limits of measures. The Riesz representation theorem tells us that probability measures
on X are in 1-1 correspondence with elements $\Lambda \in C(X)^*$ which are positive and
normalised so that $\Lambda 1 = 1$. These functionals form, in particular, a convex subset of $C(X)^*$. Write
M(X) for the space of regular Borel probability measures on X. The identification of M(X) with
a convex subset of $C(X)^*$ makes it much easier to study the former object, and in
particular to discuss limits of measures.

Definition 0.4 (Weak convergence of measures). Let $\mu$ and $\mu_n$, $n = 1, 2, \dots$, be measures
in M(X). Then we say that $\mu_n$ converges weakly to $\mu$, and write $\mu_n \to \mu$, if, for
every $f \in C(X)$, we have $\int f \, d\mu_n \to \int f \, d\mu$.
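
As an illustration of the definition, take X = R/Z and let µ_n be the uniform probability measure on the n points 0, 1/n, ..., (n-1)/n. Integrals against µ_n are Riemann sums, so one expects µ_n to converge weakly to Lebesgue measure. The sketch below checks this for a single test function only; the choice f(x) = x(1 - x) and the function names are assumptions made for the example.

```python
from fractions import Fraction


def integrate_against_mu_n(f, n):
    """Integral of f against mu_n, the uniform measure on {0, 1/n, ..., (n-1)/n}."""
    return sum((f(Fraction(k, n)) for k in range(n)), Fraction(0)) / n


def f(x):
    """A continuous test function on R/Z (it vanishes at both 0 and 1)."""
    return x * (1 - x)
```

For this f the Lebesgue integral is 1/6 and the integral against µ_n works out to 1/6 - 1/(6n^2), which tends to 1/6. Note that each µ_n lives on a finite set, which has Lebesgue measure zero, so weak convergence is much weaker than convergence of µ_n(A) for every Borel set A.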

An important fact is that M(X) is compact in the topology of weak convergence.

Proposition 0.5 (Space of probability measures is weakly compact). Suppose that
$(\mu_n)_{n=1}^{\infty}$ is a sequence of probability measures. Then there is a subsequence $(\mu_{n_k})_{k=1}^{\infty}$
which converges weakly to some $\mu \in M(X)$.

Proof. By the Riesz representation theorem it suffices to prove that the closed unit ball
of $C(X)^*$ is compact in the weak topology: that is, if $\Lambda_n : C(X) \to \mathbb{R}$ are functionals with
$\|\Lambda_n\| \le 1$ then there is some subsequence $(\Lambda_{n_k})_{k=1}^{\infty}$ which tends weakly to a functional
$\Lambda$ in the sense that $\Lambda_{n_k} f \to \Lambda f$ for all $f \in C(X)$.
Note that if each $\Lambda_n$ corresponds to a probability measure $\mu_n$ then we have
$|\Lambda_n f| \le \|f\|_\infty \int d\mu_n = \|f\|_\infty$, and so $\Lambda_n$ does lie in the unit ball of $C(X)^*$. It is clear that if all
the $\Lambda_n$ are positive and normalised so that $\Lambda_n 1 = 1$ then the same will be true for any
weak limit $\Lambda$; thus by the Riesz representation theorem such a limit corresponds to a
probability measure $\mu$.
The statement that the closed unit ball of C(X)∗ is compact in the weak topology
is known as the Banach-Alaoglu theorem, and it is usually proved via Tychonov’s the-
orem. In our setting, where X is a compact metric space, a more direct and vaguely
constructive proof using a diagonalisation argument is possible. One might compare
this with the rather simpler diagonal argument we used to prove that $\Lambda^{\mathbb{Z}}$ is sequentially
compact.
We begin by recalling that C(X) is separable (has a countable dense subset). This
follows from a version of the Stone-Weierstrass theorem.
Take, then, a countable dense collection of functions $f_1, f_2, \dots$ in C(X). Consider
the sequence $\Lambda_1 f_1, \Lambda_2 f_1, \dots$. It is bounded, since $|\Lambda_n f_1| \le \|f_1\|_\infty$, so we may find a
subsequence $(n_{1,i})_{i=1}^{\infty}$ of N such that the sequence $(\Lambda_{n_{1,i}} f_1)_{i=1}^{\infty}$ converges. We may then
pass to a further subsequence $(n_{2,i})_{i=1}^{\infty}$ such that the sequence $(\Lambda_{n_{2,i}} f_2)_{i=1}^{\infty}$ converges,
and so on. Set $n_i := n_{i,i}$. Then the diagonal sequence $(n_i)_{i=1}^{\infty}$ has the property that
$(\Lambda_{n_i} f_k)_{i=1}^{\infty}$ converges for all $k$. Define $\Lambda f_k := \lim_{i \to \infty} \Lambda_{n_i} f_k$ for $k = 1, 2, \dots$.
We extend this to a map $\Lambda : C(X) \to \mathbb{R}$ by defining $\Lambda f := \lim_{j \to \infty} \Lambda f_{k_j}$, for any
sequence $(k_j)_{j=1}^{\infty}$ such that $f_{k_j} \to f$ in C(X).

We claim that $\Lambda \in B_1(C(X)^*)$ and that $\Lambda_{n_i} \to \Lambda$ in the weak topology. There is
much to prove here (for example we have not yet shown that Λ is well-defined, still less
that it is a bounded linear functional). This is a somewhat tedious task which we leave
to the reader.
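
As a pointer towards the omitted verification, the estimate below is a sketch of the one inequality on which well-definedness rests. It assumes only that every functional in the sequence has norm at most 1, as in the proof, and writes n_i for the diagonal sequence and f_k, f_l for two members of the dense collection.

```latex
% Assumes \|\Lambda_n\| \le 1 for every n, as in the proof above.
\begin{align*}
|\Lambda f_k - \Lambda f_l|
  &= \lim_{i \to \infty} |\Lambda_{n_i} f_k - \Lambda_{n_i} f_l| \\
  &\le \sup_i \|\Lambda_{n_i}\| \, \|f_k - f_l\|_\infty
   \le \|f_k - f_l\|_\infty .
\end{align*}
```

Consequently, if the functions f_{k_j} converge to f in C(X) then the values Λf_{k_j} form a Cauchy sequence, and interleaving two such approximating sequences shows that the limit does not depend on the choice made. The same estimate gives |Λf| ≤ ‖f‖∞ once linearity has been checked.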
$L^p$-spaces and $L^p$-norms. One of the most important things that measure theory
allows us to do is to define these spaces of functions. Let X be a compact metric space
with a regular Borel probability measure µ. If f : X → C is a measurable function we define

$$\|f\|_p := \Bigl( \int |f|^p \, d\mu \Bigr)^{1/p}$$

for $1 \le p < \infty$ and $\|f\|_\infty$ to be the essential supremum of $|f|$, that is to say the infimum
of all those numbers $M$ such that $|f(x)| \le M$ outside of a set of measure zero (almost
everywhere).
These objects $\|\cdot\|_p$ satisfy the triangle inequality $\|f + g\|_p \le \|f\|_p + \|g\|_p$ and also the
identity $\|\lambda f\|_p = |\lambda| \|f\|_p$ for complex scalars λ. This qualifies them as seminorms;
they are not fully-fledged norms because it is possible to have $\|f\|_p = 0$ without f being
zero. This is the case if, and only if, f vanishes almost everywhere. We write $L^p(X)$ for
the space of all measurable functions f : X → C with $\|f\|_p < \infty$, quotiented out by the
equivalence relation of being “equal almost everywhere”. One does not introduce any
special notation for these equivalence classes, abusing notation by regarding $L^p(X)$ as
a space of functions.
The $L^p$-norms are nested: $\|f\|_p \le \|f\|_{p'}$ whenever $1 \le p \le p' \le \infty$. This follows from
Hölder’s inequality and it is important here that µ is a probability measure, that is to
say $\int 1 \, d\mu = 1$.
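
The nesting, and its dependence on the normalisation of µ, can be checked numerically on a finite space. This is a sketch with hypothetical names, where a measure is represented by a list of point weights and a function by its list of values.

```python
import math


def lp_norm(values, weights, p):
    """L^p norm of a function on a finite measure space.

    values[i] is f at the i-th point and weights[i] is the mass of that
    point; p = math.inf gives the maximum of |f| over points of positive mass.
    """
    if p == math.inf:
        return max(abs(v) for v, w in zip(values, weights) if w > 0)
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1 / p)
```

With values [1, -2, 3] and probability weights [0.5, 0.25, 0.25] the norms for p = 1, 2, 4, ∞ are nondecreasing. With the constant function 1 and weights [2, 2], which have total mass 4, the 1-norm is 4 while the 2-norm is 2, so the nesting fails once µ is no longer a probability measure.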
A very important fact about the $L^p(X)$ spaces is that they are complete (and hence
each one is a Banach space). The truth of this statement is a very important justification
for introducing measures.
Approximation of measurable functions. In practice one rarely tries to understand
anything about general measurable functions or even functions in $L^p(X)$. Instead one
approaches them by stealth, as limits of functions which are much easier to understand.
As we are in Cambridge, it seems appropriate at this point to mention J. E. Little-
wood’s three basic principles. These can be a useful practical guide to working with
measures; it is notable that they were formulated at a time when the use of rough
heuristics and models to motivate quite technical subjects was not nearly so widespread
as it is now.
Here are the three principles, which apply to any regular Borel measure µ on a
compact metric space X:

(i) A measurable set is nearly an open set;
(ii) A measurable function is nearly a continuous function;
(iii) A convergent sequence of functions is nearly uniformly convergent.

Let us briefly discuss the three principles in turn. A precise version of the first follows
immediately from the definition of a regular measure, viz that $\mu(E) = \inf_{U \supseteq E} \mu(U)$ for
any measurable set E, the infimum being taken over all open sets U containing E. Thus
for any ε > 0, there is an open set U whose symmetric difference with E has measure
less than ε.
The second point refers to Lusin’s theorem. If f is any measurable function and if
ε > 0, this states that there is a continuous function g ∈ C(X) such that f (x) = g(x)
except on a set of measure at most ε. Note that it does not assert that f is continuous
except on a set of measure ε.
The third point refers to Egorov’s theorem: If $(f_n)$ is a sequence of measurable
functions which converge pointwise on X, and if ε > 0, then there is a measurable
set $X' \subseteq X$, $\mu(X \setminus X') \le \varepsilon$, such that the $f_n$ converge uniformly on $X'$.
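
A standard example makes Egorov’s theorem concrete. Assume X = [0, 1] with Lebesgue measure and f_n(x) = x^n, choices made here purely for illustration. These functions converge pointwise (to 0 on [0, 1) and to 1 at x = 1) but not uniformly on X; on X' = [0, 1 - ε], whose complement has measure ε, the convergence is uniform.

```python
def sup_error_on_good_set(n, eps):
    """sup of |f_n - f| over X' = [0, 1 - eps], where f_n(x) = x**n.

    On X' the limit f is 0 and f_n is increasing, so the supremum is
    attained at the right endpoint 1 - eps.
    """
    return (1 - eps) ** n


def steps_until_uniformly_close(eps, tol):
    """Smallest n with sup error on X' at most tol (a simple linear search)."""
    n = 1
    while sup_error_on_good_set(n, eps) > tol:
        n += 1
    return n
```

The supremum (1 - ε)^n tends to 0 for every fixed ε > 0, but the number of steps needed grows as ε shrinks, which is why the exceptional set cannot in general be taken to have measure zero.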
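Perhaps the most useful application of these principles is the following straightforward
consequence of the second one: the space of continuous functions C(X) is dense in
$L^p(X)$, for all $1 \le p < \infty$. (Density fails in general for p = ∞, since a uniform limit
of continuous functions is continuous.)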
When the space X has additional structure, one can often pass to an even nicer set
of functions which is dense in $L^p(X)$. If X is a smooth manifold, for example, the space
$C^\infty(X)$ of smooth functions is dense. If $X = \mathbb{R}^d/\mathbb{Z}^d$ is a torus then the trigonometric
polynomials $\sum_{|r| \le R} a_r e^{2\pi i r \cdot \theta}$ are dense (see the example sheet).

Some philosophical remarks by Akshay Venkatesh. I hope that the pleasant properties
of measures presented here, particularly their closure properties under taking limits,
will convince you that they are the “right” objects to consider. Here are some further
remarks, by Akshay Venkatesh, which will make more sense a little later in the course.
What is gained by going through measures? Measures have much better formal prop-
erties than sets. A particularly important difference is that a T -invariant probability
measure can be decomposed into “minimal” invariant measures (the ergodic decomposi-
tion). That property does not seem to have a clean analogy at the level of T -invariant
closed sets. In particular, although a T -invariant closed set always contains a minimal
T -invariant closed set, it cannot be decomposed into such sets in any obvious way.
Further reading. I rather like the introduction to Lebesgue measure on R in the book
of Stein and Shakarchi, Real analysis: measure theory, integration and Hilbert spaces.
For a comprehensive introduction to the more general setting that we need here one
might consult Rudin’s “red” book, Real and complex analysis. He works with locally
compact Hausdorff spaces X rather than simply compact metric spaces as we discuss
here. The words of Akshay Venkatesh are taken from his article The work of Einsiedler,
Katok and Lindenstrauss on the Littlewood conjecture, Bull. Amer. Math. Soc. 45
(2008), 117-134.
