Probability Theory Sigma Field and Measures
Associated reading: Sec 1.1-1.4 of Ash and Doleans-Dade; Sec 1.1 and A.1 of Durrett.
1 Introduction
How is this course different from your earlier probability courses? There are some problems
that simply can't be handled with finite-dimensional sample spaces and random variables
that are either discrete or have densities.
Example 1 Try to express the strong law of large numbers without using an infinite-dimensional
space. Oddly enough, the weak law of large numbers requires only a sequence of finite-
dimensional spaces, but the strong law concerns entire infinite sequences.
Example 3 One way to measure the size of a set is to count its elements. All infinite sets
would have the same size (unless you distinguish different infinite cardinals).
Example 4 Special subsets of Euclidean spaces can be measured by length, area, volume,
etc. But what about sets with lots of holes in them? For example, how large is the set of
irrational numbers between 0 and 1?
We will use measures to say how large sets are. First, we have to decide which sets we will
measure.
2 $\sigma$-fields
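Definition 1 (fields and $\sigma$-fields) Let $\Omega$ be a set. A collection $\mathcal{F}$ of subsets of $\Omega$ is called
a field if it satisfies
$\Omega \in \mathcal{F}$,
for each $A \in \mathcal{F}$, $A^C \in \mathcal{F}$,
for all $A_1, A_2 \in \mathcal{F}$, $A_1 \cup A_2 \in \mathcal{F}$.
A field $\mathcal{F}$ is called a $\sigma$-field if, in addition, for every sequence $\{A_k\}_{k=1}^\infty$ of elements of $\mathcal{F}$, $\bigcup_{k=1}^\infty A_k \in \mathcal{F}$.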
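The field axioms can be checked mechanically when the underlying set is finite. The following is a minimal sketch, not part of the original notes; the helper name `is_field` is hypothetical. On a finite set every field is also a $\sigma$-field, so the sketch cannot illustrate the difference between the two notions.

```python
from itertools import combinations

def is_field(omega, collection):
    """Check the three field axioms for subsets of a finite set omega."""
    omega = frozenset(omega)
    sets = {frozenset(s) for s in collection}
    if omega not in sets:                          # Omega must belong to the collection
        return False
    if any(omega - a not in sets for a in sets):   # closed under complements
        return False
    # closed under pairwise unions, hence under all finite unions
    return all(a | b in sets for a, b in combinations(sets, 2))

omega = {1, 2, 3, 4}
print(is_field(omega, [set(), {1, 2}, {3, 4}, omega]))  # True
print(is_field(omega, [set(), {1}, omega]))             # False: {2, 3, 4} is missing
```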
Example 6 (Power set) Let $\Omega$ be an arbitrary set. The collection of all subsets of $\Omega$ is a
$\sigma$-field. It is denoted $2^\Omega$ and is called the power set of $\Omega$.
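Exercise 1 Let $\mathcal{F}_1, \mathcal{F}_2, \ldots$ be classes of sets in a common space $\Omega$ such that $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$ for
each $n$. Show that if each $\mathcal{F}_n$ is a field, then $\bigcup_{n=1}^\infty \mathcal{F}_n$ is also a field.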
Generated $\sigma$-fields A field is closed under finite set-theoretic operations whereas a $\sigma$-field
is closed under countable set-theoretic operations. In a problem dealing with probabilities,
one usually deals with a small class of subsets $\mathcal{A}$, for example the class of subintervals of
$(0, 1]$. If we perform countable operations on such a class $\mathcal{A}$ of sets, we
might end up with sets outside the class $\mathcal{A}$. Hence, we would like to define a class
denoted by $\sigma(\mathcal{A})$ in which we can safely perform countable set-theoretic operations. This
class $\sigma(\mathcal{A})$ is called the $\sigma$-field generated by $\mathcal{A}$, and it is defined as the intersection of all
the $\sigma$-fields containing $\mathcal{A}$ (exercise: show that this is a $\sigma$-field). $\sigma(\mathcal{A})$ is the smallest $\sigma$-field
containing $\mathcal{A}$.
Example 8 Let $\mathcal{C} = \{A\}$ for some nonempty $A$ that is not itself $\Omega$. Then $\sigma(\mathcal{C}) = \{\emptyset, A, A^C, \Omega\}$.
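For a finite $\Omega$ the generated $\sigma$-field can be computed by brute force: start from the generators and keep adding complements and pairwise unions until nothing new appears. The sketch below is only an illustration under that finiteness assumption (the function name `generated_sigma_field` is hypothetical), and it reproduces the four sets of Example 8.

```python
def generated_sigma_field(omega, generators):
    """Smallest collection containing the generators that is closed under
    complements and unions; valid as sigma(generators) only for finite omega."""
    omega = frozenset(omega)
    sets = {frozenset(), omega} | {frozenset(g) for g in generators}
    while True:
        # One closure step: all complements and all pairwise unions.
        new = {omega - a for a in sets} | {a | b for a in sets for b in sets}
        if new <= sets:          # nothing new, so the collection is closed
            return sets
        sets |= new

omega = {1, 2, 3, 4}
sigma_c = generated_sigma_field(omega, [{1}])
print(sorted(sorted(s) for s in sigma_c))  # [[], [1], [1, 2, 3, 4], [2, 3, 4]]
```

The loop terminates because a finite set has only finitely many subsets. The intersection-of-all-$\sigma$-fields definition in the text gives the same collection, since any $\sigma$-field containing the generators must contain every set added by a closure step.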
Example 9 Let $\Omega = \mathbb{R}$ and let $\mathcal{C}$ be the collection of all intervals of the form $(a, b]$. Then
the field generated by $\mathcal{C}$ is $\mathcal{U}$ from Example 5 while $\sigma(\mathcal{C})$ is larger.
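Example 10 (Borel $\sigma$-field) Let $\Omega$ be a topological space and let $\mathcal{C}$ be the collection of open
sets. Then $\sigma(\mathcal{C})$ is called the Borel $\sigma$-field. If $\Omega = \mathbb{R}$, the Borel $\sigma$-field is the same as $\sigma(\mathcal{C})$
in Example 9. The Borel $\sigma$-field of subsets of $\mathbb{R}^k$ is denoted $\mathcal{B}^k$.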
3 Measures
Notation 11 (Extended Reals) The extended reals is the set of all real numbers together
with $\infty$ and $-\infty$. We shall denote this set $\overline{\mathbb{R}}$. The positive extended reals, denoted $\overline{\mathbb{R}}^{+}$, is
$(0, \infty]$, and the nonnegative extended reals, denoted $\overline{\mathbb{R}}^{+0}$, is $[0, \infty]$.
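Definition 3 Let $(\Omega, \mathcal{F})$ be a measurable space. Let $\mu : \mathcal{F} \to \overline{\mathbb{R}}^{+0}$ satisfy
$\mu(\emptyset) = 0$,
for every sequence $\{A_k\}_{k=1}^\infty$ of mutually disjoint elements of $\mathcal{F}$, $\mu\left(\bigcup_{k=1}^\infty A_k\right) = \sum_{k=1}^\infty \mu(A_k)$.
Then $\mu$ is called a measure on $(\Omega, \mathcal{F})$.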
Example 12 Let $\Omega$ be arbitrary with $\mathcal{F}$ the trivial $\sigma$-field. Define $\mu(\emptyset) = 0$ and $\mu(\Omega) = c$
for arbitrary $c > 0$ (with $c = \infty$ possible).
Example 13 (Counting measure) Let $\Omega$ be arbitrary and $\mathcal{F} = 2^\Omega$. For each finite subset
$A$ of $\Omega$, define $\mu(A)$ to be the number of elements of $A$. Let $\mu(A) = \infty$ for all infinite subsets.
This is called counting measure on $\Omega$.
Sometimes, if the name of the probability $P$ is understood or is not even mentioned, we will
denote $P(E)$ by $\Pr(E)$ for events $E$.
Infinite measures pose a few unique problems. Some infinite measures are just like finite
ones.
for arbitrary sequences $\{A_n\}_{n=1}^\infty$. The proof of this uses a standard trick for dealing with
countable sequences of sets. Let $B_1 = A_1$ and let $B_n = A_n \setminus \bigcup_{i=1}^{n-1} B_i$ for $n > 1$. The $B_n$'s are
disjoint and have the same finite and countable unions as the $A_n$'s. The proof of Equation
5 relies on the additional fact that $\mu(B_n) \le \mu(A_n)$ for all $n$.
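The disjointification trick is easy to try on a finite list of sets. The sketch below is an illustration only: it uses counting measure (`len`) and takes Equation 5 to be the subadditivity bound $\mu(\bigcup_n A_n) \le \sum_n \mu(A_n)$, which is an assumption here. The name `disjointify` is hypothetical.

```python
def disjointify(sets):
    """Return B_1 = A_1 and B_n = A_n minus the union of the earlier B_i."""
    seen = set()      # union of B_1, ..., B_{n-1}, which equals the union of the A_i
    out = []
    for a in sets:
        b = set(a) - seen
        out.append(b)
        seen |= b
    return out

A = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4}]
B = disjointify(A)
print(B)  # [{1, 2, 3}, {4}, {5}, set()]

union = set().union(*A)
# The B_n are disjoint with the same union, so their sizes add up exactly,
# and each B_n is no larger than the corresponding A_n.
print(len(union), sum(len(b) for b in B), sum(len(a) for a in A))  # 5 5 9
```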
Next, if $\mu(A_n) = 0$ for all $n$, it follows that $\mu\left(\bigcup_{n=1}^\infty A_n\right) = 0$. This gets used a lot in proofs.
Similarly, if $\mu$ is a probability and $\mu(A_n) = 1$ for all $n$, then $\mu\left(\bigcap_{n=1}^\infty A_n\right) = 1$.
Definition 6 (Almost sure/almost everywhere) Suppose that some statement about elements
of $\Omega$ holds for all $\omega \in A^C$ where $\mu(A) = 0$. Then we say that the statement holds
almost everywhere, denoted a.e. $[\mu]$. If $P$ is a probability, then almost everywhere is often
replaced by almost surely, denoted a.s. $[P]$.
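Proof: Define $A_\infty = \lim_{n\to\infty} A_n$. In the first case, write $B_1 = A_1$ and $B_n = A_n \setminus A_{n-1}$ for
$n > 1$. Then $A_n = \bigcup_{k=1}^n B_k$ for all $n$ (including $n = \infty$). Then $\mu(A_n) = \sum_{k=1}^n \mu(B_k)$, and
$$\mu\left(\lim_{n\to\infty} A_n\right) = \mu(A_\infty) = \sum_{k=1}^\infty \mu(B_k) = \lim_{n\to\infty} \sum_{k=1}^n \mu(B_k) = \lim_{n\to\infty} \mu(A_n).$$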
In the second case, write $B_n = A_n \setminus A_{n+1}$ for all $n \ge k$. Then, for all $n > k$,
$$A_k \setminus A_n = \bigcup_{i=k}^{n-1} B_i,$$
$$A_k \setminus A_\infty = \bigcup_{i=k}^{\infty} B_i.$$
By the first case,
$$\lim_{n\to\infty} \mu(A_k \setminus A_n) = \mu\left(\bigcup_{i=k}^{\infty} B_i\right) = \mu(A_k \setminus A_\infty).$$
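The telescoping step in the first case can be checked numerically. The sketch below assumes that the first claim of Lemma 16 concerns increasing sequences $A_1 \subseteq A_2 \subseteq \cdots$; the particular measure (mass $2^{-k}$ at each positive integer $k$) and the sets $A_n = \{1, \ldots, n\}$ are illustrative choices, not taken from the notes.

```python
from fractions import Fraction

def mu(s):
    """Illustrative probability on the positive integers: mass 2**-k at k."""
    return sum((Fraction(1, 2**k) for k in s), Fraction(0))

def A(n):
    """Increasing sets A_n = {1, ..., n}; A(0) is the empty set."""
    return set(range(1, n + 1))

def B(n):
    """Disjoint increments B_n = A_n minus A_{n-1}."""
    return A(n) - A(n - 1)

for n in range(1, 11):
    # mu(A_n) equals the partial sum of the mu(B_k), as in the proof.
    assert mu(A(n)) == sum(mu(B(k)) for k in range(1, n + 1))

print(mu(A(10)))  # 1023/1024
```

Here $\mu(A_n) = 1 - 2^{-n}$ increases to $1$, which is the total mass of the limit set of all positive integers.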
Exercise 8 Construct a simple counterexample to show that the condition $\mu(A_k) < \infty$ is
required in the second claim of Lemma 16.
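$\Omega \in \mathcal{C}$,
for each $A \in \mathcal{C}$, $A^C \in \mathcal{C}$,
for each sequence $\{A_n\}_{n=1}^\infty$ of disjoint elements of $\mathcal{C}$, $\bigcup_{n=1}^\infty A_n \in \mathcal{C}$.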
Some simple results about $\pi$-systems and $\lambda$-systems are the following.
Proposition 9 If $\Omega$ is a set and $\mathcal{C}$ is both a $\pi$-system and a $\lambda$-system, then $\mathcal{C}$ is a $\sigma$-field.
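On a finite $\Omega$ the three notions can be compared by direct enumeration. The sketch below assumes the usual definitions: a $\pi$-system is closed under finite intersections, and a $\lambda$-system contains $\Omega$ and is closed under complements and disjoint unions. All function names are hypothetical. It exhibits a $\lambda$-system that is not a $\pi$-system (and hence not a $\sigma$-field), and a collection that is both.

```python
from itertools import combinations

def _as_sets(collection):
    return {frozenset(s) for s in collection}

def is_pi_system(collection):
    """Closed under pairwise (hence finite) intersections."""
    sets = _as_sets(collection)
    return all(a & b in sets for a, b in combinations(sets, 2))

def is_lambda_system(omega, collection):
    """Contains omega, closed under complements and disjoint unions (finite omega)."""
    omega, sets = frozenset(omega), _as_sets(collection)
    if omega not in sets:
        return False
    if any(omega - a not in sets for a in sets):
        return False
    return all(a | b in sets for a, b in combinations(sets, 2) if not a & b)

def is_sigma_field(omega, collection):
    """Contains omega, closed under complements and all unions (finite omega)."""
    omega, sets = frozenset(omega), _as_sets(collection)
    return (omega in sets
            and all(omega - a in sets for a in sets)
            and all(a | b in sets for a, b in combinations(sets, 2)))

omega = {1, 2, 3, 4}
lam = [set(), {1, 2}, {3, 4}, {1, 3}, {2, 4}, omega]
both = [set(), {1, 2}, {3, 4}, omega]

print(is_lambda_system(omega, lam), is_pi_system(lam), is_sigma_field(omega, lam))
# True False False
print(is_lambda_system(omega, both), is_pi_system(both), is_sigma_field(omega, both))
# True True True
```

In the first collection, $\{1,2\} \cap \{1,3\} = \{1\}$ is missing, which is exactly what prevents it from being a $\sigma$-field.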
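Proof: First, let $C \in \Pi$ be such that $\mu_1(C) = \mu_2(C) < \infty$, and define $\mathcal{G}_C$ to be the
collection of all $B \in \mathcal{F}$ such that $\mu_1(B \cap C) = \mu_2(B \cap C)$. It is easy to see that $\mathcal{G}_C$ is a
$\lambda$-system that contains $\Pi$, hence it equals $\mathcal{F}$ by Lemma 19. (For example, if $B \in \mathcal{G}_C$,
$$\mu_1(B^C \cap C) = \mu_1(C) - \mu_1(B \cap C) = \mu_2(C) - \mu_2(B \cap C) = \mu_2(B^C \cap C),$$
so $B^C \in \mathcal{G}_C$.)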
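Since $\mu_1$ and $\mu_2$ are $\sigma$-finite, there exists a sequence $\{C_n\}_{n=1}^\infty \in \Pi$ such that $\mu_1(C_n) =
\mu_2(C_n) < \infty$, and $\Omega = \bigcup_{n=1}^\infty C_n$. (Since $\Pi$ is only a $\pi$-system, we cannot assume that the $C_n$
are disjoint.) For each $A \in \mathcal{F}$,
$$\mu_j(A) = \lim_{n\to\infty} \mu_j\left(\bigcup_{i=1}^n [C_i \cap A]\right) \quad \text{for } j = 1, 2.$$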
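Since $\mu_j\left(\bigcup_{i=1}^n [C_i \cap A]\right)$ can be written as a linear combination of values of $\mu_j$ at sets of the
form $A \cap C$, where $C \in \Pi$ is the intersection of finitely many of $C_1, \ldots, C_n$, it follows from
$A \in \mathcal{G}_C$ that $\mu_1\left(\bigcup_{i=1}^n [C_i \cap A]\right) = \mu_2\left(\bigcup_{i=1}^n [C_i \cap A]\right)$ for all $n$, hence $\mu_1(A) = \mu_2(A)$.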
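The "linear combination" in the last step is the inclusion-exclusion formula, which is valid here because every set involved has finite measure. The sketch below illustrates it with counting measure (`len`) on small finite sets; the function name `union_measure` and the example sets are hypothetical.

```python
from itertools import combinations

def union_measure(mu, pieces):
    """Measure of a finite union via inclusion-exclusion:
    alternating sum of mu over intersections of r of the pieces."""
    pieces = [frozenset(p) for p in pieces]
    total = 0
    for r in range(1, len(pieces) + 1):
        sign = 1 if r % 2 else -1
        for combo in combinations(pieces, r):
            total += sign * mu(frozenset.intersection(*combo))
    return total

A = {1, 2, 3, 4, 5, 6}
C = [{1, 2, 3}, {3, 4}, {4, 5, 6, 7}]       # overlapping, like the C_n in the proof
pieces = [c & A for c in C]                  # the sets C_i intersected with A

direct = len(set().union(*pieces))
via_inclusion_exclusion = union_measure(len, pieces)
print(direct, via_inclusion_exclusion)       # 6 6
```

Each term is the measure of $A$ intersected with a finite intersection of the $C_i$, so two measures that agree on all such sets must agree on the union.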
Exercise 12 Return to Example 17. You should now be able to answer the question posed
there.
Exercise 13 Suppose that $\Omega = \{a, b, c, d, e\}$ and I tell you the value of $P(\{a, b\})$ and
$P(\{b, c\})$. For which subset of $\Omega$ do I need to define $P(\cdot)$ in order to have a unique
extension of $P$ to a $\sigma$-field of subsets of $\Omega$?
Also, there is $f > a$ such that $F(a) > F(f) - \epsilon$. Now, the interval $[f, b]$ is compact and
$[f, b] \subseteq \bigcup_{k=1}^\infty (c_k, e_k)$. So there are finitely many $(c_k, e_k)$'s (suppose they are the first $n$) such
that $[f, b] \subseteq \bigcup_{k=1}^n (c_k, e_k)$. Now,
$$F(b) - F(a) \le F(b) - F(f) + \epsilon \le \epsilon + \sum_{k=1}^n [F(e_k) - F(c_k)] \le 2\epsilon + \sum_{k=1}^n [F(d_k) - F(c_k)].$$
Here we have to work with finitely many $(c_k, e_k)$'s because we do not yet have countable
sub-additivity. It follows that $F(b) - F(a) \le 2\epsilon + \sum_{k=1}^\infty [F(d_k) - F(c_k)]$. Since this is true for
all $\epsilon > 0$, it is true for $\epsilon = 0$.
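If $-\infty = a < b < \infty$, let $g > -\infty$ be such that $F(g) < \epsilon$. The above argument shows that
$$F(b) - F(g) \le \sum_{k=1}^\infty [F(d_k \vee g) - F(c_k \vee g)] \le \sum_{k=1}^\infty [F(d_k) - F(c_k)].$$
Since $\lim_{g\to-\infty} F(g) = 0$, it follows that $F(b) \le \sum_{k=1}^\infty [F(d_k) - F(c_k)]$. Similar arguments
work when $a < b = \infty$ and $-\infty = a < b = \infty$.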
Example 22 (Lebesgue measure) Start with the function $F(x) = x$, form the measure
$\mu$ on the field $\mathcal{U}$ and extend it to the Borel $\sigma$-field. The result is called Lebesgue measure,
and it extends the concept of length from intervals to more general sets.
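On finite unions of disjoint half-open intervals $(a, b]$ the construction is just a sum of increments $F(b) - F(a)$. The sketch below illustrates this with $F(x) = x$; the function name `measure_of_intervals` is hypothetical, and the sketch covers only finite disjoint unions, not the extension to the Borel $\sigma$-field.

```python
def measure_of_intervals(F, intervals):
    """Sum of F(b) - F(a) over pairwise disjoint half-open intervals (a, b]."""
    ivs = sorted(intervals)
    for a, b in ivs:
        if a > b:
            raise ValueError("each interval needs a <= b")
    for (_, b1), (a2, _) in zip(ivs, ivs[1:]):
        if a2 < b1:                      # (a1, b1] and (a2, b2] would overlap
            raise ValueError("intervals must be disjoint")
    return sum(F(b) - F(a) for a, b in ivs)

def length(x):
    """F(x) = x, the choice that leads to Lebesgue measure."""
    return x

print(measure_of_intervals(length, [(0, 1), (2, 3.5)]))      # 2.5
# Splitting (0, 1] into (0, 0.25] and (0.25, 1] does not change its measure.
print(measure_of_intervals(length, [(0, 0.25), (0.25, 1)]))  # 1.0
```

Passing a different nondecreasing right-continuous `F` gives the corresponding measure of the same intervals.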
Example 23 Every distribution function for a random variable has a corresponding
probability measure on the real line.
Exercise 14 In this exercise, we prove Theorem 24. Note that the uniqueness of the
extension is a direct consequence of Theorem 20. We only need to prove the existence.
First, for each $B \in 2^\Omega$, define
$$\mu^*(B) = \inf \sum_{i=1}^\infty \mu(A_i), \tag{15}$$
1. Show that $\mu^*$ extends $\mu$, i.e. that $\mu^*(A) = \mu(A)$ for each $A \in \mathcal{C}$.
3. Show that $\mathcal{C} \subseteq \mathcal{A}$.
Supplement: Measures from Increasing Functions
Lemma 21 deals only with functions $F$ that are cdfs. Suppose that $F$ is an unbounded
nondecreasing function that is continuous from the right. If $-\infty < a < b < \infty$, then the proof
of Lemma 21 still applies. Suppose that $(-\infty, b] = \bigcup_{k=1}^\infty (c_k, d_k]$ with $b < \infty$ and all $(c_k, d_k]$
disjoint. Suppose that $\lim_{x\to-\infty} F(x) = -\infty$. We want to show that $\sum_{k=1}^\infty [F(d_k) - F(c_k)] =
\infty$. If one $c_k = -\infty$, the proof is immediate, so assume that all $c_k > -\infty$. Then there must
be a subsequence $\{k_j\}_{j=1}^\infty$ such that $\lim_{j\to\infty} c_{k_j} = -\infty$. For each $j$, let $\{(c'_{j,n}, d'_{j,n}]\}_{n=1}^\infty$ be the
subsequence of intervals that cover $(c_{k_j}, b]$. For each $j$, the proof of Lemma 21 applies to
show that
$$F(b) - F(c_{k_j}) = \sum_{n=1}^\infty [F(d'_{j,n}) - F(c'_{j,n})]. \tag{16}$$
As $j \to \infty$, the left side of Equation 16 goes to $\infty$ while the right side eventually includes
every interval in the original collection.
A similar proof works for an interval of the form $(a, \infty)$ when $\lim_{x\to\infty} F(x) = \infty$. A
combination of the two works for $(-\infty, \infty)$.