
Probability Essentials

Jacod and Protter


April 1, 2005

19  Weak convergence and characteristic functions

We would like to use characteristic functions (ch.f.'s) to study the convergence of r.v.s.


Theorem 19.1 (Lévy) Let $(\mu_n)$ be a sequence of probability measures on $\mathbb{R}^d$.
a) If $\mu_n \to \mu$ weakly, then $\hat{\mu}_n(u) \to \hat{\mu}(u)$, $\forall u \in \mathbb{R}^d$.
b) If $\hat{\mu}_n(u)$ converges to a function $f(u)$, $\forall u \in \mathbb{R}^d$, and if $f$ is continuous at $u = 0$, then there exists a probability measure $\mu$ on $\mathbb{R}^d$ s.t. $f = \hat{\mu}$ and $\mu_n \to \mu$ weakly.
Proof: a) Since $x \mapsto e^{iux}$ is bounded and continuous, we see that
$$\int e^{iux}\,\mu_n(dx) \to \int e^{iux}\,\mu(dx).$$

b) We assume $d = 1$. We first prove the tightness of $\{\mu_n\}$.


Recall that $\{\mu_n\}$ is tight if $\forall \epsilon > 0$, $\exists b$, $\forall n$, $\mu_n([-b,b]^c) < \epsilon$.
Note that
$$\int_{-\alpha}^{\alpha} \hat{\mu}_n(u)\,du = \int_{-\alpha}^{\alpha} \int e^{iux}\,\mu_n(dx)\,du \overset{\text{Fubini}}{=} \int \int_{-\alpha}^{\alpha} e^{iux}\,du\,\mu_n(dx) = \int \frac{2\sin(\alpha x)}{x}\,\mu_n(dx).$$
Recall
$$0 \le 1 - \frac{\sin x}{x} \quad \text{and} \quad 1 - \frac{\sin x}{x} \ge \frac{1}{2} \quad \text{if } |x| > 2.$$
Then
$$\frac{1}{\alpha}\int_{-\alpha}^{\alpha} \left(1 - \hat{\mu}_n(u)\right)du = 2\int \left(1 - \frac{\sin(\alpha x)}{\alpha x}\right)\mu_n(dx) \ge 2\int_{|\alpha x| > 2} \frac{1}{2}\,\mu_n(dx) = \mu_n([-\beta, \beta]^c),$$
where $\beta = 2/\alpha$. Since $f$ is continuous at $0$, $\forall \epsilon > 0$, $\exists \alpha$, $\forall |u| < \alpha$, we have $|1 - f(u)| < \frac{\epsilon}{4}$ and hence
$$\frac{1}{\alpha}\int_{-\alpha}^{\alpha} (1 - f(u))\,du < \frac{\epsilon}{2}.$$
As
$$\frac{1}{\alpha}\int_{-\alpha}^{\alpha} \left(1 - \hat{\mu}_n(u)\right)du \to \frac{1}{\alpha}\int_{-\alpha}^{\alpha} (1 - f(u))\,du,$$
$\exists N$, $\forall n > N$,
$$\mu_n([-\beta, \beta]^c) \le \frac{1}{\alpha}\int_{-\alpha}^{\alpha} \left(1 - \hat{\mu}_n(u)\right)du < \epsilon.$$
There are only finitely many $n$ before $N$. For each such $n$, $\exists \beta_n$ s.t. $\mu_n([-\beta_n, \beta_n]^c) < \epsilon$. Let
$$b = \max\{\beta_1, \dots, \beta_N, \beta\}.$$
Then
$$\mu_n([-b, b]^c) < \epsilon, \quad \forall n.$$
This proves the tightness of $\{\mu_n\}$. Let $\mu_{n_k} \to \mu$ weakly. Then $\hat{\mu}_{n_k}(u) \to \hat{\mu}(u)$. Thus $f(u) = \hat{\mu}(u)$.
It remains to show that $\mu_n \to \mu$ weakly. For each convergent subsequence $\mu_{n_k'} \to \mu'$, we have $\hat{\mu}' = f = \hat{\mu}$. Hence $\mu' = \mu$. This proves the convergence of $\mu_n$.

Example 19.2 Suppose $X_n \sim \text{Poisson}(n)$. Let
$$Z_n = \frac{1}{\sqrt{n}}(X_n - n).$$
Then $Z_n \xrightarrow{D} Z$, $\mathcal{L}(Z) = N(0,1)$.

Proof:
$$\hat{Z}_n(u) = e^{-iu\sqrt{n}}\,\hat{X}_n(u/\sqrt{n}) = e^{-iu\sqrt{n}}\,e^{n(e^{iu/\sqrt{n}} - 1)} = \exp\left\{n\left(e^{iu/\sqrt{n}} - 1\right) - iu\sqrt{n}\right\}$$
$$= \exp\left\{n\left(\frac{iu}{\sqrt{n}} + \frac{1}{2}\left(\frac{iu}{\sqrt{n}}\right)^2 + o\left(\frac{1}{n}\right)\right) - iu\sqrt{n}\right\} = \exp\left\{-\frac{1}{2}u^2 + o(1)\right\} \to \exp\left\{-\frac{1}{2}u^2\right\}.$$

Homework: 2, 3.
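The convergence in Example 19.2 is easy to check numerically. A minimal simulation sketch (NumPy assumed; the sample sizes here are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def standardized_poisson(n, size):
    """Draw Z_n = (X_n - n) / sqrt(n) for X_n ~ Poisson(n)."""
    x = rng.poisson(lam=n, size=size)
    return (x - n) / np.sqrt(n)

# For large n the first two moments of Z_n should match N(0, 1).
z = standardized_poisson(n=10_000, size=200_000)
print(z.mean(), z.std())
```

Histogramming `z` against the standard normal density makes the weak convergence visible directly.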

20  The laws of large numbers

Recall: a sample of $n$ lightbulbs has lifespans $X_1, \dots, X_n$. The estimate of the average lifespan is
$$\bar{X}_n = \frac{1}{n}(X_1 + \cdots + X_n).$$
Guess: $\bar{X}_n \to \mu$.

Theorem 20.1 (Strong law of large numbers, SLLN) Let $(X_j)_{j \ge 1}$ be i.i.d. Let
$$\mu = E(X_j) \quad \text{and} \quad \sigma^2 = \mathrm{Var}(X_j) < \infty.$$
Let $S_n = \sum_{i=1}^n X_i$. Then
$$\frac{S_n}{n} \to \mu \quad \text{a.s. and in } L^2.$$

Proof: Consider $Z_j = X_j - \mu$. Then $(Z_j)_{j \ge 1}$ are i.i.d. with mean $0$, and
$$\frac{1}{n}(Z_1 + \cdots + Z_n) = \frac{S_n}{n} - \mu.$$
We only need to show that
$$\frac{1}{n}(Z_1 + \cdots + Z_n) \to 0.$$
In other words, we may assume that $\mu = 0$.

For the $L^2$ convergence,
$$E\left(\frac{S_n}{n}\right)^2 = \frac{1}{n^2}\mathrm{Var}(S_n) = \frac{1}{n^2}\,n\sigma^2 = \frac{\sigma^2}{n} \to 0.$$
So
$$\frac{S_n}{n} \xrightarrow{L^2} 0.$$

As
$$\sum_n E\left(\frac{S_{n^2}}{n^2}\right)^2 = \sum_n \frac{\sigma^2}{n^2} < \infty,$$
we have
$$\sum_n \left(\frac{S_{n^2}}{n^2}\right)^2 < \infty, \quad \text{a.s.},$$
and hence,
$$\frac{S_{n^2}}{n^2} \to 0, \quad \text{a.s.}$$

Let $p(n) \in \mathbb{N}$ be s.t.
$$p(n)^2 \le n < (p(n)+1)^2.$$
Then
$$\frac{S_n}{n} - \frac{p(n)^2}{n}\,\frac{S_{p(n)^2}}{p(n)^2} = \frac{1}{n}\sum_{j=p(n)^2+1}^{n} X_j.$$
So
$$E\left(\frac{S_n}{n} - \frac{p(n)^2}{n}\,\frac{S_{p(n)^2}}{p(n)^2}\right)^2 \le \frac{1}{n^2}\left[(p(n)+1)^2 - p(n)^2\right]\sigma^2 \le \frac{1}{n^2}\,3\sqrt{n}\,\sigma^2 = \frac{3\sigma^2}{n^{3/2}}$$
is summable. Thus
$$\frac{S_n}{n} - \frac{p(n)^2}{n}\,\frac{S_{p(n)^2}}{p(n)^2} \to 0, \quad \text{a.s.}$$
As $\frac{p(n)^2}{n} \to 1$ and $\frac{S_{p(n)^2}}{p(n)^2} \to 0$ a.s., we have
$$\frac{S_n}{n} \to 0, \quad \text{a.s.}$$

Note that the condition depends on $\sigma^2$, but the conclusion does not. We hope to relax the condition. The following theorem will be proved in Ch. 27.

Theorem 20.2 Let $(X_j)_{j \ge 1}$ be i.i.d. and $\mu \in \mathbb{R}$. Then
$$\frac{S_n}{n} \to \mu \ \text{a.s.} \quad \text{iff} \quad E(X_j) = \mu.$$
In this case, the convergence also holds in $L^1$.

HW: 7, 9.
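The lightbulb motivation at the start of the section can be played out numerically. A sketch with exponential lifespans of mean 2 (a hypothetical choice), showing the running averages settle near the mean as Theorem 20.1 predicts:

```python
import numpy as np

rng = np.random.default_rng(1)

mu = 2.0  # true average lifespan (assumed for this illustration)
x = rng.exponential(scale=mu, size=1_000_000)

# Running averages S_n / n at a few checkpoints.
csum = np.cumsum(x)
for n in (100, 10_000, 1_000_000):
    print(n, csum[n - 1] / n)
```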

21  The central limit theorem

We know $\frac{S_n}{n} \to \mu$. How fast is the convergence? We shall prove that $\frac{S_n}{n} - \mu$ is of the order $\frac{1}{\sqrt{n}}$.

Theorem 21.1 (Central limit theorem) Let $(X_j)_{j \ge 1}$ be i.i.d. with $E(X_j) = \mu$ and $\mathrm{Var}(X_j) = \sigma^2$, $0 < \sigma^2 < \infty$. Let
$$Y_n = \frac{S_n - n\mu}{\sigma\sqrt{n}}.$$
Then $Y_n \xrightarrow{D} Y$ where $\mathcal{L}(Y) = N(0,1)$.


Proof: Let $\phi$ be the ch.f. of $X_j - \mu$. Then
$$\hat{Y}_n(u) = \phi\left(\frac{u}{\sigma\sqrt{n}}\right)^n.$$
Note that, by Exer. 14.4, we have
$$\phi(u) = 1 - \frac{\sigma^2 u^2}{2} + o(u^2), \quad \text{as } u \to 0.$$
Hence
$$\log \hat{Y}_n(u) = n \log\left(1 - \frac{u^2}{2n} + o\left(\frac{1}{n}\right)\right) = -\frac{1}{2}u^2 + o(1).$$
Therefore,
$$\hat{Y}_n(u) \to e^{-\frac{1}{2}u^2}.$$
Next, we consider the case where $(X_j)_{j \ge 1}$ are not i.i.d.

Theorem 21.2 Let $(X_j)_{j \ge 1}$ be independent with $E(X_j) = 0$, $E(X_j^2) = \sigma_j^2$. Assume
$$\sup_j E\{|X_j|^{2+\epsilon}\} < \infty \ \text{for some } \epsilon > 0 \quad \text{and} \quad \sum_j \sigma_j^2 = \infty.$$
Then
$$\frac{S_n}{\sqrt{\sum_{j=1}^n \sigma_j^2}} \to Z, \qquad \mathcal{L}(Z) = N(0,1).$$

Proof: We only give a sketch. Denote
$$a_n = \sqrt{\sum_{j=1}^n \sigma_j^2}.$$
Note that
$$\hat{S}_n\left(\frac{u}{a_n}\right) = \prod_{j=1}^n \left(1 - \frac{\sigma_j^2 u^2}{2 a_n^2} + o\left(\frac{\sigma_j^2}{a_n^2}\right)\right).$$
Hence
$$\log \hat{S}_n\left(\frac{u}{a_n}\right) = \sum_{j=1}^n \log\left(1 - \frac{\sigma_j^2 u^2}{2 a_n^2} + o\left(\frac{\sigma_j^2}{a_n^2}\right)\right) = -\frac{1}{2}u^2 + o(1).$$

Finally, we consider the $d$-dimensional version of the CLT.

Theorem 21.3 Let $(X_j)_{j \ge 1}$ be i.i.d. $\mathbb{R}^d$-valued r.v.s. Let $\mu = E(X_j) \in \mathbb{R}^d$, and let $Q$ be the covariance matrix of $X_j$. Then
$$\frac{S_n - n\mu}{\sqrt{n}} \xrightarrow{D} Z \sim N(0, Q).$$
Proof: Similar to the 1-dimensional case.

Example 21.4 Let $(X_j)_{j \ge 1}$ be i.i.d. Bernoulli$(p)$. Then
$$S_n = \sum_{j=1}^n X_j \sim B(n, p), \qquad \mu = p, \quad \sigma^2 = p - p^2 = p(1-p).$$
$$\text{SLLN:} \quad \frac{S_n}{n} \to p. \qquad \text{CLT:} \quad \frac{S_n - np}{\sqrt{np(1-p)}} \xrightarrow{D} N(0,1).$$

Example 21.5 Let $(X_j)_{j \ge 1}$ be i.i.d. r.v.s in $L^2$ with common distribution function $F$ (unknown). We would like to estimate $F$ using $X_1, \dots, X_n$.
Recall
$$F(x) = P(X_j \le x) = E 1_{X_j \le x}.$$
Define
$$Y_j = 1_{X_j \le x}.$$
Then
$$\frac{1}{n}\sum_{j=1}^n Y_j \to F(x) \quad (\text{SLLN}).$$
Let
$$F_n(x) = \frac{1}{n}\sum_{j=1}^n Y_j.$$
$F_n$ is called the empirical distribution function.

Actually, we can prove that the convergence is uniform in $x$:
$$\sup_x |F_n(x) - F(x)| \to 0, \quad \text{a.s.}$$
How fast?
$$\sqrt{n}(F_n(x) - F(x)) = \frac{1}{\sqrt{n}}\sum_{j=1}^n (Y_j - E(Y_j)) \xrightarrow{D} N(0, \sigma^2(x))$$
where
$$\sigma^2(x) = \mathrm{Var}(Y_j) = F(x)(1 - F(x)).$$

HW: 2, 3, 5, 8, 11
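Example 21.5 suggests a direct experiment: build $F_n$ from a sample and watch $\sup_x |F_n(x) - F(x)|$ shrink. A sketch for standard normal data (the grid and sample sizes are arbitrary choices):

```python
import numpy as np
from math import erf

rng = np.random.default_rng(2)

def empirical_cdf(sample, x):
    """F_n(x) = (1/n) #{j : X_j <= x}."""
    return np.mean(sample <= x)

# True standard-normal CDF.
F = lambda x: 0.5 * (1.0 + erf(x / np.sqrt(2.0)))

grid = np.linspace(-3, 3, 61)
for n in (100, 10_000, 1_000_000):
    xs = rng.standard_normal(n)
    err = max(abs(empirical_cdf(xs, x) - F(x)) for x in grid)
    print(n, err)
```

By the CLT part of the example, the error at a fixed $x$ is of order $\sqrt{F(x)(1-F(x))/n}$.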

22  $L^2$ and Hilbert space

Recall: $L^2$ consists of r.v.s $X$ such that $E(X^2) < \infty$. We identify
$$X = Y \quad \text{if} \quad X = Y, \ \text{a.s.}$$
Define an inner product on $L^2$: $\forall X, Y \in L^2$,
$$\langle X, Y \rangle = E(XY).$$
By Thm 9.3, $L^2$ is a linear space: $\forall X, Y \in L^2$, $\forall \alpha, \beta \in \mathbb{R}$, we have $\alpha X + \beta Y \in L^2$. Further,
$$\langle \alpha X + \beta Y, Z \rangle = \alpha\langle X, Z \rangle + \beta\langle Y, Z \rangle.$$
Observe,
$$\langle X, X \rangle = E(X^2) \ge 0$$
and
$$\langle X, X \rangle = 0 \quad \text{iff} \quad X = 0, \ \text{a.s.}$$
Define the norm
$$\|X\| = \langle X, X \rangle^{1/2} = (E(X^2))^{1/2}.$$
Triangle inequality:
$$\|X + Y\|^2 = E(X^2) + E(Y^2) + 2E(XY) \le E(X^2) + E(Y^2) + 2\sqrt{E(X^2)E(Y^2)} = \left(\sqrt{E(X^2)} + \sqrt{E(Y^2)}\right)^2.$$
To summarize, we have

Theorem 22.1 $L^2$ is a normed vector space with inner product $\langle \cdot, \cdot \rangle$ and $\|\cdot\| = \langle \cdot, \cdot \rangle^{1/2}$.

Definition 22.2 1) A normed space is complete if every Cauchy sequence $X_n$ (i.e. $\|X_n - X_m\| \to 0$ as $n, m \to \infty$) has a limit.
2) A Hilbert space $H$ is a complete normed vector space with an inner product satisfying
$$\|x\| = \langle x, x \rangle^{1/2}, \quad \forall x \in H.$$

Theorem 22.3 $L^2$ is a Hilbert space.

Proof: We only need to prove that $L^2$ is complete. Let $X_n$ be a Cauchy sequence in $L^2$. Then
$$\forall \epsilon > 0, \ \exists N, \ \forall n, m \ge N, \quad \|X_n - X_m\| \le \epsilon.$$
Choosing $\epsilon = \frac{1}{2^k}$, $\exists n_k$ s.t.
$$\|X_{n_{k+1}} - X_{n_k}\| \le \frac{1}{2^k}.$$
Define
$$Y_m(\omega) = \sum_{k=1}^m |X_{n_{k+1}}(\omega) - X_{n_k}(\omega)|.$$
Then
$$Y_m \uparrow Y = \sum_{k=1}^\infty |X_{n_{k+1}} - X_{n_k}|.$$
Note that, by the triangle inequality,
$$E(Y_m^2) \le \left(\sum_{k=1}^m \|X_{n_{k+1}} - X_{n_k}\|\right)^2 \le 1.$$
By MCT,
$$E(Y^2) = \lim_{m\to\infty} E(Y_m^2) \le 1.$$
So $Y < \infty$ a.s., i.e.
$$\sum_{k=1}^\infty |X_{n_{k+1}} - X_{n_k}| < \infty, \quad \text{a.s.}$$
So
$$X_{n_1} + \sum_{k=1}^\infty (X_{n_{k+1}} - X_{n_k}) \quad \text{converges a.s.}$$
Denote the limit by $X$. Then $X_{n_k} \to X$ a.s.

Note that
$$X - X_{n_k} = \lim_{m\to\infty} \sum_{j=k}^m (X_{n_{j+1}} - X_{n_j}), \quad \text{a.s.}$$
Thus
$$\|X - X_{n_k}\| \le \lim_{m\to\infty} \sum_{j=k}^m \|X_{n_{j+1}} - X_{n_j}\| \le \frac{1}{2^{k-1}} \to 0.$$
Hence, $X_{n_k} \to X$ in $L^2$.
Note that
$$\|X_n - X\| \le \|X_n - X_{n_k}\| + \|X_{n_k} - X\|.$$
Letting $n, n_k \to \infty$, we see that $\|X_n - X\| \to 0$.
Next, we introduce some basic properties of Hilbert spaces. Throughout, $H$ is a Hilbert space with norm $\|\cdot\|$ and inner product $\langle \cdot, \cdot \rangle$, and $\alpha, \beta$ are real numbers.

Definition 22.4 Two vectors $x, y \in H$ are orthogonal if $\langle x, y \rangle = 0$. A vector $x$ is orthogonal to a set of vectors $\Gamma$ if
$$\langle x, y \rangle = 0, \quad \forall y \in \Gamma.$$
We denote the set of all such $x$ by $\Gamma^\perp$.

Theorem 22.5 If $x_n \to x$, $y_n \to y$ in $H$, then $\langle x_n, y_n \rangle \to \langle x, y \rangle$ in $\mathbb{R}$.

Proof:
$$|\langle x_n, y_n \rangle - \langle x, y \rangle| = |\langle x_n - x, y_n \rangle + \langle x, y_n - y \rangle| \le \|x_n - x\|\,\|y_n\| + \|x\|\,\|y_n - y\| \to 0.$$

Definition 22.6 A subset $L$ of $H$ is a subspace if it is linear (i.e., $\forall x, y \in L$, we have $\alpha x + \beta y \in L$) and it is closed ($L \ni x_n \to x$ in $H$ implies $x \in L$).

Theorem 22.7 Let $\Gamma$ be a subset of $H$. Then $\Gamma^\perp$ is a subspace of $H$.

Proof: For $x_1, x_2 \in \Gamma^\perp$ and any $y \in \Gamma$,
$$\langle \alpha x_1 + \beta x_2, y \rangle = \alpha\langle x_1, y \rangle + \beta\langle x_2, y \rangle = 0.$$
If $\Gamma^\perp \ni x_n \to x$, then for any $y \in \Gamma$,
$$\langle x, y \rangle = \lim_{n\to\infty} \langle x_n, y \rangle = 0.$$
So $x \in \Gamma^\perp$.

Definition 22.8 Let $L$ be a subspace of $H$.
$$d(x, L) = \inf\{\|x - y\| : y \in L\}$$
is the distance between $x \in H$ and the subspace $L$.

Theorem 22.9 Let $L$ be a subspace of $H$. $\forall x \in H$, $\exists!\, y \in L$ s.t.
$$\|x - y\| = d(x, L).$$

Proof: If $x \in L$, then $y = x$. If $x \notin L$, take $y_n \in L$ s.t.
$$\|x - y_n\| \to d(x, L).$$
Then
$$\|y_n - y_m\|^2 = \|(y_n - x) - (y_m - x)\|^2 = \|y_n - x\|^2 + \|y_m - x\|^2 - 2\langle y_n - x, y_m - x \rangle.$$
Note that
$$\|2x - y_n - y_m\|^2 = \|y_n - x\|^2 + \|y_m - x\|^2 + 2\langle y_n - x, y_m - x \rangle.$$
So
$$\|y_n - y_m\|^2 = 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4\left\|x - \frac{y_n + y_m}{2}\right\|^2 \le 2\|y_n - x\|^2 + 2\|y_m - x\|^2 - 4d(x, L)^2 \to 0.$$
Namely, $\{y_n\}$ is a Cauchy sequence, so $\exists y \in L$ s.t. $y_n \to y$. Hence
$$d(x, L) = \|x - y\|.$$
Now we prove the uniqueness. Suppose that
$$d(x, L) = \|x - z\|, \quad z \in L.$$
Then choose $w_{2n} = y$ and $w_{2n+1} = z$. As for $y_n$ above, the sequence $w_n$ is a Cauchy sequence. So $y = z$.
We call $y$ the projection of $x$ onto $L$, and write $y = \Pi x$.

Theorem 22.10 Properties of $\Pi$. i) $\Pi$ is idempotent: $\Pi^2 = \Pi$.
ii) $\Pi x = x$ if $x \in L$; $\Pi x = 0$ if $x \in L^\perp$.
iii) $\forall x \in H$, $x - \Pi x \in L^\perp$.

Proof: i) follows from the first part of ii), which is given in the last proof.
ii) If $x \in L^\perp$, then for any $y \in L$,
$$\|x - y\|^2 = \|x\|^2 + \|y\|^2 \ge \|x\|^2.$$
Hence $\Pi x = 0$.
iii) $\forall y \in L$ and $t > 0$, we have $2ty + \Pi x \in L$. Then
$$\|x - \Pi x\|^2 \le \|x - (2ty + \Pi x)\|^2 = \|x - \Pi x\|^2 - 4t\langle x - \Pi x, y \rangle + 4t^2\|y\|^2.$$
Hence
$$\langle x - \Pi x, y \rangle \le t\|y\|^2.$$
Replacing $y$ by $-y$, we get
$$\langle x - \Pi x, y \rangle \ge -t\|y\|^2.$$
Combining both inequalities and taking $t \to 0$, we get
$$\langle x - \Pi x, y \rangle = 0.$$

Corollary 22.11
$$x = \Pi x + (x - \Pi x) \in L \oplus L^\perp$$
is the unique decomposition.

Corollary 22.12 i)
$$\langle \Pi x, y \rangle = \langle x, \Pi y \rangle, \quad \forall x, y \in H.$$
ii)
$$\Pi(x + y) = \Pi x + \Pi y.$$

Proof: i)
$$\langle \Pi x, y \rangle = \langle \Pi x, \Pi y + (y - \Pi y) \rangle = \langle \Pi x, \Pi y \rangle.$$
Similarly,
$$\langle x, \Pi y \rangle = \langle \Pi x, \Pi y \rangle.$$
ii) As
$$x = x_1 + x_2, \quad x_1 \in L, \ x_2 \in L^\perp$$
and
$$y = y_1 + y_2, \quad y_1 \in L, \ y_2 \in L^\perp,$$
we have
$$x + y = (x_1 + y_1) + (x_2 + y_2), \quad x_1 + y_1 \in L, \ x_2 + y_2 \in L^\perp.$$
Hence
$$\Pi(x + y) = x_1 + y_1 = \Pi x + \Pi y.$$

Finally, we give a characterization of the projection operator.

Theorem 22.13 Suppose $T : H \to L$ satisfies: $\forall x \in H$, $x - Tx \in L^\perp$. Then $T = \Pi$.

Proof: For any $x \in H$,
$$x = Tx + (x - Tx) \in L \oplus L^\perp.$$
So
$$Tx = \Pi x.$$
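The projection operator of Theorems 22.10 and 22.13 can be checked numerically in the finite-dimensional Hilbert space $\mathbb{R}^5$: project onto the column span of a matrix by least squares, then verify that the residual is orthogonal to the subspace and that $\Pi$ is idempotent. A minimal NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(3)

# L = column span of A, a 2-dimensional subspace of R^5.
A = rng.standard_normal((5, 2))
x = rng.standard_normal(5)

# Pi x = A c, where c minimizes ||x - A c|| (least squares).
c, *_ = np.linalg.lstsq(A, x, rcond=None)
pix = A @ c

# Theorem 22.10 iii): x - Pi x is orthogonal to L.
residual = x - pix
print(A.T @ residual)

# Idempotence (Theorem 22.10 i)): projecting Pi x again changes nothing.
c2, *_ = np.linalg.lstsq(A, pix, rcond=None)
print(np.allclose(A @ c2, pix))
```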

23  Conditional expectation

Let $X, Y$ be two r.v.s, $Y$ $\mathbb{R}$-valued, $X$ $S$-valued, $S = \{x_1, x_2, \dots\}$ countable.

If we do not have any information, we expect $Y$ to be $E(Y)$. If we know $X = x_j$ occurred, then we will modify our expectation about $Y$.
Conditional probability measure:
$$Q(\Lambda) = P(\Lambda \mid X = x_j), \quad \forall \Lambda \in \mathcal{A}.$$

Definition 23.1 Let $X$ be $S$-valued. If $P(X = x_j) > 0$, we define the conditional expectation of $Y$ given $X = x_j$ as
$$E(Y \mid X = x_j) = E_Q(Y).$$

Theorem 23.2 If $X$ is $S$-valued, $Y$ is $\{y_1, y_2, \dots\}$-valued and $P(X = x_j) > 0$, then
$$E(Y \mid X = x_j) = \sum_k y_k P(Y = y_k \mid X = x_j).$$

Proof:
$$E(Y \mid X = x_j) = E_Q(Y) = \sum_k y_k Q(Y = y_k) = \sum_k y_k P(Y = y_k \mid X = x_j).$$

Note that $E(Y \mid X = x_j)$ is a function of $x_j$. Define
$$f(x) = \begin{cases} E(Y \mid X = x) & \text{if } P(X = x) > 0 \\ \text{any value} & \text{if } P(X = x) = 0. \end{cases}$$

Definition 23.3 Let $X$ be $S$-valued and $Y$ be $\mathbb{R}$-valued. Then $E(Y \mid X)$ is defined as
$$E(Y \mid X) = f(X).$$

Example 23.4 Let $X$ be a Poisson r.v. with parameter $\lambda$. Given $X = n$, $S$ is binomial with parameters $n, p$. Find $E(S \mid X)$ and $E(X \mid S)$.

Solution:
$$E(S \mid X = n) = np, \qquad E(S \mid X) = pX.$$
Now we calculate $E(X \mid S)$.

Note that for $n \ge k$,
$$P(X = n \mid S = k) = \frac{P(X = n, S = k)}{P(S = k)} = \frac{P(S = k \mid X = n)P(X = n)}{\sum_{m \ge k} P(S = k \mid X = m)P(X = m)}$$
$$= \frac{\binom{n}{k} p^k (1-p)^{n-k} \frac{\lambda^n}{n!} e^{-\lambda}}{\sum_{m \ge k} \binom{m}{k} p^k (1-p)^{m-k} \frac{\lambda^m}{m!} e^{-\lambda}} = \frac{\frac{1}{(n-k)!}((1-p)\lambda)^{n-k}}{\sum_{m \ge k} \frac{1}{(m-k)!}((1-p)\lambda)^{m-k}} = \frac{((1-p)\lambda)^{n-k}}{(n-k)!}\,e^{-(1-p)\lambda},$$
i.e. given $S = k$, $X$ is distributed as $k + \text{Poisson}((1-p)\lambda)$. Hence
$$E(X \mid S = k) = k + (1-p)\lambda.$$
Namely,
$$E(X \mid S) = S + (1-p)\lambda.$$
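Example 23.4 can be verified by simulation: thin a Poisson count and compare the empirical conditional mean of $X$ given $S = k$ with $k + (1-p)\lambda$. A sketch (the parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)
lam, p, n = 10.0, 0.3, 1_000_000

# X ~ Poisson(lam); given X, S ~ Binomial(X, p) (thinning).
X = rng.poisson(lam, size=n)
S = rng.binomial(X, p)

# E(X | S = k) should be k + (1 - p) * lam.
k = 3
est = X[S == k].mean()
print(est, k + (1 - p) * lam)
```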
Next, we wish to define $E(Y \mid X)$ when the state space of $X$ is not countable.

Definition 23.5 Let $X : (\Omega, \mathcal{A}) \to (\mathbb{R}^n, \mathcal{B}^n)$. The $\sigma$-algebra generated by $X$ is $\sigma(X) = X^{-1}(\mathcal{B}^n)$, i.e.,
$$\sigma(X) = \{A : X^{-1}(B) = A, \ B \in \mathcal{B}^n\}.$$

Theorem 23.6 Suppose $X$ is an $\mathbb{R}^n$-valued r.v. and $Y$ is an $\mathbb{R}$-valued r.v. Then $Y$ is $\sigma(X)$-measurable iff there is a Borel function $f$ on $\mathbb{R}^n$ such that $Y = f(X)$.

Proof: $\Leftarrow$: $\forall B \in \mathcal{B}$,
$$Y^{-1}(B) = X^{-1}(f^{-1}(B)) \in X^{-1}(\mathcal{B}^n) = \sigma(X).$$
$\Rightarrow$: Suppose $Y = 1_A$ is $\sigma(X)$-measurable; then $A \in \sigma(X)$. There is $B \in \mathcal{B}^n$ such that $A = X^{-1}(B)$. Hence, $Y = 1_B(X)$, i.e. $f = 1_B$.
If $Y$ is simple, then
$$Y = \sum_{i=1}^k a_i 1_{A_i}.$$
In this case, we can prove that
$$Y = f(X), \qquad f = \sum_{i=1}^k a_i 1_{B_i}$$
where $B_i \in \mathcal{B}^n$ with $X^{-1}(B_i) = A_i$.

If $Y \ge 0$, then there exists a sequence of $\sigma(X)$-measurable simple r.v.s $Y_n \uparrow Y$. Note that $Y_n = f_n(X)$. Set
$$f(x) = \limsup_{n\to\infty} f_n(x).$$
Then
$$Y = \lim_{n\to\infty} Y_n = \lim_{n\to\infty} f_n(X) = f(X).$$
For general $Y$, we have
$$Y = Y^+ - Y^- = f_1(X) - f_2(X).$$

Next we define $E(Y \mid X)$ when $Y \in L^2(\Omega, \mathcal{A}, P)$.

Definition 23.7 Let $Y \in L^2(\Omega, \mathcal{A}, P)$. The conditional expectation of $Y$ given $X$ is the unique element $\hat{Y}$ in $L^2(\Omega, \sigma(X), P)$ such that
$$E(YZ) = E(\hat{Y}Z), \quad \forall Z \in L^2(\Omega, \sigma(X), P). \tag{23.1}$$
We write
$$\hat{Y} = E(Y \mid X).$$

Remark 23.8 1. $L^2(\Omega, \sigma(X), P)$ is a subspace of $L^2(\Omega, \mathcal{A}, P)$.
2.
$$\hat{Y} = \mathrm{Proj}_{L^2(\Omega, \sigma(X), P)}\, Y.$$
3. Since $\hat{Y}$ is $\sigma(X)$-measurable, $\exists f$ s.t. $\hat{Y} = f(X)$.

Next, we can replace $\sigma(X)$ by any sub-$\sigma$-algebra $\mathcal{G}$ of $\mathcal{A}$. Then $L^2(\Omega, \mathcal{G}, P)$ is a subspace of $L^2(\Omega, \mathcal{A}, P)$.

Definition 23.9 Let $Y \in L^2(\Omega, \mathcal{A}, P)$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{A}$. Then the conditional expectation of $Y$ given $\mathcal{G}$ is the unique element $E(Y \mid \mathcal{G})$ of $L^2(\Omega, \mathcal{G}, P)$ such that
$$E(YZ) = E(E(Y \mid \mathcal{G})Z), \quad \forall Z \in L^2(\Omega, \mathcal{G}, P). \tag{23.2}$$

Theorem 23.10 Let $Y \in L^2(\Omega, \mathcal{A}, P)$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{A}$.
a) If $Y \ge 0$, then $E(Y \mid \mathcal{G}) \ge 0$.
b) If $\mathcal{G} = \sigma(X)$, then there is a Borel function $f$ s.t. $E(Y \mid \mathcal{G}) = f(X)$.
c)
$$E(E(Y \mid \mathcal{G})) = E(Y).$$
d) $Y \mapsto E(Y \mid \mathcal{G})$ is linear.

Proof: (b) is proved in the previous theorem. (c) follows from (23.2) by taking $Z = 1$. (d) follows from the properties of the projection operator. Now we prove (a).
Note that $\forall Z \ge 0$ with $Z \in L^2(\Omega, \mathcal{G}, P)$, we have
$$E(E(Y \mid \mathcal{G})Z) = E(YZ) \ge 0.$$
Take $Z = 1_{E(Y \mid \mathcal{G}) < 0}$. Then
$$E(E(Y \mid \mathcal{G})Z) \le 0.$$
Hence $E(Y \mid \mathcal{G}) \ge 0$ a.s.

Now we define $E(Y \mid \mathcal{G})$ when $Y \in L^1(\Omega, \mathcal{A}, P)$. First we consider the case $Y \ge 0$. Note that $L^2 \ni Y \wedge n \uparrow Y$. Then $E(Y \wedge n \mid \mathcal{G}) \uparrow$. We define
$$E(Y \mid \mathcal{G}) = \lim_{n\to\infty} E(Y \wedge n \mid \mathcal{G}).$$
In general, we define
$$E(Y \mid \mathcal{G}) = E(Y^+ \mid \mathcal{G}) - E(Y^- \mid \mathcal{G}).$$

Theorem 23.11 Let $Y \in L^1(\Omega, \mathcal{A}, P)$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{A}$. Then $E(Y \mid \mathcal{G})$ is the unique element of $L^1(\Omega, \mathcal{G}, P)$ such that
$$E(YZ) = E(E(Y \mid \mathcal{G})Z) \tag{23.3}$$
for all bounded $\mathcal{G}$-measurable r.v. $Z$. Further, $E(Y \mid \mathcal{G})$ satisfies
a) If $Y \ge 0$, then $E(Y \mid \mathcal{G}) \ge 0$.
b) $Y \mapsto E(Y \mid \mathcal{G})$ is linear.

Proof: First we assume $Z \ge 0$ and bounded. Then by MCT,
$$E(E(Y \mid \mathcal{G})Z) = E(E(Y^+ \mid \mathcal{G})Z) - E(E(Y^- \mid \mathcal{G})Z)$$
$$= \lim_{n\to\infty} \left[E(E(Y^+ \wedge n \mid \mathcal{G})Z) - E(E(Y^- \wedge n \mid \mathcal{G})Z)\right]$$
$$= \lim_{n\to\infty} \left[E((Y^+ \wedge n)Z) - E((Y^- \wedge n)Z)\right] = E(YZ).$$
For general bounded $Z$, we get (23.3) by linearity.

Now we prove the uniqueness. If $\hat{Y}$ and $\tilde{Y}$ are two such elements, then
$$E(\hat{Y}Z) = E(\tilde{Y}Z).$$
Let $Z = 1_{\hat{Y} > \tilde{Y}}$. Then
$$E((\hat{Y} - \tilde{Y})1_{\hat{Y} > \tilde{Y}}) = 0.$$
Thus
$$(\hat{Y} - \tilde{Y})1_{\hat{Y} > \tilde{Y}} = 0, \quad \text{a.s.}$$
Similarly,
$$(\tilde{Y} - \hat{Y})1_{\tilde{Y} > \hat{Y}} = 0, \quad \text{a.s.}$$
Hence $\hat{Y} = \tilde{Y}$ a.s.

$Z$ in the theorem can be replaced by indicator r.v.s:

Remark 23.12 Let $Y \in L^1(\Omega, \mathcal{A}, P)$ and let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{A}$. Then $E(Y \mid \mathcal{G})$ is the unique element of $L^1(\Omega, \mathcal{G}, P)$ such that
$$E(Y 1_B) = E(E(Y \mid \mathcal{G})1_B)$$
for all $B \in \mathcal{G}$.

Example 23.13 Let $(X, Z)$ be r.v.s with joint density $f(x, z)$. Let $g$ be a bounded function and $Y = g(Z)$. Calculate $E(Y \mid X)$.

Solution: Let
$$f_{X=x}(z) = \frac{f(x, z)}{\int f(x, z)\,dz}$$
be the conditional p.d.f. Let $E(Y \mid X) = h(X)$. Then $\forall k(x)$,
$$E(h(X)k(X)) = \int h(x)k(x)f_X(x)\,dx = \int\int h(x)k(x)f(x, z)\,dz\,dx.$$
On the other hand,
$$E(h(X)k(X)) = E(E(g(Z) \mid X)k(X)) = E(g(Z)k(X)) = \int\int g(z)k(x)f(x, z)\,dz\,dx.$$
Hence
$$h(x)\int f(x, z)\,dz = \int g(z)f(x, z)\,dz.$$
Thus
$$h(x) = \int g(z)f_{X=x}(z)\,dz.$$

Theorem 23.14 Let $Y \ge 0$ or $Y \in L^1(\Omega, \mathcal{A}, P)$. Let $\mathcal{G}$ be a sub-$\sigma$-algebra of $\mathcal{A}$. Then $E(Y \mid \mathcal{G}) = Y$ iff $Y$ is $\mathcal{G}$-measurable.

Proof: Trivial.

Theorem 23.15 Let $Y \in L^1(\Omega, \mathcal{A}, P)$. If $X, Y$ are independent, then
$$E(Y \mid X) = E(Y).$$

Proof: For $g$ bounded,
$$E(E(Y)g(X)) = E(Y)E(g(X)) = E(Yg(X)).$$
Hence
$$E(Y \mid X) = E(Y).$$

Theorem 23.16 Suppose $X$ is $\mathcal{G}$-measurable. If either a) $X, Y, XY \in L^1(\Omega, \mathcal{A}, P)$ or b) $X, Y \ge 0$, then
$$E(XY \mid \mathcal{G}) = XE(Y \mid \mathcal{G}).$$

Proof: Assume b) holds. Let $Z \ge 0$ be $\mathcal{G}$-measurable. Then
$$E(XE(Y \mid \mathcal{G})Z) = E(E(Y \mid \mathcal{G})XZ) = E(YXZ) = E((XY)Z).$$
Hence
$$XE(Y \mid \mathcal{G}) = E(XY \mid \mathcal{G}).$$
The other case can be treated by linearity.

Theorem 23.17 (Jensen's inequality) If $\phi : \mathbb{R} \to \mathbb{R}$ is convex and $X$, $\phi(X)$ are integrable, then
$$\phi(E(X \mid \mathcal{G})) \le E(\phi(X) \mid \mathcal{G}). \tag{23.4}$$

Proof: If $X$ is simple, say, $X = \sum_{i=1}^n a_i 1_{A_i}$, then
$$E(X \mid \mathcal{G}) = \sum_{i=1}^n a_i E(1_{A_i} \mid \mathcal{G}).$$
Note that
$$\sum_{i=1}^n E(1_{A_i} \mid \mathcal{G}) = 1.$$
Thus, by convexity,
$$\phi(E(X \mid \mathcal{G})) \le \sum_{i=1}^n \phi(a_i)E(1_{A_i} \mid \mathcal{G}) = E(\phi(X) \mid \mathcal{G}).$$
We can prove (23.4) for general $X$ by the typical approximation method.


As consequences, we have the following Hölder and Minkowski inequalities.

Theorem 23.18 (Hölder's inequality) Suppose $p, q > 1$ s.t. $\frac{1}{p} + \frac{1}{q} = 1$. Then
$$|E(XY)| \le E(|XY|) \le (E|X|^p)^{1/p}(E|Y|^q)^{1/q}.$$

Proof: We may assume $X, Y \ge 0$ and $E|X|^p, E|Y|^q < \infty$. Further, we may assume $E|X|^p > 0$ (otherwise, $X = 0$ a.s.). Let $C = E(X^p)$. Define a probability measure $Q$ as
$$Q(A) = \frac{1}{C}E(1_A X^p).$$
Then
$$E_Q(Z) = \frac{1}{C}E(ZX^p).$$
Note that
$$XY = YX^{1-p}1_{X>0}X^p.$$
Let
$$Z = YX^{1-p}1_{X>0}.$$
Note that $|x|^q$ is convex. Thus
$$\frac{1}{C^q}(E(XY))^q = \frac{1}{C^q}(E(ZX^p))^q = (E_Q Z)^q \le E_Q(Z^q) = \frac{1}{C}E(Y^q X^{(1-p)q}1_{X>0}X^p) \le \frac{1}{C}E(Y^q),$$
since $(1-p)q + p = 0$. Hence
$$E(XY) \le C^{1/p}(EY^q)^{1/q} = (E|X|^p)^{1/p}(E|Y|^q)^{1/q}.$$

Theorem 23.19 (Minkowski's inequality) For $p \ge 1$, we have
$$(E(|X + Y|^p))^{1/p} \le (E(|X|^p))^{1/p} + (E(|Y|^p))^{1/p}.$$

Proof: If $p = 1$, it is trivial. Assume $p > 1$ and let $q = p/(p-1)$. Then
$$E(|X + Y|^p) \le E(|X||X + Y|^{p-1}) + E(|Y||X + Y|^{p-1})$$
$$\le (E|X|^p)^{1/p}(E(|X + Y|^{(p-1)q}))^{1/q} + (E|Y|^p)^{1/p}(E(|X + Y|^{(p-1)q}))^{1/q}$$
$$= \left((E(|X|^p))^{1/p} + (E(|Y|^p))^{1/p}\right)(E(|X + Y|^p))^{1-1/p}.$$

HW. 2, 8, 12, 13, 16, 18
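Both inequalities hold for every probability measure, in particular for the empirical measure of any sample, so they can be sanity-checked numerically. A sketch with arbitrarily chosen heavy-ish data and $p = 3$:

```python
import numpy as np

rng = np.random.default_rng(5)
n = 100_000
X = rng.exponential(size=n)
Y = rng.exponential(size=n) ** 2

p = 3.0
q = p / (p - 1.0)  # conjugate exponent: 1/p + 1/q = 1

# Hölder: E|XY| <= (E|X|^p)^(1/p) (E|Y|^q)^(1/q), under the empirical measure.
lhs = np.mean(np.abs(X * Y))
rhs = np.mean(np.abs(X) ** p) ** (1 / p) * np.mean(np.abs(Y) ** q) ** (1 / q)
print(lhs <= rhs)

# Minkowski: ||X + Y||_p <= ||X||_p + ||Y||_p
norm_p = lambda Z: np.mean(np.abs(Z) ** p) ** (1 / p)
print(norm_p(X + Y) <= norm_p(X) + norm_p(Y))
```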



24  Martingales

Let $X_t$ be a real-valued stochastic process such that $E|X_t| < \infty$, $\forall t \in \mathbb{T}$.

We fix a complete probability space $(\Omega, \mathcal{F}, P)$ and a family of increasing sub-$\sigma$-fields $\mathcal{F}_t$ ($t \in \mathbb{T}$) satisfying the usual conditions: $\mathcal{F}_0$ contains all $P$-null sets and $\mathcal{F}_t$ is right continuous. We shall take $\mathbb{T} = \mathbb{R}_+ = [0, \infty)$ (continuous case) or $\mathbb{T} = \mathbb{N} = \{0, 1, 2, \dots\}$ (discrete case). All the stochastic processes $X_t$ will be adapted to this family of $\sigma$-fields, i.e., $\forall t$, $X_t$ is $\mathcal{F}_t$-measurable. The quadruple $(\Omega, \mathcal{F}, P, \mathcal{F}_t)$ is called a stochastic basis.

Definition 24.1 $(X_t)_{t \in \mathbb{T}}$ is a martingale if $\forall s < t$,
$$E(X_t \mid \mathcal{F}_s) = X_s, \quad \text{a.s.} \tag{24.5}$$
It is a supermartingale (resp. submartingale) if (24.5) is replaced by the inequality
$$E(X_t \mid \mathcal{F}_s) \le X_s \quad (\text{resp. } \ge), \quad \text{a.s.}$$

We consider the discrete case first. Let $\mathbb{T} = \mathbb{N}$ and let $X_n$ be a discrete-time stochastic process. Let $f_n$ be a predictable process (i.e. $f_n$ is $\mathcal{F}_{n-1}$-measurable). We define a transformation
$$(f \cdot X)_n = f_0 X_0 + \sum_{k=1}^n f_k (X_k - X_{k-1}).$$
Note that this transformation is the discrete-time counterpart of the stochastic integral, which will be introduced later.

Proposition 24.2 If $X_n$ is a martingale (resp. supermartingale) and $f_n$ is a bounded (resp. nonnegative and bounded) predictable process, then $(f \cdot X)_n$ is a martingale (resp. supermartingale).

Proof: Suppose that $X_n$ is a martingale. As
$$(f \cdot X)_n = (f \cdot X)_{n-1} + f_n (X_n - X_{n-1}),$$
we have
$$E((f \cdot X)_n \mid \mathcal{F}_{n-1}) = (f \cdot X)_{n-1}.$$
Thus, $(f \cdot X)_n$ is a martingale. The other case can be verified similarly.
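The transform $(f \cdot X)_n$ is easy to simulate. Below, $X$ is a symmetric random walk and $f$ is a hypothetical predictable betting rule (bet 1 when the walk is negative); by Proposition 24.2 the transform is again a martingale, so its mean stays at 0:

```python
import numpy as np

rng = np.random.default_rng(6)

paths, N = 20_000, 100
steps = rng.choice([-1, 1], size=(paths, N))
X = np.cumsum(steps, axis=1)

# Predictable f: f_k depends only on X_{k-1} (here: 1 if X_{k-1} < 0).
f = np.zeros((paths, N))
f[:, 1:] = (X[:, :-1] < 0).astype(float)

# (f . X)_n = sum_{k<=n} f_k (X_k - X_{k-1}); the f_0 X_0 term is 0 here.
increments = np.diff(X, axis=1, prepend=0)
transform = np.cumsum(f * increments, axis=1)

# The mean across paths stays near 0 at every time n.
print(np.abs(transform.mean(axis=0)).max())
```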

It is useful to consider a process at a random time $\tau$. Such a time should be adapted to the $\sigma$-fields $\mathcal{F}_t$. Namely, whether $\tau \le t$ or not should be decided by using the information available at time $t$. More precisely, we give the following

Definition 24.3 $\tau : \Omega \to \mathbb{T}$ is a stopping time if $\forall t \in \mathbb{T}$, $\{\tau \le t\} \in \mathcal{F}_t$.
We define the $\sigma$-field at time $\tau$ as
$$\mathcal{F}_\tau = \{A \in \mathcal{F} : \forall t \in \mathbb{T}, \ A \cap \{\tau \le t\} \in \mathcal{F}_t\}.$$
Denote the collection of all stopping times bounded by $T$ as $\mathcal{S}_T$.

Theorem 24.4 (Optional sampling theorem) Let $X = (X_n)_{n \in \mathbb{N}}$ be a martingale (resp. supermartingale). Let $\sigma, \tau \in \mathcal{S}_N$ be such that $\sigma(\omega) \le \tau(\omega)$, $\forall \omega$. Then
$$E(X_\tau \mid \mathcal{F}_\sigma) = X_\sigma \quad (\text{resp. } \le) \quad \text{a.s.} \tag{24.6}$$

Proof: We assume that $X$ is a martingale. Let $f_n = 1_{\sigma < n \le \tau}$. Then $f_n$ is $\mathcal{F}_{n-1}$-measurable and
$$(f \cdot X)_N = X_\tau - X_\sigma.$$
Therefore,
$$E(X_\tau) = E(X_\sigma).$$
For any $B \in \mathcal{F}_\sigma$, it is easy to show that $\sigma_B \equiv \sigma 1_B + N 1_{B^c}$ and $\tau_B \equiv \tau 1_B + N 1_{B^c}$ are two stopping times with $\sigma_B(\omega) \le \tau_B(\omega) \le N$. Hence, $E(X_{\sigma_B}) = E(X_{\tau_B})$. Therefore
$$E(X_\sigma 1_B) = E(X_{\sigma_B}) - E(X_N 1_{B^c}) = E(X_{\tau_B}) - E(X_N 1_{B^c}) = E(X_\tau 1_B).$$
This proves (24.6). The case for a supermartingale can be proved similarly.
Next, we give some estimates on probabilities related to submartingales. The corollary of these estimates will be very important.

Theorem 24.5 Let $(X_n)_{n \in \mathbb{N}}$ be a submartingale. Then for every $\lambda > 0$ and $N \in \mathbb{N}$,
$$\lambda P\left(\max_{n \le N} X_n \ge \lambda\right) \le E\left(X_N 1_{\max_{n \le N} X_n \ge \lambda}\right) \le E(|X_N|)$$
and
$$\lambda P\left(\min_{n \le N} X_n \le -\lambda\right) \le E(|X_0| + |X_N|).$$

Proof: Let
$$\tau = \min\{n \le N : X_n \ge \lambda\}$$
with the convention that $\min \emptyset = N$. Then $\tau \in \mathcal{S}_N$. By (24.6), we have
$$E(X_N) \ge E(X_\tau) = E\left(X_\tau 1_{\max_{n \le N} X_n \ge \lambda}\right) + E\left(X_N 1_{\max_{n \le N} X_n < \lambda}\right) \ge \lambda P\left(\max_{n \le N} X_n \ge \lambda\right) + E\left(X_N 1_{\max_{n \le N} X_n < \lambda}\right).$$
Thus
$$\lambda P\left(\max_{n \le N} X_n \ge \lambda\right) \le E(X_N) - E\left(X_N 1_{\max_{n \le N} X_n < \lambda}\right) = E\left(X_N 1_{\max_{n \le N} X_n \ge \lambda}\right) \le E(|X_N|).$$
The other inequality can be proved similarly.


Corollary 24.6 Let $(X_n)_{n \in \mathbb{N}}$ be a martingale such that $E(|X_n|^p) < \infty$, $\forall n \in \mathbb{N}$, for some $p > 1$. Then for every $N \in \mathbb{N}$,
$$P\left(\max_{n \le N} |X_n| \ge \lambda\right) \le \frac{E(|X_N|^p)}{\lambda^p} \tag{24.7}$$
and
$$E\left(\max_{n \le N} |X_n|^p\right) \le \left(\frac{p}{p-1}\right)^p E(|X_N|^p). \tag{24.8}$$

Proof: By Jensen's inequality, $|X_n|^p$ is a submartingale and hence (24.7) follows from Theorem 24.5 directly. Let
$$Y = \max_{n \le N} |X_n|.$$
By Theorem 24.5, we have
$$\lambda P(Y \ge \lambda) \le E(|X_N| 1_{Y \ge \lambda}).$$
Hence
$$E(Y^p) = E\int_0^\infty p\lambda^{p-1} 1_{Y \ge \lambda}\,d\lambda = p\int_0^\infty \lambda^{p-1} P(Y \ge \lambda)\,d\lambda \le p\int_0^\infty \lambda^{p-2} E(1_{Y \ge \lambda}|X_N|)\,d\lambda$$
$$= \frac{p}{p-1}E(Y^{p-1}|X_N|) \le \frac{p}{p-1}(E(|X_N|^p))^{1/p}(E(Y^p))^{(p-1)/p},$$
where the last inequality follows from Hölder's inequality. (24.8) then follows easily.
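Doob's inequality (24.8) with $p = 2$ reads $E(\max_{n \le N} X_n^2) \le 4\,E(X_N^2)$. A quick check on a symmetric random walk (the path count and horizon are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(7)

paths, N = 50_000, 100
X = np.cumsum(rng.choice([-1, 1], size=(paths, N)), axis=1)

lhs = np.mean(np.max(X**2, axis=1))  # E(max_{n<=N} X_n^2), empirically
rhs = 4.0 * np.mean(X[:, -1] ** 2)   # 4 E(X_N^2), which is 4N here
print(lhs, rhs)
```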
Next, we consider the limit behavior of submartingales. Let $(X_n)_{n \in \mathbb{N}}$ be a submartingale and $a < b$. Define $\sigma_0 = \tau_0 = 0$ and for $n \ge 1$,
$$\sigma_n = \min\{m \ge \tau_{n-1} : X_m \le a\}, \qquad \tau_n = \min\{m \ge \sigma_n : X_m \ge b\}. \tag{24.9}$$
Then $\sigma_n$ and $\tau_n$ are two increasing sequences of stopping times, and the number of upcrossings of the interval $[a, b]$ by $\{X_n : 0 \le n \le N\}$ is
$$U_N^X(a, b) = \max\{n : \tau_n \le N\}.$$

Theorem 24.7 Suppose that $(X_n)_{n \in \mathbb{N}}$ is a submartingale. Then, $\forall N \in \mathbb{N}$ and $a < b$, we have
$$EU_N^X(a, b) \le \frac{1}{b-a}E\{(X_N - a)^+ - (X_0 - a)^+\}.$$

Proof: By Jensen's inequality, $Y_n = (X_n - a)^+$ is a submartingale and $U_N^X(a, b) = U_N^Y(0, b-a)$. Let $\sigma_n$ and $\tau_n$ be defined as in (24.9) with $X, a, b$ replaced by $Y, 0, b-a$ respectively. If $\sigma_n > N$, then
$$Y_N - Y_0 = \sum_{k=1}^n (Y_{\tau_k \wedge N} - Y_{\sigma_k \wedge N}) + \sum_{k=1}^n (Y_{\sigma_k \wedge N} - Y_{\tau_{k-1} \wedge N}) \ge (b-a)U_N^X(a, b) + \sum_{k=1}^n (Y_{\sigma_k \wedge N} - Y_{\tau_{k-1} \wedge N}).$$
Therefore, since each term of the last sum has nonnegative expectation by optional sampling,
$$E(Y_N - Y_0) \ge (b-a)EU_N^X(a, b).$$

As a consequence of the upcrossing estimate above, we have

Theorem 24.8 If $(X_n)_{n \in \mathbb{N}}$ is a submartingale such that
$$\sup_n E(X_n^+) < \infty,$$
then $X_\infty = \lim_{n\to\infty} X_n$ exists a.s. and $X_\infty$ is integrable.

Proof: For any rationals $r < r'$, we have
$$EU_\infty^X(r, r') = \lim_{N\to\infty} EU_N^X(r, r') \le \frac{1}{r' - r}\lim_{N\to\infty} E\left((X_N - r)^+ - (X_0 - r)^+\right) < \infty.$$
Hence
$$P\left(\liminf_n X_n < \limsup_n X_n\right) = P\left(\bigcup_{r < r',\, r, r' \in \mathbb{Q}} \{U_\infty^X(r, r') = \infty\}\right) = 0,$$
which proves that $X_\infty$ exists a.s.

By Fatou's lemma,
$$E|X_\infty| \le \liminf_n E|X_n| = \liminf_n \left(2E(X_n^+) - E(X_n)\right) \le 2\sup_n E(X_n^+) - E(X_0) < \infty.$$
Hence $X_\infty$ is integrable.
Finally, we consider continuous time martingales.

Lemma 24.9 Let $(X_t)_{t \ge 0}$ be a submartingale. Then $\forall T > 0$,
$$P\left(\sup_{t \in \mathbb{Q} \cap [0,T]} |X_t| < \infty\right) = 1$$
and
$$P\left(\forall t \ge 0, \ \lim_{s \in \mathbb{Q}_+,\, s \downarrow t} X_s \ \text{and} \ \lim_{s \in \mathbb{Q}_+,\, s \uparrow t} X_s \ \text{exist}\right) = 1.$$

Proof: Let $\{r_1, r_2, \dots\}$ be an enumeration of $\mathbb{Q} \cap [0, T]$. For each $n$, let $s_1 < s_2 < \cdots < s_n$ be a rearrangement of $\{r_1, \dots, r_n\}$. Define
$$Y_0 = X_0, \quad Y_{n+1} = X_T \quad \text{and} \quad Y_i = X_{s_i}, \ i = 1, 2, \dots, n.$$
Then $Y = (Y_i)_{i=0,1,\dots,n+1}$ is a submartingale, and so is $|Y|$. Therefore, by Theorems 24.5 and 24.7,
$$P\left(\max_{1 \le i \le n} |Y_i| > \lambda\right) \le \frac{1}{\lambda}E|X_T| \quad \text{and} \quad EU_n^Y(a, b) \le \frac{1}{b-a}E(X_T - a)^+.$$
Taking $n \to \infty$, we have
$$P\left(\sup_{t \in \mathbb{Q} \cap [0,T]} |X_t| > \lambda\right) \le \frac{1}{\lambda}E|X_T| \tag{24.10}$$
and
$$EU^{X|_{\mathbb{Q} \cap [0,T]}}(a, b) \le \frac{1}{b-a}E(X_T - a)^+. \tag{24.11}$$
The conclusion of the lemma follows from (24.10) and (24.11) by letting $\lambda$ and $a < b$ run over positive integers and pairs of rationals respectively.

Theorem 24.10 Let $(X_t)_{t \ge 0}$ be a submartingale. Then
$$\tilde{X}_t = \lim_{r \in \mathbb{Q},\, r \downarrow t} X_r$$
exists a.s. and $\tilde{X}_t$ is a submartingale which is right-continuous with left limits (càdlàg) a.s. Furthermore, $X_t \le \tilde{X}_t$ a.s. for every $t \ge 0$, and
$$P(X_t = \tilde{X}_t) = 1, \quad \forall t \ge 0,$$
if and only if $E(X_t)$ is right continuous.

Proof: It follows from Lemma 24.9 directly that $\tilde{X}_t$ exists a.s. The càdlàg property of $\tilde{X}_t$ can be verified.
HW: Verify this.

Note that $\tilde{X}_t$ is $\mathcal{F}_{t+} = \mathcal{F}_t$-measurable. For $s > t$ and $B \in \mathcal{F}_t$, we have
$$E(\tilde{X}_t 1_B) = \lim_{r \in \mathbb{Q},\, r \downarrow t} E(X_r 1_B) \le \lim_{r' \in \mathbb{Q},\, r' \downarrow s} E(X_{r'} 1_B) = E(\tilde{X}_s 1_B).$$
Then
$$E(\tilde{X}_s \mid \mathcal{F}_t) \ge \tilde{X}_t, \quad \text{a.s.}$$
HW: Verify this.

Hence, $\tilde{X}_t$ is a submartingale. Similarly, we have
$$E(X_t 1_B) \le E(\tilde{X}_t 1_B), \quad \forall B \in \mathcal{F}_t,$$
and hence $X_t \le \tilde{X}_t$ a.s. If $E(X_t)$ is right-continuous, then $E\tilde{X}_t = EX_t$ and hence, $X_t = \tilde{X}_t$ a.s.

If $E(X_t)$ is right-continuous, then $\tilde{X}$ in Theorem 24.10 is called a càdlàg modification of $X$. From now on, we always take càdlàg versions of such submartingales.
The following theorem is a consequence of Corollary 24.6.

Theorem 24.11 Let $(X_t)_{t \ge 0}$ be a right-continuous martingale such that $E(|X_t|^p) < \infty$, $\forall t \ge 0$, for some $p > 1$. Then for every $t \ge 0$,
$$P\left(\max_{s \le t} |X_s| \ge \lambda\right) \le \frac{E(|X_t|^p)}{\lambda^p} \tag{24.12}$$
and
$$E\left(\max_{s \le t} |X_s|^p\right) \le \left(\frac{p}{p-1}\right)^p E(|X_t|^p). \tag{24.13}$$

HW: Prove this theorem.


Next we consider the continuous-time counterpart of Theorem 24.4. We need to define the class (DL) first.

Definition 24.12 A submartingale $(X_t)$ is in the class (DL) if $\forall T > 0$, the family of random variables $\{X_\tau : \tau \in \mathcal{S}_T\}$ is uniformly integrable, i.e.
$$\lim_{M\to\infty} \sup_{\tau \in \mathcal{S}_T} E\left(|X_\tau| 1_{|X_\tau| > M}\right) = 0.$$

Lemma 24.13 If $Y_n$ is uniformly integrable and $Y_n \to Y$ a.s., then $Y \in L^1$ and $Y_n \to Y$ in $L^1$.

Proof: As $Y_n$ is uniformly integrable,
$$\lim_{M\to\infty} \sup_n E(|Y_n| 1_{|Y_n| \ge M}) = 0.$$
There exists $M$ s.t.
$$\sup_n E(|Y_n| 1_{|Y_n| \ge M}) < \infty.$$
Hence
$$\sup_n E|Y_n| \le M + \sup_n E(|Y_n| 1_{|Y_n| \ge M}) < \infty.$$
By Fatou's lemma, we have
$$E|Y| \le \liminf_n E|Y_n| < \infty.$$
Note that
$$E|Y_n - Y| = E(|Y_n - Y| 1_{|Y_n| > M}) + E(|Y_n - Y| 1_{|Y_n| \le M})$$
$$\le \sup_n E(|Y_n| 1_{|Y_n| > M}) + E(|Y| 1_{|Y_n| > M,\, |Y| < M/2}) + E(|Y| 1_{|Y| \ge M/2}) + E(|Y_n - Y| 1_{|Y_n| \le M})$$
$$\le \sup_n E(|Y_n| 1_{|Y_n| > M}) + \frac{M}{2} P(|Y_n - Y| > M/2) + E(|Y| 1_{|Y| \ge M/2}) + E(|Y_n - Y| 1_{|Y_n| \le M}).$$
Hence, by DCT,
$$\limsup_n E|Y_n - Y| \le \sup_n E(|Y_n| 1_{|Y_n| > M}) + E(|Y| 1_{|Y| \ge M/2}).$$
Taking $M \to \infty$, we then have
$$\limsup_n E|Y_n - Y| = 0.$$
Theorem 24.14 (Doob's sampling theorem) Let $(X_t)_{t \ge 0}$ be a right-continuous martingale (resp. submartingale) of class (DL). Let $\sigma, \tau$ be bounded stopping times such that $\sigma(\omega) \le \tau(\omega)$, $\forall \omega$. Then
$$E(X_\tau \mid \mathcal{F}_\sigma) = X_\sigma \quad (\text{resp. } \ge) \quad \text{a.s.} \tag{24.14}$$

Proof: Let
$$\tau_n = \frac{k}{2^n} \quad \text{if} \quad \frac{k-1}{2^n} \le \tau < \frac{k}{2^n}.$$
Then $\tau_n$ is a sequence of stopping times decreasing to $\tau$. Let $\sigma_n$ be defined similarly. For any $A \in \mathcal{F}_\sigma$, we have $A \in \mathcal{F}_{\sigma_n}$ and hence, by Theorem 24.4,
$$E(X_{\tau_n} 1_A) = E(X_{\sigma_n} 1_A).$$
Taking $n \to \infty$, we have
$$E(X_\tau 1_A) = E(X_\sigma 1_A).$$
This implies that $E(X_\tau \mid \mathcal{F}_\sigma) = X_\sigma$.

The following theorem follows from Theorem 24.14 immediately.

Theorem 24.15 Let $(X_t)_{t \ge 0}$ be a right-continuous martingale of class (DL) and $(\tau_t)_{t \ge 0}$ be a family of increasing bounded stopping times. Let $\tilde{X}_t = X_{\tau_t}$ and $\tilde{\mathcal{F}}_t = \mathcal{F}_{\tau_t}$, $t \ge 0$. Then $(\tilde{X}_t, \tilde{\mathcal{F}}_t)$ is a martingale.

25  More martingale convergence theorems

In this section, we study more martingale convergence theorems. First we consider backward martingales. Let $(X_n)_{n \in \mathbb{N}}$ be a process and $(\mathcal{F}_n)_{n \in \mathbb{N}}$ be a family of $\sigma$-algebras s.t. $X_n$ is $\mathcal{F}_n$-measurable and $\mathcal{F}_{n+1} \subset \mathcal{F}_n$, $\forall n$.
$(X_n)$ is a backward martingale if $\forall\, 0 \le n < m$,
$$E(X_n \mid \mathcal{F}_m) = X_m, \quad \text{a.s.}$$

Theorem 25.1 (Backward martingale convergence thm) Let $(X_n, \mathcal{F}_n)_{n \in \mathbb{N}}$ be a backward martingale, and let $\mathcal{F}_\infty = \bigcap_{n=1}^\infty \mathcal{F}_n$. Then $(X_n)$ converges a.s. and in $L^1$ to a limit $X_\infty$ as $n \to \infty$.

Proof: Let $U_N^X(a, b)$ be the number of upcrossings of $[a, b]$ by $(X_n)$ between times $N$ and $0$. Then
$$EU_N^X(a, b) \le \frac{1}{b-a}E((X_0 - a)^+) < \infty.$$
Hence
$$\lim_{N\to\infty} EU_N^X(a, b) < \infty, \quad \text{so} \quad P(U_\infty^X(a, b) < \infty) = 1.$$
The same argument as before implies that
$$X_\infty = \lim_{n\to\infty} X_n$$
exists a.s. and $X_\infty \in L^1$. To prove $X_n \to X_\infty$ in $L^1$, we only need to show that $\{X_n\}_{n \in \mathbb{N}}$ is uniformly integrable. Note that
$$X_n = E(X_0 \mid \mathcal{F}_n).$$
Hence
$$E(|X_n| 1_{|X_n| > M}) \le E(E(|X_0| \mid \mathcal{F}_n) 1_{|X_n| > M}) = E(E(|X_0| 1_{|X_n| > M} \mid \mathcal{F}_n)) = E(|X_0| 1_{|X_n| > M})$$
$$\le E(|X_0| 1_{|X_0| > M'}) + M' P(|X_n| > M) \le E(|X_0| 1_{|X_0| > M'}) + \frac{M'}{M} E(|X_n|) \le E(|X_0| 1_{|X_0| > M'}) + \frac{M'}{M} E(|X_0|).$$
Then
$$\limsup_{M\to\infty} \sup_n E(|X_n| 1_{|X_n| > M}) \le E(|X_0| 1_{|X_0| > M'}).$$
Taking $M' \to \infty$, we have
$$\lim_{M\to\infty} \sup_n E(|X_n| 1_{|X_n| > M}) = 0.$$
So $\{X_n\}_{n \in \mathbb{N}}$ is uniformly integrable.


As an application of the backward martingale convergence theorem, we now prove the SLLN.

Theorem 25.2 Let $\{X_n\}_{n \in \mathbb{N}}$ be i.i.d. with $E|X_1| < \infty$. Then
$$\frac{X_1 + \cdots + X_n}{n} \to E(X_1), \quad \text{a.s.}$$

Proof: Let $S_n = X_1 + \cdots + X_n$ and $\mathcal{F}_n = \sigma(S_n, S_{n+1}, \dots)$. Then $\mathcal{F}_{n+1} \subset \mathcal{F}_n$ and
$$M_n = E(X_1 \mid \mathcal{F}_n)$$
is a backward martingale. By symmetry, for $1 \le j \le n$, we have
$$E(X_j \mid \mathcal{F}_n) = E(X_1 \mid \mathcal{F}_n).$$
Therefore
$$M_n = E(X_1 \mid \mathcal{F}_n) = \cdots = E(X_n \mid \mathcal{F}_n) = \frac{1}{n}\sum_{j=1}^n E(X_j \mid \mathcal{F}_n) = E\left(\frac{S_n}{n} \,\Big|\, \mathcal{F}_n\right) = \frac{S_n}{n}.$$
By the backward martingale convergence theorem, $\frac{S_n}{n} \to M_\infty$ a.s. and in $L^1$. Note that $M_\infty$ is $\bigcap_n \mathcal{F}_n$-measurable. By Kolmogorov's 0-1 law, $M_\infty$ is constant. As
$$E(M_\infty) = \lim_{n\to\infty} E(M_n) = E(X_1),$$
we see that $M_\infty = E(X_1)$ a.s., and hence $\frac{S_n}{n} \to E(X_1)$ a.s.

The following theorem is an application of the forward martingale convergence theorem.

Theorem 25.3 Let $(Y_n)_{n \ge 1}$ be independent r.v.s with $E(Y_n) = 0$ and $E(Y_n^2) < \infty$. Suppose
$$\sum_{n=1}^\infty E(Y_n^2) < \infty.$$
Let
$$S_n = \sum_{j=1}^n Y_j.$$
Then $\lim_{n\to\infty} S_n$ exists a.s. and it is in $L^1$.

Proof: Let $\mathcal{F}_n = \sigma(Y_1, \dots, Y_n)$. Then $(S_n, \mathcal{F}_n)_{n \ge 1}$ is a martingale. Note that
$$\sup_n E(S_n^+) \le \sup_n E(S_n^2) + 1 \le \sum_n E(Y_n^2) + 1 < \infty.$$
Hence $S_n \to S_\infty$ a.s. and $S_\infty \in L^1$.
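Theorem 25.3 applies, for example, to $Y_n = \pm 1/n$ with independent fair signs, since $\sum E(Y_n^2) = \sum 1/n^2 < \infty$. A sketch showing that the tail oscillation of $S_n$ becomes small on every sampled path:

```python
import numpy as np

rng = np.random.default_rng(8)

n_terms, paths = 20_000, 200
signs = rng.choice([-1.0, 1.0], size=(paths, n_terms))
Y = signs / np.arange(1, n_terms + 1)  # Y_n = ±1/n, variance 1/n^2
S = np.cumsum(Y, axis=1)

# Oscillation between n = 10_000 and n = 20_000 is tiny on every path.
tail_osc = np.abs(S[:, -1] - S[:, n_terms // 2])
print(tail_osc.max())
```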
Finally, we consider a martingale CLT.

Theorem 25.4 Let $(X_n)_{n \in \mathbb{N}}$ be such that
a) $E(X_n \mid \mathcal{F}_{n-1}) = 0$;
b) $E(X_n^2 \mid \mathcal{F}_{n-1}) = 1$;
c) $E(|X_n|^3) \le K < \infty$.
Let $S_n = X_1 + \cdots + X_n$. Then $\frac{S_n}{\sqrt{n}} \to Z$ in distribution, and $\mathcal{L}(Z) = N(0, 1)$.

Proof: Let
$$\phi_{n,j}(u) = E\left(e^{iu\frac{1}{\sqrt{n}}X_j} \,\Big|\, \mathcal{F}_{j-1}\right).$$
As
$$e^{iu\frac{1}{\sqrt{n}}X_j} = 1 + \frac{iu}{\sqrt{n}}X_j - \frac{u^2}{2n}X_j^2 - \frac{iu^3}{6n^{3/2}}\tilde{X}_j^3,$$
where $\tilde{X}_j$ is between $0$ and $X_j$, we get
$$\phi_{n,j}(u) = 1 + \frac{iu}{\sqrt{n}}E(X_j \mid \mathcal{F}_{j-1}) - \frac{u^2}{2n}E(X_j^2 \mid \mathcal{F}_{j-1}) - \frac{iu^3}{6n^{3/2}}E(\tilde{X}_j^3 \mid \mathcal{F}_{j-1}) = 1 - \frac{u^2}{2n} - \frac{iu^3}{6n^{3/2}}E(\tilde{X}_j^3 \mid \mathcal{F}_{j-1}).$$
Then for $j \le n$,
$$E\left(e^{iu\frac{S_j}{\sqrt{n}}}\right) = E\left(e^{iu\frac{S_{j-1}}{\sqrt{n}}}\,E\left(e^{iu\frac{X_j}{\sqrt{n}}} \,\Big|\, \mathcal{F}_{j-1}\right)\right) = E\left(e^{iu\frac{S_{j-1}}{\sqrt{n}}}\,\phi_{n,j}(u)\right).$$
Hence, using $|\tilde{X}_j| \le |X_j|$ and c),
$$\left|E\,e^{iu\frac{S_j}{\sqrt{n}}} - \left(1 - \frac{u^2}{2n}\right)E\,e^{iu\frac{S_{j-1}}{\sqrt{n}}}\right| \le \frac{|u|^3}{6n^{3/2}}\,E\left(\left|e^{iu\frac{S_{j-1}}{\sqrt{n}}}\right| |X_j|^3\right) \le \frac{K|u|^3}{6n^{3/2}}.$$
Then, for $n$ large enough that $1 - \frac{u^2}{2n} > 0$,
$$\left|\left(1 - \frac{u^2}{2n}\right)^{n-j} E\,e^{iu\frac{S_j}{\sqrt{n}}} - \left(1 - \frac{u^2}{2n}\right)^{n-j+1} E\,e^{iu\frac{S_{j-1}}{\sqrt{n}}}\right| \le \frac{K|u|^3}{6n^{3/2}}.$$
Using a telescoping sum, we have
$$\left|E\,e^{iu\frac{S_n}{\sqrt{n}}} - \left(1 - \frac{u^2}{2n}\right)^n\right| \le n \cdot \frac{K|u|^3}{6n^{3/2}} = \frac{K|u|^3}{6\sqrt{n}} \to 0.$$
As
$$\left(1 - \frac{u^2}{2n}\right)^n \to e^{-u^2/2},$$
we have
$$E\,e^{iu\frac{S_n}{\sqrt{n}}} \to e^{-u^2/2}.$$

26  Doob-Meyer decomposition

Note that a submartingale increases in expectation. Therefore, it should consist of two parts: a martingale part plus an increasing process. The rigorous treatment of this idea is the so-called Doob-Meyer decomposition, which is the subject of this section.

Theorem 26.1 (Doob decomposition) A submartingale $(X_n)_{n \in \mathbb{N}}$ has exactly one decomposition
$$X_n = M_n + A_n \tag{26.15}$$
where $(M_n, \mathcal{F}_n)$ is a martingale, $A_0 = 0$, $A_n$ is $\mathcal{F}_{n-1}$-measurable and $A_n \le A_{n+1}$ a.s.

Proof: Define $M_0 = X_0$ and, for $n \ge 1$,
$$M_n = M_{n-1} + X_n - E(X_n \mid \mathcal{F}_{n-1}).$$
Then $(M_n, \mathcal{F}_n)$ is a martingale. Define
$$A_n = X_n - M_n.$$
Then $A_0 = 0$ and
$$A_n = A_{n-1} - X_{n-1} + E(X_n \mid \mathcal{F}_{n-1}). \tag{26.16}$$
By induction, we see that $A_n$ is $\mathcal{F}_{n-1}$-measurable. Since $X_n$ is a submartingale, by (26.16), we have $A_n \ge A_{n-1}$ a.s.
Next we prove the uniqueness. Suppose $(M_n, A_n)$ is such a decomposition. Then
$$E(X_n \mid \mathcal{F}_{n-1}) = M_{n-1} + A_n.$$
Therefore $A_n$ is uniquely determined. The uniqueness of $M_n$ then follows from $M_n = X_n - A_n$.
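A concrete instance of Theorem 26.1: for the submartingale $X_n = W_n^2$ with $W$ a symmetric random walk, $E(X_n \mid \mathcal{F}_{n-1}) = X_{n-1} + 1$, so the compensator is $A_n = n$ and $M_n = W_n^2 - n$ is a martingale. A numerical check of the martingale part:

```python
import numpy as np

rng = np.random.default_rng(9)

paths, N = 50_000, 50
W = np.cumsum(rng.choice([-1, 1], size=(paths, N)), axis=1)

# Doob decomposition of X_n = W_n^2: A_n = n, M_n = W_n^2 - n.
M = W**2 - np.arange(1, N + 1)

# E(M_n) should be constant (= 0) in n.
print(np.abs(M.mean(axis=0)).max())
```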
Next we consider the decomposition of continuous time submartingales.

Definition 26.2 $(A_t)_{t \ge 0}$ is an integrable increasing process if $A_0 = 0$, $t \mapsto A_t$ is right-continuous and increasing a.s., and
$$E(A_t) < \infty, \quad \forall t \ge 0.$$

An increasing process $A_t$ is natural if it does not have jumps in common with any martingale. More precisely,

Definition 26.3 An integrable increasing process $A_t$ is natural if for every bounded martingale $m_t$,
$$E\int_0^t m_s\,dA_s = E\int_0^t m_{s-}\,dA_s$$
holds for every $t \ge 0$.

The following proposition gives a useful equivalent definition of a natural increasing process.

Proposition 26.4 An integrable increasing process $A_t$ is natural if and only if for every bounded martingale $m_t$,
$$E(m_t A_t) = E\int_0^t m_{s-}\,dA_s$$
holds for every $t \ge 0$.


Proof: Note that
$$E\int_0^t m_s\,dA_s = E\lim_{n\to\infty}\sum_{k=0}^{n-1} m_{\frac{(k+1)t}{n}}\left(A_{\frac{(k+1)t}{n}} - A_{\frac{kt}{n}}\right)$$
$$= \lim_{n\to\infty}\sum_{k=0}^{n-1}\left[E\left(m_{\frac{(k+1)t}{n}} A_{\frac{(k+1)t}{n}}\right) - E\left(E\left(m_{\frac{(k+1)t}{n}} \,\Big|\, \mathcal{F}_{\frac{kt}{n}}\right) A_{\frac{kt}{n}}\right)\right]$$
$$= \lim_{n\to\infty}\sum_{k=0}^{n-1}\left[E\left(m_{\frac{(k+1)t}{n}} A_{\frac{(k+1)t}{n}}\right) - E\left(m_{\frac{kt}{n}} A_{\frac{kt}{n}}\right)\right] = E(m_t A_t),$$
where the second equality follows from the dominated convergence theorem and the third from the martingale property of $m_t$. The claim then follows by comparing with Definition 26.3.

Theorem 26.5 (Doob-Meyer decomposition) If $(X_t)_{t \ge 0}$ is a submartingale of class (DL), then it is expressible uniquely as
$$X_t = M_t + A_t$$
where $A_t$ is an integrable natural increasing process and $M_t$ is a martingale.

Proof: Uniqueness. Suppose that
$$X_t = M_t + A_t = M_t' + A_t'$$
are two such decompositions. Then
$$A_t - A_t' = M_t' - M_t$$
is a martingale. Therefore, for any bounded martingale $m_t$, we have
$$E(m_t(A_t - A_t')) = E\int_0^t m_{s-}\,d(A_s - A_s') = \lim_{n\to\infty} E\sum_{k=0}^{n-1} m_{\frac{kt}{n}}\left[\left(A_{\frac{(k+1)t}{n}} - A_{\frac{kt}{n}}\right) - \left(A'_{\frac{(k+1)t}{n}} - A'_{\frac{kt}{n}}\right)\right] = 0.$$
For any bounded random variable $\xi$, let $m_t = E(\xi \mid \mathcal{F}_t)$. Then
$$E(\xi A_t) = E(E(\xi \mid \mathcal{F}_t)A_t) = E(E(\xi \mid \mathcal{F}_t)A_t') = E(\xi A_t').$$
Hence $A_t = A_t'$ a.s. By the right continuity of $A$ and $A'$, we see that $A = A'$ a.s.
Existence. By the uniqueness, we only need to construct the decomposition on [0, T ]. Set
Yt = Xt E(XT |Ft ).

. As
Then Yt is a non-positive submartingale with YT = 0. Let tnj = jT
2n
(Ytnj , Ftnj ) is a submartingale with YT = 0, it follows from Theorem 26.1 that
Ytnj = E(AnT |Ftnj ) + Antnj

(26.17)

where An0 = 0, Atnj Atnj+1 , Atnj is Ftnj1 -measurable. Assume for the moment
that the family {AnT }n1 is uniformly integrable, which will be shown in
38

Lemma 26.6 below. Then there is a subsequence nk such that AnTk converges
to a random variable AT in the weak topology of L1 (): For any bounded
random variable , E(AnTk ) E(AT ).
Denote by Mt a right-continuous version of the uniformly integrable martingale (E(AT |Ft ))0tT and let
At = Y t M t .
Then (At ) is right-continuous. Let i j. For any n0 > 0,
Ytni 0 + E(AnTk |Ftni 0 ) Ytnj 0 + E(AnTk |Ftnj 0 ),
and hence by taking k , Atni 0 Atnj 0 . Therefore, At is increasing on
{tni 0 : n0 1, i = 0, 1, , 2n0 }, and thus on all [0, T ].
Finally, we prove that (A_t) is natural. Let m_t be a nonnegative, bounded, right-continuous martingale. By the dominated convergence theorem (with the limits below taken along the subsequence n_k, using the weak convergence),
  E( ∫_0^T m_{s−} dA_s )
  = lim_{n→∞} Σ_{i=0}^{2^n−1} E( m_{t_i^n} ( A_{t_{i+1}^n} − A_{t_i^n} ) )
  = lim_{n→∞} Σ_{i=0}^{2^n−1} [ E( m_{t_i^n} A_{t_{i+1}^n}^n ) − E( m_{t_i^n} A_{t_i^n}^n ) ]
  = lim_{n→∞} Σ_{i=0}^{2^n−1} E( m_{t_{i+1}^n} A_{t_{i+1}^n}^n − m_{t_i^n} A_{t_i^n}^n )
  = E( m_T A_T ),
where the next to last equality follows from the fact that A_{t_{i+1}^n}^n is F_{t_i^n}-measurable, so that E( m_{t_{i+1}^n} A_{t_{i+1}^n}^n ) = E( m_{t_i^n} A_{t_{i+1}^n}^n ), and the last from telescoping. Hence A_t is natural.
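In discrete time, the Doob-Meyer decomposition reduces to the elementary Doob decomposition, with predictable part A_n = Σ_{k=0}^{n−1} E(X_{k+1} − X_k | F_k). A minimal numerical sketch (my own illustration, not from the text), using the submartingale X_n = S_n² for a simple random walk S_n, where E(X_{k+1} − X_k | F_k) = 1, so A_n = n and M_n = S_n² − n:

```python
import numpy as np

def doob_decomposition_walk(n_steps, n_paths, rng):
    """Decompose the submartingale X_n = S_n^2 (S_n a simple random walk)
    as X_n = M_n + A_n with A_n = sum_{k<n} E(X_{k+1} - X_k | F_k) = n,
    the discrete analogue of the Doob-Meyer decomposition."""
    steps = rng.choice([-1, 1], size=(n_paths, n_steps))
    S = np.cumsum(steps, axis=1)
    X = S ** 2
    A = np.arange(1, n_steps + 1, dtype=float)  # predictable increasing part
    M = X - A                                    # martingale part
    return X, A, M

rng = np.random.default_rng(0)
X, A, M = doob_decomposition_walk(200, 50_000, rng)
# The martingale property forces E(M_n) = E(M_1) = 0 for every n.
print(np.abs(M.mean(axis=0)).max())
```

Averaging over many paths, the sample means of M_n stay near 0 at every step, while A is deterministic and strictly increasing.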
Lemma 26.6 {A_T^n}_{n≥1} is uniformly integrable.
Proof: It is easy to show from (26.17) that
  A_{t_k^n}^n = Σ_{j=0}^{k−1} ( E( Y_{t_{j+1}^n} | F_{t_j^n} ) − Y_{t_j^n} ).
Let c > 0 be fixed and
  σ_c^n = inf{ t_{k−1}^n : A_{t_k^n}^n > c },

with the convention that the infimum over the empty set is T. Then σ_c^n ∈ S_T. By the optional sampling theorem and (26.17), we have
  Y_{σ_c^n} = A_{σ_c^n}^n − E( A_T^n | F_{σ_c^n} ).
Hence
  E( A_T^n 1_{A_T^n > c} ) = −E( Y_{σ_c^n} 1_{σ_c^n < T} ) + E( A_{σ_c^n}^n 1_{σ_c^n < T} ).   (26.18)
Note that A_{σ_{c/2}^n}^n ≤ c/2, so that A_T^n − A_{σ_{c/2}^n}^n > c/2 on {A_T^n > c}, and therefore
  E( A_{σ_c^n}^n 1_{σ_c^n < T} ) ≤ c P( σ_c^n < T )
  ≤ 2 E( ( A_T^n − A_{σ_{c/2}^n}^n ) 1_{σ_{c/2}^n < T} )
  = −2 E( Y_{σ_{c/2}^n} 1_{σ_{c/2}^n < T} ).
It then follows from (26.18) that
  E( A_T^n 1_{A_T^n > c} ) ≤ −E( Y_{σ_c^n} 1_{σ_c^n < T} ) − 2 E( Y_{σ_{c/2}^n} 1_{σ_{c/2}^n < T} ).
Note that
  sup_n E( −Y_{σ_c^n} 1_{σ_c^n < T} ) ≤ sup_n E( |Y_{σ_c^n}| 1_{|Y_{σ_c^n}| > M} ) + M sup_n P( σ_c^n < T ).
As {Y_{σ_c^n}} and {Y_{σ_{c/2}^n}} are uniformly integrable and
  P( σ_c^n < T ) = P( A_T^n > c ) ≤ (1/c) E( A_T^n ) = −(1/c) E( Y_0 ) → 0
as c → ∞, uniformly in n, we have
  lim_{c→∞} sup_n E( A_T^n 1_{A_T^n > c} ) = 0.
This proves the uniform integrability.


Definition 26.7 A submartingale X_t is regular if, for any T > 0 and any sequence σ_n ∈ S_T increasing to σ, we have E(X_{σ_n}) → E(X_σ).

Theorem 26.8 Let X_t be a regular submartingale of class (DL). Then A_t in the Doob-Meyer decomposition is continuous a.s.
Proof: Suppose that σ_n ∈ S_T increases to σ; then E(A_{σ_n}) → E(A_σ) and hence A_{σ_n} ↑ A_σ a.s. Denote t_j^n = jT/2^n. For c > 0, we define
  A_t^n = E( A_{t_{j+1}^n} ∧ c | F_t ),   t ∈ (t_j^n, t_{j+1}^n].
Since A_t^n is a martingale on each interval (t_j^n, t_{j+1}^n] and A_t is a natural increasing process, it is easy to show that
  E( ∫_0^t A_{s−}^n dA_s ) = E( ∫_0^t A_s^n dA_s ),   t ∈ [0, T].   (26.19)
Next we show that there exists a subsequence n_k such that A_t^{n_k} → A_t ∧ c uniformly in t ∈ [0, T], so that we can pass to the limit in the above equality. For ε > 0, we define
  σ_{n,ε} = inf{ t ∈ [0, T] : A_t^n − A_t ∧ c > ε },
with the convention that inf ∅ = T. Let η_n(t) = t_{j+1}^n for t ∈ (t_j^n, t_{j+1}^n]. Then σ_{n,ε}, η_n(σ_{n,ε}) ∈ S_T. Since A_t^n is decreasing in n, σ_{n,ε} is increasing in n. Let σ_ε = lim_{n→∞} σ_{n,ε}. Then σ_ε ∈ S_T and lim_{n→∞} η_n(σ_{n,ε}) = σ_ε. By the optional sampling theorem,
  E( A_{σ_{n,ε}}^n ) = E( A_{η_n(σ_{n,ε})} ∧ c ),
and hence
  P( σ_{n,ε} < T ) ≤ (1/ε) E( A_{σ_{n,ε}}^n − A_{σ_{n,ε}} ∧ c ) = (1/ε) E( A_{η_n(σ_{n,ε})} ∧ c − A_{σ_{n,ε}} ∧ c ) → 0
as n → ∞, by the regularity of X. Therefore
  lim_{n→∞} P( sup_{t∈[0,T]} |A_t^n − A_t ∧ c| > ε ) = 0.
Hence there exists a subsequence n_k such that
  lim_{k→∞} sup_{t∈[0,T]} |A_t^{n_k} − A_t ∧ c| = 0,   a.s.

Thus, by (26.19), we have
  E( ∫_0^T A_{s−} ∧ c dA_s ) = E( ∫_0^T A_s ∧ c dA_s ).
Hence
  0 = E( ∫_0^T ( A_s ∧ c − A_{s−} ∧ c ) dA_s ) ≥ E( Σ_{s≤T} ( A_s ∧ c − A_{s−} ∧ c )( A_s − A_{s−} ) ) ≥ 0.
This implies the continuity of s ↦ A_s ∧ c. Since c is arbitrary, we have the continuity of A_t.

27 Quadratic variation process

In this section, we introduce square integrable martingales, whose stochastic integrals will be given in the next section.

Definition 27.1 A martingale (M_t)_{t≥0} is a square integrable martingale (denoted by M ∈ M²) if
  E(M_t²) < ∞,   ∀t ≥ 0.
If M is continuous, then we write M ∈ M^{2,c}.


Lemma 27.2 If (M_t)_{t≥0} is a right-continuous square integrable martingale, then M_t² is a right-continuous submartingale of class (DL).
Proof: By Jensen's inequality, M_t² is a submartingale. By Theorem 24.11, for any σ ∈ S_T,
  E(M_σ²) ≤ E( sup_{0≤t≤T} M_t² ) ≤ 4 E(M_T²) < ∞.
Hence M_t² is of class (DL).


Applying the Doob-Meyer decomposition, there exists a unique natural increasing process A_t such that M_t² − A_t is a martingale. We shall denote A_t by ⟨M⟩_t, which is called the quadratic variation process of M_t. This process will play a key role in the definition of the stochastic integral.
Finally, we consider the covariation between two martingales.
Definition 27.3 For M, N ∈ M², the process
  ⟨M, N⟩_t = (1/4)( ⟨M + N⟩_t − ⟨M − N⟩_t )
is called the quadratic covariation process of M_t and N_t.
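On discretized paths, ⟨M⟩_t can be approximated by the running sum of squared increments, and the polarization identity of Definition 27.3 can then be checked directly. A minimal sketch (my own illustration, not from the text), using two correlated Brownian motions, for which ⟨M, N⟩_t = ρt:

```python
import numpy as np

def discrete_qv(path):
    """Running sum of squared increments: a discrete surrogate for <M>_t."""
    d = np.diff(path)
    return np.concatenate([[0.0], np.cumsum(d ** 2)])

rng = np.random.default_rng(1)
n, rho = 100_000, 0.6
dt = 1.0 / n
dW1 = rng.normal(0.0, np.sqrt(dt), n)
dW2 = rho * dW1 + np.sqrt(1.0 - rho ** 2) * rng.normal(0.0, np.sqrt(dt), n)
M = np.concatenate([[0.0], np.cumsum(dW1)])
N = np.concatenate([[0.0], np.cumsum(dW2)])

# Quadratic covariation via the polarization identity of Definition 27.3
cov = 0.25 * (discrete_qv(M + N) - discrete_qv(M - N))
# For these processes <M, N>_t = rho * t, so cov[-1] should be close to rho.
print(cov[-1])
```

The running covariation grows linearly in t with slope close to ρ, as the identity predicts.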

Sometimes we need to define the quadratic variation for a more general class of processes.

Definition 27.4 A real-valued process (M_t)_{t∈R₊} is a local martingale if there exists a sequence of stopping times τ_n increasing to ∞ almost surely such that, for every n, M_t^{τ_n} = M_{t∧τ_n} is a martingale. We denote the collection of all continuous local martingales by M^c_loc, and that of all continuous locally square-integrable martingales by M^{2,c}_loc.

Remark 27.5 Let M_t be a continuous local martingale. Define
  τ_n(ω) = inf{ t : |M_t(ω)| ≥ n },
with the convention that inf ∅ = ∞. Then, for every n, M_{t∧τ_n} is a bounded continuous martingale.
Theorem 27.6 Let M_t be a continuous local martingale. Then there exists a unique continuous increasing process A_t with A_0 = 0 such that M_t² − A_t is a local martingale. We shall denote A_t by ⟨M⟩_t.
Proof: Let M^n be given as in Remark 27.5, i.e. M_t^n = M_{t∧τ_n}, and let A^n = ⟨M^n⟩. The continuous martingale M^{n+1} stopped at τ_n, namely
  M_{t∧τ_n}^{n+1} = M_{t∧τ_n∧τ_{n+1}} = M_{t∧τ_n} = M_t^n,
has quadratic variation process A_{t∧τ_n}^{n+1}, and also A_t^n. Hence
  A_{t∧τ_n}^{n+1} = A_t^n,   ∀t.
Define
  A_t = A_t^n,   t ≤ τ_n.
Then A_0 = 0, A_t is a continuous increasing process, and
  A_{t∧τ_n} = A_t^n.
Since M_{t∧τ_n}² = (M_t^n)², it is clear that M_t² − A_t is a local martingale with localizing stopping times (τ_n). The uniqueness of A_t follows from that of the processes A_t^n.

As a consequence of Theorem 24.14, we have

Corollary 27.7 Let X_t be a continuous local martingale and (τ_t)_{t≥0} be a family of increasing bounded stopping times. Let X̃_t = X_{τ_t} and F̃_t = F_{τ_t}, t ≥ 0. Then (X̃_t, F̃_t) is a local martingale with
  ⟨X̃⟩_t = ⟨X⟩_{τ_t}.

28 Brownian motions

Brownian motion is the simplest and the most useful square integrable martingale. In a sense, stochastic analysis is a branch of mathematics which studies functionals of Brownian motions.

Definition 28.1 A d-dimensional continuous process X_t is a Brownian motion if X_0 = 0 and, for any t > s, X_t − X_s is independent of F_s and has a multivariate normal distribution with mean zero and covariance matrix (t − s)I_d, where I_d is the d × d identity matrix.

The next theorem shows that the quadratic variation process of Brownian motion is t. The converse of this theorem is also true and will be proved in the next chapter.

Theorem 28.2 Suppose that X_t = (X_t¹, X_t², …, X_t^d) is a d-dimensional Brownian motion. Then X_t^j, j = 1, 2, …, d, are square integrable martingales and
  ⟨X^j, X^k⟩_t = δ_{jk} t.   (28.20)

Proof: It is easy to show that each X^j is a square integrable martingale. We only prove (28.20). For t > s, we have
  E( X_t^j X_t^k − δ_{jk} t | F_s )
  = E( (X_t^j − X_s^j)(X_t^k − X_s^k) | F_s ) + E( X_s^k (X_t^j − X_s^j) + X_s^j (X_t^k − X_s^k) | F_s ) + X_s^j X_s^k − δ_{jk} t
  = δ_{jk}(t − s) + X_s^j X_s^k − δ_{jk} t
  = X_s^j X_s^k − δ_{jk} s.
Therefore, X_t^j X_t^k − δ_{jk} t is a martingale. This proves (28.20).
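The identity (28.20) can be seen numerically: the sum of squared increments of one coordinate of a simulated Brownian motion concentrates near t, while the cross products of increments of two distinct coordinates average out to zero. A minimal sketch (my own illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps, t = 20_000, 1_000, 2.0
dt = t / n_steps
# Increments of a 2-dimensional Brownian motion: independent N(0, dt)
dX = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps, 2))

qv_11 = (dX[:, :, 0] ** 2).sum(axis=1)           # approximates <X^1>_t = t
qv_12 = (dX[:, :, 0] * dX[:, :, 1]).sum(axis=1)  # approximates <X^1, X^2>_t = 0

print(qv_11.mean(), qv_11.std())  # mean near t, small spread
print(qv_12.mean())               # near 0
```

As the mesh dt shrinks, the spread of qv_11 around t shrinks as well, which is exactly the statement that the quadratic variation is the deterministic process t.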


29 Predictable processes

Let L be the collection of all measurable maps
  X : ( R₊ × Ω, B(R₊) ⊗ F ) → ( R, B(R) )
such that, for all t ≥ 0, X_t : Ω → R is F_t-measurable and, for each ω, t ↦ X_t(ω) is left-continuous. Let
  P = σ( X⁻¹(B(R)) : X ∈ L ).
Namely, P is the smallest σ-field on R₊ × Ω such that every X ∈ L is measurable as a map X : (R₊ × Ω, P) → (R, B(R)).

Definition 29.1 A stochastic process X = (X_t(ω)) is predictable if X : (R₊ × Ω, P) → (R, B(R)) is measurable.

Example 29.2 Let 0 = t_0 < t_1 < … < t_n. Define the simple process
  X_t(ω) = X_0(ω) 1_{{0}}(t) + Σ_{j=0}^{n−1} X_j(ω) 1_{(t_j, t_{j+1}]}(t).
If X_j is F_{t_j}-measurable, j = 0, 1, …, n, then X is predictable.

The following lemma gives a useful alternative description of the predictable σ-field P.

Lemma 29.3 The σ-field P is generated by all sets of the form Λ = (u, v] × B, where B ∈ F_u, or Λ = {0} × B, where B ∈ F_0.

Proof: Let G be the collection of all sets of the form Λ = (u, v] × B, where B ∈ F_u, or Λ = {0} × B, where B ∈ F_0. For Λ ∈ G, it is easy to see that 1_Λ ∈ L and hence G ⊂ P. This implies that σ(G) ⊂ P, where σ(G) is the σ-field generated by G.
On the other hand, for each X ∈ L, we define
  X_t^n(ω) = X_0(ω) 1_{{0}}(t) + Σ_{j≥0} X_{j/n}(ω) 1_{(j/n, (j+1)/n]}(t).
It is clear that X^n is σ(G)-measurable and, by left continuity, X_t^n(ω) → X_t(ω) for each t ≥ 0 and ω. Hence, X is σ(G)-measurable. This implies that P ⊂ σ(G). Therefore, P = σ(G).

30 Stochastic integral

Denote by L₀ the collection of all simple predictable processes f_t of the form
  f_t(ω) = Σ_{j=0}^{n−1} f_j(ω) 1_{(t_j, t_{j+1}]}(t),
where 0 ≤ t_0 < … < t_n and each f_j is a bounded F_{t_j}-measurable random variable. For f ∈ L₀ and M ∈ M^{2,c}, we define the stochastic integral
  I(f) = ∫_0^∞ f_s dM_s = Σ_{j=0}^{n−1} f_j ( M_{t_{j+1}} − M_{t_j} ).   (30.21)

Proposition 30.1 I(f) satisfies the following identities:
  E( ∫_0^∞ f_s dM_s ) = 0
and
  E( ( ∫_0^∞ f_s dM_s )² ) = E( ∫_0^∞ f_s² d⟨M⟩_s ).

Proof: The first equality follows from
  E( ∫_0^∞ f_s dM_s ) = Σ_{j=0}^{n−1} E( f_j ( M_{t_{j+1}} − M_{t_j} ) )
  = Σ_{j=0}^{n−1} E( f_j E( M_{t_{j+1}} − M_{t_j} | F_{t_j} ) ) = 0.   (30.22)
To prove the second equality, we note that
  ( ∫_0^∞ f_s dM_s )² = Σ_{j=0}^{n−1} f_j² ( M_{t_{j+1}} − M_{t_j} )² + 2 Σ_{0≤j<k≤n−1} f_j f_k ( M_{t_{j+1}} − M_{t_j} )( M_{t_{k+1}} − M_{t_k} )
  ≡ I₁ + I₂.
Similarly to (30.22), we have E(I₂) = 0. On the other hand,
  E(I₁) = Σ_{j=0}^{n−1} E( f_j² E( ( M_{t_{j+1}} − M_{t_j} )² | F_{t_j} ) )
  = Σ_{j=0}^{n−1} E( f_j² ( ⟨M⟩_{t_{j+1}} − ⟨M⟩_{t_j} ) )
  = E( ∫_0^∞ f_s² d⟨M⟩_s ).
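Both identities of Proposition 30.1 can be checked by Monte Carlo for M = B a Brownian motion (so that ⟨M⟩_s = s), with a simple predictable integrand that is constant on each (t_j, t_{j+1}] and F_{t_j}-measurable. A minimal sketch (my own illustration, not from the text), using f_j = sign(B_{t_j}):

```python
import numpy as np

rng = np.random.default_rng(3)
n_paths, n_steps, t = 200_000, 50, 1.0
dt = t / n_steps
dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
B = np.cumsum(dB, axis=1)

# Simple predictable integrand: on (t_j, t_{j+1}], f depends only on B_{t_j}.
# Here f_0 = 1 and f_j = sign(B_{t_j}) (bounded and F_{t_j}-measurable).
f = np.concatenate([np.ones((n_paths, 1)), np.sign(B[:, :-1])], axis=1)
integral = (f * dB).sum(axis=1)           # sum of f_j (B_{t_{j+1}} - B_{t_j})

mean = integral.mean()                     # should be near E(I(f)) = 0
lhs = (integral ** 2).mean()               # E[ (∫ f dB)^2 ]
rhs = ((f ** 2).sum(axis=1) * dt).mean()   # E[ ∫ f^2 ds ]  (isometry)
print(mean, lhs, rhs)
```

The predictability of f (each f_j looks only at the past) is what makes the cross terms vanish; an integrand peeking at the future increment would break both identities.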





For M ∈ M^{2,c}, we define a measure μ_M on (R₊ × Ω, P) by
  μ_M(A) = E( ∫_0^∞ 1_A(t, ω) d⟨M⟩_t ).
By Lemma 29.3, it is easy to show that L₀ is a dense subspace of L²(μ_M).


The following theorem follows from Proposition 30.1 directly.

Theorem 30.2 I : L₀ → L²(Ω, F, P) defined by (30.21) is a linear isometry. As a consequence, it can be extended uniquely to a linear isometry from L²(μ_M) into L²(Ω, F, P). We still denote the extension by
  I(f) = ∫_0^∞ f_s dM_s.
Proof: For any f ∈ L²(μ_M), let f^n ∈ L₀ be such that
  ‖f^n − f‖_{L²(μ_M)} → 0.
Then
  ‖I(f^n) − I(f^m)‖_{L²(Ω,F,P)} = ‖f^n − f^m‖_{L²(μ_M)} → 0
as m, n → ∞. Namely, I(f^n) is a Cauchy sequence. Denote its limit by I(f).
We then define the stochastic integral as a process:
  I_t(f) = ∫_0^t f_s dM_s = ∫_0^∞ f_s 1_{[0,t]}(s) dM_s.
Lemma 30.3 If f ∈ L₀, then I_t(f) is a continuous square integrable martingale with quadratic variation process
  ⟨I(f)⟩_t = ∫_0^t f_s² d⟨M⟩_s.
Proof: As
  I_t(f) = Σ_{i=0}^{n−1} f_i(ω) ( M_{t_{i+1}∧t} − M_{t_i∧t} ),
it is continuous. For t > s, let t_j < s ≤ t_{j+1}. Then
  E( I_t(f) | F_s )
  = Σ_{i=0}^{j−1} E( f_i(ω)( M_{t_{i+1}∧t} − M_{t_i∧t} ) | F_s ) + E( f_j(ω)( M_{t_{j+1}∧t} − M_{t_j∧t} ) | F_s ) + Σ_{i=j+1}^{n−1} E( f_i(ω)( M_{t_{i+1}∧t} − M_{t_i∧t} ) | F_s )
  = Σ_{i=0}^{j−1} f_i(ω)( M_{t_{i+1}} − M_{t_i} ) + f_j(ω)( M_s − M_{t_j} ) + 0
  = I_s(f).
So I_t(f) is a continuous martingale. Similarly,
  I_t(f)² − ∫_0^t f_s² d⟨M⟩_s
is a martingale. Hence the quadratic variation process of I(f) is
  ⟨I(f)⟩_t = ∫_0^t f_s² d⟨M⟩_s.

Theorem 30.4 For f ∈ L²(μ_M), I_t(f) is a continuous square integrable martingale with quadratic variation process
  ⟨I(f)⟩_t = ∫_0^t f_s² d⟨M⟩_s.
Proof: We only need to prove the theorem for t ≤ T with T fixed. Let f^n be a sequence of simple predictable processes such that
  |f_s^n| ≤ |f_s|   (30.23)
and
  E( ∫_0^T ( f_s^n − f_s )² d⟨M⟩_s ) < 2^{−n}.
By the definition of the stochastic integral, we see that, for all t ∈ [0, T],
  E( |I_t(f^n) − I_t(f)|² ) → 0.   (30.24)
It is easy to verify that I_t(f^n) and
  I_t(f^n)² − ∫_0^t ( f_s^n )² d⟨M⟩_s
are martingales. It follows from (30.24) and (30.23) that I_t(f) and
  I_t(f)² − ∫_0^t f_s² d⟨M⟩_s
are martingales. Therefore, I_t(f) is a square integrable martingale with quadratic variation process
  ⟨I(f)⟩_t = ∫_0^t f_s² d⟨M⟩_s.
From Theorem 24.11, we get
  P( sup_{0≤t≤T} |I_t(f^n) − I_t(f)| > 1/n ) ≤ n² E( sup_{0≤t≤T} |I_t(f^n) − I_t(f)|² )
  ≤ 4 n² E( ∫_0^T ( f_s^n − f_s )² d⟨M⟩_s ) < 4 n² 2^{−n},
which is summable. By the Borel-Cantelli lemma, we have
  P( sup_{0≤t≤T} |I_t(f^n) − I_t(f)| > 1/n infinitely often ) = 0.
Hence
  sup_{0≤t≤T} |I_t(f^n) − I_t(f)| → 0,   a.s.
As the I_t(f^n) are continuous, I_t(f) is continuous a.s.


Finally, we give the definition of the stochastic integral when M ∈ M^{2,c}_loc.

Definition 30.5 For M ∈ M^{2,c}_loc, let L²_loc(M) be the collection of all real-valued predictable processes f such that there exists a sequence of stopping times τ_n ↑ ∞ a.s. with
  E( ∫_0^{T∧τ_n} f_t² d⟨M⟩_t ) < ∞,   ∀T > 0, n ∈ N.   (30.25)

It is clear that we may choose τ_n in Definition 30.5 such that, for every n ∈ N, M_{t∧τ_n} is a square-integrable martingale and (30.25) is satisfied. Define
  I_t^n(f) = I_t( 1_{(0,τ_n]} f ).
For m < n, it is easy to verify that
  I_t^m(f) = I_{t∧τ_m}^n(f).
Therefore, there exists a unique stochastic process I_t(f) such that
  I_{t∧τ_n}(f) = I_t^n(f).

Definition 30.6 I_t(f) is called the stochastic integral of f ∈ L²_loc(M) with respect to M ∈ M^{2,c}_loc. We also denote
  I_t(f) = ∫_0^t f_s dM_s.

31 Itô's formula

In this section, we derive Itô's formula for a function of a semimartingale. This formula is the counterpart in stochastic analysis of the chain rule in calculus.

Definition 31.1 A d-dimensional process X_t is a continuous semimartingale if
  X_t = X_0 + M_t + A_t,
where M¹, …, M^d are continuous local martingales and A¹, …, A^d are continuous processes of finite variation.

Theorem 31.2 (Itô's formula) Let X_t be a d-dimensional continuous semimartingale and let F ∈ C_b²(R^d). Then
  F(X_t) = F(X_0) + Σ_{i=1}^d ∫_0^t (∂F/∂x_i)(X_s) dM_s^i + Σ_{i=1}^d ∫_0^t (∂F/∂x_i)(X_s) dA_s^i
  + (1/2) Σ_{i,j=1}^d ∫_0^t (∂²F/∂x_i∂x_j)(X_s) d⟨M^i, M^j⟩_s.   (31.26)

Proof: For simplicity of notation, we assume d = 1. Let
  τ_n = 0 if |X_0| > n;   τ_n = inf{ t : |M_t| > n or Var(A)_t > n or ⟨M⟩_t > n } if |X_0| ≤ n,
where Var(A)_t is the total variation of A on [0, t]. It is clear that τ_n ↑ ∞ a.s. We only need to prove (31.26) with t replaced by t ∧ τ_n. In other words, we may assume that |X_0|, |M_t|, Var(A)_t and ⟨M⟩_t are all bounded by a constant C and that F ∈ C_0²(R).
Let t_i = it/n, i = 0, 1, …, n. Then
  F(X_t) − F(X_0) = Σ_{i=1}^n ( F(X_{t_i}) − F(X_{t_{i−1}}) )
  = Σ_{i=1}^n F′(X_{t_{i−1}})( X_{t_i} − X_{t_{i−1}} ) + (1/2) Σ_{i=1}^n F″(ξ_i)( X_{t_i} − X_{t_{i−1}} )²
  ≡ I₁^n + I₂^n,
where ξ_i is between X_{t_{i−1}} and X_{t_i}.


Note that
  I₁^n = Σ_{i=1}^n F′(X_{t_{i−1}})( M_{t_i} − M_{t_{i−1}} ) + Σ_{i=1}^n F′(X_{t_{i−1}})( A_{t_i} − A_{t_{i−1}} )
  → ∫_0^t F′(X_s) dM_s + ∫_0^t F′(X_s) dA_s.
On the other hand,
  2 I₂^n = Σ_{i=1}^n F″(ξ_i)( M_{t_i} − M_{t_{i−1}} )² + 2 Σ_{i=1}^n F″(ξ_i)( M_{t_i} − M_{t_{i−1}} )( A_{t_i} − A_{t_{i−1}} ) + Σ_{i=1}^n F″(ξ_i)( A_{t_i} − A_{t_{i−1}} )²
  ≡ I₂₁^n + I₂₂^n + I₂₃^n.
Since A_t is of finite variation and M is continuous, it is easy to show that I₂₂^n → 0 and I₂₃^n → 0.
Let
  V_k^n = Σ_{i=1}^k ( M_{t_i} − M_{t_{i−1}} )²,   k = 1, 2, …, n.
Then
  E( (V_n^n)² ) = Σ_{i=1}^n E( ( M_{t_i} − M_{t_{i−1}} )⁴ ) + 2 Σ_{1≤i<j≤n} E( E( ( M_{t_j} − M_{t_{j−1}} )² | F_{t_{j−1}} ) ( M_{t_i} − M_{t_{i−1}} )² )
  ≤ 4C² Σ_{i=1}^n E( ( M_{t_i} − M_{t_{i−1}} )² ) + 2 Σ_{1≤i<j≤n} E( ( ⟨M⟩_{t_j} − ⟨M⟩_{t_{j−1}} )( M_{t_i} − M_{t_{i−1}} )² )
  ≤ 4C² E(V_n^n) + 2C E(V_n^n) = ( 4C² + 2C ) E(V_n^n).
Since E(V_n^n) = E( (M_t − M_0)² ) ≤ 4C², it follows that
  E( (V_n^n)² ) ≤ ( 4C² + 2C )².
Let
  I₃^n = Σ_{i=1}^n F″(X_{t_{i−1}})( M_{t_i} − M_{t_{i−1}} )²
and
  I₄^n = Σ_{i=1}^n F″(X_{t_{i−1}})( ⟨M⟩_{t_i} − ⟨M⟩_{t_{i−1}} ).
Then, by the Cauchy-Schwarz inequality and the continuity of F″ and of the paths,
  { E( |I₃^n − I₂₁^n| ) }² ≤ E( max_{1≤i≤n} |F″(ξ_i) − F″(X_{t_{i−1}})|² ) E( (V_n^n)² ) → 0,
and
  I₄^n → ∫_0^t F″(X_s) d⟨M⟩_s.
Finally, note that
  E( ( M_{t_i} − M_{t_{i−1}} )² − ( ⟨M⟩_{t_i} − ⟨M⟩_{t_{i−1}} ) | F_{t_{i−1}} ) = 0,
so the cross terms vanish and hence
  E( |I₃^n − I₄^n|² ) = Σ_{i=1}^n E( F″(X_{t_{i−1}})² ( ( M_{t_i} − M_{t_{i−1}} )² − ( ⟨M⟩_{t_i} − ⟨M⟩_{t_{i−1}} ) )² )
  ≤ 2 ‖F″‖² E( Σ_{i=1}^n ( ( M_{t_i} − M_{t_{i−1}} )⁴ + ( ⟨M⟩_{t_i} − ⟨M⟩_{t_{i−1}} )² ) ) → 0.
Therefore I₂₁^n, and hence 2 I₂^n, converges to ∫_0^t F″(X_s) d⟨M⟩_s, which proves (31.26).
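For F(x) = x² and X = B a standard Brownian motion (so M = B, A = 0, ⟨M⟩_t = t), formula (31.26) reads B_t² = 2∫_0^t B_s dB_s + t, and the left-endpoint sums used in the proof converge to the stochastic integral. A minimal sketch (my own illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(4)
n_steps, t = 1_000_000, 1.0
dt = t / n_steps
dB = rng.normal(0.0, np.sqrt(dt), n_steps)
B = np.concatenate([[0.0], np.cumsum(dB)])

# Left-endpoint Riemann sum approximating the Ito integral  ∫_0^t B_s dB_s
ito_integral = float(np.sum(B[:-1] * dB))

lhs = B[-1] ** 2                 # F(X_t)
rhs = 2.0 * ito_integral + t     # Ito's formula for F(x) = x^2
print(lhs, rhs)
```

The discrepancy between the two sides is exactly Σ(ΔB)² − t, i.e. the quadratic-variation error, which vanishes as the mesh is refined.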

As an application of Itô's formula, we justify the terminology "quadratic variation process" for ⟨M⟩_t by the following theorem.

Theorem 31.3 Suppose that M ∈ M^{2,c}_loc. Let 0 = t_0^n < t_1^n < … < t_n^n = t be such that
  max_{1≤j≤n} ( t_j^n − t_{j−1}^n ) → 0.
Then
  lim_{n→∞} Σ_{j=1}^n ( M_{t_j^n} − M_{t_{j−1}^n} )² = ⟨M⟩_t   in probability.

Proof: Note that
  Σ_{j=1}^n ( M_{t_j^n} − M_{t_{j−1}^n} )² = Σ_{j=1}^n ( ∫_{t_{j−1}^n}^{t_j^n} 2 ( M_s − M_{t_{j−1}^n} ) dM_s + ⟨M⟩_{t_j^n} − ⟨M⟩_{t_{j−1}^n} )
  = 2 ∫_0^t M_s dM_s − 2 Σ_{j=1}^n M_{t_{j−1}^n} ( M_{t_j^n} − M_{t_{j−1}^n} ) + ⟨M⟩_t
  → 2 ∫_0^t M_s dM_s − 2 ∫_0^t M_s dM_s + ⟨M⟩_t = ⟨M⟩_t,
where the first step follows from Itô's formula applied to M_s² on each interval [t_{j−1}^n, t_j^n].

32 Martingale characterization of Brownian motion

In this section, we make use of Itô's formula to show that Brownian motion is characterized by its quadratic variation process. Then, as consequences of this result, we present some representation theorems for square-integrable martingales in terms of Brownian motions.

Theorem 32.1 Suppose that X_t = (X_t¹, …, X_t^d) is such that X^j ∈ M^{2,c}_loc, X_0 = 0 and
  ⟨X^j, X^k⟩_t = δ_{jk} t,   j, k = 1, 2, …, d.
Then X_t is a d-dimensional Brownian motion.

Proof: Let ξ ∈ R^d. Applying Itô's formula to e^{i⟨ξ,x⟩}, we have
  e^{i⟨ξ,X_t⟩} = e^{i⟨ξ,X_s⟩} + ∫_s^t i e^{i⟨ξ,X_u⟩} ⟨ξ, dX_u⟩ − (1/2) |ξ|² ∫_s^t e^{i⟨ξ,X_u⟩} du,
where the notation ⟨ξ, X_s⟩ stands for the inner product in R^d (not to be confused with the quadratic covariation process). Thus
  E( e^{i⟨ξ,X_t⟩} | F_s ) = e^{i⟨ξ,X_s⟩} − (1/2) |ξ|² ∫_s^t E( e^{i⟨ξ,X_u⟩} | F_s ) du.
Solving this integral equation, we get
  E( e^{i⟨ξ, X_t − X_s⟩} | F_s ) = e^{−(1/2) |ξ|² (t−s)}.
This implies that X_t − X_s is independent of F_s and has a multivariate normal distribution with mean 0 and covariance matrix (t − s)I_d. Namely, X_t is a d-dimensional Brownian motion.

As an application of Theorem 32.1, we can represent any locally square-integrable martingale as a time change of a Brownian motion.

Theorem 32.2 Suppose that M ∈ M^{2,c}_loc satisfies lim_{t→∞} ⟨M⟩_t = ∞ a.s. Let
  τ_t = inf{ u : ⟨M⟩_u > t }
and F̃_t = F_{τ_t}. Then B_t = M_{τ_t} is an (F̃_t)-Brownian motion. As a consequence, M_t has the representation
  M_t = B_{⟨M⟩_t}.

Proof: We first prove that B_t is continuous. The only possible way for B_t to fail to be continuous is that τ_t has a jump and M is not constant over this jump. In that case, ⟨M⟩ must be flat over an interval, say (r, r′), while M is not constant over this interval. Therefore, we only need to show that
  P( { ⟨M⟩_{r′} = ⟨M⟩_r } ∩ { M_u = M_r, ∀u ∈ [r, r′] }ᶜ ) = 0.   (32.27)
Let
  σ = inf{ s > r : ⟨M⟩_s > ⟨M⟩_r }.
Then σ is a stopping time and hence, by Doob's optional sampling theorem,
  N_s ≡ M_{(r+s)∧σ} − M_r
is a local martingale with respect to F̂_s ≡ F_{(r+s)∧σ}. Since
  ⟨N⟩_s = ⟨M⟩_{(r+s)∧σ} − ⟨M⟩_r = 0,
we have N ≡ 0. This implies (32.27).
By Doob's optional sampling theorem again,
  ⟨B⟩_t = ⟨M⟩_{τ_t} = t.
Therefore, B_t is a Brownian motion.
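For M_t = ∫_0^t σ(s) dB_s with a deterministic σ, ⟨M⟩_t = ∫_0^t σ(s)² ds, and Theorem 32.2 predicts that M_t has the same law as a Brownian motion read at the clock ⟨M⟩_t. A minimal sketch (my own illustration, not from the text, with σ(s) = 1 + s so that ⟨M⟩_1 = 7/3):

```python
import numpy as np

rng = np.random.default_rng(5)
n_paths, n_steps, t = 100_000, 500, 1.0
dt = t / n_steps
s = np.arange(n_steps) * dt
sigma = 1.0 + s                          # deterministic volatility

dB = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
M_t = (sigma * dB).sum(axis=1)           # left-endpoint sum for ∫ sigma dB

clock = float(np.sum(sigma ** 2) * dt)   # <M>_t = ∫_0^t (1+s)^2 ds ≈ 7/3
# A Brownian motion evaluated at the changed time has variance <M>_t:
B_clock = rng.normal(0.0, np.sqrt(clock), size=n_paths)

print(M_t.var(), clock, B_clock.var())
```

The empirical variance of M_t matches the deterministic clock ⟨M⟩_t, consistent with M_t = B_{⟨M⟩_t} in law.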
Next, we would like to remove the condition that lim_{t→∞} ⟨M⟩_t = ∞ a.s. To this end, we need to define the Brownian motion on an extended probability space.

Definition 32.3 We say that (Ω̃, F̃, P̃, F̃_t) is an extension of a stochastic basis (Ω, F, P, F_t) if there exists a map π : Ω̃ → Ω which is F̃/F-measurable such that: i) π⁻¹(F_t) ⊂ F̃_t; ii) P = P̃ ∘ π⁻¹; and iii) for every bounded random variable X on Ω,
  Ẽ( X̃(ω̃) | F̃_t ) = E( X | F_t )(π ω̃)   P̃-a.s.,
where X̃(ω̃) = X(π ω̃). We shall denote X̃ by X if its meaning is clear from the context.
(Ω̃, F̃, P̃, F̃_t) is called a standard extension of a stochastic basis (Ω, F, P, F_t) if we have another stochastic basis (Ω′, F′, P′, F_t′) such that
  (Ω̃, F̃, P̃, F̃_t) = (Ω, F, P, F_t) × (Ω′, F′, P′, F_t′)
and π ω̃ = ω for ω̃ = (ω, ω′).
Theorem 32.4 For M ∈ M^{2,c}_loc, we define
  τ_t = inf{ u : ⟨M⟩_u > t },
with the convention that inf ∅ = ∞. Let
  F̃_t = ∩_{s>0} F_{τ_{t+s}}.
Then on an extension (Ω̃, F̃, P̃, F̃_t) of (Ω, F, P, F_t) there exists an (F̃_t)-Brownian motion B_t such that
  M_t = B_{⟨M⟩_t}.

Proof: By the optional sampling theorem, for s′ ≤ s and v ≤ u,
  E( M_{τ_u ∧ s} | F_{τ_v ∧ s′} ) = M_{τ_v ∧ s′}
and
  E( ( M_{τ_u ∧ s} − M_{τ_v ∧ s′} )² | F_{τ_v ∧ s′} ) = E( ⟨M⟩_{τ_u ∧ s} − ⟨M⟩_{τ_v ∧ s′} | F_{τ_v ∧ s′} ).
Therefore, s ↦ M_{τ_u ∧ s} is a square-integrable martingale with quadratic variation process ⟨M⟩_{τ_u ∧ s}. By the martingale convergence theorem,
  B̂_u = lim_{s→∞} M_{τ_u ∧ s}
exists a.s. Further, for v ≤ u, we have
  E( B̂_u | F̃_v ) = B̂_v
and
  E( ( B̂_u − B̂_v )² | F̃_v ) = E( ⟨M⟩_{τ_u} − ⟨M⟩_{τ_v} | F̃_v ).
Let (Ω′, F′, P′, F_t′) be a stochastic basis and let B_t′ be a Brownian motion on Ω′. Define the standard extension
  (Ω̃, F̃, P̃, F̃_t) = (Ω, F, P, F_t) × (Ω′, F′, P′, F_t′).
Let
  B_t = B_t′ − B′_{t ∧ ⟨M⟩_∞} + B̂_{t ∧ ⟨M⟩_∞}.
Then B_t is a continuous F̃_t-martingale with ⟨B⟩_t = t and hence a Brownian motion. The rest of the proof is easy.

Next, we represent square-integrable martingales as stochastic integrals with respect to Brownian motions.

Theorem 32.5 Let M^i ∈ M^{2,c}, i = 1, 2, …, d, and let σ_{ij} : R₊ × Ω → R, i, j = 1, 2, …, d, be such that
  E( ∫_0^t σ_{ij}(s)² ds ) < ∞,   ∀t > 0,
and
  ⟨M^i, M^j⟩_t = ∫_0^t Σ_{k=1}^d σ_{ik}(s) σ_{jk}(s) ds.
If
  det( σ_{ij}(s) ) ≠ 0 a.s.,   ∀s,   (32.28)
then there exists a d-dimensional Brownian motion B_t (on the original stochastic basis) such that
  M_t^i = Σ_{k=1}^d ∫_0^t σ_{ik}(s) dB_s^k.   (32.29)

Proof: For N > 0, let
  I_N(s) = 1_{ max_{1≤i,j≤d} |(σ⁻¹)_{ij}(s)| ≤ N },
where σ⁻¹ is the inverse matrix of σ. Define
  B_t^{i,N} = Σ_{k=1}^d ∫_0^t (σ⁻¹)_{ik}(s) I_N(s) dM_s^k,   i = 1, 2, …, d.
Then B^{i,N} ∈ M^{2,c} and
  ⟨B^{i,N}, B^{j,N}⟩_t = δ_{ij} ∫_0^t I_N(s) ds.
Therefore, B^{i,N} converges in M^{2,c} to some B^i and
  ⟨B^i, B^j⟩_t = δ_{ij} t.
By Theorem 32.1, B_t = (B_t¹, …, B_t^d) is a d-dimensional Brownian motion. Note that
  Σ_{k=1}^d ∫_0^t σ_{ik}(s) dB_s^{k,N} = ∫_0^t I_N(s) dM_s^i.
Letting N → ∞, we get the representation (32.29).

Next, we remove the condition (32.28). In this case, we need to construct the Brownian motion on an extension of the original stochastic basis.

Theorem 32.6 Let M^i ∈ M^{2,c}, i = 1, 2, …, d, and let σ_{ij} : R₊ × Ω → R, i = 1, 2, …, d, j = 1, 2, …, r, be such that
  E( ∫_0^t σ_{ij}(s)² ds ) < ∞,   ∀t > 0,
and
  ⟨M^i, M^j⟩_t = ∫_0^t Σ_{k=1}^r σ_{ik}(s) σ_{jk}(s) ds.
Then on an extension (Ω̃, F̃, P̃, F̃_t) of (Ω, F, P, F_t) there exists an r-dimensional Brownian motion B_t such that
  M_t^i = Σ_{k=1}^r ∫_0^t σ_{ik}(s) dB_s^k.   (32.30)

Proof: By adjoining M_t^i ≡ 0 or σ_{ik} ≡ 0 if necessary, we may assume that d = r. Let
  γ_{ij}(s) = Σ_{k=1}^r σ_{ik}(s) σ_{jk}(s).
Then γ(s) is a d × d non-negative definite matrix. Let
  γ̂(s) = lim_{ε↓0} γ(s)^{1/2} ( γ(s) + ε I_d )⁻¹,
where I_d is the d × d identity matrix. Let E_R(s) be the projection matrix onto the range of γ(s) and E_N(s) = I_d − E_R(s). Then
  γ̂(s) γ(s)^{1/2} = γ(s)^{1/2} γ̂(s) = E_R(s).
First, we assume that σ(s) = γ(s)^{1/2}. Let B_t′ be a d-dimensional Brownian motion on a stochastic basis (Ω′, F′, P′, F_t′) and let
  (Ω̃, F̃, P̃, F̃_t) = (Ω, F, P, F_t) × (Ω′, F′, P′, F_t′).
Define
  B_t^i = Σ_{k=1}^d ∫_0^t γ̂_{ik}(s) dM_s^k + Σ_{k=1}^d ∫_0^t E_N(s)_{ik} dB_s′^k.
Then ⟨B^i, B^j⟩_t = δ_{ij} t and hence B_t is a d-dimensional Brownian motion. Further,
  Σ_{k=1}^d ∫_0^t σ_{ik}(s) dB_s^k = Σ_{k,j=1}^d ∫_0^t σ_{ik}(s) γ̂_{kj}(s) dM_s^j + Σ_{k,j=1}^d ∫_0^t σ_{ik}(s) E_N(s)_{kj} dB_s′^j
  = Σ_{j=1}^d ∫_0^t E_R(s)_{ij} dM_s^j
  = M_t^i − Σ_{j=1}^d ∫_0^t E_N(s)_{ij} dM_s^j.   (32.31)
Note that the quadratic variation of Σ_{j=1}^d ∫_0^t E_N(s)_{ij} dM_s^j is ∫_0^t ( E_N(s) γ(s) E_N(s) )_{ii} ds = 0, and hence
  Σ_{j=1}^d ∫_0^t E_N(s)_{ij} dM_s^j = 0.
Combining this with (32.31), we see that (32.30) holds.
For a general σ(s), there exists an orthogonal-matrix-valued predictable process P(s) such that γ(s)^{1/2} = σ(s) P(s). By the previous step, we have
  M_t^i = Σ_{k=1}^d ∫_0^t ( γ(s)^{1/2} )_{ik} dB_s^k.
Let
  B̃_s^k = Σ_{j=1}^d ∫_0^s P_{kj}(u) dB_u^j.
Then B̃_t is a d-dimensional Brownian motion and (32.30) holds with B replaced by B̃.

33 Change of measures

In this section, we consider how martingales change under equivalent probability measures.
First we consider a non-negative local martingale. Under Novikov's condition (33.33), it becomes a martingale and gives the Radon-Nikodym derivative between two probability measures.
For X ∈ M^{2,c}_loc, we define
  M_t = exp( X_t − (1/2) ⟨X⟩_t ).

Theorem 33.1 M_t is a continuous local martingale. Further, M_t is a supermartingale, and it is a martingale if and only if
  E(M_t) = 1,   ∀t ≥ 0.   (33.32)

Proof: Applying Itô's formula, we have
  M_t = 1 + ∫_0^t M_s dX_s.
Hence M is a continuous local martingale. By Fatou's lemma, it is easy to verify that M_t is a supermartingale; and a supermartingale with constant expectation E(M_t) = E(M_0) = 1 is a martingale, which gives the stated equivalence.

The next theorem gives a sufficient condition for (33.32) to hold.

Theorem 33.2 (Novikov) If
  E( exp( (1/2) ⟨X⟩_t ) ) < ∞,   ∀t ≥ 0,   (33.33)
then (M_t)_{t≥0} is a continuous martingale.


Proof: By Theorem 32.4, on an extension (Ω̃, F̃, P̃, F̃_t) there exists a Brownian motion B_t such that X_t = B_{⟨X⟩_t}. For a > 0, set
  τ_a = inf{ t : B_t − t ≤ −a }.
For λ > 0, define
  u(t, x) = e^{−λt} e^{(1 − √(1+2λ)) x}.
Then
  ∂u/∂t − ∂u/∂x + (1/2) ∂²u/∂x² = 0.
Applying Itô's formula, we have
  u(t, B_t − t) = 1 + ∫_0^t (∂u/∂x)(s, B_s − s) dB_s.
It is easy to see that u(t ∧ τ_a, B_{t∧τ_a} − t ∧ τ_a) is a bounded martingale and hence
  E( u(t ∧ τ_a, B_{t∧τ_a} − t ∧ τ_a) ) = 1.
Letting t → ∞, we get
  E( e^{−λ τ_a} ) = e^{−(√(1+2λ) − 1) a}.   (33.34)
By the uniqueness of analytic extension, we see that (33.34) also holds for λ = −1/2, i.e.
  E( e^{τ_a/2} ) = e^{a} < ∞.
Therefore, since B_{τ_a} = τ_a − a,
  E( exp( B_{τ_a} − (1/2) τ_a ) ) = e^{−a} E( e^{τ_a/2} ) = 1.
Let
  Y_t = exp( B_{t∧τ_a} − (1/2) (t ∧ τ_a) ).
By Theorem 33.1, (Y_t)_{t≥0} is a uniformly integrable F̃_t-martingale. Hence, for any (F̃_t)-stopping time σ,
  E( exp( B_{σ∧τ_a} − (1/2) (σ ∧ τ_a) ) ) = 1.
Taking σ = ⟨X⟩_t, we get
  1 = E( exp( B_{⟨X⟩_t ∧ τ_a} − (1/2) (⟨X⟩_t ∧ τ_a) ) )
  = E( 1_{τ_a ≤ ⟨X⟩_t} exp( −a + (1/2) τ_a ) ) + E( 1_{τ_a > ⟨X⟩_t} M_t ),
where we used X_t = B_{⟨X⟩_t}, so that exp( B_{⟨X⟩_t} − (1/2) ⟨X⟩_t ) = M_t. Note that, as a → ∞,
  E( 1_{τ_a ≤ ⟨X⟩_t} exp( −a + (1/2) τ_a ) ) ≤ e^{−a} E( exp( (1/2) ⟨X⟩_t ) ) → 0,
while E( 1_{τ_a > ⟨X⟩_t} M_t ) → E(M_t) by monotone convergence. Hence E(M_t) = 1.
Suppose that M_t is a martingale. We define a probability measure P̃_t on (Ω, F_t) by
  P̃_t(A) = E( M_t 1_A ),   A ∈ F_t.
Then, for t > s, we have P̃_t|_{F_s} = P̃_s. In fact, for A ∈ F_s,
  P̃_t(A) = E( E( M_t 1_A | F_s ) ) = E( M_s 1_A ) = P̃_s(A).
We assume that
  F = σ( ∪_{t≥0} F_t ).
Then there exists a unique probability measure P̃ on (Ω, F) such that P̃|_{F_t} = P̃_t. Denote P̃ by M·P.

Theorem 33.3 (Girsanov's transformation) i) If Y ∈ M^{2,c}_loc, then Ỹ defined by
  Ỹ_t = Y_t − ⟨X, Y⟩_t   (33.35)
is a P̃-locally square-integrable martingale. We denote Ỹ ∈ M̃^{2,c}_loc.
ii) For Y¹, Y² ∈ M^{2,c}_loc, let Ỹ¹, Ỹ² be defined by (33.35). Then
  ⟨Ỹ¹, Ỹ²⟩ = ⟨Y¹, Y²⟩.

Proof: i) First we assume that Y_t is bounded. By Itô's formula,
  d( M_t Ỹ_t ) = Ỹ_t dM_t + M_t dỸ_t + d⟨M, Ỹ⟩_t = Ỹ_t dM_t + M_t dY_t,
since dM_t = M_t dX_t gives d⟨M, Ỹ⟩_t = M_t d⟨X, Y⟩_t, which cancels the term −M_t d⟨X, Y⟩_t coming from M_t dỸ_t. Hence M_t Ỹ_t is a martingale and, by the Bayes formula,
  Ẽ( Ỹ_t | F_s ) = E( M_t Ỹ_t | F_s ) M_s⁻¹ = Ỹ_s.
In general, we choose a sequence of increasing stopping times τ_n ↑ ∞ such that, for every n, Y_{t∧τ_n} is bounded. Then Ỹ_{·∧τ_n} ∈ M̃^{2,c}_loc, and so does Ỹ.
ii) can be proved similarly.

Corollary 33.4 If
  X_t = ∫_0^t ⟨h_s, dB_s⟩,
where B_t is a d-dimensional Brownian motion, then
  B̃_t = B_t − ∫_0^t h_s ds
is a d-dimensional Brownian motion on (Ω, F, P̃, F_t).

Proof: Note that
  ⟨B^i, X⟩_t = ∫_0^t h_s^i ds.
Hence B̃^i ∈ M̃^{2,c}_loc. As
  ⟨B̃^i, B̃^j⟩_t = ⟨B^i, B^j⟩_t = δ_{ij} t,
it follows from Theorem 32.1 that B̃_t is a d-dimensional Brownian motion.
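Girsanov's transformation can be checked by Monte Carlo. For X_t = θB_t (so h ≡ θ, d = 1), the density is M_T = exp(θB_T − ½θ²T), and under P̃ = M·P the law of B_T is N(θT, T); equivalently, E(f(B_T) M_T) = E(f(B_T + θT)) for bounded f. A minimal sketch (my own illustration, not from the text):

```python
import numpy as np

rng = np.random.default_rng(6)
n_paths, T, theta = 1_000_000, 1.0, 0.5
B_T = rng.normal(0.0, np.sqrt(T), n_paths)

# Radon-Nikodym density of P~ with respect to P on F_T
M_T = np.exp(theta * B_T - 0.5 * theta ** 2 * T)

f = np.cos                                  # any bounded test function
reweighted = float((f(B_T) * M_T).mean())   # E_P( f(B_T) M_T ) = E_{P~} f(B_T)
shifted = float(f(B_T + theta * T).mean())  # B_T under P~ has law N(theta*T, T)
print(M_T.mean(), reweighted, shifted)
```

The sample mean of M_T stays near 1 (the martingale/normalization property), and the reweighted expectation agrees with the drift-shifted one, which is exactly the content of the corollary for constant h.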
