Stochastic Calculus: An Introduction with Applications
Gregory F. Lawler
© 2014, Gregory F. Lawler
All rights reserved
Contents

2 Brownian motion
  2.1 Limits of sums of independent variables
  2.2 Multivariate normal distribution
  2.3 Limits of random walks
  2.4 Brownian motion
  2.5 Construction of Brownian motion
  2.6 Understanding Brownian motion
    2.6.1 Brownian motion as a continuous martingale
    2.6.2 Brownian motion as a Markov process
    2.6.3 Brownian motion as a Gaussian process
    2.6.4 Brownian motion as a self-similar process
  2.7 Computations for Brownian motion
  2.8 Quadratic variation
  2.9 Multidimensional Brownian motion
  2.10 Heat equation and generator
    2.10.1 One dimension
    2.10.2 Expected value at a future time
  2.11 Exercises

3 Stochastic integration
  3.1 What is stochastic calculus?
  3.2 Stochastic integral
    3.2.1 Review of Riemann integration
    3.2.2 Integration of simple processes
    3.2.3 Integration of continuous processes
  3.3 Itô's formula
  3.4 More versions of Itô's formula
  3.5 Diffusions
  3.6 Covariation and the product rule
  3.7 Several Brownian motions
  3.8 Exercises
We will discuss some of the applications to finance but our main focus
will be on the mathematics. Financial mathematics is a kind of applied
mathematics, and I will start by making some comments about the use of
mathematics in the real world. The general paradigm is as follows.
A mathematical model is made of some real-world phenomenon. Usually this model requires simplification and does not precisely describe the real situation. One hopes that models are robust in the sense that if the model is not very far from reality then its predictions will also be close to accurate.
The user of mathematics does not always need to know the details of
the mathematical analysis, but it is critical to understand the assumptions
in the model. No matter how precise or sophisticated the analysis is, if the
assumptions are bad, one cannot expect a good answer.
Chapter 1

Martingales in discrete time
f(y | x) = f(x, y) / f(x).

This is well defined provided that f(x) > 0, and if f(x) = 0, then x is an impossible value for X to take. We can write

E[Y | X = x] = ∫ y f(y | x) dy.

Averaging over the values of X,

E[E[Y | X]] = ∫∫ y f(x, y) dy dx = E[Y].
E[Y ] = E [E[Y | Fn ]] .
Let 1A denote the indicator function (or indicator random variable) associ-
ated to the event A,
1_A = 1 if A occurs, 0 if A does not occur.
E [Y 1A ] = E [E[Y | Fn ] 1A ] .
E[Y | Fn ] is Fn -measurable.
For every Fn -measurable event A,
E [E[Y | Fn ] 1A ] = E [Y 1A ] .
We have used different fonts for the E of conditional expectation and the
E of usual expectation in order to emphasize that the conditional expectation
is a random variable. However, most authors use the same font for both
leaving it up to the reader to determine which is being referred to.
E [Y 1A ] = E [E[Y | G] 1A ] .
E [Z1 1A ] = E [Z2 1A ]
Although the definition does not give an immediate way to calculate the
conditional expectation, in many cases one can compute it. We will give a
number of properties of the conditional expectation, most of which follow
quickly from the definition.
Proposition 1.1.1. Suppose X1 , X2 , . . . is a sequence of random variables
and Fn denotes the information at time n. The conditional expectation E[Y |
Fn ] satisfies the following properties.
If Y is Fn -measurable, then E[Y | Fn ] = Y .
If A is an Fn -measurable event, then E [E[Y | Fn ] 1A ] = E [Y 1A ]. In
particular,
E [E[Y | Fn ]] = E[Y ].
If Y is independent of F_n, then E[Y | F_n] = E[Y].
The proof of this proposition is not very difficult given our choice of definition for the conditional expectation. We will discuss only a couple of cases here, leaving the rest for the reader. To prove the linearity property, we know that a E[Y | F_n] + b E[Z | F_n] is an F_n-measurable random variable. Also, if A ∈ F_n, then linearity of expectation implies that
E[S_m² | F_m] = S_m²,

and hence,

E[S_n² | F_m] = S_m² + σ²(n − m).
Example 1.1.3. In the same setup as Example 1.1.1, let us also assume
that X1 , X2 , . . . are identically distributed. We will compute E[X1 | Sn ].
Note that the information contained in the one data point Sn is less than the
information contained in X1 , . . . , Xn . However, since the random variables
are identically distributed, it must be the case that

E[X_1 | S_n] = E[X_2 | S_n] = ··· = E[X_n | S_n].

Summing these n terms and using E[S_n | S_n] = S_n, we see that each term equals S_n/n. Therefore,

E[X_1 | S_n] = S_n / n.
It may be at first surprising that the answer does not depend on E[X1 ].
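This identity is easy to test numerically. The sketch below (illustrative Python, not part of the text; the Bernoulli(0.3) step distribution and the conditioning value s = 4 are arbitrary choices) estimates E[X_1 | S_n = s] by averaging X_1 over the trials on which S_n = s:

```python
import numpy as np

# Monte Carlo check of E[X1 | Sn] = Sn / n for an i.i.d. sequence.
rng = np.random.default_rng(0)
n, trials = 10, 200_000
X = (rng.random((trials, n)) < 0.3).astype(float)  # i.i.d. Bernoulli(0.3) rows
S = X.sum(axis=1)

s = 4                       # condition on the event {S_n = 4}
mask = S == s
est = X[mask, 0].mean()     # empirical E[X_1 | S_n = s]
print(est, s / n)           # the two numbers should be close
```

The identity holds for any identically distributed integrable sequence; the Bernoulli choice just makes the event {S_n = s} easy to condition on.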
1.2 Martingales
A martingale is a model of a fair game. Suppose X1 , X2 , . . . is a sequence of
random variables to which we associate the filtration {Fn } where Fn is the
information contained in X1 , . . . , Xn .
Therefore,

E[M_{n+1} | F_n] = E[S_{n+1}² − A_{n+1} | F_n] = S_n² + σ_{n+1}² − (A_n + σ_{n+1}²) = M_n.
We think of ΔM_n = M_n − M_{n−1} as either the change in the asset price or as the amount won in the game at time n. Negative values indicate drops in price or money lost in the game. The basic idea of stochastic integration is to allow one to change one's portfolio (in the asset viewpoint) or change one's bet (in the game viewpoint). However, we are not allowed to see the outcome before betting.
We make this precise in the next example.
Let us assume that for each n there is a number K_n < ∞ such that |B_n| ≤ K_n. We also assume that we cannot see the result of the nth game before betting. This last assumption can be expressed mathematically by saying that B_n is F_{n−1}-measurable. In other words, we can adjust our bet based on how well we have been doing. We claim that under these assumptions, W_n is a martingale with respect to F_n. It is clear that W_n is measurable with respect to F_n, and integrability follows from the estimate
E[|W_n|] ≤ Σ_{j=1}^n E[|B_j| |M_j − M_{j−1}|] ≤ Σ_{j=1}^n K_j (E[|M_j|] + E[|M_{j−1}|]) < ∞.
Also,
Therefore,
E[Wn+1 | Fn ] = Wn .
B_j = 2^{j−1} if X_1 = X_2 = ··· = X_{j−1} = −1,

E[W_n] = 1 · [1 − 2^{−n}] − [2^n − 1] · 2^{−n} = 0.
However, we will eventually win, which means that with probability one

W_∞ = lim_{n→∞} W_n = 1,

and

1 = E[W_∞] > E[W_0] = 0.
We have beaten the game (but it takes an infinite amount of time to guar-
antee it).
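The behavior of this strategy shows up clearly in a short simulation (an illustrative Python sketch, not from the text; the horizon n = 10 and the seed are arbitrary): at any fixed time the average winnings are near 0, although almost every run has already won 1.

```python
import numpy as np

# Doubling ("martingale") strategy: bet 2^(j-1) on the j-th fair game
# until the first win.  After n games, W_n = 1 if we have won at least
# once and W_n = 1 - 2^n if we have lost all n games.
rng = np.random.default_rng(1)
n_games, trials = 10, 100_000
X = rng.choice([-1, 1], size=(trials, n_games))    # fair-game outcomes
lost_all = (X == -1).all(axis=1)                   # never won in n games
W = np.where(lost_all, 1.0 - 2.0 ** n_games, 1.0)  # final winnings W_n
print(W.mean(), (W == 1.0).mean())   # near 0 and near 1 - 2^{-10}
```

The sample mean hovers near 0 (the game is fair at every fixed time) even though about 99.9% of runs end with W_n = 1, which is exactly the tension the text describes.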
E[M_n | F_m] ≥ M_m,

E[M_n | F_m] ≤ M_m,
time and then one bets 0 afterwards. Let T be the stopping time for the strategy. Then the winnings at time n are

M_0 + Σ_{j=1}^n B_j [M_j − M_{j−1}] = M_{n∧T},

and

E[M_{n∧T}] = E[M_0].
If T is bounded, that is, if there exists k < ∞ such that P{T ≤ k} = 1, then
The final conclusion (1.7) of the theorem holds since E[M_{n∧T}] = E[M_T] for n ≥ k. What if the stopping time T is not bounded but P{T < ∞} = 1?
Then, we cannot conclude (1.7) without further assumptions. To see this we
need only consider the martingale betting strategy of the previous section.
If we define

T = min{n : X_n = 1} = min{n : W_n = 1},

then with probability one T < ∞ and W_T = 1. Hence,
Often one does want to conclude (1.7) for unbounded stopping times, so
it is useful to give conditions under which it holds. Let us try to derive the
equality and see what conditions we need to impose. First, we will assume
that we stop with probability one, P{T < ∞} = 1, so that M_T makes sense. For every n < ∞,
we know that
In the martingale betting strategy example, this term did not cause a problem since W_T = 1 and hence E[|W_T|] < ∞.
lim_{n→∞} E[X_n] = 0.
Finally, in order to conclude (1.7) we will make the hypothesis that the
other term acts nicely.
Then,
E [MT ] = E [M0 ] .
Let us check that the martingale betting strategy does not satisfy the conditions of the theorem (it had better not, since it does not satisfy the conclusion!). In fact, it does not satisfy (1.8). For this strategy, if T > n, then we have lost n times and W_n = 1 − 2^n. Also, P{T > n} = 2^{−n}. Therefore,

lim_{n→∞} E[|W_n| 1{T > n}] = lim_{n→∞} (2^n − 1) 2^{−n} = 1 ≠ 0.
We need to show that (1.9) implies (1.8). If b > 0, then for every n,

E[|M_n| 1{|M_n| ≥ b, T > n}] ≤ (1/b) E[|M_{n∧T}|²] ≤ C/b.

Therefore,

E[|M_n| 1{T > n}] = E[|M_n| 1{T > n, |M_n| ≥ b}] + E[|M_n| 1{T > n, |M_n| < b}] ≤ C/b + b P{T > n}.

Hence,

limsup_{n→∞} E[|M_n| 1{T > n}] ≤ C/b + b lim_{n→∞} P{T > n} = C/b.

Since this holds for every b > 0, we get (1.8).
S_n = 1 + X_1 + ··· + X_n.
By solving, we get

P{M_T = K} = 1/K.

This relation is sometimes called the gambler's ruin estimate for the random walk. Note that

lim_{K→∞} P{M_T = K} = 0.
T = min{n : S_n = −J or S_n = K}.
In Exercise 1.13 it is shown that there exists C < ∞ such that for all n, E[M_{n∧T}²] ≤ C. Hence we can use Theorem 1.3.3 to conclude that

0 = E[M_0] = E[M_T] = E[S_T²] − E[T].

Moreover,

E[S_T²] = J² P{S_T = −J} + K² P{S_T = K} = J² · K/(J + K) + K² · J/(J + K) = JK.

Therefore,

E[T] = E[S_T²] = JK.
In particular, the expected amount of time for the random walker starting
at the origin to get distance K from the origin is K 2 .
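Both gambler's-ruin identities are easy to check by Monte Carlo (an illustrative sketch, not from the text; the parameters J = 3, K = 5 and the seed are arbitrary):

```python
import numpy as np

# Simple random walk from 0 stopped at -J or K.
# Checks P{S_T = K} = J/(J+K) and E[T] = J*K.
rng = np.random.default_rng(2)
J, K, trials = 3, 5, 20_000

hits_K, total_time = 0, 0
for _ in range(trials):
    s, t = 0, 0
    while -J < s < K:
        s += 1 if rng.random() < 0.5 else -1   # fair +-1 step
        t += 1
    hits_K += (s == K)
    total_time += t

print(hits_K / trials, J / (J + K))   # both near 3/8 = 0.375
print(total_time / trials, J * K)     # both near 15
```

With J = 3 and K = 5 the exact values are P{S_T = K} = 3/8 and E[T] = 15, and the empirical averages land close to both.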
Example 1.3.3. As in Example 1.3.2, let S_n = X_1 + ··· + X_n be simple random walk starting at 0. Let

T = min{n : S_n = 1}, T_J = min{n : S_n = 1 or S_n = −J}.

Note that T = lim_{J→∞} T_J and

P{T = ∞} = lim_{J→∞} P{S_{T_J} = −J} = lim_{J→∞} 1/(J + 1) = 0.

Therefore, P{T < ∞} = 1, although Example 1.3.2 shows that for every J,

E[T] ≥ E[T_J] = J,

and hence E[T] = ∞. Also, S_T = 1, so we do not have E[S_0] = E[S_T]. From this we can see that (1.8) and (1.9) are not satisfied by this example.
It does not follow from the theorem that E[M_∞] = E[M_0]. For example, the martingale betting strategy satisfies the conditions of the theorem since

E[|W_n|] = (1 − 2^{−n}) · 1 + (2^n − 1) · 2^{−n} ≤ 2.

However, W_∞ = 1 and W_0 = 0.
E[|M_n|] ≤ C < ∞

for all n. Suppose a < b are real numbers. We will show that it is impossible for the martingale to fluctuate infinitely often below a and above b. Define a sequence of stopping times by
B_n = 1 if S_j ≤ n − 1 < T_j,
B_n = 0 if T_j ≤ n − 1 < S_{j+1}.

In other words, every time the price drops below a we buy a unit of the asset and hold onto it until the price goes above b, at which time we sell. Let U_n denote the number of times by time n that we have seen a fluctuation; that is,

U_n = j if T_j ≤ n < T_{j+1}.
W_n ≥ U_n (b − a) − (M_n − a)^−, where x^− = max{−x, 0}.
R_0 = G_0 = 1, R_n + G_n = n + 2,

and let

M_n = R_n / (R_n + G_n) = R_n / (n + 2)
be the fraction of red balls at this time. Let Fn denote the information in
the data M1 , . . . , Mn , which one can check is the same as the information in
R1 , R2 , . . . , Rn . Note that the probability that a red ball is chosen at time
n depends only on the number (or fraction) of red balls in the urn before
choosing. It does not depend on what order the red and green balls were put
in. This is an example of the Markov property. This concept will appear a
number of times for us, so let us define it.
Yn+1 , Yn+2 , . . .
P{R_{n+1} = R_n + 1 | F_n} = 1 − P{R_{n+1} = R_n | F_n} = P{R_{n+1} = R_n + 1 | M_n} = R_n / (n + 2) = M_n.
We claim that M_n is a martingale with respect to F_n. To check this,

E[M_{n+1} | F_n] = E[M_{n+1} | M_n]
= [(R_n + 1)/(n + 3)] M_n + [R_n/(n + 3)] [1 − M_n]
= R_n(R_n + 1)/[(n + 2)(n + 3)] + (n + 2 − R_n)R_n/[(n + 2)(n + 3)]
= R_n(n + 3)/[(n + 2)(n + 3)] = M_n.
Since E[|M_n|] = E[M_n] = E[M_0] = 1/2, this martingale satisfies the conditions of the martingale convergence theorem. (In fact, the same argument shows that every martingale that stays nonnegative satisfies the conditions.) Hence, there exists a random variable M_∞ such that with probability one,

lim_{n→∞} M_n = M_∞.
It turns out that the random variable M_∞ is really random in the sense that it has a nontrivial distribution. In Exercise 1.11 you will show that for each n, the distribution of M_n is uniform on

{1/(n + 2), 2/(n + 2), ..., (n + 1)/(n + 2)},

and from this it is not hard to see that M_∞ has a uniform distribution on [0, 1]. You will also be asked to simulate this process to see what happens.
There is a lot of randomness in the first few draws to see what fraction of
red balls the urn will settle down to. However, for large n this ratio changes
very little; for example, the ratio after 2000 draws is very close to the ratio
after 4000 draws.
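A short simulation makes this settling down visible (an illustrative Python sketch along the lines of the simulation requested in the exercises; the seed is arbitrary):

```python
import numpy as np

# Polya's urn: start with 1 red and 1 green ball; at each draw, return the
# chosen ball together with a new ball of the same color.  The fraction of
# red balls M_n is recorded after each draw.
rng = np.random.default_rng(3)

def polya_fractions(draws, rng):
    red, total = 1, 2
    fracs = []
    for _ in range(draws):
        if rng.random() < red / total:   # a red ball is drawn w.p. M_n
            red += 1
        total += 1
        fracs.append(red / total)
    return fracs

f = polya_fractions(4000, rng)
print(f[1999], f[3999])   # ratio after 2000 draws vs. after 4000 draws
```

Running this repeatedly with different seeds shows large run-to-run variation in where the ratio settles (consistent with M_∞ being uniform on [0, 1]), but within a single run the ratio after 2000 draws is already very close to the ratio after 4000 draws.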
While Polya's urn seems like a toy model, it arises in a number of places. We will give an example from Bayesian statistics. Suppose that we perform independent trials of an experiment where the probability of success for each experiment is θ (such trials are called Bernoulli trials). Suppose that we do not know the value of θ, but want to try to deduce it by observing trials. Let X_1, X_2, ... be independent random variables with

P{X_j = 1} = 1 − P{X_j = 0} = θ.

The (strong) law of large numbers implies that with probability one,

lim_{n→∞} (X_1 + ··· + X_n)/n = θ.   (1.10)
Hence, if we were able to observe infinitely many trials, we could deduce θ exactly.

Clearly, we cannot deduce θ with 100% assurance if we see only a finite number of trials. Indeed, if 0 < θ < 1, there is always a chance that the first n trials will all be failures and there is a chance they will all be successes. The Bayesian approach to statistics is to assume that θ is a random variable with a certain prior distribution. As we observe the data we update to a posterior distribution. We will assume we know nothing initially about the value and choose the prior distribution to be the uniform distribution on [0, 1] with density

f_0(θ) = 1, 0 < θ < 1.
Suppose that after observing n trials, we have had S_n = X_1 + ··· + X_n successes. If we know θ, then the distribution of S_n is binomial,

P{S_n = k | θ} = C(n, k) θ^k (1 − θ)^{n−k},

where C(n, k) denotes the binomial coefficient.
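As a numeric sketch of the update (our own illustration, not taken from the text): with the uniform prior, the posterior density after k successes in n trials is proportional to θ^k (1 − θ)^{n−k}, and its mean works out to (k + 1)/(n + 2), the same expression as the Polya-urn fraction of red balls. The integration below checks this for one arbitrary choice of n and k:

```python
import numpy as np

# Posterior for theta with uniform prior: proportional to the likelihood
# theta^k (1-theta)^(n-k).  We normalize numerically and compute its mean.
n, k = 10, 7
theta = np.linspace(0.0, 1.0, 100_001)
dx = theta[1] - theta[0]
unnorm = theta**k * (1.0 - theta)**(n - k)        # prior (=1) times likelihood

Z = np.sum((unnorm[1:] + unnorm[:-1]) / 2) * dx   # trapezoid normalizer
post = unnorm / Z                                 # posterior density
integrand = theta * post
mean = np.sum((integrand[1:] + integrand[:-1]) / 2) * dx
print(mean, (k + 1) / (n + 2))                    # both approximately 8/12
```

The exact posterior is a Beta(k + 1, n − k + 1) distribution, whose mean is (k + 1)/(n + 2); the numeric integral reproduces this to high accuracy.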
Note that this condition is not as strong as (1.9). We do not require that there exists a C < ∞ such that E[M_n²] ≤ C for each n. Random variables X, Y are orthogonal if E[XY] = E[X] E[Y]. Independent random variables are orthogonal, but orthogonal random variables need not be independent. If X_1, ..., X_n are pairwise orthogonal random variables with mean zero, then E[X_j X_k] = 0 for j ≠ k, and by expanding the square we can see that

E[(X_1 + ··· + X_n)²] = Σ_{j=1}^n E[X_j²].
E[(M_{n+1} − M_n)(M_{m+1} − M_m)] = 0.

Indeed, if m < n,

E[(M_{n+1} − M_n)(M_{m+1} − M_m) | F_n] = (M_{m+1} − M_m) E[M_{n+1} − M_n | F_n] = 0.

Hence,

E[(M_{n+1} − M_n)(M_{m+1} − M_m)] = E[E[(M_{n+1} − M_n)(M_{m+1} − M_m) | F_n]] = 0.
Also, if we set M_{−1} = 0,

M_n² = [M_0 + Σ_{j=1}^n (M_j − M_{j−1})]²
= M_0² + Σ_{j=1}^n (M_j − M_{j−1})² + Σ_{j≠k} (M_j − M_{j−1})(M_k − M_{k−1}).
(X, Y ) = E[XY ].
This is immediate.
Variance rule:

Var[Σ_{j=1}^n J_j ΔS_j] = E[(Σ_{j=1}^n J_j ΔS_j)²] = σ² Σ_{j=1}^n E[J_j²].
Ȳ_n = max{Y_0, Y_1, ..., Y_n}.

P{Ȳ_n ≥ a} ≤ a^{−1} E[Y_n].

Σ_{k=0}^n E[Y_k 1_{A_k}] = E[Y_T 1{Ȳ_n ≥ a}] ≥ a P{Ȳ_n ≥ a}.

P{M̄_n ≥ a} ≤ a^{−2} E[M_n²].
1.8 Exercises
Exercise 1.1. Suppose we roll two dice, a red and a green one, and let X
be the value on the red die and Y the value on the green die. Let Z = XY .
1. Let W = E(Z | X). What are the possible values for W ? Give the
distribution of W .
Exercise 1.2. Suppose we roll two dice, a red and a green one, and let X
be the value on the red die and Y the value on the green die. Let Z = X/Y .
1. Find E[(X + 2Y )2 | X].
4. Let W = E[Z | X]. What are the possible values for W ? Give the
distribution of W .
Exercise 1.3. Suppose X1 , X2 , . . . are independent random variables with
P{X_j = 2} = 1 − P{X_j = −1} = 1/3.
Let Sn = X1 + + Xn and let Fn denote the information in X1 , . . . , Xn .
1. Find E[Sn ], E[Sn2 ], E[Sn3 ].
2. If m < n, find
E[sin S_n | S_n²].
Mn = S n .
Exercise 1.7. Suppose two people want to play a game in which person A
has probability 2/3 of winning. However, the only thing that they have is a
fair coin which they can flip as many times as they want. They wish to find
a method that requires only a finite number of coin flips.
Exercise 1.8. Repeat the last exercise with 2/3 replaced by 1/π.
2. Write a short program that will simulate this urn. Each time you
run the program note the fraction of red balls after 600 draws and
after 1200 draws. Compare the two fractions. Then, repeat this twenty
times.
3. What is E[Wn2 ]?
P{T ≤ j + K | T > j} ≥ 2^{−K}.

Show that there exists c < ∞, β > 0 such that for all j,

P{T > j} ≤ c e^{−βj}.
Let M_n = S_n² − n. Show there exists C < ∞ such that for all n,

E[M_{n∧T}²] ≤ C.

Show that there exists C < ∞ such that for all n, E[S_n²] ≤ C.
S_∞ = lim_{n→∞} S_n

exists. Show that

E[S_∞] = 0, Var[S_∞] = Σ_{n=1}^∞ σ_n².
Exercise 1.15. Suppose Y is a random variable and φ is a convex function, that is, if 0 ≤ λ ≤ 1,

φ(λx + (1 − λ)y) ≤ λ φ(x) + (1 − λ) φ(y).
Chapter 2

Brownian motion
X_1 + X_2 + ··· + X_n,

Z_n = [(X_1 + ··· + X_n) − nμ] / (σ√n).
We let Φ denote the standard normal distribution function,

Φ(b) = ∫_{−∞}^b (1/√(2π)) e^{−x²/2} dx.
While this function cannot be written down explicitly, the numerical values
are easily accessible in tables and computer software packages.
φ(t) = 1 + i E[X_j] t + [(i)²/2] E[X_j²] t² + o(t²) = 1 − t²/2 + o(t²),

where o(t²) denotes a function such that |o(t²)|/t² → 0 as t → 0. Using the independence of the X_j, we see that the characteristic function of Z_n is

φ_{Z_n}(t) = [φ(t/√n)]^n = [1 − t²/(2n) + o(t²/n)]^n → e^{−t²/2}.
The right-hand side is the characteristic function of a standard nor-
mal random variable.
and note that for each n, E[Y_n] = λ. Recall that a random variable Y has a Poisson distribution with mean λ if for each nonnegative integer k,

P{Y = k} = e^{−λ} λ^k / k!.

Theorem 2.1.2 (Convergence to the Poisson distribution). As n → ∞, the distribution of Y_n approaches a Poisson distribution with mean λ. More precisely, for every nonnegative integer k,

lim_{n→∞} P{Y_n = k} = e^{−λ} λ^k / k!.
Proof. For each n, Y_n has a binomial distribution with parameters n and λ/n, and hence

lim_{n→∞} P{Y_n = k} = lim_{n→∞} C(n, k) (λ/n)^k (1 − λ/n)^{n−k}
= (λ^k / k!) lim_{n→∞} [n(n−1)···(n−k+1) / n^k] (1 − λ/n)^n (1 − λ/n)^{−k}.

Since

lim_{n→∞} [n(n−1)···(n−k+1) / n^k] (1 − λ/n)^{−k} = 1,

and (1 − λ/n)^n → e^{−λ}, the result follows.
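The convergence can also be checked numerically (an illustrative sketch, not from the text; λ = 2 and k = 3 are arbitrary choices):

```python
import math

# Binomial(n, lambda/n) probabilities approach e^{-lambda} lambda^k / k!.
lam, k = 2.0, 3

def binom_pmf(n, p, k):
    return math.comb(n, k) * p**k * (1.0 - p)**(n - k)

poisson = math.exp(-lam) * lam**k / math.factorial(k)
for n in (10, 100, 10_000):
    print(n, binom_pmf(n, lam / n, k), poisson)
```

The binomial probability approaches the Poisson value at rate roughly 1/n, so even n = 100 already gives several correct digits.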
In the Poisson case, the limit distribution will not have continuous paths
but rather will be a jump process. The prototypical case is the Poisson process with intensity λ. In this case, N_t denotes the number of occurrences of an event by time t. The function t ↦ N_t takes on nonnegative integer values
and the jumps are always of size +1. It satisfies the following conditions.
where

X = (X_1, X_2, ..., X_n)^T, Z = (Z_1, Z_2, ..., Z_m)^T,
and A is the n × m matrix with entries a_{jk}. Each X_j is a normal random variable with mean zero and variance

E[X_j²] = a_{j1}² + ··· + a_{jm}².

More generally, the covariance of X_j and X_k is given by

Cov(X_j, X_k) = E[X_j X_k] = Σ_{l=1}^m a_{jl} a_{kl}.
(If the inequality is strict, that is, if (2.1) is > 0 for all b = (b_1, ..., b_n) ≠ (0, ..., 0), then the matrix is called positive definite.) The inequality (2.1) can be derived by noting that the left-hand side is the same as

E[(b_1 X_1 + ··· + b_n X_n)²].
f(x_1, ..., x_n) = f(x) = [1 / ((2π)^{n/2} √(det Σ))] exp[−(x − m) Σ^{−1} (x − m)^T / 2].

Sometimes this density is used as a definition of a joint normal. The formula for the density looks messy, but note that if n = 1, m = m, Σ = [σ²], then the right-hand side is the density of a N(m, σ²) random variable.
Hence the covariance matrix for (Z, W ) is the identity matrix and this is
the covariance matrix for independent N (0, 1) random variables.
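This is also how joint normal vectors are generated in practice: set X = AZ for a vector Z of independent N(0, 1) variables, and the covariance matrix of X is A Aᵀ, matching the formula Cov(X_j, X_k) = Σ_l a_{jl} a_{kl} above. A numeric sketch (the mixing matrix A here is an arbitrary choice):

```python
import numpy as np

# Generate a joint normal pair as X = A Z and compare the sample covariance
# with A @ A.T.
rng = np.random.default_rng(4)
A = np.array([[1.0, 0.0],
              [0.5, 2.0]])                  # arbitrary 2x2 mixing matrix
Z = rng.standard_normal((2, 500_000))       # independent N(0,1) columns
X = A @ Z

print(np.cov(X))      # approximately [[1, 0.5], [0.5, 4.25]]
print(A @ A.T)
```

Any covariance matrix can be produced this way by taking A to be, for instance, a Cholesky factor of the desired matrix.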
S_n = X_1 + ··· + X_n

Δt, 2Δt, 3Δt, ...

We also know from the central limit theorem that if N is large, then the distribution of

(X_1 + ··· + X_N) / √N

is approximately N(0, 1).
Brownian motion could be defined formally as the limit of random walks,
but there are subtleties in describing the kind of limit. In the next section, we
define it directly using the idea of continuous random motion. However,
the random walk intuition is useful to retain.
random continuous motion. Let Bt = B(t) be the value at time t. For each
t, Bt is a random variable.1 A collection of random variables indexed by time
is called a stochastic process. We can view the process in two different ways:
For each t, there is a random variable Bt , and there are correlations
between the values at different times.
The function t ↦ B(t) is a random function. In other words, it is a random variable whose value is a function.
There are three major assumptions about the random variables B_t.

Stationary increments. If s < t, then the distribution of B_t − B_s is the same as that of B_{t−s} − B_0.

Independent increments. If s < t, the random variable B_t − B_s is independent of the values B_r for r ≤ s.

Continuous paths. The function t ↦ B_t is a continuous function of t.
We often assume B0 = 0 for convenience, but we can also take other initial
conditions. All of the assumptions above are very reasonable for a model of
random continuous motion. However, it is not obvious that these are enough
assumptions to characterize our process uniquely. It turns out that they do
up to two parameters. One can prove (see Theorem 6.8.3) that if B_t is a process satisfying the three conditions above, then the distribution of B_t for each t must be normal. Suppose B_t is such a process, and let m, σ² be the mean and variance of B_1. If s < t, then independent, identically distributed increments imply that

E[B_t] = E[B_s] + E[B_t − B_s] = E[B_s] + E[B_{t−s}],
Var[B_t] = Var[B_s] + Var[B_t − B_s] = Var[B_s] + Var[B_{t−s}].

Using this relation, we can see that E[B_t] = tm, Var[B_t] = tσ².
we have only shown that if a process exists, then the increments must have
a normal distribution. We will show that such a process exists. It will be
convenient to put the normal distribution in the definition.
¹ In this book and usually in the financial world, the terms Brownian motion and
Wiener process are synonymous. However, in the scientific world, the word Brownian
motion is often used for a physical process for which what we will describe is one possible
mathematical model. The term Wiener process always refers to the model we define. The
letter Wt is another standard notation for Brownian motion/Wiener process and is more
commonly used in financial literature. We will use both Bt and Wt later in the book when
we need two notations.
B0 = 0.
Y_t = σ B_t + mt,
Indeed, one just checks that it satisfies the conditions above. Hence, in order
to establish the existence of Brownian motion, it suffices to construct a
standard Brownian motion.
There is a mathematical challenge in studying stochastic processes in-
dexed by continuous time. The problem is that the set of positive real num-
bers is uncountable, that is, the elements cannot be enumerated t1 , t2 , . . ..
The major axiom of probability theory is the fact that if A_1, A_2, ... is a countable sequence of disjoint events, then

P[∪_{n=1}^∞ A_n] = Σ_{n=1}^∞ P[A_n].
This rule does not hold for uncountable unions. An example that we have
all had to deal with arises with continuous random variables. Suppose, for
instance, that Z has a N(0, 1) distribution. Then for each x ∈ R,
P{Z = x} = 0.
However,

1 = P{Z ∈ R} = P[∪_{x∈R} A_x],

where A_x denotes the event {Z = x}. The events A_x are disjoint, each with probability zero, but it is not the case that

P[∪_{x∈R} A_x] = Σ_{x∈R} P(A_x) = 0.
then we know the value at every t. Indeed, we need only find a sequence of dyadic rationals t_n that converge to t, and let B_t = lim_{n→∞} B_{t_n}.
The next section shows that one can construct a Brownian motion. The
reader can skip this section and just have faith that such a process exists.
{Z_q : q ∈ D}

is indexed by D.
We will use the independent random variables {Z_q} to define the Brownian motion B_q, q ∈ D. We start by defining B_0, B_1, and then B_{1/2}, and then B_{1/4} and B_{3/4}, and so forth, by always subdividing our intervals. We start with B_0 = 0 and we let B_1 = Z_1, which is clearly N(0, 1). We then define

B_{1/2} = B_1/2 + Z_{1/2}/2,

and hence

B_1 − B_{1/2} = B_1/2 − Z_{1/2}/2.
We think of the definition of B1/2 as being E[B1/2 | B1 ] plus
some independent randomness. Using Proposition 2.2.1, we can see
that B_{1/2} and B_1 − B_{1/2} are independent random variables, each N(0, 1/2). We continue this splitting. If q = (2k + 1)/2^{n+1} ∈ D_{n+1} \ D_n, we define

B_q = B_{k2^{−n}} + [B_{(k+1)2^{−n}} − B_{k2^{−n}}]/2 + Z_q / 2^{(n+2)/2}.
This formula looks a little complicated, but we can again view this
as
{B_{k2^{−n}} − B_{(k−1)2^{−n}} : k = 1, ..., 2^n}
K_n = sup{|B_s − B_t| : s, t ∈ D, |s − t| ≤ 2^{−n}},

and for every α < 1/2,

lim_{n→∞} 2^{αn} K_n = 0.   (2.2)

In particular, K_n → 0.
In order to prove this proposition, it is easier to consider another
sequence of random variables
J_n = max_{j=1,...,2^n} Y(j, n)
which implies (2.2). To get our estimate, we will need the following
lemma which is a form of the reflection principle for Brownian
motion.
Y_n = max{|B_q| : q ∈ D_n}.

Then

P{Y > a} = lim_{n→∞} P{Y_n > a},

and hence it suffices to prove the inequality for each n. Fix n and let A_k be the event that k is the first index with |B_{k2^{−n}}| > a, so that

{Y_n > a} = ∪_{k=1}^{2^n} A_k.
B_0, B_{Δt}, B_{2Δt}, B_{3Δt}, ...

The increment B_{(k+1)Δt} − B_{kΔt} is a normal random variable with mean 0 and variance Δt. If N_0, N_1, N_2, ... denote independent N(0, 1) random variables (which can be generated on a computer), we set

B_{(k+1)Δt} = B_{kΔt} + √Δt · N_k.
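This recipe takes only a few lines on a computer (an illustrative sketch; the step size, horizon, and seed are arbitrary choices):

```python
import numpy as np

# Simulate standard Brownian motion on a grid using
# B_{(k+1)dt} = B_{k dt} + sqrt(dt) * N_k with independent N(0,1) steps.
rng = np.random.default_rng(5)
dt, n_steps = 0.001, 1_000            # grid on [0, 1]
increments = np.sqrt(dt) * rng.standard_normal(n_steps)
B = np.concatenate(([0.0], np.cumsum(increments)))   # one path, B_0 = 0

# Many independent endpoints to check Var[B_1] = 1.
B1 = (np.sqrt(dt) * rng.standard_normal((10_000, n_steps))).sum(axis=1)
print(B.shape, B1.var())              # 1001 grid values; variance near 1
```

Each simulated path is a random continuous-looking polygonal function, and across many paths the endpoint B_1 has sample variance close to 1, as the definition requires.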
|f(s) − f(t)| ≤ C |s − t|^α.

Theorem 2.6.2. With probability one, for all α < 1/2, B_t is Hölder continuous of order α but it is not Hölder continuous of order 1/2.
Hence, we could find a positive integer M < ∞ such that for all sufficiently large integers n, there exists k ≤ n such that Y_{k,n} ≤ M/n, where

Y_{k,n} = max{ |B((k+1)/n) − B(k/n)|, |B((k+2)/n) − B((k+1)/n)|, |B((k+3)/n) − B((k+2)/n)| },

and hence,

P{Y_n ≤ M/n} ≤ Σ_{k=0}^{n−1} P{Y_{k,n} ≤ M/n} ≤ c M³ / n^{1/2} → 0.
P[∪_{M=1}^∞ A_M] = 0.

But our first remark shows that the event that B_t is differentiable at some point is contained in ∪_M A_M.
Theorem 2.6.2 is a restatement of (2.2).
Often we will have more information at time t than the values of the
Brownian motion so it is useful to extend our definition of Brownian motion.
We say that Bt is Brownian motion with respect to the filtration {Ft } if each
Bt is Ft -measurable and Bt satisfies the conditions to be a Brownian motion
with the third condition being replaced by
E(Y | Fs ) = Ms
Ms = E(Y | Fs ).
52 CHAPTER 2. BROWNIAN MOTION
Indeed, if we define Ms as above and r < s, then the tower property for
conditional expectation implies that
M_t = N_t − λt.
Then using the fact that the increments are independent we see that for
s < t,
Y_s = B_{t+s}, 0 ≤ s < ∞,

B̃_s = B_{t+s} − B_t,
(X_{t_1}, ..., X_{t_n})

has a joint normal distribution. Recall that to describe a joint normal distribution one needs only give the means and the covariances. Hence the finite-dimensional distributions of a Gaussian process are determined by the numbers

m_t = E[X_t], Γ_{st} = Cov(X_s, X_t).

If B_t is a standard Brownian motion and t_1 < t_2 < ··· < t_n, then we can write B_{t_1}, ..., B_{t_n} as linear combinations of the independent standard normal random variables

Z_j = (B_{t_j} − B_{t_{j−1}}) / √(t_j − t_{j−1}), j = 1, ..., n.
Hence B_t is a Gaussian process with mean zero. If s < t, Cov(B_s, B_t) = Cov(B_s, B_s) + Cov(B_s, B_t − B_s) = s.
The events {B_1 > 0} and {B_2 > 0} are not independent; we would expect them to be positively correlated. We compute by considering the possibilities at time 1,

P{B_1 > 0, B_2 > 0} = ∫_0^∞ P{B_2 > 0 | B_1 = x} dP{B_1 = x}
= ∫_0^∞ P{B_2 − B_1 > −x} (1/√(2π)) e^{−x²/2} dx
= ∫_0^∞ ∫_{−x}^∞ (1/(2π)) e^{−(x²+y²)/2} dy dx
= ∫_0^∞ ∫_{−π/4}^{π/2} (e^{−r²/2}/(2π)) r dθ dr = 3/8.

One needs to review polar coordinates to do the fourth equality. Note that 3/8 > 1/4 = P{B_1 > 0} P{B_2 > 0}, which confirms our intuition that the events are positively correlated.
For more complicated calculations, we need to use the strong Markov property. We say that a random variable T taking values in [0, ∞] is a stopping time (with respect to the filtration {F_t}) if for each t, the event {T ≤ t} is F_t-measurable. In other words, the decision to stop can use the information up to time t but cannot use information about the future values of the Brownian motion.
If x ∈ R and

T = min{t : B_t = x},

then T is a stopping time.

S ∧ T = min{S, T} and S ∨ T = max{S, T}

{B_t : 0 ≤ t ≤ T}.
Let us apply this theorem to prove a very useful tool for computing probabilities.

Proposition 2.7.2 (Reflection Principle). If B_t is a standard Brownian motion with B_0 = 0, then for every a > 0,

P{ max_{0≤s≤t} B_s ≥ a } = 2 P{B_t > a} = 2 [1 − Φ(a/√t)].

T_a = min{s : B_s ≥ a} = min{s : B_s = a}.

The second equality uses the fact that P{T_a = t} ≤ P{B_t = a} = 0. Since B_{T_a} = a,

This gives the first equality of the proposition and the second follows from

P{B_t > a} = P{B_1 > a/√t} = 1 − Φ(a/√t).
Example 2.7.1. Let a > 0 and let T_a = inf{t : B_t = a}. The random variable T_a is called a passage time. We will find the density of T_a. To do this, we first find its distribution function

F(t) = P{T_a ≤ t} = P{ max_{0≤s≤t} B_s ≥ a } = 2 [1 − Φ(a/√t)].
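The reflection-principle formula can be checked by simulating paths on a fine grid (an illustrative sketch, not from the text; a = 1, t = 1, and the grid are arbitrary choices). Note that the maximum over a discrete grid slightly undershoots the continuous running maximum, so the estimate sits a little below the exact value:

```python
import numpy as np
from math import erf, sqrt

# Monte Carlo check of P{ max_{0<=s<=t} B_s >= a } = 2 [1 - Phi(a/sqrt(t))].
rng = np.random.default_rng(6)
a, t, n_steps, paths = 1.0, 1.0, 1_000, 20_000
dt = t / n_steps
B = np.zeros(paths)       # current value of each path
M = np.zeros(paths)       # running maximum of each path
for _ in range(n_steps):
    B += sqrt(dt) * rng.standard_normal(paths)
    np.maximum(M, B, out=M)
estimate = (M >= a).mean()

Phi = lambda x: 0.5 * (1.0 + erf(x / sqrt(2.0)))
exact = 2.0 * (1.0 - Phi(a / sqrt(t)))
print(estimate, exact)    # estimate slightly below exact = 0.3173...
```

Refining the grid shrinks the discretization gap; with 1000 steps the two values already agree to a couple of decimal places.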
P{ max_{0≤s≤t} B_s ≥ r } = 2 P{B_t ≥ r} = 2 [1 − Φ(r/√t)].
{T < t} = ∪_{n=1}^∞ {T ≤ t − 1/n} ∈ F_t.
If T is a stopping time, the σ-algebra F_T is defined to be the set of events A ∈ F such that for each t, A ∩ {T ≤ t} ∈ F_t. (It is not hard to show that this is a σ-algebra.) We think of F_T as the information available up to time T. On {T < ∞}, let

Y_t = B_{T+t} − B_T,
where

Y_j = Y_{j,n} = [B(j/n) − B((j−1)/n)]² / (1/n).

E[Y_j] = E[Z²] = 1, E[Y_j²] = E[Z⁴] = 3.

(One can use integration by parts to calculate E[Z⁴] or one could just look it up somewhere.) Hence Var[Y_j] = E[Y_j²] − E[Y_j]² = 2, and

E[Q_n] = (1/n) Σ_{j=1}^n E[Y_j] = 1, Var[Q_n] = (1/n²) Σ_{j=1}^n Var[Y_j] = 2/n.
Q_n(t) = Σ_{j≤tn} [B(j/n) − B((j−1)/n)]².

⟨X⟩_t = lim_{n→∞} Σ_{j≤tn} [X(j/n) − X((j−1)/n)]²,
Σ_{j≤tn} [W(j/n) − W((j−1)/n)]² = σ² Σ [B(j/n) − B((j−1)/n)]²
+ (2σm/n) Σ [B(j/n) − B((j−1)/n)] + Σ m²/n²,

where in each case the sum is over j ≤ tn. As n → ∞,

σ² Σ_{j≤tn} [B(j/n) − B((j−1)/n)]² → σ² ⟨B⟩_t = σ² t,

(2σm/n) Σ_{j≤tn} [B(j/n) − B((j−1)/n)] = (2σm/n) B_{⌊tn⌋/n} → 0,

Σ_{j≤tn} m²/n² ≤ tn m²/n² → 0.
We have established the following.
The important facts are that the quadratic variation is not random and
that it depends on the variance but not on the mean. It may seem silly at
this point to give a name and notation to a quantity which is almost trivial
for Brownian motion, but in the next chapter we will deal with processes
for which the quadratic variation is not just a linear function of time.
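A numeric sketch of these facts (our own illustration; σ, m, t, and the grid are arbitrary): the sum of squared increments of W_t = σB_t + mt over [0, t] is close to σ²t, with no visible contribution from the drift.

```python
import numpy as np

# Quadratic variation check: sum of squared increments over [0, t].
rng = np.random.default_rng(7)
t, n, sigma, m = 2.0, 200_000, 1.5, 3.0
dt = t / n
dB = np.sqrt(dt) * rng.standard_normal(n)   # Brownian increments
dW = sigma * dB + m * dt                    # increments of W = sigma*B + m*t

print(np.sum(dB**2))    # approximately t = 2
print(np.sum(dW**2))    # approximately sigma^2 * t = 4.5, drift invisible
```

The fluctuation of the first sum around t has standard deviation √(2t²/n), which is tiny for n = 200,000, and the drift contributes only terms of order 1/n in total.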
Var[Q(t; π)] = Σ_{j=1}^n Var[(B(t_j) − B(t_{j−1}))²]
= 2 Σ_{j=1}^n (t_j − t_{j−1})²
≤ 2 ‖π‖ Σ_{j=1}^n (t_j − t_{j−1}) = 2 ‖π‖ t.
and hence by the Borel–Cantelli lemma, with probability one, for all n sufficiently large,

|Q(t; π_n) − t| ≤ 1/k.
It is important to note the order of the quantifiers in this theorem. Let t = 1. The theorem states that for every sequence {π_n} satisfying (2.5), Q(1; π_n) → 1. The event of measure zero on which convergence does not hold depends on the sequence of partitions.
Q_n(t) = Σ_{j < t2^n} [B((j+1)/2^n) − B(j/2^n)]².

Since the dyadic rationals are countable, the theorem implies that with probability one for every dyadic rational t, Q_n(t) → t. However, since t ↦ Q_n(t) is increasing, we can conclude that with probability one, for every t ≥ 0, Q_n(t) → t.
B0 = 0.
As in the quadratic variation, the drift does not contribute to the covariation. We state the following result which is proved in the same way as for quadratic variation.

Theorem 2.9.1. If B_t is a d-dimensional Brownian motion with drift m and covariance matrix Σ, then

⟨B^i, B^k⟩_t = Σ_{ik} t.

In particular, for a standard d-dimensional Brownian motion,

⟨B^i, B^k⟩_t = 0, i ≠ k.
to first observe the position at time s and then to consider what happens in the next time interval of length t. This leads to the Chapman–Kolmogorov equation

p_{s+t}(x) = ∫ p_s(y) p_t(x − y) dy.   (2.7)
(1/2) [∂_x p_t(x) − ∂_x p_t(x − Δx)] / Δx,

and now we have a difference quotient for the first derivative in x, in which case the limit should be

(1/2) ∂_{xx} p_t(x).
Another, essentially equivalent, method to evaluate the limit is to write $f(x) = p_t(x)$ and expand in a Taylor series about $x$,
$$f(x + \epsilon) = f(x) + f'(x)\,\epsilon + \frac{1}{2} f''(x)\,\epsilon^2 + o(\epsilon^2),$$
where $o(\epsilon^2)$ denotes a term such that
$$\lim_{\epsilon \to 0} \frac{o(\epsilon^2)}{\epsilon^2} = 0,$$
and hence the limit of the right-hand side of (2.8) is $\partial_{xx} p_t(x)/2$. We have derived the heat equation
$$\partial_t p_t(x) = \frac{1}{2}\, \partial_{xx} p_t(x).$$
While we have been a bit sketchy on details, one could start with this equa-
tion and note that pt as defined in (2.6) satisfies this. This is the solution
given that $B_0 = 0$, that is, when the initial density $p_0(x)$ is the delta function at 0. (The delta function, written $\delta(\cdot)$, is the probability density of the probability distribution that gives measure one to the point 0. This is not really a density, but informally we write
$$\delta(0) = \infty, \qquad \delta(x) = 0 \;\; (x \neq 0), \qquad \int \delta(x)\, dx = 1.$$
These last equations do not make mathematical sense, but they give a work-
able heuristic definition.)
If the Brownian motion has variance $\sigma^2$, one binomial approximation is
$$P\{B_{t+\Delta t} = B_t + \Delta x\} = P\{B_{t+\Delta t} = B_t - \Delta x\} = \frac{1}{2},$$
where $\Delta x = \sigma \sqrt{\Delta t}$. The factor $\sigma$ is put in so that
We can use the same argument to derive that this density should satisfy the heat equation
$$\partial_t p_t(x) = \frac{\sigma^2}{2}\, \partial_{xx} p_t(x).$$
The coefficient $\sigma^2$ (or in some texts $\sigma^2/2$) is referred to as the diffusion coefficient. One can check that a solution to this equation is given by
$$p_t(x) = \frac{1}{\sigma\sqrt{2\pi t}}\, \exp\left\{ -\frac{x^2}{2\sigma^2 t} \right\}.$$
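One can check this numerically as well. The sketch below (our own; the parameter values are arbitrary) evaluates the density above and verifies that a finite-difference residual of the heat equation is essentially zero:

```python
import math

def p(t, x, sig=1.3):
    """Gaussian transition density with diffusion coefficient sig**2."""
    return math.exp(-x * x / (2.0 * sig * sig * t)) / (sig * math.sqrt(2.0 * math.pi * t))

def heat_residual(t, x, sig=1.3, h=1e-4):
    """Finite-difference estimate of d/dt p - (sig^2/2) d^2/dx^2 p."""
    dt = (p(t + h, x, sig) - p(t - h, x, sig)) / (2.0 * h)
    dxx = (p(t, x + h, sig) - 2.0 * p(t, x, sig) + p(t, x - h, sig)) / (h * h)
    return dt - 0.5 * sig * sig * dxx
```

The residual is at the level of the discretization error, far below the size of either term.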
When the Brownian motion has drift $m$, the equation gets another term. To see what the term should look like, let us first consider the case of deterministic linear motion, that is, motion with drift $m$ but no variance. Then if $p_t(x)$ denotes the density at $x$ at time $t$, we get the relationship
$$\partial_t p_t(x) = -m\, \partial_x p_t(x).$$
For Brownian motion with drift $m$ and variance $\sigma^2$, one can take
$$p_t(x) = \frac{1}{\sigma\sqrt{2\pi t}}\, \exp\left\{ -\frac{(x - mt)^2}{2\sigma^2 t} \right\},$$
and show that it satisfies the equation with both drift and diffusion terms,
$$\partial_t p_t(x) = -m\, \partial_x p_t(x) + \frac{\sigma^2}{2}\, \partial_{xx} p_t(x).$$
Before summarizing, we will change the notation slightly. Let pt (x, y)
denote the density of Bt given B0 = x. Under this notation pt (x) = pt (0, x).
$$L_x^* f_t(x, y) = -m\, \partial_x f_t(x, y) + \frac{\sigma^2}{2}\, \partial_{xx} f_t(x, y).$$
Theorem 2.10.1. Suppose $B_t$ is a standard Brownian motion with drift $m$ and variance $\sigma^2$. Then the transition density $p_t(x, y)$ satisfies the heat equation
$$\partial_t p_t(x, y) = L_y^*\, p_t(x, y)$$
with initial condition $p_0(x, \cdot) = \delta_x(\cdot)$. Here $L^*$ is the operator on functions
$$L^* f(y) = -m\, f'(y) + \frac{\sigma^2}{2}\, f''(y).$$
We think of $p_t(x, y)$ as the probability of being at $y$ at time $t$ given that $B_0 = x$. For driftless Brownian motion, this is the same as the probability of being at $x$ given that one was at $y$. However, the reversal of a Brownian motion with drift $m$ should be a Brownian motion with drift $-m$. This gives the following.
$$Lf(x) = m\, f'(x) + \frac{\sigma^2}{2}\, f''(x).$$
The operator $L$ will be more important to us than the operator $L^*$, which is why we give it the simpler notation.
Let $H$ denote the real Hilbert space $L^2(\mathbb{R})$ with inner product
$$(f, g) = \int f(x)\, g(x)\, dx.$$
Then $L^*$ is the adjoint of $L$,
$$(L^* f, g) = (f, Lg).$$
One can verify this using the following relations that are obtained by integration by parts:
$$\int f(x)\, g'(x)\, dx = -\int f'(x)\, g(x)\, dx,$$
$$\int f(x)\, g''(x)\, dx = \int f''(x)\, g(x)\, dx.$$
Suppose $B_0$ has an initial density $f$. Then the density of $B_t$ is given by
$$f_t(y) = P_t^* f(y) := \int f(x)\, p_t(x, y)\, dx.$$
It satisfies
$$\partial_t f_t(y) = L_y^*\, f_t(y)$$
with initial condition $f_0(y) = f(y)$. We can write the heat equation as a derivative for operators,
$$\partial_t P_t^* = L^*\, P_t^*.$$
Let $\phi(t, x)$ be the expected value of $f(B_t)$ given that $B_0 = x$. We will write this as
$$\phi(t, x) = E^x\left[f(B_t)\right] = E\left[f(B_t) \mid B_0 = x\right].$$
Then
$$\phi(t, x) = \int f(y)\, p_t(x, y)\, dy.$$
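The representation of $\phi$ as an integral against the transition density can be checked by Monte Carlo (a sketch of ours, not the author's; the sample size is arbitrary):

```python
import math
import random

def phi_mc(t, x, f, n=200000, seed=1):
    """Monte Carlo estimate of phi(t, x) = E[f(B_t) | B_0 = x]; given
    B_0 = x, B_t has the Normal(x, t) density p_t(x, .)."""
    rng = random.Random(seed)
    s = math.sqrt(t)
    return sum(f(x + s * rng.gauss(0.0, 1.0)) for _ in range(n)) / n
```

For instance, with $f(y) = y^2$ the exact value is $x^2 + t$, and the estimate matches to a few parts in a thousand.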
The generator $L$ can be written as
$$Lf(x) = \lim_{t \downarrow 0} \frac{P_t f(x) - f(x)}{t},$$
where
$$P_t f(x) = E^x\left[f(X_t)\right].$$
then $\phi$ satisfies
$$\partial_t \phi(t, x) = L_x\, \phi(t, x), \qquad t > 0,$$
$$f(\theta) = f(0) + \sum_{j=1}^d b_j\, \theta_j + \frac{1}{2} \sum_{j=1}^d \sum_{k=1}^d a_{jk}\, \theta_j\, \theta_k + o(|\theta|^2),$$
In particular,
$$f(B_{\Delta t}) - f(B_0) = \sum_{j=1}^d b_j\, B^j_{\Delta t} + \frac{1}{2} \sum_{j=1}^d \sum_{k=1}^d a_{jk}\, B^j_{\Delta t}\, B^k_{\Delta t} + o(|B_{\Delta t}|^2).$$
We know that
$$E\left[B^j_{\Delta t}\right] = m_j\, \Delta t, \qquad E\left[B^j_{\Delta t}\, B^k_{\Delta t}\right] = \Gamma_{jk}\, \Delta t,$$
and hence
$$\phi(t, x) := P_t f(x) = \int f(y)\, p_t(x, y)\, dy < \infty,$$
where
$$L^* f(y) = -\mathbf{m} \cdot \nabla f(y) + \frac{1}{2} \sum_{j=1}^d \sum_{k=1}^d \Gamma_{jk}\, \partial_{jk} f(y).$$
The equations
$$\partial_t p_t(x, y) = L_x\, p_t(x, y), \qquad \partial_t p_t(x, y) = L_y^*\, p_t(x, y)$$
are sometimes called the Kolmogorov backwards and forwards equations, respectively. The name comes from the fact that they can be derived from the Chapman-Kolmogorov equations by writing
$$p_{t+\Delta t}(x, y) = \int p_{\Delta t}(x, z)\, p_t(z, y)\, dz,$$
$$p_{t+\Delta t}(x, y) = \int p_t(x, z)\, p_{\Delta t}(z, y)\, dz,$$
2.11 Exercises
Exercise 2.1. Let Z1 , Z2 , Z3 be independent N (0, 1) random variables. Let
1. P{B3 1/2}
3. P(E) where E is the event that the path stays below the line y = 6 up
to time t = 10.
4. P{B4 0 | B2 0}.
76 CHAPTER 2. BROWNIAN MOTION
Show that
Find E[Y ].
Exercise 2.6. Let Bt be a standard Brownian motion and let {Ft } denote
the usual filtration. Suppose s < t. Compute the following.
1. $E[B_t^2 \mid \mathcal{F}_s]$
2. $E[B_t^3 \mid \mathcal{F}_s]$
3. $E[B_t^4 \mid \mathcal{F}_s]$
4. $E[e^{4B_t - 2} \mid \mathcal{F}_s]$
Exercise 2.7. Let Bt be a standard Brownian motion and let
Y (t) = t B(1/t).
What is Q if
1. f is a nonconstant, continuously differentiable function on R?
2. f is a Brownian motion?
Exercise 2.9. Suppose $B_t$ is a standard Brownian motion. For the functions $\phi(t, x)$, $0 < t < 1$, $-\infty < x < \infty$, defined below, give a PDE satisfied by the function.
1. (t, x) = P{Bt 0 | B0 = x}.
$$M_t = \max_{0 \le s \le t} B_s.$$
1. Explain why $M_t$ has the same distribution as $\sqrt{t}\, M_1$.
3. Find E[Mt ].
Run the simulation enough times to get a good estimate for the prob-
ability. Use the reflection principle to calculate the actual probability
and compare the result.
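For the simulation suggested in this exercise, one possible sketch (ours; the step and trial counts are arbitrary choices) compares the empirical probability that the running maximum exceeds a level with the reflection-principle value $2\,P\{B_1 \ge a\}$:

```python
import math
import random

def reflection_check(a=1.0, steps=400, trials=10000, seed=2):
    """Estimate P{max_{0<=s<=1} B_s >= a} from discretized Brownian paths
    and compare with the reflection-principle value 2 P{B_1 >= a}."""
    rng = random.Random(seed)
    sd = math.sqrt(1.0 / steps)
    hits = 0
    for _ in range(trials):
        b = 0.0
        for _ in range(steps):
            b += rng.gauss(0.0, sd)
            if b >= a:       # the running maximum has reached level a
                hits += 1
                break
    exact = 1.0 - math.erf(a / math.sqrt(2.0))  # = 2 * (1 - Phi(a))
    return hits / trials, exact
```

The discrete-time path slightly undershoots the true continuous maximum, so the estimate sits a little below the exact value.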
Stochastic integration
The ds integral is the usual integral from calculus; the integrand m(s, Xs )
is random, but that does not give any problem in defining the integral. The
main task will be to give precise meaning to the second term, and more
generally to Z t
As dBs .
0
There are several approaches to stochastic integration. The approach we give, which is the one most commonly used in mathematical finance, is that of the Itô integral.
by
$$Z_{t_j} = \sum_{i=0}^{j-1} Y_i\, \left[B_{t_{i+1}} - B_{t_i}\right],$$
and, more generally,
$$Z_t = Z_{t_j} + Y_j\, \left[B_t - B_{t_j}\right] \quad \text{if } t_j \le t \le t_{j+1},$$
$$\int_r^t A_s\, dB_s = Z_t - Z_r.$$
There are four important properties of the stochastic integral of simple
processes which we give in the next proposition. The reader should compare
these with the properties of integration with respect to random walk in
Section 1.6.
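The defining sum is easy to compute along simulated paths. The following sketch (ours, not from the text) uses the $\mathcal{F}_{t_i}$-measurable bets $Y_i = \operatorname{sign}(B_{t_i})$ and checks two of the properties below: mean zero and second moment $\int_0^1 E[A_s^2]\, ds = 1$:

```python
import math
import random

def simple_integral_moments(trials=20000, seed=3):
    """Simulate Z_1 = sum_i Y_i (B_{t_{i+1}} - B_{t_i}) with t_i = i/4 and
    the F_{t_i}-measurable bets Y_i = sign(B_{t_i}); return estimates of
    E[Z_1] and E[Z_1**2]."""
    rng = random.Random(seed)
    dt = 0.25
    m1 = m2 = 0.0
    for _ in range(trials):
        b = z = 0.0
        for _ in range(4):
            y = 1.0 if b >= 0.0 else -1.0   # the bet depends only on the past
            db = rng.gauss(0.0, math.sqrt(dt))
            z += y * db
            b += db
        m1 += z
        m2 += z * z
    return m1 / trials, m2 / trials
```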
Proposition 3.2.1. Suppose Bt is a standard Brownian motion with respect
to a filtration {Ft }, and At , Ct are simple processes.
Linearity. If $a, b$ are constants, then $aA_t + bC_t$ is also a simple process and
$$\int_0^t (aA_s + bC_s)\, dB_s = a \int_0^t A_s\, dB_s + b \int_0^t C_s\, dB_s.$$
Moreover, if $0 < r < t$,
$$\int_0^t A_s\, dB_s = \int_0^r A_s\, dB_s + \int_r^t A_s\, dB_s.$$
We will do this in the case t = tj , s = tk for some j > k and leave the other
cases for the reader. In this case
$$Z_s = \sum_{i=0}^{k-1} Y_i\, \left[B_{t_{i+1}} - B_{t_i}\right],$$
and
$$Z_t = Z_s + \sum_{i=k}^{j-1} Y_i\, \left[B_{t_{i+1}} - B_{t_i}\right].$$
If i < k,
$$E\left[ Y_i\, (B_{t_{i+1}} - B_{t_i})\, Y_k\, (B_{t_{k+1}} - B_{t_k}) \right] = E\left[ E\left( Y_i\, (B_{t_{i+1}} - B_{t_i})\, Y_k\, (B_{t_{k+1}} - B_{t_k}) \mid \mathcal{F}_{t_k} \right) \right].$$
The random variables $Y_i$, $Y_k$, $B_{t_{i+1}} - B_{t_i}$ are all $\mathcal{F}_{t_k}$-measurable while $B_{t_{k+1}} - B_{t_k}$ is independent of $\mathcal{F}_{t_k}$, and hence
$$E\left( Y_i\, (B_{t_{i+1}} - B_{t_i})\, Y_k\, (B_{t_{k+1}} - B_{t_k}) \mid \mathcal{F}_{t_k} \right) = Y_i\, (B_{t_{i+1}} - B_{t_i})\, Y_k\, E\left( B_{t_{k+1}} - B_{t_k} \mid \mathcal{F}_{t_k} \right) = Y_i\, (B_{t_{i+1}} - B_{t_i})\, Y_k\, E\left[ B_{t_{k+1}} - B_{t_k} \right] = 0.$$
Similarly,
$$E\left[ Y_i^2\, (B_{t_{i+1}} - B_{t_i})^2 \mid \mathcal{F}_{t_i} \right] = Y_i^2\, (t_{i+1} - t_i),$$
and hence
$$E\left[ Y_i^2\, (B_{t_{i+1}} - B_{t_i})^2 \right] = E[Y_i^2]\, (t_{i+1} - t_i).$$
The function s 7 E[A2s ] is a step function that takes on the value E[Yi2 ] for
ti s < ti+1 . Therefore,
$$E[Z_t^2] = \sum_{i=0}^{j-1} E[Y_i^2]\, (t_{i+1} - t_i) = \int_0^t E[A_s^2]\, ds.$$
Moreover, for all $n, t$, $|A_t^{(n)}| \le C$.

It suffices to prove (3.3) for each fixed value of $t$, and for ease we will choose $t = 1$. By construction, the $A_t^{(n)}$ are simple processes satisfying $|A_t^{(n)}| \le C$. Since (with probability one) the function $t \mapsto A_t$ is continuous, we have
$$A_t^{(n)} \longrightarrow A_t,$$
and hence by the bounded convergence theorem applied to Lebesgue measure,
$$\lim_{n \to \infty} Y_n = 0,$$
where
$$Y_n = \int_0^1 \left[A_t^{(n)} - A_t\right]^2 dt. \tag{3.4}$$
Since the random variables $\{Y_n\}$ are uniformly bounded, this implies
$$\lim_{n \to \infty} E\left[\int_0^1 \left[A_t^{(n)} - A_t\right]^2 dt\right] = \lim_{n \to \infty} E[Y_n] = 0.$$
We define
$$\int_0^t A_s\, dB_s = Z_t.$$
The integral satisfies four properties which should start becoming familiar.
Also, if $r < t$,
$$\int_0^t A_s\, dB_s = \int_0^r A_s\, dB_s + \int_r^t A_s\, dB_s.$$
$$E\left[(Z_t - Z_s)^4\right] \le 3\, C^4\, |t - s|^2.$$
By Fatou's lemma, this bound also holds for the limit process. This estimate and an argument first due to Kolmogorov suffice to give continuity. We leave the argument as Exercise 3.11.
Let $A_t^{(n)}$ be a sequence of simple processes and let $Z_t^{(n)}$, $Z_t$ denote the corresponding stochastic integrals. Let
$$\|A^{(n)} - A\|^2 = E\left[\left(Z_1^{(n)} - Z_1\right)^2\right] = \int_0^1 E\left[\left(A_t^{(n)} - A_t\right)^2\right] dt,$$
and let
$$Q_n = \max_{0 \le t \le 1} \left|Z_t^{(n)} - Z_t\right|.$$
Proposition 3.2.4. If
$$\sum_{n=1}^{\infty} \|A^{(n)} - A\| < \infty, \tag{3.5}$$
then with probability one, $Q_n \to 0$.
For fixed $m$, the process $M_t = Z_t^{(n)} - Z_t$, $t \in D_m$, is a discrete-time martingale and Corollary 1.7.2 implies that
$$P\left\{ \max_{t \in D_m} |M_t| > \epsilon \right\} \le \frac{E[M_1^2]}{\epsilon^2}.$$
Therefore,
$$P\{Q_n > \epsilon\} \le \frac{E[M_1^2]}{\epsilon^2} = \epsilon^{-2}\, \|A^{(n)} - A\|^2.$$
Let us emphasize the order of quantifiers in the proposition.
Given a sequence A(n) of approximating simple processes satisfying
(3.5), we get convergence with probability one. It is not true that
with probability one, we get convergence for every sequence. See an
analogous discussion in Section 2.8 about the quadratic variation.
$$K_t = \max_{0 \le s \le t} |A_s|.$$
Then for $n \ge K_t$, $A_s^{(n)} = A_s$ for $0 \le s \le t$, and hence
$$Z_t^{(n)} = Z_t, \qquad n \ge K_t.$$
The value of this integral does not depend on how we define At at the
discontinuity. In this book, the process At will have continuous or piecewise
continuous paths although the integral can be extended to more general
processes. Note that the simple processes have piecewise continuous paths.
One important case that arises comes from a stopping time. Suppose T is
a stopping time with respect to {Ft }. Then if At is a continuous, adapted
process and
$$Z_t = \int_0^t A_s\, dB_s,$$
then
$$Z_{t \wedge T} = \int_0^{t \wedge T} A_s\, dB_s = \int_0^t A_{s,T}\, dB_s,$$
where $A_{s,T}$ denotes the piecewise continuous process
$$A_{s,T} = \begin{cases} A_s, & s < T, \\ 0, & s \ge T. \end{cases}$$
dXt = At dBt ,
This has been made mathematically precise by the definition of the integral.
Intuitively, we think of Xt as a process that at time t evolves like a Brownian
motion with zero drift and variance A2t . This is well defined for any adapted,
continuous process At , and Xt is a continuous function of t. In particular, if
$\sigma$ is a bounded continuous function, then we can hope to solve the equation
$$dX_t = \sigma(X_t)\, dB_t.$$
Solving such an equation can be difficult (see the end of Section 3.5), but simulating such a process is straightforward using the stochastic Euler rule:
$$X_{t + \Delta t} = X_t + \sigma(X_t)\, \sqrt{\Delta t}\; N,$$
where $N$ is a standard normal random variable.
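The Euler rule translates directly into code. A minimal sketch of ours (with $\sigma \equiv 1$ it just reproduces a Brownian motion started at $x_0$):

```python
import math
import random

def euler_path(sigma, x0=1.0, t=1.0, steps=500, rng=None):
    """One path of the stochastic Euler rule
    X_{t+dt} = X_t + sigma(X_t) * sqrt(dt) * N for dX_t = sigma(X_t) dB_t."""
    rng = rng or random.Random(4)
    dt = t / steps
    x = x0
    for _ in range(steps):
        x += sigma(x) * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return x
```

Averaging many such paths with $\sigma \equiv 1$ recovers mean $x_0$ and variance $t$ at time $t$.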
$$Z_t = \frac{1}{2}\left[B_t^2 - B_0^2\right] = \frac{B_t^2}{2}.$$
However, a quick check shows that this cannot be correct. The left-hand side is a martingale with $Z_0 = 0$ and hence
$$E[Z_t] = 0.$$
However,
$$E\left[B_t^2 / 2\right] = t/2 \neq 0.$$
In the next section we will derive the main tool for calculating integrals, Itô's formula (or lemma). Using this we will show that, in fact,
$$Z_t = \frac{1}{2}\left[B_t^2 - t\right]. \tag{3.8}$$
This is a very special case. In general, one must do more than just subtract the expectation. One thing to note about the solution: for $t > 0$, the random variable on the right-hand side of (3.8) does not have a normal distribution. Even though stochastic integrals are defined as limits of normal increments, the "betting factor" $A_t$ can depend on the past, and this allows one to get non-normal random variables. If the integrand $A_t = f(t)$ is nonrandom, then the integral
$$\int_0^t f(s)\, dB_s$$
is really a limit of normal random variables and hence has a normal distribution (see Exercise 3.8).
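A quick simulation makes (3.8) plausible: approximate the integral by the left-endpoint (Itô) sums used in the construction and compare with $(B_1^2 - 1)/2$ along the same path (a sketch of ours):

```python
import math
import random

def int_B_dB(steps=2000, seed=6):
    """Approximate Z_1 = int_0^1 B_s dB_s by the left-endpoint (Ito) sum
    and return it together with (B_1**2 - 1)/2 from formula (3.8)."""
    rng = random.Random(seed)
    dt = 1.0 / steps
    b = z = 0.0
    for _ in range(steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        z += b * db        # integrand evaluated before the increment
        b += db
    return z, (b * b - 1.0) / 2.0
```

The two numbers differ only by $(t - \sum (\Delta B)^2)/2$, which is small for a fine partition.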
If
$$Z_t = \int_0^t A_s\, dB_s,$$
then the quadratic variation of $Z$ is defined by
$$\langle Z \rangle_t = \lim_{n \to \infty} \sum_{j \le nt} \left[ Z_{j/n} - Z_{(j-1)/n} \right]^2.$$
$$\lim_{t \downarrow 0} V_t(f) = 0.$$
$$\kappa_n = \max_{j=1,\ldots,n} \left| M_{j/n} - M_{(j-1)/n} \right|,$$
$$T_K = \inf\{t : V_t(M) = K\}.$$
The argument above gives $E\left[M_{1 \wedge T_K}^2\right] = 0$ for each $K$, and hence with probability one for each $K$, $M_{1 \wedge T_K} = 0$. But $M_1 = \lim_{K \to \infty} M_{1 \wedge T_K}$.
Using the proposition, we see that if $Y_t$ is an increasing adapted process such that $Z_t^2 - Y_t$ is a martingale, then
$$M_t := (Z_t^2 - Y_t) - (Z_t^2 - \langle Z \rangle_t) = \langle Z \rangle_t - Y_t$$
is a martingale with paths of bounded variation, and hence $Y_t = \langle Z \rangle_t$.
Hence,
$$f(1) = f(0) + \lim_{n \to \infty} \sum_{j=1}^n f'\left(\frac{j-1}{n}\right) \frac{1}{n} + \lim_{n \to \infty} \sum_{j=1}^n o\left(\frac{1}{n}\right).$$
The first limit on the right-hand side is the Riemann sum approximation of the definite integral and the second limit equals zero since $n\, o(1/n) \to 0$. Therefore,
$$f(1) = f(0) + \int_0^1 f'(t)\, dt.$$
Itô's formula is similar but it requires considering both first and second derivatives.
$$f(B_{j/n}) - f(B_{(j-1)/n}) = f'\left(B_{(j-1)/n}\right) \Delta_{j,n} + \frac{1}{2}\, f''\left(B_{(j-1)/n}\right) \Delta_{j,n}^2 + o(\Delta_{j,n}^2),$$
where
$$\Delta_{j,n} = B_{j/n} - B_{(j-1)/n}.$$
Hence $f(B_1) - f(B_0)$ is equal to the sum of the following three limits:
$$\lim_{n \to \infty} \sum_{j=1}^n f'\left(B_{(j-1)/n}\right) \left[B_{j/n} - B_{(j-1)/n}\right], \tag{3.9}$$
$$\lim_{n \to \infty} \frac{1}{2} \sum_{j=1}^n f''\left(B_{(j-1)/n}\right) \left[B_{j/n} - B_{(j-1)/n}\right]^2, \tag{3.10}$$
$$\lim_{n \to \infty} \sum_{j=1}^n o\left(\left[B_{j/n} - B_{(j-1)/n}\right]^2\right). \tag{3.11}$$
The increment of the Brownian motion satisfies $\left[B_{j/n} - B_{(j-1)/n}\right]^2 \approx 1/n$. Since the sum in (3.11) looks like $n$ terms of smaller order than $1/n$, the limit equals zero. The limit in (3.9) is a simple process approximation to a stochastic integral, and hence we see that the limit is
$$\int_0^1 f'(B_t)\, dB_t.$$
This tells us what the limit should be in general. Let $h(t) = f''(B_t)$, which is a continuous function. For every $\epsilon > 0$, there exists a step function $h_\epsilon(t)$ such that $|h(t) - h_\epsilon(t)| < \epsilon$ for every $t$. For fixed $\epsilon$, we can consider each interval on which $h_\epsilon$ is constant to see that
$$\lim_{n \to \infty} \sum_{j=1}^n h_\epsilon\left(\frac{j-1}{n}\right) \left[B_{j/n} - B_{(j-1)/n}\right]^2 = \int_0^1 h_\epsilon(t)\, dt.$$
Also,
$$\left| \sum_{j=1}^n \left[h\left(\tfrac{j-1}{n}\right) - h_\epsilon\left(\tfrac{j-1}{n}\right)\right] \left[B_{j/n} - B_{(j-1)/n}\right]^2 \right| \le \epsilon \sum_{j=1}^n \left[B_{j/n} - B_{(j-1)/n}\right]^2.$$
$$\frac{m(j,n)}{2} \left[B_{t_j} - B_{t_{j-1}}\right]^2 \le f(B_{t_j}) - f(B_{t_{j-1}}) - f'(B_{t_{j-1}}) \left[B_{t_j} - B_{t_{j-1}}\right] \le \frac{M(j,n)}{2} \left[B_{t_j} - B_{t_{j-1}}\right]^2,$$
where $m(j,n)$, $M(j,n)$ denote the minimum and maximum of $f''(x)$ for $x$ on the interval with endpoints $B_{t_{j-1}}$ and $B_{t_j}$. Hence if we let
$$Q^1(\pi) = \sum_{j=1}^n f'(B_{t_{j-1}}) \left[B_{t_j} - B_{t_{j-1}}\right],$$
$$Q^2_+(\pi) = \sum_{j=1}^n \frac{M(j,n)}{2} \left[B_{t_j} - B_{t_{j-1}}\right]^2,$$
$$Q^2_-(\pi) = \sum_{j=1}^n \frac{m(j,n)}{2} \left[B_{t_j} - B_{t_{j-1}}\right]^2,$$
we have
with $\sum_n \|\pi_n\| < \infty$. Then we have seen that with probability one, for all $0 < s < t < 1$,
$$\lim_{n \to \infty} \sum_{s \le t_{j,n} < t} \left[B_{t_{j,n}} - B_{t_{j-1,n}}\right]^2 = t - s.$$
It follows that
$$\lim_{n \to \infty} Q^2_-(\pi_n) = \lim_{n \to \infty} Q^2_+(\pi_n) = \frac{1}{2} \int_0^1 f''(B_s)\, ds.$$
We now assume for the moment that there exists $K < \infty$ such that $|f''(x)| \le K$ for all $x$. This happens, for example, if $f$ has compact support. Let $A_t = f'(B_t)$ and let $A_t^{(n)}$ be the simple process that equals $f'(B_{t_{j-1,n}})$ for $t_{j-1,n} \le t < t_{j,n}$. For $t_{j-1,n} \le s < t_{j,n}$,
$$E\left(\left[A_s^{(n)} - A_s\right]^2\right) \le K^2\, E\left(\left[B_s - B_{t_{j-1,n}}\right]^2\right) = K^2\, [s - t_{j-1,n}] \le K^2\, \|\pi_n\|.$$
Therefore,
$$\int_0^1 E\left(\left[A_t^{(n)} - A_t\right]^2\right) dt \le K^2\, \|\pi_n\|.$$
In particular, we get the following.

Proposition 3.3.2. Suppose that $f''$ is bounded and $\{\pi_n\}$ is a sequence of partitions with $\sum_n \|\pi_n\| < \infty$. Let
$$Y_t^{(n)} = f(B_0) + \sum_{j=1}^{k_n} f'(B_{t_{j-1,n}}) \left[B_{t_{j,n}} - B_{t_{j-1,n}}\right] + \sum_{j=1}^{k_n} \frac{f''(B_{t_{j-1,n}})}{2} \left[B_{t_{j,n}} - B_{t_{j-1,n}}\right]^2.$$
Then, our argument shows that with probability one for each $K$,
$$f(B_{t \wedge T_K}) = f(B_0) + \int_0^{t \wedge T_K} f'(B_s)\, dB_s + \frac{1}{2} \int_0^{t \wedge T_K} f''(B_s)\, ds.$$
However, with probability one $T_K \to \infty$ as $K \to \infty$, and hence this gives us a formula for $f(B_t)$.
Suppose that $f$ is defined only on an interval $I = (a, b)$ and $B_0 \in I$. Let
$$T_\epsilon = \inf\{t : B_t \le a + \epsilon \text{ or } B_t \ge b - \epsilon\}, \qquad T = T_0.$$
We can apply Itô's formula to conclude that for all $t$ and all $\epsilon > 0$,
$$f(B_{t \wedge T_\epsilon}) = f(B_0) + \int_0^{t \wedge T_\epsilon} f'(B_s)\, dB_s + \frac{1}{2} \int_0^{t \wedge T_\epsilon} f''(B_s)\, ds.$$
This is sometimes written in shorthand as
$$df(B_t) = f'(B_t)\, dB_t + \frac{1}{2}\, f''(B_t)\, dt, \qquad 0 \le t < T.$$
The general strategy for proving the generalizations of Itô's formula that we give in the next couple of sections is the same, and we will not give the details.
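The formula can also be tested along a single simulated path. The sketch below (ours, not the author's) takes $f(x) = \sin x$ and compares $f(B_1) - f(B_0)$ with the discrete sums approximating the two integrals:

```python
import math
import random

def ito_check(steps=4000, seed=7):
    """Compare f(B_1) - f(B_0) for f(x) = sin(x) with the discrete sums
    approximating int f'(B) dB + (1/2) int f''(B) ds."""
    rng = random.Random(seed)
    dt = 1.0 / steps
    b = stoch = drift = 0.0
    for _ in range(steps):
        db = rng.gauss(0.0, math.sqrt(dt))
        stoch += math.cos(b) * db          # f'(B) dB term
        drift += -0.5 * math.sin(b) * dt   # (1/2) f''(B) dt term
        b += db
    return math.sin(b), stoch + drift     # f(B_0) = sin(0) = 0
```

Omitting the `drift` term makes the two sides visibly disagree, which is the whole point of the second-order term.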
This is derived similarly to the first version except when we expand with a Taylor polynomial around $x$ we get another term:
$$f(t + \Delta t, x + \Delta x) - f(t, x) = \partial_t f(t, x)\, \Delta t + o(\Delta t) + \partial_x f(t, x)\, \Delta x + \frac{1}{2}\, \partial_{xx} f(t, x)\, (\Delta x)^2 + o((\Delta x)^2).$$
If we set $\Delta t = 1/n$ and write a telescoping sum for $f(1, B_1) - f(0, B_0)$ we get terms as in (3.9)-(3.11) as well as two more terms:
$$\lim_{n \to \infty} \sum_{j=1}^n \partial_t f\left((j-1)/n,\, B_{(j-1)/n}\right) (1/n), \tag{3.12}$$
$$\lim_{n \to \infty} \sum_{j=1}^n o(1/n). \tag{3.13}$$
The limit in (3.13) equals zero, and the sum in (3.12) is a Riemann sum approximation of an integral, and hence the limit is
$$\int_0^1 \partial_t f(t, B_t)\, dt.$$
$$X_t = X_0 \exp\left\{ \left(m - \frac{\sigma^2}{2}\right) t + \sigma B_t \right\}, \tag{3.15}$$
then $X_t$ is a geometric Brownian motion with parameters $(m, \sigma)$. Even though we have an exact expression (3.15) for geometric Brownian motion, it is generally more useful to think of it in terms of its SDE (3.14).
Geometric Brownian motion is more natural than usual Brownian mo-
tion for modeling prices of assets such as stock. It measures changes in terms
of fractions or percentages of the current price rather than the listed price
per share. In particular, the latter quantity includes a rather arbitrary unit
share which does not appear if one models with geometric Brownian mo-
tion.
The geometric Brownian motion (3.15) is what is sometimes called a
strong solution to the SDE (3.14). (All of the solutions to SDEs that
we discuss in this book are strong solutions.) We will not give the exact
definition, but roughly speaking, if one uses the same Brownian motion Bt
in both places, one gets the same function. Let us explain this in terms
of simulation. Suppose a small $\Delta t$ is chosen. Then we can define $B_{k \Delta t}$ by $B_0 = 0$ and
$$B_{k \Delta t} = B_{(k-1)\Delta t} + \sqrt{\Delta t}\; N_k, \tag{3.16}$$
where $N_1, N_2, \ldots$ is a sequence of independent $N(0,1)$ random variables. Using the same sequence, we could define an approximate solution to (3.14) by choosing $X_0 = e^0 = 1$ and
$$X_{k \Delta t} = X_{(k-1)\Delta t} + X_{(k-1)\Delta t} \left[ m\, \Delta t + \sigma \sqrt{\Delta t}\; N_k \right]. \tag{3.17}$$
We could also use the exact expression
$$Y_{k \Delta t} = \exp\left\{ \left(m - \frac{\sigma^2}{2}\right) (k \Delta t) + \sigma B_{k \Delta t} \right\}. \tag{3.18}$$
In Exercise 3.9 you are asked to do a simulation to compare (3.17) and (3.18).
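One possible sketch of that comparison (ours, with arbitrary parameter values) drives (3.17) and (3.18) with the same increments:

```python
import math
import random

def gbm_compare(m=0.1, sig=0.3, steps=2000, seed=8):
    """Drive the Euler scheme (3.17) and the exact formula (3.18) with the
    same normal increments N_k; return both values of X at time 1."""
    rng = random.Random(seed)
    dt = 1.0 / steps
    x, b = 1.0, 0.0
    for _ in range(steps):
        n = rng.gauss(0.0, 1.0)
        x += x * (m * dt + sig * math.sqrt(dt) * n)  # Euler step (3.17)
        b += math.sqrt(dt) * n                       # Brownian path (3.16)
    return x, math.exp((m - sig * sig / 2.0) + sig * b)  # (3.18) at t = 1
```

Because the two recursions use the same driving noise, they stay close pathwise, with a gap that shrinks as $\Delta t \to 0$.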
In some of our derivations below, we will use this kind of argument. For example, a formal derivation of Itô's formula II can be given as
$$df(t, B_t) = \partial_t f(t, B_t)\, dt + \partial_x f(t, B_t)\, dB_t + \frac{1}{2}\, \partial_{xx} f(t, B_t)\, (dB_t)^2 + o(dt) + o((dt)(dB_t)) + o((dB_t)^2).$$
By setting $(dB_t)^2 = dt$ and setting the last three terms equal to zero we get the formula.
Suppose that Xt satisfies
As in the case for Brownian motion, the drift term does not contribute to
the quadratic variation,
$$\langle X \rangle_t = \left\langle \int A\, dB \right\rangle_t = \int_0^t A_s^2\, ds.$$
$$dX_t = A_t\, X_t\, dB_t, \qquad X_0 = x_0.$$
3.5 Diffusions
Geometric Brownian motion is an example of a time-homogeneous diffusion.
We say that Xt is a diffusion (process) if it is a solution to an SDE of the
form
$$dX_t = m(t, X_t)\, dt + \sigma(t, X_t)\, dB_t, \tag{3.21}$$
where $m(t, x)$, $\sigma(t, x)$ are functions. It is called time-homogeneous if the functions do not depend on $t$,
That is,
$$f(X_t) - f(X_0) = \int_0^t \left[ m(s, X_s)\, f'(X_s) + \frac{\sigma^2(s, X_s)}{2}\, f''(X_s) \right] ds + \int_0^t f'(X_s)\, \sigma(s, X_s)\, dB_s.$$
The second term on the right-hand side is a martingale (since the integrand
is bounded) and has expectation zero, so the expectation of the right-hand
side is
$$t\, E[Y_t],$$
where
$$Y_t = \frac{1}{t} \int_0^t \left[ m(s, X_s)\, f'(X_s) + \frac{\sigma^2(s, X_s)}{2}\, f''(X_s) \right] ds.$$
The fundamental theorem of calculus implies that
$$\lim_{t \downarrow 0} Y_t = m(0, X_0)\, f'(X_0) + \frac{\sigma^2(0, X_0)}{2}\, f''(X_0).$$
Note that
$$|y_1(t) - y_0(t)| \le \int_0^t |F(s, y_0)|\, ds \le Ct,$$
$$|y_{k+1}(t) - y_k(t)| \le \frac{C^k\, t^{k+1}}{(k+1)!}, \qquad 0 \le t \le t_0.$$
exists and
$$|y_k(t) - y(t)| \le \sum_{j=k}^{\infty} \frac{C^j\, t^{j+1}}{(j+1)!}.$$
If we let
$$y(t) = y_0 + \int_0^t F(s, y(s))\, ds,$$
then for each $k$,
$$|y(t) - y_{k+1}(t)| \le \int_0^t |F(s, y(s)) - F(s, y_k(s))|\, ds \le C \int_0^t |y(s) - y_k(s)|\, ds,$$
so that
$$E\left[ \left| X_t^{(k+1)} - X_t^{(k)} \right|^2 \right] \le 2\, E\left[ \left( \int_0^t \left| X_s^{(k)} - X_s^{(k-1)} \right| ds \right)^2 \right] + 2\, E\left[ \left( \int_0^t \left[ \sigma(s, X_s^{(k)}) - \sigma(s, X_s^{(k-1)}) \right] dB_s \right)^2 \right].$$
or in differential form,
$$d\langle X, Y \rangle_t = A_t\, C_t\, dt.$$
The product rule for the usual derivative can be written in differential form as
$$d(fg) = f\, dg + g\, df = f g'\, dt + f' g\, dt.$$
It can be obtained formally by writing
$$d(X_t Y_t) = X_t\, dY_t + Y_t\, dX_t + (dX_t)(dY_t).$$
However, the $(dX_t)(dY_t)$ term does not vanish, but rather equals $d\langle X, Y \rangle_t$. This gives the stochastic product rule.
In other words,
$$X_t Y_t = X_0 Y_0 + \int_0^t X_s\, dY_s + \int_0^t Y_s\, dX_s + \int_0^t d\langle X, Y \rangle_s = X_0 Y_0 + \int_0^t \left[ X_s K_s + Y_s H_s + A_s C_s \right] ds + \int_0^t \left[ X_s C_s + Y_s A_s \right] dB_s.$$
$$dX_t = m\, X_t\, dt + \sigma\, X_t\, dB_t.$$
to mean
$$X_t = X_0 + \int_0^t H_s\, ds + \sum_{j=1}^d \int_0^t A_s^j\, dB_s^j.$$
Recall that for independent Brownian motions,
$$\langle B^i, B^j \rangle = 0, \qquad i \neq j.$$
In particular, if
$$Y_t = Y_0 + \int_0^t K_s\, ds + \sum_{j=1}^d \int_0^t C_s^j\, dB_s^j,$$
then
$$d\langle X, Y \rangle_t = \sum_{j=1}^d A_t^j\, C_t^j\, dt.$$
j=1
Theorem 3.7.1 (Itô's formula, final form). Suppose $B_t^1, \ldots, B_t^d$ are independent standard Brownian motions, and $X_t^1, \ldots, X_t^n$ are processes satisfying
$$dX_t^k = H_t^k\, dt + \sum_{i=1}^d A_t^{i,k}\, dB_t^i.$$
In other words,
$$f(t, X_t) - f(0, X_0) = \sum_{i=1}^d \int_0^t \left[ \sum_{k=1}^n \partial_k f(s, X_s)\, A_s^{i,k} \right] dB_s^i + \int_0^t \left[ \dot f(s, X_s) + \sum_{k=1}^n \partial_k f(s, X_s)\, H_s^k + \frac{1}{2} \sum_{i=1}^d \sum_{j=1}^n \sum_{k=1}^n \partial_{jk} f(s, X_s)\, A_s^{i,j}\, A_s^{i,k} \right] ds.$$
Other standard notations for $\nabla^2$ are $\Delta$ and $\nabla \cdot \nabla$. In the statement below the gradient $\nabla$ and the Laplacian $\nabla^2$ are taken with respect to the $x$ variable only.
3.8 Exercises
Exercise 3.1. Suppose $A_t$ is a simple process with $|A_t| \le C$ for all $t$. Let
$$Z_t = \int_0^t A_s\, dB_s.$$
Show that
$$E\left[ Z_t^4 \right] \le 3\, C^4\, t^2.$$
(Hint: Write $Z_t$ out in step function form, expand the fourth power, and use rules for conditional expectation to evaluate the different terms.)
Exercise 3.2. Use Itô's formula to find the stochastic differential of $f(t, B_t)$ where $B_t$ is a standard Brownian motion and
1. f (t, x) = sin x;
2. f (t, x) = et (x/t)2
$$dX_t = X_t\, [m\, dt + \sigma\, dB_t].$$
dXt = 4 Xt dt + Xt dBt .
$$dX_t = X_t\, [\mu_1\, dt + \sigma_1\, dB_t], \qquad dY_t = Y_t\, [\mu_2\, dt + \sigma_2\, dB_t],$$
where $B_t$ is a standard Brownian motion. Suppose also that $X_0 = Y_0 = 1$.
1. Yt = Bt .
2. $Y_t = X_t^3$.
3. $$Y_t = \exp\left\{ \int_0^t (X_s^2 + 1)\, ds \right\}.$$
dXt = Xt [ 2 dt + 2 dBt ].
with a betting strategy $A_s$ that changes only at times $t_0 < t_1 < t_2 < \cdots < 1$, where
$$t_n = 1 - 2^{-n}.$$
We start by setting $A_t = 1$ for $0 \le t < 1/2$. Then $Z_{1/2} = B_{1/2}$. Note that
$$P\{Z_{1/2} \ge 1\} = P\{B_{1/2} \ge 1\} = P\{B_1 \ge \sqrt{2}\} = 1 - \Phi(\sqrt{2}) =: q > 0.$$
If $Z_{1/2} \ge 1$, we stop; that is, we set $A_t = 0$ for $1/2 \le t \le 1$. If $Z_{1/2} < 1$, let $x = 1 - Z_{1/2} > 0$. Define $a$ by the formula
$$P\{a\, [B_{3/4} - B_{1/2}] \ge x\} = q.$$
We set $A_t = a$ for $1/2 \le t < 3/4$. Note that we only need to know $Z_{1/2}$ to determine $a$, and hence $a$ is $\mathcal{F}_{1/2}$-measurable. Also, if $a\, [B_{3/4} - B_{1/2}] \ge x$, then
$$Z_{3/4} = \int_0^{3/4} A_s\, dB_s = Z_{1/2} + a\, [B_{3/4} - B_{1/2}] \ge 1.$$
Therefore,
$$P\{Z_{3/4} \ge 1 \mid Z_{1/2} < 1\} = q,$$
and hence
$$P\{Z_{3/4} < 1\} = (1 - q)^2.$$
If $Z_{3/4} \ge 1$, we set $A_t = 0$ for $3/4 \le t \le 1$. Otherwise, we proceed as above. At each time $t_n$ we adjust the bet so that
$$P\{Z_{t_{n+1}} \ge 1 \mid Z_{t_n} < 1\} = q,$$
and hence
$$P\{Z_{t_n} < 1\} \le (1 - q)^n.$$
Using this strategy, with probability one $Z_1 \ge 1$, and hence $E[Z_1] \ge 1$; since a martingale starting at $Z_0 = 0$ would have $E[Z_1] = 0$, $Z_t$ is not a martingale. Our choice of strategy used discontinuous bets, but it is not difficult to adapt this example so that $t \mapsto A_t$ is a continuous function except at the one time at which the bet changes to zero.
$$\lim_{j \to \infty} \tau_j = T.$$
Then for each $j$, $M_t^{(j)}$ is a square integrable martingale. Therefore, $Z_t$ is a local martingale on $[0, T)$, where
$$T = \inf\left\{ t : \int_0^t A_s^2\, ds = \infty \right\}.$$
dXt = Rt dt + At dBt .
$$T = \inf\{t : Z_t = -a \text{ or } Z_t = b\}.$$
By solving, we get
$$P\{Z_T = b\} = \frac{a}{a + b},$$
which is the gambler's ruin estimate for continuous martingales.
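The gambler's ruin estimate is easy to check by simulation (a sketch of ours; the small-step random walk approximates the Brownian path):

```python
import math
import random

def ruin_prob(a=1.0, b=2.0, dt=2e-3, trials=3000, seed=9):
    """Estimate P{B hits b before -a} for Brownian motion started at 0;
    the optional stopping argument gives a / (a + b)."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    wins = 0
    for _ in range(trials):
        z = 0.0
        while -a < z < b:
            z += rng.gauss(0.0, sd)
        if z >= b:
            wins += 1
    return wins / trials
```

With $a = 1$, $b = 2$, the estimate is close to $1/3$.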
$$\lim_{t \to \infty} Z_t = Z_\infty.$$
$$N_t = \max_{0 \le s \le t} |Z_s|.$$
$$P\{N_t \ge a\} \le \frac{E[Z_t^2]}{a^2}.$$
T = T0 = inf{t : Xt = 0}.
At time T the equation is ill-defined so we will not consider the process for
t > T . If a > 0, then when Xt gets close to 0, there is a strong drift away
from the origin. It is not obvious whether or not this is strong enough to
keep the diffusion from reaching the origin.
Suppose that $0 < r < x < R < \infty$ and let $\phi(x)$ denote the probability that the process starting at $x$ reaches level $R$ before reaching level $r$. In other words, if
$$\tau = \tau(r, R) = \inf\{t : X_t = r \text{ or } X_t = R\},$$
then
$$\phi(x) = P\{X_\tau = R \mid X_0 = x\}.$$
We will use Itô's formula to find a differential equation for $\phi$ and use this to compute $\phi$. Note that $\phi(r) = 0$, $\phi(R) = 1$. Let $J$ denote the indicator function of the event $\{X_\tau = R\}$ and let
$$M_t = E[J \mid \mathcal{F}_t].$$
Then
$$E[J \mid \mathcal{F}_t] = \phi(X_t).$$
If this is to be a martingale, the $dt$ term must vanish at all times. The way to guarantee this is to choose $\phi$ to satisfy the ordinary differential equation (ODE)
$$x\, \phi''(x) + 2a\, \phi'(x) = 0.$$
Solving such equations is standard (and this one is particularly easy, for one can solve the first-order equation for $g(x) = \phi'(x)$), and the solutions are of the form
$$\phi(x) = c_1 + c_2\, x^{1 - 2a}, \qquad a \neq \frac{1}{2},$$
$$\phi(x) = c_1 + c_2\, \log x, \qquad a = \frac{1}{2},$$
where $c_1, c_2$ are arbitrary constants. The boundary conditions $\phi(r) = 0$, $\phi(R) = 1$ determine the constants, giving
$$\phi(x) = \frac{x^{1-2a} - r^{1-2a}}{R^{1-2a} - r^{1-2a}}, \qquad a \neq \frac{1}{2}, \tag{4.1}$$
$$\phi(x) = \frac{\log x - \log r}{\log R - \log r}, \qquad a = \frac{1}{2}. \tag{4.2}$$
We now answer the question that we posed.
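As a sanity check on (4.1), one can simulate the SDE $dX_t = (a/X_t)\, dt + dB_t$ with the Euler rule and estimate the probability of reaching $R$ before $r$ (a rough sketch with our own parameter choices; the discrete scheme overshoots the boundaries slightly):

```python
import math
import random

def bessel_hit_R(a=1.0, x=1.5, r=1.0, R=2.0, dt=5e-4, trials=2000, seed=10):
    """Euler scheme for dX = (a/X) dt + dB started at x in (r, R); estimate
    the probability of reaching R before r and compare with (4.1)."""
    rng = random.Random(seed)
    sd = math.sqrt(dt)
    wins = 0
    for _ in range(trials):
        z = x
        while r < z < R:
            z += (a / z) * dt + rng.gauss(0.0, sd)
        if z >= R:
            wins += 1
    exact = (x ** (1 - 2 * a) - r ** (1 - 2 * a)) / (R ** (1 - 2 * a) - r ** (1 - 2 * a))
    return wins / trials, exact
```

For $a = 1$, $x = 3/2$, $r = 1$, $R = 2$, formula (4.1) gives exactly $2/3$.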
The alert reader will note that we cheated a little bit in our
derivation of because we assumed that was C 2 . After assuming
this, we obtained a differential equation and found what should
be. To finish a proof, we can start with as defined in (4.1) or
Suppose that at some future time $T$ we have an option to buy a share of the stock at price $S$. Since we will exercise the option only if $X_T \ge S$, the value of the option at time $T$ is $F(X_T)$ where
$$F(x) = (x - S)^+ = \max\{x - S,\, 0\}.$$
We will use Itô's formula to derive a PDE for $\phi$ under the assumption that $\phi$ is $C^1$ in $t$ and $C^2$ in $x$.
Let
$$M_t = E\left[ R_T^{-1}\, F(X_T) \mid \mathcal{F}_t \right].$$
and that $F$ is a function that does not grow too quickly. Let
$$f(t, x) = E\left[ F(X_T) \mid X_t = x \right].$$
Let $r(t) \ge 0$ be a discount rate, which for ease we will assume is a deterministic function of time, and
$$R_t = R_0 \exp\left\{ \int_0^t r(s)\, ds \right\}.$$
Let
$$\phi(t, x) = E\left[ (R_t / R_T)\, F(X_T) \mid X_t = x \right] = \exp\left\{ -\int_t^T r(s)\, ds \right\} f(t, x).$$
Recall that
$$\partial_t f(t, x) = -L_x^t f(t, x),$$
where $L^t$ is the generator
$$L^t h(x) = m(t, x)\, h'(x) + \frac{\sigma^2(t, x)}{2}\, h''(x).$$
Therefore,
$$\partial_t \phi(t, x) = r(t) \exp\left\{ -\int_t^T r(s)\, ds \right\} f(t, x) - \exp\left\{ -\int_t^T r(s)\, ds \right\} L_x^t f(t, x)$$
$$= r(t)\, \phi(t, x) - L_x^t \phi(t, x) = r(t)\, \phi(t, x) - m(t, x)\, \partial_x \phi(t, x) - \frac{\sigma^2(t, x)}{2}\, \partial_{xx} \phi(t, x),$$
which is the Feynman-Kac PDE.
dXt = dBt ,
There are two reasonable ways to incorporate drift in this binomial model. One is to use a version of Euler's method and set
$$P\left\{ X(t + \Delta t) - X(t) = m(t, X_t)\, \Delta t \pm \sqrt{\Delta t} \,\middle|\, X(t) \right\} = \frac{1}{2},$$
The other is to keep the step sizes $\pm\sqrt{\Delta t}$ but tilt the probabilities so that
$$E\left[ X(t + \Delta t) - X(t) \mid \mathcal{F}_t \right] = m\, \Delta t.$$
Suppose a path $(a_1, a_2, \ldots, a_N)$ ends at $X(1) = r$, so that the number of up moves is $J = (N + r\sqrt{N})/2$. Then the path gets relative weight
$$\left(1 + \frac{m}{\sqrt N}\right)^J \left(1 - \frac{m}{\sqrt N}\right)^{N - J} = \left(1 - \frac{m^2}{N}\right)^{N/2} \left(1 + \frac{m}{\sqrt N}\right)^{r\sqrt N / 2} \left(1 - \frac{m}{\sqrt N}\right)^{-r\sqrt N / 2}.$$
Using the relation $\left(1 + \frac{a}{N}\right)^N \approx e^a$, we see that the limit of the right-hand side as $N \to \infty$ is
$$e^{-m^2/2}\, e^{rm} = e^{m X(1)}\, e^{-m^2/2}.$$
Given this, we see that in order to sample from a Brownian motion with drift, we could first sample from a Brownian motion without drift and then weight the samples by the factor $e^{m X(1)}\, e^{-m^2/2}$. We will show how to do this directly in Section 5.2.
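The reweighting can be checked numerically. In the sketch below (ours), samples of a driftless $B_1$ are weighted by $e^{m B_1 - m^2/2}$; the weighted mean should be $m$, the mean of $B_1$ under the drift-$m$ measure:

```python
import math
import random

def girsanov_mean(m=0.5, n=200000, seed=12):
    """Weight driftless samples of B_1 by exp(m*B_1 - m**2/2); the weighted
    mean estimates the mean of B_1 under the drift-m measure, namely m."""
    rng = random.Random(seed)
    tot = 0.0
    for _ in range(n):
        b = rng.gauss(0.0, 1.0)
        tot += b * math.exp(m * b - m * m / 2.0)
    return tot / n
```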
As one more application of binomial approximations, we will give a
heuristic argument for the following theorem.
$$L^* f(x) = -\left[ m(x)\, f(x) \right]' + \frac{\sigma^2}{2}\, f''(x) = -m'(x)\, f(x) - m(x)\, f'(x) + \frac{\sigma^2}{2}\, f''(x).$$
Recall that
$$Lf(x) = m(x)\, f'(x) + \frac{\sigma^2}{2}\, f''(x).$$
If $m$ is constant, then as we saw before, one obtains $L^*$ from $L$ by just changing the sign of the drift. For varying $m$ we get another term. We will derive the expression for $L^*$ by using the binomial approximation
$$P\left\{ X(t + \Delta t) - X(t) = \pm\sqrt{\Delta t} \,\middle|\, X(t) \right\} = \frac{1}{2}\left[ 1 \pm m(X_t)\, \sqrt{\Delta t} \right].$$
Let $\epsilon = \sqrt{\Delta t}$, $\epsilon^2 = \Delta t$. To be at position $x = k\epsilon$ at time $t + \epsilon^2$, one must be at position $x \mp \epsilon$ at time $t$. This gives the relation
$$p(t + \epsilon^2, x) = p(t, x - \epsilon)\, \frac{1}{2}\left[ 1 + \epsilon\, m(x - \epsilon) \right] + p(t, x + \epsilon)\, \frac{1}{2}\left[ 1 - \epsilon\, m(x + \epsilon) \right]. \tag{4.10}$$
We know that
$$p(t, x + \epsilon) + p(t, x - \epsilon) = 2\, p(t, x) + \epsilon^2\, \partial_{xx} p(t, x) + o(\epsilon^2),$$
$$p(t, x \pm \epsilon) = p(t, x) \pm \epsilon\, \partial_x p(t, x) + o(\epsilon),$$
$$m(x \pm \epsilon) = m(x) \pm \epsilon\, m'(x) + o(\epsilon).$$
Plugging in, we see that the right-hand side of (4.10) equals
$$p(t, x) + \epsilon^2 \left[ \frac{1}{2}\, \partial_{xx} p(t, x) - m'(x)\, p(t, x) - m(x)\, \partial_x p(t, x) \right] + o(\epsilon^2).$$
If
$$Lg(x) = m(x)\, g'(x) + \frac{\sigma^2}{2}\, g''(x),$$
then integration by parts gives
$$L^* f(x) = -\left[ m(x)\, f(x) \right]' + \frac{\sigma^2}{2}\, f''(x).$$
Then for every $\lambda \in \mathbb{R}$,
$$E\left[ \exp\{i\lambda M_t\} \right] = e^{-\lambda^2 t / 2}.$$
Sketch of proof. Fix $\lambda$ and let $f(x) = e^{i\lambda x}$. Note that the derivatives of $f$ are uniformly bounded in $x$. Following the proof of Itô's formula we can show that
$$f(M_t) - f(M_0) = N_t + \frac{1}{2} \int_0^t f''(M_s)\, ds = N_t - \frac{\lambda^2}{2} \int_0^t f(M_s)\, ds,$$
where $N_t$ is a martingale. In particular, if $r < t$,
$$E[f(M_t) - f(M_r)] = \frac{1}{2}\, E\left[ \int_r^t f''(M_s)\, ds \right] = -\frac{\lambda^2}{2} \int_r^t E[f(M_s)]\, ds.$$
$$G'(t) = -\frac{\lambda^2}{2}\, G(t), \qquad G(0) = 1,$$
which has solution $G(t) = e^{-\lambda^2 t / 2}$.
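The conclusion is easy to test for Brownian motion itself. Since the imaginary part of $E[e^{i\lambda B_1}]$ vanishes by symmetry, the sketch below (ours) estimates $E[\cos(\lambda B_1)]$:

```python
import math
import random

def char_fn_check(lam=1.0, n=100000, seed=11):
    """Estimate E[cos(lam * B_1)] (the real part of E[exp(i lam B_1)]) and
    return it with the predicted value exp(-lam**2 / 2)."""
    rng = random.Random(seed)
    est = sum(math.cos(lam * rng.gauss(0.0, 1.0)) for _ in range(n)) / n
    return est, math.exp(-lam * lam / 2.0)
```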
4.6 Exercises
Exercise 4.1. A process Xt satisfies the Ornstein-Uhlenbeck SDE if
1. Find a function $F$ with $F(0) = 0$ and $F(x) > 0$ for $x > 0$ such that $F(X_{t \wedge T})$ is a martingale. (You may leave your answer in terms of a definite integral.)
2. Find the probability that XT = R. You can write the answer in terms
of the function F .
$$\lim_{R \to \infty} P\{X_T = R\} = 0\,?$$
$$T = \min\{t : X_t = \pi/2\}.$$
1. Find a function $F(x)$ that is positive for $0 < x < \pi/2$ with $F(\pi/2) = 0$ and such that $M_t = F(X_{t \wedge T})$ is a local martingale for $t < T$. (It suffices to write $F$ in the form
$$F(x) = \int_x^{\pi/2} g(y)\, dy,$$
lim q(x0 , ) = 0 ?
0
Exercise 4.4. Suppose $B_t^1, \ldots, B_t^d$ are independent standard Brownian motions and let
$$X_t = \sqrt{(B_t^1)^2 + \cdots + (B_t^d)^2}.$$
2. Show that $\langle M \rangle_t = t$.
where
$$\frac{dP_Y}{dP_X} = \frac{g}{f}.$$
If $A_X \cap A_Y = \emptyset$, then $P_X \perp P_Y$.
$$P_X(\mathbb{R} \setminus A) = 0, \qquad P_Y(A) = 0.$$
$$Q(A) = E[1_A X], \qquad A \in \mathcal{G},$$
$$Q(A) = E[1_A Y].$$
P (E ) = 1, P0 (E0 ) = 1.
Let $\mathcal{F}$ denote the corresponding Borel $\sigma$-algebra, that is, the smallest $\sigma$-algebra under which all the open sets under $\|\cdot\|$ are measurable. The measures $P$ are defined on $(\Omega, \mathcal{F})$. It is easy to check that the functions $\Psi_s(f) = f(s)$ are measurable functions on this space, and hence so are the functions
$$\Phi_n(f) = \sum_{j=1}^{2^n} \left[ f\left(\frac{j}{2^n}\right) - f\left(\frac{j-1}{2^n}\right) \right]^2,$$
Play the game and then add a deterministic amount in one direction.
$$dM_t = m\, M_t\, dB_t, \qquad M_0 = 1.$$
$$Q_t(V) = E[1_V M_t].$$
$$\frac{dQ_t}{dP} = M_t.$$
If $s < t$ and $V$ is $\mathcal{F}_s$-measurable, then it is also $\mathcal{F}_t$-measurable. In this case, $Q_s(V) = Q_t(V)$, as can be seen in the calculation
$$Q_t(V) = E[1_V M_t] = E\left[ E(1_V M_t \mid \mathcal{F}_s) \right] = E\left[ 1_V\, E(M_t \mid \mathcal{F}_s) \right] = E[1_V M_s] = Q_s(V).$$
Therefore, if $V$ is $\mathcal{F}_s$-measurable,
$$dX_t = X_t\, [m\, dt + \sigma\, dB_t],$$
dBt = r dt + dWt ,
and in the new measure, Bt was a Brownian motion with drift. We will
generalize this idea here.
Suppose Mt is a nonnegative martingale satisfying the exponential SDE
dMt = At Mt dBt , M0 = 1, (5.7)
where $B_t$ is a standard Brownian motion. The solution to this equation was given in (3.20),
$$M_t = e^{Y_t} \quad \text{where} \quad Y_t = \int_0^t A_s\, dB_s - \frac{1}{2} \int_0^t A_s^2\, ds. \tag{5.8}$$
For many applications it suffices to consider the equation (5.7) and not worry
about the form of the solution (5.8). Solutions to (5.7) are local martingales,
but as we have seen, they might not be martingales. For now we will assume
that $M_t$ is a martingale. In that case, we can define a probability measure $P_*$ by saying that if $V$ is an $\mathcal{F}_t$-measurable event, then
$$P_*(V) = E[1_V M_t]. \tag{5.9}$$
In other words, if we consider $P, P_*$ as being defined on $\mathcal{F}_t$-measurable events,
$$\frac{dP_*}{dP} = M_t.$$
If s < t and V is Fs -measurable, then V is also Ft -measurable. Hence, in
order for the above definition to be consistent, we need that for such V ,
E [1V Ms ] = E [1V Mt ] .
Indeed, this holds by the computation (5.4), which only uses the fact that $M$ is a martingale and $V$ is $\mathcal{F}_s$-measurable. We write $E_*$ for expectations with respect to $P_*$. If $X$ is $\mathcal{F}_t$-measurable, then
$$E_*[X] = E[X\, M_t].$$
Theorem 5.3.1 (Girsanov Theorem). Suppose $M_t$ is a nonnegative martingale satisfying (5.7), and let $P_*$ be the probability measure defined in (5.9). If
$$W_t = B_t - \int_0^t A_s\, ds,$$
then with respect to the measure $P_*$, $W_t$ is a standard Brownian motion. In other words,
$$dB_t = A_t\, dt + dW_t,$$
where $W_t$ is a $P_*$-Brownian motion.
that is, in the probability measure $P_*$, the process obtains a drift of $A_t$. The condition that $M_t$ be a martingale (and not just a local martingale) is necessary for Girsanov's theorem as we have stated it. Given only (5.7) or (5.8), it may be hard to determine if $M_t$ is a martingale, so it is useful to have a version that applies for local martingales. If we do not know that $M_t$ is a martingale, we can still use Theorem 5.3.1 if we are careful to stop the process before anything bad happens. To be more precise, suppose $M_t = e^{Y_t}$ satisfies (5.8), and note that
$$\langle Y \rangle_t = \int_0^t A_s^2\, ds.$$
Let
$$T_n = \inf\{t : M_t + \langle Y \rangle_t = n\},$$
and let
$$A_t^{(n)} = \begin{cases} A_t, & t < T_n, \\ 0, & t \ge T_n. \end{cases}$$
Then
$$dM_{t \wedge T_n} = A_t^{(n)}\, M_{t \wedge T_n}\, dB_t,$$
which is a square integrable martingale since
$$E\left[ (M_{t \wedge T_n} - 1)^2 \right] = \int_0^t E\left[ (A_s^{(n)})^2\, M_{s \wedge T_n}^2 \right] ds \le n^2 \int_0^t E\left[ (A_s^{(n)})^2 \right] ds \le n^3.$$
There is the corresponding measure, which we might denote by $P_n$, which gives a drift of $A_t$ up to time $T_n$ and then proceeds with drift 0. If $n < m$, then $P_n$ and $P_m$ are the same measure restricted to $t \le T_n$. Hence we can write $P_*$ for a measure on $B_t$, $0 \le t < T$, where
$$T = \lim_{n \to \infty} T_n.$$
This shows how to "tilt" the measure up to time $T$. There are examples such that $P_*\{T < \infty\} > 0$. However, if for some fixed $t_0$, $P_*\{T > t_0\} = 1$, then $M_t$, $0 \le t \le t_0$, is a martingale. In other words, what prevents a solution to (5.7) from being a martingale is that with respect to the new measure $P_*$, either $M_t$ or $|A_t|$ goes to infinity in finite time. We summarize with a restatement of the Girsanov theorem.
Theorem 5.3.2 (Girsanov Theorem, local martingale form). Suppose $M_t = e^{Y_t}$ satisfies (5.7)--(5.8), and let
\[ T_n = \inf\{t : M_t + \langle Y\rangle_t = n\}, \qquad T = \lim_{n\to\infty} T_n. \]
It is not always easy to see whether or not the local martingale in (5.7) will be a martingale. However, if any one of the three conditions at the end of the theorem holds, then it is a martingale. The first condition uses $P^*$, the new measure, while the expectation $E$ in the other two conditions is with respect to the original measure. The relation (5.10) is called the Novikov condition.
Even if $M_t$ is not a martingale, since
\[ M_t = \lim_{n\to\infty} M_{t\wedge T_n}, \]
Let
\[ M_t = \exp\left\{-\int_0^t \frac{r(r-1)}{2B_s^2}\,ds\right\} B_t^r. \]
The product rule shows that $M_t$ satisfies the exponential SDE
\[ dM_t = \frac{r}{B_t}\,M_t\,dB_t, \qquad t < \tau. \]
Therefore,
\[ dB_t = \frac{r}{B_t}\,dt + dW_t, \qquad t < \tau, \]
where $W_t$ is a Brownian motion in the new measure. This equation is the Bessel equation. In particular, if $r \ge 1/2$, then $P^*\{\tau = \infty\} = 1$, and using this we see that with $P$-probability one
\[ M_t + \int_0^t A_s^2\,ds \longrightarrow \infty \quad \text{as } t \to \tau. \]
\[ E^*[1_V W_s] = E^*[1_V W_t], \quad\text{that is,}\quad E[1_V W_s M_s] = E[1_V W_t M_t]. \]
$Z_t$ is a square-integrable martingale.

For more general $A_t, M_t$ we use localization with the stopping times $T_n$ as above. Note that
\[ E[M_t] \ge \lim_{n\to\infty} E\big[M_t\,1\{T_n > t\}\big] = \lim_{n\to\infty} P^*\{T_n > t\} = P^*\{T > t\}, \]
\[ \sigma_r = \inf\{t : \langle Y\rangle_t = r\}. \]
The process
\[ X_r = \int_0^{\sigma_r} A_s\,dW_s \]
is a standard Brownian motion with respect to the measure $P^*$. Also,
\[ Y_{\sigma_r} = X_r + \frac12\int_0^{\sigma_r} A_s^2\,ds = X_r + \frac r2. \]
In particular,
\[ \max_{0\le s\le \sigma_r} M_s = \max_{0\le t\le r}\exp\Big\{X_t + \frac t2\Big\}. \]
In other words, $T = \lim T_n$ can be defined as
\[ T = \sup\{t : \langle Y\rangle_t < \infty\}. \]
Let $V$ denote the event that $\sigma_r \le 1$ for all $r < \infty$. We need to show that $P(V) = 0$. Let
\[ \xi_n = \min\{m \ge n : X_m \le 0\}. \]
Since $X_m$ is a $P^*$-local martingale, for each $n$, $P[V \cap \{\xi_n < \infty\}] = P(V)$. Also,
\[ M_{\sigma_{\xi_n}} = \exp\{Y_{\sigma_{\xi_n}}\} \le e^{\xi_n/2}. \]
Let us fix $n$, let $\xi = \xi_n$, and note that
\[ P(V) \le \sum_{m=n}^\infty E\big[M_{\sigma_m}\,1\{\xi = m\}\big] \le \sum_{m=n}^\infty e^{m/2}\,P\{\xi = m\} \le \sum_{m=n}^\infty E\big[e^{\langle Y\rangle_1/2}\,1\{\xi = m\}\big]. \]
We therefore get
\[ P(V) \le E\big[e^{\langle Y\rangle_1/2}\,1\{\sigma_\xi < 1\}\big]. \]
that is, $R_t = e^{rt}R_0$. Let $T$ be a time in the future and suppose we have the option to buy a share of stock at time $T$ for strike price $K$. The value of this option at time $T$ is
\[ F(S_T) = (S_T - K)^+ = \begin{cases} S_T - K & \text{if } S_T > K,\\ 0 & \text{if } S_T \le K.\end{cases} \]
The goal is to find the price $f(t,x)$ of the option at time $t < T$ given $S_t = x$.
\[ \partial_t f(t,x) = r\,f(t,x) - m\,x\,f'(t,x) - \frac{\sigma^2 x^2}{2}\,f''(t,x). \qquad (5.14) \]
Here and throughout this section we use primes for $x$-derivatives. If one sells an option at this price and uses the money to buy a bond at the current interest rate, then there is a positive probability of losing money.
The Black-Scholes approach to pricing is to let $f(t,x)$ be the value of a portfolio at time $t$, given that $S_t = x$, that can be hedged in order to guarantee a portfolio of value $F(S_T)$ at time $T$. By a portfolio, we mean an ordered pair $(a_t, b_t)$ where $a_t, b_t$ denote the number of units of stocks and bonds, respectively. Let $V_t$ be the value of the portfolio at time $t$,
\[ V_t = a_t S_t + b_t R_t. \qquad (5.15) \]
\[ dV_t = a_t S_t\,[m\,dt + \sigma\,dB_t] + b_t\,r\,R_t\,dt = a_t S_t\,[m\,dt + \sigma\,dB_t] + r\,[V_t - a_t S_t]\,dt = \big[m\,a_t S_t + r\,(V_t - a_t S_t)\big]\,dt + \sigma\,a_t S_t\,dB_t. \qquad (5.17) \]
\[ dV_t = df(t,S_t) = \partial_t f(t,S_t)\,dt + f'(t,S_t)\,dS_t + \frac12 f''(t,S_t)\,d\langle S\rangle_t = \Big[\partial_t f(t,S_t) + m\,S_t\,f'(t,S_t) + \frac{\sigma^2 S_t^2}{2}\,f''(t,S_t)\Big]\,dt + \sigma\,S_t\,f'(t,S_t)\,dB_t. \qquad (5.18) \]
By equating the $dB_t$ terms in (5.17) and (5.18), we see that the portfolio is given by
\[ a_t = f'(t,S_t), \qquad b_t = \frac{V_t - a_t S_t}{R_t}, \qquad (5.19) \]
and then by equating the $dt$ terms we get the Black-Scholes equation
\[ \partial_t f(t,x) = r\,f(t,x) - r\,x\,f'(t,x) - \frac{\sigma^2 x^2}{2}\,f''(t,x). \]
There are two things to note about this equation.

The drift term $m$ does not appear. If we think about this, we realize why our assumptions should give us an equation independent of $m$. Our price was based on being able to hedge our portfolio so that with probability one the value at time $T$ is $(S_T - K)^+$. Geometric Brownian motions with the same $\sigma$ but different $m$ are mutually absolutely continuous and hence have the same events of probability one.

The equation is exactly the same as (5.14) except that $m$ has been replaced with $r$. Therefore, we can write
\[ f(t,x) = E\big[e^{-r(T-t)}\,F(S_T) \mid S_t = x\big], \]
where $S$ satisfies
\[ dS_t = S_t\,[r\,dt + \sigma\,dB_t]. \]
Using this, one can compute $f(t,x)$ exactly; in the next section we do this and derive the Black-Scholes formula
\[ f(T-t,x) = x\,\Phi\!\left(\frac{\log(x/K) + (r + \frac{\sigma^2}{2})\,t}{\sigma\sqrt t}\right) - K\,e^{-rt}\,\Phi\!\left(\frac{\log(x/K) + (r - \frac{\sigma^2}{2})\,t}{\sigma\sqrt t}\right), \qquad (5.20) \]
where $\Phi$ denotes the standard normal distribution function.
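Formula (5.20) can be evaluated with nothing more than the error function, since $\Phi(x) = \frac12\big(1+\operatorname{erf}(x/\sqrt2)\big)$. A short sketch (function names are our own):

```python
from math import erf, exp, log, sqrt

def norm_cdf(x: float) -> float:
    """Standard normal distribution function Phi."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(x: float, K: float, r: float, sigma: float, t: float) -> float:
    """Black-Scholes price (5.20) of a European call: current stock price x,
    strike K, rate r, volatility sigma, time t remaining to maturity."""
    d1 = (log(x / K) + (r + sigma**2 / 2.0) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    return x * norm_cdf(d1) - K * exp(-r * t) * norm_cdf(d2)

# classic textbook test case: at-the-money, one year to maturity
price = bs_call(100.0, 100.0, 0.05, 0.2, 1.0)
```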
We can easily generalize this to the case where the stock price satisfies
\[ dS_t = S_t\,[m(t,S_t)\,dt + \sigma(t,S_t)\,dB_t]; \]
the Black-Scholes equation becomes
\[ \partial_t f(t,x) = r(t,x)\,f(t,x) - r(t,x)\,x\,f'(t,x) - \frac{\sigma(t,x)^2\,x^2}{2}\,f''(t,x). \qquad (5.21) \]
As before, the drift term $m(t,x)$ does not appear in the equation. The function $f$ can be given by
\[ V_t = R_t\,E^Q\big[R_T^{-1}\,F(S_T) \,\big|\, \mathcal F_t\big], \qquad (5.23) \]
where $S_t, R_t$ satisfy
\[ dS_t = S_t\,[r(t,S_t)\,dt + \sigma(t,S_t)\,dB_t]. \]
\[ Z = \exp\{aN + y\}, \]
where $a = \sigma\sqrt{T-t}$, $N$ is a standard normal random variable, and
\[ y = \log S_t - \frac{a^2}{2}. \]
Straightforward calculus shows that $Z$ has a density
\[ g(z) = \frac{1}{az}\,\phi\!\left(\frac{\log z - y}{a}\right), \]
where $\phi$ denotes the standard normal density.
\[ V = \max_{0\le t\le T} S_t, \qquad V = \frac1T\int_0^T S_t\,dt. \]
We will start with the following definition.

Definition (first try). If $V$ is a claim at time $T$, then the (arbitrage-free) price $V_t$, $0 \le t \le T$, of the claim $V$ is the minimum value of a self-financing portfolio that can be hedged to guarantee that its value at time $T$ is $V$.
Our goal is to determine the price Vt and the corresponding portfolio
(at , bt ), where at denotes the number of units of S and bt the number of
units of R. This will require some mathematical assumptions that we will
make as we need them. Recall that
Vt = at St + bt Rt ,
We will start by giving two bad, but unrealistic, examples that show that we
need to take some care. The two examples are similar. In the first example,
we allow the stock price to fluctuate too much. In the second, we choose a
very risky portfolio similar to the martingale betting strategy.
(Note that the discounted bond value is $\tilde R_t \equiv R_0$.) Using the product formula, we see that
\[ d\tilde S_t = \tilde S_t\,\big[(m_t - r_t)\,dt + \sigma_t\,dB_t\big]. \]
Our goal is to find a self-financing portfolio $(a_t, b_t)$ such that with probability one
\[ \tilde V_T = a_T\,\tilde S_T + b_T\,R_0 = \tilde V. \]
Since this must happen with probability one, we may consider a mutually absolutely continuous measure. We let $Q$ be the probability measure (if it exists) that is mutually absolutely continuous with respect to $P$ such that under $Q$ the discounted stock price is a martingale. Recalling (5.11), we can see that the Girsanov theorem tells us to choose
\[ dQ = M_t\,dP, \]
where $M_t$ satisfies
\[ dM_t = \frac{r_t - m_t}{\sigma_t}\,M_t\,dB_t, \qquad M_0 = 1. \qquad (5.27) \]
The solution to this last equation is a local martingale, but it is not necessarily a martingale. If it is not a martingale, then some undesirable conclusions may result as in our examples above. Our first assumption will be that it is a martingale.

Assumption 1. The local martingale defined in (5.27) is actually a martingale.
This assumption implies that $Q$ is mutually absolutely continuous with respect to $P$. Theorem 5.3.2 gives a number of ways to establish $Q \ll P$. If $Q \ll P$, then we also get $P \ll Q$ if $P\{M_t > 0\} = 1$. Let
\[ W_t = B_t - \int_0^t \frac{r_s - m_s}{\sigma_s}\,ds, \]
which is a Brownian motion with respect to $Q$. Plugging in we see that
\[ d\tilde S_t = \sigma_t\,\tilde S_t\,dW_t. \qquad (5.28) \]
This shows that $\tilde S_t$ is a local martingale with respect to $Q$. We will want this to be a martingale, and we make this assumption.

Assumption 2. The $Q$-local martingale $\tilde S_t$ satisfying (5.28) is actually a $Q$-martingale.
Again, Theorem 5.3.2 gives some sufficient conditions for establishing that the solution to (5.28) is a $Q$-martingale. We write $E^Q$ for expectations (both ordinary and conditional) with respect to $Q$.
\[ V_t = a_t S_t + b_t R_t, \]
that is self-financing,
\[ dV_t = a_t\,dS_t + b_t\,dR_t, \]
then
\[ Z_t = \int_0^t A_s\,dW_s \]
is a square integrable martingale. Let us assume for the moment that there exists a process $A_s$ such that
\[ \tilde V_t = \tilde V_0 + \int_0^t A_s\,dW_s, \]
that is,
\[ d\tilde V_t = A_t\,dW_t. \qquad (5.29) \]
We compute,
Along the way we made the assumption that we could write $\tilde V_t$ as (5.29). It turns out, as we discuss in the next section, that this can always be done, although we cannot guarantee that the process $A_t$ is continuous or piecewise continuous. Knowing existence of the process is not very useful if one cannot find $A_t$. For now we just write as an assumption that the computations work out.

Assumption 3. We can write $\tilde V_t$ as (5.29), and if we define $a_t, b_t$ as in (5.30), then the stochastic integral
\[ V_t = V_0 + \int_0^t a_s\,dS_s + \int_0^t b_s\,dR_s \]
is well defined.
Theorem 5.6.1. If $V$ is a contingent claim and Assumptions 1--3 hold, then the arbitrage-free price is
\[ V_t = R_t\,E^Q\big[R_T^{-1}\,V \mid \mathcal F_t\big]. \]
We have done most of the work in proving this theorem. What remains is to show that if $(a_t, b_t)$ is a self-financing strategy with value
\[ \hat V_t = a_t S_t + b_t R_t \]
such that with probability one, $\hat V_t \ge 0$ for all $t$ and $\hat V_T \ge V$, then for all $t$, with probability one $\hat V_t \ge V_t$. When we say "with probability one" this can be with respect to either $P$ or $Q$, since one of our assumptions is that the two measures are mutually absolutely continuous. Let $\tilde V_t = \hat V_t/R_t$ be the discounted value. The product rule gives
\[ d\hat V_t = d(R_t\,\tilde V_t) = R_t\,d\tilde V_t + \tilde V_t\,dR_t = R_t\,d\tilde V_t + \big[a_t\,\tilde S_t + b_t\big]\,dR_t. \]
If $\tilde V_T \ge \tilde V$, then
\[ E^Q\big[\tilde V_T \mid \mathcal F_t\big] \ge E^Q\big[\tilde V \mid \mathcal F_t\big] = \tilde V_t. \]
The product rule implies that the discounted stock price satisfies
\[ V_t = R_t\,\tilde V_t = R_t\,E^Q\big[R_T^{-1}\,V \mid \mathcal F_t\big] = \phi(t, S_t). \]
\[ dS_t = S_t\,[m\,dt + \sigma\,dB_t], \]
and the bond rate is constant $r$. Suppose that the claim is the average stock price over the interval $[0,T]$,
\[ V = \frac1T\int_0^T S_t\,dt. \]
\[ d\tilde S_t = \sigma\,\tilde S_t\,dW_t. \]
Since $\int_0^t S_s\,ds$ is $\mathcal F_t$-measurable, we get
\[ T\,e^{rT}\,\tilde V_t = \int_0^t S_s\,ds + E^Q\Big[\int_t^T S_s\,ds \,\Big|\, \mathcal F_t\Big] = \int_0^t S_s\,ds + \int_t^T E^Q[S_s \mid \mathcal F_t]\,ds. \]
The second equality uses a form of Fubini's theorem that follows from the linearity of conditional expectation. Since $\tilde S_s$ is a $Q$-martingale, if $s > t$,
\[ E^Q[S_s \mid \mathcal F_t] = e^{rs}\,E^Q[\tilde S_s \mid \mathcal F_t] = e^{rs}\,\tilde S_t = e^{r(s-t)}\,S_t. \]
Therefore,
\[ \int_t^T E^Q[S_s \mid \mathcal F_t]\,ds = S_t\int_t^T e^{r(s-t)}\,ds = S_t\,\frac{e^{r(T-t)}-1}{r}, \]
and
\[ V_t = e^{rt}\,\tilde V_t = \frac{e^{-r(T-t)}}{T}\int_0^t S_s\,ds + \frac{1-e^{-r(T-t)}}{rT}\,S_t. \]
Note that $V_T = V$, which we needed, and the price at time $0$ is
\[ V_0 = \frac{1-e^{-rT}}{rT}\,S_0. \]
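Since the time-0 price $V_0 = S_0(1-e^{-rT})/(rT)$ does not depend on $\sigma$, it is easy to check against a risk-neutral Monte Carlo simulation of $dS_t = S_t[r\,dt + \sigma\,dW_t]$. A sketch (parameter values are illustrative):

```python
import numpy as np

# Check V_0 = S_0 (1 - e^{-rT})/(rT) for the average-price claim
# V = (1/T) \int_0^T S_t dt against Monte Carlo under the measure Q,
# where dS_t = S_t [r dt + sigma dW_t].
rng = np.random.default_rng(1)
S0, r, sigma, T = 1.0, 0.05, 0.2, 1.0
n_steps, n_paths = 50, 20_000
dt = T / n_steps

closed_form = S0 * (1.0 - np.exp(-r * T)) / (r * T)

# exact lognormal increments of geometric Brownian motion under Q
Z = rng.standard_normal((n_paths, n_steps))
incr = np.exp((r - sigma**2 / 2.0) * dt + sigma * np.sqrt(dt) * Z)
S = S0 * np.cumprod(incr, axis=1)
S = np.hstack([np.full((n_paths, 1), S0), S])         # prepend S_0

# trapezoidal approximation of (1/T) \int_0^T S_t dt per path
avg = (0.5 * S[:, 0] + S[:, 1:-1].sum(axis=1) + 0.5 * S[:, -1]) * dt / T
mc_price = np.exp(-r * T) * avg.mean()
```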
The hedging portfolio can be worked out with a little thought. We will start with all the money in stocks and as time progresses we move money into bonds. Suppose that during the time interval $[t, t+\Delta t]$ we convert $u\,\Delta t$ units of stock into bonds. Then the value of these units of bonds at time $T$ will be about $u\,e^{r(T-t)}\,S_t\,\Delta t$. If we choose $u = e^{r(t-T)}/T$, then the value will be about $S_t\,\Delta t/T$ and hence the value of all our bonds will be about $\frac1T\int_0^T S_s\,ds$. This gives us
\[ \frac{da_t}{dt} = -\frac{e^{r(t-T)}}{T}, \]
\[ a_t = \frac{1 - e^{r(t-T)}}{rT}. \qquad (5.32) \]
This is a special case where the hedging strategy does not depend on the current stock price $S_t$. If we want to use the formula we derived, we use (5.31) to give
\[ d\tilde V_t = \frac{1-e^{r(t-T)}}{rT}\,d\tilde S_t = \frac{1-e^{r(t-T)}}{rT}\,\sigma\,\tilde S_t\,dW_t. \]
Plugging this into (5.30) gives (5.32).
\[ V_t = E[V \mid \mathcal F_t]. \]
\[ X_{\Delta t},\ X_{2\Delta t},\ \ldots \]
each with
\[ P\{X_{k\Delta t} = \Delta x\} = P\{X_{k\Delta t} = -\Delta x\} = \frac12. \]
Let $\mathcal F_{k\Delta t}$ denote the information in $\{X_{\Delta t}, X_{2\Delta t}, \ldots, X_{k\Delta t}\}$, and assume that $M_{k\Delta t}$ is a martingale with respect to $\mathcal F_{k\Delta t}$. The martingale property implies that
\[ E\big[M_{(k+1)\Delta t} \mid \mathcal F_{k\Delta t}\big] = M_{k\Delta t}. \qquad (5.33) \]
\[ X_{\Delta t},\ X_{2\Delta t},\ \ldots,\ X_{(k+1)\Delta t}. \]
Given a particular value of $X_{k\Delta t}$, there are only two possible values for $M_{(k+1)\Delta t}$, corresponding to the values when $X_{(k+1)\Delta t} = \Delta x$ and $X_{(k+1)\Delta t} = -\Delta x$, respectively. Let us denote the two values by
\[ [b+a]\,\Delta x, \qquad [b-a]\,\Delta x, \]
where $b\,\Delta x$ is the average of the two values. The two numbers $a, b$ depend on $X_{\Delta t}, \ldots, X_{k\Delta t}$ and hence are $\mathcal F_{k\Delta t}$-measurable. The martingale property (5.33) tells us that
\[ b\,\Delta x = M_{k\Delta t}, \]
and hence
\[ M_{(k+1)\Delta t} - M_{k\Delta t} = \pm a\,\Delta x = a\,X_{(k+1)\Delta t}. \]
If we write $J_{(k+1)\Delta t}$ for the number $a$, then $J_{(k+1)\Delta t}$ is $\mathcal F_{k\Delta t}$-measurable, and
\[ M_{k\Delta t} = M_0 + \sum_{j=1}^k J_{j\Delta t}\,X_{j\Delta t}. \]
This is the form of the stochastic integral with respect to random walk as in Section 1.6.
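The representation can be checked on a concrete martingale. Take $\Delta t = \Delta x = 1$ and $M_k = S_k^2 - k$ for the simple random walk $S_k$ (our illustrative example, not the text's); the argument above gives $J_k = 2S_{k-1}$, and the representation holds exactly, path by path:

```python
import numpy as np

# Martingale representation along a random-walk path: for the martingale
# M_k = S_k^2 - k (S_k = X_1 + ... + X_k with fair +-1 coin flips X_j),
# M_{k+1} - M_k = 2 S_k X_{k+1} + X_{k+1}^2 - 1 = 2 S_k X_{k+1},
# so the predictable integrand is J_{k+1} = 2 S_k.
rng = np.random.default_rng(2)
n = 1000
X = rng.choice([-1, 1], size=n)           # increments X_1, ..., X_n
S = np.concatenate([[0], np.cumsum(X)])   # S_0, S_1, ..., S_n
M = S[1:]**2 - np.arange(1, n + 1)        # M_1, ..., M_n (M_0 = 0)

J = 2 * S[:-1]                            # J_k = 2 S_{k-1}, known before step k
stoch_int = np.cumsum(J * X)              # M_0 + sum_{j<=k} J_j X_j
max_gap = np.max(np.abs(M - stoch_int))   # representation is exact: gap is 0
```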
5.8 Exercises
Exercise 5.1. For each of the following random variables $X_j$ on $\mathbb R$, let $\mu_j$ be the distribution considered as a probability measure on $\mathbb R$. For each pair state whether or not $\mu_j \ll \mu_k$. Since there are six random variables, there should be 30 answers.
\[ M_n = 1 - W_n, \qquad Q_n(V) = E[M_n\,1_V]. \]
3. Given the last part, we can write $Q$ rather than just $Q_n$. Find the transition probability
\[ Q\big\{M_{n+1} = 2^{n+1} \mid M_n = 2^n\big\}. \]
dXt = 2 dt + dBt ,
dXt = 2 dt + 6 dBt ,
dXt = 2Bt dt + dBt .
is a local martingale.
is a local martingale for t < T . (Do not worry about what happens
after time T .)
Jump processes
Independent increments
Stationary increments
Continuous paths.
that it is increasing, that is, with probability one if $r < s$, then $T_r < T_s$. We calculated the density of $T_s$ in Example 2.7.1,
\[ f_s(t) = \frac{s}{\sqrt{2\pi}\,t^{3/2}}\,e^{-s^2/2t}, \qquad 0 < t < \infty. \]
In particular, the distribution of $T_s$ is not normal. We claim that (with probability one) the function $s \mapsto T_s$ is not continuous. To see this, let $M = \max_{0\le t\le1} B_t$ and let $s_0 \in (0,1)$ be a time $t$ with $B_t = M$. Then by definition of $M, T_s$ we see that $T_M \le s_0$, but $T_s > 1$ for $s > M$, showing that $T$ is not continuous at $s = M$. The scaling property of Brownian motion implies that $T_s$ has the same distribution as $s^2\,T_1$. A tricky calculation, which we omit, computes the characteristic function
\[ E\big[e^{i(r/s^2)\,T_s}\big] = E\big[e^{irT_1}\big] = \int e^{irt}\,f_1(t)\,dt = e^{\Psi(r)}, \]
where
\[ \Psi(r) = \begin{cases} -|r|^{1/2}\,(1-i), & r \ge 0,\\ -|r|^{1/2}\,(1+i), & r \le 0. \end{cases} \]
Example 6.1.2. Let $B_t = (B_t^1, B_t^2)$ be a standard two-dimensional Brownian motion. Let
\[ T_s = \inf\{t : B_t^1 = s\}, \]
and
\[ X_s = B^2(T_s). \]
Using the strong Markov property, one can show that the increments are independent and stationary, and, similarly to Example 6.1.1, that the paths are discontinuous. The scaling property of Brownian motion implies that $X_s$ has the same distribution as $s\,X_1$. The density of $X_1$ turns out to be Cauchy,
\[ f(x) = \frac{1}{\pi\,(x^2+1)}, \qquad -\infty < x < \infty, \]
with characteristic function
\[ \int_{-\infty}^\infty e^{irx}\,f(x)\,dx = e^{-|r|}. \]
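Without simulating the two-dimensional Brownian motion, the Cauchy distribution of $X_1$ is easy to sample directly — a standard Cauchy variable is the ratio of two independent standard normals — and its distribution function $\frac12 + \frac{\arctan x}{\pi}$ gives $P\{X_1 \le 1\} = 3/4$. A sketch:

```python
import numpy as np

# The Cauchy distribution of X_1 = B^2(T_1): a standard Cauchy random
# variable can be generated as the ratio Z1/Z2 of independent standard
# normals.  Its distribution function is F(x) = 1/2 + arctan(x)/pi,
# so P{X_1 <= 1} = 1/2 + (pi/4)/pi = 3/4.
rng = np.random.default_rng(3)
n = 100_000
Z1, Z2 = rng.standard_normal(n), rng.standard_normal(n)
X = Z1 / Z2                       # standard Cauchy samples

frac_below_1 = np.mean(X <= 1.0)  # should be close to 0.75
median = np.median(X)             # distribution is symmetric about 0
```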
so that
\[ E\big[e^{irX_t}\big] = \Big(E\big[e^{irX_{t/n}}\big]\Big)^n, \]
and $X_t$ has the same distribution as the sum of $n$ independent, identically distributed random variables
\[ Y_{1,n} + \cdots + Y_{n,n}. \]
The goal of the next few sections is to show that every infinitely divisible random variable is the sum of a normal random variable and a "generalized Poisson" or "jump" random variable. The category "generalized Poisson" is rather large and will include the distributions in Examples 6.1.1 and 6.1.2.
exist, where the limits are taken over the dyadic rationals. We then define $X_t$ to be $X_{t+}$.
\[ p(t) = \lambda t + o(t), \qquad t \downarrow 0. \]
Let
\[ T = \inf\{t : X_t = 1\} \]
denote the amount of time until the first jump. Using independent, stationary increments, we see that
\[ P\{T > t\} = \prod_{j=1}^n P\Big\{\text{no jump during } \Big[\tfrac{(j-1)t}{n}, \tfrac{jt}{n}\Big]\Big\} = \Big[1 - p\Big(\tfrac tn\Big)\Big]^n. \]
Therefore,
\[ P\{T > t\} = \lim_{n\to\infty}\Big[1 - p\Big(\tfrac tn\Big)\Big]^n = \lim_{n\to\infty}\Big[1 - \tfrac{\lambda t}{n} + o\Big(\tfrac tn\Big)\Big]^n = e^{-\lambda t}. \]
Recall that a random variable $T$ has an exponential distribution with rate $\lambda$ if it has density
\[ f(t) = \lambda\,e^{-\lambda t}, \qquad 0 < t < \infty, \]
and hence
\[ P\{T > t\} = \int_t^\infty f(s)\,ds = e^{-\lambda t}. \]
Our assumptions imply that the waiting times of a Poisson process with parameter $\lambda$ must be exponential with rate $\lambda$. Note that
\[ E[T] = \int_0^\infty t\,f(t)\,dt = \frac1\lambda, \]
so that the mean waiting time is (quite reasonably) the reciprocal of the rate. This observation gives a way to construct a Poisson process; it also gives a good way to simulate Poisson processes (see Exercise 6.1).
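The construction is easy to code: sum independent $\mathrm{Exp}(\lambda)$ waiting times until they exceed $t$; the number of completed waits is $X_t$, with $E[X_t] = \mathrm{Var}[X_t] = \lambda t$. A sketch with illustrative parameters:

```python
import numpy as np

# Construct a Poisson process at a fixed time t by summing independent
# exponential waiting times with rate lam: X_t = max{n : T_1+...+T_n <= t}.
rng = np.random.default_rng(4)
lam, t, n_sims = 3.0, 2.0, 20_000

counts = np.empty(n_sims, dtype=int)
for i in range(n_sims):
    total, n = 0.0, 0
    while True:
        total += rng.exponential(1.0 / lam)  # waiting time with rate lam
        if total > t:
            break
        n += 1
    counts[i] = n

mean_count = counts.mean()   # should be close to lam * t = 6
var_count = counts.var()     # for a Poisson count, variance is also lam * t
```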
\[ \tau_n = T_1 + \cdots + T_n. \]
Set
\[ X_t = n \quad\text{for } \tau_n \le t < \tau_{n+1}. \]
Note that we have defined the process so that the paths are right-continuous,
\[ X_t = X_{t+} := \lim_{s\downarrow t} X_s. \]
The paths also have limits from the left, that is, for every $t$ the limit
\[ X_{t-} = \lim_{s\uparrow t} X_s \]
exists. If $t$ is a time that the process jumps, that is, if $t = \tau_n$ for some $n > 0$, then
\[ X_t = X_{t+} = X_{t-} + 1. \]
At all other times the path is continuous, $X_{t-} = X_t = X_{t+}$.

The independent, stationary increments follow from the construction. We can use the assumptions to show the following. One way to derive this is to write a system of differential equations for the functions
\[ q_k(t) = P\{X_t = k\}. \]
In the small time interval $[t, t+\Delta t]$, the chance that there is more than one jump is $o(\Delta t)$ and the chance that there is exactly one jump is $\lambda\,\Delta t + o(\Delta t)$. Therefore, up to errors that are $o(\Delta t)$,
\[ q_k(t+\Delta t) = q_k(t)\,[1 - \lambda\,\Delta t] + q_{k-1}(t)\,\lambda\,\Delta t. \]
This gives
\[ \frac{q_k(t+\Delta t) - q_k(t)}{\Delta t} = \lambda\,[q_{k-1}(t) - q_k(t)] + o(1), \]
or
\[ \frac{dq_k(t)}{dt} = \lambda\,[q_{k-1}(t) - q_k(t)]. \]
If we assume $X_0 = 0$, we also have the initial conditions $q_0(0) = 1$ and $q_k(0) = 0$ for $k > 0$. We can solve this system of equations recursively, and this yields the solutions
\[ q_k(t) = e^{-\lambda t}\,\frac{(\lambda t)^k}{k!}. \]
(Although it takes some good guesswork to start with the equations and find $q_k(t)$, it is easy to verify that $q_k(t)$ as given above satisfies the equations.)
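The verification is equally easy numerically: compare a finite-difference derivative of $q_k(t) = e^{-\lambda t}(\lambda t)^k/k!$ with $\lambda[q_{k-1}(t) - q_k(t)]$ at a few sample points. A sketch:

```python
import numpy as np
from math import factorial

# Check that q_k(t) = e^{-lam t} (lam t)^k / k! solves the Poisson system
# dq_k/dt = lam [q_{k-1}(t) - q_k(t)], using a central finite difference.
lam = 2.0

def q(k: int, t: float) -> float:
    return np.exp(-lam * t) * (lam * t)**k / factorial(k)

h = 1e-5
worst = 0.0
for k in range(1, 8):
    for t in [0.3, 1.0, 2.5]:
        lhs = (q(k, t + h) - q(k, t - h)) / (2 * h)   # numerical dq_k/dt
        rhs = lam * (q(k - 1, t) - q(k, t))           # right-hand side of ODE
        worst = max(worst, abs(lhs - rhs))
```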
When studying infinitely divisible distributions, it will be useful to consider characteristic functions, and for notational ease, we will take logarithms. Since the characteristic function is complex-valued, we take a little care in defining the logarithm.
\[ \Psi_{X+Y} = \Psi_X + \Psi_Y. \qquad (6.1) \]
If $X \sim N(m, \sigma^2)$, then
\[ \Psi(s) = ims - \frac{\sigma^2 s^2}{2}. \]
We now compute the generator for the Poisson process. Although we think of the Poisson process as taking integer values, there is no problem extending the definition so that $X_0 = x$. In this case the values taken by the process are $x, x+1, x+2, \ldots$. Up to terms that are $o(\Delta t)$, $P\{X_{\Delta t} = X_0 + 1\} = 1 - P\{X_{\Delta t} = X_0\} = \lambda\,\Delta t$. Therefore as $\Delta t \downarrow 0$,
\[ E[f(X_{\Delta t}) \mid X_0 = x] = f(x) + \lambda\,\Delta t\,[f(x+1) - f(x)] + o(\Delta t), \]
and
\[ Lf(x) = \lambda\,[f(x+1) - f(x)]. \]
The same argument shows that if $f(t,x)$ is defined by
\[ f(t,x) = E[F(X_t) \mid X_0 = x], \]
then $f$ satisfies the heat equation (6.3) with the generator $L$. We can view the generator as the operator on functions $f$ that makes (6.3) hold.

The generators satisfy the following linearity property: if $X_t^1, X_t^2$ are independent Lévy processes with generators $L_1, L_2$, respectively, then $X_t = X_t^1 + X_t^2$ has generator $L_1 + L_2$. For example, if $X$ is the Lévy process with $\Psi$ as in (6.2),
\[ Lf(x) = m\,f'(x) + \frac{\sigma^2}{2}\,f''(x) + \lambda\,[f(x+1) - f(x)]. \]
\[ N_t = n \quad\text{if } T_1 + \cdots + T_n \le t < T_1 + \cdots + T_{n+1}, \]
\[ S_n = Y_1 + \cdots + Y_n, \qquad S_0 = 0. \]
Set $X_t = S_{N_t}$.
\[ \mu = \lambda\,\mu^\#. \]
Moreover, if
\[ \sigma^2 := \int x^2\,d\mu(x) < \infty, \qquad (6.5) \]
and
\[ m = \int x\,d\mu(x), \]
Proof. Let
\[ \phi(s) = E\big[e^{isY_j}\big] = \int e^{isx}\,d\mu^\#(x), \]
Therefore,
\[ E[f(X_{\Delta t}) \mid X_0 = x] = [1 - \lambda\,\Delta t]\,f(x) + \lambda\,\Delta t\int f(x+y)\,d\mu^\#(y) + o(\Delta t) = f(x) + \Delta t\int [f(x+y) - f(x)]\,d\mu(y) + o(\Delta t), \]
and hence
\[ E[X_t - tm \mid \mathcal F_s] = X_s - sm. \]
This shows that $M_t$ is a martingale and
The notation $\langle X\rangle_t$ might seem more appropriate, but it is standard for jump processes to change to the notation $[X]_t$. Note that the terms in the sum are zero unless there is a jump in the time interval $[(j-1)/n, j/n]$. Hence we see that
\[ [X]_t = \sum_{s\le t} [X_s - X_{s-}]^2. \]
Unlike the case of Brownian motion, for fixed $t$ the random variable $[X]_t$ is not constant. We can similarly find the quadratic variation of the martingale $M_t = X_t - mt$. By expanding the square, we see that it is the limit as $n \to \infty$ of three sums
\[ \sum_{j\le tn}\Big[X_{\frac jn} - X_{\frac{j-1}n}\Big]^2 \;-\; \frac{2m}{n}\sum_{j\le tn}\Big[X_{\frac jn} - X_{\frac{j-1}n}\Big] \;+\; \sum_{j\le tn}\frac{m^2}{n^2}. \]
Since there are only finitely many jumps, we can see that the second and third limits are zero and hence
\[ [M]_t = [X]_t = \sum_{s\le t}[X_s - X_{s-}]^2. \]
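A short simulation illustrates both facts at once: for a CPP with rate $\lambda$ and uniform$(0,1)$ jumps (an illustrative choice of jump distribution), $E[X_t] = t\,\lambda\,E[Y]$ and $E\big[[X]_t\big] = t\,\lambda\,E[Y^2]$:

```python
import numpy as np

# Compound Poisson process X_t = Y_1 + ... + Y_{N_t} with rate lam and
# uniform(0,1) jumps.  Check E[X_t] = t*m with m = lam*E[Y], and
# E[[X]_t] = t * lam * E[Y^2], where [X]_t is the sum of squared jumps.
rng = np.random.default_rng(5)
lam, t, n_sims = 2.0, 3.0, 20_000

m = lam * 0.5                 # lam * E[Y],   E[Y]   = 1/2 for uniform(0,1)
qv_rate = lam * (1.0 / 3.0)   # lam * E[Y^2], E[Y^2] = 1/3

X_t = np.empty(n_sims)
QV_t = np.empty(n_sims)
for i in range(n_sims):
    n_jumps = rng.poisson(lam * t)
    jumps = rng.uniform(0.0, 1.0, size=n_jumps)
    X_t[i] = jumps.sum()
    QV_t[i] = (jumps**2).sum()   # [X]_t = sum over jumps of (X_s - X_{s-})^2

mean_err = abs(X_t.mean() - t * m)        # E[X_t] = 3
qv_err = abs(QV_t.mean() - t * qv_rate)   # E[[X]_t] = 2
```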
This next proposition generalizes the last assertion of the previous proposition and sheds light on the meaning of the generator $L$. In some sense, this is an analogue of Itô's formula for CPP. Recall that if $X_t$ is a diffusion satisfying
\[ dX_t = m(X_t)\,dt + \sigma(X_t)\,dB_t, \]
then
\[ Lf(x) = m(x)\,f'(x) + \frac{\sigma^2(x)}{2}\,f''(x), \]
and Itô's formula gives
In other words, if
\[ M_t = f(X_t) - \int_0^t Lf(X_s)\,ds, \]
\[ E\big[f(X_t)^2 \mid X_0 = x\big] < \infty. \]
Then
\[ M_t = f(X_t) - \int_0^t Lf(X_s)\,ds \]
is a square integrable martingale with
\[ [M]_t = [f(X)]_t = \sum_{s\le t}\big[f(X_s) - f(X_{s-})\big]^2. \]
This argument is the same for all $t$, so let us assume $t = 1$. Then, as in the proof of Itô's formula, we write
\[ f(X_1) - f(X_0) - \int_0^1 Lf(X_s)\,ds = \sum_{j=1}^n \left[ f\big(X_{\frac jn}\big) - f\big(X_{\frac{j-1}n}\big) - \int_{(j-1)/n}^{j/n} Lf(X_s)\,ds \right]. \]
We write the expectation of the right-hand side as the sum of two terms
\[ \sum_{j=1}^n E\left[ f\big(X_{\frac jn}\big) - f\big(X_{\frac{j-1}n}\big) - \frac1n\,Lf\big(X_{\frac{j-1}n}\big) \right], \qquad (6.7) \]
\[ \sum_{j=1}^n E\left[ \frac1n\,Lf\big(X_{\frac{j-1}n}\big) - \int_{(j-1)/n}^{j/n} Lf(X_s)\,ds \right], \qquad (6.8) \]
and hence
\[ E\big[f(X_{t+\Delta t}) - f(X_t) - Lf(X_t)\,\Delta t\big] = o(\Delta t). \]
This shows that the sum in (6.7) has $n$ terms that are $o(1/n)$ and hence the limit is zero. The terms inside the expectation in (6.8) equal zero unless there is a jump between time $(j-1)/n$ and $j/n$. This occurs with probability $O(1/n)$ and in this case the value of the random variable is $O(1/n)$. Hence the expectations are $O(1/n^2)$ and the sum of $n$ of them has limit zero.
The computation of the quadratic variation is essentially the same as in
Proposition 6.3.1.
Therefore,
\[ E[f(X_1)] = \lim_{n\to\infty} E\big[f(X_1)\,1_{E_n^c}\big]. \]
However,
\[ E\big[\,[f(X_{t+s}) - f(X_t)]\,1_{E_n^c}\,\big] = s\,E[Lf(X_t)]\,[1 + O(s)]. \]
We therefore get
\[ E[f(X_1) - f(X_0)] = \lim_{n\to\infty}\frac1n\sum_{j=0}^{n-1} E\big[Lf(X_{j/n})\big], \]
and the bound $E[|f(X_1)|] < \infty$ can be used to justify the interchange of limit and expectation.
For a CPP the paths $t \mapsto X_t$ are piecewise constant and are discontinuous at the jumps. As for the usual Poisson process, we have defined the paths so that they are right-continuous and have left limits. We call a function càdlàg (also written cadlag), short for "continue à droite, limite à gauche," if it is right-continuous everywhere and has left limits. That is, for every $t$, the limits
\[ X_{t+} = \lim_{s\downarrow t} X_s, \qquad X_{t-} = \lim_{s\uparrow t} X_s \]
exist and $X_t = X_{t+}$. The paths of a CPP are càdlàg. We can write
\[ X_t = X_0 + \sum_{0\le s\le t} [X_s - X_{s-}]. \]
then
\[ P\{|K| \ge a\} \le \frac{\sigma^2}{a^2}. \]
Proof. Let
\[ K_n = \max\{|M_{j/n}| : j = 1, 2, \ldots, n\}. \]
\[ P\{K_n \ge a\} \le \frac{E[Z_n^2]}{a^2} \le \frac{E[M_1^2]}{a^2} = \frac{\sigma^2}{a^2}. \]
As noted before, there are only a finite number of nonzero terms in the sum, so the sum is well defined. This definition requires no assumptions on the process $A_s$. However, if we want the integral to satisfy some of the properties of the Itô integral, we will need to assume more.

Suppose that $E[X_1] = m$, $\mathrm{Var}[X_1] = \sigma^2$, and let $M_t$ be the square integrable martingale $M_t = X_t - mt$. Then if the paths of $A_t$ are Riemann integrable, the integral
\[ Z_t = \int_0^t A_s\,dM_s = \int_0^t A_s\,dX_s - m\int_0^t A_s\,ds \]
Example 6.4.1. Suppose $X_t$ is the CPP that takes jumps at rate 1 and when it jumps it chooses $\pm1$ each with the same probability. In other words, the Lévy measure of $X$ is the probability measure with $\mu(\{1\}) = \mu(\{-1\}) = 1/2$. Then $m = 0$, $\sigma^2 = 1$, and $A_t = X_t - X_{t-}$ is adapted to $\{\mathcal F_t\}$. It is

The problem in our setup is that we allow a betting strategy that sees a jump at time $s$ and immediately changes the bet to take advantage of this. In our framework, we will not allow these instantaneous changes by restricting to strategies that are left-continuous.
Then
\[ Z_t = \int_0^t A_s\,dM_s \]
\[ = E[A_s^2]\,E\big[(M_t - M_s)^2\big] = \sigma^2\,(t-s)\,E[A_s^2]. \qquad (6.12) \]
\[ A_t = \lim_{n\to\infty} A_t^n, \]
and hence
\[ \int_0^t A_s\,dM_s = \lim_{n\to\infty}\int_0^t A_s^n\,dM_s. \]
be the CPP with $X_0 = 0$ that takes a jump of size $h(y)$ whenever $X_t$ takes a jump of size $y$. This process has Lévy measure $\nu$ where
\[ \nu(h(V)) = \mu(V), \qquad (6.14) \]
with mean $m = r$. If $Y_t = X_t - rt$ is the compensated CPP, we can write (6.13) as
\[ dS_t = S_t\,dX_t = S_t\,dY_t + r\,S_t\,dt = S_t\,[dY_t + r\,dt]. \]
Let
\[ m = \int x\,d\mu(x) < \infty, \]
are
\[ Z_t = Z_0\,e^{X_t}, \qquad M_t = Z_0\,e^{X_t - mt}. \]
\[ M_t = Z_t\,\exp\Big\{-m\int_0^t A_s\,ds\Big\}. \]
\[ Q[V] = E[M_t\,1_V]. \]
\[ P\{X_{t+\Delta t} - X_t \in V\} = \mu(V)\,\Delta t + o(\Delta t), \]
\[ Q\{X_{t+\Delta t} - X_t \in V\} = \nu(V)\,\Delta t + o(\Delta t), \]
\[ \frac{Q\{X_{t+\Delta t} - X_t \in V\}}{P\{X_{t+\Delta t} - X_t \in V\}} = \frac{\nu(V)}{\mu(V)} + o(\Delta t). \]
and this implies that there are only a finite number of jumps in a bounded time interval. In this section we allow an infinite number of jumps ($\mu(\mathbb R) = \infty$), but require the expected sum of the absolute values of the jumps in an interval to be finite. This translates to the condition
\[ \int |x|\,d\mu(x) < \infty. \qquad (6.15) \]
Let us write
\[ V_t^\epsilon = \sum_{0\le s\le t}\big|X_s^\epsilon - X_{s-}^\epsilon\big|. \]
In other words, $V_t^\epsilon$ has the same jumps as $X_t^\epsilon$ except that they all go in the positive direction. As $\epsilon \downarrow 0$, $V_t^\epsilon$ increases (since we are adding jumps and they are all in the same direction) and using (6.15), we see that
\[ \lim_{\epsilon\downarrow0} E[V_t^\epsilon] = t\int |x|\,d\mu(x) < \infty. \]
\[ X_t = \lim_{\epsilon\downarrow0} X_t^\epsilon. \]
One may worry about the existence of the integrals above. Taylor's theorem implies that
\[ e^{ixs} - 1 = ixs - \frac{x^2s^2}{2} + O(|xs|^3). \]
Using this and (6.15) we can see that
\[ \int_{|x|\le1}\big|e^{ixs} - 1\big|\,d\mu(x) < \infty. \]
Also,
\[ \int_{|x|>1}\big|e^{ixs} - 1\big|\,d\mu(x) \le 2\,\mu\{x : |x| > 1\} < \infty. \]
Example 6.6.1 (Positive Stable Processes). Suppose that $0 < \beta < 1$ and $\mu$ is defined by
\[ d\mu(x) = c\,x^{-(1+\beta)}\,dx, \qquad 0 < x < \infty, \]
where $c > 0$. In other words, the probability of a jump of size between $x$ and $x + \Delta x$ in time $\Delta t$ is approximately $c\,x^{-(1+\beta)}\,\Delta x\,\Delta t$. Note that
\[ \int_\epsilon^\infty c\,x^{-(1+\beta)}\,dx = \frac{c}{\beta\,\epsilon^\beta}, \]
\[ \int_0^1 x\,d\mu(x) = c\int_0^1 x^{-\beta}\,dx = \frac{c}{1-\beta} < \infty, \qquad \int_0^\infty x\,d\mu(x) = c\int_0^\infty x^{-\beta}\,dx = \infty. \]
Therefore, $\mu$ satisfies (6.16) and (6.17), but not (6.15). The corresponding generalized Poisson process has
\[ \Psi(s) = c\int_0^\infty \big[e^{isx} - 1\big]\,x^{-(1+\beta)}\,dx. \]
With careful integration, this integral can be computed; we give the answer in the case $\beta = 1/2$, $c = \sqrt{2/\pi}$.

Another example is given by the Lévy measure
\[ d\mu(x) = \frac{e^{-x}}{x}\,dx, \qquad 0 < x < \infty. \]
Note that
\[ \int_0^\infty d\mu(x) = \infty, \qquad \int_0^\infty x\,d\mu(x) < \infty. \]
\[ f_t(x) = \frac{x^{t-1}\,e^{-x}}{\Gamma(t)}, \qquad 0 < x < \infty, \]
where
\[ \Gamma(t) = \int_0^\infty x^{t-1}\,e^{-x}\,dx, \]
\[ Z_1^2 + \cdots + Z_n^2, \]
For every $\epsilon > 0$, the number of jumps of absolute value greater than $\epsilon$ in a finite time interval is finite. More precisely, if $\mu_\epsilon$ denotes $\mu$ restricted to $\{|x| > \epsilon\}$, then

Note that (6.20) can hold even if (6.15) does not hold. Let
\[ m_\epsilon = \int x\,d\mu_\epsilon(x) = \int_{\epsilon<|x|\le1} x\,d\mu(x), \]
which is finite and well defined for all $\epsilon > 0$ by (6.18)--(6.20). For each $\epsilon > 0$, there is a CPP $X_t^\epsilon$ associated to the measure $\mu_\epsilon$ and a corresponding compensated Poisson process $M_t^\epsilon = X_t^\epsilon - m_\epsilon\,t$.
If
\[ \int |x|\,d\mu(x) < \infty, \]
\[ X_t = \lim_{\epsilon\downarrow0} X_t^\epsilon, \]
\[ Y_t = X_t - mt, \]
where
\[ m = \lim_{\epsilon\downarrow0} m_\epsilon = \int x\,d\mu(x). \]
If $\mu$ is symmetric about the origin, that is, if for every $0 < a < b$ the rate of jumps with increments in $[a,b]$ is the same as that of jumps in $[-b,-a]$, then $m_\epsilon = 0$ for every $\epsilon$ and
\[ Y_t = \lim_{\epsilon\downarrow0} X_t^\epsilon. \]
Since
\[ m_\epsilon = \int_{|x|>\epsilon} x\,d\mu(x), \]
we can write
\[ \Psi_\epsilon(s) = \int_{|x|>\epsilon}\big[e^{isx} - 1 - isx\big]\,d\mu(x). \]
Using Taylor series, we see that $|e^{isx} - 1 - isx| = O(s^2x^2)$, and hence (6.20) implies that for each $s$,
\[ \int\big|e^{isx} - 1 - isx\big|\,d\mu(x) < \infty. \]
The same argument shows that the moment generating function also exists and
\[ E\big[e^{sY_t}\big] = \exp\Big\{t\int_{-1}^1\big[e^{sx} - 1 - sx\big]\,d\mu(x)\Big\}, \qquad s \in \mathbb R. \]
In particular, $E[e^{sY_t}] < \infty$ for all $s, t$. (This is in contrast to CPPs for which this expectation could be infinite.)
\[ Y^\epsilon = Y_1^\epsilon = X_1^\epsilon - m_\epsilon. \]
\[ Y = \lim_{\epsilon\downarrow0} Y^\epsilon. \]
Let $Z_n = Y^{2^{-n}} - Y^{2^{-(n+1)}}$, so that
\[ Y = \sum_{n=0}^\infty Z_n. \]
\[ \lim_{\epsilon\downarrow0} E\big[|Y - Y^\epsilon|^2\big] = 0. \]
Here we use the shorthand that when $q$ is the index, the limit is taken over $D$ only. Define the set of jump times $T$ to be the set of times $s$ such that $X$ is discontinuous, with $T = \bigcup_{j=1}^\infty T_{1/j}$. With probability one each $T_{1/j}$ is finite and hence $T$ is countable.
Proof. Let
\[ S_\epsilon^2 = E\big[(Y - Y^\epsilon)^2\big] = \int_{|x|\le\epsilon} x^2\,d\mu(x). \]
Let
\[ M^\epsilon = \sup\big\{|Y_q - Y_q^\epsilon| : q \in D,\ q \le 1\big\}. \]
Using Lemma 6.3.3 (first for $Y_t^\delta - Y_t^\epsilon$ and then letting $\delta \downarrow 0$), we can see that
\[ P\{M^\epsilon \ge a\} \le \frac{S_\epsilon^2}{a^2}. \]
We can find $\epsilon_n \downarrow 0$ such that
\[ \sum_{n=1}^\infty n^2\,S_{\epsilon_n}^2 < \infty. \]
Similar estimates hold for the liminf and for limits from the left. We now define $Y_t$ for all $t$ by $Y_t = Y_{t+}$, and then the paths of $Y_t$ are càdlàg.
\[ \mu_C\{|x| \le 1\} = 0, \]
The process $C_t + Y_t$ is called a jump Lévy process, and if the compensating term $m$ in the definition of $Y_t$ is zero it is a pure jump process. We can summarize the theorem by saying that every Lévy process is the independent sum of a Brownian motion (with mean $m$ and variance $\sigma^2$) and a jump Lévy process (with Lévy measure $\mu = \mu_C + \mu_Y$).

We have not included the generalized Poisson process in our decomposition (6.23). If $X_t$ is such a process we can write it as
\[ X_t = C_t + \hat X_t, \]
where $C_t$ denotes the sum of jumps of absolute value greater than one. Then we can write $\hat X_t = Y_t + mt$ where $Y_t$ is a compensated generalized Poisson process and $m = E[\hat X_1]$, which by the assumptions is finite. The conditions on $\mu_C, \mu_Y$ can be summarized as follows:
\[ \mu(\{0\}) = 0, \qquad \int \big[x^2 \wedge 1\big]\,d\mu(x) < \infty. \qquad (6.24) \]
\[ X_t = X_{t,a} + C_{t,a}, \]
where $C_{t,a}$ denotes the movement by jumps of absolute value greater than $a$ and $X_{t,a}$ denotes a Lévy process with all jumps bounded by $a$. For each $a$ one can show that $C_{t,a}$ is a CPP independent of $X_{t,a}$. We let $a$ go to zero, and after careful analysis we see that
\[ X_t = Z_t + C_{t,1} + Y_t, \]
We will prove the following fact, which is a key step in the proof of Theorem 6.8.1.

Theorem 6.8.3. Suppose $X_t$ is a Lévy process with continuous paths. Then $X_t$ is a Brownian motion.

Proof. All we need to show is that $X_1$ has a normal distribution. Let
\[ X_{j,n} = X_{j/n} - X_{(j-1)/n}, \qquad M_n = \max\{|X_{j,n}| : j = 1, \ldots, n\}. \]
Continuity of the paths implies that $M_n \to 0$ and hence for every $a > 0$, $P\{M_n < a\} \to 1$. Independence of the increments implies that
\[ P\{M_n < a\} = \big[1 - P\{|X_{1/n}| \ge a\}\big]^n \le \exp\big\{-n\,P\{|X_{1/n}| \ge a\}\big\}. \]
We claim that all the moments of $X_1$ are finite. To see this, let $J = \max_{0\le t\le1}|X_t|$ and find $k$ such that $P\{J \ge k\} \le 1/2$. Then using continuity of the paths, by stopping at the first time $t$ that $|X_t| = nk$, we can see that
and hence
\[ P\{J \ge nk\} \le (1/2)^n, \]
from which finiteness of the moments follows. Let $m = E[X_1]$, $\sigma^2 = \mathrm{Var}[X_1]$, and note that $E[X_1^2] = m^2 + \sigma^2$. Our goal is to show that $X_1 \sim N(m, \sigma^2)$.

Let
\[ \tilde X_{j,n} = X_{j,n}\,1\{|X_{j,n}| \le a_n\}, \qquad Z_n = \sum_{j=1}^n \tilde X_{j,n}. \]
Also, since $|\tilde X_{1,n}| \le a_n$,
\[ E\big[|\tilde X_{1,n}|^3\big] \le a_n\,E\big[\tilde X_{1,n}^2\big] = a_n\Big[\frac{m^2}{n^2} + \frac{\sigma^2}{n}\Big]\,[1 + o(1)]. \]
(This estimate uses (6.26), which in turn uses the fact that the paths are continuous.) Using these estimates, we see that for fixed $s$,
\[ \phi_n(s) = 1 + \frac{ims}{n} - \frac{\sigma^2 s^2}{2n} + o\Big(\frac1n\Big), \]
\[ Z_t = \lim_{t_n\to t} Z_{t_n} \]
\[ Y = \frac{X_1 + \cdots + X_n}{n^{1/\beta}}. \]
Using $e^{isx} = \cos(sx) + i\sin(sx)$ and the fact that $\mu$ is symmetric about the origin, we see that
\[ \Psi(s) = 2C\int_0^\infty \frac{\cos(sx) - 1}{x^{1+\beta}}\,dx = 2C\,|s|^\beta\int_0^\infty \frac{\cos y - 1}{y^{1+\beta}}\,dy = -C_\beta\,|s|^\beta. \]
This comes from the fact that the easiest way for $|X_1|$ to be unusually large is for there to be a single very large jump, and the probability of a jump of size at least $K$ by time one is asymptotic to $\mu\{|x| \ge K\}$. If $\beta = 1$, the corresponding Lévy measure is
\[ d\mu(x) = \frac{dx}{\pi x^2}, \]
and $X_1$ has the Cauchy density
\[ g_1(x) = \frac{1}{\pi\,(x^2+1)}. \]
If $\beta > 2$, and
\[ Z_n = \frac{X_1 + \cdots + X_n}{\sqrt n}, \]
then $Z_n$ converges in distribution to a centered normal random variable.

If $\beta = 2$, and
\[ Z_n = \frac{X_1 + \cdots + X_n}{\sqrt{n\log n}}, \]
then $Z_n$ converges in distribution to a centered normal random variable.

If $\beta > 2$, then $\mathrm{Var}[X_j] < \infty$, and hence the result is a restatement of the central limit theorem.
We will prove the proposition for $0 < \beta < 2$. Our first observation is that if $X_1, X_2, \ldots$ are independent, identically distributed random variables whose characteristic exponent satisfies

We claim that
\[ \lim_{s\downarrow0}\int_0^\infty [\cos y - 1]\,s^{-(1+\beta)}\,f(y/s)\,dy = cI, \]
where
\[ I = \int_0^\infty \frac{\cos y - 1}{y^{1+\beta}}\,dy, \]
from which (6.30) follows with $r = 2cI$. To see this, let $\delta = (2-\beta)/6 > 0$ and note that
\[ \int_0^{s^{1-\delta}} \big|\cos y - 1\big|\,s^{-(1+\beta)}\,f(y/s)\,dy \le C\,s^{-(1+\beta)}\int_0^{s^{1-\delta}} y^2\,dy \le C\,s^{(2-\beta)/2} \longrightarrow 0, \]
and hence
\[ \lim_{s\downarrow0}\int_{s^{1-\delta}}^\infty [\cos y - 1]\,s^{-(1+\beta)}\,f(y/s)\,dy = c\,\lim_{s\downarrow0}\int_{s^{1-\delta}}^\infty [\cos y - 1]\,y^{-(1+\beta)}\,dy = cI. \]
Proposition 6.10.3 only uses stable processes for $\beta \le 2$. The next proposition shows that there are no nontrivial stable processes for $\beta > 2$.
\[ \phi(s) = e^{ims}\Big[1 - \frac{\sigma^2 s^2}{2} + O(|s|^3)\Big], \qquad s \to 0, \]
\[ |\phi(s)| \le 1 - \frac{\sigma^2 s^2}{4}. \]
6.11 Exercises

Exercise 6.1. Suppose $F(t) = P\{T \le t\}$ is the distribution function of a continuous random variable $T$. Define the inverse function $G$ by
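The exercise points at inverse-transform sampling: if $U$ is uniform on $(0,1)$ and $G = F^{-1}$, then $G(U)$ has distribution function $F$. For the exponential with rate $\lambda$, $G(u) = -\log(1-u)/\lambda$. An illustrative sketch (not the full exercise solution):

```python
import numpy as np

# Inverse-transform sampling: if U is uniform(0,1) and G = F^{-1}, then
# G(U) has distribution function F.  For the exponential with rate lam,
# F(t) = 1 - e^{-lam t}, so G(u) = -log(1 - u)/lam.
rng = np.random.default_rng(6)
lam, n = 2.0, 50_000

U = rng.uniform(0.0, 1.0, size=n)
T = -np.log(1.0 - U) / lam        # Exp(lam) samples

sample_mean = T.mean()            # E[T] = 1/lam = 0.5
tail = np.mean(T > 1.0)           # P{T > 1} = e^{-2} ~ 0.1353
```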
Xt = Yt a(t)
E[Xt | Fs ] = E[Xt | Ys ] = Xs .
Find E[Xt ].
Find E[Xt2 ]
If so find it.
Zt = exp{Xt }.
1. Find E[Zt ].
Mt = Zt St
Mt = Z t A t
is a martingale?
Exercise 6.5. Suppose $Y_t$ is a Cauchy process, that is, a Lévy process such that $Y_1$ has a Cauchy distribution. Show why the following statement holds: for every $r > 0$ and $t > 0$,
\[ P\Big\{\max_{0\le s\le t} Y_s > r\Big\} \;(*)\; 2\,P\{Y_t > r\}. \]
Here $(*)$ is one of the following: $>$, $=$, $<$. Your task is to figure out which of $>$, $=$, $<$ should go into the statement and explain why the relation holds. (Hint: go back to the derivation of the reflection principle for Brownian motion. The only things about the Cauchy process that you should need to use are that it is symmetric about the origin and has jumps. Indeed, the same answer should be true for any symmetric Lévy process with jumps.)
Exercise 6.6. Suppose $X_t$ is a Poisson process with rate 1 with filtration $\{\mathcal F_t\}$, let $r > 0$, and
\[ S_t = e^{X_t - rt}. \]
1. Find a strictly positive martingale $M_t$ with $M_0 = 1$ such that $S_t$ is a martingale with respect to the tilted measure $Q$ given by
\[ Q(V) = E[1_V\,M_t], \qquad V \in \mathcal F_t. \]
3. What is the probability that there are no jumps of size greater than $1/2$ by time $t = 2$?
4. Let
\[ f(t,x) = E\big[X_t^4 \mid X_0 = x\big]. \]
\[ \frac{d}{dt}f(t,x)\Big|_{t=0} = g(x). \]
7.1 Definition

The assumptions of independent, stationary increments along with continuous paths give Brownian motion. In the last chapter we dropped the continuity assumption and obtained Lévy processes. In this chapter we will retain the assumptions of stationary increments and continuous paths, but will allow the increments to be dependent. The process $X_t$ we construct is called fractional Brownian motion and depends on a parameter $H \in (0,1)$ called the Hurst index. It measures the correlation of the increments.
When H > 1/2, the increments are positively correlated; that is, if the process has been increasing, then it is more likely to continue increasing.

When H < 1/2, the increments are negatively correlated; that is, if the process has been increasing, then it is more likely to decrease in the future.
\[
Y_t := a^{-H}\, X_{at}
\]
For every t_1, \ldots, t_k, the random vector (X_{t_1}, \ldots, X_{t_k}) has a joint normal distribution with mean zero.

Definition. If H \in (0, 1), the fractional Brownian motion with Hurst parameter H (fBm_H) is the centered (mean zero) Gaussian process X_t with continuous paths such that for all s, t,
\[
E\big[ (X_t - X_s)^2 \big] = |t - s|^{2H}.
\]
Since
\[
E\big[ (X_t - X_s)^2 \big] = E\big[X_t^2\big] + E\big[X_s^2\big] - 2\, E[X_s X_t],
\]
it follows that
\[
\mathrm{Cov}(X_s, X_t) = E[X_s X_t] = \frac{1}{2}\left[ s^{2H} + t^{2H} - |s - t|^{2H} \right]. \tag{7.1}
\]
If H = 1/2, then fractional Brownian motion is the same as usual Brownian
motion.
As in the case of Brownian motion, we must show that such a process exists. We will discuss this in the next section, but for now we assume that it does exist. If s < t, note that
\[
E\big[ X_s\, (X_t - X_s) \big] = \frac{1}{2}\left[ t^{2H} - s^{2H} - (t-s)^{2H} \right]
\;\begin{cases} > 0, & H > 1/2, \\ = 0, & H = 1/2, \\ < 0, & H < 1/2. \end{cases}
\]
\[
|X_{t + \Delta t} - X_t| \approx |\Delta t|^{H}.
\]
In other words, the Hölder exponent of fBm_H is given by the Hurst index H. If H > 1/2, the paths are smoother than Brownian paths, and if H < 1/2, the paths are rougher.
The right-hand side is the sum of two independent normal random variables: the first is \mathcal{F}_s-measurable and the second is independent of \mathcal{F}_s. Hence Y_t - Y_s has a normal distribution. More generally, one can check that Y_t is a Gaussian process whose covariance is given for s < t by
\[
E[Y_s Y_t] = \int_{-\infty}^{s} f(r, s)\, f(r, t)\, dr.
\]
Proposition 7.2.1. If
\[
\phi(s) = c\, s^{H - \frac12},
\]
and Y_t is defined as in (7.2), then X_t = Y_t - Y_0 is fBm_H. Here
\[
c = c_H = \left( \frac{1}{2H} + \int_0^\infty \left[ (1+r)^{H - \frac12} - r^{H - \frac12} \right]^2 dr \right)^{-1/2}. \tag{7.3}
\]
\[
\mathrm{Var}\left( c \int_0^t (t-r)^{H - \frac12}\, dB_r \right)
= c^2 \int_0^t (t-r)^{2H-1}\, dr = \frac{c^2\, t^{2H}}{2H}.
\]
If we choose c as in (7.3), we get E[X_t^2] = t^{2H}.
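The variance identity above combines the Itô isometry with the deterministic integral \int_0^t (t-r)^{2H-1}\, dr = t^{2H}/(2H); the latter can be checked numerically. A minimal sketch (an illustration only; H = 0.7 is an arbitrary choice, and for H < 1/2 the integrand is unbounded at r = t though still integrable):

```python
import numpy as np

H, t = 0.7, 1.0                       # arbitrary illustrative choices
n = 100000
r = (np.arange(n) + 0.5) * (t / n)    # midpoint rule on [0, t]

# By the Ito isometry, Var( int_0^t (t-r)^{H-1/2} dB_r ) equals the
# deterministic integral int_0^t (t-r)^{2H-1} dr = t^{2H} / (2H).
approx = np.sum((t - r) ** (2 * H - 1)) * (t / n)
exact = t ** (2 * H) / (2 * H)
print(approx, exact)
```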
7.3 Simulation

Because fractional Brownian motion has long-range dependence, it is not obvious how to do simulations. The stochastic integral representation (7.2) is difficult to use because it uses the value of the Brownian motion for all negative times. However, there is a way to do simulations that uses only the fact that fBm_H is a Gaussian process with continuous paths. Let us choose a step size \Delta t = 1/N; continuity tells us that it should suffice to sample
\[
Y_1, Y_2, \ldots,
\]
where Y_j = X_{j/N}. For each n, the random vector (Y_1, \ldots, Y_n) has a centered Gaussian distribution with covariance matrix \Gamma = [\Gamma_{jk}]. Given \Gamma, we claim that we can find numbers a_{jk} with a_{jk} = 0 if k > j, and independent standard normal random variables Z_1, Z_2, \ldots such that
\[
Y_n = a_{n1} Z_1 + \cdots + a_{nn} Z_n. \tag{7.4}
\]
In matrix notation, A = [a_{jk}] is a lower triangular matrix such that \Gamma = AA^T. This decomposition \Gamma = AA^T is sometimes called the Cholesky decomposition. We will now show that it exists by giving an algorithm for finding the matrix. We start by setting
\[
a_{11} = \sqrt{\Gamma_{11}}.
\]
Suppose we have found the first k - 1 rows of A; this is a lower triangular (k-1) \times (k-1) matrix. Suppose j < k. Then
\[
\Gamma_{jk} = E[Y_j Y_k] = \sum_{i=1}^{j} a_{ji}\, a_{ki}.
\]
\[
P_n X = E[X \mid Y_1, \ldots, Y_n].
\]
Then we define
\[
Z_n = \frac{Y_n - P_{n-1} Y_n}{\left\| Y_n - P_{n-1} Y_n \right\|_2}.
\]
Since H_n is also the subspace spanned by Z_1, \ldots, Z_n, we can write
\[
P_{n-1} Y_n = \sum_{j=1}^{n-1} a_{nj}\, Z_j,
\]
Chapter 8

Harmonic functions
\[
f(y) = f(0) + \sum_{j=1}^{d} b_j\, y_j + \frac{1}{2} \sum_{j,k=1}^{d} a_{jk}\, y_j y_k + o(|y|^2), \qquad y \to 0,
\]
\[
MV(\epsilon) = f(0) + \sum_{j=1}^{d} b_j\, MV(0; y_j, \epsilon) + \frac{1}{2} \sum_{j,k=1}^{d} a_{jk}\, MV(0; y_j y_k, \epsilon) + o(\epsilon^2).
\]
\[
\sum_{j=1}^{d} MV(0; y_j^2, \epsilon) = MV\Big(0; \sum_{j=1}^{d} y_j^2, \epsilon\Big) = MV(0; \epsilon^2, \epsilon) = \epsilon^2.
\]
Symmetry implies that MV(0; y_j^2, \epsilon) = MV(0; y_k^2, \epsilon), and hence MV(0; y_j^2, \epsilon) = \epsilon^2/d. We therefore have
\[
MV(\epsilon) = f(0) + \frac{1}{2} \sum_{j=1}^{d} a_{jj}\, \frac{\epsilon^2}{d} + o(\epsilon^2) = f(0) + \frac{\epsilon^2}{2d}\, \nabla^2 f(0) + o(\epsilon^2).
\]
\[
f(x) = MV(f; x, \epsilon).
\]
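The mean value property is easy to test numerically. The following minimal sketch (an illustration, not from the text) averages the harmonic function f(x, y) = x^2 - y^2 over a circle and compares with the value at the center; the center and radius are arbitrary choices:

```python
import numpy as np

# f(x, y) = x^2 - y^2 is harmonic in the plane.
f = lambda x, y: x ** 2 - y ** 2

def sphere_average(g, center, eps, n=2000):
    """Average of g over the circle of radius eps about center."""
    theta = 2 * np.pi * np.arange(n) / n
    return np.mean(g(center[0] + eps * np.cos(theta),
                     center[1] + eps * np.sin(theta)))

center = (0.3, -0.7)
print(sphere_average(f, center, 0.5), f(*center))  # both equal -0.4
```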
Proposition 8.1.2.
Proof. The first statement follows from Itô's formula; in fact, we already did this calculation in Theorem 3.7.2. Since bounded local martingales are martingales, the second statement follows from the optional sampling theorem.
\[
\mathrm{hm}_D(V; x) = P^x\{ B_\tau \in V \}, \qquad V \subset \partial D.
\]
The next proposition shows that we could use the mean value property to define harmonic functions. In fact, this is the more natural definition.

Proposition 8.1.3. Suppose f is a continuous function on an open set D \subset \mathbb{R}^d. Then f is harmonic in D if and only if f satisfies the mean value property on D.

Proof. If f is harmonic, then f restricted to a closed ball of radius \epsilon contained in D is bounded. Therefore, the mean value property is a particular case of (8.2).

If f is C^2 and satisfies the mean value property, then \nabla^2 f(x) = 0 by Proposition 8.1.1. Hence we need only show that f is C^2. We will, in fact, show that f is C^\infty.
The proof showed that we did not actually need to assume that f is continuous: it suffices for f to be measurable and locally bounded, so that derivatives can be taken on the right-hand side of (8.3).
We will solve the classic Dirichlet problem for harmonic functions. Suppose D is a bounded open set and F : \partial D \to \mathbb{R} is a continuous function. The goal is to find a continuous function f : \bar{D} \to \mathbb{R} that is harmonic in D with f(x) = F(x) for x \in \partial D. Suppose that such a function f existed. Let \tau = \tau_D = \inf\{t \ge 0 : B_t \notin D\}. Since \bar{D} is compact, f must be bounded, and hence
\[
M_t = f(B_{t \wedge \tau})
\]
is a continuous, bounded martingale. Arguing as in (8.2), we see that
\[
f(x) = E^x[f(B_\tau)] = E^x[F(B_\tau)]. \tag{8.4}
\]
The right-hand side gives the only candidate for the solution. The strong Markov property can be used to see that this candidate satisfies the mean value property, and the last proposition gives that f is harmonic in D.
It is a little more subtle to check whether f is continuous on \bar{D}. This requires further assumptions on D, which can be described most easily in terms of Brownian motion. Suppose z \in \partial D and x \in D with x near z. Can we say that f(x) is near F(z)? Since F is continuous, this will be true if B_\tau is near z. To make this precise, one defines a point z \in \partial D to be regular if for every \epsilon > 0 there exists \delta > 0 such that if x \in D with |x - z| < \delta, then
\[
P^x\{ |B_\tau - z| \ge \epsilon \} \le \epsilon.
\]
We claim that
\[
f(x; r, R) = \frac{x - r}{R - r}, \qquad d = 1,
\]
\[
f(x; r, R) = \frac{\log |x| - \log r}{\log R - \log r}, \qquad d = 2,
\]
\[
f(x; r, R) = \frac{r^{2-d} - |x|^{2-d}}{r^{2-d} - R^{2-d}}, \qquad d \ge 3.
\]
One can check this by noting that \nabla^2 f(x) = 0 and f has the correct boundary condition. The theorem implies that there is only one such function. Note that
\[
\lim_{R \to \infty} f(x; r, R) = \begin{cases} 0, & d \le 2, \\ 1 - (r/|x|)^{d-2}, & d \ge 3, \end{cases}
\]
\[
\lim_{r \downarrow 0} f(x; r, R) = \begin{cases} x/R, & d = 1, \\ 1, & d \ge 2. \end{cases} \tag{8.5}
\]
We have already seen the d = 1 case as the gambler's ruin estimate for Brownian motion.
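The d = 2 formula can be checked by Monte Carlo: f(x; r, R) should be the probability that Brownian motion started at x exits the annulus \{r < |x| < R\} through the outer circle. The sketch below is an illustration only; the step size and path count are arbitrary, and discretization introduces a small bias from undetected boundary crossings:

```python
import numpy as np

rng = np.random.default_rng(3)
r, R = 1.0, 4.0
n_paths, dt = 4000, 1e-3

pos = np.tile(np.array([2.0, 0.0]), (n_paths, 1))   # start on |x| = 2
alive = np.ones(n_paths, dtype=bool)
hit_outer = np.zeros(n_paths, dtype=bool)

for _ in range(200000):
    if not alive.any():
        break
    pos[alive] += rng.standard_normal((alive.sum(), 2)) * np.sqrt(dt)
    rad = np.linalg.norm(pos, axis=1)
    hit_outer |= alive & (rad >= R)         # exited through the outer circle
    alive &= (rad > r) & (rad < R)          # still strictly inside the annulus

est = hit_outer.mean()
theory = (np.log(2.0) - np.log(r)) / (np.log(R) - np.log(r))  # = 0.5
print(est, theory)
```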
\[
f(x) = E^x[F(B_\tau)] = 1, \qquad x \in D.
\]
If D = U_r = \{x : |x| < r\} is the ball of radius r about the origin, then the harmonic measure \mathrm{hm}_{U_r}(\cdot\,; x) is known explicitly. It is absolutely continuous with respect to (d-1)-dimensional surface measure s on \partial U_r. Its density is called the Poisson kernel,
\[
H_r(x, y) = \frac{r^2 - |x|^2}{\omega_{d-1}\, r\, |x - y|^{d}},
\]
where
\[
\omega_{d-1} = \int_{|y|=1} ds(y).
\]
If x \in U_r,
\[
\int_{|y|=r} H_r(x, y)\, ds(y) = 1;
\]
From these one concludes that f as defined by the right-hand side of (8.6) is harmonic in U_r and continuous on \bar{U}_r.
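The normalization \int_{|y|=r} H_r(x, y)\, ds(y) = 1 can be verified numerically; in d = 2 one has \omega_1 = 2\pi and ds = r\, d\theta on the circle of radius r. A minimal sketch (illustration only, with an arbitrary interior point x):

```python
import numpy as np

r = 2.0
x = np.array([0.5, -0.3])            # an arbitrary point with |x| < r
n = 10000
theta = 2 * np.pi * np.arange(n) / n
y = r * np.stack([np.cos(theta), np.sin(theta)], axis=-1)

# d = 2 Poisson kernel, with omega_1 = 2*pi (length of the unit circle).
H = (r ** 2 - np.dot(x, x)) / (2 * np.pi * r * np.sum((x - y) ** 2, axis=-1))

# Surface integral over |y| = r: ds = r * dtheta.
total = np.sum(H) * r * (2 * np.pi / n)
print(total)  # should equal 1
```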
The reader may note that we did not need the probabilistic interpretation of the solution in order to verify that (8.6) solves the Dirichlet problem. Indeed, the solution using the Poisson kernel was discovered before the relationship with Brownian motion. An important corollary of this explicit solution is the following theorem; we leave the verification as Exercise 8.1. The key part of the theorem is the fact that the same constant C works for all harmonic functions.
Theorem 8.1.5.

1. For every positive integer n, there exists C = C(d, n) < \infty such that if f is a harmonic function on an open set D \subset \mathbb{R}^d, x \in D, \{y : |x - y| < r\} \subset D, and j_1, \ldots, j_n \in \{1, \ldots, d\}, then
\[
\left| \partial_{x_{j_1}} \cdots \partial_{x_{j_n}} f(x) \right| \le C\, r^{-n} \sup_{|y - x| < r} |f(y)|.
\]

2. (Harnack inequality) For every 0 < u < 1, there exists C = C(d, u) < \infty such that if f is a positive harmonic function on an open set D \subset \mathbb{R}^d, x \in D, \{y : |x - y| < r\} \subset D, and |x - z| \le ur, then
\[
f(z) \le C\, f(x).
\]
8.2 h-processes

Suppose h is a positive harmonic function on an open set D \subset \mathbb{R}^d, and let B_t be a standard Brownian motion starting at x \in D. Let \tau = \tau_D be the first time t with B_t \notin D. Then M_t = h(B_t) is a positive local martingale for t < \tau satisfying
\[
dM_t = \frac{\nabla h(B_t)}{h(B_t)}\, M_t\, dB_t, \qquad t < \tau.
\]
Let \tau_n be the minimum of \tau and the first time t with h(B_t) \ge n. Then M_{t \wedge \tau_n} is a continuous bounded martingale.

We can use the Girsanov theorem to consider the measure obtained by weighting by the local martingale M_t. To be more precise, if V is an event that depends only on B_t, 0 \le t \le \tau_n, then
\[
P^*(V) = h(x)^{-1}\, E\left[ M_{\tau_n}\, 1_V \right].
\]
One can use the Girsanov theorem (more precisely, a simple generalization of the theorem to d-dimensional Brownian motion) to see that
\[
dB_t = \frac{\nabla h(B_t)}{h(B_t)}\, dt + dW_t,
\]
where W_t is a standard Brownian motion with respect to P^*. The process B_t in the new measure P^* is often called the (Doob) h-process associated to the positive harmonic function h. It is defined for t < \tau.
As an example, suppose that D is the unit ball, y = (1, 0, \ldots, 0) \in \partial D, and
\[
h(x) = \omega_{d-1}\, H_1(x, y) = \frac{1 - |x|^2}{|x - y|^{d}}
\]
is the Poisson kernel. Then the h-process can be viewed as Brownian motion conditioned so that B_\tau = y, where \tau = \tau_D. This is not precise because the conditioning is with respect to an event of probability zero. We claim that the P^*-probability that B_\tau = y equals one. To see this, assume that B_0 = 0 and let
We claim that P^*\{T_n < \tau\} = 1. Indeed, if we let \rho_r = \inf\{t : |B_t| = r\}, then we can check directly that
\[
\lim_{r \to 1} P^*\{ h(B_{\rho_r}) \ge n^3 \} = \lim_{r \to 1} \int_{|x|=1,\; h(rx) \ge n^3} h(rx)\, d\sigma(x) = 1,
\]
and hence
\[
\lim_{r \to 1} P^*\{ T_n < \rho_r \} = 1.
\]
\[
P^*\{ T_n' < \tau \} \le n^{-2},
\]
\[
\sum_{n=1}^{\infty} P^*\{ T_n' < \tau \} < \infty.
\]
for the Brownian motion B_t. We will make the further assumption that a is a C^1 function, that is, we can write
\[
a(t) = \int_0^t \dot{a}(s)\, ds,
\]
\[
dX_t = R_t\, dt + A_t\, dB_t,
\]
if r = 1/2. One can use this to give another proof of Proposition 4.2.1. Note that if r < 1/2, then in the new time it takes an infinite amount of time for L = \log X to reach -\infty. However, in the original time, this happens at time T < \infty. In other words,
\[
a(\infty) = \int_0^\infty \dot{a}(s)\, ds = T < \infty.
\]
\[
B_t = B_t^1 + i\, B_t^2
\]
is called a (standard) complex Brownian motion. Note that this is the same as a two-dimensional Brownian motion except that the point in the plane is viewed as a complex number.

Suppose f : \mathbb{C} \to \mathbb{C} is a nonconstant holomorphic (complex differentiable) function. We will consider X_t = f(B_t). Near any point z with f'(z) \ne 0, the function looks like f(w) \approx f(z) + f'(z)\,(w - z). Multiplication by f'(z) is the same as a dilation by |f'(z)| and a rotation by \arg f'(z). A rotation of a standard two-dimensional Brownian motion gives a standard two-dimensional Brownian motion, and a dilation gives a Brownian motion with a different variance. This leads us to believe that X_t is a time-change of standard Brownian motion.
Let us make this more precise. Let
\[
a(t) = \int_0^t |f'(B_s)|^2\, ds.
\]
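The time change can be seen numerically: the realized quadratic variation of \mathrm{Re}\, f(B_t) on a fine grid should match a(t). A minimal sketch (an illustration only), using the hypothetical choice f(z) = z^2:

```python
import numpy as np

rng = np.random.default_rng(4)
n, t = 200000, 1.0
dt = t / n

# Complex Brownian motion on a fine grid, started at z = 1.
dB = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) * np.sqrt(dt)
B = np.concatenate([[1.0 + 0.0j], 1.0 + np.cumsum(dB)])

X = B ** 2                                  # f(z) = z^2, so f'(z) = 2z

# Quadratic variation of Re X versus a(t) = int_0^t |f'(B_s)|^2 ds.
qv = np.sum(np.diff(X.real) ** 2)
a_t = np.sum(np.abs(2.0 * B[:-1]) ** 2) * dt
print(qv, a_t)
```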
8.5 Exercises
Exercise 8.1. Use (8.6) to prove Theorem 8.1.5.
\[
\tilde{\varphi}(w) = \varphi(f^{-1}(w)).
\]