Martingales and Stopping Times
In this chapter, we will meet two of the most important concepts of modern probability theory and its applications to option pricing.
3.1 Filtrations
Notation: $\mathbb{Z}^+ = \mathbb{N} \cup \{0\} = \{0, 1, 2, 3, \ldots\}$ is time starting at zero; however, we can (and sometimes will) also use $\mathbb{N}$ for indexing processes.

We need to be able to model the flow of information in time. The standard way of doing this is to use a filtration of sub-$\sigma$-algebras.

Definition. A filtration is a sequence $(\mathcal{F}_n, n \in \mathbb{Z}^+)$ of sub-$\sigma$-algebras of $\mathcal{F}$ such that each $\mathcal{F}_n \subseteq \mathcal{F}_{n+1}$.

The standard example of a filtration is obtained when we observe a random process $(X_n, n \in \mathbb{Z}^+)$, e.g. $X_n$ might be the price of a stock at time $n$. We then take $\mathcal{F}_n = \sigma\{X_0, X_1, \ldots, X_n\}$ to be the smallest sub-$\sigma$-algebra of $\mathcal{F}$ for which $X_0, X_1, \ldots, X_n$ are all measurable. This is called the natural filtration of the process $(X_n, n \in \mathbb{Z}^+)$ and denoted by $(\mathcal{F}_n^X, n \in \mathbb{Z}^+)$.

Definition. Let $(\mathcal{F}_n, n \in \mathbb{Z}^+)$ be a filtration. A stochastic process $Y = (Y_n, n \in \mathbb{Z}^+)$ is said to be adapted to this filtration if each $Y_n$ is $\mathcal{F}_n$-measurable.

If the process $Y$ is adapted, then all the information we need to predict the random variable $Y_n$ can be found in $\mathcal{F}_n$, i.e. all the information about the observation at time $n$ is (in principle) already known at this time.
Any stochastic process is automatically adapted to its own natural filtration. If $X$ and $Y$ are two stochastic processes and $Y$ is adapted to the natural filtration of $X$, then each $Y_n = f_n(X_0, X_1, \ldots, X_n)$, where $f_n$ is a measurable function from $\mathbb{R}^{n+1}$ to $\mathbb{R}$.
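To make this concrete, here is a minimal Python sketch (my own illustration, not from the notes): the running maximum of a simulated price path is adapted, because each value is a function of the observations up to the current time only.

```python
import random

random.seed(1)

# A simulated "price" path X_0, X_1, ..., X_N.
X = [100.0]
for _ in range(20):
    X.append(X[-1] + random.choice([-1.0, 1.0]))

# An adapted process Y: each Y_n is a measurable function
# f_n(X_0, ..., X_n) of the path observed so far -- here the running maximum.
Y = [max(X[: n + 1]) for n in range(len(X))]

# A process whose value at time n used X_{n+1} would NOT be adapted to the
# natural filtration of X: it would peek at information unavailable at time n.
```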
3.2 Martingales

3.2.1 Basic Ideas
We fix a filtration $(\mathcal{F}_n, n \in \mathbb{Z}^+)$ of $\mathcal{F}$. A stochastic process $X = (X_n, n \in \mathbb{Z}^+)$ is said to be a (discrete-parameter) martingale if

- it is adapted,
- it is integrable (i.e. each $E(|X_n|) < \infty$),
- $E(X_n|\mathcal{F}_{n-1}) = X_{n-1}$ for all $n \in \mathbb{N}$.

The third of these is the most important; it is sometimes called the martingale property.

A martingale can be thought of as a fair game. Consider a game where rounds take place at times $1, 2, \ldots$. The sub-$\sigma$-algebra $\mathcal{F}_n$ contains all the information about the history of the game up to and including time $n$. You stake £1 in each round. Your fortune (which may be negative) at time $n$ is $X_n$, so $X_n - X_{n-1}$ is your winnings per unit stake at time $n$. By (CE1), the martingale property is equivalent to $E(X_n - X_{n-1}|\mathcal{F}_{n-1}) = 0$, i.e. average winnings in game $n$ are zero given knowledge of the history of the game up to and including time $n - 1$.

Another way of looking at a martingale is that it is a process where the best estimate (i.e. the least squares predictor) of $X_n$ at time $n - 1$ is in fact $X_{n-1}$.

The martingale property is equivalent to
$$E(X_n|\mathcal{F}_m) = X_m \qquad (3.2.1)$$
for all $0 \le m \le n$. You can establish this in Problem 15. Using (3.2.1) and (CE5) we see that if $X = (X_n, n \in \mathbb{Z}^+)$ is a martingale then for all $m, n$ with $m < n$, $E(X_n) = E(E(X_n|\mathcal{F}_m)) = E(X_m)$, i.e. we have the important result that

The mean of a martingale is constant in time.

Martingales are important because (as we will see)

- they are ubiquitous in probability, statistics, finance and also analysis,
- they have a beautiful theoretical development.

Before proceeding further with martingales themselves, let's take a look at some close relatives. We have seen that martingales correspond to fair games. If a game is biased in your favour, we obtain a submartingale, i.e. for all $n \in \mathbb{N}$, $E(X_n - X_{n-1}|\mathcal{F}_{n-1}) \ge 0$, and if it is biased against you, we obtain a supermartingale: $E(X_n - X_{n-1}|\mathcal{F}_{n-1}) \le 0$. Formally, an adapted integrable process $(X_n, n \in \mathbb{Z}^+)$ is a submartingale if $E(X_n|\mathcal{F}_{n-1}) \ge X_{n-1}$ for all $n \in \mathbb{N}$ and a supermartingale if $E(X_n|\mathcal{F}_{n-1}) \le X_{n-1}$ for all $n \in \mathbb{N}$. You can check that the mean of a submartingale increases in time, while that of a supermartingale decreases.

The following are obvious:

- $(X_n, n \in \mathbb{Z}^+)$ is a martingale if and only if it is both a sub- and a supermartingale.
- $(X_n, n \in \mathbb{Z}^+)$ is a submartingale if and only if $(-X_n, n \in \mathbb{Z}^+)$ is a supermartingale.
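As a quick numerical illustration (a sketch of my own, not part of the notes), the constant-mean property can be checked by Monte Carlo for the unit-stake fortune process, together with the sub/supermartingale behaviour of biased games:

```python
import random

random.seed(0)

def mean_fortune(p_win, n_rounds, n_paths=100_000):
    """Average fortune after n_rounds unit-stake rounds with win probability p_win."""
    total = 0
    for _ in range(n_paths):
        x = 0
        for _ in range(n_rounds):
            x += 1 if random.random() < p_win else -1
        total += x
    return total / n_paths

print(mean_fortune(0.5, 10))   # ~0.0  : fair game, martingale, mean constant
print(mean_fortune(0.6, 10))   # ~+2.0 : biased in your favour, submartingale
print(mean_fortune(0.4, 10))   # ~-2.0 : biased against you, supermartingale
```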
3.2.2 Examples of Martingales
(E1) Let $X$ be an integrable random variable and $(\mathcal{F}_n, n \in \mathbb{Z}^+)$ be an arbitrary filtration. Define a process $(X_n, n \in \mathbb{Z}^+)$ by $X_n = E(X|\mathcal{F}_n)$. Then it is easy to see that $(X_n, n \in \mathbb{Z}^+)$ is a martingale by using (CE4).

(E2) Sums of Independent Zero Mean Random Variables. Let $(Y_n, n \in \mathbb{N})$ be a sequence of independent integrable random variables, each with zero mean. Define $X_n = Y_1 + Y_2 + \cdots + Y_n$ and $\mathcal{F}_n = \sigma\{Y_1, Y_2, \ldots, Y_n\}$. Then $(X_n, n \in \mathbb{N})$ is a martingale, since by (CE1), (CE3) and (CE6),
$$E(X_n|\mathcal{F}_{n-1}) = Y_1 + Y_2 + \cdots + Y_{n-1} + E(Y_n|\mathcal{F}_{n-1}) = X_{n-1} + E(Y_n) = X_{n-1}.$$

(E3) Products of Independent Unit Mean Random Variables. This is the same set up as in Example 2, the only difference being that we now take each $E(Y_n) = 1$ and define $X_n = Y_1 Y_2 \cdots Y_n$. This yields a martingale, since by (CE3) and (CE6),
$$E(X_n|\mathcal{F}_{n-1}) = Y_1 Y_2 \cdots Y_{n-1} E(Y_n|\mathcal{F}_{n-1}) = X_{n-1} E(Y_n) = X_{n-1}.$$

(E4) Likelihood Ratios. Let $f$ and $g$ be two pdfs with $g(x) \neq 0$ for all $x \in \mathbb{R}$, and let $(Y_n, n \in \mathbb{N})$ be a sequence of i.i.d. random variables, each with pdf $g$. Define a filtration by $\mathcal{F}_n = \sigma\{Y_1, Y_2, \ldots, Y_n\}$, and let
$$X_n = \frac{f(Y_1)f(Y_2) \cdots f(Y_n)}{g(Y_1)g(Y_2) \cdots g(Y_n)}.$$
This is a martingale of the type considered in (E3), since each factor has unit mean:
$$E\left(\frac{f(Y_j)}{g(Y_j)}\right) = \int_{\mathbb{R}} \frac{f(y)}{g(y)}\, g(y)\,dy = \int_{\mathbb{R}} f(y)\,dy = 1.$$
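Here is a hedged numerical sketch of (E2)–(E4) (the distributions are my own choices, not from the notes). Averaging many simulated paths at any fixed time should give approximately 0, 1 and 1 respectively, in line with the constant-mean property.

```python
import math
import random

random.seed(0)

def example_paths(n_steps):
    """One realisation each of the martingales (E2), (E3) and (E4)."""
    # (E2): partial sums of independent zero-mean variables (fair coin steps).
    Y = [random.choice([-1.0, 1.0]) for _ in range(n_steps)]
    sums = [sum(Y[: n + 1]) for n in range(n_steps)]

    # (E3): products of independent unit-mean variables.
    Z = [random.choice([0.5, 1.5]) for _ in range(n_steps)]
    prods, p = [], 1.0
    for z in Z:
        p *= z
        prods.append(p)

    # (E4): likelihood ratio f/g with samples drawn from g.
    # Assumed densities: g = N(0,1), f = N(1,1), so f(y)/g(y) = exp(y - 1/2).
    W = [random.gauss(0.0, 1.0) for _ in range(n_steps)]
    ratios, x = [], 1.0
    for w in W:
        x *= math.exp(w - 0.5)
        ratios.append(x)

    return sums, prods, ratios
```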
3.2.3 The Martingale Transform
The martingale transform is the discrete-time analogue of the important stochastic integral concept which we'll meet later on in continuous time.

Let's return to the idea of betting in a game. When we discussed martingales as describing a fair game, we only bet a unit stake in each round. Now let's make this more interesting by allowing an arbitrary stake or gambling strategy. This leads us to introduce the following concept.

Definition. Let $(\mathcal{F}_n, n \in \mathbb{Z}^+)$ be a filtration. A process $(C_n, n \in \mathbb{N})$ is said to be previsible if $C_n$ is $\mathcal{F}_{n-1}$-measurable for all $n \in \mathbb{N}$.

Note that $C_0$ does not exist. Any previsible process is adapted. The idea behind the definition of a previsible process is that your stake in game $n$ will depend on the history of the game up to and including game $n - 1$. Recall the interpretation of a martingale $(X_n, n \in \mathbb{Z}^+)$ wherein $X_n - X_{n-1}$ is winnings per unit stake on game $n$. Using a previsible strategy, the total winnings on game $n$ are $C_n(X_n - X_{n-1})$, and hence the total winnings up to and including game $n$ are
$$Y_n = \sum_{j=1}^{n} C_j (X_j - X_{j-1}). \qquad (3.2.2)$$
We call $(Y_n, n \in \mathbb{N})$ the martingale transform of $C$ by $X$. It is sometimes written succinctly as $Y = C \bullet X$. The next result shows that you can't beat the system.

Theorem 3.2.1 Suppose that $C$ is bounded, i.e. there exists $K > 0$ such that $|C_n(\omega)| \le K$ for all $\omega \in \Omega$, $n \in \mathbb{N}$, and non-negative, i.e. $C_n(\omega) \ge 0$ for all $\omega \in \Omega$, $n \in \mathbb{N}$.
(i) If $X = (X_n, n \in \mathbb{Z}^+)$ is a supermartingale then $Y = C \bullet X$ is a supermartingale.
(ii) If $X = (X_n, n \in \mathbb{Z}^+)$ is a martingale then $Y = C \bullet X$ is a martingale.

Proof
(i) $Y$ is adapted since sums and products of $\mathcal{F}_n$-measurable random variables are themselves $\mathcal{F}_n$-measurable. That $Y$ is integrable follows from the fact that $X$ is integrable and $C$ is bounded. To get the supermartingale property for $Y$ we argue as follows, using (CE3) and the supermartingale property of $X$:
$$E(Y_n - Y_{n-1}|\mathcal{F}_{n-1}) = E(C_n(X_n - X_{n-1})|\mathcal{F}_{n-1}) = C_n E(X_n - X_{n-1}|\mathcal{F}_{n-1}) \le 0.$$
(ii) is proved similarly, and in this case you can even drop the requirement that $C$ be non-negative.
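A minimal sketch of (3.2.2) in code (the function name is mine): the winnings process is just the cumulative sum of stake times increment.

```python
def martingale_transform(C, X):
    """Compute Y_n = sum_{j=1}^n C_j (X_j - X_{j-1}).

    X = [X_0, X_1, ..., X_N]; C = [C_1, ..., C_N], where each C_j is
    previsible, i.e. chosen using only information available at time j - 1.
    Returns [Y_1, ..., Y_N].
    """
    Y, y = [], 0.0
    for j in range(1, len(X)):
        y += C[j - 1] * (X[j] - X[j - 1])
        Y.append(y)
    return Y

# Theorem 3.2.1 says that no bounded previsible strategy C can turn a
# (super)martingale X into a game with positive expected winnings.
```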
3.3 Stopping Times

3.3.1 Stopping Times and Stopped Processes
Suppose that your strategy is to exit from a game the first time your fortune reaches a certain high or low point (or you want to sell your shares as soon as their price reaches a certain value). We can model this by using a stopping time $T$ relative to a filtration $(\mathcal{F}_n, n \in \mathbb{Z}^+)$. A stopping time $T$ is a random variable for which

(T1) $T$ takes values in $\mathbb{Z}^+ \cup \{\infty\} = \{0, 1, 2, \ldots, \infty\}$.
(T2) The event $(T = n) \in \mathcal{F}_n$ for each $n \in \mathbb{Z}^+$.

Recall that $(T = n) = \{\omega \in \Omega;\ T(\omega) = n\}$. The intuition behind (T1) is simply that $T$ is a random time and so should take non-negative values. We include the point at infinity so that we can capture the idea of waiting forever for an event to take place. (T2) is the key part of the definition. It encapsulates the idea that the information needed to determine whether or not to stop after the $n$th game depends on the history of the game up to and including time $n$. It is sometimes useful to replace (T2) with an equivalent axiom (T2)′:

(T2)′ The event $(T \le n) \in \mathcal{F}_n$ for each $n \in \mathbb{Z}^+$.

To see that these are equivalent, note that if (T2)′ holds then $(T \le n-1) \in \mathcal{F}_{n-1} \subseteq \mathcal{F}_n$, and hence $(T = n) = (T \le n) - (T \le n-1) \in \mathcal{F}_n$, so (T2) holds. Conversely, if (T2) holds then $(T \le n) = \bigcup_{i=0}^{n} (T = i) \in \mathcal{F}_n$, and so (T2)′ holds.
Here is an important example of a stopping time. Let $(X_n, n \in \mathbb{Z}^+)$ be an adapted process and let $B$ be a Borel set. We define $T = \min\{n \in \mathbb{N};\ X_n \in B\}$, so $T$ is the first time (after time zero) that the process moves into the set $B$. $T$ is a stopping time, as
$$(T \le n) = \bigcup_{i=1}^{n} (X_i \in B) \in \mathcal{F}_n.$$
Stopping times defined in this way are called first hitting times. $T$ will not be a stopping time if the process $(X_n, n \in \mathbb{Z}^+)$ fails to be adapted.

Theorem 3.3.1 If $S$ and $T$ are stopping times (with respect to the same filtration) then so are (i) $S + T$, (ii) $\alpha T$ (where $\alpha \in \mathbb{N}$), (iii) $S \wedge T$, (iv) $S \vee T$.²

Proof. (i) is Problem 20, (ii) is obvious. For (iii) and (iv) use
$$(S \wedge T \le n) = (S \le n) \cup (T \le n) \in \mathcal{F}_n, \qquad (S \vee T \le n) = (S \le n) \cap (T \le n) \in \mathcal{F}_n.$$

Now let $(X_n, n \in \mathbb{Z}^+)$ be an adapted process and $T$ be a stopping time. We introduce

- The stopped random variable $X_T$: for each $\omega \in \Omega$, $X_T(\omega) = X_{T(\omega)}(\omega)$.
- The stopped process $(X_{T \wedge n}, n \in \mathbb{Z}^+)$: for each $\omega \in \Omega$, $n \in \mathbb{Z}^+$, $X_{T \wedge n}(\omega) = X_{T(\omega) \wedge n}(\omega)$.

So for example, if chance selects an $\omega$ for which $T(\omega) = 5$, then $X_T(\omega) = X_5(\omega)$ and (for this value of $\omega$ only) $(X_{T \wedge n}, n \in \mathbb{Z}^+) = (X_0, X_1, X_2, X_3, X_4, X_5, X_5, X_5, \ldots)$. If $X_n$ is the price of a share at time $n$ and $T$ is the selling time, then $X_T$ is the price at the selling time and $(X_{T \wedge n}, n \in \mathbb{Z}^+)$ is the history of the share price up to and including the time you sell.
² $S \wedge T = \min\{S, T\}$, $S \vee T = \max\{S, T\}$.
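The following sketch (my own illustration, not from the notes) computes a first hitting time and the corresponding stopped path for a simulated walk; note that only the past of the path is ever inspected, which is exactly the stopping time property.

```python
import random

random.seed(2)

# A sample path X_0, ..., X_N of a simple random walk.
X = [0]
for _ in range(50):
    X.append(X[-1] + random.choice([-1, 1]))

def first_hitting_time(path, B):
    """T = min{n >= 1 : X_n in B}, or None if the path never enters B."""
    for n in range(1, len(path)):
        if path[n] in B:
            return n
    return None

T = first_hitting_time(X, {3})   # first time the walk reaches level 3
if T is not None:
    # The stopped process (X_{T ^ n}): follows X up to time T, then freezes.
    stopped = [X[min(n, T)] for n in range(len(X))]
```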
From a practical point of view, we would like to know as much as possible about $X_T$. It turns out that we can get some information about its mean, and we'll now explore this further. We begin by introducing a gambling strategy based on $T$. Suppose that you always bet one pound but you stop playing after time $T$. In this case, your stake process is $C^T = (C_n^T, n \in \mathbb{N})$, where
$$C_n^T = \mathbf{1}_{\{T \ge n\}} = \begin{cases} 1 & \text{if } T \ge n, \\ 0 & \text{if } T < n. \end{cases}$$
This process is previsible, since $C_n^T = 0$ iff $T \le n - 1$ and $(T \le n-1) \in \mathcal{F}_{n-1}$, while $C_n^T = 1$ iff $T \ge n$ and $(T \ge n) = (T \le n-1)^c \in \mathcal{F}_{n-1}$. The winnings process is the martingale transform
$$(C^T \bullet X)_n = \sum_{j=1}^{n} C_j^T (X_j - X_{j-1}).$$
If $T \ge n$,
$$(C^T \bullet X)_n = \sum_{j=1}^{n} (X_j - X_{j-1}) = X_n - X_0 = X_{T \wedge n} - X_0.$$
If $T = k < n$,
$$(C^T \bullet X)_n = \sum_{j=1}^{k} (X_j - X_{j-1}) = X_k - X_0 = X_T - X_0 = X_{T \wedge n} - X_0,$$
i.e. $(C^T \bullet X)_n = X_{T \wedge n} - X_0$. Combining this result with that of theorem 3.2.1, we deduce the following.

Theorem 3.3.2
1. If $X$ is a supermartingale and $T$ is a stopping time, then $(X_{T \wedge n}, n \in \mathbb{Z}^+)$ is a supermartingale, and so $E(X_{T \wedge n}) \le E(X_0)$.
2. If $X$ is a martingale and $T$ is a stopping time, then $(X_{T \wedge n}, n \in \mathbb{Z}^+)$ is a martingale, and so $E(X_{T \wedge n}) = E(X_0)$.
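Theorem 3.3.2 can be checked by simulation (a sketch under my own choice of example): for the symmetric walk started at 0 and the first hitting time of level 1, the stopped process has mean 0 at every fixed $n$.

```python
import random

random.seed(3)

def stopped_mean(n, n_paths=200_000):
    """Monte Carlo estimate of E(X_{T ^ n}) for a symmetric random walk
    started at 0, where T is the first hitting time of level 1."""
    total = 0
    for _ in range(n_paths):
        x = 0
        for _ in range(n):
            if x == 1:              # already stopped: freeze the path at X_T
                break
            x += random.choice([-1, 1])
        total += x
    return total / n_paths

print(stopped_mean(20))   # ~0.0, as Theorem 3.3.2(2) predicts
# Note: E(X_{T ^ n}) = 0 for every fixed n, even though X_T = 1 whenever
# T is finite -- which is exactly the tension discussed next.
```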
It would be nice if we could take the limit as $n \to \infty$ in these results and deduce (in the martingale case) that $E(X_T) = E(X_0)$. Unfortunately this doesn't always work. Indeed, there is no reason why $X_T$ should even be integrable in general. The great twentieth-century probabilist Joseph Doob found conditions which allow you to do this. The following result is called Doob's optional stopping theorem in his honour.

Theorem 3.3.3 (Optional stopping) Let $X$ be a supermartingale and $T$ be a stopping time. If either (i) $T$ is bounded, or (ii) $X$ is bounded and $T$ is a.s. finite, then $X_T$ is integrable and $E(X_T) \le E(X_0)$. If $X$ is a martingale and either (i) or (ii) holds, then $E(X_T) = E(X_0)$.

Proof. We only consider the supermartingale case here (the extension to martingales is straightforward). First suppose that (i) holds, so there exists $N \in \mathbb{N}$ such that $T(\omega) \le N$ for all $\omega \in \Omega$. From theorem 3.3.2 we have $E(X_{T \wedge n}) \le E(X_0)$ for all $n \in \mathbb{N}$. Now take $n = N$; then $T \wedge N = T$, and so we get $E(X_T) \le E(X_0)$, as was required. If (ii) holds, there exists $K > 0$ such that $|X_n(\omega)| \le K$ for all $n \in \mathbb{Z}^+$, $\omega \in \Omega$. Hence we obtain $|X_{T \wedge n} - X_0| \le |X_{T \wedge n}| + |X_0| \le 2K$. By the dominated convergence theorem (theorem 1.3.1) we have
$$E(X_T) - E(X_0) = \lim_{n \to \infty} E(X_{T \wedge n} - X_0) \le 0.$$
Note: For the proof of theorem 3.3.3(ii), it is enough for $(X_{T \wedge n}, n \in \mathbb{Z}^+)$ to be bounded, provided $T$ is finite (a.s.).
3.3.2 Examples
1. Hitting Times for a Random Walk. Let $(X_n, n \in \mathbb{N})$ be a sequence of i.i.d. random variables with $P(X_n = 1) = P(X_n = -1) = 1/2$. The simple (symmetric) random walk is the process $(S_n, n \in \mathbb{Z}^+)$ defined as follows: $S_0 = 0$, $S_n = X_1 + X_2 + \cdots + X_n$.
We work with the filtration $(\mathcal{F}_n, n \in \mathbb{Z}^+)$ where $\mathcal{F}_0 = \{\emptyset, \Omega\}$ and $\mathcal{F}_n = \sigma\{X_1, X_2, \ldots, X_n\} = \sigma\{S_1, S_2, \ldots, S_n\}$ for $n \in \mathbb{N}$. Now consider the stopping time
$$T = \min\{n \in \mathbb{N};\ S_n = 1\}.$$
We would like to know the distribution of $T$. Our first observation is that $P(T = 2m) = 0$ for all $m \in \mathbb{Z}^+$; indeed, after $2m$ steps the path must comprise $k$ steps of $-1$ and $2m - k$ steps of $+1$ ($0 \le k \le 2m$), and hence the walk must be at a point which is an even number of steps away from the origin. To get precise information about the probabilities $P(T = 2m - 1)$ for $m \in \mathbb{N}$, we reason as follows. Let $\theta \in \mathbb{R}$; then for each $n \in \mathbb{N}$,
$$E(e^{\theta X_n}) = \frac{1}{2}(e^{\theta} + e^{-\theta}) = \cosh(\theta),$$
hence $E(\operatorname{sech}(\theta) e^{\theta X_n}) = 1$.
By example (E3) we then have that $(M_n, n \in \mathbb{Z}^+)$ is a martingale, where $M_0 = 1$ and for $n \ge 1$, $M_n = (\operatorname{sech}(\theta))^n e^{\theta S_n}$.
Now for all $\theta \in \mathbb{R}$ we have $\cosh(\theta) \ge 1$, so $\operatorname{sech}(\theta) \le 1$. Furthermore, if $\theta > 0$ we have $e^{\theta S_{T \wedge n}} \le e^{\theta S_T} = e^{\theta}$, hence $(M_{T \wedge n}, n \in \mathbb{Z}^+)$ is bounded (for $\theta > 0$).
Although we have not ruled out the possibility that $P(T = \infty) > 0$, a technical argument which we will not give here³ allows us to still use Doob's optional stopping theorem (theorem 3.3.3) to get $E(M_T) = 1$, i.e. (using $S_T = 1$)
$$E\big((\operatorname{sech}(\theta))^T\big) = e^{-\theta}. \qquad (3.3.3)$$

³ Effectively we may define $M_T = 0$ when $T = \infty$, since $\lim_{n \to \infty} (\operatorname{sech}(\theta))^n = 0$.
Now put $\alpha = \operatorname{sech}(\theta) = \dfrac{2}{e^{\theta} + e^{-\theta}}$. Then $e^{\theta}$ and $e^{-\theta}$ are the two solutions of the quadratic equation $\alpha r^2 - 2r + \alpha = 0$. On solving this in the usual way, we obtain $e^{-\theta} = \frac{1}{\alpha}\left(1 - \sqrt{1 - \alpha^2}\right)$. Substituting into (3.3.3) yields
$$E(\alpha^T) = \frac{1}{\alpha}\left(1 - \sqrt{1 - \alpha^2}\right),$$
i.e.
$$\sum_{n=0}^{\infty} \alpha^n P(T = n) = \frac{1}{\alpha}\left(1 - \sqrt{1 - \alpha^2}\right). \qquad (3.3.4)$$
To extract the probabilities, expand the square root as a binomial series:
$$(1 - y)^{\frac{1}{2}} = \sum_{m=0}^{\infty} \binom{\frac{1}{2}}{m} (-y)^m,$$
where for $m \in \mathbb{N}$,
$$\binom{\frac{1}{2}}{m} = \frac{1}{m!}\,\frac{1}{2}\left(\frac{1}{2} - 1\right)\left(\frac{1}{2} - 2\right) \cdots \left(\frac{1}{2} - m + 1\right).$$
Taking $y = \alpha^2$ in (3.3.4) gives
$$\sum_{n=0}^{\infty} \alpha^n P(T = n) = \sum_{m=1}^{\infty} (-1)^{m-1} \binom{\frac{1}{2}}{m} \alpha^{2m-1}, \qquad (3.3.5)$$
so that, comparing coefficients, $P(T = 2m - 1) = (-1)^{m-1}\binom{\frac{1}{2}}{m}$ for each $m \in \mathbb{N}$. In particular you can compute $P(T = 1) = 1/2$, $P(T = 3) = 1/8$, $P(T = 5) = 1/16$, $P(T = 7) = 5/128, \ldots$, so e.g. $P(T \le 7) = 93/128 \approx 0.7266$.
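The probabilities produced by (3.3.5) can be compared against simulation; this is my own check, not part of the notes.

```python
import random

random.seed(4)

def p_hit(m):
    """P(T = 2m - 1) = (-1)**(m-1) * binom(1/2, m), from (3.3.5)."""
    b = 1.0
    for k in range(m):                  # build binom(1/2, m) as a product
        b *= (0.5 - k) / (k + 1)
    return (-1) ** (m - 1) * b

print([p_hit(m) for m in (1, 2, 3, 4)])  # [0.5, 0.125, 0.0625, 0.0390625]

# Monte Carlo estimates of P(T = 1), P(T = 3), ... from simulated walks.
counts = {}
for _ in range(100_000):
    x, n = 0, 0
    while x != 1 and n < 200:            # truncate very long excursions
        x += random.choice([-1, 1])
        n += 1
    if x == 1:
        counts[n] = counts.get(n, 0) + 1
print({n: counts.get(n, 0) / 100_000 for n in (1, 3, 5, 7)})
```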
2. Wald's Sequential Hypothesis Test. We briefly examine a nice application of martingales and stopping times to statistics. We make observations of i.i.d. random variables $(Y_n, n \in \mathbb{N})$. There are two hypotheses:

H0: the common pdf of the $Y_j$'s is $f$.
H1: the common pdf of the $Y_j$'s is $g$.

We control the type I and type II errors as follows: $P(\text{reject } H_0 \,|\, H_0) \le \alpha$, $P(\text{reject } H_1 \,|\, H_1) \le \beta$.
Let $(X_n, n \in \mathbb{Z}^+)$ be the likelihood ratio martingale under $H_0$ as described in (E4), so for each $n \in \mathbb{N}$,
$$X_n = \frac{g(Y_1) g(Y_2) \cdots g(Y_n)}{f(Y_1) f(Y_2) \cdots f(Y_n)},$$
and $X_0 = 1$. The strategy is as follows:

n = 0
Repeat
  n = n + 1; observe $Y_n$; calculate $X_n$;
Until $X_n \ge \alpha^{-1}$ or $X_n \le \beta$
If $X_n \ge \alpha^{-1}$ accept H1
Otherwise accept H0.

This recipe appears rather strange at first glance. To explore further, introduce the stopping time $T = \min\{n \ge 0;\ X_n \ge \alpha^{-1} \text{ or } X_n \le \beta\}$. It can be shown that $T$ is finite. Although $X$ is not bounded, more detailed analysis shows that we still have $E_0(X_T) \le E_0(X_0)$ in this case (where the subscript zero on the expectation emphasises that probabilities are calculated under $H_0$). Now using Markov's inequality (Problem 8), we obtain
$$1 = E_0(X_0) \ge E_0(X_T) \ge \frac{1}{\alpha}\, P(X_T \ge \alpha^{-1}) = \frac{1}{\alpha}\, P(\text{reject } H_0 \,|\, H_0).$$
Hence $P(\text{reject } H_0 \,|\, H_0) \le \alpha$, as was required. A similar argument (assuming $H_1$ this time) shows that $P(\text{reject } H_1 \,|\, H_1) \le \beta$.
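Here is a sketch of the test as code, under my own choice of densities ($f = N(0,1)$, $g = N(1,1)$, so $g(y)/f(y) = e^{y - 1/2}$); all names are illustrative.

```python
import math
import random

random.seed(5)

def wald_sprt(sample, alpha, beta):
    """Wald's sequential test for H0: pdf f = N(0,1) vs H1: pdf g = N(1,1).
    'sample' draws one observation. Returns (decision, observations used)."""
    x, n = 1.0, 0                      # X_0 = 1
    while beta < x < 1.0 / alpha:
        y = sample()                   # observe Y_{n+1}
        x *= math.exp(y - 0.5)         # update likelihood ratio: g(y)/f(y)
        n += 1
    return ("H1" if x >= 1.0 / alpha else "H0"), n

# Type I error check: sample under H0 and count how often H0 is rejected.
rejections = sum(
    wald_sprt(lambda: random.gauss(0.0, 1.0), alpha=0.05, beta=0.05)[0] == "H1"
    for _ in range(10_000)
)
print(rejections / 10_000)             # bounded by alpha = 0.05, as proved above
```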