Monte-Carlo Methods For Single-And Multi-Factor Models: 1 Simulating Stochastic Differential Equations
We recognize that XT depends on the Brownian motion only through the Brownian motion’s terminal value,
WT . This implies that even if we are unable to compute θ analytically, we can estimate it by simulating WT
directly. (In this case it is also true that since the distribution of XT is known, we could also simulate XT
directly. And of course, as an alternative to simulation, we could choose to estimate θ by evaluating the
expectation numerically.)
Note that unlike the previous example, XT now depends on the entire path of the Brownian motion. This
means that we cannot compute an unbiased estimate of θ by first simulating the entire path of the Brownian
motion since it is only possible to simulate the latter at discrete intervals of time. It so happens, however, that
we know the distribution of XT : it is normal. In particular, this places us back in the context of Example 1
where, if θ cannot be computed analytically, we can estimate it by either simulating XT directly or by evaluating
the expectation numerically.
and where µ(t) is a deterministic function of time. While it is clear that Xt follows a CIR process with time
varying parameters, we do not know how to find an explicit solution2 to the SDE in (5). Of course we do not
necessarily need an explicit solution to (5) to determine the distribution of XT (which is what we need to
evaluate θ). For example, in the CIR model with constant parameters, we still do not have an explicit solution
to the SDE yet it is known that XT has a non-central χ2 distribution from which we can easily simulate.
Unfortunately, however, once we move to a CIR model with time-varying parameters as in (5), the distribution
of XT is, in general, no longer available. This then complicates the task of computing θ (either analytically or
by estimating it by simulating XT directly).
One solution to this problem is to simulate XT indirectly by simulating the SDE in (5).
Exercise 1 Suppose we assume that the short-rate, rt , has dynamics given by (5). What do the comments in
the above example then imply about the difficulty of computing the term-structure?
Exercise 2 Suppose you wish to estimate θ := E[f ({Xt }0≤t≤T )] so that f (·) now depends on the entire path
of the process, Xt . Comment on whether or not this can be reduced to the problem of estimating θ = E[f (YT )]
for some process, YT .
Remark 1 The situation of Example 3 where we do not know the distribution of XT is typical. As a result, it
is often necessary to simulate a stochastic differential equation if we wish to estimate some associated quantity,
e.g. θ = E[f (XT )]. We describe how to do this below, beginning with the one-dimensional case.
and that we wish to simulate values of XT but do not know its distribution. (This could be due to the fact that
we cannot solve (6) to obtain an explicit solution for XT , or because we simply cannot determine the
distribution of XT even though we do know how to solve (6)).
When we simulate an SDE, what we mean is that we simulate a discretized version of the SDE. In particular, we
simulate a discretized process, $\{\hat{X}_h, \hat{X}_{2h}, \ldots, \hat{X}_{mh}\}$, where m is the number of time steps, h is a constant and
mh = T. The smaller the value of h, the closer our discretized path will be to the continuous-time path we wish
to simulate. Of course this will be at the expense of greater computational effort. While there are a number of
discretization schemes available, we will focus on the simplest and perhaps most common scheme, the Euler
scheme.
2 As we did in Examples 1 and 2.
The Euler scheme is intuitive, easy to implement and satisfies
$$\hat{X}_{kh} = \hat{X}_{(k-1)h} + \mu\left((k-1)h, \hat{X}_{(k-1)h}\right) h + \sigma\left((k-1)h, \hat{X}_{(k-1)h}\right) \sqrt{h}\, Z_k \qquad (7)$$
where the Zk ’s are IID N (0, 1). If we want to estimate θ := E[f (XT )] using the Euler scheme, then for a fixed
number of paths, n, and discretization interval, h, we have the following algorithm.
for j = 1 to n
    set t = 0; \hat{X} = X_0
    for k = 1 to T/h = m
        generate Z ∼ N(0, 1)
        set \hat{X} = \hat{X} + µ(t, \hat{X}) h + σ(t, \hat{X}) \sqrt{h} Z
        set t = t + h
    end for
    set f_j = f(\hat{X})
end for
set \hat{θ}_n = (f_1 + . . . + f_n)/n
set \hat{σ}_n^2 = \sum_{j=1}^{n} (f_j − \hat{θ}_n)^2 / (n − 1)
set approx. 100(1 − α)% CI = \hat{θ}_n ± z_{1−α/2} \hat{σ}_n / \sqrt{n}
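The algorithm above can be sketched in Python as follows. This is only a minimal illustration using the standard library: the function name euler_estimate and its argument names are hypothetical, and the drift mu(t, x) and diffusion sigma(t, x) are supplied by the user as ordinary Python functions.

```python
import math
import random
from statistics import NormalDist


def euler_estimate(f, mu, sigma, x0, T, m, n, alpha=0.05, seed=None):
    """Estimate theta = E[f(X_T)] using the Euler scheme.

    mu(t, x) and sigma(t, x) are the drift and diffusion coefficients,
    m is the number of time steps and n is the number of sample paths.
    Returns (theta_hat, (ci_low, ci_high)).
    """
    rng = random.Random(seed)
    h = T / m
    sqrt_h = math.sqrt(h)
    payoffs = []
    for _path in range(n):
        t, x = 0.0, x0
        for _step in range(m):
            z = rng.gauss(0.0, 1.0)
            # One Euler step: drift over h plus diffusion scaled by sqrt(h).
            x = x + mu(t, x) * h + sigma(t, x) * sqrt_h * z
            t += h
        payoffs.append(f(x))
    theta_hat = sum(payoffs) / n
    # Sample variance with the (n - 1) divisor, as in the algorithm.
    var_hat = sum((fj - theta_hat) ** 2 for fj in payoffs) / (n - 1)
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_crit * math.sqrt(var_hat / n)
    return theta_hat, (theta_hat - half_width, theta_hat + half_width)
```

For instance, euler_estimate(lambda x: x, lambda t, x: 0.05 * x, lambda t, x: 0.2 * x, 100.0, 1.0, 20, 2000) estimates E[X_T] for a geometric Brownian motion chosen purely for illustration. The confidence interval reflects only the statistical error and not the discretization bias.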
Remark 2 Observe that even though we only care about $X_T$, we still need to generate intermediate values,
$X_{ih}$, if we are to minimize the discretization error. Because of this discretization error, $\hat{\theta}_n$ is no longer an
unbiased estimator of θ.
Remark 3 If we wished to estimate θ = E[f (Xt1 , . . . , Xtp )] then in general we would need to keep track of
(Xt1 , . . . , Xtp ) inside the inner for-loop of the algorithm.
Exercise 3 Can you think of a derivative where the payoff depends on (Xt1 , . . . , Xtp ), but where it would not
be necessary to keep track of (Xt1 , . . . , Xtp ) on each sample path?
(1) Modelling the evolution of multiple stocks. This might be necessary if we are trying to price
derivatives whose values depend on multiple stocks or state variables, or if we are studying the properties
of some portfolio strategy with multiple assets.
(2) Modelling the evolution of a single stock where we assume that the volatility of the stock is itself
stochastic. Such a model is termed a stochastic volatility model.
(3) Modelling the evolution of interest rates. For example, if we assume that the short rate, rt , is
driven by a number of factors which themselves are stochastic and satisfy SDE’s, then simulating rt
amounts to simulating the SDE’s that drive the factors. Such models occur in short-rate models as
well as HJM and LIBOR market models.
In all of these cases, whether or not we will have to simulate the SDE’s will depend on the model in question
and on the particular quantity that we wish to compute. If we do need to discretize the SDE’s and simulate
their discretized versions, then it is very straightforward. If there are n correlated Brownian motions driving the
SDE’s, then at each time step, ti , we must generate n IID N (0, 1) random variables. We would then use the
Cholesky Decomposition to generate Xti+1 . This is exactly analogous to our method of generating correlated
geometric Brownian motions. In the context of simulating multidimensional SDE’s, however, it is more common
to use independent Brownian motions as any correlations between components of the vector, Xt , can be
induced through the matrix, σ(t, Xt ).
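A minimal sketch of the multi-dimensional Euler step with independent Brownian motions is given below. The function names (cholesky, euler_step, simulate_terminal) are hypothetical, the Cholesky factor is computed by hand to keep the example free of external libraries, and any correlation is assumed to be built into the matrix returned by sigma(t, x).

```python
import math
import random


def cholesky(a):
    """Return lower-triangular L with L L^T = a (a symmetric positive definite)."""
    d = len(a)
    low = [[0.0] * d for _ in range(d)]
    for i in range(d):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def euler_step(x, t, h, mu, sigma, rng):
    """One Euler step for dX = mu(t, X) dt + sigma(t, X) dW.

    W is a vector of independent Brownian motions, mu returns a list of
    length d and sigma returns a d x d matrix given as a list of rows.
    """
    d = len(x)
    z = [rng.gauss(0.0, 1.0) for _ in range(d)]  # d IID N(0, 1) draws
    drift = mu(t, x)
    vol = sigma(t, x)
    sqrt_h = math.sqrt(h)
    return [
        x[i] + drift[i] * h + sqrt_h * sum(vol[i][j] * z[j] for j in range(d))
        for i in range(d)
    ]


def simulate_terminal(x0, T, m, mu, sigma, rng):
    """Simulate the discretized vector process up to time T with m steps."""
    h = T / m
    t, x = 0.0, list(x0)
    for _step in range(m):
        x = euler_step(x, t, h, mu, sigma, rng)
        t += h
    return x
```

To induce a correlation of 0.5 between two components with volatilities 0.2 and 0.3, say, one could scale the rows of cholesky([[1.0, 0.5], [0.5, 1.0]]) by the volatilities and return the resulting matrix from sigma.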
$$m \propto C^{1/3} \qquad (10)$$
$$n \propto C^{2/3}. \qquad (11)$$
When it comes to estimating θ, (10) and (11) provide guidance as follows. We begin by using $n_0$ paths and $m_0$
discretization points per path to compute an initial estimate, $\hat{\theta}_0$, of θ. If we then compute a new estimate, $\hat{\theta}_1$,
by setting $m_1 = 2m_0$, then (10) and (11) suggest we should set $n_1 = 4n_0$. We may then continue to compute
new estimates, $\hat{\theta}_i$, in this manner until the estimates and their associated confidence intervals converge. In
general, if we increase m by a factor of 2 then we should increase n by a factor of 4. Although estimating θ in
this way requires additional computational resources, it is not usually necessary to perform more than two or
three iterations, provided we begin with sufficiently large values of $m_0$ and $n_0$.
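The refinement schedule just described might be organized as in the following sketch. The name refine_until_stable, the callable estimator(m, n) (assumed to return an estimate and a confidence-interval half-width) and the tolerance-based stopping rule are all hypothetical choices made for illustration.

```python
def refine_until_stable(estimator, m0, n0, tol, max_rounds=4):
    """Double m and quadruple n until successive estimates agree.

    estimator(m, n) is assumed to return (theta_hat, half_width).
    Returns the history as a list of (m, n, theta_hat, half_width).
    """
    m, n = m0, n0
    history = []
    for _round in range(max_rounds):
        theta_hat, half_width = estimator(m, n)
        history.append((m, n, theta_hat, half_width))
        if len(history) >= 2:
            previous = history[-2][2]
            # Stop once the estimate has settled and the interval is narrow.
            if abs(theta_hat - previous) <= tol and half_width <= tol:
                break
        # m grows like C^(1/3) and n like C^(2/3): doubling m means 4x n.
        m, n = 2 * m, 4 * n
    return history
```

Since the work per round is proportional to mn, each round costs roughly eight times as much as the previous one, which is one reason to start from reasonably large values of m0 and n0.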
Remark 4 There are other important issues that arise when simulating SDE’s. For example, while we have
only described the Euler scheme, there are other more sophisticated discretization schemes that can also be
used. In a sense that we will not define, these schemes have better convergence properties than the Euler
scheme. However, they are sometimes more difficult to implement, particularly in the multi-dimensional setting.
$$dV_t = \alpha (b - V_t)\, dt + \sigma \sqrt{V_t}\, dW_t^{(2)}. \qquad (13)$$
If we want to price a European call option on the stock with expiration, T , and strike K, then the price is given
by
C0 = exp(−rT )E[max(ST − K, 0)].
We could estimate $C_0$ by simulating n sample paths of $\{S_t, V_t\}$ up to time T, and taking the average of
$\exp(-rT) \max(S_T - K, 0)$ over the n paths as our estimated call option price, $\hat{C}_0$.
Exercise 4 Write out the details of the algorithm that you would use to estimate C0 in Example 4.
$$dX_t = -k X_t\, dt + \sigma_{x,1}\, dW_t^{(1)} + \sigma_{x,2}\, dW_t^{(2)}$$
$$dr_t = \alpha(t, r_t)\, dt + \beta(t, r_t)\, dW_t^{(3)}$$
where Xt is a state variable that possibly represents the time t value of some relevant economic variable. Let θt
be an adapted process that denotes the fraction of the investor’s wealth that is invested3 in the fund at time t,
and let Yt denote the investor’s wealth at time t. We then see that Yt satisfies
$$dY_t = [r_t + \theta_t (\mu + \lambda X_t - r)]\, Y_t\, dt + \theta_t Y_t\, [\sigma_1\, dW_t^{(1)} + \sigma_2\, dW_t^{(2)}]. \qquad (14)$$
Now it may be the case that the investor wishes to compute E[u(YT )] where u(·) is his utility function, or that
he wishes to compute P(YT ≤ a) for some fixed value, a. In general, however, it is not possible to perform
these computations explicitly. As a result, we could instead use simulation. Noting that we will not in general be
able to solve (14) for YT and its distribution, this means that we would have to simulate the multivariate SDE
satisfied by (Pt , Xt , rt , Yt ) in order to answer these questions.
Exercise 5 Write out the details of the algorithm that you would use to estimate θ = P(YT ≤ a) in Example 5.
where µ(t) is a deterministic function of time. This generalized CIR model is used when we want to fit a
CIR-type model to the initial term-structure. (We will see later that a CIR model with constant parameters
when modelled under Q becomes a CIR model with time-varying parameters under the forward measure, P τ .)
Suppose now that we wish to price a derivative security maturing at time T with payoff $C_T(r_T)$. Then its time
0 price, $C_0$, is given by
$$C_0 = E_0\left[ e^{-\int_0^T r_s\, ds}\, C_T(r_T) \right]. \qquad (16)$$
The distribution of rt is not available in an easy-to-use closed form so perhaps the easiest way to estimate C0 is
by simulating the dynamics of rt . Towards this end, we could either use (15) and simulate rt directly or
alternatively, we could simulate Xt := f (rt ) where f (·) is an invertible transformation. Note that because of the
discount factor in (16), it is also necessary to simulate the process, Yt , given by
$$Y_t = \exp\left( -\int_0^t r_s\, ds \right).$$
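One way to carry the discount factor along each path is sketched below. The function discounted_payoff_estimate is hypothetical, the short-rate drift and diffusion are passed in as generic functions rather than taken from any particular model, and the integral of the short rate is approximated by a left-endpoint sum, which is only one of several possible quadrature choices.

```python
import math
import random


def discounted_payoff_estimate(payoff, mu, sigma, r0, T, m, n, seed=None):
    """Estimate E[exp(-int_0^T r_s ds) * payoff(r_T)] with an Euler scheme.

    mu(t, r) and sigma(t, r) are the drift and diffusion of the short rate.
    """
    rng = random.Random(seed)
    h = T / m
    sqrt_h = math.sqrt(h)
    total = 0.0
    for _path in range(n):
        t, r, integral = 0.0, r0, 0.0
        for _step in range(m):
            # Accumulate int r_s ds using the rate at the start of the step.
            integral += r * h
            r = r + mu(t, r) * h + sigma(t, r) * sqrt_h * rng.gauss(0.0, 1.0)
            t += h
        total += math.exp(-integral) * payoff(r)
    return total / n
```

For a square-root diffusion one might pass sigma = lambda t, r: 0.1 * math.sqrt(max(r, 0.0)), where truncating at zero is an ad hoc device for keeping the square root well defined when the discretized rate becomes negative.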
Exercise 6 Describe in detail how you would estimate C0 in Example 6. Note that there are
alternative ways to do this. What way do you prefer?
Exercise 7 Suppose we wish to simulate the known dynamics of a zero-coupon bond. How would you ensure
that the simulated process satisfies $0 < Z_t^T < 1$?
Theorem 1 (Jensen’s Inequality) Suppose f (·) is a concave4 function on R, E[X] < ∞ and
E[f (X)] < ∞. Then E[f (X)] ≤ f (E[X]).
Returning to Example 6, we assumed there that the payoff function, $C_T(r_T)$, was easy to evaluate. This is true,
for example, if $C_T(r_T) = (Z_T^U - K)^+$ in the Vasicek or CIR model, among others. However, in many
circumstances $C_T(r_T)$ will not be easy to evaluate. This is true, for example, when $C_T(r_T) = f(Z_T^U)$ in models
where zero-coupon bond prices are not available in closed form. In such circumstances it may be necessary to
estimate $f(Z_T^U)$ using an additional simulation. Staying with the call option example, we see that its price, $C_0$,
may be written as
$$C_0 = E_0\left[ e^{-\int_0^T r_s\, ds}\, (Z_T^U - K)^+ \right]
= E_0\left[ e^{-\int_0^T r_s\, ds} \left( E_T\left[ e^{-\int_T^U r_v\, dv} \right] - K \right)^+ \right]. \qquad (17)$$
To estimate ZTU along a given sample path we see from (17) that it will therefore be necessary to perform an
additional simulation, or a “simulation within a simulation”.
4 A function f(·) is concave on R if f(αx + (1 − α)y) ≥ αf(x) + (1 − α)f(y) for all x, y ∈ R and all α ∈ [0, 1].
Exercise 8 Use Jensen’s Inequality and (17) to show that the estimate of C0 will be biased away from the true
value. In what direction will the bias be?
The extent of the bias will depend on how accurately we can estimate $Z_T^U$ along each simulated sample path.
Accuracy can be improved by conducting a large number of “simulations within the simulation” but this comes at the
cost of more computational resources. For reasons that will be clear later, HJM and LIBOR market
models do not suffer from this bias problem.
as our control variate. In this situation the expression in (18) may not have mean 0 due to discretization
error. However, this bias can be made arbitrarily small by taking a sufficiently fine partition of [0, T ].
2. It is possible to simulate $\exp\left(-\int_0^T r_s\, ds\right)$ exactly while at the same time simulating the SDE for $r_t$.
This is possible, for example, in the Vasicek model where the joint distribution of $\left(-\int_0^{t_i} r_s\, ds,\; r_{t_i}\right)$ is
known to be bivariate normal.
For an example based on conditional Monte-Carlo, consider again the stochastic volatility model of Example 4.
Suppose now that the Brownian motions, $W_t^{(1)}$ and $W_t^{(2)}$ in (12) and (13), are independent. Then
$$C_0 = e^{-rT} E[\max(S_T - K, 0)] = e^{-rT} E\left[ E[\max(S_T - K, 0) \mid V_t,\ 0 \le t \le T] \right].$$
But it can be shown using the independence of $W_t^{(1)}$ and $W_t^{(2)}$ that
$$e^{-rT} E[\max(S_T - K, 0) \mid V_t,\ 0 \le t \le T] = c(S_0, T, K, r, V)$$
where $V := \sqrt{\int_0^T V_t\, dt / T}$. In particular, this means that we can estimate $C_0$ by using the conditional
Monte-Carlo method.
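The conditioning idea can be illustrated with the following sketch. It rests on assumptions not stated in this section: that c(·) is the Black-Scholes call price with volatility V, that the variance process follows (13), and that negative discretized variances are truncated at zero. The function names and the left-endpoint approximation of the integrated variance are likewise illustrative choices.

```python
import math
import random
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def black_scholes_call(s0, T, K, r, vol):
    """Black-Scholes call price; assumed here to play the role of c(.)."""
    if vol <= 0.0:
        return max(s0 - K * math.exp(-r * T), 0.0)
    d1 = (math.log(s0 / K) + (r + 0.5 * vol * vol) * T) / (vol * math.sqrt(T))
    d2 = d1 - vol * math.sqrt(T)
    return s0 * _STD_NORMAL.cdf(d1) - K * math.exp(-r * T) * _STD_NORMAL.cdf(d2)


def conditional_mc_call(s0, v0, T, K, r, alpha, b, sigma, m, n, seed=None):
    """Average Black-Scholes prices over simulated variance paths.

    Only V_t is simulated (Euler scheme for (13)); S_t is integrated out
    analytically by conditioning on the variance path.
    """
    rng = random.Random(seed)
    h = T / m
    sqrt_h = math.sqrt(h)
    total = 0.0
    for _path in range(n):
        v, integral = v0, 0.0
        for _step in range(m):
            v_pos = max(v, 0.0)  # truncation: an assumption of this sketch
            integral += v_pos * h
            z = rng.gauss(0.0, 1.0)
            v = v + alpha * (b - v) * h + sigma * math.sqrt(v_pos) * sqrt_h * z
        v_bar = math.sqrt(integral / T)  # root-mean integrated variance
        total += black_scholes_call(s0, T, K, r, v_bar)
    return total / n
```

Because the stock price is never simulated, all of the sampling variability in the estimate comes from the variance path alone.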
Exercise 9 Write out the details of the conditional Monte-Carlo algorithm that you would use to estimate C0 .
Remark 6 The above example may be generalized in certain circumstances to accommodate dependence
between $W_t^{(1)}$ and $W_t^{(2)}$.
$$C_T = \frac{C_T^+ + C_T^-}{2}.$$
An antithetic estimator will often provide a significant variance reduction over the naive estimator. The
magnitude of the variance reduction, however, will depend on $\mathrm{Cov}(C_T^+, C_T^-)$ with a positive covariance resulting
in a variance increase.
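The role of the covariance can be made explicit with a short calculation. The symbols $s^2$ and $\rho$ are introduced here only for illustration, under the assumption that $C_T^+$ and $C_T^-$ have the same variance $s^2$ and correlation $\rho$.

```latex
\begin{align*}
\operatorname{Var}(C_T)
  &= \tfrac{1}{4}\left[ \operatorname{Var}(C_T^+) + \operatorname{Var}(C_T^-)
     + 2 \operatorname{Cov}(C_T^+, C_T^-) \right] \\
  &= \tfrac{s^2}{2}\,(1 + \rho).
\end{align*}
```

The average of two independent samples has variance $s^2/2$, so under these assumptions the antithetic pair reduces the variance exactly when $\rho < 0$ and increases it when $\rho > 0$.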
Exercise 10 Would the method of antithetic variates be guaranteed (by the monotonicity theorem for
antithetic variates) to provide a variance reduction when estimating the value of a derivative security with date
T payoff given by |rT − r̄|? (To be precise, we should first specify a model but the answer should be the same
for most reasonable models.)
Remark 7 The antithetic method obviously extends to multi-factor models, as do all of the variance reduction
methods.
and we can use this result to generate (Wh |W0 , WT ). More generally, we can use (19) to successively
simulate (Wh |W0 , WT ), (W2h |Wh , WT ), . . . , (WT −h |WT −2h , WT ).
We can in fact simulate the points on the sample path in any order we like. In particular, to simulate Wv we use
(19) and condition on the two closest sample points before and after v, respectively, that have already been
sampled. This method of pinning the beginning and end points of the Brownian motion is known5 as the
Brownian bridge.
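The successive construction can be sketched as follows. Since (19) is not reproduced here, the code assumes it expresses the standard result that, for t < s < u, the conditional distribution of W_s given W_t = x and W_u = y is normal with mean ((u − s)x + (s − t)y)/(u − t) and variance (s − t)(u − s)/(u − t). The function names are hypothetical.

```python
import math
import random


def bridge_sample(s, t, x, u, y, rng):
    """Sample W_s given W_t = x and W_u = y, where t < s < u."""
    mean = ((u - s) * x + (s - t) * y) / (u - t)
    var = (s - t) * (u - s) / (u - t)
    return rng.gauss(mean, math.sqrt(var))


def brownian_path_via_bridge(T, m, rng):
    """Generate (W_0, W_h, ..., W_T) by first drawing W_T, then filling in.

    Each interior point W_{kh} is drawn conditional on the previously
    sampled W_{(k-1)h} and on the terminal value W_T.
    """
    h = T / m
    path = [0.0] * (m + 1)
    path[m] = rng.gauss(0.0, math.sqrt(T))  # terminal value sampled first
    for k in range(1, m):
        path[k] = bridge_sample(k * h, (k - 1) * h, path[k - 1], T, path[m], rng)
    return path
```

Sampling the terminal value first is what allows it to be stratified, with the remainder of the path then filled in conditionally.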
Exercise 11 If we are working with a multi-dimensional correlated Brownian motion, Wt , (e.g. in the context
of a multi-factor model of the short rate) is it still easy to use the Brownian bridge construction where we first
generate the random vector, WT ?
Remark 8 It is clear, but perhaps worth mentioning nonetheless, that the Brownian bridge / stratification
technique is not restricted to term structure applications.
We will delay a discussion of importance sampling until the next section when we talk about changing the
numeraire. This makes sense as the two concepts are clearly related.
where P τ is the forward measure6 that corresponds to taking Ztτ as the numeraire security. We then saw in
Example 1 of the Continuous-Time Short-Rate Models lecture notes how this change of numeraire technique
can be useful for obtaining analytic expressions for derivative security prices in the Vasicek and Hull-White
models. More generally, this change of numeraire technique can also be advantageous when using simulation to
estimate security prices.
$$= Z_0^\tau\, E_0^{P^\tau}\left[ \max(Z_T^U - k, 0) \right]. \qquad (22)$$
Note that if we estimate C0 by simulating then there are at least two advantages that arise from using the
expression in (22) rather than the expression in (21):
1. We do not need to keep track of the process $X_t := \int_0^t r_s\, ds$.
2. We avoid the discretization error associated with our general inability to simulate the discount factor
exactly. (Note that we might still incur some discretization error in simulating $Z_T^U$.)
5 See Glasserman for further details.
6 Recall also that $dP^\tau/dQ = 1/(B_\tau Z_0^\tau)$.
Remark 9 In the Vasicek and Hull-White models we know that ZTU has a log-normal distribution so there is
no need to use simulation to price the option in Example 11. In general, however, we will not know the
distribution for ZTU and so simulating its SDE will be necessary.
Exercise 12 Can you think of any other advantage that might result from simulating under the forward
measure? (Depending on the context, this might also be a disadvantage.)
Exercise 13 Check that a Vasicek model under Q becomes a Vasicek model with time varying parameters (i.e.
a Hull-White model) under P τ . Does a similar result hold for the CIR model?
Exercise 14 Consider pricing an out-of-the-money European option with payoff max(ZTU − k, 0) where we
need to simulate an SDE to generate samples of ZTU . Do you think working under the forward measure would
tend to increase or decrease the variance of your estimator? What if you wanted to price a put option?
Exercise 15 Is it possible to work with $(Z_t^T, P^T)$ as the numeraire-EMM pair and then use importance
sampling without forgoing the advantages of working under the forward measure if we wish to price a security
using simulation?
where $W_t$ is a Q-Brownian motion. Then a derivative security with payoff at time T given by $C(r_T)$ has time 0
price, $C_0$, given by
$$C_0 = E_0^Q\left[ e^{-\int_0^T r_s\, ds}\, C(r_T) \right].$$
In general, if we are to estimate $C_0$ using simulation then it will be necessary to simulate the SDE’s satisfied by
$r_t$ and $Y_t := \int_0^t r_s\, ds$. However, it is worth mentioning that in some circumstances it is possible to simulate
without bias from the joint distribution of $(Y_T, r_T)$. This would then imply that we could estimate $C_0$ without
any statistical bias.
where τ := T − t.