4 Convergence to equilibrium
4.1 Setup and examples
4.2 Stationary distributions and reversibility
4.3 Dirichlet forms and convergence to equilibrium
4.4 Hypercontractivity
4.5 Logarithmic Sobolev inequalities: Examples and techniques
4.6 Concentration of measure
5
5.1 Ergodic averages
5.2 Central limit theorem for Markov processes
Chapter 1
Continuous-time Markov chains
• Asmussen [4]
• Stroock [20]
• Norris [16]
• Kipnis, Landim [13]
Definition 1.1. A stochastic process (X_n)_{n=0,1,2,...} on (Ω, A, P) is called an (F_n)-Markov chain with transition probabilities p_n if and only if
(i) X_n is F_n-measurable for all n ≥ 0,
(ii) P[X_{n+1} ∈ B | F_n] = p_{n+1}(X_n, B) P-a.s. for all n ≥ 0, B ∈ S.
where
(pq)(x, B) := \int_S p(x, dy) q(y, B)
[Figure: time axis 0, 1, 2, ..., n, n+1, n+2 with transition steps p_{n+1}, p_{n+2}]
{T = n} ∈ Fn ∀n ≥ 0,
FT = {A ∈ A | A ∩ {T = n} ∈ Fn ∀n ≥ 0} events observable up to time T.
E[F (XT , XT +1 , . . .) · I{T <∞} |FT ] = E(T,XT ) [F (X0 , X1 , . . .)] P -a.s. on {T < ∞}
Definition 1.3. A stochastic process (X_t)_{t≥0} on (Ω, A, P) is called an (F_t)-Markov process with transition functions p_{s,t} if and only if
(i) X_t is F_t-measurable for all t ≥ 0,
(ii) P[X_t ∈ B | F_s] = p_{s,t}(X_s, B) P-a.s. for all 0 ≤ s ≤ t, B ∈ S.
Note that the transition functions satisfy
1. p_{s,s} f = f
2. Kolmogorov existence theorem: Let p_{s,t} be transition probabilities on (S, S) such that
(i) p_{t,t}(x, ·) = δ_x for all x ∈ S, t ≥ 0,
(ii) p_{s,t} p_{t,u} = p_{s,u} for all 0 ≤ s ≤ t ≤ u.
Then there is a unique canonical Markov process (X_t, P_{(s,x)}) on S^{[0,∞)} with transition functions p_{s,t}.
Problems:
• regularity of paths t ↦ X_t. One can show: if S is locally compact and p_{s,t} is Feller, then X_t has a càdlàg modification (cf. Revuz, Yor [17]).
Definition 1.5. (Xt , P ) has the strong Markov property w.r.t. a stopping time T : Ω → [0, ∞] if
and only if
E[f(X_{T+s}) I_{{T<∞}} | F_T] = (p_{T,T+s} f)(X_T)
P-a.s. on {T < ∞} for all measurable f : S → R_+.
In the time-homogeneous case, we have
E[f(X_{T+s}) I_{{T<∞}} | F_T] = (p_s f)(X_T),
and, more generally, the conditional expectation of a measurable functional F of the path after T equals E_{X_T}[F(X)].
Proof. Exercise.
Definition 1.7.
Definition 1.8. A Markov process (X_t)_{t≥0} on (Ω, A, P) is called a pure jump process or continuous-time Markov chain if and only if
(t ↦ X_t) ∈ PC(R_+, S) P-a.s.
Aim: Construct a pure jump process with instantaneous jump rates qt (x, dy), i.e.
(X_t)_{t≥0} ↔ (Y_n, J_n)_{n≥0} ↔ (Y_n, S_n), where S_n are the holding times and J_n the jump times of X_t:
J_n = \sum_{i=1}^n S_i ∈ (0, ∞], the jump times {J_n : n ∈ N} form a point process on R_+, and
ζ = \sup J_n is the explosion time.
a) Time-homogeneous case: qt (x, dy) ≡ q(x, dy) independent of t, λt (x) ≡ λ(x), πt (x, dy) ≡
π(x, dy).
Yn (n = 0, 1, 2, . . .) Markov chain with transition kernel π(x, dy) and initial distribution µ
S_n := E_n / λ(Y_{n−1}), with E_n ∼ Exp(1) independent and identically distributed random variables, independent of (Y_n), i.e.
S_n | (Y_0, …, Y_{n−1}, E_1, …, E_{n−1}) ∼ Exp(λ(Y_{n−1})),
J_n = \sum_{i=1}^n S_i,
X_t := Y_n for t ∈ [J_n, J_{n+1}), n ≥ 0, and X_t := ∆ for t ≥ ζ = \sup J_n.
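The two-step construction above (jump chain plus exponential holding times) translates directly into a simulation algorithm. The following is a minimal sketch in Python for a finite state space; the rate matrix and all names are illustrative assumptions, not part of the text above.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_minimal_chain(Q, x0, t_max):
    """Sample a path of the minimal jump process: holding times
    S_n = E_n / lam(Y_{n-1}) with E_n ~ Exp(1) i.i.d., and jump chain
    Y_n with kernel pi(x, y) = q(x, y) / lam(x)."""
    Q = np.asarray(Q, dtype=float).copy()
    np.fill_diagonal(Q, 0.0)            # Q[x, y] = q(x, y) for y != x
    t, x = 0.0, x0
    jump_times, states = [0.0], [x0]
    while t < t_max:
        lam = Q[x].sum()                # lam(x) = total jump rate at x
        if lam == 0.0:                  # absorbing state: no further jumps
            break
        t += rng.exponential(1.0) / lam             # S_n = E_n / lam(Y_{n-1})
        x = int(rng.choice(len(Q), p=Q[x] / lam))   # Y_n ~ pi(Y_{n-1}, .)
        jump_times.append(t)
        states.append(x)
    return np.array(jump_times), np.array(states)

# Example: continuous-time nearest-neighbour walk on {0, 1, 2}
J, Y = simulate_minimal_chain([[0, 1, 0], [1, 0, 1], [0, 1, 0]], x0=0, t_max=10.0)
```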
Distribution at time t:
J_n = \sum_{i=1}^n S_i ∼ Γ(λ, n), hence
P[N_t ≥ n] = P[J_n ≤ t] = \int_0^t \frac{(λs)^{n−1}}{(n−1)!} λ e^{−λs} ds = e^{−tλ} \sum_{k=n}^∞ \frac{(tλ)^k}{k!}
(differentiate the right-hand side to verify), i.e.
N_t ∼ Poisson(λt).
[Figure: nearest-neighbour jumps on a one-dimensional lattice around the states x−1, x, x+1]
b) Time-inhomogeneous case:
Remark (Survival times). Suppose an event occurs in time interval [t, t + h] with probability
λt · h + o(h) provided it has not occurred before:
Simulation of T:
E ∼ Exp(1), T := inf{t ≥ 0 : \int_0^t λ_s ds ≥ E}
⇒ P[T > t] = P[\int_0^t λ_s ds < E] = e^{−\int_0^t λ_s ds}
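Numerically, T can be sampled exactly as described: draw E ∼ Exp(1) and invert the cumulative hazard \int_0^t λ_s ds. A minimal sketch (the rate function and the grid are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_survival_time(lam, t_grid):
    """Sample T = inf{t >= 0 : int_0^t lam_s ds >= E}, E ~ Exp(1),
    by inverting the cumulative hazard on a grid (trapezoidal rule)."""
    increments = np.diff(t_grid) * 0.5 * (lam(t_grid[:-1]) + lam(t_grid[1:]))
    hazard = np.concatenate([[0.0], np.cumsum(increments)])
    E = rng.exponential(1.0)
    if E > hazard[-1]:
        return np.inf                   # no event within the grid horizon
    i = np.searchsorted(hazard, E)      # first index with hazard[i] >= E
    return float(np.interp(E, hazard[i - 1:i + 1], t_grid[i - 1:i + 1]))

# Check P[T > 1] = exp(-int_0^1 (1 + s) ds) = exp(-1.5) ~ 0.223
t_grid = np.linspace(0.0, 10.0, 10_001)
samples = [sample_survival_time(lambda s: 1.0 + s, t_grid) for _ in range(20_000)]
print(np.mean(np.array(samples) > 1.0))
```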
J_0 := t_0, Y_0 ∼ µ, and
P_{(t_0,µ)}[J_1 > t | Y_0] = e^{−\int_{t_0}^{t∨t_0} λ_s(Y_0) ds}
for all t ≥ t0 , and (Yn−1 , Jn )n∈N is a time-homogeneous Markov chain on S × [0, ∞) with
transition law
P_{(t_0,µ)}[Y_n ∈ dy, J_{n+1} > t | Y_0, J_1, …, Y_{n−1}, J_n] = π_{J_n}(Y_{n−1}, dy) · e^{−\int_{J_n}^{t∨J_n} λ_s(y) ds},
i.e.
P_{(t_0,µ)}[Y_n ∈ A, J_{n+1} > t | Y_0, J_1, …, Y_{n−1}, J_n] = \int_A π_{J_n}(Y_{n−1}, dy) · e^{−\int_{J_n}^{t∨J_n} λ_s(y) ds}
P -a.s. for all A ∈ S , t ≥ 0
Claim:
1. f_{J_n}(t) = \frac{1}{(n−1)!} ( \int_0^t λ_s ds )^{n−1} λ_t e^{−\int_0^t λ_s ds}
2. N_t ∼ Poisson( \int_0^t λ_s ds )
Proof. 1. By induction:
f_{J_{n+1}}(t) = \int_0^t f_{J_{n+1}|J_n}(t|r) f_{J_n}(r) dr
= \int_0^t λ_t e^{−\int_r^t λ_s ds} \frac{1}{(n−1)!} ( \int_0^r λ_s ds )^{n−1} λ_r e^{−\int_0^r λ_s ds} dr = …
2. P [Nt ≥ n] = P [Jn ≤ t]
K_s := min{n : J_n > s}, the index of the first jump after time s. It is a stopping time with respect to
G_n = σ(E_1, …, E_n, Y_0, …, Y_{n−1}),
i.e.
P_{(t_0,µ)}[{J_{K_s} > t} ∩ {s < ζ} ∩ A] = E_{(t_0,µ)}[ e^{−\int_s^t λ_r(X_s) dr} ; A ∩ {s < ζ} ] for all A ∈ F_s.
Remark . The assertion is a restricted form of the Markov property in continuous time: The
conditional distribution with respect to P(t0 ,µ) of JKs given Fs coincides with the distribution of
J1 with respect to P(s,Xs ) .
Proof. A ∈ F_s implies A ∩ {K_s = n} ∈ σ(J_0, Y_0, …, J_{n−1}, Y_{n−1}) =: G̃_{n−1} (Exercise), so
P[{J_{K_s} > t} ∩ A ∩ {K_s = n}] = E[ P[J_n > t | G̃_{n−1}] ; A ∩ {K_s = n} ],
where on {K_s = n}, using Y_{n−1} = X_s,
P[J_n > t | G̃_{n−1}] = exp( −\int_{J_{n−1}}^t λ_r(Y_{n−1}) dr ) = exp( −\int_s^t λ_r(Y_{n−1}) dr ) · P[J_n > s | G̃_{n−1}],
hence we get
P[{J_{K_s} > t} ∩ A ∩ {K_s = n}] = E[ e^{−\int_s^t λ_r(X_s) dr} ; A ∩ {K_s = n} ∩ {J_n > s} ] for all n ∈ N.
x := Φ((t_n, y_n)_{n=0,1,2,…}) ∈ PC([t_0, ∞), S ∪̇ {∆})
by
x_t := y_n for t_n ≤ t < t_{n+1}, n ≥ 0, and x_t := ∆ for t ≥ \sup t_n.
Let
E_{(t_0,µ)}[F(X_{s:∞}) · I_{{s<ζ}} | F_s^X](ω) = E_{(s,X_s(ω))}[F(X_{s:∞})] P-a.s. on {s < ζ}
for all
F : PC([s, ∞), S ∪ {∆}) → R_+
measurable with respect to σ(x ↦ x_t | t ≥ s).
Proof. Xs:∞ = Φ(s, YKs −1 , JKs , YKs , JKs +1 , . . .) on {s < ζ} = {Ks < ∞}
i.e. the process after time s is constructed in the same way from s, YKs −1 , JKs , . . . as the orig-
inal process is constructed from t0 , Y0 , J1 , . . .. By the Strong Markov property for the chain
(Yn−1 , Jn ),
E_{(t_0,µ)}[F(X_{s:∞}) · I_{{s<ζ}} | G_{K_s}]
= E_{(t_0,µ)}[F ∘ Φ(s, Y_{K_s−1}, J_{K_s}, …) · I_{{K_s<∞}} | G_{K_s}]
= E^{chain}_{(Y_{K_s−1}, J_{K_s})}[F ∘ Φ(s, (Y_0, J_1), (Y_1, J_2), …)] a.s. on {K_s < ∞} = {s < ζ},
taking into account that the conditional distribution given GKs is 0 on {s ≥ ζ} and that YKs −1 =
Xs .
Here, by Lemma 1.10, the conditional distribution of J_{K_s} is k(X_s, ·), where
k(x, dt) = λ_t(x) · e^{−\int_s^t λ_r(x) dr} · I_{(s,∞)}(t) dt,
hence
Corollary 1.12. A non-homogeneous Poisson process (Nt )t≥0 with intensity λt has independent
increments with distribution
N_t − N_s ∼ Poisson( \int_s^t λ_r dr )
Proof. By the Markov property,
P[N_t − N_s ≥ k | F_s^N] = P_{(s,N_s)}[N_t − N_s ≥ k] = P_{(s,N_s)}[J_k ≤ t]
= Poisson( \int_s^t λ_r dr )({k, k+1, …}) as above.
Hence N_t − N_s is independent of F_s^N and Poisson( \int_s^t λ_r dr )-distributed.
Recall that the total variation norm of a signed measure µ on (S, S) is given by
‖µ‖_{TV} = µ^+(S) + µ^−(S) = \sup_{|f|≤1} \int f dµ
Theorem 1.13. 1. Under P_{(t_0,µ)}, (X_t)_{t≥t_0} is a Markov jump process with initial distribution X_{t_0} ∼ µ and transition probabilities
p_{s,t}(x, B) = e^{−\int_s^t λ_r(x) dr} δ_x(B) + \int_s^t e^{−\int_s^r λ_u(x) du} (q_r p_{r,t})(x, B) dr    (1.1)
for all s ≥ 0, x ∈ S, and bounded functions f : S → R such that t ↦ (q_t f)(x) is continuous.
Remark . 1. (1.2) shows that (Xt ) is the continuous time Markov chain with intensities λt (x)
and transition rates qt (x, dy).
2. If ζ = \sup J_n is finite with strictly positive probability, then there are other possible continuations of X_t after the explosion time ζ, i.e. non-uniqueness holds. The constructed process is called the minimal chain for the given jump rates, since its transition probabilities p_t(x, B), B ∈ S, are minimal among all continuations, cf. below.
(p_{s,t} f)(x) = e^{−\int_s^t λ_r(x) dr} f(x) + \int_s^t e^{−\int_s^r λ_u(x) du} (q_r p_{r,t} f)(x) dr    (1.3)
and hence
P_{(s,x)}[X_t ∈ B | G̃_1](ω) = δ_x(B) · I_{{t<J_1}}(ω) + P_{(J_1(ω),Y_1(ω))}[X_t ∈ B] · I_{{t≥J_1}}(ω),
so that, taking expectations,
P_{(s,x)}[X_t ∈ B] = δ_x(B) · e^{−\int_s^t λ_r(x) dr} + \int_s^t e^{−\int_s^r λ_u(x) du} (q_r p_{r,t})(x, B) dr
for all 0 ≤ r ≤ t and x ∈ S. Hence if r 7→ λr (x) is continuous (and locally bounded) for
all x ∈ S, then
(p_{r,t} f)(x) → f(x)    (1.4)
as r, t ↓ s for all x ∈ S. Thus by dominated convergence,
(q_r p_{r,t} f)(x) → (q_s f)(x)
as r, t ↓ s, provided r ↦ (q_r f)(x) is continuous. The assertion now follows from (1.3).
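For a finite state space with bounded, time-homogeneous rates, p_{s,t} = e^{(t−s)L} with L = q − λ·I, and the integrated backward equation (1.3) can be verified numerically. A sketch (the rate matrix and grid sizes are illustrative assumptions):

```python
import numpy as np
from scipy.linalg import expm

Q = np.array([[0.0, 2.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 3.0, 0.0]])             # off-diagonal jump rates q(x, y)
lam = Q.sum(axis=1)                         # intensities lam(x)
L = Q - np.diag(lam)                        # (L f)(x) = sum_y q(x,y)(f(y) - f(x))

t = 0.7
p_t = expm(t * L)                           # time-homogeneous case: p_{0,t} = e^{tL}

# Right-hand side of (1.1)/(1.3):
# e^{-lam(x) t} delta_x + int_0^t e^{-lam(x) r} (q p_{r,t})(x, .) dr
r_grid = np.linspace(0.0, t, 501)
integrand = np.array([np.exp(-lam[:, None] * r) * (Q @ expm((t - r) * L))
                      for r in r_grid])
dr = r_grid[1] - r_grid[0]
weights = np.full(len(r_grid), dr); weights[0] = weights[-1] = dr / 2
rhs = np.diag(np.exp(-lam * t)) + np.tensordot(weights, integrand, axes=(0, 0))
print(np.max(np.abs(p_t - rhs)))            # ~ 0 up to quadrature error
```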
Proof. Using λ_s ≤ λ̄,
J_n = inf{t ≥ 0 : \int_{J_{n−1}}^t λ_s(Y_{n−1}) ds ≥ E_n} ≥ J_{n−1} + λ̄^{−1} E_n,
and therefore
ζ = \sup J_n ≥ λ̄^{−1} \sum_{n=1}^∞ E_n = ∞ a.s.
For example, for the continuous-time simple random walk on Z^d,
(L_t f)(x) = \frac{1}{2d} \sum_{i=1}^d (f(x + e_i) + f(x − e_i) − 2f(x)) = \frac{1}{2d} (∆_{Z^d} f)(x)
−→ 0 as r → s for all x ∈ S
by dominated convergence, because sup |pr,t f | ≤ sup |f |. Hence the integrand in (1.3) is
continuous in r at s, and so there exists
−\frac{∂}{∂s}(p_{s,t} f)(x) = −λ_s(x)(p_{s,t} f)(x) + (q_s p_{s,t} f)(x) = (L_s p_{s,t} f)(x)
2. Minimality:
Let (x, B) 7→ p̃s,t (x, B) be an arbitrary non-negative solution of (1.5). Then
−\frac{∂}{∂r} p̃_{r,t}(x, B) = (q_r p̃_{r,t})(x, B) − λ_r(x) p̃_{r,t}(x, B)
⇒ −\frac{∂}{∂r} [ e^{−\int_s^r λ_u(x) du} p̃_{r,t}(x, B) ] = e^{−\int_s^r λ_u(x) du} (q_r p̃_{r,t})(x, B)
⇒ (integrating from s to t)
p̃_{s,t}(x, B) − e^{−\int_s^t λ_u(x) du} δ_x(B) = \int_s^t e^{−\int_s^r λ_u(x) du} (q_r p̃_{r,t})(x, B) dr
(integrated backward equation).
Claim:
p̃_{s,t}(x, B) ≥ p_{s,t}(x, B) = P_{(s,x)}[X_t ∈ B] for all x ∈ S, B ∈ S.
It suffices to show
P_{(s,x)}[X_t ∈ B, t < J_n] ≤ p̃_{s,t}(x, B) for all n ∈ N.
n = 0: trivial. n → n + 1: by first step analysis, using the induction hypothesis in the integrated backward equation.
Remark. 1. (1.5) describes the backward evolution of the expectation values E_{(s,x)}[f(X_t)], respectively of the probabilities P_{(s,x)}[X_t ∈ B], as the starting time s varies.
[Figure: birth-death chain on {0, 1, 2, ...} with birth rates b(x) and death rates d(x)]
\frac{d}{dt} p_t(x, z) = \sum_{|x−y|=1} q(x, y) (p_t(y, z) − p_t(x, z))
The particles in a population die with rate d > 0 and divide into two particles with rate b > 0, independently of each other.
Let
η(t) := P_1[X_t = 0] = p_t(1, 0)
denote the extinction probability. Equation (1.5) gives
η'(t) = d · p_t(0, 0) − (b + d) p_t(1, 0) + b · p_t(2, 0) = d − (b + d)η(t) + bη(t)²,   η(0) = 0,
where p_t(2, 0) = η(t)² by the independence of the two offspring lines.
Hence we get
P_1[X_t ≠ 0] = 1 − η(t) = \frac{1}{1+bt} if b = d, and = \frac{b−d}{b − d·e^{t(d−b)}} if b ≠ d,
i.e.
• exponential decay if d > b,
• polynomial decay if d = b (critical case),
• a strictly positive survival probability if d < b.
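The closed-form survival probability can be checked against the Riccati equation above. A minimal numerical sketch (the parameter values are illustrative):

```python
import numpy as np

def survival_prob(b, d, t):
    """P_1[X_t != 0] from the formula above."""
    if b == d:
        return 1.0 / (1.0 + b * t)
    return (b - d) / (b - d * np.exp(t * (d - b)))

# Compare with an Euler integration of eta' = d - (b+d)*eta + b*eta^2, eta(0) = 0
b, d, T, n = 2.0, 1.0, 5.0, 100_000
eta, dt = 0.0, T / n
for _ in range(n):
    eta += dt * (d - (b + d) * eta + b * eta ** 2)
print(1.0 - eta, survival_prob(b, d, T))    # both ~ 0.502
```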
Proof. 1. Strong continuity: Fix t_0 > 0. Note that ‖q_r g‖_sup ≤ λ̄_{t_0} ‖g‖_sup for all 0 ≤ r ≤ t_0.
Hence by the assumption and the integrated backward equation (1.3),
kps,t f − ps,r f ksup = kps,r (pr,t f − f )ksup
≤kpr,t f − f ksup ≤ ε(t − r) · kf ksup
for all 0 ≤ s ≤ r ≤ t ≤ t0 and some function ε : R+ → R+ with limh↓0 ε(h) = 0.
2. Differentiability: By 1.) and the assumption,
(r, u, x) 7→ (qr pr,u f )(x)
is uniformly bounded for 0 ≤ r ≤ u ≤ t0 and x ∈ S, and
q_r p_{r,u} f = q_r(p_{r,u} f − f) + q_r f → q_t f, since q_r(p_{r,u} f − f) → 0 uniformly,
pointwise as r, u −→ t. Hence by the integrated backward equation (1.3) and the continuity
of t 7→ λt (x),
\frac{p_{t,t+h} f(x) − f(x)}{h} → −λ_t(x) f(x) + q_t f(x) = L_t f(x) as h ↓ 0
for all x ∈ S, and the difference quotients are uniformly bounded.
Dominated convergence now implies
\frac{p_{s,t+h} f − p_{s,t} f}{h} = p_{s,t} \frac{p_{t,t+h} f − f}{h} → p_{s,t} L_t f
pointwise as h ↓ 0. A similar argument shows that also
\frac{p_{s,t} f − p_{s,t−h} f}{h} = p_{s,t−h} \frac{p_{t−h,t} f − f}{h} → p_{s,t} L_t f
pointwise.
Notation:
⟨µ, f⟩ := µ(f) = \int f dµ
Proof.
⟨µ_t, f⟩ = ⟨µ p_{s,t}, f⟩ = \int µ(dx) \int p_{s,t}(x, dy) f(y) = ⟨µ, p_{s,t} f⟩,
hence we get the assertion.
Remark (Important!). In the explosive case it may happen that
⟨µ_t, 1⟩ < 1 = ⟨µ, 1⟩ + \int_0^t ⟨µ_s, L_s 1⟩ ds,
although L_s 1 = 0, i.e. the forward equation may fail.
The question whether one can extend the forward equation to unbounded jump rates leads to the
martingale problem.
Now we consider again the minimal jump process (X_t, P_{(t_0,µ)}) constructed above. For a function
f : [0, ∞) × S → R we write f_t(x) := f(t, x).
Theorem 1.20 (Time-dependent martingale problem). Suppose that t 7→ λt (x) is continuous for
all x. Then:
1. The process
M_t^f := f_t(X_t) − \int_{t_0}^t ( \frac{∂}{∂r} + L_r ) f_r(X_r) dr, t ≥ t_0,
is a local (F_t^X)-martingale up to ζ with respect to P_{(t_0,µ)} for any locally bounded function f : R_+ × S → R such that t ↦ f_t(x) is C¹ for all x, (t, x) ↦ \frac{∂}{∂t} f_t(x) is locally bounded, and r ↦ (q_r f_t)(x) is continuous at r = t for all t, x.
2. If λ̄_t < ∞ for all t, and f and \frac{∂}{∂t} f are bounded functions, then M^f is a global martingale.
3. More generally, if the process is non-explosive, then M^f is a global martingale provided
\sup_{x∈S, t_0≤s≤t} ( |f_s(x)| + |\frac{∂}{∂s} f_s(x)| + |(L_s f_s)(x)| ) < ∞    (1.8)
for all t ≥ t_0.
2. In general: if h is space-time harmonic, i.e. \frac{∂}{∂t} h_t + L_t h_t = 0, then h_t(X_t) is a martingale. In particular, (p_{s,t} f)(X_s), s ≤ t, is a martingale for every bounded function f and fixed t.
Proof of theorem. 2. Similarly to the derivation of the forward equation, one shows that the
assumption implies
\frac{∂}{∂t}(p_{s,t} f_t)(x) = (p_{s,t} L_t f_t)(x) + ( p_{s,t} \frac{∂ f_t}{∂t} )(x) for all x ∈ S,
hence
p_{s,t} f_t = f_s + \int_s^t p_{s,r} ( \frac{∂}{∂r} + L_r ) f_r dr
1. For k ∈ N let
q_t^{(k)}(x, B) := (λ_t(x) ∧ k) · π_t(x, B)
denote the jump rates of the process X_t^{(k)} with the same transition probabilities as X_t and jump rates cut off at k. By the construction above, the processes X_t^{(k)}, k ∈ N, and X_t can be realized on the same probability space in such a way that
X_t^{(k)} = X_t a.s. on {t < T_k},
where
T_k := inf{t ≥ 0 : λ_t(X_t) ≥ k or X_t ∉ B_k}
for an increasing sequence B_k of open subsets of S such that f and \frac{∂}{∂t} f are bounded on [0, t] × B_k for all t, k, and S = ⋃ B_k. Since t ↦ λ_t(X_t) is piecewise continuous and the jumps do not accumulate before ζ, the function is locally bounded on [0, ζ). Hence
T_k ↗ ζ a.s. as k → ∞
M_t^{f,k} = f_t(X_t^{(k)}) − \int_{t_0}^t ( \frac{∂}{∂r} + L_r^{(k)} ) f_r(X_r^{(k)}) dr, t ≥ t_0,
is a martingale with respect to P(t0 ,µ) , which coincides a.s. with Mtf for t < Tk . Hence Mtf
is a local martingale up to ζ = sup Tk .
3. If ζ = sup Tk = ∞ a.s. and f satisfies (1.8), then (Mtf )t≥0 is a bounded local martingale,
and hence, by dominated convergence, a martingale.
(iii) \frac{∂}{∂t} ϕ_t(x) + L_t ϕ_t(x) ≤ 0 (space-time superharmonic).
Then the minimal Markov jump process constructed above is non-explosive.
for all t ≥ 0.
(iii) L_t ψ ≤ αψ for all t ≥ 0,
then the theorem applies with ϕ_t(x) = e^{−αt} ψ(x):
\frac{∂}{∂t} ϕ_t + L_t ϕ_t ≤ −αϕ_t + αϕ_t = 0.
This is a standard criterion in the time-homogeneous case!
2. If S is a locally compact connected metric space and the intensities λ_t(x) depend continuously on t and x, then we can choose the sets
Bn = {x ∈ S | d(x0 , x) < n}
as the balls around a fixed point x0 ∈ S, and condition (ii) above then means that
lim ψ(x) = ∞
d(x,x0 )→∞
Suppose a population consists initially (t = 0) of one particle, and particles die with time-
dependent rates dt > 0 and divide into two with rates bt > 0 where d, b : R+ → R+ are continu-
ous functions, and b is bounded. Then the total number Xt of particles at time t is a birth-death
process with rates
q_t(n, m) = n·b_t if m = n + 1; n·d_t if m = n − 1; 0 otherwise;   λ_t(n) = n·(b_t + d_t)
The generator is the tridiagonal matrix
L_t =
( 0      0            0            0            0    …
  d_t    −(d_t+b_t)   b_t          0            0    …
  0      2d_t         −2(d_t+b_t)  2b_t         0    …
  0      0            3d_t         −3(d_t+b_t)  3b_t …
  ⋮                                                  ⋱ )
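A truncated version of this generator is easy to build explicitly. The following sketch reproduces the tridiagonal structure above for fixed t; the truncation level n_max is an artificial assumption, needed only to obtain a finite matrix:

```python
import numpy as np

def branching_generator(b_t, d_t, n_max):
    """Generator of the binary branching process at a fixed time t,
    truncated to {0, ..., n_max}: q_t(n, n+1) = n*b_t, q_t(n, n-1) = n*d_t,
    lam_t(n) = n*(b_t + d_t). The truncation is for illustration only."""
    L = np.zeros((n_max + 1, n_max + 1))
    for n in range(1, n_max + 1):
        if n < n_max:
            L[n, n + 1] = n * b_t       # one of n particles divides
        L[n, n - 1] = n * d_t           # one of n particles dies
        L[n, n] = -L[n].sum()
    return L

print(branching_generator(b_t=1.0, d_t=0.5, n_max=4))
```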
Since the rates are unbounded, we have to test for explosion. We choose ψ(n) = n as a Lyapunov function. Then
Since the individual birth rates bt , t ≥ 0, are bounded, the process is non-explosive. To study
long-time survival of the population, we consider the generating functions
G_t(s) = E[s^{X_t}] = \sum_{n=0}^∞ s^n P[X_t = n], 0 < s ≤ 1.
Since the process is non-explosive and fs and Lt fs are bounded on finite time-intervals, the
forward equation holds. We obtain
\frac{∂}{∂t} G_t(s) = \frac{∂}{∂t} E[f_s(X_t)] = E[(L_t f_s)(X_t)]
= (b_t s² − (b_t + d_t)s + d_t) · \frac{∂}{∂s} E[s^{X_t}]
= (b_t s − d_t)(s − 1) · \frac{∂}{∂s} G_t(s),
G_0(s) = E[s^{X_0}] = s
The solution of this first-order partial differential equation for s < 1 is
G_t(s) = 1 − ( \frac{e^{ρ_t}}{1−s} + \int_0^t b_u e^{ρ_u} du )^{−1}
where
ρ_t := \int_0^t (d_u − b_u) du
is the accumulated death rate. In particular, we obtain an explicit formula for the extinction
probability:
P[X_t = 0] = \lim_{s↓0} G_t(s) = 1 − ( e^{ρ_t} + \int_0^t b_u e^{ρ_u} du )^{−1} = 1 − ( 1 + \int_0^t d_u e^{ρ_u} du )^{−1},
where the last equality holds since both bracketed expressions equal 1 at t = 0 and have the same derivative d_t e^{ρ_t}.
Theorem 1.23.
P[X_t = 0 eventually] = 1 ⟺ \int_0^∞ d_u e^{ρ_u} du = ∞
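The criterion can be explored numerically with the explicit formula for P[X_t = 0]. In the following sketch the rate functions are illustrative assumptions, chosen so that \int_0^∞ d_u e^{ρ_u} du < ∞ and the extinction probabilities stay bounded away from 1:

```python
import numpy as np
from scipy.integrate import quad

b = lambda u: 1.0                       # bounded birth rate, as assumed above
d = lambda u: 1.0 / (1.0 + u)           # decaying death rate
rho = lambda t: quad(lambda u: d(u) - b(u), 0.0, t)[0]

def extinction_prob(t):
    """P[X_t = 0] = 1 - (1 + int_0^t d_u e^{rho_u} du)^{-1}."""
    I = quad(lambda u: d(u) * np.exp(rho(u)), 0.0, t)[0]
    return 1.0 - 1.0 / (1.0 + I)

for t in (1.0, 10.0, 50.0):
    print(t, extinction_prob(t))
# Here d_u * e^{rho_u} = e^{-u}, so the integral converges to 1 and
# P[X_t = 0 eventually] = 1/2 < 1: survival with positive probability.
```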
Remark. Informally, the mean and the variance of X_t can be computed by differentiating G_t at s = 1:
\frac{d}{ds} E[s^{X_t}] |_{s=1} = E[X_t s^{X_t−1}] |_{s=1} = E[X_t],
\frac{d²}{ds²} E[s^{X_t}] |_{s=1} = E[X_t(X_t − 1) s^{X_t−2}] |_{s=1} = E[X_t(X_t − 1)],
from which Var(X_t) = E[X_t(X_t − 1)] + E[X_t] − E[X_t]².
since
\frac{1}{α} e^{−αρ} = \int_ρ^∞ e^{−αt} dt = \int_0^∞ e^{−αt} · I_{{t≥ρ}} dt
Theorem 1.24 (Necessary and sufficient condition for non-explosion, Reuter's criterion).
1. F_α is the maximal solution of
L g = αg    (1.10)
satisfying 0 ≤ g ≤ 1.
2. The minimal Markov jump process is non-explosive if and only if (1.10) has only the trivial solution g ≡ 0 satisfying 0 ≤ g ≤ 1.
i.e.
L = λ · (π − I) (1.11)
T := inf {t ≥ 0 | Xt ∈ Dc } , D ⊆ S open
Theorem 1.25 (Dirichlet and Poisson problem). For any measurable functions c : D → R_+ and f : D^c → R_+,
u(x) := E_x[ \int_0^T c(X_t) dt + f(X_T) · I_{{T<ζ}} ]
is the minimal non-negative solution of
−L u = c on D, u = f on D^c.    (1.12)
Proof. 1. For c ≡ 0 this follows from the corresponding result in discrete time. In fact, the
exit points from D of Xt and Yt coincide, and hence
where τ = inf{n ≥ 0 | Yn ∈
/ D}. Therefore u is the minimal non-negative solution of
πu = u on D
u=f on Dc
2. In the general case the assertion can be proven by first step analysis (Exercise).
2. Distribution of XT :
u(x) = Ex [f (XT ) ; T < ζ] solves L u = 0 on D, u = f on Dc .
3. Mean exit time:
u(x) = Ex [T ] solves −L u = 1 on D, u = 0 on Dc .
4. Mean occupation time of A before exit from D:
[Figure: a measurable subset A of the domain D]
A ⊆ D measurable
u(x) = E_x[ \int_0^T I_A(X_t) dt ] = \int_0^∞ P_x[X_t ∈ A, t < T] dt
(i) x ⇝ y for the process (X_t),
(ii) x ⇝ y for the jump chain (Y_n),
(iii) there exist k ∈ N and x_1, …, x_k ∈ S such that p_t(x, x_1), p_t(x_1, x_2), …, p_t(x_k, y) > 0
for all t > 0. Hence with (iii) and the semigroup property,
p_t(x, y) ≥ p_{t/(k+1)}(x, x_1) p_{t/(k+1)}(x_1, x_2) ⋯ p_{t/(k+1)}(x_k, y) > 0
and hence
ζ = \sum_{i=1}^∞ (J_i − J_{i−1}) = ∞ P_x-a.s.,
since the holding times (J_i − J_{i−1}) are conditionally independent given (Y_n) and are Exp(λ(x))-distributed infinitely often.
Since λ > 0, the process (Xt )t≥0 does not get stuck, and hence visits the same states as the
jump chain (Yn )n≥0 . Thus Xt is recurrent. Similarly, the converse implication holds.
If x is transient for (X_t), then it is transient for (Y_n), since otherwise it would be recurrent by the dichotomy in discrete time. Finally, if x is transient for (Y_n), then it is transient for (X_t), since the process spends only a finite amount of time in each state.
Let
Tx := inf{t ≥ J1 : Xt = x}
denote the first passage time of x.
Proof. Under the assumption λ > 0, the first two assertions follow from the corresponding result in discrete time. If λ(x) = 0 for some x, we can apply the same arguments if we construct the process (X_t) from a jump chain which is absorbed at each x ∈ S with λ(x) = 0.
3. If λ(x) > 0 then by the discrete time result x is recurrent if and only if
Px [Yn = x for some n ≥ 1] = 1,
i.e., if and only if
P_x[T_x < ∞] = 1.
Moreover, the Green function of (Xt ) can be computed from the Green function of the jump
chain (Yn ):
\int_0^∞ p_t(x, x) dt = E_x[ \int_0^∞ I_{{x}}(X_t) dt ]
= E_x[ \sum_{n=0}^∞ (J_{n+1} − J_n) I_{{x}}(Y_n) ]
= \sum_{n=0}^∞ E[J_{n+1} − J_n | Y_n = x] · P_x[Y_n = x]    (J_{n+1} − J_n ∼ Exp(λ(x)) given Y_n = x)
= \frac{1}{λ(x)} \sum_{n=0}^∞ π^n(x, x),
where \sum_n π^n(x, x) is the discrete-time Green function.
Hence
G(x, x) = ∞ ⇔ λ(x) = 0 or x recurrent for Yn ⇔ x recurrent for Xt
Proof. Either directly from the strong Markov property for the jump chain (Exercise). A more
general proof that applies to other continuous time Markov processes as well will be given in the
next chapter.
Definition 1.29. A positive measure µ on S is called stationary (or invariant) with respect to (p_t)_{t≥0} if and only if
µ p_t = µ
for all t ≥ 0.
Theorem 1.30 (Existence and uniqueness of stationary measures). Suppose that x ∈ S is recurrent. Then:
1. µ(B) := E_x[ \int_0^{T_x} I_B(X_t) dt ], B ⊆ S, is a stationary measure. If x is positive recurrent, then
µ̄(B) = \frac{µ(B)}{µ(S)}
is a stationary probability distribution.
2. If (X_t) is irreducible, then the stationary measure is unique up to a multiplicative constant.
Proof. 1. Let τ_x := inf{n ≥ 1 : Y_n = x} denote the first return time of the jump chain. Then
µ(y) = E_x[ \int_0^{T_x} I_{{y}}(X_t) dt ] = E_x[ \sum_{n=0}^{τ_x−1} (J_{n+1} − J_n) I_{{y}}(Y_n) ] = \frac{1}{λ(y)} E_x[ \sum_{n=0}^{τ_x−1} I_{{y}}(Y_n) ].
We have shown that µ is a stationary measure. If x is positive recurrent then µ(S) is finite,
and hence µ can be normalized to a stationary probability distribution.
2. If (Xt ) is irreducible then the skeleton chain (Xn )n=0,1,2,... is a discrete-time Markov chain
with transition kernel
p1 (x, y) > 0 ∀ x, y ∈ S
Hence (Xn) is irreducible. If we can show that (Xn) is recurrent, then by the discrete-time
theory, (Xn ) has at most one invariant measure (up to a multiplicative factor), and thus the
same holds for (Xt ). Since x is recurrent for (Xt ), the jump chain (Yn ) visits x infinitely
often with probability 1. Let K1 < K2 < · · · denote the successive visit times. Then
In fact, the holding times JKi+1 − JKi , i ∈ N, are conditionally independent given (Yn )
with distribution Exp(λ(x)). Hence
which implies (1.14) by (1.13). The recurrence of (X_n) follows from (1.14) by irreducibility.
µpt = µ ∀ t ≥ 0 ⇔ µL = 0
In the general case this infinitesimal characterization of stationary measures does not always hold,
cf. the example below. However, as a consequence of the theorem we obtain:
1. If µ is a stationary distribution of (pt )t≥0 then all states are positive recurrent, and
Proof. A stationary distribution µ of (Xt ) is also stationary for the skeleton chain (Xn )n=0,1,2,... ,
which is irreducible as noted above. Therefore, the skeleton chain and thus (Xt ) are positive
recurrent. Now the theorem and the remark above imply that in the recurrent case, a measure µ
is stationary if and only if λ · µ is stationary for the jump chain (Yn ), i.e.
(µq)(y) = \sum_{x∈S} λ(x)µ(x)π(x, y) = λ(y)µ(y),    (1.16)
which is equivalent to (µL)(y) = 0.
In particular, if \sum_x λ(x)µ(x) < ∞ and µL = 0, then the normalization of λµ is a stationary distribution for (Y_n), whence (Y_n) and thus (X_t) are positive recurrent.
Example . We consider the minimal Markov jump process with jump chain Yn = Y0 + n and
intensities λ(x) = 1 + x2 . Since ν(y) ≡ 1 is a stationary measure for (Yn ), i.e. νπ = ν, we see
that
ν(y) 1
µ(y) := =
λ(y) 1 + y2
is a finite measure with (µL )(y) = 0 for all y. However, Xt is not recurrent (since Yn is
transient), and hence µ is not stationary for Xt !
Theorem 1.32. If (Xt ) is irreducible and non-explosive, and µ ∈ M1 (S) satisfies (1.16), then µ
is a stationary distribution.
for all y ∈ S.
For a birth-death process on {0, 1, 2, . . .} with strictly positive birth rates b(x) and death rates
d(x) the detailed balance condition is
Suppose that
\sum_{n=0}^∞ ν(n) < ∞.    (1.19)
Then
µ(x) := \frac{ν(x)}{\sum_{y=0}^∞ ν(y)}
is a probability distribution satisfying (1.17), and hence (1.16). By irreducibility, µ is the unique
stationary probability distribution provided the process is non-explosive. The example above
shows that explosion may occur even when (1.19) holds.
Then:
1. The minimal birth-death process is non-explosive, and µ is the unique stationary probability distribution.
2. The mean hitting times satisfy
(a) E_x[T_y] = \sum_{n=x}^{y−1} \frac{µ({0,…,n})}{µ(n) · b(n)} for all 0 ≤ x ≤ y,
(b) E_x[T_y] = \sum_{n=y+1}^{x} \frac{µ({n, n+1, …})}{µ(n) · d(n)} for all 0 ≤ y ≤ x, respectively, and
(c) E_x[T_y] + E_y[T_x] = \sum_{n=x}^{y−1} \frac{1}{µ(n) · b(n)} for all 0 ≤ x < y.
Proof. 1. Reuter's criterion implies that the process is non-explosive if and only if
\sum_{n=0}^∞ \frac{ν({0, …, n})}{ν(n) b(n)} = ∞.
For (a), the function u(n) := E_n[T_y] solves (L u)(n) = −1 for 0 ≤ n < y with u(y) = 0. In terms of u'(n) := u(n+1) − u(n), this reads
b(0)u'(0) = −1, b(n)u'(n) − d(n)u'(n−1) = −1
for all 1 ≤ n < y. By the detailed balance condition (1.18) the unique solution of this equation is given by
u'(n) = −\sum_{k=0}^n \frac{1}{b(k)} \prod_{l=k+1}^n \frac{d(l)}{b(l)} = −\sum_{k=0}^n \frac{µ(k)}{µ(n)b(n)} for all 0 ≤ n < y.
Assertion (a) now follows by summing over n and taking into account the boundary condition u(y) = 0. The proof of (b) is similar, and (c) follows from (a) and (b), since
\frac{µ({0,…,n})}{µ(n)b(n)} + \frac{µ({n+1, n+2, …})}{µ(n+1)d(n+1)} = \frac{1}{µ(n)b(n)}
by (1.18).
Remark . Since µ(n) · b(n) is the flow through the edge {n, n + 1}, the right hand side of (c)
can be interpreted as the effective resistance between x and y of the corresponding electrical
network. With this interpretation, the formula carries over to Markov chains on general graphs
and the corresponding electrical networks, cf. Aldous, Fill [1].
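Formula (a) is directly computable. A minimal sketch for an M/M/1-type birth-death chain (the rates are an illustrative assumption):

```python
import numpy as np

def mean_hitting_time_up(b, mu, x, y):
    """E_x[T_y] for 0 <= x <= y via formula (a):
    sum_{n=x}^{y-1} mu({0,...,n}) / (mu(n) * b(n))."""
    cum = np.cumsum(mu)
    return sum(cum[n] / (mu[n] * b[n]) for n in range(x, y))

N = 60
b = np.ones(N)                          # b(n) = 1
d = 2.0 * np.ones(N)                    # d(n) = 2: drift towards 0
mu = np.ones(N)
for n in range(1, N):
    mu[n] = mu[n - 1] * b[n - 1] / d[n] # detailed balance (1.18)
print(mean_hitting_time_up(b, mu, x=0, y=5))   # = sum_{n=0}^{4} (2^{n+1} - 1) = 57
```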
Theorem 1.34 (Ergodic Theorem). Suppose that (Xt , Px ) is irreducible and has stationary prob-
ability distribution µ̄. Then
\frac{1}{t} \int_0^t f(X_s) ds → \int f dµ̄
Pν -a.s. as t → ∞ for any non-negative function f : S → R and any initial distribution ν ∈
M1 (S).
Remark. 1. In particular,
µ̄(B) = \lim_{t→∞} \frac{1}{t} \int_0^t I_B(X_s) ds    P_ν-a.s.,
i.e. µ̄(B) is the asymptotic fraction of time spent in B.
Proof. Similar to the discrete time case. Fix x ∈ S and define recursively the successive leaving
and visit times of the state x:
T̃^0 = inf{t ≥ 0 : X_t ≠ x},
T^n = inf{t ≥ T̃^{n−1} : X_t = x}   (visit times of x),
T̃^n = inf{t ≥ T^n : X_t ≠ x}   (leaving times of x).
We have
\int_{T^1}^{T^n} f(X_s) ds = \sum_{k=1}^{n−1} Y_k
where
Y_k := \int_{T^k}^{T^{k+1}} f(X_s) ds = \int_0^{T^{k+1} − T^k} f(X_{s+T^k}) ds
Note that T k+1 (X) = T k (X) + T 1 (XT k +• ). Hence by the strong Markov property the random
variables are independent and identically distributed with expectation
E_ν[Y_k] = E_ν[E_ν[Y_k | F_{T^k}]] = E_x[ \int_0^{T_x} f(X_s) ds ] = \int f dµ_x
\frac{1}{n} \int_0^{T^n} f(X_s) ds → \int f dµ_x    P_ν-a.s. as n → ∞    (1.21)
In particular,
T^n / n → µ_x(S)    P_ν-a.s.
By irreducibility, the stationary measure is unique up to a multiplicative factor. Hence µ_x(S) < ∞ and µ̄ = µ_x / µ_x(S). Thus we obtain
\int f dµ̄ = \frac{\int f dµ_x}{µ_x(S)} = \lim_{n→∞} \frac{n}{T^{n+1}} · \frac{1}{n} \int_0^{T^n} f(X_s) ds ≤ \liminf_{t→∞} \frac{1}{t} \int_0^t f(X_s) ds
≤ \limsup_{t→∞} \frac{1}{t} \int_0^t f(X_s) ds ≤ \limsup_{n→∞} \frac{n}{T^n} · \frac{1}{n} \int_0^{T^{n+1}} f(X_s) ds = \int f dµ̄,
i.e.
\frac{1}{t} \int_0^t f(X_s) ds → \int f dµ̄    P_ν-a.s.
Chapter 2
Interacting particle systems
[Figure: a lattice configuration with sites marked "particle" or "no particle"]
i.e.
q(η, ξ) = c_i(x, η) if ξ = η^{x,i}, and q(η, ξ) = 0 otherwise,
where
η^{x,i}(y) = η(y) for y ≠ x, and η^{x,i}(y) = i for y = x.
Example. 1. Contact process (spread of a plant species, of an infection, ...): T = {0, 1}. Each particle dies with rate d > 0 and produces a descendant at any neighbouring site (if not occupied) with rate b > 0:
c0 (x, η) = d
c1 (x, η) = b · N1 (x, η); N1 (x, η) := |{y ∼ x : η(y) = 1}|
Spatial branching process with exclusion rule (only one particle per site).
2. Voter model: η(x) is the opinion of the voter at x,
c_i(x, η) = N_i(x, η) := |{y ∼ x : η(y) = i}|,
changes opinion to i with rate equal to number of neighbors with opinion i.
3. Ising model with Glauber (spin flip) dynamics: T = {−1, 1}, β > 0 inverse tempera-
ture.
(a) Metropolis dynamics:
∆(x, η) := \sum_{y∼x} η(y) = N_1(x, η) − N_{−1}(x, η)   (total magnetization of the neighbourhood of x)
In the rest of this section we will assume that the vertex set V is finite. In this case, the config-
uration space S = T V is finite-dimensional. If, moreover, the type space T is also finite then S
itself is a finite graph with respect to the Hamming distance
Hence a continuous-time Markov chain (η_t, P_x) can be constructed as above from the jump rates q_t(η, ξ). The process is non-explosive, and the asymptotic results from the last section apply. In particular, if irreducibility holds, then there exists a unique stationary probability distribution, and the ergodic theorem applies.
with Hamiltonian
H(η) = \frac{1}{2} \sum_{{x,y}∈E} (η(x) − η(y))² = −\sum_{{x,y}∈E} η(x)η(y) + |E|
µβ (η)q(η, ξ) = µβ (ξ)q(ξ, η) ∀ ξ, η ∈ S.
Moreover, irreducibility holds - so the stationary distribution is unique, and the ergodic
theorem applies (Exercise).
2. Voter model: The constant configurations i(x) ≡ i, i ∈ T , are absorbing states, i.e.
c_j(x, i) = 0 for all j ≠ i and all x. Any other state is transient, so
P[ ⋃_{i∈T} {η_t = i eventually} ] = 1.
Moreover,
Ni (ηt ) := |{x ∈ V : ηt (x) = i}|
is a martingale (Exercise), so
N_i(η) = E_η[N_i(η_t)] → E_η[N_i(η_∞)] = N · P_η[η_t = i eventually] as t → ∞, i.e.
P_η[η_t = i eventually] = N_i(η) / N.
The stationary distributions are the Dirac measures δ_i, i ∈ T, and their convex combinations.
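The martingale argument can be illustrated by simulation. The following sketch runs the binary voter model on a small cycle and checks the absorption probability N_1(η_0)/N; the graph and all parameters are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)

def run_voter_model(adj, eta0, t_max):
    """Binary voter model: site x flips with rate = #{disagreeing neighbours}.
    Simulated via the jump-chain construction; adj is a neighbour-list graph."""
    eta = np.array(eta0)
    t = 0.0
    while t < t_max:
        rates = np.array([sum(eta[y] != eta[x] for y in adj[x])
                          for x in range(len(eta))])
        lam = rates.sum()
        if lam == 0:                     # consensus reached (absorbing state)
            break
        t += rng.exponential(1.0 / lam)
        x = int(rng.choice(len(eta), p=rates / lam))
        eta[x] = 1 - eta[x]
    return eta

adj = [[(x - 1) % 6, (x + 1) % 6] for x in range(6)]   # cycle of length 6
eta0 = [1, 1, 0, 0, 0, 0]                              # N_1 = 2, N = 6
wins = sum(run_voter_model(adj, eta0, 1e4).all() for _ in range(5000))
print(wins / 5000)                                     # ~ N_1/N = 1/3
```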
3. Contact process: The configuration 0 is absorbing, all other states are transient. Hence
δ0 is the unique invariant measure and ergodicity holds.
We see that on finite graphs the situation is rather simple as long as we are only interested in
existence and uniqueness of invariant measures, and ergodicity. Below, we will show that on
infinite graphs the situation is completely different, and phase transitions occur. On finite subgraphs of an infinite graph these phase transitions affect the rate of convergence to the stationary distribution and the variances of ergodic averages, but not the ergodicity properties themselves.
Example (Multinomial resampling; e.g. population genetics, mean-field voter model). With rate 1, replace each type η(x), x ∈ V, by a type that is randomly selected from the empirical distribution L_n(η):
c_i(x, η) = L_n(η)(i) = \frac{1}{n} |{y ∈ V : η(y) = i}|
As a special case we now consider mean-field models with type space T = {0, 1} or T = {−1, 1}. In this case the empirical distribution is completely determined by the frequency of type 1 in a configuration:
L_n(η) ⟷ N_1(η) = |{x : η(x) = 1}|,   c_i(x, η) = f̃_i(N_1(η)).
If (η_t, P_x) is the corresponding mean-field particle system, then (Exercise) X_t = N_1(η_t) is a birth-death process on {0, 1, …, n} with birth and death rates
b(k) = (n − k) · f̃_1(k), d(k) = k · f̃_0(k),
where (n − k) is the number of particles in state 0 and f̃_1(k) is the birth rate per particle.
Explicit computation of hitting times, stationary distributions etc.!
where
m(η) = \sum_{x=1}^n η(x) = N_1(η) − N_{−1}(η) = 2N_1(η) − n
is the total magnetization. Note that each η(x) interacts with the mean field \frac{1}{n} \sum_y η(y), which explains the choice of interaction strength of order 1/n. The birth-death chain N_1(η_t) corresponding to the heat bath dynamics has birth and death rates
b(k) = (n − k) · \frac{e^{βk/n}}{e^{βk/n} + e^{β(n−k)/n}},   d(k) = k · \frac{e^{β(n−k)/n}}{e^{βk/n} + e^{β(n−k)/n}}.
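With these rates, the stationary distribution µ̄_β of the magnetization chain can be computed from detailed balance, and it exhibits exactly the mode structure discussed next (one mode for small β, two for large β). A minimal sketch; the parameter values are illustrative:

```python
import numpy as np

def curie_weiss_stationary(beta, n):
    """Stationary law of N_1(eta_t) via detailed balance
    mu(k-1) b(k-1) = mu(k) d(k), with b, d as above."""
    def b(k):
        return (n - k) * np.exp(beta * k / n) / (np.exp(beta * k / n) + np.exp(beta * (n - k) / n))
    def d(k):
        return k * np.exp(beta * (n - k) / n) / (np.exp(beta * k / n) + np.exp(beta * (n - k) / n))
    log_mu = np.zeros(n + 1)
    for k in range(1, n + 1):
        log_mu[k] = log_mu[k - 1] + np.log(b(k - 1)) - np.log(d(k))
    mu = np.exp(log_mu - log_mu.max())
    return mu / mu.sum()

n = 100
for beta in (0.5, 2.0):
    mu = curie_weiss_stationary(beta, n)
    modes = [k for k in range(1, n) if mu[k] > mu[k - 1] and mu[k] > mu[k + 1]]
    print(beta, modes)   # beta = 0.5: one mode near n/2; beta = 2.0: two modes
```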
The binomial distribution Bin(n, 1/2) has a maximum at its mean value n/2 and standard deviation √n/2. Hence for large n, the measure µ̄_β has one sharp mode of standard deviation O(√n) if β is small, and two modes if β is large:
[Figure: µ̄_β with one mode at n/2 for small β, and two symmetric modes for large β]
\lim_{n→∞} β_n = 1 (Exercise).
Now consider the heat bath dynamics with an initial configuration η_0 with N_1(η_0) ≤ n/2, n even, and let
T := inf{t ≥ 0 : N_1(η_t) > n/2}.
By the formula for mean hitting times for a birth-death process,
E[T] ≥ \frac{µ̄_β({0, 1, …, n/2})}{µ̄_β(n/2) · b(n/2)} ≥ \frac{1}{2 µ̄_β(n/2) · b(n/2)} ≥ \frac{e^{βn/2}}{2^n · n²}
since
µ̄_β(n/2) = \binom{n}{n/2} e^{−βn/2} µ̄_β(0) ≤ 2^n e^{−βn/2}.
Hence the average time needed to go from configurations with negative magnetization to states
with positive magnetization is increasing exponentially in n for β > 2 log 2. Thus although
ergodicity holds, for large n the process gets stuck for a very large time in configurations with
negative resp. positive magnetization.
Metastable behaviour.
More precisely, one can show using large deviation techniques that metastability occurs for any
inverse temperature β > 1, cf. below.
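The exponential blow-up of E[T] can be made visible numerically by combining the hitting-time formula (a) with the heat bath rates; the computation is done in log scale to avoid overflow. A sketch (β = 2 and the values of n are illustrative assumptions):

```python
import numpy as np
from scipy.special import logsumexp

def log_mean_crossing_time(beta, n):
    """log E_0[T], T = first time N_1 exceeds n/2, via formula (a)
    for birth-death chains with the mean-field heat bath rates."""
    def b(k):
        return (n - k) * np.exp(beta * k / n) / (np.exp(beta * k / n) + np.exp(beta * (n - k) / n))
    def d(k):
        return k * np.exp(beta * (n - k) / n) / (np.exp(beta * k / n) + np.exp(beta * (n - k) / n))
    m = n // 2
    log_mu = np.concatenate([[0.0], np.cumsum(
        [np.log(b(k - 1)) - np.log(d(k)) for k in range(1, m + 1)])])
    log_terms = [logsumexp(log_mu[:k + 1]) - log_mu[k] - np.log(b(k))
                 for k in range(m + 1)]
    return logsumexp(log_terms)

for n in (20, 40, 80, 160):
    print(n, log_mean_crossing_time(2.0, n) / n)   # roughly constant > 0
```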
• Durrett [9]
• Liggett [15]
V = Z^d, T finite, E = {{x, y} : |x − y|_{l¹} = 1}, S = T^{Z^d} with the product topology (compact);
η_n → η ⇔ η_n(x) → η(x) for all x ∈ Z^d.
Assumptions:
• N_t^{x,i}: independent Poisson processes with rate λ̄ (alarm clocks for transitions at x to i),
• T_n^{x,i}: n-th arrival time of N_t^{x,i},
• U_n^{x,i}: independent random variables, uniformly distributed on [0, 1].
For finite A ⊂ Z^d,
S_{ξ,A} := {η ∈ S | η = ξ on A^c}
is finite. Hence for all s ≥ 0 there exists a unique Markov jump process (η_t^{(s,ξ,A)})_{t≥s} on S_{ξ,A} with initial condition η_s^{(s,ξ,A)} = ξ and transitions η → η^{x,i}, x ∈ A, at the times T_n^{x,i} ≥ s whenever U_n^{x,i} ≤ c_i(x, η)/λ̄. The idea is now to define a Markov process η_t^{(s,ξ)} on S for t − s small by
η_t^{(s,ξ)} := η_t^{(s,ξ,A)}
If x affects y in the time interval (s, t], or vice versa, then {x, y} ∈ E_{s,t}.
Lemma 2.1. If
t − s ≤ \frac{1}{8 d² |T| λ̄} =: δ,
then
P[all connected components of (Z^d, E_{s,t}) are finite] = 1.
Consequence: For small time intervals [s, t] we can construct the configuration at time t from the configuration at time s independently for each component, by the standard construction for jump processes with finite state space.
where (2d)^{2n−1} bounds the number of self-avoiding paths starting at 0, and the events {(z_{2i}, z_{2i+1}) ∈ E_{s,t}} are independent. Hence
P[∃ x ∈ C_0 : d_{l¹}(x, 0) ≥ 2n − 1] ≤ ( 4d² · (1 − e^{−2|T|λ̄(t−s)}) )^n ≤ ( 8d² |T| λ̄ (t − s) )^n → 0
as n → ∞, where e^{−2|T|λ̄(t−s)} is the probability of no arrival in [s, t] among 2|T| independent Poisson(λ̄) processes, and 1 − e^{−2|T|λ̄(t−s)} ≤ 2|T|λ̄(t − s).
By the lemma, P-almost surely, for all s ≥ 0 and ξ ∈ T^{Z^d} there is a unique function t ↦ η_t^{(s,ξ)}, t ≥ s, such that
(i) η_s^{(s,ξ)} = ξ,
(ii) for s ≤ t, h ≤ δ, and each connected component C of (Z^d, E_{t,t+h}), the restriction of η_{t+h}^{(s,ξ)} to C is obtained from the restriction of η_t^{(s,ξ)} to C by successively applying the finitely many transitions in C during [t, t + h].
We set
η_t^ξ := η_t^{(0,ξ)}.
By construction,
η_t^ξ = η_t^{(s, η_s^ξ)} for all 0 ≤ s ≤ t.    (2.1)
Taking into account the F_s-measurability of η_s^ξ, and that η_t^{(s,ξ)} is independent of F_s for fixed ξ, we conclude with (i):
E[f(η_t^ξ) | F_s](ω) = E[f(η_t^{(s, η_s^ξ(ω))})] = E[f(η_{t−s}^{η_s^ξ(ω)})] = (p_{t−s} f)(η_s^ξ(ω))
(iii) ξ_n → ξ ⇒ ξ_n(x) → ξ(x) for all x ∈ Z^d. Hence ξ_n = ξ eventually on each finite set C ⊂ Z^d, and hence on each component of (Z^d, E_{0,δ}). By the componentwise construction,
η_t^{ξ_n} = η_t^ξ eventually, for all t ≤ δ.
Remark . Since f is a cylinder function, the sum in the formula for the generator has only finitely
many non-zero summands.
Proof.
P[ \sum_{k=1,…,n; i∈T} N_t^{x_k,i} > 1 ] ≤ const. · t²,
where the event means that there is more than one arrival in the time interval [0, t] among the clocks at {x_1, …, x_n}, and const. is a global constant. Moreover,
P[N_t^{x_k,i} = 1] = λ̄ · t + O(t²),
and hence
Definition 2.5. The Markov process (η_t^ξ, P) is called attractive if and only if for all x ∈ Z^d,
η ≤ η̃, η(x) = η̃(x) ⇒ c_1(x, η) ≤ c_1(x, η̃) and c_0(x, η) ≥ c_0(x, η̃).
Example. The contact process, the voter model, as well as the Metropolis and heat-bath dynamics for the (ferromagnetic) Ising model are attractive.
and by induction
η_t^{(s,ξ)} ≤ η_t^{(s,ξ̃)} for all t ≥ s ≥ 0,
since η_t^{(s,ξ)} = η_t^{(s+δ, η_{s+δ}^{(s,ξ)})}.
(If, for example, before a possible transition at time T_n^{x,1}, η ≤ η̃ and η(x) = η̃(x) = 0, then after the transition η(x) = 1 iff U_n^{x,1} ≤ c_1(x, η)/λ̄; but in this case also η̃(x) = 1, since c_1(x, η) ≤ c_1(x, η̃) by attractiveness. The other cases are checked similarly.)
2. Since f is increasing and ξ ≤ ξ̃,
(p_t f)(ξ) = E[f(η_t^ξ)] ≤ E[f(η_t^{ξ̃})] = (p_t f)(ξ̃).
Let 0, 1 ∈ S denote the constant configurations and δ0 , δ1 the minimal respectively maximal
element in M1 (S).
Theorem 2.7. For an attractive particle system on {0, 1}^{Z^d} we have:
1. t ↦ δ_0 p_t is increasing and t ↦ δ_1 p_t is decreasing with respect to the stochastic order ≼.
2. The limits µ := \lim_{t→∞} δ_0 p_t and µ̄ := \lim_{t→∞} δ_1 p_t exist with respect to weak convergence in M_1(S).
3. µ and µ̄ are stationary distributions for (p_t).
4. Any stationary distribution π satisfies µ ≼ π ≼ µ̄.
Proof. 1. For 0 ≤ s ≤ t we have δ_0 ≼ δ_0 p_{t−s}, hence by monotonicity
δ_0 p_s ≼ δ_0 p_{t−s} p_s = δ_0 p_t.
2. By monotonicity and compactness: since S = {0, 1}^{Z^d} is compact with respect to the product topology, M_1(S) is compact with respect to weak convergence. Thus it suffices to show that any two subsequential limits µ_1 and µ_2 of δ_0 p_t coincide. Now by 1., \int f d(δ_0 p_t) is increasing in t for every increasing function f, so the limit exists along the full family.
Corollary 2.8. For an attractive particle system, the following statements are equivalent:
1. µ = µ̄.
2. There is a unique stationary distribution.
3. Ergodicity holds:
∃ µ ∈ M1 (S) : νpt −→ µ ∀ ν ∈ M1 (S).
νpt → µ = µ̄
3. ⇒ 1.: obvious.
For the contact process, c0 (x, η) = δ and c1 (x, η) = b · N1 (x, η) where the birth rate b and the
death rate δ are positive constants. Since the 0 configuration is an absorbing state, µ = δ0 is the
minimal stationary distribution. The question now is if there is another (non-trivial) stationary
distribution, i.e. if µ̄ 6= µ.
Theorem 2.9. If 2db < δ then δ0 is the only stationary distribution, and ergodicity holds.
\frac{d}{dt} P[η_t^1(x) = 1] = −δ P[η_t^1(x) = 1] + b \sum_{y : |x−y|=1} P[η_t^1(x) = 0, η_t^1(y) = 1]
≤ (−δ + 2db) · P[η_t^1(x) = 1]
Conversely, one can show that for b sufficiently large (or δ sufficiently small), there is a nontrivial stationary distribution. The proof is more involved, cf. Liggett [15]. Thus a phase transition from ergodicity to non-ergodicity occurs as b increases.
We consider the heat bath or Metropolis dynamics with inverse temperature β > 0 on S =
d
{−1, +1}Z .
S+,A := {η ∈ S | η = +1 on Ac } (finite!)
S−,A := {η ∈ S | η = −1 on Ac } .
For ξ ∈ S_{+,A} resp. ξ ∈ S_{−,A}, let η_t^{ξ,A} = η_t^{(0,ξ,A)} denote the dynamics taking into account only transitions in A.
(η_t^{ξ,A}, P) is a Markov chain on S_{+,A} resp. S_{−,A} with generator
(L f)(η) = \sum_{x∈A, i∈{−1,+1}} c_i(x, η) · ( f(η^{x,i}) − f(η) ).
Let
H(η) = \frac{1}{4} \sum_{x,y∈Z^d, |x−y|=1} (η(x) − η(y))²
denote the Ising Hamiltonian. Note that for η ∈ S+,A or η ∈ S−,A only finitely many
summands do not vanish, so H(η) is finite. The probability measure
µ_β^{+,A}(η) = \frac{1}{Z_β^{+,A}} e^{−βH(η)}, η ∈ S_{+,A}, where Z_β^{+,A} = \sum_{η∈S_{+,A}} e^{−βH(η)},
satisfies the detailed balance condition
µ_β^{+,A}(ξ) L(ξ, η) = µ_β^{+,A}(η) L(η, ξ) for all ξ, η ∈ S_{+,A},
respectively
µ_β^{−,A}(ξ) L(ξ, η) = µ_β^{−,A}(η) L(η, ξ) for all ξ, η ∈ S_{−,A}.
Since S_{+,A} and S_{−,A} are finite and the dynamics are irreducible, this implies that µ_β^{+,A} respectively µ_β^{−,A} is the unique stationary distribution of (η_t^{ξ,A}, P) for ξ ∈ S_{+,A}, S_{−,A} respectively. Thus in finite volume there are several processes corresponding to different boundary conditions (which affect the Hamiltonian), but each of them has a unique stationary distribution. Conversely, in infinite volume there is only one process, but it may have several stationary distributions:
b) Infinite volume: To identify the stationary distributions for the process on Zd , we use an
approximation by the dynamics in finite volume. For n ∈ N let
An := [−n, n]d ∩ Zd ,
ξ_n(x) := ξ(x) for x ∈ A_n, and ξ_n(x) := +1 for x ∈ Z^d \ A_n.
Remark (Gibbs measures). A probability measure µ on S is called Gibbs measure for the Ising
Hamiltonian on Zd and inverse temperature β > 0 if and only if for all finite A ⊆ Zd and ξ ∈ S,
µ_β^{ξ,A}(η) := \frac{1}{Z_β^{ξ,A}} e^{−βH(η)}, η ∈ S_{ξ,A} := {η ∈ S | η = ξ on A^c},
is a version of the conditional distribution of µ_β given η(x) = ξ(x) for all x ∈ A^c. One can show that µ_β^+ and µ_β^− are the extremal Gibbs measures for the Ising model with respect to stochastic dominance, cf. e.g. [Milos].
Definition 2.10. We say that a phase transition occurs for β > 0 if and only if µ_β^+ ≠ µ_β^−.
Proof. Let C_x denote the component containing x in the random graph (Z^d, E_{0,δ}). If C_x ⊆ A_n then the modifications of the initial condition and of the transition mechanism outside A_n do not affect the value at x before time δ. Hence the probability in (2.2) can be estimated by
P[C_x ∩ A_n^c ≠ ∅],
which goes to 0 as n → ∞ by Lemma 2.1 above.
Let p_t denote the transition semigroup on {−1, 1}^{Z^d}. Since the dynamics is attractive,
µ̄_β = \lim_{t→∞} δ_{+1} p_t and µ_β = \lim_{t→∞} δ_{−1} p_t
are the extremal stationary distributions with respect to stochastic dominance. The following theorem identifies µ̄_β and µ_β as the extremal Gibbs measures for the Ising Hamiltonian on Z^d. In particular, ergodicity holds if and only if there is no phase transition (i.e. iff µ_β^+ = µ_β^−).
Proof. We show:
1. µ̄_β ≼ µ_β^+,
2. µ_β^+ is a stationary distribution with respect to p_t.
1. It can be shown, similarly as above, that the attractiveness of the dynamics implies
µ_t^1 ≼ µ_t^{1,A_n},
hence
µ̄_β ≼ µ_β^{+,A_n}.
2. It is enough to show
µ_β^+ p_t = µ_β^+ for t ≤ δ,    (2.3)
then the assertion follows by the semigroup property of (pt )t≥0 . Let
(p_t^n f)(ξ) := E[f(η_t^{ξ_n, A_n})]
µ_β^{+,n} p_t^n = µ_β^{+,n}    (2.4)
and
(p_t^n f)(ξ) → (p_t f)(ξ) uniformly in ξ.    (2.5)
Since µ_β^{+,n} → µ_β^+ weakly, and f and p_t f are continuous by the Feller property, taking the limit in (2.4) as n → ∞ using (2.5) yields
\int f d(µ_β^+ p_t) = \int p_t f dµ_β^+ = \int f dµ_β^+.
µ_β^+ = µ_β^− = ⊗_{z∈Z^d} ν, where ν({+1}) = ν({−1}) = \frac{1}{2}.
On the other hand, phase transitions occur for d ≥ 2 and large values of β:
Theorem 2.13 (Peierls). For d = 2 there exists β_c ∈ (0, ∞) such that for β > β_c,
µ_β^+({η : η(0) = −1}) < \frac{1}{2} < µ_β^−({η : η(0) = −1}),
and thus µ_β^+ ≠ µ_β^−.
Proof. Let C_0(η) denote the connected component of 0 in {x ∈ Z^d | η(x) = −1}, and set C_0(η) = ∅ if η(0) = +1. Let A ⊆ Z^d be finite and non-empty. For η ∈ S with C_0(η) = A, let η̃ denote the configuration obtained by reversing all spins in A. Then H(η) − H(η̃) = 2|∂A|, and hence
µ_β^{+,n}(C_0 = A) = \sum_{η : C_0(η)=A} µ_β^{+,n}(η) ≤ e^{−2β|∂A|} \sum_{η : C_0(η)=A} µ_β^{+,n}(η̃) ≤ e^{−2β|∂A|},
since the configurations η̃ are distinct for distinct η, so their total mass is at most 1.
Thus
µ_β^{+,n}({η : η(0) = −1}) = \sum_{A⊂Z^d, A≠∅} µ_β^{+,n}(C_0 = A)
≤ \sum_{L=1}^∞ e^{−2βL} |{A ⊂ Z^d : |∂A| = L}|
≤ \sum_{L=4}^∞ e^{−2βL} · 4 · 3^{L−1} · L²
≤ \frac{1}{2} for β > β_c,
where we have used that ∂A contains a self-avoiding path in Z² of length L starting in [−L/2, L/2]². Hence for n → ∞,
µ_β^+({η : η(0) = −1}) < \frac{1}{2},
and by symmetry
µ_β^−({η : η(0) = −1}) = µ_β^+({η : η(0) = +1}) > \frac{1}{2}
for β > β_c.
b) ν σ-finite: Let S = ⋃̇_{i∈N} S_i with ν(S_i) < ∞. Let N_i be independent Poisson random measures with intensities I_{S_i} · ν. Then
N := \sum_{i=1}^∞ N_i
is a Poisson random measure with intensity ν = \sum_{i=1}^∞ I_{S_i} · ν.
(ii) If B1 , . . . , Bn ∈ S are disjoint, then (Nt (B1 ))t≥0 , . . . , (Nt (Bn ))t≥0 are independent.
(iii) (Nt (B))t≥0 is a Poisson process of intensity ν(B) for all B ∈ S with ν(B) < ∞.
Remark . A Poisson random measure (respectively a Poisson point process) is a random variable
(respectively a stochastic process) with values in the space
M_c^+(S) = { \sum_{x∈A} δ_x | A ⊆ S countable } ⊆ M^+(S)
of all counting measures on S. The distribution of a Poisson random measure and a Poisson
point process of given intensity is determined uniquely by the definition.
N_t := \sum_{i=1}^{K_t} δ_{Z_i}
is a Poisson point process of intensity ν, provided the random variables Z_i are independent with distribution λ^{−1}ν, λ := ν(S), and (K_t)_{t≥0} is an independent Poisson process of intensity λ.
Proof. Exercise.
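The representation N_t = Σ_{i≤K_t} δ_{Z_i} yields an immediate sampling scheme in the finite-intensity case. A minimal sketch (λ, the mark distribution, and the horizon are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

def sample_poisson_point_process(lam, sample_mark, t_max):
    """Sample (T_i, Z_i) of N_t = sum_{i <= K_t} delta_{Z_i} up to t_max:
    K a Poisson process of intensity lam = nu(S), Z_i i.i.d. ~ nu/lam.
    Given K_{t_max}, the arrival times are uniform on [0, t_max]."""
    k = rng.poisson(lam * t_max)
    times = np.sort(rng.uniform(0.0, t_max, size=k))
    marks = np.array([sample_mark() for _ in range(k)])
    return times, marks

# nu = standard normal distribution on R (total mass 1)
T, Z = sample_poisson_point_process(1.0, rng.standard_normal, t_max=10.0)
# N_t(B) = #{i : T_i <= t, Z_i in B} is Poisson(t * nu(B)) for each B
```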
[Figure: sample configurations of a Poisson point process N(β) with low and high intensity]
Corollary 2.17. If ν(S) < ∞, then a Poisson point process of intensity ν is a Markov jump process on M_c^+(S) with finite jump measure
q(π, •) = \int δ_{π+δ_y}(•) ν(dy), π ∈ M_c^+(S),
and generator
(L F)(π) = \int ( F(π + δ_y) − F(π) ) ν(dy),    (2.6)
for bounded F : M_c^+(S) → R. If ν(S) = ∞, (2.6) is not well-defined for all bounded functions F.
Chapter 3
Markov semigroups and Lévy processes
Properties:
1. Semigroup:
ps pt = ps+t ∀ s, t ≥ 0
2. (sub-)Markov:
(i) f ≥ 0 ⇒ pt f ≥ 0 positivity preserving
(ii) p_t 1 = 1 (respectively p_t 1 ≤ 1 if ζ ≠ ∞)
Consequence:
(pt )t≥0 induces a semigroup of linear contractions (Pt )t≥0 on the following Banach spaces:
1. Fb (S) which is the space of all bounded measurable functions f : S → R endowed with
the sup-norm.
Now let (Pt )t≥0 be a general semigroup of linear contractions on a Banach space B.
\lim_{t↓0} P_t f = f for all f ∈ B.
2. Conversely, if Pt is strongly continuous on B then Dom(L) is dense in B (for the proof see
script Stochastic analysis II).
3. The transition semigroup (p_t)_{t≥0} of a right-continuous Markov process induces a C_0 semigroup (P_t)_{t≥0} on
Theorem 3.2 (Maximum principle). The generator L of a Markov semigroup (p_t)_{t≥0} on F_b(S) satisfies:
1. If f ∈ Dom(L) attains a maximum at x_0 ∈ S, then
(Lf)(x_0) = \lim_{t↓0} \frac{p_t f(x_0) − f(x_0)}{t} ≤ 0.
2. 1 ∈ Dom(L) and L1 = 0.
Proof. 1. If f ≤ f(x_0) everywhere, then
p_t f ≤ f(x_0) · p_t 1 ≤ f(x_0),
hence
(Lf)(x_0) = \lim_{t↓0} \frac{p_t f(x_0) − f(x_0)}{t} ≤ 0.
2. P_t 1 = 1 for all t ≥ 0.
Theorem 3.3 (Kolmogorov equations). If (Pt )t≥0 is a C0 semigroup with generator L then t 7→
Pt f is continuous for all f ∈ B. Moreover, if f ∈ Dom(L) then Pt f ∈ Dom(L) for all t ≥ 0,
and
\frac{d}{dt} P_t f = P_t Lf = L P_t f
and
kPt−h f − Pt f k = kPt−h (f − Ph f )k ≤ kf − Ph f k → 0
as h ↓ 0.
\frac{1}{h}(P_{t+h} f − P_t f) = P_t \frac{P_h f − f}{h} → P_t Lf
as h ↓ 0, because the operators Pt are contractions. On the other hand,
\frac{1}{−h}(P_{t−h} f − P_t f) = P_{t−h} \frac{P_h f − f}{h} → P_t Lf
as h ↓ 0 by 1.) and the contractivity.
\frac{1}{h}(P_h P_t f − P_t f) = \frac{1}{h}(P_{t+h} f − P_t f) → P_t Lf
as h ↓ 0. Hence by 1.), Pt f ∈ Dom(L) and LPt f = Pt Lf .
Corollary 3.4. Suppose (Xt , P ) is a right-continuous (Ft )-Markov process with transition semi-
group (pt )t≥0 .
2. Suppose (Xt , P ) is stationary with initial distribution µ, and L(p) is the generator of the
corresponding C0 semigroup on Lp (S, µ) for some p ∈ [1, ∞). Then for f ∈ Dom(L(p) ),
M_t^f = f(X_t) − \int_0^t L^{(p)} f(X_s) ds, t ≥ 0,
is, up to P-null sets, independent of the chosen version of L^{(p)} f, and (X_t, P) solves the martingale problem for (L^{(p)}, Dom(L^{(p)})).
M_t^f = f(X_t) − \int_0^t (Lf)(X_s) ds ∈ L¹(P),
and
E[M_t^f − M_s^f | F_s] = E[ f(X_t) − f(X_s) − \int_s^t Lf(X_u) du | F_s ]
= E[f(X_t) | F_s] − f(X_s) − \int_s^t E[Lf(X_u) | F_s] du
= p_{t−s} f(X_s) − f(X_s) − \int_s^t p_{u−s} Lf(X_s) du = 0
2. Exercise.
Definition 3.5. An R^d-valued stochastic process ((X_t)_{t≥0}, P) with càdlàg paths is called a Lévy process if and only if it has stationary independent increments, i.e.
(i) X_{s+t} − X_s is independent of F_s = σ(X_r | r ≤ s) for all s, t ≥ 0,
(ii) X_{s+t} − X_s ∼ X_t − X_0 for all s, t ≥ 0.
Xt = σBt + bt
The Lévy-Khinchin formula gives a classification of the distributions of all infinitely divisible
random variables on Rd in terms of their characteristic functions.
Theorem 3.6. A Lévy process is a time-homogeneous Markov process with translation invariant
transition functions
pt (x, B) = µt (B − x) = pt (a + x, a + B) ∀ a ∈ Rd (3.1)
Proof.
µ_t ∗ µ_s = µ_{t+s} for all t, s ≥ 0, since
(µ_t ∗ µ_s)(B) = \int µ_t(dy) µ_s(B − y) = \int p_t(0, dy) p_s(y, B) = p_{t+s}(0, B) = µ_{t+s}(B).
In particular, µ := µ_1 = P ∘ X_1^{−1} is infinitely divisible.
One easily verifies that for any Lévy process there exists a unique characteristic exponent.
Since the (X_i − X_{i−1}) are independent, identically distributed random variables with the same distribution as X_1, we have φ_{X_m} = φ_{X_1}^m = e^{−mψ}.
b) Let t = m/n ∈ Q and
X_m = \sum_{i=1}^n ( X_{im/n} − X_{(i−1)m/n} ).
Hence
φ_{X_m} = φ_{X_t}^n,
and since φ_{X_m} = e^{−mψ},
φ_{X_t} = e^{−(m/n)ψ} = e^{−tψ}.
2. Exercise.
Since Xs+t −Xs ∼ Xt , independent of Fs , the marginal distributions of a Lévy process ((Xt )t≥0 , P )
are completely determined by the distributions of Xt , and hence by ψ! In particular:
Corollary 3.9 (Semigroup and generator of a Lévy process). 1. For all f ∈ S(R^d) and t ≥ 0,
p_t f = (e^{−tψ} f̂)ˇ
where
f̂(p) = (2π)^{−d/2} \int e^{−ip·x} f(x) dx and ǧ(x) = (2π)^{−d/2} \int e^{ip·x} g(p) dp
denote the Fourier transform and the inverse Fourier transform of functions f, g ∈ L¹(R^d).
2. S(R^d) is contained in the domain of the generator L of the semigroup induced by (p_t)_{t≥0} on C_∞(R^d), and
Lf = −(ψ f̂)ˇ for f ∈ S(R^d).    (3.2)
Here S (Rd ) denotes the Schwartz space of rapidly decreasing smooth functions on Rd . Recall
that the Fourier transform maps S (Rd ) one-to-one onto S (Rd ).
Proof. 1. Since (pt f )(x) = E[f (Xt + x)], we conclude with Fubini
(p_t f)^(p) = (2π)^{−d/2} \int e^{−ip·x} (p_t f)(x) dx
= (2π)^{−d/2} · E[ \int e^{−ip·x} f(X_t + x) dx ]
= E[e^{ip·X_t}] · f̂(p)
= e^{−tψ(p)} f̂(p)
for all p ∈ Rd . The claim follows by the Fourier inversion theorem, noting that e−tψ ≤ 1.
2. For f ∈ S(R^d), f̂ is in S(R^d) as well. The Lévy-Khinchin formula that we will state below gives an explicit representation of all possible Lévy exponents, which shows in particular that ψ(p) grows at most polynomially as |p| → ∞. Hence
| \frac{e^{−tψ} f̂ − f̂}{t} + ψ f̂ | = | \frac{e^{−tψ} − 1}{t} + ψ | · |f̂|
and
\frac{e^{−tψ} − 1}{t} + ψ = −\frac{1}{t} \int_0^t ψ (e^{−sψ} − 1) ds = \frac{1}{t} \int_0^t \int_0^s ψ² e^{−rψ} dr ds,
hence
| \frac{e^{−tψ} f̂ − f̂}{t} + ψ f̂ | ≤ t · |ψ|² · |f̂| ∈ L¹(R^d),
and therefore
| \frac{p_t f − f}{t} − (−ψ f̂)ˇ | = | (2π)^{−d/2} \int e^{ip·x} ( \frac{e^{−tψ(p)} − 1}{t} + ψ(p) ) f̂(p) dp | → 0
as t ↓ 0, uniformly in x. This shows f ∈ Dom(L) and Lf = (−ψ f̂)ˇ. In particular, p_t is strongly continuous on S(R^d). Since S(R^d) is dense in C_∞(R^d) and p_t is contractive, this implies strong continuity on C_∞(R^d).
Remark . pt is not necessarily strongly continuous on Cb (Rd ). Consider e.g. the deterministic
process
Xt = X0 + t
on R1 . Then
(pt f )(x) = f (x + t),
and one easily verifies that there exists f ∈ C_b(R) such that p_t f ↛ f uniformly.
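The formula p_t f = (e^{−tψ} f̂)ˇ can be implemented directly with the discrete Fourier transform. A sketch for Brownian motion with drift, ψ(p) = p²/2 − ibp; the grid and the periodization it introduces are artifacts of the illustration:

```python
import numpy as np

def apply_levy_semigroup(f_vals, x_grid, psi, t):
    """Approximate p_t f = (e^{-t psi} f_hat)-check on a uniform grid via FFT.
    Valid up to grid/periodization error for rapidly decaying f."""
    dx = x_grid[1] - x_grid[0]
    p = 2 * np.pi * np.fft.fftfreq(len(x_grid), d=dx)   # angular frequencies
    return np.real(np.fft.ifft(np.exp(-t * psi(p)) * np.fft.fft(f_vals)))

x = np.linspace(-20.0, 20.0, 2048)
f = np.exp(-x ** 2)
ptf = apply_levy_semigroup(f, x, lambda p: 0.5 * p ** 2 - 1j * p, t=1.0)

# Exact: p_t f(x) = E[f(x + B_t + t)] = exp(-(x+1)^2/3) / sqrt(3) for t = 1
print(np.max(np.abs(ptf - np.exp(-(x + 1.0) ** 2 / 3.0) / np.sqrt(3.0))))
```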
Corollary 3.10. (Xt , P ) solves the martingale problem for the operator (L, S (Rd )) defined by
(3.2).
= e^{−ψ(p)·t}
where
ψ(p) = \frac{1}{2} |σ^T p|² − ib·p = \frac{1}{2} p·ap − ib·p, a := σσ^T,
and
Lf = −(ψ f̂)ˇ = \frac{1}{2} div(a∇f) + b·∇f, f ∈ S(R^n).
and hence
ψ(p) = λ · (1 − φ_π(p)) = \int (1 − e^{ip·y}) λπ(dy),
and
(Lf)(x) = (−ψ f̂)ˇ(x) = \int ( f(x + y) − f(x) ) λπ(dy), f ∈ S(R^n).
The jump intensity measure ν := λπ is called the Lévy measure of the compound Poisson
process.
Proof. E.g.,
E[M_{s+t}² − M_s² | F_s] = E[(M_{s+t} − M_s)² | F_s] = E[(M_{s+t} − M_s)²]
= E[|M_t|²] = Var(X_t) = Var( \sum_{i=1}^{N_t} Z_i )
= E[ Var( \sum_{i=1}^{N_t} Z_i | N_t ) ] + Var( E[ \sum_{i=1}^{N_t} Z_i | N_t ] ),
and since Var(\sum_{i=1}^{N_t} Z_i | N_t) = N_t · Var(Z_1) and E[\sum_{i=1}^{N_t} Z_i | N_t] = N_t · E[Z_1],
E[M_{s+t}² − M_s² | F_s] = E[N_t] · Var(Z_1) + Var(N_t) · |E[Z_1]|² = λt E[|Z_1|²].
This motivates looking for Lévy processes that satisfy the scaling relation (3.3). Clearly, (3.3) is equivalent to
e^{−tψ(cp)} = E[e^{ip·cX_t}] = E[e^{ip·X_{c^α t}}] = e^{−c^α t ψ(p)} for all c > 0,
i.e.
ψ(cp) = c^α ψ(p) for all c > 0.
ψ(p) = \frac{σ^α}{2} |p|^α
for some σ > 0. In this case, the generator of a corresponding Lévy process would be the fractional power of the Laplacian,
Lf = −(ψ f̂)ˇ = \frac{σ^α}{2} ∆^{α/2} f.
For α = 2 and σ = 1 the corresponding Lévy process is a Brownian motion, the scaling
limit in the classical central limit theorem. For α > 2, L does not satisfy the maximum
principle, hence the corresponding semigroup is not a transition semigroup of a Markov
process.
where
ψ(p) = \lim_{ε↓0} ψ_ε(p), ψ_ε(p) := \int_{R^d \setminus B(0,ε)} (1 − e^{ip·y}) |y|^{−α−1} dy.
Proof. By the substitution x = |p|y and ν := p/|p|,
\int_{R^d \setminus B(0,ε)} (1 − e^{ip·y}) |y|^{−α−1} dy = |p|^α \int_{R^d \setminus B(0,ε|p|)} (1 − e^{iν·x}) |x|^{−α−1} dx → const · |p|^α.
Note that ψε is the symbol of a compound Poisson process with Lévy measure proportional to
|y|−α−1 · I{|y|>ε} dy. Hence we could expect that ψ is a symbol of a similar process with Lévy
measure proportional to |y|−α−1 dy. Since this measure is infinite, a corresponding process should
have infinitely many jumps in any non-empty time interval. To make this heuristics rigorous we
now give a construction of Lévy processes from Poisson point process:
a) Finite intensity
Theorem 3.12. Suppose ν is a finite measure on R^d. If (N_t)_{t≥0} is a Poisson point process of intensity ν, then
X_t := \int y N_t(dy)
is a compound Poisson process with Lévy measure ν (i.e. total intensity λ = ν(R^d) and jump distribution π = ν/ν(R^d)).
Proof. By the theorem in Section 1.9 above and the uniqueness of a Poisson point process of
intensity ν we may assume
N_t = \sum_{i=1}^{K_t} δ_{Z_i}
where Zi are independent random variables of distribution λ−1 ν, and (Kt )t≥0 is an independent
Poisson process of intensity λ.
Hence
X_t = \int y N_t(dy) = \sum_{i=1}^{K_t} Z_i,
which is a compound Poisson process as claimed.
Now let ν be a positive measure on R^d \ {0} satisfying
(A2) \int (1 ∧ |y|²) ν(dy) < ∞
(i.e. ν(|y| ≥ ε) < ∞ and \int_{|y|<ε} |y|² ν(dy) < ∞ for all ε > 0).
For example, we could choose ν(dy) = |y|^{−α−1} dy, α ∈ (0, 2), which is our candidate for the Lévy measure of an α-stable process. Let (N_t)_{t≥0} be a Poisson point process with intensity ν. Our aim is to prove the existence of a corresponding Lévy process by an approximation argument. For ε > 0,
N_t^ε(dy) := I_{{|y|>ε}} · N_t(dy)
is a Poisson point process with finite intensity ν^ε(dy) = I_{{|y|>ε}} · ν(dy), and hence
X_t^ε := \int_{|y|>ε} y N_t(dy) = \int y N_t^ε(dy)
is a compound Poisson process with Lévy measure ν^ε.
Proof.
X_t^δ − X_t^ε = \int_{δ<|y|≤ε} y N_t(dy) = \int y N_t^{δ,ε}(dy)
where
N_t^{δ,ε}(dy) := I_{{δ<|y|≤ε}} · N_t(dy)
is a Poisson point process of intensity ν^{δ,ε}(dy) = I_{{δ<|y|≤ε}} · ν(dy). Hence X_t^δ − X_t^ε is a compound Poisson process with finite Lévy measure ν^{δ,ε}. In particular,
M_t := X_t^δ − X_t^ε − t · \int y ν^{δ,ε}(dy) = X_t^δ − X_t^ε − t · \int_{δ<|y|≤ε} y ν(dy)
and
|M_t|² − t · \int |y|² ν^{δ,ε}(dy) = |M_t|² − t · \int_{δ<|y|≤ε} |y|² ν(dy)
are martingales.
Theorem 3.14. Let t ≥ 0. If (A1) and (A2) hold, then the processes X^ε, ε > 0, form a Cauchy family with respect to the norm
‖X^δ − X^ε‖ := E[ \sup_{s≤t} |X_s^δ − X_s^ε|² ]^{1/2}.
Remark . 1. Representation of Lévy process with symbol ψ as jump process with infinite jump
intensity.
2. For ν(dy) = |y|−α−1 , α ∈ (0, 2), we obtain an α-stable process.
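This construction can be carried out numerically: simulate the compound Poisson approximation X_t^ε with Lévy measure |y|^{−α−1} I_{{|y|>ε}} dy and take ε small. A one-dimensional sketch (α, ε, and sample sizes are illustrative assumptions; no compensation is needed since the measure is symmetric):

```python
import numpy as np

rng = np.random.default_rng(4)

def stable_increment_approx(alpha, t, eps, size):
    """X_t^eps: sum of the jumps of a Poisson point process with intensity
    nu(dy) = |y|^{-alpha-1} dy restricted to |y| > eps (d = 1)."""
    lam = 2.0 * eps ** (-alpha) / alpha          # nu({|y| > eps})
    out = np.empty(size)
    for i in range(size):
        k = rng.poisson(lam * t)                 # number of jumps up to t
        radii = eps * rng.uniform(size=k) ** (-1.0 / alpha)   # inversion of
        signs = rng.choice([-1.0, 1.0], size=k)               # P[|Y| > r] = (eps/r)^alpha
        out[i] = np.sum(signs * radii)
    return out

X = stable_increment_approx(alpha=1.5, t=1.0, eps=1e-3, size=10_000)
# Self-similarity check: 2 * X_1 and X_{2^1.5} should have the same law
# (compare empirical quantiles; variances are infinite for alpha < 2).
```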
Proof. The lemma and (A2) yield that (X^ε)_{ε>0} is Cauchy with respect to ‖·‖. Since the processes X_s^ε are right-continuous and the convergence is uniform, the limit process X_s is right-continuous as well. Similarly, it has independent increments, since the approximating processes have independent increments, and by dominated convergence
E[e^{ip·(X_{s+t} − X_s)}] = \lim_{ε↓0} E[e^{ip·(X_{s+t}^ε − X_s^ε)}] = \lim_{ε↓0} e^{−tψ_ε(p)},
where
ψ_ε(p) = \int (1 − e^{ip·y}) ν_ε(dy) = \int_{|y|>ε} (1 − e^{ip·y}) ν(dy).
Theorem 3.15 (Lévy-Khinchin). For ψ : R^d → C the following statements are equivalent:
(i) ψ is the symbol of a Lévy process;
(ii) e^{−ψ} is the characteristic function of an infinitely divisible random variable;
(iii) ψ(p) = \frac{1}{2} p·ap − ib·p + \int_{R^d} ( 1 − e^{ip·y} + ip·y I_{{|y|≤1}} ) ν(dy),
where a ∈ R^{d×d} is a non-negative definite matrix, b ∈ R^d, and ν is a positive measure on R^d \ {0} satisfying (A2).
(ii)⇒(iii): This is the classical Lévy-Khinchin theorem which is proven in several textbooks on
probability theory, cf. e.g. Feller [10] and Varadhan [22].
(iii)⇒(i): The idea for the construction of a Lévy process with symbol ψ is to define
X_t = X_t^{(1)} + X_t^{(2)} + X_t^{(3)}.
Since X_t^{(3,ε)} is a martingale for all ε > 0, the existence of the limit as ε ↓ 0 can be established as above via the maximal inequality. One then verifies as above that X^{(1)}, X^{(2)} and X^{(3)} are independent Lévy processes with symbols ψ^{(1)}(p) = \frac{1}{2} p·ap − ib·p,
ψ^{(2)}(p) = \int_{|y|>1} (1 − e^{ip·y}) ν(dy), and
ψ^{(3)}(p) = \int_{|y|≤1} ( 1 − e^{ip·y} + ip·y ) ν(dy).
3. In the construction of the α-stable process above, a compensation was not required because
for a symmetric Lévy measure the approximating processes are already martingales.
for all f ∈ C_0^∞(R^d), where a_{ij}, b, c ∈ C(R^d), a(x) is non-negative definite and c(x) ≤ 0 for all x, and ν(x, ·) is a kernel of positive (Radon) measures.
3. If Pt is the transition semigroup of a diffusion (i.e. a Markov process with continuous paths)
then L is a local operator, and a representation of type (3.4) holds with ν ≡ 0.
We will not prove assertion 1. The proof of 2. is left as an exercise. We now sketch an independent proof of 3.; for a detailed proof we refer to volume one of Rogers, Williams [18]:
c) Taylor expansion: Fix x ∈ Rd and f ∈ C0∞ (Rd ). Let ϕ, ϕi ∈ C0∞ (Rd ) such that
ϕ(y) = 1 and ϕi (y) = yi − xi for all y in a neighborhood of x. Then in a neighborhood U
of x,
f(y) = f(x) · ϕ(y) + \sum_{i=1}^d \frac{∂f}{∂x_i}(x) ϕ_i(y) + \frac{1}{2} \sum_{i,j=1}^d \frac{∂²f}{∂x_i ∂x_j}(x) ϕ_i(y)ϕ_j(y) + R(y)
(Lf)(x) = c · f(x) + b·∇f(x) + \frac{1}{2} \sum_{i,j} a_{ij} \frac{∂²f}{∂x_i ∂x_j}(x) + (LR)(x)
where c := Lϕ(x), b_i := Lϕ_i(x) and a_{ij} := L(ϕ_i ϕ_j)(x). In order to show (LR)(x) = 0 we apply the local maximum principle. For ε ∈ R choose R_ε ∈ C_0^∞(R^d) such that
R_ε(y) = R(y) − ε \sum_{i=1}^d ϕ_i(y)²
on U. Then for ε > 0, R_ε has a local maximum at x, and hence LR_ε(x) ≤ 0. For ε ↓ 0
we obtain LR(x) ≤ 0. Similarly, for ε < 0, −Rε has a local maximum at x and hence
LRε ≥ 0. For ε ↑ 0 we obtain LR(x) ≥ 0, and thus LR(x) = 0.
Chapter 4
Convergence to equilibrium
Our goal in the following sections is to relate the long time asymptotics (t ↑ ∞) of a time-
homogeneous Markov process (respectively its transition semigroup) to its infinitesimal charac-
teristics which describe the short-time behavior (t ↓ 0):
Although this is usually limited to the time-homogeneous case, some of the results can be applied
to time-inhomogeneous Markov processes by considering the space-time process (t, Xt ), which
is always time-homogeneous. On the other hand, we would like to take into account processes
that jump instantaneously (as e.g. interacting particle systems on Zd ) or have continuous trajecto-
ries (diffusion-processes). In this case it is not straightforward to describe the process completely
in terms of infinitesimal characteristics, as we did for jump processes. A convenient general setup
that can be applied to all these types of Markov processes is the martingale problem of Stroock
and Varadhan.
if \int f dµ = 0 for all f ∈ A, then µ = 0 (i.e. A is measure-determining).
Let
L : A ⊆ Fb (S) → Fb (S)
be a linear operator.
Definition 4.1. An adapted right-continuous stochastic process ((X_t)_{t≥0}, (F_t)_{t≥0}, P) is called a solution of the (local) martingale problem for the operator (L, A) if and only if
M_t^f := f(X_t) − \int_0^t (L f)(X_s) ds
is a (local) martingale for all f ∈ A.
Example. 1. Jump processes: A minimal Markov jump process solves the martingale problem for its generator
(L f)(x) = \int q(x, dy) ( f(y) − f(x) )
with domain
A = {f ∈ Fb (S) : L f ∈ Fb (S)} ,
cf. above.
d
2. Interacting particle systems: An interacting particle system with configuration space T Z
as constructed in the last section solves the martingale problem for the operator
XX
ci (x, µ) · f µx,i − f (µ)
(L f )(µ) = (4.1)
x∈Zd i∈T
Note that for a cylinder function only finitely many summands in (4.1) do not vanish. Hence
L f is well-defined.
3. Diffusions: Suppose S = Rn . By Itô’s formula, any (weak) solution ((Xt )t≥0 , P ) of the
stochastic differential equation
dXt = σ(Xt ) dBt + b(Xt ) dt
with an R^d-valued Brownian motion B_t and locally bounded measurable functions σ : R^n →
R^{n×d}, b : R^n → R^n, solves the martingale problem for the differential operator

(L f)(x) = ½ Σ_{i,j=1}^n a_ij(x) (∂²f/∂x_i∂x_j)(x) + b(x)·∇f(x),   a(x) = σ(x)σ(x)^T,
with domain C²(R^n), and the martingale problem for the same operator with domain A =
C_0²(R^n), provided there is no explosion in finite time. The case of explosion can be included
by extending the state space to R^n ∪ {∆} and setting f(∆) = 0 for f ∈ C_0²(R^n).
4. Lévy processes: A Lévy process solves the martingale problem for its generator
L f = −(ψ f̂ )ˇ.
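Two elementary special cases of the examples above, stated for concreteness:
(a) Poisson process with rate λ: here q(x, dy) = λ δ_{x+1}(dy), so
(L f)(x) = λ ( f(x + 1) − f(x) ),
and formally, for f(x) = x, the martingale M_t^f = X_t − λt is the compensated Poisson process.
(b) Brownian motion (σ = I_n, b = 0, hence L = ½∆): by Itô's formula,
f(B_t) − ∫_0^t ½ ∆f(B_s) ds = f(B_0) + ∫_0^t ∇f(B_s)·dB_s
is a martingale for every f ∈ C_0²(R^n).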
From now on we assume that we are given a right continuous time-homogeneous Markov process
((Xt )t≥0 , (Ft )t≥0 , (Px )x∈S ) with transition semigroup (pt )t≥0 such that for any x ∈ S, (Xt )t≥0 is
under Px a solution of the martingale problem for (L , A ) with Px [X0 = x] = 1.
By passing to the canonical model, we may assume that for each x ∈ S, P_x is a probability measure on
Ω = D(R+, S) such that with respect to P_x, the canonical process X_t(ω) = ω(t) is a solution of the
martingale problem for (L , A ) satisfying P_x[X_0 = x] = 1.
If, in addition,
(iii) for any x ∈ S, P_x is the unique probability measure on D(R+, S) solving the martingale
problem with P_x[X_0 = x] = 1,
then (X_t, P_x) is a strong Markov process, cf. e.g. Rogers, Williams [18], Volume 1.
Let Ā denote the closure of A with respect to the supremum norm. For most results derived
below, we will impose two additional assumptions:
Assumptions:
(A2) There exists a linear subspace A0 ⊆ A such that if f ∈ A0 , then pt f ∈ A for all t ≥ 0,
and A0 is dense in A with respect to the supremum norm.
Example. 1. For Lévy processes, (A1) and (A2) hold with A_0 = A = 𝒮(R^d) (the Schwartz space) and B = Ā = C_∞(R^d).
3. In general, it can be difficult to determine explicitly a space A0 such that (A2) holds. In
this case, a common procedure is to approximate the Markov process and its transition
semigroup by more regular processes (e.g. non-degenerate diffusions in Rd ), and to derive
asymptotic properties from corresponding properties of the approximands.
4. For an interacting particle system on T^{Z^d} with bounded transition rates c_i(x, η), the conditions (A1) and (A2) hold with

A_0 = A = { f : T^{Z^d} → R : |||f||| < ∞ },

where

|||f||| = Σ_{x∈Z^d} ∆f(x),   ∆f(x) = sup_{i∈T, η} | f(η^{x,i}) − f(η) |.
Theorem 4.2 (From the martingale problem to the Kolmogorov equations). Suppose (A1) and
(A2) hold. Then (pt )t≥0 induces a C0 contraction semigroup (Pt )t≥0 on the Banach space B =
Ā = Ā_0, and the generator is an extension of (L , A ). In particular, the forward and backward
equations

(d/dt) p_t f = p_t L f   ∀ f ∈ A

and

(d/dt) p_t f = L p_t f   ∀ f ∈ A_0

hold.
Proof. Since M_t^f is a bounded martingale with respect to P_x, we obtain the integrated forward
equation by Fubini:

(p_t f)(x) − f(x) = E_x[ f(X_t) − f(X_0) ] = E_x[ ∫_0^t (L f)(X_s) ds ] = ∫_0^t (p_s L f)(x) ds.   (4.2)

In particular,

‖p_t f − f‖_sup ≤ ∫_0^t ‖p_s L f‖_sup ds ≤ t · ‖L f‖_sup → 0   as t ↓ 0,

and

(p_t f − f)/t − L f = (1/t) ∫_0^t ( p_s L f − L f ) ds → 0   as t ↓ 0,
uniformly for all f ∈ A , i.e. A is contained in the domain of the generator L of the semigroup
(Pt )t≥0 induced on B, and Lf = L f for all f ∈ A . Now the forward and the backward
equations follow from the corresponding equations for (Pt )t≥0 and Assumption (A2).
Theorem 4.3 (Infinitesimal characterization of stationary distributions). Suppose (A1) and (A2)
hold. Then for µ ∈ M1(S) the following assertions are equivalent:
(i) (X_{t+s})_{s≥0} ∼ (X_s)_{s≥0} under Pµ for all t ≥ 0 (stationarity of the process).
(ii) µ p_t = µ for all t ≥ 0 (µ is a stationary distribution).
(iii)
∫ L f dµ = 0   ∀ f ∈ A
Proof. (ii)⇒(i): By the Markov property, for any measurable subset B ⊆ D(R+, S),

Pµ[ (X_{t+s})_{s≥0} ∈ B ] = Eµ[ P_{X_t}[B] ] = ∫ P_x[B] (µp_t)(dx) = ∫ P_x[B] µ(dx) = Pµ[B].

(i)⇒(ii) follows by considering the one-dimensional distributions, and (ii)⇒(iii) follows from (4.2).
(iii)⇒(ii): For f ∈ A_0 we have p_t f ∈ A by (A2), hence by the backward equation

(d/dt) ∫ p_t f dµ = ∫ L p_t f dµ = 0,

i.e.

∫ p_t f dµ = ∫ f dµ   (4.3)

for all f ∈ A_0 and t ≥ 0. Since A_0 is dense in A with respect to the supremum norm,
(4.3) extends to all f ∈ A . Hence µp_t = µ for all t ≥ 0 by (A0).
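For instance, if S is finite and L is given by a Q-matrix L(x, y) (cf. Section 4.3 below), condition (iii) reads

0 = ∫ L f dµ = Σ_y f(y) Σ_x µ(x) L(x, y)   for all f, i.e. µL = 0,

the familiar algebraic equation for a stationary distribution.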
Example (Itô diffusions). Consider a weak solution (X_t, P_x) of the stochastic differential equation
dX_t = σ(X_t) dB_t + b(X_t) dt, where (B_t)_{t≥0} is a Brownian motion in R^d and the functions
σ : R^n → R^{n×d} and b : R^n → R^n are locally Lipschitz continuous. Then by Itô's formula,
(X_t, P_x) solves the martingale problem
for the operator

L = ½ Σ_{i,j=1}^n a_ij(x) ∂²/∂x_i∂x_j + b(x)·∇,   a = σσ^T,
with domain A = C0∞ (Rn ). Moreover, the local Lipschitz condition implies uniqueness of
strong solutions, and hence, by the theorem of Yamada-Watanabe, uniqueness in distribution of
weak solutions and uniqueness of the martingale problem for (L , A ), cf. e.g. Rogers/Williams
[18]. Therefore by the remark above, (Xt , Px ) is a Markov process.
Theorem 4.4. Suppose µ is a stationary distribution of (X_t, P_x) that has a smooth density ϱ with
respect to the Lebesgue measure. Then

L* ϱ := ½ Σ_{i,j=1}^n ∂²/∂x_i∂x_j ( a_ij ϱ ) − div( b ϱ ) = 0.
Here the last equation follows by integration by parts, because f has compact support.
L f = (a/2) f′′ + b f′ = 0  ⟺  f′ = C_1 exp( −∫_0^• (2b/a) dx ), C_1 ∈ R
  ⟺  f = C_2 + C_1 · s, C_1, C_2 ∈ R,
where

s(x) := ∫_0^x exp( −∫_0^y (2b(z)/a(z)) dz ) dy
is a strictly increasing harmonic function that is called the scale function or natural scale of the diffusion.
In particular, s(Xt ) is a martingale with respect to Px . The stopping theorem implies
P_x[ T_a < T_b ] = ( s(b) − s(x) ) / ( s(b) − s(a) )   ∀ a < x < b
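(For Brownian motion, a ≡ 1 and b ≡ 0, so s(x) = x and P_x[T_a < T_b] = (b − x)/(b − a), the classical ruin probability.)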
As a consequence,
(i) If s(∞) < ∞ or s(−∞) > −∞ then Px [|Xt | → ∞] = 1 for all x ∈ R, i.e., (Xt , Px ) is
transient.
(ii) If s(R) = R then Px [Ta < ∞] = 1 for all x, a ∈ R, i.e., (Xt , Px ) is irreducible and
recurrent.
b) Stationary distributions:
(i) s(R) 6= R: In this case, by the transience of (Xt , Px ), a stationary distribution does not
exist. In fact, if µ is a finite stationary measure, then for all t, r > 0,

µ({x : |x| ≤ r}) = (µp_t)({x : |x| ≤ r}) = ∫ P_x[ |X_t| ≤ r ] µ(dx).

Since (X_t) is transient, the right hand side converges to 0 as t ↑ ∞ by dominated convergence.
Hence µ({x : |x| ≤ r}) = 0 for all r > 0, i.e., µ ≡ 0.
Here the last equivalence holds since s′aϱ ≥ 0 and s(R) = R imply C_2 = 0. Hence a
stationary distribution µ can only exist if the measure

m(dy) := (1/a(y)) e^{∫_0^y (2b/a) dx} dy

is finite, and in this case µ = m/m(R). The measure m is called the speed measure of the
diffusion.
Concrete examples:
2. Ornstein-Uhlenbeck process: dX_t = dB_t − γX_t dt, γ > 0. Here s′(x) = e^{γx²}, so s(R) = R
(recurrence), and the speed measure m(dy) = e^{−γy²} dy is finite; hence the normalized Gaussian
measure µ = N(0, 1/(2γ)) is the unique stationary distribution.
3. dX_t = dB_t + b(X_t) dt, b ∈ C², with b(x) = 1/x for |x| ≥ 1: transient; there are two linearly
independent non-negative solutions of L* ϱ = 0 with ∫ ϱ dx = ∞.
(Exercise: stationary distributions for dX_t = dB_t − γ/(1 + |X_t|) dt.)
Proposition 4.5. For L = b·∇ and µ = ϱ dx,
L* ϱ = 0  ⟺  div(ϱb) = 0  ⟺  (L , C_0^∞(R^n)) is anti-symmetric on L²(µ).
Second equivalence:

∫ f L g dµ = ∫ f b·∇g ϱ dx = −∫ div( f b ϱ ) g dx
  = −∫ (L f) g dµ − ∫ div(ϱb) f g dx   ∀ f, g ∈ C_0^∞
Theorem 4.6. Suppose (A1) and (A2) hold. Then for µ ∈ M1 (S) the following assertions are
equivalent:
(i) The process (Xt , Pµ ) is invariant with respect to time reversal, i.e.,
(Xs )0≤s≤t ∼ (Xt−s )0≤s≤t with respect to Pµ ∀ t ≥ 0
(ii)
µ(dx)pt (x, dy) = µ(dy)pt (y, dx) ∀ t ≥ 0
(iii) p_t is µ-symmetric, i.e.,

∫ f p_t g dµ = ∫ (p_t f) g dµ   ∀ f, g ∈ F_b(S)
(ii) La∗ µ = 0
(iii) div(ϱβ) = 0
(iv) (La , C0∞ ) is anti-symmetric with respect to µ
Proof. Let

E(f, g) := −∫ f L g dµ   (f, g ∈ C_0^∞)
denote the bilinear form of the operator (L , C0∞ (Rn )) on the Hilbert space L2 (Rn , µ). We de-
compose E into a symmetric part and a remainder. An explicit computation based on the integra-
tion by parts formula in Rn shows that for g ∈ C0∞ (Rn ) and f ∈ C ∞ (Rn ):
E(f, g) = −∫ ( ½ Σ_{i,j} a_ij ∂²g/∂x_i∂x_j + b·∇g ) f ϱ dx
  = ½ Σ_{i,j} ∫ (∂/∂x_i)( ϱ a_ij f ) (∂g/∂x_j) dx − ∫ f b·∇g ϱ dx
  = ½ Σ_{i,j} ∫ a_ij (∂f/∂x_i)(∂g/∂x_j) ϱ dx − ∫ f β·∇g ϱ dx   ∀ f, g ∈ C_0^∞,

where β_j := b_j − (2ϱ)^{−1} Σ_i ∂(ϱ a_ij)/∂x_i as above.
and set

E_s(f, g) := ½ Σ_{i,j} ∫ a_ij (∂f/∂x_i)(∂g/∂x_j) ϱ dx = −∫ f L_s g dµ,
E_a(f, g) := −∫ f β·∇g ϱ dx = −∫ f L_a g dµ,

so that E = E_s + E_a with L_a = β·∇.
This proves 1) and, since Es is a symmetric bilinear form, also 2). Moreover, the assertions (i)
and (ii) of 3) are equivalent, since
−∫ L g dµ = E(1, g) = E_s(1, g) + E_a(1, g) = −∫ L_a g dµ
for all g ∈ C0∞ (Rn ) since Es (1, g) = 0. Finally, the equivalence of (ii),(iii) and (iv) has been
shown in the example above.
Example. L = ½∆ + b·∇, b ∈ C(R^n, R^n). Here a = I and β = b − ∇ϱ/(2ϱ), so

(L , C_0^∞) is µ-symmetric  ⟺  β = b − ∇ϱ/(2ϱ) = 0
  ⟺  b = ∇ϱ/(2ϱ) = ½ ∇ log ϱ,
L is symmetrizable  ⟺  b is a gradient,
L* µ = 0  ⟺  b = ½ ∇ log ϱ + β for some β with div(ϱβ) = 0.
Example.

X_t = x + B_t + ∫_0^t b(X_s) ds, non-explosive, with b = −½ ∇H.

Hence Pµ ∘ (X_{0:T})^{−1} is absolutely continuous with respect to the Wiener measure (started with the
Lebesgue measure as initial distribution), with density

exp( −½ H(B_0) − ½ H(B_T) − ∫_0^T ( ⅛ |∇H|² − ¼ ∆H )(B_s) ds ).

This density is invariant under the time reversal (B_s)_{0≤s≤T} ↦ (B_{T−s})_{0≤s≤T}, which exhibits the
reversibility of (X_t, Pµ).
by Jensen’s inequality and the stationarity of µ. As before, we assume that we are given a
Markov process with transition semigroup (pt )t≥0 solving the martingale problem for the op-
erator (L , A ). The assumptions on A0 and A can be relaxed in the following way:
(A0) as above
(A1') f, L f ∈ L^p(S, µ) for all f ∈ A and all 1 ≤ p < ∞
(A2’) A0 is dense in A with respect to the Lp (S, µ) norms, 1 ≤ p < ∞, and pt f ∈ A for all
f ∈ A0
(A3) 1 ∈ A
Remark. Condition (A0) implies that A , and hence A_0, is dense in L^p(S, µ) for all p ∈ [1, ∞).
In fact, if g ∈ L^q(S, µ), 1/p + 1/q = 1, satisfies ∫ f g dµ = 0 for all f ∈ A , then g dµ = 0 by (A0) and
hence g = 0 µ-a.e. Similarly as above, the conditions (A0), (A1') and (A2') imply that (p_t)_{t≥0}
induces a C0 semigroup on L^p(S, µ) for all p ∈ [1, ∞), and the generator (L^{(p)}, Dom(L^{(p)}))
extends (L , A ), i.e., A ⊆ Dom(L^{(p)}) and L^{(p)} f = L f for all f ∈ A .
Remark. More generally, E(f, g) is defined for all f ∈ L²(S, µ) and g ∈ Dom(L^{(2)}) by

E(f, g) = −( f, L^{(2)} g )_µ = −(d/dt)|_{t=0} ( f, p_t g )_µ.
Remark. 1. In particular,

E(f, f) = −½ (d/dt)|_{t=0} ∫ (p_t f)² dµ = −½ (d/dt)|_{t=0} Var_µ(p_t f),

since ∫ p_t f dµ = ∫ f dµ is constant in t, so that

(d/dt) Var_µ(p_t f) = (d/dt) ∫ (p_t f)² dµ.

By polarization,

E_s(f, g) = ¼ ( E_s(f + g, f + g) − E_s(f − g, f − g) ) = −½ (d/dt)|_{t=0} Cov_µ(p_t f, p_t g):
Dirichlet form = infinitesimal change of (co)variance.
2. Since p_t is a contraction on L²(µ), the operator (L , A ) is negative definite, and the
bilinear form (E , A ) is positive definite:

( f, −L f )_µ = E(f, f) = −lim_{t↓0} (1/(2t)) ( ∫ (p_t f)² dµ − ∫ f² dµ ) ≥ 0.
Corollary 4.10 (Decay of variance). For λ > 0 the following assertions are equivalent:
(i) Poincaré inequality:
Var_µ(f) ≤ (1/λ) E_s(f, f)   ∀ f ∈ A
(ii) Exponential decay of variance:
Varµ (pt f ) ≤ e−2λt Varµ (f ) ∀ f ∈ L2 (S, µ) (4.5)
Remark . Optimizing over λ, the corollary says that (4.5) holds with
λ := inf_{f∈A} E(f, f)/Var_µ(f) = inf_{f∈A, f⊥1 in L²(µ)} ( f, −L f )_µ / ( f, f )_µ.
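On a finite state space, both sides of this equivalence can be checked directly by linear algebra. The following is a minimal numerical sketch (assuming NumPy and SciPy; the rate matrix below is a toy example, not taken from the text):

import numpy as np
from scipy.linalg import expm

# Toy generator (Q-matrix) on S = {0, 1, 2}; rows sum to zero
L = np.array([[-2.0, 1.5, 0.5],
              [1.0, -1.5, 0.5],
              [0.5, 1.0, -1.5]])

# Stationary distribution: solve mu L = 0 with sum(mu) = 1
A = np.vstack([L.T, np.ones(3)])
mu = np.linalg.lstsq(A, np.array([0.0, 0.0, 0.0, 1.0]), rcond=None)[0]

# Symmetrized generator in L^2(mu): Ls = (L + L*)/2 with adjoint L* = D^{-1} L^T D
D = np.diag(mu)
Ls = 0.5 * (L + np.linalg.inv(D) @ L.T @ D)

# Spectral gap = smallest non-zero eigenvalue of -Ls
lam = sorted(np.linalg.eigvals(-Ls).real)[1]

f = np.array([1.0, -0.3, 0.7])
def var_mu(g):
    m = mu @ g
    return mu @ (g - m) ** 2

# Check Var(p_t f) <= exp(-2*lam*t) * Var(f) for several t
for t in [0.1, 0.5, 1.0, 2.0]:
    ptf = expm(t * L) @ f          # p_t f = e^{tL} f
    print(t, var_mu(ptf), np.exp(-2 * lam * t) * var_mu(f))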
1. Markov chains on a countable state space S. Adjoint:

L^{*µ}(y, x) = ( µ(x)/µ(y) ) L(x, y)
Proof.

(L f, g)_µ = Σ_{x,y} µ(x) L(x, y) f(y) g(x) = Σ_{x,y} µ(y) f(y) ( µ(x)/µ(y) ) L(x, y) g(x) = ( f, L^{*µ} g )_µ
Symmetric part:

L_s(x, y) = ½ ( L(x, y) + L^{*µ}(x, y) ) = ½ ( L(x, y) + ( µ(y)/µ(x) ) L(y, x) ),

so that

µ(x) L_s(x, y) = ½ ( µ(x) L(x, y) + µ(y) L(y, x) ).
Dirichlet form:

E_s(f, g) = −(L_s f, g)_µ = −Σ_{x,y} µ(x) L_s(x, y) ( f(y) − f(x) ) g(x)
  = −Σ_{x,y} µ(y) L_s(y, x) ( f(x) − f(y) ) g(y)
  = ½ Σ_{x,y} µ(x) L_s(x, y) ( f(y) − f(x) ) ( g(y) − g(x) ).

Hence

E(f, f) = E_s(f, f) = ½ Σ_{x,y} Q(x, y) ( f(y) − f(x) )²

where

Q(x, y) = µ(x) L_s(x, y) = ½ ( µ(x) L(x, y) + µ(y) L(y, x) );

note that Q(x, y) = Q(y, x).
2. Diffusions in Rn : Let
L = ½ Σ_{i,j} a_ij ∂²/∂x_i∂x_j + b·∇,
(ii) χ²-contrast:

χ²(ν|µ) := ∫ ( dν/dµ − 1 )² dµ = ∫ ( dν/dµ )² dµ − 1   if ν ≪ µ,   and χ²(ν|µ) := +∞ otherwise.
(i)

‖ν − µ‖ = ½ sup_{f∈F_b(S), |f|≤1} ( ∫ f dν − ∫ f dµ )
(ii)

λ²(ν|µ) = sup_{f∈F_b(S), ∫f² dµ ≤ 1} ( ∫ f dν − ∫ f dµ )²,

and, by replacing f by f − ∫ f dµ,

λ²(ν|µ) = sup_{f∈F_b(S), ∫f² dµ ≤ 1, ∫f dµ = 0} ( ∫ f dν )²
(iii)

H(ν|µ) = sup_{f∈F_b(S), ∫e^f dµ ≤ 1} ∫ f dν = sup_{f∈F_b(S)} ( ∫ f dν − log ∫ e^f dµ )
Remark. If ∫ e^f dµ ≤ 1, then ∫ f dµ ≤ log ∫ e^f dµ ≤ 0 by Jensen's inequality, and hence also

sup_{∫e^f dµ ≤ 1} ∫ f dν ≤ sup_f ( ∫ f dν − log ∫ e^f dµ ) = H(ν|µ).
Proof. (i) "≤": Given A ∈ S, set f := I_A − I_{A^c}. Then |f| ≤ 1 and

ν(A) − µ(A) = ½ ( ν(A) − µ(A) + µ(A^c) − ν(A^c) ) = ½ ( ∫ f dν − ∫ f dµ ).
"≥": If |f| ≤ 1 then, with a Hahn decomposition S = S_+ ∪ S_− for ν − µ,

∫ f d(ν − µ) = ∫_{S_+} f d(ν − µ) + ∫_{S_−} f d(ν − µ)
  ≤ (ν − µ)(S_+) − (ν − µ)(S_−)
  = 2 (ν − µ)(S_+)   (since (ν − µ)(S_+) + (ν − µ)(S_−) = (ν − µ)(S) = 0)
  ≤ 2 ‖ν − µ‖_TV.
R
This proves the first equation. The second equation follows by replacing f by f − f dµ.
"≥": By Young's inequality,

u v ≤ u log u − u + e^v

for all u ≥ 0 and v ∈ R, and hence for ν ≪ µ with density ϱ,

∫ f dν = ∫ f ϱ dµ ≤ ∫ ϱ log ϱ dµ − ∫ ϱ dµ + ∫ e^f dµ = H(ν|µ) − 1 + ∫ e^f dµ   ∀ f ∈ F_b(S),

which is ≤ H(ν|µ) if ∫ e^f dµ ≤ 1.
"≤": Let ν ≪ µ with density ϱ.
a) Suppose ε ≤ ϱ ≤ 1/ε for some ε > 0. Choosing f = log ϱ we have

H(ν|µ) = ∫ log ϱ dν = ∫ f dν

and

∫ e^f dµ = ∫ ϱ dµ = 1.
Corollary 4.13. The assertions (i) − (iii) in the corollary above are also equivalent to
Example (d = 1):
e.g. b(x) = −αx, ϱ(x) = const · e^{−αx²}, µ = Gaussian measure.
In this case:

∫_0^x 1/(gϱ) dz = 2 ∫_0^x g′ dz = 2 g(x),

so

∫_0^∞ ( f − f(0) )² ϱ dx ≤ ∫_0^∞ |f′|² ϱ dy · sup_{y>0} (…)
and hence

√3 · |x − 1| ≤ (4 + 2x)^{1/2} ( x log x − x + 1 )^{1/2},

where ‖ν − µ‖_TV ≤ 1. This leads to a bound for the Dobrushin coefficient (the contraction
coefficient with respect to ‖·‖_TV).
Proof.

‖νp_t − µ‖_TV ≤ ½ χ²(νp_t|µ)^{1/2} ≤ ½ e^{−λt} χ²(ν|µ)^{1/2} ≤ e^{−λt} (min µ)^{−1/2} ‖ν − µ‖_TV

if S is finite.
where the first summand is the L2 relaxation time and the second is called burn-in period, i.e.
the time needed to make up for a bad initial distribution.
Remark . On high or infinite-dimensional state spaces the bound (4.6) is often problematic since
χ2 (ν|µ) can be very large (whereas kν − µkTV ≤ 1). For example for product measures,
χ²(ν^n | µ^n) = ∫ ( dν^n/dµ^n )² dµ^n − 1 = ( ∫ (dν/dµ)² dµ )^n − 1,

where ∫ (dν/dµ)² dµ > 1 (for ν ≠ µ), so the χ²-contrast grows exponentially in n.
Classical Sobolev constants, however, are dimension-dependent! This leads to replacing them by the
logarithmic Sobolev inequality.
4.4 Hypercontractivity
Additional reference for this chapter:
• Gross [11]
• Ané [2]
• Royer [19]
We consider the setup from section 4.3. In addition, we now assume that (L , A ) is symmetric
on L2 (S, µ).
Theorem 4.17. With assumptions (A0)-(A3) and α > 0, the following statements are equivalent:
(i) µ satisfies a logarithmic Sobolev inequality with constant α:

∫_S f² log ( f² / ‖f‖²_{L²(µ)} ) dµ ≤ α E(f, f)   ∀ f ∈ A

(ii) (p_t)_{t≥0} is hypercontractive: for any p ∈ (1, ∞) and t ≥ 0,

‖p_t f‖_{L^{q(t)}(µ)} ≤ ‖f‖_{L^p(µ)}   where q(t) = 1 + (p − 1) e^{4t/α}.

Remark. Combined with a spectral gap λ, hypercontractivity (with p = 2) yields, for f ∈ L²(µ) with ∫ f dµ = 0,

‖p_t f‖_{L^q(µ)} = ‖p_{t_0} p_{t−t_0} f‖_{L^q(µ)} ≤ ‖p_{t−t_0} f‖_{L²(µ)} ≤ e^{−λ(t−t_0)} ‖f‖_{L²(µ)}

for all t ≥ t_0(q) := (α/4) log(q − 1).
2. Stroock estimate:
E( f^{q−1}, f ) ≥ ( 4(q − 1)/q² ) E( f^{q/2}, f^{q/2} )
Proof.

E( f^{q−1}, f ) = −( f^{q−1}, L f )_µ = lim_{t↓0} (1/t) ( f^{q−1}, f − p_t f )_µ
  = lim_{t↓0} (1/2t) ∫∫ ( f^{q−1}(y) − f^{q−1}(x) ) ( f(y) − f(x) ) p_t(x, dy) µ(dx)
  ≥ ( 4(q − 1)/q² ) lim_{t↓0} (1/2t) ∫∫ ( f^{q/2}(y) − f^{q/2}(x) )² p_t(x, dy) µ(dx)
  = ( 4(q − 1)/q² ) E( f^{q/2}, f^{q/2} ),
where we have used the elementary inequality

( a^{q/2} − b^{q/2} )² ≤ ( q² / 4(q − 1) ) ( a^{q−1} − b^{q−1} ) ( a − b )   ∀ a, b > 0, q ≥ 1.
Remark. – The estimate justifies the use of functional inequalities with respect to
E to bound L^p norms.
– For generators of diffusions, equality holds, e.g.

∫ ∇f^{q−1} · ∇f dµ = ( 4(q − 1)/q² ) ∫ |∇f^{q/2}|² dµ

by the chain rule.
3. Combining the estimates:
q(t) · ‖p_t f‖_{q(t)}^{q(t)−1} (d/dt) ‖p_t f‖_{q(t)} = (d/dt) ∫ (p_t f)^{q(t)} dµ − q′(t) ∫ (p_t f)^{q(t)} log ‖p_t f‖_{q(t)} dµ,

where

∫ (p_t f)^{q(t)} dµ = ‖p_t f‖_{q(t)}^{q(t)}.
4. Applying the logarithmic Sobolev inequality: Fix p ∈ (1, ∞). Choose q(t) such that
α q′(t) = 4 ( q(t) − 1 ) and q(0) = p, i.e.

q(t) = 1 + (p − 1) e^{4t/α}.

Then by the logarithmic Sobolev inequality, the right hand side in the estimate above
is negative, and hence ‖p_t f‖_{q(t)} is decreasing. Thus ‖p_t f‖_{q(t)} ≤ ‖f‖_{L^p(µ)} for all t ≥ 0,
which proves the hypercontractivity.
Theorem 4.18 (Rothaus). A logarithmic Sobolev inequality with constant α implies the Poincaré
inequality Var_µ(f) ≤ (α/2) E(f, f), i.e. a spectral gap λ ≥ 2/α.
R
Proof. f ∈ L2 (µ), g dµ = 0, f := 1 + εg, f 2 = 1 + 2εg + ε2 g 2 ,
Z Z
2
f dµ = 1 + ε 2
g 2 dµ, E (f, f ) = E (1, 1) + 2E (1, g) + ε2 E (g, g)
Z Z Z
f log f dµ ≤ αE (f, f ) + f dµ log f 2 dµ ∀ ε > 0
2 2 2
Theorem 4.19 (Exponential decay of relative entropy). 1. H(νpt |µ) ≤ H(ν|µ) for all t ≥ 0
and ν ∈ M1 (S).
2. If a logarithmic Sobolev inequality with constant α > 0 holds then
H(νp_t|µ) ≤ e^{−2t/α} H(ν|µ)
Proof. We first consider the diffusion case, where E(f, f) = ½ ∫ |∇f|² dµ and the logarithmic
Sobolev inequality reads

∫ f² log ( f² / ‖f‖²_{L²(µ)} ) dµ ≤ (α/2) ∫ |∇f|² dµ = α E(f, f).
(i) Suppose ν = g·µ with 0 < ε ≤ g ≤ 1/ε for some ε > 0. Then νp_t ≪ µ with density
p_t g and ε ≤ p_t g ≤ 1/ε (since ∫ f d(νp_t) = ∫ p_t f dν = ∫ p_t f · g dµ = ∫ f · p_t g dµ by symmetry).
This implies that

(d/dt) H(νp_t|µ) = (d/dt) ∫ p_t g log p_t g dµ = ∫ (L p_t g)( 1 + log p_t g ) dµ
by the Kolmogorov equation and the fact that (x log x)′ = 1 + log x. Since ∫ L p_t g dµ = 0, we get

(d/dt) H(νp_t|µ) = −E( p_t g, log p_t g ) = −½ ∫ ∇p_t g · ∇ log p_t g dµ,

where ∇ log p_t g = ∇p_t g / p_t g. Hence
(d/dt) H(νp_t|µ) = −2 ∫ |∇√(p_t g)|² dµ.   (4.7)
1. −2 ∫ |∇√(p_t g)|² dµ ≤ 0, which proves the first assertion.
2. The logarithmic Sobolev inequality, applied to f = √(p_t g), yields

−2 ∫ |∇√(p_t g)|² dµ ≤ −(4/α) ∫ p_t g · log ( p_t g / ∫ p_t g dµ ) dµ,

where ∫ p_t g dµ = ∫ g dµ = 1, and hence

−2 ∫ |∇√(p_t g)|² dµ ≤ −(4/α) H(νp_t|µ).
(ii) Now consider a general ν. If ν is not absolutely continuous with respect to µ, then H(ν|µ) = ∞
and the assertion is trivial. Otherwise let ν = g·µ, g ∈ L¹(µ), and for 0 < a < b define the truncated
measures ν_{a,b} := c_{a,b} (( g ∨ a ) ∧ b)·µ with normalizing constants c_{a,b}. Then by (i),

H(ν_{a,b} p_t|µ) ≤ e^{−2t/α} H(ν_{a,b}|µ).

The claim now follows for a ↓ 0 and b ↑ ∞ by dominated and monotone convergence.
Remark. 1. The proof in the general case is analogous; just replace (4.7) by the inequality

4 E( √f, √f ) ≤ E( f, log f ).
2. An advantage of the entropy over the χ² contrast is its good behavior in high dimensions.
E.g. for product measures,

H(ν^d|µ^d) = d · H(ν|µ).
Corollary (decay in total variation).

‖νp_t − µ‖_TV ≤ (1/√2) e^{−t/α} H(ν|µ)^{1/2}
  ( ≤ (1/√2) ( log (1/min µ) )^{1/2} e^{−t/α}   if S is finite ).
Proof.

‖νp_t − µ‖_TV ≤ (1/√2) H(νp_t|µ)^{1/2} ≤ (1/√2) e^{−t/α} H(ν|µ)^{1/2},

where we use Pinsker's inequality for the first step and Theorem ??? for the second. If S is finite, then

H(δ_x|µ) = log ( 1/µ(x) ) ≤ log ( 1/min µ )   ∀ x ∈ S,
which leads to

H(ν|µ) ≤ Σ_x ν(x) H(δ_x|µ) ≤ log ( 1/min µ )   ∀ ν,

since ν = Σ_x ν(x) δ_x is a convex combination and H(·|µ) is convex in its first argument.
Example (two-point space). S = {0, 1}, µ({1}) = p, µ({0}) = q = 1 − p, with jump rates
L(0, 1) = p and L(1, 0) = q.
Dirichlet form:
E(f, f) = ½ Σ_{x,y} ( f(y) − f(x) )² µ(x) L(x, y) = pq · |f(1) − f(0)|² = Var_µ(f)
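(Indeed, Var_µ(f) = p f(1)² + q f(0)² − ( p f(1) + q f(0) )² = pq ( f(1) − f(0) )², so the Dirichlet form and the variance coincide here.)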
Spectral gap:

λ(p) = inf_{f not const.} E(f, f)/Var_µ(f) = 1, independent of p!
The optimal logarithmic Sobolev constant α(p), on the other hand, goes to infinity as p ↓ 0 or p ↑ 1!
[Figure: α(p) as a function of p ∈ (0, 1), with minimal value 2 at p = ½.]
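For comparison, the optimal logarithmic Sobolev constant of the two-point space is known explicitly (cf. e.g. Diaconis and Saloff-Coste):

α(p) = ( log p − log q ) / ( p − q ) for p ≠ ½,   α(½) = 2,

consistent with the sketch above.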
Ent_µ(f) := ∫ f log f dµ   (f > 0).

Lemma (tensorization). For a product measure µ = µ_1 ⊗ ⋯ ⊗ µ_n:
1.
Var_µ(f) ≤ Σ_{i=1}^n E_µ[ Var^{(i)}_{µ_i}(f) ],
where on the right hand side the variance is taken with respect to the i-th variable.
2.
Ent_µ(f) ≤ Σ_{i=1}^n E_µ[ Ent^{(i)}_{µ_i}(f) ].
Proof. 1. Exercise.
2. Let g be bounded with E_µ[e^g] = 1, and decompose g = Σ_{i=1}^n g_i with (one standard choice)

g_i := log ( E_µ[ e^g | x_i, …, x_n ] / E_µ[ e^g | x_{i+1}, …, x_n ] ),

so that

E^{(i)}_{µ_i}[ e^{g_i} ] = 1   for all 1 ≤ i ≤ n.

By the variational characterization of the entropy, applied in the i-th variable,

E_µ[f g] = Σ_{i=1}^n E_µ[f g_i],   with E^{(i)}_{µ_i}[f g_i] ≤ Ent^{(i)}_{µ_i}(f),

⇒ Ent_µ(f) = sup_{E_µ[e^g]=1} E_µ[f g] ≤ Σ_{i=1}^n E_µ[ Ent^{(i)}_{µ_i}(f) ].
Theorem (factorization). 1. Suppose each µ_i satisfies a Poincaré inequality Var_{µ_i}(f) ≤ (1/λ_i) E_i(f, f).
Then µ = µ_1 ⊗ ⋯ ⊗ µ_n satisfies a Poincaré inequality with respect to

E(f, f) = Σ_{i=1}^n E_µ[ E_i^{(i)}(f, f) ]

with constant

λ = min_{1≤i≤n} λ_i.

2. The corresponding assertion holds for logarithmic Sobolev inequalities, with α = max α_i.
Proof.

Var_µ(f) ≤ Σ_{i=1}^n E_µ[ Var^{(i)}_{µ_i}(f) ] ≤ ( 1/min λ_i ) · E(f, f),

since

Var^{(i)}_{µ_i}(f) ≤ (1/λ_i) E_i(f, f).
Ent_{µ^n}(f)
≤ α(p)·p·q· Σ_{i=1}^n ∫ | f(x_1, …, x_{i−1}, 1, x_{i+1}, …, x_n) − f(x_1, …, x_{i−1}, 0, x_{i+1}, …, x_n) |² µ^n(dx_1, …, dx_n),

with a constant independent of n.
Generator:

L = ½ ( ∆ − ∇H·∇ );

µ(dx) = e^{−H(x)} dx satisfies L* µ = 0.

Assumption (convexity):

∂²H(x) ≥ κ·I   ∀ x ∈ R^n,
i.e. ξ·∂²H(x) ξ ≥ κ·|ξ|²   ∀ ξ ∈ R^n.
By (4.9), the measure µ is finite, hence by our results above, the normalized measure is a station-
ary distribution for pt .
2. If we replace Rn by an arbitrary Riemannian manifold the same assertion holds under the
assumption
Ric + Hess H ≥ κ · I
(Bochner-Lichnerowicz-Weitzenböck).
Formally,

(∂/∂t) ∇p_t f = ∇ (∂/∂t) p_t f = ∇ L p_t f = L⃗ ∇p_t f,

where L⃗ denotes the corresponding operator acting on vector fields, and hence

(∂/∂t) |∇p_t f| = (∂/∂t) ( ∇p_t f · ∇p_t f )^{1/2} = ( (∂/∂t) ∇p_t f · ∇p_t f ) / |∇p_t f|
  = ( L⃗ ∇p_t f · ∇p_t f ) / |∇p_t f| ≤ ··· ≤ L |∇p_t f| − κ |∇p_t f|,

using ∂²H ≥ κ·I. Hence, for fixed t > 0, the function v(s) := e^{κs} p_{t−s}( |∇p_s f| ) is non-increasing
on [0, t], and therefore

e^{κt} |∇p_t f| = v(t) ≤ v(0) = p_t |∇f|.
• The proof can be made rigorous by approximating | · | by a smooth function, and using
regularity results for pt , cf. e.g. Deuschel, Stroock[8].
Probabilistic proof: p_t f(x) = E[f(X_t^x)], where X_t^x is the solution flow of the stochastic
differential equation. By the assumption on H one can show that x ↦ X_t^x is smooth and that the
derivative flow Y_t^x = ∇_x X_t^x satisfies the differentiated stochastic differential equation; this yields

|∇p_t f(x)| ≤ e^{−κt} p_t |∇f|(x).
Then

∫ f² log ( f² / ‖f‖²_{L²(µ)} ) dµ ≤ (1/κ) ∫ |∇f|² dµ   ∀ f ∈ C_0^∞(R^n).
Remark. The inequality extends to f ∈ H^{1,2}(µ), where H^{1,2}(µ) is the closure of C_0^∞ with respect
to the norm

‖f‖_{1,2} := ( ∫ ( |f|² + |∇f|² ) dµ )^{1/2}.
(iii) Key step: By the computation above (decay of entropy) and the lemma,

−u′(t) = ½ ∫ ∇p_t g · ∇ log p_t g dµ = ½ ∫ ( |∇p_t g|² / p_t g ) dµ
  ≤ ½ e^{−2κt} ∫ ( (p_t |∇g|)² / p_t g ) dµ ≤ ½ e^{−2κt} ∫ p_t( |∇g|²/g ) dµ
  = ½ e^{−2κt} ∫ ( |∇g|²/g ) dµ = 2 e^{−2κt} ∫ |∇√g|² dµ,

where the second inequality holds by the Cauchy-Schwarz inequality applied to p_t, and the following
equality by the stationarity of µ.
Corollary 4.25. If

inf_{x∈R} V′′(x) > Σ_{i∈Z} |ϑ(i)|,

then E satisfies a logarithmic Sobolev inequality with constant independent of Λ.
Proof.

( ∂²H/∂x_i∂x_j )(x) = V′′(x_i)·δ_ij − ϑ(i − j)
⇒ ∂²H ≥ ( inf V′′ − Σ_i |ϑ(i)| ) · I
Now consider a bounded perturbation ν of µ, given by

(dν/dµ)(x) = (1/Z) e^{−U(x)}.
If

∫ f² log ( f² / ‖f‖²_{L²(µ)} ) dµ ≤ α · ∫ |∇f|² dµ   ∀ f ∈ C_0^∞,

then

∫ f² log ( f² / ‖f‖²_{L²(ν)} ) dν ≤ α · e^{osc(U)} · ∫ |∇f|² dν   ∀ f ∈ C_0^∞,

where

osc(U) := sup U − inf U.
Proof.

∫ f² log ( f² / ‖f‖²_{L²(ν)} ) dν ≤ ∫ ( f² log f² − f² log ‖f‖²_{L²(µ)} − f² + ‖f‖²_{L²(µ)} ) dν   (4.11)

since

∫ f² log ( f² / ‖f‖²_{L²(ν)} ) dν ≤ ∫ ( f² log f² − f² log t² − f² + t² ) dν   ∀ t > 0.

Note that in (4.11) the integrand on the right hand side is non-negative. Hence

∫ f² log ( f² / ‖f‖²_{L²(ν)} ) dν ≤ (1/Z) · e^{−inf U} ∫ ( f² log f² − f² log ‖f‖²_{L²(µ)} − f² + ‖f‖²_{L²(µ)} ) dµ
  = (1/Z) · e^{−inf U} ∫ f² log ( f² / ‖f‖²_{L²(µ)} ) dµ
  ≤ (1/Z) · e^{−inf U} α ∫ |∇f|² dµ
  ≤ e^{sup U − inf U} α ∫ |∇f|² dν.
1. No interactions:

H(x) = Σ_{i∈Λ} ( x_i²/2 + V(x_i) ),   V : R → R bounded.

Hence

µ = ⊗_{i∈Λ} µ_V,

where

µ_V(dx) ∝ e^{−V(x)} γ(dx)

and γ(dx) is the standard normal distribution. Hence µ satisfies a logarithmic Sobolev
inequality with constant

α(µ) = α(µ_V) ≤ e^{osc(V)} α(γ) = 2 · e^{osc(V)}

by the factorization property. Hence we have independence of the dimension!
2. Weak interactions:

H(x) = Σ_{i∈Λ} ( x_i²/2 + V(x_i) ) − ϑ Σ_{i,j∈Λ, |i−j|=1} x_i x_j − ϑ Σ_{i∈Λ, j∉Λ, |i−j|=1} x_i z_j,
Theorem 4.27. If V is bounded, then there exists β > 0 such that for ϑ ∈ [−β, β] a
logarithmic Sobolev inequality holds with constant independent of Λ.
The proof is based on the exponential decay of the correlations Cov_µ(x_i, x_j) for Gibbs
measures, cf. ???, Course ???.
3. Discrete Ising model: One can show that for β < β_c (???) a logarithmic Sobolev inequality
holds on {−N, …, N}^d with constant of order O(N²), independent of the boundary conditions,
whereas for β > β_c and periodic boundary conditions the inverse spectral gap, and hence the
logarithmic Sobolev constant, grows exponentially in N, cf. [???].
4.6 Concentration of measure
Let (X_i) be independent identically distributed random variables with distribution µ. By the law of
large numbers,

(1/N) Σ_{i=1}^N U(X_i) → ∫ U dµ   (U ∈ L¹(µ)).
Cramér:

P[ | (1/N) Σ_{i=1}^N U(X_i) − ∫ U dµ | ≥ r ] ≤ 2 · e^{−N·I(r)},

I(r) = sup_{t∈R} ( t r − log ∫ e^{tU} dµ )   (the large deviations rate function).
Hence we have
• Gaussian concentration:

P[ | (1/N) Σ_{i=1}^N U(X_i) − ∫ U dµ | ≥ r ] ≤ e^{−N r²/c}   provided I(r) ≥ r²/c.
When does this hold? Is there an extension to the non-i.i.d. case? This leads to the question of
bounds for log ∫ e^{tU} dµ!
Theorem 4.28 (Herbst). If µ satisfies a logarithmic Sobolev inequality with constant α, then
for any Lipschitz function U ∈ C_b¹(R^d) with ‖∇U‖_sup ≤ 1:
(i)

(1/t) log ∫ e^{tU} dµ ≤ (α/4)·t + ∫ U dµ   ∀ t > 0,   (4.12)

where (1/t) log ∫ e^{tU} dµ can be seen as the free energy at inverse temperature t, (α/4)·t as a bound
for the entropic contribution, and ∫ U dµ as the average energy.
(ii)

µ( U ≥ ∫ U dµ + r ) ≤ e^{−r²/α}

(iii)

∫ e^{γ|x|²} dµ < ∞   ∀ γ < 1/α.
Proof. We may assume w.l.o.g. that 0 < ε ≤ U ≤ 1/ε. Applying the logarithmic Sobolev inequality to
f = e^{tU/2} (note |∇f|² = (t²/4) |∇U|² e^{tU}) gives

t ∫ U e^{tU} dµ ≤ α (t²/4) ∫ |∇U|² e^{tU} dµ + ∫ e^{tU} dµ · log ∫ e^{tU} dµ.

For Λ(t) := log ∫ e^{tU} dµ this implies

t Λ′(t) = t ∫ U e^{tU} dµ / ∫ e^{tU} dµ ≤ (α t²/4) ∫ |∇U|² e^{tU} dµ / ∫ e^{tU} dµ + Λ(t) ≤ α t²/4 + Λ(t),

since |∇U| ≤ 1. Hence

(d/dt) ( Λ(t)/t ) = ( t Λ′(t) − Λ(t) ) / t² ≤ α/4   ∀ t > 0.

Since

Λ(t) = Λ(0) + t·Λ′(0) + O(t²) = t ∫ U dµ + O(t²),

we obtain

Λ(t)/t ≤ ∫ U dµ + (α/4) t,

i.e. (i).
(ii) follows from (i) by the Markov inequality, and (iii) follows from (ii) with U (x) = |x|.
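For example, the standard normal distribution γ satisfies a logarithmic Sobolev inequality with α = 2 in the normalization used here, so (ii) yields the classical Gaussian concentration bound

γ( U ≥ ∫ U dγ + r ) ≤ e^{−r²/2}

for every function U with ‖∇U‖_sup ≤ 1.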
Proof. By the factorization property, µ^N satisfies a logarithmic Sobolev inequality with constant
α as well. Now apply the theorem to

Ũ(x) := (1/√N) Σ_{i=1}^N U(x_i),

noting that

∇Ũ(x_1, …, x_N) = (1/√N) ( ∇U(x_1), …, ∇U(x_N) ),

|∇Ũ(x)| = ( (1/N) Σ_{i=1}^N |∇U(x_i)|² )^{1/2} ≤ 1.
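Numerically, the dimension independence is easy to observe. A minimal sketch (assuming NumPy; U(x) = |x| is a toy 1-Lipschitz choice, and the empirical mean replaces ∫ U dµ):

import numpy as np

rng = np.random.default_rng(2)
N, n_samples = 50, 200_000
alpha = 2.0                                   # log Sobolev constant of N(0,1)

X = rng.standard_normal((n_samples, N))
U_tilde = np.abs(X).sum(axis=1) / np.sqrt(N)  # (1/sqrt(N)) * sum_i U(x_i)
m = U_tilde.mean()                            # empirical mean as proxy for int U dmu

for r in [0.5, 1.0, 1.5, 2.0]:
    empirical = (U_tilde >= m + r).mean()
    print(r, empirical, np.exp(-r**2 / alpha))  # empirical tail vs. Herbst bound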
Chapter 5
Canonical setup: Ω = D(R+, S), X_t(ω) = ω(t), (p_t) the transition semigroup, µ a stationary
distribution, and (Θ_t)_{t≥0} the time-shift, Θ_t(ω) = ω(t + ·). Then

Pµ ∘ Θ_t^{−1} = Pµ   ∀ t ≥ 0,

i.e. (Ω, F, Pµ, (Θ_t)_{t≥0}) is a dynamical system, where the Θ_t are measure preserving maps with
Θ_{t+s} = Θ_t ∘ Θ_s.
Definition 5.1.

ϑ := { A ∈ F : Θ_t^{−1}(A) = A ∀ t ≥ 0 }

is the σ-algebra of shift-invariant events. The dynamical system (Ω, F, Pµ, (Θ_t)_{t≥0}) is called ergodic
if and only if

Pµ[A] ∈ {0, 1}   ∀ A ∈ ϑ,

or, equivalently, if every bounded ϑ-measurable function is Pµ-almost surely constant. By the ergodic
theorem, for F ∈ L¹(Pµ),

(1/t) ∫_0^t F(Θ_s(ω)) ds → Eµ[F | ϑ](ω)   Pµ-a.s. as t ↑ ∞.
In particular,

(1/t) ∫_0^t F ∘ Θ_s ds → Eµ[F]   Pµ-a.s. in the ergodic case.
Remark. 1. The ergodic theorem implies P_x-a.s. convergence for µ-almost every x (since
Pµ = ∫ P_x µ(dx)).
Example. Ising model with Glauber dynamics on Z², β > β_crit (low temperature regime). There
exist two extremal stationary distributions µ_β^+ and µ_β^−, and P_{µ_β^+} and P_{µ_β^−} are both
ergodic. Hence

(1/t) ∫_0^t F ∘ Θ_s ds → E_{µ_β^+}[F]   P_{µ_β^+}-a.s., and → E_{µ_β^−}[F]   P_{µ_β^−}-a.s.

There is no assertion for initial distributions ν ⊥ µ_β^+, µ_β^−.
(i) Pµ is ergodic
Proof. (i)⇒(ii): If h is harmonic, then h(X_t) is a bounded martingale. Hence, applying the L²
martingale convergence theorem,

h(X_t) → M_∞ in L²(Pµ),   with M_∞ ∘ Θ_t = M_∞.

(ii)⇒(iii): Applying (ii) to h = I_B yields

h = I_B = const. µ-a.s.
(iii)⇔(iv): If reversibility holds, the assertion follows from the spectral theorem:
p_t is a symmetric C0 semigroup on L²(µ), the generator L is self-adjoint and negative definite, hence

p_t f = e^{tL} f = ∫_{(−∞,0]} e^{tλ} dP_λ f → P_{{0}} f = projection of f onto ker L

as t ↑ ∞, where (P_λ) denotes the spectral resolution of L.
Example. 1. Rotation on S¹: L = d/dϕ.
Note: For discontinuous martingales, ⟨M⟩_t is in general not the quadratic variation of the paths!
For f ∈ A , M^f is a martingale, and

⟨M^f⟩_t = ∫_0^t Γ(f, f)(X_s) ds   Pµ-a.s.,

where Γ(f, g) := L(fg) − f L g − g L f denotes the carré du champ operator. Heuristically:
f(X_t)² ∼ ∫_0^t (L f²)(X_s) ds,

and by the product rule,

2 f(X_t) ∫_0^t L f(X_s) ds = 2 ∫_0^t f(X_r) L f(X_r) dr + 2 ∫_0^t ( ∫_0^r L f(X_s) ds ) df(X_r),

where

f(X_r) ∼ ∫_0^r L f(X_s) ds.

Hence

( M_t^f )² = ( f(X_t) − ∫_0^t L f(X_s) ds )² ∼ ∫_0^t ( L f² − 2 f L f )(X_s) ds = ∫_0^t Γ(f, f)(X_s) ds.
Example. Diffusion in R^n,

L = ½ Σ_{i,j} a_ij(x) ∂²/∂x_i∂x_j + b(x)·∇.

Hence

Γ(f, g)(x) = Σ_{i,j} a_ij(x) (∂f/∂x_i)(x) (∂g/∂x_j)(x),   in particular Γ(f, f)(x) = | σ^T(x) ∇f(x) |²_{R^n},

for all f, g ∈ C_0^∞(R^n). Results for gradient diffusions on R^n (e.g. criteria for logarithmic Sobolev
inequalities) extend to general state spaces if |∇f|² is replaced by Γ(f, f)!
Theorem 5.5 (Central limit theorem for martingales). Let (M_t) be a square-integrable martingale on
(Ω, F, P) with stationary increments (i.e. M_{t+s} − M_s ∼ M_t − M_0), and let σ > 0. If

(1/t) ⟨M⟩_t → σ²   in L¹(P),

then

M_t/√t → N(0, σ²)   in distribution.
Corollary 5.6 (Central limit theorem for Markov processes, elementary version). Let (X_t, Pµ)
be a stationary ergodic Markov process. Then for f ∈ Range(L), f = Lg:

(1/√t) ∫_0^t f(X_s) ds → N(0, σ_f²)   in distribution,

where

σ_f² = 2 ∫ g (−L) g dµ = 2 E(g, g).
If L : L_0²(µ) → L_0²(µ) is bijective with inverse G = (−L)^{−1}, then the central limit theorem holds
for all f ∈ L_0²(µ) with

σ_f² = 2 ∫ f · Gf dµ

(an H^{−1} norm in the symmetric case). In particular,

σ_f² ≤ (2/λ) ‖f‖²_{L²(µ)},

i.e. the spectral gap λ yields a bound for the asymptotic variance.
Proof of corollary.

(1/√t) ∫_0^t f(X_s) ds = ( g(X_t) − g(X_0) )/√t + M_t^g/√t,

with

⟨M^g⟩_t = ∫_0^t Γ(g, g)(X_s) ds   Pµ-a.s.

By the ergodic theorem, (1/t) ⟨M^g⟩_t → ∫ Γ(g, g) dµ = 2 E(g, g) = σ_f² in L¹(Pµ), so Theorem 5.5
applies to M^g.
Moreover,

(1/√t) ( g(X_t) − g(X_0) ) → 0

in L²(Pµ), hence in distribution. This gives the claim, since X_t → µ and Y_t → 0 in distribution
imply X_t + Y_t → µ in distribution.
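The corollary can be illustrated by simulation. A minimal sketch for a two-state chain (assuming NumPy; the rates and the function f are toy choices):

import numpy as np

rng = np.random.default_rng(1)
a, b = 1.0, 2.0                       # jump rates 0 -> 1 and 1 -> 0
mu = np.array([b, a]) / (a + b)       # stationary distribution
L = np.array([[-a, a], [b, -b]])      # generator

f = np.array([1.0, -b / a])           # chosen so that int f dmu = 0

# Solve the Poisson equation -L g = f with int g dmu = 0
A = np.vstack([-L, mu])
g = np.linalg.lstsq(A, np.append(f, 0.0), rcond=None)[0]
sigma2 = 2 * mu @ (f * g)             # sigma_f^2 = 2 (f, (-L)^{-1} f)_mu

# Monte Carlo estimate of Var( t^{-1/2} int_0^t f(X_s) ds )
T, n_paths, rates = 200.0, 2000, np.array([a, b])
samples = np.empty(n_paths)
for k in range(n_paths):
    x, t, integral = rng.choice(2, p=mu), 0.0, 0.0
    while t < T:
        hold = rng.exponential(1.0 / rates[x])   # exponential holding time
        dt = min(hold, T - t)
        integral += f[x] * dt
        t += dt
        if t < T:
            x = 1 - x                             # jump to the other state
    samples[k] = integral / np.sqrt(T)
print(sigma2, samples.var())                      # the two values should be close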
[1] Aldous, D. & Fill, J. Reversible Markov Chains and Random Walks on Graphs. Available online: http://www.stat.berkeley.edu/~aldous/RWG/book.html
[2] Ané, C., Blachère, S., Chafaï, D., Fougères, P., Gentil, I., Malrieu, F., Roberto, C. & Scheffer, G. (2000) Sur les inégalités de Sobolev logarithmiques. Panoramas et Synthèses 10, Société Mathématique de France.
[3] Applebaum, D. (2004) Lévy Processes and Stochastic Calculus. Cambridge University Press.
[6] Bouleau, N. & Hirsch, F. (1991) Dirichlet Forms and Analysis on Wiener Space. de Gruyter.
[7] Brémaud, P. (2001) Markov Chains: Gibbs Fields, Monte Carlo Simulation, and Queues. Springer.
[8] Deuschel, J.-D. & Stroock, D.W. (1989) Large Deviations. Pure and Applied Mathematics Vol. 137, Academic Press.
[9] Durrett, R. (1993) Ten Lectures on Particle Systems. St. Flour Lecture Notes. Springer.
[10] Feller, W. (1991) An Introduction to Probability Theory and Its Applications, Volumes 1 and 2. John Wiley & Sons.
[11] Gross, L., Fabes, E., Fukushima, M., Kenig, C., Röckner, M. & Stroock, D.W. (1992) Dirichlet Forms: Lectures given at the 1st Session of the Centro Internazionale Matematico Estivo (C.I.M.E.) held in Varenna, Italy, June 8-19, 1992. Lecture Notes in Mathematics, Springer.
[12] Karatzas, I. & Shreve, S.E. (2005) Brownian Motion and Stochastic Calculus. Springer.
[13] Kipnis, C. & Landim, C. (1998) Scaling Limits of Interacting Particle Systems. Springer.
[14] Landim, C. (2003) Central limit theorem for Markov processes. In: From Classical to Modern Probability, CIMPA Summer School 2001 (P. Picco & J. San Martin, eds.), Progress in Probability 54, 147-207, Birkhäuser.
[17] Revuz, D. & Yor, M. (2004) Continuous Martingales and Brownian Motion. Springer.
[18] Rogers, L.C.G. & Williams, D. (2000) Diffusions, Markov Processes and Martingales. Cambridge University Press.
[21] Stroock, D.W. (2000) Probability Theory, an Analytic View. Cambridge University Press.