MPSkript

Markov processes
University of Bonn, summer term 2008

Author: Prof. Dr. Andreas Eberle
Edited by: Sebastian Riedel
Latest Version: January 22, 2009
2
Contents
1 Continuous-time Markov chains 5

1.1 Markov properties in discrete and continuous time . . . . . . . . . . . . . . . . . 5
1.2 From discrete to continuous time: . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Forward and Backward Equations . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.4 The martingale problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.5 Asymptotics of Time-homogeneous jump processes . . . . . . . . . . . . . . . . 33
2 Interacting particle systems 45

2.1 Interacting particle systems - a first look . . . . . . . . . . . . . . . . . . . . . . 45
2.2 Particle systems on Zd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.3 Stationary distributions and phase transitions . . . . . . . . . . . . . . . . . . . 55
2.4 Poisson point process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3 Markov semigroups and Lévy processes 67

3.1 Semigroups and generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.2 Lévy processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.3 Construction of Lévy processes from Poisson point processes: . . . . . . . . . . 79
4 Convergence to equilibrium 87
4.1 Setup and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.2 Stationary distributions and reversibility . . . . . . . . . . . . . . . . . . . . . . 91
4.3 Dirichlet forms and convergence to equilibrium . . . . . . . . . . . . . . . . . . 99
4.4 Hypercontractivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.5 Logarithmic Sobolev inequalities: Examples and techniques . . . . . . . . . . . 116
4.6 Concentration of measure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
5 129
5.1 Ergodic averages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.2 Central Limit theorem for Markov processes . . . . . . . . . . . . . . . . . . . . 132
3
4 CONTENTS
Chapter 1
Continuous-time Markov chains
Additional reference for this chapter:
• Asmussen [4]
• Stroock [20]
• Norris [16]
• Kipnis, Landim [13]
1.1 Markov properties in discrete and continuous time

Let (S, S ) be a measurable space, pn (x, dy) transition probabilities (Markov kernels) on (S, S ),
(Ω, A, P ) a probability space and Fn (n = 0, 1, 2, ...) a filtration on (Ω, A).
Definition 1.1. A stochastic process (Xn )n=0,1,2,... on (Ω, A, P ) is called (F n )-Markov chain
with transition probabilities pn if and only if
(i) Xn is Fn -measurable ∀ n ≥ 0
(ii) P [Xn+1 ∈ B|Fn ] = pn+1 (Xn , B) P -a.s. ∀n ≥ 0, B ∈ S .
Example (Non-linear state space model). Consider

Xn+1 = Fn+1 (Xn , Un+1 ),
Un : Ω → S̃n independent random variable, noise
Fn : S × S̃n → S measurable, rule for dynamics
(Xn )n≥0 is a Markov chain with respect to Fn = σ(X0 , U1 , U2 , . . . , Un ) with transition probabil-
ities pn (x, ·) = distribution of Fn (x, Un ).
5
6 CHAPTER 1. CONTINUOUS-TIME MARKOV CHAINS
Remark . 1. Longer transitions:
P [Xn ∈ B|Fm ] = (pn · · · pm+2 pm+1 )(Xn , B)

| {z }
=:pm,n
where
Z
(pq)(x, B) := p(x, dy)q(y, B)
S
Time-homogeneous case: pn = p ∀n, pm,n = pn−m .
2. Reduction to time-homogeneous case:

Xn time-inhomogeneous Markov chain, X̂n := (n, Xn ) space-time process is time-homogeneous
Markov chain on Ŝ = {0, 1, 2, . . . } × S with transition probabilities
p̂((n, x), {m} × B) = δm,n+1 · pn+1 (x, B).
3. Canonical model for space-time chain with start in (n, x):

There is an unique probability measure P(n,x) on Ω := S {0,1,2,...} so that under P(n,x) ,
Xk (ω) = ω(k) is a Markov chain with respect to Fn = σ(X0 , . . . Xn ) and with transition
kernels pn+1 , pn+2 , . . . and initial condition X0 = x P(n,x) -a.s.
x b
0 1 2 n n+1 n+2
pn+1 pn+2
Let T : Ω → {0, 1, . . .} ∪ {∞} be a (Fn )-stopping time, i.e.
{T = n} ∈ Fn ∀n ≥ 0,
FT = {A ∈ A | A ∩ {T = n} ∈ Fn ∀n ≥ 0} events observable up to time T.
1.2. FROM DISCRETE TO CONTINUOUS TIME: 7
Theorem 1.2 (Strong Markov property).
E[F (XT , XT +1 , . . .) · I{T <∞} |FT ] = E(T,XT ) [F (X0 , X1 , . . .)] P -a.s. on {T < ∞}
Proof. Exercise (Consider first T = n).
1.2 From discrete to continuous time:

Let t ∈ R+ , S Polish space (complete, separable metric space), S = B(S) Borel σ-algebra,
ps,t (x, dy) transition probabilities on (S, S ), 0 ≤ s ≤ t < ∞, (Ft )t≥0 filtration on (Ω, A, P ).
Definition 1.3. A stochastic process (Xt )t≥0 on (Ω, A, P ) is called a (F t )-Markov process with
transition functions ps,t if and only if
(i) Xt is Ft -measurable ∀t ≥ 0
(ii) P [Xt ∈ B|Fs ] = ps,t (Xs , B) P -a.s. ∀ 0 ≤ s ≤ t, B ∈ S
Lemma 1.4. The transition functions of a Markov process satisfy
1. ps,s f = f
2. ps,t pt,u f = ps,u f Chapman-Kolmogorov equation
P ◦ X −1 -a.s. for every f : S → R and 0 ≤ s ≤ t ≤ u.
Proof. 1. (ps,s f )(Xs ) = E[f (Xs )|Fs ] = f (Xs ) P -a.s.

(pt,u f )(Xt )
z }| {
2. (ps,u f )(Xs ) = E[f (Xu )|Fs ] = E[E[f (Xn )|Ft ] |Fs ] = (ps,t pt,u f )(Xs ) P -a.s.
Remark . 1. Time-homogeneous case: ps,t = pt−s

Chapman-Kolmogorov: ps pt f = ps+t f (semigroup property).
2. Kolmogorov existence theorem: Let ps,t be transition probabilities on (S, S ) such that
(i) pt,t (x, ·) = δx ∀x ∈ S, t ≥ 0
(ii) ps,t pt,u = ps,u ∀0≤s≤t≤u
Then there is an unique canonical Markov process (Xt , P(s,x) ) on S [0,∞) with transition
functions ps,t .
Problems:
• regularity of paths t 7→ Xt . One can show: If S is locally compact and ps,t Feller, then Xt
has càdlàg modification (cf. Revuz, Yor [17]).
• in applications, ps,t is usually not known explicitly.
We take a more constructive approach instead.
Let (Xt , P ) be an (Ft )-Markov process with transition functions ps,t .
Definition 1.5. (Xt , P ) has the strong Markov property w.r.t. a stopping time T : Ω → [0, ∞] if
and only if
E f (XT +s )I{T <∞} | FT = (pT,T +s f )(XT )
P -a.s. on {T < ∞} for all measurable f : S → R+ .
In the time-homogeneous case, we have

E f (XT +s )I{T <∞} | FT = (ps f )(XT )
Lemma 1.6. If the strong Markov property holds then

E F (XT +· )I{T <∞} | FT = E(T,XT ) [F (XT +· )]
P -a.s. on {T < ∞} for all measurable functions F : S [0,∞) → R+ .

In the time-homogeneous case, the right term is equal to
EXT [F (X)]
Proof. Exercise.
Definition 1.7.
PC(R+ , S) := {x : [0, ∞) → S | ∀t ≥ 0 ∃ε > 0 : x constant on [t, t + ε)}
Definition 1.8. A Markov process (Xt )t≥0 on (Ω, A, P ) is called a pure jump process or contin-
uous time Markov chain if and only if
(t 7→ Xt ) ∈ PC(R+ , S)
P -a.s.
Let qt : S × S → [0, ∞] be a kernel of positive measure, i.e. x 7→ qt (x, A) is measurable and

A 7→ qt (x, A) is a positive measure.
Aim: Construct a pure jump process with instantaneous jump rates qt (x, dy), i.e.
Pt,t+h (x, B) = qt (x, B) · h + o(h) ∀ t ≥ 0, x ∈ S, B ⊆ S \ {x} measurable
(Xt )t≥0 ↔ (Yn , Jn )n≥0 ↔ (Yn , Sn ) with Jn holding times, Sn jumping times of Xt .
n
∈ (0, ∞] with jump time {Jn : n ∈ N} point process on R+ , ζ = sup Jn
P
Jn = Si
i=1
explosion time.
Construction of a process with initial distribution µ ∈ M1 (S):

λt (x) := qt (x, S \ {x}) intensity, total rate of jumping away from x.
Assumption: λt (x) < ∞ ∀x ∈ S , no instantaneous jumps.

qt (x,A)
πt (x, A) := λt (x)
transition probability , where jumps from x at time t go to.
a) Time-homogeneous case: qt (x, dy) ≡ q(x, dy) independent of t, λt (x) ≡ λ(x), πt (x, dy) ≡
π(x, dy).
Yn (n = 0, 1, 2, . . .) Markov chain with transition kernel π(x, dy) and initial distribution µ
En
Sn := , En ∼ Exp(1) independent and identically distributed random variables,
λ(Yn−1 )
independent of Yn , i.e.
Sn |(Y0 , . . . Yn−1 , E1 , . . . En−1 ) ∼ Exp(λ(Yn−1 )),
X n
Jn = Si
i=1
(
Yn for t ∈ [Jn , Jn+1 ), n ≥ 0
Xt :=
∆ for t ≥ ζ = sup Jn
where ∆ is an extra point, called the cemetery.

Example . 1) Poisson process with intensity λ > 0
S = {0, 1, 2, . . .}, q(x, y) = λ · δx+1,y , λ(x) = λ ∀x, π(x, x + 1) = 1

Si ∼ Exp(λ) independent and identically distributed random variables, Yn = n
Nt = n :⇔ Jn ≤ t ≤ Jn+1 ,
Nt = #{i ≥ 1 : Ji ≤ t} counting process of point process {Jn | n ∈ N}.
Distribution at time t:
n
Jn =
P
Si ∼Γ(λ,n) Zt ∞
i=1 (λs)n−1 −λs differentiate r.h.s. −tλ X (tλ)k
P [Nt ≥ n] = P [Jn ≤ t] = λe ds = e ,
(n − 1)! k=n
k!
0
Nt ∼ Poisson(λt)
2) Continuization of discrete time chain

Let (Yn )n≥0 be a time-homogeneous Markov chain on S with transition functions p(x, dy),
Xt = YNt , Nt Poisson(1)-process independent of (Yn ),

q(x, dy) = π(x, dy), λ(x) = 1
e.g. compound Poisson process (continuous time random walk):

Nt
X
Xt = Zi ,
i=1
Zi : Ω → Rd independent and identically distributed random variables, independent of

(Nt ).
3) Birth and death chains
S = {0, 1, 2, . . .}.

b(x) if y = x + 1 ”birth rate”

q(x, y) = d(x) if y = x − 1 ”death rate”

0 if |y − x| ≥ 2

rate d(x) rate b(x)
| | | | | | |
x−1 x x+1
b) Time-inhomogeneous case:
Remark (Survival times). Suppose an event occurs in time interval [t, t + h] with probability
λt · h + o(h) provided it has not occurred before:
P [T ≤ t + h|T > t] = λt · h + o(h)

P [T > t + h]
⇔ = P [T > t + h|T > t] = 1 − λt h + o(h)
P [T > t]
| {z }
survival rate
log P [T > t + h] − log P [T > t]
⇔ = −λt + o(h)
h
d
⇔ log P [T > t] = −λt
dt  
Zt
⇔ P [T > t] = exp − λs ds t
0
where the integral is the accumulated hazard rate up to time t,
 
Zt
fT (t) = λt exp − λs ds · I(0,∞) (t) the survival distribution with hazard rate λs
0
Simulation of T :
Rt
E ∼ Exp(1), T := inf{t ≥ 0 : λs ds ≥ E}
0
 t 
Z Rt
− λs ds
⇒ P [T > t] = P  λs ds < E  = e 0
0
Construction of time-inhomogeneous jump process:
Fix t0 ≥ 0 and µ ∈ M1 (S) (the initial distribution at time t0 ).

Suppose that with respect to P(t0 ,µ) ,
J0 := t0 , Y ∼µ
and
t∨t
R0
− λs (Y0 ) ds
P(t0 ,µ) [J1 > t | Y0 ] = e t0
for all t ≥ t0 , and (Yn−1 , Jn )n∈N is a time-homogeneous Markov chain on S × [0, ∞) with
transition law
t∨J
Rn
− λs (y) ds
P(t0 ,µ) [Yn ∈ dy, Jn+1 > t | Y0 , J1 , . . . , Yn−1 , Jn ] = πJn (Yn−1 , dy) · e Jn
i.e.
t∨J
Rn
−
Z λs (y) ds
P(t0 ,µ) [Yn ∈ A, Jn+1 > t |Y0 , J1 , . . . , Yn−1 , Jn ] = πJn (Yn−1 , dy) · e Jn
A
P -a.s. for all A ∈ S , t ≥ 0
Explicit (algorithmic) construction:

• J0 := t0 , Y0 ∼ µ
For n = 1, 2, . . . do
• En ∼ Exp(1) independent of Y0 , . . . , Yn−1 , E1 , . . . , En−1
n Rt o
• Jn := inf t ≥ 0 : Jn−1 λs (Yn−1 ) ds ≥ En
• Yn |(Y0 , . . . , Yn−1 , E0 , . . . , En ) ∼ πJn (Yn−1 , ·) where π∞ (x, ·) = δx (or other arbitrary
definition)
Example (Non-homogeneous Poisson process on R+ ).
S = {0, 1, 2, . . .}, qt (x, y) = λt · δx+1,y ,

Rt
Yn = n, Jn+1 |Jn ∼ λt · e− Jn λs ds
dt,
Nt := #{n ≥ 1 : Jn ≤ t} the associated counting process
b b b b b b b
high intensity low intensity
Claim:
R n−1 Rt
t
1. fJn (t) = 1
(n−1)! 0
λs ds λt e− 0 λs ds
R
t
2. Nt ∼ Poisson 0
λs ds
Proof. 1. By induction:
Zt
fJn+1 (t) = fJn−1 |Jn (t|r)fJn (r) dr
0
 n−1
Zt Zr
Rt 1 Rr
= λt e− r λs ds  λs ds λr e− 0 λs ds
dr = . . .
(n − 1)!
0 0
2. P [Nt ≥ n] = P [Jn ≤ t]
Lemma 1.9. (Exercise)

1. (Yn−1 , Jn ) is a Markov chain with respect to Gn = σ(Y0 , . . . , Yn−1 , E1 , . . . , En ) with tran-
sition functions
Rt
p ((x, s), dydt) = πs (x, dy)λt (y) · e− s λr (y) dr
I(s,∞) (t) dt
2. (Jn , Yn ) is a Markov chain with respect to G̃ = σ(Y0 , . . . , Yn , E1 , . . . , En ) with transition

functions Rt
p̃ ((x, s), dtdy) = λt (x) · e− s λr (x) dr I(s,∞) (t)πt (x, dy)
Remark . 1. Jn strictly increasing.
2. Jn = ∞ ∀ n, m is possible Xt absorbed in state Yn−1 .
3. sup Jn < ∞ explosion in finite time
4. {s < ζ} = {Xs 6= ∆} ∈ Fs no explosion before time s.
Ks := min{n : Jn > s} first jump after time s. Stopping time with respect to
Gn = σ (E1 , . . . , En , Y0 , . . . , Yn−1 ),
{Ks < ∞} = {s < ζ}

Lemma 1.10 (Memoryless property). Let s ≥ t0 . Then for all t ≥ s,

Rt
P(t0 ,µ) [{JKs > t} ∩ {s < ζ} | Fs ] = e− s λr (Xs ) dr
P -a.s. on {s < ζ}
i.e.
h Rt i
P(t0 ,µ) [{JKs > t} ∩ {s < ζ} ∩ A] = E(t0 ,µ) e− s λr (Xs ) dr ; A ∩ {s < ζ} ∀A ∈ Fs
Remark . The assertion is a restricted form of the Markov property in continuous time: The
conditional distribution with respect to P(t0 ,µ) of JKs given Fs coincides with the distribution of
J1 with respect to P(s,Xs ) .
Proof.
(Ex.)
A ∈ Fs ⇒ A ∩ {Ks = n} ∈ σ (J0 , Y0 , . . . , Jn−1 , Yn−1 ) = G̃n−1
h i
⇒ P [{JKs > t} ∩ A ∩ {Ks = n}] = E P [Jn > t | G̃n−1 ] ; A ∩ {Ks = n}
where
   
Zt Zt
P [Jn > t | G̃n−1 ] = exp − λr (Yn−1 ) dr = exp − λr (Yn−1 ) dr · P [Jn > s | G̃n−1 ],
|{z}
Jn−1 s =Xs
hence we get
h Rt i
P [Jn > t | G̃n−1 ] = E e− s λr (Xs ) dr ; A ∩ {Ks = n} ∩ {Jn > s} ∀n ∈ N
where A ∩ {Ks = n} ∩ {Jn > s} = A ∩ {Ks = n}.

Summing over n gives the assertion since
˙
[
{s < ζ} = {Ks = n}.
n∈N
For yn ∈ S, tn ∈ [0, ∞] strictly increasing define
˙
x := Φ ((tn , yn )n=0,1,2,... ) ∈ PC ([t0 , ∞), S ∪{∆})
by (
Yn for tn ≤ t < tn+1 , n ≥ 0
xt :=
∆ for t ≥ sup tn
Let
(Xt )t≥t0 := Φ ((Jn , Yn )n≥0 )

FtX := σ (Xs | s ∈ [t0 , t]) , t ≥ t0
Theorem 1.11 (Markov property). Let s ≥ t0 , Xs:∞ := (Xt )t≥s . Then
E(t0 ,µ) F (Xs:∞ ) · I{s<ζ} | FsX (ω) = E(s,Xs (ω)) [F (Xs:∞ )] P -a.s. {s < ζ}

for all
F : PC ([s, ∞), S ∪ {∆}) → R+
measurable with respect to σ (x 7→ xt | t ≥ s).
Proof. Xs:∞ = Φ(s, YKs −1 , JKs , YKs , JKs +1 , . . .) on {s < ζ} = {Ks < ∞}
t0 s JKs
i.e. the process after time s is constructed in the same way from s, YKs −1 , JKs , . . . as the orig-
inal process is constructed from t0 , Y0 , J1 , . . .. By the Strong Markov property for the chain
(Yn−1 , Jn ),

E(t0 ,µ) F (Xs:∞ ) · I{s<ζ} | GKs

=E(t0 ,µ) F ◦ Φ(s, YKs −1 , JKs , . . .) · I{Ks <∞} | GKs
=EMarkov chain
(YKs −1 ,JKs ) [F ◦ Φ(s, (Y0 , J1 ), (Y1 , J2 ), . . .)] a.s. on {Ks < ∞} = {s < ζ}.
Since Fs ⊆ GKs , we obtain by the projectivity of the conditional expectation,

E(t0 ,µ) F (Xs:∞ ) · I{s<ζ} | Fs
h i
=E(t0 ,µ) EMarkov chain

(Xs ,JKs ) F ◦ Φ(s, (Y 0 , J1 ), . . .) · I{s<ζ} | Fs
taking into account that the conditional distribution given GKs is 0 on {s ≥ ζ} and that YKs −1 =
Xs .
Here the conditional distribution of JKs ist k(Xs , ·), by Lemma 1.10
Rt
k(x, dt) = λt (x) · e− s λr (x) dr
· I(s,∞) (t) dt
hence
E(t0 ,µ) F (Xs:∞ ) · I{s<ζ} | Fs = EMarkov chain

(Xs ,k(Xs ,·)) [F ◦ Φ(. . .)] a.s. on {s < ζ}
Here k(Xs , ·) is the distribution of J1 with respect to Ps,Xs , hence we obtain

E(t0 ,µ) F (Xs:∞ ) · I{s<ζ} | Fs = E(s,Xs ) [F (Φ(s, Y0 , J1 , . . .))]
= E(s,Xs ) [F (Xs:∞ )]
Corollary 1.12. A non-homogeneous Poisson process (Nt )t≥0 with intensity λt has independent
increments with distribution
 t 
Z
Nt − Ns ∼ Poisson  λr dr
s
Proof.
MP
P Nt − Ns ≥ k | FsN = P(s,Ns ) [Nt − Ns ≥ k] = P(s,Ns ) [Jk ≤ t]

 t 
Z
as above
= Poisson  λr dr ({k, k + 1, . . .}) .
s
R
t
Hence Nt − Ns independent of FsN and Poisson s
λr dr distributed.
Recall that the total variation norm of a signed measure µ on (S, S ) is given by
Z
+ −
kµkTV = µ (S) + µ (S) = sup f dr
|f |≤1
Theorem 1.13. 1. Under P(t0 ,µ) , (Xt )t≥t0 is a Markov jump process with initial distribution
Xt0 ∼ µ and transition probabilities
ps,t (x, B) = P(s,x) [Xt ∈ B] (0 ≤ s ≤ t, x ∈ S, B ∈ S )
satisfying the Chapman-Kolmogorov equations ps,t pt,u = ps,u ∀ 0 ≤ s ≤ t ≤ u.

2. The integrated backward equation
Rt
Zt Rr
−
ps,t (x, B) = e s λr (x) dr
δx (B) + e− s λu (x) du
(qr pr,t )(x, B) dr (1.1)
s
holds for all 0 ≤ s ≤ t , x ∈ S and B ∈ S .
3. If t 7→ λt (x) is continuous for all x ∈ S, then
(ps,s+h f )(x) = (1 − λs (x) · h)f (x) + h · (qs f )(x) + o(h) (1.2)
holds for all s ≥ 0 , x ∈ S and bounded functions f : → R such that t 7→ (qt f )(x) is
continuous.
Remark . 1. (1.2) shows that (Xt ) is the continuous time Markov chain with intensities λt (x)
and transition rates qt (x, dy).
2. If ζ = sup Jn is finite with strictly positive probability, then there are other possible con-
tinuations of Xt after the explosion time ζ.
non-uniqueness.
The constructed process is called the minimal chain for the given jump rates, since its
transition probabilities pt (x, B) , B ∈ S are minimal for all continuations, cf. below.
3. The integrated backward equation extends to bounded functions f : S → R
Rt
Zt Rr
(ps,t f )(x) = e− s λr (x) dr
f (x) + e− s λu (x) du
(qr pr,t f )(x) dr (1.3)
s
Proof. 1. By the Markov property,
P(t0 ,µ) Xt ∈ B|FsX = P(s,Xs ) [Xt ∈ B] = ps,t (Xs , B) a.s.

since {Xt ∈ B} ⊆ {t < ζ} ⊆ {s < ζ} for all B ∈ S and 0 ≤ s ≤ t.

Thus (Xt )t≥t0 , P(t0 ,µ) is a Markov jump process with transition kernels ps,t . Since this
holds for any initial condition, the Chapman-Kolmogorov equations
(ps,t pt,u f )(x) = (ps,u f )(x)
are satisfied for all x ∈ S , 0 ≤ s ≤ t ≤ u and f : S → R.

2. First step analysis: Condition on G̃1 = σ(J0 , Y0 , J1 , Y1 ):

Since Xt = Φt (J0 , Y0 , J1 , Y1 , J2 , Y2 , . . .), the Markov property of (Jn , Yn ) implies
h i
P(s,x) Xt ∈ B|G̃1 (ω) = P(J1 (ω),Y1 (ω)) [Φt (s, x, J0 , Y0 , J1 , Y1 , . . .) ∈ B]
On the right side, we see that

(
x if t < J1 (ω)
Φt (s, x, J0 , Y0 , J1 , Y1 , . . .) =
Φt (J0 , Y0 , J1 , Y1 , . . .) if t ≥ J1 (ω)
and hence
h i
P(s,x) Xt ∈ B|G̃1 (ω) = δx (B) · I{t<J1 } (ω) + P(J1 (ω),Y1 (ω)) [Xt ∈ B] · I{t≥J1 } (ω)
P(s,x) -a.s. We conclude
ps,t (x, B) = P(s,x) [Xt ∈ B]

= δx (B)P(s,x) [J1 > t] + E(s,x) [pJ1 ,t (Y1 , B); t ≥ J1 ]
Zt Z
− st λr (x) dr − sr λu (x) du
R R
= δx (B) · e + λr (x)e πr (x, dy)pr,t (y, B) dr
| {z }
s =(πr pr,t )(x,B)
Rt
Zt Rr
−
= δx (B) · e s λr (x) dr
+ e− s λu (x) du
(qr pr,t )(x, B) dr
s
3. This is a direct consequence of (1.1).

Fix a bounded function f : S → R. Note that
0 ≤ (qr pr,t f )(x) = λr (x)(πr pr,t f )(x) ≤ λr (x) sup |f |
for all 0 ≤ r ≤ t and x ∈ S. Hence if r 7→ λr (x) is continuous (and locally bounded) for
all x ∈ S, then
(pr,t f )(x) −→ f (x) (1.4)
as r, t ↓ s for all x ∈ S.
Thus by dominated convergence,
(qr pr,t f )(x) − (qs f )(x)

Z
= qr (x, dy)(pr,t f (y) − f (y)) + (qr f )(x) − (qs f )(x) −→ 0
as r, t ↓ s provided r 7→ (qr f )(x) is continuous. The assertion now follows from (1.3).
1.3. FORWARD AND BACKWARD EQUATIONS 19
Theorem 1.14 (A first non-explosion criterion). If λ̄ := sup λt (x) < ∞, then

t≥0
x∈S
ζ=∞ P(t0 ,µ) -a.s. ∀ t0 , µ
Proof.
 
 Z t ≤λ̄ 
 z }| { 
Jn = inf t ≥ 0 : λs (Yn−1 ) ds ≥ En

 

Jn−1
−1
≥ Jn−1 + λ̄ En
∞
X
=⇒ ζ = sup Jn ≥ λ̄−1 En = ∞ a.s.
n−1
Remark . In the time-homogeneous case,

n
X Ek
Jn =
k=1
λ(Yn−1 )
is a sum of conditionally independent exponentially distributed random variables given {Yk | k ≥

0}.
From this one can conclude that the events
(∞ ) (∞ )
X Ek X 1
{ζ < ∞} = < ∞ and <∞
k=1
λ(Y k−1 )
k=0
λ(Y k)
coincide almost sure (apply Kolmogorov’s 3-series Theorem).
1.3 Forward and Backward Equations

Definition 1.15. The infinitesimal generator (or intensity matrix, kernel) of a Markov jump pro-
cess at time t is defined by
Lt (x, dy) = qt (x, dy) − λt (x)δx (dy)
i.e.
(Lt f )(x) = (qt f )(x) − λt (x)f (x)

Z
= qt (x, dy) · (f (y) − f (x))
for all bounded and measurable f : S → R.

Remark . 1. Lt is a linear operator on functions f : S → R.
2. If S is discrete, Lt is a matrix, Lt (x, y) = qt (x, y) − λt (x)δ(x, y). This matrix is called

Q-Matrix.
Example . Random walk on Zd :

(
1
2d
if |x − y| = 1
qt (x, y) =
0 else
then
d
1 X 1
(Lt f )(x) = (f (x + ei ) + f (x − ei ) − 2f (x)) = (∆ d f ) (x)
2d k=1 2d Z
Theorem 1.16 (Kolmogorov’s backward equation). If t 7→ qt (x, ·) is continuous in total variation

norm for all x ∈ S, then the transition kernels ps,t of the Markov jump process constructed above
are the minimal solutions of the backward equation (BWE)
∂
− (ps,t f ) (x) = − (Ls ps,t f ) (x) for all bounded f : S → R, 0 ≤ s ≤ t (1.5)
∂s
with terminal condition (pt,t f )(x) = f (x).
Proof. 1. ps,t f solves (1.5):

Informally, the assertion follows by differentiating the integrated backward equation (1.3)
with respect to s. To make this rigorous, we proceed similarly as in the proof of (1.2). First,
one shows similarly to the derivation of (1.4) that
(pr,t f )(x) −→ (ps,t f )(x) as r → s
for all 0 ≤ s ≤ t, x ∈ S, and bounded functions f : S → R. This combined with the

assumption implies that also
|(qr pr,t f )(x) − (qs ps,t f )(x)|

Z
≤ kqr (x, ·) − qs (x, ·)kTV · sup |ps,t f | + qs (x, dy) (pr,t f (y) − ps,t f (y))
−→ 0 as r → s for all x ∈ S
by dominated convergence, because sup |pr,t f | ≤ sup |f |. Hence the integrand in (1.3) is
continuous in r at s, and so there exists
∂
− (ps,t f ) (x) = −λs (x)(ps,t f )(x) + (qs ps,t f )(x) = (Ls ps,t f ) (x)
∂s
2. Minimality:
Let (x, B) 7→ p̃s,t (x, B) be an arbitrary non-negative solution of (1.5). Then
∂
− p̃r,t (x, B) = (qr p̃r,t )(x, B) − λr (x)p̃r,t (x, B)
∂r
∂ R s λu (x) du Rr
⇒− er p̃r,t (x, B) = e− s λu (x) du (qr p̃r , t)(x, B)
∂r
Rt
Zt R
integrate r
⇒ p̃s,t (x, B) − e− s λu (x) du δx (B) = e− s λu (x) du (qr p̃r,t )(x, B) dr
s
integrated backward-equation.
Claim:
p̃s,t (x, B) ≥ ps,t (x, B) = P(s,x) [Xt ∈ B] ∀ x ∈ S, B ∈ S
This is OK if
P(s,x) [Xt ∈ B, t < Jn ] ≤ p̃s,t (x, B) ∀ n ∈ N
n=0: X
n → n + 1: by First step analysis:
P(s,x) [Xt ∈ B, t < Jn+1 | J1 , Y1 ]

MP
= δx (B) · I{t<J1 } + P(J1 ,Y1 ) [Xt ∈ B, t < Jn ] · I{t≥J1 }
where by induction
P(J1 ,Y1 ) [Xt ∈ B, t < Jn ] ≤ p̃Jn ,t (Y1 , B)
Hence we conclude with the integrated backward equation
P(s,x) [Xt ∈ B, t < Jn+1 ] ≤ E(s,x) [. . .]

Rt
Zt R
s
= δx (B)e− s λr (x) dr + e− r λu (x) du (qr p̃r,t )(x, B) dr
s
≤ p̃s,t (B)
Remark . 1. (1.5) describes the backward evolution of the expectation values E(s,x) [f (Xt )]
respectively the probabilities P(s,x) [Xt ∈ B] when varying the starting times s.
2. In a discrete state space, (1.5) reduces to

∂ X
− ps,t (x, z) = Ls (x, y)ps,t (y, z), pt,t (x, z) = δxz
∂s y∈S
a system of ordinary differential equations.

For S being finite,
   t n
Zt ∞ Z
X 1 
ps,t = exp  Lr dr = Lr dr
n=0
n!
s s
is the unique solution.

If S is infinite, the solution is not necessarily unique (hence the process is not unique).
3. Time-homogeneous case: ps,t = pt−s , Ls = L . The backward-equation (1.5) then be-
comes
d
(pt f )(x) = (Lpt f )(x), p0 f = f
dt
If S is infite, pt = etL . In particular, if L is diagonalizable with left/right eigenvectors
ui , vj s.t. uTi vj = δij and eigenvalues λi , then
n
X n
X
L = λi v i ⊗ uTi , pt = etλi vi ⊗ uTi
i=1 i=1
Example . 1. General birth-and-death process
d(x) b(x)
| | | | | |
0 1 2 x−1 x x+1
d X
pt (x, z) = q(x, y) (pt (y, z) − pt (x, z))
dt
|x−y|=1
= b(x) (pt (x + 1, z) − pt (x, z)) + d(x) (pt (x − 1, z) − pt (x, z))

p0 (x, z) = δxz
2. Continuous-time branching process
The particles in a population die with rate d > 0 and divide into two particles with rate
b > 0, independently from each other.
Xt = total number of particles at time t
is a birth-death process on S = {0, 1, 2, . . .} with total birth/death rates
b(n) = q(n, n + 1) = n · b, d(n) = q(n, n − 1) = n · d,λ(n) = n · (b + d)

Let
η(t) := P1 [Xt = 0] = pt (1, 0)
the extinction probability. Equation (1.5) gives
η 0 (t) = dpt (0, 0) − (b + d)pt (1, 0) + bt (2, 0)
= d − (b + d)η(t) + bη(t)2 ,
η(0) = 0
Hence we get (
1
1+bt
if b=d
P1 [Xt 6= 0] = 1 − η(t) = b−d
b−d·et(d−b)
if b 6= d
i.e.
• exponentially decay if d > b
• polynomial decay if d = b (critical case)
• strictly positive survival probability if d < b
Theorem 1.17 (Kolmogorov’s forward equation). Suppose that

λ̄t = sup sup λs (x) < ∞
0≤s≤t x∈S
for all t > 0. Then the forward equation

d
(ps,t f )(x) = (ps,t Lt f )(x), (ps,s f )(x) = f (x) (1.6)
dt
holds for all 0 ≤ s ≤ t, x ∈ S and all bounded functions f : S → R such that t 7→ (qt f )(x) and
t 7→ λt (x) are continuous for all x.
Proof. 1. Strong continuity: Fix t0 > 0. Note that kqr gksup ≤ λ̄r kf ksup for all 0 ≤ r ≤ t0 .
Hence by the assumption and the integrated backward equation (1.3),
kps,t f − ps,r f ksup = kps,r (pr,t f − f )ksup
≤kpr,t f − f ksup ≤ ε(t − r) · kf ksup
for all 0 ≤ s ≤ r ≤ t ≤ t0 and some function ε : R+ → R+ with limh↓0 ε(h) = 0.
2. Differentiability: By 1.) and the assumption,
(r, u, x) 7→ (qr pr,u f )(x)
is uniformly bounded for 0 ≤ r ≤ u ≤ t0 and x ∈ S, and
qr pr,u f = qr (pr,u f − f ) +qr f −→ qt f
| {z }
−→0 uniformly
pointwise as r, u −→ t. Hence by the integrated backward equation (1.3) and the continuity
of t 7→ λt (x),
pt,t+h f (x) − f (x) h↓0
−→ −λt (x)f (x) + qt f (x) = Lt f (x)
h
for all x ∈ S, and the difference quotients are uniformly bounded.
Dominated convergence now implies
ps,t+h f − ps,t f pt,t+h f − f
= ps,t −→ ps,t Lt f
h h
pointwise as h ↓ 0. A similar argument shows that also
ps,t f − ps,t−h f pt−h,t f − f
= ps,t−h −→ ps,t Lt f
h h
pointwise.
Remark . 1. The assumption implies that the operators Ls , 0 ≤ s ≤ t0 , are uniformly

bounded with respect to the supremum norm:
kLs f ksup ≤ λt · kf ksup ∀ 0 ≤ s ≤ t.
2. Integrating (1.5) yields

Zt
ps,t f = f + ps,r Lr f dr (1.7)
s
ps,t+h f −ps,t f
In particular, the difference quotients h
converge uniformly for f as in the asser-
tion.
Notation:
Z
< µ, f >:= µ(f ) = f dµ
µ ∈ M1 (S), s ≥ 0, µt := µps,t = P(s,µ) ◦ Xt−1 mass distribution at time t
Corollary 1.18 (Fokker-Planck equation). Under the assumptions in the theorem,

d
< µt , f >=< µt , Lt f >
dt
for all t ≥ s and bounded functions f : S → R such that t 7→ qt f and t 7→ λt are pointwise
continuous. Abusing notation, one sometimes writes
d
µt = Lt∗ µt
dt
Proof.
Z Z
< µt , f >=< µps,t , f >= µ(dx) ps,t (x, dy)f (y) =< µ, ps,t f >
hence we get
< µt+h , f > − < µt , f > pt,t+h f − f

=< µps,t , >−→< µt , Lt f >
h h
as h ↓ 0 by dominated convergence.
Remark . (Important!)
P(s,µ) [ζ < ∞] > 0

⇒ < µt , 1 >= µt (S) < 1 for large t
hence the Fokker-Planck equation does not hold for f ≡ 1:
Zt
< µt , 1 > < < µ, 1 > + < µs , Ls 1 > ds
| {z } | {z }
<1 =1 0
where Ls 1 = 0.
Example . Birth process on S = {0, 1, 2, . . .}

(
b(i) if j = i + 1
q(i, j) =
0 else
π(i, j) = δi+1,j ,
Yn = n,
Sn = Jn − Jn−1 ∼ Exp(b(n − 1)) independent,
X ∞ ∞
X
ζ = sup Jn = Sn < ∞ ⇐⇒ b(n)n−1 < ∞
n=1 n=1
In this case, Fokker-Planck does not hold.
The question whether one can extend the forward equation to unbounded jump rates leads to the
martingale problem.
1.4 The martingale problem

Definition 1.19. A Markov process (Xt , P(s,x) | 0 ≤ s ≤ t, x ∈ S) is called non-explosive (or
conservative) if and only if ζ = ∞ P(s,x) -a.s. for all s, x.
Now we consider again the minimal jump process (Xt , P(t0 ,µ) ) constructed above. A function
f : [0, ∞) × S → R
(t, x) 7→ ft (x)
is called locally bounded

S if and only if there exists an increasing sequence of open subsets Bn ⊆
S such that S = Bn , and
sup |fs (x)| < ∞

x∈Bn
0≤s≤t
for all t > 0, n ∈ N.

The following theorem gives a probabilistic form of Kolmogorov’s forward equation:
Theorem 1.20 (Time-dependent martingale problem). Suppose that t 7→ λt (x) is continuous for
all x. Then:
1. The process
Zt
∂
Mtf := ft (Xt ) − + Lr fr (Xr ) dr, t ≥ t0
∂r
t0
is a local (FtX )-martingale up to ζ with respect to P(t0 ,µ) for any locally bounded function
∂
f : R × S → R such that t 7→ ft (x) is C 1 for all x, (t, x) 7→ ∂t
+
ft (x) is locally bounded,
and r 7→ (qr,t ft )(x) is continuous at r = t for all t, x.
∂
2. If λ̄t < ∞ and f and ∂t
f are bounded functions, then M f is a global martingale.
3. More generally, if the process is non-explosive then M f is a global martingale provided

∂
sup |fs (x)| + fs (x) + |(Ls fs )(x)| < ∞ (1.8)
x∈S ∂s
t0 ≤s≤t
for all t > t0 .
Corollary 1.21. If the process is conservative then the forward equation

Zt
∂
ps,t ft = fs + pr,t + Lr fr dr, t0 ≤ s ≤ t (1.9)
∂r
s
holds for functions f satisfying (1.8).

1.4. THE MARTINGALE PROBLEM 27
Proof of corollary. M f being a martingale, we have

 
Zt
∂
(ps,t fr )(x) = E(s,x) [ft (Xt )] = E(s,x) fs (Xs ) + + Lr fr (Xr ) dr
∂r
s
Zt
∂
= fs (x) + ps,r + Lr fr (x) dr
∂r
s
for all x ∈ S.
Remark . The theorem yields the Doob-Meyer decomposition
ft (Xt ) = local martingale + bounded variation process
Remark . 1. Time-homogeneous case:

If h is an harmonic function, i.e. L h = 0, then h(Xt ) is a martingale
2. In general:
∂
If ht is space-time harmonic, i.e. ∂t ht +Lt ht = 0, then h(Xt ) is a martingale. In particular,
(ps,t f )(Xt ), (t ≥ s) is a martingale for all bounded functions f .
3. If ht is superharmonic (or excessive), i.e. ∂t∂

ht + Lt ht ≤ 0, then ht (Xt ) is a supermartin-
gale. In particular, E[ht (Xt )] is decreasing
stochastic Lyapunov function, stability criteria
e.g.
ht (x) = e−tc h(tc), Lt h ≤ ch
Proof of theorem. 2. Similarly to the derivation of the forward equation, one shows that the
assumption implies

∂ ∂
(ps,t ft )(x) = (ps,t Lt ft ) (x) + ps,t ft (x) ∀ x ∈ S,
∂t ∂t
or, in a integrated form,
Zt
∂
ps,t ft = fs + ps,r + Lr fr dr
∂r
s
for all 0 ≤ s ≤ t. Hence by the Markov property, for t0 ≤ s ≤ t,
E(t0 ,µ) [ft (Xt ) − fs (Xs ) | FsX ]

=E(s,Xs ) [ft (Xt ) − fs (Xs )] = (ps,t ft )(Xs ) − fs (Xs )
Zt
∂
= ps,r + Lr fr (Xs ) dr
∂r
s
 t 
Z
∂
=E(t0 ,µ)  + Lr fr (Xr ) dr | FrX  ,
∂r
s
because all the integrands are uniformly bounded.
1. For k ∈ N let
(k)
qt (x, B) := (λt (x) ∧ k) · πt (x, B)
(k)
denote the jump rates for the process Xt with the same transition probabilities as Xt and
(k)
jump rates cut off at k. By the construction above, the process Xt , k ∈ N, and Xt can be
realized on the same probability space in such a way that
(k)
Xt = Xt a.s. on {t < Tk }
where
Tk := inf {t ≥ 0 : λt (Xt ) ≥ k, Xt ∈
/ Bk }
∂
for an increasing sequence Bk ofS open subsets of S such that f and ∂t f are bounded on
[0, t] × Bk for all t, k and S = Bk . Since t 7→ λt (Xt ) is piecewise continuous and the
jump rates do not accumulate before ζ, the function is locally bounded on [0, ζ). Hence
Tk % ζ a.s. as k → ∞
By the theorem above,
Zt
(k) ∂
Mtf,k = ft (Xt ) − + Lr(k)
fr (Xr(k) ) dr, t ≥ t0 ,
∂r
t0
is a martingale with respect to P(t0 ,µ) , which coincides a.s. with Mtf for t < Tk . Hence Mtf
is a local martingale up to ζ = sup Tk .
3. If ζ = sup Tk = ∞ a.s. and f satisfies (1.8), then (Mtf )t≥0 is a bounded local martingale,
and hence, by dominated convergence, a martingale.
Theorem 1.22 (Lyapunov condition for non-explosion).

S Let Bn , n ∈ N be an increasing se-
quence of open subsets of S such that S = Bn and the intensities (s, x) 7→ λs (x) are bounded
on [0, t] × Bn for all t ≥ 0 and n ∈ N. Suppose that there exists a function ϕ : R+ × S → R
satisfying the assumption in Part 1) of the theorem above such that for all t ≥ 0 and x ∈ S,
(i) ϕt (x) ≥ 0 non-negative
(ii) inf ϕs (x) −→ 0 as n → ∞ tends to infinity
0≤s≤t
x∈Bnc
∂
(iii)ϕt (x) + Lt ϕt (x) ≤ 0 superharmonic
∂t
Then the minimal Markov jump process constructed above is non-explosive.
Proof. Claim: P [ζ > t] = 1 for all t ≥ 0 and any initial condition.

First note that since the sets Bnc are closed, the hitting times
Tn = inf {t ≥ 0 | Xt ∈ Bnc }
are (FtX )-stopping times and XTn ∈ Bnc whenever Tn < ∞. As the intensities are uniformly
bounded on [0, t] × Bn for all t, n, the process can not explode before time t without hitting
Bnc (Exercise),i.e. ζ ≥ Tn almost sure for all n ∈ N. By (iii), the process ϕt (Xt ) is a local
supermartingale up to ζ. Since ϕt ≥ 0, the stopped process
ϕt∧Tn (Xt∧Tn )
is a supermartingale by Fatou. Therefore, for 0 ≤ s ≤ t and x ∈ S,
ϕs (x) ≥ E(s,x) [ϕt∧Tn (Xt∧Tn )]
≥ E(s,x) [ϕTn (XTn ) ; Tn ≤ t]
≥ P(s,x) [Tn ≤ t] · inf ϕr (x)
s≤r≤t
x∈Bnc
Hence by (ii), we obtain

P(s,x) [ζ ≤ t] ≤ lim inf P(s,x) [Tn ≤ t] = 0
n→∞
for all t ≥ 0.
Remark . 1. If there exists a function ψ : S → R and α > 0 such that

(i) ψ ≥ 0
(ii) infc ψ(x) −→ ∞ as n → ∞
x∈Bn
(iii) Lt ψ ≤ αψ ∀t ≥ 0
then the theorem applies with ϕt (x) = e−αt ψ(x):
∂
ϕt + Lt ϕt ≤ −αϕt + αϕt ≤ 0
∂t
This is a standard criterion in the time-homogeneous case!
2. If S is a locally compact connected metric space and the intensities λt (x) depend continu-
ously on s and x, then we can choose the sets
Bn = {x ∈ S | d(x0 , x) < n}
as the balls around a fixed point x0 ∈ S, and condition (ii) above then means that
lim ψ(x) = ∞
d(x,x0 )→∞
Example . Time-dependent branching
Suppose a population consists initially (t = 0) of one particle, and particles die with time-
dependent rates dt > 0 and divide into two with rates bt > 0 where d, b : R+ → R+ are continu-
ous functions, and b is bounded. Then the total number Xt of particles at time t is a birth-death
process with rates

n · bt if m = n + 1

qt (n, m) = n · dt if m = n − 1 , λt (n) = n · (bt + dt )

0 else

The generator is
 
0 0 0 0 0 0 ···
dt −(dt + bt ) bt 0 0 0 ··· 
 
Lt = 
0 2dt −2(dt + bt ) 2bt 0 0 ··· 

0
 0 3d t −3(dt + b t ) 3b t 0 ··· 

... ... ... .. .. ..
. . .
Since the rates are unbounded, we have to test for explosion. choose ψ(n) = n as Lyapunov
function. Then
(Lt ψ) (n) = n · bt · (n + 1 − n) + n · dt · (n − 1 − n) = n · (bt − dt ) ≤ n sup bt

t≥0
Since the individual birth rates bt , t ≥ 0, are bounded, the process is non-explosive. To study
long-time survival of the population, we consider the generating functions
∞
X
Gt (s) = E sXt = sn P [Xt = n], 0<s≤1
n=0
of the population size. For fs (n) = sn we have
(Lt fs ) (n) = nbt sn+1 − n(bt + dt )sn + ndt sn−1

∂
= bt s2 − (bt + dt )s + dt · fs (n)
∂s
Since the process is non-explosive and fs and Lt fs are bounded on finite time-intervals, the
forward equation holds. We obtain
∂ ∂
Gt (s) = E [fs (Xt )] = E [(Lt fs )(Xt )]
∂t ∂t
2 ∂ Xt
= (bt s − (bt + dt )s + dt ) · E s
∂s
∂
= (bt s − dt )(s − 1) · Gt (s),
X0 ∂s
G0 (s) = E s =s
The solution of this first order partial differential equation for s < 1 is
 −1
%t Zt
e
Gt (s) = 1 −  + bn e%u du
1−s
0
where
Zt
%t := (du − bu ) du
0
is the accumulated death rate. In particular, we obtain an explicit formula for the extinction
probability:
 −1
Zt
P [Xt = 0] = lim Gt (s) = e%t + bn e%u du
s↓0
0
 −1
Zt
= 1 − 1 + du e%u du
0
since b = d − %0 . Thus we have shown:
Theorem 1.23.
Z∞
P [Xt = 0 eventually] = 1 ⇐⇒ du e%u du = ∞
0
Remark . Informally, the mean and the variance of Xt can be computed by differentiating Gt at
s=1:
d Xt
Xt −1
E s = E Xt s = E[Xt ]
ds

s=1 s=1
d2 Xt
Xt −2
E s = E Xt (X t − 1)s = Var(Xt )
ds2

s=1 s=1
Explosion in time-homogeneous case

Distribution of explosion time: By Kolmogorov’s backward equation,
F (t, x) := Px [ζ ≤ t] = Px [Xt = ∆] = 1 − Ex [Is (Xt )]
where Ex [Is (Xt )] is the minimal non-negative solution of

∂u
= L u, u(0, x) = 1,
∂t
hence F (t, x) is the maximal solution ≤ 1 of
∂u
= L u, u(0, x) = 0
∂t
Laplace transformation: α > 0,
Z∞
1
Fα (x) := Ex e−α% = e−αt F (t, x) dt

α
0
since
Z∞ Z∞
1 1
e−α% = e−αt dt = e−αt · I{t≥%} dt
α α
% 0
Informally by integrating by parts,

Z∞ Z∞
1
(L Fα )(x) = −αt
e L F (t, x) dt = e−αt F (t, x) dt = (αFα )(x)
α | {z }
0 = ∂F 0
∂t
Theorem 1.24. (Necessary and sufficient condition for non-explosion, Reuter’s criterion)
1. Fα is the maximal solution of
L g = αg (1.10)
satisfying 0 ≤ y ≤ 1.
2. The minimal Markov jump process is non-explosive if and only if (1.10) has only the trivial
solution satisfying 0 ≤ y ≤ 1.
Proof. 1. by first step analysis (Exercise)
2. ζ = ∞ Px -a.s. if and only if Fα (x) = 0.

1.5. ASYMPTOTICS OF TIME-HOMOGENEOUS JUMP PROCESSES 33
1.5 Asymptotics of Time-homogeneous jump processes

Let (Xt , Px ) be a minimal Markov jump process with jump rates q(x, dy). The generator is given
by
(L f )(x) = (qf )(x) − λ(x)f (x) = λ(x) · ((πf )(x) − f (x))
i.e.
L = λ · (π − I) (1.11)
where π − I is the generator of the jump chain (Yn ). Let
T := inf {t ≥ 0 | Xt ∈ Dc } , D ⊆ S open
Theorem 1.25 (Dirichlet and Poisson problem). For any measurable functions c : D → R+ and
f : D c → R+ ,  T 
Z
u(x) := Ex  c(Xt ) dt + f (XT ) · I{T <ζ} 
0
is the minimal non-negative solution of the Poisson equation
−L u = c on D
(1.12)
u=f on Dc
Proof. 1. For c ≡ 0 this follows from the corresponding result in discrete time. In fact, the
exit points from D of Xt and Yt coincide, and hence
f (XT ) · I{T <ζ} = f (Yτ ) · I{τ <∞}
where τ = inf{n ≥ 0 | Yn ∈
/ D}. Therefore u is the minimal non-negative solution of
πu = u on D
u=f on Dc
which is equivalent to (1.12) by (1.11).
2. In the general case the assertion can be proven by first step analysis (Exercise).
Example . 1. Hitting probability of D c :

u(x) = Px [T < ζ] solves L u = 0 on D, u = 1 on Dc .
2. Distribution of XT :
u(x) = Ex [f (XT ) ; T < ζ] solves L u = 0 on D, u = f on Dc .
3. Mean exit time:
u(x) = Ex [T ] solves −L u = 1 on D, u = 0 on Dc .
4. Mean occupation time of A before exit from D:
A
D
b b
A ⊆ D measurable
Z∞
 T 
Z
u(x) = Ex  IA (Xt ) dt = Px [Xt ∈ A, t < T ] dt
0 0
solves −L u = IA on D, u = 0 on Dc . u(x) = GD (x, A) is called Green function of a

Markov process in a domain D.
Assumption (from now on): S is countable.
Definition 1.26. 1. A state x ∈ S is called recurrent if and only if

Px [{t ≥ 0 : Xt = x} is unbounded] = 1,
and transient if and only if
Px [{t ≥ 0 : Xt = x} is unbounded] = 0
2. A state y ∈ S is called accessible from x (x y) if and only if
Px [Xt = y for some t ≥ 0] > 0.
x and y communicate (x ! y) if and only if x y and y x. The Markov chain

(Xt , Px ) is called irreducible if and only if all states communicate. It is called recurrent
respectively transient if and only if all states are recurrent respectively transient.
Lemma 1.27. Suppose that λ(x) > 0 for all x ∈ S. Then:
1. For x, y ∈ S the following assertions are equivalent:
(i) x y
(ii) x y for the jump chain (Yn )
(iii) There is a k ∈ N and x1 , . . . , xk ∈ S such that
q(x, x1 )q(x1 , x2 ) · · · q(xk−1 , xk )q(xk , y) > 0
(iv) pt (x, y) > 0 for all t > 0

(v) pt (x, y) > 0 for some t > 0.
2. A state x ∈ S is recurrent (respectively transient) if and only if it is recurrent (respectively

transient) for the jump chain.
Proof. 1. (i) ⇔ (ii) since Yn visits the same states as Xt

(ii) ⇒ ∃x1 , . . . xk such that π(x, x1 )π(x1 , x2 ) · · · π(xk , y) > 0 ⇒ (iii) since λ > 0.
(iii) ⇒ (iv) : q(a, b) > 0 and hence
pt (a, b) ≥ Pa [J1 ≤ t < J2 , Y1 = b] > 0
for all t > 0. Hence with (iii) and the independence of the states,
pt (x, y) ≥ p k+1
t (x, x1 )p t (x1 , x2 ) · · · p t (xk , y) > 0
k+1 k+1
(iv) ⇒ (v) ⇒ (i) is obvious.
2. If x is recurrent for Yn then
Px [λ(Yn ) = λ(x) infinitely often] = 1,
and hence
∞
X
ζ= (Ji − Ji−1 ) = ∞ Px -a.s.
1
since (J1 − Ji−1 ) are conditional independent given (Yn ), Exp(λ(x)) distributed infinitely
often.
Since λ > 0, the process (Xt )t≥0 does not get stuck, and hence visits the same states as the
jump chain (Yn )n≥0 . Thus Xt is recurrent. Similarly, the converse implication holds.
If x is transient for (Xt ) then it is transient for (Yn ) since otherwise it would be recurrent
by the dichotomy in discrete time. Finally, if x is transient for (Yn ) then it is transient for
Xn since the process spends only finite time in each state.
Let
Tx := inf{t ≥ J1 : Xt = x}
denote the first passage time of x.
Theorem 1.28 (Recurrence and transience). 1. Every x ∈ S is either recurrent or transient.

2. If x is recurrent and x y, then y is recurrent.
R∞
3. x recurrent ⇔ λ(x) = 0 or Px [Tx < ∞] = 1 ⇔ G(x, x) = 0
pt (x, x) dt = ∞
Proof. Under the assumption λ > 0, the first two assertions follow from the corresponding result
in discrete time. If λ(x) = 0 for some x, we can apply the same arguments if we construct the
process (Xt ) form a jump chain which is absorbed at x ∈ S with λ(x) = 0.
3. If λ(x) > 0 then by the discrete time result x is recurrent if and only if
Px [Yn = x for some n ≥ 1] = 1,
i.e., if and only if
Px [Tx < ∞]
Moreover, the Green function of (Xt ) can be computed from the Green function of the jump
chain (Yn ):
Z∞
∞ 
Z
pt (x, x) dt = Ex  I{x} (Xt ) dt
0
" 0∞ #
X
= Ex (Jn+1 − Jn )I{x} (Yn )
n=0
∞
X
= E[Jn+1 − Jn | Yn = x] · Px [Yn = x]
| {z }
n=0
∼Exp(λ(x))
∞
1 X
n
= π (x, x)
λ(x)
|n=0 {z }
discrete-time Green function
Hence
G(x, x) = ∞ ⇔ λ(x) = 0 or x recurrent for Yn ⇔ x recurrent for Xt
Remark (Strong Markov property).

Eν F (XT +• ) · I{T <ζ} | FT = EXT [F (X)]
+
Pν -a.s. on {T < ζ} for any FtX stopping time T, and any measurable function F : S R → R+ ,
and any initial distribution ν ∈ M1 (S).
Proof. Either directly from the strong Markov property for the jump chain (Exercise). A more
general proof that applies to other continuous time Markov processes as well will be given in the
next chapter.
Definition 1.29. A positive measure µ on S is called stationary (or invariant) with respect to (pr )t≥0
if and only if
µpt = µ
for all t ≥ 0.
Theorem 1.30 (Existence and uniqueness of stationary measure). Suppose that x ∈ S is re-
current. Then:
1. T 
Zx
µ(B) := Ex  IB (Xt ) dt , B ⊆ S,
0
is a stationary measure. If x is positive recurrent (i.e. Ex [Tx ] < ∞) then
µ(B)
µ̄(B) =
µ(S)
is a stationary probability distribution.
2. If (Xt , Px ) is irreducible then any stationary measure is a multiple of µ.
Remark . 1. µ(B) = expected time spent in B before returning to x.

2. Relation to stationary measure of the jump chain:
τx := inf{n ≥ 1 : Yn = x}
T 
Zx
µ(y) = Ex  I{y} (Xt ) dt
0
"τ −1 #
Xx
= Ex (Jn+1 − Jn )I{y} (Yn )

n=0
 
∞
X
= Ex Ex [Jn+1 − Jn | (Yk )] · I{n<τx } · I{y} (Yn )
 
| {z }
n=0
∼Exp(λ(y))
"τ −1 #
x
1 X
= Ex I{y} (Yn )
λ(y) n=0
| {z }
=:ν(y) stationary measure for jump chain
Proof of theorem. 1. Fix B ⊆ S and s ≥ 0.

 s  T 
Z Zx
µ(B) = Ex  IB (Xt ) dt + Ex  IB (Xt ) dt
0 s
 T +s  T 
Zx Zx
= Ex  IB (Xt ) dt + Ex  IB (Xt ) dt
Tx s
 T +s  T 
Zx Zx
= Ex  IB (Xt ) dt = Ex  IB (Xs+t ) dt
s 0
Z∞
= Ex [Px [Xs+t ∈ B | Ft ]; Tx > t] dt
| {z }
0 =ps (Xt ,B)
T 
X Zx
= Ex  I{y} (Xt ) dt ·ps (y, B) = (µps )(B)
y∈S 0
| {z }
=µ(y)
Here we have used in the second step that

 T +s   s 
Zx Z
Ex  IB (Xt ) dt | FTx  = Ex  IB (Xt ) dt
Tx 0
by the strong Markov property for (Xt ).

We have shown that µ is a stationary measure. If x is positive recurrent then µ(S) is finite,
and hence µ can be normalized to a stationary probability distribution.
2. If (Xt ) is irreducible then the skeleton chain (Xn )n=0,1,2,... is a discrete-time Markov chain
with transition kernel
p1 (x, y) > 0 ∀ x, y ∈ S
Hence (Xn ) is irreducible. If we can show that (Xn ) is recurrent, than by the discrete-time
theory, (Xn ) has at most one invariant measure (up to a multiplicative factor), and thus the
same holds for (Xt ). Since x is recurrent for (Xt ), the jump chain (Yn ) visits x infinitely
often with probability 1. Let K1 < K2 < · · · denote the successive visit times. Then
XJKi = YKi = x (1.13)
for all i. We claim that also
XdJKi e = x infinitely often (1.14)
In fact, the holding times JKi+1 − JKi , i ∈ N, are conditionally independent given (Yn )
with distribution Exp(λ(x)). Hence
Px [JKi+1 − JKi > 1 infinitely often] = 1,
which implies (1.14) by (1.13). The recurrence of (Xn ) follows from (1.14) by irreducibil-
ity.
If S is finite, by the Kolmogorov backward equation
µpt = µ ∀ t ≥ 0 ⇔ µL = 0
In the general case this infinitesimal characterization of stationary measures does not always hold,
cf. the example below. However, as a consequence of the theorem we obtain:
Corollary 1.31 (Infinitesimal characterization of stationary distribution). Suppose that (Xt , Px )

is irreducible, and µ ∈ M1 (S). Then:
1. If µ is a stationary distribution of (pt )t≥0 then all states are positive recurrent, and
(µL )(y) = 0 ∀ y ∈ S (1.15)
2. Conversely, if (1.15)Pholds then µ is stationary provided (Xt , Px ) is recurrent. This is for

example the case if x∈S λ(x)µ(x) < ∞.
Remark . Condition on (1.15) means that

< µL , f >=< µ, L f >= 0
P
for all finitely supported functions f : S → R. Note that if λ(x)µ(x) = ∞ then µL = µq−λµ
is not a signed measure with finite variation. In particular, < µL , 1 > is not defined!
Proof. A stationary distribution µ of (Xt ) is also stationary for the skeleton chain (Xn )n=0,1,2,... ,
which is irreducible as noted above. Therefore, the skeleton chain and thus (Xt ) are positive
recurrent. Now the theorem and the remark above imply that in the recurrent case, a measure µ
is stationary if and only if λ · µ is stationary for the jump chain (Yn ), i.e.
X
(µq)(y) = λ(x)µ(x)π(x, y) = λ(y)µ(y) (1.16)
x∈S
P to (µL )(y) = 0.
which is equivalent
In particular, if λ(x)µ(x) < ∞ and µL = 0 then λµ is a stationary distribution for (Yn ),
where (Yn ) and thus (Xt ) are positive recurrent.
Example . We consider the minimal Markov jump process with jump chain Yn = Y0 + n and
intensities λ(x) = 1 + x2 . Since ν(y) ≡ 1 is a stationary measure for (Yn ), i.e. νπ = ν, we see
that
ν(y) 1
µ(y) := =
λ(y) 1 + y2
is a finite measure with (µL )(y) = 0 for all y. However, Xt is not recurrent (since Yn is
transient), and hence µ is not stationary for Xt !
Actually, in the example above, Xt is explosive. In fact, one can show:
Theorem 1.32. If (Xt ) is irreducible and non-explosive, and µ ∈ M1 (S) satisfies (1.16), then µ
is a stationary distribution.
Proof. Omitted, cf. Asmussen [4], Theorem 4.3 in chapter II.
Remark . Detailed balance:
Condition (1.16) is satisfied provided the detailed balance condition

µ(x)q(x, y) = µ(y)q(y, x) (1.17)
holds for all x, y ∈ S. In fact, (1.17) implies
X X
(µq)(y) = µ(x)q(x, y) = µ(y) q(y, x) = λ(y)µ(y)
x∈S x∈S
for all y ∈ S.
Example . Stationary distributions and mean hitting times of birth-death process:
For a birth-death process on {0, 1, 2, . . .} with strictly positive birth rates b(x) and death rates
d(x) the detailed balance condition is
µ(x)b(x) = µ(x + 1)d(x + 1) (1.18)
for all x ≥ 0. Hence detailed balance holds if and only id µ is a multiple of

x
Y d(i)
ν(x) :=
i=1
b(i − 1)
Suppose that
∞
X
ν(n) < ∞ (1.19)
n=0
Then
ν(x)
µ(x) := P∞
y=0 ν(y)
is a probability distribution satisfying (1.17), and hence (1.16). By irreducibility, µ is the unique
stationary probability distribution provided the process is non-explosive. The example above
shows that explosion may occur even when (1.19) holds.
Theorem 1.33. Suppose (1.19) holds and

∞
X 1
= ∞. (1.20)
n=0
b(n)
Then:
1. The minimal birth-death process in non explosive, and µ is the unique stationary probabil-
ity distribution.
2. The mean hitting times are given by
(a)
y−1
X µ({0, 1, . . . , n})
Ex [Ty ] = for all 0 ≤ x ≤ y and
n=x
µ(n) · b(n)
(b)
x
X µ({n, n + 1, . . .})
Ex [Ty ] = for all 0 ≤ y ≤ x respectively.
n=y+1
µ(n) · d(n)
In particular, the mean commute time between x and y is given by

(c)
y−1
X 1
Ex [Ty ] + Ey [Tx ] = for all 0 ≤ x < y.
n=x
µ(n) · b(n)
Proof. 1. Reuter’s criterion implies that the process is non-explosive of and only if
∞
X ν({0, . . . , n})
=∞
n=0
b(n)
(Exercise, cf. Brémaud [7], Chapter 8, Theorem 4.5).

If (1.19) holds then this condition is equivalent to (1.20).
2. Fix y ∈ N. The function

 Ty 
Z
u(x) = Ex [Ty ] = Ex  1 dt , 0≤x≤y
0
is the minimal non-negative solution of the Poisson equation
−L u = 1 on {0, 1, . . . , y − 1}, u(y) = 0
Hence u0 (n) := u(n + 1) − u(n) solves the ordinary difference equation
b(0)u0 (0) = 1,
b(n)u0 (n) + d(n)u0 (n − 1) = 1
for all 1 ≤ n < y. By the detailed balance condition (1.18) the unique solution of this
equation is given by
n n n
0
X 1 Y d(l) (1.18) X µ(k)
u (n) = − = − ∀0≤n≤y
k=0
b(k) l=k+1 b(l) k=0
µ(n)b(n)
Assertion (a) now follows by summing over n and taking into account the boundary con-
dition u(y) = 0. The proof of (b) is similar and (c) follows from (a) and (b) since
µ(n)d(n) = µ(n − 1)b(n − 1)
by (1.18).
Remark . Since µ(n) · b(n) is the flow through the edge {n, n + 1}, the right hand side of (c)
can be interpreted as the effective resistance between x and y of the corresponding electrical
network. With this interpretation, the formula carries over to Markov chains on general graphs
and the corresponding electrical networks, cf. Aldous, Fill [1].
Theorem 1.34 (Ergodic Theorem). Suppose that (Xt , Px ) is irreducible and has stationary prob-
ability distribution µ̄. Then
Zt Z
1
f (Xs ) ds −→ f dµ̄
t
0
Pν -a.s. as t → ∞ for any non-negative function f : S → R and any initial distribution ν ∈
M1 (S).
Remark . 1. In particular,
Zt
1
µ̄(B) = lim IB (Xs ) ds Pν -a.s.
t→∞ t
0
stationary probability = mean occupation time.

2. More generally: If (Xt , Px ) is irreducible and recurrent then
Zt Z
1
f (Xs ) ds −→ f dµx Pν -a.s. for all x ∈ S
Lxt
0
Rt
where Lxt = 0
I{x} (Xs ) ds is the occupation time of x, and
T 
Zx
µx (B) = Ex  IB (Xs ) ds
0
is a stationary measure.
Proof. Similar to the discrete time case. Fix x ∈ S and define recursively the successive leaving
and visit times of the state x:
T̃ 0 = inf {t ≥ 0 : Xt 6= x}
n o
T n = inf t ≥ T̃ n−1 : Xt = x visit times of x
T̃ n = inf {t ≥ T n : Xt 6= x} leaving times of x
We have
ZT n n−1
X
f (Xs ) ds = Yk
k=1
T1
where
TZk+1 T k+1
Z −T k
Yk := f (Xs ) ds = f (Xs+Tk ) ds
Tk 0
Note that T k+1 (X) = T k (X) + T 1 (XT k +• ). Hence by the strong Markov property the random
variables are independent and identically distributed with expectation
T 
Zx Z
Eν [Yk ] = Eν [Eν [Yk | FT k ]] = Ex  f (Xs ) ds = f dµ

0
The law of large numbers now implies
ZT n Z
1
f (Xs ) ds −→ f dµx Pν -a.s. as n → ∞ (1.21)
n
0
In particular,
Tn
−→ µx (S) Pν -a.s.
n
By irreducibility, the stationary measure is unique up to a multiplicative factor. Hence µx (S) <
∞ and µ̄ = µxµ(S)
x
. Thus we obtain
Z R ZTn Zt
f dµ n 1 1
f dµ̄ = = lim · f (Xs ) ds ≤ lim inf f (Xs ) ds
µ(S) n→∞ Tn+1 n t→∞ t
0 0
Zt ZTn Z
1 n 1
≤ lim sup f (Xs ) ds ≤ lim sup · f (Xs ) ds = f dµ̄s
t→∞ t n→∞ Tn n
0 0
i.e.
Zt Z
1
f (Xs ) ds −→ f dµ̄ Pν -a.s.
t
0
Chapter 2
Interacting particle systems
2.1 Interacting particle systems - a first look

Let G = (V, E) be an (undirected) graph with V the set of vertices and E the set of edges. We
write x ∼ y if and only if {x, y} ∈ E. We call
S = T V = {η : V → T }
the configuration space. T can be the space of types, states, spins etc.
E.g.
(
1 particle at x
T = {0, 1}, η(x) =
0 no particle at x
no particle
particle
Markovian dynamics: η(x) changes to state i with rate

ci (x, η) = gi ((η(x), (η(y))y∼x )
45
46 CHAPTER 2. INTERACTING PARTICLE SYSTEMS
i.e.
(
ci (x, η) if ξ = η x,i
q(η, ξ) =
0 otherwise
where
(
6 x
η(y) for y =
η x,i (y) =
i for y = x
Example . 1. Contact process: (Spread of plant species, infection,...) T = {0, 1}. Each
particle dies with rate d > 0, produces descendent at any neighbor site with rate b > 0 (if
not occupied)
c0 (x, η) = d
c1 (x, η) = b · N1 (x, η); N1 (x, η) := |{y ∼ x : η(y) = 1}|
Spatial branching process with exclusion rule (only one particle per site).
2. Voter model: η(x) opinion of voter at x,
ci (x, y) = Ni (x, y) := |{y ∼ x : η(y) = i}|
changes opinion to i with rate equal to number of neighbors with opinion i.
3. Ising model with Glauber (spin flip) dynamics: T = {−1, 1}, β > 0 inverse tempera-
ture.
(a) Metropolis dynamics:
X
∆(x, η) := η(y) = N1 (x, η) − N−1 (x, η) total magnetization
y∼x
c1 (x, η) := min e2β·∆(x,η) , 1

c0 (x, η) := min e−2β·∆(x,η) , 1

(b) Heath bath dynamics / Gibbs sampler:

eβ∆(x,η)
c1 (x, η) =
eβ∆(x,η) + e−β∆(x,η)
e−β∆(x,η)
c0 (x, η) = β∆(x,η)
e + e−β∆(x,η)
β = 0: (infinite temperature) c1 ≡ c0 ≡ 21 , random walk on {0, 1}V (hypercube)
β → ∞: (zero temperature)
 

 1 if ∆(x, y) > 0 1 if ∆(x, y) < 0

1
c1 (x, η) = 2 if ∆(x, y) = 0 , c0 (x, η) = 21 if ∆(x, y) = 0
 
0 if ∆(x, y) < 0 0 if ∆(x, y) > 0
 
Voter model with majority vote.

2.1. INTERACTING PARTICLE SYSTEMS - A FIRST LOOK 47
In the rest of this section we will assume that the vertex set V is finite. In this case, the config-
uration space S = T V is finite-dimensional. If, moreover, the type space T is also finite then S
itself is a finite graph with respect to the Hamming distance
d(η, ξ) = |{x ∈ V ; η(x) 6= ξ(x)}|
Hence a continuous-time Markov chain (ηt , Px ) can be constructed as above from the jump rates
qt (η, ξ). The process is non-explosive, and the asymptotic results from the last section apply. In
particular, if irreducibility holds the there exists a unique stationary probability distribution, and
the ergodic theorem applies.
Example . 1. Ising Model: The Boltzman distribution

1 −βH(η) X
µβ (η) = e , Zβ = e−βH(η) ,
Zβ η
with Hamiltonian
1 X X
H(η) = (η(x) − η(y))2 = η(x)η(y) + |E|
2
{x,y}∈E {x,y}∈E
is stationary, since it satisfies the detailed balance condition
µβ (η)q(η, ξ) = µβ (ξ)q(ξ, η) ∀ ξ, η ∈ S.
Moreover, irreducibility holds - so the stationary distribution is unique, and the ergodic
theorem applies (Exercise).
2. Voter model: The constant configurations i(x) ≡ i, i ∈ T , are absorbing states, i.e.
cj (x, i) = 0 for all j 6= i, x. Any other state is transient, so
" #
[
P {ηt = i eventually} = 1.
i∈T
Moreover,
Ni (ηt ) := |{x ∈ V : ηt (x) = i}|
is a martingale (Exercise), so
t→∞
Ni (η) = Eη [Ni (ηt )] −→ Eη [Ni (η∞ )] = N · P [ηt = i eventually]
i.e.
Ni (η)
P [ηt = i eventually] =
N
The stationary distributions are the Dirac measures δi , i ∈ T , and their convex combina-
tions.
3. Contact process: The configuration 0 is absorbing, all other states are transient. Hence
δ0 is the unique invariant measure and ergodicity holds.
We see that on finite graphs the situation is rather simple as long as we are only interested in
existence and uniqueness of invariant measures, and ergodicity. Below, we will show that on
infinite graphs the situation is completely different, and phase transitions occur. On finite sub-
graphs on an infinite graph these phase transitions effect the rate of convergence to the stationary
distribution and the variances of ergodic averages but not the ergodicity properties themselves.
Mean field models:

Suppose that G is the complete graph with n vertices, i.e.
V = {1, . . . , n} and E = {{x, y} : x, y ∈ V }
Let
n
1X
Ln (η) = δη(x)
n x=1
denote the empirical distribution of a configuration η : {1, . . . , n} → T , the mean field. In a

mean-field model the rates
ci (x, η) = fi (Ln (η))
are independent of x, and depend on η only through the mean field Ln (η).
Example . Multinomial resampling (e.g. population genetics), mean field voter model.
With rate 1 replace each type η(x), x ∈ V , by a type that is randomly selected from Ln (η):
1
ci (x, η) = Ln (η)(i) = |{x ∈ η : η(x) = i}|
n
As a special case we now consider mean-field models with type space T = {0, 1} or T =
{−1, 1}. In this case the empirical distribution is completely determined by the frequence of type
1 in a configuration:
Ln (η) ←→ N1 (η) = |{x : η(x) = 1}|
ci (x, y) = f˜(N1 (η))
If (ηt , Px ) is the corresponding mean field particle system, then (Exercise) Xt = N1 (η) is a
birth-death process on {0, 1, . . . , n} with birth/death rates
b(k) = (n − k) · f˜1 (k), d(k) = k · f˜0 (k)
where (n − k) is the number of particles with state 0 and f˜1 (k) is the birth rate per particle.
Explicit computation of hitting times, stationary distributions etc.!
2.1. INTERACTING PARTICLE SYSTEMS - A FIRST LOOK 49
Example . 1. Binomial resampling: For multinomial resampling with T = {0, 1} we ob-

tain
k · (n − k)
b(k) = d(k) =
n
2. Mean-field Ising model: For the Ising model on the complete graph with inverse temper-
ature β and interaction strength n1 the stationary distribution is
β P 2 β P P β 2
µβ (η) ∝ e− 4n x,y (η(x)−η(y)) ∝ e 2n x η(x)· y η(y)
= e 2n m(η)
where
n
X
m(η) = η(x) = N1 (η) − N−1 (η) = 2N1 (η) − n
x=1
is the total magnetization. Note that each η(x) is interacting with the mean field n1
P
η(y),
which explains the choice of interacting strength of order n1 . The birth-death chain N1 (ηt )
corresponding to the heat bath dynamics has birth and death rates
k n−k
eβ n eβ n
b(k) = (n − k) · k n−k , d(k) = k · k n−k
eβ n + eβ n eβ n + eβ n
and stationary distribution

n −n 2β n 2
2 e n (k− 2 ) ,
X
µ̄β (k) = µβ (η) ∝ 0≤k≤n
k
η : N1 (η)=k
The binomial
√
distribution Bin(n, 21 ) has a maximum at its mean value n2 , and standard
n
deviation
√ 2
. Hence for large n, the measure µ̄β has one sharp mode of standard deviation
O( n) if β is small, and two modes if β is large:
| | | | |
0 n n 0 n
2
β1 β1
The transition from uni- to multimodality occurs at an inverse temperature βn with
lim βn = 1 (Exercise)
n→∞
The asymptotics of the stationary distribution as n → ∞ can be described more accurately

using large deviation results, cf. below.
Now consider the heat bath dynamics with an initial configuration η0 with N1 (η0 ) ≤ n2 , n even,
and let n no
T := inf t ≥ 0 : N1 (ηt ) > .
2
By the formula for mean hitting times for a birth-and-death process,
n
µ̄β 0, 1, . . . , n2 1

2 eβ 2
E[T ] ≥ ≥ ≥ n
µ̄β n2 · b n2 n n

µ̄β 2
· 2
n2
since
n
n βn βn
µ̄β = n · e− 2 µ̄β (0) ≤ 2n e− 2 .
2 2
Hence the average time needed to go from configurations with negative magnetization to states
with positive magnetization is increasing exponentially in n for β > 2 log 2. Thus although
ergodicity holds, for large n the process gets stuck for a very large time in configurations with
negative resp. positive magnetization.
Metastable behaviour.
More precisely, one can show using large deviation techniques that metastability occurs for any
inverse temperature β > 1, cf. below.
2.2 Particle systems on Zd

Reference:
• Durett [9]
• Liggett [15]
V = Zd , T finite
d
E = {(x, y) : |x − y|l1 = 1} S = T Z with product topology, compact
µn → µ ⇔ µn (x) → µ(x) ∀ x ∈ Zd
2.2. PARTICLE SYSTEMS ON ZD 51
Assumptions:
(i) µ̄ := sup ci (x, y) < ∞

i∈T
x∈Zd
(ii) ci (x, y) = gi (η(x), (η(y))y∼x ) translation invariant and nearest neighbor
Graphical construction of associated particle systems:

Hier fehlt ein Bild!
Ntx,i independent Poisson process with rate λ̄ (alarm clock for transition at x to i)
Tnx,i n-th. arrival time of Ntx,i
Unx,i independent random variables uniformly distributed on [0, 1]
Recipe: At time Tnx,i , change η(x) to i provided

x,i ci (x, y) ci (x, y)
Un ≤ i.e. with probability
λ̄ λ̄
Problem: Infinitely many Poisson processes, hence transitions in arbitrary small time, no first
transition.
How can we consistently define a process from the jump times? For a finite subset A ⊂ Zd and
ξ ∈ S, the restricted configuration space
Sξ,A := {η ∈ S | η = ξ on Ac }

(s,ξ,A)
is finite. Hence for all s ≥ 0 there exists a unique Markov jump process ηt on Sξ,A
t≥s
(s,ξ,A)
with initial condition ηs = ξ and transitions t ≥ s, η → η x,i at times Tnx,i whenever Unx,i ≤
ci (x,y) (s,ξ)
λ̄
, x ∈ A. The idea is now to define a Markov process ηt on S for t − s small by
(s,ξ) (s,ξ,A)
ηt := ηt
where A is an appropriately chosen finite neighborhood of x. The neighborhood should be cho-

(s,ξ)
sen in such a way that during the considered time interval, ηt (x) has only been effected by
previous values on A of the configuration restricted to A. That this is possible is guaranteed by
the following observation:
For 0 ≤ s ≤ t we define a random subgraph (Zd , Es,t (ω)) of (V, E) by:
Es,t (ω) = {x, y} : Tnx,i ∈ (s, t] or Tny,i ∈ (s, t] for some n ∈ N and i ∈ T

If x effects y in the time interval (s, t] or vice versa then {x, y} ∈ Es,t .
Lemma 2.1. If
1
t−s≤ =: δ
8 · d2 · |T | · λ̄
then
P all connected components of (Zd , Es,t ) are finite = 1.

Consequence: For small time intervals [s, t] we can construct the configuration at time t form
the configuration at time s independently for each component by the standard construction for
jump processes with finite state space.
Proof. By translation invariance it suffices to show

P [|C0 | < ∞] = 1
where C0 is the component of (Zd , Es,t ) containing 0. If x is in C0 then there exists a self-avoiding
path in (Zd , Es,t ) starting at 0 with length dl1 (x, 0). Hence
P [∃ x ∈ C0 : dl1 (x, 0) ≥ 2n − 1]
≤ P [∃ self-avoiding path z1 = 0, z2 , . . . , z2n−1 s.t. (zi , zi+1 ) ∈ Es,t ∀ i]
n−1
Y
≤ (2d)2n−1 · P [(z2i , z2i+1 ) ∈ Es,t ]
i=0
where (2d)2n−1 is a bound for the number of self-avoiding paths starting at 0 and independent
events {(z2i , z2i+1 ) ∈ Es,t }.
Hence
n
P [∃ x ∈ C0 : dl1 (x, 0) ≥ 2n − 1] ≤ 4d2 · 1 − e−2|T |λ̄(t−s)
≤ (8d2 · |T |λ̄ · (t − s))n −→ 0
as n → ∞, where e−2|T |λ̄(t−s) is the probability for no arrival in [s, t] in a 2|T | Poisson(λ̄) process
and 1 − e−2|T |λ̄(t−s) ≤ 2|T |λ̄ · (t − s).
d
By the lemma, P -almost sure for all s > 0 and ξ ∈ T Z , there is an unique function t 7→
(s,ξ)
ηt , t ≥ s, such that
2.2. PARTICLE SYSTEMS ON ZD 53
(s,ξ)
(i) ηs =ξ

(s,ξ)
(ii) For s ≤ t, h ≤ δ, and each connected component C of (Zd , Et,t+h ), ηt+h is obtained
C
(s,ξ)
from ηt by subsequently taking into account the finite number of transitions in C
C
during [t, t + h].
We set
ηtξ := ηt0,ξ .
By construction,
(s,ηsξ )
ηtξ = ηt ∀0 ≤ s ≤ t (2.1)
Corollary 2.2. (i) Time-homogeneity:

(s,ξ)
ηs+t ∼ ηtξ
t≥0 t≥0
(ii) (ηtξ , P ) is a Markov process with transition semigroup

(pt f )(ξ) = E[f (ηtξ )]
(iii) Feller property:

f ∈ Cb (S) =⇒ pt f ∈ Cb (S) ∀ t ≥ 0
Or, equivalently, pt f is continuous whenever f is continuous with respect to the product
topology. Since S is compact, any continuous function is automatically bounded.
(iv) Translation invariance: Let ξ : Ω → S be a random variable, independent of all Ntx,i
and translation invariant, i.e. ξ(x + •) ∼ ξ for all x ∈ Zd . Then ηtξ is translation invariant
for all t ≥ 0 P -a.s.
Sketch of proof: (i) by the time homogeneity of the Poisson arrivals.

(ii)

(s,ηsξ )
h i
ξ 2.1
E f ηt | Fs (ω) = E f ηt | Fs (ω)
(s,ξ)
taking into account the Fs -measurability of ηsξ and ηt being independent of Fs for fixed
ξ, we conclude with (i)

(s,ηsξ (ω))
h i
ξ
E f ηt | Fs (ω) = E f ηt
h ξ i
ηs (ω)
= E f ηt−s
= (pt−s f ) ηsξ (ω)

(iii)
ξn → ξ ⇒ ξn (x) → ξ(x) ∀ x ∈ Zd
Hence ξn = ξ eventually on each finite set C ⊂ Zd , and hence on each component of
Zd , E0,δ . By the componentwise construction,
ηtξn = ηtξ ∀t ≤ δ
eventually on each component. Hence
ηtξn → ηtξ (pointwise) ∀ t ≤ δ
and for f ∈ Cb (S),

f ηtξn → f ηtξ
for all t ≤ δ. With Lebesgue we conclude

h i
pt f (ξn ) = E f ηtξn −→ pt f (ξ) ∀t ≤ δ
Hence OK for t ≤ δ. General case by semigroup property:

btc
pt = pt−b δt c·δ pδ δ : Cb (S) → Cb (S)
(iv) The ci (x, y) are translation invariant by assumption,

Nnx,i t,i , Unx,i n,i

are identically distributed. This gives the claim.
Theorem 2.3 (Forward equation). For any cylinder function
f (η) = ϕ (η(x1 ), . . . , η(xn )) , n ∈ N ϕ: Tn → R
the forward equation

d
(pt f ) (ξ) = (pt L f ) (ξ)
dt
holds for all ξ ∈ S where
X
ci (x, ξ) · f (ξ ξ,i ) − f (ξ)

(L f )(ξ) =
x∈Zd
i∈T
2.3. STATIONARY DISTRIBUTIONS AND PHASE TRANSITIONS 55
Remark . Since f is a cylinder function, the sum in the formula for the generator has only finitely
many non-zero summands.
Proof. " #
X
P Ntxk ,i > 1 ≤ const. · t2
k=1,...,ni∈T
where {Ntxk ,i > 1} means that there is more than one transition in the time interval [0, t] among
{x1 , . . . , xn } and const. is a global constant.
P Ntxk ,i = 1 = λ̄ · t + O(t2 )

and hence
(pt f )(ξ) = E[f (ηtξ )]

= f (ξ) · P Ntxk ,i = 0 ∀ 1 ≤ k ≤ n, i ∈ T

X
xk ,i xk ,i xk ,i ci (x, ξ)
+ f (ξ ) · P Nt = 1, U1 ≤ + O(t2 )
i,k
λ̄
X λ̄ci (xk , ξ)
· f (ξ xk ,i ) − f (ξ) + O(t2 )

= f (ξ) + t·
i,k
λ̄
= f (ξ) + t · (L f )(ξ) + O(t2 )
where the constants O(t2 ) do not depend on ξ. Hence
pt+h f = pt ph f = pt f + hpt L f + O(h2 )
2.3 Stationary distributions and phase transitions
The reference for this chapter is Liggett [15].
From now on we assume T = {0, 1}. We define a partial order on configurations η, η̃ ∈ S =

d
{0, 1}Z by
η ≤ η̃ :⇔ η(x) ≤ η̃(x) ∀ x ∈ Zd
A function f : S → R is called increasing if and only if
f (η) ≤ f (η̃) whenever η ≤ η̃.

Definition 2.4 (Stochastic dominance). For probability measures µ, ν ∈ M1 (S) we set

Z Z
µ 4 ν :⇔ f dµ ≤ f dν for any increasing bounded function f : S → R
Example . For µ, ν ∈ M1 (R),
µ 4 ν ⇔ Fµ (c) = µ ((−∞, c]) ≥ Fν (c) ∀ c ∈ R
Now consider again the stochastic dynamics constructed above.
c1 (x, η) birth rates

c0 (x, η) death rates

Definition 2.5. The Markov process ηtξ , P is called attractive if and only if for all x ∈ Zd ,
(
c1 (x, η) ≤ c1 (x, η̃) and
η ≤ η̃, η(x) = η̃(x) ⇒
c0 (x, η) ≥ c0 (x, η̃)
Example . Contact process, voter model, as well as the Metropolis and heat-bath dynamics for
the (ferromagnetic) Ising model are attractive
Theorem 2.6. If the dynamics is attractive then:
1. If ξ ≤ ξ˜ then ηtξ ≤ ηtξ̃ for all t ≥ 0 P -a.s.

2. If f : S → R is increasing then pt f is increasing for all t ≥ 0.
3. If µ 4 ν then µpt 4 νpt for all t ≥ 0 (Monotonicity).
Proof. ˜ hence every single transition preserves order.

1. The dynamics is attractive and ξ ≤ ξ,
Hence
(s,ξ,A) (s,ξ̃,A)
ηt ≤ ηt ∀ 0 ≤ s ≤ t, A ⊂ Zd finite
(s,ξ) (s,ξ̃)
⇒ ηt ≤ ηt ∀ s ≥ 0, t ∈ [s, s + δ]
and by induction
(s,ξ) (s,ξ̃)
ηt ≤ ηt ∀t≥s≥0

(s,ξ)
(s,ξ) s+δ,ηs+δ
since ηt = ηt .
(If, for example, before a possible transition at time Tnx,1 , η ≤ η̃ and η(x) = η̃(x) = 0,
then after the transition, η(x) = 1 if Unx,1 ≤ c1 (x,η)
λ̄
, but in this case also η̃(x) = 1 since
c1 (x, η) ≤ c1 (x, η̃) by attractiveness. The other cases are checked similarly.)
˜
2. Since f is increasing and ξ ≤ ξ,
h i h i
(pt f )(ξ) = E f (ηtξ ) ≤E f (ηtξ̃ ) ˜
= (pt f )(ξ)
3. If f is increasing, pt f is increasing as well and hence by Fubini

Z Z µ4ν
Z Z
f d(µpt ) = pt f dµ ≤ pt f dν = f d(νpt )
Let 0, 1 ∈ S denote the constant configurations and δ0 , δ1 the minimal respectively maximal
element in M1 (S).
d
Theorem 2.7. For an attractive particle system on {0, 1}Z we have
1. The functions t 7→ δ0 pt and t 7→ δ1 pt are decreasing respectively increasing with respect to 4.
2. The limits µ := limt→∞ δ0 pt and µ̄ := limt→∞ δ1 pt exist with respect to weak convergence
in M1 (S)
3. µ and µ̄ are stationary distributions for pt
4. Any stationary distribution π satisfies
µ 4 π 4 µ̄.
Proof. 1.
0≤s≤t ⇒ δ0 4 δ0 pt−s
and hence by monotonicity
δ0 ps 4 δ0 pt−s ps = δ0 pt
d
2. By monotonicity and compactness, since S = {0, 1}Z is compact with respect to the prod-
uct topology, M1 (S) is compact with respect to weak convergence. Thus it suffices to show
that any two subsequential limits µ1 and µ2 of δ0 pt coincide. Now by 1),
Z
f d(δ0 pt )
is increasing in t, and hence

Z Z Z
f dµ1 = lim f d(δ0 pt ) = f dµ2
t↑∞
for any continuous increasing function f : S → R, which implies µ1 = µ2 .

3. Since pt is Feller,
Z Z Z Z
f d(µpt ) = pt f dµ = lim pt f d(δ0 ps ) = lim f d(δ0 ps pt )
s→∞ s→∞
Z Z
= lim f d(δ0 ps ) = f dµ
s→∞
for all f ∈ Cb (S).

4. Since π is stationary,
δ0 pt 4 πpt = π 4 δ1 pt
for all t ≥ 0 and hence for t → ∞,
µ 4 π 4 µ̄.
Corollary 2.8. For an attractive particle system, the following statements are equivalent:
1. µ = µ̄.
2. There is an unique stationary distribution.
3. Ergodicity holds:
∃ µ ∈ M1 (S) : νpt −→ µ ∀ ν ∈ M1 (S).
Proof. 1. ⇔ 2. : by the theorem.

1. ⇒ 3. : Since δ0 4 ν 4 δ1 ,
δ0 pt 4 νpt 4 δ1 pt
and since δ0 pt → µ and δ1 pt → µ̄ for t → ∞,
νpt → µ = µ̄
3. ⇒ 1.: obvious.
Example 1: Contact process on Zd
For the contact process, c0 (x, η) = δ and c1 (x, η) = b · N1 (x, η) where the birth rate b and the
death rate δ are positive constants. Since the 0 configuration is an absorbing state, µ = δ0 is the
minimal stationary distribution. The question now is if there is another (non-trivial) stationary
distribution, i.e. if µ̄ 6= µ.
Theorem 2.9. If 2db < δ then δ0 is the only stationary distribution, and ergodicity holds.
Proof. By the forward equation and translation invariance,
d 1 X
P ηt (x) = 1 = −δP ηt1 (x) = 1 + b · P ηt1 (x) = 0, ηt1 (y) = 1

dt
y : |x−y|=1
1
≤ (−δ + 2db) · P ηt (x) = 1
for all x ∈ Zd . Hence if 2db < δ then
µ̄ ({η : η(x) = 1}) = lim (δ1 pt )({η : η(x) = 1})

t→∞
= lim P ηt1 (x) = 1

t→∞
=0
for all x ∈ Zd and thus µ̄ = δ0 .
Conversely, one can show that for b sufficiently small (or δ sufficiently large), there is nontrivial
stationary distribution. The proof is more involved, cf. Liggett [15]. Thus a phase transition from
ergodicity to non-ergodicity occurs as b increases.
Example 2: Ising model on Zd
We consider the heat bath or Metropolis dynamics with inverse temperature β > 0 on S =
d
{−1, +1}Z .
a) Finite volume: Let A ⊆ Zd be finite,
S+,A := {η ∈ S | η = +1 on Ac } (finite!)
S−,A := {η ∈ S | η = −1 on Ac } .
(0,ξ,A)
For ξ ∈ S+,A resp. ξ ∈ S−,A , ηtξ,A = ηt , the dynamics taking into account only
transitions in A.

ηtξ,A , P is a Markov chain on S+,A resp. S−,A with generator
X
ci (x, η) · f (η x,i ) − f (η)

(L f )(η) =
x∈A
i∈{−1,+1}
Let
1 X
H(η) = (η(x) − η(y))2
4 d
x,y∈Z
|x−y|=1
denote the Ising Hamiltonian. Note that for η ∈ S+,A or η ∈ S−,A only finitely many
summands do not vanish, so H(η) is finite. The probability measure
1
µ+,A
β (η) = e−βH(η) , η ∈ S+,A
Zβ+,A
where X
Zβ+,A = e−βH(η)
η∈S+,A
on S+,A and µ−,A

β on S−,A defined correspondingly satisfy the detailed balance conditions
µ+,A +,A
β (ξ)L (ξ, η) = µβ (η)L (η, ξ) ∀ ξ, η ∈ S+,A
respectively
µ−,A −,A
β (ξ)L (ξ, η) = µβ (η)L (η, ξ) ∀ ξ, η ∈ S−,A .
Since S+,A and S−,A are finite and irreducible this implies that µ+,A
β respectively µ−,A
β is the

unique stationary distribution of µξ,At ,P for ξ ∈ S+,A , S−,A respectively. Thus in finite
volume there are several processes corresponding to different boundary conditions (which
effect the Hamiltonian) but each of them has a unique stationary distribution. Conversely,
in infinite volume there is only one process, but it may have several stationary distributions:
b) Infinite volume: To identify the stationary distributions for the process on Zd , we use an
approximation by the dynamics in finite volume. For n ∈ N let
An := [−n, n]d ∩ Zd ,
(
ξ(x) for x ∈ An
ξn (x) :=
+1 for x ∈ Zd \ An
The sequences µβ+,An and µβ−,An , n ∈ N, are decreasing respectively increasing

d
with respect to stochastic dominance. Hence my compactness of {−1, +1}Z there exist
+,An −,An
µ+
β := lim µβ and µ−
β := lim µβ
n↑∞ n↑∞
Remark (Gibbs measures). A probability measure µ on S is called Gibbs measure for the Ising
Hamiltonian on Zd and inverse temperature β > 0 if and only if for all finite A ⊆ Zd and ξ ∈ S,
1
µξ,A
β (η) := e−βH(η) , η ∈ Sξ,A := {η ∈ S | η = ξ on Ac } ,
Zβξ,A
is a version of the conditional distribution of µβ given η(x) = ξ(x) for all x ∈ Ac . One can show
−
that µ+β and µβ are the extremal Gibbs measures for the Ising model with respect to stochastic
dominance, cf. e.g. [Milos] ???.
−
Definition 2.10. We say that a phase transition occurs for β > 0 if and only if µ+
β 6= µβ
For ξ ∈ S define ξn ∈ S+,An by

(
ξ(x) for x ∈ An
ξn (x) :=
+1 for x ∈ Zd \ An
Lemma 2.11. For all x ∈ Zd and f ∈ [0, δ],

h i
P ηtξ (x) 6= ηtξn ,An (x) for some ξ ∈ S −→ 0 (2.2)
as n → ∞.

Proof. Let Cx denote the component containing x in the random graph Zd , E0,δ . If Cx ⊆ An
then the modifications in the initial condition and the transition mechanism outside An do not
effect the value at x before time δ. Hence the probability in (2.2) can be estimated by
P [Cx ∩ Acn 6= ∅]
which goes to 0 as n → ∞ by Lemma (2.1) above.
d
Let pt denote the transition semigroup on {−1, 1}Z . Since the dynamics is attractive,
µ̄β = lim δ+1 pt and µβ = lim δ−1 pt
t→∞ t→∞
are extremal stationary distributions with respect to stochastic dominance. The following theo-
rem identifies µ̄ and µ as the extremal Gibbs measures for the Ising Hamiltonian on Zd :
Theorem 2.12. The upper and lower invariant measures are

µ̄β = µ+
β and µβ = µ−
β.
−
In particular, ergodicity holds if and only if there is no phase transition (i.e. iff µ+
β = µβ ).
Proof. We show:
1. µ̄β 4 µ+
β
2. µ+
β is a stationary distribution with respect to pt .
This implies µ̄β = µ+ + +

β , since by 2. and the corollary above, µβ 4 µ̄β , and thus µβ = µ̄β by 1.
µ−
β = µβ follows similarly.
1. It can be shown similarly as above that, the attractiveness of the dynamics implies
µ1t ≤ µ1,A
t
n
P -a.s. for all n ∈ N and t ≥ 0. As t → ∞,

D D
µ1t → µ̄β and ηt1,An → µ+,A
β
n
,
hence
µ̄β 4 µ+,A
β
n
for all n ∈ N. The assertion follows as n → ∞.
2. It is enough to show
µ+ +
β p t = µβ for t ≤ δ, (2.3)
then the assertion follows by the semigroup property of (pt )t≥0 . Let
h i
n ξn ,An
(pt f ) (ξ) := E f (ηt )
denote the transition semigroup on Sξn ,An . We know:
µ+,n n +,n
β p t = µβ (2.4)
To pass to the limit n → ∞ let f (η) = ϕ (η(x1 ), . . . , η(xk )) be a cylinder function on S.

Then Z Z Z
pt f dµβ = pt f dµβ + (pnt f − pt f ) dµ+,n
+,n n +,n
β (2.5)
and by (2.4) this is equal to

Z Z
f dµ+,n
β + (pnt f − pt f ) dµ+,n
β
But by the lemma above, for t ≤ δ,

h i
ξn ,An
n
|(pt f )(ξ) − (pt f )(ξ)| ≤ E f ηt − f ηtξ

h i
≤ 2 · sup |f | · P ηtξn ,An (xi ) 6= ηtξ (xi ) for some i −→ 0
uniformly in ξ.
w
Since µ+,n
β to µ+
β , and f and pt f are continuous by the Feller property, taking the limit in
(2.5) as n → ∞ yields
Z Z Z
µ+ dµ+ f dµ+

fd β pt = pt f β = β
for all cylinder functions f , which implies (2.3).
The question now is: when does a phase transition occur?

For β = 0, there is no interaction between η(x) and η(y) for x 6= y. Hence ηβ+,n and ηβ−,n are the
uniform distributions on S+,An and S−,An , and
−
O 1
µ+
β = µβ = ν, where ν (+
−1) =
2
z∈Zd
On the other hand, phase transition occur for d ≥ 2 and large values of β:
Theorem 2.13 (P EIERL). For d = 2 there exists βc ∈ (0, ∞) such that for β > βc ,
1
µ+
β ({η : η(0) = −1}) < < µ−
β ({η : η(0) = −1}) ,
2
−
and thus µ+
β 6= µβ .
Proof. Let C0 (η) denote the connected component of 0 in {x ∈ Zd | η(x) = −1}, and set
C0 (η) = ∅ if η(0) = +1. Let A ⊆ Zd be finite and non-empty. For η ∈ S with C0 = A let η̃
denote the configuration obtained by reversing all spins in A. Then
H(η̃) = H(η) − 2|∂A|,
and hence
X
µ+,n
β (C0 = A) = µ+,n
β (η)
η : C0 (η)=A
X
≤ e−2β|∂A| µ+,n
β (η̃) ≤ e
−2β|∂A|
η : C0 (η)=A
| {z }
≤1
Thus
X
µ+,n
β ({η : η(0) = −1}) = µ+,n
β (C0 = A)
A⊂Zd
A6=∅
X∞
e−2βL {A ⊂ Zd : |∂A| = L}

≤
L=1
X∞
≤ e−2βL · 4 · 3L−1 · L2
L=4
1
≤ for β > βc
2
2
where ∂A is a self-avoiding path in Z2 by length L, starting in − L2 , L2 . Hence for n → ∞,
1
µ+
β ({η : η(0) = −1}) <
2
and by symmetry
1
µ− +
β ({η : η(0) = −1}) = µβ ({η : η(0) = 1}) >
2
for β > βc .
2.4 Poisson point process

Let S be a polish space (e.g. Rd ) and ν a σ-finite measure on the Borel σ-algebra S.
Definition 2.14. A collection of random variables N (B), B ∈ S, on a probability space (Ω, A, P )

is called a Poisson random measure (Poisson random field, spatial Poisson process) of intensity ν,
if and only if
(i) B 7→ N (B)(ω) is a positive measure for all ω ∈ Ω.
(ii) If B1 , . . . , Bn ∈ S are disjoint, then the random variables N (B1 ), . . . , N (Bn ) are inde-
pendent.
(iii) N (B) is Poisson(ν(B))-distributed for all B ∈ S with ν(B) < ∞.
Example . If Nt is a standard Poisson process with intensity λ > 0 the number

N (B) := |{t ∈ B | Nt− 6= Nt }| , B ∈ B(R+ )
of arrivals in a time set B is a Poisson random measure on R+ of intensity ν = λ dx, and
Nt − Ns = N ([s, t]), ∀0≤s≤t
2.4. POISSON POINT PROCESS 65
Construction of Poisson random measures:

a) ν(S) < ∞ : Define λ := ν(S). Let X1 , X2 , . . . be independent and identically dis-
tributed random variables, λ−1 ν-distributed. Let K be a Poisson(λ) distributed random
variable, independent of Xi . Then
K
X
N := δXi
k=1
is a Poisson random measure of intensity ν.
b) ν σ-finite: Let S = ˙ i∈N Si with ν(Si ) < ∞. Let Ni be independent Poisson random
S
measures with intensity ISi · ν. Then
∞
X
N := Ni
i=1
P∞
is a Poisson random measure with intensity ν = i=1 ISi · ν.
Definition 2.15. A collection Nt (B), t ≥ 0, B ∈ S, of random variables on a probability space

(Ω, A, P ) is called a Poisson point process of intensity ν if and only if
(i) B 7→ Nt (B)(ω) is a positive measure for all t ≥ 0, ω ∈ Ω.
(ii) If B1 , . . . , Bn ∈ S are disjoint, then (Nt (B1 ))t≥0 , . . . , (Nt (Bn ))t≥0 are independent.
(iii) (Nt (B))t≥0 is a Poisson process of intensity ν(B) for all B ∈ S with ν(B) < ∞.
Remark . A Poisson random measure (respectively a Poisson point process) is a random variable
(respectively a stochastic process) with values in the space
( )
X
Mc+ (S) = δx | A ⊆ S countable subset ⊆ M + (S)
x∈A
of all counting measures on S. The distribution of a Poisson random measure and a Poisson
point process of given intensity is determined uniquely by the definition.
Theorem 2.16 (Construction of Poisson point processes). 1. If N is a Poisson random mea-

sure on R+ × S of intensity dt ⊗ ν then
Nt (B) := N ((0, t] × B), t ≥ 0, B ∈ S,
is a Poisson point process on intensity ν.

2. Suppose λ := ν(S) < ∞. Then

Kt
X
Nt = δZi
i=1
is a Poisson point process of intensity ν provided the random variables Zi are independent
with distribution λ−1 ν, and (Kt )t≥0 is an independent Poisson process of intensity λ.
Proof. Exercise.
b
low intensity
b
b b b
b
b
β b
N (β) b
b
b
high intensity
b b
b b b
b
Corollary 2.17. If ν(S) < ∞ then a Poisson point process of intensity ν is a Markov jump
process on Mc+ (S) with finite jump measure
Z
q(π, •) = (π + δy ) ν(dy), π ∈ Mc+ (S)
and generator Z
(L F )(π) = (F (π + δy ) − F (π)) ν(dy), (2.6)
F : Mc+ (S) → R bounded. If ν(S) = ∞, (2.6) is not defined for all bounded functions F .
Chapter 3
Markov semigroups and Lévy processes
3.1 Semigroups and generators

Suppose that pt , t ≥ 0, are the transition kernels of a time-homogeneous Markov process on a
Polish space S with Borel σ-algebra S.
Properties:
1. Semigroup:
ps pt = ps+t ∀ s, t ≥ 0
2. (sub-)Markov:
(i) f ≥ 0 ⇒ pt f ≥ 0 positivity preserving
(ii) pt 1 = 1 (respectively pt 1 ≤ 1 if ζ 6= ∞)
3. Contraction with respect to sup-norm:

kpt f k ≤ kf ksup
4. Lp -Contraction: If µ is a stationary distribution, then

kpt f kL p (S,µ) ≤ kf kL p (S,µ)
for all 1 ≤ p ≤ ∞.
Proof of 4. If p = ∞, the claim follows by 3. If 1 ≤ p < ∞, we conclude with the Jensen

inequality
|pt f |p ≤ (pt |f |)p ≤ pt |f |p
and since µ is a stationary distribution,
Z Z Z Z
|pt f | dµ ≤ pt |f | dµ = |f | d(µpt ) = |f |p dµ
p p p
67
68 CHAPTER 3. MARKOV SEMIGROUPS AND LÉVY PROCESSES
Consequence:
(pt )t≥0 induces a semigroup of linear contractions (Pt )t≥0 on the following Banach spaces:
1. Fb (S) which is the space of all bounded measurable functions f : S → R endowed with
the sup-norm.
2. Cb (S) provided pt is Feller.
3. Lp (S, µ) provided µ is a stationary distribution.
Now let (Pt )t≥0 be a general semigroup of linear contractions on a Banach space B.
Definition 3.1. 1. The linear operator

Pt f − f
Lf := lim
t↓0 t
with limit in B an domain

Pt f − f
Dom(L) = f ∈ B | the limit lim exists
t↓0 t
is called the generator of (Pt )t≥0 .
2. (Pt )t≥0 is called strongly continuous (C0 semigroup) if and only if
lim Pt f = f
t↓0
for all f ∈ B.
Remark . 1. Pt f → f as t ↓ 0 for all f ∈ Dom(L), and hence, by contractivity, for all

f ∈ Dom(L):
kpt f − f k ≤ kpt f − pt gk + kpt g − gk + kg − f k, g ∈ Dom(L)

ε ε ε
< + +
3 3 3
i.e. Pt is a C0 semigroup on B if Dom(L) is dense in B. In general, Pt is a C0 semigroup
on Dom(L).
2. Conversely, if Pt is strongly continuous on B then Dom(L) is dense in B (for the proof see
script Stochastic analysis II).
3. The transition semigroup (pt )t≥0 of a right-continuous Markov process induces a C0 semi-
group (Pt )t≥0 on
(i) B = Lp (S, µ) provided µ is stationary.

3.1. SEMIGROUPS AND GENERATORS 69
(ii) B = C∞ (S) := all continuous functions vanishing at infinity provided S is locally

compact and pt (C∞ (S)) ⊆ C∞ (S) (proof omitted, see script Stochastic analysis II).
Theorem 3.2 (Maximum principle). The generator L of a Markov semigroup (pt )t≥0 on Fb (S)
satisfies
1. Maximum principle: If f ∈ Dom(L), x0 ∈ S with f (x0 ) = sups∈S f (x), then
pt f (x0 ) − f (x0 )
(Lf )(x0 ) = lim ≤0
t↓0 t
2. 1 ∈ Dom(L) and L1 = 0.
Proof. 1. f ≤ f (x0 ), hence
pt f ≤ f (x0 ) · pt 1 ≤ f (x0 )
for all t ≥ 0 and hence
pt f (x0 ) − f (x0 )
(Lf )(x0 ) = lim ≤0
t↓0 t
2. Pt 1 = 1 for all t ≥ 0.
Theorem 3.3 (Kolmogorov equations). If (Pt )t≥0 is a C0 semigroup with generator L then t 7→
Pt f is continuous for all f ∈ B. Moreover, if f ∈ Dom(L) then Pt f ∈ Dom(L) for all t ≥ 0,
and
d
Pt f = Pt Lf = LPt f
dt
Proof. 1. For h > 0,
kPt+h f − Pt f k = kPt (Ph f − f )k ≤ kPh f − f k → 0
and
kPt−h f − Pt f k = kPt−h (f − Ph f )k ≤ kf − Ph f k → 0
as h ↓ 0.
2. For f ∈ Dom(L) and h > 0 we have
1 Ph f − f
(Pt+h − Pt f ) = Pt → Pt Lf
h h
as h ↓ 0, because the operators Pt are contractions. On the other hand,
1 Ph f − f
(Pt−h f − Pt f ) = Pt−h → Pt Lf
−h h
as h ↓ 0 by 1.) and the contractivity.
3. We use 2.) so conclude that
1 1
(Ph Pt f − Pt f ) = (Pt+h f − Pt f ) → Pt Lf
h h
as h ↓ 0. Hence by 1.), Pt f ∈ Dom(L) and LPt f = Pt Lf .
Application to the martingale problem:
Corollary 3.4. Suppose (Xt , P ) is a right-continuous (Ft )-Markov process with transition semi-
group (pt )t≥0 .
1. Suppose (pt ) induces a C0 semigroup with generator L on a closed subspace of Fb (S)

(e.g. on Cb (S) or C∞ (S)). Then (Xt , P ) is a solution of the martingale problem for
(L, Dom(L)) (independently of the initial distribution).
2. Suppose (Xt , P ) is stationary with initial distribution µ, and L(p) is the generator of the
corresponding C0 semigroup on Lp (S, µ) for some p ∈ [1, ∞). Then for f ∈ Dom(L(p) ),
Zt
Mtf L(p) f (Xs ) ds,

= f (Xt ) − t ≥ 0,
0
is P -almost sure independent of the chosen version L(p) f and (Xt , P ) solves the martingale
problem for (L(p) , Dom(L(p) )).
Proof. 1. Since f ∈ Dom(L), f and Lf are bounded and
Zt
Mtf = f (Xt ) − (Lf )(Xs ) ds ∈ L 1 (P ),
0
3.2. LÉVY PROCESSES 71
and
 
Zt
E[Mtf − Msf | Fs ] = E f (Xt ) − f (Xs ) − Lf (Xu ) du | Fs 
s
Zt
= E[f (Xt ) | Fs ] − f (Xs ) − E[Lf (Xu ) | ] du
s
Zt
= pt−s f (Xs ) − f (Xs ) − pu−s Lf (Xs ) du = 0
s
P -almost sure by Kolomogorov’s forward equation.
2. Exercise.
3.2 Lévy processes

Additional reference for this chapter: Applebaum [3] and Bertoin [5].
Definition 3.5. An Rd -valued stochastic process ((Xt )t≥0 , P ) with càdlàg paths is called a
Lévy process if and only if it has stationary independent increments, i.e.,
`
(i) Xs+t − Xs Fs = σ(Xr | r ≤ s) for all s, t ≥ 0.
(ii) Xs+t − Xs ∼ Xt − X0 for all s, t ≥ p.
Example . 1. Diffusion processes with constant coefficients:
Xt = σBt + bt
with Bt a Brownian motion on Rn , σ ∈ Rd×n , b ∈ Rd .
2. Compound Poisson process:

Nt
X
Xt = Zi
i=1
with Zi independent identically distributed random variables, Nt Poisson process, inde-

pendent of Zi .
More interesting examples will be considered below.

Remark . If (Xt , P ) is a Lévy process then the increments Xs+t − Xs are

infinitely divisible random variables, i.e. for any n ∈ N, there exist independent identically
(n) (n)
distributed random variables Y1 , . . . , Yn such that
n
X
(n) (n)
Xs+t − Xs ∼ Yi e.g. Yi := X it − X (i−1)t
n n
i=1
The Lévy-Khinchin formula gives a classification of the distributions of all infinitely divisible
random variables on Rd in terms of their characteristic functions.
Theorem 3.6. A Lévy process is a time-homogeneous Markov process with translation invariant
transition functions
pt (x, B) = µt (B − x) = pt (a + x, a + B) ∀ a ∈ Rd (3.1)
where µt = P ◦ (Xt − X0 )−1 .
Proof.
P [Xs+t ∈ B | Fs ](ω) = P [Xs + (Xs+t − Xs ) ∈ B | Fs ](ω)

= P [Xs+t − Xs ∈ B − Xs (ω)]
= P [Xt − X0 ∈ B − Xs (ω)]
= µ(B − Xs (ω)),
so (Xt , P ) is Markov with transition function pt (x, B) = µt (B − x) which is clearly translation

invariant.
Remark . 1. In particular, the transition semigroup of a Lévy process is Feller: If f ∈

d
Cb (R ), then Z
(pt f )(x) = f (x + y) µt (dy)
is continuous by dominated convergence. If f ∈ C∞ (Rd ), then pt f ∈ C∞ (Rd ).
2. pt defined by (3.1) is a semigroup if and only if µt is a convolution semigroup, i.e.,
µt ∗ µs = µt+s ∀ t, s ≥ 0
E.g.
Z Z
µt ∗ µs (B) = µt (dy)µs (B − y) = pt (0, dy)ps (y, B) = pt+s (0, B) = µt+s (B)
if pt is a semigroup. The inverse implication follows similarly.

From now on we assume w.l.o.g. X0 = 0, and set
µ := µ1 = P ◦ X1−1 .
Definition 3.7. A continuous function ψ : Rd → C is called characteristic exponent of the mea-

sure µ or the Lévy process (Xt , P ) if and only if ψ(0) = 0 and
E eip·X1 = ϕµ (p) = e−ψ(p)

One easily verifies that for any Lévy process there exists a unique characteristic exponent.
1. E eip·Xt = e−tψ(p) for all t ≥ 0, p ∈ Rn .

Theorem 3.8.
2. Mtp := eip·Xt +tψ(p) is a martingale for any p ∈ Rn .
Proof. 1. a) For t ∈ N, define

t
X
Xt = (Xi − Xi−1 )
i=1
Since (Xi − Xi−1 ) are independent identically distributed random variables with the
same distribution as X1 ,
ϕXt (p) = ϕX1 (p)t = e−tψ(p)
m
b) Let t = n
∈ Q and
n
X
Xm = X im − X (i−1)m
n n
i=1
Hence
ϕXm = ϕnXt
and since ϕXm = e−mψ ,
m
ϕXt = e− n ψ = e−tψ
c) Let tn ∈ Q, tn ↓ t. Since Xt is right-continuous, we conclude with Lebesque
E eip·Xt = lim E eip·Xtn = lim e−tn ψ(p) = e−tψ(p)

tn →t tn →t
2. Exercise.
Since Xs+t −Xs ∼ Xt , independent of Fs , the marginal distributions of a Lévy process ((Xt )t≥0 , P )
are completely determined by the distributions of Xt , and hence by ψ! In particular:
Corollary 3.9 (Semigroup and generator of a Lévy process). 1. For all f ∈ S (Rd ) and t ≥
0,
pt f = e−tψ fˆ ˇ
where
Z
d
fˆ(p) = (2π)− 2 eip·x f (x) dx, and
Z
d
ǧ(x) = (2π)− 2 eip·x g(p) dp
denote the Fourier transform and the inverse Fourier transform of functions f, g ∈ L 1 (Rd ).
2. S (Rd ) is contained in the domain of the generator L of the semigroup induced by (pt )t≥0
on C∞ (Rd ), and
Lf = (−ψ fˆ)ˇ (Pseudo-differential operator). (3.2)
In particular, pt is strongly continuous on C∞ (Rd ).
Here S (Rd ) denotes the Schwartz space of rapidly decreasing smooth functions on Rd . Recall
that the Fourier transform maps S (Rd ) one-to-one onto S (Rd ).
Proof. 1. Since (pt f )(x) = E[f (Xt + x)], we conclude with Fubini
Z
ˆ − d2
(pt f )(p) = (2π) e−ip·x (pt f )(x) dx
Z
− d2 −ip·x
= (2π) · E e f (Xt + x) dx
= E eip·Xt · fˆ(p)

= e−tψ(p) fˆ(p)

for all p ∈ Rd . The claim follows by the Fourier inversion theorem, noting that e−tψ ≤ 1.
2. For f ∈ S (Rd ), fˆ is in S (Rd ) as well. The Lévy-Khinchin formula that we will state
below gives an explicit representation of all possible Lévy exponents which shows in par-
ticular that ψ(p) is growing at most polynomial as |p| → ∞. Hence

e−tψ fˆ − fˆ e−tψ − 1
+ ψ fˆ = + ψ · |fˆ|

t t

and
Zt Z t Zs
e−tψ − 1 1 −sψ 1
ψ 2 e−rψ dr ds

+ψ =− ψ e − 1 ds =
t t t
0 0 0
hence

e−tψ fˆ − fˆ
+ ψ f ≤ t · |ψ 2 | · |fˆ| ∈ L 1 (Rd ),
ˆ

t

and therefore:
pt f − f
Z
d
− (−ψ fˆ)ˇ= (2π)− 2 eip·x · · · · dp → 0
t
as t ↓ 0 uniformly in x. This shows f ∈ Dom(L) and Lf = (−ψ fˆ)ˇ. In particular, pt is
strongly continuous on S (Rd ). Since S (Rd ) is dense in C∞ (Rd ) and pt is contractive this
implies strong continuity on C∞ (Rd ).
Remark . pt is not necessarily strongly continuous on Cb (Rd ). Consider e.g. the deterministic
process
Xt = X0 + t
on R1 . Then
(pt f )(x) = f (x + t),
and one easily verifies that there exists f ∈ Cb (R) such that pt f 9 f uniformly.
Corollary 3.10. (Xt , P ) solves the martingale problem for the operator (L, S (Rd )) defined by
(3.2).
Example . 1. Translation invariant diffusions:

Xt = σBt + bt, σ ∈ Rd×n , b ∈ Rd , Bt Brownian motion on Rn
We have
h T i
E eip·Xt = E ei(σ p)·Bt eip·bt

= e−ψ(p)·t
where
1 1
ψ(p) = (σ T p)2 − ib · p = p · ap − ib · p, a := σσ T
2 2
and
1
Lf = −(ψ fˆ)ˇ= div(a∇f ) − b · ∇f, f ∈ S (Rn )
2
2. Compound Poisson process:

Nt
X
Xt = Zi ,
i=1
Zi independent identically distributed random variables on Rd with distribution π, Nt Pois-

son process of intensity λ, independent of Zi .
∞
ip·Xt X
E eip·Xt | Nt = n · P [Nt = n]

E e =
n=0
∞
X (λt)n −λt
= ϕπ (p)n · e
n=0
n!
−λ·(1−ϕπ (p))·t
=e
and hence
Z
1 − eip·y λπ(dy)

ψ(p) = λ · (1 − ϕπ (p)) =
and Z
(Lf )(x) = (−ψ fˆ)ˇ(x) = (f (x + y) − f (x))λπ(dy), f ∈ S (Rn ).
The jump intensity measure ν := λπ is called the Lévy measure of the compound Poisson
process.
3. Compensated Poisson process: Xt as above, assume Zi ∈ L 1 .

Nt
X
Mt := Xt − E[Xt ] = Zi − λE[Z1 ] · t
i=1
is a Lévy process with generator

Z
comp
(L f )(x) = Lf (x) − λ · yπ(dy) · ∇f (x)
Z
= (f (x + y) − f (x) − y · ∇f (x)) λπ(dy)
Remark/Exercise (Martingales of compound Poisson process). The following processes

are martingales:
(a) Mt = xt − b · t, where b := λ · E[Z1 ] = y ν(dy) provided Z1 ∈ L 1 .

R
(b) |Mt |2 − a · t, where a := λ · E[|Z1 |2 ] = |y|2 ν(dy) provided Z1 ∈ L 2 .

R
(c) exp(ip · Xt + ψ(p) · t), p ∈ Rn .

Proof. e.g.
2
E[Ms+t − Ms2 | Fs ] = E[(Ms+t − Ms )2 | Fs ] = E[(Ms+t − Ms )2 ]
Nt
!
X
= E[|Mt |2 ] = Var(Xt ) = Var Zi
i=1
" Nt
!# " Nt
#!
X X
= E Var Z i Nt + Var E Z i Nt

i=1 i=1
P hP i
Nt Nt
and since Var i=1 Zi | = Nt · Var(Z1 ) and E i=1 Zi | = Nt · E[Z1 ],
2
E[Ms+t − Ms2 | Fs ] = E[Nt ] · Var(Z1 ) + Var(Nt ) · |E[Z1 ]|2 = λtE[|Z1 |2 ].
4. Symmetric stable processes: Stable processes appear as continuous-time scaling limits

of random walks. By Donsker’s invariance principle, if
n
X
Sn = Zi
i=1
is a random walk with independent identically distributed increments Zi ∈ L 2 then the

rescaled processes
(k) 1
Xt := k − 2 Sbktc
converge in distribution to a Brownian motion. This functional central limit theorem fails
(as does the classical central limit theorem) if the increments are not square integrable,
i.e., if their distribution has heavy tails. In this case, one looks more generally for scaling
limits of rescaled processes of type
(k) 1
Xt := k − α Sbktc
(k)
for some α > 0. If (Xt )t≥0 converges in distribution then the limit should be a
scale-invariant Lévy process, i.e.
1
k − α Xkt ∼ Xt for all k > 0 (3.3)
This motivates looking for Lévy processes that satisfy the scaling relation (3.3). Clearly,
(3.3) is equivalent to
α
e−tψ(p) = E eip·cXt = E eip·Xcα t = e−c tψ(p) ∀ c > 0

i.e.
ψ(cp) = cα ψ(p) ∀c > 0
The simplest choice of a Lévy exponent satisfying (3.3) is
α
ψ(p) = · |p|α
2
for some σ > 0. In this case, the generator of a corresponding Lévy process would be the
fractional power
σ α
Lf = −(ψ fˆ)ˇ= ∆ 2 f
2
of the Laplacien.
For α = 2 and σ = 1 the corresponding Lévy process is a Brownian motion, the scaling
limit in the classical central limit theorem. For α > 2, L does not satisfy the maximum
principle, hence the corresponding semigroup is not a transition semigroup of a Markov
process.
Now suppose α ∈ (0, 2).
Lemma 3.11. For α ∈ (0, 2),

ψ(p) = const. · lim ψε (p)
ε↓0
where Z
1 − eip·y |y|−α−1 dy

ψε (p) = lim
ε↓0 Rd \B(0,ε)
p
Proof. By substitution x = |p|y and ν := |p|
,
Z Z
−α−1
ip·y
1 − eiνx |x|−α−1 dx · |p|α → const.|p|α

1−e · |y| dy =
Rd \B(0,ε) Rd \B(0,ε·|p|)
as ε ↓ 0 since (1 − eiνx ) = iνx + O(|x|2 ) by Taylor expansion.
Note that ψε is the symbol of a compound Poisson process with Lévy measure proportional to
|y|−α−1 · I{|y|>ε} dy. Hence we could expect that ψ is a symbol of a similar process with Lévy
measure proportional to |y|−α−1 dy. Since this measure is infinite, a corresponding process should
have infinitely many jumps in any non-empty time interval. To make this heuristics rigorous we
now give a construction of Lévy processes from Poisson point process:
3.3. CONSTRUCTION OF LÉVY PROCESSES FROM POISSON POINT PROCESSES: 79
3.3 Construction of Lévy processes from Poisson point pro-

cesses:
Idea: Jumps of given size for Lévy processes ↔ Points of a Poisson point process on Rd .
b
b b b
b
b
β b
N (β) b
b
b
b b
b b b
b
Position of a Lévy process after time t:

X
Xt = yNt ({y}) if supp(Nt ) is countable
y
Z
Xt = y Nt (dy) in general
a) Finite intensity
Theorem 3.12. Suppose ν is a finite measure on Rd . If (Nt )t≥0 is a Poisson point process of
intensity ν then Z
Xt := y Nt (dy)
is a compound Poisson process with Lévy measure ν (i.e. total intensity λ = ν(Rd ) and jump
distribution π = ν(Rν d ) ).
Proof. By the theorem in Section 1.9 above and the uniqueness of a Poisson point process of
intensity ν we may assume
XKt
Nt = δZi
i=1
where Zi are independent random variables of distribution λ−1 ν, and (Kt )t≥0 is an independent
Poisson process of intensity λ.
Hence
Z Kt
X
Xt = y Nt (dy) = Zi
k=1
is a compound Poisson process.
b) Infinite symmetric intensity

Now assume that
(A1) ν is a symmetric measure on Rd \ {0}, i.e.
ν(B) = ν(−B) ∀ B ∈ B(Rd \ {0})
and
R
(A2) (1 ∧ |y|2 ) ν(dy) < ∞.
R
(i.e. ν (|y| ≥ ε) < ∞ and |y|<ε
|y|2 ν(dy) < ∞ ∀ε > 0 )
For example, we could choose ν(dy) = |y|−α−1 , α ∈ (0, 2) which is our candidate for the Lévy
measure on an α-stable process. Let (Nt )t≥0 be a Poisson point process with intensity ν. Our aim
is to prove the existence of a corresponding Lévy process by an approximation argument. For
ε > 0,
Ntε (dy) := I{|y|>ε} · Nt (dy)
is a Poisson point process with finite intensity ν ε (dy) = I{|y|>ε} · ν(dy), and hence
Z Z
ε
Xt := y Nt (dy) = y Ntε (dy)
|y|>ε
is a compound Poisson process with Lévy measure ν ε .
Lemma 3.13. If (A1) holds, then for all 0 < δ ≤ ε and t ≥ 0,

Z
ε 2
δ
E sup Xs − Xs
≤ 4t · |y|2 ν(dy)
s≤t
δ<|y|≤ε
Proof. Z Z
Xtδ − Xtε = y Nt (dy) = y Ntδ,ε (dy)
δ<|y|≤ε
where
Ntδ,ε (dy) := I{δ<|y|≤ε} · Nt (dy)
is a Poisson point process of intensity ν δ,ε (dy) = I{δ<|y|≤ε} ·ν(dy). Hence Xtδ −Xtε is a compound
Poisson process with finite Lévy measure ν δ,ε . In particular,
Z Z
δ ε δ,ε δ ε
Mt := Xt − Xt − t · yν (dy) = Xt − Xt − t · y ν(dy)
δ<|y|≤ε
and Z Z
2 2 δ,ε 2
|Mt | − t · y ν (dy) = |Mt | − t · |y|2 ν(dy)
δ<|y|≤ε
are right-continuous martingales. Since ν is symmetric, Mt = Xtδ − Xtε . Hence by Doob’s

maximal inequality,
2 Z
δ ε
2 2 2 2
E sup Xs − Xs = E sup |Ms | ≤ · E[|Mt | ] = 4t · |y|2 ν(dy).
s≤t s≤t 2−1
δ<|y|≤ε
Theorem 3.14. Let t ≥ 0. If (A1) and (A2) hold then the process X ε , ε > 0, form a Cauchy
sequence with respect to the norm

ε 2
δ ε
δ
kX − X | := E sup Xs − Xs
.
s≤t
The limit process Z

Xs = lim Xsε = lim y Ns (dy), 0 ≤ s ≤ t,
ε↓0 ε↓0
|y|>ε
is a Lévy process with symbol

Z
1 − eip·y ν(dy).

ψ(p) = lim
ε↓0
|y|>ε
Remark . 1. Representation of Lévy process with symbol ψ as jump process with infinite jump
intensity.
2. For ν(dy) = |y|−α−1 , α ∈ (0, 2), we obtain an α-stable process.
Proof. Lemma and (A2) yields that (X ε )ε>0 is a Cauchy sequence with respect to k · k. Since the
processes Xsε are right-continuous and the convergence is uniform, the limit process Xs is right-
continuous as well. Similarly, it has independent increments, since the approximating processes
have independent increments, and by dominated convergence
ε ε
E eip·(Xs+t −Xs ) = lim E eip·(Xs+t −Xs ) = lim e−tψε (p)

ε↓0 ε↓0
where
Z Z
ip·y
1 − eip·y ν(dy)

ψε (p) = 1−e νε (dy) =
|y|>ε
is the symbol of the approximating compound Poisson process.
General Lévy processes
Theorem 3.15 (L ÉVY-K HINCHIN). For ψ : Rd → C the following statements are equivalent:
(i) ψ is the characteristic exponent of a Lévy process.
(ii) e−ψ is the characteristic function of an infinitely divisible random variable.
(iii) Z
1
1 − eip·y + ip · yI{|y|≤1} ν(dy)

ψ(p) = p · ap − ib +
2 Rd
2
where a ∈ Rd is a non-negative definite matrix, b ∈ Rd , and ν is a positive measure on
Rd \ {0} satisfying (A2).
Sketch of proof: (i)⇒(ii): If Xt is a Lévy process with characteristic exponent ψ then X1 is an

infinitely divisible random variable with the same characteristic exponent.
(ii)⇒(iii): This is the classical Lévy-Khinchin theorem which is proven in several textbooks on
probability theory, cf. e.g. Feller [10] and Varadhan [22].
(iii)⇒(i): The idea for the construction of a Lévy process with symbol ψ is to define
(1) (2) (3)
Xt = Xt + Xt + Xt
where X (1) , X (2) and X (3) are independent Lévy processes,

(1) √
Xt = aBt + b (diffusion part),
(2)
Xt compound Poisson process with Lévy measure I{|y|≥1} ·ν(dy) (big jumps)
(3) (3,ε)
Xt = lim Xt (small jumps),
ε↓0
(3,ε)
Xt compensated Poisson process with Lévy measure I{ε<|y|≤1} ·ν(dy)
(3,ε)
Since Xt is a martingale for all ε > 0, the existence of the limit as ε ↓ 0 can be
established as above via the maximal inequality. One then verifies as above that X (1) , X (2)
and X (3) are Lévy processes with symbols

1
ψ (1) (p) = p · ap − ib,
2Z
ψ (2) (p) = 1 − eip·y ν(dy), and

|y|>1
Z
(3)
1 − eip·y + ip · y ν(dy).

ψ (p) =
|y|≤1
Thus by independence, X = X (1) + X (2) + X (3) is a Lévy process with symbol ψ =

ψ (1) + ψ (2) + ψ (3) , cf. Bertoin [5] or Applebaum [3] for details.
Remark . 1. Lévy-Itô representation:

√
Z Z
Xt = X0 + aBt + bt + y Nt (dy) + y (Nt (dy) − tν(dy))
| {z }
diffusion part |y|≥1 0<|y|<1
| {z } | {z }
big jumps small jumps compensated by drift
2. The compensation for

R small jumps ensures that the infinite intensity limit exists by martin-
gale arguments. If |y| ν(dy) = ∞ then the uncompensated compound Poisson process
do not converge!
3. In the construction of the α-stable process above, a compensation was not required because
for a symmetric Lévy measure the approximating processes are already martingales.
Extension to non-translation invariant case:
Theorem 3.16 (Classification of Feller semigroups in Rd ). [DYNKIN , C OURR ÈGE , K UNITA ,

ROTH ] Suppose (Pt )t≥0 is a C0 contraction semigroup on C∞ (Rd ), such that C0∞ (Rd ) is con-
tained in the domain of the generator L. Then:
1.
d d
X ∂2f X
Lf (x) = aij (x) (x) + b(x)∇f (x) + c(x) · f (x)
i,j=1
∂xi ∂xj i=1
Z
(3.4)
+ f (y) − f (x) − I{|y−x|<1} ·(y − x)·∇f (x) ν(x, dy)
Rd \{x}
for all f ∈ C0∞ (Rd ), where aij , b, c ∈ C(Rd ), a(x) non-negative definit and c(x) ≤ 0 for
all x, and ν(x, ·) is a kernel of positive (Radon) measures.
2. If Pt is the transition semigroup of a non-explosive Markov process then c ≡ 0.

3. If Pt is the transition semigroup of a diffusion (i.e. a Markov process with continuous paths)
then L is a local operator, and a representation of type (3.4) holds with ν ≡ 0.
Remark . Corresponding Markov processes can be constructed as solutions of stochastic differ-

ential equations with combined Gaussian and Poisson noise.
We will not prove assertion 1. The proof of 2. is left as an exercise. We now sketch an in-
dependent proof of 3., for a detailed proof we refer to volume one of Rogers / Williams [18]:
If f ≡ 1 on a neighborhood of x and 0 ≤ f ≤ 1, then by the maximum principle,

Z
0 = (Lf )(x) = o(x) + (f (y) − f (x)) ν(dy)
Rd \{x}
this is only possible for all f as above if ν(x, ·) = 0.
Proof of 3.: a) Locality: If x ∈ Rd and f, g ∈ Dom(L) with f = g in a neighborhood of x,

then Lf (x) = Lg(x). Since f ∈ Dom(L) and x ∈ Rd ,
Zt
f (Xt ) − Lf (Xs ) ds
0
is a martingale and hence

 T 
Z
Ex [f (XT )] = f (x) + Ex  (Lf )(Xs ) ds
0
for all bounded stopping times T (Dynkin’s formula). Hence

hR i
T
Ex 0 ε Lf (Xs ) ds
(Lf )(x) = lim
ε↓0 Ex [Tε ]
Ex [f (XTε )] − f (x)
= lim (Dynkin’s characteristic operator)
ε↓0 Ex [Tε ]
where
Tε := inf {t ≥ 0 : Xt ∈
/ B(x, ε)} ∧ 1
since in the equation above, Lf (Xs ) = Lf (x) + O(1) by right continuity.
If the paths are continuous then XTε ∈ B(x, ε). Hence for f, g ∈ Dom(L) with f = g in a
neighborhood of x,
f (XTε ) = g(XTε )
for small ε > 0, and thus
Lf (x) = Lg(x).
b) Local maximum principle: Locality and the maximum principle imply:
f ∈ Dom(L) with local maximum at x =⇒ Lf (x) ≤ 0
c) Taylor expansion: Fix x ∈ Rd and f ∈ C0∞ (Rd ). Let ϕ, ϕi ∈ C0∞ (Rd ) such that
ϕ(y) = 1 and ϕi (y) = yi − xi for all y in a neighborhood of x. Then in a neighborhood U
of x,
d
X ∂f 1X ∂2f
f (y) = f (x) · ϕ(y) + (x)ϕi (y) + d (x)ϕi (y)ϕj (y) + R(y)
i=1
∂xi 2 i,j=1 ∂xi ∂xj
where R is a function in C0∞ (Rd ) with R(y) = o(|y − x|2 ). Hence
1X ∂2f
(Lf )(x) = c · f (x) + b∇f (x) + aij (x) + (LR)(x)
2 ∂xi ∂xj
where c := Lϕ(x), bi := Lϕi (x) and aij := L(ϕi ϕj )(x). In order to show (LR)(x) = 0
we apply the local maximum principle. For ε ∈ R choose Rε ∈ C0∞ (Rd ) such that
Rε (y) = R(y) − ε|y − x|2
on U . Then for ε > 0, Rε has a local maximum at x, and hence LRε ≤ 0. For ε ↓ 0
we obtain LR(x) ≤ 0. Similarly, for ε < 0, −Rε has a local maximum at x and hence
LRε ≥ 0. For ε ↑ 0 we obtain LR(x) ≥ 0, and thus LR(x) = 0.
Chapter 4
Convergence to equilibrium
Our goal in the following sections is to relate the long time asymptotics (t ↑ ∞) of a time-
homogeneous Markov process (respectively its transition semigroup) to its infinitesimal charac-
teristics which describe the short-time behavior (t ↓ 0):
Asymptotic properties ↔ Infinitesimal behavior, generator

t↑∞ t↓0
Although this is usually limited to the time-homogeneous case, some of the results can be applied
to time-inhomogeneous Markov processes by considering the space-time process (t, Xt ), which
is always time-homogeneous. On the other hand, we would like to take into account processes
that jump instantaneously (as e.g. interacting particle systems on Zd ) or have continuous trajecto-
ries (diffusion-processes). In this case it is not straightforward to describe the process completely
in terms of infinitesimal characteristics, as we did for jump processes. A convenient general setup
that can be applied to all these types of Markov processes is the martingale problem of Stroock
and Varadhan.
4.1 Setup and examples

In this section, we introduce the setup for the rest of the chapter IV. Let S be a Polish space
endowed with its Borel σ-algebra S. By Fb (S) we denote the linear space of all bounded mea-
surable functions f : S → R. Suppose that A is a linear subspace of Fb (S) such that
(A0) If µ is a signed measure on S with finite variation and

Z
f dµ = 0 ∀ f ∈ A ,
then µ = 0
Let
87
88 CHAPTER 4. CONVERGENCE TO EQUILIBRIUM
L : A ⊆ Fb (S) → Fb (S)
be a linear operator.
Definition 4.1. An adapted right-continuous stochastic process ((Xt )t≥0 , (Ft )t≥0 , P ) is called a
solution for the (local) martingale problem for the operator (L , A ) if and only if
Zt
Mtf := f (Xt ) − (L f )(Xs ) ds
0
is an (Ft )-martingale for all f ∈ A .
Example . 1. Jump processes: A minimal Markov jump process solves the martingale prob-
lem for its generator Z
(L f )(x) = q(x, dy) (f (y) − f (x))
with domain
A = {f ∈ Fb (S) : L f ∈ Fb (S)} ,
cf. above.
d
2. Interacting particle systems: An interacting particle system with configuration space T Z
as constructed in the last section solves the martingale problem for the operator
XX
ci (x, µ) · f µx,i − f (µ)

(L f )(µ) = (4.1)
x∈Zd i∈T
with domain given by the bounded cylinder functions

A = f : S → R : f (µ) = ϕ (µ(x1 ), . . . , µ(xk )) , k ∈ N, x1 , . . . , xk ∈ Zd , ϕ ∈ Fb (T k )

Note that for a cylinder function only finitely many summands in (4.1) do not vanish. Hence
L f is well-defined.
3. Diffusions: Suppose S = Rn . By Itô’s formula, any (weak) solution ((Xt )t≥0 , P ) of the
stochastic differential equation
dXt = σ(Xt ) dBt + b(Xt ) dt
with an Rd -valued Brownian motion Bt and locally bounded measurable functions σ : Rn →
Rd×n , b : Rn → Rn , solves the martingale problem for the differential operator
n
1X ∂2f
(L f )(x) = aij (x) (x) + b(x) · ∆f (x),
2 i,j=1 ∂xi ∂xj
a(x) = σ(x)σ(x)T ,
with domain C 2 (Rn ), and the martingale problem for the same operator with domain A =
C02 (Rn ), provided there is no explosion in finite time. The case of explosion can be included
by extending the state space to Rn ∪{∆}
˙ and setting f (∆) = 0 for f ∈ C02 (Rn ).
4.1. SETUP AND EXAMPLES 89
4. Lévy processes A Lévy process solves the martingale problem for its generator
L f = −(ψ fˆ)ˇ
with domain A = S (Rn ).
From now on we assume that we are given a right continuous time-homogeneous Markov process
((Xt )t≥0 , (Ft )t≥0 , (Px )x∈S ) with transition semigroup (pt )t≥0 such that for any x ∈ S, (Xt )t≥0 is
under Px a solution of the martingale problem for (L , A ) with Px [X0 = x] = 1.
Remark (Markov property of solutions of martingale problems). Suppose Px , x ∈ S, are prob-

ability measures on
D(R+ , S) := all càdlàg functions ω : R+ → S
such that with respect to Px , the canonical process Xt (ω) = ω(t) is a solution of the martingale
problem for (L , A ) satisfying Px [X0 = x] = 1.
If
(i) A is separable with respect to kf kL := kf ksup + kL f ksup ,
(ii) x 7→ Px (B) is measurable for all B ∈ S,
(iii) For any x ∈ S, Px is the unique probability measure on D(R+ , S) solving the martingale
problem,
then (Xt , Px ) is a strong Markov process, cf. e.g. Rogers, Williams [18] Volume 1.
Let A¯ denote the closure of A with respect to the supremum norm. For most results derived
below, we will impose two additional assumptions:
Assumptions:
(A1) If f ∈ A , then L f ∈ A¯.
(A2) There exists a linear subspace A0 ⊆ A such that if f ∈ A0 , then pt f ∈ A for all t ≥ 0,
and A0 is dense in A with respect to the supremum norm.
Example . 1. For Lévy processes (A1) and (A2) hold with A0 = A = S (Rd ), and B =
A¯ = C∞ (Rd ).
2. For a diffusion process in Rd with continuous non-degenerated coefficients satisfying an

appropriate growth constraint at infinity, (A1) and (A2) hold with A0 = C0∞ (Rd ), A =
S (Rd ) ∩ C 2 (Rd ) and B = A¯ = C∞ (Rd ).
3. In general, it can be difficult to determine explicitly a space A0 such that (A2) holds. In
this case, a common procedure is to approximate the Markov process and its transition
semigroup by more regular processes (e.g. non-degenerate diffusions in Rd ), and to derive
asymptotic properties from corresponding properties of the approximands.
d
4. For an interacting particle system on T Z with bounded transition rates ci (x, η), the con-
ditions (A1) and (A2) hold with
n d
o
A0 = A = f : T Z → R : |||f ||| < ∞
where
X
|||f ||| = ∆f (x), ∆f (x) = sup f (η x,i ) − f (η) ,
i∈T
x∈Zd
cf. Liggett [15].
Theorem 4.2 (From the martingale problem to the Kolmogorov equations). Suppose (A1) and
(A2) hold. Then (pt )t≥0 induces a C0 contraction semigroup (Pt )t≥0 on the Banach space B =
A¯ = A¯0 , and the generator is an extension of (L , A ). In particular, the forward and backward
equations
d
pt f = pt L f ∀ f ∈ A
dt
and
d
pt f = L pt f ∀ f ∈ A0
dt
hold.
Proof. Since Mtf is a bounded martingale with respect to Px , we obtain the integrated backward
forward equation by Fubini:
 t 
Z
(pt f )(x) − f (x) = Ex [f (Xt ) − f (X0 )] = Ex  (L f )(Xs ) ds
0
(4.2)
Zt
= (ps L f )(x) ds
0
for all f ∈ A and x ∈ S. In particular,
Zt
kpt f − f ksup ≤ kps L f ksup ds ≤ t · kL f ksup → 0
0
4.2. STATIONARY DISTRIBUTIONS AND REVERSIBILITY 91
as t ↓ 0 for any f ∈ A . This implies strong continuity on B = A¯ since each pt is a contraction

with respect to the sup-norm. Hence by (A1) and (4.2),
Zt
pt f − f 1
−Lf = (ps L f − L f ) ds → 0
t t
0
uniformly for all f ∈ A , i.e. A is contained in the domain of the generator L of the semigroup
(Pt )t≥0 induced on B, and Lf = L f for all f ∈ A . Now the forward and the backward
equations follow from the corresponding equations for (Pt )t≥0 and Assumption (A2).
4.2 Stationary distributions and reversibility
Theorem 4.3 (Infinitesimal characterization of stationary distributions). Suppose (A1) and (A2)
hold. Then for µ ∈ M1 (S) the following assertions are equivalent:
(i) The process (Xt , Pµ ) is stationary, i.e.
(Xs+t )t≥0 ∼ (Xt )t≥0
with respect to Pµ for all s ≥ 0.
(ii) µ is a stationary distribution for (pt )t≥0
(iii) Z
L f dµ = 0 ∀ f ∈ A
(i.e. µ is infinitesimally invariant, L ∗ µ = 0).
Proof. (i)⇒(ii) If (i) holds then in particular
µps = Pµ ◦ Xs−1 = Pµ ◦ X0−1 = µ
for all s ≥ 0, i.e. µ is a stationary initial distribution.
(ii)⇒(i) By the Markov property, for any measurable subset B ⊆ D(R+ , S),
Pµ [(Xs+t )t≥0 ∈ B | Fs ] = PXs [(Xt )t≥0 ∈ B]
Pµ -a.s., and thus
Pµ [(Xs+t )t≥0 ∈ B] = Eµ [PXs ((Xt )t≥0 ∈ B)] = Pµps [(Xt )t≥0 ∈ B] = Pµ [X ∈ B]

(ii)⇒(iii) By the theorem above, for f ∈ A ,

pt f − f
→ L f uniformly as t ↓ 0,
t
so
R R R
(pt f − f ) dµ f d(µpt ) −
Z
f dµ
L f dµ = lim = lim =0
t↓0 t t↓0 t
provided µ is stationary with respect to (pt )t≥0 .
(iii)⇒(ii) By the backward equation and (iii),

Z Z
d
pt f dµ = L pt f dµ = 0
dt
since pt f ∈ A for f ∈ A0 and hence
Z Z Z
f d(µpt ) = pt f dµ = f dµ (4.3)
for all f ∈ A0 and t ≥ 0. Since A0 is dense in A with respect to the supremum norm,
(4.3) extends to all f ∈ A . Hence µpt = µ for all t ≥ 0 by (A0).
Remark . Assumption (A2) is required only for the implication (iii)⇒(ii).
Applicaton to Itô diffusions:

Suppose that we are given non-explosive weak solutions (Xt , Px ), x ∈ Rd , of the stochastic
differential equation
dXt = σ(Xt ) dBt + b(Xt ) dt, X0 = x Px -a.s.,
where (Bt )t≥0 is a Brownian motion in Rd , and the functions σ : Rn → Rn×d and b : Rn → Rn
are locally Lipschitz continuous. Then by Itô’s formula (Xt , Px ) solves the martingale problem
for the operator
n
1X ∂2
L = aij (x) + b(x) · ∇, a = σσ T ,
2 i,j=1 ∂xi ∂xj
with domain A = C0∞ (Rn ). Moreover, the local Lipschitz condition implies uniqueness of
strong solutions, and hence, by the Theorem of Yamade-Watanabe, uniqueness in distribution of
weak solutions and uniqueness of the martingale problem for (L , A ), cf. e.g. Rogers/Williams
[18]. Therefore by the remark above, (Xt , Px ) is a Markov process.
Theorem 4.4. Suppose µ is a stationary distribution of (Xt , Px ) that has a smooth density % with
respect to the Lebesgue measure. Then
n
1 X ∂2
L % :=
∗
(aij %) − div(b%) = 0
2 i,j=1 ∂xi ∂xj
Proof. Since µ is a stationary distribution,

Z Z Z
0 = L f dµ = L f % dx = f L ∗ % dx ∀ f ∈ C0∞ (Rn ) (4.4)
Rn Rn
Here the last equation follows by integration by parts, because f has compact support.
Remark . In general, µ is a distributional solution of L ∗ µ = 0.
Example (One-dimensional diffusions). In the one-dimensional case,

a
L f = f 00 + bf 0 ,
2
and
1
L ∗ % = (a%)00 − (b%)0
2
2
where a(x) = σ(x) . Assume a(x) > 0 for all x ∈ R.
a) Harmonic functions and recurrence:
Z•
a 2b
L f = f 00 + bf 0 = 0 ⇔ f 0 = C1 exp − dx, C1 ∈ R
2 a
0
⇔ f = C2 + C1 · s, C1 , C2 ∈ R
where
Z• Ry 2b(x)
s := e− 0 a(x)
dx
dy
0
is a strictly increasing harmonic function that is called the scale function or natural scale of the diffusion.
In particular, s(Xt ) is a martingale with respect to Px . The stopping theorem implies
s(b) − s(x)
Px [Ta < Tb ] = ∀a < x < b
s(b) − s(a)
As a consequence,
(i) If s(∞) < ∞ or s(−∞) > −∞ then Px [|Xt | → ∞] = 1 for all x ∈ R, i.e., (Xt , Px ) is
transient.
(ii) If s(R) = R then Px [Ta < ∞] = 1 for all x, a ∈ R, i.e., (Xt , Px ) is irreducible and
recurrent.
b) Stationary distributions:
(i) s(R) 6= R: In this case, by the transience of (Xt , Px ), a stationary distribution does not
exist. In fact, if µ is a finite stationary measure, then for all t, r > 0,
µ({x : |x| ≤ r}) = (µpt )({x : |x| ≤ r}) = Pµ [|Xt | ≤ r].
Since Xt is transient, the right hand side converges to 0 as t ↑ ∞. Hence µ({x : |x| ≤
r}) = 0 for all r > 0, i.e., µ ≡ 0.
(ii) s(R) = R: We can solve the ordinary differential equation L ∗ % = 0 explicitly:

0
1
L %=
∗
(a%)0 − b% =0
2
1 b
⇔ (a%)0 − a% = C1 with C1 ∈ R
2 a
1 − R • 2b dx 0 R • 2b
⇔ e 0 a a% = C1 · e− 0 a dx
2
⇔ s0 a% = C2 + 2C1 · s with C1 , C2 ∈ R
C2 C2 R y 2b dx
⇔ %(y) = = e0 a with C2 ≥ 0
a(y)s0 (y) a(y)
Here the last equivalence holds since s0 a% ≥ 0 and s(R) = R imply C2 = 0. Hence a
stationary distribution µ can only exist if the measure
1 R y 2b dx
m(dy) := e 0 a dy
a(y)
m
is finite, and in this case µ = m(R)
. The measure m is called the speed measure of the
diffusion.
Concrete examples:
1. Brownian motion: a ≡ 1, b ≡ 0, s(y) = y. Since s(R) = R, Brownian motion is

transient and there is no stationary distribution. Lebesgue measure is an infinite stationary
measure.
2. Ornstein-Uhlenbeck process:
dXt = dBt − γXt dt, γ > 0,

1 d2 d
L = 2
− γx , a ≡ 1,
2 dx dx
Zy Ry
Zy
2γx dx 2
b(x) = −γx, s(y) = e 0 dy = eγy dy recurrent,
0 0

−γy 2 m 2
m(dy) = e dy, µ= =N 0, is the unique stationary distribution
m(R) γ
3.
1
dXt = dBt + b(Xt ) dt, b ∈ C 2, for |x| ≥ 1
b(x) =
x
transient, two independent non-negative solutions of L ∗ % = 0 with % dx = ∞.
R
γ
(Exercise: stationary distributions for dXt = dBt − 1+|Xt |
dt)
Example (Deterministic diffusions).
dXt = b(Xt ) dt, b ∈ C 2 (Rn )

L f = b · ∇f
L ∗ % = − div(%b) = −% div b − b · ∇%, % ∈ C1
Proposition 4.5.
L ∗% = 0 ⇔ div(%b) = 0
⇔ (L , C0∞ (Rn )) anti-symmetric on L2 (µ)
Proof. First equivalence: cf. above
Second equivalence:
Z Z Z
f L g dµ = f b · ∇g% dx = − div(f b%)g dx
Z Z
= − L f g dµ − div(%b)f g dx ∀ f, g ∈ C0∞
Hence L is anti-symmetric if and only if div(%b) = 0

Theorem 4.6. Suppose (A1) and (A2) hold. Then for µ ∈ M1 (S) the following assertions are
equivalent:
(i) The process (Xt , Pµ ) is invariant with respect to time reversal, i.e.,
(Xs )0≤s≤t ∼ (Xt−s )0≤s≤t with respect to Pµ ∀ t ≥ 0
(ii)
µ(dx)pt (x, dy) = µ(dy)pt (y, dx) ∀ t ≥ 0
(iii) pt is µ-symmetric, i.e.,
Z Z
f pt g dµ = pt f g dµ ∀ f, g ∈ Fb (S)
(iv) (L , A ) is µ-symmetric, i.e.,

Z Z
f L g dµ = L f g dµ ∀ f, g ∈ A
Remark . 1. A reversible process (Xt , Pµ ) is stationary, since for all s, u ≥ 0,

(Xs+t )0≤t≤u ∼ (Xu−t )0≤t≤u ∼ (Xt )0≤t≤u with respect to Pµ
2. Similarly (ii) implies that µ is a stationary distribution:

Z Z
µ(dx)pt (x, dy) = pt (y, dx)µ(dy) = µ(dy)
Proof of the Theorem. (i)⇒(ii):

µ(dx)pt (x, dy) = Pµ ◦ (X0 , Xt )−1 = Pµ ◦ (Xt , X0 )−1 = µ(dy)pt (y, dx)
(ii)⇒(i): By induction, (ii) implies

µ(dx0 )pt1 −t0 (x0 , dx1 )pt2 −t1 (x1 , dx2 ) · · · ptn −tn−1 (xn−1 , dxn )
=µ(dxn )pt1 −t0 (xn , dxn−1 ) · · · ptn −tn−1 (x1 , dx0 )
for n ∈ N and 0 = t0 ≤ t1 ≤ · · · ≤ tn = t, and thus
Eµ [f (X0 , Xt1 , Xt2 , . . . , Xtn−1 , Xt )] = Eµ [f (Xt , . . . , Xt1 , X0 )]
for all measurable functions f ≥ 0. Hence the time-reversed distribution coincides with
the original one on cylinder sets, and thus everywhere.
(ii)⇔(iii): By Fubini, Z ZZ
f pt g dµ = f (x)g(y)µ(dx)pt (x, dy)
is symmetric for all f, g ∈ Fb (S) if and only if µ ⊗ pt is a symmetric measure on S × S.

(iii)⇔(iv): Exercise.
Application to Itô diffusions in Rn :

n
1X ∂2
L = aij (x) + b · ∇, A = C0∞ (Rn )
2 i,j=1 ∂xi ∂xj
µ probability measure on Rn (more generally locally finite positive measure)
Question: For which process is µ stationary?
Theorem 4.7. Suppose µ = % dx with %i aij ∈ C 1 , b ∈ C, % > 0. Then

1. We have
L g = Ls g + La g
for all g ∈ C0∞ (Rn ) where

1X 1 ∂ ∂g
Ls g = n %aij
2 i,j=1 % ∂xi ∂xi
X 1 ∂
La g = β · ∇g, βj = bj − (%aij )
i
2% ∂xi
2. The operator (Ls , C0∞ is symmetric with respect to µ.

3. The following assertions are equivalent:
(i) L ∗ µ = 0 (i.e. L f dµ = 0 for all f ∈ C0∞ ).
R
(ii) La∗ µ = 0
(iii) div(%β) = 0
(iv) (La , C0∞ ) is anti-symmetric with respect to µ
Proof. Let Z
E(f, g) := − f L g dµ (f, g ∈ C0∞ )
denote the bilinear form of the operator (L , C0∞ (Rn )) on the Hilbert space L2 (Rn , µ). We de-
compose E into a symmetric part and a remainder. An explicit computation based on the integra-
tion by parts formula in Rn shows that for g ∈ C0∞ (Rn ) and f ∈ C ∞ (Rn ):
∂2g
Z X
1
E(f, g) = − f aij + b · ∇g % dt
2 ∂xi ∂xj
Z X Z
1 ∂ ∂g
= (%aij f ) dx − f b · ∇g% dx
2 i,j ∂xi ∂xj
Z X Z
1 ∂f ∂g
= ai,j % dx − f β · ∇g% dx ∀ f, g ∈ C0∞
2 i,j ∂xi ∂xj
and set
Z Z
1X ∂f ∂g
Es (f, g) := ai,j % dx = − f Ls g dµ
2 i,j ∂xi ∂xj
Z Z
Ea (f, g) := f β · ∇g% dx = − f La g dµ
This proves 1) and, since Es is a symmetric bilinear form, also 2). Moreover, the assertions (i)
and (ii) of 3) are equivalent, since
Z Z
− L g dµ = E(1, g) = Es (1, g) + Ea (1, g) = − La g dµ
for all g ∈ C0∞ (Rn ) since Es (1, g) = 0. Finally, the equivalence of (ii),(iii) and (iv) has been
shown in the example above.
Example . L = 12 ∆ + b · ∇, b ∈ C(Rn , Rn ),
1
(L , C0∞ ) µ-symmetric ⇔ β =b− ∇% = 0
2%
∇% 1
⇔ b= = ∇ log %
2% 2
where log % = −H if µ = e−H dx.
L symmetrizable ⇔ b is a gradient
1
L ∗µ = 0 ⇔ b = ∇ log % + β
2
when div(%β) = 0.
Remark . Probabilistic proof of reversibility for b := − 12 ∇H, H ∈ C 1 :
Zt
1
Xt = x + Bt + b(Xs ) ds, non-explosive, b = − ∇h
2
0
−1
Hence Pµ ◦ X0:T Wiener measure with density
 
ZT
1 1 1 1
exp − H(B0 ) − H(BT ) − |∇H| − ∆H (Bs ) ds
2 2 8 2
0
which shows that (Xt , Pµ ) is reversible.

4.3. DIRICHLET FORMS AND CONVERGENCE TO EQUILIBRIUM 99
4.3 Dirichlet forms and convergence to equilibrium

Suppose now that µ is a stationary distribution for (pt )t≥0 . Then pt is a contraction on Lp (S, µ)
for all p ∈ [1, ∞] since
Z Z Z
|pt f | dµ ≤ pt |f | dµ = |f |p dµ ∀ f ∈ Fb (S)
p p
by Jensen’s inequality and the stationarity of µ. As before, we assume that we are given a
Markov process with transition semigroup (pt )t≥0 solving the martingale problem for the op-
erator (L , A ). The assumptions on A0 and A can be relaxed in the following way:
(A0) as above
(A1’) f, Lf ∈ L p (S, µ) for all 1 ≤ p < ∞
(A2’) A0 is dense in A with respect to the Lp (S, µ) norms, 1 ≤ p < ∞, and pt f ∈ A for all
f ∈ A0
In addition, we assume for simplicity
(A3) 1 ∈ A
Remark . Condition (A0) implies that A R, and hence A0 , is dense in Lp (S, µ) for all p ∈ [1, ∞).
In fact, if g ∈ L q (S, µ), 1q + 1q = 1, with f g dµ = 0 for all f ∈ A , then g dµ = 0 by (A0) and
hence g = 0 µ-a.e. Similarly as above, the conditions (A0), (A1’) and (A2’) imply that (pt )t≥0
induces a C0 semigroup on Lp (S, µ) for all p ∈ [1, ∞), and the generator (L(p) , Dom(L(p) ))
extends (L , A ), i.e.,
A ⊆ Dom(L(p) ) and L(p) f = Lf µ-a.e. for all f ∈ A
In particular, the Kolmogorov forward equation

d
pt f = pt Lf ∀f ∈ A
dt
and the backward equation
d
pt f = Lpt f ∀ f ∈ A0
dt
hold with the derivative taken in the Banach space Lp (S, µ).
We first restrict ourselves to the case p = 2. For f, g ∈ L 2 (S, µ) let

Z
(f, g)µ = f g dµ
denote the L2 inner product.

Definition 4.8. The bilinear form

d
E (f, g) := −(f, Lg)µ = − (f, pt g)µ ,

dt t=0
f, g ∈ A , is called the Dirichlet form associated to (L , A ) on L2 (µ).

1
Es (f, g) := (E (f, g) + E (g, f ))
2
is the symmetrized Dirichlet form.
Remark . More generally, E (f, g) is defined for all f ∈ L2 (S, µ) and g ∈ Dom(L(2) ) by
d
E (f, g) = −(f, L g)µ = − (f, pt g)µ
(2)
dt t=0
Theorem 4.9. For all f ∈ A0 and t ≥ 0

Z
d d
Varµ (pt f ) = (pt f )2 dµ = −2E (pt f, pt f ) = −2Es (pt f, pt f )
dt dt
Z
1 1d
E (f, f ) = − 2
(pt f ) dµ = − Varµ (pt f ) ,

2 2 dt t=0
infinitesimal change of variance

2. The assertion extends to all f ∈ Dom(L(2) ) if the Dirichlet form is defined with respect to
the L2 generator. In the symmetric case the assertion even holds for all f ∈ L2 (S, µ).
Proof. By the backward equation,

Z Z
d
(pt f ) dµ = 2 pt Lpt f dµ = −2E (pt f, pt f ) = −2Es (pt f, pt f )
2
dt
Moreover, since
Z Z Z
pt f dµ = f d(µpt ) = f dµ
is constant,
Z
d d
Varµ (pt ) = (pt f )2 dµ
dt dt
Z
1d 1d
E (f, f ) = − (pt f )2 dµ = − Varµ (pt f )

2 dt t=0 2 dt
1 1d
Es (f, g) = (Es (f + g, f + g) + Es (f − g, f − g)) = − Covµ (pt f, pt g)
4 2 dt
Dirichlet form = infinitesimal change of (co)variance.
2. Since pt is a contraction on L 2 (µ), the operator (L , A ) is negative-definite, and the
bilinear form (E , A ) is positive definite:
Z Z
1
(−f, L f )µ = E (f, f ) = − lim 2 2
(pt f ) dµ − f dµ ≥ 0
2 t↓0
Corollary 4.10 (Decay of variance). For λ > 0 the following assertions are equivalent:
(i) Poincaré inequality:
1
Varµ (f ) ≤ E(s) (f, f ) ∀ f ∈ A
λ
(ii) Exponential decay of variance:
Varµ (pt f ) ≤ e−2λt Varµ (f ) ∀ f ∈ L2 (S, µ) (4.5)
(iii) Spectral gap:

(2)
Re α ≥ λ ∀ α ∈ spec −L
span{1}⊥
Remark . Optimizing over λ, the corollary says that (4.5) holds with
E (f, f ) (f, −L f )µ
λ := inf = inf
f ∈A Varµ (f ) f ∈A (f, f )µ
f ⊥1 in L2 (µ)
Proof. (i) ⇒ (ii)

E (f, f ) ≥ λ · Varµ (f ) ∀ f ∈ A
By the theorem above,
d
Varµ (pt f ) = −2E (pt f, pt f ) ≤ −2λ Varµ (pt f )
dt
for all t ≥ 0, f ∈ A0 . Hence
Varµ (pt f ) ≤ e−2λt Varµ (p0 f ) = e−2λt Varµ (f )
for all f ∈ A0 . Since the right hand side is continuous with respect to the L2 (µ) norm, and
A0 is dense in L2 (µ) by (A0) and (A2), the inequality extends to all f ∈ L2 (µ).
(ii) ⇒ (iii) For f ∈ Dom(L(2) ,

d
Varµ (pt f ) = −2E (f, f ).

dt t=0
Hence if (4.5) holds then

Varµ (pt f ) ≤ e−2λt Varµ (f ) ∀ t ≥ 0
which is equivalent to
Varµ (f ) − 2tE (f, f ) + o(t) ≤ Varµ (f ) − 2λt Varµ (f ) + o(t) ∀ t ≥ 0
Hence
E (f, f ) ≥ λ Varµ (f )
and thus
Z
(2)
−(L f, f )µ ≥ λ f 2 dµ for f ⊥1
which is equivalent to (iii).

(iii) ⇒ (i) Follows by the equivalence above.
Remark . Since (L , A ) is negative definite, λ ≥ 0. In order to obtain exponentially decay,

however, we need λ > 0, which is not always the case.
Example . 1. Finite state space: Suppose µ(x) > 0 for all x ∈ S.

Generator:
X X
(L f )(x) = L (x, y)f (y) = L (x, y)(f (y) − f (x))
y y
Adjoint:
µ(x)
L ∗µ (y, x) = L (x, y)
µ(y)
Proof.
X
(L f, g)µ = µ(x)L (x, y)f (y)g(x)
x,y
X µ(x)
= µ(y)f (y) L (x, y)g(x)
µ(y)
= (f, L ∗µ g)µ
Symmetric part:

1 1 µ(y)
Ls (x, y) = (L (x, y) + L (x, y)) =
∗µ
L (x, y) + L (y, x)
2 2 µ(x)
1
µ(x)Ls (x, y) = (µ(x)L (x, y) + µ(y)L (y, x))
2
Dirichlet form:
X
Es (f, g) = −(Ls f, g) = − µ(x)Ls (x, y) (f (y) − f (x)) g(x)
x,y
X
=− µ(y)Ls (y, x) (f (x) − f (y)) g(y)
x,y
1X
=− µ(x)Ls (x, y) (f (y) − f (x)) (g(y) − g(x))
2
Hence
1X
E (f, f ) = Es (f, f ) = Q(x, y) (f (y) − f (x))2
2 x,y
where
1
Q(x, y) = µ(x)Ls (x, y) = (µ(x)L (x, y) + µ(y)L (y, x))
2
2. Diffusions in Rn : Let
1X ∂2
L = aij + b · ∇,
2 i,j ∂xi ∂xj
and A = C0∞ , µ = % dx, %, aij ∈ C 1 , b ∈ C % ≥ 0,

n
Z X
1 ∂f ∂g
Es (f, g) = aij dµ
2 i,j=1
∂xi ∂xj
1
E (f, g) = Es (f, g) − (f, β · ∇g), β =b− div (%aij )
2%
Definition 4.11 (”Distances” of probability measures). µ, ν probability measures on S, µ − ν

signed measure.
(i) Total variation distance:
kν − µkTV = sup |ν(A) − µ(A)|

A∈S
(ii) χ2 -contrast:

R dµ
2 R dν 2
− 1 dµ = dµ − 1 if ν µ
χ2 (µ|ν) = dν dµ
+∞ else
(iii) Relative entropy:

(R
dν dν dν
R
dµ
log dµ dµ = log dµ dν if ν µ
H(ν|µ) =
+∞ else
(where 0 log 0 := 0).
Remark . By Jensen’s inequality,

Z Z
dν dν
H(ν|µ) ≥ dµ log dµ = 0
dµ dµ
Lemma 4.12 (Variational characterizations).
(i) Z Z
1
kν − µk = sup f dν − f dµ
2 f ∈Fb (S)
|f |≤1
(ii)
Z Z 2
2
λ (ν|µ) = sup f dν − f dµ
Rf ∈F b (S)
f 2 dµ≤1
R
and by replacing f by f − f dµ,
Z 2
2
λ (ν|µ) = sup f dν
Rf ∈F b (S)
2 dµ≤1
R f
f dµ=0
(iii)
Z Z Z
H(ν|µ) = sup f dν = R
sup f dν − log ef dµ
Rf ∈F b (S) f Fb (S)
ef dµ≤1
R R
Remark . ef dµ ≤ 1, hence f dµ ≤ 0 by Jensen and we also have
Z Z
R
sup f dν − f dµ ≤ H(ν|µ)
ef dµ≤1
Proof. (i) ” ≤ ”
Z Z
1 1
ν(A) − µ(A) = (ν(A) − µ(A) + µ(Ac ) − ν(Ac )) = f dν − f dµ
2 2
and setting f := IA − IAc leads to

Z Z
1
kν − µkTV = sup (ν(A) − µ(A)) ≤ sup f dν − f dµ
A 2 |f |≤1
” ≥ ” If |f | ≤ 1 then
Z Z Z
f d(ν − µ) = f d(ν − µ) + f d(ν − µ)
S+ S−
≤ (ν − µ)(S+ ) − (ν − µ)(S− )
= 2(ν − µ)(S+ ) (since (ν − µ)(S+ ) + (ν − µ)(S− ) = (ν − µ)(S) = 0)
≤ 2kν − µkTV
where S = S+ ˙ S− , ν − µ ≥ 0 on S+ , ν − µ ≤ 0 on S− is the Hahn-Jordan

S
decomposition of the measure ν − µ.
(ii) If ν µ with density % then

Z Z Z
1 1
2
χ (ν|µ) = k% − 1kL2 (µ) =
2 2
sup f (% − 1) dµ = sup f dν − f dµ
f ∈L 2 (µ) f ∈Fb (S)
kf kL2 (µ) ≤1 kf kL2 (µ) ≤1
by the Cauchy-Schwarz inequality and a density argument.

If ν 6 µ then there exists A ∈ S with µ(A) = 0 and ν(A) 6= 0. Choosing f = λ · IA with
λ ↑ ∞ we see that
Z Z 2
sup f dν − f dµ = ∞ = χ2 (ν|µ).
f ∈Fb (S)
kf kL2 (µ) ≤1
R
This proves the first equation. The second equation follows by replacing f by f − f dµ.
(iii) First equation:

” ≥ ” By Young’s inequality,
uv ≤ u log u − u + ev
for all u ≥ 0 and v ∈ R, and hence for ν µ with density %,
Z Z
f dν = f % dµ
Z Z Z
≤ % log % dµ − % dµ + ef dµ
Z
= H(ν|µ) − 1 + ef dµ ∀ f ∈ Fb (S)
Z
≤ H(ν|µ) if ef dµ ≤ 1
” ≤ ” ν µ with density %:
1
a) ε ≤ % ≤ ε
for some ε > 0: Choosing f = log % we have
Z Z
H(ν|µ) = log % dν = f dν
and
Z Z
f
e dµ = % dµ = 1
b) General case by an approximation argument.
Second equation: cf. Deuschel, Stroock [8].
Remark . If ν µ with density % then

Z
1 1
kν − µkTV = sup f (% − 1) dµ = k% − 1kL1 (µ)
2 |f |≤1 2
However, kν − µkTV is finite even when ν 6 µ.
Corollary 4.13. The assertions (i) − (iii) in the corollary above are also equivalent to
(iv) Exponential decay of χ2 distance to equilibrium:
χ2 (νpt |µ) ≤ e−2λt χ2 (ν|µ) ∀ ν ∈ M1 (S)
Proof. We show (ii) ⇔ (iv).

” ⇒ ” Let f ∈ L 2 (µ) with

R
f dµ = 0. Then
Z Z Z Z
f d(νpt ) − f dµ = f d(νpt ) = pt f dν
1
≤ kpt f kL2 (µ) · χ2 (ν|µ) 2
1
≤ e−λt kf kL2 (µ) · χ2 (ν|µ) 2
R R
where
R 2 we have used that pt f dµ = f dµ = 0. By taking the supremum over all f with
f dµ ≤ 1 we obtain
1 1
χ2 (νpt |µ) 2 ≤ e−λt χ2 (ν|µ) 2
” ⇐ ” For f ∈ L 2 (µ) with

R
f dµ = 0, (iv) implies
Z Z
ν:=gµ 1
pt f g dµ = f d(νpt ) ≤ kf kL2 (µ) χ2 (νpt |µ) 2
1
≤ e−λt kf kL2 (µ) χ2 (ν|µ) 2
= e−λt kf kL2 (µ) kgkL2 (µ)
for all g ∈ L2 (µ), g ≥ 0. Hence
kpt f kL2 (µ) ≤ e−λt kf kL2 (µ)
Example: d = 1!
Example (Gradient type of diffusions in Rn ).

dXt = dBt + b(Xt ) dt, b ∈ C(Rn , Rn )
Generator:
1
L f = ∆f + b∇f, f ∈ C0∞ (Rn )
2
symmetric with respect to µ = % dx, % ∈ C 1 ⇔ b = 12 ∇ log %.
Corresponding Dirichlet form on L2 (% dx):
Z Z
1
E (f, g) = − Lf g% dx = ∇f ∇g% dx
2
Poincaré inequality: Z
1
Var% dx (f ) ≤ · |∇f |2 % dx
2λ
The one-dimensional case: n = 1, b = 12 (log %)0 and hence
Rx
2b(y) dy
%(x) = const. e 0
2
e.g. b(x) = −αx, %(x) = const. e−αx , µ = Gauss measure.
Theorem 4.14. ???
Proof. By the Cauchy-Schwarz inequality, for x > 0,

 x 2
Z Zx Zx
2 1
(f (x) − f (0)) =  f 0 dy  ≤ |f 0 |2 g% dy dy
g%
0 0 0
where g : R+ → R+ is an arbitrary continuous strict positive function. Hence by Fubini

Z∞
(f (x) − f (0))2 %(x) dx
0
Z∞ Z∞ Zx
1
≤ |f 0 (y)|2 g(y)%(y) dz%(x) dx
g(x)%(z)
0 y 0
Z∞ Z∞ Zx
0 2 1
≤ |f (y)| %(y) dy · sup g(y) · dz%(x) dx
y>0 g(x)%(z)
0 y 0
Optimal choice for g:

 21
Zy

1
g(y) =  dx
%(x)
0
In this case:
Zx Zx
1
dz = 2 g 0 dz = 2g(x),
g%
0 0
so
Z∞ Z∞
2
(f − f (0)) % dx ≤ |f 0 |2 % dy · sup
y>0
0 0
Bounds on the variation norm:
Lemma 4.15. (i)

1
kν − µk2TV ≤ χ2 (ν|µ)
4
(ii) Pinsker’s inequality:
1
kν − µk2TV ≤ H(ν|µ) ∀ µ, ν ∈ M1 (S)
2
Proof. If ν 6 µ, then H(ν|µ) = χ2 (ν|µ) = ∞.

Now let ν µ:
(i)
1 1 1 1
kν − µkTV = k% − 1kL1 (µ) ≤ k% − 1kL2 (µ) = χ2 (ν|µ) 2
2 2 2
(ii) We have the inequality
3(x − 1)2 ≤ (4 + 2x)(x log x − x + 1) ∀x ≥ 0
and hence
√ 1 1
3|x − 1| ≤ (4 + 2x) 2 (x log x − x + 1) 2
and with the Cauchy Schwarz inequality

12 Z 12
√ Z
Z
3 |% − 1| dµ ≤ (4 + 2%) dµ (% log % − % + 1) dµ
√ 1
= 6 · H(ν|µ) 2
Remark . If S is finite and µ(x) > 0 for all x ∈ S then conversely

P 2
ν(x)
2 − 1 µ(x)

X ν(x) x∈S µ(x)

χ2 (ν|µ) = − 1 µ(x) ≤
x∈S
µ(x) minx∈S µ(x)
4kν − µk2TV
=
min µ
Corollary 4.16. (i) If the Poincaré inequality

1
Varµ (f ) ≤ E (f, f ) ∀ f ∈ A
λ
holds then
1 1
kνpt − µkTV ≤ e−λt χ2 (ν|µ) 2 (4.6)
2
(ii) In particular, if S is finite then
1
kνpt − µkTV ≤ 1 e−λt kν − µkTV
minx∈S µ(x) 2
where kν − µkTV ≤ 1. This leads to a bound for the Dobrushin coefficient (contraction
coefficient with respect to k · kTV ).
Proof.
1 1 1 1 2 1
kνpt − µkTV ≤ χ2 (νpt |µ) 2 ≤ e−λt χ2 (ν|µ) 2 ≤ e−λt kν − µkTV
2 2 2 min µ 12
if S is finite.
Consequence: Total variation mixing time: ε ∈ (0, 1),
Tmix (ε) = inf {t ≥ 0 : kνpt − µkTV ≤ ε for all ν ∈ M1 (S)}

1 1 1 1
≤ log + log
λ ε 2λ min µ(x)
where the first summand is the L2 relaxation time and the second is called burn-in period, i.e.
the time needed to make up for a bad initial distribution.
Remark . On high or infinite-dimensional state spaces the bound (4.6) is often problematic since
χ2 (ν|µ) can be very large (whereas kν − µkTV ≤ 1). For example for product measures,
Z n 2 Z 2 !n
dν dν
χ2 (ν n |µn ) = n
dµn − 1 = dµ − 1
dµ dµ
R dν 2
where dµ
dµ > 1 grows exponentially in n.
Are there improved estimates?

Z Z Z
pt f dν − f dµ = pt f d(ν − µ) ≤ kpt f ksup · kν − µkTV
Analysis: From the Sobolev inequality follows
kpt f ksup ≤ c · kf kLp
However, Sobolev constants are dimension dependent! This leads to a replacement by the log
Sobolev inequality.
4.4 Hypercontractivity
Additional reference for this chapter:
• Gross [11]
• Deuschel, Stroock [8]

4.4. HYPERCONTRACTIVITY 111
• Ané [2]
• Royer [19]
We consider the setup from section 4.3. In addition, we now assume that (L , A ) is symmetric
on L2 (S, µ).
Theorem 4.17. With assumptions (A0)-(A3) and α > 0, the following statements are equivalent:
(i) Logarithmic Sobolev inequality (LSI)
f2
Z
f 2 log dµ ≤ αE (f, f ) ∀ f ∈ A
S kf k2L2 (µ)
(ii) Hypercontractivity For 1 ≤ p < q < ∞,

α q−1
kpt f kLq (µ) ≤ kf kLp (µ) ∀ f ∈ Lp (µ), t ≥ log
4 p−1
(iii) Assertion (ii) holds for p = 2.
Remark . Hypercontractivity and Spectral gap implies
kpt f kLq (µ) = kpt0 pt−t0 f kLq (µ) ≤ kpt−t0 f kL2 (µ) ≤ e−λ(t−t0 ) kf kL2 (µ)
α
for all t ≥ t0 (q) := 4
log(q − 1).
Proof. (i)⇒(ii) Idea: WLOG f ∈ A0 , f ≥ δ > 0 (which implies that pt f ≥ δ ∀ t ≥ 0).

Compute
d
kpt f kLq(t) (µ) , q : R+ → (1, ∞) smooth:
dt
1. Kolmogorov:
d
pt f = L pt f derivation with respect to sup-norm
dt
implies that
Z Z Z
d q(t)
(pt f ) dµ = q(t) (pt f )q(t)−1
L pt f dµ + q (t) (pt f )q(t) log pt f dµ
0
dt
where Z
(pt f )q(t)−1 L pt f dµ = −E (pt f )q(t)−1 , pt f

2. Stroock estimate:
4(q − 1) q q
E f q−1 , f ≥ E f 2,f 2
q2
Proof.
1
E (f q−1 , f ) = − f q−1 , L f µ = lim f q−1 , f − pt f µ

t↓0 t
ZZ
1
f q−1 (y) − f q−1 (x) (f (y) − f (x)) pt (x, dy) µ(dx)

= lim
t↓0 2t
4(q − 1)
ZZ
1 q q 2
≥ lim f 2 (y) − f (x) pt (x, dy) µ(dx)
q2 t↓0 2t 2
4(q − 1) q q
= E f 2,f 2
q2
where we have used that
q q 2 q2
aq−1 − bq−1 (a − b) ∀ a, b > 0, q ≥ 1

a −b
2 ≤
2 4(q − 1)
Remark . – The estimate justifies the use of functional inequalities with respect to
E to bound Lp norms.
– For generators of diffusions, equality holds, e.g.:
4(q − 1)
Z Z
q 2

q−1
∇f ∇f dµ = ∇f dµ
2
q2
by the chain rule.
3. Combining the estimates:
Z Z
q(t)−1 d d q(t) 0
q(t) · kpt f kq(t) kpt f kq(t) = (pt f ) dµ − q (t) (pt f )q(t) log kpt f kq(t) dµ
dt dt
where Z
q(t)
(pt f )q(t) dµ = kpt f kq(t)
This leads to the estimate

d q(t)−1
q(t) · kpt f kq(t)
kpt f kq(t)
dt
4(q(t) − 1) q(t) q(t)
q 0 (t) Z (pt f )q(t)
≤− E (pt f ) 2 , (pt f ) 2 + · (pt f )q(t) log R dµ
q(t) q(t) (pt f )q(t) dµ
4. Applying the logarithmic Sobolev inequality: Fix p ∈ (1, ∞). Choose q(t) such that
αq 0 (t) = 4(q(t) − 1), q(0) = p

i.e.
4t
q(t) = 1 + (p − 1)e α
Then by the logarithmic Sobolev inequality, the right hand side in the estimate above
is negative, and hence kpt f kq(t) is decreasing. Thus
kpt f kq(t) ≤ kf kq(0) = kf kp ∀ t ≥ 0.

d
Other implication: Exercise. (Hint: consider dt
kpt f kLq(t) (µ) ).
Theorem 4.18 (ROTHAUS). A logarithmic Sobolev inequality with constant α implies a Poincaré
inequality with constant α = α2 .
R
Proof. f ∈ L2 (µ), g dµ = 0, f := 1 + εg, f 2 = 1 + 2εg + ε2 g 2 ,
Z Z
2
f dµ = 1 + ε 2
g 2 dµ, E (f, f ) = E (1, 1) + 2E (1, g) + ε2 E (g, g)
and the Logarithmic Sobolev Inequality implies

Z Z Z
(1 + ε) log(1 + εg) dµ ≤ αE (f, f ) + f dµ log f 2 dµ
2 2 2
Z Z Z
f log f dµ ≤ αE (f, f ) + f dµ log f 2 dµ ∀ ε > 0
2 2 2
where f 2 log f 2 = 2εg + ε2 g 2 + 21 (2εg)2 + O(ε3 ) and

R R R
f 2 dµ log f 2 dµ = ε2 g 2 dµ + O(ε3 ).
1
x log x = x − 1 + (x − 1)2 + O |x − 1|3

2
which implies that
Z
2ε 2
g 2 dµ + O(ε3 ) ≤ αε2 E (g, g) ∀ ε > 0
Z
⇒ 2 g 2 dµ ≤ αE (g, g)
Application to convergence to equilibrium:
Theorem 4.19 (Exponential decay of relative entropy). 1. H(νpt |µ) ≤ H(ν|µ) for all t ≥ 0
and ν ∈ M1 (S).
2. If a logarithmic Sobolev inequality with constant α > 0 holds then
2
H(νpt |µ) ≤ e− α t H(ν|µ)
Proof for gradient diffusions. L = 12 ∆ + b∇, b = 21 ∇ log % ∈ C(Rn ), µ = % dx probability

measure, A0 = span{C0∞ (Rn ), 1}
. The Logarithmic Sobolev Inequality implies that
f2
Z Z
2 α
f log dµ ≤ |∇f |2 dµ = αE (f, f )
kf k2L2 (µ) 2
(i) Suppose ν = g · µ, 0 <R ε ≤ g ≤ 1εR for some εR > 0. Hence R νpt µ with density
1
pt g, ε ≤ pt g ≤ ε (since f d(νpt ) = pt f dν = pt f gdµ = f pt g dµ by symmetry).
This implies that
Z Z
d d
H(νpt |µ) = pt g log pt g dµ = L pt g(1 + log pt g) dµ
dt dt
by Kolmogorov. Using the fact that (x log x)0 = 1 + log x we get
Z
d 1
H(νpt |µ) = −E (pt g, log pt g) = − ∇pt g · ∇ log pt g dµ
dt 2
∇pt g
where ∇ log pt g = pt g
. Hence
√
Z
d
H(νpt |µ) = −2 |∇ pt g|2 dµ (4.7)
dt
R √ 2
1. −2 ∇ pt g dµ ≤ 0
2. The Logarithmic Sobolev Inequality yields that
√
Z Z
2 4 pt g
−2 |∇ pt g| dµ ≤ − pt g log R dµ
α pt g dµ
R R
where pt g dµ = g dµ = 1 and hence
√
Z
4
−2 |∇ pt g|2 dµ ≤ − H(νpt |µ)
α
(ii) Now for a general ν. If ν 6 µ, H(ν|µ) = ∞ and we have the assertion. Let ν = g · µ, g ∈
L1 (µ) and
ga,b := (g ∨ a) ∧ b, 0 < a < b,

νa,b := ga,b · µ.
Then by (i),
2t
H(νa,b pt |µ) ≤ e− α H(νa,b |µ)
The claim now follows for a ↓ 0 and b ↑ ∞ by dominated and monotone convergence.
Remark . 1. The proof in the general case is analogous, just replace (4.7) by inequality
4E ( f , f ) ≤ E (f, log f )
p p
2. An advantage of the entropy over the χ2 distance is the good behavior in high dimensions.
E.g. for product measures,
H(ν d |µd ) = d · H(ν|µ)
grows only linearly in dimension.
Corollary 4.20 (Total variation bound). For all t ≥ 0 and ν ∈ M1 (S),
1 t 1
kνpt − µkTV ≤ √ e− α H(ν|µ) 2
2
1 1 t
( ≤ √ log e− α if S is finite)
2 min µ(x)
Proof.
1 1 1 t 1
kνpt − µkTV ≤ √ H(νpt |µ) 2 ≤ √ e− α H(ν|µ) 2
2 2
where we use Pinsker’s Theorem for the first inequality and Theorem ??? for the second inequal-
ity. Since S is finite,
1 1
H(δx |µ) = log ≤ log ∀x ∈ S
µ(x) min µ
which leads to
X 1
H(ν|µ) ≤ ν(x)H(δx |µ) ≤ log ∀ν
min µ
P
since ν = ν(x)δx is a convex combination.
Consequence for mixing time: (S finite)
Tmix (ε) = inf {t ≥ 0 : kνpt − µkTV ≤ ε for all ν ∈ M1 (S)}

1 1
≤ α · log √ + log log
2ε minx∈S µ(x)
Hence we have log log instead of log !

4.5 Logarithmic Sobolev inequalities: Examples and techniques

Example . Two-point space. S = {0, 1}. Consider a Markov chain with generator

−q q
L = , p, q ∈ (0, 1), p + q = 1
p −p
which is symmetric with respect to the Bernoulli measure,
µ(0) = p, µ(1) = q
q =1−p
0 1
Dirichlet form:
1X
E (f, f ) = (f (y) − f (x))2 µ(x)L (x, y)
2 x,y
= pq · |f (1) − f (0)|2 = Varµ (f )
Spectral gap:
E (f, f )
λ(p) = inf = 1 independent of p !
f not const. Varµ (f )
Optimal Log Sobolev constant:

(
1
R
f 2 log f 2 dµ 2 if p = 2
α(p) = sup =
R f2 ⊥1
E (f, f ) log q−log p
q−p
else
f dµ=1
goes to infinity as p ↓ 0 or p ↑ ∞ !
2
| |
0 p 1
4.5. LOGARITHMIC SOBOLEV INEQUALITIES: EXAMPLES AND TECHNIQUES 117
Spectral gap and Logarithmic Sobolev Inequality for product measures:
Z
Entµ (f ) := f log f dµ, f > 0
Theorem 4.21 (Factorization property). (Si , Si , µi ) probability spaces, µ = ⊗ni=1 µi . Then

1. n
X h i
Varµ (f ) ≤ Eµ Var(i)
µi (f )
i=1
where on the right hand side the variance is taken with respect to the i-th variable.
2. n
X h i
Entµ (f ) ≤ Eµ Ent(i)
µi (f )
i=1
Proof. 1. Exercise.
2.
Entµ (f ) = sup Eµ [f g], cf. above

g : Eµ [eg ]=1
Fix g : S n → R such that Eµ [eg ] = 1. Decompose:
g(x1 , . . . , xn ) = log eg(x1 ,...,xn )

R g(y ,x ,...,x )
eg(x1 ,...,xn ) e 1 2 n µ1 (dy1 )
= log R g(y1 ,x2 ,...,xn ) + log RR g(y1 ,y2 ,x3 ,...,xn ) + ···
e µ1 (dy1 ) e µ1 (dy1 )µ2 (dy2 )
Xn
=: gi (x1 , . . . , xn )
i=1
and hence
Eiµi [egi ] = 1 ∀, 1 ≤ i ≤ n
Xn n
X
Eµ E(i) ≤ Ent(i)

⇒ Eµ [f g] = Eµ [f gi ] = µi [f g i ] µi (f )
i=1 i=1
n
X h i
⇒ Entµ [f ] = sup Eµ [f g] ≤ Eµ Ent(i)
µi (f )
Eµ [eg ]=1 i=1
Corollary 4.22. 1. If the Poincaré inequalities

1
Varµi (f ) ≤ Ei (f, f ) ∀ f ∈ Ai
λi
hold for each µi then
n
1 O
Varµ (f ) ≤ E (f, f ) ∀ f ∈ Ai
λ i=1
where n
X h i
(i)
E (f, f ) = Eµ Ei (f, f )
i=1
and
λ = min λi
1≤i≤n
2. The corresponding assertion holds for Logarithmic Sobolev Inequalities with α = max αi
Proof.
n
X h i 1
Varµ (f ) ≤ Eµ Var(i)
µi (f ) ≤ E (f, f )
i=1
min λi
since
1
Varµ(i)i (f ) ≤ Ei (f, f )
λi
Example . S = {0, 1}n , µn product of Bernoulli(p),
Entµn (f )
n Z
X
≤ α(p)·p·q· |f (x1 , . . . , xi−1 , 1, xi+1 , . . . , xn ) − f (x1 , . . . , xi−1 , 0, xi+1 , . . . , xn )|2 µn (dx1 , . . . , dxn )
i=1
independent of n.
Example . Standard normal distribution γ = N (0, 1),

Pn 1

i=1 xi −
ϕn : {0, 1}n → R, ϕn (x) = pn 2
The Central Limit Theorem yields that µ = Bernoulli( 21 ) and hence

w
µn ◦ ϕ−1
n → γ
Hence for all f ∈ C0∞ (R),
Varγ (f ) = lim Varµn (f ◦ ϕn )

n→∞
n Z
1X
≤ lim inf |∆i f ◦ ϕn |2 dµn
2 i=1
Z
≤ · · · ≤ 2 · |f 0 |2 dγ
Central Limit Theorem with constant α = 2.

Similarly: Poincaré inequality with λ = 1 (Exercise).
Central Limit Theorem with respect to log concave probability measures:

Stochastic gradient flow in Rn :
dXt = dBt − (∇H)(Xt ) dt, H ∈ C 2 (Rn )
Generator:
1
L = ∆ − ∇H · ∇
2
µ(dx) = e−H(x) dx satisfies L ∗ µ = 0
Assumption: There exists a κ > 0 such that
∂ 2 H(x) ≥ κ · I ∀ x ∈ Rn
2
i.e. ∂ξξ H ≥ κ · |ξ|2 ∀ ξ ∈ Rn
Remark . The assumption implies the inequalities
x · ∇H(x) ≥ κ · |x|2 − c, (4.8)

κ
H(x) ≥ |x|2 − c̃ (4.9)
2
with constants c, c̃ ∈ R. By (4.8) and a Lyapunov argument it can be shown that Xt does not ex-
plode in finite time and that pt (A0 ) ⊆ A where A0 = span (C∞ 0 (R ), 1), A = span (S (R ), 1).
n n
By (4.9), the measure µ is finite, hence by our results above, the normalized measure is a station-
ary distribution for pt .
Lemma 4.23. If Hess H ≥ κI then
|∇pt f | ≤ e−κt pt |∇f | f ∈ Cb1 (Rn )

Remark . 1. Actually, both statements are equivalent.
2. If we replace Rn by an arbitrary Riemannian manifold the same assertion holds under the
assumption
Ric + Hess H ≥ κ · I
(Bochner-Lichnerowicz-Weitzenböck).
Informal analytic proof:

1
∇L f = ∇ ∆ − ∇H · ∇ f
2

1 2
= ∆ − ∇H · ∇ − ∂ H ∇f
2
*
=:L operator on one-forms (vector fields)
This yields to the Evolution equation for ∇pt f :
∂ ∂ *
∇pt f = ∇ pt f = ∇L pt f =L ∇pt f
∂t ∂t
and hence
∂

∂ ∂ 1
∂t
∇pt f · ∇pt f
|∇pt f | = (∇pt f · ∇pt f ) 2 =
∂t ∂t |∇pt f |
*
L ∇pt f · ∇pt f L ∇pt f · ∇pt f |∇pt f |2
= ≤ −κ·
|∇pt f | |∇pt f | |∇pt f |
≤ · · · ≤ L |∇pt f | − κ |∇pt f |
We get that v(t) := eκt ps−t |∇pt f | with 0 ≤ t ≤ s satisfies
v 0 (t) ≤ κv(t) − ps−t L |∇pt f | + ps−t L |∇pt f | − κps−t |∇pt f | = 0
and hence
eκs |∇ps f | = v(s) ≤ v(0) = ps |∇f |
• The proof can be made rigorous by approximating | · | by a smooth function, and using
regularity results for pt , cf. e.g. Deuschel, Stroock[8].
• The assertion extends to general diffusion operators.

Probabilistic proof: pt f (x) = E[f (Xtx )] where Xtx is the solution flow of the stochastic differ-
ential equation
dXt = dBt − (∇H)(Xt ) dt, i.e.,

Zt
Xtx = x + Bt − (∇H)(Xsx ) ds
0
By the assumption on H one can show that x → Xtx is smooth and the derivative flow Ytx =
∇x Xt satisfies the differentiated stochastic differential equation
dYtx = −(∂ 2 H)(Xtx )Ytx dt,

Y0x = I
which is an ordinary differential equation. Hence if ∂ 2 H ≥ κI then for v ∈ Rn ,

d
|Yt · v|2 = −2 Yt · v, (∂ 2 H)(Xt )Yt · v Rn ≤ κ · |Yt · v|2

dt
where Yt · v is the derivative of the flow in direction v. Hence
|Yt · v|2 ≤ e−2κt |v|

⇒ |Yt · v| ≤ e−κt |v|
This implies that for f ∈ Cb1 (Rn ), pt f is differentiable and
v · ∇pt f (x) = E [(∇f (Xtx ); Ytx · v)]

≤ E [|∇f (Xtx )|] · e−κt · |v| ∀ v ∈ Rn
i.e.
|∇pt f (x)| ≤ e−κt pt |∇f |(x)
Theorem 4.24 (BAHRY-E MERY). Suppose that
∂2H ≥ κ · I with κ > 0
Then
f2
Z Z
1
2
f log 2
dµ ≤ |∇f |2 dµ ∀ f ∈ C0∞ (Rn )
kf kL2 (µ) κ
Remark . The inequality extends to f ∈ H 1,2 (µ) where H 1,2 (µ) is the closure of C0∞ with respect
to the norm Z 21
2 2
kf k1,2 := |f | + |∇f | dµ
Proof. g ∈ span(C0∞ , 1), g ≥ δ ≥ 0.

Aim:
√ 2
Z Z Z Z
1
g log g dµ ≤ |∇ g| dµ + g dµ log g dµ
κ
Then g = f 2 and we get the assertion.
Idea: Consider Z
u(t) = pt g log pt g dµ
Claim:
R
(i) u(0) = g log g dµ
R R
(ii) limt↑∞ u(t) = g dµ log g dµ
R √ 2
(iii) −u0 (t) ≤ 2e−2κt ∇ g dµ
By (i), (ii) and (iii) we then obtain:
Z Z Z
g log g dµ − g dµ log g dµ = lim (u(0) − u(t))
t→∞
Zt
= lim −u0 (t) ds
t→∞
0
√
Z
≤ |∇ g|2 dµ
R∞
where 2 0
e−2κs ds = κ1 .
Proof of claim: (i) Obvious.
(ii) Ergodicity yields to Z
pt g(x) → g dµ ∀x
for t ↑ ∞.
In fact:
|∇pt g| ≤ e−κt pt |∇g| ≤ e−κt |∇g|
and hence
|pt g(x) − pt g(y)| ≤ e−κt sup |∇g| · |x − y|
which leads to
Z Z

pt g(x) − g dµ = (pt g(x) − pt g(y)) µ(dy)

Z
≤ e−κt sup |∇g| · |x − y| µ(dy) → 0
Since pt g ≥ δ ≥ 0, dominated convergence implies that

Z Z Z
pt g log pt δ dµ → g dµ log g dµ
(iii) Key Step! By the computation above (decay of entropy) and the lemma,
|∇pt g|2
Z Z
0 1 1
−u (t) = + ∇pt g · ∇ log pt g dµ = dµ
2 2 pt g
1 −κt |pt ∇g|2 |∇g|2
Z Z
1 −2κt
≤ e dµ ≤ e pt dµ
2 pt g 2 g
1 −2κt |∇g|2 √
Z Z
= e dµ = 2e −2κt
|∇ g|2 dµ
2 g
Example . An Ising model with real spin: (Reference: Royer [19])

S = RΛ = {(xi )i∈Λ | xi ∈ R}, Λ ⊂ Zd finite.
1
µ(dx) = exp(−H(x)) dx
Z
X 1 X X
H(x) = V (xi ) − ϑ(i − j) xi xj − ϑ(i − j)xi zj ,
| {z } 2 | {z }
i∈Λ potential i,j∈Λ interactions i∈Λ,j∈Zd \Λ
where V : R → R is a non-constant polynomial, bounded from below, and ϑ : Z → R is a

function such that ϑ(0) = 0, ϑ(i) = ϑ(−i) ∀ i, (symmetric interactions), ϑ(i) = 0 ∀ |i| ≥ R
d
(finite range), z ∈ RZ \Λ fixed boundary condition.
Glauber-Langevin dynamics:
∂H
dXti = − (Xt ) dt + dBti , i∈Λ (4.10)
∂xi
Dirichletform: Z
1X ∂f ∂g
E (f, g) = dµ
2 i∈Λ ∂xi ∂xi
Corollary 4.25. If X
inf V 00 (x) > |ϑ(i)|
x∈R
i∈Z
then E satisfies a log Sobolev inequality with constant independent of Λ.
Proof.
∂2H
(x) = V 00 (xi ) · δij − ϑ(i − j)
∂xi ∂xj
!
X
⇒ ∂ 2 H ≥ inf V 00 − |ϑ(i)| · I
i
in the sense of ???.

Consequence: There is a unique Gibbs measure on Zd corresponding to H, cf. Royer [19].

What can be said if V is not convex?
Theorem 4.26 (Bounded perturbations). µ, ν ∈ M1 (Rn ) ??? absolut continuous,
dν 1
(x) = e−U (x) .
dµ Z
If
f2
Z Z
2
f log dµ ≤ α · |∇f |2 dµ ∀ f ∈ C∞
0
kf k2L2 (µ)
then
f2
Z Z
2
f log 2
dν ≤ α · eosc(U ) · |∇f |2 dν ∀ f ∈ C0∞
kf kL2 (ν)
where
osc(U ) := sup U − inf U
Proof.
|f |2
Z Z
2
f log dν ≤ f 2 log f 2 − f 2 log kf k2L2 (µ) − f 2 + kf k2L2 (µ) dν (4.11)
kf k2L2 (ν)
since
|f |2
Z Z
2
f log dν ≤ f 2 log f 2 − f 2 log t2 − f 2 + t2 dν ∀t > 0
kf k2L2 (ν)
Note that in (4.11) the integrand on the right hand side is non-negative. Hence
|f |2
Z Z
2 1 − inf U 2 2 2 2 2 2

f log dν ≤ · e f log f − f log kf kL2 (µ) − f + kf k L2 (µ) dµ
kf k2L2 (ν) Z
f2
Z
1 − inf U
= e · f 2 log dµ
Z kf k2L2 (µ)
Z
1 − inf U
≤ ·e α |∇f |2 dµ
Z
Z
sup U −inf U
≤e α |∇f |2 dν
Example . We consider the Gibbs measures µ from the example above

4.6. CONCENTRATION OF MEASURE 125
1. No interactions:
X x2
i
H(x) = + V (xi ) , V : R → R bounded
i∈Λ
2
Hence O
µ= µV
i∈Λ
where
µV (dx) ∝ e−V (x) γ(dx)
and γ(dx) is the standard normal distribution. Hence µ satisfies the logarithmic Sobolev
inequality with constant
α(µ) = α(µV ) ≤ e− osc(V ) α(γ) = 2 · e− osc(V )
by the factorization property. Hence we have independence of dimension!
2. Weak interactions:
X x2 X X
i
H(x) = + V (xi ) − ϑ xi xj − ϑ xi zj ,
i∈Λ
2 i,j∈Λ i∈Λ
|i−j|=1 j ∈Λ
/
|i−j|=1
ϑ ∈ R. One can show:
Theorem 4.27. If V is bounded then there exists β > 0 such that for ϑ ∈ [−beta, β] a
logarithmic Sobolev inequality with constant independent of λ holds.
The proof is based on the exponential decay of correlations Covµ (xi , xj ) for Gibbs mea-
sure, cf. ???, Course ???.
3. Discrete Ising model: One can show that for β < βc (???) a logarithmic Sobolev in-
equality holds on{−N, . . . , N }d with constant of Order O(N 2 ) independent of the bound-
ary conditions, whereas for β > βc and periodic boundary conditions the spectral gap,
and hence the log Sobolev constant, grows exponentially in N , cf. [???].
4.6 Concentration of measure

(Ω, A , P ) probability space, Xi : Ω → Rd independent identically distributed, ∼ µ.
Law of large numbers:
N Z
1 X
U (Xi ) → U dµ U ∈ L 1 (µ)
N i=1
Cramér:
" #
1 X N Z
P U (Xi ) − U dµ ≥ r ≤ 2 · e−N I(r) ,

N
i=1
Z
tU
I(r) = sup tr − log e dµ LD rate function.
t∈R
Hence we have
• Exponential concentration around mean value provided I(r) > 0 ∀ r 6= 0
•
" #
N
r2
1 X Z
N r2

P U (Xi ) − U dµ ≥ r ≤ e− c provided I(r) ≥

N c
i=1
Gaussian concentration.
When does thisRhold? Extension to non independent identically distributed case? This leads to:
Bounds for log etU dµ !
Theorem 4.28 (H ERBST). If µ satisfies a logarithmic Sobolev inequality with constant α then
for any Lipschitz function U ∈ Cb1 (Rd ):
(i)
Z Z
1 tU α
log e dµ ≤ t + U dµ ∀t > 0 (4.12)
t 4
where 1t log etUR dµ can be seen as the free energy at inverse temperature t, α
R
4
as a bound
for entropy and U dµ as the average energy.
(ii) Z
r2
µ U≥ U dµ + r ≤ e− α
Gaussian concentration inequality

In particular,
(iii) Z
2 1
eγ|x| dµ < ∞ ∀γ <
α
Remark . Statistical mechanics:

Ft = t · S − hU i
where Ft is the free energy, t the inverse temperature, S the entropy and hU i the potential.
4.6. CONCENTRATION OF MEASURE 127
tU
Proof. WLOG, 0 ≤ ε ≤ U ≤ 1ε . Logarithmic Sobolev inequality applied to f = e 2 :
Z Z 2 Z Z
tU t 2 tU
tU e dµ ≤ α |∇U | e dµ + e dµ log etU dµ
tU
2
R tU
For Λ(t) := log e dµ this implies
R tU
R
tU e dµ αt 2
|∇U |2 etU dµ αt2
tΛ0 (t) = R tU ≤ R + Λ(t) ≤ + Λ(t)
e dµ 4 etU dµ 4
since |∇U | ≤ 1. Hence
d Λ(t) tΛ0 (t) − Λ(t) α
= ≤ ∀t > 0
dt t t2 4
Since
Z
0 2
Λ(t) = Λ(0) + t · Λ (0) + O(t ) = t U dµ + O(t2 ),
we obtain
Z
Λ(t) α
≤ U dµ + t,
t 4
i.e. (i).
(ii) follows from (i) by the Markov inequality, and (iii) follows from (ii) with U (x) = |x|.
Corollary 4.29 (Concentration of empirical measures). Xi independent identically distributed,

∼ µ. If µ satisfies a logarithmic Sobolev inequality with constant α then
" #
1 X N
N r2

P U (Xi ) − Eµ [U ] ≥ r ≤ 2 · e− 4α

N
i=1
for any Lipschitz function U ∈ Cb1 (Rd ), N ∈ N and r > 0.
Proof. By the factorization property, µN satisfies a logarithmic Sobolev inequality with constant
α as well. Now apply the theorem to
N
1 X
Ũ (x) := √ U (xi )
N i=1
noting that  
∇U (x1 )
1  ..
∇Ũ (x1 , . . . , xn ) = √ 

. 
N
∇U (xN )
hence since U is Lipschitz,
N
! 12
1 X
∇Ũ (x) = √ |∇U (xi )|2 ≤1

N i=1
Chapter 5
5.1 Ergodic averages

(Xt , Px ) canonical Markov process on (Ω, F), i.e.
Xt (ω) = ω(t)
pt transition semigroup,
Θt (ω) = ω(· + t) shift operator

Xt (Θs (ω)) = Xt+s (ω)
µ stationary distribution for pt . Hence (Xt , Pµ ) is a stationary process, i.e.
Pµ ◦ Θ−1
t = Pµ ∀ t ≥ 0,
(Ω, F, Pµ , (Θt )t≥0 ) is a dynamical system where Θt are measure preserving maps, Θt+s = Θt ◦
Θs .
Definition 5.1.
ϑ := A ∈ F : Θ−1

t (A) = A ∀ t ≥ 0
σ-algebra of shift-invariant events. The dynamical system (Ω, F, Pµ , (Θt )t≥0 ) is called ergodic
if and only if
Pµ [A] ∈ {0, 1} ∀ A ∈ ϑ
or, equivalently,
F ∈ L 2 (Pµ ), F ◦ Θt = F Pµ -a.s. ⇒ F = const. Pµ -a.s.
Theorem 5.2 (Ergodic theorem).
Zt
1
F (Θs (ω)) ds → Eµ [F | ϑ](ω)
t
0
129
130 CHAPTER 5.
a) Pµ -a.s. and L1 (Pµ ) for all F ∈ L1 (Pµ )
b) in L2 (Pµ ) for F ∈ L2 (Pµ )
In particular
Zt
1
F ◦ Θs ds → Eµ [F ] Pµ -a.s. if ergodic.
t
0
Proof. cf. e.g. Stroock[21].
Remark . R 1. The Ergodic theorem implies Px -a.s. convergence for µ-almost every x (since
Pµ = Px µ(dx)).
2. In general Px -a.s. convergence for fixed x does not hold!
Example . Ising model with Glauber dynamics on Z2 , β > βcrit (low temperature regime). It
−
follows that there exist two extremal stationary distributions µ+
β and µβ . Pµ+ and Pµ− are both
β β
ergodic. Hence
Zt (
1 Eµ+ [F ] Pµ+ -a.s.
F ◦ Θs ds → β β
t Eµ− [F ] Pµ− -a.s.

0 β β
−
No assertion for the initial distribution ν ⊥ µ+
β , µβ .
When are stationary Markov processes ergodic?

Let (L, Dom(L)) denote the generator of (pt )t≥0 on L2 (µ).
Theorem 5.3. The following assertions are equivalent:
(i) Pµ is ergodic
(ii) ker L = span{1}, i.e.
h ∈ L 2 (µ)harmonic ⇒ h = const. µ-a.s.
(iii) pt is µ-irreducible, i.e.
B ∈ S such that pt IB = IB µ-a.s. ∀ t ≥ 0 ⇒ µ(B) ∈ {0, 1}
If reversibility holds then (i)-(iii) are also equivalent to:

5.1. ERGODIC AVERAGES 131
(iv) pt is L2 (µ)-ergodic, i.e.

Z

pt f − f dµ
→ 0 ∀ f ∈ L2 (µ)
L2 (µ)
Proof. (i)⇒(ii) If h is harmonic then h(Xt ) is a martingale. Hence when we apply the L2 mar-
tingale convergence theorem,
h(Xt ) → M∞ in L2 (Pµ ), M∞ ◦ Θt = M∞
and since ergodicity holds

M∞ = const. Pµ -a.s.
hence
h(X0 ) = Eµ [M∞ | F0 ] = const. Pµ -a.s.
and we get that h =const. µ-a.s.
(ii)⇒(iii) h = IB
(iii)⇒(i) If A ∈ ϑ then IA is shift-invariant. Hence h(x) = Ex [IA ] is harmonic since, applying

the Markov property,
pt h(x) = Ex [EXt [IA ]] = Ex [IA ◦ Θt ] = Ex [IA ] = h(x).
Applying the Markov property once again gives
h(Xt ) = EXt [IA ] = Eµ [IA ◦ Θt | Ft ] → IA Pµ -a.s.
if t ↑ ∞. Hence, applying the stationarity,
µ ◦ h−1 = Pµ ◦ (h(Xt ))−1 → Pµ ◦ IA−1

⇒ h ∈ {0, 1} µ-a.s.
⇒ ∃ B ∈ S : h = IB µ-a.s., pt IB = IB µ-a.s.
and irreducibility gives
h = IB = const. µ-a.s.
(iii)⇔(iv) If reversibility holds, the assertion follows from the spectral theorem:
pt symmetric C0 semigroup on L2 (µ), generator L self-adjoint and negative definite. Hence
Z0
pt f = etL f = etλ dP((−∞,λ]) (f ) → P{0} f = Projection of f onto ker L
−∞
132 CHAPTER 5.
Example . 1. Rotation on S 1 :
d
L= dϕ
Uniform distribution is stationary and ergodic but

pt f (x) = f eit x

does not converge.

2. S = Rn , pt irreducible (i.e. there is a t ≥ 0 such that for all x ∈ Rn , U ⊂ Rn open:
pt (x, U ) > 0) and strong Feller (i.e. if f is bounded and measurable then pt f is continuous)
then pt is µ-irreducible and Pµ is ergodic.
E.g. for Itô diffusions: aij (x), b(x) locally Hölder continuous and (aij ) non-degenerate,
then ergodicity holds.
5.2 Central Limit theorem for Markov processes

Let (Mt )t≥0 be a continuous square-integrable (Ft ) martingale and Ft a filtration satisfying the
usual conditions. Then Mt2 is a submartingale and there exists a unique natural (e.g. continuous)
increasing process hM it such that
Mt2 = martingale + hM it
(Doob-Meyer decomposition, cf. e.g. Karatzas, Shreve [12]).
Example . If Nt is a Poisson process then

Mt = Nt − λt
is a martingale and
hM it = λt
almost sure.
5.2. CENTRAL LIMIT THEOREM FOR MARKOV PROCESSES 133
Note: For discontinuous martingales, hM it is not the quadratic variation of the paths!
Identification of bracket process for martingales corresponding to Markov

processes:
(2)
(Xt , Pµ ) stationary Markov process, LL , L(1) generator on L2 (µ), L1 (µ), f ∈ Dom(L(1) ) ⊇
Dom(L(2) ). Hence
Zt
f
f (Xt ) = Mt + (L(1) f )(Xs ) ds Pµ -a.s.
0
and M f is a martingale.
Theorem 5.4. Suppose f ∈ Dom(L(2) ) with f 2 ∈ Dom(L(1) ). Then
Zt
f
hM it = Γ(f, f )(Xs ) ds Pµ -a.s.
0
where
Γ(f, g) = L(1) (f · g) − f L2 g − gL(2) f ∈ L1 (µ)
is called Carré du champ (square field) operator.
Proof. We write A ∼ B if and only if A − B is a martingale. Hence

 2
2 Zt
Mtf = f (Xt ) − Lf (Xs ) ds
0
 2
Zt Zt
= f (Xt )2 − 2f (Xt ) Lf (Xs ) ds +  Lf (Xs ) ds
0 0
where
Zt
2
f (Xt ) ∼ Lf 2 (Xs ) ds
0
and, applying Itô,
Zt Zt Z t Zr
2f (Xt ) Lf (Xs ) ds = 2 f (Xs )Lf (Xr ) dr + 2 Lf (Xs ) ds df (Xr )
0 0 0 0
134 CHAPTER 5.
where
Zr
f (Xr ) ∼ Lf (Xs ) ds
0
Hence  2
2 Zt Zt
Mtf ∼2 f (Xr )Lf (Xr ) dr +  Lf (Xs ) ds
0 0
Example . Diffusion in Rn ,
1X ∂2
L= aij (x) + b(x) · ∇
2 i,j ∂xi ∂xj
Hence
X ∂f ∂g 2
Γ(f, g)(x) = aij (x) (x) (x) = σ T (x)∇f (x)Rn
i,j
∂xi ∂xj
C0∞ (Rn ).
for all f, g ∈ Results for gradient diffusions on Rn (e.g. criteria for log Sobolev) extend
to general state spaces if |∇f |2 is replaced by Γ(f, g)!
Connection to Dirichlet form:

Z Z Z
1 1
E (f, f ) = − f L f dµ +
(2) (1) 2
L f dµ = Γ(f, f ) dµ
2 2
| {z }
=0
Reference: Bouleau, Hirsch [6].
Application 1: Maximal inequalities.

h pi
Z
p p
f p f 2 −1
E sup |Ms | ≤ Cp · E hM it ≤ Cp · t 2 Γ(f, f ) 2 dµ
s≤t
This is an important estimate for studying convergence of Markov processes!
Application 2: Central limit theorem for ergodic averages.
Theorem 5.5 (Central limit theorem for martingales). (Mt ) square-integrable martingale on
(Ω, F, P ) with stationary increments (i.e. Mt+s − Ms ∼ Mt − M0 ), σ > 0. If
1
hM it → σ 2 in L1 (P )
t
then
Mt D
√ → N (0, σ 2 )
t
5.2. CENTRAL LIMIT THEOREM FOR MARKOV PROCESSES 135
For the proof cf. e.g. Landim [14], Varadhan [22].
Corollary 5.6 (Central limit theorem for Markov processes (elementary version)). Let (Xt , Pµ )
be a stationary ergodic Markov process. Then for f ∈ Range(L), f = Lg:
Zt
1 D
√ f (Xs ) ds → N (0, σf2 )
t
0
where Z
σf2 =2 g(−L)g dµ = 2E (g, g)
Remark . 1. If µ is stationary then

Z Z
f dµ = Lg dµ = 0
i.e. the random variables f (Xs ) are centered.
2. ker(L) = span{1} by ergodicity

Z
⊥ 2
(ker L) = f ∈ L (µ) : f dµ = 0 =: L20 (µ)
If L : L20 (µ) → L2 (µ) is bijective with G = (−L)−1 then the Central limit theorem holds
for all f ∈ L2 (µ) with
σf2 = 2(Gf, (−L)Gf )L2 (µ) = 2(f, Gf )L2 (µ)
(H −1 norm if symmetric).
Example . (Xt , Pµ ) reversible, spectral gap λ, i.e.,
spec(−L) ⊂ {0} ∪ [λ, ∞)

hence there is a G = (−L )−1 , spec(G) ⊆ [0, λ1 ] and hence

L20 (µ)
2
σf2 ≤ kf k2L2 (µ)
λ
is a bound for asymptotic variance.
136 CHAPTER 5.
Proof of corollary.
Zt
1 g(Xt ) − g(X0 ) Mtg
√ f (Xs ) ds = √ + √
t t t
0
Zt
hM g it = Γ(g, g)(Xs ) ds Pµ -a.s.
0
and hence by the ergodic theorem

Z
1 t↑∞
hM g it → Γ(g, g) dµ = σf2
t
The central limit theorem for martingales gives
D
Mtg → N (0, σf2 )
Moreover
1
√ (g(Xt ) − g(X0 )) → 0
t
in L2 (Pµ ), hence in distribution. This gives the claim since
D D D
Xt → µ, Yt → 0 ⇒ Xt + Yt → µ
Extension: Range(L) 6= L2 , replace −L by α − L (bijective), then α ↓ 0. Cf. Landim [14].

Bibliography
[1] A LDOUS , D. & F ILL , J. Reversible Markov Chains and Random Walks on Graph. Avail-
able online: http://www.stat.berkeley.edu/˜aldous/RWG/book.html
[2] A N É , C. & B LACH ÈRE , S. & C HAFA Ï , D. & F OUG ÈRES , P. & G ENTIL , I. & M ALRIEU ,
F. & ROBERTO , C. & S CHEFFER , G. (2000) Sur les inégalités de Sobolev logarithmiques
Panoramas et Synthèses 10, Société Mathématique de France.
[3] A PPLEBAUM , D. (2004) Lévy processes and stochastic calculus Cambridge University
Press.
[4] A SMUSSEN , S. (2003) Applied probability and queues. Springer.
[5] B ERTOIN , J. (1998) Lévy processes. Cambridge University Press.
[6] B OULEAU , N. & H IRSCH , F. (1991) Dirichlet Forms and Analysis on Wiener Space.
Gruyter.
[7] B R ÉMAUD , P. (2001) Markov chains : Gibbs fields, Monte Carlo simulation, and queues.
Springer.
[8] D EUSCHEL , J.-D. & S TROOCK , D.W. (1989) Large deviations. Series in Pure and Ap-
plied Mathematics Vol. 137, Academic Press Inc.
[9] D URETT, R. (1993) Ten Lectures on Particle Systems. St. Flour Lecture notes. Springer.
[10] F ELLER , W. (1991) An Introduction to Probability Theory and Its Applications, Volume 1
and 2. John Wiley & Sons.
[11] G ROSS , L. & FABES , E. & F UKUSHIMA , M. & K ENIG , C. & R ÖCKNER , M. &
S TROOCK , D.W. & G IANFAUSTO D ELL’A NTONIO & U MBERTO M OSCO (1992) Dirich-
let Forms: Lectures given at the 1st Session of the Centro Internazionale Matematico Es-
tivo (C.I.M.E.) held in Varenna, Italy, June 8-19, 1992 (Lecture Notes in Mathematics).
Springer.
[12] K ARATZAS , I. & S HREVE , S. E. (2005) Brownian Motion and Stochastic Calculus.
Springer.
[13] K IPNIS , C. & L ANDIM , C. (1998) Scaling limits of interacting particle systems. Springer.
137
138 BIBLIOGRAPHY
[14] L ANDIM , C. (2003) Central Limit Theorem for Markov processes, from Classical to Mod-
ern Probability CIMPA Summer School 2001, Picco, Pierre; San Martin, Jaime (Eds.),
Progress in Probability 54, 147207, Birkhäuser.
[15] L IGGETT, T.M. (2004) Interacting particle systems. Springer.
[16] N ORRIS , J.R. (1998) Markov chains. Cambridge University Press.
[17] R EVUZ , D. & YOR , M. (2004) Continuous martingales and Brownian motion. Springer.
[18] ROGERS , L.C.G. & W ILLIAMS , D. (2000) Diffusions, Markov Processes and Martin-
gales. Cambridge University Press.
[19] ROYER , G. (2007) An initiation to logarithmic Sobolev inequalities. American Mathemat-

ical Society.
[20] S TROOCK , D.W. (2005) An introduction to Markov processes. Springer.
[21] S TROOCK , D.W. (2000) Probability Theory, an Analytic View. Cambridge University
Press.
[22] VARADHAN , S.R.S. (2001) Probability Theory. Oxford University Press.

Index
χ2 -contrast, 104 Heath bath dynamics, 46

heavy tails, 77
accessible state, 35 Herbst theorem, 126
attractive process, 56 holding times, 9
Birth-and-death chain, 10 infinitely divisible random variables, 72
infinitisemal generator, 19
Carré du champ operator, 133
integrated backward equation, 17
Chapman-Kolmogorov equation, 7
intensity matrix, 19
characteristic exponent, 73
invariant measure, 37
communicating states, 35
invariant process with respect to time reversal,
Contact process, 46
96
continuous-time branching process, 22
irreducible chain, 35
convolution semigroup, 72
Ising model, 46
detailed balance condition, 40 Ising Hamiltonian, 60
Dirichlet form, 100
Dobrushin coefficient, 109 jumping times, 9
dynamical system, 129 kernel, 19
ergodic, 129 Kolmogorov’s backward equation, 20
effective resistance, 42 Kolmogorov’s forward equation, 23
empirical distribution, 48 Lévy process, 71
Ergodic theorem, 43, 129 Lévy measure, 76
excessive, 27
Lévy-Itô representation, 83
Feller property, 53 locally bounded function, 26
Fokker-Planck equation, 24 Logarithmic Sobolev inequality, 111
Forward equation, 54 Lyapunov condition, 29
Gaussian concentration inequality, 126 Markov chain, 5

general birth-and-death process, 22 continuous time, 8
generator of a semigroup of linear contractions, Markov process, 7
68 conservative, 26
Gibbs measures, 61 non-explosive, 26
Gibbs sampler, 46 time-homogeneous, 7
Glauber-Langevin dynamics, 123 Maximum principle, 69
Green function of a Markov process, 34 mean commute time, 41
mean hitting times, 41
Hamming distance, 47 Mean-field Ising model, 49
139
140 INDEX
mean-field model, 48
minimal chain, 17
partial order on configurations, 55

Peierl’s theorem, 63
phase transition, 61
Pinsker’s inequality, 108
Poincaré inequality, 101
Poisson equation, 33
Poisson point process, 65
Poisson process, 10
compound, 10
Poisson random field, 64
Poisson random measure, 64
positive recurrent, 37
pure jump process, 8
Q-matrix, 20
recurrent chain, 35
recurrent state, 34
relative entropy, 104
Reuter’s criterion, 32
scale function, 93
scale-invariant Lévy process, 77
space-time harmonic, 27
spatial Poisson process, 64
speed measure, 94
stationary measure, 37
stochastic dominance, 56
Strong Markov property of a Markov process,
8
Strong Markov property of a Markov chain, 7
strongly continuous semigroup of linear con-
tractions, 68
superharmonic, 27
symmetric measure, 80
total variation distance, 103

total variation norm, 16
transient chain, 35
transient state, 34
transition probabilities, 5
Voter model, 46

MPSkript

Uploaded by

Copyright:

Available Formats

MPSkript

Uploaded by

Document Information

Copyright

Available Formats

Share this document

Share or Embed Document

Sharing Options

Did you find this document useful?

Is this content inappropriate?

Copyright:

Available Formats

MPSkript

Uploaded by

Copyright:

Available Formats

Markov processes

University of Bonn, summer term 2008

1 Continuous-time Markov chains 5

2 Interacting particle systems 45

3 Markov semigroups and Lévy processes 67

Continuous-time Markov chains

Additional reference for this chapter:

1.1 Markov properties in discrete and continuous time

Example (Non-linear state space model). Consider

Remark . 1. Longer transitions:

P [Xn ∈ B|Fm ] = (pn · · · pm+2 pm+1 )(Xn , B)

Time-homogeneous case: pn = p ∀n, pm,n = pn−m .

2. Reduction to time-homogeneous case:

p̂((n, x), {m} × B) = δm,n+1 · pn+1 (x, B).

3. Canonical model for space-time chain with start in (n, x):

Let T : Ω → {0, 1, . . .} ∪ {∞} be a (Fn )-stopping time, i.e.

Theorem 1.2 (Strong Markov property).

Proof. Exercise (Consider first T = n).

1.2 From discrete to continuous time:

(ii) P [Xt ∈ B|Fs ] = ps,t (Xs , B) P -a.s. ∀ 0 ≤ s ≤ t, B ∈ S

Lemma 1.4. The transition functions of a Markov process satisfy

2. ps,t pt,u f = ps,u f Chapman-Kolmogorov equation

P ◦ X −1 -a.s. for every f : S → R and 0 ≤ s ≤ t ≤ u.

Proof. 1. (ps,s f )(Xs ) = E[f (Xs )|Fs ] = f (Xs ) P -a.s.

Remark . 1. Time-homogeneous case: ps,t = pt−s

• in applications, ps,t is usually not known explicitly.

We take a more constructive approach instead.

Let (Xt , P ) be an (Ft )-Markov process with transition functions ps,t .

Lemma 1.6. If the strong Markov property holds then

P -a.s. on {T < ∞} for all measurable functions F : S [0,∞) → R+ .

PC(R+ , S) := {x : [0, ∞) → S | ∀t ≥ 0 ∃ε > 0 : x constant on [t, t + ε)}

Let qt : S × S → [0, ∞] be a kernel of positive measure, i.e. x 7→ qt (x, A) is measurable and

Pt,t+h (x, B) = qt (x, B) · h + o(h) ∀ t ≥ 0, x ∈ S, B ⊆ S \ {x} measurable

Construction of a process with initial distribution µ ∈ M1 (S):

Assumption: λt (x) < ∞ ∀x ∈ S , no instantaneous jumps.

where ∆ is an extra point, called the cemetery.

Example . 1) Poisson process with intensity λ > 0

S = {0, 1, 2, . . .}, q(x, y) = λ · δx+1,y , λ(x) = λ ∀x, π(x, x + 1) = 1

2) Continuization of discrete time chain

Xt = YNt , Nt Poisson(1)-process independent of (Yn ),

e.g. compound Poisson process (continuous time random walk):

Zi : Ω → Rd independent and identically distributed random variables, independent of

rate d(x) rate b(x)

P [T ≤ t + h|T > t] = λt · h + o(h)

Construction of time-inhomogeneous jump process:

Fix t0 ≥ 0 and µ ∈ M1 (S) (the initial distribution at time t0 ).

Explicit (algorithmic) construction:

Example (Non-homogeneous Poisson process on R+ ).

S = {0, 1, 2, . . .}, qt (x, y) = λt · δx+1,y ,

high intensity low intensity

Lemma 1.9. (Exercise)

2. (Jn , Yn ) is a Markov chain with respect to G̃ = σ(Y0 , . . . , Yn , E1 , . . . , En ) with transition

Remark . 1. Jn strictly increasing.

2. Jn = ∞ ∀ n, m is possible Xt absorbed in state Yn−1 .

3. sup Jn < ∞ explosion in finite time

4. {s < ζ} = {Xs 6= ∆} ∈ Fs no explosion before time s.

{Ks < ∞} = {s < ζ}

Lemma 1.10 (Memoryless property). Let s ≥ t0 . Then for all t ≥ s,

where A ∩ {Ks = n} ∩ {Jn > s} = A ∩ {Ks = n}.

For yn ∈ S, tn ∈ [0, ∞] strictly increasing define