Abstract
In this paper we analyze the ergodic properties of continuous time Markov chains with values on the one-dimensional spin lattice \(\{1,\dots,d\}^{{\mathbb{N}}}\) (also known as the Bernoulli space). Initially, we consider as the infinitesimal generator the operator \(L=\mathcal{L}_A - I\), where \(\mathcal{L}_A\) is a discrete time Ruelle operator (transfer operator), and \(A:\{1,\dots,d\}^{{\mathbb{N}}}\to\mathbb{R}\) is a given fixed Lipschitz function. The associated continuous time stationary Markov chain defines the a priori probability.
Given a Lipschitz interaction \(V:\{1,\dots,d\}^{{\mathbb{N}}}\to\mathbb{R}\), we are interested in the Gibbs (equilibrium) state for such a V. This will be another continuous time stationary Markov chain. In order to analyze this problem we use a continuous time Ruelle operator (transfer operator) naturally associated to V. Among other things, we show that a continuous time Perron-Frobenius Theorem holds in the case where V is a Lipschitz function.
We also introduce an entropy, which is negative (see also Lopes et al. in Entropy and Variational Principle for one-dimensional Lattice Systems with a general a-priori probability: positive and zero temperature. arXiv, 2012), and we consider a variational principle of pressure. Finally, we analyze large deviations properties for the empirical measure in the continuous time setting using results by Y. Kifer (Trans. Am. Math. Soc. 321(2):505–524, 1990). In the last appendix of the paper we explain why the techniques we develop here can be applied to the analysis of convergence of a certain version of the Metropolis algorithm.
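To make the discrete time building block concrete, the following is a minimal sketch of a Ruelle operator \(\mathcal{L}_A f(x)=\sum_{\sigma(y)=x}e^{A(y)}f(y)\) acting on functions of finitely many coordinates; the alphabet size and the constant normalized potential are our illustrative assumptions, not the paper's.

```python
import math

D = 2  # alphabet {0, 1} for simplicity (the paper works with {1,...,d})

def A(y):
    # hypothetical normalized potential: with A = -log(D), we get L_A(1) = 1
    return -math.log(D)

def ruelle(f, x):
    """(L_A f)(x) = sum over the preimages y = a.x (prepend a symbol) of e^{A(y)} f(y)."""
    return sum(math.exp(A((a,) + x)) * f((a,) + x) for a in range(D))

one = lambda x: 1.0
val = ruelle(one, (0, 1, 0))  # equals 1 by the normalization of A
```

Here points of the Bernoulli space are truncated to tuples, which suffices for functions depending on finitely many coordinates.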
References
Baraviera, A., Leplaideur, R., Lopes, A.O.: Selection of ground states in the zero temperature limit for a one-parameter family of potentials. SIAM J. Appl. Dyn. Syst. 11(1), 243–260 (2012)
Baraviera, A., Exel, R., Lopes, A.: A Ruelle Operator for continuous time Markov chains. São Paulo J. Math. Sci. 4(1), 1–16 (2010)
Baraviera, A.T., Cioletti, L.M., Lopes, A., Mohr, J., Souza, R.R.: On the general one dimensional XY model: positive and zero temperature, selection and non-selection. Rev. Math. Phys. 23(10), 1063–1113 (2011)
Baladi, V.: Positive Transfer Operators and Decay of Correlations. World Scientific, Singapore (2000)
Baladi, V., Smania, D.: Linear response formula for piecewise expanding unimodal maps. Nonlinearity 21(4), 677–711 (2008)
Berger, N., Kenyon, C., Mossel, E., Peres, Y.: Glauber dynamics on trees and hyperbolic graphs. Probab. Theory Relat. Fields 131(3), 311–340 (2005)
Contreras, G., Lopes, A.O., Thieullen, Ph.: Lyapunov minimizing measures for expanding maps of the circle. Ergod. Theory Dyn. Syst. 21, 1379–1409 (2001)
Craizer, M.: Teoria Ergódica das Transformações expansoras. Master dissertation, IMPA, Rio de Janeiro (1985)
Dembo, A., Zeitouni, O.: Large Deviations Techniques and Applications. Springer, Berlin (1998)
Donsker, M., Varadhan, S.: Asymptotic evaluation of certain Markov process expectations for large time I. Commun. Pure Appl. Math. 28, 1–47 (1975)
Deuschel, J.-D., Stroock, D.: Large Deviations. AMS, Providence (1989)
den Hollander, F.: Large Deviations. AMS, Providence (2000)
Diaconis, P., Saloff-Coste, L.: Nash inequalities for finite Markov chains. J. Theor. Probab. 9(2) (1996)
Diaconis, P., Saloff-Coste, L.: What do we know about the Metropolis algorithm? J. Comput. Syst. Sci. 57(1), 2–36 (1998). 27th Annual ACM Symposium on the Theory of Computing (STOC 95) (Las Vegas, NV)
Dupuis, P., Liu, Y.: On the large deviation rate function for the empirical measures of reversible jump Markov processes. arXiv (2013)
Ellis, R.: Entropy, Large Deviations, and Statistical Mechanics. Springer, Berlin (1985)
Ethier, S.N., Kurtz, T.G.: Markov Processes—Characterization and Convergence. Wiley, New York (2005)
Feng, J., Kurtz, T.: Large Deviations for Stochastic Processes. AMS, Providence (2006)
Kifer, Y.: Large deviations in dynamical systems and stochastic processes. Trans. Am. Math. Soc. 321(2), 505–524 (1990)
Kifer, Y.: Principal eigenvalues, topological pressure, and stochastic stability of equilibrium states. Isr. J. Math. 70(1), 1–47 (1990)
Kipnis, C., Landim, C.: Scaling Limits of Interacting Particle Systems. Grundlehren der Mathematischen Wissenschaften, vol. 320. Springer, Berlin (1999)
Lebeau, G.: Introduction à l'analyse de l'algorithme de Metropolis. Preprint. http://math.unice.fr/~sdescomb/MOAD/CoursLebeau.pdf
Leonard, C.: Large deviations for Poisson random measures and processes with independent increments. Stoch. Process. Appl. 85, 93–121 (2000)
Liggett, T.: Continuous Time Markov Processes. AMS, Providence (2010)
Lopes, A., Mengue, J., Mohr, J., Souza, R.R.: Entropy and Variational Principle for one-dimensional Lattice Systems with a general a-priori probability: positive and zero temperature. arXiv (2012)
Lopes, A.O.: An analogy of the charge distribution on Julia sets with the Brownian motion. J. Math. Phys. 30(9), 2120–2124 (1989)
Lopes, A.O.: Entropy and large deviations. Nonlinearity 3(2), 527–546 (1990)
Lopes, A.O., Mohr, J., Souza, R., Thieullen, Ph.: Negative entropy, zero temperature and stationary Markov chains on the interval. Bull. Braz. Math. Soc. 40, 1–52 (2009)
Lopes, A.O.: Thermodynamic formalism, maximizing probabilities and large deviations. Lecture Notes—Dynamique en Cornouaille (2012, in press)
Parry, W., Pollicott, M.: Zeta functions and the periodic orbit structure of hyperbolic dynamics. Astérisque 187–188 (1990)
Protter, P.: Stochastic Integration and Differential Equations. Springer, Berlin (1990)
Randall, D., Tetali, P.: Analyzing Glauber dynamics by comparison of Markov chains. J. Math. Phys. 41(3), 1598–1615 (2000)
Revuz, D., Yor, M.: Continuous Martingales and Brownian Motion, 3rd edn. Springer, Berlin (1999)
Ruelle, D.: Thermodynamic Formalism. Addison-Wesley, Reading (1978)
Stroock, D.: An Introduction to Markov Processes. Springer, Berlin (2000)
Stroock, D.: An Introduction to the Theory of Large Deviations. Springer, Berlin (1984)
Stroock, D., Zegarlinski, B.: On the ergodic properties of Glauber dynamics. J. Stat. Phys. 81(5–6), 1007–1019 (1995)
Yosida, K.: Functional Analysis. Springer, Berlin (1978)
Appendices
Appendix A: The Spectrum of the Generator on \({{\mathbb{L}}}^{2} (\mu_{A})\) and the Dirichlet Form
For any \(f\in {\mathbb{L}}^{2} (\mu_{A})\) the Dirichlet form of f is
Notice that
Indeed,
On the other hand,
These two equalities imply that
From expression (26) we have that implies f=0.
We point out that below we will consider eigenvalues in \({{\mathbb{L}}}^{2} (\mu_{A})\) whose eigenfunctions are not necessarily Lipschitz.
Dirichlet forms are quite important (see [21]), among other reasons, because they are particularly useful when there is a spectral gap. However, this will not be the case here.
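On a finite state space the Dirichlet form \(\mathcal{E}(f)=\langle f,-Lf\rangle_{\mu}\) can be computed directly, and for a stationary \(\mu\) it coincides with the symmetrized quadratic form \(\frac{1}{2}\sum_{i\neq j}\mu_{i}L_{ij}(f_{j}-f_{i})^{2}\). The sketch below checks this identity on a hypothetical three-state generator (the matrix is ours, purely for illustration).

```python
import numpy as np

# hypothetical irreducible generator: nonnegative off-diagonal rates, rows sum to zero
L = np.array([[-1.0, 0.7, 0.3],
              [ 0.2, -0.5, 0.3],
              [ 0.4, 0.6, -1.0]])

# stationary distribution mu: left null vector of L, normalized to a probability
w, Vec = np.linalg.eig(L.T)
mu = np.real(Vec[:, np.argmin(np.abs(w))])
mu = mu / mu.sum()

def dirichlet(f):
    """E(f) = <f, -Lf>_mu; for stationary mu this equals
    (1/2) sum_{i != j} mu_i L_ij (f_j - f_i)^2 >= 0."""
    return float(-mu @ (f * (L @ f)))

f = np.array([1.0, -2.0, 0.5])
quad = 0.5 * sum(mu[i] * L[i, j] * (f[j] - f[i]) ** 2
                 for i in range(3) for j in range(3) if i != j)
```

The identity holds for any stationary \(\mu\), reversible or not, which is why the Dirichlet form is nonnegative even without reversibility.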
Proposition 26
Let \(V: \{1,\dots,d\}^{{\mathbb{N}}}\to {\mathbb{R}}\) be a Lipschitz function such that supV−infV<2. There are eigenvalues c for the operator in \({{\mathbb{L}}}^{2} (\mu_{A})\) such that [(supV−2)∨0]<c<infV. Each eigenvalue has infinite multiplicity. Therefore, in this case, there is no spectral gap.
Proof
The existence of positive eigenvalues c for the operator satisfying [(supV−2)∨0]<c<infV will be obtained by solving the twisted cohomological equation. In order to simplify the reasoning, we present the proof for the case \(E=\{0,1\}^{\mathbb{N}}\). From Sect. 2.2 in [5], we know that, given functions \(z:E\to\mathbb {R}\) and \(C:E\to \mathbb{R}\), one can solve in α the twisted cohomological equation
in the case that |C|<1. Indeed, just take
Note that this function α is measurable and bounded but not Lipschitz.
Take \(z(y)=(-1)^{y_{0}} e^{-A(y)}\), where \(y=(y_{0},y_{1},y_{2},\dots)\). Now, for fixed \(c\in((\sup V-2)\vee 0,\,\inf V)\), consider C(y)=1−V(σ(y))+c. Notice that |C|<1. Then, (27) becomes
Let \(x\in\{1,\dots,d\}^{{\mathbb{N}}}\). Adding the equations above when y=0x and when y=1x, we get
because σ(0x)=x=σ(1x), and the potential A is normalized.
It is also easy to show that, by slightly modifying the argument, one can get an infinite-dimensional set of possible α associated to the same eigenvalue. □
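Equation (27) is not reproduced in this version of the text; a standard form of the twisted cohomological equation from [5] reads \(\alpha(y)=z(y)+C(y)\,\alpha(\sigma(y))\), solved by the geometric series \(\alpha=\sum_{n\geq 0}\bigl(\prod_{j<n}C\circ\sigma^{j}\bigr)\,z\circ\sigma^{n}\) when \(|C|<1\). The sketch below evaluates this series, truncated in depth, on periodic points (one period represents the point); the specific z and C are our hypothetical choices.

```python
def shift(x):
    """sigma on a periodic point of {0,1}^N, represented by one period."""
    return x[1:] + x[:1]

def solve_twisted(z, C, x, depth=200):
    """alpha(x) = sum_n (prod_{j<n} C(sigma^j x)) z(sigma^n x); the series
    converges geometrically since |C| < 1, so a deep truncation suffices."""
    total, prod, y = 0.0, 1.0, x
    for _ in range(depth):
        total += prod * z(y)
        prod *= C(y)
        y = shift(y)
    return total

# hypothetical data with |C| < 1
z = lambda x: float((-1) ** x[0])
C = lambda x: 0.5 + 0.1 * x[0]

x = (0, 1, 1)
alpha_x = solve_twisted(z, C, x)
alpha_sx = solve_twisted(z, C, shift(x))
residual = alpha_x - C(x) * alpha_sx - z(x)  # should vanish up to truncation error
```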
Appendix B: Basic Tools for Continuous Time Markov Chains
In this section we present the proofs of Lemmas 3 and 5. In order to do that, we present another way to analyze the properties of a continuous time Markov chain.
Suppose the process \(\{X_{t},t\geq 0\}\) is a continuous time Markov chain. Alternatively, we can describe it by considering its skeleton chain (see [24, 31]). Let \(\{\xi_{n}\}_{n\in {\mathbb{N}}}\) be a discrete time Markov chain with transition probability given by \(p(x,y)=\mathbf{1}_{[\sigma(y)=x]}\,e^{A(y)}\). Consider a sequence of random variables \(\{\tau_{n}\}_{n\in {\mathbb{N}}}\), independent and identically distributed according to an exponential law of parameter 1. For n≥0, define
Thus, \(X_{t}\) can be rewritten as \(\sum_{n=0}^{+\infty} \xi_{n} \mathbf{1}_{[T_{n}\leq t<T_{n+1}]}\), for all t≥0.
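The skeleton-chain description can be simulated directly: draw jumps from the discrete chain \(\{\xi_n\}\) and separate them by i.i.d. Exp(1) holding times. The sketch below does this on \(\{0,1\}^{\mathbb{N}}\) with the constant normalized potential \(A=-\log 2\) (our simplifying assumption), so each preimage is chosen uniformly; points are truncated to tuples and a jump prepends a symbol.

```python
import random

D = 2  # alphabet size; with A = -log(D) the weights e^{A(y)} sum to 1 over preimages

def step(x, rng):
    """One jump of the skeleton chain: move to a preimage y = a.x with prob e^{A(y)}."""
    a = rng.randrange(D)  # uniform, since A is constant in this sketch
    return (a,) + x

def sample_path(x0, T, rng):
    """X_t built from the jump chain xi_n and i.i.d. rate-1 exponential times tau_n."""
    t, x, path = 0.0, x0, [(0.0, x0)]
    while True:
        t += rng.expovariate(1.0)  # tau_n ~ Exp(1)
        if t > T:
            return path            # X_t = xi_n on [T_n, T_{n+1})
        x = step(x, rng)
        path.append((t, x))

rng = random.Random(0)
path = sample_path((0, 1, 0), T=5.0, rng=rng)
```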
Proof of Lemma 3
Using the above, we are able to describe expression (1) in a different way:
where \(\sigma^{n}(a_{n}\dots a_{1}x)=x\). The first term above is equal to \(e^{TV(x)}f(x)e^{-T}\). The summand in the second one is equal to
Using the transition probability of the Markov chain \(\{\xi_{n}\}_{n}\), we get

Recalling that the random variables \(\{\tau_{i}\}\) are independent and identically distributed according to an exponential law of parameter 1, we have
Therefore,
□
Proof of Lemma 5
We begin by analyzing
and \(e^{T V(x)} e^{-T}\leq e^{TC_{V}d(x,y)} e^{T V(y)} e^{-T}\). Since the potential A is also Lipschitz, we get
By the hypothesis assumed on f, we get
Thus,
□
Appendix C: Radon-Nikodym Derivative
Let be the natural filtration.
Proposition 27
The Radon-Nikodym derivative of the measure \({\mathbb{P}}_{\mu}\) (associated to the a priori process) concerning the admissible measure \(\tilde{{\mathbb{P}}}_{\mu}\) (see Definition 10) restricted to is
Proof
The probabilities \(\tilde{{\mathbb{P}}}_{\mu}\) and \({\mathbb{P}}_{\mu}\) on are equivalent, because the initial measure and the allowed jumps are the same. Thus, the expectation \({\mathbb{E}}_{\mu}\) of every bounded -measurable function is
The goal here is to obtain a formula for the Radon-Nikodym derivative \(\frac{\, \mathrm{d}{\mathbb{P}}_{\mu}}{\, \mathrm{d}\tilde{{\mathbb{P}}}_{\mu}}\). Since every bounded -measurable function can be approximated by functions depending only on a finite number of coordinates, it is enough to work with such functions. For k≥1, consider a sequence of times \(0\leq t_{1}<\dots<t_{k}\leq T\) and a bounded function \(F: (\{1,\dots,d\}^{{\mathbb{N}}} )^{k}\to {\mathbb{R}}\). Using the skeleton chain presented in the proof of Lemma 3, we get
Since \(F(X_{t_{1}},\dots,X_{t_{k}})\) restricted to the set \([T_{n}\leq T<T_{n+1}]\) depends only on \(\xi_{1},T_{1},\dots,\xi_{n},T_{n}\), there exist functions \(\bar{F}_{n}\) such that
Through calculations similar to the ones used in Corollary 2.2 of Appendix 1 in [21], the last probability is equal to
Then, for each \(n\in {\mathbb{N}}\) and every bounded measurable function \(G: (\{1,\dots,d\}^{{\mathbb{N}}}\times(0,\infty) )^{n}\to {\mathbb{R}}\), we need to estimate the expectation
Notice that, for all \(x\in\{1,\dots,d\}^{{\mathbb{N}}}\),
We can write \(\sum_{i=0}^{n-1}(\tilde{\gamma}(\xi_{i})-1)\tau_{i}\) as
and we can write \(e^{A(\xi_{i+1})-\tilde{A}(\xi_{i+1})} \frac {1}{\tilde{\gamma}(\xi_{i})}\) as
The expectation under \({\mathbb{P}}_{x}\) of \(G(\xi_{1},T_{1},\dots,\xi_{n},T_{n})\) becomes
Using the formula above in (30), the expectation \({\mathbb{E}}_{\mu}\) of \(F(X_{t_{1}},\dots,X_{t_{k}})\) is equal to
Once again, using calculations similar to Corollary 2.2 in Appendix 1 of [21], we rewrite the expression above as
and this sum is equal to
This finishes the proof. □
Appendix D: Proof of Lemma 11
Proof of Lemma 11
We claim that
is a \(\tilde{{\mathbb{P}}}_{\mu}\)-martingale. Then, the lemma will follow from \(\tilde{{\mathbb{E}}}_{\mu}[M^{G}_{T} ]=\tilde{{\mathbb{E}}}_{\mu}[M^{G}_{0} ]=0\). In order to prove this claim, it is enough to prove that
is a \(\tilde{{\mathbb{P}}}_{\mu}\)-martingale, because then \(M^{G}_{T}=\int G \, \mathrm{d}M_{T}\) will be a \(\tilde{{\mathbb{P}}}_{\mu}\)-martingale (see [33]).
Now, we prove (31). Let be the natural filtration. For all S<T, we prove that . By the Markov property, we only need to show that \(\tilde{{\mathbb{E}}}_{x} [M_{t} ]=0\).
Denote by the space of all trajectories ω in such that \(\omega_{0}=x\). Observe that, for all ω in ,
For all s≥0 and \(y \in\{1,\dots,d\}^{{\mathbb{N}}}\), \(N_{s}(y)\) denotes the number of times the exponential clock at site y has rung up to time s. Thus, the first term on the right side of (31) can be rewritten as
for all ω in .
Since (32) and (33) hold, in order to conclude this proof it is sufficient to show that
for all \(y \in\{1,\dots,d\}^{{\mathbb{N}}}\).
Let \(0=t_{0}<t_{1}<\dots<t_{n}=t\) be a partition of the interval [0,t]. Expression (34) can be rewritten as
Observe that
where the function \(O_{\tilde{\gamma}}\) satisfies \(O_{\tilde{\gamma }}(h)\leq C_{\tilde{\gamma}} h\). Then, we only need to prove that
By the Markov property, it is enough to see that \(\tilde{{\mathbb{E}}}_{x}[ N_{h}(y)]=\tilde{\gamma}(y)h\). This is a consequence of \(\tilde{\gamma}(y)\) being the parameter of the exponential clock at site y. □
Appendix E: Basic Properties of Q(V)
Lemma 28
\(|Q(V)-Q(U)|\leq \|V-U\|_{\infty}\).
Proof
Since
then,
□
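Lemma 28 has a finite-state analogue in which Q(V) plays the role of the pressure: the log of the spectral radius of a transition matrix weighted by \(e^{V}\). Since the weighted matrices compare entrywise by the factor \(e^{\|V-U\|_\infty}\), the same Lipschitz bound holds. The sketch below checks it numerically; the matrix and the random potentials are our illustrative choices.

```python
import numpy as np

P = np.array([[0.5, 0.5, 0.0],
              [0.2, 0.3, 0.5],
              [0.4, 0.0, 0.6]])  # hypothetical stochastic matrix playing the role of e^{A}

def Q(V):
    """Finite-state analogue of the pressure: log spectral radius of P weighted by e^V."""
    M = P * np.exp(V)[None, :]   # entry (i, j) becomes P_ij e^{V_j}
    return float(np.log(np.max(np.abs(np.linalg.eigvals(M)))))

rng = np.random.default_rng(1)
V = rng.normal(size=3)
U = rng.normal(size=3)
gap = abs(Q(V) - Q(U))
bound = float(np.max(np.abs(V - U)))  # ||V - U||_infty dominates the gap
```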
Lemma 29
The functional \(V\mapsto Q(V)\) is convex, i.e., for all α∈(0,1), we have
Proof
Using Hölder's inequality, we have
Thus,
□
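The convexity of Lemma 29 can also be observed in the same finite-state analogue, where Q(V) is the log spectral radius of the \(e^{V}\)-weighted matrix; convexity in V follows from the same Hölder argument. A numerical sketch (matrix and potentials are our hypothetical choices):

```python
import numpy as np

P = np.array([[0.5, 0.5],
              [0.3, 0.7]])  # hypothetical stochastic matrix

def Q(V):
    # finite-state analogue of the pressure: log spectral radius of P weighted by e^V
    M = P * np.exp(V)[None, :]
    return float(np.log(np.max(np.abs(np.linalg.eigvals(M)))))

V = np.array([1.0, -0.5])
U = np.array([-0.3, 0.8])
alpha = 0.4
lhs = Q(alpha * V + (1 - alpha) * U)       # Q at the convex combination
rhs = alpha * Q(V) + (1 - alpha) * Q(U)    # convex combination of the values
```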
Appendix F: The Associated Symmetric Process and the Metropolis Algorithm
We can consider in our setting an extra parameter \(\beta\in\mathbb {R}\), which plays the role of the inverse temperature. For a given fixed potential V we can consider the new potential βV, \(\beta\in\mathbb{R}\), and, applying what we did before, we get continuous time equilibrium states described by \(\gamma_{\beta}:=\gamma_{\beta V}\) and \(B_{\beta}:=B_{\beta V}\), in the previous notation. In other words, we consider the infinitesimal generator , β>0, and the associated main eigenvalue \(\lambda_{\beta}:=\lambda_{\beta V}\). We denote by \(L^{V,\beta}\) the infinitesimal generator of the process that is the continuous time Gibbs state for the potential βV; it acts on functions f as \(L^{V,\beta}(f)(x)= \gamma_{\beta}(x) \sum_{\sigma (y)=x}e^{B_{\beta}(y)} [f(y)-f(x) ]\). We are interested in the stationary probability \(\mu_{\beta}:=\mu_{B_{\beta V}, \gamma_{\beta V}}\) for the semigroup \(\{e^{ t L^{V,\beta} }, t\geq0\}\), and in its weak limit as β→∞. This limit corresponds to the continuous time Gibbs state at temperature zero (see [7, 25, 28] for related results).
The dual of \(L^{V,\beta}\) on the Hilbert space \({\mathbb{L}}^{2}( \mu_{\beta })\) is , where is the Koopman operator. Notice that the probability \(\mu_{\beta}\) is also stationary for the continuous time process with symmetric infinitesimal generator \(L_{sym}^{V,\beta}:=\frac{1}{2} (L^{V,\beta} + {L^{V,\beta}}^{*})\). In this new process the particle at x can jump to a σ-preimage y with probability \(\frac{1}{2}e^{B_{\beta}(y)}\), or, with probability \(\frac{1}{2}\), to the forward image σ(x); in both cases the jump occurs after an exponential time of parameter \(\gamma_{\beta}(x)\).
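The claim that \(\mu\) stays stationary for the symmetrized generator can be checked on a finite state space, where the \({\mathbb{L}}^{2}(\mu)\)-adjoint has the explicit form \((L^{*})_{ij}=\mu_{j}L_{ji}/\mu_{i}\). A minimal sketch, with a hypothetical generator of our choosing:

```python
import numpy as np

L = np.array([[-1.0, 1.0, 0.0],
              [ 0.5, -1.5, 1.0],
              [ 1.0, 0.5, -1.5]])  # hypothetical irreducible generator (rows sum to 0)

# stationary distribution: left null vector of L, normalized
w, Vec = np.linalg.eig(L.T)
mu = np.real(Vec[:, np.argmin(np.abs(w))])
mu = mu / mu.sum()

# adjoint of L on L^2(mu): (L*)_ij = mu_j L_ji / mu_i
Lstar = (L.T * mu[None, :]) / mu[:, None]
Lsym = 0.5 * (L + Lstar)  # symmetrized generator; mu remains stationary for it
```

Both facts used in the text hold: \(\mu L_{sym}=0\), and \(L_{sym}\) is self-adjoint in \({\mathbb{L}}^{2}(\mu)\), i.e. \(\mu_{i}(L_{sym})_{ij}=\mu_{j}(L_{sym})_{ji}\).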
The eigenfunction of the continuous time Markov chain with infinitesimal generator \(L_{sym}^{V,\beta}\) can be different from the one associated with \(L^{V,\beta}\). Given V and β, we denote by \(\lambda(\beta)_{sym}\) the main eigenvalue obtained from βV and the generator \(L_{sym}^{V,\beta}\). The eigenvalues of \(L^{V,\beta}\) and \({L^{V,\beta}}^{*}\) are the same as before. Now, we briefly look at how to obtain \(\lambda(\beta)_{sym}\). From the symmetry assumption (see [11]), we get, for a fixed β,
The second equality is due to Definition (12), and the last one holds because the dual on \({\mathbb{L}}^{2}(\mu _{\beta})\) of is .
Suppose β increases to ∞; one can then ask about the asymptotic behavior of the stationary Gibbs probability \(\mu_{\beta}\). One should first analyze what happens to the optimal (or almost optimal) ϕ in the maximization problem above. In order to answer this question, we use the Cauchy-Schwarz inequality in \({\mathbb{L}}^{2}(\mu_{\beta})\) and obtain
Note that, for a fixed large β, the positive value \(\gamma_{\beta}(x)=1-\beta V(x)+\lambda_{\beta V}\) becomes smaller near the points where V attains its supremum, which means that \(\frac{1}{\gamma_{\beta}(x)}\) becomes large there. Moreover, for fixed β, the part \(\int \beta V |\phi| \frac{1}{ \gamma_{\beta}} \frac{\, \mathrm{d}\mu_{B_{\beta}}}{\int\frac{1}{\gamma_{\beta}} \, \mathrm{d} \mu_{B_{\beta}}} \) of the above expression increases if we consider |ϕ| with most of its mass concentrated closer and closer to the supremum of βV. Note also that, for fixed β, the part of the above expression is bounded and depends only on ϕ. The supremum of \(\int \beta V |\phi| \frac{1}{ \gamma_{\beta}} \frac{\, \mathrm{d}\mu_{B_{\beta}}}{\int\frac{1}{\gamma_{\beta}} \, \mathrm{d} \mu_{B_{\beta}}} \) grows with β at least of order β.
Therefore, for large β, the maximum above should be attained by taking ϕ=ϕ β in \({\mathbb{L}}^{2}(\mu_{\beta})\) which is more and more concentrated near the supremum of βV. In this way, as β→∞ the "almost" optimal ϕ tends to localize at the points where the supremum of V is attained. If there is a unique point \(z_{0}\) where V is maximal, then \(\lambda_{\beta}\sim\beta V(z_{0})\). The probability \(\mu_{\beta}\) will converge to the Dirac delta at the point \(z_{0}\). This procedure is quite similar to the process of determining ground states for a given potential via approximation by Gibbs states at very small temperature (see for instance [1]).
The Metropolis algorithm has several distinct applications. In one of them, it is used to maximize a function on a quite large space (see [14, 22]). Suppose V has a unique point of maximal value. The basic idea is to produce a random algorithm that can explore the state space and localize the point of maximum; a deterministic algorithm may fail at this. The use of continuous time paths brings some advantages to the method. The randomness assures that the algorithm does not get stuck at a point of local maximum of the function V. The setting we consider here has several similarities with the usual procedure. When we take β large, the probability \(\mu_{\beta}\) will be very close to the Dirac delta at the point of maximum of V, as we just saw. This is so because the parameter \(\frac{1}{\gamma_{\beta}(x)}\) of the exponential distribution becomes large near the supremum of V. In the classical Metropolis algorithm there is a link between β and t which is necessary for convergence (the cooling schedule in [35]). In a forthcoming paper, using our large deviation results, we will investigate the following question: given small ϵ and δ, with probability bigger than 1−δ, the empirical path on the one-dimensional spin lattice will stay within a distance smaller than ϵ of the maximal value for a proportion 1−δ of the time t, if t and β are chosen in a certain way (to be understood). In order to do that we have to use the large deviation results obtained before.
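For contrast with the continuous time scheme above, here is a minimal discrete-time Metropolis sketch maximizing a function with a slowly growing β (a hypothetical logarithmic cooling schedule); the state space, V, and the neighbor structure are our toy choices, not the paper's setting.

```python
import math
import random

def metropolis_maximize(V, x0, neighbors, steps=5000, rng=None):
    """Toy Metropolis chain targeting mu_beta proportional to e^{beta V};
    as beta grows, the chain concentrates near the maximizer of V."""
    rng = rng or random.Random(0)
    x = x0
    for n in range(1, steps + 1):
        beta = math.log(1 + n)          # hypothetical cooling schedule
        y = rng.choice(neighbors(x))
        # accept with probability min(1, e^{beta (V(y) - V(x))}):
        # uphill moves are always accepted, downhill ones rarely for large beta
        if math.log(rng.random() + 1e-300) < beta * (V(y) - V(x)):
            x = y
    return x

V = lambda s: -(s - 7) ** 2                            # unique maximum at s = 7
neighbors = lambda s: [max(s - 1, 0), min(s + 1, 9)]   # nearest neighbors in {0,...,9}
best = metropolis_maximize(V, 0, neighbors)
```

The randomness lets the chain escape local maxima; with this schedule the final state sits at (or next to) the global maximizer with high probability.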
Lopes, A., Neumann, A. & Thieullen, P. A Thermodynamic Formalism for Continuous Time Markov Chains with Values on the Bernoulli Space: Entropy, Pressure and Large Deviations. J Stat Phys 152, 894–933 (2013). https://doi.org/10.1007/s10955-013-0796-7