
An error bound in the Sudakov-Fernique inequality

arXiv:math/0510424v1 [math.PR] 20 Oct 2005

Sourav Chatterjee

October 20, 2005

Abstract

We obtain an asymptotically sharp error bound in the classical Sudakov-Fernique comparison inequality for finite collections of gaussian random variables. Our proof is short and self-contained, and gives an easy alternative argument for the classical inequality, extended to the case of non-centered processes.

1 Statement of the result


Gaussian comparison inequalities are among the most important tools in the theory of gaussian
processes, and the Sudakov-Fernique inequality (named after Sudakov [11, 12] and Fernique [3]) is
perhaps the most widely used member of that class.
We will concentrate on the Sudakov-Fernique inequality in this article; general discussions
about comparison inequalities can be found in Adler [1], Fernique [4], Ledoux & Talagrand [9], and
Lifshits [10].
The classical Sudakov-Fernique inequality goes as follows:
Theorem 1.1. [Sudakov-Fernique inequality] Let $\{X_i, i \in I\}$ and $\{Y_i, i \in I\}$ be two centered gaussian processes indexed by the same indexing set $I$. Suppose that both the processes are almost surely bounded. For each $i, j \in I$, let $\gamma_{ij}^X = E(X_i - X_j)^2$ and $\gamma_{ij}^Y = E(Y_i - Y_j)^2$. If $\gamma_{ij}^X \le \gamma_{ij}^Y$ for all $i, j$, then $E(\sup_{i \in I} X_i) \le E(\sup_{i \in I} Y_i)$.
As mentioned before, this inequality is attributed to Sudakov [11, 12] and Fernique [3]. Later proofs were given by Alexander [2] and in an unpublished work of S. Chevet. Important variants were proved by Gordon [5, 6, 7] and Kahane [8]. More recently, Vitale [14] has shown, through a clever argument, that we only need $E(X_i) = E(Y_i)$ instead of $E(X_i) = E(Y_i) = 0$ in the hypothesis of Theorem 1.1. We will prove the following result, which gives a sharp error bound when the indexing set is finite, and also contains Vitale's extension of the Sudakov-Fernique inequality.
Theorem 1.2. Let $(X_1, \ldots, X_n)$ and $(Y_1, \ldots, Y_n)$ be gaussian random vectors with $E(X_i) = E(Y_i)$ for each $i$. For $1 \le i, j \le n$, let $\gamma_{ij}^X = E(X_i - X_j)^2$ and $\gamma_{ij}^Y = E(Y_i - Y_j)^2$, and let $\gamma = \max_{1 \le i,j \le n} |\gamma_{ij}^X - \gamma_{ij}^Y|$. Then
$$\Big| E\big(\max_{1 \le i \le n} X_i\big) - E\big(\max_{1 \le i \le n} Y_i\big) \Big| \le \sqrt{\gamma \log n}.$$
Moreover, if $\gamma_{ij}^X \le \gamma_{ij}^Y$ for all $i, j$, then $E(\max_i X_i) \le E(\max_i Y_i)$.

The asymptotic sharpness of the error bound is easy to see from the case where all the $X_i$'s are independent standard normals and all the $Y_i$'s are zero: there $\gamma = 2$, and $E(\max_{1 \le i \le n} X_i) \sim \sqrt{2 \log n} = \sqrt{\gamma \log n}$ as $n \to \infty$, so the bound is attained in the limit.
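
To see the two sides concretely, here is a minimal Monte Carlo sketch in Python (an illustration added here, not part of the paper); the choices of $n$, the sample size, and the seed are arbitrary.

```python
import numpy as np

# Monte Carlo check of Theorem 1.2 in the sharpness example:
# X_1, ..., X_n i.i.d. standard normal, Y_1, ..., Y_n identically zero.
# Here gamma = max_{i,j} |gamma_ij^X - gamma_ij^Y| = 2, so the theorem
# gives |E(max_i X_i) - E(max_i Y_i)| <= sqrt(2 log n).

rng = np.random.default_rng(0)
n, samples = 1000, 5000

X = rng.standard_normal((samples, n))
lhs = X.max(axis=1).mean()        # estimates E(max_i X_i); E(max_i Y_i) = 0
rhs = np.sqrt(2 * np.log(n))      # sqrt(gamma * log n) with gamma = 2

print(f"E(max X_i) ~ {lhs:.3f}, bound sqrt(2 log n) = {rhs:.3f}")
# For n = 1000 the estimate is roughly 3.2 against a bound of about 3.7;
# the ratio tends to 1 as n grows, which is the asymptotic sharpness.
```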

2 Proof
We first need to state the following well-known “integration by parts” lemma:

Lemma 2.1. If $F : \mathbb{R}^n \to \mathbb{R}$ is a $C^1$ function of moderate growth at infinity, and $X = (X_1, \ldots, X_n)$ is a centered gaussian random vector, then for any $1 \le i \le n$,
$$E(X_i F(X)) = \sum_{j=1}^n E(X_i X_j)\, E\Big(\frac{\partial F}{\partial x_j}(X)\Big).$$

A proof of this lemma can be found in the appendix of [13], for example.
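
To make the lemma concrete, the following sketch (an added numerical sanity check, not from the paper) compares both sides of the identity for an arbitrary smooth $F$ and an arbitrary covariance matrix.

```python
import numpy as np

# Numerical check of Lemma 2.1 (gaussian integration by parts) with
# F(x1, x2) = sin(x1) * x2 and a chosen 2x2 covariance matrix; both
# are illustrative choices.

rng = np.random.default_rng(1)
cov = np.array([[1.0, 0.5],
                [0.5, 2.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=500_000)
x1, x2 = X[:, 0], X[:, 1]

# Left side with i = 1: E(X_1 F(X)).
lhs = np.mean(x1 * np.sin(x1) * x2)

# Right side: sum_j E(X_1 X_j) E(dF/dx_j(X)), where
# dF/dx1 = cos(x1) * x2 and dF/dx2 = sin(x1).
rhs = cov[0, 0] * np.mean(np.cos(x1) * x2) + cov[0, 1] * np.mean(np.sin(x1))

print(f"lhs = {lhs:.4f}, rhs = {rhs:.4f}")  # agree up to Monte Carlo error
```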

Proof of Theorem 1.2. Let $X = (X_1, \ldots, X_n)$ and $Y = (Y_1, \ldots, Y_n)$. Without loss of generality, we may assume that $X$ and $Y$ are defined on the same probability space and are independent. Fix $\beta > 0$, and define $F_\beta : \mathbb{R}^n \to \mathbb{R}$ as
$$F_\beta(x) := \beta^{-1} \log \sum_{i=1}^n e^{\beta x_i}.$$
(Note that $x$ denotes the vector $(x_1, \ldots, x_n)$, a convention that we shall follow throughout.) Now, for each $i$, let $\mu_i = E(X_i) = E(Y_i)$, $\tilde{X}_i = X_i - \mu_i$, and $\tilde{Y}_i = Y_i - \mu_i$. For $1 \le i, j \le n$, let $\sigma_{ij}^X = E(\tilde{X}_i \tilde{X}_j)$ and $\sigma_{ij}^Y = E(\tilde{Y}_i \tilde{Y}_j)$. For $0 \le t \le 1$, define the random vector $Z_t = (Z_{t,1}, \ldots, Z_{t,n})$ as
$$Z_{t,i} = \sqrt{1-t}\, \tilde{X}_i + \sqrt{t}\, \tilde{Y}_i + \mu_i.$$

For all $t \in [0, 1]$, let $\varphi(t) = E(F_\beta(Z_t))$. Then $\varphi$ is differentiable, and
$$\varphi'(t) = \sum_{i=1}^n E\bigg(\frac{\partial F_\beta}{\partial x_i}(Z_t) \Big(\frac{\tilde{Y}_i}{2\sqrt{t}} - \frac{\tilde{X}_i}{2\sqrt{1-t}}\Big)\bigg).$$

Again, for any $i$, Lemma 2.1 gives us
$$E\Big(\frac{\partial F_\beta}{\partial x_i}(Z_t)\, \tilde{X}_i\Big) = \sqrt{1-t} \sum_{j=1}^n \sigma_{ij}^X\, E\Big(\frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(Z_t)\Big)$$
and
$$E\Big(\frac{\partial F_\beta}{\partial x_i}(Z_t)\, \tilde{Y}_i\Big) = \sqrt{t} \sum_{j=1}^n \sigma_{ij}^Y\, E\Big(\frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(Z_t)\Big).$$

Combining, we have
$$\varphi'(t) = \frac{1}{2} \sum_{1 \le i,j \le n} E\Big(\frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(Z_t)\Big)(\sigma_{ij}^Y - \sigma_{ij}^X).$$

Now
$$\frac{\partial F_\beta}{\partial x_i}(x) = p_i(x) := \frac{e^{\beta x_i}}{\sum_{j=1}^n e^{\beta x_j}}.$$
Note that for each $x \in \mathbb{R}^n$, the numbers $p_1(x), \ldots, p_n(x)$ as defined above are nonnegative and sum to 1. In other words, they induce a probability measure on $\{1, 2, \ldots, n\}$. It is straightforward to verify that
$$\frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(x) = \begin{cases} \beta(p_i(x) - p_i(x)^2) & \text{if } i = j,\\ -\beta\, p_i(x) p_j(x) & \text{if } i \ne j.\end{cases}$$
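
This Hessian can be written compactly as $\beta(\mathrm{diag}(p(x)) - p(x)p(x)^\top)$; as a quick added check (with an arbitrary test point and $\beta$, not from the paper), the sketch below compares it against central finite differences of $F_\beta$.

```python
import numpy as np

# Compare the closed-form Hessian of F_beta with finite differences.
beta = 2.0
x = np.array([0.1, -0.4, 0.7])
n = len(x)

def F(x):
    m = x.max()  # stable log-sum-exp: shift by the max to avoid overflow
    return m + np.log(np.sum(np.exp(beta * (x - m)))) / beta

p = np.exp(beta * x) / np.sum(np.exp(beta * x))
hess_formula = beta * (np.diag(p) - np.outer(p, p))

h = 1e-4
hess_fd = np.zeros((n, n))
I = np.eye(n)
for i in range(n):
    for j in range(n):
        hess_fd[i, j] = (F(x + h*I[i] + h*I[j]) - F(x + h*I[i] - h*I[j])
                         - F(x - h*I[i] + h*I[j]) + F(x - h*I[i] - h*I[j])) / (4 * h * h)

print(np.max(np.abs(hess_formula - hess_fd)))  # should be near zero
```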

Thus,
$$\sum_{1 \le i,j \le n} \frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(x)(\sigma_{ij}^Y - \sigma_{ij}^X) = \beta \sum_{i=1}^n p_i(x)(\sigma_{ii}^Y - \sigma_{ii}^X) - \beta \sum_{1 \le i,j \le n} p_i(x) p_j(x)(\sigma_{ij}^Y - \sigma_{ij}^X).$$
Now observe that since $\sum_{i=1}^n p_i(x) = 1$, we have
$$\sum_{i=1}^n p_i(x)(\sigma_{ii}^Y - \sigma_{ii}^X) = \frac{1}{2} \sum_{1 \le i,j \le n} p_i(x) p_j(x)(\sigma_{ii}^Y - \sigma_{ii}^X + \sigma_{jj}^Y - \sigma_{jj}^X).$$

Combining, we have
$$\sum_{1 \le i,j \le n} \frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(x)(\sigma_{ij}^Y - \sigma_{ij}^X) = \frac{\beta}{2} \sum_{1 \le i,j \le n} p_i(x) p_j(x)\Big[(\sigma_{ii}^Y + \sigma_{jj}^Y - 2\sigma_{ij}^Y) - (\sigma_{ii}^X + \sigma_{jj}^X - 2\sigma_{ij}^X)\Big].$$

Now note that
$$\sigma_{ii}^X + \sigma_{jj}^X - 2\sigma_{ij}^X = E(\tilde{X}_i - \tilde{X}_j)^2 = E(X_i - X_j)^2 - (\mu_i - \mu_j)^2,$$
and similarly
$$\sigma_{ii}^Y + \sigma_{jj}^Y - 2\sigma_{ij}^Y = E(\tilde{Y}_i - \tilde{Y}_j)^2 = E(Y_i - Y_j)^2 - (\mu_i - \mu_j)^2.$$

Therefore,
$$\sum_{1 \le i,j \le n} \frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(x)(\sigma_{ij}^Y - \sigma_{ij}^X) = \frac{\beta}{2} \sum_{1 \le i,j \le n} p_i(x) p_j(x)(\gamma_{ij}^Y - \gamma_{ij}^X).$$
Thus, if $\gamma_{ij}^X \le \gamma_{ij}^Y$ for all $i, j$, then $\varphi'(t) \ge 0$ for each $t$, which implies
$$E(F_\beta(Y)) = \varphi(1) \ge \varphi(0) = E(F_\beta(X)). \tag{1}$$

Now observe that
$$\begin{aligned}
\max_i x_i &= \beta^{-1} \log\big(e^{\beta \max_i x_i}\big) \\
&\le \beta^{-1} \log\Big(\sum_i e^{\beta x_i}\Big) \\
&\le \beta^{-1} \log\big(n\, e^{\beta \max_i x_i}\big) \\
&= \beta^{-1} \log n + \max_i x_i. \tag{2}
\end{aligned}$$
In other words, $\max_i x_i \le F_\beta(x) \le \beta^{-1} \log n + \max_i x_i$.
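
The sandwich in (2) is easy to observe numerically; in the sketch below (an added illustration with arbitrary data), $F_\beta$ squeezes down to the maximum at rate $\beta^{-1} \log n$ as $\beta$ grows.

```python
import numpy as np

# Illustrates (2): max_i x_i <= F_beta(x) <= max_i x_i + log(n)/beta.
def F_beta(x, beta):
    m = x.max()  # shift by the max so exp() cannot overflow for large beta
    return m + np.log(np.sum(np.exp(beta * (x - m)))) / beta

x = np.array([0.3, -1.2, 0.9, 0.7])
for beta in [1.0, 10.0, 100.0]:
    upper = x.max() + np.log(len(x)) / beta
    print(f"beta={beta:6.1f}: max={x.max():.4f} <= "
          f"F_beta={F_beta(x, beta):.4f} <= {upper:.4f}")
```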


Thus, taking $\beta \to \infty$ in (1), we get the second assertion of the theorem. For the first, note that with $\gamma = \max_{1 \le i,j \le n} |\gamma_{ij}^Y - \gamma_{ij}^X|$, we have
$$\bigg|\sum_{1 \le i,j \le n} \frac{\partial^2 F_\beta}{\partial x_j \partial x_i}(x)(\sigma_{ij}^Y - \sigma_{ij}^X)\bigg| \le \frac{\beta\gamma}{2} \sum_{1 \le i,j \le n} p_i(x) p_j(x) = \frac{\beta\gamma}{2}.$$

Since $\varphi'(t)$ is half the expectation of this quantity at $Z_t$, this shows that
$$|E(F_\beta(Y)) - E(F_\beta(X))| = |\varphi(1) - \varphi(0)| \le \frac{\beta\gamma}{4}.$$
Combined with (2), this gives
$$\Big|E\big(\max_i Y_i\big) - E\big(\max_i X_i\big)\Big| \le \frac{\beta\gamma}{4} + \frac{\log n}{\beta}.$$
Choosing $\beta = 2\sqrt{\frac{\log n}{\gamma}}$ gives the desired result.
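
(To verify the choice of $\beta$, a step left implicit above: minimizing $g(\beta) = \frac{\beta\gamma}{4} + \frac{\log n}{\beta}$ over $\beta > 0$, we solve
$$g'(\beta) = \frac{\gamma}{4} - \frac{\log n}{\beta^2} = 0 \quad\Longrightarrow\quad \beta = 2\sqrt{\frac{\log n}{\gamma}},$$
at which point each of the two terms equals $\frac{1}{2}\sqrt{\gamma \log n}$, so $g(\beta) = \sqrt{\gamma \log n}$.)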

References
[1] Adler, R.J. (1990). An Introduction to Continuity, Extrema, and Related Topics for General Gaussian Processes. Institute of Mathematical Statistics.
[2] Alexander, R. (1985). Lipschitzian mappings and total mean curvature of polyhedral surfaces I. Trans. Amer. Math. Soc. 288, 661–678.
[3] Fernique, X. (1975). Régularité des trajectoires des fonctions aléatoires gaussiennes. Lecture Notes in Mathematics 480, 1–96, Springer.
[4] Fernique, X. (1997). Fonctions aléatoires gaussiennes, vecteurs aléatoires gaussiens. CRM, Montreal.
[5] Gordon, Y. (1985). Some inequalities for Gaussian processes and applications. Israel J. Math. 50, 265–289.
[6] Gordon, Y. (1987). Elliptically contoured distributions. Prob. Th. Rel. Fields 76, 429–438.
[7] Gordon, Y. (1992). Majorization of Gaussian processes and geometric applications. Prob. Th. Rel. Fields 91, 251–267.
[8] Kahane, J.-P. (1986). Une inégalité du type de Slepian et Gordon sur les processus gaussiens. Israel J. Math. 55, 109–110.
[9] Ledoux, M. & Talagrand, M. (1991). Probability in Banach Spaces. Springer, New York.
[10] Lifshits, M.A. (1995). Gaussian Random Functions. Kluwer, Boston.
[11] Sudakov, V.N. (1971). Gaussian random processes and measures of solid angles in Hilbert space. Dokl. Akad. Nauk SSSR 197, 43–45; English translation in Soviet Math. Dokl. (1971) 12, 412–415.
[12] Sudakov, V.N. (1976). Geometric Problems in the Theory of Infinite-Dimensional Probability Distributions. Trudy Mat. Inst. Steklov 141; English translation in Proc. Steklov Inst. Math. 2, Amer. Math. Soc.
[13] Talagrand, M. (2003). Spin Glasses: A Challenge for Mathematicians. Cavity and Mean Field Models. Springer-Verlag, Berlin.
[14] Vitale, R.A. (2000). Some comparisons for gaussian processes. Proc. Amer. Math. Soc. 128, 3043–3046.
