Optimal constants in concentration inequalities on the sphere and in the Gauss space

Guillaume Aubrun Institut Camille Jordan, Université Claude Bernard Lyon 1, 43 boulevard du 11 novembre 1918, 69622 Villeurbanne cedex, France Univ. Lyon, ENS Lyon, UCBL, CNRS, Inria, LIP, F-69342, Lyon Cedex 07, France. aubrun@math.univ-lyon1.fr , Justin Jenkinson Case Western Reserve University, Cleveland, Ohio 44106-7058, U.S.A. jdj13@case.edu and Stanislaw J. Szarek Case Western Reserve University, Cleveland, Ohio 44106-7058, U.S.A. Institut de Mathématiques de Jussieu, Sorbonne Université, 4 place Jussieu, 75252 Paris cedex 05, France szarek@cwru.edu

Abstract.

We show several variants of concentration inequalities on the sphere stated as subgaussian estimates with optimal constants. For a Lipschitz function, we give one-sided and two-sided bounds for deviation from the median as well as from the mean. For example, we show that if $\mu$ is the normalized surface measure on $S^{n-1}$ with $n\geqslant 3$ , $f:S^{n-1}\to\mathbb{R}$ is $1$ -Lipschitz, $M$ is the median of $f$ , and $t>0$ , then $\mu\big{(}f\geqslant M+t\big{)}\leqslant\frac{1}{2}e^{-nt^{2}/2}$ . If $M$ is the mean of $f$ , we have a two-sided bound $\mu\big{(}|f-M|\geqslant t\big{)}\leqslant e^{-nt^{2}/2}$ . Consequently, if $\gamma$ is the standard Gaussian measure on $\mathbb{R}^{n}$ and $f:\mathbb{R}^{n}\to\mathbb{R}$ (again, $1$ -Lipschitz, with the mean equal to $M$ ), then $\gamma\big{(}|f-M|\geqslant t\big{)}\leqslant e^{-t^{2}/2}$ . These bounds are slightly better and arguably more elegant than those available elsewhere in the literature.

1. Introduction and the main results

Lévy’s isoperimetric inequality on the sphere in $\mathbb{R}^{n}$ [9, 15] is one of the most useful tools in the study of high-dimensional phenomena. The isoperimetric inequality itself is a very precise result: Among the subsets of the sphere of given measure, the caps have the smallest boundary. On the other hand, typical applications appeal to its corollaries, which exhibit varying degrees of tightness. Such corollaries are most often expressed as subgaussian concentration inequalities of the form

(1)

\mu(\left\{f\geqslant M+t\right\})\leqslant C\ e^{-cnt^{2}},

valid for any $1$ -Lipschitz real valued function on the sphere and any $t>0$ , where $\mu=\mu_{n}$ is the normalized surface measure on the sphere, $M$ is either the median or the mean of $f$ , and $C,c>0$ are (explicit or effectively computable) constants, independent of $n$ , $f$ , and $t$ . Perhaps the most frequently cited variant of a spherical concentration inequality comes from the influential 1986 book of Milman and Schechtman [11].

Fact 1.

If $f:S^{n+1}\rightarrow\mathbb{R}$ is a $1$ -Lipschitz function (with respect to the geodesic distance) and $M$ is its median, then for every $t>0$

(2)

\mu(\left\{f\geqslant M+t\right\})\leqslant\sqrt{{\pi}/8}\ e^{-nt^{2}/2}.

Equivalently, if $A\subset S^{n+1}$ is such that $\mu(A)\geqslant\frac{1}{2}$ , then – for every $t>0$ – the $t$ -enlargement of $A$ defined by $A_{t}:=\{x:{\rm dist}(x,A)<t\}$ verifies $\mu(A_{t})\geqslant 1-\sqrt{{\pi}/8}\ e^{-nt^{2}/2}$ .

The spherical concentration inequality (2) played a huge role in the development of the theory. However, its statement is not completely satisfactory for two reasons. First, while it is well known that the constant $c=\frac{1}{2}$ in the exponent is optimal, stating Fact 1 for $S^{n+1}$ hides the inconvenient truth that the dimension of the ambient space in this formulation is $n+2$ , while the factor that appears in the exponent is just $n$ . It would be more elegant and convenient for that factor to coincide with the dimension of the ambient space, leading directly to a concentration result of the form (1). Next, the constant $C=\sqrt{\pi/8}\approx 0.626657$ in (2) is not optimal (ideally, both bounds in Fact 1 should tend to $\mu(A)=\frac{1}{2}$ as $t\to 0^{+}$ ). Even though it is possible to replace $n+1$ with $n-1$ while adjusting the constants, doing that would only exacerbate the second drawback. Here we will prove the following version of the inequality that addresses all these concerns. We emphasize that this bound is not meant to be optimal; in fact, various bounds that are better and near optimal in various asymptotics are known and not that difficult (for example, see Proposition 6, its proof, and the comments following it). However, we believe that our results offer a reasonable compromise between sharpness, simplicity, and the ease of application.

Theorem 2.

([5]) Let $n>2$ and $t>0$ . If $A\subset S^{n-1}$ satisfies $\mu(A)\geqslant 1/2$ then

(3)

\mu(A_{t})\geqslant 1-\frac{1}{2}\,e^{-t^{2}n/2}.

Consequently, if $f:S^{n-1}\rightarrow\mathbb{R}$ is a function which is $L$ -Lipschitz with respect to the geodesic distance and if $M$ is its median, then for every $t\geqslant 0$ ,

(4)

\mu(\left\{f>M+t\right\})\leqslant\frac{1}{2}\,e^{-nt^{2}/2L^{2}}\ \hbox{ and }\ \mu(\left\{|f-M|>t\right\})\leqslant e^{-nt^{2}/2L^{2}}.

Remark.

Before continuing, let us comment on the case $n=2$ , i.e., that of the $1$ -dimensional sphere $S^{1}$ . As easily follows from the general argument sketched in Section 2, the optimal lower bound in (3) is then very simple and reads (in the nontrivial range $(0,\pi/2]$ )

\mu(A_{t})\geqslant\frac{1}{2}+\frac{t}{\pi}.

A direct check shows then that the estimate (3) from Theorem 2 fails if $t\in(\alpha,\beta)$ , where $\alpha\approx 1.05858,\beta\approx 1.18588$ . However, it remains true outside of this interval. (This is visualized in Figure 1 further below.) Also, if we use the extrinsic chordal distance in $\mathbb{R}^{2}$ (or in $\mathbb{R}^{n}$ , for any $n\geqslant 2$ ) instead of the geodesic distance, the estimate holds in the entire nontrivial range $[0,\sqrt{2}]$ . Note that the extrinsic distance, i.e., the usual Euclidean distance in the ambient space $\mathbb{R}^{n}$ , is in many applications more relevant than the geodesic distance. This happens for example when the function $f$ is defined – and Lipschitz – on the entire space $\mathbb{R}^{n}$ , or at least on the unit ball. ∎
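The endpoints $\alpha$ and $\beta$ quoted above can be reproduced with a short computation. The sketch below assumes only that, for $n=2$ , comparing the optimal bound $\frac{1}{2}+\frac{t}{\pi}$ with $1-\frac{1}{2}e^{-t^{2}}$ shows that (3) fails exactly when $(1-2t/\pi)e^{t^{2}}>1$ ; the helper names `q2`, `bisect`, and `excess` are ours, and the bracketing intervals were chosen by inspection.

```python
import math

def q2(t):
    # (3) with n = 2 fails exactly when q2(t) > 1, because
    # 1/2 + t/pi < 1 - exp(-t^2)/2  <=>  (1 - 2t/pi) * exp(t^2) > 1.
    return (1.0 - 2.0 * t / math.pi) * math.exp(t * t)

def bisect(f, lo, hi, tol=1e-12):
    # Plain bisection; assumes f(lo) and f(hi) have opposite signs.
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def excess(t):
    return q2(t) - 1.0

alpha = bisect(excess, 1.0, 1.1)     # excess changes sign from - to +
beta = bisect(excess, 1.15, 1.25)    # excess changes sign from + to -
print(alpha, beta)
```

Between the two roots `q2` exceeds $1$ , which is the interval on which the estimate (3) is violated on $S^{1}$ .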

Let us now pass to the discussion of bounds of the form (1) (or (4)) with $M=\mathbb{E}f$ , the expected value of $f$ . The first observation is that, in this context, the value of the constant $C$ can not be smaller than $1$ , which is shown by the following simple example. (The example is cooked up for $n=2$ , but clearly a simple modification with similar features can be produced for any $n\geqslant 2$ , see also Section 3.) Identify $S^{1}$ with $(-\pi,\pi]$ endowed with the normalized Lebesgue measure, which we will also denote by $\mu$ . Next, let $\delta\in(0,\pi)$ and let $f:[-\pi,\pi]\to\mathbb{R}$ be defined by $f(x)=\min\{|x|-\delta,0\}$ . Then $\mathbb{E}f=-\frac{\delta^{2}}{2\pi}$ and so, for $t\in(0,\frac{\delta^{2}}{2\pi})$ ,

\mu(\left\{f>\mathbb{E}f+t\right\})\geqslant\mu(\left\{f\geqslant 0\right\})=1-\frac{\delta}{\pi}.

Accordingly, if – for such $t$ – we have $\mu(\left\{f>\mathbb{E}f+t\right\})\leqslant C\,e^{-cnt^{2}}<C$ , then letting $\delta\to 0^{+}$ yields $C\geqslant 1$ .
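As a sanity check of the example, the following sketch approximates $\mathbb{E}f$ by a midpoint rule and compares it with $-\frac{\delta^{2}}{2\pi}$ . The function name `mean_and_mass` and the grid size are illustrative choices of ours, not part of the argument.

```python
import math

def mean_and_mass(delta, steps=20000):
    # f(x) = min(|x| - delta, 0) on (-pi, pi] with normalized Lebesgue measure.
    h = 2.0 * math.pi / steps
    total = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * h          # midpoint of the k-th cell
        total += min(abs(x) - delta, 0.0)
    mean = total * h / (2.0 * math.pi)        # numerical value of E f
    mass_zero = 1.0 - delta / math.pi         # mu({f >= 0}) = mu({|x| >= delta})
    return mean, mass_zero

for delta in (1.0, 0.1, 0.01):
    mean, mass = mean_and_mass(delta)
    print(delta, mean, -delta ** 2 / (2.0 * math.pi), mass)
```

As $\delta$ decreases, the mass of $\{f\geqslant 0\}$ approaches $1$ while $\mathbb{E}f$ tends to $0$ , which is the mechanism forcing $C\geqslant 1$ .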

Thus, when $M=\mathbb{E}f$ , the best bound we may hope for in the estimate (1) is $e^{-nt^{2}/2}$ . Somewhat surprisingly, a stronger fact is true: we also have (for $n>2$ ) a two-sided bound of the same form.

Theorem 3.

If $f:S^{n-1}\rightarrow\mathbb{R}$ is a function which is $L$ -Lipschitz with respect to the geodesic distance, then, for every $t\geqslant 0$ ,

(5)

\mu(\left\{f\geqslant\mathbb{E}f+t\right\})\leqslant e^{-nt^{2}/2L^{2}}\ \hbox{ for all }\ n\geqslant 2,

(6)

\mu(\left\{|f-\mathbb{E}f|\geqslant t\right\})\leqslant e^{-nt^{2}/2L^{2}}\ \hbox{ for all }\ n>2.

Remark.

(i) All the comments presented in the remark following Theorem 2 apply mutatis mutandis to the case $n=2$ of the inequality from (6). (ii) Obviously, the bound (6) is stronger than (5). However, we state the latter separately since it is also valid for $n=2$ . Moreover, its proof provides some additional information, is a good warmup for the harder proof of (6), and in fact some cases considered in the proof of (6) reduce to instances of (5). (iii) An alert reader will recall that a powerful standard tool for obtaining subgaussian estimates for deviations from the expected value of a random variable is the log-Sobolev inequality, and will wonder whether at least the one-sided part of the assertion of Theorem 3 (i.e., the inequality from (5)) can be derived that way. This is almost true, but not quite. Indeed, a log-Sobolev inequality for a Riemannian manifold $M$ is usually deduced from a bound on $c(M)$ , the Ricci curvature of $M$ . Now, $c(S^{n-1})=n-2$ , which by general arguments alluded to above leads to an estimate of the form (1) with $M=\mathbb{E}f$ , $C=1$ , and the coefficient of $t^{2}$ in the exponent on the right hand side equal to $\frac{1}{2}\frac{(n-1)\,c(S^{n-1})}{n-2}=\frac{n-1}{2}$ . The stronger two-sided estimate (6) presents further problems. ∎

A standard consequence of Theorem 3 is the following deviation result for the Gaussian space.

Corollary 4.

Let $\gamma=\gamma_{n}$ be the standard Gaussian measure on $\mathbb{R}^{n}$ and let $f:\mathbb{R}^{n}\rightarrow\mathbb{R}$ be an $L$ -Lipschitz function (with respect to the Euclidean distance). Next, let $M=\mathbb{E}f$ and $t>0$ . Then

(7)

\gamma(\left\{|f-M|\geqslant t\right\})\leqslant e^{-t^{2}/2L^{2}}.

Remark.

(i) When $M$ is the median of $f$ , the bound (7) is an easy consequence of the Gaussian isoperimetric inequality [4, 16] and the inequality $\gamma_{1}\big{(}[t,\infty)\big{)}\leqslant\frac{1}{2}e^{-t^{2}/2}$ . (ii) The Gaussian analogue of (5) follows from the log-Sobolev inequality via the so-called “Herbst argument” (see [8] or [2]), but we couldn’t easily find a reference giving the estimate (7) for the two-sided deviation. Indeed, the most frequently cited concentration result of that kind is $\gamma(\left\{|f-\mathbb{E}f|\geqslant t\right\})\leqslant 2e^{-t^{2}/2L^{2}}$ . While in our presentation Corollary 4 appears as an afterthought, the ubiquity of the normal distribution makes this result potentially at least as useful as Theorems 2 and 3. ∎
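A simple test case for (7) is the coordinate function $f(x)=x_{1}$ , which is $1$ -Lipschitz with mean $0$ and for which $\gamma(\{|f|\geqslant t\})$ is the two-sided tail of a standard normal variable. The sketch below checks the bound on a grid; it covers this one function only, and the names `two_sided_tail` and `bound` are ours.

```python
import math

def two_sided_tail(t):
    # For f(x) = x_1: gamma(|f| >= t) = P(|g| >= t) = erfc(t / sqrt(2)).
    return math.erfc(t / math.sqrt(2.0))

def bound(t, L=1.0):
    # Right-hand side of (7) for an L-Lipschitz function.
    return math.exp(-t * t / (2.0 * L * L))

ts = [0.05 * k for k in range(121)]            # t in [0, 6]
ratios = [two_sided_tail(t) / bound(t) for t in ts]
print(max(ratios), ratios[-1])
```

For this particular $f$ the ratio of the two sides equals $1$ at $t=0$ and decreases afterwards.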

Our methods are based on refinements of known ideas, a lot of elementary calculus, and some numerics. It would be nice to find better reasons for the results. Hopefully, now that we know that the inequalities hold, someone will come up with a more conceptual proof. Concerning a potential self-contained argument, see the Remark at the end of subsection 3.1. Otherwise, there are many relevant techniques that we did not explore seriously, or did not know how to use (for starters, measure transportation and semigroup methods, see [8], or the Curvature-Dimension-Diameter condition [10] in the context of Section 4), and we use only very weakly log-concavity of the marginals of $\mu$ (the measures $\nu=\nu_{n}$ defined by (25)), so there is hope.

2. Deviation from the median: proof of Theorem 2

In this section we will sketch the proof of Theorem 2, which, except for the calculus part, follows [11]. The derivation of the second statement from the first one is standard and well-known (see, e.g., [8] or [11]). In fact the two statements are formally equivalent; here is a sketch of the argument. (Modulo minor modifications, this argument works in any metric probability space.)

First, if $f:S^{n-1}\rightarrow\mathbb{R}$ and $M$ is the median of $f$ , then the set $A:=\{f\leqslant M\}$ verifies $\mu(A)\geqslant\frac{1}{2}$ , so (3) applies. Next, if $f$ is $1$ -Lipschitz, then $\{f>M+t\}\subset S^{n-1}\backslash A_{t}$ , and so any lower bound on $\mu(A_{t})$ implies an upper bound on $\mu(\left\{f>M+t\right\})$ , which is exactly what we need. The second inequality in (4) and the case of general Lipschitz constant $L$ follow easily.

In the opposite direction, if $\mu(A)\geqslant\frac{1}{2}$ , define $\phi(x)={\rm dist}(x,A)$ . Then $\phi$ is $1$ -Lipschitz, the median of $\phi$ is $0$ , and we have $\{\phi>t\}=S^{n-1}\backslash A_{t}$ (for any $t>0$ ). Again, this means that any upper bound on $\mu(\left\{\phi>t\right\})$ translates to a lower bound on $\mu(A_{t})$ , as needed.

From this point on we will concentrate on the estimate (3). The spherical isoperimetric inequality guarantees that, given $\mu(A)$ and $t>0$ , the value of $\mu(A_{t})$ is minimized when $A$ is a spherical cap. Accordingly, we need only consider the case of a spherical cap $K\subset S^{n-1}$ with $\mu(K)=1/2$ (i.e., a hemisphere). In other words, we need to show

Proposition 5.

If $n>2$ and $x\in[0,\pi/2]$ , then

(8)

\mu(S^{n-1}\backslash K_{x})\leqslant\frac{1}{2}e^{-nx^{2}/2}\ .

Proof.

We first note that $S^{n-1}\backslash K_{x}$ is again a spherical cap (whose radius in the geodesic distance is $r=\pi/2-x$ ), and so – following [11] – the left hand side of (8) can be rewritten as

(9)

\mu(S^{n-1}\backslash K_{x})=(2I_{n-2})^{-1}\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta=(2I_{n-2})^{-1}\int_{0}^{r}\sin^{n-2}\theta\,d\theta\ =:v(r),

where $I_{m}$ is the well-known Wallis integral

(10)

I_{m}:=\int_{0}^{\pi/2}\cos^{m}\theta\,d\theta.

[The precise value of $I_{m}$ is not important for the present argument, but for future reference we will cite some easy and well-known facts in Proposition 8 at the end of this section.] This means that the first assertion of Theorem 2 is equivalent to the inequality

(11)

q_{n}(x):=\frac{\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta}{I_{n-2}}e^{nx^{2}/2}\leqslant 1,

to be valid for $n>2$ and $x\in[0,\frac{\pi}{2}]$ . While numerical considerations suggest that, for each $x\in[0,\pi/2]$ , the sequence $(q_{n}(x))$ is nonincreasing, in view of the recurrence formula

(12)

\int\cos^{n}\theta\,d\theta=\frac{1}{n}\cos^{n-1}\theta\sin\theta+\frac{n-1}{n}\int\cos^{n-2}\theta\,d\theta,

it will be easier to compare $q_{n+2}(x)$ and $q_{n}(x)$ . Specifically, we will aim at proving that

(13)

q_{n+2}(x)\leqslant q_{n}(x),

which simplifies to

(14)

\int_{x}^{\pi/2}\cos^{n}\theta\,d\theta\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-x^{2}}\frac{I_{n}}{I_{n-2}}\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta.

Passing to the definite integrals $\int_{0}^{\pi/2}$ in the formula (12) yields $\frac{I_{n}}{I_{n-2}}=\frac{n-1}{n}$ . Substituting this value in (14), and further applying the recurrence formula (12) to the left hand side of (14), allows us to rewrite that inequality as an upper bound on $\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta$ , namely as

(15)

(n-1)(1-e^{-x^{2}})\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta\stackrel{{\scriptstyle?}}{{\leqslant}}\cos^{n-1}x\sin x\ .

The cosine integral appearing above can be upper-bounded as follows (using $\sin\theta\geqslant\sin x$ for $\theta\in[x,\pi/2]$ ):

\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta\leqslant\frac{1}{\sin x}\int_{x}^{\pi/2}\cos^{n-2}\theta\sin\theta\,d\theta=\frac{1}{n-1}\frac{\cos^{n-1}x}{\sin x}\ .

Applying this bound to the left hand side of (15), we see that it is now sufficient to show that

(16)

(1-e^{-x^{2}})\stackrel{{\scriptstyle?}}{{\leqslant}}\sin^{2}x\ \ \ \hbox{or }\ \ \ \cos x\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-x^{2}/2}

for $x\in[0,\pi/2]$ , which is a well known inequality used often in analysis. This inequality can be validated in many ways; for example, one may consider the power series of both sides, or take logarithm of both sides and repeatedly differentiate.
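Both the elementary inequality (16) and the monotonicity statement (13) it implies are easy to spot-check numerically. The following sketch does so on a grid, evaluating $q_{n}$ from (11) with a composite Simpson rule; the helper names, grid sizes, and floating-point tolerances are our own choices, and a grid check is of course no substitute for the proof.

```python
import math

def simpson(f, a, b, steps=400):
    # Composite Simpson rule; `steps` must be even.
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def q(n, x):
    # q_n(x) from (11): normalized tail integral times exp(n x^2 / 2).
    g = lambda th: math.cos(th) ** (n - 2)
    tail = simpson(g, x, math.pi / 2)
    full = simpson(g, 0.0, math.pi / 2)       # the Wallis integral I_{n-2}
    return tail / full * math.exp(n * x * x / 2.0)

xs = [k * (math.pi / 2) / 50 for k in range(50)]
# (16): cos x <= exp(-x^2 / 2) on [0, pi/2)
ok_16 = all(math.cos(x) <= math.exp(-x * x / 2.0) + 1e-15 for x in xs)
# (13): q_{n+2}(x) <= q_n(x), checked for n = 2, ..., 10
ok_13 = all(q(n + 2, x) <= q(n, x) + 1e-9 for n in range(2, 11) for x in xs)
print(ok_16, ok_13)
```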

To recapitulate, we have shown up to now that, for each $x\in[0,\pi/2]$ , the sequences $(q_{2n}(x))$ and $(q_{2n+1}(x))$ are nonincreasing. Thus, to deduce (11), we only need to establish that, for $x\in[0,\pi/2]$ , $q_{3}(x)\leqslant 1$ and $q_{4}(x)\leqslant 1$ . (The failure of $q_{2}(x)\leqslant 1$ on some subinterval $(\alpha,\beta)\subset[0,\pi/2]$ was the reason why the case $n=2$ had to be excluded from Theorem 2 and analyzed separately.) From (11), $q_{3}(x)$ and $q_{4}(x)$ are given by the following

(17)

q_{3}(x)=(1-\sin x)e^{3x^{2}/2},

(18)

q_{4}(x)=\frac{1}{\pi}\left(\pi-2x-2\cos x\sin x\right)e^{2x^{2}}.

The inequalities $q_{3}(x)\leqslant 1$ and $q_{4}(x)\leqslant 1$ can now be verified numerically or graphically, see Figure 1. Note that the only points where $q_{3}(x)$ or $q_{4}(x)$ is even close to $1$ are near $x=0$ (for which we have equality), but since $q_{3}^{\prime}(0)=-1$ and $q_{4}^{\prime}(0)=-\frac{4}{\pi}$ , we can be sure that the inequalities hold when $x$ is close to $0$ . (Note that, additionally, we know that $q_{4}(x)\leqslant q_{2}(x)$ , so that $q_{4}(x)\leqslant 1$ for sure holds outside the short interval $(\alpha,\beta)$ identified earlier.)

Figure 1. The plots of $q_{2}$ , $q_{3}$ , and $q_{4}$ . The bound $q_{2}\leqslant 1$ is not valid on some interval $(\alpha,\beta)\approx(1.05858,1.18588)$ , but $q_{4},q_{3}\leqslant 1$ (analytically) and $q_{4}\leqslant q_{3}$ , $q_{3}\leqslant q_{2}$ (numerically).
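The numerical verification mentioned above can be sketched as follows, using the closed forms (17) and (18) on a fine grid of $[0,\pi/2]$ and central differences for the derivatives at $0$ . The grid size, the step `eps`, and the function names are illustrative assumptions of ours.

```python
import math

def q3(x):
    # Formula (17)
    return (1.0 - math.sin(x)) * math.exp(1.5 * x * x)

def q4(x):
    # Formula (18)
    bracket = math.pi - 2.0 * x - 2.0 * math.cos(x) * math.sin(x)
    return bracket / math.pi * math.exp(2.0 * x * x)

xs = [k * (math.pi / 2) / 10000 for k in range(10001)]
max_q3 = max(q3(x) for x in xs)
max_q4 = max(q4(x) for x in xs)

# Central-difference estimates of q_3'(0) and q_4'(0)
eps = 1e-6
d3 = (q3(eps) - q3(-eps)) / (2.0 * eps)
d4 = (q4(eps) - q4(-eps)) / (2.0 * eps)
print(max_q3, max_q4, d3, d4)
```

On this grid both maxima are attained at $x=0$ , where $q_{3}=q_{4}=1$ , and the estimated slopes agree with $-1$ and $-\frac{4}{\pi}$ .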

Alternatively, we can analyze the inequalities by analytic means using the same techniques as those employed earlier in the proof of (16). However, the argument is more involved and at some stage numerics seem necessary. For example, taking the logarithm of both sides of $q_{3}(x)\leqslant 1$ leads to

\ln(1-\sin x)\leqslant\frac{-3x^{2}}{2}.

We again have equality when $x=0$ so we look at the derivatives and are led to

g(x):=3x-\frac{\cos x}{1-\sin x}\leqslant 0

(that is, if the above inequality holds in $(0,\pi/2)$ , then so does the previous one and we are done). By direct calculation, $g^{\prime}(x)=\frac{2-3\sin x}{1-\sin x}$ , and so it is apparent that $g$ has a unique maximum in $[0,\pi/2)$ at $x_{0}=\arcsin\frac{2}{3}$ . It remains to verify that $g(x_{0})\approx-0.046885<0$ , as needed. ∎
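The value of $g$ at its critical point can be reproduced directly. In the sketch below the grid check stops at $1.57<\pi/2$ to avoid the singularity of $g$ at $\pi/2$ ; that cutoff and the variable names are our choices.

```python
import math

def g(x):
    # g(x) = 3x - cos(x) / (1 - sin(x)); g <= 0 on [0, pi/2) gives q_3 <= 1.
    return 3.0 * x - math.cos(x) / (1.0 - math.sin(x))

x0 = math.asin(2.0 / 3.0)            # critical point: g'(x0) = 0 iff sin(x0) = 2/3
g_max = g(x0)
grid_max = max(g(k * 1.57 / 5000) for k in range(5001))
print(x0, g_max, grid_max)
```

Since $\cos x_{0}=\frac{\sqrt{5}}{3}$ and $1-\sin x_{0}=\frac{1}{3}$ , the maximum also has the closed form $g(x_{0})=3\arcsin\frac{2}{3}-\sqrt{5}$ .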

Remark.

It is possible to generalize the above argument to arrive at estimates involving $\frac{1}{2}e^{-t^{2}(n+\xi)/2}$ , where $\xi>0$ . We cannot expect such estimates to hold for all $n$ : we have already seen above that, even in the case $\xi=0$ , $n$ must be greater than $2$ . However, they may be true for $n\geqslant n(\xi)$ . Bounds of this type are relevant to improving constants in isoperimetric/concentration inequalities on $\big{(}S^{n-1}\big{)}^{k}$ for $k>1$ , the topic that is explored in Section 4. ∎

For future reference, we will present here other closely related bounds for the quantities appearing in Theorem 2. For simplicity, we will state them only as an upper bound for the volume of spherical caps; estimates of the form (3) and (4) follow then in the usual way.

Proposition 6.

For $0\leqslant r\leqslant\pi/2$ , the normalized volume of a spherical cap of geodesic radius $r$ in $S^{n-1}$ satisfies

(19)

v(r)\leqslant\frac{1}{2}\sin^{n-1}r

and, for $0\leqslant r<\pi/2$ ,

(20)

v(r)\leqslant(\sqrt{2\pi}\,\kappa_{n}\cos r)^{-1}\sin^{n-1}r,

where $\kappa_{n}=\frac{\sqrt{2}\;\Gamma\left(\frac{n+1}{2}\right)}{\Gamma\left(\frac{n}{2}\right)}$ . Moreover, the ratio $v(r)/\sin^{n-1}r$ is increasing on $(0,\pi/2]$ .

Remark.

The inequality (19) is equivalent to $\mu(S^{n-1}\backslash K_{x})\leqslant\frac{1}{2}\cos^{n-1}x$ , where $x=\pi/2-r$ , which is superior to (8) except for $x$ close to $0$ . Next, $n-1=\dim S^{n-1}$ is the best exponent that we may possibly expect since, for small $r$ , the volume of an $r$ -cap scales as $r^{n-1}$ . Finally, since $\kappa_{n}>\sqrt{n-\frac{1}{2}}$ , the bound (20) is superior to (19) except for $r$ very close to $\pi/2$ ; this will be exploited in the proof of inequality (6) from Theorem 3. ∎

Proof.

First, it is apparent that we have equality in (19) when $r=0$ and $r=\pi/2$ . Accordingly, (19) will follow once we prove the last statement, i.e., $r\to v(r)/\sin^{n-1}r$ being increasing on $(0,\pi/2]$ . To that end, recall that (cf. (9))

v(r)=(2I_{n-2})^{-1}\int_{0}^{r}\sin^{n-2}\theta\,d\theta,

while $\sin^{n-1}r=(n-1)\int_{0}^{r}\sin^{n-2}\theta\cos\theta\,d\theta$ . The conclusion follows now immediately from the following elementary fact.

Lemma 7.

Let $f:(a,b)\to(0,\infty)$ be integrable and let $h:[a,b)\to\mathbb{R}$ be nonincreasing. Then the function $x\to\frac{\int_{a}^{x}fh}{\int_{a}^{x}f}$ is nonincreasing.

Let us note that the Lemma was stated here with the hypotheses fitting the current setting, but it remains true under any reasonable assumptions that assure the quantities in question are well defined.

We skip the proof of (20); it can be found, together with a geometric proof of (19) and a few similar or related estimates, in Section 5.1.2 and Appendix A of [2]. Similar bounds were also established, e.g., in Lemma 2.1 of [3]. ∎
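The three assertions of Proposition 6 can be spot-checked numerically. The sketch below evaluates $v(r)$ from (9) by Simpson's rule, takes $I_{n-2}$ and $\kappa_{n}$ from their Gamma-function expressions, and tests (19), (20), and the monotonicity of $v(r)/\sin^{n-1}r$ on a grid. The dimensions tested, the grid, and the relative tolerance `1e-9` are assumptions of ours.

```python
import math

def simpson(f, a, b, steps=2000):
    # Composite Simpson rule; `steps` must be even.
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def wallis(m):
    # I_m via the Gamma-function formula for the Wallis integral
    return math.sqrt(math.pi) * math.gamma((m + 1) / 2) / (2.0 * math.gamma(m / 2 + 1))

def kappa(n):
    return math.sqrt(2.0) * math.gamma((n + 1) / 2) / math.gamma(n / 2)

def cap_volume(n, r):
    # v(r) from (9): normalized measure of a cap of geodesic radius r in S^{n-1}
    return simpson(lambda th: math.sin(th) ** (n - 2), 0.0, r) / (2.0 * wallis(n - 2))

def check(n, rs):
    ratios = []
    for r in rs:
        v, s = cap_volume(n, r), math.sin(r) ** (n - 1)
        ok19 = v <= 0.5 * s * (1.0 + 1e-9)
        ok20 = v <= s / (math.sqrt(2.0 * math.pi) * kappa(n) * math.cos(r)) * (1.0 + 1e-9)
        if not (ok19 and ok20):
            return False
        ratios.append(v / s)
    # v(r) / sin^{n-1}(r) should be nondecreasing along the grid
    return all(b >= a * (1.0 - 1e-9) for a, b in zip(ratios, ratios[1:]))

rs = [0.05 + 0.02 * k for k in range(76)]      # 0.05, ..., 1.55 < pi/2
results = {n: check(n, rs) for n in (3, 4, 10, 20)}
print(results)
```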

Finally, let us state for future reference some easy and well-known facts concerning the Wallis integral defined in (10).

Proposition 8.

If $I_{m}=\int_{0}^{\pi/2}\cos^{m}\theta\,d\theta$ then
(i) $I_{m}=\frac{\sqrt{\pi}\,\Gamma\left(\frac{m+1}{2}\right)}{2\Gamma\left(\frac{m}{2}+1\right)}$ for $m\geqslant 0$ ,
(ii) $\sqrt{\frac{\pi}{2m+2}}\leqslant I_{m}\leqslant\sqrt{\frac{\pi}{2m+1}}$ for $m\geqslant 1$ .
(iii) We have $I_{m}=\sqrt{\frac{\pi}{2}}\,\kappa_{m+1}^{-1}$ , where $\kappa_{s}$ is as in Proposition 6. The sequence $\left(\frac{\kappa_{s}}{\sqrt{s}}\right)$ is increasing to $1$ , and so $\big{(}I_{m-1}\sqrt{m}\big{)}$ and $\big{(}I_{m-2}\sqrt{m}\big{)}$ both decrease to $\sqrt{\frac{\pi}{2}}$ .

For these (and other, tighter and two-sided) estimates we refer the reader to, for example, Section 5.1.2 and Appendix A in [2].
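The facts collected in Proposition 8 are easy to confirm numerically for moderate $m$ . The sketch below compares the Gamma-function formula (i) with the values obtained from the recurrence $I_{m}=\frac{m-1}{m}I_{m-2}$ (a consequence of (12)), and then tests the bounds (ii) and the monotonicity of $\big{(}I_{m-1}\sqrt{m}\big{)}$ from (iii). The range $m\leqslant 300$ and the function names are arbitrary choices of ours.

```python
import math

def wallis_gamma(m):
    # Proposition 8(i), evaluated through log-Gamma to avoid overflow
    return 0.5 * math.sqrt(math.pi) * math.exp(
        math.lgamma((m + 1) / 2) - math.lgamma(m / 2 + 1))

def wallis_recursive(m_max):
    # I_0 = pi/2, I_1 = 1, and I_m = (m-1)/m * I_{m-2}
    vals = [math.pi / 2, 1.0]
    for m in range(2, m_max + 1):
        vals.append((m - 1) / m * vals[m - 2])
    return vals

I = wallis_recursive(300)
agree = all(abs(I[m] - wallis_gamma(m)) < 1e-10 for m in range(301))
bounds = all(
    math.sqrt(math.pi / (2 * m + 2)) <= I[m] <= math.sqrt(math.pi / (2 * m + 1))
    for m in range(1, 301))
# I_{m-1} * sqrt(m) should decrease towards sqrt(pi/2)
seq = [I[m - 1] * math.sqrt(m) for m in range(1, 301)]
decreasing = all(b < a for a, b in zip(seq, seq[1:]))
print(agree, bounds, decreasing, seq[-1], math.sqrt(math.pi / 2))
```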

3. Deviation from the mean: proofs of Theorem 3 and Corollary 4

In this section we will address the one-sided and the two-sided problem for the deviation from the mean: If $f$ is $1$ -Lipschitz and $t\geqslant 0$ , what are the bounds for $\mu(\{f\geqslant\mathbb{E}f+t\})$ and for $\mu(\{|f-\mathbb{E}f|\geqslant t\})$ ? The bulk of the argument will be devoted to the spherical case (Theorem 3); at the very end we will make a few comments about deducing the Gaussian result (Corollary 4).

As in the previous section, the proof of Theorem 3 splits into two parts: (i) identifying extremal instances for the problem at hand and (ii) obtaining tight estimates satisfied by those extremal instances. In the case of Theorem 2, the extremal objects were the spherical cap $K$ of measure $\frac{1}{2}$ , i.e., a hemisphere (in the context of (3)) and the function $\phi(x)={\rm dist}(x,K)$ (in the context of (4)). This will not suffice in the present setting since $\mathbb{E}f$ depends on the entire distribution of $f$ , i.e., on the values of $\mu(\{f\geqslant t\})$ for all $t\in\mathbb{R}$ , and not only on the values of $t$ for which that measure is $\frac{1}{2}$ . However, allowing caps of arbitrary measure leads to a sufficiently rich family of functions, which we will describe in a moment.

3.1. The one-sided problem : Proof of (5)

We will start with the following simple lemma.

Lemma 9.

Let $\nu$ be a probability measure on $\mathbb{R}$ such that $\int|x|d\nu(x)<\infty$ . Next, let $\mathcal{L}$ be the set of functions from $\mathbb{R}$ to $\mathbb{R}$ that are $1$ -Lipschitz and nondecreasing. For $t>0$ , consider the optimization problem

(21)

\sup_{\psi\in\mathcal{L}}\ \nu\big{(}\{\psi\geqslant\mathbb{E}\psi+t\}\big{)},

where $\mathbb{E}$ stands for the expected value in the probability space $(\mathbb{R},\nu)$ . Then, for each $t$ , it is enough to restrict the supremum to the subfamily of $\mathcal{L}$ consisting of functions of the form

(22)

\phi_{a}(\theta)=\left\{\begin{array}[]{cl}\theta&{\rm if}\quad\theta\leqslant a\\ a&{\rm if}\quad\theta\geqslant a\end{array}\right.,

where $a\in\mathbb{R}$ is a parameter. Moreover, for each $a$ it is sufficient to consider $t=t(a):=a-\mathbb{E}\phi_{a}$ . That is, if $\lambda:(0,\infty)\to[0,1]$ is a nonincreasing function satisfying $\nu(\{\phi_{a}\geqslant\mathbb{E}\phi_{a}+t(a)\})\leqslant\lambda(t(a))$ for all $a\in\mathbb{R}$ , then it is also true that $\nu(\{\psi\geqslant\mathbb{E}\psi+t\})\leqslant\lambda(t)$ for all $t>0$ and all $\psi\in\mathcal{L}$ .

Proof.

Denote $U(\psi,t):=\{\psi\geqslant\mathbb{E}\psi+t\}$ . For the first assertion of the Lemma, we need to show that, for $t>0$ ,

(23)

\sup_{\psi\in\mathcal{L}}\ \nu\left(U(\psi,t)\right)\leqslant\sup_{a\in\mathbb{R}}\ \nu\left(U(\phi_{a},t)\right),

the converse inequality being trivial. To that end, fix $\psi\in\mathcal{L}$ and $t>0$ . Since $\psi$ is nondecreasing and continuous, the set $U(\psi,t)$ is either empty, or of the form $[a_{0},\infty)$ for some $a_{0}\in\mathbb{R}$ , and necessarily $\psi(a_{0})=\mathbb{E}\psi+t$ . Next, since adding a constant to any function $\zeta$ doesn’t change the set $U(\zeta,t)$ , we may just as well assume that $\psi(a_{0})=a_{0}$ .

Consider now the function $\phi_{a_{0}}$ as defined by (22). Then

$\bullet$ $\psi(x)\geqslant\phi_{a_{0}}(x)=a_{0}$ for $x\geqslant a_{0}$ with equality for $x=a_{0}$ (by construction)
$\bullet$ $\psi(x)\geqslant\phi_{a_{0}}(x)=x$ for $x\leqslant a_{0}$ (because $\psi$ is $1$ -Lipschitz).
(The reader is advised to draw a picture.) Thus $\psi\geqslant\phi_{a_{0}}$ everywhere on $\mathbb{R}$ and it follows in particular that $\mathbb{E}\phi_{a_{0}}\leqslant\mathbb{E}\psi$ . Consequently, if we choose $t_{0}$ so that $\mathbb{E}\phi_{a_{0}}+t_{0}=\mathbb{E}\psi+t$ , then $t_{0}\geqslant t$ . On the other hand, $U(\phi_{a_{0}},t_{0})=\{\phi_{a_{0}}\geqslant a_{0}\}=[a_{0},\infty)=U(\psi,t)$ , and since for any $g$ and any $t\leqslant t_{0}$ we obviously have $U(g,t_{0})\subset U(g,t)$ , it follows that

\nu(U(\phi_{a_{0}},t))\geqslant\nu(U(\phi_{a_{0}},t_{0}))=\nu(U(\psi,t)).

Since $\psi\in\mathcal{L}$ was arbitrary, (23) follows. The second assertion of the Lemma is a consequence of the fact that to upper-bound $\nu(U(\psi,t))$ we only need information about $U(\phi_{a_{0}},t_{0})$ for some $t_{0}\geqslant t$ , with the equality $t_{0}=t(a_{0})$ being implicit in the definition of the latter set. ∎
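The comparison at the heart of the proof can be illustrated on a toy example. In the sketch below, the measure $\nu$ (uniform on equally spaced atoms in $[-3,3]$ ) and the function $\psi(x)=\frac{1}{2}x+\frac{3}{10}\tanh x$ are hypothetical test data chosen by us; the code locates $a_{0}$ by bisection and compares $\nu(U(\psi,t))$ with $\nu(U(\phi_{a_{0}},t))$ .

```python
import math

# Hypothetical test data (not from the text): nu is uniform on 2001 equally
# spaced atoms in [-3, 3]; psi has 0.5 <= psi' <= 0.8, so psi belongs to L.
ATOMS = [-3.0 + 6.0 * k / 2000 for k in range(2001)]

def psi(x):
    return 0.5 * x + 0.3 * math.tanh(x)

def mean(fn):
    return sum(fn(x) for x in ATOMS) / len(ATOMS)

def mass(pred):
    return sum(1 for x in ATOMS if pred(x)) / len(ATOMS)

def compare(t):
    level = mean(psi) + t
    lo, hi = -10.0, 10.0                  # psi(lo) < level <= psi(hi) for the t used here
    for _ in range(100):                  # bisection for a0 with psi(a0) = level
        mid = 0.5 * (lo + hi)
        if psi(mid) < level:
            lo = mid
        else:
            hi = mid
    a0 = hi
    phi_mean = mean(lambda x: min(x, a0))               # E phi_{a0}, cf. (22)
    lhs = mass(lambda x: psi(x) >= level)               # nu(U(psi, t))
    rhs = mass(lambda x: min(x, a0) >= phi_mean + t)    # nu(U(phi_{a0}, t))
    return lhs, rhs

for t in (0.1, 0.5, 1.0, 1.5):
    print(t, compare(t))
```

In each printed pair the second entry is at least as large as the first, in line with (23).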

Remark.

The second assertion of Lemma 9 can be strengthened and simplified as follows: For any $t>0$ , the supremum from (21) is attained for some $\psi=\phi_{a}$ with $a$ verifying $t=a-\mathbb{E}\phi_{a}$ . We stated the weaker version since it is sufficient for our purposes and easier to prove, but since the Lemma may be of independent interest, we include a sketch of the proof of the stronger fact in the Appendix. ∎

We will now sketch a very well known reduction argument that allows us to derive from Lemma 9 the form of the extremal functions for the one-sided problem (5) from Theorem 3. (The same argument will work for the two-sided problem (6) once we establish a two-sided analogue of Lemma 9.)

Let $f$ be any (say, Borel) function on $S^{n-1}$ and let $f^{*}$ be its rearrangement (i.e., verifying $\mu(\{f^{*}\geqslant t\})=\mu(\{f\geqslant t\})$ for any $t\in\mathbb{R}$ ) that is of the form

(24)

f^{*}(u)=g(u_{1}),

where $u_{1}=\left\langle u,e_{1}\right\rangle$ is the first coordinate of $u\in S^{n-1}$ and $g:[-1,1]\to\mathbb{R}$ is nondecreasing. This is a standard procedure that works for any random variable and any non-atomic probability measure on $\mathbb{R}$ , but in our setting it has an additional feature: if $f$ is $1$ -Lipschitz, so is $f^{*}$ . Indeed, suppose $f$ is $1$ -Lipschitz and let $v,w\in S^{n-1}$ . We need to show that if $t=f^{*}(w)=g(w_{1})$ , $s=f^{*}(v)=g(v_{1})$ , then $\varepsilon:=|t-s|\leqslant{\rm dist}(v,w)$ .

By symmetry, we may assume that $s<t$ (hence $v_{1}<w_{1}$ ). By construction, the sets $K=\{u\in S^{n-1}:f^{*}(u)\leqslant s\}$ and $L=\{u\in S^{n-1}:f^{*}(u)\geqslant t\}$ are “opposite” spherical caps with “parallel” boundaries with $v\in K,w\in L$ . Likewise, by construction, if $A=\{f\leqslant s\}$ and $B=\{f\geqslant t\}$ , then $\mu(K)=\mu(A)$ and $\mu(L)=\mu(B)$ . Next, since $f$ is $1$ -Lipschitz, it follows that ${\rm dist}(A,B)\geqslant\varepsilon$ ; in other words, $B\subset(A_{\varepsilon})^{c}$ , where $A_{\varepsilon}$ is the $\varepsilon$ -enlargement of $A$ defined in Fact 1. We now appeal to the isoperimetric inequality on $S^{n-1}$ to conclude that $\mu\big{(}(K_{\varepsilon})^{c}\big{)}\geqslant\mu\big{(}(A_{\varepsilon})^{c}\big{)}\geqslant\mu(B)=\mu(L)$ . Since both $L$ and $(K_{\varepsilon})^{c}$ are (closed) “left” spherical caps, we deduce that $L\subset(K_{\varepsilon})^{c}$ or, equivalently, ${\rm dist}(K,L)\geqslant\varepsilon$ , and – in particular – ${\rm dist}(v,w)\geqslant\varepsilon$ , as needed.

Since all quantities depending on the distribution are identical for $f$ and $f^{*}$ , it follows that for estimates such as (5) it is enough to consider functions of the form (24). Further, since the geodesic distance between the two “parallels” $\{u\in S^{n-1}:u_{1}=\alpha\}$ and $\{u\in S^{n-1}:u_{1}=\beta\}$ equals $|\arcsin\alpha-\arcsin\beta|$ , the function $f^{*}$ defined by (24) is $1$ -Lipschitz on $S^{n-1}$ if and only if $\psi(\theta):=g(\sin\theta)$ is $1$ -Lipschitz on $[-\pi/2,\pi/2]$ . Putting all these observations together, we conclude that the one-sided bound (5) for given $n\geqslant 2$ is an instance of (21) with $\nu$ being the push-forward of $\mu=\mu_{n}$ under the map $S^{n-1}\ni u\to\arcsin u_{1}\in[-\pi/2,\pi/2]$ , with the extremal functions given by (22). (Note that the functions in (22) are a priori defined on $\mathbb{R}$ , but only their restrictions to $[-\pi/2,\pi/2]$ and the values $a\in[-\pi/2,\pi/2]$ are relevant to the problem at hand. However, for other reference measures (for example, the Gaussian measure), test functions defined on $\mathbb{R}$ and all values of $a\in\mathbb{R}$ may be needed.)

As was (implicitly) determined in Section 2, $\nu=\nu_{n}$ is then of the form

(25)

d\nu(\theta)=(2I_{n-2})^{-1}\cos^{n-2}\theta\,d\theta,\quad\theta\in[-\pi/2,\pi/2],

where $I_{m}$ is defined by (10). (For definiteness, we will assume that the density of $\nu$ is $0$ outside of the interval $[-\pi/2,\pi/2]$ .) Accordingly, verifying the one-sided estimates from Theorem 3 numerically for “not-too-large” $n$ , and analytically for “small” $n$ , is completely straightforward. First, given $a\in[-\pi/2,\pi/2]$ and $t\geqslant 0$ such that $\mathbb{E}\phi_{a}+t=a$ , the set $\{\phi_{a}\geqslant\mathbb{E}\phi_{a}+t\}$ is exactly $[a,\infty)$ , and its measure is

\nu\big{(}[a,\pi/2]\big{)}=(2I_{n-2})^{-1}\int_{a}^{\pi/2}\cos^{n-2}\theta\,d\theta,

which is the Haar measure of the corresponding cap

(26)

K^{a}=\{u\in S^{n-1}:u_{1}\geqslant\sin a\},

already analyzed in Section 2 (at least for $a\geqslant 0$ , see Proposition 5; note that, in the notation from that section, $K^{a}=S^{n-1}\setminus K_{a}$ ). This quantity (as a function of $a\in[-\pi/2,\pi/2]$ ) needs to be compared with $e^{-nt^{2}/2}$ , where $t=t_{n}(a)=a-\mathbb{E}\phi_{a}$ , which can be rewritten as

(27)

t=a+\mathbb{E}(\theta-a)^{+}=a+(2I_{n-2})^{-1}\int_{\theta=a}^{\pi/2}(\theta-a)\cos^{n-2}\theta\,d\theta,

where $\theta$ , $(\theta-a)^{+}$ are understood as random variables in the probability space $(\mathbb{R},\nu)$ , and we use the fact that $\mathbb{E}\theta=0$ . For future reference, let us note that (27) can be restated as follows

(28)

t=a+\int_{a}^{\pi/2}\nu(\{\theta>x\})\,dx=a+\int_{a}^{\pi/2}\mu(K^{x})\,dx\,;

this is because for any nonnegative random variable $X$ one has $\mathbb{E}X=\int_{0}^{\infty}\mathbb{P}(X>x)\,dx$ .
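The agreement between (27) and (28) can be checked numerically. The sketch below computes $t_{n}(a)$ both ways with Simpson's rule and, for $n=3$ , compares with the closed form $t=\frac{a}{2}+\frac{\pi}{4}-\frac{\cos a}{2}$ , which we obtained by integrating $\mu(K^{x})=\frac{1}{2}(1-\sin x)$ ; the function names and step counts are our own choices.

```python
import math

def simpson(f, a, b, steps=400):
    # Composite Simpson rule; `steps` must be even.
    if b <= a:
        return 0.0
    h = (b - a) / steps
    total = f(a) + f(b)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * f(a + k * h)
    return total * h / 3.0

def wallis(m):
    return simpson(lambda th: math.cos(th) ** m, 0.0, math.pi / 2)

def cap_measure(n, x):
    # mu(K^x) = nu([x, pi/2]) for the density (25)
    tail = simpson(lambda th: math.cos(th) ** (n - 2), x, math.pi / 2)
    return tail / (2.0 * wallis(n - 2))

def t_via_27(n, a):
    integrand = lambda th: (th - a) * math.cos(th) ** (n - 2)
    return a + simpson(integrand, a, math.pi / 2) / (2.0 * wallis(n - 2))

def t_via_28(n, a):
    return a + simpson(lambda x: cap_measure(n, x), a, math.pi / 2, steps=200)

a = 0.3
closed_form_n3 = a / 2.0 + math.pi / 4.0 - math.cos(a) / 2.0
print(t_via_27(3, a), t_via_28(3, a), closed_form_n3)
print(t_via_27(10, -0.5), t_via_28(10, -0.5))
```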

Note that if $a$ is close to (but strictly greater than) $-\pi/2$ , then $\phi_{a}\equiv a$ on a set of nearly full measure, while $\mathbb{E}\phi_{a}<a$ . Consequently, for $t=a-\mathbb{E}\phi_{a}>0$ we have $\mu(\{\phi_{a}\geqslant\mathbb{E}\phi_{a}+t\})=\mu(\{\phi_{a}\geqslant a\})\approx 1$ , while $e^{-nt^{2}/2}<1$ (in fact also necessarily $\approx 1$ ). This is another argument showing that, in the present context, one can not hope for the multiplicative constant $C$ in the bound of type (1) to be strictly smaller than $1$ .

To summarize, we have shown that the validity of the one-sided bound from (5) for given $n$ will follow from (and in fact is equivalent to)

(29)

\mu(K^{a})\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-nt^{2}/2},

where $t=t_{n}(a)$ is defined via (27) or (28). Here is a sketch of the calculation showing that (29) holds for $a\geqslant 0$ . (In fact, we will see that in that range a tighter bound, with a better constant $C<1$ , can be found. The argument from the proof of Lemma 9 implies then that a version of (5) with that improved constant $C$ holds for all $1$ -Lipschitz functions and all $t$ above the threshold given by (27) or (28) with $a=0$ . The so calculated threshold depends on $n$ and is asymptotically equivalent to $(2\pi n)^{-1/2}$ .)

Since we know from Proposition 5 (at least for $n>2$ , which we assume) that the measure of the cap in question does not exceed $\frac{1}{2}e^{-na^{2}/2}$ , it is enough to show that

(30)

\frac{1}{2}e^{-na^{2}/2}\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-nt^{2}/2}=e^{-n(a+\eta)^{2}/2},

where

(31)

\eta:=t-a=-\mathbb{E}\phi_{a}=\int_{a}^{\pi/2}\mu(K^{x})\,dx

(cf. (28)). Taking logarithms of both sides of the inequality (30) shows that it is equivalent to

(a+\eta)^{2}\stackrel{{\scriptstyle?}}{{\leqslant}}a^{2}+\frac{2\ln 2}{n}

and finally to

\eta\stackrel{{\scriptstyle?}}{{\leqslant}}\sqrt{a^{2}+\frac{2\ln 2}{n}}-a.

On the other hand, appealing again to the estimate $\mu(K^{x})\leqslant\frac{1}{2}e^{-nx^{2}/2}$ , we can upper-bound $\eta$ by $\frac{1}{2}\int_{a}^{\pi/2}e^{-nx^{2}/2}\,dx<\frac{1}{2}\int_{a}^{\infty}e^{-nx^{2}/2}\,dx$ , so it is enough to show that, for any $a\geqslant 0$ and $n>2$ ,

\frac{1}{2}\int_{a}^{\infty}e^{-nx^{2}/2}\,dx\stackrel{{\scriptstyle?}}{{\leqslant}}\sqrt{a^{2}+\frac{2\ln 2}{n}}-a.

Let us now change variables via $a=\frac{u}{\sqrt{n}}$ and, inside the integral, $x=\frac{s}{\sqrt{n}}$ to get an equivalent dimension-free form

\frac{1}{2}\int_{u}^{\infty}e^{-s^{2}/2}\,ds\stackrel{{\scriptstyle?}}{{\leqslant}}\sqrt{u^{2}+{2\ln 2}}-u.

This inequality is easy to confirm; in fact, a much sharper bound $\frac{1}{2}\int_{u}^{\infty}e^{-s^{2}/2}\,ds<(\sqrt{u^{2}+1}-u)e^{-u^{2}/2}$ follows from the well-known Komatu inequality ([6], or see Remark 4 in [17])

(32)

\int_{u}^{\infty}e^{-s^{2}/2}ds\leqslant\frac{2e^{-u^{2}/2}}{u+\sqrt{u^{2}+2}}.

As is easy to check, the above argument yields (for $a\geqslant 0$ and $n>2$ ) the bound in (29) that is of the form $Ce^{-nt^{2}/2}$ with $C=\frac{e^{1/2}}{2}\approx 0.8244<1$ . Further improvement is possible if one replaces the use of the Komatu inequality (32) by the more precise bound $\int_{u}^{\infty}e^{-s^{2}/2}\,ds\leqslant\frac{4e^{-u^{2}/2}}{3u+\sqrt{u^{2}+8}}$ ([13], or see Proposition 3 in [17]); a numerical check suggests that the constant $C=0.53$ works. This shows that – except for small values of $t>0$ – the bound (5) is not very sharp. However, this is a feature, not a bug: it provides the wiggle room needed to deliver the two-sided bound (6).
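The dimension-free inequality and the Komatu bound (32) can be spot-checked numerically. The sketch below is only illustrative: it uses the standard identity $\int_{u}^{\infty}e^{-s^{2}/2}\,ds=\sqrt{\pi/2}\,\mathrm{erfc}(u/\sqrt{2})$, a finite grid of our choosing, and hypothetical helper names.

```python
import math

def gauss_tail(u):
    """int_u^inf exp(-s^2/2) ds, via the complementary error function."""
    return math.sqrt(math.pi / 2) * math.erfc(u / math.sqrt(2))

def komatu_upper(u):
    """Right-hand side of (32)."""
    return 2 * math.exp(-u * u / 2) / (u + math.sqrt(u * u + 2))

def target(u):
    """Right-hand side of the dimension-free inequality."""
    return math.sqrt(u * u + 2 * math.log(2)) - u

def sharper(u):
    """The sharper bound (sqrt(u^2 + 1) - u) * exp(-u^2 / 2)."""
    return (math.sqrt(u * u + 1) - u) * math.exp(-u * u / 2)

grid = [0.05 * i for i in range(201)]  # u in [0, 10]
komatu_ok = all(gauss_tail(u) <= komatu_upper(u) for u in grid)
target_ok = all(0.5 * gauss_tail(u) <= target(u) for u in grid)
sharper_ok = all(0.5 * gauss_tail(u) < sharper(u) for u in grid)
C = math.exp(0.5) / 2  # the constant obtained from the sharper bound

if __name__ == "__main__":
    print(komatu_ok, target_ok, sharper_ok, C)
```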

Concerning the case $n=2$ of (29), the verification is – as pointed out earlier – completely straightforward. Indeed, a direct computation leads to $\mu(K^{a})=\frac{1}{2}-\frac{a}{\pi}$ and $t=t(a)=\frac{(a+\pi/2)^{2}}{2\pi}$ and there is no doubt that (29) holds in the entire non-trivial range $-\pi/2\leqslant a\leqslant\pi/2$ , with equality iff $a=-\pi/2$ .
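For the reader's convenience, here is one way to spell out the direct computation for $n=2$, assuming (as the normalization above suggests) that $\nu_{2}$ has constant density $1/\pi$ on $[-\pi/2,\pi/2]$; the auxiliary variable $s$ is our notation.

```latex
% n = 2: \nu_2 has constant density 1/\pi on [-\pi/2, \pi/2]
\mu(K^{a}) = \frac{1}{\pi}\int_{a}^{\pi/2} d\theta = \frac{1}{2}-\frac{a}{\pi},
\qquad
t = a+\frac{1}{\pi}\int_{a}^{\pi/2}(\theta-a)\,d\theta
  = a+\frac{(\pi/2-a)^{2}}{2\pi}
  = \frac{(a+\pi/2)^{2}}{2\pi}.
% with s := (a+\pi/2)/\pi \in [0,1], inequality (29) for n = 2 reads
1-s \leqslant \exp\big(-\pi^{2}s^{4}/4\big).
```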

For $a<0$ , we (trivially) have equality in (29) when $a=-\pi/2$ (and $t=0$ , for all $n$ ), but otherwise the bound $e^{-nt^{2}/2}$ does not seem very tight. (This can be seen heuristically by expanding the quantities in question in powers of $z=a+\pi/2$ if $z$ is small, and approximating the random variable $\sqrt{n}\,\theta$ by a standard normal random variable if $\theta$ is not “too large.”) For a rigorous argument, observe first that, for $a<0$ , it is more transparent to rewrite the formula for $t$ as

(33)

t=a+\mathbb{E}(\theta-a)^{+}=\mathbb{E}(\theta-a)^{-}=\int_{-a}^{\pi/2}\nu(\{\theta>x\})\,dx=\int_{-a}^{\pi/2}\mu(K^{x})\,dx.

In other words, we need to show that if $b:=-a\in[0,\pi/2]$ , then

(34)

t=\int_{b}^{\pi/2}\mu(K^{x})\,dx\quad\stackrel{{\scriptstyle?}}{{\Rightarrow}}\quad 1-\mu(K^{b})\leqslant e^{-nt^{2}/2},

where we used $\nu\big{(}[a,\infty)\big{)}=1-\nu\big{(}[b,\infty)\big{)}=1-\mu(K^{b})$ . Note that, in the present context, the relationship between $t$ and $b$ is exactly the same as the relationship between $\eta$ and $a$ was in the case $a\geqslant 0$ . Next observe that in (34) we are in a different regime than for $a\geqslant 0$ : unless $b>0$ is very small, both sides in the last inequality are close to $1$ . Accordingly, it is more appropriate to restate the conclusion of (34) as a lower bound on the probability of the complementary event

(35)

t=\int_{b}^{\pi/2}\mu(K^{x})\,dx\quad\stackrel{{\scriptstyle?}}{{\Rightarrow}}\quad\mu(K^{b})\geqslant 1-e^{-nt^{2}/2}.

To further facilitate concentrating on the values of $b$ that are close to $\pi/2$ , we change variables to $y:=\pi/2-x$ and $\alpha=\pi/2-b$ . The statement (35) becomes then

(36)

t=\int_{0}^{\alpha}v(y)\,dy\quad\stackrel{{\scriptstyle?}}{{\Rightarrow}}\quad v(\alpha)\geqslant 1-e^{-nt^{2}/2},

where

(37)

v(r)=v_{n}(r):=\mu(K^{\pi/2-r})=(2I_{n-2})^{-1}\int_{0}^{r}\sin^{n-2}\theta\,d\theta

is the normalized volume of a spherical cap of geodesic radius $r$ in $S^{n-1}$ , the function that was already defined in (9). We now appeal to Proposition 6 to obtain

(38)

t=t_{n}(\alpha)\leqslant\frac{1}{2}\int_{0}^{\alpha}\sin^{n-1}y\,dy=I_{n-1}v_{n+1}(\alpha).

The next step is the following simple observation.

Lemma 10.

If $\alpha\in[0,\pi/2]$ and $k\geqslant 2$ , then $v_{k+1}(\alpha)\leqslant v_{k}(\alpha)$ .

Proof.

There is equality for $\alpha=0$ and $\alpha=\pi/2$ , and the function $x\to v_{k}(x)-v_{k+1}(x)$ has the derivative $\frac{\sin^{k-2}x}{2I_{k-2}}-\frac{\sin^{k-1}x}{2I_{k-1}}$ , which is positive on $[0,\alpha_{0}]$ and negative on $[\alpha_{0},\pi/2]$ , where $\alpha_{0}$ is such that $\sin\alpha_{0}=I_{k-1}/I_{k-2}$ . ∎

Combining Lemma 10 and (38) we see that (36) will follow if

(39)

v_{n}(\alpha)\stackrel{{\scriptstyle?}}{{\geqslant}}1-e^{-n(I_{n-1}v_{n}(\alpha))^{2}/2}.

Since $e^{-u}\geqslant 1-u$ , the above can be further strengthened to

(40)

v_{n}(\alpha)\stackrel{{\scriptstyle?}}{{\geqslant}}n\big{(}I_{n-1}v_{n}(\alpha)\big{)}^{2}/2,

which in turn is equivalent to

(41)

nI_{n-1}^{2}v_{n}(\alpha)\stackrel{{\scriptstyle?}}{{\leqslant}}2.

This is evidently true for $n\geqslant 2$ since $v_{n}(\alpha)\leqslant v_{n}(\pi/2)=\frac{1}{2}$ and $I_{n-1}^{2}\leqslant\frac{\pi}{2(n-1)}$ by Proposition 8(ii), which concludes the proof of (5).
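The final estimate can be illustrated numerically. The sketch below assumes $I_{m}=\int_{0}^{\pi/2}\cos^{m}\theta\,d\theta$ and computes it through the classical recursion $I_{m}=\frac{m-1}{m}I_{m-2}$; the function names are ours. It evaluates the left-hand side of (41) at the largest admissible value $v_{n}(\pi/2)=\frac{1}{2}$.

```python
import math

def wallis(m):
    """I_m = int_0^{pi/2} cos^m (assumed meaning of I_m), via I_m = (m-1)/m * I_{m-2}."""
    val = math.pi / 2 if m % 2 == 0 else 1.0
    for j in range(2 + m % 2, m + 1, 2):
        val *= (j - 1) / j
    return val

def lhs_41(n):
    """n * I_{n-1}^2 * v_n(alpha) at the largest value v_n(pi/2) = 1/2."""
    return n * wallis(n - 1) ** 2 / 2

if __name__ == "__main__":
    for n in (2, 3, 10, 100):
        print(n, lhs_41(n), wallis(n - 1) ** 2, math.pi / (2 * (n - 1)))
```

Under this reading of $I_{m}$, the product $I_{k}I_{k+1}$ equals $\frac{\pi}{2(k+1)}$, which is where the bound $I_{n-1}^{2}\leqslant I_{n-2}I_{n-1}=\frac{\pi}{2(n-1)}$ comes from.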

Remark.

We point out that the two crucial inequalities appearing in the proof, namely (29) and the conclusion of (35), are de facto functional inequalities relating the function $\eta(\cdot)$ and its derivative or, in the formulation in the spirit of (36), the function $v_{n}(\cdot)$ and its primitive. Accordingly, it is conceivable that once one comes up with a manageable related differential inequality, these functional inequalities would follow. (This could be parallel to the proofs of Komatu-like inequalities, see, e.g., Proposition 3 in [17], or Exercise A.2 in [2].) Similar comments apply to the proofs of the two-sided bound (6) and Corollary 4 in the next two subsections. ∎

3.2. The two-sided problem : Proof of (6)

We now pass to the analysis of the estimate (6) from Theorem 3, i.e., the bound for $\mu(\{|f-\mathbb{E}f|\geqslant t\})$ . The initial step, a reduction to the case of $\psi\in\mathcal{L}=\{\psi:\mathbb{R}\to\mathbb{R},\psi\hbox{ is }\ 1\hbox{-Lipschitz and nondecreasing}\}$ and the reference measure $\nu=\nu_{n}$ defined by (25), is the same as for the one-sided problem. The second step, the analogue of Lemma 9, i.e., a reduction to functions $\phi_{a}$ from (22), is slightly more involved, and we impose some mild restrictions on the reference measure $\nu$ , which needs to be symmetric and unimodal (by the latter we mean that $d\nu(x)=\rho(|x|)\,dx$ , where $\rho:\mathbb{R}^{+}\to\mathbb{R}^{+}$ is nonincreasing).

Lemma 11.

Let $\nu$ be a symmetric unimodal probability measure on $\mathbb{R}$ such that $\int|x|d\nu(x)<\infty$ and let $t>0$ . Then

(42)

\sup_{\psi\in\mathcal{L}}\ \nu\left(\{|\psi-\mathbb{E}\psi|\geqslant t\}\right)=\sup_{a\in\mathbb{R}}\ \nu\left(\{|\phi_{a}-\mathbb{E}\phi_{a}|\geqslant t\}\right).

Moreover, for each $a$ it is sufficient to consider $t=t(a)=a-\mathbb{E}\phi_{a}$ .

The proof of the Lemma is elementary, but on the complicated side. We relegate it to the Appendix.

Similarly as was the case in the one-sided setting, Lemma 11 reduces – for a specific density $\nu$ – the inequality of type (6) to a comparison of two concrete functions of the parameter $a$ , which can be verified numerically. For added rigor, this should be accompanied by an asymptotic analysis of the quantities in question when $t\to 0$ (note that as $t\to 0$ , both sides of (6) typically converge to $1$ ) and – if $\nu$ is not compactly supported – as $a\to\infty$ . In particular, it is routine to check whether (6) holds for any particular value of $n$ . For $n=2$ there is a failure on the same interval for which Theorem 2 failed for $n=2$ , and the failure happens for the same reason: in the reformulation in terms of $\nu=\nu_{2}$ given by (25), consider the function $\psi(\theta)=\theta$ and note that since the density of $\nu_{n}$ is constant for $n=2$ , replacing the median by the mean does not make any difference. For $n=3$ , the integrals involved in the definitions of $\mathbb{E}\phi_{a}$ , $t=t(a)$ , and the relevant probabilities can be explicitly evaluated and the resulting graphs look as in Figure 2.

Clearly, the only values of $a$ that are questionable are those close to $-\pi/2$ , but it is readily verified that $F(a)=\nu_{3}\left(\{|\phi_{a}-\mathbb{E}\phi_{a}|\geqslant t\}\right)=1-\frac{1}{12}(a+\pi/2)^{4}+O((a+\pi/2)^{6})$ , while $G(a)=e^{-nt(a)^{2}/2}=1-\frac{1}{96}(a+\pi/2)^{6}+O((a+\pi/2)^{8})$ . Consequently, comparing $\big{(}1-F(a)\big{)}(a+\pi/2)^{-4}$ vs. $\big{(}1-G(a)\big{)}(a+\pi/2)^{-4}$ will show a clear separation.
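The case $n=3$ lends itself to a closed-form sanity check. The sketch below rests on assumptions that we state explicitly: $\nu_{3}$ has density $\frac{1}{2}\cos\theta$ on $[-\pi/2,\pi/2]$ and $\phi_{a}(\theta)=\min(\theta,a)$, which is consistent with the relations $t=a-\mathbb{E}\phi_{a}$ and (27). Under these assumptions one finds $t(a)=\frac{1}{2}(a+\pi/2-\cos a)$ and $1-F(a)=\frac{1}{2}\big(\sin a+\cos(\cos a)\big)$; the formulas and function names are our own derivation, not taken from the text.

```python
import math

def t3(a):
    """t(a) = a - E[phi_a] for n = 3, i.e. (a + pi/2 - cos a) / 2 (our derivation)."""
    return (a + math.pi / 2 - math.cos(a)) / 2

def one_minus_F(a):
    """1 - F(a) = nu_3((a - 2t, a)) = (sin a + cos(cos a)) / 2 (our derivation)."""
    return (math.sin(a) + math.cos(math.cos(a))) / 2

def one_minus_G(a):
    """1 - exp(-3 t(a)^2 / 2), computed with expm1 for accuracy near t = 0."""
    return -math.expm1(-1.5 * t3(a) ** 2)

if __name__ == "__main__":
    z = 0.1  # distance of a from -pi/2
    a = -math.pi / 2 + z
    print(one_minus_F(a) / z ** 4, 1 / 12)   # leading coefficient of 1 - F
    print(one_minus_G(a) / z ** 6, 1 / 96)   # leading coefficient of 1 - G
```

The inequality $F\leqslant G$ corresponds to `one_minus_F(a) >= one_minus_G(a)`.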

We now focus on the values $n>3$ , which we will assume when needed (though most steps will work for $n\geqslant 3$ or even $n\geqslant 2$ ).

Consider first the case $a\geqslant 0$ . In the notation from the one-sided setting, we have

\nu\left(\{|\phi_{a}-\mathbb{E}\phi_{a}|\geqslant t\}\right)=\nu\left(\{\phi_{a}\geqslant a\}\right)+\nu\left(\{\phi_{a}\leqslant-a-2\eta\}\right)=\nu([a,\pi/2])+\nu([a+2\eta,\pi/2]).

Accordingly, our problem reduces to determining whether

(43)

\nu([a,\pi/2])+\nu([a+2\eta,\pi/2])\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-n(a+\eta)^{2}/2}.

If we use (for $n>2$ ) the bound $\nu([x,\pi/2])\leqslant\frac{1}{2}e^{-nx^{2}/2}$ (a special case of (3)), the conclusion will follow if

(44)

\frac{1}{2}\left(e^{-na^{2}/2}+e^{-n(a+2\eta)^{2}/2}\right)\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-n(a+\eta)^{2}/2}.

Since the function $s\to e^{-ns^{2}/2}$ is concave on the interval $[-\frac{1}{\sqrt{n}},\frac{1}{\sqrt{n}}]$ , this clearly holds if $a+2\eta\leqslant\frac{1}{\sqrt{n}}$ . A direct calculation shows that the constraint $a+\eta\leqslant\frac{1}{\sqrt{n}}$ is also sufficient. However, this argument can not work in full generality since the function $s\to e^{-ns^{2}/2}$ is convex on the interval $[\frac{1}{\sqrt{n}},\infty)$ , and so the inequality converse to (44) holds if $a\geqslant\frac{1}{\sqrt{n}}$ . To handle such larger values of $a$ , we need a strengthening of the inequality (8) from Proposition 5.
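One possible way to carry out the direct calculation mentioned above (a sketch in our own notation $c$, $w$, which may differ from the computation the authors had in mind) uses the elementary inequality $\cosh x\leqslant e^{x^{2}/2}$:

```latex
% notation (ours): c := (a+\eta)\sqrt{n}, \quad w := \eta\sqrt{n} \geqslant 0,
% so that n a^2 = (c-w)^2 and n (a+2\eta)^2 = (c+w)^2
\frac{1}{2}\Big(e^{-(c-w)^{2}/2}+e^{-(c+w)^{2}/2}\Big)
  = e^{-c^{2}/2}\,e^{-w^{2}/2}\cosh(cw)
  \leqslant e^{-c^{2}/2}\,e^{-(1-c^{2})w^{2}/2}
  \leqslant e^{-c^{2}/2}
  \qquad\text{if } 0\leqslant c\leqslant 1 .
```

The condition $c\leqslant 1$ is exactly $a+\eta\leqslant\frac{1}{\sqrt{n}}$, and the right-hand side $e^{-c^{2}/2}$ is the right-hand side of (44).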

Heuristically, it is clear that the bound (8) can be improved if $s=x{\sqrt{n}}$ is large enough. Indeed, $\sqrt{n}\,\theta$ behaves roughly as a standard normal random variable $Z$ and so – within the range of this approximation – $\nu_{n}(\theta\geqslant x)\approx\mathbb{P}(Z\geqslant s)\sim\frac{1}{\sqrt{2\pi}\,s}e^{-s^{2}/2}=\frac{1}{\sqrt{2\pi n}\,x}e^{-nx^{2}/2}$ by Komatu’s inequality (32), or the more precise inequality from [17] mentioned in the proof of (5). Consequently, the coefficient of $e^{-nx^{2}/2}$ becomes small when $x{\sqrt{n}}$ is large. The same phenomenon is exemplified in the spherical case by the bound (20). For our purposes, the following variant will suffice.

Lemma 12.

If $n>3$ , then $\mu(K^{x})=\nu_{n}([x,\pi/2])\leqslant\frac{2}{5}e^{-nx^{2}/2}$ for $x\in[\frac{1}{2\sqrt{n}},\frac{\pi}{2}]$ . For $n=3$ , the inequality holds for $x\in[\frac{5}{9\sqrt{n}},\frac{\pi}{2}]$ .

Lemma 12 is based on a subtle comparison between the cosine and the exponential function. Since such ideas will also be used later in the argument, we state them separately.

Lemma 13.

We have

(45)

\cos^{n-2}\theta\geqslant e^{-n\theta^{2}/2}\quad\hbox{ for }\ \theta\in[0,3/\sqrt{n}]\ \hbox{ and }\ n\geqslant 5.

For $n=3,4$ , the inequality holds for $\theta\in[0,2.67/\sqrt{n}]$ and $\theta\in[0,2.89/\sqrt{n}]$ respectively. On the other hand,

(46)

\cos^{n-1}\theta\leqslant e^{-n\theta^{2}/2}\quad\hbox{ for }\ \theta\in[\sqrt{6/n},\pi/2]\ \hbox{ and }\ n\geqslant 3.

The proofs of both Lemmas involve mostly calculus, some numerics, and careful book-keeping. We relegate them to the Appendix.

Returning to the proof of (43), we consider two cases.

Case $1^{\circ}$ , $a\geqslant\frac{1}{2\sqrt{n}}$ : Assuming $n>3$ , both terms on the left-hand side of (43) can be upper-bounded using (the first statement of) Lemma 12 and so it is enough to verify

(47)

\frac{2}{5}\left(e^{-na^{2}/2}+e^{-n(a+2\eta)^{2}/2}\right)\stackrel{{\scriptstyle?}}{{\leqslant}}e^{-n(a+\eta)^{2}/2}.

Due to the improvement in the bound for $\mu(K^{x})$ (compared to (44)), an argument along the lines of the proof of (5) will work. First, since clearly $e^{-n(a+2\eta)^{2}/2}\ \leqslant e^{-n(a+\eta)^{2}/2}$ , the inequality (47) can be further reduced to

e^{-na^{2}/2}\stackrel{{\scriptstyle?}}{{\leqslant}}1.5\ e^{-n(a+\eta)^{2}/2}.

As in the proof of (5), this is equivalent to

(48)

\eta\stackrel{{\scriptstyle?}}{{\leqslant}}\sqrt{a^{2}+\frac{2\ln 1.5}{n}}-a=\frac{1}{\sqrt{n}}\left(\sqrt{u^{2}+2\ln 1.5}-u\right),

where $a=u/\sqrt{n}$ . On the other hand, from the definition (31) of $\eta$ and appealing to Lemma 12, we deduce that

(49)

\eta=\int_{a}^{\pi/2}\mu(K^{x})\,dx\leqslant\frac{2}{5}\int_{a}^{\pi/2}e^{-nx^{2}/2}\,dx<\frac{2}{5\sqrt{n}}\int_{u}^{\infty}e^{-s^{2}/2}\,ds.

The last integral in (49) can be expressed in terms of the Gaussian error function and investigated numerically. Alternatively, as in the proof of (5), we may use the Komatu bound (32), which reduces the problem to showing that, for $u\geqslant 0.5$ ,

(50)

\frac{2}{5}\times\frac{2e^{-u^{2}/2}}{u+\sqrt{u^{2}+2}}\ \stackrel{{\scriptstyle?}}{{\leqslant}}\ \sqrt{u^{2}+2\ln 1.5}-u=\frac{2\ln 1.5}{u+\sqrt{u^{2}+2\ln 1.5}}.

Since $u+\sqrt{u^{2}+2}\geqslant u+\sqrt{u^{2}+2\ln 1.5}$ , these two denominators can be discarded. The resulting inequality $\frac{4}{5}e^{-u^{2}/2}\leqslant 2\ln 1.5$ has a decreasing left-hand side, so to complete the argument it remains to verify it at $u=0.5$ . (The inequality (50) actually holds for all $u\geqslant 0$ , but showing that is not needed.)
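A quick numerical illustration of (50) and of the simplified condition at $u=0.5$ is given below; the function names and the grid are our choices, and the check is a sketch rather than a proof.

```python
import math

LOG15 = math.log(1.5)

def lhs_50(u):
    """Left-hand side of (50): (2/5) times the Komatu bound (32)."""
    return 0.4 * 2 * math.exp(-u * u / 2) / (u + math.sqrt(u * u + 2))

def rhs_50(u):
    """Right-hand side of (50), in the cancellation-free form 2 ln 1.5 / (u + sqrt(u^2 + 2 ln 1.5))."""
    return 2 * LOG15 / (u + math.sqrt(u * u + 2 * LOG15))

# After discarding the denominators: (4/5) e^{-u^2/2} <= 2 ln 1.5 at u = 0.5
simplified_ok = 0.8 * math.exp(-0.125) <= 2 * LOG15
# Direct check of (50) on a grid covering 0 <= u <= 10
grid_ok = all(lhs_50(0.01 * i) <= rhs_50(0.01 * i) for i in range(1001))

if __name__ == "__main__":
    print(simplified_ok, grid_ok)
```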

Case $2^{\circ}$ , $a\leqslant\frac{1}{2\sqrt{n}}$ : In this case we can not use Lemma 12 to estimate $\eta$ , but – as in the proof of (5) – the weaker bound (8) combined with Komatu’s inequality (32) will be – for different reasons – sufficient. In the notation from Case $1^{\circ}$ , we have

\eta\leqslant\frac{1}{2}\int_{a}^{\pi/2}e^{-nx^{2}/2}\,dx<\frac{1}{2\sqrt{n}}\int_{u}^{\infty}e^{-s^{2}/2}\,ds<\frac{1}{\sqrt{n}}\times\frac{e^{-u^{2}/2}}{u+\sqrt{u^{2}+2}}.

Consequently,

a+\eta<\frac{1}{\sqrt{n}}\times\left(u+\frac{e^{-u^{2}/2}}{u+\sqrt{u^{2}+2}}\right).

It is readily verified that the expression in the parentheses is less than $1$ for $u\leqslant\frac{1}{2}$ . In particular, for $0\leqslant u=a\sqrt{n}\leqslant\frac{1}{2}$ , we get $a+\eta<\frac{1}{\sqrt{n}}$ , and so we are in the range of applicability of (44). (The argument is almost as clean if we use the more precise expression involving the Gaussian error function; it yields the bound $a+\eta<\frac{1}{\sqrt{n}}$ for $u\leqslant 0.69$ .)
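The two numerical claims in this step can be illustrated as follows (a sketch with hypothetical helper names; `exact_sum` replaces the Komatu bound by the exact Gaussian tail expressed through `math.erfc`):

```python
import math

def komatu_sum(u):
    """u + e^{-u^2/2} / (u + sqrt(u^2 + 2)): the expression in the parentheses."""
    return u + math.exp(-u * u / 2) / (u + math.sqrt(u * u + 2))

def exact_sum(u):
    """u + (1/2) int_u^inf e^{-s^2/2} ds, using the exact Gaussian tail."""
    return u + 0.5 * math.sqrt(math.pi / 2) * math.erfc(u / math.sqrt(2))

komatu_ok = all(komatu_sum(0.005 * i) < 1 for i in range(101))  # 0 <= u <= 0.5
exact_ok = all(exact_sum(0.005 * i) < 1 for i in range(139))    # 0 <= u <= 0.69

if __name__ == "__main__":
    print(komatu_ok, exact_ok, komatu_sum(0.5), exact_sum(0.69))
```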

Finally, let us recall that when $n=3$ , the inequality (43) was verified numerically, and – in the range $a\geqslant 0$ – it was never close. Alternatively, the general argument presented above can be easily patched up when specified to the instance $n=3$ (and only Case $1^{\circ}$ requires patching).

It remains to handle the case $a<0$ . As in the context of the one-sided bound, it is then more transparent to rewrite the formula for $t$ as

(51)

t=\int_{b}^{\pi/2}\mu(K^{x})\,dx,

where $b=-a\in[0,\pi/2]$ (see (33)), while the inequality to be verified (cf. (35)) becomes

(52)

\mu(K^{b})-\mu(K^{b+2t})\stackrel{{\scriptstyle?}}{{\geqslant}}1-e^{-nt^{2}/2}.

As for $a\geqslant 0$ , we will consider separately the cases when $b$ is “small” and “not-so-small.”

Case $1^{\circ}$ , small $b$ : If $b$ is sufficiently small (to be made precise later), $\theta=b+2t$ will be within the range of applicability of inequality (45), and the approach from (the proof of) Lemma 12 will work. Specifically, we can deduce then that

(53)

\mu(K^{b})-\mu(K^{b+2t})=(2I_{n-2})^{-1}\int_{b}^{b+2t}\cos^{n-2}\theta\,d\theta\geqslant(2I_{n-2})^{-1}\int_{b}^{b+2t}e^{-n\theta^{2}/2}\,d\theta.

Substituting $s=\sqrt{n}\,\theta$ , the inequality (52) reduces to

(54)

(2I_{n-2})^{-1}\times\frac{1}{\sqrt{n}}\int_{u}^{u+2v}e^{-s^{2}/2}\,ds\stackrel{{\scriptstyle?}}{{\geqslant}}1-e^{-v^{2}/2},

where $u=b\sqrt{n}$ and $v=t\sqrt{n}$ . At the same time, as in Eqs. (33)-(38), and in view of Lemma 10 and Proposition 5,

(55)

v=v(b,n)\leqslant\sqrt{n}\,I_{n-1}\mu_{n+1}(K^{b})\leqslant\frac{\sqrt{n}\,I_{n-1}}{2}\,e^{-u^{2}/2}.

For future reference, let us note that (55) and Proposition 8 imply immediately that $v(b,n)\leqslant\frac{\sqrt{2}\,I_{1}}{2}=1/\sqrt{2}<1$ .

To summarize, we need to show that if $v$ satisfies the constraint (55), then (54) holds for $u$ in the appropriate range. Furthermore, since (for fixed $u$ ), $v\to\frac{1}{v}\int_{u}^{u+2v}e^{-s^{2}/2}\,ds$ is decreasing, while $v\to\frac{1}{v}\big{(}1-e^{-v^{2}/2}\big{)}$ is increasing (for $v\leqslant 1$ ), it is enough to consider the largest possible value of $v$ , i.e., $v=\frac{\sqrt{n}\,I_{n-1}}{2}\,e^{-u^{2}/2}$ . Thus the problem is reduced to comparing two functions of $u$ , which depend rather weakly on $n$ since, by Proposition 8, the coefficients appearing in them satisfy $(2I_{n-2})^{-1}\times\frac{1}{\sqrt{n}}\to\frac{1}{\sqrt{2\pi}}$ and $\frac{\sqrt{n}\,I_{n-1}}{2}\to\sqrt{\frac{\pi}{8}}$ . This suggests verifying first the asymptotic version of the statement, namely:

(56)

v=\sqrt{\frac{\pi}{8}}\,e^{-u^{2}/2}\stackrel{{\scriptstyle?}}{{\implies}}\frac{1}{\sqrt{2\pi}}\int_{u}^{u+2v}e^{-s^{2}/2}\,ds\geqslant 1-e^{-v^{2}/2}.

A numerical check shows that this statement “comfortably” holds in the relevant $u$ -range (say, $0\leqslant u\leqslant 3$ ), see Figure 3.
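A minimal sketch of such a numerical check of (56), using only `math.erf` and `math.expm1` from the Python standard library, could look as follows; the names `gauss_prob` and `margin_56` and the grid are ours.

```python
import math

def gauss_prob(lo, hi):
    """P(lo <= Z <= hi) for a standard normal Z."""
    return 0.5 * (math.erf(hi / math.sqrt(2)) - math.erf(lo / math.sqrt(2)))

def margin_56(u):
    """Left side minus right side of the inequality in (56), with v = sqrt(pi/8) e^{-u^2/2}."""
    v = math.sqrt(math.pi / 8) * math.exp(-u * u / 2)
    # 1 - e^{-v^2/2} = -expm1(-v^2/2), so subtracting it means adding expm1
    return gauss_prob(u, u + 2 * v) + math.expm1(-v * v / 2)

margins = [margin_56(0.01 * i) for i in range(301)]  # 0 <= u <= 3

if __name__ == "__main__":
    print(min(margins))
```

A positive minimum of `margins` indicates that the inequality holds on the grid.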

This implies that the inequality (54) holds under the constraint (55) if $n$ is large enough. Moreover, since the sequences of coefficients are monotone (by Proposition 8), once we determine that (54) holds for $n=n_{0}$ , it will follow that it is also valid for $n>n_{0}$ . A direct check shows that this works for $n_{0}=3$ , which is enough for our purposes.

To complete this part of the argument, we need to determine the range of the parameter $u$ that assures that (45) can be applied with $\theta=b+2t$ (that is, $b+2t\leqslant 3/\sqrt{n}$ or, equivalently, $u+2v\leqslant 3$ if $n\geqslant 5$ , and similarly for $n=3,4$ ). Using the bound (55) with $n=3$ , we see that $u+2v\leqslant 2.67$ if $u\leqslant 2.52$ . Since the sequence $(\sqrt{n}\,I_{n-1})$ decreases by Proposition 8, it follows that the same is true for $n>3$ . In other words, the method from Case $1^{\circ}$ works for $u\leqslant 2.52$ . This is amply sufficient as the approach from Case $2^{\circ}$ will cover the range $u\geqslant\sqrt{6}$ , and $\sqrt{6}\approx 2.45<2.52$ .
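The range computation for $n=3$ and the monotonicity of $\sqrt{n}\,I_{n-1}$ can be illustrated as below, again assuming $I_{m}=\int_{0}^{\pi/2}\cos^{m}\theta\,d\theta$; the helper names are hypothetical.

```python
import math

def wallis(m):
    """I_m = int_0^{pi/2} cos^m (assumed meaning of I_m), via I_m = (m-1)/m * I_{m-2}."""
    val = math.pi / 2 if m % 2 == 0 else 1.0
    for j in range(2 + m % 2, m + 1, 2):
        val *= (j - 1) / j
    return val

def v_max(n, u):
    """Right-hand side of (55): sqrt(n) * I_{n-1} / 2 * exp(-u^2 / 2)."""
    return math.sqrt(n) * wallis(n - 1) / 2 * math.exp(-u * u / 2)

# Largest value of u + 2v for n = 3 on a grid covering 0 <= u <= 2.52
worst = max(0.01 * i + 2 * v_max(3, 0.01 * i) for i in range(253))
# The coefficients sqrt(n) * I_{n-1} for n = 2, ..., 59
coeffs = [math.sqrt(n) * wallis(n - 1) for n in range(2, 60)]

if __name__ == "__main__":
    print(worst, coeffs[:4])
```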

Case $2^{\circ}$ , not-so-small $b$ : To handle the values of $u>\sqrt{6}$ , which correspond to $b>\sqrt{6/n}$ , we need a more specialized upper bound for $t$ . To that end, we appeal to (20), which restated in the current setup becomes

(57)

\mu_{n}(K^{x})\leqslant(\sqrt{2\pi}\,\kappa_{n}\sin x)^{-1}\cos^{n-1}x.

Accordingly (cf. (51))

(58)

\begin{aligned}t&\leqslant\int_{b}^{\pi/2}(\sqrt{2\pi}\,\kappa_{n}\sin x)^{-1}\cos^{n-1}x\,dx\\
&\leqslant(\sqrt{2\pi}\,\kappa_{n}\sin b)^{-1}\int_{b}^{\pi/2}\cos^{n-1}x\,dx\\
&=(\sqrt{2\pi}\,\kappa_{n}\sin b)^{-1}\times 2I_{n-1}\,\mu_{n+1}(K^{b})\\
&\leqslant\frac{I_{n-1}}{\pi n\sin^{2}b}\times\cos^{n}b,\end{aligned}

where in the last inequality we used (again) (57) and the identity $\kappa_{n}\kappa_{n+1}=n$ .

We now argue similarly as in the one-sided context. The left-hand side of (52) will be lower-bounded by $2t\times(2I_{n-2})^{-1}\cos^{n-2}(b+2t)$ (cf. (53)) and the right-hand side upper-bounded by $nt^{2}/2$ , which reduces the problem to

(59)

\cos^{n-2}(b+2t)\stackrel{{\scriptstyle?}}{{\geqslant}}I_{n-2}\times\frac{nt}{2}.

Appealing to (58) allows further reduction to

(60)

t\leqslant\frac{I_{n-1}}{\pi n\sin^{2}b}\times\cos^{n}b\quad\stackrel{{\scriptstyle?}}{{\implies}}\quad\cos^{n-2}(b+2t)\geqslant\frac{\cos^{n}b}{4(n-1)\sin^{2}b},

where we used $I_{k}I_{k+1}=\frac{\pi}{2(k+1)}$ with $k=n-2$ .

As in the argument that led to (56), let us consider first an asymptotic version of (60). That is, substitute $u=b\sqrt{n}$ and $v=t\sqrt{n}$ and let $n\to\infty$ , which leads to

(61)

u\geqslant\sqrt{6}\ \hbox{ and }\ v\leqslant\frac{e^{-u^{2}/2}}{\sqrt{2\pi}u^{2}}\quad\stackrel{{\scriptstyle?}}{{\implies}}\quad e^{-(u+2v)^{2}/2}\geqslant\frac{e^{-u^{2}/2}}{4u^{2}}.

To establish (61), it is clearly enough to assume equality in the constraint on $v$ , and it is then apparent (numerically) that the inequality on the right comfortably holds. In fact, it does hold for $u\geqslant 0.84$ and, for $u\geqslant\sqrt{6}$ , the ratio of the two sides is greater than $23$ . We can not, however, deduce immediately that (60) holds for sufficiently large $n$ since there is no obvious monotonicity with respect to $n$ and we do not know if the convergence involved in obtaining (61) is appropriately uniform. Still, patching the calculation is rather routine; we sketch the main points in the Appendix. ∎
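The numerical claims about (61) are easy to reproduce. In the sketch below (function name and grid are ours), the ratio of the two sides of the right-hand inequality in (61) simplifies to $4u^{2}e^{-2uv-2v^{2}}$, with $v$ taken at equality in the constraint.

```python
import math

def ratio_61(u):
    """(left side) / (right side) of the conclusion of (61), with equality in the constraint on v."""
    v = math.exp(-u * u / 2) / (math.sqrt(2 * math.pi) * u * u)
    # e^{-(u+2v)^2/2} divided by e^{-u^2/2} / (4 u^2) equals 4 u^2 exp(-2uv - 2v^2)
    return 4 * u * u * math.exp(-2 * u * v - 2 * v * v)

tail_ratios = [ratio_61(math.sqrt(6) + 0.05 * i) for i in range(200)]

if __name__ == "__main__":
    print(ratio_61(0.80), ratio_61(0.84), min(tail_ratios))
```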

3.3. The Gaussian case : Proof of Corollary 4

The Gaussian isoperimetric inequality [4, 16] reduces the problem to $n=1$ . The obvious line of argument is now to invoke some version of the Poincaré Lemma (appropriately normalized marginals of $\mu=\mu_{n}$ converge, as $n\to\infty$ , to the normal distribution) and then appeal to (6), but there are some minor technical issues that need to be addressed. First, $\Theta_{n}$ , the random variable distributed according to $\nu_{n}$ , is not exactly a marginal of $\mu_{n}$ (the marginals are parametrized by $\sin\theta$ rather than by $\theta$ ). Next, we have to make sure that the convergence of $\sqrt{n}\Theta_{n}$ to the standard normal $Z$ preserves probabilities and moments. An elementary way to resolve these issues is to consider the density of $\sqrt{n}\Theta_{n}$ , which is (on its support) $g_{n}(\theta):=(2I_{n-2}\sqrt{n})^{-1}\cos^{n-2}(\theta/\sqrt{n})$ . Once we take into account the properties of $I_{m}$ stated in Proposition 8, it is an elementary exercise to show that $g_{n}(\theta)\to(2\pi)^{-1/2}e^{-\theta^{2}/2}$ (the density of $Z$ ), and that this convergence is dominated in a rather strong sense: we have $0\leqslant g_{n}(\theta)\leqslant(2\pi)^{-1/2}e^{-\frac{n-2}{2n}\theta^{2}}\leqslant(2\pi)^{-1/2}e^{-\theta^{2}/6}$ for all $\theta$ and $n\geqslant 3$ . The dominated convergence theorem implies then the convergence of all probabilities and all moments. ∎
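The domination and the pointwise convergence of $g_{n}$ can be illustrated numerically. The sketch below assumes $I_{m}=\int_{0}^{\pi/2}\cos^{m}\theta\,d\theta$ and sets $g_{n}$ to $0$ outside its support; the names `wallis`, `g` and `dominating` are ours.

```python
import math

def wallis(m):
    """I_m = int_0^{pi/2} cos^m (assumed meaning of I_m), via I_m = (m-1)/m * I_{m-2}."""
    val = math.pi / 2 if m % 2 == 0 else 1.0
    for j in range(2 + m % 2, m + 1, 2):
        val *= (j - 1) / j
    return val

def g(n, theta):
    """Density of sqrt(n) * Theta_n at theta (zero outside its support)."""
    x = theta / math.sqrt(n)
    if abs(x) >= math.pi / 2:
        return 0.0
    return math.cos(x) ** (n - 2) / (2 * wallis(n - 2) * math.sqrt(n))

def dominating(theta):
    """The dominating function (2 pi)^{-1/2} exp(-theta^2 / 6)."""
    return math.exp(-theta * theta / 6) / math.sqrt(2 * math.pi)

if __name__ == "__main__":
    gauss_at_1 = math.exp(-0.5) / math.sqrt(2 * math.pi)
    for n in (3, 30, 300, 3000):
        print(n, g(n, 1.0), gauss_at_1)
```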

An alternative line of argument is to appeal to Lemma 11 and then compare $\mathbb{P}(|Z-\mathbb{E}Z|>t)$ to $e^{-t^{2}/2}$ , where $t=t(a)=a+\int_{a}^{\infty}\gamma_{1}\big{(}[x,\infty)\big{)}\,dx$ . Since all these quantities can be expressed in terms of the Gaussian error function, there is no problem with a numerical verification, see Figure 4. For complete rigor, this should be accompanied by an asymptotic analysis as $a\to\pm\infty$ .

Finally, one could redo in the Gaussian setting the rather rigorous calculations that were performed in the spherical case. These would be substantially easier since we do not need to worry about the dependence on the dimension. As a matter of fact, we did some such calculations to provide heuristics for the spherical case. However, an argument of that nature wouldn’t be pretty. It would be good to have a neat proof based on standard properties of the Gaussian error function, perhaps along the lines of [17] or the follow-up papers [7, 12].

4. Products of spheres

In this section we will discuss perspectives for improving isoperimetric/concentration inequalities on $\big{(}S^{n-1}\big{)}^{k}$ for $k>1$ . The discussion is somewhat exploratory in nature, with some results based on numerics and many estimates presumably not optimal, and is intended to encourage further research.

As is well known, Fact 1 generalizes to products $S^{n+1}\times S^{n+1}\times\ldots\times S^{n+1}$ with an arbitrary number of factors (see, e.g., [11], section 6.5.2; note, however, that the family discussed there should involve $S^{n+1}$ and not $S^{n}$ ). This is because the Ricci curvature $R(S^{n+1})$ is $n$ , and consequently the same is true for the product, and one may apply the following comparison result due to Gromov (see section 6.4 and Appendix I in [11]).

Fact 14.

Let $X$ be an $m$ -dimensional Riemannian manifold, whose Ricci curvature $R(X)$ is bounded from below by $\kappa>0$ . Choose $r>0$ so that $R(rS^{m})=(m-1)/r^{2}=\kappa$ . Denote by $\mu_{X}$ the normalized Riemannian measure on $X$ and by $\mu$ the normalized Haar (surface) measure on the sphere $rS^{m}$ . Next, let $A\subset X$ , $t>0$ and $B\subset rS^{m}$ be a cap such that $\mu_{X}(A)=\mu(B)$ . Then $\mu_{X}(A_{t})\geqslant\mu(B_{t})$ .

As earlier, $B_{t}$ is again a cap and so its volume can be – after rescaling by $r$ – expressed as an integral of the type (9). In particular, if $\mu_{X}(A)=\frac{1}{2}$ , then

\mu_{X}(A_{t})\geqslant 1-\frac{\int_{t/r}^{\pi/2}\cos^{m-1}\theta\,d\theta}{2I_{m-1}}=1-q_{m+1}(t/r),

where $q_{m+1}(\cdot)$ is defined by (11). Specifying further to $X=\big{(}S^{n-1}\big{)}^{k}$ (with $n>2$ ), whose Ricci curvature is $\kappa=n-2$ , we are led to

(62)

m=(n-1)k\quad\hbox{ and }\quad r=\sqrt{\frac{m-1}{\kappa}}=\sqrt{\frac{(n-1)k-1}{n-2}},

and – after appealing to the bound (3) from Theorem 2 – to

Proposition 15.

Let $n>2$ , $k\geqslant 2$ , and let $\sigma$ be the normalized product measure on $\big{(}S^{n-1}\big{)}^{k}$ . Next, let $A\subset\big{(}S^{n-1}\big{)}^{k}$ be such that $\sigma(A)\geqslant\frac{1}{2}$ . If $t>0$ , then

(63)

\sigma(A_{t})\geqslant 1-\frac{1}{2}\;q_{m+1}(t/r)\geqslant 1-\frac{1}{2}\exp\Big{(}-\frac{\big{(}(n-1)k+1\big{)}(n-2)}{(n-1)k-1}\,\frac{t^{2}}{2}\,\Big{)}.

Since the fraction inside $\exp$ is clearly greater than $n-2$ , it follows that $\sigma(A_{t})\geqslant 1-\frac{1}{2}\,e^{-(n-2)t^{2}/2}$ , which is slightly better than the estimate from [11] mentioned at the beginning of this section: the multiplicative constant $\frac{1}{2}$ instead of $\sqrt{\pi/8}$ (this is because we are using Theorem 2 and not Fact 1). There is also a slight improvement in the coefficient of $t^{2}$ in the exponent, but it becomes less and less significant as $k$ increases. In order to be able to deduce the same bound as in Theorem 2 for $k\geqslant 2$ , we need a stronger version of (11) (or, equivalently, of (3) or (8)) with an appropriate “excess” in the exponent to compensate for the coefficient of $\frac{t^{2}}{2}$ in (63) being strictly smaller than $n$ . Specifically, define

(64)

q_{n,\xi}(x):=\frac{\int_{x}^{\pi/2}\cos^{n-2}\theta\,d\theta}{I_{n-2}}e^{(n+\xi)x^{2}/2}=q_{n}(x)e^{\xi x^{2}/2}.

and suppose that the inequality

(65)

q_{m+1,\xi}(x)\leqslant 1

is valid with $m=(n-1)k$ and with $\xi$ such that $\frac{\big{(}(n-1)k+1+\xi\big{)}(n-2)}{(n-1)k-1}=n$ . Then, repeating mutatis mutandis the argument that led to Proposition 15, we obtain – for this particular choice of $n,k$ , and for the respective values of $t$ – an improvement to (63) with $1-\frac{1}{2}\exp\big{(}-nt^{2}/2\big{)}$ on the right hand side, which is a much neater expression.

As it turns out, as speculative as the bound (65) appears, it is not unreasonable. Of course, as already pointed out in the Remark following the proof of Theorem 2, we cannot expect it to hold for all $m$ and all $x$ , but it may conceivably be true for $m\geqslant m(\xi)$ . Below is an analysis showing that validating (65) is actually quite feasible.

We note first that the equation $\frac{\big{(}(n-1)k+1+\xi\big{)}(n-2)}{(n-1)k-1}=n$ resolves to $\xi=2(k-1)\frac{n-1}{n-2}$ or $\xi=2(k-1)\frac{m}{m-k}$ . The last two expressions are decreasing functions of respectively $n$ or $m$ , which means that the threshold value for the excess $\xi$ that is sufficient for our purposes can be chosen as a function of $k$ only. Next, it follows immediately from the definition (64) of $q_{n,\xi}$ and from (13) that $q_{n+2,\xi}(x)\leqslant q_{n,\xi}(x)$ for all $n,\xi$ and $x$ . Thus, for each $x\in[0,\pi/2]$ and for all $\xi\geqslant 0$ , the sequences $(q_{2n,\xi}(x))$ and $(q_{2n+1,\xi}(x))$ are nonincreasing. This means that, given $\xi\geqslant 0$ , once we establish (65) for certain $m_{0}$ and $m_{0}+1$ , it will be valid for all $m\geqslant m_{0}$ , and consequently for all sufficiently large $n$ (with the last qualification depending on $k$ ).

A numerical check indicates that $q_{4,1}(x)\leqslant 1$ and $q_{6,2}(x)\leqslant 1$ for $x\in[0,\pi/2]$ , and that such inequalities do not generally hold for $q_{3,1}$ and $q_{5,2}$ . This suggests that, in the setting of Theorem 2, the bound $\mu(A_{t})\geqslant 1-\frac{1}{2}e^{-t^{2}(n+1)/2}$ is valid for $n\geqslant 4$ and the bound $\mu(A_{t})\geqslant 1-\frac{1}{2}e^{-t^{2}(n+2)/2}$ is valid for $n\geqslant 6$ . It would be interesting to rigorously determine the threshold values $n=n(\xi)$ , in addition to the numerical results indicated above and further explored below.
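The numerical observations about $q_{n,\xi}$ can be reproduced with a few lines of Python. The sketch below takes the definition (64) literally, assumes $I_{n-2}=\int_{0}^{\pi/2}\cos^{n-2}\theta\,d\theta$, and uses composite Simpson quadrature; the helper names and the grid are ours, and a finite grid can of course only suggest the inequality, not prove it.

```python
import math

def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

def q(n, xi, x):
    """q_{n,xi}(x) from (64): normalized cap integral times exp((n + xi) x^2 / 2)."""
    dens = lambda th: math.cos(th) ** (n - 2)
    tail = simpson(dens, x, math.pi / 2)
    return tail / simpson(dens, 0.0, math.pi / 2) * math.exp((n + xi) * x * x / 2)

grid = [i * (math.pi / 2) / 60 for i in range(61)]
max_q41 = max(q(4, 1, x) for x in grid)
max_q62 = max(q(6, 2, x) for x in grid)

if __name__ == "__main__":
    print(max_q41, max_q62)           # both should not exceed 1
    print(q(3, 1, 1.0), q(5, 2, 1.0)) # both exceed 1, so (65) fails here
```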

As a demonstration, let us focus on the instance $k=2$ . A numerical check using Mathematica shows that in that case:

$\bullet$ if $n=3$ , hence $m=4$ and $\xi=4$ , then (65) holds for $x\not\in(0.47595,1.45105)$ ; taking into account the rescaling $x=t/r$ we deduce (cf. (62)) that, in the setting of Proposition 15 the bound $\sigma(A_{t})\geqslant 1-\frac{1}{2}e^{-nt^{2}/2}$ holds for $t\leqslant 0.82437$

$\bullet$ if $n=4$ , hence $m=6$ and $\xi=3$ , then (65) and all the subsequent bounds hold for $x\not\in(0.71556,1.19952)$ , so (after rescaling) we deduce that $\sigma(A_{t})\geqslant 1-\frac{1}{2}e^{-nt^{2}/2}$ for $t\leqslant 1.1314$ ; however, if we use the chordal distance instead of the geodesic distance, then the bounds hold in the entire range

$\bullet$ if $n=5$ , hence $m=8$ and $\xi=8/3$ , then (65) and all the subsequent bounds hold in the entire respective range

$\bullet$ if $n>5$ , the same holds by monotonicity; note that for $k=2$ all $m$ ’s are even, and so our inductive “initialization” requires verifying only one value $m_{0}$ .
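The parameter values in the list above follow from the formula $\xi=2(k-1)\frac{n-1}{n-2}$ and from (62). The following sketch (function names are ours) recomputes $\xi$, $m$ and $r$ exactly or in floating point, and rescales the reported endpoints $x$ to thresholds $t=xr$; the endpoints $0.47595$ and $0.71556$ are copied from the list, not recomputed.

```python
import math
from fractions import Fraction

def excess(n, k):
    """xi = 2 (k - 1) (n - 1) / (n - 2), the solution of the equation for xi."""
    return Fraction(2 * (k - 1) * (n - 1), n - 2)

def dimension_and_radius(n, k):
    """m and r from (62)."""
    m = (n - 1) * k
    return m, math.sqrt((m - 1) / (n - 2))

def solves_equation(n, k):
    """Check ((n-1)k + 1 + xi)(n-2) / ((n-1)k - 1) = n in exact arithmetic."""
    m = (n - 1) * k
    return (m + 1 + excess(n, k)) * (n - 2) == n * (m - 1)

if __name__ == "__main__":
    for n, x_end in ((3, 0.47595), (4, 0.71556)):
        m, r = dimension_and_radius(n, 2)
        print(n, m, excess(n, 2), x_end * r)  # last entry: rescaled threshold t = x * r
```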

The above considerations can be summarized in the following statement.

Theorem 16.

Let $\sigma$ be the normalized product measure on $\big{(}S^{n-1}\big{)}^{2}$ and let $A\subset\big{(}S^{n-1}\big{)}^{2}$ be such that $\sigma(A)\geqslant\frac{1}{2}$ . If $t\geqslant 0$ and $n>4$ , then

\sigma(A_{t})\geqslant 1-\frac{1}{2}e^{-nt^{2}/2}.

If $n=3$ or $n=4$, the above bound holds for, respectively, $t\leqslant 0.82437$ and $t\leqslant 1.1314$. Additionally, the bound holds in the entire range if $n=4$ and the enlargements $A_{t}$ are defined via the chordal distance rather than the geodesic distance.

For $k=3$, the bound (65) holds for $n>6$ and for $n=6$ with the chordal distance; for $k=4$, it holds for $n>7$ and for $n=6,7$ with the chordal distance. So it appears that the threshold for allowable dimensions $n$ increases with $k$, possibly unboundedly. Again, it would be interesting to rigorously determine the allowable range of $n$ as a function of $k$ (assuming it indeed does not stabilize; stabilization would be a desirable property, but is not very likely in view of the above numerical results). Another useful, and perhaps not that hard, result would be a good universal lower bound on $b$ such that the bounds hold for $x\in[0,b]$, or for $t\in[0,b]$. Numerics suggest that those intervals are never really small.

The final remark is that, unlike in the case $k=1$, we do not have an exact calculation based on the knowledge of extremal subsets, but one that relies on Gromov’s comparison theorem (Fact 14). While there are many very sophisticated approaches to isoperimetric problems on product spaces (e.g. [14, 18]), we are not aware of the precise solution to the problem even in the case of the torus $(S^{1})^{k}$. So it is possible, and quite likely, that in reality the bounds hold for a larger set of parameters than what follows from the argument above. It may be feasible to test, e.g., $k=2$ and $n=3$ or $n=4$ by looking at some specific sets $A$ and the (relatively large) values of $x$ or $t$ suggested by the numerics leading to the results described above.

Acknowledgements. GA was supported in part by ANR (France) under the grant ESQuisses (ANR-20-CE47-0014-01). The research of JJ and SJS has been supported in part by grants from the National Science Foundation (U.S.A.).

References

[2] G. Aubrun and S. J. Szarek, Alice and Bob Meet Banach. The Interface of Asymptotic Geometric Analysis and Quantum Information Theory. Mathematical Surveys and Monographs, 223, Amer. Math. Soc., 2017.
[3] A. Brieden, P. Gritzmann, R. Kannan, V. Klee, L. Lovász, and M. Simonovits, Deterministic and randomized polynomial-time approximation of radii. Mathematika 48, Issues 1-2, pp.63-105 (2003).
[4] C. Borell, The Brunn-Minkowski inequality in Gauss space. Invent. Math. 30 (1975), no. 2, 207-216.
[5] J. Jenkinson, Convex Geometric Connections to Information Theory. Ph.D. thesis, Case Western Reserve University, 2013, http://rave.ohiolink.edu/etdc/view?acc_num=case1365179413.
[6] Y. Komatu, Elementary inequalities for Mills’ ratio. Rep. Statist. Appl. Res. Un. Jap. Sci. Engrs. 4 (1955), 69-70.
[7] O. Kouba, Inequalities related to the error function. arXiv:math/0607694, 2006.
[8] M. Ledoux, The Concentration of Measure Phenomenon. Mathematical Surveys and Monographs, 89, Amer. Math. Soc., Providence, 2001.
[9] P. Lévy, Problèmes concrets d’analyse fonctionnelle. 2nd ed. Gauthier-Villars, Paris, 1951.
[10] E. Milman, Sharp isoperimetric inequalities and model spaces for the Curvature-Dimension-Diameter condition. J. Eur. Math. Soc. 17 (2015), 1041-1078.
[11] V. D. Milman and G. Schechtman, Asymptotic theory of finite dimensional normed spaces. With an appendix by M. Gromov, Lecture Notes Math. 1200, Springer Verlag, Berlin-New York, 1986.
[12] M. B. Ruskai and E. Werner, A pair of optimal inequalities related to the error function. arXiv:math/9711207, 1997.
[13] M. R. Sampford, Some inequalities on Mill’s ratio and related functions. Ann. Math. Statistics 24 (1953), 130-132.
[14] G. Schechtman, Concentration results and applications. In Handbook of the geometry of Banach spaces. Edited by W. B. Johnson and J. Lindenstrauss (North-Holland, Amsterdam, 2003), Vol. 2, pp. 1603-1634.
[15] E. Schmidt, Die Brunn-Minkowskische Ungleichung und ihr Spiegelbild sowie die isoperimetrische Eigenschaft der Kugel in der euklidischen und nichteuklidischen Geometrie. I. Math. Nachr. 1 (1948) 81-157.
[16] V. N. Sudakov and B. S. Tsirelson, Extremal properties of half-spaces for spherically invariant measures. J. Soviet Math. (1978), 9-18.
[17] S. J. Szarek and E. Werner, A Nonsymmetric Correlation Inequality for Gaussian Measure. J. Multivariate Analysis 68 (1999), 193-211.
[18] M. Talagrand, Concentration of measure and isoperimetric inequalities in product spaces. Inst. Hautes Études Sci. Publ. Math. 81 (1995), 73-205.

5. Appendix

5.1. Achievability of tail estimates in Lemma 9

Here we sketch a proof of the assertion from the Remark following the proof of Lemma 9, which said that the supremum in (21) is attained for some $\psi=\phi_{a}$ and, moreover, for $a$ verifying $t=a-\mathbb{E}\phi_{a}$ . In view of (23), this reduces to the analysis of the family $\{\phi_{a}:a\in\mathbb{R}\}$ without having to consider a general $\psi\in\mathcal{L}$ . We will start with a series of elementary observations.

$1^{\circ}$ Since $\phi_{a}\leqslant a$ , it follows that $\mathbb{E}\phi_{a}\leqslant a$ , with equality iff $\nu$ is supported on $[a,\infty)$ .

$2^{\circ}$ Since $\|\phi_{a}-\phi_{b}\|_{\infty}=|a-b|$ and $\phi_{a}\leqslant\phi_{b}$ if $a\leqslant b$ , the function $a\to\mathbb{E}\phi_{a}$ is $1$ -Lipschitz and non-decreasing. Consequently, the same is true for $a\to\mathbb{E}\phi_{a}+t$ .

$3^{\circ}$ Similarly, $a\to a-\mathbb{E}\phi_{a}$ is nondecreasing; this follows from $a-\mathbb{E}\phi_{a}=\int(a-x)^{+}d\nu(x)$ and $a\to(a-x)^{+}$ being a nondecreasing function of $a$ (for each fixed $x$ ).

$4^{\circ}$ If $t>0$, then the supremum in (21) is strictly smaller than $1$. The hypothesis $M:=\int|x|\,d\nu(x)<\infty$ implies that $\forall\varepsilon>0\ \exists\delta>0$ such that if $\nu(A)<\delta$, then $\int_{A}|x|\,d\nu(x)<\varepsilon$. Let $\psi\in\mathcal{L}$ and suppose (as we can) that $\psi(0)=0$, which implies $|\psi(x)|\leqslant|x|$. Denote $A=U(\psi,t)^{c}$; our objective is to show that $\nu(A)$ cannot be too small. We have

\mathbb{E}\psi=\int_{U(\psi,t)}\psi\,d\nu+\int_{A}\psi\,d\nu\geqslant\nu\big(U(\psi,t)\big)(\mathbb{E}\psi+t)-\int_{A}|\psi|\,d\nu,

which can be rewritten as

\nu(A)\,\mathbb{E}\psi\geqslant\big{(}1-\nu(A)\big{)}t-\int_{A}|\psi|\,d\nu.

Now, set $\varepsilon=\frac{t}{4}$ and choose the corresponding $\delta\in(0,\frac{1}{2})$ . If $\nu(A)<\delta$ , then $\big{(}1-\nu(A)\big{)}t-\int_{A}|\psi|\,d\nu>\frac{t}{2}-\varepsilon=\frac{t}{4}$ , while $\nu(A)\mathbb{E}\psi\leqslant\nu(A)\mathbb{E}|\psi|\leqslant\nu(A)M$ and so

\frac{t}{4}<\nu(A)M.

In other words, either $\nu(A)\geqslant\delta$ , or $\nu(A)\geqslant\frac{t}{4M}$ , so $\nu(A)\geqslant\min\{\delta,\frac{t}{4M}\}>0$ , as asserted.
Note: The assertion $4^{\circ}$ will follow independently from the other observations, but we include it here since the argument works for any $1$-Lipschitz function $\psi$ and not just for $\psi\in\mathcal{L}$.

$5^{\circ}$ $\lim_{a\to+\infty}\mathbb{E}\phi_{a}=\int x\,d\nu(x)$ and $\lim_{a\to-\infty}\big(a-\mathbb{E}\phi_{a}\big)=0$. Both of these follow from $\int|x|\,d\nu(x)<\infty$ via the dominated convergence theorem.

With this preparation, the conclusion is very easy. Denote $\xi(a):=\mathbb{E}\phi_{a}+t$ , $a\in\mathbb{R}$ . By $2^{\circ}$ , the function $\xi$ is $1$ -Lipschitz and non-decreasing. By $5^{\circ}$ , it has an oblique asymptote $\ell_{-}(a)=a+t$ as $a\to-\infty$ and a horizontal asymptote $\ell_{+}(a)=\int x\,d\nu(x)+t$ as $a\to+\infty$ . Since, by $3^{\circ}$ , $\xi(a)-a$ is nonincreasing, it follows that there is a unique value $a_{0}$ such that
$\bullet$ $\xi(a)>a$ if $a<a_{0}$
$\bullet$ $\xi(a)\leqslant a$ if $a\geqslant a_{0}$ , with equality if $a=a_{0}$ .
Let us decode what these inequalities mean. First, $\xi(a)>a$ means $\mathbb{E}\phi_{a}+t>a\geqslant\phi_{a}$; in that case $U(\phi_{a},t)=\{\phi_{a}\geqslant\mathbb{E}\phi_{a}+t\}=\emptyset$. On the other hand, if $\xi(a)=\mathbb{E}\phi_{a}+t\leqslant a$, then $U(\phi_{a},t)=[\xi(a),\infty)$. Since, again, $\xi$ is non-decreasing by $2^{\circ}$, the largest value of $\nu([\xi(a),\infty))$ will be attained for the smallest value of $a$ for which $\xi(a)=a$, i.e., for $a=a_{0}$, and the condition $\xi(a)=a$ means precisely $\mathbb{E}\phi_{a}+t=a$ or $t=a-\mathbb{E}\phi_{a}$. ∎
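The argument above can be tried out numerically. The following Python sketch is only an illustration under stated assumptions: it takes $\phi_{a}(x)=\min(x,a)$ (which is consistent with $1^{\circ}$ and $3^{\circ}$), and it uses a hypothetical three-atom measure `ATOMS` that does not come from the paper. It locates $a_{0}$ by bisection on the nondecreasing map $a\to a-\mathbb{E}\phi_{a}$ and checks, on a grid, that no other $\phi_{a}$ yields a larger value of $\nu(U(\phi_{a},t))$.

```python
# Toy illustration of Section 5.1.
# Assumptions: phi_a(x) = min(x, a); ATOMS is a hypothetical discrete measure.
ATOMS = [(-1.0, 0.25), (0.0, 0.5), (1.0, 0.25)]  # pairs (point, mass)


def mean_phi(a, atoms=ATOMS):
    """E[phi_a] under the discrete measure."""
    return sum(p * min(x, a) for x, p in atoms)


def find_a0(t, atoms=ATOMS, lo=-1e3, hi=1e3, iters=200):
    """Smallest a with a - E[phi_a] >= t, found by bisection.

    The map a -> a - E[phi_a] is nondecreasing (observation 3),
    so the predicate below is monotone in a.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid - mean_phi(mid, atoms) >= t:
            hi = mid
        else:
            lo = mid
    return hi


def upper_tail(a, t, atoms=ATOMS, tol=1e-12):
    """nu(U(phi_a, t)), where U(phi_a, t) = {phi_a >= E[phi_a] + t}."""
    xi = mean_phi(a, atoms) + t
    if xi > a + tol:    # phi_a <= a < xi, so the set is empty (tol absorbs rounding)
        return 0.0
    return sum(p for x, p in atoms if x >= xi - tol)  # the set is [xi, infinity)


t = 0.5
a0 = find_a0(t)
grid = [-2.0 + 0.01 * i for i in range(401)]
best_on_grid = max(upper_tail(a, t) for a in grid)
print(a0, upper_tail(a0, t), best_on_grid)
```

For this toy measure the equation $t=a-\mathbb{E}\phi_{a}$ can be solved by hand ($0.75a+0.25=0.5$ on $[0,1]$), which gives a simple cross-check of the bisection.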

5.2. Proof of Lemma 11

We fix $t>0$ and proceed in several steps.

Step $1^{\circ}$ First, there is the essentially trivial observation that, from the point of view of estimating $\lambda(\psi,t):=\nu(\{|\psi-\mathbb{E}\psi|\geqslant t\})$, the function $x\to\psi(x)$ is equivalent to $x\to\psi(x)+c$ (for any $c\in\mathbb{R}$) and, due to the symmetry of $\nu$, to $x\to-\psi(-x)$.

Step $2^{\circ}$ Second, extremal functions (i.e., such that $\lambda(\cdot,t)$ is maximal) do exist. Indeed, suppose that $(\psi_{k})$ is a sequence of functions for which $\lambda(\psi_{k},t)\to\max_{\psi\in\mathcal{L}}\lambda(\psi,t)=:\Lambda$. By the previous remark, we may assume that $\psi_{k}(0)=0$ for all $k$, and it then follows that there is a subsequence $(\psi_{k_{i}})$ of $(\psi_{k})$ converging uniformly on bounded intervals to some function $\psi$, which necessarily belongs to $\mathcal{L}$. (In our setting, the measures $\nu=\nu_{n}$ are all supported on $[-\pi/2,\pi/2]$, which makes the argument even more straightforward.) Since $\int|x|\,d\nu(x)<\infty$, it follows from the dominated convergence theorem that $\mathbb{E}\psi_{k_{i}}\to\mathbb{E}\psi$. This implies that $\{|\psi-\mathbb{E}\psi|\geqslant t\}\supset\limsup_{i}\{|\psi_{k_{i}}-\mathbb{E}\psi_{k_{i}}|\geqslant t\}$ and, consequently, that $\lambda(\psi,t)\geqslant\limsup_{i}\lambda(\psi_{k_{i}},t)=\Lambda$. So the limit function $\psi$ is extremal.

Step $3^{\circ}$ The next observation is slightly less trivial; it gives the first hint why functions of the form (22) may be extremal. Let $\psi\in\mathcal{L}$ and let $\alpha$ be defined by $\psi(\alpha)=\mathbb{E}\psi$ (it exists by the intermediate value theorem, see Figure 5 for this and the subsequent steps). We claim that if $\psi$ is extremal, then (at least in all cases of interest, to be clarified later)

(66)

\psi(\theta)-\psi(\alpha)=\theta-\alpha\ \hbox{ for }\ \theta\in[\alpha-t,\alpha+t].

If this is not the case, then, denoting $\theta_{0}:=\max\{\theta:\psi(\theta)=\mathbb{E}\psi-t\}$ and $\theta_{1}:=\min\{\theta:\psi(\theta)=\mathbb{E}\psi+t\}$ , we will have $\theta_{1}-\theta_{0}>2t$ . We now set $\alpha_{0}:=\theta_{0}+t,\alpha_{1}:=\theta_{1}-t$ (so $\alpha_{0}<\alpha_{1}$ ) and define $\psi_{0}$ and $\psi_{1}$ as follows

\psi_{0}(\theta)=\begin{cases}\mathbb{E}\psi+\theta-\alpha_{0}&\text{if }\theta\in[\alpha_{0}-t,\alpha_{0}+t]\\ \mathbb{E}\psi+t&\text{if }\theta\in[\alpha_{0}+t,\theta_{1}]\\ \psi(\theta)&\text{otherwise}\end{cases}\qquad\psi_{1}(\theta)=\begin{cases}\mathbb{E}\psi-t&\text{if }\theta\in[\theta_{0},\alpha_{1}-t]\\ \mathbb{E}\psi+\theta-\alpha_{1}&\text{if }\theta\in[\alpha_{1}-t,\alpha_{1}+t]\\ \psi(\theta)&\text{otherwise}\end{cases}

Figure 5. The functions $\psi$, $\psi_{0}$, $\psi_{1}$, and $\psi_{s}$. Note that $\alpha_{0}=\theta_{0}+t$ and $\theta_{1}=\alpha_{1}+t$.

Then $\psi_{0},\psi_{1}$ are $1$ -Lipschitz, $\psi_{0}\geqslant\psi\geqslant\psi_{1}$ , and so $\mathbb{E}\psi_{0}\geqslant\mathbb{E}\psi\geqslant\mathbb{E}\psi_{1}$ . Consequently (again, see Figure 5), there is an intermediate function $\psi_{s}$ for some $s\in[0,1]$ such that

$\bullet$ $\psi_{s}(\theta)=\mathbb{E}\psi+\theta-\alpha_{s}$ for $\theta\in[\alpha_{s}-t,\alpha_{s}+t]$
$\bullet$ $\psi_{s}(\theta)=\mathbb{E}\psi-t$ for $\theta\in[\theta_{0},\alpha_{s}-t]$ and $\psi_{s}(\theta)=\mathbb{E}\psi+t$ for $\theta\in[\alpha_{s}+t,\theta_{1}]$
$\bullet$ $\mathbb{E}\psi_{s}=\mathbb{E}\psi$

As a consequence of these properties, the set $\{|\psi_{s}-\mathbb{E}\psi_{s}|<t\}=(\alpha_{s}-t,\alpha_{s}+t)$ is strictly contained in the set $\{|\psi-\mathbb{E}\psi|<t\}=(\theta_{0},\theta_{1})$ and so (again, in all cases of interest) $\nu\big(\{|\psi_{s}-\mathbb{E}\psi_{s}|\geqslant t\}\big)>\nu\big(\{|\psi-\mathbb{E}\psi|\geqslant t\}\big)$. This means that such $\psi$ cannot be extremal for the two-sided problem for this particular value of $t$ and shows that an extremal function must satisfy (66). An alternative take on this argument is that we just produced an extremal function, namely $\psi_{s}$, for which (66) holds (this doesn’t even require that the inequality stated earlier in this paragraph is strict).

In the above argument we tacitly assumed that $\theta_{0}$ and $\theta_{1}$ existed (i.e., the sets appearing in their definitions were nonempty) and that they belonged to the support of $\nu$; this is what we meant by “cases of interest.” However, if, for example, the set $\{\theta:\psi(\theta)=\mathbb{E}\psi-t\}$ was empty, then it would follow that in fact $\psi(\theta)>\mathbb{E}\psi-t$ for all $\theta$ and, consequently, $\{|\psi-\mathbb{E}\psi|\geqslant t\}=\{\psi-\mathbb{E}\psi\geqslant t\}$. This means that we would be back to the one-sided problem, for which we know that the functions $\phi_{a}$ are extremal. Another caveat is that if the interval $(\theta_{0},\theta_{1})$ was not included in $[-\pi/2,\pi/2]$ (the support of the measure $\nu$ from (25)), it might happen that $\mu(\{|\psi_{s}-\mathbb{E}\psi_{s}|<t\})=\nu\big((\alpha_{s}-t,\alpha_{s}+t)\big)$ is the same as $\mu(\{|\psi-\mathbb{E}\psi|<t\})=\nu\big((\theta_{0},\theta_{1})\big)$. Again, this is not a problem. First, we can replace one putatively extremal function $\psi$ by another one by modifying it outside of the support of $\nu$, which has no effect on the quantities under consideration. Next, $(\theta_{0},\theta_{1})\not\subset[-\pi/2,\pi/2]$ means that we are again de facto in the setting of the one-sided problem.

Step $4^{\circ}$ To summarize the analysis up to this point, the extremal functions $\psi$ satisfy the property (66), which can be subsumed as follows: for some $\alpha\in\mathbb{R}$ ,

(67)

\mathbb{E}\psi=\psi(\alpha)\quad\hbox{and}\quad\{|\psi-\mathbb{E}\psi|<t\}=(\alpha-t,\alpha+t).

Since the density of $\nu$ decreases away from $0$, it is apparent that $\nu\big((\alpha-t,\alpha+t)\big)$ will be minimized when $|\alpha|$ is as large as possible and, by Step $1^{\circ}$, it is enough to consider $\alpha\leqslant 0$. Given such a putatively extremal function $\psi$ (with associated $\alpha$), define $\widetilde{\psi}$ by

(68)

\widetilde{\psi}(\theta)=\phi_{\alpha+t}(\theta)+\psi(\alpha)-\alpha.

Then $\widetilde{\psi}(\theta)=\psi(\theta)$ for $\theta\in[\alpha-t,\alpha+t]$ and $\widetilde{\psi}\leqslant\psi$ everywhere else. Accordingly, $\mathbb{E}\widetilde{\psi}\leqslant\mathbb{E}\psi$, and so if $\widetilde{\alpha}$ is defined by $\widetilde{\psi}(\widetilde{\alpha})=\mathbb{E}\widetilde{\psi}$, then

\widetilde{\psi}(\widetilde{\alpha})=\mathbb{E}\widetilde{\psi}\leqslant\mathbb{E}\psi=\psi(\alpha)=\widetilde{\psi}(\alpha).

It follows that $\widetilde{\alpha}\leqslant\alpha$ and the inequality is strict unless $\widetilde{\psi}=\psi$ on the support of $\nu$ (in other words, $\nu$ -a.e.). Consequently,

(69)

\nu\big(\{|\psi-\mathbb{E}\psi|<t\}\big)=\nu\big((\alpha-t,\alpha+t)\big)\geqslant\nu\big((\widetilde{\alpha}-t,\widetilde{\alpha}+t)\big)=\nu\big(\{|\widetilde{\psi}-\mathbb{E}\widetilde{\psi}|<t\}\big)

and so if $\psi$ was extremal, so is $\widetilde{\psi}$ . Since the function $\widetilde{\psi}$ is – up to an additive constant – of the form $\phi_{a}$ , (42) follows.

It remains to justify the last assertion of Lemma 11, namely that “it is sufficient to consider $t=a-\mathbb{E}\phi_{a}$ .” In the argument above, the extremal function $\widetilde{\psi}$ being of the form $\phi_{\alpha+t}+c$ , this condition translates to $\widetilde{\alpha}=\alpha$ , an equality which can be immediately deduced in many cases of interest. This happens, for example, if the function $\rho$ defining the density of $\nu$ is strictly decreasing on its support (which holds in our setting, cf. (25)). In that case, the function $\alpha\to\nu\big{(}(\alpha-t,\alpha+t)\big{)}$ is strictly increasing for $\alpha<0$ (excluding, if applicable, the “trivial” range, i.e., the values of $\alpha$ for which the interval $(\alpha-t,\alpha+t)$ does not intersect the support of $\nu$ ). Now, if we had $\widetilde{\alpha}<\alpha$ , a strict inequality in (69) would follow, contradicting the extremality of $\psi$ .

This special case is sufficient for our intended applications of Lemma 11. Here is a sketch of the argument addressing the case of general $\rho$. Let $\psi_{1}:=\widetilde{\psi}$ and repeat the construction above with $\psi$ replaced by $\psi_{1}$ to obtain $\psi_{2}$, etc. Each $\psi_{k}$ is extremal and, up to an additive constant, is of the form $\phi_{a_{k}}$ with $a_{1}\geqslant a_{2}\geqslant\ldots$. Next, since $\lim_{x\to\infty}\rho(x)=0$, it follows that $\lim_{\alpha\to-\infty}\nu\big((\alpha-t,\alpha+t)\big)=0$, and so the sequence $(a_{k})$ must converge to some finite limit $a$. The limit function $\phi_{a}$ will also be extremal and will satisfy $t=a-\mathbb{E}\phi_{a}$. ∎
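For the measures $\nu_{n}$, the reduction to $\phi_{a}$ with $t=a-\mathbb{E}\phi_{a}$ makes the extremal two-sided tail computable. The Python sketch below is a hedged illustration, not part of the proof. It assumes $\phi_{a}(x)=\min(x,a)$, takes the density of $\nu_{n}$ to be proportional to $\cos^{n-2}\theta$ on $[-\pi/2,\pi/2]$, and compares the result with $e^{-nt^{2}/2}$, the two-sided bound quoted in the abstract. The function names and the Simpson-rule quadrature are choices made only for this example.

```python
import math

HALF_PI = math.pi / 2


def simpson(f, lo, hi, steps=2000):
    """Composite Simpson rule on [lo, hi]; steps must be even."""
    if hi <= lo:
        return 0.0
    h = (hi - lo) / steps
    total = f(lo) + f(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3.0


def rho(theta, n):
    """Unnormalized density of nu_n (assumed): cos^(n-2) on [-pi/2, pi/2]."""
    return max(math.cos(theta), 0.0) ** (n - 2)


def nu(lo, hi, n):
    """nu_n([lo, hi]), with the interval clipped to the support."""
    lo, hi = max(lo, -HALF_PI), min(hi, HALF_PI)
    mass = simpson(lambda x: rho(x, n), -HALF_PI, HALF_PI)
    return simpson(lambda x: rho(x, n), lo, hi) / mass


def gap(a, n):
    """a - E[phi_a] = integral of (a - x)^+ d nu_n(x), for a in the support."""
    mass = simpson(lambda x: rho(x, n), -HALF_PI, HALF_PI)
    return simpson(lambda x: (a - x) * rho(x, n), -HALF_PI, a) / mass


def solve_a(t, n, iters=60):
    """Bisection for t = a - E[phi_a]; valid for 0 < t < pi/2."""
    lo, hi = -HALF_PI, HALF_PI   # gap(-pi/2) = 0 and gap(pi/2) = pi/2
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if gap(mid, n) >= t:
            hi = mid
        else:
            lo = mid
    return hi


def two_sided(t, n):
    """nu_n(|phi_a - E phi_a| >= t) for the a solving t = a - E[phi_a]."""
    a = solve_a(t, n)
    # The complement of the event is the interval (a - 2t, a), cf. (67).
    return nu(a, HALF_PI, n) + nu(-HALF_PI, a - 2 * t, n)


for n, t in [(3, 0.5), (3, 1.0), (5, 0.5), (10, 0.3)]:
    print(n, t, two_sided(t, n), math.exp(-n * t * t / 2))
```

For $n=3$ the quantity $a-\mathbb{E}\phi_{a}$ has the closed form $(a+\pi/2-\cos a)/2$, which can be used to cross-check `solve_a`.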

5.3. Proof of Lemma 13.

Since the proof of Lemma 12 uses the inequalities from Lemma 13, we will start with the latter. Let us recall the main instances of the inequalities in question:

(45)

\cos^{n-2}\theta\geqslant e^{-n\theta^{2}/2}\quad\hbox{ for }\ \theta\in[0,3/\sqrt{n}]\ \hbox{ and }\ n\geqslant 5.

(46)

\cos^{n-1}\theta\leqslant e^{-n\theta^{2}/2}\quad\hbox{ for }\ \theta\in[\sqrt{6/n},\pi/2]\ \hbox{ and }\ n\geqslant 3.

Here is an outline of the proof. We start with (45), which is effectively a lower bound for the density of $\nu=\nu_{n}$. Taking logarithms of both sides and substituting $s=\theta\sqrt{n}$ we see that, for a given $n$, (45) is equivalent to

(70)

h(s)=h_{n}(s):=({n-2})\log\cos\frac{s}{\sqrt{n}}+\frac{s^{2}}{2}\geqslant 0\quad\hbox{ for }\ s\in[0,3].

It is readily verified that $h(0)=h^{\prime}(0)=0$, while $h^{\prime\prime}(s)>0$ on some interval $[0,\alpha)$ (where $\alpha>0$ depends on $n$) and $h^{\prime\prime}(s)<0$ for $s>\alpha$. Now, the domain of $h$ is $[0,\pi\sqrt{n}/2)$ and $\lim_{s\to\pi\sqrt{n}/2}h(s)=-\infty$, so it follows that $h(s)\geqslant 0$ on some interval $[0,\beta)$ (again, depending on $n$) and $h(s)<0$ for $s>\beta$. Accordingly, (70) holds, for a given $n$, iff $h_{n}(3)\geqslant 0$. Since (by elementary calculus) we have equality in the limit, i.e., $h_{n}(3)\to 0$ as $n\to\infty$, this would follow if the sequence $\big(h_{n}(3)\big)$ were nonincreasing. This is not exactly true, but almost: in fact it is nonincreasing starting with $n=7$. For $n\geqslant 8$, this can be established numerically by substituting, say, $s=\frac{3}{\sqrt{n}}$ (to have a compact interval to deal with) and by differentiating with respect to $s$. The remaining instances are handled by directly evaluating $h_{3}(2.67)$, $h_{4}(2.89)$, and $h_{k}(3)$ for $k=5,6,7$.

We now pass to the analysis of (46). As in the argument that led to (45), this reduces to verifying that the inequality holds for $\theta=\sqrt{6/n}$, and one way to show that is by establishing that the sequence $\big(\cos^{n-1}\sqrt{6/n}\big)$, which converges to $e^{-3}$, is increasing. Again, this can be verified by the method suggested above. This fact is fairly delicate and is (roughly) equivalent to the inequality $\cos s\leqslant e^{-3s^{2}/(6-s^{2})}$, which, while not standard, may be known, and is also fairly easy to show directly. ∎
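The numerical verifications mentioned in this proof are easy to reproduce. The following Python sketch is a floating-point sanity check on a grid, not a rigorous argument. It tests (45) and (46) in logarithmic form for a range of $n$ and evaluates the specific values $h_{3}(2.67)$ and $h_{4}(2.89)$ named above. The grid size, the tolerance `tol`, and the upper limit $n\leqslant 60$ are arbitrary choices for the example.

```python
import math


def h(n, s):
    """h_n(s) from (70): (n - 2) * log cos(s / sqrt(n)) + s^2 / 2."""
    return (n - 2) * math.log(math.cos(s / math.sqrt(n))) + s * s / 2


def check_45(n, points=2000, tol=1e-12):
    """(45) in log form on a grid: h_n(s) >= 0 for s in [0, 3]."""
    return all(h(n, 3.0 * i / points) >= -tol for i in range(points + 1))


def check_46(n, points=2000, tol=1e-12):
    """(46) in log form on a grid in [sqrt(6/n), pi/2)."""
    lo, hi = math.sqrt(6.0 / n), math.pi / 2
    for i in range(points):      # right endpoint skipped: log cos is -infinity there
        theta = lo + (hi - lo) * i / points
        if (n - 1) * math.log(math.cos(theta)) + n * theta * theta / 2 > tol:
            return False
    return True


print("(45) on grid, n = 5..60:", all(check_45(n) for n in range(5, 61)))
print("(46) on grid, n = 3..60:", all(check_46(n) for n in range(3, 61)))
print("h_3(2.67) =", h(3, 2.67), " h_4(2.89) =", h(4, 2.89))
print("h_n(3), n = 5..10:", [round(h(n, 3.0), 4) for n in range(5, 11)])
```

The last line prints the first few terms of the sequence $\big(h_{n}(3)\big)$, which makes the non-monotone start of that sequence visible.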

5.4. Proof of Lemma 12.

We first use the bound (45) to prove the estimate from Lemma 12 for $x\in\big[\frac{1}{2\sqrt{n}},\sqrt{\frac{6}{n}}\big]$. (Again, the choice of the $\sqrt{\frac{6}{n}}$ cutoff is predicated on the other approach taking care of $x\geqslant\sqrt{\frac{6}{n}}$.) The heuristics are as follows. First, having an estimate of the form $e^{-n\theta^{2}/2}$ allows us to express both sides of the inequality in terms of the variable $u=x\sqrt{n}$. Next, it turns out that after this substitution, the density of $\nu$ is bounded from below on the interval $0\leqslant u\leqslant 3$ (and, a fortiori, for $u\in[0,\sqrt{6}]$) by a strictly positive constant, independent of $n$. This means that $u\to\nu_{n}([u/\sqrt{n},\pi/2])$ can be upper-bounded on that interval by a strictly decreasing linear function, with equality at $u=0$, and so, for moderate values of $u$, we will have a strict separation between that function and $\frac{1}{2}e^{-nx^{2}/2}=\frac{1}{2}e^{-u^{2}/2}$, leading to the asserted improved upper bound.

Here are some of the details. First, the change of variables $u=x\sqrt{n},s=\theta\sqrt{n}$ leads to

\begin{aligned}\nu_{n}([0,x])&=\big(2I_{n-2}\big)^{-1}\int_{0}^{x}\cos^{n-2}\theta\,d\theta\\ &\geqslant\big(2I_{n-2}\big)^{-1}\int_{0}^{x}e^{-n\theta^{2}/2}\,d\theta\\ &=\big(2I_{n-2}\big)^{-1}\frac{1}{\sqrt{n}}\int_{0}^{u}e^{-s^{2}/2}\,ds.\end{aligned}

By Proposition 8(ii), the factor in front of the integral converges to $\frac{1}{\sqrt{2\pi}}$ as $n\to\infty$ (basically, this reflects the fact that, as we mentioned earlier, the random variable $\sqrt{n}\,\theta$ approximates the standard normal), but – unfortunately – is strictly smaller than the limit. If we ignore that discrepancy and compare the “ideal upper bound” $\frac{1}{2}-\frac{1}{\sqrt{2\pi}}\int_{0}^{u}e^{-s^{2}/2}\,ds$ for $\nu([x,\pi/2])$ to $\frac{2}{5}e^{-u^{2}/2}$ , we get the picture as shown in Figure 6.

Clearly and unsurprisingly, there is some room to spare between the ideal upper bound and the bound asserted in Lemma 12, which establishes that bound (in the range $\frac{1}{2\sqrt{n}}\leqslant x\leqslant\sqrt{\frac{6}{n}}$ ) for sufficiently large $n$ . Since the constants are explicit, we can verify that “sufficiently large” means here “ $n\geqslant 30$ .” Smaller values of $n$ can be checked directly.

Finally, we can check directly (numerically) that, for $n=3$ , the bound in question is valid when $u=x\sqrt{n}\geqslant 0.551$ .
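Both comparisons in this part of the proof can be reproduced with a few lines of Python. The sketch below is an illustration under assumptions: it reads the bound of Lemma 12 as $\nu_{n}([x,\pi/2])\leqslant\frac{2}{5}e^{-nx^{2}/2}$ (inferred from the comparison with $\frac{2}{5}e^{-u^{2}/2}$ above), and for $n=3$ it uses $\nu_{3}([x,\pi/2])=\frac{1}{2}(1-\sin x)$, which follows from the formula for $\nu_{n}([0,x])$ with $I_{1}=1$. The helper names and grid sizes are hypothetical.

```python
import math


def ideal_upper_bound(u):
    """1/2 - (2 pi)^(-1/2) * int_0^u exp(-s^2/2) ds, i.e. the Gaussian tail."""
    return 0.5 * math.erfc(u / math.sqrt(2.0))


def lemma12_rhs(u):
    """(2/5) * exp(-u^2/2), the assumed right-hand side of Lemma 12."""
    return 0.4 * math.exp(-u * u / 2.0)


def tail_nu3(x):
    """nu_3([x, pi/2]) = (1 - sin x) / 2 (density cos(theta)/2 assumed)."""
    return 0.5 * (1.0 - math.sin(x))


def grid(lo, hi, points=2000):
    """Equally spaced points in [lo, hi], endpoints included."""
    return [lo + (hi - lo) * i / points for i in range(points + 1)]


# Room to spare for the idealized comparison on u in [1/2, sqrt(6)]:
# the ratio below has to stay under 2/5.
ratio = max(ideal_upper_bound(u) / math.exp(-u * u / 2.0)
            for u in grid(0.5, math.sqrt(6.0)))
print("largest ratio ideal / exp(-u^2/2):", ratio)

# Direct check for n = 3, where u = x * sqrt(3) ranges up to pi * sqrt(3) / 2.
ok_n3 = all(tail_nu3(u / math.sqrt(3.0)) <= lemma12_rhs(u)
            for u in grid(0.551, math.pi * math.sqrt(3.0) / 2))
print("n = 3, bound holds on grid for u >= 0.551:", ok_n3)
```

The ratio in the first check is the Mills ratio divided by $\sqrt{2\pi}$, which is decreasing in $u$, so its maximum over the interval is attained at $u=\frac{1}{2}$.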

It remains to show the estimate from Lemma 12 for $x>\sqrt{\frac{6}{n}}$ . In that range, we will use the bound (20) from Proposition 6, which restated in the current context asserts that

\nu_{n}([x,\pi/2])\leqslant(\sqrt{2\pi}\,\kappa_{n}\sin x)^{-1}\cos^{n-1}x.

We next appeal to (46) to upper-bound $\cos^{n-1}x$ by $e^{-nx^{2}/2}$ . Once this is done, it remains to check that the coefficient $(\sqrt{2\pi}\,\kappa_{n}\sin x)^{-1}$ doesn’t exceed $0.4$ as long as $x\in\big{[}\sqrt{\frac{6}{n}},\frac{\pi}{2}\big{]}$ , which is straightforward and completes the proof. (The sequence $\big{(}\frac{\kappa_{n}}{\sqrt{n}}\big{)}$ being increasing, see Proposition 8, comes in handy here.) ∎

5.5. Proof of inequality (60)

We first rewrite the inequality from the conclusion of (60) as follows

(71)

\left(\frac{\cos(b+2t)}{\cos b}\right)^{n}\geqslant\frac{\cos^{2}(b+2t)}{4(n-1)\sin^{2}b}\,.

By elementary trigonometry,

(72)

\frac{\cos(b+2t)}{\cos b}=1-2\sin^{2}t-\frac{\sin b}{\cos b}\sin 2t.

We next use the constraint $t\leqslant\frac{I_{n-1}}{\pi n\sin^{2}b}\times\cos^{n}b$ from (60) to upper-bound the last two terms on the right-hand side of (72). We have

\begin{aligned}2\sin^{2}t&\leqslant 2\left(\frac{I_{n-1}}{\pi n\sin^{2}b}\times\cos^{n}b\right)^{2}\\ &\leqslant 2\left(\frac{I_{n-1}}{\pi n\,b^{2}}\right)^{2}\cos^{2(n-1)}b\\ &\leqslant\frac{2I_{n-1}^{2}}{\pi^{2}u^{4}}e^{-u^{2}}=\big(I_{n-1}^{2}n\big)\times\frac{2e^{-u^{2}}}{\pi^{2}u^{4}}\times\frac{1}{n}\end{aligned}

where we used consecutively the inequality $\frac{\cos y}{\sin^{2}y}\leqslant\frac{1}{y^{2}}$ , the substitution $u=b\sqrt{n}$ , and the bound (46), which applies since $u\geqslant\sqrt{6}$ . Similarly

\begin{aligned}\frac{\sin b}{\cos b}\sin 2t&\leqslant\frac{2I_{n-1}}{\pi n\sin b}\times\cos^{n-1}b\\ &\leqslant\frac{I_{n-1}}{nb}\times\cos^{n-1}b\\ &\leqslant\big(I_{n-1}\sqrt{n}\big)\times\frac{e^{-u^{2}/2}}{u}\times\frac{1}{n}.\end{aligned}

We now note that the sequence $n\to I_{n-1}\sqrt{n}$ decreases to $\sqrt{\frac{\pi}{2}}$ (by Proposition 8) and so, for $n\geqslant 3$, can be upper-bounded by $I_{2}\sqrt{3}=\frac{\sqrt{3}\pi}{4}$. Moreover, the coefficients $\frac{2e^{-u^{2}}}{\pi^{2}u^{4}}$ and $\frac{e^{-u^{2}/2}}{u}$ are decreasing functions of $u$, so they can be upper-bounded by their values at $u=\sqrt{6}$. Putting these estimates together, we conclude that

(73)

\frac{\cos(b+2t)}{\cos b}\geqslant 1-\frac{\alpha}{n},\quad\hbox{where}\quad\alpha=\frac{1}{96e^{6}}+\frac{\pi}{4\sqrt{2}e^{3}}<0.028.

At the same time, again for $u\geqslant\sqrt{6}$ and $n\geqslant 3$ ,

\frac{\cos^{2}(b+2t)}{4(n-1)\sin^{2}b}\leqslant\frac{\cos^{2}b}{4(n-1)\sin^{2}b}\leqslant\frac{n}{4(n-1)}\times\frac{1}{nb^{2}}\leqslant\frac{3}{8u^{2}}\leqslant\frac{1}{16}.

Now, for $\alpha\in(0,1)$, the sequence $n\to\left(1-\frac{\alpha}{n}\right)^{n}$ increases to $e^{-\alpha}$, so it can be lower-bounded by its initial term. In our setting, the initial term corresponds to $n=3$, and so $\left(1-\frac{\alpha}{n}\right)^{n}\geqslant\left(1-\frac{\alpha}{3}\right)^{3}>0.97$. This means that the inequality (71) indeed holds and, again, it is not close. ∎
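The constants in this proof can be double-checked numerically. The Python sketch below is only a sanity check under one assumption: $I_{k}=\int_{0}^{\pi/2}\cos^{k}\theta\,d\theta$, which is consistent with $I_{2}=\frac{\pi}{4}$ used above and is computed here by the standard recursion $I_{k}=\frac{k-1}{k}I_{k-2}$. The helper `wallis` and the sample values of $b$ and $t$ are hypothetical choices for the example.

```python
import math


def wallis(k):
    """I_k = int_0^{pi/2} cos^k (assumed definition), via I_k = (k-1)/k * I_{k-2}."""
    value = math.pi / 2 if k % 2 == 0 else 1.0   # I_0 = pi/2, I_1 = 1
    for j in range(2 + k % 2, k + 1, 2):
        value *= (j - 1) / j
    return value


# The sequence n -> I_{n-1} * sqrt(n) should decrease towards sqrt(pi/2).
seq = [wallis(n - 1) * math.sqrt(n) for n in range(3, 200)]
decreasing = all(x > y for x, y in zip(seq, seq[1:]))

# The constant alpha from (73), assembled from the two estimates at u = sqrt(6).
c = seq[0]                                   # I_2 * sqrt(3) = sqrt(3) * pi / 4
u0 = math.sqrt(6.0)
alpha = (c ** 2 * 2 * math.exp(-u0 ** 2) / (math.pi ** 2 * u0 ** 4)
         + c * math.exp(-u0 ** 2 / 2) / u0)
alpha_closed = 1 / (96 * math.exp(6)) + math.pi / (4 * math.sqrt(2) * math.exp(3))

lhs_lower = (1 - alpha / 3) ** 3             # lower bound for the left side of (71)
rhs_upper = 3 / (8 * u0 ** 2)                # upper bound for the right side of (71)

# Spot check of the trigonometric identity (72) at arbitrary sample values.
b, t = 0.7, 0.05
identity_gap = abs(math.cos(b + 2 * t) / math.cos(b)
                   - (1 - 2 * math.sin(t) ** 2 - math.tan(b) * math.sin(2 * t)))

print(decreasing, alpha, alpha_closed, lhs_lower, rhs_upper, identity_gap)
```

The gap between `lhs_lower` and `rhs_upper` shows how much room the final comparison leaves.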