1 Overview

1.1 Two KAM Theorems for Properly Degenerate Hamiltonian Systems

We deal with Hamiltonians that are close to integrable [see, e.g., Gallavotti (1986)] and in which, in addition, the perturbing term may have more degrees of freedom than the unperturbed part. Hamiltonians of this kind often arise in problems of celestial mechanics and are referred to as “properly degenerate”, after Arnold (1963). We denote them as

$$\begin{aligned} H(I, \varphi , p, q; {\mu })=H_0(I)+{\mu }\,P(I, {\varphi }, p, q; {\mu }) , \end{aligned}$$

where the coordinates \((I, \varphi )=(I_1, \ldots , I_n, \varphi _1, \ldots , \varphi _n)\) are of “action-angle” kind (after a possible application of the Liouville–Arnold theorem to the unperturbed term), while (for our needs) the \((p, q)=(p_1, \ldots , p_m, q_1, \ldots , q_m)\) are “rectangular”, namely, they take values in a small ball (say, of radius \(\varepsilon _0\)) about some point (say, the origin). The symplectic form is standard:

$$\begin{aligned} \Omega =dI\wedge d\varphi +dp\wedge dq=\sum _{i=1}^n dI_i\wedge d\varphi _i+\sum _{i=1}^m dp_i\wedge dq_i . \end{aligned}$$

We work in the real-analytic framework, which means that we assume that H admits a holomorphic extension on a complex neighborhood of the real “phase space” (namely, the domain)

$$\begin{aligned} \mathcal {P}_{\varepsilon _{0}}:=V\times {{\mathbb {T}}}^n\times B^{2m}_{\varepsilon _0}, \end{aligned}$$

where \(V\subset {{\mathbb {R}}}^n\) is bounded, open and connected, \({{\mathbb {T}}}={{\mathbb {R}}}/(2\pi {\mathbb {Z}})\) is the “flat torus”, and \(B^{2m}_\varepsilon \) is the 2m-dimensional ball around 0 of radius \(\varepsilon \), relative to some norm in \({\mathbb {R}}^{2m}\).

In this framework, we present two “KAM theorems” which deal with different situations. A basic assumption, common to both statements and often referred to as the “Kolmogorov condition”, is:

(A\({}_1\)):

the map \(I\rightarrow \partial _I H_0(I)\) is a diffeomorphism of V.

However, due to the proper degeneracy mentioned above, this assumption must be supplemented by a condition on the perturbing term or, more precisely, on its Lagrange average

$$\begin{aligned} P_\textrm{av}(I, p, q; \mu ):=\frac{1}{(2\pi )^n}\int _{[0, 2\pi ]^n}P(I, \varphi , p, q; \mu )d^n\varphi \end{aligned}$$

with respect to the \(\varphi \)-coordinates. This extra assumption differs between the two statements, so we state it separately in each case.
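As a minimal illustration of the averaging operation, consider a hypothetical toy perturbation with \(n=m=1\) (chosen only for this example, not taken from the application):

$$\begin{aligned} P(I,\varphi ,p,q)=\frac{p^2+q^2}{2}\,(1+\cos \varphi )+pq\,\sin ^2\varphi \quad \Longrightarrow \quad P_\textrm{av}(I,p,q)=\frac{p^2+q^2}{2}+\frac{pq}{2} , \end{aligned}$$

since \(\cos \varphi \) has zero mean and \(\sin ^2\varphi \) has mean \(\frac{1}{2}\) over \([0, 2\pi ]\). In general, only the \(\varphi \)-independent term of the Fourier expansion of P survives the average.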

The first result revisits the so-called Fundamental Theorem of V.I. Arnold (Arnold 1963). This theorem has already been studied, generalized and extended in previous works (Chierchia and Pinzari 2010; Pinzari 2018). Here, we deal with the situation (not considered in the aforementioned papers) where \(P_\textrm{av}\) admits a “Birkhoff Normal Form” (BNF hereafter) about \((p, q)=(0, 0)\) of high order, say s. As expected, a higher order of BNF allows one to improve the measure of the “Kolmogorov set”, namely the set given by the union of all KAM tori. We shall prove the following

Theorem 1.1

Assume \((A_1)\) above and the following conditions:

(A\({}_2\)):

\( P_\textrm{av} (I, p, q)=\sum _{j=1}^{s}\mathcal{P} _j(r;I) +\textrm{O}_{2 s+1}(p,q;I)\), with \(r_i:=\frac{p_i^2+q_i^2}{2}\) and \(\mathcal{P} _j(r;I)\) being a polynomial of degree j in \(r=(r_1, \cdots , r_{m})\), for some \(2\le s\in {{\mathbb {N}}}\).

(A\({}_3\)):

the \(m\times m\) matrix \(\beta (I)\) of the coefficients of the second-order term \(\mathcal{P} _2(r;I)=\frac{1}{2}\sum _{i,j=1}^{m} {\beta }_{ij}(I)r_i r_j\) is non-degenerate: \(|\det {\beta }(I)|\ge \mathrm{const}>0\) for all \(I\in V\).

Then, there exist positive numbers \(\varepsilon _*<\varepsilon _0\), \(C_*\) and \(c_*\) such that, for

$$\begin{aligned} 0<\varepsilon<\varepsilon _*\ , \qquad 0<{\mu }<\frac{\varepsilon ^{2s+2}}{C_* (\log \varepsilon ^{-1})^{c_*}}\ . \end{aligned}$$
(1)

one can find a set \(\mathcal{K} \subset \mathcal{P} _\varepsilon \) formed by the union of H-invariant n-dimensional Lagrangian tori, on which the H-motion is analytically conjugated to linear Diophantine quasi-periodic motions with frequencies \(({\omega }_1,{\omega }_2)\in {\mathbb R}^{n_1}\times {{\mathbb {R}}}^{n_2}\) with \({\omega }_1=O(1)\) and \({\omega }_2=O({{\mu }})\). The set \(\mathcal{K} \) has positive Liouville–Lebesgue measure and satisfies

$$\begin{aligned} \mathrm{meas}\,\mathcal{P} _\varepsilon>\mathrm{meas}\,\mathcal{K} > \Big (1- C_* {\varepsilon }^{s-\frac{3}{2}}\Big )\,\mathrm{meas}\,\mathcal{P} _\varepsilon \ . \end{aligned}$$
(2)
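For orientation, the exponents in (1) and (2) can be made explicit for a specific order. The value \(s=4\) is used here only because it is the one mentioned in remark (ii) below as sufficient for the application; the constants \(C_*\), \(c_*\) are those of the theorem and are not computed.

$$\begin{aligned} s=4:\qquad 0<{\mu }<\frac{\varepsilon ^{10}}{C_* (\log \varepsilon ^{-1})^{c_*}} ,\qquad \mathrm{meas}\,\mathcal{K} > \Big (1- C_* {\varepsilon }^{\frac{5}{2}}\Big )\,\mathrm{meas}\,\mathcal{P} _\varepsilon . \end{aligned}$$

Raising s by one thus gains one power of \(\varepsilon \) in the measure estimate, at the cost of two further powers of \(\varepsilon \) in the smallness condition on \(\mu \).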

The second result deals with lower-dimensional quasi-periodic motions, the so-called whiskered tori. These are n-dimensional quasi-periodic motions (in a phase space of dimension \(2n+2m\)), approached or left at an exponential rate. For simplicity, in view of our application, we focus on the case \(m=1\). In addition, we allow a further degeneracy in the Hamiltonian: the unperturbed term \(H_0\) may depend on only some of the actions I, rather than on all of them.

Theorem 1.2

Let \(m=1\), and let \(H_0\) depend only on the components \(I_1=(I_{11}, \ldots , I_{1n_1})\) of \(I=(I_1, I_2)\), with \(1\le n_1\le n:=n_1+n_2\). Assume \((A_1)\) with \(I_1\) replacing I and, in addition, that

(A\({}_2'\)):

\( P_\textrm{av} (I, p, q; {\mu })=P_0 (I, pq; {\mu })+ P_1 (I,\varphi , p, q; {\mu })\) with \(\Vert P_1\Vert \le a \Vert P_0\Vert \);

(A\({}_3'\)):

\(| \partial _{pq} P_0 |\ge \mathrm{const}>0\) and \(|\det \partial ^2_{I_2} P_0 |\ge \mathrm{const}>0\) if \(n_2\ne 0\).

Fix \(\eta >0\). Then, there exist positive numbers \(a_*\), \(\varepsilon _*<\varepsilon _0\), \(C_*\) and \(c_*\) such that, if

$$\begin{aligned} 0<\varepsilon<\varepsilon _* ,\quad 0< a< a _*\varepsilon ^4\ ,\quad 0<{\mu }<\frac{C_* (a \Vert P_0\Vert )^{1+\eta }}{ (\log a ^{-1})^{c_*}} \end{aligned}$$
(3)

one can find a set \(\mathcal{K} \) formed by the union of H-invariant n-dimensional Lagrangian tori, on which the H-motion is analytically conjugated to linear Diophantine quasi-periodic motions with frequencies \(({\omega }_1,{\omega }_2)\in {{\mathbb {R}}}^{n_1}\times {{\mathbb {R}}}^{n_2}\) with \({\omega }_1=O(1)\) and \({\omega }_2=O({{\mu }})\). The projection \(\mathcal{K} _0\) of the set \(\mathcal{K} \) on \(\mathcal{P} _0:=V\times {{\mathbb {T}}}^{n}\) has positive Liouville–Lebesgue measure and satisfies

$$\begin{aligned} \mathrm{meas}\,\mathcal{P}_0>\mathrm{meas}\,\mathcal{K} _0> \Big (1- C_* \sqrt{ a }\Big )\,\mathrm{meas}\,\mathcal{P}_0\ . \end{aligned}$$
(4)

Furthermore, for any \(\mathcal{T} \in \mathcal{K} \) there exist two \((n+1)\)-dimensional invariant manifolds \(\mathcal{W}_\textrm{u}\), \(\mathcal{W}_\textrm{s}\subset \mathcal{P} _{\varepsilon _*}\) such that \(\mathcal{T} =\mathcal{W}_\textrm{u}\cap \mathcal{W}_\textrm{s}\) and the motions on \(\mathcal{W}_\textrm{u}\), \(\mathcal{W}_\textrm{s}\) respectively leave and approach \(\mathcal{T} \) at an exponential rate.

Before describing how we aim to use the theorems above, we make some comments.

  (i)

    The conditions involving \(\mu \) in (1) and (3) are not optimal. With a procedure similar to the one shown in Chierchia and Pinzari (2010, proof of Theorem 1.2, steps 1–4), one can show that they can be relaxed to, respectively

    $$\begin{aligned} {\mu }<\frac{1}{{C_* (\log \varepsilon ^{-1})^{2b}}} ,\qquad {\mu }<\frac{1}{{C_* (\log (a \Vert P_0\Vert )^{-1})^{2b}}} \end{aligned}$$

    with some \(C_*\), \(b>0\).

  (ii)

    The careful bounds on the measure of the invariant sets provided in (2) and (4) are needed in view of our application. Indeed, we shall apply both theorems above to prove that, in the three-body problem, close to the co-planar, co-circular, outer retrograde configuration (see below for the exact definition), full-dimensional and “whiskered” quasi-periodic tori co-exist [the result was conjectured in Pinzari (2018)]. In the application, \(\varepsilon \) will correspond to the maximum eccentricity or inclination and a to the semi-major axes ratio; the use of a high-order BNF in Theorem 1.1 will be necessary because the size of the set goes to 0 with some power of \(\varepsilon \) (\(s=4\) will be enough for our application).

  (iii)

    Following Chierchia and Gallavotti (1994), Theorem 1.2 might be extended to prove the existence of “diffusion paths” and “whisker ladders”. We shall not do so, as proving Arnold instability [in the sense of Arnold (1964)] for the system (5) below is not the purpose of this paper. We remark, however, that this kind of instability has been found for the four-body problem in a very similar framework (Clarke et al. 2022), and that proofs of chaos or Arnold instability in celestial mechanics are quite recent (Féjoz et al. 2014; Delshams et al. 2019), owing to the difficulty of overcoming the so-called problem of large gaps. See Guzzo et al. (2020) and references therein.

  (iv)

    Another important aspect in view of the application described above is a rather standard consequence of the proof of Theorem 1.2: if P (namely, \(P_1\)) has an equilibrium at \((p, q)=0\), then, along the motions of \(\mathcal K\), the coordinates \((p, q)\) remain fixed at (0, 0) (rather than varying close to it), namely

    $$\begin{aligned} \mathcal{K}\subset V\times {{\mathbb {T}}}^{n}\times \{(0, 0)\} . \end{aligned}$$

    More generally, the stable and unstable invariant manifolds do not shift from the unperturbed ones:

    $$\begin{aligned} \mathcal{W}_\textrm{s}\subset \mathcal{P} _{\varepsilon }\cap \big \{q=0\big \} ,\qquad \mathcal{W}_\textrm{u}\subset \mathcal{P} _{\varepsilon }\cap \big \{p=0\big \} . \end{aligned}$$
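The inclusions above can be read off from the simplest Hamiltonian compatible with (A\({}_2'\)) and (A\({}_3'\)). The following is only a sketch under stated assumptions: \(P_1\equiv 0\), \(P_0=\lambda (I)\,pq\) with a hypothetical coefficient \(\lambda (I)>0\), and Hamilton's equations for the form \(dp\wedge dq\) written with the sign convention \(\dot{q}=\partial _p H\), \(\dot{p}=-\partial _q H\).

$$\begin{aligned} H=H_0(I)+{\mu }\,\lambda (I)\,pq:\qquad \dot{I}=0 ,\qquad \dot{p}=-{\mu }\lambda (I)\,p ,\qquad \dot{q}={\mu }\lambda (I)\,q , \end{aligned}$$

$$\begin{aligned} p(t)=p(0)\,e^{-{\mu }\lambda t} ,\qquad q(t)=q(0)\,e^{{\mu }\lambda t} . \end{aligned}$$

In this model, the set \(\{p=q=0\}\) is invariant, motions on \(\{q=0\}\) approach it at the exponential rate \({\mu }\lambda \) as \(t\rightarrow +\infty \), and motions on \(\{p=0\}\) approach it as \(t\rightarrow -\infty \). With \(\lambda <0\), or with the opposite sign convention, the rôles of the two hyperplanes are exchanged.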

1.2 Application to the three-body problem

We apply the results above to prove that, in a region of the phase space of the three-body problem, and under conditions that will be specified later, full-dimensional and whiskered tori co-exist. We underline that the co-existence of such different kinds of motion is not a mere consequence of the non-integrability of the system (in which case the result would be somewhat expected), as it persists in two suitable integrable approximations of the system, close to each other. Indeed, such motions will be found in a very small zone of the phase space of the three-body problem which lies simultaneously in the neighborhood of an elliptic equilibrium of one of these approximations and of a hyperbolic equilibrium of the other. This occurrence is intimately related to the use of two different systems of coordinates, which are singular with respect to each other in the region of interest. The authors are not aware of any previous appearance of this phenomenon.

After the “heliocentric reduction” of translational invariance, the three-body problem Hamiltonian with gravitational masses equal to \(m_0\), \({\mu }m_1\) and \({\mu }m_2\) and Newton constant \(\mathcal{G} \equiv 1\), takes the form of the two-particle system [see, e.g., Féjoz (2004), Laskar and Robutel (1995) for a derivation]:

$$\begin{aligned} \textrm{H}_\textrm{3b}=\sum _{i=1}^2\left( \frac{|y^{\mathrm{(i)}}|^2}{2\textrm{m}_i}-\frac{\textrm{m}_i\textrm{M}_i}{|x^{\mathrm{(i)}}|}\right) +\mu \left( -\frac{m_1m_2}{|x^{(1)}-x^{(2)}|}+\frac{y^{(1)}\cdot y^{(2)}}{m_0}\right) \end{aligned}$$
(5)

with suitable values of \(\textrm{m}_i=m_i+\textrm{O}(\mu )\), \(\textrm{M}_i=m_0+\textrm{O}(\mu )\). We consider the system in the Euclidean space, namely we take, in (5), \(y^{\mathrm{(i)}}\), \(x^{\mathrm{(i)}}\in {\mathbb {R}}^3\), with \(x^{(1)}\ne x^{(2)}\).

We call Kepler maps the class of symplectic coordinate systems \(\mathcal{C} =({\Lambda }_1, {\Lambda }_2, \ell _1, \ell _2, \mathrm y, \mathrm x)\) for the Hamiltonian (5), where \( \mathrm y=(y_1, \ldots , y_4), \mathrm x=(x_1, \ldots , x_{4})\), such that:

  • \(\Lambda _i=\textrm{m}_i \sqrt{\textrm{M}_i a_i}\), where \(a_i\) denotes the semi-major axis of the \(i^\textrm{th}\) instantaneous ellipse;

  • \(\ell _1, \ell _2\in {\mathbb {T}}\) are conjugated to \({\Lambda }_1\), \({\Lambda }_2\). Such angles are defined in a different way according to the choice of \(\mathcal{C} \). In all known examples, they are related to the area spanned by the planet along the instantaneous ellipse.

Using a Kepler map, the Hamiltonian (5) takes the form

$$\begin{aligned} \textrm{H}_\mathcal{C} =-\frac{\textrm{m}^3_1\textrm{M}^2_1}{2{\Lambda }_1^2}-\frac{\textrm{m}^3_2\textrm{M}^2_2}{2{\Lambda }_2^2}+{\mu }f_\mathcal{C} ({\Lambda }_1, {\Lambda }_2, \ell _1, \ell _2, {\hat{\text {y}}}, \hat{\text {x}}) \end{aligned}$$
(6)

where \( \hat{\text {y}}, \hat{\text {x}}\) include the couples \( (\textrm{y}_i, \textrm{x}_i)\) such that neither \(\textrm{y}_i\) nor \(\textrm{x}_i\) is negligible. The coordinates \( {{\hat{\text {y}}}}, \hat{\text {x}}\) are often called degenerate coordinates, because they do not appear in (6) when \(\mu \) is set to zero. In other words, \(\textrm{H}_\mathcal{C} \) is a properly degenerate close-to-integrable system, in the sense of the previous section.
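The form of the unperturbed part of (6) follows from the definition of \(\Lambda _i\). The short check below uses only the standard fact that the Kepler Hamiltonian \(\frac{|y|^2}{2\textrm{m}}-\frac{\textrm{m}\textrm{M}}{|x|}\) takes the value \(-\frac{\textrm{m}\textrm{M}}{2a}\) on an ellipse of semi-major axis a:

$$\begin{aligned} \Lambda _i=\textrm{m}_i \sqrt{\textrm{M}_i a_i}\quad \Longrightarrow \quad a_i=\frac{\Lambda _i^2}{\textrm{m}_i^2\textrm{M}_i} ,\qquad -\frac{\textrm{m}_i\textrm{M}_i}{2a_i}=-\frac{\textrm{m}^3_i\textrm{M}^2_i}{2{\Lambda }_i^2} . \end{aligned}$$

Summing over \(i=1, 2\) gives the \(\mu \)-independent terms of (6), which depend on the actions \(\Lambda _1\), \(\Lambda _2\) only.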

We call co-planar, co-circular, outer retrograde configuration the configuration of two planets in circular and co-planar motions, with the angular momentum of the outer planet pointing opposite to the total angular momentum. In Pinzari (2018) it was pointed out that, under a careful choice of \(\mathcal{C} \), this configuration plays the rôle of an equilibrium for the \((\ell _1, \ell _2)\)-averaged perturbing function

$$\begin{aligned} \overline{f}_\mathcal{C} ({\Lambda }_1, {\Lambda }_2, \hat{\text {y}}, \hat{\text {x}})=\frac{1}{(2\pi )^2}\int _{[0, 2\pi ]^2}f_\mathcal{C} ({\Lambda }_1, {\Lambda }_2, \ell _1, \ell _2, \hat{\text {y}}, \hat{\text {x}}) d\ell _1d\ell _2 . \end{aligned}$$

But what matters more is that, close to this equilibrium, there exist two Kepler maps \(\mathcal{C} _1\), \(\mathcal{C} _2\) such that the Hamiltonian \(\textrm{H}_{\mathcal{C} _1}\) is suited to Theorem 1.1, while \(\textrm{H}_{\mathcal{C} _2}\) is suited to Theorem 1.2. This leads to the following result, which states the co-existence of stable and whiskered quasi-periodic motions in the three-body problem. It will be made more precise (see Theorem 2.1) and proved in the course of the paper.

Theorem A

In the vicinity of the co-planar, co-circular, outer retrograde configuration, and provided that the masses of the planets and the semi-axes ratio are small, there exists a positive measure set \(\mathcal{K} _1\) made of 5-dimensional quasi-periodic motions \(\mathcal{T} _1\)’s “surrounding” (in a sense which will be specified) 3-dimensional quasi-periodic motions \(\mathcal{T} _2\)’s, each equipped with two invariant manifolds, called the unstable and the stable manifold, on which the motions are asymptotic to the \(\mathcal{T} _2\)’s in the past and in the future, respectively.

We conclude by describing how this paper is organized.

  • In Sects. 2.1 and 2.2 we recall the main arguments of the discussion in Pinzari (2018), which lead to putting the system (5) into a form suited to the application of Theorems 1.1 and 1.2.

  • In Sects. 2.3 and 2.4 we check that the two domains where Theorems 1.1 and 1.2 apply have a non-empty intersection, and that this intersection includes both families of tori. This check is subtle, because of the difference between the frameworks used.

  • In Sect. 3, we prove Theorems 1.1 and 1.2 via a carefully quantified KAM theory.

2 Ellipticity and Hyperbolicity Closely to Co-planar, Co-circular, Outer Retrograde Configuration

Putting the system in a form suited to Theorem  1.1 requires identifying an elliptic equilibrium, while Theorem 1.2 calls for a hyperbolic one.

Denoting as \(\text {C}^{{(j)}}\,{:}{=}\,x^{{(j)}}\times y^{{(j)}}\) the angular momenta of the planets, we proceed to study motions evolving from initial data close to the manifold

$$\begin{aligned} \mathcal{M}_{\pi }:=\Big \{(y, x):\ \textrm{C}^{(1)}\parallel (-\textrm{C}^{(2)})\parallel \textrm{C},\ \textrm{and}\ x^{(1)},\ x^{(2)}\ \mathrm{describe\ circular\ motions .} \Big \} .\nonumber \\ \end{aligned}$$
(7)

The subscript “\({\pi }\)” recalls that \(\mathrm C^{(1)}\) and \(\mathrm C^{(2)}\) are opposite. In the next two sections, we recall material from Pinzari (2018), which highlights a sort of “double (elliptic, hyperbolic) nature” of \(\mathcal{M}_{\pi }\).

2.1 Ellipticity (with BNF)

Basically, the construction of the elliptic equilibrium, and of its associated BNF, proceeds as in Chierchia and Pinzari (2011). We briefly summarize the procedure here.

We fix a domain \(\mathcal{D} _{{{c}}}\subset {{\mathbb {R}}}^{12}\) for impulse-position “Cartesian” coordinates

$$\begin{aligned} {c}=(y, x):=(y^{(1)}, y^{(2)}, x^{(1)}, x^{(2)}) \end{aligned}$$

of two point masses relative to a fixed orthonormal frame \((k^{(1)}, k^{(2)}, k^{(3)})\) in \({{\mathbb {R}}}^3\). As a first step, we switch to a set of coordinates, well known in the literature, which we name jrd, after C. G. J. Jacobi, R. Radau and A. Deprit (Jacobi 1842; Radau 1868; Deprit 1983), who, at different stages, contributed to their construction.

We fix a region of phase space where the orbits \(t\rightarrow (x^{(j)}(t), y^{(j)}(t))\) generated by the unperturbed “Kepler” Hamiltonians

$$\begin{aligned} \textrm{h}^{(j)}_{\textrm{k}}:=\frac{|y^{(j)}|^2}{2\textrm{m}_j}-\frac{\textrm{m}_j\textrm{M}_j}{|x^{(j)}|} \end{aligned}$$

in (5) are ellipses with non-vanishing eccentricity. Then, we denote as \(\textrm{P}^{(j)}\) the unit vectors pointing in the directions of the perihelia; as \(a_j\) the semi-major axes; as \(\ell _j\) the “mean anomaly” of \(x^{(j)}\) (which, we recall, is defined as the area of the elliptic sector from \(\textrm{P}^{(j)}\) to \(x^{(j)}\) “normalized at \(2{\pi }\)”); as \(\textrm{C}^{(j)}=x^{(j)}\times y^{(j)}\), \(j=1, 2\), the angular momenta of the two planets; and as \(\textrm{C}:=\textrm{C}^{(1)}+\textrm{C}^{(2)}\) the total angular momentum integral. We assume that the “nodes”

$$\begin{aligned} {\nu }_1:=k^{(3)}\times \textrm{C}\ ,\quad {\nu }:=\textrm{C}\times \textrm{C}^{(1)}={\textrm{C}}^{(2)}\times {\textrm{C}}^{(1)}\end{aligned}$$
(8)

never vanish. This condition is equivalent to asking that the planes determined by the instantaneous ellipses and the \((k^{(1)}, k^{(2)})\) plane never pairwise coincide. As in previous works, we use the following notation. For three vectors u, v, w with u, \(v\perp \) w, we denote as \({\alpha }_{w}(u,v)\) the angle from u to v relative to the positive (counterclockwise) orientation established by w. Then, the jrd coordinates are here denoted with the symbols

$$\begin{aligned} {{jrd}}:=\Big (\widehat{{jrd}}\,{:}{=}\,({\Lambda }_1,{\Lambda }_2,\textrm{G}_1,\textrm{G}_2, \ell _1,\ell _2,{\gamma }_1,{\gamma }_2), (\textrm{G},\textrm{Z},{\gamma }, \zeta )\Big )\in {{\mathbb {R}}}^4\times {{\mathbb {T}}}^4\times {{\mathbb {R}}}^2\times {{\mathbb {T}}}^2\nonumber \\ \end{aligned}$$
(9)

and defined via the formulae

$$\begin{aligned} \left\{ \begin{array}{l} \textrm{Z}:=\textrm{C}\cdot k^{(3)}\\ \textrm{G}:=\Vert \textrm{C}\Vert \\ \textrm{G}_1:=\Vert \textrm{C}^{(1)}\Vert \\ \textrm{G}_2:=\Vert \textrm{C}^{(2)}\Vert \\ {\Lambda }_j:=\textrm{m}_j\sqrt{\textrm{M}_j a_j} \end{array} \right. \qquad \left\{ \begin{array}{l} \zeta :={\alpha }_{k^{(3)}}(k^{(1)}, {\nu }_1)\\ {\gamma }:={\alpha }_{\textrm{C}}({\nu }_1, {\nu })\\ {{\gamma }}_1:={\alpha }_{{\textrm{C}^{(1)}}}({\nu }, \textrm{P}^{(1)})\\ {{\gamma }}_2:={\alpha }_{\textrm{C}^{(2)}}({\nu },\textrm{P}^{(2)})\\ \ell _j := \mathrm{mean\ anomaly\ of}\ x^{(j)} \end{array} \right. \end{aligned}$$
(10)

The main point of jrd is that \(\textrm{Z}\), \(\zeta \) and \(\gamma \) are ignorable coordinates and \(\textrm{G}\) is constant along the motions of SO(3)-invariant systems. Therefore, most motions of SO(3)-invariant systems are effectively described by the “reduced” coordinates \(\widehat{{jrd}}\). This strong property cannot be exploited in the case studied in this paper, as the manifold \(\mathcal{M}_{\pi }\) in (7) is a singularity of the change (10). More generally, any co-planar or circular configuration is so. Much as in Chierchia and Pinzari (2011), we bypass this difficulty by switching to new coordinates denoted as

$$\begin{aligned} {{rps}}_{\pi }\,{:}{=}\,\Big (\widehat{{rps}}_{\pi }\,{:}{=}\,({\Lambda }_1,{\Lambda }_2,{\lambda }_1,{\lambda }_2, \eta _1,\eta _2, \xi _1,\xi _2, p, q), (Z, \zeta ) \Big ) \end{aligned}$$

where the \(\Lambda _j\)’s, \(\textrm{Z}\) and \(\zeta \) are the same as in (9), while

$$\begin{aligned} \left\{ \begin{array}{l} \displaystyle \lambda _1=\ell _1+\gamma _1+\gamma \\ \displaystyle \lambda _2=\ell _2+\gamma _2-\gamma \\ \displaystyle \eta _1+\textrm{i}\xi _1=\sqrt{2({\Lambda }_1-\textrm{G}_1)}e^{-\textrm{i}({\gamma }_1+{\gamma })}\\ \displaystyle \eta _2+\textrm{i}\xi _2=-\sqrt{2({\Lambda }_2-\textrm{G}_2)}e^{\textrm{i}(-{\gamma }_2+{\gamma })}\\ \displaystyle p+\textrm{i}q=-\sqrt{2(\textrm{G}+\textrm{G}_2-\textrm{G}_1)} e^{\textrm{i}{\gamma }} \end{array}\right. \end{aligned}$$
(11)

As in jrd, \((\textrm{Z}, \zeta )\) is a cyclic couple in SO(3)-invariant Hamiltonians, but now no other cyclic coordinate appears. This leaves the system with 5 degrees of freedom and an extra integral: the action \(\textrm{G}\) written using \({rps}_\pi \):

$$\begin{aligned} \textrm{G}_{{rps}_\pi }\,{:}{=}\,\Lambda _1-\Lambda _2-\frac{\eta _1^2+\xi _1^2}{2}+\frac{\eta _2^2+\xi _2^2}{2}+\frac{p^2+q^2}{2} .\end{aligned}$$
(12)
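Formula (12) can be checked directly against (11). Taking squared moduli in the last three relations of (11) gives

$$\begin{aligned} \frac{\eta _1^2+\xi _1^2}{2}={\Lambda }_1-\textrm{G}_1 ,\qquad \frac{\eta _2^2+\xi _2^2}{2}={\Lambda }_2-\textrm{G}_2 ,\qquad \frac{p^2+q^2}{2}=\textrm{G}+\textrm{G}_2-\textrm{G}_1 , \end{aligned}$$

so that the right-hand side of (12) becomes

$$\begin{aligned} \Lambda _1-\Lambda _2-({\Lambda }_1-\textrm{G}_1)+({\Lambda }_2-\textrm{G}_2)+(\textrm{G}+\textrm{G}_2-\textrm{G}_1)=\textrm{G} . \end{aligned}$$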

We denote as

$$\begin{aligned} \textrm{H}_{{{rps}}_{\pi }}{} & {} :=-\frac{\textrm{m}_1^3\textrm{M}_1^2}{2{\Lambda }_1^2}-\frac{\textrm{m}_2^3\textrm{M}_2^2}{2{\Lambda }_2^2}+{\mu }\, \Big ( -\frac{m_1 m_2}{|x_{{{rps}}_{\pi }}^{(1)}-x_{{{rps}}_{\pi }}^{(2)}|}+ \frac{y_{{{rps}}_{\pi }}^{(1)}\cdot y_{{{rps}}_{\pi }}^{(2)}}{m_0} \Big )\nonumber \\{} & {} =:\textrm{h}_\textrm{k}({\Lambda })+{\mu }f_{{rps}_{\pi }}({\Lambda },{\lambda }, z)\qquad z\,{:}{=}\,(\eta ,\xi ,p,q)\end{aligned}$$
(13)

the Hamiltonian (5) written in rps\({}_{\pi }\) coordinates, and focus on it.

We note that the manifold \(\mathcal{M}_\pi \) in (7) is now given by

$$\begin{aligned} \mathcal{M}_\pi =\big \{{{{rps}}_{\pi }}:\ z=0\big \} . \end{aligned}$$

Then we consider a neighborhood of \(\mathcal{M}_\pi \) of the form

$$\begin{aligned}\mathcal{M}_{{rps}_\pi , \varepsilon _0}\,{:}{=}\,\mathcal{L}\times {{\mathbb {T}}}^2\times B^{6}_{\varepsilon _0}(0)\ ,\end{aligned}$$

where \(B^6_{\varepsilon _0}\) is the 6-ball centered at \(0\in {{\mathbb {R}}}^6\) with radius \(\varepsilon _0\); \({\mathbb T}\,{:}{=}\,{{\mathbb {R}}}/(2{\pi }{{\mathbb {Z}}})\) and \(\mathcal{L}\) is defined as

$$\begin{aligned} \mathcal{L}\,{:}{=}\,\Big \{{\Lambda }=({\Lambda }_1,{\Lambda }_2): \ {\Lambda }_-< {\Lambda }_2< {\Lambda }_+\ ,\quad k_-{\Lambda }_2< {\Lambda }_1< k_+{\Lambda }_2\Big \} .\end{aligned}$$
(14)

Here, \(0<{\Lambda }_-<{\Lambda }_+\) are arbitrarily taken (more conditions on such numbers will be specified in the course of the paper) and, for fixed positive numbers \(0<{\alpha }_-<{\alpha }_+<1\), \(k_\pm \) are constants depending on \(\alpha _\pm \) and the masses via

$$\begin{aligned} k_\pm \,{:}{=}\,\frac{m_1}{m_2}\sqrt{\frac{m_0+{\mu }m_2}{m_0+{\mu }m_1}{\alpha }_\pm }\ .\nonumber \\ \end{aligned}$$
(15)
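The constants \(k_\pm \) translate bounds on the semi-major axes ratio into bounds on \({\Lambda }_1/{\Lambda }_2\). The computation below is a sketch that assumes the usual heliocentric values \(\textrm{m}_i=\frac{m_0m_i}{m_0+{\mu }m_i}\), \(\textrm{M}_i=m_0+{\mu }m_i\) (the text above only states that \(\textrm{m}_i\), \(\textrm{M}_i\) are suitable \(\textrm{O}(\mu )\)-corrections of \(m_i\), \(m_0\), so this explicit choice is an assumption made here), together with \(\Lambda _i=\textrm{m}_i \sqrt{\textrm{M}_i a_i}\):

$$\begin{aligned} \frac{{\Lambda }_1}{{\Lambda }_2}=\frac{\textrm{m}_1}{\textrm{m}_2}\sqrt{\frac{\textrm{M}_1 a_1}{\textrm{M}_2 a_2}} =\frac{m_1}{m_2}\,\frac{m_0+{\mu }m_2}{m_0+{\mu }m_1}\sqrt{\frac{m_0+{\mu }m_1}{m_0+{\mu }m_2}\,\frac{a_1}{a_2}} =\frac{m_1}{m_2}\sqrt{\frac{m_0+{\mu }m_2}{m_0+{\mu }m_1}\,\frac{a_1}{a_2}} . \end{aligned}$$

Under this assumption, the condition \(k_-{\Lambda }_2<{\Lambda }_1<k_+{\Lambda }_2\) in (14) is equivalent to \({\alpha }_-<\frac{a_1}{a_2}<{\alpha }_+\).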

We now take \(0<\delta <1\) and assume

$$\begin{aligned} 0<\frac{m_2}{m_1}<\min \left\{ \sqrt{(1-{\delta }){\alpha }_-}\ ,\ 1-{\delta }\right\} ,\quad 0<{\mu }<{\mu }_0({\delta })\,{:}{=}\,\frac{{\delta }m_0}{m_1(1-{\delta })-m_2}\nonumber \\ \end{aligned}$$
(16)

Then we have

Proposition 2.1

[Pinzari (2018, Section III and Appendix A)] One can find \(\varepsilon _0>0\), depending only on \({\Lambda }_-\), \(\delta \), \({\alpha }_-\), \(m_1\), \(m_2\) such that the function \(\textrm{H}_{{rps}_{\pi }}\) in (13) is real-analytic for \(({\Lambda },{\lambda },\eta ,\xi ,p,q)\in \mathcal{M}_{{rps}_\pi , \varepsilon _0}\). In addition, for any \(s\in {{\mathbb {N}}}\), there exists a positive number \({\alpha }^{\#}\) such that, if \({\alpha }_+<{\alpha }^{\#}\), there exist a positive number \(\varepsilon _1<\varepsilon _0\) and a real-analytic canonical transformation

$$\begin{aligned} \phi _{{bnf}}:\quad ({\Lambda },\overline{{\lambda }}, \overline{\eta }, \overline{\xi }, \overline{p}, \overline{q})\in \mathcal{M}_{{rps}_\pi , \varepsilon _1}\rightarrow ({\Lambda }, {\lambda }, \eta , \xi , p, q) \in \mathcal{M}_{{rps}_\pi , \varepsilon _0} \end{aligned}$$

which carries \((\overline{\eta }, \overline{\xi }, \overline{p}, \overline{q})=0\) to \((\eta , \xi , p, q)=0\) for all \((\overline{{\Lambda }}, \overline{{\lambda }})\in \mathcal{L}\times {\mathbb T}^2\), such that, if

$$\begin{aligned} \textrm{H}_{{bnf}}\,{:}{=}\,\textrm{H}_{{rps}_{\pi }}\circ \phi _{{bnf}}=\textrm{h}_\textrm{k}({\Lambda })+{\mu }f_{{bnf}}({\Lambda },\overline{{\lambda }},\overline{\eta }, \overline{\xi }, \overline{p}, \overline{q}) \end{aligned}$$
(17)

then the averaged perturbing function

$$\begin{aligned} f_{{bnf}}^\textrm{av}({\Lambda },\overline{\eta }, \overline{\xi }, \overline{p}, \overline{q})\,{:}{=}\,\frac{1}{(2{\pi })^2}\int _{{{\mathbb {T}}}^2} f_{{bnf}}({\Lambda },\overline{{\lambda }},\overline{\eta }, \overline{\xi }, \overline{p}, \overline{q})d\overline{{\lambda }}_1d\overline{{\lambda }}_2 \end{aligned}$$

“is in Birkhoff Normal Form of order s”, namely:

$$\begin{aligned} f_{{bnf}}^{\textrm{av}} =C_0({\Lambda })+{\Omega }\cdot \overline{{\tau }}+\frac{1}{2}\overline{{\tau }}\cdot {\textrm{T}}({\Lambda })\overline{{\tau }}+{\mathbb 1}_{s\ge 3}\sum _{j=3}^{s}\mathcal{P} _j(\overline{{\tau }};{\Lambda })+{\textrm{O}}_{2s+1}(\overline{\eta },\overline{\xi }, \overline{p}, \overline{q};{\Lambda }) \end{aligned}$$

where \({\Omega }({\Lambda })=({\Omega }_1({\Lambda }), {\Omega }_2({\Lambda }), {\Omega }_3({\Lambda }))\); \(\mathcal{P} _j(\overline{{\tau }};{\Lambda })\) are homogeneous polynomials of degree j in \(\overline{{\tau }}\,{:}{=}\,\left( \frac{\overline{\eta }_1^2+\overline{\xi }_1^2}{2},\ \frac{\overline{\eta }_2^2+\overline{\xi }_2^2}{2}\ ,\ \frac{\overline{p}^2+\overline{q}^2}{2}\right) \) and the determinant of the \(3\times 3\) matrix \(\textrm{T}({\Lambda })\) does not identically vanish. Moreover, \(\phi _{{bnf}}\) leaves \(\textrm{G}_{{rps}_{\pi }}\) unvaried, meaning that the function

$$\begin{aligned}\overline{\textrm{G}}\,{:}{=}\,{\Lambda }_1-{\Lambda }_2-\frac{\overline{\eta }_1^2+\overline{\xi }_1^2}{2}+\frac{\overline{\eta }_2^2+\overline{\xi }_2^2}{2}+\frac{\overline{p}^2+\overline{q}^2}{2}\end{aligned}$$

is still a first integral of \(\textrm{H}_{{bnf}}\).

2.2 Hyperbolicity

The hyperbolic character appears when using the perihelia reduction (p-coordinates). This is a further set of canonical coordinates

$$\begin{aligned} \text {{p}}\,{:}{=}\,\Big (\widehat{\text {{p}}}, (\textrm{Z}, \textrm{G}, \zeta , \textrm{g})\Big )\in {{\mathbb {R}}}^{3n-2}\times {{\mathbb {T}}}^{3n-2}\times {{\mathbb {R}}}^{2}\times {{\mathbb {T}}}^{2}\end{aligned}$$
(18)

performing full reduction of the SO(3) invariance for an n-particle system and which, in addition, remains regular for co-planar motions. The \(\text {{p}}\)-coordinates were first introduced in Pinzari (2018), to which we refer for the proof of their canonical character. We remark that in (18), \(\textrm{G}\), \(\textrm{Z}\) and \(\zeta \) are the same as in jrd in (10). The coordinate \(\textrm{g}\), conjugate to \(\textrm{G}\), is not the same as in (10), but of course \((\textrm{Z}, \zeta , \textrm{g})\) are again ignorable and \(\textrm{G}\) is constant in SO(3)-invariant systems. For the three-body problem, namely \(n=2\), the 8-plet \({\widehat{\text {{p}}}}\) is given by

$$\begin{aligned} {\widehat{\text {{p}}}}\,{:}{=}\,({\Lambda }_1,{\Lambda }_2,\textrm{G}_2,\Theta , \ell _1,\ell _2, \textrm{g}_2, \vartheta ) \end{aligned}$$

with \(\Lambda _j\), \(\ell _j\), \(\textrm{G}_2\) as in (10). To define \(\Theta \), \(\textrm{g}\), \(\vartheta \) and \(\textrm{g}_2\), we assume that

$$\begin{aligned} {\nu }_1\,{:}{=}\,k^{(3)}\times \textrm{C},\qquad \textrm{n}_1\,{:}{=}\,\textrm{C}\times \textrm{P}^{(1)},\qquad {\nu }_2\,{:}{=}\,\textrm{P}^{(1)}\times \textrm{C}^{(2)}, \quad \textrm{n}_2=\textrm{C}^{(2)}\times \textrm{P}^{(2)}\end{aligned}$$
(19)

do not vanish. Note that \({\nu }_1\) in (19) is the same as in (8). We let (under the same notations as in the previous section)

$$\begin{aligned} \begin{array}{llllrrr} \begin{array}{lrrr} \Theta \,{:}{=}\,\textrm{C}\cdot \textrm{P}^{(1)}=\textrm{C}^{(2)}\cdot \textrm{P}^{(1)} \\ \end{array} \qquad \left\{ \begin{array}{lrrr} \vartheta \,{:}{=}\,{\alpha }_{\textrm{P}^{(1)}}(\textrm{n}_1, {\nu }_2)&{}\\ \textrm{g}\,{:}{=}\,{\alpha }_{\textrm{C}}({\nu }_1, \textrm{n}_1)\qquad &{}\\ \textrm{g}_2\,{:}{=}\,{\alpha }_{\textrm{C}^{(2)}}({\nu }_2, \textrm{n}_2)&{}\\ \end{array} \right. \end{array} \end{aligned}$$
(20)

We now describe the rôle of the p-coordinates in the Hamiltonian (5). We denote as

$$\begin{aligned}\textrm{H}_{{\text {{p}}}}=\textrm{h}_{\textrm{k}}({\Lambda }_1,{\Lambda }_2)+{\mu }f_{{p}}({\Lambda }_1,{\Lambda }_2,\textrm{G}_2,\Theta ;\ell _1,\ell _2,\textrm{g}_2,\vartheta ; \textrm{G})\end{aligned}$$

where

$$\begin{aligned} \textrm{h}_{\textrm{k}}({\Lambda }_1,{\Lambda }_2)=-\frac{\textrm{m}_1^3\textrm{M}_1^2}{2{\Lambda }_1^2}-\frac{\textrm{m}_2^3\textrm{M}_2^2}{2{\Lambda }_2^2},\qquad f_{{p}}=-\frac{m_1m_2}{|x_{{p}}^{(1)}-x_{{p}}^{(2)}|}+\frac{y_{{p}}^{(1)}\cdot y_{{p}}^{(2)}}{m_0} , \end{aligned}$$

the Hamiltonian (5) expressed in terms of \({{p}}\), and

$$\begin{aligned} f^\textrm{av}_{{p}}\,{:}{=}\,\frac{1}{(2{\pi })^2}\int _{[0,2{\pi }]^2}f_{{p}}d\ell _1d\ell _2\end{aligned}$$

the doubly averaged perturbing function. We look at the expansion

$$\begin{aligned} {f^\textrm{av}_{{p}}}=-\frac{m_1 m_2}{a_2}\Big (1+\alpha ^2\textrm{P}+\textrm{O}(\alpha ^3)\Big )\end{aligned}$$

where \(\alpha \,{:}{=}\,\frac{a_1}{a_2}\) is the semi-major axes ratio. We focus on the function \(\textrm{P}\). Let \(\mathcal{L}\) be as in (14), fix \(c\in (0,1)\), and put

$$\begin{aligned} \mathcal{L} _{{p}}(\textrm{G}){} & {} :=\Big \{{\Lambda }=({\Lambda }_1,{\Lambda }_2)\in \mathcal{L}:\quad {\Lambda }_1>\textrm{G}+\frac{2}{ c}\sqrt{{\alpha }_+}{\Lambda }_2\nonumber \\{} & {} \quad 5{\Lambda }_1^2\textrm{G}-(\textrm{G}+\frac{2}{ c}\sqrt{{\alpha }_+}{\Lambda }_1)^2 (4 \textrm{G}+\frac{2}{ c}\sqrt{{\alpha }_+}{\Lambda }_1)>0,\nonumber \\{} & {} \quad 5\Lambda _1^2\textrm{G}-(\textrm{G}+\Lambda _2)(4\textrm{G}+\Lambda _2)>0\nonumber \\{} & {} \quad {\Lambda }_2>\textrm{G}\ ,\quad {\Lambda }_1> 2\textrm{G}\Big \} \end{aligned}$$
(21)
$$\begin{aligned} \mathcal{G} _{{p}}({\Lambda }_1,{\Lambda }_2,\textrm{G}){} & {} :=\Big \{\textrm{G}_2:\ \max \{\frac{2}{ c}\sqrt{{\alpha }_+}{\Lambda }_2, \textrm{G}\}<\textrm{G}_2<\Lambda _2 \Big \} \nonumber \\ \mathcal{B} _{{p}}(\textrm{G}){} & {} :=\Big \{(\Theta ,\vartheta ): \ |\Theta |< \frac{\textrm{G}}{2} ,\ |\vartheta |< \frac{{\pi }}{2}\Big \} \end{aligned}$$
(22)

and finally

$$\begin{aligned} \mathcal{A} _{{p}}(\textrm{G})\,:=\,\Big \{({\Lambda }_1, {\Lambda }_2, \textrm{G}_2):\ ({\Lambda }_1, {\Lambda }_2)\in \mathcal{L} _{{p}}(\textrm{G}) ,\ \textrm{G}_2\in \mathcal{G}_{{p}}({\Lambda }_1, {\Lambda }_2, \textrm{G})\Big \} . \end{aligned}$$

Moreover, we let

$$\begin{aligned} \begin{array}{llll} \mathcal{N}(\textrm{G})\,{:}{=}\,\mathcal{A} _{{p}}(\textrm{G})\times {{\mathbb {T}}}^3\times \mathcal{B} _{{p}}(\textrm{G}) ,\quad \mathcal{N}_{0}(\textrm{G})\,{:}{=}\, \mathcal{A} _{{p}}(\textrm{G})\times {{\mathbb {T}}}^3\times \big \{0 ,0\big \}. \end{array} \end{aligned}$$
(23)

Note that phase points in \(\mathcal{N}_{0}\) have the geometrical meaning of co-planar motions with the outer planet in retrograde motion.

Proposition 2.2

(Pinzari 2018, Section IV) The 4 degrees of freedom Hamiltonian \(\textrm{H}_{{p}}\) is real-analytic in \( \mathcal{N} \). It has an equilibrium on \(\mathcal{N}_{0}\). This equilibrium turns out to be hyperbolic (Footnote 15) for \(\textrm{P}\).

2.3 Existence and Co-Existence of two Families of Tori

Theorems 1.1 and 1.2 can now be used to prove the existence of both full-dimensional and whiskered, co-dimension 2 tori in the three-body problem. Indeed,

  • Under conditions (1), by Theorem 1.1, an invariant (Footnote 16) set \(\mathcal{F}\subset \mathcal{M}_{\varepsilon } \) for the Hamiltonian \(\textrm{H}_{{rps}_{\pi }}\) with 5-dimensional frequencies is found, whose measure satisfies

    $$\begin{aligned} {\, \mathrm meas}\mathcal{M}_{\varepsilon }>{\, \mathrm meas}\mathcal{F}> \Big (1- C_* {\varepsilon }^{\frac{1}{2}+\overline{s}}\Big ) {\, \mathrm meas}\mathcal{M}_{\varepsilon } \end{aligned}$$
    (25)

    where \(\overline{s}=s-2\).

  • Under conditions (3) with \(a=\alpha _+\), by Theorem 1.2, for any \(\textrm{G}\in {{\mathbb {R}}}_+\), one finds an invariant set \(\mathcal{H}(\textrm{G})\subset \mathcal{N}_{0}(\textrm{G})\) with 3-dimensional frequencies for \(\textrm{H}_{{p}}\) and equipped with 4-dimensional stable and unstable manifolds (Footnote 17), whose measure satisfies

    $$\begin{aligned} {\, \mathrm meas}\mathcal{N}_{0}(\textrm{G})>{\, \mathrm meas}\mathcal{H}(\textrm{G})> \Big (1- C_* \sqrt{ \alpha _+ }\Big ) {\, \mathrm meas}\mathcal{N}_{0}(\textrm{G})\ . \end{aligned}$$
    (26)

Next, we show that the invariant sets \(\mathcal{F}\) and \(\mathcal{H}(\textrm{G})\) constructed above “have a common domain of existence”. We have to make this assertion more precise, mainly because \(\mathcal{F}\) and \(\mathcal{H}(\textrm{G})\) have been constructed with different formalisms.

Let

$$\begin{aligned} \phi _{{rps}_{\pi }}^{{p}}:\ {{rps}_{\pi }}\rightarrow {p}\end{aligned}$$
(27)

be the canonical change of coordinates between \({rps}_{\pi }\) and \({p}\), well defined on a full-measure set.

Let \({{\mathbb {G}}}_0\), \({{\mathbb {G}}}_*\) be the respective images under the function (12):

$$\begin{aligned} {{\mathbb {G}}}_0\,{:}{=}\,\textrm{G}_{{rps}_{\pi }}\left( \mathcal{M}_{\varepsilon }\right) ,\qquad {{\mathbb {G}}}_*\,{:}{=}\,\textrm{G}_{{rps}_{\pi }}\left( \mathcal{F}\right) \end{aligned}$$

of the sets \(\mathcal{M}_{\varepsilon }\), \(\mathcal{F}\). Since \(\mathcal{F}\subset \mathcal{M}_{\varepsilon }\), we have \({\mathbb G}_*\subset {{\mathbb {G}}}_0\). For any \(\textrm{G}_0\in {{\mathbb {G}}}_0\), \(\textrm{G}_*\in {{\mathbb {G}}}_*\), let

$$\begin{aligned} \mathcal{M}_{\varepsilon }(\textrm{G}_0)\,{:}{=}\,\mathcal{M}_{\varepsilon }\cap \{\textrm{G}_{{rps}_{\pi }}=\textrm{G}_0\} ,\qquad \mathcal{F}(\textrm{G}_*)\,{:}{=}\,\mathcal{F}\cap \{\textrm{G}_{{rps}_{\pi }}=\textrm{G}_*\} \end{aligned}$$

\(\mathcal{M}_{\varepsilon }(\textrm{G}_0)\) and \(\mathcal{F}(\textrm{G}_*)\) are invariant sets because \(\textrm{G}_{{rps}_{\pi }}\) is conserved along the motions of \(\textrm{H}_{{rps}}\).

Define:

$$\begin{aligned} \mathcal{M}'_{\varepsilon }(\textrm{G}_0)\,{:}{=}\,\phi _{{rps}_{\pi }}^{{p}}\left( \mathcal{M}_{\varepsilon }(\textrm{G}_0)\right) ,\qquad \mathcal{F}'(\textrm{G}_*)\,{:}{=}\,\phi _{{rps}_{\pi }}^{{p}}\left( \mathcal{F}(\textrm{G}_*)\right) . \end{aligned}$$

At the cost of eliminating zero-measure sets from \({{\mathbb {G}}}_0\), \({{\mathbb {G}}}_*\), the sets \(\mathcal{F}'(\textrm{G}_*)\), \(\mathcal{M}'_{\varepsilon }(\textrm{G}_0) \) are well-defined, for all \(\textrm{G}_0\in {{\mathbb {G}}}_0\), \(\textrm{G}_*\in {{\mathbb {G}}}_*\). Then split

$$\begin{aligned} \mathcal{M}'_{\varepsilon }(\textrm{G}_0)= \widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_0)\times \{\textrm{G}=\textrm{G}_0 ,\ \textrm{g}\in {{\mathbb {T}}}\}\qquad \mathcal{F}'(\textrm{G}_*)=\widehat{\mathcal{F}}'(\textrm{G}_*)\times \{\textrm{G}=\textrm{G}_* ,\ \textrm{g}\in {{\mathbb {T}}}\} \end{aligned}$$

The volume-preserving property of \(\phi _{{rps}_{\pi }}^{{p}}\) in (27), the monotonicity of the Lebesgue integral and the bounds in (25) guarantee that

$$\begin{aligned} {\, \mathrm meas}\widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)>{\, \mathrm meas}\widehat{\mathcal{F}}'(\textrm{G}_*)> \Big (1- C_1 {\varepsilon }^{\frac{1}{2}+\overline{s}}\Big ) {\, \mathrm meas}\widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\qquad \forall \ \textrm{G}_*\in {{\mathbb {G}}}_* . \end{aligned}$$
(28)

with some \(C_1>0\).

Recall now the definition of \(\mathcal{N}(\textrm{G})\), \(\mathcal{N}_{0}(\textrm{G})\) in (23) and \(\mathcal{H}(\textrm{G})\) in (26). The main result of the paper is the following

Theorem 2.1

Let \({\sigma }>0\) be a half-integer. There exist \(\varepsilon _*\), \(c_0\in (0, 1)\) such that, if \(\varepsilon <\varepsilon _*\), \(\textrm{G}_*\in {\mathbb G}_*\), \(\textrm{G}_*>c_0^{-1}\varepsilon ^2\), \(\alpha _+\le c_0 \varepsilon ^{12}\) and \(\mu \) verifies (1), (3) with \(a=\alpha _+\) and \(s=\sigma +\frac{7}{2}\), then there exists a non-empty set \(\mathcal{A}_{\star }(\textrm{G}_*) \) such that, letting

$$\begin{aligned} \mathcal{Q}(\textrm{G}_*)\,{:}{=}\,\mathcal{A}_{\star }(\textrm{G}_*)\times {{\mathbb {T}}}^3\times \mathcal{B} _1(\textrm{G}_*, \varepsilon ) ,\qquad \mathcal{Q}_0(\textrm{G}_*)\,{:}{=}\,\mathcal{A}_{\star }(\textrm{G}_*)\times {{\mathbb {T}}}^3\times \{(0, 0)\} \end{aligned}$$

and denoting \(\widehat{\mathcal{F}}_*'(\textrm{G}_*)\), \(\widehat{\mathcal{H}}_*(\textrm{G}_*)\) the respective intersections of \(\widehat{\mathcal{F}}'(\textrm{G}_*)\), \(\widehat{\mathcal{H}}(\textrm{G}_*)\) with \(\mathcal{Q}(\textrm{G}_*) \), \(\mathcal{Q}_0(\textrm{G}_*) \) then \(\widehat{\mathcal{F}}_*'(\textrm{G}_*)\), \(\widehat{\mathcal{H}}_*(\textrm{G}_*)\) are non-empty and in fact verify

$$\begin{aligned}{} & {} {\, \mathrm meas}\mathcal{Q}(\textrm{G}_*)\ge {\, \mathrm meas}\widehat{\mathcal{F}}_*'(\textrm{G}_*)\ge \left( 1-\frac{\varepsilon ^{\sigma }}{\varepsilon _*^{{\sigma }}}\right) {\, \mathrm meas}\mathcal{Q}(\textrm{G}_*)\end{aligned}$$
(29)
$$\begin{aligned}{} & {} {\, \mathrm meas}\mathcal{Q}_{0}(\textrm{G}_*)\ge {\, \mathrm meas}\widehat{\mathcal{H}}_*(\textrm{G}_*)\ge \left( 1-\frac{\alpha _+}{c_0\varepsilon ^{12}}\right) {\, \mathrm meas}\mathcal{Q}_{0}(\textrm{G}_*) .\end{aligned}$$
(30)

The proof of Theorem 2.1 relies on some technical results (Propositions 2.3, 2.4 and 2.5), which we state now and prove later.

Proposition 2.3

Let, for a suitable pure number \({\underline{k}}\in (1, 2)\), \(\Lambda _-<\textrm{G}\), \(k_-\le {\underline{k}}\), \(k_+\ge 2 \), \(\alpha _+\le \frac{c^2}{16}\). Choose \({\Lambda }_+\) as the unique value of \({\Lambda }_2> \textrm{G}\) such that \(\mathcal{C} \) and the straight line \({\Lambda }_1=2 {\Lambda }_2\) meet at \(({\Lambda }_1, {\Lambda }_2)=(2{\Lambda }_+, {\Lambda }_+)\). Let

$$\begin{aligned}{} & {} \mathcal{L}_0(\textrm{G})\,{:}{=}\,\Bigg \{({\Lambda }_1, {\Lambda }_2):\quad \textrm{G}\le {\Lambda }_2\le {\Lambda }_+\ ,\quad (\textrm{G}+{\Lambda }_2)\sqrt{\frac{4\textrm{G}+{\Lambda }_2}{5\textrm{G}} }<{\Lambda }_1< \min \{k_+\,{\Lambda }_2 ,\ 2{\Lambda }_+\}\Bigg \} \\{} & {} \mathcal{A} _{0}(\textrm{G})\,{:}{=}\,\Big \{({\Lambda }_1, {\Lambda }_2, \textrm{G}_2):\ ({\Lambda }_1, {\Lambda }_2)\in \mathcal{L} _{0}(\textrm{G}) ,\ \textrm{G}_2\in \mathcal{G}_{{p}}({\Lambda }_1, {\Lambda }_2)\Big \} \end{aligned}$$

Then, the set

$$\begin{aligned} \begin{array}{llll} \mathcal{N}_{0}(\textrm{G})\,{:}{=}\,\mathcal{A} _{0}(\textrm{G})\times {{\mathbb {T}}}^3\times \mathcal{B} _{{p}}(\textrm{G}) \end{array} \end{aligned}$$

is a subset of \(\mathcal{N}(\textrm{G})\).

Proposition 2.4

There exists \(c_1\in (0, 1)\) depending only on \({\Lambda }_+/\textrm{G}\), \({\Lambda }_-/\textrm{G}\) such that, letting, for any \({\gamma }< c_1^2\varepsilon ^2\),

$$\begin{aligned}{} & {} \mathcal{L} _{1}(\textrm{G})\,{:}{=}\,\Big \{({\Lambda }_1, {\Lambda }_2)\in \mathcal{L}\ ,\ |{\Lambda }_1-{\Lambda }_2-\textrm{G}|<c^2_1\varepsilon ^2\Big \} \\{} & {} \mathcal{G} _{1}({\Lambda }_2)\,{:}{=}\,\Big \{\textrm{G}_2:\ {\Lambda }_2-c^2_1\varepsilon ^2<\textrm{G}_2<{\Lambda }_2-{\gamma }\Big \} \\{} & {} \mathcal{A} _{1}(\textrm{G})\,{:}{=}\,\Big \{({\Lambda }_1, {\Lambda }_2, \textrm{G}_2):\quad ({\Lambda }_1, {\Lambda }_2)\in \mathcal{L} _{1}\ ,\quad \textrm{G}_2\in \mathcal{G} _{1}({\Lambda }_2)\Big \} \\{} & {} \mathcal{B} _{1}(\textrm{G}, \varepsilon )\,{:}{=}\,\Big \{(\Theta , \vartheta ):\quad \Theta ^2<c^2_1\textrm{G}\varepsilon ^2 ,\ \vartheta ^2<c^2_1\frac{\varepsilon ^2}{\textrm{G}}\Big \} \end{aligned}$$

then the set

$$\begin{aligned} \begin{array}{llll} \mathcal{N}_{1}(\textrm{G}, \varepsilon )\,{:}{=}\,\mathcal{A} _{1}(\textrm{G})\times {\mathbb T}^3\times \mathcal{B} _{1}(\textrm{G}, \varepsilon ) \end{array} \end{aligned}$$

is a subset of \( \widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G})\).

Proposition 2.5

Assume \(\textrm{G}\ge 10c_1^2\varepsilon ^2\) and \({{\alpha }_+}<\frac{c^2}{16}\). Then, \(\mathcal{A}_0(\textrm{G})\) and \(\mathcal{A}_1(\textrm{G})\) have a non-empty intersection \(\mathcal{A}_\star (\textrm{G})\), verifying

$$\begin{aligned} {\, \mathrm meas}(\mathcal{A} _\star (\textrm{G}))\ge \frac{9}{10}(c_1^2\varepsilon ^2-{\gamma })c_1^4\varepsilon ^4 \end{aligned}$$

We now show how Theorem 2.1 follows from the above propositions. \(\mathcal{Q}(\textrm{G}_*)\) is a subset of \(\widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\) and \(\mathcal{N}_{\varepsilon }(\textrm{G}_*)\), and

$$\begin{aligned} {\, \mathrm meas}\mathcal{Q}(\textrm{G}_*)= C_1 \varepsilon ^8 =C_2\varepsilon ^2 {\, \mathrm meas}\widehat{\mathcal{M}}'_{\varepsilon } . \end{aligned}$$

The bound in (28) guarantees that

$$\begin{aligned} {\, \mathrm meas}\left( \widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\setminus \widehat{\mathcal{F}}'(\textrm{G}_*)\right) <C_3 {\varepsilon }^{\frac{1}{2}+\overline{s}} {\, \mathrm meas}\widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\qquad \forall \ \textrm{G}_*\in {{\mathbb {G}}}_* . \end{aligned}$$

On the other hand, if \(\widehat{\mathcal{F}}'(\textrm{G}_*)\cap \mathcal{Q}(\textrm{G}_*) \) were empty, we would have

$$\begin{aligned} {\, \mathrm meas}\left( \widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\setminus \widehat{\mathcal{F}}'(\textrm{G}_*)\right) \ge {\, \mathrm meas}\mathcal{Q}(\textrm{G}_*)= C_2\varepsilon ^2 {\, \mathrm meas}\widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*) \end{aligned}$$

which contradicts the previous inequality if \(\overline{s}> \frac{3}{2}\) and \(\varepsilon \) is small. Finally, if \(\textrm{G}_*\in {{\mathbb {G}}}_*\),

$$\begin{aligned}{} & {} {\, \mathrm meas}\left( \mathcal{Q}(\textrm{G}_*)\setminus \widehat{\mathcal{F}}'(\textrm{G}_*)\right) \le {\, \mathrm meas}\left( \widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\setminus \widehat{\mathcal{F}}'(\textrm{G}_*)\right) <C_3 {\varepsilon }^{\frac{1}{2}+\overline{s}} {\, \mathrm meas}\widehat{\mathcal{M}}'_{\varepsilon }(\textrm{G}_*)\\{} & {} \quad =C_4 {\varepsilon }^{\overline{s}-\frac{3}{2}} {\, \mathrm meas}\mathcal{Q}(\textrm{G}_*) \end{aligned}$$

and we have (29) with \({\sigma }=\overline{s}-\frac{3}{2}=s-\frac{7}{2}\), with \(s\ge 4\). The proof of (30) is similar.

2.4 Proof of Propositions 2.3, 2.4 and 2.5

Proof of Proposition 2.3

We only need to prove that \(\mathcal{L}_0(\textrm{G})\subset \mathcal{L}_{{p}}(\textrm{G})\). We switch to the coordinates

$$\begin{aligned} y\,{:}{=}\,\frac{{\Lambda }_1}{\textrm{G}} ,\qquad x\,{:}{=}\,\frac{{\Lambda }_2}{\textrm{G}} . \end{aligned}$$

We denote as \(\mathcal{X}_{{p}}\,{:}{=}\,\textrm{G}^{-1}\mathcal{L} _{{p}}\) the domain of \((y, x)\), and as

$$\begin{aligned} x_-\,{:}{=}\,\frac{\Lambda _-}{\textrm{G}} ,\ \quad x_+\,{:}{=}\,\frac{\Lambda _+}{\textrm{G}} \end{aligned}$$

\(\mathcal{X}_{{p}}\) can be written as the intersection of the three sets:

$$\begin{aligned} \mathcal{X}_1{} & {} :=\Bigg \{(y, x):\quad 1\le x\le x_+\ ,\quad y>2 ,\ \max \{k_-\,x, (1+x)\sqrt{\frac{4 +x}{5} }\}<y< k_+\,x\Bigg \}\\ \mathcal{X}_2{} & {} :=\Bigg \{(y, x):\quad 1\le x\le x_+ \ ,\quad y>1+\frac{2}{ c}\sqrt{{\alpha }_+}x \Bigg \}\\ \mathcal{X}_3{} & {} :=\Bigg \{(y, x):\quad 1\le x\le x_+ ,\quad y>2\ ,\quad 5y^2 -(1+\frac{2}{ c}\sqrt{{\alpha }_+}y)^2 (4 +\frac{2}{ c}\sqrt{{\alpha }_+}y)>0\Bigg \} \end{aligned}$$

We prove that \(\mathcal{X}_0\,{:}{=}\,\textrm{G}^{-1}\mathcal{L} _0\) is a subset of all of them. The curve

$$\begin{aligned} \mathcal{C}:\qquad y=(1+x)\sqrt{\frac{4 +x}{5}} \qquad x\ge 1 \end{aligned}$$

passes through \(P_0=(1, 2)\). We denote as \({\underline{k}}\) the slope of the straight line \(y=kx\) which is tangent to \(\mathcal{C}\). The slope of the straight line \(y=kx\) through \(P_0\) is obviously \({\overline{k}}=2\). We assume that

$$\begin{aligned} k_-\le {\underline{k}} ,\qquad k_+\ge {\overline{k}} \end{aligned}$$

and choose \((x_+, y_+)\) as the only \((x, y)\) with \(x>1\) such that \(\mathcal{C} \) meets \(y=2 x\) at \((x, y)\). Under such assumptions, we have:

$$\begin{aligned} \mathcal{X}_1=\Bigg \{(y, x):\quad 1\le x\le x_+\ ,\quad (1+x)\sqrt{\frac{4 +x}{5} }<y< k_+\,x\Bigg \}\supset \mathcal{X}_0 \end{aligned}$$

The straight line which is tangent to \(\mathcal{C} \) at \(P_0=(1, 2)\) has equation

$$\begin{aligned} y=\frac{6}{5}x+\frac{4}{5} \end{aligned}$$

Since \({\alpha }_+<\frac{{c^2}}{4}\), \(x\ge 1\) and \(\mathcal{C} \) is convex, we have

$$\begin{aligned} 1+\frac{2}{ c}\sqrt{{\alpha }_+}x\le 1+x \le \frac{6}{5}x+\frac{4}{5}\le (1+x)\sqrt{\frac{4 +x}{5} } \end{aligned}$$

This shows that \(\mathcal{X}_2\supset \mathcal{X}_0\). As for \(\mathcal{X}_3\), we note that for

$$\begin{aligned} \alpha _+\le \frac{c^2}{16} \end{aligned}$$

we have

$$\begin{aligned}{} & {} 5y^2 -(1+\frac{2}{ c}\sqrt{{\alpha }_+}y)^2 (4 +\frac{2}{ c}\sqrt{{\alpha }_+}y)\ge 5y^2 -\left( 1+\frac{y}{ 2}\right) ^2 \left( 4 +\frac{y}{ 2}\right) \\{} & {} \quad =\frac{1}{8}(y-2)(y-y_-)(y_+-y) . \end{aligned}$$

with

$$\begin{aligned} y_\pm =13\pm \sqrt{185} . \end{aligned}$$

As \(y_-<0\) and \((y-2)(y_+-y)\ge 0\) on \(\mathcal{X}_0\), we have that \(\mathcal{X}_3\supset \mathcal{X}_0\) (Figs. 1, 2). \(\square \)
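The elementary facts used in the proof above can be checked numerically. The following is a minimal sketch, using only the Python standard library; the sampling grids and the tolerance are illustrative choices and are not part of the argument. It tests that the cubic bound vanishes at \(y=2\) and \(y=y_\pm \), that it is positive for \(2<y<y_+\), and that the chain \(1+x\le \frac{6}{5}x+\frac{4}{5}\le (1+x)\sqrt{\frac{4+x}{5}}\) holds for \(1\le x\le x_+\).

```python
import math

def curve(x):
    """Curve C in the (x, y) plane: y = (1 + x) * sqrt((4 + x) / 5)."""
    return (1.0 + x) * math.sqrt((4.0 + x) / 5.0)

def cubic(y):
    """Lower bound used for X_3: 5 y^2 - (1 + y/2)^2 (4 + y/2)."""
    return 5.0 * y**2 - (1.0 + y / 2.0) ** 2 * (4.0 + y / 2.0)

y_minus = 13.0 - math.sqrt(185.0)
y_plus = 13.0 + math.sqrt(185.0)
x_plus = y_plus / 2.0  # abscissa x > 1 where C meets the line y = 2x

# Illustrative sampling grids (an assumption of this sketch).
xs = [1.0 + i * (x_plus - 1.0) / 1000 for i in range(1001)]
ys = [2.0 + i * (y_plus - 2.0) / 1000 for i in range(1, 1000)]

TOL = 1e-12  # slack for floating-point round-off at the tangency point
chain_holds = all(
    1.0 + x <= 1.2 * x + 0.8 + TOL and 1.2 * x + 0.8 <= curve(x) + TOL
    for x in xs
)
cubic_positive = all(cubic(y) > 0.0 for y in ys)

print("chain of inequalities holds on the grid:", chain_holds)
print("cubic positive on (2, y_plus):", cubic_positive)
```

Such a check does not replace the proof; it only guards against sign or coefficient slips in the algebra.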

Fig. 1 The blue curve is \(\mathcal{C}\); the orange line has slope \(k_-\), the green one has slope \(k_+\) (Mathematica)

Fig. 2 \(\mathcal{L} _0(\textrm{G})\) (blue) and \(\mathcal{L} _1(\textrm{G})\) (green)

Remark 2.1

The numbers \({\underline{k}}\), \({\Lambda }_+\) of Proposition 2.3 can be chosen as

$$\begin{aligned} {\Lambda }_+=\frac{\textrm{G}}{2}\left( 13+ \sqrt{185}\right) ,\quad {\underline{k}}=\frac{1}{4}\sqrt{\frac{3}{10}(69+11\sqrt{33})}\sim 1.57 \end{aligned}$$

\({\Lambda }_+\) is related to the number \(x_+\) computed along the proof via \({\Lambda }_+=x_+\textrm{G}\). \({\underline{k}}\) is defined as the slope of the straight line \(y= kx \) which is tangent to \(\mathcal{C} \). We can compute it by eliminating y between the two equations; we obtain the cubic equation

$$\begin{aligned} x^3+(6-5 k^2)x^2+9x+4=0.\end{aligned}$$
(31)

The tangency condition is imposed by identifying this equation with

$$\begin{aligned} (x-a)^2(x-b)=0 \end{aligned}$$
(32)

where a is the abscissa of the tangency point. Equating the respective coefficients of (31) and (32), we obtain

$$\begin{aligned} \left\{ \begin{array}{l} -(b+2a)=6-5 k^2\\ 2ab+ a^2=9\\ -a^2b=4 \end{array}\right. \end{aligned}$$
(33)

Eliminating b through the second and the third equations, we obtain

$$\begin{aligned} a^3-9 a-8=0 \end{aligned}$$

which has the following three roots:

$$\begin{aligned} a_0=-1,\qquad a_{\pm }=\frac{1\pm \sqrt{33}}{2}. \end{aligned}$$

The only admissible value is then

$$\begin{aligned} a=a_+=\frac{1+\sqrt{33}}{2}\ . \end{aligned}$$

For this value of a, solving the system in (33), we find

$$\begin{aligned} b=\frac{-17+\sqrt{33}}{32},\qquad k=\frac{1}{4}\sqrt{\frac{3}{10}(69+11\sqrt{33})}={\underline{k}} . \end{aligned}$$

\(\square \)
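The values of a, b and \({\underline{k}}\) found in Remark 2.1 can be cross-checked numerically. The sketch below, written with the Python standard library only, recomputes them from the system (33) and tests the tangency directly: the line \(y={\underline{k}}x\) and the curve \(\mathcal{C}\) should share both value and slope at \(x=a\). The helper names `curve` and `curve_slope` are introduced here for illustration.

```python
import math

def curve(x):
    """Curve C: y = (1 + x) * sqrt((4 + x) / 5)."""
    return (1.0 + x) * math.sqrt((4.0 + x) / 5.0)

def curve_slope(x):
    """Derivative of C, computed by the product rule."""
    g = math.sqrt((4.0 + x) / 5.0)
    return g + (1.0 + x) / (10.0 * g)

sqrt33 = math.sqrt(33.0)
a = (1.0 + sqrt33) / 2.0                   # admissible root of a^3 - 9a - 8 = 0
b = -4.0 / a**2                            # third equation of (33)
k = math.sqrt((6.0 + b + 2.0 * a) / 5.0)   # first equation of (33)

# Closed forms stated in the remark.
b_closed = (-17.0 + sqrt33) / 32.0
k_closed = 0.25 * math.sqrt(0.3 * (69.0 + 11.0 * sqrt33))

print("a =", a)
print("b =", b, " closed form:", b_closed)
print("k =", k, " closed form:", k_closed)
print("value mismatch at x = a:", curve(a) - k * a)
print("slope mismatch at x = a:", curve_slope(a) - k)
```

Both mismatches should be at round-off level if the tangency point and the slope are as stated.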

Proof of Proposition 2.4

From (11), we get

$$\begin{aligned} |z|^2=\eta _1^2+\xi _1^2+\eta _2^2+\xi _2^2+p^2+q^2=2(\textrm{G}+\textrm{G}_2-\textrm{G}_1)+2({\Lambda }_1-\textrm{G}_1)+2({\Lambda }_2-\textrm{G}_2) . \end{aligned}$$

From the equality

$$\begin{aligned} \textrm{G}_1= & {} \sqrt{\textrm{G}^2+\textrm{G}_2^2-2\Theta ^2+2\sqrt{\textrm{G}^2-\Theta ^2}\sqrt{\textrm{G}_2^2-\Theta ^2}\cos \vartheta }\\= & {} \textrm{G}+\textrm{G}_2+\textrm{O}\left( \frac{\Theta ^2}{\textrm{G}+\textrm{G}_2}\right) +\textrm{O}\left( \frac{\Theta ^2\textrm{G}_2}{\textrm{G}(\textrm{G}+\textrm{G}_2)}\right) +\textrm{O}\left( \frac{\Theta ^2\textrm{G}}{\textrm{G}_2(\textrm{G}+\textrm{G}_2)}\right) +\textrm{O}\left( \frac{\vartheta ^2\textrm{G}\textrm{G}_2}{\textrm{G}+\textrm{G}_2} \right) \end{aligned}$$

and the definition of \(\mathcal{N}_1(\textrm{G})\), the assertion trivially follows. \(\square \)
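As a sanity check of the expansion of \(\textrm{G}_1\) used above, the following sketch evaluates the exact square-root formula for small \(\Theta \) and \(\vartheta \) and compares the deviation from \(\textrm{G}+\textrm{G}_2\) with its leading-order value. The leading coefficients \(-\Theta ^2\frac{\textrm{G}+\textrm{G}_2}{2\textrm{G}\textrm{G}_2}\) and \(-\vartheta ^2\frac{\textrm{G}\textrm{G}_2}{2(\textrm{G}+\textrm{G}_2)}\) are obtained here by a Taylor expansion and are an assumption of this sketch (the text only states the orders of magnitude); the sample values of \(\textrm{G}\), \(\textrm{G}_2\) are arbitrary.

```python
import math

def g1(G, G2, Theta, vartheta):
    """Exact expression of G_1 in terms of (G, G_2, Theta, vartheta)."""
    inner = (
        G**2 + G2**2 - 2.0 * Theta**2
        + 2.0 * math.sqrt(G**2 - Theta**2) * math.sqrt(G2**2 - Theta**2)
        * math.cos(vartheta)
    )
    return math.sqrt(inner)

G, G2 = 1.0, 2.0   # arbitrary sample values
h = 1e-3           # small displacement from the co-planar equilibrium

# Deviation of G_1 from G + G_2 when only Theta, or only vartheta, is non-zero.
dev_Theta = g1(G, G2, h, 0.0) - (G + G2)
dev_vartheta = g1(G, G2, 0.0, h) - (G + G2)

# Leading-order predictions (Taylor expansion, assumption of this sketch).
pred_Theta = -h**2 * (G + G2) / (2.0 * G * G2)
pred_vartheta = -h**2 * G * G2 / (2.0 * (G + G2))

print("Theta deviation:   ", dev_Theta, " predicted:", pred_Theta)
print("vartheta deviation:", dev_vartheta, " predicted:", pred_vartheta)
```

At \(\Theta =\vartheta =0\) the formula reduces exactly to \(\textrm{G}_1=\textrm{G}+\textrm{G}_2\), and both deviations are quadratically small, in agreement with the remainder terms written in the proof.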

Proof of Proposition 2.5

Let \({\Lambda }_2^\star \) be the abscissa, in the plane \((\Lambda _2, \Lambda _1)\), of the intersection point between the curves

$$\begin{aligned} {\Lambda }_1=(\textrm{G}+{\Lambda }_2)\sqrt{\frac{4 \textrm{G}+{\Lambda }_2}{5\textrm{G}}}\ ,\qquad \textrm{and}\qquad {\Lambda }_1={\Lambda }_2+\textrm{G}+c_1^2\varepsilon ^2\ . \end{aligned}$$

We use the coordinate \(x\,{:}{=}\,\frac{{\Lambda }_2}{\textrm{G}}\). With \(x^\star \,{:}{=}\,\frac{{\Lambda }_2^\star }{\textrm{G}}\), \(\theta \,{:}{=}\,\frac{c_1^2\varepsilon ^2}{\textrm{G}}\), \(\zeta \,{:}{=}\,\frac{{\gamma }}{\textrm{G}}\), where \(\zeta <\theta \), the set \(\mathcal{A}_\star (\textrm{G})\,{:}{=}\,\mathcal{A}_0(\textrm{G})\cap \mathcal{A}_1(\textrm{G})\) has measure

$$\begin{aligned} {\, \mathrm meas}(\mathcal{A}_\star (\textrm{G}))=\textrm{G}^3\int _{1+\zeta }^{x^\star }F_1(x) F'_2(x)dx \end{aligned}$$

where

$$\begin{aligned} F_1(x)= & {} \min \Big \{ 2x,\ x+1+\theta \Big \}-(1+x)\sqrt{\frac{4+x}{5}}\\ F_2(x)= & {} \min \Big \{ \theta -\zeta ,\ x-1-\zeta , \ m x-\zeta \Big \} \end{aligned}$$

and where, for short, we have let \(m\,{:}{=}\,1-\frac{2}{ c}\sqrt{{\alpha }_+}\). Then,

$$\begin{aligned} {\, \mathrm meas}(\mathcal{A}_\star (\textrm{G}))\ge \textrm{G}^3\int _{1+\zeta }^{x^\star }F_1(x) F_2(x)dx\ . \end{aligned}$$
(34)

To go further, we need a quantitative bound on \(x^\star \). Indeed, we have

Claim 2.1

If \(0<\theta <\frac{1}{10}\), then \(1+4\theta<x^\star <1+6\theta \).

The proof of the claim is postponed below, in order not to interrupt the main proof.

Since we have assumed \(\textrm{G}\ge 10c_1^2\varepsilon ^2\) and \(\alpha _+\le \frac{c^2}{16}\), then \(\textrm{G}\ge \frac{\frac{12}{ c}\sqrt{{\alpha }_+}}{1-\frac{2}{ c}\sqrt{{\alpha }_+}}c_1^2\varepsilon ^2\). In the new variables, this is \(\theta \le \frac{m}{6(1-m)}\). But then

$$\begin{aligned} x^\star <1+6\theta \le \frac{1}{1-m}\qquad \Longrightarrow \quad x-1-\zeta \le mx-\zeta \quad \forall \ x<x^\star \end{aligned}$$

whence

$$\begin{aligned} F_2(x)=\left\{ \begin{array}{lll}x-1-\zeta \quad &{}\textrm{if}\quad &{}1+\zeta \le x\le 1+\theta \\ \\ \theta -\zeta &{}\textrm{if} &{}1+\theta <x\le x^\star \end{array} \right. \end{aligned}$$

Observe that the second inequality is well posed, because \(x^\star >1+4\theta \), as said. The function \(F_1(x)\) splits in the same intervals:

$$\begin{aligned} F_1(x)=\left\{ \begin{array}{lll}2x- (1+x)\sqrt{\frac{4+x}{5}} \quad &{}\textrm{if}\quad &{}1+\zeta \le x\le 1+\theta \\ \\ x+1+\theta -(1+x)\sqrt{\frac{4+x}{5}} &{}\textrm{if} &{}1+\theta <x\le x^\star \end{array} \right. \end{aligned}$$
(35)
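The piecewise forms of \(F_1\) and \(F_2\) can be compared numerically with their definitions as minima. The sketch below does so on a grid of \([1+\zeta , x^\star ]\), with \(x^\star \) located by bisection. The sample values \(\theta =0.05\), \(\zeta =0.01\), \(m=\frac{1}{2}\) (which corresponds to \(\alpha _+=\frac{c^2}{16}\)) and the grid size are illustrative assumptions, chosen so that \(\zeta <\theta <\frac{1}{10}\) and \(\theta \le \frac{m}{6(1-m)}\).

```python
import math

def curve(x):
    """(1 + x) * sqrt((4 + x) / 5), the lower boundary in the definition of F_1."""
    return (1.0 + x) * math.sqrt((4.0 + x) / 5.0)

def f1(x, theta):
    return min(2.0 * x, x + 1.0 + theta) - curve(x)

def f2(x, theta, zeta, m):
    return min(theta - zeta, x - 1.0 - zeta, m * x - zeta)

def f1_piecewise(x, theta):
    if x <= 1.0 + theta:
        return 2.0 * x - curve(x)
    return x + 1.0 + theta - curve(x)

def f2_piecewise(x, theta, zeta):
    if x <= 1.0 + theta:
        return x - 1.0 - zeta
    return theta - zeta

def x_star(theta, iterations=200):
    """Zero of x + 1 + theta - curve(x) in (1, 2), located by bisection."""
    lo, hi = 1.0, 2.0  # the function is positive at 1 and negative at 2 for theta <= 1/10
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mid + 1.0 + theta - curve(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

theta, zeta, m = 0.05, 0.01, 0.5   # illustrative sample values
xs_end = x_star(theta)
grid = [1.0 + zeta + i * (xs_end - 1.0 - zeta) / 2000 for i in range(2001)]

max_err_f1 = max(abs(f1(x, theta) - f1_piecewise(x, theta)) for x in grid)
max_err_f2 = max(abs(f2(x, theta, zeta, m) - f2_piecewise(x, theta, zeta)) for x in grid)
print("x_star =", xs_end)
print("max mismatch for F_1:", max_err_f1, " for F_2:", max_err_f2)
```

Both mismatches should vanish up to round-off, which reflects the fact that the branch \(mx-\zeta \) is never the smallest one on this range.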

Since \(\zeta <\theta \), a lower bound to the integral in (34) is given by

$$\begin{aligned}\int _{1+\zeta }^{x^\star }F_1(x) F_2(x)dx\ge \int _{1+\theta }^{x^\star }F_1(x) F_2(x)dx=(\theta -\zeta ) \int _{1+\theta }^{x^\star } F(x)dx\end{aligned}$$

with

$$\begin{aligned} F(x)\,{:}{=}\,x+1+\theta -(1+x)\sqrt{\frac{x+4}{5}} \end{aligned}$$
(36)

the function in the second line in (35). Since F is the difference of a linear function and a convex one, it is concave. Then, we have

$$\begin{aligned} F(x)\ge F(1)+\frac{F(x^\star )-F(1)}{x^\star -1}(x-1)\qquad \forall \ 1\le x\le x^\star \end{aligned}$$

Since \(F(x^\star )=0\) and \(F(1)=\theta \), this inequality becomes

$$\begin{aligned} F(x)\ge \frac{x^\star -x}{x^\star -1} \theta \qquad \forall \ 1\le x\le x^\star \end{aligned}$$

hence

$$\begin{aligned} \int _{1+\theta }^{x^\star }F(x)dx\ge \frac{\theta }{x^\star -1}\int _{1+\theta }^{x^\star }(x^\star -x)dx=\frac{\theta }{2} \frac{(x^\star -1-\theta )^2}{x^\star -1}\ge \frac{9}{10}\theta ^2 \end{aligned}$$

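having used the bound \(x^\star >1+4\theta \) of Claim 2.1 and the fact that \(u\rightarrow \frac{(u-\theta )^2}{u}\) is increasing for \(u>\theta \), so that \(\frac{\theta }{2}\frac{(x^\star -1-\theta )^2}{x^\star -1}\ge \frac{9}{8}\theta ^2\ge \frac{9}{10}\theta ^2\).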

It remains to prove Claim 2.1. \(x^\star \) is defined as the zero of the function F in (36) in the range \((1, +\infty )\). Multiplying the left-hand side of the equation

$$\begin{aligned} x+1+\theta -(1+x)\sqrt{\frac{x+4}{5}}=0 \end{aligned}$$

by \(x+1+\theta +(1+x)\sqrt{\frac{x+4}{5}}\), we obtain the algebraic equation of degree three

$$\begin{aligned} x^3+x^2-(1+10 \theta ) x-1-10\theta -5\theta ^2=0 \end{aligned}$$

which, for \(x\ge -1\), is completely equivalent to the initial equation. We aim to apply a bisection argument to the function on the left-hand side, which we denote as G(x). We have

$$\begin{aligned} G(1+4\theta )=\theta ( 64\theta ^2+19\theta -4 )\ ,\qquad G(1+6\theta )=\theta ( 216\theta ^2+79\theta +4 ) \end{aligned}$$

and it is immediate to check that

$$\begin{aligned} G(1+4\theta )<0\qquad G(1+6\theta )>0\qquad \forall \ 0<\theta < \frac{-19+\sqrt{1385}}{ 128 }= 0.142\ldots \end{aligned}$$

To prove uniqueness, just observe that the function \(x\in [1, +\infty )\rightarrow G(x)\) is increasing for all \(0<\theta <\frac{2}{5}\), since \(G'(x)=3x^2+2x-(1+10\theta )\ge 4-10\theta \) for \(x\ge 1\). This completes the proof. \(\square \)

3 Quantitative kam Theory

3.1 Proof of Theorem 1.1

The proof of Theorem 1.1 is based on an application of Chierchia and Pinzari (2010, Proposition 3). The method is completely analogous to the one used in the proof of Chierchia and Pinzari (2010, Theorem 1.3), so we shall only say what to change in the proof of Chierchia and Pinzari (2010, Theorem 1.3) in order to obtain the proof of Theorem 1.1. The polynomial \(N(I, r)\) in the first non-numbered formula in Chierchia and Pinzari (2010, Section 4) is to be changed as

$$\begin{aligned} N(I,r)=P_0(I) +\sum _{i=1}^{m}{\Omega }_i(I)r_i+\frac{1}{2}\sum _{i,j=1}^{m} {\beta }_{ij}(I)r_i r_j+{\mathbb 1}_{s\ge 3}\sum _{j=3}^{s}\mathcal{P} _j(r;I) . \end{aligned}$$
(37)

Equations (60) and (61) in Chierchia and Pinzari (2010) can be modified, respectively, as

$$\begin{aligned}{} & {} \sup _{B_\varepsilon ^{2m}\times V_{\rho _0}}|{\tilde{P}}_\textrm{av}|\le C\varepsilon ^{2s+1}\qquad \forall \ 0<\varepsilon<\varepsilon _0\nonumber \\{} & {} {\mu }<\frac{\varepsilon ^{2s+2}}{(\log \varepsilon ^{-1})^{2{\tau }+1}}\qquad \overline{{\gamma }}>\Big (\frac{6(2s+1)}{s_0}\Big )^{{\tau }+\frac{1}{2}}\frac{\sqrt{\mu }(\log \varepsilon ^{-1})^{{\tau }+\frac{1}{2}}}{\varepsilon ^{s+\frac{1}{2}}}\ . \end{aligned}$$
(38)

Analogously to Chierchia and Pinzari (2010), one next applies Lemma A.1 in Chierchia and Pinzari (2010), but modifying the choice of K as

$$\begin{aligned} K=\frac{6(2s+1)}{s_0}\log \varepsilon ^{-1}\end{aligned}$$
(39)

and leaving the other quantities unchanged. A bound as in Equation (62) in Chierchia and Pinzari (2010) is thus obtained, with \(H_0\) as in Chierchia and Pinzari (2010), \(N(\overline{I},\overline{r})\) as in (37), \({\mu }{\widetilde{P}}_\textrm{av}(\overline{p}, \overline{q}, \overline{I})=f_{{bnf}}(I,\overline{p},\overline{q})-N(\overline{I},\overline{r})\) uniformly bounded by \(C{\mu }\varepsilon ^{2s+1}\), by (A\({}_2\)). Due to the choice of K in (39) and the one for \(\overline{{\gamma }}\) in (38), a bound similar to the one in Equation (63) in Chierchia and Pinzari (2010) holds, with the right hand side replaced by \(\overline{C}{\mu }\varepsilon ^{2s+1}\). At this point, one follows the indications in Step 2 of the proof of Theorem 1.3 in Chierchia and Pinzari (2010). Namely, one has to repeat the procedure in Steps 5 and 6 of the proof of Theorem 1.4 [previously proved in Chierchia and Pinzari (2010)], with the following modification. The annulus \(\mathcal{A} (\varepsilon )\) in Equation (47) in Chierchia and Pinzari (2010) is to be taken as

$$\begin{aligned} \mathcal{A} (\varepsilon )=\Big \{J\in {{\mathbb {R}}}^{m}:\ \check{c}_1\varepsilon ^{s+\frac{1}{2}}<J_i<{\check{c}}_2\varepsilon ^2\ ,\quad 1\le i\le m\Big \} \end{aligned}$$

and the number \(\breve{\rho }\) in Equation (48) in Chierchia and Pinzari (2010) is to be replaced with \(\breve{\rho }\,{:}{=}\,\min \{{\check{c}}_1\varepsilon ^{s+\frac{1}{2}}/2,\ \overline{{\rho }}/ {48}\}\). The other quantities remain unchanged. In the remaining Steps 5 and 6 of the proof of Theorem 1.4 in Chierchia and Pinzari (2010) replace the number “5” appearing in all the formulae with \((2s+1)\) and \(\varepsilon ^{n_2/2}\) in Equation (56) (and the formulae below) in Chierchia and Pinzari (2010) with \(\varepsilon ^{m(s-\frac{3}{2})}\). \(\square \)

3.2 Proof of Theorem 1.2

The proof of Theorem 1.2 proceeds along the same lines as the proof of Theorem 1.1, apart from being based on a generalization (Theorem 3.1 below) of Chierchia and Pinzari (2010, Proposition 3), which we now state.

As in Chierchia and Pinzari (2010), \(\mathcal{D}_{\gamma _1, \gamma _2,{\tau }}\subset {{\mathbb {R}}}^{n}\) denotes the set of vectors \({\omega }=({\omega }_1, {\omega }_2)\in {{\mathbb {R}}}^{n_1}\times {{\mathbb {R}}}^{n_2}\) satisfying, for any \(k=(k_1,k_2)\in {{\mathbb {Z}}}^{n_1}\times {{\mathbb {Z}}}^{n_2}\setminus \{0\}\), the inequality

$$\begin{aligned} |{\omega }_1\cdot k_1+{{\omega }_2}\cdot k_2|\ge \left\{ \begin{array}{l} \displaystyle \frac{{\gamma }_1}{|{k}|^{{\tau }}}\quad \textrm{if}\quad k_1\ne 0\ ;\\ \ \\ \displaystyle \frac{{\gamma }_2}{| k_2|^{{\tau }}}\quad \textrm{if}\quad k_1= 0\ ,\quad k_2\ne 0\ . \end{array}\right. \end{aligned}$$
(40)
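The two-scale Diophantine condition (40) can only be tested numerically up to a finite cut-off, but such a test is useful to get a feeling for the roles of \(\gamma _1\), \(\gamma _2\) and \(\tau \). The following is a minimal sketch under stated assumptions: the norm \(|k|\) is taken to be the 1-norm, the integer vectors are restricted to entries in \([-k_{\max }, k_{\max }]\), and the function name `satisfies_two_scale_diophantine` and the parameter `k_max` are introduced here for illustration only.

```python
import itertools
import math

def satisfies_two_scale_diophantine(omega1, omega2, gamma1, gamma2, tau, k_max):
    """Check inequality (40) for all non-zero integer k with entries in [-k_max, k_max].

    Assumption of this sketch: |k| denotes the 1-norm. A finite cut-off can
    only falsify the condition, never prove it for all k.
    """
    n1, n2 = len(omega1), len(omega2)
    entries = range(-k_max, k_max + 1)
    for k in itertools.product(entries, repeat=n1 + n2):
        if not any(k):
            continue  # k = 0 is excluded
        k1, k2 = k[:n1], k[n1:]
        small_divisor = abs(
            sum(w * j for w, j in zip(omega1, k1))
            + sum(w * j for w, j in zip(omega2, k2))
        )
        if any(k1):
            bound = gamma1 / sum(abs(j) for j in k) ** tau
        else:
            bound = gamma2 / sum(abs(j) for j in k2) ** tau
        if small_divisor < bound:
            return False
    return True

# Illustrative frequencies: (1, sqrt(2)) is badly approximable, (1, 2) is resonant.
good = satisfies_two_scale_diophantine((1.0,), (math.sqrt(2.0),), 0.1, 0.1, 3, 20)
resonant = satisfies_two_scale_diophantine((1.0,), (2.0,), 0.1, 0.1, 3, 20)
print("(1, sqrt 2):", good, "  (1, 2):", resonant)
```

The second example fails because \(k=(2,-1)\) gives a vanishing small divisor. Taking \(\gamma _2\) smaller than \(\gamma _1\) relaxes the requirement on the combinations with \(k_1=0\), which involve only the second group of frequencies.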

Theorem 3.1

Let \({n_1}\), \({n_2}\in {{\mathbb {N}}}\), \({n}\,{:}{=}\,{n_1}+{n_2}\), \(\tau >n\), \({{\gamma }_1}\ge {{\gamma }_2}>0\), \(0<s\le \frac{\varepsilon }{\overline{\varepsilon }+\varepsilon }\), \({\rho }>0\), \(A\,{:}{=}\,{D}_{\rho }\times B^2_{\overline{\varepsilon }+\varepsilon }\), and let

$$\begin{aligned} \textrm{H} (I,\psi , p, q)=\textrm{h}(I, pq)+\textrm{f}(I,\psi , p, q) \end{aligned}$$

be real-analytic on \(A\times {{\mathbb {T}}}_{\overline{s}+s}^{n}\). Let

$$\begin{aligned} I=(I_1, I_2) ,\quad \varpi (I,pq)\,{:}{=}\,\partial _{ {(I,pq)}} \textrm{h}(I,pq)=(\omega _1(I_1, I_2, pq), \omega _2(I_1, I_2, pq), {\nu }(I_1, I_2, pq)) \end{aligned}$$

with \(\omega _k(I_1, I_2, pq)\,{:}{=}\,\partial _{I_k} \textrm{h}(I_1, I_2, pq)\), and assume that the map \(I\in D_{\rho }\rightarrow {\omega }(I,J)\) is a diffeomorphism of \(D_{\rho }\) for all \(J=pq\), with \((p,q)\in B^2_{\varepsilon }\), with non-singular Hessian matrix \(U(I,J)\,{:}{=}\,\partial _{I}^2\textrm{h}(I,J)\). Let (Footnote 18)

$$\begin{aligned} M\ge \Vert \partial \omega \Vert _A\ ,\ {\widehat{M}}\ge \Vert \partial \omega _1\Vert _A ,\ \overline{M}\ge \Vert U^{-1}\Vert _A\ ,\ E\ge \Vert \textrm{f}\Vert _{{\rho },\overline{s}+s} ,\quad \lambda \le \inf |{\, \mathrm Re\,}\nu |_A\ . \end{aligned}$$

Assume, for simplicity (Footnote 19),

$$\begin{aligned} 2\frac{s^{\tau }\gamma _2}{6^{\tau }\lambda }\le 1 .\end{aligned}$$
(41)

Define

$$\begin{aligned}{} & {} \displaystyle {\widehat{c}}\,{:}{=}\,2^7(n+1)(24)^\tau \ ,\quad {{\widetilde{c}}\,{:}{=}\,2^{6}}\\{} & {} \displaystyle K\,{:}{=}\,\frac{32}{s}\ \log _+{\left( \frac{E M^2\,L}{\gamma _1^2}\right) ^{-1}}\quad \textrm{where}\quad \log _+ a \,{:}{=}\,\max \{1,\log {a}\}\\{} & {} \displaystyle {{\widehat{{\rho }}}\,{:}{=}\,\min \left\{ \frac{\gamma _1}{2MK^{\tau +1}}\ ,\ \frac{\gamma _2}{2{\widehat{M}} K^{\tau +1}} ,\ {\rho }\right\} } ,\quad {\widetilde{\rho }}\,{:}{=}\,\min \left\{ {\widehat{\rho }} ,\ \frac{\varepsilon ^2}{s}\right\} \\ \\{} & {} \displaystyle L\,{:}{=}\,\max \ \Big \{\overline{M} ,\ M^{-1} ,\ \widehat{M}^{-1}\Big \} \\{} & {} {\widehat{E}}\,{:}{=}\,\frac{E L}{{\widehat{{\rho }}}{\widetilde{\rho }}}\ ,\qquad {{\widetilde{E}}\,{:}{=}\,\frac{E}{\lambda \varepsilon ^2}} . \end{aligned}$$

Finally, let \(\overline{M}_1\), \(\overline{M}_2\) be upper bounds on the norms of the \(n_1\times n\), \(n_2\times n\) sub-matrices of \(U^{-1}\) formed by the first \(n_1\), last \(n_2\) rows (Footnote 20). Assume that the perturbation \(\textrm{f}\) is so small that the following “KAM conditions” hold

$$\begin{aligned} {\widehat{c}}{\widehat{E}}<1\ ,\quad {\widetilde{c}}{\widetilde{E}}<1 \end{aligned}$$
(42)

Then, for any \(({\pi }, {\kappa })\in B^2_{\overline{\varepsilon }}\) and any \({\omega }_*\in {\Omega }_*({\pi }{\kappa })\,{:}{=}\,{\omega }({D}, {\pi }{\kappa })\cap \mathcal{D} _{\gamma _1, \gamma _2,\tau }\), one can find a unique real-analytic embedding

$$\begin{aligned} \phi _{{\omega }_*}:\ {{\mathbb {T}}}^{n}\times \{({\pi }, {\kappa })\}\rightarrow & {} {\, \mathrm Re\,}({D}_r)\times {{\mathbb {T}}}^{{n}}\times B^2_{\overline{\varepsilon }+r'}\nonumber \\ (\vartheta , {\pi }, {\kappa })\rightarrow & {} \Big (v(\vartheta , {\pi }, {\kappa }; \omega _*), \vartheta +u(\vartheta , {\pi }, {\kappa }; \omega _*), {\pi }+w(\vartheta , {\pi }, {\kappa }; \omega _*), {\kappa }+y(\vartheta , {\pi }, {\kappa }; \omega _*) \Big ) \nonumber \\ \end{aligned}$$
(43)

such that \(\mathcal{M} _{{\omega }_*}\,{:}{=}\,\phi _{{{\omega }_*}}({{\mathbb {T}}}^n\times B^2_{\overline{\varepsilon }})\) is a real-analytic \((n+2)\)-dimensional manifold, on which the \(\textrm{H}\)-flow is analytically conjugated to

$$\begin{aligned} (\vartheta , {\pi }, {\kappa })\in {{\mathbb {T}}}^{n}\times B^2_{\overline{\varepsilon }} \rightarrow (\vartheta +{\omega }_* t,\ {\pi }\rightarrow {\pi }e^{-{\nu }_*({\omega }_*, {\pi }{\kappa })t}, \ {\kappa }\rightarrow {\kappa }e^{{\nu }_*({\omega }_*, {\pi }{\kappa })t}) . \end{aligned}$$
(44)

In particular, the manifolds

$$\begin{aligned} \textrm{T}_{{{\omega }_*}}\,{:}{=}\,\phi _{{\omega }_*}\left( {{\mathbb {T}}}^n\times \{(0,0)\}\right) \end{aligned}$$

are real-analytic n-dimensional \(\textrm{H}\)-invariant tori embedded in \({\, \mathrm Re\,}({D}_r)\times {{\mathbb {T}}}^{n}\times B^2_{\overline{\varepsilon }}\), equipped with \((n+1)\)-dimensional manifolds

$$\begin{aligned} \mathcal{M} _\textrm{u}\,{:}{=}\,\phi _{{\omega }_*}\left( {{\mathbb {T}}}^n\times \{0\}\times B^1_{\overline{\varepsilon }}\right) \ ,\qquad \mathcal{M} _\textrm{s}\,{:}{=}\,\phi _{{\omega }_*}\left( {{\mathbb {T}}}^n\times B^1_{\overline{\varepsilon }}\times \{0\}\right) \end{aligned}$$

on which the motions respectively leave and approach \(\textrm{T}_{{{\omega }_*}}\) at an exponential rate. Let \( \textrm{T}_{\omega _*, 0}\) denote the projection of \(\textrm{T}_{\omega _*}\) on the \((I, \varphi )\)-variables, and \(\displaystyle \textrm{K}_0\,{:}{=}\,\bigcup _{{\omega }_*\in {\Omega }_*}\textrm{T}_{{\omega }_*, 0}\). Then \( \textrm{K}_0\) satisfies the following measure (Footnote 21) estimate:

$$\begin{aligned} {\, \mathrm meas}_{2n}({\, \mathrm Re\,}({D}_r)\times {{\mathbb {T}}}^{n}\setminus \textrm{K}_0)\le c_n\Big ({\, \mathrm meas}({D}\setminus {D}_{\gamma _1, \gamma _2,\tau }\times {\mathbb T}^{n})+{\, \mathrm meas}({\, \mathrm Re\,}({D}_r)\setminus {D})\times {{\mathbb {T}}}^{n}\Big ),\nonumber \\ \end{aligned}$$
(45)

where \({D}_{\gamma _1, \gamma _2,\tau }\) denotes the \({\omega }_0(\cdot , 0)\)-preimage of \(\mathcal{D} _{\gamma _1, \gamma _2,\tau }\) and \(c_n\) can be taken to be \(\displaystyle c_n=(1+(1+2^8nE)^{2n})^2\).

Finally, the following uniform estimates hold for the embedding \(\phi _{\omega _*}\):

$$\begin{aligned}{} & {} | v_1(\vartheta , {\pi }, {\kappa };{\omega }_*)-I_1^0({\pi }{\kappa }; {\omega }_*)|\le {6} {n}\left( \frac{\overline{M}_1}{\overline{M}}+\frac{{\widehat{M}}}{M}\right) {\widehat{E}}\,{\widetilde{{\rho }}}\nonumber \\{} & {} | v_2(\vartheta , {\pi }, {\kappa };{\omega }_*)-I_2^0({\pi }{\kappa };{\omega }_*)|\le {6}n\left( \frac{\overline{M}_2}{\overline{M}}+\frac{{\widehat{M}}}{M}\right) {\widehat{E}}\,{\widetilde{{\rho }}}\ ,\nonumber \\{} & {} |u(\vartheta , {\pi }, {\kappa };{\omega }_*)|\le 2\,{\widehat{E}}\,s ,\quad |w(\vartheta , {\pi }, {\kappa };{\omega }_*)|\le 2\,{\widehat{E}}\,\varepsilon \nonumber \\{} & {} |y(\vartheta , {\pi }, {\kappa };{\omega }_*)|\le 2\,{\widehat{E}}\,\varepsilon \end{aligned}$$
(46)

where \(v(\vartheta , {\pi }, {\kappa }; {\omega }_*)=(v_1(\vartheta , {\pi }, {\kappa }; {\omega }_*), v_2(\vartheta ,{\pi }, {\kappa }; {\omega }_*))\) and \(I^0({\pi }{\kappa }; {\omega }_*)=(I^0_1({\pi }{\kappa }; {\omega }_*),I^0_2({\pi }{\kappa }; {\omega }_*))\in D\) is the \({\omega }(\cdot , {\pi }{\kappa })\)-preimage of \({\omega }_*\in {\Omega }_*({\pi }{\kappa })\). Here \(r\,{:}{=}\,8 {n} {\widehat{E}} {\widetilde{{\rho }}}\) and \( r'=2{\widehat{E}}\varepsilon \).

The proof of Theorem 3.1 is deferred to Sect. 3.3. Here, we show how Theorem 1.2 follows from it.

As said, we follow the same ideas as in the proof of Theorem 3.1, which in turn follows (Chierchia and Pinzari 2010, Theorem 1.3). By \((A'_{2})\),

$$\begin{aligned} P_\textrm{av}(I, p, q)=P_0(I, pq)+P_1(I, p, q)\quad \textrm{where}\quad |P_1|\le a \Vert P_0\Vert {=}{:}\epsilon .\end{aligned}$$
(47)

At this point, proceeding as in Chierchia and Pinzari (2010, Proof of Theorem 1.3, Step 1) but with \(\epsilon ^5\) replaced by \(\epsilon \), under the conditions

$$\begin{aligned}{\mu }<\frac{{\epsilon }^{1+\eta }}{(\log ({\epsilon }^{-1}))^{2{\tau }+1}}\ ,\qquad {\overline{{\gamma }}}\ge C\Big (\frac{6}{s_0}\Big )^{{\tau }+\frac{1}{2}}\frac{\sqrt{{\mu }}(\log {{\epsilon }^{-1}})^{{\tau }+\frac{1}{2}}}{\sqrt{{\epsilon }}}\ ,\end{aligned}$$

by an application of Chierchia and Pinzari (2010, Lemma A.1), with \({\overline{K}}=\frac{6}{s_0}\log {\epsilon ^{-1}}\), \(r_p=r_q={\epsilon }_0\), \(r=4\rho ={\overline{{\rho }}}\,{:}{=}\,\min \left\{ \frac{{\overline{{\gamma }}}}{2\overline{M}{\overline{K}}^{{\tau }+1}},\ {\rho }_0\right\} \) (with \(\overline{M}\,{:}{=}\,\sup |\partial ^2_{I_1}H_0|\)), \({\rho }_p={\rho }_q={\epsilon }_0/4\), \({\sigma }=s_0/4\), \(\ell _1=n_1\), \(\ell _2=0\), \(m=n_2\), \(h=H_0\), \(g\equiv 0\), \(f={\mu }P\), \(A={\overline{D}}\,{:}{=}\,\omega _0^{-1}\mathcal{D} _{\gamma , \tau }\) (where \(\omega _0\) is as in \(A_1\) and \(\mathcal{D} _{\gamma , \tau }\) is the usual Diophantine set in \({\mathbb {R}}^n\), namely the set (40) with \(\gamma _1=\gamma _2\)), \({B}={B'}=\{0\}\), \(s=s_0\), \({\alpha }_1={\alpha }_2={\overline{{\alpha }}}=\frac{{\overline{{\gamma }}}}{2{\overline{K}}^{\tau }}\), and \({\Lambda }=\{0\}\), on the domain \(W_{\overline{v},\overline{s}}\) where \(\overline{v}=({\overline{{\rho }}}/2,{\epsilon }_0/2)\) and \(\overline{s}=s_0/2\), one finds a real-analytic and symplectic transformation \(\overline{\phi }\) which carries \(\textrm{H}\) to

$$\begin{aligned} {\overline{H}}({\overline{I}},{\overline{\varphi }},{\overline{p}},{\overline{q}}){} & {} :=H\circ \overline{\phi }({\overline{I}},{\overline{\varphi }},{\overline{p}},{\overline{q}})\\{} & {} =H_0({\overline{I}})+{\mu }P_{0}({\overline{I}}, {\overline{p}}{\overline{q}})+\mu P_1({\overline{I}},{\overline{\varphi }},{\overline{p}},{\overline{q}})+{\widetilde{P}}({\overline{I}},{\overline{\varphi }},{\overline{p}},{\overline{q}})\\{} & {} =H_0({\overline{I}})+{\mu }P_{0}({\overline{I}}, {\overline{p}}{\overline{q}})+\mu {\overline{P}}(\overline{I},{\overline{\varphi }},{\overline{p}},{\overline{q}}) \end{aligned}$$

where

$$\begin{aligned}\Vert {\widetilde{P}}\Vert _{{\overline{v}},{\overline{s}}}\le {{\overline{C}}{\mu }}\max \{\frac{{\mu }{\overline{K}}^{2{\tau }+1}}{{\overline{{\gamma }}}^2},\frac{{\mu }{\overline{K}}^{\tau }}{{\overline{{\gamma }}}}\ e^{-{\overline{K}} s_0/2}\}\le {{\overline{C}}}{\mu }\epsilon ={{\overline{C}}}{\mu }a \Vert P_0\Vert \ ,\end{aligned}$$

whence (by (47)) also \({\mu }{\overline{P}}={\mu }P_1+{\widetilde{P}}\) is bounded by \({C}{\mu }a \Vert P_0\Vert \) on \(W_{{\overline{v}},{\overline{s}} }\).
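
As a check on the bound for \({\widetilde{P}}\), here is a short worked computation. It is only a sketch, under the additional assumptions (not stated explicitly above) that \(C\ge 1\), \({\mu },\epsilon <1\) and \({\overline{K}}\ge 1\). The choice \({\overline{K}}=\frac{6}{s_0}\log {\epsilon ^{-1}}\) and the lower bound on \({\overline{{\gamma }}}\), which reads \({\overline{{\gamma }}}\ge C\,{\overline{K}}^{{\tau }+\frac{1}{2}}\sqrt{{\mu }/\epsilon }\), give

$$\begin{aligned} e^{-{\overline{K}}s_0/2}&=e^{-3\log {\epsilon ^{-1}}}=\epsilon ^3 ,\\ \frac{{\mu }{\overline{K}}^{2{\tau }+1}}{{\overline{{\gamma }}}^2}&\le \frac{{\mu }{\overline{K}}^{2{\tau }+1}\,\epsilon }{C^2\,{\mu }\,{\overline{K}}^{2{\tau }+1}}=\frac{\epsilon }{C^2} ,\\ \frac{{\mu }{\overline{K}}^{{\tau }}}{{\overline{{\gamma }}}}\,e^{-{\overline{K}}s_0/2}&\le \frac{\sqrt{{\mu }\epsilon }}{C\sqrt{{\overline{K}}}}\,\epsilon ^3\le \epsilon ^3 , \end{aligned}$$

so that, under these assumptions, both entries of the maximum are at most \(\epsilon =a\Vert P_0\Vert \).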

The next step is to apply Theorem  3.1 to the Hamiltonian \(\overline{H}\). Since we can take

$$\begin{aligned}{} & {} M=C\ ,\quad {\widehat{M}}=C{\mu }\Vert P_0\Vert \ ,\quad {\overline{M}}=C({\mu }\Vert P_0\Vert )^{-1}\ ,\quad E ={C}{\mu }a \Vert P_0\Vert \\{} & {} {\bar{M}}_1=C\ , \quad {\bar{M}}_2=C({\mu }\Vert P_0\Vert )^{-1} ,\quad \lambda =C^{-1} {\mu }\Vert P_0\Vert \end{aligned}$$

the numbers L, K, \({\widehat{\rho }}\) and \({\widetilde{\rho }}\) can be bounded, respectively, as

$$\begin{aligned} L\le C({\mu }\Vert P_0\Vert )^{-1}\ ,\qquad K\le C\log {(a /{{\gamma }_1}^2)^{-1}} \end{aligned}$$

and

$$\begin{aligned}{} & {} {\widehat{{\rho }}}\ge c\,\min \Big \{\frac{{{\gamma }_1}}{(\log {(a/{{\gamma }_1}^2)^{-1}})^{{\tau }+1}}\ ,\ \frac{{\overline{{\gamma }}_2}}{(\log {(a/{{\gamma }_1}^2)^{-1}})^{{\tau }+1}} \ , \frac{{\bar{{\gamma }}}}{(\log {{\epsilon }^{-1}})^{{\bar{{\tau }}}+1}}\ , \ {\rho }_0\Big \}\\{} & {} {\widetilde{{\rho }}}\ge c\,\min \Big \{\frac{{{\gamma }_1}}{(\log {(a/{{\gamma }_1}^2)^{-1}})^{{\tau }+1}}\ ,\ \frac{{\overline{{\gamma }}_2}}{(\log {(a/{{\gamma }_1}^2)^{-1}})^{{\tau }+1}} \ , \frac{{\bar{{\gamma }}}}{(\log {{\epsilon }^{-1}})^{{\bar{{\tau }}}+1}}\ , \ {\rho }_0 ,\ \varepsilon ^2\Big \} \end{aligned}$$

having let \(\gamma _2\,{:}{=}\,\mu \Vert P_0\Vert \overline{\gamma }_2\). Condition (41) is trivially satisfied for any \(\overline{\gamma }<1\), \(s\le 6\), while, from the bounds

$$\begin{aligned} {\widehat{c}}{\widehat{E}}\le & {} C a \max \left\{ \frac{(\log {(a/{{\gamma }_1}^2)^{-1}})^{2({\tau }+1)}}{\gamma _1^2} ,\ \frac{(\log {(a/{{\gamma }_1}^2)^{-1}})^{2({\tau }+1)}}{\overline{\gamma }_2^2} ,\ \frac{(\log {\epsilon ^{-1}})^{2({\tau }+1)}}{\overline{\gamma }^2} ,\ \frac{1}{\rho _0^2} ,\ \frac{1}{\varepsilon ^4} \right\} \ ,\ \\ {\widetilde{c}}{\widetilde{E}}\le & {} C\frac{a}{\varepsilon ^2} \end{aligned}$$

one sees that conditions (42) hold taking

$$\begin{aligned} \overline{\gamma }=\gamma _1=\overline{\gamma }_2={\widehat{C}}{\sqrt{a}} ,\quad a<{\widehat{C}}^{-1}\varepsilon ^4 \end{aligned}$$

with a suitable \({\widehat{C}}>1\). By the thesis of Theorem  3.1, we can find a set of n-dimensional invariant tori \(\mathcal{K}\subset \mathcal{P} \) whose projection \(\mathcal{K} _0\) on \(\mathcal{P} _0\) satisfies the measure estimate

$$\begin{aligned} {\, \mathrm meas}\mathcal{P} _0\ge {\, \mathrm meas}\mathcal{K} _0\ge (1- C'({\overline{{\gamma }}}+{\gamma }_1+{{\gamma }_2})){\, \mathrm meas}\mathcal{P} _0\ge (1- C\sqrt{a}){\, \mathrm meas}\mathcal{P} _0\ .\quad \end{aligned}$$
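
As a sanity check on this choice of parameters (a sketch only, under the extra assumptions \({\mu }\Vert P_0\Vert \le 1\) and \({\widehat{C}}\) large, which are not spelled out above), note that \(\gamma _1={\widehat{C}}\sqrt{a}\) gives \(a/\gamma _1^2={\widehat{C}}^{-2}\). The first and the last entries in the bound for \({\widehat{c}}{\widehat{E}}\) then become

$$\begin{aligned} C a\,\frac{(\log {(a/{{\gamma }_1}^2)^{-1}})^{2({\tau }+1)}}{\gamma _1^2}=C\,\frac{(2\log {\widehat{C}})^{2({\tau }+1)}}{{\widehat{C}}^2} ,\qquad C\,\frac{a}{\varepsilon ^4}<\frac{C}{{\widehat{C}}} , \end{aligned}$$

both of which are small for \({\widehat{C}}\) large. In the measure estimate, since \(\gamma _2={\mu }\Vert P_0\Vert \overline{\gamma }_2\),

$$\begin{aligned} C'({\overline{{\gamma }}}+{\gamma }_1+{\gamma }_2)=C'{\widehat{C}}\sqrt{a}\,(2+{\mu }\Vert P_0\Vert )\le 3\,C'{\widehat{C}}\sqrt{a} , \end{aligned}$$

so one may take \(C=3C'{\widehat{C}}\) in the last inequality.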

\(\square \)

3.3 Proof of Theorem 3.1

We fix the following notations.

  • in \({{\mathbb {R}}}^{n}\) we fix the 1-norm: \( |I|\,{:}{=}\,|I|_1\,{:}{=}\,\sum _{1\le i\le n}|I_i|\);

  • in \({{\mathbb {T}}}^{n}\) we fix the “sup-metric”: \( |\varphi |\,{:}{=}\,|\varphi |_{\infty }\,{:}{=}\,\max _{1\le i\le n}|\varphi _i|\) (mod \(2{\pi }\));

  • in \({{\mathbb {R}}}^2\) we fix the sup norm: \( |(p, q)|\,{:}{=}\,|(p, q)|_{\infty }\,{:}{=}\,\max \{|p|, |q|\}\);

  • for matrices we use the “sup-norm”: \( |{\beta }|\,{:}{=}\,|{\beta }|_{\infty }\,{:}{=}\,\max _{i,j}|{\beta }_{ij}|\);

  • we denote as \(B^n_{\varepsilon }(z_0)\) the complex ball having radius \(\varepsilon \) centered at \(z_0\in {{\mathbb {C}}}^n\). If \(z_0=0\), we simply write \(B^n_{\varepsilon }\).

  • if \(A\subset {{\mathbb {R}}}^{n}\), and \(r>0\), we denote by \(A_r\,{:}{=}\,\bigcup _{x_0\in A} B^n_r(x_0)\) the complex r-neighborhood of A (according to the prefixed norms/metrics above);

  • given \(A\subset {{\mathbb {R}}}^n\) and positive numbers r, \(\varepsilon \), s, we let

    $$\begin{aligned} v\,{:}{=}\,(r, \varepsilon )\ ,\quad U_{v}\,{:}{=}\,{A}_r \times B^2_{ \varepsilon }\ ,\quad W_{v, s}\,{:}{=}\,{U}_v \times {\mathbb T}^{n}_{ s}\end{aligned}$$
  • if f is real-analytic on a complex domain of the form \(W_{v_0, s_0}\), with \(v_0=(r_0, \varepsilon _0)\), \(r_0>r\), \(\varepsilon _0>\varepsilon \), \(s_0>s\), we denote by \(\Vert f\Vert _{v,s}\) its “sup-Taylor–Fourier norm”:

    $$\begin{aligned} \Vert f\Vert _{v, s}\,{:}{=}\,\sum _{k,{\alpha },{\beta }}\sup _{U_v}|f_{k, {\alpha },{\beta }}|e^{|k|s}\varepsilon ^{|({\alpha },{\beta })|}\end{aligned}$$
    (48)

    with \(|k|\,{:}{=}\,|k|_1\), \(|({\alpha },{\beta })|\,{:}{=}\,|{\alpha }|_1+|{\beta }|_1\), where \(f_{k, {\alpha }, {\beta }}(I)\) denotes the coefficients in the expansion

    $$\begin{aligned} f=\sum _{\begin{array}{c} (k,{\alpha },{\beta })\in \textbf{Z}^{n}\times {{\mathbb {N}}}^\ell \times {{\mathbb {N}}}^\ell \\ {{\alpha }_i\ne {\beta }_i\forall i} \end{array}}f_{k, {\alpha },{\beta }}(I)e^{ik\cdot \varphi } p^{\alpha }q^{\beta }; \end{aligned}$$
  • if f is as in the previous item, \(K>0\) and \({{\mathbb {L}}}\) is a sub-lattice of \(\textbf{Z}^n\), \(T_Kf\) and \(\P _{{\mathbb {L}}} f\) denote, respectively, the K-truncation and the \({{\mathbb {L}}}\)-projection of f:

    $$\begin{aligned} T_Kf{} & {} :=\sum _{\begin{array}{c} (k,{\alpha },{\beta })\in \textbf{Z}^{n}\times {{\mathbb {N}}}^\ell \times {{\mathbb {N}}}^\ell \\ {{\alpha }_i\ne {\beta }_i\forall i},\ |k|_1\le K \end{array}}f_{k, {\alpha },{\beta }}(I)e^{ik\cdot \varphi }p^{\alpha }q^{\beta }\ ,\quad \P _{{\mathbb {L}}} f\\{} & {} :=\sum _{\begin{array}{c} (k,{\alpha },{\beta })\in \textbf{Z}^{n}\times {{\mathbb {N}}}^\ell \times {\mathbb N}^\ell \\ {{\alpha }_i\ne {\beta }_i\forall i},\ k\in {{\mathbb {L}}} \end{array}}f_{k, {\alpha },{\beta }}(I)e^{ik\cdot \varphi }p^{\alpha }q^{\beta }\end{aligned}$$

    with \(f_{k, {\alpha },{\beta }}(I)\,{:}{=}\,f_{k, {\alpha },{\beta }}(I, 0, 0)\). We say that f is \((K, {{\mathbb {L}}})\) in normal form if \(f=\P _{{\mathbb {L}}} T_{K}f\). If \({{\mathbb {L}}}\) is strictly larger than \(\{0\}\), we say that f is in resonant normal form.
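
The norm (48), the K-truncation and the \({{\mathbb {L}}}\)-projection can be illustrated on a finite table of coefficients. The following is a minimal sketch, not part of the proof: it assumes that each coefficient has already been replaced by its supremum over \(U_v\) (a single number), it stores the indices \((k, {\alpha }, {\beta })\) as tuples, and it ignores the side condition on \(({\alpha }, {\beta })\) in the sums. The function names are hypothetical.

```python
from math import exp

def l1(v):
    """1-norm of an integer multi-index given as a tuple."""
    return sum(abs(x) for x in v)

def tf_norm(coeffs, s, eps):
    """Weighted sum of |coefficient| * e^{|k| s} * eps^{|alpha|+|beta|}, as in (48).

    coeffs maps (k, alpha, beta) to a number standing for sup_{U_v} |f_{k,alpha,beta}|.
    """
    return sum(abs(c) * exp(l1(k) * s) * eps ** (l1(a) + l1(b))
               for (k, a, b), c in coeffs.items())

def truncate(coeffs, K):
    """K-truncation T_K: keep only the Fourier modes with |k|_1 <= K."""
    return {key: c for key, c in coeffs.items() if l1(key[0]) <= K}

def project(coeffs, in_lattice):
    """L-projection: keep only the modes k for which in_lattice(k) is true."""
    return {key: c for key, c in coeffs.items() if in_lattice(key[0])}

def is_normal_form(coeffs, K, in_lattice):
    """True when f equals the L-projection of its K-truncation."""
    return project(truncate(coeffs, K), in_lattice) == coeffs

# Hypothetical sample data: n = 2 angles, one (p, q) pair.
sample = {
    ((0, 0), (1,), (0,)): 0.5,
    ((1, -2), (0,), (0,)): 0.25,
    ((3, 3), (0,), (1,)): 0.1,
}
trivial_lattice = lambda k: all(x == 0 for x in k)
averaged = project(truncate(sample, 3), trivial_lattice)
```

With the trivial lattice \(\{0\}\), the projection keeps only the \(k=0\) modes, which is the case used for g in Proposition 3.1.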

Proposition 3.1

(Partially hyperbolic averaging theory) Let \(H=h(I_1, I_2, pq)+ f(I,{\varphi },p,q) \) be a real-analytic function on \(W_{v_0, s_0}\), with \(v_0=(r_0, \varepsilon _0)\). Let K, r, s, \(\varepsilon \), \({\hat{r}}\), \({\hat{s}}\), \({\hat{\varepsilon }}\) be positive numbers, with \({\hat{r}}<r/4\), \({\hat{s}}<s/4\) and \(\hat{\varepsilon }<\varepsilon /4\). Put \({\hat{\sigma }}\,{:}{=}\,\min \left\{ {\hat{s}}, \frac{{\hat{\varepsilon }}}{\varepsilon }\right\} \). Assume there exist positive numbers \({\alpha }_1\), \({\alpha }_2\), with \(\alpha _1\ge \alpha _2\), such that, for all \(k=(k_1, k_2, k_3)\in {{\mathbb {Z}}}^{n+1}\), \(0<|k|\le K\) and for all \((I, p, q)\in U_{r, \varepsilon }\),

$$\begin{aligned} |{\omega }_1\cdot k_1+{\omega }_2\cdot k_2-\textrm{i}k_3{\nu }|\ge \left\{ \begin{array}{lll} {\alpha }_1\quad &{}\textrm{if}\quad &{}k_1\ne 0\\ {\alpha }_2 &{}\textrm{if}&{} k_1=0,\ (k_2, k_3)\ne (0, 0)\\ \end{array} \right. \end{aligned}$$
(49)

and

$$\begin{aligned} K{\widehat{{\sigma }}}\ge 8\log 2 ,\qquad \frac{2^{ 3}c_1 K \widehat{\sigma }}{{\alpha }_2 {\delta }} \Vert f\Vert _{r, s, \varepsilon }<1,\qquad {\delta }\,{:}{=}\, \min \{ {\hat{r}} {\hat{s}},\ {\hat{\varepsilon }}^2 \} \end{aligned}$$
(50)

with a suitable number \(c_1\). Then, one can find a real-analytic and symplectic transformation

$$\begin{aligned} \Phi _*:\quad W_{r_*, s_*, \varepsilon _* }\rightarrow W_{r, s, \varepsilon } \end{aligned}$$

with \(r_*=r-4{\hat{r}}\), \(s_*=s-4{\hat{s}}\), \(\varepsilon _*=\varepsilon -4{\hat{\varepsilon }}\), which conjugates H to

$$\begin{aligned} H_{*}(I, {\varphi },p,q)\,{:}{=}\,H\circ \Phi _*=\textrm{h}(I,pq)+ g(I,{\varphi },p,q)+ f_{*}(I,{\varphi },p,q), \end{aligned}$$

where g is \((K, \{0\})\) in normal form, and g, \(f_*\) verify

$$\begin{aligned}{} & {} \Vert g-\P _{0}T_Kf\Vert _{r_*, s_*, \varepsilon _*}\le \frac{8c_1\,\Vert f\Vert ^2_{r, s, \varepsilon }}{{\alpha }_2 {\delta }}\nonumber \\{} & {} \Vert f_*\Vert _{r_*, s_*, \varepsilon _*}\le e^{-K{\hat{{\sigma }}}/4}\Vert f\Vert _{r_*, s_*, \varepsilon _*} \end{aligned}$$
(51)

Finally, \(\Phi _*\) verifies

$$\begin{aligned} \max \big \{ {\alpha }_1{\hat{s}}|I_1-I_1'| ,\, {\alpha }_2{\hat{s}}|{I_2}-{I_2}'| , \, {\alpha }_2 {\hat{r}}\,|\varphi -\varphi '| , \, {\alpha }_2{\hat{\varepsilon }}\,|p-p'| , \, {\alpha }_2 {\hat{\varepsilon }}\,|q-q'| \big \}\,\le 2 { c_1E} .\nonumber \\ \end{aligned}$$
(52)

Proposition 3.1 is an extension of the Normal Form Lemma by Pöschel (1993). The extension consists in introducing the (pq) coordinates in the integrable part and in keeping the analyticity losses \({\hat{r}}\), \({\hat{s}}\) and \({\hat{\varepsilon }}\) independent of one another. This is needed in order to construct the motions (44), where the coordinates \(({\pi }, {\kappa })\) are not set to (0, 0), but take values in a small neighborhood of it. A more complete statement implying Proposition 3.1 is stated and proved in Sect. 3.4.
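
For concreteness, the two requirements in (50) can be evaluated numerically once the parameters are fixed. The sketch below is only illustrative: the constant \(c_1\) is described above merely as "a suitable number", so it is passed as a parameter, and the function name and the sample values are hypothetical.

```python
from math import log

def check_conditions_50(K, s_hat, r_hat, eps_hat, eps, alpha2, f_norm, c1):
    """Evaluate the two requirements in (50) for given parameters.

    Returns (cutoff_ok, smallness_ok, smallness), where smallness is the
    left-hand side of the second inequality in (50).
    """
    sigma_hat = min(s_hat, eps_hat / eps)      # sigma-hat of Proposition 3.1
    delta = min(r_hat * s_hat, eps_hat ** 2)   # delta as defined in (50)
    cutoff_ok = K * sigma_hat >= 8 * log(2)
    smallness = 2 ** 3 * c1 * K * sigma_hat / (alpha2 * delta) * f_norm
    return cutoff_ok, smallness < 1, smallness

# Hypothetical values, chosen only to exercise the formulas.
result = check_conditions_50(K=100, s_hat=0.1, r_hat=0.01, eps_hat=0.05,
                             eps=0.5, alpha2=1.0, f_norm=1e-8, c1=1.0)
```

Note how \({\delta }\) couples the three analyticity losses: shrinking any of \({\hat{r}}\), \({\hat{s}}\), \({\hat{\varepsilon }}\) makes the smallness requirement on \(\Vert f\Vert \) more demanding.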

Below, we let \(B\,{:}{=}\,B^2_{\overline{\varepsilon }}(0)\); therefore, \(B_\varepsilon \) will stand for \(B^2_{\overline{\varepsilon }+\varepsilon }(0)\).

Lemma 3.1

(kam Step Lemma) Under the same assumptions and notations as in Theorem 3.1, there exists a sequence of numbers \({\rho }_j\), \(\varepsilon _j\), \(s_j\); of domains

$$\begin{aligned} (W_j)_{\rho _j, \varepsilon _j, s_j}=(A_j)_{\rho _j, \varepsilon _j}\times {{\mathbb {T}}}^n_{{\overline{s}}+s_j} ,\qquad \textrm{with}\quad (A_j)_{\rho _j, \varepsilon _j}\,{:}{=}\,\bigcup _{(p_j, q_j)\in B_{\varepsilon _j}}\left( D_j(p_jq_j)\right) _{\rho _j}\times \{(p_j, q_j)\} \end{aligned}$$

and real-analytic and symplectic transformations

$$\begin{aligned} \Psi _{j+1}:\ (I_{j+1},\varphi _{j+1}, p_{j+1}, q_{j+1})\in (W_{j+1})_{\rho _{j+1}, \varepsilon _{j+1}, s_{j+1}} \ \rightarrow \ (I_{j},\varphi _{j}, p_{j}, q_{j})\in (W_j)_{\rho _j, \varepsilon _j, s_j}\nonumber \\ \end{aligned}$$
(53)

such that

$$\begin{aligned} \textrm{H}_{j+1}(I_{j+1}, \varphi _{j+1}, p_{j+1}, q_{j+1})= & {} \textrm{H}_j\circ \Psi _{j+1}(I_{j+1}, \varphi _{j+1}, p_{j+1}, q_{j+1})\\= & {} \textrm{h}_{j+1}(I_{j+1}, p_{j+1}q_{j+1})\\{} & {} +\textrm{f}_{j+1}(I_{j+1}, \varphi _{j+1}, p_{j+1}, q_{j+1}) \end{aligned}$$

and such that the following holds. Letting \(E_0\,{:}{=}\,E\), \((M_{0}, \overline{M}_{0}, \widehat{M}_{0}, L_{0})=(M, \overline{M}, {\widehat{M}}, L)\), \(s_0\,{:}{=}\,s\), \(\rho _0\,{:}{=}\,\rho \), \(\varepsilon _0\,{:}{=}\,\varepsilon \), \(\lambda _0\,{:}{=}\,\lambda \) and, given, for \(0\le j\in {{\mathbb {Z}}}\), \(E_j\), \((M_{j}, \overline{M}_{j}, {\widehat{M}}_{j}, L_{j})\), \(s_j\), \(\rho _j\), \(\varepsilon _j\), \(\lambda _j\), define

$$\begin{aligned} K_{j}{} & {} :=\frac{32}{s_{j}}\log _+\Big (\frac{E _{j}L_{j}M_{j}^2}{{{\gamma }_1}^2}\Big )^{-1} \end{aligned}$$
(54)
$$\begin{aligned} {\widehat{{\rho }}}_{j}{} & {} :=\min \left\{ \frac{{\gamma }_1}{2M_{j}K_{j}^{{\tau }+1}} ,\ \frac{{\gamma }_2}{2\widehat{M}_{j}K_{j}^{{\tau }+1}} ,\ \frac{\lambda _j}{2 M_{j}K_{j}} ,\ \frac{\lambda _j}{2 {\widehat{M}}_{j}K_{j}} ,\ {\rho }_{j}\right\} \ , \end{aligned}$$
(55)
$$\begin{aligned} {\widetilde{\rho }}_j{} & {} :=\min \left\{ {\widehat{\rho }}_j, \frac{\varepsilon ^2_j}{s_j}\right\} ,\quad \widehat{E}_{j}\,{:}{=}\,\frac{E_{j}L_{j}}{{\widehat{{\rho }}}_{j}{\widetilde{{\rho }}}_{j}} \nonumber \\ E_{j+1}{} & {} :=\frac{E _{j}^2L_{j}M_{j}^2}{{{\gamma }_1}^2}\ ,\quad (M_{j+1}, \overline{M}_{j+1}, {\widehat{M}}_{j+1}, L_{j+1})=2(M_{j}, \overline{M}_{j}, {\widehat{M}}_{j}, L_{j})\nonumber \\ \rho _{j+1}{} & {} :=\frac{{\widehat{\rho }}_j}{4} ,\ \varepsilon _{j+1}\,{:}{=}\,\frac{\varepsilon _j}{4} ,\ \lambda _{j+1}\,{:}{=}\, \lambda _j- 2^{8}\frac{E_j}{\varepsilon ^2_j} ,\quad s_{j+1}\,{:}{=}\,\frac{s_j}{4} . \end{aligned}$$
(56)

Then, for all \((p_{j+1}, q_{j+1})\in B_{\varepsilon _{j+1}}\),

  1. (i)

    \(D_{j+1}(p_{j+1}q_{j+1})\subseteq {(D_{j}}(p_{j+1}q_{j+1}))_{{\widehat{{\rho }}}_{j}/ {4}}\). Letting

    $$\begin{aligned}{} & {} \varpi _{j+1}\,{:}{=}\,\partial _{(I_{j+1}, p_{j+1}q_{j+1})}\textrm{h}_{j+1}(I_{j+1}, p_{j+1}q_{j+1})\\{} & {} \quad =(\omega _{j+1}(I_{j+1}, p_{j+1}q_{j+1}), \nu _{j+1}(I_{j+1}, p_{j+1}q_{j+1})) \end{aligned}$$

    the map \(I_{j+1}\rightarrow \omega _{j+1}(I_{j+1}, p_{j+1}q_{j+1})\) is a diffeomorphism of \(\big (D_{j+1}(p_{j+1}q_{j+1})\big )_{{\rho }_j}\) verifying

    $$\begin{aligned} \omega _{j+1}(D_{j+1}(p_{j+1}q_{j+1}), p_{j+1}q_{j+1})=\omega _{j}(D_{j}(p_{j+1}q_{j+1}), p_{j+1}q_{j+1}) . \end{aligned}$$

    The map

    $$\begin{aligned} {\widehat{\iota }}_{j{+1}}(p_{j+1}q_{j+1})= & {} ({\widehat{\iota }}_{j{+1,} 1}(p_{j+1}q_{j+1}),{\widehat{\iota }}_{j{+1,}2}(p_{j+1}q_{j+1})):\\ D_{j}(p_{j+1}q_{j+1})\rightarrow & {} D_{j+1}(p_{j+1}q_{j+1})\\ I_j(p_{j+1}q_{j+1})\rightarrow & {} I_{j+1}(p_{j+1}q_{j+1})\,{:}{=}\,\omega _{j+1}^{-1}\big (\omega _{j}(I_j, p_{j+1}q_{j+1}), p_{j+1}q_{j+1}\big ) \end{aligned}$$

    verifies

    $$\begin{aligned}{} & {} \sup _{{ D}_{j}}|{\widehat{\iota }}_{j+1, 1}(p_{j+1}q_{j+1})-{\, \mathrm id \,}|\le 3{n} \frac{{\overline{M}}_1}{{\overline{M}}}{\widehat{E}} _{j}{\widetilde{{\rho }}}_{j}\le 3{n} {\widehat{E}} _{j}{\widetilde{{\rho }}}_{j}\ ,\nonumber \\{} & {} \sup _{{ D}_{j}}|{\widehat{\iota }}_{j+1, 2}(p_{j+1}q_{j+1})-{\, \mathrm id \,}|\le 3{n} \frac{{\overline{M}}_2}{{\overline{M}}}{\widehat{E}} _{j}{\widetilde{{\rho }}}_{j}\le 3{n}{\widehat{E}} _{j}{\widetilde{{\rho }}}_{j} \end{aligned}$$
    (57)
    $$\begin{aligned}{} & {} \mathcal{L}(\widehat{\iota }_{j+1}(p_{j+1}q_{j+1})-{\, \mathrm id \,})\le 2^9{n} {\widehat{E}} _{j} \end{aligned}$$
    (58)
  2. (ii)

    the perturbation \({f}_j\) has sup-Fourier norm

    $$\begin{aligned}\Vert {f}_j\Vert _{(W_j)_{\rho _j, \varepsilon _j, s_j}}\le E_j \end{aligned}$$
  3. (iii)

    the real-analytic symplectomorphisms \(\Psi _{j+1}\) in (53) verify

    $$\begin{aligned}{} & {} \sup _{(W_{j+1})_{{\rho }_{j+1}, \varepsilon _{j+1},s_{j+1}}}|{I_{j, 1}}(I_{j+1},\varphi _{j+1}, p_{j+1}, q_{j+1})-{I_{j+1,1}}|\le \frac{3}{ {4 }} \frac{{\widehat{M}}_j }{ {M_j}} {\widehat{E}}_j{{\widetilde{{\rho }}}_j}\nonumber \\{} & {} \sup _{(W_{j+1})_{{\rho }_{j+1}, \varepsilon _{j+1},s_{j+1}}}|{I_{j, 2}}(I_{j+1},\varphi _{j+1}, p_{j+1}, q_{j+1})-{I_{j+1,2}}|\le \frac{3}{ {4 }}{\widehat{E}}_j {{\widetilde{{\rho }}}_j}\nonumber \\{} & {} \sup _{(W_{j+1})_{{\rho }_{j+1}, \varepsilon _{j+1},s_{j+1}}}|\varphi _{j}(I_{j+1},\varphi _{j+1}, p_{j+1}, q_{j+1})-\varphi _{j+1}|\le \frac{3}{ {4 }}{\widehat{E}}_j s_j\nonumber \\{} & {} {\sup _{(W_{j+1})_{{\rho }_{j+1}, \varepsilon _{j+1},s_{j+1}}}|p_{j}(I_{j+1},\varphi _{j+1}, p_{j+1}, q_{j+1})-p_{j+1}|\le \frac{3}{ {4 }}{\widehat{E}}_j \varepsilon _j}\nonumber \\{} & {} {\sup _{(W_{j+1})_{{\rho }_{j+1}, \varepsilon _{j+1},s_{j+1}}}|q_{j}(I_{j+1},\varphi _{j+1}, p_{j+1}, q_{j+1})-q_{j+1}|\le \frac{3}{ {4 }}{\widehat{E}}_j \varepsilon _j} . \end{aligned}$$
    (59)

    The rescaled dimensionless map \({\check{\Phi }}_{j+1}\,{:}{=}\,{\, \mathrm id \,}+{1}_{{\widehat{{\rho }}}_0^{-1},s_0^{-1}, \varepsilon _0^{-1}}\left( \Phi _{j+1}-{\, \mathrm id \,}\right) \circ {1}_{{\widehat{{\rho }}}_{0},s_{0}, \varepsilon _{0}}\) has Lipschitz constant on \((W_{j+1})_{{\rho }_{j+1}/{\widehat{\rho }}_0, \varepsilon _{j+1}/\varepsilon _0,s_{j+1}/s_0}\)

    $$\begin{aligned} \mathcal{L}({\check{\Phi }}_{j+1}-{\, \mathrm id \,})\le 6(n+1)\big (12\cdot (24)^\tau \big )^{j}{\widehat{E}} _{j}\,;\end{aligned}$$
    (60)
  4. (iv)

    for any \(j\ge 0\), \({\widehat{E}} _{j+1}<{\widehat{E}} _{j}^2\), \(\lambda _j\ge \frac{\lambda _0}{2}\).
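
The recursion (54)–(56) is easy to iterate numerically, which gives a feeling for the super-exponential decay claimed in item (iv). The sketch below is illustrative only and rests on two labeled assumptions: \(\log _+ x\) is read as \(\max \{\log x, 1\}\) (the symbol is not defined in this section), and \(E_{j+1}\) is taken in the squared form \(E_j^2L_jM_j^2/\gamma _1^2\) used in (65) and (72). The function names and the starting values are hypothetical.

```python
from math import log

def log_plus(x):
    # Assumption: log_+ x = max(log x, 1); this section does not define log_+.
    return max(log(x), 1.0)

def kam_step(st, gamma1, gamma2, tau):
    """One step of the recursion (54)-(56).

    st holds the j-th values of E, M, Mbar, Mhat, L, rho, eps, lam, s.
    Returns the (j+1)-th state and the auxiliary quantities of step j.
    """
    x = gamma1 ** 2 / (st["E"] * st["L"] * st["M"] ** 2)
    K = 32.0 / st["s"] * log_plus(x)                          # (54)
    rho_hat = min(gamma1 / (2 * st["M"] * K ** (tau + 1)),
                  gamma2 / (2 * st["Mhat"] * K ** (tau + 1)),
                  st["lam"] / (2 * st["M"] * K),
                  st["lam"] / (2 * st["Mhat"] * K),
                  st["rho"])                                  # (55)
    rho_tilde = min(rho_hat, st["eps"] ** 2 / st["s"])        # (56)
    E_hat = st["E"] * st["L"] / (rho_hat * rho_tilde)
    nxt = {
        # Assumption: squared form of E_{j+1}, consistent with (65) and (72).
        "E": st["E"] ** 2 * st["L"] * st["M"] ** 2 / gamma1 ** 2,
        "M": 2 * st["M"], "Mbar": 2 * st["Mbar"],
        "Mhat": 2 * st["Mhat"], "L": 2 * st["L"],
        "rho": rho_hat / 4, "eps": st["eps"] / 4,
        "lam": st["lam"] - 2 ** 8 * st["E"] / st["eps"] ** 2,
        "s": st["s"] / 4,
    }
    aux = {"x": x, "K": K, "rho_hat": rho_hat,
           "rho_tilde": rho_tilde, "E_hat": E_hat}
    return nxt, aux

# Hypothetical starting data, chosen only so that the scheme is in its regime.
state0 = dict(E=1e-30, M=1.0, Mbar=1.0, Mhat=1.0, L=1.0,
              rho=1.0, eps=0.1, lam=1.0, s=0.5)
state1, aux0 = kam_step(state0, gamma1=0.1, gamma2=0.1, tau=2)
state2, aux1 = kam_step(state1, gamma1=0.1, gamma2=0.1, tau=2)
```

Comparing `aux1["E_hat"]` with `aux0["E_hat"] ** 2` for such data shows the quadratic gain at each step, at the price of the shrinking widths \(\rho _j\), \(\varepsilon _j\), \(s_j\).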

Proof

The proof of this lemma is obtained by generalizing (Chierchia and Pinzari 2010, Lemma B.1). We shall limit ourselves to describing only the points that differ, leaving to the interested reader the easy work of completing the details.

We construct the transformations (53) by recursion, based on Proposition 3.1. For simplicity of notation, we shall systematically drop the subscript “j” and replace “\(j+1\)” with a “+”. As an example, instead of (53), we shall write

$$\begin{aligned} \Psi _{+}:\ W_{+}\rightarrow W . \end{aligned}$$

When needed, the base step will be labeled as “0” (e.g., (76) below). Let us assume (inductively) that

$$\begin{aligned}{} & {} \omega (D, pq)\subset \mathcal{D}_{\gamma _1, \gamma _2, \tau }\quad \forall \ (p, q)\in B_\varepsilon \end{aligned}$$
(61)
$$\begin{aligned}{} & {} {\widehat{c}}{\widehat{E}}<1\end{aligned}$$
(62)
$$\begin{aligned}{} & {} \lambda \ge \max \left\{ \frac{\gamma _2}{K^{\tau }} ,\ \frac{\lambda _0}{2}\right\} . \end{aligned}$$
(63)

Condition (61) is verified at the base step provided one takes \(D_0=\omega _0^{-1}(\mathcal{D}_{\gamma _1, \gamma _2, \tau }, p_0q_0)\); (62) holds by assumption, while (63) follows from (41):

$$\begin{aligned} \lambda _0\ge \frac{\lambda _0}{2}\ge \frac{s_0^\tau \gamma _2}{6^{\tau }} \ge \frac{\gamma _2}{K_0^{\tau }} .\end{aligned}$$
(64)

We aim to apply Proposition 3.1 with \(\varepsilon \), s of Proposition 3.1 corresponding now to \(\overline{\varepsilon }+\varepsilon \), \(\overline{s}+s\), and

$$\begin{aligned} r={\widehat{\rho }} ,\quad {\hat{r}}=\frac{{\hat{\rho }}}{8} ,\quad {\hat{s}}\,{:}{=}\,\frac{s}{8} ,\quad \hat{\varepsilon }\,{:}{=}\,\frac{\varepsilon }{8} ,\quad {{\mathbb {L}}}=\{0\} . \end{aligned}$$

We check that (61) and (62) imply conditions (49) and (50). We start with (49). If \((I, p, q)=(I_{1}, I_{2}, p, q)\in A_{{\widehat{\rho }}, \varepsilon }\) and \(k\in \textbf{Z}^3\setminus \{0\}\), with \(|k|_1\le K\), then there exists some \(I_0(pq)=(I_{01}(pq), I_{02}(pq))\) such that \(|I-I_0(pq)|<{\widehat{\rho }}\) and \(\omega (I_0(pq), pq)=(\omega _{01}, \omega _{02})\in \mathcal{D}_{\gamma _1, \gamma _2, \tau }\). We have

$$\begin{aligned} |\varpi (I, pq)\cdot k|= & {} \Big |{\omega }_{01}\cdot k_1+{\omega }_{02}\cdot k_2+({\omega }_1(I, pq)-{\omega }_1(I(pq), pq))\cdot k_1\\{} & {} +({\omega }_2(I, pq)-{\omega }_2(I(pq), pq)))\cdot k_2-\textrm{i}{\nu }(I, pq) k_3 \Big |\\\ge & {} \left\{ \begin{array}{lll} \displaystyle \min \left\{ \frac{\gamma _1}{2K^\tau } ,\ \frac{\lambda }{2}\right\} \quad &{}\textrm{if}\quad &{}k_1\ne 0\\ \displaystyle \min \left\{ \frac{\gamma _2}{2K^\tau } ,\ \frac{\lambda }{2}\right\} &{}\textrm{if}&{} k_1=0,\ k_2\ne 0\\ \displaystyle \lambda &{}\textrm{if}&{} k_1=k_2=0,\ k_3\ne 0 \end{array} \right. \\\ge & {} \left\{ \begin{array}{lll} \displaystyle \alpha _1\,{:}{=}\,\frac{\gamma _1}{2K^\tau }\quad &{}\textrm{if}\quad &{}k_1\ne 0\\ \displaystyle \alpha _2\,{:}{=}\,\frac{\gamma _2}{2K^\tau } &{}\textrm{if}&{} k_1=0,\ (k_2, k_3)\ne (0, 0) \end{array} \right. \end{aligned}$$

having used (63). The bounds above have been obtained considering separately the cases \(k_3\ne 0\) and \(k_3= 0\), and:

– if \(k_3\ne 0\), taking the infimum of the modulus of the imaginary part of the expression between the |’s; observing that \(\overline{\omega }_0=(\overline{\omega }_{01}, \overline{\omega }_{02})\) are real and bounding the differences \(|{\, \mathrm Im\,}\big ({\omega }_i(I, pq)-{\omega }_i(I(pq), pq)\big )|\) with \(MK{\widehat{\rho }}\) (when \(i=1\)), \(\widehat{M}K{\widehat{\rho }}\) (when \(i=2\)) and using the definition of \({\widehat{\rho }}\) in (55).

– if \(k_3=0\), using the Diophantine inequality and again bounding the differences \(|{\, \mathrm Im\,}\big ({\omega }_i(I, pq)-{\omega }_i(I(pq), pq)\big )|\) as in the previous case and using the definition of \({\widehat{\rho }}\).

We now check condition (50). The inequality \(Ks> 8\log 2\) is trivial by definition of K (see (54)), and also, the smallness condition  (50) is easily met, since \({\hat{\sigma }}=\min \{\frac{1}{8}\frac{\varepsilon }{\overline{\varepsilon }+\varepsilon } ,\ \frac{s}{8} \}=\frac{s}{8}\), \(\delta =2^{-6}\min \{{\hat{\rho }} s ,\ \varepsilon ^2\}=2^{-6}{\widetilde{{\rho }}} s\) (by the definition of \({\widetilde{\rho }}\) in (56)), whence

$$\begin{aligned} 2^{3} c_1\frac{K\frac{s}{8}}{{\alpha }_2 {\delta }}\Vert f\Vert _{ {W_{{\widehat{{\rho }}}, \varepsilon , s}}}\le 2^{6}c_1\frac{E L}{{\widehat{{\rho }}}\widetilde{\rho }}\le {\widehat{c}} {\widehat{E}}<1 \end{aligned}$$

having used \(L\ge {\widehat{M}}^{-1}\), \(M^{-1}\), so \({\alpha }_2\ge K L^{-1}{\widehat{{\rho }}}\), \(2^{6}c_1<{\widehat{c}}\), and (62). Thus, by Proposition 3.1, \(\textrm{H}\) may be conjugated to

$$\begin{aligned} \textrm{H}_+\,{:}{=}\,\textrm{H}\circ \Psi _+=\textrm{h}_+(I_+, p_+q_+)+f_+(I_+, {\varphi }_+, p_+, q_+) \end{aligned}$$

where

$$\begin{aligned} \textrm{h}_+(I_+, p_+q_+)=\textrm{h}(I_+, p_+q_+)+\textrm{g}(I_+, p_+q_+) \end{aligned}$$

while, by  (51) and the choice of K,

$$\begin{aligned} \Vert f_+\Vert _{{{\widehat{{\rho }}}/2, \varepsilon /2, s/2}}\le & {} e^{-Ks/32}E\le \frac{E LM ^2}{{{\gamma }_1}^2}E=E_+\ .\end{aligned}$$
(65)
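
The constants in the last two displays can be traced explicitly. The following is a sketch, assuming only that \(\log _+ x\ge \log x\) (which holds for any usual reading of \(\log _+\), a symbol not defined in this section) and writing \(x\,{:}{=}\,\big (\frac{ELM^2}{{{\gamma }_1}^2}\big )^{-1}\). For the smallness condition, inserting \({\hat{\sigma }}=s/8\), \({\delta }=2^{-6}{\widetilde{{\rho }}}s\), \({\alpha }_2\ge KL^{-1}{\widehat{{\rho }}}\) and \(\Vert f\Vert \le E\) gives

$$\begin{aligned} 2^{3}c_1\frac{K\frac{s}{8}}{{\alpha }_2{\delta }}E\le 2^{3}c_1\,\frac{Ks}{8}\cdot \frac{L}{K{\widehat{{\rho }}}}\cdot \frac{2^{6}}{{\widetilde{{\rho }}}s}\,E=2^{6}c_1\frac{EL}{{\widehat{{\rho }}}{\widetilde{{\rho }}}} . \end{aligned}$$

For the exponential factor in (65), the same value \({\hat{\sigma }}=s/8\) turns \(e^{-K{\hat{\sigma }}/4}\) of (51) into \(e^{-Ks/32}\), and the definition (54) of K gives \(Ks/32=\log _+x\), whence

$$\begin{aligned} e^{-Ks/32}=e^{-\log _+ x}\le \frac{1}{x}=\frac{ELM^2}{{{\gamma }_1}^2} . \end{aligned}$$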

The conjugation is realized by an analytic transformation

$$\begin{aligned} \Psi _+:\quad (I_+,\varphi _+, p_+, q_+)\in W_{{\widehat{{\rho }}}/2, \varepsilon /2, s/2}\rightarrow (I,\varphi , p, q)\in W_{{\widehat{{\rho }}}, \varepsilon , s}\ . \end{aligned}$$

Using (52), \({\widetilde{\rho }}\le \varepsilon ^2/s\), \({\alpha }_1\ge {MK{\widehat{{\rho }}}}\), \({\alpha }_2=\frac{\gamma _2}{2K^\tau }\ge {L^{-1} K{\widehat{{\rho }}}}\), \(Ks\ge {6}\) and the definition of \({\widehat{E}}\), we obtain the bound (59) with, at the left hand side, the set \(W_{{\widehat{{\rho }}}/2, \varepsilon /2, s/2}\). Below we shall prove that \({W_{+}}_{{\rho }_{+}, \varepsilon _{+},s_{+}}\subset W_{{\widehat{{\rho }}}/2, \varepsilon /2, s/2}\), so we shall have (59).

We now evaluate the generalized frequency

$$\begin{aligned} \varpi _+(I_+, p_+q_+)\,{:}{=}\,\partial _{I_+, p_+q_+}\textrm{h}_+(I_+, p_+q_+)=\big (\omega _+(I_+, p_+q_+) ,\ \nu _+(I_+, p_+q_+)\big ) . \end{aligned}$$

with

$$\begin{aligned} \omega _+(I_+, p_+q_+)\,{:}{=}\,\partial _{I_+}\textrm{h}_+(I_+, p_+q_+)=\partial _{I_+}\textrm{h}(I_+, p_+q_+)+\partial _{I_+}\textrm{g}(I_+, p_+q_+)\end{aligned}$$
(66)

(the “new frequency map”) and

$$\begin{aligned} \nu _+(I_+, p_+q_+):=\partial _{p_+q_+}\textrm{h}_+(I_+, p_+q_+)=\nu (I_+, p_+q_+)+\partial _{p_+q_+}\textrm{g}(I_+, p_+q_+)\nonumber \\ \end{aligned}$$
(67)

(the “new Lyapunov exponent”).

Lemma 3.2

Let \((p_+, q_+)\in B_{\varepsilon /2}\). The new frequency map \( {{\omega }}_+\) is injective on \( D(p_+q_+)_{{\widehat{{\rho }}}/2}\) and maps \( D(p_+q_+)_{{\widehat{{\rho }}}/4}\) over \( {{\omega }}(D, p_+q_+)\). The map \({\widehat{\iota }}_+(p_{+}q_{+})=(\widehat{\iota }_{+1}(p_{+}q_{+}),{\widehat{\iota }}_{+2}(p_{+}q_{+}))\,{:}{=}\, {{\omega }}_+^{-1}\circ {{\omega }}|_{D(p_+q_+)}\) which assigns to a point \(I_0\in D(p_+q_+)\) the \( {{\omega }}_+(\cdot , p_+q_+)\)-preimage of \( {{\omega }}(I_0, p_+q_+)\) in \({ D}(p_+q_+)_{{\widehat{{\rho }}}/4}\) satisfies

$$\begin{aligned}{} & {} \sup _{(A_+)_{\rho _+, \varepsilon _+}}|{\widehat{\iota }}_{+1}(p_{+}q_{+})-{\, \mathrm id \,}|\le {3}{n}\frac{{\overline{M}}_1 E}{{\widehat{{\rho }}}}\le {3}{n}\frac{{\overline{M}}E}{{\widehat{{\rho }}}}\ ,\nonumber \\{} & {} \sup _{(A_+)_{\rho _+, \varepsilon _+}}|{\widehat{\iota }}_{+2}(p_{+}q_{+})-{\, \mathrm id \,}|\le 3{n}\frac{{\overline{M}}_2 E}{{\widehat{{\rho }}}}\le 3{n}\frac{{\overline{M}}E}{{\widehat{{\rho }}}}\ ,\nonumber \\{} & {} \mathcal{L}({\widehat{\iota }}_+(p_{+}q_{+})-{\, \mathrm id \,})\le 2^{ {9}}{n}\frac{{\overline{M}}E}{{\widehat{{\rho }}}^2}\ . \end{aligned}$$
(68)

The Jacobian matrix \(U_+\,{:}{=}\,\partial ^2_{ {I_+}}\textrm{h}_+ {(I_+, p_+q_+)}\) is non-singular on \( D_{{\widehat{{\rho }}}/ {4}} {\times B^2_{\varepsilon /2}}\) and the following bounds hold

$$\begin{aligned}{} & {} M_+ \,{:}{=}\,2M\ge \sup _{(A_+)_{\rho _+, \varepsilon _+}}\Vert U_+\Vert \ ,\quad {\widehat{M}}_+\,{:}{=}\,2{\widehat{M}}\ge \sup _{(A_+)_{\rho _+, \varepsilon _+}}\Vert {\widehat{U}}_+\Vert \ ,\nonumber \\{} & {} \overline{M}_+\,{:}{=}\,2\overline{M}\ge \sup _{(A_+)_{\rho _+, \varepsilon _+}}\Vert U_+^{-1}\Vert \ ,\quad \overline{M}_{i+}\,{:}{=}\,2\overline{M}_i\ge \sup _{(A_+)_{\rho _+, \varepsilon _+}}\Vert T_{i+}\Vert \ ,\quad i=1,\ 2.\nonumber \\ \end{aligned}$$
(69)

where \( U_+^{-1}{=}{:}\left( \begin{array}{lrr} T_{+1}\\ T_{+2} \end{array} \right) \). Finally, the new Lyapunov exponent \(\nu _+(I_+, p_+q_+)\) satisfies

$$\begin{aligned} {\lambda }_+\,{:}{=}\,{\lambda }- 2^4\frac{E}{\varepsilon ^2}\le \inf _{(A_+)_{\rho _+, \varepsilon _+}}|{\, \mathrm Re\,}{\nu }_+| .\end{aligned}$$
(70)

Postponing for the moment the proof of this lemma, we let \({\rho }_+\,{:}{=}\,{\widehat{{\rho }}}/ 2\), \(s_+\,{:}{=}\,s/2\), \(\varepsilon _+=\varepsilon /2\) and \( D_+(p_{+}q_{+})\,{:}{=}\,\widehat{\iota }_+(p_{+}q_{+})(D(p_{+}q_{+}))\). By Lemma 3.2, \(D_+\) is a subset of \(D_{{\widehat{{\rho }}}/4}\) and hence

$$\begin{aligned} (D_+)_{{\rho }_+}\subset D_{{\widehat{{\rho }}}/2}\ .\end{aligned}$$
(71)

We prove that \( {\widehat{E}} _+=\frac{E_+L_+}{\widehat{\rho }_+{\widetilde{\rho }}_+}\le {\widehat{E}} ^2 \). Since

$$\begin{aligned} s_+=\frac{s}{4}\quad \text {and}\quad x_+\,{:}{=}\,\Big (\frac{E_+ L_+M _+^2}{{{\gamma }_1}^2}\Big )^{-1}=\frac{x^2}{8}\quad \text {where}\quad x\,{:}{=}\,\Big (\frac{E LM ^2}{{{\gamma }_1}^2}\Big )^{-1} \end{aligned}$$
(72)

we have

$$\begin{aligned} K_+=\frac{2^5}{s_+}\log x_+= \frac{2^7}{s}\log \frac{x^2}{8} =\frac{2^8}{s}\log _+ x-\frac{3\cdot 2^7}{s}\log _+ 2<8 K .\end{aligned}$$
(73)
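
For the reader's convenience, here is how (72) and (73) come out. This is a sketch, taking \(E_+=\frac{E^2LM^2}{{{\gamma }_1}^2}\) as in (65), \(L_+=2L\), \(M_+=2M\), and reading \(\log _+\) as \(\log \) (an assumption, valid when the arguments are large):

$$\begin{aligned} \frac{E_+L_+M_+^2}{{{\gamma }_1}^2}&=\frac{E^2LM^2}{{{\gamma }_1}^2}\cdot \frac{2L\cdot 4M^2}{{{\gamma }_1}^2}=8\Big (\frac{ELM^2}{{{\gamma }_1}^2}\Big )^2=\frac{8}{x^2} ,\\ K_+&=\frac{2^5}{s_+}\log x_+=\frac{2^7}{s}\big (2\log x-3\log 2\big )<\frac{2^8}{s}\log x=8\cdot \frac{2^5}{s}\log x=8K . \end{aligned}$$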

Finally, (42), (70) and the definition of \({\tilde{E}}\) imply \(\lambda _+\ge \frac{\lambda }{2}\). Collecting all bounds, we get

$$\begin{aligned} {\widehat{{\rho }}}_+= & {} \min \left\{ \frac{{\gamma }_1}{ 2M_{+}K_{+}^{{\tau }+1}} ,\ \frac{{\gamma }_2}{ 2{\widehat{M}}_{+}K_{+}^{{\tau }+1}} ,\ \frac{\lambda _+}{ 2 M_{+}K_{+}} ,\ \frac{\lambda _+}{ 2 {\widehat{M}}_{+}K_{+}} ,\ {\rho }_{+}=\frac{{\widehat{\rho }}}{2}\right\} \ge \frac{{\widehat{{\rho }}}}{2\cdot 8^{{\tau }+1}}\nonumber \\ {\widetilde{\rho }}_+= & {} \min \left\{ {\widehat{\rho }}_+, \frac{\varepsilon ^2_+}{s_+}\right\} \ge \frac{{\widetilde{{\rho }}}}{2\cdot 8^{{\tau }+1}} \end{aligned}$$
(74)

and

$$\begin{aligned} {\widehat{E}} _+=\frac{E_+L_+}{\widehat{\rho }_+{\widetilde{\rho }}_+}\le \frac{E^2LM ^2}{{{\gamma }_1}^2}\frac{2L}{{\widehat{{\rho }}}{\widetilde{\rho }}}4\cdot 8^{2({\tau }+1)}=8\cdot 8^{2({\tau }+1)}\frac{E\,LM ^2}{{{\gamma }_1}^2}{\widehat{E}} \end{aligned}$$

Now, using, in the last inequality, the bound

$$\begin{aligned} \frac{E\,LM ^2}{{{\gamma }_1}^2} \le \frac{1}{4}\left( \frac{s}{6}\right) ^{2({\tau }+1)} \frac{E L}{\widehat{\rho }^2} \le \frac{1}{4}\left( \frac{s}{6}\right) ^{2({\tau }+1)}{\widehat{E}} \end{aligned}$$

(since \({\widehat{\rho }}\le \frac{{\gamma }_1}{ 2M K^{{\tau }+1}}\) and \(K\ge 6/s\)) we find

$$\begin{aligned} {\widehat{E}} _+\le 2(\frac{4}{3}s)^{{\tau }+1}{\widehat{E}} ^2<{\widehat{E}} ^2\end{aligned}$$
(75)

(having used \(s\le 1/2\)). We now prove that \(\lambda _{+}\ge \frac{\lambda _0}{2}\). Iterating (70) and using \({\widehat{\rho }}_k\le {\widehat{\rho }}_{k-1}/4\), \({\widetilde{\rho }}_k\le {\widetilde{\rho }}_{k-1}/4\), \(\varepsilon _k= \varepsilon _{k-1}/4\), \(L_k=2L_{k-1}\), (75) and the second condition in (42) with \({\tilde{c}}=2^6\), we get

$$\begin{aligned} \lambda _{+}{} & {} =\lambda _{j+1}=\lambda _0-2^4\sum _{k=1}^{j}\frac{E_k}{\varepsilon _k^2}\ge \lambda _0-2^4\sum _{k=1}^{j}\widehat{E}_k\frac{{\widehat{\rho }}_k{\widetilde{\rho }}_k}{\varepsilon ^2_k L_k} \ge \lambda _0-2^4\frac{{\widehat{\rho }}_0{\widetilde{\rho }}_0}{\varepsilon ^2_0 L_0}\sum _{k=1}^{j}{\widehat{E}}_k\nonumber \\{} & {} \ge \lambda _0-2^5\frac{{\widehat{\rho }}_0{\widetilde{\rho }}_0}{\varepsilon ^2_0 L_0}{\widehat{E}}_0\nonumber \\{} & {} =\lambda _0-2^5\frac{ E_0}{\varepsilon ^2_0} \ge \frac{\lambda _0}{2} . \end{aligned}$$
(76)

This allows us to check (63) at the next step: using (64) and (73), we have

$$\begin{aligned} \lambda _+\ge \frac{\lambda _0}{2}\ge \frac{\gamma _2}{K_0^{\tau }} \ge \frac{\gamma _2}{K_+^{\tau }} . \end{aligned}$$

Finally, (57) and (58) follow from (68), while the estimate in (60) is a consequence of  (59), (71), (72), (74), inequality \(LM\ge 1\) and Cauchy estimates:

$$\begin{aligned} \mathcal{L}(\check{{\Phi }}_{j+1}-{\, \mathrm id \,})\le & {} 2(n+1)\sup _{{({\check{W}}_{j+1})_{{\rho }_{j+1}, \varepsilon _{j+1},s_{j+1}}}}\Vert D(\check{{\Phi }}_{j+1}-{\, \mathrm id \,})\Vert _{\infty }\\\le & {} 2(n+1) \frac{\frac{3}{4}{\widehat{E}}_j\max \{{\widehat{{\rho }}}_{j}/{\rho }_0,s_j/s_0, {{\varepsilon _j}/{\varepsilon _0}}\}}{\min \{{\widehat{{\rho }}}_{j}/(4{\widehat{{\rho }}}_0),s_{j}/(4s_0), {{\varepsilon _{j}}/{(4\varepsilon _0)}}\}}\\\le & {} 2(n+1)\frac{3/4(1/4)^{j}}{1/4\left( \frac{1}{2({24})^{{\tau }+1}}\right) ^{j}}\widehat{E} _{j}=6(n+1)\big (12\cdot (24)^\tau \big )^{j}{\widehat{E}} _{j}\ . \end{aligned}$$
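
The last equality in this chain is pure arithmetic; spelled out, the ratio in front of \({\widehat{E}}_j\) is

$$\begin{aligned} \frac{\frac{3}{4}(1/4)^{j}}{\frac{1}{4}\big (2\cdot 24^{{\tau }+1}\big )^{-j}}=3\Big (\frac{2\cdot 24^{{\tau }+1}}{4}\Big )^{j}=3\big (12\cdot 24^{{\tau }}\big )^{j} , \end{aligned}$$

and multiplying by \(2(n+1)\) gives the factor \(6(n+1)\big (12\cdot (24)^{\tau }\big )^{j}\) in (60).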

Proof of Lemma 3.2

The proof of this lemma is obtained by generalizing Chierchia and Pinzari (2010, Lemma B.2). As above, we limit ourselves to discussing only the parts that differ.

By (51),

$$\begin{aligned} \sup _{ {{ D}_{{\widehat{{\rho }}}/2}\times B^2_{\varepsilon /2}}}|\textrm{g}|\le \sup _{ {{ D}_{{\widehat{{\rho }}}/2}\times B^2_{\varepsilon /2}}}|\textrm{g}-{\overline{{f}}}|+\sup _{ {{ D}_{\widehat{\rho }/2}\times B^2_{\varepsilon /2}}}|{\overline{{f}}}|\le {\frac{3}{2}}E\ , \end{aligned}$$

(where \({\overline{f}}\) denotes the average of f). Therefore we may bound

$$\begin{aligned}{} & {} \sup _{ {{ D}_{{\widehat{{\rho }}}/4}\times B^2_{\varepsilon /2}}}\Vert (\partial _{I_+}^2\,\textrm{h})^{-1}\partial _{I_+}^2\,\textrm{g}\Vert \le 2 \overline{M} \frac{\frac{3}{2}E}{({\widehat{{\rho }}}/4)^2}\le 2^{6}\frac{\overline{M}E}{{\widehat{{\rho }}}^2}<\frac{1}{2} . \end{aligned}$$

This shows that the function (66) has a Jacobian matrix

$$\begin{aligned} \partial _{I_+}\omega _+(I_+, p_+q_+)=\partial _{I_+}^2\,\textrm{h}_+(I_+, p_+q_+)=\partial ^2_{I_+}\textrm{h}(I_+, p_+q_+)+\partial ^2_{I_+}\textrm{g}(I_+, p_+q_+) \end{aligned}$$

which is invertible for all \((p_+, q_+)\in B^2_{\varepsilon /2}\) and satisfies

$$\begin{aligned} \overline{M}_+\,{:}{=}\,\sup _{ {{ D}_{{\widehat{{\rho }}}/4}\times B^2_{\varepsilon /2}}}\Big \Vert \Big (\partial _{I_+}\omega _+(I_+, p_+q_+)\Big )^{-1}\Big \Vert \le 2\overline{M} \end{aligned}$$

In a similar way one proves (69). Next, for any fixed \((p_+, q_+)\in B^2_{\varepsilon /2}\) and \(\overline{\omega }=\omega (I(p_+q_+), p_+q_+)\in \omega (D, p_+q_+) \) with \(I(p_+q_+)\in D\), we want to find \(I_+=I_+(p_+q_+)\in D_+\) such that

$$\begin{aligned} \omega _+(I_+(p_+q_+), p_+q_+)=\overline{\omega }=\omega (I(p_+q_+), p_+q_+)\end{aligned}$$
(77)

To this end, we consider the function

$$\begin{aligned} I_+\in { D}_{{\widehat{{\rho }}}/2}\rightarrow F(I_+, p_+q_+)\,{:}{=}\,\omega _+(I_+, p_+q_+)-\overline{\omega } \quad (p_+, q_+)\in B^2_{\varepsilon /2}\end{aligned}$$

As F differs from \(\omega _+\) by a constant, we have

$$\begin{aligned} m\,{:}{=}\,\sup _{ {{ D}_{{\widehat{{\rho }}}/4}\times B^2_{\varepsilon /2}}}\Big \Vert \Big (\partial _{I_+}F(I_+, p_+q_+)\Big )^{-1}\Big \Vert =\sup _{ {{ D}_{{\widehat{{\rho }}}/4}\times B^2_{\varepsilon /2}}}\Big \Vert \Big (\partial _{I_+}\omega _+(I_+, p_+q_+)\Big )^{-1}\Big \Vert \le 2M . \end{aligned}$$

Similarly, we bound the quantities

$$\begin{aligned} Q\,{:}{=}\,|\partial ^2_{I_+}F(I)|=|\partial ^3_{I_+}\textrm{g}(I_+, p_+q_+)|\le 6\frac{\frac{3}{2}E}{({\widehat{\rho }}/4)^3}<2^{10}\frac{E}{{\widehat{\rho }}^3}. \end{aligned}$$

and

$$\begin{aligned} P\,{:}{=}\,|F(I(p_+q_+))|=|\partial _{I_+}\textrm{g}(I(p_+q_+), p_+q_+)|\le \frac{\frac{3}{2}E}{({\widehat{\rho }}/4)}\le 2^{3}\frac{E}{{\widehat{\rho }}} . \end{aligned}$$

Putting everything together, we get

$$\begin{aligned} 4m^2 PQ\le 2^{16}\frac{M^2 E^2}{{\widehat{\rho }}^4}\le {\widehat{c}}^2 {\widehat{E}}^2<1 \end{aligned}$$
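
The constant \(2^{16}\) is slightly sharper than what the rounded bounds \(P\le 2^3E/{\widehat{\rho }}\) and \(Q<2^{10}E/{\widehat{\rho }}^3\) would give (these yield \(2^{17}\)). A possible way to recover it, sketched here with the unrounded Cauchy bounds \(P\le 6E/{\widehat{\rho }}\), \(Q\le 576E/{\widehat{\rho }}^3\) read off the two displays above and with \(m\le 2M\), is

$$\begin{aligned} 4m^2PQ\le 4(2M)^2\cdot \frac{6E}{{\widehat{\rho }}}\cdot \frac{576E}{{\widehat{\rho }}^3}=55296\,\frac{M^2E^2}{{\widehat{\rho }}^4}<2^{16}\frac{M^2E^2}{{\widehat{\rho }}^4} . \end{aligned}$$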

By the implicit function theorem [see, e.g., Celletti and Chierchia (1998, Theorem 1 and Remark 1)], Equation (77) has a unique solution

$$\begin{aligned} (p_+, q_+)\in B^2_{\varepsilon /2}\rightarrow I_+(p_+q_+)\in B_r(I(p_+q_+)) , \end{aligned}$$

with

$$\begin{aligned} r=2mP\le 2^5\frac{ME}{{\widehat{\rho }}}\le \frac{{\widehat{\rho }}}{4} \end{aligned}$$

so we can take

$$\begin{aligned} D_+(p_+q_+)=\bigcup _{\overline{\omega }\in \omega (D, p_+q_+)}\{I_+(p_+q_+)\} \end{aligned}$$

This ensures that (61) holds also for \(D_+\).

Finally, the real part of the function (67) satisfies the lower bound

$$\begin{aligned} \inf _{ {{ D}_{{\widehat{{\rho }}}/2}\times B^2_{\varepsilon /4}}}\left| {\, \mathrm Re\,}{\nu }_+\right| \ge {\lambda }- \frac{E}{(\varepsilon /4)^2}={\lambda }_+ . \end{aligned}$$

The proof of (68) proceeds as in Chierchia and Pinzari (2010, proof of Lemma B.2). \(\square \)

Proof of Theorem 3.1.

Step 1 Construction of the “generalized limit actions”

Let \(({\pi }, {\kappa })\in B_0=B^2_{\overline{\varepsilon }}=\bigcap _{j\ge 0}B_{\varepsilon _j}\). Define, on \(D_0({\pi }{\kappa })=\omega _0^{-1}(\mathcal{D} _{\gamma _1, \gamma _2, \tau }, {\pi }{\kappa })\cap D\),

$$\begin{aligned} {\check{\iota }}_j({\pi }{\kappa })\,{:}{=}\,{\widehat{\iota }}_j({\pi }{\kappa })\circ \widehat{\iota }_{j-1}({\pi }{\kappa })\circ \cdots \circ {\widehat{\iota }}_1({\pi }{\kappa })\quad j\ge 1 . \end{aligned}$$

Then \({\check{\iota }}_j({\pi }{\kappa })\) converges uniformly to a map \({\check{\iota }} ({\pi }, {\kappa })=({\check{\iota }}_1({\pi }, {\kappa }), {\check{\iota }}_2({\pi }, {\kappa }))\) verifying

$$\begin{aligned} \sup _{D_0({\pi }{\kappa })}|{\check{\iota }}_1({\pi }{\kappa })-{\, \mathrm id \,}|\le 6n \frac{\overline{M}_1}{\overline{M}}{\widetilde{\rho }}_0{\widehat{E}}_0 ,\quad \sup _{D_0({\pi }{\kappa })}|{\check{\iota }}_2({\pi }{\kappa })-{\, \mathrm id \,}|\le 6n \frac{\overline{M}_2}{\overline{M}}{\widetilde{\rho }}_0{\widehat{E}}_0 .\end{aligned}$$
(78)

Moreover, as

$$\begin{aligned} \sup |{\widehat{\iota }}_j({\pi }{\kappa })-{\widehat{\iota }}({\pi }{\kappa })|\le 6n {\widehat{E}}_j{\widetilde{\rho }}_j<\frac{6n}{{\widehat{c}}}{\widehat{\rho }}_j<\rho _j \end{aligned}$$

we have

$$\begin{aligned} D_*({\pi }{\kappa })\,{:}{=}\,{\check{\iota }}({\pi }{\kappa }) (D_0({\pi }{\kappa }))\subset \bigcap _{j} {D_j({\pi }{\kappa })}_{\rho _j} .\end{aligned}$$
(79)

In particular, taking \(j=0\),

$$\begin{aligned} D_*({\pi }{\kappa })\subset (D_0({\pi }{\kappa }))_{6n {\widehat{E}}_0{\widetilde{\rho }}_0} .\end{aligned}$$
(80)

Moreover,

$$\begin{aligned} \mathcal{L}({\check{\iota }}({\pi }{\kappa })-{\, \mathrm id \,})\le 2^8 n {\widehat{E}} . \end{aligned}$$

So \({\check{\iota }}({\pi }{\kappa })\) is bi-Lipschitz, with

$$\begin{aligned} \mathcal{L}_-({\check{\iota }}({\pi }{\kappa }))\ge 1-2^8 n {\widehat{E}} ,\qquad \mathcal{L}_+({\check{\iota }}({\pi }{\kappa }))\le 1+2^8 n {\widehat{E}} . \end{aligned}$$
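
The two-sided bound follows from the triangle inequality alone. As a sketch, writing \(c\,{:}{=}\,2^8 n {\widehat{E}}\) (a shorthand introduced only for this remark) and taking \(I\), \(I'\) in the domain of \({\check{\iota }}({\pi }{\kappa })\),

$$\begin{aligned} (1-c)|I-I'|\le |I-I'|-\big |({\check{\iota }}-{\, \mathrm id \,})(I)-({\check{\iota }}-{\, \mathrm id \,})(I')\big |\le |{\check{\iota }}(I)-{\check{\iota }}(I')|\le (1+c)|I-I'| . \end{aligned}$$

In particular, \({\check{\iota }}({\pi }{\kappa })\) is injective as soon as \(c<1\).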

Step 2 Construction of \(\phi _{\omega _*}\). For each \(j\ge 1\), the transformation

$$\begin{aligned} \Phi _j\,{:}{=}\,\Psi _1\circ \cdots \circ \Psi _j \end{aligned}$$

is defined on \((W_j)_{\rho _j, s_j, \varepsilon _j}\). Setting

$$\begin{aligned} A_*\,{:}{=}\,\bigcup _{|({\pi }, {\kappa })|<\overline{\varepsilon }}D_*({\pi }{\kappa })\times \{({\pi }, {\kappa })\} ,\qquad W_*\,{:}{=}\,A_*\times {{\mathbb {T}}}^n , \end{aligned}$$

we have, by (79), \(W_*\subset \bigcap _j (W_j)_{\rho _j, s_j, \varepsilon _j}\). The sequence \(\Phi _j\) converges uniformly on \(W_*\) to a map \(\Phi \). We then let

$$\begin{aligned} \phi _{\omega _*}(\vartheta , {\pi }, {\kappa }){} & {} =\Big (v(\vartheta , {\pi }, {\kappa }; \omega _*), \vartheta +u(\vartheta , {\pi }, {\kappa }; \omega _*), {\pi }+w(\vartheta , {\pi }, {\kappa }; \omega _*), {\kappa }+y(\vartheta , {\pi }, {\kappa }; \omega _*) \Big )\\{} & {} :=\Phi \left( {\check{\iota }}(\omega _0^{-1}(\omega _*, {\pi }{\kappa })), \vartheta , {\pi }, {\kappa }\right) \end{aligned}$$

with \(v(\vartheta , {\pi }, {\kappa }; \omega _*)\,{:}{=}\,\big (v_1(\vartheta , {\pi }, {\kappa }; \omega _*), v_2(\vartheta , {\pi }, {\kappa }; \omega _*)\big )\). Since (59) imply, on \(W_*\),Footnote 22

$$\begin{aligned} \sup _{W_*}|\P _{I_1}\Phi -{\, \mathrm id \,}|_1\le 2n\frac{{\widehat{M}}_0}{{M}_0}{\widehat{E}}_0{\widetilde{{\rho }}}_0\end{aligned}$$
(81)

and similarly,

$$\begin{aligned}{} & {} \sup _{W_*}|\P _{I_2}\Phi -{\, \mathrm id \,}|_1\le 2n {\widehat{E}}_0{\widetilde{{\rho }}}_0\ ,\quad \sup _{W_*}|\P _{\varphi }\Phi -{\, \mathrm id \,}|_{\infty }\le 2{\widehat{E}}_0 s_0\ ,\nonumber \\{} & {} \sup _{W_*}|\P _{p}\Phi -{\, \mathrm id \,}|_{\infty }\le 2{\widehat{E}}_0 \varepsilon _0 ,\quad \sup _{W_*}|\P _{q}\Phi -{\, \mathrm id \,}|_{\infty }\le 2{\widehat{E}}_0 \varepsilon _0 \end{aligned}$$
(82)

then, in view of (78), (81), (82), the definition of \(W_*\) and the triangle inequality, we have (46). Equations (80), (81), (82) also imply

$$\begin{aligned} \textrm{T}_{{\omega }_*}\,{:}{=}\,\phi _{{\omega }_*}({\mathbb {T}}^n, 0, 0)\subset (D_*(0))_{2 {\widehat{E}}_0{\widetilde{{\rho }}}_0}\times {\mathbb {T}}^n\times B^2_{r'}\subset (D_0(0))_r\times {\mathbb {T}}^n\times B^2_{r'}\end{aligned}$$
(83)

where

$$\begin{aligned} r=8 n{\widehat{E}}_0{\widetilde{{\rho }}}_0 ,\qquad r'=2{\widehat{E}}_0\varepsilon _0 \end{aligned}$$
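
The second inclusion in (83) amounts to adding radii. A quick check, using (80), the elementary inclusion \((A_a)_b\subset A_{a+b}\) for neighborhoods of a set \(A\), and \(n\ge 1\):

$$\begin{aligned} (D_*(0))_{2 {\widehat{E}}_0{\widetilde{{\rho }}}_0}\subset \big ((D_0(0))_{6n {\widehat{E}}_0{\widetilde{{\rho }}}_0}\big )_{2 {\widehat{E}}_0{\widetilde{{\rho }}}_0}\subset (D_0(0))_{(6n+2) {\widehat{E}}_0{\widetilde{{\rho }}}_0}\subset (D_0(0))_{8n {\widehat{E}}_0{\widetilde{{\rho }}}_0}=(D_0(0))_r . \end{aligned}$$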

Finally, with similar arguments as in Step 1, by (84), the rescaled map

$$\begin{aligned} {\check{\Phi }}\,{:}{=}\,{\, \mathrm id \,}+{1}_{{\widehat{{\rho }}}_0^{-1},s_0^{-1}, \varepsilon _0^{-1}}\left( \Phi -{\, \mathrm id \,}\right) \circ {1}_{\widehat{\rho }_{0},s_{0}, \varepsilon _{0}} \end{aligned}$$

has Lipschitz constant

$$\begin{aligned} \mathcal{L}({\check{\Phi }}-{\, \mathrm id \,})\le 2^6(n+1) {\widehat{E}}_0\ .\end{aligned}$$
(84)

In particular, \({\check{\Phi }}\), hence \(\Phi \) and, finally, the map \((\vartheta ,{\pi }, {\kappa }; {\omega })\rightarrow \phi _{\omega }(\vartheta , {\pi }, {\kappa })\) are bi-Lipschitz, hence injective.

Step 3 For any \({\omega }_*\in \mathcal{D} _{\gamma _1, \gamma _2,{\tau }}\cap {\omega }_0(D, 0)\), \(\textrm{T}_{{\omega }_*}\) in (83) is an n-dimensional \(\mathrm H\)-invariant torus with frequency \({\omega }_*\). This assertion is a trivial generalization of the analogous one in Chierchia and Pinzari (2010, Proof of Proposition 3, Step 3); therefore, its proof is omitted.

Step 4 Measure Estimates (proof of (45)) The proof of (45) proceeds as in Chierchia and Pinzari (2010, Proof of Proposition 3, Step 4), just replacing the quantities that in Chierchia and Pinzari (2010, Proof of Proposition 3, Step 4) are called

$$\begin{aligned} D_0 ,\quad D_* ,\quad {\check{\iota }} ,\quad {\check{\Phi }} ,\quad \textrm{K} \end{aligned}$$

with the quantities here denoted as

$$\begin{aligned} D_0(0) ,\quad D_*(0) ,\quad {\check{\iota }}(0) ,\quad {\check{\Phi }}\big |_{({\pi }, {\kappa })=(0, 0)} ,\quad \textrm{K}_0 .\qquad \end{aligned}$$

\(\square \)

3.4 Normal Form Theory

Proposition 3.1 can be obtained from the more general Proposition 3.2, taking \(m=1\), \({\mathbb L}=\{0\}\) and changing coordinates as follows:

$$\begin{aligned} p=\frac{{p_1}-\textrm{i}{q_1}}{\sqrt{2}}\ ,\qquad q=\frac{{p_1}+\textrm{i}{q_1}}{\sqrt{2} \textrm{i}}\ .\end{aligned}$$

We define \(c_m\) to be the smallest number such that, for any two functions \(f\), \(g\), real-analytic in \(W_{r, s, \varepsilon }\), and any choice of \({\hat{r}}<r\), \({\hat{s}}<s\), \({\hat{\varepsilon }}<\varepsilon \),

$$\begin{aligned} \Vert \{f, g\}\Vert _{r-{\hat{r}}, s-{\hat{s}}, \varepsilon -{\hat{\varepsilon }}}\le \frac{c_m}{\delta }\Vert f\Vert _{r, s, \varepsilon }\Vert g\Vert _{r, s, \varepsilon }\quad \textrm{with}\ \delta \,{:}{=}\,\min \{{\hat{r}}{\hat{s}}, {\hat{\varepsilon }}^2\} . \end{aligned}$$

Proposition 3.2

Let \(\{0\}\subset {{\mathbb {L}}}\subset {{\mathbb {Z}}}\). Proposition 3.1 holds true taking

$$\begin{aligned} H(I,{\varphi },p,q)=h\left( I_1, I_2, J(p, q)\right) + f(I,{\varphi },p,q) ,\quad J(p, q)\,{:}{=}\,\left( \frac{p_1^2+q_1^2}{2} ,\ldots , \frac{p_m^2+q_m^2}{2}\right) \end{aligned}$$

replacing \(c_1\) with \(c_m\), \(\P _0\) with \(\P _{{\mathbb {L}}}\) and condition (49) with

$$\begin{aligned}{} & {} |{\omega }_1\cdot k_1+{\omega }_2\cdot k_2 |\ge \left\{ \begin{array}{lll} {\alpha }_1\quad &{}\textrm{if}\quad &{}k_1\ne 0\\ {\alpha }_2 &{}\textrm{if}&{} k_1=0,\ k_2\ne 0\\ \end{array} \right. \nonumber \\{} & {} \forall \ k=(k_1, k_2)\in {{\mathbb {Z}}}^{n_1}\times {\mathbb Z}^{n_2+m}\setminus {{\mathbb {L}}}\ne (0, 0) ,\ |k|_1\le K ,\quad \forall (I_1, I_2, p, q)\in V_r\times B^{2m}_{\varepsilon }\nonumber \\ \end{aligned}$$
(85)

where

$$\begin{aligned} \omega{} & {} =(\omega _1, \omega _2)\\{} & {} :=\left( \partial _{I_1} h\left( I_1, I_2, J(p, q)\right) ,\ \partial _{\left( I_2, J(p, q)\right) } h\left( I_1, I_2, J(p, q)\right) \right) . \end{aligned}$$

Lemma 3.3

Let \({\hat{r}}<r/2\), \({\hat{s}}<s/2\), \({\hat{\varepsilon }}<\varepsilon /2\) and \({\delta }\,{:}{=}\,\min \{{\hat{r}}{\hat{s}},\ {\hat{\varepsilon }}^2\}\). Let

$$\begin{aligned} H(u,\varphi ,p,q)= & {} \textrm{h}(I, p,q)+g(u, {\varphi }, p,q) +f(u, {\varphi },p,q)\qquad g(u, {\varphi }, p,q)\\= & {} \sum _{i=1}^mg_i(u, {\varphi }, p,q)\end{aligned}$$

be real-analytic on \(W_{v,s,\varepsilon }\). Assume that inequality (85) and

$$\begin{aligned} \Vert f\Vert _{v,s, \varepsilon }<\frac{\alpha _2 {\delta }}{c_m} \end{aligned}$$

are satisfied. Then, one can find a real-analytic and symplectic transformation

$$\begin{aligned} \Phi :\ W_{v-2{\hat{v}}, s-2{\hat{s}},\varepsilon -2{\hat{\varepsilon }}}\rightarrow W_{v, s,\varepsilon } \end{aligned}$$

defined by the time-one flowFootnote 23 \(X_\phi ^1f\,{:}{=}\,f\circ \Phi \) of a suitable \(\phi \) verifying

$$\begin{aligned} \Vert \phi \Vert _{v,s,\varepsilon }\le \frac{\Vert f\Vert _{v,s,\varepsilon }}{\alpha _2} \end{aligned}$$

such that

$$\begin{aligned} H_+\,{:}{=}\,H\circ \Phi =h+g+\P _{{{\mathbb {L}}}} T_Kf+f_+ \end{aligned}$$

and, moreover, the following bounds hold

$$\begin{aligned} \Vert f_+\Vert _{v-2{\hat{v}}, s-2{\hat{s}}, \varepsilon -2{\hat{\varepsilon }}}\le & {} \big (1-\frac{c_m}{\alpha _2 {\delta }}\Vert f\Vert _{v, s,\varepsilon }\big )^{-1}\Big [\frac{c_m}{\alpha _2 {\delta }}\Vert f\Vert _{v, s,\varepsilon }^2 \\{} & {} +\max \left\{ e^{-K{\hat{s}}/2} ,\ \big (\frac{\varepsilon -{\hat{\varepsilon }}}{\varepsilon }\big )^{K/2}\right\} \Vert f\Vert _{v, s,\varepsilon }+ \Vert \big \{\phi , g \big \}\Vert _{v-{\hat{v}}, s-{\hat{s}}, \varepsilon -{\hat{\varepsilon }}}\Big ] \end{aligned}$$

Finally, for any real-analytic function F on \(W _{v, s, \varepsilon }\),

$$\begin{aligned} \Vert F\circ \Phi -F\Vert _{v-2{\hat{v}}, s-2{\hat{s}}, \varepsilon -2{\hat{\varepsilon }}}\le & {} \frac{\Vert \{\phi , F\}\Vert _{v-{\hat{v}}, s-{\hat{s}}, \varepsilon -{\hat{\varepsilon }}}}{\displaystyle 1-\frac{c_m\Vert f\Vert _{v, s,\varepsilon }}{\alpha _2 {\delta }}}. \end{aligned}$$
(86)

Sketch of proof Lemma 3.3 is a straightforward generalization of Pöschel (1993, Iterative Lemma). To obtain such a generalization, just replace the norm defined in Pöschel (1993, Section 1) with the norm (48), where

$$\begin{aligned} f=\sum _{\begin{array}{c} (k,{\alpha },{\beta })\in \textbf{Z}^{n}\times {{\mathbb {N}}}^\ell \times {{\mathbb {N}}}^\ell \\ {{\alpha }_i\ne {\beta }_i\forall i} \end{array}}f_{k, {\alpha },{\beta }}(I)e^{ik\cdot \varphi } \left( \frac{p-\textrm{i}q}{\sqrt{2}}\right) ^{\alpha }\left( \frac{p+\textrm{i}q}{\textrm{i}\sqrt{2}}\right) ^{\beta }, \end{aligned}$$
(87)

and bound the “ultraviolet remainders”, namely the norm of the functions whose expansion (87) includes only terms with \(|(k, \alpha -\beta )|_1>K\), as follows. Observe that, if \(|(k, \alpha -\beta )|_1>K\), then either \(|k|_1>K/2\) or \(|\alpha -\beta |_1>K/2\). In the latter case, a fortiori, \(|\alpha |_1+|\beta |_1\ge |\alpha -\beta |_1>K/2\). Then we have, for such functions, \(\Vert f\Vert _{r, s-{\hat{s}}, \varepsilon -{\hat{\varepsilon }}}\le \max \left\{ e^{-K{\hat{s}}/2} ,\ \left( \frac{\varepsilon -{\hat{\varepsilon }}}{\varepsilon }\right) ^{K/2}\right\} \Vert f\Vert _{r, s, \varepsilon }\). Other details are omitted.
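
To see where the two factors come from, one may argue on a single monomial of (87). The sketch below assumes that the norm (48) is a weighted sum of the coefficients, with weight \(e^{|k|_1 s}\varepsilon ^{|\alpha |_1+|\beta |_1}\); this form is an assumption made here for illustration only. Under it, shrinking the widths affects the weights as follows:

$$\begin{aligned} e^{|k|_1(s-{\hat{s}})}&=e^{-|k|_1{\hat{s}}}\,e^{|k|_1 s}\le e^{-K{\hat{s}}/2}\,e^{|k|_1 s}&\textrm{if}\ |k|_1>K/2 ,\\ (\varepsilon -{\hat{\varepsilon }})^{|\alpha |_1+|\beta |_1}&=\Big (\frac{\varepsilon -{\hat{\varepsilon }}}{\varepsilon }\Big )^{|\alpha |_1+|\beta |_1}\varepsilon ^{|\alpha |_1+|\beta |_1}\le \Big (\frac{\varepsilon -{\hat{\varepsilon }}}{\varepsilon }\Big )^{K/2}\varepsilon ^{|\alpha |_1+|\beta |_1}&\textrm{if}\ |\alpha |_1+|\beta |_1>K/2 . \end{aligned}$$

In either case the weight of the monomial is reduced by at least the maximum of the two factors.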

Proof of Proposition 3.2

Let

$$\begin{aligned} r_1\,{:}{=}\,r_0-2 {\hat{r}}_0,\qquad s_1\,{:}{=}\,s_0-2{\hat{s}}_0,\quad \varepsilon _1\,{:}{=}\,\varepsilon _0-2{\hat{\varepsilon }}_0 . \end{aligned}$$

By Lemma 3.3, we find a canonical transformation \(\Phi _1=X_{\phi _1}\) which is real-analytic on \(W_{ r_1, s_1, \varepsilon _1}\) and conjugates \(H=H_0\) to \(H_1=H_0\circ \Phi _1=h+g_1+f_1\), where \(g_1=\P _{{{\mathbb {L}}}}T_K f_0\) and

$$\begin{aligned} \Vert f_1\Vert _{v_1,s_1,\varepsilon _1}\le & {} (1-\frac{c_mE_0}{\alpha _2 {\delta }_0})^{-1}\Big [\frac{c_mE_0}{\alpha _2 {\delta }_0}+\max \left\{ e^{-K{\hat{s}}_0/2} ,\ \big (\frac{\varepsilon _0-{\hat{\varepsilon }}_0}{\varepsilon _0}\big )^{K/2}\right\} \Big ]E_0\\\le & {} 2\Big [\frac{c_mE_0}{\alpha _2 {\delta }_0}+e^{-K{\hat{{\sigma }}}_0/2}\Big ]E_0 \end{aligned}$$

having used

$$\begin{aligned} \big (\frac{\varepsilon _0-{\hat{\varepsilon }}_0}{\varepsilon _0}\big )^{K/2}=e^{\frac{K}{2}\log \big (1-\frac{{\hat{\varepsilon }}_0}{\varepsilon _0}\big )}\le e^{-\frac{K}{2}\frac{{\hat{\varepsilon }}_0}{\varepsilon _0}} . \end{aligned}$$

We now focus on the case

$$\begin{aligned} \frac{c_mE_0}{\alpha _2 {\delta }_0}\ge e^{-K{\hat{{\sigma }}}_0/2} \end{aligned}$$

since otherwise the proposition isFootnote 24 proved. Then, we have

$$\begin{aligned} \Vert f_1\Vert _{v_1,s_1,\varepsilon _1}\le & {} 4\frac{c_mE^2_0}{\alpha _2 {\delta }_0}{=}{:}E_1 . \end{aligned}$$

Note that

$$\begin{aligned} E_1<\frac{E_0}{4} . \end{aligned}$$

Assume now that, for some \(j\ge 1\), we have \(H_j=H_{j-1}\circ \Phi _j=h+g_j+f_j\), where

$$\begin{aligned} g_j=\sum _{h=0}^{j-1}\P _{{\mathbb {L}}}T_K f_{h} ,\qquad \Vert f_j\Vert _{v_j,s_j,\varepsilon _j}\le & {} E_j\le \min \left\{ \frac{E_0}{4^j} ,\ 4\frac{c_mE^2_0}{\alpha _2 {\delta }_0}\right\} . \end{aligned}$$
(88)

We have just proved this is true when \(j=1\). Let \(L\,{:}{=}\,\left[ \frac{K{\hat{\sigma }}_0}{8\log 2}\right] \). We prove that (88) is true for \(j+1\), for all \(1\le j\le L\). Let

$$\begin{aligned}{\hat{r}}_j\,{:}{=}\,\frac{{\hat{r}}_0}{L} ,\quad {\hat{s}}_j\,{:}{=}\,\frac{{\hat{s}}_0}{L} ,\quad {\hat{\varepsilon }}_j\,{:}{=}\,\frac{{\hat{\varepsilon }}_0}{L}\quad \textrm{hence}\quad \delta _j=\frac{\delta _0}{L^2}\quad \forall \ 1\le j\le L .\end{aligned}$$

Note that, for all \(1\le j\le L\), we have \({\hat{r}}_j<\frac{r_j}{2}\):

$$\begin{aligned} r_j=r_1-2(j-1)\frac{{\hat{r}}_0}{L}\ge r_1-2(1-1/L){\hat{r}}_0=r_0-4{\hat{r}}_0+2{\hat{r}}_j>2{\hat{r}}_j . \end{aligned}$$

Similarly, \({\hat{s}}_j<\frac{s_j}{2}\), \({\hat{\varepsilon }}_j<\frac{\varepsilon _j}{2}\). Let then

$$\begin{aligned} r_{j+1}=r_j-2\frac{{\hat{r}}_0}{L} ,\quad s_{j+1}=s_j-2\frac{{\hat{s}}_0}{L} ,\quad \varepsilon _{j+1}=\varepsilon _j-2\frac{{\hat{\varepsilon }}_0}{L} \end{aligned}$$

so that \(r_j=r_1-2(j-1)\frac{{\hat{r}}_0}{L}\), etc., for all \(1\le j\le L\). Then

$$\begin{aligned} c_m\frac{E_j}{\alpha _2 \delta _j}\le 4\frac{c^2_mE^2_0}{\alpha ^2_2 {\delta }^2_0}L^2<\frac{1}{16}\end{aligned}$$
(89)

and Lemma 3.3 applies again, and \(H_j\) can be conjugated to \(H_{j+1}=H_j\circ \Phi _{j+1}=h+g_{j+1}+f_{j+1}\), with

$$\begin{aligned} g_{j+1}= & {} g_j+\P _{{{\mathbb {L}}}} T_Kf_j=\sum _{h=0}^{j}\P _{{\mathbb {L}}}T_K f_{h}\\ \Vert f_{j+1}\Vert _{r_{j+1}, s_{j+1}, \varepsilon _{j+1}}\le & {} \big (1-\frac{c_m}{\alpha _2 {\delta }_j}E_j\big )^{-1}\Big [\frac{c_m}{\alpha _2 {\delta }_j}E_j^2 \\{} & {} +\max \left\{ e^{-K{\hat{s}}_j/2} ,\ \big (\frac{\varepsilon _j-{\hat{\varepsilon _j}}}{\varepsilon _j}\big )^{K/2}\right\} E_j\\{} & {} + \Vert \big \{\phi _j, g_j \big \}\Vert _{r_{j}-{\hat{r}}_j, s_j-{\hat{s}}_j, \varepsilon _j-{\hat{\varepsilon _j}}}\Big ] \end{aligned}$$

To bound the right hand side of the latter expression, we use (89) and observe that

$$\begin{aligned}{} & {} e^{-K{\widehat{s}}_j/2}=e^{-\frac{K}{2L}{\widehat{s}}_0}\le \frac{1}{16}\\{} & {} \left( \frac{\varepsilon _j-{\hat{\varepsilon _j}}}{\varepsilon _j}\right) ^{K/2}=\left( 1-\frac{\frac{{\widehat{\varepsilon }}_0}{L}}{\varepsilon _1-2(j-1)\frac{{\widehat{\varepsilon }}_0}{L}}\right) ^{K/2}\le \left( 1-\frac{{{\widehat{\varepsilon }}_0}}{\varepsilon _1L}\right) ^{K/2}\le e^{-\frac{K{{\widehat{\varepsilon }}_0}}{2\varepsilon _1L}}\le \frac{1}{16}\end{aligned}$$

having used \(e^{-\frac{K\widehat{s}_0}{2L}}\le e^{-\frac{K{\hat{\sigma }}_0}{2L}}\), \(e^{-\frac{K{{\widehat{\varepsilon }}_0}}{2\varepsilon _1L}}\le e^{-\frac{K{{\widehat{\varepsilon }}_0}}{2\varepsilon _0L}}\le e^{-\frac{K{\hat{\sigma }}_0}{2L}}\) and \(L\le \frac{K{\hat{\sigma }}_0}{8\log 2}\). Moreover, writing

$$\begin{aligned} g_j=\P _{{\mathbb {L}}}T_K f_{0}+{\mathbb 1}_{j\ge 2}\sum _{h=1}^{j-1}\P _{{\mathbb {L}}}T_K f_{h}{=}{:}f_{0}^{{{\mathbb {L}}}, K}+f_{j-1}^{{{\mathbb {L}}}, K} \end{aligned}$$

with \(f_{0}^{{{\mathbb {L}}}, K}\) real-analytic on \(W_{r_0, s_0, \varepsilon _0}\) and \(f_{j-1}^{{{\mathbb {L}}}, K}\) real-analytic on \(W_{r_{j-1}, s_{j-1}, \varepsilon _{j-1}}\), verifying

$$\begin{aligned} \Vert f_{0}^{{{\mathbb {L}}}, K}\Vert _{r_0, s_0, \varepsilon _0}\le E_0 ,\quad \Vert f_{j-1}^{{{\mathbb {L}}}, K}\Vert _{r_{j-1}, s_{j-1}, \varepsilon _{j-1}}\le \sum _{h=1}^{j-1}\frac{E_1}{4^{h-1}}\le \frac{4}{3}E_1 \end{aligned}$$

we get

$$\begin{aligned} \Vert \big \{\phi _j, g_j \big \}\Vert _{r_{j}-{\hat{r}}_j, s_j-{\hat{s}}_j, \varepsilon _j-{\hat{\varepsilon _j}}}\le & {} \frac{c_mL}{\alpha _2\delta _0}E_0{E_j}+\frac{4}{3}\frac{c_mL^2}{\alpha _2\delta _0}E_1{E_j}\\\le & {} \left( \frac{c_mL}{\alpha _2\delta _0}E_0+\frac{16}{3}\frac{c^2_mL^2}{\alpha ^2_2\delta ^2_0}E_0^2\right) {E_j}\\\le & {} \left( \frac{1}{32}+\frac{1}{32}\right) {E_j}=\frac{E_j}{16} \end{aligned}$$
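
Two of the numerical steps used above can be made explicit. The first inequality in (89) follows from \(E_j\le 4c_mE_0^2/(\alpha _2\delta _0)\), which is part of (88), together with \(\delta _j=\delta _0/L^2\); the bound \(1/16\) for the exponentials follows from \(L\le K{\hat{\sigma }}_0/(8\log 2)\):

$$\begin{aligned} c_m\frac{E_j}{\alpha _2\delta _j}&=\frac{c_mL^2}{\alpha _2\delta _0}E_j\le \frac{c_mL^2}{\alpha _2\delta _0}\cdot 4\frac{c_mE_0^2}{\alpha _2\delta _0}=4\frac{c_m^2E_0^2}{\alpha _2^2\delta _0^2}L^2 ,\\ e^{-\frac{K{\hat{\sigma }}_0}{2L}}&\le e^{-4\log 2}=\frac{1}{16} . \end{aligned}$$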

Collecting all such bounds we get

$$\begin{aligned} E_{j+1}\le \frac{16}{15}\frac{3}{16}E_j<\frac{E_j}{4} . \end{aligned}$$
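
More explicitly, each of the three terms in the square bracket is at most \(E_j/16\) (by (89), by the exponential estimates and by the Poisson bracket estimate, respectively), while the prefactor is at most \((1-1/16)^{-1}\) by (89), so that

$$\begin{aligned} E_{j+1}\le \Big (1-\frac{1}{16}\Big )^{-1}\Big (\frac{1}{16}+\frac{1}{16}+\frac{1}{16}\Big )E_j=\frac{16}{15}\cdot \frac{3}{16}E_j=\frac{E_j}{5} . \end{aligned}$$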

The inductive claim \(j\rightarrow j+1\) is thus proved, for all \(1\le j\le L\). Letting now \(\Phi _*\,{:}{=}\,\Phi _1\circ \cdots \circ \Phi _{L+1}\) and

$$\begin{aligned}{} & {} H_*\,{:}{=}\,H_{L+1}=h+g_{L+1}+f_{L+1}{=}{:}h+g_{*}+f_{*} \\{} & {} r_*\,{:}{=}\,r_{L+1}=r-4{\hat{r}} ,\ s_*\,{:}{=}\,s_{L+1}=s-4{\hat{s}} ,\ \varepsilon _*\,{:}{=}\,\varepsilon _{L+1}=\varepsilon -4{\hat{\varepsilon }} \end{aligned}$$

and using \(L+1> \frac{K{\widehat{{\sigma }}}_0}{8\log 2}\), we get

$$\begin{aligned}{} & {} \Vert f_*\Vert _{r_*, s_*, \varepsilon _*}\le \frac{E_0}{4^{L+1}}=e^{-2(L+1)\log 2}E_0< e^{-\frac{K{\widehat{{\sigma }}}_0}{4}}E_0\\{} & {} \Vert g_*-\P _{{\mathbb {L}}}T_K f_{0}\Vert _{r_*, s_*, \varepsilon _*}\le \frac{4}{3}E_1<8\frac{c_mE^2_0}{\alpha _2 {\delta }_0} \end{aligned}$$

as claimed. The bounds (52) are obtained from (86), by the usual telescopic arguments. \(\square \)
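
As a check on the final constants, the following sketch uses only the definitions of \(r_j\), \(L\) and \(E_1\) given in the proof (the analogous identities hold for \(s_{L+1}\) and \(\varepsilon _{L+1}\)):

$$\begin{aligned} r_{L+1}&=r_1-2L\frac{{\hat{r}}_0}{L}=r_0-4{\hat{r}}_0 ,\\ 2(L+1)\log 2&>2\log 2\cdot \frac{K{\hat{\sigma }}_0}{8\log 2}=\frac{K{\hat{\sigma }}_0}{4} ,\\ \frac{4}{3}E_1&=\frac{16}{3}\frac{c_mE_0^2}{\alpha _2\delta _0}<8\frac{c_mE_0^2}{\alpha _2\delta _0} . \end{aligned}$$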