Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

HTML conversions sometimes display errors due to content that did not convert correctly from the source. This paper uses the following packages that are not yet supported by the HTML conversion tool. Feedback on these issues are not necessary; they are known and are being worked on.

  • failed: simpler-wick

Authors: achieve the best HTML results from your LaTeX submissions by following these best practices.

License: CC BY 4.0
arXiv:2401.07383v1 [physics.chem-ph] 14 Jan 2024

Recent mathematical advances in coupled cluster theory

Fabian M. Faulstich111Department of Mathematical Sciences, Rensselaer Polytechnic Institute, NY, 12180, USA
Abstract

This article presents an in-depth educational overview of the latest mathematical developments in coupled cluster (CC) theory, beginning with Schneider’s seminal work from 2009 that introduced the first local analysis of CC theory. We offer a tutorial review of second quantization and the CC ansatz, laying the groundwork for understanding the mathematical basis of the theory. This is followed by a detailed exploration of the most recent mathematical advancements in CC theory. Our review starts with an in-depth look at the local analysis pioneered by Schneider which has since been applied to analyze various CC methods. We then move on to discuss the graph-based framework for CC methods developed by Csirik and Laestadius. This framework provides a comprehensive platform for comparing different CC methods, including multireference approaches. Next, we delve into the latest numerical analysis results analyzing the single reference CC method developed by Hassan, Maday, and Wang. This very general approach is based on the invertibility of the CC function’s Fréchet derivative. We conclude the article with a discussion on the recent incorporation of algebraic geometry into CC theory, highlighting how this novel and fundamentally different mathematical perspective has furthered our understanding and provides exciting pathways to new computational approaches.

1 Introduction

Coupled-cluster (CC) theory is a widely acclaimed, high-precision wave function approach that is used in computational quantum many-body theory and is of great interest to both practitioners as well as theoreticians [5]. The origin of CC theory dates back to 1958 when Coester proposed to use an exponential parametrization of the wave function [15]. This parametrization was independently derived by Hubbard [36] and Hugenholtz [38] in 1957 as an alternative to summing many-body perturbation theory contributions order by order. A milestone of CC theory is the work by Čížek from 1966 [14]. In this work, Čížek discussed the foundational concepts of second quantization (as applied to many-fermion systems), normal ordering, contractions, Wick’s theorem, normal-ordered Hamiltonians (which was a novelty at that time), Goldstone-style diagrammatic techniques, and the origin of the exponential wave function ansatz. He moreover derived the connected cluster form of the Schrödinger equation and proposed a general recipe for how to obtain the energy and amplitude equations through projections of the connected cluster form of the Schrödinger equation on the reference and excited determinants, which was illustrated using the CC doubles (CCD) approximation. This work also reported the very first CC computations, using full and linearized forms of CCD, for nitrogen (treated fully at the ab initio level) and benzene (treated with a PPP model Hamiltonian).

In this article, we review the most recent mathematical advances from a computational chemistry perspective. Our objective is to elucidate various mathematical frameworks, their objectives, and outcomes in a manner that is accessible to a wide computational chemistry audience. In doing so, we aim to make the complex mathematical concepts accessible to a broader audience, providing a clear and comprehensible pathway for readers who may not have an extensive background in the advanced mathematics typically necessary to fully engage with the original research articles. With this effort, our goal is to render these mathematical results not only understandable but also directly applicable and relevant for practitioners and researchers in the field of computational chemistry.

While this article centers on the mathematical developments in CC theory post-2009, following the landmark work by Schneider, we recognize that there were significant contributions and advancements in the field before this date. However, our focus remains on the period after 2009, showcasing the progress made in recent years. Providing a full account of the rich history of CC theory and the mathematical advances therein is beyond the scope of this article, the interested reader is referred to articles and the references therein that provide insight into the history and development of CC theory including those by Bartlett [4], Paldus [52], Arponen [1], and Bishop [9].

The following article is outlined as follows: In Sec. 2 we provide a brief review of the mathematical matrix structures that arise in the second quantized framework. In Sec. 3 we then introduce the CC ansatz using an algebraic formulation. In Sec. 4 we then review the mathematical results established by employing a local strong monotonicity approach (Sec. 4.1), the excitation graph approach (Sec. 4.2) and the inf-sup condition approach (Sec. 4.3). In Sec. 5 we then elaborate on the root structure of the CC equations and review the advances made along this line by employing an algebraic geometry perspective.

2 Brief review of second quantization

In this section, we review the second quantization framework with a slight mathematical twist. Our aim is to resolve any ambiguities surrounding concepts that have been a potential subject of debate within either the mathematical or chemical community. Considering an N𝑁Nitalic_N electron system, we denote the set \mathcal{B}caligraphic_B with ||=NBNsubscript𝑁𝐵much-greater-than𝑁|\mathcal{B}|=N_{B}\gg N| caligraphic_B | = italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT ≫ italic_N the set of molecular orbitals, comprising of L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-orthonormal functions, i.e.,

ξi,ξjL2(X)=Xξi*(x)ξj(x)𝑑λ(x)1i,jNB,formulae-sequencesubscriptsubscript𝜉𝑖subscript𝜉𝑗superscript𝐿2𝑋subscript𝑋superscriptsubscript𝜉𝑖𝑥subscript𝜉𝑗𝑥differential-d𝜆𝑥formulae-sequencefor-all1𝑖𝑗subscript𝑁𝐵\langle\xi_{i},\xi_{j}\rangle_{L^{2}(X)}=\int_{X}\xi_{i}^{*}(x)\xi_{j}(x)d% \lambda(x)\qquad\forall~{}1\leq i,j\leq N_{B},⟨ italic_ξ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT , italic_ξ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_X ) end_POSTSUBSCRIPT = ∫ start_POSTSUBSCRIPT italic_X end_POSTSUBSCRIPT italic_ξ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ( italic_x ) italic_ξ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_x ) italic_d italic_λ ( italic_x ) ∀ 1 ≤ italic_i , italic_j ≤ italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT , (1)

where X=3×{±1/2}𝑋superscript3plus-or-minus12X=\mathbb{R}^{3}\times\{\pm 1/2\}italic_X = blackboard_R start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT × { ± 1 / 2 } and dλ(x)𝑑𝜆𝑥d\lambda(x)italic_d italic_λ ( italic_x ) denotes the corresponding integration measure [28]. Mathematically, the integral measure dλ(x)𝑑𝜆𝑥d\lambda(x)italic_d italic_λ ( italic_x ) is a product measure introduced to combine spatial and spin integration, it can also be written as

Xf(𝐱)𝑑λ(𝐱)=s{±1/2}3f(𝐫,s)𝑑𝐫,subscript𝑋𝑓𝐱differential-d𝜆𝐱subscript𝑠plus-or-minus12subscriptsuperscript3𝑓𝐫𝑠differential-d𝐫\int_{X}f(\mathbf{x})d\lambda(\mathbf{x})=\sum_{s\in\{\pm 1/2\}}\int_{\mathbb{% R}^{3}}f(\mathbf{r},s)d{\bf r},∫ start_POSTSUBSCRIPT italic_X end_POSTSUBSCRIPT italic_f ( bold_x ) italic_d italic_λ ( bold_x ) = ∑ start_POSTSUBSCRIPT italic_s ∈ { ± 1 / 2 } end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT blackboard_R start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT italic_f ( bold_r , italic_s ) italic_d bold_r , (2)

where 𝐫3𝐫superscript3\mathbf{r}\in\mathbb{R}^{3}bold_r ∈ blackboard_R start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT denotes the spatial degree of freedom and s{±1/2}𝑠plus-or-minus12s\in\{\pm 1/2\}italic_s ∈ { ± 1 / 2 } is the spin degree of freedom. We moreover assume that the functions in \mathcal{B}caligraphic_B are sufficiently smooth allowing us to take all required derivatives. Note that in computations that use Gaussian-type orbitals, this is always the case. Mathematically, the largest space (i.e., the most general space) from which we can choose \mathcal{B}caligraphic_B is the Sobolev space H1(X)superscript𝐻1𝑋H^{1}(X)italic_H start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_X ) [47, 2], however, for sake of simplicity, one can assume twice continuously differentiable and L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-integrable functions. In any case, we conclude that the molecular orbitals span a finite-dimensional Hilbert space hH1(X)superscript𝐻1𝑋h\subset H^{1}(X)italic_h ⊂ italic_H start_POSTSUPERSCRIPT 1 end_POSTSUPERSCRIPT ( italic_X ) which we shall denote the single-particle space.

Next, we define multi-particle functions that are used to span the fermionic Fock space. Due to the anti-symmetry constraints of the wave function, we need to take the anti-symmetrized product also called the exterior product: Let ξ1,,ξMsubscript𝜉1subscript𝜉𝑀\xi_{1},...,\xi_{M}\in\mathcal{B}italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_ξ start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ∈ caligraphic_B, we define the M𝑀Mitalic_M-folded exterior product of ξ1,,ξMsubscript𝜉1subscript𝜉𝑀\xi_{1},...,\xi_{M}italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_ξ start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT (pointwise) by

ξ1ξM(𝐱1,,𝐱M)=πSMsign(π)i=1Mξπ(i)(𝐱i),subscript𝜉1subscript𝜉𝑀subscript𝐱1subscript𝐱𝑀subscript𝜋subscript𝑆𝑀sign𝜋superscriptsubscriptproduct𝑖1𝑀subscript𝜉𝜋𝑖subscript𝐱𝑖\xi_{1}\wedge...\wedge\xi_{M}(\mathbf{x}_{1},...,\mathbf{x}_{M})=\sum_{\pi\in S% _{M}}{\rm sign}(\pi)\prod_{i=1}^{M}\xi_{\pi(i)}(\mathbf{x}_{i}),italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∧ … ∧ italic_ξ start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , bold_x start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ) = ∑ start_POSTSUBSCRIPT italic_π ∈ italic_S start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_sign ( italic_π ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_M end_POSTSUPERSCRIPT italic_ξ start_POSTSUBSCRIPT italic_π ( italic_i ) end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) , (3)

where SMsubscript𝑆𝑀S_{M}italic_S start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT is the symmetric group describing all possible permutations of the set {1,,M}1𝑀\{1,...,M\}{ 1 , … , italic_M } and sign(π)sign𝜋{\rm sign}(\pi)roman_sign ( italic_π ) is the parity of the permutation π𝜋\piitalic_π.

Example 1.

Let ξ1,ξ2subscript𝜉1subscript𝜉2\xi_{1},\xi_{2}\in\mathcal{B}italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_ξ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∈ caligraphic_B be two molecular orbitals. The exterior product of ξ1subscript𝜉1\xi_{1}italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and ξ2subscript𝜉2\xi_{2}italic_ξ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT is pointwise given by

ξ1ξ2(𝐱1,𝐱2)=ξ1(𝐱1)ξ2(𝐱2)ξ1(𝐱2)ξ2(𝐱1).subscript𝜉1subscript𝜉2subscript𝐱1subscript𝐱2subscript𝜉1subscript𝐱1subscript𝜉2subscript𝐱2subscript𝜉1subscript𝐱2subscript𝜉2subscript𝐱1\xi_{1}\wedge\xi_{2}(\mathbf{x}_{1},\mathbf{x}_{2})=\xi_{1}(\mathbf{x}_{1})\xi% _{2}(\mathbf{x}_{2})-\xi_{1}(\mathbf{x}_{2})\xi_{2}(\mathbf{x}_{1}).italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∧ italic_ξ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_ξ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) - italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) italic_ξ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) . (4)

Given the set \mathcal{B}caligraphic_B, one can form (NBM)binomialsubscript𝑁𝐵𝑀{N_{B}\choose M}( binomial start_ARG italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_ARG start_ARG italic_M end_ARG ) linearly independent exterior products, which define the set 𝔅(M)superscript𝔅𝑀\mathfrak{B}^{(M)}fraktur_B start_POSTSUPERSCRIPT ( italic_M ) end_POSTSUPERSCRIPT. The space (M)superscript𝑀\mathcal{H}^{(M)}caligraphic_H start_POSTSUPERSCRIPT ( italic_M ) end_POSTSUPERSCRIPT, spanned by these functions is the M𝑀Mitalic_M-folded exterior power of hhitalic_h, and it inherits an inner product from the single particle space hhitalic_h: Let ΨI=ξi1ξiMsubscriptΨ𝐼subscript𝜉subscript𝑖1subscript𝜉subscript𝑖𝑀\Psi_{I}=\xi_{i_{1}}\wedge...\wedge\xi_{i_{M}}roman_Ψ start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT = italic_ξ start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∧ … ∧ italic_ξ start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT end_POSTSUBSCRIPT and ΨJ=ξj1ξjMsubscriptΨ𝐽subscript𝜉subscript𝑗1subscript𝜉subscript𝑗𝑀\Psi_{J}=\xi_{j_{1}}\wedge...\wedge\xi_{j_{M}}roman_Ψ start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT = italic_ξ start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∧ … ∧ italic_ξ start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT end_POSTSUBSCRIPT then

ΨI,ΨJ=πSIσSJp=1Mξπ(ip),ξσ(jp)L2(X),subscriptΨ𝐼subscriptΨ𝐽subscript𝜋subscript𝑆𝐼𝜎subscript𝑆𝐽superscriptsubscriptproduct𝑝1𝑀subscriptsubscript𝜉𝜋subscript𝑖𝑝subscript𝜉𝜎subscript𝑗𝑝superscript𝐿2𝑋\langle\Psi_{I},\Psi_{J}\rangle=\sum_{\begin{subarray}{c}\pi\in S_{I}\\ \sigma\in S_{J}\end{subarray}}\prod_{p=1}^{M}\langle\xi_{\pi(i_{p})},\xi_{% \sigma(j_{p})}\rangle_{L^{2}(X)},⟨ roman_Ψ start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT , roman_Ψ start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT ⟩ = ∑ start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_π ∈ italic_S start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL italic_σ ∈ italic_S start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT end_CELL end_ROW end_ARG end_POSTSUBSCRIPT ∏ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_M end_POSTSUPERSCRIPT ⟨ italic_ξ start_POSTSUBSCRIPT italic_π ( italic_i start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) end_POSTSUBSCRIPT , italic_ξ start_POSTSUBSCRIPT italic_σ ( italic_j start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ( italic_X ) end_POSTSUBSCRIPT , (5)

where SIsubscript𝑆𝐼S_{I}italic_S start_POSTSUBSCRIPT italic_I end_POSTSUBSCRIPT and SJsubscript𝑆𝐽S_{J}italic_S start_POSTSUBSCRIPT italic_J end_POSTSUBSCRIPT are the permutations of {i1,,iM}subscript𝑖1subscript𝑖𝑀\{i_{1},...,i_{M}\}{ italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT } and {j1,,jM}subscript𝑗1subscript𝑗𝑀\{j_{1},...,j_{M}\}{ italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_j start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT }, respectively. Normalizing the (NBM)binomialsubscript𝑁𝐵𝑀{N_{B}\choose M}( binomial start_ARG italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_ARG start_ARG italic_M end_ARG ) exterior products obtained from \mathcal{B}caligraphic_B by 1/M!1𝑀1/\sqrt{M!}1 / square-root start_ARG italic_M ! end_ARG yields the well-known definition of M𝑀Mitalic_M-particle Slater determinants, i.e.,

Ψ[i1,,iM](𝐱1,,𝐱M)Ψsubscript𝑖1subscript𝑖𝑀subscript𝐱1subscript𝐱𝑀\displaystyle\Psi[i_{1},...,i_{M}](\mathbf{x}_{1},...,\mathbf{x}_{M})roman_Ψ [ italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ] ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , bold_x start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ) =1M!σSMsign(π)i=1Mξπ(i)(𝐱i)absent1𝑀subscript𝜎subscript𝑆𝑀sign𝜋superscriptsubscriptproduct𝑖1𝑀subscript𝜉𝜋𝑖subscript𝐱𝑖\displaystyle=\frac{1}{\sqrt{M!}}\sum_{\sigma\in S_{M}}{\rm sign}(\pi)\prod_{i% =1}^{M}\xi_{\pi(i)}(\mathbf{x}_{i})= divide start_ARG 1 end_ARG start_ARG square-root start_ARG italic_M ! end_ARG end_ARG ∑ start_POSTSUBSCRIPT italic_σ ∈ italic_S start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_sign ( italic_π ) ∏ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_M end_POSTSUPERSCRIPT italic_ξ start_POSTSUBSCRIPT italic_π ( italic_i ) end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) (6)
=1M!det([ξ1(𝐱1)ξ1(𝐱M)ξM(𝐱1)ξM(𝐱M)]).absent1𝑀detmatrixsubscript𝜉1subscript𝐱1subscript𝜉1subscript𝐱𝑀subscript𝜉𝑀subscript𝐱1subscript𝜉𝑀subscript𝐱𝑀\displaystyle=\frac{1}{\sqrt{M!}}{\rm det}\left(\begin{bmatrix}\xi_{1}(\mathbf% {x}_{1})&\cdots&\xi_{1}(\mathbf{x}_{M})\\ \vdots&\ddots&\vdots\\ \xi_{M}(\mathbf{x}_{1})&\cdots&\xi_{M}(\mathbf{x}_{M})\end{bmatrix}\right).= divide start_ARG 1 end_ARG start_ARG square-root start_ARG italic_M ! end_ARG end_ARG roman_det ( [ start_ARG start_ROW start_CELL italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) end_CELL start_CELL ⋯ end_CELL start_CELL italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ) end_CELL end_ROW start_ROW start_CELL ⋮ end_CELL start_CELL ⋱ end_CELL start_CELL ⋮ end_CELL end_ROW start_ROW start_CELL italic_ξ start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) end_CELL start_CELL ⋯ end_CELL start_CELL italic_ξ start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ( bold_x start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT ) end_CELL end_ROW end_ARG ] ) .

To avoid linear dependence in the set of M𝑀Mitalic_M-particle Slater determinants, we assume i1<,<iMi_{1}<...,<i_{M}italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT < … , < italic_i start_POSTSUBSCRIPT italic_M end_POSTSUBSCRIPT which yields (NBM)binomialsubscript𝑁𝐵𝑀{N_{B}\choose M}( binomial start_ARG italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_ARG start_ARG italic_M end_ARG ) possible exterior products formed from \mathcal{B}caligraphic_B. The direct sum of the M𝑀Mitalic_M-particle spaces for M=0,,NB𝑀0subscript𝑁𝐵M=0,...,N_{B}italic_M = 0 , … , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT yields the fermionic Fock space \mathcal{F}caligraphic_F:

=M=0NB(M),superscriptsubscriptdirect-sum𝑀0subscript𝑁𝐵superscript𝑀\mathcal{F}=\bigoplus_{M=0}^{N_{B}}\mathcal{H}^{(M)},caligraphic_F = ⨁ start_POSTSUBSCRIPT italic_M = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT caligraphic_H start_POSTSUPERSCRIPT ( italic_M ) end_POSTSUPERSCRIPT , (7)

which is known as the Grassmann algebra on hhitalic_h in the mathematics community. For brevity, we will employ the Dirac notation writing the basis elements in \mathcal{F}caligraphic_F as

|s1,,sNB=1M!ξ1s1ξ2s2ξNBsNBketsubscript𝑠1subscript𝑠subscript𝑁𝐵1𝑀superscriptsubscript𝜉1subscript𝑠1superscriptsubscript𝜉2subscript𝑠2superscriptsubscript𝜉subscript𝑁𝐵subscript𝑠subscript𝑁𝐵|s_{1},...,s_{N_{B}}\rangle=\frac{1}{\sqrt{M!}}\xi_{1}^{s_{1}}\wedge\xi_{2}^{s% _{2}}\wedge...\wedge\xi_{N_{B}}^{s_{N_{B}}}| italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ = divide start_ARG 1 end_ARG start_ARG square-root start_ARG italic_M ! end_ARG end_ARG italic_ξ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ∧ italic_ξ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ∧ … ∧ italic_ξ start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_POSTSUPERSCRIPT (8)

where M=isi𝑀subscript𝑖subscript𝑠𝑖M=\sum_{i}s_{i}italic_M = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT and si{0,1}subscript𝑠𝑖01s_{i}\in\{0,1\}italic_s start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ∈ { 0 , 1 } for all i=1,,NB𝑖1subscript𝑁𝐵i=1,...,N_{B}italic_i = 1 , … , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT. A general element in \mathcal{F}caligraphic_F, is then given as

|Ψ=s1,,sNB{0,1}Ψ(s1,,sNB)|s1,,sNBketΨsubscriptsubscript𝑠1subscript𝑠subscript𝑁𝐵01Ψsubscript𝑠1subscript𝑠subscript𝑁𝐵ketsubscript𝑠1subscript𝑠subscript𝑁𝐵|\Psi\rangle=\sum_{s_{1},...,s_{N_{B}}\in\{0,1\}}\Psi(s_{1},...,s_{N_{B}})|s_{% 1},...,s_{N_{B}}\rangle| roman_Ψ ⟩ = ∑ start_POSTSUBSCRIPT italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∈ { 0 , 1 } end_POSTSUBSCRIPT roman_Ψ ( italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ (9)

where Ψ(s1,,sNB)Ψsubscript𝑠1subscript𝑠subscript𝑁𝐵\Psi(s_{1},...,s_{N_{B}})\in\mathbb{C}roman_Ψ ( italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) ∈ blackboard_C. We now define the fermionic creation and annihilation operators, i.e.,

ap:;|s1,,sNB:superscriptsubscript𝑎𝑝ketsubscript𝑠1subscript𝑠subscript𝑁𝐵\displaystyle a_{p}^{\dagger}:\mathcal{F}\to\mathcal{F}~{};~{}|s_{1},...,s_{N_% {B}}\rangleitalic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT : caligraphic_F → caligraphic_F ; | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ (1)σ(p)(1sp)|s1,sp1,1sp,sp+1,,sNBmaps-toabsentsuperscript1𝜎𝑝1subscript𝑠𝑝ketsubscript𝑠1subscript𝑠𝑝11subscript𝑠𝑝subscript𝑠𝑝1subscript𝑠subscript𝑁𝐵\displaystyle\mapsto(-1)^{\sigma(p)}(1-s_{p})|s_{1},...s_{p-1},1-s_{p},s_{p+1}% ,...,s_{N_{B}}\rangle↦ ( - 1 ) start_POSTSUPERSCRIPT italic_σ ( italic_p ) end_POSTSUPERSCRIPT ( 1 - italic_s start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … italic_s start_POSTSUBSCRIPT italic_p - 1 end_POSTSUBSCRIPT , 1 - italic_s start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , italic_s start_POSTSUBSCRIPT italic_p + 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ (10)
ap:;|s1,,sNB:subscript𝑎𝑝ketsubscript𝑠1subscript𝑠subscript𝑁𝐵\displaystyle a_{p}:\mathcal{F}\to\mathcal{F}~{};~{}|s_{1},...,s_{N_{B}}\rangleitalic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT : caligraphic_F → caligraphic_F ; | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ (1)σ(p)sp|s1,sp1,1sp,sp+1,,sNBmaps-toabsentsuperscript1𝜎𝑝subscript𝑠𝑝ketsubscript𝑠1subscript𝑠𝑝11subscript𝑠𝑝subscript𝑠𝑝1subscript𝑠subscript𝑁𝐵\displaystyle\mapsto(-1)^{\sigma(p)}s_{p}|s_{1},...s_{p-1},1-s_{p},s_{p+1},...% ,s_{N_{B}}\rangle↦ ( - 1 ) start_POSTSUPERSCRIPT italic_σ ( italic_p ) end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … italic_s start_POSTSUBSCRIPT italic_p - 1 end_POSTSUBSCRIPT , 1 - italic_s start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , italic_s start_POSTSUBSCRIPT italic_p + 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩

where σ(p)=q=1p1sq𝜎𝑝superscriptsubscript𝑞1𝑝1subscript𝑠𝑞\sigma(p)=\sum_{q=1}^{p-1}s_{q}italic_σ ( italic_p ) = ∑ start_POSTSUBSCRIPT italic_q = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_p - 1 end_POSTSUPERSCRIPT italic_s start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT. We note that dim()=2NBdimsuperscript2subscript𝑁𝐵{\rm dim}(\mathcal{F})=2^{N_{B}}roman_dim ( caligraphic_F ) = 2 start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT, we therefore identify elements of the fermionic Fock space \mathcal{F}caligraphic_F uniquely with elements in 2NBsuperscriptsuperscript2subscript𝑁𝐵\mathbb{C}^{2^{N_{B}}}blackboard_C start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT. Mathematically, we write 2NBsimilar-to-or-equalssuperscriptsuperscript2subscript𝑁𝐵\mathcal{F}\simeq\mathbb{C}^{2^{N_{B}}}caligraphic_F ≃ blackboard_C start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT which means that the spaces \mathcal{F}caligraphic_F and 2NBsuperscriptsuperscript2subscript𝑁𝐵\mathbb{C}^{2^{N_{B}}}blackboard_C start_POSTSUPERSCRIPT 2 start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT end_POSTSUPERSCRIPT are essentially the same in their structure. We moreover introduce the convention

(10)unoccupiedand(01)occupied.formulae-sequencebinomial10unoccupiedandbinomial01occupied{1\choose 0}\equiv{\rm unoccupied}\quad{\rm and}\quad{0\choose 1}\equiv{\rm occupied.}( binomial start_ARG 1 end_ARG start_ARG 0 end_ARG ) ≡ roman_unoccupied roman_and ( binomial start_ARG 0 end_ARG start_ARG 1 end_ARG ) ≡ roman_occupied .

Note that this is an arbitrary choice, but it is the commonly employed convention. This allows us to express the basis elements as

|s1,,sNB=(1s1s1)(1sNBsNB).ketsubscript𝑠1subscript𝑠subscript𝑁𝐵tensor-productbinomial1subscript𝑠1subscript𝑠1binomial1subscript𝑠subscript𝑁𝐵subscript𝑠subscript𝑁𝐵|s_{1},...,s_{N_{B}}\rangle={1-s_{1}\choose s_{1}}\otimes...\otimes{1-s_{N_{B}% }\choose s_{N_{B}}}.| italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ = ( binomial start_ARG 1 - italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG start_ARG italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_ARG ) ⊗ … ⊗ ( binomial start_ARG 1 - italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG start_ARG italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT end_ARG ) . (11)
Example 2.

Let NB=2subscript𝑁𝐵2N_{B}=2italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT = 2, then

|01=(10)(01)=(0001)ket01tensor-productbinomial10binomial01matrix0001|01\rangle={1\choose 0}\otimes{0\choose 1}=\begin{pmatrix}0\\ 0\\ 0\\ 1\end{pmatrix}| 01 ⟩ = ( binomial start_ARG 1 end_ARG start_ARG 0 end_ARG ) ⊗ ( binomial start_ARG 0 end_ARG start_ARG 1 end_ARG ) = ( start_ARG start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL end_ROW start_ROW start_CELL 1 end_CELL end_ROW end_ARG ) (12)

In this formulation, the fermionic creation and annihilation operators in Eq. (10) are matrices of the form

ap=σzσzp1timesaIINBp1timesandap=σzσzp1timesaIINBp1timesformulae-sequencesuperscriptsubscript𝑎𝑝tensor-productsubscripttensor-productsubscript𝜎𝑧subscript𝜎𝑧𝑝1timessuperscript𝑎subscripttensor-product𝐼𝐼subscript𝑁𝐵𝑝1timesandsubscript𝑎𝑝tensor-productsubscripttensor-productsubscript𝜎𝑧subscript𝜎𝑧𝑝1times𝑎subscripttensor-product𝐼𝐼subscript𝑁𝐵𝑝1times\displaystyle a_{p}^{\dagger}=\underbrace{\sigma_{z}\otimes...\otimes\sigma_{z% }}_{p-1~{}{\rm times}}\otimes\;a^{\dagger}\otimes\underbrace{I\otimes...% \otimes I}_{N_{B}-p-1~{}{\rm times}}\quad{\rm and}\quad a_{p}=\underbrace{% \sigma_{z}\otimes...\otimes\sigma_{z}}_{p-1~{}{\rm times}}\otimes\;a\otimes% \underbrace{I\otimes...\otimes I}_{N_{B}-p-1~{}{\rm times}}italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT = under⏟ start_ARG italic_σ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT ⊗ … ⊗ italic_σ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT end_ARG start_POSTSUBSCRIPT italic_p - 1 roman_times end_POSTSUBSCRIPT ⊗ italic_a start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ⊗ under⏟ start_ARG italic_I ⊗ … ⊗ italic_I end_ARG start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT - italic_p - 1 roman_times end_POSTSUBSCRIPT roman_and italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = under⏟ start_ARG italic_σ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT ⊗ … ⊗ italic_σ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT end_ARG start_POSTSUBSCRIPT italic_p - 1 roman_times end_POSTSUBSCRIPT ⊗ italic_a ⊗ under⏟ start_ARG italic_I ⊗ … ⊗ italic_I end_ARG start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT - italic_p - 1 roman_times end_POSTSUBSCRIPT (13)

where

I=(1001),σz=(1001),a=(0100).formulae-sequence𝐼matrix1001formulae-sequencesubscript𝜎𝑧matrix1001𝑎matrix0100I=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\quad\sigma_{z}=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix},\quad a=\begin{pmatrix}0&1\\ 0&0\end{pmatrix}.italic_I = ( start_ARG start_ROW start_CELL 1 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 1 end_CELL end_ROW end_ARG ) , italic_σ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT = ( start_ARG start_ROW start_CELL 1 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL - 1 end_CELL end_ROW end_ARG ) , italic_a = ( start_ARG start_ROW start_CELL 0 end_CELL start_CELL 1 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW end_ARG ) . (14)
Example 3.

Let NB=3subscript𝑁𝐵3N_{B}=3italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT = 3, then

a2=σzaI=(0010000000010000000000000000000000000010000000010000000000000000)subscript𝑎2tensor-productsubscript𝜎𝑧𝑎𝐼matrix0010000000010000000000000000000000000010000000010000000000000000a_{2}=\sigma_{z}\otimes a\otimes I=\footnotesize\begin{pmatrix}0&0&1&0&0&0&0&0% \\ 0&0&0&1&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&-1&0\\ 0&0&0&0&0&0&0&-1\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ \end{pmatrix}italic_a start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = italic_σ start_POSTSUBSCRIPT italic_z end_POSTSUBSCRIPT ⊗ italic_a ⊗ italic_I = ( start_ARG start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 1 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 1 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL - 1 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL - 1 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW start_ROW start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL start_CELL 0 end_CELL end_ROW end_ARG ) (15)

These matrices are sparse and have several properties [35]. We first note that the definition given in Eq. (13) implies directly that the creation and annihilation operators are nilpotent, i.e., (ap)2=(ap)2=0superscriptsuperscriptsubscript𝑎𝑝2superscriptsubscript𝑎𝑝20(a_{p}^{\dagger})^{2}=(a_{p})^{2}=0( italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ( italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = 0. Moreover, the fermionic creation and annihilation operators obey the canonical anti-communication relation (CAR):

[ap,aq]+=[ap,aq]+=0and[ap,aq]+=δp,q.formulae-sequencesubscriptsubscript𝑎𝑝subscript𝑎𝑞subscriptsuperscriptsubscript𝑎𝑝superscriptsubscript𝑎𝑞0andsubscriptsubscript𝑎𝑝superscriptsubscript𝑎𝑞subscript𝛿𝑝𝑞[a_{p},a_{q}]_{+}=[a_{p}^{\dagger},a_{q}^{\dagger}]_{+}=0\quad\text{and}\quad[% a_{p},a_{q}^{\dagger}]_{+}=\delta_{p,q}.[ italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ] start_POSTSUBSCRIPT + end_POSTSUBSCRIPT = [ italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT , italic_a start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ] start_POSTSUBSCRIPT + end_POSTSUBSCRIPT = 0 and [ italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT , italic_a start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ] start_POSTSUBSCRIPT + end_POSTSUBSCRIPT = italic_δ start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT . (16)

Lastly, we define the number operator np=apapsubscript𝑛𝑝superscriptsubscript𝑎𝑝subscript𝑎𝑝n_{p}=a_{p}^{\dagger}a_{p}italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT = italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT satisfying

np|s1,,sNB=sp|s1,,sNBsubscript𝑛𝑝ketsubscript𝑠1subscript𝑠subscript𝑁𝐵subscript𝑠𝑝ketsubscript𝑠1subscript𝑠subscript𝑁𝐵n_{p}|s_{1},...,s_{N_{B}}\rangle=s_{p}|s_{1},...,s_{N_{B}}\rangleitalic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ = italic_s start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT | italic_s start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_s start_POSTSUBSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ⟩ (17)

and the total number operator N=p=1NBnp𝑁superscriptsubscript𝑝1subscript𝑁𝐵subscript𝑛𝑝N=\sum_{p=1}^{N_{B}}n_{p}italic_N = ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT. In this formulation, the matrix describing an interacting electronic system in a potential generated by clamped nuclei, i.e., the Hamiltonian, takes the form

H=p,qhp,qapaq+14p,q,r,svp,q,r,sapaqasar,𝐻subscript𝑝𝑞subscript𝑝𝑞superscriptsubscript𝑎𝑝subscript𝑎𝑞14subscript𝑝𝑞𝑟𝑠subscript𝑣𝑝𝑞𝑟𝑠superscriptsubscript𝑎𝑝superscriptsubscript𝑎𝑞subscript𝑎𝑠subscript𝑎𝑟H=\sum_{p,q}h_{p,q}a_{p}^{\dagger}a_{q}+\frac{1}{4}\sum_{p,q,r,s}v_{p,q,r,s}a_% {p}^{\dagger}a_{q}^{\dagger}a_{s}a_{r},italic_H = ∑ start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT italic_h start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT + divide start_ARG 1 end_ARG start_ARG 4 end_ARG ∑ start_POSTSUBSCRIPT italic_p , italic_q , italic_r , italic_s end_POSTSUBSCRIPT italic_v start_POSTSUBSCRIPT italic_p , italic_q , italic_r , italic_s end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT , (18)

where hNB×NBsuperscriptsubscript𝑁𝐵subscript𝑁𝐵h\in\mathbb{C}^{N_{B}\times N_{B}}italic_h ∈ blackboard_C start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT × italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT and vNB×NB×NB×NB𝑣superscriptsubscript𝑁𝐵subscript𝑁𝐵subscript𝑁𝐵subscript𝑁𝐵v\in\mathbb{C}^{N_{B}\times N_{B}\times N_{B}\times N_{B}}italic_v ∈ blackboard_C start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT × italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT × italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT × italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUPERSCRIPT are system dependent integral tensors. They are defined via

hp,q=Xξp*(x)(Δ2jZj|r1Rj|)ξq(x)𝑑λ(x)subscript𝑝𝑞subscript𝑋superscriptsubscript𝜉𝑝𝑥Δ2subscript𝑗subscript𝑍𝑗subscript𝑟1subscript𝑅𝑗subscript𝜉𝑞𝑥differential-d𝜆𝑥h_{p,q}=\int_{X}\xi_{p}^{*}(x)\left(-\frac{\Delta}{2}-\sum_{j}\frac{Z_{j}}{|r_% {1}-R_{j}|}\right)\xi_{q}(x)d\lambda(x)italic_h start_POSTSUBSCRIPT italic_p , italic_q end_POSTSUBSCRIPT = ∫ start_POSTSUBSCRIPT italic_X end_POSTSUBSCRIPT italic_ξ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ( italic_x ) ( - divide start_ARG roman_Δ end_ARG start_ARG 2 end_ARG - ∑ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT divide start_ARG italic_Z start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG | italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - italic_R start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT | end_ARG ) italic_ξ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ( italic_x ) italic_d italic_λ ( italic_x ) (19)

and

vp,q,r,s=X×Xξp*(x1)ξq(x1)ξr*(x2)ξs(x2)|r1r2|𝑑λ(x)𝑑λ(x).subscript𝑣𝑝𝑞𝑟𝑠subscript𝑋𝑋superscriptsubscript𝜉𝑝subscript𝑥1subscript𝜉𝑞subscript𝑥1superscriptsubscript𝜉𝑟subscript𝑥2subscript𝜉𝑠subscript𝑥2subscript𝑟1subscript𝑟2differential-d𝜆𝑥differential-d𝜆𝑥v_{p,q,r,s}=\int_{X\times X}\frac{\xi_{p}^{*}(x_{1})\xi_{q}(x_{1})\xi_{r}^{*}(% x_{2})\xi_{s}(x_{2})}{|r_{1}-r_{2}|}d\lambda(x)d\lambda(x).italic_v start_POSTSUBSCRIPT italic_p , italic_q , italic_r , italic_s end_POSTSUBSCRIPT = ∫ start_POSTSUBSCRIPT italic_X × italic_X end_POSTSUBSCRIPT divide start_ARG italic_ξ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_ξ start_POSTSUBSCRIPT italic_q end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) italic_ξ start_POSTSUBSCRIPT italic_r end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT ( italic_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) italic_ξ start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT ( italic_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG start_ARG | italic_r start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - italic_r start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT | end_ARG italic_d italic_λ ( italic_x ) italic_d italic_λ ( italic_x ) . (20)

Note that hhitalic_h is hermitian and v𝑣vitalic_v fulfills the symmetry relations

vp,q,r,s=vr,s,p,q=vq,p,s,r*=vs,r,q,p*subscript𝑣𝑝𝑞𝑟𝑠subscript𝑣𝑟𝑠𝑝𝑞superscriptsubscript𝑣𝑞𝑝𝑠𝑟superscriptsubscript𝑣𝑠𝑟𝑞𝑝v_{p,q,r,s}=v_{r,s,p,q}=v_{q,p,s,r}^{*}=v_{s,r,q,p}^{*}italic_v start_POSTSUBSCRIPT italic_p , italic_q , italic_r , italic_s end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_r , italic_s , italic_p , italic_q end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_q , italic_p , italic_s , italic_r end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT = italic_v start_POSTSUBSCRIPT italic_s , italic_r , italic_q , italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT * end_POSTSUPERSCRIPT (21)

or in the case of real-valued atomic spin orbitals

vp,q,r,s=vr,s,p,q=vq,p,s,r=vs,r,q,p=vq,p,r,s=vs,r,p,q=vp,q,s,r=vr,s,q,p.subscript𝑣𝑝𝑞𝑟𝑠subscript𝑣𝑟𝑠𝑝𝑞subscript𝑣𝑞𝑝𝑠𝑟subscript𝑣𝑠𝑟𝑞𝑝subscript𝑣𝑞𝑝𝑟𝑠subscript𝑣𝑠𝑟𝑝𝑞subscript𝑣𝑝𝑞𝑠𝑟subscript𝑣𝑟𝑠𝑞𝑝v_{p,q,r,s}=v_{r,s,p,q}=v_{q,p,s,r}=v_{s,r,q,p}=v_{q,p,r,s}=v_{s,r,p,q}=v_{p,q% ,s,r}=v_{r,s,q,p}.italic_v start_POSTSUBSCRIPT italic_p , italic_q , italic_r , italic_s end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_r , italic_s , italic_p , italic_q end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_q , italic_p , italic_s , italic_r end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_s , italic_r , italic_q , italic_p end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_q , italic_p , italic_r , italic_s end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_s , italic_r , italic_p , italic_q end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_p , italic_q , italic_s , italic_r end_POSTSUBSCRIPT = italic_v start_POSTSUBSCRIPT italic_r , italic_s , italic_q , italic_p end_POSTSUBSCRIPT . (22)

The goal is now to compute the lowest lying eigenstate of the matrix H𝐻Hitalic_H in the N𝑁Nitalic_N-particle subspace (N)superscript𝑁\mathcal{H}^{(N)}caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT which is the N𝑁Nitalic_N-particle ground state energy of the electronic Schrödinger equation, i.e.,

E0=min|ΨΨ|Ψ=1Ψ|HμNΨ,subscript𝐸0subscriptketΨinner-productΨΨ1conditionalΨ𝐻𝜇𝑁ΨE_{0}=\min_{\begin{subarray}{c}|\Psi\rangle\in\mathcal{F}\\ \langle\Psi|\Psi\rangle=1\end{subarray}}\langle\Psi|H-\mu N|\Psi\rangle,italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = roman_min start_POSTSUBSCRIPT start_ARG start_ROW start_CELL | roman_Ψ ⟩ ∈ caligraphic_F end_CELL end_ROW start_ROW start_CELL ⟨ roman_Ψ | roman_Ψ ⟩ = 1 end_CELL end_ROW end_ARG end_POSTSUBSCRIPT ⟨ roman_Ψ | italic_H - italic_μ italic_N | roman_Ψ ⟩ , (23)

where μ𝜇\muitalic_μ is a Lagrange multiplier ensuring that the solution |ΨketΨ|\Psi\rangle| roman_Ψ ⟩ lies in the N𝑁Nitalic_N-particle Hilbert space.

3 The CC ansatz

Coupled cluster theory is built upon an exponential ansatz of the wave function, as opposed to the linear ansatz of Eq. (9). We emphasize that the simple approach of projecting the Hamitlonian onto much smaller, manageable linear subspaces of \mathcal{F}caligraphic_F proves inadequate for electronic structure problems. This is exemplified in the case study of lithium hydride, where we analyze the lowest eigenstate for different relative positions of 𝐑1subscript𝐑1\mathbf{R}_{1}bold_R start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and 𝐑2subscript𝐑2\mathbf{R}_{2}bold_R start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, denoted R=𝐑1𝐑2𝑅normsubscript𝐑1subscript𝐑2R=\|\mathbf{R}_{1}-\mathbf{R}_{2}\|italic_R = ∥ bold_R start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT - bold_R start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∥ (see Fig. 1). Through this examination, it becomes clear that essential energies, such as the chemical bonding energy, are not accurately captured when using a limited linear subspace, see EbondCISDsuperscriptsubscript𝐸bondCISDE_{\rm bond}^{\rm CISD}italic_E start_POSTSUBSCRIPT roman_bond end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_CISD end_POSTSUPERSCRIPT in Fig. 1. Conversely, CC theory provides a far more accurate representation of this critical energy, see EbondCCSDsuperscriptsubscript𝐸bondCCSDE_{\rm bond}^{\rm CCSD}italic_E start_POSTSUBSCRIPT roman_bond end_POSTSUBSCRIPT start_POSTSUPERSCRIPT roman_CCSD end_POSTSUPERSCRIPT in Fig. 1.

Refer to caption
Figure 1: Case study of lithium hydride comparing the linear parametrization (blue) and the exponential parametrization (red) for different values of R𝑅Ritalic_R in the AUG-cc-pVTZ basis set [35].

In order to derive the exponential ansatz in a mathematically sound way, we need to introduce a few concepts first, starting with excitation matrices.

3.1 Excitation and cluster matrices

Since we started the characterization of the fermionic Fock space with the molecular orbitals, the Hartree-Fock state is given by

|Ψ0=|1,,1,0,0=(01)(01)(10)(10)N,ketsubscriptΨ0ket1100tensor-productbinomial01binomial01binomial10binomial10superscript𝑁|\Psi_{0}\rangle=|1,...,1,0,...0\rangle={0\choose 1}\otimes\cdots\otimes{0% \choose 1}\otimes{1\choose 0}\otimes\cdots\otimes{1\choose 0}\in\mathcal{H}^{N},| roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = | 1 , … , 1 , 0 , … 0 ⟩ = ( binomial start_ARG 0 end_ARG start_ARG 1 end_ARG ) ⊗ ⋯ ⊗ ( binomial start_ARG 0 end_ARG start_ARG 1 end_ARG ) ⊗ ( binomial start_ARG 1 end_ARG start_ARG 0 end_ARG ) ⊗ ⋯ ⊗ ( binomial start_ARG 1 end_ARG start_ARG 0 end_ARG ) ∈ caligraphic_H start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT ,

where the first N𝑁Nitalic_N entries are set to one, and the remaining entries are zero. We refer to this vector as the reference determinant. We moreover define vocc=[[N]]={1,,N}subscript𝑣occdelimited-[]delimited-[]𝑁1𝑁v_{\rm occ}=[\![N]\!]=\{1,...,N\}italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT = [ [ italic_N ] ] = { 1 , … , italic_N } and vvirt=[[NB]][[N]]={N+1,,NB}subscript𝑣virtdelimited-[]delimited-[]subscript𝑁𝐵delimited-[]delimited-[]𝑁𝑁1subscript𝑁𝐵v_{\rm virt}=[\![N_{B}]\!]\setminus[\![N]\!]=\{N+1,...,N_{B}\}italic_v start_POSTSUBSCRIPT roman_virt end_POSTSUBSCRIPT = [ [ italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT ] ] ∖ [ [ italic_N ] ] = { italic_N + 1 , … , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT }. Assume a1,,akvvirtsubscript𝑎1subscript𝑎𝑘subscript𝑣virta_{1},...,a_{k}\in v_{\rm virt}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∈ italic_v start_POSTSUBSCRIPT roman_virt end_POSTSUBSCRIPT, and i1,,ikvoccsubscript𝑖1subscript𝑖𝑘subscript𝑣occi_{1},...,i_{k}\in v_{\rm occ}italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ∈ italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT. Then,

X(a1,,aki1,,ik)=aakaa1ai1aiksubscript𝑋binomialsubscript𝑎1subscript𝑎𝑘subscript𝑖1subscript𝑖𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}}=a_{a_{k}}^{\dagger}...a_{a_{1}}^{% \dagger}a_{i_{1}}...a_{i_{k}}italic_X start_POSTSUBSCRIPT ( binomial start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ) end_POSTSUBSCRIPT = italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT

defines an excitation matrix, and the set of all excitation matrices is given by

𝔈((N))={Xμ|μ=(a1,,aki1,,ik),ajvvirt,ijvocc,kN}.𝔈superscript𝑁conditional-setsubscript𝑋𝜇formulae-sequence𝜇binomialsubscript𝑎1subscript𝑎𝑘subscript𝑖1subscript𝑖𝑘formulae-sequencesubscript𝑎𝑗subscript𝑣virtformulae-sequencesubscript𝑖𝑗subscript𝑣occ𝑘𝑁\mathfrak{E}(\mathcal{H}^{(N)})=\left\{X_{\mu}~{}\Big{|}~{}\mu={a_{1},...,a_{k% }\choose i_{1},...,i_{k}},\,a_{j}\in v_{\rm virt},\,i_{j}\in v_{\rm occ},\,k% \leq N\right\}.fraktur_E ( caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT ) = { italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_μ = ( binomial start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ) , italic_a start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ italic_v start_POSTSUBSCRIPT roman_virt end_POSTSUBSCRIPT , italic_i start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT , italic_k ≤ italic_N } .

Note that the above construction of the excitation matrices yields that excitation matrices are particle number preserving. The excitation indices μ𝜇\muitalic_μ that excite from the occupied into the virtual orbitals define the multi-index set

={μ|μ=(a1,,aki1,,ik),ajvvirt,ijvocc, 1kN}.conditional-set𝜇formulae-sequence𝜇binomialsubscript𝑎1subscript𝑎𝑘subscript𝑖1subscript𝑖𝑘formulae-sequencesubscript𝑎𝑗subscript𝑣virtformulae-sequencesubscript𝑖𝑗subscript𝑣occ1𝑘𝑁\mathcal{I}=\left\{\mu~{}\Bigg{|}~{}\mu={a_{1},...,a_{k}\choose i_{1},...,i_{k% }},\,a_{j}\in v_{\rm virt},\,i_{j}\in v_{\rm occ},\,1\leq k\leq N\right\}.caligraphic_I = { italic_μ | italic_μ = ( binomial start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ) , italic_a start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ italic_v start_POSTSUBSCRIPT roman_virt end_POSTSUBSCRIPT , italic_i start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∈ italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT , 1 ≤ italic_k ≤ italic_N } . (24)

Since this set of excitations corresponds to simply replacing indices in the string [1,,N]1𝑁[1,...,N][ 1 , … , italic_N ] with indices in the string [N+1,,NB]𝑁1subscript𝑁𝐵[N+1,...,N_{B}][ italic_N + 1 , … , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT ] (plus some additional permutation), we deduce that there is a one-to-one relation between excitation operators and Slater determinants except for the reference Slater determinant |Ψ0ketsubscriptΨ0|\Psi_{0}\rangle| roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩. In other words, the excitation operators map the reference Slater determinant |Ψ0ketsubscriptΨ0|\Psi_{0}\rangle| roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ to all other Slater determinants.

Theorem 4.

There exists a one-to-one relation between the N𝑁Nitalic_N-particle basis functions 𝔅(N)superscript𝔅𝑁\mathfrak{B}^{(N)}fraktur_B start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT and 𝔈((N)){I}𝔈superscript𝑁𝐼\mathfrak{E}(\mathcal{H}^{(N)})\cup\{I\}fraktur_E ( caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT ) ∪ { italic_I }.

Proof.

Since excitation matrices are defined w.r.t. the reference determinant |Ψ0ketsubscriptΨ0|\Psi_{0}\rangle| roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ it follows immediately that |Ψ0=I|Ψ0ketsubscriptΨ0𝐼ketsubscriptΨ0|\Psi_{0}\rangle=I|\Psi_{0}\rangle| roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = italic_I | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩. Consider |ΨP=ξp1ξpN(N)ketsubscriptΨ𝑃subscript𝜉subscript𝑝1subscript𝜉subscript𝑝𝑁superscript𝑁|\Psi_{P}\rangle=\xi_{p_{1}}\wedge...\wedge\xi_{p_{N}}\in\mathcal{H}^{(N)}| roman_Ψ start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT ⟩ = italic_ξ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∧ … ∧ italic_ξ start_POSTSUBSCRIPT italic_p start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT end_POSTSUBSCRIPT ∈ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT. Comparing {1,,N}1𝑁\{1,...,N\}{ 1 , … , italic_N } to {p1,,pN}subscript𝑝1subscript𝑝𝑁\{p_{1},...,p_{N}\}{ italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_p start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT } we can identify a multi-index μ𝜇\muitalic_μ describing the indices that have to be changed in {1,,N}1𝑁\{1,...,N\}{ 1 , … , italic_N } to obtain {p1,,pN}subscript𝑝1subscript𝑝𝑁\{p_{1},...,p_{N}\}{ italic_p start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_p start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT }. More precisely, μ𝜇\muitalic_μ describes an excitation from voccPsubscript𝑣occ𝑃v_{\rm occ}\setminus Pitalic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT ∖ italic_P to Pvvirt𝑃subscript𝑣virtP\cap v_{\rm virt}italic_P ∩ italic_v start_POSTSUBSCRIPT roman_virt end_POSTSUBSCRIPT. Due to the canonical ordering, this multi-index μ𝜇\muitalic_μ is unique. Then, by definition we obtain |ΨP=sign(μ)Xμ|Ψ0ketsubscriptΨ𝑃sign𝜇subscript𝑋𝜇ketsubscriptΨ0|\Psi_{P}\rangle={\rm sign}(\mu)X_{\mu}|\Psi_{0}\rangle| roman_Ψ start_POSTSUBSCRIPT italic_P end_POSTSUBSCRIPT ⟩ = roman_sign ( italic_μ ) italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩, which shows the claim. ∎

The above result is the fundamental result that allows us to express any target wave function |Ψ(N)ketΨsuperscript𝑁|\Psi\rangle\in\mathcal{H}^{(N)}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT through a wave operator applied to the reference determinant instead of an expansion through basis vectors, i.e.,

|Ψ=(c0I+μcμXμ)|Ψ0.ketΨsubscript𝑐0𝐼subscript𝜇subscript𝑐𝜇subscript𝑋𝜇ketsubscriptΨ0|\Psi\rangle=\left(c_{0}I+\sum_{\mu}c_{\mu}X_{\mu}\right)|\Psi_{0}\rangle.| roman_Ψ ⟩ = ( italic_c start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT italic_I + ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_c start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (25)

We will now focus on certain properties that ensure the later discussed exponential formulation is properly defined. The first property we consider is the commutativity of the excitation matrices.

Proposition 5.

Let Xμ,Xν𝔈((N))subscript𝑋𝜇subscript𝑋𝜈𝔈superscript𝑁X_{\mu},X_{\nu}\in\mathfrak{E}(\mathcal{H}^{(N)})italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ∈ fraktur_E ( caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT ). Then [Xμ,Xν]=0subscript𝑋𝜇subscript𝑋𝜈0[X_{\mu},X_{\nu}]=0[ italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ] = 0.

Proof.

Let

Xμ=X(a1,,aki1,,ik)=aakaa1ai1aikandXν=X(b1,,bj1,,j)=abab1aj1aj.formulae-sequencesubscript𝑋𝜇subscript𝑋binomialsubscript𝑎1subscript𝑎𝑘subscript𝑖1subscript𝑖𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘andsubscript𝑋𝜈subscript𝑋binomialsubscript𝑏1subscript𝑏subscript𝑗1subscript𝑗superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗X_{\mu}=X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}}=a_{a_{k}}^{\dagger}...a_{a_% {1}}^{\dagger}a_{i_{1}}...a_{i_{k}}\quad\text{and}\quad X_{\nu}=X_{b_{1},...,b% _{\ell}\choose j_{1},...,j_{\ell}}=a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger% }a_{j_{1}}...a_{j_{\ell}}.italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT = italic_X start_POSTSUBSCRIPT ( binomial start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ) end_POSTSUBSCRIPT = italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT and italic_X start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT = italic_X start_POSTSUBSCRIPT ( binomial start_ARG italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_ARG start_ARG italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_ARG ) end_POSTSUBSCRIPT = italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT .

The proof is conducted in two steps:
First, we seek to permute all creation operators in the commutator to the left using the CAR. We begin with the following product and note that when permuting absuperscriptsubscript𝑎subscript𝑏a_{b_{\ell}}^{\dagger}italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT to the right of aa1superscriptsubscript𝑎subscript𝑎1a_{a_{1}}^{\dagger}italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT we merely pick up a sign, since bvoccsubscript𝑏subscript𝑣occb_{\ell}\notin v_{\rm occ}italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT ∉ italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT, i.e.,

\wickaakaa1ai1aikabab1aj1aj=(1)kaakaa1abai1aikab1ab1aj1aj.\wicksuperscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscript1𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑏subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscriptsubscript𝑎subscript𝑏1superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗\wick{a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}\c{1}a_{i_{1}}...a_{i_{k}}\c{1}% a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}}=(-1)^{k}% a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}a_{i_{1}}...a_{% i_{k}}a_{b_{\ell-1}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}.italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT 1̧ italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1̧ italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT = ( - 1 ) start_POSTSUPERSCRIPT italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ - 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT .

This furthermore yields

aakaa1ai1aikabab1aj1aj=(1)kaakaa1abab1ai1aikaj1ajsuperscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscript1𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑗1subscript𝑎subscript𝑗a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{b_{\ell}}^{% \dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}=(-1)^{\ell\cdot k}a_{a_% {k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{% \dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT = ( - 1 ) start_POSTSUPERSCRIPT roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT

and similar

abab1aj1ajaakaa1ai1aik=(1)kabab1aa1aakaj1ajai1aik.superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscript1𝑘superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑎𝑘subscript𝑎subscript𝑗1subscript𝑎subscript𝑗subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{a_{k}}^% {\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}=(-1)^{\ell\cdot k}a_{b_{% \ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{1}}^{\dagger}...a_{a_{k}}^{\dagger% }a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}.italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT = ( - 1 ) start_POSTSUPERSCRIPT roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT .

Second, we wish to unify the index sequence of the creation and annihilation operators in the two summands of the commutator. Applying the CAR again, we find

\wickabab1aakaa1aj1ajai1aik=(1)aakabab1aak1aa1aj1ajai1aik,\wicksuperscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscript1superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1superscriptsubscript𝑎subscript𝑎𝑘1superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘\wick{\c{1}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}\c{1}a_{a_{k}}^{\dagger% }...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}}=(-1)^{% \ell}a_{a_{k}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k-1}% }^{\dagger}...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}},1̧ italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT 1̧ italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT = ( - 1 ) start_POSTSUPERSCRIPT roman_ℓ end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k - 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT ,

which yields

abab1aakaa1aj1ajai1aik=(1)2*kaakaa1abab1ai1aikaj1aj.superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscript12𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑗1subscript𝑎subscript𝑗a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k}}^{\dagger}...a_{a_{1}}^{% \dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}=(-1)^{2*\ell\cdot k}a_{a% _{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{% \dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}.italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT = ( - 1 ) start_POSTSUPERSCRIPT 2 * roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT .

Note that we have here assumed that μν=𝜇𝜈\mu\cap\nu=\emptysetitalic_μ ∩ italic_ν = ∅, otherwise the expression is trivially zero due to the nilpotency of the creation and annihilation operators. Overall this yields

=[X(a1,,aki1,,ik),X(b1,,bj1,,j)]absentsubscript𝑋binomialsubscript𝑎1subscript𝑎𝑘subscript𝑖1subscript𝑖𝑘subscript𝑋binomialsubscript𝑏1subscript𝑏subscript𝑗1subscript𝑗\displaystyle=[X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}},X_{b_{1},...,b_{\ell% }\choose j_{1},...,j_{\ell}}]= [ italic_X start_POSTSUBSCRIPT ( binomial start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ) end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT ( binomial start_ARG italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_ARG start_ARG italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_ARG ) end_POSTSUBSCRIPT ] (26)
=aakaa1ai1aikabab1aj1ajabab1aj1ajaakaa1ai1aiabsentsuperscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖\displaystyle=a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_% {b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}-a_{b_{\ell}% }^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{a_{k}}^{\dagger}..% .a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{\ell}}= italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT - italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT
=(1)kaakaa1abab1ai1aikaj1aj(1)kabab1aakaa1aj1ajai1aikabsentsuperscript1𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscript1𝑘superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑗1subscript𝑎subscript𝑗subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘\displaystyle=(-1)^{\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_% {\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{% \ell}}-(-1)^{\ell\cdot k}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k}}% ^{\dagger}...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}= ( - 1 ) start_POSTSUPERSCRIPT roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT - ( - 1 ) start_POSTSUPERSCRIPT roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT
=(1)kaakaa1abab1ai1aikaj1aj(1)3kaakaa1abab1ai1aikaj1ajabsentsuperscript1𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑗1subscript𝑎subscript𝑗superscript13𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1superscriptsubscript𝑎subscript𝑏superscriptsubscript𝑎subscript𝑏1subscript𝑎subscript𝑖1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑗1subscript𝑎subscript𝑗\displaystyle=(-1)^{\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_% {\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{% \ell}}-(-1)^{3\cdot\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{% \ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{% \ell}}= ( - 1 ) start_POSTSUPERSCRIPT roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT - ( - 1 ) start_POSTSUPERSCRIPT 3 ⋅ roman_ℓ ⋅ italic_k end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_j start_POSTSUBSCRIPT roman_ℓ end_POSTSUBSCRIPT end_POSTSUBSCRIPT
=0.absent0\displaystyle=0.= 0 .

Another important property is that the excitation matrices inherited the nilpotency from the fermionic creation and annihilation matrices.

Proposition 6.

Let Xμ𝔈((N))subscript𝑋𝜇𝔈superscript𝑁X_{\mu}\in\mathfrak{E}(\mathcal{H}^{(N)})italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ∈ fraktur_E ( caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT ). Then Xμ2=0superscriptsubscript𝑋𝜇20X_{\mu}^{2}=0italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = 0.

Proof.

Recall that (ap)2=(ap)2=0superscriptsuperscriptsubscript𝑎𝑝2superscriptsubscript𝑎𝑝20(a_{p}^{\dagger})^{2}=(a_{p})^{2}=0( italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = ( italic_a start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = 0 by construction (see Eq. (13)). Let

Xμ=X(a1,,aki1,,ik)=aakaa1aikai1.subscript𝑋𝜇subscript𝑋binomialsubscript𝑎1subscript𝑎𝑘subscript𝑖1subscript𝑖𝑘superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑖1X_{\mu}=X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}}=a_{a_{k}}^{\dagger}...a_{a_% {1}}^{\dagger}a_{i_{k}}...a_{i_{1}}.italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT = italic_X start_POSTSUBSCRIPT ( binomial start_ARG italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG start_ARG italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ) end_POSTSUBSCRIPT = italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT .

Then

Xμ2superscriptsubscript𝑋𝜇2\displaystyle X_{\mu}^{2}italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT =\wickaakaa1aikai1aakaa1aikai1absent\wicksuperscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑖1superscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑖1\displaystyle=\wick{\c{1}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{k}}...% a_{i_{1}}\c{1}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{k}}...a_{i_{1}}}= 1̧ italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT 1̧ italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT (27)
=aakaak=0aak1aa1aikai1aak1aa1aikai1absentsubscriptsuperscriptsubscript𝑎subscript𝑎𝑘superscriptsubscript𝑎subscript𝑎𝑘absent0superscriptsubscript𝑎subscript𝑎𝑘1superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑖1superscriptsubscript𝑎subscript𝑎𝑘1superscriptsubscript𝑎subscript𝑎1subscript𝑎subscript𝑖𝑘subscript𝑎subscript𝑖1\displaystyle=-\underbrace{a_{a_{k}}^{\dagger}a_{a_{k}}^{\dagger}}_{=0}a_{a_{k% }-1}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{k}}...a_{i_{1}}a_{a_{k}-1}^{\dagger}% ...a_{a_{1}}^{\dagger}a_{i_{k}}...a_{i_{1}}= - under⏟ start_ARG italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT end_ARG start_POSTSUBSCRIPT = 0 end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT - 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT … italic_a start_POSTSUBSCRIPT italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_POSTSUBSCRIPT … italic_a start_POSTSUBSCRIPT italic_i start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT
=0.absent0\displaystyle=0.= 0 .

We are now set to define the vector space of cluster matrices, a fundamental concept in coupled cluster theory, i.e., the \mathbb{C}blackboard_C-vector space

𝔟={T=μtμXμ|μ},𝔟conditional-set𝑇subscript𝜇subscript𝑡𝜇subscript𝑋𝜇𝜇\mathfrak{b}=\left\{T=\sum_{\mu}t_{\mu}X_{\mu}~{}\Big{|}~{}\mu\in\mathcal{I}% \right\},fraktur_b = { italic_T = ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | italic_μ ∈ caligraphic_I } , (28)

where \mathcal{I}caligraphic_I is as denied in Eq. (24). Note that 𝔟𝔟\mathfrak{b}fraktur_b is a linear space, and the excitation matrices are linearly independent by Theorem 4, hence, each element in 𝔟𝔟\mathfrak{b}fraktur_b is uniquely defined through its linear coefficients 𝐭=(tμ)𝐭subscript𝑡𝜇\mathbf{t}=(t_{\mu})bold_t = ( italic_t start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ). We therefore use the convention that 𝐭𝐭\mathbf{t}bold_t describes an amplitude vector whereas T𝑇Titalic_T describes the corresponding cluster matrix.
Utilizing the propositions discussed earlier, we will demonstrate that this vector space possesses a highly structured nature. Our next step is to introduce the concept of the exponential of cluster matrices, which forms a key mathematical bridge between cluster matrices and wave operators. This involves drawing a connection between the Lie algebra, as embodied by the cluster matrices, and the Lie group comprising wave operators, thereby establishing an essential theoretical link in our analysis. To begin this exploration, we first assert that 𝔟𝔟\mathfrak{b}fraktur_b constitutes some form of Lie algebra. As it turns out, this assertion holds true.

Theorem 7.

The space of cluster matrices 𝔟𝔟\mathfrak{b}fraktur_b equipped with the standard matrix commutator [,]normal-⋅normal-⋅[\cdot,\cdot][ ⋅ , ⋅ ] forms a nilpotent Abelian Lie algebra.

Proof.

To show that 𝔟𝔟\mathfrak{b}fraktur_b is a Lie algebra, we need to prove that it (i) is a linear space (which is true by construction), and (ii) that there exists an alternating bilinear map (in this case the standard matrix commutator [,][\cdot,\cdot][ ⋅ , ⋅ ]) that satisfies the Jacobi identity, i.e.,

[X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]]=0X,Y,Z𝔟.formulae-sequence𝑋𝑌𝑍𝑌𝑍𝑋𝑍𝑋𝑌0for-all𝑋𝑌𝑍𝔟[X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]]=0\qquad\forall~{}X,Y,Z\in\mathfrak{b}.[ italic_X , [ italic_Y , italic_Z ] ] + [ italic_Y , [ italic_Z , italic_X ] ] + [ italic_Z , [ italic_X , italic_Y ] ] = 0 ∀ italic_X , italic_Y , italic_Z ∈ fraktur_b .

In order to show (ii), we combine Proposition 5 and the bi-linearity of the matrix commutator. This yields that for two cluster matrices T1,T2𝔟subscript𝑇1subscript𝑇2𝔟T_{1},T_{2}\in\mathfrak{b}italic_T start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_T start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∈ fraktur_b

[T1,T2]=muνtμtν[Xμ,Xν]=muνtμtν[Xν,Xμ]=[T2,T1],subscript𝑇1subscript𝑇2subscript𝑚𝑢subscript𝜈subscript𝑡𝜇subscript𝑡𝜈subscript𝑋𝜇subscript𝑋𝜈subscript𝑚𝑢subscript𝜈subscript𝑡𝜇subscript𝑡𝜈subscript𝑋𝜈subscript𝑋𝜇subscript𝑇2subscript𝑇1[T_{1},T_{2}]=\sum_{m}u\sum_{\nu}t_{\mu}t_{\nu}[X_{\mu},X_{\nu}]=\sum_{m}u\sum% _{\nu}t_{\mu}t_{\nu}[X_{\nu},X_{\mu}]=[T_{2},T_{1}],[ italic_T start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_T start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ] = ∑ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT italic_u ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT [ italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ] = ∑ start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT italic_u ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT [ italic_X start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT , italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ] = [ italic_T start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_T start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ] ,

hence, cluster matrices commute. Therefore, the Jacobi identity is trivially fulfilled. This shows that 𝔟𝔟\mathfrak{b}fraktur_b equipped with the regular matrix commutator is an abelian Lie algebra, where the term “abelian” simply means that the elements in 𝔟𝔟\mathfrak{b}fraktur_b commute with each other.
Next, we shall show the nilpotency. To that end, we expand TN+1superscript𝑇𝑁1T^{N+1}italic_T start_POSTSUPERSCRIPT italic_N + 1 end_POSTSUPERSCRIPT which yields

TN+1=k1+k2++km=N+1k1,k2,,km0(N+1k1,k2,,km)j=1m(tμjXμj)kj,superscript𝑇𝑁1subscriptsubscript𝑘1subscript𝑘2subscript𝑘𝑚𝑁1subscript𝑘1subscript𝑘2subscript𝑘𝑚0binomial𝑁1subscript𝑘1subscript𝑘2subscript𝑘𝑚superscriptsubscriptproduct𝑗1𝑚superscriptsubscript𝑡subscript𝜇𝑗subscript𝑋subscript𝜇𝑗subscript𝑘𝑗T^{N+1}=\sum_{\begin{subarray}{c}k_{1}+k_{2}+\cdots+k_{m}=N+1\\ k_{1},k_{2},\cdots,k_{m}\geq 0\end{subarray}}{N+1\choose k_{1},k_{2},\ldots,k_% {m}}\prod_{j=1}^{m}(t_{\mu_{j}}X_{\mu_{j}})^{k_{j}},italic_T start_POSTSUPERSCRIPT italic_N + 1 end_POSTSUPERSCRIPT = ∑ start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + ⋯ + italic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = italic_N + 1 end_CELL end_ROW start_ROW start_CELL italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , ⋯ , italic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ≥ 0 end_CELL end_ROW end_ARG end_POSTSUBSCRIPT ( binomial start_ARG italic_N + 1 end_ARG start_ARG italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG ) ∏ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT , (29)

where

(N+1k1,k2,,km)=N+1!k1!k2!km!binomial𝑁1subscript𝑘1subscript𝑘2subscript𝑘𝑚𝑁1subscript𝑘1subscript𝑘2subscript𝑘𝑚{N+1\choose k_{1},k_{2},\ldots,k_{m}}={\frac{N+1!}{k_{1}!\,k_{2}!\cdots k_{m}!}}( binomial start_ARG italic_N + 1 end_ARG start_ARG italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_ARG ) = divide start_ARG italic_N + 1 ! end_ARG start_ARG italic_k start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ! italic_k start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ! ⋯ italic_k start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ! end_ARG

is a multinomial coefficient. Since |vocc|=Nsubscript𝑣occ𝑁|v_{\rm occ}|=N| italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT | = italic_N, there exists one ivocc𝑖subscript𝑣occi\in v_{\rm occ}italic_i ∈ italic_v start_POSTSUBSCRIPT roman_occ end_POSTSUBSCRIPT that appears at least twice in each matrix j=1m(tμjXμj)kjsuperscriptsubscriptproduct𝑗1𝑚superscriptsubscript𝑡subscript𝜇𝑗subscript𝑋subscript𝜇𝑗subscript𝑘𝑗\prod_{j=1}^{m}(t_{\mu_{j}}X_{\mu_{j}})^{k_{j}}∏ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT italic_k start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUPERSCRIPT. However, since ai2=0superscriptsubscript𝑎𝑖20a_{i}^{2}=0italic_a start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT = 0 this yields that TN+1=0superscript𝑇𝑁10T^{N+1}=0italic_T start_POSTSUPERSCRIPT italic_N + 1 end_POSTSUPERSCRIPT = 0, which shows the claim. ∎

The above Theorem ensures that the (Lie) exponential of 𝔟𝔟\mathfrak{b}fraktur_b is a Lie Group, i.e., a differentiable manifold. However, we seek that this is the Lie Group of wave operators that we used to define any intermediately normalized wave function in (N)superscript𝑁\mathcal{H}^{(N)}caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT, see Eq. (25). To that end, we begin by showing that every intermediately normalized wave function can be expressed through a linear wave operator. As mentioned earlier, the construction of excitation operators allows us to transfer the degrees of freedom from the basis functions in Eq. (9) to wave operators. Formally, this yields the definition of the (linear) wave operator map ΩΩ\Omegaroman_Ω as

Ω:𝔟𝒢;CI+C,:Ωformulae-sequence𝔟𝒢maps-to𝐶𝐼𝐶\Omega:\mathfrak{b}\to\mathcal{G}~{};~{}C\mapsto I+C,roman_Ω : fraktur_b → caligraphic_G ; italic_C ↦ italic_I + italic_C , (30)

where

𝒢={I+C|C𝔟}.𝒢conditional-set𝐼𝐶𝐶𝔟\mathcal{G}=\{I+C~{}|~{}C\in\mathfrak{b}\}.caligraphic_G = { italic_I + italic_C | italic_C ∈ fraktur_b } . (31)

Note that in this formulation, the wave operator map ΩΩ\Omegaroman_Ω takes a cluster matrix as input and yields a wave operator, i.e., Ω(C)Ω𝐶\Omega(C)roman_Ω ( italic_C ) maps the reference determinant to some wave function in (N)superscript𝑁\mathcal{H}^{(N)}caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT. By construction, 𝒢𝒢\mathcal{G}caligraphic_G is an affine linear space of matrices. We will now show the one-to-one correspondence between intermediately normalized functions

|Ψint={|Ψ(N)|Ψ|Ψ0=1}(N),ketΨsubscriptintketΨconditionalsuperscript𝑁inner-productΨsubscriptΨ01superscript𝑁|\Psi\rangle\in\mathcal{H}_{\rm int}=\left\{|\Psi\rangle\in\mathcal{H}^{(N)}~{% }|~{}\langle\Psi|\Psi_{0}\rangle=1\right\}\subset\mathcal{H}^{(N)},| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT = { | roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT | ⟨ roman_Ψ | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = 1 } ⊂ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT , (32)

and cluster matrices C𝔟𝐶𝔟C\in\mathfrak{b}italic_C ∈ fraktur_b. We begin with the linear parametrization of elements in |ΨintketΨsubscriptint|\Psi\rangle\in\mathcal{H}_{\rm int}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT.

Lemma 8.

Let |Ψintketnormal-Ψsubscriptnormal-int|\Psi\rangle\in\mathcal{H}_{\rm int}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT. There exists a unique element (I+C)𝒢𝐼𝐶𝒢(I+C)\in\mathcal{G}( italic_I + italic_C ) ∈ caligraphic_G, s.t.,

|Ψ=(I+C)|Ψ0.ketΨ𝐼𝐶ketsubscriptΨ0|\Psi\rangle=(I+C)|\Psi_{0}\rangle.| roman_Ψ ⟩ = ( italic_I + italic_C ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (33)
Proof.

We first observe that int(N)subscriptintsuperscript𝑁\mathcal{H}_{\rm int}\subset\mathcal{H}^{(N)}caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT ⊂ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT can be characterized by

int=|Ψ0+span({|Ψμ}μ)=(*)|Ψ0+span({Xμ}μ)|Ψ0=(I+𝔟)|Ψ0,subscriptintketsubscriptΨ0spansubscriptketsubscriptΨ𝜇𝜇ketsubscriptΨ0spansubscriptsubscript𝑋𝜇𝜇ketsubscriptΨ0𝐼𝔟ketsubscriptΨ0\mathcal{H}_{\rm int}=|\Psi_{0}\rangle+{\rm span}(\{|\Psi_{\mu}\rangle\}_{\mu}% )\underset{(*)}{=}|\Psi_{0}\rangle+{\rm span}(\{X_{\mu}\}_{\mu})|\Psi_{0}% \rangle=(I+\mathfrak{b})|\Psi_{0}\rangle,caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT = | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ + roman_span ( { | roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩ } start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) start_UNDERACCENT ( * ) end_UNDERACCENT start_ARG = end_ARG | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ + roman_span ( { italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT } start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = ( italic_I + fraktur_b ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ , (34)

where the equality (*)(*)( * ) is a consequence of Theorem 4. This shows that every element in |ΨintketΨsubscriptint|\Psi\rangle\in\mathcal{H}_{\rm int}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT can be expressed as

|Ψ=(I+C)|Ψ0.ketΨ𝐼𝐶ketsubscriptΨ0|\Psi\rangle=(I+C)|\Psi_{0}\rangle.| roman_Ψ ⟩ = ( italic_I + italic_C ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (35)

Next, assume there exist two cluster matrices C1,C2𝔟subscript𝐶1subscript𝐶2𝔟C_{1},C_{2}\in\mathfrak{b}italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∈ fraktur_b s.t.

(I+C1)|Ψ0=|Ψ=(I+C2)|Ψ0.𝐼subscript𝐶1ketsubscriptΨ0ketΨ𝐼subscript𝐶2ketsubscriptΨ0(I+C_{1})|\Psi_{0}\rangle=|\Psi\rangle=(I+C_{2})|\Psi_{0}\rangle.( italic_I + italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = | roman_Ψ ⟩ = ( italic_I + italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (36)

However, this yields

[c1]μ=Ψμ|(I+C1)|Ψ0=Ψμ|Ψ=Ψμ|(I+C2)|Ψ0=[c2]μμformulae-sequencesubscriptdelimited-[]subscript𝑐1𝜇quantum-operator-productsubscriptΨ𝜇𝐼subscript𝐶1subscriptΨ0inner-productsubscriptΨ𝜇Ψquantum-operator-productsubscriptΨ𝜇𝐼subscript𝐶2subscriptΨ0subscriptdelimited-[]subscript𝑐2𝜇for-all𝜇[c_{1}]_{\mu}=\langle\Psi_{\mu}|(I+C_{1})|\Psi_{0}\rangle=\langle\Psi_{\mu}|% \Psi\rangle=\langle\Psi_{\mu}|(I+C_{2})|\Psi_{0}\rangle=[c_{2}]_{\mu}\qquad% \forall\mu\in\mathcal{I}[ italic_c start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ] start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT = ⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | ( italic_I + italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = ⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | roman_Ψ ⟩ = ⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | ( italic_I + italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = [ italic_c start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ] start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ∀ italic_μ ∈ caligraphic_I (37)

implying that C1=C2subscript𝐶1subscript𝐶2C_{1}=C_{2}italic_C start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = italic_C start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, which shows the claim. ∎

Lemma 9.

The wave operator map Ωnormal-Ω\Omegaroman_Ω is bijective.

Proof.

First note that 𝒢𝒢\mathcal{G}caligraphic_G was defined by the range of ΩΩ\Omegaroman_Ω. Hence, the wave operator map is trivially subjective. Second, note that dim(𝔟)=dim(𝒢)dim𝔟dim𝒢{\rm dim}(\mathfrak{b})={\rm dim}(\mathcal{G})roman_dim ( fraktur_b ) = roman_dim ( caligraphic_G ), which yields that ΩΩ\Omegaroman_Ω is a bijection. ∎

Combining these two lemmata, yields the desired one-to-one correspondence between 𝔟𝔟\mathfrak{b}fraktur_b and elements in intsubscriptint\mathcal{H}_{\rm int}caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT.

Theorem 10.

Let |Ψintketnormal-Ψsubscriptnormal-int|\Psi\rangle\in\mathcal{H}_{\rm int}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT. There exists a unique element C𝔟𝐶𝔟C\in\mathfrak{b}italic_C ∈ fraktur_b, s.t.,

|Ψ=Ω(C)|Φ0.ketΨΩ𝐶ketsubscriptΦ0|\Psi\rangle=\Omega(C)|\Phi_{0}\rangle.| roman_Ψ ⟩ = roman_Ω ( italic_C ) | roman_Φ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (38)

Although we have restricted the above theorem to intermediately normalized wave functions (the reason will become apparent shortly), Theorem 10 is in fact the core of the (full) configuration interaction expansion [35].

We now proceed to the exponential parametrization. Note, since 𝔟𝔟\mathfrak{b}fraktur_b is nilpotent the exponential series exp(T)𝑇\exp(T)roman_exp ( start_ARG italic_T end_ARG ) for any element T𝔟𝑇𝔟T\in\mathfrak{b}italic_T ∈ fraktur_b is not a true exponential as it terminates after at most N𝑁Nitalic_N terms. Hence, it is a polynomial at most of the degree N𝑁Nitalic_N. We therefore do not need to investigate the convergence of the exponential series and can define the set

𝒢~={exp(T)=I+n=1N1n!Tn|T𝔟}.~𝒢conditional-set𝑇𝐼superscriptsubscript𝑛1𝑁1𝑛superscript𝑇𝑛𝑇𝔟\tilde{\mathcal{G}}=\left\{\exp(T)=I+\sum_{n=1}^{N}\frac{1}{n!}T^{n}~{}|~{}T% \in\mathfrak{b}\right\}.over~ start_ARG caligraphic_G end_ARG = { roman_exp ( start_ARG italic_T end_ARG ) = italic_I + ∑ start_POSTSUBSCRIPT italic_n = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_n ! end_ARG italic_T start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_T ∈ fraktur_b } . (39)
Lemma 11.

The set 𝒢~normal-~𝒢\tilde{\mathcal{G}}over~ start_ARG caligraphic_G end_ARG is equal to 𝒢𝒢\mathcal{G}caligraphic_G.

Proof.

Let exp(T)𝒢~𝑇~𝒢\exp(T)\in\tilde{\mathcal{G}}roman_exp ( start_ARG italic_T end_ARG ) ∈ over~ start_ARG caligraphic_G end_ARG with T𝔟𝑇𝔟T\in\mathfrak{b}italic_T ∈ fraktur_b. By definition exp(T)=I+P(T)𝑇𝐼𝑃𝑇\exp(T)=I+P(T)roman_exp ( start_ARG italic_T end_ARG ) = italic_I + italic_P ( italic_T ) where P𝑃Pitalic_P is a polynomial at most of the degree N𝑁Nitalic_N and since 𝔟𝔟\mathfrak{b}fraktur_b is a vector space, we have P(T)𝔟𝑃𝑇𝔟P(T)\in\mathfrak{b}italic_P ( italic_T ) ∈ fraktur_b. However, this defines an element in 𝒢𝒢\mathcal{G}caligraphic_G which yields 𝒢~𝒢~𝒢𝒢\tilde{\mathcal{G}}\subseteq\mathcal{G}over~ start_ARG caligraphic_G end_ARG ⊆ caligraphic_G. Conversely, let I+C𝒢𝐼𝐶𝒢I+C\in\mathcal{G}italic_I + italic_C ∈ caligraphic_G. Then I+CI=C𝔟𝐼𝐶𝐼𝐶𝔟I+C-I=C\in\mathfrak{b}italic_I + italic_C - italic_I = italic_C ∈ fraktur_b, which implies that

log(I+C)=n=0(1)nn+1Cn+1𝐼𝐶superscriptsubscript𝑛0superscript1𝑛𝑛1superscript𝐶𝑛1\log(I+C)=\sum_{n=0}^{\infty}\frac{(-1)^{n}}{n+1}C^{n+1}roman_log ( start_ARG italic_I + italic_C end_ARG ) = ∑ start_POSTSUBSCRIPT italic_n = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT divide start_ARG ( - 1 ) start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT end_ARG start_ARG italic_n + 1 end_ARG italic_C start_POSTSUPERSCRIPT italic_n + 1 end_POSTSUPERSCRIPT (40)

terminates after N+1𝑁1N+1italic_N + 1 terms. Hence log(I+C)log𝐼𝐶{\rm log}(I+C)roman_log ( italic_I + italic_C ) is an element in 𝔟𝔟\mathfrak{b}fraktur_b and therewith

I+C=exp(log(I+C))𝒢~𝐼𝐶log𝐼𝐶~𝒢I+C=\exp({\rm log}(I+C))\in\tilde{\mathcal{G}}italic_I + italic_C = roman_exp ( start_ARG roman_log ( italic_I + italic_C ) end_ARG ) ∈ over~ start_ARG caligraphic_G end_ARG

which shows that 𝒢𝒢~𝒢~𝒢\mathcal{G}\subseteq\tilde{\mathcal{G}}caligraphic_G ⊆ over~ start_ARG caligraphic_G end_ARG.

The common algebraic definition of the Lie exponential map is by means of a map exp:𝔟𝒢:𝔟𝒢{\exp:\mathfrak{b}\to\mathcal{G}}roman_exp : fraktur_b → caligraphic_G, where 𝒢𝒢\mathcal{G}caligraphic_G is a Lie group and 𝔟𝔟\mathfrak{b}fraktur_b the corresponding Lie algebra. In particular, the exponential map is a map from the tangent space to the Lie group [32, 39]. We wish to equip 𝒢𝒢\mathcal{G}caligraphic_G with a particular group multiplication direct-product\odot, such that (𝒢,)𝒢direct-product(\mathcal{G},\odot)( caligraphic_G , ⊙ ) is a Lie group and 𝔟𝔟\mathfrak{b}fraktur_b its Lie algebra. This group multiplication direct-product\odot is defined by means of the Backer–Campbell–Hausdorff formula

:𝒢×𝒢𝒢;exp(T)exp(U)=exp(T*U)\odot:\mathcal{G}\times\mathcal{G}\rightarrow\mathcal{G};\ \exp(T)\odot\exp(U)% =\exp(T*U)⊙ : caligraphic_G × caligraphic_G → caligraphic_G ; roman_exp ( start_ARG italic_T end_ARG ) ⊙ roman_exp ( start_ARG italic_U end_ARG ) = roman_exp ( start_ARG italic_T * italic_U end_ARG )

for an operation *** on 𝔟𝔟\mathfrak{b}fraktur_b which takes the following simple form on Abelian algebras

*:𝔟×𝔟𝔟;(T,S)T+S.*:\mathfrak{b}\times\mathfrak{b}\to\mathfrak{b}~{};~{}(T,S)\mapsto T+S.* : fraktur_b × fraktur_b → fraktur_b ; ( italic_T , italic_S ) ↦ italic_T + italic_S . (41)

In other words, we can almost trivially derive the coupled cluster ansatz using concepts from non-linear algebra [48].

Theorem 12.

Given the Lie group 𝒢𝒢\mathcal{G}caligraphic_G with Lie algebra 𝔟𝔟\mathfrak{b}fraktur_b. The exponential map exp:𝔟𝒢normal-:normal-→𝔟𝒢\exp:\mathfrak{b}\to\mathcal{G}roman_exp : fraktur_b → caligraphic_G is surjective.

Note that this theorem can be generalized to any nilpotent Lie algebra. However, the proof shows that the inverse of the exponential is in this particular case well-defined, which proves the following theorem.

Theorem 13.

The exponential map from 𝔟𝔟\mathfrak{b}fraktur_b to 𝒢𝒢\mathcal{G}caligraphic_G is bijective.

This shows that any wave function that is intermediately normalized can be uniquely expressed through an element in 𝒢𝒢\mathcal{G}caligraphic_G, i.e., through the exponential of a cluster matrix T𝔟𝑇𝔟T\in\mathfrak{b}italic_T ∈ fraktur_b. This aligns with the known functional analytic results [60, 56, 45, 25], and is known in the quantum-chemistry community as the equivalence of FCI and FCC.

Some of the above results naturally extend to the truncated case, i.e., using a subspace 𝔟¯𝔟¯𝔟𝔟\bar{\mathfrak{b}}\subset\mathfrak{b}over¯ start_ARG fraktur_b end_ARG ⊂ fraktur_b in the above construction. We refer the interested reader to [26].

3.2 The single reference CC theory

After the mathematical introduction to the CC ansatz, we now turn to the equations that yield the desired cluster matrix. These equations, central to coupled cluster theory, are the coupled cluster equations. This set of equations can be motivated as follows: Let |Ψ~(N)ket~Ψsuperscript𝑁|\tilde{\Psi}\rangle\in\mathcal{H}^{(N)}| over~ start_ARG roman_Ψ end_ARG ⟩ ∈ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT be the ground state solution to the electronic Schrödinger equation, i.e.,

H|Ψ~=E0|Ψ~.𝐻ket~Ψsubscript𝐸0ket~ΨH|\tilde{\Psi}\rangle=E_{0}|\tilde{\Psi}\rangle.italic_H | over~ start_ARG roman_Ψ end_ARG ⟩ = italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | over~ start_ARG roman_Ψ end_ARG ⟩ . (42)

We can renormalize |Ψ~ket~Ψ|\tilde{\Psi}\rangle| over~ start_ARG roman_Ψ end_ARG ⟩ to be intermediately normalized, i.e.,

|Ψ=1Ψ0|Ψ~|Ψ~.ketΨ1inner-productsubscriptΨ0~Ψket~Ψ|\Psi\rangle=\frac{1}{\langle\Psi_{0}|\tilde{\Psi}\rangle}|\tilde{\Psi}\rangle.| roman_Ψ ⟩ = divide start_ARG 1 end_ARG start_ARG ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | over~ start_ARG roman_Ψ end_ARG ⟩ end_ARG | over~ start_ARG roman_Ψ end_ARG ⟩ . (43)

By Theorem 13, we then know that there exists a unique element T𝔟𝑇𝔟T\in\mathfrak{b}italic_T ∈ fraktur_b such that

|Ψ=exp(T)|Ψ0.ketΨexp𝑇ketsubscriptΨ0|\Psi\rangle={\rm exp}(T)|\Psi_{0}\rangle.| roman_Ψ ⟩ = roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (44)

Substituting Eq. (44) in the electronic Schrödinger equation (42) yields

Hexp(T)|Ψ0=E0exp(T)|Ψ0{Ψ0|exp(T)Hexp(T)|Ψ0=E0Ψ|exp(T)Hexp(T)|Ψ0=0,Ψ|Ψ0|.H{\rm exp}(T)|\Psi_{0}\rangle=E_{0}{\rm exp}(T)|\Psi_{0}\rangle\Leftrightarrow% \left\{\begin{aligned} \langle\Psi_{0}|{\rm exp}(-T)H{\rm exp}(T)|\Psi_{0}% \rangle&=E_{0}\\ \langle\Psi|{\rm exp}(-T)H{\rm exp}(T)|\Psi_{0}\rangle&=0,\qquad\forall\langle% \Psi|\perp\langle\Psi_{0}|.\end{aligned}\right.italic_H roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ ⇔ { start_ROW start_CELL ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | roman_exp ( - italic_T ) italic_H roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_CELL start_CELL = italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_CELL end_ROW start_ROW start_CELL ⟨ roman_Ψ | roman_exp ( - italic_T ) italic_H roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_CELL start_CELL = 0 , ∀ ⟨ roman_Ψ | ⟂ ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | . end_CELL end_ROW (45)

Noting that the cluster matrix is a function of the cluster amplitudes 𝐭𝐭{\bf t}bold_t, i.e.,

T(𝐭)=μtμXμ,𝑇𝐭subscript𝜇subscript𝑡𝜇subscript𝑋𝜇T({\bf t})=\sum_{\mu}t_{\mu}X_{\mu},italic_T ( bold_t ) = ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_X start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT , (46)

and that by construction Ψμ|Ψ0=0inner-productsubscriptΨ𝜇subscriptΨ00\langle\Psi_{\mu}|\Psi_{0}\rangle=0⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = 0 for all μ𝜇\muitalic_μ, we see that the coupled cluster amplitudes fulfill the square system of equations

0=Ψμ|exp(T(𝐭))Hexp(T(𝐭))|Ψ0=:fμ(𝐭)μ.0=\langle\Psi_{\mu}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}% \rangle=:f_{\mu}({\bf t})\qquad\forall\mu.0 = ⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | roman_exp ( - italic_T ( bold_t ) ) italic_H roman_exp ( italic_T ( bold_t ) ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = : italic_f start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ( bold_t ) ∀ italic_μ . (47)

Mathematically, CC methods use roots of a high-dimensional non-linear function

fCC:𝐭[fμ(𝐭)]μ:subscript𝑓CCmaps-to𝐭subscriptdelimited-[]subscript𝑓𝜇𝐭𝜇f_{\rm CC}:{\bf t}\mapsto[f_{\mu}({\bf t})]_{\mu}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT : bold_t ↦ [ italic_f start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ( bold_t ) ] start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT (48)

to characterize physical states. The above derivation proves that

H|Ψ~=E0|Ψ~{Ψ0|exp(T(𝐭))Hexp(T(𝐭))|Ψ0=EΨμ|exp(T(𝐭))Hexp(T(𝐭))|Ψ0=0μ,H|\tilde{\Psi}\rangle=E_{0}|\tilde{\Psi}\rangle\Rightarrow\left\{\begin{% aligned} \langle\Psi_{0}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}% \rangle&=E\\ \langle\Psi_{\mu}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}\rangle% &=0\qquad\forall\mu,\end{aligned}\right.italic_H | over~ start_ARG roman_Ψ end_ARG ⟩ = italic_E start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | over~ start_ARG roman_Ψ end_ARG ⟩ ⇒ { start_ROW start_CELL ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | roman_exp ( - italic_T ( bold_t ) ) italic_H roman_exp ( italic_T ( bold_t ) ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_CELL start_CELL = italic_E end_CELL end_ROW start_ROW start_CELL ⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | roman_exp ( - italic_T ( bold_t ) ) italic_H roman_exp ( italic_T ( bold_t ) ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ end_CELL start_CELL = 0 ∀ italic_μ , end_CELL end_ROW (49)

for T(𝐭)𝑇𝐭T({\bf t})italic_T ( bold_t ) fulfilling

exp(T(𝐭))|Ψ0=1Ψ0|Ψ~|Ψ~.exp𝑇𝐭ketsubscriptΨ01inner-productsubscriptΨ0~Ψket~Ψ{\rm exp}(T({\bf t}))|\Psi_{0}\rangle=\frac{1}{\langle\Psi_{0}|\tilde{\Psi}% \rangle}|\tilde{\Psi}\rangle.roman_exp ( italic_T ( bold_t ) ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = divide start_ARG 1 end_ARG start_ARG ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | over~ start_ARG roman_Ψ end_ARG ⟩ end_ARG | over~ start_ARG roman_Ψ end_ARG ⟩ .

Note that the converse direction also holds. Let |Ψ(N)ketΨsuperscript𝑁|\Psi\rangle\in\mathcal{H}^{(N)}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT be arbitrary and 𝐭𝐭{\bf t}bold_t fulfilling Eq. (47). We define

ECC(𝐭)=Ψ0|exp(T(𝐭))Hexp(T(𝐭))|Ψ0.subscript𝐸CC𝐭quantum-operator-productsubscriptΨ0exp𝑇𝐭𝐻exp𝑇𝐭subscriptΨ0E_{\rm CC}({\bf t})=\langle\Psi_{0}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}% ))|\Psi_{0}\rangle.italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) = ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | roman_exp ( - italic_T ( bold_t ) ) italic_H roman_exp ( italic_T ( bold_t ) ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ .

Then

Ψ|(HECC)exp(T)|Ψ0quantum-operator-productΨ𝐻subscript𝐸CCexp𝑇subscriptΨ0\displaystyle\langle\Psi|(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle⟨ roman_Ψ | ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ) roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ =Ψ|exp(T)exp(T)(HECC)exp(T)|Ψ0absentquantum-operator-productΨexp𝑇exp𝑇𝐻subscript𝐸CCexp𝑇subscriptΨ0\displaystyle=\langle\Psi|{\rm exp}(T){\rm exp}(-T)(H-E_{\rm CC}){\rm exp}(T)|% \Psi_{0}\rangle= ⟨ roman_Ψ | roman_exp ( italic_T ) roman_exp ( - italic_T ) ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ) roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ (50)
=Ψ|exp(T)|Ψ0Ψ0|exp(T)(HECC)exp(T)|Ψ0absentquantum-operator-productΨexp𝑇subscriptΨ0quantum-operator-productsubscriptΨ0exp𝑇𝐻subscript𝐸CCexp𝑇subscriptΨ0\displaystyle=\langle\Psi|{\rm exp}(T)|\Psi_{0}\rangle\langle\Psi_{0}|{\rm exp% }(-T)(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle= ⟨ roman_Ψ | roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ ⟨ roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT | roman_exp ( - italic_T ) ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ) roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩
+μΨ|exp(T)|ΨμΨμ|exp(T)(HECC)exp(T)|Ψ0subscript𝜇quantum-operator-productΨexp𝑇subscriptΨ𝜇quantum-operator-productsubscriptΨ𝜇exp𝑇𝐻subscript𝐸CCexp𝑇subscriptΨ0\displaystyle\quad+\sum_{\mu}\langle\Psi|{\rm exp}(T)|\Psi_{\mu}\rangle\langle% \Psi_{\mu}|{\rm exp}(-T)(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle+ ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟨ roman_Ψ | roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ⟩ ⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT | roman_exp ( - italic_T ) ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ) roman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩
=0.absent0\displaystyle=0.= 0 .

Since |Ψ(N)ketΨsuperscript𝑁|\Psi\rangle\in\mathcal{H}^{(N)}| roman_Ψ ⟩ ∈ caligraphic_H start_POSTSUPERSCRIPT ( italic_N ) end_POSTSUPERSCRIPT was chosen arbitrarily, this shows that exp(T)|Ψ0exp𝑇ketsubscriptΨ0{\rm exp}(T)|\Psi_{0}\rangleroman_exp ( italic_T ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ is an eigenvector of H𝐻Hitalic_H corresponding to the eigenvalue ECCsubscript𝐸CCE_{\rm CC}italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT.

In practical applications, |ΨketΨ|\Psi\rangle| roman_Ψ ⟩ is of course not known, instead, we seek to find an amplitude vector 𝐭𝐭{\bf t}bold_t that fulfills the non-linear equations (47). Moreover, we are considering the subspace 𝔟¯𝔟¯𝔟𝔟\bar{\mathfrak{b}}\subset\mathfrak{b}over¯ start_ARG fraktur_b end_ARG ⊂ fraktur_b instead of the full space 𝔟𝔟\mathfrak{b}fraktur_b. In order to still obtain a square system of equations, i.e., as many variables as equations, we merely consider the equations that arise from projections that correspond to the excitation matrices used to expand the sought cluster matrix.

It is worth noticing that in this case, the coupled cluster solution is no longer equivalent to the quantum mechanical energy expression. In fact, this does – in general – not even yield an eigenpair. This becomes apparent by inspecting Eq. (50) and noting that in order to be exactly zero, the CC equations have to contain projections onto all basis functions Ψμ|brasubscriptΨ𝜇\langle\Psi_{\mu}|⟨ roman_Ψ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT |.

Restrictions to different 𝔟¯¯𝔟\bar{\mathfrak{b}}over¯ start_ARG fraktur_b end_ARG can be motivated from many physical and chemical perspectives, however, mathematically, we consider these restrictions to be sparsity patterns enforced onto the CC amplitude vector 𝐭𝐭{\bf t}bold_t. In this context, it is worth noticing that there exists no mathematical result showing the general existence of a sparsity pattern, a sought sparsity pattern is rather the result of computational limitations as well as many computational results indicating that even for complicated systems a certain sparsity in t is apparent. As such we think of this as a conjecture rather than a fact.

As a system of nonlinear equations, the equations (47) have a number of solutions. Speaking of the coupled cluster solution bears therefore a certain level of ambiguity. Most coupled cluster implementations seek a solution that is close to zero employing a quasi-Newton approach and an initial guess for 𝐭𝐭{\bf t}bold_t that stems from MP2. Given the convergence behavior of quasi-Newton methods, together with the interesting structures that arise when considering the basins of convergence, this approach seems appropriate for a set of “well-behaved” problems but is not a generally applicable procedure. This has resulted in a number of numerical advances together with chemically or physically motivated adjustments of the considered system of equations.

4 Analysis

The numerical analysis of coupled cluster methods witnessed a significant surge since 2009 when Schneider published the pioneering work that introduced the first local analysis based on Zarantonello’s lemma [66] to coupled cluster theory. This work set the stage for several follow-up works and motivated the exploration of alternative mathematical frameworks well-suited for describing coupled cluster methods.

In Section 4.1, we outline Schneider’s approach and elaborate on the central ideas. We then proceed in Section 4.2 by introducing the graph-based framework for CC methods developed by Csirik and Laestadius. This perspective introduced novel ideas offering a unified platform to compare various CC methods, including multireference approaches. In Section 4.3 we then elaborate on the most recent numerical analysis results characterizing the single reference CC method. The authors Hassan, Maday, and Wang presented yet another and – compared to the local analysis – a more general approach based on the invertibility of the CC Fréchet derivative.

Before delving into these analytical characterizations of CC theory, we have to elaborate on three subtle mathematical details that are important to keep in mind when reading this section:

The first is related to the wave function. As outlined earlier, the most general space in which we seek to find a solution to the electronic Schrödinger equation is an anti-symmetrized Sobolev space [60]. Although we will avoid this detail explicitly in the subsequent elaborations, it is an important detail and a central concept that appears in all analysis works related to CC theory. From a quantum chemistry perspective, seeking a solution within this space ensures finite kinetic energy, in other words:

XN|ψ(𝐱1,,𝐱N)|2dλ(𝐱1)dλ(𝐱N)<+.subscriptsuperscript𝑋𝑁superscript𝜓subscript𝐱1subscript𝐱𝑁2differential-d𝜆subscript𝐱1differential-d𝜆subscript𝐱𝑁\int_{X^{N}}|\nabla\psi(\mathbf{x}_{1},\dots,\mathbf{x}_{N})|^{2}\mathrm{d}% \lambda(\mathbf{x}_{1})\dots\mathrm{d}\lambda(\mathbf{x}_{N})<+\infty.∫ start_POSTSUBSCRIPT italic_X start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT end_POSTSUBSCRIPT | ∇ italic_ψ ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , bold_x start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT roman_d italic_λ ( bold_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) … roman_d italic_λ ( bold_x start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT ) < + ∞ . (51)

For more details on Sobolev spaces, we refer the interested reader to mathematical textbooks [47, 2, 31, 65] or relevant articles that offer insights into their application in quantum chemistry [44, 27, 55, 64]. This extra constraint of finite kinetic energy is particularly important for the continuous (i.e., infinite dimensional) formulation of coupled-cluster theory [56]. In this context, we remind the reader of the notation for the L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-inner product ψ|ψinner-productsuperscript𝜓𝜓\langle\psi^{\prime}|\psi\rangle⟨ italic_ψ start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | italic_ψ ⟩, and its induced norm ψL22=ψ|ψsubscriptsuperscriptnorm𝜓2superscript𝐿2inner-product𝜓𝜓\|\psi\|^{2}_{L^{2}}=\langle\psi|\psi\rangle∥ italic_ψ ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = ⟨ italic_ψ | italic_ψ ⟩.

The second important detail is a measure of distance on matrix and operator spaces. We here consider operators that act on the wave functions, e.g., the Hamiltonian H𝐻Hitalic_H, cluster operators T𝑇Titalic_T, ΛΛ\Lambdaroman_Λ, etc. We can then introduce a norm expression for the operator inherited from the function space it is defined on. For example, let O𝑂Oitalic_O be an operator defined on L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT then we define the L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT operator norm

OL2=sup{OψL2:ψL2=1}.\|O\|_{L^{2}}=\sup\{\|O\psi\|_{L^{2}}~{}:~{}\|\psi\|_{L^{2}}=1\}.∥ italic_O ∥ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = roman_sup { ∥ italic_O italic_ψ ∥ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT : ∥ italic_ψ ∥ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = 1 } . (52)

Note that this concept reduces to induced matrix norms in the finite-dimensional case.

The third detail is that the CC amplitudes 𝐭𝐭{\bf t}bold_t live in the Hilbert space of finite square summable sequences denoted the 2superscript2\ell^{2}roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-space. This space is equipped with the 2superscript2\ell^{2}roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-inner product [2], i.e., let x=(xμ)𝑥subscript𝑥𝜇x=(x_{\mu})italic_x = ( italic_x start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) and y=(yμ)𝑦subscript𝑦𝜇y=(y_{\mu})italic_y = ( italic_y start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT ) be two finite sequences, the 2superscript2\ell^{2}roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT-inner product is defined as

x,y2=μxμyμ,subscript𝑥𝑦superscript2subscript𝜇subscript𝑥𝜇subscript𝑦𝜇\langle x,y\rangle_{\ell^{2}}=\sum_{\mu}x_{\mu}y_{\mu},⟨ italic_x , italic_y ⟩ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT italic_y start_POSTSUBSCRIPT italic_μ end_POSTSUBSCRIPT , (53)

which induces the norm x22=x,x2subscriptsuperscriptnorm𝑥2superscript2subscript𝑥𝑥superscript2\|x\|^{2}_{\ell^{2}}=\langle x,x\rangle_{\ell^{2}}∥ italic_x ∥ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = ⟨ italic_x , italic_x ⟩ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT.

4.1 Local strong monotonicity

The local analysis introduced by Schneider [60] has spawned various works following a similar methodology analyzing different CC methods: Rohwedder generalized it to infinite dimensions [56, 57], Laestadius and Kvaal adapted it for the extended CC framework [45], and Faulstich et al. adapted it for tailored CC methods [25]. Central to all these local analyses is the local version of Zarantonello’s lemma [66].

The local version of Zarantonello’s lemma states that – under certain conditions – a function is (locally) invertible. In the context of coupled cluster theory, the function under investigation is the CC function fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT defined in Eq. (48). The local invertibility of this function yields the local existence and uniqueness of a CC solution. To ensure the applicability of Zarantonello’s lemma, the function in question must exhibit specific characteristics of mathematical “well-behavedness”. Specifically, in this context, it means that the function must satisfy two essential properties:

Local strong monotonicity. The function fCCsubscript𝑓CCf_{\mathrm{CC}}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT is called locally strongly monotone at 𝐭*subscript𝐭\mathbf{t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT if for some r>0𝑟0r>0italic_r > 0, γ>0𝛾0\gamma>0italic_γ > 0 and all 𝐭,𝐭𝐭superscript𝐭\mathbf{t},\mathbf{t}^{\prime}bold_t , bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT within the distance r𝑟ritalic_r of 𝐭*subscript𝐭\mathbf{t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT

fCC(𝐭)fCC(𝐭),𝐭𝐭2γ𝐭𝐭22.subscriptsubscript𝑓CC𝐭subscript𝑓CCsuperscript𝐭𝐭superscript𝐭superscript2𝛾superscriptsubscriptnorm𝐭superscript𝐭superscript22\langle f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime}),% \mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}\geq\gamma\|\mathbf{t}-\mathbf% {t}^{\prime}\|_{\ell^{2}}^{2}.⟨ italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) - italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⟩ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ≥ italic_γ ∥ bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT . (54)

Local Lipschitz continuity. The function fCCsubscript𝑓CCf_{\mathrm{CC}}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT is said to be locally Lipschitz continuous at 𝐭*subscript𝐭\mathbf{t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT with Lipschitz constant L>0𝐿0L>0italic_L > 0 if for some r>0𝑟0r>0italic_r > 0 and all 𝐭,𝐭𝐭superscript𝐭\mathbf{t},\mathbf{t}^{\prime}bold_t , bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT within the distance r𝑟ritalic_r of 𝐭*subscript𝐭\mathbf{t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT

fCC(𝐭)fCC(𝐭)2L𝐭𝐭2,subscriptnormsubscript𝑓CC𝐭subscript𝑓CCsuperscript𝐭superscript2𝐿subscriptnorm𝐭superscript𝐭superscript2\|f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime})\|_{\ell^{2}% }\leq L\|\mathbf{t}-\mathbf{t}^{\prime}\|_{\ell^{2}},∥ italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) - italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) ∥ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT ≤ italic_L ∥ bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∥ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT , (55)

Note that in the finite-dimensional case, fCCsubscript𝑓CCf_{\mathrm{CC}}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT is indeed locally Lipschitz since it is continuously differentiable.

In the context of CC theory, the difficult property to prove is that fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT – or the respective function describing the CC method under consideration – is locally strongly monotone. Here, all analyses generally follow a similar pattern: Inspecting the left-hand side in Eq. (54) we begin by a tailor expansion of fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT around 𝐭*subscript𝐭\mathbf{t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT, i.e.,

fCC(𝐭)fCC(𝐭)=DfCC(𝐭*)(𝐭𝐭)+𝒪((𝐭𝐭)2),subscript𝑓CC𝐭subscript𝑓CCsuperscript𝐭𝐷subscript𝑓CCsubscript𝐭𝐭superscript𝐭𝒪superscript𝐭superscript𝐭2f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime})=Df_{\mathrm{% CC}}(\mathbf{t}_{*})(\mathbf{t}-\mathbf{t}^{\prime})+\mathcal{O}\left((\mathbf% {t}-\mathbf{t}^{\prime})^{2}\right),italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) - italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) = italic_D italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ( bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) + caligraphic_O ( ( bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) , (56)

where DfCC𝐷subscript𝑓CCDf_{\mathrm{CC}}italic_D italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT is the Jacobian of fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT. This yields

fCC(𝐭)fCC(𝐭),𝐭𝐭2=DfCC(𝐭*)(𝐭𝐭),𝐭𝐭2+𝒪(𝐭𝐭3).subscriptsubscript𝑓CC𝐭subscript𝑓CCsuperscript𝐭𝐭superscript𝐭superscript2subscript𝐷subscript𝑓CCsubscript𝐭𝐭superscript𝐭𝐭superscript𝐭superscript2𝒪superscriptnorm𝐭superscript𝐭3\langle f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime}),% \mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}=\langle Df_{\mathrm{CC}}(% \mathbf{t}_{*})(\mathbf{t}-\mathbf{t}^{\prime}),\mathbf{t}-\mathbf{t}^{\prime}% \rangle_{\ell^{2}}+\mathcal{O}\left(\|\mathbf{t}-\mathbf{t}^{\prime}\|^{3}% \right).⟨ italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) - italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⟩ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = ⟨ italic_D italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ( bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⟩ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT + caligraphic_O ( ∥ bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT ) . (57)

At this point, it is common to impose certain locality assumptions, i.e., assuming that 𝐭𝐭\mathbf{t}bold_t and 𝐭superscript𝐭\mathbf{t}^{\prime}bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT are close enough to 𝐭*subscript𝐭\mathbf{t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT. This ensures that the term 𝒪(𝐭𝐭3)𝒪superscriptnorm𝐭superscript𝐭3\mathcal{O}\left(\|\mathbf{t}-\mathbf{t}^{\prime}\|^{3}\right)caligraphic_O ( ∥ bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ∥ start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT ) is sufficiently small. In order to control the remaining term, the derivative of fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT can explicitly be computed, which yields

DfCC(𝐭*)(𝐭𝐭),𝐭𝐭2=(TT)Ψ0,eT*(HECC(𝐭*))eT*(TT)Ψ0L2.subscript𝐷subscript𝑓CCsubscript𝐭𝐭superscript𝐭𝐭superscript𝐭superscript2subscript𝑇superscript𝑇subscriptΨ0superscript𝑒subscript𝑇𝐻subscript𝐸CCsubscript𝐭superscript𝑒subscript𝑇𝑇superscript𝑇subscriptΨ0superscript𝐿2\langle Df_{\mathrm{CC}}(\mathbf{t}_{*})(\mathbf{t}-\mathbf{t}^{\prime}),% \mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}=\langle(T-T^{\prime})\Psi_{0}% ,e^{-T_{*}}\left(H-E_{\rm CC}({\bf\mathbf{t}_{*}})\right)e^{T_{*}}(T-T^{\prime% })\Psi_{0}\rangle_{L^{2}}.⟨ italic_D italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ( bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) , bold_t - bold_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⟩ start_POSTSUBSCRIPT roman_ℓ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT = ⟨ ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_e start_POSTSUPERSCRIPT - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) italic_e start_POSTSUPERSCRIPT italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT . (58)

The next step involves expanding the similarity-transformed Hamiltonian, i.e., eT*HeT*superscript𝑒subscript𝑇𝐻superscript𝑒subscript𝑇e^{-T_{*}}He^{T_{*}}italic_e start_POSTSUPERSCRIPT - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_H italic_e start_POSTSUPERSCRIPT italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT, using the Hausdorff Lemma [32], which is an important lemma derived from the Baker–Campbell–Hausdorff formula. This yields

eT*(HECC(𝐭*))eT*=(HECC(𝐭*))T*(HECC(𝐭*))+(HECC(𝐭*))T*+superscript𝑒subscript𝑇𝐻subscript𝐸CCsubscript𝐭superscript𝑒subscript𝑇𝐻subscript𝐸CCsubscript𝐭subscript𝑇𝐻subscript𝐸CCsubscript𝐭𝐻subscript𝐸CCsubscript𝐭subscript𝑇e^{-T_{*}}\left(H-E_{\rm CC}({\bf t_{*}})\right)e^{T_{*}}=\left(H-E_{\rm CC}({% \bf t_{*}})\right)-T_{*}\left(H-E_{\rm CC}({\bf t_{*}})\right)+\left(H-E_{\rm CC% }({\bf t_{*}})\right)T_{*}+...italic_e start_POSTSUPERSCRIPT - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) italic_e start_POSTSUPERSCRIPT italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT = ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) + ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT + … (59)

Again, imposing locality of t𝑡titalic_t and tsuperscript𝑡t^{\prime}italic_t start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT around t*subscript𝑡t_{*}italic_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT will ensure that higher-order terms become negligible. This yields that

(TT)Ψ0,\displaystyle\langle(T-T^{\prime})\Psi_{0},⟨ ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , eT*(HECC(𝐭*))eT*(TT)Ψ0L2\displaystyle e^{-T_{*}}\left(H-E_{\rm CC}({\bf t_{*}})\right)e^{T_{*}}(T-T^{% \prime})\Psi_{0}\rangle_{L^{2}}italic_e start_POSTSUPERSCRIPT - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) italic_e start_POSTSUPERSCRIPT italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT (60)
=(TT)Ψ0,(HECC(𝐭*))(TT)Ψ0L2absentsubscript𝑇superscript𝑇subscriptΨ0𝐻subscript𝐸CCsubscript𝐭𝑇superscript𝑇subscriptΨ0superscript𝐿2\displaystyle=\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})% \right)(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}= ⟨ ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT
+(TT)Ψ0,(HECC(𝐭*))(T*T*)(TT)Ψ0L2subscript𝑇superscript𝑇subscriptΨ0𝐻subscript𝐸CCsubscript𝐭subscript𝑇superscriptsubscript𝑇𝑇superscript𝑇subscriptΨ0superscript𝐿2\displaystyle\quad+\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}% })\right)(T_{*}-T_{*}^{\dagger})(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}+ ⟨ ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) ( italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT
+\displaystyle\quad+...+ …

The first term in this expansion, i.e.,

(TT)Ψ0,(HECC(𝐭*))(TT)Ψ0L2,subscript𝑇superscript𝑇subscriptΨ0𝐻subscript𝐸CCsubscript𝐭𝑇superscript𝑇subscriptΨ0superscript𝐿2\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})\right)(T-T^{% \prime})\Psi_{0}\rangle_{L^{2}},⟨ ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT , (61)

can then be bounded by imposing different spectral gap assumptions depending on the CC method under consideration. The applicability of different spectral gap assumptions is reasonable in the context of CC methods. For the remainder term,

(TT)Ψ0,(HECC(𝐭*))(T*T*)(TT)Ψ0L2subscript𝑇superscript𝑇subscriptΨ0𝐻subscript𝐸CCsubscript𝐭subscript𝑇superscriptsubscript𝑇𝑇superscript𝑇subscriptΨ0superscript𝐿2\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})\right)(T_{*}-T_{% *}^{\dagger})(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}⟨ ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , ( italic_H - italic_E start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ) ( italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT - italic_T start_POSTSUBSCRIPT * end_POSTSUBSCRIPT start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ) ( italic_T - italic_T start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ start_POSTSUBSCRIPT italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_POSTSUBSCRIPT (62)

further “well-behavedness” assumptions have to be made which commonly involve the fluctuation potential W=HF𝑊𝐻𝐹W=H-Fitalic_W = italic_H - italic_F, where F𝐹Fitalic_F is the Fock matrix. Opposed to the spectral gap assumption there is much less known about the feasibility of such assumptions. Combining these estimates yields an approximate strong monotonicity constant denoted by ΓΓ\Gammaroman_Γ. The positivity of this constant varies depending on the system being analyzed, indicating that such an analysis is not universally applicable. For specific values of ΓΓ\Gammaroman_Γ across different systems, we refer to Table 1 where these variations are detailed. Moreover, a prior error estimates can be established using the general framework introduced by Bangerth and Rannacher [3].

4.2 Excitation graphs and topological degree

In a series of two articles [18, 19] Csirik and Laestadius propose a novel and comprehensive mathematical framework for Coupled-Cluster-type methods.

In the first article of this series [18], the authors develop a graph theoretical approach offering a new interpretation of the excitation structures in various CC methods through a graph-based framework. This method is particularly potent as it enables a cohesive analysis of both single and multi-reference CC methods within a unified structure. To illustrate this concept, we consider a simplified scenario with five spin orbitals, labeled {1,,5}15\{1,...,5\}{ 1 , … , 5 }, and two reference states, {1,2,3}123\{1,2,3\}{ 1 , 2 , 3 } and {1,2,4}124\{1,2,4\}{ 1 , 2 , 4 }. The array of possible excitations in this setup can be effectively represented using a graph, where each edge symbolizes a potential excitation. This graphical representation is detailed in Figure 2, providing a clear and structured visualization of the excitation dynamics.

Refer to caption
Figure 2: Full multi-reference excitation multigraph for five spin orbitals, labeled {1,,5}15\{1,...,5\}{ 1 , … , 5 } and two reference states, {1,2,3}123\{1,2,3\}{ 1 , 2 , 3 } and {1,2,4}124\{1,2,4\}{ 1 , 2 , 4 }. The excitations w.r.t the references {1,2,3}123\{1,2,3\}{ 1 , 2 , 3 } and {1,2,4}124\{1,2,4\}{ 1 , 2 , 4 } are represented as edges. To distinguish the excitations from {1,2,3}123\{1,2,3\}{ 1 , 2 , 3 } and {1,2,4}124\{1,2,4\}{ 1 , 2 , 4 }, the edges are colored in red and blue, respectively. See [18] for more details.

Analyzing the excitation graph itself can lead to insights about the considered CC method. As an example, the transitivity of the graph implies the algebraic closedness of the set of excitation operators.

In the second article of this series [19], the authors analyze the nonlinear equations arising in the single reference CC method using topological degree theory. This mathematical tool is instrumental in decoding and resolving specific equation types that entail mappings between topological spaces. When applied to the CC map, topological degree theory allows for the deduction of local existence and uniqueness of the CC solutions. Additionally, it facilitates the extraction of the topological index for solutions within the single reference CC framework. In general, the topological index of a root in a nonlinear map is particularly enlightening, shedding light on the root’s inherent nature, especially regarding its stability and the map’s behavior in its vicinity. In this context, the authors successfully demonstrate the application of topological index results to both non-degenerate and degenerate solutions in the single reference CC method, providing deeper insights into the underlying mathematical structure of these solutions.

In addition to their exploration of nonlinear equations in the single reference CC method, the authors also investigate the complex issue of discerning the “physicality” of solutions to truncated CC equations. This area of research has been pivotal in distinguishing between “physical” solutions, which accurately mirror real-world phenomena, and “unphysical” ones, which are considered irrelevant or misleading. A landmark study by Kowalski and Piecuch [53] played a crucial role in this context, employing a specific homotopy method to categorize these solutions. Despite some debate over the universality of this method, as noted by Csirik and Laestadius in Remark 4.30 in [19], the contributions of Kowalski and Piecuch were significant – to the extent that the authors christened this particular homotopy the “Kowalski–Piecuch homotopy”, or “KP homotopy” for short. Unveiled the intricate nature of solutions to truncated CC equations, results in [53] highlighted the need for deeper analytical scrutiny. This revelation has spurred further examination of the CC equations and the KP homotopy approach, with a renewed focus on employing topological degree theory. By doing so, Csirik and Laestadius have markedly enhanced our comprehension of the complex nature inherent in truncated CC equations, offering new perspectives and deeper insights into their behavior and implications.

4.3 Inf-Sup condition

In their two-part series of articles [34, 33], Hassan, Maday, and Wang have made substantial advancements in our analytical grasp of the CC function fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT defined in Eq. (48). To avoid an ad hoc bound onto the fluctuation potential as imposed in Sec. 4.1, the authors instead prove the local invertibility of the CC function through a classical inf-sup type argument that marks a significant shift in the analytical methodology employed in CC theory. Such an inf-sup condition, also called the Babuška–Brezzi condition which is a technique commonly used when analyzing indefinite elliptic partial differential equations, can be summarized as follows:

Consider the bounded linear mapping A𝐴Aitalic_A between two normed spaces (V,V)(V,\|\cdot\|_{V})( italic_V , ∥ ⋅ ∥ start_POSTSUBSCRIPT italic_V end_POSTSUBSCRIPT ) and (W,W)(W,\|\cdot\|_{W})( italic_W , ∥ ⋅ ∥ start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT ) – note that the Jacobian naturally fulfills these assumptions. The Babuška–Brezzi condition states that there exists a constant α>0𝛼0\alpha>0italic_α > 0 such that

infvVv0supwWw0|A(v,w)|vVwW0andinfwWw0supvVv0|A(v,w)|vVwWα,formulae-sequencesubscriptinfimum𝑣𝑉𝑣0subscriptsupremum𝑤𝑊𝑤0𝐴𝑣𝑤subscriptnorm𝑣𝑉subscriptnorm𝑤𝑊0andsubscriptinfimum𝑤𝑊𝑤0subscriptsupremum𝑣𝑉𝑣0𝐴𝑣𝑤subscriptnorm𝑣𝑉subscriptnorm𝑤𝑊𝛼\inf_{\begin{subarray}{c}v\in V\\ v\neq 0\end{subarray}}\sup_{\begin{subarray}{c}w\in W\\ w\neq 0\end{subarray}}\frac{|A(v,w)|}{\|v\|_{V}\|w\|_{W}}\geq 0\quad{\rm and}% \quad\inf_{\begin{subarray}{c}w\in W\\ w\neq 0\end{subarray}}\sup_{\begin{subarray}{c}v\in V\\ v\neq 0\end{subarray}}\frac{|A(v,w)|}{\|v\|_{V}\|w\|_{W}}\geq\alpha,roman_inf start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_v ∈ italic_V end_CELL end_ROW start_ROW start_CELL italic_v ≠ 0 end_CELL end_ROW end_ARG end_POSTSUBSCRIPT roman_sup start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_w ∈ italic_W end_CELL end_ROW start_ROW start_CELL italic_w ≠ 0 end_CELL end_ROW end_ARG end_POSTSUBSCRIPT divide start_ARG | italic_A ( italic_v , italic_w ) | end_ARG start_ARG ∥ italic_v ∥ start_POSTSUBSCRIPT italic_V end_POSTSUBSCRIPT ∥ italic_w ∥ start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT end_ARG ≥ 0 roman_and roman_inf start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_w ∈ italic_W end_CELL end_ROW start_ROW start_CELL italic_w ≠ 0 end_CELL end_ROW end_ARG end_POSTSUBSCRIPT roman_sup start_POSTSUBSCRIPT start_ARG start_ROW start_CELL italic_v ∈ italic_V end_CELL end_ROW start_ROW start_CELL italic_v ≠ 0 end_CELL end_ROW end_ARG end_POSTSUBSCRIPT divide start_ARG | italic_A ( italic_v , italic_w ) | end_ARG start_ARG ∥ italic_v ∥ start_POSTSUBSCRIPT italic_V end_POSTSUBSCRIPT ∥ italic_w ∥ start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT end_ARG ≥ italic_α , (63)

see [58] for more details. This condition ensures that the operator A𝐴Aitalic_A is neither “too weak” nor “too strong”, in the sense that it maps elements of V𝑉Vitalic_V and W𝑊Witalic_W in a balanced way. Gaining a clearer understanding becomes easier in the context of finite dimensions: In this scenario, striving for local strong monotonicity, as detailed in Sec. 4.1, is analogous to verifying that a matrix is positive definite. Similarly, the inf-sup condition, as described in this section, can be likened to establishing a matrix’s invertibility. Note that in the realm of finite dimensions, a square matrix’s invertibility can be deduced solely from its injectivity. However, the infinite-dimensional scenario demands a bit more nuance. This is why we see two distinct conditions in Eq. (63), reflecting the additional complexity inherent in infinite dimensions.

In connection with single reference coupled cluster theory, the authors establish this condition for the similarity transformed shifted Hamiltonian, which arises from the coupled cluster Jacobian, see [34]. As shown subsequently, proving local invertibility based on this inf-sup condition yields more generally applicable well-posedness results compared to the local analysis techniques described in Sec. 4.1.

4.3.1 Overview of the inf-sup type argument

The essence of the analysis presented in [34] is that the CC function can be locally inverted if and only if its Jacobian, referred to as the CC Jacobian, can be locally inverted. Recall that the CC Jacobian is given by

w,DfCC(t)v=WΨ0,eT(t)[H,S]eT(t)Ψ0𝑤𝐷subscript𝑓CC𝑡𝑣𝑊subscriptΨ0superscript𝑒𝑇𝑡𝐻𝑆superscript𝑒𝑇𝑡subscriptΨ0\langle w,Df_{\rm CC}(t)v\rangle=\langle W\Psi_{0},e^{T(t)}[H,S]e^{T(t)}\Psi_{% 0}\rangle⟨ italic_w , italic_D italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( italic_t ) italic_v ⟩ = ⟨ italic_W roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_e start_POSTSUPERSCRIPT italic_T ( italic_t ) end_POSTSUPERSCRIPT [ italic_H , italic_S ] italic_e start_POSTSUPERSCRIPT italic_T ( italic_t ) end_POSTSUPERSCRIPT roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ (64)

and we introduce the map A𝐴Aitalic_A via the description

WΨ0,A(t)SΨ0=WΨ0,eT(t)[H,S]eT(t)Ψ0.𝑊subscriptΨ0𝐴𝑡𝑆subscriptΨ0𝑊subscriptΨ0superscript𝑒𝑇𝑡𝐻𝑆superscript𝑒𝑇𝑡subscriptΨ0\langle W\Psi_{0},A(t)S\Psi_{0}\rangle=\langle W\Psi_{0},e^{T(t)}[H,S]e^{T(t)}% \Psi_{0}\rangle.⟨ italic_W roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_A ( italic_t ) italic_S roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = ⟨ italic_W roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_e start_POSTSUPERSCRIPT italic_T ( italic_t ) end_POSTSUPERSCRIPT [ italic_H , italic_S ] italic_e start_POSTSUPERSCRIPT italic_T ( italic_t ) end_POSTSUPERSCRIPT roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ . (65)

Note that the CC Jacobian Df𝐷𝑓Dfitalic_D italic_f at 𝐭𝐭{\bf t}bold_t is then invertible if and only if A𝐴Aitalic_A is invertible at 𝐭𝐭{\bf t}bold_t. The authors leverage this observation and work with A𝐴Aitalic_A instead of the Jacobian Df𝐷𝑓Dfitalic_D italic_f. Moreover, A(𝐭*)𝐴subscript𝐭A({\bf t}_{*})italic_A ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) is a similarity transformed of the shifted Hamiltonian, in particular, it is non-symmetric! Therefore, one can either study A(𝐭)𝐴𝐭A({\bf t})italic_A ( bold_t ) or A(𝐭)superscript𝐴𝐭A^{\dagger}({\bf t})italic_A start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( bold_t ), both approaches are equivalent, yet one approach might be simpler than the other. Indeed, the authors establish the following two key results which yield the invertibility of A𝐴Aitalic_A at 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT which then yields the invertibility of the CC Jacobian DfCC𝐷subscript𝑓CCDf_{\rm CC}italic_D italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT at 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT, see Theorem 31 in [34]. First, the authors prove that at the true, untruncated CC solution 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT, the function A(𝐭*)𝐴subscript𝐭A({\bf t}_{*})italic_A ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) is injective, see step one in the proof of Theorem 31 in [34]. This is equivalent to the first inequality in Eq. (63). Second, the authors establish that A(𝐭*)superscript𝐴subscript𝐭A^{\dagger}({\bf t}_{*})italic_A start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) is bounded below, see step two in the proof of Theorem 31 in [34]. This is equivalent to the second inequality in Eq. (63). Combining these results yields that A𝐴Aitalic_A at 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT is invertible. A direct consequence of this is that the CC Jacobian Df𝐷𝑓Dfitalic_D italic_f evaluated at 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT is invertible, and its inverse is bounded, i.e.,

Df1(𝐭*)ΘΥ,withΘ=eT(𝐭*)0eT(𝐭*),formulae-sequencenorm𝐷superscript𝑓1subscript𝐭ΘΥwithΘnormsuperscript𝑒superscript𝑇subscript𝐭normsuperscriptsubscript0perpendicular-tosuperscript𝑒𝑇subscript𝐭\|Df^{-1}({\bf t}_{*})\|\leq\frac{\Theta}{\Upsilon},\qquad{\rm with}\qquad% \Theta=\|e^{T^{\dagger}({\bf t}_{*})}\|\|\mathbb{P}_{0}^{\perp}e^{-T({\bf t}_{% *})}\|,∥ italic_D italic_f start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ∥ ≤ divide start_ARG roman_Θ end_ARG start_ARG roman_Υ end_ARG , roman_with roman_Θ = ∥ italic_e start_POSTSUPERSCRIPT italic_T start_POSTSUPERSCRIPT † end_POSTSUPERSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) end_POSTSUPERSCRIPT ∥ ∥ blackboard_P start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ⟂ end_POSTSUPERSCRIPT italic_e start_POSTSUPERSCRIPT - italic_T ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) end_POSTSUPERSCRIPT ∥ , (66)

where ΥΥ\Upsilonroman_Υ is the inf-sup constant from Eq. (63), and 0superscriptsubscript0perpendicular-to\mathbb{P}_{0}^{\perp}blackboard_P start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ⟂ end_POSTSUPERSCRIPT is the projection onto the space orthogonal to span(Ψ0)spansubscriptΨ0{\rm span}(\Psi_{0})roman_span ( roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ).

These results, in turn, can then be leveraged to establish that the CC function fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT, under some assumptions (see Theorem 33 in [34]), is locally invertible around 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT and fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT as well as its local inverse are differentiable – in mathematical parlance, fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT is a local diffeomorphism. Moreover, the authors establish a local error bound of the form

𝐭*𝐭2ΘΥfCC(𝐭*).normsubscript𝐭𝐭2ΘΥnormsubscript𝑓CCsubscript𝐭\|{\bf t}_{*}-{\bf t}\|\leq 2\frac{\Theta}{\Upsilon}\|f_{\rm CC}({\bf t}_{*})\|.∥ bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT - bold_t ∥ ≤ 2 divide start_ARG roman_Θ end_ARG start_ARG roman_Υ end_ARG ∥ italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ∥ . (67)

4.3.2 Interpretation and results of the inf-sup argument

Similarly to the local analysis results elaborated on in Sec. 4.1, the inf-sup argument relies on the positivity of the constants involved. The advantage of the analysis presented in [34], is that the constants are provably positive and therefore universally applicable. In particular, they do not rely on assumptions on the fluctuation potential. See Table 1 for some molecular test systems in equilibrium geometry, and Fig. 3 for bond-dissociation of hydrogen fluoride.

Table 1: Comparison of the approximate strong monotonicity constant ΓΓ\Gammaroman_Γ (see Sec. 4.1) and Υ/ΘΥΘ\Upsilon/\Thetaroman_Υ / roman_Θ (see Eq. (66)) – both seeking a positive lower bound to 1/Df1(t*)1norm𝐷superscript𝑓1subscript𝑡1/\|Df^{-1}(t_{*})\|1 / ∥ italic_D italic_f start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ∥. The calculations were performed in STO-6G basis sets except for the HF and LiH molecules for which the 6-31G basis sets was used. For more details see [34]
Molecule 1/Df1(t*)1norm𝐷superscript𝑓1subscript𝑡1/\|Df^{-1}(t_{*})\|1 / ∥ italic_D italic_f start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ∥ Υ/ΘΥΘ\Upsilon/\Thetaroman_Υ / roman_Θ ΓΓ\Gammaroman_Γ
BeH22{}_{2}start_FLOATSUBSCRIPT 2 end_FLOATSUBSCRIPT 0.3379 0.2568 0.0363
BH33{}_{3}start_FLOATSUBSCRIPT 3 end_FLOATSUBSCRIPT 0.3060 0.2081 -0.0950
HF 0.2995 0.2529 -0.0083
H22{}_{2}start_FLOATSUBSCRIPT 2 end_FLOATSUBSCRIPTO 0.3576 0.2789 0.0249
LiH 0.2628 0.2164 -0.0065
NH33{}_{3}start_FLOATSUBSCRIPT 3 end_FLOATSUBSCRIPT 0.4113 0.2784 -0.0325

.

Table 1: Comparison of the approximate strong monotonicity constant ΓΓ\Gammaroman_Γ (see Sec. 4.1) and Υ/ΘΥΘ\Upsilon/\Thetaroman_Υ / roman_Θ (see Eq. (66)) – both seeking a positive lower bound to 1/Df1(t*)1norm𝐷superscript𝑓1subscript𝑡1/\|Df^{-1}(t_{*})\|1 / ∥ italic_D italic_f start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT ) ∥. The calculations were performed in STO-6G basis sets except for the HF and LiH molecules for which the 6-31G basis sets was used. For more details see [34]
Figure 3: Numerically computed constants for the HF molecule at different bond lengths. The equilibrium bond length is 0.9168 Å. The figure on the right uses a log scale on the y-axis. For more details see [34]
Refer to caption

.

Figure 3: Numerically computed constants for the HF molecule at different bond lengths. The equilibrium bond length is 0.9168 Å. The figure on the right uses a log scale on the y-axis. For more details see [34]

While the analytical approach outlined in [34] encounters certain challenges, its contributions to the mathematical understanding of CC theory are pivotal. Initially, this analysis seemed limited to the untruncated CC framework, relating approximate untruncated CC solutions with the infinite-dimensional untruncated CC solutions. However, the authors adeptly addressed this in a subsequent publication [33], successfully extending their findings to truncated CC methods. Another complexity lies in the computation of the involved constants in a numerically tractable manner. The constants involve operator norms which are in general not easily accessible, to say the least. Moreover, these constants are further linked to either the specific value of the untruncated CC solution 𝐭*subscript𝐭{\bf t}_{*}bold_t start_POSTSUBSCRIPT * end_POSTSUBSCRIPT or the spectral properties of related operators. Despite this, the potential for practical application remains promising. Future work could focus on developing manageable approximations of these constants, thereby making the insights from [34, 33] more accessible for practical simulations.

In summary, this novel analytical approach has significantly advanced our understanding of the local behavior of the CC function. It introduces a sound mathematical framework for understanding its local behavior, thereby greatly enriching our knowledge in this area.

5 The root structure of CC theory

As outlined in [26, 24], the root structure of a polynomial system is (in general) of fundamental importance. It unveils key aspects [37], such as the multiplicity of roots and the nature of these roots, e.g., whether they are real or complex. Such insights are especially vital when employing (approximate) root-finding methods in practical applications. The pursuit of roots to the CC equations (47) is a direct application of these principles.

In the context of CC methods, most commonly, (quasi) Newton-type methods are employed to approximate one root of the CC equations. From a computational perspective, (quasi) Newton-type methods have better numerical scaling than more general root-finding procedures. Additionally, in a perturbative regime, one could argue that the CC amplitudes can be viewed as minor corrections to the HF solution. Consequently, it may be sufficient to approximate a single root near zero, which represents a small change to the HF solution. This perturbation theoretical reasoning is of paramount importance in understanding the current computational and theoretical practices in CC theory. In particular, it justifies the use (quasi) Newton-type methods and also explains the quantum chemical rule of thumb, namely, “Do not trust simulations with large CC amplitudes”. However, it is very important to note that:

This reasoning does not cover all cases where CC theory can be successfully applied! [12, 30, 23]

The rule of thumb may be fine in the regime of weakly correlated systems, but it certainly breaks down for strongly correlated systems [26, 23]. For strongly correlated systems, it is common practice to make a case-by-case assessment of the computed results, currently limiting the reliable out-of-the-box application of CC methods. To illustrate the limitations of the perturbation theoretical perspective in fully comprehending CC theory, consider the single polynomial p(z)=z31𝑝𝑧superscript𝑧31p(z)=z^{3}-1italic_p ( italic_z ) = italic_z start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT - 1, which has three distinct roots: z1=1subscript𝑧11z_{1}=1italic_z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT = 1, and z2,3=1/2±i3/2subscript𝑧23plus-or-minus12𝑖32z_{2,3}=1/2\pm i\sqrt{3}/2italic_z start_POSTSUBSCRIPT 2 , 3 end_POSTSUBSCRIPT = 1 / 2 ± italic_i square-root start_ARG 3 end_ARG / 2. Applying Newton’s method to approximate one root to this system, we notice that, depending on the initialization, a different solution is found. This can be visualized by sampling a feasible region in \mathbb{C}blackboard_C and using these points as initialization for Newton’s method. Depending on which root was approximated, we then color each point accordingly. This yields the known Newton fractal corresponding to p(z)𝑝𝑧p(z)italic_p ( italic_z ), see Fig. 4 (left panel).

Refer to caption
Refer to caption
Figure 4: Left: Newton fractal of p(z)=z31𝑝𝑧superscript𝑧31p(z)=z^{3}-1italic_p ( italic_z ) = italic_z start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT - 1. The white dots correspond to the roots z1,z2,z3subscript𝑧1subscript𝑧2subscript𝑧3z_{1},z_{2},z_{3}italic_z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_z start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_z start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT. The colored regions, red, blue, and green, correspond to the basins of attraction of the roots z1subscript𝑧1z_{1}italic_z start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT, z2,z3subscript𝑧2subscript𝑧3z_{2},z_{3}italic_z start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_z start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT, respectively [24]. Right: Energy trajectory of CCS solutions for a two-electron system. The overlap of the eigenstates with the reference state is steered by the parameter ε𝜀\varepsilonitalic_ε. The plot shows the ε𝜀\varepsilonitalic_ε-energy trajectory of all CCS solutions, where ε𝜀\varepsilonitalic_ε was varied between zero and eight. For more details see [26].

This shows that around the individual roots Newton’s method convergence towards the closest root. However, it also shows that the global convergence behavior of (quasi) Newton-type methods is highly complicated [37, 59]. One can only imagine how intricate the Newton fractal of the high-dimensional CC equations is. These considerations raise the pressing question:

Which CC root has been approximated, and is this the “best” solution attainable with the considered CC method?

To definitively answer this question one must leave the perturbative framework, theoretically as well as practically! Mathematically, the most promising framework for studying systems of polynomial equations is algebraic geometry. This field not only provides a set of advanced theoretical tools but also has seen a tremendous surge in computational advances. Exploiting parallel implementations, computational procedures (mostly) based on the homotopy continuation method, e.g., PHCpack [63], Bertini [7], HOM4PS [13, 46], NAG4M2 [6], and HomotopyContinuation.jl [11], provide a reasonable starting point to numercally investigate the intricate root structures of the high-dimensional and non-linear CC equations.

Within the chemistry community, the root structure of the CC equations has been studied at a fundamental level with the goal of including homotopy continuation methods in the CC methodology. The first study on this topic dates back to 1978 when Živkovič and Monkhorst investigated the singularities and multiple solutions of the equations [67]. This was followed by mathematical and numerical studies of multiple solutions of the single-reference and state-universal multi-reference CC equations and their singularities and analytic properties in the early 1990s by Paldus and coworkers [54, 51]. In 1998, Kowalski and Jankowski revived the homotopy methods in connection with the CC theory and used them to solve the CC equations with doubles for a minimum-basis-set four-electron problem [42]. This was followed by a fruitful collaboration of Kowalski and Piecuch, who extended the application of the homotopy methods to the equations defining the CC approaches with singles and doubles (CCSD), singles, doubles, and triples (CCSDT), and singles, doubles, triples, and quadruples (CCSDTQ) [53], again using a four-electron system described by a minimum basis set as a target. They also introduced the formalism of β𝛽\betaitalic_β-nested equations and proved the Fundamental Theorem of the β𝛽\betaitalic_β-NE Formalism, which enabled them to explain the behavior of the curves connecting multiple solutions of the various CC polynomial systems, i.e., from CCSD to CCSDT, CCSDT to CCSDTQ, etc. In [43], Piecuch and Kowalski used homotopy methods to determine all solutions of nonlinear state-universal multireference CCSD equations based on the Jeziorski-Monkhorst ansatz, proving two theorems that provided an explanation for the observed intruder solution problem. In a sequel work [41], they used homotopy methods to obtain all solutions of the generalized Bloch equation, which is nonlinear even in a CI parametrization.
Despite these intensive investigations, the practical computational use of this approach has been restricted to only very small model systems, primarily because of two key reasons. Firstly, to effectively integrate computational algebraic methods with cutting-edge computational quantum chemistry, a substantial scientific divide must be bridged, one that involves advanced and abstract mathematical principles. Secondly, in the late 1980s and 1990s, the field of computational nonlinear algebra was in its infancy, presenting a pioneering yet challenging academic environment for advancements.

Recently, a novel computational shift adopting a fully algebraic geometry perspective of CC theory was established [26, 22]. This approach has demonstrated significant potential in reshaping our understanding of the CC theory [26, 22, 24, 10]. In preliminary works, the authors Faulstich, Oster, Strumfels, and Sverrisdóttir have demonstrated that the CC equations possess rich mathematical structures. By integrating these structures into the computational model, the authors were able to significantly reduce the computational scaling of algebro computational methods applied to the CC equations allowing the computation of all CC roots for small molecular systems [22].

The following chapter is outlined as follows. We begin with a brief review of the fundamental concepts underlying the homotopy continuation method in Sec. 5.1. We then discuss different bounds to the number of roots to the CC equations and introduce the crucial concept of truncation varieties in Sec. 5.2. In Section 5.3, we review the essential numerical discoveries yielded by this approach, providing a detailed analysis of its implications.

5.1 Homotopy continuation

Most algebro computational methods are built on the idea of homotopy continuation – the numerical approach established in [22] is no exception. The idea of homotopy continuation is simple: continuously transform a simple system of polynomials with known solutions into a more complex one and track the paths of these solutions. More formally, we consider the CC equations, written in the following form

fCC(𝐭)=[f1(𝐭)fm(𝐭)]=[f1(t1,,tm)fm(t1,,tm)]=0.subscript𝑓CC𝐭delimited-[]subscript𝑓1𝐭subscript𝑓𝑚𝐭delimited-[]subscript𝑓1subscript𝑡1subscript𝑡𝑚subscript𝑓𝑚subscript𝑡1subscript𝑡𝑚0f_{\rm CC}({\bf t})=\left[\begin{array}[]{c}f_{1}({\bf t})\\ \vdots\\ f_{m}({\bf t})\end{array}\right]=\left[\begin{array}[]{c}f_{1}(t_{1},...,t_{m}% )\\ \vdots\\ f_{m}(t_{1},...,t_{m})\end{array}\right]=0.italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) = [ start_ARRAY start_ROW start_CELL italic_f start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( bold_t ) end_CELL end_ROW start_ROW start_CELL ⋮ end_CELL end_ROW start_ROW start_CELL italic_f start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( bold_t ) end_CELL end_ROW end_ARRAY ] = [ start_ARRAY start_ROW start_CELL italic_f start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) end_CELL end_ROW start_ROW start_CELL ⋮ end_CELL end_ROW start_ROW start_CELL italic_f start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT ) end_CELL end_ROW end_ARRAY ] = 0 . (68)

This is our target system, i.e., the system we wish to solve. In a general case, we require the number of equations to be larger than the number of variables, however, the CC equations are a square system, i.e., we have as many equations as variables. In order to find all roots to the system in Eq. (68), we construct an auxiliary system of polynomial equations denoted g(𝐭)=0𝑔𝐭0g(\mathbf{t})=0italic_g ( bold_t ) = 0. For the construction of this system, two fundamental criteria must be met: firstly, its roots of g𝑔gitalic_g should be known, and secondly, the sytem g𝑔gitalic_g must have at least as many roots as the target system fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT. While meeting the first condition is relatively simple, the second condition poses a greater challenge, as accurately determining the number of roots in the CC equations is a hard problem, see Sec. 5.2. Having fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT and g𝑔gitalic_g, we define a family of systems H(𝐭,λ)𝐻𝐭𝜆H(\mathbf{t},\lambda)italic_H ( bold_t , italic_λ ) for λ𝜆\lambda\in\mathbb{R}italic_λ ∈ blackboard_R interpolating between fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT and g𝑔gitalic_g, i.e., H(𝐭,0)=fCC(𝐭)𝐻𝐭0subscript𝑓CC𝐭H(\mathbf{t},0)=f_{\rm CC}(\mathbf{t})italic_H ( bold_t , 0 ) = italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT ( bold_t ) and H(𝐭,1)=g(𝐭)𝐻𝐭1𝑔𝐭H(\mathbf{t},1)=g(\mathbf{t})italic_H ( bold_t , 1 ) = italic_g ( bold_t ). For the sake of illustration, we now consider one root 𝐬0subscript𝐬0\mathbf{s}_{0}bold_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT of g𝑔gitalic_g and restrict λ[0,1]𝜆01\lambda\in[0,1]italic_λ ∈ [ 0 , 1 ]. The condition H(𝐭,λ)=0𝐻𝐭𝜆0H(\mathbf{t},\lambda)=0italic_H ( bold_t , italic_λ ) = 0 then defines a solution path 𝐭(λ)m𝐭𝜆superscript𝑚\mathbf{t}(\lambda)\subset\mathbb{C}^{m}bold_t ( italic_λ ) ⊂ blackboard_C start_POSTSUPERSCRIPT italic_m end_POSTSUPERSCRIPT such that H(𝐭(λ),λ)=0𝐻𝐭𝜆𝜆0H(\mathbf{t}(\lambda),\lambda)=0italic_H ( bold_t ( italic_λ ) , italic_λ ) = 0 for λ[0,1]𝜆01\lambda\in[0,1]italic_λ ∈ [ 0 , 1 ] and 𝐭(1)=𝐬0𝐭1subscript𝐬0\mathbf{t}(1)=\mathbf{s}_{0}bold_t ( 1 ) = bold_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. Numerically, this path is followed from λ=1𝜆1\lambda=1italic_λ = 1 to λ=0𝜆0\lambda=0italic_λ = 0 in order to compute one solution 𝐭0=𝐭(0)subscript𝐭0𝐭0\mathbf{t}_{0}=\mathbf{t}(0)bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT = bold_t ( 0 ) to the target system fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT. This procedure is equivalent to solving the initial value problem

𝐭H(𝐭,λ)(ddλ𝐭(λ))+λH(𝐭,λ)=0,𝐭(1)=𝐬0,formulae-sequence𝐭𝐻𝐭𝜆dd𝜆𝐭𝜆𝜆𝐻𝐭𝜆0𝐭1subscript𝐬0\frac{\partial}{\partial\mathbf{t}}H(\mathbf{t},\lambda)\left(\frac{\mathrm{d}% }{\mathrm{d}\lambda}\mathbf{t}(\lambda)\right)+\frac{\partial}{\partial\lambda% }H(\mathbf{t},\lambda)=0,\quad\mathbf{t}(1)=\mathbf{s}_{0},divide start_ARG ∂ end_ARG start_ARG ∂ bold_t end_ARG italic_H ( bold_t , italic_λ ) ( divide start_ARG roman_d end_ARG start_ARG roman_d italic_λ end_ARG bold_t ( italic_λ ) ) + divide start_ARG ∂ end_ARG start_ARG ∂ italic_λ end_ARG italic_H ( bold_t , italic_λ ) = 0 , bold_t ( 1 ) = bold_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ,

which is known as the Davidenko differential equation [20, 21]. We say that 𝐭(1)=𝐬0𝐭1subscript𝐬0\mathbf{t}(1)=\mathbf{s}_{0}bold_t ( 1 ) = bold_s start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT gets tracked towards 𝐭(0)𝐭0\mathbf{t}(0)bold_t ( 0 ). For this to work, 𝐭(λ)𝐭𝜆\mathbf{t}(\lambda)bold_t ( italic_λ ) must be a regular zero of H(𝐭,λ)=0𝐻𝐭𝜆0H(\mathbf{t},\lambda)=0italic_H ( bold_t , italic_λ ) = 0 for every λ(0,1]𝜆01\lambda\in(0,1]italic_λ ∈ ( 0 , 1 ]. In the case of nonregular solutions at λ=0𝜆0\lambda=0italic_λ = 0 endgames are employed which are special numerical methods [50].

Refer to caption
Figure 5: A sketch of possible homotopy paths. The solid line shows a path with no finite limit as λ0𝜆0\lambda\to 0italic_λ → 0, the dashed lines have the same limit, and the dotted-dashed line has a unique limit.

In analyzing the solution paths traced by the homotopy, as illustrated in Fig. 5, various scenarios may arise [6]. One path, represented by the solid line, diverges to infinity as λ0𝜆0\lambda\to 0italic_λ → 0. In contrast, the other three paths converge to finite limits. The path indicated by a dotted-dashed line uniquely converges to a regular zero of the target system fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT at λ=0𝜆0\lambda=0italic_λ = 0. Meanwhile, the two paths denoted by dashed lines converge to a common limit, corresponding to an isolated zero of fCCsubscript𝑓CCf_{\rm CC}italic_f start_POSTSUBSCRIPT roman_CC end_POSTSUBSCRIPT with a multiplicity of two. Mathematically, homotopy continuation methods are well studied, we refer the interested reader to [6, 29, 49, 61], and for a quantum chemistry perspective see [24].

5.2 Bounding the number of roots

As becomes apparent from Sec. 5.1, knowing the precise count of roots, or at least a close upper bound, is crucial for the effective application of homotopy methods. This number dictates the number of roots in the auxiliary system g𝑔gitalic_g and therewith determines the number of paths to be numerically tracked. Due to the high dimensionality, this turns out to be particularly challenging in the case of CC theory. Subsequently, we denote

CCdegN,NB(σ)subscriptCCdeg𝑁subscript𝑁𝐵𝜎{\rm CCdeg}_{N,N_{B}}(\sigma)roman_CCdeg start_POSTSUBSCRIPT italic_N , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_σ ) (69)

the true number of roots to the CC equations for a system of N𝑁Nitalic_N electrons discretized in NBsubscript𝑁𝐵N_{B}italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT spin orbitals imposing the CC truncation level σ𝜎\sigmaitalic_σ, where e.g. σ={1,2}𝜎12\sigma=\{1,2\}italic_σ = { 1 , 2 } stands for CCSD, σ={2}𝜎2\sigma=\{2\}italic_σ = { 2 } stands for CCD, σ={1,2,3}𝜎123\sigma=\{1,2,3\}italic_σ = { 1 , 2 , 3 } stands for CCSDT, etc.

In order to establish a bound to the number of roots to the CC equations (47), one can start with the simplest estimate for the number of roots in a polynomial system, namely, the Bézout number. The Bézout number is simply the product of the degrees of the individual polynomial equations. In the case of CCSD, this yields

CCdegN,NB({1,2})3ns4ndsubscriptCCdeg𝑁subscript𝑁𝐵12superscript3subscript𝑛𝑠superscript4subscript𝑛𝑑{\rm CCdeg}_{N,N_{B}}(\{1,2\})\leq 3^{n_{s}}4^{n_{d}}roman_CCdeg start_POSTSUBSCRIPT italic_N , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( { 1 , 2 } ) ≤ 3 start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT end_POSTSUPERSCRIPT 4 start_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT end_POSTSUPERSCRIPT (70)

where ns=N(NBN)subscript𝑛𝑠𝑁subscript𝑁𝐵𝑁n_{s}=N(N_{B}-N)italic_n start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT = italic_N ( italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT - italic_N ) is the number of singles equations and nd=(N1)N(NBN1)(NBN)subscript𝑛𝑑𝑁1𝑁subscript𝑁𝐵𝑁1subscript𝑁𝐵𝑁n_{d}=(N-1)N(N_{B}-N-1)(N_{B}-N)italic_n start_POSTSUBSCRIPT italic_d end_POSTSUBSCRIPT = ( italic_N - 1 ) italic_N ( italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT - italic_N - 1 ) ( italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT - italic_N ) is the number of doubles equations, see e.g. [53]. The Bézout number often greatly overestimates the actual number of roots, as seen in the CC equations [26]. For the effective use of homotopy methods, however, it is essential to have precise and accurate estimates of the number of roots.

One potential way to improve this bound is by means of the Bernstein-Khovanskii-Kushnirenko (BKK) theorem [8, 40, 16]. The BKK theorem provides a way to estimate the maximum number of solutions that a system of polynomial equations can have, based on the geometric properties of the equations’ coefficients. More precisely, it states that for a system of polynomial equations, the number of isolated solutions in the complex domain is bounded by the mixed volume of the Newton polytopes corresponding to the polynomials. In order to apply this theorem to CC theory, one must investigate the CC Newton polytopes and establish a way to compute or at least bound their mixed volume. This direction was explored in [26].

Another auspicious direction is the use of truncation varieties. This provides significantly improved bounds to the number of CC roots, see [22] and Fig. 8. The truncation varieties are algebraic varieties specific to CC theory. In general, an algebraic variety is a set of solutions to one or more algebraic equations, typically defined in a higher-dimensional space, where these solutions form a geometric shape or structure. In the context of CC theory, there are several varieties, that appear. Consider the exponential parametrization

exp:𝒱int;𝐭|Ψ=exp(𝐭)|Ψ0=|Ψ0+n=1N1n!Tn|Ψ0,:formulae-sequence𝒱subscriptintmaps-to𝐭ketΨ𝐭ketsubscriptΨ0ketsubscriptΨ0superscriptsubscript𝑛1𝑁1𝑛superscript𝑇𝑛ketsubscriptΨ0\exp~{}:~{}\mathcal{V}\to\mathcal{H}_{\rm int}~{};~{}{\bf t}\mapsto|\Psi% \rangle=\exp({\bf t})|\Psi_{0}\rangle=|\Psi_{0}\rangle+\sum_{n=1}^{N}\frac{1}{% n!}T^{n}|\Psi_{0}\rangle,roman_exp : caligraphic_V → caligraphic_H start_POSTSUBSCRIPT roman_int end_POSTSUBSCRIPT ; bold_t ↦ | roman_Ψ ⟩ = roman_exp ( start_ARG bold_t end_ARG ) | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ = | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ + ∑ start_POSTSUBSCRIPT italic_n = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT divide start_ARG 1 end_ARG start_ARG italic_n ! end_ARG italic_T start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | roman_Ψ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ⟩ , (71)

where 𝒱𝒱\mathcal{V}caligraphic_V denotes the vector space of CC amplitudes. Note that Eq. (71) defines a set of algebraic equations. Imposing a certain level of truncation corresponds to restricting this map to a subspace of amplitudes 𝒱σ𝒱subscript𝒱𝜎𝒱\mathcal{V}_{\sigma}\subseteq\mathcal{V}caligraphic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT ⊆ caligraphic_V, where σ𝜎\sigmaitalic_σ denotes the respective level of truncation as defined above. We define the truncation variety Vσsubscript𝑉𝜎V_{\sigma}italic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT as the closure of the image of the exponential map of 𝒱σsubscript𝒱𝜎\mathcal{V}_{\sigma}caligraphic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT. Since the exponential parametrization is invertible, the dimension of the variety Vσsubscript𝑉𝜎V_{\sigma}italic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT is the dimension of 𝒱σsubscript𝒱𝜎\mathcal{V}_{\sigma}caligraphic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT. The truncation varieties exhibit numerous mathematical properties, as elucidated in [22], which collectively lead to the bound

CCdegN,NB(σ)(dim(Vσ)+1)deg(Vσ),subscriptCCdeg𝑁subscript𝑁𝐵𝜎dimsubscript𝑉𝜎1degsubscript𝑉𝜎{\rm CCdeg}_{N,N_{B}}(\sigma)\leq\bigl{(}{\rm dim}(V_{\sigma})+1\bigr{)}\,{\rm deg% }(V_{\sigma}),roman_CCdeg start_POSTSUBSCRIPT italic_N , italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_σ ) ≤ ( roman_dim ( italic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT ) + 1 ) roman_deg ( italic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT ) , (72)

where deg(Vσ)degsubscript𝑉𝜎{\rm deg}(V_{\sigma})roman_deg ( italic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT ) is the degree of the truncation variety Vσsubscript𝑉𝜎V_{\sigma}italic_V start_POSTSUBSCRIPT italic_σ end_POSTSUBSCRIPT, which is an intrinsic quantity providing information about the variety’s geometric and algebraic properties. In general, the degree of a variety in algebraic geometry refers to a measure of its complexity. It is typically defined as the number of intersections that the variety has with a general linear space of complementary dimension. In simpler terms, it is the number of points at which a linear space will intersect the variety, assuming it intersects it in the maximum possible number of points [17]. Computing the exact degree for a given truncation variety – or at least a sufficiently good bound to it – is the subject of current investigations.

Undertaking a formal comparison between the bounds presented in Eq. 70 and Eq. (72) is challenging given the fundamentally distinct nature of the underlying concepts involved. Despite this, a numerical comparison reveals that the bound in Eq. (72) provides a significant improvement over the previously established bounds, see Sec. 5.3.

5.3 Numerical results

We begin this section by taking a closer look at the convergence behavior of (quasi) Newton-type methods applied to the CC equations (47). Since the dimensionality of the amplitude space grows rapidly, it is not possible to visualize the corresponding Newton fractal. However, we can obtain an idea of the size of the basin of attraction around one root, i.e., a ball around one solution within (quasi) Newton-type methods commonly converge to the solution at its center. To that end, we consider a variant of the H44{}_{4}start_FLOATSUBSCRIPT 4 end_FLOATSUBSCRIPT model consisting of four hydrogen atoms symmetrically distributed on a circle of radius R=1.738𝑅1.738R=1.738italic_R = 1.738 Å [62, 12] discretized in the STO-3G basis set, see Fig. 6.

Refer to caption
Figure 6: Depiction of the H44{}_{4}start_FLOATSUBSCRIPT 4 end_FLOATSUBSCRIPT model undergoing a symmetric disturbance on a circle modeled by the angle ΘΘ\Thetaroman_Θ.

For Θ=90Θsuperscript90\Theta=90^{\circ}roman_Θ = 90 start_POSTSUPERSCRIPT ∘ end_POSTSUPERSCRIPT we obtain a CC solution 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT by initializing the (quasi) Newton-type method with zero. Adding a random perturbation 𝐭psubscript𝐭𝑝{\bf t}_{p}bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT to this solution provides a different initialization 𝐭init=𝐭0+𝐭psubscript𝐭initsubscript𝐭0subscript𝐭𝑝{\bf t}_{\rm init}={\bf t}_{0}+{\bf t}_{p}bold_t start_POSTSUBSCRIPT roman_init end_POSTSUBSCRIPT = bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT for the CC computations. Scaling the size of 𝐭psubscript𝐭𝑝{\bf t}_{p}bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT (i.e., 𝐭pnormsubscript𝐭𝑝\|{\bf t}_{p}\|∥ bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ∥,) allows us to (approximately) investigate the basin of attraction. Clearly, comparing with Fig. 4, we expect that the region for which Newton’s method converges to 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT will not be circular. However, this investigation yields the ballpark for the local basin of convergence, since we can extract the radius of the largest ball rmaxsubscript𝑟maxr_{\rm max}italic_r start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT in which (quasi) Newton-type methods converge to the solution 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT in 99.9%percent99.999.9\%99.9 % of the cases, see Fig. 7. We moreover plot the success rate of Newton’s method, i.e., how many of the randomly perturbed initializations converged toward 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, as a function of the size of 𝐭psubscript𝐭𝑝{\bf t}_{p}bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT. Note that we measure 𝐭pnormsubscript𝐭𝑝\|{\bf t}_{p}\|∥ bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ∥ relative to the size of 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, in particular, the initialization zero lies on the boundary of 𝐭p=1normsubscript𝐭𝑝1\|{\bf t}_{p}\|=1∥ bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ∥ = 1, see Fig. 7.

Refer to caption
Refer to caption
Figure 7: Left: The convergence success of 1000 simulations initializing CCSD with the optimal solution 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT plus a random perturbation 𝐭psubscript𝐭𝑝{\bf t}_{p}bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT of size α=𝐭p/𝐭0𝛼normsubscript𝐭𝑝normsubscript𝐭0\alpha=\|{\bf t}_{p}\|/\|{\bf t}_{0}\|italic_α = ∥ bold_t start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ∥ / ∥ bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥. The red dotted line indicates rmaxsubscript𝑟maxr_{\rm max}italic_r start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT. The blue dotted line indicates the success rate within a ball around 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT of radius 𝐭0normsubscript𝐭0\|{\bf t}_{0}\|∥ bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥.
Right: Schematic representation of the different regions of convergences around 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT.

This shows that rmaxsubscript𝑟maxr_{\rm max}italic_r start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT is approximately 0.2𝐭00.2normsubscript𝐭00.2\;\|{\bf t}_{0}\|0.2 ∥ bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥. Moreover, this shows that beyond this point convergence towards 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is by no means guaranteed. In fact, for an arbitrary initialization that is 𝐭0normsubscript𝐭0\|{\bf t}_{0}\|∥ bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ∥ away from 𝐭0subscript𝐭0{\bf t}_{0}bold_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, the success rate is only 27%percent2727\%27 %. Being oblivious about the physical motivation of this initial guess, one could argue that it is quite surprising that Newton’s method converges for the initial guess zero.

We now compare the new bound to the CC roots derived in [22] with the existing bounds reported in e.g. [53]. To that end, we compute the roots corresponding to CCS and CCD for two-electron systems, i.e., N=2𝑁2N=2italic_N = 2, for different numbers of spin orbitals NBsubscript𝑁𝐵N_{B}italic_N start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT. This shows that using the truncation varieties and their profound mathematical structures dramatically improved the bounds to the CC roots, see Fig. 8.

Refer to caption
Figure 8: Bounds to the number of roots for 2 electron systems for a varying number of spin orbitals using the CCS method (left) and CCD method (right). For more details see [22].

This reduction in the bounds together with the incorporation of truncation varieties in the computational procedure allowed for severe numerical advancements enabling the computation of the full root structure for true molecular systems like lithium hydride (see Fig. 9[22] using CCD.

Refer to caption
Figure 9: Comparison of energy spectra obtained via FCI and CCD for lithium hydride [22].

We emphasize that these advances are far from a straightforward application of off-the-shelf computational algebra tools. Instead, they result from a sophisticated combination of multiple techniques, underscoring the complexity and innovation of the approach. The general computational procedure comprises two major steps:

1. The set-up of an initial system from which the homotopy continuation starts. This initial system is specific for the number of electrons, the number of spin orbitals employed for the discretion of the Hamiltonian, and the used CC truncation level σ𝜎\sigmaitalic_σ as defined in Sec. 5.2. We emphasize that in this implementation, the initial system can be reused when computing CC solutions at the truncation level σ𝜎\sigmaitalic_σ for systems with the same number of electrons and basis functions.

2. Once the initial system for a target system configuration is set up, we employ a parametric homotopy approach as described in Sec. 5.1 that connects the initial system with the targeted system.

6 Conclusion

This article provides a self-contained educational review of the latest mathematical developments in coupled cluster (CC) theory from a computational chemistry perspective. To that end, we started this review article with a foundational introduction to CC theory, employing an algebraic approach. This particular formulation offers a rigorous and mathematically elegant framework, thereby facilitating a deeper understanding of the underlying principles. Additionally, in an effort to ensure comprehensive coverage and to augment the article’s self-contained nature, we have incorporated a detailed analysis of the matrix structures that emerge within the realm of second quantization. This includes an exploration of their theoretical underpinnings and practical implications in computational chemistry, providing valuable context and enhancing the overall utility of this review for researchers in the field.

We then explore a variety of analytical frameworks and methods used in CC theory, with a focus on their contributions to establishing local existence and uniqueness of the CC solutions. We delve into the local analysis based on Zarantonello’s Lemma, a technique pioneered by Schneider [60], which has significantly influenced the field by its application in various CC methods, including the continuous single-reference CC method [56, 57], the extended CC method [45], and the tailored CC ansatz [25]. Further, we explore the graph-based framework for CC methods developed by Csirik and Laestadius [18, 19]. This section highlights the versatility of the framework and its utility in comparing various CC methods, encompassing even multireference approaches. We then delved into the latest numerical analysis results analyzing the single reference CC method developed by Hassan, Maday, and Wang. This segment decodes the complex ansatz from a computational chemistry viewpoint and encapsulates key findings from their research presented in [34, 33], offering readers a comprehensive understanding of this cutting-edge area in CC theory.

Furthermore, our review extends to the algebraic geometry approach within CC theory. This unique perspective not only illuminates the intricate root structure inherent in CC equations but also paves the way for novel computational paradigms. These emerging methodologies have the potential to form the cornerstone of future CC computational strategies. In our discussion, we delve into the overarching principles of the algebraic approach and incorporate an overview of the most recent numerical advancements that have been made in this area [25, 22].

Acknowledgments

The author is thankful for useful discussions with Andre Leastadius, Mihály Csirik, Muhammad Hassan, and Svala Sverrisdóttir.

References

  • [1] J. S. Arponen, “Independent-cluster methods as mappings of quantum theory into classical mechanics,” Theoretica chimica acta, vol. 80, no. 2-3, pp. 149–179, 1991.
  • [2] J.-P. Aubin, Applied functional analysis.   John Wiley & Sons, 2011.
  • [3] W. Bangerth and R. Rannacher, Adaptive finite element methods for differential equations.   Springer Science & Business Media, 2003.
  • [4] R. Bartlett, “Theory and applications of computational chemistry: the first forty years,” Dykstra, CE, Frenking, G., Kim, KS, Scuseria, GE, Eds, pp. 1191–1221, 2005.
  • [5] R. J. Bartlett and M. Musiał, “Coupled-cluster theory in quantum chemistry,” Reviews of Modern Physics, vol. 79, no. 1, p. 291, 2007.
  • [6] D. J. Bates, P. Breiding, T. Chen, J. D. Hauenstein, A. Leykin, and F. Sottile, “Numerical nonlinear algebra,” ArXiv preprint arXiv:2302.08585, 2023.
  • [7] D. J. Bates, J. D. Hauenstein, A. J. Sommese, and C. W. Wampler, “Bertini: Software for numerical algebraic geometry,” 2006.
  • [8] D. N. Bernshtein, “The number of roots of a system of equations,” Funct. Anal. Appl., vol. 9, no. 3, pp. 183–185, Jul 1975.
  • [9] R. Bishop, “An overview of coupled cluster theory and its applications in physics,” Theoretica chimica acta, vol. 80, no. 2-3, pp. 95–148, 1991.
  • [10] V. Borovik, B. Sturmfels, and S. Sverrisdóttir, “Coupled cluster degree of the grassmannian,” arXiv:2310.15474, 2023.
  • [11] P. Breiding and S. Timme, “Homotopycontinuation. jl: A package for homotopy continuation in julia,” in Mathematical Software–ICMS 2018: 6th International Conference, South Bend, IN, USA, July 24-27, 2018, Proceedings 6.   Springer, 2018, pp. 458–465.
  • [12] I. W. Bulik, T. M. Henderson, and G. E. Scuseria, “Can single-reference coupled cluster theory describe static correlation?” Journal of chemical theory and computation, vol. 11, no. 7, pp. 3171–3179, 2015.
  • [13] T. Chen, T.-L. Lee, and T.-Y. Li, “Hom4ps-3: a parallel numerical solver for systems of polynomial equations based on polyhedral homotopy continuation methods,” in Mathematical Software–ICMS 2014: 4th International Congress, Seoul, South Korea, August 5-9, 2014. Proceedings 4.   Springer, 2014, pp. 183–190.
  • [14] J. Čížek, “On the correlation problem in atomic and molecular systems. calculation of wavefunction components in ursell-type expansion using quantum-field theoretical methods,” The Journal of Chemical Physics, vol. 45, no. 11, pp. 4256–4266, 1966.
  • [15] F. Coester, “Bound states of a many-particle system,” Nuclear Physics, vol. 7, pp. 421–424, 1958.
  • [16] D. Cox, J. Little, and D. O’Shea, Using Algebraic Geometry, ser. Graduate Texts in Mathematics.   Springer New York, 1991.
  • [17] D. Cox, J. Little, and D. OShea, Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra.   Springer Science & Business Media, 2013.
  • [18] M. A. Csirik and A. Laestadius, “Coupled-cluster theory revisited-part i: Discretization,” ESAIM: Math. Model. Numer. Anal., vol. 57, no. 2, pp. 645–670, 2023.
  • [19] ——, “Coupled-cluster theory revisited-part ii: Analysis of the single-reference coupled-cluster equations,” ESAIM: Mathematical Modelling and Numerical Analysis, vol. 57, no. 2, pp. 545–583, 2023.
  • [20] D. Davidenko, “On a new method of numerical solution of systems of nonlinear equations,” in Proceedings of the USSR Academy of Sciences, vol. 88, no. 4, 1953, pp. 601–602.
  • [21] ——, “On the approximate solution of systems of nonlinear equations,” Ukrainian Mathematical Journal, vol. 5, no. 2, pp. 196–206, 1953.
  • [22] F. Faulstich, B. Sturmfels, and S. Sverrisdóttir, “Algebraic varieties in quantum chemistry,” arXiv:2308.05258, 2023.
  • [23] F. M. Faulstich, H. E. Kristiansen, M. A. Csirik, S. Kvaal, T. B. Pedersen, and A. Laestadius, “S-diagnostic—an a posteriori error assessment for single-reference coupled-cluster methods,” The Journal of Physical Chemistry A, vol. 127, no. 43, pp. 9106–9120, 2023.
  • [24] F. M. Faulstich and A. Laestadius, “Homotopy continuation methods for coupled-cluster theory in quantum chemistry,” Molecular Physics, vol. 0, p. e2258599, 2023.
  • [25] F. M. Faulstich, A. Laestadius, Ö. Legeza, R. Schneider, and S. Kvaal, “Analysis of the tailored coupled-cluster method in quantum chemistry,” SIAM J. Numer. Anal., vol. 57, no. 6, pp. 2579–2607, 2019.
  • [26] F. M. Faulstich and M. Oster, “Coupled cluster theory: Towards an algebraic geometry formulation,” arXiv:2211.10389 [In press: SIAM Journal on Applied Algebra and Geometry], 2022.
  • [27] F. M. Faulstich, “Mathematical aspects of coupled-cluster theory in chemistry,” 2020.
  • [28] D. H. Fremlin, Measure theory.   Torres Fremlin, 2000, vol. 4.
  • [29] C. B. Garcia and W. I. Zangwill, “Finding all solutions to polynomial systems and other systems of equations,” Mathematical Programming, vol. 16, no. 1, pp. 159–176, 1979.
  • [30] E. Giner, D. P. Tew, Y. Garniron, and A. Alavi, “Interplay between electronic correlation and metal–ligand delocalization in the spectroscopy of transition metal compounds: Case study on a series of planar cu2+ complexes,” Journal of Chemical Theory and Computation, vol. 14, no. 12, pp. 6240–6252, 2018.
  • [31] W. Hackbusch, Elliptic differential equations: theory and numerical treatment.   Springer, 2017, vol. 18.
  • [32] B. C. Hall, Lie groups, Lie algebras, and representations.   Springer, 2013.
  • [33] M. Hassan, Y. Maday, and Y. Wang, “Analysis of the single reference coupled cluster method for electronic structure calculations: The discrete coupled cluster equations,” arXiv preprint arXiv:2311.00637, 2023.
  • [34] ——, “Analysis of the single reference coupled cluster method for electronic structure calculations: the full-coupled cluster equations,” Numerische Mathematik, pp. 1–53, 2023.
  • [35] T. Helgaker, P. Jorgensen, and J. Olsen, Molecular electronic-structure theory.   John Wiley & Sons, 2014.
  • [36] J. Hubbard, “The description of collective motions in terms of many-body perturbation theory,” Proceedings of the Royal Society A, vol. 240, no. 1223, pp. 539–560, 1957.
  • [37] J. Hubbard, D. Schleicher, and S. Sutherland, “How to find all roots of complex polynomials by newton’s method,” Inventiones mathematicae, vol. 146, no. 1, p. 1, 2001.
  • [38] N. Hugenholtz, “Perturbation approach to the fermi gas model of heavy nuclei,” Physica, vol. 23, no. 1-5, pp. 533–545, 1957.
  • [39] A. A. Kirillov, An introduction to Lie groups and Lie algebras.   Cambridge University Press, 2008, vol. 113.
  • [40] A. G. Kouchnirenko, “Polyèdres de newton et nombres de milnor,” Invent. Math., vol. 32, no. 1, pp. 1–31, Feb 1976.
  • [41] K. Kowalski and P. Piecuch, “Complete set of solutions of the generalized bloch equation,” Int. J. Quantum Chem., vol. 80, no. 4-5, pp. 757–781, 2000.
  • [42] K. Kowalski and K. Jankowski, “Towards complete solutions to systems of nonlinear equations of many-electron theories,” Phys. Rev. Lett., vol. 81, no. 6, p. 1195, 1998.
  • [43] K. Kowalski and P. Piecuch, “Complete set of solutions of multireference coupled-cluster equations: The state-universal formalism,” Phys. Rev. A, vol. 61, no. 5, p. 052506, 2000.
  • [44] A. Laestadius and F. M. Faulstich, “The coupled-cluster formalism–a mathematical perspective,” Molecular Physics, vol. 117, no. 17, pp. 2362–2373, 2019.
  • [45] A. Laestadius and S. Kvaal, “Analysis of the extended coupled-cluster method in quantum chemistry,” SIAM J. Numer. Anal., vol. 56, no. 2, pp. 660–683, 2018.
  • [46] T.-L. Lee, T.-Y. Li, and C.-H. Tsai, “Hom4ps-2.0: a software package for solving polynomial systems by the polyhedral homotopy continuation method,” Computing, vol. 83, pp. 109–133, 2008.
  • [47] G. Leoni, A first course in Sobolev spaces.   American Mathematical Soc., 2017.
  • [48] M. Michałek and B. Sturmfels, Invitation to nonlinear algebra.   American Mathematical Soc., 2021, vol. 211.
  • [49] A. Morgan, Solving polynomial systems using continuation for engineering and scientific problems.   SIAM, 2009.
  • [50] A. P. Morgan, A. J. Sommese, and C. W. Wampler, “Computing singular solutions to polynomial systems,” Advances in Applied Mathematics, vol. 13, no. 3, pp. 305–327, 1992.
  • [51] J. Paldus, P. Piecuch, L. Pylypow, and B. Jeziorski, “Application of hilbert-space coupled-cluster theory to simple (H22{}_{2}start_FLOATSUBSCRIPT 2 end_FLOATSUBSCRIPT)22{}_{2}start_FLOATSUBSCRIPT 2 end_FLOATSUBSCRIPT model systems: Planar models,” Phys. Rev. A, vol. 47, no. 4, p. 2738, 1993.
  • [52] J. Paldus, “The beginnings of coupled-cluster theory: an eyewitness account,” in Theory and Applications of Computational Chemistry.   Elsevier, 2005, pp. 115–147.
  • [53] P. Piecuch and K. Kowalski, “In search of the relationship between multiple solutions characterizing coupled-cluster theories,” in Computational chemistry: reviews of current trends, J. Leszczynski, Ed.   World Scientific, 2000, vol. 5, pp. 1–104.
  • [54] P. Piecuch, S. Zarrabian, J. Paldus, and J. Čížek, “Coupled-cluster approaches with an approximate account of triexcitations and the optimized-inner-projection technique. ii. coupled-cluster results for cyclic-polyene model systems,” Phys. Rev. B, vol. 42, no. 6, p. 3351, 1990.
  • [55] T. Rohwedder, “An analysis for some methods and algorithms of quantum chemistry,” 2010.
  • [56] ——, “The continuous coupled cluster formulation for the electronic schrödinger equation,” ESAIM: Math. Model. Numer. Anal., vol. 47, no. 2, pp. 421–447, 2013.
  • [57] T. Rohwedder and R. Schneider, “Error estimates for the coupled cluster method,” ESAIM: Math. Model. Numer. Anal., vol. 47, no. 6, pp. 1553–1582, 2013.
  • [58] S. A. Sauter and C. Schwab, Boundary element methods.   Springer, 2011.
  • [59] D. Schleicher, “On the number of iterations of newton’s method for complex polynomials,” Ergodic Theory and Dynamical Systems, vol. 22, no. 3, pp. 935–945, 2002.
  • [60] R. Schneider, “Analysis of the projected coupled cluster method in electronic structure calculation,” Numer. Math., vol. 113, no. 3, pp. 433–471, 2009.
  • [61] A. J. Sommese, C. W. Wampler et al., The Numerical solution of systems of polynomials arising in engineering and science.   World Scientific, 2005.
  • [62] T. Van Voorhis and M. Head-Gordon, “Benchmark variational coupled cluster doubles results,” J. Chem. Phys., vol. 113, no. 20, pp. 8873–8879, 2000.
  • [63] J. Verschelde, “Algorithm 795: Phcpack: A general-purpose solver for polynomial systems by homotopy continuation,” ACM Transactions on Mathematical Software (TOMS), vol. 25, no. 2, pp. 251–276, 1999.
  • [64] H. Yserentant, “On the regularity of the electronic schrödinger equation in hilbert spaces of mixed derivatives,” Numerische Mathematik, vol. 98, pp. 731–759, 2004.
  • [65] ——, Regularity and approximability of electronic wave functions.   Springer, 2010.
  • [66] E. Zeidler, Nonlinear functional analysis and its applications: II/B: nonlinear monotone operators.   Springer Science & Business Media, 2013.
  • [67] T. P. Živković and H. J. Monkhorst, “Analytic connection between configuration–interaction and coupled-cluster solutions,” J. Math. Phys., vol. 19, no. 5, pp. 1007–1022, 1978.