Recent mathematical advances in coupled cluster theory

Fabian M. Faulstich, Department of Mathematical Sciences, Rensselaer Polytechnic Institute, NY, 12180, USA

Abstract

This article presents an in-depth educational overview of the latest mathematical developments in coupled cluster (CC) theory, beginning with Schneider’s seminal work from 2009 that introduced the first local analysis of CC theory. We offer a tutorial review of second quantization and the CC ansatz, laying the groundwork for understanding the mathematical basis of the theory. This is followed by a detailed exploration of the most recent mathematical advancements in CC theory. Our review starts with an in-depth look at the local analysis pioneered by Schneider which has since been applied to analyze various CC methods. We then move on to discuss the graph-based framework for CC methods developed by Csirik and Laestadius. This framework provides a comprehensive platform for comparing different CC methods, including multireference approaches. Next, we delve into the latest numerical analysis results analyzing the single reference CC method developed by Hassan, Maday, and Wang. This very general approach is based on the invertibility of the CC function’s Fréchet derivative. We conclude the article with a discussion on the recent incorporation of algebraic geometry into CC theory, highlighting how this novel and fundamentally different mathematical perspective has furthered our understanding and provides exciting pathways to new computational approaches.

1 Introduction

Coupled-cluster (CC) theory is a widely acclaimed, high-precision wave function approach that is used in computational quantum many-body theory and is of great interest to both practitioners as well as theoreticians [5]. The origin of CC theory dates back to 1958 when Coester proposed to use an exponential parametrization of the wave function [15]. This parametrization was independently derived by Hubbard [36] and Hugenholtz [38] in 1957 as an alternative to summing many-body perturbation theory contributions order by order. A milestone of CC theory is the work by Čížek from 1966 [14]. In this work, Čížek discussed the foundational concepts of second quantization (as applied to many-fermion systems), normal ordering, contractions, Wick’s theorem, normal-ordered Hamiltonians (which was a novelty at that time), Goldstone-style diagrammatic techniques, and the origin of the exponential wave function ansatz. He moreover derived the connected cluster form of the Schrödinger equation and proposed a general recipe for how to obtain the energy and amplitude equations through projections of the connected cluster form of the Schrödinger equation on the reference and excited determinants, which was illustrated using the CC doubles (CCD) approximation. This work also reported the very first CC computations, using full and linearized forms of CCD, for nitrogen (treated fully at the ab initio level) and benzene (treated with a PPP model Hamiltonian).

In this article, we review the most recent mathematical advances in CC theory from a computational chemistry perspective. Our objective is to explain the various mathematical frameworks, their goals, and their outcomes in a manner that is accessible to a wide computational chemistry audience, including readers who may not have the extensive background in advanced mathematics typically required to engage with the original research articles. Our goal is to render these mathematical results not only understandable but also directly applicable and relevant for practitioners and researchers in computational chemistry.

While this article centers on the mathematical developments in CC theory post-2009, following the landmark work by Schneider, we recognize that there were significant contributions and advancements in the field before this date. However, our focus remains on the period after 2009. Providing a full account of the rich history of CC theory and the mathematical advances therein is beyond the scope of this article; the interested reader is referred to the articles by Bartlett [4], Paldus [52], Arponen [1], and Bishop [9], and the references therein, which provide insight into the history and development of CC theory.

The remainder of this article is organized as follows: In Sec. 2, we provide a brief review of the mathematical matrix structures that arise in the second quantized framework. In Sec. 3, we introduce the CC ansatz using an algebraic formulation. In Sec. 4, we review the mathematical results established by employing a local strong monotonicity approach (Sec. 4.1), the excitation graph approach (Sec. 4.2), and the inf-sup condition approach (Sec. 4.3). In Sec. 5, we elaborate on the root structure of the CC equations and review the advances made along this line by employing an algebraic geometry perspective.

2 Brief review of second quantization

In this section, we review the second quantization framework with a slight mathematical twist. Our aim is to resolve any ambiguities surrounding concepts that have been a potential subject of debate within either the mathematical or chemical community. Considering an $N$ electron system, we denote by $\mathcal{B}$ , with $|\mathcal{B}|=N_{B}\gg N$ , the set of molecular orbitals, consisting of $L^{2}$ -orthonormal functions, i.e.,

\langle\xi_{i},\xi_{j}\rangle_{L^{2}(X)}=\int_{X}\xi_{i}^{*}(x)\xi_{j}(x)d\lambda(x)=\delta_{i,j}\qquad\forall~{}1\leq i,j\leq N_{B},

(1)

where $X=\mathbb{R}^{3}\times\{\pm 1/2\}$ and $d\lambda(x)$ denotes the corresponding integration measure [28]. Mathematically, $d\lambda(x)$ is a product measure that combines spatial and spin integration; it can also be written as

\int_{X}f(\mathbf{x})d\lambda(\mathbf{x})=\sum_{s\in\{\pm 1/2\}}\int_{\mathbb{% R}^{3}}f(\mathbf{r},s)d{\bf r},

(2)

where $\mathbf{r}\in\mathbb{R}^{3}$ denotes the spatial degree of freedom and $s\in\{\pm 1/2\}$ is the spin degree of freedom. We moreover assume that the functions in $\mathcal{B}$ are sufficiently smooth, allowing us to take all required derivatives. Note that in computations that use Gaussian-type orbitals, this is always the case. Mathematically, the largest space (i.e., the most general space) from which we can choose $\mathcal{B}$ is the Sobolev space $H^{1}(X)$ [47, 2]; however, for the sake of simplicity, one can assume twice continuously differentiable and $L^{2}$ -integrable functions. In any case, we conclude that the molecular orbitals span a finite-dimensional Hilbert space $h\subset H^{1}(X)$ , which we shall call the single-particle space.

Next, we define multi-particle functions that are used to span the fermionic Fock space. Due to the anti-symmetry constraints of the wave function, we need to take the anti-symmetrized product also called the exterior product: Let $\xi_{1},...,\xi_{M}\in\mathcal{B}$ , we define the $M$ -folded exterior product of $\xi_{1},...,\xi_{M}$ (pointwise) by

\xi_{1}\wedge...\wedge\xi_{M}(\mathbf{x}_{1},...,\mathbf{x}_{M})=\sum_{\pi\in S% _{M}}{\rm sign}(\pi)\prod_{i=1}^{M}\xi_{\pi(i)}(\mathbf{x}_{i}),

(3)

where $S_{M}$ is the symmetric group describing all possible permutations of the set $\{1,...,M\}$ and ${\rm sign}(\pi)$ is the parity of the permutation $\pi$ .

Example 1.

Let $\xi_{1},\xi_{2}\in\mathcal{B}$ be two molecular orbitals. The exterior product of $\xi_{1}$ and $\xi_{2}$ is pointwise given by

\xi_{1}\wedge\xi_{2}(\mathbf{x}_{1},\mathbf{x}_{2})=\xi_{1}(\mathbf{x}_{1})\xi% _{2}(\mathbf{x}_{2})-\xi_{1}(\mathbf{x}_{2})\xi_{2}(\mathbf{x}_{1}).

(4)
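The permutation sum in Eq. (3) can be evaluated directly for small $M$ . The following Python sketch is meant only as an illustration: the function names and the two one-dimensional test functions standing in for molecular orbitals are assumptions made here and are not taken from the reviewed literature.

```python
from itertools import permutations
import math

def perm_sign(perm):
    """Parity of a permutation given as a tuple of 0-based indices."""
    sign = 1
    for i in range(len(perm)):
        for j in range(i + 1, len(perm)):
            if perm[i] > perm[j]:
                sign = -sign
    return sign

def exterior_product(orbitals, xs):
    """Evaluate (xi_1 ^ ... ^ xi_M)(x_1, ..., x_M) via the sum in Eq. (3)."""
    M = len(orbitals)
    total = 0.0
    for perm in permutations(range(M)):
        term = perm_sign(perm)
        for i in range(M):
            term *= orbitals[perm[i]](xs[i])
        total += term
    return total

# Hypothetical one-dimensional stand-ins for two molecular orbitals.
xi1 = lambda x: math.exp(-x * x)
xi2 = lambda x: x * math.exp(-x * x)

value = exterior_product([xi1, xi2], [0.3, 1.1])
swapped = exterior_product([xi1, xi2], [1.1, 0.3])
print(value, swapped)  # the two values differ only by their sign
```

Exchanging the two arguments flips the sign, and evaluating at two identical arguments gives zero, which is the anti-symmetry required of a fermionic wave function. The cost grows like $M!$ , so this is a didactic device rather than a practical algorithm.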

Given the set $\mathcal{B}$ , one can form ${N_{B}\choose M}$ linearly independent exterior products, which define the set $\mathfrak{B}^{(M)}$ . The space $\mathcal{H}^{(M)}$ , spanned by these functions is the $M$ -folded exterior power of $h$ , and it inherits an inner product from the single particle space $h$ : Let $\Psi_{I}=\xi_{i_{1}}\wedge...\wedge\xi_{i_{M}}$ and $\Psi_{J}=\xi_{j_{1}}\wedge...\wedge\xi_{j_{M}}$ then

\langle\Psi_{I},\Psi_{J}\rangle=\sum_{\begin{subarray}{c}\pi\in S_{I}\\ \sigma\in S_{J}\end{subarray}}{\rm sign}(\pi)\,{\rm sign}(\sigma)\prod_{p=1}^{M}\langle\xi_{\pi(i_{p})},\xi_{\sigma(j_{p})}\rangle_{L^{2}(X)},

(5)

where $S_{I}$ and $S_{J}$ are the permutations of $\{i_{1},...,i_{M}\}$ and $\{j_{1},...,j_{M}\}$ , respectively. Normalizing the ${N_{B}\choose M}$ exterior products obtained from $\mathcal{B}$ by $1/\sqrt{M!}$ yields the well-known definition of $M$ -particle Slater determinants, i.e.,

\Psi[i_{1},...,i_{M}](\mathbf{x}_{1},...,\mathbf{x}_{M})=\frac{1}{\sqrt{M!}}\sum_{\pi\in S_{M}}{\rm sign}(\pi)\prod_{j=1}^{M}\xi_{i_{\pi(j)}}(\mathbf{x}_{j})=\frac{1}{\sqrt{M!}}{\rm det}\left(\begin{bmatrix}\xi_{i_{1}}(\mathbf{x}_{1})&\cdots&\xi_{i_{1}}(\mathbf{x}_{M})\\ \vdots&\ddots&\vdots\\ \xi_{i_{M}}(\mathbf{x}_{1})&\cdots&\xi_{i_{M}}(\mathbf{x}_{M})\end{bmatrix}\right).

(6)

To avoid linear dependence in the set of $M$ -particle Slater determinants, we assume $i_{1}<...<i_{M}$ , which yields ${N_{B}\choose M}$ possible exterior products formed from $\mathcal{B}$ . The direct sum of the $M$ -particle spaces for $M=0,...,N_{B}$ yields the fermionic Fock space $\mathcal{F}$ :

\mathcal{F}=\bigoplus_{M=0}^{N_{B}}\mathcal{H}^{(M)},

(7)

which is known as the Grassmann algebra on $h$ in the mathematics community. For brevity, we will employ the Dirac notation writing the basis elements in $\mathcal{F}$ as

|s_{1},...,s_{N_{B}}\rangle=\frac{1}{\sqrt{M!}}\xi_{1}^{s_{1}}\wedge\xi_{2}^{s% _{2}}\wedge...\wedge\xi_{N_{B}}^{s_{N_{B}}}

(8)

where $M=\sum_{i}s_{i}$ and $s_{i}\in\{0,1\}$ for all $i=1,...,N_{B}$ . A general element in $\mathcal{F}$ , is then given as

|\Psi\rangle=\sum_{s_{1},...,s_{N_{B}}\in\{0,1\}}\Psi(s_{1},...,s_{N_{B}})|s_{% 1},...,s_{N_{B}}\rangle

(9)

where $\Psi(s_{1},...,s_{N_{B}})\in\mathbb{C}$ . We now define the fermionic creation and annihilation operators, i.e.,

\begin{aligned}a_{p}^{\dagger}&:\mathcal{F}\to\mathcal{F}~{};~{}|s_{1},...,s_{N_{B}}\rangle\mapsto(-1)^{\sigma(p)}(1-s_{p})|s_{1},...,s_{p-1},1-s_{p},s_{p+1},...,s_{N_{B}}\rangle\\
a_{p}&:\mathcal{F}\to\mathcal{F}~{};~{}|s_{1},...,s_{N_{B}}\rangle\mapsto(-1)^{\sigma(p)}s_{p}|s_{1},...,s_{p-1},1-s_{p},s_{p+1},...,s_{N_{B}}\rangle\end{aligned}

(10)

where $\sigma(p)=\sum_{q=1}^{p-1}s_{q}$ . We note that ${\rm dim}(\mathcal{F})=2^{N_{B}}$ ; we therefore identify elements of the fermionic Fock space $\mathcal{F}$ uniquely with elements in $\mathbb{C}^{2^{N_{B}}}$ . Mathematically, we write $\mathcal{F}\simeq\mathbb{C}^{2^{N_{B}}}$ , which means that the spaces $\mathcal{F}$ and $\mathbb{C}^{2^{N_{B}}}$ are essentially the same in their structure. We moreover introduce the convention

{1\choose 0}\equiv{\rm unoccupied}\quad{\rm and}\quad{0\choose 1}\equiv{\rm occupied.}

Note that this is an arbitrary choice, but it is the commonly employed convention. This allows us to express the basis elements as

|s_{1},...,s_{N_{B}}\rangle={1-s_{1}\choose s_{1}}\otimes...\otimes{1-s_{N_{B}% }\choose s_{N_{B}}}.

(11)

Example 2.

Let $N_{B}=2$ , then

|01\rangle={1\choose 0}\otimes{0\choose 1}=\begin{pmatrix}0\\ 1\\ 0\\ 0\end{pmatrix}

(12)
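The identification in Eq. (11) is easy to reproduce by forming the Kronecker products explicitly. The short Python sketch below uses plain lists; the helper names `kron_vec`, `basis_vector`, and `basis_index` are chosen here for illustration only.

```python
def kron_vec(u, v):
    """Kronecker product of two vectors stored as lists."""
    return [a * b for a in u for b in v]

def basis_vector(occupations):
    """Map occupation numbers (s_1, ..., s_NB) to a vector of length 2^NB, Eq. (11)."""
    vec = [1]
    for s in occupations:
        # (1, 0) encodes an unoccupied orbital, (0, 1) an occupied one
        vec = kron_vec(vec, [1 - s, s])
    return vec

def basis_index(occupations):
    """0-based position of the single nonzero entry: the binary number s_1 s_2 ... s_NB."""
    index = 0
    for s in occupations:
        index = 2 * index + s
    return index

print(basis_vector([0, 1]))  # [0, 1, 0, 0]
```

Each basis state is thus a unit vector, and its nonzero entry sits at the position obtained by reading the occupation string as a binary number.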

In this formulation, the fermionic creation and annihilation operators in Eq. (10) are matrices of the form

a_{p}^{\dagger}=\underbrace{\sigma_{z}\otimes...\otimes\sigma_{z}}_{p-1~{}{\rm times}}\otimes\;a^{\dagger}\otimes\underbrace{I\otimes...\otimes I}_{N_{B}-p~{}{\rm times}}\quad{\rm and}\quad a_{p}=\underbrace{\sigma_{z}\otimes...\otimes\sigma_{z}}_{p-1~{}{\rm times}}\otimes\;a\otimes\underbrace{I\otimes...\otimes I}_{N_{B}-p~{}{\rm times}}

(13)

where

I=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\quad\sigma_{z}=\begin{pmatrix}1&0\\ 0&-1\end{pmatrix},\quad a=\begin{pmatrix}0&1\\ 0&0\end{pmatrix}.

(14)

Example 3.

Let $N_{B}=3$ , then

a_{2}=\sigma_{z}\otimes a\otimes I=\footnotesize\begin{pmatrix}0&0&1&0&0&0&0&0% \\ 0&0&0&1&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&-1&0\\ 0&0&0&0&0&0&0&-1\\ 0&0&0&0&0&0&0&0\\ 0&0&0&0&0&0&0&0\\ \end{pmatrix}

(15)

These matrices are sparse and have several useful properties [35]. We first note that the definition given in Eq. (13) implies directly that the creation and annihilation operators are nilpotent, i.e., $(a_{p}^{\dagger})^{2}=(a_{p})^{2}=0$ . Moreover, the fermionic creation and annihilation operators obey the canonical anti-commutation relations (CAR):

[a_{p},a_{q}]_{+}=[a_{p}^{\dagger},a_{q}^{\dagger}]_{+}=0\quad\text{and}\quad[% a_{p},a_{q}^{\dagger}]_{+}=\delta_{p,q}.

(16)
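For a small number of orbitals, the matrices of Eq. (13) can be assembled explicitly and the relations in Eq. (16) checked entry by entry. The following Python sketch does this with nested lists and integer arithmetic, so all comparisons are exact; the helper names are assumptions made for this illustration, and each $a_{p}$ is built from $p-1$ factors $\sigma_{z}$ , one factor $a$ , and identity factors filling the remaining positions.

```python
def kron(A, B):
    """Kronecker product of two matrices stored as nested lists."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def annihilation(p, n_orb):
    """Matrix of a_p (1-based p) on the Fock space built from n_orb orbitals."""
    sigma_z, a, identity = [[1, 0], [0, -1]], [[0, 1], [0, 0]], [[1, 0], [0, 1]]
    out = [[1]]
    for factor in [sigma_z] * (p - 1) + [a] + [identity] * (n_orb - p):
        out = kron(out, factor)
    return out

def creation(p, n_orb):
    # the matrices are real, so the adjoint is simply the transpose
    return [list(col) for col in zip(*annihilation(p, n_orb))]

def anticommutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

def car_holds(n_orb):
    """Check all three relations of Eq. (16) for every pair (p, q)."""
    dim = 2 ** n_orb
    zero = [[0] * dim for _ in range(dim)]
    eye = [[int(i == j) for j in range(dim)] for i in range(dim)]
    for p in range(1, n_orb + 1):
        for q in range(1, n_orb + 1):
            ap, aq = annihilation(p, n_orb), annihilation(q, n_orb)
            ap_dag, aq_dag = creation(p, n_orb), creation(q, n_orb)
            if anticommutator(ap, aq) != zero:
                return False
            if anticommutator(ap_dag, aq_dag) != zero:
                return False
            if anticommutator(ap, aq_dag) != (eye if p == q else zero):
                return False
    return True

print(car_holds(3))
```

The strings of $\sigma_{z}$ factors are what produce the sign $(-1)^{\sigma(p)}$ of Eq. (10); replacing them by identities would make operators on different orbitals commute instead of anti-commute.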

Lastly, we define the number operator $n_{p}=a_{p}^{\dagger}a_{p}$ satisfying

n_{p}|s_{1},...,s_{N_{B}}\rangle=s_{p}|s_{1},...,s_{N_{B}}\rangle

(17)

and the total number operator $N=\sum_{p=1}^{N_{B}}n_{p}$ . In this formulation, the matrix describing an interacting electronic system in a potential generated by clamped nuclei, i.e., the Hamiltonian, takes the form

H=\sum_{p,q}h_{p,q}a_{p}^{\dagger}a_{q}+\frac{1}{4}\sum_{p,q,r,s}v_{p,q,r,s}a_% {p}^{\dagger}a_{q}^{\dagger}a_{s}a_{r},

(18)

where $h\in\mathbb{C}^{N_{B}\times N_{B}}$ and $v\in\mathbb{C}^{N_{B}\times N_{B}\times N_{B}\times N_{B}}$ are system dependent integral tensors. They are defined via

h_{p,q}=\int_{X}\xi_{p}^{*}(x)\left(-\frac{\Delta}{2}-\sum_{j}\frac{Z_{j}}{|r-R_{j}|}\right)\xi_{q}(x)d\lambda(x),\qquad x=(r,s),

(19)

and

v_{p,q,r,s}=\int_{X\times X}\frac{\xi_{p}^{*}(x_{1})\xi_{q}(x_{1})\xi_{r}^{*}(x_{2})\xi_{s}(x_{2})}{|r_{1}-r_{2}|}d\lambda(x_{1})d\lambda(x_{2}).

(20)

Note that $h$ is hermitian and $v$ fulfills the symmetry relations

v_{p,q,r,s}=v_{r,s,p,q}=v_{q,p,s,r}^{*}=v_{s,r,q,p}^{*}

(21)

or in the case of real-valued atomic spin orbitals

v_{p,q,r,s}=v_{r,s,p,q}=v_{q,p,s,r}=v_{s,r,q,p}=v_{q,p,r,s}=v_{s,r,p,q}=v_{p,q% ,s,r}=v_{r,s,q,p}.

(22)

The goal is now to compute the lowest eigenvalue of the matrix $H$ restricted to the $N$ -particle subspace $\mathcal{H}^{(N)}$ , which is the $N$ -particle ground state energy of the electronic Schrödinger equation, i.e.,

E_{0}=\min_{\begin{subarray}{c}|\Psi\rangle\in\mathcal{F}\\ \langle\Psi|\Psi\rangle=1\end{subarray}}\langle\Psi|H-\mu N|\Psi\rangle,

(23)

where $\mu$ is a Lagrange multiplier ensuring that the solution $|\Psi\rangle$ lies in the $N$ -particle Hilbert space.

3 The CC ansatz

Coupled cluster theory is built upon an exponential ansatz of the wave function, as opposed to the linear ansatz of Eq. (9). We emphasize that the simple approach of projecting the Hamiltonian onto much smaller, manageable linear subspaces of $\mathcal{F}$ proves inadequate for electronic structure problems. This is exemplified in the case study of lithium hydride, where we analyze the lowest eigenstate for different relative positions of the two nuclei at $\mathbf{R}_{1}$ and $\mathbf{R}_{2}$ , characterized by the distance $R=\|\mathbf{R}_{1}-\mathbf{R}_{2}\|$ (see Fig. 1). This examination shows that essential energies, such as the chemical bonding energy, are not accurately captured when using a limited linear subspace, see $E_{\rm bond}^{\rm CISD}$ in Fig. 1. Conversely, CC theory provides a far more accurate representation of this critical energy, see $E_{\rm bond}^{\rm CCSD}$ in Fig. 1.

Figure 1: Case study of lithium hydride comparing the linear parametrization (blue) and the exponential parametrization (red) for different values of $R$ in the AUG-cc-pVTZ basis set [35].

In order to derive the exponential ansatz in a mathematically sound way, we need to introduce a few concepts first, starting with excitation matrices.

3.1 Excitation and cluster matrices

Since we started the characterization of the fermionic Fock space with the molecular orbitals, the Hartree-Fock state is given by

|\Psi_{0}\rangle=|1,...,1,0,...,0\rangle={0\choose 1}\otimes\cdots\otimes{0\choose 1}\otimes{1\choose 0}\otimes\cdots\otimes{1\choose 0}\in\mathcal{H}^{(N)},

where the first $N$ entries are set to one, and the remaining entries are zero. We refer to this vector as the reference determinant. We moreover define $v_{\rm occ}=[\![N]\!]=\{1,...,N\}$ and $v_{\rm virt}=[\![N_{B}]\!]\setminus[\![N]\!]=\{N+1,...,N_{B}\}$ . Assume $a_{1},...,a_{k}\in v_{\rm virt}$ , and $i_{1},...,i_{k}\in v_{\rm occ}$ . Then,

X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}}=a_{a_{k}}^{\dagger}...a_{a_{1}}^{% \dagger}a_{i_{1}}...a_{i_{k}}

defines an excitation matrix, and the set of all excitation matrices is given by

\mathfrak{E}(\mathcal{H}^{(N)})=\left\{X_{\mu}~{}\Big{|}~{}\mu={a_{1},...,a_{k% }\choose i_{1},...,i_{k}},\,a_{j}\in v_{\rm virt},\,i_{j}\in v_{\rm occ},\,k% \leq N\right\}.

Note that the above construction of the excitation matrices yields that excitation matrices are particle number preserving. The excitation indices $\mu$ that excite from the occupied into the virtual orbitals define the multi-index set

\mathcal{I}=\left\{\mu~{}\Bigg{|}~{}\mu={a_{1},...,a_{k}\choose i_{1},...,i_{k% }},\,a_{j}\in v_{\rm virt},\,i_{j}\in v_{\rm occ},\,1\leq k\leq N\right\}.

(24)

Since this set of excitations corresponds to simply replacing indices in the string $[1,...,N]$ with indices in the string $[N+1,...,N_{B}]$ (plus some additional permutation), we deduce that there is a one-to-one relation between excitation operators and Slater determinants except for the reference Slater determinant $|\Psi_{0}\rangle$ . In other words, the excitation operators map the reference Slater determinant $|\Psi_{0}\rangle$ to all other Slater determinants.

Theorem 4.

There exists a one-to-one relation between the $N$ -particle basis functions $\mathfrak{B}^{(N)}$ and $\mathfrak{E}(\mathcal{H}^{(N)})\cup\{I\}$ .

Proof.

Since excitation matrices are defined w.r.t. the reference determinant $|\Psi_{0}\rangle$ it follows immediately that $|\Psi_{0}\rangle=I|\Psi_{0}\rangle$ . Consider $|\Psi_{P}\rangle=\xi_{p_{1}}\wedge...\wedge\xi_{p_{N}}\in\mathcal{H}^{(N)}$ . Comparing $\{1,...,N\}$ to $\{p_{1},...,p_{N}\}$ we can identify a multi-index $\mu$ describing the indices that have to be changed in $\{1,...,N\}$ to obtain $\{p_{1},...,p_{N}\}$ . More precisely, $\mu$ describes an excitation from $v_{\rm occ}\setminus P$ to $P\cap v_{\rm virt}$ . Due to the canonical ordering, this multi-index $\mu$ is unique. Then, by definition we obtain $|\Psi_{P}\rangle={\rm sign}(\mu)X_{\mu}|\Psi_{0}\rangle$ , which shows the claim. ∎
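The correspondence in Theorem 4 can be checked by brute-force enumeration for small systems. The Python sketch below lists all excitation multi-indices for given $N$ and $N_{B}$ and records the orbital index set of the determinant each one produces; signs are ignored since only the bijection is of interest, and the function names are hypothetical.

```python
from itertools import combinations
from math import comb

def excitation_indices(n_elec, n_orb):
    """Yield all multi-indices mu = (virtual indices, occupied indices), cf. Eq. (24)."""
    occ = range(1, n_elec + 1)
    virt = range(n_elec + 1, n_orb + 1)
    for k in range(1, min(n_elec, n_orb - n_elec) + 1):
        for particles in combinations(virt, k):
            for holes in combinations(occ, k):
                yield particles, holes

def excited_determinant(particles, holes, n_elec):
    """Orbital indices obtained from {1, ..., N} by replacing `holes` with `particles`."""
    reference = set(range(1, n_elec + 1))
    return tuple(sorted((reference - set(holes)) | set(particles)))

N, NB = 2, 4
table = {mu: excited_determinant(*mu, N) for mu in excitation_indices(N, NB)}
for mu, det in table.items():
    print(mu, "->", det)
```

For $N=2$ and $N_{B}=4$ this produces five distinct determinants, which together with the reference determinant account for all ${4\choose 2}=6$ basis functions of $\mathcal{H}^{(2)}$ .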

The above result is the fundamental result that allows us to express any target wave function $|\Psi\rangle\in\mathcal{H}^{(N)}$ through a wave operator applied to the reference determinant instead of an expansion through basis vectors, i.e.,

|\Psi\rangle=\left(c_{0}I+\sum_{\mu}c_{\mu}X_{\mu}\right)|\Psi_{0}\rangle.

(25)

We will now focus on certain properties that ensure the later discussed exponential formulation is properly defined. The first property we consider is the commutativity of the excitation matrices.

Proposition 5.

Let $X_{\mu},X_{\nu}\in\mathfrak{E}(\mathcal{H}^{(N)})$ . Then $[X_{\mu},X_{\nu}]=0$ .

Proof.

Let

X_{\mu}=X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}}=a_{a_{k}}^{\dagger}...a_{a_% {1}}^{\dagger}a_{i_{1}}...a_{i_{k}}\quad\text{and}\quad X_{\nu}=X_{b_{1},...,b% _{\ell}\choose j_{1},...,j_{\ell}}=a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger% }a_{j_{1}}...a_{j_{\ell}}.

The proof is conducted in two steps:
First, we seek to permute all creation operators in the commutator to the left using the CAR. We begin with the following product and note that when permuting $a_{b_{\ell}}^{\dagger}$ to the right of $a_{a_{1}}^{\dagger}$ we merely pick up a sign, since $b_{\ell}\notin v_{\rm occ}$ , i.e.,

\wick{a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}\c{1}a_{i_{1}}...a_{i_{k}}\c{1}% a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}}=(-1)^{k}% a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}a_{i_{1}}...a_{% i_{k}}a_{b_{\ell-1}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}.

This furthermore yields

a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{b_{\ell}}^{% \dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}=(-1)^{\ell\cdot k}a_{a_% {k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{% \dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}

and similarly

a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}=(-1)^{\ell\cdot k}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}.

Second, we wish to unify the index sequence of the creation and annihilation operators in the two summands of the commutator. Applying the CAR again, we find

\wick{\c{1}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}\c{1}a_{a_{k}}^{\dagger% }...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}}=(-1)^{% \ell}a_{a_{k}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k-1}% }^{\dagger}...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}},

which yields

a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}=(-1)^{2\cdot\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}.

Note that we have here assumed that $\mu\cap\nu=\emptyset$ , otherwise the expression is trivially zero due to the nilpotency of the creation and annihilation operators. Overall this yields

\begin{aligned}{}[X_{\mu},X_{\nu}]&=[X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}},X_{b_{1},...,b_{\ell}\choose j_{1},...,j_{\ell}}]\\
&=a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}-a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}\\
&=(-1)^{\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}-(-1)^{\ell\cdot k}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{j_{1}}...a_{j_{\ell}}a_{i_{1}}...a_{i_{k}}\\
&=(-1)^{\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}-(-1)^{3\cdot\ell\cdot k}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{b_{\ell}}^{\dagger}...a_{b_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{j_{1}}...a_{j_{\ell}}\\
&=0.\end{aligned}

(26)

∎

Another important property is that the excitation matrices inherit nilpotency from the fermionic creation and annihilation matrices.

Proposition 6.

Let $X_{\mu}\in\mathfrak{E}(\mathcal{H}^{(N)})$ . Then $X_{\mu}^{2}=0$ .

Proof.

Recall that $(a_{p}^{\dagger})^{2}=(a_{p})^{2}=0$ by construction (see Eq. (13)). Let

X_{\mu}=X_{a_{1},...,a_{k}\choose i_{1},...,i_{k}}=a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}.

Then

\begin{aligned}X_{\mu}^{2}&=\wick{\c{1}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}\c{1}a_{a_{k}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}}\\
&=-\underbrace{a_{a_{k}}^{\dagger}a_{a_{k}}^{\dagger}}_{=0}a_{a_{k-1}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}a_{a_{k-1}}^{\dagger}...a_{a_{1}}^{\dagger}a_{i_{1}}...a_{i_{k}}\\
&=0.\end{aligned}

(27)

∎
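Propositions 5 and 6 can be confirmed numerically on a toy system. The Python sketch below (helper names are assumptions made here) builds every excitation matrix for $N=2$ and $N_{B}=4$ as a $16\times 16$ integer matrix on the full Fock space and tests all commutators and squares exactly.

```python
from itertools import combinations

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def annihilation(p, n_orb):
    sigma_z, a, identity = [[1, 0], [0, -1]], [[0, 1], [0, 0]], [[1, 0], [0, 1]]
    out = [[1]]
    for factor in [sigma_z] * (p - 1) + [a] + [identity] * (n_orb - p):
        out = kron(out, factor)
    return out

def creation(p, n_orb):
    return [list(col) for col in zip(*annihilation(p, n_orb))]

def excitation(virt, occ, n_orb):
    """X = a^+_{a_k} ... a^+_{a_1} a_{i_1} ... a_{i_k} for virt = (a_1..a_k), occ = (i_1..i_k)."""
    ops = [creation(a, n_orb) for a in reversed(virt)]
    ops += [annihilation(i, n_orb) for i in occ]
    out = ops[0]
    for op in ops[1:]:
        out = matmul(out, op)
    return out

def is_zero(A):
    return all(x == 0 for row in A for x in row)

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(AB, BA)]

N, NB = 2, 4
occ_orbs, virt_orbs = range(1, N + 1), range(N + 1, NB + 1)
excitations = [excitation(v, o, NB)
               for k in range(1, N + 1)
               for v in combinations(virt_orbs, k)
               for o in combinations(occ_orbs, k)]

all_commute = all(is_zero(commutator(X, Y)) for X in excitations for Y in excitations)
all_nilpotent = all(is_zero(matmul(X, X)) for X in excitations)
print(len(excitations), all_commute, all_nilpotent)
```

Note that the check is performed on the whole Fock space and not only on $\mathcal{H}^{(N)}$ , in line with the fact that both propositions are operator identities.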

We are now set to define the vector space of cluster matrices, a fundamental concept in coupled cluster theory, i.e., the $\mathbb{C}$ -vector space

\mathfrak{b}=\left\{T=\sum_{\mu}t_{\mu}X_{\mu}~{}\Big{|}~{}\mu\in\mathcal{I}% \right\},

(28)

where $\mathcal{I}$ is as defined in Eq. (24). Note that $\mathfrak{b}$ is a linear space, and the excitation matrices are linearly independent by Theorem 4; hence, each element in $\mathfrak{b}$ is uniquely defined through its linear coefficients $\mathbf{t}=(t_{\mu})$ . We therefore use the convention that $\mathbf{t}$ describes an amplitude vector whereas $T$ describes the corresponding cluster matrix.
Using the propositions above, we will show that this vector space is highly structured. Our next step is to introduce the exponential of cluster matrices, which forms the mathematical bridge between cluster matrices and wave operators. This involves connecting the Lie algebra formed by the cluster matrices with the Lie group of wave operators. To begin, we show that $\mathfrak{b}$ is indeed a Lie algebra.

Theorem 7.

The space of cluster matrices $\mathfrak{b}$ equipped with the standard matrix commutator $[\cdot,\cdot]$ forms a nilpotent Abelian Lie algebra.

Proof.

To show that $\mathfrak{b}$ is a Lie algebra, we need to prove that it (i) is a linear space (which is true by construction), and (ii) that there exists an alternating bilinear map (in this case the standard matrix commutator $[\cdot,\cdot]$ ) that satisfies the Jacobi identity, i.e.,

[X,[Y,Z]]+[Y,[Z,X]]+[Z,[X,Y]]=0\qquad\forall~{}X,Y,Z\in\mathfrak{b}.

In order to show (ii), we combine Proposition 5 and the bilinearity of the matrix commutator. For two cluster matrices $T_{1}=\sum_{\mu}t_{\mu}X_{\mu}$ and $T_{2}=\sum_{\nu}s_{\nu}X_{\nu}$ in $\mathfrak{b}$ , this yields

[T_{1},T_{2}]=\sum_{\mu}\sum_{\nu}t_{\mu}s_{\nu}[X_{\mu},X_{\nu}]=0,

hence, cluster matrices commute. Therefore, the Jacobi identity is trivially fulfilled. This shows that $\mathfrak{b}$ equipped with the standard matrix commutator is an Abelian Lie algebra, where the term “Abelian” simply means that the elements in $\mathfrak{b}$ commute with each other.
Next, we shall show the nilpotency. To that end, we expand $T^{N+1}$ which yields

T^{N+1}=\sum_{\begin{subarray}{c}k_{1}+k_{2}+\cdots+k_{m}=N+1\\ k_{1},k_{2},\cdots,k_{m}\geq 0\end{subarray}}{N+1\choose k_{1},k_{2},\ldots,k_% {m}}\prod_{j=1}^{m}(t_{\mu_{j}}X_{\mu_{j}})^{k_{j}},

(29)

where

{N+1\choose k_{1},k_{2},\ldots,k_{m}}={\frac{(N+1)!}{k_{1}!\,k_{2}!\cdots k_{m}!}}

is a multinomial coefficient. Since $|v_{\rm occ}|=N$ , there exists one $i\in v_{\rm occ}$ that appears at least twice in each matrix $\prod_{j=1}^{m}(t_{\mu_{j}}X_{\mu_{j}})^{k_{j}}$ . However, since $a_{i}^{2}=0$ this yields that $T^{N+1}=0$ , which shows the claim. ∎

The above theorem ensures that the (Lie) exponential of $\mathfrak{b}$ is a Lie group, i.e., a differentiable manifold. What remains to be shown is that this is the Lie group of wave operators that we used to define any intermediately normalized wave function in $\mathcal{H}^{(N)}$ , see Eq. (25). To that end, we begin by showing that every intermediately normalized wave function can be expressed through a linear wave operator. As mentioned earlier, the construction of excitation operators allows us to transfer the degrees of freedom from the basis functions in Eq. (9) to wave operators. Formally, this yields the definition of the (linear) wave operator map $\Omega$ as

\Omega:\mathfrak{b}\to\mathcal{G}~{};~{}C\mapsto I+C,

(30)

where

\mathcal{G}=\{I+C~{}|~{}C\in\mathfrak{b}\}.

(31)

Note that in this formulation, the wave operator map $\Omega$ takes a cluster matrix as input and yields a wave operator, i.e., $\Omega(C)$ maps the reference determinant to some wave function in $\mathcal{H}^{(N)}$ . By construction, $\mathcal{G}$ is an affine linear space of matrices. We will now show the one-to-one correspondence between intermediately normalized functions

|\Psi\rangle\in\mathcal{H}_{\rm int}=\left\{|\Psi\rangle\in\mathcal{H}^{(N)}~{% }|~{}\langle\Psi|\Psi_{0}\rangle=1\right\}\subset\mathcal{H}^{(N)},

(32)

and cluster matrices $C\in\mathfrak{b}$ . We begin with the linear parametrization of elements $|\Psi\rangle\in\mathcal{H}_{\rm int}$ .

Lemma 8.

Let $|\Psi\rangle\in\mathcal{H}_{\rm int}$ . There exists a unique element $(I+C)\in\mathcal{G}$ , s.t.,

|\Psi\rangle=(I+C)|\Psi_{0}\rangle.

(33)

Proof.

We first observe that $\mathcal{H}_{\rm int}\subset\mathcal{H}^{(N)}$ can be characterized by

\mathcal{H}_{\rm int}=|\Psi_{0}\rangle+{\rm span}(\{|\Psi_{\mu}\rangle\}_{\mu}% )\underset{(*)}{=}|\Psi_{0}\rangle+{\rm span}(\{X_{\mu}\}_{\mu})|\Psi_{0}% \rangle=(I+\mathfrak{b})|\Psi_{0}\rangle,

(34)

where the equality $(*)$ is a consequence of Theorem 4. This shows that every element $|\Psi\rangle\in\mathcal{H}_{\rm int}$ can be expressed as

|\Psi\rangle=(I+C)|\Psi_{0}\rangle.

(35)

Next, assume there exist two cluster matrices $C_{1},C_{2}\in\mathfrak{b}$ s.t.

(I+C_{1})|\Psi_{0}\rangle=|\Psi\rangle=(I+C_{2})|\Psi_{0}\rangle.

(36)

However, this yields

[c_{1}]_{\mu}=\langle\Psi_{\mu}|(I+C_{1})|\Psi_{0}\rangle=\langle\Psi_{\mu}|% \Psi\rangle=\langle\Psi_{\mu}|(I+C_{2})|\Psi_{0}\rangle=[c_{2}]_{\mu}\qquad% \forall\mu\in\mathcal{I}

(37)

implying that $C_{1}=C_{2}$ , which shows the claim. ∎

Lemma 9.

The wave operator map $\Omega$ is bijective.

Proof.

First, note that $\mathcal{G}$ was defined as the range of $\Omega$ . Hence, the wave operator map is trivially surjective. Second, note that ${\rm dim}(\mathfrak{b})={\rm dim}(\mathcal{G})$ , which yields that $\Omega$ is a bijection. ∎

Combining these two lemmata yields the desired one-to-one correspondence between $\mathfrak{b}$ and elements in $\mathcal{H}_{\rm int}$ .

Theorem 10.

Let $|\Psi\rangle\in\mathcal{H}_{\rm int}$ . There exists a unique element $C\in\mathfrak{b}$ , s.t.,

|\Psi\rangle=\Omega(C)|\Psi_{0}\rangle.

(38)

Although we have restricted the above theorem to intermediately normalized wave functions (the reason will become apparent shortly), Theorem 10 is in fact the core of the (full) configuration interaction expansion [35].

We now proceed to the exponential parametrization. Note, since $\mathfrak{b}$ is nilpotent the exponential series $\exp(T)$ for any element $T\in\mathfrak{b}$ is not a true exponential as it terminates after at most $N$ terms. Hence, it is a polynomial at most of the degree $N$ . We therefore do not need to investigate the convergence of the exponential series and can define the set

\tilde{\mathcal{G}}=\left\{\exp(T)=I+\sum_{n=1}^{N}\frac{1}{n!}T^{n}~{}|~{}T% \in\mathfrak{b}\right\}.

(39)
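As a worked example of this termination, consider $N=2$ and split $T=T_{1}+T_{2}$ , where $T_{1}$ and $T_{2}$ collect the single and double excitations, respectively (this splitting is introduced here only for illustration). Every product containing more than two annihilation operators with indices in $v_{\rm occ}$ vanishes, so that $T_{1}T_{2}=T_{2}^{2}=T_{1}^{3}=0$ , and the series reduces to

\exp(T)=I+T+\frac{1}{2}T^{2}=I+T_{1}+T_{2}+\frac{1}{2}T_{1}^{2}.

The term $\frac{1}{2}T_{1}^{2}$ is a double excitation generated by products of single-excitation amplitudes; such product terms are what distinguish the exponential from the linear parametrization.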

Lemma 11.

The set $\tilde{\mathcal{G}}$ is equal to $\mathcal{G}$ .

Proof.

Let $\exp(T)\in\tilde{\mathcal{G}}$ with $T\in\mathfrak{b}$ . By definition $\exp(T)=I+P(T)$ where $P$ is a polynomial at most of the degree $N$ and since $\mathfrak{b}$ is a vector space, we have $P(T)\in\mathfrak{b}$ . However, this defines an element in $\mathcal{G}$ which yields $\tilde{\mathcal{G}}\subseteq\mathcal{G}$ . Conversely, let $I+C\in\mathcal{G}$ . Then $I+C-I=C\in\mathfrak{b}$ , which implies that

\log(I+C)=\sum_{n=0}^{\infty}\frac{(-1)^{n}}{n+1}C^{n+1}

(40)

terminates after at most $N$ terms, since $C^{N+1}=0$ . Hence ${\rm log}(I+C)$ is an element in $\mathfrak{b}$ and therewith

I+C=\exp({\rm log}(I+C))\in\tilde{\mathcal{G}}

which shows that $\mathcal{G}\subseteq\tilde{\mathcal{G}}$ .
∎
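The two inclusions in the proof can be traced numerically. The Python sketch below, again for $N=2$ and $N_{B}=4$ , picks a cluster matrix $C$ with arbitrary rational amplitudes (hypothetical values chosen only for this illustration), evaluates the terminating logarithm series of Eq. (40), and checks that exponentiating the result returns $I+C$ . Exact fractions are used so that no floating-point tolerance is needed.

```python
from fractions import Fraction

def kron(A, B):
    return [[a * b for a in ra for b in rb] for ra in A for rb in B]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def axpy(alpha, X, Y):
    """Return alpha * X + Y for matrices stored as nested lists."""
    return [[alpha * x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def annihilation(p, n_orb):
    sigma_z, a, identity = [[1, 0], [0, -1]], [[0, 1], [0, 0]], [[1, 0], [0, 1]]
    out = [[1]]
    for factor in [sigma_z] * (p - 1) + [a] + [identity] * (n_orb - p):
        out = kron(out, factor)
    return out

def product(*ops):
    out = ops[0]
    for op in ops[1:]:
        out = matmul(out, op)
    return out

N, NB = 2, 4
DIM = 2 ** NB
EYE = [[Fraction(int(i == j)) for j in range(DIM)] for i in range(DIM)]
ZERO = [[Fraction(0)] * DIM for _ in range(DIM)]
a = {p: annihilation(p, NB) for p in range(1, NB + 1)}
ad = {p: [list(col) for col in zip(*a[p])] for p in a}  # adjoints (real matrices)

# Cluster matrix with hypothetical amplitudes: two singles and one double.
C = axpy(Fraction(1, 2), product(ad[3], a[1]), ZERO)
C = axpy(Fraction(1, 3), product(ad[4], a[2]), C)
C = axpy(Fraction(1, 5), product(ad[4], ad[3], a[1], a[2]), C)

def exp_poly(T, order):
    """I + sum_{n=1}^{order} T^n / n!, cf. Eq. (39)."""
    result, power, factorial = EYE, EYE, 1
    for n in range(1, order + 1):
        power = matmul(power, T)
        factorial *= n
        result = axpy(Fraction(1, factorial), power, result)
    return result

def log_poly(C, order):
    """sum_{n=1}^{order} (-1)^(n+1) C^n / n, the truncated series of Eq. (40)."""
    result, power = ZERO, EYE
    for n in range(1, order + 1):
        power = matmul(power, C)
        result = axpy(Fraction((-1) ** (n + 1), n), power, result)
    return result

T = log_poly(C, N)
round_trip = exp_poly(T, N) == axpy(1, C, EYE)
print(round_trip)
```

Here $T=C-\frac{1}{2}C^{2}$ differs from $C$ only in its double-excitation amplitude, which illustrates how the linear and the exponential parametrization encode the same wave operator with different coefficients.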

The common algebraic definition of the Lie exponential map is by means of a map ${\exp:\mathfrak{b}\to\mathcal{G}}$ , where $\mathcal{G}$ is a Lie group and $\mathfrak{b}$ the corresponding Lie algebra. In particular, the exponential map is a map from the tangent space to the Lie group [32, 39]. We wish to equip $\mathcal{G}$ with a particular group multiplication $\odot$ , such that $(\mathcal{G},\odot)$ is a Lie group and $\mathfrak{b}$ its Lie algebra. This group multiplication $\odot$ is defined by means of the Baker–Campbell–Hausdorff formula

\odot:\mathcal{G}\times\mathcal{G}\rightarrow\mathcal{G};\ \exp(T)\odot\exp(U)=\exp(T*U)

for an operation $*$ on $\mathfrak{b}$ which takes the following simple form on Abelian algebras

*:\mathfrak{b}\times\mathfrak{b}\to\mathfrak{b}~;~(T,U)\mapsto T+U.

(41)

In other words, we can almost trivially derive the coupled cluster ansatz using concepts from non-linear algebra [48].

Theorem 12.

Given the Lie group $\mathcal{G}$ with Lie algebra $\mathfrak{b}$, the exponential map $\exp:\mathfrak{b}\to\mathcal{G}$ is surjective.

Note that this theorem can be generalized to any nilpotent Lie algebra. However, the proof shows that the inverse of the exponential is in this particular case well-defined, which proves the following theorem.

Theorem 13.

The exponential map from $\mathfrak{b}$ to $\mathcal{G}$ is bijective.

This shows that any wave function that is intermediately normalized can be uniquely expressed through an element in $\mathcal{G}$ , i.e., through the exponential of a cluster matrix $T\in\mathfrak{b}$ . This aligns with the known functional analytic results [60, 56, 45, 25], and is known in the quantum-chemistry community as the equivalence of FCI and FCC.
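The bijectivity of the exponential map can be checked numerically in a small setting. The following is a minimal sketch, assuming strictly lower triangular matrices as a stand-in for a nilpotent algebra (the helper names `exp_nilpotent` and `log_unipotent` are hypothetical and not taken from the cited works). It uses the finite series from Eqs. (39) and (40) and verifies that they invert each other.

```python
import numpy as np
from math import factorial

def exp_nilpotent(T):
    """Finite exponential series I + sum_{k>=1} T^k / k! for a nilpotent matrix T."""
    n = T.shape[0]
    result = np.eye(n)
    power = np.eye(n)
    for k in range(1, n):          # T^n = 0 for a strictly lower triangular n x n matrix
        power = power @ T
        result = result + power / factorial(k)
    return result

def log_unipotent(G):
    """Finite logarithm series sum_{k>=0} (-1)^k C^(k+1) / (k+1) with C = G - I."""
    n = G.shape[0]
    C = G - np.eye(n)
    result = np.zeros_like(C)
    power = np.eye(n)
    for k in range(n - 1):         # C^n = 0, so only C^1, ..., C^(n-1) contribute
        power = power @ C
        result = result + (-1) ** k * power / (k + 1)
    return result

rng = np.random.default_rng(0)
T = np.tril(rng.standard_normal((5, 5)), k=-1)   # strictly lower triangular, hence nilpotent
G = exp_nilpotent(T)

print(np.allclose(log_unipotent(G), T))          # log undoes exp
print(np.allclose(exp_nilpotent(log_unipotent(G)), G))
```

Since both series are finite, no convergence question arises, which mirrors the argument in the proof of Lemma 11.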

Some of the above results naturally extend to the truncated case, i.e., using a subspace $\bar{\mathfrak{b}}\subset\mathfrak{b}$ in the above construction. We refer the interested reader to [26].

3.2 The single reference CC theory

After the mathematical introduction to the CC ansatz, we now turn to the equations that yield the desired cluster matrix. These equations, central to coupled cluster theory, are the coupled cluster equations. This set of equations can be motivated as follows: Let $|\tilde{\Psi}\rangle\in\mathcal{H}^{(N)}$ be the ground state solution to the electronic Schrödinger equation, i.e.,

H|\tilde{\Psi}\rangle=E_{0}|\tilde{\Psi}\rangle.

(42)

We can renormalize $|\tilde{\Psi}\rangle$ to be intermediately normalized, i.e.,

|\Psi\rangle=\frac{1}{\langle\Psi_{0}|\tilde{\Psi}\rangle}|\tilde{\Psi}\rangle.

(43)

By Theorem 13, we then know that there exists a unique element $T\in\mathfrak{b}$ such that

|\Psi\rangle={\rm exp}(T)|\Psi_{0}\rangle.

(44)

Substituting Eq. (44) in the electronic Schrödinger equation (42) yields

H{\rm exp}(T)|\Psi_{0}\rangle=E_{0}{\rm exp}(T)|\Psi_{0}\rangle\Leftrightarrow\left\{\begin{aligned} \langle\Psi_{0}|{\rm exp}(-T)H{\rm exp}(T)|\Psi_{0}\rangle&=E_{0}\\ \langle\Psi|{\rm exp}(-T)H{\rm exp}(T)|\Psi_{0}\rangle&=0,\qquad\forall\langle\Psi|\perp\langle\Psi_{0}|.\end{aligned}\right.

(45)

Noting that the cluster matrix is a function of the cluster amplitudes ${\bf t}$ , i.e.,

T({\bf t})=\sum_{\mu}t_{\mu}X_{\mu},

(46)

and that by construction $\langle\Psi_{\mu}|\Psi_{0}\rangle=0$ for all $\mu$ , we see that the coupled cluster amplitudes fulfill the square system of equations

0=\langle\Psi_{\mu}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}\rangle=:f_{\mu}({\bf t})\qquad\forall\mu.

(47)

Mathematically, CC methods use roots of a high-dimensional non-linear function

f_{\rm CC}:{\bf t}\mapsto[f_{\mu}({\bf t})]_{\mu}

(48)

to characterize physical states. The above derivation proves that

H|\tilde{\Psi}\rangle=E_{0}|\tilde{\Psi}\rangle\Rightarrow\left\{\begin{aligned} \langle\Psi_{0}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}\rangle&=E_{0}\\ \langle\Psi_{\mu}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}\rangle&=0\qquad\forall\mu,\end{aligned}\right.

(49)

for $T({\bf t})$ fulfilling

{\rm exp}(T({\bf t}))|\Psi_{0}\rangle=\frac{1}{\langle\Psi_{0}|\tilde{\Psi}\rangle}|\tilde{\Psi}\rangle.

Note that the converse direction also holds. Let $|\Psi\rangle\in\mathcal{H}^{(N)}$ be arbitrary and let ${\bf t}$ fulfill Eq. (47). We define

E_{\rm CC}({\bf t})=\langle\Psi_{0}|{\rm exp}(-T({\bf t}))H{\rm exp}(T({\bf t}))|\Psi_{0}\rangle.

Then

\begin{aligned}\langle\Psi|(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle&=\langle\Psi|{\rm exp}(T){\rm exp}(-T)(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle\\ &=\langle\Psi|{\rm exp}(T)|\Psi_{0}\rangle\langle\Psi_{0}|{\rm exp}(-T)(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle\\ &\quad+\sum_{\mu}\langle\Psi|{\rm exp}(T)|\Psi_{\mu}\rangle\langle\Psi_{\mu}|{\rm exp}(-T)(H-E_{\rm CC}){\rm exp}(T)|\Psi_{0}\rangle\\ &=0.\end{aligned}

(50)

Since $|\Psi\rangle\in\mathcal{H}^{(N)}$ was chosen arbitrarily, this shows that ${\rm exp}(T)|\Psi_{0}\rangle$ is an eigenvector of $H$ corresponding to the eigenvalue $E_{\rm CC}$ .
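The equivalence derived above can be illustrated on a toy problem. The following is a minimal sketch, assuming a hypothetical two-state model with a single excitation matrix $X$ and arbitrarily chosen matrix entries `a`, `b`, `c` (none of these values come from the article). In this model the amplitude equation is a quadratic in the single amplitude $t$, and the CC energy evaluated at each root reproduces an eigenvalue of $H$.

```python
import numpy as np

# Hypothetical two-state model: reference |Psi_0> = e0 and one excited state |Psi_1> = e1.
a, b, c = -1.0, 0.3, 0.5
H = np.array([[a, b], [b, c]])
X = np.array([[0.0, 0.0], [1.0, 0.0]])   # excitation matrix with X @ X = 0
I2 = np.eye(2)

def similarity_transformed(t):
    """exp(-T) H exp(T) with T = t X; exp(T) = I + t X exactly because X is nilpotent."""
    return (I2 - t * X) @ H @ (I2 + t * X)

def f_cc(t):
    """Amplitude equation <Psi_1| exp(-T) H exp(T) |Psi_0>."""
    return similarity_transformed(t)[1, 0]

def e_cc(t):
    """CC energy <Psi_0| exp(-T) H exp(T) |Psi_0>."""
    return similarity_transformed(t)[0, 0]

# Multiplying out the matrices gives f_cc(t) = b + (c - a) t - b t^2.
roots = np.roots([-b, c - a, b])
energies = np.sort([e_cc(t) for t in roots])

print(energies)
print(np.linalg.eigvalsh(H))
```

Both roots of the amplitude equation correspond to eigenpairs of $H$ here because the model is untruncated; the root with the smaller amplitude is the one closest to the reference state.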

In practical applications, $|\Psi\rangle$ is of course not known; instead, we seek an amplitude vector ${\bf t}$ that fulfills the non-linear equations (47). Moreover, we consider a subspace $\bar{\mathfrak{b}}\subset\mathfrak{b}$ instead of the full space $\mathfrak{b}$. In order to still obtain a square system of equations, i.e., as many equations as variables, we only consider the projections that correspond to the excitation matrices used to expand the sought cluster matrix.

It is worth noticing that in this case, the coupled cluster solution is no longer equivalent to the quantum mechanical energy expression. In fact, this does – in general – not even yield an eigenpair. This becomes apparent by inspecting Eq. (50) and noting that in order to be exactly zero, the CC equations have to contain projections onto all basis functions $\langle\Psi_{\mu}|$ .

Restrictions to different $\bar{\mathfrak{b}}$ can be motivated from many physical and chemical perspectives. Mathematically, however, we consider these restrictions to be sparsity patterns enforced on the CC amplitude vector ${\bf t}$. In this context, it is worth noting that no mathematical result establishes the general existence of such a sparsity pattern. A sought sparsity pattern is rather motivated by computational limitations and by many computational results indicating that, even for complicated systems, a certain sparsity in ${\bf t}$ is apparent. As such, we think of this as a conjecture rather than a fact.

As a system of nonlinear equations, Eqs. (47) in general have multiple solutions. Speaking of the coupled cluster solution is therefore ambiguous. Most coupled cluster implementations seek a solution that is close to zero, employing a quasi-Newton approach and an initial guess for ${\bf t}$ that stems from MP2. Given the convergence behavior of quasi-Newton methods, together with the intricate structures that arise when considering the basins of convergence, this approach seems appropriate for a set of “well-behaved” problems but is not a generally applicable procedure. This has resulted in a number of numerical advances together with chemically or physically motivated adjustments of the considered system of equations.

4 Analysis

The numerical analysis of coupled cluster methods witnessed a significant surge since 2009 when Schneider published the pioneering work that introduced the first local analysis based on Zarantonello’s lemma [66] to coupled cluster theory. This work set the stage for several follow-up works and motivated the exploration of alternative mathematical frameworks well-suited for describing coupled cluster methods.

In Section 4.1, we outline Schneider’s approach and elaborate on the central ideas. We then proceed in Section 4.2 by introducing the graph-based framework for CC methods developed by Csirik and Laestadius. This perspective introduced novel ideas offering a unified platform to compare various CC methods, including multireference approaches. In Section 4.3 we then elaborate on the most recent numerical analysis results characterizing the single reference CC method. The authors Hassan, Maday, and Wang presented yet another and – compared to the local analysis – a more general approach based on the invertibility of the CC Fréchet derivative.

Before delving into these analytical characterizations of CC theory, we have to elaborate on three subtle mathematical details that are important to keep in mind when reading this section:

The first is related to the wave function. As outlined earlier, the most general space in which we seek to find a solution to the electronic Schrödinger equation is an anti-symmetrized Sobolev space [60]. Although we will avoid this detail explicitly in the subsequent elaborations, it is an important detail and a central concept that appears in all analysis works related to CC theory. From a quantum chemistry perspective, seeking a solution within this space ensures finite kinetic energy, in other words:

\int_{X^{N}}|\nabla\psi(\mathbf{x}_{1},\dots,\mathbf{x}_{N})|^{2}\mathrm{d}\lambda(\mathbf{x}_{1})\dots\mathrm{d}\lambda(\mathbf{x}_{N})<+\infty.

(51)

For more details on Sobolev spaces, we refer the interested reader to mathematical textbooks [47, 2, 31, 65] or relevant articles that offer insights into their application in quantum chemistry [44, 27, 55, 64]. This extra constraint of finite kinetic energy is particularly important for the continuous (i.e., infinite dimensional) formulation of coupled-cluster theory [56]. In this context, we remind the reader of the notation for the $L^{2}$ -inner product $\langle\psi^{\prime}|\psi\rangle$ , and its induced norm $\|\psi\|^{2}_{L^{2}}=\langle\psi|\psi\rangle$ .

The second important detail is a measure of distance on matrix and operator spaces. We here consider operators that act on the wave functions, e.g., the Hamiltonian $H$ , cluster operators $T$ , $\Lambda$ , etc. We can then introduce a norm expression for the operator inherited from the function space it is defined on. For example, let $O$ be an operator defined on $L^{2}$ then we define the $L^{2}$ operator norm

\|O\|_{L^{2}}=\sup\{\|O\psi\|_{L^{2}}~:~\|\psi\|_{L^{2}}=1\}.

(52)

Note that this concept reduces to induced matrix norms in the finite-dimensional case.
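In finite dimensions, the supremum in Eq. (52) can be evaluated directly. The following is a small sketch, assuming a random real matrix as a stand-in for an operator $O$: the induced 2-norm equals the largest singular value, sampled unit vectors never exceed it, and it is attained at the leading right singular vector.

```python
import numpy as np

rng = np.random.default_rng(1)
O = rng.standard_normal((4, 4))      # stand-in for an operator on a 4-dimensional space

# Induced 2-norm of a matrix: its largest singular value.
op_norm = np.linalg.norm(O, 2)

# Sample unit vectors; ||O psi|| can never exceed the operator norm.
samples = rng.standard_normal((4, 1000))
samples /= np.linalg.norm(samples, axis=0)
sampled_max = np.max(np.linalg.norm(O @ samples, axis=0))

# The supremum is attained at the leading right singular vector.
_, singular_values, vt = np.linalg.svd(O)
attained = np.linalg.norm(O @ vt[0])

print(op_norm, sampled_max, attained)
```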

The third detail is that the CC amplitudes ${\bf t}$ live in the Hilbert space of finite square summable sequences denoted the $\ell^{2}$ -space. This space is equipped with the $\ell^{2}$ -inner product [2], i.e., let $x=(x_{\mu})$ and $y=(y_{\mu})$ be two finite sequences, the $\ell^{2}$ -inner product is defined as

\langle x,y\rangle_{\ell^{2}}=\sum_{\mu}x_{\mu}y_{\mu},

(53)

which induces the norm $\|x\|^{2}_{\ell^{2}}=\langle x,x\rangle_{\ell^{2}}$ .

4.1 Local strong monotonicity

The local analysis introduced by Schneider [60] has spawned various works following a similar methodology analyzing different CC methods: Rohwedder generalized it to infinite dimensions [56, 57], Laestadius and Kvaal adapted it for the extended CC framework [45], and Faulstich et al. adapted it for tailored CC methods [25]. Central to all these local analyses is the local version of Zarantonello’s lemma [66].

The local version of Zarantonello’s lemma states that – under certain conditions – a function is (locally) invertible. In the context of coupled cluster theory, the function under investigation is the CC function $f_{\rm CC}$ defined in Eq. (48). The local invertibility of this function yields the local existence and uniqueness of a CC solution. To ensure the applicability of Zarantonello’s lemma, the function in question must exhibit specific characteristics of mathematical “well-behavedness”. Specifically, in this context, it means that the function must satisfy two essential properties:

Local strong monotonicity. The function $f_{\mathrm{CC}}$ is called locally strongly monotone at $\mathbf{t}_{*}$ if for some $r>0$ , $\gamma>0$ and all $\mathbf{t},\mathbf{t}^{\prime}$ within the distance $r$ of $\mathbf{t}_{*}$

\langle f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime}),\mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}\geq\gamma\|\mathbf{t}-\mathbf{t}^{\prime}\|_{\ell^{2}}^{2}.

(54)

Local Lipschitz continuity. The function $f_{\mathrm{CC}}$ is said to be locally Lipschitz continuous at $\mathbf{t}_{*}$ with Lipschitz constant $L>0$ if for some $r>0$ and all $\mathbf{t},\mathbf{t}^{\prime}$ within the distance $r$ of $\mathbf{t}_{*}$

\|f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime})\|_{\ell^{2}}\leq L\|\mathbf{t}-\mathbf{t}^{\prime}\|_{\ell^{2}}.

(55)

Note that in the finite-dimensional case, $f_{\mathrm{CC}}$ is indeed locally Lipschitz since it is continuously differentiable.

In the context of CC theory, the difficult property to prove is that $f_{\rm CC}$ – or the respective function describing the CC method under consideration – is locally strongly monotone. Here, all analyses generally follow a similar pattern: Inspecting the left-hand side in Eq. (54), we begin with a Taylor expansion of $f_{\rm CC}$ around $\mathbf{t}_{*}$, i.e.,

f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime})=Df_{\mathrm{CC}}(\mathbf{t}_{*})(\mathbf{t}-\mathbf{t}^{\prime})+\mathcal{O}\left((\mathbf{t}-\mathbf{t}^{\prime})^{2}\right),

(56)

where $Df_{\mathrm{CC}}$ is the Jacobian of $f_{\rm CC}$ . This yields

\langle f_{\mathrm{CC}}(\mathbf{t})-f_{\mathrm{CC}}(\mathbf{t}^{\prime}),\mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}=\langle Df_{\mathrm{CC}}(\mathbf{t}_{*})(\mathbf{t}-\mathbf{t}^{\prime}),\mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}+\mathcal{O}\left(\|\mathbf{t}-\mathbf{t}^{\prime}\|^{3}\right).

(57)

At this point, it is common to impose certain locality assumptions, i.e., assuming that $\mathbf{t}$ and $\mathbf{t}^{\prime}$ are close enough to $\mathbf{t}_{*}$ . This ensures that the term $\mathcal{O}\left(\|\mathbf{t}-\mathbf{t}^{\prime}\|^{3}\right)$ is sufficiently small. In order to control the remaining term, the derivative of $f_{\rm CC}$ can explicitly be computed, which yields

\langle Df_{\mathrm{CC}}(\mathbf{t}_{*})(\mathbf{t}-\mathbf{t}^{\prime}),\mathbf{t}-\mathbf{t}^{\prime}\rangle_{\ell^{2}}=\langle(T-T^{\prime})\Psi_{0},e^{-T_{*}}\left(H-E_{\rm CC}(\mathbf{t}_{*})\right)e^{T_{*}}(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}.

(58)

The next step involves expanding the similarity-transformed Hamiltonian, i.e., $e^{-T_{*}}He^{T_{*}}$ , using the Hausdorff Lemma [32], which is an important lemma derived from the Baker–Campbell–Hausdorff formula. This yields

e^{-T_{*}}\left(H-E_{\rm CC}({\bf t_{*}})\right)e^{T_{*}}=\left(H-E_{\rm CC}({\bf t_{*}})\right)-T_{*}\left(H-E_{\rm CC}({\bf t_{*}})\right)+\left(H-E_{\rm CC}({\bf t_{*}})\right)T_{*}+\dots

(59)
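The expansion in Eq. (59) can be checked numerically. The following is a minimal sketch, assuming a random symmetric matrix `A` as a stand-in for the shifted Hamiltonian and a small strictly lower triangular matrix `T` as a stand-in for $T_{*}$ (both are illustrative assumptions, as are the helper names). It compares the exact similarity transformation with the nested-commutator series truncated at first order and at full order.

```python
import numpy as np
from math import factorial

def exp_nilpotent(T):
    """Finite exponential series for a strictly lower triangular (nilpotent) matrix."""
    n = T.shape[0]
    result, power = np.eye(n), np.eye(n)
    for k in range(1, n):
        power = power @ T
        result = result + power / factorial(k)
    return result

def hausdorff_series(A, T, order):
    """A + [A,T] + [[A,T],T]/2! + ... up to the given commutator order."""
    total = np.zeros_like(A)
    nested = A.copy()
    for k in range(order + 1):
        total = total + nested / factorial(k)
        nested = nested @ T - T @ nested      # next nested commutator [ . , T]
    return total

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                                        # symmetric stand-in for H - E_CC
T = 0.05 * np.tril(rng.standard_normal((n, n)), k=-1)    # nilpotent stand-in for T_*

exact = exp_nilpotent(-T) @ A @ exp_nilpotent(T)
first_order = hausdorff_series(A, T, 1)                  # A - T A + A T, as in Eq. (59)
full = hausdorff_series(A, T, 2 * (n - 1))               # series terminates for nilpotent T

print(np.linalg.norm(first_order - exact), np.linalg.norm(full - exact))
```

For a nilpotent $T$ the commutator series is finite, so the full-order sum reproduces the similarity transformation up to rounding, while the first-order truncation leaves a remainder that is quadratic in $T$.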

Again, imposing locality of $\mathbf{t}$ and $\mathbf{t}^{\prime}$ around $\mathbf{t}_{*}$ will ensure that higher-order terms become negligible. This yields that

\begin{aligned}\langle(T-T^{\prime})\Psi_{0},\,&e^{-T_{*}}\left(H-E_{\rm CC}({\bf t_{*}})\right)e^{T_{*}}(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}\\ &=\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})\right)(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}\\ &\quad+\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})\right)(T_{*}-T_{*}^{\dagger})(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}\\ &\quad+\dots\end{aligned}

(60)

The first term in this expansion, i.e.,

\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})\right)(T-T^{\prime})\Psi_{0}\rangle_{L^{2}},

(61)

can then be bounded by imposing different spectral gap assumptions depending on the CC method under consideration. The applicability of different spectral gap assumptions is reasonable in the context of CC methods. For the remainder term,

\langle(T-T^{\prime})\Psi_{0},\left(H-E_{\rm CC}({\bf t_{*}})\right)(T_{*}-T_{*}^{\dagger})(T-T^{\prime})\Psi_{0}\rangle_{L^{2}}

(62)

further “well-behavedness” assumptions have to be made, which commonly involve the fluctuation potential $W=H-F$, where $F$ is the Fock matrix. In contrast to the spectral gap assumptions, much less is known about the feasibility of such assumptions. Combining these estimates yields an approximate strong monotonicity constant denoted by $\Gamma$. The positivity of this constant varies depending on the system being analyzed, indicating that such an analysis is not universally applicable. For specific values of $\Gamma$ across different systems, we refer to Table 1. Moreover, a priori error estimates can be established using the general framework introduced by Bangerth and Rannacher [3].

4.2 Excitation graphs and topological degree

In a series of two articles [18, 19], Csirik and Laestadius propose a novel and comprehensive mathematical framework for coupled-cluster-type methods.

In the first article of this series [18], the authors develop a graph theoretical approach offering a new interpretation of the excitation structures in various CC methods through a graph-based framework. This method is particularly potent as it enables a cohesive analysis of both single and multi-reference CC methods within a unified structure. To illustrate this concept, we consider a simplified scenario with five spin orbitals, labeled $\{1,...,5\}$ , and two reference states, $\{1,2,3\}$ and $\{1,2,4\}$ . The array of possible excitations in this setup can be effectively represented using a graph, where each edge symbolizes a potential excitation. This graphical representation is detailed in Figure 2, providing a clear and structured visualization of the excitation dynamics.

Analyzing the excitation graph itself can lead to insights about the considered CC method. As an example, the transitivity of the graph implies the algebraic closedness of the set of excitation operators.

In the second article of this series [19], the authors analyze the nonlinear equations arising in the single reference CC method using topological degree theory. This mathematical tool is instrumental in decoding and resolving specific equation types that entail mappings between topological spaces. When applied to the CC map, topological degree theory allows for the deduction of local existence and uniqueness of the CC solutions. Additionally, it facilitates the extraction of the topological index for solutions within the single reference CC framework. In general, the topological index of a root in a nonlinear map is particularly enlightening, shedding light on the root’s inherent nature, especially regarding its stability and the map’s behavior in its vicinity. In this context, the authors successfully demonstrate the application of topological index results to both non-degenerate and degenerate solutions in the single reference CC method, providing deeper insights into the underlying mathematical structure of these solutions.

In addition to their exploration of nonlinear equations in the single reference CC method, the authors also investigate the complex issue of discerning the “physicality” of solutions to truncated CC equations. This area of research has been pivotal in distinguishing between “physical” solutions, which accurately mirror real-world phenomena, and “unphysical” ones, which are considered irrelevant or misleading. A landmark study by Kowalski and Piecuch [53] played a crucial role in this context, employing a specific homotopy method to categorize these solutions. Despite some debate over the universality of this method, as noted by Csirik and Laestadius in Remark 4.30 in [19], the contributions of Kowalski and Piecuch were significant – to the extent that the authors christened this particular homotopy the “Kowalski–Piecuch homotopy”, or “KP homotopy” for short. By unveiling the intricate nature of solutions to truncated CC equations, the results in [53] highlighted the need for deeper analytical scrutiny. This revelation has spurred further examination of the CC equations and the KP homotopy approach, with a renewed focus on employing topological degree theory. By doing so, Csirik and Laestadius have markedly enhanced our comprehension of the complex nature inherent in truncated CC equations, offering new perspectives and deeper insights into their behavior and implications.

4.3 Inf-Sup condition

In their two-part series of articles [34, 33], Hassan, Maday, and Wang have made substantial advancements in our analytical grasp of the CC function $f_{\rm CC}$ defined in Eq. (48). To avoid an ad hoc bound on the fluctuation potential as imposed in Sec. 4.1, the authors instead prove the local invertibility of the CC function through a classical inf-sup type argument, which marks a significant shift in the analytical methodology employed in CC theory. Such an inf-sup condition, also called the Babuška–Brezzi condition, is commonly used when analyzing indefinite elliptic partial differential equations and can be summarized as follows:

Consider a bounded linear mapping $A$ between two normed spaces $(V,\|\cdot\|_{V})$ and $(W,\|\cdot\|_{W})$, written as the bilinear form $A(v,w)$ on $V\times W$ – note that the Jacobian naturally fulfills these assumptions. The Babuška–Brezzi condition states that there exists a constant $\alpha>0$ such that

\inf_{\begin{subarray}{c}v\in V\\ v\neq 0\end{subarray}}\sup_{\begin{subarray}{c}w\in W\\ w\neq 0\end{subarray}}\frac{|A(v,w)|}{\|v\|_{V}\|w\|_{W}}>0\quad{\rm and}\quad\inf_{\begin{subarray}{c}w\in W\\ w\neq 0\end{subarray}}\sup_{\begin{subarray}{c}v\in V\\ v\neq 0\end{subarray}}\frac{|A(v,w)|}{\|v\|_{V}\|w\|_{W}}\geq\alpha,

(63)

see [58] for more details. This condition ensures that the operator $A$ is neither “too weak” nor “too strong”, in the sense that it maps elements of $V$ and $W$ in a balanced way. Gaining a clearer understanding becomes easier in the context of finite dimensions: In this scenario, striving for local strong monotonicity, as detailed in Sec. 4.1, is analogous to verifying that a matrix is positive definite. Similarly, the inf-sup condition, as described in this section, can be likened to establishing a matrix’s invertibility. Note that in the realm of finite dimensions, a square matrix’s invertibility can be deduced solely from its injectivity. However, the infinite-dimensional scenario demands a bit more nuance. This is why we see two distinct conditions in Eq. (63), reflecting the additional complexity inherent in infinite dimensions.
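The finite-dimensional analogy can be made concrete. The following is a minimal sketch, assuming Euclidean norms and an arbitrarily chosen $2\times 2$ matrix (an illustrative assumption, not an example from [34]): the strong monotonicity constant of a linear map is the smallest eigenvalue of its symmetric part, whereas the inf-sup constant is its smallest singular value. The chosen matrix is invertible but not positive definite, so the second constant is positive while the first is not.

```python
import numpy as np

A = np.array([[1.0, 3.0],
              [0.0, 1.0]])

# Strong monotonicity constant of v -> A v: smallest eigenvalue of the symmetric part,
# because <A v, v> = <(A + A^T)/2 v, v> for real vectors v.
gamma = np.linalg.eigvalsh((A + A.T) / 2).min()

# Inf-sup constant with Euclidean norms:
# inf_v sup_w |<w, A v>| / (|v| |w|) = inf_v |A v| / |v| = smallest singular value of A.
alpha = np.linalg.svd(A, compute_uv=False).min()

print(gamma)                                              # negative: A is not positive definite
print(alpha, 1.0 / np.linalg.norm(np.linalg.inv(A), 2))   # positive, equals 1 / ||A^{-1}||
```

This is the same pattern that appears when a monotonicity-type constant fails to be positive although the Jacobian is invertible.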

In connection with single reference coupled cluster theory, the authors establish this condition for the similarity transformed shifted Hamiltonian, which arises from the coupled cluster Jacobian, see [34]. As shown subsequently, proving local invertibility based on this inf-sup condition yields more generally applicable well-posedness results compared to the local analysis techniques described in Sec. 4.1.

4.3.1 Overview of the inf-sup type argument

The essence of the analysis presented in [34] is that the CC function can be locally inverted if and only if its Jacobian, referred to as the CC Jacobian, can be locally inverted. Recall that the CC Jacobian is given by

\langle w,Df_{\rm CC}(t)s\rangle=\langle W\Psi_{0},e^{-T(t)}[H,S]e^{T(t)}\Psi_{0}\rangle

(64)

where $W$ and $S$ denote the cluster operators with amplitude vectors $w$ and $s$, respectively, and we introduce the map $A$ via the description

\langle W\Psi_{0},A(t)S\Psi_{0}\rangle=\langle W\Psi_{0},e^{-T(t)}[H,S]e^{T(t)}\Psi_{0}\rangle.

(65)

Note that the CC Jacobian $Df$ at ${\bf t}$ is then invertible if and only if $A$ is invertible at ${\bf t}$. The authors leverage this observation and work with $A$ instead of the Jacobian $Df$. Moreover, $A({\bf t}_{*})$ is a similarity transformation of the shifted Hamiltonian; in particular, it is non-symmetric! Therefore, one can study either $A({\bf t})$ or $A^{\dagger}({\bf t})$; both approaches are equivalent, yet one might be simpler than the other. Indeed, the authors establish the following two key results, which yield the invertibility of $A$ at ${\bf t}_{*}$ and thereby the invertibility of the CC Jacobian $Df_{\rm CC}$ at ${\bf t}_{*}$, see Theorem 31 in [34]. First, the authors prove that at the true, untruncated CC solution ${\bf t}_{*}$, the function $A({\bf t}_{*})$ is injective, see step one in the proof of Theorem 31 in [34]. This is equivalent to the first inequality in Eq. (63). Second, the authors establish that $A^{\dagger}({\bf t}_{*})$ is bounded below, see step two in the proof of Theorem 31 in [34]. This is equivalent to the second inequality in Eq. (63). Combining these results yields that $A$ at ${\bf t}_{*}$ is invertible. A direct consequence of this is that the CC Jacobian $Df$ evaluated at ${\bf t}_{*}$ is invertible, and its inverse is bounded, i.e.,

\|Df^{-1}({\bf t}_{*})\|\leq\frac{\Theta}{\Upsilon},\qquad{\rm with}\qquad\Theta=\|e^{T^{\dagger}({\bf t}_{*})}\|\|\mathbb{P}_{0}^{\perp}e^{-T({\bf t}_{*})}\|,

(66)

where $\Upsilon$ is the inf-sup constant (denoted $\alpha$ in Eq. (63)), and $\mathbb{P}_{0}^{\perp}$ is the projection onto the space orthogonal to ${\rm span}(\Psi_{0})$.

These results, in turn, can then be leveraged to establish that the CC function $f_{\rm CC}$ , under some assumptions (see Theorem 33 in [34]), is locally invertible around ${\bf t}_{*}$ and $f_{\rm CC}$ as well as its local inverse are differentiable – in mathematical parlance, $f_{\rm CC}$ is a local diffeomorphism. Moreover, the authors establish a local error bound of the form

\|{\bf t}_{*}-{\bf t}\|\leq 2\frac{\Theta}{\Upsilon}\|f_{\rm CC}({\bf t})\|.

(67)

4.3.2 Interpretation and results of the inf-sup argument

Similarly to the local analysis results elaborated on in Sec. 4.1, the inf-sup argument relies on the positivity of the constants involved. The advantage of the analysis presented in [34], is that the constants are provably positive and therefore universally applicable. In particular, they do not rely on assumptions on the fluctuation potential. See Table 1 for some molecular test systems in equilibrium geometry, and Fig. 3 for bond-dissociation of hydrogen fluoride.

Table 1: Comparison of the approximate strong monotonicity constant $\Gamma$ (see Sec. 4.1) and $\Upsilon/\Theta$ (see Eq. (66)), both seeking a positive lower bound to $1/\|Df^{-1}(t_{*})\|$. The calculations were performed in the STO-6G basis set, except for the HF and LiH molecules, for which the 6-31G basis set was used. For more details see [34].

Molecule	$1/\|Df^{-1}(t_{*})\|$	$\Upsilon/\Theta$	$\Gamma$
BeH$_{2}$	0.3379	0.2568	0.0363
BH$_{3}$	0.3060	0.2081	-0.0950
HF	0.2995	0.2529	-0.0083
H$_{2}$O	0.3576	0.2789	0.0249
LiH	0.2628	0.2164	-0.0065
NH$_{3}$	0.4113	0.2784	-0.0325


While the analytical approach outlined in [34] encounters certain challenges, its contributions to the mathematical understanding of CC theory are pivotal. Initially, this analysis seemed limited to the untruncated CC framework, relating approximate untruncated CC solutions with the infinite-dimensional untruncated CC solutions. However, the authors adeptly addressed this in a subsequent publication [33], successfully extending their findings to truncated CC methods. Another complexity lies in the computation of the involved constants in a numerically tractable manner. The constants involve operator norms which are in general not easily accessible, to say the least. Moreover, these constants are further linked to either the specific value of the untruncated CC solution ${\bf t}_{*}$ or the spectral properties of related operators. Despite this, the potential for practical application remains promising. Future work could focus on developing manageable approximations of these constants, thereby making the insights from [34, 33] more accessible for practical simulations.

In summary, this novel analytical approach has significantly advanced our understanding of the local behavior of the CC function by placing it on a sound mathematical footing.

5 The root structure of CC theory

As outlined in [26, 24], the root structure of a polynomial system is (in general) of fundamental importance. It unveils key aspects [37], such as the multiplicity of roots and the nature of these roots, e.g., whether they are real or complex. Such insights are especially vital when employing (approximate) root-finding methods in practical applications. The pursuit of roots to the CC equations (47) is a direct application of these principles.

In the context of CC methods, (quasi) Newton-type methods are most commonly employed to approximate one root of the CC equations. From a computational perspective, (quasi) Newton-type methods have better numerical scaling than more general root-finding procedures. Additionally, in a perturbative regime, one could argue that the CC amplitudes can be viewed as minor corrections to the HF solution. Consequently, it may be sufficient to approximate a single root near zero, which represents a small change to the HF solution. This perturbation theoretical reasoning is of paramount importance in understanding the current computational and theoretical practices in CC theory. In particular, it justifies the use of (quasi) Newton-type methods and also explains the quantum chemical rule of thumb, namely, “Do not trust simulations with large CC amplitudes”. However, it is very important to note that:

This reasoning does not cover all cases where CC theory can be successfully applied! [12, 30, 23]

The rule of thumb may be fine in the regime of weakly correlated systems, but it certainly breaks down for strongly correlated systems [26, 23]. For strongly correlated systems, it is common practice to make a case-by-case assessment of the computed results, currently limiting the reliable out-of-the-box application of CC methods. To illustrate the limitations of the perturbation theoretical perspective in fully comprehending CC theory, consider the single polynomial $p(z)=z^{3}-1$, which has three distinct roots: $z_{1}=1$ and $z_{2,3}=-1/2\pm i\sqrt{3}/2$. Applying Newton’s method to approximate one root of this polynomial, we notice that, depending on the initialization, a different solution is found. This can be visualized by sampling a feasible region in $\mathbb{C}$ and using these points as initialization for Newton’s method. Depending on which root was approximated, we then color each point accordingly. This yields the known Newton fractal corresponding to $p(z)$, see Fig. 4 (left panel).
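The sampling procedure described above can be sketched in a few lines. The following is a minimal illustration, assuming a coarse grid on the square $[-2,2]\times[-2,2]$ and a fixed number of Newton steps (the grid, the step count, and the function name `newton_basin` are illustrative choices, not taken from the article). Each initial point is labeled by the index of the cube root of unity that Newton's method reaches.

```python
import numpy as np

ROOTS = np.exp(2j * np.pi * np.arange(3) / 3)   # the three cube roots of unity

def newton_basin(z0, steps=50):
    """Index of the root of z^3 - 1 reached by Newton's method from z0 (-1 if none)."""
    z = complex(z0)
    for _ in range(steps):
        derivative = 3 * z**2
        if derivative == 0:            # Newton step undefined where the derivative vanishes
            return -1
        z = z - (z**3 - 1) / derivative
    distances = np.abs(ROOTS - z)
    k = int(np.argmin(distances))
    return k if distances[k] < 1e-8 else -1

# Coarse sample of the square [-2, 2] x [-2, 2]; a finer grid reveals the fractal boundaries.
xs = np.linspace(-2.0, 2.0, 41)
basins = np.array([[newton_basin(x + 1j * y) for x in xs] for y in xs])

# Number of grid points attracted to each root (label -1 collects unclassified points).
print({k: int(np.sum(basins == k)) for k in (-1, 0, 1, 2)})
```

Coloring the array `basins` by label gives a low-resolution version of the Newton fractal; initial points that are close to each other can carry different labels near the basin boundaries.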

This shows that around the individual roots Newton’s method converges towards the closest root. However, it also shows that the global convergence behavior of (quasi) Newton-type methods is highly complicated [37, 59]. One can only imagine how intricate the Newton fractal of the high-dimensional CC equations is. These considerations raise the pressing question:

Which CC root has been approximated, and is this the “best” solution attainable with the considered CC method?

To definitively answer this question, one must leave the perturbative framework, theoretically as well as practically! Mathematically, the most promising framework for studying systems of polynomial equations is algebraic geometry. This field not only provides a set of advanced theoretical tools but also has seen a tremendous surge in computational advances. Exploiting parallel implementations, computational procedures (mostly) based on the homotopy continuation method, e.g., PHCpack [63], Bertini [7], HOM4PS [13, 46], NAG4M2 [6], and HomotopyContinuation.jl [11], provide a reasonable starting point to numerically investigate the intricate root structures of the high-dimensional and non-linear CC equations.
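The basic mechanism behind homotopy continuation can be illustrated on a single univariate polynomial. The following is a rough sketch under stated assumptions and not the algorithm of any of the packages listed above: the start system $z^{3}-1$, the target $z^{3}+z+1$, the fixed complex constant `gamma`, the uniform step size, and the plain Newton corrector are all illustrative choices (production codes use predictor steps and adaptive step control). All roots of the start system are tracked along $H(z,s)=(1-s)\gamma(z^{3}-1)+s\,p(z)$ from $s=0$ to $s=1$.

```python
import numpy as np

def track_roots(target, start_roots, gamma, steps=400, newton_iters=5):
    """Track roots of H(z, s) = (1 - s) * gamma * (z^d - 1) + s * target(z) from s = 0 to 1."""
    target = np.asarray(target, dtype=complex)
    d = len(target) - 1
    start = np.zeros(d + 1, dtype=complex)
    start[0], start[-1] = 1.0, -1.0               # coefficients of z^d - 1
    z = np.array(start_roots, dtype=complex)
    for s in np.linspace(0.0, 1.0, steps + 1)[1:]:
        coeffs = (1.0 - s) * gamma * start + s * target
        dcoeffs = np.polyder(coeffs)
        for _ in range(newton_iters):             # Newton corrector at the new value of s
            z = z - np.polyval(coeffs, z) / np.polyval(dcoeffs, z)
    return z

target = [1.0, 0.0, 1.0, 1.0]                           # z^3 + z + 1 (illustrative target)
start_roots = np.exp(2j * np.pi * np.arange(3) / 3)     # known roots of the start system
gamma = np.exp(0.7j)                                    # fixed complex constant ("gamma trick")

tracked = track_roots(target, start_roots, gamma)
print(np.sort_complex(tracked))
print(np.sort_complex(np.roots(target)))
```

In contrast to a single Newton iteration started near zero, every start root is followed to a root of the target, so all roots are obtained at once. The complex constant keeps the paths away from singular points for generic choices.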

Within the chemistry community, the root structure of the CC equations has been studied at a fundamental level with the goal of including homotopy continuation methods in the CC methodology. The first study on this topic dates back to 1978 when Živković and Monkhorst investigated the singularities and multiple solutions of the equations [67]. This was followed by mathematical and numerical studies of multiple solutions of the single-reference and state-universal multi-reference CC equations and their singularities and analytic properties in the early 1990s by Paldus and coworkers [54, 51]. In 1998, Kowalski and Jankowski revived the homotopy methods in connection with the CC theory and used them to solve the CC equations with doubles for a minimum-basis-set four-electron problem [42]. This was followed by a fruitful collaboration of Kowalski and Piecuch, who extended the application of the homotopy methods to the equations defining the CC approaches with singles and doubles (CCSD), singles, doubles, and triples (CCSDT), and singles, doubles, triples, and quadruples (CCSDTQ) [53], again using a four-electron system described by a minimum basis set as a target. They also introduced the formalism of $\beta$-nested equations and proved the Fundamental Theorem of the $\beta$-NE Formalism, which enabled them to explain the behavior of the curves connecting multiple solutions of the various CC polynomial systems, i.e., from CCSD to CCSDT, CCSDT to CCSDTQ, etc. In [43], Piecuch and Kowalski used homotopy methods to determine all solutions of nonlinear state-universal multireference CCSD equations based on the Jeziorski-Monkhorst ansatz, proving two theorems that provided an explanation for the observed intruder solution problem. In a sequel work [41], they used homotopy methods to obtain all solutions of the generalized Bloch equation, which is nonlinear even in a CI parametrization.
Despite these intensive investigations, the practical computational use of this approach has been restricted to very small model systems, primarily for two reasons. Firstly, to effectively integrate computational algebraic methods with cutting-edge computational quantum chemistry, a substantial scientific divide must be bridged, one that involves advanced and abstract mathematical principles. Secondly, in the late 1980s and 1990s, the field of computational nonlinear algebra was in its infancy, presenting a pioneering yet challenging academic environment for advancements.

Recently, a novel computational shift adopting a fully algebraic geometry perspective of CC theory was established [26, 22]. This approach has demonstrated significant potential in reshaping our understanding of the CC theory [26, 22, 24, 10]. In preliminary works, the authors Faulstich, Oster, Sturmfels, and Sverrisdóttir have demonstrated that the CC equations possess rich mathematical structures. By integrating these structures into the computational model, the authors were able to significantly reduce the computational scaling of algebro-computational methods applied to the CC equations, allowing the computation of all CC roots for small molecular systems [22].

The remainder of this section is organized as follows. We begin with a brief review of the fundamental concepts underlying the homotopy continuation method in Sec. 5.1. We then discuss different bounds to the number of roots to the CC equations and introduce the crucial concept of truncation varieties in Sec. 5.2. In Sec. 5.3, we review the essential numerical discoveries yielded by this approach, providing a detailed analysis of its implications.

5.1 Homotopy continuation

Most algebro-computational methods are built on the idea of homotopy continuation, and the numerical approach established in [22] is no exception. The idea of homotopy continuation is simple: continuously transform a simple system of polynomials with known solutions into a more complex one and track the paths of these solutions. More formally, we consider the CC equations, written in the following form

f_{\rm CC}({\bf t})=\begin{bmatrix}f_{1}({\bf t})\\ \vdots\\ f_{m}({\bf t})\end{bmatrix}=\begin{bmatrix}f_{1}(t_{1},\ldots,t_{m})\\ \vdots\\ f_{m}(t_{1},\ldots,t_{m})\end{bmatrix}=0.

(68)

This is our target system, i.e., the system we wish to solve. In the general case, the number of equations is required to be at least as large as the number of variables; the CC equations, however, form a square system, i.e., we have as many equations as variables. In order to find all roots of the system in Eq. (68), we construct an auxiliary system of polynomial equations denoted $g(\mathbf{t})=0$. For the construction of this system, two fundamental criteria must be met: firstly, the roots of $g$ must be known, and secondly, the system $g$ must have at least as many roots as the target system $f_{\rm CC}$. While meeting the first condition is relatively simple, the second condition poses a greater challenge, as accurately determining the number of roots of the CC equations is a hard problem, see Sec. 5.2. Having $f_{\rm CC}$ and $g$, we define a family of systems $H(\mathbf{t},\lambda)$ for $\lambda\in\mathbb{R}$ interpolating between $f_{\rm CC}$ and $g$, i.e., $H(\mathbf{t},0)=f_{\rm CC}(\mathbf{t})$ and $H(\mathbf{t},1)=g(\mathbf{t})$. For the sake of illustration, we now consider one root $\mathbf{s}_{0}$ of $g$ and restrict $\lambda\in[0,1]$. The condition $H(\mathbf{t},\lambda)=0$ then defines a solution path $\mathbf{t}(\lambda)\subset\mathbb{C}^{m}$ such that $H(\mathbf{t}(\lambda),\lambda)=0$ for $\lambda\in[0,1]$ and $\mathbf{t}(1)=\mathbf{s}_{0}$. Numerically, this path is followed from $\lambda=1$ to $\lambda=0$ in order to compute one solution $\mathbf{t}_{0}=\mathbf{t}(0)$ to the target system $f_{\rm CC}$. This procedure is equivalent to solving the initial value problem

\frac{\partial}{\partial\mathbf{t}}H(\mathbf{t},\lambda)\left(\frac{\mathrm{d}}{\mathrm{d}\lambda}\mathbf{t}(\lambda)\right)+\frac{\partial}{\partial\lambda}H(\mathbf{t},\lambda)=0,\quad\mathbf{t}(1)=\mathbf{s}_{0},

which is known as the Davidenko differential equation [20, 21]. We say that $\mathbf{t}(1)=\mathbf{s}_{0}$ gets tracked towards $\mathbf{t}(0)$. For this to work, $\mathbf{t}(\lambda)$ must be a regular zero of $H(\mathbf{t},\lambda)=0$ for every $\lambda\in(0,1]$. In the case of nonregular solutions at $\lambda=0$, special numerical methods known as endgames are employed [50].

In analyzing the solution paths traced by the homotopy, as illustrated in Fig. 5, various scenarios may arise [6]. One path, represented by the solid line, diverges to infinity as $\lambda\to 0$. In contrast, the other three paths converge to finite limits. The path indicated by a dotted-dashed line uniquely converges to a regular zero of the target system $f_{\rm CC}$ at $\lambda=0$. Meanwhile, the two paths denoted by dashed lines converge to a common limit, corresponding to an isolated zero of $f_{\rm CC}$ with a multiplicity of two. Mathematically, homotopy continuation methods are well studied; we refer the interested reader to [6, 29, 49, 61], and to [24] for a quantum chemistry perspective.
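To make the path-tracking idea concrete, the following minimal Python sketch tracks the roots of a univariate start system to those of a univariate target polynomial with an Euler predictor (a discretization of the Davidenko equation) and a Newton corrector. It is a toy illustration, not the implementation used in [22]: the polynomials, the function names, the fixed step count, and the complex factor `gamma` multiplying the start system (a commonly used device to keep paths away from singular points) are all assumptions made for this example.

```python
import cmath

def poly_eval(coeffs, t):
    """Evaluate a polynomial (coefficients listed from highest degree) by Horner's rule."""
    value = 0j
    for c in coeffs:
        value = value * t + c
    return value

def poly_deriv(coeffs):
    """Coefficients of the derivative polynomial."""
    n = len(coeffs) - 1
    return [c * (n - k) for k, c in enumerate(coeffs[:-1])]

def track_path(f, g, start_root, gamma, steps=2000, newton_iters=3):
    """Track one root of g (at lambda = 1) to a root of f (at lambda = 0).

    Homotopy: H(t, lambda) = (1 - lambda) * f(t) + lambda * gamma * g(t).
    """
    df, dg = poly_deriv(f), poly_deriv(g)
    t = complex(start_root)
    for k in range(steps):
        lam = 1.0 - k / steps
        lam_next = 1.0 - (k + 1) / steps
        # Davidenko equation: H_t * dt/dlambda + H_lambda = 0
        H_t = (1 - lam) * poly_eval(df, t) + lam * gamma * poly_eval(dg, t)
        H_lam = -poly_eval(f, t) + gamma * poly_eval(g, t)
        t += (lam_next - lam) * (-H_lam / H_t)  # Euler predictor
        for _ in range(newton_iters):           # Newton corrector at lam_next
            H = (1 - lam_next) * poly_eval(f, t) + lam_next * gamma * poly_eval(g, t)
            H_t = (1 - lam_next) * poly_eval(df, t) + lam_next * gamma * poly_eval(dg, t)
            t -= H / H_t
    return t

# Toy target system: f(t) = t^3 - 7 t + 6 = (t - 1)(t - 2)(t + 3)
target = [1, 0, -7, 6]
# Start system with known roots: g(t) = t^3 - 1 (cube roots of unity)
start = [1, 0, 0, -1]
gamma = cmath.exp(0.7j)  # arbitrary fixed complex phase (assumption)
start_roots = [cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

roots = [track_path(target, start, s, gamma) for s in start_roots]
for s, r in zip(start_roots, roots):
    print(f"{s:.3f} -> {r:.6f}")
```

In this toy case the start and target systems have the same degree, so no path diverges; production solvers additionally use adaptive step sizes and endgames, which are omitted here.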

5.2 Bounding the number of roots

As becomes apparent from Sec. 5.1, knowing the precise count of roots, or at least a close upper bound, is crucial for the effective application of homotopy methods. This number dictates the number of roots in the auxiliary system $g$ and thereby determines the number of paths to be numerically tracked. Due to the high dimensionality, this turns out to be particularly challenging in the case of CC theory. In the following, we denote by

{\rm CCdeg}_{N,N_{B}}(\sigma)

(69)

the true number of roots to the CC equations for a system of $N$ electrons discretized in $N_{B}$ spin orbitals imposing the CC truncation level $\sigma$ , where e.g. $\sigma=\{1,2\}$ stands for CCSD, $\sigma=\{2\}$ stands for CCD, $\sigma=\{1,2,3\}$ stands for CCSDT, etc.

In order to establish a bound to the number of roots to the CC equations (47), one can start with the simplest estimate for the number of roots in a polynomial system, namely, the Bézout number. The Bézout number is simply the product of the degrees of the individual polynomial equations. In the case of CCSD, this yields

{\rm CCdeg}_{N,N_{B}}(\{1,2\})\leq 3^{n_{s}}4^{n_{d}}

(70)

where $n_{s}=N(N_{B}-N)$ is the number of singles equations and $n_{d}=(N-1)N(N_{B}-N-1)(N_{B}-N)$ is the number of doubles equations, see e.g. [53]. The Bézout number often greatly overestimates the actual number of roots, as seen in the CC equations [26]. For the effective use of homotopy methods, however, it is essential to have precise and accurate estimates of the number of roots.
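As a small illustration of how quickly the Bézout number grows, the following Python sketch evaluates the product of degrees for a square polynomial system and specializes it to the CCSD form of Eq. (70). The function names and the equation counts used in the example call are hypothetical and chosen only for illustration.

```python
from math import prod

def bezout_number(degrees):
    """Product of the total degrees of the individual polynomial equations."""
    return prod(degrees)

def ccsd_bezout_bound(n_s, n_d):
    """Bound of Eq. (70): n_s cubic singles equations and n_d quartic doubles equations."""
    return bezout_number([3] * n_s + [4] * n_d)

# Illustrative counts (assumption): four singles equations and one doubles equation
print(ccsd_bezout_bound(4, 1))  # 3**4 * 4**1 = 324
```

Already for these small counts the bound is in the hundreds, and each additional doubles equation multiplies it by four.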

One potential way to improve this bound is by means of the Bernstein-Khovanskii-Kushnirenko (BKK) theorem [8, 40, 16]. The BKK theorem provides a way to estimate the maximum number of solutions that a system of polynomial equations can have, based on the geometry of the monomials that appear in the equations. More precisely, it states that for a system of polynomial equations, the number of isolated solutions with nonzero complex coordinates is bounded by the mixed volume of the Newton polytopes corresponding to the polynomials. In order to apply this theorem to CC theory, one must investigate the CC Newton polytopes and establish a way to compute or at least bound their mixed volume. This direction was explored in [26].
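To illustrate how the BKK bound can undercut the Bézout number, the following Python sketch computes the mixed volume of two Newton polygons in the plane via the identity MV(P, Q) = Area(P + Q) - Area(P) - Area(Q), where P + Q is the Minkowski sum. This is a two-variable toy example with hypothetical function names; it is not the procedure applied to the CC Newton polytopes in [26].

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def area(points):
    """Area of the convex hull of a planar point set (shoelace formula)."""
    hull = convex_hull(points)
    twice_area = 0
    for i in range(len(hull)):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % len(hull)]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2

def mixed_volume_2d(P, Q):
    """Mixed volume of two planar polytopes given by exponent vectors."""
    minkowski_sum = [(p[0] + q[0], p[1] + q[1]) for p in P for q in Q]
    return area(minkowski_sum) - area(P) - area(Q)

# Exponent vectors of a bilinear polynomial a + b*x + c*y + d*x*y
bilinear = [(0, 0), (1, 0), (0, 1), (1, 1)]
# Exponent vectors of a dense quadratic in x and y
dense_quadratic = [(0, 0), (1, 0), (0, 1), (2, 0), (1, 1), (0, 2)]

print(mixed_volume_2d(bilinear, bilinear))                # 2.0, Bezout number is 4
print(mixed_volume_2d(dense_quadratic, dense_quadratic))  # 4.0, equals the Bezout number
```

For two bilinear equations the mixed volume is 2 while the Bézout number is 4, so sparsity of the monomial support halves the number of paths; for dense quadratics the two bounds coincide.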

Another promising direction is the use of truncation varieties. This provides significantly improved bounds to the number of CC roots, see [22] and Fig. 8. The truncation varieties are algebraic varieties specific to CC theory. In general, an algebraic variety is a set of solutions to one or more algebraic equations, typically defined in a higher-dimensional space, where these solutions form a geometric shape or structure. In the context of CC theory, several varieties appear. Consider the exponential parametrization

\exp\colon\mathcal{V}\to\mathcal{H}_{\rm int};\quad{\bf t}\mapsto|\Psi\rangle=\exp({\bf t})|\Psi_{0}\rangle=|\Psi_{0}\rangle+\sum_{n=1}^{N}\frac{1}{n!}T^{n}|\Psi_{0}\rangle,

(71)

where $\mathcal{V}$ denotes the vector space of CC amplitudes. Note that Eq. (71) defines a set of algebraic equations. Imposing a certain level of truncation corresponds to restricting this map to a subspace of amplitudes $\mathcal{V}_{\sigma}\subseteq\mathcal{V}$ , where $\sigma$ denotes the respective level of truncation as defined above. We define the truncation variety $V_{\sigma}$ as the closure of the image of the exponential map of $\mathcal{V}_{\sigma}$ . Since the exponential parametrization is invertible, the dimension of the variety $V_{\sigma}$ is the dimension of $\mathcal{V}_{\sigma}$ . The truncation varieties exhibit numerous mathematical properties, as elucidated in [22], which collectively lead to the bound

{\rm CCdeg}_{N,N_{B}}(\sigma)\leq\bigl({\rm dim}(V_{\sigma})+1\bigr)\,{\rm deg}(V_{\sigma}),

(72)

where ${\rm deg}(V_{\sigma})$ is the degree of the truncation variety $V_{\sigma}$ , which is an intrinsic quantity providing information about the variety’s geometric and algebraic properties. In general, the degree of a variety in algebraic geometry refers to a measure of its complexity. It is typically defined as the number of intersections that the variety has with a general linear space of complementary dimension. In simpler terms, it is the number of points at which a linear space will intersect the variety, assuming it intersects it in the maximum possible number of points [17]. Computing the exact degree for a given truncation variety – or at least a sufficiently good bound to it – is the subject of current investigations.
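As a standard textbook illustration of the notion of degree (not specific to CC theory, and with symbols $C$, $L$, $a$, $b$ introduced only for this example), consider a plane conic $C$, which has dimension one, and intersect it with a general line $L$, i.e., a linear space of complementary dimension:

```latex
C=\{(x,y)\in\mathbb{C}^{2} : x^{2}+y^{2}-1=0\},
\qquad
L=\{(x,y)\in\mathbb{C}^{2} : y=ax+b\},
\\
C\cap L:\quad (1+a^{2})\,x^{2}+2ab\,x+(b^{2}-1)=0 .
```

For general $a,b\in\mathbb{C}$, that is, whenever $1+a^{2}\neq 0$ and the discriminant $4(1+a^{2}-b^{2})$ is nonzero, this quadratic has exactly two distinct complex roots, so ${\rm deg}(C)=2$. Tangent lines, for which the discriminant vanishes, are the non-general exceptions that the definition excludes.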

Undertaking a formal comparison between the bounds presented in Eq. (70) and Eq. (72) is challenging given the fundamentally distinct nature of the underlying concepts involved. Despite this, a numerical comparison reveals that the bound in Eq. (72) provides a significant improvement over the previously established bounds, see Sec. 5.3.

5.3 Numerical results

We begin this section by taking a closer look at the convergence behavior of (quasi) Newton-type methods applied to the CC equations (47). Since the dimensionality of the amplitude space grows rapidly, it is not possible to visualize the corresponding Newton fractal. However, we can obtain an idea of the size of the basin of attraction around one root, i.e., a ball around one solution within which (quasi) Newton-type methods commonly converge to the solution at its center. To that end, we consider a variant of the H$_{4}$ model consisting of four hydrogen atoms symmetrically distributed on a circle of radius $R=1.738$ Å [62, 12] discretized in the STO-3G basis set, see Fig. 6.

For $\Theta=90^{\circ}$ we obtain a CC solution ${\bf t}_{0}$ by initializing the (quasi) Newton-type method with zero. Adding a random perturbation ${\bf t}_{p}$ to this solution provides a different initialization ${\bf t}_{\rm init}={\bf t}_{0}+{\bf t}_{p}$ for the CC computations. Scaling the size of ${\bf t}_{p}$ (i.e., $\|{\bf t}_{p}\|$) allows us to (approximately) investigate the basin of attraction. Clearly, comparing with Fig. 4, we expect that the region for which Newton’s method converges to ${\bf t}_{0}$ will not be circular. However, this investigation yields a ballpark estimate for the local basin of convergence, since we can extract the radius $r_{\rm max}$ of the largest ball in which (quasi) Newton-type methods converge to the solution ${\bf t}_{0}$ in $99.9\%$ of the cases, see Fig. 7. We moreover plot the success rate of Newton’s method, i.e., how many of the randomly perturbed initializations converged toward ${\bf t}_{0}$, as a function of the size of ${\bf t}_{p}$. Note that we measure $\|{\bf t}_{p}\|$ relative to the size of ${\bf t}_{0}$; in particular, the initialization zero lies on the boundary of $\|{\bf t}_{p}\|=1$, see Fig. 7.

This shows that $r_{\rm max}$ is approximately $0.2\;\|{\bf t}_{0}\|$. Moreover, this shows that beyond this point convergence towards ${\bf t}_{0}$ is by no means guaranteed. In fact, for an arbitrary initialization that is $\|{\bf t}_{0}\|$ away from ${\bf t}_{0}$, the success rate is only $27\%$. Setting aside the physical motivation of this initial guess, one could argue that it is quite surprising that Newton’s method converges for the initial guess zero.
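The perturbation experiment described above can be mimicked on a toy problem. The following Python sketch estimates the success rate of Newton's method for the scalar equation $z^{3}-1=0$ around the root $z_{0}=1$, with perturbations of a prescribed size relative to $|z_{0}|$. The model equation, the function names, the sample size, and the tolerances are assumptions made for this example; the sketch does not reproduce the H$_{4}$ computation or the numbers reported in Fig. 7.

```python
import cmath
import random

def newton_cubic(z, max_iter=50, tol=1e-12):
    """Newton iteration for f(z) = z**3 - 1; returns the final iterate."""
    for _ in range(max_iter):
        if z == 0:  # derivative vanishes, the Newton step is undefined
            return z
        step = (z**3 - 1) / (3 * z**2)
        z -= step
        if abs(step) < tol:
            break
    return z

def success_rate(root, rel_radius, samples=2000, seed=0):
    """Fraction of perturbed initial guesses that converge back to `root`.

    Each perturbation has size rel_radius * |root| and a uniformly random phase.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        phase = cmath.exp(2j * cmath.pi * rng.random())
        z_init = root + rel_radius * abs(root) * phase
        if abs(newton_cubic(z_init) - root) < 1e-8:
            hits += 1
    return hits / samples

for rel_radius in (0.1, 0.2, 0.5, 1.0):
    print(rel_radius, success_rate(1.0, rel_radius))
```

Scanning `rel_radius` and recording where the success rate first drops below a chosen threshold gives an estimate analogous to $r_{\rm max}$ for this toy equation.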

We now compare the new bound to the CC roots derived in [22] with the existing bounds reported in e.g. [53]. To that end, we compute the roots corresponding to CCS and CCD for two-electron systems, i.e., $N=2$, for different numbers of spin orbitals $N_{B}$. This shows that using the truncation varieties and their profound mathematical structures dramatically improves the bounds to the CC roots, see Fig. 8.

This reduction in the bounds, together with the incorporation of truncation varieties in the computational procedure, allowed for significant numerical advancements, enabling the computation of the full root structure for real molecular systems like lithium hydride (see Fig. 9) [22] using CCD.

We emphasize that these advances are far from a straightforward application of off-the-shelf computational algebra tools. Instead, they result from a sophisticated combination of multiple techniques, underscoring the complexity and innovation of the approach. The general computational procedure comprises two major steps:

1. The set-up of an initial system from which the homotopy continuation starts. This initial system is specific for the number of electrons, the number of spin orbitals employed for the discretization of the Hamiltonian, and the used CC truncation level $\sigma$ as defined in Sec. 5.2. We emphasize that in this implementation, the initial system can be reused when computing CC solutions at the truncation level $\sigma$ for systems with the same number of electrons and basis functions.

2. Once the initial system for a target system configuration is set up, we employ a parametric homotopy approach as described in Sec. 5.1 that connects the initial system with the targeted system.
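The two steps above can be sketched on a toy parametric family. In the following Python example, the family $F(x;p)=x^{2}+p_{0}x+p_{1}$ plays the role of the parameter-dependent system: the start solutions are computed once for a fixed generic complex parameter and are then reused to reach several target parameters by tracking along a straight line in parameter space. The family, the parameter values, the function names, and the simple fixed-step tracker with a Newton corrector only are assumptions made for illustration and do not represent the implementation of [22].

```python
import cmath

def F(x, p):
    """Toy parametric system: one quadratic equation in one unknown."""
    return x * x + p[0] * x + p[1]

def dF_dx(x, p):
    return 2 * x + p[0]

def track(x, p_start, p_target, steps=1000, newton_iters=3):
    """Follow a solution from p_start (lambda = 1) to p_target (lambda = 0)."""
    for k in range(1, steps + 1):
        lam = 1.0 - k / steps
        # Straight-line path in parameter space
        p = [(1 - lam) * b + lam * a for a, b in zip(p_start, p_target)]
        for _ in range(newton_iters):  # Newton corrector at the new parameter
            x -= F(x, p) / dF_dx(x, p)
    return x

# Step 1: initial system with a generic complex parameter, solved once
p_start = (0.3 + 1.1j, -0.7 + 0.4j)
disc = cmath.sqrt(p_start[0] ** 2 - 4 * p_start[1])
start_solutions = [(-p_start[0] + disc) / 2, (-p_start[0] - disc) / 2]

# Step 2: reuse the same start solutions for several target parameters
targets = {
    "A": (-3.0, 2.0),  # x^2 - 3x + 2, roots 1 and 2
    "B": (0.0, -4.0),  # x^2 - 4,      roots -2 and 2
}
solutions = {
    name: [track(x, p_start, p) for x in start_solutions]
    for name, p in targets.items()
}
for name, sols in solutions.items():
    print(name, [f"{s:.6f}" for s in sols])
```

The design point is that the cost of the first step is paid only once; each new target in the same family requires only the path tracking of the second step.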

6 Conclusion

This article provides a self-contained educational review of the latest mathematical developments in coupled cluster (CC) theory from a computational chemistry perspective. To that end, we started this review article with a foundational introduction to CC theory, employing an algebraic approach. This particular formulation offers a rigorous and mathematically elegant framework, thereby facilitating a deeper understanding of the underlying principles. Additionally, in an effort to ensure comprehensive coverage and to augment the article’s self-contained nature, we have incorporated a detailed analysis of the matrix structures that emerge within the realm of second quantization. This includes an exploration of their theoretical underpinnings and practical implications in computational chemistry, providing valuable context and enhancing the overall utility of this review for researchers in the field.

We then explore a variety of analytical frameworks and methods used in CC theory, with a focus on their contributions to establishing local existence and uniqueness of the CC solutions. We delve into the local analysis based on Zarantonello’s Lemma, a technique pioneered by Schneider [60], which has significantly influenced the field by its application in various CC methods, including the continuous single-reference CC method [56, 57], the extended CC method [45], and the tailored CC ansatz [25]. Further, we explore the graph-based framework for CC methods developed by Csirik and Laestadius [18, 19]. This section highlights the versatility of the framework and its utility in comparing various CC methods, encompassing even multireference approaches. We then delve into the latest numerical analysis results analyzing the single reference CC method developed by Hassan, Maday, and Wang. This segment decodes the complex ansatz from a computational chemistry viewpoint and encapsulates key findings from their research presented in [34, 33], offering readers a comprehensive understanding of this cutting-edge area in CC theory.

Furthermore, our review extends to the algebraic geometry approach within CC theory. This unique perspective not only illuminates the intricate root structure inherent in CC equations but also paves the way for novel computational paradigms. These emerging methodologies have the potential to form the cornerstone of future CC computational strategies. In our discussion, we delve into the overarching principles of the algebraic approach and incorporate an overview of the most recent numerical advancements that have been made in this area [26, 22].

Acknowledgments

The author is thankful for useful discussions with Andre Laestadius, Mihály Csirik, Muhammad Hassan, and Svala Sverrisdóttir.

References

[1] J. S. Arponen, “Independent-cluster methods as mappings of quantum theory into classical mechanics,” Theoretica chimica acta, vol. 80, no. 2-3, pp. 149–179, 1991.
[2] J.-P. Aubin, Applied functional analysis. John Wiley & Sons, 2011.
[3] W. Bangerth and R. Rannacher, Adaptive finite element methods for differential equations. Springer Science & Business Media, 2003.
[4] R. Bartlett, “Theory and applications of computational chemistry: the first forty years,” Dykstra, CE, Frenking, G., Kim, KS, Scuseria, GE, Eds, pp. 1191–1221, 2005.
[5] R. J. Bartlett and M. Musiał, “Coupled-cluster theory in quantum chemistry,” Reviews of Modern Physics, vol. 79, no. 1, p. 291, 2007.
[6] D. J. Bates, P. Breiding, T. Chen, J. D. Hauenstein, A. Leykin, and F. Sottile, “Numerical nonlinear algebra,” ArXiv preprint arXiv:2302.08585, 2023.
[7] D. J. Bates, J. D. Hauenstein, A. J. Sommese, and C. W. Wampler, “Bertini: Software for numerical algebraic geometry,” 2006.
[8] D. N. Bernshtein, “The number of roots of a system of equations,” Funct. Anal. Appl., vol. 9, no. 3, pp. 183–185, Jul 1975.
[9] R. Bishop, “An overview of coupled cluster theory and its applications in physics,” Theoretica chimica acta, vol. 80, no. 2-3, pp. 95–148, 1991.
[10] V. Borovik, B. Sturmfels, and S. Sverrisdóttir, “Coupled cluster degree of the grassmannian,” arXiv:2310.15474, 2023.
[11] P. Breiding and S. Timme, “HomotopyContinuation.jl: A package for homotopy continuation in Julia,” in Mathematical Software–ICMS 2018: 6th International Conference, South Bend, IN, USA, July 24-27, 2018, Proceedings 6. Springer, 2018, pp. 458–465.
[12] I. W. Bulik, T. M. Henderson, and G. E. Scuseria, “Can single-reference coupled cluster theory describe static correlation?” Journal of chemical theory and computation, vol. 11, no. 7, pp. 3171–3179, 2015.
[13] T. Chen, T.-L. Lee, and T.-Y. Li, “Hom4ps-3: a parallel numerical solver for systems of polynomial equations based on polyhedral homotopy continuation methods,” in Mathematical Software–ICMS 2014: 4th International Congress, Seoul, South Korea, August 5-9, 2014. Proceedings 4. Springer, 2014, pp. 183–190.
[14] J. Čížek, “On the correlation problem in atomic and molecular systems. calculation of wavefunction components in ursell-type expansion using quantum-field theoretical methods,” The Journal of Chemical Physics, vol. 45, no. 11, pp. 4256–4266, 1966.
[15] F. Coester, “Bound states of a many-particle system,” Nuclear Physics, vol. 7, pp. 421–424, 1958.
[16] D. Cox, J. Little, and D. O’Shea, Using Algebraic Geometry, ser. Graduate Texts in Mathematics. Springer New York, 1991.
[17] D. Cox, J. Little, and D. O’Shea, Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Springer Science & Business Media, 2013.
[18] M. A. Csirik and A. Laestadius, “Coupled-cluster theory revisited-part i: Discretization,” ESAIM: Math. Model. Numer. Anal., vol. 57, no. 2, pp. 645–670, 2023.
[19] ——, “Coupled-cluster theory revisited-part ii: Analysis of the single-reference coupled-cluster equations,” ESAIM: Mathematical Modelling and Numerical Analysis, vol. 57, no. 2, pp. 545–583, 2023.
[20] D. Davidenko, “On a new method of numerical solution of systems of nonlinear equations,” in Proceedings of the USSR Academy of Sciences, vol. 88, no. 4, 1953, pp. 601–602.
[21] ——, “On the approximate solution of systems of nonlinear equations,” Ukrainian Mathematical Journal, vol. 5, no. 2, pp. 196–206, 1953.
[22] F. Faulstich, B. Sturmfels, and S. Sverrisdóttir, “Algebraic varieties in quantum chemistry,” arXiv:2308.05258, 2023.
[23] F. M. Faulstich, H. E. Kristiansen, M. A. Csirik, S. Kvaal, T. B. Pedersen, and A. Laestadius, “S-diagnostic—an a posteriori error assessment for single-reference coupled-cluster methods,” The Journal of Physical Chemistry A, vol. 127, no. 43, pp. 9106–9120, 2023.
[24] F. M. Faulstich and A. Laestadius, “Homotopy continuation methods for coupled-cluster theory in quantum chemistry,” Molecular Physics, vol. 0, p. e2258599, 2023.
[25] F. M. Faulstich, A. Laestadius, Ö. Legeza, R. Schneider, and S. Kvaal, “Analysis of the tailored coupled-cluster method in quantum chemistry,” SIAM J. Numer. Anal., vol. 57, no. 6, pp. 2579–2607, 2019.
[26] F. M. Faulstich and M. Oster, “Coupled cluster theory: Towards an algebraic geometry formulation,” arXiv:2211.10389 [In press: SIAM Journal on Applied Algebra and Geometry], 2022.
[27] F. M. Faulstich, “Mathematical aspects of coupled-cluster theory in chemistry,” 2020.
[28] D. H. Fremlin, Measure theory. Torres Fremlin, 2000, vol. 4.
[29] C. B. Garcia and W. I. Zangwill, “Finding all solutions to polynomial systems and other systems of equations,” Mathematical Programming, vol. 16, no. 1, pp. 159–176, 1979.
[30] E. Giner, D. P. Tew, Y. Garniron, and A. Alavi, “Interplay between electronic correlation and metal–ligand delocalization in the spectroscopy of transition metal compounds: Case study on a series of planar cu2+ complexes,” Journal of Chemical Theory and Computation, vol. 14, no. 12, pp. 6240–6252, 2018.
[31] W. Hackbusch, Elliptic differential equations: theory and numerical treatment. Springer, 2017, vol. 18.
[32] B. C. Hall, Lie groups, Lie algebras, and representations. Springer, 2013.
[33] M. Hassan, Y. Maday, and Y. Wang, “Analysis of the single reference coupled cluster method for electronic structure calculations: The discrete coupled cluster equations,” arXiv preprint arXiv:2311.00637, 2023.
[34] ——, “Analysis of the single reference coupled cluster method for electronic structure calculations: the full-coupled cluster equations,” Numerische Mathematik, pp. 1–53, 2023.
[35] T. Helgaker, P. Jorgensen, and J. Olsen, Molecular electronic-structure theory. John Wiley & Sons, 2014.
[36] J. Hubbard, “The description of collective motions in terms of many-body perturbation theory,” Proceedings of the Royal Society A, vol. 240, no. 1223, pp. 539–560, 1957.
[37] J. Hubbard, D. Schleicher, and S. Sutherland, “How to find all roots of complex polynomials by newton’s method,” Inventiones mathematicae, vol. 146, no. 1, p. 1, 2001.
[38] N. Hugenholtz, “Perturbation approach to the fermi gas model of heavy nuclei,” Physica, vol. 23, no. 1-5, pp. 533–545, 1957.
[39] A. A. Kirillov, An introduction to Lie groups and Lie algebras. Cambridge University Press, 2008, vol. 113.
[40] A. G. Kouchnirenko, “Polyèdres de newton et nombres de milnor,” Invent. Math., vol. 32, no. 1, pp. 1–31, Feb 1976.
[41] K. Kowalski and P. Piecuch, “Complete set of solutions of the generalized bloch equation,” Int. J. Quantum Chem., vol. 80, no. 4-5, pp. 757–781, 2000.
[42] K. Kowalski and K. Jankowski, “Towards complete solutions to systems of nonlinear equations of many-electron theories,” Phys. Rev. Lett., vol. 81, no. 6, p. 1195, 1998.
[43] K. Kowalski and P. Piecuch, “Complete set of solutions of multireference coupled-cluster equations: The state-universal formalism,” Phys. Rev. A, vol. 61, no. 5, p. 052506, 2000.
[44] A. Laestadius and F. M. Faulstich, “The coupled-cluster formalism–a mathematical perspective,” Molecular Physics, vol. 117, no. 17, pp. 2362–2373, 2019.
[45] A. Laestadius and S. Kvaal, “Analysis of the extended coupled-cluster method in quantum chemistry,” SIAM J. Numer. Anal., vol. 56, no. 2, pp. 660–683, 2018.
[46] T.-L. Lee, T.-Y. Li, and C.-H. Tsai, “Hom4ps-2.0: a software package for solving polynomial systems by the polyhedral homotopy continuation method,” Computing, vol. 83, pp. 109–133, 2008.
[47] G. Leoni, A first course in Sobolev spaces. American Mathematical Soc., 2017.
[48] M. Michałek and B. Sturmfels, Invitation to nonlinear algebra. American Mathematical Soc., 2021, vol. 211.
[49] A. Morgan, Solving polynomial systems using continuation for engineering and scientific problems. SIAM, 2009.
[50] A. P. Morgan, A. J. Sommese, and C. W. Wampler, “Computing singular solutions to polynomial systems,” Advances in Applied Mathematics, vol. 13, no. 3, pp. 305–327, 1992.
[51] J. Paldus, P. Piecuch, L. Pylypow, and B. Jeziorski, “Application of Hilbert-space coupled-cluster theory to simple (H$_{2}$)$_{2}$ model systems: Planar models,” Phys. Rev. A, vol. 47, no. 4, p. 2738, 1993.
[52] J. Paldus, “The beginnings of coupled-cluster theory: an eyewitness account,” in Theory and Applications of Computational Chemistry. Elsevier, 2005, pp. 115–147.
[53] P. Piecuch and K. Kowalski, “In search of the relationship between multiple solutions characterizing coupled-cluster theories,” in Computational chemistry: reviews of current trends, J. Leszczynski, Ed. World Scientific, 2000, vol. 5, pp. 1–104.
[54] P. Piecuch, S. Zarrabian, J. Paldus, and J. Čížek, “Coupled-cluster approaches with an approximate account of triexcitations and the optimized-inner-projection technique. ii. coupled-cluster results for cyclic-polyene model systems,” Phys. Rev. B, vol. 42, no. 6, p. 3351, 1990.
[55] T. Rohwedder, “An analysis for some methods and algorithms of quantum chemistry,” 2010.
[56] ——, “The continuous coupled cluster formulation for the electronic schrödinger equation,” ESAIM: Math. Model. Numer. Anal., vol. 47, no. 2, pp. 421–447, 2013.
[57] T. Rohwedder and R. Schneider, “Error estimates for the coupled cluster method,” ESAIM: Math. Model. Numer. Anal., vol. 47, no. 6, pp. 1553–1582, 2013.
[58] S. A. Sauter and C. Schwab, Boundary element methods. Springer, 2011.
[59] D. Schleicher, “On the number of iterations of newton’s method for complex polynomials,” Ergodic Theory and Dynamical Systems, vol. 22, no. 3, pp. 935–945, 2002.
[60] R. Schneider, “Analysis of the projected coupled cluster method in electronic structure calculation,” Numer. Math., vol. 113, no. 3, pp. 433–471, 2009.
[61] A. J. Sommese, C. W. Wampler et al., The Numerical solution of systems of polynomials arising in engineering and science. World Scientific, 2005.
[62] T. Van Voorhis and M. Head-Gordon, “Benchmark variational coupled cluster doubles results,” J. Chem. Phys., vol. 113, no. 20, pp. 8873–8879, 2000.
[63] J. Verschelde, “Algorithm 795: Phcpack: A general-purpose solver for polynomial systems by homotopy continuation,” ACM Transactions on Mathematical Software (TOMS), vol. 25, no. 2, pp. 251–276, 1999.
[64] H. Yserentant, “On the regularity of the electronic schrödinger equation in hilbert spaces of mixed derivatives,” Numerische Mathematik, vol. 98, pp. 731–759, 2004.
[65] ——, Regularity and approximability of electronic wave functions. Springer, 2010.
[66] E. Zeidler, Nonlinear functional analysis and its applications: II/B: nonlinear monotone operators. Springer Science & Business Media, 2013.
[67] T. P. Živković and H. J. Monkhorst, “Analytic connection between configuration–interaction and coupled-cluster solutions,” J. Math. Phys., vol. 19, no. 5, pp. 1007–1022, 1978.

\begin{aligned}
a_{p}^{\dagger}:\mathcal{F}\to\mathcal{F};\quad|s_{1},\ldots,s_{N_{B}}\rangle&\mapsto(-1)^{\sigma(p)}(1-s_{p})\,|s_{1},\ldots,s_{p-1},1-s_{p},s_{p+1},\ldots,s_{N_{B}}\rangle\\
a_{p}:\mathcal{F}\to\mathcal{F};\quad|s_{1},\ldots,s_{N_{B}}\rangle&\mapsto(-1)^{\sigma(p)}s_{p}\,|s_{1},\ldots,s_{p-1},1-s_{p},s_{p+1},\ldots,s_{N_{B}}\rangle
\end{aligned}

(10)

\begin{aligned}
\langle\Psi|(H-E_{\rm CC})\exp(T)|\Psi_{0}\rangle&=\langle\Psi|\exp(T)\exp(-T)(H-E_{\rm CC})\exp(T)|\Psi_{0}\rangle\\
&=\langle\Psi|\exp(T)|\Psi_{0}\rangle\langle\Psi_{0}|\exp(-T)(H-E_{\rm CC})\exp(T)|\Psi_{0}\rangle\\
&\quad+\sum_{\mu}\langle\Psi|\exp(T)|\Psi_{\mu}\rangle\langle\Psi_{\mu}|\exp(-T)(H-E_{\rm CC})\exp(T)|\Psi_{0}\rangle\\
&=0.
\end{aligned}

(50)