Initial tensor construction and dependence of the tensor renormalization group on initial tensors

Katsumasa Nakayama katsumasa.nakayama@riken.jp RIKEN Center for Computational Science, Kobe, 650-0047, Japan Manuel Schneider National Yang Ming Chiao Tung University (NYCU), Hsinchu, 30010, Taiwan

(19 July 2024)

Abstract

We propose a method to construct a tensor network representation of partition functions that requires neither singular value decompositions nor series expansions. The approach is demonstrated for one- and two-dimensional Ising models and we study the dependence of the tensor renormalization group (TRG) on the form of the initial tensors and their symmetries. We further introduce variants of several tensor renormalization algorithms. Our benchmarks reveal a significant dependence of various TRG algorithms on the choice of initial tensors and their symmetries. However, we show that the boundary TRG technique can eliminate the initial tensor dependence for all TRG methods. The numerical results of TRG calculations can thus be made significantly more robust with only a few changes in the code. Furthermore, we study a three-dimensional $\mathbb{Z}_{2}$ gauge theory without gauge-fixing and confirm the applicability of the initial tensor construction. Our method can straightforwardly be applied to systems with longer range and multi-site interactions, such as the next-nearest neighbor Ising model.

I Introduction

Since its introduction about two decades ago Levin and Nave (2007), the tensor renormalization group (TRG) method has been widely applied to statistical physics problems, including quantum field theories such as the CP(1) model Nakayama et al. (2022), the $\mathbb{Z}_{2}$ gauge theory Liu et al. (2013); Kuramashi and Yoshimura (2019), the Schwinger model Shimizu and Kuramashi (2014a, b, 2018), and many more Yu et al. (2014); Zou et al. (2014); Yang et al. (2016); Takeda and Yoshimura (2015); Yoshimura et al. (2018); Bazavov et al. (2019); Kuramashi and Yoshimura (2020); Hirasawa et al. (2021); Akiyama and Kadoh (2021); Yosprakob et al. (2023); Akiyama et al. (2024); Yosprakob and Okunishi (2024). The partition function is written in the form of a tensor network. The partition function itself and other physical quantities can then be calculated by contracting the tensor network, which means summing over all its indices. In practice, this is only possible with approximate methods that truncate the exponential growth of information with increasing system size. The tensor network is contracted in successive coarse-graining steps. In each step a truncation is applied, typically making use of a singular value decomposition (SVD). Since common algorithms coarse-grain the lattice in one direction only, the directions are exchanged after each step. The way this exchange is done affects the accuracy of the method and should therefore be chosen carefully. We discuss this effect in more detail in App. G. Finally, physical quantities are extracted from the trace of the coarse-grained tensors. Since the TRG is free of sampling problems Nakayama et al. (2022); Shimizu and Kuramashi (2014b, 2018); Yang et al. (2016); Takeda and Yoshimura (2015); Kuramashi and Yoshimura (2020); Hirasawa et al. (2021); Yosprakob et al. (2023), we can study systems for which Monte Carlo methods suffer from the sign problem Nagata (2022).

Overview of TRG algorithms.

The TRG was originally introduced by Levin and Nave Levin and Nave (2007), and it has since been improved by truncation methods that reduce the numerical costs Halko et al. (2011); Nakamura et al. (2019); Morita et al. (2018); Okanohara (2014). The tensor network renormalization (TNR) additionally introduces disentanglers to improve the accuracy of the TRG Evenbly and Vidal (2015), an idea that originates in the multi-scale entanglement renormalization ansatz (MERA) Jiang et al. (2008). For systems with relatively small volumes, the core TRG (CTRG) can also reduce the computational requirements Lan and Evenbly (2019).

For higher-dimensional systems, the TRG was extended to the higher-order TRG (HOTRG) Xie et al. (2012). Recently, various alternatives have also been studied, such as the anisotropic TRG (ATRG) Adachi et al. (2020), the triad TRG (TTRG) Kadoh and Nakayama (2019), and the minimally decomposed TRG (MDTRG) Nakayama (2023). These methods can reduce the numerical costs, and allow for contractions of three- and higher-dimensional tensor network systems in feasible computational time. We explain several TRG algorithms in appendices D and E.

Initial tensor construction.

In order to apply the TRG methods efficiently, we have to represent the physical quantities by a locally connected tensor network. This means that each index only appears on two neighboring tensors. Different geometries can arise for this network, depending on the connectivity of the interactions. We focus on square and cubic lattices. The TRG coarse-grains these lattices to a network with the same geometry and can thus be used iteratively.

Common approaches to construct a locally connected tensor network make use of SVDs or series expansions such as the Taylor expansion Liu et al. (2013); Baumgartner and Wenger (2015); Marchis and Gattringer (2018). The expansion creates new variables, the power indices of each term. These can be used as indices of the initial tensor of the tensor network, by integrating out the original degrees of freedom. We give two examples of this construction in sections III and IV.

However, the choice of the initial tensors describing a given system is not unique. We propose another approach to construct the tensor network, based on a trivial decomposition with an identity matrix. The procedure does not require problem-specific and more involved decompositions, expansions, and variable transformations from spin indices to new tensor indices. We consider the spin indices as the indices of the initial tensor and localize the network by a matrix decomposition without approximations, inserting an identity matrix. This method generally generates a local tensor network representation for any theory which can be described by a translationally invariant Lagrangian or Hamiltonian. However, the index dimension can be large, depending on the dimension of the local degrees of freedom and the range of the interaction. The method is very efficient for local interactions. The general resource scaling is discussed in Sec. V.

Initial tensor dependence of TRG methods.

Since the coarse-graining steps include local truncations, the accuracy of TRG algorithms can depend on the form of the initial tensors. Although the tensor construction based on the expansion is widely and successfully used for different models, it might not be the optimal choice for a given system and contraction method. We benchmark the accuracy of different TRG algorithms for the two-dimensional Ising model. Our results show that HOTRG-like methods, which use isometries for the coarse-graining step, are highly dependent on the symmetry of the initial tensors. We find that this problem does not arise when the isometries in the algorithms are replaced by so-called squeezers Adachi et al. (2020), an idea originating from the boundary TRG Iino et al. (2019). This replacement is possible for any isometry-based TRG algorithm. Thus, we suggest using this method in order to remove the dependence on the form of the initial tensors. In this case, our simple construction of the initial tensors leads to the same accuracy as other, more involved or problem-specific techniques. Appendix D discusses the technical details of how to implement squeezers in coarse-graining algorithms.

$\mathbb{Z}_{2}$ gauge theory.

As an example of a higher-dimensional system, the $\mathbb{Z}_{2}$ gauge theory in three dimensions was studied with HOTRG and TRG Kuramashi and Yoshimura (2019). There, the tensor network representation based on a Taylor expansion was used with gauge-fixing. The critical temperature was calculated with high accuracy. However, the representation in Kuramashi and Yoshimura (2019) has two different tensors. Because the SVD is an optimization of local tensors, a smaller unit cell could generally be preferable. We show the applicability of our tensor network construction to the $\mathbb{Z}_{2}$ gauge theory, where only one initial tensor appears in the network. We calculate the free energy, specific heat and critical temperature without gauge-fixing, and find good agreement with previous calculations.

Structure of this paper.

This paper is organized as follows. We introduce our initial tensor construction in Sec. II for the one-dimensional Ising model with next-nearest neighbor interaction (NNNI) as a simple example. We can reproduce the exact solution with our method. After this, we apply the method to the two-dimensional Ising model and study the initial tensor dependence of the TRG and HOTRG in Sec. III. The accuracy of the HOTRG depends on the symmetry of the initial tensor. In Sec. IV we apply the initial tensor construction method without gauge-fixing to the $\mathbb{Z}_{2}$ gauge theory and calculate the free energy and specific heat. Section V explains how the method can be applied to general models and how the index sizes of the initial tensors scale. We conclude our study in Sec. VI.

II One-dimensional Ising model with next-nearest neighbor interactions

We first introduce our method for the one-dimensional Ising model with next-nearest neighbor interactions and periodic boundary conditions as a simple example. The idea has similarities to the initial tensor construction for a particular Ising model on a triangular lattice in Zhao et al. (2010). Our method allows for interaction terms beyond nearest-neighbor interactions, such as the next-nearest neighbor interactions considered here. The Ising model with NNNI in one spatial dimension with $N$ sites can be described by the partition function

$$Z=\sum_{\sigma=\pm 1}\prod_{x=1}^{N}T_{\sigma_{x},\sigma_{x+1},\sigma_{x+2}}^{\mathrm{(1d)}}. \tag{1}$$

The sum $\sum_{\sigma=\pm 1}$ indicates a summation over all combinations of the spins at all sites. The tensor $T^{\mathrm{(1d)}}$ can be constructed with the spin indices $\sigma_{x}$ at sites $x$ and depends on the inverse temperature $\beta$ and the coupling constants $g_{1}$ and $g_{2}$:

$$T_{\sigma_{x},\sigma_{x+1},\sigma_{x+2}}^{\mathrm{(1d)}}\equiv e^{-\beta(g_{1}\sigma_{x}\sigma_{x+1}+g_{2}\sigma_{x}\sigma_{x+2})}. \tag{2}$$

This formulation does not form a locally connected network: a spin index $\sigma_{x}$ at a given site $x$ occurs on three different tensors ($T_{\sigma_{x-2},\sigma_{x-1},\sigma_{x}}^{\mathrm{(1d)}}T_{\sigma_{x-1},\sigma_{x},\sigma_{x+1}}^{\mathrm{(1d)}}T_{\sigma_{x},\sigma_{x+1},\sigma_{x+2}}^{\mathrm{(1d)}}$) instead of only two neighboring tensors. Therefore, the partition function in Eq. 1 cannot be used in typical coarse-graining algorithms. We have to find an alternative initial tensor formulation with only locally connected tensors. For example, we want to find initial tensors $T^{\prime\mathrm{(1d)}}$ which only depend on two neighboring indices for a one-dimensional system. For this, we first decompose the tensor $T^{\mathrm{(1d)}}$ into $A$ and $B$ without approximation by introducing a new index $a$:

$$T_{\sigma_{x},\sigma_{x+1},\sigma_{x+2}}^{\mathrm{(1d)}}=\sum_{a_{x+1}=\pm 1}A_{\sigma_{x},\sigma_{x+1}}^{a_{x+1}}B_{\sigma_{x+2}}^{a_{x+1}}. \tag{3}$$

We can apply an SVD or other methods for this decomposition, as was previously done for an Ising model with a magnetic field in Zhao et al. (2010). However, we can also easily construct a tensor of the above form by choosing $B$ to be the identity matrix:

$$T_{\sigma_{x},\sigma_{x+1},\sigma_{x+2}}^{\mathrm{(1d)}}=\sum_{a_{x+1}=\pm 1}T_{\sigma_{x},\sigma_{x+1},a_{x+1}}^{\mathrm{(1d)}}\delta_{\sigma_{x+2}}^{a_{x+1}}. \tag{4}$$

We then define the localized tensor in terms of the tensors $A$ and $B$ , where the indices of tensor $B$ are shifted by one lattice site compared to Eq. 3:

$$T_{\sigma_{x},\sigma_{x+1},a_{x},a_{x+1}}^{\prime\mathrm{(1d)}}\equiv A_{\sigma_{x},\sigma_{x+1}}^{a_{x+1}}B_{\sigma_{x+1}}^{a_{x}}. \tag{5}$$

Exploiting the translational invariance of the system, the partition function can be rewritten as a locally connected tensor network consisting of these new tensors:

$$Z=\sum_{\sigma=\pm 1}\sum_{a=\pm 1}\prod_{x=1}^{N}T^{\prime\mathrm{(1d)}}_{\sigma_{x},\sigma_{x+1},a_{x},a_{x+1}}. \tag{6}$$

By defining the combined indices $[a\sigma]\equiv a\otimes\sigma$ , we obtain a one-dimensional system with size-four indices:

$$Z=\sum_{[a\sigma]=1}^{4}\prod_{x=1}^{N}T^{\prime\mathrm{(1d)}}_{[a\sigma]_{x},[a\sigma]_{x+1}}. \tag{7}$$

Since $T^{\prime\mathrm{(1d)}}$ is a $4\times 4$ matrix, we can easily find its eigenvalues by exact diagonalization. This gives the exact solution of the one-dimensional Ising model with NNNI, which is known from previous studies Pini and Rettori (1993); Taherkhani et al. (2011).
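The construction of Eqs. 2 to 7 can be checked numerically for a short chain. The following is a minimal sketch, not the authors' code: the function names, the parameter values, and the ordering of the combined index $[a\sigma]$ are assumptions made for illustration. It builds the $4\times 4$ matrix with $B$ chosen as the identity and compares $\mathrm{Tr}\,(T^{\prime})^{N}$ with a brute-force sum over all spin configurations.

```python
import itertools
import numpy as np

def local_weight(beta, g1, g2):
    """Three-site weight T[s_x, s_{x+1}, s_{x+2}] of Eq. (2)."""
    s = np.array([1.0, -1.0])  # spin values for index 0 and 1
    s0, s1, s2 = s[:, None, None], s[None, :, None], s[None, None, :]
    return np.exp(-beta * (g1 * s0 * s1 + g2 * s0 * s2))

def localized_matrix(beta, g1, g2):
    """4x4 matrix T'[(a_x, s_x), (a_{x+1}, s_{x+1})] following Eqs. (4), (5), (7)."""
    A = local_weight(beta, g1, g2)   # A[s_x, s_{x+1}, a_{x+1}] is T itself
    B = np.eye(2)                    # B[s_{x+1}, a_x] is the identity (delta)
    Tp = np.einsum("ijl,jk->ijkl", A, B)   # T'[s_x, s_{x+1}, a_x, a_{x+1}]
    # group (a_x, s_x) into the row and (a_{x+1}, s_{x+1}) into the column
    return Tp.transpose(2, 0, 3, 1).reshape(4, 4)

def Z_transfer(beta, g1, g2, N):
    """Z = Tr (T')^N from the eigenvalues of the 4x4 matrix."""
    lam = np.linalg.eigvals(localized_matrix(beta, g1, g2))
    return float(np.sum(lam ** N).real)

def Z_brute_force(beta, g1, g2, N):
    """Direct sum over all 2^N configurations with periodic boundaries."""
    Z = 0.0
    for s in itertools.product([1, -1], repeat=N):
        E = sum(g1 * s[x] * s[(x + 1) % N] + g2 * s[x] * s[(x + 2) % N]
                for x in range(N))
        Z += np.exp(-beta * E)
    return Z

beta, g1, g2, N = 0.4, 1.0, 0.5, 8   # illustrative values only
Z_tn = Z_transfer(beta, g1, g2, N)
Z_bf = Z_brute_force(beta, g1, g2, N)
print(Z_tn, Z_bf)
```

The matrix is not symmetric in general, so the sketch uses a general eigenvalue solver and takes the real part of the sum of powers.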

Different ways of constructing the locally connected tensor network lead to different tensors $A$ and $B$ and thus different $T^{\prime}$ . In general, we can relate different tensor representations of the same system using a unitary matrix:

$$T^{\mathrm{(new)}}_{xx^{\prime}}\equiv\sum_{k,k^{\prime}}U_{xk}T^{\prime\mathrm{(1d)}}_{kk^{\prime}}U^{\dagger}_{k^{\prime}x^{\prime}}. \tag{8}$$

Although the partition function is analytically not changed by this transformation, the numerical accuracy of the coarse-graining steps can depend on the form of $T^{\prime\mathrm{(1d)}}$ . This will be confirmed and studied in more detail in Sec. III.
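That the transformation in Eq. 8 leaves the partition function unchanged can be illustrated with a few lines of NumPy. This is a sketch under stated assumptions: the matrix `M` is a hypothetical positive $4\times 4$ stand-in for the localized tensor, and the random unitary is drawn from a QR decomposition purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((4, 4)) + 0.1   # hypothetical stand-in for the 4x4 localized tensor

# Random unitary from the QR decomposition of a complex Gaussian matrix
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)))
M_new = Q @ M @ Q.conj().T     # transformation of Eq. (8)

N = 10                         # chain length, illustrative
Z_old = np.trace(np.linalg.matrix_power(M, N))
Z_new = np.trace(np.linalg.matrix_power(M_new, N)).real
print(Z_old, Z_new)
```

The two traces agree up to rounding because the factors $U^{\dagger}U$ cancel on every bond. The sketch only demonstrates this exact invariance; it says nothing about how truncations in a coarse-graining step react to the change of basis.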

The presented approach can be straightforwardly extended to a model with $n_{\mathrm{int}}$ distinct hopping interactions. Since each hopping term introduces a new index that gets combined with the spin index, the matrix size of $T^{\prime\mathrm{(1d)}}$ grows as $d^{n_{\mathrm{int}}}$, where $d$ is the dimension of the spin index. See Sec. V for more details.

One-dimensional models with several interactions are studied in the context of frustrated systems and antiferromagnetism Guimaraes and Plascak (2002); Taherkhani et al. (2011); Jurcisinoca and Jurcisin (2014); Karlova et al. (2018); Kassan-Ogly et al. (2012); Kwek et al. (2009); Niemeijer (1971); Ozerov et al. (2010); Raymond and Wong (2012); Sandvik (2010); Capriotti et al. (2003), and the approach presented here could be useful as a simple candidate to construct a locally connected network. Also, two-dimensional systems are widely studied to understand the phenomena of spin statistical systems Wang et al. (2016); Wang and Sandvik (2018); Richter et al. (2015); Sirker et al. (2006); Yoshiyama and Hukushima (2023); Li and Yang (2021). Our initial tensor construction can be readily applied to these systems. We show the explicit form of the initial tensors for the two-dimensional $J_{1}-J_{2}$ and $J_{1}-J_{3}$ Ising models in appendices A and B respectively. In addition, we discuss more general systems including higher dimensions and long-range interaction in Sec. V. In principle, our construction can be extended to any dimension, and to various kinds of interaction terms.

III Two-dimensional Ising model and initial tensor dependence of the TRG methods

We use the two-dimensional Ising model with periodic boundary conditions in a volume of $N\times N$ as a testing ground for the initial tensor dependence of different TRG methods. The partition function is

\begin{align}
Z &= \sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}e^{\beta h\sigma_{x,y}}e^{\frac{\beta g}{2}\sigma_{x,y}(\sigma_{x+1,y}+\sigma_{x,y+1})} \tag{9}\\
&= \sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}K_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1}}, \tag{10}
\end{align}

with the spin indices $\sigma_{x,y}$ at sites $\{x,y\}$ , the coupling constant $g$ and the external field $h$ . In our numerical studies, we set $g=1$ , $h=0$ , and $\beta=\beta_{c}=\frac{1}{2}\mathrm{ln}(1+\sqrt{2})$ , which is the critical value Onsager (1944); Duminil-Copin (2022).

Initial tensor construction with shifted delta-functions.

The representation by the tensor $K$ is not a two-dimensional locally connected tensor network where the same index would only occur on two neighboring tensors. Thus, this formulation cannot be used directly for the numerical evaluation of the partition function through coarse-graining algorithms. We can construct a suitable network by inserting a delta function,

$$Z=\sum_{a=\pm 1}\sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}K^{(\mathrm{delta})}_{\sigma_{x,y},\sigma_{x+1,y},a_{x,y},a_{x,y+1}}, \tag{11}$$

where

$$K^{(\mathrm{delta})}_{\sigma_{x,y},\sigma_{x+1,y},a_{x,y},a_{x,y+1}}\equiv K_{\sigma_{x,y},\sigma_{x+1,y},a_{x,y+1}}\delta_{\sigma_{x,y}}^{a_{x,y}}. \tag{12}$$

Similarly, other matrix decompositions like SVD or QR could be used instead of inserting a delta function. Again, we made use of the translational invariance of the system and obtained a locally connected tensor network.
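As a sanity check of Eqs. 9 to 12, the network built from $K^{(\mathrm{delta})}$ can be contracted exactly on a small periodic lattice and compared with a brute-force sum. The sketch below assumes a $3\times 3$ lattice and hypothetical function names; it is an illustration, not the authors' implementation.

```python
import itertools
import numpy as np

def ising_K(beta, g=1.0, h=0.0):
    """K[s_xy, s_{x+1,y}, s_{x,y+1}] of Eqs. (9)-(10)."""
    s = np.array([1.0, -1.0])
    s0, sx, sy = s[:, None, None], s[None, :, None], s[None, None, :]
    return np.exp(beta * h * s0 + 0.5 * beta * g * s0 * (sx + sy))

def ising_K_delta(beta, g=1.0, h=0.0):
    """K_delta[s_xy, s_{x+1,y}, a_xy, a_{x,y+1}] of Eq. (12)."""
    K = ising_K(beta, g, h)
    # output[i, r, a, u] = K[i, r, u] * delta[i, a]; no index is summed
    return np.einsum("iru,ia->irau", K, np.eye(2))

def Z_network_3x3(Kd):
    """Exact contraction on a periodic 3x3 lattice; legs are (left, right, down, up)."""
    row = np.einsum("abip,bcjq,cakr->ijkpqr", Kd, Kd, Kd).reshape(8, 8)
    return np.trace(np.linalg.matrix_power(row, 3))

def Z_brute_force(beta, L=3, g=1.0, h=0.0):
    """Direct sum over all 2^(L*L) spin configurations of Eq. (9)."""
    Z = 0.0
    for conf in itertools.product([1, -1], repeat=L * L):
        s = np.array(conf).reshape(L, L)   # s[x, y]
        bonds = s * (np.roll(s, -1, axis=0) + np.roll(s, -1, axis=1))
        Z += np.exp(beta * h * s.sum() + 0.5 * beta * g * bonds.sum())
    return Z

beta = 0.5 * np.log(1 + np.sqrt(2))
Z_tn = Z_network_3x3(ising_K_delta(beta))
Z_bf = Z_brute_force(beta)
print(Z_tn, Z_bf)
```

The horizontal bonds carry the original spin index $\sigma$, while the vertical bonds carry the copied index $a$, so every index appears on exactly two neighboring tensors.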

Initial tensor construction based on Taylor expansion.

Previously, a different form of the initial tensor with $h=0$ was derived in Liu et al. (2013); Zhao et al. (2010); Xie et al. (2012). We explain it in the following as a reference against which to compare our method. We consider the Taylor expansion of a two-site interaction in the Boltzmann weight. Because the square of a spin variable is the identity, only two terms arise in the expansion, which can be rewritten as a matrix multiplication:

\begin{align}
e^{(\beta g/2)\sigma_{n}\sigma_{n+1}} &= \mathrm{cosh}(\beta g/2)+\sigma_{n}\sigma_{n+1}\mathrm{sinh}(\beta g/2) \notag\\
&= \sum_{l=0}^{1}\Bigg(\sigma_{n}^{l}\sqrt{\mathrm{cosh}(\beta g/2)}^{1-l}\sqrt{\mathrm{sinh}(\beta g/2)}^{l} \notag\\
&\qquad\times\sigma_{n+1}^{l}\sqrt{\mathrm{cosh}(\beta g/2)}^{1-l}\sqrt{\mathrm{sinh}(\beta g/2)}^{l}\Bigg) \notag\\
&= \sum_{l=0}^{1}W_{\sigma_{n},l}W_{\sigma_{n+1},l}. \tag{13}
\end{align}

The matrix $W$ is defined as

$$W=\begin{pmatrix}\sqrt{\mathrm{cosh}(\beta g/2)}&\sqrt{\mathrm{sinh}(\beta g/2)}\\ \sqrt{\mathrm{cosh}(\beta g/2)}&-\sqrt{\mathrm{sinh}(\beta g/2)}\end{pmatrix}, \tag{14}$$

where the first row corresponds to $\sigma=+1$, and the second to $\sigma=-1$, so that $W_{\sigma,l}$ matches the factors $\sigma^{l}$ in Eq. 13 (exchanging the two row labels leaves the product in Eq. 13 unchanged). We see that the exponential of the two-site interaction can be decomposed into two $W$ matrices, introducing a new index $l$. Including the interaction terms in the orthogonal spatial direction, we get the initial tensor

$$K^{(\mathrm{exp})}_{l_{x,y},l_{x+1,y},m_{x,y},m_{x,y+1}}=\sum_{\alpha}W_{\alpha,l_{x,y}}W_{\alpha,l_{x+1,y}}W_{\alpha,m_{x,y}}W_{\alpha,m_{x,y+1}}. \tag{15}$$

This tensor is symmetric under permutation of any indices.
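The decomposition in Eq. 13 and the permutation symmetry of $K^{(\mathrm{exp})}$ can be verified directly. In the following sketch the row order of `W` ($\sigma=+1$ first) is an assumption of the example; the product $WW^{T}$ is the same for either labeling. The function name is hypothetical, and $\beta g>0$ is assumed so that the square roots are real.

```python
import itertools
import numpy as np

def W_matrix(beta, g=1.0):
    """W[sigma, l] of Eq. (14); rows sigma = +1, -1 and columns l = 0, 1."""
    c = np.sqrt(np.cosh(beta * g / 2))
    s = np.sqrt(np.sinh(beta * g / 2))
    return np.array([[c, s], [c, -s]])

beta = 0.5 * np.log(1 + np.sqrt(2))
W = W_matrix(beta)

# Eq. (13): sum_l W[s, l] W[s', l] reproduces the two-site Boltzmann weight (g = 1)
sigma = np.array([1.0, -1.0])
bond_weight = np.exp(0.5 * beta * np.outer(sigma, sigma))
bond_ok = np.allclose(W @ W.T, bond_weight)

# Eq. (15): contract the spin index alpha shared by the four W matrices
K_exp = np.einsum("al,ar,ad,au->lrdu", W, W, W, W)

# Check invariance under all 24 permutations of the four indices
symmetric = all(np.allclose(K_exp, K_exp.transpose(p))
                for p in itertools.permutations(range(4)))
print(bond_ok, symmetric)
```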

Initial tensor dependence of TRG algorithms.

We test the dependence of the coarse-graining methods on the initial tensors by using $K^{(\mathrm{delta})}$ , $K^{(\mathrm{exp})}$ , and the symmetrized tensor $K^{(\mathrm{sym})}$ . The latter is obtained from $K^{(\mathrm{delta})}$ by a gauge transformation on the tensor indices, in order to make the tensor nearly symmetric under permutation of its indices. This symmetrization is explained in App. C. Each SVD in the coarse-graining step is truncated to a maximum bond size $D$ to prevent the exponential growth of the index sizes. We apply ${\mathcal{O}}(D^{6})$ TRG Levin and Nave (2007), ${\mathcal{O}}(D^{7})$ HOTRG Xie et al. (2012), ${\mathcal{O}}(D^{5})$ ATRG Adachi et al. (2020), ${\mathcal{O}}(D^{5})$ MDTRG without internal line oversampling Nakayama (2023), and ${\mathcal{O}}(D^{7})$ boundary TRG for HOTRG (b-HOTRG) Iino et al. (2019) for a system size of $V=2^{20}$ .

In this section we discuss the initial tensor dependence of the TRG, HOTRG, and b-HOTRG. The details of the algorithms and further benchmarks for the other TRG methods can be found in App. E.

Figure 1: Dependence of the TRG and HOTRG methods on the form of the initial tensors for different cutoff bond dimensions $D$ in the two-dimensional Ising model. Shown are the relative errors of the free energy for the asymmetric initial tensor $K^{(\mathrm{delta})}$, the symmetric tensor $K^{(\mathrm{exp})}$, and the symmetrized tensor $K^{\mathrm{(sym)}}$. See main text for details.

We calculate the free energy $F\equiv-(\mathrm{ln}Z)/(\beta V)$ and compare it to the exact value Kaufman (1949). Figure 1 shows the error of the free energy for the TRG and HOTRG methods. We find that the accuracy of the original TRG method does not depend on the choice of the initial tensor. As in previous studies Xie et al. (2012), the HOTRG has a better accuracy than TRG if the symmetric tensor $K^{(\mathrm{exp})}$ is used. The same holds for the symmetrized tensor $K^{(\mathrm{sym})}$. However, this no longer holds for the asymmetric initial tensor $K^{(\mathrm{delta})}$, where the accuracy is lowered significantly. We study the symmetry dependence in more detail in App. C and find that the original TRG generally does not depend on the symmetry of the initial tensors, while HOTRG becomes increasingly unreliable the less symmetric the initial tensors are.
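To make the benchmark procedure concrete, the following is a minimal sketch of the original Levin-Nave TRG applied to a symmetric initial tensor of the form of Eq. 15. It is not the authors' code. Several choices are assumptions of this example: the coupling `K` denotes the coefficient of $\sigma\sigma^{\prime}$ per bond, the leg order is (left, up, right, down), the tensor is normalized by its largest element in each step, and the reference value is the standard infinite-volume Onsager result evaluated by a midpoint rule.

```python
import numpy as np

def initial_tensor(K):
    """Symmetric tensor as in Eq. (15); K is the coefficient of s*s' per bond."""
    W = np.array([[np.sqrt(np.cosh(K)),  np.sqrt(np.sinh(K))],
                  [np.sqrt(np.cosh(K)), -np.sqrt(np.sinh(K))]])
    return np.einsum("al,au,ar,ad->lurd", W, W, W, W)

def split(M, D):
    """Truncated SVD M ~ L @ R, keeping at most D singular values."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    k = min(D, len(s))
    root = np.sqrt(s[:k])
    return U[:, :k] * root, root[:, None] * Vh[:k]

def trg_step(T, D):
    """One TRG step for T[left, up, right, down]; the lattice rotates by 45 degrees."""
    chi = T.shape[0]
    # sublattice A: cut (left, up) | (right, down)
    L, R = split(T.reshape(chi * chi, chi * chi), D)
    F1, F3 = L.reshape(chi, chi, -1), R.reshape(-1, chi, chi)
    # sublattice B: cut (left, down) | (up, right)
    L, R = split(T.transpose(0, 3, 1, 2).reshape(chi * chi, chi * chi), D)
    F4, F2 = L.reshape(chi, chi, -1), R.reshape(-1, chi, chi)
    # contract the four pieces around one plaquette into the new tensor
    return np.einsum("aij,ikb,cjm,mke->abec", F3, F4, F2, F1, optimize=True)

def trg_lnz_per_site(K, D, n_steps):
    """ln(Z)/V for V = 2**n_steps sites with periodic boundaries."""
    T, lnz = initial_tensor(K), 0.0
    for step in range(n_steps):
        c = np.abs(T).max()              # normalize to avoid overflow
        T = T / c
        lnz += np.log(c) / 2 ** step     # V / 2**step tensors carry the factor c
        T = trg_step(T, D)
    return lnz + np.log(np.einsum("ijij->", T)) / 2 ** n_steps

def onsager_lnz_per_site(K, n=200000):
    """Infinite-volume exact result, midpoint rule on [0, pi]."""
    theta = (np.arange(n) + 0.5) * np.pi / n
    kappa = 2 * np.sinh(2 * K) / np.cosh(2 * K) ** 2
    root = np.sqrt(np.maximum(1 - (kappa * np.sin(theta)) ** 2, 0.0))
    return np.log(2 * np.cosh(2 * K)) + np.mean(np.log(0.5 * (1 + root))) / 2

K_c = 0.5 * np.log(1 + np.sqrt(2))
approx = trg_lnz_per_site(K_c, D=8, n_steps=20)
exact = onsager_lnz_per_site(K_c)
rel_err = abs(1 - approx / exact)
print(rel_err)
```

Because $\beta$ is held fixed, the relative error of $\ln Z/V$ equals that of the free energy. The small bond dimension `D=8` is chosen only to keep the example fast; the benchmarks in the text use larger values.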

Removing the initial tensor dependence by boundary TRG techniques.

The HOTRG results for the asymmetric initial tensor can be improved by applying the boundary TRG method Iino et al. (2019). As shown in Fig. 2, this boundary HOTRG method produces results with the same accuracy as HOTRG for a symmetric initial tensor, but does so even if an asymmetric initial tensor $K^{(\mathrm{delta})}$ is used. The boundary HOTRG differs from the simple HOTRG by the details of the coarse graining steps: simple HOTRG uses an isometry $U^{(\mathrm{HOTRG})}$ , while the boundary HOTRG introduces squeezers $P_{1}^{(\mathrm{bHOTRG})}$ and $P_{2}^{(\mathrm{bHOTRG})}$ for the coarse-graining. See App. D for details.

Similar results are found for ATRG and MDTRG as shown in App. E. We observe that ATRG and MDTRG with squeezers similar to the boundary TRG have no dependence on the form of the initial tensors, while coarse graining methods using isometries similar to the simple HOTRG strongly depend on it.

Overview of TRG methods and their initial tensor dependencies.

We give a summary of the different coarse-graining methods, their costs and their dependence on the initial tensors in Table 1. The different methods can be categorized into three classes. The first category uses no isometries but replaces tensors directly by their SVD representations, or by the projectors introduced in the boundary TRG method Iino et al. (2019) (TRG, b-HOTRG). We call these projectors squeezers as in Adachi et al. (2020) because they are not always projectors in the mathematical sense. We indicate this class of algorithms as sqz in Table 1. The second category consists of methods which use the index of an isometry as a new index in the next coarse-graining step (HOTRG-like). This is denoted as iso in Table 1. Finally, the third class consists of methods which use isometries for intermediate approximate contractions, but the indices of the isometries are not used as new indices of the coarse-grained tensors. We denote these methods as iso*.

Method	Costs	Truncation	Dependence	$|1-F(D=30)/F_{\mathrm{ex}}|$
TRG Levin and Nave (2007)	${\mathcal{O}}(D^{6})$	sqz	$--$	$\sim{\mathcal{O}}(10^{-6})$
HOTRG Xie et al. (2012)	${\mathcal{O}}(D^{4\,\mathrm{dim}-1})$	iso	$++$	${\mathcal{O}}(10^{-5}\sim 10^{-8})$
b-HOTRG Iino et al. (2019)	${\mathcal{O}}(D^{4\,\mathrm{dim}-1})$	sqz	$--$	$\sim{\mathcal{O}}(10^{-8})$
ATRG Adachi et al. (2020)	${\mathcal{O}}(D^{2\,\mathrm{dim}+1})$	sqz	$--$	$\sim{\mathcal{O}}(10^{-7})$
Iso-ATRG Adachi et al. (2020)	${\mathcal{O}}(D^{2\,\mathrm{dim}+1})$	iso	$++$	${\mathcal{O}}(10^{-5}\sim 10^{-6})$
sh-ATRG	${\mathcal{O}}(D^{2\,\mathrm{dim}+1})$	sqz	$-$	$\sim{\mathcal{O}}(10^{-7})$
sh-Iso-ATRG	${\mathcal{O}}(D^{2\,\mathrm{dim}+1})$	iso*	$-$	$\sim{\mathcal{O}}(10^{-6})$
MDTRG Nakayama (2023)	${\mathcal{O}}(D^{\mathrm{dim}+3})$	iso	$++$	${\mathcal{O}}(10^{-5}\sim 10^{-7})$
sh-MDTRG	${\mathcal{O}}(D^{\mathrm{dim}+3})$	iso*	$-$	$\sim{\mathcal{O}}(10^{-6})$
b-MDTRG	${\mathcal{O}}(D^{\mathrm{dim}+3})$	sqz	$--$	$\sim{\mathcal{O}}(10^{-7})$

Table 1: Properties of different TRG coarse graining methods. 2nd column: numerical costs; $D$ is the bond dimension and $\mathrm{dim}$ the spacetime-dimension. 3rd column: truncation method; iso stands for isometries which are used to create the coarse-grained indices; iso* means that isometries are used for intermediate approximate contractions, but they do not create the new indices of the coarse-grained tensors directly; sqz denotes all other methods, so either the squeezers from boundary TRG Iino et al. (2019) (see main text and App. D), or a simple contraction and singular value decomposition. 4th column: dependence on the initial tensors; $--$ stands for no dependence, $-$ for a slight but not significant dependence, $++$ for strong dependence. 5th column: relative error for a bond dimension of $D=30$ for the two-dimensional critical Ising model compared to the exact energy; this gives an estimate of the accuracy, but note that different methods scale differently in the bond dimension. See App. E for more details on algorithms and benchmarks.

From our calculations in Figs. 1 and 2 and App. E we conclude that coarse-graining methods that use isometries to create the new indices (iso), such as the simple HOTRG, depend strongly on the symmetry properties of the initial tensors. This was also found for the massless Schwinger model with a different approach Butt et al. (2020). The isometries can always be replaced by squeezers as introduced for the boundary TRG. We suggest using these boundary TRG techniques, which can remove the dependence on the initial tensor symmetries and make the algorithm more robust (sqz). In that case, our tensor construction provides a simple and generic way to represent the partition function as a locally connected tensor network, without loss of accuracy in numerical calculations compared to other construction techniques.

Dependence on the index exchange type.

We also found a dependence of TRG algorithms on the way the index-directions are exchanged after each coarse graining step. From App. G we conclude that the exchange of directions should ideally allow the initial SVDs in a coarse graining step to split tensors along the contraction direction in the previous step. For the algorithms in this paper, this means that a rotation in clockwise or counterclockwise direction is better suited for shifted TRG methods. For non-shifted methods, a flip $x\leftrightarrow y$ ( $x^{\prime}\leftrightarrow y^{\prime}$ ) leads to similar or better results. It replaces the $x$ -index in negative (positive) $x$ -direction with a corresponding $y$ -index. We found that these flips lead to inaccurate results for the shifted methods and an accumulation of systematic errors. Therefore, the type of index exchange should be carefully checked for the TRG method used. In our benchmarks and numerical results we always apply the optimal exchange between directions, which is a rotation for shifted and a flip for non-shifted methods.

IV $\mathbb{Z}_{2}$ gauge theory

The three-dimensional $\mathbb{Z}_{2}$ gauge theory was studied in Liu et al. (2013); Kuramashi and Yoshimura (2019) using HOTRG and TRG. The partition function can be written as

$$Z=2^{-3V}\sum_{\sigma=\pm 1}\prod_{n,\mu>\nu}e^{-\beta\sigma_{n,\mu}\sigma_{n+\hat{\mu},\nu}\sigma_{n+\hat{\nu},\mu}\sigma_{n,\nu}}, \tag{16}$$

where we introduce the link variables $\sigma_{n,\mu}$ at site $n$ with direction $\mu$. The unit vector in $\mu$ direction is represented by $\hat{\mu}$. The interaction corresponds to a spin system where each spin interacts with its nearest and next-nearest neighbors in a four-site interaction, known as the plaquette term. A schematic picture of the three plaquette terms in the three directions can be seen as black lines in Fig. 3.

Initial tensor construction based on Taylor expansion.

In Kuramashi and Yoshimura (2019), the authors used a representation based on the Taylor expansion similar to Eq. 13:

$$e^{\beta\sigma_{n,\mu}\sigma_{n+\hat{\mu},\nu}\sigma_{n+\hat{\nu},\mu}\sigma_{n,\nu}}=\mathrm{cosh}\beta\sum_{p=0}^{1}\left(\mathrm{tanh}\beta\right)^{p}\left(\sigma_{n,\mu}\sigma_{n+\hat{\mu},\nu}\sigma_{n+\hat{\nu},\mu}\sigma_{n,\nu}\right)^{p}. \tag{17}$$

In this previous work, gauge-fixing was applied to simplify the tensor network representation. However, for gauge theories on the lattice in general, numerical calculations with gauge-fixing can suffer from the ambiguity of Gribov copies Gribov (1978). Therefore, we do not fix the gauge in our initial tensor constructions and in our numerical calculations.

Following the derivation in Kuramashi and Yoshimura (2019); Liu et al. (2013) but without gauge-fixing, we define the tensors $A$ and $B$ as

$$A_{pqrs}=\mathrm{mod}(1+p+q+r+s,2), \tag{18}$$

$$B_{pqrs}=(\mathrm{tanh}\beta)^{(p+q+r+s)/4}\delta_{pq}\delta_{qr}\delta_{rs}. \tag{19}$$
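The ingredients of Eqs. 17 to 19 can be checked with a short script. This sketch assumes that the prefactor $2^{-3V}$ of Eq. 16 is distributed as a factor $1/2$ per link, so that $A$ arises from summing a link variable raised to the total power of the four plaquettes sharing that link; this reading and all variable names are assumptions of the example.

```python
import itertools
import numpy as np

beta = 0.7  # illustrative value

# Eq. (17): for s = +/-1, exp(beta*s) = cosh(beta) * sum_p (tanh(beta) * s)^p
for s in (1, -1):
    lhs = np.exp(beta * s)
    rhs = np.cosh(beta) * sum((np.tanh(beta) * s) ** p for p in (0, 1))
    assert np.isclose(lhs, rhs)

# Eq. (18): link tensor, equal to 1 if p+q+r+s is even and 0 otherwise
A = np.zeros((2, 2, 2, 2))
for p, q, r, s in itertools.product(range(2), repeat=4):
    A[p, q, r, s] = (1 + p + q + r + s) % 2

# Same tensor from the explicit link sum (1/2) * sum_sigma sigma^(p+q+r+s)
A_sum = np.zeros_like(A)
for idx in itertools.product(range(2), repeat=4):
    A_sum[idx] = 0.5 * sum(sig ** sum(idx) for sig in (1, -1))

# Eq. (19): plaquette tensor, nonzero only for p = q = r = s
B = np.zeros((2, 2, 2, 2))
for p in range(2):
    B[p, p, p, p] = np.tanh(beta) ** p   # exponent (p+q+r+s)/4 equals p here

print(np.array_equal(A, A_sum), B[0, 0, 0, 0], B[1, 1, 1, 1])
```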

A combination of six tensors leads to a unit cell tensor $T^{(\mathrm{exp})}$ which defines a locally connected tensor network that reproduces the partition function:

$$T^{(\mathrm{exp})}_{[xX][x^{\prime}X^{\prime}][yY][y^{\prime}Y^{\prime}][zZ][z^{\prime}Z^{\prime}]}/(\mathrm{cosh}\beta)^{3}\equiv\sum_{a,b,c,d,e,f}A_{cyZe}A_{fzxb}A_{dYXa}B_{bx^{\prime}y^{\prime}c}B_{aX^{\prime}Z^{\prime}e}B_{fz^{\prime}Y^{\prime}d}. \tag{20}$$

The combination of two indices like $[xX]\equiv x\otimes X$ introduces new spin-3/2 indices for the unit cell tensor.

Note that $T^{\mathrm{(exp)}}$ is not symmetric, even if $A$ and $B$ are completely symmetric in all indices. This differs from the Ising model, where the initial tensors obtained using a Taylor expansion were symmetric. Therefore, the expansion method does not produce better symmetry properties than our method for the $\mathbb{Z}_{2}$ model. For the Ising model, we found in Sec. III and App. E that HOTRG is not well suited for non-symmetric initial tensors, while ATRG does not depend on the symmetry properties. This suggests that ATRG is a better choice for the initial tensors $T^{\mathrm{(delta)}}$ and $T^{\mathrm{(exp)}}$ of the $\mathbb{Z}_{2}$ model. However, the initial tensor dependence was not explicitly checked for the $\mathbb{Z}_{2}$ model and we use ATRG in all our simulations.

The previous study Kuramashi and Yoshimura (2019) applied further constraints on the tensors $A$ and $B$ to implement a gauge-fixing condition. With this, they could precisely reproduce Monte-Carlo results.

Initial tensor construction with shifted delta-functions.

In the following, we construct another tensor network for the same model using the method introduced in Sec. II. We do not need a Taylor expansion, do not make use of the spin property $\sigma^{2}=1$ , and we keep the gauge unfixed. For a simpler notation, we define the indices

\begin{align}
x_{\hat{k}} &\equiv \sigma_{n+\hat{k},\mu=0}, \notag\\
y_{\hat{k}} &\equiv \sigma_{n+\hat{k},\mu=1}, \tag{21}\\
z_{\hat{k}} &\equiv \sigma_{n+\hat{k},\mu=2}. \notag
\end{align}

Moreover, we define $x\equiv x_{\hat{0}}$ , $y\equiv y_{\hat{0}}$ , $z\equiv z_{\hat{0}}$ . The index $n$ is not written explicitly here for brevity. Figure 3 shows a graphical representation of this index convention, where we locate the degrees of freedom $\sigma_{n,\mu}$ on the links between sites $n$ and $n+\hat{\mu}$ .

The Boltzmann weight at site $n$ is

$$T_{x,x_{\hat{y}},x_{\hat{z}},y,y_{\hat{x}},y_{\hat{z}},z,z_{\hat{x}},z_{\hat{y}}}\equiv e^{-\beta\left(xx_{\hat{y}}yy_{\hat{x}}+xx_{\hat{z}}zz_{\hat{x}}+yy_{\hat{z}}zz_{\hat{y}}\right)}/8. \tag{22}$$

We can translate this weight to a tensor network. For example, we can split the index $x_{\hat{y}}$ from the tensor:

$$T_{x,x_{\hat{y}},x_{\hat{z}},y,y_{\hat{x}},y_{\hat{z}},z,z_{\hat{x}},z_{\hat{y}}}=\sum_{a_{\hat{y}}=\pm 1}A_{x,x_{\hat{z}},y,y_{\hat{x}},y_{\hat{z}},z,z_{\hat{x}},z_{\hat{y}}}^{a_{\hat{y}}}B_{x_{\hat{y}}}^{a_{\hat{y}}}. \tag{23}$$

One of the simplest choices for this decomposition is $B_{x_{\hat{y}}}^{a_{\hat{y}}}=\delta_{a_{\hat{y}},x_{\hat{y}}}$ . We define a new tensor without summation of the index $x$ ,

	$\displaystyle C_{x,x_{\hat{z}},a,a_{\hat{y}},y,y_{\hat{x}},y_{\hat{z}},z,z_{\hat{x}},z_{\hat{y}}}\equiv T_{x,a_{\hat{y}},x_{\hat{z}},y,y_{\hat{x}},y_{\hat{z}},z,z_{\hat{x}},z_{\hat{y}}}\delta_{x}^{a},$	(24)

where $a\equiv a_{\hat{0}}$ . Similarly, we can split the indices $y_{\hat{z}}$ and $z_{\hat{x}}$ from the tensor and shift the indices. This way, we obtain the initial tensor

	$\displaystyle T^{(\mathrm{delta})}_{x,x_{\hat{z}},a,a_{\hat{y}},y,y_{\hat{x}},b,b_{\hat{z}},z,z_{\hat{y}},c,c_{\hat{x}}}\equiv T_{x,a_{\hat{y}},x_{\hat{z}},y,y_{\hat{x}},b_{\hat{z}},z,c_{\hat{x}},z_{\hat{y}}}\delta_{x}^{a}\delta_{y}^{b}\delta_{z}^{c}.$	(25)

We define new spin-3/2 indices, $[az]_{n+\hat{y}}\equiv a_{\hat{y}}\otimes z_{\hat{y}}$ and finally obtain the partition function

	$\displaystyle Z=\sum_{[az],[bx],[cy]=1}^{4}\prod_{n}T^{(\mathrm{delta})}_{[az]_{n}[az]_{n+\hat{y}}[bx]_{n}[bx]_{n+\hat{z}}[cy]_{n}[cy]_{n+\hat{x}}}.$	(26)

This is a locally connected tensor network representation.
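As a rough numerical sketch of Eqs. (22) to (26), the initial tensor can be assembled with NumPy as follows. The function name `z2_delta_tensor`, the spin ordering $(+1,-1)$, and the single-site consistency check are assumptions made for illustration; they are not taken from the implementation used for the results in this paper.

```python
import numpy as np

def z2_delta_tensor(beta):
    """Hypothetical helper: T^(delta) as a (4, 4, 4, 4, 4, 4) array."""
    spins = np.array([1.0, -1.0])  # assumed ordering of the two spin values

    def axis(k):
        # Spin values placed along axis k of a 9-index array (for broadcasting).
        shape = [1] * 9
        shape[k] = 2
        return spins.reshape(shape)

    # Axes ordered as in Eq. (22): x, x_y, x_z, y, y_x, y_z, z, z_x, z_y.
    x, x_y, x_z, y, y_x, y_z, z, z_x, z_y = (axis(k) for k in range(9))
    T = np.exp(-beta * (x * x_y * y * y_x + x * x_z * z * z_x + y * y_z * z * z_y)) / 8.0

    delta = np.eye(2)
    # Rename the split spins x_y -> A (a_y), y_z -> B (b_z), z_x -> C (c_x),
    # attach delta_x^a, delta_y^b, delta_z^c, and order the output as
    # [az]_n, [az]_{n+y}, [bx]_n, [bx]_{n+z}, [cy]_n, [cy]_{n+x}.
    T_delta = np.einsum("xAwyYBzCZ,xa,yb,zc->azAZbxBwcyCY", T, delta, delta, delta)
    return T_delta.reshape(4, 4, 4, 4, 4, 4)

beta = 0.6561
T_delta = z2_delta_tensor(beta)
# On a single site with periodic boundaries every plaquette term equals 1,
# so the trace of the tensor should give Z = exp(-3 * beta).
Z_single_site = np.einsum("iijjkk->", T_delta)
```

The single-site trace is only a sanity check of the index bookkeeping; the actual calculations contract the network with HOTRG and ATRG as described below.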

Numerical results for the free energy.

We test this representation with the initial tensor $T^{(\mathrm{delta})}$ without gauge-fixing by evaluating the partition function numerically. We set the system sizes in $x$ , $y$ , $z$ direction to $N_{x}=2$ , $N_{y}=N_{z}=2^{15}$ . The first dimension is chosen small, similarly to Kuramashi and Yoshimura (2019). First, the three-dimensional system is reduced to a two-dimensional one by an HOTRG step without truncation. Then, we apply ATRG to perform the coarse-graining contractions with a truncation at a given bond dimension $D$ .

The free energy

	$\displaystyle F\equiv-\frac{1}{\beta V}\ln Z,$	(27)

is calculated from the partition function. We estimate the relative error as a function of the cutoff parameter $D$ by $|1-F(D)/F(D=128)|$ , where $D=128$ is the largest bond dimension in our simulations.

Figure 4 shows the error for the ATRG coarse graining method at $\beta=0.6561$ with oversampling parameter $r=2$ . Additionally, we show the results for the shifted ATRG algorithm, which is explained in App. E. We observe no significant dependence on the initial tensor for either method. The initial tensor $T^{\mathrm{(delta)}}$ , which is constructed without a Taylor expansion, leads to accurate results and the accuracy is comparable to calculations with the initial tensor $T^{\mathrm{(exp)}}$ . The relative error between the ATRG and shifted ATRG methods is $\left|1-\frac{F_{\mathrm{sh,ATRG}}(D=128)}{F_{\mathrm{ATRG}}(D=128)}\right|={\mathcal{O}}(10^{-7})$ , indicating that both methods converge to the same value. The error from randomized SVDs is sufficiently reduced by an $r=2D$ oversampling. Since the shifted ATRG is better suited for the impurity tensor method, as discussed in App. F, we use the shifted ATRG in calculations of the specific heat of the system.

The calculation of the free energy for the three-dimensional $\mathbb{Z}_{2}$ model demonstrates that our initial tensor construction $T^{\mathrm{(delta)}}$ without expansion and gauge-fixing leads to results as accurate as those with the initial tensor constructions $T^{\mathrm{(exp)}}$ .

Numerical results for the specific heat.

We further calculate the specific heat

	$\displaystyle C\equiv\beta^{2}\frac{1}{V}\frac{\partial^{2}\ln Z}{\partial\beta^{2}}.$	(28)

First, we obtain the first order derivative $\partial_{\beta}\mathrm{ln}Z$ by the impurity tensor method as explained in App. F. Then, the second order derivative, and therefore $C$ , is derived from this with a numerical fourth-order approximation of the differentials. For calculations not too close to the critical temperature, we choose a step size of $\delta\beta=0.002$ and a bond dimension $D=64$ . Closer to the critical value of $\beta$ we set $\delta\beta=0.00025$ and the bond dimension to $D=112$ . The error of the approximation for the second order derivative is ${\mathcal{O}}(\delta\beta^{4})$ and thus decreases for smaller $\delta\beta$ . On the other hand, any kind of error of the first order derivative, $\delta(\partial_{\beta}\mathrm{ln}Z)$ , propagates as ${\mathcal{O}}(\delta(\partial_{\beta}\mathrm{ln}Z)/(\delta\beta))$ and thus grows for small $\delta\beta$ . If one aims for high precision, the step size $\delta\beta$ should therefore be carefully chosen and optimized.
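The numerical differentiation step can be sketched as follows. The text only states that a fourth-order approximation is used; the five-point central stencil below is our assumption of a typical choice, and the function names as well as the toy input $\sin\beta$ (standing in for the impurity-tensor result) are hypothetical.

```python
import numpy as np

def fourth_order_derivative(g, beta, step):
    """Five-point central stencil for dg/dbeta, truncation error O(step**4)."""
    return (g(beta - 2 * step) - 8 * g(beta - step)
            + 8 * g(beta + step) - g(beta + 2 * step)) / (12 * step)

def specific_heat(dlnz_dbeta, beta, step, volume):
    """Eq. (28), with the second derivative taken numerically from the first."""
    return beta ** 2 * fourth_order_derivative(dlnz_dbeta, beta, step) / volume

# Toy stand-in for the first derivative of ln Z: g(beta) = sin(beta).
approx = fourth_order_derivative(np.sin, 0.6561, 0.002)
truncation_error = abs(approx - np.cos(0.6561))
```

Because the stencil divides by the step size, any noise in the first derivative is amplified by a factor of order $1/\delta\beta$, which is the trade-off described above.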

Figure 5 shows the specific heat of the $\mathbb{Z}_{2}$ gauge theory with the initial tensor $T^{\mathrm{(delta)}}$ . The critical temperature is found to be $\beta_{c}=0.6560(3)$ . The uncertainty is estimated by the spread of results due to the randomized SVD. We choose the uncertainty of $\beta_{c}$ such that the largest ten data points lie in the error band, see Fig. 4(b). A more careful study of error sources would be needed if one aims for higher precision. Further methods to improve the accuracy can be found in Kuramashi and Yoshimura (2019). Our result $\beta_{c}=0.6560(3)$ is consistent with the TRG result $\beta_{c}=0.656097(1)$ in Kuramashi and Yoshimura (2019) and the Monte-Carlo result $\beta_{c}=0.65608(5)$ in Svetitsky and Yaffe (1982).

The calculations show that our approach can successfully be applied to a wide range of systems, including gauge theories, and can serve as a first candidate for investigating a system by means of TRG methods. The method can be applied to any translationally invariant spin-statistical system with a finite number of spin degrees of freedom. We demonstrated this in this section for the $\mathbb{Z}_{2}$ gauge theory and discuss the generalization and scaling in Sec. V. Since our construction requires neither a model-specific expansion of the original partition function nor integrating out the original variables, it can straightforwardly be used to find the tensor network representation of physical quantities for a large class of systems.

V General form of initial tensors

In this section, we consider the initial tensor construction method with delta functions for general models, including long-range and non-neighboring interactions. We derive the scaling $d^{2[n_{\mathrm{int}}+n_{\mathrm{s}}-1]}$ for the number of elements of the initial tensors. Here, $d$ is the dimension of the local Hilbert space, and $n_{\mathrm{int}}$ is the number of lattice points entering the original, not locally connected tensors representing the partition function. The number of Steiner points $n_{\mathrm{s}}$ corresponds to the number of lattice points needed to connect isolated regions, as explained later in this section.

Connected long range chain in 1d

As an example for longer range interactions, we consider a system where each lattice site is coupled to all sites up to a distance of $k$ sites. The partition function can be written as

	$\displaystyle Z=\sum_{\sigma=1}^{d}\prod_{i=1}^{N}K_{\sigma_{i},\sigma_{i+1},\dots,\sigma_{i+k}}.$	(29)

The local physical dimension is ${d}$ , and ${n_{\mathrm{int}}}=k+1$ is the number of indices of these initial tensors. See App. A for an example of this type.

We apply the decomposition with a delta matrix,

	$\displaystyle K_{\sigma_{i},\sigma_{i+1},\dots,\sigma_{i+k}}=\sum_{a^{(1)}_{i+1}=1}^{d}K_{\sigma_{i},\sigma_{i+1},\dots,a^{(1)}_{i+1}}\delta_{\sigma_{i+k},a^{(1)}_{i+1}}.$	(30)

Using translational invariance, we define the new tensor

	$\displaystyle K^{(1)}_{\sigma_{i},\sigma_{i+1},\dots,\sigma_{i+k-1},a^{(1)}_{i},a^{(1)}_{i+1}}\equiv K_{\sigma_{i},\sigma_{i+1},\dots,\sigma_{i+k-1},a^{(1)}_{i+1}}\delta_{\sigma_{i+k-1},a^{(1)}_{i}},$	(31)

which leads to the same partition function as the original one if one takes the product of all tensors at different lattice sites and sums over all indices, similar to Eq. 29 but including the new indices $a^{(1)}$ .

We can repeat this procedure $k-1$ times to get the local representation

	$\displaystyle K^{(k-1)}_{\sigma_{i},\sigma_{i+1},a^{(1)}_{i},a^{(1)}_{i+1},\dots,a^{(k-1)}_{i},a^{(k-1)}_{i+1}}\equiv K^{(k-1)}_{[\sigma_{i}a^{(1)}_{i}\dots a^{(k-1)}_{i}],[\sigma_{i+1}a^{(1)}_{i+1}\dots a^{(k-1)}_{i+1}]}.$	(32)

The index dimension of the combined indices $(\sigma\otimes a^{(1)}\otimes\dots\otimes a^{(k-1)})$ between neighboring points is then $d^{k}$ , and the initial tensor $K^{(k-1)}$ has $d^{2k}=d^{2(n_{\mathrm{int}}-1)}$ elements.

Figure 6 shows a schematic picture of our method for $k=3$ . The original tensor $K_{\sigma_{x}\sigma_{x+1}\sigma_{x+2}\sigma_{x+3}}$ has four spin variables as indices. These are represented by green dots, and their number is ${n_{\mathrm{int}}}=4$ . Each decomposition by a delta function creates two new indices and removes the dependence on one spin variable. We represent each such step by a colored arrow. Explicitly, the red arrow removes the dependence on $\sigma_{x+3}$ and creates new indices $a_{x}$ and $a_{x+1}$ . The blue arrow similarly removes the dependence on $\sigma_{x+2}$ and creates new indices $b_{x}$ and $b_{x+1}$ . The black arrow connects nearest neighbors in the original spin indices, and does not correspond to a decomposition.
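The construction of Eqs. (29) to (32) can be checked numerically on a short periodic chain. The sketch below is an illustration under stated assumptions: the local weight `local_weight` (pair couplings $J_r$ between sites $i$ and $i+r$), the helper names, and the parameter values are hypothetical. Once the shifted delta functions are enforced, the combined index $[\sigma_{i}a^{(1)}_{i}\dots a^{(k-1)}_{i}]$ takes the values of the spin window $(\sigma_{i},\dots,\sigma_{i+k-1})$, which is how it is stored here.

```python
import itertools
import numpy as np

def local_weight(spins, couplings):
    """Hypothetical K: exp(sum_r J_r * sigma_i * sigma_{i+r}), r = 1..k."""
    return np.exp(sum(J * spins[0] * s for J, s in zip(couplings, spins[1:])))

def delta_transfer_matrix(couplings, values=(1, -1)):
    """K^(k-1) of Eq. (32) as a (d**k x d**k) matrix."""
    k = len(couplings)
    windows = list(itertools.product(values, repeat=k))
    M = np.zeros((len(windows), len(windows)))
    for i, left in enumerate(windows):        # (sigma_i, ..., sigma_{i+k-1})
        for j, right in enumerate(windows):   # (sigma_{i+1}, ..., sigma_{i+k})
            if left[1:] == right[:-1]:        # product of the shifted delta functions
                M[i, j] = local_weight((left[0],) + right, couplings)
    return M

def brute_force_Z(couplings, N, values=(1, -1)):
    """Direct evaluation of Eq. (29) with periodic boundary conditions."""
    k = len(couplings)
    Z = 0.0
    for cfg in itertools.product(values, repeat=N):
        w = 1.0
        for i in range(N):
            w *= local_weight([cfg[(i + r) % N] for r in range(k + 1)], couplings)
        Z += w
    return Z

couplings = (0.3, -0.2, 0.1)   # k = 3, as in Fig. 6
N = 6
M = delta_transfer_matrix(couplings)
Z_tn = np.trace(np.linalg.matrix_power(M, N))
Z_exact = brute_force_Z(couplings, N)
```

For $d=2$ and $k=3$ the matrix has shape $(8\times 8)$, in line with the $d^{2k}$ element count quoted above. Setting all but the last coupling to zero gives a weight that depends only on $\sigma_{i}$ and $\sigma_{i+k}$.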

Disconnected long range interaction in 1d

The bond size of the tensor network representation in 1d depends only on the maximum interaction distance. For example, we consider a system where the interactions only connect sites at a distance $k$ from each other. The partition function is

	$\displaystyle Z=\sum_{\sigma=1}^{d}\prod_{i=1}^{N}K_{\sigma_{i},\sigma_{i+k}}.$	(33)

Our procedure to construct the initial tensors is similar to the previous example, and leads to the same form of the initial tensors as in Eq. 32. The index size is thus the same, the elements of the tensors differ though. Appendix B discusses an example of this type of long range interaction.

Figure 7 shows a schematic picture of the procedure. The red arrow removes the dependence on $\sigma_{x+k}$ but introduces a dependence on the site $x+k-1$ , which is denoted as a red dot in our graphical notation. This new dependence is removed by the blue arrow. The green outlines indicate that the original tensor did not depend on these sites. The number of arrows is the same as in the previous example, and thus the resulting tensor has the same dimensions.

We define the number of arrows, which corresponds to the number of decompositions in our method, as $n_{\mathrm{dec}}$ . The initial tensor can then be represented as a $(d^{n_{\mathrm{dec}}}\times d^{n_{\mathrm{dec}}})$ matrix $K^{(k-1)}$ . This can also be expressed in terms of the number of original spin values (green dots) $n_{\mathrm{int}}$ and the number of generated Steiner points (red dots) $n_{\mathrm{s}}$ . The latter are needed to connect disconnected regions of the lattice, and are the points with green outlines in Fig. 7. In the case discussed here, the new tensor has size $(d^{n_{\mathrm{int}}-1+n_{\mathrm{s}}}\times d^{n_{\mathrm{int}}-1+n_{\mathrm{s}}})$ , and thus has $d^{2(n_{\mathrm{int}}-1+n_{\mathrm{s}})}$ elements.

Higher dimensions

The same scaling ${d}{}^{2({n_{\mathrm{int}}}-1+{n_{\mathrm{s}}})}$ holds in higher dimensions as well. However, ${n_{\mathrm{int}}}$ typically grows in higher dimensions because interactions happen in more directions. We can use the graphical notation again, as shown for example in Fig. 8. Arrows are introduced such that a path arises from all sites that take part in the interaction to the origin. Each arrow in a given spatial direction in the lattice contributes a factor $d$ in the bond size of the index for this direction in the constructed tensor. Note, however, that the choice of arrows is not unique anymore in more than one dimension. For example, Fig. 9 shows an alternative way to connect the tensors compared to Fig. 8. The constructed tensor has the same number of elements in this case, but the dimensions of the individual indices differ.

Finally, we discuss the example in Fig. 10 where isolated regions arise. The nearest neighbors of the lower left site do not take part in the interaction, which is symbolized by dashed red outlines of these sites. To form a connected graph, at least one isolated point has to be included. Finding the minimum number of arrows in our graphical representation is a well known problem in graph theory, known as the rectilinear Steiner tree problem Hanan (1966). The graph in Fig. 10 has $n_{x}=2$ arrows in x-direction, $n_{y}=2$ arrows in y-direction and one isolated point. Thus, the constructed tensor has the dimensions $({d}{}^{n_{x}}\times{d}{}^{n_{x}}\times{d}{}^{n_{y}}\times{d}{}^{n_{y}})$ and ${d}{}^{2[{n_{\mathrm{int}}}+{n_{\mathrm{s}}}-1]}={d}{}^{2[4+1-1]}={d}{}^{8}$ elements.
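The element count can be evaluated with a one-line helper. The function name is hypothetical; the inputs are the cases quoted in this section.

```python
def initial_tensor_elements(d, n_int, n_s=0):
    """Number of elements d**(2 * (n_int + n_s - 1)) of the initial tensor."""
    return d ** (2 * (n_int + n_s - 1))

chain_k3 = initial_tensor_elements(d=2, n_int=4)          # 1d chain with k = 3
fig10 = initial_tensor_elements(d=2, n_int=4, n_s=1)      # graph of Fig. 10
z2_gauge = initial_tensor_elements(d=2 ** 3, n_int=3)     # three flavours, no Steiner point
```

These evaluate to $2^{6}=64$, $2^{8}=256$ and $2^{12}=4096$ elements, respectively.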

The connectivity of Fig. 10 allows for various types of interactions. It can express nearest neighbor interactions in positive and negative x- and y-directions, next-to-nearest neighbor interactions (diagonal), and further-neighbor interactions between sites separated by one site in one direction and two sites in the other. Moreover, three- and four-site interactions are possible. The most generic form of a spin model of this type has 12 parameters. Even such an involved model can be expressed with an initial tensor of moderate dimensions $(4\times 4\times 4\times 4)$ for $d=2$ . The explicit form of possible interactions for the graph in Fig. 10 is:

	$\displaystyle K_{\sigma_{x,y}\sigma_{x+1,y+1}\sigma_{x+1,y+2}\sigma_{x+2,y+1}}$
$\displaystyle=$	$\displaystyle e^{h(\sigma_{x,y}+\sigma_{x+1,y+1}+\sigma_{x+1,y+2}+\sigma_{x+2,y+1})}$
$\displaystyle\times$	$\displaystyle e^{J_{1}^{(x)}\sigma_{x+1,y+1}\sigma_{x+2,y+1}+J_{1}^{(y)}\sigma_{x+1,y+1}\sigma_{x+1,y+2}}$
$\displaystyle\times$	$\displaystyle e^{J_{2}^{(1)}\sigma_{x,y}\sigma_{x+1,y+1}+J_{2}^{(2)}\sigma_{x+1,y+2}\sigma_{x+2,y+1}}$
$\displaystyle\times$	$\displaystyle e^{g_{3}^{(1)}\sigma_{x,y}\sigma_{x+2,y+1}+g_{3}^{(2)}\sigma_{x,y}\sigma_{x+1,y+2}}$
$\displaystyle\times$	$\displaystyle e^{t_{8}\sigma_{x,y}\sigma_{x+2,y+1}\sigma_{x+1,y+2}}$
$\displaystyle\times$	$\displaystyle e^{t_{6}^{(1)}\sigma_{x,y}\sigma_{x+1,y+1}\sigma_{x+2,y+1}+t_{6}^{(2)}\sigma_{x,y}\sigma_{x+1,y+1}\sigma_{x+1,y+2}}$
$\displaystyle\times$	$\displaystyle e^{t_{4}\sigma_{x+1,y+1}\sigma_{x+2,y+1}\sigma_{x+1,y+2}}$
$\displaystyle\times$	$\displaystyle e^{f\sigma_{x,y}\sigma_{x+1,y+1}\sigma_{x+2,y+1}\sigma_{x+1,y+2}}.$	(34)

Multi-flavour systems

So far we only considered one-flavour systems, but the ideas generalize easily to multi-flavour systems. For example, the degrees of freedom of the $\mathbb{Z}_{2}$ model can be located on the links pointing from $\hat{r}$ to $\hat{r}+\hat{\mu}$ , where $\hat{r}$ is a coordinate and $\hat{\mu}\in\{\hat{x},\hat{y},\hat{z}\}$ is a unit vector in one of the three directions. This is indicated in Fig. 3. We can also localize each such gauge degree of freedom at the node with position $\hat{r}$ . Then, at each node, one degree of freedom arises for each of the three cases $\hat{\mu}=\hat{x}$ , $\hat{\mu}=\hat{y}$ , $\hat{\mu}=\hat{z}$ . The connectivity is then the same as for the $J_{1}-J_{2}$ model, but with three distinct flavours. There is no Steiner point, $n_{\mathrm{s}}=0$ , and each of the three degrees of freedom takes two values, such that the number of elements of the initial tensor is $\left(2^{3}\right)^{2(3-1)}=2^{12}=(4^{3})^{2}$ . The initial tensor can be written as a $(4\times 4\times 4\times 4\times 4\times 4)$ tensor, as shown in the main text in Sec. IV.

VI Conclusion

In this paper we introduced a simple construction of a tensor network representing a partition function. By inserting a delta function and redefining the tensors, we can construct a locally connected tensor network for any translationally invariant theory in any dimension. This network can then be coarse-grained with TRG methods to calculate the partition function and observables.

In a general case, a partition function can be represented by an initial tensor with ${d}^{2({n_{\mathrm{int}}}-1+{n_{\mathrm{s}}})}$ elements (see Sec. V). Here, ${d}{}$ is the dimension of the local degrees of freedom, and ${n_{\mathrm{int}}}$ is the number of indices of the original tensor, which did not form a locally connected tensor network. If disconnected regions exist in the interactions, ${n_{\mathrm{s}}}$ corresponds to the Steiner points Hanan (1966) that are needed to connect these regions.

We demonstrated the applicability of our method in a one-dimensional spin system with multiple interaction terms as a simple example. We extended the method to two dimensions and investigated the initial tensor dependence of the TRG method. The accuracy of these methods highly depends on the initial tensors and on the details of the TRG method. A high sensitivity was found in the original HOTRG. Our results suggest that one should use symmetric initial tensors for this method. We conclude that the initial tensor influences the numerical accuracy significantly depending on the TRG method, and should be chosen carefully for reliable calculations. We found that symmetric initial tensors lead to better results for many coarse-graining methods, and we calculated a symmetric representation for the two-dimensional Ising model based on our initial tensor construction.

Moreover, we showed that the initial tensor dependence can be eliminated by applying the ideas of the boundary TRG method to HOTRG. In general, any TRG method, such as ATRG and MDTRG, that makes use of isometries to form the new indices of the coarse grained tensors, has a strong initial tensor dependence. We showed, however, how these methods can be modified slightly to use squeezers instead of isometries, as introduced in the boundary TRG Iino et al. (2019). This way, the dependence on initial tensors and their symmetries can be removed, which makes the algorithms more resilient against systematic errors coming from an interplay between the choice of initial tensors and the coarse graining method.

The precision of TRG algorithms also depends on the type of index-exchange between coarse-graining steps. There are several possibilities to alternate between coarse-graining in $x$ - and $y$ -direction. We showed that systematic errors can accumulate with the wrong type of index exchange and discussed the optimal choice for different coarse-graining methods.

We further applied our tensor construction to the $\mathbb{Z}_{2}$ gauge theory in three dimensions without gauge-fixing. We neither need to consider any expansion nor do we have to integrate out original variables. The results for the free energy and the specific heat with our simple tensor construction were consistent with TRG calculations using expansions and gauge-fixing, and with Monte-Carlo simulations. For the $\mathbb{Z}_{2}$ gauge theory, our construction resulted in an accuracy of the free energy comparable to that of the usual construction by an expansion.

Summarizing, the initial tensor construction presented in this work is a way to translate the partition function to a locally connected tensor network. The approach is simple and can be applied to various systems, without relying on model-specific expansions. Moreover, we found a worrisome dependence of HOTRG-like methods (isometric ATRG, MDTRG, HOTRG) on the choice of initial tensors. Even if different choices are mathematically equivalent, the truncation procedures of the coarse graining steps introduce systematic errors. The previously mentioned methods should therefore only be used in their original form for symmetric initial tensors. However, we found that the methods can be made resilient against errors from the choice of initial tensors by using the ideas of the boundary TRG. With this, or by choosing alternative coarse graining algorithms like ATRG, our initial tensor construction leads to a similar accuracy as other construction methods, making it a simple and powerful tool for TRG calculations.

Acknowledgments

We would like to thank Shinji Takeda and Daisuke Kadoh for discussions. This work was supported by JSPS KAKENHI Grant Number 24K17059.

Appendix A J1-J2 Ising model

In this appendix, we discuss how our method can be used to construct the tensor network representation of the $J_{1}-J_{2}$ Ising model. The system on an $N\times N$ lattice can be described by the partition function

$\displaystyle Z=$	$\displaystyle\sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}e^{\frac{J_{1}}{2}(\sigma_{x,y}\sigma_{x+1,y}+\sigma_{x,y}\sigma_{x,y+1})}$
	$\displaystyle\times e^{\frac{J_{1}}{2}(\sigma_{x,y+1}\sigma_{x+1,y+1}+\sigma_{x+1,y}\sigma_{x+1,y+1})}$
	$\displaystyle\times e^{J_{2}(\sigma_{x,y}\sigma_{x+1,y+1}+\sigma_{x,y+1}\sigma_{x+1,y})}$	(35)
$\displaystyle=$	$\displaystyle\sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}K^{(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},\sigma_{x+1,y+1}},$	(36)

with the spin indices $\sigma_{x,y}$ at sites $\{x,y\}$ and coupling constants $J_{1}$ and $J_{2}$ . By setting the coupling $J_{1}<0$ and $J_{2}>0$ , frustrated systems can be studied in this model.

The representation through the tensor $K^{(J_{1}J_{2})}$ is not a two-dimensional locally connected tensor network. We can construct such a network by inserting delta functions. First, we split the next-nearest neighbor spin $\sigma_{x+1,y+1}$ from the tensor:

	$\displaystyle K^{(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},\sigma_{x+1,y+1}}$
	$\displaystyle=\sum_{a=\pm 1}K^{(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},a_{x+1,y}}\delta^{a_{x+1,y}}_{\sigma_{x+1,y+1}}.$	(37)

With this, we define a new tensor $K^{{}^{\prime}(J_{1}J_{2})}$ ,

	$\displaystyle K^{\prime(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},a_{x,y},a_{x+1,y}}$
	$\displaystyle\equiv K^{(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},a_{x+1,y}}\delta^{a_{x,y}}_{\sigma_{x,y+1}}.$	(38)

As a next step, we split the index $\sigma_{x,y+1}$ from the tensor:

	$\displaystyle K^{\prime(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},a_{x,y},a_{x+1,y}}$
	$\displaystyle=\sum_{b=\pm 1}K^{\prime(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},b_{x,y+1},a_{x,y},a_{x+1,y}}\delta^{b_{x,y+1}}_{\sigma_{x,y+1}},$	(39)

and define the new tensor $K^{{}^{\prime\prime}(J_{1}J_{2})}$ :

	$\displaystyle K^{\prime\prime(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},a_{x,y},a_{x+1,y},b_{x,y},b_{x,y+1}}$
	$\displaystyle\equiv K^{\prime(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},b_{x,y+1},a_{x,y},a_{x+1,y}}\delta^{b_{x,y}}_{\sigma_{x,y}}$	(40)
	$\displaystyle=K^{(J_{1}J_{2})}_{\sigma_{x,y},\sigma_{x+1,y},b_{x,y+1},a_{x+1,y}}\delta^{a_{x,y}}_{b_{x,y+1}}\delta^{b_{x,y}}_{\sigma_{x,y}}$	(41)
	$\displaystyle=e^{\frac{J_{1}}{2}(\sigma_{x,y}\sigma_{x+1,y}+\sigma_{x,y}b_{x,y+1})}$
	$\displaystyle\times e^{\frac{J_{1}}{2}(b_{x,y+1}a_{x+1,y}+\sigma_{x+1,y}a_{x+1,y})}$
	$\displaystyle\times e^{J_{2}(\sigma_{x,y}a_{x+1,y}+b_{x,y+1}\sigma_{x+1,y})}$
	$\displaystyle\times\delta^{a_{x,y}}_{b_{x,y+1}}\delta^{b_{x,y}}_{\sigma_{x,y}}.$	(42)

We combine the $\sigma$ and $a$ indices to form new bonds in $x$ -direction: at position $x,y$ , the new index is $[\sigma a]_{x,y}\equiv\sigma_{x,y}\otimes a_{x,y}=(\sigma_{x,y},a_{x,y})$ . Finally, the local tensor representation in terms of $K^{\prime\prime}$ is

	$\displaystyle Z=\sum_{[\sigma a]}\sum_{b=\pm 1}\prod_{x,y=1}^{N}K^{\prime\prime(J_{1}J_{2})}_{[\sigma a]_{x,y},[\sigma a]_{x+1,y},b_{x,y},b_{x,y+1}}.$	(43)

The indices of this representation are independent of each other, and this initial tensor can be used for TRG coarse-graining.
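The following sketch builds the tensor of Eq. (42) and compares the exactly contracted network on a small periodic lattice with a brute-force evaluation of Eq. (35). The $3\times 3$ lattice, the coupling values, the spin ordering $(+1,-1)$, and all function names are assumptions chosen for illustration.

```python
import itertools
import numpy as np

SPINS = (1, -1)  # assumed ordering of the spin values

def plaquette_weight(s00, s10, s01, s11, J1, J2):
    """K^(J1J2) of Eqs. (35) and (36)."""
    return np.exp(0.5 * J1 * (s00 * s10 + s00 * s01 + s01 * s11 + s10 * s11)
                  + J2 * (s00 * s11 + s01 * s10))

def delta_tensor(J1, J2):
    """K'' of Eq. (42) with bonds (left, right, down, up) of sizes (4, 4, 2, 2)."""
    # Axis order before reshaping: sigma, a, sigma_right, a_right, b, b_up.
    K2 = np.zeros((2,) * 6)
    for (i, s), (j, s_r), (k, a_r), (l, b_u) in itertools.product(enumerate(SPINS), repeat=4):
        # The delta functions of Eq. (42) force a = b_up and b = sigma.
        K2[i, l, j, k, i, l] = plaquette_weight(s, s_r, b_u, a_r, J1, J2)
    return K2.reshape(4, 4, 2, 2)

def torus_partition_function(T):
    """Exact contraction on a 3 x 3 torus: one row first, then the trace over rows."""
    n = T.shape[2] ** 3
    row = np.einsum("abip,bcjq,cakr->ijkpqr", T, T, T).reshape(n, n)
    return np.trace(np.linalg.matrix_power(row, 3))

def brute_force_partition_function(J1, J2, L=3):
    Z = 0.0
    for cfg in itertools.product(SPINS, repeat=L * L):
        s = np.array(cfg).reshape(L, L)
        w = 1.0
        for x, y in itertools.product(range(L), repeat=2):
            xp, yp = (x + 1) % L, (y + 1) % L
            w *= plaquette_weight(s[x, y], s[xp, y], s[x, yp], s[xp, yp], J1, J2)
        Z += w
    return Z

J1, J2 = -0.4, 0.3
Z_tn = torus_partition_function(delta_tensor(J1, J2))
Z_exact = brute_force_partition_function(J1, J2)
```

The exact row contraction is only feasible for very small lattices; for realistic sizes the same tensor serves as the input of a TRG coarse-graining.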

We note that other representations can also be starting points for our procedure, as long as the contraction of the initial tensor network corresponds to the same partition function. For example, we can substitute $K$ in Eq. 42 by $K^{(0)}$ ,

	$\displaystyle K^{(0)}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},\sigma_{x+1,y+1}}$
	$\displaystyle=e^{J_{1}(\sigma_{x,y}\sigma_{x+1,y}+\sigma_{x,y}\sigma_{x,y+1})}$
	$\displaystyle\times e^{J_{2}(\sigma_{x,y}\sigma_{x+1,y+1}+\sigma_{x,y+1}\sigma_{x+1,y})}.$	(44)

In any case, the tensor construction reproduces the original partition function if all indices of the network are contracted.

Our procedure results in an alternative representation of the partition function to those studied in Li and Yang (2021); Yoshiyama and Hukushima (2023). The authors of Yoshiyama and Hukushima (2023) state that physical quantities depend strongly on the choice of initial tensors and, for finite lattices, on the boundary conditions implemented by the tensor network representation. Additional candidates for initial tensors can therefore be helpful to find the most accurate representation for a given algorithm and system size.

Appendix B J1-J3 Ising model

As a kind of third-nearest neighbor Ising model, we discuss the $J_{1}-J_{3}$ Ising model, which is also called biaxial next-nearest neighbor Ising model. The partition function is

$\displaystyle Z=$	$\displaystyle\sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}e^{J_{1}(\sigma_{x,y}\sigma_{x+1,y}+\sigma_{x,y}\sigma_{x,y+1})}$
	$\displaystyle\times e^{J_{3}(\sigma_{x,y}\sigma_{x+2,y}+\sigma_{x,y}\sigma_{x,y+2})}$	(45)
$\displaystyle=$	$\displaystyle\sum_{\sigma=\pm 1}\prod_{x,y=1}^{N}K^{(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x+2,y},\sigma_{x,y+1},\sigma_{x,y+2}}.$	(46)

We split the next-next-nearest spins $\sigma_{x+2,y}$ and $\sigma_{x,y+2}$ from the tensor,

	$\displaystyle K^{(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x+2,y},\sigma_{x,y+1},\sigma_{x,y+2}}$
	$\displaystyle=\sum_{a,b=\pm 1}K^{(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},a_{x+1,y},\sigma_{x,y+1},b_{x,y+1}}$
	$\displaystyle\times\delta_{a_{x+1,y},\sigma_{x+2,y}}\delta_{b_{x,y+1},\sigma_{x,y+2}},$	(47)

and define the tensor $K^{{}^{\prime}(J_{1}J_{3})}$ with shifted delta functions:

	$\displaystyle K^{\prime(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},a_{x,y},a_{x+1,y},b_{x,y},b_{x,y+1}}$
	$\displaystyle=K^{(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},a_{x+1,y},\sigma_{x,y+1},b_{x,y+1}}$
	$\displaystyle\times\delta_{a_{x,y},\sigma_{x+1,y}}\delta_{b_{x,y},\sigma_{x,y+1}}.$	(48)

Similarly, we split $\sigma_{x,y+1}$ ,

	$\displaystyle K^{\prime(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},\sigma_{x,y+1},a_{x,y},a_{x+1,y},b_{x,y},b_{x,y+1}}$
	$\displaystyle=\sum_{c=\pm 1}K^{\prime(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},c_{x,y+1},a_{x,y},a_{x+1,y},b_{x,y},b_{x,y+1}}$
	$\displaystyle\times\delta_{c_{x,y+1},\sigma_{x,y+1}},$	(49)

and define the tensor $K^{{}^{\prime\prime}(J_{1}J_{3})}$ :

	$\displaystyle K^{\prime\prime(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},c_{x,y},c_{x,y+1},a_{x,y},a_{x+1,y},b_{x,y},b_{x,y+1}}$
	$\displaystyle=K^{\prime(J_{1}J_{3})}_{\sigma_{x,y},\sigma_{x+1,y},c_{x,y+1},a_{x,y},a_{x+1,y},b_{x,y},b_{x,y+1}}$
	$\displaystyle\times\delta_{c_{x,y},\sigma_{x,y}}.$	(50)

We define the new indices in $x$ -direction at position $x,y$ as $[\sigma a]_{x,y}=(\sigma_{x,y},a_{x,y})$ , and $[cb]_{x,y}=(c_{x,y},b_{x,y})$ for the $y$ -direction. We finally obtain the locally connected tensor network representation

	$\displaystyle Z=\sum_{[\sigma a],[cb]}\prod_{x,y=1}^{N}K^{\prime\prime(J_{1}J_{3})}_{[\sigma a]_{x,y}[\sigma a]_{x+1,y}[cb]_{x,y}[cb]_{x,y+1}}.$	(51)

Compared to the nearest-neighbor Ising model (see Eq. 12) and the $J_{1}-J_{2}$ Ising model (see Eq. 43), the $J_{1}-J_{3}$ Ising model is represented by an initial tensor with a larger bond dimension of the combined indices. This is a typical property of models with longer range interactions: they require a larger number of decompositions, and thus create additional new indices in the locally connected tensor network representation. When these indices are combined, the new bonds have a larger bond dimension. See Sec. V for a general discussion of the scaling behavior.
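The construction of Eqs. (47) to (51) can be checked in the same spirit. The sketch below assumes a $3\times 3$ periodic lattice, the spin ordering $(+1,-1)$, and hypothetical function names and coupling values. Both combined bonds have dimension 4, as stated above.

```python
import itertools
import numpy as np

SPINS = (1, -1)  # assumed ordering of the spin values

def weight(s, s_x1, s_x2, s_y1, s_y2, J1, J3):
    """K^(J1J3) of Eqs. (45) and (46)."""
    return np.exp(J1 * (s * s_x1 + s * s_y1) + J3 * (s * s_x2 + s * s_y2))

def delta_tensor(J1, J3):
    """K'' of Eq. (50) with bonds ([sigma a], [sigma a]_right, [cb], [cb]_up)."""
    # Axis order before reshaping: sigma, a, sigma_right, a_right, c, b, c_up, b_up.
    K2 = np.zeros((2,) * 8)
    for (i, s), (j, s_r), (k, a_r), (l, c_u), (m, b_u) in itertools.product(enumerate(SPINS), repeat=5):
        # Delta functions: a = sigma_right, c = sigma, b = c_up.
        K2[i, j, j, k, i, l, l, m] = weight(s, s_r, a_r, c_u, b_u, J1, J3)
    return K2.reshape(4, 4, 4, 4)

def torus_partition_function(T):
    """Exact contraction on a 3 x 3 torus: one row first, then the trace over rows."""
    n = T.shape[2] ** 3
    row = np.einsum("abip,bcjq,cakr->ijkpqr", T, T, T).reshape(n, n)
    return np.trace(np.linalg.matrix_power(row, 3))

def brute_force_partition_function(J1, J3, L=3):
    Z = 0.0
    for cfg in itertools.product(SPINS, repeat=L * L):
        s = np.array(cfg).reshape(L, L)
        w = 1.0
        for x, y in itertools.product(range(L), repeat=2):
            w *= weight(s[x, y], s[(x + 1) % L, y], s[(x + 2) % L, y],
                        s[x, (y + 1) % L], s[x, (y + 2) % L], J1, J3)
        Z += w
    return Z

J1, J3 = 0.5, -0.2
Z_tn = torus_partition_function(delta_tensor(J1, J3))
Z_exact = brute_force_partition_function(J1, J3)
```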

Appendix C Symmetry of the initial tensor

In order to investigate the initial tensor dependence of various TRG algorithms for the two-dimensional Ising model in Sec. II, we consider the symmetrized tensor $K^{\mathrm{(sym)}}$ as a variant of $K^{(\mathrm{delta})}$ . The partition function in Eq. 11 does not change if we redefine the initial tensor as

	$\displaystyle K^{\mathrm{(sym)}}_{XX^{\prime}YY^{\prime}}\equiv\sum_{kk^{\prime}ll^{\prime}}A_{Xk}A_{X^{\prime}k^{\prime}}K^{(\mathrm{delta})}_{kk^{\prime}ll^{\prime}}A_{lY}^{-1}A_{l^{\prime}Y^{\prime}}^{-1}.$	(52)

This reconstructed tensor can be made symmetric under swapping of any two indices, $K_{abcd}=K_{\{abcd\}}$ , if we choose $A$ appropriately.

Several methods are possible to find a suitable $A$ to make $K^{\mathrm{(sym)}}_{XX^{\prime}YY^{\prime}}$ a symmetric tensor. We apply a numerical optimization starting from a random matrix. This matrix is optimized element-wise to minimize the cost function

	$\displaystyle c^{\mathrm{(sym)}}\equiv\sum_{XX^{\prime}YY^{\prime}}\left|K^{\mathrm{(sym)}}_{XX^{\prime}YY^{\prime}}-K^{\mathrm{(sym)}}_{\{XX^{\prime}YY^{\prime}\}}\right|.$	(53)

In each optimization step we change a matrix element by a step size $\Delta$ , which ranges from $10^{0}$ down to $10^{-9}$ , and consider $A_{kl}^{\prime}=A_{kl}\pm\Delta$ . We accept the change if $A^{\prime}$ has a lower cost function than $A$ and reject it otherwise. We sweep several times through all matrix elements and decrease $\Delta$ if all $A^{\prime}$ get rejected. Because the optimization can get stuck in local minima, we repeat it with different randomly initialized matrices $A$ until we find 1000 matrices with $c^{\mathrm{(sym)}}\leq 2$ . The partition function is calculated with HOTRG and the results are shown in Fig. 11 for all outcomes of this optimization. We found a tensor $K^{\mathrm{(sym)}}_{XX^{\prime}YY^{\prime}}$ with a cost function $c^{\mathrm{(sym)}}$ smaller than $10^{-1}$ . The explicit form of $K^{\mathrm{(sym)}}$ for $h=0$ , $g=1$ is given in Eq. 54. Note that $K^{\mathrm{(sym)}}\neq K^{\mathrm{(exp)}}$ , although $K^{\mathrm{(exp)}}$ is also a symmetric tensor.
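A minimal sketch of this element-wise search is given below. Several details are assumptions and not taken from our production code: the cost compares the tensor with its average over all index permutations (one possible reading of Eq. 53), the step size is reduced by a factor of ten, the start matrix is a perturbed identity, and a random tensor with bond dimension 2 stands in for $K^{(\mathrm{delta})}$. All function names are hypothetical.

```python
import itertools
import numpy as np

def gauge_transform(K, A):
    """Eq. (52): A acts on the first two indices, A^{-1} on the last two."""
    A_inv = np.linalg.inv(A)
    return np.einsum("Xk,Zq,kqlm,lY,mW->XZYW", A, A, K, A_inv, A_inv)

def asymmetry_cost(K):
    """Assumed reading of Eq. (53): distance to the permutation-averaged tensor."""
    perms = list(itertools.permutations(range(4)))
    K_avg = sum(np.transpose(K, p) for p in perms) / len(perms)
    return np.abs(K - K_avg).sum()

def symmetrize(K, A0, delta=1.0, delta_min=1e-9, max_sweeps=100):
    A = A0.copy()
    cost = asymmetry_cost(gauge_transform(K, A))
    for _ in range(max_sweeps):
        accepted = False
        for idx in np.ndindex(A.shape):
            for sign in (1.0, -1.0):
                trial = A.copy()
                trial[idx] += sign * delta
                try:
                    trial_cost = asymmetry_cost(gauge_transform(K, trial))
                except np.linalg.LinAlgError:
                    continue  # singular trial matrix: reject
                if trial_cost < cost:  # accept only strict improvements
                    A, cost, accepted = trial, trial_cost, True
                    break
        if not accepted:
            delta /= 10.0  # all changes rejected: refine the step size
            if delta < delta_min:
                break
    return A, cost

rng = np.random.default_rng(0)
K_toy = rng.random((2, 2, 2, 2))                    # placeholder for K^(delta)
A_start = np.eye(2) + 0.1 * rng.standard_normal((2, 2))
initial_cost = asymmetry_cost(gauge_transform(K_toy, A_start))
A_opt, final_cost = symmetrize(K_toy, A_start)
```

By construction the cost can only decrease during the search; whether it reaches a small value depends on the start matrix, which is why the procedure is repeated with many random initializations.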

The free energy calculated with the symmetrized tensor $K^{\mathrm{(sym)}}$ is shown in Fig. 1 for HOTRG, and the accuracy is similar to a calculation with $K^{\mathrm{(\mathrm{exp})}}$ . This shows that a symmetrization of the initial tensor $K^{(\mathrm{delta})}$ can improve the results for symmetry-dependent TRG methods like HOTRG.

Furthermore, we study the $c^{\mathrm{(sym)}}$ dependence of the TRG and HOTRG methods in Fig. 11. The results clearly show that the HOTRG method becomes less precise and accurate when the initial tensors are less symmetric and $c^{\mathrm{(sym)}}$ is large. In contrast to this, TRG shows almost no dependence on the symmetry behavior of the initial tensors.

We list the explicit representation of the symmetrized initial tensor for the two-dimensional Ising model in the following. For each $\{\sigma,a\}$ we define the combined index $[\sigma a]$ . The indices are ordered as $[--]=0,[+-]=1,[-+]=2,[++]=3$ . Then, the symmetrized tensor $K^{\mathrm{(sym)}}_{[\sigma_{x,y}a_{x,y}][\sigma_{x+1,y}a_{x,y+1}]}$ is

$\displaystyle K^{\mathrm{(sym)}}_{00}=$	$\displaystyle 2.48037458878,$
$\displaystyle K^{\mathrm{(sym)}}_{01}=K^{\mathrm{(sym)}}_{02}=$	$\displaystyle 0.167834510235,$
$\displaystyle K^{\mathrm{(sym)}}_{10}=K^{\mathrm{(sym)}}_{20}=$	$\displaystyle 0.166746023749,$
$\displaystyle K^{\mathrm{(sym)}}_{11}=K^{\mathrm{(sym)}}_{12}=K^{\mathrm{(sym)}}_{21}=K^{\mathrm{(sym)}}_{22}=$	$\displaystyle 0.334196191574,$
$\displaystyle K^{\mathrm{(sym)}}_{13}=K^{\mathrm{(sym)}}_{23}=$	$\displaystyle 0.749091024240,$
$\displaystyle K^{\mathrm{(sym)}}_{31}=K^{\mathrm{(sym)}}_{32}=$	$\displaystyle 0.749047098416,$
$\displaystyle K^{\mathrm{(sym)}}_{03}=$	$\displaystyle 0.334224186621,$
$\displaystyle K^{\mathrm{(sym)}}_{30}=$	$\displaystyle 0.334168680654,$
$\displaystyle K^{\mathrm{(sym)}}_{33}=$	$\displaystyle 1.67966015282.$	(54)

Note that this initial tensor is not exactly symmetric: to achieve $c^{\mathrm{(sym)}}=0$ , the relation $K^{\mathrm{(sym)}}_{ab}=K^{\mathrm{(sym)}}_{ba}$ must hold for all index pairs $(a,b)$ . However, the remaining asymmetry is small enough for a reliable coarse graining with sufficient accuracy, as can be seen in Fig. 1.
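The size of the remaining asymmetry can be read off directly from the entries of Eq. 54. The short check below builds the $4\times 4$ matrix in the index ordering given above and evaluates the largest deviation $|K_{ab}-K_{ba}|$ . This is only an illustrative measure; it is not the cost function $c^{\mathrm{(sym)}}$ itself, and the variable names are assumptions.

```python
import numpy as np

# Entries of Eq. 54, index order [--]=0, [+-]=1, [-+]=2, [++]=3.
K_sym = np.array([
    [2.48037458878,  0.167834510235, 0.167834510235, 0.334224186621],
    [0.166746023749, 0.334196191574, 0.334196191574, 0.749091024240],
    [0.166746023749, 0.334196191574, 0.334196191574, 0.749091024240],
    [0.334168680654, 0.749047098416, 0.749047098416, 1.67966015282],
])

# Largest element-wise violation of K_ab = K_ba.
max_asym = np.abs(K_sym - K_sym.T).max()

# Asymmetry relative to the overall size of the tensor (Frobenius norms).
rel_asym = np.linalg.norm(K_sym - K_sym.T) / np.linalg.norm(K_sym)

print(f"max |K_ab - K_ba| = {max_asym:.3e}, relative asymmetry = {rel_asym:.3e}")
```

The largest deviation comes from the pair $K^{\mathrm{(sym)}}_{01}$ , $K^{\mathrm{(sym)}}_{10}$ and is of order $10^{-3}$ .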

Appendix D Boundary TRG method

The boundary TRG method was originally introduced for open boundary systems to take into account the boundary effect in the coarse graining step. In this appendix, we present a generalization of the original HOTRG method Xie et al. (2012) using the boundary TRG technique Iino et al. (2019), which removes the dependence on the symmetry properties of the initial tensors. The idea can be generalized to other tensor renormalization methods.

The difference between common TRG methods like HOTRG and boundary TRG is the truncation method in the coarse-graining step. In the original HOTRG, the isometries $U^{\mathrm{(HOTRG)}}$ and $V^{\mathrm{(HOTRG)}}$ , which minimize the cost function in Fig. 12, are both calculated. The isometries are found by truncated SVDs with singular values $\lambda^{(U)}$ and $\lambda^{(V)}$ . For example, for $\lambda^{(U)}$ :

	$\displaystyle\sum_{x_{1},x_{2},y,y^{t},y_{1}^{\prime},y_{2}}$	$\displaystyle K_{x_{1}y^{t}{x_{1}^{\prime}}^{t}y_{1}^{\prime}}^{*}K_{x_{2}y_{2}{x_{2}^{\prime}}^{t}y^{t}}^{*}K_{x_{1}yx_{1}^{\prime}y_{1}^{\prime}}K_{x_{2}y_{2}x_{2}^{\prime}y}$
	$\displaystyle\simeq\sum_{a,b}^{D}$	$\displaystyle U^{*(\mathrm{HOTRG})}_{a{x_{1}^{\prime}}^{t}{x_{2}^{\prime}}^{t}}\left(\lambda^{(U)}\right)^{2}_{ab}U^{(\mathrm{HOTRG})}_{bx_{1}^{\prime}x_{2}^{\prime}}.$		(55)

Here, $x_{i}$ ( $x^{\prime}_{i}$ ) are the indices that connect the tensor $K$ to its nearest neighbor to the left (right). Accordingly, $y_{i}$ ( $y^{\prime}_{i}$ ) connects to the next tensor below (above). Upper labels $t$ as in $x^{t}$ indicate that these bonds connect conjugate tensors. For brevity, we drop the indices in the following and use a shorthand notation like $K^{\dagger}K^{\dagger}KK\simeq U^{\mathrm{\dagger(HOTRG)}}\left(\lambda^{(U)}\right)^{2}U^{\mathrm{(HOTRG)}}$ .¹ The indices can be reconstructed from the corresponding diagrams.

¹ On the notation for the SVD used here: a Hermitian matrix $M$ can be written as $M=A^{\dagger}A$ . With the SVD $A=U\lambda V$ , we can decompose $M$ as $M=V^{\dagger}\lambda U^{\dagger}U\lambda V=V^{\dagger}\lambda^{2}V$ . In actual calculations, we decompose $M$ in an SVD as $M=U_{M}\lambda_{M}V_{M}$ and identify $V=V_{M}=U_{M}^{\dagger}$ , $\lambda^{2}=\lambda_{M}$ . We use the names $U$ and $V$ interchangeably for isometries. Typically, we label isometries as $U$ , and call them $V$ whenever they have to be distinguished from a given $U$ because they act on different indices of a tensor. Furthermore, we do not put any daggers $\dagger$ on tensors in SVDs. With this convention, isometries are always applied in the form $U^{\dagger}$ or $V^{\dagger}$ to the tensors when indices shall be combined and truncated.
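The identification $V=V_{M}=U_{M}^{\dagger}$ , $\lambda^{2}=\lambda_{M}$ used in this SVD convention can be checked numerically. The snippet below is a small sanity check with a random, well-conditioned matrix; the matrix size and the variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 4))
M = A.conj().T @ A                      # Hermitian and positive definite here

# NumPy returns M = U_M @ diag(lam_M) @ V_M, i.e. V_M already plays the role of V.
U_M, lam_M, V_M = np.linalg.svd(M)

# Singular values of A itself, for comparison with sqrt(lam_M).
lam_A = np.linalg.svd(A, compute_uv=False)

same_vectors = np.allclose(U_M, V_M.conj().T, atol=1e-8)   # V_M = U_M^dagger
same_values = np.allclose(lam_M, lam_A ** 2)               # lambda^2 = lambda_M
rebuilt = V_M.conj().T @ np.diag(lam_M) @ V_M              # V^dagger lambda^2 V
```

For a matrix $M$ with vanishing or degenerate singular values, the singular vectors are not unique and the element-wise comparison of $U_{M}$ and $V_{M}^{\dagger}$ can fail even though the decomposition is valid.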

In the cost function $C_{U}$ in Fig. 12, $U^{\dagger(\mathrm{HOTRG})}$ is applied to the right indices of the $K$ tensors. Instead, one can also apply an isometry to the left indices. The corresponding cost function $C_{V}$ is minimized by $V^{\dagger(\mathrm{HOTRG})}$ as an isometry. In general, the isometries $U^{(\mathrm{HOTRG})}$ and $V^{(\mathrm{HOTRG})}$ are different. In the usual HOTRG algorithm, the cost functions $C_{U}$ and $C_{V}$ are both computed by summing the squared discarded singular values $\left(\lambda_{>D}^{(U)}\right)^{2}$ and $\left(\lambda_{>D}^{(V)}\right)^{2}$ , respectively. Then, the isometry which corresponds to the smaller cost function is chosen for the truncation step. This introduces a systematic error, which favors one direction (left or right in Fig. 12) in the truncation. In the case of symmetric initial tensors, $C_{U}$ and $C_{V}$ are the same in each step and thus no choice is needed. Since no direction is favored in this case, the algorithm is better suited for symmetric initial tensors than for non-symmetric ones, in agreement with our numerical observations.
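The choice between the two isometries in the usual HOTRG step can be sketched as follows. This is a simplified dense illustration under stated assumptions: the tensor is stored as `K[x, y, x', y']`, two copies are stacked along the $y$ -direction, and the function name `hotrg_isometry_choice` is hypothetical. A production code would avoid building the six-index tensor explicitly.

```python
import numpy as np

def hotrg_isometry_choice(K, D):
    """Compute U (right indices) and V (left indices) and pick the cheaper one."""
    # M[x1, x2, y2, x1', x2', y1'] = sum_y K[x1, y, x1', y1'] K[x2, y2, x2', y]
    M = np.einsum('aybc,defy->adebfc', K, K)
    chi_l = K.shape[0] ** 2
    chi_r = K.shape[2] ** 2
    left = M.reshape(chi_l, -1)                               # rows: (x1, x2)
    right = M.transpose(3, 4, 0, 1, 2, 5).reshape(chi_r, -1)  # rows: (x1', x2')

    def isometry_and_cost(mat):
        # Eigenvalues of mat mat^dagger are the squared singular values of mat.
        w, vecs = np.linalg.eigh(mat @ mat.conj().T)
        order = np.argsort(w)[::-1]
        w, vecs = w[order], vecs[:, order]
        return vecs[:, :D], float(np.sum(w[D:]))              # keep D, sum the rest

    U, cost_U = isometry_and_cost(right)
    V, cost_V = isometry_and_cost(left)
    chosen = U if cost_U <= cost_V else V
    return chosen, cost_U, cost_V

# For a tensor that is symmetric under x <-> x', both costs agree.
K = np.random.default_rng(1).standard_normal((2, 2, 2, 2))
K_reflect = 0.5 * (K + K.transpose(2, 1, 0, 3))
iso, c_U, c_V = hotrg_isometry_choice(K_reflect, D=2)
```

The reflection symmetry $x\leftrightarrow x^{\prime}$ used in the example is an assumption chosen to make $C_{U}=C_{V}$ explicit; for a generic tensor the two costs differ and one direction is selected.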

In the boundary TRG method this choice is not made. Instead, squeezers are created from a combination of $U^{\mathrm{(HOTRG)}}$ and $V^{\mathrm{(HOTRG)}}$ . These squeezers are used for the truncation in the coarse graining step. The procedure minimizes the cost function in Fig. 13. First, the isometries are calculated without truncation as in Eq. 55 and similarly for $V^{\mathrm{(HOTRG)}}$ . Then, a truncated SVD is performed:

{\lambda^{(U)}}U^{\mathrm{(HOTRG)}}V^{\mathrm{(HOTRG)}}{\lambda^{(V)}}\simeq U\Lambda V.

(56)

The squeezers can be constructed from these tensors and the previous isometries:

	$\displaystyle P_{1}^{\mathrm{(bHOTRG)}}$	$\displaystyle\equiv$	$\displaystyle V^{\mathrm{(HOTRG)}}{\lambda^{(V)}}V^{\dagger}/\sqrt{\Lambda}$		(57)
	$\displaystyle P_{2}^{\mathrm{(bHOTRG)}}$	$\displaystyle\equiv$	$\displaystyle(1/\sqrt{\Lambda})U^{\dagger}{\lambda^{(U)}}U^{\mathrm{(HOTRG)}}.$		(58)

The total computational cost is of the same order as the original HOTRG, and the calculation of $P_{1}$ and $P_{2}$ is not the dominant cost in the renormalization step. The results of this boundary HOTRG method are much less dependent on the symmetry properties of the initial tensors as discussed in Sec. III. Therefore, the method creates more reliable results. In addition, the cost function of the boundary HOTRG in Fig. 13 approximates four tensors instead of two for the usual HOTRG as in Fig. 12. The approximation takes into account a larger region and can thus improve the accuracy of the approximation. We note that the bond-weighted TRG method for HOTRG is also based on the boundary TRG truncation Adachi et al. (2022).
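The construction of the squeezers in Eqs. 56 to 58 can be sketched with plain matrices. In this sketch, `R1` stands for $\lambda^{(U)}U^{\mathrm{(HOTRG)}}$ and `R2` for $V^{\mathrm{(HOTRG)}}\lambda^{(V)}$ , both flattened to two-index arrays; these names, the shapes, and the function name `squeezers` are assumptions for illustration.

```python
import numpy as np

def squeezers(R1, R2, D):
    """Build P1 and P2 from a truncated SVD of R1 @ R2 (illustrative sketch)."""
    u, s, vh = np.linalg.svd(R1 @ R2, full_matrices=False)   # R1 R2 = U Lambda V
    u, s, vh = u[:, :D], s[:D], vh[:D, :]                    # keep D singular values
    inv_sqrt = 1.0 / np.sqrt(s)                              # assumes s > 0
    P1 = (R2 @ vh.conj().T) * inv_sqrt                       # R2 V^dagger / sqrt(Lambda)
    P2 = (inv_sqrt[:, None] * u.conj().T) @ R1               # (1/sqrt(Lambda)) U^dagger R1
    return P1, P2

rng = np.random.default_rng(0)
R1 = rng.standard_normal((4, 4))
R2 = rng.standard_normal((4, 4))
P1, P2 = squeezers(R1, R2, D=2)

# Inserting P1 P2 on the bond between R1 and R2 gives the best rank-D approximation.
u, s, vh = np.linalg.svd(R1 @ R2)
best_rank_D = (u[:, :2] * s[:2]) @ vh[:2, :]
approx = R1 @ P1 @ P2 @ R2
```

Because the pair `P1`, `P2` is built from both factors, the truncation does not single out one of the two directions. Very small singular values in `s` should be discarded before the division to avoid numerical instabilities.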

The ideas presented here can generally be used in any TRG method with isometries. Replacing $U^{\mathrm{(HOTRG)}}\rightarrow P_{1}^{\mathrm{(bHOTRG)}},P_{2}^{\mathrm{(bHOTRG)}}$ does not require significant additional computational costs but can strongly reduce the initial tensor dependence.

Appendix E ATRG, MDTRG and variants

We explain the coarse graining steps with ATRG and MDTRG in this appendix. We also introduce variants of the established algorithms and benchmark the different methods for the two-dimensional Ising model.

The accuracy of the free energy depends on the method used in the coarse-graining step. Particularly, we observe that algorithms which use isometries to create the indices of the next coarse-grained tensors $K^{\mathrm{(next)}}$ are highly dependent on the initial tensor properties.

We start from the partition function $Z=\mathrm{tr}\left(\prod_{i}K_{x_{i}y_{i}x^{\prime}_{i}y^{\prime}_{i}}\right)$ , where $x_{i}$ ( $x^{\prime}_{i}$ ) are the indices that connect a lattice point at site $i$ to its nearest neighbor in negative (positive) $x$ -direction. Accordingly, $y_{i}$ ( $y^{\prime}_{i}$ ) connects to the next tensor in negative (positive) $y$ -direction. Note that $x_{i+1}=x^{\prime}_{i}$ and $y_{i+1}=y^{\prime}_{i}$ . The trace $\mathrm{tr}$ implies a summation over all indices. $K$ can, for example, be $K^{\mathrm{(delta)}}$ or $K^{\mathrm{(exp)}}$ as defined in Sec. III.

Tensor renormalization group algorithms provide a way to coarse-grain a given tensor network to a new network with fewer tensors. This step is approximate to avoid an exponential growth of the numerical costs, and the algorithms differ in the way they truncate the tensors. Typically, two tensors of an initial lattice are replaced by one tensor on a coarse-grained lattice. We restrict ourselves to square lattices in two dimensions but note that most algorithms discussed here can be generalized to higher dimensions. In short, the goal of a tensor renormalization group algorithm is to find the coarse-grained tensor $K^{\mathrm{(next)}}$ from the initial tensor $K$ ,

K_{xyx^{\prime}y^{\prime}}\rightarrow K^{\mathrm{(next)}}_{XYX^{\prime}Y^{\prime}}.

(59)

For ATRG and MDTRG, we consider two nearest neighbor tensors $\sum_{y}K_{x_{1}yx^{\prime}_{1}y^{\prime}_{1}}K_{x_{2}y_{2}x^{\prime}_{2}y}$ in the coarse-graining step. The tensors are first decomposed into triads, as shown in Fig. 14(a) to (b). For this, the initial tensors of the translationally invariant network are split using an SVD:

K_{xyx^{\prime}y^{\prime}}\simeq\sum_{b,c}^{D}H_{xyb}\lambda_{bc}E_{x^{\prime}y^{\prime}c}.

(60)

Here, $H$ and $E$ are truncated unitary matrices or isometries, and $\lambda$ is a diagonal matrix with non-negative entries. The smallest singular values are dropped in order not to exceed a maximum bond dimension $D$ in the algorithm. Note that we do not use internal line oversampling in this paper, so we truncate the singular values in intermediate steps to the bond dimension $D$ everywhere. We define the triad tensors

	$\displaystyle F_{xye}\equiv$	$\displaystyle\sum_{b}H_{xyb}\lambda_{be}$		(61)
	$\displaystyle G_{x^{\prime}y^{\prime}g}\equiv$	$\displaystyle\sum_{c}E_{x^{\prime}y^{\prime}c}\lambda_{cg}.$		(62)

The contraction of two neighboring tensors in the initial network can then be written as

\sum_{y}K_{x_{1}yx^{\prime}_{1}y^{\prime}_{1}}K_{x_{2}y_{2}x^{\prime}_{2}y}\simeq\sum_{y,e,g}E_{x^{\prime}_{1}y^{\prime}_{1}e}F_{x_{1}ye}G_{x^{\prime}_{2}yg}H_{x_{2}y_{2}g},

(63)

corresponding to Fig. 14(b).
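The triad decomposition of Eqs. 60 to 63 can be written in a few lines. The sketch below assumes that the tensor is stored as a dense array `K[x, y, x', y']` and uses a plain SVD; the function name `triads` is hypothetical. Without truncation, the product of the four triads reproduces the contraction of the two neighboring tensors exactly.

```python
import numpy as np

def triads(K, D):
    """Split K[x, y, x', y'] into the triads E, F, G, H of Eqs. 60-62."""
    cx, cy, cX, cY = K.shape
    u, s, vh = np.linalg.svd(K.reshape(cx * cy, cX * cY), full_matrices=False)
    u, s, vh = u[:, :D], s[:D], vh[:D, :]        # truncate to bond dimension D
    H = u.reshape(cx, cy, -1)                    # H[x, y, b]
    E = vh.T.reshape(cX, cY, -1)                 # E[x', y', c]
    F = H * s                                    # F = H lambda  (Eq. 61)
    G = E * s                                    # G = E lambda  (Eq. 62)
    return E, F, G, H

K = np.random.default_rng(0).standard_normal((2, 2, 2, 2))
E, F, G, H = triads(K, D=4)                      # D = 4 keeps all singular values

# Left-hand side of Eq. 63: sum_y K[x1, y, x1', y1'] K[x2, y2, x2', y]
lhs = np.einsum('aybc,defy->abcdef', K, K)
# Right-hand side of Eq. 63: sum_{y,e,g} E F G H
rhs = np.einsum('bcp,ayp,fyq,deq->abcdef', E, F, G, H)
```

Note that the full singular values are attached to $F$ and $G$ , so the upper tensor is represented as $FE$ and the lower tensor as $HG$ .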

E.1 ATRG and variants

In the ATRG method, an additional SVD is applied to swap the indices in $x$ -direction as shown in Fig. 14(c):

	$\displaystyle\sum_{y}F_{x_{1}ye}G_{x^{\prime}_{2}yg}\simeq$	$\displaystyle\sum_{f,h}^{D}\tilde{F^{\prime}}_{x^{\prime}_{2}fe}\lambda^{\prime}_{fh}\tilde{G^{\prime}}_{x_{1}hg}$		(64)
	$\displaystyle=$	$\displaystyle\sum_{f}F^{\prime}_{x^{\prime}_{2}fe}G^{\prime}_{x_{1}fg}.$		(65)

The singular values $\lambda^{\prime}$ are included in $F^{\prime}$ and $G^{\prime}$ with square root $\sqrt{\lambda^{\prime}}$ factors.
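The index swap of Eqs. 64 and 65 can be illustrated as follows. The sketch contracts $F$ and $G$ explicitly before the SVD, which is simple but more expensive than the procedure used in practice; the array layouts `F[x1, y, e]`, `G[x2', y, g]` and the function name `atrg_swap` are assumptions.

```python
import numpy as np

def atrg_swap(F, G, D):
    """Swap the x-indices between F and G with a truncated SVD (Eqs. 64-65)."""
    # T[x2', e, x1, g] = sum_y F[x1, y, e] G[x2', y, g]
    T = np.einsum('aye,fyg->feag', F, G)
    cf, ce, ca, cg = T.shape
    u, s, vh = np.linalg.svd(T.reshape(cf * ce, ca * cg), full_matrices=False)
    u, s, vh = u[:, :D], s[:D], vh[:D, :]
    root = np.sqrt(s)                            # sqrt(lambda') goes into both factors
    k = root.size
    F_new = (u * root).reshape(cf, ce, k).transpose(0, 2, 1)            # F'[x2', f, e]
    G_new = (root[:, None] * vh).reshape(k, ca, cg).transpose(1, 0, 2)  # G'[x1, f, g]
    return F_new, G_new

rng = np.random.default_rng(0)
F = rng.standard_normal((2, 2, 3))
G = rng.standard_normal((2, 2, 3))
F_new, G_new = atrg_swap(F, G, D=6)              # D = 6 keeps all singular values

before = np.einsum('aye,fyg->feag', F, G)
after = np.einsum('fke,akg->feag', F_new, G_new)
```

After the swap, $F^{\prime}$ carries the index $x^{\prime}_{2}$ and $G^{\prime}$ carries $x_{1}$ , while the new bond $f$ replaces the contracted index $y$ .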

Isometric ATRG.

In the isometric ATRG, two indices $x_{1}$ and $x_{2}$ are combined by applying an isometry $U^{\mathrm{(ATRG)}}$ . This tensor is obtained by an SVD of a combination of triads: $EF^{\prime}G^{\prime}HH^{\dagger}{G^{\prime}}^{\dagger}{F^{\prime}}^{\dagger}E^{\dagger}=U^{\mathrm{(ATRG)}}\left(\lambda^{\mathrm{(ATRG)}}\right)^{2}U^{\mathrm{\dagger(ATRG)}}$ .¹ This minimizes the cost function in Fig. 15.

We finally calculate the coarse-grained tensor $K^{\mathrm{(Iso,ATRG)}}$ , as shown in Fig. 16, $K^{\mathrm{(Iso,ATRG)}}=U^{\dagger}EF^{\prime}G^{\prime}HU$ .

ATRG without isometries, and shifted ATRG.

We discuss variants of the ATRG algorithm which do not rely on the applications of isometries as before. Instead, we use further contractions and SVDs. See Fig. 17 for a graphical representation of the individual steps.

First, we take the SVD of the tensor composition $EF^{\prime}G^{\prime}H$ from Fig. 14(b) as

	$\displaystyle\sum_{x^{\prime}_{1},x^{\prime}_{2},e,g}E_{x^{\prime}_{1}y^{\prime}e}F^{\prime}_{x^{\prime}_{2}fe}G^{\prime}_{x^{\prime}_{1}f^{\prime}g}H_{x^{\prime}_{2}yg}$
	$\displaystyle\simeq\sum_{X,X^{\prime}}\tilde{M}_{fy^{\prime}X^{\prime}}\lambda^{(LM)}_{X^{\prime}X}\tilde{L}_{f^{\prime}yX}=\sum_{X}{M}_{fy^{\prime}X}{L}_{f^{\prime}yX}$
	$\displaystyle=J_{fyf^{\prime}y^{\prime}}.$		(66)

We define the shifted ATRG, which takes these tensors $J$ as the coarse grained tensors:

K^{\mathrm{(sh,ATRG)}}_{XyX^{\prime}y^{\prime}}=J_{XyX^{\prime}y^{\prime}}.

(67)

Alternatively, another contraction defines the coarse grained tensor of ATRG without isometry,

K^{\mathrm{(ATRG)}}_{XyX^{\prime}y^{\prime}}=\sum_{f}M_{fy^{\prime}X^{\prime}}L_{fyX}.

(68)

The SVD which leads to $M$ and $L$ requires ${\mathcal{O}}(D^{6})$ operations if we do not apply a truncated SVD method. If we apply the ideas of the randomized SVD instead, the costs can be reduced to ${\mathcal{O}}(D^{5})$ . See Morita et al. (2018); Kadoh and Nakayama (2019); Nakayama (2023) for more details.

The method to create the coarse-grained tensors $K^{\mathrm{(ATRG)}}$ is equivalent to the original introduction of ATRG in Adachi et al. (2020). The original ATRG method can be understood as a replacement of the isometries $U^{(\mathrm{ATRG})}$ in the isometric ATRG as in Fig. 16 by squeezers. These originate from the truncated SVD in Eq. 66. Explicitly, the squeezers are:

	$\displaystyle P_{1}^{(\mathrm{ATRG})}=$	$\displaystyle[G^{\prime}H]\tilde{L}^{\dagger}/\sqrt{\lambda^{(LM)}}$		(69)
	$\displaystyle P_{2}^{(\mathrm{ATRG})}=$	$\displaystyle(1/\sqrt{\lambda^{(LM)}})\tilde{M}^{\dagger}[EF^{\prime}].$		(70)

The algorithms which create $K^{\mathrm{(ATRG)}}$ and $K^{\mathrm{(Iso,ATRG)}}$ differ in the regions that are approximated in the truncation step, and in the way the coarse-grained tensors are constructed. The shifted ATRG also creates a different approximation compared to $K^{\mathrm{(ATRG)}}$ . This method can, however, only be used to coarse-grain the indices in one direction. For example, the tensor $J$ would have additional indices for the $z$ -direction in three dimensions. $J_{Xyz_{1}z_{2}X^{\prime}y^{\prime}z_{1}^{\prime}z_{2}^{\prime}}$ has $D^{8}$ elements, and creating it directly is not possible within the leading costs of ${\mathcal{O}}(D^{7})$ for the ATRG methods in three dimensions. Shifted ATRG is thus only applicable for two-dimensional systems or in combination with other methods which coarse-grain the additional directions beforehand. The other two ATRG methods (isometric ATRG and ATRG) can be directly generalized to higher dimensions Adachi et al. (2020).

Shifted isometric ATRG.

Instead of using randomized techniques for the contraction or SVD of $J$ , we can approximate the contraction using the isometry $U^{\mathrm{(ATRG)}}$ that was introduced for the isometric ATRG: $J=EF^{\prime}G^{\prime}H\simeq EF^{\prime}U^{\mathrm{(ATRG)}}U^{\mathrm{\dagger(ATRG)}}G^{\prime}H$ . We call this method shifted isometric ATRG. It is shown in Fig. 18. Note that the isometry does not create the indices of the coarse-grained tensors directly, since all indices of $U^{\mathrm{(ATRG)}}$ are contracted. This approximation of $J$ may not be optimal, because the isometry is not calculated from the same subregion of the tensor network as $J$ itself: $U$ optimizes $(HG^{\prime})(F^{\prime}E)UU^{\dagger}$ , not $(F^{\prime}E)UU^{\dagger}(HG^{\prime})$ . We include this method in our benchmark, however, to test the accuracy of a method that uses isometries for the contractions.

E.2 MDTRG and variants

In the following we explain the MDTRG method and also introduce a variation of it. The method is similar to the TTRG Kadoh and Nakayama (2019) but with a different approximation in the contraction step. Compared to ATRG, the index swapping (from (b) to (c) in Fig. 14) is omitted and the tensors $EFGH$ are directly used instead of $EF^{\prime}G^{\prime}H$ .

For MDTRG, we calculate the isometry $U^{\mathrm{(MDTRG)}}$ from the tensors $EFGH$ . The cost function is shown in Fig. 19. Namely, we use the decomposition $EFGHH^{\dagger}G^{\dagger}F^{\dagger}E^{\dagger}=U^{\mathrm{(MDTRG)}}\left(\lambda^{\mathrm{(MDTRG)}}\right)^{2}U^{\mathrm{\dagger(MDTRG)}}$ . Note that the left-hand side of this equation is Hermitian, and thus the left- and right-singular vectors are equal on the right-hand side.¹ The coarse grained tensor is then obtained, as shown in Fig. 20, by a contraction with the isometries:

	$\displaystyle K^{\mathrm{(MDTRG)}}_{XyX^{\prime}y^{\prime}}=$	$\displaystyle\sum_{x_{1},x_{2},x^{\prime}_{1},x^{\prime}_{2},y,e,g}U_{x^{\prime}_{1}x^{\prime}_{2}X^{\prime}}^{*\mathrm{(MDTRG)}}U_{x_{1}x_{2}X}^{\mathrm{(MDTRG)}}$
		$\displaystyle\times E_{x^{\prime}_{1}y^{\prime}e}F_{x_{1}ye}G_{x^{\prime}_{2}yg}H_{x_{2}yg}.$		(71)

This contraction requires a truncated SVD method to reduce the costs, in two dimensions to ${\mathcal{O}}(D^{5})$ . We use the randomized SVD, as in Nakayama (2023).

Using this approximation, we obtain the triad representation of $K^{\mathrm{(MDTRG)}}$ from an SVD of $K^{\mathrm{(MDTRG)}}$ , in which the square roots of the singular values are absorbed into both factors,

K^{\mathrm{(MDTRG)}}_{AyA^{\prime}y^{\prime}}\simeq\sum_{n}N_{Ayn}O_{A^{\prime}y^{\prime}n}.

(72)

Replacing the isometries by squeezers and applying the ideas of the boundary TRG (see App. D) to MDTRG is straightforward. For this boundary MDTRG method, we calculate $U^{\mathrm{(MDTRG)}}$ and $V^{\mathrm{(MDTRG)}}$ by a randomized SVD with oversampling size $rD$ , and compute the squeezers $P_{1}^{\mathrm{(MDTRG)}}$ and $P_{2}^{\mathrm{(MDTRG)}}$ from this.

Furthermore, we define the shifted MDTRG as depicted in Fig. 21. In the previous MDTRG algorithm, the tensors $EFGH$ and the isometries were contracted to form the new coarse-grained tensors. Instead, the shifted MDTRG replaces this contraction by an approximate SVD, which can be applied efficiently to the tensor network. From this tensor decomposition we obtain truncated unitaries, which are combined with the square roots of the singular values to form new tensors $O$ and $N$ . Their contraction leads to the coarse-grained tensors:

K^{\mathrm{(sh,MDTRG)}}_{XyX^{\prime}y^{\prime}}\simeq\sum_{A}N_{AyX^{\prime}}O_{Ay^{\prime}X}.

(73)

Note that the index that was created in the SVD forms one of the indices of the coarse grained tensor. Figure 21 shows the contraction for shifted MDTRG as (e) to (f). The new indices of the shifted MDTRG are drawn as dotted purple lines; they come from the truncated SVD of Eq. 71.

E.3 Comparison of coarse-graining methods

We benchmark the different TRG algorithms for the two-dimensional critical Ising model. The results are summarized in table 1 and discussed in the main text. As mentioned there, we divide the algorithms into three classes. Algorithms denoted as iso in table 1 apply isometries to the tensors to create the coarse grained indices. The methods marked as iso $*$ use isometries as well, but only for intermediate contraction steps, and the final coarse-grained indices are not directly the truncated indices of the isometries. Finally, all other algorithms are marked as sqz.

When isometries are introduced in a tensor network to combine bonds and to compress the bond dimension, there is an ambiguity in choosing these tensors. They can be optimized for either direction of the bond that shall be compressed. An example can be seen in Fig. 12, where the isometries $U$ and $V$ minimize the error with respect to different contraction directions. Only one of the two is chosen in isometric algorithms for the coarse-graining, and this can lead to a significant decrease of the accuracy if the tensor is not symmetric. Otherwise, $U$ and $V$ are identical and the problem does not arise. The squeezers introduced in the boundary TRG algorithm and discussed in App. D take into account both isometries. Thus, these methods do not suffer from the errors introduced by omitting the other isometry.

Even though the original TRG algorithm uses an SVD as well for the coarse-graining, both isometries are used in this case. This makes it equivalent to the squeezer algorithms and we group it as sqz.

Our benchmark results for the ATRG and MDTRG methods are shown in Figs. 22, 23 and 24. For the truncated SVD, we use the randomized SVD with an oversampling parameter $r=4$ , such that the SVD is performed in an $rD$ dimensional subspace. We test all methods with two initial tensors, a symmetric tensor $K^{\mathrm{(exp)}}$ (see Eq. 15) and a non-symmetric one $K^{\mathrm{(delta)}}$ from our initial tensor construction (see Eq. 12).
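The role of the oversampling parameter $r$ can be illustrated with the basic randomized range finder of Halko et al. (2011). The sketch below works on an explicit dense matrix for clarity, whereas in a TRG code the matrix is applied to the random vectors through tensor contractions without ever being formed. The function name `randomized_svd` and its signature are assumptions, and this is not necessarily the exact variant used for the benchmarks.

```python
import numpy as np

def randomized_svd(A, D, r=4, seed=None):
    """Approximate rank-D SVD of A using an (r*D)-dimensional random subspace."""
    rng = np.random.default_rng(seed)
    k = min(r * D, min(A.shape))                 # size of the sampled subspace
    omega = rng.standard_normal((A.shape[1], k)) # random test matrix
    Q, _ = np.linalg.qr(A @ omega)               # orthonormal basis of the sampled range
    B = Q.conj().T @ A                           # small (k x n) projected matrix
    ub, s, vh = np.linalg.svd(B, full_matrices=False)
    u = Q @ ub                                   # lift left vectors back
    return u[:, :D], s[:D], vh[:D, :]

# Example with an exactly rank-3 matrix, where the result is exact up to rounding.
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 15))
u, s, vh = randomized_svd(A, D=3, r=4, seed=1)
```

A larger $r$ improves the quality of the sampled subspace for matrices with slowly decaying singular values, at the price of a larger intermediate SVD.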

Figure 22 shows that the ATRG not only produces more accurate results than the isometric ATRG for large bond dimensions. The ATRG (type sqz) also shows no dependence on the initial tensors, while the isometric ATRG (type iso) has a strong dependence and is much less accurate for the non-symmetric initial tensor.

Both shifted ATRG methods (shifted ATRG, type sqz and shifted isometric ATRG, type iso $*$ ) show only a very mild dependence on the initial tensors as can be seen in Fig. 23. The accuracy of the shifted ATRG is similar to that of the common ATRG. Combined with the technical advantages discussed in App. F, this makes the method a good candidate for the impurity tensor method to calculate observables.

For the MDTRG methods shown in Fig. 24, we find that the MDTRG (type iso) produces much less accurate results if a non-symmetric initial tensor is chosen. If the boundary TRG method is applied (type sqz), the results coincide with those of the usual MDTRG method with symmetric initial tensors. In contrast to the usual MDTRG, however, the boundary MDTRG obtains similar results for non-symmetric tensors as well. This shows again how the squeezers can make observables more resilient against the choice of initial tensors. The shifted MDTRG (type iso $*$ ) shows only a mild dependence on the initial tensors but has slightly larger errors than boundary MDTRG for large bond dimensions.

From our numerical calculations with the variants of the ATRG and MDTRG, TRG, and HOTRG, we find that the TRG methods with coarse-grained tensors $K^{\mathrm{(next)}}$ , whose indices are directly created from isometries, have large initial tensor dependencies. This dependence is eliminated if we apply the boundary TRG technique as discussed in App. D. We therefore recommend the truncation method with squeezers based on the boundary TRG method, which does not increase the numerical costs significantly but leads to more reliable results.

Appendix F Impurity tensor method for ATRG

Impurity tensors can be used to calculate physical observables with TRG methods. We give a brief introduction and overview and discuss the differences that arise for the ATRG and the shifted ATRG method. The latter was introduced in App. E. The impurity tensor method was first suggested in Gu et al. (2008). It is elsewhere discussed in much detail for TRG Nakamoto and Takeda (2016) and also for HOTRG (with isometries) Morita and Kawashima (2019).

In tensor renormalization group methods, the partition function $Z$ is represented by a translationally invariant repetition of a tensor $T_{abcd}(\beta)$ in a volume $V$ as

Z=\mathrm{tr}\prod_{i=1}^{V}T_{a_{i}b_{i}c_{i}d_{i}}(\beta).

(74)

We assume the tensor $T(\beta)$ is a function of a parameter $\beta$ , which could, for example, be the inverse temperature. Using the product rule and exploiting the translational invariance of the network, the derivative of $Z$ with respect to $\beta$ is

\frac{1}{V}\frac{\partial Z}{\partial\beta}=\mathrm{tr}\left(\frac{\partial T_{a_{1}b_{1}c_{1}d_{1}}}{\partial\beta}\right)\prod_{i=2}^{V}T_{a_{i}b_{i}c_{i}d_{i}}(\beta).

(75)

We call $\left(\frac{\partial T}{\partial\beta}\right)$ the impurity tensor.
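The identity in Eq. 75 can be verified in a setting where the full contraction is cheap. The sketch below uses the one-dimensional Ising chain without external field as an assumed stand-in, where the network is a product of $2\times 2$ transfer matrices instead of four-index tensors, and compares the impurity expression with a finite-difference derivative. All function names are hypothetical.

```python
import numpy as np

SPINS = np.array([1.0, -1.0])
BOND = np.outer(SPINS, SPINS)                    # s_i * s_{i+1} for all spin pairs

def transfer(beta):
    """Transfer matrix T(beta) of the 1D Ising chain (no external field)."""
    return np.exp(beta * BOND)

def dtransfer(beta):
    """Impurity tensor dT/dbeta."""
    return BOND * np.exp(beta * BOND)

def partition_function(beta, V):
    return np.trace(np.linalg.matrix_power(transfer(beta), V))

def dZ_impurity(beta, V):
    """dZ/dbeta = V * tr[(dT/dbeta) T^(V-1)], the analog of Eq. 75."""
    T = transfer(beta)
    return V * np.trace(dtransfer(beta) @ np.linalg.matrix_power(T, V - 1))

beta, V, h = 0.4, 8, 1e-6
dZ_exact = dZ_impurity(beta, V)
dZ_fd = (partition_function(beta + h, V) - partition_function(beta - h, V)) / (2 * h)
```

In a TRG calculation the power of $T$ is replaced by the coarse-grained network, and the single impurity has to be tracked through every coarse-graining step.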

In the impurity tensor method, we need to consider the propagation of the impurity tensor information in each coarse graining step. In order to keep the information of the impurity tensor at each step, we have to store sub tensor networks Nakamoto and Takeda (2016). For the simple TRG, we need to store four different tensors. We show how ATRG (Fig. 25) and its variation shifted ATRG (Fig. 26) can be used for the impurity tensor method.

With the original ATRG, the information of the initial impurity tensor propagates to eight different tensors in later coarse-graining steps, as is shown in Fig. 25. In contrast to this we only need to calculate and store two coarse-grained impurity tensors with the shifted ATRG, as is shown in Fig. 26. The difference arises from the contraction step in Fig. 17. There, the tensor network $EFGH$ contributes to three coarse-grained tensors $K^{\mathrm{(ATRG)}}$ in original ATRG (from Fig. 17(b) to (c)). In contrast to this, the tensors $EFGH$ only affect two coarse-grained tensors $K^{\mathrm{(sh,ATRG)}}$ for the shifted ATRG, see Fig. 17(b) to (d). We use the shifted ATRG method to calculate the free energy in the $\mathbb{Z}_{2}$ gauge theory (see Sec. IV) because of the lower memory footprint and computational costs.

Appendix G Index direction swapping

In a TRG coarse-graining step, two initial tensors $K$ are combined into a single new tensor $K^{\mathrm{(next)}}$ . This was explained in App. E for two tensors connected by a link in $y$ -direction. For a two-dimensional lattice, this step is followed by a similar coarse-graining in $x$ -direction and these directions are alternated. The same algorithm can be used if the indices of the initial tensors are permuted accordingly after each coarse-graining step. There are four different choices to exchange the $x$ and $y$ directions, which are also shown in Fig. 27:

$\displaystyle K_{xyx^{\prime}y^{\prime}}$	$\displaystyle\leftrightarrow K_{yxy^{\prime}x^{\prime}}$	$\displaystyle\mathrm{\ \ \ (x\leftrightarrow y),}$	(76)
$\displaystyle K_{xyx^{\prime}y^{\prime}}$	$\displaystyle\leftrightarrow K_{y^{\prime}x^{\prime}yx}$	$\displaystyle\mathrm{\ \ \ (x\leftrightarrow y^{\prime}),}$	(77)
$\displaystyle K_{xyx^{\prime}y^{\prime}}$	$\displaystyle\rightarrow K_{yx^{\prime}y^{\prime}x}$	$\displaystyle\mathrm{\ \ \ (\circlearrowleft),}$	(78)
$\displaystyle K_{xyx^{\prime}y^{\prime}}$	$\displaystyle\rightarrow K_{y^{\prime}xyx^{\prime}}$	$\displaystyle\mathrm{\ \ \ (\circlearrowright)}.$	(79)
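In an implementation, the four exchange types of Eqs. 76 to 79 are plain index permutations. The sketch below assumes that the tensor is stored as `K[x, y, x', y']` and that the subscript order on the right-hand side of each equation gives the storage order of the permuted tensor; primed indices are written as capital letters. The function name `exchange` and the labels of the four variants are hypothetical.

```python
import numpy as np

def exchange(K, kind):
    """Permute the indices of K[x, y, x', y'] according to Eqs. 76-79."""
    patterns = {
        'flip_xy':   'xyXY->yxYX',   # Eq. 76, x <-> y
        'flip_xY':   'xyXY->YXyx',   # Eq. 77, x <-> y'
        'rot_left':  'xyXY->yXYx',   # Eq. 78, rotation
        'rot_right': 'xyXY->YxyX',   # Eq. 79, opposite rotation
    }
    return np.einsum(patterns[kind], K)

K = np.random.default_rng(0).standard_normal((2, 3, 2, 3))

twice_flipped = exchange(exchange(K, 'flip_xy'), 'flip_xy')
left_then_right = exchange(exchange(K, 'rot_left'), 'rot_right')

four_rotations = K
for _ in range(4):
    four_rotations = exchange(four_rotations, 'rot_left')
```

The flips are their own inverse, while the two rotations are inverse to each other and return to the original tensor only after four applications. This difference is what makes the two families behave differently over successive coarse-graining steps.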

Method	$xy$ -swap dep.	Trunc.	$K$ dep.
ATRG Adachi et al. (2020)	$-$	sqz	$--$
Iso-ATRG Adachi et al. (2020)	$+$	iso	$++$
sh-ATRG	$++$	sqz	$-$
sh-Iso-ATRG	$++$	iso*	$-$
MDTRG Kadoh and Nakayama (2019)	$-$	iso	$++$
sh-MDTRG	$++$	iso*	$-$
b-MDTRG	$--$	sqz	$--$

Table 2: Properties of different ATRG and MDTRG methods. 2nd column: dependence on the type of exchange between the $x$ - and $y$ -directions between coarse-graining steps; $++$ / $+$ / $-$ / $--$ stands for very strong / noticeable / slight but not significant / nearly no dependence. 3rd column: truncation method; iso stands for isometries which are used to create the coarse-grained indices; iso* means that isometries are used for intermediate approximate contractions, but they do not create the new indices of the coarse-grained tensors directly; sqz denotes all other methods, so either the squeezers from boundary TRG Iino et al. (2019) (see main text and App. D), or a simple contraction and singular value decomposition. 4th column: dependence on the initial tensors; $--$ stands for no dependence, $-$ for a slight but not significant dependence, $++$ for strong dependence.

We test the dependence of the TRG variants on the type of $xy$ -exchange, see Fig. 28. We only show data for $x\leftrightarrow y$ and $\circlearrowleft$ because the results for $x\leftrightarrow y^{\prime}$ and $\circlearrowright$ coincide with them, respectively. We summarize our findings in table 2. The main observations from the numerical benchmarks are:

1. The shifted methods with a flip $x\leftrightarrow y$ do not converge to the correct results when the bond dimension is increased, and the errors remain large or even increase with the bond dimension (red and purple triangles in Fig. 28).
2. Non-shifted methods have a similar or better accuracy when a flip $x\leftrightarrow y$ is applied. The results are in particular better for the isometric ATRG (black and gray dots in Fig. 28(c)).
3. The boundary TRG methods do not significantly depend on the type of exchange, $x\leftrightarrow y$ or rotation $\circlearrowleft$ .
4. Overall, the different types of exchange (rotation $\circlearrowleft$ or flip $x\leftrightarrow y$ ) lead to different accuracies, depending on the details of the coarse-graining algorithm. Therefore, the exchange type should be chosen accordingly.

When implementing a TRG algorithm, one has to carefully keep track of the index order and conventions. For example, we identified $f$ in Eq. 66 as $X$ in Eq. 67, and $f^{\prime}$ as $X^{\prime}$ . If one instead sets $X^{\prime}=f$ and $X=f^{\prime}$ , this corresponds to an exchange between the $xy$ flip and a rotation. These conventions should be explicitly checked when comparing flips and rotations between algorithms and implementations.

The observations can be understood from the interplay of the last step in obtaining a coarse grained tensor, the exchange of indices, and the initial tensor decomposition in a TRG algorithm. For example, the coarse grained tensor $K^{\mathrm{(ATRG)}}$ in the ATRG algorithm is obtained by a contraction of two tensors $M$ and $L$ in Eq. 68 and Fig. 17 from (b) to (c). A flip $x\leftrightarrow y$ ( $x\leftrightarrow y^{\prime}$ ) exchanges two indices of $M$ and two indices of $L$ , but it does not move a single index from one tensor to the other. In the next coarse-graining iteration, the tensor $K^{\mathrm{(ATRG)}}$ is initially split into $E$ and $F$ , which are exactly $M$ and $L$ ( $L$ and $M$ ) respectively. Therefore, this SVD does not introduce a further truncation. This is not the case if a rotation of the indices is used. Similarly, the initial splitting of $K^{\mathrm{(sh,ATRG)}}$ into $E$ and $F$ for the shifted ATRG reconstructs the tensors $M$ and $L$ ( $L$ and $M$ ) respectively if the exchange type $\circlearrowleft$ ( $\circlearrowright$ ) is used. This can be seen from Eqs. 66 and 67 or Fig. 17(b) to (c). The same arguments hold for the MDTRG algorithms.

The optimal choice for the index exchange can also be understood if the triad representation is used everywhere instead of coarse-graining to a square lattice Kadoh and Nakayama (2019); Nakayama (2023); Morita et al. (2018). For example, the tensor $K^{\mathrm{(ATRG)}}$ does not need to be constructed explicitly as a contraction between $M$ and $L$ . Instead, these two tensors can be used in the next coarse graining step. In this formulation, the natural index exchange order is more apparent.

We benchmarked the two-dimensional Ising model at the critical temperature here. Since we find a significant dependence on the type of $xy$ -swapping for some of the methods, we suggest checking this behavior for other models as well to find the optimal choice. This is particularly important since we found specific cases for the flip-index exchange $x\leftrightarrow y$ with a systematic accumulation of errors, which led to a decreased accuracy when the bond dimension is increased. Similarly, the type of index permutation after each coarse-graining step can be important for other TRG methods and in higher dimensions, where the number of variants becomes even larger.

References

Levin and Nave (2007) M. Levin and C. P. Nave, Phys. Rev. Lett. 99, 120601 (2007), arXiv:cond-mat/0611687 [cond-mat.stat-mech] .
Nakayama et al. (2022) K. Nakayama, L. Funcke, K. Jansen, Y.-J. Kao, and S. Kühn, Phys. Rev. D 105, 054507 (2022), arXiv:2107.14220 [hep-lat] .
Liu et al. (2013) Y. Liu, Y. Meurice, M. Qin, J. Unmuth-Yockey, T. Xiang, Z. Xie, J. Yu, and H. Zou, Phys. Rev. D 88, 056005 (2013), arXiv:1307.6543 .
Kuramashi and Yoshimura (2019) Y. Kuramashi and Y. Yoshimura, JHEP 08, 023 (2019), arXiv:1808.08025 [hep-lat] .
Shimizu and Kuramashi (2014a) Y. Shimizu and Y. Kuramashi, Phys. Rev. D 90, 014508 (2014a), arXiv:1403.0642 .
Shimizu and Kuramashi (2014b) Y. Shimizu and Y. Kuramashi, Phys. Rev. D 90, 074503 (2014b), arXiv:1408.0897 .
Shimizu and Kuramashi (2018) Y. Shimizu and Y. Kuramashi, Phys. Rev. D 97, 034502 (2018), arXiv:1712.07808 .
Yu et al. (2014) J. F. Yu, Z. Y. Xie, Y. Meurice, Y. Liu, A. Denbleyker, H. Zou, M. P. Qin, and J. Chen, Phys. Rev. E 89, 013308 (2014), arXiv:1309.4963 [cond-mat.stat-mech] .
Zou et al. (2014) H. Zou, Y. Liu, C.-Y. Lai, J. Unmuth-Yockey, A. Bazavov, Z. Y. Xie, T. Xiang, S. Chandrasekharan, S. W. Tsai, and Y. Meurice, Phys. Rev. A 90, 063603 (2014), arXiv:1403.5238 [hep-lat] .
Yang et al. (2016) L.-P. Yang, Y. Liu, H. Zou, Z. Y. Xie, and Y. Meurice, Phys. Rev. E 93, 012138 (2016), arXiv:1507.01471 .
Takeda and Yoshimura (2015) S. Takeda and Y. Yoshimura, Progress of Theoretical and Experimental Physics 2015, 043B01 (2015), arXiv:1412.7855 .
Yoshimura et al. (2018) Y. Yoshimura, Y. Kuramashi, Y. Nakamura, S. Takeda, and R. Sakai, Phys. Rev. D 97, 054511 (2018), arXiv:1711.08121 .
Bazavov et al. (2019) A. Bazavov, S. Catterall, R. G. Jha, and J. Unmuth-Yockey, Phys. Rev. D 99, 114507 (2019), arXiv:1901.11443 [hep-lat] .
Kuramashi and Yoshimura (2020) Y. Kuramashi and Y. Yoshimura, JHEP 04, 089 (2020), arXiv:1911.06480 [hep-lat] .
Hirasawa et al. (2021) M. Hirasawa, A. Matsumoto, J. Nishimura, and A. Yosprakob, Journal of High Energy Physics 2021, 11 (2021), arXiv:2110.05800 .
Akiyama and Kadoh (2021) S. Akiyama and D. Kadoh, Journal of High Energy Physics 2021, 188 (2021), arXiv:2005.07570 .
Yosprakob et al. (2023) A. Yosprakob, J. Nishimura, and K. Okunishi, Journal of High Energy Physics 2023, 187 (2023), arXiv:2309.01422 .
Akiyama et al. (2024) S. Akiyama, R. G. Jha, and J. Unmuth-Yockey, (2024), arXiv:2406.10081 [hep-lat] .
Yosprakob and Okunishi (2024) A. Yosprakob and K. Okunishi, (2024), arXiv:2406.16763 .
Nagata (2022) K. Nagata, Progress in Particle and Nuclear Physics 127, 103991 (2022), arXiv:2108.12423 .
Halko et al. (2011) N. Halko, P. G. Martinsson, and J. A. Tropp, SIAM Review 53, 217 (2011), arXiv:0909.4061 .
Nakamura et al. (2019) Y. Nakamura, H. Oba, and S. Takeda, Phys. Rev. B 99, 155101 (2019), arXiv:1809.08030 [cond-mat.stat-mech] .
Morita et al. (2018) S. Morita, R. Igarashi, H.-H. Zhao, and N. Kawashima, Phys. Rev. E 97, 033310 (2018), arXiv:1712.01458 .
Okanohara (2014) D. Okanohara, “redsvd: RandomizED Singular Value Decomposition,” (2014).
Evenbly and Vidal (2015) G. Evenbly and G. Vidal, Phys. Rev. Lett. 115, 180405 (2015), arXiv:1412.0732 .
Jiang et al. (2008) H. C. Jiang, Z. Y. Weng, and T. Xiang, Phys. Rev. Lett. 101, 090603 (2008), arXiv:0806.3719 .
Lan and Evenbly (2019) W. Lan and G. Evenbly, Phys. Rev. B 100, 235118 (2019), arXiv:1906.09283 .
Xie et al. (2012) Z. Y. Xie, J. Chen, M. P. Qin, J. W. Zhu, L. P. Yang, and T. Xiang, Phys. Rev. B 86, 045139 (2012), arXiv:1201.1144 .
Adachi et al. (2020) D. Adachi, T. Okubo, and S. Todo, Phys. Rev. B 102, 054432 (2020), arXiv:1906.02007 .
Kadoh and Nakayama (2019) D. Kadoh and K. Nakayama, (2019), arXiv:1912.02414 .
Nakayama (2023) K. Nakayama, (2023), arXiv:2307.14191 .
Baumgartner and Wenger (2015) D. Baumgartner and U. Wenger, Nucl. Phys. B 894, 223 (2015), arXiv:1412.5393 [hep-lat] .
Marchis and Gattringer (2018) C. Marchis and C. Gattringer, Phys. Rev. D 97, 034508 (2018), arXiv:1712.07546 [hep-lat] .
Iino et al. (2019) S. Iino, S. Morita, and N. Kawashima, Phys. Rev. B 100, 035449 (2019), arXiv:1905.02351 [cond-mat.stat-mech] .
Zhao et al. (2010) H. H. Zhao, Z. Y. Xie, Q. N. Chen, Z. C. Wei, J. W. Cai, and T. Xiang, Phys. Rev. B 81, 174411 (2010), arXiv:1002.1405 .
Pini and Rettori (1993) M. G. Pini and A. Rettori, Phys. Rev. B 48, 3240 (1993).
Taherkhani et al. (2011) F. Taherkhani, E. Daryaei, H. Abroshan, H. Akbarzadeh, G. Parsafar, and A. Fortunelli, Phase Transitions 84, 77 (2011).
Guimaraes and Plascak (2002) P. R. C. Guimaraes and J. A. Plascak, Phys. Rev. B 66, 064413 (2002).
Jurcisinoca and Jurcisin (2014) E. Jurcisinoca and M. Jurcisin, Phys. Rev. E 90, 032108 (2014).
Karlova et al. (2018) K. Karlova, J. Strecka, and M. L. Lyra, Phys. Rev. B 97, 104407 (2018).
Kassan-Ogly et al. (2012) F. Kassan-Ogly, B. Filippov, A. Murtazaev, M. Ramazanov, and M. Badiev, Journal of Magnetism and Magnetic Materials 324, 3418 (2012).
Kwek et al. (2009) L. Kwek, Y. Takahashi, and K. Choo, Journal of Physics: Conference Series 143, 012014 (2009).
Niemeijer (1971) T. Niemeijer, Journal of Mathematical Physics 12, 1487 (1971).
Ozerov et al. (2010) M. Ozerov, A. A. Zvyagin, E. Čižmár, J. Wosnitza, R. Feyerherm, F. Xiao, C. P. Landee, and S. A. Zvyagin, Phys. Rev. B 82, 014416 (2010), arXiv:1007.2143 .
Raymond and Wong (2012) J. Raymond and K. M. Wong, Journal of Statistical Mechanics: Theory and Experiment 2012, P09007 (2012), arXiv:1206.4270 .
Sandvik (2010) A. W. Sandvik, Phys. Rev. Lett. 104, 137204 (2010), arXiv:1001.4428 .
Capriotti et al. (2003) L. Capriotti, F. Becca, S. Sorella, and A. Parola, Phys. Rev. B 67, 172404 (2003), arXiv:cond-mat/0304302 .
Wang et al. (2016) L. Wang, Z.-C. Gu, F. Verstraete, and X.-G. Wen, Phys. Rev. B 94, 075143 (2016), arXiv:1112.3331 .
Wang and Sandvik (2018) L. Wang and A. W. Sandvik, Phys. Rev. Lett. 121, 107202 (2018), arXiv:1702.08197 .
Richter et al. (2015) J. Richter, P. Müller, A. Lohmann, and H.-J. Schmidt, Physics Procedia 75, 813 (2015), arXiv:1609.06837 .
Sirker et al. (2006) J. Sirker, Z. Weihong, O. P. Sushkov, and J. Oitmaa, Phys. Rev. B 73, 184420 (2006), arXiv:cond-mat/0601183 [cond-mat.str-el] .
Yoshiyama and Hukushima (2023) K. Yoshiyama and K. Hukushima, Phys. Rev. E 108, 054124 (2023), arXiv:2303.07733 .
Li and Yang (2021) H. Li and L.-P. Yang, Phys. Rev. E 104, 024118 (2021), arXiv:2103.09464 .
Onsager (1944) L. Onsager, Phys. Rev. 65, 117 (1944).
Duminil-Copin (2022) H. Duminil-Copin, (2022), arXiv:2208.00864 [math.PR] .
Kaufman (1949) B. Kaufman, Phys. Rev. 76, 1232 (1949).
Butt et al. (2020) N. Butt, S. Catterall, Y. Meurice, R. Sakai, and J. Unmuth-Yockey, Phys. Rev. D 101, 094509 (2020), arXiv:1911.01285 .
Gribov (1978) V. Gribov, Nuclear Physics B 139, 1 (1978).
Svetitsky and Yaffe (1982) B. Svetitsky and L. G. Yaffe, Nuclear Physics B 210, 423 (1982).
Hanan (1966) M. Hanan, SIAM Journal on Applied Mathematics 14, 255 (1966).
Adachi et al. (2022) D. Adachi, T. Okubo, and S. Todo, Phys. Rev. B 105, L060402 (2022), arXiv:2011.01679 .
Gu et al. (2008) Z.-C. Gu, M. Levin, and X.-G. Wen, Phys. Rev. B 78, 205116 (2008), arXiv:0806.3509 .
Nakamoto and Takeda (2016) N. Nakamoto and S. Takeda, Sci. Rep. Kanazawa Univ. 60, 11 (2016).
Morita and Kawashima (2019) S. Morita and N. Kawashima, Comput. Phys. Commun. 236, 65 (2019), arXiv:1806.10275 [cond-mat.stat-mech] .
