Abstract
Obtaining accurate properties of many-body interacting quantum matter is a long-standing challenge in theoretical physics and chemistry, rooted in the complexity of the many-body wave-function. Classical representations of many-body states constitute a key tool for both analytical and numerical approaches to interacting quantum problems. Here, we introduce a technique to construct classical representations of many-body quantum systems based on artificial neural networks. Our constructions are based on the deep Boltzmann machine architecture, in which two layers of hidden neurons mediate quantum correlations. The approach reproduces the exact imaginary-time evolution for many-body lattice Hamiltonians, is completely deterministic, and yields networks with a polynomially scaling number of neurons. We provide examples where physical properties of spin Hamiltonians can be efficiently obtained. We also show how systematic improvements upon existing restricted Boltzmann machine ansätze can be obtained. Our method is an alternative to the standard path integral and opens new routes in representing quantum many-body states.
Introduction
A tremendous amount of successful developments in quantum physics builds upon the mapping between many-body quantum systems and effective classical theories. Probably the most well-known mapping is due to Feynman, who introduced an exact representation of many-body quantum systems in terms of statistical summations over classical particle trajectories1. Effective classical representations of quantum many-body systems are however not unique, and other approaches rely on different guiding principles, such as perturbative expansions2, or the decomposition of interactions with auxiliary degrees of freedom3,4. Classical representations of quantum states allow both for novel conceptual developments and for efficient numerical simulations. On one hand, perturbative approaches based on the graphical resummation of classes of diagrams are at the heart of many-body analytical approaches in various fields of research, ranging from particle to condensed-matter physics5. On the other hand, several non-perturbative numerical methods for many-body quantum systems are also based on these mappings. Quantum Monte Carlo (QMC) methods are among the most successful numerical techniques, relying on continuous-space polymer representations6,7,8,9, world-line lattice path integrals10,11, continuous-time algorithms12, and summation of perturbative diagrams13,14. Effective classical representations are also the building block of variational methods based on correlated many-body wave-functions15. Several successful variational techniques make extensive use of parametric representations of quantum states, where the effective parameters are determined by means of the variational principle16,17,18,19. In matrix-product and tensor-network states, the ground state is expressed as a classical network20,21. In general, finding alternative, efficient classical representations of quantum states can help establish novel numerical and analytical techniques to study challenging open issues.
Recently, an efficient variational representation of many-body systems in terms of artificial neural networks, which consist of classical degrees of freedom, has been introduced22. Numerical results have shown that artificial neural networks can represent many-body states with high accuracy22,23,24,25,26,27,28,29,30,31. The majority of the variational approaches adopted so far are based on shallow neural networks, called restricted Boltzmann machines (RBM), in which the physical degrees of freedom interact with an ensemble of hidden degrees of freedom (neurons). While shallow RBM states have promising features in terms of entanglement capacity25,32,33,34, only deep networks are guaranteed to provide a complete and efficient description of the most general quantum states35,36.
In this work, we introduce a constructive approach to explicitly generate deep network structures corresponding to exact quantum many-body ground states. We demonstrate this construction for interacting lattice spin models, including the transverse-field Ising and Heisenberg models. Our constructions are fully deterministic, in stark contrast to the shallow RBM case, in which numerical optimization of the network parameters is inevitable. The number of neurons required in the construction scales only polynomially with the system size; the present approach thus constitutes a new family of efficient quantum-to-classical mappings exhibiting prominent representational flexibility. Given as a simple set of iterative rules, these constructions can be used either as a self-standing tool or to systematically improve results obtained with variational shallow networks. The latter improves the efficiency of the method because numerically optimized shallow RBM states are already good approximations to the ground state. Finally, we discuss sampling strategies for the generated deep networks and show numerical results for one-dimensional spin models.
Results
General scheme of constructing deep neural states
The ground state of a generic Hamiltonian, \({\cal H}\), can be found through imaginary-time evolution, \(\left| {{\mathrm{\Psi }}(\tau )} \right\rangle = {\mathrm {e}}^{ - \tau {\cal H}}\left| {{\mathrm{\Psi }}_0} \right\rangle\), for a sufficiently large \(\tau \gg {\mathrm{\Delta }}E^{ - 1}\). Here ΔE is the energy gap between the ground and the first excited state, \(\left| {{\mathrm{\Psi }}_0} \right\rangle\) is an arbitrary initial state non-orthogonal to the exact ground state, and we work in units where ħ = 1. For a finite system, the energy gap is typically finite, and the total propagation time needed to reach the ground state within an arbitrary given accuracy is expected to grow at most polynomially with the system size (for systems becoming gapless in the thermodynamic limit).
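As a concrete illustration of this projection principle, the following minimal sketch propagates a hypothetical two-spin transverse-field Ising model in imaginary time with exact matrix exponentials; the couplings, the time grid, and the x-polarized initial state are assumptions chosen only for the example.

```python
import numpy as np
from scipy.linalg import expm

# Two-spin transverse-field Ising model, H = V s^z_1 s^z_2 - Gamma (s^x_1 + s^x_2),
# with hypothetical couplings chosen only for illustration.
sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])
I2 = np.eye(2)
V, Gamma = 1.0, 1.0
H = V * np.kron(sz, sz) - Gamma * (np.kron(sx, I2) + np.kron(I2, sx))

E0 = np.linalg.eigvalsh(H)[0]        # exact ground-state energy, for reference
psi = np.ones(4) / 2.0               # x-polarized initial state, non-orthogonal to the ground state

for tau in (0.5, 1.0, 2.0, 4.0, 8.0):
    phi = expm(-tau * H) @ psi       # imaginary-time propagation
    phi /= np.linalg.norm(phi)
    print(f"tau = {tau:4.1f}   E(tau) - E0 = {phi @ H @ phi - E0:.3e}")  # decays exponentially with tau
```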
Here, we introduce a representation of the wave-function coefficients in terms of a deep Boltzmann machine (DBM)37. For the sake of concreteness, let us consider the case of N spins, described by the quantum numbers \(\left| {\sigma ^z} \right\rangle = \left| {\sigma _1^z \ldots \sigma _N^z} \right\rangle\). Then, we represent generic many-body amplitudes \(\left\langle {\sigma _1^z \ldots \sigma _N^z{\mathrm{|\Psi }}} \right\rangle \equiv {\mathrm{\Psi }}\left( {\sigma ^z} \right)\) in the two-layer DBM form \({\mathrm{\Psi }}\left( {\sigma ^z} \right) = \mathop {\sum}\nolimits_{\{ h,d\} } {\mathrm{exp}}\left( {\mathop {\sum}\nolimits_i a_i\sigma _i^z + \mathop {\sum}\nolimits_j b_jh_j + \mathop {\sum}\nolimits_{ij} W_{ij}\sigma _i^zh_j + \mathop {\sum}\nolimits_k b_k^\prime d_k + \mathop {\sum}\nolimits_{jk} W_{jk}^\prime h_jd_k} \right)\) [Eq. (1)],
where we have introduced M hidden units h, M′ deep units d, and a set of couplings and bias terms \({\cal W} \equiv (a,b,b^\prime ,W,W^\prime )\). A sketch of the DBM architecture is shown in Fig. 1.
In the following, we specialize to the case of spin 1/2, thus all the units are taken to be \(\sigma^z, h, d = \pm 1\). This representation is the natural deep-network generalization of the shallow RBM, introduced as a variational ansatz in ref. 22. As for the RBM form, also in this case direct connections between variables in the same layer are not allowed. A crucial difference is however that the layer of deep variables makes, in general, the analytical evaluation of the wave-function amplitudes impossible. At variance with the RBM, the DBM form is known to be universal, as recently proven by Gao and Duan35.
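To make the role of the deep layer concrete, here is a minimal sketch (with hypothetical, randomly chosen parameters) that evaluates a DBM amplitude by tracing out the hidden layer analytically and summing the deep layer by brute force; it is the remaining sum over d that has no closed form in general.

```python
import numpy as np
from itertools import product

def dbm_amplitude(sigma, a, b, bp, W, Wp):
    """Psi(sigma) = sum_{h,d} exp(sigma.a + sigma.W.h + h.b + h.Wp.d + d.bp).
    The h-sum factorizes into cosh factors; the leftover d-sum is exponential in M'."""
    amp = 0.0 + 0.0j
    for d in product((-1, 1), repeat=len(bp)):
        d = np.array(d)
        theta = b + sigma @ W + Wp @ d                 # effective field on each hidden unit
        amp += np.exp(sigma @ a + d @ bp) * np.prod(2.0 * np.cosh(theta))
    return amp

rng = np.random.default_rng(0)
N, M, Mp = 4, 4, 3                                      # hypothetical tiny network
a, b, bp = (0.1 * rng.standard_normal(n) for n in (N, M, Mp))
W, Wp = 0.3 * rng.standard_normal((N, M)), 0.3 * rng.standard_normal((M, Mp))
print(dbm_amplitude(np.array([1, -1, 1, -1]), a, b, bp, W, Wp))
```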
Our key finding is that, thanks to its much more flexible representability, the DBM wave function can reproduce the Hamiltonian imaginary-time evolution exactly by changing its form dynamically, and that the parameters of the ground-state DBM network can be derived analytically. In order to find explicit expressions for the parameters \({\cal W}\) that represent \(\left| {{\mathrm{\Psi }}(\tau )} \right\rangle\) for arbitrary imaginary time, we start by considering a second-order Trotter–Suzuki decomposition10,38:
where we have decomposed the Hamiltonian into two non-commuting parts, \({\cal H} = {\cal H}_1 + {\cal H}_2\), and introduced the short-time propagators \({\cal G}_\nu \left( {\delta _\tau } \right) = {\mathrm {e}}^{ - {\cal H}_\nu \delta _\tau }\). The problem of finding an exact representation for \(\left| {{\mathrm{\Psi }}(\tau )} \right\rangle\) then reduces to finding a rule to construct the building blocks of the time evolution, namely representing the state after the two types of propagators by a DBM with new parameters \(\bar {\cal W}\): \(\left\langle {\sigma ^z} \right|{\cal G}_\nu \left( {\delta _\tau } \right)\left| {{\mathrm{\Psi }}_{\cal W}} \right\rangle = C\left\langle {\sigma ^z{\mathrm{|\Psi }}_{\bar {\cal W}}} \right\rangle\) [Eq. (3)].
In practice, this is achieved either by changing parameters \({\cal W}\), or by introducing additional parameters in \({\cal W}\), adding new neurons and creating new connections in the network.
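As a sanity check of the splitting (a sketch only; the exact ordering of the propagators in Eq. (2) may differ), one can verify the second-order accuracy numerically on a tiny two-spin TFIM:

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])
I2 = np.eye(2)
H1 = -(np.kron(sx, I2) + np.kron(I2, sx))   # off-diagonal (transverse-field) part
H2 = np.kron(sz, sz)                        # diagonal (interaction) part

tau = 1.0
for n in (10, 100, 1000):
    dt = tau / n
    G = expm(-0.5 * dt * H2) @ expm(-dt * H1) @ expm(-0.5 * dt * H2)   # symmetric splitting
    err = np.linalg.norm(np.linalg.matrix_power(G, n) - expm(-tau * (H1 + H2)))
    print(f"n = {n:5d}   splitting error = {err:.2e}")                  # shrinks ~ 1/n^2
```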
In the following, we show concrete examples for paradigmatic spin Hamiltonians, namely the transverse-field Ising and Heisenberg models. The rest of this section provides a general overview of how the DBM constructions are derived (how Eq. (3) is satisfied) for these models. The next section (Sampling strategies) discusses how they can be used in numerical schemes. A complete, in-depth derivation of the representations and algorithms can be found both in Methods and in the Supplementary Notes, as referred to at each step in this section. Furthermore, we provide computer codes to create the DBM network for each model as Supplementary Software 1–4.
Transverse-field Ising model (TFIM)
We start considering the TFIM on an arbitrary interaction graph. In this case, we decompose the Hamiltonian into two parts: \({\cal H}_1 = - \mathop {\sum}\nolimits_l {\kern 1pt} {\mathrm{\Gamma }}_l\sigma _l^x\), and \({\cal H}_2 = \mathop {\sum}\nolimits_{l < m} {\kern 1pt} V_{lm}\sigma _l^z\sigma _m^z\), where σ denote Pauli matrices, Γl (>0) are site-dependent transverse fields, and Vlm are arbitrary coupling constants.
In order to implement the mapping to a DBM, we first consider the action of the diagonal propagator \({\mathrm {e}}^{ - \delta _\tau V_{lm}\sigma _l^z\sigma _m^z}\), acting on a bond Vlm. In this case, the goal of finding an exact DBM representation can be rephrased as finding solutions to \(\left\langle {\sigma ^z} \right|{\mathrm {e}}^{ - \delta _\tau V_{lm}\sigma _l^z\sigma _m^z}\left| {{\mathrm{\Psi }}_{\cal W}} \right\rangle = C\left\langle {\sigma ^z{\mathrm{|\Psi }}_{\bar {\cal W}}} \right\rangle\) [Eq. (4)],
i.e. finding a set of new parameters \(\bar {\cal W}\) that exactly reproduces the imaginary-time evolution on the left-hand side. Here C is an arbitrary finite normalization constant. The diagonal propagator introduces an interaction between two visible, physical spins, which is not directly available in the DBM architecture. This interaction can be mediated by a new hidden unit in the first layer, h[lm], which is only connected to the visible spins on that bond, i.e. \(\bar W_{l[lm]}\) and \(\bar W_{m[lm]}\) are finite, but \(\bar W_{i[lm]} = 0\), ∀i ≠ l, m and \(\bar W_{j[lm]}^\prime = 0\), ∀j (see Fig. 2a).
Construction of exact DBM representations of the transverse-field Ising model. In this example, a step of imaginary-time evolution is shown, for the case of the one-dimensional transverse-field Ising model. Dots represent physical degrees of freedom \(\left( {\sigma _i^z} \right)\), squares represent hidden units (hj), triangles represent deep units (dk). In each panel, upper networks are the initial state with arbitrary network form, and the bottom networks are the final states, after application of the propagator. Intermediate steps illustrate how the network is modified, where the relevant modified couplings at each step are highlighted in black. The highlighted solid and dashed curves indicate new and vanishing couplings, respectively. a Shows the diagonal (interaction) propagator being applied to the highlighted blue spins. This introduces a hidden unit (green) connected only to the two physical spins. In (b) the off-diagonal (transverse-field) propagator is applied, acting on the blue physical spin. Here, we then add one deep unit (red triangle), and a hidden unit (green) mediating visible–deep interactions.
More concretely, the new wave function then has the form:
Equation (4) is then satisfied if
for all the possible values of \(\sigma _l^z\) and \(\sigma _m^z\). By means of a useful identity [Eq. (21) in Methods], the new parameters Wl[lm] and Wm[lm] are given by
In this way, the classical two-body interaction can, in general, be represented exactly within a shallow RBM structure.
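The check below illustrates this statement numerically: summing over a single added hidden unit with arccosh-type couplings (one possible choice, written here as an assumption rather than as the literal Eqs. (7) and (8)) reproduces \({\mathrm e}^{-\delta_\tau V_{lm}\sigma_l^z\sigma_m^z}\) up to a global constant C.

```python
import numpy as np

def hidden_unit_weights(A):
    """One possible pair of couplings (W_l, W_m) to a single new hidden unit
    such that sum_h exp[(W_l*s_l + W_m*s_m) h] is proportional to exp(A*s_l*s_m)."""
    W = 0.5 * np.arccosh(np.exp(2.0 * abs(A)))
    return W, np.sign(A) * W

dtau, V = 0.05, 1.3                  # hypothetical time step and coupling
A = -dtau * V                        # exponent of the diagonal propagator
Wl, Wm = hidden_unit_weights(A)

for sl in (+1, -1):
    for sm in (+1, -1):
        traced = sum(np.exp((Wl * sl + Wm * sm) * h) for h in (+1, -1))
        print(f"s_l={sl:+d} s_m={sm:+d}   ratio to exp(A s_l s_m) = {traced / np.exp(A * sl * sm):.6f}")
# The ratio is the same constant C for all four spin configurations.
```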
Next, to exactly represent the off-diagonal propagator \({\mathrm {e}}^{\delta _\tau {\mathrm{\Gamma }}_l\sigma _l^x}\left| {{\mathrm{\Psi }}_{\cal W}} \right\rangle\), we must solve \(\left\langle {\sigma ^z} \right|{\mathrm {e}}^{\delta _\tau {\mathrm{\Gamma }}_l\sigma _l^x}\left| {{\mathrm{\Psi }}_{\cal W}} \right\rangle = C\left\langle {\sigma ^z{\mathrm{|\Psi }}_{\bar {\cal W}}} \right\rangle\) [Eq. (9)]
for the new weights \(\bar {\cal W}\), and for an appropriate finite normalization constant C. In this case, one possible solution is obtained by adding one deep neuron d[l] and one hidden neuron h[l]. For d[l], we create new couplings \(W_{j[l]}^\prime\) to the existing hidden neurons hj which are connected to \(\sigma _l^z\). We simultaneously allow for changes in the existing parameters. Following the procedure given in Methods, after applying the off-diagonal propagator for site l, a solution of Eq. (9) is found by matching the hidden-unit interactions on the left- and right-hand sides of Eq. (9). Overall, the solution results in a three-step process (Fig. 2b): First, the hidden units attached to \(\sigma _l^z\) are connected to the newly introduced deep unit d[l] as
(see Eq. (35)). Second, all the hidden units previously connected to the spin \(\sigma _l^z\) lose their connection, i.e., \(\bar W_{lj} = 0,\forall j\). Third, the spin \(\sigma _l^z\) and the deep unit d[l] are connected to the new hidden unit, h[l], through the interactions Wl[l] and \(W_{[l][l]}^\prime\), respectively, as
Using the given expressions for the parameters \(\bar {\cal W}\) we can then exactly implement a single step of imaginary-time evolution. The full imaginary-time evolution is achieved by applying the above procedure for \({\cal H}_1\) and \({\cal H}_2\) alternately and repeatedly. Example applications of these rules, for both the diagonal and the off-diagonal propagators are shown in Fig. 2.
Approximate RBM from DBM for transverse Ising model
From the previous discussion, we have seen that the action of the off-diagonal propagator is responsible for the introduction of deep units in the network, thus breaking the shallow RBM structure. An interesting question is whether, in some limit, it is possible to stay within the RBM structure even for the off-diagonal propagator. The action of the off-diagonal propagator onto an RBM state can then be systematically expanded in powers of the weights:
In the case of small weights, we can then reproduce the off-diagonal propagator, to leading order in the weights, upon imposing a small change in the parameters Wlj → Wlj + ΔWlj and keeping an RBM structure. If we expand the new RBM with modified weights, we get
Comparing Eqs. (13) and (14), it follows that (apart from an irrelevant global normalization) the state after the off-diagonal propagator is still an RBM, with weights equal to:
and an error proportional to the square of the weights at that time step. In general, we expect this kind of approximate update to be accurate in perturbative regimes (for example in the limit of small Γl) or in the limit of short imaginary-time evolution. A similar approximation scheme has been derived in ref. 39. Numerical results for this approximation are discussed in a dedicated section before the Discussion.
Heisenberg model
We now consider the anti-ferromagnetic Heisenberg model (AFHM) on bipartite lattices. In one dimension, we decompose the Hamiltonian into odd and even bonds: \({\cal H}_1 = \mathop {\sum}\nolimits_{\langle l,m\rangle }^{{\mathrm{odd}}} {\kern 1pt} {\cal H}_{lm}^{{\mathrm{bond}}}\) and \({\cal H}_2 = \mathop {\sum}\nolimits_{\langle l,m\rangle }^{{\mathrm{even}}} {\cal H}_{lm}^{{\mathrm{bond}}}\), with \({\cal H}_{lm}^{{\mathrm{bond}}} = J\left( {\sigma _l^x\sigma _m^x + \sigma _l^y\sigma _m^y + \sigma _l^z\sigma _m^z} \right)\), where σ denote Pauli matrices. Because the bond Hamiltonian \({\cal H}_{lm}^{{\mathrm{bond}}}\) is a building block also in higher-dimensional models, the construction of an exact DBM representation of the ground states can be achieved by finding solutions for the bond propagator, \(\left\langle {\sigma ^z} \right|{\mathrm {e}}^{ - \delta _\tau {\cal H}_{lm}^{{\mathrm{bond}}}}\left| {{\mathrm{\Psi }}_{\cal W}} \right\rangle = C\left\langle {\sigma ^z{\mathrm{|}}\Psi _{\bar {\cal W}}} \right\rangle\), where the parameters \(\bar {\cal W}\) are such that the previous equation is satisfied for all the possible \(\left\langle {\sigma ^z} \right|\), and for an arbitrary finite normalization constant C. More explicitly, we need to satisfy
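the conditions obtained from the two-spin matrix elements of \({\mathrm {e}}^{ - \delta _\tau {\cal H}_{lm}^{{\mathrm{bond}}}}\). Written out for the isotropic coupling J (the precise form of Eq. (16) follows this structure), they read: for \(\sigma _l^z = \sigma _m^z\), \({\mathrm {e}}^{ - \delta _\tau J}{\mathrm{\Psi }}_{\cal W}\left( {\sigma ^z} \right) = C{\mathrm{\Psi }}_{\bar {\cal W}}\left( {\sigma ^z} \right)\); for \(\sigma _l^z \ne \sigma _m^z\), \({\mathrm {e}}^{\delta _\tau J}\left[ {{\mathrm{cosh}}\left( {2J\delta _\tau } \right){\mathrm{\Psi }}_{\cal W}\left( {\sigma ^z} \right) - {\mathrm{sinh}}\left( {2J\delta _\tau } \right){\mathrm{\Psi }}_{\cal W}\left( {\sigma _l^z \leftrightarrow \sigma _m^z} \right)} \right] = C{\mathrm{\Psi }}_{\bar {\cal W}}\left( {\sigma ^z} \right)\), where \({\mathrm{\Psi }}_{\cal W}\left( {\sigma _l^z \leftrightarrow \sigma _m^z} \right)\) denotes the amplitude with the values of the two spins exchanged. After the sublattice gauge transformation introduced in Methods, the sinh term enters with a plus sign.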
The basic strategy for finding a solution of Eq. (16) is similar to that for Eq. (9) in the transverse-field Ising model. Several possibilities arise when looking for solutions of the bond-propagator equation, Eq. (16). The existence of non-equivalent solutions prominently shows the non-uniqueness of the DBM structure representing the very same state and, at the same time, provides us flexibility in designing DBM architectures. Here, we show three concrete constructions. See Methods and Supplementary Note 2 for a detailed derivation of the DBM construction for the Heisenberg model, including anisotropic and bond-disordered coupling cases.
1 deep + 3 hidden variables construction for Heisenberg model
The first construction is dubbed '1 deep, 3 hidden' (1d–3h). It amounts to adding an extra deep neuron, d[lm], and three more hidden neurons to satisfy Eq. (16). A crucial difference with respect to the TFIM is that the introduced deep spin d[lm] has a constraint depending on the state of the spins on the bond, \(\sigma _l^z\) and \(\sigma _m^z\). Specifically, when \(\sigma _l^z = \sigma _m^z\) the deep spin is constrained to be \(d_{[lm]} = \sigma _l^z = \sigma _m^z\), whereas when \(\sigma _l^z \ne \sigma _m^z\), its value is unconstrained. From a pictorial point of view, the action of the bond propagator is a four-step process (see Fig. 3a). Starting from a given initial network (uppermost structures in Fig. 3), d[lm] is added and connected, through \(W_{j[lm]}^\prime\) given in Eq. (43), to the existing hidden units hj connected to \(\sigma _l^z\) and \(\sigma _m^z\). Second, spin \(\sigma _l^z\) is disconnected from all hidden units and reconnected to those hidden units the spin \(\sigma _m^z\) is attached to [see Eq. (42)]. Third, two new hidden units are introduced. One of the hidden units, h[lm1], mediates the interaction between \(\sigma _l^z\) and d[lm] [Eq. (47)], and the other hidden unit, h[lm2], mediates a direct spin–spin interaction between \(\sigma _l^z\) and \(\sigma _m^z\) [Eq. (49)]. Fourth, a further hidden unit connected to \(\sigma _l^z\), \(\sigma _m^z\), and d[lm] is inserted, in such a way that the constraint previously described is satisfied. For all but the last step, the DBM weights are real-valued. In the last step, instead, the constraint is enforced by introducing imaginary-valued interactions (dotted lines in Fig. 3), referred to as the 'iπ/6' trick: summing the last added hidden unit h[lm3] over ±1, \(\mathop {\sum}\nolimits_{h_{[lm3]} = \pm 1} {\kern 1pt} {\mathrm{exp}}\left[ {i\pi {\mathrm{/}}6\left( {\sigma _l^z + \sigma _m^z - d_{[lm]}} \right)h_{[lm3]}} \right]\), results in the sign-problem-free global term \({\mathrm{cos}}\left( {\pi {\mathrm{/}}6\left( {\sigma _l^z + \sigma _m^z - d_{[lm]}} \right)} \right)\). The constraint mentioned above is ensured by this cosine term.
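A minimal numerical check of this cosine factor (a sketch; only the iπ/6 couplings quoted above are taken from the construction) confirms both the constraint and the absence of negative weights:

```python
import numpy as np

# Trace over the last hidden unit h_[lm3] of the 1d-3h construction:
# sum_{h=+-1} exp[i pi/6 (s_l + s_m - d) h] = 2 cos(pi/6 (s_l + s_m - d)).
for sl in (+1, -1):
    for sm in (+1, -1):
        for d in (+1, -1):
            w = sum(np.exp(1j * np.pi / 6 * (sl + sm - d) * h) for h in (+1, -1))
            print(f"s_l={sl:+d} s_m={sm:+d} d={d:+d}   weight = {w.real:+.3f}")
# The weight vanishes whenever s_l = s_m but d differs from them, and equals
# +sqrt(3) (positive, hence sign-problem free) in all remaining configurations.
```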
Construction of exact DBM representations of Heisenberg models. In this example, a time step of imaginary-time evolution is shown, for the case of the one-dimensional antiferromagnetic Heisenberg model. Dots represent physical degrees of freedom \(\left( {\sigma _i^z} \right)\), squares represent hidden units (hj), triangles represent deep units (dk). The three panels (a–c) represent different possible explicit constructions. In each panel, upper networks are the initial state with arbitrary network form, and the bottom networks are the final states, after application of the propagator. Intermediate steps illustrate how the network is modified, where the relevant modified weights at each step are highlighted in black. In those diagrams, dashed lines indicate that the corresponding weights are set to zero, and dotted lines indicate complex-valued weights. The three panels correspond to the (a) '1 deep, 3 hidden' (1d–3h), (b) '2 deep, 6 hidden' (2d–6h), and (c) '2 deep, 4 hidden' (2d–4h) constructions (see text for a more detailed explanation of the individual steps characteristic of each construction).
2 deep + 6 hidden variables construction for Heisenberg model
The second construction is dubbed '2 deep, 6 hidden' (2d–6h), and is closer to the lattice path-integral formulation. In this representation, we introduce two auxiliary deep spins per bond, d[l] and d[m], with the constraint \(d_{[l]} + d_{[m]} = \sigma _l^z + \sigma _m^z\), and six hidden neurons. The action of the bond propagator is schematically illustrated in Fig. 3b: first, two deep units d[l] and d[m] are introduced, connecting, respectively, to the hidden units that spins \(\sigma _l^z\) and \(\sigma _m^z\) are attached to [see Eqs. (51) and (52)]. Second, all the connections between spins \(\sigma _l^z\), \(\sigma _m^z\), and hidden units hj are cut off [Eqs. (53) and (54)]. Third, four hidden units h[lm1], …, h[lm4] are introduced, to mediate interactions between the two deep units and the physical spins l, m [Eqs. (61) and (62)]. Finally, two hidden units h[lm5] and h[lm6] are introduced, connecting both to d[l], d[m] and \(\sigma _l^z,\sigma _m^z\) with imaginary-valued weights. The last step realizes the constraint \(d_{[l]} + d_{[m]} = \sigma _l^z + \sigma _m^z\), through the 'iπ/4, iπ/8' trick discussed in Methods and in the discussion of the 2d–6h representation in Supplementary Note 2.
In this representation, if the hidden neurons are traced out, the imaginary-time evolution becomes equivalent to that of the path-integral Monte Carlo method. More specifically, the number of deep neurons introduced at each time slice is exactly the same as the number of visible spins, and the deep neurons at each time slice can be regarded as additional classical spin degrees of freedom in the path-integral. Moreover, the constraint \(d_{[l]} + d_{[m]} = \sigma _l^z + \sigma _m^z\) ensures that the total magnetization is conserved at each time slice. Finally, the W and Wâ² interactions reproduce the matrix element of \({\mathrm{exp}}\left( { - \delta _\tau {\cal H}_{lm}^{{\mathrm{bond}}}} \right)\) between neighboring time slices. See Supplementary Note 2 for more detail on this point.
2 deep + 4 hidden variables construction for Heisenberg model
A further possible solution to Eq. (16) is dubbed the '2 deep, 4 hidden' (2d–4h) construction. In this case, we introduce two auxiliary deep variables d[l] and d[lm]. We also introduce four hidden units h[l], h[m], h[lm1], and h[lm2]. Before the imaginary-time evolution, \(e^{ - \delta _\tau {\cal H}_{lm}^{{\mathrm{bond}}}}\), the physical variables \(\sigma _n^z\) (n = l or m) are already coupled to each hidden variable hj with a coupling Wnj. After the time evolution \({\mathrm {e}}^{ - \delta _\tau {\cal H}_{lm}^{{\mathrm{bond}}}}\), as shown schematically in Fig. 3c, the coupling parameters are updated in the following way based on the old Wnj: First, the first deep unit d[l] becomes coupled to the already existing hidden variables hj through the coupling \(W_{j[l]}^\prime\) given in Eq. (67). The second deep unit d[lm] becomes similarly coupled to hj through a term Zlmj given in Eq. (67). Second, Wnj is updated to \(\bar W_{nj} = W_{nj} + {\mathrm{\Delta }}W_{nj}\) [see Eq. (66)]. Third, the newly introduced h[n] (n = l or m) gets coupled to d[l] through \(W_{[n][l]}^\prime\), and also to \(\sigma _n^z\) through Wn[n] [Eqs. (71) and (73)].
Finally, as clarified in Methods, we also need to satisfy the constraint \(d_{[l]}d_{[lm]} = \sigma _l^z\sigma _m^z\). Such a constraint is represented in DBM form as
which ensures \(d_{[l]}d_{[lm]} = \sigma _l^z\sigma _m^z\) after explicit summation over h[lm1] and h[lm2]. Finally, we remark that the three constructions presented here have different intrinsic network topologies. In particular, 2d–6h gives rise to a local topology (because of the equivalence with the path-integral construction), 1d–3h has a local structure in the first layer and a non-local one in the second, and 2d–4h is purely non-local in both layers.
Sampling strategies
With the network structures explicitly determined, we now focus on the problem of extracting meaningful physical quantities from them. To this end, it is convenient to decompose the DBM weight into two parts, such that \({\mathrm{\Psi }}_{\cal W}\left( {\sigma ^z} \right) = \mathop {\sum}\nolimits_{\{ h,d\} } {\kern 1pt} P_1\left( {\sigma ^z,h} \right)P_2\left( {h,d} \right)\) [Eq. (18)],
where \(P_1\left( {\sigma ^z,h} \right) = {\mathrm {e}}^{\sigma ^z \cdot a + \sigma ^z \cdot W \cdot h + h \cdot b}\), and \(P_2(h,d) = {\mathrm {e}}^{h \cdot W\prime \cdot d + d \cdot b\prime }\). The expectation value of an arbitrary (few-body) operator \({\cal O}\) can then be computed through the expression
where we have introduced the pseudo-probability density \({\mathrm{\Pi }}\left( {\sigma ^z,h,h\prime ,d,d\prime } \right) \equiv P_1\left( {\sigma ^z,h} \right)P_2\left( {h,d} \right)P_1^ \ast \left( {\sigma ^z,h\prime } \right)P_2^ \ast \left( {h\prime ,d\prime } \right)\), and the 'local' estimator
For the sampling over the Π distribution, a block Gibbs sampling analogous to what is performed in standard DBM architectures can be used37,40. Alternatively, it is possible to devise a set of Metropolis local updates sampling the exactly known marginals \({\tilde{\mathrm {\Pi}}}\left( {\sigma ^z,h,h\prime } \right) = \mathop {\sum}\nolimits_{\{ d,d\prime \} } {\kern 1pt} {\mathrm{{\Pi}}}\left( {\sigma ^z,h,h\prime ,d,d\prime } \right)\) or \({\tilde{\mathrm {\Pi}}}\prime \left( {\sigma ^z,d,d\prime } \right) = \mathop {\sum}\nolimits_{\{ h,h\prime \} } {\mathrm{{\Pi}}}\left( {\sigma ^z,h,h\prime ,d,d\prime } \right)\).
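As an illustration of the block Gibbs scheme, the sketch below alternates between the layers of a small DBM with hypothetical, real-valued couplings (so that Π is a genuine probability; the complex-coupling constructions above require the more careful treatments discussed next):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_pm1(theta):
    """Sample +-1 units with p(s=+1) proportional to exp(+theta) and p(s=-1) to exp(-theta)."""
    return np.where(rng.random(theta.shape) < 1.0 / (1.0 + np.exp(-2.0 * theta)), 1, -1)

# Hypothetical small DBM with real parameters (a, b, b', W, W').
N, M, Mp = 6, 8, 4
a, b, bp = 0.1 * rng.standard_normal(N), 0.1 * rng.standard_normal(M), 0.1 * rng.standard_normal(Mp)
W, Wp = 0.3 * rng.standard_normal((N, M)), 0.3 * rng.standard_normal((M, Mp))

sigma = rng.choice([-1, 1], size=N)
h, hp = rng.choice([-1, 1], size=M), rng.choice([-1, 1], size=M)
d, dp = rng.choice([-1, 1], size=Mp), rng.choice([-1, 1], size=Mp)

for sweep in range(1000):
    # Hidden layers: h couples to (sigma, d) in one replica, h' to (sigma, d') in the other.
    h, hp = sample_pm1(b + sigma @ W + Wp @ d), sample_pm1(b + sigma @ W + Wp @ dp)
    # Deep layers: each couples only to its own hidden layer.
    d, dp = sample_pm1(bp + h @ Wp), sample_pm1(bp + hp @ Wp)
    # Visible spins feel both hidden replicas, coming from P1 and its conjugate.
    sigma = sample_pm1(2.0 * a + W @ (h + hp))

print("final visible configuration:", sigma)
```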
In general, we have found that efficiently sampling the DBMs arising from the Heisenberg model constructions is typically more challenging than for the TFIM. This circumstance is a consequence of the imaginary couplings, which set constraints on the values of hidden/deep units. These constraints typically make local Metropolis updates inefficient. With the notable exception of the 2d–6h representation, for which loop updates can be readily implemented, we leave the problem of designing efficient Monte Carlo sampling for the other Heisenberg constructions open. The sampling strategies adopted in our numerical simulations are discussed in more detail in Supplementary Note 3.
Numerical results
We have implemented numerical algorithms to sample and obtain physical properties from the DBMs derived above. In Fig. 4a we show results for the one-dimensional TFIM. Specifically, we show the expectation value of the energy following the imaginary-time evolution starting from a fully polarized (in the x direction) initial state. The initial state corresponds to an empty network, where all the DBM parameters are set to zero. The DBM results closely match the exact imaginary-time evolution, thus verifying the correctness of our construction.
Imaginary-time evolution with a DBM for 1D spin models. a Expectation value of energy of the transverse-field Ising Hamiltonian in the exact imaginary-time evolution (continuous line) is compared to the stochastic result obtained with a DBM (filled circles) (δτ = 0.01). Empty circles correspond to the approximate RBM evolution scheme, Eq. (15). We consider the critical point (Γl = Vlm), periodic boundary conditions, and N = 20 sites. b Expectation value of the isotropic antiferromagnetic Heisenberg Hamiltonian (AFHM) in the exact imaginary-time evolution (continuous line) is compared to the stochastic result obtained with a DBM (δτ = 0.01) following the 2d–6h construction. We consider periodic boundary conditions, N = 16 sites. The subscript α in DBMα in panels (a, b) specifies a different initial state \(\left| {{\mathrm{\Psi }}_0} \right\rangle\): α = 1 means that the initial state is an RBM state with hidden-unit density M/N = 1, whereas when α = 0 the initial state is the empty-network state (M = 0). All energies are in units of the transverse field (Γl = 1) for the TFIM, and of the exchange (J = 1) for the AFHM.
In Fig. 4a we also show the corresponding imaginary-time evolution as obtained from the approximate RBM construction, Eq. (15). As expected, this approximation is very accurate at short times, and breaks down at later times.
Numerical results for the one-dimensional Heisenberg model are shown in Figs. 4b and 5a. Specifically, Fig. 4b shows the numerical check of the DBM (2d–6h construction) time evolution for the one-dimensional Heisenberg model with N = 16. As expected, the DBM results also in this case follow the exact time evolution. Figure 5a shows the dependence of the energy on the initial state, for the N = 80 case. Specifically, by taking a pre-optimized variational RBM as an initial state, we can significantly decrease the time τ needed to reach the ground state.
Approaching the exact ground-state energy. a Relative error on the ground-state energy for the 1D AFHM as a function of the imaginary time. Here we consider periodic boundary conditions, N = 80 sites, and δτ = 0.01, in units of the exchange J = 1. The subscript α in DBMα specifies a different initial state \(\left| {{\mathrm{\Psi }}_0} \right\rangle\): α = 1 means that the initial state is an RBM state with hidden-unit density M/N = 1, whereas when α = 0 the initial state is the empty-network state (M = 0). b Relative error on the ground-state energy for the two-dimensional J1 − J2 AFHM as a function of the imaginary time. As an energy unit, we consider J1 = 1, and take J2 = 0.0 and 0.4, periodic boundary conditions, N = 4 × 4 = 16 sites, and δτ = 0.001. Initial states are a pre-optimized pair-product (geminal) state \(\left| {\psi _{{\mathrm{PP}}}} \right\rangle\) supplemented by the Gutzwiller factor \(P_G^\infty = \mathop {\prod}\nolimits_l \left( {1 - n_{l \uparrow }n_{l \downarrow }} \right)\) prohibiting double occupancy and by quantum-number projection onto the singlet state \({\cal L}^{S = 0}\), i.e., \(\left| {{\mathrm{\Psi }}_0} \right\rangle = {\cal L}^{S = 0}P_G^\infty \left| {\psi _{{\mathrm{PP}}}} \right\rangle\). The PP states are given by \(\left| {\psi _{{\mathrm{PP}}}} \right\rangle = \left( {\mathop {\sum}\nolimits_{l,m = 1}^N {\kern 1pt} f_{lm}^{ \uparrow \downarrow }c_{l \uparrow }^\dagger c_{m \downarrow }^\dagger } \right)^{N/2}\left| 0 \right\rangle\), where \(f_{lm}^{ \uparrow \downarrow }\) are variational parameters and \(c_{l\sigma }^\dagger\) are the operators creating an electron with spin σ at the lth site.
Results for two-dimensional models are shown in Fig. 5b, both for the two-dimensional Heisenberg model, and for the frustrated J1 − J2 model, on a 4 × 4 lattice with periodic boundary conditions.
In the case of the TFIM, sampling from the DBM is realized through the Gibbs scheme previously sketched, in conjunction with a parallel tempering scheme, to improve ergodicity in the sampling.
For the AFHM and for the J1 − J2 model with the 2d–6h representation, we adopt the loop updates41 used in the path-integral QMC method, because the imaginary-time evolution in the 2d–6h representation has a direct correspondence to the path-integral formulation, allowing for an efficient handling of the constraint \(d_{[l]} + d_{[m]} = \sigma _l^z + \sigma _m^z\).
All the simulations carried out here are sign-problem free, with the notable exception of those for the two-dimensional J1 − J2 model. In this case, we start the imaginary-time evolution from a pre-optimized variational wave function, thus writing the fully evolved state as the product of a DBM and the initial state. Because of the quality of the initial guess, a moderate sign problem can be numerically afforded for short time evolutions, which in this case is enough to converge to the exact ground state (see Fig. 5b).
Discussion
We have shown how exact ground states of interacting spin Hamiltonians can be explicitly constructed using artificial neural networks comprising only two layers of hidden variables. In contrast to approaches based on one-layer RBMs, the constructions we have derived here do not require further variational optimization of the network parameters, and the exact representation of many-body ground states can be achieved with only polynomially many neurons. In the case of the Heisenberg model, all of the explicit algorithms presented here give rise to sign-problem-free representations, if the lattice is bipartite.
The DBM representation has an intrinsic conceptual value, as a quantum-to-classical mapping alternative to the path-integral representation. In the path-integral formalism, the addition of an extra dimension (the imaginary-time direction) is needed to exactly represent the quantum many-body state. In our case, the deep hidden layer of the DBM plays a role similar to that of the additional dimension in the path integral. As argued in Methods [see Eq. (28)], a single-layer RBM is indeed sufficient to exactly and efficiently describe the state of arbitrary classical spin systems. On the other hand, a second, deep layer is necessary for the efficient and exact construction of compact networks describing quantum mechanical states.
DBM-based schemes can be further used to systematically improve upon existing RBM variational results. More generally, the initial state for the present DBM scheme can be a generic variational state, or even a combination of RBMs and more conventional wave functions24,33. We have shown that, by starting the DBM construction from a pre-optimized variational state, a fast convergence to the exact ground state is observed. As shown in Fig. 5b, this kind of scheme opens the possibility of characterizing the ground state even in the case of non-bipartite lattices with frustration effects, exploiting the transient regime in which the sign problem can still be efficiently handled numerically, as for example discussed in ref. 42.
Methods
Useful identities
It is useful to introduce several identities, which can be used when more complicated interactions between the visible spins σz, hidden variables h and deep variables d, beyond the standard form Eq. (1), are needed. The first identity reads
with
for Ising variables s1 and s2, and a real interaction V. This is a gadget for decomposing two-body interactions, and can be proven by examining all the cases of s1 and s2.
By taking s1 and s2 as visible (physical) variables σz and s3 as a hidden variable h, the direct classical two-body interaction between physical variables [the leftmost part in Eq. (21)] is cut and instead mediated by the hidden neuron h. Furthermore, a direct interaction between σz and d can also be decomposed: in the following derivations of the DBM wave-function constructions, for convenience, we sometimes introduce a direct interaction between σz and d, which is not allowed in the DBM structure. However, by taking s1 as a visible spin σz, s2 as a deep variable d, and s3 as a hidden variable h in Eq. (21), one can eliminate the direct interaction between σz and d and decompose it into an interaction mediated only by h, at the cost of a summation over the hidden variable h. With this trick, one can recover the standard DBM form in Eq. (1).
Another identity (decomposition of four-body interaction) is
for Ising variables si with i = 1, …, 4. Although we have introduced complex couplings in the first line, each term in the summation in the second line of Eq. (25) is positive definite if V is real. The second line remains nonzero only if s1s2 = s3s7, which proves the identity. This identity with s1 and s2 as physical variables, s4, s5, and s6 as hidden variables, and s3 and s7 as deep variables, reads
Note that the right-hand side fits the DBM structure.
General three-body and two-body interactions can also be represented by the two-body form just by putting some of s1, …, s4 as constants in Eq. (25). These could be used instead of Eq. (21), although we employ Eq. (21) in the formalism below for the decoupling of the two-body interaction.
Finally, we discuss a gadget for decomposing general N-body classical interactions using complex bias terms bj in addition to the couplings W and W′, whereas the gadgets Eqs. (21) and (26) are represented only by W and W′ interactions. The gadget reads
with
This fact suggests that any classical partition function defined for Ising spins can be written exactly in terms of an RBM. Although the RBM has been shown to be powerful also in representing quantum states, there is no analytical way to map quantum states onto an RBM, and one must rely on numerical optimization to obtain the RBM parameters. In the present study, we show analytical mappings from quantum states to the DBM, which has an additional hidden layer. In statistical mechanics, it is known that D-dimensional quantum systems can be mapped onto (D + 1)-dimensional classical systems. Therefore, having an additional hidden layer in the neural-network language is equivalent to acquiring an additional dimension in statistical mechanics.
Transverse-field Ising model
The solution of Eq. (9) is found in the following way. The left-hand side of Eq. (9) can be rewritten by using the notation Eq. (18) as
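a combination of spin-diagonal and spin-flipped amplitudes (written out here explicitly, using the fact that \(\sigma _l^x\) flips the spin at site l): \({\mathrm{cosh}}\left( {\delta _\tau {\mathrm{\Gamma }}_l} \right){\mathrm{\Psi }}_{\cal W}\left( {\sigma _1^z, \ldots ,\sigma _l^z, \ldots ,\sigma _N^z} \right) + {\mathrm{sinh}}\left( {\delta _\tau {\mathrm{\Gamma }}_l} \right){\mathrm{\Psi }}_{\cal W}\left( {\sigma _1^z, \ldots , - \sigma _l^z, \ldots ,\sigma _N^z} \right)\), with each amplitude given by the sum over h and d of the product \(P_1P_2\) in Eq. (18).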
We look for a solution by adding one deep neuron d[l] and creating new couplings \(W_{j[l]}^\prime\) to the existing hidden neurons hj which are connected to \(\sigma _l^z\). We also allow for changes in the existing interaction parameters. In particular, we set the new couplings to be \(\bar W_{lj} = W_{lj} + {\mathrm{\Delta }}W_{lj}\) (with ΔWlj to be determined). Moreover, we introduce one hidden neuron h[l] coupled to \(\sigma _l^z\) and d[l] through the interactions Wl[l] and \(W_{[l][l]}^\prime\), respectively. If we trace out h[l], the hidden neuron h[l] mediates the interaction between \(\sigma _l^z\) and d[l] (denoted as \(W_{l[l]}^{\prime\prime}\)).
With this choice, we have (in the representation where h[l] is traced out):
The equations to be verified are obtained by considering the two possible values of \(\sigma _l^z = \pm 1\):
This equation has a solution stemming from the requirement that the hidden-unit interactions on the left- and right-hand sides match; thus we require
and
Notice that when Γl > 0, \(W_{l[l]}^{\prime\prime}\) is also real. By using Eq. (21) with the replacements \(s_1 \to \sigma _l^z\), s2 → d[l], s3 → h[l], \(V \to W_{l[l]}^{\prime\prime}\), \(\tilde V_1 \to W_{l[l]}\), and \(\tilde V_2 \to W_{[l][l]}^\prime\), the last condition determines the real couplings Wl[l] and \(W_{[l][l]}^\prime\) as Eqs. (11) and (12).
Heisenberg model
Here, we show the derivation for the general form of the bond Hamiltonian allowing for anisotropy and bond disorder: \({\cal H}_{lm}^{{\mathrm{bond}}} = J_{lm}^{xy}\left( {\sigma _l^x\sigma _m^x + \sigma _l^y\sigma _m^y} \right) + J_{lm}^z\sigma _l^z\sigma _m^z\). In the case of a bipartite lattice and antiferromagnetic exchange \(J_{lm}^z,J_{lm}^{xy} > 0\), we further apply a local gauge transformation, a π rotation around the z-axis in spin space, σx → −σx and σy → −σy, on one of the sublattices, which gives a − sign to the \(\sigma _l^x\sigma _m^x\) and \(\sigma _l^y\sigma _m^y\) interactions. This transformation is equivalent to taking \(J_{lm}^{xy} \to - J_{lm}^{xy}\).
The gauge transformation makes it possible to design a DBM neural network with real couplings {W, W′}, except for those used to put a 'constraint' on the values of the deep neuron spins (see the following sections for more detail on the constraint). It ensures that the DBM algorithm has no negative sign problem.
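A minimal two-spin check of this statement (a sketch; the δτ value is an arbitrary assumption) shows that the single negative matrix element of the bond propagator disappears once \(\sigma^x,\sigma^y \to -\sigma^x,-\sigma^y\) is applied on one site, i.e. once \(J^{xy} \to -J^{xy}\):

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
J, dtau = 1.0, 0.05                     # hypothetical antiferromagnetic bond and time step

def bond_propagator(jxy, jz):
    Hbond = jxy * (np.kron(sx, sx) + np.kron(sy, sy)) + jz * np.kron(sz, sz)
    return expm(-dtau * Hbond).real

print(bond_propagator(+J, J).min())     # negative spin-exchange element: potential sign problem
print(bond_propagator(-J, J).min())     # all matrix elements non-negative after the gauge
```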
In the case of the antiferromagnetic Heisenberg model after the gauge transformation on the bipartite lattice, we must solve, for each bond,
It is also useful to explicitly write the expression for the exchange term in the second line above:
In the following derivations, for the antiferromagnetic Hamiltonian \(\left( {J_{lm}^z,J_{lm}^{xy} \,> \,0} \right)\) after the gauge transformation, we look for a solution with zero bias terms (\(a_i,b_j,b_k^\prime = 0\), ∀i, j, k). We can also derive a sign-problem-free solution for the imaginary-time evolution in the absence of the explicit gauge transformation by introducing complex bias terms ai. Indeed, in the '2 deep, 4 hidden' representation, we will explicitly show that taking a specific set of complex bias terms ai on the physical spins is equivalent to the gauge transformation, making the solution free from the sign problem.
In a way similar to the TFIM, solutions of Eq. (39) can be found by specifying the structure of the DBM; the three examples are the following.
1d–3h construction for Heisenberg model
We assume the structure of the updated wave function (corresponding to Eq. (32) for the TFIM) to be
Similarly to the case of the TFIM, a solution of Eq. (39) is given by
and
Notice that the first condition is equivalent to cutting all connections from spin l to the hidden units and attaching the spin l to all the hidden units connected to spin m, with an interaction Wmj.
Although the terms proportional to \(W_{l[lm]}^{\prime\prime}\) and Vlm do not satisfy the standard DBM form, they can be transformed to the DBM form by introducing new hidden neurons h[lm1] and h[lm2] [see the gadget Eq. (21)]:
with
Similarly, the coupling V[lm] is decomposed as
with
Finally, as discussed in the main text, the constraint \(d_{[lm]} = \sigma _l^z\) when \(\sigma _l^z = \sigma _m^z\) can be satisfied by adding a third hidden neuron, h[lm3], introducing purely imaginary iπ/6 couplings.
2d–6h construction for Heisenberg model
In this case, the form of the new wave function reads
A solution of Eq. (39) is given by
and
The direct interactions between \(\left( {\sigma _l^z,d_{[l]}} \right)\), \(\left( {\sigma _m^z,d_{[m]}} \right)\), \(\left( {\sigma _l^z,d_{[m]}} \right)\), and \(\left( {\sigma _m^z,d_{[l]}} \right)\), are mediated by h[lm1], h[lm2], h[lm3], and h[lm4], respectively, as follows:
By applying the gadget Eq. (21), the new W and W′ interactions are given by, for small δτ (such that \({\textstyle{{{\mathrm {e}}^{ - J_{lm}^z\delta _\tau }} \over {\sqrt {{\mathrm{sinh}}\left( {2J_{lm}^{xy}\delta _\tau } \right)} }}} > 1\)),
and
Finally, the constraint \(d_{[l]} + d_{[m]} = \sigma _l^z + \sigma _m^z\) can be imposed by additionally introducing two hidden neurons h[lm5] and h[lm6], with complex couplings
This term gives interactions among d[l], d[m], \(\sigma _l^z\) and \(\sigma _m^z\): \(4\,{\mathrm{cos}}\left( {{\textstyle{\pi \over 4}}\left( {\sigma _l^z + \sigma _m^z - d_{[l]} - d_{[m]}} \right)} \right)\) \({\mathrm{cos}}\left( {{\textstyle{\pi \over 8}}\left( {\sigma _l^z + \sigma _m^z - d_{[l]} - d_{[m]}} \right)} \right)\), which realize the constraint.
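A quick check of this product of cosines (only the iπ/4 and iπ/8 couplings quoted above enter) confirms that it vanishes unless the constraint is met:

```python
import numpy as np

# x = s_l + s_m - d_[l] - d_[m] can only take the values 0, +-2, +-4.
for x in (-4, -2, 0, 2, 4):
    factor = 4 * np.cos(np.pi / 4 * x) * np.cos(np.pi / 8 * x)
    print(f"x = {x:+d}   factor = {factor:+.3f}")
# The factor is nonzero (and positive) only for x = 0, i.e. d_[l] + d_[m] = s_l + s_m.
```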
2d–4h construction for Heisenberg model
For this construction, we assume the following structure for the wave-function after the propagator:
In this case, we also look for a solution for the bond operator without the gauge transformation. This shows that the introduction of a complex bias term ai can play the same role as the gauge transformation. Then, we need to solve
Note that the sign of the \({\mathrm{\Psi }}_{\cal W}\left( {\sigma _l^z \leftrightarrow \sigma _m^z} \right){\mathrm{sinh}}\left( {2J_{lm}^{xy}\delta _\tau } \right)\) term is different from that in Eq. (39).
A solution of Eq. (65) is obtained as
where Wnj (n = l, m) is updated to \(\bar W_{nj}\) with the increment ΔWnj as \(\bar W_{nj} = W_{nj} + {\mathrm{\Delta }}W_{nj}\). The new couplings \(W_{j[l]}^\prime\), Zlmj and \(W_{n[l]}^{\prime\prime}\) are also given by
and
with al−m = al − am. On a bipartite lattice, to avoid the negative sign (or complex phase) problem we need to keep \(W_{l[l]}^{\prime\prime}\) and \(W_{m[l]}^{\prime\prime}\) real. This can be achieved by choosing al = 0 for any l if Jlm < 0 (ferromagnetic case). For Jlm > 0 (antiferromagnetic case), al = nπi with an arbitrary integer n if the site l belongs to the sub-lattice A and al = (n + 1/2)πi if l belongs to the sub-lattice B. This local gauge for Jlm > 0 is equivalent to the transformation \(J_{lm}^{xy} \to - J_{lm}^{xy}\) and al = 0 for any site l. We further notice that \(W_{m[l]}^{\prime\prime}\) can be taken positive if we take a sufficiently small δτ in Eq. (69), with the leading-order term \(- {\mathrm{log}}\left( {2J_{lm}^{xy}\delta _\tau } \right){\mathrm{/}}2\). On the other hand, in Eq. (68), the leading-order term is negative (= −Jlmδτ).
To recover the original form of the DBM, we first use Eq. (21) with the replacements \(s_1 \to \sigma _n^z\), s2 → d[l], s3 → h[n], C → Dn, \(V \to W_{n[l]}^{\prime\prime}\), \(\tilde V_1 \to W_{n[n]}\), and \(\tilde V_2 \to W_{[n][l]}^\prime\) for n = l, m. Then the solution for Dn, Wn[n], and \(W_{[n][l]}^\prime\) is expressed in terms of \(W_{n[l]}^{\prime\prime}\) as
for positive \(W_{n[l]}^{\prime\prime}\) and
for negative \(W_{n[l]}^{\prime\prime}\) to give real Wn[n] and \(W_{[n][l]}^\prime\).
To completely recover the original DBM form, we next use Eq. (26) by replacing σ1 with \(\sigma _l^z\), σ2 with \(\sigma _m^z\), d1 with d[l], d2 with d[lm], h1 with hj, h2 with h[lm1], h3 with h[lm2], and V with Zlmj.
With these solutions, by ignoring the trivial constant factors including Dl and Dm, the evolution is described by introducing two deep and four hidden additional variables d[l], d[lm], h[l], h[m], h[lm1], and h[lm2] as
where \(\left\{ {\bar h,\bar d} \right\}\) is a set consisting of the existing and new neurons.
Code availability
Computer codes to create the deep Boltzmann machine networks for each model are provided as Supplementary Software 1â4. Other code written for and used in this study is available from the corresponding author upon reasonable request.
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
References
Feynman, R. P. Space-time approach to non-relativistic quantum mechanics. Rev. Mod. Phys. 20, 367–387 (1948).
Dyson, F. J. The S matrix in quantum electrodynamics. Phys. Rev. 75, 1736–1755 (1949).
Hubbard, J. Calculation of partition functions. Phys. Rev. Lett. 3, 77–78 (1959).
Stratonovich, R. L. On a method of calculating quantum distribution functions. Sov. Phys. Dokl. 2, 416–419 (1957).
Abrikosov, A. A. Methods of Quantum Field Theory in Statistical Physics, revised edition. (Dover Publications, New York, 1975).
Binder, K. Applications of the Monte Carlo Method in Statistical Physics. (Springer Verlag, Berlin, 1984).
Takahashi, M. & Imada, M. Monte Carlo calculation of quantum systems. J. Phys. Soc. Jpn. 53, 963 (1984).
Takahashi, M. & Imada, M. Monte Carlo calculation of quantum systems. II. Higher order correction. J. Phys. Soc. Jpn. 53, 3765 (1984).
Ceperley, D. Path-integrals in the theory of condensed helium. Rev. Mod. Phys. 67, 279–355 (1995).
Suzuki, M. Relationship between d-dimensional quantal spin systems and (d + 1)-dimensional Ising systems: equivalence, critical exponents and systematic approximants of the partition function and spin correlations. Prog. Theor. Phys. 56, 1454–1469 (1976).
Hirsch, J. E., Sugar, R., Scalapino, D. & Blankenbecler, R. Monte Carlo simulations of one-dimensional fermion systems. Phys. Rev. B 26, 5033–5055 (1982).
Beard, B. & Wiese, U.-J. Simulations of discrete quantum systems in continuous Euclidean time. Phys. Rev. Lett. 77, 5130–5133 (1996).
Sandvik, A. W. Stochastic series expansion method with operator-loop update. Phys. Rev. B 59, R14157–R14160 (1999).
Prokof'ev, N. & Svistunov, B. Bold diagrammatic Monte Carlo technique: when the sign problem is welcome. Phys. Rev. Lett. 99, 250201 (2007).
Feynman, R. P. Atomic theory of the two-fluid model of liquid helium. Phys. Rev. 94, 262–277 (1954).
Gros, C. Physics of projected wavefunctions. Ann. Phys. 189, 53–88 (1989).
Kashima, T. & Imada, M. Path-integral renormalization group method for numerical study on ground states of strongly correlated electronic systems. J. Phys. Soc. Jpn. 70, 2287–2299 (2001).
Tahara, D. & Imada, M. Variational Monte Carlo method combined with quantum-number projection and multi-variable optimization. J. Phys. Soc. Jpn. 77, 114701 (2008).
Becca, F. & Sorella, S. Quantum Monte Carlo Approaches for Correlated Systems. (Cambridge University Press, Cambridge, UK; New York, NY, 2017).
White, S. R. Density-matrix algorithms for quantum renormalization groups. Phys. Rev. B 48, 10345–10356 (1993).
Orús, R. A practical introduction to tensor networks: matrix product states and projected entangled pair states. Ann. Phys. 349, 117–158 (2014).
Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. Science 355, 602–606 (2017).
Torlai, G. et al. Many-body quantum state tomography with neural networks. Nat. Phys. 14, 447–450 (2018).
Nomura, Y., Darmawan, A. S., Yamaji, Y. & Imada, M. Restricted Boltzmann machine learning for solving strongly correlated quantum systems. Phys. Rev. B 96, 205152 (2017).
Deng, D.-L., Li, X. & Das Sarma, S. Quantum entanglement in neural network states. Phys. Rev. X 7, 021021 (2017).
Rocchetto, A., Grant, E., Strelchuk, S., Carleo, G. & Severini, S. Learning hard quantum distributions with variational autoencoders. npj Quantum Inf. 4, 28 (2018).
Glasser, I., Pancotti, N., August, M., Rodriguez, I. D. & Cirac, J. I. Neural-network quantum states, string-bond states and chiral topological states. Phys. Rev. X 8, 011006 (2018).
Kaubruegger, R., Pastori, L. & Budich, J. C. Chiral topological phases from artificial neural networks. Phys. Rev. B 97, 195136 (2018).
Cai, Z. Approximating quantum many-body wave-functions using artificial neural networks. Phys. Rev. B 97, 035116 (2018).
Saito, H. & Kato, M. Machine learning technique to find quantum many-body ground states of bosons on a lattice. J. Phys. Soc. Jpn. 87, 014001 (2017).
Saito, H. Solving the Bose–Hubbard model with machine learning. J. Phys. Soc. Jpn. 86, 093001 (2017).
Chen, J., Cheng, S., Xie, H., Wang, L. & Xiang, T. Equivalence of restricted Boltzmann machines and tensor network states. Phys. Rev. B 97, 085104 (2018).
Clark, S. R. Unifying neural-network quantum states and correlator product states via tensor networks. J. Phys. A 51, 135301 (2018).
Deng, D.-L., Li, X. & Das Sarma, S. Machine learning topological states. Phys. Rev. B 96, 195145 (2017).
Gao, X. & Duan, L.-M. Efficient representation of quantum many-body states with deep neural networks. Nat. Commun. 8, 662 (2017).
Huang, Y. & Moore, J. E. Neural network representation of tensor network and chiral states. Preprint at http://arxiv.org/abs/1701.06246 (2017).
Salakhutdinov, R. & Hinton, G. Deep Boltzmann machines. Proc. Mach. Learn. Res. 5, 448–455 (2009).
Trotter, H. F. On the product of semi-groups of operators. Proc. Am. Math. Soc. 10, 545–551 (1959).
Freitas, N., Morigi, G. & Dunjko, V. Neural network operations and Susuki–Trotter evolution of neural network states. Preprint at http://arxiv.org/abs/1803.02118 (2018).
Salakhutdinov, R. & Hinton, G. An efficient learning procedure for deep Boltzmann machines. Neural Comput. 24, 1967–2006 (2012).
Evertz, H. G., Lana, G. & Marcu, M. Cluster algorithm for vertex models. Phys. Rev. Lett. 70, 875–879 (1993).
Ceperley, D. M. & Alder, B. J. Ground state of the electron gas by a stochastic method. Phys. Rev. Lett. 45, 566–569 (1980).
Acknowledgements
G.C. acknowledges useful discussions with Xun Gao, and Markus Heyl. Y.N. and M.I. are grateful for the useful discussions with Youhei Yamaji and Andrew S. Darmawan. Y.N. was financially supported by Grant-in-Aids for Scientific Research (JSPS KAKENHI) (No. 17K14336). M.I. and Y.N. were financially supported by a Grant-in-Aid for Scientific Research (No. 16H06345) from Ministry of Education, Culture, Sports, Science and Technology, Japan. Part of the calculations were done at Supercomputer Center, Institute for Solid State Physics, University of Tokyo. This work was also supported in part by MEXT as a social and scientific priority issue (Creation of new functional devices and high-performance materials to support next-generation industries CDMSI) to be tackled by using post-K computer. We also thank the support provided by the RIKEN Advanced Institute for Computational Science through the HPCI System Research project (hp170263) supported by Ministry of Education, Culture, Sports, Science, and Technology, Japan.
Author information
Contributions
G.C. conceived the general idea and contributed the Ising model and the approximate RBM construction. G.C., Y.N. and M.I. each contributed one of the three Heisenberg model representations. Numerical simulations were performed by Y.N. and G.C. All authors contributed equally to the manuscript preparation and presentation of the results.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Publisher's note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
Cite this article
Carleo, G., Nomura, Y. & Imada, M. Constructing exact representations of quantum many-body systems with deep neural networks. Nat Commun 9, 5322 (2018). https://doi.org/10.1038/s41467-018-07520-3