Hammerstein equations for sparse random matrices
Pawat Akara-pipattanaa and Oleg Evninb,c
a Université Paris-Saclay, CNRS, LPTMS, 91405 Orsay, France
b High Energy Physics Research Unit, Department of Physics, Faculty of Science, Chulalongkorn University,
10330 Bangkok, Thailand
b Theoretische Natuurkunde, Vrije Universiteit Brussel and
International Solvay Institutes, 1050 Brussels, Belgium
pawat.akarapipattana@universite-paris-saclay.fr, oleg.evnin@gmail.com
ABSTRACT
Finding eigenvalue distributions for a number of sparse random matrix ensembles can be reduced to solving nonlinear integral equations of the Hammerstein type. While a systematic mathematical theory of such equations exists, it has not been previously applied to sparse matrix problems. We close this gap in the literature by showing how one can employ numerical solutions of Hammerstein equations to accurately recover the spectra of adjacency matrices and Laplacians of random graphs. While our treatment focuses on random graphs for concreteness, the methodology has broad applications to more general sparse random matrices.
1 Introduction
The spectra of random matrices have found far-reaching applications in situations where large complex matrices are involved, including fields as diverse as linear [1, 2] and nonlinear [3] dynamics in the presence of random couplings, quantum chaos [4], localization phenomena in disordered media [5], vibrational properties of amorphous solids [6], the physics of heavy nuclei [7], explorations of discrete geometry [8] and mathematical neurobiology [9]. One particular topic is spectral problems related to graphs, and hence to network theory [10], since graphs are connected to matrices via the adjacency matrix representation; see [11] for a textbook exposition.
A number of spectral problems for sparse random matrices, that is, matrices with a large and a fixed average number of nonzero entries per row, can be reduced [13, 12, 14, 15, 16, 17, 18, 19, 20, 21] to solving nonlinear integral equations of the form
(1) |
where the ‘source’ , the kernel and the nonlinearity are prescribed, the latter typically being an exponential nonlinearity with respect to the unknown in the examples we will consider. Such equations are known as Hammerstein equations, after [22]. A typical reaction of a theoretical physicist to an equation like (1) is that of bewilderment. We thus read in the influential paper [23] on sparse matrix spectra: “This integral equation… has, however, so far resisted exhaustive analysis or full numerical solution.” An early numerical treatment can be seen in [16], and it approximates the desired solution via sophisticated sums of Gaussians adapted to the concrete equation at hand, without evoking a systematic theory of equations of the form (1).
Despite having remained unincorporated in the context of sparse random matrix considerations, a systematic theory of Hammerstein equations has been developed in the mathematical literature, starting with [22], see [24] for a comprehensive review. (Curiously, the physical application of such equations mentioned in [24] is in Chandrasekhar’s theory of radiative transfer [25], very far from our present perspective.) In fact, a splash of research activity on numerical convergence properties in concrete cases is seen in the recent years. Our purpose in this article is to demonstrate that this systematic approach to solving Hammerstein equations numerically provides an efficient procedure for recovering sparse matrix spectra in a way that has been previously overlooked in the physics literature. The concrete examples we shall consider are the spectra of adjacency matrices, ordinary graph Laplacians and normalized graph Laplacians of sparse Erdős-Rényi random graphs. These examples are chosen as concrete random matrix ensembles that have received considerable attention in the literature. The technology we discuss should be applicable with minimal modifications to many other sparse random matrices.
The paper is organized as follows: In section 2, we review the emergence of integral equations from the problem of computing the spectrum of adjacency matrices of sparse Erdős-Rényi graphs. In section 3, we describe the process for solving Hammerstein equations numerically. In section 4, we present various tweaks necessary for handling the concrete Hammerstein equations arising from random matrices, as well as the practical results of our numerical work for the spectra of adjacency matrices, ordinary graph Laplacians and normalized graph Laplacians of sparse Erdős-Rényi random graphs. We conclude with a discussion and a wishlist for possible improvements in the numerical algorithms.
2 An integral equation for sparse random graph spectra
We start with a review of the emergence of integral equations of Hammerstein type from the application of statistical field theory methods [26, 27] to spectral problems for sparse random graphs. These methods have been used for a wide range of problems involving random graphs and random matrices [12, 14, 15, 23, 13, 28, 29, 30, 31, 32, 33, 35, 34, 36]. Our purpose here is not to provide a systematic technical exposition of these methods, but rather to give the readers an impression of the sort of structures and steps involved. The readers have the option of fast-forwarding through this section, as the topic of deriving the Hammerstein equations for sparse random matrix spectra is logically independent from the topic of solving them.
The specific variation of the statistical field theory methods we use relies on supersymmetry and auxiliary integrals over anticommuting variables, see [37, 38] for textbook expositions. In application to random matrices, some essential ingredients we use go back to Fyodorov and Mirlin [14, 15]. We have given a pedagogical account of this technology in [19], and refer the readers to that paper for further details.
The example we choose for our demonstration is the eigenvalue density of the adjacency matrix of an Erdős-Rényi random graph. This graph is defined as a set of vertices with , and each pair of vertices is connected randomly and independently with probability , so that at , each vertex has on average connections. These data are conveniently coded into the adjacency matrix , which is a symmetric zero-diagonal matrix whose entry equals 1 if vertices and are connected, and 0 otherwise. The above specification of the Erdős-Rényi ensemble is translated into the following probability distribution for the entries of :
(2) |
where stands for the standard -function. Finding the eigenvalue density of the adjacency matrices described by this ensemble is a standard problem in random matrix theory. Our treatment follows the original works [15, 14]. We choose this example for its relative simplicity, and also for the fact that it is not directly covered by the pedagogical exposition of [19]. At large , the eigenvalue distribution tends to the semicircle familiar from many incarnations of random matrix theory [39]. At smaller , the shape gets deformed in a complicated way, and it is this deformed shape that will be captured by integral equations we consider in this paper.
We start with the so-called resolvent representation of the eigenvalue density . Let the eigenvalues of be . Then, by definition:
(3) |
where we have used the Sokhotski–Plemelj formula and the spectral decomposition of the matrix inverse, while implies evaluating the relevant expression just below the real axis. The resolvent is much simpler to handle within statistical averages than the eigenvalues themselves, since it can be effectively represented using Gaussian integrals, making the dependence on the components of simplify further. Note first that, due to the symmetry of the Erdős-Rényi ensemble under vertex permutations, after the averaging, one has
(4) |
The next step is to represent the resolvent (4) through Gaussian integrals, which results in factorization over the entries of and hence in a straightforward -averaging. To do so, it is customary [15, 14, 37, 38] to introduce a 4-component ‘supervector’ at every vertex , given by , where and are ordinary numbers, while and are anticommuting numbers satisfying
(5) |
the latter relations known as Berezin integration rules. (All components of anticommute with all components of as well. Note that Berezin integrals should not be visualized as any generalization of Riemann sums. They are linear maps from functions of anticommuting numbers to real numbers that have been developed historically for the purpose of convenient path-integral representation of fermionic systems.) We furthermore introduce a ‘scalar product’ of supervectors defined as
(6) |
With these preliminaries, we can write
(7) |
Note that, if substituting (6) into (7), one obtains independent Gaussian integrals involving , and . Absorbing the powers of arising from the standard Gaussian integrals over and into the definition of the measure, the first integral evaluates to due to the insertion of , the second integral evaluates to , while the last integral evaluates to by the standard formula for Gaussian integrals in Berezin integration theory. As a result, the powers of determinants exactly cancel, leaving behind (7).
The purpose of inserting integrals over anticommuting variables in (7) was precisely to make the powers of determinant arising from Gaussian integration exactly cancel out, yielding an expression factorized over the components of . An alternative strategy, often seen in statistical physics literature, is the replica method [12, 13, 23, 35, 40], where one introduces an arbitrary controllable number of extra Gaussian integrals over ordinary commuting variables, formally extrapolating the total number of Gaussian integrals to 0 in the final formulas at the end of the computation. While, in practical terms, this strategy is largely interchangeable with the supersymmetry-based approach we are employing here, the latter dispenses of the need for the ambiguous operation of extrapolating a positive integer to 0.
Averaging (7) over the random matrix ensemble (2) is straightforward and yields
(8) |
where we have used and then introduced
(9) |
To solve the supervector model (8) at large , we employ, after Fyodorov and Mirlin [14, 15, 19], the following Gaussian functional integral over an auxiliary function :
(10) |
where is the inverse of in the sense of integral convolution:
(11) |
The explicit form of will not be necessary for our derivation. Once this transformation has been implemented, the integrals over with different completely factorize and are performed independently, yielding an explicit saddle point problem:
(12) |
At large , the integral is dominated by the stationary points of the functional defined by , which is
(13) |
This equation is closely reminiscent of the analogous Bray-Rodgers equation in the replica approach [13, 12, 23].
The final step is to assume that the dominant saddle point described by (13) is invariant under ‘superrotations’ that preserve the scalar product (an alternative would have been continuous families of saddle points related by superrotations, since the equation is symmetric under applying a superrotation to the argument of ). This implies that , and with this simplification, the integral equation (13) can be reduced to a single one-dimensional integral equation. To do so, we express the commuting components of in polar coordinates as
(14) |
With this,
(15) |
where the Taylor expansion of truncates at the -term, since all higher powers of anticommuting variables vanish by (5). Inserting this representation in the -integral in (13) and performing the integration over all variables except for , we obtain, after some integration by parts in :
(16) |
where is the usual Bessel function. This equation, which dates back to [12] where it was derived by replica methods, and its analogs for graph Laplacians, will be the main subjects of our numerical work in what follows. Similar integral equations with exponential nonlinearities and Bessel kernels arise also in other sparse random matrix problems.
Once a solution of (16) has been found, one has to substitute it in (12) and perform a saddle point evaluation of the integral. This, in general, would involve complicated determinants from Gaussian integrations, but in our case, these determinants must cancel out. An elementary way to see it is as follows: consider, tautologically, (8) with replaced by 1 (so that the average is guaranteed to equal 1). Then, our entire derivation will go through, arriving at (12), but without the insertion . Neither the saddle point equation (16) nor the determinants arising from Gaussian integrations depend on the presence of this insertion. On the other hand, by our tautological starting point, the result without the insertion must equal 1. The result with the insertion, as in (12), must then be equal to the insertion evaluated at the saddle point, that is,
(17) |
where is now understood as the solution of (16). The integrals can again be simplified using (14-15) and assuming that decays at infinity, and then plugged back into (3-4) to yield the following result for the eigenvalue density:
(18) |
By construction, is normalized as a probability density . It is rather striking, at the face value, that the solution of the formidable equation (16) processed with (18) is automatically normalized in this way. It is nonetheless correct, as follows from our arguments, and as will be verified by numerical simulations. (In [19], we ignored all normalization at the intermediate stages of computation, since it is always easy to normalize the end result. Here, however, we restore all normalizations for comparisons with numerics, which can be done, as it turns out, with minimal effort.)
3 Handling Hammerstein equations
Equation (16) evidently matches the general structure of Hammerstein equations given by (1). Before proceeding with its solution, we will review the general underlying principles, following [24].
In the random matrix literature, the systematic theory of Hammerstein equations appears to have never been applied. Indeed, in [23], the apparent intractability of integral equations like (16) is used to motivate an alternative approach, where one goes back to an analog of (13) and then develops a sampling scheme in the spirit of ‘population dynamics’ to approximate its solutions. In [16], equations analogous to (16) are approached directly, and their solutions are approximated by elaborate sums of Gaussians. This approach is not apparently connected, however, to the strategies for solving Hammerstein equations seen in the mathematical literature.
What the mathematical literature instructs us to do is to take (1) and approximate its solutions by functions within some finite-dimensional linear subspace spanned by with , that is . One can then replace the original equation (1) by the following modification
(19) |
where is a projector on the subspace spanned by . There are different possible choices for this projector, leading to different numerical solution methods, as we shall discuss shortly. One can view (19) as a system of nonlinear algebraic equations for unknowns that can be solved by any standard numerical methods. One expects that, as the dimension of the subspace spanned by increases, the resulting will approximate the true solution of (1) better and better. The details of this convergence have been studied by mathematicians, but we shall be pragmatic about it since rather small values of around 10 will be sufficient to reproduce sparse matrix spectra with a good precision.
There is a technical issue with equation (19) as it stands. Because appear inside the integral over nonlinearly, the integral will have to be recomputed for each assignment of , for example, when solving the algebraic system (19) by means of iterative methods. These repeated integrations are numerically costly. Kumar and Sloan proposed in [41] a simple trick to bypass this issue. Instead of , one introduces
(20) |
so that (1) is equivalently rewritten as
(21) |
We then expand , rather than , in the truncated basis with :
(22) |
and project (21), rather than (1), on this basis using a suitable projector :
(23) |
This is, again, solved for as a system of nonlinear algebraic equations. An advantage over (19) is that the integrals are computed once and for all, and are not affected by the values of . After has been found in this way, providing an approximation to , is best reconstructed by applying one more integration:
(24) |
Regarding the choice of the projector , a natural first thought is to use an orthogonal projector with a suitably defined inner product on the space of functions. This choice leads to Galerkin (or spectral) methods for solving Hammerstein equations. These methods are extensively covered in the literature, but a more economical setup in the context of (23) is given by the collocation method that uses the so-called interpolating projector [42]. For a given input , the interpolating projector provides as a function of the form that equals at prescribed collocation points with . This provides exactly linear conditions for unknowns . With such a prescription for the projector , (21) becomes
(25) |
Note that the matrix only has to be evaluated once. With this matrix computed, the equations for , indexed by , are solved using the standard numerical routines for multidimensional root search.
One still has to decide on how to choose the basis and the collocation points . Mathematical literature consistently advises us to work with orthogonal polynomials. In [42], integral equations on the interval are considered, and then it is natural to choose as the Legendre polynomials. Some considerations of half-infinite integration ranges, as in (16), can be seen in [43]. Since the integrals are from to , it is natural to use Laguerre polynomials but we will make some different choices in the concrete examples below, depending on the concrete form of the integrand appearing in the Hammerstein equation. Note that since no orthogonal projection is involved in the collocation equation (25), the orthogonality property of the polynomial family is never used explicitly, and any other family of linearly independent functions could be used, at least in principle. Using orthogonal polynomials is, however, a way to ascertain a ‘good degree’ of linear independence. For example we have checked that constructing from plain polynomials , , , etc without orthogonalization results in a poorly conditioned system, and the numerical search for the solutions of (25) does not reliably converge. Orthogonal polynomials work well, by contrast. In [43], we are instructed that a good choice of collocation points is the roots of the lowest-degree orthogonal polynomial not included in the chosen set . We will generally follow this prescription, though according to our simulations reported below, reasonable variations in the positions of the collocation points do not immediately compromise the operation of the algorithm.
4 Numerical results for sparse Erdős-Rényi graphs
We proceed to report how numerical solutions of Hammerstein equations via the collocation method (25) work for determining eigenvalue spectra of adjacency matrices and Laplacians of sparse random graphs. For each concrete equation at hand, we will have to indicate how to effectively partition the functional dependences in the integral convolutions on the right-hand side to match the notation in (1), and how to choose the sets of trial functions and collocation points.
4.1 Adjacency matrices
Before applying the technology of section 3 to equation (16), it is wise to implement the following rescalings:
(26) |
Dropping the tildes for convenience, we obtain the rescaled equation
(27) |
An advantage of this representation is that the relevant range of becomes approximately independent of . Indeed, at large , we can expand and the equation turns into , solved by a fixed-size Wigner semicircle [19]. At smaller , the functional dependences evidently get deformed, but they occupy roughly the same region with respect to the new variable, rather than getting scaled with respect to the old variable. This makes the numerical implementation more neat.
There is some ambiguity in how to split the various -dependences in (27) to match the structure of (1). Evidently, . We find it convenient to identify
(28) |
An advantage of including the -dependence into the nonlinearity and not into the kernel is that, as we vary to derive the eigenvalue density at different points, the integrals in the Kumar-Sloan collocation problem (25) will not have to be recomputed. For the basis we choose Laguerre polynomials multiplied by exponentials. Namely, we write
(29) |
where are Laguerre polynomials orthogonal with respect to . We will treat as a tunable real parameters and adjust it to optimize the numerical performance. We remark that we could have used different scalings for the arguments of the Laguerre polynomials and the exponential, which gives more control over the numerical performance, but the practical gain is not significant and the formulas become more bulky. We give a summary of that scheme in Appendix A, and proceed here with (29) as given.
We can evaluate the integral in (27) analytically using the identity [44]
(30) |
where are the associated Laguerre polynomials, and choosing , and . From (27), (29) and (30), together with , we obtain
(31) |
Note that the -integral is completely gone due to the orthogonality of Laguerre polynomials. Recombined with (29), this generally results in the following problem:
(32) |
where we defined , and is the projector of choice on the space of degree polynomials in terms of . This should be treated as a system of equations, from identifying the polynomial coefficients on the two sides, for unknowns . By orthogonality of Laguerre polynomials, (18) then yields for the following eigenvalue density, keeping in mind that we must undo the rescalings (26):
(33) |
For the simplest choice of we take the interpolating projector, equating the polynomial and the exponential in (32) at the collocation points with , chosen as the roots of :
(34) |
To check the performance, we program this in Python (a sample script is provided in Appendix B) using the root function of the SciPy library [45] to solve (34) numerically with and the initial seed for the root search taken as , at the first -point, with the previous solutions reused as the seeds for each subsequent -point.
The results of our simulations are presented in Fig. 1. It can be seen that the performance is perfect at , which is shared by any higher values of . At tiny undulations are seen near the center. These undulations become much more dramatic at lower , disrupting the precision of the algorithm performance, though the overall shape of the distribution remains qualitatively correct. We shall comment in the conclusions on possible strategies for stabilizing the numerical performance at lower values of .
4.2 Ordinary graph Laplacians
Graph Laplacians are of central importance for transport phenomena on networks. We start with the ordinary graph Laplacian that controls, in particular, random walks on graphs where there is a fixed transition rate per edge per unit time [19]. In terms of the adjacency matrix , drawn for Erdős-Rényi graphs from the distribution (2), this Laplacian is defined as follows: we first construct the diagonal degree matrix whose diagonal entries are , and then the ordinary Laplacian is defined by
(35) |
Unlike the adjacency matrix, this matrix is positive semi-definite.
An analytic theory of the ordinary Laplacian along the lines of section 2 was developed in [19]. One has to solve the Hammerstein equation
(36) |
that replaces (16) derived for adjacency matrices. The eigenvalue density of (35) is recovered from the solution as
(37) |
where we have restored, based on the principles outlined in section 2, the exact probability density normalization, ignored in the analytic formulas of [19]. To get to the last representation, one uses the -derivative of (36) at . Equations of the form (36) date back to [13] where the derivations were based on replica methods. They have also been derived, together with equations like (16) for adjacency matrices, in [17] where the analysis based on the methods of moments. The latter strategy has the advantage of being mathematically rigorous, but requires a genuine tour de force of combinatorial accounting and resummations.
As in the case of adjacency matrices, it is convenient to implement a -dependent rescaling to make the relevant ranges of variables depend only weakly on . We choose this rescaling as
(38) |
The resulting equation, with tildes dropped, is
(39) |
At large , this equation reduces to [19, 46, 47], which describes a ‘free convolution of the Wigner semicircle and a Gaussian’ in the language of free probability theory [48]. To represent (39) in the form (1), we choose
We then follow the Kumar-Sloan strategy for implementing collocation described in section 3, and specifically apply the expansion (29).
Since the integral over in (39) is exactly the same as in (27), we can rely on the same strategy as under (30) to explicitly evaluate all the integrals under the ansatz (29). This yields, instead of (31),
(40) |
and leads to the following collocation problem:
(41) | ||||
where are the roots of . From the solution of this algebraic system, one recovers via (37), (38) and (40)
(42) |
We have performed numerical simulations based on (41) and (42) and observed that they reproduce very well the empirical distribution at , though increasing the expansion parameter becomes necessary towards the lower end of this range to ensure numerical convergence. At smaller values of , the performance is less stable, especially in regions away from the main peak, where spurious oscillations develop. The performance at is seen on the left panel of Fig. 2. It captures the smooth descending part of the distribution perfectly, but is less accurate to the left of the main peak, where rough textures drvelop as tends to the ‘percolation threshold’ at , even though the overall shape is reproduced correctly.
One can understand intuitively the deteriorating performance of the expansion scheme (29) away from the main peak. Indeed, there, the first derivative of at becomes small and is dominated by the Gaussian envelope , clearly visible [19] in the large limit of (. If we expand as in (29), this Gaussian envelope is approximated by a polynomial, which is clearly prone to instabilities due to the large growth of polynomials and large decay of . To remedy for this problem, we can replace (29) by an alternative expansion
(43) |
where are the half-range Hermite polynomials [49] orthogonal with respect to the inner product . The price to pay is that there are no longer simple and nice analytic formulas for , but we can still process all the integrals numerically and run the generic formulation of Kumar-Sloan collocation given by (25). We have observed that this approach performs better at smaller . (A sample script is given in Appendix C.) In the right panel of Fig. 2, the performance of this scheme at the low value is shown. The smooth heavy tail of the distribution is captured very accurately, while the rough part around the ascent and the peak is reproduces less well, though the overall shape is correct.
At larger values of , both approaches quickly become very accurate, and the empirical distribution shape is effectively indistinguishable from our numerical solutions of the Hammerstein equation (36) already at c=15.
4.3 Normalized graph Laplacians
The normalized Laplacian is defined, in the notation of (35), by
(44) |
This Laplacian controls, in particular, random walks where there is a fixed rate per unit time to leave the present vertex and jump with equal probability to one of its neighbors [19]. (In the above defintion, is understood as the square root of the pseudoinverse of .)
The spectral theory of normalized Laplacians of Erdős-Rényi graphs was developed in [19]. It results in the following Hammerstein equation:
(45) |
(Curiously, the same equation was recently found to control the probabilities of first return times of random walks on Erdős-Rényi graphs [21].) From the solution of this equation, the eigenvalue density can be recovered in the following form, with the normalization given explicitly:
(46) |
As in our previous treatments, it is convenient to introduce rescalings that minimize the dependence of the contributing ranges of variables on . This is accomplished with
(47) |
Dropping all tildes, we then get
(48) |
The limit is the same as for adjacency matrices, defining a Wigner semicircle [19]. For the Kumar-Sloan collocation, we identify and
(49) |
The integral over in (48) is once again completely identical to (27), so we adopt the expansion (29) and apply the same processing as in section 4.1 to obtain
(50) |
The collocation problem, analogous to (34) for adjacency matrices, then becomes
(51) | ||||
where the collocation points are the roots of . After these equations have been solved numerically, one must reconstruct from (29), from (50), undo the rescaling (47) and evaluate (46). In this way, we obtain:
(52) | ||||
We display the output of solving (51), followed by an application of (52) in Fig. 3. The performance of this algorithm remains stable and successful down to rather low values and 4. Convergence is more problematic near the origin at low , where a large abrupt spike develops, hence we excise these regions. In the bulk of the distribution, the numerics based on (51) and (52) gives very convincing results.
5 Outlook
We have applied a systematic treatment of nonlinear integral equations of the Hammerstein type to recover the spectra of sparse random matrices associated to Erdős-Rényi graphs with an asymptotically large number of vertices of average degree . Our treatment is practically exact for larger than 15 or so, and remains rather accurate down to around 8. As is lowered further, approaching the percolation threshold at , challenges in numerical performance are met, though for graph Laplacians, we still obtain very reasonable output for as low as 4. We hope that further improvements in numerical methods for solving Hammerstein equations will lead to better performance in the low- region.
The approach we have taken here is very complementary to the widely explored ‘cavity’ and ‘population dynamics’ methods for reconstructing sparse graph spectra [23, 50, 40]. In those approaches, stochastic simulations are designed whose stationary distributions contain information on eigenvalue densities. By contrast, our numerics is fully deterministic and consists in solving the integral equations (16), (36) and (45) using collocation methods. The implementation is very economical in its computational costs: numerical solutions of the Hammerstein equations in all of our examples (the solid black lines in the plots) take seconds to obtain on an ordinary personal computer. We feel that it would be very interesting to apply this technology to more sophisticated sparse matrix problems, for example, those arising in the theory of amorphous solids [6] or for non-Hermitian matrices [36, 51, 52].
The main challenges met by our numerical algorithms are either in the regions where the eigenvalue density becomes very small, or at low values of , where it is known from empirical diagonalization of large matrices that the regularity of the eigenvalue density deteriorates dramatically. We provide below preliminary comments on the origin of these issues and potential pathways to their solution.
The poor performance outside the main support of the eigenvalue density can be easily understood qualitatively. The eigenvalue density is related to the imaginary part of the -derivative of at the origin, as in (37). When the eigenvalue density is small, the imaginary part of grows slowly with , and hence decreases slowly with . As a result, a large range of becomes relevant in the integrals on the right-hand side of the Hammerstein equations, with complicated oscillatory behaviors, and one should not expect that such integrals will be easily approximated using interpolating polynomials. The situation is significantly better for ordinary Laplacians, where includes a Gaussian envelope , independent of the eigenvalue density, and this tames the integrals over when we use the half-range Hermite polynomial expansion adapted to this Gaussian envelope. (We note that the other cases also have similar Gaussian envelopes, but the width is of order rather than of order 1. This may provide a viable alternative strategy at small , again, moving away from simple Laguerre expansions.) We feel that a general scheme to handle the equations outside the main support of the eigenvalue density should exist, possibly by rearranging the integrals in the complex -plane, but we have not been able to formulate a working recipe. This problem is of somewhat marginal importance for the eigenvalue distribution topic, since one is mostly interested in the shape of the eigenvalue distributions where they are nonvanishing, and not in the regions where they (almost) vanish. By contrast, for applications of Hammerstein equations to random walks on random graphs, as in [21], the behavior of near , very far away from the main support of the eigenvalue density, is of primary importance.
Using Laguerre polynomial expansions, we have been able to evaluate all the integrals in the Hammerstein equations with Bessel kernels exactly, obtaining explicit systems of algebraic equations (34), (41) and (51). This both provides for very fast numerical performance and creates a comfortable environment for more detailed mathematical analysis of convergence, approximation accuracy, etc. Where it is advantageous to use a different kind of expansion, as in our half-range Hermite treatment of the ordinary graph Laplacians, the integrals can still be effectively handled numerically following the general Kumar-Sloan scheme (25).
As to improving the performance of our numerics at lower , the issue seems to be that the shape of various functional dependences becomes more complicated at lower , and our simple polynomial expansions fail with capturing it. One could hope that increasing the polynomial order in (29) would help, but it is not straightforward for two reasons. First, as we go to higher , we would need to solve nonlinear systems of equations similar to (34) in higher and higher numbers of dimension, where naive root searches become less and less reliable. Perhaps this could be tackled by first running the algorithm at a lower value of and then reusing the output as seeds for the root search at higher values of . More dramatically, however, the interpolating projector we have used with great efficiency at low is likely to be flawed at higher . It is well-known that naively increasing the order of interpolation does not necessarily result in better approximation [53]. There could certainly be more refined choices for the projector in (32) and other similar equations. A naive option is to rely on the infinite radius of convergence of the Taylor series for exponentials and employ a Taylor expansion on the right-hand side of (32). This, however, would require working at a very high arithmetic precision, since the convergence of Taylor expansions for exponentials requires massive numerical cancellations. There likely exist better choices for that we leave for future investigations, also hoping for practical input from mathematicians specializing in numerical methods. Uniform approximations of functions by polynomials on infinite intervals in the presence of exponential suppression have been discussed systematically in the literature [54].
In our numerical work, we have treated the expansion parameters such as in (29) and the initial seed for multidimensional root searches as free parameters chosen empirically to ensure satisfactory performance. In the cases we considered, simple straightforward choices sufficed to obtain the plots given in the paper. It could be good to develop a better approach where these parameters are bootstrapped toward good values, either by analyzing the integral equations at a given value of , or by reusing the previously found solutions at other values of (the latter is actually the approach we took for the initial seed).
Nonlinear algebraic equations like (34), (41) and (51) are generally expected to have multiple solutions. Only one of these solutions corresponds to the actual eigenvalue density curve, while the others are spurious. We could see that deforming the parameters of our expansion schemes (for example, making small) or the initial seed far from the optimal values we found leads to such spurious convergence, predicting a completely wrong eigenvalue density curve. For the concrete cases we considered, we indicated simple natural choices of the numerical implementation parameters that ensure convergence to the correct solution. It would be nice to have more of a theory of the solutions of (34), (41) and (51) and their basins of attraction relative to root search algorithm, especially given that the equations are provided explicitly, with all the integrals evaluated in a closed form. Uniqueness of solutions of Hammerstein equations (before converting them to approximate collocation schemes) has been discussed extensively in the mathematical literature.
One striking feature of the eigenvalue distributions of sparse matrices at lower values of is the emergence of sharp peaks [23, 55]. These peaks are understood qualitatively as coming from small disconnected components. For example, isolated vertices of degree 0 contribute zero eigenvalues to the adjacency matrix. Since there are on average vertices of degree 0, one expects a peak in the eigenvalue distribution. Similarly, graph Laplacians develop zero eigenvalues for each disconnected component of the graph. There are further peaks at nonzero values of . We suspect that they come from graph components that are only connected to the rest of the graph by one edge, and other peculiar arrangements (similar structures play a significant role in the distribution of resistance distances [34]). These sharp peaks are, incidentally, reflected differently in the empirical histograms and in the numerical solutions of the integral equations appearing in our plots. Numerical histograms, in the limit of large samples, always show the integrals of the eigenvalue density over single-bin intervals. The numerical solutions are evaluated at a single -point, except that the truncated approximations of the original integral equation, as in (29), are likely to resolve the infinitely thin -function peaks somewhat.
The sharp peaks could be of significant interest physically, depending on the setting. For example, where random matrix spectra represent vibrational modes, the peaks in the eigenvalue density will signify strongly resonant points in the spectrum. The -function singularities in the eigenvalue distribution should translate to singular behaviors of at some special values of . It would be very nice to develop a systematic theory of these sharp peaks, and use this analytic understanding to subtract that part within the equations of , so that the target functions for numerical work become more smooth and the numerical methods become more stable.
Acknowledgments
PA is supported by the Impulscience program of Fondation Bettencourt Schueller. OE is supported by Thailand NSRF via PMU-B, grant number B13F670063. We acknowledge the use of computation resources of LPTMS at Université Paris-Saclay for generating and diagonalizing large samples of random matrices.
Appendix A A more general expansion in terms of Laguerre polynomials
Instead of (29), we could have employed a more general expansion
(53) |
where the scalings of the argument of the Laguerre polynomial and the exponential are different. This does not immediately yield significant improvements over the case treated in the main text, and the formulas become more bulky. These formulas, however, may become useful for other future applications, so we give a summary here on how to treat the adjacency matrix equation (27) with the expansion (53).
The integrals can in general be evaluated using the relation of the Bessel convolutions – that is, Hankel transforms – to 2d Fourier transforms. Specifically, we write . Then, for any that decays at infinity,
(54) | |||
where we have introduced the Cartesian coordinates , .
If we start with the definition of Laguerre polynomials
(55) |
substitute and expand everything in terms of monomials of the form , the integrals are evaluated term-by-term using the two elementary Gaussian quadratures:
(56) | |||
and
(57) |
where one must define to make the formulas work correctly. Then,
(58) |
where we have introduced the following auxiliary polynomials
(59) |
With these ingredients, we can write for the integrals of the individual terms in (55):
(60) |
To make this formula work correctly at , we must add due to the last term in (54). If we now impose (27) at collocation points with under the ansatz (53) and use the above formulas to evaluate the Bessel integral, keeping in mind that , the output is the following collocation problem
(61) |
with and
(62) |
with given by (59). While the multiple sums may seem awkward, they merely encode the coefficients of explicit polynomials in and can be treated very effectively in numerical implementations. They also need to be evaluated only once. Note that no approximations have been made in (61-62) beyond truncating the expansion (53) at .
For the collocation points we choose . where are the roots , so that . We then use the above collocation problem to find and hence , and then the eigenvalue density can be recovered from (18), which gives
(63) |
Appendix B Python code for the adjacency spectrum
We provide a basic Python script that recovers the solution of the collocation problem (34) for adjacency matrices. Related collocation problems (41) and (51) for graph Laplacians can be treated by minimal modifications of this script.
import numpy as np import scipy.optimize as opt import matplotlib.pyplot as plt def eqs(beta,z): betac=beta[:J]+1j*beta[J:] eqs=[sum([L[k,j]*betac[j] for j in range(J)])-np.exp(x[k]*(1-1j*z/gamma) -sum([C[k,j]*betac[j] for j in range(J)])) for k in range(J)] return np.append(np.real(eqs),np.imag(eqs)) c=15 sqrtc=np.sqrt(c) J=11 #corresponds to J+1 in the notation of the paper gamma=1 zs=np.linspace(0.01,2.1,100) x=np.polynomial.laguerre.lagroots([0]*J+[1]) fac=np.ones((J)) for i in range(1,J-1): fac[i+1]=(i+1)*fac[i] C=np.zeros((J,J)) L=np.zeros((J,J)) for k in range(J): xgc=x[k]/(gamma**2*c) expxgc=np.exp(-xgc) for j in range(J): L[k,j]=np.polynomial.laguerre.Laguerre([0]*j+[1])(x[k]) C[k,j]=c*(1-expxgc*sum([xgc**n/fac[n] for n in range(j+1)])) p=[] beta=np.array([1]+[0]*(2*J-1)) for z in zs: sol=opt.root(eqs,beta,args=(z),tol=1e-12) beta=sol.x p.append(beta[0]/(gamma*np.pi*sqrtc)) plt.plot(zs*sqrtc,p) plt.ylim(0) plt.show()
Appendix C Python code for the ordinary Laplacian via half-range Hermite polynomials
We provide a simple Python script that implements the half-range Hermite collocation for ordinary graph Laplacians. While we could evaluate some of the integrals here analytically in principle, the numerical evaluation works well for practical purposes.
import numpy as np from scipy.optimize import root from scipy.integrate import quad from scipy.special import jv import matplotlib.pyplot as plt def hrHermites(n): polys=[np.poly1d([(np.sqrt(np.pi)/2)**(-0.5)])] for i in range(1,n+1): this_poly=np.poly1d([1]+[0]*i) for last_poly in polys: this_poly-=last_poly*quad(lambda y: this_poly(y)*last_poly(y)*np.exp(-y**2), 0,np.inf)[0] this_poly=this_poly/np.sqrt(quad(lambda y: this_poly(y)**2*np.exp(-y**2), 0,np.inf)[0]) polys.append(this_poly) return polys def eqs(beta,z): betac=beta[:J]+1j*beta[J:] eqs=[] for k in range(J): xg=1j*x[k]/sqrtc expxg=np.exp(xg) eqs.append(sum([H[k,j]*betac[j] for j in range(J)])-np.exp(x[k]**2/2-1j*x[k]*z +c*(expxg-1-xg)-expxg*sum([C[k,j]*betac[j] for j in range(J)]))) return np.append(np.real(eqs),np.imag(eqs)) c=5 sqrtc=np.sqrt(c) J=12 zs=np.linspace(-3,4,100) hrH=hrHermites(J) x=hrH[-1].r C=np.zeros((J,J)) H=np.zeros((J,J)) for k in range(J): for j in range(J): H[k,j]=hrH[j](x[k]) C[k,j]=sqrtc*quad(lambda y:jv(1,2*np.sqrt(x[k]*y/c))*np.sqrt(x[k]/y) *hrH[j](y)*np.exp(-y**2/2),0,np.inf)[0] p=[] beta=np.zeros(2*J) for z in zs: sol=root(eqs,beta,args=(z),tol=1e-12) beta=sol.x betac=beta[:J]+1j*beta[J:] f_real=lambda y: np.real(np.exp(-1j*y/np.sqrt(c)-y**2/2)*np.sum([betac[j] *hrH[j](y) for j in range(J)])) f_imag=lambda y: np.imag(np.exp(-1j*y/np.sqrt(c)-y**2/2)*np.sum([betac[j] *hrH[j](y) for j in range(J)])) p.append(np.real(quad(f_real,0,np.inf)[0] +1j*quad(f_imag,0,np.inf)[0])/(np.pi*sqrtc)) plt.plot(c+1+zs*sqrtc,p) plt.xlim(0) plt.ylim(0) plt.show()
References
- [1] R. M. May, Will a large complex system be stable?, Nature 238 (1972) 413.
- [2] J. Grilli, T. Rogers and S. Allesina, Modularity and stability in ecological communities, Nature Comm. 7 (2016) 12031.
- [3] K. M. Frahm and D. L. Shepelyansky, Nonlinear perturbation of random matrix theory, Phys. Rev. Lett. 131 (2023) 077201, arXiv:2212.11955 [cond-mat.stat-mech].
- [4] T. Guhr, A. Muller-Groeling and H. A. Weidenmuller, Random matrix theories in quantum physics: common concepts, Phys. Rept. 299 (1998) 189, arXiv:cond-mat/9707301.
- [5] F. Evers and A. D. Mirlin, Anderson transitions, Rev. Mod. Phys. 80 (2008) 1355, arXiv:0707.4378 [cond-mat.mes-hall].
- [6] G. M. Cicuta, J. Krausser, R. Milkus and A. Zaccone, A unifying model for random matrix theory in arbitrary space dimensions, Phys. Rev. E 97 (2018) 032113, arXiv:1710.02850 [cond-mat.dis-nn].
- [7] T. A. Brody, J. Flores, J. B. French, P. A. Mello, A. Pandey and S. S. M. Wong, Random-matrix physics: spectrum and strength fluctuations, Rev. Mod. Phys. 53 (1981) 385.
- [8] G. Bianconi and S. N. Dorogovtsev, The spectral dimension of simplicial complexes: a renormalization group theory, J. Stat. Mech. 2020 (2020) 014005, arXiv:1910.12566 [cond-mat.dis-nn]; M. Reitz and G. Bianconi, The higher-order spectrum of simplicial complexes: a renormalization group approach, J. Phys. A 53 (2020) 295001, arXiv:2003.09143 [cond-mat.dis-nn].
- [9] S. C. de Lange, M. A. de Reus and M. P. van den Heuvel, The Laplacian spectrum of neural networks, Front. Comp. Neurosci. 7 (2014) 189.
- [10] M. Newman, Networks: an introduction (Oxford, 2018).
- [11] P. Van Mieghem, Graph spectra for complex networks (Cambridge, 2011).
- [12] G. J. Rodgers and A. J. Bray, Density of states of a sparse random matrix, Phys. Rev. B 37 (1988) 3557.
- [13] A. J. Bray and G. J. Rodgers, Diffusion in a sparsely connected space: a model for glassy relaxation, Phys. Rev. B 38 (1988) 11461.
- [14] Y. V. Fyodorov and A. D. Mirlin, On the density of states of sparse random matrices, J. Phys. A 24 (1991) 2219.
- [15] A. D. Mirlin and Y. V. Fyodorov, Universality of level correlation function of sparse random matrices, J. Phys. A 24 (1991) 2273.
- [16] K. Broderix, T. Aspelmeier, A. K. Hartmann and A. Zippelius, Stress relaxation of near-critical gels, Phys. Rev. E 64 (2001) 021404, arXiv:cond-mat/0011414.
- [17] O. Khorunzhy, M. Shcherbina and V. Vengerovsky. Eigenvalue distribution of large weighted random graphs, J. Math. Phys. 45 (2004) 1648.
- [18] T. Aspelmeier and A. Zippelius, The integrated density of states of the random graph Laplacian, J. Stat. Phys. 144 (2011) 759, arXiv:1008.1087 [cond-mat.dis-nn].
- [19] P. Akara-pipattana and O. Evnin, Random matrices with row constraints and eigenvalue distributions of graph Laplacians, J. Phys. A 56 (2023) 295001, arXiv:2212.06499 [cond-mat.dis-nn].
- [20] L. Avena, R. S. Hazra and N. Malhotra, Limiting spectra of inhomogeneous random graphs, arXiv:2312.02805 [math.PR].
-
[21]
O. Evnin and W. Horinouchi,
First return times on sparse random graphs,
arXiv:2408.10530 [cond-mat.dis-nn]. -
[22]
A. Hammerstein, Nichtlineare Integralgleichungen nebst Anwendungen,
Acta Math. 54 (1930) 117. -
[23]
R. Kühn, Spectra of sparse random matrices, J. Phys. A 41 (2008) 295002,
arXiv:0803.2886 [cond-mat.dis-nn]. - [24] K. E. Atkinson, A survey of numerical methods for solving nonlinear integral equations, J. Int. Eq. App. 4 (1992) 15.
- [25] S. Chandrasekhar, Radiative transfer (Oxford, 1950).
- [26] E. Brézin, Introduction to statistical field theory (Cambridge, 2010).
- [27] M. Helias and D. Dahmen, Statistical field theory for neural networks (Springer, 2020).
- [28] M. Mézard, G. Parisi and A. Zee, Spectra of Euclidean random matrices, Nucl. Phys. B 559 (1999) 689, arXiv:cond-mat/9906135.
- [29] G. Semerjian and L. F. Cugliandolo, Sparse random matrices: the eigenvalue spectrum revisited, J. Phys. A 35 (2002) 4837, arXiv:cond-mat/0202406.
- [30] J. Park and M. E. J. Newman, Solution of the two-star model of a network, Phys. Rev. E 70 (2004) 066146, arXiv:cond-mat/0405457.
- [31] F. L. Metz, G. Parisi and L. Leuzzi, Finite size correction to the spectrum of regular random graphs: an analytical solution, Phys. Rev. E 90 (2014) 052109, arXiv:1403.2582 [cond-mat.dis-nn].
- [32] A. Annibale and O. T. Courtney, The two-star model: exact solution in the sparse regime and condensation transition, J. Phys. A 48 (2015) 365001, arXiv:1504.06458 [cond-mat.dis-nn].
- [33] K. Truong and A. Ossipov, Statistical properties of eigenvectors and eigenvalues of structured random matrices, J. Phys. A 51 (2018) 065001, arXiv:1708.05345 [math-ph].
- [34] P. Akara-pipattana, T. Chotibut and O. Evnin, Resistance distance distribution in large sparse random graphs, J. Stat. Mech. 2022 (2022) 033404, arXiv:2107.12561 [cond-mat.dis-nn].
- [35] D. Venturelli, L. F. Cugliandolo, G. Schehr and M. Tarzia, Replica approach to the generalized Rosenzweig-Porter model, SciPost Phys. 14 (2023) 110, arXiv:2209.11732 [cond-mat.dis-nn].
- [36] J. W. Baron, A path integral approach to sparse non-Hermitian random matrices, arXiv:2308.13605 [cond-mat.dis-nn].
- [37] K. Efetov, Supersymmetry in disorder and chaos (Cambridge, 1996).
- [38] F. Wegner, Supermathematics and its applications in statistical physics (Springer, 2016).
- [39] G. Livan, M. Novaes and P. Vivo, Introduction to random matrices: theory and practice (Springer, 2018) arXiv:1712.07903 [math-ph].
- [40] V. A. R. Susca, P. Vivo and R. Kühn, Cavity and replica methods for the spectral density of sparse symmetric random matrices, SciPost Phys. Lect. Notes 33 (2021) 1, arXiv:2101.08029 [cond-mat.stat-mech].
- [41] S. Kumar, Superconvergence of a collocation-type method for Hammerstein equations, IMA J. Num. Anal. 7 (1987) 313; S. Kumar and I. H. Sloan, A new collocation-type method for Hammerstein integral equations, Math. Comp. 48 (1987) 585.
- [42] P. Das, M. M. Sahani, G. Nelakanti and G. Long, Legendre spectral projection methods for Fredholm–Hammerstein integral equations, J. Sci. Comp. 68 (2016) 213.
- [43] N. Nahid and G. Nelakanti, Convergence analysis of Galerkin and multi-Galerkin methods for nonlinear-Hammerstein integral equations on the half-line using Laguerre polynomials, Int. J. Comp. Math. 99 (2022) 808; R. Nigam, N. Nahid and G. Nelakanti, Non-linear integral equations on unbounded domain with global polynomials, App. Math. Comp. 471 (2024) 128588.
- [44] G. Szegő, Orthogonal polynomials (AMS, 1939), section 5.1.
- [45] SciPy 1.0 contributors, SciPy 1.0: Fundamental algorithms for scientific computing in Python, Nature Methods 17 (2020) 261, arXiv:1907.10121 [cs.MS].
- [46] Y. V. Fyodorov, Spectral properties of random reactance networks and random matrix pencils, J. Phys. A 32 (1999) 7429, arXiv:cond-mat/9906085.
- [47] J. Stäring, B. Mehlig, Y. V. Fyodorov and J. M. Luck, Random symmetric matrices with a constraint: the spectral density of random impedance networks, Phys. Rev. E 67 (2003) 047101, arXiv:cond-mat/0301127.
- [48] X. Ding and T. Jiang, Spectral distributions of adjacency and Laplacian matrices of random graphs, Ann. App. Prob. 20 (2010) 2086, arXiv:1011.2608 [math.PR].
- [49] J. S. Ball, Half-range generalized Hermite polynomials and the related Gaussian quadratures, SIAM J. Num. Anal. 40 (2002) 2311.
- [50] T. Rogers, K. Takeda, I. Pérez Castillo and R. Kühn, Cavity approach to the spectral density of sparse symmetric random matrices, Phys. Rev. E. 78 (2008) 031116, arXiv:0803.1553 [cond-mat.dis-nn].
- [51] F. L. Metz, I. Neri and T. Rogers, Spectral theory of sparse non-Hermitian random matrices, J. Phys. A 52 (2019) 434003, arXiv:1811.10416 [cond-mat.stat-mech].
- [52] G. Nakerst, S. Denisov and M. Haque, Random sparse generators of Markovian evolution and their spectral properties, Phys. Rev. E 108 (2023) 014102, arXiv:2302.12762 [cond-mat.stat-mech].
- [53] J. F. Epperson, On the Runge example, Am. Math. Month. 94 (1987) 329.
- [54] D. S. Lubinsky, A survey of weighted polynomial approximation with exponential weights, Surv. Appr. Th. 3 (2007) 1, arXiv:math/0701099.
- [55] O. Golinelli, Statistics of delta peaks in the spectral density of large random trees, arXiv:cond-mat/0301437.