Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Discovering reduced order model equations of many-body quantum systems using genetic programming: a technical report

Illya Bakurov Department of Computer Science and Engineering, Michigan State University, East Lansing, MI    Pablo Giuliani Facility for Rare Isotope Beams, Michigan State University, East Lansing, Michigan 48824, USA    Kyle Godbey Facility for Rare Isotope Beams, Michigan State University, East Lansing, Michigan 48824, USA    Nathan Haut Department of Computational Mathematics, Science, and Engineering, Michigan State University, East Lansing, MI    Wolfgang Banzhaf Department of Computer Science and Engineering, Michigan State University, East Lansing, MI    Witold Nazarewicz Facility for Rare Isotope Beams, Michigan State University, East Lansing, Michigan 48824, USA Department of Physics and Astronomy, Michigan State University, East Lansing, Michigan 48824, USA
(June 6, 2024)
Abstract

In this technical report we present first results of applying Genetic Programming (GP) to obtain equations for constructing reduced order models of quantum systems, with a particular interest in nuclear Density Functional Theory. We employ the reduced basis method to obtain reduced coordinates as the amplitudes of the reduced basis, and use GP to avoid the need of constructing the reduced equations through, for example, a Galerkin projection. The reduced order models constructed through GP show excellent accuracy and speed performance, including extrapolations in the controlling parameters, and show promise as an effective method for emulating computationally demanding calculations in nuclear physics.

I Introduction

Computational models for many-body quantum systems have quickly grown in complexity as theories get refined, more experimental data becomes available, and computers become more powerful. In fields with practical applications, such as nuclear physics, the demand for more accurate descriptions of nature, together with a push for predictions with well-quantified uncertainties [1, 2], has driven various efforts in the field to create surrogate models that can learn low-dimensional representations of the underlying system equations [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29]. Several of the developed surrogate models are based on model reduction approaches [30, 31, 32] that approximates the solution manifold of equations parametrized by control variables α𝛼\alphaitalic_α (for example, model parameters) by identifying reduced coordinates for the system, and then constructing equations for these coordinates.

In particular, the reduced basis method (RBM) [33, 34] traditionally works by identifying the reduced coordinates as the amplitudes of a reduced basis informed by previous high fidelity (HF) evaluations of the full system, and creates the reduced equations by projecting the underlying operators using the Petrov-Galerkin scheme [23]. This approach has shown great performance with speed ups of the original calculations usually between 2 and 7 orders of magnitude, yet it has been challenging to successfully apply it when the underlying operators are non-affine [28] or non-linear [25], usually requiring the mitigation of such features by hyperreduction schemes [35] such as the Empirical Interpolation Method [36, 37]. Alternative avenues to construct the reduced equations without explicitly projecting the original full system’ model, but rather by relying on data directly obtained from HF calculations have recently started to be explored [38, 39, 40, 41] (see for example [42] for a related application in other fields). Some of these developments follow a supervised machine learning (SML) approach in where various algorithms are trained to reproduce the dynamics of the reduced coordinates as the control parameters α𝛼\alphaitalic_α change. In this work, we explore the application of Genetic Programming (GP) to obtain the reduced equations from data. This is achieved by creating an ensemble of initial possible equations that can reproduce the response of the reduced coordinates to parametric changes, and then performing an evolution algorithm in which the current population of models goes through changes to improve their fitness, measured as their accuracy in capturing the underlying reduced dynamics. We test this framework in two example problems [39] that exhibit non-affine and non-linear characteristics that make them problematic to emulate using the traditional Galerkin projection approach.

II Formalism

II.1 Quantum systems

The first application problem we consider is a modified version of the Gross-Pitaevskii equation [43, 44] (a nonlinear Schrodinger equation used to describe dilute Bose-Einstein condensates):

(d2dx2+κx2+qρ(x)σ)ϕ(i)(x)=Eiϕ(i)(x),superscript𝑑2𝑑superscript𝑥2𝜅superscript𝑥2𝑞𝜌superscript𝑥𝜎superscriptitalic-ϕ𝑖𝑥subscript𝐸𝑖superscriptitalic-ϕ𝑖𝑥\Big{(}-\frac{d^{2}}{dx^{2}}+\kappa x^{2}+q\rho(x)^{\sigma}\Big{)}\phi^{(i)}(x% )=E_{i}\phi^{(i)}(x),( - divide start_ARG italic_d start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG italic_d italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG + italic_κ italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_q italic_ρ ( italic_x ) start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT ) italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ) = italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ) , (1)
Refer to caption
Figure 1: Five wave functions ϕi(x)subscriptitalic-ϕ𝑖𝑥\phi_{i}(x)italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ( italic_x ) (1i51𝑖51\leq i\leq 51 ≤ italic_i ≤ 5) and their associated density self-interaction term ρ(x)σ=(i5|ϕ(i)(x)|2)σ𝜌superscript𝑥𝜎superscriptsuperscriptsubscript𝑖5superscriptsuperscriptitalic-ϕ𝑖𝑥2𝜎\rho(x)^{\sigma}=\big{(}\sum_{i}^{5}|\phi^{(i)}(x)|^{2}\big{)}^{\sigma}italic_ρ ( italic_x ) start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT = ( ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT | italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ) start_POSTSUPERSCRIPT italic_σ end_POSTSUPERSCRIPT for the modified Gross-Pitaevskii equation (1) for κ=1,q=2,σ=2.7formulae-sequence𝜅1formulae-sequence𝑞2𝜎2.7\kappa=1,\ q=2,\ \sigma=2.7italic_κ = 1 , italic_q = 2 , italic_σ = 2.7.

in where the Hamiltonian for each wave function ϕ(i)superscriptitalic-ϕ𝑖\phi^{(i)}italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT consists of a trapping term proportional to x2superscript𝑥2x^{2}italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, as well as with a density ρ(x)=iN|ϕ(i)(x)|2𝜌𝑥superscriptsubscript𝑖𝑁superscriptsuperscriptitalic-ϕ𝑖𝑥2\rho(x)=\sum_{i}^{N}|\phi^{(i)}(x)|^{2}italic_ρ ( italic_x ) = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT | italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT compromised by all N𝑁Nitalic_N particles in the system. The control parameters in this case are α={κ,q,σ}𝛼𝜅𝑞𝜎\alpha=\{\kappa,q,\sigma\}italic_α = { italic_κ , italic_q , italic_σ }. Figure 1 shows the wave functions and their associated density for an example value of the control parameters, while Table 1 shows the ranges of the three parameters explored.

The second application problem we consider is Density Functional Theory (DFT) for a Skyrme type interaction [3]. DFT is used in nuclear physics to describe the atomic nucleus through the mean-field perspective, where the nucleons (protons and neutrons) interact with average densities and currents that are constructed from the nucleon wavefunctions (see  [45] for more details). The Hamiltonian of the system can be written as:

h^α(i)[Φ]ϕ(i)=Eiϕ(i),superscriptsubscript^𝛼𝑖delimited-[]Φsuperscriptitalic-ϕ𝑖subscript𝐸𝑖superscriptitalic-ϕ𝑖\hat{h}_{\alpha}^{(i)}[\Phi]\phi^{(i)}=E_{i}\phi^{(i)},over^ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT [ roman_Φ ] italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT = italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT , (2)

where the wavefunction of each orbital ϕisubscriptitalic-ϕ𝑖\phi_{i}italic_ϕ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT interacts with a Hamiltonian h^α(i)superscriptsubscript^𝛼𝑖\hat{h}_{\alpha}^{(i)}over^ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT that depends on all of them Φ={ϕ(i)}i=1NΦsuperscriptsubscriptsuperscriptitalic-ϕ𝑖𝑖1𝑁\Phi=\{\phi^{(i)}\}_{i=1}^{N}roman_Φ = { italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT } start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT. This single particle Hamiltonian is derived from the Skyrme effective interaction [46, 47, 48, 49], with the nuclear part is made of time-even densities [50]:

t(r)=Ctρρt2+CtρΔρρtΔρt+Ctτρtτt+CtJJt2+CtρJρt𝐉t,subscript𝑡𝑟superscriptsubscript𝐶𝑡𝜌superscriptsubscript𝜌𝑡2superscriptsubscript𝐶𝑡𝜌Δ𝜌subscript𝜌𝑡Δsubscript𝜌𝑡superscriptsubscript𝐶𝑡𝜏subscript𝜌𝑡subscript𝜏𝑡superscriptsubscript𝐶𝑡𝐽superscriptsubscript𝐽𝑡2superscriptsubscript𝐶𝑡𝜌𝐽subscript𝜌𝑡subscript𝐉𝑡{\cal H}_{t}(r)=C_{t}^{\rho}\rho_{t}^{2}+C_{t}^{\rho\Delta\rho}\rho_{t}\Delta% \rho_{t}+C_{t}^{\tau}\rho_{t}\tau_{t}+C_{t}^{J}\tensor{J}_{t}^{2}+C_{t}^{\rho% \nabla J}\rho_{t}\nabla\cdot\mathbf{J}_{t},caligraphic_H start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ( italic_r ) = italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ρ end_POSTSUPERSCRIPT italic_ρ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ρ roman_Δ italic_ρ end_POSTSUPERSCRIPT italic_ρ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT roman_Δ italic_ρ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_τ end_POSTSUPERSCRIPT italic_ρ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT italic_τ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT + italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_J end_POSTSUPERSCRIPT over⃡ start_ARG italic_J end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ρ ∇ italic_J end_POSTSUPERSCRIPT italic_ρ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT ∇ ⋅ bold_J start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , (3)

with the subscript t=(0,1)𝑡01t=(0,1)italic_t = ( 0 , 1 ) denoting isoscalar and isovector densities, respectively. The parameters of this EDF model the interaction between the wavefunctions and the respective nucleonic densities in question (the kinetic energy density. We usually re-parametrize these constants by terms representing nuclear-matter properties [51, 52]:

α={ρc,ENM/A,KNM,JsymNM,LsymNM,Ms,CtρΔρ,CtρJ}.𝛼subscript𝜌csuperscript𝐸NM𝐴superscript𝐾NMsuperscriptsubscript𝐽symNMsuperscriptsubscript𝐿symNMsuperscriptsubscript𝑀𝑠superscriptsubscript𝐶𝑡𝜌Δ𝜌superscriptsubscript𝐶𝑡𝜌𝐽\alpha=\left\{\rho_{\text{c}},E^{\text{NM}}/A,K^{\text{NM}},J_{\text{sym}}^{% \text{NM}},L_{\text{sym}}^{\text{NM}},M_{s}^{*},C_{t}^{\rho\Delta\rho},C_{t}^{% \rho\nabla J}\right\}.italic_α = { italic_ρ start_POSTSUBSCRIPT c end_POSTSUBSCRIPT , italic_E start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT / italic_A , italic_K start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT , italic_J start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT , italic_L start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT , italic_M start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT , italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ρ roman_Δ italic_ρ end_POSTSUPERSCRIPT , italic_C start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_ρ ∇ italic_J end_POSTSUPERSCRIPT } . (4)

For this work we focus on 48Ca which, within the spherical DFT construction contains 6 independent proton orbitals and 7 independent neutron orbitals. Table 1 shows the ranges of the two varied parameters for this study, JsymNMsuperscriptsubscript𝐽symNMJ_{\text{sym}}^{\text{NM}}italic_J start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT and LsymNMsuperscriptsubscript𝐿symNML_{\text{sym}}^{\text{NM}}italic_L start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT, which represent the symmetry energy and its slope [52] respectively.

Parameter Range
κ𝜅\kappaitalic_κ [0.5,3.0]
q𝑞qitalic_q [-2.0,2.0]
σ𝜎\sigmaitalic_σ [0.5,3.0]
JsymNMsuperscriptsubscript𝐽symNMJ_{\text{sym}}^{\text{NM}}italic_J start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT [20,40]
LsymNMsuperscriptsubscript𝐿symNML_{\text{sym}}^{\text{NM}}italic_L start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT [25,65]
Table 1: Parameters varied for each of the two explored problems with their associated ranges. For the DFT case, both the symmetry energy JsymNMsuperscriptsubscript𝐽symNMJ_{\text{sym}}^{\text{NM}}italic_J start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT and its slope LsymNMsuperscriptsubscript𝐿symNML_{\text{sym}}^{\text{NM}}italic_L start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT are expressed in MeV. A total of 492 evaluations of the modified Gross-Pitaevskii densities were used for training and 212 for testing. A total of 25 evaluations of the DFT protons and neutrons densities were used for training and 16 for testing.

II.2 Reduced basis method

The first step within the RBM formalism consists of building a reduced basis of the solution to the parametrized equation we are interested in solving:

Fα[ϕ(i)]=0,subscript𝐹𝛼delimited-[]superscriptitalic-ϕ𝑖0F_{\alpha}[\phi^{(i)}]=0,italic_F start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT [ italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ] = 0 , (5)

where we rewrite the equation in terms of the operator Fαsubscript𝐹𝛼F_{\alpha}italic_F start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT which is, for example Fα[ϕ(i)]=[h^α(i)[Φ]Ei]ϕ(i)subscript𝐹𝛼delimited-[]superscriptitalic-ϕ𝑖delimited-[]superscriptsubscript^𝛼𝑖delimited-[]Φsubscript𝐸𝑖superscriptitalic-ϕ𝑖F_{\alpha}[\phi^{(i)}]=\big{[}\hat{h}_{\alpha}^{(i)}[\Phi]-E_{i}\big{]}\phi^{(% i)}italic_F start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT [ italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ] = [ over^ start_ARG italic_h end_ARG start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT [ roman_Φ ] - italic_E start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ] italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT in the DFT case 2. For each of the two examples described (1) and (2) we could approximate each wave functions as:

ϕ(i)(x;α)ϕ^(i)(x;α)=knak(α)ϕk(i)(x),superscriptitalic-ϕ𝑖𝑥𝛼superscript^italic-ϕ𝑖𝑥𝛼superscriptsubscript𝑘𝑛subscript𝑎𝑘𝛼subscriptsuperscriptitalic-ϕ𝑖𝑘𝑥\phi^{(i)}(x;\alpha)\approx\hat{\phi}^{(i)}(x;\alpha)=\sum_{k}^{n}a_{k}(\alpha% )\phi^{(i)}_{k}(x),italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ; italic_α ) ≈ over^ start_ARG italic_ϕ end_ARG start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ; italic_α ) = ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_α ) italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_x ) , (6)

where the few n𝑛nitalic_n coefficients ak(α)subscript𝑎𝑘𝛼a_{k}(\alpha)italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_α ) represent latent reduced coordinates that depend on the control variables α𝛼\alphaitalic_α, while the reduced basis ϕk(i)superscriptsubscriptitalic-ϕ𝑘𝑖\phi_{k}^{(i)}italic_ϕ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT is constructed informed by previous HF solutions for different values of α𝛼\alphaitalic_α. In several recent successful RBM applications to nuclear physics [24, 23, 25, 28, 41, 29], the reduced basis is constructed by performing a principal component analysis [53], or singular value decomposition [54] on the set of HF solutions.

Refer to caption
Figure 2: Densities calculated for various values of the controlling parameters α𝛼\alphaitalic_α specified in Table 1. The inset plots show the exponential decay of the singular values from the SVD. Panel a) Density ρ𝜌\rhoitalic_ρ for the modified Gross-Pitaevskii Eq. (1). Panel b) Densities of neutrons ρnsubscript𝜌𝑛\rho_{n}italic_ρ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT and decay of singular values for both protons ρpsubscript𝜌𝑝\rho_{p}italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT and neutrons.

The second step is to obtain the equations that describe the response of the latent variables aksubscript𝑎𝑘a_{k}italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT to changes in the parameters α𝛼\alphaitalic_α. In many RBM applications (including several of the previously cited work), a way to achieve this is to project the equations into a subspace spanned by test functions ψjsubscript𝜓𝑗\psi_{j}italic_ψ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT [23, 33, 34]:

ψj|Fα(ϕ^(i))=0,for 1jn,formulae-sequenceinner-productsubscript𝜓𝑗subscript𝐹𝛼superscript^italic-ϕ𝑖0for 1𝑗𝑛\langle\psi_{j}|F_{\alpha}(\hat{\phi}^{(i)})\rangle=0,\quad\text{for }1\leq j% \leq n,⟨ italic_ψ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT | italic_F start_POSTSUBSCRIPT italic_α end_POSTSUBSCRIPT ( over^ start_ARG italic_ϕ end_ARG start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ) ⟩ = 0 , for 1 ≤ italic_j ≤ italic_n , (7)

which creates n𝑛nitalic_n equations to be solved for the n𝑛nitalic_n unknown coefficients aksubscript𝑎𝑘a_{k}italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT (in the case of coupled equations, such as the non-linear systems in Eq. (2), there would be n𝑛nitalic_n projection equations per original wave-function equation).

In our previous work [23], we successfully constructed a reduced order model for the Gross-Pitaevskii equation (1) in the case of N=1𝑁1N=1italic_N = 1 and σ=1𝜎1\sigma=1italic_σ = 1 using the described Galerkin projection scheme (7), yet for bigger powers σ𝜎\sigmaitalic_σ for the density, or even fractional powers, this approach proves inadequate to create efficient reduced equations. The challenge lies in the inability to pre-compute the Galerkin projections in the offline stage of building the emulator, since the equations are non-affine and non-linear in σ𝜎\sigmaitalic_σ and the wavefunctions. Various terms in realistic EDFs [55, 56, 57] contains fractional powers of densities, which motivates the development of alternative approaches to obtain reduced equations, such as the one we present next.

II.3 Genetic Programming

Refer to caption
Figure 3: A plot exemplifying individuals representation in Tree Genetic Programming.

Genetic Programming (GP) 111Often it is abbreviated as GP, but not to be confused with the same abbreviation commonly used in nuclear physics emulation for Gaussian Processes. Instead, it is a bio-inspired search method making use of processes of stochastic variation and cumulative selection. is a type of Artificial Intelligence (AI) that falls under the broader category of evolutionary computation (EC), which is a subset of AI inspired by biological evolution mechanisms such as selection, mutation, and crossover to solve problems [58, 59, 60, 61]. Specifically, GP is a population-based stochastic iterative search algorithm in the search space of computer programs. Like other evolutionary meta heuristics, GP evolves a set of candidate solutions (the population) by mimicking the basic principles of Darwinian evolution. The evolutionary process involves an iterative application of a fitness-based selection of the candidate solutions and their variation throughout genetically-inspired operators, such as crossover and mutation [58]. Three of the most typical representations employed in GP are expression trees in tree GP (TGP), linear sequences of instructions in linear GP (LGP) and circuit-type graphs in Cartesian GP (CGP). Here we shall be mostly concerned with TGP, and want to only briefly touch the other two representations. In this work, we will use TGP and will refer to it as simply GP. In GP, the evolving programs are constructed by composing elements belonging to two specific, predefined, sets: a set of primitive functions F𝐹Fitalic_F, which appear as the internal nodes of the trees, a set of terminal input features T𝑇Titalic_T and constant values C𝐶Citalic_C, which represent the leaves of the trees. In the context of SML problem-solving, the trees encode symbolic expressions mapping inputs X𝑋Xitalic_X to the outputs y𝑦yitalic_y. Typically, GP is used with the so-called subtree mutation and swap crossover [58]. The latter exchanges two randomly selected subtrees between two different parent individuals. The former randomly selects a subtree in the structure of the parent individual and replaces it with a new, randomly generated tree. Algorithm 1 provides a pseudo-code for the GP, whereas Figure 3 provides a visual representation of GP-models and their mapping between tree and function.

1. Create an n𝑛nitalic_n-sized initial population of random individuals P𝑃Pitalic_P 2. Repeat until matching a termination condition, typically the number of generations (a) Calculate fitness for-all\forall individual i𝑖iitalic_i in P𝑃Pitalic_P (b) Create an empty population Psuperscript𝑃P^{\prime}italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (c) Repeat until Psuperscript𝑃P^{\prime}italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT contains n𝑛nitalic_n individuals i. With probability Pcsubscript𝑃𝑐P_{c}italic_P start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT choose the crossover operator. Alternatively, mutation will be selected (1Pc1subscript𝑃𝑐1-P_{c}1 - italic_P start_POSTSUBSCRIPT italic_c end_POSTSUBSCRIPT) ii. Select 2 individuals using some selection algorithm iii. Apply the operator selected in 2.c.i to the individuals selected in 2.c.ii iv. Insert the individuals from 2.c.iii into Psuperscript𝑃P^{\prime}italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (d) Replace P𝑃Pitalic_P with Psuperscript𝑃P^{\prime}italic_P start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT (and copy the elite(s) from P𝑃Pitalic_P, if elitism is used) 3. Return the best individual in P𝑃Pitalic_P
Algorithm 1 Pseudo-code for the typical GP algorithm.

In an ML context, GP provides a user with a completely different experience in terms of transparency and interpretability of the final models when compared to more “black-box” models of, say, deep artificial neural networks (ANNs). This is because GP typically outputs symbolic expressions or tree structures representing programs, which are generally interpretable and can be analyzed by humans. By manipulating discrete structures, the inner workings of a GP-based model are clearer, and the relationships between features and predictions are explicitly defined. As a result, evolved solutions can offer valuable insights into how the model arrives at its predictions, allowing humans to understand the decision-making process and relate this to their domain-knowledge.

Although GP is not the only “white-box” approach in the spectrum of ML methods, it allows for unprecedented flexibility of representations and modularity, making it suitable for numerous tasks: classification [62, 63, 64], regression [65, 66, 67], feature engineering [68, 69], manifold learning [70], active learning [71], image classification [72, 73, 74, 75], image segmentation [76], image enhancement [77, 78], automatic generation of ML pipelines [79] and even neural architecture search [80, 81]. The area of explainable AI receives consistently more attention from both practitioners and researchers [82, 83] and GP has gained popularity where human-interpretable solutions are paramount. Real-world examples include medical image segmentation [84, 85], prediction of human oral bioavailability of drugs [86], skin cancer classification from lesion images [87], and even conception of models of visual perception [88], etc. Besides being able to provide interpretable models, there is evidence that GP can also help to unlock the behaviour of black-box models [89, 76].

We proceed to apply GP to learn equations for the reduced coordinates the particle density:

ρ(x;α)=iN|ϕ(i)(x)|2ρ^(x;α)=knak(α)ρk(x).𝜌𝑥𝛼superscriptsubscript𝑖𝑁superscriptsuperscriptitalic-ϕ𝑖𝑥2^𝜌𝑥𝛼superscriptsubscript𝑘𝑛subscript𝑎𝑘𝛼subscript𝜌𝑘𝑥\rho(x;\alpha)=\sum_{i}^{N}|\phi^{(i)}(x)|^{2}\approx\hat{\rho}(x;\alpha)=\sum% _{k}^{n}a_{k}(\alpha)\rho_{k}(x).italic_ρ ( italic_x ; italic_α ) = ∑ start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT | italic_ϕ start_POSTSUPERSCRIPT ( italic_i ) end_POSTSUPERSCRIPT ( italic_x ) | start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT ≈ over^ start_ARG italic_ρ end_ARG ( italic_x ; italic_α ) = ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_α ) italic_ρ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_x ) . (8)

That is, for both problems in Eq. (1) and Eq. (2), we learn equations that can describe how each coefficient aksubscript𝑎𝑘a_{k}italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT changes as we change their respective parameters α𝛼\alphaitalic_α within the defined ranges in Table 1.

In both cases we select a total of three basis n=3𝑛3n=3italic_n = 3. For the DFT case we analyze the proton density ρpsubscript𝜌𝑝\rho_{p}italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT as well as the neutron density ρnsubscript𝜌𝑛\rho_{n}italic_ρ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT. The reduced basis ρk(x)subscript𝜌𝑘𝑥\rho_{k}(x)italic_ρ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_x ) for each of the respective densities is obtained following a singular value decomposition on a set of HF solutions for the training parameters (see Table 1). Note that, since the original equations Eq. (1) and Eq. (2) can’t directly be solved by only considering the density (they involve all wave functions), this GP approach is effectively learning reduced equations that would not be possible to obtain through a direct Galerkin projection.

Table 2 lists the hyper-parameters (HPs) used for GP, along with cross-validation settings. The HPs were selected following common practices found across the literature to avoid a computationally demanding tuning phase. R2superscript𝑅2R^{2}italic_R start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT was used as a fitness function [90, 91, 92] as it was found to converge faster, generalize better, even when only a few data points are available. A common practice in GP is to protect operators to avoid undefined mathematical behaviour of the function set by defining some ad-hoc behaviour at those points like, for instance, returning the value 1 in the case of a division by zero to make it possible for genetic programming to synthesize constants by using x/x𝑥𝑥x/xitalic_x / italic_x [58]. However, it was shown that these techniques present several shortcomings in the vicinity of mathematical singularities [90]. In this study, programs that produced invalid values were automatically assigned a bad fitness value, making them unlikely to be selected. By diminishing the selection attractiveness of solutions that produced invalid values, we expect to obtain models whose fitness landscape is less sharp [93].

For GP we used the commercially available DataModeler system [94]. The parameter set used is shown in Table 2.

Parameters Values
№runs 16
Population size 300
Functions (F𝐹Fitalic_F) {+, -, x, /  -x, CF, x2superscript𝑥2x^{2}italic_x start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, x1superscript𝑥1x^{-1}italic_x start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT, RP, x𝑥\sqrt{x}square-root start_ARG italic_x end_ARG, exsuperscript𝑒𝑥e^{x}italic_e start_POSTSUPERSCRIPT italic_x end_POSTSUPERSCRIPT, log(x)𝑙𝑜𝑔𝑥log(x)italic_l italic_o italic_g ( italic_x ), xnsuperscript𝑥𝑛x^{n}italic_x start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT, x1/3superscript𝑥13x^{1/3}italic_x start_POSTSUPERSCRIPT 1 / 3 end_POSTSUPERSCRIPT, x3superscript𝑥3x^{3}italic_x start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT}
Selection Pareto tournament selection size 30
Genetic operators {subtree crossover, subtree mutation, depth preserving subtree mutation}
P(C)𝑃𝐶P(C)italic_P ( italic_C ) 0.9
P(M)𝑃𝑀P(M)italic_P ( italic_M ) 0.1
Maximum complexity 1000 (Visitation Length [95])
Stopping criteria 5 minutes
Table 2: Summary of the GP hyper-parameters. Note that P(C)𝑃𝐶P(C)italic_P ( italic_C ) and P(M)𝑃𝑀P(M)italic_P ( italic_M ) indicate the crossover and the mutation probabilities, respectively. CF represents the continued fraction operator d/(b+c/a)𝑑𝑏𝑐𝑎d/(b+c/a)italic_d / ( italic_b + italic_c / italic_a ) and RP represent the rational polynomial function (b+d+f)/(a+c+e)𝑏𝑑𝑓𝑎𝑐𝑒(b+d+f)/(a+c+e)( italic_b + italic_d + italic_f ) / ( italic_a + italic_c + italic_e ).

To contrast the GP ability to learn the equations from data, we also implemented a linear regression model. In the context of SML, linear models are a class of algorithms that assume a linear relationship between the input variables (features) and the output variable (target), in this case the controlling parameters α𝛼\alphaitalic_α and the reduced coefficients aksubscript𝑎𝑘a_{k}italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT, respectively. To make a prediction, the input features are multiplied by the learnable weights, summed together and added a learnable constant called the intercept or bias [96]. Formally: y=β0+β1x1+β2x2++βnxn𝑦subscript𝛽0subscript𝛽1subscript𝑥1subscript𝛽2subscript𝑥2subscript𝛽𝑛subscript𝑥𝑛y=\beta_{0}+\beta_{1}x_{1}+\beta_{2}x_{2}+\ldots+\beta_{n}x_{n}italic_y = italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT + italic_β start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_β start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT + … + italic_β start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT italic_x start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT, where y𝑦yitalic_y is the target, β0subscript𝛽0\beta_{0}italic_β start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT is the intercept, and βksubscript𝛽𝑘\beta_{k}italic_β start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT is a learnable weight associated with input feature xksubscript𝑥𝑘x_{k}italic_x start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT. The learnable weights and the intercept are usually estimated from the data using Ordinary Least Squares (OLS) estimation procedure to minimize the sum of squared residuals (i.e., difference between the observed target values in the training dataset, and the values predicted by the linear model). Formally: minβi=1n(yiy^i)2subscript𝛽superscriptsubscript𝑖1𝑛superscriptsubscript𝑦𝑖subscript^𝑦𝑖2\min_{\beta}\sum_{i=1}^{n}(y_{i}-\hat{y}_{i})^{2}roman_min start_POSTSUBSCRIPT italic_β end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_i = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT ( italic_y start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT - over^ start_ARG italic_y end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT, where y^isubscript^𝑦𝑖\hat{y}_{i}over^ start_ARG italic_y end_ARG start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT is the model’s prediction for data instance i𝑖iitalic_i.

Refer to caption
Figure 4: Performance analysis of the SML approaches for the modified Gross-Pitaevskii Eq. (1). Panel a): Computational Accuracy vs Time plot [28] that shows for each of the 212 testing evaluations the norm of the residual in the vertical axis and the calculation time in the horizontal axis for cubic linear regression with L1 regularization (dark red), with L2 regularization (dark green), and with GP (blue). The corresponding dashed vertical lines are the median time for each method, while the dotted horizontal lines the median error. The horizontal dashed black line shows the median error obtained when optimizing the coefficients in Eq. (8) directly against the test solution. The vertical grey solid line shows the median calculation time of the HF solver Python implementation, which is more than 4 orders of magnitude slower than any of the three methods. Panels b) and c) show the evolution of the first coefficient of the reduced basis a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT as the parameters κ𝜅\kappaitalic_κ and σ𝜎\sigmaitalic_σ change, respectively, while the other parameters are held constant. The black dashed line shows the optimal value of the coefficient as the parameters change. The dashed vertical grey line shows the boundary on which the SML methods were trained, after which the calculations are effectively extrapolating.

In order to constraint the complexity of the model and improve generalization ability, regularization terms can be added to the sum of squared residuals. L1 regularization promotes models’ sparsity and is useful to tackle high dimensional tasks (i.e. with many input features), especially if many are expected to be irrelevant. L2 regularization is frequently used when the input features are expected to be informative but their impact has to be reduced to avoid overfitting, and/or are highly correlated (multicollinear). Formally, the L1 term is given by λ1k=0n|βk|subscript𝜆1superscriptsubscript𝑘0𝑛subscript𝛽𝑘\lambda_{1}\sum_{k=0}^{n}|\beta_{k}|italic_λ start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT | italic_β start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT |, whereas L2 term is given by λ2k=0nβk2subscript𝜆2superscriptsubscript𝑘0𝑛superscriptsubscript𝛽𝑘2\lambda_{2}\sum_{k=0}^{n}\beta_{k}^{2}italic_λ start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_k = 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT italic_β start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT. For creating high-order interactions between the original input features X𝑋Xitalic_X, one can generate a new feature set Xsuperscript𝑋X^{\prime}italic_X start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT consisting of all polynomial combinations of the features up to a given degree. Provided Xsuperscript𝑋X^{\prime}italic_X start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT, a linear model can learn non-linear relationships. In this study, we use regularized linear regression with polynomial terms and we rely on scikit-learn implementation [97, 98].

Note that regularized linear regression has been long used in scientific applications. Sparse-identification of non-linear dynamics (SINDy) is a popular method for discovering governing equations in dynamic systems [31]. Essentially it utilizes linear regression with the L1 regularizing term to promote sparsity, yielding smaller models, which can offer better interpretability, which is considered an appealing property. SINDy has been successfully applied to find governing equations in many different applications [99, 100, 101, 102, 103]. Nevertheless, SINDy-like approaches present a conspicuous disadvantage: all of the non-linear relationships to be included in the model search must be manually pre-defined, which requires some degree of expertise. This can be particularly limiting when applied to a problem where valuable non-linear relationships are unexpected, and is one of the advantages we expect from GP.

III Results

Refer to caption
Figure 5: Same as Panel a) of Figure 4 but for the case of the neutron density ρn(x)subscript𝜌𝑛𝑥\rho_{n}(x)italic_ρ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_x ) in DFT for 16 values of α𝛼\alphaitalic_α in the testing set. The vertical grey solid line represent the typical spherical DFT calculation time in fortran. This time is faster than for the simpler case in Figure 4 since research codes, like the one used here for DFT, have been optimized for performance. Yet, all SML models which show similar times to those obtained in the case of the modified Gross-Pitaevskii equation are faster by almost 4 orders of magnitude. The horizontal dashed black line shows the median of the optimal coefficients. We observed very similar results for the proton density ρp(x)subscript𝜌𝑝𝑥\rho_{p}(x)italic_ρ start_POSTSUBSCRIPT italic_p end_POSTSUBSCRIPT ( italic_x ).

Once the models were trained, we tested their generalization ability on the previously unseen combinations of parameters belonging to the respective test sets. Given a set of parameters αisubscript𝛼𝑖\alpha_{i}italic_α start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT from the test set, we quantify the accuracy performance as the root mean squared error (or L2superscript𝐿2L^{2}italic_L start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT norm) between the density obtained from a high-fidelity solver (HF) and the one obtained from the reduced coordinates equation ϕ(x;αi)knak^(αi)ϕk(x)italic-ϕ𝑥subscript𝛼𝑖superscriptsubscript𝑘𝑛^subscript𝑎𝑘subscript𝛼𝑖subscriptitalic-ϕ𝑘𝑥\phi(x;\alpha_{i})\approx\sum_{k}^{n}\hat{a_{k}}(\alpha_{i})\phi_{k}(x)italic_ϕ ( italic_x ; italic_α start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) ≈ ∑ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_n end_POSTSUPERSCRIPT over^ start_ARG italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG ( italic_α start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT ) italic_ϕ start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT ( italic_x ), where ak^^subscript𝑎𝑘\hat{a_{k}}over^ start_ARG italic_a start_POSTSUBSCRIPT italic_k end_POSTSUBSCRIPT end_ARG were predicted by the respective SML models as a function of αisubscript𝛼𝑖\alpha_{i}italic_α start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. We quantify the speed performance as the time it takes for each procedure to estimate the reduced coordinates given the target αisubscript𝛼𝑖\alpha_{i}italic_α start_POSTSUBSCRIPT italic_i end_POSTSUBSCRIPT. Figure 4 shows the performance of each SML model (linear regression and GP) for the modified Gross-Pitaevskii equation. Panel a) shows the accuracy and speed of the three methods. The horizontal dashed black line shows the median error obtained when the coefficients in Eq. (8) are optimized to directly reproduce the test solutions, which serves as a lower bound for the error, only possible to be improved by increasing the number of kept principal components beyond n=3𝑛3n=3italic_n = 3. Both GP and LR3-L2 attain residuals comparable with this bound. In terms of speed, all three methods are more than 4 orders of magnitude faster than the HF solver.

Panels b) and c) of Figure 4 show how the first coefficient of the reduced basis a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT evolves as the parameters κ𝜅\kappaitalic_κ and σ𝜎\sigmaitalic_σ change, respectively, while the other parameters are held constant. This is known as ceteris paribus or isolated effect and is frequently used when interpreting the coefficients of a linear regression model, as it assesses the effect of a particular input feature (predictor) on the response (dependent variable) while keeping all other predictors fixed. Analogous to panel a), the black dashed line represents the evolution of the optimal value of a1subscript𝑎1a_{1}italic_a start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT as the parameters α𝛼\alphaitalic_α change. The GP has appreciably better performance in reproducing this optimal coefficient that the other two methods, and it excells particularly in extrapolating beyond the dashed vertical line which denotes the maximum value of the parameters to which all the models were trained. This suggests that the GP has learned the reduced dynamics with less overfitting.

Figure 5 shows analogous results as panel a) of Figure 4 for the DFT neutron density ρn(x)subscript𝜌𝑛𝑥\rho_{n}(x)italic_ρ start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_x ) of 48Ca (comparable results were obtained for the proton density). Similar to the performance shown in the case of the modified Gross-Pitaevskii equation, the GP method (as well as LR3-L2) is able to learn the dynamics almost perfectly, showing accuracy that is almost on top of the optimized coefficients indicated by the black dashed line. The computation times of all SML methods are comparable to the easier problem in Figure 4. Indeed, it is remarkable that GP is able to find effective equations that explain the change in the nuclear densities as the two parameters JsymNMsuperscriptsubscript𝐽symNMJ_{\text{sym}}^{\text{NM}}italic_J start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT and LsymNMsuperscriptsubscript𝐿symNML_{\text{sym}}^{\text{NM}}italic_L start_POSTSUBSCRIPT sym end_POSTSUBSCRIPT start_POSTSUPERSCRIPT NM end_POSTSUPERSCRIPT are widely varied. The reduced order model we have constructed not only is agnostic to the high dimensional grid-space in the coordinate x𝑥xitalic_x, but it is also learning the solution manifold of the densities without having to directly track the underlying wave functions, and in the case of DFT the other densities involved. These results poise GP as a promising tool for drastically reducing the computation cost of existing complex nuclear calculations, as well as for model discovery directly from experimental data.

Acknowledgements

This technical report is presenting initial results of an ongoing collaboration between the Facility for Rare Isotope Beams and the Department of Computer Science and Engineering at Michigan State University. We are grateful to Edgard Bonilla for useful discussions.

References

  • Phillips et al. [2021] D. R. Phillips, R. J. Furnstahl, U. Heinz, T. Maiti, W. Nazarewicz, F. M. Nunes, M. Plumlee, M. T. Pratola, S. Pratt, F. G. Viens,  and S. M. Wild, J. Phys. G Nucl. Part. Phys. 48, 072001 (2021).
  • Aidala et al. [2023] C. Aidala, A. Aprahamian, P. Bedaque, L. Bernstein, J. Carlson, M. Carpenter, K. Chipps, V. Cirigliano, I. Cloet, A. de Gouvea, et al.A New Era Of Discovery: The 2023 Long-Range Plan For Nuclear Science, Tech. Rep. (Lawrence Livermore National Laboratory (LLNL), Livermore, CA (United States), 2023).
  • McDonnell et al. [2015] J. McDonnell, N. Schunck, D. Higdon, J. Sarich, S. Wild,  and W. Nazarewicz, Physical review letters 114, 122501 (2015).
  • Wesolowski et al. [2019] S. Wesolowski, R. J. Furnstahl, J. A. Melendez,  and D. R. Phillips, J. Phys. G Nucl. Part. Phys. 46, 045102 (2019).
  • Melendez et al. [2019] J. A. Melendez, R. J. Furnstahl, D. R. Phillips, M. T. Pratola,  and S. Wesolowski, Phys. Rev. C 100, 044001 (2019).
  • Sürer et al. [2022] O. Sürer, F. M. Nunes, M. Plumlee,  and S. M. Wild, Phys. Rev. C 106, 12 (2022).
  • Chan et al. [2023] M. Y.-H. Chan, M. Plumlee,  and S. M. Wild, Technometrics  (2023), 10.1080/00401706.2023.2210170.
  • Svensson et al. [2023] I. Svensson, A. Ekström,  and C. Forssén, arXiv preprint arXiv:2304.02004  (2023), arXiv:2304.02004 [nucl-th] .
  • Frame et al. [2018] D. Frame, R. He, I. Ipsen, D. Lee, D. Lee,  and E. Rrapaj, Phys. Rev. Lett. 121, 032501 (2018).
  • Furnstahl et al. [2020] R. J. Furnstahl, A. J. Garcia, P. J. Millican,  and X. Zhang, Phys. Lett. B 809, 135719 (2020).
  • Melendez et al. [2021] J. A. Melendez, C. Drischler, A. J. Garcia, R. J. Furnstahl,  and X. Zhang, Phys. Lett. B 821, 136608 (2021).
  • Drischler et al. [2021] C. Drischler, M. Quinonez, P. Giuliani, A. Lovell,  and F. Nunes, Phys. Lett. B 823, 136777 (2021).
  • König et al. [2020] S. König, A. Ekström, K. Hebeler, D. Lee,  and A. Schwenk, Phys. Lett. B 810, 135814 (2020).
  • Zhang and Furnstahl [2022] X. Zhang and R. J. Furnstahl, Phys. Rev. C 105, 064004 (2022)arXiv:2110.04269 [nucl-th] .
  • Melendez et al. [2022] J. A. Melendez, C. Drischler, R. J. Furnstahl, A. J. Garcia,  and X. Zhang, J. Phys. G 49, 102001 (2022)arXiv:2203.05528 [nucl-th] .
  • Garcia et al. [2023] A. Garcia, C. Drischler, R. Furnstahl, J. Melendez,  and X. Zhang, Physical Review C 107, 054001 (2023).
  • Sarkar and Lee [2021] A. Sarkar and D. Lee, Phys. Rev. Lett. 126, 032501 (2021).
  • Sarkar and Lee [2022] A. Sarkar and D. Lee, Physical Review Research 4, 023214 (2022).
  • Demol et al. [2020] P. Demol, T. Duguet, A. Ekström, M. Frosini, K. Hebeler, S. König, D. Lee, A. Schwenk, V. Somà,  and A. Tichai, Physical Review C 101, 041302 (2020).
  • Djärv et al. [2022] T. Djärv, A. Ekström, C. Forssén,  and H. T. Johansson, Phys. Rev. C 105, 014005 (2022).
  • Yapa and König [2022] N. Yapa and S. König, Physical Review C 106, 014309 (2022).
  • Franzke et al. [2022] M. C. Franzke, A. Tichai, K. Hebeler,  and A. Schwenk, Physics Letters B 830, 137101 (2022).
  • Bonilla et al. [2022] E. Bonilla, P. Giuliani, K. Godbey,  and D. Lee, Physical Review C 106, 054322 (2022).
  • Anderson et al. [2022] A. L. Anderson, G. L. O’Donnell,  and J. Piekarewicz, Phys. Rev. C 106 (2022).
  • Giuliani et al. [2023] P. Giuliani, K. Godbey, E. Bonilla, F. Viens,  and J. Piekarewicz, Frontiers in Physics 10, 1054524 (2023).
  • Bai and Ren [2021] D. Bai and Z. Ren, Phys. Rev. C 103, 014612 (2021).
  • Ekström and Hagen [2019] A. Ekström and G. Hagen, Physical Review Letters 123, 252501 (2019).
  • Odell et al. [2024] D. Odell, P. Giuliani, K. Beyer, M. Catacora-Rios, M. Y.-H. Chan, E. Bonilla, R. J. Furnstahl, K. Godbey,  and F. M. Nunes, Phys. Rev. C 109, 044612 (2024).
  • Anderson and Piekarewicz [2024] A. Anderson and J. Piekarewicz, arXiv preprint arXiv:2406.01747  (2024).
  • Quarteroni et al. [2014] A. Quarteroni, G. Rozza, et al.Reduced order methods for modeling and computational reduction, Vol. 9 (Springer, 2014).
  • Brunton et al. [2016] S. L. Brunton, J. L. Proctor,  and J. N. Kutz, Proceedings of the national academy of sciences 113, 3932 (2016).
  • [32] K. Godbey, P. Giuliani, E. Bonilla, E. Flynn, D. Odell, K. Beyer, D. Lay, D. Figueroa, R. Garg,  and M. Campbell, “Dimensionality reduction in nuclear physics,” https://dr.ascsn.net/.
  • Quarteroni et al. [2015] A. Quarteroni, A. Manzoni,  and F. Negri, Reduced Basis Methods for Partial Differential Equations: An Introduction, Vol. 92 (Springer, 2015).
  • Hesthaven et al. [2016] J. S. Hesthaven, G. Rozza, B. Stamm, et al.Certified Reduced Basis Methods for Parametrized Partial Differential Equations, Vol. 590 (Springer, 2016).
  • [35] Bayesian Uncertainty Quantification: Errors in Your EFT.
  • Barrault et al. [2004] M. Barrault, Y. Maday, N. C. Nguyen,  and A. T. Patera, C. R. Math. 339, 667 (2004).
  • Grepl et al. [2007] M. A. Grepl, Y. Maday, N. C. Nguyen,  and A. T. Patera, ESAIM: Math. Model. Numer. Anal. 41, 575 (2007).
  • Cook et al. [2024] P. Cook, D. Jammooa, M. Hjorth-Jensen, D. D. Lee,  and D. Lee, arXiv preprint arXiv:2401.11694  (2024).
  • Figueroa et al. [2024] D. Figueroa, E. Bonilla, R. Garg, P. Giuliani,  and K. Godbey, “Learning reduced equations from data,” manuscript in preparation (2024), in preparation.
  • Somasundaram et al. [2024] R. Somasundaram, C. L. Armstrong, P. Giuliani, K. Godbey, S. Gandolfi,  and I. Tews, arXiv preprint arXiv:2404.11566  (2024).
  • Reed et al. [2024] B. Reed, R. Somasundaram, S. De, C. L. Armstrong, P. Giuliani, C. Capano, D. Brown,  and I. Tews, arXiv preprint arXiv:2405.20558  (2024).
  • Burohman et al. [2023] A. M. Burohman, B. Besselink, J. M. Scherpen,  and M. K. Camlibel, IEEE Transactions on Automatic Control 68, 6160 (2023).
  • Gross [1961] E. P. Gross, Il Nuovo Cimento 20, 454 (1961).
  • Pitaevskii [1961] L. P. Pitaevskii, JETP 13, 451 (1961).
  • Schunck [2019] N. Schunck, ed., Energy Density Functional Methods for Atomic Nuclei, 2053-2563 (IOP Publishing, 2019).
  • Skyrme [1956] T. H. R. Skyrme, Philos. Mag. 1, 1043 (1956).
  • Skyrme [1958] T. Skyrme, Nucl. Phys. 9, 615 (1958).
  • Vautherin and Brink [1972] D. Vautherin and D. M. Brink, Phys. Rev. C 5, 626 (1972).
  • Engel et al. [1975] Y. Engel, D. Brink, K. Goeke, S. Krieger,  and D. Vautherin, Nucl. Phys. A 249, 215 (1975).
  • Dobaczewski and Dudek [1995] J. Dobaczewski and J. Dudek, Phys. Rev. C 52, 1827 (1995).
  • Chabanat et al. [1997] E. Chabanat, P. Bonche, P. Haensel, J. Meyer,  and R. Schaeffer, Nucl. Phys. A 627, 710 (1997).
  • Kortelainen et al. [2010a] M. Kortelainen, T. Lesinski, J. Moré, W. Nazarewicz, J. Sarich, N. Schunck, M. V. Stoitsov,  and S. Wild, Phys. Rev. C 82, 024313 (2010a).
  • Jolliffe [2002] I. T. Jolliffe, Principal Component Analysis (Springer, 2002).
  • Blum et al. [2020] A. Blum, J. Hopcroft,  and R. Kannan, Foundations of Data Science (Cambridge University Press, 2020).
  • Kortelainen et al. [2010b] M. Kortelainen, T. Lesinski, J. Moré, W. Nazarewicz, J. Sarich, N. Schunck, M. V. Stoitsov,  and S. Wild, Phys. Rev. C 82, 024313 (2010b).
  • Kortelainen et al. [2012] M. Kortelainen, J. McDonnell, W. Nazarewicz, P.-G. Reinhard, J. Sarich, N. Schunck, M. V. Stoitsov,  and S. M. Wild, Phys. Rev. C 85, 024304 (2012).
  • Kortelainen et al. [2014] M. Kortelainen, J. McDonnell, W. Nazarewicz, E. Olsen, P.-G. Reinhard, J. Sarich, N. Schunck, S. M. Wild, D. Davesne, J. Erler,  and A. Pastore, Phys. Rev. C 89, 054314 (2014).
  • Koza [1992] J. Koza, Genetic Programming: On the Programming of Computers by Means of Natural Selection (MIT Press, 1992).
  • Banzhaf et al. [1998] W. Banzhaf, F. D. Francone, R. E. Keller,  and P. Nordin, Genetic Programming: An Introduction: On the Automatic Evolution of Computer Programs and Its Applications (Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1998).
  • Langdon and Poli [2002] W. B. Langdon and R. Poli, Springer Berlin Heidelberg (2002).
  • Brameier and Banzhaf [2007] M. Brameier and W. Banzhaf, Linear Genetic Programming, Genetic and Evolutionary Computation (Springer US, 2007).
  • Ingalalli et al. [2014] V. Ingalalli, S. Silva, M. Castelli,  and L. Vanneschi, in Genetic Programming, edited by M. Nicolau, K. Krawiec, M. I. Heywood, M. Castelli, P. García-Sánchez, J. J. Merelo, V. M. Rivas Santos,  and K. Sim (Springer Berlin Heidelberg, Berlin, Heidelberg, 2014) pp. 48–60.
  • Muñoz et al. [2015] L. Muñoz, S. Silva,  and L. Trujillo, in Genetic Programming, edited by P. Machado, M. I. Heywood, J. McDermott, M. Castelli, P. García-Sánchez, P. Burelli, S. Risi,  and K. Sim (Springer International Publishing, Cham, 2015) pp. 78–91.
  • La Cava et al. [2019] W. La Cava, S. Silva, K. Danai, L. Spector, L. Vanneschi,  and J. H. Moore, Swarm and Evolutionary Computation 44, 260 (2019).
  • Moraglio et al. [2012] A. Moraglio, K. Krawiec,  and C. G. Johnson, in Parallel Problem Solving from Nature - PPSN XII, edited by C. A. C. Coello, V. Cutello, K. Deb, S. Forrest, G. Nicosia,  and M. Pavone (Springer Berlin Heidelberg, Berlin, Heidelberg, 2012) pp. 21–31.
  • Castelli and Manzoni [2019] M. Castelli and L. Manzoni, SoftwareX 10, 100313 (2019).
  • La Cava et al. [2016] W. La Cava, L. Spector,  and K. Danai, in Proceedings of the Genetic and Evolutionary Computation Conference 2016, GECCO ’16 (Association for Computing Machinery, New York, NY, USA, 2016) p. 741–748.
  • Neshatian et al. [2012] K. Neshatian, M. Zhang,  and P. Andreae, IEEE Transactions on Evolutionary Computation 16, 645 (2012).
  • Tran et al. [2019] B. Tran, B. Xue,  and M. Zhang, Pattern Recognition 93, 404 (2019).
  • Lensen et al. [2022] A. Lensen, B. Xue,  and M. Zhang, IEEE Transactions on Evolutionary Computation 26, 661 (2022).
  • Haut et al. [2023a] N. Haut, B. Punch,  and W. Banzhaf, in Proceedings of the Companion Conference on Genetic and Evolutionary Computation, GECCO ’23 Companion (Association for Computing Machinery, New York, NY, USA, 2023) p. 587–590.
  • Tackett [1994] W. A. Tackett, Recombination, Selection, and the Genetic Construction of Computer Programs, Ph.D. thesis, USA (1994).
  • Zhang and Smart [2004] M. Zhang and W. Smart, in Applications of Evolutionary Computing, edited by G. R. Raidl, S. Cagnoni, J. Branke, D. W. Corne, R. Drechsler, Y. Jin, C. G. Johnson, P. Machado, E. Marchiori, F. Rothlauf, G. D. Smith,  and G. Squillero (Springer Berlin Heidelberg, Berlin, Heidelberg, 2004) pp. 369–378.
  • Ul Ain et al. [2017] Q. Ul Ain, B. Xue, H. Al-Sahaf,  and M. Zhang, in 2017 IEEE Congress on Evolutionary Computation (CEC) (IEEE Press, 2017) p. 2420–2427.
  • Iqbal et al. [2017] M. Iqbal, B. Xue, H. Al-Sahaf,  and M. Zhang, IEEE Transactions on Evolutionary Computation 21, 569 (2017).
  • Bakurov et al. [2023a] I. Bakurov, M. Buzzelli, R. Schettini, M. Castelli,  and L. Vanneschi, Genetic Programming and Evolvable Machines 24, Article number: 15 (2023a), special Issue on Highlights of Genetic Programming 2022 Events.
  • Correia et al. [2022] J. a. Correia, D. Lopes, L. Vieira, N. Rodriguez-Fernandez, A. Carballal, J. Romero,  and P. Machado, Genetic Programming and Evolvable Machines 23, 557–579 (2022).
  • Correia et al. [2021] J. Correia, L. Vieira, N. Rodriguez-Fernandez, J. Romero,  and P. Machado, in Artificial Intelligence in Music, Sound, Art and Design, edited by J. Romero, T. Martins,  and N. Rodríguez-Fernández (Springer International Publishing, Cham, 2021) pp. 82–97.
  • Olson and Moore [2019] R. S. Olson and J. H. Moore, “Tpot: A tree-based pipeline optimization tool for automating machine learning,” in Automated Machine Learning: Methods, Systems, Challenges, edited by F. Hutter, L. Kotthoff,  and J. Vanschoren (Springer International Publishing, Cham, 2019) pp. 151–160.
  • Goncalves et al. [2015] I. Goncalves, S. Silva,  and C. M. Fonseca, in Progress in Artificial Intelligence - 17th Portuguese Conference on Artificial Intelligence, EPIA 2015, Lecture Notes in Computer Science, Vol. 9273, edited by F. C. Pereira, P. Machado, E. Costa,  and A. Cardoso (Springer, Coimbra, Portugal, 2015) pp. 280–285.
  • Assunção et al. [2019] F. Assunção, N. Lourenço, P. Machado,  and B. Ribeiro, Genetic Programming and Evolvable Machines 20, 5 (2019).
  • Markus et al. [2021] A. F. Markus, J. A. Kors,  and P. R. Rijnbeek, Journal of Biomedical Informatics 113, 103655 (2021).
  • Vainio-Pekka et al. [2023] H. Vainio-Pekka, M. O.-O. Agbese, M. Jantunen, V. Vakkuri, T. Mikkonen, R. Rousi,  and P. Abrahamsson, ACM Trans. Interact. Intell. Syst. 13 (2023), 10.1145/3599974.
  • Cortacero et al. [2023] K. Cortacero, B. McKenzie, S. Müller, R. Khazen, F. Lafouresse, G. Corsaut, N. Van Acker, F.-X. Frenois, L. Lamant, N. Meyer, B. Vergier, D. G. Wilson, H. Luga, O. Staufer, M. L. Dustin, S. Valitutti,  and S. Cussat-Blanc, Nature Communications 14, 7112 (2023).
  • Haut et al. [2024] N. Haut, W. Banzhaf, B. Punch,  and D. Colbry, “Accelerating image analysis research with active learning techniques in genetic programming,” in Genetic Programming Theory and Practice XX, edited by S. Winkler, L. Trujillo, C. Ofria,  and T. Hu (Springer Nature Singapore, Singapore, 2024) pp. 45–64.
  • Silva and Vanneschi [2012] S. Silva and L. Vanneschi, Int. J. Data Min. Bioinformatics 6, 585–601 (2012).
  • Ain et al. [2022] Q. U. Ain, H. Al-Sahaf, B. Xue,  and M. Zhang, Expert Systems with Applications 197, 116680 (2022).
  • Bakurov et al. [2023b] I. Bakurov, M. Buzzelli, R. Schettini, M. Castelli,  and L. Vanneschi, IEEE Transactions on Image Processing 32, 1458 (2023b).
  • Evans et al. [2019] B. P. Evans, B. Xue,  and M. Zhang (Association for Computing Machinery, New York, NY, USA, 2019) p. 1012–1020.
  • Keijzer [2003] M. Keijzer, in Genetic Programming, edited by C. Ryan, T. Soule, M. Keijzer, E. Tsang, R. Poli,  and E. Costa (Springer Berlin Heidelberg, Berlin, Heidelberg, 2003) pp. 70–82.
  • Livadiotis and McComas [2013] G. Livadiotis and D. J. McComas, Journal of Geophysical Research: Space Physics 118, 2863 (2013)https://agupubs.onlinelibrary.wiley.com/doi/pdf/10.1002/jgra.50304 .
  • Haut et al. [2023b] N. Haut, W. Banzhaf,  and B. Punch, “Correlation versus rmse loss functions in symbolic regression tasks,” in Genetic Programming Theory and Practice XIX, edited by L. Trujillo, S. M. Winkler, S. Silva,  and W. Banzhaf (Springer Nature Singapore, Singapore, 2023) pp. 31–55.
  • Bakurov et al. [2024] I. Bakurov, N. Haut,  and W. Banzhaf, arXiv preprint arXiv:2405.10267  (2024).
  • [94] Evolved_Analytics, “DataModeler,” .
  • Keijzer and Foster [2007] M. Keijzer and J. Foster, in Genetic Programming, edited by M. Ebner, M. O’Neill, A. Ekárt, L. Vanneschi,  and A. I. Esparcia-Alcázar (Springer Berlin Heidelberg, Berlin, Heidelberg, 2007) pp. 33–44.
  • Geron [2019] A. Geron, Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems, 2nd ed. (O’Reilly Media, Inc., 2019).
  • Pedregosa et al. [2011] F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot,  and E. Duchesnay, Journal of Machine Learning Research 12, 2825 (2011).
  • Buitinck et al. [2013] L. Buitinck, G. Louppe, M. Blondel, F. Pedregosa, A. Mueller, O. Grisel, V. Niculae, P. Prettenhofer, A. Gramfort, J. Grobler, R. Layton, J. VanderPlas, A. Joly, B. Holt,  and G. Varoquaux, in ECML PKDD Workshop: Languages for Data Mining and Machine Learning (2013) pp. 108–122.
  • Hoffmann et al. [2019] M. Hoffmann, C. Fröhner,  and F. Noé, The Journal of chemical physics 150 (2019).
  • Tran and Ward [2017] G. Tran and R. Ward, Multiscale Modeling & Simulation 15, 1108 (2017).
  • Fasel et al. [2022] U. Fasel, J. N. Kutz, B. W. Brunton,  and S. L. Brunton, Proceedings of the Royal Society A 478, 20210904 (2022).
  • Jiang et al. [2021] Y.-X. Jiang, X. Xiong, S. Zhang, J.-X. Wang, J.-C. Li,  and L. Du, Nonlinear Dynamics 105, 2775 (2021).
  • Shea et al. [2021] D. E. Shea, S. L. Brunton,  and J. N. Kutz, Physical Review Research 3, 023255 (2021).