Non-Markovian neural quantum propagator and its application to the simulation of ultrafast nonlinear spectra

Jiaji Zhang (ORCID: 0000-0003-2978-274X), jiaji.zhang@zhejianglab.com, Zhejiang Laboratory, Hangzhou 311100, China

Lipeng Chen (ORCID: 0009-0002-1541-8912), chenlp@zhejianglab.com, Zhejiang Laboratory, Hangzhou 311100, China

Abstract

The accurate solution of dissipative quantum dynamics plays an important role in the simulation of open quantum systems. Here we propose a machine-learning-based universal solver for the hierarchical equations of motion, one of the most widely used approaches, which takes into account non-Markovian effects and nonperturbative system-environment interactions in a numerically exact manner. We develop a neural quantum propagator model based on a neural network architecture, which avoids time-consuming iterations and can be used to evolve any initial quantum state for arbitrarily long times. To demonstrate the efficacy of our model, we apply it to the simulation of population dynamics and linear and two-dimensional spectra of the Fenna-Matthews-Olson complex.

I Introduction

Ultrafast nonlinear spectroscopy provides a versatile tool to reveal the electronic structure and chemical reaction mechanism. [1, 2, 3, 4, 5] Two-dimensional electronic spectroscopy (2DES), in particular, has been widely employed to monitor electronic excitation dynamics in polyatomic molecules. [6, 7, 8, 9] By utilizing multiple UV-Vis pulses, one can measure the correlation between different electronic states via off-diagonal peaks of 2DES. In addition, extracting dynamical information from the evolution of 2DES enables the direct visualization of chemical reaction processes. [10, 11, 12, 13, 14]

Theoretical simulation of nonlinear spectra is based on the response function theory, which requires an accurate description of the system dynamics upon interaction with external laser pulses. [15, 16] As molecular systems inevitably interact with their surrounding environment, a commonly used strategy is to treat the environmental degrees of freedom as a heat bath, and to derive the equations of motion for the reduced system after tracing out the bath degrees of freedom. [17, 18] The hierarchical equations of motion (HEOM) approach is one of the best-known quantum dynamics methods, and takes into account non-Markovian effects and non-perturbative system-environment interactions in a numerically exact manner. [19, 20, 21] Being a typical partial differential equation (PDE), the HEOM is usually solved recursively by conventional iterative solvers such as the fourth-order Runge-Kutta (RK4) and split-operator methods. [22, 23, 24] Despite their straightforward implementation, the main drawbacks of iterative methods are their large computational cost and long-time numerical instabilities. While improvements have been proposed to alleviate these numerical issues, an efficient universal solver has yet to be proposed. [25, 26, 27]

Over recent years, the rapid development of machine learning techniques has offered new possibilities to circumvent the aforementioned difficulties. [28, 29] A variety of machine-learning-based surrogate models have been developed to provide universal solvers for PDEs. [30, 31, 32, 33] In contrast to iterative methods, these surrogate models solve the PDE by defining a functional that maps an arbitrary initial condition to its corresponding solution at some subsequent time. This functional is then parameterized as a deep neural network and optimized with a prepared dataset. State-of-the-art surrogate models, such as the Fourier Neural Operator (FNO) and DeepONet, have been shown to be effective compared with conventional methods on a set of PDEs for classical dynamical problems. [34, 35, 36]

In this work, we extend the surrogate models to non-Markovian quantum dynamics by developing a so-called neural quantum propagator (NQP) model for the HEOM. As the quantum analogue of the universal PDE solver, the NQP model directly generates the dynamics of the system without invoking tedious and expensive iterations. Following our previous work, [37] we adopt the FNO architecture as the core neural network structure. We test its performance by comparing it with the conventional RK4 method in various computational scenarios. In addition to the simulation of population dynamics, we also employ the NQP model to compute linear and nonlinear spectra.

Similar to other neural network architectures, training the NQP model requires a large amount of high-precision data. Such data can only be generated by conventional iterative solvers with a sufficiently small time step, which in turn leads to a large computational cost in the data preparation stage. To address this issue, we introduce a super-resolution algorithm, which relies only on low-resolution data to construct a high-resolution NQP. The intrinsic error in the dataset is systematically reduced by utilizing the physics-informed loss function (PILF), which is defined directly from the HEOM. The optimization of the PILF does not rely on any prepared dataset, which significantly improves the overall computational performance of the NQP model.

The rest of the paper is organized as follows. In Section II, we introduce the HEOM approach and linear and nonlinear response functions. In Section III, we present the NQP model, including the FNO architecture, the training setup, and the super-resolution algorithm. Numerical demonstrations on the Fenna-Matthews-Olson (FMO) system are presented in Section IV. Finally, conclusions are drawn in Section V.

II Methodology

II.1 HEOM approach

We consider an electronic system interacting with a set of heat baths. The total Hamiltonian can be written as

\hat{H}_{tot}=\hat{H}_{s}+\hat{H}_{b}+\hat{H}_{s-b}.

(1)

Here, the first term $\hat{H}_{s}$ is the Hamiltonian of the electronic system,

\hat{H}_{s}=\sum_{j=1}^{N}\varepsilon_{j}|j\rangle\langle j|+\sum_{j\neq j^{\prime}}\Delta_{j,j^{\prime}}|j\rangle\langle j^{\prime}|,

(2)

where $\varepsilon_{j}$ is the energy of the $j$ -th electronic state $|j\rangle$ , and $\Delta_{j,j^{\prime}}$ is the interstate coupling. The second term is the Hamiltonian of harmonic heat baths,

\hat{H}_{b}=\sum_{j=1}^{N}\sum_{\nu}\left(\frac{\hat{p}_{j,\nu}^{2}}{2}+\frac{\omega_{j,\nu}^{2}\hat{x}_{j,\nu}^{2}}{2}\right),

(3)

where $\hat{p}_{j,\nu}$, $\hat{x}_{j,\nu}$, and $\omega_{j,\nu}$ are the dimensionless momentum, coordinate, and frequency of the $\nu$-th oscillator of the $j$-th bath. The last term is the system-bath interaction Hamiltonian,

\hat{H}_{s-b}=-\sum_{j=1}^{N}\hat{V}_{j}\sum_{\nu}g_{j,\nu}\hat{x}_{j,\nu},

(4)

where $\hat{V}_{j}=|j\rangle\langle{j}|$ , and $g_{j,\nu}$ is the coupling constant between the $j$ -th state and the $\nu$ -th oscillator which can be specified by a spectral density,

J_{j}(\omega)=\sum_{\nu}g_{j,\nu}^{2}\delta(\omega-\omega_{j,\nu}).

(5)

The influence of the $j$ -th heat bath on the electronic system is characterized by the bath correlation function, [17, 18]

C_{j}(t)=\frac{1}{\pi}\int_{0}^{\infty}{\rm{d}}\omega J_{j}(\omega)\left[\coth\left(\frac{\beta\hbar\omega}{2}\right)\cos(\omega t)-i\,\sin(\omega t)\right],

(6)

where $J_{j}(\omega)$ is the spectral density of the $j$ -th bath, $\beta=1/k_{B}T$ is the inverse temperature with $k_{B}$ being the Boltzmann constant. We model the bath by the Drude spectral density,

J_{j}(\omega)=\frac{2\lambda_{j}\gamma_{j}\omega}{\gamma_{j}^{2}+\omega^{2}},

(7)

where $\lambda_{j}$ is the reorganization energy, and $\gamma_{j}$ is the inverse of the bath correlation time. In this paper, we consider the high-temperature approximation ( $\beta\hbar\gamma_{j}<1$ ), and express Eq. (6) as $C_{j}(t)=c_{j}e^{-\gamma_{j}|t|}$ , where

c_{j}=\frac{2\lambda_{j}}{\beta\hbar^{2}}-i\,\frac{\lambda_{j}\gamma_{j}}{\hbar}.

(8)

To go beyond this approximation, one can include so-called low-temperature correction terms. [38, 39] The time evolution of the reduced density matrix can be described by the HEOM approach, which is written as [40, 19]

\partial_{t}\hat{\rho}_{\vec{n}}(t)=-\left[\frac{i}{\hbar}\hat{H}_{s}^{\times}+\sum_{j=1}^{N}n_{j}\gamma_{j}\right]\hat{\rho}_{\vec{n}}(t)-i\sum_{j=1}^{N}\hat{V}_{j}^{\times}\hat{\rho}_{\vec{n}+\vec{e}_{j}}(t)-i\sum_{j=1}^{N}\left[c_{j}\hat{V}_{j}\hat{\rho}_{\vec{n}-\vec{e}_{j}}(t)-c_{j}^{\ast}\hat{\rho}_{\vec{n}-\vec{e}_{j}}(t)\hat{V}_{j}\right],

(9)

where $\vec{n}=\{n_{1},n_{2},...,n_{N}\}$ denotes the index vector with each $n_{j}$ being a non-negative integer, $\vec{e}_{j}$ is the unit vector along the $j$-th index, and we have introduced the abbreviated notation $\hat{A}^{\times}\hat{B}=\hat{A}\hat{B}-\hat{B}\hat{A}$. The density operator with all indices equal to zero, $\hat{\rho}_{\vec{0}}(t)$ with $\vec{0}=\{0,0,...,0\}$, corresponds to the density operator of the reduced electronic system, while all other density operators are introduced to describe non-Markovian and non-perturbative effects.
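As an illustration of Eqs. (8) and (9), the following minimal Python sketch evaluates the right-hand side of the HEOM for a hypothetical two-site system. The function names (`drude_coefficient`, `index_vectors`, `heom_rhs`), the choice of units with $\hbar=1$, and all parameter values are assumptions made for this example and do not represent the implementation used in this work. Hierarchy elements outside the truncation are simply set to zero.

```python
import itertools
import numpy as np

HBAR = 1.0  # assumption: units with hbar = 1

def drude_coefficient(lam, gamma, beta):
    """High-temperature coefficient c_j of Eq. (8)."""
    return 2.0 * lam / (beta * HBAR**2) - 1j * lam * gamma / HBAR

def index_vectors(n_baths, max_tier):
    """All index vectors n with sum(n) <= max_tier."""
    return [n for n in itertools.product(range(max_tier + 1), repeat=n_baths)
            if sum(n) <= max_tier]

def heom_rhs(rho, H, V, c, gamma):
    """Right-hand side of Eq. (9); rho maps index vectors to matrices."""
    zero = np.zeros_like(H, dtype=complex)
    out = {}
    for n, r in rho.items():
        # Free evolution and damping of the current hierarchy element.
        d = -(1j / HBAR) * (H @ r - r @ H) - np.dot(n, gamma) * r
        for j in range(len(V)):
            up = tuple(n[k] + (k == j) for k in range(len(n)))
            dn = tuple(n[k] - (k == j) for k in range(len(n)))
            r_up = rho.get(up, zero)   # truncated tiers are set to zero
            r_dn = rho.get(dn, zero)   # also zero when n_j - 1 < 0
            d = d - 1j * (V[j] @ r_up - r_up @ V[j])
            d = d - 1j * (c[j] * V[j] @ r_dn - np.conj(c[j]) * r_dn @ V[j])
        out[n] = d
    return out

# Hypothetical two-site example with a factorized initial condition.
H = np.array([[0.0, 0.5], [0.5, 1.0]], dtype=complex)
V = [np.diag([1.0, 0.0]).astype(complex), np.diag([0.0, 1.0]).astype(complex)]
gamma = np.array([1.0, 1.0])
c = [drude_coefficient(0.2, g, beta=0.5) for g in gamma]
rho = {n: np.zeros((2, 2), dtype=complex) for n in index_vectors(2, 2)}
rho[(0, 0)][0, 0] = 1.0
drho = heom_rhs(rho, H, V, c, gamma)
```

In this sketch the derivative of $\hat{\rho}_{\vec{0}}$ is traceless, as expected for the reduced density operator, and the first-tier elements are driven by the imaginary part of $c_{j}$ at $t=0$.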

II.2 Linear and nonlinear response functions

The linear and nonlinear spectra are evaluated within the framework of response function theory. [16, 1, 41] The linear response function is defined as

R^{(1)}(t)=\frac{i}{\hbar}{\rm{Tr}}\left\{\hat{\mu}\mathcal{G}_{tot}(t)\hat{\mu}^{\times}\hat{\rho}_{tot}(0)\right\},

(10)

where $\hat{\mu}$ is the transition dipole operator, and $\hat{\rho}_{tot}$ and $\mathcal{G}_{tot}(t)=\exp(-i\hat{H}_{tot}^{\times}t/\hbar)$ are the density operator and the quantum propagator of the total system, respectively. The linear absorption spectrum is obtained by the Fourier transformation

R^{(1)}(\omega)={\rm{Im}}\int_{0}^{\infty}{\rm{d}}te^{i\omega t}R^{(1)}(t),

(11)

where Im denotes the imaginary part. The third-order response function is defined as

R^{(3)}(t_{3},t_{2},t_{1})={\left(\frac{i}{\hbar}\right)}^{3}{\rm{Tr}}\left\{\hat{\mu}\mathcal{G}_{tot}(t_{3})\hat{\mu}^{\times}\mathcal{G}_{tot}(t_{2})\hat{\mu}^{\times}\mathcal{G}_{tot}(t_{1})\hat{\mu}^{\times}\hat{\rho}_{tot}(0)\right\}.

(12)

The rephasing and non-rephasing parts of 2D spectrum are defined by

R^{(3,R)}(\omega_{3},\omega_{1};t_{2})={\rm{Im}}\int_{0}^{\infty}{\rm{d}}t_{3}\int_{0}^{\infty}{\rm{d}}t_{1}e^{i\omega_{3}t_{3}-i\omega_{1}t_{1}}R^{(3)}(t_{3},t_{2},t_{1}),

(13)

R^{(3,NR)}(\omega_{3},\omega_{1};t_{2})={\rm{Im}}\int_{0}^{\infty}{\rm{d}}t_{3}\int_{0}^{\infty}{\rm{d}}t_{1}e^{i\omega_{3}t_{3}+i\omega_{1}t_{1}}R^{(3)}(t_{3},t_{2},t_{1}).

(14)

Within the HEOM formalism, Eqs. (10) and (12) can be evaluated by replacing $\hat{\rho}_{tot}$ with $\hat{\rho}_{\vec{n}}(0)$ and $\mathcal{G}_{tot}(t)$ with the time evolution generated by Eq. (9). The final trace is taken only over the zeroth-order element of $\hat{\rho}_{\vec{n}}(t)$, i.e., $\hat{\rho}_{\vec{0}}(t)$.
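To make the transformation in Eq. (11) concrete, the sketch below evaluates the one-sided Fourier transform numerically for a toy response function. The damped sine used for $R^{(1)}(t)$, the function name `linear_spectrum`, the trapezoidal quadrature, and all numerical values are assumptions chosen for illustration; in an actual calculation the response function would come from the HEOM dynamics.

```python
import numpy as np

def linear_spectrum(t, response, omega):
    """Imaginary part of the one-sided Fourier transform in Eq. (11), trapezoidal rule."""
    dt = t[1] - t[0]
    weights = np.full(t.size, dt)
    weights[0] = weights[-1] = 0.5 * dt
    kernel = np.exp(1j * np.outer(omega, t))      # e^{i omega t} on the (omega, t) grid
    return np.imag(kernel @ (weights * response))

# Toy response: a single damped transition at omega0 (illustrative values).
omega0, damping = 2.0, 0.1
t = np.linspace(0.0, 100.0, 4001)
response = np.sin(omega0 * t) * np.exp(-damping * t)
omega = np.linspace(0.0, 4.0, 401)
spectrum = linear_spectrum(t, response, omega)
peak = omega[np.argmax(spectrum)]                 # location of the absorption maximum
```

For this toy input the spectrum is a Lorentzian-like peak centered at the transition frequency, with a width set by the damping rate.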

III Neural quantum propagator

We introduce the abbreviated index, $x=(j,j^{\prime},n_{1},n_{2},...,n_{N})$, and arrange the matrix entries $\rho(x,t)=\langle j|\hat{\rho}_{\vec{n}}(t)|j^{\prime}\rangle$ as the column vector

\vec{\rho}_{t}=\{\rho(x_{0},t),\rho(x_{1},t),...\}.

(15)

The HEOM (Eq. (9)) can be recast to a matrix-vector form as

\partial_{t}\vec{\rho}_{t}={\bm{L}}\,\vec{\rho}_{t},

(16)

where the matrix entries of ${\bm{L}}$ can be inferred from the right-hand side of Eq. (9). The propagator of HEOM is then defined through the integration form as ${\bm{G}}_{t}=\exp({t{\bm{L}}})$ , which satisfies the composition property,

\vec{\rho}_{t}={\bm{G}}_{t-t_{0}}\vec{\rho}_{t_{0}}=e^{(t-t_{0}){\bm{L}}}\vec{\rho}_{t_{0}}.

(17)

To facilitate the description in later sections, we also introduce a uniform time grid, $t_{m}=m\delta_{t}$ for $m=1,\dots,N_{t}$, where $\delta_{t}=t_{max}/N_{t}$ is the time step, with $N_{t}$ and $t_{max}$ being the total number of time steps and the fixed upper time limit, respectively.
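The composition property of Eq. (17) on the uniform time grid can be checked numerically with a small example. In the sketch below, the generator `L` is a hypothetical random matrix standing in for the HEOM generator, and the matrix exponential is computed by eigendecomposition, which assumes that `L` is diagonalizable. Neither choice is taken from this work.

```python
import numpy as np

def propagator(L, t):
    """G_t = exp(t L) via eigendecomposition (assumes L is diagonalizable)."""
    w, U = np.linalg.eig(L)
    return (U * np.exp(t * w)) @ np.linalg.inv(U)

rng = np.random.default_rng(0)
dim = 6
# Hypothetical generator: random complex matrix shifted to be dissipative.
L = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim)) - 3.0 * np.eye(dim)
rho0 = rng.normal(size=dim) + 1j * rng.normal(size=dim)

t_max, n_t = 1.0, 50
dt = t_max / n_t                      # delta_t = t_max / N_t
G_dt = propagator(L, dt)

rho = rho0.copy()
for _ in range(n_t):                  # apply Eq. (17) one grid step at a time
    rho = G_dt @ rho
rho_direct = propagator(L, t_max) @ rho0   # single application over the full interval
```

Up to round-off, stepping $N_{t}$ times with ${\bm{G}}_{\delta_{t}}$ reproduces the direct application of ${\bm{G}}_{t_{max}}$.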

III.1 Model’s architecture

To construct the NQP model, we follow our previous work [37] and parameterize the HEOM propagator as a deep neural network, ${\bm{G}}_{t_{m}}[\theta]$, where $\theta$ represents all the trainable parameters. The architecture of the NQP model is shown in Fig. 1. In Fig. 1(a), $P_{in}$ and $P_{out}$ are the linear projections between physical and latent Fourier spaces. They are parameterized as point-wise convolution networks with one hidden layer and a Gaussian Error Linear Unit (GeLU) activation function. The remaining parts are the Fourier layers, whose structure is presented in Fig. 1(b).

To process the input of the $l$-th layer, $\vec{v}_{l}$, two different routes are adopted. On the upper route, ${\mathcal{F}}$ and ${\mathcal{F}}^{-1}$ denote the Fourier transform and its inverse. The point-wise convolution $W_{l}$ serves as the learnable weight in Fourier space. Only the lowest $k_{max}$ modes are explicitly included in the weight tensor, while higher-frequency modes are truncated to control the size of the model and to avoid numerical instabilities. The lower route is similar to a residual network. The results of the two routes are summed and activated by GeLU before being passed to the next layer.

Figure 1: The architecture of (a) the NQP model, and (b) the $l$-th Fourier layer. Here, $\mathcal{F}$ and $\mathcal{F}^{-1}$ denote the Fourier transform and its inverse. $+$ and $\sigma$ represent the element-wise sum and the GeLU activation function. The learnable parameters are those in $P_{in}$, $P_{out}$, and $W_{l}$.
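The forward pass of a single Fourier layer, as described above, can be sketched in a few lines. This is only an illustration under several assumptions: the input is a real array of shape (channels, grid points), the spectral weight holds one channel-mixing matrix per retained mode, the lower route is a plain point-wise linear map, and GeLU is evaluated with its common tanh approximation. The names `fourier_layer`, `spectral_w`, and `pointwise_w` are hypothetical, and the random weights are untrained.

```python
import numpy as np

def gelu(x):
    """Tanh approximation of the GeLU activation."""
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

def fourier_layer(v, spectral_w, pointwise_w, k_max):
    """One Fourier layer acting on v of shape (channels, grid points).

    Upper route: FFT -> keep the lowest k_max modes -> channel mixing -> inverse FFT.
    Lower route: point-wise linear map (residual-like path).
    """
    v_hat = np.fft.rfft(v, axis=-1)
    out_hat = np.zeros_like(v_hat)
    # spectral_w has shape (k_max, channels, channels): one matrix per retained mode.
    out_hat[:, :k_max] = np.einsum("koi,ik->ok", spectral_w, v_hat[:, :k_max])
    upper = np.fft.irfft(out_hat, n=v.shape[-1], axis=-1)
    lower = pointwise_w @ v
    return gelu(upper + lower)

rng = np.random.default_rng(0)
channels, grid, k_max = 4, 32, 8
spectral_w = (rng.normal(size=(k_max, channels, channels))
              + 1j * rng.normal(size=(k_max, channels, channels)))
pointwise_w = rng.normal(size=(channels, channels))
v = rng.normal(size=(channels, grid))
out = fourier_layer(v, spectral_w, pointwise_w, k_max)

# A pure mode above k_max is removed by the truncation in the upper route,
# so only the lower route contributes for this input.
x = np.arange(grid)
high = np.tile(np.cos(2 * np.pi * 12 * x / grid), (channels, 1))
out_high = fourier_layer(high, spectral_w, pointwise_w, k_max)
```

The second call shows the effect of the mode truncation: an input that contains only a mode beyond $k_{max}$ passes through the lower route alone.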

The NQP model takes all the entries of initial condition $\vec{\rho}_{0}$ and a chosen time $t$ as the input, and outputs $\vec{\rho}_{t}={\bm{G}}_{t}[\theta]\vec{\rho}_{0}$ satisfying Eq. (16). It should be noted that no restrictions are a priori made on the explicit forms of $\vec{\rho}_{0}$ . The NQP model can be directly applied to the simulation of response function by taking the field interaction form $\hat{\mu}^{\times}\hat{\rho}_{\vec{n}}$ as the input. Since the composition property is also retained during the parameterization, the time evolution up to arbitrarily long times can be obtained by recursively applying Eq. (17).

III.2 Training objective

The NQP model is trained by minimizing an objective function $\mathcal{L}$ , defined as

\mathcal{L}=\alpha\mathcal{L}_{data}+(1-\alpha)\mathcal{L}_{phys},

(18)

where $\mathcal{L}_{data}$ and $\mathcal{L}_{phys}$ are referred to as the data and physics-informed loss functions, respectively. The hyper-parameter $\alpha\in[0,1]$ serves as a weight factor, which is dynamically adjusted in the training stage.

For the data part $\mathcal{L}_{data}$, we prepare a dataset by randomly sampling a set of initial conditions $\{\vec{\rho}_{0}\}$, and then evaluating their time evolution $\{\vec{\rho}_{t}\}$ over $t\in[0,t_{max}]$ using the conventional RK4 method. The data loss function is defined as follows,

\mathcal{L}_{data}=\sum_{p=1}^{N_{data}}\sum_{m=1}^{N_{t}}\frac{{\left|\left|{\bm{G}}_{t_{m}}[\theta]\vec{\rho}_{0}^{(p)}-\vec{\rho}_{t_{m}}^{(p)}\right|\right|}_{F}}{{\left|\left|\vec{\rho}_{t_{m}}^{(p)}\right|\right|}_{F}},

(19)

where $||\cdot||_{F}$ denotes the Frobenius-norm, $N_{data}$ is the number of individual samples in the dataset, and $\vec{\rho}_{0}^{(p)}$ and $\vec{\rho}_{t_{m}}^{(p)}$ are the initial condition and the corresponding evolution for the $p$ -th sample, respectively.
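A direct transcription of Eq. (19) is given below as a minimal sketch. The array layout (samples, time points, vector entries) and the function name `data_loss` are assumptions for this example, and the random reference data stand in for RK4 trajectories.

```python
import numpy as np

def data_loss(pred, ref):
    """Eq. (19): sum of relative Frobenius-norm errors over samples and time points.

    pred, ref: complex arrays of shape (N_data, N_t, dim).
    """
    num = np.linalg.norm(pred - ref, axis=-1)
    den = np.linalg.norm(ref, axis=-1)
    return float(np.sum(num / den))

rng = np.random.default_rng(1)
ref = rng.normal(size=(3, 5, 8)) + 1j * rng.normal(size=(3, 5, 8))
loss_exact = data_loss(ref, ref)           # perfect prediction
loss_scaled = data_loss(1.01 * ref, ref)   # uniform 1% error at each of the 3 * 5 points
```

Because each term is a relative error, a uniform 1% deviation contributes 0.01 per sample and time point, independent of the magnitude of the individual trajectories.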

To ensure the universality of ${\bm{G}}_{t}[\theta]$ that is applicable to any $\vec{\rho}_{0}$ , one needs a large number of samples $N_{data}$ , which leads to even more computational cost in the data preparation stage. We introduce a physics-informed loss function to reduce the effective number of samples $N_{data}$ while keeping the universality of ${\bm{G}}_{t}[\theta]$ .[42] The physics-informed loss function is defined by minimizing the difference between left- and right-hand sides of Eq. (16) as

\mathcal{L}_{phys}=\sum_{p^{\prime}=1}^{N_{phys}}\sum_{m=1}^{N_{t}}{\left|\left|{\partial_{t}}{\bm{G}}_{t_{m}}[\theta]\vec{\rho}_{0}^{(p^{\prime})}-{\bm{L}}{\bm{G}}_{t_{m}}[\theta]\vec{\rho}_{0}^{(p^{\prime})}\right|\right|}_{F},

(20)

where $N_{phys}$ is the number of samples in the physics dataset. The time derivative $\partial_{t}\vec{\rho}_{t}$ is evaluated by the finite difference method. It should be mentioned that the calculation of $\mathcal{L}_{phys}$ involves far fewer samples than that of $\mathcal{L}_{data}$. In addition, we adopt an on-the-fly sampling approach, re-generating the physics dataset at each training epoch, to further improve the performance of the trained model.
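The residual in Eq. (20) can be illustrated with a toy generator. In the sketch below, the time derivative is approximated by a central finite difference on interior grid points, which is one possible choice since the specific finite-difference stencil is not stated here. The generator `L`, the function name `physics_loss`, and the noise level are hypothetical.

```python
import numpy as np

def physics_loss(traj, L, dt):
    """Eq. (20) with a central finite difference for the time derivative.

    traj: array of shape (N_phys, N_t, dim) holding model outputs on the time grid.
    Only interior time points are used (an assumption of this sketch).
    """
    d_traj = (traj[:, 2:] - traj[:, :-2]) / (2.0 * dt)
    rhs = traj[:, 1:-1] @ L.T            # row-wise application of L
    return float(np.sum(np.linalg.norm(d_traj - rhs, axis=-1)))

rng = np.random.default_rng(2)
dim, n_t, dt = 4, 50, 0.01
L = -np.eye(dim) + 0.5j * rng.normal(size=(dim, dim))   # toy generator
w, U = np.linalg.eig(L)
G_dt = (U * np.exp(dt * w)) @ np.linalg.inv(U)          # exact one-step propagator

state = rng.normal(size=(3, dim)) + 0j
exact = np.empty((3, n_t, dim), dtype=complex)
for m in range(n_t):                                    # t_m = (m + 1) * dt
    state = state @ G_dt.T
    exact[:, m] = state

noisy = exact + 0.01 * rng.normal(size=exact.shape)     # mimic an imperfect model
loss_exact = physics_loss(exact, L, dt)
loss_noisy = physics_loss(noisy, L, dt)
```

For trajectories that satisfy Eq. (16), the residual is limited only by the finite-difference error, whereas a small perturbation of the trajectories raises it by orders of magnitude. No reference data enter the evaluation.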

III.3 Super resolution algorithm

To further reduce the computational cost in the data preparation stage, we introduce a super-resolution algorithm, which allows the construction of a high-resolution NQP model from a lower-resolution dataset. Following the procedure of the previous subsection, the lower-resolution dataset is prepared by integrating the HEOM with a larger time step $K\delta_{t}$ ($K>1$) for a set of $\{\vec{\rho}_{0}\}$ using the RK4 method. The obtained data are then embedded into the finer grid $\{t_{m}=m\delta_{t}\}$, with the missing values filled in by linear interpolation. $\mathcal{L}_{data}$ is evaluated on this interpolated dataset in the training stage.

On the other hand, the physics-informed loss function $\mathcal{L}_{phys}$ is evaluated directly on the finer time grid and serves as a correction for the deviation of the interpolated dataset. The super-resolution algorithm is then completed by dynamically adjusting the weight factor $\alpha$ in Eq. (18) during the training process. At the beginning, we set $\alpha=1$ and randomly initialize all the model’s parameters. During the training process, $\alpha$ is gradually decreased to a sufficiently small value such as $\sim 0.01$, and $\mathcal{L}_{phys}$ gradually becomes the dominant contribution. The minimization of $\mathcal{L}_{phys}$ corrects the intrinsic deviation of the interpolated dataset and thereby improves the resolution.
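The embedding step of the super-resolution algorithm can be sketched as follows. The example assumes a trajectory stored as a complex array sampled every $K$ fine steps starting from $t=0$, and interpolates the real and imaginary parts separately with `numpy.interp`. The function name `embed_on_fine_grid` and the toy trajectory are hypothetical.

```python
import numpy as np

def embed_on_fine_grid(coarse, K):
    """Linearly interpolate a trajectory sampled every K fine steps onto the fine grid.

    coarse: complex array of shape (n_coarse, dim) at times 0, K*dt, 2*K*dt, ...
    Returns an array of shape ((n_coarse - 1) * K + 1, dim).
    """
    n_coarse = coarse.shape[0]
    t_coarse = np.arange(n_coarse) * K            # coarse times in units of dt
    t_fine = np.arange((n_coarse - 1) * K + 1)    # fine times in units of dt
    real = np.stack([np.interp(t_fine, t_coarse, col.real) for col in coarse.T], axis=1)
    imag = np.stack([np.interp(t_fine, t_coarse, col.imag) for col in coarse.T], axis=1)
    return real + 1j * imag

dt, K = 0.05, 3
t_fine = np.arange(0, 16) * dt                    # 16 fine points, 6 coarse points
exact = np.exp((-0.1 + 1.0j) * t_fine)[:, None]   # toy damped oscillation
coarse = exact[::K]
fine = embed_on_fine_grid(coarse, K)
max_err = np.max(np.abs(fine - exact))            # interpolation error on the fine grid
```

The interpolated data agree with the reference at the coarse points but carry a small error in between. This is the intrinsic deviation that the physics-informed term is meant to correct as $\alpha$ is decreased.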

At the end of this subsection, we briefly discuss the possibility of data-free training, which is achieved by fixing $\alpha=0$ and using only $\mathcal{L}_{phys}$ during the training process. From a theoretical point of view, training with or without $\mathcal{L}_{data}$ results in the same model as long as $\mathcal{L}_{phys}$ becomes the dominant contribution to Eq. (18). In practice, however, training with only $\mathcal{L}_{phys}$ requires more epochs for convergence when all the learnable parameters are randomly initialized. In this case, a prepared dataset, even one with low resolution, serves as useful guidance for the training.

IV Numerical experiments

In the following, we use the Fenna-Matthews-Olson complex as our model system. [43, 44] The electronic state $|j\rangle$ ($j=1,\cdots,7$) corresponds to the state where only the $j$-th pigment is excited, and $|8\rangle=|g\rangle$ is the ground state. We set $\hat{V}_{j}=|j\rangle\langle j|$ ($j=1,\cdots,7$) and $\hat{V}_{g}\equiv 0$. The heat bath parameters are chosen as $\lambda_{j}=35\,{\rm{cm}}^{-1}$, $\gamma_{j}=200\,{\rm{cm}}^{-1}$, and $T=300\,{\rm{K}}$. The HEOM is truncated at the hierarchy level of $\sum n_{j}\leq 2$ after adopting the filtering algorithm, [45] which is accurate enough for our tests. We set the upper time limit as $t_{max}=30\,{\rm{fs}}$ with a time step of $\delta_{t}=0.6\,{\rm{fs}}$, which results in $N_{t}=50$ time points.
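The parameters above can be cross-checked with a short calculation. The sketch below verifies the high-temperature condition $\beta\hbar\gamma_{j}<1$ using the Boltzmann constant expressed in ${\rm{cm}}^{-1}/{\rm{K}}$, recovers $N_{t}$, and counts the index vectors with $\sum n_{j}\leq 2$ for seven baths. The count ignores the filtering algorithm, so it should be read as an upper bound under that assumption. The variable names are illustrative.

```python
import itertools
from math import comb

K_B_CM = 0.695034800                 # Boltzmann constant in cm^-1 / K
gamma_cm, temperature = 200.0, 300.0
beta_hbar_gamma = gamma_cm / (K_B_CM * temperature)   # high-temperature condition: < 1

t_max_fs, dt_fs = 30.0, 0.6
n_t = round(t_max_fs / dt_fs)        # number of time points on the fine grid

n_baths, max_tier = 7, 2
index_vectors = [n for n in itertools.product(range(max_tier + 1), repeat=n_baths)
                 if sum(n) <= max_tier]
n_elements = len(index_vectors)      # hierarchy elements before any filtering
n_entries = n_elements * 8 * 8       # 8 x 8 matrices (7 sites + ground state)
```

With these values $\beta\hbar\gamma_{j}\approx 0.96$, so the high-temperature condition is satisfied, although only marginally. The enumeration gives $\binom{9}{2}=36$ hierarchy elements, i.e., at most $36\times 64=2304$ complex entries in $\vec{\rho}_{t}$.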

IV.1 Training and validation test

We first introduce the model’s hyper-parameters and training setup. In order to train the NQP model, we prepare the low-resolution training dataset by randomly sampling $N_{data}=3000$ initial conditions $\vec{\rho}_{0}$ and integrating the HEOM with a larger time step of $3\delta_{t}$. The missing values are linearly interpolated when the data are embedded into the finer grid with the time step of $\delta_{t}$. To test the accuracy of the model, we also prepare a high-resolution validation set with 500 samples, following the same setup but using the smaller time step of $\delta_{t}$. It should be noted that the high-resolution validation set is never used in the training stage. In the training process, the physics dataset is prepared using the on-the-fly sampling algorithm by randomly generating $N_{phys}=2000$ initial conditions at each epoch.

The other hyper-parameters of the NQP model are chosen as follows. We set the hidden channel size of the projections $P_{in}$ and $P_{out}$ to $512$. We use $4$ Fourier layers, each with a hidden channel size of $64$, and the total number of trainable parameters is around 10 million. The model is trained for $10^{5}$ epochs using the Adam optimizer. The learning rate is initially set to $10^{-4}$, and then halved every 500 epochs until reaching $10^{-6}$. The weight factor $\alpha$ in Eq. (18) is initialized as $\alpha=1$, and halved every 100 epochs until reaching $\sim 10^{-2}$. All the tasks are performed on an NVIDIA A40 GPU with 48 GB of memory.
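The two halving schedules and the weighted objective of Eq. (18) can be written compactly as below. This is a sketch under the assumption that each quantity is clipped exactly at its stated lower limit once the halving would fall below it. The function names are hypothetical.

```python
def halving_schedule(initial, period, floor, epoch):
    """Value after halving `initial` every `period` epochs, clipped at `floor`."""
    return max(initial * 0.5 ** (epoch // period), floor)

def learning_rate(epoch):
    # 1e-4 initially, halved every 500 epochs, not below 1e-6.
    return halving_schedule(1e-4, 500, 1e-6, epoch)

def alpha(epoch):
    # 1 initially, halved every 100 epochs, not below 1e-2 (assumed clipping).
    return halving_schedule(1.0, 100, 1e-2, epoch)

def total_loss(loss_data, loss_phys, epoch):
    """Eq. (18) with the epoch-dependent weight factor."""
    a = alpha(epoch)
    return a * loss_data + (1.0 - a) * loss_phys
```

Under these assumptions both schedules reach their lower limits within the first few thousand epochs (seven halvings each), so most of the $10^{5}$ training epochs are dominated by the physics-informed term.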

To test our model, we present the validation test by showing the relative error of $\mathcal{L}_{data}$ for each sample in the validation set in Fig. 2. For all samples, the relative error is around $0.5\%$ . This error can be further reduced by using more samples in the data and physics sets, extending the training to longer epochs, and increasing the size of the NQP model. It should be pointed out that this error corresponds to the overall deviation of all the entries of $\hat{\rho}_{\vec{n}}$ , including those deep hierarchy elements that have much smaller magnitude as compared to $\hat{\rho}_{\vec{0}}$ .

IV.2 Population dynamics

By using the composition property, $\vec{\rho}_{t_{1}+t_{2}}={\bm{G}}_{t_{2}}[\theta]\vec{\rho}_{t_{1}}$, our NQP model can infer long-time dynamics well beyond the training time limit $t_{max}$. To test the accuracy of the long-time dynamics predicted by the NQP model, we compute population dynamics up to $40t_{max}$ ($\sim 1.2\,{\rm{ps}}$). The reference results are obtained from the RK4 method with an integration time step of $\delta_{t}$. Here, we consider two initial conditions: (a) $\hat{\rho}_{\vec{0}}(0)=|1\rangle\langle 1|$, and (b) $\hat{\rho}_{\vec{0}}(0)=|6\rangle\langle 6|$, which correspond to the excitation localized at the first and sixth pigment, respectively. All other hierarchy elements $\hat{\rho}_{\vec{n}}(0)$ ($\vec{n}\neq\vec{0}$) are set to zero, corresponding to the factorized bath initial condition.

In Fig. 3, we show the time evolution of populations $p_{n}(t)=\langle n|\hat{\rho}_{\vec{0}}(t)|n\rangle$ for sites (a) $n=1$ , $2$ , and $3$ , and (b) $n=4$ , $5$ , and $6$ , respectively, following the experimentally demonstrated energy transfer pathways. In both cases, our NQP model yields results in perfect agreement with those from the reference RK4 method up to $10t_{max}$ . While model-predicted long time dynamics deviates slightly from the exact results due to the accumulation of errors in the training stage, our NQP model still infers the accurate dynamics even far beyond the training time ( $40t_{max}$ ).

IV.3 Linear spectra

Next, we apply our NQP model to simulate the linear and third-order response functions as defined in Eqs. (10) and (12). We choose the transition dipole operator as

\hat{\mu}=\sum_{j=1}^{7}\mu_{j}\left(|j\rangle\langle g|+|g\rangle\langle j|\right),

(21)

where $\mu_{j}$ is the transition dipole moment of $j$ -th pigment. The system is initially in the electronic ground state before the photoexcitation, i.e., $\hat{\rho}_{\vec{0}}(0)=|g\rangle\langle g|$ .

In Fig. 4, we show the linear spectrum evaluated from the NQP model and the RK4 method ($t\in[0,40t_{\mathrm{max}}]$). For each case, the peak intensities are normalized with respect to their maximum value. Overall, the NQP model yields a spectrum in good agreement with that from the reference RK4. The small deviations of some peak intensities may be attributed to the model’s architecture. The use of the Fourier transform in the model’s architecture generates some artificial aliasing modes, the magnitudes of which increase upon recurrent evaluation of the long-time dynamics. This systematic error could be resolved by carefully fine-tuning the truncation level of the Fourier modes in the NQP model, or by replacing the Fourier transform with methods such as wavelet transforms or spatial convolutions.

IV.4 Two-dimensional spectra

We further apply the NQP model to compute 2D spectra at different times $t_{2}$. In Fig. 5, we show the rephasing (a, c) and non-rephasing (b, d) parts of the 2D spectra at $t_{2}=0$ evaluated from the NQP model (a, b) and the RK4 reference (c, d), respectively. The 2D spectra at $t_{2}=50\,{\rm{fs}}$ and $100\,{\rm{fs}}$ are presented in Appendix A. The peak intensities are normalized with respect to their maximum values. The 2D spectra predicted by the NQP model are again in good agreement with those from the RK4 reference.

To quantify the difference between model-predicted and reference spectra at different $t_{2}$ , we introduce a quantity called average point-wise deviation as a function of $t_{2}$ , which is defined as

\Delta(t_{2})=\int_{\Omega_{3}}{\rm{d}}\omega_{3}\int_{\Omega_{1}}{\rm{d}}\omega_{1}\,\left|1-\frac{R_{NQP}^{(3)}(\omega_{3},\omega_{1};t_{2})}{R_{RK4}^{(3)}(\omega_{3},\omega_{1};t_{2})}\right|,

(22)

where $\Omega_{1}$ and $\Omega_{3}$ represent the full frequency domains of $\omega_{1}$ and $\omega_{3}$, and $R_{NQP}^{(3)}$ and $R_{RK4}^{(3)}$ are the spectra evaluated from the NQP model and the RK4 method. In Fig. 6, we show $\Delta(t_{2})$ up to $t_{2}=100\,{\rm{fs}}$. It is found that $\Delta(t_{2})$ remains below $1.5\%$, demonstrating the accuracy of our NQP model.
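A discretized version of Eq. (22) on a uniform frequency grid can be sketched as follows. The rectangle-rule quadrature, the function name `pointwise_deviation`, and the toy spectra are assumptions for illustration. Dividing the result by the area of the frequency window, as done in the last line, is one way to express it as an average, and is likewise an assumption of this sketch.

```python
import numpy as np

def pointwise_deviation(spec_model, spec_ref, d_omega3, d_omega1):
    """Rectangle-rule discretization of Eq. (22) on a uniform frequency grid."""
    ratio = spec_model / spec_ref
    return float(np.sum(np.abs(1.0 - ratio)) * d_omega3 * d_omega1)

omega = np.linspace(-1.0, 1.0, 21)
d_omega = omega[1] - omega[0]
w3, w1 = np.meshgrid(omega, omega, indexing="ij")
spec_ref = 1.0 + np.exp(-(w3**2 + w1**2))     # toy reference spectrum, strictly positive
spec_model = 1.01 * spec_ref                   # uniform 1% deviation
delta = pointwise_deviation(spec_model, spec_ref, d_omega, d_omega)
area = spec_ref.size * d_omega**2              # area covered by the rectangle rule
average_deviation = delta / area
```

Since the integrand is a ratio, grid points where the reference spectrum is close to zero can dominate the result, which should be kept in mind when applying such a measure to spectra with sign changes.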

V Conclusion

In this work, we develop an NQP model for the HEOM approach to treat non-Markovian dynamics. We use the FNO as the model’s architecture, and design a super-resolution algorithm to reduce the computational cost in the data preparation stage. In the training stage, we employ both the data loss function and an additional PILF to improve the numerical performance. The accuracy of the NQP model is tested by computing the population dynamics and linear and two-dimensional spectra. The NQP model yields results in good agreement with the conventional RK4 method, demonstrating its potential applicability in various computational scenarios.

In the current NQP model, the number of learnable parameters scales exponentially with both the size of the reduced system and the number of hierarchy elements. To alleviate the computational cost, future work may extend the NQP model to deal with a decomposed form of $\hat{\rho}_{\vec{n}}(t)$. For example, $\hat{\rho}_{\vec{n}}(t)$ can be expressed in a matrix-product-state form by using the twin-space formulation and tensor-train decomposition, which requires far fewer entries than the original density matrix. [46, 47] In addition, our focus here is on time-independent Hamiltonians. Future development may extend to driven dynamics, where the system Hamiltonian contains time-dependent external fields. A typical application is the spectroscopic equations of motion approach, which has been employed for the simulation of strong-field nonlinear spectra. [48, 49] Work in these directions is in progress.

Acknowledgments

J.Z. and L.P.C. acknowledge support from the starting grant of research center of new materials computing of Zhejiang Lab (No. 3700-32601).

Author declarations

Author Contributions

Jiaji Zhang: Data curation (lead); Formal analysis (lead); Investigation (equal); Supervision (equal).

Lipeng Chen: Conceptualization (lead); Funding acquisition (lead); Investigation (equal); Supervision (equal).

Conflict of Interest

The authors have no conflicts to disclose.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

The code that supports the findings of this study is available from the corresponding author upon reasonable request.

Appendix A The 2D spectra at $t_{2}=50\,{\rm{fs}}$ and $t_{2}=100\,{\rm{fs}}$

In this section, we present the 2D spectra at $t_{2}=50\,{\rm{fs}}$ and $100\,{\rm{fs}}$ in Figs. 7 and 8, respectively. For both cases, the spectra predicted by the NQP model are in good agreement with the RK4 reference.

References

Gelin, Chen, and Domcke [2022] M. F. Gelin, L. Chen, and W. Domcke, “Equation-of-motion methods for the calculation of femtosecond time-resolved 4-wave-mixing and n-wave-mixing signals,” Chemical Reviews 122, 17339–17396 (2022).
Nisoli et al. [2017] M. Nisoli, P. Decleva, F. Calegari, A. Palacios, and F. Martín, “Attosecond electron dynamics in molecules,” Chemical Reviews 117, 10760–10825 (2017).
Mukamel [2000] S. Mukamel, “Multidimensional femtosecond correlation spectroscopies of electronic and vibrational excitations,” Annual Review of Physical Chemistry 51, 691–729 (2000).
Maiuri, Garavelli, and Cerullo [2019] M. Maiuri, M. Garavelli, and G. Cerullo, “Ultrafast spectroscopy: State of the art and open challenges,” Journal of the American Chemical Society 142, 3–15 (2019).
Dorfman, Schlawin, and Mukamel [2016] K. E. Dorfman, F. Schlawin, and S. Mukamel, “Nonlinear optical signals and spectroscopy with quantum light,” Reviews of Modern Physics 88, 045008 (2016).
Fresch et al. [2023] E. Fresch, F. V. A. Camargo, Q. Shen, C. C. Bellora, T. Pullerits, G. S. Engel, G. Cerullo, and E. Collini, “Two-dimensional electronic spectroscopy,” Nature Reviews Methods Primers 3 (2023), 10.1038/s43586-023-00267-2.
Oliver [2018] T. A. A. Oliver, “Recent advances in multidimensional ultrafast spectroscopy,” Royal Society Open Science 5, 171425 (2018).
Schlau-Cohen, Ishizaki, and Fleming [2011] G. S. Schlau-Cohen, A. Ishizaki, and G. R. Fleming, “Two-dimensional electronic spectroscopy and photosynthesis: Fundamentals and applications to photosynthetic light-harvesting,” Chemical Physics 386, 1–22 (2011).
Ginsberg, Cheng, and Fleming [2009] N. S. Ginsberg, Y.-C. Cheng, and G. R. Fleming, “Two-dimensional electronic spectroscopy of molecular aggregates,” Accounts of Chemical Research 42, 1352–1363 (2009).
Scholes et al. [2011] G. D. Scholes, G. R. Fleming, A. Olaya-Castro, and R. van Grondelle, “Lessons from nature about solar light harvesting,” Nature Chemistry 3, 763–774 (2011).
Kullmann et al. [2011] M. Kullmann, S. Ruetzel, J. Buback, P. Nuernberger, and T. Brixner, “Reaction dynamics of a molecular switch unveiled by coherent two-dimensional electronic spectroscopy,” Journal of the American Chemical Society 133, 13074–13080 (2011).
Arsenault et al. [2021] E. A. Arsenault, P. Bhattacharyya, Y. Yoneda, and G. R. Fleming, “Two-dimensional electronic–vibrational spectroscopy: Exploring the interplay of electrons and nuclei in excited state molecular dynamics,” The Journal of Chemical Physics 155 (2021), 10.1063/5.0053042.
Kim et al. [2020] J. Kim, J. Jeon, T. H. Yoon, and M. Cho, “Two-dimensional electronic spectroscopy of bacteriochlorophyll a with synchronized dual mode-locked lasers,” Nature Communications 11 (2020), 10.1038/s41467-020-19912-5.
Ruetzel et al. [2013] S. Ruetzel, M. Kullmann, J. Buback, P. Nuernberger, and T. Brixner, “Tracing the steps of photoinduced chemical reactions in organic molecules by coherent two-dimensional electronic spectroscopy using triggered exchange,” Physical Review Letters 110 (2013), 10.1103/physrevlett.110.148305.
Cho [2019] M. Cho, ed., Coherent Multidimensional Spectroscopy (Springer Singapore, 2019).
Mukamel [1995] S. Mukamel, Principles of Nonlinear Optical Spectroscopy, Oxford series in optical and imaging sciences (Oxford University Press, 1995).
Breuer and Petruccione [2007] H.-P. Breuer and F. Petruccione, The Theory of Open Quantum Systems (Oxford University Press, 2007).
Weiss [2012] U. Weiss, Quantum Dissipative Systems, 4th ed. (World Scientific, 2012).
Tanimura [2020] Y. Tanimura, “Numerically ‘exact’ approach to open quantum dynamics: The hierarchical equations of motion (HEOM),” The Journal of Chemical Physics 153, 020901 (2020).
Ye et al. [2016] L. Ye, X. Wang, D. Hou, R. Xu, X. Zheng, and Y. Yan, “HEOM-QUICK: a program for accurate, efficient, and universal characterization of strongly correlated quantum impurity systems,” WIREs Computational Molecular Science 6, 608–638 (2016).
Zhang and Tanimura [2022] J. Zhang and Y. Tanimura, “Imaginary-time hierarchical equations of motion for thermodynamic variables,” The Journal of Chemical Physics 156 (2022), 10.1063/5.0091468.
Kloeden and Platen [1992] P. E. Kloeden and E. Platen, Numerical Solution of Stochastic Differential Equations (Springer Berlin Heidelberg, 1992).
Yan et al. [2021] Y. Yan, M. Xu, T. Li, and Q. Shi, “Efficient propagation of the hierarchical equations of motion using the Tucker and hierarchical Tucker tensors,” The Journal of Chemical Physics 154, 194104 (2021).
Ke [2023] Y. Ke, “Tree tensor network state approach for solving hierarchical equations of motion,” The Journal of Chemical Physics 158 (2023), 10.1063/5.0153870.
Kimura and Fujihashi [2014] A. Kimura and Y. Fujihashi, “Quantitative correction of the rate constant in the improved variational master equation for excitation energy transfer,” The Journal of Chemical Physics 141, 194110 (2014).
Schlimgen et al. [2021] A. W. Schlimgen, K. Head-Marsden, L. M. Sager, P. Narang, and D. A. Mazziotti, “Quantum simulation of open quantum systems using a unitary decomposition of operators,” Physical Review Letters 127, 270503 (2021).
Liu et al. [2023] W. Liu, Z.-H. Chen, Y. Su, Y. Wang, and W. Dou, “Predicting rate kernels via dynamic mode decomposition,” The Journal of Chemical Physics 159, 144110 (2023).
LeCun, Bengio, and Hinton [2015] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature 521, 436–444 (2015).
Hermann et al. [2023] J. Hermann, J. Spencer, K. Choo, A. Mezzacapo, W. M. C. Foulkes, D. Pfau, G. Carleo, and F. Noé, “Ab initio quantum chemistry with neural-network wavefunctions,” Nature Reviews Chemistry 7, 692–709 (2023).
Lu et al. [2021a] L. Lu, P. Jin, G. Pang, Z. Zhang, and G. E. Karniadakis, “Learning nonlinear operators via DeepONet based on the universal approximation theorem of operators,” Nature Machine Intelligence 3, 218–229 (2021a).
[31] Z. Li, N. Kovachki, K. Azizzadenesheli, B. Liu, K. Bhattacharya, A. Stuart, and A. Anandkumar, “Fourier neural operator for parametric partial differential equations,” Preprint at https://arxiv.org/abs/2010.08895 (2021).
Kovachki et al. [2023] N. Kovachki, Z. Li, B. Liu, K. Azizzadenesheli, K. Bhattacharya, A. Stuart, and A. Anandkumar, “Neural operator: Learning maps between function spaces with applications to PDEs,” Journal of Machine Learning Research 24, 1–97 (2023).
[33] J. Guibas, M. Mardani, Z. Li, A. Tao, A. Anandkumar, and B. Catanzaro, “Adaptive Fourier neural operators: Efficient token mixers for transformers,” 10.48550/ARXIV.2111.13587, preprint at https://arxiv.org/abs/2111.13587 (2021).
Lu et al. [2021b] L. Lu, X. Meng, Z. Mao, and G. E. Karniadakis, “DeepXDE: A deep learning library for solving differential equations,” SIAM Review 63, 208–228 (2021b).
[35] J. Pathak, S. Subramanian, P. Harrington, S. Raja, A. Chattopadhyay, M. Mardani, T. Kurth, D. Hall, Z. Li, K. Azizzadenesheli, P. Hassanzadeh, K. Kashinath, and A. Anandkumar, “FourCastNet: A Global Data-driven High-resolution Weather Model using Adaptive Fourier Neural Operators,” 10.48550/ARXIV.2202.11214, preprint at https://arxiv.org/abs/2202.11214 (2022).
[36] P. Jiang, N. Meinert, H. Jordão, C. Weisser, S. Holgate, A. Lavin, B. Lütjens, D. Newman, H. Wainwright, C. Walker, and P. Barnard, “Digital twin earth – coasts: Developing a fast and physics-informed surrogate model for coastal floods via neural operators,” 10.48550/ARXIV.2110.07100, preprint at https://arxiv.org/abs/2110.07100 (2021).
Zhang, Benavides-Riveros, and Chen [2024] J. Zhang, C. L. Benavides-Riveros, and L. Chen, “Artificial-intelligence-based surrogate solution of dissipative quantum dynamics: Physics-informed reconstruction of the universal propagator,” The Journal of Physical Chemistry Letters 15, 3603–3610 (2024).
Ishizaki and Tanimura [2005] A. Ishizaki and Y. Tanimura, “Quantum dynamics of system strongly coupled to low-temperature colored noise bath: Reduced hierarchy equations approach,” Journal of the Physical Society of Japan 74, 3131–3134 (2005).
Hu, Xu, and Yan [2010] J. Hu, R.-X. Xu, and Y. Yan, “Communication: Padé spectrum decomposition of Fermi function and Bose function,” The Journal of Chemical Physics 133, 101106 (2010).
Tanimura [2006] Y. Tanimura, “Stochastic Liouville, Langevin, Fokker–Planck, and master equation approaches to quantum dissipative systems,” Journal of the Physical Society of Japan 75, 082001 (2006).
Zhang and Tanimura [2023] J. Zhang and Y. Tanimura, “Coherent two-dimensional THz magnetic resonance spectroscopies for molecular magnets: Analysis of Dzyaloshinskii–Moriya interaction,” The Journal of Chemical Physics 159, 014102 (2023).
Rosofsky, Majed, and Huerta [2023] S. G. Rosofsky, H. A. Majed, and E. A. Huerta, “Applications of physics informed neural operators,” Machine Learning: Science and Technology 4, 025022 (2023).
Adolphs and Renger [2006] J. Adolphs and T. Renger, “How proteins trigger excitation energy transfer in the FMO complex of green sulfur bacteria,” Biophysical Journal 91, 2778–2797 (2006).
Ishizaki and Fleming [2009] A. Ishizaki and G. R. Fleming, “Theoretical examination of quantum coherence in a photosynthetic system at physiological temperature,” Proceedings of the National Academy of Sciences 106, 17255–17260 (2009).
Shi et al. [2009] Q. Shi, L. Chen, G. Nan, R.-X. Xu, and Y. Yan, “Efficient hierarchical Liouville space propagator to quantum dissipative dynamics,” The Journal of Chemical Physics 130, 084105 (2009).
Borrelli [2019] R. Borrelli, “Density matrix dynamics in twin-formulation: An efficient methodology based on tensor-train representation of reduced equations of motion,” The Journal of Chemical Physics 150, 234102 (2019).
Borrelli and Gelin [2021] R. Borrelli and M. F. Gelin, “Finite temperature quantum dynamics of complex systems: Integrating thermo-field theories and tensor-train methods,” WIREs Computational Molecular Science 11, e1539 (2021).
Wang and Thoss [2004] H. Wang and M. Thoss, “Nonperturbative simulation of pump–probe spectra for electron transfer reactions in the condensed phase,” Chemical Physics Letters 389, 43–50 (2004).
Wang and Thoss [2008] H. Wang and M. Thoss, “Nonperturbative quantum simulation of time-resolved nonlinear spectra: Methodology and application to electron transfer reactions in the condensed phase,” Chemical Physics 347, 139–151 (2008).