Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                

Non-markovian neural quantum propagator and its application to the simulation of ultrafast nonlinear spectra

Jiaji Zhang \orcidlink0000-0003-2978-274X jiaji.zhang@zhejianglab.com Zhejiang Laboratory, Hangzhou 311100, China    Lipeng Chen \orcidlink0009-0002-1541-8912 chenlp@zhejianglab.com Zhejiang Laboratory, Hangzhou 311100, China
Abstract

The accurate solution of dissipative quantum dynamics plays an important role on the simulation of open quantum systems. Here we propose a machine-learning-based universal solver for the hierarchical equations of motion, one of the most widely used approaches which takes into account non-markovian effects and nonperturbative system-environment interactions in a numerically exact manner. We develop a neural quantum propagator model by utilizing the neural network architecture, which avoids time-consuming iterations and can be used to evolve any initial quantum state for arbitrarily long times. To demonstrate the efficacy of our model, we apply it to the simulation of population dynamics and linear and two-dimensional spectra of the Fenna-Matthews-Olson complex.

I Introduction

Ultrafast nonlinear spectroscopy provides a versatile tool to reveal the electronic structure and chemical reaction mechanism. [1, 2, 3, 4, 5] Two-dimensional electronic spectroscopy (2DES), in particular, has been widely employed to monitor electronic excitation dynamics in polyatomic molecules. [6, 7, 8, 9] By utilizing multiple UV-Vis pulses, one can measure the correlation between different electronic states via off-diagonal peaks of 2DES. In addition, extracting dynamical information from the evolution of 2DES enables the direct visualization of chemical reaction processes. [10, 11, 12, 13, 14]

Theoretical simulation of nonlinear spectra is based on the response function theory, which requires the accurate description of system dynamics upon interaction with external laser pulses.[15, 16] As molecular systems inevitably interact with their surrounding environment, a commonly used strategy is to treat the environmental degrees of freedom as a heat bath, and derive the equations of motion for the reduced system after tracing out bath degrees of freedom. [17, 18] The hierarchical equations of motion (HEOM) is one of the best-known quantum dynamics approaches, which takes into account non-markovian effects and non-perturbative system-environment interactions in a numerically exact manner. [19, 20, 21] As a typical partial differential equation (PDE), one recursively solves HEOM by conventional iterative solvers such as fourth-order Runge-Kutta (RK4) and split-operator methods. [22, 23, 24] Despite their straightforward implementation, the main drawbacks of iterative methods are the large computational cost and long-time numerical instabilities. While improvements have been proposed to alleviate the numerical issues of iterative methods, an efficient universal solver is still to be proposed. [25, 26, 27]

Over recent years, the fast development of machine learning technique offers new possibilities to circumvent aforementioned difficulties. [28, 29] A variety of machine learning based surrogate models have been developed to provide universal solvers for PDE. [30, 31, 32, 33] In contrast to iterative methods, those surrogate models solve the PDE by defining a functional that describes the mapping between an arbitrary initial condition and its corresponding solution at some subsequent time. This functional is then parameterized as a deep neural network and optimized with a prepared dataset. The state-of-the-art surrogate models, such as Fourier Neural Operator (FNO) and DeepONet, have shown their effectiveness over conventional methods on a set of PDEs of the classical dynamical problems.[34, 35, 36]

In this work, we extend the surrogate models to the non-markovian quantum dynamics by developing a so-called neural quantum propagator (NQP) model for HEOM. As the quantum analogue of the universal PDE solver, the NQP model directly generates dynamics of system without invoking the tedious, expensive iterations. Following our previous work, we adopt the FNO architecture as the core neural network structure. [37] We test its performance by comparing with the conventional RK4 method in various computational scenarios. In addition to the simulation of population dynamics, we also employ the NQP model to compute linear and nonlinear spectra.

Similar to other neural network architectures, training NQP model requires a large amount of high precision data. Those data can only be generated by conventional iterative solvers with a small enough time step, which in turn leads to a large computational cost in the data preparation stage. To address this issue, we introduce a super-resolution algorithm, which only relies on the low-resolution data to construct high-resolution NQP. The intrinsic error in the dataset is systematically improved by utilizing the physics-informed loss function (PILF), which is defined directly from the HEOM. The optimization of PILF does not rely on any prepared dataset, which significantly improves the overall computational performance of the NQP model.

The rest of the paper is organized as follows. In Section II, we introduce the HEOM approach and linear and nonlinear response functions. In Section III, we present the NQP model, including the FNO architecture, the training setup, and the super-resolution algorithm. Numerical demonstrations on the Fenna-Matthews-Olson (FMO) system are presented in Section IV. Finally, conclusions are drawn in Section V.

II Methodology

II.1 HEOM approach

We consider an electronic system interacting with a set of heat baths. The total Hamiltonian can be written as

H^tot=H^s+H^b+H^sb.subscript^𝐻𝑡𝑜𝑡subscript^𝐻𝑠subscript^𝐻𝑏subscript^𝐻𝑠𝑏\hat{H}_{tot}=\hat{H}_{s}+\hat{H}_{b}+\hat{H}_{s-b}.over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT = over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT + over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT + over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_s - italic_b end_POSTSUBSCRIPT . (1)

Here, the first term H^ssubscript^𝐻𝑠\hat{H}_{s}over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT is the Hamiltonian of the electronic system,

H^s=j=1Nεj|jj|+jjΔj,j|jj|,subscript^𝐻𝑠superscriptsubscript𝑗1𝑁subscript𝜀𝑗ket𝑗quantum-operator-product𝑗subscript𝑗superscript𝑗subscriptΔ𝑗superscript𝑗𝑗brasuperscript𝑗\hat{H}_{s}=\sum_{j=1}^{N}\varepsilon_{j}|j\rangle\langle j|+\sum_{j\neq j^{% \prime}}\Delta_{j,j^{\prime}}|j\rangle\langle j^{\prime}|,over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT italic_ε start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT | italic_j ⟩ ⟨ italic_j | + ∑ start_POSTSUBSCRIPT italic_j ≠ italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT roman_Δ start_POSTSUBSCRIPT italic_j , italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT | italic_j ⟩ ⟨ italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT | , (2)

where εjsubscript𝜀𝑗\varepsilon_{j}italic_ε start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is the energy of the j𝑗jitalic_j-th electronic state |jket𝑗|j\rangle| italic_j ⟩, and Δj,jsubscriptΔ𝑗superscript𝑗\Delta_{j,j^{\prime}}roman_Δ start_POSTSUBSCRIPT italic_j , italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT end_POSTSUBSCRIPT is the interstate coupling. The second term is the Hamiltonian of harmonic heat baths,

H^b=j=1Nν(p^j,ν22+ωj,ν2x^j,ν22),subscript^𝐻𝑏superscriptsubscript𝑗1𝑁subscript𝜈superscriptsubscript^𝑝𝑗𝜈22superscriptsubscript𝜔𝑗𝜈2superscriptsubscript^𝑥𝑗𝜈22\hat{H}_{b}=\sum_{j=1}^{N}\sum_{\nu}\left(\frac{\hat{p}_{j,\nu}^{2}}{2}+\frac{% \omega_{j,\nu}^{2}\hat{x}_{j,\nu}^{2}}{2}\right),over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_b end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT ( divide start_ARG over^ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG + divide start_ARG italic_ω start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT over^ start_ARG italic_x end_ARG start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG start_ARG 2 end_ARG ) , (3)

where p^j,νsubscript^𝑝𝑗𝜈\hat{p}_{j,\nu}over^ start_ARG italic_p end_ARG start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT, x^j,νsubscript^𝑥𝑗𝜈\hat{x}_{j,\nu}over^ start_ARG italic_x end_ARG start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT, and ωj,νsubscript𝜔𝑗𝜈\omega_{j,\nu}italic_ω start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT are the dimensionless momentum, coordinate, and frequency of ν𝜈\nuitalic_ν-th oscillator of j𝑗jitalic_j-th bath. The last term is the system-bath interaction Hamiltonian,

H^sb=j=1NV^jνgj,νx^j,ν,subscript^𝐻𝑠𝑏superscriptsubscript𝑗1𝑁subscript^𝑉𝑗subscript𝜈subscript𝑔𝑗𝜈subscript^𝑥𝑗𝜈\hat{H}_{s-b}=-\sum_{j=1}^{N}\hat{V}_{j}\sum_{\nu}g_{j,\nu}\hat{x}_{j,\nu},over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_s - italic_b end_POSTSUBSCRIPT = - ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT over^ start_ARG italic_x end_ARG start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT , (4)

where V^j=|jj|subscript^𝑉𝑗ket𝑗bra𝑗\hat{V}_{j}=|j\rangle\langle{j}|over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = | italic_j ⟩ ⟨ italic_j |, and gj,νsubscript𝑔𝑗𝜈g_{j,\nu}italic_g start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT is the coupling constant between the j𝑗jitalic_j-th state and the ν𝜈\nuitalic_ν-th oscillator which can be specified by a spectral density,

Jj(ω)=νgj,ν2δ(ωωj,ν).subscript𝐽𝑗𝜔subscript𝜈superscriptsubscript𝑔𝑗𝜈2𝛿𝜔subscript𝜔𝑗𝜈J_{j}(\omega)=\sum_{\nu}g_{j,\nu}^{2}\delta(\omega-\omega_{j,\nu}).italic_J start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_ω ) = ∑ start_POSTSUBSCRIPT italic_ν end_POSTSUBSCRIPT italic_g start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT italic_δ ( italic_ω - italic_ω start_POSTSUBSCRIPT italic_j , italic_ν end_POSTSUBSCRIPT ) . (5)

The influence of the j𝑗jitalic_j-th heat bath on the electronic system is characterized by the bath correlation function, [17, 18]

Cj(t)=1π0dωJj(ω)[coth(βω2)cos(ωt)isin(ωt)],missing-subexpressionsubscript𝐶𝑗𝑡1𝜋superscriptsubscript0differential-d𝜔subscript𝐽𝑗𝜔delimited-[]hyperbolic-cotangent𝛽Planck-constant-over-2-pi𝜔2𝜔𝑡𝑖𝜔𝑡\displaystyle\begin{aligned} &C_{j}(t)\\ =&\frac{1}{\pi}\int_{0}^{\infty}{\rm{d}}\omega J_{j}(\omega)\left[\coth\left(% \frac{\beta\hbar\omega}{2}\right)\cos(\omega t)-i\,\sin(\omega t)\right],\end{aligned}start_ROW start_CELL end_CELL start_CELL italic_C start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_t ) end_CELL end_ROW start_ROW start_CELL = end_CELL start_CELL divide start_ARG 1 end_ARG start_ARG italic_π end_ARG ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT roman_d italic_ω italic_J start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_ω ) [ roman_coth ( divide start_ARG italic_β roman_ℏ italic_ω end_ARG start_ARG 2 end_ARG ) roman_cos ( italic_ω italic_t ) - italic_i roman_sin ( italic_ω italic_t ) ] , end_CELL end_ROW (6)

where Jj(ω)subscript𝐽𝑗𝜔J_{j}(\omega)italic_J start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_ω ) is the spectral density of the j𝑗jitalic_j-th bath, β=1/kBT𝛽1subscript𝑘𝐵𝑇\beta=1/k_{B}Titalic_β = 1 / italic_k start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT italic_T is the inverse temperature with kBsubscript𝑘𝐵k_{B}italic_k start_POSTSUBSCRIPT italic_B end_POSTSUBSCRIPT being the Boltzmann constant. We model the bath by the Drude spectral density,

Jj(ω)=2λjγjωγj2+ω2,subscript𝐽𝑗𝜔2subscript𝜆𝑗subscript𝛾𝑗𝜔superscriptsubscript𝛾𝑗2superscript𝜔2J_{j}(\omega)=\frac{2\lambda_{j}\gamma_{j}\omega}{\gamma_{j}^{2}+\omega^{2}},italic_J start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_ω ) = divide start_ARG 2 italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_ω end_ARG start_ARG italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT + italic_ω start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG , (7)

where λjsubscript𝜆𝑗\lambda_{j}italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is the reorganization energy, and γjsubscript𝛾𝑗\gamma_{j}italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is the inverse of the bath correlation time. In this paper, we consider the high-temperature approximation (βγj<1𝛽Planck-constant-over-2-pisubscript𝛾𝑗1\beta\hbar\gamma_{j}<1italic_β roman_ℏ italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT < 1), and express Eq. (6) as Cj(t)=cjeγj|t|subscript𝐶𝑗𝑡subscript𝑐𝑗superscript𝑒subscript𝛾𝑗𝑡C_{j}(t)=c_{j}e^{-\gamma_{j}|t|}italic_C start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( italic_t ) = italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT - italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT | italic_t | end_POSTSUPERSCRIPT, where

cj=2λjβ2iλjγj.subscript𝑐𝑗2subscript𝜆𝑗𝛽superscriptPlanck-constant-over-2-pi2𝑖subscript𝜆𝑗subscript𝛾𝑗Planck-constant-over-2-pic_{j}=\frac{2\lambda_{j}}{\beta\hbar^{2}}-i\,\frac{\lambda_{j}\gamma_{j}}{% \hbar}.italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = divide start_ARG 2 italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG italic_β roman_ℏ start_POSTSUPERSCRIPT 2 end_POSTSUPERSCRIPT end_ARG - italic_i divide start_ARG italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_ARG start_ARG roman_ℏ end_ARG . (8)

To go beyond this approximation, one can include so-called low-temperature correction terms. [38, 39] The time evolution of the reduced density matrix can be described by the HEOM approach, which is written as [40, 19]

tρ^n(t)subscript𝑡subscript^𝜌𝑛𝑡\displaystyle\partial_{t}\hat{\rho}_{\vec{n}}(t)∂ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( italic_t ) =[iH^s×+j=1Nnjγj]ρ^n(t)ij=1NV^j×ρ^n+ej(t)absentdelimited-[]𝑖Planck-constant-over-2-pisuperscriptsubscript^𝐻𝑠superscriptsubscript𝑗1𝑁subscript𝑛𝑗subscript𝛾𝑗subscript^𝜌𝑛𝑡𝑖superscriptsubscript𝑗1𝑁superscriptsubscript^𝑉𝑗subscript^𝜌𝑛subscript𝑒𝑗𝑡\displaystyle=-\left[\frac{i}{\hbar}\hat{H}_{s}^{\times}+\sum_{j=1}^{N}n_{j}% \gamma_{j}\right]\hat{\rho}_{\vec{n}}(t)-i\sum_{j=1}^{N}\hat{V}_{j}^{\times}% \hat{\rho}_{\vec{n}+\vec{e}_{j}}(t)= - [ divide start_ARG italic_i end_ARG start_ARG roman_ℏ end_ARG over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_s end_POSTSUBSCRIPT start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT + ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT italic_n start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ] over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( italic_t ) - italic_i ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG + over→ start_ARG italic_e end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_t ) (9)
ij=1N[cjV^jρ^nej(t)cjρ^nej(t)V^j],𝑖superscriptsubscript𝑗1𝑁delimited-[]subscript𝑐𝑗subscript^𝑉𝑗subscript^𝜌𝑛subscript𝑒𝑗𝑡superscriptsubscript𝑐𝑗subscript^𝜌𝑛subscript𝑒𝑗𝑡subscript^𝑉𝑗\displaystyle-i\sum_{j=1}^{N}\left[c_{j}\hat{V}_{j}\hat{\rho}_{\vec{n}-\vec{e}% _{j}}(t)-c_{j}^{\ast}\hat{\rho}_{\vec{n}-\vec{e}_{j}}(t)\hat{V}_{j}\right],- italic_i ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N end_POSTSUPERSCRIPT [ italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG - over→ start_ARG italic_e end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_t ) - italic_c start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∗ end_POSTSUPERSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG - over→ start_ARG italic_e end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT end_POSTSUBSCRIPT ( italic_t ) over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ] ,

where n={n1,n2,,nN}𝑛subscript𝑛1subscript𝑛2subscript𝑛𝑁\vec{n}=\{n_{1},n_{2},...,n_{N}\}over→ start_ARG italic_n end_ARG = { italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_n start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT } denotes the index vector with njsubscript𝑛𝑗n_{j}italic_n start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT being the non-negative integer, and we have introduced abbreviated notations, A^×B^=A^B^B^A^superscript^𝐴^𝐵^𝐴^𝐵^𝐵^𝐴\hat{A}^{\times}\hat{B}=\hat{A}\hat{B}-\hat{B}\hat{A}over^ start_ARG italic_A end_ARG start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT over^ start_ARG italic_B end_ARG = over^ start_ARG italic_A end_ARG over^ start_ARG italic_B end_ARG - over^ start_ARG italic_B end_ARG over^ start_ARG italic_A end_ARG. The density operator with all indexes equal to zero, ρ^0(t)subscript^𝜌0𝑡\hat{\rho}_{\vec{0}}(t)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT ( italic_t ) with 0={0,0,,0}0000\vec{0}=\{0,0,...,0\}over→ start_ARG 0 end_ARG = { 0 , 0 , … , 0 }, corresponds to the density operator of the reduced electronic system, while all other density operators are introduced to describe non-markovian and non-perturbative effects.

II.2 Linear and nonlinear response functions

The linear and nonlinear spectra are evaluated within the framework of response function theory. [16, 1, 41] The linear response function is defined as

R(1)(t)=iTr{μ^𝒢tot(t)μ^×ρ^tot(0)},superscript𝑅1𝑡𝑖Planck-constant-over-2-piTr^𝜇subscript𝒢𝑡𝑜𝑡𝑡superscript^𝜇subscript^𝜌𝑡𝑜𝑡0R^{(1)}(t)=\frac{i}{\hbar}{\rm{Tr}}\left\{\hat{\mu}\mathcal{G}_{tot}(t)\hat{% \mu}^{\times}\hat{\rho}_{tot}(0)\right\},italic_R start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ( italic_t ) = divide start_ARG italic_i end_ARG start_ARG roman_ℏ end_ARG roman_Tr { over^ start_ARG italic_μ end_ARG caligraphic_G start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( italic_t ) over^ start_ARG italic_μ end_ARG start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( 0 ) } , (10)

where μ^^𝜇\hat{\mu}over^ start_ARG italic_μ end_ARG is the transition dipole operator, and ρ^totsubscript^𝜌𝑡𝑜𝑡\hat{\rho}_{tot}over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT and 𝒢tot(t)=exp(iH^tot×/t)subscript𝒢𝑡𝑜𝑡𝑡𝑖superscriptsubscript^𝐻𝑡𝑜𝑡Planck-constant-over-2-pi𝑡\mathcal{G}_{tot}(t)=\exp(-i\hat{H}_{tot}^{\times}/\hbar t)caligraphic_G start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( italic_t ) = roman_exp ( - italic_i over^ start_ARG italic_H end_ARG start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT / roman_ℏ italic_t ) are the density operator and the quantum propagator of the total system, respectively. The linear absorption spectrum is obtained by the Fourier transformation

R(1)(ω)=Im0dteiωtR(1)(t),superscript𝑅1𝜔Imsuperscriptsubscript0differential-d𝑡superscript𝑒𝑖𝜔𝑡superscript𝑅1𝑡R^{(1)}(\omega)={\rm{Im}}\int_{0}^{\infty}{\rm{d}}te^{i\omega t}R^{(1)}(t),italic_R start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ( italic_ω ) = roman_Im ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT roman_d italic_t italic_e start_POSTSUPERSCRIPT italic_i italic_ω italic_t end_POSTSUPERSCRIPT italic_R start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ( italic_t ) , (11)

where Im denotes the imaginary part. The third-order response function is defined as

R(3)superscript𝑅3\displaystyle R^{(3)}italic_R start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT (t3,t2,t1)=(i)3subscript𝑡3subscript𝑡2subscript𝑡1superscript𝑖Planck-constant-over-2-pi3\displaystyle(t_{3},t_{2},t_{1})={\left(\frac{i}{\hbar}\right)}^{3}( italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) = ( divide start_ARG italic_i end_ARG start_ARG roman_ℏ end_ARG ) start_POSTSUPERSCRIPT 3 end_POSTSUPERSCRIPT (12)
Tr{μ^𝒢tot(t3)μ^×𝒢tot(t2)μ^×𝒢tot(t1)μ^×ρ^tot(0)}.Tr^𝜇subscript𝒢𝑡𝑜𝑡subscript𝑡3superscript^𝜇subscript𝒢𝑡𝑜𝑡subscript𝑡2superscript^𝜇subscript𝒢𝑡𝑜𝑡subscript𝑡1superscript^𝜇subscript^𝜌𝑡𝑜𝑡0\displaystyle{\rm{Tr}}\left\{\hat{\mu}\mathcal{G}_{tot}(t_{3})\hat{\mu}^{% \times}\mathcal{G}_{tot}(t_{2})\hat{\mu}^{\times}\mathcal{G}_{tot}(t_{1})\hat{% \mu}^{\times}\hat{\rho}_{tot}(0)\right\}.roman_Tr { over^ start_ARG italic_μ end_ARG caligraphic_G start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ) over^ start_ARG italic_μ end_ARG start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT caligraphic_G start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) over^ start_ARG italic_μ end_ARG start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT caligraphic_G start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) over^ start_ARG italic_μ end_ARG start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( 0 ) } .

The rephasing and non-rephasing parts of 2D spectrum are defined by

R(3,R)superscript𝑅3𝑅\displaystyle R^{(3,R)}italic_R start_POSTSUPERSCRIPT ( 3 , italic_R ) end_POSTSUPERSCRIPT (ω3,ω1;t2)=Imsubscript𝜔3subscript𝜔1subscript𝑡2Im\displaystyle(\omega_{3},\omega_{1};t_{2})={\rm{Im}}( italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = roman_Im (13)
0dt30dt1eiω3t3iω1t1R(3)(t3,t2,t1),superscriptsubscript0differential-dsubscript𝑡3superscriptsubscript0differential-dsubscript𝑡1superscript𝑒𝑖subscript𝜔3subscript𝑡3𝑖subscript𝜔1subscript𝑡1superscript𝑅3subscript𝑡3subscript𝑡2subscript𝑡1\displaystyle\int_{0}^{\infty}{\rm{d}}t_{3}\int_{0}^{\infty}{\rm{d}}t_{1}e^{i% \omega_{3}t_{3}-i\omega_{1}t_{1}}R^{(3)}(t_{3},t_{2},t_{1}),∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT roman_d italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT roman_d italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT italic_i italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT - italic_i italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_R start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ,
R(3,NR)superscript𝑅3𝑁𝑅\displaystyle R^{(3,NR)}italic_R start_POSTSUPERSCRIPT ( 3 , italic_N italic_R ) end_POSTSUPERSCRIPT (ω3,ω1;t2)=Imsubscript𝜔3subscript𝜔1subscript𝑡2Im\displaystyle(\omega_{3},\omega_{1};t_{2})={\rm{Im}}( italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = roman_Im (14)
0dt30dt1eiω3t3+iω1t1R(3)(t3,t2,t1),superscriptsubscript0differential-dsubscript𝑡3superscriptsubscript0differential-dsubscript𝑡1superscript𝑒𝑖subscript𝜔3subscript𝑡3𝑖subscript𝜔1subscript𝑡1superscript𝑅3subscript𝑡3subscript𝑡2subscript𝑡1\displaystyle\int_{0}^{\infty}{\rm{d}}t_{3}\int_{0}^{\infty}{\rm{d}}t_{1}e^{i% \omega_{3}t_{3}+i\omega_{1}t_{1}}R^{(3)}(t_{3},t_{2},t_{1}),∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT roman_d italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ∞ end_POSTSUPERSCRIPT roman_d italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_e start_POSTSUPERSCRIPT italic_i italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT + italic_i italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT italic_R start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT ( italic_t start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ) ,

Within the HEOM formalism, Eqs. (10) and (12) can be evaluated by replacing ρ^totsubscript^𝜌𝑡𝑜𝑡\hat{\rho}_{tot}over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT and 𝒢tot(t)subscript𝒢𝑡𝑜𝑡𝑡\mathcal{G}_{tot}(t)caligraphic_G start_POSTSUBSCRIPT italic_t italic_o italic_t end_POSTSUBSCRIPT ( italic_t ) with ρ^n(0)subscript^𝜌𝑛0\hat{\rho}_{\vec{n}}(0)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( 0 ) and Eq. (9), respectively. The final trace is only taken for the zeroth order element of ρ^n(t)subscript^𝜌𝑛𝑡\hat{\rho}_{\vec{n}}(t)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( italic_t ), i.e., ρ^0(t)subscript^𝜌0𝑡\hat{\rho}_{\vec{0}}(t)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT ( italic_t ).

III Neural quantum propagator

We introduce the abbreviated index, x=(j,j,n1,n2,,nN)𝑥𝑗superscript𝑗subscript𝑛1subscript𝑛2subscript𝑛𝑁x=(j,j^{\prime},n_{1},n_{2},...,n_{N})italic_x = ( italic_j , italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , italic_n start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_n start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT , … , italic_n start_POSTSUBSCRIPT italic_N end_POSTSUBSCRIPT ), and align the matrix entries ρ(x,t)=j|ρ^n(t)|j𝜌𝑥𝑡quantum-operator-product𝑗subscript^𝜌𝑛𝑡superscript𝑗\rho(x,t)=\langle j|\hat{\rho}_{\vec{n}}(t)|j^{\prime}\rangleitalic_ρ ( italic_x , italic_t ) = ⟨ italic_j | over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( italic_t ) | italic_j start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⟩ as the column vector

ρt={ρ(x0,t),ρ(x1,t),}.subscript𝜌𝑡𝜌subscript𝑥0𝑡𝜌subscript𝑥1𝑡\vec{\rho}_{t}=\{\rho(x_{0},t),\rho(x_{1},t),...\}.over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = { italic_ρ ( italic_x start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT , italic_t ) , italic_ρ ( italic_x start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , italic_t ) , … } . (15)

The HEOM (Eq. (9)) can be recast to a matrix-vector form as

tρt=𝑳ρt,subscript𝑡subscript𝜌𝑡𝑳subscript𝜌𝑡\partial_{t}\vec{\rho}_{t}={\bm{L}}\,\vec{\rho}_{t},∂ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = bold_italic_L over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT , (16)

where the matrix entries of 𝑳𝑳{\bm{L}}bold_italic_L can be inferred from the right-hand side of Eq. (9). The propagator of HEOM is then defined through the integration form as 𝑮t=exp(t𝑳)subscript𝑮𝑡𝑡𝑳{\bm{G}}_{t}=\exp({t{\bm{L}}})bold_italic_G start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = roman_exp ( italic_t bold_italic_L ), which satisfies the composition property,

ρt=𝑮tt0ρt0=e(tt0)𝑳.subscript𝜌𝑡subscript𝑮𝑡subscript𝑡0subscript𝜌subscript𝑡0superscript𝑒𝑡subscript𝑡0𝑳\vec{\rho}_{t}={\bm{G}}_{t-t_{0}}\vec{\rho}_{t_{0}}=e^{(t-t_{0}){\bm{L}}}.over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = bold_italic_G start_POSTSUBSCRIPT italic_t - italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUBSCRIPT = italic_e start_POSTSUPERSCRIPT ( italic_t - italic_t start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT ) bold_italic_L end_POSTSUPERSCRIPT . (17)

To facilitate the description of later sections, we also introduce uniform time grid as tm=mδtsubscript𝑡𝑚𝑚subscript𝛿𝑡t_{m}=m\delta_{t}italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = italic_m italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT for m=1Nt𝑚1similar-tosubscript𝑁𝑡m=1\sim N_{t}italic_m = 1 ∼ italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT, where δt=tmax/Ntsubscript𝛿𝑡subscript𝑡𝑚𝑎𝑥subscript𝑁𝑡\delta_{t}=t_{max}/N_{t}italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT / italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT is the time step with Ntsubscript𝑁𝑡N_{t}italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT and tmaxsubscript𝑡𝑚𝑎𝑥t_{max}italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT being the total number of time steps and the fixed upper time limit, respectively.

III.1 Model’s architecture

To construct the NQP model, we follow our previous work [37] and parameterize the HEOM propagator as a deep neural network, 𝑮tm[θ]subscript𝑮subscript𝑡𝑚delimited-[]𝜃{\bm{G}}_{t_{m}}[\theta]bold_italic_G start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT [ italic_θ ], where θ𝜃\thetaitalic_θ represents all the trainable parameters. The architecture of the NQP model is shown in Fig. 1. In Fig. 1(a), Pinsubscript𝑃𝑖𝑛P_{in}italic_P start_POSTSUBSCRIPT italic_i italic_n end_POSTSUBSCRIPT and Poutsubscript𝑃𝑜𝑢𝑡P_{out}italic_P start_POSTSUBSCRIPT italic_o italic_u italic_t end_POSTSUBSCRIPT are the linear projections between physical and latent Fourier spaces. They are parameterized as the point-wise convolution network with one hidden layer and a Gaussian Error Linear Unit (GeLU) activation function. The rest parts are called the Fourier layers with their structure presented in Fig. 1(b).

To process the input of the l𝑙litalic_l-th layer vlsubscript𝑣𝑙\vec{v}_{l}over→ start_ARG italic_v end_ARG start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT, two different routes are adopted. On the upper route, {\mathcal{F}}caligraphic_F and 1superscript1{\mathcal{F}}^{-1}caligraphic_F start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT denote the Fourier and its inverse transform. The point-wise convolution Wlsubscript𝑊𝑙W_{l}italic_W start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT serves as the learnable weight in Fourier space. Only the lowest kmaxsubscript𝑘𝑚𝑎𝑥k_{max}italic_k start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT modes are explicitly included in the weight tensor, while others with higher frequencies are truncated to control the size of the model and avoid the numerical instabilities. The lower route is similar to the residual network. The results of two different routes are summed and activated by GeLU before passing to the next layer.

Refer to caption
Figure 1: The architecture of (a) the NQP model, and (b) the l𝑙litalic_l-th Fourier layer. Here, \mathcal{F}caligraphic_F and 1superscript1\mathcal{F}^{-1}caligraphic_F start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT denote the Fourier transform and its inverse. +++ and σ𝜎\sigmaitalic_σ represent the element-wise sum and the GeLU activation function. The learnable parameters are those in the Pinsubscript𝑃𝑖𝑛P_{in}italic_P start_POSTSUBSCRIPT italic_i italic_n end_POSTSUBSCRIPT, Poutsubscript𝑃𝑜𝑢𝑡P_{out}italic_P start_POSTSUBSCRIPT italic_o italic_u italic_t end_POSTSUBSCRIPT, and Wlsubscript𝑊𝑙W_{l}italic_W start_POSTSUBSCRIPT italic_l end_POSTSUBSCRIPT.

The NQP model takes all the entries of initial condition ρ0subscript𝜌0\vec{\rho}_{0}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT and a chosen time t𝑡titalic_t as the input, and outputs ρt=𝑮t[θ]ρ0subscript𝜌𝑡subscript𝑮𝑡delimited-[]𝜃subscript𝜌0\vec{\rho}_{t}={\bm{G}}_{t}[\theta]\vec{\rho}_{0}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = bold_italic_G start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ italic_θ ] over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT satisfying Eq. (16). It should be noted that no restrictions are a priori made on the explicit forms of ρ0subscript𝜌0\vec{\rho}_{0}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. The NQP model can be directly applied to the simulation of response function by taking the field interaction form μ^×ρ^nsuperscript^𝜇subscript^𝜌𝑛\hat{\mu}^{\times}\hat{\rho}_{\vec{n}}over^ start_ARG italic_μ end_ARG start_POSTSUPERSCRIPT × end_POSTSUPERSCRIPT over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT as the input. Since the composition property is also retained during the parameterization, the time evolution up to arbitrarily long times can be obtained by recursively applying Eq. (17).

III.2 Training objective

The NQP model is trained by minimizing an objective function \mathcal{L}caligraphic_L, defined as

=αdata+(1α)phys,𝛼subscript𝑑𝑎𝑡𝑎1𝛼subscript𝑝𝑦𝑠\mathcal{L}=\alpha\mathcal{L}_{data}+(1-\alpha)\mathcal{L}_{phys},caligraphic_L = italic_α caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT + ( 1 - italic_α ) caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT , (18)

where datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT and physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT are referred to as the data and physics-informed loss functions, respectively. The hyper-parameter α(0,1)𝛼01\alpha\in(0,1)italic_α ∈ ( 0 , 1 ) serves as a weight factor, which will be dynamically adjusted in the training stage.

For the data part datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT, we prepare a dataset by randomly sampling a set of initial condition {ρ0}subscript𝜌0\{\vec{\rho}_{0}\}{ over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT }, and then evaluating their time evolution {ρt}subscript𝜌𝑡\{\vec{\rho}_{t}\}{ over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT } up to t[0,tmax]𝑡0subscript𝑡𝑚𝑎𝑥t\in[0,t_{max}]italic_t ∈ [ 0 , italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT ] using conventional RK4 method. The data loss function is defined as follows,

data=p=1Ndatam=1Nt𝑮tm[θ]ρ0(p)ρtm(p)Fρtm(p)F,subscript𝑑𝑎𝑡𝑎superscriptsubscript𝑝1subscript𝑁𝑑𝑎𝑡𝑎superscriptsubscript𝑚1subscript𝑁𝑡subscriptnormsubscript𝑮subscript𝑡𝑚delimited-[]𝜃superscriptsubscript𝜌0𝑝superscriptsubscript𝜌subscript𝑡𝑚𝑝𝐹subscriptnormsuperscriptsubscript𝜌subscript𝑡𝑚𝑝𝐹\mathcal{L}_{data}=\sum_{p=1}^{N_{data}}\sum_{m=1}^{N_{t}}\frac{{\left|\left|{% \bm{G}}_{t_{m}}[\theta]\vec{\rho}_{0}^{(p)}-\vec{\rho}_{t_{m}}^{(p)}\right|% \right|}_{F}}{{\left|\left|\vec{\rho}_{t_{m}}^{(p)}\right|\right|}_{F}},caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_p = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_m = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUPERSCRIPT divide start_ARG | | bold_italic_G start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT [ italic_θ ] over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT - over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT | | start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT end_ARG start_ARG | | over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT | | start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT end_ARG , (19)

where ||||F||\cdot||_{F}| | ⋅ | | start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT denotes the Frobenius-norm, Ndatasubscript𝑁𝑑𝑎𝑡𝑎N_{data}italic_N start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT is the number of individual samples in the dataset, and ρ0(p)superscriptsubscript𝜌0𝑝\vec{\rho}_{0}^{(p)}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT and ρtm(p)superscriptsubscript𝜌subscript𝑡𝑚𝑝\vec{\rho}_{t_{m}}^{(p)}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p ) end_POSTSUPERSCRIPT are the initial condition and the corresponding evolution for the p𝑝pitalic_p-th sample, respectively.

To ensure the universality of 𝑮t[θ]subscript𝑮𝑡delimited-[]𝜃{\bm{G}}_{t}[\theta]bold_italic_G start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ italic_θ ] that is applicable to any ρ0subscript𝜌0\vec{\rho}_{0}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT, one needs a large number of samples Ndatasubscript𝑁𝑑𝑎𝑡𝑎N_{data}italic_N start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT, which leads to even more computational cost in the data preparation stage. We introduce a physics-informed loss function to reduce the effective number of samples Ndatasubscript𝑁𝑑𝑎𝑡𝑎N_{data}italic_N start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT while keeping the universality of 𝑮t[θ]subscript𝑮𝑡delimited-[]𝜃{\bm{G}}_{t}[\theta]bold_italic_G start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT [ italic_θ ].[42] The physics-informed loss function is defined by minimizing the difference between left- and right-hand sides of Eq. (16) as

phys=p=1Nphysm=1Ntt𝑮tm[θ]ρ0(p)𝑳𝑮tm[θ]ρ0(p)F,subscript𝑝𝑦𝑠superscriptsubscriptsuperscript𝑝1subscript𝑁𝑝𝑦𝑠superscriptsubscript𝑚1subscript𝑁𝑡subscriptnormsubscript𝑡subscript𝑮subscript𝑡𝑚delimited-[]𝜃superscriptsubscript𝜌0superscript𝑝𝑳subscript𝑮subscript𝑡𝑚delimited-[]𝜃superscriptsubscript𝜌0superscript𝑝𝐹\mathcal{L}_{phys}=\sum_{p^{\prime}=1}^{N_{phys}}\sum_{m=1}^{N_{t}}{\left|% \left|{\partial_{t}}{\bm{G}}_{t_{m}}[\theta]\vec{\rho}_{0}^{(p^{\prime})}-{\bm% {L}}{\bm{G}}_{t_{m}}[\theta]\vec{\rho}_{0}^{(p^{\prime})}\right|\right|}_{F},caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT = ∑ start_POSTSUBSCRIPT italic_p start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ∑ start_POSTSUBSCRIPT italic_m = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT end_POSTSUPERSCRIPT | | ∂ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT bold_italic_G start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT [ italic_θ ] over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT - bold_italic_L bold_italic_G start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT end_POSTSUBSCRIPT [ italic_θ ] over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( italic_p start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ) end_POSTSUPERSCRIPT | | start_POSTSUBSCRIPT italic_F end_POSTSUBSCRIPT , (20)

where Nphyssubscript𝑁𝑝𝑦𝑠N_{phys}italic_N start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT is the number of samples in the physics dataset. The time derivative tρtsubscript𝑡subscript𝜌𝑡\partial_{t}\vec{\rho}_{t}∂ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT is evaluated by the finite difference method. It should be mentioned that the calculation of physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT involves far less samples as compared to that of datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT. In addition, we adopt the on-the-fly sampling approach by re-generating the physics dataset at each training epoch to further improve the performance of the trained model.

III.3 Super resolution algorithm

To further reduce the computational cost in the data preparation stage, we introduce a super resolution algorithm, which allows the construction of the high resolution NQP model from a lower resolution dataset. As illustrated from the previous subsection, the lower resolution dataset is prepared by integrating the HEOM with a larger time step Kδt𝐾subscript𝛿𝑡K\delta_{t}italic_K italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT (K>1𝐾1K>1italic_K > 1) for a set of {ρ0}subscript𝜌0\{\vec{\rho}_{0}\}{ over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT } using the RK4 method. The obtained data is then embedded into the finer grid {tm=mδt}subscript𝑡𝑚𝑚subscript𝛿𝑡\{t_{m}=m\delta_{t}\}{ italic_t start_POSTSUBSCRIPT italic_m end_POSTSUBSCRIPT = italic_m italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT } by interpolating the missing value using the linear interpolation scheme. datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT is evaluated on this interpolated dataset in the training stage.

On the other hand, the physics-informed loss function physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT is evaluated directly on the finer time grid and serves as the correction over the deviation from the dataset. The super resolution algorithm is then completed by dynamically adjusting the weight factor α𝛼\alphaitalic_α in Eq. (18) during the training process. At the begining, we set α=1𝛼1\alpha=1italic_α = 1 and randomly initilize all the model’s parameters. During the training process, α𝛼\alphaitalic_α is gradually decreased to a small enough value such as 0.01similar-toabsent0.01\sim 0.01∼ 0.01, and physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT gradually becomes the dominant contribution term. The minimization of physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT allows the improvement of the resolution over the intrinsic deviation of the dataset.

At the end of this subsection, we briefly discuss the possibility of data-free training, which is achieved by fixing α=0𝛼0\alpha=0italic_α = 0 and using only physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT during the training process. From a theoretical point of view, training with or without datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT results in the same model as long as physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT becomes the dominant contribution of Eq. (18). In practice, however, training with only physsubscript𝑝𝑦𝑠\mathcal{L}_{phys}caligraphic_L start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT requires longer epochs for convergence when all the learnable parameters are randomly initialized. In this case, a prepared dataset, even with low resolution, serves as a well-performed guidance for the training.

IV Numerical experiments

In the following, we use the Fenna-Matthews-Olson complex as our model system. [43, 44] The electronic state |jket𝑗|j\rangle| italic_j ⟩ (j=1,,7𝑗17j=1,\cdots,7italic_j = 1 , ⋯ , 7) corresponds to the state where only j𝑗jitalic_j-th pigment is excited, and |8=|gket8ket𝑔|8\rangle=|g\rangle| 8 ⟩ = | italic_g ⟩ is the ground state. We set V^j=|jj|subscript^𝑉𝑗ket𝑗bra𝑗\hat{V}_{j}=|j\rangle\langle j|over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = | italic_j ⟩ ⟨ italic_j | (j=17𝑗17j=1\cdots 7italic_j = 1 ⋯ 7) and V^g0subscript^𝑉𝑔0\hat{V}_{g}\equiv 0over^ start_ARG italic_V end_ARG start_POSTSUBSCRIPT italic_g end_POSTSUBSCRIPT ≡ 0. The heat bath parameters are chosen as λj=35cm1subscript𝜆𝑗35superscriptcm1\lambda_{j}=35\,{\rm{cm}}^{-1}italic_λ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 35 roman_cm start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT, γj=200cm1subscript𝛾𝑗200superscriptcm1\gamma_{j}=200\,{\rm{cm}}^{-1}italic_γ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT = 200 roman_cm start_POSTSUPERSCRIPT - 1 end_POSTSUPERSCRIPT, and T=300K𝑇300KT=300\,{\rm{K}}italic_T = 300 roman_K, respectively. The HEOM is truncated at the hierarchy level of nj2subscript𝑛𝑗2\sum n_{j}\leq 2∑ italic_n start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ≤ 2 after adopting the filtering algorithm,[45] which is accurate enough for our testing. We set the upper time limit as tmax=30fssubscript𝑡𝑚𝑎𝑥30fst_{max}=30\,{\rm{fs}}italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT = 30 roman_fs with a time step of δt=0.6fssubscript𝛿𝑡0.6fs\delta_{t}=0.6\,{\rm{fs}}italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = 0.6 roman_fs, which results in Nt=50subscript𝑁𝑡50N_{t}=50italic_N start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT = 50 time points.

IV.1 Training and validation test

We first introduce the model’s hyper-parameters and training setup. In order to train the NQP model, we prepare the low-resolution training dataset by randomly sampling Ndata=3000subscript𝑁𝑑𝑎𝑡𝑎3000N_{data}=3000italic_N start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT = 3000 initial conditions ρ0subscript𝜌0\vec{\rho}_{0}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT. The low resolution dataset is prepared by integrating HEOM with a larger time step of 3δt3subscript𝛿𝑡3\delta_{t}3 italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT. The missing values are linearly interpolated when embedded into the finer grid with the time step of δtsubscript𝛿𝑡\delta_{t}italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT. To test the accuracy of the model, we also prepare a high-resolution validation set with 500 samples, following the same setup but using a smaller time step of δtsubscript𝛿𝑡\delta_{t}italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT. It should be noted that the high-resolution validation set is never referred in the training stage. In the training process, the physics dataset is prepared using the on-the-fly sampling algorithm by randomly generating Nphys=2000subscript𝑁𝑝𝑦𝑠2000N_{phys}=2000italic_N start_POSTSUBSCRIPT italic_p italic_h italic_y italic_s end_POSTSUBSCRIPT = 2000 initial conditions at each epoch.

The other hyper-parameters of the NQP model are chosen as follows. We set the hidden channel of projections Pinsubscript𝑃𝑖𝑛P_{in}italic_P start_POSTSUBSCRIPT italic_i italic_n end_POSTSUBSCRIPT and Poutsubscript𝑃𝑜𝑢𝑡P_{out}italic_P start_POSTSUBSCRIPT italic_o italic_u italic_t end_POSTSUBSCRIPT as 512512512512. We use 4444 Fourier layers, each of which has a hidden channel of size 64646464, and the total number of trainable parameters is around 10 million. The model is trained for 105superscript10510^{5}10 start_POSTSUPERSCRIPT 5 end_POSTSUPERSCRIPT epochs using the Adam optimizer. The learning rate is initially set to 104superscript10410^{-4}10 start_POSTSUPERSCRIPT - 4 end_POSTSUPERSCRIPT, and then halved every 500 epochs until reaching 106superscript106\leavevmode\nobreak\ 10^{-6}10 start_POSTSUPERSCRIPT - 6 end_POSTSUPERSCRIPT. The weight factor α𝛼\alphaitalic_α in Eq. (18) is initialized as α=1𝛼1\alpha=1italic_α = 1, and halved every 100 epochs until reaching 102similar-toabsentsuperscript102\sim 10^{-2}∼ 10 start_POSTSUPERSCRIPT - 2 end_POSTSUPERSCRIPT. All the tasks are performed on the Nvidia A40 GPU with 48 GB memory.

Refer to caption
Figure 2: The relative error of datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT for the validation set. The horizontal axis represents the index of the samples, while the vertical axis is the relative error.

To test our model, we present the validation test by showing the relative error of datasubscript𝑑𝑎𝑡𝑎\mathcal{L}_{data}caligraphic_L start_POSTSUBSCRIPT italic_d italic_a italic_t italic_a end_POSTSUBSCRIPT for each sample in the validation set in Fig. 2. For all samples, the relative error is around 0.5%percent0.50.5\%0.5 %. This error can be further reduced by using more samples in the data and physics sets, extending the training to longer epochs, and increasing the size of the NQP model. It should be pointed out that this error corresponds to the overall deviation of all the entries of ρ^nsubscript^𝜌𝑛\hat{\rho}_{\vec{n}}over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT, including those deep hierarchy elements that have much smaller magnitude as compared to ρ^0subscript^𝜌0\hat{\rho}_{\vec{0}}over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT.

IV.2 Population dynamics

By using the composition property, ρt1+t2=𝑮t2[θ]ρt1subscript𝜌subscript𝑡1subscript𝑡2subscript𝑮subscript𝑡2delimited-[]𝜃subscript𝜌subscript𝑡1\vec{\rho}_{t_{1}+t_{2}}={\bm{G}}_{t_{2}}[\theta]\vec{\rho}_{t_{1}}over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT + italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT = bold_italic_G start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT end_POSTSUBSCRIPT [ italic_θ ] over→ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT italic_t start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT, our NQP model can infer truly long-time dynamics well beyond the training time limit tmaxsubscript𝑡𝑚𝑎𝑥t_{max}italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT. To test the accuracy of the long-time dynamics predicted by the NQP model, we compute population dynamics up to 40tmaxsubscript𝑡𝑚𝑎𝑥t_{max}italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT (1.2pssimilar-toabsent1.2ps\sim 1.2\,{\rm{ps}}∼ 1.2 roman_ps). The reference results are obtained from the RK4 method with an integration time step of δtsubscript𝛿𝑡\delta_{t}italic_δ start_POSTSUBSCRIPT italic_t end_POSTSUBSCRIPT. Here, we consider two initial conditions: (a) ρ^0(0)=|11|subscript^𝜌00ket1bra1\hat{\rho}_{\vec{0}}(0)=|1\rangle\langle 1|over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT ( 0 ) = | 1 ⟩ ⟨ 1 |, and (b)𝑏(b)( italic_b ) ρ^0(0)=|66|subscript^𝜌00ket6bra6\hat{\rho}_{\vec{0}}(0)=|6\rangle\langle 6|over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT ( 0 ) = | 6 ⟩ ⟨ 6 |, which correspond to the excitation localized at the first and sixth pigment, respectively. All other hierarchy elements ρ^n(0)subscript^𝜌𝑛0\hat{\rho}_{\vec{n}}(0)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( 0 ) (n0𝑛0\vec{n}\neq\vec{0}over→ start_ARG italic_n end_ARG ≠ over→ start_ARG 0 end_ARG) are set to zero for the factorized bath initial condition.

Refer to caption
Figure 3: Population dynamics computed using NQP model (solid lines) and reference RK4 method (dashed lines) for two typical initial conditions up to t=1.2ps𝑡1.2pst=1.2\,{\rm{ps}}italic_t = 1.2 roman_ps (40tmax)40subscript𝑡𝑚𝑎𝑥(40\,t_{max})( 40 italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT ).

In Fig. 3, we show the time evolution of populations pn(t)=n|ρ^0(t)|nsubscript𝑝𝑛𝑡quantum-operator-product𝑛subscript^𝜌0𝑡𝑛p_{n}(t)=\langle n|\hat{\rho}_{\vec{0}}(t)|n\rangleitalic_p start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ( italic_t ) = ⟨ italic_n | over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT ( italic_t ) | italic_n ⟩ for sites (a) n=1𝑛1n=1italic_n = 1, 2222, and 3333, and (b) n=4𝑛4n=4italic_n = 4, 5555, and 6666, respectively, following the experimentally demonstrated energy transfer pathways. In both cases, our NQP model yields results in perfect agreement with those from the reference RK4 method up to 10tmax10subscript𝑡𝑚𝑎𝑥10t_{max}10 italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT. While model-predicted long time dynamics deviates slightly from the exact results due to the accumulation of errors in the training stage, our NQP model still infers the accurate dynamics even far beyond the training time (40tmax40subscript𝑡𝑚𝑎𝑥40t_{max}40 italic_t start_POSTSUBSCRIPT italic_m italic_a italic_x end_POSTSUBSCRIPT).

IV.3 Linear spectra

Next, we apply our NQP model to simulate the linear and third-order response functions as defined in Eqs. (10) and (12). We choose the transition dipole operator as

μ^=j=17μj(|jg|+|gj|),^𝜇superscriptsubscript𝑗17subscript𝜇𝑗ket𝑗bra𝑔ket𝑔bra𝑗\hat{\mu}=\sum_{j=1}^{7}\mu_{j}\left(|j\rangle\langle g|+|g\rangle\langle j|% \right),over^ start_ARG italic_μ end_ARG = ∑ start_POSTSUBSCRIPT italic_j = 1 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT 7 end_POSTSUPERSCRIPT italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT ( | italic_j ⟩ ⟨ italic_g | + | italic_g ⟩ ⟨ italic_j | ) , (21)

where μjsubscript𝜇𝑗\mu_{j}italic_μ start_POSTSUBSCRIPT italic_j end_POSTSUBSCRIPT is the transition dipole moment of j𝑗jitalic_j-th pigment. The system is initially in the electronic ground state before the photoexcitation, i.e., ρ^0(0)=|gg|subscript^𝜌00ket𝑔bra𝑔\hat{\rho}_{\vec{0}}(0)=|g\rangle\langle g|over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG 0 end_ARG end_POSTSUBSCRIPT ( 0 ) = | italic_g ⟩ ⟨ italic_g |.

Refer to caption
Figure 4: The linear spectrum R(1)(ω)superscript𝑅1𝜔R^{(1)}(\omega)italic_R start_POSTSUPERSCRIPT ( 1 ) end_POSTSUPERSCRIPT ( italic_ω ) evaluated from the NQP model (blue line) and the RK4 method (red line), respectively.

In Fig. 4, we show the linear spectrum evaluated from the NQP model and the RK4 method (t[0,40tmax]𝑡040subscript𝑡maxt\in[0,40t_{\mathrm{max}}]italic_t ∈ [ 0 , 40 italic_t start_POSTSUBSCRIPT roman_max end_POSTSUBSCRIPT ]). For each case, the peak intensities are normalized with respect to their maximum value. Overall, the NQP model yields spectrum in good agreement with that from the reference RK4. The small deviations of some peak intensities may be attributed to the model’s architecture. The adaption of Fourier transform in the model’s architecture generates some artificial aliasing modes, the magnitudes of which are increased after recurrent evaluation of long time dynamics. This systematic error could be resolved by carefully finetuning the truncation level of Fourier modes in NQP model, or by replacing the Fourier transform with methods such as wavelet transform or spatial convolutions.

IV.4 Two-dimensional spectra

Refer to caption
Figure 5: The rephasing (a, c) and non-rephasing (b, d) parts of 2D spectra at t2=0subscript𝑡20t_{2}=0italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 0 evaluated from the NQP model (a, b) and the RK4 reference (c, d), respectively.

We further apply the NQP model to compute 2D spectra at different time t2subscript𝑡2t_{2}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT. In Fig. 5, we show the rephasing (a, c) and non-rephasing (b, d) parts of 2D spectra at t2=0subscript𝑡20t_{2}=0italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 0 evaluated from the NQP model (a, b) and the RK4 reference (c, d), respectively. The 2D spectra at t2=50fssubscript𝑡250fst_{2}=50\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 50 roman_fs and 100fs100fs100\,{\rm{fs}}100 roman_fs are presented in Appendix A. The normalization of the peak intensities is performed with respect to their maximum values. The 2D spectra predicted by the NQP model are again in good agreement with those from the RK4 reference.

To quantify the difference between model-predicted and reference spectra at different t2subscript𝑡2t_{2}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, we introduce a quantity called average point-wise deviation as a function of t2subscript𝑡2t_{2}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT, which is defined as

Δ(t2)=Ω3dω3Ω1dω1|1RNQP(3)(ω3,ω1;t2)RRK4(3)(ω3,ω1;t2)|,Δsubscript𝑡2subscriptsubscriptΩ3differential-dsubscript𝜔3subscriptsubscriptΩ1differential-dsubscript𝜔11superscriptsubscript𝑅𝑁𝑄𝑃3subscript𝜔3subscript𝜔1subscript𝑡2superscriptsubscript𝑅𝑅𝐾43subscript𝜔3subscript𝜔1subscript𝑡2\Delta(t_{2})=\int_{\Omega_{3}}{\rm{d}}\omega_{3}\int_{\Omega_{1}}{\rm{d}}% \omega_{1}\,\left|1-\frac{R_{NQP}^{(3)}(\omega_{3},\omega_{1};t_{2})}{R_{RK4}^% {(3)}(\omega_{3},\omega_{1};t_{2})}\right|,roman_Δ ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) = ∫ start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_d italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT ∫ start_POSTSUBSCRIPT roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT end_POSTSUBSCRIPT roman_d italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT | 1 - divide start_ARG italic_R start_POSTSUBSCRIPT italic_N italic_Q italic_P end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT ( italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG start_ARG italic_R start_POSTSUBSCRIPT italic_R italic_K 4 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT ( italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT , italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ; italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) end_ARG | , (22)

where Ω1subscriptΩ1\Omega_{1}roman_Ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and Ω3subscriptΩ3\Omega_{3}roman_Ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT represent all the frequency domain of ω1subscript𝜔1\omega_{1}italic_ω start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT and ω3subscript𝜔3\omega_{3}italic_ω start_POSTSUBSCRIPT 3 end_POSTSUBSCRIPT, and RNQP(3)superscriptsubscript𝑅𝑁𝑄𝑃3R_{NQP}^{(3)}italic_R start_POSTSUBSCRIPT italic_N italic_Q italic_P end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT and RRK4(3)superscriptsubscript𝑅𝑅𝐾43R_{RK4}^{(3)}italic_R start_POSTSUBSCRIPT italic_R italic_K 4 end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ( 3 ) end_POSTSUPERSCRIPT are the spectra evaluated from the NQP model and the RK4 method. In Fig. 6, we show Δ(t2)Δsubscript𝑡2\Delta(t_{2})roman_Δ ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) up to t2=100fssubscript𝑡2100fst_{2}=100\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 100 roman_fs. It is found that Δ(t2)Δsubscript𝑡2\Delta(t_{2})roman_Δ ( italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT ) is well below the level of 1.5%percent1.51.5\%1.5 %, demonstrating the accuracy of our NQP model.

Refer to caption
Figure 6: The average point-wise deviation between the NQP model and the RK4 reference as a function of t2subscript𝑡2t_{2}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT.

V Conclusion

In this work, we develop a NQP model for the HEOM approach to treat the non-markovian dynamics. We use the FNO as the model’s architecture, and design a super-resolution algorithm to reduce the computational cost in the data preparation stage. In the training stage, we employ both data loss function and an extra PILF to improve the numerical performance. The accuracy of the NQP model is tested by computing the population dynamics and linear and two-dimensional spectra. The NQP model yields results in good agreement with the conventional RK4 method, demonstrating its potential applicability in various computational scenarios.

In the current NQP model, the number of learnable parameters scales exponentially with both the size of the reduced system and the number of hierarchy elements. To alleviate the computational cost, future work may extend the NQP model to deal with the decomposed form of ρ^n(t)subscript^𝜌𝑛𝑡\hat{\rho}_{\vec{n}}(t)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( italic_t ). For example, ρ^n(t)subscript^𝜌𝑛𝑡\hat{\rho}_{\vec{n}}(t)over^ start_ARG italic_ρ end_ARG start_POSTSUBSCRIPT over→ start_ARG italic_n end_ARG end_POSTSUBSCRIPT ( italic_t ) can be expressed as a matrix-product-state form by using the twin-space formulation and tensor-train decomposition, which requires far less entries than the original density matrix.[46, 47] In addition, our focus here is on the time-independent Hamiltonian. Future development may extend to the driven dynamics, where the system Hamiltonian contains time-dependent external fields. A typical application is the spectroscopic equations of motion approach which was employed to the simulation of strong-field nonlinear spectra. [48, 49] Work in these directions is in progress.

Acknowledgments

J.Z. and L.P.C. acknowledge support from the starting grant of research center of new materials computing of Zhejiang Lab (No. 3700-32601).

Author declarations

Author Contributions

Jiaji Zhang: Data curation (lead); Formal analysis (lead); Investigation (equal); Supervision (equal).

Lipeng Chen: Conceptualization (lead); Funding acquisition (lead); Investigation (equal); Supervision (equal).

Conflict of Interest

The authors have no conflicts to disclose.

Data availability

The data that support the findings of this study are available from the corresponding author upon reasonable request.

Code availability

The code that support the findings of this study are available from the corresponding author upon reasonable request.

Appendix A The 2D spectra at t2=50fssubscript𝑡250fst_{2}=50\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 50 roman_fs and t2=100fssubscript𝑡2100fst_{2}=100\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 100 roman_fs

In this section, we present the 2D spectra at t2=50fssubscript𝑡250fst_{2}=50\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 50 roman_fs, 100fs100fs100\,{\rm{fs}}100 roman_fs in Figs. 7 and 8. For both cases, the spectra predicted by the NQP model are in good agreement with the RK4 reference.

Refer to caption
Figure 7: The rephasing (a, c) and non-rephasing (b, d) parts of 2D spectra at t2=50fssubscript𝑡250fst_{2}=50\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 50 roman_fs evaluated from (a, b) the NQP model and (c, d) the RK4 reference, respectively.
Refer to caption
Figure 8: The rephasing (a, c) and non-rephasing (b, d) parts of 2D spectra at t2=100fssubscript𝑡2100fst_{2}=100\,{\rm{fs}}italic_t start_POSTSUBSCRIPT 2 end_POSTSUBSCRIPT = 100 roman_fs evaluated from (a, b) the NQP model and (c, d) the RK4 reference, respectively.

References