arXiv:2402.18315v1 [math.DS] 28 Feb 2024

Rare events in a stochastic vegetation-water dynamical system based on machine learning

Yang Li (liyangbx5433@163.com), School of Automation, Nanjing University of Science and Technology, 200 Xiaolingwei Street, Nanjing 210094, Jiangsu Province, China
Shenglan Yuan (shenglanyuan@gbu.edu.cn), School of Sciences, Great Bay University, Songshan Lake International Innovation Entrepreneurship Community A5, Dongguan 523000, Guangdong Province, China; Great Bay Institute for Advanced Study, Songshan Lake International Innovation Entrepreneurship Community A5, Dongguan 523000, Guangdong Province, China
Shengyuan Xu (syxu@njust.edu.cn)
Abstract

Stochastic vegetation-water dynamical systems play a pivotal role in ecological stability, biodiversity, water resource management, and adaptation to climate change. This research proposes a machine learning-based method for analyzing rare events in stochastic vegetation-water dynamical systems with multiplicative Gaussian noise. Utilizing the Freidlin-Wentzell large deviation theory, we derive asymptotic expressions for the quasipotential and the mean first exit time. Based on the decomposition of the vector field, we design a neural network architecture to compute the most probable transition paths and the mean first exit time for both non-characteristic and characteristic boundary scenarios. The results indicate that this method can effectively provide early warnings of vegetation degradation, supplying new theoretical foundations and mathematical tools for ecological management and conservation. Moreover, the method opens new possibilities for exploring more complex and higher-dimensional stochastic dynamical systems.

keywords:
Stochastic vegetation-water system, Rare events, Machine learning, Most probable path, Mean first exit time
journal: Applied Mathematics and Computation

1 Introduction

Stochastic noise is very common in ecosystems; it refers to random fluctuations in ecological systems that cannot be predicted or explained by deterministic factors QY ; YLZ ; TTYBD . These fluctuations can arise from a variety of sources, including environmental conditions, dispersal patterns, and interactions among species. Under random fluctuations, rare events SSCA , i.e., transitions from one stable state to another, occur frequently even for weak noise. Such events, including vegetation degradation and species extinction YW , are therefore well worth investigating.

Freidlin-Wentzell large deviation theory FW is a principle mainly applied to study rare events in dynamical systems with small random perturbations. As the noise intensity decreases, the sample trajectories converge to the reference orbit at a rate that is exponential in the noise intensity. Large deviation techniques are widely used and have become an extremely active branch of applied probability. They can estimate the escape probability of stochastic systems, quantify the probability of deviation from the reference orbit, and characterize the asymptotic probability of errors in hypothesis testing.

In large deviation theory, the action functional is an important concept that links the stationary probability distribution and the path distribution of stochastic dynamical systems FW . It is defined as the integral of a certain Lagrangian, in analogy with classical mechanics. Minimizing the action functional yields the most probable path connecting given initial and final states, along which the stochastic system moves with higher probability than along any other path YD ; LDLZ . Although variables in statistical mechanics do not move along a definite trajectory, the most probable path offers a very intuitive way to comprehend the stochastic behavior of dynamical systems and to predict their evolution. Furthermore, the most probable path can also be used in optimization and control problems to find solutions that maximize or minimize objective functions in stochastic dynamical systems, which provides an appropriate and effective method for exploring the properties of stochastic models in practical applications LKBMM . The calculation of the most probable path typically involves statistical methods and numerical simulation techniques. For instance, the Monte Carlo method is a random sampling technique that can be utilized to simulate the behavior of a system under diverse conditions and parameters. Through extensive simulation and statistics, we can more accurately find the most probable path and its associated probability DMSSS .

In addition, the quasipotential is commonly used in physics and engineering to generalize the concept of a potential function to nongradient dynamical systems FW ; ZAAH ; Ao . It is defined as the global minimum of the action functional over both the possible paths and the time length, and it can be used to describe rare transition events in various physical systems, such as fluid mechanics, electrodynamics, and ecosystems. The quasipotential exponentially dominates the magnitude of the mean first exit time and the stationary probability density.

The mean first exit time is a statistical quantity representing the expected time required for a stochastic system to escape from a confined state or region for the first time. In stochastic processes and diffusion theory, it measures how long a stochastic system takes to transition from one state to another, which is valuable for grasping and optimizing the performance of complex dynamical systems and depends on the properties of the medium, such as its diffusivity, geometry, and boundary conditions. In practical applications, it is a critical ingredient for assessing the engineering reliability against first-passage failure CZ and for describing the activation process in neuron systems FTPVB . Apart from the quasipotential, a more accurate perturbation expression for the mean first exit time also depends on the exponential prefactor function of the WKB approximation NKMS ; MST ; MS ; LZXDL .

The traditional numerical methods for calculating the aforementioned quantities include the geometric minimum action method (GMAM) HVE and the ordered upwind method (OUM) C . The former is grounded in the geometric action: it iterates the most probable paths with fixed endpoints in path space and obtains their corresponding action functionals. It converts the time parameterization of the action functional into an arc-length parameterization, reducing the integration over infinite time to an integration over finite length, and thereby finds the path of minimum action. OUM discretizes phase space, considers the influence of each node in an ordered manner, and uses an upwind difference scheme to compute the quasipotential at each node, which enables efficient computation on the discretized grid. GMAM and OUM are thus two different numerical methods, based on the concepts of geometric action and discretization respectively, for solving different types of problems. However, both have nonnegligible shortcomings, such as convergence to local minima for GMAM and high computational cost for OUM, especially in high-dimensional cases. These limitations motivate the exploration of new methods.
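To make the minimum-action idea concrete, the following toy sketch (our illustration, unrelated to the GMAM or OUM implementations cited above) minimizes a time-discretized Freidlin-Wentzell action for the one-dimensional double-well system $dx=(x-x^{3})dt+\sqrt{\varepsilon}\,dW(t)$ with endpoints fixed at the two wells; all names and parameter values are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

# Toy double-well drift f(x) = x - x^3: stable wells at x = -1, +1, saddle at 0.
T, N = 20.0, 200                  # time horizon and number of intervals
dt = T / N

def action(inner):
    # Discretized S(phi) = 1/2 * int_0^T (phi' - f(phi))^2 dt (midpoint rule)
    phi = np.concatenate(([-1.0], inner, [1.0]))   # endpoints held fixed
    mid = 0.5 * (phi[1:] + phi[:-1])
    v = np.diff(phi) / dt - (mid - mid**3)
    return 0.5 * np.sum(v**2) * dt

guess = np.linspace(-1.0, 1.0, N + 1)[1:-1]        # straight-line initial path
res = minimize(action, guess, method="L-BFGS-B")
print("minimal action:", res.fun)   # approaches 1/2 = 2*[U(0) - U(-1)] as T grows
```

GMAM would instead reparameterize the path by arc length, removing the dependence on the horizon $T$ that this naive sketch simply takes to be large.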

Machine learning is an important direction in the field of artificial intelligence A . It analyzes large amounts of data and improves existing algorithms to make them more intelligent and enhance their generalization capabilities SWS . The powerful features of machine learning are mainly manifested in the following aspects. Firstly, machine learning is data-driven. By learning from a large amount of data, it can discover patterns and rules in the data, enabling more accurate predictions and decisions B . Secondly, machine learning algorithms can automatically adjust parameters and models, reducing human intervention and errors; thus, making the results more objective and precise HLWFKB . Thirdly, machine learning algorithms can quickly adapt to different datasets and tasks, exhibiting strong adaptability, facilitating learning and application across diverse environments AR . Fourthly, as algorithms continuously improve and optimize, the interpretability of machine learning models is also steadily improving, enabling a better understanding of the internal mechanisms of data and models CCC . Fifthly, machine learning has widespread applications in various fields, such as image recognition, speech recognition, natural language processing, recommendation systems, financial risk control, etc., bringing about tremendous changes and adding significant business value across different sectors BB . In summary, machine learning is a powerful technology that facilitates more accurate and effective data processing through automatic learning and optimization algorithms JM . With the continuous development of technology, the application prospects of machine learning will become even broader.

At present, machine learning has been widely applied in the study of stochastic dynamics. For instance, researchers have designed data-driven methods to discover stochastic dynamical systems with Gaussian or non-Gaussian noise LD ; CYDK . Xu et al. developed a novel deep learning method to compute the probability density by solving the Fokker-Planck equation XZLZLK . Some scholars have also devoted themselves to exploring data-driven machine learning methods for computing the most probable paths of stochastic dynamical systems LDL ; WGCD . Based on large deviation theory and machine learning, Li et al. LYX calculated the most probable transition path and designed a control strategy to steer the mean first exit time of stochastic dynamical systems to a desired value. Machine learning can also be used to compute the large deviation prefactors in highly complex nonlinear stochastic systems LYLL . These methods for studying rare events are mainly designed for stochastic systems with additive noise.

The investigation of stochastic vegetation-water ecosystems is of great significance due to their notable impact on ecological stability, biodiversity, water resource management, adaptation to climate change, and soil health. Considering their complex structures with multiplicative noise ZXLQ , this paper aims to develop a machine learning method to handle rare transition events of stochastic dynamical systems with multiplicative Gaussian noise, rather than the case with additive noise LZXDL ; LYLL ; LLR . Furthermore, this method will be utilized to calculate the most probable path, quasipotential, and mean first exit time of the stochastic vegetation-water system ZXLQ . The analysis of the dynamic behavior of this system provides a theoretical basis and mathematical methods for understanding and controlling the phenomenon of vegetation degradation.

The structure of this work is as follows. In Section 2, we present the framework of the stochastic vegetation-water system and explain the meaning of the involved parameters. We reveal the dynamical structures of the corresponding deterministic system and simulate the transition phenomenon between the metastable states of stochastic system. In Section 3, we describe the concepts of large deviation theory, quasipotential and mean first exit time, and derive their expressions asymptotically. Then a machine learning method is proposed to compute these quantities for the vegetation-water system in both non-characteristic and characteristic boundary cases. In Section 4, we use the machine learning algorithm to compute the rare transition events of the stochastic vegetation-water system and provide a mathematical basis for early warning of vegetation degradation. In Section 5, we summarize the results of the paper and discuss some important future prospects.

2 Stochastic vegetation-water system

The vegetation-water dynamical system is a complex ecosystem where the vegetation and water interact to maintain the balance. However, due to various influencing factors such as climate change, terrain, soil type, and the hydrological cycle, the dynamics of the vegetation-water system often exhibit uncertain and unpredictable stochasticity that manifests in several ways. Firstly, meteorological processes, such as rainfall intensity and evaporation, are stochastic, leading to stochasticity in hydrological processes (e.g., river discharge, groundwater levels). These hydrological processes directly influence the state of the vegetation. Secondly, the growth and distribution of vegetation are also affected by many stochastic factors. For example, seed dispersal, plant growth rates, and the occurrence of pests and diseases are stochastic processes that lead to stochasticity in vegetation spatial distribution and density. In addition, the interaction between vegetation and water in the system is also stochastic. For instance, vegetation consumes water through transpiration, and water returns to the system through soil infiltration, surface runoff, etc. To understand and predict the stochasticity of the vegetation-water dynamical system, mathematical tools such as probability theory and stochastic processes are required to establish quantitative models of the system's state and behavior. Furthermore, modern technological means, like machine learning techniques, are employed to obtain extensive observational data of the system, supporting model validation and refinement. The stochasticity of vegetation-water systems is a crucial characteristic that plays a key role in predicting and responding to practical issues such as water resource management and ecological restoration. Studying and comprehending this stochasticity is essential for effective management and conservation efforts.

More specifically, we consider a stochastic vegetation-water system

\[
\begin{aligned}
\dot{x}_1 &= \rho x_1\Big(x_2-\frac{x_1}{K}\Big)-\beta\frac{x_1}{x_1+x_0}+\sigma_1x_1^{2}\xi_1(t),\\
\dot{x}_2 &= R-\alpha x_2-\lambda x_1x_2+\sigma_2\xi_2(t),
\end{aligned}
\tag{2.1}
\]

where the variable $x_1$ represents the biomass of vegetation, while $x_2$ signifies the moisture level of the soil. The interaction terms $\rho x_1x_2$ and $-\lambda x_1x_2$ depict the relationship between vegetation biomass and water. The term $-\rho x_1^2/K$ limits the growth of biomass due to competition for shared resources, such as water or soil nutrients. The term $-\beta x_1/(x_1+x_0)$ illustrates the impact of herbivores and other influencing factors. The parameter $R$ denotes the average rainfall, and the term $-\alpha x_2$ stands for the loss of water from the soil, which could be caused by percolation or evaporation. Taking into account random environmental disturbances affecting vegetation competition for shared resources and rainfall, the term $-\rho x_1^2/K$ transforms into $-\rho x_1^2(1+\tilde{\sigma}_1\xi_1(t))/K$ and $R$ becomes $R(1+\tilde{\sigma}_2\xi_2(t))$. Consequently, these disturbances are captured by the multiplicative noise $\sigma_1x_1^2\xi_1(t)$ and the additive noise $\sigma_2\xi_2(t)$, where $\sigma_1=-\rho\tilde{\sigma}_1/K$ and $\sigma_2=R\tilde{\sigma}_2$.
The driving terms $\xi_1(t)$ and $\xi_2(t)$ are independent Gaussian white noises with

\[
\mathbb{E}[\xi_i(t)]=0,\qquad \mathbb{E}[\xi_i(t)\xi_i(t+\tau)]=\varepsilon\delta(\tau),\qquad i=1,2.
\]

Here, $\varepsilon$ is a small parameter, implying that the noise intensity is weak. The drift coefficient and diffusion matrix can be written as

\[
b(x)=\begin{pmatrix}\rho x_1\big(x_2-\frac{x_1}{K}\big)-\beta\frac{x_1}{x_1+x_0}\\[2pt] R-\alpha x_2-\lambda x_1x_2\end{pmatrix},\qquad
a(x)=\sigma(x)\sigma^{\top}(x)=\begin{pmatrix}\sigma_1^2x_1^4 & 0\\ 0 & \sigma_2^2\end{pmatrix},
\]

where

\[
\sigma(x)=\begin{pmatrix}\sigma_1x_1^2 & 0\\ 0 & \sigma_2\end{pmatrix}.
\]

We first consider the corresponding deterministic system with $\sigma_1=\sigma_2=0$:

\[
\begin{aligned}
\dot{x}_1 &= \rho x_1\Big(x_2-\frac{x_1}{K}\Big)-\beta\frac{x_1}{x_1+x_0},\\
\dot{x}_2 &= R-\alpha x_2-\lambda x_1x_2.
\end{aligned}
\tag{2.2}
\]

In this paper, we fix the system parameters as $\rho=1$, $K=10$, $\beta=3$, $x_0=1$, $\alpha=1$, $\lambda=0.12$.

Setting $b(x)=\mathbf{0}$, we can derive the fixed points of system (2.2). It is observed that $\text{SN1}=\big(0,\frac{R}{\alpha}\big)$ is a trivial fixed point for arbitrary $R$. When $x_1\neq 0$, $b(x)=\mathbf{0}$ implies that

\[
\begin{aligned}
&f(x_1)=\rho\lambda x_1^{3}+\rho(\lambda x_0+\alpha)x_1^{2}+(\rho\alpha x_0+K\beta\lambda-\rho RK)x_1+K\beta\alpha-\rho RKx_0=0,\\
&x_2=\frac{R}{\lambda x_1+\alpha}.
\end{aligned}
\]

As shown in Fig. 1, a saddle-node bifurcation occurs at $R=R_c$, where a node SN2 and a saddle US emerge. Setting $f(x_1)=0$ and $f'(x_1)=0$ gives $R_c=1.4278$. SN1 is stable if $R<2.998$; otherwise it is unstable owing to its collision with US. In summary, the deterministic vegetation-water system (2.2) has three regimes depending on the value of $R$: bare for $R<R_c$, bistable for $R_c<R<2.998$, and vegetated for $R>2.998$.

Figure 1: Bifurcation diagram of the vegetation system with respect to the parameter $R$. It exhibits two branches: the blue curve represents the stable equilibria, while the red dashed branch represents the unstable equilibria. Upon varying the control parameter $R$, the two branches approach each other until they meet at the critical value $R_c=1.4278$, where they annihilate each other in a saddle-node bifurcation.

It can be seen from the saddle-node bifurcation diagram in Fig. 1 that the biomass of vegetation depends greatly on the average rainfall $R$. This finding aligns with actual ecological phenomena. When $R<R_c$, the rainfall is insufficient for vegetation to survive, corresponding to the bare state; when $R>2.998$, the rainfall is ample, so the vegetation can grow fully without disappearing, corresponding to the vegetated state. Meanwhile, for rainfall within the range $R_c<R<2.998$, the bistable phenomenon emerges, as the rainfall is neither too little nor too much. Investigating the vegetation state within this range of rainfall has practical significance for applications.

In this paper, we choose $R=1.55$, i.e., the bistable regime, for investigation. As demonstrated in Fig. 2, system (2.2) has two stable fixed points $\text{SN1}=(0,1.55)$ and $\text{SN2}=(4.6366,0.9959)$. The basins of attraction of these fixed points are separated by the stable manifold of the saddle $\text{US}=(1.6667,1.2917)$, denoted by a purple curve. The unstable manifold of US is indicated by a green curve.
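The fixed-point coordinates and the critical rainfall quoted above can be verified numerically. The following sketch (our own check, not part of the paper) finds the positive roots of $f(x_1)=0$ at $R=1.55$ and solves $f(x_1)=f'(x_1)=0$ for $R_c$:

```python
import numpy as np
from scipy.optimize import fsolve

rho, K, beta, x0, alpha, lam = 1.0, 10.0, 3.0, 1.0, 1.0, 0.12

def cubic_coeffs(R):
    # f(x1) = rho*lam*x1^3 + rho*(lam*x0 + alpha)*x1^2
    #         + (rho*alpha*x0 + K*beta*lam - rho*R*K)*x1 + K*beta*alpha - rho*R*K*x0
    return [rho*lam,
            rho*(lam*x0 + alpha),
            rho*alpha*x0 + K*beta*lam - rho*R*K,
            K*beta*alpha - rho*R*K*x0]

def equilibria(R):
    roots = np.roots(cubic_coeffs(R))
    x1 = np.sort(roots[abs(roots.imag) < 1e-9].real)
    x1 = x1[x1 > 0]                        # only positive biomass is physical
    return [(r, R/(lam*r + alpha)) for r in x1]

def saddle_node(z):                        # f and f' vanish simultaneously
    x1, R = z
    c = cubic_coeffs(R)
    f = ((c[0]*x1 + c[1])*x1 + c[2])*x1 + c[3]
    df = (3*c[0]*x1 + 2*c[1])*x1 + c[2]
    return [f, df]

x1c, Rc = fsolve(saddle_node, [2.0, 1.5])
print(f"R_c = {Rc:.4f}")                   # ~1.4278, as stated in the text
print(equilibria(1.55))                    # US ~ (1.6667, 1.2917), SN2 ~ (4.6366, 0.9959)
```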

Figure 2: For the selected value $R=1.55$, system (2.2) possesses two stable fixed points $\text{SN1}=(0,1.55)$ and $\text{SN2}=(4.6366,0.9959)$, whose basins of attraction are separated by the stable manifold of the saddle point $\text{US}=(1.6667,1.2917)$, delineated by a purple curve. The unstable manifold of US is denoted by a green curve.

Based on the ecological significance of the stochastic vegetation-water model (2.1), we investigate the transition phenomena of the system initially located at SN2, i.e., with lush vegetation, approaching SN1 under random perturbations. Since the noise intensity $\varepsilon$ is small, these transitions are rare events. Indeed, once the system crosses the boundary, i.e., the stable manifold of US, it flows along the unstable manifold of US to the point SN1. Therefore, we mainly focus on the process of the system escaping from the attraction domain of SN2 driven by noise perturbations, which provides important information for early warning of, and intervention against, vegetation loss. A typical noise-induced transition trajectory simulated by Monte Carlo is illustrated in Fig. 3.
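A minimal Euler-Maruyama sketch of such a Monte Carlo experiment is shown below. The noise amplitudes $\sigma_1$, $\sigma_2$ and intensity $\varepsilon$ are illustrative (the paper does not fix them at this point), and the Wiener increments carry variance $\varepsilon\,dt$ in line with the correlation $\mathbb{E}[\xi_i(t)\xi_i(t+\tau)]=\varepsilon\delta(\tau)$:

```python
import numpy as np

rho, K, beta, x0, alpha, lam, R = 1.0, 10.0, 3.0, 1.0, 1.0, 0.12, 1.55
sigma1, sigma2, eps = 0.05, 0.2, 0.1       # illustrative noise levels
SN1, SN2 = np.array([0.0, 1.55]), np.array([4.6366, 0.9959])

def drift(x):
    x1, x2 = x
    return np.array([rho*x1*(x2 - x1/K) - beta*x1/(x1 + x0),
                     R - alpha*x2 - lam*x1*x2])

def simulate(dt=1e-3, t_max=1e4, seed=0):
    rng = np.random.default_rng(seed)
    x, traj = SN2.copy(), [SN2.copy()]
    for _ in range(int(t_max/dt)):
        dW = rng.normal(scale=np.sqrt(eps*dt), size=2)      # Var = eps*dt
        x = x + drift(x)*dt + np.array([sigma1*x[0]**2, sigma2])*dW
        x[0] = max(x[0], 0.0)              # biomass stays nonnegative
        traj.append(x.copy())
        if np.linalg.norm(x - SN1) < 0.1:  # transition to the bare state completed
            break
    return np.array(traj)

path = simulate()
print("steps until transition:", len(path))
```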

Figure 3: A representative transition trajectory of the stochastic vegetation-water model (2.1), obtained by Monte Carlo simulation.

3 Theory and method

Under the condition of weak Gaussian noise, rare exit events can be analyzed by utilizing Freidlin-Wentzell large deviation theory. In this section, we first review the concepts of the most probable exit path, quasipotential, and mean first exit time, and derive their asymptotic expressions. Then we design a machine learning method to compute these quantities numerically.

3.1 Large deviation theory

Now we reformulate the stochastic vegetation-water system (2.1) into the following form

\[
dx(t)=b(x)\,dt+\sigma(x)\,dB^{\varepsilon}(t),
\]

where $B^{\varepsilon}(t)=(B_1^{\varepsilon}(t),B_2^{\varepsilon}(t))^{\top}$ is a two-dimensional Brownian motion. Given the scenario of weak noise intensity, the stationary distribution $p_s(x)$ of the stochastic system can be assumed to have the following WKB form

\[
p_s(x)\sim C(x)\exp\{-\varepsilon^{-1}V(x)\},
\]

where $C(x)$ is referred to as the exponential prefactor, and $V(x)$ is called the quasipotential, which characterizes the likelihood of the stochastic state fluctuating into the vicinity of the specific point $x$. In addition to the stationary distribution, the quasipotential also exponentially dominates the magnitude of the mean first exit time and the exit location distribution. According to Freidlin-Wentzell large deviation theory, the quasipotential is defined as the minimum of the action functional over absolutely continuous paths connecting the fixed point $\bar{x}$ and the specific point $x$, in the sense that

\[
V(x):=\inf_{T>0}\,\inf_{\varphi\in C[0,T]}\{S(\varphi):\varphi(0)=\bar{x},\ \varphi(T)=x\},
\]

where the action functional S(φ)𝑆𝜑S(\varphi)italic_S ( italic_φ ) has the following form

\[
S(\varphi)=\frac{1}{2}\int_{0}^{T}\big(\dot{\varphi}-b(\varphi)\big)^{\top}a^{-1}(\varphi)\big(\dot{\varphi}-b(\varphi)\big)\,dt.
\]
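For intuition, $S(\varphi)$ can be evaluated directly for any discretized candidate path. The sketch below (our construction) applies a midpoint-rule discretization with the diagonal diffusion matrix $a(x)$ of system (2.1), illustrative $\sigma_1,\sigma_2$, and a straight test path from SN2 to the saddle US:

```python
import numpy as np

rho, K, beta, x0, alpha, lam, R = 1.0, 10.0, 3.0, 1.0, 1.0, 0.12, 1.55
sigma1, sigma2 = 0.05, 0.2

def drift(x):
    x1, x2 = x
    return np.array([rho*x1*(x2 - x1/K) - beta*x1/(x1 + x0),
                     R - alpha*x2 - lam*x1*x2])

def action(phi, T):
    """Midpoint discretization of S = 1/2 int (phi' - b)^T a^{-1} (phi' - b) dt."""
    dt = T / (len(phi) - 1)
    mid = 0.5*(phi[1:] + phi[:-1])
    v = (phi[1:] - phi[:-1])/dt - np.array([drift(x) for x in mid])
    a_inv_diag = np.array([[1.0/(sigma1**2*x[0]**4), 1.0/sigma2**2] for x in mid])
    return 0.5*np.sum(a_inv_diag*v**2)*dt          # a is diagonal for (2.1)

SN2, US = np.array([4.6366, 0.9959]), np.array([1.6667, 1.2917])
phi = SN2 + np.linspace(0.0, 1.0, 201)[:, None]*(US - SN2)   # straight test path
print("action of straight path:", action(phi, T=20.0))
```

Minimizing such a discretized action over the interior path points (with endpoints fixed) recovers the most probable path in the limit of fine grids and large $T$.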

By substituting the WKB approximation into the stationary Fokker-Planck equation and collecting the lowest-order terms in $\varepsilon$, we obtain the Hamilton-Jacobi equation

\[
\langle\nabla V(x),b(x)\rangle+\frac{1}{2}\langle\nabla V(x),a(x)\nabla V(x)\rangle=0.
\]

Note that the above equation has a geometric interpretation. Combining the two terms on the left gives

\[
\Big\langle\nabla V(x),\,b(x)+\frac{1}{2}a(x)\nabla V(x)\Big\rangle=0.
\]

Therefore, we obtain the following orthogonality relation

\[
\nabla V(x)\perp b(x)+\frac{1}{2}a(x)\nabla V(x).
\]

Define $l(x):=b(x)+\frac{1}{2}a(x)\nabla V(x)$. The vector field $b(x)$ then has the decomposition

\[
b(x)=-\frac{1}{2}a(x)\nabla V(x)+l(x).
\]

The next-to-leading-order term of the WKB approximation to the stationary Fokker-Planck equation leads to the following transport equation for the exponential prefactor $C(x)$:

\[
\big\langle\nabla C,\,b+a\nabla V\big\rangle+C\Big(\operatorname{div}b+\frac{1}{2}a:H(V)+\langle A,\nabla V\rangle\Big)=0,
\]

where $H$ denotes the Hessian matrix, $a:H=\sum_{i,j}a_{ij}H_{ij}$, and the vector $A(x)$ is defined componentwise by $A_i(x)=\sum_{j=1}^{2}\frac{\partial a_{ij}}{\partial x_j}$, i.e.,

\[
A(x)=\begin{pmatrix}4\sigma_1^2x_1^3\\ 0\end{pmatrix}.
\]

Substituting the decomposition $b(x)=-\frac{1}{2}a(x)\nabla V(x)+l(x)$ into the transport equation yields

\[
\Big\langle\nabla C,\,\frac{1}{2}a\nabla V+l\Big\rangle+C\Big(\operatorname{div}l+\frac{1}{2}\langle A,\nabla V\rangle\Big)=0.
\]

The above equation can be rewritten as

\[
\Big\langle\nabla\ln C,\,\frac{1}{2}a\nabla V+l\Big\rangle=-F,
\tag{3.3}
\]

where

\[
F(x)=\operatorname{div}l(x)+\frac{1}{2}\langle A(x),\nabla V(x)\rangle.
\]

According to Freidlin-Wentzell large deviation theory, the most probable path of fluctuation dynamics satisfies the following equation

\[
\dot{\varphi}^x(t)=b(\varphi^x(t))+a(\varphi^x(t))\nabla V(\varphi^x(t))=\frac{1}{2}a(\varphi^x(t))\nabla V(\varphi^x(t))+l(\varphi^x(t)).
\]

This can also be confirmed by the results of the method of characteristics applied to the Hamilton-Jacobi equation. Therefore, along the most probable path, the left-hand side of equation (3.3) is transformed into the complete differential of lnC(x)𝐶𝑥\ln C(x)roman_ln italic_C ( italic_x ), i.e.,

\[
\frac{d}{dt}\ln C(\varphi^x(t))=-F(\varphi^x(t)).
\]

Hence the prefactor function can be integrated as

\[
C(x)\sim\exp\Big\{-\int_{-\infty}^{0}F(\varphi^x(t))\,dt\Big\}.
\tag{3.4}
\]

If $x$ is a saddle point, then the upper limit of the above integral becomes infinite.
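Numerically, the path equation and the prefactor integral (3.4) can be handled together by augmenting the state with $\ln C$. In the sketch below, grad_V and F are assumed callables (obtained, e.g., from the network of Subsection 3.2 via automatic differentiation, since no closed forms exist for this nongradient system); the rest follows system (2.1):

```python
import numpy as np
from scipy.integrate import solve_ivp

rho, K, beta, x0, alpha, lam, R = 1.0, 10.0, 3.0, 1.0, 1.0, 0.12, 1.55
sigma1, sigma2 = 0.05, 0.2
SN2 = np.array([4.6366, 0.9959])

def drift(x):
    x1, x2 = x
    return np.array([rho*x1*(x2 - x1/K) - beta*x1/(x1 + x0),
                     R - alpha*x2 - lam*x1*x2])

def a_mat(x):
    return np.diag([sigma1**2*x[0]**4, sigma2**2])

def prefactor(grad_V, F, T=50.0):
    """Integrate phi' = b + a grad V together with d(ln C)/dt = -F from near SN2."""
    def rhs(t, y):
        x = y[:2]
        return np.append(drift(x) + a_mat(x) @ grad_V(x), -F(x))
    y0 = np.append(SN2 + 1e-3, 0.0)        # start just off SN2; ln C(-inf) = 0
    sol = solve_ivp(rhs, (0.0, T), y0, rtol=1e-8)
    return sol.y[:2, -1], np.exp(sol.y[2, -1])   # path endpoint x and C(x)

# Smoke test with a fictitious quadratic quasipotential (illustration only):
# x_end, C = prefactor(lambda x: 2.0*(x - SN2), lambda x: 0.0, T=5.0)
```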

Assume that $D$ is the attraction domain of SN2. Define the first exit time of the stochastic state from $D$ as

\[
\tau^{\varepsilon}_{D}=\inf\{t\geq 0:\ x(t)\notin D,\ x(0)=\text{SN2}\}.
\]

It is a random time, and its expectation is called the mean first exit time, which provides important quantitative information regarding the disappearance of vegetation. According to large deviation theory, the mean first exit time is exponentially dominated by the minimal value of the quasipotential along the boundary $\partial D$:

\[
\lim_{\varepsilon\rightarrow 0}\varepsilon\ln\mathbb{E}\tau^{\varepsilon}_{D}=\inf_{x\in\partial D}V(x).
\]

Usually, this minimum is attained at the saddle US. Then we have

\[
\mathbb{E}\tau^{\varepsilon}_{D}=L^{\varepsilon}_{D}\exp\{\varepsilon^{-1}V(\text{US})\}.
\]
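For moderate $\varepsilon$, $\mathbb{E}\tau^{\varepsilon}_{D}$ can also be estimated by brute-force simulation, which offers a consistency check on this asymptotic relation. The sketch below is our illustration: exit is detected with the crude proxy $x_1<x_1(\text{US})$ rather than the true stable manifold, and the noise values are arbitrary:

```python
import numpy as np

rho, K, beta, x0, alpha, lam, R = 1.0, 10.0, 3.0, 1.0, 1.0, 0.12, 1.55
sigma1, sigma2, eps = 0.05, 0.2, 0.2
SN2, x1_US = np.array([4.6366, 0.9959]), 1.6667

def mc_mean_exit_time(M=100, dt=1e-2, t_max=2e3, seed=1):
    rng = np.random.default_rng(seed)
    x = np.tile(SN2, (M, 1))               # M trajectories evolved in parallel
    alive = np.ones(M, dtype=bool)
    exit_t = np.full(M, np.nan)
    t = 0.0
    while t < t_max and alive.any():
        dW = rng.normal(scale=np.sqrt(eps*dt), size=(M, 2))
        d = np.column_stack([rho*x[:, 0]*(x[:, 1] - x[:, 0]/K)
                             - beta*x[:, 0]/(x[:, 0] + x0),
                             R - alpha*x[:, 1] - lam*x[:, 0]*x[:, 1]])
        g = np.column_stack([sigma1*x[:, 0]**2, np.full(M, sigma2)])
        x[alive] += (d*dt + g*dW)[alive]
        t += dt
        crossed = alive & (x[:, 0] < x1_US)    # crude separatrix proxy through US
        exit_t[crossed] = t
        alive &= ~crossed
    return np.nanmean(exit_t)

print("Monte Carlo mean first exit time:", mc_mean_exit_time())
```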

In general, we can calculate the prefactor $L_D^{\varepsilon}$ in two distinct scenarios.

Case A. Non-characteristic boundary (see BR22 ; BR for reference).

We assume that the domain $D$ is an open, smooth, and connected subset of $\mathbb{R}^{n}$ satisfying the following conditions, where $n(y)$ denotes the exterior normal vector at $y\in\partial D$.

(A1)

The deterministic system $\dot{x}=b(x)$ has a unique fixed point $\bar{x}$ within $D$ that attracts all trajectories originating from $D$. Additionally, the inner product between $b(y)$ and $n(y)$ is negative on the boundary, i.e., $\langle b(y),n(y)\rangle<0$ for all $y\in\partial D$.

(A2)

The function $V$ is continuously differentiable ($C^{1}$) in $D$; for any $x\in\bar{D}$, the most probable path $\varphi_t^x$ approaches $\bar{x}$ as $t\rightarrow-\infty$; and $\big\langle\frac{1}{2}a(y)\nabla V(y)+l(y),n(y)\big\rangle>0$ for all $y\in\partial D$.

(A3)

The minimum of $V$ over $\partial D$ is attained at a single point $x^{*}$. At this point,

\[
\mu^{*}=\Big\langle\frac{1}{2}a(x^{*})\nabla V(x^{*})+l(x^{*}),\,n(x^{*})\Big\rangle>0,
\]

and the quadratic form $h^{*}:\xi\mapsto\langle\xi,\nabla^{2}V(x^{*})\xi\rangle$ has positive eigenvalues on the hyperplane $n(x^{*})^{\perp}=\{\xi\in\mathbb{R}^{n}:\langle\xi,n(x^{*})\rangle=0\}$.

If $\langle b(y),n(y)\rangle<0$ for all $y\in\partial D$, the boundary is designated as non-characteristic. This condition guarantees that dynamical trajectories originating from the closure $\bar{D}$ remain confined within $D$, and that the vector field points strictly inward across the boundary, i.e., is transversal to it. Leveraging Assumption (A1), we derive the integral formula

\[
\lambda_{D}^{\varepsilon}=\int_{\partial D}\Big\langle\frac{1}{2}a(x)\nabla V(x)+l(x),\,n(x)\Big\rangle C(x)\exp\{-\varepsilon^{-1}V(x)\}\,dx
\]

for the exit rate $\lambda_{D}^{\varepsilon}=[\mathbb{E}\tau_{D}^{\varepsilon}]^{-1}$. By employing the second-order expansion of the quasipotential $V$ in the vicinity of the point $x^{*}$, we obtain an equivalent relation for the prefactor

\[
\begin{aligned}
L_{D}^{\varepsilon}&\sim\frac{1}{C(x^{*})\,\mu^{*}}\sqrt{\frac{\det h^{*}}{(2\pi\varepsilon)^{n-1}}}\\
&\sim\frac{1}{\mu^{*}}\sqrt{\frac{\det h^{*}}{(2\pi\varepsilon)^{n-1}}}\,\exp\Big\{\int_{-\infty}^{0}F(\varphi^{x^{*}}(t))\,dt\Big\},
\end{aligned}
\tag{3.5}
\]

where we utilize the approximation in (3.4).
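Once its ingredients ($\mu^{*}$, $\det h^{*}$, and the time integral of $F$ along the path into $x^{*}$) have been computed, assembling (3.5) is immediate; a hedged helper under these assumptions:

```python
import numpy as np

def prefactor_case_A(mu_star, det_h_star, int_F, eps, n=2):
    """(3.5): L ~ exp{int_{-inf}^0 F dt} / mu* * sqrt(det h* / (2 pi eps)^(n-1))."""
    return np.exp(int_F)/mu_star*np.sqrt(det_h_star/(2.0*np.pi*eps)**(n - 1))

# The mean first exit time then follows as E[tau] ~ prefactor * exp(V(x*)/eps).
```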

Case B. Characteristic boundary BR22 ; BR .

The basin domain $D$ is characteristic in the sense that the inner product between the vector field $b(y)$ and the exterior normal vector $n(y)$ vanishes on the boundary, i.e., $\langle b(y),n(y)\rangle=0$ for all $y\in\partial D$. We consider the metastable scenario in which the deterministic system $\dot{x}=b(x)$ possesses two stable fixed points $\bar{x}_1$ and $\bar{x}_2$, whose respective basins of attraction are divided by a smooth hypersurface $S$. We focus on exit events from the basin of attraction $D$ associated with $\bar{x}_1$ and introduce the following set of assumptions.

(B1)

All trajectories of the deterministic system $\dot{x}=b(x)$ initiated on the hypersurface $S$ remain confined to $S$ and ultimately converge to a single fixed point $x^{*}\in S$; furthermore, the Jacobian matrix $\nabla b(x^{*})$ possesses $n-1$ eigenvalues with negative real part and a single positive eigenvalue denoted by $\lambda^{*}$.

(B2)

With respect to the quasipotential $V$ associated with $\bar{x}_1$, there exists a unique (up to time shift) trajectory $\rho=(\rho_t)_{t\in\mathbb{R}}\subset D$ such that

\[
\lim_{t\rightarrow-\infty}\rho_{t}=\bar{x}_{1},\qquad\lim_{t\rightarrow+\infty}\rho_{t}=x^{*},\qquad\text{and}\qquad V(x^{*})=\mathcal{S}_{-\infty,+\infty}[\rho].
\]
(B3)

The quasipotential $V$ is smooth in a neighborhood of $\rho=(\rho_t)_{t\in\mathbb{R}}$. Additionally, the vector field $l$ defined by $l(x)=b(x)+\frac{1}{2}a(x)\nabla V(x)$ satisfies the orthogonality relation $\langle\nabla V(x),l(x)\rangle=0$.

In this context, the quasipotential $V$ attains its minimum on the hypersurface $S$ precisely at the point $x^{*}$. Moreover, the trajectory $\rho$ is designated as the most probable exit path, satisfying the differential equation

\[
\dot{\rho}_{t}=\frac{1}{2}a(\rho_{t})\nabla V(\rho_{t})+l(\rho_{t}),\qquad\forall t\in\mathbb{R}.
\]

For any given $t\in\mathbb{R}$, this path coincides with the trajectory $(\varphi_s^x)_{s\leq 0}$ connecting $\bar{x}_1$ to $x=\rho_t$, according to the relation

\[
\varphi_{s}^{x}=\rho_{s+t},\qquad\forall s\leq 0.
\]

To describe the prefactor $L_{D}^{\varepsilon}$ in this scenario, we formulate the following supplementary assumption.

(B4)

The matrix $H^{*}=\lim_{t\rightarrow+\infty}\nabla^{2}V(\rho_{t})$ exists and possesses $n-1$ positive eigenvalues and a single negative eigenvalue.

Relying on these four assumptions, an asymptotic formula for estimating the expected time taken by the process to exit the domain $D$ is given by

\[
L_{D}^{\varepsilon}\sim\frac{\pi}{\lambda^{*}}\sqrt{\frac{|\det H^{*}|}{\det\nabla^{2}V(\bar{x})}}\exp\Big\{\int_{-\infty}^{\infty}F(\varphi(t))\,dt\Big\}. \tag{3.6}
\]

This expression provides an approximation for the expected exit time from the domain $D$ under the given conditions.
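For concreteness, formula (3.6) can be evaluated numerically once $\lambda^{*}$, the two Hessians, and samples of $F(\varphi(t))$ along a discretized most probable path are available. The following is a minimal NumPy sketch; the function and variable names are illustrative, and the time integral is truncated to a finite window on which the path has essentially converged to its endpoints.

    import numpy as np

    def exit_time_prefactor(lam_star, H_star, H_bar, F_on_path, dt):
        """Evaluate formula (3.6), given lambda*, H*, the Hessian of V at x_bar,
        and samples of F(phi(t)) on a uniform grid truncated to [-T, T]."""
        det_ratio = np.abs(np.linalg.det(H_star)) / np.linalg.det(H_bar)
        integral = np.trapz(F_on_path, dx=dt)  # trapezoidal quadrature of F
        return np.pi / lam_star * np.sqrt(det_ratio) * np.exp(integral)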

3.2 Machine learning algorithm

It is seen in subsection 3.1 that the computations of the most probable path and the mean first exit time depend on the quasipotential and the rotational component, i.e., on the decomposition of the vector field. In this subsection, we propose a machine learning method to compute these quantities for the stochastic vegetation-water system (2.1) based on this decomposition.

We design a neural network architecture to achieve this goal. The input of the network is the coordinate $x=(x_{1},x_{2})^{\top}$, and the output is $(\hat{V}_{\theta},l_{\theta})\in\mathbb{R}^{n+1}$. The quasipotential function is defined as $V_{\theta}(x)=\hat{V}_{\theta}(x)+|x-\bar{x}|^{2}$ to guarantee the unboundedness of $V(x)$ and $|\nabla V(x)|$ as $|x|\rightarrow\infty$. Here, $\bar{x}$ is the stable fixed point SN2, and $\theta$ denotes the training parameters of the neural network.
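A minimal PyTorch sketch of such an architecture is given below; the layer sizes anticipate the hyperparameters reported in Section 4, and the default coordinates of $\bar{x}$ are placeholders, not the actual location of SN2.

    import torch
    import torch.nn as nn

    class QuasipotentialNet(nn.Module):
        """Maps x in R^n to (V_hat_theta(x), l_theta(x)) in R^{n+1}."""
        def __init__(self, n=2, width=20, depth=6, x_bar=(5.0, 1.0)):
            super().__init__()
            layers, dim = [], n
            for _ in range(depth):
                layers += [nn.Linear(dim, width), nn.Tanh()]
                dim = width
            layers.append(nn.Linear(dim, n + 1))  # identity output layer
            self.net = nn.Sequential(*layers)
            # Placeholder coordinates for the stable fixed point SN2.
            self.x_bar = torch.tensor(x_bar)

        def forward(self, x):
            out = self.net(x)
            v_hat, l = out[..., 0], out[..., 1:]
            # V_theta(x) = V_hat_theta(x) + |x - x_bar|^2 enforces growth of
            # V_theta and its gradient away from x_bar.
            v = v_hat + ((x - self.x_bar) ** 2).sum(dim=-1)
            return v, l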

In order to train the neural network, we choose $N$ points randomly in the attraction domain of SN2 and construct a loss function as follows:

\[
L=L_{\text{dyn}}+\lambda_{1}L_{\text{orth}}+\lambda_{2}L_{0}.
\]

Since the vector field admits the decomposition $b(x)=-\frac{1}{2}a(x)\nabla V(x)+l(x)$, the first part $L_{\text{dyn}}$ of the loss function can be set as

\[
L_{\text{dyn}}=\frac{1}{N}\sum_{i=1}^{N}\Big[b(x_{i})+\frac{1}{2}a(x_{i})\nabla V_{\theta}(x_{i})-l_{\theta}(x_{i})\Big]^{2},
\]

where the gradient of the quasipotential is realized by automatic differentiation. Due to the orthogonality relation $\nabla V(x)\perp l(x)$, the second part $L_{\text{orth}}$ of the loss function is assigned as

\[
L_{\text{orth}}=\frac{1}{N}\sum_{i=1}^{N}\frac{[\nabla V_{\theta}(x_{i})\cdot l_{\theta}(x_{i})]^{2}}{|\nabla V_{\theta}(x_{i})|^{2}\,|l_{\theta}(x_{i})|^{2}+\delta}.
\]

Here, the small parameter $\delta\ll 1$ avoids a zero denominator and ensures numerical stability. Besides, the third part of the loss function is set as $L_{0}=V_{\theta}(\bar{x})^{2}$ to enforce that the quasipotential at the stable fixed point $\bar{x}=\text{SN2}$ is zero. In this paper, we choose the weight parameters $\lambda_{1}=1$ and $\lambda_{2}=0.1$ to balance the three parts of the loss function.
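Continuing the sketch above, the three loss terms might be assembled as follows, with b_fn and a_fn denoting user-supplied callables for the drift $b(x)$ and diffusion matrix $a(x)$ of system (2.1) evaluated on a batch of points.

    def loss_fn(model, x, b_fn, a_fn, lam1=1.0, lam2=0.1, delta=1e-3):
        """Total loss L = L_dyn + lam1 * L_orth + lam2 * L_0."""
        x = x.clone().requires_grad_(True)
        v, l = model(x)
        # grad V_theta via automatic differentiation, kept in the graph.
        grad_v = torch.autograd.grad(v.sum(), x, create_graph=True)[0]
        a_grad_v = torch.einsum('nij,nj->ni', a_fn(x), grad_v)
        residual = b_fn(x) + 0.5 * a_grad_v - l
        L_dyn = (residual ** 2).sum(dim=-1).mean()
        # Normalized squared inner product penalizes non-orthogonality.
        inner = (grad_v * l).sum(dim=-1) ** 2
        norms = (grad_v ** 2).sum(dim=-1) * (l ** 2).sum(dim=-1)
        L_orth = (inner / (norms + delta)).mean()
        # Pin the quasipotential at SN2 to zero.
        v_bar, _ = model(model.x_bar.unsqueeze(0))
        L_0 = (v_bar ** 2).mean()
        return L_dyn + lam1 * L_orth + lam2 * L_0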

After training the neural network, we obtain the quasipotential function $V_{\theta}(x)$ and the rotational component $l_{\theta}(x)$. The most probable path can then be obtained by integrating the equation

\[
\dot{x}=b(x)+a(x)\nabla V_{\theta}(x)
\]

in reverse time, starting from the end point. Additionally, the mean first exit time can be computed using the asymptotic expression provided in subsection 3.1.
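A hedged sketch of this reverse-time integration with an explicit Euler step follows; the horizon T and step size dt are illustrative, and x_end is the end point of the path.

    def most_probable_path(model, b_fn, a_fn, x_end, T=50.0, dt=1e-3):
        """Integrate x_dot = b(x) + a(x) grad V_theta(x) backward in time,
        starting from the end point x_end."""
        x = torch.tensor(x_end).unsqueeze(0)
        path = [x.squeeze(0)]
        for _ in range(int(T / dt)):
            xg = x.clone().requires_grad_(True)
            v, _ = model(xg)
            grad_v = torch.autograd.grad(v.sum(), xg)[0]
            rhs = b_fn(x) + torch.einsum('nij,nj->ni', a_fn(x), grad_v)
            x = (x - dt * rhs).detach()  # minus sign realizes reverse time
            path.append(x.squeeze(0))
        return torch.stack(path[::-1])  # ordered from SN2 towards x_end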

4 Results

In this section, we present the results of applying the proposed machine learning algorithm to the stochastic vegetation-water system (2.1). The hyperparameters are set as follows: the neural network has 6 hidden layers with 20 nodes each; the activation function in the hidden layers is $\tanh$, while the output layer uses the identity function; we utilize the Adam optimizer with a learning rate of 0.001; the small parameter in the loss function is $\delta=0.001$; and the neural network is trained for 1,000,000 epochs.
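Under the sketches of subsection 3.2, the corresponding training loop might read as follows; x_train denotes the collocation points described below, and b_fn, a_fn are again assumed to be given.

    model = QuasipotentialNet(n=2, width=20, depth=6)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    for epoch in range(1_000_000):
        optimizer.zero_grad()
        loss = loss_fn(model, x_train, b_fn, a_fn, delta=1e-3)
        loss.backward()
        optimizer.step()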

Recalling from Section 2 the expressions of the noise intensities $\sigma_{1}=-\rho\tilde{\sigma}_{1}/K$ and $\sigma_{2}=R\tilde{\sigma}_{2}$ with $\rho=1$, $K=10$, and $R=1.55$, it is worth noting that the order of magnitude of $\sigma_{1}$ should be significantly smaller than that of $\sigma_{2}$ in a practical system. Consequently, we take $\sigma_{1}=0.1$ and $\sigma_{2}=1$.

Figure 4: The loss function decreases as the number of epochs increases during the training of the neural network.

Now we apply the proposed method to the exit problem of the system (2.1). We randomly and uniformly select 10000 points in the domain $[1,7]\times[0,2]$, of which $N=8018$ collocation points located on the right-hand side of the stable manifold of US are used to train the neural network. As depicted in Fig. 4, the loss function is reduced to the magnitude of $10^{-5}$, indicating good convergence of the algorithm. The learned quasipotential function and rotational components are exhibited in Fig. 5.

Figure 5: The quasipotential function $V_{\theta}(x)$ (a) and the two rotational components $l_{1\theta}(x)$ (b) and $l_{2\theta}(x)$ (c) learned by machine learning.

We first consider the case of a non-characteristic boundary and take $x_{1}=3$ as the boundary, since it has practical ecological significance. On the one hand, this boundary is located on the right side of the natural boundary, i.e., the stable manifold of the saddle point, so it can be used for early warning of vegetation degradation, leaving sufficient time for manual intervention. On the other hand, this boundary is simple to monitor, requiring only the measurement of vegetation biomass.

By employing the gradient descent method, we locate the point $x^{\ast}=(3,1.0632)$ with the minimal quasipotential on the boundary $x_{1}=3$. Starting from $x^{\ast}$, we use reverse-time integration to determine the most probable path connecting $x^{\ast}$ and SN2, as demonstrated by the red curve in Fig. 6. The blue dashed line represents the path obtained by the shooting method BMLSM ; its consistency with the machine learning result is evident.
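Since the boundary $x_{1}=3$ is one-dimensional, this minimization reduces to gradient descent over $x_{2}$ with $x_{1}$ frozen; a sketch under the conventions above, with illustrative initial guess, learning rate, and step count:

    def min_quasipotential_on_boundary(model, x1=3.0, x2_init=1.0,
                                       lr=1e-2, steps=2000):
        """Minimize V_theta(3, x2) over x2 by gradient descent."""
        x2 = torch.tensor([x2_init], requires_grad=True)
        optimizer = torch.optim.Adam([x2], lr=lr)
        for _ in range(steps):
            optimizer.zero_grad()
            x = torch.cat([torch.tensor([x1]), x2]).unsqueeze(0)  # freeze x1
            v, _ = model(x)
            v.sum().backward()
            optimizer.step()
        return x1, x2.item()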

Figure 6: With $x^{\ast}=(3,1.0632)$ identified as the point of minimal quasipotential on the boundary, the most probable path connecting $x^{\ast}$ and SN2 computed by machine learning is consistent with the shooting-method result.

Denote $\bar{H}=\nabla^{2}V_{\theta}(\text{SN2})$. We expand the Hamilton-Jacobi equation near the fixed point $\bar{x}=\text{SN2}$ to obtain the algebraic Riccati equation

\[
\bar{H}^{-1}\bar{Q}^{\top}+\bar{Q}\bar{H}^{-1}=\bar{a},
\]

where

\[
\bar{Q}=-\nabla b(\bar{x}),\qquad \bar{a}=a(\bar{x}).
\]
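Writing $X=\bar{H}^{-1}$ turns this Riccati equation into the standard Lyapunov equation $\bar{Q}X+X\bar{Q}^{\top}=\bar{a}$, which can be solved directly; a sketch assuming the Jacobian $\nabla b(\bar{x})$ and $a(\bar{x})$ are available as NumPy arrays:

    import numpy as np
    from scipy.linalg import solve_lyapunov  # solves A X + X A^T = Q

    def hessian_at_fixed_point(jac_b, a_mat):
        """Solve Q_bar X + X Q_bar^T = a_bar with X = H_bar^{-1},
        where Q_bar = -jac_b, then recover H_bar by inversion."""
        Q_bar = -np.asarray(jac_b)
        X = solve_lyapunov(Q_bar, np.asarray(a_mat))
        return np.linalg.inv(X)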

Therefore, we obtain

\[
\bar{H}=\begin{pmatrix}0.0543 & 0.0608\\ 0.0608 & 2.9133\end{pmatrix}.
\]

Then $\det(\bar{H})=0.1546$. Besides,

\[
\mu^{\ast}=-\Big(\frac{1}{2}\sigma_{1}^{2}(x_{1}^{\ast})^{4}\frac{\partial V_{\theta}}{\partial x_{1}}(x^{\ast})+l_{1\theta}(x^{\ast})\Big)=0.022,\qquad
\det(h^{\ast})=\frac{\partial^{2}V_{\theta}}{\partial x_{2}^{2}}(x^{\ast})=2.2602,\qquad
V_{\theta}(x^{\ast})=0.0691.
\]

Using the asymptotic expression of the mean first exit time, we compute the functional relation between the mean first exit time and the noise intensity $\varepsilon$, as illustrated in Fig. 7. The Monte Carlo simulations confirm the machine learning results; the small discrepancy mainly stems from the asymptotic expression itself.
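The Monte Carlo baseline can be generated by direct Euler-Maruyama simulation of the system; a sketch, assuming drift and noise-matrix callables and the noise-scaling convention of system (2.1), with the non-characteristic boundary $x_{1}=3$:

    import numpy as np

    def mc_mean_exit_time(b_fn, sigma_fn, x0, eps,
                          n_paths=1000, dt=1e-3, t_max=1e5):
        """Euler-Maruyama estimate of the mean first exit time through x1 = 3,
        starting from SN2, which lies on the side x1 > 3."""
        rng = np.random.default_rng(0)
        times = np.empty(n_paths)
        for k in range(n_paths):
            x, t = np.array(x0, dtype=float), 0.0
            while t < t_max and x[0] > 3.0:
                dw = rng.normal(0.0, np.sqrt(dt), size=x.shape)
                x = x + b_fn(x) * dt + eps * sigma_fn(x) @ dw
                t += dt
            times[k] = t
        return times.mean()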

Figure 7: In the non-characteristic boundary case, the machine learning result (blue curve) agrees with the Monte Carlo simulation (red stars), confirming the consistency of both methods in predicting the relationship between the mean first exit time and the noise intensity, up to a small error.

Next, we consider the characteristic boundary, namely the stable manifold of US, which serves as the natural boundary of the system (2.1). When noise perturbations drive the system to this boundary, it tends to move towards SN1 along the unstable manifold of US. Therefore, once the system escapes across this boundary, it is not far from vegetation degradation, and effective measures need to be taken to change the situation.

Figure 8: The most probable path from SN2 to US computed by machine learning closely agrees with the result obtained from the shooting method.

According to large deviation theory, the saddle US represents the point with minimal quasipotential on the boundary, i.e., the exit point. To obtain the most probable path, we integrate the equation $\dot{x}=b(x)+a(x)\nabla V_{\theta}(x)$ in reverse time, starting from a neighborhood of US. As seen in Fig. 8, the most probable path computed by machine learning aligns well with the result obtained via the shooting method. It should be noted that along the most probable path from SN2 to US, the soil moisture $x_{2}$ first decreases and then increases. Although the water content at US is much higher than at the steady state SN2, vegetation inevitably degrades there. Hence, when vegetation is inadequate, enhancing soil moisture alone is insufficient to halt desertification; under such circumstances, planting vegetation artificially is far more beneficial than merely increasing soil moisture.

Denote $H^{\ast}=\nabla^{2}V_{\theta}(\text{US})$. We solve the equation

\[
(H^{\ast})^{-1}(Q^{\ast})^{\top}+Q^{\ast}(H^{\ast})^{-1}=a^{\ast},
\]

where

\[
Q^{\ast}=-\nabla b(\text{US}),\qquad a^{\ast}=a(\text{US}).
\]

We obtain

\[
H^{\ast}=\begin{pmatrix}-0.0446 & 0.4228\\ 0.4228 & 1.3305\end{pmatrix}.
\]

Then $\det(H^{\ast})=-0.238$. Specifically, $\lambda^{\ast}=0.3721$ and $V_{\theta}(\text{US})=0.1643$. Thus the mean first exit time across this boundary can be computed via its asymptotic approximation (3.6), as plotted in Fig. 9. The machine learning results are again validated by the Monte Carlo method. It is observed that, for the same noise intensity, the mean first exit time in the characteristic boundary case is much greater than in the non-characteristic boundary case. Therefore, these two boundaries can be used for two-level early warning.

Figure 9: In the characteristic boundary case, the machine learning result (blue curve) aligns well with the Monte Carlo simulation (red stars), reinforcing the reliability of both methods in predicting the correlation between the mean first exit time and the noise intensity.

Finally, we investigate how different noise combinations affect the escape behavior of the system. We consider three cases:

(i) $\sigma_{1}=0.1$, $\sigma_{2}=1$;

(ii) $\sigma_{1}=0.08$, $\sigma_{2}=1$;

(iii) $\sigma_{1}=0.1$, $\sigma_{2}=0.8$.

We take the characteristic boundary as an illustrative example. Fig. 10 illustrates the most probable exit paths in the above three cases. Taking case (i) as a benchmark, we find that the path moves downward as $\sigma_{1}$ decreases, while it moves upward as $\sigma_{2}$ decreases. This phenomenon is understandable: when $\sigma_{1}$ decreases, the effect of noise in the $x_{1}$ direction becomes smaller, so the path tends to be more vertical, allowing the noise in the $x_{2}$ direction to exert a more significant influence and causing the path to bend downward. Conversely, as $\sigma_{2}$ decreases, the path becomes more horizontal and moves upward.

Figure 10: The most probable exit paths from the metastable state SN2 to the saddle point US for the three cases (i)-(iii).

In addition, Fig. 11 depicts the mean first exit times for the three cases (i)-(iii), together with Monte Carlo simulations. Clearly, reducing the noise in either direction increases the mean first exit time. Reducing $\sigma_{1}$ from 0.1 to 0.08 and reducing $\sigma_{2}$ from 1 to 0.8 represent the same proportional reduction; however, the growth of the mean first exit time after reducing $\sigma_{2}$ is significantly greater than after reducing $\sigma_{1}$. Therefore, the noise in the $x_{2}$ direction plays the dominant role in the escape process of the system.

Figure 11: Comparison of the mean first exit times for the three cases (i)-(iii), alongside Monte Carlo simulations.

5 Conclusion and future perspective

In this paper, we proposed a machine learning method to investigate rare events of stochastic dynamical systems with multiplicative Gaussian noise. We computed the most probable paths, the quasipotential, and the mean first exit time of a stochastic vegetation-water system for both non-characteristic and characteristic boundaries via the machine learning algorithm. We analyzed the dynamics of the system and explored the feasibility of using the exit phenomenon to establish early warnings of vegetation degradation.

The proposed machine learning method can be extended to more complex and higher-dimensional stochastic dynamical systems driven by multiplicative Gaussian noise. Furthermore, the rare events, most probable paths, quasipotentials, transition rates, and mean first exit times of dynamical systems with non-Gaussian Lévy noise are also worthy of further study.

Acknowledgement

The authors acknowledge support from the National Natural Science Foundation of China (Grant Nos. 12302035, 62073166, 62221004), the Natural Science Foundation of Jiangsu Province (Grant No. BK20220917), the Key Laboratory of Jiangsu Province, the Shandong Provincial Natural Science Foundation under Grant ZR2021ZD13, and the Project on the Technological Leading Talent Teams Led by Frontiers Science Center for Complex Equipment System Dynamics (FSCCESD220401).

Data availability

Numerical algorithms source code associated with this article can be found, in the online version, at https://github.com/liyangnuaa/rare-events-in-stochastic-vegetation-system.

References

  • (1) M. Qiao, S. Yuan, Analysis of a stochastic predator-prey model with prey subject to disease and Lévy noise, Stochastics and Dynamics 19(05) (2019) 1950038.
  • (2) S. Yuan, Y. Li, Z. Zeng, Stochastic bifurcations and tipping phenomena of insect outbreak systems driven by α-stable Lévy processes, Mathematical Modelling of Natural Phenomena 17 (2022) 34.
  • (3) A. Tesfay, D. Tesfay, S. Yuan, J. Brannan, J. Duan, Stochastic bifurcation in single-species model induced by α-stable Lévy noise, Journal of Statistical Mechanics: Theory and Experiment 2021(10) (2021) 103403.
  • (4) S.E. Selvan, M.S.P. Subathra, A.H. Christinal, U. Amato, On the benefits of Laplace samples in solving a rare event problem using cross-entropy method, Applied Mathematics and Computation 225 (2013) 843-859.
  • (5) S. Yuan, Z. Wang, Bifurcation and chaotic behavior in stochastic Rosenzweig-MacArthur prey-predator model with non-Gaussian stable Lévy noise, International Journal of Non-Linear Mechanics 150 (2023) 104339.
  • (6) M.I. Freidlin, A.D. Wentzell, Random perturbations of dynamical systems, Springer, Berlin, Germany, 2012.
  • (7) S. Yuan, J. Duan, Action functionals for stochastic differential equations with Lévy noise, Communications on Stochastic Analysis 13(3) (2019) 10.
  • (8) Y. Li, J. Duan, X. Liu, Y. Zhang, Most probable dynamics of stochastic dynamical systems with exponentially light jump fluctuations, Chaos: An Interdisciplinary Journal of Nonlinear Science 30(6) (2020) 063142.
  • (9) D.G. Luchinsky, I.A. Khovanov, S. Berri, R. Mannella, P.V.E. McClintock, Optimal fluctuations and the control of chaos, International Journal of Bifurcation and Chaos 12(3) (2002) 583-604.
  • (10) M.I. Dykman, P.V.E. McClintock, V.N. Smelyanski, N.D. Stein, N.G. Stocks, Optimal paths and the prehistory problem for large fluctuations in noise-driven systems, Physical Review Letters 68(18) (1992) 2718-2721.
  • (11) J.X. Zhou, M.D.S. Aliyu, E. Aurell, S. Huang, Quasi-potential landscape in complex multi-stable systems, Journal of the Royal Society Interface 9(77) (2012) 3539-3553.
  • (12) P. Ao, Potential in stochastic differential equations: novel construction, Journal of Physics A: Mathematical and General 37 (2004) L25-L30.
  • (13) L. Chen, W. Zhu, First passage failure of quasi non-integrable generalized Hamiltonian systems, Archive of Applied Mechanics 80 (2010) 883-893.
  • (14) I. Franovic, K. Todorovic, M. Perc, N. Vasovic, N. Buric, Activation process in excitable systems with multiple noise sources: One and two interacting units, Physical Review E 92 (2015) 062911.
  • (15) T. Naeh, M.M. Kłosek, B.J. Matkowsky, Z. Schuss, A Direct Approach to the Exit Problem, SIAM Journal on Applied Mathematics 50(2) (1990) 595-627.
  • (16) B. Matkowsky, Z. Schuss, C. Tier, Diffusion across characteristic boundaries with critical points, SIAM Journal on Applied Mathematics 43(4) (1983) 673–695.
  • (17) R.S. Maier, D.L. Stein, A scaling theory of bifurcations in the symmetric weak-noise escape problem, Journal of Statistical Physics 83(3) (1996) 291–357.
  • (18) Y. Li, F. Zhao, S. Xu, J. Duan, X. Liu, A deep learning method for computing mean exit time excited by weak Gaussian noise, Nonlinear Dynamics, https://doi.org/10.1007/s11071-024-09280-w.
  • (19) M. Heymann, E. Vanden-Eijnden, The geometric minimum action method: A least action principle on the space of curves, Communications on Pure and Applied Mathematics 61(8) (2008) 1052-1117.
  • (20) M.K. Cameron, Finding the quasipotential for nongradient SDEs, Physica D: Nonlinear Phenomena 241(18) (2012) 1532-1550.
  • (21) E. Alpaydin, Machine learning, Mit Press, London, England, 2021.
  • (22) S.N. Steinmann, Q. Wang, Z.W. Seh, How machine learning can accelerate electrocatalysis discovery and optimization, Materials Horizons 10(2) (2023) 393-406.
  • (23) G. Bonaccorso, Machine learning algorithms. Packt Publishing Ltd, 2017.
  • (24) K. Hippalgaonkar, Q. Li, X. Wang, J.W. Fisher III, J. Kirkpatrick, T. Buonassisi, Knowledge-integrated machine learning for materials: lessons from gameplaying and robotics, Nature Reviews Materials 8(4) (2023) 241-260.
  • (25) M. Amini, A. Rahmani, Agricultural databases evaluation with machine learning procedure, Australian Journal of Engineering and Applied Science 8(2023) (2023) 39-50.
  • (26) G. Carleo, I. Cirac, K. Cranmer, L. Daudet, M. Schuld, N. Tishby, L. Vogt-Maranto, L. Zdeborová, Machine learning and the physical sciences, Reviews of Modern Physics 91(4) (2019) 045002.
  • (27) J.P. Bharadiya, Machine learning and AI in business intelligence: Trends and opportunities, International Journal of Computer 48(1) (2023) 123-134.
  • (28) M.I. Jordan, T.M. Mitchell, Machine learning: Trends, perspectives, and prospects, Science 349(6245) (2015) 255-260.
  • (29) Y. Li, J. Duan, A data-driven approach for discovering stochastic dynamical systems with non-Gaussian Lévy noise, Physica D: Nonlinear Phenomena 417 (2021) 132830.
  • (30) X. Chen, L. Yang, J. Duan, G.E. Karniadakis, Solving inverse stochastic problems from discrete particle observations using the Fokker-Planck equation and physics-informed neural networks, SIAM Journal on Scientific Computing 43(3) (2021) B811-B830.
  • (31) Y. Xu, H. Zhang, Y. Li, K. Zhou, Q. Liu, J. Kurths, Solving Fokker-Planck equation using deep learning, Chaos: An Interdisciplinary Journal of Nonlinear Science 30(1) (2020) 013133.
  • (32) Y. Li, J. Duan, X. Liu, Machine learning framework for computing the most probable paths of stochastic dynamical systems, Physical Review E 103(1) (2021) 012124.
  • (33) W. Wei, T. Gao, X. Chen, J. Duan, An optimal control method to compute the most likely transition path for stochastic dynamical systems with jumps, Chaos: An Interdisciplinary Journal of Nonlinear Science 32 (2022) 051102.
  • (34) Y. Li, S. Yuan, S. Xu, Controlling mean exit time of stochastic dynamical systems based on quasipotential and machine learning, Communications in Nonlinear Science and Numerical Simulation 126 (2023) 107425.
  • (35) Y. Li, S. Yuan, L. Lu, X. Liu, Computing large deviation prefactors of stochastic dynamical systems based on machine learning, Chinese Physics B, in press, https://doi.org/10.1088/1674-1056/ad12a8 (2023).
  • (36) H. Zhang, W. Xu, Y. Lei, Y. Qiao, Early warning and basin stability in a stochastic vegetation-water dynamical system, Communications in Nonlinear Science and Numerical Simulation 77 (2019) 258-270.
  • (37) B. Lin, Q. Li, W. Ren, A data driven method for computing quasipotentials, in: Mathematical and Scientific Machine Learning, PMLR 145 (2022) 652-670.
  • (38) F. Bouchet, J. Reygner, Path integral derivation and numerical computation of large deviation prefactors for non-equilibrium dynamics through matrix Riccati equations, Journal of Statistical Physics 189(2) (2022) 21.
  • (39) F. Bouchet, J. Reygner, Generalisation of the Eyring-Kramers transition rate formula to irreversible diffusion processes, Annales Henri Poincaré 17(12) (2016) 3499-3532.
  • (40) S. Beri, R. Mannella, D.G. Luchinsky, A. Silchenko, P.V. McClintock, Solution of the boundary value problem for optimal escape in continuous stochastic systems and maps, Physical Review E 72(3) (2005) 036131.