Article

Use of Deep-Learning-Accelerated Gradient Approximation for Reservoir Geological Parameter Estimation

1 Key Laboratory of Marine Oil and Gas Reservoirs Production, Research Institute of Petroleum Exploration and Production, SINOPEC, Beijing 100083, China
2 College of Petroleum Engineering, China University of Petroleum, Beijing 102249, China
3 China Everbright Bank Co., Ltd., Technology Research and Development Center, Beijing 100033, China
4 Research Institute of Petroleum Exploration and Production, SINOPEC, Beijing 100083, China
5 SKLOOGE: State Key Laboratory of Offshore Oil and Gas Exploitation, Beijing 100028, China
6 CNOOC Research Institute Ltd., Beijing 100028, China
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Processes 2024, 12(10), 2302; https://doi.org/10.3390/pr12102302
Submission received: 6 September 2024 / Revised: 6 October 2024 / Accepted: 14 October 2024 / Published: 21 October 2024
(This article belongs to the Section Energy Systems)

Abstract
The estimation of space-varying geological parameters is often not computationally affordable for high-dimensional subsurface reservoir modeling systems. The adjoint method is generally regarded as an efficient approach for obtaining analytical gradients and, thus, enabling gradient-based iterative algorithms; however, its prohibitive memory requirements and computational demands preclude generic implementation, especially for high-dimensional problems. The autoregressive neural network (aNN) model, as a nonlinear surrogate approximation, has gradually gained popularity due to its significant reduction of computational cost, but one prominent limitation is that the generic application of aNN to large-scale reservoir models inevitably poses challenges in the training procedure, which remain unresolved. To address this issue, model-order reduction could be a promising strategy, as it enables us to train the neural network very efficiently. A popular projection-based linear reduction method, i.e., proper orthogonal decomposition (POD), is adopted to achieve dimensionality reduction. This paper presents an architecture of a projection-based autoregressive neural network that efficiently derives an easy-to-use adjoint model through the auto-differentiation module inside popular deep learning frameworks. This hybrid neural network proxy, referred to as POD-aNN, is capable of speeding up the derivation of reduced-order adjoint models. The performance of POD-aNN is validated through a synthetic 2D subsurface transport model. The use of POD-aNN significantly reduces the computational cost while maintaining accuracy. In addition, our proposed POD-aNN can easily obtain multiple posterior realizations for uncertainty evaluation. The developed POD-aNN emulator is a data-driven approach for reduced-order modeling of nonlinear dynamic systems and, thus, should be a very efficient modeling tool for many engineering applications that rely on intensive simulation-based optimization.

1. Introduction

In the field of reservoir engineering, geological parameters, e.g., rock permeability and porosity, cannot be directly measured and, thus, are often uncertain. Accurate predictions of well production dynamics, e.g., oil production and underground oil distribution, largely depend on how well these geological parameters are known. It is therefore necessary to estimate these parameters to reduce the uncertainty of production forecasts. Because many high-fidelity model simulations are required, inverse modeling or parameter estimation for computational fluid modeling systems with many spatially variable parameters is an almost intractable task. In general, gradient-based optimization algorithms, aided by adjoint models, are a common choice for addressing parameter estimation problems [1]. Although the conventional adjoint methodology has been extensively demonstrated to be computationally cheap for obtaining analytical gradients, the realization of adjoint models is generally very difficult, especially when the simulator source code is not accessible for intrusive modification.
In general, emulating high-fidelity models with computationally efficient proxy models is gaining popularity as a way to ease the development of adjoint models. Proxy models deliver rapid approximations of high-fidelity simulations and are therefore capable of speeding up adjoint-based optimization algorithms. To the best of our knowledge, popular proxy modeling approaches roughly consist of reduced-order modeling (ROM) and data-driven proxy modeling [2,3]. The main idea behind ROM is to obtain a set of basis functions/vectors that compress the high-dimensional model into a low-dimensional subspace. The set of orthogonal basis functions/vectors is often obtained via projection-based proper orthogonal decomposition (POD) [4,5]. Such projection-based ROM has been successfully applied to various research areas [6,7,8] and shows very promising efficiency gains in model simulation.
Data-driven proxy modeling methods use simulated or experimental data to accurately emulate high-fidelity models. Commonly used data-driven proxy modeling methods include Gaussian process regression [9,10], support vector regression [11], and artificial neural networks [12]. These data-driven proxy modeling methods have extensively proved their applicability; however, they are usually restricted to small-scale models. The deep learning (DL) method has recently gained popularity due to its successful applications in various research domains, such as computer vision [13,14] and image processing [15,16]. Recent applications of DL to emulate subsurface transport dynamics have been widely presented in the literature. For instance, Jin et al. [17] developed a data-driven DL proxy model to emulate temporal evolution in reservoir simulation problems. Tang et al. [18] established a deep recursive neural network architecture to speed up time-series predictions. Overall, recent advances in DL architectures and easy-to-use deep learning packages [19,20] have substantially promoted applications of DL proxy modeling to high-dimensional subsurface transport models [21,22,23,24,25], multiphysics conjugate problems [26], fast prediction of film cooling performance [27], and super-resolution reconstruction of turbulent flows [28].
The use of proxy models that substantially accelerate physics-based model simulation for parameter estimation problems has been widely investigated in various research fields. Cardoso [29] integrated POD and trajectory-piecewise-linearization (TPWL), referred to as POD-TPWL, and applied it to oil reservoir production optimization. Subsequently, Heijn et al. [8] further promoted the use of POD-TPWL for geological parameter calibration problems. Vermeulen and Heemink [4], Altaf et al. [5], and Kaleta et al. [30] initially utilized POD-TPWL to derive model-reduced adjoint-based parameter estimation in the reservoir engineering community. Recently, Xiao et al. [31,32] further combined POD-TPWL with a domain decomposition technique to address large-scale reservoir simulation and history matching problems. POD-TPWL is a reduced-order linear surrogate model. To efficiently address model nonlinearity, the combination of POD and the discrete empirical interpolation method (DEIM) was introduced [33,34,35]. Ref. [36] proposed a POD-DEIM approach for rapidly solving reservoir models by projecting the large-scale partial differential equations into a low-order subspace and then solving them efficiently. DL-based solutions of inversion problems or parameter estimation have also recently been explored in a wide range of application domains, e.g., hydraulic engineering [22], computer vision [37], geophysical inversion [38], and underground transport [39,40,41].
We can easily envision that the use of DL to emulate large-scale simulation models inevitably suffers from many challenges; for example, training a DL surrogate for a large-scale model imposes infeasible computational demands and memory storage. To address this issue, combinations of DL surrogates and POD-based model-order reduction have recently gained popularity in many research fields. That is, the original high-dimensional model is projected into a low-order subspace, and POD-based DL surrogates are trained accordingly [42,43,44,45]. For example, Wu et al. [43] developed data-driven reduced-order modeling by integrating POD and a temporal convolutional neural network. Mohan and Gaitonde [42] accelerated turbulent flow simulation by combining a DL model with POD model-order reduction; the long short-term memory network, a variant of the standard recurrent neural network, was adopted to emulate the time-dependent dynamics of turbulent flow. Cheung et al. [44] combined a data-driven deep neural network with a POD-based reduced-order modeling strategy for rapid prediction of flow dynamics.
Inspired by the successful applications of DL-based surrogate models to accelerate subsurface flow simulations, the focus of this study is on exploring the potential of combining POD-based model-order reduction and a DL surrogate to emulate the original high-fidelity models and their adjoint models simultaneously. The advances that have arisen in the DL community particularly facilitate the development of adjoint models. Once the DL proxy is fully trained, its adjoint model is easily obtained, since automatic-differentiation operators are now readily accessible and well maintained in neural-network packages. The simulations of a DL proxy are much faster than the high-fidelity models, which further supports their use in deriving adjoint models.
In our previous work, we proposed several conceptual DL frameworks for deriving deep-learning-based adjoint models [39,46]. The promising results presented in that preliminary work showed great potential in combining DL and POD to promote the construction of adjoint models inside a reduced subspace and then efficiently support gradient-based parameter estimation [47]. In this paper, building on our previous work, we systematically derive an efficient model-reduced adjoint model using a DL surrogate. Following the aforementioned reduced-order modeling method, POD-based model reduction and a nonlinear autoregressive neural network are used to compress the high-dimensional model and evolve the time-series latent dynamic features, respectively. Specifically, a physically interpretable projection method, i.e., POD, is employed to obtain the low-order subspace. To learn the time-series evolution of dynamic models, we use an autoregressive structure that contains a stack of residual neural network units. Once the projection-based autoregressive neural network model is constructed, the corresponding adjoint model can be easily derived using the automatic-differentiation mode of many popular DL packages.
In summary, we employ a gradient-based iterative algorithm to estimate the space-varying geological parameters. This algorithm requires computing gradients. Compared to the finite-difference method, which perturbs the parameters one by one and, thus, incurs an intensive computational cost, the adjoint method obtains the gradients analytically and, thus, is very computationally efficient. In this study, we mainly develop a deep learning surrogate model to speed up the construction of the adjoint model. The content of this study is structured as follows: Section 2 describes the mathematical formulation of the general adjoint-based variational parameter estimation method. Section 3 provides the mathematical derivation of the model-reduced adjoint model. Section 4 describes an autoregressive neural network proxy model for emulating temporal dynamics. In Section 5, accuracy assessments of our proposed DL proxy model in terms of time-series fluid saturation prediction and parameter estimation results are presented. This section also comprehensively verifies the applicability of model-reduced adjoint-based data assimilation to a subsurface flow model with the aim of geological parameter estimation. Finally, Section 6 concludes our work and discusses possible research extensions.

2. Definition of Adjoint-Based Parameter Estimation for Reservoir Model

Reservoir geological parameter estimation is an important step in the process of reservoir development. Through history matching, various static parameters of the reservoir and its fluids can be further clarified, thereby providing a more accurate understanding of reservoir characteristics. Reservoir history matching builds on reservoir numerical simulation and modifies the model parameters to fit multisource observation data. Without loss of generality, we write a single simulation step of a discretized subsurface porous-media flow system in the following explicit form:
$$\mathbf{x}^n = f^n(\mathbf{x}^{n-1}, \mathbf{m}), \qquad n = 1, \ldots, N_d \tag{1}$$
where $f^n$ denotes the dynamic evolution model between two consecutive timesteps, $n-1$ and $n$; $\mathbf{x}^n$ represents the grid-based reservoir pressure and saturation; and $\mathbf{m}$ denotes a vector of spatially varying parameters, which are the spatial permeability fields in our study. Detailed descriptions can be found in reference [48].
Uncertain parameters can be estimated by minimizing an objective function with an adjoint-based optimization algorithm. In general, an adjoint-based parameter estimation constrained by the simulation model (e.g., Equation (1)) can be formulated as follows:
$$\mathbf{m}^* = \arg\min_{\mathbf{m}} J(\mathbf{x}^1, \ldots, \mathbf{x}^{N_d}, \mathbf{m})$$
$$J(\mathbf{x}^1, \ldots, \mathbf{x}^{N_d}, \mathbf{m}) = \frac{1}{2} \sum_{n=1}^{N_d} \left\| \mathbf{d}^n - \mathbf{x}^n \right\|_2^2$$
$$\text{subject to } \mathbf{x}^n - f^n(\mathbf{x}^{n-1}, \mathbf{m}) = 0, \quad n = 1, \ldots, N_d \tag{2}$$
where $\mathbf{d}^n$ represents the vector of observations at timestep $n$.
To estimate the geological parameters, the objective function $J(\mathbf{x}^1, \ldots, \mathbf{x}^{N_d}, \mathbf{m})$ needs to be minimized by gradient-based local optimization or stochastic global optimization methods. Among them, gradient-based optimization supported by the adjoint-state method is one of the most generic options. In the adjoint-based optimization approach [49,50], the objective function gradient can be derived with the support of an adjoint model and presented by
$$\frac{dJ}{d\mathbf{m}} = \sum_{n=1}^{N_d} [\lambda^n]^T \frac{\partial \mathbf{x}^n}{\partial \mathbf{m}} - \sum_{n=1}^{N_d} \left[ \frac{\partial \mathbf{x}^n}{\partial \mathbf{m}} \right]^T \left( \mathbf{d}^n - \mathbf{x}^n \right) \tag{3}$$
and the corresponding equation of the adjoint model is as follows:
$$\left[ \frac{\partial \mathbf{x}^{n+1}}{\partial \mathbf{x}^n} \right]^T \lambda^{n+1} - \lambda^n + \left( \mathbf{d}^n - \mathbf{x}^n \right) = 0, \qquad n = N_d, \ldots, 1 \tag{4}$$
with the terminal condition $\lambda^{N_d+1} = 0$. More details on the adjoint model are given in our previous work [31,51]. Once the objective function gradient $\frac{dJ}{d\mathbf{m}}$ is obtained, the uncertain parameters can be progressively adjusted using quasi-Newton or Gauss–Newton optimization algorithms [52].
The derivation of the adjoint model needs two derivatives of the dynamic equation, i.e., $\frac{\partial \mathbf{x}^n}{\partial \mathbf{m}}$ and $\frac{\partial \mathbf{x}^{n+1}}{\partial \mathbf{x}^n}$, which are generally not easy to obtain. In addition, adjoint codes are generally not available in commercial simulators, which further complicates the construction of the adjoint model. Moreover, programming an adjoint model requires an intensive computational cost and computer memory storage, especially for high-dimensional models. To mitigate these issues, we propose a model-reduced neural network surrogate to derive these two derivatives for the construction of adjoint models.

3. Formula of Model-Reduced Adjoint Modeling

3.1. Model-Order Reduction Approach

The dynamic equations, e.g., $f^n$, are generally high-dimensional, and this hinders the application of the adjoint-based optimization approach to practical models. This limitation greatly motivates the popularity of the model-order reduction approach [8]. Model-order reduction represents the high-dimensional model state $\mathbf{x}$ using a set of low-dimensional latent-space features $\boldsymbol{\psi} \in \mathbb{R}^{N_\psi}$:
$$\mathbf{x} = \Phi_x \boldsymbol{\psi} \tag{5}$$
where $\Phi_x$ denotes a projection basis matrix and $N_\psi$ denotes the number of low-dimensional patterns. This paper employs POD-based model-order reduction due to its physical interpretability and high scalability.
In this study, the space-varying parameters, e.g., geological permeability, need to be estimated. In general, the spatial parameters preserve some underlying spatial correlations [53]. The geological permeability $\mathbf{m}$ is assumed to statistically obey a Gaussian distribution and, thus, can be reduced into a low-dimensional subspace. On the basis of this assumption, the high-dimensional parameters $\mathbf{m}$ can be represented by a set of low-dimensional variables $\boldsymbol{\xi} \in \mathbb{R}^{N_\xi}$ as follows:
$$\mathbf{m} = \Phi_m \boldsymbol{\xi} \tag{6}$$
where $\Phi_m$ denotes a basis matrix and $N_\xi$ represents the dimension of the low-dimensional patterns. The basis matrix is generally obtained using principal component analysis (PCA).
Substituting Equations (5) and (6) into Equation (2), we obtain a new reduced-order model $f_{POD}^n: \mathbb{R}^{N_\psi + N_\xi} \rightarrow \mathbb{R}^{N_\psi}$ as follows:
$$\boldsymbol{\psi}^n = f_{POD}^n(\boldsymbol{\psi}^{n-1}, \boldsymbol{\xi}), \qquad n = 1, \ldots, N_d \tag{7}$$
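As a concrete reference, the following is a minimal numpy sketch of the two reductions in Equations (5) and (6). The snapshot data, array shapes, and the mean-centering of the PCA step are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def pod_basis(snapshots, n_modes):
    """Leading left singular vectors of a snapshot matrix (one column per snapshot)."""
    U, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    return U[:, :n_modes]

# State reduction, Eq. (5): x ~= Phi_x @ psi, with psi = Phi_x.T @ x.
X = np.random.rand(400, 1000)            # placeholder: 20x20 grid, 1000 snapshots
Phi_x = pod_basis(X, n_modes=50)
psi = Phi_x.T @ X[:, 0]                  # reduced state of one snapshot
x_rec = Phi_x @ psi                      # back-projection to full order

# Parameter reduction, Eq. (6): m ~= Phi_m @ xi via PCA of log-permeability
# realizations (a mean-centered PCA variant is assumed here).
M = np.random.rand(400, 500)             # placeholder ensemble of log-perm fields
m_mean = M.mean(axis=1, keepdims=True)
Phi_m = pod_basis(M - m_mean, n_modes=72)
xi = Phi_m.T @ (M[:, [0]] - m_mean)      # reduced parameters of one realization
```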

3.2. Model-Reduced Adjoint Modeling

Equations (2)–(4) represent the adjoint equations of a fluid dynamic model in the full-order space. For any high-dimensional dynamic system, the derivation and solution of the adjoint equations are very expensive. To address this issue, a reduced-order adjoint model is introduced in detail below. In general, the parameter estimation problem in the reduced-order subspace can be reformulated as follows:
$$\boldsymbol{\xi}^* = \arg\min_{\boldsymbol{\xi}} J_{rom}(\boldsymbol{\psi}^1, \ldots, \boldsymbol{\psi}^{N_d}, \boldsymbol{\xi})$$
$$J_{rom}(\boldsymbol{\psi}^1, \ldots, \boldsymbol{\psi}^{N_d}, \boldsymbol{\xi}) = \frac{1}{2} \sum_{n=1}^{N_d} \left\| \mathbf{d}^n - \Phi_x \boldsymbol{\psi}^n \right\|_2^2$$
$$\text{subject to } \boldsymbol{\psi}^n - f_{POD}^n(\boldsymbol{\psi}^{n-1}, \boldsymbol{\xi}) = 0, \quad n = 1, \ldots, N_d \tag{8}$$
The Lagrange multiplier method is generally used to formulate the model-reduced adjoint-state method, where the model-constrained optimization problem is replaced by an unconstrained one. We modify the original objective function $J_{rom}$ by adjoining the system equation as follows:
$$\hat{J}_{rom} = J_{rom} + \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \left[ \boldsymbol{\psi}^n - f_{POD}^n(\boldsymbol{\psi}^{n-1}, \boldsymbol{\xi}) \right] \tag{9}$$
where $\lambda_{rom}^n$ represents the model-reduced adjoint vector of variables at time instance $n$. Then, we can further obtain the total variation of $\hat{J}_{rom}$ with respect to $d\boldsymbol{\psi}^n$, for $n = 1, \ldots, N_d$, and $d\boldsymbol{\xi}$ as follows:
$$d\hat{J}_{rom} = dJ_{rom} + \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T d\boldsymbol{\psi}^n - \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \frac{\partial f_{POD}^n}{\partial \boldsymbol{\psi}^{n-1}} d\boldsymbol{\psi}^{n-1} - \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \frac{\partial f_{POD}^n}{\partial \boldsymbol{\xi}} d\boldsymbol{\xi} \tag{10}$$
We can rearrange the above equation by shifting the summation index of the terms involving $\boldsymbol{\psi}^{n-1}$:
$$d\hat{J}_{rom} = dJ_{rom} + \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T d\boldsymbol{\psi}^n - \sum_{n=0}^{N_d-1} [\lambda_{rom}^{n+1}]^T \frac{\partial f_{POD}^{n+1}}{\partial \boldsymbol{\psi}^n} d\boldsymbol{\psi}^n - \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \frac{\partial f_{POD}^n}{\partial \boldsymbol{\xi}} d\boldsymbol{\xi} \tag{11}$$
and further reformulating the limits of the sums involving $\boldsymbol{\psi}^n$ yields
$$d\hat{J}_{rom} = dJ_{rom} + \sum_{n=1}^{N_d} \left\{ [\lambda_{rom}^n]^T - [\lambda_{rom}^{n+1}]^T \frac{\partial f_{POD}^{n+1}}{\partial \boldsymbol{\psi}^n} \right\} d\boldsymbol{\psi}^n - [\lambda_{rom}^1]^T \frac{\partial f_{POD}^1}{\partial \boldsymbol{\psi}^0} d\boldsymbol{\psi}^0 + [\lambda_{rom}^{N_d+1}]^T \frac{\partial f_{POD}^{N_d+1}}{\partial \boldsymbol{\psi}^{N_d}} d\boldsymbol{\psi}^{N_d} - \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \frac{\partial f_{POD}^n}{\partial \boldsymbol{\xi}} d\boldsymbol{\xi} \tag{12}$$
The variation of the objective function can be represented as the total variation with respect to $\boldsymbol{\psi}^n$ and $\boldsymbol{\xi}$:
$$dJ_{rom} = \sum_{n=1}^{N_d} \frac{\partial J_{rom}}{\partial \boldsymbol{\psi}^n} d\boldsymbol{\psi}^n + \frac{\partial J_{rom}}{\partial \boldsymbol{\xi}} d\boldsymbol{\xi} \tag{13}$$
Substituting the above equation into Equation (12) yields
$$d\hat{J}_{rom} = \sum_{n=1}^{N_d} \left\{ \frac{\partial J_{rom}}{\partial \boldsymbol{\psi}^n} + [\lambda_{rom}^n]^T - [\lambda_{rom}^{n+1}]^T \frac{\partial f_{POD}^{n+1}}{\partial \boldsymbol{\psi}^n} \right\} d\boldsymbol{\psi}^n - \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \frac{\partial f_{POD}^n}{\partial \boldsymbol{\xi}} d\boldsymbol{\xi} + \frac{\partial J_{rom}}{\partial \boldsymbol{\xi}} d\boldsymbol{\xi} \tag{14}$$
Setting all coefficients of $d\boldsymbol{\psi}^n$ to zero, the adjoint model is finally given as follows:
$$\left[ \frac{\partial J_{rom}}{\partial \boldsymbol{\psi}^n} \right]^T + \lambda_{rom}^n - \left[ \frac{\partial f_{POD}^{n+1}}{\partial \boldsymbol{\psi}^n} \right]^T \lambda_{rom}^{n+1} = 0, \qquad n = 1, \ldots, N_d \tag{15}$$
Given the terminal condition $\lambda_{rom}^{N_d+1} = 0$, the reduced-order adjoint model can be easily solved in a backward-propagation manner. The total variation of $\hat{J}_{rom}$ with respect to the parameters $\boldsymbol{\xi}$ is given by
$$\frac{d\hat{J}_{rom}}{d\boldsymbol{\xi}} = \frac{\partial J_{rom}}{\partial \boldsymbol{\xi}} - \sum_{n=1}^{N_d} [\lambda_{rom}^n]^T \frac{\partial f_{POD}^n}{\partial \boldsymbol{\xi}} \tag{16}$$
The model-reduced adjoint system (e.g., Equations (15) and (16)) requires two sensitivity terms of the reduced-order model, i.e., $\frac{\partial f_{POD}^{n+1}}{\partial \boldsymbol{\psi}^n}$ and $\frac{\partial f_{POD}^n}{\partial \boldsymbol{\xi}}$. In the following section, we propose a projection-based autoregressive neural network to emulate the reduced-order model $f_{POD}^n$. This surrogate model can easily provide the sensitivity terms through the automatic-differentiation module built into deep-learning frameworks.

4. Projection-Based Autoregressive Neural Network (POD-aNN)

This section describes a novel methodology that derives the model-reduced adjoint model using a deep neural network. In detail, it covers the neural network architecture, training sample preparation, and the training procedure. A diagram of our proposed neural network structure is displayed in Figure 1.

4.1. Description of Neural Network Architecture

Our proposed POD-aNN surrogate model emulates the model-reduced flow dynamics at two consecutive timesteps, e.g., Equation (7). Generally, an autoregressive neural network is capable of capturing the temporal dynamic process as follows:
$$\hat{\boldsymbol{\psi}}^n = \hat{f}_{POD}^n(\hat{\boldsymbol{\psi}}^{n-1}, \boldsymbol{\xi}, \theta), \qquad n = 1, \ldots, N_d \tag{17}$$
where $\hat{\boldsymbol{\psi}}^n$ and $\hat{\boldsymbol{\psi}}^{n-1}$ are the POD pattern coefficients at the $n$-th and $(n-1)$-th timesteps, $\boldsymbol{\xi}$ contains the PCA coefficients, and $\theta$ represents the trainable neural network parameters. The structure of the autoregressive neural network reveals that the current POD patterns $\hat{\boldsymbol{\psi}}^n$ rely only on the previous POD patterns $\hat{\boldsymbol{\psi}}^{n-1}$ and the time-independent PCA patterns $\boldsymbol{\xi}$, indicating that the autoregressive neural network is capable of emulating the temporal dynamics of subsurface flow models [23].
The purpose of POD-aNN is to perform a vector-to-vector regression task. The DL structure cannot generally be determined beforehand, since the design of a DL structure adapted to a specific task does not follow strict criteria. In this study, the configuration of the POD-aNN structure is guided by our previous work [47]. A diagram of the POD-aNN architecture is depicted in Figure 1. The basic unit of this neural network model is composed of convolutional blocks. Each convolutional block includes three consecutive operators: a one-dimensional convolutional layer ($Conv1$), a batch normalization layer ($BN$), and a rectified linear activation unit ($ReLU$). To further improve the prediction accuracy of the POD-aNN surrogate model, we introduce two different kinds of residual blocks [54], depicted in Figure 1a,b. These two kinds of residual blocks, i.e., $ResBlock1$ and $ResBlock2$, are connected and followed by a fully connected ($FC$) neural network. The deeper motivations for the POD-aNN architecture can be explained as follows. On the one hand, the adjoint model requires the dynamic evolution at two consecutive timesteps, and the autoregressive structure, unlike a conventional recurrent neural network, directly satisfies this requirement. On the other hand, a deeper neural network can represent more complex relationships; to ease the training process and mitigate the vanishing/exploding gradient issue, we use residual blocks, for which dense blocks might be an alternative. The choice of a neural network architecture is flexible but still relatively subjective.
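The sketch below is one plausible PyTorch reading of Figure 1, with channel counts following Section 5.2. It rests on explicit assumptions: each convolutional block is Conv1d→BN→ReLU with kernel size 3; $ResBlock1$ uses an identity shortcut while $ResBlock2$ changes channels through a 1×1 convolution on the shortcut; and the concatenated coefficient vector $(\hat{\boldsymbol{\psi}}^{n-1}, \boldsymbol{\xi})$ is treated as the channel dimension of a length-one sequence. None of these details beyond the operator types and channel counts is confirmed by the paper:

```python
import torch
import torch.nn as nn

def conv_block(c_in, c_out):
    # Conv1 -> BN -> ReLU, the basic unit described in the text.
    return nn.Sequential(
        nn.Conv1d(c_in, c_out, kernel_size=3, padding=1),
        nn.BatchNorm1d(c_out),
        nn.ReLU(),
    )

class ResBlock1(nn.Module):                 # identity shortcut, channels unchanged
    def __init__(self, c):
        super().__init__()
        self.body = nn.Sequential(conv_block(c, c), conv_block(c, c))
    def forward(self, x):
        return torch.relu(x + self.body(x))

class ResBlock2(nn.Module):                 # 1x1-conv shortcut, changes channels
    def __init__(self, c_in, c_out):
        super().__init__()
        self.body = nn.Sequential(conv_block(c_in, c_out), conv_block(c_out, c_out))
        self.skip = nn.Conv1d(c_in, c_out, kernel_size=1)
    def forward(self, x):
        return torch.relu(self.skip(x) + self.body(x))

class PODaNN(nn.Module):
    """One autoregressive step (psi^{n-1}, xi) -> psi^n, Eq. (17)."""
    def __init__(self, n_psi=50, n_xi=72, c2=140, c3=130, c4=120):
        super().__init__()
        c1 = n_psi + n_xi                   # C1 = N_psi + N_xi, as in Section 5.2
        self.net = nn.Sequential(
            ResBlock2(c1, c2), ResBlock1(c2),
            ResBlock2(c2, c3), ResBlock1(c3),
            ResBlock2(c3, c4),
        )
        self.fc = nn.Linear(c4, n_psi)      # fully connected output head
    def forward(self, psi_prev, xi):
        z = torch.cat([psi_prev, xi], dim=1).unsqueeze(-1)  # (batch, C1, 1)
        z = self.net(z).squeeze(-1)
        return self.fc(z)
```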
In this study, we propose a model-reduced nonlinear surrogate model by combining POD and an autoregressive neural network. The aNN is mainly used to represent the nonlinear evolution of the model dynamics. As noted earlier, to efficiently address model nonlinearity, the combination of POD and the discrete empirical interpolation method (DEIM) has been proposed in the literature [33,34,35]. Both DEIM and aNN can be used to address model nonlinearity; however, their implementation theory is significantly different. In POD-DEIM, the reduced nonlinear terms are obtained by applying DEIM to a POD basis of the nonlinearities. The DEIM technique projects the original full-order model to low order and, thus, reduces the complexity of solving the equations with Newton-like iteration methods. Overall, the essence of POD-DEIM is to project the high-dimensional governing equations into a low-order subspace and improve the computational efficiency of numerical solvers; by contrast, our proposed POD-aNN constructs a data-driven surrogate of the reduced dynamics.

4.2. Sample Preparation and Training Procedure

A set of samples needs to be generated for training the POD-aNN model. In subsurface flow models, the training samples consist of many pairs mapping geological permeability parameters to time-series fluid saturation/pressure realizations. The full-order states $\mathbf{x}$ and parameters $\mathbf{m}$ are then projected into reduced state variables $\boldsymbol{\psi}$ and $\boldsymbol{\xi}$ using POD and PCA, respectively. To emulate the temporal relationship between two consecutive timesteps, the time-series saturation realizations, i.e., $(\boldsymbol{\psi}_i^1, \ldots, \boldsymbol{\psi}_i^{N_d})$, for the $i$-th high-fidelity model $\boldsymbol{\xi}_i$ ($i = 1, \ldots, N_s$), can be reorganized as follows:
$$\left\{ \left( \boldsymbol{\xi}_i; \boldsymbol{\psi}_i^1, \ldots, \boldsymbol{\psi}_i^{N_d} \right) \right\}_{i=1}^{N_s} \;\Rightarrow\; \left\{ \left( \boldsymbol{\xi}_i, \boldsymbol{\psi}_i^{n-1}; \boldsymbol{\psi}_i^n \right) \right\}_{i=1, n=1}^{N_s, N_d} \tag{18}$$
in which $N_s$ denotes the training sample size. After reorganizing the dataset, the temporal relationship at two consecutive timesteps can be represented by the autoregressive neural network structure.
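As a data-preparation illustration, the following sketch implements the reorganization of Equation (18), splitting each simulated trajectory into consecutive-timestep pairs; array names and the random placeholder values are ours:

```python
import numpy as np

def make_pairs(xi, psi):
    """xi: (N_s, n_xi) reduced parameters; psi: (N_s, N_d + 1, n_psi) reduced
    states including the initial state psi^0. Returns flattened training pairs."""
    N_s, n_steps, _ = psi.shape
    xs, ps, ys = [], [], []
    for i in range(N_s):
        for n in range(1, n_steps):
            xs.append(xi[i])          # time-independent PCA coefficients xi_i
            ps.append(psi[i, n - 1])  # previous reduced state psi_i^{n-1}
            ys.append(psi[i, n])      # target reduced state psi_i^n
    return np.array(xs), np.array(ps), np.array(ys)

xi = np.random.rand(50, 72)           # placeholder: 50 training models, N_xi = 72
psi = np.random.rand(50, 21, 50)      # placeholder: psi^0 plus N_d = 20 timesteps
X_xi, X_psi, Y = make_pairs(xi, psi)  # 50 * 20 = 1000 training pairs
```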
To improve the performance of the trained POD-aNN surrogate model, we introduce a regularized loss function. The first several POD patterns dominate the dynamic features, so it is advisable to improve the fitting accuracy of the POD-based variables corresponding to large eigenvalues. We therefore define the loss function by projecting the reduced states $\boldsymbol{\psi}$ back to the full-order states $\mathbf{x}$. This loss function can be considered an extension of a plain reduced-space loss, weighted by the basis matrix $\Phi_x$:
$$L(\theta) = \frac{1}{N_s N_t} \sum_{i=1}^{N_s} \sum_{n=1}^{N_t} \left\| \Phi_x \hat{\boldsymbol{\psi}}_i^n - \Phi_x \boldsymbol{\psi}_i^n \right\|_2^2 \tag{19}$$
We employ the stochastic-gradient optimizer Adam to optimize the neural network model. Once the POD-aNN is trained, predictions for new test samples are easily realized. Given the initial model-reduced state variables $\boldsymbol{\psi}^0$ and the reduced parameters $\boldsymbol{\xi}$, Equation (17) can be applied repeatedly to obtain the time-series reduced state variables $\hat{\boldsymbol{\psi}}^n$ at all $N_d$ time instances. Specifically, each output vector $\hat{\boldsymbol{\psi}}^n$ is predicted consecutively by feeding the reduced state $\hat{\boldsymbol{\psi}}^{n-1}$ and the reduced parameters $\boldsymbol{\xi}$ as inputs. This procedure is explicitly displayed in Figure 2. After that, the set of predicted variables $[\hat{\boldsymbol{\psi}}^1, \hat{\boldsymbol{\psi}}^2, \ldots, \hat{\boldsymbol{\psi}}^{N_d}]$ can be transformed into the corresponding full-order states $[\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^{N_d}]$ using Equation (5).
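A hedged sketch of the training and prediction stages follows: Adam minimizes the projected loss of Equation (19), and the trained network is rolled out autoregressively as in Figure 2. It reuses `PODaNN` and the training pairs from the previous sketches; the batch size, learning rate, and epoch count are illustrative, not the tuned values reported in Table 1:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

model = PODaNN(n_psi=50, n_xi=72)                 # architecture sketched above
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
Phi_x = torch.rand(400, 50)                       # placeholder POD basis (n_cells, n_psi)

# Training pairs from the reorganization sketch, as float32 tensors.
data = TensorDataset(torch.tensor(X_xi, dtype=torch.float32),
                     torch.tensor(X_psi, dtype=torch.float32),
                     torch.tensor(Y, dtype=torch.float32))
loader = DataLoader(data, batch_size=64, shuffle=True)

def projected_loss(psi_pred, psi_true):
    # Eq. (19): compare states after projection back to full order, which
    # implicitly weights the POD modes by the basis matrix Phi_x.
    diff = (psi_pred - psi_true) @ Phi_x.T        # (batch, n_cells)
    return (diff ** 2).sum(dim=1).mean()

for epoch in range(200):
    for xi_b, psi_prev_b, psi_b in loader:
        opt.zero_grad()
        loss = projected_loss(model(psi_prev_b, xi_b), psi_b)
        loss.backward()
        opt.step()

@torch.no_grad()
def rollout(model, psi0, xi, n_steps):
    """Autoregressive prediction (Figure 2): feed each output back as input."""
    model.eval()
    psi, out = psi0, []
    for _ in range(n_steps):
        psi = model(psi, xi)
        out.append(psi)
    return torch.stack(out, dim=1)                # (batch, N_d, n_psi)
```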

4.3. Model-Reduced Adjoint-Based Data Assimilation Using POD-aNN

The objective function $J_{\text{POD-aNN}}$ using our trained POD-aNN surrogate model can be reformulated as follows:
$$\boldsymbol{\xi}^* = \arg\min_{\boldsymbol{\xi}} J_{\text{POD-aNN}}(\hat{\boldsymbol{\psi}}^1, \ldots, \hat{\boldsymbol{\psi}}^{N_d}, \boldsymbol{\xi})$$
$$J_{\text{POD-aNN}}(\hat{\boldsymbol{\psi}}^1, \ldots, \hat{\boldsymbol{\psi}}^{N_d}, \boldsymbol{\xi}) = \frac{1}{2} \sum_{n=1}^{N_d} \left\| \mathbf{d}^n - \Phi_x \hat{\boldsymbol{\psi}}^n \right\|_2^2$$
$$\text{subject to } \hat{\boldsymbol{\psi}}^n - \hat{f}_{POD}^n(\hat{\boldsymbol{\psi}}^{n-1}, \boldsymbol{\xi}, \theta) = 0, \quad n = 1, \ldots, N_d \tag{20}$$
and the gradient of the approximated objective function $J_{\text{POD-aNN}}$ is
$$\frac{dJ_{\text{POD-aNN}}}{d\boldsymbol{\xi}} = \frac{\partial J_{\text{POD-aNN}}}{\partial \boldsymbol{\xi}} - \sum_{n=1}^{N_t} [\hat{\lambda}_{rom}^n]^T \frac{\partial \hat{f}_{POD}^n}{\partial \boldsymbol{\xi}} \tag{21}$$
and the model-reduced adjoint equation with the model-reduced Lagrange multipliers $\hat{\lambda}_{rom}^n$ is
$$\left[ \frac{\partial J_{\text{POD-aNN}}}{\partial \hat{\boldsymbol{\psi}}^n} \right]^T + \hat{\lambda}_{rom}^n - \left[ \frac{\partial \hat{f}_{POD}^{n+1}}{\partial \hat{\boldsymbol{\psi}}^n} \right]^T \hat{\lambda}_{rom}^{n+1} = 0 \tag{22}$$
Since both the training of the POD-aNN surrogate model and the adjoint-based parameter estimation are implemented in the PyTorch framework, the auto-differentiation module can be employed to derive the required derivative terms, e.g., $\frac{\partial \hat{f}_{POD}^n}{\partial \boldsymbol{\xi}}$ and $\frac{\partial \hat{f}_{POD}^{n+1}}{\partial \hat{\boldsymbol{\psi}}^n}$ for $n = 1, \ldots, N_d$, which are needed by the reduced-order adjoint model. In addition, the other two derivatives of the objective function, i.e., $\frac{\partial J_{\text{POD-aNN}}}{\partial \boldsymbol{\xi}}$ and $\frac{\partial J_{\text{POD-aNN}}}{\partial \hat{\boldsymbol{\psi}}^n}$, can also be computed efficiently. We can thus evaluate the least-squares function, e.g., Equation (20), and its derivatives using the auto-differentiation module in PyTorch.
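As an illustration of how auto-differentiation replaces a hand-coded adjoint, the minimal sketch below evaluates Equation (20) by rolling the surrogate forward and obtains $dJ_{\text{POD-aNN}}/d\boldsymbol{\xi}$ with a single backward pass; reverse-mode differentiation through the rollout performs the backward recursion of Equation (22) implicitly. Tensor names and shapes carry over from the earlier sketches and are assumptions:

```python
import torch

def objective_and_grad(model, psi0, xi, d_obs, Phi_x):
    """J of Eq. (20) and its gradient dJ/dxi of Eq. (21).

    d_obs: (N_d, n_cells) tensor of full-order observations per timestep."""
    model.eval()                                   # frozen, trained surrogate
    xi = xi.clone().requires_grad_(True)
    J, psi = 0.0, psi0
    for n in range(d_obs.shape[0]):
        psi = model(psi, xi)                       # keep the graph across timesteps
        resid = d_obs[n] - psi @ Phi_x.T           # d^n - Phi_x psi^n
        J = J + 0.5 * (resid ** 2).sum()
    J.backward()                                   # reverse mode = adjoint solve
    return J.detach(), xi.grad
```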
When we obtain the gradient $\left[ \frac{dJ_{\text{POD-aNN}}}{d\boldsymbol{\xi}} \right]^k$ at the $k$-th iteration, the optimized parameters at this iteration can be obtained as follows:
$$\boldsymbol{\xi}^{k+1} = \boldsymbol{\xi}^k - \varepsilon^k \left[ \frac{dJ_{\text{POD-aNN}}}{d\boldsymbol{\xi}} \right]^k \tag{23}$$
where $\varepsilon^k$ denotes the step length at the $k$-th iteration. The gradient-descent algorithm is considered to have sufficiently converged when the following predefined criteria are satisfied:
  • The change in the objective function between two consecutive iterations is very small, i.e.,
    $$\frac{\left| J_{\text{POD-aNN}}(\boldsymbol{\xi}^{k+1}) - J_{\text{POD-aNN}}(\boldsymbol{\xi}^{k}) \right|}{\max\left\{ \left| J_{\text{POD-aNN}}(\boldsymbol{\xi}^{k+1}) \right|, 1 \right\}} < \eta_{J_{\text{POD-aNN}}}$$
  • The change in the updated parameters between two consecutive iterations is very small, i.e.,
    $$\frac{\left| \boldsymbol{\xi}^{k+1} - \boldsymbol{\xi}^{k} \right|}{\max\left\{ \left| \boldsymbol{\xi}^{k+1} \right|, 1 \right\}} < \eta_{\boldsymbol{\xi}}$$
where $\eta_{J_{\text{POD-aNN}}}$ and $\eta_{\boldsymbol{\xi}}$ denote the predefined convergence tolerances. In our experiments, $\eta_{J_{\text{POD-aNN}}} = 10^{-4}$ and $\eta_{\boldsymbol{\xi}} = 10^{-3}$. Algorithm 1 describes the detailed implementation procedure of the entire workflow.
Algorithm 1: Model-reduced adjoint-based parameter estimation using POD-aNN surrogate model.
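Since Algorithm 1 appears as an image in the original layout, the following compact sketch restates its main loop under our reading: repeated gradient-descent updates of $\boldsymbol{\xi}$ (Equation (23)) until the two convergence tests above are met. It reuses `objective_and_grad` from the previous sketch; the fixed step length and iteration cap are illustrative choices, as the schedule for $\varepsilon^k$ is not specified here:

```python
import torch

def estimate_parameters(model, psi0, xi0, d_obs, Phi_x,
                        step=0.1, max_iter=200, eta_J=1e-4, eta_xi=1e-3):
    """Gradient-descent loop of Eq. (23) with the two stopping tests above."""
    xi, J_prev = xi0.clone(), float("inf")
    for k in range(max_iter):
        J, grad = objective_and_grad(model, psi0, xi, d_obs, Phi_x)
        xi_new = (xi - step * grad).detach()       # Eq. (23), fixed step length
        rel_dJ = abs(J.item() - J_prev) / max(abs(J.item()), 1.0)
        rel_dxi = ((xi_new - xi).abs().max() /
                   torch.clamp(xi_new.abs().max(), min=1.0)).item()
        xi = xi_new
        if rel_dJ < eta_J and rel_dxi < eta_xi:    # both convergence criteria met
            break
        J_prev = J.item()
    return xi
```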

5. Illustrative Example and Results

This section analyzes numerical results for a synthetic two-dimensional subsurface transport model with a grid of 20 × 20. The 2D subsurface transport model contains one producing well and one injection well located at two corners of the domain. In this numerical experiment, we use an open-source simulation tool, the Matlab Reservoir Simulation Toolbox (MRST) [55], to run the high-fidelity model (HFM) simulations.

5.1. Description of Subsurface Transport Model

Our research mainly focuses on estimating the geological parameters of a subsurface transport model with an oil/water two-phase reservoir model. In the 2D transport model, the rock permeability field is assumed to be spatially heterogeneous, following a log-Gaussian random distribution. We generate Gaussian-distributed realizations of the logarithmic permeability using the Stanford Geostatistical Modeling Software (SGeMS) [56]. As mentioned previously, the log-permeability field is reparameterized by PCA and, therefore, can be represented by low-dimensional PCA patterns. Specifically, we preserve $N_\xi = 72$ PCA pattern coefficients to accurately characterize the spatial variability and correlation of the permeability fields. As illustrated in Figure 3, one realization is regarded as the true log-permeability field. The triangle and circle denote the injector and producer, respectively, and the 25 white diamonds represent the measurement locations. Table 1 lists other model settings, including the model size along the two dimensions, fluid properties, formation pressure, initial fluid saturation, and well pumping schedules.

5.2. Construction of the Projection-Based Autoregressive Neural Network

Implementation of Model-Order Reduction

As noted earlier, $N_\xi = 72$ PCA pattern coefficients are retained to accurately characterize the geological permeability fields. Sets of PCA pattern coefficients are randomly generated to reconstruct permeability fields. Through running the MRST simulator, these permeability fields are used as inputs of the HFM simulations, and the simulated time-series fluid saturation snapshots are preserved as outputs. These pairs of permeability fields and fluid saturation constitute the training samples. In this case study, 50 simulation models are run, which results in 1000 saturation snapshots, from which the basis matrix of POD patterns is obtained. As illustrated in Figure 4, increasing the number of POD patterns preserves the dynamic features with increasingly high accuracy; for example, $N_\psi$ = 10, 50, and 100 POD patterns retain 87.3%, 97.2%, and 98.8% accuracy, respectively. In this case study, 50 POD modes are almost sufficient to represent the original full-order flow dynamics.
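Mode counts of this kind are typically chosen with a cumulative-energy criterion on the snapshot singular values; the short sketch below illustrates this standard criterion. The random snapshots are placeholders, so it will not reproduce the 87.3%/97.2%/98.8% figures reported above from Figure 4:

```python
import numpy as np

X = np.random.rand(400, 1000)                 # placeholder saturation snapshots
_, s, _ = np.linalg.svd(X, full_matrices=False)
energy = np.cumsum(s ** 2) / np.sum(s ** 2)   # fraction retained by leading modes
for n_modes in (10, 50, 100):
    print(f"N_psi = {n_modes:3d}: {energy[n_modes - 1]:.1%} of snapshot energy")
```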
The details of the POD-aNN neural network model for this 2D synthetic model are shown in Table 2. As listed in Table 1, the grid dimensions of this 2D model are $N_x = 20$ and $N_y = 20$. In this case study, we set the numbers of channels as $C_1 = N_\psi + N_\xi$, $C_2 = 140$, $C_3 = 130$, $C_4 = 120$, and $C_5 = N_\psi$, respectively.

5.3. Accuracy Evaluation of POD-aNN Neural Network Model

In terms of the POD-aNN neural network surrogate model, there are two main sources of errors (SOEs): (SOE1) approximation errors of the neural network surrogate, and (SOE2) the loss of hidden physical dynamics due to an insufficient number of preserved POD patterns, e.g., $N_\psi = 10$. To evaluate the quality of the POD-aNN neural network model with respect to the number of training samples $N_s$, $N_{test}$ high-fidelity models are run to generate the test samples. The mean relative errors over $N_d$ timesteps among the $N_{test}$ test samples, denoted $\gamma_{SOE1}$ and $\gamma_{SOE2}$, respectively, are given by
$$\gamma_{SOE1} = \frac{1}{N_{test} N_d} \sum_{i=1}^{N_{test}} \sum_{n=1}^{N_d} \frac{\left\| \hat{\boldsymbol{\psi}}_i^n - \boldsymbol{\psi}_i^n \right\|}{\left\| \boldsymbol{\psi}_i^n \right\|} \tag{24}$$
and
$$\gamma_{SOE2} = \frac{1}{N_{test} N_d} \sum_{i=1}^{N_{test}} \sum_{n=1}^{N_d} \frac{\left\| \Phi_x \hat{\boldsymbol{\psi}}_i^n - \mathbf{x}_i^n \right\|}{\left\| \mathbf{x}_i^n \right\|} \tag{25}$$
where $\gamma_{SOE1}$ and $\gamma_{SOE2}$ correspond to the reduced state variables and full-order variables, respectively. $\gamma_{SOE1}$ only reveals the errors from the aNN neural network, while $\gamma_{SOE2}$ includes the total errors from both the model-order reduction and the aNN neural network. The performance of the surrogate model depends on the selection of hyperparameters, e.g., learning rate and batch size. The evolution of the loss function for different learning rates and batch sizes is shown in Figure 5. The optimal hyperparameters for training the POD-aNN neural network model can be found in Table 1.
We progressively increase the number of training samples, namely, $N_s$ = 400, 600, 800, and 1000, to assess the evolution of neural network accuracy. The synthetic subsurface flow model is simulated over 20 timesteps. In this study, we investigate three different data collection strategies that measure the training samples at (1) $S_1$: the $n$ = [4, 8, 12, 16, 20]-th timesteps, with $N_t$ = 5; (2) $S_2$: the $n$ = [2, 4, 6, 8, 10, 12, 14, 16, 18, 20]-th timesteps, with $N_t$ = 10; and (3) $S_3$: the $n$ = [1, 2, 3, 4, …, 18, 19, 20]-th timesteps, with $N_t$ = 20. For instance, the data collection strategy $S_1$ yields 2000, 3000, 4000, and 5000 training samples, correspondingly.
Figure 6 plots the evolution of the field-averaged relative error $\gamma_{SOE1}$ and $\gamma_{SOE2}$ scores over all 200 test models with respect to the time instances. Due to the error-accumulation effect induced by the autoregressive network structure, both $\gamma_{SOE1}$ and $\gamma_{SOE2}$ progressively increase as time goes on. This phenomenon reveals that the prediction accuracy of the POD-aNN network model decreases over time. Despite this error-accumulation problem, the POD-aNN neural network model trained on the $N_s$ = 1000 dataset is still capable of achieving relatively small $\gamma_{SOE1}$ and $\gamma_{SOE2}$ values. In general, increasing the number of training samples is likely to mitigate the negative error-accumulation effects to a certain degree. In our methodology, this can be realized by using more model simulations (i.e., 1000 high-fidelity model runs) or collecting data more frequently (i.e., $N_t$ = 20 in scheme $S_3$ in this case study). For example, using data measured at all 20 timesteps allows us to obtain a POD-aNN network model trained on sufficient samples, which helps to mitigate the overfitting problem and, hence, generalizes the trained POD-aNN to the test samples with low relative errors.
Figure 7 displays the evolution of the $\gamma_{SOE1}$ and $\gamma_{SOE2}$ values for $N_{test}$ = 200 test samples. In this numerical experiment, the results for $N_\psi$ = 100 and 50 POD modes are analyzed. It clearly reveals that the POD-aNN neural network model produces relatively small errors, e.g., 0.02–0.10, with only 400 training samples. When we increase the number of training samples $N_s$ to 1000, both the $\gamma_{SOE1}$ and $\gamma_{SOE2}$ scores reduce to about 0.06. In this POD-aNN neural network model, the current output relies on the state information at the previous timestep. This dependency might cause error accumulation; that is, the approximation error gradually increases as time evolves. By retaining more POD patterns, the difference between $\gamma_{SOE1}$ and $\gamma_{SOE2}$ becomes smaller; the main source of approximation error then stems from the POD-aNN surrogate model itself.
Figure 8 shows the time-series saturation distributions predicted by POD-aNN and the HFM at five timesteps, i.e., $n$ = [4, 8, 12, 16, 20]. The three consecutive rows represent the saturation predictions from the high-fidelity model ($\mathbf{x}$), POD-aNN ($\hat{\mathbf{x}}$), and their differences ($\mathbf{x} - \hat{\mathbf{x}}$). In this numerical experiment, we train the POD-aNN models with $N_s$ = 1000 training samples. The POD-aNN neural network model is able to accurately predict the time-series saturation, as indicated by the small difference values ($\mathbf{x} - \hat{\mathbf{x}}$). For example, the mean relative errors $\gamma_{SOE2}$ are 6.22% and 5.34% for $N_\psi$ = 50 and 100 POD modes, respectively.
Figure 9 shows the time-series prediction of fluid saturation at three spatial points. As shown in Figure 3, these three positions are the injector $I_1$, the producer $P_1$, and a random point D. We can clearly observe that the predictions of fluid saturation from POD-aNN are very close to those from the high-fidelity model. Increasing the number of POD modes generally yields better results.

5.4. Results of Geological Parameter Estimation

The accuracy of the POD-aNN surrogate model was demonstrated in the previous section. In this section, the POD-aNN is further applied to estimate the geological parameters of the 2D synthetic subsurface transport model. Two scenarios are designed to demonstrate the performance of the POD-aNN surrogate model. Scenario 1 is mainly used to verify the proposed methodology with spatially sparse measurements, e.g., the 25 white diamonds shown in Figure 4. The purpose of Scenario 2 is to demonstrate the effectiveness and robustness of the POD-aNN surrogate model in high-dimensional inverse modeling by assimilating spatially grid-based measurements.
The root-mean-square error of the estimated geological permeability field is used to assess the parameter estimation results and is defined by
$$e_m = \sqrt{\frac{1}{N_m} \sum_{i=1}^{N_m} \left( m_i - m_{true,i} \right)^2} \tag{26}$$
where $N_m$ denotes the number of geological permeability parameters, $m_i$ is the permeability at the $i$-th gridblock, and $m_{true}$ is the true geological permeability field.

5.4.1. Scenario 1—Sparse Measurements

In Scenario 1, we generate the synthetic observations by adding noise to the simulation results of the high-fidelity model parameterized by the true geological permeability field. Both the true permeability field and the measurement locations can be found in Figure 4. Based on the simulated results at the $n$ = [4, 8, 12, 16, 20]-th timesteps, we collect the fluid saturation at the measurement locations, which results in 125 measurements.
The sensitivity of the POD-aNN surrogate to the number of training samples is analyzed first, together with a comparison between the finite-difference (FD) method and the POD-aNN surrogate model. The evolution of the logarithmic objective-function values with respect to the number of training samples is shown in Figure 10. The objective function values generated by the adjoint-based optimization using POD-aNN are equivalent to, or even smaller than, those of the FD method. Figure 11 displays the time-series fluid saturation field predicted by the POD-aNN surrogate model. Although the estimated geological permeability fields differ significantly, the predicted saturation distributions are nearly identical to the saturation predictions from the true model. This result further reveals that multiple locally optimal solutions have been obtained. In terms of computational cost, the FD method requires more than 3000 forward model simulations, while our proposed POD-aNN-assisted framework produces comparable parameter estimation results at a much smaller computational cost.
The sensitivity of the POD-aNN surrogate to the number of POD patterns is investigated next. $N_\psi$ = 10, 50, and 100 POD modes, corresponding to 87.3%, 97.2%, and 98.8% model-order reduction accuracy, respectively, are used to analyze the parameter estimation performance. Figure 12a displays the evolution of the logarithmic objective-function values with respect to the number of POD modes. Although using fewer POD modes yields faster convergence, smaller objective function values are obtained as the number of POD modes increases. The main reason is that the parameter estimation results of our proposed POD-aNN approach are substantially influenced by the selection and representation of the POD patterns. In this case study, the POD-aNN surrogate model with $N_\psi$ = 50 POD modes obtained objective function values almost comparable to those of the FD method. Figure 12b–d show the updated geological permeability fields and the parameter errors $e_m$ from our proposed POD-aNN surrogate model. The main parameter features, e.g., highly permeable zones, were largely reconstructed. Even a small number of POD modes has the potential to capture the geological features of the parameter field, and additional features of the geological permeability field are gradually recovered as the number of POD modes increases. In addition, increasing the number of POD modes also produces smaller parameter errors.

5.4.2. Scenario 2—Assimilation of Spatially Dense Measurements

In this scenario, the synthetic observations are generated by measuring the fluid saturation at all gridblocks. This type of data nearly mimics a complex situation, e.g., spatially dense observations interpreted from a seismic survey. We assume that the fluid saturation at all gridblocks is measured at the $n$ = [4, 8, 12, 16, 20]-th timesteps, which results in 2000 measurements.
Both the logarithmic objective functions and the updated log-permeability fields are shown in Figure 13. As shown in Figure 13a, increasing the number of training samples gradually results in smaller objective function values. In this scenario, although the FD method achieves relatively smaller objective function values than our proposed POD-aNN surrogate model, the POD-aNN surrogate model needs far fewer high-fidelity model simulations than the FD method. In addition, compared with Scenario 1, the computational cost barely increases with the number of measurements. This result further demonstrates the scalability of the POD-aNN approach. Figure 13b–f show the true, prior, and posterior log-permeability fields. Compared to the FD approach, our proposed POD-aNN obtains satisfactory posterior parameter fields, which are very close to the true model. Surprisingly, the conventional FD method fails to accurately estimate the true model in this case study. This result suggests that our proposed method has the ability to find the globally optimal solution to some degree. Figure 14 displays the temporal–spatial distribution of fluid saturation predicted by POD-aNN, the HFM, and their differences. It can clearly be observed that spatially dense measurements produce highly accurate predictions of the saturation.
Figure 15a displays the evolution of the logarithmic objective function with respect to the optimization steps for $N_\psi$ = 100, 50, and 10 POD modes, respectively. Preserving more POD modes reasonably generates smaller objective function values. In this case study, preserving $N_\psi$ = 100 and 50 POD modes yields comparable results. For example, retaining 50 POD modes almost recovers the true logarithmic permeability field, as indicated in Figure 15b–d. That is to say, 50 POD modes contain enough information extracted from the 2000 measurements. In terms of parameter estimation in this case study, ignoring some unimportant POD patterns corresponding to small eigenvalues might even introduce positive effects, i.e., efficiently searching for optimal solutions in a reduced subspace.
Figure 16 displays the time-dependent fluid saturation predicted by POD-aNN, the HFM, and their differences. The differences between the HFM and POD-aNN are relatively large when only 10 POD modes are retained. However, the flow patterns, e.g., the position of the fluid fronts, are approximately captured. In inverse modeling, these flow patterns provide sufficient information for updating the parameters. These results partially explain why retaining only 10 POD modes still generates satisfactory parameter estimates.
To further prove the effectiveness of our proposed POD-aNN proxy model, 200 posterior realizations are obtained using 200 different prior models. Figure 17a,b show the box plots of the parameter errors $e_m$ in the two scenarios, respectively. Compared with the prior models, the parameter errors $e_m$ are greatly reduced. Since a larger number of measurements is assimilated in Scenario 2, the reduction of the parameter error $e_m$ in Scenario 2 is greater than that in Scenario 1. The average and standard deviation (STD) of the ensemble log-permeability fields are shown in Figure 17c,d. The ensemble-average parameter field clearly reveals that the POD-aNN surrogate model effectively identifies the geological properties of the true model. Compared to the prior models, the posterior models are very close to the true model. In addition, the geological permeability uncertainty is significantly decreased, as indicated by the small posterior STD. Gradient-based parameter estimation algorithms inevitably fall into local minima; however, since our surrogate-based parameter estimation framework does not require many additional HFM simulations, we can easily obtain multiple posterior realizations. All these results confirm that the POD-aNN method can generate many posterior parameter fields for uncertainty evaluation while maintaining high scalability and efficiency.

6. Discussion

6.1. Comments on Computational Cost

In terms of computational effort, the proposed POD-aNN surrogate model needs about 0.01 s per run on an NVIDIA GeForce GTX 745 GPU, which is nearly 600 times faster than the original high-fidelity model, e.g., about 6.0 s for this 2D synthetic case study. However, the offline training phase of the neural network model incurs additional computational costs for training dataset generation and network parameter optimization. Training the POD-aNN surrogate model is only worthwhile when it is employed in applications where many simulations are repeatedly performed. Because numerous high-fidelity model simulations are usually needed in simulation-based parameter estimation, POD-aNN is undoubtedly beneficial in this situation.
In this study, since the running time of the POD-aNN surrogate model is almost negligible, the number of HFM simulations is a suitable metric for quantifying the computational expense. For example, training the POD-aNN surrogate model with 1000 samples requires about 18 min, which is equivalent to 200 HFM simulations. In summary, our proposed model-reduced adjoint-based inverse modeling requires about 1200 model runs, whereas the conventional FD method takes 13,250 HFM simulations. We conclude that the use of POD-aNN significantly reduces the running time of the parameter estimation problem in this case study. Both the effectiveness and efficiency of our proposed method were successfully validated through a synthetic 2D subsurface transport model. We envision that the computational benefit of our proposed POD-aNN becomes very substantial in large-scale simulation-based engineering applications.

6.2. Comments on the Scalability of POD-aNN

In this study, a conceptual 2D subsurface transport model is used to demonstrate the performance, e.g., accuracy and efficiency, of our proposed POD-aNN surrogate model. It is worth investigating whether the proposed method can scale to higher grid resolutions and longer time horizons. Both the retained POD pattern coefficients and the PCA pattern coefficients are inputs of the POD-aNN surrogate model, and the complexity of the neural network model, e.g., the number of trainable parameters, depends on the size of these input vectors. More training samples are generally needed to train complex neural network models with satisfactory accuracy. Application models with higher grid resolutions and longer time horizons usually need more POD and PCA pattern coefficients, which increases the complexity of the neural network models. This limitation seemingly hinders further applications to large-scale models.
Here, we provide a possible option to scale our proposed method to practical models with higher grid resolutions than those in the experiments presented. Training the deep neural network requires numerous samples obtained from high-fidelity model simulations, especially for large-scale models. To reduce the computational burden, we can lessen the requirement for high-fidelity model simulations by training subdomain surrogates instead of one surrogate for the global domain. In our previous work [39,40,41], we proposed a subdomain-based reduced-order modeling algorithm for the estimation of space-varying parameter patterns in numerical reservoir models, where domain decomposition is applied to construct separate approximations of the numerical model in every subdomain. Motivated by this idea, we can combine subdomain-based model-order reduction with deep neural network surrogates and their parallel implementation. Such an approach has the advantage that the subdomain surrogate models can naturally be trained in parallel: whereas a global-domain deep learning surrogate is trained on a single GPU, the combination of domain decomposition and deep learning enables us to train multiple subdomain POD-aNN surrogate models individually, using a large number of CPUs and multiple GPUs in parallel. Increasing the number of subdomains enables us to train subdomain deep learning surrogate models of low complexity.

7. Conclusions

To address complex parameter estimation problems within a short timeframe, we combine reduced-order modeling and a deep-learning surrogate to emulate the model-reduced adjoint effectively and efficiently. In detail, a projection-based autoregressive neural network structure, referred to as POD-aNN, is proposed. The main innovation of POD-aNN is to approximate objective function gradients via our proposed model-reduced adjoint approach. This new approach does not require us to code an adjoint model and, thus, results in a very efficient implementation of gradient-based inverse modeling compared to the conventional finite-difference method. In terms of model sensitivity approximation, our proposed POD-aNN makes full use of the auto-differentiation module inside deep learning frameworks, e.g., PyTorch. As a result, both the forward simulation and the backward integration with the model-reduced adjoint incur negligible computational cost. Numerical experiments were conducted on a synthetic two-dimensional subsurface flow model. The results show that the POD-aNN approach achieves accurate parameter estimation while the computational cost remains low. The combination of a model-order reduction procedure and a deep-learning surrogate promotes simulation-based sensitivity analysis and optimization in various engineering applications, especially for large-scale and computation-intensive physical models. The proposed deep learning hybrid workflow offers an alternative avenue to rapidly achieve simulations and predictions comparable to those of the original high-fidelity physics-based models. Promising application communities include hydrology engineering, underground energy resources (e.g., geothermal recovery), and other geoscience domains.

Author Contributions

Conceptualization, C.X. and Z.L.; methodology, C.X.; software, T.L.; validation, C.X., L.Z. and Z.L.; formal analysis, C.X.; investigation, Z.L.; resources, C.X. and Z.L.; data curation, T.L. and L.Z.; writing—original draft preparation, C.X. and Z.L.; writing—review and editing, C.X. and Z.L.; visualization, C.X. and Z.L.; supervision, C.X. and Z.L.; project administration, C.X. and Z.L.; funding acquisition, C.X. and Z.L. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by Science Foundation of Key Laboratory of Marine Oil and Gas Reservoirs Production, SINOPEC (33550000-22-ZC0613-0308), Petroleum Exploration and Production Research Institute, Beijing 100083, China, Science Foundation of China University of Petroleum, Beijing (No. 2462021BJRC005) and National Natural Science Foundation of China (No. 52304055).

Data Availability Statement

The data used in this paper are not publicly available due to the limitations of consent for the original study but could be obtained upon reasonable request.

Acknowledgments

Computing resources are provided by Department of Petroleum Engineering at China University of Petroleum, Beijing. The use of open source codes MRST (https://www.sintef.no/projectweb/mrst) (accessed on 25 April 2023) is gratefully acknowledged.

Conflicts of Interest

Authors Cong Xiao and Lufeng Zhang were employed by the company SINOPEC. Author Ting Liu was employed by the company China Everbright Bank Co., Ltd. Author Zhun Li was employed by the company CNOOC Research Institute Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. The companies had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

References

1. Plessix, R.E. A review of the adjoint-state method for computing the gradient of a functional with geophysical applications. Geophys. J. Int. 2006, 167, 495–503.
2. Razavi, S.; Tolson, B.A.; Burn, D.H. Review of surrogate modeling in water resources. Water Resour. Res. 2012, 48.
3. Asher, M.J.; Croke, B.F.; Jakeman, A.J.; Peeters, L.J. A review of surrogate models and their application to groundwater modeling. Water Resour. Res. 2015, 51, 5957–5973.
4. Vermeulen, P.; Heemink, A. Model-reduced variational data assimilation. Mon. Weather Rev. 2006, 134, 2888–2899.
5. Altaf, M.U.; Heemink, A.W.; Verlaan, M. Inverse shallow-water flow modeling using model reduction. Int. J. Multiscale Comput. Eng. 2009, 7, 577–594.
6. Xiao, D.; Fang, F.; Buchan, A.G.; Pain, C.C.; Navon, I.M.; Du, J.; Hu, G. Non-linear model reduction for the Navier–Stokes equations using residual DEIM method. J. Comput. Phys. 2014, 263, 1–18.
7. Xiao, D.; Yang, P.; Fang, F.; Xiang, J.; Pain, C.C.; Navon, I.M. Non-intrusive reduced order modelling of fluid–structure interactions. Comput. Methods Appl. Mech. Eng. 2016, 303, 35–54.
8. Heijn, T.; Markovinovic, R.; Jansen, J.D. Generation of low-order reservoir models using system-theoretical concepts. In Proceedings of the SPE Reservoir Simulation Symposium, Houston, TX, USA, 3–5 February 2003.
9. Dai, C.; Xue, L.; Zhang, D.; Guadagnini, A. Data-worth analysis through probabilistic collocation-based Ensemble Kalman Filter. J. Hydrol. 2016, 540, 488–503.
10. Roy, P.T.; El Moçayd, N.; Ricci, S.; Jouhaud, J.C.; Goutal, N.; De Lozzo, M.; Rochoux, M.C. Comparison of polynomial chaos and Gaussian process surrogates for uncertainty quantification and correlation estimation of spatially distributed open-channel steady flows. Stoch. Environ. Res. Risk Assess. 2018, 32, 1723–1741.
11. Fang, F.; Zhang, T.; Pavlidis, D.; Pain, C.C.; Buchan, A.G.; Navon, I.M. Reduced order modelling of an unstructured mesh air pollution model and application in 2D/3D urban street canyons. Atmos. Environ. 2014, 96, 96–106.
12. Ahmadi, M.A. Developing a robust surrogate model of chemical flooding based on the artificial neural network for enhanced oil recovery implications. Math. Probl. Eng. 2015, 2015, 1–9.
13. Savchenko, A. Probabilistic Neural Network with Complex Exponential Activation Functions in Image Recognition Using Deep Learning Framework. IEEE Trans. Neural Netw. Learn. Syst. 2020, 31, 651–660.
14. Heo, Y.J.; Kim, S.J.; Kim, D.; Lee, K.; Chung, W.K. Super-High-Purity Seed Sorter Using Low-Latency Image-Recognition Based on Deep Learning. IEEE Robot. Autom. Lett. 2018, 3, 3035–3042.
15. Young, T.; Hazarika, D.; Poria, S.; Cambria, E. Recent Trends in Deep Learning Based Natural Language Processing. IEEE Comput. Intell. Mag. 2018, 13, 55–75.
16. He, X.; Gao, J.; Deng, L. Deep Learning for Natural Language Processing: Theory and Practice (Tutorial). In Proceedings of the ACM International Conference on Information and Knowledge Management, Shanghai, China, 3–7 November 2014.
17. Jin, Z.L.; Liu, Y.; Durlofsky, L.J. Deep-learning-based reduced-order modeling for subsurface flow simulation. arXiv 2019, arXiv:1906.03729.
18. Tang, M.; Liu, Y.; Durlofsky, L.J. A deep-learning-based surrogate model for data assimilation in dynamic subsurface flow problems. arXiv 2019, arXiv:1908.05823.
19. Abadi, M.; Agarwal, A.; Barham, P.; Brevdo, E.; Chen, Z.; Citro, C.; Corrado, G.S.; Davis, A.; Dean, J.; Devin, M.; et al. TensorFlow: Large-scale machine learning on heterogeneous distributed systems. arXiv 2016, arXiv:1603.04467.
20. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An imperative style, high-performance deep learning library. Adv. Neural Inf. Process. Syst. 2019, 32, 8024–8035.
21. Kani, J.N.; Elsheikh, A.H. Reduced order modeling of subsurface multiphase flow models using deep residual recurrent neural networks. Transp. Porous Media 2018, 126, 713–741.
22. Mo, S.; Zabaras, N.; Shi, X.; Wu, J. Deep autoregressive neural networks for high-dimensional inverse problems in groundwater contaminant source identification. Water Resour. Res. 2019, 55, 3856–3881.
23. Mo, S.; Zhu, Y.; Zabaras, N.; Shi, X.; Wu, J. Deep convolutional encoder-decoder networks for uncertainty quantification of dynamic multiphase flow in heterogeneous media. Water Resour. Res. 2019, 55, 703–728.
24. Xuan, Y.; Lyu, H.; An, W.; Liu, J.; Liu, X. A data-driven deep learning approach for predicting separation-induced transition of submarines. Phys. Fluids 2022, 34, 024101.
25. Bukka, S.R.; Gupta, R.; Magee, A.R.; Jaiman, R.K. Assessment of unsteady flow predictions using hybrid deep learning based reduced-order models. Phys. Fluids 2021, 33, 013601.
26. El Haber, G.; Viquerat, J.; Larcher, A.; Ryckelynck, D.; Alves, J.; Patil, A.; Hachem, E. Deep learning model to assist multiphysics conjugate problems. Phys. Fluids 2022, 34, 015131.
27. Li, Z.; Su, L.; Wen, F.; Zeng, J.; Wang, S.; Zhang, J. Deep learning method for fast prediction of film cooling performance. Phys. Fluids 2022, 34, 047111.
28. Liu, B.; Tang, J.; Huang, H.; Lu, X.Y. Deep learning methods for super-resolution reconstruction of turbulent flows. Phys. Fluids 2020, 32, 025105.
29. Cardoso, M.A.; Durlofsky, L.J.; Sarma, P. Development and application of reduced-order modeling procedures for subsurface flow simulation. Int. J. Numer. Methods Eng. 2009, 77, 1322–1350.
30. Kaleta, M.P.; Hanea, R.G.; Heemink, A.W.; Jansen, J.D. Model-reduced gradient-based history matching. Comput. Geosci. 2011, 15, 135–153.
31. Xiao, C.; Leeuwenburgh, O.; Lin, H.X.; Heemink, A. Non-intrusive subdomain POD-TPWL for reservoir history matching. Comput. Geosci. 2019, 23, 537–565.
32. Xiao, C.; Leeuwenburgh, O.; Lin, H.X.; Heemink, A. Efficient estimation of space varying parameters in numerical models using non-intrusive subdomain reduced order modeling. J. Comput. Phys. 2021, 424, 109867.
33. Chaturantabut, S.; Sorensen, D.C. Nonlinear Model Reduction via Discrete Empirical Interpolation. SIAM J. Sci. Comput. 2010, 32, 2737–2764.
34. Yang, Y.; Ghasemi, M.; Gildin, E.; Efendiev, Y.; Calo, V. Fast Multiscale Reservoir Simulations with POD-DEIM Model Reduction. SPE J. 2016, 21, 2141–2154.
35. Ghasemi, M.; Yang, Y.; Gildin, E.; Efendiev, Y.; Calo, V. Fast Multiscale Reservoir Simulations using POD-DEIM Model Reduction. In Proceedings of the SPE Reservoir Simulation Symposium, Houston, TX, USA, 23–25 February 2015.
36. Gildin, E.; Ghasemi, M.; Romanovskay, A.; Efendiev, Y. Nonlinear Complexity Reduction for Fast Simulation of Flow in Heterogeneous Porous Media. In Proceedings of the SPE Reservoir Simulation Symposium, The Woodlands, TX, USA, 18–20 February 2013.
37. Haltmeier, M.; Antholzer, S.; Schwab, J.; Nuster, R. Photoacoustic image reconstruction via deep learning. In Proceedings of the Photons Plus Ultrasound: Imaging and Sensing 2018, San Francisco, CA, USA, 27 January–1 February 2018.
38. Wu, Y.; Lin, Y. InversionNet: An Efficient and Accurate Data-Driven Full Waveform Inversion. IEEE Trans. Comput. Imaging 2020, 6, 419–433.
39. Xiao, C.; Deng, Y.; Wang, G. Deep-Learning-Based Adjoint State Method: Methodology and Preliminary Application to Inverse Modeling. Water Resour. Res. 2021, 57, e2020WR027400.
40. Xiao, C.; Leeuwenburgh, O.; Lin, H.X.; Heemink, A. Conditioning of deep-learning surrogate models to image data with application to reservoir characterization. Knowl.-Based Syst. 2021, 220, 106956.
41. Xiao, C.; Lin, H.X.; Leeuwenburgh, O.; Heemink, A. Surrogate-assisted inversion for large-scale history matching: Comparative study between projection-based reduced-order modeling and deep neural network. J. Pet. Sci. Eng. 2022, 208, 109287.
42. Mohan, A.T.; Gaitonde, D.V. A deep learning based approach to reduced order modeling for turbulent flow control using LSTM neural networks. arXiv 2018, arXiv:1804.09269.
43. Wu, P.; Sun, J.; Chang, X.; Zhang, W.; Arcucci, R.; Guo, Y.; Pain, C.C. Data-driven reduced order model with temporal convolutional neural network. Comput. Methods Appl. Mech. Eng. 2020, 360, 112766.
44. Cheung, S.W.; Chung, E.T.; Efendiev, Y.; Gildin, E.; Wang, Y.; Zhang, J. Deep global model reduction learning in porous media flow simulation. Comput. Geosci. 2020, 24, 261–274.
45. Wang, J.; He, C.; Li, R.; Chen, H.; Zhai, C.; Zhang, M. Flow field prediction of supercritical airfoils via variational autoencoder based deep learning framework. Phys. Fluids 2021, 33, 086108.
46. Xiao, C.; Zhang, S.; Ma, X.; Jin, J.; Zhou, T. Model-reduced adjoint-based inversion using deep-learning: Example of geological carbon sequestration modelling. Water Resour. Res. 2022, 58, e2021WR031041.
47. Xiao, C.; Zhang, S.; Ma, X. Projection-based autoregressive neural network for model-reduced adjoint-based variational data assimilation. In Proceedings of the 82nd EAGE Annual Conference & Exhibition, Amsterdam, The Netherlands, 18–21 October 2021; Volume 2021, pp. 1–5.
48. Peaceman, D.W. Fundamentals of Numerical Reservoir Simulation; Elsevier Scientific Publishing Company: Amsterdam, The Netherlands, 1977.
49. Dwight, R.P.; Brezillon, J. Effect of approximations of the discrete adjoint on gradient-based optimization. AIAA J. 2006, 44, 3022–3031.
50. Protas, B. Adjoint-based optimization of PDE systems with alternative gradients. J. Comput. Phys. 2008, 227, 6490–6510.
51. Xiao, C.; Leeuwenburgh, O.; Lin, H.X.; Heemink, A. Subdomain POD-TPWL with Local Parameterization for Large-Scale Reservoir History Matching Problems. arXiv 2019, arXiv:1901.08059.
52. Nocedal, J.; Wright, S.J. Numerical Optimization; Springer: Berlin/Heidelberg, Germany, 1999; pp. 29–76.
53. Fukunaga, K.; Koontz, W.L. Application of the Karhunen-Loeve expansion to feature selection and ordering. IEEE Trans. Comput. 1970, 100, 311–318.
54. He, K.; Zhang, X.; Ren, S.; Sun, J. Identity Mappings in Deep Residual Networks. In Computer Vision—ECCV 2016; Springer: Cham, Switzerland, 2016.
55. Lie, K.A.; Krogstad, S.; Ligaarden, I.S.; Natvig, J.R.; Nilsen, H.M.; Skaflestad, B. Open-source MATLAB implementation of consistent discretisations on complex grids. Comput. Geosci. 2012, 16, 297–322.
56. Deutsch, C.V.; Journel, A.G. GSLIB: Geostatistical Software Library and User’s Guide; Oxford University Press: New York, NY, USA, 1992.
Figure 1. Schematic illustration of the POD-aNN architecture, adapted from our previous work [47].
Figure 2. Diagram of the consecutive prediction of the model states at all time instances for an arbitrary parameter vector m using the trained POD-aNN surrogate model.
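The consecutive prediction in Figure 2 is essentially an autoregressive rollout: the trained network advances the POD-reduced state one step at a time, feeding each prediction back in as the next input. The following is a minimal PyTorch sketch of that loop; the names (`net`, `rollout`, the shapes) are illustrative assumptions, not the authors' code.

```python
# Minimal PyTorch sketch of the Figure 2 rollout. `net`, `z0`, `xi`, and
# `n_steps` are illustrative assumptions, not names from the paper's code.
import torch

def rollout(net, z0, xi, n_steps):
    """Advance the POD-reduced state autoregressively.

    net     : trained one-step surrogate mapping (z_t, xi) -> z_{t+1}
    z0      : initial reduced state, shape (N_psi,)
    xi      : reduced parameter vector (e.g., KL coefficients), shape (N_xi,)
    returns : reduced trajectory, shape (n_steps, N_psi)
    """
    z, states = z0, []
    for _ in range(n_steps):
        z = net(torch.cat([z, xi]))   # feed the prediction back in
        states.append(z)
    return torch.stack(states)

# Full-order saturation fields are then recovered with the POD basis Psi:
# S_t = Psi @ z_t, where Psi has shape (n_grid_cells, N_psi).
```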
Figure 3. Illustration of the spatially distributed log-permeability field. The 25 diamonds denote the measurement locations. The injection and production wells are located at the corners of the 2D model and are denoted by a circle and a triangle, respectively.
Figure 4. (a,b) Decay of the eigenspectrum of the SVD of the saturation snapshots. (c–f) The global basis vectors corresponding to the 1st, 10th, 50th, and 100th eigenvalues.
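The eigenspectrum and basis vectors in Figure 4 come from a singular value decomposition of the saturation snapshot matrix. A minimal NumPy sketch of this step is given below; the random snapshot array and the 99% energy threshold are illustrative assumptions.

```python
# Minimal NumPy sketch of the POD step behind Figure 4. The snapshot array
# and the 99% energy threshold are illustrative assumptions.
import numpy as np

snapshots = np.random.rand(500, 400)        # placeholder: (n_snapshots, n_cells)
X = (snapshots - snapshots.mean(axis=0)).T  # centered, one snapshot per column
U, s, _ = np.linalg.svd(X, full_matrices=False)

energy = np.cumsum(s**2) / np.sum(s**2)     # eigenspectrum decay (Fig. 4a,b)
n_psi = int(np.searchsorted(energy, 0.99)) + 1
Psi = U[:, :n_psi]                          # retained basis vectors (Fig. 4c-f)
print(f"Retaining {n_psi} of {len(s)} modes for 99% snapshot energy")
```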
Figure 5. Evolution of loss functions with respect to the number of epochs. (a) Learning rate (lr) selection; (b) batch size (bs) selection.
Figure 6. Comparison of the field-average relative errors γ_SOE1 and γ_SOE2 of the trained network with respect to three data collection strategies.
Figure 7. Comparison of the field-average relative errors γ_SOE1 and γ_SOE2 of the trained POD-aNN surrogate model for N_s = 400, 600, 800, and 1000 training samples.
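The exact definitions of γ_SOE1 and γ_SOE2 are given in the main text; as a rough guide, a field-average relative error of this kind could be computed as in the following hedged sketch (one plausible form, not necessarily the paper's definition).

```python
# Hedged sketch of a field-average relative error between surrogate and
# high-fidelity (HFM) saturations; one plausible form of the gamma metrics
# in Figures 6 and 7, not necessarily the paper's exact definition.
import numpy as np

def field_average_relative_error(s_pred, s_true, eps=1e-8):
    """Mean of |S_pred - S_true| / (|S_true| + eps) over cells, steps, samples.

    s_pred, s_true : arrays of shape (n_test, n_steps, n_cells)
    """
    return np.mean(np.abs(s_pred - s_true) / (np.abs(s_true) + eps))
```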
Figure 8. Illustration of the time-series fluid saturation of a random test sample predicted by POD-aNN and the HFM. The POD-aNN surrogate model is trained using 1000 training samples with two different numbers of POD modes for the phase saturation: (a) N_ψ = 50; (b) N_ψ = 100.
Figure 9. Illustration of the time-series fluid saturation predicted by POD-aNN and the HFM at three locations. The POD-aNN surrogate models are trained with 1000 samples using N_ψ = 50 and 100 POD modes, respectively. The dashed and solid lines represent the predictions of POD-aNN and the HFM, respectively. The red, blue, and green lines correspond to the locations I1, D, and P1 shown in Figure 3.
Figure 10. Comparison of the evolution of the logarithmic objective function and the updated geological permeability fields from our proposed POD-aNN surrogate model. (a) The iterative changes of the logarithmic objective functions. The estimated geological permeability fields from (b) the FD method and the POD-aNN surrogate models trained with (c) 400, (d) 600, (e) 800, and (f) 1000 samples, respectively.
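The surrogate-based estimates in Figure 10 rely on the fact that, once the POD-aNN is trained, the gradient of the data-misfit objective with respect to the parameters is available through the deep-learning framework's auto-differentiation, replacing the finite-difference (FD) gradient. A minimal sketch is given below, assuming a linear observation operator H and reusing the `rollout` sketch after Figure 2; all names are illustrative.

```python
# Minimal sketch of the surrogate-derived gradient used in place of the FD
# gradient in Figure 10. H (a linear observation operator), d_obs, and all
# names are illustrative; `rollout` is the sketch given after Figure 2.
import torch

def misfit_and_gradient(net, z0, xi, d_obs, H, n_steps):
    """Return the least-squares data misfit J and dJ/dxi via autodiff."""
    xi = xi.clone().requires_grad_(True)
    traj = rollout(net, z0, xi, n_steps)          # (n_steps, N_psi)
    J = 0.5 * torch.sum((traj @ H.T - d_obs) ** 2)
    J.backward()                                  # reverse-mode pass = adjoint
    return J.detach(), xi.grad

# xi can then be updated with any gradient-based optimizer, e.g., L-BFGS [52].
```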
Figure 11. Illustration of the time-series fluid saturation distributions predicted by the POD-aNN surrogate model and the FD method. (a) FD; (b) N_s = 600; (c) N_s = 800. The number of POD modes is N_ψ = 100.
Figure 12. Comparison of the evolution of the logarithmic objective function and the updated geological permeability fields from our proposed POD-aNN surrogate model. (a) The iterative changes of the logarithmic objective functions. The estimated geological permeability fields from (b) N_ψ = 10, (c) N_ψ = 50, and (d) N_ψ = 100 POD modes, respectively.
Figure 13. Comparison of the evolution of the logarithmic objective function and the updated geological permeability fields from our proposed POD-aNN surrogate model for Scenario 2. (a) The iterative changes of the logarithmic objective functions for 400 and 1000 training samples, respectively. (b,c) The true and initial permeability fields, respectively. The estimated geological permeability fields from (d) the FD method and (e,f) the POD-aNN surrogate models, respectively.
Figure 14. Time-series fluid saturation distributions predicted by the POD-aNN surrogate model for Scenario 2. The POD-aNN surrogate models are trained with 400 and 1000 training samples. (a) FD; (b) N_s = 400; (c) N_s = 1000.
Figure 15. Comparison of the evolution of the logarithmic objective function and the geological permeability fields from our proposed POD-aNN surrogate model for Scenario 2. (a) The iterative changes of the logarithmic objective functions with respect to the number of POD modes. The estimated geological permeability fields from (b) N_ψ = 10, (c) N_ψ = 50, and (d) N_ψ = 100 POD modes, respectively.
Figure 16. Time-series fluid saturation distributions predicted by the POD-aNN surrogate model for Scenario 2. The POD-aNN surrogate models are trained by preserving 10, 50, and 100 POD modes with N_s = 1000 training samples. (a) N_ψ = 10; (b) N_ψ = 50; (c) N_ψ = 100.
Figure 17. Illustration of the average and standard deviation (STD) of the ensemble log-permeability fields. (a,b) Parameter errors e_m of the updated geological permeability for Scenario 1 and Scenario 2, respectively. The remaining subfigures show the average and STD of the ensemble log-permeability fields before and after the inverse modeling.
Table 1. Experimental model properties, including the model size, fluid properties, underground formation pressure, initial fluid saturation and well pumping schedules, and hyperparameters for training POD-aNN.

Description of Parameters                  Values
Grid size, N_x × N_y                       20 × 20
Fluid density                              1014 kg/m³, 859 kg/m³
Fluid viscosity                            0.4 mPa·s, 2 mPa·s
Initial oil, water saturation              S_o = 0.80, S_w = 0.20
Initial formation pressure                 30 MPa
Bottom-hole pressure for pumping wells     25 MPa
Pumping rate for injection well            150 m³/day
Pumping time                               1800 days

Hyperparameters
Training samples, N_s                      400, 600, 800, 1000
Testing samples, N_test                    200
Learning rate                              0.0003
Optimizer                                  Adam
Batch size                                 200
Number of epochs                           300
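For concreteness, the Table 1 hyperparameters translate into a standard supervised training loop such as the sketch below; the placeholder tensors and the stand-in `model` are illustrative assumptions (sizes follow Table 2: 140 inputs, N_ψ = 100 outputs).

```python
# Minimal training-loop sketch wiring up the Table 1 hyperparameters
# (Adam, learning rate 0.0003, batch size 200, 300 epochs). The placeholder
# tensors and `model` are illustrative; sizes follow Table 2.
import torch
from torch.utils.data import DataLoader, TensorDataset

inputs = torch.randn(1000, 140)    # concatenated (z_t, xi) training pairs
targets = torch.randn(1000, 100)   # next-step reduced states, N_psi = 100
loader = DataLoader(TensorDataset(inputs, targets), batch_size=200, shuffle=True)

model = torch.nn.Sequential(       # stand-in for the POD-aNN of Table 2
    torch.nn.Linear(140, 240), torch.nn.ReLU(), torch.nn.Linear(240, 100))
optimizer = torch.optim.Adam(model.parameters(), lr=3e-4)
loss_fn = torch.nn.MSELoss()

for epoch in range(300):
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```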
Table 2. Illustration of the POD-aNN architecture in this case study.

Layer        Input Size         Output Size
Input        (N_ξ + N_ψ, 1)     (140, 2)
ResBlock1    (140, 2)           (140, 2)
ResBlock2    (140, 2)           (130, 2)
ResBlock1    (130, 2)           (130, 2)
ResBlock2    (130, 2)           (120, 2)
ResBlock1    (120, 2)           (120, 2)
ResBlock2    (120, 2)           (240, 1)
FC           (240, 1)           (N_ψ, 1)
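The layer sizes in Table 2 admit more than one reading (e.g., whether the two channels are convolutional feature maps or paired vectors). The sketch below is one plausible fully connected interpretation with identity-mapping residual blocks [54], alternating size-preserving ResBlock1 units with dimension-reducing stages; it is a hedged illustration, not the authors' implementation.

```python
# One plausible fully connected reading of Table 2, with identity-mapping
# residual blocks [54]; the reducing ResBlock2 stage is approximated by a
# plain linear layer since the table does not specify how its skip path
# changes size. Hedged sketch, not the authors' code.
import torch
import torch.nn as nn

class ResBlock1(nn.Module):
    """Size-preserving residual block: x -> relu(x + MLP(x))."""
    def __init__(self, dim):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(),
                                 nn.Linear(dim, dim))
    def forward(self, x):
        return torch.relu(x + self.mlp(x))

class PODaNN(nn.Module):
    """140 -> 130 -> 120 -> 240 -> N_psi, mirroring the Table 2 sizes."""
    def __init__(self, n_in=140, n_psi=100):
        super().__init__()
        self.body = nn.Sequential(
            ResBlock1(n_in), nn.Linear(n_in, 130), nn.ReLU(),  # reducing stage
            ResBlock1(130),  nn.Linear(130, 120), nn.ReLU(),
            ResBlock1(120),  nn.Linear(120, 240), nn.ReLU(),
            nn.Linear(240, n_psi),                             # final FC layer
        )
    def forward(self, x):
        return self.body(x)

# Sanity check: a batch of concatenated (z_t, xi) vectors of length 140.
out = PODaNN()(torch.randn(8, 140))
assert out.shape == (8, 100)
```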
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
