The Parameterized Oceanic Front-Guided PIX2PIX Model: A Limited Data-Driven Approach to Oceanic Front Sound Speed Reconstruction

Xu, Weishuai; Zhang, Lei; Ma, Xiaodong; Li, Ming; Yao, Zhongshan

doi:10.3390/jmse12111918


by Weishuai Xu ¹,†, Lei Zhang ¹,*, Xiaodong Ma ¹,†, Ming Li ² and Zhongshan Yao ¹

¹ Department of Military Oceanography and Hydrography and Cartography, Dalian Naval Academy, Dalian 116018, China

² College of Advanced Interdisciplinary Studies, National University of Defense Technology, Nanjing 211101, China

* Author to whom correspondence should be addressed.

† These authors contributed equally to this work.

J. Mar. Sci. Eng. 2024, 12(11), 1918; https://doi.org/10.3390/jmse12111918

Submission received: 8 September 2024 / Revised: 19 October 2024 / Accepted: 25 October 2024 / Published: 27 October 2024

(This article belongs to the Special Issue Advances in Underwater Acoustic Communication and Ocean Sensor Networks)


Abstract

In response to the demand for high-precision acoustic support under the condition of limited data, this study utilized high-resolution reanalysis data and in situ observation data to extract the Kuroshio Extension Front (KEF) section through front-line identification methods. By combining the parameterized oceanic front model and the statistical features of big data, the parameterized oceanic front was reconstructed. A proxy dataset was generated using the Latin hypercube sampling method, and the sound speed reconstruction model based on the PIX2PIX model was trained and validated using single sound speed profiles at different positions of the oceanic front, combined with the parameterized oceanic front model. The experimental results show that the proposed sound speed reconstruction model can significantly improve the reconstruction accuracy by introducing the parameterized front model as an additional input, especially in the shallow-water area. The mean absolute error (MAE) of the full-depth sound speed reconstruction for this model is 0.63~0.95 m·s⁻¹, and the structural similarity index (SSIM) is 0.76~0.78. The MAE of the sound speed section within a 1000 m depth is reduced by 6.50~37.62%, reaching 1.95~3.31 m·s⁻¹. In addition, the acoustic support capabilities and generalization of the model were verified through ray tracing models and in situ data. This study contributes to advancing high-precision acoustic support in data-limited oceanic environments, laying a solid groundwork for future innovations in marine acoustics.

Keywords: Kuroshio extension front; PIX2PIX; parameterized oceanic front model; sound speed reconstruction

1. Introduction

Under complex oceanographic conditions, the acoustic field is characterized by temporal and spatial randomness and parameter, environmental, and channel uncertainty due to the influence of oceanic interfaces, water media, and dynamic features such as fronts, eddies, and currents [1]. Owing to the horizontal non-uniformity in the hydroacoustic sound speed profile, the presence of oceanic fronts significantly affects the propagation paths and speeds of sound waves in the ocean [2]. This directly impacts both the sound speed structure [3,4] and propagation loss [5,6]. These effects are particularly important in fields such as ocean environmental monitoring [7], military oceanography [8], and underwater acoustic communication [9].

In the context of oceanic fronts, the intense variations in temperature and salinity can cause significant changes in the sound speed profile within very short horizontal and vertical distances. This makes it difficult to characterize the sound speed field accurately during underwater acoustic support. Previous scholars have mainly adopted two methods to address the issue of underwater acoustic support under the condition of limited oceanographic data. The first involves directly predicting the underwater acoustic propagation. In this regard, Mallik and Jaiman [10] proposed a convolutional recurrent autoencoder network architecture that could learn the transmission losses of acoustic signals generated by the geometric diffusion, refraction, and reflection of a two-dimensional ocean surface and seabed. McCarthy and Sarkar [11] used a decision tree model to predict the propagation loss based on Bellhop and verified the model using field data collected by AUVs, achieving good results. McCarthy and Merrifield [12] calculated the propagation loss in different environments off the coast of Southern California using Bellhop and constructed a predictive model using decision trees, verifying the effectiveness of the method. Lee-Leon and Yuen [13] built a receiving system based on deep belief networks, which showed better performance in channels affected by Doppler effects and multipath propagation, as demonstrated through simulation modeling and sea trials. Lee and Johnson [14] proposed a supervised learning-based method to quantify the uncertainty in the propagation loss and evaluated the method’s ability to simulate long-distance underwater propagation under different computational models, environmental scenarios, and sources and degrees of uncertainty.

The second method involves using limited oceanographic data to perform the real-time inversion of the sound speed structure based on empirical models or machine learning methods, aiming to provide underwater acoustic support [15,16]. For example, Khan, Song [17] and Liu, and Chen [18] reconstructed the sound speed based on parameterized models of mesoscale eddies and oceanic fronts, respectively. Chen, Ma [19] and Liu, and Chen [20] obtained preliminary reconstruction results for the global sound speed profile by optimizing empirical orthogonal functions using only partial prior information about the underwater sound speed. Zhao and Wang [21] successfully reconstructed the three-dimensional ocean sound field using the Tucker decomposition algorithm and denoising autoencoder, achieving 38% higher reconstruction accuracy compared to traditional methods such as kriging interpolation and empirical orthogonal functions when the in situ data were sparse. Ma and Zhang [22] constructed a generative adversarial network (GAN) model combined with a Gaussian eddy model by using reanalysis and satellite data for sound speed reconstruction in mesoscale eddy environments, achieving an RMSE of 1.7 m·s⁻¹ and a structural similarity index measure (SSIM) of 0.77.

In the literature, the reconstruction of sound speed in complex marine environments has received widespread attention. These studies have mainly focused on using various algorithms and techniques to improve the accuracy and reliability of sound speed estimation in order to address the challenges posed by complex marine environments. However, most of these studies are based on traditional sound speed estimation methods and pay less attention to sound speed reconstruction for specific marine phenomena, such as oceanic fronts. An oceanic front is an important dynamic phenomenon in the ocean, characterized by sharp changes in physical quantities such as temperature and salinity [23,24]. These changes have a significant impact on sound wave propagation [6,25,26], making the reconstruction of the sound speed in oceanic front regions particularly complex.

Therefore, research on sound speed reconstruction for oceanic fronts has important theoretical and practical significance. In this case, unlike traditional sound speed reconstruction in complex marine environments, it is necessary to consider the unique physical characteristics of oceanic fronts and the mechanisms of sound wave propagation. Methods based on empirical models can describe certain features of oceanic fronts to some extent, but these methods are often constrained by limitations in experience and data, making it difficult to fully reflect the complexity of oceanic fronts. To overcome these limitations, this work proposes a GAN-based sound speed reconstruction model for oceanic fronts that is guided by a parameterized oceanic front model, taking the Kuroshio Extension Front (KEF), one of the strongest oceanic fronts in the world, as the research object. The remainder of this paper is structured as follows: Section 2 introduces the data and model construction methods, Section 3 and Section 4 describe the model training and validation processes, and Section 5 summarizes the research and proposes future directions.

2. Data and Methods

To address the demand for high-precision underwater acoustic section reconstruction, this study constructed a sound speed reconstruction model in the oceanic front environment based on the PIX2PIX model, a variant of the GAN. The main process included the following steps.

(1): Data preparation: Based on the front-line extraction method, a large number of sections with typical frontal features were obtained, and the KEF sections were evenly divided into three groups to study the sound speed reconstruction effect when the single input profile occupied different positions. The input data used in this study included the sea surface sound speed and a single sound speed profile in the section. When constructing the model, two main input situations were considered: in one, we input only the sea surface and single-profile sound speed; in the other, we superimposed the contour lines of the parameterized oceanic front reconstruction section for input, with the output being the section’s sound speed.
(2): Model construction: The PIX2PIX model included two main parts, namely the generator model (U-Net structure) and the discriminator model (PatchGAN structure). First, the real samples and generator samples were input into the discriminator, which determined whether the input image was real or fake and updated the parameters of the generator and discriminator. This process was repeated until the quality of the samples generated by the generator reached the expected level.
(3): Effect evaluation: By comparing the samples generated by the generator with the real samples, the performance of the sound speed reconstruction model based on the PIX2PIX model was evaluated using evaluation indicators and visual assessment methods.

Finally, the oceanic front model constructed using reanalysis data was verified using in situ data to evaluate the sound speed reconstruction effect. Moreover, the water depth sound speed profile generated by the generator was compared with the real section through an underwater acoustic simulation, aiming to evaluate its underwater acoustic propagation support capabilities. The research process of this study is shown in Figure 1.

2.1. Data

The high-resolution reanalysis data used in this study consist of two types: Japan Coastal Ocean Predictability Experiment 2 Modified (JCOPE2M), from the Japan Agency for Marine-Earth Science and Technology, and China Ocean Real-Time Analysis 1.0 (CORTA 1.0), provided by the National Marine Data and Information Service. The JCOPE2M reanalysis data are based on the Princeton ocean assimilation model, covering the Northwest Pacific with a temporal resolution of 1 day and a horizontal resolution of 1/12°, and they are vertically divided into 46 layers [27,28]. This dataset assimilates high-resolution satellite sea surface temperature data and various observational data, has a high resolution and accuracy, and is widely used in research on mesoscale phenomena and flow fields [29,30,31]. The time range of the data used in this study is January 1993 to December 2022, with a spatial range of 136~166° E, 30~44° N. Based on this dataset, a large number of KEF sections are extracted for the construction and training of the sound speed reconstruction model in the oceanic front environment.

To validate the confidence of the model and the possibility of substituting other data, the CORTA data from the National Marine Data and Information Service are used for verification. This dataset is reconstructed by the Key Laboratory of Marine Environment Information Assurance Technology of the State Oceanic Administration using the sea surface temperature and sea level height data from the National Satellite Ocean Application Center and international public data. Moreover, the reconstructed field is assimilated and corrected using real-time/quasi-real-time data obtained and monitored through international and central observations. The coverage of the product is 99~150° E, 10~52° N, with a horizontal grid resolution of 1/8° and 51 vertical standard depth layers. As the coverage of currently available public data for this dataset begins in 2019, they are mainly used as verification data in this study.

The in situ data used for model cross-validation are derived from the Kuroshio Extension System Study (KESS) and regular oceanographic survey data from the Japan Meteorological Agency. The KESS project, funded by the National Science Foundation of the United States, is a large-scale observation research project on the Kuroshio Extension involving the University of Rhode Island, the University of Hawaii, and the Woods Hole Oceanographic Institution. The goal of the KESS is to determine and quantify the dynamic and thermodynamic processes controlling the changes and interactions between the Kuroshio Extension and recirculating eddy systems [32]. The CTD data obtained from marine in situ observations by the Research Vessel Melville from June to July 2006, during the KESS project, are used, utilizing 4 sets of continuous sections with significant KEF features from the dataset.

The Japan Meteorological Agency is committed to monitoring the content of carbon dioxide in seawater and the atmosphere, aiming to improve the accuracy of global climate warming predictions. At the same time, the agency is also engaged in the in-depth study of the relationship between long-term changes in the ocean and climate change. To this end, the Japan Meteorological Agency has established oceanographic observation lines in the Northwest Pacific and its surrounding waters and arranged for ships to conduct regular oceanographic observation missions. The main observation lines are shown in Figure 2a, while the survey lines utilized in this study are depicted in Figure 2b.

2.2. Methodology

2.2.1. Oceanic Front Extraction Method

This study first calculates the sound speed from the temperature and salinity data using the Mackenzie empirical formula. It then determines and optimizes the front line based on the sound speed contour lines and the horizontal sound speed gradient, using an adaptive parameter adjustment method for contour line selection that targets the disturbances from mesoscale eddies and turbulence commonly present in extracted front lines. First, the position of the maximum horizontal sound speed gradient near the KE bend ridge line at 144° E is used as the search center, and the search range is set to the sound speed at the search center ±5 m·s⁻¹. The continuous contour line with the highest C_OFFD is selected as the front line, where C_OFFD is defined as follows:

C_{OFFD} = \frac{\sum_{1}^{n} C_{Grad}}{N}

(1)

C_{Grad} = \sqrt{(\partial ϕ_{U} / \partial x)^{2} + (\partial ϕ_{V} / \partial y)^{2}}

(2)

where n is the number of effective grid points with the highest temperature gradient in the front zone of the study area (33~39° N, 142~162° E), and N is the total number of traverse lines. C_Grad is the absolute gradient calculated at the grid point position [23,34,35,36]. ∂ϕ_U and ∂ϕ_V represent the differences in the study variables (temperature, salinity, sound speed, etc.) in the latitudinal or longitudinal direction, ∂x represents the latitudinal distance, and ∂y represents the longitudinal distance.
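As a rough illustration of this step, the sketch below computes the sound speed from temperature, salinity, and depth and then evaluates Equations (1) and (2) on a regular grid. The coefficients are the commonly quoted nine-term Mackenzie (1981) values and should be checked against the original reference; the function names, the central-difference scheme, and the uniform grid spacing in km are assumptions made for this example, not details taken from the study.

```python
import math


def mackenzie_sound_speed(temp_c, salinity_psu, depth_m):
    """Nine-term Mackenzie (1981) equation; returns sound speed in m/s."""
    t, s, d = temp_c, salinity_psu - 35.0, depth_m
    return (1448.96 + 4.591 * t - 5.304e-2 * t ** 2 + 2.374e-4 * t ** 3
            + 1.340 * s + 1.630e-2 * d + 1.675e-7 * d ** 2
            - 1.025e-2 * t * s - 7.139e-13 * t * d ** 3)


def gradient_magnitude(field, dx_km, dy_km):
    """Equation (2) at interior grid points using central differences.

    field[j][i] is the variable at row j (y direction) and column i
    (x direction). Boundary points are left as None (assumed handling).
    """
    ny, nx = len(field), len(field[0])
    out = [[None] * nx for _ in range(ny)]
    for j in range(1, ny - 1):
        for i in range(1, nx - 1):
            d_dx = (field[j][i + 1] - field[j][i - 1]) / (2.0 * dx_km)
            d_dy = (field[j + 1][i] - field[j - 1][i]) / (2.0 * dy_km)
            out[j][i] = math.hypot(d_dx, d_dy)
    return out


def c_offd(gradients_on_contour, n_traverse_lines):
    """Equation (1): summed gradient along one candidate contour over N."""
    return sum(gradients_on_contour) / n_traverse_lines
```

Under these assumptions, the contour with the largest `c_offd` value among the candidates in the ±5 m·s⁻¹ search range would be kept as the front line.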

This study uses C_OFFD as the basis for the evaluation of the front line, extracting a unique contour line to represent the position of the KEF. However, the sound speed distribution of the KEF is often affected by turbulence and detached mesoscale eddies, rendering the extracted contour line discontinuous. Seo and Sugimoto [37] performed manual checking to suppress the discontinuities and abnormal protrusions caused by small- and medium-scale disturbances. This study finds that the curvature, gradient, and turning angle parameters used to describe the degree of curve bending can be applied to identify the positions of discontinuous front lines, as shown in Figure 3.

Based on this, an abnormal protrusion detection and smoothing method is proposed to optimize the front line identified using the OFFD method. First, a 0.1° sliding window is established to traverse the front line longitudinally, and the left endpoint of the abnormal protrusion is determined based on the position of the maximum curvature and the phase transition of the gradient. Then, starting from the left endpoint, the window slides to the right along the contour line to detect the first minimum point of L/l as the right endpoint of the abnormal protrusion (L is the length of the contour line between the two points; l is the straight-line distance between the two points). Finally, cubic spline interpolation is performed using the same step size as in the original data to smooth out the abnormal protrusion. The optimization result for the front line is shown in Figure 4b. After the abnormal protrusions are detected and smoothed, the front line contains far fewer unnecessary protrusions formed by small-scale interference, remains continuous, and fits the positions of the high temperature gradients in the frontal zone well. This method is also applicable to oceanic fronts formed by different elements, such as temperature fronts and salinity fronts, by selecting appropriate search intervals and contour line intervals.
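The right-endpoint search can be sketched as follows. This is a minimal illustration that assumes the contour is a list of planar (x, y) points and that the left endpoint index is already known; the curvature and gradient tests that locate the left endpoint, and the spline smoothing, are not reproduced. The function names are hypothetical.

```python
import math


def path_length(points):
    """Length of the polyline through the given (x, y) points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


def length_ratio(points, start, end):
    """L / l for the contour segment points[start..end]."""
    along = path_length(points[start:end + 1])
    straight = math.dist(points[start], points[end])
    return along / straight if straight > 0 else math.inf


def right_endpoint(points, left):
    """Index of the first local minimum of L / l to the right of `left`."""
    ratios = [length_ratio(points, left, k)
              for k in range(left + 1, len(points))]
    for k in range(1, len(ratios) - 1):
        if ratios[k] < ratios[k - 1] and ratios[k] <= ratios[k + 1]:
            return left + 1 + k  # map ratio index back to a point index
    return None  # no interior minimum found


# Toy contour with a box-shaped protrusion between x = 1 and x = 2.
front = [(0, 0), (1, 0), (1, 1), (2, 1), (2, 0), (3, 0), (3, -1)]
end_index = right_endpoint(front, left=1)
```

A ratio of 1 means the segment is straight; larger values indicate a detour, which is why a minimum of L/l is a plausible marker for the point where the contour has rejoined the main front.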

Next, the KEF line is identified based on the sound speed at the 300 m layer in the study area (144~154° E, 32~40° N), and sections with a horizontal sound speed gradient greater than 0.1 m·s⁻¹·km⁻¹ at the front line position are retained as research objects. After screening, this study extracts a total of 651,316 KEF sections, as shown in Figure 4d. Each section is linearly interpolated to a horizontal resolution of 1 km and Akima-interpolated to a vertical resolution of 1 m.

2.2.2. Parameterized Oceanic Front Model

Accurately reconstructing the ocean sound speed field is crucial for various marine acoustic applications, but the sparsity and uncertainty of sound speed samples in the vast ocean area make this task challenging [38]. The presence of oceanic fronts causes greater deviations in the sound speed below the sea surface, increasing the difficulty of this task [39]. Aiming to fully understand the sound speed structural characteristics in the oceanic front environment, and based on the research findings of previous scholars, this study employs a parameterized two-dimensional-feature oceanic front reconstruction model to provide a correction scheme for sound field reconstruction in environments containing oceanic fronts.

First, based on the formation mechanism of the oceanic front and incorporating historical statistical data, an ideal large-scale oceanic front sound speed model is established [40] by fitting the characteristic equation of the sound speed profile:

C(r, z) = [C_{2}(z) - C_{1}(z)] Φ(r, z) + C_{1}(z)

(3)

Φ(r, z) = \frac{1}{2} + \frac{1}{2} \tanh\left[2π \left(\frac{r}{R}\right)^{10^{a}} - π\right]

(4)

where r and z are the horizontal and vertical coordinates of the oceanic front section, R represents the horizontal span of the oceanic front section (100 km), C_1(z) and C_2(z) represent the sound speed profiles on the two sides of the oceanic front, and Φ(r, z) is the normalized sound speed profile in the vertical direction. The parameter a has a range of −1.5~1.5 with a step size of 0.01, representing different positions in the frontal zone. For each candidate value of a, the maximum horizontal sound speed gradient at a 300 m depth in the reconstructed KEF section is calculated, and the a value with the smallest absolute error between the reconstructed KEF intensity and the predicted intensity is taken as the optimal parameter of the two-dimensional parameterized oceanic front model. The statistics of the optimal a value for the 600,000 KEF sections considered in this paper are shown in Figure 5b, and a is taken to be −0.2 when KEF reconstruction is performed.
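To make Equations (3) and (4) concrete, the sketch below evaluates the transition function and the blended sound speed at a single depth, and scans a over −1.5~1.5 in steps of 0.01 to match a target front strength. The 1 km evaluation step, the finite-difference gradient, and all function names are assumptions for illustration; the study's own fitting procedure may differ in detail.

```python
import math

R_KM = 100.0  # horizontal span of the section, as stated in the text


def phi(r_km, a, span_km=R_KM):
    """Equation (4): tanh transition, close to 0 at r = 0 and 1 at r = R."""
    x = (r_km / span_km) ** (10.0 ** a)
    return 0.5 + 0.5 * math.tanh(2.0 * math.pi * x - math.pi)


def front_sound_speed(r_km, c1, c2, a):
    """Equation (3) at one depth: blend the side values c1 and c2 (m/s)."""
    return (c2 - c1) * phi(r_km, a) + c1


def max_horizontal_gradient(c1, c2, a, step_km=1.0):
    """Largest |dC/dr| in m/s per km across the section at one depth."""
    n = int(R_KM / step_km)
    speeds = [front_sound_speed(i * step_km, c1, c2, a) for i in range(n + 1)]
    return max(abs(q - p) / step_km for p, q in zip(speeds, speeds[1:]))


def best_a(target_gradient, c1, c2):
    """Scan a in [-1.5, 1.5] with step 0.01; keep the closest front strength."""
    candidates = [round(-1.5 + 0.01 * k, 2) for k in range(301)]
    return min(candidates,
               key=lambda a: abs(max_horizontal_gradient(c1, c2, a)
                                 - target_gradient))
```

With a = −0.2 the exponent 10^a is about 0.63, which shifts the steepest part of the transition toward the r = 0 side of the section relative to the symmetric case a = 0.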

In addition, because this study considers a single-profile input, the sound speed characteristics on both sides of the oceanic front are statistically analyzed using more than 600,000 sections. The fitting curve described in Table 1 is then used to calculate the sound speed characteristics on both sides of the oceanic front. The phenomenon of oceanic fronts causes drastic changes in the underwater acoustic environment, such as in the depth of the sound channel, the thickness of the sound layer, and the sea surface sound speed on both sides of the oceanic front [41]. Combining the typical stratification structure of the seawater sound speed and ray acoustic theory, the sound speed structural characteristics on both sides of the oceanic front section are extracted according to the sound speed characteristics of the mixed layer, thermocline, and deep-sea isothermal layer, respectively. These include (1) the sea surface sound speed (SSS) in m·s⁻¹; (2) the sonic layer depth (SLD) in m; (3) the bottom sonic layer speed (BSLS) in m·s⁻¹; (4) the transition layer of the sound speed (TLSS) in m·s⁻¹·m⁻¹; (5) the sound channel axis depth (SCAD) in m; (6) the sound channel axis speed (SCAS) in m·s⁻¹; (7) the conjugate depth (CD) in m; (8) the conjugate depth speed (CDS) in m·s⁻¹; and (9) the depth excess (DE) in m. Among them, the depth excess is the difference between the sampling point’s conjugate depth and the ETOPO 2022 water depth.

We use Akima interpolation to obtain the difference in the profile lines ΔC(z) on both sides of the front and then calculate C_1(z) and C_2(z) through the single-profile sound speed C_0(z), as shown in Equation (5).

C_{1}(z) = \begin{cases} C_{0}(z), & \text{South Side} \\ C_{0}(z) + ΔC(z)/2, & \text{Center} \\ C_{0}(z) + ΔC(z), & \text{North Side} \end{cases} \qquad C_{2}(z) = \begin{cases} C_{0}(z) - ΔC(z), & \text{South Side} \\ C_{0}(z) - ΔC(z)/2, & \text{Center} \\ C_{0}(z), & \text{North Side} \end{cases}

(5)
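Equation (5) can be written compactly by noting that, in all three cases, C_1(z) − C_2(z) = ΔC(z) and only the share of ΔC(z) added to or subtracted from C_0(z) changes with the input position. The sketch below assumes both profiles are sampled on a common depth grid; the function name and the position labels are illustrative.

```python
def side_profiles(c0, delta_c, position):
    """Equation (5): build C1(z) and C2(z) from one profile and ΔC(z).

    c0, delta_c: lists of sound speeds (m/s) on a common depth grid.
    position: "south", "center", or "north" (where the input profile sits).
    """
    # Fraction of ΔC added to C0 to obtain C1; the remainder is removed for C2.
    shift = {"south": 0.0, "center": 0.5, "north": 1.0}[position]
    c1 = [c + shift * d for c, d in zip(c0, delta_c)]
    c2 = [c - (1.0 - shift) * d for c, d in zip(c0, delta_c)]
    return c1, c2
```

For example, a profile taken at the center is shifted by half of ΔC(z) in each direction, so the reconstructed section is symmetric about the measured profile.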

2.2.3. Principles of PIX2PIX Model

After reviewing the relevant literature, this study selects a deep learning image-to-image translation model, the GAN-based PIX2PIX model, for sound speed reconstruction. The GAN model is inspired by the zero-sum game in game theory, where the generator network G and the discriminator network D play a two-player minimax game so that both models are optimized simultaneously. In the classic GAN model, the generator G increases the similarity between the generated samples and the real samples, driving the distribution of the generated samples p_g toward the real sample distribution p_d. The goal of the discriminator D is to distinguish between the real and generated samples. The loss function L_GAN of the GAN model is expressed as

L_{GAN}(f_{G}, f_{D}) = \min_{G} \max_{D} E[\ln f_{D}(x)] + E[\ln(1 - f_{D}(f_{G}(z)))]

(6)

where x denotes the image data; z is random noise; E is the expectation value; f_D is the output of the discriminator D; and f_G is the output of the generator G.
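As a small numerical illustration of Equation (6), the sketch below estimates the value function from batches of discriminator outputs. It assumes the outputs are probabilities strictly between 0 and 1 and replaces the expectations with batch means; the function name is hypothetical and no network is involved.

```python
import math


def gan_value(d_real, d_fake):
    """Batch estimate of Equation (6).

    d_real: discriminator outputs f_D(x) on real samples, each in (0, 1).
    d_fake: discriminator outputs f_D(f_G(z)) on generated samples, in (0, 1).
    """
    real_term = sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


# A discriminator that cannot tell the two sets apart outputs 0.5 everywhere.
undecided = gan_value([0.5, 0.5], [0.5, 0.5])
# A confident, correct discriminator pushes the value toward 0.
confident = gan_value([0.9, 0.95], [0.1, 0.05])
```

The discriminator tries to raise this value while the generator tries to lower it; when the discriminator outputs 0.5 for every sample the value equals −2 ln 2.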

Because the traditional GAN model cannot control the generated content, which is determined only by f_G and the random noise z, this study uses the PIX2PIX model, a variant of the GAN, for sound speed reconstruction. By utilizing limited sound speed profiles to reconstruct the sound speed of the entire oceanic front, underwater acoustic support is provided during data shortages. This model introduces conditional information y, which helps to control the content generated by the generator. In the PIX2PIX model, y is another type of image domain data corresponding to m. In the generator, random noise z and data y are input together to generate a cross-modal feature; in the discriminator, the data m and the corresponding y are input to generate a cross-modal vector while judging the authenticity of m. In this way, with the introduction of conditional information y, the generator is gradually controlled to generate specific results. The loss function L_PIX2PIX of the PIX2PIX model with conditional information y is expressed as

L_{PIX2PIX}(f_{G}, f_{D}) = \min_{G} \max_{D} E[\ln f_{D}(m | y)] + E[\ln(1 - f_{D}(f_{G}(z | y)))]

(7)

Differing from the classic GAN, which uses a multi-layer perceptron structure, PIX2PIX uses a combination of convolutional layers, batch normalization (BN) layers, and rectified linear unit (ReLU) layers, which are commonly used in convolutional neural networks (CNNs), for the model structure. The generator adopts the classic encoder–decoder structure (U-Net structure), and the discriminator adopts the PatchGAN structure. The structure of the model used in this study is detailed in Section 3.

When evaluating the feature prediction model, this study uses the mean absolute error (MAE) and the SSIM. The smaller the MAE and the larger the SSIM, the smaller the deviation between the predicted value ŷ_i and the true value y_i, and the stronger the prediction ability of the model.

MAE = \frac{1}{n} \sum_{i = 1}^{n} |y_{i} - \hat{y}_{i}|

(8)

This study introduces the SSIM when reconstructing the sound speed to evaluate the visual similarity of two sound speed images. This index comprehensively considers the brightness, contrast, and structural information of the image. The calculation formula is as follows:

SSIM = \frac{(2 μ_{f_{p}} μ_{f_{t}} + c_{1}) (2 σ_{f_{p} f_{t}} + c_{2})}{(μ_{f_{p}}^{2} + μ_{f_{t}}^{2} + c_{1}) (σ_{f_{p}}^{2} + σ_{f_{t}}^{2} + c_{2})}

(9)

where f_p represents the predicted image, f_t represents the true image, μ_{f_p} and μ_{f_t} denote the mean brightness of the two images, σ_{f_p}² and σ_{f_t}² denote the variances in the image brightness, σ_{f_p f_t} is the covariance of the image brightness, and c_1 and c_2 are constants that are introduced to avoid division by zero.
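Both metrics are simple to compute directly. The sketch below implements Equation (8) and a single-window version of Equation (9) over flat lists of pixel values. Treating the whole image as one window, and setting c_1 = (0.01 L)² and c_2 = (0.03 L)² with L the data range, are conventional choices assumed here rather than settings reported in the text; library implementations usually average the index over local windows instead.

```python
def mae(truth, pred):
    """Equation (8): mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(truth, pred)) / len(truth)


def ssim_global(pred, truth, data_range=255.0, k1=0.01, k2=0.03):
    """Equation (9) evaluated once over the whole image (assumed simplification)."""
    n = len(pred)
    mu_p = sum(pred) / n
    mu_t = sum(truth) / n
    var_p = sum((v - mu_p) ** 2 for v in pred) / n
    var_t = sum((v - mu_t) ** 2 for v in truth) / n
    cov = sum((p - mu_p) * (t - mu_t) for p, t in zip(pred, truth)) / n
    c1 = (k1 * data_range) ** 2  # stabilizes the mean term
    c2 = (k2 * data_range) ** 2  # stabilizes the variance term
    numerator = (2.0 * mu_p * mu_t + c1) * (2.0 * cov + c2)
    denominator = (mu_p ** 2 + mu_t ** 2 + c1) * (var_p + var_t + c2)
    return numerator / denominator


image = [0.0, 64.0, 128.0, 255.0]
inverted = [255.0 - v for v in image]
same_score = ssim_global(image, image)
inverted_score = ssim_global(inverted, image)
```

Identical images give an SSIM of 1, while an inverted image has negative covariance and therefore scores well below 1.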

3. Model Training

3.1. Construction of Prediction Model and Physical Parameter Input Method

Considering the computational resources and the need for representativeness, this study uses Latin hypercube sampling [42] to perform stratified sampling according to the oceanic front strength interval (0.1~3.5 m·s⁻¹·km⁻¹) in order to obtain a proxy dataset of 9000 data points. To improve the universality of the model, this proxy dataset is evenly divided into three parts, and the input profiles are set to the south side, center, and north side of the front, respectively. Two types of inputs are used to generate the sound speed of the entire oceanic front: single-profile input I and parameterized front reconstruction input II based on a single profile with contour lines spaced at 5 m·s⁻¹, superimposed onto the single-profile section. The sound speed range for both inputs and the output images is 1450~1600 m·s⁻¹, with a section length of 100 km and a depth of 5500 m, as shown in Figure 6.
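The stratified draw over the front strength interval can be sketched in one dimension as follows. This is an illustrative assumption about how the sampling might be organized: the interval 0.1~3.5 m·s⁻¹·km⁻¹ is split into equal-width strata and one strength is drawn per stratum. How each sampled strength is then matched to an extracted KEF section is not described in the text and is left out.

```python
import random


def latin_hypercube_1d(n, low, high, rng):
    """Draw one sample from each of n equal-width strata of [low, high]."""
    width = (high - low) / n
    samples = [low + (k + rng.random()) * width for k in range(n)]
    rng.shuffle(samples)  # remove the ordering by stratum
    return samples


rng = random.Random(0)  # fixed seed so the sketch is repeatable
strengths = latin_hypercube_1d(9000, 0.1, 3.5, rng)
```

Compared with plain random sampling, this guarantees that weak and strong fronts are both represented in the proxy dataset, which is the stated reason for stratifying by strength.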

The model construction process is depicted in Figure 7. Firstly, the two input sections and the output section are converted into images with an image height of 256, a width of 256, and 3 channels. The model mainly consists of a generator and a discriminator, where the generator uses a U-Net structure [43]. This structure comprises an encoder and a decoder, with the encoder extracting feature representations of the KEF image through downsampling operations, using convolutional layers and pooling layers for feature extraction, and reducing the spatial size of the feature map. The convolutional layer uses the Leaky ReLU (LReLU) activation function, as shown in Equation (10).

Y = ReLU(Conv(X, W) + b)

(10)

where X is the input feature map, Y is the output feature map, W represents the convolution kernel, and b represents the bias term. The max-pooling layer reduces the size of the feature map, as shown in Equation (11):

Y = MaxPool(X)

(11)

The decoder maps the low-resolution feature map from the encoder back to a high-resolution image through upsampling operations. The generator adopts a fully convolutional network structure, with eight convolutional layers and eight transposed convolutional layers, each with a kernel size of 4 × 4 and stride of 2. To address the issue of feature loss during sampling, the U-Net network employs skip connections to directly transmit more image information from the shallow stages of the encoder to the deep stages of the decoder, thereby reducing the feature loss while retaining more detailed information. This design allows the U-Net network to focus on both local details and global information, achieving optimal sound speed reconstruction results.
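The layer counts above can be checked with the standard output-size formulas for strided and transposed convolutions. The sketch below assumes a padding of 1, which is not stated in the text but is the usual companion of a 4 × 4 kernel with stride 2; under that assumption, eight downsampling layers take a 256-pixel side to 1 and eight upsampling layers restore it.

```python
def conv_out(size, kernel=4, stride=2, padding=1):
    """Output side length of a strided convolution (padding is assumed)."""
    return (size + 2 * padding - kernel) // stride + 1


def deconv_out(size, kernel=4, stride=2, padding=1):
    """Output side length of a transposed convolution (padding is assumed)."""
    return (size - 1) * stride - 2 * padding + kernel


encoder_sizes = [256]
for _ in range(8):
    encoder_sizes.append(conv_out(encoder_sizes[-1]))

decoder_sizes = [encoder_sizes[-1]]
for _ in range(8):
    decoder_sizes.append(deconv_out(decoder_sizes[-1]))
```

Each skip connection would then join an encoder feature map to the decoder feature map of the same side length, for example 64 with 64.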

The discriminator also adopts a fully convolutional architecture based on the PatchGAN structure, consisting of five convolutional modules. In the first four modules, the Leaky ReLU is used as an activation function to enhance the model’s nonlinear expression abilities; the last module uses a convolutional layer with a stride of 1. To ensure that the full sound speed structural information is retained, the discriminator’s training process is directly based on the complete KEF section image rather than processing it in blocks. Focusing on the images generated by the generator network or the real image data, the discriminator must determine whether they are real or fake. During the model training process, the generator and discriminator compete with each other and update their parameters until the sound speed section reconstructed by the generator exhibits the expected results.

3.2. Model Training and Effectiveness Evaluation

Based on the constructed PIX2PIX model, sound speed reconstruction is performed for two types of input (input I and input II) and three profile positions (south side, center, and north side). Figure 8 shows the change in the reconstruction effect of the indicated section with the number of iterations. It can be seen that as the number of iterations increases, the prediction effect is gradually improved. In the initial few iterations, there are many blue–green overlay areas in the prediction results, the reconstructed section’s image is blurred, and the prediction accuracy is not high. As the number of iterations increases, the sound speed characteristics of the front zone become increasingly obvious, and the prediction accuracy is significantly improved. Specifically, the prediction effects with inputs I and II show similar trends during the iteration process. Although the prediction effect with input I is slightly better than that with input II at early iteration numbers and at certain positions (such as the center and north side), it is found that the prediction effects are not notably different after a certain number of iterations. In addition, the figure shows the change in the prediction effect at different positions (south side, center, and north side). Overall, the prediction effects at the south side and center are relatively similar during the iteration process, while the prediction effect at the north side is slightly worse at early iteration numbers but improves as the number of iterations increases.

To further evaluate the sound speed reconstruction effect, the images are resampled into sound speed profiles based on the color scale of the KEF section. The MAE and SSIM are used to evaluate the sound speed reconstruction effects of the 1000 m shallow section and the full section, respectively. The average values for 300 test sections are calculated, and the statistical results are shown in Table 2 and Table 3.

For the 1000 m shallow section, Table 2 shows that the sound speed reconstruction improves markedly as the number of iterations increases from 10 to 800. Specifically, when only the profile is used as the input, the MAE values for the south side, center, and north side decrease from 12.25 m·s⁻¹, 12.04 m·s⁻¹, and 13.39 m·s⁻¹ to 3.54 m·s⁻¹, 2.42 m·s⁻¹, and 4.04 m·s⁻¹, respectively, while the SSIM values increase from 0.16, 0.25, and 0.17 to 0.52, 0.61, and 0.53, respectively. When the parameterized front model is introduced as an additional input, the reconstruction is further improved: the MAE values for the south side, center, and north side decrease from 11.88 m·s⁻¹, 13.94 m·s⁻¹, and 11.78 m·s⁻¹ to 3.31 m·s⁻¹, 1.95 m·s⁻¹, and 2.52 m·s⁻¹, respectively, while the SSIM values increase from 0.18, 0.15, and 0.26 to 0.53, 0.63, and 0.59, respectively.

For the full-depth section, when only the profile is used as the input, the MAE values for the south side, center, and north side decrease from 7.35 m·s⁻¹, 9.09 m·s⁻¹, and 7.49 m·s⁻¹ to 0.84 m·s⁻¹, 0.66 m·s⁻¹, and 1.08 m·s⁻¹, respectively, while the SSIM values increase from 0.17, 0.18, and 0.14 to 0.77, 0.80, and 0.77, respectively. When the parameterized front model is introduced as an additional input, the MAE values for the south side, center, and north side decrease from 8.07 m·s⁻¹, 9.65 m·s⁻¹, and 9.01 m·s⁻¹ to 0.95 m·s⁻¹, 0.63 m·s⁻¹, and 0.77 m·s⁻¹, respectively, while the SSIM values increase from 0.12, 0.11, and 0.16 to 0.76, 0.78, and 0.78, respectively. At full depth, the additional input therefore lowers the final MAE at the center and north side but not at the south side, and the final SSIM values of the two inputs differ by at most 0.02.

In summary, the sound speed reconstruction model’s prediction effect is improved with the increase in the number of iterations under different input conditions and positions. Although there are some differences in the initial iterations, the analysis indicates that increasing the number of iterations helps to improve the prediction accuracy.

In addition, the sound speed reconstruction evaluation indicators at different iteration numbers are plotted in Figure 9. Combined with the results in Table 2 and Table 3, three patterns emerge. In terms of position, the prediction at the center is generally better than that at the south and north sides; this may be because the data distribution at the center is more concentrated, making the model’s predictions more accurate there. In terms of input, the reconstruction improves when more input information is used, i.e., when the contour lines of the parameterized front model are introduced. In terms of section depth, because the sound speed in the deep-sea isothermal layer is more uniform, the reconstruction error for the 5500 m section is around 1 m·s⁻¹ and the two inputs achieve similar accuracy. In the upper 1000 m, where the water properties vary more strongly, the reconstruction error is about 2~5 m·s⁻¹; here, superimposing the parameterized front contour lines lowers the MAE by 0.23~1.52 m·s⁻¹ relative to the single-profile input.
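The absolute and relative MAE reductions for the 1000 m section follow directly from the final values in Table 2. The short check below reproduces that arithmetic; the dictionary and function names are illustrative only.

```python
# Final 1000 m section MAE values (m/s) taken from Table 2.
mae_profile_only = {"south": 3.54, "center": 2.42, "north": 4.04}
mae_with_front = {"south": 3.31, "center": 1.95, "north": 2.52}

def mae_reduction(baseline, improved):
    """Return {position: (absolute reduction, percent reduction)}."""
    result = {}
    for position, base in baseline.items():
        absolute = base - improved[position]
        percent = 100.0 * absolute / base
        result[position] = (round(absolute, 2), round(percent, 2))
    return result

reduction = mae_reduction(mae_profile_only, mae_with_front)
for position, (absolute, percent) in reduction.items():
    print(f"{position}: {absolute:.2f} m/s ({percent:.2f}%)")
```

The smallest and largest entries correspond to the 0.23~1.52 m·s⁻¹ range quoted above and to a relative reduction of 6.50~37.62%.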

4. Model Verification

4.1. Evaluation of Underwater Acoustic Propagation Effect

Previous underwater acoustic experiments have shown that Bellhop simulations agree well with the observed convergence zone distance [44]. The present study also uses this model to perform an underwater acoustic propagation simulation in the oceanic front environment. The model calculates the sound field in a horizontally inhomogeneous environment using Gaussian beam tracing, in which each sound ray serves as the central ray of a beam with a Gaussian intensity profile. The propagation process of the simulated sound rays is consistent with the results of the full-wave model [45].

Considering the impact of oceanic fronts on underwater acoustic propagation, this study examines the following underwater acoustic propagation characteristics using the Bellhop model. The starting point is the minimum emission angle at which sound rays emitted from an omnidirectional sound source form the convergence zone, as shown in formula (12). When the sound speed in the source layer satisfies $c_{s} < c_{0}$, a mixed surface-layer sound channel will be formed.

$$\alpha_{\min} \geq \arccos\left(c_{0} / c_{s}\right) \qquad (12)$$
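As a numerical illustration, the limiting angle in formula (12) can be evaluated directly. The sketch below takes the symbols exactly as written in the formula; the function name and the example sound speeds are hypothetical. The arccosine is real-valued only when the ratio c₀/c_s does not exceed 1, so the function rejects other inputs instead of guessing.

```python
import math

def min_emission_angle_deg(c0, cs):
    """Evaluate arccos(c0 / cs) from formula (12), in degrees.

    c0 and cs are sound speeds in m/s, named as in the formula.
    """
    if c0 <= 0 or cs <= 0:
        raise ValueError("sound speeds must be positive")
    ratio = c0 / cs
    if ratio > 1.0:
        # arccos has no real value here, so no limiting angle is returned.
        raise ValueError("arccos(c0 / cs) is defined only when c0 <= cs")
    return math.degrees(math.acos(ratio))

# Hypothetical example values, not taken from the study.
angle = min_emission_angle_deg(1500.0, 1520.0)
print(f"limiting emission angle: {angle:.2f} degrees")
```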

(1): Direct Detection Distance

Direct detection is the main method of short-distance underwater detection. In this study, to mitigate the influence of the surface waveguide and ensure research consistency, the direct detection distance is defined as the horizontal distance (in km) from the sound source to the position where the sound ray with an emission angle of $\alpha_{\min}$ first reaches the receiving depth. The receiving depth is set to the typical submarine limit depth of 500 m.

(2): Convergence Zone Distance

As there is a focal line near the 0° grazing angle sound ray reversal point, the convergence zone distance is often defined as the circular distance of this point [46]. To avoid trapping the sound rays in the surface layer, the sound source is set 150 m below the sea surface in the front zone, and the first convergence zone distance (in km) is taken as the horizontal distance from the sound source to the reversal point of the sound ray emitted at $\alpha_{\min}$.
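The reversal point used in the definition above can be located from Snell’s law in the simplest case. The sketch below assumes a range-independent sound speed profile, unlike the range-dependent front sections handled by Bellhop, and is therefore only an illustration of the principle. The function name `turning_depth` and the three-point profile are hypothetical.

```python
import numpy as np

def turning_depth(depth_m, speed_ms, source_depth_m, launch_angle_deg):
    """Depth below the source at which a ray becomes horizontal.

    Uses the Snell invariant cos(theta) / c = const, so the ray turns
    where c(z) = c_source / cos(launch angle). Returns None when the
    profile never reaches that speed (the ray would hit the seabed).
    """
    depth_m = np.asarray(depth_m, dtype=float)
    speed_ms = np.asarray(speed_ms, dtype=float)
    c_src = float(np.interp(source_depth_m, depth_m, speed_ms))
    c_turn = c_src / np.cos(np.radians(launch_angle_deg))

    below = depth_m > source_depth_m
    z, c = depth_m[below], speed_ms[below]
    hits = np.nonzero(c >= c_turn)[0]
    if hits.size == 0:
        return None
    i = hits[0]
    # Bracket the crossing between the previous node (or the source) and node i.
    z0, c0 = (source_depth_m, c_src) if i == 0 else (z[i - 1], c[i - 1])
    if c[i] == c0:
        return float(z0)
    # Linear interpolation of the sound speed between the two nodes.
    return float(z0 + (c_turn - c0) * (z[i] - z0) / (c[i] - c0))

# Hypothetical bilinear profile: surface, channel axis, deep water.
depths = [0.0, 1000.0, 5000.0]
speeds = [1520.0, 1480.0, 1550.0]
z_turn = turning_depth(depths, speeds, source_depth_m=150.0, launch_angle_deg=0.0)
print(f"reversal depth: {z_turn:.0f} m")
```

In a range-dependent section the horizontal distance to this point has to be obtained by tracing the ray step by step, which is the role of the Bellhop model in this study.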

Based on the evaluation of different input reconstructions, sound speed reconstruction is performed using single profiles and sea surface sound speeds superimposed with parameterized oceanic front contour lines. First, underwater acoustic simulations are conducted on 300 sections in the test set. The first convergence zone distances of the real section, the uniform section based on the sound speed profiles, and the reconstructed sound speed sections are plotted to verify the predictive ability of the acoustic support characteristics, as shown in Figure 10. After calculation, the MAEs of the convergence zone distances for the uniform sound field are 3.76 km, 3.78 km, and 3.91 km, respectively, while the reconstructed sound field reduces the MAE to 2.46 km, 2.52 km, and 3.10 km, respectively. The MAE of the direct detection distance for the reconstructed sound field is 0.62 km, 0.87 km, and 1.01 km, respectively, showing high accuracy in predicting the acoustic propagation characteristics.

4.2. Evaluation of Sound Speed Reconstruction Effects for In Situ Observation Sections

In addition, the effectiveness of the sound speed reconstruction model is verified using in situ field observation sections. Figure 11 shows three sections from different periods and locations as examples. It is found that even models trained with reanalysis data can reconstruct the sections of in situ field data to a certain extent, and an additional quantitative evaluation is performed, as shown in Table 4. Overall, this indicates an MAE of 3~4 m·s⁻¹ and an SSIM of around 0.7. In situ data contain greater uncertainty than reanalysis data, which reduces the prediction accuracy to a certain extent.

4.3. Evaluation of Sound Speed Reconstruction Effects for Sections from Different Data Sources

Due to the timescale limitations of domestic high-resolution reanalysis data, this study mainly uses JCOPE2M for model training. However, the universality of the data sources is key to enhancing the model’s performance and providing acoustic support. Limited by the CORTA range, the verification sea area is set to 144~150° E, 32~40° N. Figure 12 shows that the frontal line identification and section extraction methods used in this study can be effectively applied to domestic data. Taking the most recent complete data for 2023 as an example, a total of 9694 KEF sections are extracted.

Similarly, based on Latin hypercube sampling, 900 proxy datasets are extracted, and the sound speed reconstruction accuracy under verification input II for three sound speed profile positions is plotted, as shown in Figure 13. It is found that different data sources cause certain increases in the model’s prediction errors. The average MAE for the full sea depth is 2.17 m·s⁻¹, 2.19 m·s⁻¹, and 2.72 m·s⁻¹, respectively, and the section reconstruction accuracy generally shows a pattern of first decreasing and then increasing with the depth. The MAE is the lowest at around 2000~3000 m, with an MAE of less than 1 m·s⁻¹, and the prediction accuracy is the lowest at around 300 m, which is related to the significant strength of the oceanic front in the subsurface layer. However, when evaluating the model with the SSIM, it is found that this metric reaches around 0.8 for the full sea depth, indicating that the trained model can fit the sound speed distribution of the new dataset, further verifying the high universality and robustness of the sound speed reconstruction model proposed in this study.
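The depth dependence of the error discussed above can be obtained by averaging the absolute error over sections and horizontal positions at each depth level. The following is a minimal sketch that assumes the sections are stored as arrays of shape (section, depth, range); the array layout and function name are assumptions, not details given in this study.

```python
import numpy as np

def mae_by_depth(true_sections, pred_sections):
    """MAE at each depth level, averaged over sections and range.

    Both inputs are assumed to have shape (n_sections, n_depth, n_range).
    """
    t = np.asarray(true_sections, dtype=float)
    p = np.asarray(pred_sections, dtype=float)
    if t.shape != p.shape or t.ndim != 3:
        raise ValueError("expected two arrays of shape (section, depth, range)")
    return np.abs(t - p).mean(axis=(0, 2))

# Tiny synthetic example: the error is confined to the second depth level.
truth = np.zeros((2, 3, 4))
pred = truth.copy()
pred[:, 1, :] = 2.0
profile = mae_by_depth(truth, pred)
worst_level = int(profile.argmax())
print(profile, worst_level)
```

Plotting such a profile against the depth axis gives a line chart of the kind shown in Figure 13a, and `worst_level` marks the depth at which the reconstruction is least accurate.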

In summary, PIX2PIX shows significant advantages in underwater acoustic sound speed profile reconstruction in oceanic front environments, including data generation and enhancement, high-precision reconstruction, unsupervised learning, and the ability to handle complex data distributions. The model trained with the high-resolution reanalysis data of the KEF can effectively reconstruct the KEF sections of the test set and in situ data, showing broad application prospects in sound speed profile reconstruction and significant practical value.

5. Conclusions

In response to the need for high-precision acoustic support under the condition of limited data, this study extracted KEF sections using high-resolution reanalysis data and in situ observation sections through a frontal line identification method. Subsequently, parameterized oceanic fronts were reconstructed using a combination of parameterized oceanic front models and the statistical features of big data based on single profiles. Finally, a strategy for acoustic support in oceanic front environments under data shortages was proposed by combining the parameterized reconstruction results with the PIX2PIX model.

Using stratified sampling via the Latin hypercube sampling method, according to the oceanic front strength interval, a proxy dataset of 9000 samples covering the period of 1993 to 2022 was obtained. The model was trained and tested at a ratio of 9:1. When evaluating the full-section sound speed reconstruction effect, the MAE was 0.63~0.95 m·s⁻¹ and the SSIM was 0.76~0.78 when using single profiles from the south side, center, and north side of the oceanic front as the input.
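The sampling and splitting steps summarized above can be sketched in a few lines. This is an illustration under stated assumptions and not the procedure used in this study: it stratifies on a single variable (the front strength), selects the nearest available section for each sampled strength, and splits the result 9:1 at random. All function names are hypothetical.

```python
import numpy as np

def latin_hypercube_1d(n, low, high, rng):
    """Draw n values, one from each of n equal-width strata in [low, high)."""
    edges = np.linspace(low, high, n + 1)
    samples = rng.uniform(edges[:-1], edges[1:])  # one draw per stratum
    rng.shuffle(samples)                           # remove the stratum ordering
    return samples

def pick_proxy_sections(strengths, n, rng):
    """Index of the section whose strength is closest to each sampled value.

    The same section may be selected more than once.
    """
    strengths = np.asarray(strengths, dtype=float)
    targets = latin_hypercube_1d(n, strengths.min(), strengths.max(), rng)
    return np.abs(strengths[None, :] - targets[:, None]).argmin(axis=1)

def split_9_to_1(indices, rng):
    """Shuffle the indices and split them into 90% training and 10% test."""
    shuffled = rng.permutation(indices)
    n_test = len(shuffled) // 10
    return shuffled[n_test:], shuffled[:n_test]

rng = np.random.default_rng(0)
# Hypothetical front strengths for a pool of candidate sections.
pool_strengths = rng.uniform(0.5, 3.0, size=2000)
chosen = pick_proxy_sections(pool_strengths, 300, rng)
train_idx, test_idx = split_9_to_1(chosen, rng)
print(len(train_idx), len(test_idx))
```

Stratifying in this way keeps weak and strong fronts equally represented in the proxy dataset, which a plain random draw from the pool would not guarantee.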

However, due to the increased uncertainty brought by the oceanic front in the surface layer, the prediction accuracy was somewhat reduced. When evaluating the reconstruction effect of the 1000 m shallow section, introducing the parameterized front model as an additional input reduced the prediction MAE by 6.50~37.62%. The MAE values when using single profiles from the south side, center, and north side of the oceanic front as the model inputs were 1.95~3.31 m·s⁻¹, with an SSIM of 0.53~0.63, significantly improving the sound speed reconstruction compared to that when using only a single-profile input.

During the verification of the model’s acoustic support and sound speed reconstruction, first, the Bellhop model was used to perform underwater acoustic simulations on the reconstructed sections in the test set. The reconstructed sound field showed high accuracy in predicting the acoustic propagation characteristics, with the convergence zone distance MAE being 2.46~3.10 km. Subsequently, the trained model’s sound speed reconstruction accuracy was verified using 54 sets of field measurement sections and the CORTA data, showing an MAE of 3~4 m·s⁻¹ and 1~2 m·s⁻¹, respectively, indicating that the model has good generalization and robustness.

This study sought to supplement previous research on sound speed reconstruction in complex ocean environments. Proposing a “feature reconstruction” strategy for oceanic front acoustic support that considers both global and local features, guided by physical models, has significant implications for acoustic support, communication, and detection in areas with limited data. This approach is expected to be applied in similar research on other oceanic fronts globally. However, the study also encountered the following issues: owing to space constraints, the consideration of the profile positions and model inputs was relatively limited; moreover, despite the good performance of the PIX2PIX-based sound speed reconstruction model, the structural selection of the generator and discriminator in the model was based on prior knowledge. Future research should focus on optimizing the sound speed reconstruction model and consider the construction of such models under different inputs.

Author Contributions

Conceptualization, W.X. and L.Z.; methodology, W.X. and X.M.; validation, L.Z., M.L. and Z.Y.; data curation, W.X.; writing—original draft preparation, W.X. and X.M.; writing—review and editing, L.Z., M.L. and Z.Y.; visualization, W.X. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the North Pacific Deep Sea Sound Speed Zone Research (DJYSYF2020-008), Dalian Naval Academy and National Natural Science Foundation of China (62073332).

Data Availability Statement

To utilize oceanic front section data in this study, please contact [email protected].

Acknowledgments

We would like to thank the Japan Agency for Marine-Earth Science and Technology, the National Marine Data and Information Service, the Woods Hole Oceanographic Institution, and the Japan Meteorological Agency for their data support. We are grateful for the efforts and contributions of these organizations and individuals, who provided us with valuable datasets such as JCOPE2M (https://www.jamstec.go.jp/e/, accessed on 10 October 2023), CORTA (https://oceancloud.nmdis.org.cn/, accessed on 21 July 2024), the KESS project (https://uskess.whoi.edu/, accessed on 22 September 2023), and the Japan voyage data (https://www.data.jma.go.jp/kaiyou/shindan/index_obs.html, accessed on 25 July 2024), which enabled us to complete this research. During this study, our colleagues and teachers provided us with a great deal of assistance and support. They offered us many valuable opinions and suggestions regarding the experimental design, data analysis, and literature research, for which we are also deeply grateful. Finally, we extend our thanks to the anonymous reviewers for their constructive feedback, which greatly enhanced our manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Haining, H.; Yu, L. Underwater Acoustic Detection: Current Status and Future Trends. Bull. Chin. Acad. Sci. 2019, 34, 264–271.
2. Liu, Y.; Meng, Z.; Chen, W.; Liang, Y.; Chen, W.; Chen, Y. Ocean Fronts and Their Acoustic Effects: A Review. J. Mar. Sci. Eng. 2022, 10, 2021.
3. Colosi, J.A.; Rudnick, D.L. Observations of upper ocean sound-speed structures in the North Pacific and their effects on long-range acoustic propagation at low and mid-frequencies. J. Acoust. Soc. Am. 2020, 148, 2040–2060.
4. Chen, C.; Yang, K.; Duan, R.; Ma, Y. Acoustic propagation analysis with a sound speed feature model in the front area of Kuroshio Extension. Appl. Ocean Res. 2017, 68, 1–10.
5. Wang, Q.; Zhu, H.; Chai, Z.; Chen, C.; Cui, Z. Influence of shallow ocean front on propagation characteristics of low frequency sound energy flow. In Proceedings of the 2022 3rd International Conference on Geology, Mapping and Remote Sensing (ICGMRS), Zhoushan, China, 22–24 April 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 878–881.
6. DeCourcy, B.J.; Lin, Y.-T.; Siegmann, W.L. Effects of front width on acoustic ducting by a continuous curved front over a sloping bottom. J. Acoust. Soc. Am. 2019, 146, 1923–1933.
7. Shihe, R.; Hui, W.; Na, L. Review of ocean front in Chinese marginal seas and frontal forecasting. Adv. Earth Sci. 2015, 30, 552–563.
8. Burnett, W.; Harper, S.; Preller, R.; Jacobs, G.; LaCroix, K. Overview of operational ocean forecasting in the US Navy: Past, present, and future. Oceanography 2014, 27, 24–31.
9. Gao, L.; Zhang, Y.; Li, X. Effect of Bellhop-based mesoscale ocean front and topography on deep-sea communication and detection. In Proceedings of the International Conference on Signal Processing and Communication Technology (SPCT 2022), Harbin, China, 23–25 December 2022; SPIE: Pamiers, France, 2023.
10. Mallik, W.; Jaiman, R.K.; Jelovica, J. Predicting transmission loss in underwater acoustics using convolutional recurrent autoencoder network. J. Acoust. Soc. Am. 2022, 152, 1627–1638.
11. McCarthy, R.A.; Sarkar, J.; Merrifield, S.; Bednar, R.; Ung, D.; Nager, A.; Brooks, C.; Terrill, E. Machine learning transmission loss simulations in complex undersea environments with range-dependent bathymetry. J. Acoust. Soc. Am. 2023, 154 (Suppl. S4), A308.
12. Mccarthy, R.A.; Merrifield, S.T.; Sarkar, J.; Terrill, E. Machine learning of acoustic propagation models for sound aware autonomous systems. J. Acoust. Soc. Am. 2023, 153 (Suppl. S3), A175.
13. Lee-Leon, A.; Yuen, C.; Herremans, D. Underwater Acoustic Communication Receiver Using Deep Belief Network. IEEE Trans. Commun. 2021, 69, 3698–3708.
14. Lee, B.M.; Johnson, J.R.; Dowling, D.R. Neural network predictions of acoustic transmission loss uncertainty. J. Acoust. Soc. Am. 2022, 152, A158.
15. Zhu, R.; Li, Y.; Chen, Z.; Du, T.; Zhang, Y.; Li, Z.; Jing, Z.; Yang, H.; Jing, Z.; Wu, L. Deep learning improves reconstruction of ocean vertical velocity. Geophys. Res. Lett. 2023, 50, e2023GL104889.
16. He, J.; Mahadevan, A. Vertical velocity diagnosed from surface data with Machine learning. Geophys. Res. Lett. 2024, 51, e2023GL104835.
17. Khan, S.; Song, Y.; Huang, J.; Piao, S. Analysis of Underwater Acoustic Propagation under the Influence of Mesoscale Ocean Vortices. J. Mar. Sci. Eng. 2021, 9, 799.
18. Liu, Y.; Chen, W.; Chen, W.; Chen, Y.; Ma, L.; Meng, Z. Ocean front model based on sound speed profile and its influence on sound propagation. In Proceedings of the 2021 OES China Ocean Acoustics (COA), Harbin, China, 14–17 July 2021; IEEE: Piscataway, NJ, USA, 2021.
19. Chen, C.; Ma, Y.; Liu, Y. Reconstructing Sound speed profiles worldwide with Sea surface data. Appl. Ocean Res. 2018, 77, 26–33.
20. Liu, Y.; Chen, Y.; Zhang, Y.; Chen, W.; Meng, Z. Research on reconstruction of the global sound speed profile combining partial underwater prior information. J. Sea Res. 2024, 200, 102516.
21. Zhao, J.; Wang, M.; Hu, N.; Zhu, Z.; Li, H.; Wang, Y.; Liu, D. Reconstruction model of three-dimensional ocean sound speed field based on Tucker-denoising autoencoder. Appl. Acoust. 2024, 223, 110091.
22. Ma, X.; Zhang, L.; Xu, W.; Li, M.; Zhou, X. A mesoscale eddy reconstruction method based on generative adversarial networks. Front. Mar. Sci. 2024, 11, 1411779.
23. Wang, J.; Mao, K.; Chen, X.; Zhu, K. Evolution and Structure of the Kuroshio Extension Front in Spring 2019. J. Mar. Sci. Eng. 2020, 8, 502.
24. Kida, S.; Mitsudera, H.; Aoki, S.; Guo, X.; Ito, S.i.; Kobashi, F.; Komori, N.; Kubokawa, A.; Miyama, T.; Morie, R.; et al. Oceanic fronts and jets around Japan: A review. J. Oceanogr. 2015, 71, 469–497.
25. Liu, J.; Piao, S.; Zhang, M.; Zhang, S.; Guo, J.; Gong, L. Characteristics of Three-Dimensional Sound Propagation in Western North Pacific Fronts. J. Mar. Sci. Eng. 2021, 9, 1035.
26. Ozanich, E.; Gawarkiewicz, G.G.; Lin, Y.-T. Study of acoustic propagation across an oceanic front at the edge of the New England shelf. J. Acoust. Soc. Am. 2022, 152, 3756–3767.
27. Miyazawa, Y.; Varlamov, S.M.; Miyama, T.; Guo, X.; Hihara, T.; Kiyomatsu, K.; Kachi, M.; Kurihara, Y.; Murakami, H. Assimilation of high-resolution sea surface temperature data into an operational nowcast/forecast system around Japan using a multi-scale three-dimensional variational scheme. Ocean Dyn. 2017, 67, 713–728.
28. Miyazawa, Y.; Kuwano-Yoshida, A.; Nishikawa, H.; Narazaki, T.; Fukuoka, T.; Sato, K. Temperature profiling measurements by sea turtles improve ocean state estimation in the Kuroshio-Oyashio Confluence region. Ocean Dyn. 2019, 69, 267–282.
29. Chang, Y.L.; Miyazawa, Y.; Guo, X. Effects of the STCC eddies on the Kuroshio based on the 20-year JCOPE2 reanalysis results. Prog. Oceanogr. 2015, 135, 64–76.
30. Chang, Y.-L.K.; Miyazawa, Y.; Béguer-Pon, M.; Han, Y.-S.; Ohashi, K.; Sheng, J. Physical and biological roles of mesoscale eddies in Japanese eel larvae dispersal in the western North Pacific Ocean. Sci. Rep. 2018, 8, 5013.
31. Liu, Z.J.; Nakamura, H.; Zhu, X.H.; Nishina, A.; Guo, X.; Dong, M. Tempo-spatial variations of the Kuroshio current in the Tokara Strait based on long-term ferryboat ADCP data. J. Geophys. Res. Ocean. 2019, 124, 6030–6049.
32. Donohue, K.A.; Watts, D.R.; Tracey, K.L.; Wimbush, M.H.; Park, J.-H.; Bond, N.A.; Cronin, M.F.; Chen, S.; Qiu, B.; Hacker, P.; et al. Program Studies the Kuroshio Extension. Eos Trans. Am. Geophys. Union 2008, 89, 161–162.
33. Ocean Observation Knowledge—Main Observation Lines. 2024. Available online: https://www.data.jma.go.jp/kaiyou/db/vessel_obs/description/obsline.html (accessed on 25 July 2024).
34. Yu, P.; Zhang, L.; Liu, M.; Zhong, Q.; Zhang, Y.; Li, X. A comparison of the strength and position variability of the Kuroshio Extension SST front. Acta Oceanol. Sin. 2020, 39, 26–34.
35. Dong, S.; Sprintall, J.; Gille, S.T. Location of the Antarctic Polar Front from AMSR-E Satellite Sea Surface Temperature Measurements. J. Phys. Oceanogr. 2006, 36, 2075–2089.
36. Liu, J.; Zhang, Y.; Zhang, X. Analysis on the Characteristics of the Spatial and Temporal Variation of the Kuroshio Extension Front and the Distribution of the Sound Field. J. Ocean Technol. 2015, 34, 15–20. Available online: http://hyjsxb.cnjournals.org/ch/reader/view_abstract.aspx?file_no=20150203&flag=1 (accessed on 12 January 2022).
37. Seo, Y.; Sugimoto, S.; Hanawa, K. Long-Term Variations of the Kuroshio Extension Path in Winter: Meridional Movement and Path State Change. J. Clim. 2014, 27, 5929–5940.
38. Li, S.; Cheng, L.; Zhang, T.; Zhao, H.; Li, J. Striking the right balance: Three-dimensional ocean sound speed field reconstruction using tensor neural networks. J. Acoust. Soc. Am. 2023, 154, 1106–1123.
39. Chen, X.; Wang, C.; Li, H.; Hu, D.; Chen, C.; He, Y. Impact of ocean fronts on the reconstruction of vertical temperature profiles from sea surface measurements. Deep Sea Res. Part I Oceanogr. Res. Pap. 2022, 187, 103833.
40. Liu, Y.; Chen, W.; Chen, W.; Chen, Y.; Ma, L.; Meng, Z. Reconstruction of ocean front model based on sound speed clustering and its effectiveness in ocean acoustic forecasting. Appl. Sci. 2021, 11, 8461.
41. Etter, P.C. Underwater Acoustic Modeling and Simulation, 4th ed.; CRC Press: Boca Raton, FL, USA, 2013; p. 554.
42. Shields, M.D.; Zhang, J. The generalization of Latin hypercube sampling. Reliab. Eng. Syst. Saf. 2016, 148, 96–108.
43. Ronneberger, O.; Fischer, P.; Brox, T. U-Net: Convolutional Networks for Biomedical Image Segmentation. In Proceedings of the MICCAI 2015, Munich, Germany, 5–9 October 2015; Springer: Cham, Switzerland, 2015.
44. Yang, K.; Lu, Y.; Xue, R.; Sun, Q. Transmission characteristics of convergence zone in deep-sea slope. Appl. Acoust. 2018, 139, 222–228.
45. Porter, M.B. The Bellhop Manual and User’s Guide: Preliminary Draft; Heat, Light, and Sound Research, Inc.: La Jolla, CA, USA, 2011; Available online: http://oalib.hlsresearch.com/Rays/HLS-2010-1.pdf (accessed on 12 January 2022).
46. Ma, S.; Guo, X.; Zhang, L.; Lan, Q.; Huang, C. Riemannian Geometric Modeling of Underwater Acoustic Ray Propagation·Application—Riemannian Geometric Model of Convergence Zone in the Deep Ocean. Acta Phys. Sin. 2023, 72, 108–120.

Figure 1. A schematic of the influence of oceanic fronts on acoustic propagation and the PIX2PIX model for sound speed reconstruction.

Figure 2. Ocean observation lines schematic diagram: (a) the ocean observation lines of the Japan Meteorological Agency (source: Japan Meteorological Agency website [33]), (b) the locations of the in situ observation sections used in this study.

Figure 3. The different indicators and parameters used for front-line optimization.

Figure 4. A schematic of oceanic front identification: (a) the sound speed distribution at 300 m in the Kuroshio–Oyashio extension area on 1 January 2022; (b) the front line identification and section extraction effect in the study area; (c) the sound speed difference on both sides of the oceanic front section near 148° E; (d) the sound speed distribution of an example section.

Figure 5. Schematic diagram of parameterized oceanic front model construction and parameter selection: (a) the normalized sound speed profile, (b) optimal value statistical chart.

Figure 6. A schematic diagram of the input and output of the sound speed reconstruction model based on PIX2PIX.

Figure 7. The flowchart for oceanic front sound speed section reconstruction based on the PIX2PIX model.

Figure 8. A schematic of the change in the sound speed reconstruction model’s accuracy with the number of iterations under different input conditions.

Figure 9. An iterative diagram of the sound speed reconstruction errors under different inputs: the evaluation of the reconstruction accuracy for the 1000 m section based on the profiles at the south side (a), center (b), and north side (c) and for the 5500 m section based on the profiles at the south side (d), center (e), and north side (f).

Figure 10. A comparative simulation diagram of the acoustic propagation characteristics of the sound speed reconstruction sections under different position input profiles: (a) direct detection distance (south side); (b) direct detection distance (center); (c) direct detection distance (north side); (d) convergence zone distance (south side); (e) convergence zone distance (center); (f) convergence zone distance (north side).

Figure 11. The effect diagram of sound speed reconstruction based on different in situ input data.

Figure 12. A schematic diagram of front-line identification based on CORTA data: (a) the sound speed distribution at 300 m in the Kuroshio–Oyashio extension area on 1 January 2023; (b) the front line identification and section extraction effect in the study area.

Figure 13. The sound speed reconstruction effects of the CORTA dataset: (a) line chart of MAE varying with depth; (b) boxplot of SSIM statistic.

Table 1. The statistics and fitting of changes in the acoustic structure on both sides of the front with changes in the KEF strength.

	KEF Strength/m·s⁻¹·km⁻¹						Fitting (y = ax³ + bx² + cx + d)
Feature	0.5	1	1.5	2	2.5	3	a	b	c	d
SSS/m·s⁻¹	8.42	9.60	8.80	7.50	7.33	3.96	2.02	−11.96	20.00	−0.22
SLD/m	27.41	29.04	19.47	16.27	11.87	12.00	9.25	−48.65	68.98	−1.15
BSLS/m·s⁻¹	9.12	10.29	9.22	7.69	7.68	4.03	2.21	−13.00	21.49	−0.20
TLSS/m·s⁻¹·m⁻¹	−0.02	−0.06	−0.10	−0.15	−0.16	−0.17	0.00	−0.01	−0.06	0.01
SCAD/m	389.14	587.26	668.18	720.29	791.62	781.00	45.18	−348.85	890.64	−6.70
SCAS/m·s⁻¹	6.01	12.44	17.32	24.75	26.91	29.63	−0.67	1.17	12.29	−0.40
CD/m	1849	2719	2915	2860	2958	3352	321	−2181	4637	−122
CDS/m·s⁻¹	29.99	50.44	63.68	72.80	72.14	67.72	3.10	−26.74	79.46	−4.47
DE/m	−1789	−2656	−2856	−2779	−2967	−3224	−313	2146	−4570	145

Note: the values of each feature in the table represent the differences in the structure between the warm water side and the cold water side, with the abscissa x in the fitting curve representing the strength of the oceanic front and the ordinate y representing the change in structure.

Table 2. The evaluation of the sound speed reconstruction effects on 1000 m sections under different input conditions.

				Number of Iterations
Section Depth	Input	Profile Position	Index (MAE/m·s⁻¹, SSIM)	10	50	100	200	300	500	800
1000 m	Profile	South side	MAE	12.25	3.63	4.29	3.48	3.54	3.56	3.54
		South side	SSIM	0.16	0.39	0.46	0.50	0.52	0.54	0.52
		Center	MAE	12.04	3.39	2.57	1.91	2.42	1.81	2.42
		Center	SSIM	0.25	0.46	0.50	0.58	0.61	0.62	0.61
		North side	MAE	13.39	3.99	4.22	3.96	4.04	3.30	4.04
		North side	SSIM	0.17	0.36	0.42	0.48	0.53	0.59	0.53
	Profile + Parameterized Front Model	South side	MAE	11.88	4.07	3.33	3.35	3.31	2.98	3.31
		South side	SSIM	0.18	0.38	0.46	0.46	0.53	0.57	0.53
		Center	MAE	13.94	7.64	3.00	2.30	1.95	2.10	1.95
		Center	SSIM	0.15	0.31	0.48	0.56	0.63	0.61	0.63
		North side	MAE	11.78	5.02	3.97	2.57	2.52	3.48	2.52
		North side	SSIM	0.26	0.39	0.46	0.59	0.59	0.59	0.59

Table 3. The evaluation of the sound speed reconstruction effects on 5500 m sections under different input conditions.

				Number of Iterations
Section Depth	Input	Profile Position	Index (MAE/m·s⁻¹, SSIM)	10	50	100	200	300	500	800
5500 m	Profile	South side	MAE	7.35	1.99	1.41	0.87	0.84	0.84	0.84
		South side	SSIM	0.17	0.51	0.68	0.75	0.77	0.79	0.77
		Center	MAE	9.09	2.70	1.18	0.79	0.66	0.63	0.66
		Center	SSIM	0.18	0.50	0.65	0.73	0.80	0.80	0.80
		North side	MAE	7.97	2.09	1.81	1.19	1.08	0.94	1.08
		North side	SSIM	0.14	0.53	0.69	0.75	0.77	0.79	0.77
	Profile + Parameterized Front Model	South side	MAE	8.07	2.97	1.26	1.19	0.95	0.87	0.95
		South side	SSIM	0.12	0.40	0.62	0.69	0.76	0.78	0.76
		Center	MAE	9.65	3.76	1.17	0.85	0.63	0.62	0.63
		Center	SSIM	0.11	0.42	0.65	0.74	0.78	0.79	0.78
		North side	MAE	9.01	2.87	1.72	0.90	0.77	1.02	0.77
		North side	SSIM	0.16	0.44	0.63	0.75	0.78	0.79	0.78

Table 4. The evaluation of the sound speed reconstruction effects on in situ sections.

		South Side		Center		North Side	
Data Source	Number of Sections	MAE/m·s⁻¹	SSIM	MAE/m·s⁻¹	SSIM	MAE/m·s⁻¹	SSIM
Japan Seasonal Cruises	50	3.61	0.65	3.80	0.68	3.74	0.69
KESS Project	4	3.19	0.70	3.36	0.69	3.18	0.70

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
