Research on LSTM-Based Maneuvering Motion Prediction for USVs

Guo, Rong; Mao, Yunsheng; Xiang, Zuquan; Hao, Le; Wu, Dingkun; Song, Lifei

doi:10.3390/jmse12091661


by Rong Guo ^1,2, Yunsheng Mao ^1,2, Zuquan Xiang ^1,2, Le Hao ^1,2, Dingkun Wu ^3 and Lifei Song ^1,2,*

^1 Key Laboratory of High Performance Ship Technology, Wuhan University of Technology, Ministry of Education, Wuhan 430070, China

^2 School of Naval Architecture, Ocean and Energy Power Engineering, Wuhan University of Technology, Wuhan 430070, China

^3 College of Mechanical and Electrical Engineering, Beijing University of Chemical Technology, Beijing 100029, China

^* Author to whom correspondence should be addressed.

J. Mar. Sci. Eng. 2024, 12(9), 1661; https://doi.org/10.3390/jmse12091661

Submission received: 21 August 2024 / Revised: 9 September 2024 / Accepted: 13 September 2024 / Published: 16 September 2024

(This article belongs to the Special Issue Safe Maneuvering, Efficient Navigation and Intelligent Management for Ships)


Abstract:

Maneuvering motion prediction is central to the control and operation of ships, and the application of machine learning algorithms in this field is increasingly prevalent. However, challenges such as extensive training time, complex parameter tuning processes, and heavy reliance on mathematical models pose substantial obstacles to their application. To address these challenges, this paper proposes an LSTM-based modeling algorithm. First, a maneuvering motion model based on a real USV model was constructed, and typical operating conditions were simulated to obtain data. The Ornstein–Uhlenbeck process and the Hidden Markov Model were applied to the simulation data to generate noise and random data loss, respectively, thereby constructing a sample set that reflects real experiment characteristics. The sample data were then pre-processed for training, employing the MaxAbsScaler strategy for data normalization, Kalman filtering and RRF for data smoothing and noise reduction, and Lagrange interpolation for data resampling to enhance the robustness of the training data. Subsequently, based on the USV maneuvering motion model, an LSTM-based black-box motion prediction model was established. An in-depth comparative analysis and discussion of the model’s network structure and parameters were conducted, followed by the training of the ship maneuvering motion model using the optimized LSTM model. Generalization tests were then performed on a generalization set under Zigzag and turning conditions to validate the accuracy and generalization performance of the prediction model.

Keywords:

unmanned surface vehicle; maneuvering motion model; black-box model; machine learning; long short-term memory network

1. Introduction

Unmanned Surface Vehicles (USVs) have found extensive applications in hydrographic surveying, maritime weather forecasting, coastal port security, and marine environmental monitoring [1]. Reliable maneuverability is fundamental to USV intelligence, as many control algorithms depend on accurate motion prediction, which necessitates a precise understanding of the USV’s motion prediction model. However, the nonlinear and time-varying dynamics of USVs pose significant challenges for model parameter identification and the design of model-based control algorithms [2].

Currently, two primary methods are employed for ship maneuvering modeling. The first is mechanistic modeling, which utilizes either classical Newtonian mechanics or the Lagrange method, combined with fluid dynamics, to establish mathematical models that describe the ship’s dynamic behavior. The main mechanistic models used for ship maneuvering prediction include the Abkowitz model [3], the Maneuvering Modeling Group (MMG) model [4], and the Nomoto model [5]. The Abkowitz model focuses on the influence of hydrodynamic forces and moments in the ship’s maneuvering equations, while the MMG model decomposes the hydrodynamic forces into components acting on the hull, propeller, and rudder, considering their interactions. The Nomoto model, which describes the ship’s posture and motion response using a set of differential equations, is valued for its simplicity and ease of understanding, although it has limitations in precision [6]. Zou Zaojian applied Support Vector Machines (SVMs) to identify hydrodynamic derivatives and moments within the Abkowitz model, enhancing the accuracy of ship maneuvering predictions [7]. Xu developed an optimal truncated LS-SVM method [5,8], validated through self-propelled experiments [9], although the study did not address scale effects. Pandey described a three-degree-of-freedom (3DoF) MMG mathematical model for the Wave Adaptive Modular Vessel (WAM-V) [10], using a combination of captive model testing and system identification to calculate maneuvering derivatives, with the MMG model simulations demonstrating good performance in turning characteristics. Nonetheless, issues such as ocean disturbances remain unresolved. Luo W. et al. [11] utilized a two-layer feedforward neural network to identify the maneuverability indices and linear hydrodynamic derivatives within the linear MMG model, demonstrating the stability of the estimated parameters. However, the study involved significant simplifications of the MMG model, and the accuracy of parameter identification remains to be improved. Carrillo S. et al. [12] conducted ship motion modeling based on the Nomoto maneuvering model and used the least squares method to identify the $K$ and $T$ parameters.

The second modeling approach is black-box modeling. This method measures the input and output data of a maneuvering ship, and establishes a model that describes the input-output characteristics during ship maneuvering through appropriate processing and computation. Notably, this approach does not require detailed descriptions of the dynamic characteristics of the maneuvering process. In the identification work using white-box models, the task primarily involves identifying coefficients or parameters within a predetermined model, thus making the accuracy of identification dependent on the limitations of the mathematical model [13]. Using fixed, linear, and time-invariant mathematical models for ship maneuvering makes it challenging to generalize predictions, as these models fail to capture the complex nonlinear characteristics of ship motions. When the mathematical model is highly complex, the number of parameters to be identified becomes excessive, relying heavily on a large volume of high-quality sample data. It becomes difficult to obtain accurate model parameters in the presence of disturbances or irregular sample data [14]. Additionally, the precision of hydrodynamic parameter calculations for the hull, propeller, and rudder significantly affects the accuracy of motion predictions. Moreover, factors such as design and manufacturing errors, equipment wear during operation, the maneuvering environment, and variations in ship loading can all contribute to inaccuracies in the ship’s motion model throughout a ship’s lifecycle. These inaccuracies limit the effectiveness of using white-box models for motion prediction. However, employing white-box models can aid in improving the original mathematical models. Bonci et al. [15] improved the fully nonlinear Abkowitz model by identifying the USV model through CFD free-running simulations. This work utilized high-precision CFD data to eliminate the typical characteristic noise present in outdoor tank experiment data.

Over the past decade, numerous machine learning methods have demonstrated remarkable performance in the field of system identification [16], including Support Vector Regression (SVR) [17], Least Squares Support Vector Machines (LSSVM) [18], and Wavelet Neural Networks (WNN) [16]. Zou Zaojian [19] proposed a black-box modeling method based on BP neural networks for the 3-DOF ship maneuvering model, validating the potential of network-based models as control models or state observers in the domain of ship motion control. This method utilized a fully connected neural network to establish a black-box model of 3-DOF ship maneuvering, requiring only three maneuvering operations to identify and validate the model. However, the study did not account for environmental disturbances in the black-box model. Additionally, a black-box offline modeling method for ship maneuvering was proposed using Multi-Output ν-Support Vector Regression (MO-ν-SVR), which optimized the computational efficiency and operability of traditional SVR models. Nevertheless, the research did not consider external disturbances that might influence the dynamic characteristics of the ship. Liu et al. [20], based on free-sailing experiments with the KVLCC2 ship model and a multi-degree-of-freedom Wave Energy Converter (WEC), proposed three identification frameworks: parametric grey-box modeling, non-parametric grey-box modeling, and Bayesian regression-based black-box modeling. They summarized the prerequisites for using these frameworks and demonstrated that the black-box model constructed through Bayesian regression exhibited stronger generalization ability in the loss function compared to SVM and ANN, due to the incorporation of prior knowledge. However, the wave energy training set used in the study was based on regular waves, which limits its generalizability.

In conclusion, the development of ship maneuverability prediction is characterized by diverse modeling approaches, data-driven methodologies, and an emphasis on model accuracy and generalization capabilities. Simultaneously, experimental methods and data validation techniques are continually being refined. As technology advances, research into ship maneuverability will become more in-depth and sophisticated, providing more reliable theoretical support for ship design and operation. Intelligent algorithms also hold significant promise in this field; however, challenges such as extensive training time, complex parameter tuning processes, and heavy reliance on mathematical models pose substantial obstacles to their application. To address these challenges, this paper proposes a Long Short-Term Memory Network (LSTM)-based modeling algorithm for USV maneuvering. By learning from historical USV motion data, the model developed in this study captures the maneuvering characteristics of the USV through machine learning. The subsequent sections of this paper are organized as follows: In Section 2, a maneuvering motion model based on a real USV model was constructed to obtain sample data. The sample data were then pre-processed for training. In Section 3, an LSTM-based black-box motion prediction model was established. In Section 4, an in-depth comparative analysis and discussion of the model’s network structure and parameters were conducted, followed by the training of the ship maneuvering motion model using the optimized LSTM model.

2. Sample Data Set Construction and Data Processing

2.1. The Maneuvering Motion Model of USV

The 3-DOF MMG motion model of the USV is described as follows:

\left\{\begin{matrix} \dot{x} = u \cos\varphi - v \sin\varphi \\ \dot{y} = u \sin\varphi + v \cos\varphi \\ \dot{\varphi} = r \\ \dot{u} = \frac{1}{m + m_{x}} [(m + m_{y}) r v + X_{H} + X_{P} + X_{R}] \\ \dot{v} = \frac{1}{m + m_{y}} [-(m + m_{x}) r v + Y_{H} + Y_{P} + Y_{R}] \\ \dot{r} = \frac{1}{I_{zz} + J_{zz}} (N_{H} + N_{P} + N_{R}) \end{matrix}\right.

(1)

where $x$ and $y$ are the coordinates of the USV in the geodetic coordinate system, $u$ is the longitudinal velocity, $v$ is the lateral velocity, $r$ is the yaw rate, $\varphi$ is the heading angle, $m_{x}$ is the added mass in the longitudinal direction, $m_{y}$ is the added mass in the transverse direction, $I_{zz}$ is the yaw moment of inertia, and $J_{zz}$ is the added moment of inertia in yaw. $X$, $Y$, and $N$ represent hydrodynamic forces and moments, while the subscripts $H$, $P$, and $R$ represent the hull, propeller, and rudder, respectively.

The approximate model for calculating $X_{\mathrm{calm}}$, $Y_{\mathrm{calm}}$, and $N_{\mathrm{calm}}$ was established using the MMG model:

\left\{\begin{matrix} X_{\mathrm{calm}} = X_{H} + X_{P} + X_{R} \\ Y_{\mathrm{calm}} = Y_{H} + Y_{R} \\ N_{\mathrm{calm}} = N_{H} + N_{R} \end{matrix}\right.

(2)

The hydrodynamic forces and moment acting on the hull, $X_{H}$, $Y_{H}$, and $N_{H}$, are approximated by expansions in the hydrodynamic derivatives and can be expressed as

\left\{\begin{matrix} X_{H} = X_{0} + X_{u} \Delta u + X_{uu} (\Delta u)^{2} + X_{uuu} (\Delta u)^{3} + X_{vv} v^{2} + X_{rr} r^{2} + X_{vr} v r \\ Y_{H} = Y_{v} v + Y_{vvv} v^{3} + Y_{\dot{v}} \dot{v} + Y_{r} r + Y_{rrr} r^{3} + Y_{\dot{r}} \dot{r} + Y_{vrr} v r^{2} + Y_{vvr} v^{2} r \\ N_{H} = N_{v} v + N_{vvv} v^{3} + N_{\dot{v}} \dot{v} + N_{r} r + N_{rrr} r^{3} + N_{\dot{r}} \dot{r} + N_{vrr} v r^{2} + N_{vvr} v^{2} r \end{matrix}\right.

(3)

where $\Delta u = u - u_{0}$ is the speed difference of the ship relative to the initial ship speed, and $-X_{0}$ is the resistance when the ship speed is $u_{0}$. The model of the USV used in this study is depicted in Figure 1. The parameters of the USV can be found in reference [18].
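As an illustration of how Equation (1) can be evaluated numerically, the following minimal Python sketch codes its right-hand side exactly as printed and advances the state with one explicit Euler step. The function names, the argument layout, and the idea of passing the summed forces $(X, Y, N)$ as inputs are assumptions made for this example; the actual hydrodynamic force models and USV parameters come from Equations (2) and (3) and reference [18] and are not reproduced here.

```python
import math


def usv_derivatives(state, forces, m, m_x, m_y, I_zz, J_zz):
    """Right-hand side of Equation (1), coded term by term as printed.

    state  = (x, y, phi, u, v, r)
    forces = (X, Y, N): summed hull + propeller + rudder contributions
             (hypothetical inputs; the paper computes them from Eqs. (2)-(3)).
    """
    x, y, phi, u, v, r = state
    X, Y, N = forces
    dx = u * math.cos(phi) - v * math.sin(phi)
    dy = u * math.sin(phi) + v * math.cos(phi)
    dphi = r
    du = ((m + m_y) * r * v + X) / (m + m_x)
    dv = (-(m + m_x) * r * v + Y) / (m + m_y)
    dr = N / (I_zz + J_zz)
    return (dx, dy, dphi, du, dv, dr)


def euler_step(state, forces, h, **params):
    """Advance the state by one explicit Euler step of size h."""
    deriv = usv_derivatives(state, forces, **params)
    return tuple(s + h * d for s, d in zip(state, deriv))


# Placeholder parameters, not the values of the USV in reference [18].
demo_params = dict(m=100.0, m_x=10.0, m_y=50.0, I_zz=80.0, J_zz=20.0)
demo_next = euler_step((0.0, 0.0, 0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 0.0), 0.1, **demo_params)
```

With zero net force and a pure forward speed, the sketch simply moves the vessel along its heading, which is a quick sanity check of the kinematic rows.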

2.2. Sample Data Acquisition

According to Equation (1), simulations of zigzag and turning maneuvers were conducted for the USV under both still water and wave conditions, with a time step of 0.1 s. These simulations covered rudder angles ranging from 10° to 35°, ensuring the availability of the motion data necessary for subsequent maneuverability predictions. In the simulations, the wave conditions were characterized by a wavelength of 5 m and a wave height of 0.3 m. The conditions of simulation are listed in Table 1.

Due to the inherent noise present in ship trials, noise errors primarily arise from sensor inaccuracies related to angles, positioning, and other measurements. The UWB positioning system may also experience data loss and inaccuracies due to interference from scatterers, water surface, and ferromagnetic materials. Additionally, wireless signals may encounter indeterminate additional delays when penetrating obstacles. Consequently, noise was added to the simulation data, and a portion of the data points were randomly discarded to simulate the errors, high-frequency disturbances, and data acquisition losses typically encountered in ship trials.

By combining the Ornstein–Uhlenbeck (OU) process for noise and the Hidden Markov Model (HMM) for data loss, the $i$-th final simulated data $x_{i}^{(\mathrm{simu})}$ is expressed as

x_{i}^{(\mathrm{simu})}, y_{i}^{(\mathrm{simu})} = \left\{\begin{matrix} x_{i}^{(\mathrm{noisy})}, y_{i}^{(\mathrm{noisy})} & \text{if the HMM is in state } S_{1} \\ \mathrm{NULL} & \text{if the HMM is in state } S_{2} \end{matrix}\right.

(4)

where $x_{i}^{(\mathrm{noisy})}, y_{i}^{(\mathrm{noisy})}$ is the data with OU noise added. Taking $x$ as an example:

x_{i}^{(\mathrm{noisy})} = x_{i}^{(\mathrm{ideal})} + x_{i}^{(\mathrm{OU})}

(5)

where $x_{i}^{(\mathrm{ideal})}$ is the data from the simulation conditions listed in Table 1, and $x_{i}^{(\mathrm{OU})}$ represents OU noise. OU noise is a time-correlated stochastic process initially used to describe Brownian motion in physics. In reinforcement learning (RL), it is employed to generate smooth and orderly noise sequences that aid in exploring the action space. The mathematical expression for the OU process is

dx_{i} = \theta (\mu - x_{i}) \, dt + \sigma \, dW_{i}

(6)

where $x_{i}$ (that is, $x_{i}^{(\mathrm{OU})}$) is the $i$-th value of the noise at time $t$, $\theta$ is the parameter controlling the speed at which the noise reverts to the mean, $\mu$ is the long-term mean of the noise, $\sigma$ is the intensity of the noise fluctuations, and $dW_{i}$ is the increment of a standard Brownian motion.

At each time step, the noise is updated according to the following OU process:

x_{i+1} = x_{i} + \theta (\mu - x_{i}) \Delta t + \sigma \sqrt{\Delta t} \, N(0,1)

(7)

where $N(0,1)$ is a normal distribution with mean 0 and variance 1. In the RL setting, the generated OU noise is added to the actions derived from the deterministic strategy; here, it is added to the ideal simulation data as in Equation (5).
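The update rule of Equation (7) can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `ou_noise`, the starting value `x0`, the optional `seed`, and the parameter values in the usage line are assumptions, since the values of $\theta$, $\mu$, and $\sigma$ used in the study are not stated in this section. Only the 0.1 s time step is taken from the text.

```python
import math
import random


def ou_noise(n, theta, mu, sigma, dt, x0=0.0, seed=None):
    """Generate n samples of OU noise with the discrete update of Eq. (7)."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n):
        # Mean-reverting drift plus a Gaussian increment scaled by sqrt(dt).
        x = x + theta * (mu - x) * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        samples.append(x)
    return samples


# Hypothetical parameters; dt = 0.1 s matches the simulation time step.
demo_noise = ou_noise(100, theta=0.5, mu=0.0, sigma=0.2, dt=0.1, seed=0)
```

Adding `demo_noise` element by element to an ideal signal then gives the noisy signal of Equation (5).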

$S_{1}$ and $S_{2}$ in Equation (4) are the two states of the HMM, which is employed to represent the state of data transmission, that is, whether the data is successfully received or lost. An HMM is used for modeling systems that have unobservable states with observable outcomes. $S_{1}$ indicates that the data is successfully transmitted, and $S_{2}$ indicates that the data is lost. The transition probabilities between these states are given by the matrix $P$:

P = \begin{pmatrix} P_{11} & P_{12} \\ P_{21} & P_{22} \end{pmatrix}

(8)

where $P_{ij}$ represents the probability of transitioning from state $S_{i}$ to state $S_{j}$.
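The data-loss rule of Equations (4) and (8) can be illustrated with a two-state Markov chain that decides, sample by sample, whether a value is kept or replaced by a missing marker. The sketch below is a hedged illustration: the function name `simulate_loss`, the choice to start in state $S_{1}$, the use of `None` for NULL, and the transition probabilities in the usage line are all assumptions, as the section does not report the values of $P$.

```python
import random


def simulate_loss(samples, P, seed=None):
    """Apply Eq. (4): keep a sample in state S1, emit None (NULL) in state S2.

    P is the 2x2 transition matrix of Eq. (8), with P[i][j] the probability
    of moving from state S_{i+1} to state S_{j+1}.
    """
    rng = random.Random(seed)
    state = 0  # assumption: the chain starts in S1 (data received)
    out = []
    for s in samples:
        out.append(s if state == 0 else None)
        # Draw the next state from the row of P for the current state.
        state = 0 if rng.random() < P[state][0] else 1
    return out


# Hypothetical transition probabilities, not taken from the paper.
demo_P = [[0.95, 0.05], [0.60, 0.40]]
demo_received = simulate_loss([0.1 * k for k in range(50)], demo_P, seed=0)
```

A larger $P_{22}$ produces longer bursts of consecutive missing samples, which is the main reason to prefer a Markov chain over independent random dropping.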

2.3. Sample Data Processing

In the USV motion data samples obtained, the different elements of the input vector should hold equal importance; however, differing scales among these elements can influence how regression algorithms assess their significance. Therefore, before model training, the input vectors were normalized using the following:

\begin{matrix} X^{(\mathrm{formal})} = X^{(\mathrm{simu})} ./ \max(|X^{(\mathrm{simu})}|) \\ Y^{(\mathrm{formal})} = Y^{(\mathrm{simu})} ./ \max(|Y^{(\mathrm{simu})}|) \end{matrix}

(9)

where $X^{(\mathrm{formal})}$ denotes the normalized input matrix, and $Y^{(\mathrm{formal})}$ denotes the normalized output matrix. $X^{(\mathrm{formal})} = \{ x_{i}^{(\mathrm{formal})} \mid i = 1, 2, \dots, N \}$. For simplification of the equations, $x$ will be used to represent $x^{(\mathrm{formal})}$ in subsequent discussions, and the same applies to the simplification of $y^{(\mathrm{formal})}$.
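A minimal sketch of the normalization in Equation (9) is given below, assuming that the maximum is taken separately for each column (feature) of the sample matrix, which is how max-abs scaling is commonly applied. The function name `max_abs_scale` and the guard for an all-zero column are assumptions added for the example.

```python
def max_abs_scale(rows):
    """Divide every column by its maximum absolute value, as in Eq. (9).

    rows is a list of equally long lists (one list per sample).
    Returns the scaled rows and the per-column scale factors.
    """
    n_cols = len(rows[0])
    # 'or 1.0' avoids division by zero for an all-zero column (assumption).
    scales = [max(abs(row[j]) for row in rows) or 1.0 for j in range(n_cols)]
    scaled = [[row[j] / scales[j] for j in range(n_cols)] for row in rows]
    return scaled, scales


demo_scaled, demo_scales = max_abs_scale([[2.0, -4.0], [-1.0, 1.0]])
```

Keeping the returned scale factors allows predictions made in normalized units to be converted back to physical units by multiplication.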

Disregarding factors such as speed fluctuations during ship trials allows the ship model’s motion to be approximated as a discrete linear dynamic system. A Kalman filter was employed for preliminary filtering of the system:

x_{i} = A x_{i-1} + B u_{i} + \omega_{i-1}

(10)

z_{i} = H x_{i} + v_{i}

(11)

where $x_{i}$ represents the system state matrix, $z_{i}$ is the observed state matrix, $A$ is the state transition matrix, $B$ is the control input matrix, $H$ is the observation matrix, $\omega$ is the process noise, and $v$ is the measurement noise. Both $\omega$ and $v$ are assumed to follow Gaussian white noise distributions with covariances $Q$ and $R$, respectively. The parameters are

Q = \sigma_{Q}^{2} \begin{bmatrix} q & 0 \\ 0 & q \end{bmatrix}, \quad q = \begin{bmatrix} \frac{1}{4} \Delta t^{4} & \frac{1}{2} \Delta t^{3} \\ \frac{1}{2} \Delta t^{3} & \Delta t^{2} \end{bmatrix}; \quad R = \begin{bmatrix} \sigma_{R}^{2} & 0 \\ 0 & \sigma_{R}^{2} \end{bmatrix}

where $\sigma_{Q} = 1$ and $\sigma_{R} = 20$, with $\Delta t$ representing the data sampling interval.
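The block structure of $Q$ and the diagonal $R$ suggest that each coordinate can be filtered on its own with a two-element state (position and velocity). The sketch below illustrates Equations (10) and (11) for a single coordinate under several assumptions that are not stated in the text: a constant-velocity transition matrix $A = [[1, \Delta t], [0, 1]]$, an observation matrix $H = [1, 0]$ (position only), no control term, an initial covariance equal to $\sigma_{R}^{2}$ on the diagonal, and skipping the measurement update when a sample is missing (`None`). The function name `kalman_1d` is hypothetical.

```python
def kalman_1d(zs, dt, sigma_q=1.0, sigma_r=20.0):
    """Kalman filter for one coordinate with state (position, velocity)."""
    # Process noise block q scaled by sigma_Q^2, measurement variance sigma_R^2.
    q11 = sigma_q ** 2 * dt ** 4 / 4
    q12 = sigma_q ** 2 * dt ** 3 / 2
    q22 = sigma_q ** 2 * dt ** 2
    R = sigma_r ** 2
    # Initialise from the first available measurement (assumption).
    pos = next(z for z in zs if z is not None)
    vel = 0.0
    P11, P12, P22 = R, 0.0, R
    filtered = []
    for z in zs:
        # Prediction: x = A x, P = A P A^T + Q with A = [[1, dt], [0, 1]].
        pos = pos + dt * vel
        P11, P12, P22 = (P11 + 2 * dt * P12 + dt * dt * P22 + q11,
                         P12 + dt * P22 + q12,
                         P22 + q22)
        if z is not None:
            # Update with H = [1, 0]: innovation, gain, state, covariance.
            S = P11 + R
            K1, K2 = P11 / S, P12 / S
            innovation = z - pos
            pos += K1 * innovation
            vel += K2 * innovation
            P11, P12, P22 = (1 - K1) * P11, (1 - K1) * P12, P22 - K2 * P12
        filtered.append(pos)
    return filtered


demo_filtered = kalman_1d([5.0, None, 5.0, 5.0], dt=0.1)
```

Because missing samples only trigger the prediction step, the filter output stays defined at every time index, which can be convenient before the interpolation stage.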

Subsequently, Lagrange polynomial interpolation was applied to reconstruct missing data points in the case of lost ship motion data samples in discrete signals. For each known data point, a corresponding Lagrange basis polynomial was constructed:

l_{i}(t) = \prod_{\substack{j = 1 \\ j \neq i}}^{n} \frac{t - t_{j}}{t_{i} - t_{j}}

(12)

The Lagrange interpolation polynomial was then obtained by summing the weighted basis functions:

L(t) = \sum_{i = 1}^{n} x(t_{i}) \, l_{i}(t)

(13)

Finally, substituting $t = t_{m}$ into the interpolation polynomial $L(t)$ provides the estimated value of $x(t_{m})$:

x(t_{m}) = \sum_{i = 1}^{n} x(t_{i}) \, l_{i}(t_{m})

(14)

where $l_{i}(t_{m})$ is the value of the $i$-th basis polynomial at $t = t_{m}$.
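Equations (12) to (14) translate directly into a short routine. The sketch below is illustrative only; the function name `lagrange_interpolate` and the example data are assumptions. In practice it is common to interpolate from a small number of neighboring known points rather than from the whole record, because high-degree interpolating polynomials can oscillate strongly between nodes.

```python
def lagrange_interpolate(ts, xs, t_m):
    """Estimate x(t_m) from known points (ts[i], xs[i]) using Eqs. (12)-(14)."""
    n = len(ts)
    total = 0.0
    for i in range(n):
        # Basis polynomial l_i(t_m) of Eq. (12).
        basis = 1.0
        for j in range(n):
            if j != i:
                basis *= (t_m - ts[j]) / (ts[i] - ts[j])
        # Weighted sum of Eq. (14).
        total += xs[i] * basis
    return total


# Three known samples of x(t) = t^2; the lost sample at t = 1.5 is estimated.
demo_estimate = lagrange_interpolate([0.0, 1.0, 2.0], [0.0, 1.0, 4.0], 1.5)
```

Since three points determine a quadratic exactly, the example recovers the true value of $t^{2}$ at the missing instant.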

To ensure the robustness of the sample data, a radial reach filter (RRF) was applied for further processing of the filtered data. The first step involves forward filtering to obtain the initial filtered signal, which is then reversed in the time domain. The reversed signal undergoes a second filtering process, and, finally, the output is reversed again in the time domain to obtain a radial reach filtered signal. This signal has zero phase distortion, and its amplitude-frequency characteristics are corrected by the square of the filter’s amplitude-frequency response.

Let $z = e^{jw}$ denote the frequency-domain variable. The process can be expressed by the following equations:

\left.\begin{matrix} Y_{1}(e^{jw}) = X(e^{jw}) H(e^{jw}) \\ Y_{2}(e^{jw}) = e^{-jw(N-1)} Y_{1}(e^{-jw}) \\ Y_{3}(e^{jw}) = Y_{2}(e^{jw}) H(e^{jw}) \\ Y(e^{jw}) = e^{-jw(N-1)} Y_{3}(e^{-jw}) \end{matrix}\right\}

(15)

where $X(e^{jw})$ is the input, $Y(e^{jw})$ is the output, $H(e^{jw})$ is the frequency response of the filter (the transform of its impulse response sequence), and $Y_{i}$ $(i = 1, 2, 3)$ represents the intermediate response. The RRF effectively eliminates phase distortion, providing an accurate representation of data trends. However, it requires the “time-reversal” of the signal sequence, which implies that it cannot be used for real-time data processing. Therefore, this method is only suitable for offline data processing, where the complete experimental data is collected first, followed by filtering and comparison with simulation tests.
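The four steps of Equation (15) (filter, reverse, filter, reverse) can be sketched as follows. The filter $H$ used in the study is not specified in this section, so the example assumes a simple first-order low-pass filter with a hypothetical smoothing factor `alpha`; the function names are also assumptions. SciPy's `scipy.signal.filtfilt` provides a library routine built on the same forward-backward idea.

```python
def lowpass(x, alpha):
    """Causal first-order low-pass filter: y[n] = alpha*x[n] + (1-alpha)*y[n-1]."""
    y = []
    prev = x[0]  # initialise with the first sample to limit the start-up transient
    for sample in x:
        prev = alpha * sample + (1 - alpha) * prev
        y.append(prev)
    return y


def forward_backward(x, alpha):
    """Apply the filter forward, reverse, filter again, and reverse back."""
    y1 = lowpass(x, alpha)   # Y1 = X H
    y2 = y1[::-1]            # time reversal
    y3 = lowpass(y2, alpha)  # Y3 = Y2 H
    return y3[::-1]          # second time reversal gives the output


demo_smoothed = forward_backward([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0], alpha=0.3)
```

Because the second pass runs over the time-reversed signal, the delay introduced by the first pass is cancelled, at the cost of needing the whole record before filtering starts.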

The dataset was split into a 9:1 ratio for the training set and test set, with the test set used for the final evaluation of the model’s overall generalization performance to ensure robust performance on unseen data. To prevent overfitting, a portion of the training data was extracted to form a validation set, which was used for preliminary evaluation and hyperparameter tuning of the model. During training, the validation set performance was monitored to determine whether training should be halted. This study employed the K-fold cross-validation method.
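The splitting scheme described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the samples are kept in their original order, the last tenth is held out as the test set, and the number of folds `k` is a hypothetical choice, since the value used in the study is not given. The function names are also assumptions.

```python
def split_train_test(samples, test_fraction=0.1):
    """Hold out the last test_fraction of the samples as the test set (9:1 by default)."""
    n_test = max(1, round(len(samples) * test_fraction))
    return samples[:-n_test], samples[-n_test:]


def k_fold_indices(n, k):
    """Return k (train_indices, validation_indices) pairs over n samples."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        stop = start + size
        val = list(range(start, stop))
        train = [i for i in range(n) if i < start or i >= stop]
        folds.append((train, val))
        start = stop
    return folds


demo_train, demo_test = split_train_test(list(range(100)))
demo_folds = k_fold_indices(len(demo_train), k=5)  # k = 5 is a placeholder
```

Each fold serves once as the validation set while the remaining folds are used for training, so every training sample contributes to validation exactly once.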

3. LSTM-Based USV Motion Black-Box Prediction Model

The velocities $u$, $v$ and angular velocity $r$ in Equation (1) were discretized using the Euler difference method with step size $h$, leading to the iterative equations that predict the motion from time step $k$ to $k + 1$, as represented by Equations (16) through (18):

\begin{aligned} u(k+1) = u(k) & + \frac{h}{m + m_{x}} [ X_{u} u(k) - 2 u_{0} X_{uu} u(k) + 3 X_{uuu} u_{0}^{2} u(k) \\ & + X_{uu} u^{2}(k) - 3 X_{uuu} u^{2}(k) u_{0} + X_{uuu} u^{3}(k) + X_{vv} v^{2}(k) \\ & + X_{rr} r^{2}(k) + X_{vr} v(k) r(k) + (m + m_{y}) v(k) r(k) - X_{uuu} u_{0}^{3} \\ & + X_{uu} u_{0}^{2} - X_{u} u_{0} + X_{0} + X_{P} + X_{R} ] \end{aligned}

(16)

\begin{aligned} v(k+1) = v(k) & + \frac{h}{m + m_{y} - Y_{\dot{v}}} [ Y_{v} v(k) + Y_{vvv} v^{3}(k) + Y_{r} r(k) \\ & - \frac{Y_{\dot{r}}}{m + m_{y} - Y_{\dot{v}}} r(k) + Y_{rrr} r^{3}(k) - (m + m_{x}) v(k) r(k) \\ & + Y_{vrr} v(k) r^{2}(k) + Y_{vvr} v^{2}(k) r(k) + Y_{R} ] \\ & + \frac{Y_{\dot{r}}}{m + m_{y} - Y_{\dot{v}}} r(k+1) \end{aligned}

(17)

\begin{aligned} r(k+1) = r(k) & + \frac{h}{I_{zz} + J_{zz} - N_{\dot{r}}} ( N_{v} v(k) - \frac{N_{\dot{v}}}{I_{zz} + J_{zz} - N_{\dot{r}}} v(k) + N_{vvv} v^{3}(k) \\ & + N_{r} r(k) + N_{rrr} r^{3}(k) + N_{vrr} v(k) r^{2}(k) + N_{vvr} v^{2}(k) r(k) \\ & + N_{R} ) + \frac{N_{\dot{v}}}{I_{zz} + J_{zz} - N_{\dot{r}}} v(k+1) \end{aligned}

From these equations, the following black-box prediction model was developed:

\left\{\begin{matrix} u(k+1) = \rho_{1}(u(k), v(k), r(k), \delta(k)) \\ v(k+1) = \rho_{2}(u(k), v(k), r(k), \delta(k)) \\ r(k+1) = \rho_{3}(u(k), v(k), r(k), \delta(k)) \end{matrix}\right.

(19)

where $\rho(\cdot)$ represents a nonlinear vector function describing the nonlinear relationship between the input and output of samples. $x$ denotes the input matrix composed of input vectors $x_{i}$, $x = [x_{1}, \dots, x_{i}, \dots, x_{N}]^{T}$; $x_{i} = [u_{i}, v_{i}, r_{i}, \delta_{i}]$, with the superscript $T$ indicating matrix transposition; $y$ represents the output matrix composed of output vectors $y_{i}$, $y_{i} = [u_{i}, v_{i}, r_{i}]$. The index $i$ corresponds to the $i$-th data point, with $i$ ranging from 1 to $N$, where $N$ is the number of sample points in the sample set $S$, and $t_{i}$ is the corresponding time.

This black-box model reveals the nonlinear mapping relationship between the ship’s motion variables and the rudder angle. The system’s inputs are represented by the vector $\{u(k), v(k), r(k), \delta(k)\}$, corresponding to the outputs $\{u(k+1), v(k+1), r(k+1)\}$. Therefore, the input and output vectors for the black-box modeling can be expressed as

\left\{\begin{matrix} \text{input:} & [u(k), v(k), r(k), \delta(k)] \\ \text{output:} & [u(k+1), v(k+1), r(k+1)] \end{matrix}\right.

(20)

A schematic of the machine learning-based black-box modeling and motion prediction process is presented in Figure 2. At each step, the input vector $[u(k), v(k), r(k), \delta(k)]$ is fed into the LSTM, and the corresponding LSTM output represents the USV motion output vector $y_{t}$.
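The input and output pairing of Equation (20) can be illustrated by a small routine that turns recorded time series into training samples. The sketch assumes that each sample consists of a window of the most recent `timesteps` input vectors followed by the state at the next step as the target; with `timesteps=1` it reduces exactly to Equation (20). The function name `make_samples` and the example values are assumptions.

```python
def make_samples(u, v, r, delta, timesteps=1):
    """Build (input window, next-step target) pairs following Eq. (20).

    Each input window holds `timesteps` rows [u(j), v(j), r(j), delta(j)]
    ending at step k; the target is [u(k+1), v(k+1), r(k+1)].
    """
    inputs, targets = [], []
    for k in range(timesteps - 1, len(u) - 1):
        window = [[u[j], v[j], r[j], delta[j]]
                  for j in range(k - timesteps + 1, k + 1)]
        inputs.append(window)
        targets.append([u[k + 1], v[k + 1], r[k + 1]])
    return inputs, targets


# Toy series with five time steps (placeholder numbers).
demo_u = [1.0, 1.1, 1.2, 1.3, 1.4]
demo_v = [0.0, 0.1, 0.2, 0.3, 0.4]
demo_r = [0.0, 0.01, 0.02, 0.03, 0.04]
demo_delta = [10.0, 10.0, -10.0, -10.0, 10.0]
demo_inputs, demo_targets = make_samples(demo_u, demo_v, demo_r, demo_delta, timesteps=3)
```

A series of length $N$ yields $N$ minus `timesteps` samples, because the first window needs `timesteps` rows and the last row can only serve as a target.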

LSTM is a variant of Recurrent Neural Networks (RNNs), specifically designed to address the issues of vanishing and exploding gradients encountered in traditional RNNs when dealing with long sequences. Its primary characteristic is the ability to capture and retain long-term dependencies, enabling effective learning and memory of information when processing long sequence data. The LSTM architecture comprises input gates, forget gates, output gates, and memory cells, which control the flow of information and allow the LSTM model to selectively retain or discard previous information, thereby improving model performance on long sequences. Each LSTM unit in the continuous sequence shares the same parameters, with inputs and outputs corresponding to adjacent time steps in the time series. At time step $t$, the network’s input is $X_{t}$, the memory cell is $C_{t}$, and the output is the hidden state $h_{t}$.

The core of LSTM lies in the forget gate, which determines which parts of the previous memory should be retained and which should be discarded, enabling autonomous learning without external intervention. The forget gate takes the current input $X_{t}$ and the previous hidden state $h_{t-1}$ as inputs, and outputs a decay coefficient $f_{t}$ (ranging from 0 to 1) that represents the proportion of information retained. The calculation of the forget gate is given by Equation (21):

f_{t} = \mathrm{sigmoid}(W_{f} \cdot [h_{t-1}, X_{t}] + b_{f})

(21)

where $h_{t-1}$ is the hidden state of the previous LSTM unit, $X_{t}$ is the current input, and $W_{f}$ and $b_{f}$ are used for the linear transformation before the sigmoid function is applied. The input gate determines how much new information should be added to the memory cell $C_{t}$. The input gate takes the current input $X_{t}$ and the previous hidden state $h_{t-1}$ as inputs, and outputs the memory decay coefficient $i_{t}$ and the learned memory content $\tilde{C}_{t}$. The memory decay coefficient $i_{t}$, calculated by the sigmoid layer, and the memory content $\tilde{C}_{t}$, generated by the tanh layer, are given by Equations (22) and (23), respectively:

i_{t} = \mathrm{sigmoid}(W_{i} \cdot [h_{t-1}, X_{t}] + b_{i})

(22)

\tilde{C}_{t} = \tanh(W_{c} \cdot [h_{t-1}, X_{t}] + b_{c})

(23)

The memory cell $C_{t}$ of LSTM is updated using the outputs of the forget gate and the input gate, as shown in Equation (24):

C_{t} = f_{t} \times C_{t-1} + i_{t} \times \tilde{C}_{t}

(24)

Finally, the hidden state $h_{t}$ of the LSTM unit is determined based on the memory cell $C_{t}$. First, the sigmoid layer determines which memory cells $C_{t}$ will be activated for output, represented by $o_{t}$. Then, the memory cell $C_{t}$ is processed by the tanh function to obtain a value between −1 and 1, which is multiplied by the output of the sigmoid gate to obtain the final output $h_{t}$. The calculations for $o_{t}$ and the final output $h_{t}$ are shown in Equations (25) and (26):

o_{t} = \mathrm{sigmoid}(W_{o} \cdot [h_{t-1}, X_{t}] + b_{o})

(25)

h_{t} = o_{t} \times \tanh(C_{t})

(26)

The outputs $C_{t}$ and $h_{t}$ at time $t$ are used as inputs for the next time step. In a multi-layer stacked LSTM model, the output sequence from the previous layer is used as input for the next layer. The output of the last layer, specifically the hidden state $h_{t}$ at the final time step, represents the predicted USV motion state.
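To make the gate equations concrete, the following sketch implements one forward step of a single LSTM unit according to Equations (21) to (26), using plain Python lists so that every operation is visible. It is an illustration only, not the network used in the study: the function names, the dictionary layout of the parameters, and the toy sizes are assumptions, and a deep learning framework would normally be used for training.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, vec):
    """Compute W . vec + b for a weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, vec)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM time step; returns the new hidden state h_t and memory cell C_t."""
    concat = h_prev + x_t  # the concatenation [h_{t-1}, X_t]
    f = [sigmoid(z) for z in affine(params["W_f"], params["b_f"], concat)]          # Eq. (21)
    i = [sigmoid(z) for z in affine(params["W_i"], params["b_i"], concat)]          # Eq. (22)
    c_tilde = [math.tanh(z) for z in affine(params["W_c"], params["b_c"], concat)]  # Eq. (23)
    c = [ft * cp + it * ct for ft, cp, it, ct in zip(f, c_prev, i, c_tilde)]        # Eq. (24)
    o = [sigmoid(z) for z in affine(params["W_o"], params["b_o"], concat)]          # Eq. (25)
    h = [ot * math.tanh(ct) for ot, ct in zip(o, c)]                                # Eq. (26)
    return h, c


# Toy example: hidden size 2, input size 4 ([u, v, r, delta]), all-zero weights.
hidden, n_in = 2, 4
zero_W = [[0.0] * (hidden + n_in) for _ in range(hidden)]
demo_params = {name: zero_W for name in ("W_f", "W_i", "W_c", "W_o")}
demo_params.update({name: [0.0] * hidden for name in ("b_f", "b_i", "b_c", "b_o")})
demo_h, demo_c = lstm_step([1.0, 0.1, 0.0, 0.2], [0.0, 0.0], [1.0, -2.0], demo_params)
```

With all-zero weights every gate equals 0.5 and the candidate memory is zero, so the memory cell is simply halved, which gives an easy hand check of the update in Equation (24).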

During the model training process, the difference between the predicted values and the actual values is analyzed to facilitate self-learning based on predefined rules. The model optimizes itself by adjusting internal parameters to achieve the best performance. The model is updated with the goal of minimizing the prediction error. Since this model update involves a non-convex optimization problem, it is challenging to reliably search for the global optimal solution using gradient descent. To enhance the global search capability of gradient descent, the Adam optimization algorithm is employed. Adam combines the concepts of momentum and RMSprop, taking into account both first-order and second-order moment estimates of the gradients, thereby improving the effectiveness of gradient descent in non-convex optimization problems. The root mean square error (RMSE) and correlation coefficient (CC) are used to comprehensively evaluate the prediction accuracy of the model. RMSE is used to compare the differences between predicted and actual observed values, with smaller differences indicating better predictive performance. CC measures the strength and direction of the linear relationship between the predicted and actual observed values. When the CC between the model’s predicted data and the actual data approaches 1, it indicates a high degree of consistency between the predicted and actual data, resulting in improved predictive performance.
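The two evaluation metrics can be computed as in the sketch below. The section does not print their formulas, so the example assumes the usual definitions: RMSE as the square root of the mean squared difference, and CC as the Pearson correlation coefficient. The function names are assumptions.

```python
import math


def rmse(pred, actual):
    """Root mean square error between predicted and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(pred))


def correlation_coefficient(pred, actual):
    """Pearson correlation coefficient between predicted and actual values."""
    n = len(pred)
    mean_p = sum(pred) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(pred, actual))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov / math.sqrt(var_p * var_a)


demo_rmse = rmse([0.0, 0.0], [3.0, 4.0])
demo_cc = correlation_coefficient([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
```

Note that CC is insensitive to a constant offset or scale error in the prediction, which is one reason to report it together with RMSE rather than alone.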

To validate the generalization of the LSTM-based black-box model, the trained model is used to predict the generalization set, and the predicted values are compared with the expected values. Since the trained model already possesses motion prediction capabilities, only the initial ship state $[u(0), v(0), r(0)]$ and control input ($\delta(i)$) during the maneuver are required during the entire prediction process. Generalization performance assessment focuses on the RMSE and CC of the longitudinal speed $u$, lateral speed $v$, and yaw rate $r$ within the dataset.
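The recursive prediction described above, in which each predicted state is fed back as the next input, can be sketched as follows. The sketch assumes a one-step model that maps $[u, v, r, \delta]$ to the next $[u, v, r]$ (a window width of one); `step_model` stands in for the trained network and `toy_model` is a made-up placeholder used only to make the example runnable.

```python
def rollout(step_model, initial_state, rudder_angles):
    """Predict a whole maneuver from the initial state and the rudder sequence.

    step_model maps [u, v, r, delta] to the next [u, v, r].
    Returns the list of states, starting with the initial state.
    """
    state = list(initial_state)
    trajectory = [state]
    for delta in rudder_angles:
        # The previous prediction, not a measurement, is used as the next input.
        state = list(step_model(state + [delta]))
        trajectory.append(state)
    return trajectory


def toy_model(inp):
    """Placeholder dynamics: the rudder angle is accumulated into u."""
    u, v, r, delta = inp
    return [u + delta, v, r]


demo_trajectory = rollout(toy_model, [0.0, 0.0, 0.0], [1.0, 2.0, 3.0])
```

Because prediction errors are fed back at every step, they can accumulate along the trajectory, which is why generalization is assessed on the complete maneuver rather than on single-step predictions.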

4. Analysis on Network Structure and Simulation

LSTM is employed to model the 3-DOF ship motion black-box model in both still water and wave environments, followed by a comparative analysis of the maneuverability prediction results based on different machine learning methods. Compared to LSSVM and SVR, the structure of LSTM is more complex and its training is more time-consuming; its parameters include the number of network layers, the number of neurons, the degree of regularization, learning rate, batch size, and the number of training epochs. Currently, there is no general method to directly determine the optimal network structure using optimization algorithms; these parameters are typically set manually based on experience and experimentation to determine the best combination. To evaluate the performance of the trained models under different parameter settings, the generalization ability was tested using data from the 20°/20° zigzag maneuvering test.

4.1. Discussion of Network Structure

To explore the impact of different neural network structures on model performance, various parameter combinations were tested, including the number of LSTM layers (ranging from one to five), the number of neurons (ranging from 16 to 128), and regularization methods (none, dropout), as well as regularization parameters.

(1) Impact of Network Structure on Prediction Accuracy:

The training settings were as follows: the Adam method was used for learning rate adjustment, with an initial learning rate set to 0.001; the window width (timesteps) was set to 3; the maximum number of LSTM iterations (epochs) was set to 100; and the batch size was set to 64. The impact of different network structures on performance is illustrated in Figure 3. The meaning of the parameters on the X-axis in Figures 3–6 is shown in Table 2.

(1) Impact of the Number of LSTM Layers: As shown in the figure, with an increase in the number of LSTM layers, RMSE and CC generally exhibit a deteriorating trend. This indicates that while increasing the complexity of the network can improve the prediction accuracy to some extent, excessive complexity may lead to overfitting, resulting in a decline in prediction accuracy.

(2) Number of Neurons: With an increase in the number of neurons, RMSE and CC fluctuate, but the effect is less significant than that of the number of LSTM layers. For example, with one layer, increasing the number of neurons from 16 to 128 causes the RMSE mean to rise from 0.0246 to 0.0370 and the CC mean to drop from 0.998 to 0.996; with two layers, increasing the number of neurons causes the RMSE mean to fluctuate slightly around 0.04 and the CC mean to fluctuate slightly around 0.98. Therefore, for this training problem, merely increasing the number of neurons, and thus the network’s complexity and flexibility, does not improve prediction accuracy.

(3) Regularization Method: For a single-layer network with 128 neurons, the prediction performance in terms of RMSE is best when the dropout rate is 0.15 (RMSE mean of 0.0322, CC value of 0.9944), compared with the unregularized case (RMSE mean of 0.0370, CC value of 0.9965). However, when the dropout rate is increased to 0.3, performance declines (RMSE mean of 0.0412, CC value of 0.9962). For a two-layer network with a configuration of 32 and 64 neurons, the RMSE is worse at a dropout rate of 0.15 (RMSE mean of 0.03859, CC value of 0.99394) than at a dropout rate of 0.3 (RMSE mean of 0.0349, CC value of 0.98897). Increasing the number of LSTM layers increases complexity, and moderately increasing the dropout rate can prevent excessive reliance on specific input data, thereby enhancing the model’s robustness.

Using dropout as a regularization method effectively prevents model overfitting and improves generalization ability. Conversely, a lack of regularization may lead to overfitting and reduced generalization ability. The choice of regularization parameters depends on the model’s complexity; for complex networks, moderately increasing the dropout rate can enhance network robustness.

(2): Impact of Network Structure on Iterations

The number of training iterations (epochs) plays a critical role in training. If the number of epochs is too small, the model may be undertrained and fail to capture the complex structures and long-term dependencies in the sequence, leading to underfitting. Conversely, if the number of epochs is too large, overfitting may result, where the model overemphasizes the details of the training data but performs poorly on unseen data. EarlyStopping can be used to monitor performance on the validation set and automatically determine the appropriate number of iterations. Therefore, in this study, the batch size was set to 100, the maximum number of epochs was set to 100, and EarlyStopping was used to prevent overfitting.
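
The stopping rule can be illustrated with a framework-independent sketch of patience-based early stopping. The `patience` and `min_delta` values and the loss sequence are assumptions for illustration; the paper does not report the settings it used.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch), both 1-indexed, for a sequence of validation losses."""
    best = float("inf")
    best_epoch = 0
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            # Validation loss improved: remember it and reset the patience counter
            best, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every available epoch
    return len(val_losses), best_epoch

# Hypothetical validation losses that stop improving after epoch 3
losses = [0.50, 0.30, 0.20, 0.21, 0.22, 0.205, 0.23, 0.24]
stop, best = early_stopping_epoch(losses, patience=3)
```

Here training would stop at epoch 6, three epochs after the best validation loss at epoch 3, so the number of epochs actually run depends on how quickly each network structure stops improving.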

The number of LSTM network layers, the number of neurons, and the regularization parameters also influenced the training duration of the neural network. The training times for different neural network structures are shown in Figure 4.

(1): LSTM Network Layers:

As the number of LSTM layers increased, the number of epochs exhibited oscillations around 50. Overall, increasing the complexity and flexibility of the network helped better capture the characteristics of the 3-DOF model, thereby reducing training time. However, as the network became too large, the computational load increased sharply, leading to no significant improvement in prediction accuracy, while the training time became excessively long. Different parameter combinations led to varying learning abilities, resulting in oscillatory effects.

(2): Number of Neurons:

As the number of neurons increased, the iterations and training time showed an upward trend but still exhibited oscillations. This is because an increase in the number of neurons leads to an increase in the number of network parameters, thereby increasing computational load and memory consumption.

(3): Regularization Parameters:

In a single-layer network structure, increasing the dropout rate led to a gradual increase in the number of iterations and training time. However, in all two-layer network structures, increasing the dropout rate led to a decrease in the number of iterations and training time. Regularization parameters primarily affect the generalization ability and overfitting of the network, rather than its complexity and computational load. Increasing the dropout rate significantly impacts the simple structure of a single-layer network, reducing network robustness and increasing training time. For complex networks, increasing the regularization parameters may affect convergence speed and iteration count, indirectly influencing training time.

Comprehensive analysis revealed that, for this training set, the LSTM network was most effective when configured with 1–2 layers. The number of neurons and regularization did not significantly affect accuracy.
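
The growth in parameter count with the number of neurons, noted above as a driver of computational load, can be made concrete with the usual formula for a standard LSTM layer with four gates and one bias vector per gate. The input dimension of 4 is a hypothetical value chosen for illustration, and some frameworks use two bias vectors per gate, which changes the count slightly.

```python
def lstm_param_count(input_dim, units):
    """Trainable parameters of one standard LSTM layer (four gates, one bias vector per gate)."""
    # Each gate has an input weight matrix, a recurrent weight matrix, and a bias vector
    return 4 * (units * (input_dim + units) + units)

input_dim = 4  # hypothetical number of input features
counts = {units: lstm_param_count(input_dim, units) for units in (16, 32, 64, 128)}
```

Because of the recurrent weight matrix, the count grows roughly quadratically with the number of neurons, so going from 16 to 128 neurons multiplies the number of parameters by about 50 in this example.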

4.2. Discussion of Training Settings

Once the neural network structure is determined, the prediction accuracy is significantly influenced by the training settings. The following discussion covers several aspects of network training settings, including learning rate adjustment for the LSTM network (initial learning rate), training batch size, and window function width.

The following discussion covers the impact of different initial learning rates and window function widths on the two network configurations, as Figure 5 shows.

(1): Learning Rate:

In both single-layer and double-layer networks, the worst performance was observed when the initial learning rate was set to 0.01. As the learning rate gradually decreased to 0.0001, network performance improved, reaching its best state with LR = 0.0001, demonstrating excellent prediction performance and strong correlation. Setting the initial learning rate too high can lead to rapid but premature convergence that misses the optimal solution, or can even cause unstable training.

(2): Window Function Width:

When LR = 0.0001, the prediction accuracy of the single-layer network remained stable across different window function widths, maintaining excellent prediction accuracy. The best performance was achieved at T = 4, with RMSE reduced to 0.01649 and CC increased to 0.99931. Conversely, in the double-layer network, when LR = 0.001 or 0.0001, prediction accuracy was negatively correlated with the window function width, with RMSE increasing and CC decreasing. Increasing the window width in the LSTM double-layer network may lead to long-term dependency loss and gradient vanishing, reducing prediction accuracy. In contrast, the single-layer network was less affected, being simpler in structure with fewer parameters, making it easier to train and optimize.
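
The instability caused by an overly large learning rate, mentioned in item (1), can be shown with a toy example. The sketch uses plain gradient descent on a one-dimensional quadratic rather than the Adam optimizer and LSTM loss used in the paper, and the curvature value is an assumption chosen so that the three learning rates from the text behave differently.

```python
def gradient_descent(lr, steps=50, x0=1.0, curvature=250.0):
    """Minimize f(x) = 0.5 * curvature * x**2 with plain gradient descent; return |x| at the end."""
    x = x0
    for _ in range(steps):
        x -= lr * curvature * x  # the gradient of f is curvature * x
    return abs(x)

# Distance from the optimum (x = 0) after 50 steps for each learning rate
results = {lr: gradient_descent(lr) for lr in (0.01, 0.001, 0.0001)}
```

In this toy problem the largest rate diverges, the middle rate converges quickly, and the smallest rate converges slowly. It only illustrates the mechanism; which rate works best for a given network has to be determined experimentally, as was done in this section.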

To select an appropriate window function width for the single-layer network, the training times for different window function widths are compared in Figure 6.

Comparative analysis reveals that the model configuration with a total network layer count of 6, an LSTM network layer count of 1, 64 neurons, and a dropout regularization rate of 0.15—trained using a 5-fold cross-validation with a batch size of 32, an Adam optimizer with a learning rate of 0.0001, and a window function width of 2—exhibited the best performance in terms of RMSE and CC. The resulting RMSE was close to zero and the CC was close to 1, indicating that this model possesses excellent learning and generalization capabilities, making it highly suitable for subsequent ship maneuverability prediction tasks. Figure 7 shows the network parameters and structure of the optimized LSTM model.
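
The 5-fold cross-validation split can be sketched as follows. The paper does not describe how the folds were formed, so contiguous, unshuffled folds are an assumption here, and the helper name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_idx, val_idx) index lists for k contiguous, near-equal folds."""
    # Spread the remainder over the first folds so that sizes differ by at most one
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = list(range(start, start + size))
        train_idx = list(range(0, start)) + list(range(start + size, n_samples))
        yield train_idx, val_idx
        start += size

folds = list(k_fold_indices(12, k=5))
```

Each sample appears in exactly one validation fold, and the model is trained five times, once per fold, with the remaining samples used for training.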

4.3. Comparing Simulations

Based on the optimized LSTM structure, the motion prediction model was trained using sample data. The trained LSTM model was then applied to the validation set under wave conditions, comparing simulations based on the MMG model with those based on the LSTM black-box model. The validation set included both a 20°/20° zigzag test (as shown in Figure 8) and a 35° turning test (as shown in Figure 9), representing two of the most distinct maneuvering scenarios for USVs, neither of which was present in the training or testing datasets, ensuring the accuracy of generalization validation.

The comparison of simulations under wave disturbances for the two validation sets reveals that the predictions from the LSTM black-box model closely match the simulation results from the MMG model. This indicates that the black-box model constructed using the LSTM network demonstrates excellent learning and generalization capabilities. Despite the presence of complex noise distribution in the sample data and the loss of some data, the data preprocessing steps effectively repaired these deficiencies. Then, the LSTM neural network structure learned information from this data, with the forget gate playing a crucial role in filtering out disturbances in the sample data, thereby retaining essential information in the neural network that provides valuable insights for the model.
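
The role of the forget gate can be seen in a single step of a standard LSTM cell, sketched below with NumPy. The weights are random placeholders rather than trained values, and the dimensions are hypothetical; the sketch shows only the mechanism by which the forget gate scales the previous cell state, not the trained model's behavior.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell.

    W: (4*units, input_dim), U: (4*units, units), b: (4*units,)
    Gate order in the stacked weights: forget, input, candidate, output.
    """
    units = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    f = sigmoid(z[0 * units:1 * units])  # forget gate: how much of c_prev to keep
    i = sigmoid(z[1 * units:2 * units])  # input gate: how much new information to admit
    g = np.tanh(z[2 * units:3 * units])  # candidate cell state
    o = sigmoid(z[3 * units:4 * units])  # output gate
    c = f * c_prev + i * g               # new cell state
    h = o * np.tanh(c)                   # new hidden state
    return h, c, f

rng = np.random.default_rng(0)
input_dim, units = 4, 8  # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * units, input_dim))
U = rng.normal(scale=0.1, size=(4 * units, units))
b = np.zeros(4 * units)
h, c, f = lstm_step(rng.normal(size=input_dim), np.zeros(units), np.ones(units), W, U, b)
```

Every entry of `f` lies strictly between 0 and 1, so each component of the previous cell state is attenuated by a learned, input-dependent factor before new information is added.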

5. Conclusions

There is a logical challenge in applying simulation data to USV motion model identification, as the simulated data is often overly perfect, lacking noise and sampling loss issues. This results in machine learning algorithms easily capturing the features of the sample set but producing models with poor generalization, making them difficult to apply to other scenarios or in real-world situations. In response to this challenge, this paper addresses the identification of USV motion prediction models by developing an offline prediction model using machine learning methods. A simulation data processing method based on OU noise and the HMM is proposed, which brings the sample data closer to experimental data, which is characterized by random noise and random data loss. Subsequently, a sample data preprocessing strategy that includes MaxAbsScaler normalization, Kalman filtering and RRF smoothing, and Lagrange interpolation is proposed. This preprocessing allows the sample data to be effectively used in subsequent machine learning training. Following this, a 3-DOF black-box prediction model for USV maneuvering based on LSTM networks is established. The research then conducts a thorough comparative study of the LSTM-based black-box prediction model across three aspects: prediction accuracy, computational speed, and the complexity of parameter tuning. Based on the comparison results, an optimized selection strategy is proposed and used for model training. Finally, the trained black-box model is applied to two validation datasets under different operating conditions, demonstrating the effectiveness and robustness of the proposed method. The findings of this research can be utilized in actual USV operations, where real-time navigation data can be used to train the USV maneuvering prediction model. This model can then be employed in model-based control systems, such as Model Predictive Control (MPC), to achieve stable USV navigation in complex environments.
In this study, the focus was on offline modeling and prediction of USV maneuvering data. Future research should emphasize the utilization of real-time data for rapid, online, real-time training.

Author Contributions

Data curation, R.G., Y.M. and Z.X.; funding acquisition, Y.M. and Z.X.; methodology, L.S.; software, L.H.; visualization, L.H. and D.W.; writing—original draft, L.H.; writing—review & editing, L.S. and R.G. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the National Natural Science Foundation of China, grant number 51809203.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

USVs	Unmanned Surface Vehicles
MMG	Maneuvering Modeling Group
SVM	Support Vector Machines
3DoF	Three-Degree-of-Freedom
WAM-V	Wave Adaptive Modular Vessel
CFD	Computational Fluid Dynamics
SVR	Support Vector Regression
LSSVM	Least Squares Support Vector Machines
WNN	Wavelet Neural Networks
BP	Back Propagation
WEC	Wave Energy Converter
LSTM	Long Short-Term Memory Network
OU	Ornstein–Uhlenbeck
HMM	Hidden Markov Model
RNN	Recurrent Neural Network
RMSE	Root Mean Square Error
CC	Correlation Coefficient

Figure 1. Model of the USV.

Figure 2. USV motion prediction model based on LSTM.

Figure 3. Impact of Network Structure on Prediction (the mean represents the average value of the corresponding parameter).

Figure 4. Impact of Network Structure on Training Time.

Figure 5. Impact of Training Settings on Prediction Performance.

Figure 6. Impact of training parameters on Training Time.

Figure 7. The network parameters and structure of the optimized LSTM model.

Figure 8. Comparison of motion prediction results for the 20°/20° zigzag test under wave conditions.

Figure 9. Comparison of motion prediction results for the 35° turning test under wave conditions.

Table 1. Conditions of simulations.

Scenarios	Conditions
No Wave	10°/10° Zigzag
	15°/15° Zigzag
	20°/20° Zigzag
	25° Turning
	30° Turning
	35° Turning
Wave	10°/10° Zigzag
	15°/15° Zigzag
	20°/20° Zigzag
	25° Turning
	30° Turning
	35° Turning

Table 2. The meaning of the parameters on the X-axis in Figure 3, Figure 4, Figure 5 and Figure 6.

Parameters on the X-Axis	Meaning
L	the number of LSTM layers
N	the number of neurons per layer
D	the dropout regularization rate
LR	the initial learning rate
T	the window function width

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

