
VesNet: A Vessel Network for Jointly Learning Route Pattern and Future Trajectory

Published: 28 March 2024

Abstract

Vessel trajectory prediction is key to maritime applications such as traffic surveillance, collision avoidance, and anomaly detection. Making more precise predictions requires a better understanding of the moving trend of a particular vessel, since the movement is affected by multiple factors such as the marine environment, vessel type, and vessel behavior. In this paper, we propose a model named VesNet, based on the attentional seq2seq framework, to predict a vessel's future movement sequence by observing its current trajectory. Firstly, we extract route patterns from the raw AIS data during preprocessing. Then, we design a multi-task learning structure to learn route pattern classification and vessel trajectory prediction simultaneously. Comparisons with representative baseline models show that VesNet achieves the best long-term prediction precision. Additionally, VesNet can recognize the route pattern by capturing implicit moving characteristics. The experimental results show that the proposed multi-task learning assists the vessel trajectory prediction task.

1 Introduction

Over 70\(\%\) of the earth's surface is covered by oceans. The sea provides abundant resources and connects continents and countries worldwide. Maritime transportation is essential in global trade, sustaining the international economy. It is reported that more than 80\(\%\) of globally traded goods are carried by sea, and the share is even higher for most developing countries [2]. Ensuring the safety and efficiency of maritime transportation benefits human life, the world economy, and the oceanic environment. The vessel trajectory prediction task is a core component of a maritime situation awareness system. By effectively predicting the future movement of a vessel on both short-term and long-term levels, it assists functions such as collision risk assessment, collision avoidance, destination and travel time estimation, and path planning. Therefore, a more precise vessel trajectory prediction contributes to a better preview of the maritime traffic situation, resulting in a well-managed maritime transportation system.
Historical moving records offer the opportunity to comprehend the underlying dynamic characteristics of how a vessel moves. With the development of the Automatic Identification System (AIS), a ship periodically broadcasts its current status to receivers equipped on other nearby vessels, terrestrial base stations, and even satellites. Each AIS message contains static and dynamic information indicating the vessel's current behavior [59]. The static information includes the Maritime Mobile Service Identity (MMSI), vessel type, and size. MMSI is a unique number assigned to a vessel that can be treated as its ID. The dynamic information includes the timestamp, latitude, longitude, Speed over Ground (SoG), and Course over Ground (CoG). Collecting the AIS data in time-sequential order forms the voyage trajectory. Since more and more AIS transceivers are deployed, the amount of AIS data has grown drastically, leading to several AIS datasets being publicly available online [51].
Despite advanced data mining technologies and abundant AIS data, some prerequisite information that is readily available for human trajectory prediction is still missing for further enhancing vessel trajectory prediction. Therefore, we summarise the challenges from the following three aspects:
(1)
Unclear Origin-Destination (OD) information. It is often difficult to identify the exact starting and ending points of a vessel trajectory, especially the departure and arrival ports. The unclear OD information is mainly caused by wrongly reported or missing fields within the AIS data. It is demonstrated in [58] that about 40% of the destination information is intentionally or unintentionally updated with mistakes. Poikonen [42] analyzed European AIS data collected in January-March 2020 and found that about 10% of cargo vessels and 20% of passenger vessels did not report their destination. The availability of accurate destination information therefore cannot be guaranteed. The incorrect or missing OD information makes capturing the vessel route for long-term predictions challenging.
(2)
No existing road networks. Unlike the human and vehicle mobility prediction problems in urban scenarios, vessel trajectories lack information such as road networks and locations of Points of Interest (PoI), especially in open sea cases [40]. Without the constraints of a “road”, the vessel's movement is prone to be affected by the marine environment, which leads to more flexible vessel trajectories and causes instability in predictions.
(3)
Seldom periodic moving patterns. There are fewer periodic moving patterns for a specific vessel than for human mobility, as mentioned in [21]. The length of a vessel voyage usually varies from days to weeks, which hardly exhibits daily or weekly similar moving patterns [26]. Typically, container ships follow the fastest route to the next port on their schedule for the best financial effect. Such lack of intra-periodicity results in difficulties in discovering individual regularities.
Hence, this paper focuses on extracting typical routes from historical AIS data to tackle the abovementioned challenges. By checking the recorded AIS speed, we can obtain various independent maritime journeys by segmenting at anchoring status. By jointly taking the starting and ending locations, time duration, moving speed and direction distributions, and covered waypoints into consideration, it is feasible to extract route patterns composed of similar moving behaviors when clustering the journeys. When predicting the vessel trajectory, we merge the route classification result on the observed sequences into the proposed model to align the output with the derived route patterns. We summarise our contributions as follows:
Vessel route pattern extraction. We propose a general pipeline for extracting vessel route patterns from the raw AIS data. For a large amount of historical data, we categorize similar vessel trajectories into the same pattern cluster. By filtering the starting and ending points of all trajectories and then clustering them based on spatial density, it is feasible to identify a voyage's departure and arrival ports, hence obtaining the OD information. The trajectory clustering is then carried out by grouping trajectories with the same derived OD information and similar shapes. Similar route patterns usually mean similar moving behaviors in vessel transportation, even though they are generated by different vessels [40]. Thus, extracting vessel movement patterns alleviates all three challenges, acting as prior knowledge serving the vessel trajectory prediction.
Multi-task learning (MTL) of route pattern and future trajectory. We propose a novel framework named VesNet (Vessel Network), which uses MTL for route pattern classification and trajectory prediction in maritime transportation. The extracted route patterns are treated as labels for training classifiers with supervised learning. Based on the attentional seq2seq model, VesNet can simultaneously classify the route pattern and predict the future positions of a target vessel. The intuition behind this idea is to let the model learn typical movement patterns and collaboratively guide the prediction task by recognizing the route class.
Substantial experiments on a real-world dataset. We conduct extensive experiments on a real-world dataset to verify the effectiveness and efficiency of our proposed VesNet, mainly from the perspectives of overall prediction performance, route pattern classification performance, and ablation study.
The remainder of this paper is organized as follows. Section 2 introduces related works regarding mobility trajectory prediction, trajectory clustering, and MTL. Section 3 formulates the vessel trajectory prediction problem with mathematical expressions. Section 4 presents an elaborate description of our proposed VesNet. After explaining the experimental setup in Section 5, Section 6 showcases the results with corresponding analysis. Finally, Section 7 concludes the paper and outlines future directions for further extension.

2 Related Works

In this section, other works relevant to this paper are discussed, falling into four aspects: the human mobility prediction problem, the vessel trajectory prediction problem, the moving trajectory clustering problem, and the MTL technique.

2.1 Human Mobility Prediction

When it comes to mobility prediction, human-related issues cannot be avoided since they involve more complexity and flexibility. Human mobility study is essential because of its effect on the following aspects of our daily life: disease spreading, transportation scheduling, event arrangement, resource allocation, urban planning, and more. With the growing popularity of location-based mobile services, the GPS equipment embedded within smart devices, and the logs generated by wireless communications between mobile phones and base stations, more and more human mobility data is emerging at multiple spatial and temporal scales. This facilitates researchers in artificial intelligence to utilize deep learning techniques to resolve mobility-related challenges [33]. Besides human movement prediction [21, 54] and generation tasks [22, 55], other predictive problems such as movement purpose prediction [44], home location detection [41, 52], and population inference [18] also attract attention. The human mobility prediction problem, especially the next location prediction challenge, matters most to the vessel trajectory prediction task addressed in this paper.
Before the development of deep learning techniques, next location prediction was carried out using probabilistic patterns. In [6], the authors proposed a probabilistic model combining human trajectories and geographical features. Monreale et al. [36] proposed a trajectory pattern mining algorithm that involves frequently visited regions. Deep learning methods manage to capture complex spatial and temporal dependencies within sequences. With the help of the Recurrent Neural Network (RNN), Long Short Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and attention mechanism, various models have been developed for capturing moving patterns. In Variational Attention-based Next Location (VANext) [24], the authors used a CNN to encode the historical trajectories and a GRU to encode the current trajectory. The outputs are fed into an attention layer that detects the historical trajectory that best matches the current one for predicting the following location. Deep Model for Joint Mobility and Time (DeepJMT) [14] predicted an individual's next POI with the arrival time using four components: a sequential dependency encoder, a spatial context encoder, a periodicity context extractor, and a social temporal context extractor. It concatenated the outputs of the four components to make the prediction. Based on the above, deep learning frameworks are suitable for trajectory prediction using multi-purpose components.

2.2 Vessel Trajectory Prediction

It is also worth introducing state-of-the-art vessel trajectory prediction research for a better view of existing techniques and limitations. On the whole, three types of vessel trajectory prediction methods can be summarised: (1) physical model based [51], (2) learning model based [51], and (3) knowledge based [56]. Unlike vehicles and aircraft, a vessel cannot change its speed and direction drastically within a short period and moves in a 2D plane [51]. This characteristic inspires the approach of leveraging physical laws to calculate the future movement of a vessel by solving mathematical equations. These methodologies include the Constant Velocity Model (CVM) [25, 43], curvilinear model [5], lateral model [10, 29], and ship model [46]. Being either too simple or constrained by ideal environments and accurate state assumptions, physical model-based methods are rarely utilized in practical scenarios. Furthermore, they perform poorly in predicting long-term positions, that is, on the granularity of hours, since the forecasting relies only on the current vessel dynamic status and long-term route knowledge is not incorporated into the model. Therefore, researchers employ deep neural networks to overcome such drawbacks and capture the spatial-temporal dependencies among the vessel movement data.
RNNs have been widely utilized in sequence-related tasks. The seq2seq model effectively enables applications such as machine translation [16, 50] and speech recognition [15]. Due to its capability of storing long sequential context within a representing state vector through the encoder before interpreting it via the decoder, the seq2seq model manages to tackle the trajectory prediction task. Forti et al. [23] proposed a seq2seq model based on an LSTM unit to predict vessel trajectories. Nguyen et al. [38] also leveraged the seq2seq architecture, transforming the GPS information into a spatial grid instead. The configuration of the grid size affects the training time complexity and prediction accuracy. You et al. [62] took the relative longitude and latitude, together with the time interval, as the input of the seq2seq model to predict vessel trajectories in the Yangtze River, China. Nonetheless, the input of relative values leads the model to learn the moving trend of the vessel instead of the moving pattern, which causes poor performance at waypoints. Capobianco et al. [7, 8] proposed a seq2seq model with attention to forecast vessel trajectories. Furthermore, they involved uncertainty awareness and destination port labels to enhance the performance. Learning model-based approaches rely on factors like dataset quality, data preprocessing, and fine-tuning of network parameters to attain practical prediction ability. In addition, the training procedure is vital, but few specific instructions can be followed; it depends more on empirical intuition.
Besides the approaches mentioned above, another type of method, known as knowledge based, is founded on statistics and describes the vessel movement patterns in a probabilistic format. Researchers [1, 30, 35, 39, 40] extracted vessel movement patterns by gathering trajectories traversing similar waypoints sequentially into clusters. Each pattern is represented by a synthetic route calculated as the mean of the cluster members. Additionally, the Probability Density Function (PDF) of pattern time length, speed, and course is derived, which can be helpful in trajectory prediction and anomaly detection. Hexeberg et al. [28] adopted a simplified version of moving behavior knowledge without extracting movement patterns from the historical AIS data. It treated the whole dataset as a complex pattern and searched the neighborhood of the predicting point to calculate the posterior course and speed. The next position is then linearly inferred, constrained by a preset step length; repeating this over multiple steps generates a sequence of predicted positions. Nonetheless, the algorithm cannot handle intersections and branches of vessel tracks, which limits its applicability. Rong et al. [45] predicted vessel trajectories considering uncertainty based on a Gaussian process model. Other researchers [26, 57] split the focused region into spatial grids and established the transition relationships between grids based on the historical AIS data. The frequency of transitions is related to the likelihood of moving behaviors. Xiao et al. [57] leveraged Kernel Density Estimation (KDE) to elaborate the probability distribution of position transfers. Hakola [26] forecasted the shortest route between ports with the help of the A* algorithm, calculating the route weights using the grid transition matrix. However, the method requires the destination port information in advance. Moreover, it always derives the minimum-cost route, which may differ from the ground truth. Shu et al. [48] considered vessel trajectory prediction as a path-planning task that involves multi-objective optimization based on optimal control.
Inspired by the approaches introduced above, we combine learning-based and knowledge-based methods by merging pattern knowledge into the neural network, giving rise to the model proposed in this paper.

2.3 Moving Trajectory Clustering

Extracting the movement pattern of a mobile target helps understand its dynamic characteristics. For humans, vehicles, and aircraft, periodic daily or weekly tracks usually behave similarly. Though vessel tracks lack periodicity, different ships share the same traveling routes across various vessel types. Jointly considering facts such as geographical landscapes, weather conditions, and fuel consumption, vessel navigators are prone to select similar routes between specific ports. Therefore, gathering diverse trajectories into clusters of similar movement patterns is crucial for extracting the voyaging knowledge. The trajectory clustering problem is to classify tracks based on route shape and the distance between them. Calculating such shape- and location-related distance leads to two directions. One is to derive the distance directly, and the other is to measure the difference between hidden vectors representing the trajectories via neural networks.
One intuitive idea concerning distance-based approaches is to compare the common parts between two trajectories. A larger overlapping portion results in higher trajectory similarity. Following this thought, the Longest Common Subsequence (LCSS) method [11] provides a simple trajectory comparison method. Another widely utilized algorithm, the Hausdorff distance, measures how close two trajectories are. It defines the trajectory distance as the greatest of all distances from a point on one trajectory to the closest point on the other. Calculating the Hausdorff distance is time-consuming since its time complexity is \(O(n^2)\), where n is the length of the trajectory. Neither algorithm considers the time order, hence they are unable to distinguish trajectories in some cases. For instance, two similar round-trip trajectories are categorized as the same pattern but belong to two distinctive voyage lanes since their sailing directions are contrary. Another popular distance measurement called Dynamic Time Warping (DTW) [53] is a time-sequence alignment algorithm originally developed for speech recognition. It considers time order and is suitable for comparing trajectories of different lengths. It aims to align two sequences by adjusting the time axis iteratively until an optimal match between them is found. Suppose one sequence has length m and the other has length n; an \(m \times n\) matrix is required to store the pairwise Euclidean distances. Starting from the lower left corner and ending at the upper right corner of the matrix, a path with the minimum overall distance under a few constrained conditions can be derived, representing the optimal matching relationships between the two sequences. Such a distance measures the difference between the two sequences, considering both location information and trajectory shapes.
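As a concrete illustration of the accumulated-cost formulation described above, the following NumPy sketch computes the DTW distance between two trajectories; the function name and the plain Euclidean point cost are assumptions made for illustration, not code from any cited work.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Minimal DTW sketch: seq_a, seq_b are (m, 2) and (n, 2) arrays of (lat, lon) points."""
    m, n = len(seq_a), len(seq_b)
    # Pairwise Euclidean distances between all point pairs (the m x n cost matrix).
    cost = np.linalg.norm(seq_a[:, None, :] - seq_b[None, :, :], axis=-1)
    # Accumulated-cost matrix; acc[i, j] is the best alignment cost of the prefixes.
    acc = np.full((m + 1, n + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],      # skip a point of seq_a
                                                 acc[i, j - 1],      # skip a point of seq_b
                                                 acc[i - 1, j - 1])  # match both points
    return acc[m, n]
```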
In [3], a method suitable for clustering vehicle trajectories was demonstrated. It combined ideas from spectral clustering and proposed a trajectory similarity evaluation mechanism based on a modified Hausdorff distance to enhance its robustness and respect that trajectories are ordered collections of points. The work compared the proposed method with LCSS and DTW on a few real-world datasets, revealing its superiority. Unlike the methods mentioned above, which cluster whole trajectories, Lee et al. [31] segmented each trajectory into partitions before putting them into categories. Their approach consists of two phases, using the Minimum Description Length (MDL) principle for trajectory partitioning and a density-based line segment clustering algorithm for grouping. It defined three types of distances, the perpendicular distance, the parallel distance, and the angle distance, to describe how far apart two line segments are. The line distance was calculated as a linear summation of the three distances. Such distance was then leveraged as the metric for the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm in the line categorization task.
The approaches mentioned previously in this subsection focus on computing distances as the metric to discriminate trajectories. Researchers have recently represented trajectories as vectors, which converts the trajectory clustering issue into a vector grouping problem. In [61], trajectories were learned as high-quality low-dimensional representations. The method first used a sliding window to extract space- and time-independent moving behavior characteristics, and then employed a seq2seq framework to learn a fixed-length representation of the moving features. The learned representation encodes the movement characteristics of the target object and can be further fed into classic clustering algorithms, like the K-means method. The method was evaluated on synthetic and real data, resulting in significant performance improvements over existing approaches. The neural network-based process automatically transforms the trajectory into a representing vector instead of creating a matrix to store the distances between trajectories, as the distance-based methods do.

2.4 Multi-task Learning

MTL has been proven successful in fields such as computer vision, natural language processing, reinforcement learning, and so on [17, 47]. It is a subfield of machine learning in which multiple tasks are learned simultaneously via a shared model framework. MTL can perform better on the original task due to the hidden representations shared by relevant learning tasks [47]. From the framework-sharing perspective, MTL can be realized through either hard or soft parameter sharing of the network model. In hard parameter sharing [9], all tasks share the hidden representations while the output layer is specifically designed for each task. Moreover, hard parameter sharing manages to reduce overfitting [4]. Intuitively, with several tasks to learn simultaneously, the model is prone to capture the characteristics common to all of them, making it less likely to overfit the original specific task. Slightly different from hard parameter sharing, in soft parameter sharing, the hidden layers for each task are not identical but are kept similar based on specified distance metrics [20, 60]. Stepping further into sequence-related MTL problems, the seq2seq framework was adjusted following three parameter-sharing strategies: one-to-many, many-to-one, and many-to-many [34]. Similarly, three other parameter sharing schemes were developed concerning the LSTM recurrent network: uniform-layer, coupled-layer, and shared-layer [32].
Concretely, MTL has been involved in vessel-related tasks as well. Nguyen et al. [37] proposed a four-hot method for representing the AIS data input to a Variational Recurrent Neural Network (VRNN) for three maritime surveillance-related missions, namely vessel trajectory reconstruction, anomaly detection, and vessel type identification. The work focused on various temporal granularities, ranging from hours to days, to learn multiple tasks. The four-hot representation is the concatenation of four dynamic fields included in the AIS data, latitude, longitude, SoG, and CoG, each converted to a one-hot format. In [19], the authors proposed an MTL seq2seq model for vessel trajectory prediction. It jointly adopted the AIS, radar images, and Electronic Navigational Charts (ENC) as the input to simultaneously learn multiple tasks of matching the future GPS coordinates and the layout of nearby water and land regions. It reported that more types of information could achieve higher prediction accuracy. The evidence shows that MTL benefits the vessel trajectory prediction task.

3 Problem Formulation

In this section, we formally describe the vessel trajectory prediction problem. The main structure of our model is based on the seq2seq learning approach, that is, by observing the input vessel movement sequences and learning a hidden representing vector before interpreting it for future trajectory prediction.
A trajectory can be expressed as a set of temporally ordered tuples \(\mathcal {T}=\lbrace (a_i, t_i)\rbrace _{i=1}^N\) with the length N, where \(a_i\) is the ith dynamic attributes and \(t_i\) is the corresponding timestamp. Generally, a vessel trajectory point is represented by \(a_i=(lat_i, lon_i, v_i, c_i)\), where \(lat_i\), \(lon_i\), \(v_i\) and \(c_i\) represent latitude, longitude, SoG, and CoG, respectively. Hence, a vessel trajectory can be presented as \(\mathcal {T}=\lbrace (lat_i, lon_i, v_i, c_i, t_i)\rbrace _{i=1}^N\). Furthermore, a dataset is composed of a set of trajectories \(\mathcal {D}=\lbrace \mathcal {T}_j\rbrace _{j=1}^K\).
Due to the irregular data generation rate and the sometimes unstable data upload link, the data is sampled with fluctuating time intervals. To better reflect the relationship between sequence length and sailing time, we resample the original data by linear interpolation to generate equal time gaps between two consecutive trajectory points. Thus, the same time duration covers the same number of points for different trajectories. Predictions along time horizons also become more precise, since the output sequence generated by the model then has uniform time intervals.
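As a minimal sketch of this resampling step, assuming the trajectory is stored as a NumPy array of per-point attributes with Unix timestamps, the following function interpolates each attribute onto an equally spaced time grid; the 5-minute default mirrors the interval used later in Section 5.2, and the wrap-around of the course angle at 0/360 degrees is ignored here for simplicity.

```python
import numpy as np

def resample_trajectory(timestamps, features, step_s=300):
    """Resample a trajectory to a fixed time step (default 5 min) by linear interpolation.

    timestamps: 1-D array of Unix times (seconds), strictly increasing.
    features:   (N, 4) array of (lat, lon, SoG, CoG) values at those times.
    Returns the new timestamps and the interpolated (M, 4) feature array.
    """
    new_t = np.arange(timestamps[0], timestamps[-1] + 1, step_s)
    resampled = np.column_stack(
        [np.interp(new_t, timestamps, features[:, k]) for k in range(features.shape[1])]
    )
    return new_t, resampled
```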
A sequence is a subset of a trajectory containing a series of successive points. The vessel trajectory prediction problem is that given a previous sequence \(\mathcal {S}=\lbrace (a_i, t_i)\rbrace _{i=n-\ell +1}^n\) with length \(\ell\) ending at time \(t_n\), to predict the adjacent future sequence \(\mathcal {S^{\prime }}=\lbrace (a_i^{\prime }, t_i)\rbrace _{i=n+1}^{n+h}\) with length h starting at time \(t_{n+1}\), as shown in Figure 1. \(a_i^{\prime }=(lat_i^{\prime }, lon_i^{\prime })\) is usually adopted in the predicted output, where the symbol \(^{\prime }\) means the predicted values rather than the ground truth. If the resampling time interval is fixed, then the prediction time horizon can be changed by directly adjusting the output sequence length h. At the same time, altering the input sequence length \(\ell\) determines how long the model focuses on the historical trajectory.
Fig. 1.
Fig. 1. Illustration of observed and predicted sequences for a vessel trajectory prediction task with resampled time intervals.

4 Methodology

This section explains the proposed method in two phases: the vessel route pattern extraction stage and the overall VesNet approach.

4.1 Route Pattern Extraction

The rationale for extracting vessel movement patterns is that vessels usually traverse a familiar route originating from position A and terminating at position B. Meanwhile, ships behave similarly to each other regarding navigating speed and course [39, 40]. Straightforwardly, the extraction relies on the vessel trajectory clustering results. Specifically, clusters whose number of members exceeds a predefined threshold are reserved. Otherwise, the clusters are regarded as noise since only a few vessels follow the route.
In this paper, we utilize a publicly available vessel tracks dataset published by Ville Hakola on the IEEE DataPort website [27]. It consists of ship tracking data in AIS form collected in the Baltic Sea from 2017 to 2019. The dataset details are exhibited in Section 5; we mention it here for a clear view of the vessel trajectory clustering procedure. Figure 2 shows the original vessel tracks, which include over one million AIS records. The region of interest is a rectangular area covering the Baltic Sea, with the x-axis representing longitude and the y-axis representing latitude. Tracks in dark blue indicate a dense data distribution, whereas light blue indicates a sparse one. We follow the steps of data preprocessing, port extraction, and trajectory clustering to realize the route pattern extraction function.
Fig. 2.
Fig. 2. Original vessel tracks in the Baltic Sea.

4.1.1 Data Preprocessing.

Firstly, we need to isolate each vessel trajectory from the whole dataset, as shown in Algorithm 1. The core actions at this stage are Steps 3–7. Among them, Step 3 segments the long track of a vessel into several short sub-tracks at the points where a sizeable temporal gap exists, since the resulting parts can be regarded as independent of each other [58]. Then, through Step 4, the trajectories starting and ending at mooring status are extracted from each sub-track. This is mainly based on the fact that a vessel seldom moors during a trajectory due to the requirement of accomplishing transportation tasks on time; typically, it is in mooring status when anchoring at a port [58]. Additionally, there exist methods like speed extraction from videos [12] for more accurate dynamic information acquisition. Trajectory points with abnormal speed are discarded in Step 5, as data denoising is of great importance [13]. We then use a linear interpolation approach to achieve an equal time interval between two consecutive points in Step 6 and discard trajectories with small sequence lengths in Step 7.
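The following fragment sketches Steps 3 and 5 under an assumed column layout (column 2 holding SoG) and assumed thresholds; the gap threshold mirrors the \(int_{th}\) value reported in Section 5.2, while the speed constants are illustrative and not the paper's actual implementation.

```python
import numpy as np

# Assumed thresholds for this sketch; only GAP_TH_S matches a value reported in the paper.
GAP_TH_S = 4 * 3600      # split a track where consecutive reports are more than 4 h apart
MAX_SPEED_KN = 40.0      # assumed upper bound; faster points are treated as noise

def split_by_gap(timestamps, records):
    """Step 3: cut one vessel's long track into independent sub-tracks at large time gaps."""
    gap_idx = np.where(np.diff(timestamps) > GAP_TH_S)[0] + 1
    return np.split(records, gap_idx)

def drop_speed_outliers(sub_track):
    """Step 5: discard points whose reported SoG is implausibly large (column 2 = SoG here)."""
    return sub_track[sub_track[:, 2] <= MAX_SPEED_KN]
```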

4.1.2 Port Extraction.

We fetch the low-speed points for vessel mooring status detection within each sub-track for port extraction. After that, we leverage the DBSCAN algorithm to classify these points into clusters automatically. Two parameters of DBSCAN are vital for the clustering results: \(\varepsilon\) is the radius of the neighborhood circle around each data point, and \(\rho\) is the minimum number of data points required inside that circle for the data point to be categorized as a core point. For two points X and Y, if (1) X lies in the neighborhood of Y, i.e., \({\rm dist}(X, Y)\le {\varepsilon }\), and (2) Y is a core point, we say that X is directly density reachable from Y; such direct density reachability is not symmetric. Furthermore, a point X is density reachable from point Y if there exists a chain of points \(p_1, p_2, \ldots , p_n\) with \(p_1=X\) and \(p_n=Y\) such that \(p_i\) is directly density reachable from \(p_{i+1}\), where \(i=1, \ldots , n-1\). Based on these concepts, the DBSCAN algorithm randomly selects a core point and forms a cluster from all points density reachable from it. The procedure keeps running until all points are allocated to a cluster; points that do not belong to any cluster are treated as noise. As shown in Figure 3(a), the points extracted from the tracks displayed in Figure 2 scatter within the region of interest, most of them close to the coastlines. With the parameters \(\varepsilon\) and \(\rho\) determined, the DBSCAN algorithm separates the points into various clusters as shown in Figure 3(b), each represented by a specific color; the noise points are colored in black. We indicate each cluster with a unique integer while denoting the noise as -1. The spatial distribution of the clusters reflects the actual locations of the ports. Port extraction is useful when a port list is unavailable, avoiding manual matching on Google Maps. A minimal code sketch of this clustering step follows Figure 3.
Fig. 3.
Fig. 3. Port extraction results in the Baltic Sea.
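A minimal sketch of this port extraction step using scikit-learn's DBSCAN is shown below; the eps and min_samples defaults are illustrative placeholders, not the \(\varepsilon\) and \(\rho\) values used in the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_ports(low_speed_points, eps_deg=0.05, min_samples=20):
    """Cluster low-speed (mooring) points into candidate port areas with DBSCAN.

    low_speed_points: (N, 2) array of (lat, lon) for points below the mooring speed.
    eps_deg / min_samples correspond to the epsilon and rho parameters in the text;
    the concrete values here are placeholders. Returns one integer label per point,
    with -1 marking noise, as in Figure 3(b).
    """
    return DBSCAN(eps=eps_deg, min_samples=min_samples).fit_predict(low_speed_points)
```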

4.1.3 Trajectory Clustering.

Based on the port extraction, each vessel trajectory can be represented in a ‘departure - arrival ports’ mode. Such a representation incorporates geographical positions and contains information on OD flow, making it a compact indicator of the vessel traffic pattern whose generation is simple yet essential. We gather the trajectories tagged with the same ‘departure - arrival ports’ label into groups, where each group stands for a typical movement flow. Moreover, to separate the trajectories with similar shapes and dynamic attributes into clusters within each OD group, we cluster trajectories with DBSCAN again but in a higher dimensional space. Specifically, we represent each trajectory by a tuple consisting of the simplified trajectory length, the whole time duration, the mean and variance of speed, and the direction of the simplified trajectory waypoints. All the elements are normalized into a comparable scale. The trajectory is simplified by methods like the Douglas-Peucker algorithm [64] to pick out waypoints that retain the trajectory shape. Figure 4 shows the vessel trajectory clustering results. Each pattern is denoted by a color, and the curves are commonly thick since several trajectories follow the same pattern. For a clearer view, we separately exhibit 9 representative route patterns in Figure 5; we do not show all the route patterns to save page space. From the examples, we can observe that vessels behave similarly within the same route pattern. In the third and fourth examples, the shape of the route patterns is almost identical; however, their departure and arrival ports are reversed, so they are treated as two different patterns. It reveals that our trajectory clustering method considers the temporal order. Furthermore, as demonstrated in Figure 6, velocity and course follow a spatially dependent distribution, showing that the vessel trajectories within the same route pattern behave similarly. A sketch of the per-trajectory feature tuple and the second DBSCAN pass is given after Figure 6.
Fig. 4.
Fig. 4. Vessel trajectories clustering results.
Fig. 5.
Fig. 5. Vessel route pattern examples.
Fig. 6.
Fig. 6. Example of velocity and course distributions.
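The sketch below illustrates how the per-trajectory feature tuple and the second DBSCAN pass described above could be assembled; the exact feature computations, the use of MinMaxScaler, and the clustering parameters are assumptions of this illustration rather than the paper's code.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import MinMaxScaler

def trajectory_feature(traj, waypoints):
    """Build the per-trajectory feature tuple used for clustering within an OD group.

    traj:      (N, 5) array of (lat, lon, SoG, CoG, t) with a fixed time step.
    waypoints: (M, 2) simplified (lat, lon) points, e.g. from Douglas-Peucker.
    """
    seg = np.diff(waypoints, axis=0)
    length = np.sum(np.linalg.norm(seg, axis=1))            # simplified trajectory length
    duration = traj[-1, 4] - traj[0, 4]                     # whole time duration
    mean_v, var_v = traj[:, 2].mean(), traj[:, 2].var()     # speed statistics
    mean_dir = np.degrees(np.arctan2(seg[:, 1], seg[:, 0])).mean()  # coarse waypoint direction
    return np.array([length, duration, mean_v, var_v, mean_dir])

def cluster_od_group(features, eps=0.1, min_samples=3):
    """Second DBSCAN pass over normalized feature tuples within one OD group."""
    scaled = MinMaxScaler().fit_transform(features)          # bring features to a comparable scale
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scaled)
```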

4.2 VesNet

4.2.1 Recurrent Neural Networks.

RNNs have been applied to various problems: speech recognition, language modeling, translation, image captioning, conversational robots, article abstract generation, and so on. However, the RNN suffers from gradient vanishing or explosion issues and the drawback of being unable to handle long-term dependencies. LSTM, a particular type of RNN, was developed to overcome these shortcomings. Unlike the plain RNN, LSTM has a more complicated structure. It relies on gates to adjust the information in the cell state. The following equations elaborate on the operations:
\begin{align} f_t & = \sigma (W_f\times [h_{t-1}, x_t] + b_f), \end{align}
(1)
\begin{align} i_t & = \sigma (W_i\times [h_{t-1}, x_t] + b_i), \end{align}
(2)
\begin{align} \widetilde{C_t} & = \tanh (W_C\times [h_{t-1}, x_t] + b_C), \end{align}
(3)
\begin{align} C_t & = f_t\cdot {C_{t-1}} + i_t\cdot \widetilde{C_t}, \end{align}
(4)
\begin{align} o_t & = \sigma (W_o\times [h_{t-1}, x_t] + b_o), \end{align}
(5)
\begin{align} h_t & = o_t\cdot \tanh (C_t). \end{align}
(6)
The first component is the forget gate layer. It takes the previous hidden state \(h_{t-1}\) and the current input \(x_t\) to decide how much content to keep from the previous cell state \(C_{t-1}\), as displayed in (1). The next step is to decide what new information to store in the cell state, which is done by (2) and (3). Then the new cell state \(C_t\) is updated in (4), forgetting old content and adding new candidate values. Finally, the output is a filtered version based on the cell state, as shown in (5) and (6).
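A minimal NumPy sketch of one LSTM step following Equations (1)–(6) is given below; the weight shapes are assumed (each W acts on the concatenation of \(h_{t-1}\) and \(x_t\)), and this is an illustration rather than the layer used in our implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    """One LSTM step following Equations (1)-(6); each W multiplies [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W_f @ z + b_f)              # (1) forget gate
    i_t = sigmoid(W_i @ z + b_i)              # (2) input gate
    c_tilde = np.tanh(W_c @ z + b_c)          # (3) candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde        # (4) cell state update
    o_t = sigmoid(W_o @ z + b_o)              # (5) output gate
    h_t = o_t * np.tanh(c_t)                  # (6) hidden state
    return h_t, c_t
```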

4.2.2 Seq2seq Structure.

Unlike the basic RNN architecture, which is dedicated to many-to-one tasks or many-to-many tasks with the same input-output length, seq2seq can handle many-to-many tasks with unequal input and output lengths. Being successful in applications such as translation, chatbots, and time series prediction, seq2seq utilizes an encoder-decoder framework to connect the sequential input features and the corresponding output predictions. Specifically, the encoder converts the input sequence into a fixed-length hidden vector, which implicitly contains the temporal and spatial relationships. The hidden vector represents a point within the high-dimensional space, where similar sequences are close to each other. Upon iteratively interpreting the hidden state, the target sequence is recovered without limiting the output length. Therefore, seq2seq is suitable for time series prediction, focusing on forecasting a future period rather than a single timestep. It relies on either LSTM or GRU units as the fundamental component.
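For illustration, a minimal LSTM encoder-decoder in TensorFlow/Keras (the library our implementation uses, see Section 5.5) might look as follows; the layer sizes, sequence lengths, and the RepeatVector-based decoder input are assumptions of this sketch, which omits the attention and MTL blocks described later.

```python
import tensorflow as tf

def build_seq2seq(l=12, h=24, n_features=4, n_units=64):
    """Minimal LSTM encoder-decoder sketch; shapes and sizes are illustrative."""
    enc_in = tf.keras.Input(shape=(l, n_features))
    # Encoder: keep the final hidden and cell states as the context.
    _, state_h, state_c = tf.keras.layers.LSTM(n_units, return_state=True)(enc_in)
    # Decoder: roll an LSTM forward for h steps, seeded with the encoder's final state.
    dec_in = tf.keras.layers.RepeatVector(h)(state_h)
    dec_out = tf.keras.layers.LSTM(n_units, return_sequences=True)(
        dec_in, initial_state=[state_h, state_c])
    preds = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(2))(dec_out)  # (lat, lon)
    return tf.keras.Model(enc_in, preds)
```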

4.2.3 Attention Mechanism.

In the conventional seq2seq model, the input sequence is mapped to a hidden vector, regarded as a high-dimensional representation for decoding the output sequence without further delicate operations. The attention mechanism was first proposed in machine translation to align strongly correlated words across languages, even though they are not in the same position. The key idea is that each output timestep queries the context produced by the encoder and pays a different amount of attention to each input timestep, which captures the critical dependencies even when the input sequence is relatively long. The overall seq2seq model with attention, taking LSTM as the basis, can be described by
\begin{align} attn_i & = \sum _{j=1}^l\alpha _{i,j}h_j, \end{align}
(7)
\begin{align} \alpha _{i,j} & = \frac{{\rm exp}(u_{i,j})}{\sum _{j=1}^l{\rm exp}(u_{i,j})}, \end{align}
(8)
\begin{align} u_{i,j} & = v^{\rm T}\cdot {\rm tanh}(W_{ih}h_j+W_{oh}h_i), \end{align}
(9)
where \(attn_i\), the attentional context vector corresponding to the \(i\)th position, is a weighted sum of the output sequence generated by the encoder, denoted by \(h_j\) with length l, where l is the input sequence length of the encoder. The coefficient \(\alpha _{i,j}\) is calculated similarly to the softmax, obtaining the proportion of how much attention the decoder should pay to each encoder output. \(v\), \(W_{ih}\), and \(W_{oh}\) are all learnable parameters, and \(h_i\) is the current hidden state produced by the decoder based on the last timestep status (\(h_{i-1}, C_{i-1}\)) and the predicted feature \(x^{\prime }_{i-1}\).
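The sketch below evaluates Equations (7)–(9) for a single decoder step with NumPy; the parameter shapes are assumptions, and in practice the weights would be learned rather than supplied directly.

```python
import numpy as np

def additive_attention(enc_outputs, dec_hidden, v, W_ih, W_oh):
    """Additive attention following Equations (7)-(9).

    enc_outputs: (l, d) encoder hidden states h_1..h_l.
    dec_hidden:  (d,) current decoder hidden state h_i.
    v: (d,), W_ih, W_oh: (d, d) learnable parameters (assumed shapes).
    """
    u = np.tanh(enc_outputs @ W_ih.T + dec_hidden @ W_oh.T) @ v   # (9) scores u_{i,j}
    alpha = np.exp(u - u.max())
    alpha /= alpha.sum()                                          # (8) softmax weights
    attn = alpha @ enc_outputs                                     # (7) context vector attn_i
    return attn, alpha
```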

4.2.4 VesNet Structure.

Having resampled the raw AIS data during preprocessing, extracted the maritime route patterns by trajectory clustering, and introduced the attentional seq2seq time series prediction framework, we propose a deep neural network called VesNet, which jointly learns to classify the input historical vessel movement sequence into the appropriate route pattern and to forecast the future navigation sequence within a period. The overall structure of VesNet is shown in Figure 7. It consists of three modules: the encoder, the decoder, and the MTL block. VesNet is an extended variant of the attentional seq2seq model, where the intermediate hidden vector supports both route pattern recognition and trajectory prediction. VesNet takes the historical vessel movement sequence as input and outputs the route pattern classification result and the future movement sequence, varying from minutes to hours.
Fig. 7.
Fig. 7. Structure of VesNet.
As demonstrated in Figure 7, a vessel movement sequence (\(a_1, a_2, \ldots , a_l\)) with a uniform time interval is fed into the encoder. Each timestep input \(a_j\), where \(j=1,2,\ldots ,l\), is a four-dimensional vector comprised of vessel latitude, longitude, velocity, and course. We leverage min-max normalization for feature scaling, which constrains the input features to the range [0, 1]. After processing the input with a sequential LSTM recurrent network, we collect the returned sequence in preparation for the upcoming attention mechanism. Meanwhile, the last timestep hidden state \(h_l\), which contains the spatial and temporal characteristics of the input sequence, is concatenated with the latent representation of the departure port extracted from the historical sequence. The merged hidden vector is reserved for later route pattern classification and vessel trajectory prediction. On the decoder side, another LSTM recurrent network with length h is deployed to predict the vessel's future movement sequence. We set h to a value sufficient to cover both short-term and long-term predictions; stopping the inference at different timesteps makes VesNet predict over different time lengths. At each timestep, the decoder operates on the last timestep status (\(h^{\prime }_{i-1}, C^{\prime }_{i-1}\)) and output \(a^{\prime }_{i-1}\), where \(i=1,2,\ldots ,h\), to generate the current timestep hidden state \(h^{\prime }_i\). Note that \(h^{\prime }_0\) and \(C^{\prime }_0\) are the merged hidden vectors generated by the encoder. Later on, with the help of the attention mechanism elaborated in (7)–(9), we utilize \(h^{\prime }_i\) to query the contextual sequence obtained by the encoder, resulting in the weighted sum vector \(attn_i\). Finally, the MTL block is responsible for route pattern classification and future movement sequence forecasting. On the one hand, a softmax function activates the merged hidden vector to match the one-hot version of the route pattern cluster labeled in Section 4.1; the classification result r is then embedded to match the dimension of the attentional context. On the other hand, we concatenate \(h^{\prime }_i\), \(attn_i\), and \({\rm embedding}(r)\) to merge the hidden status of the current timestep, the queried historical context, and the route pattern information into a hybrid vector exploited for single-timestep vessel movement prediction. Specifically, the hybrid vector goes through a Traj Output (TO) module, which sequentially connects a dense layer, a ReLU layer, another dense layer, and a sigmoid layer. By iteratively following this process, a vessel movement sequence is produced. After reversing the min-max normalization, we derive the ultimate forecasts. The primary purpose of jointly learning the route pattern classification and the future movement sequence is to constrain the predictions within a prior statistical distribution using the auxiliary information provided by the extracted route knowledge, improving the forecasting precision.
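The following Keras-style fragment sketches the MTL block described above; the layer widths, the dense route embedding, and the tensor names are assumptions made for illustration, and the full VesNet wiring (the per-timestep decoding loop and the attention query) is omitted for brevity.

```python
import tensorflow as tf

def mtl_heads(merged_hidden, dec_hidden, attn_ctx, n_patterns=57, embed_dim=32):
    """Sketch of the MTL block: route classification plus one-step trajectory output.

    merged_hidden: encoder state concatenated with the departure-port representation.
    dec_hidden, attn_ctx: current decoder state h'_i and attention context attn_i.
    Layer sizes are illustrative, not the paper's configuration.
    """
    # Route pattern classification over the merged hidden vector (softmax over clusters).
    route_probs = tf.keras.layers.Dense(n_patterns, activation="softmax")(merged_hidden)
    route_embed = tf.keras.layers.Dense(embed_dim)(route_probs)        # embedding(r)
    # Hybrid vector feeding the Traj Output (TO) module: dense -> ReLU -> dense -> sigmoid.
    hybrid = tf.keras.layers.Concatenate()([dec_hidden, attn_ctx, route_embed])
    x = tf.keras.layers.Dense(64, activation="relu")(hybrid)
    step_out = tf.keras.layers.Dense(2, activation="sigmoid")(x)        # normalized (lat, lon)
    return route_probs, step_out
```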

4.2.5 VesNet Training.

Lastly, we elaborate on the end-to-end training procedure of our proposed model. Recall that we concurrently predict the route pattern category and the vessel movement sequence during the training phase. In terms of the route pattern classification, we adopt the cross entropy as the loss function
\begin{equation} \mathcal {L}_1(\theta)=-\sum _{\tau \in \mathcal {D}_{train}}\sum _{n=1}^{|\tau |}\sum _{k=1}^K 1\lbrace r^n=r_k\rbrace log(R_{\theta }(r^{\prime n}=r_k|a_1^n, a_2^n, \ldots , a_l^n)), \end{equation}
(10)
where \(\tau\) is a subset of the training dataset \(\mathcal {D}_{train}\) containing the data within one batch, with size \(|\tau |\). K is the total number of extracted route pattern categories, \(r^n\) is the ground truth route pattern, extracted in Section 4.1 to serve as the classifier training label, and \(r^{\prime n}\) is the classified pattern. \(R_{\theta }\) is the neural network for route pattern classification conditioned on a given input sequence (\(a_1^n, a_2^n, \ldots , a_l^n\)). Meanwhile, we choose the mean absolute error as the loss function for predicting the future vessel movement sequence
\begin{equation} \mathcal {L}_2(\theta)=\sum _{\tau \in \mathcal {D}_{train}}\sum _{n=1}^{|\tau |}\left|a_{out}^n-P_{\theta }\left(a_1^n, a_2^n, \ldots , a_l^n\right)\right|\!, \end{equation}
(11)
where \(a_{out}^n\) is the ground truth future movement sequence, and \(P_{\theta }\) is the neural network for movement prediction. All other notation is as defined above. To sum up, the integrated optimization function is a weighted combination of the two loss functions, expressed as
\begin{equation} \mathcal {L}=\mathcal {L}_1(\theta)+\lambda \mathcal {L}_2(\theta), \end{equation}
(12)
where \(\lambda\) is the coefficient that balances the two learning tasks during the training stage. Algorithm 2 illustrates the whole training process of the proposed VesNet model. During the entire training procedure, we employ the gradient descent approach to update the parameters \(\theta\), with a learning rate lr and a preconfigured maximum iteration number \(epoch_{max}\). Firstly, we prepare the input and output pairs for training in Step 1. Then, we calculate the gradient of the loss function in Step 3 and update the parameters \(\theta\) scaled by the factor lr in Step 4. The update procedure is repeated until the maximum iteration constraint is reached or some other early stopping condition is met. Finally, the trained VesNet model is used for further testing and maritime-related applications.
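A minimal TensorFlow sketch of the combined objective in Equations (10)–(12) is given below; the function name and the default lam value (set to 100, the best value found in Section 6.5) are assumptions of this illustration.

```python
import tensorflow as tf

def vesnet_loss(r_true, r_pred, traj_true, traj_pred, lam=100.0):
    """Combined objective of Equations (10)-(12): cross entropy plus lambda * MAE.

    r_true is one-hot over route patterns, r_pred the softmax output;
    traj_* hold normalized coordinates of shape (batch, h, 2).
    """
    l1 = tf.keras.losses.categorical_crossentropy(r_true, r_pred)          # Eq. (10)
    l2 = tf.reduce_mean(tf.abs(traj_true - traj_pred), axis=[-2, -1])      # Eq. (11), MAE
    return tf.reduce_mean(l1 + lam * l2)                                   # Eq. (12)
```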

5 Experimental Setup

5.1 Dataset Description

We validate the effectiveness of the VesNet model on a real-world vessel trajectory dataset. The dataset contains about 1 million AIS records, including maritime navigation-related static and dynamic information. We display a few data entries and explain the meaning of each field in Table 1. The data was collected from 2017 to 2019 in the Baltic Sea and reported by various vessels such as cargo ships, tankers, tugs, and the like. The raw data is sampled at irregular intervals ranging from minutes to hours, which requires further processing. The region of interest covers a roughly rectangular area from (\(9^{\circ }{\rm E}\), \(53^{\circ }{\rm N}\)) to (\(31^{\circ }{\rm E}\), \(67^{\circ }{\rm N}\)), which is about 1465.5 km long and 1555.8 km wide.
Table 1.
Timestamp        | MMSI      | Lat (degree) | Lon (degree) | Speed (knot) | Course (degree) | Vessel type
2017/12/14 12:42 | 205451000 | 57.7413      | 10.4010      | 9.31         | 59.1            | RORO
2017/12/14 12:48 | 205451000 | 57.7539      | 10.4419      | 9.16         | 64.0            | RORO
2017/12/14 13:02 | 205451000 | 57.7840      | 10.5772      | 9.52         | 73.3            | RORO
2017/12/14 13:13 | 205451000 | 57.8118      | 10.6563      | 9.41         | 52.8            | RORO
2017/12/14 13:23 | 205451000 | 57.8204      | 10.7460      | 9.83         | 102.4           | RORO
Table 1. Raw AIS Data Examples

5.2 Experimental Settings

At first, we follow Algorithm 1 to process the raw data. In our experiments, we set the parameters involved at this stage as \(int_{th}=4\) hours, \(len_{th}=100\) mins, and \(intpl=5\) mins. After executing Algorithm 1, we obtained 1,851 independent vessel trajectories, each uniformly sampled every 5 minutes. The goal of achieving a uniform time interval is to directly match the time horizon with the sequence length, mitigating the prediction error inherently caused by an unequal sampling rate. Secondly, we apply the route pattern extraction approach discussed in Section 4.1 to the processed vessel trajectories, resulting in 57 clusters. We label each vessel trajectory with an integer indicating its cluster ID, and trajectories that belong to the same route pattern share a common number. At the last step of data preprocessing, we adopt the sliding window method to construct the input and output datasets. Specifically, from the beginning of each vessel trajectory, we let the first l timesteps of AIS data be the input sequence and the next h timesteps be the output sequence for trajectory prediction, together with the allocated cluster ID as the output for route pattern classification. By consecutively sliding from the initial to the last timestep, we format the intermediately processed trajectory data into input and output sets. Moreover, we split the dataset into training, validation, and test sets with a splitting ratio of 6:2:2 for the upcoming training stage and the performance evaluation.
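A minimal sketch of this sliding-window construction, assuming each resampled trajectory is a NumPy array of (lat, lon, SoG, CoG) rows, could look as follows; the function name and the choice to return only latitude and longitude as targets are illustrative assumptions.

```python
import numpy as np

def sliding_windows(traj, l, h):
    """Cut one resampled trajectory into (input, output) pairs for training.

    traj: (N, 4) array of (lat, lon, SoG, CoG) sampled every 5 minutes.
    Returns inputs of shape (M, l, 4) and trajectory targets of shape (M, h, 2).
    """
    xs, ys = [], []
    for start in range(len(traj) - l - h + 1):
        xs.append(traj[start:start + l])                    # observed sequence
        ys.append(traj[start + l:start + l + h, :2])        # future (lat, lon) only
    return np.stack(xs), np.stack(ys)
```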

5.3 Evaluation Metrics

Since the primary purpose of our formulated problem is vessel movement prediction, we adopt the error between the predicted GPS location and the ground truth to evaluate the performance of the trained VesNet model on the test dataset. Specifically, MAE and RMSE are selected to assess the offset on both latitude and longitude in degree. We also choose the mean earth distance between the prediction and ground truth to evaluate the precision. The three metrics are calculated below
\begin{align} \rm {MAE} & = \frac{1}{m}\sum _{i=1}^m|loc_i - loc^{\prime }_i|, \end{align}
(13)
\begin{align} \rm {RMSE} & = \sqrt {\frac{1}{m}\sum _{i=1}^m(loc_i - loc^{\prime }_i)^2}, \end{align}
(14)
\begin{align} e_{dist} & = \frac{1}{m}\sum _{i=1}^m hav(loc_i, loc^{\prime }_i), \end{align}
(15)
where \(loc_i=(lat_i, lon_i)\) is a tuple representing the ground truth vessel location, \(loc^{\prime }_i=(lat^{\prime }_i, lon^{\prime }_i)\) is the predicted result, and m is the total size of the test dataset. In (13), \(|loc_i-loc^{\prime }_i|=\frac{1}{2}(|lat_i-lat^{\prime }_i|+|lon_i-lon^{\prime }_i|)\) is the MAE considering both latitude and longitude for the \(i\)th sample in the test dataset. Similarly, in (14) \((loc_i-loc^{\prime }_i)^2=\frac{1}{2}((lat_i-lat^{\prime }_i)^2+(lon_i-lon^{\prime }_i)^2)\) is the MSE, and in (15) \(hav(loc_i, loc^{\prime }_i)\) is the earth distance. \(hav()\) is the haversine function used to calculate the great circle distance between two points, that is, the shortest distance over the earth's surface
\begin{align} & hav((lat_i, lon_i), (lat^{\prime }_i, lon^{\prime }_i)) = R \cdot c, \end{align}
(16)
\begin{align} & c = 2 \cdot {\rm atan2}(\sqrt {a}, \sqrt {1-a}), \end{align}
(17)
\begin{align} & a = {\rm sin^2}\left(\frac{\Delta \varphi }{2}\right)+{\rm cos}\varphi _i \cdot {\rm cos}\varphi ^{\prime }_i \cdot {\rm sin^2}\left(\frac{\Delta \vartheta }{2}\right)\!, \end{align}
(18)
where R is the earth's radius, \(\varphi\) is latitude, and \(\vartheta\) is longitude, both converted to radians rather than degrees. \(\Delta \varphi\) and \(\Delta \vartheta\) are the latitudinal and longitudinal differences, respectively. Smaller values of MAE, RMSE, and \(e_{dist}\) indicate more precise vessel trajectory prediction.
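For reference, a vectorized implementation of the haversine distance in Equations (16)–(18) is sketched below; the mean earth radius constant is an assumption of this sketch.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean earth radius; an assumption of this sketch

def haversine_km(lat, lon, lat_p, lon_p):
    """Great-circle distance (km) between ground truth and predictions, Eqs. (16)-(18)."""
    phi, phi_p = np.radians(lat), np.radians(lat_p)
    d_phi = np.radians(lat_p - lat)          # latitudinal difference in radians
    d_theta = np.radians(lon_p - lon)        # longitudinal difference in radians
    a = np.sin(d_phi / 2.0) ** 2 + np.cos(phi) * np.cos(phi_p) * np.sin(d_theta / 2.0) ** 2
    c = 2.0 * np.arctan2(np.sqrt(a), np.sqrt(1.0 - a))   # Eq. (17)
    return EARTH_RADIUS_KM * c                            # Eq. (16)
```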
Concerning the evaluation of route pattern classification accuracy, we utilize Recall and Precision as the corresponding metrics. The predicted route pattern set is denoted as \(\mathcal {E}_P\), and the ground truth as \(\mathcal {E}_G\). Recall is defined as \(recall=\frac{|\mathcal {E}_P \cap \mathcal {E}_G|}{|\mathcal {E}_G|}\), and Precision as \(precision=\frac{|\mathcal {E}_P \cap \mathcal {E}_G|}{|\mathcal {E}_P|}\). Larger values of Recall and Precision indicate more accurate vessel route pattern classification.

5.4 Baselines

We compare our proposed VesNet with several representative baselines specially designed for maritime trajectory prediction, including classic and recently developed approaches. Below, we give a brief introduction to each of the baseline algorithms for a clearer understanding of the underlying mechanisms.
CVM [56]: the most commonly used vessel trajectory prediction tool in real-world scenarios. It utilizes the latest velocity and course to linearly infer the future movement sequence.
ARIMA [49]: short for Auto Regressive Integrated Moving Average, a classic method for time series prediction, for example, house and stock price prediction. It captures a set of standard temporal relations in the time series data. For latitude and longitude, we establish two separate ARIMA models, and the combined results are the predicted locations.
TREAD [40]: the maritime system developed by NATO, which is implemented for vessel trajectory prediction and anomaly detection. It first extracts the route pattern and then generates a synthetic representing trajectory for each pattern, which is equivalent to the median one. For the observed vessel movement sequence, TREAD classifies it into a category based on conditional probability and follows the prepared synthetic route to make predictions.
LSTM Seq2Seq [23]: a vessel trajectory prediction approach implemented as a seq2seq framework, with the LSTM unit as the fundamental component. It takes the normalized latitude and longitude data as input.
ST-Seq2Seq [62]: though the structure of ST-Seq2Seq is similar to LSTM Seq2Seq, the main distinction is that ST-Seq2Seq takes the \(\Delta\) value of latitude and longitude as input. It calculates the difference between two successive vessel movement locations during the data preprocessing stage.
EncDec-ATTN [8]: besides seq2seq, it involves the attention mechanism for dealing with long input sequences.

5.5 Implementations

We implemented the baselines and VesNet with the machine learning library TensorFlow, version 2.7.0, and the Python tools NumPy, statsmodels, and scikit-learn. We conducted model training experiments on a GPU server with 80 GB memory and an Nvidia GeForce RTX 2080Ti GPU.

6 Performance Evaluation

6.1 Overall Performance

We compare our VesNet with the baseline models in terms of MAE, RMSE, and \(e_{dist}\) under different prediction time horizons, which are 5 min, 10 min, 30 min, 60 min, and 120 min. The overall results are shown in Table 2. We have the following observations and corresponding analyses.
Table 2.
Table 2. Overall Vessel Trajectory Prediction Comparison under Different Time Horizons
The model with the best performance differs across prediction lengths. For the short term, like 5 min and 10 min, CVM and ST-Seq2Seq achieve the best prediction precision. The reason is that a vessel moves differently from a human or vehicle and cannot rapidly change its velocity and course; therefore, the movement of a ship is almost linear within a short period. Besides CVM, which makes predictions based on linear inference, ST-Seq2Seq learns the changing trend by observing the input sequence. If the observed sequence is close to linear, ST-Seq2Seq is likely to make a linear prediction with high probability. The authors also claim that ST-Seq2Seq is suitable for short-term prediction [62]. However, for long-term predictions like 30 min, 60 min, and 120 min, our VesNet has the best performance, and other deep neural network models like LSTM Seq2Seq and EncDec-ATTN have performance close to VesNet's. Both the route pattern classification and the attention mechanism contribute to VesNet's long-term sequence prediction.
Both ARIMA and TREAD have poor performance in all experiments. ARIMA predicts latitude and longitude separately with two independent models, which ignores their internal correlation. Moreover, ARIMA is good at predicting regularly fluctuating sequences, whereas the change in a vessel's latitude and longitude is relatively steady. For TREAD, there is first the risk of route pattern misclassification. During the prediction phase, it searches all AIS records falling in a neighboring region to obtain the mean velocity and course; then, based on the derived velocity, course, and time interval, it predicts the next timestep location, thus outputting the future sequence step by step. The search range is the set of AIS records from the trajectories belonging to the same route pattern category. Nonetheless, the surrounding velocity and course may deviate severely from the current trajectory, leading to significant prediction error.
As the prediction length increases, the performance decreases for each method. When comparing the models, CVM and ST-Seq2Seq become worse in long-term prediction, while LSTM Seq2Seq and EncDec-ATTN turn out better for long-term forecasts thanks to their capability of handling long sequences. CVM and ST-Seq2Seq are suitable for 5- and 10-minute predictions. Meanwhile, our VesNet is the best choice for 30-minute, 60-minute, and 120-minute predictions.

6.2 Route Pattern Classification Performance

Additionally, we evaluate the route pattern classification capability of VesNet under various prediction time horizons, as shown in Table 3. Observing the results, we find that for each prediction length, Precision is better than Recall. Meanwhile, from 5 min to 120 min, as the prediction time horizon extends, the route pattern classification ability of VesNet improves. This is because the input sequence length increases accordingly to obtain better vessel trajectory prediction performance and is therefore more likely to describe the underlying route pattern. It also provides evidence that route pattern classification benefits vessel trajectory prediction when it becomes long-term.
Table 3.

Method     5 min               10 min              30 min              60 min              120 min
           Recall   Precision  Recall   Precision  Recall   Precision  Recall   Precision  Recall   Precision
VesNet     0.4409   0.5601     0.4837   0.5938     0.6835   0.7925     0.7066   0.8121     0.7344   0.8216

Table 3. VesNet Route Pattern Classification Accuracy Performance Comparison under Different Time Horizons
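As a point of reference, the per-horizon recall and precision in Table 3 can be computed from the predicted and ground-truth route-pattern labels. The sketch below assumes macro-averaging across route-pattern classes, which the paper does not state explicitly, and uses illustrative labels.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Illustrative labels: one route-pattern class per test trajectory.
y_true = np.array([0, 0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 1, 2, 0, 1])

precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall = recall_score(y_true, y_pred, average="macro", zero_division=0)
print(f"precision={precision:.4f}, recall={recall:.4f}")
```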

6.3 Prediction Error Distribution

Besides the mean error distance \(e_{dist}\) reported in Table 2, we illustrate the distribution of \(e_{dist}\) in Figure 8. Each column corresponds to a prediction time length (from left to right: 5 min, 10 min, 30 min, 60 min, and 120 min), and each row corresponds to one method. In each plot, the x-axis is the error distance in km, the y-axis is the distribution density, the red line marks the mean \(e_{dist}\), and the green line marks the median \(e_{dist}\) (a plotting sketch is given after the figure). The results show that CVM and ST-Seq2Seq have smaller error distances when predicting the next 5 and 10 min of the vessel trajectory, with CVM giving the more precise predictions. ARIMA and TREAD predict poorly in all cases, although their \(e_{dist}\) distributions are relatively stable. Our VesNet achieves the best precision for 30-, 60-, and 120-minute future trajectory prediction, and its \(e_{dist}\) grows only slightly as the prediction horizon lengthens. In almost all cases the median \(e_{dist}\) is smaller than the mean, indicating that more than 50% of the predictions perform better than the average.
Fig. 8.
Fig. 8. Prediction error distance distribution.
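Each panel of Figure 8 can be approximated with a short plotting routine such as the one below. This is a histogram-based sketch; the figure in the paper may use kernel density estimation instead, and the styling and placeholder data are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_error_panel(e_dist_km, ax, title):
    """One panel: density of per-sample error distance with mean (red) and median (green) lines."""
    ax.hist(e_dist_km, bins=50, density=True, alpha=0.6)
    ax.axvline(np.mean(e_dist_km), color="red", label="mean")
    ax.axvline(np.median(e_dist_km), color="green", label="median")
    ax.set_xlabel("error distance (km)")
    ax.set_ylabel("density")
    ax.set_title(title)
    ax.legend()

# Example: one row of panels for a single method across the five horizons.
fig, axes = plt.subplots(1, 5, figsize=(20, 3))
for ax, horizon in zip(axes, ["5 min", "10 min", "30 min", "60 min", "120 min"]):
    plot_error_panel(np.random.gamma(2.0, 1.0, size=1000), ax, horizon)  # placeholder data
plt.tight_layout()
```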

6.4 Case Study

As a case study, this section examines prediction performance at a waypoint, where the vessel changes course. As shown in Figure 9, we compare the results for 5- and 10-minute predictions. For a clearer view, we compare our VesNet only with the best neural network baseline, EncDec-ATTN, and the non-neural-network baseline CVM. In both cases, CVM extrapolates the historical velocity and course into a linear prediction. EncDec-ATTN is aware of the direction change but still shows some offset. VesNet predicts the course change and achieves the best performance at the waypoint because it learns to follow the route pattern. CVM's overall performance is better for 5- and 10-minute predictions, as shown in Table 2, because at this temporal granularity the vessel trajectory is almost linear and waypoint cases are few; at the waypoint itself, VesNet outperforms the linear prediction approaches.
Fig. 9.
Fig. 9. Prediction performance comparison at the waypoint.

6.5 Parameter Tuning

MTL coefficient \(\boldsymbol {\lambda }\). To demonstrate the effectiveness of the MTL block, we evaluate VesNet under different values of \(\lambda\) in the range of 1, 10, 50, and 100. As illustrated in Figure 10(a), the red and green bars show MAE and RMSE, and in Figure 10(b) the lines represent \(e_{dist}\). The best \(e_{dist}\) is obtained at \(\lambda =100\), which indicates that the future trajectory prediction task is the core component while the route pattern classification task is auxiliary (a sketch of the weighted joint loss follows the figures).
Input historical length. To show the impact of the historical trajectory, we evaluate VesNet under different input lengths of 2, 4, 6, 8, 10, and 12. As Figure 11 demonstrates, the optimal input length differs per time horizon: for 5-, 10-, 30-, 60-, and 120-minute predictions, the best input lengths are 6, 8, 12, 12, and 10, respectively. The input length should therefore be re-tuned whenever the prediction time horizon changes. In general, the optimal input length grows with the prediction length, since a short input history cannot reflect long-term vessel movement regularity.
Fig. 10.
Fig. 10. Varying of the MTL balance coefficient \(\lambda\).
Fig. 11.
Fig. 11. Varying of the input length.
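The role of \(\lambda\) can be read from the joint objective. The sketch below assumes a PyTorch-style implementation in which \(\lambda\) scales the trajectory-regression term, so a larger value emphasizes the core prediction task, consistent with the observation above; the actual formulation in the paper may instead place the weight on the classification term, and the loss choices are illustrative.

```python
import torch
import torch.nn.functional as F

def vesnet_mtl_loss(pred_traj, true_traj, pattern_logits, pattern_labels, lam=100.0):
    """
    Joint MTL objective (illustrative): a regression loss for the future trajectory
    plus a cross-entropy loss for route-pattern classification, balanced by `lam`.
    """
    traj_loss = F.mse_loss(pred_traj, true_traj)                  # (B, T, 2) predicted vs. true positions
    cls_loss = F.cross_entropy(pattern_logits, pattern_labels)    # (B, C) logits vs. (B,) class labels
    return lam * traj_loss + cls_loss
```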

6.6 Ablation Study

We perform ablation experiments by removing the two mechanisms, attention and MTL, as well as one fundamental phase, the data preprocessing procedure, to analyze their impact on the prediction performance, especially on \(e_{dist}\); Table 4 shows the results. All variants are compared against the fully functional VesNet. No-attention & MTL removes both the attention mechanism and the MTL block for route pattern classification, which is equivalent to the baseline LSTM Seq2Seq. No-MTL removes only the MTL block, which is equivalent to the baseline EncDec-ATTN. No-attention removes the attention mechanism but keeps the MTL block. No-preprocessing removes the data preprocessing procedure, so results are derived directly from the raw data.
Table 4.
Table 4. Impact of Attention, MTL, and Data Preprocessing, where \(\Delta\) Indicates the Performance Decline
The performance degradation shows that removing both attention and MTL has the second most significant impact, which confirms that combining the attentional seq2seq model with route pattern classification is effective. Removing only attention or only MTL degrades performance more modestly, suggesting that attention captures an individual vessel's movement regularity while MTL learns the overall pattern behavior. Removing the data preprocessing procedure degrades the prediction performance most drastically: the model is then trained on raw data without trajectory segmentation, abnormal speed alleviation, uniform interpolation, or trajectory clustering, and therefore lacks the instructional route pattern as a priori knowledge.
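For illustration, the uniform interpolation step of the preprocessing pipeline can be sketched as follows. The 5-minute resampling interval and the use of plain linear interpolation are assumptions; the other preprocessing steps (segmentation, abnormal speed alleviation, clustering) are not shown.

```python
import numpy as np

def resample_uniform(timestamps_s, latlon_deg, step_s=300):
    """
    Linearly interpolate an AIS trajectory onto a uniform time grid.
    timestamps_s: (N,) increasing timestamps in seconds.
    latlon_deg:   (N, 2) latitude/longitude in degrees.
    Returns the new time grid and the resampled (lat, lon) positions.
    """
    grid = np.arange(timestamps_s[0], timestamps_s[-1] + 1, step_s)
    lat = np.interp(grid, timestamps_s, latlon_deg[:, 0])
    lon = np.interp(grid, timestamps_s, latlon_deg[:, 1])
    return grid, np.stack([lat, lon], axis=1)
```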

7 Conclusions

This paper proposes VesNet, which adopts the attentional seq2seq framework for vessel trajectory prediction. We first extract route patterns from the raw AIS data and then design an MTL structure that jointly learns the route pattern and the future trajectory. Experimental results on MAE, RMSE, and error distance show the superiority of VesNet in long-term vessel trajectory prediction, and learning to classify the route pattern helps improve the prediction task. Incorporating more vessel-specific attributes into the prediction is a promising direction for future work.

References

[1]
Virginia Fernandez Arguedas, Fabio Mazzarella, and Michele Vespe. 2015. Spatio-temporal data mining for maritime situational awareness. In OCEANS 2015-Genova. IEEE, 1–8.
[2]
Regina Asariotis, Gonzalo Ayala, Mark Assaf, Celine Bacrot, Hassiba Benamara, Dominique Chantrel, Amélie Cournoyer, Marco Fugazza, Poul Hansen, Jan Hoffmann, Tomasz Kulaga, Anila Premti, Luisa Rodríguez, Benny Salo, Kamal Tahiri, Hidenobu Tokuda, Pamela Ugaz, and Frida Youssef. 2021. Review of Maritime Transport 2021. Technical Report. United Nations Conference on Trade and Development.
[3]
Stefan Atev, Grant Miller, and Nikolaos P. Papanikolopoulos. 2010. Clustering of vehicle trajectories. IEEE Transactions on Intelligent Transportation Systems 11, 3 (2010), 647–657.
[4]
Jonathan Baxter. 1997. A Bayesian/information theoretic model of learning to learn via multiple task sampling. Machine Learning 28, 1 (1997), 7–39.
[5]
Robert A. Best and J. P. Norton. 1997. A new model and efficient tracker for a target with curvilinear motion. IEEE Trans. Aerospace Electron. Systems 33, 3 (1997), 1030–1037.
[6]
Francesco Calabrese, Giusy Di Lorenzo, and Carlo Ratti. 2010. Human mobility prediction based on individual and collective geographical preferences. In 13th International IEEE Conference on Intelligent Transportation Systems. IEEE, 312–317.
[7]
Samuele Capobianco, Nicola Forti, Leonardo M. Millefiori, Paolo Braca, and Peter Willett. 2021. Uncertainty-aware recurrent encoder-decoder networks for vessel trajectory prediction. In 2021 IEEE 24th International Conference on Information Fusion (FUSION). IEEE, 1–5.
[8]
Samuele Capobianco, Leonardo M. Millefiori, Nicola Forti, Paolo Braca, and Peter Willett. 2021. Deep learning methods for vessel trajectory prediction based on recurrent neural networks. arXiv preprint arXiv:2101.02486 (2021).
[9]
Rich Caruana. 1997. Multitask learning. Machine Learning 28, 1 (1997), 41–75.
[10]
Derek Caveney. 2007. Numerical integration for future vehicle path prediction. In 2007 American Control Conference. IEEE, 3906–3912.
[11]
Lei Chen, M. Tamer Özsu, and Vincent Oria. 2005. Robust and fast similarity search for moving object trajectories. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data. 491–502.
[12]
Xinqiang Chen, Zichuang Wang, Qiaozhi Hua, Wen-Long Shang, Qiang Luo, and Keping Yu. 2022. AI-empowered speed extraction via port-like videos for vehicular trajectory analysis. IEEE Transactions on Intelligent Transportation Systems 24, 4 (2022), 4541–4552.
[13]
Xinqiang Chen, Shubo Wu, Chaojian Shi, Yanguo Huang, Yongsheng Yang, Ruimin Ke, and Jiansen Zhao. 2020. Sensing data supported traffic flow prediction via denoising schemes and ANN: A comparison. IEEE Sensors Journal 20, 23 (2020), 14317–14328.
[14]
Yile Chen, Cheng Long, Gao Cong, and Chenliang Li. 2020. Context-aware deep model for joint mobility and time prediction. In Proceedings of the 13th International Conference on Web Search and Data Mining. 106–114.
[15]
Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, and Michiel Bacchiani. 2018. State-of-the-art speech recognition with sequence-to-sequence models. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 4774–4778.
[16]
Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014).
[17]
Michael Crawshaw. 2020. Multi-task learning with deep neural networks: A survey. arXiv preprint arXiv:2009.09796 (2020).
[18]
Pierre Deville, Catherine Linard, Samuel Martin, Marius Gilbert, Forrest R. Stevens, Andrea E. Gaughan, Vincent D. Blondel, and Andrew J. Tatem. 2014. Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences 111, 45 (2014), 15888–15893.
[19]
Pim Dijt and Pascal Mettes. 2020. Trajectory prediction network for future anticipation of ships. In Proceedings of the 2020 International Conference on Multimedia Retrieval. 73–81.
[20]
Long Duong, Trevor Cohn, Steven Bird, and Paul Cook. 2015. Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 845–850.
[21]
Jie Feng, Yong Li, Chao Zhang, Funing Sun, Fanchao Meng, Ang Guo, and Depeng Jin. 2018. DeepMove: Predicting human mobility with attentional recurrent networks. In Proceedings of the 2018 World Wide Web Conference. ACM Press, 1459–1468.
[22]
Jie Feng, Zeyu Yang, Fengli Xu, Haisu Yu, Mudan Wang, and Yong Li. 2020. Learning to simulate human mobility. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3426–3433.
[23]
Nicola Forti, Leonardo M. Millefiori, Paolo Braca, and Peter Willett. 2020. Prediction of vessel trajectories from AIS data via sequence-to-sequence recurrent neural networks. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 8936–8940.
[24]
Qiang Gao, Fan Zhou, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, and Fengli Zhang. 2019. Predicting human mobility via variational attention. In The World Wide Web Conference. 2750–2756.
[25]
H. Greidanus, M. Alvarez, T. K. Eriksen, P. Argentieri, Tülay Cokacar, A. Pesaresi, S. Falchetti, D. Nappo, F. Mazzarella, and A. Alessandrini. 2013. Basin-wide maritime awareness from multi-source ship reporting data. TransNav: International Journal on Marine Navigation and Safety of Sea Transportation 7, 2 (2013), 185–192.
[26]
Ville Hakola. 2020. Predicting Marine Traffic in the Ice-Covered Baltic Sea. Master’s thesis.
[27]
Ville Hakola. 2020. Vessel Tracking (AIS), Vessel Metadata and Dirway Datasets. Retrieved Feb. 24, 2022 from https://ieee-dataport.org/open-access/vessel-tracking-ais-vessel-metadata-and-dirway-datasets
[28]
Simen Hexeberg, Andreas L. Flåten, Bjørn-Olav H. Eriksen, and Edmund F. Brekke. 2017. AIS-based vessel trajectory prediction. In 2017 20th International Conference on Information Fusion (Fusion). IEEE, 1–8.
[29]
Jihua Huang and Han-Shue Tan. 2006. Vehicle future trajectory prediction with a DGPS/INS-based positioning system. In 2006 American Control Conference. IEEE, 5831–5836.
[30]
Ioannis Kontopoulos, Iraklis Varlamis, and Konstantinos Tserpes. 2021. A distributed framework for extracting maritime traffic patterns. International Journal of Geographical Information Science 35, 4 (2021), 767–792.
[31]
Jae-Gil Lee, Jiawei Han, and Kyu-Young Whang. 2007. Trajectory clustering: A partition-and-group framework. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. 593–604.
[32]
Pengfei Liu, Xipeng Qiu, and Xuanjing Huang. 2016. Recurrent neural network for text classification with multi-task learning. arXiv preprint arXiv:1605.05101 (2016).
[33]
Massimiliano Luca, Gianni Barlacchi, Bruno Lepri, and Luca Pappalardo. 2021. A survey on deep learning for human mobility. ACM Computing Surveys (CSUR) 55, 1 (2021), 1–44.
[34]
Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. 2015. Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114 (2015).
[35]
Fabio Mazzarella, Virginia Fernandez Arguedas, and Michele Vespe. 2015. Knowledge-based vessel position prediction using historical AIS data. In 2015 Sensor Data Fusion: Trends, Solutions, Applications (SDF). IEEE, 1–6.
[36]
Anna Monreale, Fabio Pinelli, Roberto Trasarti, and Fosca Giannotti. 2009. WhereNext: A location predictor on trajectory pattern mining. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 637–646.
[37]
Duong Nguyen, Rodolphe Vadaine, Guillaume Hajduch, René Garello, and Ronan Fablet. 2018. A multi-task deep learning architecture for maritime surveillance using AIS data streams. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 331–340.
[38]
Duc-Duy Nguyen, Chan Le Van, and Muhammad Intizar Ali. 2018. Vessel trajectory prediction using sequence-to-sequence models over spatial grid. In Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems. ACM, 258–261.
[39]
Giuliana Pallotta, Michele Vespe, and Karna Bryan. 2013. Traffic knowledge discovery from AIS data. In Proceedings of the 16th International Conference on Information Fusion. IEEE, 1996–2003.
[40]
Giuliana Pallotta, Michele Vespe, and Karna Bryan. 2013. Vessel pattern knowledge discovery from AIS data: A framework for anomaly detection and route prediction. Entropy 15, 6 (2013), 2218–2245.
[41]
Luca Pappalardo, Leo Ferres, Manuel Sacasa, Ciro Cattuto, and Loreto Bravo. 2021. Evaluation of home detection algorithms on mobile phone data using individual-level ground truth. EPJ Data Science 10, 1 (2021), 29.
[42]
Jussi Poikonen. 2020. AI for Smart Ports, Part 1: Limitations of Existing Data Sources for Port Call Prediction. Retrieved Aug 20, 2023 from https://www.awake.ai/post/ai-for-smart-ports-port-call-prediction
[43]
Monica Posada, Harm Greidanus, Marlene Alvarez, Michele Vespe, Tulay Cokacar, and Silvia Falchetti. 2011. Maritime awareness for counter-piracy in the Gulf of Aden. In 2011 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 249–252.
[44]
Salvatore Rinzivillo, Lorenzo Gabrielli, Mirco Nanni, Luca Pappalardo, Dino Pedreschi, and Fosca Giannotti. 2014. The purpose of motion: Learning activities from individual mobility networks. In 2014 International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 312–318.
[45]
H. Rong, A. P. Teixeira, and C. Guedes Soares. 2019. Ship trajectory uncertainty prediction based on a Gaussian process model. Ocean Engineering 182 (2019), 499–511.
[46]
X. Rong Li and V. P. Jilkov. 2003. Survey of maneuvering target tracking. Part I. Dynamic models. IEEE Trans. Aerospace Electron. Systems 39, 4 (2003), 1333–1364.
[47]
Sebastian Ruder. 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 (2017).
[48]
Yaqing Shu, Yujie Zhu, Feng Xu, Langxiong Gan, Paul Tae-Woo Lee, Jianchuan Yin, and Jihong Chen. 2023. Path planning for ships assisted by the icebreaker in ice-covered waters in the Northern Sea Route based on optimal control. Ocean Engineering 267 (2023), 113182.
[49]
Sima Siami-Namini, Neda Tavakoli, and Akbar Siami Namin. 2018. A comparison of ARIMA and LSTM in forecasting time series. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 1394–1401.
[50]
Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104–3112.
[51]
Enmei Tu, Guanghao Zhang, Lily Rachmawati, Eshan Rajabally, and Guang-Bin Huang. 2017. Exploiting AIS data for intelligent maritime navigation: A comprehensive survey from data to methodology. IEEE Transactions on Intelligent Transportation Systems 19, 5 (2017), 1559–1582.
[52]
Maarten Vanhoof, Fernando Reis, Thomas Ploetz, and Zbigniew Smoreda. 2018. Assessing the quality of home detection from mobile phone data for official statistics. Journal of Official Statistics 34, 4 (2018), 935–960.
[53]
Michail Vlachos, Dimitrios Gunopulos, and Gautam Das. 2004. Rotation invariant distance measures for trajectories. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 707–712.
[54]
Huandong Wang, Qiaohong Yu, Yu Liu, Depeng Jin, and Yong Li. 2021. Spatio-temporal urban knowledge graph enabled mobility prediction. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 5, 4 (2021), 1–24.
[55]
Tong Xia, Yunhan Qi, Jie Feng, Fengli Xu, Funing Sun, Diansheng Guo, and Yong Li. 2021. AttnMove: History enhanced trajectory recovery via attentional network. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 4494–4502.
[56]
Zhe Xiao, Xiuju Fu, Liye Zhang, and Rick Siow Mong Goh. 2019. Traffic pattern mining and forecasting technologies in maritime traffic service networks: A comprehensive survey. IEEE Transactions on Intelligent Transportation Systems 21, 5 (2019), 1796–1825.
[57]
Zhe Xiao, Loganathan Ponnambalam, Xiuju Fu, and Wanbing Zhang. 2017. Maritime traffic probabilistic forecasting based on vessels’ waterway patterns and motion behaviors. IEEE Transactions on Intelligent Transportation Systems 18, 11 (2017), 3122–3134.
[58]
Dong Yang, Lingxiao Wu, and Shuaian Wang. 2021. Can we trust the AIS destination port information for bulk ships?–Implications for shipping policy and practice. Transportation Research Part E: Logistics and Transportation Review 149 (2021), 102308.
[59]
Dong Yang, Lingxiao Wu, Shuaian Wang, Haiying Jia, and Kevin X. Li. 2019. How big data enriches maritime research–a critical review of automatic identification system (AIS) data applications. Transport Reviews 39, 6 (2019), 755–773.
[60]
Yongxin Yang and Timothy M. Hospedales. 2016. Trace norm regularised deep multi-task learning. arXiv preprint arXiv:1606.04038 (2016).
[61]
Di Yao, Chao Zhang, Zhihua Zhu, Qin Hu, Zheng Wang, Jianhui Huang, and Jingping Bi. 2018. Learning deep representation for trajectory clustering. Expert Systems 35, 2 (2018), e12252.
[62]
Lan You, Siyu Xiao, Qingxi Peng, Christophe Claramunt, Xuewei Han, Zhengyi Guan, and Jiahe Zhang. 2020. ST-Seq2Seq: A spatio-temporal feature-optimized seq2seq model for short-term vessel trajectory prediction. IEEE Access 8 (2020), 218565–218574.
[63]
Liangbin Zhao and Guoyou Shi. 2018. A method for simplifying ship trajectory based on improved Douglas–Peucker algorithm. Ocean Engineering 166 (2018), 37–46.
