
VesNet: A Vessel Network for Jointly Learning Route Pattern and Future Trajectory

Published: 28 March 2024

Abstract

Vessel trajectory prediction is key to maritime applications such as traffic surveillance, collision avoidance, and anomaly detection. Making more precise predictions requires a better understanding of the moving trend of a particular vessel, since the movement is affected by multiple factors such as the marine environment, vessel type, and vessel behavior. In this paper, we propose a model named VesNet, based on the attentional seq2seq framework, to predict a vessel's future movement sequence by observing its current trajectory. Firstly, we extract route patterns from the raw AIS data during preprocessing. Then, we design a multi-task learning structure to learn route pattern classification and vessel trajectory prediction simultaneously. Comparisons with representative baseline models show that VesNet achieves the best long-term prediction precision. Additionally, VesNet can recognize the route pattern by capturing implicit moving characteristics. The experimental results show that the proposed multi-task learning assists the vessel trajectory prediction task.

1 Introduction

Over 70\(\%\) of the earth's surface is covered by oceans. The sea provides abundant resources and connects continents and countries worldwide. Maritime transportation is essential in global trade, sustaining the international economy. It is reported that more than 80\(\%\) of globally traded goods are carried by sea, and the share is even higher for most developing countries [2]. Ensuring the safety and efficiency of maritime transportation benefits human life, the world economy, and the oceanic environment. The vessel trajectory prediction task is a core component of a maritime situation awareness system. By effectively predicting the future movement of a vessel on both short-term and long-term levels, it assists functions such as collision risk assessment, collision avoidance, destination and travel time estimation, and path planning. Therefore, a more precise vessel trajectory prediction contributes to a better preview of the maritime traffic situation, resulting in a well-managed maritime transportation system.
Historical moving records offer the opportunity to comprehend the underlying dynamic characteristics of how a vessel moves. With the development of the Automatic Identification System (AIS), a ship periodically broadcasts its current status to receivers equipped on other nearby vessels, terrestrial base stations, and even satellites. Each AIS message contains static and dynamic information indicating the vessel's current behavior [59]. The static information includes the Maritime Mobile Service Identity (MMSI), vessel type, and size. MMSI is a unique number assigned to a vessel that can be treated as its ID. The dynamic information includes the timestamp, latitude, longitude, Speed over Ground (SoG), and Course over Ground (CoG). Collecting the AIS data in time-sequential order forms the voyage trajectory. Since more and more AIS transceivers are deployed, the amount of AIS data has grown drastically, leading to several AIS datasets being publicly available online [51].
Despite advanced data mining technologies and abundant AIS data, some prerequisite information that is readily available for human trajectory prediction is still missing for further enhancing vessel trajectory prediction. Therefore, we summarise the challenges from the following three aspects:
(1)
Unclear Origin-Destination (OD) information. It is often difficult to identify the exact starting and ending points of a vessel trajectory, especially the departure and arrival ports. The unclear OD information is mainly caused by wrongly reported or missing fields within the AIS data. It is demonstrated in [58] that about 40% of the destination information is intentionally or unintentionally updated with mistakes. Poikonen [42] analyzed European AIS data collected in January-March 2020 and found that about 10% of cargo vessels and 20% of passenger vessels did not report their destination. The availability of accurate destination information therefore cannot be guaranteed. The incorrect or missing OD information makes capturing the vessel route for long-term predictions challenging.
(2)
No existing road networks. Unlike the human and vehicle mobility prediction problems in urban scenarios, vessel trajectories lack information such as road networks and locations of Points of Interest (PoI), especially in open sea cases [40]. Without the constraints of a “road”, the vessel's movement is prone to be affected by the marine environment, which leads to more flexible vessel trajectories and causes instability in predictions.
(3)
Seldom periodic moving patterns. There are fewer periodic moving patterns for a specific vessel than for human mobility, as mentioned in [21]. The length of a vessel voyage usually varies from days to weeks, which hardly exhibits daily or weekly similar moving patterns [26]. Typically, container ships follow the fastest route to the next port on their schedule for the best financial effect. Such lack of intra-periodicity results in difficulties in discovering individual regularities.
Hence, this paper focuses on extracting typical routes from historical AIS data to tackle the abovementioned challenges. By checking the recorded AIS speed, we can obtain various independent maritime journeys by segmenting at anchoring status. By jointly taking the starting and ending locations, time duration, moving speed and direction distributions, and covered waypoints into consideration, it is feasible to extract route patterns composed of similar moving behaviors when clustering the journeys. When predicting the vessel trajectory, we merge the route classification result on the observed sequences into the proposed model to align the output with the derived route patterns. We summarise our contributions as follows:
Vessel route pattern extraction. We propose a general pipeline for extracting vessel route patterns from the raw AIS data. For a large amount of historical data, we categorize similar vessel trajectories into the same pattern cluster. By filtering the starting and ending points of all trajectories and then clustering them based on spatial density, it is feasible to identify a voyage's departure and arrival ports, hence obtaining the OD information. The trajectory clustering is then carried out by grouping trajectories with the same derived OD information and similar shapes. Similar route patterns usually mean similar moving behaviors in vessel transportation, even though they are generated by different vessels [40]. Thus, extracting vessel movement patterns alleviates all three challenges, acting as prior knowledge serving the vessel trajectory prediction.
Multi-task learning (MTL) of route pattern and future trajectory. We propose a novel framework named VesNet (Vessel Network), which uses MTL for route pattern classification and trajectory prediction in maritime transportation. The extracted route patterns are treated as labels for training classifiers with supervised learning. Based on the attentional seq2seq model, VesNet can simultaneously classify the route pattern and predict the future positions of a target vessel. The intuition behind this idea is to let the model learn typical movement patterns and collaboratively guide the prediction task by recognizing the route class.
Substantial experiments on a real-world dataset. We conduct extensive experiments on a real-world dataset to verify the effectiveness and efficiency of our proposed VesNet, mainly from the perspectives of overall prediction performance, route pattern classification performance, and ablation study.
The remainder of this paper is organized as follows. Section 2 introduces related works regarding mobility trajectory prediction, trajectory clustering, and MTL. Section 3 formulates the vessel trajectory prediction problem with mathematical expressions. Section 4 presents an elaborate description of our proposed VesNet. After explaining the experimental setup in Section 5, Section 6 showcases the results with corresponding analysis. Finally, Section 7 concludes the paper and outlines future directions for further extension.

2 Related Works

In this section, other works relevant to this paper are discussed, falling into four aspects: the human mobility prediction problem, the vessel trajectory prediction problem, the moving trajectory clustering problem, and the MTL technique.

2.1 Human Mobility Prediction

When it comes to mobility prediction, human-related issues cannot be avoided since they involve more complexity and flexibility. Human mobility study is essential because of its effect on the following aspects of our daily life: disease spreading, transportation scheduling, event arrangement, resource allocation, urban planning, and more. With the growing popularity of location-based mobile services, the GPS equipment embedded within smart devices, and the logs generated by wireless communications between mobile phones and base stations, more and more human mobility data is emerging at multiple spatial and temporal scales. This facilitates researchers in artificial intelligence to utilize deep learning techniques to resolve mobility-related challenges [33]. Besides human movement prediction [21, 54] and generation tasks [22, 55], other predictive problems such as movement purpose prediction [44], home location detection [41, 52], and population inference [18] also attract attention. The human mobility prediction problem, especially the next location prediction challenge, matters most to the vessel trajectory prediction task addressed in this paper.
Before the development of deep learning techniques, next location prediction was carried out using probabilistic patterns. In [6], the authors proposed a probabilistic model combining human trajectories and geographical features. Monreale et al. [36] proposed a trajectory pattern mining algorithm that involves frequently visited regions. Deep learning methods manage to capture complex spatial and temporal dependencies within sequences. With the help of the Recurrent Neural Network (RNN), Long Short Term Memory (LSTM), Gated Recurrent Unit (GRU), Convolutional Neural Network (CNN), and attention mechanism, various models have been developed for capturing moving patterns. In Variational Attention-based Next Location (VANext) [24], the authors used a CNN to encode the historical trajectories and a GRU to encode the current trajectory. The outputs are fed into an attention layer that detects the historical trajectory that best matches the current one for predicting the following location. Deep Model for Joint Mobility and Time (DeepJMT) [14] predicted an individual's next POI with the arrival time using four components: a sequential dependency encoder, a spatial context encoder, a periodicity context extractor, and a social temporal context extractor. It concatenated the outputs of the four components to make the prediction. Based on the above, deep learning frameworks are suitable for trajectory prediction using multi-purpose components.

2.2 Vessel Trajectory Prediction

It is also worth introducing state-of-the-art vessel trajectory prediction research for a better view of existing techniques and limitations. On the whole, three types of vessel trajectory prediction methods can be summarised: (1) physical model based [51], (2) learning model based [51], and (3) knowledge based [56]. Unlike vehicles and aircraft, a vessel cannot change its speed and direction drastically within a short period and moves in a 2D plane [51]. This characteristic inspires the approach of leveraging physical laws to calculate the future movement of a vessel by solving mathematical equations. These methodologies include the Constant Velocity Model (CVM) [25, 43], curvilinear model [5], lateral model [10, 29], and ship model [46]. Being either too simple or constrained by ideal environments and accurate state assumptions, physical model-based methods are rarely utilized in practical scenarios. Furthermore, they perform poorly in predicting long-term positions, that is, on the granularity of hours, since the forecasting relies only on the current vessel dynamic status and long-term route knowledge is not incorporated into the model. Therefore, researchers employ deep neural networks to overcome such drawbacks and capture the spatial-temporal dependencies among the vessel movement data.
RNNs have been widely utilized in sequence-related tasks. The seq2seq model effectively enables applications such as machine translation [16, 50] and speech recognition [15]. Due to its capability of storing long sequential context within a representing state vector through the encoder before interpreting it via the decoder, the seq2seq model manages to tackle the trajectory prediction task. Forti et al. [23] proposed a seq2seq model based on an LSTM unit to predict vessel trajectories. Nguyen et al. [38] also leveraged the seq2seq architecture, transforming the GPS information into a spatial grid instead. The configuration of the grid size affects the training time complexity and prediction accuracy. You et al. [62] took the relative longitude and latitude, together with the time interval, as the input of the seq2seq model to predict vessel trajectories in the Yangtze River, China. Nonetheless, the input of relative values leads the model to learn the moving trend of the vessel instead of the moving pattern, which causes poor performance at waypoints. Capobianco et al. [7, 8] proposed a seq2seq model with attention to forecast vessel trajectories. Furthermore, they involved uncertainty awareness and destination port labels to enhance the performance. Learning model-based approaches rely on factors like dataset quality, data preprocessing, and fine-tuning of network parameters to attain practical prediction ability. In addition, the training procedure is vital, but few specific instructions can be followed; it depends more on empirical intuition.
Besides the approaches mentioned above, another type of method, known as knowledge based, is founded on statistics and describes the vessel movement patterns in a probabilistic format. Researchers [1, 30, 35, 39, 40] extracted vessel movement patterns by gathering trajectories traversing similar waypoints sequentially into clusters. Each pattern is represented by a synthetic route calculated as the mean of the cluster members. Additionally, the Probability Density Function (PDF) of pattern time length, speed, and course is derived, which can be helpful in trajectory prediction and anomaly detection. Hexeberg et al. [28] adopted a simplified version of moving behavior knowledge without extracting movement patterns from the historical AIS data. It treated the whole dataset as a complex pattern and searched the neighborhood of the predicting point to calculate the posterior course and speed. The next position is then linearly inferred, constrained by a preset step length; repeating this over multiple steps generates a sequence of predicted positions. Nonetheless, the algorithm cannot handle intersections and branches of vessel tracks, which limits its applicability. Rong et al. [45] predicted vessel trajectories considering uncertainty based on a Gaussian process model. Other researchers [26, 57] split the focused region into spatial grids and established the transition relationships between grids based on the historical AIS data. The frequency of transitions is related to the likelihood of moving behaviors. Xiao et al. [57] leveraged Kernel Density Estimation (KDE) to elaborate the probability distribution of position transfers. Hakola [26] forecasted the shortest route between ports with the help of the A* algorithm, calculating the route weights using the grid transition matrix. However, the method requires the destination port information in advance. Moreover, it always derives the minimum-cost route, which may differ from the ground truth. Shu et al. [48] considered vessel trajectory prediction as a path-planning task that involves multi-objective optimization based on optimal control.
Inspired by the approaches introduced above, we combine learning-based and knowledge-based methods by merging pattern knowledge into the neural network, giving rise to the model proposed in this paper.

2.3 Moving Trajectory Clustering

Extracting the movement pattern of a mobile target helps understand its dynamic characteristics. For humans, vehicles, and aircraft, periodic daily or weekly tracks usually behave similarly. Though vessel tracks lack periodicity, different ships share the same traveling routes across various vessel types. Jointly considering facts such as geographical landscapes, weather conditions, and fuel consumption, vessel navigators are prone to select similar routes between specific ports. Therefore, gathering diverse trajectories into clusters of similar movement patterns is crucial for extracting the voyaging knowledge. The trajectory clustering problem is to classify tracks based on route shape and the distance between them. Calculating such shape- and location-related distance leads to two directions. One is to derive the distance directly, and the other is to measure the difference between hidden vectors representing the trajectories via neural networks.
One intuitive idea concerning distance-based approaches is to compare the common parts between two trajectories. A larger overlapping portion results in higher trajectory similarity. Following this thought, the Longest Common Subsequence (LCSS) method [11] provides a simple trajectory comparison method. Another widely utilized algorithm, the Hausdorff distance, measures how close two trajectories are. It defines the trajectory distance as the greatest of all distances from a point on one trajectory to the closest point on the other. Calculating the Hausdorff distance is time-consuming since its time complexity is \(O(n^2)\), where n is the length of the trajectory. Neither algorithm considers the time order, hence they are unable to distinguish trajectories in some cases. For instance, two similar round-trip trajectories are categorized as the same pattern but belong to two distinctive voyage lanes since their sailing directions are contrary. Another popular distance measurement called Dynamic Time Warping (DTW) [53] is a time-sequence alignment algorithm originally developed for speech recognition. It considers time order and is suitable for comparing trajectories of different lengths. It aims to align two sequences by adjusting the time axis iteratively until an optimal match between them is found. Suppose one sequence has length m and the other has length n; an \(m \times n\) matrix is required to store the pairwise Euclidean distances. Starting from the lower left corner and ending at the upper right corner of the matrix, a path with the minimum overall distance under a few constrained conditions can be derived, representing the optimal matching relationships between the two sequences. Such a distance measures the difference between the two sequences, considering both location information and trajectory shapes.
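As a concrete illustration of the accumulated-cost formulation described above, the following NumPy sketch computes the DTW distance between two trajectories; the function name and the plain Euclidean point cost are assumptions made for illustration, not code from any cited work.

```python
import numpy as np

def dtw_distance(seq_a, seq_b):
    """Minimal DTW sketch: seq_a, seq_b are (m, 2) and (n, 2) arrays of (lat, lon) points."""
    m, n = len(seq_a), len(seq_b)
    # Pairwise Euclidean distances between all point pairs (the m x n cost matrix).
    cost = np.linalg.norm(seq_a[:, None, :] - seq_b[None, :, :], axis=-1)
    # Accumulated-cost matrix; acc[i, j] is the best alignment cost of the prefixes.
    acc = np.full((m + 1, n + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],      # skip a point of seq_a
                                                 acc[i, j - 1],      # skip a point of seq_b
                                                 acc[i - 1, j - 1])  # match both points
    return acc[m, n]
```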
In [3], a method suitable for clustering vehicle trajectories was demonstrated. It combined ideas from spectral clustering and proposed a trajectory similarity evaluation mechanism based on a modified Hausdorff distance to enhance its robustness and respect that trajectories are ordered collections of points. The work compared the proposed method with LCSS and DTW on a few real-world datasets, revealing its superiority. Unlike the methods mentioned above, which cluster whole trajectories, Lee et al. [31] segmented each trajectory into partitions before putting them into categories. Their approach consists of two phases, using the Minimum Description Length (MDL) principle for trajectory partitioning and a density-based line segment clustering algorithm for grouping. It defined three types of distances, the perpendicular distance, the parallel distance, and the angle distance, to describe how far apart two line segments are. The line distance was calculated as a linear summation of the three distances. Such distance was then leveraged as the metric for the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm in the line categorization task.
The approaches mentioned previously in this subsection focus on computing distances as the metric to discriminate trajectories. Researchers have recently represented trajectories as vectors, which converts the trajectory clustering issue into a vector grouping problem. In [61], trajectories were learned as high-quality low-dimensional representations. The method first used a sliding window to extract space- and time-independent moving behavior characteristics, and then employed a seq2seq framework to learn a fixed-length representation of the moving features. The learned representation encodes the movement characteristics of the target object and can be further fed into classic clustering algorithms, like the K-means method. The method was evaluated on synthetic and real data, resulting in significant performance improvements over existing approaches. The neural network-based process automatically transforms the trajectory into a representing vector instead of creating a matrix to store the distances between trajectories, as the distance-based methods do.

2.4 Multi-task Learning

MTL has been proven successful in fields such as computer vision, natural language processing, reinforcement learning, and so on [17, 47]. It is a subfield of machine learning in which multiple tasks are learned simultaneously via a shared model framework. MTL can perform better on the original task due to the hidden representations shared by relevant learning tasks [47]. From the framework-sharing perspective, MTL can be realized through either hard or soft parameter sharing of the network model. In hard parameter sharing [9], all tasks share the hidden representations while the output layer is specifically designed for each task. Moreover, hard parameter sharing manages to reduce overfitting [4]. Intuitively, with several tasks to learn simultaneously, the model is prone to capture the characteristics common to all of them, making it less likely to overfit the original specific task. Slightly different from hard parameter sharing, in soft parameter sharing, the hidden layers for each task are not identical but are kept similar based on specified distance metrics [20, 60]. Stepping further into sequence-related MTL problems, the seq2seq framework was adjusted following three parameter-sharing strategies: one-to-many, many-to-one, and many-to-many [34]. Similarly, three other parameter sharing schemes were developed concerning the LSTM recurrent network: uniform-layer, coupled-layer, and shared-layer [32].
Concretely, MTL has been involved in vessel-related tasks as well. Nguyen et al. [37] proposed a four-hot method for representing the AIS data input to a Variational Recurrent Neural Network (VRNN) for three maritime surveillance-related missions, namely vessel trajectory reconstruction, anomaly detection, and vessel type identification. The work focused on various temporal granularities, ranging from hours to days, to learn multiple tasks. The four-hot representation is the concatenation of four dynamic fields included in the AIS data, latitude, longitude, SoG, and CoG, each converted to a one-hot format. In [19], the authors proposed an MTL seq2seq model for vessel trajectory prediction. It jointly adopted the AIS, radar images, and Electronic Navigational Charts (ENC) as the input to simultaneously learn multiple tasks of matching the future GPS coordinates and the layout of nearby water and land regions. It reported that more types of information could achieve higher prediction accuracy. The evidence shows that MTL benefits the vessel trajectory prediction task.

3 Problem Formulation

In this section, we formally describe the vessel trajectory prediction problem. The main structure of our model is based on the seq2seq learning approach, that is, by observing the input vessel movement sequences and learning a hidden representing vector before interpreting it for future trajectory prediction.
A trajectory can be expressed as a set of temporally ordered tuples \(\mathcal {T}=\lbrace (a_i, t_i)\rbrace _{i=1}^N\) with the length N, where \(a_i\) is the ith dynamic attributes and \(t_i\) is the corresponding timestamp. Generally, a vessel trajectory point is represented by \(a_i=(lat_i, lon_i, v_i, c_i)\), where \(lat_i\), \(lon_i\), \(v_i\) and \(c_i\) represent latitude, longitude, SoG, and CoG, respectively. Hence, a vessel trajectory can be presented as \(\mathcal {T}=\lbrace (lat_i, lon_i, v_i, c_i, t_i)\rbrace _{i=1}^N\). Furthermore, a dataset is composed of a set of trajectories \(\mathcal {D}=\lbrace \mathcal {T}_j\rbrace _{j=1}^K\).
Due to the irregular data generation rate and the sometimes unstable data upload link, the data is sampled with fluctuating time intervals. To better reflect the relationship between sequence length and sailing time, we resample the original data by linear interpolation to generate equal time gaps between two consecutive trajectory points. Thus, the same time duration covers the same number of points for different trajectories. Predictions along time horizons also become more precise, since the output sequence generated by the model then has uniform time intervals.
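As a minimal sketch of this resampling step, assuming the trajectory is stored as a NumPy array of per-point attributes with Unix timestamps, the following function interpolates each attribute onto an equally spaced time grid; the 5-minute default mirrors the interval used later in Section 5.2, and the wrap-around of the course angle at 0/360 degrees is ignored here for simplicity.

```python
import numpy as np

def resample_trajectory(timestamps, features, step_s=300):
    """Resample a trajectory to a fixed time step (default 5 min) by linear interpolation.

    timestamps: 1-D array of Unix times (seconds), strictly increasing.
    features:   (N, 4) array of (lat, lon, SoG, CoG) values at those times.
    Returns the new timestamps and the interpolated (M, 4) feature array.
    """
    new_t = np.arange(timestamps[0], timestamps[-1] + 1, step_s)
    resampled = np.column_stack(
        [np.interp(new_t, timestamps, features[:, k]) for k in range(features.shape[1])]
    )
    return new_t, resampled
```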
A sequence is a subset of a trajectory containing a series of successive points. The vessel trajectory prediction problem is that given a previous sequence \(\mathcal {S}=\lbrace (a_i, t_i)\rbrace _{i=n-\ell +1}^n\) with length \(\ell\) ending at time \(t_n\), to predict the adjacent future sequence \(\mathcal {S^{\prime }}=\lbrace (a_i^{\prime }, t_i)\rbrace _{i=n+1}^{n+h}\) with length h starting at time \(t_{n+1}\), as shown in Figure 1. \(a_i^{\prime }=(lat_i^{\prime }, lon_i^{\prime })\) is usually adopted in the predicted output, where the symbol \(^{\prime }\) means the predicted values rather than the ground truth. If the resampling time interval is fixed, then the prediction time horizon can be changed by directly adjusting the output sequence length h. At the same time, altering the input sequence length \(\ell\) determines how long the model focuses on the historical trajectory.
Fig. 1.
Fig. 1. Illustration of observed and predicted sequences for a vessel trajectory prediction task with resampled time intervals.

4 Methodology

This section explains the proposed method in two phases: the vessel route pattern extraction stage and the overall VesNet approach.

4.1 Route Pattern Extraction

The rationale for extracting vessel movement patterns is that vessels usually traverse a familiar route originating from position A and terminating at position B. Meanwhile, ships behave similarly to each other regarding navigating speed and course [39, 40]. Straightforwardly, the extraction relies on the vessel trajectory clustering results. Specifically, clusters whose number of members exceeds a predefined threshold are reserved. Otherwise, the clusters are regarded as noise since only a few vessels follow the route.
In this paper, we utilize a publicly available vessel tracks dataset published by Ville Hakola on the IEEE DataPort website [27]. It consists of ship tracking data in AIS form collected in the Baltic Sea from 2017 to 2019. The dataset details are exhibited in Section 5; we mention it here for a clear view of the vessel trajectory clustering procedure. Figure 2 shows the original vessel tracks, which include over one million AIS records. The region of interest is a rectangular area covering the Baltic Sea, with the x-axis representing longitude and the y-axis representing latitude. Tracks in dark blue indicate a dense data distribution, whereas light blue indicates a sparse one. We follow the steps of data preprocessing, port extraction, and trajectory clustering to realize the route pattern extraction function.
Fig. 2.
Fig. 2. Original vessel tracks in the Baltic Sea.

4.1.1 Data Preprocessing.

Firstly, we need to isolate each vessel trajectory from the whole dataset, as shown in Algorithm 1. The core actions at this stage are Steps 3–7. Among them, Step 3 segments the long track of a vessel into several short sub-tracks at the points where a sizeable temporal gap exists, since the resulting parts can be regarded as independent of each other [58]. Then, through Step 4, the trajectories starting and ending at mooring status are extracted from each sub-track. This is mainly based on the fact that a vessel seldom moors during a trajectory due to the requirement of accomplishing transportation tasks on time; typically, it is in mooring status when anchoring at a port [58]. Additionally, there exist methods like speed extraction from videos [12] for more accurate dynamic information acquisition. Trajectory points with abnormal speed are discarded in Step 5, as data denoising is of great importance [13]. We then use a linear interpolation approach to achieve an equal time interval between two consecutive points in Step 6 and discard trajectories with small sequence lengths in Step 7.
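The following fragment sketches Steps 3 and 5 under an assumed column layout (column 2 holding SoG) and assumed thresholds; the gap threshold mirrors the \(int_{th}\) value reported in Section 5.2, while the speed constants are illustrative and not the paper's actual implementation.

```python
import numpy as np

# Assumed thresholds for this sketch; only GAP_TH_S matches a value reported in the paper.
GAP_TH_S = 4 * 3600      # split a track where consecutive reports are more than 4 h apart
MAX_SPEED_KN = 40.0      # assumed upper bound; faster points are treated as noise

def split_by_gap(timestamps, records):
    """Step 3: cut one vessel's long track into independent sub-tracks at large time gaps."""
    gap_idx = np.where(np.diff(timestamps) > GAP_TH_S)[0] + 1
    return np.split(records, gap_idx)

def drop_speed_outliers(sub_track):
    """Step 5: discard points whose reported SoG is implausibly large (column 2 = SoG here)."""
    return sub_track[sub_track[:, 2] <= MAX_SPEED_KN]
```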

4.1.2 Port Extraction.

We fetch the low-speed points for vessel mooring status detection within each sub-track for port extraction. After that, we leverage the DBSCAN algorithm to classify these points into clusters automatically. Two parameters of DBSCAN are vital for the clustering results: \(\varepsilon\) is the radius of the neighborhood circle around each data point, and \(\rho\) is the minimum number of data points required inside that circle for the data point to be categorized as a core point. For two points X and Y, if (1) X lies in the neighborhood of Y, i.e., \({\rm dist}(X, Y)\le {\varepsilon }\), and (2) Y is a core point, we say that X is directly density reachable from Y; such direct density reachability is not symmetric. Furthermore, a point X is density reachable from point Y if there exists a chain of points \(p_1, p_2, \ldots , p_n\) with \(p_1=X\) and \(p_n=Y\) such that \(p_i\) is directly density reachable from \(p_{i+1}\), where \(i=1, \ldots , n-1\). Based on these concepts, the DBSCAN algorithm randomly selects a core point and forms a cluster from all points density reachable from it. The procedure keeps running until all points are allocated to a cluster; points that do not belong to any cluster are treated as noise. As shown in Figure 3(a), the points extracted from the tracks displayed in Figure 2 scatter within the region of interest, most of them close to the coastlines. With the parameters \(\varepsilon\) and \(\rho\) determined, the DBSCAN algorithm separates the points into various clusters as shown in Figure 3(b), each represented by a specific color; the noise points are colored in black. We indicate each cluster with a unique integer while denoting the noise as -1. The spatial distribution of the clusters reflects the actual locations of the ports. Port extraction is useful when a port list is unavailable, avoiding manual matching on Google Maps. A minimal code sketch of this clustering step follows Figure 3.
Fig. 3.
Fig. 3. Port extraction results in the Baltic Sea.
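A minimal sketch of this port extraction step using scikit-learn's DBSCAN is shown below; the eps and min_samples defaults are illustrative placeholders, not the \(\varepsilon\) and \(\rho\) values used in the paper.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def extract_ports(low_speed_points, eps_deg=0.05, min_samples=20):
    """Cluster low-speed (mooring) points into candidate port areas with DBSCAN.

    low_speed_points: (N, 2) array of (lat, lon) for points below the mooring speed.
    eps_deg / min_samples correspond to the epsilon and rho parameters in the text;
    the concrete values here are placeholders. Returns one integer label per point,
    with -1 marking noise, as in Figure 3(b).
    """
    return DBSCAN(eps=eps_deg, min_samples=min_samples).fit_predict(low_speed_points)
```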

4.1.3 Trajectory Clustering.

Based on the port extraction, each vessel trajectory can be represented in a ‘departure - arrival ports’ mode. Such a representation incorporates geographical positions and contains information on OD flow, making it a compact indicator of the vessel traffic pattern whose generation is simple yet essential. We gather the trajectories tagged with the same ‘departure - arrival ports’ label into groups, where each group stands for a typical movement flow. Moreover, to separate the trajectories with similar shapes and dynamic attributes into clusters within each OD group, we cluster trajectories with DBSCAN again but in a higher dimensional space. Specifically, we represent each trajectory by a tuple consisting of the simplified trajectory length, the whole time duration, the mean and variance of speed, and the direction of the simplified trajectory waypoints. All the elements are normalized into a comparable scale. The trajectory is simplified by methods like the Douglas-Peucker algorithm [64] to pick out waypoints that retain the trajectory shape. Figure 4 shows the vessel trajectory clustering results. Each pattern is denoted by a color, and the curves are commonly thick since several trajectories follow the same pattern. For a clearer view, we separately exhibit 9 representative route patterns in Figure 5; we do not show all the route patterns to save page space. From the examples, we can observe that vessels behave similarly within the same route pattern. In the third and fourth examples, the shape of the route patterns is almost identical; however, their departure and arrival ports are reversed, so they are treated as two different patterns. It reveals that our trajectory clustering method considers the temporal order. Furthermore, as demonstrated in Figure 6, velocity and course follow a spatially dependent distribution, showing that the vessel trajectories within the same route pattern behave similarly. A sketch of the per-trajectory feature tuple and the second DBSCAN pass is given after Figure 6.
Fig. 4.
Fig. 4. Vessel trajectories clustering results.
Fig. 5.
Fig. 5. Vessel route pattern examples.
Fig. 6.
Fig. 6. Example of velocity and course distributions.
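The sketch below illustrates how the per-trajectory feature tuple and the second DBSCAN pass described above could be assembled; the exact feature computations, the use of MinMaxScaler, and the clustering parameters are assumptions of this illustration rather than the paper's code.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import MinMaxScaler

def trajectory_feature(traj, waypoints):
    """Build the per-trajectory feature tuple used for clustering within an OD group.

    traj:      (N, 5) array of (lat, lon, SoG, CoG, t) with a fixed time step.
    waypoints: (M, 2) simplified (lat, lon) points, e.g. from Douglas-Peucker.
    """
    seg = np.diff(waypoints, axis=0)
    length = np.sum(np.linalg.norm(seg, axis=1))            # simplified trajectory length
    duration = traj[-1, 4] - traj[0, 4]                     # whole time duration
    mean_v, var_v = traj[:, 2].mean(), traj[:, 2].var()     # speed statistics
    mean_dir = np.degrees(np.arctan2(seg[:, 1], seg[:, 0])).mean()  # coarse waypoint direction
    return np.array([length, duration, mean_v, var_v, mean_dir])

def cluster_od_group(features, eps=0.1, min_samples=3):
    """Second DBSCAN pass over normalized feature tuples within one OD group."""
    scaled = MinMaxScaler().fit_transform(features)          # bring features to a comparable scale
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scaled)
```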

4.2 VesNet

4.2.1 Recurrent Neural Networks.

RNNs have been applied to various problems: speech recognition, language modeling, translation, image captioning, conversational robots, article abstract generation, and so on. However, the RNN suffers from gradient vanishing or explosion issues and the drawback of being unable to handle long-term dependencies. LSTM, a particular type of RNN, was developed to overcome these shortcomings. Unlike the plain RNN, LSTM has a more complicated structure. It relies on gates to adjust the information in the cell state. The following equations elaborate on the operations:
\begin{align} f_t & = \sigma (W_f\times [h_{t-1}, x_t] + b_f), \end{align}
(1)
\begin{align} i_t & = \sigma (W_i\times [h_{t-1}, x_t] + b_i), \end{align}
(2)
\begin{align} \widetilde{C_t} & = \tanh (W_C\times [h_{t-1}, x_t] + b_C), \end{align}
(3)
\begin{align} C_t & = f_t\cdot {C_{t-1}} + i_t\cdot \widetilde{C_t}, \end{align}
(4)
\begin{align} o_t & = \sigma (W_o\times [h_{t-1}, x_t] + b_o), \end{align}
(5)
\begin{align} h_t & = o_t\cdot \tanh (C_t). \end{align}
(6)
The first component is the forget gate layer. It takes the previous hidden state \(h_{t-1}\) and the current input \(x_t\) to decide how much content to keep from the previous cell state \(C_{t-1}\), as displayed in (1). The next step is to decide what new information to store in the cell state, which is done by (2) and (3). Then the new cell state \(C_t\) is updated in (4), forgetting old content and adding new candidate values. Finally, the output is a filtered version based on the cell state, as shown in (5) and (6).
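A minimal NumPy sketch of one LSTM step following Equations (1)–(6) is given below; the weight shapes are assumed (each W acts on the concatenation of \(h_{t-1}\) and \(x_t\)), and this is an illustration rather than the layer used in our implementation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W_f, b_f, W_i, b_i, W_c, b_c, W_o, b_o):
    """One LSTM step following Equations (1)-(6); each W multiplies [h_{t-1}, x_t]."""
    z = np.concatenate([h_prev, x_t])
    f_t = sigmoid(W_f @ z + b_f)              # (1) forget gate
    i_t = sigmoid(W_i @ z + b_i)              # (2) input gate
    c_tilde = np.tanh(W_c @ z + b_c)          # (3) candidate cell state
    c_t = f_t * c_prev + i_t * c_tilde        # (4) cell state update
    o_t = sigmoid(W_o @ z + b_o)              # (5) output gate
    h_t = o_t * np.tanh(c_t)                  # (6) hidden state
    return h_t, c_t
```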

4.2.2 Seq2seq Structure.

Unlike the basic RNN architecture, which is dedicated to many-to-one tasks or many-to-many tasks with the same input-output length, seq2seq can handle many-to-many tasks with unequal input and output lengths. Being successful in applications such as translation, chatbots, and time series prediction, seq2seq utilizes an encoder-decoder framework to connect the sequential input features and the corresponding output predictions. Specifically, the encoder converts the input sequence into a fixed-length hidden vector, which implicitly contains the temporal and spatial relationships. The hidden vector represents a point within the high-dimensional space, where similar sequences are close to each other. Upon iteratively interpreting the hidden state, the target sequence is recovered without limiting the output length. Therefore, seq2seq is suitable for time series prediction, focusing on forecasting a future period rather than a single timestep. It relies on either LSTM or GRU units as the fundamental component.
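For illustration, a minimal LSTM encoder-decoder in TensorFlow/Keras (the library our implementation uses, see Section 5.5) might look as follows; the layer sizes, sequence lengths, and the RepeatVector-based decoder input are assumptions of this sketch, which omits the attention and MTL blocks described later.

```python
import tensorflow as tf

def build_seq2seq(l=12, h=24, n_features=4, n_units=64):
    """Minimal LSTM encoder-decoder sketch; shapes and sizes are illustrative."""
    enc_in = tf.keras.Input(shape=(l, n_features))
    # Encoder: keep the final hidden and cell states as the context.
    _, state_h, state_c = tf.keras.layers.LSTM(n_units, return_state=True)(enc_in)
    # Decoder: roll an LSTM forward for h steps, seeded with the encoder's final state.
    dec_in = tf.keras.layers.RepeatVector(h)(state_h)
    dec_out = tf.keras.layers.LSTM(n_units, return_sequences=True)(
        dec_in, initial_state=[state_h, state_c])
    preds = tf.keras.layers.TimeDistributed(tf.keras.layers.Dense(2))(dec_out)  # (lat, lon)
    return tf.keras.Model(enc_in, preds)
```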

4.2.3 Attention Mechanism.

In the conventional seq2seq model, the input sequence is mapped to a hidden vector, regarded as a high-dimensional representation for decoding the output sequence without further delicate operations. The attention mechanism was first proposed in machine translation to align strongly correlated words across languages, even though they are not in the same position. The key idea is that each output timestep queries the context produced by the encoder and pays a different amount of attention to each input timestep, which captures the critical dependencies even when the input sequence is relatively long. The overall seq2seq model with attention, taking LSTM as the basis, can be described by
\begin{align} attn_i & = \sum _{j=1}^l\alpha _{i,j}h_j, \end{align}
(7)
\begin{align} \alpha _{i,j} & = \frac{{\rm exp}(u_{i,j})}{\sum _{j=1}^l{\rm exp}(u_{i,j})}, \end{align}
(8)
\begin{align} u_{i,j} & = v^{\rm T}\cdot {\rm tanh}(W_{ih}h_j+W_{oh}h_i), \end{align}
(9)
where \(attn_i\), the attentional context vector corresponding to the \(i\)th position, is a weighted sum of the output sequence generated by the encoder, denoted by \(h_j\) with length l, where l is the input sequence length of the encoder. The coefficient \(\alpha _{i,j}\) is calculated similarly to the softmax, obtaining the proportion of how much attention the decoder should pay to each encoder output. \(v\), \(W_{ih}\), and \(W_{oh}\) are all learnable parameters, and \(h_i\) is the current hidden state produced by the decoder based on the last timestep status (\(h_{i-1}, C_{i-1}\)) and the predicted feature \(x^{\prime }_{i-1}\).
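The sketch below evaluates Equations (7)–(9) for a single decoder step with NumPy; the parameter shapes are assumptions, and in practice the weights would be learned rather than supplied directly.

```python
import numpy as np

def additive_attention(enc_outputs, dec_hidden, v, W_ih, W_oh):
    """Additive attention following Equations (7)-(9).

    enc_outputs: (l, d) encoder hidden states h_1..h_l.
    dec_hidden:  (d,) current decoder hidden state h_i.
    v: (d,), W_ih, W_oh: (d, d) learnable parameters (assumed shapes).
    """
    u = np.tanh(enc_outputs @ W_ih.T + dec_hidden @ W_oh.T) @ v   # (9) scores u_{i,j}
    alpha = np.exp(u - u.max())
    alpha /= alpha.sum()                                          # (8) softmax weights
    attn = alpha @ enc_outputs                                     # (7) context vector attn_i
    return attn, alpha
```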

4.2.4 VesNet Structure.

Having resampled the raw AIS data during preprocessing, extracted the maritime route patterns by trajectory clustering, and introduced the attentional seq2seq time series prediction framework, we propose a deep neural network called VesNet, which jointly learns to classify the input historical vessel movement sequence into the appropriate route pattern and to forecast the future navigation sequence within a period. The overall structure of VesNet is shown in Figure 7. It consists of three modules: the encoder, the decoder, and the MTL block. VesNet is an extended variant of the attentional seq2seq model, where the intermediate hidden vector supports both route pattern recognition and trajectory prediction. VesNet takes the historical vessel movement sequence as input and outputs the route pattern classification result and the future movement sequence, varying from minutes to hours.
Fig. 7.
Fig. 7. Structure of VesNet.
As demonstrated in Figure 7, a vessel movement sequence (\(a_1, a_2, \ldots , a_l\)) with a uniform time interval is fed into the encoder. Each timestep input \(a_j\), where \(j=1,2,\ldots ,l\), is a four-dimensional vector comprised of vessel latitude, longitude, velocity, and course. We leverage min-max normalization for feature scaling, which constrains the input features to the range [0, 1]. After processing the input with a sequential LSTM recurrent network, we collect the returned sequence in preparation for the upcoming attention mechanism. Meanwhile, the last timestep hidden state \(h_l\), which contains the spatial and temporal characteristics of the input sequence, is concatenated with the latent representation of the departure port extracted from the historical sequence. The merged hidden vector is reserved for later route pattern classification and vessel trajectory prediction. On the decoder side, another LSTM recurrent network with length h is deployed to predict the vessel's future movement sequence. We set h to a value sufficient to cover both short-term and long-term predictions; stopping the inference at different timesteps makes VesNet predict over different time lengths. At each timestep, the decoder operates on the last timestep status (\(h^{\prime }_{i-1}, C^{\prime }_{i-1}\)) and output \(a^{\prime }_{i-1}\), where \(i=1,2,\ldots ,h\), to generate the current timestep hidden state \(h^{\prime }_i\). Note that \(h^{\prime }_0\) and \(C^{\prime }_0\) are the merged hidden vectors generated by the encoder. Later on, with the help of the attention mechanism elaborated in (7)–(9), we utilize \(h^{\prime }_i\) to query the contextual sequence obtained by the encoder, resulting in the weighted sum vector \(attn_i\). Finally, the MTL block is responsible for route pattern classification and future movement sequence forecasting. On the one hand, a softmax function activates the merged hidden vector to match the one-hot version of the route pattern cluster labeled in Section 4.1; the classification result r is then embedded to match the dimension of the attentional context. On the other hand, we concatenate \(h^{\prime }_i\), \(attn_i\), and \({\rm embedding}(r)\) to merge the hidden status of the current timestep, the queried historical context, and the route pattern information into a hybrid vector exploited for single-timestep vessel movement prediction. Specifically, the hybrid vector goes through a Traj Output (TO) module, which sequentially connects a dense layer, a ReLU layer, another dense layer, and a sigmoid layer. By iteratively following this process, a vessel movement sequence is produced. After reversing the min-max normalization, we derive the ultimate forecasts. The primary purpose of jointly learning the route pattern classification and the future movement sequence is to constrain the predictions within a prior statistical distribution using the auxiliary information provided by the extracted route knowledge, improving the forecasting precision.
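The following Keras-style fragment sketches the MTL block described above; the layer widths, the dense route embedding, and the tensor names are assumptions made for illustration, and the full VesNet wiring (the per-timestep decoding loop and the attention query) is omitted for brevity.

```python
import tensorflow as tf

def mtl_heads(merged_hidden, dec_hidden, attn_ctx, n_patterns=57, embed_dim=32):
    """Sketch of the MTL block: route classification plus one-step trajectory output.

    merged_hidden: encoder state concatenated with the departure-port representation.
    dec_hidden, attn_ctx: current decoder state h'_i and attention context attn_i.
    Layer sizes are illustrative, not the paper's configuration.
    """
    # Route pattern classification over the merged hidden vector (softmax over clusters).
    route_probs = tf.keras.layers.Dense(n_patterns, activation="softmax")(merged_hidden)
    route_embed = tf.keras.layers.Dense(embed_dim)(route_probs)        # embedding(r)
    # Hybrid vector feeding the Traj Output (TO) module: dense -> ReLU -> dense -> sigmoid.
    hybrid = tf.keras.layers.Concatenate()([dec_hidden, attn_ctx, route_embed])
    x = tf.keras.layers.Dense(64, activation="relu")(hybrid)
    step_out = tf.keras.layers.Dense(2, activation="sigmoid")(x)        # normalized (lat, lon)
    return route_probs, step_out
```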

4.2.5 VesNet Training.

Lastly, we elaborate on the end-to-end training procedure of our proposed model. Recall that we concurrently predict the route pattern category and the vessel movement sequence during the training phase. In terms of the route pattern classification, we adopt the cross entropy as the loss function
\begin{equation} \mathcal {L}_1(\theta)=-\sum _{\tau \in \mathcal {D}_{train}}\sum _{n=1}^{|\tau |}\sum _{k=1}^K 1\lbrace r^n=r_k\rbrace log(R_{\theta }(r^{\prime n}=r_k|a_1^n, a_2^n, \ldots , a_l^n)), \end{equation}
(10)
where \(\tau\) is a subset of the training dataset \(\mathcal {D}_{train}\) containing the data within one batch, with size \(|\tau |\). K is the total number of extracted route pattern categories, \(r^n\) is the ground truth route pattern, extracted in Section 4.1 to serve as the classifier training label, and \(r^{\prime n}\) is the classified pattern. \(R_{\theta }\) is the neural network for route pattern classification conditioned on a given input sequence (\(a_1^n, a_2^n, \ldots , a_l^n\)). Meanwhile, we choose the mean absolute error as the loss function for predicting the future vessel movement sequence
\begin{equation} \mathcal {L}_2(\theta)=\sum _{\tau \in \mathcal {D}_{train}}\sum _{n=1}^{|\tau |}\left|a_{out}^n-P_{\theta }\left(a_1^n, a_2^n, \ldots , a_l^n\right)\right|\!, \end{equation}
(11)
where \(a_{out}^n\) is the ground truth future movement sequence, and \(P_{\theta }\) is the neural network for movement prediction. All other notation is as defined above. To sum up, the integrated optimization function is a weighted combination of the two loss functions, expressed as
\begin{equation} \mathcal {L}=\mathcal {L}_1(\theta)+\lambda \mathcal {L}_2(\theta), \end{equation}
(12)
where \(\lambda\) is the coefficient that balances the two learning tasks during the training stage. Algorithm 2 illustrates the whole training process of the proposed VesNet model. During the entire training procedure, we employ the gradient descent approach to update the parameters \(\theta\), with a learning rate lr and a preconfigured maximum iteration number \(epoch_{max}\). Firstly, we prepare the input and output pairs for training in Step 1. Then, we calculate the gradient of the loss function in Step 3 and update the parameters \(\theta\) scaled by the factor lr in Step 4. The update procedure is repeated until the maximum iteration constraint is reached or some other early stopping condition is met. Finally, the trained VesNet model is used for further testing and maritime-related applications.
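A minimal TensorFlow sketch of the combined objective in Equations (10)–(12) is given below; the function name and the default lam value (set to 100, the best value found in Section 6.5) are assumptions of this illustration.

```python
import tensorflow as tf

def vesnet_loss(r_true, r_pred, traj_true, traj_pred, lam=100.0):
    """Combined objective of Equations (10)-(12): cross entropy plus lambda * MAE.

    r_true is one-hot over route patterns, r_pred the softmax output;
    traj_* hold normalized coordinates of shape (batch, h, 2).
    """
    l1 = tf.keras.losses.categorical_crossentropy(r_true, r_pred)          # Eq. (10)
    l2 = tf.reduce_mean(tf.abs(traj_true - traj_pred), axis=[-2, -1])      # Eq. (11), MAE
    return tf.reduce_mean(l1 + lam * l2)                                   # Eq. (12)
```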

5 Experimental Setup

5.1 Dataset Description

We validate the effectiveness of the VesNet model on a real-world vessel trajectory dataset. The dataset contains about 1 million AIS records, including maritime navigation-related static and dynamic information. We display a few data entries and explain the meaning of each field in Table 1. The data was collected from 2017 to 2019 in the Baltic Sea and reported by various vessels such as cargo ships, tankers, tugs, and the like. The raw data is sampled at irregular intervals ranging from minutes to hours, which requires further processing. The region of interest covers a roughly rectangular area from (\(9^{\circ }{\rm E}\), \(53^{\circ }{\rm N}\)) to (\(31^{\circ }{\rm E}\), \(67^{\circ }{\rm N}\)), which is about 1465.5 km long and 1555.8 km wide.
Table 1.
Timestamp        | MMSI      | Lat (degree) | Lon (degree) | Speed (knot) | Course (degree) | Vessel type
2017/12/14 12:42 | 205451000 | 57.7413      | 10.4010      | 9.31         | 59.1            | RORO
2017/12/14 12:48 | 205451000 | 57.7539      | 10.4419      | 9.16         | 64.0            | RORO
2017/12/14 13:02 | 205451000 | 57.7840      | 10.5772      | 9.52         | 73.3            | RORO
2017/12/14 13:13 | 205451000 | 57.8118      | 10.6563      | 9.41         | 52.8            | RORO
2017/12/14 13:23 | 205451000 | 57.8204      | 10.7460      | 9.83         | 102.4           | RORO
Table 1. Raw AIS Data Examples

5.2 Experimental Settings

At first, we follow Algorithm 1 to process the raw data. In our experiments, we set the parameters involved at this stage as \(int_{th}=4\) hours, \(len_{th}=100\) mins, and \(intpl=5\) mins. After executing Algorithm 1, we obtained 1,851 independent vessel trajectories, each uniformly sampled every 5 minutes. The goal of achieving a uniform time interval is to directly match the time horizon with the sequence length, mitigating the prediction error inherently caused by an unequal sampling rate. Secondly, we apply the route pattern extraction approach discussed in Section 4.1 to the processed vessel trajectories, resulting in 57 clusters. We label each vessel trajectory with an integer indicating its cluster ID, and trajectories that belong to the same route pattern share a common number. At the last step of data preprocessing, we adopt the sliding window method to construct the input and output datasets. Specifically, from the beginning of each vessel trajectory, we let the first l timesteps of AIS data be the input sequence and the next h timesteps be the output sequence for trajectory prediction, together with the allocated cluster ID as the output for route pattern classification. By consecutively sliding from the initial to the last timestep, we format the intermediately processed trajectory data into input and output sets. Moreover, we split the dataset into training, validation, and test sets with a splitting ratio of 6:2:2 for the upcoming training stage and the performance evaluation.
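A minimal sketch of this sliding-window construction, assuming each resampled trajectory is a NumPy array of (lat, lon, SoG, CoG) rows, could look as follows; the function name and the choice to return only latitude and longitude as targets are illustrative assumptions.

```python
import numpy as np

def sliding_windows(traj, l, h):
    """Cut one resampled trajectory into (input, output) pairs for training.

    traj: (N, 4) array of (lat, lon, SoG, CoG) sampled every 5 minutes.
    Returns inputs of shape (M, l, 4) and trajectory targets of shape (M, h, 2).
    """
    xs, ys = [], []
    for start in range(len(traj) - l - h + 1):
        xs.append(traj[start:start + l])                    # observed sequence
        ys.append(traj[start + l:start + l + h, :2])        # future (lat, lon) only
    return np.stack(xs), np.stack(ys)
```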

5.3 Evaluation Metrics

Since the primary purpose of our formulated problem is vessel movement prediction, we adopt the error between the predicted GPS location and the ground truth to evaluate the performance of the trained VesNet model on the test dataset. Specifically, MAE and RMSE are selected to assess the offset on both latitude and longitude in degree. We also choose the mean earth distance between the prediction and ground truth to evaluate the precision. The three metrics are calculated below
\begin{align} \rm {MAE} & = \frac{1}{m}\sum _{i=1}^m|loc_i - loc^{\prime }_i|, \end{align}
(13)
\begin{align} \rm {RMSE} & = \sqrt {\frac{1}{m}\sum _{i=1}^m(loc_i - loc^{\prime }_i)^2}, \end{align}
(14)
\begin{align} e_{dist} & = \frac{1}{m}\sum _{i=1}^m hav(loc_i, loc^{\prime }_i), \end{align}
(15)
where \(loc_i=(lat_i, lon_i)\) is a tuple representing the ground truth vessel location, \(loc^{\prime }_i=(lat^{\prime }_i, lon^{\prime }_i)\) is the predicted result, and m is the total size of the test dataset. In (13), \(|loc_i-loc^{\prime }_i|=\frac{1}{2}(|lat_i-lat^{\prime }_i|+|lon_i-lon^{\prime }_i|)\) is the MAE considering both latitude and longitude for the \(i\)th sample in the test dataset. Similarly, in (14) \((loc_i-loc^{\prime }_i)^2=\frac{1}{2}((lat_i-lat^{\prime }_i)^2+(lon_i-lon^{\prime }_i)^2)\) is the MSE, and in (15) \(hav(loc_i, loc^{\prime }_i)\) is the earth distance. \(hav()\) is the haversine function used to calculate the great circle distance between two points, that is, the shortest distance over the earth's surface
\begin{align} & hav((lat_i, lon_i), (lat^{\prime }_i, lon^{\prime }_i)) = R \cdot c, \end{align}
(16)
\begin{align} & c = 2 \cdot {\rm atan2}(\sqrt {a}, \sqrt {1-a}), \end{align}
(17)
\begin{align} & a = {\rm sin^2}\left(\frac{\Delta \varphi }{2}\right)+{\rm cos}\varphi _i \cdot {\rm cos}\varphi ^{\prime }_i \cdot {\rm sin^2}\left(\frac{\Delta \vartheta }{2}\right)\!, \end{align}
(18)
where R is the earth's radius, \(\varphi\) is latitude, and \(\vartheta\) is longitude, both converted to radians rather than degrees. \(\Delta \varphi\) and \(\Delta \vartheta\) are the latitudinal and longitudinal differences, respectively. Smaller values of MAE, RMSE, and \(e_{dist}\) indicate more precise vessel trajectory prediction.
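For reference, a vectorized implementation of the haversine distance in Equations (16)–(18) is sketched below; the mean earth radius constant is an assumption of this sketch.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean earth radius; an assumption of this sketch

def haversine_km(lat, lon, lat_p, lon_p):
    """Great-circle distance (km) between ground truth and predictions, Eqs. (16)-(18)."""
    phi, phi_p = np.radians(lat), np.radians(lat_p)
    d_phi = np.radians(lat_p - lat)          # latitudinal difference in radians
    d_theta = np.radians(lon_p - lon)        # longitudinal difference in radians
    a = np.sin(d_phi / 2.0) ** 2 + np.cos(phi) * np.cos(phi_p) * np.sin(d_theta / 2.0) ** 2
    c = 2.0 * np.arctan2(np.sqrt(a), np.sqrt(1.0 - a))   # Eq. (17)
    return EARTH_RADIUS_KM * c                            # Eq. (16)
```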
Concerning the evaluation of route pattern classification accuracy, we utilize Recall and Precision as the corresponding metrics. The predicted route pattern set is denoted as \(\mathcal {E}_P\), and the ground truth as \(\mathcal {E}_G\). Recall is defined as \(recall=\frac{|\mathcal {E}_P \cap \mathcal {E}_G|}{|\mathcal {E}_G|}\), and Precision as \(precision=\frac{|\mathcal {E}_P \cap \mathcal {E}_G|}{|\mathcal {E}_P|}\). Larger values of Recall and Precision indicate more accurate vessel route pattern classification.

5.4 Baselines

We compare our proposed VesNet with several representative baselines specially designed for maritime trajectory prediction, including classic and recently developed approaches. Below, we give a brief introduction to each of the baseline algorithms for a clearer understanding of the underlying mechanisms.
CVM [56]: the most commonly used vessel trajectory prediction tool in real-world scenarios. It utilizes the latest velocity and course to linearly infer the future movement sequence.
ARIMA [49]: short for Auto Regressive Integrated Moving Average, a classic method for time series prediction, for example, house and stock price prediction. It captures a set of standard temporal relations in the time series data. For latitude and longitude, we establish two separate ARIMA models, and the combined results are the predicted locations.
TREAD [40]: the maritime system developed by NATO, which is implemented for vessel trajectory prediction and anomaly detection. It first extracts the route pattern and then generates a synthetic representing trajectory for each pattern, which is equivalent to the median one. For the observed vessel movement sequence, TREAD classifies it into a category based on conditional probability and follows the prepared synthetic route to make predictions.
LSTM Seq2Seq [23]: a vessel trajectory prediction approach implemented as a seq2seq framework, with the LSTM unit as the fundamental component. It takes the normalized latitude and longitude data as input.
ST-Seq2Seq [62]: though the structure of ST-Seq2Seq is similar to LSTM Seq2Seq, the main distinction is that ST-Seq2Seq takes the \(\Delta\) value of latitude and longitude as input. It calculates the difference between two successive vessel movement locations during the data preprocessing stage.
EncDec-ATTN [8]: besides seq2seq, it involves the attention mechanism for dealing with long input sequences.

5.5 Implementations

We implemented the baselines and VesNet with the machine learning library TensorFlow, version 2.7.0, and the Python tools NumPy, statsmodels, and scikit-learn. We conducted model training experiments on a GPU server with 80 GB memory and an Nvidia GeForce RTX 2080Ti GPU.

6 Performance Evaluation

6.1 Overall Performance

We compare our VesNet with the baseline models in terms of MAE, RMSE, and \(e_{dist}\) under different prediction time horizons, which are 5 min, 10 min, 30 min, 60 min, and 120 min. The overall results are shown in Table 2. We have the following observations and corresponding analyses.
Table 2.
Table 2. Overall Vessel Trajectory Prediction Comparison under Different Time Horizons
The model with the best performance differs across prediction lengths. For the short term, like 5 min and 10 min, CVM and ST-Seq2Seq achieve the best prediction precision. The reason is that a vessel moves differently from a human or vehicle and cannot rapidly change its velocity and course; therefore, the movement of a ship is almost linear within a short period. Besides CVM, which makes predictions based on linear inference, ST-Seq2Seq learns the changing trend by observing the input sequence. If the observed sequence is close to linear, ST-Seq2Seq is likely to make a linear prediction with high probability. The authors also claim that ST-Seq2Seq is suitable for short-term prediction [62]. However, for long-term predictions like 30 min, 60 min, and 120 min, our VesNet has the best performance, and other deep neural network models like LSTM Seq2Seq and EncDec-ATTN have performance close to VesNet's. Both the route pattern classification and the attention mechanism contribute to VesNet's long-term sequence prediction.
Both ARIMA and TREAD have poor performance in all experiments. ARIMA predicts latitude and longitude separately with two independent models, which ignores their internal correlation. Moreover, ARIMA is good at predicting regularly fluctuating sequences, whereas the change in a vessel's latitude and longitude is relatively steady. For TREAD, there is first the risk of route pattern misclassification. During the prediction phase, it searches all AIS records falling in a neighboring region to obtain the mean velocity and course; then, based on the derived velocity, course, and time interval, it predicts the next timestep location, thus outputting the future sequence step by step. The search range is the set of AIS records from the trajectories belonging to the same route pattern category. Nonetheless, the surrounding velocity and course may deviate severely from the current trajectory, leading to significant prediction error.
As the prediction length increases, the performance decreases for each method. When comparing the models, CVM and ST-Seq2Seq become worse in long-term prediction, while LSTM Seq2Seq and EncDec-ATTN turn out better for long-term forecasts thanks to their capability of handling long sequences. CVM and ST-Seq2Seq are suitable for 5- and 10-minute predictions. Meanwhile, our VesNet is the best choice for 30-minute, 60-minute, and 120-minute predictions.

6.2 Route Pattern Classification Performance

Additionally, we evaluate the route pattern classification capability of VesNet under various prediction time horizons, as shown in Table 3. Observing the results, we find that for each prediction length, Precision is better than Recall. Meanwhile, from 5 min to 120 min, as the prediction time horizon extends, the route pattern classification ability of VesNet improves. This is because the input sequence length increases accordingly to obtain better vessel trajectory prediction performance and is therefore more likely to describe the underlying route pattern. It also provides evidence that route pattern classification benefits vessel trajectory prediction when it becomes long-term.
Table 3.

Method     5 min               10 min              30 min              60 min              120 min
           Recall   Precision  Recall   Precision  Recall   Precision  Recall   Precision  Recall   Precision
VesNet     0.4409   0.5601     0.4837   0.5938     0.6835   0.7925     0.7066   0.8121     0.7344   0.8216

Table 3. VesNet Route Pattern Classification Accuracy Performance Comparison under Different Time Horizons
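As a point of reference, the per-horizon recall and precision in Table 3 can be computed from the predicted and ground-truth route-pattern labels. The sketch below assumes macro-averaging across route-pattern classes, which the paper does not state explicitly, and uses illustrative labels.

```python
import numpy as np
from sklearn.metrics import precision_score, recall_score

# Illustrative labels: one route-pattern class per test trajectory.
y_true = np.array([0, 0, 1, 2, 2, 1])
y_pred = np.array([0, 1, 1, 2, 0, 1])

precision = precision_score(y_true, y_pred, average="macro", zero_division=0)
recall = recall_score(y_true, y_pred, average="macro", zero_division=0)
print(f"precision={precision:.4f}, recall={recall:.4f}")
```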

6.3 Prediction Error Distribution

Besides the mean error distance \(e_{dist}\) reported in Table 2, we illustrate the distribution of \(e_{dist}\) in Figure 8. Each column corresponds to a prediction time length (from left to right: 5 min, 10 min, 30 min, 60 min, and 120 min), and each row corresponds to one method. In each plot, the x-axis is the error distance in km, the y-axis is the distribution density, the red line marks the mean \(e_{dist}\), and the green line marks the median \(e_{dist}\) (a plotting sketch is given after the figure). The results show that CVM and ST-Seq2Seq have smaller error distances when predicting the next 5 and 10 min of the vessel trajectory, with CVM giving the more precise predictions. ARIMA and TREAD predict poorly in all cases, although their \(e_{dist}\) distributions are relatively stable. Our VesNet achieves the best precision for 30-, 60-, and 120-minute future trajectory prediction, and its \(e_{dist}\) grows only slightly as the prediction horizon lengthens. In almost all cases the median \(e_{dist}\) is smaller than the mean, indicating that more than 50% of the predictions perform better than the average.
Fig. 8.
Fig. 8. Prediction error distance distribution.
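Each panel of Figure 8 can be approximated with a short plotting routine such as the one below. This is a histogram-based sketch; the figure in the paper may use kernel density estimation instead, and the styling and placeholder data are illustrative.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_error_panel(e_dist_km, ax, title):
    """One panel: density of per-sample error distance with mean (red) and median (green) lines."""
    ax.hist(e_dist_km, bins=50, density=True, alpha=0.6)
    ax.axvline(np.mean(e_dist_km), color="red", label="mean")
    ax.axvline(np.median(e_dist_km), color="green", label="median")
    ax.set_xlabel("error distance (km)")
    ax.set_ylabel("density")
    ax.set_title(title)
    ax.legend()

# Example: one row of panels for a single method across the five horizons.
fig, axes = plt.subplots(1, 5, figsize=(20, 3))
for ax, horizon in zip(axes, ["5 min", "10 min", "30 min", "60 min", "120 min"]):
    plot_error_panel(np.random.gamma(2.0, 1.0, size=1000), ax, horizon)  # placeholder data
plt.tight_layout()
```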

6.4 Case Study

As a case study, this section examines prediction performance at a waypoint, where the vessel changes course. As shown in Figure 9, we compare the results for 5- and 10-minute predictions. For a clearer view, we compare our VesNet only with the best neural network baseline, EncDec-ATTN, and the non-neural-network baseline CVM. In both cases, CVM extrapolates the historical velocity and course into a linear prediction. EncDec-ATTN is aware of the direction change but still shows some offset. VesNet predicts the course change and achieves the best performance at the waypoint because it learns to follow the route pattern. CVM's overall performance is better for 5- and 10-minute predictions, as shown in Table 2, because at this temporal granularity the vessel trajectory is almost linear and waypoint cases are few; at the waypoint itself, VesNet outperforms the linear prediction approaches.
Fig. 9.
Fig. 9. Prediction performance comparison at the waypoint.

6.5 Parameter Tuning

MTL coefficient \(\boldsymbol {\lambda }\). To demonstrate the effectiveness of the MTL block, we evaluate VesNet under different values of \(\lambda\) in the range of 1, 10, 50, and 100. As illustrated in Figure 10(a), the red and green bars show MAE and RMSE, and in Figure 10(b) the lines represent \(e_{dist}\). The best \(e_{dist}\) is obtained at \(\lambda =100\), which indicates that the future trajectory prediction task is the core component while the route pattern classification task is auxiliary (a sketch of the weighted joint loss follows the figures).
Input historical length. To show the impact of the historical trajectory, we evaluate VesNet under different input lengths of 2, 4, 6, 8, 10, and 12. As Figure 11 demonstrates, the optimal input length differs per time horizon: for 5-, 10-, 30-, 60-, and 120-minute predictions, the best input lengths are 6, 8, 12, 12, and 10, respectively. The input length should therefore be re-tuned whenever the prediction time horizon changes. In general, the optimal input length grows with the prediction length, since a short input history cannot reflect long-term vessel movement regularity.
Fig. 10.
Fig. 10. Varying of the MTL balance coefficient \(\lambda\).
Fig. 11.
Fig. 11. Varying of the input length.
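The role of \(\lambda\) can be read from the joint objective. The sketch below assumes a PyTorch-style implementation in which \(\lambda\) scales the trajectory-regression term, so a larger value emphasizes the core prediction task, consistent with the observation above; the actual formulation in the paper may instead place the weight on the classification term, and the loss choices are illustrative.

```python
import torch
import torch.nn.functional as F

def vesnet_mtl_loss(pred_traj, true_traj, pattern_logits, pattern_labels, lam=100.0):
    """
    Joint MTL objective (illustrative): a regression loss for the future trajectory
    plus a cross-entropy loss for route-pattern classification, balanced by `lam`.
    """
    traj_loss = F.mse_loss(pred_traj, true_traj)                  # (B, T, 2) predicted vs. true positions
    cls_loss = F.cross_entropy(pattern_logits, pattern_labels)    # (B, C) logits vs. (B,) class labels
    return lam * traj_loss + cls_loss
```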

6.6 Ablation Study

We perform ablation experiments by removing the two mechanisms, attention and MTL, as well as one fundamental phase, the data preprocessing procedure, to analyze their impact on the prediction performance, especially on \(e_{dist}\); Table 4 shows the results. All variants are compared against the fully functional VesNet. No-attention & MTL removes both the attention mechanism and the MTL block for route pattern classification, which is equivalent to the baseline LSTM Seq2Seq. No-MTL removes only the MTL block, which is equivalent to the baseline EncDec-ATTN. No-attention removes the attention mechanism but keeps the MTL block. No-preprocessing removes the data preprocessing procedure, so results are derived directly from the raw data.
Table 4.
Table 4. Impact of Attention, MTL, and Data Preprocessing, where \(\Delta\) Indicates the Performance Decline
The performance degradation shows that removing both attention and MTL has the second most significant impact, which confirms that combining the attentional seq2seq model with route pattern classification is effective. Removing only attention or only MTL degrades performance more modestly, suggesting that attention captures an individual vessel's movement regularity while MTL learns the overall pattern behavior. Removing the data preprocessing procedure degrades the prediction performance most drastically: the model is then trained on raw data without trajectory segmentation, abnormal speed alleviation, uniform interpolation, or trajectory clustering, and therefore lacks the instructional route pattern as a priori knowledge.
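For illustration, the uniform interpolation step of the preprocessing pipeline can be sketched as follows. The 5-minute resampling interval and the use of plain linear interpolation are assumptions; the other preprocessing steps (segmentation, abnormal speed alleviation, clustering) are not shown.

```python
import numpy as np

def resample_uniform(timestamps_s, latlon_deg, step_s=300):
    """
    Linearly interpolate an AIS trajectory onto a uniform time grid.
    timestamps_s: (N,) increasing timestamps in seconds.
    latlon_deg:   (N, 2) latitude/longitude in degrees.
    Returns the new time grid and the resampled (lat, lon) positions.
    """
    grid = np.arange(timestamps_s[0], timestamps_s[-1] + 1, step_s)
    lat = np.interp(grid, timestamps_s, latlon_deg[:, 0])
    lon = np.interp(grid, timestamps_s, latlon_deg[:, 1])
    return grid, np.stack([lat, lon], axis=1)
```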

7 Conclusions

This paper proposes VesNet, which adopts the attentional seq2seq framework for vessel trajectory prediction. We first extract route patterns from the raw AIS data and then design an MTL structure that jointly learns the route pattern and the future trajectory. Experimental results on MAE, RMSE, and error distance show the superiority of VesNet in long-term vessel trajectory prediction, and learning to classify the route pattern helps improve the prediction task. Incorporating more vessel-specific attributes into the prediction is a promising direction for future work.

References

[1]
Virginia Fernandez Arguedas, Fabio Mazzarella, and Michele Vespe. 2015. Spatio-temporal data mining for maritime situational awareness. In OCEANS 2015-Genova. IEEE, 1–8.
[2]
Regina Asariotis, Gonzalo Ayala, Mark Assaf, Celine Bacrot, Hassiba Benamara, Dominique Chantrel, Amélie Cournoyer, Marco Fugazza, Poul Hansen, Jan Hoffmann, Tomasz Kulaga, Anila Premti, Luisa Rodríguez, Benny Salo, Kamal Tahiri, Hidenobu Tokuda, Pamela Ugaz, and Frida Youssef. 2021. Review of Maritime Transport 2021. Technical Report. United Nations Conference on Trade and Development.
[3]
Stefan Atev, Grant Miller, and Nikolaos P. Papanikolopoulos. 2010. Clustering of vehicle trajectories. IEEE Transactions on Intelligent Transportation Systems 11, 3 (2010), 647–657.
[4]
Jonathan Baxter. 1997. A Bayesian/information theoretic model of learning to learn via multiple task sampling. Machine Learning 28, 1 (1997), 7–39.
[5]
Robert A. Best and J. P. Norton. 1997. A new model and efficient tracker for a target with curvilinear motion. IEEE Trans. Aerospace Electron. Systems 33, 3 (1997), 1030–1037.
[6]
Francesco Calabrese, Giusy Di Lorenzo, and Carlo Ratti. 2010. Human mobility prediction based on individual and collective geographical preferences. In 13th International IEEE Conference on Intelligent Transportation Systems. IEEE, 312–317.
[7]
Samuele Capobianco, Nicola Forti, Leonardo M. Millefiori, Paolo Braca, and Peter Willett. 2021. Uncertainty-aware recurrent encoder-decoder networks for vessel trajectory prediction. In 2021 IEEE 24th International Conference on Information Fusion (FUSION). IEEE, 1–5.
[8]
Samuele Capobianco, Leonardo M. Millefiori, Nicola Forti, Paolo Braca, and Peter Willett. 2021. Deep learning methods for vessel trajectory prediction based on recurrent neural networks. arXiv preprint arXiv:2101.02486 (2021).
[9]
Rich Caruana. 1997. Multitask learning. Machine Learning 28, 1 (1997), 41–75.
[10]
Derek Caveney. 2007. Numerical integration for future vehicle path prediction. In 2007 American Control Conference. IEEE, 3906–3912.
[11]
Lei Chen, M. Tamer Özsu, and Vincent Oria. 2005. Robust and fast similarity search for moving object trajectories. In Proceedings of the 2005 ACM SIGMOD International Conference on Management of Data. 491–502.
[12]
Xinqiang Chen, Zichuang Wang, Qiaozhi Hua, Wen-Long Shang, Qiang Luo, and Keping Yu. 2022. AI-empowered speed extraction via port-like videos for vehicular trajectory analysis. IEEE Transactions on Intelligent Transportation Systems 24, 4 (2022), 4541–4552.
[13]
Xinqiang Chen, Shubo Wu, Chaojian Shi, Yanguo Huang, Yongsheng Yang, Ruimin Ke, and Jiansen Zhao. 2020. Sensing data supported traffic flow prediction via denoising schemes and ANN: A comparison. IEEE Sensors Journal 20, 23 (2020), 14317–14328.
[14]
Yile Chen, Cheng Long, Gao Cong, and Chenliang Li. 2020. Context-aware deep model for joint mobility and time prediction. In Proceedings of the 13th International Conference on Web Search and Data Mining. 106–114.
[15]
Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, Navdeep Jaitly, Bo Li, Jan Chorowski, and Michiel Bacchiani. 2018. State-of-the-art speech recognition with sequence-to-sequence models. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 4774–4778.
[16]
Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078 (2014).
[17]
Michael Crawshaw. 2020. Multi-task learning with deep neural networks: A survey. arXiv preprint arXiv:2009.09796 (2020).
[18]
Pierre Deville, Catherine Linard, Samuel Martin, Marius Gilbert, Forrest R. Stevens, Andrea E. Gaughan, Vincent D. Blondel, and Andrew J. Tatem. 2014. Dynamic population mapping using mobile phone data. Proceedings of the National Academy of Sciences 111, 45 (2014), 15888–15893.
[19]
Pim Dijt and Pascal Mettes. 2020. Trajectory prediction network for future anticipation of ships. In Proceedings of the 2020 International Conference on Multimedia Retrieval. 73–81.
[20]
Long Duong, Trevor Cohn, Steven Bird, and Paul Cook. 2015. Low resource dependency parsing: Cross-lingual parameter sharing in a neural network parser. In Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers). 845–850.
[21]
Jie Feng, Yong Li, Chao Zhang, Funing Sun, Fanchao Meng, Ang Guo, and Depeng Jin. 2018. DeepMove: Predicting human mobility with attentional recurrent networks. In Proceedings of the 2018 World Wide Web Conference. ACM Press, 1459–1468.
[22]
Jie Feng, Zeyu Yang, Fengli Xu, Haisu Yu, Mudan Wang, and Yong Li. 2020. Learning to simulate human mobility. In Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 3426–3433.
[23]
Nicola Forti, Leonardo M. Millefiori, Paolo Braca, and Peter Willett. 2020. Prediction of vessel trajectories from AIS data via sequence-to-sequence recurrent neural networks. In ICASSP 2020-2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 8936–8940.
[24]
Qiang Gao, Fan Zhou, Goce Trajcevski, Kunpeng Zhang, Ting Zhong, and Fengli Zhang. 2019. Predicting human mobility via variational attention. In The World Wide Web Conference. 2750–2756.
[25]
H. Greidanus, M. Alvarez, T. K. Eriksen, P. Argentieri, Tülay Cokacar, A. Pesaresi, S. Falchetti, D. Nappo, F. Mazzarella, and A. Alessandrini. 2013. Basin-wide maritime awareness from multi-source ship reporting data. TransNav: International Journal on Marine Navigation and Safety of Sea Transportation 7, 2 (2013), 185–192.
[26]
Ville Hakola. 2020. Predicting Marine Traffic in the Ice-Covered Baltic Sea. Master’s thesis.
[27]
Ville Hakola. 2020. Vessel Tracking (AIS), Vessel Metadata and Dirway Datasets. Retrieved Feb. 24, 2022 from https://ieee-dataport.org/open-access/vessel-tracking-ais-vessel-metadata-and-dirway-datasets
[28]
Simen Hexeberg, Andreas L. Flåten, Bjørn-Olav H. Eriksen, and Edmund F. Brekke. 2017. AIS-based vessel trajectory prediction. In 2017 20th International Conference on Information Fusion (Fusion). IEEE, 1–8.
[29]
Jihua Huang and Han-Shue Tan. 2006. Vehicle future trajectory prediction with a DGPS/INS-based positioning system. In 2006 American Control Conference. IEEE, 5831–5836.
[30]
Ioannis Kontopoulos, Iraklis Varlamis, and Konstantinos Tserpes. 2021. A distributed framework for extracting maritime traffic patterns. International Journal of Geographical Information Science 35, 4 (2021), 767–792.
[31]
Jae-Gil Lee, Jiawei Han, and Kyu-Young Whang. 2007. Trajectory clustering: A partition-and-group framework. In Proceedings of the 2007 ACM SIGMOD International Conference on Management of Data. 593–604.
[32]
Pengfei Liu, Xipeng Qiu, and Xuanjing Huang. 2016. Recurrent neural network for text classification with multi-task learning. arXiv preprint arXiv:1605.05101 (2016).
[33]
Massimiliano Luca, Gianni Barlacchi, Bruno Lepri, and Luca Pappalardo. 2021. A survey on deep learning for human mobility. ACM Computing Surveys (CSUR) 55, 1 (2021), 1–44.
[34]
Minh-Thang Luong, Quoc V. Le, Ilya Sutskever, Oriol Vinyals, and Lukasz Kaiser. 2015. Multi-task sequence to sequence learning. arXiv preprint arXiv:1511.06114 (2015).
[35]
Fabio Mazzarella, Virginia Fernandez Arguedas, and Michele Vespe. 2015. Knowledge-based vessel position prediction using historical AIS data. In 2015 Sensor Data Fusion: Trends, Solutions, Applications (SDF). IEEE, 1–6.
[36]
Anna Monreale, Fabio Pinelli, Roberto Trasarti, and Fosca Giannotti. 2009. WhereNext: A location predictor on trajectory pattern mining. In Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 637–646.
[37]
Duong Nguyen, Rodolphe Vadaine, Guillaume Hajduch, René Garello, and Ronan Fablet. 2018. A multi-task deep learning architecture for maritime surveillance using AIS data streams. In 2018 IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 331–340.
[38]
Duc-Duy Nguyen, Chan Le Van, and Muhammad Intizar Ali. 2018. Vessel trajectory prediction using sequence-to-sequence models over spatial grid. In Proceedings of the 12th ACM International Conference on Distributed and Event-based Systems. ACM, 258–261.
[39]
Giuliana Pallotta, Michele Vespe, and Karna Bryan. 2013. Traffic knowledge discovery from AIS data. In Proceedings of the 16th International Conference on Information Fusion. IEEE, 1996–2003.
[40]
Giuliana Pallotta, Michele Vespe, and Karna Bryan. 2013. Vessel pattern knowledge discovery from AIS data: A framework for anomaly detection and route prediction. Entropy 15, 6 (2013), 2218–2245.
[41]
Luca Pappalardo, Leo Ferres, Manuel Sacasa, Ciro Cattuto, and Loreto Bravo. 2021. Evaluation of home detection algorithms on mobile phone data using individual-level ground truth. EPJ Data Science 10, 1 (2021), 29.
[42]
Jussi Poikonen. 2020. AI for Smart Ports, Part 1: Limitations of Existing Data Sources for Port Call Prediction. Retrieved Aug 20, 2023 from https://www.awake.ai/post/ai-for-smart-ports-port-call-prediction
[43]
Monica Posada, Harm Greidanus, Marlene Alvarez, Michele Vespe, Tulay Cokacar, and Silvia Falchetti. 2011. Maritime awareness for counter-piracy in the Gulf of Aden. In 2011 IEEE International Geoscience and Remote Sensing Symposium. IEEE, 249–252.
[44]
Salvatore Rinzivillo, Lorenzo Gabrielli, Mirco Nanni, Luca Pappalardo, Dino Pedreschi, and Fosca Giannotti. 2014. The purpose of motion: Learning activities from individual mobility networks. In 2014 International Conference on Data Science and Advanced Analytics (DSAA). IEEE, 312–318.
[45]
H. Rong, A. P. Teixeira, and C. Guedes Soares. 2019. Ship trajectory uncertainty prediction based on a Gaussian process model. Ocean Engineering 182 (2019), 499–511.
[46]
X. Rong Li and V. P. Jilkov. 2003. Survey of maneuvering target tracking. Part I. Dynamic models. IEEE Trans. Aerospace Electron. Systems 39, 4 (2003), 1333–1364.
[47]
Sebastian Ruder. 2017. An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098 (2017).
[48]
Yaqing Shu, Yujie Zhu, Feng Xu, Langxiong Gan, Paul Tae-Woo Lee, Jianchuan Yin, and Jihong Chen. 2023. Path planning for ships assisted by the icebreaker in ice-covered waters in the Northern Sea Route based on optimal control. Ocean Engineering 267 (2023), 113182.
[49]
Sima Siami-Namini, Neda Tavakoli, and Akbar Siami Namin. 2018. A comparison of ARIMA and LSTM in forecasting time series. In 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA). IEEE, 1394–1401.
[50]
Ilya Sutskever, Oriol Vinyals, and Quoc V. Le. 2014. Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems. 3104–3112.
[51]
Enmei Tu, Guanghao Zhang, Lily Rachmawati, Eshan Rajabally, and Guang-Bin Huang. 2017. Exploiting AIS data for intelligent maritime navigation: A comprehensive survey from data to methodology. IEEE Transactions on Intelligent Transportation Systems 19, 5 (2017), 1559–1582.
[52]
Maarten Vanhoof, Fernando Reis, Thomas Ploetz, and Zbigniew Smoreda. 2018. Assessing the quality of home detection from mobile phone data for official statistics. Journal of Official Statistics 34, 4 (2018), 935–960.
[53]
Michail Vlachos, Dimitrios Gunopulos, and Gautam Das. 2004. Rotation invariant distance measures for trajectories. In Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. 707–712.
[54]
Huandong Wang, Qiaohong Yu, Yu Liu, Depeng Jin, and Yong Li. 2021. Spatio-temporal urban knowledge graph enabled mobility prediction. Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies 5, 4 (2021), 1–24.
[55]
Tong Xia, Yunhan Qi, Jie Feng, Fengli Xu, Funing Sun, Diansheng Guo, and Yong Li. 2021. AttnMove: History enhanced trajectory recovery via attentional network. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 4494–4502.
[56]
Zhe Xiao, Xiuju Fu, Liye Zhang, and Rick Siow Mong Goh. 2019. Traffic pattern mining and forecasting technologies in maritime traffic service networks: A comprehensive survey. IEEE Transactions on Intelligent Transportation Systems 21, 5 (2019), 1796–1825.
[57]
Zhe Xiao, Loganathan Ponnambalam, Xiuju Fu, and Wanbing Zhang. 2017. Maritime traffic probabilistic forecasting based on vessels’ waterway patterns and motion behaviors. IEEE Transactions on Intelligent Transportation Systems 18, 11 (2017), 3122–3134.
[58]
Dong Yang, Lingxiao Wu, and Shuaian Wang. 2021. Can we trust the AIS destination port information for bulk ships?–Implications for shipping policy and practice. Transportation Research Part E: Logistics and Transportation Review 149 (2021), 102308.
[59]
Dong Yang, Lingxiao Wu, Shuaian Wang, Haiying Jia, and Kevin X. Li. 2019. How big data enriches maritime research–a critical review of automatic identification system (AIS) data applications. Transport Reviews 39, 6 (2019), 755–773.
[60]
Yongxin Yang and Timothy M. Hospedales. 2016. Trace norm regularised deep multi-task learning. arXiv preprint arXiv:1606.04038 (2016).
[61]
Di Yao, Chao Zhang, Zhihua Zhu, Qin Hu, Zheng Wang, Jianhui Huang, and Jingping Bi. 2018. Learning deep representation for trajectory clustering. Expert Systems 35, 2 (2018), e12252.
[62]
Lan You, Siyu Xiao, Qingxi Peng, Christophe Claramunt, Xuewei Han, Zhengyi Guan, and Jiahe Zhang. 2020. ST-Seq2Seq: A spatio-temporal feature-optimized seq2seq model for short-term vessel trajectory prediction. IEEE Access 8 (2020), 218565–218574.
[63]
Liangbin Zhao and Guoyou Shi. 2018. A method for simplifying ship trajectory based on improved Douglas–Peucker algorithm. Ocean Engineering 166 (2018), 37–46.
