
IEEE TRANSACTIONS ON INTELLIGENT TRANSPORTATION SYSTEMS, VOL. 24, NO. 8, AUGUST 2023 8431

Spatial-Temporal Aware Inductive Graph Neural Network for C-ITS Data Recovery

Wei Liang, Yuhui Li, Kun Xie, Member, IEEE, Dafang Zhang, Kuan-Ching Li, Senior Member, IEEE, Alireza Souri, Senior Member, IEEE, and Keqin Li, Fellow, IEEE

Abstract: With the prevalence of Intelligent Transportation Systems (ITS), massive numbers of sensors are deployed on roadsides, vehicles, and infrastructure. One key challenge is imputing several different types of missing entries in spatial-temporal traffic data to meet the high-quality demands of data science applied in Cooperative-ITS (C-ITS), since accurate data recovery is critical to many downstream tasks in ITS, such as traffic monitoring and decision making. To this end, this article proposes solutions to three kinds of data recovery tasks in a unified model based on spatial-temporal aware Graph Neural Networks (GNNs), named Spatial-Temporal Aware Data Recovery Network (STAR), enabling real-time and inductive inference. A residual gated temporal convolution network is designed to let the proposed model learn temporal patterns from long sequences with masks, and an adaptive memory-based attention model is used to exploit implicit spatial correlation. To further exploit the generalization power of GNNs, a sampling-based method is adopted to train the proposed model to be robust and inductive for online servicing. Extensive numerical experiments on two real-world spatial-temporal traffic datasets are performed, and the results show that the proposed STAR model consistently outperforms other baselines by a factor of 1.5 to 2.5 on all kinds of imputation tasks. Moreover, STAR can support data recovery over windows of 2 to 5 hours with its performance barely changed, and it has comparable performance in transfer learning and time-series forecasting. Experimental results demonstrate that STAR provides adequate performance and rich features for multiple data recovery tasks under the C-ITS scenario.

Index Terms: Cooperative intelligent transportation system, data recovery, graph neural network, spatial-temporal.

Manuscript received 2 August 2021; revised 9 November 2021 and 7 February 2022; accepted 22 February 2022. Date of publication 14 March 2022; date of current version 2 August 2023. This work was supported in part by the National Key Research and Development Program of China under Grant 2021YFA1000600, in part by the National Natural Science Foundation of China under Grant 62072170 and Grant 61976087, in part by the Science and Technology Project of Department of Communications of Hunan Provincial under Grant 202101, in part by the Key Research and Development Program of Hunan Province under Grant 2022GK2015, and in part by the Hunan Provincial Natural Science Foundation of China under Grant 2021JJ30141. The Associate Editor for this article was W. Wei. (Corresponding author: Kuan-Ching Li.)
Wei Liang is with the School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China, also with the College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China, and also with the Hunan Key Laboratory for Service Computing and Novel Software Technology, Xiangtan, Hunan 411201, China.
Yuhui Li, Kun Xie, and Dafang Zhang are with the College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, China.
Kuan-Ching Li is with the School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan 411201, China (e-mail: kuancli@outlook.com).
Alireza Souri is with the Department of Computer Engineering, Haliç University, 34394 Istanbul, Turkey.
Keqin Li is with the Department of Computer Science, State University of New York, New Paltz, NY 12561 USA.
Digital Object Identifier 10.1109/TITS.2022.3156266

1558-0016 © 2022 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.

Authorized licensed use limited to: R V College of Engineering. Downloaded on January 06,2025 at 08:18:10 UTC from IEEE Xplore. Restrictions apply.

I. INTRODUCTION

WITH the advancement of communication and information security technologies [1], smart cities are rapidly growing the scope and coverage of sensor networks to collect and analyze data for city management, such as traffic systems, urban security, and weather forecasting. With the widespread deployment of sensors of all types, a massive volume of data is generated [2], which makes it possible to apply advanced data science technologies in smart city applications. One of the most successful applications is Intelligent Transportation Systems (ITS), which broadly supports mitigating traffic congestion, improving road safety, increasing road capacity, and saving fuel by using data analysis algorithms. As illustrated in Figure 1, Cooperative-ITS (C-ITS) has emerged in recent years to enable multiple isolated ITS to cooperate with each other, thereby further improving safety, sustainability, efficiency, and comfort by exploiting advanced communication and collaboration between standalone agents.

As the volume of C-ITS systems and wireless communication networks expands, cases of sensor malfunction, transmission interruption, and missing data have become inevitable, and severe consequences may follow. For instance, missing values may distort statistical characteristics and cause a model to produce unexpected results, leading to erroneous conclusions and wrong decisions. In addition, deploying sensors in urban areas is expensive and laborious, not to mention the increasing system operation and maintenance costs. As a matter of fact, only a limited number of sensors is available for the C-ITS to retrieve a conspectus of the region. Hence, the data recovery¹ task is critical, since many applications may rely on it.

¹ Data recovery and data imputation are used interchangeably in this article.

Essentially, the missing patterns can be summarized into three types, namely random missing, segment missing, and blockout missing. Intuitive examples of these patterns are presented in Figure 2. Random missing may be caused by accidental packet loss, segment missing may indicate malfunctioning, and blockout missing is due to the new deployment of sensors. In practice, all three kinds of data

missing patterns co-exist in real-world collected sensor data, incurring additional difficulties for data science. If missing data is accurately reconstructed, it provides undoubtedly valuable support for autonomous driving, traffic flow prediction, and deploying virtual sensors.

Fig. 1. The demonstration of data recovery workflow in cooperative intelligent transportation system.

Fig. 2. Data missing patterns of traffic data. (a) Random Missing is caused by unexpected transmission errors, and interpolation methods can quickly fill the missing values. (b) Segment Missing is caused by power outages, sensor malfunctioning, and extreme weather conditions. Factorization-based methods and neural network-based models can fill these missing values. (c) Blockout Missing is caused by new deployments or long-term failures. Filling missing values in such situations is challenging because no historical data is available, so nearby sensors must be used, which requires handling complicated spatial-temporal dependencies.

Unfortunately, this is not an easy task, and designing an algorithm that is both highly precise and fast raises the following needs and challenges:
• Fast and Accurate Data Recovery. The algorithm should fill the missing values as soon as possible to meet the real-time requirement of several subsequent tasks. The model should be inductive when producing the imputed data, which means no retraining when new data arrives. Matrix/tensor completion methods are mostly transductive, which means they cannot generalize to unseen nodes (spatial aspect). In addition, completion-based methods are also unable to generalize to the next time window (temporal aspect).
• Irregular Missing Pattern. Due to the randomness of failure cases and data packet loss, the missing patterns are usually highly irregular, and the total sampling rate varies. This makes representation learning difficult for such dynamic data scenes.

The fundamental challenge of the data completion task is to exploit the limited observed data, using the internal spatial-temporal correlations, to impute the missing entries effectively. Although significant progress has been made on spatial-temporal aware time series forecasting in recent years, few works focus on the neural network-based spatial-temporal imputation problem with complex missing patterns. In this article, inspired by [3], [4], which show that GNNs are promising tools for inductive tasks, we address the challenges mentioned above and propose a novel framework named Spatial-Temporal Aware Data Recovery Network (STAR) for this task based on Graph Neural Networks (GNNs). The technical contributions are threefold:
• We propose a novel inductive spatial-temporal model called STAR to solve the data imputation problem under C-ITS. Compared with transductive methods, the proposed model can meet the requirements of real-time traffic data imputation without retraining the whole model.
• The proposed model can capture spatial-temporal dependencies with semantics effectively and efficiently. The core idea is to assemble an adaptive memory-based attention network into graph convolution and to utilize a dilated Temporal Convolution Network (TCN) to accelerate training and inference.
• We conduct extensive numerical experiments on real-world sensor datasets to verify the performance of the proposed model.

The remainder of this article is organized as follows. Section II briefly reviews related works, Section III presents the methodology, Section IV discusses the experiment results of the proposed model, and finally, concluding remarks and future directions are presented in Section V.

II. RELATED WORK

A. C-ITS
ITS integrates multiple trending advanced technologies, including sensor networks, communication, control theory, and artificial intelligence. It focuses on digital technologies that provide intelligence for systems. The prevalence of these systems and emerging network technologies (e.g., 5G, WiFi6, Internet of Things (IoT), SD-WAN) enable C-ITS. Infrastructures equipped with C-ITS can cooperate to improve overall system efficiency, reliability, and sustainability. For example, Ref. [5] proposed an augmented vehicle localization that combined global navigation satellite systems (GNSS) with vehicle-to-anything (V2X) communication systems. Reference [6] exploited streaming C-ITS data to detect anomalously stopped cars and a growing pothole on the road using concept drift detection methods. Reference [7] proposed a deep neuro-evolution model to implement a cooperative control scheme that integrated ramp metering, speed limits, and lane change control agents to improve freeway traffic. Reference [8] introduced a choreography-based heterogeneous service composition platform to accelerate the reuse-based development of an urban traffic coordination application.
Despite these outstanding achievements, some open issues that hinder the application of data science for C-ITS still exist [9]. This article focuses on the data imputation problem. That is,


every single component collects traffic data and uses wireless communication to propagate messages. With the increasing volume of communication systems, data transmission errors and missing data become unavoidable. In addition, as a critical component of the system, sensors are still costly to deploy in large-scale networks [10]. Fortunately, these two problems can be alleviated by a well-designed spatial-temporal aware data recovery algorithm, so a better model for high-accuracy data recovery and estimation under C-ITS is urgently needed.

B. Traffic Flow Forecast
The traffic flow forecasting problem is a fundamental yet challenging issue. Earlier works, such as those presented in [11]–[13], attempted to treat it as a time-series prediction problem at isolated points. Unfortunately, these methods depend heavily on local seasonality features, and hence they often fail to model interstation dependencies. Recent works explore the power of GNNs in modeling spatio-temporal data. References [14], [15] proposed RNN-based methods that capture spatial and temporal dependency using graph convolution and recurrent neural networks, respectively. Other alternatives, as presented in [16], [17], are equipped with a stacked CNN-based temporal encoder and a graph convolution-based spatial encoder to gain better representation and faster training speed. Li et al. [18] summarize and benchmark the previous works on traffic flow forecasting and then propose novel RNNs with dynamic graph inputs at each step.

C. Spatial-Temporal Kriging for Blockout Missing
Gaussian process regression (GPR) [19], [20] is an effective tool to solve the Kriging problem, as it applies a flexible kernel to construct spatiotemporal correlations. Nevertheless, the major drawback of GPR is its high computation overhead, which limits its real-time application.
In recent years, neural network-based Kriging has emerged. Reference [4] overcame strong Gaussian assumptions and directly used neighboring observations when generating predictions. Ref. [21] proposed a novel generative adversarial network for recovering missing entries in a fixed-size matrix, and Reference [3] applied diffusion graph convolution and exploited a training technique to enable inductive inference. Unfortunately, most of the models mentioned above are transductive. That is, they need to retrain the entire model when the network structure changes even slightly. Some recent studies [3], [22]–[24] demonstrated that GNNs can generalize to an unseen graph structure (i.e., with new nodes or new edges introduced). Inspired by these works, we develop an inductive model to solve the spatial-temporal Kriging problem for dynamic C-ITS.

D. Spatial-Temporal Imputation for Non-Blockout Missing
Works in the literature formulate the spatial-temporal imputation problem as matrix/tensor completion, leveraging the road network structure as regularization under the matrix completion framework [25]–[27]. To further utilize spatial-temporal patterns, other approaches such as [28]–[30] tried tensor factorization to reconstruct the traffic data tensor, implicitly learning latent factors for representing spatial and temporal correlation. For example, [31] proposed a Bayesian Tensor Factorization model, [32] leveraged autoregression in tensor completion to capture strong temporal correlation in traffic data, and [33] optimized nuclear norm minimization by integrating a linear unitary transformation, achieving high scalability. However, low-rank matrix/tensor completion methods have two significant drawbacks. First, retraining is required whenever a new sparse tensor needs to be imputed, which raises severe time-complexity concerns. Second, low-rank constraints and linearity may force the model to capture a smooth pattern, limiting its ability to capture highly complex internal temporal and spatial patterns.

III. METHODOLOGY
In this section, the spatio-temporal imputation problem is formally defined. Then three building blocks are designed: temporal, spatial, and diffusion graph convolution blocks. Finally, we outline the inductive architecture of the proposed model to show how the sub-modules work together to solve the data recovery problem.

A. Notation
The mathematical symbols used in this section are presented in the following table.

TABLE I
MATHEMATICAL SYMBOLS AND DESCRIPTION

B. Problem Description
The spatial-temporal imputation problem under the C-ITS scenario refers to interpolating missing data for target sensors according to sampled sensor data. Initially, we denote the entire sensor network with $N$ nodes and $E$ edges as graph $G$ and the sampled data as $X \in \mathbb{R}^{N \times T}$, where a mask $M$ is created to indicate the non-zero entries in $X$. Next, after $n$ new nodes with $e$ new edges related to them are added to the sensor network $G$, we have a new graph $G'$. Notably, the $n$ new nodes only have the $e$ edges as prior knowledge. Thus, our task is to interpolate $\hat{X} \in \mathbb{R}^{(N+n) \times T}$ according to both $G'$ and $X$ by estimating the missing history data for the $n$ nodes. Therefore, we formulate the data imputation task as a function $f$:

\[ \hat{X} = f(X, M, G') \quad \text{s.t.} \quad \hat{X} \ast M = X \ast M \tag{1} \]

According to the above formulation, we treat the data imputation task as a conditional generation problem using mask $M$.


Fig. 3. The framework of STAR.

C. Framework of STAR
We present the framework of STAR in Figure 3. It consists of two parallel feature extraction modules, graph convolution layers, and output layers. By stacking multiple graph convolution and TCN layers, our model can handle spatial-temporal dependencies at different scales. For example, we can stack more TCN layers if the input time series is long, and add more graph convolution layers to capture long-range spatial dependencies.
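To make the link between the number of TCN layers and the input length concrete, the sketch below computes the receptive field of stacked stride-1 dilated convolutions. The function name `receptive_field` is hypothetical. The first configuration matches the dilation factors and kernel size reported in Section IV-C, while the doubling schedule in the second call is only an assumed example of exponential growth.

```python
from typing import Sequence

def receptive_field(kernel_size: int, dilations: Sequence[int]) -> int:
    """Receptive field, in time steps, of stacked stride-1 dilated 1-D convolutions."""
    # Each layer widens the field by (kernel_size - 1) * dilation steps.
    return 1 + sum((kernel_size - 1) * d for d in dilations)

# Six layers with kernel size 2 and dilations 1,2,1,2,1,2 (the setting in Section IV-C).
print(receptive_field(2, [1, 2, 1, 2, 1, 2]))    # 10

# Assumed alternative: doubling the dilation at every layer.
print(receptive_field(2, [1, 2, 4, 8, 16, 32]))  # 64
```

Under this formula, covering a longer input window means either adding layers or increasing the dilation factors.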

Fig. 4. TCN with kernel=2, stride=2, dilation=2.

D. Temporal Feature Extraction
Recurrent Neural Network (RNN)-based approaches are commonly applied to extract features from sequences such as time series and natural language. However, RNN-based approaches have three disadvantages. First, they cannot handle long sequences well, since information held in memory may be lost. Second, they suffer from vanishing and exploding gradients. Finally, their recursive computation leads to low efficiency in parallel training and inference. Given these concerns, we adopt TCN [34] in our tasks instead of RNN. As illustrated in Figure 4, TCN applies dilated causal convolution, which can enlarge its receptive field exponentially and thus enables the proposed model to capture long-range temporal patterns while saving computation resources.
Inspired by gating mechanisms in RNNs and GLU [35], we use a residual gated TCN (RG-TCN) to control information flow more effectively in a deep network. First, we stack the corrupted data series and masks to form the original input $H^{(0)}$:

\[ H^{(0)} = \mathrm{stack}(X, M, 1 - M). \tag{2} \]

Given $H^{(l)}$ as input, RG-TCN takes the form:

\[ \tilde{H} = \tanh(W_1 H^{(l)} + b_1) \odot \mathrm{sigmoid}(W_2 H^{(l)} + b_2), \qquad H^{(l+1)} = \tilde{H} + \phi(H^{(l)}), \tag{3} \]

where $W_1$, $W_2$, $b_1$, $b_2$ are learnable parameters, $\tanh(\cdot)$ and $\mathrm{sigmoid}(\cdot)$ are two commonly used activation functions, and $\phi$ denotes a 1D convolution with a $1 \times 1$ kernel.
The positions of missing values are crucial for imputation tasks. We notice that if we input a corrupted time series into the neural network after the min-max scaler, the missing values are set to zero, making it difficult to distinguish small values from missing values. The mask, which indicates the missing values, contains positional information that guides the model to extract temporal patterns from other time slices. The architecture of the temporal feature extraction module is presented in Figure 5.
The proposed RG-TCN does not change the length of the input time series but changes the channel depth in the hidden layers. Therefore, the data keeps an identical length after being processed by the RG-TCN.

E. Attention-Based Spatial Feature
To extract spatial features for further fusion, we propose an attention-based spatial module that combines TCN, graph convolution, and an attention mechanism with linear time and space complexity. The architecture of this module is shown in Figure 6.
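Both the temporal branch and the spatial module build on the RG-TCN of Equations 2 and 3, so a minimal NumPy sketch of one such layer may help. It is an illustration under stated assumptions, not the authors' PyTorch code: the helper names (`causal_dilated_conv`, `rg_tcn_layer`, `init_layer`), the random initialization, the channel widths, and the left zero-padding are all hypothetical choices, and normalization and Leaky ReLU are omitted.

```python
import numpy as np

def causal_dilated_conv(h, w, b, dilation):
    """Kernel-size-2 causal convolution. h: (C_in, T), w: (C_out, C_in, 2), b: (C_out,)."""
    shifted = np.zeros_like(h)
    if dilation < h.shape[1]:
        shifted[:, dilation:] = h[:, :-dilation]   # input at t - dilation, zero-padded on the left
    return w[:, :, 1] @ h + w[:, :, 0] @ shifted + b[:, None]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def rg_tcn_layer(h, p, dilation):
    """Equation 3: gated activation plus a 1x1-convolution residual branch."""
    gate = np.tanh(causal_dilated_conv(h, p["w1"], p["b1"], dilation)) \
        * sigmoid(causal_dilated_conv(h, p["w2"], p["b2"], dilation))
    return gate + p["phi"] @ h                     # phi acts per time step, so length T is preserved

def init_layer(rng, c_in, c_out):
    """Small random parameters, for illustration only."""
    return {
        "w1": rng.normal(0.0, 0.1, (c_out, c_in, 2)), "b1": np.zeros(c_out),
        "w2": rng.normal(0.0, 0.1, (c_out, c_in, 2)), "b2": np.zeros(c_out),
        "phi": rng.normal(0.0, 0.1, (c_out, c_in)),
    }

rng = np.random.default_rng(0)
T = 24
x = rng.uniform(0.1, 1.0, T)                       # one sensor's scaled readings
m = (rng.random(T) > 0.2).astype(float)            # 1 = observed, 0 = missing
h = np.stack([x * m, m, 1.0 - m])                  # Equation 2: H^(0) with 3 channels

channels = [3, 8, 8]
for c_in, c_out, d in zip(channels[:-1], channels[1:], [1, 2]):
    h = rg_tcn_layer(h, init_layer(rng, c_in, c_out), d)

print(h.shape)  # (8, 24)
```

The channel depth grows from 3 to 8 while the sequence length stays at 24, which is the length-preserving behavior described above.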

Fig. 5. The architecture of temporal feature extraction module.

Fig. 6. Architecture of spatial feature extraction module.

In this module, we first feed node-wise time series into RG-TCN to extract their temporal patterns and then use graph convolution to obtain node-wise embeddings. The graph convolution aims to obtain embeddings for new nodes while aggregating neighborhood information to distinguish them from one another. Once the node-level representation is computed, we use the attention module to capture global similarity even when two nodes are in different connected components.
Recent works on spatial-temporal traffic prediction, such as those presented in [14], [16], [17], apply graph convolution to model spatial correlations. However, due to the over-smoothing problem, we cannot stack graph convolution layers many times to capture long-range dependencies, as nodes can only capture signals from a local sub-graph. Besides, there may be no path at all connecting sensors with a similar pattern. Therefore, we argue that graph convolution is insufficient to capture spatial correlation thoroughly. A self-adaptive adjacency matrix is proposed in [17] to solve this problem. However, the major drawback of this solution is that it fails to generalize to unseen nodes, which makes it a transductive method. One straightforward inductive solution is self-attention, but it suffers from $O(n^2)$ computation complexity and only captures correlation inside the given nodes. Considering the rapidly expanding network scale and complicated spatial-temporal dependencies, we adopt external attention [36]. Its linear complexity and global sample-wise memory can significantly facilitate real-time data imputation. The external attention module takes the form:

\[ H^{(l+1)} = EA(H^{(l)}) = \mathrm{Norm}(H^{(l)} M_k^{T}) M_v, \tag{4} \]

where $M_k^{T}$ and $M_v$ are two learnable parameter matrices serving as memories for key-value matching, and $\mathrm{Norm}(\cdot)$ is a two-stage normalization function that computes Softmax and L1-norm in sequence.

F. EA-Diffusion Convolution and Output Layer
The temporal and attention-based spatial feature modules are two branches for spatial-temporal feature extraction. We concatenate these two representations as the node-level embedding for further propagation inside the graph.
Real-world sensor networks have an underlying directed topology. For example, sensors are deployed on the road, which naturally forms a bi-directed graph. We adopt diffusion graph convolution networks (DGCN) [14] as the propagation layer to handle this directed graph. DGCN treats forward edges and backward edges separately to create two matrices: the forward transition matrix $A_f$ and the backward transition matrix $A_b$. Denoting the number of diffusion steps as $K$, the diffusion graph convolution layer is written as:

\[ H^{(l+1)} = \sum_{k=0}^{K} \left( A_f^{k} H^{(l)} W_{k1} + A_b^{k} H^{(l)} W_{k2} \right), \tag{5} \]

where the transition matrices $A_f$ and $A_b$ are generated through $A_f = A / \sum_j A_{ij}$ and $A_b = A^{T} / \sum_j A^{T}_{ij}$.
Graph Neural Networks rely heavily on the pre-defined adjacency matrix, which limits the ability of the neural network to capture semantic similarity inside a large-scale sensor network. In addition, semantic similarity depends on the dataset itself rather than on the network structure. Other works use attention mechanisms [37] and trainable adaptive adjacency matrices [17], [38], [39] to capture semantic similarity. However, the former suffers from high computation overhead, while the latter cannot generalize to unseen nodes. To tackle these challenges, we designed an external attention-enhanced diffusion convolution to learn semantic similarity adaptively:

\[ H^{(l+1)} = \alpha \ast EA(H^{(l)}) + \sum_{k=0}^{K} \left( A_f^{k} H^{(l)} W_{k1} + A_b^{k} H^{(l)} W_{k2} \right), \tag{6} \]

where $\alpha$ is a weight, initially set to zero, that controls semantic similarity learning, and $EA(\cdot)$ is introduced in Equation 4. Through this design, we enhance diffusion convolution with a second branch that learns semantic similarity in linear time.
To better utilize features at multiple scales and accelerate the training process, we concatenate the node features produced by each layer. In this layout, the neural network can extract specific N-hop neighborhood information for data recovery. Besides, a residual connection is added to enable information and gradients to flow through the whole network. The graph convolution layers, as well as the output layers, are presented in Figure 7.

G. Training and Loss Function
As mentioned in subsection III-B, our task is to reconstruct the missing sensor data. Intuitively, we can define the loss functions focusing only on the reconstructing errors used in


unsupervised tasks (e.g., masked language models, autoencoders). Compared with a loss function over only the masked data, the reconstruction error enables the model to generalize better to unseen samples. The reconstruction error is given below:

\[ L = \lVert X - \hat{X} \rVert_2^2 + \lambda \lVert \Theta \rVert_2, \tag{7} \]

where $\Theta$ denotes all trainable parameters, and $\lambda$ is the regularization coefficient, empirically set to 0.01.

Fig. 7. The architecture of graph convolution layers and output layers.

To learn generalized graph convolution and adapt to new network structures, we use sampling-based training strategies, such as [3], [22], [24]. As given in Algorithm 1, for each batch we randomly treat a part of the nodes as observed and the rest as Blockout missing nodes for imputation. In addition, to further improve the robustness and support multiple recovery tasks, we generate two other kinds of missing masks to simulate real-world data corruption.

Algorithm 1: Training Pseudocode (PyTorch-Style)

Input: number of iterations num_iter,
       number of batches num_batch,
       batch size batch_size,
       timespan span,
       number of masked nodes num_masked,
       training dataset X ∈ R^{n×T},
       adjacency matrix for training A,
       model to be trained model
Output: trained model model

for i = 1:num_iter do
    for j = 1:num_batch do
        /* Prepare Training Data */
        batch, label = list(), list()
        spts = randInt([0, T-span], batch_size)
        observed = randSample([0, n], n-num_masked)
        for bs = 1:batch_size do
            batch.append(X[observed, spts[bs]: spts[bs] + span])
            label.append(X[:, spts[bs]: spts[bs] + span])
        /* Prepare Mask for Data Missing */
        randMask = genRandMask(batch.shape)
        segMask = genSegMask(batch.shape)
        blockMask = genBlockMask(batch.shape)
        mask = randMask * segMask * blockMask
        /* Model Inference and Optimization */
        A_f = forwardTransitionMatrix(A)
        A_b = backwardTransitionMatrix(A)
        optimizer.zero_grad()
        yhat = model(batch, mask, A_f, A_b)
        loss = criterion(yhat, label)
        loss.backward()
        optimizer.step()
return model

IV. EXPERIMENTS
In this section, we introduce the experiment environment and evaluate the proposed model with extensive experiments.

A. Dataset
Two public spatial-temporal datasets are utilized to verify the proposed model. METR-LA records four months of traffic speed data on the highways of Los Angeles, California, USA through 207 sensors, and PEMS-Bay collects traffic speed data in California, USA. Both are normalized by a min-max scaler. We randomly select 75% of the sensors for training and hold out the remaining 25% as testing data. Besides, we split the datasets in chronological order, and the ratio of the train set to the test set is 7:3. Detailed statistical properties of the two datasets are presented in Table II.

TABLE II
STATISTICAL PROPERTIES OF TRAFFIC DATASETS

B. Baselines
We compare our model with the following baselines:
• Average, which takes the average values from a sensor's neighborhood as the prediction.
• 2D-Krige, which is provided by a Python framework for statistical simulations in geography, downloaded from https://github.com/GeoStat-Framework/PyKrige. This method is only available when sensor locations are given.
• GCN, which introduces non-linearity compared with the average model. It aggregates neighborhood information under the message passing framework.
• IGNNK, which uses stacked diffusion graph convolution layers and applies a training strategy to be inductive for the spatial Kriging task.
• STAR. Depending on which sub-modules are enabled, there are three variants: STAR-T only enables the temporal feature extraction module, STAR-S only enables the spatial feature extraction module, and STAR enables both modules for spatial-temporal feature extraction.
We also classify the baselines based on model category, spatial dependency modeling, temporal dependency modeling, and multistep imputation, as shown in Table III.
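As a point of reference for the simplest entry in the list above, the Average baseline can be sketched in a few lines of NumPy. The article does not specify how neighbors are weighted or how missing neighbors are treated, so the function name `neighbor_average`, the use of adjacency weights, and the rule of averaging only over observed neighbors are assumptions of this sketch.

```python
import numpy as np

def neighbor_average(x: np.ndarray, mask: np.ndarray, adj: np.ndarray) -> np.ndarray:
    """Fill missing entries with the adjacency-weighted mean of observed neighbours.

    x: (N, T) readings, mask: (N, T) with 1 = observed, adj: (N, N) non-negative weights.
    """
    weights = adj @ mask              # total weight of observed neighbours per (node, time)
    sums = adj @ (x * mask)           # weighted sum of observed neighbour readings
    avg = np.divide(sums, weights, out=np.zeros_like(sums), where=weights > 0)
    return mask * x + (1.0 - mask) * avg

# Toy path graph 0 - 1 - 2 where the middle sensor has no data at all (blockout).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
x = np.array([[1.0, 2.0],
              [0.0, 0.0],
              [3.0, 4.0]])
mask = np.array([[1.0, 1.0],
                 [0.0, 0.0],
                 [1.0, 1.0]])

imputed = neighbor_average(x, mask, adj)
print(imputed[1])  # [2. 3.]
```

Because this baseline has no trainable parameters and uses no temporal context, it gives a lower bound against which the learned spatial and temporal modules can be judged.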

TABLE III
SUMMARY OF MODEL USED IN EXPERIMENT

C. Settings
We implement our model in PyTorch 1.7.1 with Python 3.7 and deploy it on a server equipped with an Intel i9-9900KS processor, 32GB memory, and an NVIDIA RTX 2080 Ti GPU. For hyperparameters, we select 100 as the hidden dimension for linear mapping. To learn the long-range temporal pattern, we use six layers of RG-TCN with dilation factors 1,2,1,2,1,2, kernel size 2, and stride 1. The activation and normalization layers are Leaky ReLU [40] and Layer Normalization [41], respectively. For the gradient descent algorithm, we select the Adam optimizer [42]. The batch size is set to 8, and the learning rate is fixed to 0.008.

D. Metrics
To quantify our model's performance and compare it with other baseline methods, we choose the following three metrics:
• MAE (Mean Absolute Error). It is commonly used in evaluating the performance of regression tasks.

\[ \mathrm{MAE} = \frac{\sum_{i,j} \lvert x_{ij} - \hat{x}_{ij} \rvert}{N_{\mathrm{sample}}}. \tag{8} \]

• RMSE (Root Mean Squared Error). RMSE is used to illustrate the degree of dispersion of the sample. For non-linear fittings, a smaller RMSE indicates better regression accuracy.

\[ \mathrm{RMSE} = \sqrt{\frac{\sum_{i,j} (x_{ij} - \hat{x}_{ij})^2}{N_{\mathrm{sample}}}}. \tag{9} \]

• MAPE (Median Absolute Percentage Error). MAPE is used to estimate relative absolute error. It takes the form:

\[ \mathrm{MAPE} = \left\lvert \frac{x_{ij} - \hat{x}_{ij}}{x_{ij}} \right\rvert \times 100\%. \tag{10} \]

E. Imputation Performance
In this section, we compare the proposed model with other baselines under different conditions of missing data to demonstrate the superiority of the proposed model. First, we set the random missing ratio to 20%, and then remove 200 segments of 30 minutes in each sensor for both the trainset and the rest. Next, we hold out 25% of sensors as unsampled. Finally, the […] between nearby sensors. The naive multi-layered GCN is also a firm baseline in imputation. Specifically, for random missing and segment missing tasks, we significantly outperform IGNNK, which shows that our scheme effectively captures the spatial-temporal context.
2) Robustness: We evaluate the ability to impute highly corrupted data when the three kinds of missing data coexist. Our model achieves an MAE of 2.63 and 4.74 on PEMS-Bay and METR-LA, respectively, which is a 44% and 29% improvement compared to the best baseline model. The high imputation performance under such highly corrupted inputs shows the strong robustness of our model. We also observed that when we applied the sampling training algorithm to IGNNK, it became unstable on blockout missing tasks because it does not apply any treatment to the missing entries. Our model applies positive masks and negative masks to indicate valid and missing entries, treats missing value positions as useful information, and thus achieves robustness and accuracy.
3) Flexibility: The proposed model is trained with randomly missing data and imputes them according to the given mask, so it can support three types of missing data imputation in one single model. Moreover, according to Table IV, the proposed model achieves highly competitive performance in all kinds of imputation tasks, a feature that can help the ITS significantly reduce the cost of the entire model life-cycle.

F. Impact of Window Size
Table V presents the imputation accuracy of the STAR model and other baseline approaches for 24-, 36-, 48-, and 60-step (2 hours to 5 hours in steps of 1 hour) data recovery tasks on METR-LA and Seattle Highway datasets. The STAR model obtains the best recovery accuracy under nearly all evaluation metrics, except RMSE, for all horizons, thereby demonstrating its effectiveness for spatial-temporal aware data recovery tasks. From the experimental results, we conclude three significant features of the proposed model:
1) High Recovery Accuracy: The proposed model, which extracts the temporal features, performs better than other methods like IGNNK and Average. For example, for the 24-step recovery, STAR outperforms IGNNK by 20.7% and 11.7% on METR-LA and PEMS-Bay, respectively. The MAPE errors of STAR are significantly lower than those of IGNNK. This is mainly because IGNNK ignores internal temporal patterns.
2) Spatio-Temporal Recovery Capability: To prove that the STAR model can capture spatial and temporal dependencies, we compare the variants of the STAR model with IGNNK. As shown in Figure 8(a), methods with temporal feature extraction have better recovery precision than base-
experiment results are given in Table IV. line ones, indicating that our temporal module can capture
According to the results, we have the following conclusions: temporal patterns from traffic data. Furthermore, according to
1) High Performance: We compare the proposed model Figure 8(b), we learn that by enabling spatial attention, RMSE
with two mathematical models and two GNNs. The proposed errors decrease, suggesting that our proposed module cap-
model consistently outperforms the baseline methods by a tures long-range spatial correlation beyond pre-defined graph
large margin in all kinds of imputation tasks. We identified that structure. Finally, only exploiting spatial and temporal features
directly using neighborhood sensors can achieve competitive could reach the best performance, indicating the presence of
performance because of the strong correlation and impact spatial-temporal dependencies.
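As a minimal sketch of how the three metrics of Section D (Metrics) can be evaluated, the following Python function computes MAE, RMSE, and MAPE only over the entries selected by an evaluation mask. The function name `masked_errors`, the nested-list layout, and the mask argument are illustrative assumptions and not the released STAR code. MAPE is averaged over the evaluated entries and skips zero ground-truth values, which is also an assumption, since the normalization is not legible in Eq. (10).

```python
import math


def masked_errors(truth, pred, eval_mask):
    """Return (MAE, RMSE, MAPE in percent) over entries where eval_mask is 1.

    truth, pred, eval_mask: nested lists of shape [sensors][time steps].
    All names here are hypothetical and used only for illustration.
    """
    abs_sum = 0.0
    sq_sum = 0.0
    pct_sum = 0.0
    n = 0        # number of evaluated entries (N_sample)
    n_pct = 0    # evaluated entries with non-zero ground truth
    for x_row, p_row, m_row in zip(truth, pred, eval_mask):
        for x, p, m in zip(x_row, p_row, m_row):
            if not m:
                continue  # entry is not part of the evaluation set
            err = x - p
            abs_sum += abs(err)
            sq_sum += err * err
            n += 1
            if x != 0:  # assumption: skip zeros to avoid division by zero
                pct_sum += abs(err / x)
                n_pct += 1
    if n == 0:
        raise ValueError("eval_mask selects no entries")
    mae = abs_sum / n
    rmse = math.sqrt(sq_sum / n)
    mape = 100.0 * pct_sum / n_pct if n_pct else float("nan")
    return mae, rmse, mape


# Toy example: two sensors, two time steps, one entry excluded by the mask.
truth = [[60.0, 50.0], [40.0, 0.0]]
pred = [[57.0, 55.0], [40.0, 3.0]]
eval_mask = [[1, 1], [1, 0]]
mae, rmse, mape = masked_errors(truth, pred, eval_mask)
print(f"MAE={mae:.3f} RMSE={rmse:.3f} MAPE={mape:.2f}%")
```

Restricting the sums to masked positions reflects the imputation setting, where errors are only meaningful on entries that were deliberately hidden from the model.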

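The corruption protocol of Section E (20% random missing, 200 segments of 30 minutes per sensor, and 25% of sensors held out as unsampled) can be sketched as a mask generator. This is an illustrative sketch, not the authors' preprocessing script: the function `build_missing_masks`, its parameters, and the choice of 6 steps per segment (30 minutes at a five-minute sampling interval) are assumptions, and segments are allowed to overlap.

```python
import random


def build_missing_masks(num_sensors, num_steps, random_ratio=0.20,
                        num_segments=200, segment_len=6,
                        holdout_ratio=0.25, seed=0):
    """Return (mask, held_out); mask[i][t] is 1 if observed and 0 if removed.

    All parameter names and defaults are hypothetical illustrations of the
    three missing patterns described in the text.
    """
    rng = random.Random(seed)
    mask = [[1] * num_steps for _ in range(num_sensors)]

    # Random missing: drop each entry independently with probability random_ratio.
    for row in mask:
        for t in range(num_steps):
            if rng.random() < random_ratio:
                row[t] = 0

    # Segment missing: blank num_segments windows of segment_len steps per
    # sensor (windows may overlap in this sketch).
    for row in mask:
        for _ in range(num_segments):
            start = rng.randrange(0, num_steps - segment_len + 1)
            for t in range(start, start + segment_len):
                row[t] = 0

    # Unsampled sensors: hide every reading of the held-out sensors.
    num_held = round(holdout_ratio * num_sensors)
    held_out = sorted(rng.sample(range(num_sensors), num_held))
    for i in held_out:
        mask[i] = [0] * num_steps
    return mask, held_out


# Small usage example with fewer segments so the toy series is not emptied.
mask, held_out = build_missing_masks(num_sensors=8, num_steps=2000,
                                     num_segments=20, seed=1)
observed = sum(sum(row) for row in mask)
print("held-out sensors:", held_out)
print("observed fraction:", observed / (8 * 2000))
```

Keeping the three patterns in a single binary mask mirrors the way the model is described as imputing "according to the given mask", so one trained model can be evaluated on each pattern by changing only the mask.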

TABLE IV
MODEL PERFORMANCE UNDER DIFFERENT IMPUTATION TASKS

TABLE V
PERFORMANCE COMPARISON WITH DIFFERENT TIMESTEPS

Fig. 8. Spatial-temporal aware recovery capability. (a) The comparison of (non)temporal approaches on RMSE under different lengths of horizons. (b) The comparison of (non)spatial approaches on RMSE under different lengths of horizons. The suffixes -S and -T indicate that the corresponding feature extraction block is enabled.

Fig. 9. Long-term data recovery capability. (a) The change in MAE and RMSE of the STAR model under different recovery horizons. (b) The RMSE errors of the STAR model and other baselines under different recovery horizons.

TABLE VI
ABLATION STUDY ON DIFFERENT MODULES

3) Long Range Recovery: The proposed model succeeds in obtaining the best recovery performance regardless of the changes in the prediction lengths. Furthermore, the performance is stable as the number of time steps increases, and thus the proposed model can be applied for both short-term and long-term imputation.
As shown in Figure 9(a), which plots the change of MAE and RMSE at varied recovery lengths, the errors change slowly even when the time step increases by a large margin. In addition, as depicted in Figure 9(b), the proposed model is compared with the baselines and outperforms all of them. Together with the fact that it is not sensitive to the length, this suggests that the imputation relies more on local features than on global ones.

G. Ablation Study
To examine the effect of the key components that contribute to the improved outcomes of STAR, we conduct experiments on two traffic datasets. Here, we concentrate on three kinds of factors: spatial block, temporal block, and external attention. For each factor, a new model is built by removing the corresponding blocks, and we name the variants of STAR as follows:
• w/o EA: This is STAR without the adaptive weighted external attention modules that capture semantic similarity. The graph convolution layer is replaced with diffusion convolution.
• w/o T: This is STAR without the temporal feature extraction branch before the graph convolution layers.

• w/o S: This is STAR without the spatial feature extraction branch.
We assess the performance with the early-stopping strategy to prevent overfitting and present the experimental results in Table VI. The introduction of external attention modules significantly improves the performance by providing global sample-wise attention with trainable fractions. We can see the sharp decrease in the accuracy of semantic feature extraction (w/o S vs. w/o EA vs. Full), indicating the strong correlation and rich semantic similarity inside traffic sensor data series. One explanation for why attention can improve accuracy is that it learns the similarity between nodes, like matrix factorization. By implicitly learning the low-rank property, one can accurately recover missing entries by learning from a similar node. The ablation study on the feature extraction, i.e., spatial feature only (w/o T) and temporal feature only (w/o S), shows degraded performance, which indicates that the two branches before the graph convolution layers can effectively capture the spatial-temporal features for further data imputation. The improvement indicates that, for data imputation tasks, it is better to embed nodes into a spatial-temporal context before diffusion through the adjacent graph. Comparing (w/o S) with (w/o T), we find that spatial features are more important than temporal features, which demonstrates that the traffic speed may be impacted more by nearby traffic conditions.

H. Imputation Visualization
To better understand the behavior of the STAR model, we randomly select one sensor on PEMS-Bay and visualize the recovery results at different prediction lengths. The following four figures show the recovered time series and the ground truth values in the test set for horizons from 2 to 5 hours. The results are shown in Figure 10, Figure 11, Figure 12, and Figure 13:

Fig. 10. Data recovery for 24 steps.

Fig. 11. Data recovery for 36 steps.

Fig. 12. Data recovery for 48 steps.

Through the figures presented above, we can learn that our model successfully captures the periodicity of traffic data. Moreover, our model generally provides comparable results for the data series without a clear trend or periodical pattern. Besides, by jointly analyzing the four figures, we can learn that the prediction performance has no apparent changes, indicating a long-sequence processing capability.

I. Time Series Prediction
As a particular type of segment missing, the time series prediction problem could fit into our data imputation framework if we changed the mask to force the model to impute the missing values at the end of the observed windows. With this in mind, we experiment to investigate whether our model can be applied to forecasting tasks. We train our model from scratch to predict traffic data using the same setting as in [16]. The time series prediction results are presented in Table VII.
According to Table VII, our model has highly competitive performance in time series prediction though it is designed for data imputation. This feature is due to the power of RG-TCN and GNNs, which effectively extract the temporal patterns and propagate them to nearby sensors. We notice that


our model has significant advantages in long-range prediction tasks compared with other baselines. Besides, since we forecast the upcoming sensor data for all nodes in one forward computation, our model has lower RMSE and MAPE. It indicates that direct multistep predictions have higher accuracy because there is no error accumulation.

Fig. 13. Data recovery for 60 steps.

TABLE VII
TIME SERIES PREDICTION ACCURACY ON TRAFFIC DATASET

J. Transfer Learning
References [3] and [24] reported that well-designed GNNs can learn general message passing mechanisms and generalize to a similar dataset. We investigate this phenomenon and report the numerical results in the following table:

TABLE VIII
TRANSFER PERFORMANCE ON TWO TRAFFIC DATASETS

We observe from the results obtained that it is possible to train the model on one dataset and directly apply it to another dataset with competitive performance. For example, when we train STAR on METR-LA, it shows a sharp degradation when transferring to PEMS-Bay. In contrast, when we train STAR on PEMS-Bay and test on METR-LA, the transferred model performs even better than the non-transfer model on random missing and segment missing.
This exciting result indicates that real-world traffic data may share similar spatial-temporal patterns. Furthermore, considering the high similarity between METR-LA and PEMS-Bay, since they both collect traffic data every five minutes and provide a distance matrix, pre-trained models and transfer learning have full potential for data-driven traffic analysis.

K. Complexity Analysis
Before we analyze the complexity, we first introduce some notation. Let N denote the number of sensors, T the length of the input time series, E the number of edges in graph G, and d the number of hidden units at each layer.
1) Time Complexity: First, the proposed model receives input of size N × T, which is identical to many other imputation models. Second, we apply RG-TCN as a temporal feature extraction module, and its time complexity is O(NT) (it can be viewed as a sliding window moving over all input time series). Third, the DiffConv layer propagates the node embeddings in a message-passing manner and thereby has O(Ed) time complexity. The attention module has O(Nd) time complexity. Adding them up, the time complexity of our model is O(NT) + O(Ed) + O(Nd).
2) Memory Complexity: Assume that our operations are fully in-place. First, the input occupies O(NT) memory. Second, the graph convolution and attention need O(Nd) space to store the intermediate results. Therefore, the whole memory complexity is O(NT) + O(Nd).
3) Numerical Results: The parameters of the proposed model occupy 700Kb of disk space, and the inference speed on a server with one single NVIDIA K80 GPU is 55 ms (average) for 325 nodes with 60 time slots. This result shows that the proposed model can be served in an online manner with low latency.

V. CONCLUSION AND FUTURE DIRECTIONS

This article introduces a novel framework for spatial-temporal aware inductive data imputation, namely STAR. The GNNs are enhanced with an attention-based spatial feature extraction block to capture long-range spatial similarity and with dilated convolution-based temporal feature extraction. Besides, the proposed model is inductive, which means it can generalize to unseen nodes without retraining. Results obtained from extensive experimentation show that STAR consistently outperforms baseline models on three real-world traffic sensor datasets. Furthermore, analysis of the results demonstrates that the proposed model is insensitive to prediction length, and its flexibility permits applying it to any data recovery task and to modeling time-varying systems, such as predicting sensor data for moving autonomous cars.
From the significant results and analysis obtained, we foresee some directions for future work: (1) to extend the proposed model further to support multivariate data imputation, as there are implicit correlations between collected

series that can improve data recovery and decision-making, (2) the high accuracy time series forecast can be explored and then implemented into the proposed model, (3) to develop a unified model to handle all kinds of missing data problems, and lastly, (4) to optimize the proposed model to be more efficient, meeting the specific requirements of low-latency real-time applications.

REFERENCES
[1] W. Liang, D. Zhang, X. Lei, M. Tang, K.-C. Li, and A. Y. Zomaya, “Circuit copyright blockchain: Blockchain-based homomorphic encryption for IP circuit protection,” IEEE Trans. Emerg. Topics Comput., vol. 9, no. 3, pp. 1410–1420, Jul. 2021.
[2] W. Liang, S. Xie, D. Zhang, X. Li, and K.-C. Li, “A mutual security authentication method for RFID-PUF circuit based on deep learning,” ACM Trans. Internet Technol., vol. 22, no. 2, pp. 1–20, May 2022.
[3] Y. Wu, D. Zhuang, A. Labbe, and L. Sun, “Inductive graph neural networks for spatiotemporal Kriging,” in Proc. AAAI, 2021, pp. 4478–4485.
[4] G. Appleby, L. Liu, and L.-P. Liu, “Kriging convolutional networks,” in Proc. AAAI Conf. Artif. Intell., 2020, vol. 34, no. 4, pp. 3187–3194.
[5] M. Brambilla, M. Nicoli, G. Soatti, and F. Deflorio, “Augmenting vehicle localization by cooperative sensing of the driving environment: Insight on data association in urban traffic scenarios,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 4, pp. 1646–1663, Apr. 2020.
[6] B. Leblanc, H. Fouchal, and C. de Runz, “Obstacle detection based on cooperative-intelligent transport system data,” in Proc. IEEE Symp. Comput. Commun. (ISCC), Jul. 2020, pp. 1–6.
[7] Y. Wu, H. Tan, Z. Jiang, and B. Ran, “ES-CTC: A deep neuroevolution model for cooperative intelligent freeway traffic control,” 2019, arXiv:1905.04083.
[8] M. Autili, L. Chen, C. Englund, C. Pompilio, and M. Tivoli, “Cooperative intelligent transport systems: Choreography-based urban traffic coordination,” IEEE Trans. Intell. Transp. Syst., vol. 22, no. 4, pp. 2088–2099, Apr. 2021.
[9] M. A. Javed, S. Zeadally, and E. B. Hamida, “Data analytics for cooperative intelligent transport systems,” Veh. Commun., vol. 15, pp. 63–72, Jan. 2019.
[10] W. Liang, Y. Li, J. Xu, Z. Qin, and K.-C. Li, “QoS prediction and adversarial attack protection for distributed services under DLaaS,” IEEE Trans. Comput., to be published.
[11] C. Chen, J. Hu, Q. Meng, and Y. Zhang, “Short-time traffic flow prediction with ARIMA-GARCH model,” in Proc. IEEE Intell. Vehicles Symp. (IV), Jun. 2011, pp. 607–612.
[12] Y. Tian, K. Zhang, J. Li, X. Lin, and B. Yang, “LSTM-based traffic flow prediction with missing data,” Neurocomputing, vol. 318, pp. 297–305, Nov. 2018.
[13] H. Lu, Z. Ge, Y. Song, D. Jiang, T. Zhou, and J. Qin, “A temporal-aware LSTM enhanced by loss-switch mechanism for traffic flow forecasting,” Neurocomputing, vol. 427, pp. 169–178, Feb. 2021.
[14] Y. Li, R. Yu, C. Shahabi, and Y. Liu, “Diffusion convolutional recurrent neural network: Data-driven traffic forecasting,” in Proc. Int. Conf. Learn. Represent., 2018.
[15] L. Zhao, Y. Song, C. Zhang, and Y. Liu, “T-GCN: A temporal graph convolutional network for traffic prediction,” IEEE Trans. Intell. Transp. Syst., vol. 21, no. 9, pp. 3848–3858, Sep. 2020.
[16] B. Yu, H. Yin, and Z. Zhu, “Spatio-temporal graph convolutional networks: A deep learning framework for traffic forecasting,” in Proc. 27th Int. Joint Conf. Artif. Intell., Jul. 2018, pp. 3634–3640.
[17] Z. Wu, S. Pan, G. Long, J. Jiang, and C. Zhang, “Graph WaveNet for deep spatial-temporal graph modeling,” in Proc. 28th Int. Joint Conf. Artif. Intell., Aug. 2019.
[18] F. Li, J. Feng, H. Yan, G. Jin, D. Jin, and Y. Li, “Dynamic graph convolutional recurrent network for traffic prediction: Benchmark and solution,” 2021, arXiv:2104.14917.
[19] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning. Cambridge, MA, USA: MIT Press, 2006.
[20] N. Cressie and C. K. Wikle, Statistics for Spatio-Temporal Data. Hoboken, NJ, USA: Wiley, 2015.
[21] J. Yoon, J. Jordon, and M. Schaar, “GAIN: Missing data imputation using generative adversarial nets,” in Proc. 35th Int. Conf. Mach. Learn., 2018, pp. 5689–5698.
[22] W. L. Hamilton, R. Ying, and J. Leskovec, “Inductive representation learning on large graphs,” in Proc. 31st Int. Conf. Neural Inf. Process. Syst., 2017, pp. 1025–1035.
[23] H. Zeng, H. Zhou, A. Srivastava, R. Kannan, and V. Prasanna, “GraphSAINT: Graph sampling based inductive learning method,” in Proc. Int. Conf. Learn. Represent., 2020.
[24] M. Zhang and Y. Chen, “Inductive matrix completion based on graph neural networks,” in Proc. Int. Conf. Learn. Represent., 2020.
[25] T. Zhou, H. Shan, A. Banerjee, and G. Sapiro, “Kernelized probabilistic matrix factorization: Exploiting graphs and side information,” in Proc. SIAM Int. Conf. Data Mining, Apr. 2012, pp. 403–414.
[26] J. Strahl, J. Peltonen, H. Mamitsuka, and S. Kaski, “Scalable probabilistic matrix factorization with graph-based priors,” in Proc. AAAI Conf. Artif. Intell., 2020, vol. 34, no. 4, pp. 5851–5858.
[27] D. Deng, C. Shahabi, U. Demiryurek, L. Zhu, R. Yu, and Y. Liu, “Latent space model for road networks to predict time-varying traffic,” in Proc. 22nd ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2016, pp. 1525–1534.
[28] K. Takeuchi, H. Kashima, and N. Ueda, “Autoregressive tensor factorization for spatio-temporal predictions,” in Proc. IEEE Int. Conf. Data Mining (ICDM), Nov. 2017, pp. 1105–1110.
[29] M. T. Bahadori, Q. R. Yu, and Y. Liu, “Fast multivariate spatio-temporal analysis via low rank tensor learning,” in Proc. Adv. Neural Inf. Process. Syst., vol. 27, 2014, pp. 3491–3499.
[30] A. B. Said and A. Erradi, “Spatiotemporal tensor completion for improved urban traffic imputation,” IEEE Trans. Intell. Transp. Syst., early access, Mar. 11, 2021, doi: 10.1109/TITS.2021.3062999.
[31] X. Chen, Z. He, Y. Chen, Y. Lu, and J. Wang, “Missing traffic data imputation and pattern discovery with a Bayesian augmented tensor factorization model,” Transp. Res. C, Emerg. Technol., vol. 104, pp. 66–77, Jul. 2019.
[32] X. Chen, M. Lei, N. Saunier, and L. Sun, “Low-rank autoregressive tensor completion for spatiotemporal traffic data imputation,” IEEE Trans. Intell. Transp. Syst., early access, Sep. 27, 2021, doi: 10.1109/TITS.2021.3113608.
[33] X. Chen, Y. Chen, N. Saunier, and L. Sun, “Scalable low-rank tensor learning for spatiotemporal traffic data imputation,” Transp. Res. C, Emerg. Technol., vol. 129, Aug. 2021, Art. no. 103226.
[34] S. Bai, J. Zico Kolter, and V. Koltun, “An empirical evaluation of generic convolutional and recurrent networks for sequence modeling,” 2018, arXiv:1803.01271.
[35] Y. N. Dauphin, A. Fan, M. Auli, and D. Grangier, “Language modeling with gated convolutional networks,” in Proc. Int. Conf. Mach. Learn., 2017, pp. 933–941.
[36] M.-H. Guo, Z.-N. Liu, T.-J. Mu, and S.-M. Hu, “Beyond self-attention: External attention using two linear layers for visual tasks,” 2021, arXiv:2105.02358.
[37] M. Xu et al., “Spatial-temporal transformer networks for traffic flow forecasting,” 2020, arXiv:2001.02908.
[38] L. Bai, L. Yao, C. Li, X. Wang, and C. Wang, “Adaptive graph convolutional recurrent network for traffic forecasting,” in Proc. Adv. Neural Inf. Process. Syst., vol. 33, 2020, pp. 17804–17815.
[39] Z. Wu, S. Pan, G. Long, J. Jiang, X. Chang, and C. Zhang, “Connecting the dots: Multivariate time series forecasting with graph neural networks,” in Proc. 26th ACM SIGKDD Int. Conf. Knowl. Discovery Data Mining, Aug. 2020, pp. 753–763.
[40] A. L. Maas et al., “Rectifier nonlinearities improve neural network acoustic models,” in Proc. ICML, 2013, vol. 30, no. 1, p. 3.
[41] J. Lei Ba, J. Ryan Kiros, and G. E. Hinton, “Layer normalization,” 2016, arXiv:1607.06450.
[42] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. 3rd Int. Conf. Learn. Represent. (ICLR), San Diego, CA, USA, May 2015.

Wei Liang received the Ph.D. degree in computer science and technology from Hunan University in 2013. He was a Post-Doctoral Scholar at Lehigh University from 2014 to 2016. He is currently a Professor and the Dean of the School of Computer Science and Engineering, Hunan University of Science and Technology, China. He has authored or coauthored more than 140 journal/conference papers, such as IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS, IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTING, IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, and IEEE INTERNET OF THINGS JOURNAL. His research interests include blockchain security technology, networks security protection, embedded system and hardware IP protection, fog computing, and security management in wireless sensor networks (WSN).


Yuhui Li is currently a Graduate Student at Hunan University, China. He has published several high-quality peer-reviewed papers on top journals and conferences, including IEEE TRANSACTIONS ON COMPUTERS and IEEE Conference on Multimedia Expo. His research interests include intelligent transportation systems, service computing, blockchain security, and deep learning.

Kun Xie (Member, IEEE) received the Ph.D. degree in computer application from Hunan University, China, in 2007. She is currently a Professor with Hunan University and the Peng Cheng Laboratory, China. She has published over 60 articles in major journals and conference proceedings, including the IEEE/ACM TRANSACTIONS ON NETWORKING, IEEE TRANSACTIONS ON MOBILE COMPUTING, IEEE TRANSACTIONS ON COMPUTERS, IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, IEEE TRANSACTIONS ON SERVICES COMPUTING, SIGMOD, INFOCOM, ICDCS, SECON, DSN, and IWQoS. Her research interests include network measurement, network security, big data, and AI.

Dafang Zhang received the Ph.D. degree in applied mathematics from Hunan University, China, in 1997. He was a Visiting Fellow with Regina University, Canada, from 2002 to 2003; and a Senior Visiting Fellow with Michigan State University, USA, in 2013. He is currently a Professor at the College of Computer Science and Electronic Engineering, Hunan University. He has authored or coauthored more than 230 journal/conference papers and is the principal investigator (PI) for more than 30 large-scale scientific projects. His research interests include dependable systems/networks, network security, network measurement, hardware security, and IP protection.

Kuan-Ching Li (Senior Member, IEEE) received the Ph.D. degree in electrical engineering from the University of São Paulo (USP), Brazil, in 2001. He has published more than 380 scientific papers and articles. He is the coauthor or a co-editor of more than 30 books published by Taylor & Francis, Springer, and McGraw-Hill. His research interests include parallel and distributed computing, big data, and emerging technologies. He is a fellow of IET and a member of AAAS. Additionally, he has been actively involved in many major conferences and workshops as the program/general/steering conference chairperson positions and has organized numerous conferences and workshops. He is the Editor-in-Chief of Connection Science and also serves at leading positions for several scientific journals.

Alireza Souri (Senior Member, IEEE) received the Ph.D. degree in computer engineering from the Science and Research Branch, Islamic Azad University, Iran, in 2018. He is currently an Assistant Professor and a Researcher at Halic University, Istanbul, Turkey. He was also recognized by the Iran’s National Elites Foundation and awarded as the National Young Elite in 2018, 2019, and 2020. He has authored/coauthored more than 80 scientific articles and conference papers in high-ranked journals and an associate editor and a guest editor for several well-known scientific journals. His research interests include formal verification, model checking, fog and cloud computing, the Internet of Things (IoT), data mining, and wireless networks.

Keqin Li (Fellow, IEEE) is currently a SUNY Distinguished Professor of computer science with the State University of New York. He is also a National Distinguished Professor with Hunan University, China. He has authored or coauthored more than 780 journal articles, book chapters, and refereed conference papers, and has received several best paper awards. He holds over 60 patents announced or authorized by the Chinese National Intellectual Property Administration. His current research interests include cloud computing, fog computing and mobile edge computing, energy-efficient computing and communication, embedded systems and cyber-physical systems, heterogeneous computing systems, big data computing, high-performance computing, CPU-GPU hybrid and cooperative computing, computer architectures and systems, computer networking, machine learning, and intelligent and soft computing. He has chaired many international conferences. He is an Associate Editor of the ACM Computing Surveys and the CCF Transactions on High Performance Computing. He has served on the editorial boards for the IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, the IEEE TRANSACTIONS ON COMPUTERS, the IEEE TRANSACTIONS ON CLOUD COMPUTING, the IEEE TRANSACTIONS ON SERVICES COMPUTING, and the IEEE TRANSACTIONS ON SUSTAINABLE COMPUTING.
