Predicting Taxi-Out Time at Congested Airports With Optimization Based Support Vector Regression Methods
Research Article
Predicting Taxi-Out Time at Congested Airports with
Optimization-Based Support Vector Regression Methods
Guan Lian,1 Yaping Zhang,1 Jitamitra Desai,2 Zhiwei Xing,3 and Xiao Luo4
1 School of Transportation Science and Engineering, Harbin Institute of Technology, Harbin, China
2 School of Mechanical & Aerospace Engineering, Nanyang Technological University, Singapore
3 Ground Support Equipment Research Base, Civil Aviation University of China, Tianjin, China
4 The Second Research Institute of Civil Aviation Administration of China, Chengdu, China
Received 9 November 2017; Revised 25 February 2018; Accepted 15 March 2018; Published 22 April 2018
Copyright © 2018 Guan Lian et al. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Accurate prediction of taxi-out time is a significant precondition for improving the operation of the departure process at an
airport, as well as for reducing long taxi-out times, congestion, and excessive emission of greenhouse gases. Unfortunately, several
of the traditional methods of predicting taxi-out time perform unsatisfactorily at congested airports. This paper describes and
tests three of those conventional methods, namely, the Generalized Linear Model, the Softmax Regression Model, and the Artificial
Neural Network method, together with two improved Support Vector Regression (SVR) approaches whose parameters are tuned by
swarm intelligence algorithms, namely, Particle Swarm Optimization (PSO) and the Firefly Algorithm. In order to improve the global
searching ability of the Firefly Algorithm, an adaptive step factor and Lévy flight are implemented simultaneously when updating the
location function. Six factors are analysed, of which delay is identified as one significant factor at congested airports. Through a
series of specific dynamic analyses, a case study of Beijing International Airport (PEK) is tested with historical data. The performance
measures show that the two proposed SVR approaches, especially the Improved Firefly Algorithm (IFA) optimization-based SVR
method, not only achieve the best modelling measures and accuracy rate compared with the representative forecast models, but
also achieve better predictive performance when dealing with abnormal taxi-out time states.
human factor. Controllable behaviours such as delays can be adjusted by altering routes and taxiing speed and even by holding at gate [2].

Better prediction of taxi-out time allows all stakeholders to arrange future activities in airport operation. Efficient taxi-out prediction methods are effective when the aim is to eliminate delays and improve the utilization of resources. Once taxi-out time is predicted in advance, operators gain a flexibility that allows them to adjust the schedule, gate assignment, and pushback plan. This achieves smoother operation of an airport and reduces its surface congestion and fuel-burn costs. The aim of this research is to develop approaches that are more accurate predictors of the taxi-out time of departing aircraft. In this paper, we introduce two methods of predicting taxi-out time, both of which arose from an analysis of the factors extracted from the Aviation System Performance (ASP) data of Beijing International Airport. The proposed models build on soft-computing approaches to predicting taxi-out time: Particle Swarm Optimization based and Improved Firefly Algorithm based Support Vector Regression. These two intelligent algorithms can search for the optimal parameters of SVR to predict the taxi-out time effectively.

The organization of this paper is as follows. The Literature Review gives a brief overview of previous attempts to analyse taxi-out time behaviours in the airport departure process and of several prediction methods. This is followed by a description of the research methodology, which includes three traditional prediction methods and two newly proposed, improved swarm intelligence algorithm-based approaches to predicting taxi-out time. The layout data of PEK airport is illustrated, along with historical data, and both are validated for analysing airport dynamics and traffic situations in the taxiing process. Results obtained from the PEK data and findings are then discussed. The conclusion summarizes the benefits that accrue from these findings, and their implications.

2. Literature Review

Several efforts have been made to address the prediction of taxi-out times. Those efforts have included both historic data-based predictions and queuing-based approaches that consider causal factors. Shumsky deemed aircraft flow and departure demand to be causal factors and used dynamic linear models to predict taxi-out time. He compared static and dynamic linear models and found the dynamic linear model better for predicting taxi-out time in a short time window [3]. Pujet modelled the departure system as queuing servers and derived a stochastic distribution for the taxi-out time. His model captured the details of the departure process to estimate taxi-out time [4]. Idris et al. analysed a number of factors that affect taxi-out time by using the Airline Service Quality Performance (ASQP) data. These factors included the runway configuration, the airline/terminal, the downstream restrictions, and the take-off queue size [5–7].

These researchers developed the queuing model for predicting taxi-out time and drew the conclusion that take-off queue size correlates best with taxi-out time, especially when the queue that each aircraft experiences is measured as the number of take-offs between its pushback time and its take-off time. Carr et al. proposed a simulation-based study of queuing dynamics and traffic rules. They predicted taxi-out time by considering aggregate metrics such as airport throughput and departure congestion [8]. Simaiakis and Balakrishnan proposed a taxi-out time prediction model with an analytical model of the aircraft departure process, which included an estimate of the distributions of unimpeded taxi-out time and the development of a queuing model of the departure runway system [9, 10].

Several statistical approaches and machine-learning methods have been applied to the prediction of aircraft taxiing time. Srivastava used high-resolution position updates from the ASDE-X surveillance system of JFK to develop a taxi-out prediction model based on the existing surface traffic conditions and short-term traffic trends [11]. Hebert and Dietz developed a multistage Markov process model of the departure process at LaGuardia airport, based on five days of data, to predict taxi-out time [12]. Balakrishna et al. proposed reinforcement learning algorithms, which could adapt to the stochastic nature of departure operations, to predict average airport taxi-out time trends approximately 30–60 minutes in advance of the given time of day [2, 13]. Ravizza et al. built a combined statistical and ground movement model and used multiple linear regression to find the function that would predict taxiing times more accurately [14]. They also used the same explanatory variables for different approaches, which included multiple linear regression, least median squared linear regression, Support Vector Regression, M5 model trees, Mamdani fuzzy rule-based systems, and TSK fuzzy rule-based systems, to predict taxi-out times and then compared these approaches [15]. Lee et al. used both fast-time simulation and machine-learning techniques to predict taxi-out time and found the Support Vector Regression method to be better than the linear regression method and the Dead Reckoning method [16].

Unfortunately, the state-of-the-art methods were tested at airports that limit how far the findings generalize. These airports have exceptionally favourable taxiing conditions, and their response to clearance and delays is quick. For airports that are large in every respect, these methods are somewhat inadequate, or they do not take some necessary factors into consideration.

3. Taxi-Out Time Prediction Techniques

There are several predictive approaches such as Artificial Neural Networks (ANN) [17], Kalman Filtering models [18], Softmax Regression (SR) [19], and Support Vector Regression (SVR) [20]. Methods with reasonable accuracy are essential for estimating taxi-out time at departure.

3.1. Generalized Linear Model. The Generalized Linear Model (GLM), formulated by Nelder and Wedderburn [21], is a flexible generalization of ordinary linear regression that allows for response variables with error-distribution models other than the normal distribution. GLM relates the linear model to the response variables through a link function and by allowing
Mathematical Problems in Engineering 3
the magnitude of the variance of each measurement to be a function of its predicted value. The relationship between predicted value $Y$ and independent variable $X$ is defined in

\[ \eta = g(E(Y)) = X_i \beta_i, \qquad Y \sim F, \tag{1} \]

where $\eta$ is the dependent variable of the linear model, $g$ is the link function, $X_i$ is a set of independent variables, $\beta_i$ represents the slope coefficients, and $F$ is the distribution of $Y$. The procedures of GLM for predicting taxi-out time are as follows.

Step 1. Input the training dataset of historical taxi-out time and the corresponding factors; check the distribution of $Y$.

Step 2. Choose the link function $g$ according to the distribution of $Y$.

Step 3. Build the regression model between $Y$ and $X$, calculate the estimated value of regression parameters $\beta_i$, and implement the significance test.

Step 4. Predict the taxi-out time by using the factors in the test dataset.

3.2. Softmax Regression Model. The Softmax Regression (SR) is a generalization of logistic regression capable of handling multiclass problems, that is, admitting more than two possible discrete outcomes [19]. The algorithm includes a training phase for estimating the regressors and a testing phase for extracting the probability of each class for each feature vector, from which the class labels are inferred. Afterwards, the SR selects the value of classified members by calculating the probabilities of

\[
h_\theta(x) =
\begin{bmatrix} \Pr(y = 1 \mid x; \theta) \\ \Pr(y = 2 \mid x; \theta) \\ \vdots \\ \Pr(y = K \mid x; \theta) \end{bmatrix}
= \frac{1}{\sum_{j=1}^{K} \exp(\theta^{(j)T} x)}
\begin{bmatrix} \exp(\theta^{(1)T} x) \\ \exp(\theta^{(2)T} x) \\ \vdots \\ \exp(\theta^{(K)T} x) \end{bmatrix},
\tag{2}
\]

and the model parameters $\theta$ are trained to minimize the cost function:

\[
J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{K} I\{y^{(i)} = j\} \log \frac{\exp(\theta^{(j)T} x^{(i)})}{\sum_{n=1}^{K} \exp(\theta^{(n)T} x^{(i)})} \right],
\tag{3}
\]

where $m$ is the number of training samples, $K$ is the number of classes, $\theta^{(1)}, \theta^{(2)}, \ldots, \theta^{(K)} \in R^n$ are the parameters of the SR model, $\theta$ is an $n$-by-$K$ matrix, and $I\{\cdot\}$ is an indicator function. SR predicts the taxi-out time with the following procedures.

Step 1. Input the training dataset of historical taxi-out time and the corresponding factors; determine the number $K$ of taxi-out time classes.

Step 2. Build the exponential distribution family by running a set of independent binary regressions according to the factor vectors of each taxi-out time class; obtain the maximum likelihood function $h_\theta(x)$.

Step 3. Establish and minimize the cost function to obtain the optimal parameter $\theta$ by using the gradient descent method.

Step 4. Update the likelihood function $h_\theta(x)$ with the optimal $\theta$, and predict the taxi-out time of the test set by using $h_\theta(x)$.

3.3. Artificial Neural Network. Artificial Neural Network (ANN) is a machine-learning method based on a large collection of connected simple units called artificial neurons. The Back-Propagation Neural Network (BPNN), a multilayer feedforward network trained by the error back-propagation algorithm, is one of the most widely used neural network models. Its topology includes an input layer, a hidden layer, and an output layer. In the output layer, the activation of a neuron is determined by

\[
\mathrm{net}_i = \sum_j w_{ij} o_j, \qquad y_i = f(\mathrm{net}_i), \qquad f(\mathrm{net}_i) = \frac{1}{1 + e^{-\mathrm{net}_i}},
\tag{4}
\]

where $\mathrm{net}_i$ is the activation of the $i$th neuron, $j$ ranges over the set of neurons in the preceding layer, $w_{ij}$ is the weight of the connection between neurons $i$ and $j$, $o_j$ is the output of neuron $j$, and $y_i$ is the output of neuron $i$ obtained through the sigmoid function $f$. The BPNN model learns from the parameter set of taxi-out time and calculates the actual output when implementing the predicting process. If the error between the actual output and the expected output does not meet the accuracy requirements, the learning rule of the BPNN reduces the error by adjusting weights and thresholds until the accuracy requirements are satisfied. The learning process of the BPNN approach can be summarized in the following steps.

Step 1. Initialize the neural network; define the minimum MSE error ($E_{\min}$) and the maximum number of iterations.

Step 2. Input the training set; initialize the weight matrix W.

Step 3. Compute the layer response output and the calculated MSE.

Step 4. Compare the calculated MSE and $E_{\min}$; if calculated MSE > $E_{\min}$, continue; else go to Step 6.

Step 5. Calculate the change in weights and update the weights; go to Step 3.

Step 6. Finish training and predict the taxi-out time by using ANN with the test set.
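For the Softmax Regression model of Section 3.2, the hypothesis (2) and the cost (3) can be written out directly. The sketch below is illustrative only: class labels run from 0 to K-1 rather than 1 to K, `theta` is stored as a list of K parameter vectors, and the function names are hypothetical. Subtracting the largest score before exponentiating is a standard numerical safeguard that leaves the probabilities in (2) unchanged.

```python
import math


def softmax_probs(theta, x):
    """h_theta(x) in (2): one probability per taxi-out time class."""
    scores = [sum(t * v for t, v in zip(theta_j, x)) for theta_j in theta]
    top = max(scores)  # shift scores for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cost(theta, xs, ys):
    """J(theta) in (3): mean negative log-probability of the true classes."""
    return -sum(math.log(softmax_probs(theta, x)[y]) for x, y in zip(xs, ys)) / len(xs)


def predict_class(theta, x):
    """Return the index of the most probable class."""
    probs = softmax_probs(theta, x)
    return max(range(len(probs)), key=probs.__getitem__)
```

With all parameters at zero every class is equally likely, so the cost equals log K; gradient descent on `softmax_cost` (Step 3) lowers it from there.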
3.4. Improved Swarm Intelligence Algorithm Based Prediction Approaches

where $\omega$ is the weight vector and $\phi(x)$ can be replaced by the kernel function $k(x, x')$. In $\varepsilon$-SVR, the objective is to find a function $f(x)$ whose outputs deviate from the training targets by at most $\varepsilon$. The $\varepsilon$-value controls the complexity of the approximating functions: small values tend to penalize a large portion of the training data, leading to tight approximating models, and large values tend to free data from penalization, leading to loose approximating models. Therefore, the proper choice of the $\varepsilon$-value is critical for the generalization of regression models [22]. The optimal regression function is determined from the estimation of $\omega$ and $b$ by solving the following optimization problem:

\[
\begin{aligned}
\text{minimize} \quad & \frac{1}{2} \|\omega\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*), \\
\text{subject to} \quad & y_i - \omega \phi(x_i) - b \le \varepsilon + \xi_i, \\
& \omega \phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \\
& \xi_i, \xi_i^* \ge 0,
\end{aligned}
\tag{6}
\]

where $\xi_i, \xi_i^*$ are the slack variables introduced to penalize complex fitting functions and the constant $C$ penalizes the error by determining the tradeoff between the training error and the model complexity. The dual problem maximizes

\[
W(\alpha, \alpha^*) = -\frac{1}{2} \sum_{i,k=1}^{N} (\alpha_i - \alpha_i^*)(\alpha_k - \alpha_k^*) \left( \phi(x_i) \cdot \phi(x_k) \right)
+ \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) y_i - \sum_{i=1}^{N} (\alpha_i + \alpha_i^*) \varepsilon.
\tag{7}
\]

The nonlinear regression function is

\[
f(x) = \sum_{i=1}^{N} (\alpha_i - \alpha_i^*) \left( \phi(x_i) \cdot \phi(x) \right) + b.
\tag{8}
\]

To avoid the complex dot product, the kernel function $k(x_i, x_j) = \phi(x_i) \cdot \phi(x_j)$ is used, through which the input variables are mapped into a high-dimensional linear feature space. Thus, (7) can be written as

3.4.2. Particle Swarm Optimization. The Particle Swarm Optimization is a swarm intelligence algorithm developed in recent years. It is a metaheuristic global optimization method based on a social-behaviour analogy, such as birds flocking and fish schooling. The PSO method solves an optimization problem by moving particles (namely, candidate solutions) through the search space, updating their velocities and positions according to simple mathematical formulae. The position of each particle is updated towards better-known positions, driven by its own and its neighbours' best performance and by the global best. Thus, in searching for the optimal solution of the problem, the velocity and position of each particle are updated with the following equations of motion:

\[
\begin{aligned}
V_i(t+1) &= \omega V_i(t) + c_1 r_1 \left( p_i^{best}(t) - p_i(t) \right) + c_2 r_2 \left( p^{gbest}(t) - p_i(t) \right), \\
p_i(t+1) &= p_i(t) + V_i(t),
\end{aligned}
\tag{10}
\]

where $V_i(t+1)$ is the updated velocity of the $i$th particle, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the weighting coefficients for the personal best and global best positions, respectively, $p_i(t)$ is the $i$th particle's position at time $t$, $p_i^{best}$ is the $i$th particle's best known position, $p^{gbest}$ is the best position known to the swarm, and $r_1$ and $r_2$ are uniformly distributed random variables in $[0, 1]$. Variants of this update equation consider best positions within a particle's local neighbourhood at time $t$.

3.4.3. Improved Firefly Algorithm Optimization. The Firefly Algorithm (FA), a recent bionic swarm optimization algorithm, is highly efficient in solving numerous optimization problems and can outperform conventional algorithms, such as GA. In this algorithm, the fireflies are attracted to each other depending on two elements: their own brightness and their attraction. The brightness depends on the location and the target value, and the higher the brightness, the better the location. Fireflies with higher brightness also have a higher degree of attraction. Low-brightness fireflies in the field of vision are attracted by high-brightness fireflies. Fireflies move randomly if they have similar fluorescent brightness.
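The PSO update (10) of Section 3.4.2 can be sketched as a small minimizer. This is a hedged illustration, not the authors' code: the inertia weight and acceleration coefficients (`w`, `c1`, `c2`) are assumed values, positions are clamped to the search box, and the position is advanced with the freshly updated velocity, which is the conventional form, whereas (10) as printed uses $V_i(t)$. In the paper's setting `fitness` would be the validation error of an SVR trained with the candidate $(C, \gamma, \varepsilon)$; any callable can stand in for it here.

```python
import random


def pso_minimize(fitness, bounds, n_particles=20, n_iter=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `fitness` over the box `bounds` with the update in (10)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                 # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # swarm best so far
    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move with the updated velocity and stay inside the box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val
```

The defaults of 20 particles and 100 iterations mirror the population size and iteration limit reported in Section 3.4.4.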
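The improved firefly update (14), with the adaptive step factor (12), can be sketched in the same style. Several choices here are assumptions rather than details from the paper: Lévy steps are drawn with Mantegna's algorithm (the source does not say how they were generated, and the exponent `beta` is that algorithm's own parameter), `rand` is taken as uniform on [0, 1], brightness is treated as the negative of a cost to be minimized, and positions are clamped to the search box. All function and parameter names are hypothetical.

```python
import math
import random


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step with Mantegna's algorithm (assumed generator)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def ifa_minimize(cost, bounds, n_fireflies=20, n_iter=100,
                 beta0=1.0, gamma=1.0, alpha0=0.5, tau=0.9, seed=0):
    """Firefly search using the adaptive step (12) and the Levy term of (14)."""
    rng = random.Random(seed)
    xs = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_fireflies)]
    vals = [cost(x) for x in xs]
    b = min(range(n_fireflies), key=vals.__getitem__)
    best, best_val = xs[b][:], vals[b]
    for t in range(n_iter):
        alpha = alpha0 * tau ** t  # step factor shrinks with the iteration count
        for i in range(n_fireflies):
            for j in range(n_fireflies):
                if vals[j] < vals[i]:  # lower cost means a brighter firefly
                    r2 = sum((a - c) ** 2 for a, c in zip(xs[i], xs[j]))
                    attract = beta0 * math.exp(-gamma * r2)
                    for d, (lo, hi) in enumerate(bounds):
                        move = (attract * (xs[j][d] - xs[i][d])
                                + alpha * (rng.random() - 0.5) * levy_step(rng))
                        xs[i][d] = min(hi, max(lo, xs[i][d] + move))
                    vals[i] = cost(xs[i])
                    if vals[i] < best_val:
                        best, best_val = xs[i][:], vals[i]
    return best, best_val
```

The occasional long jumps produced by `levy_step` are what allow a firefly to leave a local optimum, while the decaying `alpha` narrows the search in later iterations.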
Regarding the brightness as the objective function, the optimization problem can be seen as a maximization problem. The attractiveness of the fireflies is proportional to the fluorescence intensity of the nearby fireflies and decreases with distance. Define the relative fluorescence brightness of the fireflies as $I = I_0 e^{-\gamma r_{ij}^2}$ and the attractiveness as $\beta = \beta_0 e^{-\gamma r_{ij}^2}$. The distance between fireflies $i$ and $j$ is $r_{ij} = \|x_i - x_j\|$. Firefly $i$ is attracted by firefly $j$ to update its location; the location update equation is

\[ x_i = x_i + \beta_0 e^{-\gamma r_{ij}^2} (x_j - x_i) + \alpha (\mathrm{rand} - 0.5), \tag{11} \]

where $\gamma$ is the absorption coefficient, $\gamma \in [0.1, 10]$, $\beta_0$ is the attractiveness when $r_{ij} = 0$, $\alpha$ is the step factor for determining random firefly movement, and rand is a random number drawn from a Gaussian distribution, $\mathrm{rand} \in [0, 1]$.

Adaptive Step Factor. The value of the step factor affects the global and local search ability of the algorithm. A large step factor benefits the efficiency of the global search in the early iterations. As the number of iterations increases, gradually reducing the step factor helps the algorithm fine-tune within the search space. Thus a monotonically decreasing function is chosen as the step factor, which is written as

\[ \alpha = \alpha_0 \tau^t, \tag{12} \]

where $\alpha_0$ is the initial step factor, $\tau$ is the controlling parameter, empirically selected as 0.9, and $t$ is the number of iterations.

Lévy Flight. The conventional FA uses a regular random movement in stochastic optimization. This often leads to premature convergence without reaching the global optimal solution when the problem has a large number of local optimal solutions. In order to reduce the probability that the optimization process falls into a local optimal solution, this paper adopts Lévy flight when updating the location of fireflies. Lévy flight is a random walk whose step length obeys the Lévy distribution, which is a distribution of a sum of $N$ identically and independently distributed random variables. The Fourier transform is $F_N(k) = \exp(-N|k|^{\zeta})$. The step lengths follow the Lévy distribution $L(s) \sim |s|^{-\zeta}$, where $1 < \zeta \le 3$ is an index and $s$ follows a power-law distribution. The distribution has an infinite variance following

\[
\sigma^2(t) \sim
\begin{cases}
t^2, & 1 < \zeta < 2, \\
\dfrac{t^2}{\ln t}, & \zeta = 2, \\
t^{3-\zeta}, & 2 < \zeta < 3, \\
t, & \zeta \ge 3.
\end{cases}
\tag{13}
\]

Thus, by replacing the original step factor and random walk with the adaptive step factor and Lévy flight, respectively, the new update equation of IFA is written as

\[ x_i = x_i + \beta_0 e^{-\gamma r_{ij}^2} (x_j - x_i) + \alpha_0 \tau^t (\mathrm{rand} - 0.5) \otimes \text{Lévy}, \tag{14} \]

where the symbol $\otimes$ is entry-wise multiplication.

3.4.4. PSO/IFA Based Support Vector Regression. In this study, identifying the optimal parameters of the SVR model is an optimization problem. Therefore, this study combines swarm intelligence algorithms and SVR to reduce prediction errors. Considering that the number of samples of the learning data is much larger than the number of feature dimensions, the input variables are mapped into Hilbert space through the RBF kernel, which is more promising than other kernels. In order to predict departure taxi-out time more accurately, building the SVR models requires determining the penalty factor $C$, the RBF kernel parameter $\gamma$, and the $\varepsilon$-value in advance, by using PSO and IFA optimization, respectively. An inappropriate $C$ affects the training error and model complexity; $\gamma$ defines the nonlinear mapping from the input space to Hilbert space, so an inappropriate value induces overfitting or underfitting; and the $\varepsilon$-value controls the complexity of the approximating functions. The flowchart of the PSO/IFA-based SVR prediction model is shown in Figure 1.

In Figure 1, the optimized SVR prediction model includes three parts: data classification, PSO/IFA optimization, and the SVR prediction model. Historical data are divided into a training set, a validating set, and a test set. The training set is used to adjust weights and biases. The validating set is used to evaluate the performance of the trained SVR model. The test set is used to confirm the predicting accuracy. The optimization process proposes parameters for the SVR models, trains and validates the models, and then feeds the evaluated fitness values back to the optimizer, which continues searching for the optimal parameters until the accuracy requirement is met. In short, the SVR implements the regression part, whereas PSO and IFA are applied to determine the optimal SVR parameters.

The parameters of the SVR prediction models were evaluated with PSO and IFA, respectively, in order to get the optimum fitness. All prediction processes were performed in MATLAB 2012a. In the parameter optimization with both PSO and IFA methods, we initialized the maximum population size as 20 and the maximum number of iterations as 100, and each particle $k_i$ is a vector that comprises the SVR parameters; namely, $k_i = (C_i, \gamma_i, \varepsilon_i)$. The search space of the SVR parameters is $[10^{-1}, 10^{2}] \times [0, 10^{2}] \times [10^{-10}, 1]$. The termination criteria are fulfilled if there is no improvement in the fitness function and the maximum number of iterations is reached.

3.5. Performance Measures. This research aims to compare the swarm intelligence algorithm based SVR methods with other prediction methods and to evaluate performance by using the statistical prediction accuracy measures presented in (15) to (18):

(1) Root mean square error (RMSE):

\[ \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (y - \hat{y})^2}, \tag{15} \]

where $y$ is the actual value, $\hat{y}$ is the predicted value, and $N$ is the number of data samples.
(2) Mean absolute percentage error (MAPE):

\[ \mathrm{MAPE} = \frac{1}{N} \sum_{i=1}^{N} \left| \frac{y - \hat{y}}{y} \right|. \tag{16} \]

(3) Squared correlation coefficient ($r^2$):

\[
r^2 = \frac{\left( N \sum_{i=1}^{N} y \hat{y} - \sum_{i=1}^{N} y \sum_{i=1}^{N} \hat{y} \right)^2}
{\left( N \sum_{i=1}^{N} y^2 - \left( \sum_{i=1}^{N} y \right)^2 \right) \left( N \sum_{i=1}^{N} \hat{y}^2 - \left( \sum_{i=1}^{N} \hat{y} \right)^2 \right)}.
\tag{17}
\]

(4) Prediction accuracy (PA): the last performance measure is the percentage of predictions whose absolute error falls within a specified limit $l$. It indicates the percentage of the aircraft in the dataset predicted within 2, 3, and 5 minutes, as presented in (18):

\[ \mathrm{PA} = \frac{\#\ \text{of}\ |y - \hat{y}| \le l}{N} \times 100\%. \tag{18} \]

[Figure 1: flowchart of the PSO/IFA-based SVR prediction model; recoverable labels: validating set, evaluate fitness values, stop criteria, SVR prediction.]

4. Data Analysis and Observation

4.1. Data Source. The datasets in this study are from the Aviation System Performance of PEK, the second busiest airport in the world, with a huge traffic volume, as well as severe delay time. PEK airport comprises three parallel runways, with Runway 36L/18R being used for combined arrival and departure operations, Runway 36R/18L mainly dedicated to departures, and Runway 01/19 used only for arrivals, with all three runways serving both departures and arrivals at traffic rush hour (Civil Aviation Administration of China, 2013) [23].

The days from Oct. 17 to Oct. 30 were used for training, and the days between Nov. 13 and Nov. 15 were used for testing the prediction. ASP data record the following information: scheduled take-off time and scheduled landing time, applied pushback time, actual take-off time, and actual landing time of arrival flights. Using the historical data is important to ensure that the results are realistic and can be compared with the status quo at a specific airport, in order to estimate the potential situation at other similar airports.

4.2. Data Analysis. In recent years, researchers have found that departure taxi-out time is related to numerous factors, including the number of departing aircraft in the runway queue, the number of arriving aircraft taxiing, the time of day [2, 13], airlines, and taxiing route distance [14, 15]. Departure delay is also a significant factor at some specific airports such as PEK. These elements complicate the development of a methodology for predicting departure taxi-out time. In this research, the various prediction models were used for predicting the taxi-out time of each flight. In order to train the state of flights, several factors were taken into consideration. The state variable set $X = \{x_1, x_2, x_3, x_4, x_5, x_6\}$ for the prediction was determined by analysing the performance data. The configuration of three parallel runways at PEK airport reduces influence among the different runways. For a specific flight waiting for departure, the current departure queue length on the taxiway ($x_1$), the potential number of landing aircraft during the taxi-out course ($x_2$), and the distance of the taxi-out route from each gate to the runway ($x_3$) are the significant factors affecting taxi-out time. The recorded data include considerable delay information due to a great deal of traffic flow. Especially at PEK airport, the second busiest airport of the world, numerous traffic flows induce enormous delays. Delay violates the fluency of the departure and arrival
from the character of data. The taxi-out time of raw PEK performance data is recorded in minutes rather than seconds, while the models we used, except the SR model, have a search precision of $10^{-4}$.

4.3. Observations

absurd data points, such as a very extended delay time, have been eliminated from datasets, and all data samples are normalized within a range of 0 to 1 for modelling.

Figure 2 shows the observed dynamics on a training day at PEK airport. It includes actual average taxi-out time per quarter (15 min) (Figure 2(a)), departure demand per quarter (Figure 2(b)), and arrival demand per quarter (Figure 2(b)). The average taxi-out time and departure demand have two peak-hour durations: at 7:00 AM–9:00 AM and 4:00 PM–6:00 PM, respectively. The peak-hour duration of the arrival process happens from about 4:00 PM to 7:00 PM. These two overlapping durations contribute to the longer taxi-out time.

[Figure 2: (a) average taxi-out time per quarter; (b) departure/arrival demand per quarter. Horizontal axis: time of day (hours).]

Figure 3 describes the scatterplot of a training dataset, showing the linear fit between taxi-out time and delay. In general, the delay has a positive impact on taxi-out time, and $r^2$ is 0.5374. This is the reason for delay being one of the factors at a busy airport.

Figure 3: Scatterplot and linear fit between taxi-out time and delay. Horizontal axis: delay (min).

4.3.2. Dynamics of Testing Days. The testing days are from Nov. 13 to Nov. 15 in 2013. The detailed performance is shown in Table 2.

Table 2 displays the details of the testing days, which include two normal days (13th and 14th) and a day with excessive delay (15th). In order to validate the difference between normal and abnormal days, a nonparametric statistical test, the Wilcoxon-Mann-Whitney test, is implemented. Two null hypotheses of "the taxi-out time distribution is the same between two days" on the 13th-14th and 13th-15th, respectively, are tested. The $p$ value of the 13th-14th is 0.081 > 0.05 and the null hypothesis cannot be rejected, while the $p$ value of the 13th-15th is $10^{-5}$ < 0.05 and the null hypothesis is rejected. The test set and results of the statistical test can be seen in Appendix 1. Thus we can safely conclude that normal and abnormal days are statistically different.

5. Numerical Results

Through prediction of test data, the performance of each predictive method can be compared. The global optimal parameters $(C, \gamma, \varepsilon)$ in this research are (16.885, 1.401, 0.028) for PSO-SVR and (36.221, 0.917, 0.020) for IFA-SVR. A visualized comparison is made between the mean actual taxi-out time per quarter and the mean predicted taxi-out time per quarter (i.e., on the predicted days). The illustration below just
shows the actual and predicted taxi-out time curves on the 14th (Figure 4).

Figure 4: Plot of actual taxi-out time versus predicted taxi-out time. (a) GLM, SR, and ANN predicted taxi-out time to actual taxi-out time; (b) PSO-SVR and IFA-SVR predicted taxi-out time to actual taxi-out time. Horizontal axis: schedule time (hour); vertical axis: time in a quarter (min).

Table 2: Actual statistical performance of testing data.

Actual performance               Date (Nov. 2013)
                                 13th     14th     15th
Mean taxi-out time (min)         15.40    14.01    19.01
Median taxi-out time (min)       14       13       17
Std. dev. taxi-out time (min)    8.32     6.37     9.36
Mean delay (min)                 20.58    20.08    27.41

We can intuitively see in Figure 4 that the PSO and IFA based SVR models fit the actual curve better than the other approaches, especially GLM and SR, which are obviously underfitting and sometimes wrong-fitting, whereas the ANN method also fits very well.

Table 3 shows the first three performance measures for predicted datasets, and bold numbers highlight the best performance measures (closest to actual values) for each predictive method across three testing days. The introduced IFA-SVR outperforms other approaches in terms of mean taxi-out time and standard deviance, while IFA-SVR is superior on median taxi-out time. These results are closer to the actual performance of testing data. As data on the 15th presents very long taxi-out time on the whole, all mean predicted taxi-out times are less than actual values. The output results of SR are integers, since SR is based on the integral classification of training taxi-out time, which can be seen from the form of median taxi-out time. However, the standard deviance of predicted taxi-out times of GLR reveals the worst sensitivity to different parameters, and this also can be observed from the underfitting phenomenon in Figure 4. Compared with the results in [2] at Tampa International Airport, these swarm intelligence algorithm based prediction methods show better fault-tolerance ability for handling mean taxi-out time predictions, especially in excessive traffic or abnormal patterns.

The comparison results of modelling performance for each predictive method can be found in Table 4, and the best performance is also highlighted with bold numbers. Table 4 shows that the highlighted performance measures of IFA-SVR are slightly better than the results of PSO-SVR and significantly outperform other approaches. Both the newly introduced PSO-SVR and IFA-SVR have a squared correlation coefficient $r^2$ exceeding 90% on both the 13th and 14th, while it drops on the 15th because of the large number of underestimated taxi-out times on that day, which is shown in Figure 5. Figure 5 indicates a comparison of taxi-out time prediction accuracy for each predictor on the 14th and 15th, respectively, of which the $x$-axis represents the aircraft, sorted from underestimated to overestimated taxi-out times,
Table 3: A comparison of performance measures for each predictive method at PEK airport.
Table 4: A comparison of modelling performance for each predictive method at PEK airport.
and the y-axis is the error between predicted and actual taxi-out time, namely, predicted taxi-out time minus actual taxi-out time.
The vertical dashed line divides the sorted aircraft into (i) an underestimated taxi-out time region and (ii) an overestimated taxi-out time region. The distance between each dot and the 0-baseline represents the absolute error of the predicted taxi-out time for that aircraft. The number of underestimated taxi-out times in Figure 5(a) is almost in balance with the number of overestimated taxi-out times, whereas in Figure 5(b) it is the larger of the two. We can also see from Figure 5(b) the notable predictive ability of the newly introduced predictors under excessive traffic or abnormal patterns. In addition, all performance measures of PSO-SVR and IFA-SVR (except MAPE) are better on the 13th and 14th than on the 15th because the actual mean taxi-out time on the 15th is greater than on the other days.
Table 5 shows the prediction accuracy within 2, 3, and 5 min, measured by absolute error. IFA-SVR still comes out on top among the tested methods. In terms of accuracy within 2 and 5 minutes, the performance of IFA-SVR is inferior to that reported in [15] for Stockholm Arlanda Airport (79.39% versus 86.81% and 95.52% versus 99.08%). This is caused by the different traffic condition samples at the different airports. Note that the accuracy measures of linear regression in [15] are 85.3% and 99.16%, respectively, so the best-performing TSK model improves the rates by 1.78% and −0.08%, respectively. In this research, IFA-SVR improves the rates of GLM by 97.49% and 24.42%, respectively.
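The accuracy-within-tolerance measure of Table 5 and the percentage improvements quoted above can be illustrated with a minimal Python sketch. It rests on two assumptions not stated explicitly in the text: a prediction counts as "within k min" when its absolute error is at most k, and the improvement rates are relative changes with respect to the baseline method. The function names and the five sample flights are hypothetical; only the percentages 85.3, 86.81, 99.16, and 99.08 come from the comparison with [15].

```python
def accuracy_within(predicted, actual, tolerance_min):
    """Percentage of flights whose absolute error is at most tolerance_min.

    Assumption: "within k min" means |predicted - actual| <= k.
    """
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    hits = sum(1 for e in errors if e <= tolerance_min)
    return 100.0 * hits / len(errors)


def relative_improvement(baseline_pct, model_pct):
    """Relative change (in %) of model_pct with respect to baseline_pct.

    Assumption: the improvement rates in the text are relative changes.
    """
    return 100.0 * (model_pct - baseline_pct) / baseline_pct


# Hypothetical taxi-out times in minutes (illustration only, not PEK data).
actual = [12.0, 18.5, 25.0, 30.0, 9.0]
predicted = [13.5, 16.0, 25.5, 36.0, 9.0]

for k in (2, 3, 5):
    print(f"accuracy within {k} min: {accuracy_within(predicted, actual, k):.1f}%")

# Figures quoted from [15]: linear regression versus the best TSK model.
print(f"within 2 min: {relative_improvement(85.3, 86.81):+.2f}%")
print(f"within 5 min: {relative_improvement(99.16, 99.08):+.2f}%")
```

Under the relative-change assumption, the figures from [15] give about 1.77% and −0.08%, which agrees with the quoted 1.78% and −0.08% up to rounding of the published percentages.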
[Figure 5 plot area: two panels, each with y-axis "Predicted taxi time minus true taxi time (min)" (−20 to 20); x-axes "Sorted aircraft on the 14th" and "Sorted aircraft on the 15th" (0 to 500).]
Figure 5: Taxi-out time prediction accuracy at PEK airport. (a) Taxi-out time prediction accuracy on the 14th; (b) taxi-out time prediction
accuracy on the 15th.
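The ordering used on the x-axis of Figure 5 can be sketched as follows, assuming the signed error is predicted minus actual taxi-out time and that the vertical dashed line sits at the first non-negative error. The function names and the sample values are hypothetical and serve only to illustrate the construction.

```python
from bisect import bisect_left


def sorted_signed_errors(predicted, actual):
    """Signed errors (predicted - actual), sorted from most underestimated
    to most overestimated, as on the x-axis of an error plot."""
    return sorted(p - a for p, a in zip(predicted, actual))


def divider_index(sorted_errors):
    """Index of the first non-negative error in the sorted list, i.e. the
    assumed position of the dashed line separating the two regions."""
    return bisect_left(sorted_errors, 0.0)


def count_under_over(sorted_errors):
    """Number of underestimated (< 0) and overestimated (> 0) flights."""
    under = sum(1 for e in sorted_errors if e < 0)
    over = sum(1 for e in sorted_errors if e > 0)
    return under, over


# Hypothetical taxi-out times in minutes (illustration only, not PEK data).
actual = [12.0, 18.5, 25.0, 30.0, 9.0]
predicted = [13.5, 16.0, 25.5, 24.0, 9.0]

errors = sorted_signed_errors(predicted, actual)
print(errors)
print(divider_index(errors), count_under_over(errors))
```

Comparing the two counts returned by `count_under_over` is one way to check whether a testing day is balanced, as in Figure 5(a), or dominated by underestimates, as in Figure 5(b).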
[3] R. A. Shumsky, Dynamic statistical models for the prediction of aircraft take-off times [Ph.D. thesis], MIT, 1995.
[4] N. Pujet, Modelling and control of the departure process of congested airports [Ph.D. thesis], MIT, 1999.
[5] H. R. Idris, I. Anagnostakis, B. Delcaire et al., “Observations of Departure Processes at Logan Airport to Support the Development of Departure Planning Tools,” Air Traffic Control Quarterly, vol. 7, no. 4, pp. 229–257, 1999.
[6] H. Idris, Observation and analysis of departure operations at Boston Logan International Airport [Ph.D. thesis], MIT, 2001.
[7] H. Idris, J. Clarke, R. Bhuva, and L. Kang, “Queuing Model for Taxi-Out Time Estimation,” Air Traffic Control Quarterly, vol. 10, no. 1, pp. 1–22, 2002.
[8] F. Carr, A. Evans, J.-P. Clarke, and E. Feron, “Modeling and control of airport queueing dynamics under severe flow restrictions,” in Proceedings of the American Control Conference, vol. 2, pp. 1314–1319, Anchorage, Alaska, USA, May 2002.
[9] I. Simaiakis and H. Balakrishnan, “Queuing models of airport departure processes for emissions reduction,” in Proceedings of the AIAA Guidance, Navigation, and Control Conference and Exhibit, Chicago, Ill, USA, August 2009.
[10] I. Simaiakis and H. Balakrishnan, “A queuing model of the airport departure process,” Transportation Science, vol. 50, no. 1, pp. 94–109, 2016.
[11] A. Srivastava, “Improving departure taxi time predictions using ASDE-X surveillance data,” in Proceedings of the 2011 IEEE/AIAA 30th Digital Avionics Systems Conference (DASC), pp. 1–18, Seattle, Wash, USA, October 2011.
[12] J. E. Hebert and D. C. Dietz, “Modeling and analysis of an airport departure process,” Journal of Aircraft, vol. 34, no. 1, pp. 43–47, 1997.
[13] P. Balakrishna, R. Ganesan, and L. Sherry, “Application of reinforcement learning algorithms for predicting taxi-out times,” in Proceedings of the 8th USA/Europe ATM R&D Seminar, Napa, Calif, USA, June 2009.
[14] S. Ravizza, J. A. D. Atkin, M. H. Maathuis, and E. K. Burke, “A combined statistical approach and ground movement model for improving taxi time estimations at airports,” Journal of the Operational Research Society, vol. 64, no. 9, pp. 1347–1360, 2013.
[15] S. Ravizza, J. Chen, J. A. D. Atkin, P. Stewart, and E. K. Burke, “Aircraft taxi time prediction: comparisons and insights,” Applied Soft Computing, vol. 14, no. C, pp. 397–406, 2014.
[16] H. Lee, W. Malik, B. Zhang, B. Nagarajan, and Y. C. Jung, “Taxi time prediction at Charlotte airport using fast-time simulation and machine learning techniques,” in Proceedings of the 15th AIAA Aviation Technology, Integration, and Operations Conference, 2015, Dallas, Tex, USA, June 2015.
[17] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, New York, NY, USA, 1995.
[18] R. E. Kalman, “A new approach to linear filtering and prediction problems,” Journal of Fluids Engineering, vol. 82, no. 1, pp. 35–45, 1960.
[19] H.-F. Yu, F.-L. Huang, and C.-J. Lin, “Dual coordinate descent methods for logistic regression and maximum entropy models,” Machine Learning, vol. 85, no. 1-2, pp. 41–75, 2011.
[20] C. Cortes and V. Vapnik, “Support-vector networks,” Machine Learning, vol. 20, no. 3, pp. 273–297, 1995.
[21] J. Nelder and R. W. M. Wedderburn, “Generalized linear models,” Journal of the Royal Statistical Society: Series A (Statistics in Society), vol. 135, no. 3, pp. 370–384, 1972.
[22] A. J. Smola and B. Schölkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199–222, 2004.
[23] Civil Aviation Administration of China, “China’s civil aviation domestic AIP,” ZBAA AD 2-1. 2013.