Review
A Study of Network Intrusion Detection Systems Using
Artificial Intelligence/Machine Learning
Patrick Vanin 1 , Thomas Newe 1,2, * , Lubna Luxmi Dhirani 1,2 , Eoin O’Connell 1,2 , Donna O’Shea 2,3 ,
Brian Lee 2,4 and Muzaffar Rao 1,2
1 Department of Electronic and Computer Engineering, University of Limerick, V94 T9PX Limerick, Ireland
2 Confirm—SFI Centre for Smart Manufacturing, Park Point, Dublin Rd, Castletroy, V94 C928 Limerick, Ireland
3 Department of Computer Sciences, Munster Technological University (MTU), T12 P928 Cork, Ireland
4 Software Research Institute, Technological University of the Shannon, Midlands Midwest,
N37 HD68 Athlone, Ireland
* Correspondence: thomas.newe@ul.ie
Abstract: The rapid growth of the Internet and communications has resulted in a huge increase in
transmitted data. These data are coveted by attackers and they continuously create novel attacks to
steal or corrupt these data. The growth of these attacks is an issue for the security of our systems
and represents one of the biggest challenges for intrusion detection. An intrusion detection system
(IDS) is a tool that helps to detect intrusions by inspecting the network traffic. Although many
researchers have studied and created new IDS solutions, IDS still needs improving in order to have
good detection accuracy while reducing false alarm rates. In addition, many IDS struggle to detect
zero-day attacks. Recently, machine learning algorithms have become popular with researchers to
detect network intrusion in an efficient manner and with high accuracy. This paper presents the
concept of IDS and provides a taxonomy of machine learning methods. The main metrics used to
assess an IDS are presented and a review of recent IDS using machine learning is provided where the
strengths and weaknesses of each solution are outlined. Then, details of the different datasets used in the studies are provided and the accuracy of the results from the reviewed work is discussed. Finally, observations, research challenges and future trends are discussed.

Keywords: Intrusion Detection Systems (IDS); machine learning; network security; Intrusion Prevention Systems (IPS); deep learning algorithms

Citation: Vanin, P.; Newe, T.; Dhirani, L.L.; O’Connell, E.; O’Shea, D.; Lee, B.; Rao, M. A Study of Network Intrusion Detection Systems Using Artificial Intelligence/Machine Learning. Appl. Sci. 2022, 12, 11752. https://doi.org/10.3390/app122211752
Fortunately, thanks to improvements in networks and in the processing power of network components on the one hand, and to the advent of machine learning on the other, IDSs are now a key component of the security of many systems. Most machine learning techniques have been evaluated in the development of IDSs. However, the use of deep learning algorithms has not been sufficiently explored. Deep learning opens many more possibilities for tackling high false alarm rates and insufficient detection accuracy.
The first goal of this paper is to present the key concepts of IDS and machine learning.
The second goal of the paper is to present trends in, and observations on, recent machine learning intrusion detection systems. We review related published work, along with its advantages, disadvantages, datasets, techniques, and evaluation metrics. We discuss what
has been done and what could be done to improve the IDS using new machine learning
techniques. Finally, future trends and research challenges are outlined.
The paper is organized as follows: Section 2 describes the concept of IDS. Section 3
explains the different metrics used to assess IDSs. Section 4 provides a basic understanding of machine learning, Section 5 describes the datasets and Section 6 reviews the
relevant papers used in the study. Evaluation metrics are discussed in Section 7. Observa-
tions, research challenges and future trends are provided in Section 8. Finally, Section 9
concludes this paper.
raise an alarm. The advantage of anomaly detection is its flexibility to find unknown
intrusion attacks. However, in most cases it is difficult to precisely define what the
baseline of a network is, thus, the false detection rate of these techniques can be high.
• Hybrid detection combines both of the aforementioned approaches. Generally, hybrid systems have a lower false detection rate than anomaly-based techniques and can still discover new attacks.
Nowadays, IDSs by themselves are not enough for companies that want to protect
themselves from attacks. IDSs are increasingly being replaced by Intrusion Prevention
Systems (IPSs). An IPS is similar to an IDS but has active components to stop attacks before
they succeed. Usually, an IPS consists of a firewall with IDS rules. Unlike an IDS, an IPS
is placed inline, which means it continuously scans the traffic as the traffic passes through it.
Thus, an IPS needs to be fast and have high computing capacity to avoid causing latency
issues in a network, which can affect network performance for its users.
One of the main disadvantages of many IPSs is false positive attack detection. With an
IDS, a false positive can be an inconvenience, but for an IPS it can cause a DoS, as legitimate
traffic will be blocked. In addition, since IPSs, and especially Network Intrusion Prevention
Systems (NIPSs), form a single point of failure in the network, they need to be highly stable
and robust against attacks.
                  Predicted Class
                  Normal    Attack
Actual   Normal     TN        FP
Class    Attack     FN        TP
3.1. Precision
Corresponds to the ratio of correctly predicted attack samples to all the predicted
attack samples.
Precision = TP / (TP + FP)
3.2. Recall
Corresponds to the ratio of correctly predicted attack samples to all the samples that
correspond to an attack. This metric is also known as the Detection Rate.
Recall = TP / (TP + FN)
3.5. Accuracy
Corresponds to the ratio of correctly identified classes to all the samples. This metric
is often used to measure the efficiency of an IDS when the dataset is balanced.
Accuracy = (TP + TN) / (TP + FN + TN + FP)
3.6. F-Measure
Corresponds to the harmonic mean of the Precision and Recall. It is used to provide a
better evaluation of the system by showing the gap between the two metrics to see if the
solution is balanced. This metric is also known as the F-Score or the F1-Score.
F-Measure = 2 × (Precision × Recall) / (Precision + Recall)
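The four metrics above can be computed directly from the confusion-matrix counts. The following sketch (plain Python, with hypothetical counts) illustrates the formulas:

```python
def ids_metrics(tp, fp, tn, fn):
    """Compute the evaluation metrics of Section 3 from confusion-matrix counts."""
    precision = tp / (tp + fp)                   # predicted attacks that are real
    recall = tp / (tp + fn)                      # real attacks that were detected
    accuracy = (tp + tn) / (tp + fp + tn + fn)   # correct predictions over all samples
    f_measure = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, accuracy, f_measure

# Hypothetical counts for illustration only
p, r, a, f = ids_metrics(tp=90, fp=10, tn=880, fn=20)
```

Note that with these counts accuracy is high (0.97) while recall is noticeably lower, which is exactly the imbalance the F-Measure is meant to expose.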
Since the attack detection rate and false alarm rate are often opposed to each other,
evaluation of IDSs is also performed using Receiver Operating Characteristics (ROC)
analysis. A ROC curve, as shown in Figure 1, represents the trade-off between attack
detection rate and false alarm rate. The closer the ROC curve is to the top left corner
the more effective the IDS is [4]. As shown in Figure 1, ROC Curves can also be used to
compare different IDS using the same dataset.
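A ROC curve is traced by sweeping a decision threshold over the detector's anomaly scores and recording the false alarm rate and detection rate at each threshold. A minimal sketch, using invented scores and labels:

```python
def roc_points(scores, labels):
    """Return (false alarm rate, detection rate) pairs for each score threshold.

    labels: 1 = attack, 0 = normal; a higher score means more suspicious.
    """
    p = sum(labels)            # total attack samples
    n = len(labels) - p        # total normal samples
    pts = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        pts.append((fp / n, tp / p))
    return pts

# Illustrative scores only: a good detector scores attacks higher than normal traffic
scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]
labels = [1,   1,   0,   1,   0,   0]
curve = roc_points(scores, labels)
```

The closer these points hug the top left corner (low false alarm rate, high detection rate), the better the detector, as described above.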
IDSs can also be evaluated according to their time performance. The time performance
corresponds to the total time that the IDS needs to detect an intrusion. This time is
composed of the processing time and the propagation time. The processing time is the
time needed by the IDS to process the information to detect an attack. The processing
speed of the IDS needs to be as fast as possible, if not, real-time processing of intrusion is
not feasible. The propagation time is the time needed to propagate the information to the
security analyst or the Security Operation Centre (SOC). Both the processing time and the propagation time need to be as short as possible to allow security analysts enough
time to react to an attack in real-time [4].
4. Machine Learning
Machine learning is closely linked to Artificial Intelligence (AI) technology. It trains
an algorithm to find regular patterns in a dataset. This training results in a model that can be used to make predictions or automate decisions. For IDSs, machine learning can be used to detect
either known attacks or unknown attacks if the model has been sufficiently trained.
As shown in Figure 2, there are three main types of machine learning methods:
supervised, unsupervised and semi-supervised machine learning [5]. These methods
are discussed further in this section.
Supervised ML (labelled data are used to generate a function that maps an input to an output; known + unknown = known)
Frequently used algorithms: SVM, Random Forest, Linear and Logistic Regression [6,7]
Advantages: Uses labelled/trained data sets and can work with larger data sets. Works well in predictive environments/use-cases.
Disadvantages: Takes a longer time to compute.

Unsupervised ML (uses unlabeled data; known + unknown = unknown)
Frequently used algorithms: K-means, Apriori, Principal Component Analysis [9–11]
Advantages: Uses unlabeled data sets, works well in analytical environments/use-cases. Takes less time to compute.

Semi-Supervised ML (mixed labelled and unlabelled data)
Frequently used algorithms: Q-Learning, Deep Learning (DQN)
Advantages: A combination of supervised and unsupervised learning that enables the predictive and analytical aspects of analysing data.
Disadvantages: Time factor and computing complexity may depend on the combination of algorithms used.
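The supervised/unsupervised distinction above can be illustrated on toy data: a supervised learner uses the labels, while an unsupervised one must discover the structure itself. The data, centroid classifier and 2-means loop below are invented for illustration:

```python
# Toy 1-D "traffic feature" values; supervised learning sees the labels,
# unsupervised learning must discover the two groups on its own.
X = [1.0, 1.2, 0.8, 8.0, 8.5, 7.9]   # e.g. a rescaled packets-per-second feature
y = [0,   0,   0,   1,   1,   1]     # 0 = normal, 1 = attack (labels)

# Supervised: nearest-centroid classifier built from the labelled data
c0 = sum(x for x, t in zip(X, y) if t == 0) / y.count(0)
c1 = sum(x for x, t in zip(X, y) if t == 1) / y.count(1)
predict = lambda x: 0 if abs(x - c0) < abs(x - c1) else 1

# Unsupervised: 2-means clustering on the same data, labels never used
m0, m1 = min(X), max(X)              # simple initialisation
for _ in range(10):
    g0 = [x for x in X if abs(x - m0) <= abs(x - m1)]
    g1 = [x for x in X if abs(x - m0) > abs(x - m1)]
    m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
```

Both approaches recover the same two groups here, but only the supervised model can name which group is "attack"; the clustering output must be interpreted afterwards.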
5. Datasets
To train and test their models, researchers use publicly available datasets. The following is a discussion of the best-known and most widely used datasets for Intrusion Detection System training and testing.
5.1. KDDcup99
The KDDcup99 dataset has been one of the most widely used datasets to assess IDS.
It is based on the DARPA’98 dataset. The KDDcup99 contains approximately 4,900,000
samples. Each sample has 41 features and is labelled as Normal or Attack. The attack
samples are classified into four categories: Denial of Service (DoS), User to Root (U2R),
Remote to Local (R2L), and Probe. There are three different versions of KDDcup99: the first is the whole dataset, the second corresponds to 10% of the whole dataset, and the third is a test dataset containing 311,029 samples. One of the main disadvantages of this dataset is that it is imbalanced: major classes such as DoS and Probe contain many similar samples, whereas R2L and U2R have very few. Depending on which part of the dataset is used, some classes might be completely absent [13].
5.3. NSL-KDD
This dataset was created to fix the main issue of the KDDcup99 dataset. It was
proposed in 2009 by Tavallaee et al. [13]. It keeps the four attack categories of the KDDcup99.
The NSL-KDD proposes two files, a training set, and a testing set. The training set is made
of 21 different attacks and has 126,620 instances. The testing set is made of 37 different
attacks and has 22,850 instances [14].
5.4. UNSW-NB15
This dataset was created by the Australian Centre for Cyber Security. It was created
to generate traffic which is a hybrid of normal activities and attack behaviours. This
dataset has nine types of attacks: Fuzzers, Analysis, Backdoors, DoS, Exploits, Generic,
Reconnaissance, Shellcode and Worms. UNSW-NB15 provides two files, a training set and a testing set, which contain records of both attack and normal traffic drawn from the original dataset. The original dataset has 2,540,044 records, while the training set has 175,341 records and the testing set has 82,332 records [15].
5.5. CICIDS2017
This dataset was created by the Canadian Institute for Cybersecurity (CIC) in 2017.
This dataset was built using real-world traffic containing both normal and recent attack
samples. The results were analyzed based on the time stamp, source, and destination
IP, protocols, and attacks using CICFlowMeter. In addition, they implemented common
attacks such as Brute Force FTP, Brute Force SSH, Denial of Service (DoS), HeartBleed, Web
Attack, Infiltration, Botnet and Distributed Denial of Service (DDoS) [16].
A summary of the different datasets is given below in Table 3.
6. Literature Review
The solution proposed by Lirim et al. [17] used a Convolutional Neural Network
(CNN) with a multi-layer perceptron for its model. A multi-layer perceptron can be considered a fully connected network, where each neuron in one layer is connected to all neurons in the next layer. Like other neural networks, a CNN is composed of an input layer, hidden layers, and an output layer. Contrary to a traditional neural network, a CNN uses, in one of its hidden layers, a mathematical operation called a convolution instead of matrix multiplication. In a CNN, the input is a tensor made of different parameters
such as the number of inputs, the input height, width, and channels. The convolutional
layer convolves the input and forwards the result to the next layer. However, the tensor size
can grow tremendously after multiple convolutions. To tackle this issue, Lirim et al. [17]
use padding to reduce the tensor dimension. They trained their model by optimizing the
hyperparameters until a decrease in performance is met. Their final model uses ten classes
(nine for attacks and one for normal traffic) and is made of multiple dual convolutional
layers followed by a pooling layer (to limit the tensor size) and a dropout layer (to avoid overfitting). However, their model suffers from class imbalance between the largest and smallest classes, which requires bootstrapping to resolve. They tested their model on the pre-partitioned UNSW-
NB15 dataset and on a user-defined dataset which corresponds to 30% of the whole dataset.
They obtained, respectively an accuracy of 94.4% and 95.6% for both datasets.
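The convolve-then-pool pattern described above can be sketched in one dimension; this is an illustrative toy, not the authors' architecture:

```python
def conv1d(x, kernel):
    """Valid 1-D convolution: slide the kernel over x and sum elementwise products."""
    k = len(kernel)
    return [sum(x[i + j] * kernel[j] for j in range(k)) for i in range(len(x) - k + 1)]

def max_pool(x, size=2):
    """Downsample by keeping the maximum of each non-overlapping window."""
    return [max(x[i:i + size]) for i in range(0, len(x) - size + 1, size)]

# Hypothetical 8-value input feature vector and a 3-tap edge-like kernel
features = [0, 0, 1, 1, 1, 0, 0, 0]
fmap = conv1d(features, [1, 0, -1])   # convolution shrinks the length from 8 to 6
pooled = max_pool(fmap)               # pooling then halves it again
```

This shows how successive convolution and pooling layers shrink the representation, which is why the tensor size has to be managed as layers accumulate.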
Lin et al. [18] proposed another IDS based on CNN. Their solution is composed of
two parts. The first one is offline training using CNN, where in their model, they start
with an input layer of 9 × 9 and reduce it through successive convolutional layers and a
maximum pooling layer to reach an output layer of 1 × 1. The second part of their system is
the online detection phase, where they use Suricata, an open-source IDS, to catch the traffic.
Then, the packets are pre-processed, and the trained model is used on the network traffic
to produce the outcome of the detection. To test their model, they used the CICIDS2017
dataset. They tested it on the feature dataset and the raw traffic dataset. They obtained,
respectively an accuracy of 96.55% and 99.56%, showing that their model is better with raw
traffic than with an extracted feature set.
Rohit et al. [19] proposed an ensemble approach to detect intrusions. They performed three tests to show that their approach produces better results. They first performed normalization on the KDD Cup99 dataset; then, they used a correlation method to perform
feature selection. The feature selection used information gain as a decision factor, and
finally, they use an ensemble approach combining three algorithms: Naïve Bayes, PART,
and Adaptive Boost. The result of each algorithm is then compared, and the average of
the results or most voting results is used to decide the outcome. In addition, they use the
bagging method to reduce the variance error. They obtained an accuracy of 99.9732% on
the KDD Cup99 dataset using their solution.
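The "most voting results" combination rule can be sketched as a simple majority vote over the base learners' outputs (the predictions below are hypothetical):

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-classifier predictions by taking the most common label per sample."""
    return [Counter(sample).most_common(1)[0][0] for sample in zip(*predictions)]

# Hypothetical outputs of the three base learners on four samples
nb   = ["attack", "normal", "attack", "normal"]   # Naive Bayes
part = ["attack", "attack", "attack", "normal"]   # PART
ada  = ["normal", "normal", "attack", "normal"]   # Adaptive Boost
combined = majority_vote([nb, part, ada])
```

Each sample's final label is the one that at least two of the three learners agree on, which smooths out individual classifiers' mistakes.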
Al-Yaseen et al. [20], proposed a new model for intrusion detection systems, using a
hybrid multi-level model combining SVM (Support Vector Machine) and ELM (Extreme
Learning Machine). In their model there are five levels, the first level distinguishes the
traffic into DoS or Other. The second level distinguishes the previous unknown traffic
into Probe or Other. The third distinguishes the previous unknown traffic into User to
Root attack (U2R) or Other, and, the fourth level distinguishes the previous unknown
traffic into Remote to Local attacks (R2L) or Other. Finally, the previous unknown traffic
is distinguished between normal or unknown traffic in the fifth level. R2L and U2R are
placed at the bottom level because they are similar to normal connections. At each level,
a classifier is used. Their model is composed of 4 SVM classifiers at levels 1, 3, 4, and 5
and of 1 ELM classifier at level 2. They choose to use an ELM classifier to detect Probe
because ELM has shown better results than SVM. After pre-processing the training set from
the KDD dataset, they performed a modified K-means for feature extraction to have the 5
different categories that their solution can detect. Using their solution, they obtained an
accuracy of 95.75%, which is slightly better than if they only used multi-level SVM (95.57%).
In addition their hybrid model has a lower false alarm rate, 1.87%, compared to multi-level
SVM at 2.17%.
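The multi-level cascade can be sketched as follows; the rule-based detectors and their thresholds are invented stand-ins for the trained SVM/ELM classifiers:

```python
def cascade_classify(sample, levels):
    """Pass a sample through binary detectors level by level; the first level
    that claims the sample assigns its class, otherwise fall through to 'unknown'."""
    for label, detector in levels:
        if detector(sample):
            return label
    return "unknown"

# Stand-in rules over a dict of features; the real model uses trained classifiers
levels = [
    ("DoS",    lambda s: s["pps"] > 1000),           # level 1 (SVM)
    ("Probe",  lambda s: s["distinct_ports"] > 50),  # level 2 (ELM)
    ("U2R",    lambda s: s["root_shell"]),           # level 3 (SVM)
    ("R2L",    lambda s: s["failed_logins"] > 3),    # level 4 (SVM)
    ("normal", lambda s: s["pps"] < 100),            # level 5 (SVM)
]
verdict = cascade_classify(
    {"pps": 50, "distinct_ports": 2, "root_shell": False, "failed_logins": 0},
    levels,
)
```

Only traffic rejected by every earlier level reaches the bottom, which is why R2L and U2R, the classes most similar to normal traffic, are placed last.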
Kanimozhi et al. [21], proposed a solution using oppositional tunicate fuzzy C-mean
for detecting cloud intrusions. In their model, they first pre-processed the data and per-
formed a normalization to have two datasets, one for training and one for testing. They
performed a feature selection using logistic regression to keep the more relevant features,
and they used the OPTSA and FCM clustering model. The dataset is split into C clusters
using the fuzzy C-means algorithm. Once the data is clustered, they performed a cluster
expansion and integration to reduce redundant clusters. They tested their solution on
different datasets such as CICIDS2017 and obtained an accuracy of 80%.
Yiping et al. [22] created an intrusion detection system for wireless networks based
on the random forest algorithm. They first created a signal detection model to catch the
important features of signals, then, they created the model to detect malicious nonlinear
scrambling intrusion signals. An improved random forest algorithm was used to extract
the spectral features of the malicious signal, and then, optimal detection of malicious traffic
in a wireless network was performed using a reinforcement learning method and static
feature fusion. They obtained a mean accuracy of 96.93%.
Jabez et al. [23] created a system using an outlier detection approach to detect unknown
attacks. The outlier detection approach is based on identifying data points that are isolated
from clustered points. This approach uses the neighbourhood outlier factor to detect points
that are not close to each other. They trialled their solution on the KDDcup99 datasets. In
addition, the main advantage of their solution is its execution time which is significantly
better compared to other solutions such as back propagation neural network which requires
a lot of computing resources.
Kurniawan et al. [24] proposed an improved solution of Naïve Bayes for intrusion
detection systems. The Naïve Bayes algorithm is based on the Bayes equation:
P(H|U) = P(U|H) × P(H) / P(U)
where:
• U is the data with an unknown class
• H is the hypothesis class of U
• P() is the Probability
The Naïve Bayes algorithm has an issue when one of the probabilities is 0 [25]. This
results in its low accuracy when used. In their solution, Kurniawan et al. proposed two
modifications to the Naïve Bayes algorithm. The first one is removing each variable that
has a probability of 0. The second modification is to change the multiplication operation
by an addition operation when the probability is 0. In their solution, they first performed feature selection using correlation-based feature selection (CFS), reducing the number of features from 41 to 10. They tested their two modifications on the
NSL-KDD dataset, the second modification showed promising results with an accuracy
of 89.33%.
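The zero-probability problem and the two proposed modifications can be demonstrated on a toy list of conditional probabilities (a sketch of the idea, not the authors' code):

```python
from math import prod

def nb_score(probs):
    """Plain Naive Bayes: product of the conditional probabilities."""
    return prod(probs)

def nb_skip_zero(probs):
    """Modification 1: remove every variable whose probability is 0."""
    return prod(p for p in probs if p > 0)

def nb_add_on_zero(probs):
    """Modification 2: replace multiplication by addition when a factor is 0."""
    score = 1.0
    for p in probs:
        score = score + p if p == 0 else score * p  # adding 0 leaves the score intact
    return score

# A single zero-probability feature wipes out the plain product
probs = [0.9, 0.8, 0.0, 0.7]
```

With the plain product the score collapses to 0 as soon as one factor is 0; both modifications preserve the contribution of the remaining features, which is why the modified versions score more accurately.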
Gu et al. [26] proposed a new solution to improve IDS using SVM. Their solution used
Naïve Bayes algorithm to perform feature selection. Then, they trained the model with the
transformed data from the feature selection. They tested their solution on the UNSW-NB15
and the CICIDS2017 datasets. Compared to using only the SVM classifier, the use of Naïve
Bayes for feature extraction before using the SVM classifier shows better results. Indeed,
they obtained an accuracy of 93.75% on the UNSW-NB15 dataset and an accuracy of 98.92%
on the CICIDS2017 dataset. However, their solution only shows if there is an intrusion, it
cannot be used to detect what kind of attack is in operation.
Pan et al. [27] conceived a solution to detect intrusion in wireless networks. Their
solution was based in the cloud to have the maximum efficiency in terms of computational
power. They used sink nodes based in the fog to lessen the burden on the cloud computing
section. In order to have a solution as light as possible, they used a combination of
Polymorphic Mutation (PM) and Compact SCA (CSCA), as CSCA helps to reduce the
computing load by reducing the density of the data by using probability. They added
Polymorphic Mutation to reduce the loss of precision when using CSCA. They used PM-
CSCA to optimize the parameters of KNN algorithms to have the best configuration.
They tested their solution on the NSL-KDD and UNSW-NB15 datasets. They, respectively
obtained an accuracy of 99.327% and 98.27%.
Xiao et al. [28], proposed a solution based on CNN. They first performed feature
extraction using both Principal Component Analysis (PCA) and Auto-Encoder (AE). Auto-
Encoder is a dimension reduction method using several hidden layers of neural networks
to remove insignificant data. Then, they transformed the dimension of the data from one
into a two-dimensional matrix and forwarded it to the CNN model to train it. The model
is trained and improved using back propagation algorithms. They tested their model on
the KDDcup99 and obtained an overall accuracy of 94%. They compared their model with
DNN and RNN models and got slightly better results. However, their model has a low
detection rate of U2R and R2L which are not represented enough in the dataset.
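PCA, as used here for feature extraction, projects the data onto the directions of largest variance. A minimal NumPy sketch on random data (the sizes are arbitrary):

```python
import numpy as np

def pca(X, k):
    """Project X onto its k leading principal components (a minimal sketch)."""
    Xc = X - X.mean(axis=0)                    # centre each feature
    cov = np.cov(Xc, rowvar=False)             # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)           # eigh returns ascending eigenvalues
    top = vecs[:, np.argsort(vals)[::-1][:k]]  # k largest-variance directions
    return Xc @ top

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6))                  # 100 samples, 6 features
Z = pca(X, 2)                                  # reduced to 2 features
```

The reduced matrix can then be reshaped into the two-dimensional form that the CNN expects, as the authors describe.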
Zhang et al. [29], proposed a multi-layer model to detect attacks. Their solution
combined two machine learning techniques: CNN and GcForest. The GcForest is a random
forest technique which generates a cascade structure of decision trees. Their model is
composed of two main parts. In the first part, they run a CNN algorithm to detect different
kinds of attacks and normal traffic from the input data. Their CNN algorithm is an
improved model of GoogLeNet called GoogLeNetNP. The second part consists of using
a deep forest model to create more subclasses of the attacks. This second layer improves
the precision of their solution by classifying the abnormal classes into N-1 subclasses. The
second layer uses the cascade principle of gcForest but instead of the random forest, it uses
XGBoost. XGBoost is like a random forest, however, the construction of the trees is done
one after another until the objective function is optimized. They tested their solution on
a combination of the UNSW-NB15 and CICIDS2017 datasets. They obtained an overall
accuracy of 99.24% which is better compared to the algorithms used singularly.
Yu et al. [30], proposed an IDS model based on Few-Shot Learning (FSL). FSL is a deep
learning method that can learn from a small amount of data. In their solution they used
two embedding models, CNN and DNN, to perform feature extraction. Those models help
to reduce the dimension of the input data without losing important information. They
tested their model on the UNSW-NB15 and NSL-KDD datasets. Their solution obtained,
respectively an accuracy of 92.34% and 92%.
Gao et al. [31], proposed an ensemble machine learning IDS. They used the Principal
Component Analysis method for feature extraction. After different tests on the NSL-KDD
datasets, their ensemble algorithm is combining Decision Tree, Random Forest, KNN, DNN
and MultiTree. The results of the ensemble algorithm are made by a majority vote using
weights for each algorithm to have better accuracy. They obtained an accuracy of 85.2%
which is better than the accuracy if they were only using one algorithm. However, their
model lacks efficiency when analyzing attacks that are not in large quantity.
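The weighted majority vote can be sketched as follows; the predictions and weights below are hypothetical stand-ins for the five trained learners:

```python
def weighted_vote(predictions, weights):
    """Combine classifier outputs by summing each classifier's weight behind its label."""
    results = []
    for sample in zip(*predictions):
        tally = {}
        for label, w in zip(sample, weights):
            tally[label] = tally.get(label, 0.0) + w
        results.append(max(tally, key=tally.get))
    return results

# Hypothetical outputs of the five base learners on two samples
preds = [
    ["attack", "normal"],   # Decision Tree
    ["attack", "attack"],   # Random Forest
    ["normal", "attack"],   # KNN
    ["normal", "normal"],   # DNN
    ["attack", "normal"],   # MultiTree
]
weights = [0.2, 0.3, 0.15, 0.15, 0.2]   # assumed accuracy-based weights
combined = weighted_vote(preds, weights)
```

Weighting lets the more accurate learners dominate close votes, which is the point of using per-algorithm weights rather than a plain majority.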
Marir et al. [32], proposed a solution using a Deep Belief Network (DBN) and an
ensemble method composed of multiple SVMs. A DBN is a succession of unsupervised
networks such as Restricted Boltzmann Machines (RBM). An RBM is composed of an input
and a hidden layer where the nodes are connected to the previous and next layers but are
not connected within their layer. DBN uses an unsupervised pre-training based on the
greedy layer-wise structure. Then, they use a supervised fine-tuning approach to learn
the important features. In their solution, they use DBN for feature extraction. Then, the
extracted features are forwarded to the multi-layer ensemble SVM. The output is generated
by a voting algorithm. They tested their solution on KDDcup99, NSL-KDD, UNSW-NB15,
and CICIDS2017 datasets. They, respectively obtained a precision of 94.76%, 97.27%, 90.47%
and 90.40%. However, it was shown that when more layers are used their solution is more
time consuming.
Wei et al. [33], improved the performance of DBN for IDS by using an optimizing algo-
rithm. To optimize their model, they used a combination of Particle Swarm Optimization
(PSO), Artificial Fish Swam Algorithm (AFSA), and Genetic Algorithm (GA). The PSO is
first optimized using AFSA. Then, GA is used to find the global optimal solution of the
initial particle search. The optimal solution is then used in the DBN model to improve its
accuracy. They tested their solution on the NSL-KDD dataset and obtained an accuracy
of 82.36%.
Vinayakumar et al. [34], proposed an IDS based on Deep Neural Network (DNN).
Their DNN architecture is made of an input layer, five hidden layers and an output layer.
Their solution is scalable, and it is possible to use between one and five hidden layers in
the DNN models. They used the Apache Spark computing platform. Their solution can
work in both cases, HIDS and NIDS. For NIDS, they tested their solution on KDDcup99,
NSL-KDD, Kyoto, UNSW-NB15 and CICIDS2017 datasets. They, respectively obtained an
overall accuracy of 93%, 79.42%, 87.78%, 76.48% and 94.5% when combining the accuracy
for each number of DNN layers.
Shone et al. [35] proposed a solution combining Non-symmetric Deep Auto-Encoder
(NDAE) and Random Forest. Usually, an auto-encoder uses a symmetric encoder-decoder scheme; however, in their solution, only the encoding phase is used. This reduces the computational time without greatly impacting the accuracy of the IDS. To handle complex datasets, they chose to stack their NDAEs. However, they discovered that using
only NDAE was not enough to have an accurate classification. Therefore, they added
Random Forest as their classifier after performing feature extraction using two NDAE
with three hidden layers each. They tested their solution on the KDDcup99 and NSL-KDD
datasets and compared it to a DBN solution. They obtained, respectively a total accuracy of
97.85% and 85.42%. However, their solution struggles to detect small classes such as R2L
and U2R.
Yan et al. [36], showed the impact of feature extraction using a Stacked Sparse Auto-
Encoder (SSAE) to improve IDS. A sparse auto-encoder is an autoencoder which uses
a sparsity penalty, usually, the penalty is activated when hidden nodes are used. Thus,
using a sparse auto-encoder reduces the number of hidden nodes used. Stacked sparse
auto-encoder is the addition of further sparse auto-encoders. It allows for reducing the
dimension of the input data without losing significant information. To optimize their SSAE
they used the error back propagation method, and, to test their SSAE model they used the
NSL-KDD dataset. They used different classifiers with and without their SSAE model to
show how much the use of SSAE for feature extraction improves the accuracy. The best
accuracy was obtained when the SSAE and SVM classifiers were combined. They reached
an overall accuracy of 99.35%. One of the main advantages of using their solution is the
large time reduction for training and testing, approximately a tenth of the time of other
solutions is needed. However, the detection rate for R2L and U2R is lower compared to the
other classes.
Khan et al. [37], proposed a two-stage deep learning model (TSDL) to improve IDS.
In the first stage, they classify the traffic as normal or abnormal with a probability value.
In the second stage they used this value as an additional feature to train the classifier,
they used a DNN approach for both stages, where they used a Deep stacked auto-encoder
(DSAE) for feature extraction and Soft-max as a classifier. Soft-max is often used in a
neural network for multi-class classification problems. They tested their solution on the
KDDcup99 and UNSW-NB15 datasets. They, respectively obtained an overall accuracy of
99.996% and 89.134%.
Andresini et al. [38], proposed a solution combining an unsupervised approach with
two auto-encoders and a supervised stage to build the datasets. They trained the two auto-
encoders separately using normal and attack traffic. Then, the auto-encoders reconstruct
those samples and add them to the dataset that is used to train the model. The dataset
goes through a one-dimension CNN. This is done to see the impact of one channel on the
other to have a better distinction between the two classes: normal and attack. Finally, they
used a Soft-max classifier to identify if the data was an attack or normal. They tested their
model on KDDcup99, UNSW-NB15 and CICIDS2017 datasets. They, respectively obtained
an overall accuracy of 92.49%, 93.40% and 97.90%. One of the drawbacks of their solution
is that it does not provide details about the different types of attacks.
Ali et al. [39], proposed a model using Fast Learning Network (FLN) based on particle
swarm optimization (PSO). They used PSO to improve the accuracy of FLN which can be
inefficient due to the weights used in the neural network. They tested their solution on the
KDDcup99 dataset against other FLN solutions. They obtained a better accuracy to detect
the different classes than the other solutions. They achieved an overall accuracy of 89.23%.
However, their overall accuracy is decreased by their low accuracy when identifying one of
the small classes of attack (R2L).
Dong et al. [40], proposed a hybrid solution combining clustering with SVM. In their
solution, they first used K-means clustering to process the data and divided it into different
subsets. Then, they used SVM on each of those subsets. They tested their solution on the
NSL-KDD datasets and they obtained an overall accuracy of 99.45%. In addition, compared
to other methods their solution improved the detection rate. Their solution also requires
less time processing compared to SVM algorithms using different parameters. However,
the authors provided no information concerning the accuracy of each attack classification.
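The cluster-then-classify idea can be sketched on toy 1-D data; here a majority-class predictor stands in for the per-subset SVMs, and the data are invented:

```python
def nearest(x, cents):
    """Index of the centroid closest to x."""
    return min(range(len(cents)), key=lambda i: abs(x - cents[i]))

def cluster_then_classify(X, y, k=2, iters=10):
    """Dong et al.'s idea in miniature: partition 1-D data with k-means,
    then fit one model per subset (a majority-class stand-in for an SVM)."""
    cents = [min(X), max(X)] if k == 2 else sorted(X)[:k]  # simple initialisation
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for x in X:
            groups[nearest(x, cents)].append(x)
        cents = [sum(g) / len(g) if g else c for g, c in zip(groups, cents)]
    models = []
    for i in range(k):                         # one model per subset
        labels = [t for x, t in zip(X, y) if nearest(x, cents) == i]
        models.append(max(set(labels), key=labels.count) if labels else 0)
    return lambda x: models[nearest(x, cents)]

# Toy 1-D data: three normal samples (label 0) and three attack samples (label 1)
X = [1.0, 1.1, 0.9, 9.0, 9.5, 8.8]
y = [0, 0, 0, 1, 1, 1]
predict = cluster_then_classify(X, y)
```

Splitting the data first means each per-subset model only has to separate the classes within its own region, which is what reduces the training time relative to one SVM over all data.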
Wisanwanichthan et al. [41], proposed a Double-Layered Hybrid Approach (DLHA).
In their solution, they first create two groups in the NSL-KDD dataset. The first one contains
all classes and the second one contains only the U2R, R2L and normal classes. They created
these two groups to have better accuracy of the U2R and R2L classes which are often the
weakness of most of the IDS solutions that we have seen. Then, they performed feature
extraction in both groups. They first used Intersectional Correlated Feature Selection (ICFS).
In ICFS, the Pearson Correlation Coefficient (PCC) is used to select important features
between two random variables. PCC can determine how much two variables vary from
each other. Once ICFS is done, they performed Principal Component Analysis (PCA) to
reduce the dimension of the data. Finally, to have a 1:1 ratio between attack and normal data in the second group, they randomly chose an amount of normal data equal to the R2L and U2R samples combined. Then, they used those two groups to train their model, which is composed of
a first layer using Naïve Bayes classifier and a second layer using SVM. The first layer is
used only to detect DoS and Probe. If the outcome is not one of those two classes, then the
data goes through the second layer to detect if it is a R2L, U2R or Normal data. They tested
their solution on the NSL-KDD dataset. They obtained an overall accuracy of 93.11% and
detection rates of 96.67% for the R2L class and 100% for the U2R class. Their solution
outperformed other solutions when identifying the small classes; however, unlike other
efficient solutions, its accuracy on the large classes was not as good.
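The ICFS step can be illustrated with a small sketch: score each feature by its absolute Pearson correlation with the label in each of the two groups, then keep the intersection of the strongest features. The data and feature layout below are invented for illustration (the paper applies this to NSL-KDD features, and follows the selection with PCA).

```python
import math
import random

random.seed(1)

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0

# Toy records with 4 features; label 1 = attack. Features 0 and 2 carry signal,
# features 1 and 3 are pure noise (hypothetical stand-ins for NSL-KDD fields).
def make_group(n):
    rows, labels = [], []
    for _ in range(n):
        y = random.randint(0, 1)
        rows.append([y + random.gauss(0, 0.3),      # informative
                     random.gauss(0, 1),            # noise
                     2 * y + random.gauss(0, 0.5),  # informative
                     random.gauss(0, 1)])           # noise
        labels.append(y)
    return rows, labels

def top_k_features(rows, labels, k):
    cols = list(zip(*rows))
    scores = [(abs(pearson(col, labels)), i) for i, col in enumerate(cols)]
    return {i for _, i in sorted(scores, reverse=True)[:k]}

# ICFS idea: intersect the strongest features found in each of the two groups.
g1_rows, g1_labels = make_group(200)   # group 1: all classes
g2_rows, g2_labels = make_group(200)   # group 2: U2R/R2L vs. normal
selected = top_k_features(g1_rows, g1_labels, 3) & top_k_features(g2_rows, g2_labels, 3)
print("selected feature indices:", sorted(selected))
```

Only features that correlate with the label in *both* views survive the intersection, which is how DLHA keeps the feature set relevant to the rare U2R/R2L group as well as the full dataset.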
Appl. Sci. 2022, 12, 11752 14 of 27
Elhefnawy et al. [42] proposed a Hybrid Nested Genetic-Fuzzy Algorithm (HNGFA)
to detect attacks. They first performed feature selection using Naïve Bayes, splitting the
features into a major group and a minor group. Their model is composed of two genetic-fuzzy
algorithms: the Outer Genetic-Fuzzy Algorithm (OGFA) and the Inner Genetic-Fuzzy
Algorithm (IGFA). Each of these algorithms uses two nested genetic algorithms: the outer
one evolves the fuzzy sets and the inner one evolves the
fuzzy rules. The OGFA is used for classifying data with major features, whereas the IGFA
is used for classifying data with minor features. The two genetic-fuzzy algorithms interact
with each other, pairing the best results of the OGFA with weaker results of the IGFA, to
evolve new solutions and obtain the most accurate model possible. They tested their
solution on the KDDcup99 and UNSW-NB15
datasets and obtained overall accuracies of 98.19% and 80.54%, respectively. In addition,
their solution achieved good accuracy in detecting small classes such as R2L and U2R.
However, due to the complexity of their model, the training time is high.
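A genetic-fuzzy classifier can be illustrated in miniature: the toy genetic algorithm below evolves the centre and width of a single triangular fuzzy "attack" membership function over one synthetic feature. This is only a sketch of the underlying idea; HNGFA evolves full fuzzy rule bases with nested genetic algorithms over many features, and the data here are invented.

```python
import random

random.seed(2)

# Toy 1-D feature: attacks cluster near 0.8, normal traffic near 0.2.
data = [(random.gauss(0.2, 0.1), 0) for _ in range(100)] + \
       [(random.gauss(0.8, 0.1), 1) for _ in range(100)]

def membership(x, center, width):
    """Triangular fuzzy membership of x in the 'attack' set."""
    return max(0.0, 1.0 - abs(x - center) / width) if width > 0 else 0.0

def fitness(ind):
    center, width = ind
    correct = sum((membership(x, center, width) > 0.5) == bool(y) for x, y in data)
    return correct / len(data)

def evolve(pop_size=30, generations=40):
    pop = [(random.random(), random.uniform(0.05, 1.0)) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        survivors = pop[:pop_size // 2]                  # truncation selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            child = (random.choice((a[0], b[0])),        # uniform crossover
                     random.choice((a[1], b[1])))
            if random.random() < 0.3:                    # Gaussian mutation
                child = (min(1.0, max(0.0, child[0] + random.gauss(0, 0.05))),
                         min(1.0, max(0.05, child[1] + random.gauss(0, 0.05))))
            children.append(child)
        pop = survivors + children
    return max(pop, key=fitness)

best = evolve()
print(f"best rule: center={best[0]:.2f} width={best[1]:.2f} accuracy={fitness(best):.2f}")
```

The paper's nesting places one such evolutionary loop (for fuzzy rules) inside another (for fuzzy sets), which is also why the reported training time is high.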
A summary of this study is provided in Table 4. In addition, the strengths and
weaknesses of each solution are presented in Table 5.
Table 4. Summary of the reviewed solutions.

Authors | Year | Feature Extraction Method Used | Classifier Used | Attacks Detected
Jabez et al. [23] | 2015 | NA | Outlier detection | Network attacks
Al-Yaseen et al. [20] | 2017 | Modified K-means | Multi-level hybrid SVM and ELM | DoS, User to Root (U2R) and Remote to Local (R2L) attacks
Rohit et al. [19] | 2018 | Correlation method | Ensemble method (Naïve Bayes, PART and Adaptive Boost) | DoS, Probe, U2R, R2L
Marir et al. [32] | 2018 | Deep Belief Network | Multi-layer ensemble method SVM | DoS, U2R, R2L, Probe, Fuzzers, Analysis, Backdoors, Exploits, Generic, Reconnaissance, Shellcode, Worms, Brute Force FTP, Brute Force SSH, Heartbleed, Web Attack, Infiltration, Botnet and DDoS
Shone et al. [35] | 2018 | Non-symmetric Deep Auto-Encoder (NDAE) | Random Forest | DoS (back, land, Neptune), Probe (ipsweep, nmap, portsweep, satan), R2L (ftp_write, guess_password, imap, multihop, phf, spy, warezclient, warezmaster), U2R (loadmodule, buffer_overflow, rootkit, perl)
Yan et al. [36] | 2018 | Stacked Sparse Auto-Encoder (SSAE) | SVM | DoS, Probe, R2L, U2R
Ali et al. [39] | 2018 | NA | Fast Learning Network improved using PSO | DoS, U2R, R2L and Probing
Xiao et al. [28] | 2019 | PCA and Auto-Encoder | CNN | DoS, U2R (illegal access to local superuser privileges), R2L (illegal access from remote machines), Probe (surveillance and probing)
Zhang et al. [29] | 2019 | NA | CNN with improved gcForest | DoS, Exploits, Generic, Reconnaissance, Virus and Web attacks
Gao et al. [31] | 2019 | PCA | Ensemble method (DT, RF, KNN, DNN and MultiTree) | DoS (SYN flood), Probe (port scanning), R2L (password guessing), U2R (buffer overflow attacks)
Wei et al. [33] | 2019 | NA | DBN improved using optimizing algorithm (PSO-AFSA-GA) | Analysed 39 types of attacks that fall under the following categories: Probe (scan and probe), DoS, U2R (illegal access to local superuser) and R2L (unauthorized remote access)
Vinayakumar et al. [34] | 2019 | NA | DNN with scalable hidden layers | Normal, DoS, Probe, R2L, U2R
Khan et al. [37] | 2019 | Deep Stacked Auto-Encoder (DSAE) | Soft-max | Normal, DoS, Probe, R2L, U2R (22 different attack categories tested, i.e., analysis, backdoor, exploits, fuzzers, generic, reconnaissance, shellcode, worm)
Dong et al. [40] | 2019 | NA | K-means clustering with SVM | -
Lin et al. [18] | 2020 | NA | CNN | FTP Brute Force, SSH Brute Force, DoS (slowloris, slowhttptest, Hulk), Web attacks (web brute force, XSS, SQL injection), penetration attacks (infiltration Dropbox download)
Yu et al. [30] | 2020 | Embedded function using CNN and DNN | Few-Shot Learning | DoS (Teardrop, Smurf), Probe (Satan, Portsweep, Saint), U2R (Rootkit, Buffer_overflow, Loadmodule) and R2L (Xsnoop, Httptunnel); other attack types tested were normal, generic, fuzzers, reconnaissance, shellcode, worms, backdoor and exploits
Andresini et al. [38] | 2020 | Dual Auto-Encoder | Soft-max | -
Elhefnawy et al. [42] | 2020 | Naïve Bayes | Hybrid Nested Genetic-Fuzzy Algorithm | Probe, DoS, U2R and R2L; other attack types tested were normal, generic, fuzzers, reconnaissance, shellcode, worms, backdoor and exploits
Lirim et al. [17] | 2021 | NA | CNN with multi-layer perceptron | DoS, DDoS, PortScan, Web Attack, Heartbleed, Benign, Infiltration, Brute Force, SSH, FTP
Kanimozhi et al. [21] | 2021 | Oppositional tunicate fuzzy C-means | Logistic Regression | -
Kurniawan et al. [24] | 2021 | Correlation-based feature selection | Modified Naïve Bayes | Normal, DoS, Probe, R2L, U2R
Gu et al. [26] | 2021 | Naïve Bayes | SVM | -
Pan et al. [27] | 2021 | NA | KNN using PM-CSCA for optimization | DoS, Sniffing (Probe), U2R and R2L
Wisanwanichthan et al. [41] | 2021 | ICFS and PCA | Naïve Bayes and SVM | Employed a Double-Layered Hybrid Approach (DLHA) for detecting DoS, Probe, R2L and U2R
Yiping et al. [22] | 2022 | NA | Improved random forest algorithm | Wireless network attacks
Table 5. Strengths and weaknesses of each solution.
7. Evaluation Metrics
The datasets used in the different papers analyzed in Section 6, together with the main
outcome of each solution, are given in Table 6. The evaluation metrics used in the literature
review (Section 6) to assess each solution are given in Table 7.
Table 6. Outcome/Accuracy.
Table 7. Evaluation metrics.
8. Discussion
8.1. Observations
This study shows that machine learning significantly improves the efficiency of IDSs,
and that the quality of the dataset is an important factor in determining how efficient an
IDS will be. Using well-constructed datasets is necessary, and many of the research papers
reviewed here used labelled data to improve the training of the model. However, datasets
keep growing in size, and previous research shows that traditional machine learning
models are often not suitable when the dataset grows too large.
Thus, deep learning models such as CNNs are being adopted more and more by researchers
to develop new solutions. These new methods learn and extract useful features from raw
datasets, making NIDSs efficient against zero-day attacks. In addition, NIDSs need to be
trained frequently with up-to-date data from real networks. However, these new solutions
have a cost: they require more powerful computing resources and more processing time to
train an efficient model.
Table 5 shows the main advantages and weaknesses of the different reviewed
solutions. The first observation that can be made is that traditional machine learning
techniques such as clustering are less popular than deep learning techniques.
Indeed, it was shown that in most cases using deep learning techniques such as neural
networks significantly improved the accuracy of the IDS. In addition, to tackle the need
for more computing power, the researchers use GPUs and cloud-based platforms as they
help to implement more powerful deep learning methods. One of the main drawbacks
observed in this study is that most of the existing solutions used old datasets such
as KDDcup99 and NSL-KDD to test their models. Those datasets are dated and no longer
representative of real-world network traffic. In addition, it was observed that for some
solutions the accuracy decreased on recent datasets compared to the excellent accuracy
achieved on older ones. Another point of concern related to most of the solutions studied
is their inability to detect the specific attack classes that have fewer samples in the dataset.
This is mainly due to class imbalance, which results in a lower detection rate for those
classes. This is a serious issue as
those minor classes could be zero-day attacks. We also observed that there is a trade-off
between the complexity of the model and the number of layers used in deep learning
models. The more layers used in the algorithm the more complex the model will be, and
therefore, the model will require more computing resources and time. Efficient filtering
and selection of the important features for the model training helps to decrease the amount
of time and resources needed.
Figure 4 shows the number of times each classifier was used in the literature review
given in Section 6. It can clearly be seen that the researchers are using more and more
deep learning techniques such as CNN or DNN. We can also see that many researchers
are using SVM as their classifier because it has great performance when detecting minor
classes. Thus, they often combine it with other machine learning methods in a multi-
layer manner. Additionally, ensemble methods are frequently used as they allow the use of
different methods together to improve the efficiency of the IDS. The advent of these new
powerful techniques is possible, thanks to the improvement in GPU utilization and cloud
computing platforms. Finally, older traditional machine learning methods such as K-nearest
neighbours, K-means clustering, and genetic algorithms are used less than
CNN and DNN.
In Section 6, it is shown that more and more researchers are using feature extraction
in their solutions. Indeed, as shown in Figure 5, 60% of the reviewed articles in Section 6
used feature extraction. It was clearly shown that feature extraction greatly helps to
improve the accuracy of the system. In addition, it helps to improve the structure of the
dataset so that the amount of computing resources needed is decreased. In most cases, the
feature extraction methods were based on neural network techniques such as Auto-Encoder
techniques. However, many solutions also used traditional machine learning methods for
feature extraction, such as Naïve Bayes or correlation methods. Thus, it was shown that older
machine learning techniques and newer deep learning methods can be combined, with one
used for feature selection and the other for classifying traffic.
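The auto-encoder approach to feature extraction mentioned above can be sketched with a minimal linear example: a single-unit bottleneck trained by gradient descent to reconstruct two correlated features, so that one compressed value stands in for both. The data are synthetic and the network deliberately tiny; the reviewed solutions use deep, non-linear auto-encoders over full feature sets.

```python
import random

random.seed(3)

# Toy 2-D records lying near the line y = 2x: one latent factor explains both
# features, so a one-unit bottleneck can compress them with little loss.
data = []
for _ in range(200):
    t = random.uniform(-1, 1)
    data.append((t + random.gauss(0, 0.05), 2 * t + random.gauss(0, 0.05)))

# Linear autoencoder with a single hidden unit:
#   encode: h = a1*x1 + a2*x2      decode: (b1*h, b2*h)
a = [random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1)]
b = [random.uniform(-0.1, 0.1), random.uniform(-0.1, 0.1)]

def mse():
    total = 0.0
    for x in data:
        h = a[0] * x[0] + a[1] * x[1]
        total += (b[0] * h - x[0]) ** 2 + (b[1] * h - x[1]) ** 2
    return total / len(data)

initial_error = mse()
lr = 0.05
for _ in range(100):                             # plain SGD epochs
    for x in data:
        h = a[0] * x[0] + a[1] * x[1]
        e = [b[0] * h - x[0], b[1] * h - x[1]]   # reconstruction error
        grad_h = e[0] * b[0] + e[1] * b[1]
        b = [b[i] - lr * 2 * e[i] * h for i in range(2)]
        a = [a[i] - lr * 2 * grad_h * x[i] for i in range(2)]
final_error = mse()
print(f"reconstruction MSE: {initial_error:.4f} -> {final_error:.4f}")
```

After training, the scalar h can replace the original feature pair, which is the same resource-saving effect the reviewed solutions obtain on high-dimensional traffic features.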
The analysis of the metrics used to assess the various solutions studied is shown in
Figure 6. The most used metrics are the Accuracy and the Detection rate (Recall). Those
metrics are indeed the most important to assess the quality of a solution. Thus, they should
always be used when assessing the efficiency of an IDS. Nevertheless, we consider that
the F-measure should also be used more often when assessing an IDS, because it reflects
how well the solution detects samples that are indeed attacks, even those from minor classes.
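The interplay of these metrics can be made concrete from a confusion matrix. The counts below are illustrative only (not drawn from any reviewed paper), but they show how a high overall accuracy can coexist with a poor detection rate for a minor class such as U2R:

```python
# Per-class metrics from a multi-class confusion matrix.
# Rows = actual class, columns = predicted class (illustrative counts).
classes = ["Normal", "DoS", "Probe", "R2L", "U2R"]
cm = [
    [9500,  40,  30,  20, 10],   # actual Normal
    [  60, 3900, 30,   5,  5],   # actual DoS
    [  50,  40, 900,   5,  5],   # actual Probe
    [  80,  10,  10,  95,  5],   # actual R2L (minor class)
    [  15,   2,   2,   1, 10],   # actual U2R (minor class)
]

total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(len(cm))) / total

def recall(i):                       # detection rate for class i
    return cm[i][i] / sum(cm[i])

def precision(i):
    col = sum(row[i] for row in cm)
    return cm[i][i] / col if col else 0.0

def f_measure(i):
    p, r = precision(i), recall(i)
    return 2 * p * r / (p + r) if p + r else 0.0

print(f"overall accuracy: {accuracy:.3f}")
for i, name in enumerate(classes):
    print(f"{name:>6}: recall={recall(i):.3f} f-measure={f_measure(i):.3f}")
```

Here the overall accuracy exceeds 97% while the U2R detection rate is only about a third, which is exactly why per-class recall and the F-measure should accompany accuracy when assessing an IDS.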
Figure 5. Proportion of reviewed solutions using feature extraction (60% used, 40% not used).
Datasets are a key component of IDSs. The analysis of which public datasets are used
for testing is shown in Figure 7. The KDDcup99 and NSL-KDD datasets were used
56% of the time for testing. Those datasets are old but still very popular with researchers
since many studies have been made using them. However, the network structure and
architecture have changed compared to 20 years ago. Indeed, IoT and wireless device
usage has significantly increased, so the amount of data exchanged in the world is
incomparable with what existed 20 years ago; in addition, many novel and powerful
attacks have since been created. If a new IDS solution is trained only on an old dataset,
it is likely that the solution will not be efficient when utilised in a real-world environment.
Thus, this study shows the need to use recent datasets to obtain better-trained models
that will be efficient in real-world modern networks.
None of the reviewed solutions were tested using real-world data. Thus, it is not certain
if any of these solutions will perform well in a real-world environment. Therefore, one of
the main challenges going forward is to ensure that future solutions are tested in a
real-world environment to check their efficacy.
Another challenge concerns encrypted traffic. As discussed in Section 6, all the solutions
were tested using public datasets. However, those datasets do not properly represent
today's networks, where a significant amount of the data is encrypted. New solutions to
extract important features from encrypted traffic are required to detect abnormal instances.
9. Conclusions
This paper provides a thorough study of Intrusion Detection Systems and how they
could be improved using machine learning.
Firstly, the concept of Intrusion Detection Systems was presented. There are three
main types of IDS: Network Intrusion Detection System, Host Intrusion Detection System,
and a Hybrid Intrusion Detection System. In addition, each type of IDS can detect
attacks by using recorded signatures, by comparing the network's behavior against a
baseline of normal traffic, or both. Then, the different metrics used by various researchers
to assess Intrusion Detection Systems were presented. The most important metrics are
the Accuracy, the Detection Rate (Recall) and the F-Measure. A general overview of
machine learning and a global taxonomy were also discussed. There are three types of
machine learning techniques: supervised, semi-supervised and unsupervised, and most of
the machine learning techniques studied fall into one of these categories. A comprehensive
review of recently published papers using machine learning for IDS was also provided.
Based on this study, recent trends show that deep learning methods are increasingly used
to detect attacks. However, this increases the complexity of the models, which in turn
requires more computing resources. It was also shown that more and more solutions are
using feature extraction, with the Auto-Encoder being one of the techniques used.
This study also shows that 56% of the proposed IDSs were tested using KDDcup99
and NSL-KDD, which are both old datasets. These datasets by themselves are therefore
not enough to verify the effectiveness of a solution, and an up-to-date dataset must be
created. This new dataset should provide enough instances of the minor classes, which
currently have a very low detection rate, as well as recent attack instances captured in a
real environment. Finally, this work highlights that the main issues for IDSs are their
complexity and their low accuracy for minor classes. In addition, future research trends
are presented, including the idea of a NIDS framework that would be continuously
improved using cloud computing.
Future work proposed includes the development of a solution that can address
encrypted network traffic by extracting traffic features to facilitate abnormal instance
detection while decreasing the required computing resources.
Author Contributions: Conceptualization, P.V., T.N., L.L.D. and M.R.; methodology, P.V. and T.N.;
validation, P.V., T.N., L.L.D. and M.R.; investigation, P.V.; resources, T.N., E.O., D.O. and B.L.; data
curation, P.V., L.L.D. and M.R.; writing—original draft preparation, P.V. and M.R.; writing—review
and editing, T.N., L.L.D. and M.R.; supervision, M.R. and T.N.; funding acquisition, T.N., E.O., D.O.
and B.L. All authors have read and agreed to the published version of the manuscript.
Funding: This work was supported, in part, by Science Foundation Ireland grant number 16/RC/3918
to the CONFIRM Science Foundation Ireland Research Centre for Smart Manufacturing and co-funded
under the European Regional Development Fund. This work additionally received support from the
Higher Education Authority (HEA) under the Human Capital Initiative-Pillar 3 project, Cyberskills.
Institutional Review Board Statement: Not applicable.
References
1. Anderson, P. Computer Security Threat Monitoring and Surveillance. 1980. Available online: https://csrc.nist.gov/csrc/media/
publications/conference-paper/1998/10/08/proceedings-of-the-21st-nissc-1998/documents/early-cs-papers/ande80.pdf
(accessed on 19 May 2022).
2. ThreatStack. The History of Intrusion Detection Systems (IDS)—Part 1. Available online: https://www.threatstack.com/blog/
the-history-of-intrusion-detection-systems-ids-part-1 (accessed on 19 May 2022).
3. Checkpoint. What Is an Intrusion Detection System? Available online: https://www.checkpoint.com/cyber-hub/network-
security/what-is-an-intrusion-detection-system-ids/ (accessed on 19 May 2022).
4. Sabahi, F.; Movaghar, A. Intrusion Detection: A Survey. In Proceedings of the 2008 Third International Conference on Systems
and Networks Communications, Sliema, Malta, 26–31 October 2008; pp. 23–26. [CrossRef]
5. IBM Cloud Education. Machine Learning. Available online: https://www.ibm.com/cloud/learn/machine-learning (accessed on
19 May 2022).
6. IBM Cloud Education. Supervised Learning. Available online: https://www.ibm.com/cloud/learn/supervised-learning
(accessed on 19 May 2022).
7. Seldon. Machine Learning Regression Explained. Available online: https://www.seldon.io/machine-learning-regression-
explained (accessed on 19 May 2022).
8. Terence, S. All Machine Learning Models Explained in 6 Minutes. Available online: https://www.ibm.com/cloud/learn/
unsupervised-learning (accessed on 19 May 2022).
9. IBM Cloud Education. Unsupervised Learning. Available online: https://www.ibm.com/cloud/learn/unsupervised-learning
(accessed on 19 May 2022).
10. Ben n’cir, C.-E.; Cleuziou, G.; Nadia, E. Overview of Overlapping Partitional Clustering Methods. In Partitional Clustering
Algorithms; Springer: Cham, Switzerland, 2015; pp. 245–275.
11. Joos, K. The Apriori Algorithm. Available online: https://towardsdatascience.com/the-apriori-algorithm-5da3db9aea95 (ac-
cessed on 22 June 2022).
12. Matt, B. A One-Stop Shop for Principal Component Analysis. Available online: https://towardsdatascience.com/a-one-stop-
shop-for-principal-component-analysis-5582fb7e0a9c (accessed on 28 June 2022).
13. Ashiku, L.; Dagli, C. Network Intrusion Detection System using Deep Learning. Procedia Comput. Sci. 2021, 185, 239–247.
[CrossRef]
14. Tavallaee, M.; Bagheri, E.; Lu, W.; Ghorbani, A.A. A detailed analysis of the KDD CUP 99 data set. In Proceedings of the 2009
IEEE Symposium on Computational Intelligence for Security and Defense Applications, Ottawa, ON, Canada, 8–10 July 2009; pp.
1–6. [CrossRef]
15. Protic, D. Review of KDD Cup ‘99, NSL-KDD and Kyoto 2006+ Datasets. Vojnoteh. Glas. 2018, 66, 580–596. [CrossRef]
16. Moustafa, N. The UNSW-NB15 Dataset. Available online: https://research.unsw.edu.au/projects/unsw-nb15-dataset (accessed
on 28 February 2022).
17. Sharafaldin, I.; Lashkari, A.H.; Ghorbani, A. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization;
Canadian Institute for Cybersecurity (CIC): Fredericton, NB, Canada, 2018; pp. 108–116.
18. Chen, L.; Kuang, X.; Xu, A.; Suo, S.; Yang, Y. A Novel Network Intrusion Detection System Based on CNN. In Proceedings of the
2020 Eighth International Conference on Advanced Cloud and Big Data (CBD), Taiyuan, China, 5–6 December 2020; pp. 243–247.
[CrossRef]
19. Gautam, R.K.S.; Doegar, E.A. An Ensemble Approach for Intrusion Detection System Using Machine Learning Algorithms.
In Proceedings of the 2018 8th International Conference on Cloud Computing, Data Science & Engineering (Confluence), Noida,
India, 11–12 January 2018; pp. 14–15. [CrossRef]
20. Al-Yaseen, W.L.; Othman, Z.A.; Nazri, M.Z.A. Multi-level hybrid support vector machine and extreme learning machine based
on modified K-means for intrusion detection system. Expert Syst. Appl. 2017, 67, 296–303. [CrossRef]
21. Kanimozhi, P.; Victoire, T.A.A. Oppositional tunicate fuzzy C-means algorithm and logistic regression for intrusion detection on
cloud. Concurr. Comput. Pract. Exp. 2022, 34, e6624. [CrossRef]
22. Chen, Y.; Yuan, F. Dynamic detection of malicious intrusion in wireless network based on improved random forest algorithm.
In Proceedings of the 2022 IEEE Asia-Pacific Conference on Image Processing, Electronics and Computers (IPEC), Dalian, China,
14–16 April 2022; pp. 27–32. [CrossRef]
23. Jabez, J.; Muthukumar, B. Intrusion Detection System (IDS): Anomaly Detection Using Outlier Detection Approach. Procedia
Comput. Sci. 2015, 48, 338–346. [CrossRef]
24. Kurniawan, Y.; Razi, F.; Nofiyati, N.; Wijayanto, B.; Hidayat, M. Naive Bayes modification for intrusion detection system
classification with zero probability. Bull. Electr. Eng. Inform. 2021, 10, 2751–2758. [CrossRef]
25. Chauhan, N. Naïve Bayes Algorithm: Everything You Need to Know. Available online: https://www.kdnuggets.com/2020/06/
naive-bayes-algorithm-everything.html#:~:text=One%20of%20the%20disadvantages%20of,all%20the%20probabilities%20
are%20multiplied (accessed on 9 September 2022).
26. Gu, J.; Lu, S. An effective intrusion detection approach using SVM with naïve Bayes feature embedding. Comput. Secur. 2021,
103, 102158. [CrossRef]
27. Pan, J.-S.; Fan, F.; Chu, S.C.; Zhao, H.; Liu, G. A Lightweight Intelligent Intrusion Detection Model for Wireless Sensor Networks.
Secur. Commun. Networks 2021, 2021, 1–15. [CrossRef]
28. Xiao, Y.; Xing, C.; Zhang, T.; Zhao, Z. An Intrusion Detection Model Based on Feature Reduction and Convolutional Neural
Networks. IEEE Access 2019, 7, 42210–42219. [CrossRef]
29. Zhang, X.; Chen, J.; Zhou, Y.; Han, L.; Lin, J. A Multiple-Layer Representation Learning Model for Network-Based Attack
Detection. IEEE Access 2019, 7, 91992–92008. [CrossRef]
30. Yu, Y.; Bian, N. An Intrusion Detection Method Using Few-Shot Learning. IEEE Access 2020, 8, 49730–49740. [CrossRef]
31. Gao, X.; Shan, C.; Hu, C.; Niu, Z.; Liu, Z. An Adaptive Ensemble Machine Learning Model for Intrusion Detection. IEEE Access
2019, 7, 82512–82521. [CrossRef]
32. Marir, N.; Wang, H.; Feng, G.; Li, B.; Jia, M. Distributed Abnormal Behavior Detection Approach Based on Deep Belief Network
and Ensemble SVM Using Spark. IEEE Access 2018, 6, 59657–59671. [CrossRef]
33. Wei, P.; Li, Y.; Zhang, Z.; Hu, T.; Li, Z.; Liu, D. An Optimization Method for Intrusion Detection Classification Model Based on
Deep Belief Network. IEEE Access 2019, 7, 87593–87605. [CrossRef]
34. Vinayakumar, R.; Alazab, M.; Soman, K.P.; Poornachandran, P.; Al-Nemrat, A.; Venkatraman, S. Deep Learning Approach for
Intelligent Intrusion Detection System. IEEE Access 2019, 7, 41525–41550. [CrossRef]
35. Shone, N.; Ngoc, T.N.; Phai, V.D.; Shi, Q. A Deep Learning Approach to Network Intrusion Detection. IEEE Trans. Emerg. Top.
Comput. Intell. 2018, 2, 41–50. [CrossRef]
36. Yan, B.; Han, G. Effective Feature Extraction via Stacked Sparse Autoencoder to Improve Intrusion Detection System. IEEE Access
2018, 6, 41238–41248. [CrossRef]
37. Khan, F.A.; Gumaei, A.; Derhab, A.; Hussain, A. A Novel Two-Stage Deep Learning Model for Efficient Network Intrusion
Detection. IEEE Access 2019, 7, 30373–30385. [CrossRef]
38. Andresini, G.; Appice, A.; Mauro, N.D.; Loglisci, C.; Malerba, D. Multi-Channel Deep Feature Learning for Intrusion Detection.
IEEE Access 2020, 8, 53346–53359. [CrossRef]
39. Ali, M.H.; Mohammed, B.A.D.A.; Ismail, A.; Zolkipli, M.F. A New Intrusion Detection System Based on Fast Learning Network
and Particle Swarm Optimization. IEEE Access 2018, 6, 20255–20261. [CrossRef]
40. Liang, D.; Liu, Q.; Zhao, B.; Zhu, Z.; Liu, D. A Clustering-SVM Ensemble Method for Intrusion Detection System. In Proceedings
of the 2019 8th International Symposium on Next Generation Electronics (ISNE), Zhengzhou, China, 9–10 October 2019; pp. 1–3.
[CrossRef]
41. Wisanwanichthan, T.; Thammawichai, M. A Double-Layered Hybrid Approach for Network Intrusion Detection System Using
Combined Naive Bayes and SVM. IEEE Access 2021, 9, 138432–138450. [CrossRef]
42. Elhefnawy, R.; Abounaser, H.; Badr, A. A Hybrid Nested Genetic-Fuzzy Algorithm Framework for Intrusion Detection and
Attacks. IEEE Access 2020, 8, 98218–98233. [CrossRef]