Paper 5

Journal of Network and Computer Applications 187 (2021) 103111
Contents lists available at ScienceDirect
Journal of Network and Computer Applications

journal homepage: www.elsevier.com/locate/jnca
Towards secure intrusion detection systems using deep learning techniques: Comprehensive analysis and review

Sang-Woong Lee a, Haval Mohammed sidqi b, Mokhtar Mohammadi c, Shima Rashidi d, Amir Masoud Rahmani e,f, Mohammad Masdari g, Mehdi Hosseinzadeh a,*

a Pattern Recognition and Machine Learning Lab, Gachon University, 1342 Seongnamdaero, Sujeonggu, Seongnam, 13120, Republic of Korea
b Department of Database, College of Informatics, Sulaimani Polytechnic University, Sulaymaniyah, Iraq
c Department of Information Technology, Lebanese French University, Erbil, Kurdistan Region, Iraq
d Department of Computer Science, University of Human Development, Sulaymaniyah, Iraq
e Future Technology Research Center, National Yunlin University of Science and Technology, Yunlin, Taiwan
f Department of Computer Science, Khazar University, Baku, Azerbaijan
g Department of Computer Science, Urmia Branch, Islamic Azad University, Urmia, Iran
ARTICLE INFO

Keywords: Intrusion detection; Deep learning; Auto-encoder; CNN; DNN; GAN; LSTM

ABSTRACT

Providing a high-performance Intrusion Detection System (IDS) can be very effective in controlling malicious behaviors and cyber-attacks. Given the ever-growing negative impact of security attacks on computer systems and networks, various Artificial Intelligence (AI)-based techniques have been used to introduce versatile IDS approaches. Deep learning is a branch of AI techniques, mainly based on multi-layer artificial neural networks. Recently, deep learning techniques have gained momentum in the intrusion detection domain, and several IDS approaches using various deep neural networks have been proposed in the literature to deal with privacy concerns and security threats. For this purpose, this article focuses on the deep IDS approaches and investigates how deep learning networks are employed by different approaches in different steps of the intrusion detection process to achieve better results. It classifies the studied IDS schemes according to the deep learning networks utilized in them and describes their main contributions and capabilities. Besides, in each category, their main features such as evaluated metrics, datasets, simulators, and environments are compared. Also, a comparison of the main properties of the deep IDS approaches is provided to illuminate the main techniques applied in them as well as the areas less focused on in the literature. Finally, concluding remarks in the deep IDS context are provided and possible directions for subsequent studies are listed.
* Corresponding author.
E-mail addresses: slee@gachon.ac.kr (S.-W. Lee), Haval.sidqi@spu.edu.iq (H. Mohammed sidqi), Mukhtar@lfu.edu.krd (M. Mohammadi), Shima.rashid@uhd.edu.iq (S. Rashidi), rahmania@yuntech.edu.tw (A.M. Rahmani), m.masdari@iaurmia.ac.ir (M. Masdari), mehdi@gachon.ac.kr (M. Hosseinzadeh).
https://doi.org/10.1016/j.jnca.2021.103111
Received 11 December 2020; Received in revised form 1 May 2021; Accepted 9 May 2021
Available online 19 May 2021
1084-8045/© 2021 Elsevier Ltd. All rights reserved.

1. Introduction

Cyber-space threats are one of the significant issues that information technology-based organizations must deal with (Von Solms and Van Niekerk, 2013). Generally, security attacks attempt to gain unauthorized access to critical data in information systems and then modify, expose, or use it (AbdAllah et al., 2018). Also, some security attacks, denoted as Distributed Denial of Service (DDoS) attacks (Masdari and Jalali, 2016), may attempt to disrupt the normal functioning of computer systems and make them inaccessible to other users and systems (Khalaf et al., 2019; Mahjabin et al., 2017). Therefore, given the ever-increasing ferocity and diversity of cybersecurity attacks, providing efficient and effective techniques to harness them seems to be essential. Fig. 1 indicates the malware infection growth rate between 2009 and 2018. In these statistics, all malicious software pieces such as spyware, Trojans, worms, computer viruses, and ransomware are considered malware.

Fig. 1. Malware infection growth rate (Vulnerabilities, 2021).

IDS are one of the key components of the security infrastructure that can deter cyber threats coming from various types of attackers. Various kinds of IDS schemes have been provided in the security literature. In this context, according to the environment that an IDS scheme should protect, they can be classified as host IDS (Bridges et al., 2019) and network IDS (Sultana et al., 2019) approaches, in which the former intends to secure a computer system by monitoring all events and incoming/outgoing traffic, and the latter monitors and secures the whole computer network. Also, network-based IDS schemes are categorized into flow-based solutions (Sperotto et al., 2010; Umer et al., 2017) and deep packet inspection schemes (Ren et al., 2020), in which the flow-based approaches only check the packets' header data, whereas deep packet inspection schemes also examine the packet payload. Besides, based on their detection capability, IDS approaches are classified as signature detection and anomaly detection approaches (Aldweesh et al., 2020).

Typically, the signature-based IDS benefit from a predefined database of security attacks' signatures and try to match the events and traffic to the specific attack patterns (Masdari and Khezri, 2020a). However, the signature-based IDS schemes cannot detect new attacks whose pattern and signature are unknown. On the other hand, anomaly-based IDS approaches attempt to learn the normal behaviors and recognize everything else as an anomaly or intrusion (Masdari and Khezri, 2021). Nonetheless, they suffer from the false positive problem that restricts their application.

One factor that has a direct impact on the effectiveness of the IDS approaches is the availability of labeled datasets. Although obtaining such datasets is costly, it increases the accuracy of the IDS schemes. Thus, when a fully labeled dataset is available, supervised learning methods can be used, which provide more accurate results. On the other hand, when only a subset of the dataset is labeled, semi-supervised learning methods seem to be applicable (Ashfaq et al., 2017). Finally, when no labeled data is available, unsupervised learning methods should be used, at the cost of lower accuracy and more false positives.

Many IDS schemes are presented in the literature using various machine learning methods to automatically recognize normal and abnormal events happening in the systems and networks. However, despite many efforts conducted in the intrusion detection context, the existing IDS approaches still suffer from a high false-positive rate and a low detection rate. For mitigating these issues, deep learning, a subset of the machine learning methods, has been extensively incorporated for dealing with various kinds of intrusions and security problems in different environments and increasing cyberspace security. Deep learning can model complex data by performing various non-linear data transformations and recognizing patterns in the data with several neural network architectures. More specifically, deep learning is based on artificial neural networks (ANNs) that use multiple hidden layers for transforming data and have a stronger learning capability (Pouyanfar et al., 2018). Several deep learning networks such as Deep Belief Networks (DBNs), Deep Neural Networks (DNNs), Convolutional Neural Networks (CNNs), Generative adversarial networks (GANs), and Recurrent Neural Networks (RNNs) are provided, which have been widely employed in different application domains as well as security and intrusion detection contexts (Wang et al., 2019a; Alom et al., 2019; Bu and Cho, 2020). Deep learning networks can benefit from supervised, semi-supervised, or unsupervised learning methods. As an advantage, when there is a large labeled dataset for training the deep learning models, they provide results with high accuracy. Besides, deep learning supports incremental learning and can extract new features from the training data samples.

This article provides a comprehensive survey and classification of the IDS approaches incorporating one or more deep learning techniques. To be more effective, it first provides a brief overview of the deep learning networks to illuminate their structure and capabilities. It provides a taxonomy of the deep IDS approaches considering the type of deep network incorporated in them, which has been used in the feature selection/extraction step or in the classification step. Besides, in each section, the intrusion detection steps conducted by the investigated schemes are described as well as their possible shortcomings and limitations. Also, to provide more insight about the studied intrusion detection approaches, their main properties such as evaluation metrics, applied IDS datasets, and classifiers are compared. Furthermore, a comprehensive comparison of the techniques and components applied by all studied IDS approaches is presented at the end of this article to illuminate the primary methods which are beneficial to build deep IDS schemes.

• Illustrating different variants of deep learning networks, describing their architecture and capabilities.
• Providing a classification of the studied intrusion detection and anomaly detection approaches, utilizing one or more deep learning networks. Describing some of the remarkable IDS schemes that have effectively incorporated the deep learning networks to extract features or conduct classification on the applied or collected dataset.
• Comparing various features and issues regarding the deep IDS schemes, providing valuable insight into the datasets, classifiers, feature selection/extraction methods, and evaluation metrics.
• Discussing possible future trends in the deep intrusion context and illuminating the topics that can be further investigated and focused on to enhance the performance of deep intrusion detection.

The remainder of this deep learning survey is organized as follows: Section 2 presents the research steps performed for conducting this survey, and Section 3 briefly clarifies the deep learning networks. Section 4 provides a classification of the investigated IDS approaches regarding the deep learning method used in them to boost the intrusion detection process. Section 5 presents the discussion section and comparison results, while Section 6 gives the concluding issues and ongoing challenges in the deep learning-based intrusion detection context.

2. Research method

For conducting this study a systematic approach has been applied, and this section elaborates this process.

Fig. 2. Research steps.

Fig. 2 indicates the research method employed in this survey. As shown in this figure, several research questions are specified regarding the deep learning-based intrusion detection schemes; these questions indicate the topics that will be covered in this survey. Table 1 indicates the main research questions and demonstrates the need for each question. Then, according to some of the specified questions, several keywords are determined for searching the deep learning-based IDS schemes and the number of search terms is determined based on these
keywords to find the related articles. Mainly, these search terms aim to find the different types of deep learning networks that have been used to conduct intrusion detection, anomaly intrusion detection, or both. After the search terms are determined, some scientific libraries are selected to search for the necessary articles. Fig. 3 exhibits the scientific libraries used to conduct this study. Since this article aims to conduct a comprehensive survey on the deep learning-based approaches, many searches have been conducted to find the required surveys and intrusion detection approaches, and in this process, the scientific libraries indicated in Fig. 3 are applied.

Table 1
Research questions.
Index | Question | Reason
1 | Which kind of deep learning networks are applied by each IDS scheme? | The response to this question determines which deep learning method is appropriate for which step of the intrusion detection process.
2 | How are deep learning techniques applied to effectively conduct feature selection or feature extraction? | Selecting the minimum set of features has a direct impact on the performance, training delay, and testing delay of intrusion detection. This question identifies the deep learning networks applied in combination with the proposed schemes and their contributions towards the feature selection/extraction step.
3 | What kinds of classifiers are used in combination with the deep learning techniques in the intrusion detection schemes? | Determines the deep learning networks which have been used for classification purposes. Also, some schemes have integrated deep and shallow learning methods. More specifically, they have applied deep learning in feature extraction/selection or xx, while using shallow methods for classification.
4 | What kinds of datasets are used by the intrusion detection scheme? | This question illuminates the intrusion detection datasets used by each scheme to conduct the required evaluations.
5 | Which evaluation metrics have been employed in the studied intrusion detection scheme? | This determines the metrics that are primarily incorporated to verify the deep learning-based IDS schemes.
6 | What are the advantages and shortcomings of each deep learning-based IDS scheme? | The response to this question determines the main benefits achieved by the IDS schemes and highlights the shortcomings that they suffer from.

Fig. 3. Applied scientific libraries.

At first, for finding the survey articles about intrusion detection and anomaly intrusion detection, we have applied the following search terms:

• Intrusion Detection Survey
• Intrusion Detection Overview
• Intrusion Detection Comparative Study
• Intrusion Detection literature Review
• Intrusion Detection Comparison
• Anomaly Detection Survey
• Anomaly Detection Overview
• Anomaly Detection Comparative Study
• Anomaly Detection literature Review
• Anomaly Detection Comparison
• Anomaly Detection systematic study

Several articles, such as (Fernandes et al., 2019), were found by searching these terms, and appropriate references to them are made in this article's introduction section. Some of these review papers have focused on the intrusion detection approaches that use classifiers such as deep learning (Kwon et al., 2019a; Chalapathy and Chawla, 2019), random forest (Resende and Drummond, 2018), SVM (Hosseinzadeh et al., 2020), ensemble (Folino and Sabatino, 2016), etc. In contrast, others have addressed all data mining and machine learning techniques. Some of them studied the intrusion detection solutions designed for particular environments such as mobile ad hoc networks (Soni et al., 2015; Khan et al., 2020; Nadeem and Howarth, 2013), sensor networks (Butun et al., 2013), cloud computing (Mishra et al., 2017; Keegan et al., 2016; Patel et al., 2013), Software Defined Networking (SDN) (Jafarian et al., 2020; Hande and Muddana, 2020), Internet of Things (IoT) (Zarpelão et al., 2017; da Costa et al., 2019; Thamilarasu and Chawla, 2019; Chaabouni et al., 2019), etc. Some surveys have focused on network intrusion detection, and some surveys have addressed host-based intrusion detection. Also, for searching the published survey articles in the deep learning context, the following search terms are used:

• Intrusion Detection Deep Learning Survey
• Intrusion Detection Deep Learning Overview
• Intrusion Detection Deep Learning Comparative Study
• Intrusion Detection Deep Learning literature Review
• Intrusion Detection Deep Learning Comparison
• Anomaly Intrusion Detection Deep Learning Survey
• Anomaly Intrusion Detection Deep Learning Overview
• Anomaly Intrusion Detection Deep Learning Comparative Study
• Anomaly Intrusion Detection Deep Learning literature Review
• Anomaly Intrusion Detection Deep Learning Comparison
• Anomaly Intrusion Detection Deep Learning systematic study

By using these items, several surveys (Hodo et al., 2017) based on deep learning are found for anomaly detection (Kwon et al., 2019b). Some of them discuss the intrusion detection approaches designed for special environments such as Internet-of-Things (Idrissi et al., 2020; Asharf et al., 2020), VANET, DBN (Sohn, 2020), and deep generative networks (Yadav and Kalpana, 2021; Dutta et al., 2020); some of them only study network intrusion detection; and the rest try to cover all proposed schemes.

• Intrusion Detection Deep Learning
• Intrusion Detection LSTM
• Intrusion Detection Long Short-Term Memory
• Intrusion Detection CNN
• Intrusion Detection Convolutional neural network
• Intrusion Detection Restricted Boltzmann Machine
• Intrusion Detection Boltzmann Machine
• Intrusion Detection RBM
• Intrusion Detection Recurrent neural network
• Intrusion Detection RNN
• Intrusion detection auto-encoder
• Intrusion detection auto-encoder
• Intrusion detection AE
• Intrusion Detection GAN
• Intrusion Detection Generative Adversarial Network
• Intrusion Detection Deep belief network
• Intrusion Detection DBN
• Anomaly Detection Deep Learning
• Anomaly Detection LSTM
• Anomaly Detection Long Short-Term Memory
• Anomaly Detection CNN
• Anomaly Detection Convolutional neural network
• Anomaly Detection Restricted Boltzmann Machine
• Anomaly Detection Boltzmann Machine
• Anomaly Detection RBM
• Anomaly Detection Recurrent neural network
• Anomaly Detection RNN
• Anomaly Detection auto-encoder
• Anomaly Detection auto-encoder
• Anomaly Detection AE
• Anomaly Detection GAN
• Anomaly Detection Generative Adversarial Network
• Anomaly Detection Deep belief network
• Anomaly Detection DBN

By using these search strings many articles have been found, of which only a subset is useful. Therefore, by performing a quality assessment process, documents such as thesis files and low-quality journal articles and conference papers are not further processed. For performing the quality assessment these issues are considered:

• Is the article allocated for intrusion detection in computer systems and networks?
• Does the article have any prominent contribution in the intrusion detection context?
• Is a comprehensive set of experiments applied to verify the proposed intrusion detection solution?

Fig. 4 depicts the publication year of the deep learning-based IDS approaches investigated in this survey. As shown in this figure, the main focus of this article is on the novel IDS approaches published in the literature since 2017.

Fig. 4. Publication year of the investigated schemes.

3. Background knowledge

This section discusses the various deep learning networks and IDS datasets employed by the studied IDS schemes.

3.1. Deep learning

Generally, machine learning is a subset of artificial intelligence and deep learning is a subset of machine learning techniques. Deep learning networks are constructed from several connected layers, in which the first one is called the input layer, the last one is denoted as the output layer, and all layers between are called hidden layers. Each of the hidden layers consists of several neurons, in which the signal strength of a neuron depends on factors such as bias, weight, and the activation function. Table 2 presents the comparison of the deep learning and machine learning techniques (Sarker et al., 2020), highlighting their main differences (Xin et al., 2018; Ferrag et al., 2020; Sarker, 2021; Geetha and Thilagam, 2020). However, the important issue in both methods is the quality of the data, which determines the quality of the final result.

Table 2
Comparison of the deep learning and machine learning techniques.
Item | Deep Learning | Machine Learning
1 | Deep learning networks need a large dataset for training. | Machine learning methods can be trained with small datasets.
2 | Deep learning networks can learn features from the raw data. | External intervention is necessary to provide the right input.
3 | Deep learning networks do not require human intervention. | Machine learning algorithms need to be retrained through human intervention.
4 | Regarding the overheads of the deep learning methods, they are more appropriate for dealing with large-scale problems. | Because machine learning needs labeled data, it is not appropriate for handling large-scale problems that need a large set of labeled data.
5 | Solving large-scale problems using high-end computer systems or special hardware such as Graphics Processing Unit (GPU) can increase the performance of the deep learning algorithms. | Machine learning algorithms can be handled using CPU on low-end computer systems.
6 | Deep learning networks require more execution time. | Machine learning techniques need less time for execution.
7 | Problem solving using deep learning methods, based on multi-layered ANNs, is more intertwined and complex. | Problem solving using machine learning methods such as decision tree and linear regression benefits from a simple structure and is much easier to interpret.
8 | The output of the deep learning methods can be in various forms. | The output of the machine learning techniques is in numerical form.

Different types of neural networks with different architectures have been designed, each of which can be used in a special domain. Fig. 5 exhibits the taxonomy of the deep learning networks and this section briefly demonstrates these deep learning networks. As shown in this figure, deep learning techniques are categorized into generative, descriptive, and hybrid methods (Al-Garadi et al., 2020), in which deep Boltzmann machines (DBMs), deep auto-encoders, deep belief networks (DBNs), and Generative adversarial networks (GANs) are generative models while convolutional networks are considered descriptive networks. Also, RNNs are categorized as both generative and descriptive models.

3.1.1. DBMs
A DBM or deep Boltzmann machine is a kind of Markov random field. It is an undirected probabilistic graphical model that contains one visible layer and several hidden layers. A DBM learns the input's complex internal representations using few labeled data for fine-tuning the representation created with a set of unlabeled input. The DBMs support top-down or bottom-up training and inference procedures, allowing them to find the input's representations. Nonetheless, the DBMs' slow speed may limit their functionality and performance (Salakhutdinov and Hinton, 2009; Zhang et al., 2018a).

3.1.2. Auto-encoder
Auto-encoders are feed-forward ANNs in which the number of neurons in the input layer and the output layer is the same. An auto-encoder can also have several hidden layers, and it reconstructs its inputs aiming to minimize the difference between the output and the input. Furthermore,
each auto-encoder has two steps: an encoder for mapping the input data into the code, and a decoder for reconstructing the input data from the code. More specifically, auto-encoders support unsupervised learning of dataset encoding for dimensionality reduction, by training the network to ignore the signal noise (Tschannen et al., 2018).

Fig. 5. Deep learning networks taxonomy.

3.1.3. DBNs
DBNs are probabilistic generative models, consisting of several stacked Restricted Boltzmann Machine (RBM) modules. In a DBN, the output of each RBM is used as the input of the subsequent RBM. Besides, neurons of the DBN layers have connections to the next layer, but not to the neurons in the same layer. DBNs can solve the ANNs' training problems and prevent problems such as falling into a local minimum, slow training, and the need for a large training dataset. DBNs are applied in different domains such as speech recognition, image identification, natural language processing, and intrusion detection. These deep learning networks have good capabilities in classification and feature learning (Keyvanrad and Homayounpour, 2014).

3.1.4. RNNs
RNNs can be considered as an enhanced version of the feed-forward ANNs that can remember data handled at each time step for calculating the subsequent results. For this purpose, in an RNN, the output of the neurons in each layer is connected to the input of the other layer's neurons and also to itself. Thus, RNNs can apply their internal memory to deal with variable-length input sequences such as time series and can learn a data sequence to produce its new members. Furthermore, in RNNs, the input layer is unidirectionally connected to the hidden layers, while the neurons of the hidden layers are connected to themselves and all other neurons of the next layer for full information exchange. Regarding the temporal correlations of the security attacks and malicious behaviors, the RNNs can be effectively used to model them. For this purpose, the RNNs can be trained using current and historic inputs, in which the probability of an attack is based on the current and prior states of the features (Yu et al., 2019).

3.1.5. GAN
A GAN or Generative adversarial network consists of two ANNs that are trained against each other in a zero-sum game, where one's gain is the other's loss. After being trained, a GAN is able to learn the distribution of data and generate synthetic data instances that can be used as real data. GANs are extensively applied in various domains such as voice, video, and image generation as well as the intrusion detection context (Gui et al., 2020).

3.1.6. CNNs
The CNNs are deep learning networks mainly designed for handling image processing and analysis, but they can be applied in intrusion detection and other domains. In the intrusion detection domain, the CNNs are often incorporated to extract features from the raw data. Typically, a CNN is a multi-layered ANN that consists of input and output layers along with several hidden layers. A CNN receives its input as a 2D image and then assigns some importance to different parts of the image to recognize the output. A CNN benefits from several hidden layers such as a convolutional layer, fully-connected layer, non-linearity layer, and pooling layer, in which the first two are parametric and the other two are not (Albawi et al., 2017).

3.2. IDS datasets

Several important IDS datasets are employed by the investigated deep learning-based IDS approaches, which this section illustrates.

3.2.1. KDDCup'99
The KDDCup'99 dataset is a version of the DARPA dataset, which has seven weeks of network traffic traces containing four gigabytes of binary Tcpdump data in the form of 5 million records. The KDDCup'99 training dataset consists of 4,900,000 single connection vectors containing 41 features, labeled as an attack or normal. This dataset contains the following security attacks: DoS attacks, U2R or User to Root, R2L or Remote to Local Attack, and Probing Attack. In U2R attacks the attacker gains access to a user account and gains root access to that host. In the R2L attack, an attacker sends data packets to a remote host to gain access to it. Also, in Probing Attack, the attacker attempts to collect some information about a computer network for subsequent security attacks (Tavallaee et al., 2009).

3.2.2. NSL-KDD
The NSL-KDD dataset solves the problems of the KDDCup'99 dataset and contains fewer records in the training and testing sets. This eliminates the need for selecting a small portion of the dataset and makes it possible to evaluate the newly designed IDS approaches on all portions of the dataset. As an advantage, the NSL-KDD dataset does not have duplicate and redundant records in its train set, and this prevents any bias towards redundant data records in the classification step, leading to an improvement in detection rates. However, this dataset suffers from some problems (McHugh, 2000) and may not perfectly represent the real network traffic (Meena and Choudhary, 2017).

3.2.3. ISCXIDS2012
ISCXIDS2012 is a dataset created using the concept of the profile and contains intrusion descriptions and abstract distribution models for lower-level network entities, protocols, and applications. For simulating the user behavior, the profiles are applied. These profiles are utilized for generating a dataset in the required test-bed. For producing the anomalous section of this dataset, different scenarios for multi-stage attacks are used. Then, agents are used to run these profiles, imitating the user activity. This dataset contains seven days of malicious and normal network activities.

3.2.4. CSE–CIC–IDS2018
This dataset considers profiles for human operators or agents in generating network traffic events for various network protocols. It applies two types of profiles, denoted as B-Profiles and M-Profiles. The B-Profiles encapsulate the users' behavior such as packet size distributions, packets in each flow, payload patterns, payload size, and request time distribution of protocols such as FTP, HTTP, HTTPS, IMAP, POP3, SMTP, and SSH. On the other hand, an M-Profile describes a scenario for security attacks. The CSE–CIC–IDS2018 contains attack scenarios for security attacks such as Brute-force, Web attacks, Botnet, DDoS, and Heartbleed. It also has each machine's log and captured network traffic with 80 features. In the scenarios considered for this dataset, the attack infrastructure has 50 machines while the victim benefits from 30 servers and 420 machines (Leevy and Khoshgoftaar, 2020).

3.2.5. CICIDS2017
This dataset contains a number of records for security attacks that resemble real traffic data. It uses the B-Profile system for profiling human behaviors and produces natural benign traffic. In this dataset, the behaviors of 25 users are built for protocols such as email protocols,
HTTP, HTTPS, SSH, and FTP. Besides, it contains some records for security attacks such as Brute Force SSH, Brute Force FTP, Web Attack, Heartbleed, and DDoS (Sharafaldin et al., 2018).

3.2.6. UNSW-NB15
This dataset is used to evaluate NIDSs, and a software tool denoted as IXIA PerfectStorm is used to create its abnormal and normal network traffic traces. The IXIA tool can mimic nine types of security attacks, and it can use new attack data updated from a site that publishes security vulnerability information. Also, Tcpdump is utilized for 16 h to capture 100 GB of network traffic (Kumar et al., 2020).

3.2.7. CIDDS
CIDDS, or Coburg Intrusion Detection DataSets, are provided for evaluating network anomaly intrusion detection and contain labeled flows created in a virtual environment using OpenStack. The CIDDS-001 dataset considers a small environment, which includes several servers and clients. Also, in this dataset, benign user behavior such as web browsing is generated using Python scripts. To ensure realistic user behavior, each user works according to a specific schedule, and his/her characteristics are specified in a configuration file. This dataset includes malicious traffic such as Brute Force attacks, DDoS attacks, and Port Scans (Verma and Ranga, 2018).

4. The proposed deep learning-based IDS approaches

This section provides a comprehensive study on the deep learning-based IDS schemes proposed in the literature. Fig. 6 presents a taxonomy of the investigated IDS approaches, regarding the deep learning networks applied in them. As shown in this figure, some of the IDS schemes have used only one deep learning method in the intrusion detection process and some others have benefited from two deep learning networks. The rest of this section is organized according to this taxonomy and a subsection is allocated to investigate each category of the IDS approaches. Besides, in each subsection, the main contributions of the deep IDS schemes are demonstrated and their properties are compared.

Fig. 6. Taxonomy of the deep learning-based IDS schemes.

4.1. Auto-encoder-based schemes
This part of the paper discusses the Auto-encoder-based IDS schemes (Li et al., 2020a; Louati and Ktata, 2020; Mighan and Kahani, 2018; Ieracitano et al., 2020) introduced for different environments. For instance, in (Sadaf and Sultana, 2020), Sadaf and Sultana introduced an IDS approach denoted as Auto-IF, for real-time intrusion detection in fog computing environments using isolation forest and auto-encoder. This approach performs binary classification of the incoming packets, as fog devices are more concerned about differentiating attacks from normal packets. The authors validated their method using NSL-KDD and indicated that it achieves a high accuracy rate.
Tang et al. (2020a) introduced LightGBM-AE, a NIDS based on auto-encoder and LightGBM in which the latter performs data preprocessing, feature selection, and classification steps. The LightGBM-AE model adopts the LightGBM algorithm for feature selection and uses an auto-encoder for training and detection. When a dataset containing
intrusions is fed into an auto-encoder, there will be a large reconstruction error with respect to the input data, and an appropriate threshold is used to recognize the intrusion. The experiments are conducted on the NSL-KDD using Pytorch, and the results are compared with the denoising auto-encoder, variational auto-encoder, and plain auto-encoder, as well as Decision Tree, Random Forest, KNN, GBDT, and XGBoost. In the evaluation process, metrics such as accuracy, precision, recall, and F1-Score are used.

4.1.1. Stacked auto-encoder
Various types of auto-encoders are applied in the intrusion detection process by the investigated schemes (Mighan and Kahani, 2020); this subsection focuses on the stacked auto-encoder approaches. Khan et al. (2019) proposed TSDL, a fast, two-stage IDS scheme using a deep stacked auto-encoder in which each stage contains two hidden layers with a softmax classifier. They applied a semi-supervised learning method for training their proposed deep model and pre-trained the hidden layers on unlabeled network traffic features. This scheme consists of two steps, in which the first one classifies the network traffic into abnormal and normal states. The second step recognizes the types of attacks. The authors carried out ten-fold cross-validation on UNSW-NB15 and KDDCup’99 datasets and achieved 89.134% accuracy for multi-class classification in UNSW-NB15 and 99.996% for multi-class classification using the KDDCup’99.
In (Telikani and Gandomi, 2019), Telikani and Gandomi proposed a cost-sensitive stacked auto-encoder denoted as CSSAE, to deal with the class imbalance problem in IDS. This scheme supports binary and multiclass classification. In its first stage, it assigns a unique cost to each class according to the class distribution; this cost is used in deep feature learning, where the neural network parameters are updated in the cost function layer using the relevant cost. Then, in its next phase, CSSAE applies a two-layer auto-encoder to learn features that better distinguish the majority and minority classes. This scheme is evaluated using NSL-KDD and KDDCup’99 datasets and presents a better performance in handling low-frequency attacks.

4.1.2. Denoising auto-encoder
Several IDS schemes, such as (Abusitta et al., 2019), are proposed in the literature which incorporate a denoising auto-encoder for extracting features.
In (Zhang et al., 2018b), Zhang et al. proposed an effective deep learning-based network IDS scheme using DAE-based feature selection and an MLP-based classifier. This scheme adds weights to the loss functions of different samples, and this causes the selector to choose a few features representing the security attacks. The performance of this scheme is evaluated by experiments performed on the UNSW-NB15 dataset, where 12 out of 202 features are selected after feature selection. After classification using an MLP with 2 hidden layers, they achieved a high detection accuracy and F1-Score.
Choi et al. (2019) introduced a NIDS based on auto-encoders, an unsupervised learning algorithm that can learn models without labeled data. They suggested a heuristic method for setting a reconstruction loss threshold based on the percentage of abnormal data in the training data. As a result, given prior knowledge of the percentage of abnormality, this method can recognize the abnormality of data without actual labels. They constructed three training datasets reflecting real network conditions, in which most of the data are normal. They trained their models to extract core features from input data. Then, using the core features, they reconstructed the original data and classified data into normal and abnormal categories with the reconstruction error threshold. However, this scheme is only evaluated using the NSL-KDD dataset and is not analyzed with other datasets to justify the achieved results.

4.1.3. Nonsymmetric auto-encoder
The NIDS scheme proposed in (Shone et al., 2018) combines deep and shallow learning methods for analyzing various types of network traffic. They combined the random forest, a shallow learning method, with the stacked non-symmetric deep auto-encoder (NDAE), an auto-encoder containing multiple non-symmetrical hidden layers. In this scheme, the auto-encoder is applied for unsupervised feature learning and non-symmetric data dimension reduction. They evaluated their model using GPU-enabled TensorFlow on the KDDCup’99 and NSL-KDD datasets. In these experiments, metrics like precision, accuracy, recall, and training time are evaluated. The authors compared the NDAE model with the DBN and explained that it provides a 98.81% reduction in training time and a 5% improvement in accuracy. They demonstrated that their IDS model can improve detection accuracy and reduce training time. However, the authors did not evaluate their scheme using real-world backbone network traffic.

4.1.4. Sparse auto-encoder
This subsection discusses the IDS approaches, such as (Al-Qatf et al., 2018), which have utilized a sparse auto-encoder to deal with intrusions and security attacks. For instance, in (Preethi and Khare, 2020), Preethi and Khare introduced SAE-SVR, an IDS scheme using a support vector regression (SVR) classifier and a sparse auto-encoder in which the latter is used for unsupervised reconstruction of a new feature representation and dimension reduction. The sparse auto-encoder needs less training time and enhances the prediction accuracy of SVR. The experiments are conducted on the NSL-KDD using the Tensorflow tool and the Python language. Results validate that the SAE-SVR model reduces the training time of SVR and improves the prediction rate by bringing down the error rates. The authors evaluated their model using metrics like training time, mean squared error, R2 score, root mean squared error, mean absolute error, and accuracy.
Yan and Han (2018) presented an IDS approach that applies the stacked sparse auto-encoder (SSAE) to extract high-level sparse features from malicious behaviors. Initially, for learning the deep sparse features, the classification features are fed to the SSAE. Afterward, the achieved sparse features are supplied to different classifiers. By performing the required experiments, the authors indicated that high-dimensional sparse features learned by SSAE can be very effective for binary and multiclass classifications of intrusion data.

4.1.5. Variational auto-encoder
This subsection studies the schemes, such as (Chuang and Wu, 2019), which employed variational auto-encoders in the intrusion detection process. For example, in (Zavrak and İskefiyeli, 2020), Zavrak and İskefiyeli proposed unsupervised deep learning methods with semi-supervised learning for detecting intrusions and anomalous
network traffic from flow-based data. Auto-encoder and variational auto-encoder methods are employed to identify unknown attacks using flow features. This scheme extracts flow-based features from network traffic data. It also performs the reconstruction of features by recovering missing features for incomplete training datasets. The ROC and the area under the ROC curve are calculated and compared with one-class SVM. By performing the required experiments, the authors demonstrated that the variational auto-encoder performs, for the most part, better than the auto-encoder and one-class SVM.
Nguyen et al. (2019) introduced GEE, an anomaly detection approach for detecting network traffic anomalies. It applies a gradient-based fingerprinting method and a variational auto-encoder, which is an unsupervised deep-learning method for detecting anomalies. The authors evaluated the GEE using the UGR dataset and exhibited that this scheme can effectively detect various anomalies.

4.1.6. Convolutional auto-encoder
In (Binbusayyis and Vaiyapuri, 2021), Binbusayyis and Vaiyapuri introduced an unsupervised IDS approach that, rather than extracting features and training a classifier in two separate stages, is a single-stage approach that integrates a one-dimensional convolutional auto-encoder (1D CAE) and a one-class SVM. Using only the normal traffic samples, the approach optimizes the 1D CAE for compact feature representation and the one-class SVM for classification by defining a unified objective function combining reconstruction error with classification error. Thus, the generated compact feature representation has not only reconstruction ability but also discriminative ability for classification. The authors conducted their experiments on NSL-KDD and UNSW-NB15 datasets.
Ji et al. (2020) proposed an anomaly IDS approach using ACAE, an asymmetric convolutional auto-encoder applied for the learning of features. This IDS scheme integrates the random forest classifier and the asymmetric convolutional auto-encoder to benefit from the merits of both shallow and deep learning methods. This IDS scheme
Table 3
Properties of the auto-encoder based IDS schemes.
Scheme Evaluation Metrics Simulators/ Classifiers Feature Extractions Datasets Accuracy (%)
Environments
Sadaf and Sultana Accuracy, Precision, Keras, Tensorflow Isolation Forest stacked auto-encoder NSL-KDD 95.4
(2020) Recall, F1-Score
Tang et al. (2020a) Accuracy, Precision, Python variational auto-encoder, LightGBM NSL-KDD 89.82
Recall, F1-Score Pytorch denoising auto-encoder
Li et al. (2020a) AUC, Detection Time Python Random forest CSE–CIC–IDS
2018
Louati and Ktata Accuracy, Python auto-encoder KDDCup’99 99.73
(2020) Precision, TNR, FPR, FNR,
Detection Rate
Mighan and Kahani FPR, Precision, Accuracy, Weka SVM stacked Auto-encoder ISCX 90.2
(2018) Kappa
Ieracitano et al. Accuracy, Precision, NSL-KDD 80.87
(2020) Recall, F1-Score
Mighan and Kahani Accuracy, Precision, Stack Auto-encoder CICIDS 2017, 90.2
(2020) Recall, FPR, ROC, Network ISCX
F-measure, Kappa
Khan et al. (2019) Accuracy, FPR, MATLAB Random forests KDDCup’99, 89
F-Measure, ROC UNSW-NB15
Telikani and Gandomi Accuracy, Recall, NSL-KDD, KDDCup99 =
(2019) Precision, FPR, KDDCup’99 99.4
F-Measure NSL-KDD =
99.45
Abusitta et al. (2019) Accuracy, Tensorflow stacked denoising KDDCup’99 95
Test Classification Error autoencoders
Zhang et al. (2018b) Accuracy, F1-Score, Tensorflow MLP denoising auto- UNSW-NB15 98.80
Precision, Recall, FPR encoder
Choi et al. (2019) Accuracy, Precision, TensorFlow Denoising auto- NSL-KDD 91.70
Recall, Specificity, encoder
F1-Score
Shone et al. (2018) Precision, Accuracy, TensorFlow Random forests Nonsymmetric Deep KDDCup’99, KDDCup’99 =
Recall, Training Time Auto-encoder NSL-KDD 99.49
NSL-KDD = 95.64
Al-Qatf et al. (2018) SVM
Preethi and Khare Accuracy, Keras, SVR PCA NSL-KDD 97
(2020) Mean absolute error Tensorflow
Training Time
Yan and Han (2018) Accuracy, detection rate, TensorFlow SVM, KNN, random forests stacked sparse auto- NSL-KDD 99.36
FPR encoder
Chuang and Wu Keras, NSL-KDD
(2019) TensorFlow
Zavrak and İskefiyeli ROC, AUC deeplearning4j SVM Auto-encoder Kyoto 2006+,
(2020) LibSVM CTU-13,
UNSW-NB15,
CIDDS-001,
CICIDS2017
Nguyen et al. (2019) ROC TensorFlow UGR16
Binbusayyis and Accuracy, Keras, SVM Convolutional Auto- NSL-KDD, NSL-KDD =
Vaiyapuri (2021) Detection rate, FPR TensorFlow encoder UNSW-NB15 98.45
UNSW-NB15 =
91.58
Ji et al. (2020) Random-forest Convolutional Auto- NSL-KDD,
encoder KDDCup’99
is tested using NSL-KDD and KDDCup’99 datasets and improves the accuracy of detecting abnormal traffic.
Table 3 exhibits various properties of the auto-encoder-based IDS schemes investigated in previous subsections.

4.2. RBM-based IDS schemes
RBMs are used by many IDS schemes, such as (Mayuranathan et al., 2019), for handling intrusion in various environments such as cloud computing (Masdari and Khezri, 2020b; Masdari and Zangakani, 2019; Masdari and Khoshnevis, 2019). Also, in (Elsaeidy et al., 2019), Elsaeidy et al. proposed an IDS model for smart cities using RBMs for unsupervised learning of high-level features from raw traffic data. On top of these extracted features, different classifiers are trained. In the classification step, classifiers such as two types of feed-forward ANNs, SVM, and random forest are used. The authors evaluated the performance of their scheme using a water plant dataset and exhibited that their approach can have high accuracy in detecting various security attacks.
In (de Rosa et al., 2021), de Rosa et al. introduced an anomaly detection approach that handles raw features using an RBM as an auto encoder-decoder for mapping raw features into new feature spaces. Each RBM learns its corresponding class data distribution and reconstructs samples in a new space. Once the feature projection is performed, classifiers such as SVM with RBF kernels, random forests, and decision trees are incorporated. The experiments for this anomaly detection approach are conducted on the NSL-KDD dataset.
Dawoud et al. (2018) presented a secure anomaly detection approach for an SDN-based IoT environment. In their considered architecture, IoT devices are located at the bottom layer, while the SDN layers such as control and forward layers are located on top of the IoT devices. In this scheme, to allow the anomaly detection system to directly interact with the network, the proposed anomaly detection system is placed at the controller layer and applies RBM. This scheme applies a two-layer RBM network consisting of hidden and visible layers, in which the latter has 41 nodes, identical to the number of features of the KDDCup’99 dataset. Besides, the neurons of the hidden layer are activated according to the input neurons’ activation functions. The evaluations of this anomaly detection approach indicated that it achieves a 94% precision rate, which is higher than that of the PCA and SVM.
Jing and Bin (2016) introduced an RBM-based NIDS approach that applies the relevance depth learning method. The experiments and evaluation results conducted in MATLAB software demonstrated that for unknown intrusions, this RBM-based NIDS method can achieve higher detection accuracy and effectively reduces the average false detection rate.
In (Alom and Taha, 2017), the authors applied unsupervised deep learning techniques to provide an effective NIDS. More specifically, it benefits from auto-encoder and RBM for feature extraction and dimension reduction. Afterward, this scheme applies the k-means clustering algorithm for clustering on only three remaining features and benefits from the unsupervised extreme learning machine for intrusion detection. This scheme is examined using the KDDCup’99 dataset and achieved 91.86% detection accuracy for the AE and 92.12% detection accuracy for RBM and K-means. However, further evaluations using other deep learning networks are needed to verify the achieved results. Table 4 compares the properties of the RBM-based IDS schemes studied in this subsection.

4.3. DBN-based IDS schemes
The IDS schemes of this subsection, such as (Yang et al., 2019a), incorporate DBN for intrusion detection and handling. Zhang et al. (2019a) presented an IDS model using an improved GA and DBN for IoT networks. In this scheme, the GA is applied for finding the optimum number of hidden layers required for the DBN to achieve a high detection rate. Besides, in each layer, the number of neurons is optimized. The authors applied the NSL-KDD for evaluating the proposed IDS model and showed that their DBN-based IDS model can increase the detection rate while reducing the structural complexity of the neural network. However, this scheme suffers from a high training time and the authors have not dealt with this issue.
In (Wang et al., 2021), Wang et al. proposed DBN-EGWO-KELM, a NIDS model using an improved DBN, which replaces the BP algorithm in the DBN with the Kernel-based Extreme Learning Machine (KELM) that is capable of supervised learning. This scheme also presents EGWO, an enhanced grey wolf optimizer designed to optimize the KELM parameters to solve the problem of poor classification performance caused by randomly initializing kernel parameters. Besides, for improving the optimization ability of the GWO, an optimization method using inner and outer hunting is proposed. The authors conducted their experiments using datasets such as CICIDS2017, UNSW-NB15, NSL-KDD, and KDDCup’99 regarding metrics like precision, accuracy, FPR, and TPR. Besides, they compared their approach against RBF, BP, KELM, SVM, CNN, LIBSVM, and DBN-KELM.
In (Peng et al., 2019), Peng et al. proposed a network intrusion detection method based on a deep belief (deep confidence) neural network. The influence of the parameters of the DBN network model on the intrusion detection effect is analyzed experimentally. The deep feature learning model based on the deep belief network and other traditional common methods are analyzed. They indicated that the detection rate of this method is improved compared with the traditional machine learning methods.
Wang et al. (2019b) tried to handle unlabeled network data using deep learning for feature dimension reduction and proposed IDBN-SC, an intrusion detection method using Softmax classification based on the improved DBN. They showed that the IDBN-SC improves the detection accuracy and reduces the processing time for intrusion detection in comparison with IDBN-based original Softmax regression and IDBN-based SVM.
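To make the role of the KELM stage in DBN-EGWO-KELM more concrete, the following is a minimal sketch of a kernel-based extreme learning machine with an RBF kernel, written with NumPy. It is not the authors' implementation: the class name `KELM`, the parameter names `C` and `gamma`, and the toy data are assumptions made for illustration. In the scheme described above, the inputs would be DBN-extracted features, and `C` and `gamma` are the kind of parameters that an optimizer such as EGWO would tune rather than set by hand.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix K[i, j] = exp(-gamma * ||A[i] - B[j]||^2)."""
    sq_dist = (
        (A ** 2).sum(axis=1)[:, None]
        + (B ** 2).sum(axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))


class KELM:
    """Kernel extreme learning machine classifier (illustrative sketch)."""

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C          # regularization strength (assumed name)
        self.gamma = gamma  # RBF kernel width (assumed name)

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.X_train = X
        self.classes_ = np.unique(y)
        # One-hot target matrix T, one column per class.
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        K = rbf_kernel(X, X, self.gamma)
        # Closed-form output weights beta = (K + I / C)^-1 T; no backpropagation.
        self.beta = np.linalg.solve(K + np.eye(len(X)) / self.C, T)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return self.classes_[np.argmax(K @ self.beta, axis=1)]


# Toy, non-linearly separable data standing in for DBN-extracted features.
X_toy = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_toy = np.array([0, 1, 1, 0])  # 0 = normal, 1 = attack (hypothetical labels)
model = KELM(C=100.0, gamma=5.0).fit(X_toy, y_toy)
pred = model.predict(X_toy)
```

Because the output weights come from a single linear solve, training cost is dominated by the size of the kernel matrix, which is one reason the choice of kernel parameters matters more here than an iterative learning schedule.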
Table 4
Properties of the RBM-based IDS schemes.
Scheme Evaluation Metrics Simulators/ Classifiers Feature Extractions Datasets Accuracy
Environments (%)
Mayuranathan et al. FPR, FNR, Recall, Specificity, RBM Random Harmony KDDCup’99 99.77
(2019) Accuracy, Search-based
F1-Score, kappa value
Elsaeidy et al. (2019) F1-Score Matlab Feed-Forward ANNs, SVM, water plant
Random Forest dataset
de Rosa et al. (2021) Accuracy, precision, recall, SVM RBM NSL-KDD 79.05
F1-Score
Dawoud et al. (2018) Specificity, Precision, Tensorflow RBM KDDCup’99 97
Recall, FPR, Accuracy,
F1-Score
Jing and Bin (2016) Detection rate, FPR MATLAB
Alom and Taha (2017) Detection accuracy SNORT Extreme Learning Machine RBM NSL-KDD 92.12
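As a rough illustration of the two-layer RBM described above for Dawoud et al. (2018), the sketch below computes hidden-unit activation probabilities for a 41-feature input and performs one reconstruction pass. Only the 41 visible units come from the text; the hidden-layer size, the random untrained weights, the function names, and the use of reconstruction error as an anomaly score are assumptions made for illustration. A trained RBM would learn `W`, `b_visible`, and `b_hidden` (for example, with contrastive divergence), which is not shown here.

```python
import numpy as np

N_VISIBLE = 41   # one visible unit per KDDCup'99 feature (from the text)
N_HIDDEN = 10    # hypothetical hidden-layer size, not given in the source

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.01, size=(N_VISIBLE, N_HIDDEN))  # untrained weights
b_visible = np.zeros(N_VISIBLE)
b_hidden = np.zeros(N_HIDDEN)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def hidden_probabilities(v):
    """P(h_j = 1 | v) for every hidden unit, given visible vectors v."""
    return sigmoid(v @ W + b_hidden)


def reconstruct(v):
    """One up-down pass: v -> sampled binary h -> P(v' = 1 | h)."""
    p_h = hidden_probabilities(v)
    h = (rng.random(p_h.shape) < p_h).astype(float)  # sample hidden states
    return sigmoid(h @ W.T + b_visible)


# A batch of 5 records with features scaled to [0, 1] (random stand-ins).
batch = rng.random((5, N_VISIBLE))
p_hidden = hidden_probabilities(batch)
v_recon = reconstruct(batch)
# Per-record reconstruction error: one possible anomaly score (an assumption).
recon_error = np.mean((batch - v_recon) ** 2, axis=1)
```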
Table 5
Properties of the DBN-based IDS schemes.
Scheme Evaluation Metrics Simulators/ Classifiers Feature Datasets Accuracy (%)
Environments Extractions
Yang et al. Accuracy, Scikit-learn library SVM DBN NSL-KDD 98.43

(2019a) Precision, Recall,
F1-Score,
ROC
Zhang et al. Accuracy, MATLAB DBN KDDCup’99, KDDCup’99 = 99.45,
(2019a) Detection Rate, NSL-KDD NSL-KDD = 99.45
FPR, Precision,
Recall
Wang et al. Accuracy, Precision, Kernel-based Extreme DBN CICIDS2017, UNSW- CICIDS2017 = 97.15, UNSW-
(2021) Recall Learning Machine NB15, NB15 = 93.42,
NSL-KDD, NSL-KDD = 98.6, KDDCup’99
KDDCup’99 = 98.6
Peng et al. Accuracy, KDDCup’99 95.45%
(2019) FPR
Wang et al. Accuracy, FPR Eclipse + PyDev DBN NSL-KDD 96.46%
(2019b) plugin
Zhang et al. Accuracy, Probabilistic Neural DBN NSL-KDD 96.48%
(2018c) Detection Rate, Network
FPR
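The evaluation metrics listed in the tables of this section follow from the four confusion-matrix counts. The sketch below shows the usual definitions for the binary (attack vs. normal) case. The function name and the example counts are hypothetical and are not taken from any of the surveyed papers.

```python
def binary_ids_metrics(tp, fp, tn, fn):
    """Standard binary metrics from confusion counts (attack = positive class)."""
    total = tp + fp + tn + fn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0   # detection rate, or TPR
    fpr = fp / (fp + tn) if (fp + tn) else 0.0      # false-positive rate
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "fpr": fpr,
        "f1": f1,
    }


# Hypothetical counts: 90 attacks caught, 10 missed, 5 false alarms,
# and 895 normal records correctly passed.
metrics = binary_ids_metrics(tp=90, fp=5, tn=895, fn=10)
```

On imbalanced datasets, accuracy alone can look high even when minority attack classes are mostly missed, which is why the surveyed schemes typically report recall (detection rate) and FPR alongside it.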
Zhang et al. (2018c) provided an IDS model based on a deep learning method that handles unbalanced datasets and improves the detection rate for minority classes. The minority samples are increased by SMOTE and the majority samples are under-sampled by NCL, to get a balanced dataset and solve the problem of the low detection rate of minority categories. At the same time, the DBN model is improved. Combining the advantages of strong classification ability, accuracy, and simple training, the DBN-PNN model is proposed.
Table 5 indicates the properties of the DBN-based IDS approaches discussed in this subsection.

4.4. DNN-based IDS schemes
This section investigates the IDS schemes, such as (Thamilarasu and Chawla, 2019), that have applied DNN for handling security attacks and intrusions. For instance, Su et al. (2020) presented BAT-MC, an IDS approach used to recognize traffic anomalies using an attention mechanism and LSTM. It uses the attention mechanism for screening network flow vectors, which consist of BLSTM-generated packet vectors, to obtain the key features of network traffic. It uses several convolutional layers for capturing the local features of the traffic data and employs the softmax for classification of network traffic. This scheme learns the main features automatically. The authors conducted the required evaluations using Keras. They tested their IDS model with the NSL-KDD and compared their scheme against RNN and CNN. They exhibited that their scheme outperforms the others in terms of metrics such as accuracy, precision, and FPR.
Zhang et al. (2019b) introduced an in-vehicle anomaly detection system using a DNN model which attempts to automatically extract the features required for the intrusion detection process from the vehicle’s data packets. In this scheme, the GDM/AG and GDM algorithm is combined with the backpropagation method for iterative updating of the network, to improve its detection speed and accuracy. For evaluating this scheme, the authors collected 300,000 records of vehicle traffic data using the open-source BusMaster software. Each message contains a timestamp, message ID, description of relevant data, and description of relevant vehicle status information. The authors indicated that by using their DNN model a lower false-positive rate and a higher accuracy can be achieved.
Parvat et al. (2017) proposed a NIDS using an ensemble of multiple binary classifiers for multiclass classification. Each binary classifier is a deep learning model. This ensemble uses the One-vs-All decomposition method, which divides multiclass classification into N classification problems, in which each classifier detects one class. They define five classifiers for DoS, Probe, U2R, R2L, and normal detection. While training each classifier, the subset of training data belonging to one class is marked as positive, and all the rest are marked as negative. Each classifier in phase 1 is a DNN with three hidden layers. In the second phase of this scheme, the outputs of the classifiers are integrated to compute the ultimate result. The authors tested their system on the NSL-KDD dataset and achieved 99.99% accuracy for binary classification on training data and 99.89% accuracy for all five classes. Besides, 81.27% accuracy is observed on testing data for binary classification and 74.90% for five classes using the decision tree classifier. However, this scheme is not evaluated using other datasets or in real systems to further verify its claimed results.
Table 6 gives the comparison of the DNN-based IDS schemes investigated in this part of the paper.

4.5. CNN-based IDS schemes
This section studies the IDS approaches such as (Song et al., 2020; Bu
Table 6
Properties of the DNN-based IDS schemes.
Scheme Evaluation Metrics Simulators/Environments Classifiers Feature Extractions Datasets Accuracy (%)
Thamilarasu and Chawla (2019) Precision, Keras library, Cooja simulator DNN
Recall,
F1-Scores
Su et al. (2020) Accuracy, LSTM Multiple convolutional layers NSL-KDD 85
Precision,
FPR
Zhang et al. (2019b) Precision, FPR MATLAB Self-
collected
Parvat et al. (2017) Accuracy Keras Ensemble Classifier NSL-KDD 99.02
and Cho, 2017; Li et al., 2020b; Saraeian and Golchi, 2020; Riyaz and Ganapathy, 2020; Wu et al., 2018; Xiao et al., 2019) which have applied the CNN for handling intrusions and security attacks. For instance, in (Yang and Wang, 2019), Yang and Wang proposed a NIDS approach that preprocesses the wireless network traffic data and models the intrusion-related traffic using an improved version of the CNN denoted as ICNN. This scheme presents the low-level intrusion traffic data as features to the improved CNN, which extracts the features autonomously, and applies the stochastic gradient descent method to optimize the network parameters. By performing the experiments, the authors exhibited that their method can improve the accuracy, TPR, and FPR. The authors conducted the required training and testing on the KDDCup’99 dataset and exhibited that it can improve the detection accuracy and recall in comparison to the DBN and LeNet-5 while yielding a lower false-positive rate than those deep learning models.
In (Nguyen and Kim, 2020), Nguyen and Kim introduced a NIDS model which performs feature selection using GA and fuzzy C-means (FCM) clustering. This scheme applies the bagging (BG) classifier and a CNN model. It also benefits from the GA for selecting the structure of the CNN model. In this scheme, the CNN model performs the feature extraction step and the achieved features are fed into the BG classifier. The authors validated the results of their NIDS using a 5-fold cross-validation process.
Zhang et al. (2020a) presented SGM for dealing with the imbalanced classes in large datasets. This scheme combines Gaussian mixture model-based under-sampling for clustering with the synthetic minority over-sampling technique. Afterward, they designed SGM-CNN, a flow-based IDS approach that applies their proposed method for handling the imbalanced classes together with a CNN, and investigated the impact of the convolution kernels and of different learning rates. They further analyzed their scheme’s effectiveness with datasets such as CICIDS2017 and UNSW-NB15 for multiclass and binary classifications. They demonstrated that their method can achieve a high detection rate for imbalanced intrusion detection in comparison with other imbalance-handling methods and IDS schemes.
Nie et al. (2020) designed a data-driven IDS for Internet of vehicles environments which analyzes the irregular fluctuations of the traffic load on the roadside units (RSUs). This scheme applies a CNN model with 7 layers for extracting the link load features and detecting possible intrusions at RSUs. It also utilizes the spatial features of the link loads to design the loss function on the output layer. Furthermore, it incorporates a redundant error term, based on error feedback from the output map, for enhancing the training error convergence. This scheme constructs a CNN-based deep architecture as a Bayesian hierarchical model and proves its convergence in training error. The authors performed sensitivity and precision analyses on their scheme and evaluated it using metrics such as precision, accuracy, recall, F1-Score, and false-positive rate.
In (Andresini et al., 2021), Andresini et al. presented a CNN-based IDS approach for analyzing network traffic for malicious activities. This scheme presents the network flows as 2D images and applies them for training a 2D CNN model. Furthermore, this scheme applies clustering and nearest neighbor search methods for creating the imagery representation of network flows. The authors evaluated their approach on three datasets and exhibited that their scheme improves predictive accuracy in comparison to other IDS approaches.
In (Wang et al., 2020), Wang et al. introduced a NIDS using DMCNN, or deep multi-scale CNN, which uses various convolution kernels to extract features from high-dimensional unlabeled data. In this scheme, the batch normalization method is used for optimizing the learning rate of the network structure and obtaining features from raw data. They used the NSL-KDD dataset and indicated that their model achieves high accuracy and precision with a lower FPR.
Hu et al. (2020) introduced an IDS model using an improved CNN and an algorithm denoted as ADASYN, or adaptive synthetic sampling. They used the ADASYN for balancing the sample distribution, preventing the model from ignoring small samples and being overly sensitive to large samples. In this scheme, SPC-CNN, or split convolution module, is used in the CNN for increasing the diversity of features and eliminating the impact of inter-channel information redundancy on the training process. At last, for testing the AS-CNN, the NSL-KDD dataset is applied and the authors indicated that their scheme can achieve better results than RNN and CNN models in terms of accuracy, detection rate, and FPR. Table 7 indicates some of
Table 7
Properties of the CNN-based IDS schemes.
Scheme Evaluation Metrics Simulators/ Classifiers Feature Extractions Datasets Accuracy (%)
Environments
Song et al. (2020) FNR, CNN Self-Collected

Error Rate,
Precision, Recall,
F1-Score
Bu and Cho (2017) Accuracy TensorFlow CNN TPC-E
Li et al. (2020b) Accuracy, Precision, Recall, F1- TensorFlow, CNN Real dataset 99.2
Score Keras
Saraeian and Golchi Wireshark, Weka MATLAB CNN ISCX, ISCX = 97,
(2020) NSL-KDD NSL-KDD = 99.9
Riyaz and Ganapathy Accuracy TensorFlow CNN Linear correlation NSL-KDD 99.88
(2020) coefficient
Wu et al. (2018) Accuracy, Detection Rate, FPR TensorFlow Ensemble CNN NSL-KDD 79
Xiao et al. (2019) KDDCup’99
Yang and Wang (2019) Accuracy, TensorFlow KDDCup’99 95.36
Precision, FPR
Nguyen and Kim (2020) Accuracy, MATLAB Ensemble NSL-KDD 98.24
Precision,
FPR
Zhang et al. (2020a) Accuracy, TensorFlow, CNN CICIDS2017, UNSW- CICIDS2017 =
Detection Rate, Keras NB15 99.85,
FPR, UNSW-NB15 =
F1-Score 98.82
Nie et al. (2020) Precision, Accuracy, Recall, F1- CNN Self-Collected
Score, FPR
Wang et al. (2020) Accuracy, TensorFlow CNN NSL-KDD 94.65
Precision, FPR
Hu et al. (2020) Accuracy, CNN NSL-KDD 83.83
Detection Rate,
FPR
the properties of the CNN-based IDS solutions studied in this subsection.
4.6. GAN-based IDS schemes
This section investigates the GAN-based IDS schemes, such as (Liu et al., 2019), provided in the literature. For example, Lu et al. (2019) presented MalDeepNet, a deep neural network for the classification of malware behaviors. They used deep learning in the family clustering algorithm and designed Mal-GAN, a malware prediction model that recognizes the malware.
Shu et al. (2020) presented an IDS for VANETs by placing a distributed SDN controller on each base station to distinguish normal network flows from attack network flows. They applied a GAN and jointly trained multiple SDN controllers for the whole VANET using the entire network flow information. This IDS scheme enables the distributed SDN controllers to independently detect their sub-network flows, and it can reduce the computation and communication overheads. The authors evaluated their scheme with metrics such as precision, accuracy, F1-Score, recall, and AUC.
In (Huang and Lei, 2020), Huang and Lei tackled imbalanced classes by proposing IGAN, or imbalanced GAN. To produce new data samples for a minority class in the dataset, they added convolutional layers and an imbalanced filter to the GAN. Afterward, using the IGAN and its generated data samples, an IDS approach denoted as IGAN-IDS is provided, which consists of feature extraction, IGAN, and DNN. They utilized a feed-forward ANN for computing feature vectors from raw network properties. The authors evaluated the IGAN-IDS using datasets such as CICIDS2017, UNSW-NB15, and NSL-KDD. They compared the achieved results against some deep and shallow learning methods using metrics such as F1-Score, AUC, and accuracy.
Shahriar et al. (2020) provided G-IDS, an IDS approach that employs a GAN to produce synthetic data samples for handling problems such as missing data and imbalanced classes in the training of the IDS. They evaluated their approach on the NSL-KDD dataset using precision, recall, and F1-Score, and showed that it outperforms a standalone IDS in terms of these metrics. However, it is not evaluated on other datasets or in real environments.
In (Ring et al., 2019), the authors attempted to generate data samples of various network traffic flows using GANs. However, GANs can only handle continuous features, while flow-based data may contain categorical features like port numbers or IP addresses. To transform flow-based data into continuous values, the authors proposed three preprocessing methods. They also presented a method to evaluate the produced network traffic flows, which incorporates domain knowledge to define the quality tests. They used the CIDDS-001 dataset for creating network flows. However, this scheme does not create sequences of flows and only presents a single flow.
Table 8 provides the comparison of the GAN-based IDS approaches described in this subsection.
4.7. LSTM-based IDS schemes
In (Boukhalfa et al., 2088-8708), Boukhalfa et al. proposed a NIDS based on the LSTM deep learning method, which recognizes attacks and keeps a long-term memory of them in order to block new attacks. They employed the NSL-KDD dataset for training and testing, and applied evaluation metrics such as accuracy, sensitivity, false-positive rate, precision, and recall to compare their scheme against other classifiers. However, this scheme is not analyzed using other datasets or in real environments.
In (Haggag et al., 2020), Haggag et al. proposed DLS-IDS, a deep learning-based IDS scheme that has four main building blocks: selection and exploration, preprocessing, a class imbalance solution, and training over Apache Spark. They indicated that the Spark cluster can be used to train the model using various numbers of hidden layers and different element types. To handle a dataset containing a class imbalance, the scheme uses the synthetic minority over-sampling technique as a preprocessing phase to increase the accuracy and mitigate overfitting. Experiments on the NSL-KDD dataset indicated that their IDS approach reaches 83.57% detection accuracy. However, this scheme should be improved to handle more types of attacks.
In (Liu et al., 2020), Liu et al. proposed a NIDS approach that uses the difficult set sampling technique for handling imbalanced attack data. To divide the imbalanced training data of network traffic into easy and difficult sets, they incorporated the edited nearest neighbor algorithm. Afterward, to compress the majority samples in the difficult set, they used KMeans. At last, to provide a new training dataset, they combined the minority samples in the difficult set, the compressed majority samples in the difficult set, and the easy set. In this scheme, classifiers such as SVM, random forest, XGBoost, LSTM, Mini-VGGNet, and AlexNet are used. Besides, the experiments are performed using datasets such as CSE–CIC–IDS2018 and NSL-KDD and indicate the effectiveness of their approach. Table 9 exhibits some of the properties of the LSTM-based IDS schemes.
4.8. RNN-based IDS schemes
Several RNN-based intrusion detection approaches such as (Xu et al., 2018; Almiani et al., 2020; Tang et al., 2019) are provided in the literature, which this section discusses. For example, in (Kaur and Singh, 2019) Kaur and Singh proposed D-Sign, a hybrid deep learning-based IDS scheme for handling both anomalies and intrusions. This scheme can detect and generate signatures of web-based security attacks. More specifically, the D-Sign system uses an RNN containing multiple layers of LSTM for recognizing security attacks in network traffic. The reason for this combination is that RNN models often suffer from problems such as vanishing gradients and long-term dependencies, which the LSTM can overcome. Besides, a two-layer LSTM network is used to recognize new security attacks, whose results are fed into a softmax function. The authors trained the D-Sign on datasets such as NSL-KDD and CICIDS
Table 8
Properties of the GAN-based IDS schemes.
Scheme | Evaluation Metrics | Simulators/Environments | Classifiers | Feature Extractions | Datasets | Accuracy (%)
Liu et al. (2019) | Accuracy | - | SVM | - | - | 96.6
Lu et al. (2019) | - | - | - | - | DataCon | -
Shu et al. (2020) | Precision, Recall, F1-Score, AUC | Tensorflow | - | - | KDDCup’99, NSL-KDD | -
Huang and Lei (2020) | Accuracy, F1-Score, AUC | - | - | Feed-forward Neural Network | CICIDS2017, UNSW-NB15, NSL-KDD | CICIDS2017 = 99.79, UNSW-NB15 = 82.53, NSL-KDD = 84.45
Shahriar et al. (2020) | Precision, Recall, F1-Score | - | - | PCA | NSL-KDD | -
Ring et al. (2019) | - | - | - | - | CIDDS-001 | -
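As an illustration of the preprocessing problem noted for Ring et al. (2019) among the GAN-based schemes in Table 8, the following minimal Python sketch maps categorical flow attributes to continuous values. The specific encodings (octets scaled to [0, 1] and a 16-bit binary port vector) and all function names are assumptions chosen for illustration; the three preprocessing methods of the original authors are not specified here, and this sketch should not be read as their implementation.

```python
from typing import List


def encode_ip(ip: str) -> List[float]:
    """Scale each IPv4 octet to [0, 1] (illustrative encoding, an assumption)."""
    octets = [int(part) for part in ip.split(".")]
    if len(octets) != 4 or any(o < 0 or o > 255 for o in octets):
        raise ValueError(f"not a valid IPv4 address: {ip!r}")
    return [o / 255.0 for o in octets]


def encode_port(port: int) -> List[float]:
    """Represent a port as 16 binary features, most significant bit first."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return [float((port >> bit) & 1) for bit in range(15, -1, -1)]


def encode_flow(src_ip: str, src_port: int, dst_ip: str, dst_port: int) -> List[float]:
    """Concatenate the encoded endpoints into one continuous feature vector."""
    return (encode_ip(src_ip) + encode_port(src_port)
            + encode_ip(dst_ip) + encode_port(dst_port))
```

A vector produced this way has 40 continuous components per flow, which is the kind of input a GAN generator and discriminator can operate on; decoding generated vectors back to valid addresses and ports would need an additional rounding step that is not shown.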
Table 9
Properties of the LSTM-based IDS schemes.
Scheme | Evaluation Metrics | Simulators/Environments | Classifiers | Feature Extractions | Datasets | Accuracy (%)
(Boukhalfa et al., 2088-8708) | Accuracy, False Positive Rate, Precision, Recall | - | LSTM | - | NSL-KDD | 99.93
Haggag et al. (2020) | Accuracy, Precision, Recall, F1-Score, FPR | Spark Cluster | LSTM, RNN, MLP | - | NSL-KDD | 87.54
Liu et al. (2020) | Accuracy, Precision, Recall, F1-Score | Sklearn + Tensorflow | SVM, random forest, XGBoost, LSTM, Mini-VGGNet, AlexNet | - | CSE–CIC–IDS2018, NSL-KDD | CSE–CIC–IDS2018 = 96.99, NSL-KDD = 82.84
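The synthetic minority over-sampling technique (SMOTE) that Haggag et al. (2020) use for class imbalance can be sketched as follows. This is a minimal pure-Python illustration of the general SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours), not the DLS-IDS implementation; the function name, parameters, and the fixed random seed are assumptions.

```python
import math
import random
from typing import List, Sequence


def smote(minority: Sequence[Sequence[float]], n_new: int,
          k: int = 3, seed: int = 0) -> List[List[float]]:
    """Generate n_new synthetic minority samples by neighbour interpolation."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest to the base first.
        others = sorted((j for j in range(len(minority)) if j != i),
                        key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position on the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic
```

Because every synthetic point lies on a line segment between two real minority samples, the new samples stay inside the region already occupied by the minority class, which is why the technique is usually applied to the training split only.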
2017 for binary and multi-class classification. They applied metrics such as sensitivity, accuracy, specificity, false negatives, and false positives and showed that their approach outperforms other classifiers.
In (Tang et al., 2018), the authors proposed an anomaly intrusion detection scheme for SDN environments that applies GRU-RNN, an RNN model that uses GRU. The GRU-RNN represents the relationship between previous and current events and can increase the detection rate of anomalies. The authors tested their anomaly detection approach using the NSL-KDD dataset and indicated that with six raw features, it can reach 89% accuracy without affecting the performance of the network.
Yin et al. (2017) proposed RNN-IDS, an RNN-based IDS model, and studied its performance in multiclass and binary classification problems. They also investigated the impact of different learning rates and the number of neurons on the performance of their model. This scheme is evaluated on the NSL-KDD and KDDCup’99 datasets and the results are compared with shallow classifiers such as J48, ANN, random forest, SVM, etc. The results showed that the proposed RNN-IDS provides good results with high accuracy in binary and multiclass intrusion detection problems.
In (Le et al., 2019) Le et al. proposed an IDS approach to deal with high false positives in anomaly detection systems and with attacks that have imbalanced training sets. This scheme presents a feature selection model, denoted as SFSDT, which generates the best possible subset of features. The SFSDT model consists of a decision tree and Sequence Forward Selection. Then, it trains classifiers such as RNN, LSTM, and GRU on the selected features. At last, the authors evaluated their anomaly detection scheme on the ISCX and NSL-KDD datasets. They indicated that their scheme outperforms other IDS approaches in terms of accuracy and detection rate and reduces the required computation time. Table 10 compares the RNN-based IDS schemes regarding their evaluation metrics, simulators or environments, classifiers, feature extraction, and datasets.
4.9. Hybrid IDS schemes
This section discusses the IDS schemes which have incorporated two or more deep learning techniques in various steps of the intrusion detection process.
4.9.1. AE + CNN
The approaches discussed in this subsection, such as (Xu et al., 2020), apply AE and CNN models for handling intrusion detection problems. For instance, Wang et al. (2019c) introduced a hybrid IDS model for the Android environment based on the auto-encoder and CNN. They reconstructed the features of the Android application and employed several CNN models for detecting malware. To prevent problems such as over-fitting and to increase sparseness, they used an activation function in the CNN–S or serial CNN. Besides, in this scheme, the pooling layer, convolutional layer, and fully connected layer are combined to improve feature extraction. Furthermore, the authors used an auto-encoder for pre-training to reduce the training time. At last, the
Table 10
Properties of the RNN-based IDS schemes.
Scheme | Evaluation Metrics | Simulators/Environments | Classifiers | Feature Extractions | Datasets | Accuracy (%)
Xu et al. (2018) | Accuracy, Precision, Detection Rate, FPR, F1-Score | TensorFlow | - | - | NSL-KDD, KDDCup’99 | NSL-KDD = 99.24, KDDCup’99 = 99.84
Almiani et al. (2020) | Accuracy, Precision, Detection Rate, F1-Score, FPR, FNR, Kappa Coefficients, Matthews Correlation | MATLAB | RNN | - | NSL-KDD | 90.32
Tang et al. (2019) | Accuracy, ROC, Throughput, Latency | TensorFlow, Keras | GRU-RNN | - | NSL-KDD | 89
Kaur and Singh (2019) | Recall, Specificity, Accuracy, F-Measure, AUC | - | RNN | - | CICIDS 2017, NSL-KDD | CICIDS 2017 = 99.10, NSL-KDD = 99.40
Tang et al. (2018) | Precision, Recall, F1-Score, Accuracy | Keras, Scikit-learn | GRU-RNN | - | NSL-KDD | 89
Yin et al. (2017) | Accuracy, Precision, Recall, F1-Score | Python | DNN | - | NSL-KDD, KDDCup’99 | NSL-KDD = 83.28, KDDCup’99 = 81.29
Le et al. (2019) | Accuracy, Detection Rate, Computation Time | Python | RNN, LSTM, GRU | Sequence Forward Selection, Decision Tree | ISCX, NSL-KDD | ISCX = 99.5, NSL-KDD = 94
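The Sequence Forward Selection step listed for Le et al. (2019) in Table 10 can be illustrated with a generic greedy forward-selection loop. The sketch below is an assumption-laden illustration rather than the SFSDT model itself: the scoring function is left as a caller-supplied parameter (in SFSDT a decision tree plays this role), and the names `forward_select`, `score`, and `min_gain` are hypothetical.

```python
from typing import Callable, List


def forward_select(n_features: int,
                   score: Callable[[List[int]], float],
                   min_gain: float = 0.0) -> List[int]:
    """Greedily add the feature that most improves score(subset).

    Stops when no remaining feature improves the score by more than min_gain.
    """
    selected: List[int] = []
    best_score = float("-inf")
    remaining = list(range(n_features))
    while remaining:
        # Evaluate every candidate extension of the current subset.
        cand_score, cand = max((score(selected + [f]), f) for f in remaining)
        if cand_score - best_score <= min_gain:
            break
        selected.append(cand)
        remaining.remove(cand)
        best_score = cand_score
    return selected
```

In practice `score` would train and validate a classifier on the candidate subset, so the loop costs on the order of the square of the number of features in model fits; this is the usual trade-off of wrapper-style feature selection.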
authors conducted experiments on several Android applications and indicated that a 5% improvement in accuracy is achieved in comparison with SVM, and an 83% improvement in training time is gained.
In (Lopez-Martin et al., 2017), Lopez-Martin et al. proposed an IDS scheme for finding anomalies in HTTP messages in IoT networks using a convolutional auto-encoder that applies binary image transformation. This auto-encoder has a decoder and an encoder with CNN architecture, and for normal messages it minimizes the binary cross-entropy between output and input images. When the training process is completed, this approach flags a message as anomalous when its binary cross-entropy is larger than a threshold. Through experiments, the authors confirmed that this scheme can provide better results than isolation forest and SVM classifiers. They also indicated that with a deeper convolutional auto-encoder, better results can be achieved. However, the character embedding applied in the image transformation incurs computational complexity to achieve only a small performance improvement.
4.9.2. AE + DBN
In (Yang et al., 2020a), Yang et al. proposed an IDS scheme for wireless networks which employs a conditional DBN to learn the temporal behavior features. They provided a window-based under-sampling algorithm for balancing the normal samples and attack samples in the AWID training dataset. They also used a stacked contractive auto-encoder for eliminating data redundancy. Through experiments, the authors showed that their detection method can achieve a better detection performance compared to other deep learning and shallow learning methods. These experiments show that the proposed mechanism is fast and has a low average detection time.
4.9.3. AE + DNN
This subsection illustrates IDS schemes such as (Yang et al., 2019b) that benefit from deep learning techniques like AE and DNN. For instance, Yang et al. (2020b) proposed SAVAER-DNN, a NIDS that can detect known and unknown attacks and improves the detection rate of low-frequent attacks. SAVAER is a supervised variational auto-encoder with regularization, which uses WGAN-GP instead of the vanilla GAN to learn the latent distribution of the original data. SAVAER’s decoder is used to synthesize samples of low-frequent and unknown attacks, increasing the diversity of training samples and balancing the training data set. SAVAER’s encoder is used to initialize the weights of the hidden layers of the DNN and explore high-level feature representations of the original samples. The authors conducted evaluations using the UNSW-NB15 and NSL-KDD datasets in terms of metrics such as detection rate, accuracy, false-positive rate, and F1-Score.
In (Azmin and Islam, 2020), the authors presented a NIDS approach using a Variational Laplace Auto-encoder and DNN. They applied the class labels as an input to the auto-encoder and denoted it as CVLAE, or Conditional Variational Laplace Auto-encoder. They employed the CVLAE for probabilistic data synthesis and for learning variable representations of the network data features. They used a DNN classifier and applied the synthesized and original data for its training. At last, the authors evaluated their NIDS using the NSL-KDD dataset and demonstrated that it can detect even minority attacks at a high rate. Also, in (Tang et al., 2020b), Tang et al. proposed SAAE-DNN, an IDS method that combines SAE and DNN. In this scheme, the data features are mined using the SAAE, which after training is employed for initializing the weights of the DNN hidden layers. The performance of this IDS approach is analyzed with the NSL-KDD dataset and the results are compared to various deep learning networks such as CNN, RNN, AE, etc. However, this scheme does not consider the imbalanced dataset problem and cannot handle attacks with fewer data samples.
4.9.4. AE + GAN
This subsection describes approaches such as (Hara and Shiomoto, 2020) that have incorporated AE and GAN for anomaly and intrusion detection. In (Zhang et al., 2020b), Zhang et al. introduced CWGAN-CSSAE, a NIDS scheme that uses stacked auto-encoders and an enhanced conditional Wasserstein generative adversarial network, which uses gradient penalty and L2 regularization to generate minority attacks and reduce the class imbalance of the training dataset. Besides, a stacked auto-encoder is applied to extract deep features of the network data. Also, this scheme assigns a large misclassification cost to the minority attacks with a cost-based loss function. The authors carried out experiments on the KDDCup’99 and UNSW-NB15 datasets to show that this IDS model can handle minority attacks and unknown attacks, and evaluated this capability using the accuracy and F1-Score metrics.
To deal with the lack of sufficient IoT traffic data for anomaly detection systems, in (Zixu et al., 2020), the authors presented an unsupervised anomaly detection approach using GAN and auto-encoder. This scheme trains a centralized auto-encoder and passes it to the networks for anomaly detection once it is adapted to the raw data from the IoT networks. The authors evaluated this scheme with the UNSW Bot-IoT dataset.
4.9.5. AE + LSTM
Zhang et al. (2020c) introduced AN-LSTM, a hybrid NIDS model that benefits from the auto-encoder and LSTM, aiming to reduce the computational complexity of the intrusion detection process and eliminate redundant information. This scheme uses the KDDCup’99 dataset and performs preprocessing on it. Then, it applies the auto-encoder for dimension reduction of features and LSTM for classification. At last, this scheme is evaluated using metrics such as false alarm rate and accuracy. It is indicated that this scheme achieves better results than LSTM and some other machine learning approaches. However, the encoding and decoding process of the auto-encoder is time-consuming and the authors provide no solution to this problem.
4.9.6. CNN + LSTM
A few schemes, such as (Zhang et al., 2020d), have applied CNN and LSTM for providing high-performance IDS approaches. For example, in (Wang et al., 2017), Wang et al. proposed HAST-IDS, which uses CNNs to learn the spatial features of the network packets and applies an LSTM to learn the temporal features among multiple network packets. As a result, their method obtains more accurate spatial-temporal traffic features. The authors evaluated their scheme using the ISCX and DARPA1998 datasets. The experimental results show that the HAST-IDS improves the accuracy and DR and reduces the FPR because it automatically learns the spatial-temporal features. However, the HAST-IDS has low performance for the attack classes having fewer samples and, as a result, its performance on imbalanced datasets should be enhanced. Also, in (Kim et al., 2020), Kim et al. proposed AI-IDS, an anomaly detection scheme that conducts payload-level deep learning using CNN, LSTM, and spatial feature learning. As an advantage, they used web traffic for their evaluations. However, this scheme should be continuously revalidated to reduce the FPR.
4.9.7. DNN + RNN
In (Tang et al., 2020c), Tang et al. introduced DeepIDS, a flow-based anomaly detection approach for SDNs which applies a fully-connected DNN and GRU-RNN, or Gated Recurrent Unit RNN. The DeepIDS can deal with networks of different sizes and can be optimized to handle new threat models while not affecting the network performance. They conducted experiments using the NSL-KDD dataset on the Keras library to evaluate their flow-based anomaly detection approach regarding resource utilization, latency, and throughput. However, the scheme is not evaluated using real network traffic and further performance evaluations are needed regarding latency and throughput.
Table 11 compares some of the properties of the hybrid deep
Table 11
Properties of the hybrid deep learning-based IDS schemes.
Scheme | Evaluation Metrics | Simulators/Environments | Classifiers | Feature Extractions | Datasets | Accuracy (%)
Wang et al. (2019c) | FPR, Precision, Accuracy, Recall, Positive predict value, F1-Score, Training Time | TensorFlow, Keras | CNN | Auto-encoder | Self-collected | 99.82
Lopez-Martin et al. (2017) | FPR, Precision, F1-Score, Matthews correlation coefficient | PyTorch | CNN | Auto-encoder | Modified National Institute of Standards and Technology dataset | -
Yang et al. (2020a) | Accuracy, Precision, Recall, F1-Score, Matthews correlation coefficient | Python | Conditional DBN | Stacked Contractive Auto-Encoder | AWID | -
Yang et al. (2019b) | Accuracy, Recall, Precision, F1-Score, FPR | TensorFlow | DNN | - | NSL-KDD, UNSW-NB15 | NSL-KDD = 85.97, UNSW-NB15 = 89.08
Yang et al. (2020b) | Accuracy, Precision, Recall, detection rate, FPR, F1-Score, G-mean, ROC, AUC | TensorFlow | DNN | Variational Auto-Encoder | NSL-KDD, UNSW-NB15 | NSL-KDD = 80.3, UNSW-NB15 = 93.01
Tang et al. (2020b) | Accuracy, Recall, Precision, F1-Score | TensorFlow, Keras | DNN | Stacked Auto-encoder | NSL-KDD | 87.74
Hara and Shiomoto (2020) | Accuracy, Precision, FPR | Python | DNN | Variational Auto-Encoder | NSL-KDD | 83.11
Zhang et al. (2020b) | Accuracy, F1-Score | TensorFlow | GAN | Stacked Auto-Encoder | NSL-KDD, UNSW-NB15 | NSL-KDD = 85.97, UNSW-NB15 = 89.08
Zixu et al. (2020) | Accuracy, Precision, Recall, F1-Score | - | GAN | Auto-encoder | UNSW Bot-IoT | 95.12
Zhang et al. (2020c) | Accuracy, FPR | - | LSTM | Auto-encoder | KDDCup’99 | 95
Zhang et al. (2020d) | Accuracy, Precision, Recall, F1-Score | TensorFlow, TensorLayer | Ensemble Classifier | - | CSE–CIC–IDS2018 | 98.7
Wang et al. (2017) | Accuracy, Detection Rate, FPR | TensorFlow, Keras | CNN + LSTM | - | DARPA, ISCX | DARPA = 99.99, ISCX = 99.96
Kim et al. (2020) | Accuracy, Precision, F1-Score, Recall, Specificity | TensorFlow, Keras | CNN + LSTM | - | CICIDS2017 | 99.99
Tang et al. (2020c) | Precision, Recall, F1-measure, Training Time, Testing Time | TensorFlow, Keras | DNN | - | NSL-KDD | 95.3
learning-based IDS schemes.
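The evaluation metrics that recur in the tables of this section (accuracy, precision, recall or detection rate, F1-Score, and FPR) all derive from the four counts of a binary confusion matrix. The following minimal sketch shows the standard definitions; the function name and the convention of returning 0.0 for an empty denominator are assumptions made for illustration.

```python
from typing import Dict


def ids_metrics(tp: int, fp: int, tn: int, fn: int) -> Dict[str, float]:
    """Compute common IDS metrics from confusion-matrix counts.

    tp: attacks flagged as attacks, fp: normal traffic flagged as attacks,
    tn: normal traffic passed as normal, fn: attacks passed as normal.
    """
    def ratio(numerator: float, denominator: float) -> float:
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)  # also called detection rate or sensitivity
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),
    }
```

On imbalanced datasets, accuracy alone can look high even when minority attacks are missed, which is why recall, F1-Score, and FPR are worth reporting alongside it.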
5. Discussion
This section provides a discussion on various issues about the deep learning-based IDS schemes investigated in the previous section. For this purpose, it provides some statistics about the following issues:
• The number of IDS schemes provided using each type of deep learning network.
• The number of IDS approaches presented using auto-encoders.
• The number of IDS solutions that benefit from two different types of deep learning networks.
• The number of IDS solutions that incorporated different intrusion detection datasets to evaluate their performance.
• The percentage of IDS schemes evaluated using one, two, three, four, or five different intrusion detection datasets.
• The percentage of IDS schemes that have applied each evaluation metric in their experiments.
• The percentage of schemes that try to handle the imbalanced dataset problem.
• The accuracy of the investigated IDS approaches regarding their employed datasets.
It is worth mentioning that these statistics are based on the 85 deep learning-based IDS approaches studied in the previous section. At first, we analyze which deep learning network is favored by the most IDS schemes. For this purpose, Fig. 7 exhibits the number of IDS approaches which have applied different types of deep learning networks. As shown in this figure, auto-encoders are applied by 21 schemes, CNNs are used in 14 IDS solutions, and 17 IDS schemes have incorporated two deep learning techniques, categorized as hybrid schemes. Also, as shown in this figure, many IDS schemes using two deep learning approaches are designed to deal with various types of intrusions. Fig. 8 depicts the number of IDS schemes designed using different types of auto-encoders.
Fig. 7. Number of the IDS schemes proposed using each deep learning method.
Fig. 8. Auto-encoder based IDS approaches.
Fig. 9 depicts the number of IDS approaches that have combined two deep learning networks to deal with various intrusions and anomalies. As shown in this figure, the combination of AE and DNN is used by the largest number of IDS schemes, while the CNN + LSTM, AE + GAN, and AE + CNN combinations are also applied by several IDS solutions.
Fig. 9. Number of the IDS approaches applied two deep learning networks.
Fig. 10 indicates the datasets applied in the studied deep learning-based IDS solutions. However, as shown in this figure, 34 and 15 IDS schemes are still applying the NSL-KDD and KDDCup’99 legacy datasets, respectively. Also, for conducting a more complete evaluation of the proposed schemes, some schemes apply two or more datasets. This … method.
Fig. 10. Datasets applied in the deep learning-based IDS schemes.
Fig. 11 exhibits the percentage of the deep learning-based IDS approaches that have applied different numbers of datasets. As shown in this figure, 72% of the IDS approaches are verified with only one dataset and 25% of them have been evaluated using only two datasets. However, other categories are applied by only 1% of the studied IDS schemes.
Fig. 11. Percentage of the approaches which have used two or more datasets.
Fig. 12 indicates the percentage of the IDS schemes which have applied each evaluation metric. As shown in this figure, 27% of the IDS schemes are evaluated using the accuracy metric, 24% of them are analyzed using the precision metric, and 14% of them are evaluated using recall. However, fewer IDS approaches have considered the time required for training and testing.
Fig. 12. Percentage of the IDS schemes which have applied each evaluation metric.
The capability of handling imbalanced IDS datasets is an interesting feature that some of the studied schemes support by applying deep learning networks to generate synthetic data for the classes with fewer data samples. Fig. 13 exhibits the percentage of the schemes that
can handle imbalanced datasets. As depicted in this figure, only 12% of the investigated IDS approaches can deal with imbalanced datasets. These schemes often use GAN models to generate synthetic data for the classes which have fewer data samples. GANs can alleviate the problem with imbalanced datasets, but their data may not fully represent real network traffic traces, and some considerations are needed to further check the validity of the synthetically produced training data.
Fig. 13. Percentage of the schemes that can handle imbalanced datasets.
Fig. 14 indicates the accuracy of the studied deep learning-based IDS schemes on the UNSW-NB15 dataset. Fig. 15 depicts the accuracy of the investigated IDS schemes which have employed the ISCX dataset for evaluating their scheme. Fig. 16 exhibits the accuracy of the deep learning-based solutions on the NSL-KDD dataset. As shown in this figure, many schemes have applied this legacy dataset, which may not represent the traffic of current computer networks.
Fig. 14. Accuracy of the studied schemes on the UNSW-NB15.
Fig. 15. Accuracy of the studied schemes on the ISCX dataset.
Fig. 16. Accuracy of the studied schemes on the NSL-KDD dataset.
6. Conclusions and future research directions
Intrusion detection systems (IDS) have an essential role in securing computer systems and networks. To increase the performance of the IDS schemes and their effectiveness against new security challenges, various AI techniques are incorporated in the IDS solutions. Deep learning is one of the areas that researchers have focused on to improve the feature selection/extraction and classification steps of the IDS approaches. To this end, in recent years, many deep learning-based IDS solutions have been designed and proposed in the literature, and this paper aims to provide an extensive survey and classification of them. For this purpose, it first provides the background knowledge, in which the various types of deep learning networks applied in the studied IDS approaches are illustrated; the main datasets used to evaluate and analyze the IDS schemes are also described in that section. Then a taxonomy of the proposed deep learning-based IDS approaches regarding their applied deep learning network is provided. To be more specific, in each category, the application of deep learning techniques in different steps of the IDS approaches is discussed to indicate how each IDS solution benefited from deep learning to further enhance the performance of the intrusion detection process. Finally, various features of the studied deep learning-based IDS schemes are discussed to illuminate the techniques and methods that are highly incorporated in the investigated schemes.
Although a great deal of research has been performed in the deep learning-based IDS context, the following issues can be considered in subsequent research:
• Several distributed IDS schemes have applied blockchain technology to improve the security of data transmitted in the intrusion detection process. However, this issue has been neglected in the investigated deep learning-based IDS approaches. Consequently, future studies can focus on further enhancing the proposed approaches using blockchain or other security technologies.
• Most of the studied IDS schemes are designed for general environments; only 5% of them target IoT, 5% SDNs, and 3% vehicular networks. Also, among these few schemes, only a handful address anomaly intrusion detection. Thus, deep learning-based intrusion detection in special computing environments should be further investigated, with more focus on anomaly detection in such environments.
• In the investigated schemes, most of the applied datasets are for general networks. Therefore, regarding the many emerging security attacks in new computing environments and the dependency of deep learning-based techniques on quality data, producing special datasets for them must be considered in subsequent studies. This can improve the evaluation of new IDS schemes and reveal their real effectiveness in handling different intrusions and malware. Also, the new datasets should cover most of the known attacks or support the required profiles applied in anomaly detection.
• Although hardware technologies such as the Field Programmable Gate Array (FPGA) (Liu et al., 2018) and Application-Specific Integrated Circuit (ASIC) (Tefai et al., 2020) are applied in the literature for the training of neural networks (Boutros et al., 2018; Venkataramanaiah et al., 2020), the investigated deep learning-based IDS schemes have not utilized these technologies and have only employed GPUs and CPUs in their training. Consequently, future studies in the deep learning-based intrusion detection domain can benefit from FPGA, ASIC, and even other hardware technologies such as the neural processing unit (NPU) or neural processor.
• Since it is unlikely that the same number of records can be collected for all attack classes, having unbalanced datasets is inevitable, which leads to problems such as over-fitting. GANs are used in the studied IDS schemes for learning the distribution of input data and generating synthetic data for imbalanced datasets. However, the produced data may not be realistic. Hence, techniques to inspect the synthetic data seem to be necessary.
• Very few deep learning-based intrusion detection approaches support both misuse detection and anomaly detection. Given the capability of such hybrid schemes in handling known and unknown attacks, they deserve further attention in upcoming studies.
• Given the inefficiency of misuse detection schemes in handling encrypted traffic and their need to have the attack signatures and keep updating the signature database, anomaly detection schemes will gain momentum in future IDS schemes. However, few deep learning-based anomaly intrusion detection approaches are currently provided, and subsequent studies should address this issue.
• For handling real-time intrusion detection with streaming data, techniques such as online learning and incremental learning can be used.
• Regarding the huge number of data samples that deep learning models must be trained with, parallel (Jin and Kim, 2019) and distributed (Sergeev and Del Balso, 2018; Akiba et al., 2017) methods are provided in the literature for faster training of deep learning networks (Ben-Nun and Hoefler, 2019). However, only a handful of IDS schemes (Al Jallad et al., 2019) have used distributed deep learning methods, and in future studies these techniques should be further analyzed and evaluated in the intrusion detection domain.
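To make the online and incremental learning direction in the list above concrete, the following minimal sketch updates a logistic-regression detector one labeled flow at a time, so the model can follow a traffic stream without retraining from scratch. It is only an illustration under stated assumptions: the class name `OnlineLogisticDetector`, its methods, and the learning rate are hypothetical, and a deep model would replace the linear one in a realistic IDS.

```python
import math
from typing import List, Sequence


class OnlineLogisticDetector:
    """Logistic regression trained by one stochastic gradient step per sample."""

    def __init__(self, n_features: int, learning_rate: float = 0.5) -> None:
        self.weights: List[float] = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_proba(self, x: Sequence[float]) -> float:
        """Return the estimated probability that x is an attack."""
        z = self.bias + sum(w * v for w, v in zip(self.weights, x))
        z = max(-30.0, min(30.0, z))  # clip to keep exp() well-behaved
        return 1.0 / (1.0 + math.exp(-z))

    def partial_fit(self, x: Sequence[float], label: int) -> None:
        """Update the model with a single labeled sample (1 = attack, 0 = normal)."""
        error = self.predict_proba(x) - label  # gradient of the log-loss w.r.t. z
        for i, v in enumerate(x):
            self.weights[i] -= self.learning_rate * error * v
        self.bias -= self.learning_rate * error
```

Each call to `partial_fit` touches only the current sample, so memory use stays constant regardless of how long the stream runs; handling concept drift or delayed labels would need additional mechanisms that are outside this sketch.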
Declaration of competing interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
References
AbdAllah, E.G., Zulkernine, M., Hassanein, H.S., 2018. Preventing unauthorized access in information centric networking. Security and Privacy 1, e33.
Abusitta, A., Bellaiche, M., Dagenais, M., Halabi, T., 2019. A deep learning approach for proactive multi-cloud cooperative intrusion detection system. Future Generat. Comput. Syst. 98, 308–318.
Akiba, T., Fukuda, K., Suzuki, S., 2017. ChainerMN: scalable distributed deep learning framework. arXiv preprint arXiv:1710.11351.
Al Jallad, K., Aljnidi, M., Desouki, M.S., 2019. Big data analysis and distributed deep learning for next-generation intrusion detection system optimization. Journal of Big Data 6, 1–18.
Al-Garadi, M.A., Mohamed, A., Al-Ali, A.K., Du, X., Ali, I., Guizani, M., 2020. A survey of machine and deep learning methods for internet of things (IoT) security. IEEE Communications Surveys & Tutorials 22, 1646–1685.
Al-Qatf, M., Lasheng, Y., Al-Habib, M., Al-Sabahi, K., 2018. Deep learning approach combining sparse autoencoder with SVM for network intrusion detection. IEEE Access 6, 52843–52856.
Albawi, S., Mohammed, T.A., Al-Zawi, S., 2017. Understanding of a convolutional neural network. In: 2017 International Conference on Engineering and Technology (ICET), pp. 1–6.
Aldweesh, A., Derhab, A., Emam, A.Z., 2020. Deep learning approaches for anomaly-based intrusion detection systems: a survey, taxonomy, and open issues. Knowl. Base Syst. 189, 105124.
Dawoud, A., Shahristani, S., Raun, C., 2018. Deep learning and software-defined networks: towards secure IoT architecture. Internet of Things 3, 82–89.
Dutta, I.K., Ghosh, B., Carlson, A., Totaro, M., Bayoumi, M., 2020. Generative adversarial networks in security: a survey. In: 2020 11th IEEE Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON), pp. 0399-0405.
Elsaeidy, A., Munasinghe, K.S., Sharma, D., Jamalipour, A., 2019. Intrusion detection in smart cities using Restricted Boltzmann Machines. J. Netw. Comput. Appl. 135, 76–83.
Fernandes, G., Rodrigues, J.J.P.C., Carvalho, L.F., Al-Muhtadi, J.F., Proença, M.L., 2019. A comprehensive survey on network anomaly detection. Telecommun. Syst. 70, 447–489.
Ferrag, M.A., Maglaras, L., Moschoyiannis, S., Janicke, H., 2020. Deep learning for cyber security intrusion detection: approaches, datasets, and comparative study. Journal of Information Security and Applications 50, 102419.
Folino, G., Sabatino, P., 2016. Ensemble based collaborative and distributed intrusion detection systems: a survey. J. Netw. Comput. Appl. 66, 1–16.
Geetha, R., Thilagam, T., 2020. A review on the effectiveness of machine learning and deep learning algorithms for cyber security. Arch. Comput. Methods Eng. 1–19.
Gui, J., Sun, Z., Wen, Y., Tao, D., Ye, J., 2020. A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications. arXiv preprint arXiv:2001.06937.
Haggag, M., Tantawy, M.M., El-Soudani, M.M., 2020. Implementing a deep learning model for intrusion detection on Apache Spark platform. IEEE Access 8, 163660–163672.
Hande, Y., Muddana, A., 2020. A survey on intrusion detection system for software defined networks (SDN). Int. J. Bus. Data Commun. Netw. 16, 28–47.
Hara, K., Shiomoto, K., 2020. Intrusion detection system using semi-supervised learning with adversarial auto-encoder. In: NOMS 2020-2020 IEEE/IFIP Network Operations and Management Symposium, pp. 1–8.
Hodo, E., Bellekens, X., Hamilton, A., Tachtatzis, C., Atkinson, R., 2017. Shallow and Deep Networks Intrusion Detection System: A Taxonomy and Survey. arXiv preprint arXiv:1701.02145.
Hosseinzadeh, M., Rahmani, A.M., Vo, B., Bidaki, M., Masdari, M., Zangakani, M., 2020. Improving security using SVM-based anomaly detection: issues and challenges. Soft Computing 1–29.
Almiani, M., AbuGhazleh, A., Al-Rahayfeh, A., Atiewi, S., Razaque, A., 2020. Deep
recurrent neural network for IoT intrusion detection system. Simulat. Model. Pract.
Hu, Z., Wang, L., Qi, L., Li, Y., Yang, W., 2020. A novel wireless network intrusion
Theor. 101, 102031.
detection method based on adaptive synthetic sampling and an improved
Alom, M.Z., Taha, T.M., 2017. Network intrusion detection for cyber security using
convolutional neural network. IEEE Access 8, 195741–195751.
unsupervised deep learning approaches. In: 2017 IEEE National Aerospace and
Huang, S., Lei, K., 2020. IGAN-IDS: an imbalanced generative adversarial network
Electronics Conference (NAECON), pp. 63–69.
towards intrusion detection system in ad-hoc networks. Ad Hoc Netw. 105, 102177.
Alom, M.Z., Taha, T.M., Yakopcic, C., Westberg, S., Sidike, P., Nasrin, M.S., et al., 2019.
Idrissi, I., Azizi, M., Moussaoui, O., 2020. IoT security with deep learning-based intrusion
A state-of-the-art survey on deep learning theory and architectures. Electronics 8,
detection systems: a systematic literature review. In: 2020 Fourth International
292.
Conference on Intelligent Computing in Data Sciences. ICDS), pp. 1–10.
Andresini, G., Appice, A., Malerba, D., 2021. Nearest cluster-based intrusion detection
Ieracitano, C., Adeel, A., Morabito, F.C., Hussain, A., 2020. A novel statistical analysis
through convolutional neural networks. Knowl. Base Syst. 216, 106798.
and autoencoder driven intelligent intrusion detection approach. Neurocomputing
Asharf, J., Moustafa, N., Khurshid, H., Debie, E., Haider, W., Wahab, A., 2020. A review
387, 51–62.
of intrusion detection systems using machine and deep learning in internet of things:
Jafarian, T., Masdari, M., Ghaffari, A., Majidzadeh, K., 2020. A survey and classification
challenges, solutions and future directions. Electronics 9, 1177.
of the security anomaly detection mechanisms in software defined networks. Cluster
Ashfaq, R.A.R., Wang, X.-Z., Huang, J.Z., Abbas, H., He, Y.-L., 2017. Fuzziness based
Comput. 1–19.
semi-supervised learning approach for intrusion detection system. Inf. Sci. 378,
Ji, S., Ye, K., Xu, C.-Z., 2020. A network intrusion detection approach based on
484–497.
asymmetric convolutional autoencoder. In: International Conference on Cloud
Azmin, S., Islam, A.M.A.A., 2020. Network intrusion detection system based on
Computing, pp. 126–140.
conditional variational Laplace AutoEncoder. In: 7th International Conference on
Jin, X., Kim, H.-N., 2019. Parallel deep learning detection network in the MIMO channel.
Networking, Systems and Security, pp. 82–88.
IEEE Commun. Lett. 24, 126–130.
Ben-Nun, T., Hoefler, T., 2019. Demystifying parallel and distributed deep learning: an
Jing, L., Bin, W., 2016. Network intrusion detection method based on relevance deep
in-depth concurrency analysis. ACM Comput. Surv. 52, 1–43.
learning. In: 2016 International Conference on Intelligent Transportation, Big Data &
Binbusayyis, A., Vaiyapuri, T., 2021. Unsupervised Deep Learning Approach for Network
Smart City. ICITBS), pp. 237–240.
Intrusion Detection Combining Convolutional Autoencoder and One-Class SVM.
Kaur, S., Singh, M., 2019. Hybrid intrusion detection and signature generation using
Applied Intelligence, pp. 1–15.
deep recurrent neural networks. Neural Comput. Appl. 1–19.
A. Boukhalfa, A. Abdellaoui, N. Hmina, and H. Chaoui, "LSTM deep learning method for
Keegan, N., Ji, S.-Y., Chaudhary, A., Concolato, C., Yu, B., Jeong, D.H., 2016. A survey of
network intrusion detection system," Int. J. Electr. Comput. Eng. (2088-8708), vol.
cloud-based network intrusion detection analysis. Human-centric Computing and
10, 2020.
Information Sciences 6, 19.
Boutros, A., Yazdanshenas, S., Betz, V., 2018. You cannot improve what you do not
Keyvanrad, M.A., Homayounpour, M.M., 2014. A Brief Survey on Deep Belief Networks
measure: FPGA vs. ASIC efficiency gaps for convolutional neural network inference.
and Introducing a New Object Oriented Toolbox (DeeBNet) arXiv preprint arXiv:
ACM Trans. Reconfigurable Technol. Syst. (TRETS) 11, 1–23.
1408.3264.
Bridges, R.A., Glass-Vanderlan, T.R., Iannacone, M.D., Vincent, M.S., Chen, Q., 2019. A
Khalaf, B.A., Mostafa, S.A., Mustapha, A., Mohammed, M.A., Abduallah, W.M., 2019.
survey of intrusion detection systems leveraging host data. ACM Comput. Surv. 52,
Comprehensive review of artificial intelligence and statistical approaches in
1–35.
distributed denial of service attack and defense methods. IEEE Access 7,
Bu, S.-J., Cho, S.-B., 2017. A hybrid system of deep learning and learning classifier
51691–51713.
system for database intrusion detection. In: International Conference on Hybrid
Khan, F.A., Gumaei, A., Derhab, A., Hussain, A., 2019. A novel two-stage deep learning
Artificial Intelligence Systems, pp. 615–625.
model for efficient network intrusion detection. IEEE Access 7, 30373–30385.
Bu, S.-J., Cho, S.-B., 2020. A convolutional neural-based learning classifier system for
Khan, K., Mehmood, A., Khan, S., Khan, M.A., Iqbal, Z., Mashwani, W.K., 2020. A survey
detecting database intrusion via insider attack. Inf. Sci. 512, 123–136.
on intrusion detection and prevention in wireless ad-hoc networks. J. Syst. Architect.
Butun, I., Morgera, S.D., Sankar, R., 2013. A survey of intrusion detection systems in
105, 101701.
wireless sensor networks. IEEE communications surveys & tutorials 16, 266–282.
Kim, A., Park, M., Lee, D.H., 2020. AI-IDS: application of deep learning to real-time Web
Chaabouni, N., Mosbah, M., Zemmari, A., Sauvignac, C., Faruki, P., 2019. Network
intrusion detection. IEEE Access 8, 70245–70261.
intrusion detection for IoT security based on learning techniques. IEEE
Kumar, V., Das, A.K., Sinha, D., 2020. Statistical analysis of the UNSW-NB15 dataset for
Communications Surveys & Tutorials 21, 2671–2701.
intrusion detection. In: Computational Intelligence in Pattern Recognition. Springer,
Chalapathy, R., Chawla, S., 2019. Deep Learning for Anomaly Detection: A Survey arXiv
pp. 279–294.
preprint arXiv:1901.03407.
Kwon, D., Kim, H., Kim, J., Suh, S.C., Kim, I., Kim, K.J., 2019a. A survey of deep learning-
Choi, H., Kim, M., Lee, G., Kim, W., 2019. Unsupervised learning approach for network
based network anomaly detection. Cluster Comput. 1–13.
intrusion detection system using autoencoders. J. Supercomput. 75, 5597–5621.
Kwon, D., Kim, H., Kim, J., Suh, S.C., Kim, I., Kim, K.J., 2019b. A survey of deep learning-
Chuang, P.-J., Wu, D.-Y., 2019. Applying deep learning to balancing network intrusion
based network anomaly detection. Cluster Comput. 22, 949–961.
detection datasets. In: 2019 IEEE 11th International Conference on Advanced
Le, T.-T.-H., Kim, Y., Kim, H., 2019. Network intrusion detection based on novel feature
Infocomm Technology (ICAIT), pp. 213–217.
selection model and various recurrent neural networks. Appl. Sci. 9, 1392.
da Costa, K.A., Papa, J.P., Lisboa, C.O., Munoz, R., de Albuquerque, V.H.C., 2019.
Leevy, J.L., Khoshgoftaar, T.M., 2020. A survey and analysis of intrusion detection
Internet of Things: a survey on machine learning-based intrusion detection
models based on CSE-CIC-IDS2018 Big Data. Journal of Big Data 7, 1–19.
approaches. Comput. Network. 151, 147–157.
19
Li, X., Chen, W., Zhang, Q., Wu, L., 2020a. Building auto-encoder intrusion detection system based on random forest feature selection. Comput. Secur. 95, 101851.
Li, B., Wu, Y., Song, J., Lu, R., Li, T., Zhao, L., 2020b. DeepFed: federated deep learning for intrusion detection in industrial cyber-physical systems. IEEE Transactions on Industrial Informatics.
Liu, Q., Liu, J., Sang, R., Li, J., Zhang, T., Zhang, Q., 2018. Fast neural network training on FPGA using quasi-Newton optimization method. IEEE Trans. Very Large Scale Integr. Syst. 26, 1575–1579.
Liu, Y., Liao, Q., Zhao, J., Han, Z., 2019. Deep learning based encryption policy intrusion detection using commodity WiFi. In: 2019 IEEE 5th International Conference on Computer and Communications (ICCC), pp. 2129–2135.
Liu, L., Wang, P., Lin, J., Liu, L., 2020. Intrusion Detection of Imbalanced Network Traffic Based on Machine Learning and Deep Learning. IEEE Access.
Lopez-Martin, M., Carro, B., Sanchez-Esguevillas, A., Lloret, J., 2017. Conditional variational autoencoder for prediction and feature recovery applied to intrusion detection in IoT. Sensors 17, 1967.
Louati, F., Ktata, F.B., 2020. A deep learning-based multi-agent system for intrusion detection. SN Applied Sciences 2, 1–13.
Lu, S., Ying, L., Lin, W., Wang, Y., Nie, M., Shen, K., et al., 2019. New Era of Deeplearning-Based Malware Intrusion Detection: the Malware Detection and Prediction Based on Deep Learning. arXiv preprint arXiv:1907.08356.
Mahjabin, T., Xiao, Y., Sun, G., Jiang, W., 2017. A survey of distributed denial-of-service attack, prevention, and mitigation techniques. Int. J. Distributed Sens. Netw. 13, 1550147717741463.
Masdari, M., Jalali, M., 2016. A survey and taxonomy of DoS attacks in cloud computing. Secur. Commun. Network. 9, 3724–3751.
Masdari, M., Khezri, H., 2020a. A Survey and Taxonomy of the Fuzzy Signature-Based Intrusion Detection Systems. Applied Soft Computing, p. 106301.
Masdari, M., Khezri, H., 2020b. Efficient VM migrations using forecasting techniques in cloud computing: a comprehensive review. Cluster Comput. 1–30.
Masdari, M., Khezri, H., 2021. Towards fuzzy anomaly detection-based security: a comprehensive review. Fuzzy Optim. Decis. Making 20, 1–49.
Masdari, M., Khoshnevis, A., 2019. A survey and classification of the workload forecasting methods in cloud computing. Cluster Comput. 1–26.
Masdari, M., Zangakani, M., 2019. Green cloud computing using proactive virtual machine placement: challenges and issues. J. Grid Comput. 1–33.
Mayuranathan, M., Murugan, M., Dhanakoti, V., 2019. Best features based intrusion detection system by RBM model for detecting DDoS in cloud environment. Journal of Ambient Intelligence and Humanized Computing 1–11.
McHugh, J., 2000. Testing intrusion detection systems: a critique of the 1998 and 1999 DARPA intrusion detection system evaluations as performed by Lincoln Laboratory. ACM Trans. Inf. Syst. Secur. 3, 262–294.
Meena, G., Choudhary, R.R., 2017. A review paper on IDS classification using KDD 99 and NSL KDD dataset in WEKA. In: 2017 International Conference on Computer, Communications and Electronics (Comptelix), pp. 553–558.
Mighan, S.N., Kahani, M., 2018. Deep learning based latent feature extraction for intrusion detection. In: Electrical Engineering (ICEE). Iranian Conference on, pp. 1511–1516.
Mighan, S.N., Kahani, M., 2020. A novel scalable intrusion detection system based on deep learning. Int. J. Inf. Secur. 1–17.
Mishra, P., Pilli, E.S., Varadharajan, V., Tupakula, U., 2017. Intrusion detection techniques in cloud environment: a survey. J. Netw. Comput. Appl. 77, 18–47.
Nadeem, A., Howarth, M.P., 2013. A survey of MANET intrusion detection & prevention approaches for network layer attacks. IEEE Communications Surveys & Tutorials 15, 2027–2045.
Nguyen, M.T., Kim, K., 2020. Genetic convolutional neural network for intrusion detection systems. Future Generat. Comput. Syst. 113, 418–427.
Nguyen, Q.P., Lim, K.W., Divakaran, D.M., Low, K.H., Chan, M.C., 2019. GEE: a gradient-based explainable variational autoencoder for network anomaly detection. In: 2019 IEEE Conference on Communications and Network Security (CNS), pp. 91–99.
Nie, L., Ning, Z., Wang, X., Hu, X., Cheng, J., Li, Y., 2020. Data-driven intrusion detection for intelligent Internet of vehicles: a deep convolutional neural network-based method. IEEE Transactions on Network Science and Engineering 7, 2219–2230.
Parvat, A., Dev, S., Kadam, S., Chavan, J., 2017. Network intrusion detection system using ensemble of binary deep learning classifiers. In: International Conference on Smart Trends for Information Technology and Computer Communications, pp. 3–10.
Patel, A., Taghavi, M., Bakhtiyari, K., Júnior, J.C., 2013. An intrusion detection and prevention system in cloud computing: a systematic review. J. Netw. Comput. Appl. 36, 25–41.
Peng, W., Kong, X., Peng, G., Li, X., Wang, Z., 2019. Network intrusion detection based on deep learning. In: 2019 International Conference on Communications, Information System and Computer Engineering (CISCE), pp. 431–435.
Pouyanfar, S., Sadiq, S., Yan, Y., Tian, H., Tao, Y., Reyes, M.P., et al., 2018. A survey on deep learning: algorithms, techniques, and applications. ACM Comput. Surv. 51, 1–36.
Preethi, D., Khare, N., 2020. Sparse Auto Encoder Driven Support Vector Regression Based Deep Learning Model for Predicting Network Intrusions. Peer-to-Peer Networking and Applications, pp. 1–11.
Ren, H., Li, H., Liu, D., Xu, G., Cheng, N., Shen, X.S., 2020. Privacy-preserving Efficient Verifiable Deep Packet Inspection for Cloud-Assisted Middlebox. IEEE Transactions on Cloud Computing.
Resende, P.A.A., Drummond, A.C., 2018. A survey of random forest based methods for intrusion detection systems. ACM Comput. Surv. 51, 1–36.
Ring, M., Schlör, D., Landes, D., Hotho, A., 2019. Flow-based network traffic generation using generative adversarial networks. Comput. Secur. 82, 156–172.
Riyaz, B., Ganapathy, S., 2020. A deep learning approach for effective intrusion detection in wireless networks using CNN. Soft Computing 24, 17265–17278.
de Rosa, G.H., Roder, M., Santos, D.F., Costa, K.A., 2021. Enhancing anomaly detection through restricted Boltzmann machine features projection. Int. J. Inf. Technol. 13, 49–57.
Sadaf, K., Sultana, J., 2020. Intrusion detection based on autoencoder and Isolation Forest in fog computing. IEEE Access 8, 167059–167068.
Salakhutdinov, R., Hinton, G., 2009. Deep Boltzmann machines. In: Artificial Intelligence and Statistics, pp. 448–455.
Saraeian, S., Golchi, M.M., 2020. Application of deep learning technique in an intrusion detection system. Int. J. Comput. Intell. Appl. 19, 2050016.
Sarker, I.H., 2021. Deep cybersecurity: a comprehensive overview from neural network and deep learning perspective. SN Computer Science 2, 1–16.
Sarker, I.H., Kayes, A., Badsha, S., Alqahtani, H., Watters, P., Ng, A., 2020. Cybersecurity data science: an overview from machine learning perspective. Journal of Big Data 7, 1–29.
Sergeev, A., Del Balso, M., 2018. Horovod: Fast and Easy Distributed Deep Learning in TensorFlow. arXiv preprint arXiv:1802.05799.
Shahriar, M.H., Haque, N.I., Rahman, M.A., Alonso, M., 2020. G-IDS: generative adversarial networks assisted intrusion detection system. In: 2020 IEEE 44th Annual Computers, Software, and Applications Conference (COMPSAC), pp. 376–385.
Sharafaldin, I., Lashkari, A.H., Ghorbani, A.A., 2018. A detailed analysis of the CICIDS2017 data set. In: International Conference on Information Systems Security and Privacy, pp. 172–188.
Shone, N., Ngoc, T.N., Phai, V.D., Shi, Q., 2018. A deep learning approach to network intrusion detection. IEEE Transactions on Emerging Topics in Computational Intelligence 2, 41–50.
Shu, J., Zhou, L., Zhang, W., Du, X., Guizani, M., 2020. Collaborative intrusion detection for VANETs: a deep learning-based distributed SDN approach. IEEE Trans. Intell. Transport. Syst.
Sohn, I., 2020. Deep Belief Network Based Intrusion Detection Techniques: A Survey. Expert Systems with Applications, p. 114170.
Von Solms, R., Van Niekerk, J., 2013. From information security to cyber security. Comput. Secur. 38, 97–102.
Song, H.M., Woo, J., Kim, H.K., 2020. In-vehicle network intrusion detection using deep convolutional neural network. Vehicular Communications 21, 100198.
Soni, M., Ahirwa, M., Agrawal, S., 2015. A survey on intrusion detection techniques in MANET. In: 2015 International Conference on Computational Intelligence and Communication Networks (CICN), pp. 1027–1032.
Sperotto, A., Schaffrath, G., Sadre, R., Morariu, C., Pras, A., Stiller, B., 2010. An overview of IP flow-based intrusion detection. IEEE Communications Surveys & Tutorials 12, 343–356.
Su, T., Sun, H., Zhu, J., Wang, S., Li, Y., 2020. BAT: deep learning methods on network intrusion detection using NSL-KDD dataset. IEEE Access 8, 29575–29585.
Sultana, N., Chilamkurti, N., Peng, W., Alhadad, R., 2019. Survey on SDN based network intrusion detection system using machine learning approaches. Peer-to-Peer Networking and Applications 12, 493–501.
Tang, T.A., Mhamdi, L., McLernon, D., Zaidi, S.A.R., Ghogho, M., 2018. Deep recurrent neural network for intrusion detection in SDN-based networks. In: 2018 4th IEEE Conference on Network Softwarization and Workshops (NetSoft), pp. 202–206.
Tang, T.A., McLernon, D., Mhamdi, L., Zaidi, S.A.R., Ghogho, M., 2019. Intrusion detection in SDN-based networks: deep recurrent neural network approach. In: Deep Learning Applications for Cyber Security. Springer, pp. 175–195.
Tang, C., Luktarhan, N., Zhao, Y., 2020a. An efficient intrusion detection method based on LightGBM and autoencoder. Symmetry 12, 1458.
Tang, C., Luktarhan, N., Zhao, Y., 2020b. SAAE-DNN: deep learning method on intrusion detection. Symmetry 12, 1695.
Tang, T.A., Mhamdi, L., McLernon, D., Zaidi, S.A.R., Ghogho, M., El Moussa, F., 2020c. DeepIDS: deep learning approach for intrusion detection in software defined networking. Electronics 9, 1533.
Tavallaee, M., Bagheri, E., Lu, W., Ghorbani, A.A., 2009. A detailed analysis of the KDD CUP 99 data set. In: 2009 IEEE Symposium on Computational Intelligence for Security and Defense Applications, pp. 1–6.
Tefai, H.T., Saleh, H., Tekeste, T., Alqutayri, M., Mohammad, B., 2020. ASIC implementation of a pre-trained neural network for ECG feature extraction. In: 2020 IEEE International Symposium on Circuits and Systems (ISCAS), pp. 1–5.
Telikani, A., Gandomi, A.H., 2019. Cost-sensitive Stacked Auto-Encoders for Intrusion Detection in the Internet of Things. Internet of Things, p. 100122.
Thamilarasu, G., Chawla, S., 2019. Towards deep-learning-driven intrusion detection for the internet of things. Sensors 19, 1977.
Tschannen, M., Bachem, O., Lucic, M., 2018. Recent Advances in Autoencoder-Based Representation Learning. arXiv preprint arXiv:1812.05069.
Umer, M.F., Sher, M., Bi, Y., 2017. Flow-based intrusion detection: techniques and challenges. Comput. Secur. 70, 238–254.
Venkataramanaiah, S.K., Yin, S., Cao, Y., Seo, J.-S., 2020. Deep neural network training accelerator designs in ASIC and FPGA. In: 2020 International SoC Design Conference (ISOCC), pp. 21–22.
Verma, A., Ranga, V., 2018. Statistical analysis of CIDDS-001 dataset for network intrusion detection systems using distance-based machine learning. Procedia Computer Science 125, 709–716.
Security Vulnerabilities," 2021.
Wang, W., Sheng, Y., Wang, J., Zeng, X., Ye, X., Huang, Y., et al., 2017. HAST-IDS: learning hierarchical spatial-temporal features using deep neural networks to improve intrusion detection. IEEE Access 6, 1792–1806.
Wang, J., Chen, Y., Hao, S., Peng, X., Hu, L., 2019a. Deep learning for sensor-based activity recognition: a survey. Pattern Recogn. Lett. 119, 3–11.
Wang, P., Song, X., Deng, Z., Xie, H., Wang, C., 2019b. An improved deep learning based intrusion detection method. In: 2019 IEEE 5th International Conference on Computer and Communications (ICCC), pp. 2092–2096.
Wang, W., Zhao, M., Wang, J., 2019c. Effective android malware detection with a hybrid model based on deep autoencoder and convolutional neural network. Journal of Ambient Intelligence and Humanized Computing 10, 3035–3043.
Wang, X., Yin, S., Li, H., Wang, J., Teng, L., 2020. A network intrusion detection method based on deep multi-scale convolutional neural network. Int. J. Wireless Inf. Network 27, 503–517.
Wang, Z., Zeng, Y., Liu, Y., Li, D., 2021. Deep belief network integrating improved kernel-based extreme learning machine for network intrusion detection. IEEE Access 9, 16062–16091.
Wu, K., Chen, Z., Li, W., 2018. A novel intrusion detection model for a massive network using convolutional neural networks. IEEE Access 6, 50850–50859.
Xiao, Y., Xing, C., Zhang, T., Zhao, Z., 2019. An intrusion detection model based on feature reduction and convolutional neural networks. IEEE Access 7, 42210–42219.
Xin, Y., Kong, L., Liu, Z., Chen, Y., Li, Y., Zhu, H., et al., 2018. Machine learning and deep learning methods for cybersecurity. IEEE Access 6, 35365–35381.
Xu, C., Shen, J., Du, X., Zhang, F., 2018. An intrusion detection system using a deep neural network with gated recurrent units. IEEE Access 6, 48697–48707.
Xu, X., Li, J., Yang, Y., Shen, F., 2020. Towards Effective Intrusion Detection Using Log-Cosh Conditional Variational AutoEncoder. IEEE Internet of Things Journal.
Yadav, S., Kalpana, R., 2021. A Survey on Network Intrusion Detection Using Deep Generative Networks for Cyber-Physical Systems. Artificial Intelligence Paradigms for Smart Cyber-Physical Systems, pp. 137–159.
Yan, B., Han, G., 2018. Effective feature extraction via stacked sparse autoencoder to improve intrusion detection system. IEEE Access 6, 41238–41248.
Yang, H., Wang, F., 2019. Wireless network intrusion detection based on improved convolutional neural network. IEEE Access 7, 64366–64374.
Yang, H., Qin, G., Ye, L., 2019a. Combined wireless network intrusion detection model based on deep learning. IEEE Access 7, 82624–82632.
Yang, Y., Zheng, K., Wu, C., Yang, Y., 2019b. Improving the classification effectiveness of intrusion detection by using improved conditional variational autoencoder and deep neural network. Sensors 19, 2528.
Yang, L., Li, J., Yin, L., Sun, Z., Zhao, Y., Li, Z., 2020a. Real-time intrusion detection in wireless network: a deep learning-based intelligent mechanism. IEEE Access 8, 170128–170139.
Yang, Y., Zheng, K., Wu, B., Yang, Y., Wang, X., 2020b. Network intrusion detection based on supervised adversarial variational auto-encoder with regularization. IEEE Access 8, 42169–42184.
Yin, C., Zhu, Y., Fei, J., He, X., 2017. A deep learning approach for intrusion detection using recurrent neural networks. IEEE Access 5, 21954–21961.
Yu, Y., Si, X., Hu, C., Zhang, J., 2019. A review of recurrent neural networks: LSTM cells and network architectures. Neural Comput. 31, 1235–1270.
Zarpelão, B.B., Miani, R.S., Kawakani, C.T., de Alvarenga, S.C., 2017. A survey of intrusion detection in Internet of Things. J. Netw. Comput. Appl. 84, 25–37.
Zavrak, S., İskefiyeli, M., 2020. Anomaly-based intrusion detection from network flow features using variational autoencoder. IEEE Access 8, 108346–108358.
Zhang, N., Ding, S., Zhang, J., Xue, Y., 2018a. An overview on restricted Boltzmann machines. Neurocomputing 275, 1186–1199.
Zhang, H., Wu, C.Q., Gao, S., Wang, Z., Xu, Y., Liu, Y., 2018b. An effective deep learning based scheme for network intrusion detection. In: 2018 24th International Conference on Pattern Recognition (ICPR), pp. 682–687.
Zhang, Y., Zhang, H., Zhang, X., Qi, D., 2018c. Deep learning intrusion detection model based on optimized imbalanced network data. In: 2018 IEEE 18th International Conference on Communication Technology (ICCT), pp. 1128–1132.
Zhang, Y., Li, P., Wang, X., 2019a. Intrusion detection for IoT based on improved genetic algorithm and deep belief network. IEEE Access 7, 31711–31722.
Zhang, J., Li, F., Zhang, H., Li, R., Li, Y., 2019b. Intrusion detection system using deep learning for in-vehicle security. Ad Hoc Netw. 95, 101974.
Zhang, H., Huang, L., Wu, C.Q., Li, Z., 2020a. An effective convolutional neural network based on SMOTE and Gaussian mixture model for intrusion detection in imbalanced dataset. Comput. Network. 177, 107315.
Zhang, G., Wang, X., Li, R., Song, Y., He, J., Lai, J., 2020b. Network intrusion detection based on conditional Wasserstein generative adversarial network and cost-sensitive stacked autoencoder. IEEE Access 8, 190431–190447.
Zhang, Y., Zhang, Y., Zhang, N., Xiao, M., 2020c. A network intrusion detection method based on deep learning with higher accuracy. Procedia Computer Science 174, 50–54.
Zhang, C., Costa-Pérez, X., Patras, P., 2020d. Tiki-taka: attacking and defending deep learning-based intrusion detection systems. In: Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop, pp. 27–39.
Zixu, T., Liyanage, K.S.K., Gurusamy, M., 2020. Generative adversarial network and auto encoder based anomaly detection in distributed IoT networks. In: GLOBECOM 2020-2020 IEEE Global Communications Conference, pp. 1–7.

Sang-Woong Lee received the B.S. degree in electronics and computer engineering and the M.S. and Ph.D. degrees in computer science and engineering from Korea University, Seoul, South Korea, in 1996, 2001, and 2006, respectively. From June 2006 to May 2007, he was a Visiting Scholar with the Robotics Institute, Carnegie Mellon University. From September 2007 to February 2017, he was a Professor with the Department of Computer Engineering, Chosun University, Gwangju, South Korea. He is currently a Professor with the School of Computing, Gachon University, Seongnam, South Korea. His current research interests include face recognition, computational aesthetics, machine learning, medical imaging analysis, and AI-based applications.

Haval Mohammed sidqi received the B.Sc. degree in mathematics from the College of Science, University of Mosul, Iraq, in 1993, the High Diploma degree in computer science from Sulaimani University in 2006, the M.Sc. degree in computer science from Sulaimani University in 2018, and the Ph.D. degree in computer science from Sulaimani Polytechnic University in 2021.

Mokhtar Mohammadi received the B.S. degree in computer engineering from Shahed University, Tehran, Iran, in 2003, the M.S. degree in computer engineering from Shahid Beheshti University, Tehran, Iran, in 2012, and the Ph.D. degree in computer engineering from Shahrood University of Technology, Shahrood, Iran, in 2018. His current research interests include signal processing, time-frequency analysis, and machine learning. He is currently with the Department of Information Technology, Lebanese French University-Erbil, Iraq.

Shima Rashidi was born in Iran in 1989. She received the B.E. and M.E. degrees in computer science from the University of Tabriz, Tabriz, Iran, in 2011 and 2013, respectively. She is now a Ph.D. student at the University of Science and Technology, Tehran, Iran, and is currently an assistant lecturer at the University of Human Development, Sulaymaniyah, Kurdistan Region, Iraq. Her main areas of research interest are text mining, semi-supervised learning, and social network analysis.

Amir Masoud Rahmani received his BS in Computer Engineering from Amir Kabir University, Tehran, in 1996, the MS in Computer Engineering from Sharif University of Technology, Tehran, in 1998, and the PhD degree in Computer Engineering from IAU University, Tehran, in 2005. Currently, he is a Professor in the Department of Computer Engineering at the IAU University. He is the author/co-author of more than 200 publications in technical journals and conferences. His research interests are in the areas of distributed systems, Internet of things, and evolutionary computing.

Mohammad Masdari received his B.Tech. degree in Computer Software Engineering from Islamic Azad University, Qazvin Branch, Iran, in 2001, and his M.Tech. degree in Computer Software Engineering from Islamic Azad University, South Tehran Branch, Tehran, Iran, in 2003. He received his Ph.D. degree in Computer Software Engineering from Islamic Azad University, Science and Research Branch, Tehran, Iran, in 2014. Since 2003, he has worked as a faculty member of Islamic Azad University, Urmia Branch, Iran. Presently he is an Assistant Professor in the Department of Computer Engineering of Islamic Azad University, Urmia Branch, Iran. His research interests include Distributed Systems and Network Security.

Mehdi Hosseinzadeh received his B.S. degree in computer hardware engineering from Islamic Azad University, Dezfol Branch, Iran, in 2003. He also received his M.Sc. and Ph.D. degrees in computer system architecture from the Science and Research Branch, Islamic Azad University, Tehran, Iran, in 2005 and 2008, respectively. He is currently an Associate Professor at Iran University of Medical Sciences (IUMS), Tehran, Iran. He is the author/co-author of more than 120 publications in technical journals and conferences, and his research interests include SDN, Information Technology, Data Mining, Big Data Analytics, E-Commerce, E-Marketing, and Social Networks.
