1. Introduction
The Internet-of-Things (IoT) paradigm enables physical objects to interconnect and communicate with each other via the Internet [
1]. The popularity of IoT is fast-growing, and its adoption cuts across different areas of application such as energy, water, transport, defense, health, agriculture, etc. According to Cisco’s Annual Internet Report,
billion IoT devices will be connected to the Internet by 2023 [
2].
On the other hand, botnet is a network of compromised computers, known as bots, that are remotely controlled by a botmaster using a Command and Control (C&C) server [
3]. Nowadays, hackers leverage botnets, such as
Mirai [
4,
5], to exploit vulnerabilities in IoT devices and networks. They use this technique to launch different types of cyber-attacks against Internet-enabled infrastructures [
6,
7,
8,
9,
10,
11]. For example, in September 2016, the
Mirai botnet compromised several IoT devices to launch a Distributed Denial of Service (DDoS) attack, increasing traffic up to 620 Gbps in a web server [
5]. Additionally, in October 2016, a web-host and cloud service provider,
Dyn, was hit with a DDoS attack, increasing traffic to about
Tbps. The
Mozi botnet was first discovered in October 2019. It accounted for close to
of the total IoT network traffic monitored by IBM Security X-Force from that time until June 2020; this incidence increased IoT attack volume by
compared to the total IoT attack cases in the last two years [
12].
Cybercriminals can manipulate the energy market for a significant payoff of up to
million if they have access to 50,000 high-wattage IoT devices for only three hours a day, 100 days a year [
13,
14]. Additionally, with the current COVID-19 pandemic, corporate and IoT networks have become more vulnerable to botnet attacks because these networks are now being accessed remotely more often than before [
15]. Recent and complex botnet attacks that target IoT networks include Denial of Service (DoS), DDoS, Operating System (OS) Fingerprinting (OSF), Service Scanning (SS), Data Ex-filtration (DE) and Keylogging (KL) [
16]. A unique DoS or DDoS attack can be launched using either a Hypertext Transfer Protocol (HTTP), a Transmission Control Protocol (TCP) or a User Datagram Protocol (UDP). Such botnet attacks are referred to as DoS-HTTP, DoS-TCP, DDoS-UDP, DDoS-HTTP, DDoS-TCP or DDoS-UDP attacks.
Botnet attack detection in IoT networks can be formulated as a classification problem [
16]. For binary classification, each sample in a network traffic packet is classified as either benign or malicious based on certain predefined features. On the other hand, the specific category of botnet attack is identified in multi-class classification. Thus far, Artificial Intelligence (AI) techniques have achieved good performance in handling classification tasks in different application areas including voltage stability assessment of power systems among many others [
17]. Specifically, various Machine Learning (ML) and Deep Learning (DL) models have been developed to classify network traffic data in IoT networks. These models learn the discriminating features of benign traffic and malicious traffic using different architectures such as Random Forest (RF) [
18], Support Vector Machine (SVM) [
19], Deep Neural Network (DNN) [
20], Recurrent Neural Network (RNN) [
21], Long Short-Term Memory (LSTM) [
22] and Gated Recurrent Unit (GRU) [
23]. For an in-depth understanding, comprehensive reviews and surveys on the application of ML and DL in intrusion detection are presented in [
24,
25,
26,
27,
28,
29].
Classification of highly imbalanced network traffic data is a difficult task. The data are said to be highly imbalanced when the ratio of the number of samples in the majority class to that of the minority class is more than 1:10 [
30]. High class imbalance degrades the classification performance of ML and DL models in minority classes [
31,
32]. In addition to the class imbalance problem, small disjuncts, noise and overlap in network traffic samples can also lead to poor classification performance [
33,
34]. In popular botnet attack scenarios such as DDoS, the amount of malicious traffic generated is usually far more than the volume of benign traffic produced by legitimate devices in an IoT network. This means that the number of samples in the
normal class is very low compared to the number of samples in the
attack class. In this case, state-of-the-art DL models tend to be biased in favour of the majority (
attack) class, and this increases the false positive (FP) rate [
16]. The implication of deploying state-of-the-art ML and DL models in real-life IoT networks is that a significant percentage of network traffic data in the minority classes is misclassified, and this may, consequently, lead to a breach of privacy, loss of sensitive information, loss of revenue when applications and services are unavailable, and even loss of lives in critical IoT systems.
In this paper, we propose an efficient DL-based botnet attack detection algorithm to increase the detection rate and to reduce FP in minority classes without increasing the false negative (FN) rate in the majority classes. The main contributions of this paper are as follow:
- 1.
An efficient DL-based botnet attack detection algorithm is proposed for highly imbalanced network traffic data. Synthetic Minority Oversampling Technique (SMOTE) generates additional minority samples to achieve class balance, while Deep RNN (DRNN) learns hierarchical feature representations from the balanced network traffic data to perform discriminative classification.
- 2.
DRNN and SMOTE-DRNN models are trained, validated and tested with the Bot-IoT dataset to classify network traffic samples in the normal class and ten botnet attack classes.
- 3.
We investigate the effect of class imbalance on the accuracy, precision, recall, F1 score, false positive rate (FPR), negative predictive value (NPV), area under the receiver operating characteristic curve (AUC), geometric mean (GM) and Matthews correlation coefficient (MCC) of the DRNN and SMOTE-DRNN models.
- 4.
The training time and the testing time of the DRNN and SMOTE-DRNN models are analysed to evaluate their training speed and detection speed.
The rest of the paper is organised as follows: In
Section 2, we review the state-of-the-art ML and DL methods proposed for botnet attack detection in IoT networks; in
Section 3, we present a detailed description of the SMOTE-DRNN algorithm and model development process; in
Section 4, we discuss the results; and in
Section 5, we summarise the main findings of the paper.
2. Review of Related Works
In this section, we review the state-of-the-art ML and DL models that were developed with the Bot-IoT dataset [
16] to detect botnet attacks in IoT networks. Currently, the Bot-IoT dataset is the most relevant and up-to-date publicly available dataset for botnet attack detection in IoT networks because it (a) has IoT network traffic samples, (b) captures complete network information, (c) has a diversity of complex IoT botnet attack scenarios, (d) contains accurate ground truth labels and (e) provides a massive volume of labeled data required for effective supervised DL.
In recent literature, researchers have recommended diverse ML and DL model architectures for botnet detection in IoT networks. Odusami et al. [
35] proposed LSTM for DDoS attack detection in web servers, but they did not perform any experiment to validate the performance of the proposed method. Biswas and Roy [
36] proposed the GRU model because it outperformed the Artificial Neural Network (ANN) and LSTM models. However, the performance evaluation was based on accuracy only. Popoola et al. [
21] proposed Stacked RNN (SRNN), which involves cascading multiple layers of RNN. Tyagi and Kumar [
37] recommended the RF model because it performed better than the k-Nearest Neighbour (kNN), Logistic Regression (LR), SVM, Multi-Layer Perceptron (MLP) and Decision Tree (DT) models. Lo et al. [
38] proposed the Edge-based Graph Sample and Aggregate (E-GraphSAGE) model and it outperformed the Extreme Gradient Boosting (XGBoost) and DT models. Chauhan and Atulkar [
39] suggested the Light Gradient Boosting Machine (LGBM) model because it outperformed RF, Extra Tree (ET), Gradient Boost (GB) and XGBoost models. In [
40], the Convolutional Neural Network (CNN) model outperformed the RNN, LSTM and GRU models. Huong et al. [
41,
42] proposed a low-complexity edge-cloud DNN model, which achieved better performance than the kNN, DT, RF and SVM models. Lee et al. [
43] employed RF for botnet attack classification in an IoT smart factory. Shafiq et al. [
44] developed the Bayes Network (BN), C4.5 DT, Naive Bayes (NB), RF and Random Tree (RT) models. The bijective soft set algorithm [
45] was used to determine the most effective ML model based on accuracy, precision, recall and training time. Zakariyya et al. [
46] recommended LGBM as a resource-efficient ML method. Susilo et al. [
47] proposed the CNN model for botnet detection in Software-Defined Networks (SDN), and it outperformed the RF model. In [
48], the DNN model achieved a higher classification accuracy than the LR, kNN, DT, Classification And Regression Tree (CART) and SVM models. Das et al. [
49] developed the RF, RT, NB, C4.5 DT, Reduced Error Pruning Tree (REPT), BN and partial decision tree (PART) models with the 10 best features in [
16]. In [
50], the DT model outperformed the NB, kNN and SVM models. Sriram et al. [
51] concluded that DL models perform better than classical ML models.
In another group of studies, two or more ML/DL architectures were combined to form a hybrid model. Popoola et al. [
22] proposed a hybrid DL method based on the combination of LSTM Autoencoder (LAE) and deep Bidirectional LSTM (BLSTM) model architectures. LAE reduces the dimensionality of network traffic features to save memory space and to increase computation speed. On the other hand, deep BLSTM learns the long-term temporal relationships among the low-dimensional features to correctly distinguish between benign traffic and different classes of botnet attack traffic. Priya et al. [
52] combined SVM, NB, DT, RF and ANN to develop an ensemble classifier for attack detection in Industrial IoT (IIoT). Kunang et al. [
53] proposed a hybrid DL method by cascading Autoencoder (AE) and DNN model structures. AE performs feature extraction, while DNN performs the classification task. Zixu et al. [
54] merged Generative Adversarial Network (GAN) with AE to detect botnet attacks in distributed IoT networks. Ge et al. [
55] combined the concept of transfer learning with DNN. Bhuvaneswari and Selvakumar [
56] developed the Vector Convolutional Deep Learning (VCDL) model, which employed a vector convolutional network for feature extraction and a fully connected network for classification. Asadi et al. [
57] developed a hybrid ML model, which comprised the DNN, SVM and C4.5 decision tree algorithms. Khraisat et al. [
58] developed a hybrid ML model, which comprised One-Class SVM (OCSVM) and C4.5 DT. Aldhaheri et al. [
59] developed a hybrid between the Self Normalising Neural Network (SNN) and Dendritic Cell Algorithm (DCA) models with five optimal network traffic features, which were selected from the 10 best features in [
16] using the information gain method [
60].
Furthermore, different optimisation techniques were proposed to improve the classification performance of ML and DL models. Popoola et al. [
23] proposed a method that helps determine the most appropriate set of hyperparameters for training the Bidirectional GRU (BGRU) model in an efficient manner. Samdekar et al. [
61] recommended the Firefly Algorithm (FA) for feature dimensionality reduction because it outperformed the Chi-Square, ET and Principal Component Analysis (PCA) methods when SVM was used for classification. Kumar et al. [
62] proposed the combination of the correlation coefficient, RF mean decrease accuracy and gain ratio for selection of the most relevant features. Kunang et al. [
53] and Injadat et al. [
63] proposed the Bayesian Optimisation Gaussian Process (BO-GP) method to optimise the hyperparameters of the AE-DNN and DT models, respectively. In [
64], Binary Grey Wolf Optimisation (BGWO) was used for feature selection while NB was used for classification. Orevski and Androcec [
65] used Genetic Algorithm (GA) to optimise the hyperparameters of ANN. In [
66], Particle Swarm Optimisation (PSO) algorithm was used to determine the best hyperparameters that maximise AUC.
In the literature, botnet attack detection in IoT networks was treated as a classification task. Different ML and DL models have been developed for botnet attack detection in binary, 5-class or 11-class classification scenarios. For binary classification, ML/DL models were developed such that each sample of the network traffic data in the Bot-IoT dataset was classified as either normal or attack. For 5-class classification, four categories of botnet attacks, namely DDoS, DoS, reconnaissance and theft, were considered. For the 11-class classification scenario, ten categories of botnet attacks, namely DDoS-HTTP (DDH), DDoS-TCP (DDT), DDoS-UDP (DDU), DoS-HTTP (DH), DoS-TCP (DT), DoS-UDP (DU), OS fingerprinting (OSF), service scanning (SS), data exfiltration (DE) and keylogging (KL), were considered. In order to cover all of the available botnet attack types, our study focuses on botnet attack detection in an 11-class classification scenario.
Ferrag et al. [
67] proposed an RNN model that employs a single hidden layer and 60 hidden neurons. The model was trained with 1,878,561 network traffic samples for five epochs using a batch size of 100. The classification performance of the RNN model was evaluated with 1,797,803 network traffic samples in the test set. The RNN model outperformed the SVM, RF and NB models. The RNN model had the highest recall value for each of the 10 botnet attack classes. Furthermore, the overall recall and the FPR of the RNN model were the highest and lowest, respectively.
Ferrag et al. [
68] investigated the effectiveness of three deep discriminative models (DNN, RNN and CNN) and four deep generative models (Restricted Boltzmann Machine (RBM), Deep Belief Network (DBN), Deep Boltzmann Machine (DBM) and Deep AE (DAE)) for botnet attack detection in IoT networks. These models employed a single hidden layer and 15–100 hidden neurons. The activation functions at the hidden layer and the output layer were sigmoid and softmax, respectively. The DL models were trained with 5,877,647 network traffic samples for 100 epochs, given a learning rate of 0.01–0.5 and a batch size of 1000. The classification performance of the DL models was evaluated with 1,469,413 network traffic samples in the test set. The authors presented the recall of the DL models for the 10 botnet attack classes. Additionally, the overall recall, TNR, FPR and accuracy were reported. The CNN model outperformed the DNN and RNN models, while the DAE model outperformed the RBM, DBN and DBM models.
Ferrag et al. [
69] combined the REP Tree, JRip and Forest PA algorithms to form a hybrid ML model named RDTIDS. The REP Tree and JRip models were trained with 5,877,647 network traffic samples to perform binary classification. The outputs of the two models were combined with the network traffic features in the training set to develop the Forest PA model for multi-class classification. The classification performance of the RDTIDS model was evaluated with 1,469,413 network traffic samples in the test set. The authors presented the recall of the RDTIDS model for the 10 botnet attack classes. Additionally, the overall recall, FPR and accuracy were reported. The RDTIDS model outperformed the RF, REP Tree, MLP, NB, JRip, SVM and J48 models.
Alkadi et al. [
70] proposed a BLSTM model that employed a single hidden layer and 60 hidden neurons. BLSTM model was trained, validated and tested with
,
and
of the network traffic samples for 200 epochs, given a batch size of 100 and an Adam optimiser. The activation functions at the hidden layer and the output layer were hyperbolic tangent (
tanh) and softmax, respectively. The authors presented the recall of the BLSTM model for the 10 botnet attack classes. Additionally, the overall recall, FPR and accuracy were reported. The BLSTM model outperformed the SVM, RF and NB models.
SMOTE is a method that can effectively handle the class imbalance problem in training data, but it must be combined with the right classifier. Pokhrel et al. [
71] proposed SMOTE-kNN to address the class imbalance problem in botnet detection. However, the study focused on binary classification, and the IR in the training data was 1:208. Bagui and Li [
72] studied the effects of random undersampling (RU), random oversampling (RO), RU-RO, RU-SMOTE and RU with Adaptive Synthetic (ADSYN) methods on the performance of the ANN model. Qaddoura et al. [
73] combined SVM-SMOTE with DNN to handle a class imbalance in binary classification. Derhab et al. [
74] employed a combination of SMOTE and Temporal CNN to address the class imbalance in 5-class classification. However, the SMOTE method has not been previously combined with the DRNN model. Additionally, previous applications of SMOTE focused on binary and 5-class classification, but none of them applied it to solve the 11-class classification problem.
Table 1 shows that the distribution of network traffic samples in the training set is highly imbalanced across the 11 classes. The number of samples in the minority classes (DDH, DH, Norm, DE and KL) is relatively fewer than those in the majority classes (DDT, DDU, DT, DU, OF and SS). For majority classes, high class imbalance in the training set degrades the classification performance of state-of-the-art ML and DL models. Therefore, state-of-the-art ML and DL models may not detect DDH, DH, Norm, DE and KL correctly in IoT networks. In this paper, the number of samples in the Norm, DE and KL classes is relatively few compared to those in the previous studies. Therefore, the class imbalance problem in the present study is more challenging. Additionally, in previous related work [
67,
68,
69,
70], the recall values for the Norm class were not reported and the authors did not present the accuracy, precision, F1 score, FPR, NPV, AUC, GM and MCC of the ML/DL models for each of the 11 classes.