Research Paper

Machine Learning and Deep Learning Methods for Better Anomaly Detection in IoT-23 Dataset Cybersecurity
Yue Liang¹ and Nikhil Vankayalapati²
¹ Department of Computer Science, Lakehead University, ON, Canada
² Department of Computer Science, Lakehead University, ON, Canada
Abstract—As smart devices and the Internet develop, Internet of Things (IoT) technologies have become an important factor in our lives. IoT helps manufacturing companies monitor the status of every machine in real time, the quality of products, and the environmental variables within the factory. This not only allows managers to reduce the risk of damage and loss, but also helps them make decisions from a higher overall standpoint. In addition, IoT has changed people's lives and behavior. People now rely on IoT devices and services more than ever. However, anomalies can cause security and safety issues for an IoT network. It is important to detect anomalies and alert users to prevent damage or loss. In this paper, we propose using Machine Learning and Deep Learning methods to detect anomalies in a network. The experiments were performed on the IoT-23 dataset. The performance and time cost of these models are compared to identify the algorithm that gives high performance in less time.

Index Terms—Internet of Things, security, malicious node, anomaly detection, Machine Learning, Deep Learning.

I. INTRODUCTION
The Internet of Things (IoT) is a revolution in the global information industry after the Internet. The IoT is a smart network that allows devices to exchange information and communicate with each other through the Internet. With IoT, people can track, monitor, locate, identify and manage things [1]. Since the revolution of the Internet and mobile devices, IoT has become an evolving and hot research topic in computer science. The number of IoT devices on the Internet is increasing every year and in every sector, such as Smart Healthcare, Smart Transportation, Smart Governance, Smart Agriculture, Smart Grid, Smart Home, Smart Supply Chain, etc. [2].
Because of the convenience brought by IoT, human behavior has also changed. Younger generations are more accustomed to using services from IoT devices such as smart bulbs, smart ovens, smart refrigerators, air conditioners, temperature sensors, smoke detectors, etc. [3]. However, as IoT develops, concerns about privacy and security have increased among users. Because all the devices are connected to the Internet and to each other, there are more ways for an attacker to access information. The connected devices collect and store data containing personal information. Most users do not have knowledge of IoT technology, and hackers can steal information from users or even control their smart devices [4]. This not only hinders the advancement of IoT technology but also slows down the development of IoT infrastructure. Therefore, providing security and privacy for these constantly and heavily connected devices has become a major challenge. Another key issue in providing security and privacy for these devices is managing the huge amount of data they generate, which is quite difficult using general data collection, storage and processing techniques [18].
With the development of Machine Learning (ML) and Deep Learning (DL), learning algorithms can learn from training data and adapt in order to improve performance and make informed, intelligent decisions. A learning algorithm that has been trained on the data is able to distinguish regular benign traffic from malicious traffic. In other words, it can detect abnormal behaviour in the network, thereby preventing unauthorized access. Learning algorithms are broadly classified into two categories: Supervised Learning and Unsupervised Learning. We use lightweight machine learning methods and neural networks to improve accuracy in detecting malicious nodes. The central unit in the model captures IoT traffic data and sends the data to a selected trained Machine Learning or Deep Learning model. Multiple trained Machine Learning and Deep Learning models are tested. The reason for choosing multiple models is to fit the individual needs of different users or groups. In other words, it is important to find an efficient model for each type of user.
The large volume of data in the IoT network and the heterogeneity of that data make it difficult to improve security while meeting requirements such as cost effectiveness, reliability, performance, etc. In some cases, improving one feature may affect the performance of other features [16]. For example, increasing the number of security checks and protocols in all data transfers may increase the cost and latency of that particular application, making it unsuitable for certain users. Also, an increase in the number of connected devices increases the chance for an attacker to gain access to the network through a node or device that is a weak link, for example a smart bulb. Most devices currently available on the market do not have security features such as firewalls, anti-virus, etc. As the IoT devices are resource constrained, it is important for these
devices to detect an intrusion with low complexity and in little time. The use of Machine Learning (ML) and Deep Learning (DL) techniques helps to reduce this complexity, as these models learn from the training data. It is important for the central unit to classify the integrity of each message. The privacy and security issues of IoT motivate research on frameworks for automatic IoT sensor attack and anomaly detection [14].
In this paper, we propose to use ML/DL algorithms such as Support Vector Machines, Decision Trees, Naive Bayes and Convolutional Neural Networks for anomaly detection, and we determine the better algorithm to use based on accuracy and time cost. We used the IoT-23 dataset to implement the ML/DL methods. The paper is organized as follows: Section II reviews the literature, Section III explains the methodology, and Section IV discusses the results with evaluation metrics and comparison. Section V concludes the paper with a few suggestions for future work. Finally, the references for this study are listed.

II. LITERATURE REVIEW
In this section, different anomaly detection algorithms and methodologies are briefly discussed. There are a number of different mechanisms to improve the safety and privacy of IoT devices. For example, in [12], a chaos-based encryption technique is used to generate symmetric keys to provide secure data transmission between the server and the IoT device, which guarantees data integrity and authenticity. In [13], a mechanism with low computational complexity is proposed that uses random hopping sequences and random permutations to hide valuable information. Moreover, in [14], Doshi et al. presented a method to detect DDoS attacks in the network layer with low-cost machine learning approaches, including KNN, LSVM, NN, Decision Tree, and Random Forest. This method can identify which node is attacking the central unit by its IP address, and it was reported to achieve high testing accuracy for all five machine learning algorithms. In [21], anomaly detection is done using fog computing, which clusters the different types of anomalies present in the sensor layer or edge nodes, performing the computation in the fog layer of the network rather than in the cloud or the sensor layer. Using fog computing makes it easier to detect an anomaly. In [17], the authors implement a malware detection system that uses k-NN and random forest classifiers to build the model. The device filters TCP packets and selects important features such as frame number, length, labels, etc. The k-NN algorithm assigns traffic to a class, while the random forest classifier builds decision trees to detect the malware. The authors of [22] propose a methodology that uses game theory and the Nash equilibrium to help resource-constrained IoT devices detect anomalies with an Intrusion Detection System (IDS), activating it only when needed. When an attack occurs, the attack pattern (signature) is stored and the model is trained; whenever the pattern repeats, it is identified by the signature detection technique and the anomaly is detected. Running the IDS all the time can be resource consuming, so game theory and the Nash equilibrium are used to determine when to activate the IDS to detect an anomaly, add a new rule to the signature patterns and build the model. Machine Learning and Deep Learning methods are discussed in [16]. The various types of attacks at different levels of the IoT infrastructure are explained, along with possible Machine Learning solutions to these attacks. Addressing the lack of suitable security data, the low quality of the available data, and the performance of the learning algorithms could be key to providing and improving the security and privacy of IoT devices. In this paper, we calculate the accuracy and time cost of the models and compare them to find the model that gives the highest accuracy in the least time for detecting and preventing malware attacks on resource-constrained IoT devices.

III. METHODOLOGY
A. Proposed Model
This study proposes an anomaly detection system model for IoT security. Fig. 1 is the diagram of the proposed anomaly detection system model. In our proposed model, a traffic capture unit captures the traffic flow from the sensors to the central unit. The captured traffic flow is sent to a compute unit, which can be a cloud or a local computer. The compute unit then runs multiple Machine Learning (ML) and Deep Learning (DL) models in order to obtain the performance and cost of each individual model. The compute unit also stores the traffic flow in its database for future studies or model re-calibration. After the performance and cost of the ML/DL models have been obtained, the user or system selects the model that will be used for anomaly detection. When it detects anomalies, the compute unit sends messages or commands back to the central unit, such as dropping packets, running a malware scan, requesting a physical inspection, marking an IP address, or alerting the user. With our proposed model, users can choose the ML/DL model based on performance and cost, such as accuracy and time cost. Since every user has a different situation and a different use for an anomaly detection system for IoT security, it is important to offer the best fit for different users. Moreover, since our proposed model captures traffic flows and stores them in the database, new datasets can be generated and used for future re-calibration of the existing ML/DL models to further improve performance.
Machine Learning algorithms such as Support Vector Machine (SVM), Random Forest, Naive Bayes, Nearest Neighbours, etc., and Deep Learning methods such as Convolutional Neural Networks (CNN) are trained with the data, and computation is then performed to detect anomalies in the system; this can be done on a local machine or in the cloud. The dataset is divided into training and testing data, and conclusions can be drawn from the results obtained with each trained algorithm. If an anomaly is detected, possible actions can be taken based on the result, such as dropping packets, blacklisting the sender's IP address, alerting the user, physical inspection and more. The system can then be scanned to detect any malware present, and physical inspection can be done on the marked devices.
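As an illustration of the selection step described above, the following minimal Python sketch times and scores several candidate models and then picks the most accurate one that fits within a time budget. The classes MajorityClass and NearestCentroid, the functions benchmark and select_model, the toy traffic features and the 60-second budget are all hypothetical stand-ins introduced for this example; they are not the models or data used in this study. Any model exposing fit and predict methods could be substituted.

```python
import time
from collections import Counter


class MajorityClass:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class NearestCentroid:
    """Hypothetical stand-in model: predicts the class with the closest feature mean."""

    def fit(self, X, y):
        counts = Counter(y)
        sums = {}
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for i, value in enumerate(row):
                acc[i] += value
        self.centroids_ = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, X):
        def sq_dist(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda label: sq_dist(row, self.centroids_[label]))
            for row in X
        ]


def benchmark(models, X_train, y_train, X_test, y_test):
    """Train and test each model, recording test accuracy and wall-clock time."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        elapsed = time.perf_counter() - start
        correct = sum(p == t for p, t in zip(predictions, y_test))
        results[name] = {"accuracy": correct / len(y_test), "seconds": elapsed}
    return results


def select_model(results, max_seconds):
    """Pick the most accurate model within the time budget (ties go to the faster one)."""
    eligible = {n: r for n, r in results.items() if r["seconds"] <= max_seconds}
    if not eligible:
        return None
    return max(eligible, key=lambda n: (eligible[n]["accuracy"], -eligible[n]["seconds"]))


# Toy, made-up feature vectors (not taken from IoT-23).
X_train = [[0.1, 1.0], [0.2, 0.9], [0.15, 1.1], [5.0, 9.0], [5.2, 8.8]]
y_train = ["benign", "benign", "benign", "malicious", "malicious"]
X_test = [[0.12, 1.0], [5.1, 9.1], [4.9, 9.2], [0.3, 0.8]]
y_test = ["benign", "malicious", "malicious", "benign"]

models = {"majority": MajorityClass(), "nearest_centroid": NearestCentroid()}
results = benchmark(models, X_train, y_train, X_test, y_test)
chosen = select_model(results, max_seconds=60.0)
print(results)
print("selected:", chosen)
```

A user who prioritizes accuracy could raise the time budget, while a cost-sensitive user could lower it; the selection rule itself stays the same.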
Fig. 1. The proposed anomaly detection system model

The results obtained are compared with each other in order to identify the most efficient method for real-time data. The factors taken into consideration are the "accuracy" and the "time cost" of the algorithm. For example, even if a model gives 100 percent accuracy, it is not suitable for an IoT network if it takes a lot of time, because the devices are resource constrained. Therefore, our proposed model aims to offer an optimal solution for different types of users, such as a big company with plenty of resources that aims for the highest accuracy, or a small company that is concerned about cost efficiency.

TABLE I
VARIABLES AND DEFINITION FOR ZEEK FILES
Variable         Definition
ts               The time of the first packet
uid              A unique identifier of the connection
id               The connection's 4-tuple of endpoint addresses/ports
proto            The transport layer protocol of the connection
service          An identification of an application protocol
duration         How long the connection lasted
orig_bytes       The number of payload bytes the originator sent
resp_bytes       The number of payload bytes the responder sent
conn_state       Possible connection state values
local_orig       If the connection is originated locally, this will be T
local_resp       If the connection is responded locally, this will be T
missed_bytes     Indicates the number of bytes missed in content gaps
history          Records the state history of connections as a string
orig_pkts        Number of packets that the originator sent
orig_ip_bytes    Number of IP level bytes that the originator sent
resp_pkts        Number of packets that the responder sent
resp_ip_bytes    Number of IP level bytes that the responder sent
tunnel_parents   uid values for any encapsulating parent connections
orig_l2_addr     Link-layer address of the originator

B. Dataset
The dataset in this study was obtained from [20]. The IoT-23 dataset is a very recent one, published in January 2020, consisting of network traffic from 3 different smart home IoT devices. The devices used were Amazon Echo, Philips HUE and Somfy Door Lock. It is a large dataset of real and labeled IoT malware infections and benign traffic, made especially for developing Machine Learning algorithms. It consists of 23 captures (also called scenarios), of which 20 are malicious captures and 3 are benign captures. Captures from infected devices carry the probable name of the malware sample executed in each scenario.
The malware labels for the IoT-23 dataset are: Attack, C&C, C&C-FileDownload, C&C-HeartBeat, C&C-HeartBeat-Attack, C&C-HeartBeat-FileDownload, C&C-Mirai, C&C-Torii, DDoS, FileDownload, Okiru, Okiru-Attack, PartOfAHorizontalPortScan.
In addition, Zeek is software that performs network analysis. The IoT-23 dataset we used is in the conn.log.labeled format, which is the Zeek conn.log file generated by the Zeek network analyser from the original pcap file. The variable types and definitions for the IoT-23 dataset are shown in Table I.
Since the dataset is huge, we decided to take part of the records from each individual dataset and then combine them into a new dataset. By doing this, our computer can handle the workload for the new dataset, and the new dataset retains most of the attack types of the IoT-23 dataset.

C. Data Preprocessing
First, we used the Python library Pandas to load all 23 datasets of the IoT-23 dataset separately into data frames, skipping the first 10 rows and reading the following one hundred thousand rows. Then we combined all 23 data frames into a new data frame. Next, we dropped the variables that have no impact on the results. These variables are: ts, uid, id.orig_h, id.orig_p, id.resp_h, id.resp_p, service, local_orig, local_resp, history. Furthermore, we gave dummy values to the proto and conn_state variables and replaced all the missing
values with 0. Finally, the combined dataset is generated and saved as the iot23_combined.csv file.
The iot23_combined.csv file contains a total of 1,444,674 records. Moreover, as shown in Table II, the combined file has 10 types of attack, including PartOfAHorizontalPortScan, Okiru, DDoS, Attack, C&C-HeartBeat, C&C-FileDownload, C&C-Torii, FileDownload, C&C-HeartBeat-FileDownload, and C&C-Mirai.

TABLE II
COUNTS OF ATTACK TYPES FOR FILE IOT23_COMBINED.CSV
Label                         Count
PartOfAHorizontalPortScan     825939
Okiru                         262690
Benign                        197809
DDoS                          138777
Attack                        3915
C&C-HeartBeat                 349
C&C-FileDownload              43
C&C-Torii                     30
FileDownload                  13
C&C-HeartBeat-FileDownload    8
C&C-Mirai                     1

For validation, we split the combined dataset into a training dataset with a size of 0.8 and a testing dataset with a size of 0.2.

IV. PERFORMANCE EVALUATION AND ANALYSIS
The results of the algorithms are discussed in this section. It includes the confusion matrix of each algorithm along with the time it has taken to calculate the anomaly.

A. Hardware and Environment Settings
The experiments were run on a personal computer with an Intel Core 7700k CPU @ 4.50 GHz, 24 GB of RAM @ 3200 MHz, and an MSI GeForce RTX 2080. In addition, the experiments were performed on Windows 10, Anaconda Jupyter Notebook, Python 3.8 and Tensorflow 2.4 environments.

B. Evaluation of Metrics
To evaluate the results of the model, the metrics described below are used.
1) Time: The amount of time taken to run a particular ML/DL model is taken into consideration. As mentioned earlier, an algorithm that takes a large amount of time may not be suitable for an IoT environment.
2) True Positives: The outcome where the model correctly predicts the positive class.
3) False Positives: The outcome where the model incorrectly predicts the positive class.
4) Precision: Precision is the fraction of the positives predicted by a model that are correctly identified, and is given by:
$$\text{Precision} = \frac{\text{TruePositives}}{\text{TruePositives} + \text{FalsePositives}}$$
5) Recall: Recall is the fraction of the actual positives that are correctly identified, and is given by:
$$\text{Recall} = \frac{\text{TruePositives}}{\text{TruePositives} + \text{FalseNegatives}}$$
6) F1 score: Taking into account both false positives and false negatives, the F1 score is the harmonic mean of precision and recall and is considered to be a better measure. It is given by:
$$F1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}$$
7) Support score: The support score is a metric reported by the Python library scikit-learn, which indicates the number of occurrences of each label in the true labels.

C. Test Results for ML and DL Methods
1) Naive Bayes: This supervised learning algorithm is based on Bayes' theorem and is generally used for classification problems, where it predicts based on probability. It is known to be a simple and effective algorithm for building ML models.
As shown in Table III, the overall accuracy for the Naive Bayes algorithm is only 30 percent and the time taken to execute is 6 seconds. Naive Bayes obtained the lowest accuracy in our results.

TABLE III
NAIVE BAYES RESULTS
metrics        precision   recall   f1score   support
accuracy       -           -        0.30      288935
macro avg      0.45        0.50     0.28      288935
weighted avg   0.85        0.30     0.21      288935
time cost      6 seconds

2) Support Vector Machine: The support vector machine (SVM) algorithm tries to find the hyperplane, whose dimension depends on the number of features, that classifies the data points. The hyperplane is a decision boundary between the data points, which are classified according to the side of the hyperplane they fall on. Using the extreme data points as support vectors, the margin of the classifier can be maximised, leading to better classification.
As shown in Table IV, which also reports the averaged precision, recall and F1 score, the overall accuracy for SVM is only 69 percent. The time taken to execute is roughly 1.6 hours (5849 seconds). The SVM obtained an accuracy similar to Decision Trees and CNN, but it has the highest time cost of all the results.

TABLE IV
SVM RESULTS
metrics        precision   recall   f1score   support
accuracy       -           -        0.69      80000
macro avg      0.33        0.26     0.25      80000
weighted avg   0.60        0.69     0.57      80000
time cost      5849 seconds

TABLE V
DECISION TREES RESULTS
metrics        precision   recall   f1score   support
accuracy       -           -        0.73      722337
macro avg      0.63        0.50     0.50      722337
weighted avg   0.77        0.73     0.65      722337
time cost      3 seconds
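Tables III to V report macro and weighted averages alongside the overall accuracy. As a worked illustration of how these averages are formed from the per-class precision, recall and F1 definitions in Section IV-B, the following minimal Python sketch computes them by hand. The function names per_class_metrics and averages, the toy labels, and the convention of scoring an undefined ratio as zero are assumptions made for this example; it is not the code used to produce the tables.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return precision, recall, F1 and support for every label."""
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    metrics = {}
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        # Assumption: an undefined ratio (zero denominator) is scored as 0.
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": support[label],
        }
    return metrics


def averages(metrics):
    """Macro average weights classes equally; weighted average weights by support."""
    keys = ("precision", "recall", "f1")
    total = sum(m["support"] for m in metrics.values())
    macro = {k: sum(m[k] for m in metrics.values()) / len(metrics) for k in keys}
    weighted = {
        k: sum(m[k] * m["support"] for m in metrics.values()) / total for k in keys
    }
    return macro, weighted


# Toy, made-up labels: an imbalanced test set and a model that always predicts "scan".
y_true = ["scan"] * 8 + ["benign"] * 2
y_pred = ["scan"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
macro, weighted = averages(per_class_metrics(y_true, y_pred))
print("accuracy:", accuracy)
print("macro avg:", macro)
print("weighted avg:", weighted)
```

In single-label classification the weighted-average recall equals the overall accuracy, which is consistent with the matching accuracy and weighted recall values in Tables III to V. The macro average gives every class the same weight, so rare classes that are never predicted correctly pull it down even when accuracy looks acceptable.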
3) Decision Trees: A Decision Tree is a supervised Machine Learning classifier, generally used for classification problems, consisting of nodes and leaves connected by branches. The nodes represent features of the dataset, the leaf nodes represent the outcomes, and the branches are the decision rules of the classification.
As shown in Table V, the overall accuracy reached 73 percent while the time cost is only around 3 seconds. Decision Trees achieved the highest accuracy and the lowest time cost in our results, which makes Decision Trees the best method in our study.
4) Convolutional Neural Networks: A convolutional neural network (CNN) is a deep learning model that requires minimal pre-processing, with an architecture that mimics the pattern of neurons in the human brain. It has many different layers: convolutional layers, pooling layers, fully connected layers and normalisation layers. The convolutional layer has several attributes, called hyperparameters, such as the number of input and output channels, the padding size, the kernel width and height, etc. Pooling layers reduce the dimension of the data by combining the output of a cluster of neurons in the previous layer into a single neuron in the next layer. In a convolutional layer, each neuron receives input from only a specific subset of neurons in the previous layer, whereas in a fully connected layer every neuron receives input from all the neurons in the previous layer.
Our proposed CNN model, as shown in Table VI, has 1 input layer, 4 dense layers, 4 dropout layers with a 0.2 dropout rate, and 1 output layer. The activation function for the dense layers is ReLU, a piecewise linear function that outputs the input directly if it is positive and outputs zero otherwise. The activation function for the output layer is Softmax, a generalization of the logistic function that normalizes the output into a probability distribution. The optimizer for the CNN model is Adam, which is a gradient-based optimization algorithm. There are a total of 4,634,662 parameters in our proposed CNN model, and all the parameters are trainable.

TABLE VI
CNN MODEL SUMMARY
Layer (type)          Output Shape    Number of Parameters
Input (Dense)         (None, 2000)    50000
dense_1 (Dense)       (None, 1500)    3001500
dropout_1 (Dropout)   (None, 1500)    0
dense_2 (Dense)       (None, 800)     1200800
dropout_2 (Dropout)   (None, 800)     0
dense_3 (Dense)       (None, 400)     320400
dropout_3 (Dropout)   (None, 400)     0
dense_4 (Dense)       (None, 150)     60150
dropout_4 (Dropout)   (None, 150)     0
Output (Dense)        (None, 12)      1812
Total parameters: 4,634,662
Trainable parameters: 4,634,662
Non-trainable parameters: 0

As shown in Table VII, the testing accuracy for the CNN model is 69 percent and the execution time is around 4 minutes. Although CNN has lower accuracy and a higher time cost than Decision Trees, CNN may perform better when dealing with a more complex dataset.

TABLE VII
CNN RESULTS
training accuracy   training loss   testing accuracy   testing loss
0.6937              0.8583          0.6935             0.8602
time cost           242 seconds

D. Results Comparison
The experiment results are shown in Table VIII. The results obtained for each algorithm are compared with each other on the basis of accuracy and the time each algorithm takes to execute. Naive Bayes results in an accuracy of 0.30, while the other ML/DL methods result in accuracies around 0.70. SVM results in an accuracy of 0.69, which is about the same as the CNN model and about 4 percentage points lower than Decision Trees. However, the time cost for SVM is 5849 seconds (roughly 1.6 hours), which is about 1,950 times slower than Decision Trees and 24 times slower than the CNN model. CNN results in an accuracy of 0.6935, which is lower than Decision Trees and marginally higher than SVM. The time cost for CNN is about 4 minutes, which is 80 times slower than Decision Trees. Decision Trees result in an accuracy of 0.73 and cost about 3 seconds, which are the best accuracy and the lowest time cost among all the ML/DL algorithms tested in this study.

TABLE VIII
EXPERIMENT RESULTS
Method          Testing Accuracy   Time Cost
Naive Bayes     0.30               6 seconds
SVM             0.69               5849 seconds
Decision Tree   0.73               3 seconds
CNN             0.6935             242 seconds

In addition, paper [19] also tested multiple Machine Learning algorithms on the IoT-23 dataset. Unlike our study, paper [19] implemented Random Forest (RF), Naive Bayes, Support Vector Machine, Artificial Neural Network (ANN) and AdaBoost. As shown in Table IX, the results of paper [19] show that the Naive Bayes algorithm has 23 percent accuracy and the SVM algorithm has 67 percent accuracy. Although our results show higher accuracy for the Naive Bayes and SVM algorithms, both sets of results show that Naive Bayes has the lowest accuracy among all algorithms. However, the combined dataset in [19] is much larger than our combined dataset, which might affect the results. In short, the comparison shows that our results are consistent with those reported in paper [19].

TABLE IX
RESULTS COMPARISON WITH PAPER [19]
Method                     Testing Accuracy
Naive Bayes (ours)         0.30
Naive Bayes (paper [19])   0.23
SVM (ours)                 0.69
SVM (paper [19])           0.67

V. CONCLUSION AND FUTURE WORK
In this paper, we have presented an anomaly detection system for IoT security with a performance comparison
of different learning algorithms and methods. Based on our results, Naive Bayes has the worst performance of all the learning algorithms and methods, and Decision Trees show the highest accuracy with the least time cost among all the ML/DL methods. In the future, more datasets from different environments should be tested with the ML/DL methods used in this study. This can help to further clarify the performance, time cost and comparison between the methods.

REFERENCES
[1] S. Chen, H. Xu, D. Liu, B. Hu and H. Wang, "A Vision of IoT: Applications, Challenges, and Opportunities With China Perspective," in IEEE Internet of Things Journal, vol. 1, no. 4, pp. 349-359, Aug. 2014, doi: 10.1109/JIOT.2014.2337336.
[2] G. Shen and B. Liu, "The visions, technologies, applications and security issues of Internet of Things," 2011 International Conference on E-Business and E-Government (ICEE), Shanghai, China, 2011, pp. 1-4, doi: 10.1109/ICEBEG.2011.5881892.
[3] Huang, Y., Benford, S., Price, D., Patel, R., Li, B., Ivanov, A., and Blake, H. (2020). Using Internet of Things to Reduce Office Workers' Sedentary Behavior: Intervention Development Applying the Behavior Change Wheel and Human-Centered Design Approach. JMIR mHealth and uHealth, 8(7), e17914. https://doi.org/10.2196/17914
[4] Almusaylim, Z., and Zaman, N. (2018). A review on smart home present state and challenges: linked to context-awareness internet of things (IoT). Wireless Networks, 25(6), 3193-3204. https://doi.org/10.1007/s11276-018-1712-5
[5] Singh, K., and Singh, N. (2020). An ensemble hyper-tuned model for IoT sensors attacks and anomaly detection. Journal of Information and Optimization Sciences, 1-25. https://doi.org/10.1080/02522667.2020.1799515
[6] Kumar, S., Vealey, T., and Srivastava, H. (2016). Security in Internet of Things: Challenges, Solutions and Future Directions. 5772-5781. https://doi.org/10.1109/HICSS.2016.714
[7] Tahsien, S., Karimipour, H., and Spachos, P. (2020). Machine learning based solutions for security of Internet of Things (IoT): A survey. Journal of Network and Computer Applications, 161, 102630. https://doi.org/10.1016/j.jnca.2020.102630
[8] Tahsien, S., Karimipour, H., and Spachos, P. (2020). Machine learning based solutions for security of Internet of Things (IoT): A survey. Journal of Network and Computer Applications, 161, 102630. https://doi.org/10.1016/j.jnca.2020.102630
[9] A. Mosenia and N. K. Jha, "A Comprehensive Study of Security of Internet-of-Things," in IEEE Transactions on Emerging Topics in Computing, vol. 5, no. 4, pp. 586-602, 1 Oct.-Dec. 2017, doi: 10.1109/TETC.2016.2606384.
[10] J. Deogirikar and A. Vidhate, "Security attacks in IoT: A survey," 2017 International Conference on I-SMAC (IoT in Social, Mobile, Analytics and Cloud) (I-SMAC), Palladam, 2017, pp. 32-37, doi: 10.1109/I-SMAC.2017.8058363.
[11] M. Nawir, A. Amir, N. Yaakob and O. B. Lynn, "Internet of Things (IoT): Taxonomy of security attacks," 2016 3rd International Conference on Electronic Design (ICED), Phuket, 2016, pp. 321-326, doi: 10.1109/ICED.2016.7804660.
[12] T. Song, R. Li, B. Mei, J. Yu, X. Xing and X. Cheng, "A Privacy Preserving Communication Protocol for IoT Applications in Smart Homes," in IEEE Internet of Things Journal, vol. 4, no. 6, pp. 1844-1852, Dec. 2017, doi: 10.1109/JIOT.2017.2707489.
[13] M. N. Aman, B. Sikdar, K. C. Chua and A. Ali, "Low Power Data Integrity in IoT Systems," in IEEE Internet of Things Journal, vol. 5, no. 4, pp. 3102-3113, Aug. 2018, doi: 10.1109/JIOT.2018.2833206.
[14] Doshi, R., Apthorpe, N., and Feamster, N. (2018, April 11). Machine Learning DDoS Detection for Consumer Internet of Things Devices. https://doi.org/10.1109/SPW.2018.00013
[15] V. Hassija, V. Chamola, V. Saxena, D. Jain, P. Goyal and B. Sikdar, "A Survey on IoT Security: Application Areas, Security Threats, and Solution Architectures," in IEEE Access, vol. 7, pp. 82721-82743, 2019, doi: 10.1109/ACCESS.2019.2924045.
[16] M. A. Al-Garadi, A. Mohamed, A. K. Al-Ali, X. Du, I. Ali and M. Guizani, "A Survey of Machine and Deep Learning Methods for Internet of Things (IoT) Security," in IEEE Communications Surveys and Tutorials, vol. 22, no. 3, pp. 1646-1685, thirdquarter 2020, doi: 10.1109/COMST.2020.2988293.
[17] L. Xiao, X. Wan, X. Lu, Y. Zhang and D. Wu, "IoT Security Techniques Based on Machine Learning: How Do IoT Devices Use AI to Enhance Security?," in IEEE Signal Processing Magazine, vol. 35, no. 5, pp. 41-49, Sept. 2018, doi: 10.1109/MSP.2018.2825478.
[18] F. Hussain, R. Hussain, S. A. Hassan and E. Hossain, "Machine Learning in IoT Security: Current Solutions and Future Challenges," in IEEE Communications Surveys and Tutorials, vol. 22, no. 3, pp. 1686-1721, thirdquarter 2020, doi: 10.1109/COMST.2020.2986444.
[19] N. A. Stoian, "Machine Learning for anomaly detection in IoT networks: Malware analysis on the IoT-23 data set," http://purl.utwente.nl/essays/81979
[20] IoT-23 Dataset, https://www.stratosphereips.org/datasets-iot23
[21] L. Lyu, J. Jin, S. Rajasegarar, X. He and M. Palaniswami, "Fog-Empowered Anomaly Detection in IoT Using Hyperellipsoidal Clustering," in IEEE Internet of Things Journal, vol. 4, no. 5, pp. 1174-1184, Oct. 2017, doi: 10.1109/JIOT.2017.2709942.
[22] H. Sedjelmaci, S. M. Senouci and M. Al-Bahri, "A lightweight anomaly detection technique for low-resource IoT devices: A game-theoretic methodology," 2016 IEEE International Conference on Communications (ICC), Kuala Lumpur, Malaysia, 2016, pp. 1-6, doi: 10.1109/ICC.2016.7510811.
