Comparative Analysis of Machine Learning Algorithms on the Bot-IoT Dataset

Jay Soni (20CS43)
Computer Science & Engineering Department
Engineering College, Ajmer

ABSTRACT

This paper presents a comparative analysis of five machine learning algorithms—Decision Tree,
K-Nearest Neighbors (KNN), Naive Bayes, Support Vector Machine (SVM), and Neural
Network—applied to the Bot-IoT dataset. The dataset, collected from IoT devices, contains
various features related to network traffic and communication patterns. The aim of this study is
to assess the accuracy of these algorithms in distinguishing between normal and botnet traffic.
Experimental results demonstrate the varying performance of the algorithms and provide
insights into their suitability for intrusion detection in IoT environments.

1. Introduction
In today's interconnected world, the Internet of Things (IoT) has become an integral part
of our lives, enabling communication and data exchange among various devices and
systems. However, this increasing connectivity also brings about new security
challenges, especially in the context of cyber threats and attacks. One of the significant
concerns is the emergence of botnets, which are networks of compromised devices
controlled by malicious actors for various illicit activities, such as distributed
denial-of-service (DDoS) attacks and data exfiltration. The ability to detect and mitigate
such threats is crucial for ensuring the security and reliability of IoT networks.
Intrusion detection plays a pivotal role in identifying and thwarting cyber threats within
IoT environments. Machine learning algorithms have gained prominence as effective
tools for intrusion detection due to their capability to analyze large volumes of data and
uncover patterns indicative of malicious activities. In this context, the "bot-iot dataset"
serves as a valuable resource for evaluating the performance of machine learning
algorithms in distinguishing normal network traffic from botnet-related activities.
This study focuses on the comparative analysis of five prominent machine learning
algorithms: Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes, Support Vector
Machine (SVM), and Neural Network. Each of these algorithms approaches the task of
intrusion detection from a unique perspective, leveraging their respective strengths in
feature representation, pattern recognition, and decision-making. By evaluating the
accuracy of these algorithms on the "bot-iot dataset," we aim to gain insights into their
suitability for identifying botnet-related activities within IoT environments.

Algorithms Under Consideration:

● Decision Tree: Decision trees employ a tree-like structure to make sequential
decisions based on input features. These models are interpretable and capable
of handling both categorical and numerical data, making them well-suited for
identifying distinct decision boundaries.
● K-Nearest Neighbors (KNN): KNN operates on the principle that similar
instances tend to have similar labels. By comparing an instance to its neighboring
instances in the feature space, KNN assigns labels based on the majority class
among its k-nearest neighbors.
● Naive Bayes: Naive Bayes relies on Bayes' theorem to compute the probability
of a certain label given the observed features. Despite its simplifying assumption
of feature independence, Naive Bayes has shown effectiveness in various
classification tasks.
● Support Vector Machine (SVM): SVM seeks a hyperplane that maximizes the
margin between classes while classifying the training instances correctly.
Combined with kernel functions, it can also separate classes that are not
linearly separable in the original feature space.
● Neural Network (NN): Neural networks, inspired by the human brain, consist of
interconnected nodes that mimic neurons. Deep learning architectures, a subset
of neural networks, have demonstrated exceptional performance in complex
pattern recognition tasks.
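
To make one of the descriptions above concrete, the following is a minimal sketch of the KNN majority-vote rule in plain Python. The function name `knn_predict`, the Euclidean distance, and the toy feature values are assumptions made for illustration; they are not taken from the paper or from the Bot-IoT dataset.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count how often each label occurs
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy two-feature flows (hypothetical values, not Bot-IoT records)
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
labels = ["normal", "normal", "normal", "botnet", "botnet", "botnet"]

prediction = knn_predict(points, labels, query=(0.8, 0.85), k=3)
```

Here the three nearest neighbours of the query all carry the label "botnet", so that label wins the vote. Because the vote depends on raw distances, features on very different scales would dominate the result unless they are scaled first.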
The subsequent sections of this paper will delve into the methodologies used,
experimental results obtained, and an in-depth discussion of the performance of these
algorithms on the "bot-iot dataset." The insights gained from this analysis will aid in
making informed decisions regarding the selection of an appropriate machine learning
algorithm for intrusion detection within IoT environments.

2. Related Work
In recent years, the increasing prevalence of IoT devices and the corresponding surge in
cyber threats have spurred extensive research in the domain of intrusion detection for
IoT environments. Detecting malicious activities in these interconnected networks is
challenging due to the diverse range of devices, communication protocols, and data
characteristics. Various studies have explored the application of machine learning
algorithms for identifying anomalies and attacks within IoT ecosystems.

Khan et al. [1] presented a comprehensive survey of intrusion detection techniques for
IoT networks. The study highlighted the significance of machine learning approaches
and discussed the challenges posed by the dynamic nature of IoT data. Similarly,
Antoniades et al. [2] conducted an empirical evaluation of machine learning algorithms
for intrusion detection in IoT environments. Their findings underscored the importance of
algorithm selection based on the characteristics of the dataset.

The subsequent sections will detail the methodology employed for the comparative
analysis of the algorithms on the "bot-iot dataset."

3. Methodology
The methodology section outlines the steps and techniques employed in conducting the
comparative analysis of machine learning algorithms on the "bot-iot dataset."

Dataset Preprocessing: Before commencing the analysis, the "bot-iot dataset"
underwent preprocessing to ensure its suitability for machine learning. This involved
handling missing values, encoding categorical features, and scaling numerical features.
Any redundant or irrelevant attributes were removed to enhance the quality of the data.
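
The preprocessing steps described above might look roughly like the sketch below, written with pandas. The paper does not state how missing values were filled or how categories were encoded, so median imputation, the "unknown" placeholder category, and one-hot encoding are assumptions, and the column names (`proto`, `bytes`, `pkts`, `record_id`, `attack`) are hypothetical. Scaling is left out of this sketch because it is usually fitted on the training split only.

```python
import pandas as pd

# Hypothetical miniature traffic table; column names are illustrative only
raw = pd.DataFrame({
    "proto": ["tcp", "udp", "tcp", None],
    "bytes": [1200.0, None, 300.0, 900.0],
    "pkts": [10, 4, 2, 8],
    "record_id": [1, 2, 3, 4],   # identifier with no predictive value
    "attack": [1, 1, 0, 1],      # 1 = botnet, 0 = normal
})


def preprocess(df, label_col="attack", drop_cols=("record_id",)):
    """Drop irrelevant columns, fill missing values, and encode categoricals."""
    df = df.drop(columns=list(drop_cols))
    features = df.drop(columns=[label_col])
    labels = df[label_col]

    numeric_cols = features.select_dtypes(include="number").columns
    categorical_cols = features.columns.difference(numeric_cols)

    # Missing values: column median for numbers, a placeholder for categories
    features[numeric_cols] = features[numeric_cols].fillna(
        features[numeric_cols].median()
    )
    features[categorical_cols] = features[categorical_cols].fillna("unknown")

    # One-hot encode the categorical columns
    features = pd.get_dummies(features, columns=list(categorical_cols), dtype=float)
    return features, labels


features, labels = preprocess(raw)
```

The result is an all-numeric feature table with no missing entries, which is the form the five classifiers expect.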

Algorithm Selection: Five machine learning algorithms were chosen for this
comparative analysis: Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes, Support
Vector Machine (SVM), and Neural Network. Each algorithm possesses unique
characteristics and capabilities that make it well-suited for the task of intrusion detection
within IoT environments.

Model Training and Evaluation: The dataset was divided into training and testing
subsets using a standard 80-20 split. The training subset was used to train each of the
selected algorithms, while the testing subset was reserved for evaluating their
performance. To ensure fair comparison, the same training and testing data were used
for all algorithms.
For each algorithm, the training data underwent feature scaling to ensure uniformity in
feature ranges. The scaled features were then used to train the respective algorithm
models. Post-training, the models were evaluated on the testing data to calculate
accuracy scores.
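
As a rough illustration of this procedure, the sketch below uses scikit-learn on synthetic data. The paper does not name its library, scaler, or hyperparameters, so `StandardScaler`, the default-style settings, and the `make_classification` stand-in data are all assumptions rather than the study's actual configuration.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the preprocessed dataset (assumption: binary labels)
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# 80-20 split; the same split is reused for every algorithm
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Fit the scaler on the training subset only, then apply it to both subsets
scaler = StandardScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Hyperparameters below are illustrative choices, not the paper's settings
models = {
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Naive Bayes": GaussianNB(),
    "SVM": SVC(kernel="rbf"),
    "Neural Network": MLPClassifier(
        hidden_layer_sizes=(32,), max_iter=500, random_state=0
    ),
}

accuracy = {}
for name, model in models.items():
    model.fit(X_train_scaled, y_train)
    accuracy[name] = accuracy_score(y_test, model.predict(X_test_scaled))
```

Fitting the scaler on the training subset alone keeps information about the test data out of training, which matters for a fair comparison.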

Evaluation Metrics: The performance of the algorithms was assessed using:

● Accuracy
● Precision
● Recall
● F1-score
These metrics offer a comprehensive view of the algorithms' capabilities in correctly
classifying instances of normal and botnet traffic.
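
For reference, the four metrics can be derived from the counts of true and false positives and negatives. The sketch below is a plain-Python illustration that treats botnet traffic as the positive class (label 1); that labelling and the example predictions are assumptions, not values from the study.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or absent
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels: 1 = botnet, 0 = normal
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

In this example there are 4 true positives, 1 false positive, 1 false negative, and 4 true negatives, so all four metrics come out to 0.8.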

4. Experimental Results

[Figure: Boxplot for Accuracy, Precision, Recall, and F1-Score]

5. Discussion and Analysis
The discussion and analysis section delves into the performance of each machine
learning algorithm on the "bot-iot dataset," offering insights into their strengths and
limitations.

Among the tested algorithms, the Decision Tree demonstrated the highest accuracy,
precision, recall, and F1-score, all surpassing 0.9. Decision Trees excel at partitioning
feature space to distinguish classes, making them effective for intrusion detection.

Practical deployment requires considering factors beyond accuracy, including algorithm
complexity, interpretability, and generalization ability. Data imbalance and feature
relevance impact algorithm performance.
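
A small worked example shows why data imbalance matters when reading accuracy figures. The class counts below are hypothetical and chosen only to illustrate the effect; they are not the class distribution reported in this study.

```python
# Hypothetical test set: 990 botnet flows and 10 normal flows
y_true = ["botnet"] * 990 + ["normal"] * 10

# A trivial "classifier" that labels every flow as botnet
y_pred = ["botnet"] * len(y_true)

# Overall accuracy looks excellent
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Recall on the minority class: fraction of normal flows correctly identified
normal_total = sum(t == "normal" for t in y_true)
normal_found = sum(t == "normal" and p == "normal" for t, p in zip(y_true, y_pred))
normal_recall = normal_found / normal_total
```

The trivial classifier reaches 0.99 accuracy while recognising none of the normal flows, which is why precision, recall, and F1-score are worth reporting alongside accuracy.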

Future research can explore ensemble methods and hyperparameter tuning to enhance
algorithm performance.

In conclusion, the Decision Tree's superior performance highlights its potential for robust
intrusion detection, but algorithm selection should align with deployment needs.

6. Conclusion
In this study, we conducted a comparative analysis of five prominent machine learning
algorithms—Decision Tree, K-Nearest Neighbors (KNN), Naive Bayes, Support Vector
Machine (SVM), and Neural Network—for intrusion detection within IoT environments
using the "bot-iot dataset." Each algorithm's performance was evaluated based on
accuracy, precision, recall, and F1-score.

Among the algorithms, the Decision Tree emerged as the top performer, achieving the
highest scores across all metrics. Its ability to create decision rules based on features
allowed it to effectively distinguish between normal and botnet traffic. Naive Bayes and
Support Vector Machine achieved competitive accuracy scores, indicating their
effectiveness in IoT intrusion detection. However, consideration of algorithm complexity,
interpretability, and generalization ability is crucial for practical deployment.

The study's findings underscore the applicability of machine learning algorithms in
identifying anomalous activities within IoT networks. The Decision Tree's outstanding
performance highlights its potential as a robust intrusion detection tool. Nonetheless, the
choice of algorithm should align with specific deployment requirements and
considerations.

This analysis provides valuable insights for selecting suitable machine learning
algorithms for IoT intrusion detection, contributing to enhanced network security and
resilience in the face of evolving cyber threats.

As IoT ecosystems continue to grow, future research can explore ensemble techniques,
hyperparameter optimization, and real-time adaptation to further advance intrusion
detection capabilities.

By understanding the strengths and limitations of each algorithm, practitioners can make
informed decisions to protect IoT networks and ensure the integrity of interconnected
devices.

With this conclusion, the study encapsulates its outcomes and implications, paving the
way for informed decision-making in IoT intrusion detection strategies.

7. References
[1] Khan, R., & Khan, S. U. (2017). A comprehensive survey of recent trends in IoT
security. IEEE Communications Surveys & Tutorials, 20(3), 2887-2925.
[2] Antoniades, D., & Loukas, G. (2019). Adversarial machine learning in the Internet
of Things: A systematic survey. IEEE Access, 7, 107652-107674.
[3] Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Benjamin Turnbull.
"Towards the development of realistic botnet dataset in the internet of things for
network forensic analytics: Bot-iot dataset." Future Generation Computer
Systems 100 (2019): 779-796.
[4] Koroniotis, Nickolaos, Nour Moustafa, Elena Sitnikova, and Jill Slay. "Towards
developing network forensic mechanism for botnet activities in the iot based on
machine learning techniques." In International Conference on Mobile Networks
and Management, pp. 30-44. Springer, Cham, 2017.
[5] Koroniotis, Nickolaos, Nour Moustafa, and Elena Sitnikova. "A new network
forensic framework based on deep learning for Internet of Things networks: A
particle deep framework." Future Generation Computer Systems 110 (2020):
91-106.
[6] Koroniotis, Nickolaos, and Nour Moustafa. "Enhancing network forensics with
particle swarm and deep learning: The particle deep framework." arXiv preprint
arXiv:2005.00722 (2020).
[7] Koroniotis, Nickolaos, Nour Moustafa, Francesco Schiliro, Praveen Gauravaram,
and Helge Janicke. "A Holistic Review of Cybersecurity and Reliability
Perspectives in Smart Airports." IEEE Access (2020).
[8] Koroniotis, Nickolaos. "Designing an effective network forensic framework for the
investigation of botnets in the Internet of Things." PhD diss., The University of
New South Wales Australia, 2020.
