Mnet 011 2000396
computers connected to the Internet will not be feasible for IoT because of the explosive number of IoT devices, their resource-constrained nature, and their heterogeneity. Thus, Restuccia et al. [2] have advocated for a secure-by-design approach, in which IoT systems should be built as free of vulnerabilities as possible. However, security-by-design is hard to achieve in IoT, as devices are composed of several hardware parts manufactured by different vendors and software developed by different companies.

Therefore, the design of new solutions to protect IoT from cyberattacks has received increased attention in the scientific and industrial communities. In particular, machine learning (ML)-based solutions have emerged for IoT cybersecurity. Traditional cybersecurity solutions are prohibitive for IoT, as they rely on heavy computation [3] and will overload the network with traffic for autonomous changing of default passwords on millions of IoT devices, two-factor device authentication, and the application of security patches and updates to IoT devices (https://tinyurl.com/yblo7yq6). Moreover, they were not designed considering the severe constraints of devices in terms of computation, memory, radio bandwidth, and battery resources, and do not encompass the entire security spectrum on devices, edge computing, and wireless networking. In contrast, ML-based cybersecurity solutions for IoT have gained increased momentum, and several works have been proposed in the literature (see [1–4] and references therein). ML models can be used, for instance, to create traffic profiles, detect threats through traffic exchange that does not fall within the established normal behavior, detect IoT hardware vulnerabilities through observed physical layer characteristics, and authenticate legitimate devices based on their characteristics and behavior. Xiao et al. [3] analyzed learning-based solutions designed for IoT device authentication, access control, malware detection, and secure offloading. Li et al. [5] evaluated the feasibility and suitability of statistical learning models for detecting anomalous behavior of IoT devices by considering system statistics (e.g., CPU usage cycles and disk usage).

In this work, we tackle the challenge of designing ML-based solutions for physical layer IoT security. Related works either addressed one particular security problem (e.g., intrusion detection) that might appear, or focused primarily on the discussion of the ML models while presenting proposed solutions to secure IoT. In contrast, we discuss ML-based solutions for cybersecurity in IoT applications by addressing IoT from two distinct points of view: the device and the network. This approach helps engineers and practitioners starting in the area to better understand the challenges and design principles of ML-based cybersecurity solutions when they are intended to protect IoT devices individually, as well as IoT network infrastructure. More specifically, the contributions of this work include:
• A thorough discussion of the motivation for the design of novel solutions to secure IoT systems, ML-based cybersecurity, and requirements and current daunting challenges.
• A proposed classification to categorize recent works that designed ML-based approaches for IoT cybersecurity in devices and network-based solutions. The proposed classification, by considering two distinct points of view of IoT systems, helps to better identify and understand the challenges, requirements, and up-to-date common principles for the design of security solutions for IoT devices and networks considering physical layer features.
• A thorough discussion of open issues and future research directions toward the design of efficient cybersecurity solutions for IoT.

Fundamentals

Cybersecurity for IoT

Figure 1 illustrates classic attacks that IoT infrastructures can experience. Cybersecurity solutions must be designed to protect IoT data and prevent IoT devices from being compromised. Cisco estimates that data produced by IoT applications will reach nearly 850 ZB by 2021 (https://tinyurl.com/ybez862s). Beyond this impressive volume, it is worth highlighting that IoT data will mostly be sensitive and may reveal private aspects of users and their interactions with the application. For instance, a smart health-care application will produce data regarding users’ health conditions and historical health records. A smart home application will produce data regarding rooms and environmental states and conditions (e.g., temperature, light level, humidity, and noise), as well as users’ interactions with them.

In both examples mentioned above, IoT data leakage can reveal users’ sensitive information and behavior in their most private spaces. An attacker in possession of such data can infer when a user is at home (or if the house is vacant), as
well as the user’s routines and preferences while at home. Therefore, cybersecurity solutions for IoT must deal with eavesdropping attacks efficiently, preventing information leakage, ensuring data will not be globally accessed, and limiting data lifetime to the minimum extent required.

Moreover, IoT devices have been targeted by cyber-attackers aiming to take control of them. In contrast to traditional computing systems, each IoT device performs a well-defined task. However, such a task might be critical; for instance, an IoT medical device can be used for insulin delivery in a health-care system. In this regard, compromised IoT devices can lead to fatal consequences, as they can pump lethal doses of the administered drug in health-care applications (https://tinyurl.com/y8tsb7fu).

Besides, compromised IoT systems can be used to create botnets, which will be exploited to attack and damage other computing infrastructures. Although each device individually lacks computing capabilities, the numbers compensate for this. An infected IoT device can be instructed to download malware and wait for commands to begin an attack. Although individual devices have constrained resources, orchestrated DDoS attacks from IoT are destructive because of the sheer number of devices involved. IoT botnets (e.g., a Mirai botnet) have served as infrastructure for powerful DDoS attacks, such as those in October 2016, which took down hundreds of websites (e.g., Twitter, Netflix, Reddit, and GitHub) for several hours [6]. The critical fact is that traditional cybersecurity approaches might not prove suitable for IoT applications, given the unique characteristics of IoT devices and networks, as summarized in Table 1.

ML-Based Cybersecurity

Machine learning has gained increased attention in the design of cybersecurity solutions for IoT. One of the reasons for such increased attention is the potential for using ML models to protect IoT data and control access to IoT resources. In traditional personal computer-based systems (e.g., client/server computing applications), data is located in a well-defined place and is requested by users through a unique data identifier or the address of the host storing it. In contrast, IoT data might be spread out among devices and processing units; that is, IoT data will not reside in a single place, and its location will not be well-defined.

Therefore, a naive solution for securing IoT data would be to implement protective measures on every single device in an IoT application. However, such a naive approach will be infeasible, given the heterogeneous and resource-constrained nature of IoT devices, and the heavy computation and high communication load of traditional cybersecurity techniques. Moreover, cybersecurity solutions must guarantee that access to IoT resources is controlled. IoT devices might perform vital tasks, such as in health-care applications. Hence, cybersecurity solutions must make sure that access to update a device configuration or working mode is granted only to a legitimate entity. Such access control is needed to prevent, for instance, a malicious user from changing the dosage a device must deliver to a patient in a smart health-care application.

In this regard, ML-based solutions can observe different variables in an IoT system and make decisions to secure it. In an IoT application, each device will have a well-defined task to perform. Moreover, the interaction between users and a set of IoT devices, or a machine-to-machine interaction in a given IoT application, tends to follow a pattern; that is, it is not a random interaction. Machine learning algorithms can therefore be trained to learn such an interaction pattern, as well as the characteristics of networking traffic generated from such interactions. As a result, an ML-based solution will be able to authenticate users, control data access, and identify DDoS attacks, compromised IoT devices, or unauthorized attempts to access IoT data or resources.

Requirements and Fundamental Challenges

Cybersecurity techniques for IoT must be lightweight, resilient, fault-tolerant, and robust. Moreover, they should tackle the heterogeneous capabilities of IoT devices and wireless networking technologies. In addition, cybersecurity techniques should protect IoT data by considering different data sensitivity levels. They should also guarantee that data is accessed only by users and system components that have the right permission to access it. Furthermore, cybersecurity solutions for IoT should detect unusual IoT traffic, block attack attempts, and mitigate damage when a device or component is compromised. Nonetheless, solutions to secure IoT must not incur significant overhead for the system and network, which would diminish the performance of an IoT application.
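
As a rough illustration of the requirement to detect unusual IoT traffic without significant overhead, the sketch below keeps a running mean and variance of a single traffic feature (Welford's algorithm) and flags readings that deviate strongly from the learned profile. It is not taken from any of the surveyed works: the class and function names, the 3-sigma threshold, the warm-up length, and the sample readings are all assumptions chosen for illustration.

```python
import math


class RunningProfile:
    """Running mean/variance of one traffic feature (Welford's algorithm)."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # sum of squared deviations from the running mean

    def update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def std(self):
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

    def is_anomalous(self, x, threshold=3.0, min_samples=10):
        # Assumed rule: after a warm-up of `min_samples` readings, flag values
        # more than `threshold` standard deviations away from the learned mean.
        if self.n < min_samples:
            return False
        s = self.std()
        if s == 0.0:
            return x != self.mean
        return abs(x - self.mean) / s > threshold


def monitor(samples, threshold=3.0):
    """Return the indices of flagged samples; only unflagged samples update the profile."""
    profile = RunningProfile()
    flagged = []
    for i, x in enumerate(samples):
        if profile.is_anomalous(x, threshold):
            flagged.append(i)
        else:
            profile.update(x)
    return flagged


# Made-up packets-per-minute readings for one sensor, with one burst.
readings = [12, 11, 13, 12, 12, 11, 13, 12, 11, 12, 13, 12, 400, 12, 11]
print(monitor(readings))
```

The detector stores three numbers per monitored feature and does constant work per reading, which is the kind of footprint a constrained device or gateway could afford. A real deployment would need richer features and a more robust model than a single z-score.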
of the transmitter based on the output (normalized geometric means of feature values) and PUF properties.

The main disadvantage of the above work is the high demand at the gateway node, which might fail in simultaneously authenticating IoT devices in massive IoT systems. In this regard, Ferdosi and Saad [11] proposed an LSTM-based watermarking algorithm for assisting dynamic massive IoT device authentication. In the proposed solution, the LSTM model is used to extract fingerprints from device signals’ characteristics (spectral flatness, mean, variance, skewness, and kurtosis). The output is a bitstream used to watermark the original signal using a key. At the gateway, a proposed dynamic watermarking LSTM (DW-LSTM) model is used to extract the bitstream and the features of a received watermarked signal. Those outputs are compared, and in the event of dissimilarities between the two sequences, an attack alarm is triggered.

In contrast to the works mentioned above, Vashist et al. [4] addressed jamming attacks aimed at DoS on wireless Network-on-Chip (WiNoC). The authors used a burst error correction code to monitor the rate of burst errors received over the wireless medium, and ML classifiers (ANN, SVM, kNN, and decision tree) to detect the persistent jamming attack. In the considered attack model, an external or internal attacker will interfere with legitimate transmissions, which will cause high burst error rates on multiple consecutive flits of a packet. Hence, ML classifiers were employed to distinguish random burst errors occasioned by power source fluctuations, ground bounce, or crosstalk from burst errors due to jamming attacks. The authors created a simulation-based dataset with different bit error rates (BER) to model normal operation and burst errors from jamming attacks. The number of transmitted and received flits, as well as the number of errors, are used together with the operating mode (i.e., normal or attacked) for training the classifiers.

Despite the advancements, many challenges should be addressed during the design of AI-based cybersecurity solutions to protect devices. First, the solutions must be lightweight, as devices have limited resources in terms of computing, storage, and energy. Second, the solutions will need to deal with the lack of reliable data sets to be used for training and validation. Simulated data were considered to evaluate the proposed solutions in [9, 10], for instance. Third, it might be required to re-train and update the parameters of an ML-based cybersecurity solution. Hence, the data exchange for such tasks should be done in a way that will not congest the network.
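
To make the classifier-based jamming detection described above more concrete, the following sketch trains a k-nearest-neighbor classifier (one of the model families named in the text) on synthetic observation windows labeled normal or attacked. It is not the setup of Vashist et al. [4]: the feature choice (received/transmitted flit ratio and errors per received flit), the loss and error ranges, and all function names are assumptions made for illustration. The two classes are well separated by construction, so the resulting accuracy says nothing about real-world performance.

```python
import math
import random


def make_sample(rng, jammed):
    """Synthetic window: (received/transmitted flit ratio, errors per received flit)."""
    transmitted = 1000
    # Assumed ranges, for illustration only: jamming raises both loss and error rates.
    loss_rate = rng.uniform(0.20, 0.30) if jammed else rng.uniform(0.00, 0.03)
    error_rate = rng.uniform(0.30, 0.40) if jammed else rng.uniform(0.00, 0.05)
    received = round(transmitted * (1 - loss_rate))
    errors = round(received * error_rate)
    return (received / transmitted, errors / received)


def knn_predict(train, query, k=5):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if votes * 2 > k else 0  # 1 = attacked, 0 = normal


rng = random.Random(0)  # fixed seed so the run is repeatable
train = [(make_sample(rng, jammed), int(jammed)) for jammed in [False, True] * 100]
test = [(make_sample(rng, jammed), int(jammed)) for jammed in [False, True] * 25]

accuracy = sum(knn_predict(train, x) == y for x, y in test) / len(test)
print(f"accuracy on synthetic windows: {accuracy:.2f}")
```

Using ratios rather than raw counts keeps the features comparable across windows of different lengths. That is a design choice of this sketch, not something stated in the surveyed work.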