
This article has been accepted for inclusion in a future issue of this magazine.
Content is final as presented, with the exception of pagination.

ACCEPTED FROM OPEN CALL
Design Guidelines for Machine Learning-based Cybersecurity in Internet of Things

Azzedine Boukerche and Rodolfo W. L. Coutinho
Abstract
Cybersecurity is one of the building blocks in need of increasing attention in Internet of things (IoT) applications. IoT has become a popular target for attackers seeking sensitive and personal user data, computing infrastructure for massive attacks, or the disruption of critical applications. Worryingly, the industrial race toward the forefront of IoT software and device development has led to increased market penetration of vulnerable IoT devices and applications. Nevertheless, traditional cybersecurity solutions designed for personal computers often rely on heavy computation and high communication overhead, and are therefore prohibitive for IoT, given the explosive number of IoT devices, their resource-constrained nature, and their heterogeneity. Hence, innovative solutions must be designed for securing IoT applications, while considering the peculiar characteristics of IoT devices and networks. In this article, we discuss the motivations and challenges of using machine learning (ML) models for the design of cybersecurity solutions for IoT. More specifically, we tackle the challenge of designing ML-based solutions and provide guidelines for ML-based physical layer solutions aimed at securing IoT. We propose a device-oriented and network-oriented classification and investigate recent works that designed ML-based solutions, considering IoT physical layer features, to secure IoT applications. The proposed classification helps engineers and practitioners starting in this area to better identify and understand the challenges, requirements, and up-to-date common design principles for securing IoT devices and networks considering physical layer features. Finally, we shed light on some future research directions that need further investigation.

Introduction
In recent years, significant advances have been made on embedded devices, sensing and actuation hardware, wireless networking technologies, edge computing, and data-centric networking, which have contributed to the development and market penetration of Internet of things (IoT). IoT has emerged as a network of seamlessly interconnected devices (e.g., sensors and actuators), which cooperate to attain common objectives [1]. Moreover, IoT has gained increased attention thanks to its potential to change the way people live and work by creating efficient, comfortable, green, and enjoyable environments through smart applications over different domains, such as education, health-care, transportation, manufacturing, and surveillance. IoT has unlocked sensing and actuation-based applications in several domains. A traditional IoT application relies on various heterogeneous devices to sense the environment and act based on the observed conditions or received commands. The IoT devices gather a large amount of multimedia data through heterogeneous sensors, share collected data whenever needed through machine-to-machine (M2M) communication, and offload it to edge or cloud infrastructures.
Current advancements in key technologies are supporting the ever-growing expansion and popularization of IoT. The evolving Long Term Evolution (LTE) and 5G networks are expected to provide IoT applications with massive connectivity, high bandwidth, and ultra-reliable and low-latency communication. Edge computing will expand data processing capabilities closer to IoT, which helps improve energy efficiency and reduce network congestion, since devices will no longer need to offload all collected data to cloud servers. Information-centric networking architectures will improve communication interoperability and data delivery in IoT, by employing data-centric request and response, and in-network content caching, respectively. Nevertheless, cybersecurity is a fundamental building block in need of increased attention in IoT. IoT systems are being targeted with an unprecedented number of cyberattacks. F-Secure reports that attack traffic on IoT devices more than tripled in the first half of 2019, when compared with the previous period, and reached a total of over 2.9 billion events (please refer to Attack Landscape H1 2019, report available on https://tinyurl.com/sxaeq4c). Moreover, malicious users rely on spoofing attacks, intrusions, jamming, eavesdropping, and malware to: leak sensitive IoT data; turn IoT systems into botnets for massive distributed denial-of-service (DDoS) attacks, spam, phishing, and click-fraud; and make critical IoT applications unavailable (e.g., health-care systems, surveillance, smart transportation, smart grids, and industrial applications).
The industrial race toward the forefront of the development of IoT devices has led to increased market penetration of vulnerable devices. For instance, the security researcher Billy Rios showed that the LifecarePCA drug infusion system, as well as five other Hospira automated drug delivery machines, is vulnerable to attacks that can change the drug dosage to be delivered (https://tinyurl.com/t8xyr4h). Nevertheless, traditional cybersecurity solutions designed for protecting personal
Digital Object Identifier: 10.1109/MNET.011.2000396
Azzedine Boukerche is with the University of Ottawa; R. W. L. Coutinho is with Concordia University.
0890-8044/20/$25.00 © 2020 IEEE. IEEE Network • Accepted for Publication

FIGURE 1. Common security threats for IoT applications.
computers connected to the Internet will not be feasible for IoT because of the explosive number of IoT devices, their resource-constrained nature, and their heterogeneity. Thus, Restuccia et al. [2] have advocated for a secure-by-design approach, in which IoT systems shall be built as free of vulnerabilities as possible. However, security-by-design is hard to achieve in IoT, as devices are composed of several hardware parts manufactured by different vendors and run software developed by different companies.
Therefore, the design of new solutions to protect IoT from cyberattacks has received increased attention in the scientific and industrial communities. In particular, machine learning (ML)-based solutions have emerged for IoT cybersecurity. Traditional cybersecurity solutions are prohibitive for IoT, as they rely on heavy computation [3] and will overload the network with traffic for autonomous changing of default passwords on millions of IoT devices, two-factor device authentication, and application of security patches and updates to IoT devices (https://tinyurl.com/yblo7yq6). Moreover, they were not designed considering the devices’ severe constraints in terms of computation, memory, radio bandwidth, and battery resources, and do not encompass the entire security spectrum on devices, edge computing, and wireless networking.
In contrast, ML-based cybersecurity solutions for IoT have gained increased momentum, and several works have been proposed in the literature (see [1–4] and references therein). ML models can be used, for instance, to create traffic profiles, detect threats through traffic exchange that does not fall within the established normal behavior, detect IoT hardware vulnerabilities through observed physical layer characteristics, and authenticate legitimate devices based on their characteristics and behavior. Xiao et al. [3] analyzed learning-based solutions designed for IoT device authentication, access control, malware detection, and secure offloading. Li et al. [5] evaluated the feasibility and suitability of statistical learning models for detecting anomalous behavior of IoT devices by considering system statistics (e.g., CPU usage cycles and disk usage).
In this work, we tackle the challenge of designing ML-based solutions for physical layer IoT security. Related works either addressed one particular security problem (e.g., intrusion detection) that might appear, or focused primarily on the discussion of the ML models while presenting proposed solutions to secure IoT. In contrast, we discuss ML-based solutions for cybersecurity in IoT applications by addressing IoT from two distinct points of view: the device and the network. This process helps engineers and practitioners starting in the area to better understand the challenges and design principles of ML-based cybersecurity solutions when they are intended to protect IoT devices individually, as well as IoT network infrastructure. More specifically, the contributions of this work include:
• A thorough discussion of the motivation for the design of novel solutions to secure IoT systems, ML-based cybersecurity, and requirements and current daunting challenges.
• A proposed classification to categorize recent works that designed ML-based approaches for IoT cybersecurity into device-based and network-based solutions. The proposed classification, by considering two distinct points of view of IoT systems, helps to better identify and understand the challenges, requirements, and up-to-date common principles for the design of security solutions for IoT devices and networks considering physical layer features.
• A thorough discussion of open issues and future research directions toward the design of efficient cybersecurity solutions for IoT.

Fundamentals
Cybersecurity for IoT
Figure 1 illustrates classic attacks that IoT infrastructures can experience. Cybersecurity solutions must be designed to protect IoT data and prevent IoT devices from being compromised. Cisco estimates that data produced by IoT applications will reach nearly 850 ZB by 2021 (https://tinyurl.com/ybez862s). Beyond this impressive number, it is worth highlighting that IoT data will mostly be sensitive and may reveal private aspects of users and their interactions with the application. For instance, a smart health-care application will produce data regarding users’ health conditions and historical health records. A smart home application will produce data regarding rooms and environmental states and conditions (e.g., temperature, light level, humidity, and noise), as well as users’ interactions with them.
In both examples mentioned above, IoT data leakage can reveal users’ sensitive information and behavior in their most private spaces. An attacker in possession of such data can infer when a user is at home (or if the house is vacant), as

IoT characteristics and the corresponding challenges for cybersecurity in IoT applications:

Massive deployment
• Data is distributed among multiple devices.
• Individual protection of devices.
• Network overhead.
Heterogeneity
• Devices with heterogeneous capabilities.
• Need for different solutions to secure different devices.
Dynamic network topologies
• IoT network topology changes frequently due to controllable and uncontrollable factors.
• Topology changes will affect communication pattern of IoT devices.
• Fingerprinting-based cybersecurity solutions should consider communication traffic pattern changes.
Low-power and low-cost communication
• IoT devices have severe energy constraints.
• Networking protocols do not implement robust mechanism for reliable communication.
• Distributed cybersecurity solutions should consider low-reliable communication in IoT applications.
Low latency communication
• IoT applications might have time constraints.
• Complex cybersecurity solutions will incur additional delays.
TABLE 1. IoT characteristics and challenges for cybersecurity.
well as the user’s routines and preferences while at home. Therefore, cybersecurity solutions for IoT must deal with eavesdropping attacks efficiently, preventing information leakage, ensuring data will not be globally accessed, and limiting data lifetime to the minimum extent required.
Moreover, IoT devices have been targeted by cyber-attackers aiming to take control of them. In contrast to traditional computing systems, each IoT device performs a well-defined task. However, such a task might be critical; for instance, an IoT medical device can be used for insulin delivery in a health-care system. In this regard, compromised IoT devices can lead to fatal consequences, as they can pump lethal doses of the administered drug in health-care applications (https://tinyurl.com/y8tsb7fu).
Besides, compromised IoT systems can be used to create botnets, which will be exploited to attack and damage other computing infrastructures. Although each device individually lacks computing capabilities, the numbers compensate for this. An infected IoT device can be instructed to download malware and wait for commands to begin an attack. Despite having constrained resources, it is undeniable that orchestrated DDoS attacks from IoT are destructive because of the excessive number of involved devices. IoT botnets (e.g., a Mirai botnet) have served as infrastructure for powerful DDoS attacks, such as those in October 2016, which took down hundreds of websites (e.g., Twitter, Netflix, Reddit, and GitHub) for several hours [6]. The critical fact is that traditional cybersecurity approaches might not prove suitable for IoT applications, given the unique characteristics of IoT devices and networks, as summarized in Table 1.

ML-Based Cybersecurity
Machine learning has gained increased attention in the design of cybersecurity solutions for IoT. One of the reasons for such increased attention is the potential for using ML models to protect IoT data and control access to IoT resources. In traditional personal computer-based systems (e.g., client/server computing applications), data is located in a well-defined place and is requested by the users through a unique data identifier or the address of the host storing it. In contrast, IoT data might be spread out among devices and processing units; that is, IoT data will not reside in a single place, and its location will not be well-defined.
Thus, a naive solution for securing IoT data would be to implement protective measures on every single device in an IoT application. However, such a naive approach will be unfeasible, given the heterogeneous and resource-constrained nature of the IoT devices, and the heavy computation and high communication load of traditional cybersecurity techniques. Moreover, cybersecurity solutions must guarantee that access to IoT resources is controlled. IoT devices might perform vital tasks, such as in health-care applications. Hence, cybersecurity solutions must make sure that the access to update a device configuration or working mode is granted only to a legitimate entity. Such access control is needed to prevent, for instance, a malicious user from changing the dosage a device must deliver to a patient in a smart health-care application.
In this regard, ML-based solutions can observe different variables in an IoT system and make decisions to secure it. In an IoT application, each device will have a well-defined task to perform. Moreover, the interaction between users and a set of IoT devices, or a machine-to-machine interaction in a given IoT application, tends to follow a pattern; that is, it is not a random interaction. Machine learning algorithms can therefore be trained to learn such an interaction pattern, as well as the characteristics of networking traffic generated from such interactions. Therefore, an ML-based solution will be able to authenticate users, control data access, and identify DDoS attacks, compromised IoT devices, or unauthorized attempts to access IoT data or resources.

Requirements and Fundamental Challenges
Cybersecurity techniques for IoT must be lightweight, resilient, fault-tolerant, and robust. Moreover, they should tackle the heterogeneous capabilities of IoT devices and wireless networking technologies. In addition, cybersecurity techniques should protect IoT data by considering different data sensitivity levels. Moreover, they should guarantee that data is accessed only by users and system components that have the right permission to access it. Furthermore, cybersecurity solutions for IoT should detect unusual IoT traffic, block attack attempts, and mitigate damage when a device or component is compromised. Nonetheless, solutions to secure IoT must not incur significant overhead for the system and network, which would diminish the performance of an IoT application.
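As a concrete illustration of learning a normal interaction pattern and flagging traffic that deviates from it, the following minimal sketch fits a per-feature statistical profile and scores new observations against it. It is an assumption-laden toy rather than any of the surveyed methods: the two features (packets per window and mean payload size), the sample values, and the threshold of 3 standard deviations are all hypothetical, and a simple z-score stands in for a trained ML model.

```python
from statistics import mean, stdev

def fit_profile(samples):
    """Learn the per-feature mean and standard deviation of normal traffic."""
    columns = list(zip(*samples))
    return [(mean(col), stdev(col)) for col in columns]

def anomaly_score(profile, sample):
    """Return the largest absolute z-score across all features."""
    scores = []
    for (mu, sigma), value in zip(profile, sample):
        sigma = sigma if sigma > 0 else 1e-9  # guard against constant features
        scores.append(abs(value - mu) / sigma)
    return max(scores)

def is_anomalous(profile, sample, threshold=3.0):
    """Flag a window whose score exceeds the (hypothetical) threshold."""
    return anomaly_score(profile, sample) > threshold

# Hypothetical features per time window: (packets sent, mean payload bytes).
normal_windows = [(10, 64), (12, 60), (11, 66), (9, 62), (10, 65), (11, 63)]
profile = fit_profile(normal_windows)

typical_flag = is_anomalous(profile, (10, 64))   # window close to the profile
flood_flag = is_anomalous(profile, (400, 64))    # window with a packet burst
print(typical_flag, flood_flag)
```

The profile here is only two numbers per feature, which keeps the memory and computation footprint small, in line with the lightweight requirement stated above; the price is that it cannot capture correlations between features or time-varying patterns.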

In this regard, supervised machine learning techniques (e.g., SVM, naive Bayes, K-nearest neighbor, deep neural networks, and random forests) have been used for detecting network intrusion and malware, DDoS, and spoofing attacks [3]. Supervised ML techniques require labeled data, with a set of inputs and their corresponding outputs, used to train the model initially. The working principle of such an approach overall includes the centralized training of the model and its later execution in selected IoT devices. This might require a vast amount of raw data for training the models.
In addition, the data needed for training might be sensitive and private, and thus not easy to acquire. Furthermore, supervised models must be resilient to maliciously introduced data; that is, they must reject compromised training data sets that might negatively impact the result. Biased data from user interactions must also be properly treated when training supervised models for securing IoT. The challenges mentioned above will also emerge whenever a deployed supervised ML model must be re-trained and updated.
In contrast, unsupervised machine learning techniques (e.g., k-means, hierarchical clustering, and k-NN) have gained increased attention for IoT networks [7]. Unsupervised learning can be used for detecting data modification attacks, classifying statistical data tuples into benign or malicious, abnormal flow identification, and malicious relay detection. The main advantage of unsupervised ML is that it does not require labeled data for training, which contributes to reducing complexity and required resources. However, efficient unsupervised ML-based solutions will require the proper selection of features to be considered, and the removal of features possessing no discriminating power, in order to cope with the curse of dimensionality.
Finally, it is worth mentioning that some machine learning models, such as deep learning, are well known for the difficulty of understanding the reasoning behind their decisions. Thus, ML-based cybersecurity solutions might fail concerning forensic capabilities, as the decisions taken might not be traceable. It will be challenging to develop ML-based solutions to secure IoT that are capable of providing transparency and accountability of the actions taken. It might not be possible to prove that the actions taken were correct, which would make it hard to defend the system in a court of law whenever needed.

ML-Based Cybersecurity for IoT
The first step toward the design of efficient ML-based solutions for IoT applications is to understand IoT characteristics, security requirements, and design challenges. To facilitate this process, we propose a novel classification to categorize current ML-based designs to secure IoT applications. Based on the primary goal, we categorize the solutions into IoT device security and IoT network security, as summarized in Table 2. The proposed classification contributes to the study of the challenges and requirements of IoT systems from a device and network point of view. Hence, for each category, we highlight the design principles and main challenges to be overcome, and shed light on some recent works in the literature. The discussed works are summarized in Table 3.

Approach: Description
Device security: Solutions aimed at tackling vulnerabilities and attacks targeting IoT devices (e.g., hardware trojan, cloning, and battery draining), and at securing them to avoid privacy leakage, DDoS, and jamming.
Network security: Solutions aimed at securing IoT communication infrastructure (e.g., edge nodes, access points, routers, and cache systems) against adversaries.
TABLE 2. Classification of IoT cybersecurity approaches.

IoT Device Security
One of the daunting challenges in IoT applications is how to secure the devices. IoT devices might present vulnerabilities, such as open telnet ports, outdated firmware, and unencrypted transmission of sensitive data. Hence, they are susceptible to many kinds of attacks, which include hardware trojan, non-network side-channel attacks, DDoS, and tampering attacks [15]. Moreover, IoT devices overall have severe limitations in terms of power supply, which lead them to work in a duty-cycled manner to conserve energy. However, they are also susceptible to sleep deprivation and battery draining attacks. In this regard, ML-based cybersecurity approaches can be explored to ensure IoT devices are working correctly, that is, to detect when they are compromised or are receiving unusual requests, whether for sensitive data or as part of DDoS attempts. Moreover, ML-based cybersecurity can improve authentication mechanisms and access control to data and networks for new devices added to the system.
Machine learning has been used in proposed solutions to authenticate IoT devices through fingerprinting. Figure 2 depicts the general working principle of such approaches. IoT devices will have unique radio signal signatures. The unique signatures of transmitted signals arise from the transmitter’s hardware imperfections or the effects of signal propagation (e.g., fading, Doppler effect, noise, and distortion). Furthermore, recent studies [9–11] designed ML-based solutions to extract unique features from received signals and determine if a device that is trying to authenticate in the network is legitimate or adversarial.
Das et al. [9] proposed a Long Short Term Memory (LSTM)-based classifier to learn unique hardware imperfections of legitimate IoT devices. Hence, such unique imperfections are used to distinguish legitimate devices from adversaries that try to emulate them. To do so, wireless signals, in the form of samples of transmitted preambles composed of multiple symbols, are considered. For a given input, the LSTM classifier’s output will be the imperfection characteristics of the transmitter hardware, in terms of frequency offset, phase offset, filters, timing offset, and multipath.
Chatterjee et al. [10] proposed the RF-PUF for IoT device authentication through physical unclonable functions (PUF). In the RF-PUF, device identification is performed at the receiver node, from features (frequency, in-phase (I) and quadrature (Q) components, and channel features) extracted from received wireless signals. The proposed solution implements a three-layer Artificial Neural Network (ANN) that will determine the unique identifier

Proposal: Xiao et al. [8]
Category: Network security
ML technique: DQN
Goal: Secure mobile edge caching devices.
Description: Determine the edge node the IoT device should use, the task offloading rate/time, and the transmission power to be used in the communication. Those parameters are selected from the observed users’ density, devices’ battery level, jamming strength, and radio channel bandwidth.

Proposal: Liu et al. [7]
Category: Network security
ML technique: k-means
Goal: Detect malicious devices within IoT multihop paths.
Description: Use probe packets to discover multi-hop paths from source nodes to the sink. The sink node determines the fraction of unmodified packets of each path, from received probes. Hence, k-means is used to cluster nodes into benign and malicious, based on the reputation of the paths they are a member of and their contribution to each path.

Proposal: Das et al. [9]
Category: Device security
ML technique: LSTM
Goal: Device authentication.
Description: Use the unique hardware imperfections of IoT devices to authenticate them.

Proposal: Chatterjee et al. [10]
Category: Device security
ML technique: ANN
Goal: Device authentication.
Description: Authenticate IoT devices from physical unclonable functions.

Proposal: Ferdosi and Saad [11]
Category: Device security
ML technique: LSTM
Goal: Device authentication.
Description: Gateway nodes authenticate devices of massive IoT scenarios through received watermarked signals.

Proposal: Chen et al. [12]
Category: Network security
ML technique: DBN
Goal: Detect jamming attacks in the mobile edge computing infrastructure.
Description: Deep belief network is used to learn features of eavesdropping and jamming attacks to mobile edge computing systems.

Proposal: Miettinen et al. [13]
Category: Network security
ML technique: Random Forest
Goal: Detect devices with unpatched vulnerabilities.
Description: Use devices’ fingerprint to identify if they have any unpatched vulnerability. Hence, protective measures are taken to limit the operation of a vulnerable device in the IoT network.

Proposal: Alli et al. [14]
Category: Network security
ML technique: PSO and Neuro-Fuzzy
Goal: Prevent malicious IoT devices from offloading invalid data aimed at network congestion and exhaustion of fog and cloud computing resources.
Description: Surrogate entities at fog nodes collect and store information regarding IoT devices within the network. PSO is used at the fog nodes to select the optimal node, aimed at reducing delay, for handling offloaded tasks. Neuro-Fuzzy is used at gateways to evaluate data coming from IoT devices and identify malicious task offloading.

Proposal: Vashist et al. [4]
Category: Device security
ML technique: ANN, SVM, kNN, and decision tree classifiers
Goal: Detect burst errors on multiple consecutive flits of a packet in a WiNoC.
Description: Implements a set of machine learning classifiers to detect jamming attacks aimed at denial-of-service on wireless Network-on-Chip. The classifiers are used to distinguish burst errors occasioned during normal operation from errors that happen when an internal or external attacker is interfering in the communication.

TABLE 3. Summary of discussed works.
of the transmitter based on the output (normalized geometric means of feature values) and PUF properties.
The main disadvantage of the above work is the high demand at the gateway node, which might fail in simultaneously authenticating IoT devices in massive IoT systems. In this regard, Ferdosi and Saad [11] proposed an LSTM-based watermarking algorithm for assisting dynamic massive IoT device authentication. In the proposed solution, the LSTM model is used to extract fingerprints from device signals’ characteristics (spectral flatness, mean, variance, skewness, and kurtosis). The output is a bitstream used to watermark the original signal using a key. At the gateway, a proposed dynamic watermarking LSTM (DW-LSTM) model is used to extract the bitstream and the features of a received watermarked signal. Those outputs are compared, and in the event of dissimilarities between the two sequences, an attack alarm is triggered.
In contrast to the works mentioned above, Vashist et al. [4] addressed jamming attacks aimed at DoS on wireless Network-on-Chip (WiNoC). The authors used a burst error correction code to monitor the rate of burst errors received over the wireless medium, and ML classifiers (ANN, SVM, kNN, and decision tree) to detect the persistent jamming attack. In the considered attack model, an external or internal attacker will interfere with legitimate transmissions, which will cause high burst error rates on multiple consecutive flits of a packet. Hence, ML classifiers were employed to distinguish random burst errors occasioned by power source fluctuations, ground bounce, or crosstalk from burst errors due to jamming attacks. The authors created a simulation-based dataset with different bit error rates (BER) to model normal operation and burst errors from jamming attacks. The number of transmitted and received flits, as well as the number of errors, are used together with the operating mode (i.e., normal or attacked) for training the classifiers.
Despite the advancements, many challenges should be addressed during the design of AI-based cybersecurity solutions to protect devices. First, the solutions must be lightweight, as devices have limited resources in terms of computing, storage, and energy. Second, the solutions will need to deal with the lack of reliable data sets to be used for training and validation. Simulated data were considered to evaluate the proposed solutions in [9, 10], for instance. Third, it might be required to re-train and update the parameters of an ML-based cybersecurity solution. Hence, the data exchange for such tasks should be done in a way that will not congest the network.
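To make the signal characteristics named in the fingerprinting discussion more tangible (spectral flatness, mean, variance, skewness, and kurtosis), the sketch below computes them for a sampled signal using only the Python standard library. It is an illustrative assumption, not the method of [11]: there is no LSTM and no watermarking step, the function names are hypothetical, and the two test signals (a pure tone and seeded Gaussian noise) are synthetic stand-ins for received device signals.

```python
import cmath
import math
import random

def moments(x):
    """Return mean, variance, skewness, and kurtosis (population form)."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n
    if var == 0:
        return mu, 0.0, 0.0, 0.0  # constant signal: higher moments undefined
    skew = sum((v - mu) ** 3 for v in x) / (n * var ** 1.5)
    kurt = sum((v - mu) ** 4 for v in x) / (n * var ** 2)
    return mu, var, skew, kurt

def power_spectrum(x):
    """One-sided power spectrum via a naive DFT (fine for short signals)."""
    n = len(x)
    return [
        abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))) ** 2
        for k in range(n // 2 + 1)
    ]

def spectral_flatness(x, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum, in (0, 1]."""
    p = [v + eps for v in power_spectrum(x)]  # eps avoids log(0)
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return geometric / arithmetic

def fingerprint(x):
    """Feature vector: (flatness, mean, variance, skewness, kurtosis)."""
    return (spectral_flatness(x),) + moments(x)

# Synthetic stand-ins for received signals (assumptions for illustration).
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(64)]

tone_fp = fingerprint(tone)
noise_fp = fingerprint(noise)
print(tone_fp)
print(noise_fp)
```

A pure tone concentrates its power in one frequency bin, so its flatness is close to zero, whereas noise spreads power across bins and scores much higher. In a fingerprinting pipeline, vectors like these would be the input to a learned model rather than being compared directly.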

FIGURE 2. ML-based IoT device fingerprinting.
IoT Network Security lected during devices’ initialization are used as

Network-based IoT security aims to create barriers input for each classifier that will provide a bina-
to protect the IoT network, rather than addressing ry decision as to whether the input fingerprint
security in a per-device manner. This includes, for matches the device-type.
instance, identification of the malicious device, In addition, IoT networks can suffer from DoS
traffic filtering as it traverses the network, iden- of edge computing resources. Malicious nodes can
tifying unusual requests without congesting the attack edge computing infrastructure by maliciously
network and increasing latency, link protection offloading tasks aimed at occupying processing, stor-
between IoT and edge/cloud servers, and device age, and communication edge computing resourc-
identification and registration when new devices. Hence, tasks offloaded by legitimate devices will
es connect to the network. ML-based solutions not find available resources on the edge and will
to secure IoT networks can also be deployed at need to be handled locally, which will exhaust IoT
edge and cloud infrastructure to monitor incoming and outgoing traffic of devices within the network, profile them, and determine from the normal and unusual behavior of the entities when the network is under attack.

Jamming is one of the attacks in IoT networks that aims to disrupt communication between devices and edge servers. To tackle jamming attacks, physical-layer security methods have been proposed for IoT as alternatives to encryption/decryption-based methods, which are costly in terms of computing resources. Chen et al. [12] proposed a deep learning framework for jamming attack detection in a mobile edge computing infrastructure supporting IoT-based cyber-physical transportation. The proposed framework uses a deep belief network to analyze attack behaviors from required permissions, sensitive application programming interfaces (APIs), and dynamic behaviors.

Liu et al. [7] used k-means clustering to identify malicious nodes involved in data routing in IoT multi-hop applications. Probe packets are transmitted from source nodes toward the destination (sink). The destination checks the integrity of each received probe packet and calculates the fraction of unmodified packets along each of the multiple paths from the source node. Then, k-means is used to cluster nodes into two groups (benign or malicious) based on the reputation attributes of the paths they are part of and their contributions to those paths.

Another approach to secure IoT networks is to detect the presence of devices with unpatched vulnerabilities and apply the necessary protection measures to secure the other devices in the same network. IoT SENTINEL [13] implements a software-defined networking (SDN)-based Security Gateway to monitor and classify the devices, as well as to send device fingerprints to the proposed IoT Security. The Random Forest algorithm is used to create classifiers for devices with known fingerprints. Hence, upon the connection of new IoT devices in the network, 23 features extracted from each packet of a set col-

resource-constrained devices and impair the performance of applications. Alli et al. [14] proposed SecOFF-FCIoT, an ML-based approach for secure task offloading to fog and cloud servers. The proposed solution uses Particle Swarm Optimization (PSO) at the IoT device level to optimally select a fog node to handle offloaded tasks. Then, a neuro-fuzzy model is used at gateway nodes to evaluate data coming from IoT devices and isolate the malicious devices that send invalid data with the purpose of congesting the network.

In contrast, Xiao et al. [8] investigated the use of a reinforcement learning-based procedure for securing mobile edge caching (MEC) devices. In IoT applications, an MEC infrastructure will be a target of attackers that seek either leakage of data cached at the MEC devices or denial of service through impaired performance of MEC systems. Hence, the authors investigated the use of a deep Q-network (DQN) to secure MEC. The DQN model observes user density, battery levels, jamming strength, and radio channel bandwidth, and selects the edge device to which the task is offloaded, the offloading rate/time, and the transmission power the IoT device uses to offload the task to the MEC device.

Herein, collaborative solutions need to be explored to improve the security of large-scale and massive IoT networks. IoT devices can select edge servers based on the security level of the communication and of the server. However, the periodic communication among IoT devices needed to exchange the security levels of the edge devices they have used will congest the network and incur additional costs, such as energy. This calls for collaborative machine learning approaches in which model parameters, rather than the data used for training, are shared among the devices.

Future Research Directions

While important progress has been achieved, several directions require further exploration in the design of solutions to secure IoT applications.
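To make the parameter-sharing idea from the discussion of collaborative approaches concrete, the following is a minimal sketch of federated averaging, given as one possible illustration rather than a method prescribed by this article or the cited works. It assumes a toy linear model y ≈ w·x + b, plain gradient descent on each device, and a sample-count-weighted average at an aggregator. The function names (`local_update`, `federated_average`, `run_rounds`) and the data are hypothetical.

```python
from typing import List, Sequence, Tuple

Sample = Tuple[float, float]  # (feature x, target y)


def local_update(weights: List[float], data: Sequence[Sample],
                 lr: float = 0.05, epochs: int = 20) -> List[float]:
    """Run a few gradient-descent epochs of y ~ w*x + b on one device's private data."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]


def federated_average(updates: Sequence[List[float]],
                      sizes: Sequence[int]) -> List[float]:
    """Average device parameters, weighted by each device's number of samples."""
    total = sum(sizes)
    dim = len(updates[0])
    return [sum(u[i] * s for u, s in zip(updates, sizes)) / total
            for i in range(dim)]


def run_rounds(devices: Sequence[Sequence[Sample]], rounds: int = 200) -> List[float]:
    """Only parameters cross the network; raw samples stay on each device."""
    global_weights = [0.0, 0.0]
    for _ in range(rounds):
        updates = [local_update(list(global_weights), d) for d in devices]
        global_weights = federated_average(updates, [len(d) for d in devices])
    return global_weights


# Hypothetical data: three devices observe noiseless samples of y = 2x + 1.
devices = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
    [(-1.0, -1.0), (0.5, 2.0)],
]
model = run_rounds(devices)
```

In this sketch each round transmits two floating-point parameters per device instead of the device's samples, which is the trade-off the text points to: less traffic and no exposure of raw data, at the cost of repeated local computation on constrained devices.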

First, there is a lack of machine learning-based solutions that consider different information for device profiling. Current ML-based cybersecurity solutions for IoT authentication and access control profile devices in terms of their hardware imperfections. However, additional information can be considered to improve device profiling and thereby increase the performance of IoT cybersecurity solutions. For instance, combining IoT infrastructure usage information, such as CPU, memory, and network traffic intensity and pattern, rather than considering a single aspect (as is done in the current literature), and using high-level information, such as social interactions with other devices, can improve the performance of ML-based cybersecurity solutions in IoT applications.

Battery draining and sleep deprivation attacks are common and catastrophic in IoT devices. As mentioned in [15], some works in the literature have already investigated the energy usage pattern of IoT devices with the aim of detecting energy depletion and DDoS attacks. However, more research efforts in this area are needed. For instance, IoT devices will work in a duty-cycled manner, where devices will be sleeping (i.e., the transceiver will be turned off) most of the time to reduce energy consumption. Such features should be explored: supervised learning can be used to correlate devices with similar functionalities and detect when a device is working with an abnormal active and sleep cycle.

Furthermore, there is a lack of investigation of collaborative and distributed machine learning-based solutions. IoT will demand ML-based solutions on distributed and heterogeneous devices. Such solutions must be collaborative and must not rely on centralized data training. In this regard, federated learning could be used as a starting point for such approaches.

In addition, classic challenges of machine learning, such as the availability of data sets for training and validation, must be tackled. There is a lack of IoT data sets covering incoming/outgoing network traffic, device operations, and user interactions. Moreover, there is a lack of data sets related to attacks on and threats to IoT applications.

Conclusion

This article presented a detailed discussion of the advantages and challenges of machine learning (ML)-based solutions to secure the Internet of things (IoT). We described the fundamental design requirements and challenges of cybersecurity solutions for IoT. Then, we discussed how ML-based solutions could be advantageous in tackling the vulnerabilities of IoT. We classified ML-based cybersecurity solutions as device-based and network-based, according to the main security goal they are intended to cope with in IoT applications. This classification helps in understanding the requirements and challenges faced when designing new ML-based cybersecurity solutions for IoT applications. For each category of the proposed classification, we shed light on the main goal and fundamental challenges to be tackled, and discussed representative works in the literature. Finally, we presented some future research directions that need further investigation.

Acknowledgment

This work is partially supported by the NSERC DISCOVERY, NSERC CREATE TRANSIT and Canada Research Chairs Programs.

References

[1] J. Jagannath et al., “Machine Learning for Wireless Communications in the Internet of Things: A Comprehensive Survey,” Ad Hoc Networks, vol. 93, 2019, p. 101913–59.
[2] F. Restuccia et al., “Securing the Internet of Things in the Age of Machine Learning and Software-Defined Networking,” IEEE Internet of Things J., vol. 5, no. 6, Dec. 2018, pp. 4829–42.
[3] L. Xiao et al., “IoT Security Techniques Based on Machine Learning: How do IoT Devices Use AI to Enhance Security?” IEEE Signal Processing Mag., vol. 35, no. 5, Sep. 2018, pp. 41–49.
[4] A. Vashist et al., “Securing a Wireless Network-on-Chip Against Jamming Based Denial-of-Service Attacks,” Proc. IEEE Computer Society Annual Symposium on VLSI (ISVLSI), July 2019, pp. 320–25.
[5] F. Li et al., “System Statistics Learning-Based IoT Security: Feasibility and Suitability,” IEEE Internet of Things J., vol. 6, no. 4, Aug. 2019, pp. 6396–6403.
[6] C. Kolias et al., “DDoS in the IoT: Mirai and Other Botnets,” Computer, vol. 50, no. 7, July 2017, pp. 80–84.
[7] X. Liu et al., “Identifying Malicious Nodes in Multihop IoT Networks Using Diversity and Unsupervised Learning,” Proc. IEEE Int’l Conference on Communications (ICC), May 2018, pp. 1–6.
[8] L. Xiao et al., “Security in Mobile Edge Caching with Reinforcement Learning,” IEEE Wireless Commun., vol. 25, no. 3, June 2018, pp. 116–122.
[9] R. Das et al., “A Deep Learning Approach to IoT Authentication,” Proc. IEEE Int’l Conference on Communications (ICC), May 2018, pp. 1–6.
[10] B. Chatterjee et al., “RF-PUF: Enhancing IoT Security Through Authentication of Wireless Nodes Using in-situ Machine Learning,” IEEE Internet of Things J., vol. 6, no. 1, Feb. 2019, pp. 388–398.
[11] A. Ferdowsi and W. Saad, “Deep Learning for Signal Authentication and Security in Massive Internet-of-Things Systems,” IEEE Trans. Commun., vol. 67, no. 2, Feb. 2019, pp. 1371–87.
[12] Y. Chen et al., “Deep Learning for Secure Mobile Edge Computing in Cyber-Physical Transportation Systems,” IEEE Network, vol. 33, no. 4, July 2019, pp. 36–41.
[13] M. Miettinen et al., “IoT SENTINEL: Automated Device-type Identification for Security Enforcement in IoT,” Proc. IEEE 37th Int’l Conf. on Distributed Computing Systems (ICDCS), June 2017, pp. 2177–84.
[14] A. Alli and M. Alam, “SecOFF-FCIoT: Machine Learning Based Secure Offloading in Fog-Cloud of Things for Smart City Applications,” Internet of Things, vol. 7, 2019, pp. 70–89.
[15] A. Mosenia and N. Jha, “A Comprehensive Study of Security of Internet-of-Things,” IEEE Trans. on Emerging Topics in Computing, vol. 5, no. 4, Oct. 2017, pp. 586–602.

Biographies

Azzedine Boukerche [FIEEE, FEiC, FCAE, FAAAS] is a Distinguished University Professor and Canada Research Chair Tier-1 at the University of Ottawa. He has received the C. Gotlieb Computer Medal Award, Ontario Distinguished Researcher Award, Premier of Ontario Research Excellence Award, G. S. Glinski Award for Excellence in Research, IEEE Computer Society Golden Core Award, IEEE CS-Meritorious Award, IEEE TCPP Leaderships Award, IEEE ComSoc ASHN Leaderships and Contribution Award, and the University of Ottawa Award for Excellence in Research. His research interests include wireless ad hoc and sensor networks, wireless networking and mobile computing.

Rodolfo W. L. Coutinho (rodolfo.coutinho@concordia.ca) is an assistant professor at Concordia University, Canada. He received the ACM MSWiM’19 Rising Star Award and the 2018 Pierre Laberge Prize at the University of Ottawa. He also received the Best Thesis Awards from the CAPES, Brazilian Computer Society and the Brazilian Computer Networks and Distributed Systems Interest Group. He has served as TPC Co-Chair for ACM and IEEE conferences. His research interests include Internet of Things, underwater networks, information-centric networking, and mobile computing.



0890-8044/20/$25.00 © 2020 IEEE. IEEE Network, Accepted for Publication.
