DOI: 10.1145/3605758.3623492
short-paper
Open access

Granular IoT Device Identification Using TF-IDF and Cosine Similarity

Published: 26 November 2023

Abstract

Internet of Things (IoT) devices are becoming more prevalent in home environments and have been shown to be generally insecure. Many previous studies have sought to identify unknown IoT devices on networks. To truly secure a network, however, unknown devices need to be identified down to the granularity of firmware version, a problem previous studies have failed to solve. As devices change versions, subtle differences are expected in their on-wire signatures that would be hard for a human analyst to notice but easy for a natural language processing (NLP) technique to identify. In this paper, we extract keywords from both encrypted and unencrypted network traffic. We first use UMAP with K-Means clustering to visualise the data and show that natural clusters form across our test dataset of 18 devices covering 61 versions. This analysis suggests that there are underlying patterns in the extracted keywords that could be detected by machine learning techniques. We then show that these patterns can be detected by proposing a novel technique, using TF-IDF and cosine similarity, that follows the clustering results to identify IoT devices down to the level of firmware version. We show that our chosen features are strong enough to work accurately across a range of device types, manufacturers, models and versions, and we note the main observations made when trying to identify devices down to firmware version. This approach achieves an accuracy of 67% at firmware-version granularity without detriment to device-model identification, where we achieve an accuracy of 90%.
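To make the TF-IDF and cosine similarity step concrete, the following is a minimal sketch of the general technique, not the authors' implementation. The abstract does not give the exact keyword extraction, weighting scheme or decision rule, so everything here is an assumption for illustration: the device labels, the keywords, the smoothed IDF formula, and the function names `tfidf_vectors`, `weigh`, `cosine` and `identify` are all hypothetical. Each known device version is treated as a "document" of extracted keywords, and an unknown capture is assigned the label of the most similar document.

```python
import math
from collections import Counter

# Hypothetical keyword profiles, one per "model/firmware version" label.
# Real keywords would come from captured traffic; these are made up.
KNOWN_PROFILES = {
    "cam-A/1.0": ["api.cam-a.example", "TLSv1.2", "fw=1.0", "ntp.example"],
    "cam-A/1.1": ["api.cam-a.example", "TLSv1.3", "fw=1.1", "ntp.example"],
    "plug-B/2.0": ["cloud.plug-b.example", "TLSv1.2", "mqtt", "ntp.example"],
}


def weigh(tokens, idf):
    """Turn a keyword list into a sparse TF-IDF vector (dict term -> weight)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    # Terms never seen in the known profiles have no IDF and are ignored.
    return {t: (c / total) * idf[t] for t, c in counts.items() if t in idf}


def tfidf_vectors(docs):
    """Build one TF-IDF vector per labelled profile and return (vectors, idf)."""
    n = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each term once per document
    # Assumed smoothed IDF: terms present in every profile keep weight 1.0.
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = {label: weigh(tokens, idf) for label, tokens in docs.items()}
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(tokens, vectors, idf):
    """Return the label of the known profile most similar to the capture."""
    query = weigh(tokens, idf)
    return max(vectors, key=lambda label: cosine(query, vectors[label]))


vectors, idf = tfidf_vectors(KNOWN_PROFILES)
unknown_capture = ["api.cam-a.example", "TLSv1.3", "ntp.example"]
best_label = identify(unknown_capture, vectors, idf)
model, version = best_label.split("/")
print(model, version)
```

In this toy data the unknown capture shares the rarer `TLSv1.3` keyword only with the `cam-A/1.1` profile, which is why rare, version-specific keywords matter: IDF weights them more heavily than keywords common to every device, such as the shared NTP host. Splitting the label on `/` shows how a version-level match also yields a model-level answer.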

Published In

CPSIoTSec '23: Proceedings of the 5th Workshop on CPS&IoT Security and Privacy
November 2023
115 pages
ISBN: 9798400702549
DOI: 10.1145/3605758
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery
New York, NY, United States

Author Tags

1. device identification
2. firmware
3. internet of things (iot)
4. k-means
5. machine learning (ml)
6. tf-idf
7. umap
8. versions

Qualifiers

• Short-paper

Conference

CCS '23