A Reduced Network Traffic Method for IoT Data Clustering

Published: 07 December 2020

    Abstract

    Internet of Things (IoT) systems usually involve interconnected sensor nodes (devices) with low processing capacity and little memory that collect data in a wide range of applications connecting people and things. In this scenario, mining tasks such as clustering are commonly deployed to detect behavioral patterns in the collected data. Centralized clustering of IoT data demands high network traffic to transmit the data from the devices to a central node, where a clustering algorithm is applied. This approach does not scale as the number of devices and the amount of data grow. Distributing the clustering process across the devices may not be feasible either, since the devices are usually simple and may be unable to execute complex procedures. This work proposes a centralized IoT data clustering method that demands reduced network traffic and low processing power in the devices. The proposed method uses a data grid to summarize the information at the devices before transmitting it to the central node, which reduces network traffic. After the data transfer, the method applies a clustering algorithm developed to process data in the summarized representation. Tests with seven datasets provided experimental evidence that the proposed method reduces network traffic and produces results comparable to those generated by DBSCAN and HDBSCAN, two robust centralized clustering algorithms.
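
    The grid-based summarization idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm, whose details are not given here: the function names (summarize, merge, cluster_cells), the two-dimensional readings, the per-cell counts, the cell_size and min_count parameters, and the rule of joining adjacent dense cells are all assumptions chosen for illustration.

```python
from collections import Counter, deque


def summarize(points, cell_size):
    """Device side (hypothetical): count 2-D readings per grid cell."""
    counts = Counter()
    for x, y in points:
        # Floor division maps a coordinate to the index of its grid cell.
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return dict(counts)


def merge(summaries):
    """Central node (hypothetical): add up the cell counts sent by all devices."""
    total = Counter()
    for summary in summaries:
        total.update(summary)
    return dict(total)


def cluster_cells(cell_counts, min_count):
    """Central node (hypothetical): label groups of touching dense cells.

    A cell is "dense" when it holds at least min_count readings. Dense cells
    that touch (8-neighbourhood) share a cluster label. Cells below the
    threshold receive no label and can be treated as noise.
    """
    dense = {cell for cell, n in cell_counts.items() if n >= min_count}
    labels = {}
    next_label = 0
    for start in sorted(dense):
        if start in labels:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:  # breadth-first search over neighbouring dense cells
            cx, cy = queue.popleft()
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    neighbour = (cx + dx, cy + dy)
                    if neighbour in dense and neighbour not in labels:
                        labels[neighbour] = next_label
                        queue.append(neighbour)
        next_label += 1
    return labels


# Made-up readings from two devices, for illustration only.
device_a = [(0.1, 0.2), (0.3, 0.4), (0.2, 0.1), (1.2, 0.3), (1.4, 0.1)]
device_b = [(5.1, 5.2), (5.3, 5.4), (5.2, 5.6), (9.0, 0.5)]

# Each device transmits only its cell counts, not its raw readings.
sent = [summarize(device_a, 1.0), summarize(device_b, 1.0)]
labels = cluster_cells(merge(sent), min_count=2)
```

    In this toy run, nine raw readings are replaced by four (cell, count) pairs on the wire, which is where a traffic reduction would come from. The saving grows when many readings fall in the same cell and shrinks as cell_size gets smaller.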

    Cited By

    • (2023) TAP: Traffic Accident Profiling via Multi-Task Spatio-Temporal Graph Representation Learning. ACM Transactions on Knowledge Discovery from Data 17:4 (1-25). DOI: 10.1145/3564594. Online publication date: 24-Feb-2023
    • (2022) Clustering for smart cities in the internet of things: a review. Cluster Computing 25:6 (4097-4127). DOI: 10.1007/s10586-022-03646-8. Online publication date: 27-Jun-2022
    • (2022) PSO-based traffic scheduling mechanism in information-centric networking. Internet Technology Letters 5:5. DOI: 10.1002/itl2.362. Online publication date: 8-Mar-2022

    Published In

    ACM Transactions on Knowledge Discovery from Data  Volume 15, Issue 1
    February 2021
    361 pages
    ISSN:1556-4681
    EISSN:1556-472X
    DOI:10.1145/3441647
    © 2020 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of a national government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 07 December 2020
    Accepted: 01 September 2020
    Revised: 01 July 2020
    Received: 01 January 2020
    Published in TKDD Volume 15, Issue 1

    Author Tags

    1. Data traffic reduction
    2. Internet of Things
    3. Data summarization
    4. Distributed data mining

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • Coordenação de Aperfeiçoamento de Pessoal de Nível Superior - Brasil (CAPES)
