Machine learning based file type classifier designing in IoT cloud

Cluster Computing

Abstract

With growing interest in social media and a rising number of users, the volume of files that must be handled has also increased. To manage this load, service providers rely on cloud servers. Identifying and clustering files by type is a difficult but important task in computer science. Various traditional identification approaches exist that rely on design features. The problem with these methods is that they can be easily spoofed. To resolve this issue, this paper proposes a hybrid algorithm that combines the features of Random Forest with AdaBoost. The algorithm, Internet of Things (IoT) data file formatting (IDFF), classifies data as text, image, audio, or video and gives better accuracy. The proposed approach obtains Accuracy (93%), Precision (95%), Recall (95%), F-Measure (95%), and G-Mean (96%).
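The five evaluation measures named in the abstract can be illustrated with a minimal Python sketch. The preview does not show the paper's exact definitions, so this assumes the common binary (one-vs-rest) forms computed from confusion-matrix counts, with G-Mean taken as the square root of sensitivity times specificity. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(tp, fp, fn, tn):
    """Compute common binary metrics from confusion-matrix counts.

    Assumed definitions (not confirmed by the paper preview):
      accuracy  = (TP + TN) / all samples
      precision = TP / (TP + FP)
      recall    = TP / (TP + FN)   (also called sensitivity)
      F-measure = harmonic mean of precision and recall
      G-mean    = sqrt(recall * specificity)
    """
    accuracy = _ratio(tp + tn, tp + fp + fn + tn)
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    f_measure = _ratio(2 * precision * recall, precision + recall)
    g_mean = math.sqrt(recall * specificity)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


# Hypothetical counts for one class (for example "image" versus the rest).
metrics = classification_metrics(tp=80, fp=20, fn=10, tn=90)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under these definitions the F-Measure is the harmonic mean of precision and recall, so equal precision and recall of 95% give an F-Measure of 95%, which is consistent with the figures reported in the abstract. For the four-class setting (text, image, audio, video), one plausible extension is to compute the counts per class and average the results, but the averaging scheme used by the authors is not stated in the preview.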

Data availability

Enquiries about data availability should be directed to the authors.

Funding

The authors have not disclosed any funding.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by [PS]. The first draft of the manuscript was written by [PS]. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Puneet Sharma.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Sharma, P., Kumar, M. & Sharma, A. Machine learning based file type classifier designing in IoT cloud. Cluster Comput 27, 109–117 (2024). https://doi.org/10.1007/s10586-022-03816-8

  • DOI: https://doi.org/10.1007/s10586-022-03816-8

Keywords