Abstract
As interest in social media and the number of its users grow, the volume of files to be handled has grown as well. Service providers use cloud servers to manage this load. Identifying and clustering files by type is a difficult but important task in computer science. Various traditional identification approaches exist that rely on design features, but these methods can be easily spoofed. To resolve this issue, this paper proposes a hybrid algorithm that combines features of Random Forest and AdaBoost. The algorithm, Internet of Things (IoT) data file formatting (IDFF), classifies data as text, image, audio, or video with improved accuracy. The proposed approach achieves an Accuracy of 93%, Precision of 95%, Recall of 95%, F-Measure of 95%, and G-Mean of 96%.
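The abstract reports five evaluation metrics without stating how they are computed. The following is a minimal sketch of one common way to derive them for a four-class problem. It assumes macro averaging over the classes and takes G-Mean to be the geometric mean of the per-class recalls; the abstract confirms neither choice. The function names and the toy labels are hypothetical and are not the paper's data or results.

```python
from math import prod

# The four file types named in the abstract.
CLASSES = ["text", "image", "audio", "video"]


def per_class_counts(y_true, y_pred, label):
    """Return (true positives, false positives, false negatives) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    return tp, fp, fn


def evaluate(y_true, y_pred, classes=CLASSES):
    """Compute accuracy plus macro-averaged precision, recall, F-measure, and G-mean.

    Assumption: macro averaging, and G-mean as the geometric mean of
    per-class recall. The paper may define these differently.
    """
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

    precisions, recalls = [], []
    for label in classes:
        tp, fp, fn = per_class_counts(y_true, y_pred, label)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)

    precision = sum(precisions) / len(classes)
    recall = sum(recalls) / len(classes)
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    g_mean = prod(recalls) ** (1 / len(classes))

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "g_mean": g_mean,
    }


# Toy labels for illustration only (hypothetical, not from the paper).
y_true = ["text", "text", "image", "image", "audio", "audio", "video", "video"]
y_pred = ["text", "text", "image", "audio", "audio", "audio", "video", "text"]

scores = evaluate(y_true, y_pred)
for name, value in scores.items():
    print(f"{name}: {value:.4f}")
```

Because G-Mean multiplies the per-class recalls, a single poorly recognized file type pulls it down sharply, which is why it is often reported alongside accuracy when classes are imbalanced.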
Data availability
Enquiries about data availability should be directed to the authors.
Funding
The authors have not disclosed any funding.
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection, and analysis were performed by [PS]. The first draft of the manuscript was written by [PS]. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interest
The authors declare no competing interests.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Sharma, P., Kumar, M. & Sharma, A. Machine learning based file type classifier designing in IoT cloud. Cluster Comput 27, 109–117 (2024). https://doi.org/10.1007/s10586-022-03816-8
DOI: https://doi.org/10.1007/s10586-022-03816-8