Abstract
Web applications remain a significant attack vector for cybercriminals seeking to exploit application vulnerabilities and gain unauthorized access to privileged data. In this research, we evaluate the efficacy of eight supervised machine learning algorithms in detecting and countering web application attacks: Naive Bayes, Decision Tree, AdaBoost, Random Forest, Logistic Regression, K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Artificial Neural Network (ANN). Our results indicate that the KNN and Random Forest classifiers achieve an accuracy of 89% and an area under the curve (AUC) of 94% on the CSIC HTTP dataset, a commonly used benchmark in the field. The Naive Bayes classifier is the most efficient, requiring the least computational time to differentiate between malicious and benign HTTP requests. These findings may help direct future efforts towards more efficient, machine learning-driven defenses against web application attacks.
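The abstract compares classifiers on accuracy, area under the curve, and computation time. As a minimal sketch of how such figures might be computed (not the authors' code), the example below assumes that the curve is the ROC curve and that a classifier outputs one score per HTTP request, with label 1 for malicious and 0 for benign. The labels, scores, 0.5 threshold, and helper names are hypothetical toy values chosen for illustration.

```python
import time
from typing import Callable, Sequence, Tuple


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of requests whose predicted label matches the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC by pairwise comparison: the probability that a randomly chosen
    malicious request (label 1) scores higher than a randomly chosen benign
    request (label 0), counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def timed(fn: Callable, *args) -> Tuple[object, float]:
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


# Hypothetical toy data: true labels and per-request classifier scores.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
scores = [0.9, 0.2, 0.7, 0.4, 0.5, 0.1, 0.8, 0.3]

# Assumed decision threshold of 0.5 to turn scores into hard labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

acc = accuracy(y_true, y_pred)
auc, elapsed = timed(roc_auc, y_true, scores)
print(f"accuracy={acc:.4f} auc={auc:.4f} time={elapsed:.6f}s")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason the two figures can differ for the same classifier. The pairwise AUC loop is quadratic in the number of requests, so a rank-based implementation would be preferable on a dataset of realistic size.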
Acknowledgment
This work was supported by grant number 12R170.
Copyright information
© 2024 ICST Institute for Computer Sciences, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Ismail, M., Alrabaee, S., Harous, S., Choo, KK.R. (2024). Empirical Evaluations of Machine Learning Effectiveness in Detecting Web Application Attacks. In: Perakovic, D., Knapcikova, L. (eds) Future Access Enablers for Ubiquitous and Intelligent Infrastructures. FABULOUS 2023. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 542. Springer, Cham. https://doi.org/10.1007/978-3-031-50051-0_8
DOI: https://doi.org/10.1007/978-3-031-50051-0_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-50050-3
Online ISBN: 978-3-031-50051-0
eBook Packages: Computer Science (R0)