A Packet Content-Oriented Remote Code Execution Attack Payload Detection Model
Abstract
1. Introduction
- We propose a novel RCE attack payload detection model named PCO-RCEAPD. Because the model is packet-content-oriented, it can quickly discover potential security threats, namely XML external entity (XXE), expression language injection (ELi), and insecure deserialization (IDSER) attacks, during network communication.
- For the XXE attack, we propose a novel algorithm to construct the use-definition (UD) chain of XML entities. The algorithm tracks how XML entities in packets are referenced and analyzes their sensitive behaviors.
- For the ELi and IDSER attacks, we slice the code based on the data dependencies of expression language and Java code, extract 34 features that describe string operations and the use of sensitive classes/methods, and train a machine learning model to perform detection.
2. Preliminaries
2.1. Remote Code Execution
2.2. XML External Entity
2.3. Expression Language Injection
2.4. Insecure Deserialization
3. Related Work
3.1. Static Approach
3.2. Dynamic Approach
4. Methodology
4.1. System Overview
4.2. Packet Preprocess
Algorithm 1: Packet preprocess.
4.3. XXE Detector
Algorithm 2: XXE detection.
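
To make the idea of a UD chain over XML entities concrete, the following is a minimal Python sketch, not the authors' Algorithm 2. It assumes that entity declarations can be recovered with a regular expression (only the `SYSTEM` form of external entities is handled), that entity uses are searched for after the `]>` that closes the internal DTD subset, and that a node is "sensitive" when an external entity points at one of a few URI schemes. The names `SENSITIVE_SCHEMES`, `parse_definitions`, `build_ud_chain`, and `analyze` are hypothetical and do not come from the paper.

```python
import re

# Hypothetical list of URI schemes treated as sensitive (an assumption, not from the paper).
SENSITIVE_SCHEMES = ("file://", "http://", "https://", "ftp://", "php://", "expect://")

# Matches <!ENTITY name "value"> and <!ENTITY name SYSTEM "uri"> (optionally with a leading %).
ENTITY_DECL = re.compile(
    r"<!ENTITY\s+(?:%\s+)?(?P<name>[\w.-]+)\s+"
    r"(?:(?P<ext>SYSTEM)\s+)?"
    r"(?:\"(?P<dq>[^\"]*)\"|'(?P<sq>[^']*)')"
)
ANY_REF = re.compile(r"[&%]([\w.-]+);")      # references inside entity values
GENERAL_REF = re.compile(r"&([\w.-]+);")     # references in the document body


def parse_definitions(xml_text):
    """Collect entity definitions: name -> {external flag, literal value}."""
    defs = {}
    for m in ENTITY_DECL.finditer(xml_text):
        value = m.group("dq") if m.group("dq") is not None else m.group("sq")
        defs[m.group("name")] = {"external": m.group("ext") is not None, "value": value}
    return defs


def build_ud_chain(name, defs, seen=None):
    """Follow a use of `name` back through the definitions it depends on."""
    seen = set() if seen is None else seen
    if name not in defs or name in seen:
        return []  # undeclared (e.g. &lt;) or already visited (cyclic reference)
    seen.add(name)
    chain = [name]
    for ref in ANY_REF.findall(defs[name]["value"]):
        chain.extend(build_ud_chain(ref, defs, seen))
    return chain


def analyze(xml_text):
    """Report every body-level entity use whose chain reaches a sensitive node."""
    defs = parse_definitions(xml_text)
    body = xml_text.split("]>", 1)[-1]  # whole text if there is no internal subset
    findings = []
    for used in GENERAL_REF.findall(body):
        chain = build_ud_chain(used, defs)
        sensitive = [
            n for n in chain
            if defs[n]["external"] and defs[n]["value"].lower().startswith(SENSITIVE_SCHEMES)
        ]
        if sensitive:
            findings.append({"use": used, "chain": chain, "sensitive": sensitive})
    return findings


SAMPLE = """<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY secret SYSTEM "file:///etc/passwd">
  <!ENTITY wrapper "prefix-&secret;">
]>
<foo>&wrapper; &lt;</foo>"""

if __name__ == "__main__":
    for finding in analyze(SAMPLE):
        print(finding)
```

The sketch never hands the payload to an XML parser, so no entity is actually resolved during analysis. A use with no reachable definition, such as the predefined `&lt;`, yields an empty chain and is ignored.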
4.4. ELi and IDSER Detector
4.4.1. Feature Group
4.4.2. EL Feature Extraction
4.4.3. Java Code Feature Extraction
Algorithm 3: Code slice.
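
As an illustration of slicing by data dependency, the sketch below computes a backward slice over straight-line, Java-like statements. It is not the authors' Algorithm 3: it assumes one statement per string, no branches or loops, and a simple regular expression to tell assignments from other statements. The helper names `parse` and `backward_slice` are hypothetical.

```python
import re

STRING_LIT = re.compile(r'"(?:\\.|[^"\\])*"')   # double-quoted string literals
IDENT = re.compile(r"[A-Za-z_]\w*")             # identifiers
# Optional type, then `lhs = rhs;` (the lookahead rejects `==`).
ASSIGN = re.compile(
    r"^\s*(?:[\w.<>\[\]]+\s+)?(?P<lhs>[A-Za-z_]\w*)\s*=(?!=)\s*(?P<rhs>.+?);?\s*$"
)


def parse(stmt):
    """Return (defined variable or None, set of identifiers used)."""
    code = STRING_LIT.sub('""', stmt)  # blank out literals so their text is not read as identifiers
    m = ASSIGN.match(code)
    if m:
        return m.group("lhs"), set(IDENT.findall(m.group("rhs")))
    return None, set(IDENT.findall(code))


def backward_slice(statements, criterion):
    """Indices of the statements that the statement at `criterion` depends on."""
    parsed = [parse(s) for s in statements]
    wanted = set(parsed[criterion][1])  # variables whose definitions are still needed
    keep = [criterion]
    for i in range(criterion - 1, -1, -1):
        lhs, uses = parsed[i]
        if lhs in wanted:
            keep.append(i)
            wanted.discard(lhs)  # the nearest earlier definition satisfies this use
            wanted |= uses       # now its own operands are needed
    return sorted(keep)


STATEMENTS = [
    'String a = "who";',
    'String b = "ami";',
    'String unused = "noise";',
    'String cmd = a + b;',
    'int n = 42;',
    'Runtime.getRuntime().exec(cmd);',
]

if __name__ == "__main__":
    for i in backward_slice(STATEMENTS, 5):
        print(STATEMENTS[i])
```

Slicing from the `exec` call keeps only the statements that build `cmd` and drops the unrelated assignments, so features extracted afterwards describe the code that actually feeds the sensitive call.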
5. Experimental Results and Analysis
5.1. Experimental Setup
- Building the vulnerable environment with Vulhub, an open-source project hosted on GitHub;
- Collecting benign and malicious XXE, ELi, and IDSER payloads from the Internet;
- Testing the collected payloads in the built environment and capturing the resulting network packets.
5.2. Evaluation of the ELi and IDSER Detector
5.2.1. Evaluation of ELi Detection
5.2.2. Evaluation of IDSER Detection
5.2.3. Comprehensive Evaluation Results of the ELi and IDSER Detector
5.3. Evaluation of XXE Detection
5.4. Results and Limitations Analysis
- Constructing the UD chain for XML entities may introduce false positives, although these have little impact on the final judgment of security practitioners.
- The XXE detector can produce false negatives when analyzing the behavior of UD chain nodes. We therefore suggest providing it with a list of sensitive behaviors to improve its detection accuracy.
- Because the dataset was collected in an experimental environment, it may not be representative of real production environments. This could result in incomplete feature analysis and extraction, reducing the accuracy of the machine learning model.
- It is difficult to determine whether an attack succeeded based solely on the request and response packets. We can only make a preliminary judgment by checking whether a complete UD chain can be constructed and whether a string can be parsed by the interpreter, which may produce false positives for practitioners.
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
No. | Feature Name | Description |
---|---|---|
1 | no_charAt | Number of times the charAt method is used |
2 | no_getChars | Number of times the getChars method is used |
3 | no_toString | Number of times the toString method is used |
4 | no_valueOf | Number of times the valueOf method is used |
5 | no_subString | Number of times the subString method is used |
6 | no_split | Number of times the split method is used |
7 | no_concat | Number of times the concat method is used |
8 | no_replace | Number of times the replace method is used |
9 | has_Runtime | Is the Runtime class used |
10 | no_getRuntime | Number of times the getRuntime method is used |
11 | has_ProcessBuilder | Is the ProcessBuilder class used |
12 | no_forName | Number of times the forName method is used |
13 | no_getClass | Number of times the getClass method is used |
14 | has_getDeclaredField | Is the getDeclaredField method used |
15 | has_newInstance | Is the newInstance method used |
16 | no_getMethod | Number of times the getMethod method is used |
17 | no_class | Number of times the class field is used |
18 | has_getConstructor | Is the getConstructor method used |
19 | has_getSystemClassLoader | Is the getSystemClassLoader method used |
20 | has_getEngineByName | Is the getEngineByName method used |
21 | has_eval | Is the eval method used |
22 | no_add | Number of times the “+” operator is used |
23 | has_getDeclaredConstructors | Is the getDeclaredConstructors method used |
24 | has_getClassLoader | Is the getClassLoader method used |
25 | has_start | Is the start method used |
26 | no_loadClass | Number of times the loadClass method is used |
27 | has_ScriptEngineManager | Is the ScriptEngineManager class used |
28 | has_getResource | Is the getResource method used |
29 | has_URLClassLoader | Is the URLClassLoader class used |
30 | no_decode | Number of times the decode method is used |
31 | has_getMethods | Is the getMethods method used |
32 | has_exec | Is the exec method used |
33 | has_invoke | Is the invoke method used |
34 | has_getConstructors | Is the getConstructors method used |
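
The 34 features in the table above can be approximated with plain pattern matching. The sketch below is one possible reading, not the authors' extractor. Its assumptions: a method "use" is the method name followed by `(`; a class "use" is the class name as a whole word anywhere in the text, including inside string literals; `no_class` counts `.class` accesses; `no_add` counts `+` characters outside string literals; and `subString` is matched case-insensitively because the Java method is spelled `substring`. The function names are hypothetical.

```python
import re

# Feature names follow the table; grouping into counts and flags mirrors its descriptions.
COUNTED_METHODS = [
    "charAt", "getChars", "toString", "valueOf", "subString", "split", "concat",
    "replace", "getRuntime", "forName", "getClass", "getMethod", "loadClass", "decode",
]
FLAGGED_METHODS = [
    "getDeclaredField", "newInstance", "getConstructor", "getSystemClassLoader",
    "getEngineByName", "eval", "getDeclaredConstructors", "getClassLoader", "start",
    "getResource", "getMethods", "exec", "invoke", "getConstructors",
]
FLAGGED_CLASSES = ["Runtime", "ProcessBuilder", "ScriptEngineManager", "URLClassLoader"]

# Single- or double-quoted string literals (EL payloads often use single quotes).
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"|\'(?:\\.|[^\'\\])*\'')


def _calls(code, method):
    """Count occurrences of `method(`; the word boundary keeps getMethod apart from xgetMethod."""
    flags = re.IGNORECASE if method == "subString" else 0
    return len(re.findall(r"\b" + re.escape(method) + r"\s*\(", code, flags))


def extract_features(code):
    """Map a code slice to a dict holding the 34 named features."""
    feats = {}
    for m in COUNTED_METHODS:
        feats["no_" + m] = _calls(code, m)
    for m in FLAGGED_METHODS:
        feats["has_" + m] = int(_calls(code, m) > 0)
    for c in FLAGGED_CLASSES:
        feats["has_" + c] = int(re.search(r"\b" + re.escape(c) + r"\b", code) is not None)
    feats["no_class"] = len(re.findall(r"\.class\b", code))
    feats["no_add"] = STRING_LITERAL.sub("", code).count("+")
    return feats


if __name__ == "__main__":
    sample = 'Runtime.getRuntime().exec("who" + "ami".toString());'
    features = extract_features(sample)
    print({name: value for name, value in features.items() if value})
```

The resulting dictionary can be turned into a fixed-order vector by listing the keys in the table's order before it is passed to a classifier.
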
Metric | Description |
---|---|
TP (True Positive) | The number of payloads correctly classified as malicious. |
TN (True Negative) | The number of payloads correctly classified as benign. |
FP (False Positive) | The number of payloads mistakenly classified as malicious. |
FN (False Negative) | The number of payloads mistakenly classified as benign. |
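Precision | $\frac{TP}{TP + FP}$ |
Recall | $\frac{TP}{TP + FN}$ |
F1 Score | $\frac{2 \times Precision \times Recall}{Precision + Recall}$ |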
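
As a quick check of how the derived metrics follow from the confusion-matrix counts, the sketch below applies the standard definitions Precision = TP / (TP + FP), Recall = TP / (TP + FN), and F1 = 2 · Precision · Recall / (Precision + Recall). The function name `precision_recall_f1` is hypothetical. The example counts (TP = 17, FP = 3, FN = 1) are those reported for XXE detection.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from raw counts, returning 0.0 for empty denominators."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Counts reported for the XXE detector: TP = 17, FP = 3, FN = 1.
    p, r, f1 = precision_recall_f1(17, 3, 1)
    print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
    # prints "precision=0.85 recall=0.94 f1=0.89"
```

TN does not appear in any of the three formulas, which is why a model can score well on precision while still missing many malicious payloads.
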
Model | TP (↗) | TN (↗) | FP (↘) | FN (↘) | Precision (↗) | Recall (↗) | F1 (↗) |
---|---|---|---|---|---|---|---|
Naive Bayes | 109 | 220 | 2 | 9 | 0.98 | 0.92 | 0.95 |
Decision Tree | 110 | 221 | 1 | 8 | 0.99 | 0.93 | 0.96 |
Logistic Regression | 106 | 221 | 1 | 12 | 0.99 | 0.90 | 0.94 |
Random Forest | 110 | 221 | 1 | 8 | 0.99 | 0.93 | 0.96 |
SVM | 111 | 219 | 3 | 7 | 0.97 | 0.94 | 0.96 |
XGBoost | 106 | 221 | 1 | 12 | 0.99 | 0.90 | 0.94 |
Model | TP (↗) | TN (↗) | FP (↘) | FN (↘) | Precision (↗) | Recall (↗) | F1 (↗) |
---|---|---|---|---|---|---|---|
Naive Bayes | 22 | 78 | 0 | 22 | 1.00 | 0.50 | 0.67 |
Decision Tree | 37 | 77 | 1 | 7 | 0.97 | 0.84 | 0.90 |
Logistic Regression | 34 | 75 | 2 | 10 | 0.94 | 0.77 | 0.85 |
Random Forest | 37 | 77 | 1 | 7 | 0.97 | 0.84 | 0.90 |
SVM | 34 | 76 | 1 | 10 | 0.97 | 0.77 | 0.86 |
XGBoost | 37 | 75 | 3 | 7 | 0.93 | 0.84 | 0.88 |
Model | TP (↗) | TN (↗) | FP (↘) | FN (↘) | Precision (↗) | Recall (↗) | F1 (↗) |
---|---|---|---|---|---|---|---|
Naive Bayes | 92 | 296 | 3 | 71 | 0.97 | 0.56 | 0.71 |
Decision Tree | 144 | 298 | 1 | 19 | 0.99 | 0.88 | 0.94 |
Logistic Regression | 136 | 298 | 1 | 27 | 0.99 | 0.83 | 0.91 |
Random Forest | 144 | 298 | 1 | 19 | 0.99 | 0.88 | 0.94 |
SVM | 143 | 298 | 1 | 20 | 0.99 | 0.88 | 0.93 |
XGBoost | 139 | 298 | 1 | 24 | 0.99 | 0.85 | 0.92 |
TP (↗) | TN (↗) | FP (↘) | FN (↘) | Precision (↗) | Recall (↗) | F1 (↗) |
---|---|---|---|---|---|---|
17 | 97 | 3 | 1 | 0.85 | 0.94 | 0.89 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Sun, E.; Han, J.; Li, Y.; Huang, C. A Packet Content-Oriented Remote Code Execution Attack Payload Detection Model. Future Internet 2024, 16, 235. https://doi.org/10.3390/fi16070235