DOI: 10.1145/3686215.3690147
Short paper · Open access

Understanding LLMs Ability to Aid Malware Analysts in Bypassing Evasion Techniques

Published: 04 November 2024

Abstract

Over the past few years, the threat of malware has become increasingly evident, posing a significant risk to cybersecurity worldwide and driving extensive research efforts to prevent and mitigate these attacks. Despite numerous efforts to automate malware analysis, these systems are constantly thwarted by evasive techniques developed by malware authors. As a result, the analysis of sophisticated evasive malware falls to human malware analysts, who must undertake the time-consuming process of overcoming each evasive technique to uncover the malware’s malicious behaviors. This highlights the need for approaches that aid malware analysts in this process. Although active measures, such as forced execution and symbolic analysis, can automatically circumvent some evasive checks, they suffer from limitations like path explosion and fail to provide useful insights that analysts can use in their workflow. To fill this gap, we investigate how large language models (LLMs) can address the shortcomings of symbolic analysis through the first comparative analysis between the two in bypassing evasion techniques. Our study leads to three key findings: (i) LLMs outperform symbolic analysis in bypassing evasive code, especially in the presence of common code patterns, such as loops, that have historically posed a challenge for symbolic analysis; (ii) LLMs correctly identify methods of bypassing evasive techniques in real-world malware; and (iii) even in the LLMs' failure modes, human malware analysts can benefit from the model's step-by-step reasoning.
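To make the loop-based evasive pattern mentioned in finding (i) concrete, the sketch below is our own illustration, not code from the paper: the trigger condition is hidden behind a long iterative transformation, so a symbolic executor must either unroll the loop into one enormous path constraint or fork repeatedly, while the loop's overall intent is easy to summarize for a human analyst or an LLM. The mixing function, iteration count, and constants are all hypothetical.

/*
 * Illustrative sketch (assumed example, not from the paper): a loop-based
 * evasive check of the kind that causes path explosion for symbolic analysis.
 */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static void payload(void) {
    /* Stand-in for the malicious behavior the check is guarding. */
    puts("payload would run here");
}

int main(int argc, char **argv) {
    /* Attacker-chosen trigger value; treated as symbolic input by an analysis engine. */
    uint32_t acc = (argc > 1) ? (uint32_t)strtoul(argv[1], NULL, 10) : 0;

    /* Loop-heavy opaque transformation: every iteration mixes the accumulator,
     * so the symbolic expression for `acc` grows with each pass. */
    for (int i = 0; i < 100000; i++) {
        acc = (acc * 2654435761u) ^ (acc >> 13);
    }

    /* Evasive gate: the payload only runs when the mixed value matches a
     * precomputed constant (the constant here is made up). */
    if (acc == 0xDEADBEEFu) {
        payload();
    } else {
        puts("benign-looking path");
    }
    return 0;
}

A symbolic engine could in principle solve this by carrying the input through every iteration, but the per-iteration mixing inflates the constraint; the abstract's finding (i) is that LLMs cope better with exactly this kind of loop pattern, and finding (iii) is that even when they fail, their step-by-step explanation of the check remains useful to the analyst.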

Published In

ICMI Companion '24: Companion Proceedings of the 26th International Conference on Multimodal Interaction
November 2024
252 pages
ISBN: 9798400704635
DOI: 10.1145/3686215
This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. Large Language Model
  2. Malware Analysis
  3. Symbolic Analysis

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

ICMI '24: International Conference on Multimodal Interaction
November 4-8, 2024
San Jose, Costa Rica

Acceptance Rates

Overall acceptance rate: 453 of 1,080 submissions (42%)
