DOI: 10.1145/3508398.3519365
poster

Towards Robust Detection of PDF-based Malware

Published: 15 April 2022

Abstract

With the indisputable prevalence of PDFs, several studies of PDF malware and its evasive variants have tested the robustness of the ML-based PDF classifier frameworks Hidost and Mimicus. As heavily documented, the fundamental difference between them is that Hidost investigates the logical structure of PDFs, while Mimicus detects malicious indicators through their structural features. However, there exist techniques to mutate these features so that malicious PDFs are able to bypass these classifiers. In this work, we investigated three known attacks, Mimicry, Mimicry+, and Reverse Mimicry, to compare how effective they are at evading the classifiers in Hidost and Mimicus. The results show that Mimicry and Mimicry+ are effective in bypassing the models in Mimicus but not those in Hidost, while Reverse Mimicry is effective against the models in both Mimicus and Hidost.
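The comparison described above can be read as measuring, for each classifier and each attack, the fraction of known-malicious PDFs that end up labelled benign. The sketch below illustrates that bookkeeping only. It is not the authors' code and does not call the Hidost or Mimicus tools; the function names `evasion_rate` and `compare_attacks`, the stand-in classifier, and the sample identifiers are all hypothetical.

```python
from typing import Callable, Dict, Iterable

# A classifier is modelled as any callable that returns True when it
# flags a sample as malicious. This is an assumption for illustration;
# the real Hidost and Mimicus interfaces are not reproduced here.
Classifier = Callable[[str], bool]


def evasion_rate(classify: Classifier, malicious_samples: Iterable[str]) -> float:
    """Return the fraction of known-malicious samples labelled benign."""
    samples = list(malicious_samples)
    if not samples:
        raise ValueError("need at least one sample")
    missed = sum(1 for sample in samples if not classify(sample))
    return missed / len(samples)


def compare_attacks(
    classifiers: Dict[str, Classifier],
    attack_sets: Dict[str, Iterable[str]],
) -> Dict[str, Dict[str, float]]:
    """Build a classifier -> attack -> evasion-rate table."""
    # Materialise each sample set once so it can be reused per classifier.
    materialised = {name: list(samples) for name, samples in attack_sets.items()}
    return {
        clf_name: {
            attack: evasion_rate(clf, samples)
            for attack, samples in materialised.items()
        }
        for clf_name, clf in classifiers.items()
    }


if __name__ == "__main__":
    # Toy stand-in: "detects" a sample only if its identifier contains "js".
    toy_classifier: Classifier = lambda sample: "js" in sample
    # Hypothetical sample identifiers, not real data or results.
    toy_sets = {
        "original": ["js-1", "js-2"],
        "mutated": ["js-3", "plain-4"],
    }
    print(compare_attacks({"toy": toy_classifier}, toy_sets))
```

A higher evasion rate on a mutated set than on the original set would indicate that the attack weakens the classifier. The toy values here say nothing about the paper's actual findings.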

Supplementary Material

MP4 File (codaspy_pdfmalware_vid.mp4)
In this video, we introduce our paper, Towards Robust Detection of PDF-based Malware. In the first part of the video, we highlight the prevalence of PDFs in enterprise systems and how adversaries have picked up on the trend and devised methods to propagate malware through PDFs. We then describe the methodology, in which we use the machine learning-based PDF classifier frameworks Hidost and Mimicus to classify both original malware and malware manipulated by the three adversarial attacks: Mimicry, Mimicry+, and Reverse Mimicry. Next, we show how the classification accuracy of Hidost and Mimicus was affected by the adversarial PDFs and discuss our analysis of the results, followed by possible improvements and our concluding statement.


Cited By

  • (2022) PDF Malware Detection Based on Optimizable Decision Trees. Electronics 11:19 (3142). https://doi.org/10.3390/electronics11193142. Online publication date: 30-Sep-2022

Published In

CODASPY '22: Proceedings of the Twelfth ACM Conference on Data and Application Security and Privacy
April 2022
392 pages
ISBN:9781450392204
DOI:10.1145/3508398
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. adversarial attacks
  2. machine learning
  3. pdf malware

Qualifiers

  • Poster

Conference

CODASPY '22

Acceptance Rates

Overall Acceptance Rate 149 of 789 submissions, 19%
