DOI: 10.1145/3634737.3656287

Towards Robust Domain Generation Algorithm Classification

Published: 01 July 2024

Abstract

In this work, we conduct a comprehensive study of the robustness of domain generation algorithm (DGA) classifiers. We implement 32 white-box attacks, 19 of which are highly effective and induce a false-negative rate (FNR) of ≈ 100% on unhardened classifiers. To defend the classifiers, we evaluate different hardening approaches and propose a novel training scheme that leverages adversarial latent-space vectors and discretized adversarial domains to significantly improve robustness. We further highlight a pitfall to avoid when hardening classifiers and uncover training biases that attackers can easily exploit to bypass detection, but which can be mitigated by adversarial training (AT). We do not observe any trade-off between robustness and performance; on the contrary, hardening improves a classifier's detection performance for known and unknown DGAs. We implement all attacks and defenses discussed in this paper as a standalone library, which we make publicly available to facilitate the hardening of DGA classifiers.
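
To make the hardening idea concrete, the following is a minimal, hypothetical sketch (not the authors' published training scheme or library) of one common way to combine adversarial latent-space vectors with discretized adversarial domains for a character-level DGA classifier: embeddings are perturbed with a single FGSM-style step, mapped back to the nearest valid characters, and the resulting domains are added to the training batch. All names (CharCNN, adv_train_step, the eps value, and so on) are illustrative assumptions.

# Illustrative sketch only -- NOT the authors' published training scheme or library.
# It shows one plausible realization of adversarial training in the embedding space
# with discretization back to valid domains for a character-level DGA classifier.

import string
import torch
import torch.nn as nn
import torch.nn.functional as F

ALPHABET = string.ascii_lowercase + string.digits + "-."   # valid domain characters
CHAR2IDX = {c: i + 1 for i, c in enumerate(ALPHABET)}      # 0 is reserved for padding
MAX_LEN = 63                                               # max label length per RFC 1035

def encode(domain: str) -> torch.Tensor:
    """Map a domain string to a fixed-length index tensor."""
    idx = [CHAR2IDX.get(c, 0) for c in domain.lower()[:MAX_LEN]]
    idx += [0] * (MAX_LEN - len(idx))
    return torch.tensor(idx)

class CharCNN(nn.Module):
    """Minimal character-level binary classifier (benign vs. DGA)."""
    def __init__(self, vocab_size=len(ALPHABET) + 1, emb_dim=32):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.conv = nn.Conv1d(emb_dim, 64, kernel_size=3, padding=1)
        self.fc = nn.Linear(64, 1)

    def forward_from_embedding(self, e):          # e: (batch, MAX_LEN, emb_dim)
        h = F.relu(self.conv(e.transpose(1, 2)))  # (batch, 64, MAX_LEN)
        return self.fc(h.max(dim=2).values).squeeze(1)

    def forward(self, x):                         # x: (batch, MAX_LEN) of indices
        return self.forward_from_embedding(self.emb(x))

def adv_train_step(model, opt, x, y, eps=0.5):
    """One FGSM-style step: perturb embeddings, discretize to nearest characters,
    then train on the resulting adversarial domains plus the clean batch."""
    model.train()
    e = model.emb(x).detach().requires_grad_(True)
    loss = F.binary_cross_entropy_with_logits(model.forward_from_embedding(e), y)
    grad, = torch.autograd.grad(loss, e)
    e_adv = e + eps * grad.sign()                  # adversarial latent-space vectors
    with torch.no_grad():                          # discretize: nearest embedding row
        dists = torch.cdist(e_adv.reshape(-1, e_adv.size(-1)), model.emb.weight)
        x_adv = dists.argmin(dim=1).reshape(x.shape)
        x_adv[x == 0] = 0                          # keep padding positions untouched
    opt.zero_grad()
    total = (F.binary_cross_entropy_with_logits(model(x), y)
             + F.binary_cross_entropy_with_logits(model(x_adv), y))
    total.backward()
    opt.step()
    return total.item()

if __name__ == "__main__":
    model = CharCNN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.stack([encode("example.com"), encode("xjw3kqpz0fd.net")])
    y = torch.tensor([0.0, 1.0])                   # 0 = benign, 1 = DGA-generated
    print(adv_train_step(model, opt, x, y))

Single-step FGSM is used here only to keep the sketch short; multi-step (PGD-style) perturbations are a common alternative when hardening classifiers via adversarial training.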



Published In

ASIA CCS '24: Proceedings of the 19th ACM Asia Conference on Computer and Communications Security
July 2024
1987 pages
ISBN: 9798400704826
DOI: 10.1145/3634737


Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. domain generation algorithm (DGA)
  2. bot detection
  3. deep learning
  4. adversarial machine learning
  5. adversarial attacks
  6. robustness

Qualifiers

  • Research-article

Conference

ASIA CCS '24

Acceptance Rates

Overall Acceptance Rate 418 of 2,322 submissions, 18%



Article Metrics

  • Total Citations: 0
  • Total Downloads: 112
  • Downloads (Last 12 months): 112
  • Downloads (Last 6 weeks): 25
Reflects downloads up to 01 Nov 2024
