Deep Learning for Zero-day Malware Detection and Classification: A Survey

Authors: Fatemeh Deldar, Mahdi Abadi

ACM Computing Surveys, Volume 56, Issue 2

Article No.: 36, Pages 1 - 37

https://doi.org/10.1145/3605775

Published: 15 September 2023

Abstract

Zero-day malware is malware that has never been seen before or is so new that no anti-malware software can yet detect it. This novelty, together with the lack of existing mitigation strategies, makes zero-day malware challenging to detect and defend against. In recent years, deep learning has become the leading branch of machine learning in many research fields, including malware detection. Given the significant threat that zero-day malware poses to cybersecurity and business continuity, it is necessary to identify deep learning techniques that can be effective in detecting or classifying such malware. To date, however, no comprehensive review of these techniques has been conducted. In this article, we study deep learning techniques in terms of their ability to detect or classify zero-day malware. Based on our findings, we propose a taxonomy that divides zero-day-resistant deep malware detection and classification techniques into four main categories: unsupervised, semi-supervised, few-shot, and adversarial-resistant. We compare the techniques in each category across several factors, including deep learning architecture, feature encoding, platform, detection or classification functionality, and whether the authors performed a zero-day evaluation. We also summarize the reviewed papers and discuss their main characteristics and challenges.

References

[1]

Faranak Abri, Sima Siami-Namini, Mahdi Adl Khanghah, Fahimeh Mirza Soltani, and Akbar Siami Namin. 2019. Can machine/deep learning classifiers detect zero-day malware with high accuracy? In Proceedings of the IEEE International Conference on Big Data. IEEE, 3252–3259.

[2]

Ahmed Abusnaina, Mohammed Abuhamad, Hisham Alasmary, Afsah Anwar, Rhongho Jang, Saeed Salem, Daehun Nyang, and David Mohaisen. 2022. DL-FHMC: Deep learning-based fine-grained hierarchical learning approach for robust malware classification. IEEE Trans. Depend. Sec. Comput. 19, 5 (Sept.2022), 3432–3447.

[3]

Ahmed Abusnaina, Hisham Alasmary, Mohammed Abuhamad, Saeed Salem, DaeHun Nyang, and Aziz Mohaisen. 2019. Subgraph-based adversarial examples against graph-based IoT malware detection systems. In Computational Data and Social Networks, Andrea Tagarelli and Hanghang Tong (Eds.). Springer, 268–281.

[4]

Ahmed Abusnaina, Aminollah Khormali, Hisham Alasmary, Jeman Park, Afsah Anwar, and Aziz Mohaisen. 2019. Adversarial learning attacks on graph-based IoT malware detection systems. In Proceedings of the IEEE 39th International Conference on Distributed Computing Systems. IEEE, 1296–1305.

[5]

Abdullah Al-Dujaili, Alex Huang, Erik Hemberg, and Una-May O’Reilly. 2018. Adversarial deep learning for robust detection of binary encoded malware. In Proceedings of the IEEE Security and Privacy Workshops. IEEE, 76–82.

[6]

Kevin Allix, Tegawendé F. Bissyandé, Jacques Klein, and Yves Le Traon. 2016. AndroZoo: Collecting millions of Android apps for the research community. In Proceedings of the 13th International Conference on Mining Software Repositories. ACM, 468–471.

Digital Library

[7]

Hyrum S. Anderson, Anant Kharkar, Bobby Filar, and Phil Roth. 2017. Evading Machine Learning Malware Detection. Retrieved from https://blackhat.com.

[8]

Hyrum S. Anderson and Phil Roth. 2018. EMBER: An open dataset for training static PE malware machine learning models. arXiv:1804.04637. Retrieved from https://arxiv.org/abs/1804.04637.

[9]

Daniel Arp, Michael Spreitzenbarth, Malte Hübner, Hugo Gascon, and Konrad Rieck. 2014. Drebin: Effective and explainable detection of Android malware in your pocket. In Proceedings of the 21st Network and Distributed System Security Symposium. Internet Society.

[10]

Anish Athalye, Nicholas Carlini, and David Wagner. 2018. Obfuscated gradients give a false sense of security: Circumventing defenses to adversarial examples. In Proceedings of the 35th International Conference on Machine Learning. PMLR, 274–283.

[11]

AV-TEST. 2022. Malware Statistics & Trends Report. Retrieved from https://www.av-test.org/en/statistics/malware.

[12]

Vitalii Avdiienko, Konstantin Kuznetsov, Alessandra Gorla, Andreas Zeller, Steven Arzt, Siegfried Rasthofer, and Eric Bodden. 2015. Mining apps for abnormal usage of sensitive data. In Proceedings of the 37th IEEE International Conference on Software Engineering. IEEE, 426–436.

[13]

Dor Bank, Noam Koenigstein, and Raja Giryes. 2021. Autoencoders. arXiv:2003.05991. Retrieved from https://arxiv.org/abs/2003.05991.

[14]

Boaz Barak. 2016. Hopes, fears, and software obfuscation. Commun. ACM 59, 3 (March2016), 88–96.

[15]

Pedro H. Barros, Eduarda T. C. Chagas, Leonardo B. Oliveira, Fabiane Queiroz, and Heitor S. Ramos. 2022. Malware–SMELL: A zero-shot learning strategy for detecting zero-day vulnerabilities. Comput. Secur. 120 (Sept.2022), 102785.

Digital Library

[16]

Leyla Bilge and Tudor Dumitraş. 2012. Before we knew it: An empirical study of zero-day attacks in the real world. In Proceedings of the ACM Conference on Computer and Communications Security. ACM, 833–844.

Digital Library

[17]

Hamid Bostani and Veelasha Moonsamy. 2023. EvadeDroid: A practical evasion attack on machine learning for black-box Android malware detection. arXiv:2110.03301. Retrieved from https://arxiv.org/abs/2110.03301.

[18]

Julian Busch, Anton Kocheturov, Volker Tresp, and Thomas Seidl. 2021. NF-GNN: Network flow graph neural networks for malware detection and classification. In Proceedings of the 33rd International Conference on Scientific and Statistical Database Management. ACM, 121–132.

Digital Library

[19]

Cagatay Catal, Görkem Giray, and Bedir Tekinerdogan. 2022. Applications of deep learning for mobile malware detection: A systematic literature review. Neural Comput. Appl. 34, 2 (Jan.2022), 1007–1032.

Digital Library

[20]

Junyi Chai, Hao Zeng, Anming Li, and Eric W. T. Ngai. 2021. Deep learning in computer vision: A critical review of emerging techniques and application scenarios. Mach. Learn. Appl. 6 (Dec.2021), 100134.

[21]

Yuhan Chai, Lei Du, Jing Qiu, Lihua Yin, and Zhihong Tian. 2023. Dynamic prototype network based on sample adaptation for few-shot malware detection. IEEE Trans. Knowl. Data Eng. 35, 5 (May2023), 4754–4766.

Digital Library

[22]

Bingcai Chen, Zhongru Ren, Chao Yu, Iftikhar Hussain, and Jintao Liu. 2019. Adversarial examples for CNN-based malware detectors. IEEE Access 7 (May2019), 54360–54371.

[23]

Jianbo Chen, Michael I. Jordan, and Martin J. Wainwright. 2020. HopSkipJumpAttack: A query-efficient decision-based attack. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE, 1277–1294.

[24]

Wei-Yu Chen, Yen-Cheng Liu, Zsolt Kira, Yu-Chiang Frank Wang, and Jia-Bin Huang. 2019. A closer look at few-shot classification. In Proceedings of the 7th International Conference on Learning Representations. OpenReview.net.

[25]

Davide Chicco. 2021. Siamese neural networks: An overview. In Artificial Neural Networks, Hugh Cartwright (Ed.). Humana Press, 73–94.

[26]

Prakash Mandayam Comar, Lei Liu, Sabyasachi Saha, Pang-Ning Tan, and Antonio Nucci. 2013. Combining supervised and unsupervised learning for zero-day malware detection. In Proceedings of IEEE International Conference on Computer Communications (INFOCOM). IEEE, 2022–2030.

[27]

Mauro Conti, Shubham Khandhar, and P. Vinod. 2022. A few-shot malware classification approach for unknown family recognition using malware feature visualization. Comput. Secur. 122 (Nov.2022), 102887.

Digital Library

[28]

Ilay Cordonsky, Ishai Rosenberg, Guillaume Sicard, and Eli Omid David. 2018. DeepOrigin: End-to-end deep learning for detection of new malware families. In Proceedings of the International Joint Conference on Neural Networks. IEEE, 1–7.

[29]

Khanh Huu The Dam, Thomas Given-Wilson, and Axel Legay. 2021. Unsupervised behavioural mining and clustering for malware family identification. In Proceedings of the 36th Annual ACM Symposium on Applied Computing. ACM, 374–383.

Digital Library

[30]

Anusha Damodaran, Fabio Di Troia, Corrado Aaron Visaggio, Thomas H. Austin, and Mark Stamp. 2017. A comparison of static, dynamic, and hybrid analysis for malware detection. J. Comput. Virol. Hack. Tech. 13, 1 (Feb.2017), 1–12.

[31]

Angelo Schranko de Oliveira and Renato José Sassi. 2021. Behavioral malware detection using deep graph convolutional neural networks. Int. J. Comput. Appl. 174, 29 (April2021), 1–8.

[32]

Fatemeh Deldar, Mahdi Abadi, and Mohammad Ebrahimifard. 2022. Android malware detection using one-class graph neural networks. ISC Int. J. Inf. Secur. 14, 3 (Nov.2022), 51–59.

[33]

Luca Demetrio, Battista Biggio, Giovanni Lagorio, Fabio Roli, and Alessandro Armando. 2019. Explaining vulnerabilities of deep learning to adversarial malware binaries. In Proceedings of the 3rd Italian Conference on Cyber Security. CEUR-WS.org.

[34]

Li Deng and Dong Yu. 2014. Deep learning: Methods and applications. Found. Trends Sign. Process. 7, 3–4 (June2014), 197–387.

Digital Library

[35]

Ming Fan, Jun Liu, Xiapu Luo, Kai Chen, Zhenzhou Tian, Qinghua Zheng, and Ting Liu. 2018. Android malware familial classification and representative sample selection via frequent subgraph analysis. IEEE Trans. Inf. Forens. Secur. 13, 8 (Aug.2018), 1890–1905.

[36]

Li Fei-Fei, Rob Fergus, and Pietro Perona. 2006. One-shot learning of object categories. IEEE Trans. Pattern Anal. Mach. Intell. 28, 4 (April2006), 594–611.

Digital Library

[37]

Xianwei Gao, Changzhen Hu, Chun Shan, Baoxu Liu, Zequn Niu, and Hui Xie. 2020. Malware classification for the cloud via semi-supervised transfer learning. J. Inf. Secur. Appl. 55 (Dec.2020), 102661.

[38]

Daniel Gibert, Carles Mateu, and Jordi Planes. 2020. The rise of machine learning for detection and classification of malware: Research developments, trends and challenges. J. Netw. Comput. Appl. 153 (March2020), 102526.

Digital Library

[39]

Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2020. Generative adversarial networks. Commun. ACM 63, 11 (Nov.2020), 139–144.

Digital Library

[40]

Ian Goodfellow, Jonathon Shlens, and Christian Szegedy. 2015. Explaining and harnessing adversarial examples. In Proceedings of the 3rd International Conference on Learning Representations.

[41]

Kathrin Grosse, Nicolas Papernot, Praveen Manoharan, Michael Backes, and Patrick McDaniel. 2016. Adversarial perturbations against deep neural networks for malware classification. arXiv:1606.04435. Retrieved from https://arxiv.org/abs/1606.04435.

[42]

Kathrin Grosse, Nicolas Papernot, Praveen Manoharan, Michael Backes, and Patrick McDaniel. 2017. Adversarial examples for malware detection. In Computer Security, Simon N. Foley, Dieter Gollmann, and Einar Snekkenes (Eds.). Springer, 62–79.

[43]

Jiuxiang Gu, Zhenhua Wang, Jason Kuen, Lianyang Ma, Amir Shahroudy, Bing Shuai, Ting Liu, Xingxing Wang, Gang Wang, Jianfei Cai, and Tsuhan Chen. 2018. Recent advances in convolutional neural networks. Pattern Recogn. 77 (May2018), 354–377.

Digital Library

[44]

William Hardy, Lingwei Chen, Shifu Hou, Yanfang Ye, and Xin Li. 2016. DL4MD: A deep learning framework for intelligent malware detection. In Proceedings of the 12th International Conference on Data Mining. CSREA Press, 61–67.

[45]

Geoffrey E. Hinton. 2009. Deep belief networks. Scholarpedia 4, 5 (May2009), 5947.

[46]

R. Devon Hjelm, Athul Paul Jacob, Tong Che, Adam Trischler, Kyunghyun Cho, and Yoshua Bengio. 2018. Boundary-seeking generative adversarial networks. In Proceedings of the 6th International Conference on Learning Representations. OpenReview.net.

[47]

Shifu Hou, Aaron Saas, Lifei Chen, and Yanfang Ye. 2016. Deep4MalDroid: A deep learning framework for Android malware detection based on Linux kernel system call graphs. In Proceedings of the IEEE/WIC/ACM International Conference on Web Intelligence Workshops. IEEE, 104–111.

[48]

Weiwei Hu and Ying Tan. 2017. Generating adversarial malware examples for black-box attacks based on GAN. arXiv:1702.05983. Retrieved from https://arxiv.org/abs/1702.05983.

[49]

Weiwei Hu and Ying Tan. 2018. Black-box attacks against RNN-based malware detection algorithms. In Proceedings of the AAAI Workshop on Artificial Intelligence for Cyber Security. AAAI Press, 245–251.

[50]

Christian Janiesch, Patrick Zschech, and Kai Heinrich. 2021. Machine learning and deep learning. Electr. Mark. 31, 3 (Sept.2021), 685–695.

[51]

Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. 1996. Reinforcement learning: A survey. J. Artif. Intell. Res. 4 (May1996), 237–285.

[52]

Omid Kargarnovin, Amir Mahdi Sadeghzadeh, and Rasool Jalili. 2022. Mal2GCN: A robust malware detection approach using deep graph convolutional networks with non-negative weights. arXiv:2108.12473. Retrieved from https://arxiv.org/abs/2108.12473.

[53]

Aminollah Khormali, Ahmed Abusnaina, Songqing Chen, DaeHun Nyang, and Aziz Mohaisen. 2019. COPYCAT: Practical adversarial attacks on visualization-based malware detection. arXiv:1909.09735. Retrieved from https://arxiv.org/abs/1909.09735.

[54]

Youngjoon Ki, Eunjin Kim, and Huy Kang Kim. 2015. A novel approach to detect malware based on API call sequence analysis. Int. J. Distrib. Sens. Netw. 11, 6 (June2015), 659101.

Digital Library

[55]

Chiho Kim, Sang-Yoon Chang, Jonghyun Kim, Dongeun Lee, and Jinoh Kim. 2021. Zero-day malware detection using threshold-free autoencoding architecture. In Proceedings of the IEEE International Conference on Big Data. IEEE, 1279–1284.

[56]

Chiho Kim, Sang-Yoon Chang, Jonghyun Kim, Dongeun Lee, and Jinoh Kim. 2023. Automated, reliable zero-day malware detection based on autoencoding architecture. IEEE Trans. Netw. Serv. Manag. (March 2023), 16 pages. Early Access.

[57]

Jin-Young Kim, Seok-Jun Bu, and Sung-Bae Cho. 2018. Zero-day malware detection using transferred generative adversarial networks based on deep autoencoders. Inf. Sci. 460–461 (Sept.2018), 83–102.

Digital Library

[58]

Jin-Young Kim and Sung-Bae Cho. 2022. Obfuscated malware detection using deep generative model based on global/local features. Comput. Secur. 112 (Jan.2022), 102501.

Digital Library

[59]

Thomas N. Kipf and Max Welling. 2017. Semi-supervised classification with graph convolutional networks. In Proceedings of the 5th International Conference on Learning Representations. OpenReview.net.

[60]

Teuvo Kohonen. 2013. Essentials of the self-organizing map. Neural Netw. 37 (Jan.2013), 52–65.

Digital Library

[61]

Felix Kreuk, Assi Barak, Shir Aviv-Reuven, Moran Baruch, Benny Pinkas, and Joseph Keshet. 2019. Deceiving end-to-end deep learning malware detectors using adversarial examples. arXiv:1802.04528. Retrieved from https://arxiv.org/abs/1802.04528.

[62]

Tom Landman and Nir Nissim. 2021. Deep-Hook: A trusted deep learning-based framework for unknown malware detection and classification in Linux cloud environments. Neural Netw. 144 (Dec.2021), 648–685.

Digital Library

[63]

Martin Längkvist, Lars Karlsson, and Amy Loutfi. 2014. A review of unsupervised feature learning and deep learning for time-series modeling. Pattern Recogn. Lett. 42 (June2014), 11–24.

[64]

Ralph Langner. 2011. Stuxnet: Dissecting a cyberwarfare weapon. IEEE Secur. Priv. 9, 3 (May2011), 49–51.

Digital Library

[65]

Arash Habibi Lashkari, Andi Fitriah A. Kadir, Laya Taheri, and Ali A. Ghorbani. 2018. Toward developing a systematic approach to generate benchmark Android malware datasets and classification. In Proceedings of the 2018 International Carnahan Conference on Security Technology. IEEE, 1–7.

[66]

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. 2015. Deep learning. Nature 521, 7553 (May2015), 436–444.

[67]

Dong-Hyun Lee. 2013. Pseudo-Label: The simple and efficient semi-supervised learning method for deep neural networks. In Proceedings of the ICML Workshop on Challenges in Representation Learning. JMLR.org.

[68]

Scott Levy and Jedidiah R. Crandall. 2020. The program with a personality: Analysis of Elk Cloner, the first personal computer virus. arXiv:2007.15759. Retrieved from https://arxiv.org/abs/2007.15759.

[69]

Deqiang Li, Qianmu Li, Yanfang Ye, and Shouhuai Xu. 2021. A framework for enhancing deep neural networks against adversarial malware. IEEE Trans. Netw. Sci. Eng. 8, 1 (Jan.2021), 736–750.

[70]

Heng Li, ShiYao Zhou, Wei Yuan, Jiahuan Li, and Henry Leung. 2020. Adversarial-example attacks toward Android malware detection system. IEEE Syst. J. 14, 1 (March2020), 653–656.

[71]

Qian Li, Qingyuan Hu, Yong Qi, Saiyu Qi, Xinxing Liu, and Pengfei Gao. 2021. Semi-supervised two-phase familial analysis of Android malware with normalized graph embedding. Knowl. Based Syst. 218 (April2021), 106802.

Digital Library

[72]

Yuxi Li. 2018. Deep reinforcement learning: An overview. arXiv:1701.07274. Retrieved from https://arxiv.org/abs/1701.07274.

[73]

Chen Liu, Bo Li, Jun Zhao, Ming Su, and Xu-Dong Liu. 2021. MG-DVD: A real-time framework for malware variant detection based on dynamic heterogeneous graph learning. In Proceedings of the 30th International Joint Conference on Artificial Intelligence. 1512–1519.

[74]

Chen Liu, Bo Li, Jun Zhao, Ziyang Zhen, Xudong Liu, and Qunshi Zhang. 2022. FewM-HGCL: Few-shot malware variants detection via heterogeneous graph contrastive learning. IEEE Trans. Depend. Sec. Comput. (Oct. 2022), 18 pages. Early Access.

[75]

Yue Liu, Chakkrit Tantithamthavorn, Li Li, and Yepang Liu. 2023. Deep learning for Android malware defenses: A systematic literature review. ACM Comput. Surv. 55, 8, Article 153 (Aug.2023), 36 pages.

Digital Library

[76]

Zhen Liu, Ruoyu Wang, Nathalie Japkowicz, Deyu Tang, Wenbin Zhang, and Jie Zhao. 2021. Research on unsupervised feature learning for Android malware detection based on restricted Boltzmann machines. Fut. Gener. Comput. Syst. 120 (July2021), 91–108.

[77]

Samaneh Mahdavifar, Andi Fitriah Abdul Kadir, Rasool Fatemi, Dima Alhadidi, and Ali A. Ghorbani. 2020. Dynamic Android malware category classification using semi-supervised deep learning. In Proceedings of the 18th IEEE International Conference on Dependable, Autonomic and Secure Computing. IEEE, 515–522.

[78]

Samaneh Mahdavifar, Dima Alhadidi, and Ali. A. Ghorbani. 2022. Effective and efficient hybrid Android malware classification using pseudo-label stacked auto-encoder. J. Netw. Syst. Manag. 30, 1, Article 22 (Jan.2022), 34 pages.

Digital Library

[79]

MaleVis Dataset. 2019. MaleVis: A Dataset for Vision Based Malware Recognition. Retrieved from https://web.cs.hacettepe.edu.tr/selman/malevis.

[80]

Mandiant. 2022. M-Trends 2022: Mandiant Special Report. Retrieved from https://www.mandiant.com/media/15671.

[81]

Alejandro Martín, Raúl Lara-Cabrera, and David Camacho. 2019. Android malware detection through hybrid features fusion and ensemble classifiers: The AndroPyTool framework and the OmniDroid dataset. Inf. Fusion 52 (Dec.2019), 128–142.

Digital Library

[82]

Yair Meidan, Michael Bohadana, Yael Mathov, Yisroel Mirsky, Asaf Shabtai, Dominik Breitenbacher, and Yuval Elovici. 2018. N-BaIoT–network-based detection of IoT botnet attacks using deep autoencoders. IEEE Perv. Comput. 17, 3 (July2018), 12–22.

Digital Library

[83]

Stuart Millar, Niall McLaughlin, Jesus Martinez del Rincon, and Paul Miller. 2021. Multi-view deep learning for zero-day Android malware detection. J. Inf. Secur. Appl. 58 (May2021), 102718.

[84]

Michael Glen Miller. 2018. Are We Protected Yet? Developing a Machine Learning Detection System to Combat Zero-day Malware Attacks. Master’s Thesis. Utica University, Utica, NY.

[85]

Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. Playing Atari with deep reinforcement learning. In Proceedings of the NIPS Workshop on Deep Learning. 1–9.

[86]

Zahra Moti, Sattar Hashemi, Hadis Karimipour, Ali Dehghantanha, Amir Namavar Jahromi, Lida Abdi, and Fatemeh Alavi. 2021. Generative adversarial network to detect unseen Internet of Things malware. Ad Hoc Netw. 122 (Nov.2021), 102591.

Digital Library

[87]

Fionn Murtagh. 1991. Multilayer perceptrons for classification and regression. Neurocomputing 2, 5 (July1991), 183–197.

[88]

Lakshmanan Nataraj, Sreejith Karthikeyan, Gregoire Jacob, and Bangalore S. Manjunath. 2011. Malware images: Visualization and automatic classification. In Proceedings of the 8th International Symposium on Visualization for Cyber Security. ACM, Article 4, 7 pages.

Digital Library

[89]

Huu Noi Nguyen, Van Cuong Nguyen, Nguyen Ngoc Tran, and Van Loi Cao. 2021. Feature representation of AutoEncoders for unsupervised IoT malware detection. In Future Data and Security Engineering, Tran Khanh Dang, Josef Küng, Tai M. Chung, and Makoto Takizawa (Eds.). Springer, 272–290.

Digital Library

[90]

Mathilde Ollivier, Sébastien Bardin, Richard Bonichon, and Jean-Yves Marion. 2019. Obfuscation: Where are we in anti-DSE protections? (A first attempt). In Proceedings of the 9th Workshop on Software Security, Protection, and Reverse Engineering. ACM, Article 5, 8 pages.

Digital Library

[91]

Lucky Onwuzurike, Enrico Mariconti, Panagiotis Andriotis, Emiliano De Cristofaro, Gordon Ross, and Gianluca Stringhini. 2019. MaMaDroid: Detecting Android malware by building Markov chains of behavioral models (extended version). ACM Trans. Priv. Secur. 22, 2, Article 14 (May2019), 34 pages.

Digital Library

[92]

Daniel W. Otter, Julian R. Medina, and Jugal K. Kalita. 2021. A survey of the usages of deep learning for natural language processing. IEEE Trans. Neural Netw. Learn. Syst. 32, 2 (Feb.2021), 604–624.

[93]

Guansong Pang, Chunhua Shen, Longbing Cao, and Anton Van Den Hengel. 2022. Deep learning for anomaly detection: A review. ACM Comput. Surv. 54, 2, Article 38 (March2022), 38 pages.

Digital Library

[94]

Nicolas Papernot, Patrick McDaniel, Somesh Jha, Matt Fredrikson, Z. Berkay Celik, and Ananthram Swami. 2016. The limitations of deep learning in adversarial settings. In Proceedings of the IEEE European Symposium on Security and Privacy. IEEE, 372–387.

[95]

Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. 2016. Distillation as a defense to adversarial perturbations against deep neural networks. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE, 582–597.

[96]

Razvan Pascanu, Tomas Mikolov, and Yoshua Bengio. 2013. On the difficulty of training recurrent neural networks. In Proceedings of the 30th International Conference on Machine Learning, Vol. 28. PMLR, 1310–1318.

[97]

Xiaowei Peng, Hequn Xian, Qian Lu, and Xiuqing Lu. 2021. Semantics aware adversarial malware examples generation for black-box attacks. Appl. Soft Comput. 109 (Sept.2021), 107506.

Digital Library

[98]

Rachel Petrik, Berat Arik, and Jared M. Smith. 2018. Towards architecture and OS-independent malware detection via memory forensics. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. ACM, 2267–2269.

Digital Library

[99]

Fabio Pierazzi, Feargus Pendlebury, Jacopo Cortellazzi, and Lorenzo Cavallaro. 2020. Intriguing properties of adversarial ML attacks in the problem space. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE, 1332–1349.

[100]

Maria F. Prevezianou. 2021. WannaCry as a creeping crisis. In Understanding the Creeping Crisis, Arjen Boin, Magnus Ekengren, and Mark Rhinard (Eds.). Springer, 37–50.

[101]

Junyang Qiu, Jun Zhang, Wei Luo, Lei Pan, Surya Nepal, and Yang Xiang. 2021. A survey of Android malware detection with deep neural models. ACM Comput. Surv. 53, 6, Article 126 (Nov.2021), 36 pages.

Digital Library

[102]

Edward Raff, Jon Barker, Jared Sylvester, Robert Brandon, Bryan Catanzaro, and Charles K. Nicholas. 2018. Malware detection by eating a whole EXE. In Proceedings of the AAAI Workshop on Artificial Intelligence for Cyber Security. AAAI Press, 268–276.

[103]

Abir Rahali, Arash Habibi Lashkari, Gurdip Kaur, Laya Taheri, Francois Gagnon, and Frédéric Massicotte. 2020. DIDroid: Android malware classification and characterization using deep image learning. In Proceedings of the 10th International Conference on Communication and Network Security. ACM, 70–82.

Digital Library

[104]

Royi Ronen, Marian Radu, Corina Feuerstein, Elad Yom-Tov, and Mansour Ahmadi. 2018. Microsoft malware classification challenge. arXiv:1802.10135. Retrieved from https://arxiv.org/abs/1802.10135.

[105]

Ishai Rosenberg, Asaf Shabtai, Yuval Elovici, and Lior Rokach. 2022. Adversarial machine learning attacks and defense methods in the cyber security domain. ACM Comput. Surv. 54, 5, Article 108 (June2022), 36 pages.

Digital Library

[106]

JD Rudie, Zach Katz, Sam Kuhbander, and Suman Bhunia. 2021. Technical analysis of the NSO group’s Pegasus spyware. In Proceedings of the International Conference on Computational Science and Computational Intelligence. IEEE, 747–752.

[107]

Lukas Ruff, Robert A. Vandermeulen, Nico Görnitz, Alexander Binder, Emmanuel Müller, Klaus-Robert Müller, and Marius Kloft. 2020. Deep semi-supervised anomaly detection. In Proceedings of the 8th International Conference on Learning Representations. OpenReview.net.

[108]

Hojjat Salehinejad, Sharan Sankar, Joseph Barfett, Errol Colak, and Shahrokh Valaee. 2018. Recent advances in recurrent neural networks. arXiv:1801.01078. Retrieved from https://arxiv.org/abs/1801.01078.

[109]

Adam Santoro, Sergey Bartunov, Matthew Botvinick, Daan Wierstra, and Timothy Lillicrap. 2016. Meta-learning with memory-augmented neural networks. In Proceedings of the 33rd International Conference on Machine Learning, Vol. 48. PMLR, 1842–1850.

[110]

Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. 2009. The graph neural network model. IEEE Trans. Neural Netw. 20, 1 (Jan.2009), 61–80.

Digital Library

[111]

Giorgio Severi, Tim Leek, and Brendan Dolan-Gavitt. 2018. Malrec: Compact full-trace malware recording for retrospective deep analysis. In Detection of Intrusions and Malware, and Vulnerability Assessment, Cristiano Giuffrida, Sébastien Bardin, and Gregory Blanc (Eds.). Springer, 3–23.

[112]

Jake Snell, Kevin Swersky, and Richard S. Zemel. 2017. Prototypical networks for few-shot learning. In Proceedings of the 31st International Conference on Neural Information Processing Systems. Curran Associates Inc., 4080–4090.

Digital Library

[113]

Wei Song, Xuezixiang Li, Sadia Afroz, Deepali Garg, Dmitry Kuznetsov, and Heng Yin. 2022. MAB-Malware: A reinforcement learning framework for blackbox generation of adversarial malware. In Proceedings of the ACM on Asia Conference on Computer and Communications Security. ACM, 990–1003.

Digital Library

[114]

Yisheng Song, Ting Wang, Puyu Cai, Subrota K Mondal, and Jyoti Prakash Sahoo. 2023. A comprehensive survey of few-shot learning: Evolution, applications, challenges, and opportunities. ACM Comput. Surv. 55, 13s, Article 271 (Dec. 2023), 40 pages.

Digital Library

[115]

Octavian Suciu, Scott E. Coull, and Jeffrey Johns. 2019. Exploring adversarial examples in malware detection. In Proceedings of the IEEE Security and Privacy Workshops. IEEE, 8–14.

[116]

Mukund Sundararajan, Ankur Taly, and Qiqi Yan. 2017. Axiomatic attribution for deep networks. In Proceedings of the 34th International Conference on Machine Learning, Vol. 70. PMLR, 3319–3328.

[117]

Laya Taheri, Andi Fitriah Abdul Kadir, and Arash Habibi Lashkari. 2019. Extensible Android malware detection and family classification using network-flows and API-calls. In Proceedings of the International Carnahan Conference on Security Technology. IEEE, 1–8.

[118]

Rahim Taheri, Reza Javidan, Mohammad Shojafar, Zahra Pooranian, Ali Miri, and Mauro Conti. 2020. On defending against label flipping attacks on malware detection systems. Neural Comput. Appl. 32, 18 (Sept.2020), 14781–14800.

Digital Library

[119]

Asghar Tajoddin and Mahdi Abadi. 2019. RAMD: Registry-based anomaly malware detection using one-class ensemble classifiers. Appl. Intell. 49, 7 (July2019), 2641–2658.

Digital Library

[120]

Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Ian Goodfellow, Dan Boneh, and Patrick McDaniel. 2018. Ensemble adversarial training: Attacks and defenses. In Proceedings of the 6th International Conference on Learning Representations. OpenReview.net.

[121]

Trung Kien Tran, Hiroshi Sato, and Masao Kubo. 2018. One-shot learning approach for unknown malware classification. In Proceedings of the 5th Asian Conference on Defense Technology. IEEE, 8–13.

[122]

Trung Kien Tran, Hiroshi Sato, and Masao Kubo. 2019. Image-based unknown malware classification with few-shot learning models. In Proceedings of the 7th International Symposium on Computing and Networking Workshops. IEEE, 401–407.

[123]

Daniele Ucci, Leonardo Aniello, and Roberto Baldoni. 2019. Survey of machine learning techniques for malware analysis. Comput. Secur. 81 (March2019), 123–147.

[124]

Håvard Vegge, Finn Michael Halvorsen, Rune Walsø Nergård, Martin Gilje Jaatun, and Jostein Jensen. 2009. Where only fools dare to tread: An empirical study on the prevalence of zero-day malware. In Proceedings of the 4th International Conference on Internet Monitoring and Protection. IEEE, 66–71.

Digital Library

[125]

Joannès Vermorel and Mehryar Mohri. 2005. Multi-armed bandit algorithms and empirical evaluation. In Machine Learning, João Gama, Rui Camacho, Pavel B. Brazdil, Alípio Mário Jorge, and Luís Torgo (Eds.). Springer, 437–448.

[126]

Oriol Vinyals, Charles Blundell, Timothy Lillicrap, Koray Kavukcuoglu, and Daan Wierstra. 2016. Matching networks for one shot learning. In Proceedings of the 30th International Conference on Neural Information Processing Systems. Curran Associates Inc., 3637–3645.

Digital Library

[127]

Peng Wang, Zhijie Tang, and Junfeng Wang. 2021. A novel few-shot malware classification approach for unknown family recognition with multi-prototype modeling. Comput. Secur. 106 (July2021), 102273.

Digital Library

[128]

Qinglong Wang, Wenbo Guo, Kaixuan Zhang, Alexander G. Ororbia, Xinyu Xing, Xue Liu, and C. Lee Giles. 2017. Adversary resistant deep neural networks with an application to malware detection. In Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. ACM, 1145–1153.

Digital Library

[129]

Qiaokun Wen and K. P. Chow. 2021. CNN-based zero-day malware detection using small binary segments. Forens. Sci. Int. Digit. Investig. 38 (Oct.2021), 301128.

[130]

Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu. 2021. A comprehensive survey on graph neural networks. IEEE Trans. Neural Netw. Learn. Syst. 32, 1 (Jan.2021), 4–24.

[131]

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks?. In Proceedings of the 7th International Conference on Learning Representations. OpenReview.net.

[132]

Ke Xu, Yingjiu Li, Robert H. Deng, and Kai Chen. 2018. DeepRefiner: Multi-layer Android malware detection system applying deep neural networks. In Proceedings of the IEEE European Symposium on Security and Privacy. IEEE, 473–487.

[133]

Yanfang Ye, Lingwei Chen, Shifu Hou, William Hardy, and Xin Li. 2018. DeepAM: A heterogeneous deep learning framework for intelligent malware detection. Knowl. Inf. Syst. 54, 2 (Feb.2018), 265–285.

Digital Library

[134]

Yanfang Ye, Tao Li, Donald Adjeroh, and S. Sitharama Iyengar. 2018. A survey on malware detection using data mining techniques. ACM Comput. Surv. 50, 3, Article 41 (May2018), 40 pages.

[135]

ytisf. 2023. theZoo—A Live Malware Repository. Retrieved June 15, 2023 from https://github.com/ytisf/theZoo.

[136]

Junkun Yuan, Shaofang Zhou, Lanfen Lin, Feng Wang, and Jia Cui. 2020. Black-box adversarial attacks against deep learning based malware binaries detection with GAN. In Proceedings of the 24th European Conference on Artificial Intelligence. IOS Press, 2536–2542.

[137]

Zhenlong Yuan, Yongqiang Lu, Zhaoguo Wang, and Yibo Xue. 2014. Droid-Sec: Deep learning in Android malware detection. In Proceedings of the ACM Conference of the Special Interest Group on Data Communication (SIGCOMM). ACM, 371–372.

[138]

Rahul Yumlembam, Biju Issac, Seibu Mary Jacob, and Longzhi Yang. 2023. IoT-based Android malware detection using graph neural network with adversarial defense. IEEE Internet Things J. 10, 10 (May 2023), 8432–8444.

[139]

Umme Zahoora, Muttukrishnan Rajarajan, Zhaoqing Pan, and Asifullah Khan. 2022. Zero-day ransomware attack detection using deep contractive autoencoder and voting based ensemble classifier. Appl. Intell. 52, 12 (Sept. 2022), 13941–13960.

[140]

Lan Zhang, Peng Liu, Yoon-Ho Choi, and Ping Chen. 2023. Semantics-preserving reinforcement learning attack against graph neural networks for malware detection. IEEE Trans. Depend. Sec. Comput. 20, 2 (March 2023), 1390–1402.

[141]

Muhan Zhang, Zhicheng Cui, Marion Neumann, and Yixin Chen. 2018. An end-to-end deep learning architecture for graph classification. In Proceedings of the 32nd AAAI Conference on Artificial Intelligence. AAAI Press, 4438–4445.

[142]

Nan Zhang, Shifei Ding, Jian Zhang, and Yu Xue. 2018. An overview on restricted Boltzmann machines. Neurocomputing 275 (Jan. 2018), 1186–1199.

[143]

Kaifa Zhao, Hao Zhou, Yulin Zhu, Xian Zhan, Kai Zhou, Jianfeng Li, Le Yu, Wei Yuan, and Xiapu Luo. 2021. Structural attack against graph based Android malware detection. In Proceedings of the ACM SIGSAC Conference on Computer and Communications Security. ACM, 3218–3235.

[144]

Yajin Zhou and Xuxian Jiang. 2012. Dissecting Android malware: Characterization and evolution. In Proceedings of the IEEE Symposium on Security and Privacy. IEEE, 95–109.

[145]

Hui-Juan Zhu, Liang-Min Wang, Sheng Zhong, Yang Li, and Victor S. Sheng. 2022. A hybrid deep network framework for Android malware detection. IEEE Trans. Knowl. Data Eng. 34, 12 (Dec. 2022), 5558–5570.

[146]

Tommaso Zoppi, Andrea Ceccarelli, Tommaso Puccetti, and Andrea Bondavalli. 2023. Which algorithm can detect unknown attacks? Comparison of supervised, unsupervised and meta-learning algorithms for intrusion detection. Comput. Secur. 127 (April 2023), 103107.

Index Terms

Deep Learning for Zero-day Malware Detection and Classification: A Survey
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
    1. Malware and its mitigation
  2. Software and application security
    1. Software security engineering

Published In

ACM Computing Surveys Volume 56, Issue 2

February 2024

974 pages

EISSN:1557-7341

DOI:10.1145/3613559

Editor:
Albert Zomaya
University of Sydney, Australia

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 September 2023

Online AM: 24 June 2023

Accepted: 31 May 2023

Revised: 18 April 2023

Received: 18 September 2022

Published in CSUR Volume 56, Issue 2

Qualifiers

Survey

Funding Sources

Iran National Science Foundation (INSF) and Iran’s National Elites Foundation (INEF)
