Abstract
Novel techniques have been proposed for detecting and resolving the various types of malware, and deep learning (DL) algorithms play a crucial role among them. Although there has been considerable research on DL-based mobile malware detection approaches, these studies have not yet been reviewed in detail. This paper aims to identify, assess, and synthesize the published articles on the application of DL techniques to mobile malware detection. We performed a Systematic Literature Review (SLR) and selected 40 journal articles for in-depth analysis. This SLR presents and categorizes these articles by machine learning category, data source, DL algorithm, evaluation parameters and approaches, feature selection technique, dataset, and DL implementation platform. The study also highlights the challenges, proposed solutions, and future research directions for the use of DL in mobile malware detection. The results show that Convolutional Neural Networks and Deep Neural Networks are the most used DL algorithms; API calls, permissions, and system calls are the most dominant features; Keras and TensorFlow are the most popular platforms; and Drebin and VirusShare are the most widely used datasets. Supervised learning and static features are the most preferred machine learning and data source categories, respectively.
Funding
No funding was received for conducting this study.
Ethics declarations
Conflict of interest
The authors have no relevant financial or non-financial interests to disclose.
About this article
Cite this article
Catal, C., Giray, G. & Tekinerdogan, B. Applications of deep learning for mobile malware detection: A systematic literature review. Neural Comput & Applic 34, 1007–1032 (2022). https://doi.org/10.1007/s00521-021-06597-0
DOI: https://doi.org/10.1007/s00521-021-06597-0