DOI: 10.1145/3597503.3639122
research-article

MalCertain: Enhancing Deep Neural Network Based Android Malware Detection by Tackling Prediction Uncertainty

Published: 12 April 2024

    Abstract

    The long-lasting Android malware threat has attracted significant research effort in malware detection. In particular, by modeling malware detection as a classification problem, machine learning based approaches, especially deep neural network (DNN) based approaches, are increasingly used for Android malware detection and have achieved significant improvements over other approaches such as signature-based detection. However, because Android malware evolves rapidly and adversarial samples exist, DNN models trained on earlier samples often yield poor decisions when used to detect newly emerging samples. Fundamentally, this phenomenon stems from uncertainty in the data (noise or randomness) and weakness in the training process (insufficient training data). Overlooking these uncertainties poses risks to the model's predictions. In this paper, we take the first step to estimate the prediction uncertainty of DNN models in malware detection and leverage these estimates to enhance Android malware detection techniques. Specifically, besides training a DNN model to predict malware, we employ several uncertainty estimation methods to train a Correction Model that determines whether a sample is correctly or incorrectly predicted by the DNN model. We then leverage the estimated uncertainty output by the Correction Model to correct the prediction results, improving the accuracy of the DNN model. Experimental results show that our proposed MalCertain improves the accuracy of the underlying DNN models for Android malware detection by around 21% and improves the detection of adversarial Android malware samples by up to 94.38%. Our research sheds light on the promising direction of leveraging prediction uncertainty to improve prediction-based software engineering tasks.
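
The pipeline described in the abstract (estimate uncertainty, decide whether the base prediction is wrong, then correct it) can be illustrated with a minimal sketch. This is not the MalCertain implementation: every function name below is hypothetical, the input is assumed to be a list of malware probabilities for one app obtained from repeated stochastic forward passes or ensemble members, a single learned cut-off on the standard deviation stands in for the trained Correction Model, and "correcting" is assumed to mean flipping a binary label.

```python
import math
from statistics import mean, pstdev


def uncertainty_metrics(probs):
    """Summarize repeated malware-probability estimates for one sample.

    Returns (mean probability, entropy of the mean, standard deviation).
    """
    p = mean(probs)
    eps = 1e-12  # avoids log(0) when p is exactly 0 or 1
    entropy = -(p * math.log(p + eps) + (1.0 - p) * math.log(1.0 - p + eps))
    return p, entropy, pstdev(probs)


def fit_std_threshold(samples, labels):
    """Stand-in for training a correction model (assumption, not the paper's).

    Picks the standard-deviation cut-off that best separates correct from
    incorrect base predictions on labeled validation data.
    """
    rows = []
    for probs, label in zip(samples, labels):
        p, _, std = uncertainty_metrics(probs)
        base = int(p >= 0.5)
        rows.append((std, base != label))

    # Start from "never flag anything" and keep a cut only if it does better.
    best_cut = float("inf")
    best_hits = sum(not wrong for _, wrong in rows)
    for cut in sorted({std for std, _ in rows}):
        hits = sum((std >= cut) == wrong for std, wrong in rows)
        if hits > best_hits:
            best_cut, best_hits = cut, hits
    return best_cut


def corrected_prediction(probs, cut):
    """Flip the base binary label when the sample is flagged as likely wrong."""
    p, _, std = uncertainty_metrics(probs)
    base = int(p >= 0.5)  # 1 = malware, 0 = benign
    return 1 - base if std >= cut else base
```

In this sketch, `fit_std_threshold` would be run on held-out labeled samples and `corrected_prediction` on new ones. A real correction model would more plausibly be a classifier over several uncertainty metrics (the entropy returned above is one candidate feature) rather than one threshold.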



        Published In

        ICSE '24: Proceedings of the IEEE/ACM 46th International Conference on Software Engineering
        May 2024
        2942 pages
        ISBN: 9798400702174
        DOI: 10.1145/3597503
        Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

        Sponsors

        In-Cooperation

        • Faculty of Engineering of University of Porto

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. Android malware detection
        2. uncertainty
        3. DNN

        Qualifiers

        • Research-article

        Conference

        ICSE '24

        Acceptance Rates

        Overall Acceptance Rate 276 of 1,856 submissions, 15%
