MalWhiteout: Reducing Label Errors in Android Malware Detection

Authors:

Yulei Sui

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

Article No.: 69, Pages 1 - 13

https://doi.org/10.1145/3551349.3560418

Published: 05 January 2023

Abstract

Machine learning-based Android malware detection has attracted a great deal of research in recent years. A reliable malware dataset is critical for evaluating the effectiveness of malware detection approaches. Unfortunately, the malware datasets used in our community are mainly labelled with existing anti-virus services (i.e., VirusTotal), which are prone to mislabelling. Such mislabelling leads to inaccurate evaluation of malware detection techniques. Removing label noise from Android malware datasets can be quite challenging, especially at large scale. To address this problem, we propose MalWhiteout, an approach to reducing label errors in Android malware datasets. Specifically, we introduce Confident Learning (CL), an advanced noise estimation approach, to the domain of Android malware detection. To combat the false positives introduced by CL, we incorporate ensemble learning and inter-app relations to make noise detection more robust. We evaluate MalWhiteout on a curated, large-scale, and reliable benchmark dataset. Experimental results show that MalWhiteout detects label noise with over 94% accuracy even at a high noise ratio (i.e., 30%) of the dataset. MalWhiteout outperforms the state-of-the-art approach in both effectiveness (8% to 218% improvement) and efficiency (70 to 249 times faster) across different settings. We also show that reducing label noise improves the performance of existing malware detection approaches.
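The abstract names two ideas without showing them: Confident Learning to flag suspicious labels, and an ensemble to filter out CL's false positives. The snippet below is a minimal sketch of both, not the MalWhiteout implementation. It assumes binary labels (0 = benign, 1 = malware) and out-of-sample predicted probabilities from each base model. The function names `confident_label_issues` and `ensemble_vote`, the toy probabilities, and the simple vote-count rule are all hypothetical choices made for illustration. The thresholding step follows the per-class confidence threshold described by Northcutt et al. for Confident Learning. The inter-app relation step is omitted.

```python
import numpy as np


def confident_label_issues(labels, pred_probs):
    """Flag samples whose given label disagrees with a confidently predicted class."""
    labels = np.asarray(labels)
    pred_probs = np.asarray(pred_probs, dtype=float)
    n_classes = pred_probs.shape[1]
    # Per-class threshold: mean predicted probability of class k
    # over the samples that currently carry label k.
    thresholds = np.array(
        [pred_probs[labels == k, k].mean() for k in range(n_classes)]
    )
    # A class is "confident" for a sample when its probability reaches the threshold.
    above = pred_probs >= thresholds
    # Among the confident classes keep the most probable one; -1 means none qualified.
    masked = np.where(above, pred_probs, -np.inf)
    confident_class = np.where(above.any(axis=1), masked.argmax(axis=1), -1)
    return (confident_class != -1) & (confident_class != labels)


def ensemble_vote(flag_matrix, min_votes):
    """Keep only the samples flagged by at least `min_votes` base models."""
    return np.asarray(flag_matrix).sum(axis=0) >= min_votes


# Toy data (hypothetical): six apps, given labels, and two models' probabilities.
labels = [0, 0, 0, 1, 1, 1]
probs_a = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.1, 0.9], [0.85, 0.15]]
probs_b = [[0.7, 0.3], [0.4, 0.6], [0.2, 0.8], [0.3, 0.7], [0.2, 0.8], [0.9, 0.1]]

flags_a = confident_label_issues(labels, probs_a)
flags_b = confident_label_issues(labels, probs_b)
noisy = ensemble_vote([flags_a, flags_b], min_votes=2)
print(np.flatnonzero(noisy))  # indices 2 and 5 are flagged by both models
```

In this toy run the second model alone also flags index 1, but the vote drops it because the first model disagrees. That is the intuition behind using an ensemble to suppress false positives from a single CL pass.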

Index Terms

MalWhiteout: Reducing Label Errors in Android Malware Detection
1. Software and its engineering
  1. Software notations and tools
    1. Development frameworks and environments

Published In

ASE '22: Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering

October 2022

2006 pages

ISBN:9781450394758

DOI:10.1145/3551349

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ASE '22

ASE '22: 37th IEEE/ACM International Conference on Automated Software Engineering

October 10 - 14, 2022

Rochester, MI, USA

Acceptance Rates

Overall Acceptance Rate 82 of 337 submissions, 24%

Cited By

Cuiying G, Wu Y, Li H, Yuan W, Jiang H, He Q, Liu Y, Christakis M, Pradel M (2024). Uncovering and Mitigating the Impact of Code Obfuscation on Dataset Annotation with Antivirus Engines. Proceedings of the 33rd ACM SIGSOFT International Symposium on Software Testing and Analysis, 553-565. Online publication date: 11-Sep-2024.
https://dl.acm.org/doi/10.1145/3650212.3680302
Zhu H, Xia M, Wang L, Xu Z, Sheng V (2024). A Novel Knowledge Search Structure for Android Malware Detection. IEEE Transactions on Services Computing, 1-14. Online publication date: 2024.
https://doi.org/10.1109/TSC.2024.3496333
Zhu H, Chen X, Wang L, Xu Z, Sheng V (2024). A Dynamic Analysis-Powered Explanation Framework for Malware Detection. IEEE Transactions on Knowledge and Data Engineering 36:12, 7483-7496. Online publication date: Dec-2024.
https://doi.org/10.1109/TKDE.2024.3436891
Fukushi N, Shibahara T, Nakano H, Koide T, Chiba D (2024). Noisy Label Detection for Multi-labeled Malware. 2024 IEEE 21st Consumer Communications & Networking Conference (CCNC), 165-171. Online publication date: 6-Jan-2024.
https://doi.org/10.1109/CCNC51664.2024.10454810
Wan L, Yan C, Meng M, Wang K, Wang H (2024). Analyzing Excessive Permission Requests in Google Workspace Add-Ons. Engineering of Complex Computer Systems, 323-345. Online publication date: 29-Sep-2024.
https://doi.org/10.1007/978-3-031-66456-4_18
Wang J, Wang L, Dong F, Wang H, Montpetit M, Leivadeas A, Uhlig S, Javed M (2023). Re-measuring the Label Dynamics of Online Anti-Malware Engines from Millions of Samples. Proceedings of the 2023 ACM on Internet Measurement Conference, 253-267. Online publication date: 24-Oct-2023.
https://dl.acm.org/doi/10.1145/3618257.3624800
Nath D, Biswas S, Akhter J, Rahaman A (2023). A Hybrid Approach for Android Malicious Software Classification. Computing Open 01. Online publication date: 19-Dec-2023.
https://doi.org/10.1142/S2972370123300029
Zhu H, Li Y, Wang L, Sheng V (2023). A multi-model ensemble learning framework for imbalanced android malware detection. Expert Systems with Applications: An International Journal 234:C. Online publication date: 30-Dec-2023.
https://dl.acm.org/doi/10.1016/j.eswa.2023.120952
