Abstract
Machine-learning-based Android malware classifiers perform poorly at detecting new malware, particularly when they take API calls and permissions as input features, which are the best-performing features known so far. This is mainly because signature-based features are very sensitive to the training data and cannot capture the general behaviours of identified malware. To improve the robustness of classifiers, we study the problem of learning and verifying unwanted behaviours abstracted as automata. These behaviours are common patterns shared by malware instances but rarely seen in benign applications, e.g., intercepting and forwarding incoming SMS messages. We show that taking the verification results against unwanted behaviours as input features dramatically improves classification performance on new malware. In particular, precision and recall are respectively 8 and 51 points better than those obtained using API calls and permissions, measured against industrial datasets collected across several years. Our approach integrates formal methods, machine learning and text mining techniques. It is the first to automatically generate unwanted behaviours for Android malware detection. We also demonstrate unwanted behaviours constructed for well-known malware families; they compare well to those given in human-authored descriptions of these families.
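As a rough illustration of the idea of using verification results as features, the following minimal sketch models one unwanted behaviour (intercepting and forwarding an incoming SMS) as a small automaton and turns "does this app exhibit the behaviour?" into a binary feature. It is not the paper's implementation: the class `BehaviourAutomaton`, the function `to_features`, the state names, and the event labels are all hypothetical, and the check runs over a single flat event trace rather than over a verified model of the app.

```python
from typing import Dict, List, Set, Tuple


class BehaviourAutomaton:
    """A tiny deterministic automaton over event labels (illustrative only)."""

    def __init__(self, start: str, accepting: Set[str],
                 transitions: Dict[Tuple[str, str], str]) -> None:
        self.start = start
        self.accepting = accepting
        self.transitions = transitions

    def matches(self, trace: List[str]) -> bool:
        """Return True if the trace drives the automaton to an accepting state.

        Simplifying assumption: events with no outgoing transition from the
        current state are ignored, so the behaviour is matched as an ordered
        subsequence of the trace.
        """
        state = self.start
        for event in trace:
            state = self.transitions.get((state, event), state)
            if state in self.accepting:
                return True
        return False


# Hypothetical unwanted behaviour: receive an SMS, suppress it, forward it.
SMS_FORWARD = BehaviourAutomaton(
    start="q0",
    accepting={"q3"},
    transitions={
        ("q0", "SMS_RECEIVED"): "q1",
        ("q1", "abortBroadcast"): "q2",
        ("q2", "sendTextMessage"): "q3",
    },
)


def to_features(trace: List[str],
                behaviours: List[BehaviourAutomaton]) -> List[int]:
    """One binary feature per unwanted behaviour: 1 if matched, else 0."""
    return [1 if b.matches(trace) else 0 for b in behaviours]
```

A feature vector built this way could then be passed to any standard classifier in place of raw API-call or permission features; which classifier to use, and how the automata are learned in the first place, is outside the scope of this sketch.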
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Chen, W., Aspinall, D., Gordon, A.D., Sutton, C., Muttik, I. (2016). On Robust Malware Classifiers by Verifying Unwanted Behaviours. In: Ábrahám, E., Huisman, M. (eds) Integrated Formal Methods. IFM 2016. Lecture Notes in Computer Science, vol 9681. Springer, Cham. https://doi.org/10.1007/978-3-319-33693-0_21
DOI: https://doi.org/10.1007/978-3-319-33693-0_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-33692-3
Online ISBN: 978-3-319-33693-0
eBook Packages: Computer Science (R0)