Abstract
We present a novel malware detection approach based on metrics over quantitative data flow graphs. Quantitative data flow graphs (QDFGs) model process behavior by interpreting issued system calls as aggregations of quantifiable data flows. Due to the high abstraction level we consider QDFG metric based detection more robust against typical behavior obfuscation like bogus call injection or call reordering than other common behavioral models that base on raw system calls. We support this claim with experiments on obfuscated malware logs and demonstrate the superior obfuscation robustness in comparison to detection using n-grams. Our evaluations on a large and diverse data set consisting of about 7000 malware and 500 goodware samples show an average detection rate of 98.01 % and a false positive rate of 0.48 %. Moreover, we show that our approach is able to detect new malware (i.e. samples from malware families not included in the training set) and that the consideration of quantities in itself significantly improves detection precision.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
By examining the reachability graph associated with a node, it is possible for analysts to investigate the root cause of an infection.
- 2.
References
Bailey, M., Oberheide, J., Andersen, J., Mao, Z.M., Jahanian, F., Nazario, J.: Automated classification and analysis of internet malware. In: Kruegel, C., Lippmann, R., Clark, A. (eds.) RAID 2007. LNCS, vol. 4637, pp. 178–197. Springer, Heidelberg (2007)
Banescu, S., Wüchner, T., Guggenmos, M., Ochoa, M., Pretschner, A.: FEEBO: an empirical evaluation framework for malware behavior obfuscation. CoRR, arXiv:1502.03245 (2015)
Bhatkar, S., Chaturvedi, A., Sekar, R.: Dataflow anomaly detection. In: S&P, pp. 15. IEEE (2006)
Borello, J.-M., Me, L.: Code obfuscation techniques for metamorphic viruses. J. Comput. Virol. 4, 211–220 (2008)
Brandes, U.: A faster algorithm for betweenness centrality. J. Math. Sociol. 25(2), 163–177 (2001)
Canali, D., Lanzi, A., Balzarotti, D., Kruegel, C., Christodorescu, M., Kirda, E.: A quantitative study of accuracy in system call-based malware detection. In: ISSTA. ACM (2012)
Cavallaro, L., Sekar, R.: Taint-enhanced anomaly detection. In: Jajodia, S., Mazumdar, C. (eds.) ICISS 2011. LNCS, vol. 7093, pp. 160–174. Springer, Heidelberg (2011)
Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: Smote: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16(1), 321–357 (2002)
Christodorescu, M., Jha, S., Kruegel, C.: Mining specifications of malicious behavior. In: India Software Engineering Conference, pp. 5–14 (2008)
Christodorescu, M., Jha, S., Seshia, S., Song, D., Bryant, R.: Semantics-aware malware detection. In: S&P 2005, pp. 32–46 (2005)
Forrest, S., Hofmeyr, S., Somayaji, A., Longstaff, T.: A sense of self for Unix processes. In: S&P, pp. 120–128 (1996)
Fredrikson, M., Christodorescu, M., Jha, S.: Dynamic behavior matching: a complexity analysis and new approximation algorithms. In: Bjørner, N., Sofronie-Stokkermans, V. (eds.) CADE 2011. LNCS, vol. 6803, pp. 252–267. Springer, Heidelberg (2011)
Fredrikson, M., Jha, S., Christodorescu, M., Sailer, R., Yan, X.: Synthesizing near-optimal malware specifications from suspicious behaviors. In: S&P, pp. 45–60. IEEE (2010)
Ghosh, A.K., Schwartzbard, A., Schatz, M.: Learning program behavior profiles for intrusion detection. In: Workshop on Intrusion Detection and Network Monitoring, p. 6 (1999)
Hall, M., Frank, E., Holmes, G., Pfahringer, B., Reutemann, P., Witten, I.H.: The weka data mining software: an update. ACM SIGKDD Explor. Newsl. 11(1), 10–18 (2009)
Jang, J., Woo, J., Yun, J., Kim, H.K.: Mal-netminer: malware classification based on social network analysis of call graph. In: WWW (2014)
Kirda, E., Kruegel, C., Banks, G., Vigna, G., Kemmerer, R.A.: Behavior-based spyware detection. In: USENIX (2006)
Kolbitsch, C., Comparetti, P.M., Kruegel, C., Kirda, E., Zhou, X., Wang, X.: Effective and efficient malware detection at the end host. In: USENIX, pp. 351–366 (2009)
Lanzi, A., Balzarotti, D., Kruegel, C., Christodorescu, M., Kirda, E.: Accessminer: using system-centric models for malware protection. In: CCS, pp. 399–412 (2010)
Lee, J., Jeong, K., Lee, H.: Detecting metamorphic malwares using code graphs. In: SAC (2010)
Lee, W., Stolfo, S.J., Chan, P.K.: Learning patterns from unix process execution traces for intrusion detection. In: Workshop on AI Approaches to Fraud Detection and Risk Management, pp. 50–56 (1997)
Mao, W., Cai, Z., Guan, X., Towsley, D.: Centrality metrics of importance in access behaviors and malware detections. In: ACSAC. ACM (2014)
Milea, N.A., Khoo, S.C.: Nort: runtime anomaly-based monitoring of malicious behavior for windows, pp. 115–130 (2012)
Nappa, A., Rafique, M.Z., Caballero, J.: Driving in the cloud: an analysis of drive-by download operations and abuse reporting. In: Rieck, K., Stewin, P., Seifert, J.-P. (eds.) DIMVA 2013. LNCS, vol. 7967, pp. 1–20. Springer, Heidelberg (2013)
Okamoto, K., Chen, W., Li, X.-Y.: Ranking of closeness centrality for large-scale social networks. In: Preparata, F.P., Wu, X., Yin, J. (eds.) FAW 2008. LNCS, vol. 5059, pp. 186–195. Springer, Heidelberg (2008)
Park, Y., Reeves, D.S., Stamp, M.: Deriving common malware behavior through graph clustering. Comput. Secur. 39, 419–430 (2013)
Preda, M., Christodorescu, M., Jha, S., Debray, S.: A semantics-based approach to malware detection. ACM SIGPLAN Notices, pp. 1–12 (2007)
Rieck, K., Holz, T., Willems, C., Düssel, P., Laskov, P.: Learning and classification of malware behavior. In: Zamboni, D. (ed.) DIMVA 2008. LNCS, vol. 5137, pp. 108–125. Springer, Heidelberg (2008)
Rieck, K., Trinius, P., Willems, C., Holz, T.: Automatic analysis of malware behavior using machine learning. J. Comput. Secur. 19, 639–668 (2011)
Sharif, M.I., Lanzi, A., Giffin, J.T., Lee, W.: Impeding malware analysis using conditional code obfuscation. In: NDSS (2008)
Wressnegger, C., Schwenk, G., Arp, D., Rieck, K.: A close look on n-grams in intrusion detection: anomaly detection vs. classification. In: Workshop on Artificial Intelligence and Security, pp. 67–76 (2013)
Wüchner, T., Ochoa, M., Pretschner, A.: Malware detection with quantitative data flow graphs. In: ASIACCS (2014)
Wüchner, T., Pretschner, A.: Data loss prevention based on data-driven usage control. In: ISSRE, pp. 151–160, November 2012
Yin, H., Song, D., Egele, M., Kruegel, C., Kirda, E.: Panorama: capturing system-wide information flow for malware detection and analysis. In: CCS, pp. 116–127 (2007)
You, I., Yim, K.: Malware obfuscation techniques: a brief survey. In: BWCCA, pp. 297–300 (2010)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Wüchner, T., Ochoa, M., Pretschner, A. (2015). Robust and Effective Malware Detection Through Quantitative Data Flow Graph Metrics. In: Almgren, M., Gulisano, V., Maggi, F. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2015. Lecture Notes in Computer Science(), vol 9148. Springer, Cham. https://doi.org/10.1007/978-3-319-20550-2_6
Download citation
DOI: https://doi.org/10.1007/978-3-319-20550-2_6
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20549-6
Online ISBN: 978-3-319-20550-2
eBook Packages: Computer ScienceComputer Science (R0)