Robust and Effective Malware Detection Through Quantitative Data Flow Graph Metrics

Wüchner, Tobias; Ochoa, Martín; Pretschner, Alexander

doi:10.1007/978-3-319-20550-2_6

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 9148)

Included in the following conference series:

International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment

Abstract

We present a novel malware detection approach based on metrics over quantitative data flow graphs. Quantitative data flow graphs (QDFGs) model process behavior by interpreting issued system calls as aggregations of quantifiable data flows. Because of this high level of abstraction, we consider QDFG-metric-based detection more robust against typical behavior obfuscation, such as bogus call injection or call reordering, than other common behavioral models that are based on raw system calls. We support this claim with experiments on obfuscated malware logs and demonstrate superior obfuscation robustness in comparison to detection using n-grams. Our evaluations on a large and diverse data set of about 7000 malware and 500 goodware samples show an average detection rate of 98.01% and a false positive rate of 0.48%. Moreover, we show that our approach is able to detect new malware (i.e., samples from malware families not included in the training set) and that considering quantities in itself significantly improves detection precision.
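The intuition behind the robustness claim can be illustrated with a minimal sketch. It is not the authors' implementation: the event format `(call, source, destination, bytes)`, the node names, and the helper functions `build_qdfg`, `node_metrics`, and `call_ngrams` are all assumptions made for illustration, and the per-node degree and volume counts merely stand in for whatever graph metrics the paper actually uses. The sketch aggregates individual data flows into weighted edges and shows that reordering independent calls changes the set of call n-grams but leaves the aggregated graph unchanged.

```python
from collections import defaultdict

def build_qdfg(events):
    """Aggregate flows into weighted edges: (source, destination) -> total bytes."""
    edges = defaultdict(int)
    for _call, src, dst, size in events:
        edges[(src, dst)] += size
    return dict(edges)

def node_metrics(graph):
    """Simple illustrative per-node metrics: in/out degree and in/out byte volume."""
    metrics = defaultdict(
        lambda: {"in_degree": 0, "out_degree": 0, "in_bytes": 0, "out_bytes": 0}
    )
    for (src, dst), size in graph.items():
        metrics[src]["out_degree"] += 1
        metrics[src]["out_bytes"] += size
        metrics[dst]["in_degree"] += 1
        metrics[dst]["in_bytes"] += size
    return {node: dict(values) for node, values in metrics.items()}

def call_ngrams(events, n=2):
    """Set of n-grams over the raw sequence of call names."""
    calls = [event[0] for event in events]
    return {tuple(calls[i:i + n]) for i in range(len(calls) - n + 1)}

# Hypothetical trace: a process copies a file in two chunks.
original = [
    ("ReadFile", "file:a.txt", "proc:sample.exe", 400),
    ("WriteFile", "proc:sample.exe", "file:copy.bin", 400),
    ("ReadFile", "file:a.txt", "proc:sample.exe", 600),
    ("WriteFile", "proc:sample.exe", "file:copy.bin", 600),
]
# Same flows, different call order (both reads first, then both writes).
reordered = [original[0], original[2], original[1], original[3]]

qdfg_original = build_qdfg(original)
qdfg_reordered = build_qdfg(reordered)

print("graphs equal: ", qdfg_original == qdfg_reordered)
print("n-grams equal:", call_ngrams(original) == call_ngrams(reordered))
print(node_metrics(qdfg_original)["proc:sample.exe"])
```

In this toy trace only reordering is exercised. A bogus call that transfers data would add or change edges, so the sketch says nothing about how well aggregated metrics tolerate call injection; that is a matter for the paper's experiments.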

Notes

1. By examining the reachability graph associated with a node, it is possible for analysts to investigate the root cause of an infection.
2. http://www.cuckoosandbox.org/.

Author information

Authors and Affiliations

Technische Universität München, Munich, Germany
Tobias Wüchner, Martín Ochoa & Alexander Pretschner

Corresponding author

Correspondence to Tobias Wüchner.

Editor information

Editors and Affiliations

Chalmers University of Technology, Gothenburg, Sweden
Magnus Almgren
Chalmers University of Technology, Gothenburg, Sweden
Vincenzo Gulisano
Politecnico di Milano, Milan, Italy
Federico Maggi

About this paper

Cite this paper

Wüchner, T., Ochoa, M., Pretschner, A. (2015). Robust and Effective Malware Detection Through Quantitative Data Flow Graph Metrics. In: Almgren, M., Gulisano, V., Maggi, F. (eds) Detection of Intrusions and Malware, and Vulnerability Assessment. DIMVA 2015. Lecture Notes in Computer Science, vol 9148. Springer, Cham. https://doi.org/10.1007/978-3-319-20550-2_6

DOI: https://doi.org/10.1007/978-3-319-20550-2_6
Published: 23 June 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-20549-6
Online ISBN: 978-3-319-20550-2
eBook Packages: Computer Science, Computer Science (R0)