DOI: 10.1145/2808797.2808894
research-article

Malware Task Identification: A Data Driven Approach

Published: 25 August 2015

Abstract

Identifying the tasks a given piece of malware was designed to perform (e.g., logging keystrokes, recording video, establishing remote access) is a difficult and time-consuming operation that is largely human-driven in practice. In this paper, we present an automated method to identify malware tasks. Using two different malware collections, we explore several conditions for each collection, including cases where the training data differs significantly from the test data, where the malware being evaluated employs packing to thwart analytical techniques, and where training data is sparse. We find that this approach consistently outperforms the current state-of-the-art software for malware task identification as well as standard machine learning approaches, often achieving an unbiased F1 score of over 0.9. In the near future, we look to deploy our approach for use by analysts in an operational cyber-security environment.
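
To make the reported metric concrete, the following is a minimal sketch of a standard F1 score computed over the set of tasks predicted for a single malware sample. It is an illustration only: the abstract does not define how the "unbiased" F1 score is calculated, and the function name `f1_score` and the task labels below are hypothetical, not taken from the paper.

```python
def f1_score(predicted: set, actual: set) -> float:
    """Standard F1 between predicted and actual task sets for one sample.

    Assumption: plain set-based precision and recall; the paper's exact
    "unbiased" variant is not specified in the abstract.
    """
    true_positives = len(predicted & actual)
    if true_positives == 0:
        # No overlap (or an empty set): precision and recall are both 0.
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    return 2 * precision * recall / (precision + recall)


# Hypothetical task labels for one sample.
actual = {"keylogging", "video_capture", "remote_access"}
predicted = {"keylogging", "remote_access", "credential_theft"}

score = f1_score(predicted, actual)
print(f"F1 = {score:.3f}")
```

Here two of the three predicted tasks are correct and two of the three actual tasks are recovered, so precision and recall are both 2/3. A score above 0.9 therefore requires nearly all tasks to be identified with very few spurious ones.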


Cited By

  • (2024) Comparison of cognitively-inspired salience and feature importance techniques in intrusion detection datasets. Assurance and Security for AI-enabled Systems. 10.1117/12.3013842 (21). Online publication date: 7-Jun-2024
  • (2020) Searching for Malware Dataset: a Systematic Literature Review. 2020 International Conference on Information Technology Systems and Innovation (ICITSI). 10.1109/ICITSI50517.2020.9264929 (375-380). Online publication date: 19-Oct-2020
  • (2020) Cognitively-Inspired Inference for Malware Task Identification. Open Source Intelligence and Cyber Crime. 10.1007/978-3-030-41251-7_7 (165-194). Online publication date: 1-Aug-2020

Published In

ASONAM '15: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining 2015
August 2015
835 pages
ISBN:9781450338547
DOI:10.1145/2808797
Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ASONAM '15

Acceptance Rates

Overall Acceptance Rate 116 of 549 submissions, 21%
