DOI: 10.1145/3264437.3264478
research-article

Development of method for malware classification based on statistical methods and an extended set of system calls data

Published: 10 September 2018

Abstract

In this paper, we propose a method for malware classification. Applying statistical methods to an extended data set of system function calls improves the classification quality of malware samples. With unsupervised learning, the method achieves a classification quality comparable to that of methods based on supervised learning, including neural networks. The proposed method therefore makes it possible to detect previously unknown families and to detect unknown samples of small families more efficiently.
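
The abstract does not spell out the pipeline, so the following is only a minimal sketch of the general idea of unsupervised grouping over system-call statistics, not the authors' method. It assumes that each sample is described by the relative frequency of each call in its trace and that samples are grouped with DBSCAN (one of the author tags). The sample names, the traces, and the `eps` and `min_pts` values are invented for illustration.

```python
from collections import Counter
from math import dist


def call_frequencies(trace, vocabulary):
    """Map a list of call names to a vector of relative frequencies."""
    counts = Counter(trace)
    total = len(trace) or 1
    return [counts[name] / total for name in vocabulary]


def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, -1 marks noise."""
    unseen, noise = None, -1
    labels = [unseen] * len(points)

    def neighbours(i):
        # A point counts as its own neighbour, as in the usual definition.
        return [j for j, p in enumerate(points) if dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not unseen:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # former noise becomes a border point
            if labels[j] is not unseen:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_pts:  # j is a core point: keep expanding
                queue.extend(reachable)
    return labels


# Toy traces (hypothetical data, not taken from the paper).
traces = {
    "sample_a": ["CreateFileW", "WriteFile", "WriteFile", "CryptEncrypt"],
    "sample_b": ["CreateFileW", "WriteFile", "CryptEncrypt", "CryptEncrypt"],
    "sample_c": ["connect", "send", "send", "send"],
    "sample_d": ["connect", "connect", "send", "send"],
    "sample_e": ["RegSetValueExW"] * 4,
}
vocabulary = sorted({call for trace in traces.values() for call in trace})
vectors = [call_frequencies(trace, vocabulary) for trace in traces.values()]
labels = dbscan(vectors, eps=0.5, min_pts=2)

for name, label in zip(traces, labels):
    print(name, "noise" if label == -1 else f"cluster {label}")
```

Because no family labels are used, a sample that matches no dense group ends up as noise rather than being forced into a known class, which is one way an unsupervised approach can surface previously unseen behaviour.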

Cited By

  • (2022) Online Malware Classification with System-Wide System Calls in Cloud IaaS. 2022 IEEE 23rd International Conference on Information Reuse and Integration for Data Science (IRI), 146-151. DOI: 10.1109/IRI54793.2022.00042. Online publication date: Aug-2022
  • (2019) Getting to the root of the problem: A detailed comparison of kernel and user level data for dynamic malware analysis. Journal of Information Security and Applications 48, 102365. DOI: 10.1016/j.jisa.2019.102365. Online publication date: Oct-2019

    Published In

    SIN '18: Proceedings of the 11th International Conference on Security of Information and Networks
    September 2018
    148 pages
    ISBN: 9781450366083
    DOI: 10.1145/3264437

    In-Cooperation

    • Cardiff University

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. DBSCAN
    2. MLP
    3. Malware
    4. Mean Shift
    5. PCA
    6. WINAPI
    7. behavioral analysis
    8. data mining
    9. python

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    SIN '18

    Acceptance Rates

    SIN '18 Paper Acceptance Rate 24 of 42 submissions, 57%;
    Overall Acceptance Rate 102 of 289 submissions, 35%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 2
    • Downloads (Last 6 weeks): 1
    Reflects downloads up to 04 Oct 2024
