research-article

Collective classification for packed executable identification

Authors:

Xabier Ugarte-Pedrero,

Carlos Laorden,

Pablo G. Bringas

CEAS '11: Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference

Pages 23 - 30

https://doi.org/10.1145/2030376.2030379

Published: 01 September 2011

Abstract

Malware is any software designed to harm computers. Commercial anti-virus products rely on signature scanning, a technique that is effective only when the malicious executables have already been analysed and identified. Malware writers employ several techniques to hide the actual behaviour of their code. One of them, executable packing, consists of encrypting or otherwise hiding the real payload of the executable. Generic unpacking techniques do not depend on the packer used: they execute the binary within an isolated environment (a "sandbox") to recover the real code of the packed executable. However, this approach is slow, so a filter step is required to determine whether an executable has been packed before it is sent for unpacking. To this end, supervised machine-learning approaches trained with static features extracted from the executables have been proposed. Supervised learning methods, however, require a large number of packed and non-packed executables to be identified and labelled. In this paper, we propose a new method for packed executable detection that adopts a collective learning approach to reduce the labelling requirements of fully supervised approaches. We performed an empirical validation demonstrating that the system maintains a high accuracy rate while requiring less labelling effort than supervised learning.
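To make the idea of a cheap filter step with reduced labelling concrete, the following is a minimal sketch under stated assumptions, not the method evaluated in the paper. It assumes a single static feature, the byte entropy of the file (a commonly used indicator of packing), and it substitutes plain self-training, a simple semi-supervised technique, for the collective classification algorithms the paper uses. The function names `byte_entropy` and `self_train`, the `margin` parameter, and the sample values are all hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def self_train(labeled, unlabeled, rounds=5, margin=0.5):
    """Toy self-training on one feature (entropy); an illustrative assumption.

    labeled:   list of (entropy, is_packed) pairs; must contain both classes
    unlabeled: list of entropy values with no label
    Returns the decision threshold: the midpoint between the two class means.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    threshold = None
    for _ in range(rounds):
        packed = [e for e, y in labeled if y]
        plain = [e for e, y in labeled if not y]
        threshold = (sum(packed) / len(packed) + sum(plain) / len(plain)) / 2
        # Pseudo-label only the samples that lie far from the current threshold.
        confident = [e for e in pool if abs(e - threshold) >= margin]
        if not confident:
            break
        labeled += [(e, e > threshold) for e in confident]
        pool = [e for e in pool if abs(e - threshold) < margin]
    return threshold


# Two hand-labelled seeds plus five unlabelled samples (made-up entropy values).
seeds = [(7.8, True), (5.0, False)]
threshold = self_train(seeds, [7.5, 7.9, 4.8, 5.2, 6.4])
print(f"flag files with entropy above {threshold:.2f} for unpacking")
```

Only files whose entropy exceeds the learned threshold would be handed to the slow sandbox-based unpacker. A realistic filter would use many static features and a stronger classifier, but the structure is the same: a few labelled samples seed the model, and unlabelled samples refine it.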



Index Terms

Collective classification for packed executable identification
1. Security and privacy
  1. Intrusion/anomaly detection and malware mitigation
2. Social and professional topics
  1. Computing / technology policy
    1. Computer crime


Published In

CEAS '11: Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference

September 2011

230 pages

ISBN:9781450307888

DOI:10.1145/2030376

General Chair:
Vidyasagar Potdar
Curtin University, Australia

Copyright © 2011 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

CEAS '11

CEAS '11: The 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference

September 1 - 2, 2011

Perth, Australia
