DOI: 10.1145/2030376.2030379
research-article

Collective classification for packed executable identification

Published: 01 September 2011

Abstract

Malware is any software designed to harm computers. Commercial anti-virus products rely on signature scanning, a technique that is effective only when the malicious executables have previously been analysed and identified. Malware writers employ several techniques to hide the actual behaviour of their code. One of them, executable packing, consists of encrypting or hiding the real payload of the executable. Generic unpacking techniques do not depend on the packer used, because they execute the binary within an isolated environment (a "sandbox") to recover the real code of the packed executable. However, this approach is slow, so a filtering step is required to determine whether an executable has been packed. To this end, supervised machine learning approaches trained with static features extracted from the executables have been proposed. Nevertheless, supervised learning methods require a large number of packed and non-packed executables to be identified and labelled. In this paper, we propose a new method for packed executable detection that adopts a collective learning approach to reduce the labelling requirements of fully supervised approaches. We performed an empirical validation demonstrating that the system maintains a high accuracy rate while requiring less labelling effort than supervised learning.
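
The idea of a cheap static filter that needs only a few labelled samples can be illustrated with a small sketch. This is not the method evaluated in the paper: the abstract does not say which static features or which collective algorithm the authors use. The sketch assumes a single feature (byte entropy, which tends to be high for encrypted or compressed payloads) and swaps in a simple nearest-neighbour label propagation as a stand-in for collective learning. The names `byte_entropy` and `propagate_labels`, the labels, and the sample values are all hypothetical.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def propagate_labels(features, labels, k=1, max_rounds=10):
    """Toy stand-in for collective learning (an assumption, not the paper's
    algorithm): each unlabelled sample (label None) repeatedly takes the
    majority label of its k nearest currently-labelled neighbours."""
    current = list(labels)
    for _ in range(max_rounds):
        updated = list(current)
        for i, x in enumerate(features):
            if labels[i] is not None:
                continue  # never overwrite a hand-labelled sample
            neighbours = sorted(
                (abs(x - y), current[j])
                for j, y in enumerate(features)
                if j != i and current[j] is not None
            )[:k]
            if not neighbours:
                continue  # nothing labelled yet to learn from
            votes = Counter(label for _, label in neighbours)
            updated[i] = votes.most_common(1)[0][0]
        if updated == current:
            break  # labels stopped changing
        current = updated
    return current


# Hypothetical entropy values for six executables; only two are hand-labelled.
entropies = [0.5, 0.9, 1.2, 7.1, 7.6, 7.9]
known = ["not packed", None, None, None, None, "packed"]
predicted = propagate_labels(entropies, known, k=1)
```

In this toy setting two hand-labelled samples are enough to label the other four, which is the kind of saving in labelling effort the abstract describes. A realistic filter would use many static features and a proper collective or semi-supervised learner.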


Published In

CEAS '11: Proceedings of the 8th Annual Collaboration, Electronic messaging, Anti-Abuse and Spam Conference
September 2011
230 pages
ISBN:9781450307888
DOI:10.1145/2030376
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. executable packing
  2. machine learning
  3. malware

Qualifiers

  • Research-article

Conference

CEAS '11


Cited By

  • (2024) Identifying Malware Packers through Multilayer Feature Engineering in Static Analysis. Information 15:2 (102). DOI: 10.3390/info15020102. Online publication date: 9-Feb-2024
  • (2024) Feature selection for packer classification based on association rule mining. Engineering Applications of Artificial Intelligence 137 (109083). DOI: 10.1016/j.engappai.2024.109083. Online publication date: Nov-2024
  • (2023) A survey on run-time packers and mitigation techniques. International Journal of Information Security 23:2 (887-913). DOI: 10.1007/s10207-023-00759-y. Online publication date: 1-Nov-2023
  • (2022) File Packing from the Malware Perspective: Techniques, Analysis Approaches, and Directions for Enhancements. ACM Computing Surveys 55:5 (1-45). DOI: 10.1145/3530810. Online publication date: 3-Dec-2022
  • (2018) Machine Learning Aided Static Malware Analysis: A Survey and Tutorial. Cyber Threat Intelligence (7-45). DOI: 10.1007/978-3-319-73951-9_2. Online publication date: 24-Apr-2018
  • (2018) Collective Classification. Encyclopedia of Social Network Analysis and Mining (253-265). DOI: 10.1007/978-1-4939-7131-2_45. Online publication date: 12-Jun-2018
  • (2018) Packer identification method based on byte sequences. Concurrency and Computation: Practice and Experience 32:8. DOI: 10.1002/cpe.5082. Online publication date: 18-Nov-2018
  • (2017) Packer Detection for Multi-Layer Executables Using Entropy Analysis. Entropy 19:3 (125). DOI: 10.3390/e19030125. Online publication date: 16-Mar-2017
  • (2017) An Efficient Platform for the Automatic Extraction of Patterns in Native Code. Scientific Programming 2017 (3). DOI: 10.1155/2017/3273891. Online publication date: 1-Feb-2017
  • (2017) Packer identification based on metadata signature. Proceedings of the 7th Software Security, Protection, and Reverse Engineering / Software Security and Protection Workshop (1-11). DOI: 10.1145/3151137.3160687. Online publication date: 5-Dec-2017
