research-article

Detecting worm variants using machine learning

Authors:

Joseph Sventek

CoNEXT '07: Proceedings of the 2007 ACM CoNEXT conference

Article No.: 2, Pages 1 - 12

https://doi.org/10.1145/1364654.1364657

Published: 10 December 2007

Abstract

Network intrusion detection systems typically detect worms by examining packet or flow logs for known signatures. This approach means not only that worms cannot be detected until their signatures have been created, but also that variants of known worms remain undetected because their signatures differ. The intuitive solution is to write more generic signatures. That, however, would increase the false alarm rate and is therefore not feasible in practice. This paper reports on the feasibility of using a machine learning technique to detect variants of known worms in real time.

Support vector machines (SVMs) are a machine learning technique known to perform well at various pattern recognition tasks, such as text categorization and handwritten digit recognition. Given the efficacy of SVMs in standard pattern recognition problems, this work applies them to the worm detection problem. Specifically, we investigate the optimal configuration of SVMs and associated kernel functions for classifying various types of synthetically generated worms. We demonstrate that the optimal configuration for real-time detection of variants of known worms is a linear kernel with unnormalized bi-gram frequency counts as input.
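
To make the reported configuration concrete, the following is a minimal sketch of unnormalized bi-gram frequency counts and a linear kernel computed over them. It is not the authors' implementation: the abstract does not say what the bi-grams are taken over, so treating them as overlapping pairs of payload bytes is an assumption, and the function names and sample payloads are hypothetical. An SVM would use such kernel values during training and classification; the training step itself is omitted here.

```python
from collections import Counter


def bigram_counts(payload: bytes) -> Counter:
    """Count overlapping byte bi-grams; the counts are left unnormalized."""
    # Assumption: bi-grams are pairs of adjacent payload bytes.
    return Counter(payload[i:i + 2] for i in range(len(payload) - 1))


def linear_kernel(a: Counter, b: Counter) -> int:
    """Linear kernel: the dot product of two sparse count vectors."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller vector
    # Counter returns 0 for bi-grams that are absent from b.
    return sum(count * b[gram] for gram, count in a.items())


# Hypothetical payloads, invented purely for illustration.
known = bigram_counts(b"GET /default.ida?NNNNNNNN")
variant = bigram_counts(b"GET /default.ida?XXNNNNNN")
benign = bigram_counts(b"GET /index.html HTTP/1.0")

print(linear_kernel(known, variant), linear_kernel(known, benign))
```

In this toy case the variant shares most of its bi-grams with the known payload and therefore yields a larger kernel value than the benign request does, even though an exact-match signature for the known payload would not match the variant.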


Detecting worm variants using machine learning
1. Computing methodologies
  1. Machine learning
    1. Learning paradigms
      1. Supervised learning
    2. Machine learning approaches
2. Social and professional topics
  1. Computing / technology policy

Published In

CoNEXT '07: Proceedings of the 2007 ACM CoNEXT conference

December 2007

448 pages

ISBN:9781595937704

DOI:10.1145/1364654

General Chairs:
Jim Kurose
University of Massachusetts
,
Henning Schulzrinne
Columbia University

Copyright © 2007 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

Alcatel-Lucent
SIGCOMM: ACM Special Interest Group on Data Communication
Thomson
CISCO
IMDEA
IBM

Publisher

Association for Computing Machinery

New York, NY, United States

Acceptance Rates

Overall Acceptance Rate 198 of 789 submissions, 25%

Cited By

Yeboah-Ofori A, Swart C, Opoku-Boateng F, Islam S (2022). Cyber resilience in supply chain system security using machine learning for threat predictions. Continuity & Resilience Review, 4:1 (1-36). Online publication date: 9-Feb-2022.
https://doi.org/10.1108/CRR-10-2021-0034
Abdullahi Yari I, Dehling T, Kluge F, Geck J, Sunyaev A, Eskofier B (2021). Security Engineering of Patient-Centered Health Care Information Systems in Peer-to-Peer Environments: Systematic Review. Journal of Medical Internet Research, 23:11 (e24460). Online publication date: 15-Nov-2021.
https://doi.org/10.2196/24460
Yeboah-Ofori A, Boachie C (2019). Malware Attack Predictive Analytics in a Cyber Supply Chain Context Using Machine Learning. 2019 International Conference on Cyber Security and Internet of Things (ICSIoT) (66-73). Online publication date: May-2019.
https://doi.org/10.1109/ICSIoT47925.2019.00019
Naval S, Laxmi V, Gupta N, Gaur M, Rajarajan M, Poet R, Muttukrishnan R (2014). Exploring Worm Behaviors using DTW. Proceedings of the 7th International Conference on Security of Information and Networks (379-384). Online publication date: 9-Sep-2014.
https://dl.acm.org/doi/10.1145/2659651.2659737
Beaver J, Symons C, Gillen R (2013). A learning system for discriminating variants of malicious network traffic. Proceedings of the Eighth Annual Cyber Security and Information Intelligence Research Workshop (1-4). Online publication date: 8-Jan-2013.
https://dl.acm.org/doi/10.1145/2459976.2460003
Beaver J, Borges-Hink R, Buckner M (2013). An Evaluation of Machine Learning Methods to Detect Malicious SCADA Communications. Proceedings of the 2013 12th International Conference on Machine Learning and Applications - Volume 02 (54-59). Online publication date: 4-Dec-2013.
https://dl.acm.org/doi/10.1109/ICMLA.2013.105
