DOI: 10.1145/1364654.1364657

Detecting worm variants using machine learning

Published: 10 December 2007

Abstract

Network intrusion detection systems typically detect worms by examining packet or flow logs for known signatures. This approach means not only that a worm cannot be detected until its signature has been created, but also that variants of known worms remain undetected because their signatures differ. The intuitive solution is to write more generic signatures. That solution, however, would increase the false alarm rate and is therefore not feasible in practice. This paper reports on the feasibility of using a machine learning technique to detect variants of known worms in real time.
Support vector machines (SVMs) are a machine learning technique known to perform well at various pattern recognition tasks, such as text categorization and handwritten digit recognition. Given the efficacy of SVMs in standard pattern recognition problems, this work applies them to the worm detection problem. Specifically, we investigate the optimal configuration of SVMs and associated kernel functions for classifying various types of synthetically generated worms. We demonstrate that the optimal configuration for real-time detection of variants of known worms is a linear kernel with unnormalized bi-gram frequency counts as input.
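
The input representation named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `bigram_counts` and `linear_kernel` are hypothetical, and the sketch assumes that a "bi-gram" means two consecutive bytes of a payload, which the abstract does not spell out. It shows only how raw (unnormalized) bi-gram counts could be built and compared with a linear kernel, which is a plain dot product; training the SVM itself is left out.

```python
from collections import Counter


def bigram_counts(payload: bytes) -> Counter:
    """Count overlapping byte bi-grams in a payload.

    Assumption: a bi-gram is two consecutive bytes. Counts are raw
    frequencies with no normalization applied.
    """
    return Counter(payload[i:i + 2] for i in range(len(payload) - 1))


def linear_kernel(a: Counter, b: Counter) -> int:
    """Linear kernel: dot product of two sparse count vectors."""
    # Iterate over the smaller vector; a Counter returns 0 for missing keys.
    if len(a) > len(b):
        a, b = b, a
    return sum(count * b[gram] for gram, count in a.items())


# Hypothetical payloads, for illustration only.
known = bigram_counts(b"abab")    # "ab" occurs twice, "ba" once
variant = bigram_counts(b"abc")   # "ab" once, "bc" once
similarity = linear_kernel(known, variant)
```

With a linear kernel, a trained SVM scores a payload with one dot product against a single weight vector, so the per-payload cost does not grow with the number of support vectors. That is one plausible reason such a configuration would suit real-time detection.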


Published In

CoNEXT '07: Proceedings of the 2007 ACM CoNEXT conference
December 2007
448 pages
ISBN: 9781595937704
DOI: 10.1145/1364654

Publisher

Association for Computing Machinery

New York, NY, United States


Qualifiers

  • Research-article

Acceptance Rates

Overall Acceptance Rate 198 of 789 submissions, 25%
