A new intrusion detection system using support vector machines and hierarchical clustering

Published: 01 October 2007

Abstract

Whenever an intrusion occurs, the security and value of a computer system are compromised. Network-based attacks make it difficult for legitimate users to access network services by deliberately occupying or sabotaging network resources and services. Attackers do this by sending large amounts of network traffic, exploiting well-known faults in networking services, and overloading network hosts. Intrusion detection attempts to detect computer attacks by examining data records observed in processes on the network. Intrusion detection systems fall into two groups: anomaly detection systems and misuse detection systems. Anomaly detection searches for malicious behavior that deviates from established normal patterns. Misuse detection identifies intrusions that match known attack scenarios. Our interest here is in anomaly detection, and our proposed method is a scalable solution for detecting network-based anomalies. We use Support Vector Machines (SVM) for classification. The SVM is one of the most successful classification algorithms in the data mining area, but its long training time limits its use. This paper presents a study on reducing the training time of SVM, specifically when dealing with large data sets, using hierarchical clustering analysis. We use the Dynamically Growing Self-Organizing Tree (DGSOT) algorithm for clustering because it has been shown to overcome the drawbacks of traditional hierarchical clustering algorithms (e.g., hierarchical agglomerative clustering). Clustering analysis helps find the boundary points between two classes, which are the most qualified data points for training the SVM. We present a new approach that combines SVM and DGSOT: it starts with an initial training set and expands it gradually using the clustering structure produced by the DGSOT algorithm. We compare our approach with the Rocchio Bundling technique and random selection in terms of accuracy loss and training time gain using a single benchmark real data set. We show that our proposed variations contribute significantly to improving the training process of SVM with high generalization accuracy and outperform the Rocchio Bundling technique.
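The idea of starting from a small training set and growing it along a clustering structure can be illustrated with a minimal sketch. It is not the paper's algorithm: DGSOT is not implemented, the clusters are taken as given input, the classifier is a simple linear SVM trained by subgradient descent on the hinge loss with no bias term (which presumes roughly centered data), and the rule "expand the one cluster per class whose centroid is nearest the current boundary" is a simplifying assumption. All function names and parameters are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Point = Tuple[float, ...]
Sample = Tuple[Point, int]  # (feature vector, label in {+1, -1})


def centroid(points: Sequence[Point]) -> Point:
    """Mean of a cluster; stands in for a tree node's reference vector."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def decision(w: Sequence[float], x: Point) -> float:
    return sum(wi * xi for wi, xi in zip(w, x))


def predict(w: Sequence[float], x: Point) -> int:
    return 1 if decision(w, x) >= 0 else -1


def train_linear_svm(samples: Sequence[Sample], epochs: int = 200,
                     lam: float = 0.01, lr: float = 0.1) -> List[float]:
    """Subgradient descent on the L2-regularised hinge loss (assumed, no bias)."""
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, y in samples:
            margin = y * decision(w, x)
            w = [wi * (1.0 - lr * lam) for wi in w]  # regularisation shrink
            if margin < 1.0:                         # hinge loss is active
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
    return w


def build_samples(clusters: Dict[int, List[List[Point]]],
                  expanded: Dict[int, Set[int]]) -> List[Sample]:
    """Expanded clusters contribute all members; the rest only a centroid."""
    samples: List[Sample] = []
    for label, groups in clusters.items():
        for i, group in enumerate(groups):
            if i in expanded[label]:
                samples.extend((p, label) for p in group)
            else:
                samples.append((centroid(group), label))
    return samples


def select_and_train(clusters: Dict[int, List[List[Point]]], rounds: int = 1):
    """Train on centroids, then repeatedly expand boundary clusters and retrain."""
    expanded: Dict[int, Set[int]] = {label: set() for label in clusters}
    samples = build_samples(clusters, expanded)
    w = train_linear_svm(samples)
    for _ in range(rounds):
        for label, groups in clusters.items():
            pending = [i for i in range(len(groups)) if i not in expanded[label]]
            if pending:
                # Smallest signed margin = centroid closest to the boundary.
                nearest = min(pending,
                              key=lambda i: label * decision(w, centroid(groups[i])))
                expanded[label].add(nearest)
        samples = build_samples(clusters, expanded)
        w = train_linear_svm(samples)
    return w, samples


# Toy data: per class, one cluster near the boundary and one far from it.
clusters = {
    +1: [[(1.0, 1.0), (1.5, 1.0), (1.0, 1.5)], [(5.0, 5.0), (5.5, 5.0), (5.0, 5.5)]],
    -1: [[(-1.0, -1.0), (-1.5, -1.0), (-1.0, -1.5)], [(-5.0, -5.0), (-5.5, -5.0), (-5.0, -5.5)]],
}
w, samples = select_and_train(clusters, rounds=1)
print(len(samples))  # 8 training items instead of the 12 raw points
```

In this toy run the far clusters stay summarized by their centroids while the near clusters are expanded to their member points, so the classifier is retrained on fewer items than the full data set. A faithful version would replace the given clusters with a DGSOT tree and the expansion rule with the paper's own criteria.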




Published In

The VLDB Journal — The International Journal on Very Large Data Bases  Volume 16, Issue 4
October 2007
124 pages

Publisher

Springer-Verlag

Berlin, Heidelberg
