Network Anomaly Detection Using Co-clustering

Authors: Evangelos E. Papalexakis, Peter Steenkiste

ASONAM '12: Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)

Pages 403 - 410

https://doi.org/10.1109/ASONAM.2012.72

Published: 26 August 2012

Abstract

Security was not a high priority among the design goals of the early Internet architecture. Today, however, Internet security is a rapidly growing concern. Internet attacks have become significantly more prevalent, yet the burden of detecting them still falls largely on end hosts and service providers, leaving system administrators to detect and block attacks on their own. In particular, as social networks have become central hubs of information and communication, they are increasingly the target of attention and attacks. The challenge is therefore to carefully distinguish malicious connections from normal ones. Previous work has shown that, for a variety of Internet attacks, a small subset of connection measurements is a good indicator of whether a connection is part of an attack. In this paper we examine the effectiveness of two different co-clustering algorithms that both cluster connections and mark which connection measurements strongly indicate what makes a given cluster anomalous relative to the total data set. We run experiments with these co-clustering algorithms on the KDD Cup 1999 data set. We find that soft co-clustering, run on samples of the data, finds consistent parameters that are strong indicators of anomalous detections and creates highly pure clusters. When running hard co-clustering on the full data set (over 100 runs), we obtain on average one cluster with 92.44% attack connections and the other with 75.84% normal connections. These results are on par with the KDD Cup 1999 winning entry, showing that co-clustering is a strong, unsupervised method for separating normal connections from anomalous ones. Finally, we believe that the ideas presented in this work may inspire research on anomaly detection in social networks, such as identifying spammers and fraudsters.
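The per-cluster percentages quoted in the abstract are label fractions within each cluster. The following is a minimal sketch of how such figures could be computed once cluster assignments and ground-truth labels are available. The function names (`cluster_composition`, `purity`) and the toy data are hypothetical illustrations and are not taken from the paper or from the KDD Cup 1999 data set; the co-clustering step itself is not shown.

```python
from collections import Counter, defaultdict


def cluster_composition(cluster_ids, labels):
    """Return {cluster_id: {label: fraction}} describing each cluster's makeup."""
    counts = defaultdict(Counter)
    for cid, label in zip(cluster_ids, labels):
        counts[cid][label] += 1
    composition = {}
    for cid, counter in counts.items():
        total = sum(counter.values())
        composition[cid] = {label: n / total for label, n in counter.items()}
    return composition


def purity(cluster_ids, labels):
    """Fraction of all points that carry the majority label of their cluster."""
    counts = defaultdict(Counter)
    for cid, label in zip(cluster_ids, labels):
        counts[cid][label] += 1
    majority_total = sum(max(counter.values()) for counter in counts.values())
    return majority_total / len(labels)


# Toy example (hypothetical): 8 connections assigned to 2 clusters.
cluster_ids = [0, 0, 0, 0, 1, 1, 1, 1]
labels = ["attack", "attack", "attack", "normal",
          "normal", "normal", "normal", "attack"]

comp = cluster_composition(cluster_ids, labels)
overall = purity(cluster_ids, labels)

for cid in sorted(comp):
    print(cid, comp[cid])
print("overall purity:", overall)
```

In this toy case, cluster 0 is 75% attack connections and cluster 1 is 75% normal connections. The labels are used only for evaluation after clustering, which is consistent with the abstract's description of co-clustering as an unsupervised method.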


Published In

ASONAM '12: Proceedings of the 2012 International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2012)

August 2012, 1390 pages

ISBN: 9780769547992

Publisher: IEEE Computer Society, United States

Overall Acceptance Rate: 116 of 549 submissions, 21%
