research-article

A New Approach to Dimensionality Reduction for Anomaly Detection in Data Traffic

Authors:

Tingshan Huang,

Nagarajan Kandasamy

IEEE Transactions on Network and Service Management, Volume 13, Issue 3

Pages 651 - 665

https://doi.org/10.1109/TNSM.2016.2597125

Published: 01 September 2016

Abstract

The monitoring and management of high-volume, feature-rich traffic in large networks poses significant challenges in terms of storage, transmission, and computational costs. The predominant approach to reducing these costs is to perform a linear mapping of the data to a low-dimensional subspace such that a certain large percentage of the variance in the data is preserved in the low-dimensional representation. This variance-based subspace approach to dimensionality reduction forces a fixed choice of the number of dimensions, is not responsive to real-time shifts in observed traffic patterns, and is vulnerable to normal traffic spoofing. Based on theoretical insights proved in this paper, we propose a new distance-based approach to dimensionality reduction, motivated by the fact that the real-time structural differences between the covariance matrices of the observed and the normal traffic are more relevant to anomaly detection than the structure of the training data alone. Our approach, called the distance-based subspace method, allows a different number of reduced dimensions in different time windows and arrives at only the number of dimensions necessary for effective anomaly detection. We present centralized and distributed versions of our algorithm and, using simulation on real traffic traces, demonstrate the qualitative and quantitative advantages of the distance-based subspace approach.
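As a rough illustration of the variance-based baseline that the abstract criticizes (not the distance-based method proposed in the paper), the sketch below picks the smallest number of dimensions whose eigenvalues account for a chosen fraction of the total variance. The function name `dims_for_variance`, the `fraction` parameter, and the example eigenvalues are assumptions made for illustration; they do not come from the paper.

```python
def dims_for_variance(eigenvalues, fraction=0.95):
    """Return the smallest k such that the k largest eigenvalues hold
    at least `fraction` of the total variance (sum of all eigenvalues)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)  # largest variance first
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    running = 0.0
    for k, value in enumerate(ordered, start=1):
        running += value
        if running >= fraction * total:
            return k
    return len(ordered)


# Hypothetical eigenvalues of a training covariance matrix (illustrative only).
training_spectrum = [5.0, 3.0, 1.0, 0.5, 0.5]
k = dims_for_variance(training_spectrum, fraction=0.85)
```

Because `k` here depends only on the training spectrum, it stays the same in every later time window, which is the fixed-dimension limitation the abstract describes.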





Published In


IEEE Transactions on Network and Service Management Volume 13, Issue 3

September 2016

349 pages

ISSN:1932-4537


Copyright © 2016.

Publisher

IEEE Press
