research-article

A New Approach to Dimensionality Reduction for Anomaly Detection in Data Traffic

Published: 01 September 2016

Abstract

The monitoring and management of high-volume, feature-rich traffic in large networks poses significant challenges in storage, transmission, and computational costs. The predominant approach to reducing these costs performs a linear mapping of the data to a low-dimensional subspace such that a specified large percentage of the variance in the data is preserved in the low-dimensional representation. This variance-based subspace approach to dimensionality reduction forces a fixed choice of the number of dimensions, is not responsive to real-time shifts in observed traffic patterns, and is vulnerable to normal traffic spoofing. Based on theoretical insights proved in this paper, we propose a new distance-based approach to dimensionality reduction, motivated by the fact that the real-time structural differences between the covariance matrices of the observed and the normal traffic are more relevant to anomaly detection than the structure of the training data alone. Our approach, called the distance-based subspace method, allows a different number of reduced dimensions in different time windows and arrives at only the number of dimensions necessary for effective anomaly detection. We present centralized and distributed versions of our algorithm and, using simulation on real traffic traces, demonstrate the qualitative and quantitative advantages of the distance-based subspace approach.
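To make the contrast in the abstract concrete, the following is a minimal sketch and not the algorithm from the paper. It shows the conventional variance-based choice of dimension (the smallest k whose leading eigenvalues retain a given fraction of the variance) and, as an illustrative assumption, one common way to measure how far apart the leading subspaces of two covariance matrices are (the projection distance). The function names, the 0.75 threshold, and the toy diagonal covariance matrices are all hypothetical.

```python
import numpy as np


def variance_based_dims(cov, fraction=0.95):
    """Smallest k whose top-k eigenvalues retain at least `fraction` of total variance."""
    # eigvalsh returns eigenvalues in ascending order; reverse to descending.
    eigvals = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)
    cumulative = np.cumsum(eigvals) / eigvals.sum()
    # First index where the cumulative share reaches the target, converted to a count.
    k = int(np.searchsorted(cumulative, fraction)) + 1
    return min(k, len(eigvals))


def top_subspace(cov, k):
    """Orthonormal basis (as columns) of the k leading eigenvectors of a symmetric matrix."""
    _, vecs = np.linalg.eigh(cov)
    return vecs[:, ::-1][:, :k]


def subspace_distance(U, V):
    """Projection distance ||U U^T - V V^T||_2 between two subspaces (illustrative choice)."""
    return float(np.linalg.norm(U @ U.T - V @ V.T, 2))


# Toy covariance matrices (hypothetical): the dominant direction shifts from axis 0 to axis 1.
normal_cov = np.diag([8.0, 1.0, 1.0])
observed_cov = np.diag([1.0, 8.0, 1.0])

k = variance_based_dims(normal_cov, fraction=0.75)
d = subspace_distance(top_subspace(normal_cov, k), top_subspace(observed_cov, k))
print(k, round(d, 6))
```

In this toy case the variance-based rule fixes k from the normal (training) covariance alone, while the distance between the two leading subspaces is close to 1 because the dominant directions are orthogonal. That is the kind of structural difference between observed and normal traffic that the abstract argues is more relevant to detection.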


Published In

IEEE Transactions on Network and Service Management, Volume 13, Issue 3
September 2016
349 pages

Publisher

IEEE Press
