DOI: 10.1145/1028788.1028802
Online identification of hierarchical heavy hitters: algorithms, evaluation, and applications

Published: 25 October 2004 Publication History

Abstract

In traffic monitoring, accounting, and network anomaly detection, it is often important to be able to detect high-volume traffic clusters in near real-time. Such heavy-hitter traffic clusters are often hierarchical (i.e., they may occur at different aggregation levels like ranges of IP addresses) and possibly multidimensional (i.e., they may involve the combination of different IP header fields like IP addresses, port numbers, and protocol). Without prior knowledge about the precise structures of such traffic clusters, a naive approach would require the monitoring system to examine all possible combinations of aggregates in order to detect the heavy hitters, which can be prohibitive in terms of computation resources.
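To make the cost of the naive approach concrete, the following is a minimal sketch, assuming a bit-granularity IPv4 prefix hierarchy on source and destination addresses (an assumption for illustration; the abstract does not fix the hierarchy). The names `prefixes` and `naive_update` are hypothetical. Under that assumption, every packet touches 33 x 33 = 1089 aggregate counters.

```python
import ipaddress
from collections import Counter

def prefixes(addr: str):
    """Yield every IPv4 prefix (/0 through /32) that contains addr."""
    ip = int(ipaddress.IPv4Address(addr))
    for length in range(33):
        # Mask keeping the top `length` bits; /0 keeps none.
        mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF if length else 0
        yield ipaddress.IPv4Network((ip & mask, length))

def naive_update(counts: Counter, src: str, dst: str, nbytes: int) -> int:
    """Add nbytes to every (source prefix, destination prefix) aggregate.

    Returns the number of counters touched by this single packet.
    """
    updates = 0
    for sp in prefixes(src):
        for dp in prefixes(dst):
            counts[(sp, dp)] += nbytes
            updates += 1
    return updates

counts = Counter()
touched = naive_update(counts, "10.1.2.3", "192.168.0.1", 100)
print(touched)  # number of aggregates updated for one packet
```

Adding more header fields (ports, protocol) multiplies this per-packet work further, which is the blow-up the abstract describes.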
In this paper, we focus on online identification of 1-dimensional and 2-dimensional hierarchical heavy hitters (HHHs), arguably the two most important scenarios in traffic analysis. We show that the problem of HHH detection can be transformed to one of dynamic packet classification by taking a top-down approach and adaptively creating new rules to match HHHs. We then adapt several existing static packet classification algorithms to support dynamic packet classification. The resulting HHH detection algorithms have much lower worst-case update costs than existing algorithms and can provide tunable deterministic accuracy guarantees. As an application of these algorithms, we also propose robust techniques to detect changes among heavy-hitter traffic clusters. Our techniques can accommodate variability due to sampling that is increasingly used in network measurement. Evaluation based on real Internet traces collected at a Tier-1 ISP suggests that these techniques are remarkably accurate and efficient.
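The top-down idea can be illustrated with a small sketch for the 1-dimensional case. This is not the paper's algorithm, only a simplified illustration under stated assumptions: a binary trie over IPv4 source prefixes, a single hypothetical `split_threshold`, and a child rule created one bit at a time once a node's own byte count exceeds that threshold. The class and method names (`Node`, `AdaptiveTrie`, `update`, `report`) are invented for this example.

```python
import ipaddress

class Node:
    def __init__(self, prefix: int, length: int):
        self.prefix = prefix   # prefix bits, left-aligned in a 32-bit integer
        self.length = length   # prefix length in bits
        self.count = 0         # bytes counted at this node only
        self.children = {}     # next bit (0 or 1) -> Node

class AdaptiveTrie:
    """Start with one rule (0.0.0.0/0) and refine only where traffic is heavy."""

    def __init__(self, split_threshold: int):
        self.root = Node(0, 0)
        self.split_threshold = split_threshold

    def update(self, addr: str, nbytes: int) -> None:
        ip = int(ipaddress.IPv4Address(addr))
        node = self.root
        # Classify the packet: descend to the longest existing matching rule.
        while node.length < 32:
            bit = (ip >> (31 - node.length)) & 1
            child = node.children.get(bit)
            if child is None:
                break
            node = child
        node.count += nbytes
        # Adaptively create a finer rule once this rule has become heavy.
        if node.count > self.split_threshold and node.length < 32:
            bit = (ip >> (31 - node.length)) & 1
            length = node.length + 1
            mask = (0xFFFFFFFF << (32 - length)) & 0xFFFFFFFF
            node.children[bit] = Node(ip & mask, length)

    def report(self, min_bytes: int) -> dict:
        """Return {prefix: subtree bytes} for prefixes with at least min_bytes."""
        out = {}
        def total(node: Node) -> int:
            t = node.count + sum(total(c) for c in node.children.values())
            if t >= min_bytes:
                out[f"{ipaddress.IPv4Address(node.prefix)}/{node.length}"] = t
            return t
        total(self.root)
        return out

trie = AdaptiveTrie(split_threshold=100)
for _ in range(10):
    trie.update("10.0.0.1", 50)
print(trie.report(min_bytes=200))
```

Each update walks at most 32 trie levels, so the per-packet cost is bounded regardless of how many rules exist. The sketch reports per-prefix subtree volumes, which are lower bounds because bytes counted at an ancestor before a split are not pushed down; it also omits the step of discounting descendants that are already heavy hitters, and it does not provide the tunable accuracy guarantees the abstract mentions.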


      Published In

      IMC '04: Proceedings of the 4th ACM SIGCOMM conference on Internet measurement
      October 2004
      386 pages
      ISBN:1581138210
      DOI:10.1145/1028788
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. change detection
      2. data stream computation
      3. hierarchical heavy hitters
      4. network anomaly detection
      5. packet classification

      Qualifiers

      • Article

      Conference

      IMC04
      Sponsor:
      IMC04: Internet Measurement Conference
      October 25 - 27, 2004
      Taormina, Sicily, Italy

      Acceptance Rates

      Overall Acceptance Rate 277 of 1,083 submissions, 26%
