FlowRadar: a better NetFlow for data centers

Published: 16 March 2016
DOI: 10.5555/2930611.2930632

Abstract

NetFlow has been a widely used monitoring tool with a variety of applications. NetFlow maintains an active working set of flows in a hash table that supports flow insertion, collision resolution, and flow removal. This is hard to implement in merchant silicon at data center switches, which have limited per-packet processing time. Therefore, many NetFlow implementations and other monitoring solutions have to sample or select a subset of packets to monitor. In this paper, we observe the need to monitor all flows without sampling at short time scales. Thus, we design FlowRadar, a new way to maintain flows and their counters that scales to a large number of flows with small memory and bandwidth overhead. The key idea of FlowRadar is to encode per-flow counters with a small memory and constant insertion time at switches, and then to leverage the computing power at the remote collector to perform network-wide decoding and analysis of the flow counters. Our evaluation shows that the memory usage of FlowRadar is close to that of traditional NetFlow with perfect hashing. With FlowRadar, operators can get better views into their networks, as demonstrated by two new monitoring applications we build on top of FlowRadar.
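
The abstract does not spell out the encoding, so the following is only a minimal sketch of how "constant insertion time at switches, decoding at the collector" could look, assuming a structure in the style of an invertible Bloom lookup table. The class name `EncodedFlowSet`, the function `decode`, the field names, the table sizes, and the use of a Python `set` in place of a hardware-friendly membership filter are all hypothetical choices made for illustration, not details taken from the paper.

```python
import hashlib


class EncodedFlowSet:
    """Hypothetical switch-side table: fixed memory, k cell updates per packet."""

    def __init__(self, num_cells=300, num_hashes=3):
        self.k = num_hashes
        self.sub = num_cells // num_hashes   # cells per sub-table
        size = self.sub * self.k
        self.seen = set()                    # assumption: stand-in for a Bloom filter
        self.flow_xor = [0] * size           # XOR of the flow IDs mapped to each cell
        self.flow_count = [0] * size         # number of distinct flows in each cell
        self.packet_count = [0] * size       # total packets of those flows

    def cells(self, flow_id):
        # One cell per sub-table, so a flow always maps to k distinct cells.
        out = []
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{flow_id}".encode()).digest()
            out.append(i * self.sub + int.from_bytes(digest[:8], "big") % self.sub)
        return out

    def add_packet(self, flow_id):
        # Constant work per packet: no collision resolution, no eviction.
        is_new = flow_id not in self.seen
        self.seen.add(flow_id)
        for c in self.cells(flow_id):
            if is_new:
                self.flow_xor[c] ^= flow_id
                self.flow_count[c] += 1
            self.packet_count[c] += 1


def decode(table):
    """Collector-side peeling: repeatedly extract cells that hold exactly one flow."""
    fx = list(table.flow_xor)
    fc = list(table.flow_count)
    pc = list(table.packet_count)
    counters = {}
    progress = True
    while progress:
        progress = False
        for c in range(len(fc)):
            if fc[c] != 1:
                continue
            flow_id, packets = fx[c], pc[c]
            counters[flow_id] = packets
            for j in table.cells(flow_id):   # remove the flow from all of its cells
                fx[j] ^= flow_id
                fc[j] -= 1
                pc[j] -= packets
            progress = True
    # The second value is False if peeling stalled before every cell was emptied.
    return counters, all(n == 0 for n in fc)


# Usage: integer flow IDs stand in for hashed 5-tuples.
table = EncodedFlowSet()
truth = {101: 4, 202: 1, 303: 7, 404: 2, 505: 3}
for flow_id, n in truth.items():
    for _ in range(n):
        table.add_packet(flow_id)
counters, complete = decode(table)
```

In this sketch the switch never searches or resolves collisions; it only updates a fixed number of cells per packet, and all of the iterative work is pushed to `decode`. Decoding can stall when too many flows share cells, which is why the function also reports whether it completed.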

Published In

NSDI'16: Proceedings of the 13th USENIX Conference on Networked Systems Design and Implementation
March 2016
699 pages
ISBN: 9781931971294

Sponsors

  • VMware
  • Google Inc.
  • Microsoft Research
  • Facebook

Publisher

USENIX Association

United States

Cited By

  • (2024) Diagnosing application-network anomalies for millions of IPs in production clouds. Proceedings of the 2024 USENIX Conference on Usenix Annual Technical Conference, pp. 885-899. DOI: 10.5555/3691992.3692046. Online publication date: 10-Jul-2024.
  • (2024) NetGSR: Towards Efficient and Reliable Network Monitoring with Generative Super Resolution. Proceedings of the ACM on Networking 2:CoNEXT4, pp. 1-27. DOI: 10.1145/3696400. Online publication date: 25-Nov-2024.
  • (2024) SPArch: A Hardware-oriented Sketch-based Architecture for High-speed Network Flow Measurements. ACM Transactions on Privacy and Security 27:4, pp. 1-34. DOI: 10.1145/3687477. Online publication date: 8-Aug-2024.
  • (2024) Eagle: Toward Scalable and Near-Optimal Network-Wide Sketch Deployment in Network Measurement. Proceedings of the ACM SIGCOMM 2024 Conference, pp. 291-310. DOI: 10.1145/3651890.3672244. Online publication date: 4-Aug-2024.
  • (2023) Securing Public Clouds using Dynamic Communication Graphs. Proceedings of the 22nd ACM Workshop on Hot Topics in Networks, pp. 272-279. DOI: 10.1145/3626111.3628198. Online publication date: 28-Nov-2023.
  • (2023) A Holistic View of AI-driven Network Incident Management. Proceedings of the 22nd ACM Workshop on Hot Topics in Networks, pp. 180-188. DOI: 10.1145/3626111.3628176. Online publication date: 28-Nov-2023.
  • (2023) LadderFilter: Filtering Infrequent Items with Small Memory and Time Overhead. Proceedings of the ACM on Management of Data 1:1, pp. 1-21. DOI: 10.1145/3588690. Online publication date: 30-May-2023.
  • (2023) Network Monitoring on Multi-Pipe Switches. Proceedings of the ACM on Measurement and Analysis of Computing Systems 7:1, pp. 1-31. DOI: 10.1145/3579321. Online publication date: 2-Mar-2023.
  • (2022) Stingy sketch. Proceedings of the VLDB Endowment 15:7, pp. 1426-1438. DOI: 10.14778/3523210.3523220. Online publication date: 1-Mar-2022.
  • (2022) A fine-grained telemetry stream for security services in 5G open radio access networks. Proceedings of the 1st International Workshop on Emerging Topics in Wireless, pp. 18-23. DOI: 10.1145/3565474.3569070. Online publication date: 9-Dec-2022.
