DOI: 10.1145/1254882.1254890
Article

Diagnosing network disruptions with network-wide analysis

Published: 12 June 2007

Abstract

To maintain high availability in the face of changing network conditions, network operators must quickly detect, identify, and react to events that cause network disruptions. One way to accomplish this goal is to monitor routing dynamics by analyzing routing update streams collected from routers. Existing monitoring approaches typically treat streams of routing updates from different routers as independent signals and report only the "loud" events (i.e., events that involve a large volume of routing messages). In this paper, we examine BGP routing data from all routers in the Abilene backbone for six months and correlate them with a catalog of all known disruptions to its nodes and links. We find that many important events are not loud enough to be detected from a single stream. Instead, they become detectable only when multiple BGP update streams are examined simultaneously, because routing updates exhibit network-wide dependencies.
This paper proposes using network-wide analysis of routing information to diagnose (i.e., detect and identify) network disruptions. To detect network disruptions, we apply a multivariate analysis technique to dynamic routing information (i.e., update traffic from all the Abilene routers) and find that this technique can detect every reported disruption to nodes and links within the network with a low rate of false alarms. To identify the type of disruption, we jointly analyze both the network-wide static configuration and details in the dynamic routing updates; we find that our method can correctly explain the scenario that caused the disruption. Although much work remains to make network-wide analysis of routing data operationally practical, our results illustrate the importance and potential of such an approach.
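
The detection idea can be illustrated with a minimal sketch. The abstract does not name the multivariate technique, so the code below assumes a PCA-based subspace separation: the leading principal components of a (time bins x routers) matrix of update counts model normal, correlated behavior, and the energy left in the residual flags unusual bins. The function names, the mean-plus-three-standard-deviations cutoff, and the synthetic data are all hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np


def spe_scores(counts, n_normal):
    """Squared prediction error of each time bin outside the 'normal' subspace.

    counts: array of shape (time_bins, routers), one update count per bin per router.
    n_normal: number of leading principal components treated as normal behavior.
    """
    centered = counts - counts.mean(axis=0)
    # Rows of vt are principal directions across routers, strongest first.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[:n_normal].T  # shape (routers, n_normal)
    residual = centered - centered @ normal @ normal.T
    return np.sum(residual ** 2, axis=1)


def detect_disruptions(counts, n_normal=1, n_sigmas=3.0):
    """Return indices of time bins whose residual energy is unusually large."""
    spe = spe_scores(counts, n_normal)
    # Hypothetical cutoff; a formal statistical threshold could replace it.
    threshold = spe.mean() + n_sigmas * spe.std()
    return np.flatnonzero(spe > threshold)


def synthetic_counts(seed=0, time_bins=500, routers=8, event_bin=300):
    """Made-up data: a shared periodic load plus one quiet network-wide event."""
    rng = np.random.default_rng(seed)
    t = np.arange(time_bins)
    shared = 100 + 20 * np.sin(2 * np.pi * t / 100)
    weights = np.linspace(0.5, 1.5, routers)
    counts = np.outer(shared, weights) + rng.normal(0.0, 1.0, (time_bins, routers))
    # Small shift on every router, alternating in sign, so no single stream is "loud".
    counts[event_bin] += 5.0 * np.where(np.arange(routers) % 2 == 0, 1.0, -1.0)
    return counts, event_bin


if __name__ == "__main__":
    counts, event_bin = synthetic_counts()
    z = np.abs((counts - counts.mean(axis=0)) / counts.std(axis=0))
    print("largest per-router z-score at event:", z[event_bin].max())
    print("bins flagged network-wide:", detect_disruptions(counts))
```

In this toy setup the injected event stays well within the normal range of every individual stream, yet it stands out once the streams are examined jointly, because it breaks the correlation pattern that the shared load imposes across routers.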



Published In

SIGMETRICS '07: Proceedings of the 2007 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
June 2007, 398 pages
ISBN: 9781595936394
DOI: 10.1145/1254882

ACM SIGMETRICS Performance Evaluation Review, Volume 35, Issue 1
SIGMETRICS '07 Conference Proceedings
June 2007, 382 pages
ISSN: 0163-5999
DOI: 10.1145/1269899


Publisher

Association for Computing Machinery, New York, NY, United States


Author Tags

1. anomaly detection
2. network management
3. statistical inference

Qualifiers

• Article

Conference

SIGMETRICS07

Acceptance Rates

Overall Acceptance Rate 459 of 2,691 submissions, 17%
