DOI: 10.1145/1298306.1298333
Article

NetworkMD: topology inference and failure diagnosis in the last mile

Published: 24 October 2007
  • Abstract

    Health monitoring and automated failure localization and diagnosis have become critical to service providers operating large distribution networks (e.g., digital cable and fiber-to-the-home) because of the increasing scale and complexity of the services they offer. Existing automated failure diagnosis solutions typically assume complete knowledge of the network topology, which is rarely available in practice. The solution presented in this paper, Network Management and Diagnosis (NetworkMD), is an automated failure diagnosis system that infers failure groups from historical failure data and, optionally, geographical information. The inferred failure groups mirror the missing topology and can be used to localize failures, diagnose the root causes of problems, and detect misconfiguration in known topologies. NetworkMD uses an unsupervised learning algorithm based on non-negative matrix factorization (NMF) to infer failure groups. Using cable networks as the primary example, we demonstrate the effectiveness of NetworkMD in both simulated settings and a real environment, using data collected from a commercial network that serves hundreds of thousands of customers via thousands of intermediate network devices.
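
    To make the idea of NMF-based failure groups concrete, the sketch below factorizes a small binary failure matrix V (rows are customers, columns are outage episodes) into non-negative factors W and H with the standard multiplicative updates of Lee and Seung. It is an illustration only and is not NetworkMD's actual algorithm, which this page does not describe; the data, the function names, the choice of k = 2, and the "largest weight wins" grouping rule are all assumptions made for the example.

```python
import random


def transpose(A):
    """Transpose a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def frobenius_error(V, W, H):
    """Squared Frobenius norm of V - W*H."""
    WH = matmul(W, H)
    return sum((v - r) ** 2
               for v_row, r_row in zip(V, WH)
               for v, r in zip(v_row, r_row))


def nmf(V, k, iterations=200, seed=0, eps=1e-9):
    """Approximate V (n x m) as W (n x k) times H (k x m), all entries >= 0.

    Uses multiplicative updates for the squared-error objective.
    eps keeps every denominator strictly positive.
    """
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.uniform(0.1, 1.0) for _ in range(k)] for _ in range(n)]
    H = [[rng.uniform(0.1, 1.0) for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H)
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(m)]
             for a in range(k)]
        # W <- W * (V H^T) / (W H H^T)
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(k)]
             for i in range(n)]
    return W, H


# Hypothetical data: failures[i][j] == 1 if customer i was down in episode j.
failures = [
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 1, 1, 1],
]

W, H = nmf(failures, k=2)
for customer, weights in enumerate(W):
    # Assumed grouping rule: assign each customer to its heaviest factor.
    group = max(range(len(weights)), key=lambda g: weights[g])
    print(f"customer {customer}: group {group}, "
          f"weights {[round(w, 2) for w in weights]}")
print("squared error:", round(frobenius_error(failures, W, H), 4))
```

    In this reading, each column of W plays the role of one candidate failure group and each row of H records the episodes in which that group was active. Customers that repeatedly fail together tend to load on the same factor, which is the sense in which inferred groups can stand in for a missing topology.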


      Published In

      IMC '07: Proceedings of the 7th ACM SIGCOMM conference on Internet measurement
      October 2007
      390 pages
      ISBN: 9781595939081
      DOI: 10.1145/1298306
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. failure diagnosis
      2. network topology inference

      Qualifiers

      • Article

      Conference

      IMC07: Internet Measurement Conference
      October 24 - 26, 2007
      San Diego, California, USA

      Acceptance Rates

      Overall Acceptance Rate 277 of 1,083 submissions, 26%

      Cited By

      • (2023) Topology reconstruction using time series data in telecommunication networks. Networks 83:2 (408-427). DOI: 10.1002/net.22196. Online publication date: 28-Nov-2023
      • (2016) A Multivariate Approach to Predicting Quantity of Failures in Broadband Networks Based on a Recurrent Neural Network. Journal of Network and Systems Management 24:1 (189-221). DOI: 10.1007/s10922-015-9348-6. Online publication date: 1-Jan-2016
      • (2013) A comparison of syslog and IS-IS for network failure analysis. Proceedings of the 2013 conference on Internet measurement conference (433-440). DOI: 10.1145/2504730.2504766. Online publication date: 23-Oct-2013
      • (2013) A Binary Independent Component Analysis Approach to Tree Topology Inference. IEEE Transactions on Signal Processing 61:12 (3071-3080). DOI: 10.1109/TSP.2013.2254476. Online publication date: 1-Jun-2013
      • (2010) NEVERMIND, the problem is already fixed. Proceedings of the 6th International COnference (1-12). DOI: 10.1145/1921168.1921178. Online publication date: 30-Nov-2010
      • (2010) California fault lines. ACM SIGCOMM Computer Communication Review 40:4 (315-326). DOI: 10.1145/1851275.1851220. Online publication date: 30-Aug-2010
      • (2010) California fault lines. Proceedings of the ACM SIGCOMM 2010 conference (315-326). DOI: 10.1145/1851182.1851220. Online publication date: 30-Aug-2010
      • (2010) Estimating the access link quality by active measurements. 2010 22nd International Teletraffic Congress (ITC 22) (1-8). DOI: 10.1109/ITC.2010.5608738. Online publication date: Sep-2010
      • (2009) QoEScope: Adaptive IP service management for heterogeneous enterprise networks. 2009 17th International Workshop on Quality of Service (1-5). DOI: 10.1109/IWQoS.2009.5201414. Online publication date: Jul-2009
      • (2009) Current Developments in DETER Cybersecurity Testbed Technology. Proceedings of the 2009 Cybersecurity Applications & Technology Conference for Homeland Security (57-70). DOI: 10.1109/CATCH.2009.30. Online publication date: 3-Mar-2009
