Towards automated performance diagnosis in a large IPTV network

Published: 16 August 2009

Abstract

IPTV is increasingly being deployed and offered as a commercial service to residential broadband customers. Compared with traditional ISP networks, an IPTV distribution network (i) typically adopts a hierarchical rather than mesh-like structure, (ii) imposes more stringent requirements on both reliability and performance, (iii) uses different distribution protocols (which rely heavily on IP multicast) and exhibits different traffic patterns, and (iv) faces more serious scalability challenges in managing millions of network elements. These characteristics pose substantial challenges for the effective management of IPTV networks and services.
In this paper, we focus on characterizing and troubleshooting performance issues in one of the largest IPTV networks in North America. We collect a large amount of measurement data from a wide range of sources, including device usage and error logs, user activity logs, video quality alarms, and customer trouble tickets. We develop a novel diagnosis tool called Giza that is specifically tailored to the enormous scale and hierarchical structure of the IPTV network. Giza applies multi-resolution data analysis to quickly detect and localize regions in the IPTV distribution hierarchy that are experiencing serious performance problems. Giza then uses several statistical data mining techniques to troubleshoot the identified problems and diagnose their root causes. Validation against operational experiences demonstrates the effectiveness of Giza in detecting important performance issues and identifying interesting dependencies. The methodology and algorithms in Giza promise to be of great use in IPTV network operations.
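As a rough illustration of what localizing problem regions in a distribution hierarchy can look like, the sketch below applies a generic hierarchical heavy hitter style pass over per-location event counts. It is not taken from the paper and is not Giza's algorithm: the function name `hierarchical_heavy_hitters`, the location labels, the counts, and the fixed threshold are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def hierarchical_heavy_hitters(leaf_counts, threshold):
    """Illustrative sketch only (hypothetical helper, not Giza's algorithm).

    leaf_counts maps a location path (tuple from the top of the hierarchy
    downwards) to an event count. A prefix is reported when its count,
    after excluding events already attributed to reported descendants,
    is at least `threshold`.
    """
    # Group counts by depth in the hierarchy.
    by_level = defaultdict(lambda: defaultdict(int))
    for path, count in leaf_counts.items():
        by_level[len(path)][tuple(path)] += count

    heavy = {}
    # Walk from the deepest level towards the root.
    for level in range(max(by_level, default=0), 0, -1):
        for prefix, count in by_level[level].items():
            if count >= threshold:
                # Attribute these events to this node; do not pass them up.
                heavy[prefix] = count
            else:
                # Too few events here; let the parent account for them.
                by_level[level - 1][prefix[:-1]] += count
    return heavy


# Hypothetical event counts per (region, metro, central office).
events = {
    ("region-A", "metro-1", "co-1"): 40,
    ("region-A", "metro-1", "co-2"): 3,
    ("region-A", "metro-2", "co-3"): 4,
    ("region-A", "metro-2", "co-4"): 5,
    ("region-B", "metro-3", "co-5"): 2,
}
result = hierarchical_heavy_hitters(events, threshold=10)
for prefix, count in sorted(result.items()):
    print(prefix, count)
```

In this toy input, one central office is flagged on its own, while the remaining scattered events in the same region only become significant once aggregated at the region level. A real deployment would presumably normalize counts (for example, by the number of customers under each node) instead of using a fixed threshold.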

    Published In

    ACM SIGCOMM Computer Communication Review, Volume 39, Issue 4
    SIGCOMM '09
    October 2009
    325 pages
    ISSN: 0146-4833
    DOI: 10.1145/1594977

    SIGCOMM '09: Proceedings of the ACM SIGCOMM 2009 conference on Data communication
    August 2009
    340 pages
    ISBN: 9781605585949
    DOI: 10.1145/1592568
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 16 August 2009
    Published in SIGCOMM-CCR Volume 39, Issue 4

    Author Tags

    1. IPTV
    2. network diagnosis

    Qualifiers

    • Research-article

    Article Metrics

    • Downloads (Last 12 months): 39
    • Downloads (Last 6 weeks): 7
    Reflects downloads up to 03 Oct 2024

    Cited By

    • (2024) A survey on intelligent management of alerts and incidents in IT services. Journal of Network and Computer Applications 224 (103842). DOI: 10.1016/j.jnca.2024.103842. Online publication date: Apr-2024
    • (2023) Veritas: Answering Causal Queries from Video Streaming Traces. Proceedings of the ACM SIGCOMM 2023 Conference (738-753). DOI: 10.1145/3603269.3604828. Online publication date: 10-Sep-2023
    • (2022) Collaborative Management of Correlated Multicast Transfer. Data Center Networking (233-256). DOI: 10.1007/978-981-16-9368-7_10. Online publication date: 24-Feb-2022
    • (2020) Pitfalls of data-driven networking. Proceedings of the Workshop on Network Meets AI & ML (42-47). DOI: 10.1145/3405671.3405815. Online publication date: 10-Aug-2020
    • (2019) Heterogeneous Data Ensemble Learning in End-to-End Diagnosis for IPTV. 2019 20th Asia-Pacific Network Operations and Management Symposium (APNOMS) (1-6). DOI: 10.23919/APNOMS.2019.8892990. Online publication date: Sep-2019
    • (2019) Zooming in on wide-area latencies to a global cloud provider. Proceedings of the ACM Special Interest Group on Data Communication (104-116). DOI: 10.1145/3341302.3342073. Online publication date: 19-Aug-2019
    • (2019) Rigorous, Effortless and Timely Assessment of Cellular Network Changes. 2019 49th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN) (256-263). DOI: 10.1109/DSN.2019.00037. Online publication date: Jun-2019
    • (2019) Understanding the causal impact of the video delivery throughput on user engagement. Multimedia Tools and Applications 78:11 (15589-15604). DOI: 10.1007/s11042-018-7013-2. Online publication date: 1-Jun-2019
    • (2018) Delay Bounded Multi-Source Multicast in Software-Defined Networking. Electronics 7:1 (10). DOI: 10.3390/electronics7010010. Online publication date: 21-Jan-2018
    • (2018) Opportunities and Challenges Towards Cognitive IT Service Management in Real World. 2018 IEEE Symposium on Service-Oriented System Engineering (SOSE) (164-173). DOI: 10.1109/SOSE.2018.00028. Online publication date: Mar-2018
