DOI: 10.1145/3469968.3469988
research-article

Sequencis - Distributed Application Fault Characteristics Discovery using Service Logs

Published: 06 October 2021

Abstract

Modern cloud-based applications commonly comprise multiple services and microservices in a distributed architecture, which makes debugging and isolating application faults a challenge. Currently available application performance management (APM) tools are intrusive: they rely on instrumentation that impacts application performance, and they do not combine analysis of KPIs with logs. Logs of distributed cloud applications contain massive amounts of data about application behavior and can potentially provide signals and insights into application faults. In this paper we propose Sequencis, a machine-learning-based, automated, and non-intrusive solution for discovering distributed application fault characteristics. Sequencis efficiently discerns sequential patterns in large volumes of logged events across distributed services and correlates those patterns with application errors and KPI indications such as high CPU utilization or other faults. In extensive experiments on several distributed applications, Sequencis demonstrates the ability to capture correlations between errors and log sequence patterns, including text-based, URL, and SQL query patterns.
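
The abstract does not spell out how Sequencis mines or scores patterns, so the following is only a minimal sketch of the general idea of correlating log event sequences with errors, not the authors' method. It assumes logs have already been parsed into per-session lists of event names with an error flag, counts contiguous event runs (n-grams), and scores each run by lift: the error rate of sessions containing it divided by the overall error rate. The names `contiguous_patterns`, `pattern_error_lift`, and the sample sessions are hypothetical.

```python
from collections import Counter


def contiguous_patterns(events, n):
    """Return the distinct runs of n consecutive events in one session."""
    return {tuple(events[i:i + n]) for i in range(len(events) - n + 1)}


def pattern_error_lift(sessions, n=2, min_support=2):
    """Score each pattern by how much more often it co-occurs with errors.

    sessions: list of (events, has_error) pairs (an assumed input format).
    Returns {pattern: lift}; a lift above 1.0 means sessions containing
    the pattern fail more often than the average session.
    """
    total = len(sessions)
    errors = sum(1 for _, has_error in sessions if has_error)
    if total == 0 or errors == 0:
        return {}
    base_rate = errors / total

    support = Counter()        # sessions containing the pattern
    error_support = Counter()  # failing sessions containing the pattern
    for events, has_error in sessions:
        for pattern in contiguous_patterns(events, n):
            support[pattern] += 1
            if has_error:
                error_support[pattern] += 1

    return {
        pattern: (error_support[pattern] / count) / base_rate
        for pattern, count in support.items()
        if count >= min_support  # ignore patterns seen too rarely
    }


# Hypothetical sessions: event names stand in for parsed log templates.
sessions = [
    (["login", "query", "timeout"], True),
    (["login", "query", "render"], False),
    (["login", "query", "timeout"], True),
    (["login", "render"], False),
]
lift = pattern_error_lift(sessions)
for pattern, value in sorted(lift.items(), key=lambda kv: -kv[1]):
    print(pattern, round(value, 2))
```

Lift is used here only because it is simple to compute and interpret; a real pipeline would likely need gap-tolerant patterns and a statistical significance test before reporting a correlation.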

Published In

ICBDC '21: Proceedings of the 6th International Conference on Big Data and Computing
May 2021
218 pages
ISBN: 9781450389808
DOI: 10.1145/3469968

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Clustering
  2. Log Analysis
  3. Problem Identification
  4. Sequence Embedding
  5. Service Systems
  6. Web Apps

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICBDC 2021

