research-article

An effective parallel convolutional anomaly multi-classification model for fault diagnosis in microservice system

Published: 21 May 2024

Abstract

Microservice architecture is a new technology for deploying large-scale applications and services in the cloud. However, cloud systems increasingly generate multivariate time series data that contain anomalies. Effectively diagnosing runtime anomalies is therefore necessary to ensure the quality of service of microservice systems. Typical anomaly detection methods are effective for ensuring data quality and computing reliability in cloud computing. However, they all focus on one-class anomaly detection, which may not perform well on practical microservice frameworks that exhibit diverse types of anomalies. Furthermore, once an anomaly is detected, locating its root cause is essential to eliminating it. To address these issues, we propose an effective parallel convolutional anomaly multi-classification model (PCAC), based on an attention mechanism, for fault diagnosis in microservice systems. We first construct a parallel convolutional structure that allows subnetworks to extract features independently. Then, channel and spatial attention mechanisms are applied in the parallel convolutional layers to mitigate the loss of feature representation. Finally, causal inference based on the anomalous graph is used to locate the fault in the microservice system. Experimental results show that the proposed model achieves the highest F1 scores on six public microservice datasets, improving average macro-F1 by 37.9% and average micro-F1 by 4.4%, and outperforms eight state-of-the-art methods.
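The abstract reports gains in both macro-F1 and micro-F1. As a reminder of how the two averages differ in a multi-class setting, the following is a minimal sketch using the standard definitions; the function name `f1_scores` and the toy labels are hypothetical illustrations and are not taken from the paper or its datasets.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, so frequent classes dominate.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical toy data: class 0 = normal, classes 1 and 2 = rare anomaly types.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10  # a classifier that always predicts "normal"

macro, micro = f1_scores(y_true, y_pred)
print(f"macro-F1={macro:.3f}, micro-F1={micro:.3f}")
# prints "macro-F1=0.296, micro-F1=0.800"
```

In this toy case a classifier that never detects the rare anomaly classes still scores a high micro-F1 but a low macro-F1. This is one plausible reason, under the assumption of imbalanced anomaly classes, why a multi-classification model could show a much larger improvement in macro-F1 than in micro-F1.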


Published In

Software Quality Journal, Volume 32, Issue 3
Sep 2024
519 pages

Publisher

Kluwer Academic Publishers

United States

Publication History

Published: 21 May 2024
Accepted: 01 April 2024

Author Tags

  1. Microservice
  2. Attention mechanism
  3. Anomaly multi-classification
  4. Fault diagnosis

Qualifiers

  • Research-article

Funding Sources

  • Chunhui Project of Ministry of Education of China
  • National Natural Science Foundation
  • Science and Technology Program of Sichuan Province
