
Anomaly Detection at Scale: The Case for Deep Distributional Time Series Models

  • Conference paper
Service-Oriented Computing – ICSOC 2020 Workshops (ICSOC 2020)

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 12632)

Abstract

This paper introduces a new methodology for detecting anomalies in time series data, with a primary application to monitoring the health of (micro-)services and cloud resources. The main novelty of our approach is that, instead of modeling time series of real values or vectors of real values, we model time series of probability distributions. This extension allows the technique to be applied to the common scenario in which data are generated by requests arriving at a service and are then aggregated at a fixed temporal frequency. We show the superior accuracy of our method on synthetic and public real-world data.
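
As a rough illustration of the aggregation scenario described in the abstract, the sketch below groups request-level measurements into fixed-width time intervals and turns each interval into a normalized histogram, so that the result is a time series of (empirical) distributions rather than of single values. This is not the authors' implementation: the function name `aggregate_to_histograms`, the bin edges, and the sample request log are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def aggregate_to_histograms(events, interval, bin_edges):
    """Turn (timestamp, value) events into one normalized histogram per interval.

    Hypothetical helper for illustration only; values outside the bin range
    are clipped into the first or last bin.
    """
    n_bins = len(bin_edges) - 1
    counts = defaultdict(lambda: [0] * n_bins)
    for timestamp, value in events:
        slot = int(timestamp // interval)        # index of the time interval
        idx = bisect_right(bin_edges, value) - 1  # index of the value bin
        idx = min(max(idx, 0), n_bins - 1)        # clip out-of-range values
        counts[slot][idx] += 1
    series = {}
    for slot in sorted(counts):
        total = sum(counts[slot])
        series[slot] = [c / total for c in counts[slot]]
    return series


# Hypothetical request log: (seconds since start, latency in milliseconds).
events = [(1, 12.0), (20, 15.0), (45, 110.0), (61, 14.0), (90, 13.0), (119, 16.0)]
bin_edges = [0, 50, 100, 200]  # three latency bins, assumed for this example

histograms = aggregate_to_histograms(events, interval=60, bin_edges=bin_edges)
for slot, hist in histograms.items():
    print(slot, hist)
```

Each entry of `histograms` is one observation in a time series of distributions. In this toy log, the first minute has a third of its mass in the slowest bin while the second minute has none, a difference that a single per-minute average would partly hide.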

Notes

  1. A collective anomaly consists of a subset of points that deviates from the rest of the dataset even though individually each point may appear normal.

  2. The code is available at https://github.com/awslabs/gluon-ts/tree/distribution_anomaly_detection/distribution_anomaly_detection.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Ayed, F., Stella, L., Januschowski, T., Gasthaus, J. (2021). Anomaly Detection at Scale: The Case for Deep Distributional Time Series Models. In: Hacid, H., et al. Service-Oriented Computing – ICSOC 2020 Workshops. ICSOC 2020. Lecture Notes in Computer Science, vol 12632. Springer, Cham. https://doi.org/10.1007/978-3-030-76352-7_14

  • DOI: https://doi.org/10.1007/978-3-030-76352-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-76351-0

  • Online ISBN: 978-3-030-76352-7

  • eBook Packages: Computer Science, Computer Science (R0)