research-article

OnlineSTL: scaling time series decomposition by 100x

Published: 01 March 2022

Abstract

Decomposing a complex time series into trend, seasonality, and remainder components is an important primitive that facilitates time series anomaly detection, change point detection, and forecasting. Although numerous batch algorithms are known for time series decomposition, none operate well in an online, scalable setting where high throughput and real-time response are paramount. In this paper, we propose OnlineSTL, a novel online algorithm for time series decomposition that is highly scalable and is deployed for real-time metrics monitoring on high-resolution, high-ingest-rate data. Experiments on different synthetic and real-world time series datasets demonstrate that OnlineSTL achieves orders-of-magnitude speedups (100x) for large seasonalities while maintaining quality of decomposition.
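
To make the three components concrete, the sketch below shows a toy additive decomposition, y_t = trend_t + seasonal_t + remainder_t, updated one point at a time. It is not the OnlineSTL algorithm, which the abstract does not specify. It is a hypothetical baseline that assumes a single known seasonal period, uses a trailing mean over one period as the trend, and smooths the detrended values per seasonal phase. The class name, the `period` and `alpha` parameters, and the `update` method are all illustrative assumptions.

```python
from collections import deque


class NaiveOnlineDecomposer:
    """Toy online additive decomposition (hypothetical, NOT OnlineSTL)."""

    def __init__(self, period, alpha=0.3):
        self.period = period                # assumed known seasonal period
        self.alpha = alpha                  # smoothing factor for seasonal estimates
        self.window = deque(maxlen=period)  # most recent `period` observations
        self.window_sum = 0.0               # running sum of the window
        self.seasonal = [0.0] * period      # one seasonal estimate per phase
        self.t = 0                          # number of points seen so far

    def update(self, y):
        """Consume one observation and return (trend, seasonal, remainder)."""
        # Trend: trailing mean over at most one period, maintained in O(1).
        if len(self.window) == self.period:
            self.window_sum -= self.window[0]
        self.window.append(y)
        self.window_sum += y
        trend = self.window_sum / len(self.window)

        # Seasonality: exponentially smooth the detrended value for this phase.
        phase = self.t % self.period
        detrended = y - trend
        self.seasonal[phase] += self.alpha * (detrended - self.seasonal[phase])
        seasonal = self.seasonal[phase]

        # Remainder: whatever trend and seasonality do not explain.
        remainder = y - trend - seasonal
        self.t += 1
        return trend, seasonal, remainder


if __name__ == "__main__":
    decomposer = NaiveOnlineDecomposer(period=4)
    for value in [1.0, 2.0, 3.0, 2.0] * 3:
        print(decomposer.update(value))
```

Each update costs constant time regardless of the period, which is the property an online setting needs. A real system would also have to handle outliers, multiple seasonalities, and trend changes, none of which this sketch attempts.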

    Published In

    Proceedings of the VLDB Endowment, Volume 15, Issue 7
    March 2022
    208 pages
    ISSN: 2150-8097

    Publisher

    VLDB Endowment

    Publication History

    Published: 01 March 2022
    Published in PVLDB Volume 15, Issue 7

    Qualifiers

    • Research-article

    Article Metrics

    • Downloads (last 12 months): 20
    • Downloads (last 6 weeks): 0
    Reflects downloads up to 10 Feb 2025

    Cited By

    • (2024) Forecasting Algorithms for Intelligent Resource Scaling: An Experimental Analysis. Proceedings of the 2024 ACM Symposium on Cloud Computing, 126-143. DOI: 10.1145/3698038.3698564. Online publication date: 20-Nov-2024
    • (2024) BacktrackSTL: Ultra-Fast Online Seasonal-Trend Decomposition with Backtrack Technique. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 5848-5859. DOI: 10.1145/3637528.3671510. Online publication date: 25-Aug-2024
    • (2024) PASS: Predictive Auto-Scaling System for Large-scale Enterprise Web Applications. Proceedings of the ACM Web Conference 2024, 2747-2758. DOI: 10.1145/3589334.3645330. Online publication date: 13-May-2024
    • (2024) Adaptive Seasonal-Trend Decomposition for Streaming Time Series Data with Transitions and Fluctuations in Seasonality. Machine Learning and Knowledge Discovery in Databases. Research Track, 426-443. DOI: 10.1007/978-3-031-70344-7_25. Online publication date: 8-Sep-2024
    • (2023) OneShotSTL: One-Shot Seasonal-Trend Decomposition For Online Time Series Anomaly Detection And Forecasting. Proceedings of the VLDB Endowment 16:6, 1399-1412. DOI: 10.14778/3583140.3583155. Online publication date: 1-Feb-2023
