Forecasting big time series: old and new

Published: 01 August 2018

    Abstract

    Time series forecasting is a key ingredient in the automation and optimization of business processes: in retail, deciding which products to order and where to store them depends on forecasts of future demand in different regions; in cloud computing, the estimated future usage of services and infrastructure components guides capacity planning; and workforce scheduling in warehouses, call centers, and factories requires forecasts of the future workload. Recent years have witnessed a paradigm shift in forecasting techniques and applications, from computer-assisted, model- and assumption-based approaches to data-driven, fully automated ones. This shift can be attributed to the availability of large, rich, and diverse time series data sources, which pose unprecedented challenges to traditional time series forecasting methods. This raises several questions. How can we build statistical models that efficiently and effectively learn to forecast from large and diverse data sources? How can we leverage the statistical power of "similar" time series to improve forecasts in the case of limited observations? What are the implications for building forecasting systems that can handle large data volumes?
    The objective of this tutorial is to provide a concise and intuitive overview of the most important methods and tools available for solving large-scale forecasting problems. We review the state of the art in three related fields: (1) classical modeling of time series, (2) scalable tensor methods, and (3) deep learning for forecasting. Further, we share lessons learned from building scalable forecasting systems. While our focus is on providing an intuitive overview of the methods and practical issues, we also present technical details underlying these powerful tools.


    Published In

    Proceedings of the VLDB Endowment  Volume 11, Issue 12
    August 2018
    426 pages
    ISSN:2150-8097

    Publisher

    VLDB Endowment

