DOI: 10.1145/3399579.3399869
short-paper
Open access

Resilient Neural Forecasting Systems

Published: 17 June 2020

Abstract

Industrial machine learning systems face data challenges that are often under-explored in the academic literature. Common data challenges are data distribution shifts, missing values and anomalies. In this paper, we discuss data challenges and solutions in the context of a Neural Forecasting application on labor planning. We discuss how to make this forecasting system resilient to these data challenges. We address changes in data distribution with a periodic retraining scheme and discuss the critical importance of model stability in this setting. Furthermore, we show how our deep learning model deals with missing values natively without requiring imputation. Finally, we describe how we detect anomalies in the input data and mitigate their effect before they impact the forecasts. This results in a fully autonomous forecasting system that compares favourably to a hybrid system consisting of the algorithm and human overrides.
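The abstract does not specify how the model handles missing values without imputation. One common approach, assumed here purely for illustration, is to mask missing targets out of the training loss so that no filled-in value is ever scored. The function name `masked_gaussian_nll`, the NaN encoding of missing values, and the Gaussian likelihood are all hypothetical choices for this sketch, not details taken from the paper.

```python
import math


def masked_gaussian_nll(targets, means, stds):
    """Average Gaussian negative log-likelihood over observed targets only.

    Assumption for this sketch: a missing target is encoded as NaN.
    Missing points are skipped, so they add nothing to the loss and
    no imputed value is needed.
    """
    total = 0.0
    observed = 0
    for y, mu, sigma in zip(targets, means, stds):
        if math.isnan(y):
            continue  # skip the missing observation instead of imputing it
        total += 0.5 * math.log(2 * math.pi * sigma ** 2)
        total += (y - mu) ** 2 / (2 * sigma ** 2)
        observed += 1
    # With no observed points there is nothing to score.
    return total / observed if observed else 0.0


if __name__ == "__main__":
    nan = float("nan")
    # The second target is missing; its (poor) prediction of 100.0 is ignored.
    print(masked_gaussian_nll([1.0, nan, 3.0], [1.0, 100.0, 3.0], [1.0, 1.0, 1.0]))
```

In this usage example the two observed points are predicted exactly, so the result is the Gaussian normalising term alone, 0.5 ln(2π), regardless of the prediction made at the missing position.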

Published In

DEEM '20: Proceedings of the Fourth International Workshop on Data Management for End-to-End Machine Learning
June 2020
50 pages
ISBN:9781450380232
DOI:10.1145/3399579
Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Missing values
  2. anomaly detection
  3. data quality
  4. forecasting

Qualifiers

  • Short-paper
  • Research
  • Refereed limited

Conference

SIGMOD/PODS '20
Acceptance Rates

DEEM '20 Paper Acceptance Rate 4 of 8 submissions, 50%;
Overall Acceptance Rate 44 of 67 submissions, 66%
