
Evaluation-Free Time-Series Forecasting Model Selection via Meta-Learning

Online AM: 24 January 2025

Abstract

Time-series forecasting models are widely used across many domains for crucial decision-making. Traditionally, these models are constructed by experts with considerable manual effort. Unfortunately, this approach scales poorly when accurate forecasts are needed for new datasets from diverse applications. Without access to domain expertise, one alternative is to train every candidate model on the new time-series data and then select the best one. However, this approach is nonviable in practice. In this work, we develop techniques for the fast, automatic selection of the best forecasting model for a new, unseen time-series dataset, without having to first train (or evaluate) all the models on the new data. In particular, we develop a forecasting meta-learning approach, called AutoForecast, that allows quick inference of the best time-series forecasting model for an unseen dataset. Our approach learns both the performance of forecasting models over the time horizon of a given dataset and task similarity across different datasets. Experiments demonstrate the effectiveness of the approach over state-of-the-art (SOTA) single and ensemble methods and several SOTA meta-learners (adapted to our problem) in selecting better forecasting models (i.e., a 2X gain) for unseen tasks on univariate and multivariate testbeds. AutoForecast also achieves a significant reduction in inference time compared to the naïve approach (running inference with all possible models and then selecting the best one), with a median speedup of 42X across the two testbeds. We release our meta-learning database corpus (348 datasets), the performances of the 322 forecasting models on the corpus, the meta-features, and the source code, so that the community can use them for forecasting model selection and build on them with new datasets and models, helping advance the automation of time-series forecasting.
In our released database corpus, we unveil new traces of Adobe computing cluster usage for production workloads.
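The core idea of meta-learning-based model selection described in the abstract — learn offline how model performance relates to dataset characteristics, then at inference time pick a model for an unseen series without training any candidate — can be sketched as follows. This is a minimal, hypothetical illustration: the meta-features, candidate model names, synthetic performance data, and the random-forest meta-learner are illustrative stand-ins, not the actual components of AutoForecast.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

def meta_features(series):
    """Simple hand-crafted meta-features: mean, std, linear trend, lag-1 autocorrelation."""
    x = np.asarray(series, dtype=float)
    trend = np.polyfit(np.arange(len(x)), x, 1)[0]
    ac1 = np.corrcoef(x[:-1], x[1:])[0, 1]
    return np.array([x.mean(), x.std(), trend, ac1])

rng = np.random.default_rng(0)
model_names = ["naive", "ses", "arima"]  # hypothetical candidate forecasters

# Offline phase: for each historical dataset, record its meta-features and the
# observed error of every candidate model (synthetic numbers stand in for real
# backtest errors here).
train_feats, train_errs = [], []
for _ in range(60):
    s = rng.normal(size=100).cumsum()
    f = meta_features(s)
    errs = [abs(f[2]) + 0.1 * k + rng.normal(scale=0.01) for k in range(3)]
    train_feats.append(f)
    train_errs.append(errs)

# The meta-learner maps meta-features to the per-model error vector.
meta_learner = RandomForestRegressor(n_estimators=50, random_state=0)
meta_learner.fit(np.array(train_feats), np.array(train_errs))

# Online phase: select a model for an unseen series without training any of them.
new_series = rng.normal(size=100).cumsum()
pred_errs = meta_learner.predict(meta_features(new_series).reshape(1, -1))[0]
best = model_names[int(np.argmin(pred_errs))]
print(best)
```

The inference cost is a single feature extraction plus one regressor prediction, which is what makes this family of approaches far cheaper than training and evaluating every candidate model on the new data.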



Published In

ACM Transactions on Knowledge Discovery from Data (Just Accepted)
EISSN: 1556-472X
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Online AM: 24 January 2025
Accepted: 23 December 2024
Revised: 30 July 2024
Received: 21 November 2022


Author Tags

  1. Time-series forecasting
  2. Model selection
  3. AutoML
  4. Meta-learning
  5. Inference

Qualifiers

  • Research-article

