Short-Term Load Forecasting Using Encoder-Decoder WaveNet: Application to the French Grid
Abstract
1. Introduction
- There is no "leakage" from the future to the past: causal convolutions only ever see the current and earlier time steps.
- Employing dilated convolutions enables an exponentially large receptive field, so long input sequences can be processed with few layers.
- Thanks to skip connections, the proposed approach avoids degradation problems (exploding and vanishing gradients) as network depth increases.
- Parallelisation on GPU is also possible thanks to the use of convolutions instead of recurrent units.
- Potential performance improvements over recurrent baselines (see Section 3); a minimal sketch of one dilated causal block follows this list.
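The sketch below shows one WaveNet-style residual block in TensorFlow/Keras; the function name and exact layer arrangement are illustrative, not the authors' code. It assumes `x` already carries `filters` channels so the residual addition is shape-compatible (project with a 1×1 Conv1D otherwise).

```python
# A minimal sketch of one dilated causal (WaveNet-style) block, assuming
# TensorFlow/Keras; illustrative, not the authors' exact implementation.
from tensorflow.keras import layers

def wavenet_block(x, filters=32, kernel_size=2, dilation_rate=1):
    # Causal padding: each output step sees only current and past inputs,
    # so there is no leakage from the future to the past.
    tanh_out = layers.Conv1D(filters, kernel_size, padding="causal",
                             dilation_rate=dilation_rate,
                             activation="tanh")(x)
    sigm_out = layers.Conv1D(filters, kernel_size, padding="causal",
                             dilation_rate=dilation_rate,
                             activation="sigmoid")(x)
    gated = layers.Multiply()([tanh_out, sigm_out])  # gated activation unit
    skip = layers.Conv1D(filters, 1)(gated)          # 1x1 conv for the skip path
    residual = layers.Add()([x, skip])               # residual eases deep training
    return residual, skip
```

Doubling the dilation rate per block (1, 2, 4, ...) doubles the receptive field with each layer, which is what makes long input histories tractable with a shallow convolutional stack.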
2. Materials and Methods
2.1. Proposed Approach: Encoder-Decoder Approach Using WaveNets
2.2. Data Analysis
2.3. Data Preparation
- Data division 1:
  - Training set: from 2017-01-01 00:00:00 to 2017-12-31 23:00:00.
  - Validation set: from 2018-01-01 00:00:00 to 2018-01-28 07:00:00.
  - Testing set: from 2018-01-28 08:00:00 to 2018-02-24 15:00:00.
- Data division 2:
  - Training set: from 2017-01-01 00:00:00 to 2018-07-02 11:00:00.
  - Validation set: from 2018-07-02 12:00:00 to 2018-07-29 19:00:00.
  - Testing set: from 2018-07-29 20:00:00 to 2018-08-26 03:00:00.
- Data division 3:
  - Training set: from 2017-01-01 00:00:00 to 2019-11-06 07:00:00.
  - Validation set: from 2019-11-06 08:00:00 to 2019-12-03 15:00:00.
  - Testing set: from 2019-12-03 16:00:00 to 2019-12-31 23:00:00. (A pandas slicing sketch of these splits follows this list.)
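Below is a minimal pandas sketch of these chronological splits, assuming an hourly DataFrame `df` indexed by timestamp (e.g. French grid load downloaded from RTE); all names are illustrative.

```python
# A minimal sketch of the three data divisions; `df` is assumed to be an
# hourly pandas DataFrame with a DatetimeIndex. Names are illustrative.
import pandas as pd

DIVISIONS = {
    1: (("2017-01-01 00:00", "2017-12-31 23:00"),
        ("2018-01-01 00:00", "2018-01-28 07:00"),
        ("2018-01-28 08:00", "2018-02-24 15:00")),
    2: (("2017-01-01 00:00", "2018-07-02 11:00"),
        ("2018-07-02 12:00", "2018-07-29 19:00"),
        ("2018-07-29 20:00", "2018-08-26 03:00")),
    3: (("2017-01-01 00:00", "2019-11-06 07:00"),
        ("2019-11-06 08:00", "2019-12-03 15:00"),
        ("2019-12-03 16:00", "2019-12-31 23:00")),
}

def split(df: pd.DataFrame, division: int):
    (t0, t1), (v0, v1), (s0, s1) = DIVISIONS[division]
    # .loc slicing on a DatetimeIndex is inclusive at both ends,
    # matching the boundaries listed above.
    return df.loc[t0:t1], df.loc[v0:v1], df.loc[s0:s1]

train_df, val_df, test_df = split(df, division=1)
```

Because the splits are strictly chronological, any scaler fitted on the training slice alone keeps the evaluation leakage-free.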
2.4. Deep Learning Architecture
Design of the Architecture
- Input length: 168 h
- Learning rate: 0.001
- Epochs: 1000
- Batch size: 32
- Optimiser: Adam
- Dilation Rates: [1,2,4,8,16]
- Kernel size: 2
- Number of filters: 32
- Fully connected neurons: 32 (a model-construction sketch follows this list)
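Putting the selected values together, the following Keras sketch wires them into a WaveNet-style forecaster. The 24-hour output horizon and the dense decoder head are assumptions made for illustration; the paper's exact encoder-decoder wiring may differ.

```python
# A minimal sketch using the selected hyperparameters; HORIZON = 24 is an
# assumed forecast length, not a value taken from this section.
from tensorflow.keras import layers, Model

INPUT_LEN, N_FEATURES, HORIZON = 168, 1, 24

def build_model() -> Model:
    inputs = layers.Input(shape=(INPUT_LEN, N_FEATURES))
    x = layers.Conv1D(32, 1)(inputs)                  # project to 32 channels
    skips = []
    for rate in [1, 2, 4, 8, 16]:                     # dilation rates above
        tanh = layers.Conv1D(32, 2, padding="causal",
                             dilation_rate=rate, activation="tanh")(x)
        sigm = layers.Conv1D(32, 2, padding="causal",
                             dilation_rate=rate, activation="sigmoid")(x)
        gated = layers.Multiply()([tanh, sigm])       # gated activation
        skip = layers.Conv1D(32, 1)(gated)
        skips.append(skip)
        x = layers.Add()([x, skip])                   # residual connection
    s = layers.Activation("relu")(layers.Add()(skips))  # sum the skip paths
    s = layers.GlobalAveragePooling1D()(s)            # collapse the time axis
    s = layers.Dense(32, activation="relu")(s)        # 32 fully connected neurons
    outputs = layers.Dense(HORIZON)(s)                # one value per forecast hour
    return Model(inputs, outputs)
```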
2.5. Training
- β1 parameter: 0.9. The exponential decay rate for the first-moment estimates (the momentum term).
- β2 parameter: 0.99. The exponential decay rate for the second-moment estimates (the RMSProp-like scaling term).
- Loss function: Mean Squared Error (MSE). (A compile-and-fit sketch follows this list.)
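A minimal compile-and-fit sketch of this configuration, reusing the `build_model` helper from the previous sketch; the array names (`x_train`, `y_train`, ...) are illustrative.

```python
# A minimal sketch of the training setup described above.
import tensorflow as tf

model = build_model()
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001,
                                       beta_1=0.9,    # first-moment decay
                                       beta_2=0.99),  # second-moment decay
    loss="mse",                                       # Mean Squared Error
)
model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=1000, batch_size=32)
```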
3. Results
4. Discussion
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Hyperparameter search space (candidate values per column):

| Input Length | Learning Rate | Epochs | Batch Size | Optimiser | Dilation Rates | Kernel Sizes | Filters | Fully Connected Layer’s Neurons |
|---|---|---|---|---|---|---|---|---|
| 168 | 0.01 | 100 | 8 | RMSprop | [1,2,4,8] | 2 | 12 | 16 |
| 256 | 0.001 | 150 | 16 | Adam | [1,2,4,8,16] | 4 | 32 | 32 |
| | 0.0001 | 200 | 32 | | [1,2,4,8,16,32,64,64,128] | | 64 | 64 |
| | | 1000 | | | [1,2,4,8,16,32,64,64,128,128,256] | | 128 | |
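The table above lists the candidate values explored per hyperparameter. A minimal sweep sketch follows, under the assumption of an exhaustive grid (whether the full Cartesian product was evaluated is not stated); `train_and_evaluate` is a hypothetical helper that trains one configuration and returns its validation error.

```python
# A minimal grid-sweep sketch over the candidate values in the table above.
# train_and_evaluate() is hypothetical: build, train, and score one config.
from itertools import product

SEARCH_SPACE = {
    "input_length":  [168, 256],
    "learning_rate": [0.01, 0.001, 0.0001],
    "epochs":        [100, 150, 200, 1000],
    "batch_size":    [8, 16, 32],
    "optimiser":     ["RMSprop", "Adam"],
    "dilation_rates": [
        [1, 2, 4, 8],
        [1, 2, 4, 8, 16],
        [1, 2, 4, 8, 16, 32, 64, 64, 128],
        [1, 2, 4, 8, 16, 32, 64, 64, 128, 128, 256],
    ],
    "kernel_size": [2, 4],
    "filters":     [12, 32, 64, 128],
    "fc_neurons":  [16, 32, 64],
}

best_config, best_err = None, float("inf")
for values in product(*SEARCH_SPACE.values()):
    config = dict(zip(SEARCH_SPACE, values))
    err = train_and_evaluate(config)   # hypothetical helper
    if err < best_err:
        best_config, best_err = config, err
```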
Performance of the proposed approach per data division:

| Data Division | Set | MSE (MW²) | RMSE (MW) | MAE (MW) | MAPE (%) | MBE (MW) | MBPE (%) |
|---|---|---|---|---|---|---|---|
| 1 | Training Set | 3.815 × 10⁶ | 1953.212 | 1366.707 | 2.488 | −558.947 | −1.114 |
| 1 | Validation Set | 7.432 × 10⁶ | 2726.292 | 1979.027 | 3.096 | −907.611 | −1.511 |
| 1 | Testing Set | 7.743 × 10⁶ | 2782.677 | 2119.104 | 2.953 | 79.439 | 0.004 |
| 2 | Training Set | 2.972 × 10⁶ | 1724.011 | 1225.774 | 2.167 | 327.089 | 0.573 |
| 2 | Validation Set | 2.487 × 10⁶ | 1577.155 | 1084.796 | 2.338 | 477.346 | 0.913 |
| 2 | Testing Set | 2.685 × 10⁶ | 1638.709 | 1117.471 | 2.626 | 18.018 | 0.061 |
| 3 | Training Set | 3.560 × 10⁶ | 1889.346 | 1352.111 | 2.477 | 395.462 | 0.656 |
| 3 | Validation Set | 6.717 × 10⁶ | 2591.814 | 1743.703 | 2.785 | 129.694 | 0.122 |
| 3 | Testing Set | 6.666 × 10⁶ | 2581.991 | 1889.944 | 3.064 | 185.531 | 0.219 |
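For reference, the metrics in these tables can be computed as below, assuming NumPy arrays of actual and predicted hourly load in MW; the sign convention for MBE/MBPE (prediction minus actual) is an assumption.

```python
# A minimal sketch of the reported error metrics.
import numpy as np

def load_metrics(y_true: np.ndarray, y_pred: np.ndarray) -> dict:
    err = y_pred - y_true                  # sign convention assumed
    return {
        "MSE (MW^2)": float(np.mean(err ** 2)),
        "RMSE (MW)":  float(np.sqrt(np.mean(err ** 2))),
        "MAE (MW)":   float(np.mean(np.abs(err))),
        "MAPE (%)":   float(100 * np.mean(np.abs(err) / np.abs(y_true))),
        "MBE (MW)":   float(np.mean(err)),                 # mean bias error
        "MBPE (%)":   float(100 * np.mean(err / y_true)),  # mean bias percentage error
    }
```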
Comparison with benchmark models on each data division’s test set:

| Data Division | Model | MSE (MW²) | RMSE (MW) | MAE (MW) | MAPE (%) | MBE (MW) | MBPE (%) |
|---|---|---|---|---|---|---|---|
| 1 | Proposed Approach | 7.743 × 10⁶ | 2782.677 | 2119.104 | 2.953 | 79.439 | 0.004 |
| 1 | ARIMA | 2.409 × 10⁸ | 15,524.138 | 13,701.994 | 18.170 | 13,564.961 | 17.9248 |
| 1 | MLP | 1.050 × 10⁷ | 3240.606 | 2542.002 | 3.517 | 258.806 | 0.3205 |
| 1 | CausalConv1D | 9.700 × 10⁶ | 3115.109 | 2336.889 | 3.230 | 413.027 | 0.5484 |
| 1 | ConvLSTM | 9.006 × 10⁶ | 3001.072 | 2321.472 | 3.211 | 283.711 | 0.356 |
| 1 | LSTM | 4.428 × 10⁷ | 6654.573 | 5439.153 | 7.614 | 237.031 | −0.235 |
| 1 | GRU | 4.476 × 10⁷ | 6690.945 | 5498.788 | 7.649 | 935.558 | 0.603 |
| 1 | StackedLSTM | 4.414 × 10⁷ | 6644.195 | 5317.568 | 7.416 | 617.804 | 0.326 |
| 1 | StackedGRU | 1.976 × 10⁷ | 4445.943 | 3513.181 | 4.859 | 779.914 | 0.842 |
| 2 | Proposed Approach | 2.685 × 10⁶ | 1638.709 | 1117.471 | 2.626 | 18.018 | 0.061 |
| 2 | ARIMA | 3.584 × 10⁷ | 5987.222 | 4974.551 | 12.352 | −1745.694 | −5.961 |
| 2 | MLP | 3.294 × 10⁶ | 1815.062 | 1402.886 | 3.296 | −353.854 | −1.031 |
| 2 | CausalConv1D | 2.809 × 10⁶ | 1676.236 | 1139.331 | 2.644 | −147.862 | −0.465 |
| 2 | ConvLSTM | 2.814 × 10⁶ | 1677.679 | 1233.298 | 2.870 | −246.956 | −0.707 |
| 2 | LSTM | 3.103 × 10⁷ | 5570.446 | 4585.232 | 11.243 | −956.642 | −3.908 |
| 2 | GRU | 2.793 × 10⁷ | 5284.928 | 4316.504 | 10.554 | −726.990 | −3.297 |
| 2 | StackedLSTM | 8.562 × 10⁶ | 2926.257 | 2328.889 | 5.487 | −43.796 | −0.556 |
| 2 | StackedGRU | 2.967 × 10⁷ | 5447.697 | 1775.300 | 4.183 | −524.327 | −1.480 |
| 3 | Proposed Approach | 6.666 × 10⁶ | 2581.991 | 1889.944 | 3.064 | 185.531 | 0.219 |
| 3 | ARIMA | 9.082 × 10⁷ | 9530.107 | 7598.771 | 11.525 | 5884.41 | 8.136 |
| 3 | MLP | 8.566 × 10⁶ | 2926.846 | 2211.715 | 3.673 | −659.786 | −1.311 |
| 3 | CausalConv1D | 7.472 × 10⁶ | 2733.633 | 1895.930 | 3.076 | −637.183 | −1.167 |
| 3 | ConvLSTM | 6.880 × 10⁶ | 2623.061 | 1894.860 | 3.129 | −206.603 | −0.437 |
| 3 | LSTM | 2.458 × 10⁷ | 4958.222 | 3883.119 | 6.366 | −442.154 | −1.161 |
| 3 | GRU | 3.706 × 10⁷ | 6087.658 | 4904.714 | 8.113 | −811.686 | −2.036 |
| 3 | StackedLSTM | 3.415 × 10⁷ | 5843.829 | 4719.721 | 7.760 | −176.051 | −1.074 |
| 3 | StackedGRU | 9.504 × 10⁶ | 3082.936 | 2346.052 | 3.862 | −357.712 | −0.769 |
Error-percentage distribution on each test set:

| Test Set | Mean (%) | Median (%) | 5th Percentile (%) | 25th Percentile (%) | 75th Percentile (%) | 95th Percentile (%) |
|---|---|---|---|---|---|---|
| 1 | 2.966 | 2.642 | 1.328 | 1.971 | 3.732 | 5.616 |
| 2 | 2.997 | 2.800 | 1.438 | 2.017 | 3.646 | 5.588 |
| 3 | 3.239 | 2.848 | 1.470 | 2.181 | 3.910 | 6.474 |
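Assuming this table summarises the distribution of absolute percentage errors over each test set, its statistics can be reproduced as follows (array names illustrative).

```python
# A minimal sketch of the percentile summary above.
import numpy as np

ape = 100 * np.abs(y_pred - y_true) / np.abs(y_true)   # absolute percentage error
summary = {
    "Mean (%)":   float(ape.mean()),
    "Median (%)": float(np.median(ape)),
    **{f"{q}th Percentile (%)": float(np.percentile(ape, q))
       for q in (5, 25, 75, 95)},
}
```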