
The Electricity Journal 34 (2021) 106884


Machine-Learning based methods in short-term load forecasting⋆


Weilin Guo a, Liang Che a,*, Mohammad Shahidehpour b, Xin Wan c

a College of Electrical and Information Engineering, Hunan University, Changsha, Hunan 410082, China
b Robert W. Galvin Center for Electricity Innovation, Illinois Institute of Technology, Chicago, IL 60616, USA
c Dadu River Hydro Power Development Company of China Guodian Corporation, Chengdu, Sichuan 610041, China

Keywords: Short-term load forecasting; Machine learning; Support vector machine; Long short-term memory

A B S T R A C T

Short-term load forecasting is of great significance to the secure and efficient operation of power systems. However, loads can be affected by a variety of external impact factors and thus involve high levels of uncertainties, so it is a challenging task to achieve an accurate load forecast. This paper discusses three commonly-used machine-learning methods for load forecasting, i.e., the support vector machine method, the random forest regression method, and the long short-term memory neural network method. The features and applications of these methods are analyzed and compared. By integrating the advantages of these methods, a fusion forecasting approach and a data preprocessing technique are proposed for improving the forecasting accuracy. A comparative study based on real load data is performed to verify that the proposed approach is capable of achieving a relatively higher forecasting accuracy.

1. Introduction

Nowadays, power systems incorporate an increasing amount of renewable energy (e.g., solar and wind power). The fast growth in the penetration of renewable energy has the potential to enhance the energy efficiency and economics of power systems. However, the high penetration of renewable energy also introduces additional challenges in power system scheduling and dispatch. Short-term load forecasting (STLF), which is the basis for an optimal and secure dispatch, plays a critical role in ensuring the secure and economic operation of power systems. Therefore, research efforts in recent years have been focusing on improving the accuracy of STLF. Past research has applied a variety of traditional methods, e.g., time-series analysis and regression analysis, but these methods typically have limitations, e.g., low accuracy, weak capability of integrating impact factors such as weather data, and low sensitivity to the input data.

Recently, with the fast development of computer science and artificial intelligence, machine-learning technologies have been introduced into STLF. These include the artificial neural network (ANN), the support vector machine (SVM), the random forest (RF), and the long short-term memory (LSTM) network. Chen, the winner of the EUNITE load forecasting competition in 2001, was the first to apply the SVM method to load forecasting (Chen et al., 2004). Since then, researchers have proposed various SVM-based forecasting methods to achieve more accurate forecast results (Jiang et al., 2017; Hou et al., 2018; Chen et al., 2017; Barman et al., 2018; Lu et al., 2019; Chen et al., 2019; Hoori et al., 2020). Jiang proposed a support vector regression (SVR)-based predictor and a hybrid parameter optimization based forecasting method to provide a high-precision and high-resolution STLF (Jiang et al., 2017). Barman proposed a regional hybrid short-term load forecasting model by considering regional climate conditions and using an SVR model and the grasshopper optimization algorithm (Barman et al., 2018); the results show that the proposed model has better accuracy than the traditional STLF model, which uses temperature as the only climate factor. The least-squares SVM, an extension of the standard SVM, uses nonlinear mapping to transform the inequality-constrained optimization problem in the original space into a linear system with equality constraints in the feature space, improving the convergence speed and accuracy. In 2001, Breiman was the first to introduce RF, a supervised learning algorithm based on bagging ensemble learning and random attribute subspace theory (Breiman, 2001).

W. Guo and L. Che are with the College of Electrical and Information Engineering at Hunan University, Changsha, Hunan 410082, China. M. Shahidehpour is
with the Galvin Center for Electricity Innovation at Illinois Institute of Technology, Chicago, IL 60616, USA. Xin Wan is with the Dadu River Hydro Power
Development Company of China Guodian Corporation, Chengdu, Sichuan 610041, China.
* Corresponding author.
E-mail address: cheliang1213@163.com (L. Che).

https://doi.org/10.1016/j.tej.2020.106884

Available online 10 December 2020


1040-6190/© 2020 Elsevier Inc. All rights reserved.

Later, RF algorithms have been adapted and applied to load forecasting problems, e.g., an RF integrating multiple decision trees to form a classifier, which has the advantages of strong generalization ability, fewer parameters and high prediction accuracy (Wu et al., 2015; Lahouar and Ben, 2015; Jurado et al., 2015; Uriarte et al., 2016; Li et al., 2019). Wu improved the RF method by using gray projection to propose a splitting algorithm (Wu et al., 2015); the experimental results showed that the proposed algorithm has higher prediction accuracy and robustness than the original algorithms. Kong proposed a forecasting method based on a deep belief network while considering the impacts of temperature and humidity; the proposed method was demonstrated to have improved prediction accuracy, especially in cases with large training samples and complex impact factors (Kong et al., 2018). LSTM, first proposed by Hochreiter in 1997, has been widely used in NLP applications such as text, speech, handwriting recognition, and machine translation (Hochreiter and Schmidhuber, 1997). LSTM has also been applied to load forecasting. Kong proposed an LSTM-based load forecasting framework which was tested on a publicly available set of real residential smart-meter data (Kong et al., 2019); it showed that the proposed method has higher accuracy compared to benchmarks in the field of load forecasting.

In this paper, we review three commonly-used machine learning methods (SVM, RF and LSTM) and compare their advantages and disadvantages and their applications in STLF. Then, considering the fact that a single model may not achieve a satisfying accuracy level in STLF, we propose a fusion forecasting approach together with a data preprocessing method. The proposed fusion approach integrates the merits of the aforementioned three machine-learning methods for enhancing the forecasting accuracy.

The rest of this paper is arranged as follows. Section 2 introduces data preprocessing methods. Section 3 discusses the advantages and disadvantages of SVM, RF and LSTM. Section 4 proposes a fusion prediction algorithm and evaluation indices for the accuracy of prediction results based on the analysis of Sections 2 and 3. Section 5 verifies the performance of the proposed method based on simulations on real load data. Finally, Section 6 concludes this work and discusses future research directions.

2. Data preprocessing

Due to the complex power consumption behaviors of users on the demand side, the telemetered load data may include abnormal data, which can be generally divided into bad data and distorted data. Bad data is mainly caused by the failure of meters, while distorted load data refers to a sudden drop of load typically resulting from a change of a large industrial load in the grid. To deal with the problems caused by abnormal data, it is necessary to use proper methods to detect the abnormal data in the telemetered raw load data, and then process the identified abnormal data based on the actual situation. In this section, we review and compare the commonly-used abnormal data processing methods and then discuss the data preprocessing approach that will be used in this paper.

2.1. Traditional data preprocessing methods

Presently, the commonly-used traditional methods for abnormal data detection include the data horizontal comparison method and the time-window mean-standard-deviation comparison method.

In general, the load data tends to be relatively stable within a short period of time, and the load variation curves are similar across different time cycles. Based on this feature, the data horizontal comparison method compares the load at adjacent moments: the load data at a certain time instant is considered abnormal if its deviation exceeds a given threshold. This method is easy to implement. However, it is difficult to set up a threshold for abnormal data identification, and abnormal data at the previous or next time instant may cause a misjudgment of the data at the current instant. To overcome this problem, the time-window mean-standard-deviation comparison method comprehensively processes the data over a broader time horizon. It avoids the burden of preselecting the detection threshold and improves the overall detection performance. After the abnormal data is detected, this method makes a further correction that first sets the abnormal data to zero and then processes the missing data using suitable data filling methods.

The commonly-used data correction and filling methods are briefly introduced as follows (a code sketch of the detection-and-filling workflow is given at the end of this subsection):

(1) Longitudinal filling method: This method uses correlation analysis or the degree of correlation to apply a principle of similar-day substitution. If two or more days have similar impact factors regarding electric load, they are referred to as similar days. For a day with missing data, if there is one similar day which has data, the data of this similar day is used to fill the missing data; if there are more than one similar days, the average value of the data in these days is used for the filling.
(2) Regression prediction filling method: In this method, the missing data is replaced by the data predicted by performing regression on historical data whose time is prior to that of the missing data. The commonly-used methods of this type include parametric regression, non-parametric regression and linear regression.
(3) Normal interval value filling method: This method first obtains a normal range of the available load data and then uses the average or median value of this range to replace the missing data.

Moreover, if the number of missing data samples is much smaller than the number of total samples and at the same time the dependence between the data points is significant, then it would be difficult to fill and correct the missing data. This may affect the accuracy of short-term load forecasting.
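The paper does not give explicit formulas for these traditional methods, so the following is only a minimal sketch of one plausible implementation of the time-window mean-standard-deviation detection followed by similar-day filling. The window length, threshold k, and the simplification that all days at the same 15-min slot are treated as similar days are illustrative assumptions.

```python
import numpy as np

def detect_abnormal(load, window=8, k=3.0):
    """Flag points whose deviation from the mean of a surrounding time window
    exceeds k standard deviations of that window (one reading of the
    time-window mean-standard-deviation comparison method)."""
    load = np.asarray(load, dtype=float)
    flags = np.zeros(load.size, dtype=bool)
    for i in range(load.size):
        lo, hi = max(0, i - window), min(load.size, i + window + 1)
        neighbors = np.delete(load[lo:hi], i - lo)   # window around i, excluding i
        mu, sigma = neighbors.mean(), neighbors.std()
        if sigma > 0.0 and abs(load[i] - mu) > k * sigma:
            flags[i] = True
    return flags

def fill_by_similar_days(load, flags, points_per_day=96):
    """Longitudinal (similar-day) filling: a flagged point is first treated as
    missing and then replaced by the average of the same 15-min slot on the
    other days (all days are treated as 'similar days' for simplicity)."""
    filled = np.asarray(load, dtype=float).copy()
    filled[flags] = np.nan
    days = filled.reshape(-1, points_per_day)        # requires len(load) % 96 == 0
    slot_mean = np.nanmean(days, axis=0)             # per-slot average over days
    rows, cols = np.where(np.isnan(days))
    days[rows, cols] = slot_mean[cols]
    return days.reshape(-1)
```

In practice the detected flags would be reviewed against the actual situation before filling, as noted above.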
2.2. Intelligent data preprocessing methods

In recent years, with the fast development of artificial intelligence technology, researchers have adopted machine-learning algorithms in data preprocessing, e.g., the self-organizing map (SOM) neural network (Diaz et al., 2019), the empirical mode decomposition (EMD) (Huang et al., 1999), the ensemble empirical mode decomposition (EEMD) (Hong et al., 2012), the variational mode decomposition (VMD) (Ali et al., 2018), the set pair analysis, and the isolation forest method. This paper mainly focuses on the EEMD data preprocessing method, which is an improvement to the EMD method.

EMD, originally proposed by Huang in 1998, decomposes the data into a number of intrinsic mode function components in terms of their intrinsic characteristic scales (Huang et al., 1999). EMD allows the load data to be well preprocessed and thus improves the accuracy of load forecasting. However, research efforts in recent years have revealed some limitations of the EMD method, e.g., mode aliasing, over-envelope, and under-envelope. To deal with these limitations, Hong proposed an improved decomposition method, namely EEMD, for data preprocessing (Hong et al., 2012). Compared with the EMD, the EEMD introduces two important parameters: the Gaussian white noise amplitude and the primary pitch frequency. On the one hand, the Gaussian white noise has continuity across different scales and frequencies, which can be effectively utilized to suppress the aforementioned mode aliasing. On the other hand, the introduced primary pitch frequency can prevent the decomposition from falling into over-envelope or under-envelope. Therefore, EEMD is deemed a more adaptable data preprocessing method and can be used to improve the accuracy of load forecasting.
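As a concrete illustration, a load series can be decomposed with an off-the-shelf EEMD implementation. The sketch below assumes the third-party PyEMD package (installed via `pip install EMD-signal`); the keyword names follow the PyEMD documentation as recalled and should be verified against the installed version, and reading the last component as the trend term is an assumption.

```python
import numpy as np
from PyEMD import EEMD   # third-party package, `pip install EMD-signal`

def decompose_load(load, trials=100, noise_width=0.2):
    """Decompose a 1-D load series with EEMD into IMF components.

    trials      : size of the white-noise ensemble.
    noise_width : relative amplitude of the added Gaussian white noise.
    """
    eemd = EEMD(trials=trials, noise_width=noise_width)
    imfs = eemd.eemd(np.asarray(load, dtype=float))
    return imfs[:-1], imfs[-1]   # last component read here as the trend term R
```

The resulting IMFs (high- and low-frequency components) and the trend term are what the fusion model of Section 4 takes as its preprocessed inputs.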

2
W. Guo et al. The Electricity Journal 34 (2021) 106884

3. Common machine learning algorithms

Currently, a variety of forecasting algorithms have been used in the research of power load forecasting, and they can be roughly divided into three categories (Xiao et al., 2013): classic forecasting methods, traditional forecasting methods and modern forecasting methods. The first category (classic forecasting methods) mainly includes the time-series based method, the regression-analysis method, and the exponential smoothing method. The second category, traditional forecasting, mainly contains the trend extrapolation method, the similar-day method and the gray forecast method. The last category, the modern forecasting methods, which are our focus in this paper, mainly includes the expert system method, the neural network method and the SVM-based method. The remainder of this section introduces these commonly-used machine learning algorithms in detail.

3.1. Short-term load forecasting based on SVM

3.1.1. Algorithm principle
SVM is a machine-learning method developed from the structural risk minimization principle and statistical learning theory. On the one hand, SVM uses the structural risk minimization principle instead of the empirical risk minimization principle to minimize the training error, so as to avoid falling into a local optimum like the artificial neural network (ANN). On the other hand, SVM is a feature model that can nonlinearly map the training data from a low-dimensional plane to a high-dimensional space. Based on these characteristics, SVM can effectively overcome the shortcomings of traditional prediction models (e.g., weak classification ability and over-fitting), and thus is widely used in load forecasting, data mining and other fields.

The basic principle of SVM is to find a pair of separating hyperplanes that separate the samples, as illustrated in Fig. 1(a). In this figure, the distance of a sample point to the nearer separating hyperplane is defined as the margin. The set of separating hyperplanes is said to be optimal when it maximizes the smallest margin over all the sample points; in this case, the SVM classification problem is transformed into a regression problem, as depicted in Fig. 1(b).

Fig. 1. Principles of SVM.

In the practice of load forecasting, the input data samples are often located in a space that is not linearly separable, which renders the data preprocessing difficult. To tackle this problem, the concept of the kernel function is introduced to SVM to map the input data to a high-dimensional space in which the data can be easily classified. In this way, it is easier to obtain different hyperplane surfaces by modifying the kernel function. In essence, by using EEMD for preprocessing and using an embedded kernel function, an SVM problem is equivalent to a quadratic programming problem, and therefore is provable to have a unique solution.
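As a concrete illustration of kernel-based SVM forecasting, the following is a minimal sketch using scikit-learn's SVR with an RBF kernel. The lagged-load feature construction and the hyperparameter values (C, epsilon, lag length) are illustrative assumptions, not the authors' exact configuration.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.preprocessing import StandardScaler

def make_lagged_features(load, n_lags=96):
    """Build (X, y) pairs: each target is predicted from the previous n_lags
    samples (one day of 15-min points). The lag length is an illustrative choice."""
    load = np.asarray(load, dtype=float)
    X = np.array([load[i - n_lags:i] for i in range(n_lags, load.size)])
    return X, load[n_lags:]

def fit_svr_forecaster(load, n_lags=96):
    """Fit an RBF-kernel support vector regressor on lagged load features."""
    X, y = make_lagged_features(load, n_lags)
    xs, ys = StandardScaler(), StandardScaler()
    model = SVR(kernel="rbf", C=10.0, epsilon=0.01)   # placeholder hyperparameters
    model.fit(xs.fit_transform(X), ys.fit_transform(y.reshape(-1, 1)).ravel())
    return model, xs, ys

def predict_next(model, xs, ys, recent_window):
    """One-step-ahead forecast from the most recent n_lags load values."""
    z = xs.transform(np.asarray(recent_window, dtype=float).reshape(1, -1))
    return ys.inverse_transform(model.predict(z).reshape(-1, 1))[0, 0]
```

Swapping the kernel argument (e.g., "linear" or "poly") corresponds to the choice of hyperplane surface discussed above.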
3.2. Short-term load forecasting based on random forest

3.2.1. Algorithm principle
Another important machine-learning algorithm is the RF, which was first proposed by Breiman based on the theory of the classification and regression tree (Breiman, 2001). Essentially, RF combines multiple decision trees to make predictions and uses voting or averaging to classify the input data. In the process of classifying, the RF can sort the attributes of the input data according to their importance with regard to the forecasting results. This can help the feature selection and the construction of effective classifiers. Consequently, the RF can realize dimensionality reduction of the attributes of the input data by eliminating the lower-ranked attributes. On the other hand, a higher-ranked attribute typically means a stronger correlation between the attribute and the forecasting result, so such an attribute should be retained in the classification.

In the process of screening influencing factors, the RF first generates a large number of trees from the data set, in which only a small part of the attributes is trained for each tree, and then performs the statistical analysis. The RF is implemented based on an algorithm with the following steps (a code sketch is given after this list):

(1) From a training set of size M, perform K rounds of sampling with replacement (the bootstrap method) to generate K training sets, where each sample is randomly selected from the M samples;
(2) Train the K training sets to generate K decision trees;
(3) For a single decision tree, randomly select n features (n < N, where N is the total number of features), split the decision tree according to the information gain, the Gini coefficient or other indicators, and select the split with the strongest classification capability for each feature;
(4) Each decision tree grows to its maximum without any adjustment;
(5) The multiple decision trees generated in this way form an RF. For classification problems, the classification result depends on the votes of the weak classifiers.
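The bagging-and-voting procedure above corresponds closely to scikit-learn's random forest implementation; the sketch below is a minimal illustration for the regression (load forecasting) case. The inputs X, y are assumed to be lagged-load features such as those built with the hypothetical make_lagged_features helper from the SVR sketch, and the hyperparameters are illustrative.

```python
from sklearn.ensemble import RandomForestRegressor

def fit_rf_forecaster(X, y, n_trees=200):
    """Fit a random forest regressor on lagged-load features X with targets y."""
    rf = RandomForestRegressor(
        n_estimators=n_trees,     # K bootstrap-trained trees, steps (1)-(2)
        max_features="sqrt",      # n < N features considered at each split, step (3)
        n_jobs=-1,
        random_state=0,
    )
    rf.fit(X, y)
    # Attribute ranking: low-ranked attributes can be eliminated, as discussed above.
    ranking = sorted(enumerate(rf.feature_importances_), key=lambda t: -t[1])
    return rf, ranking
```

For regression the forest averages the tree outputs rather than voting, which is the load-forecasting counterpart of step (5).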


3.3. Short-term load forecasting based on LSTM

3.3.1. Algorithm principle
The RNN was first proposed in 1989 (Ronald and David, 1989). It is a type of neural network capable of processing sequential data and performing tasks such as part-of-speech tagging and named entity recognition. The typical structure diagram and expansion diagram of the RNN are shown in Fig. 2. As shown in the figure, the RNN is divided into three layers, namely the input layer, the hidden layer and the output layer. The input of the hidden layer includes the output of the hidden layer at the previous time step and the output of the input layer.

Fig. 2. RNN structure diagram and RNN expansion diagram.

Compared with deep neural networks and convolutional neural networks, the nodes of different cells in the same hidden layer of an RNN are interconnected. In addition, an important benefit of the RNN is its ability to use contextual information when mapping between input and output sequences. Unfortunately, for standard RNN architectures, the range of context that can be practically accessed is quite limited, i.e., part of the data sequence may be lost, which may cause a degradation of accuracy. To address this issue, a variant of the RNN, called LSTM (Hochreiter and Schmidhuber, 1997), was proposed, which uses multiplicative gates to allow its memory cells to store and access information over long periods of time, thereby mitigating the vanishing gradient problem.

Fig. 3 depicts the basic structure of an LSTM unit. As shown in this figure, an LSTM storage unit is formed by integrating the input, output and forget layers. Compared with the traditional RNN, the LSTM's forget layer is not an ordinary neural unit but has a distinct memory mode. Each LSTM unit has a cell whose state at time instant t is denoted by c_t; this can be viewed as the memory unit of the LSTM. A memory cell consists of four elements, namely an input gate, a neuron with a self-recurrent connection (connected to itself), a forget gate, and an output gate. Specifically, an LSTM unit operates according to the following procedure (see Eq. (1) and the code sketch after it):

Fig. 3. Basic structure of LSTM unit.

(1) At each time instant, the LSTM unit receives the current input (x_t) and the hidden state at the previous time instant (h_{t-1}) through its three gates;
(2) Each gate receives an internal information input and the memory unit's state (c_{t-1});
(3) These gates, upon receiving the input information, handle the inputs from the respective information sources (e.g., h_{t-1}, c_{t-1}), while the gates' logic functions determine their activeness;
(4) Once the input information is processed by the nonlinear function at the input gate, the state of the memory cell associated with the forget gate is superimposed to form a new memory cell state c_t;
(5) Finally, the memory cell state (c_t) forms the LSTM unit's output (h_t) based on the nonlinear function and the dynamic control of the output gate.

The variables mentioned above interact with each other based on the following model:

$$
\begin{cases}
f_t = \sigma_g\left(W_f x_t + U_f h_{t-1} + b_f\right) \\
i_t = \sigma_g\left(W_i x_t + U_i h_{t-1} + b_i\right) \\
o_t = \sigma_g\left(W_o x_t + U_o h_{t-1} + b_o\right) \\
c_t = f_t \odot c_{t-1} + i_t \odot \sigma_c\left(W_c x_t + U_c h_{t-1} + b_c\right) \\
h_t = o_t \odot \sigma_c\left(c_t\right)
\end{cases}
\tag{1}
$$

where x_t denotes the input to the memory cell layer at time instant t; W_f, W_i, W_o, W_c, U_f, U_i, U_o and U_c are the weight matrices; b_f, b_i, b_o and b_c are bias vectors; sigma_g and sigma_c denote the gate and state activation functions; and the dot product is taken element-wise.
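To make Eq. (1) concrete, the following is a minimal NumPy sketch of one forward step of a single LSTM cell. The paper does not state its activation functions explicitly, so the common choice of a logistic sigmoid for the gate activation and tanh for the state activation is assumed here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One forward step of an LSTM cell implementing Eq. (1).

    W, U, b are dicts keyed by 'f', 'i', 'o', 'c' holding the weight matrices
    W_*, U_* and bias vectors b_* of Eq. (1). Assumes sigmoid gates and tanh
    state activation.
    """
    f = sigmoid(W['f'] @ x_t + U['f'] @ h_prev + b['f'])   # forget gate
    i = sigmoid(W['i'] @ x_t + U['i'] @ h_prev + b['i'])   # input gate
    o = sigmoid(W['o'] @ x_t + U['o'] @ h_prev + b['o'])   # output gate
    c = f * c_prev + i * np.tanh(W['c'] @ x_t + U['c'] @ h_prev + b['c'])
    h = o * np.tanh(c)                                     # unit output
    return h, c
```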
3.4. Algorithm advantages and disadvantages

The advantages and disadvantages of the three machine-learning algorithms for load forecasting discussed in this section, SVM, RF and LSTM, are compared in Table 1.

Table 1
Comparison of the advantages and disadvantages of machine-learning algorithms.

SVM
  Advantages: 1. Avoids the local optimum issue; 2. Strong generalization capability; 3. Can process small samples; 4. Robustness.
  Disadvantages: 1. Classification has limitations; 2. Time-consuming; 3. Not sensitive to missing data under large-size samples (Jahangir, 2020).

RF
  Advantages: 1. Avoids overfitting; 2. Can handle high-dimensional, continuous and discrete data; 3. Strong anti-noise ability and small generalization error; 4. Fast training speed and high accuracy.
  Disadvantages: 1. Over-fitting, especially in noisy classification or regression problems; 2. Black-box model with weak interpretability; 3. Time-consuming under large gain.

LSTM
  Advantages: 1. Avoids the local optimum issue; 2. Avoids overfitting; 3. Robustness.
  Disadvantages: 1. Limited memory capacity when applied to long-sequence samples; 2. Time-consuming.

Fig. 4. Model prediction algorithm for frame fusion.


Fig. 5. Data preprocessing results.

4. Fusion algorithm

4.1. Framework fusion model algorithm

In general, the generalization ability of SVM can alleviate the errors caused by a neural network's over-fitting to a certain extent and avoid the insufficient model learning problem caused by the inadequacy of input data. However, the application of SVM has some limitations. For example, SVM is only suitable for calculations with relatively small sample sizes, and its forecasting accuracy may be affected when the model parameters are not well defined or the load involves various types of changes. The capabilities of RF and LSTM can be used to overcome the limitations of the SVM model. Apparently, it is impossible to have a single type of algorithm suitable for all scenarios. Hence, the idea of a fusion method, which makes full use of the advantages of different types of algorithms, emerges naturally. The principle of the fusion is to integrate the advantages of different algorithms to construct a more adaptable prediction method. With such an idea in mind, this paper proposes a load forecasting method based on the fusion model of the SVM, RF and LSTM frameworks, together with a data preprocessing method, which is shown in Fig. 4. As shown in the figure, the input data is first preprocessed by the EEMD algorithm to linearize the variation of the load; then, the fusion method built on SVM, RF and LSTM is used to perform the load forecast. The purpose is to use the proposed fusion method to improve the forecasting accuracy.
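The exact combination rule is given only schematically in Fig. 4, so the snippet below is merely a minimal sketch of one plausible fusion scheme: an error-weighted average of the forecasts produced by the SVM, RF and LSTM models, with weights derived from validation errors. All names and the weighting rule are illustrative assumptions, not the authors' exact method.

```python
import numpy as np

def fuse_forecasts(predictions, val_errors):
    """Combine base-model forecasts with weights inversely proportional to
    their validation errors (one plausible fusion rule).

    predictions : dict of model name -> forecast array (SVM, RF, LSTM outputs).
    val_errors  : dict of model name -> validation error (e.g., MAPE).
    """
    names = list(predictions)
    inv = np.array([1.0 / max(val_errors[m], 1e-9) for m in names])
    weights = inv / inv.sum()
    stacked = np.vstack([predictions[m] for m in names])
    return weights @ stacked   # weighted average across models, per time step

# Example:
# fused = fuse_forecasts({"svm": y_svm, "rf": y_rf, "lstm": y_lstm},
#                        {"svm": 0.234, "rf": 0.037, "lstm": 0.053})
```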
4.2. Statistical metrics

In this paper, a set of indices is used to assess the forecasting accuracy and to comprehensively benchmark the performance of the proposed approach against those of existing approaches. These indices include the root mean square error (RMSE), mean absolute error (MAE) and mean absolute percentage error (MAPE), which are formulated as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - y_p\right)^2} \tag{2}$$

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - y_p\right| \tag{3}$$

$$\mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - y_p}{y_i}\right| \tag{4}$$

where n is the number of samples in the test set, and y_p and y_i are the forecasted and actual short-term loads, respectively.
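For reference, a small helper that transcribes Eqs. (2)-(4) directly into NumPy:

```python
import numpy as np

def forecast_metrics(y_true, y_pred):
    """Return RMSE, MAE and MAPE as defined in Eqs. (2)-(4)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    rmse = np.sqrt(np.mean(err ** 2))
    mae = np.mean(np.abs(err))
    mape = np.mean(np.abs(err / y_true))   # assumes no zero actual loads
    return rmse, mae, mape
```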

5. Illustrative example

5.1. Experimental data and parameter settings

In this section, the effectiveness of the proposed forecasting approach is verified by numerical simulations based on real load data. The data set includes samples at a 15-min interval and a total of 5760 sample points. The first 5664 points in the data sequence are used for training the model, while the last 96 points are used for testing. The experimental environment is an Intel(R) Core(TM) i5-6500 CPU @ 3.20 GHz processor with 8 GB RAM, running on the MATLAB and TensorFlow platforms.

In the decomposition of the raw input data, the parameter is set as r = 100 and the standard deviation as a = 0.2 according to the reference (Lee et al., 2012). The basics of parameter selection have been discussed in Section 2.2.

Regarding the LSTM network, the number of neurons in the input layer is equal to the number of attributes of the input data. The hidden part includes three layers with 10, 15 and 5 neurons, respectively, and uses the rectified linear unit (ReLU) as the activation function. The output layer is a fully-connected layer with one neuron.
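The configuration described above maps naturally onto a Keras model. The sketch below reflects the stated layer sizes (10, 15 and 5 hidden units, ReLU activation, one output neuron); interpreting the three hidden layers as stacked LSTM layers, and the choice of input window length, loss and optimizer, are assumptions since the paper does not state them.

```python
import tensorflow as tf

def build_lstm_model(n_steps, n_features):
    """Three stacked LSTM layers (10, 15, 5 units, ReLU) and a one-neuron
    dense output, following the description in Section 5.1. Window length,
    loss and optimizer are illustrative assumptions."""
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(n_steps, n_features)),
        tf.keras.layers.LSTM(10, activation="relu", return_sequences=True),
        tf.keras.layers.LSTM(15, activation="relu", return_sequences=True),
        tf.keras.layers.LSTM(5, activation="relu"),
        tf.keras.layers.Dense(1),          # fully-connected output layer
    ])
    model.compile(optimizer="adam", loss="mse")
    return model

# Example: model = build_lstm_model(n_steps=96, n_features=1)
```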
5.2. Result analysis

The decomposition result is shown in Fig. 5. In this figure, IMF stands for intrinsic mode function; IMF1 through IMF5, with large changes, represent the high-frequency variation components of the load; IMF6 through IMF8 represent the low-frequency variation components; and R is an index that denotes the trend of the load variation. The process of convergence during the training is shown in Fig. 6.

Fig. 6. The training result.

The three prediction methods discussed in Section 4.1, SVM, RF and LSTM, and the proposed fusion algorithm are respectively used to perform the STLF. The indices representing the forecasting accuracy of each algorithm are compared in Table 2, and the forecasted loads under the different forecasting methods are depicted by the curves in Fig. 7.

Table 2
The forecasting accuracy of each algorithm.

Methods    SVM         RF          LSTM        FA
RMSE       106.4186    39.33402    20.2323     10.07455
MAPE       0.234434    0.036709    0.053074    0.028489
MAE        95.71316    36.45278    15.18708    7.599294

Fig. 7. Comparison of forecast data.

In terms of the forecasting accuracy, it can be seen from Table 2 that the MAPEs of SVM, RF, LSTM and the fusion algorithm are 23.4 %, 3.6 %, 5.3 % and 2.8 %, respectively. SVM can make use of historical load data and external environmental characteristics, but fails to take into account the time-dependent characteristics of the electrical load, so it only achieves a moderate prediction accuracy. RF makes good use of the trend of the historical sequence to derive the loads at the future time instants. It can be seen from Fig. 7 that the fusion algorithm's prediction accuracy is higher than that of any single machine-learning method. Moreover, the prediction accuracy of LSTM is slightly higher than that of RF, while much better than that of SVM. The simulation results thus verify that the proposed fusion algorithm significantly improves the prediction accuracy.

6. Conclusions

Load forecasting plays a critical role in power system scheduling and dispatch, and many research efforts have been put into improving the performance of power load forecasting. High-precision forecasting models can produce large economic, social and environmental benefits and ensure the operational security of power systems. Therefore, there is an urgent need to develop an adaptable and high-accuracy prediction algorithm.

Different forecasting methods, as reviewed in this paper, have different advantages when performing the forecasting. However, it is most likely that a single model cannot achieve a satisfying accuracy level of forecasting. To address this issue, a fusion forecasting model is proposed in this paper. It includes a data preprocessing method and a multi-step forecasting strategy that integrates the advantages of SVM, RF and LSTM, and thus comprehensively improves the forecasting accuracy. Further research can focus on the following problems:

(1) This paper ignores the conflicts between using different indices to evaluate the forecasting accuracy. Future work can focus on addressing the issue that the forecasting process involves multiple objectives or conflicting constraints.
(2) The forecasting can incorporate more impact factors (such as weather information, holidays, intermittent renewable energy, and electric vehicles).

Declaration of Competing Interest

The authors report no declarations of interest.

References

Ali, M., Khan, A., Rehman, U., 2018. Hybrid multiscale wind speed forecasting based on variational mode decomposition. Int. Trans. Electr. Energy Syst. 28, 1–21.
Barman, M., Choudhury, D., Sutradhar, S., 2018. A regional hybrid GOA-SVM model based on similar day approach for short-term load forecasting in Assam. Energy 145 (February 15), 710–720.
Breiman, L., 2001. Random forests. Mach. Learn. 45 (1), 5–32.
Chen, J., Chang, W., Lin, J., 2004. Load forecasting using support vector machines: a study on EUNITE competition 2001. IEEE Trans. Power Syst. 19 (4), 1821–1830.
Chen, Y., Xu, P., Chu, Y., et al., 2017. Short-term electrical load forecasting using the Support Vector Regression (SVR) model to calculate the demand response baseline for office buildings. Energy 195, 659–670.
Chen, H., Zhang, J., Tao, Y., Tan, F., 2019. Asymmetric GARCH type models for asymmetric volatility characteristics analysis and wind power forecasting. Prot. Control Mod. Power Syst. 4 (4), 356–366.
Diaz, A., Lopez-Rubio, E., Palomo, J., 2019. The forbidden region self-organizing map neural network. IEEE Trans. Neural Netw. Learn. Syst. 31 (1), 201–211.
Hochreiter, S., Schmidhuber, J., 1997. Long short-term memory. Neural Comput. 9 (8), 1735–1780.
Hong, H., Zhu, X., Su, W., et al., 2012. Detection of time varying pitch in tonal languages: an approach based on ensemble empirical mode decomposition. J. Zhejiang Univ. Sci. C (Computers & Electronics) 13 (2), 139–145.
Hoori, O., Kazzaz, A., Khimani, R., et al., 2020. Electric load forecasting model using a multi-column deep neural networks. IEEE Trans. Ind. Electron. 67 (8), 6473–6482.
Hou, K., Shao, G., Wang, H., Zheng, L., Zhang, Q., Wu, S., Hu, W., 2018. Research on practical power system stability analysis algorithm based on modified SVM. Prot. Control Mod. Power Syst. 3 (2), 119–125.
Huang, N., et al., 1999. A new view of nonlinear water waves: the Hilbert spectrum. Annu. Rev. Fluid Mech. 31, 417–457.
Jahangir, H., et al., 2020. Deep learning-based forecasting approach in smart grids with micro-clustering and bi-directional LSTM network. IEEE Trans. Ind. Electron.
Jiang, H., Zhang, Y., Muljadi, E., et al., 2017. A short-term and high-resolution distribution system load forecasting approach using support vector regression with hybrid parameters optimization. IEEE Trans. Smart Grid 9 (4).
Jurado, S., Nebot, À., Mugica, F., et al., 2015. Hybrid methodologies for electricity load forecasting: entropy-based feature selection with machine learning and soft computing techniques. Energy 86, 276–291.
Kong, X., Zheng, F., Cao, J., et al., 2018. Short-term load forecasting based on deep belief network. Automation Electr. Power Syst. 42 (5), 133–139.
Kong, W., et al., 2019. Short-term residential load forecasting based on LSTM recurrent neural network. IEEE Trans. Smart Grid 10 (1), 841–851.
Lahouar, A., Ben, J., 2015. Day-ahead load forecast using random forest and expert input selection. Energy Convers. Manage. 103, 1040–1051.
Lee, L., Chang, C., Hsieh, Y., et al., 2012. A brain-wave-actuated small robot car using ensemble empirical mode decomposition-based approach. IEEE Trans. Syst. Man Cybern. A Syst. Hum. 42 (5), 1053–1064.
Li, Z., Shahidehpour, M., Alabdulwahab, A., Al-Turki, Y., 2019. Valuation of distributed energy resources in active distribution networks. Electricity 32 (4), 27–36.
Lu, L., Azimi, M., Iseley, T., 2019. Short-term load forecasting of urban gas using a hybrid model based on improved fruit fly optimization algorithm and support vector machine. Energy 19 (5), 666–677.
Ronald, W., David, Z., 1989. A learning algorithm for continually running fully recurrent neural networks. Neural Comput. 1 (2), 270–280.
Uriarte, R., Tiezzi, F., Tsaftaris, A., 2016. Supporting autonomic management of clouds: service clustering with random forest. IEEE Trans. Netw. Serv. Manag. 13 (3), 595–607.
Wu, X., He, J., Zhang, P., et al., 2015. Power system short-term load forecasting based on improved random forest with grey relation projection. Automation Electr. Power Syst. 39 (12), 50–55.
Xiao, B., Zhou, C., Mu, G., 2013. Review and prospect of the spatial load forecasting methods. Proc. Chinese Soc. Electr. Eng. 000 (025), 78–92.

Weilin Guo received the B.S. degree from the College of Electrical Engineering, Tibet University, Lin Zhi, Tibet, China in 2017. He is currently pursuing the Ph.D. degree with the College of Electrical and Information Engineering at Hunan University, Changsha, Hunan, China. His research interests include power system operation and planning, and applications of machine learning methods to power systems.

Liang Che (M'15) received the B.S. degree from Shanghai Jiaotong University, China, in 2006, and the Ph.D. degree from the Illinois Institute of Technology, Chicago, IL, in 2015, all in electrical engineering. He is currently a Professor with the College of Electrical and Information Engineering, Hunan University, Changsha, China. He was with the Midcontinent Independent System Operator (MISO), Carmel, IN, USA from 2016 to 2019, and with Siemens PTI, Minnetonka, MN, USA from 2015 to 2016. His research interests include power system operation and planning, and applications of machine learning methods to power systems.

Mohammad Shahidehpour (F'01) received an Honorary Doctorate degree from the Polytechnic University of Bucharest, Bucharest, Romania. He is a University Distinguished Professor, Bodine Chair Professor, and Director of the Robert W. Galvin Center for Electricity Innovation at Illinois Institute of Technology. Dr. Shahidehpour was the recipient of the IEEE PES Ramakumar Family Renewable Energy Excellence Award, IEEE PES Douglas M. Staszesky Distribution Automation Award, IEEE PES Outstanding Power Engineering Educator Award, and IEEE PES T. Burke Hayes Faculty Recognition Award for his contributions to hydrokinetics. He is a member of the US National Academy of Engineering, and a Fellow of IEEE, the American Association for the Advancement of Science (AAAS), and the National Academy of Inventors (NAI).

Xin Wan is with the Guodian Dadu River Drainage Area Hydroelectricity Development Co., Ltd.
