Applied Energy 88 (2011) 1298–1311
Contents lists available at ScienceDirect
Applied Energy
journal homepage: www.elsevier.com/locate/apenergy
Error analysis of short term wind power prediction models
Maria Grazia De Giorgi ⇑, Antonio Ficarella, Marco Tarantino
Dipartimento di Ingegneria dell’Innovazione, Università del Salento, Via per Monteroni, 73100 Lecce, Italy
ARTICLE INFO
Article history:
Received 22 March 2010
Received in revised form 13 October 2010
Accepted 24 October 2010
Available online 24 November 2010
Keywords:
Forecasting
Wind power
ARMA models
Artificial Neural Networks
ANFIS
ABSTRACT
The integration of wind farms in power networks has become an important problem. This is because the
electricity produced cannot be stored, owing to the high cost of storage, and electricity production
must follow market demand. Wind forecasting over different time horizons, from short to long range,
is becoming an important process for the management of wind farms. Time series modelling of wind
is becoming an important process for the management of wind farms. Time series modelling of wind
speeds is based upon the valid assumption that all the causative factors are implicitly accounted for in
the sequence of occurrence of the process itself. Hence time series modelling is equivalent to physical
modelling. Auto Regressive Moving Average (ARMA) models, which perform a linear mapping between
inputs and outputs, and Artificial Neural Networks (ANNs) and Adaptive Neuro-Fuzzy Inference Systems
(ANFIS), which perform a non-linear mapping, provide a robust approach to wind power prediction. In
this work, these models are developed in order to forecast power production of a wind farm with three
wind turbines, using real load data and comparing different time prediction periods. This comparative
analysis takes into account, for the first time, various forecasting methods and time horizons together with an in-depth performance analysis focused upon the normalised mean error and its statistical distribution, in order to identify the forecasting methods whose error distribution lies within a narrower curve and with which it is therefore less probable to make large errors in prediction.
© 2010 Elsevier Ltd. All rights reserved.
1. Introduction
Among new sources of renewable energy, wind energy is
undoubtedly the one that has seen greatest growth over recent
years; thus becoming, in various countries, the true alternative to
fossil fuels. At the end of 2009, the worldwide nameplate capacity of
wind-powered generators was 159.2 gigawatts (GW), with an energy production of 340 TWh, about 2% of worldwide electricity
usage (compared with 0.1% in 1997).
The increasing interest of the worldwide academic literature in
wind energy is attested by several works dealing with this important theme.
Morales et al. [1] propose a procedure to produce a set of plausible scenarios characterising the uncertainty associated with wind
speed at different geographic sites. This characterisation constitutes an essential part within the decision-making processes faced
by both power system operators and producers with a generation
portfolio including wind plants at several locations. Zhou et al. [2]
review the current state of simulation, optimisation and control
technologies for stand-alone hybrid solar–wind energy systems
with battery storage. Xydis et al. [3] perform a wind resource
assessment study in the area of the Central Peloponnese (inland)
using Geographic Information Systems (GIS) tools and an exergy
⇑ Corresponding author.
E-mail address: mariagrazia.degiorgi@unisalento.it (M.G. De Giorgi).
0306-2619/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.
doi:10.1016/j.apenergy.2010.10.035
analysis. Hongxing et al. [4] recommend an optimal design model
for designing hybrid solar–wind systems employing battery banks
for calculating the optimum system configurations and ensuring
that the annualised cost of these systems is minimised while satisfying the custom required loss of power supply probability (LPSP).
In [5] the hourly measured wind speed data for the years 2003–2005 at heights of 10 m, 30 m and 60 m for the Kingdom of Bahrain were statistically analysed in order to determine
the potential for wind power generation. Luickx et al. [6] define
various elements that come into play when considering backup
for electricity generation from wind power. The effects of several
parameters on the short-range operation of wind power backup
are defined and analysed accordingly. The most
important parameters are the load profiles, the wind power output
profiles and the total amount of installed wind power.
The most important problem for the diffusion of wind energy is
that it is characterised by high variability, both in space and time.
Short-range wind energy forecasting is very important in order to
minimise the scheduling errors that impact on both grid reliability
and market service costs. In an energy field increasingly oriented towards the economic concepts of the market, energy
producers ought to be able to predict with high precision the amount of energy
produced in the subsequent hours or days.
The natural consequence of the evolution of the electricity market has been the search for technical solutions, often based on time
series, that are able to predict deliverable power over short-to-medium
periods. The introduction of Artificial Neural Networks, also in the energy field, has given new impetus to the use of systemic
techniques in the area of real-time prediction systems. One
requirement for a good model in a real-time prediction system is
the ability to maintain an acceptable degree of reliability in forecasting when prediction lengths increase; indeed, it is essentially
the availability of sufficient lead time in forecasting that in turn affects the potential for optimal use of the energy source in question.
Owing to the variability of wind, the skill of forecasting wind
speed in subsequent time intervals must be evaluated in the various cases.
Various forecasting models have been developed in the literature,
each with its own characteristics and each performing well under
different circumstances.
Numerical weather prediction (NWP) models are effective in
predicting large-scale area wind speed and can achieve better results in long-range forecasting [7,8]. These use hydrodynamic
atmospheric models which take into account physical phenomena
such as frictional, thermal and convective effects.
Forecasts are generally of two types, as Burton asserts [9]: (i)
short-range predictions of wind speed on a time horizon variable
from a few seconds to minutes, which could be useful for the operative control of wind turbines; (ii) long-range forecasting on a time
horizon from a few hours to days. The latter are useful in order to
plan the energy supply of wind parks. In [9] it is highlighted that
for short-range forecasting, statistical techniques offer good results, while for long-range forecasting it is necessary to rely on
meteorological methods.
The ARMA models are sophisticated methods that make forecasts based upon a linear combination of the n previous values. These
are based upon the assumption that the value of wind speed at
time k is a linear function of the two previous values at times
(k-1) and (k-2) and that the coefficients of the linear function
change every time. The initial assumption of this method is that
the statistical properties of wind (mean, auto-covariance) do not
change within the period taken into consideration for such a prediction; however, this assumption may be limiting with regard to
the use of ARMA models.
Bossanyi [10] investigated the use of ARMA models for forecasting wind speed from a few seconds to a few minutes ahead, obtaining an error reduction of up to 20% compared with statistical methods based on persistence.
Alternatively, Boland et al. [11] applied ARMA models in order
to predict wind power with a prediction length of 30 min, also proposing a method for the selection of the optimal order of the
model.
Significantly longer horizons were taken into account by Kavasseri and Seetharaman [12], who examined the use of fractional Autoregressive Integrated Moving Average (f-ARIMA) models to
forecast wind speeds over 1-day-ahead (24 h) and 2-day-ahead
(48 h) periods respectively. Forecasting accuracy was assessed by
computing three indices, i.e. the daily mean error, the variance
and the square root of the forecast mean square error. The results
obtained indicate that significant improvements in forecasting
accuracy are gained with the models proposed compared to the
persistence method.
Riahy and Abedi [13] proposed a new method based on linear
prediction for wind speed forecasting. This method utilises the linear prediction method in conjunction with filtering of the wind
speed waveform; however this approach, aimed at minimising
the absolute percentage error, is applied only for a prediction range
of a few seconds ahead.
Recent techniques such as ANNs, neuro-fuzzy networks and
wavelet-based methods are used increasingly.
An application of ANNs to wind speed forecasting with a
time horizon of one hour is found in Flores and Tapia [14], on the
basis of the three previous values, thus obtaining a mean squared
error of 0.057 m/s, whereas a new approach to wind speed forecasting in the subsequent hour is found in Sfetsos [15], whose
method is based on a multi-step prediction of average values with
10 min-intervals. Among the medium-long-range forecasts, the
work of Jayaraj et al. [16] is significant. Jayaraj applied an ANN in
order to predict the wind speed over the subsequent 1-, 24-, and
48-h periods respectively: good performance was noted in forecasting the wind speed in the subsequent hour (with a maximum
root mean squared error of 13% on the generated output), while
the use of ANNs over longer time intervals shows an increase in root
mean squared error of up to 23% in the case of the 24-h period.
Cadenas and Rivera [17] applied the ARIMA models and the
Artificial Neural Networks (ANN) methods to a time series based
on 7 years of recorded wind speed measurements with good results through a seasonal parameter introduced in the ARIMA model, using a prediction length of one hour and evaluating the
performances by the mean squared error, the mean absolute error
and the mean absolute percentage error. Potter et al. [18] and Johnson et al. [19] obtained an improvement in forecasting quality for
brief prediction lengths by applying the ANFIS method; however, only very short prediction lengths are tested: 2.5 min in [18] and
5 min in [19].
Barbounis and Theocharis [20] carried out long-range wind
speed and power forecasting in a wind farm using locally recurrent
multi-layer networks as forecasting models with a prediction
length of 15 min–3 h. Performance was evaluated by the normalised mean squared error, revealing an improvement with respect
to the persistence and ARMA models.
Bilgili et al. [21] applied resilient propagation (RP) Artificial
Neural Networks to predict the mean monthly wind speed of any
target station using the mean monthly wind speeds of neighbouring stations indicated as reference stations. The maximum and the
minimum mean absolute percentage errors were found to be,
respectively, 14.13% and 4.49%. Another attempt to forecast the
mean monthly wind speed by ANNs is found in Kalogirou et al.
[22], who used a multi-layered artificial neural network for predicting the mean monthly wind speed in regions of Cyprus. Data
for the period 1986–1996 were used to train the neural network
and data from the year 1997 were used for validation; in this case
the maximum percentage difference was only 1.8%.
In [23] the profile of wind speed in Nigeria is modelled using an
artificial neural network. The ANN model consists of a three-layered, feed-forward, back-propagation network with different configurations, designed using the Neural Toolbox for MATLAB. The
geographic parameters (latitude, longitude and altitude) and the
month of the year were used as input data and the monthly mean
wind speed was used as the output of the network.
Beccali et al. [24] present a novel approach to wind speed spatial estimation in Sicily (Italy): an incremental self-organising neural network (Generalised Mapping Regressor – GMR) is coupled
with exploratory data analysis techniques in order to obtain a
map of the spatial distribution of the average wind speed over
the entire region.
Li and Shi [25] illustrate a comprehensive comparison study on
the application of different Artificial Neural Networks in 1-h-ahead
wind speed forecasting. Three types of typical neural networks,
namely, adaptive linear element, back propagation, and radial basis
function, are investigated.
In the literature there are some interesting attempts to compare
the new non-linear forecasting techniques with the linear ARMA
models ([26–28]). According to Kariniotakis et al. [26] fuzzy-logic
leads to improvements in the predictions of wind power if compared with the simplest statistical techniques, however the forecasting range considered here is between 10 min and 2 h.
Palomares-Salas et al. [27] based their comparison between the
ARIMA model and a back-propagation neural network on three
parameters: the Pearson correlation coefficient associated with
the original time series and the forecasting series, the Index Of
Agreement (IOA) of Willmot and the root mean square error. Their
results show that the ARIMA model is better than the neural network for short-range forecast intervals (10 min, 1 h, 2 h and 4 h).
Sfetsos [28] compares ARMA, ANNs and ANFIS techniques by
evaluating the RMS error in the prediction length of 1 h for the
hourly wind speed.
The multi-layer perceptron network (MLP) is the principal technique used in [29], alongside other forecasting methods such as the
ARMA models and various kinds of ANNs. Two main forecasting
systems are presented, based not only on historical real data but
also on numerical weather predictions, obtaining an average RMS
error of approximately 14% in the horizon 12–24 h.
Lei et al. [30] give a bibliographical survey on the general background of research and developments within the fields of wind
speed and wind power forecasting. Based on the assessment of
wind power forecasting models, further directions for research and application are proposed: furthering the
study of artificial intelligence methods and improving their training algorithms, thus aiming at more accurate results, as well as combining different physical and statistical models in order to achieve
good results in both long- and short-range prediction. Further aims are research on the practical, and not only theoretical, application of the models, and the proposal of new mathematical
methods.
This work is also aimed at continuing the comparative analysis
in the literature of the most important wind power forecasting
techniques, by increasing the number of prediction horizons, and
by going into more depth in the statistical analysis of forecasting
error.
In particular, a wide comparison between ARMA models, ANFIS
and ANNs has been carried out in the attempt to create a wind
power forecasting model for a wind park in Southern Italy, an area with a complex orography and very unstable meteorological
conditions. The comparison covers more forecasting methods and
more time horizons, and offers a deeper analysis, than is to be found in
the current literature; it focuses on the normalised absolute mean
error, and a further analysis of the statistical distribution of the normalised error has been made in order to explore more deeply the
quality of the techniques, by identifying those whose error
distribution lies within a narrower curve and with which it is therefore less
probable to make large errors in prediction.
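As an illustration of the normalised error measure on which the following analysis rests, here is a minimal Python sketch; the normalisation by a reference power (e.g. the rated power of the park) and the function name are illustrative assumptions, since the paper's exact definition is given with the results:

```python
import numpy as np

def normalised_absolute_mean_error(forecast, measured, p_ref):
    """Mean absolute forecast error normalised by a reference power p_ref
    (e.g. the rated power of the park). Illustrative assumption: the
    paper's exact normalisation may differ."""
    forecast = np.asarray(forecast, dtype=float)
    measured = np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(forecast - measured)) / p_ref)

# Toy normalised power series: forecasts off by 0.1 on average
nmae = normalised_absolute_mean_error([0.5, 0.7, 0.2], [0.4, 0.8, 0.3], 1.0)
# nmae -> 0.1, i.e. 10% of the reference power
```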
2. Wind farm characteristics

This work is aimed at wind power forecasting and is based upon data measured and taken from three wind turbines in a wind park at a specific location in Southern Italy. The wind farm is located in highly complex terrain, where geographical effects make wind speed predictions difficult; besides this, meteorological conditions in the wind farm are also very unstable.
In order to define a prediction model for wind power, the most significant problem remains the selection of the best parameter to use from among the several variables of the system.
In particular, an analysis of the time series was carried out. The series is represented by the following daily data (registered every 10 min): temperature (°C), wind speed (m/s), direction (°), pressure (mmHg) and wind power of each of the three turbines (kW), measured over 5 years.
Firstly, an accurate elaboration of the measured values was necessary in order to check, each month, the days in which the parameters were either unavailable or incorrect; an algorithm was therefore created in order to compensate for the gaps in the historical data series, and two vectors were obtained accordingly: the hourly average wind speed for years 1–5 and the hourly average power for years 1–5. When the hourly average power was unavailable, it was estimated from the value of the hourly average wind speed through the power curve equation, while, in the absence of the hourly average wind speed for n time instants, an interpolation was made upon the n/2 previous and the n/2 subsequent values.
It is important to highlight that the real values of wind power are confidential, so normalised values are plotted in the following, adopting the range [0; 1] to cover the whole range of power measured.
Other variables such as pressure or temperature were not taken into consideration owing to the low correlation between these variables and wind power (Figs. 1 and 2); the determination coefficient of the best interpolation curve between the time series of wind power and wind speed is equal to 0.907, while the same coefficient is almost zero between power and pressure or power and temperature.
Figs. 3 and 4 show the normalised measured hourly wind power in the months of September and March of year IV. The higher variability which characterises March is evident, compromising the elaboration of a forecasting method.
Wind power and wind speed form a non-linear, basically cubic, relationship. Thus, a small error in the wind speed forecast will generate a large (cubic) error in wind power; this error can further increase because of the complex orography of the site, by which, for the same wind speed, there can be a great difference in wind power depending on direction. Therefore, the approach used in this study is based on the adoption of the wind power as the target for all the forecast methods applied.
In order to evaluate the quality of the available wind power data, the real power curve (obtained by plotting real wind power against real wind speed) was compared with the theoretical one (obtained by plotting theoretical wind power against real wind speed): the two curves are in good agreement (the average difference being only 1.18%). Moreover, a spectral analysis of the time series of wind power and wind speed was carried out by using Fast Fourier Transforms (FFTs) and plotting the associated Power Spectral Density (the FFT of the autocorrelation function) in order to evaluate the frequency content of the time series and identify periodicities. Fig. 5 depicts the Power Spectral Density for the whole time series (years 1–5) of wind power and wind speed.

Fig. 1. Normalised hourly wind power vs hourly temperature in the month of March, year V.
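The gap-filling step described in Section 2 can be sketched as follows; since the exact scheme of interpolation upon the n/2 previous and n/2 subsequent values is not detailed, plain linear interpolation between the neighbouring valid samples is assumed in this Python illustration:

```python
import numpy as np

def fill_speed_gaps(speed):
    """Fill runs of missing values (NaN) in an hourly wind speed series by
    linear interpolation between the nearest valid samples; an assumed
    stand-in for the averaging over previous/subsequent values described
    in the text."""
    speed = np.asarray(speed, dtype=float)
    idx = np.arange(speed.size)
    valid = ~np.isnan(speed)
    return np.interp(idx, idx[valid], speed[valid])

filled = fill_speed_gaps([5.0, np.nan, np.nan, 8.0])
# filled -> [5.0, 6.0, 7.0, 8.0]
```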
The two spectra are in good agreement, both showing a peak at 1/24 h⁻¹ and at 1/12 h⁻¹, thereby highlighting the quality of the recorded data.
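The spectral analysis described above can be reproduced in outline on a synthetic hourly series (the measured data being confidential): the Power Spectral Density is computed, as in the text, as the FFT of the autocorrelation function, and a diurnal component shows up as a peak at 1/24 h⁻¹. A Python sketch:

```python
import numpy as np

# Synthetic hourly series with a diurnal (24 h) component plus noise,
# standing in for the confidential measured wind power
rng = np.random.default_rng(0)
hours = np.arange(24 * 365)
series = (0.5 + 0.3 * np.sin(2 * np.pi * hours / 24)
          + 0.05 * rng.standard_normal(hours.size))

# PSD as the FFT of the (biased) autocorrelation function
x = series - series.mean()
acf = np.correlate(x, x, mode="full")[x.size - 1:] / x.size
psd = np.abs(np.fft.rfft(acf))
freqs = np.fft.rfftfreq(acf.size, d=1.0)   # cycles per hour

peak = freqs[np.argmax(psd[1:]) + 1]       # skip the DC bin
# peak lies at the diurnal frequency, close to 1/24 h^-1
```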
Fig. 2. Normalised hourly wind power vs hourly pressure in the month of March, year V.

3. Auto Regressive Moving Average (ARMA) models

A first attempt to predict the power generated was made through the so-called Auto Regressive Moving Average (ARMA) models. The general expression for the ARMA models is:

A(q) y(t) = C(q) e(t)    (1)

where y(t) is the variable of interest, i.e. the power produced at the time t, and A(q) and C(q) are two polynomial functions of the backshift operator q:

A(q) = 1 + a_1 q^-1 + ... + a_na q^-na    (2)

C(q) = 1 + c_1 q^-1 + ... + c_nc q^-nc    (3)

e(t) is the error term at the time t, and na and nc are the model orders (in this work na = nc).
Two test sets have been defined:

1. Application of the ARMA model (from order 1 to 11) on a training period of 3 years and a testing period of 2 years, in five forecasting cases (1 h, 3 h, 6 h, 12 h, 24 h).
2. Application of the ARMA model (from order 1 to 9) on a training period of 4 years and a testing period of 1 year, in the same cases.

The input is the hourly average power at time t, while the output of the model is the sum of the hourly average power at times t+1 ... t+h (h = prediction length).
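As a numerical illustration of the model above, the following Python sketch fits a pure AR(p) model by least squares and produces an iterated h-step-ahead forecast. It is a simplified stand-in for Eq. (1): estimating the moving-average polynomial C(q) requires an iterative method and is omitted here.

```python
import numpy as np

def fit_ar(y, p):
    """Least-squares fit of a pure AR(p) model
    y(t) = a1*y(t-1) + ... + ap*y(t-p);
    a simplified stand-in for the ARMA(na, nc) models of Eq. (1)."""
    Y = np.column_stack([y[p - i - 1:len(y) - i - 1] for i in range(p)])
    coeffs, *_ = np.linalg.lstsq(Y, y[p:], rcond=None)
    return coeffs

def forecast_ar(y, coeffs, h):
    """Iterated h-step-ahead forecast from the last observed values."""
    hist = list(y[-len(coeffs):])
    out = []
    for _ in range(h):
        nxt = sum(c * v for c, v in zip(coeffs, hist[::-1]))
        out.append(nxt)
        hist.append(nxt)
    return out

# Exact AR(2) sequence: the fit recovers the generating coefficients
y = [1.0, 0.5]
for _ in range(60):
    y.append(0.6 * y[-1] + 0.3 * y[-2])
y = np.array(y)
a = fit_ar(y, 2)            # approximately [0.6, 0.3]
f = forecast_ar(y, a, 3)    # next three values of the series
```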
Fig. 3. Normalised hourly wind power vs hourly wind speed in the month of September, year IV.

Fig. 4. Normalised hourly wind power vs hourly wind speed in the month of March, year IV.

4. Artificial Neural Networks (ANNs)

4.1. Introduction
Neural networks are composed of simple elements operating in
parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by
the connections between elements. A neural network can be
trained to perform a particular function by adjusting the values
of the connections (weights) between elements (Figs. 6 and 7).
The basic component of such a system is a neuron. When active,
electrochemical signals are received through synapses to the neuron cell. Each synapse has its own weight that determines the contribution and extent to which the respective input affects the
output of the neuron. The weighted sum of the input electrochemical signals is fed to the nucleus that sends electrical impulses in
response, being transmitted to other neurons or to other biological
units as actuation signals. Neurons are interconnected through
synapses, and the synaptic weights are modified continuously during learning. Groups of neurons are organised into subsystems and integrate
to form the brain.
The ANN technique simulates a small part of the central nervous system with a rather basic mathematical model. Inputs are fed into the corresponding neurons, and the signals are altered by
weights. The weighted sum is operated upon by an activation function, and the outputs are fed to other neurons in the network. All these
neurons are highly interconnected, and the activation values constitute the final output or may be fed to the next layer. The connection weights are continuously modified during training to obtain the
desired accuracy and generalisation capabilities.
Fig. 5. Power Spectral Density of wind power (a) and wind speed (b).

Fig. 6. Scheme of operating in a neural network.

Fig. 7. Model of a general neuron.

Fig. 8. Typical feed-forward network with two layers.
Five kinds of forecast have been made in this work by using
ANNs in order to evaluate the wind power of the park: three networks (MLFF, Elman, MLP) based on one input (the hourly average
power) and two networks (Elman and MLP) based on two inputs
(the hourly average power and the hourly average wind speed).
The application of the scheme with two inputs has been carried
out only for the Elman and the MLP networks, the two techniques
with the best respective performances in the case of one input.
For each of the networks (and also for the ANFIS model) the same two
test sets described for the ARMA models have been defined:
1. Application of the model on a training period of 3 years and on a
testing period of 2 years, in five forecasting horizons (1 h, 3 h,
6 h, 12 h, 24 h).
2. Application of the model on a training period of 4 years and on a
testing period of 1 year, in the same cases.
4.2. Multi-layer feed-forward network
The neural network used at this step is the Multi-Layer Feed-Forward Network (MLFFN). In feed-forward networks (Fig. 8) the data
flow from input to output units is strictly feed-forward. The data
processing can extend over multiple layers of units, yet no feedback connections are present, i.e. connections extending from outputs of units to inputs of units in the same layer or previous layers.
Several training algorithms use the gradient of the performance function to determine how to adjust the weights in order to
minimise it. The gradient is determined using a technique called back-propagation, which involves performing computations backwards through the network.
The goal of the algorithm is to minimise the global error E, defined as

E = (1/2) sum_{k=1}^{n} (t(k) - o(k))^2    (4)

where o(k) and t(k) are the output and the target of the network for the k-th output node.
The input is made up of average hourly power values (as many values
as the neurons of the first layer; the last value is the hourly average
power at time t), while the output is the sum of the hourly average
power at times t+1 ... t+h (h = prediction length).
Table 1 shows the final network parameters used in the training
for each prediction length, determined after an optimisation process (oriented to minimise the mean squared error).
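A minimal sketch of this training scheme: a one-hidden-layer TANSIG/PURELIN network whose weights are adjusted along the back-propagated gradient of the global error E of Eq. (4). Plain gradient descent replaces, for brevity, the Levenberg–Marquardt algorithm (TRAINLM) used in the paper, and the data are a toy series:

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.linspace(-1.0, 1.0, 64).reshape(-1, 1)   # toy inputs
T = 0.5 * X ** 3                                # toy targets

# One hidden TANSIG layer, PURELIN output (cf. Table 1)
W1 = 0.5 * rng.standard_normal((1, 8)); b1 = np.zeros(8)
W2 = 0.5 * rng.standard_normal((8, 1)); b2 = np.zeros(1)

lr = 0.05
for it in range(2000):
    H = np.tanh(X @ W1 + b1)            # hidden layer
    O = H @ W2 + b2                     # linear output layer
    err = O - T                         # dE/dO for E = 1/2 * sum((t - o)^2)
    if it == 0:
        E0 = 0.5 * np.sum(err ** 2)     # initial global error
    dH = (err @ W2.T) * (1.0 - H ** 2)  # back-propagate through tanh
    W2 -= lr * (H.T @ err) / len(X); b2 -= lr * err.sum(0) / len(X)
    W1 -= lr * (X.T @ dH) / len(X); b1 -= lr * dH.sum(0) / len(X)

E = 0.5 * np.sum((T - O) ** 2)          # final global error, Eq. (4)
# E ends up substantially smaller than the initial error E0
```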
4.3. Elman networks
This kind of network is characterised by feedback from the first-layer output to the first-layer input. This recurrent connection
Table 1
MLFF network parameters used in the training for each prediction length.

Parameters                           1 h       3 h       6 h       12 h      24 h
Training function                    TRAINLM   TRAINLM   TRAINLM   TRAINLM   TRAINLM
Adapt learning function              LEARNGD   LEARNGD   LEARNGD   LEARNGD   LEARNGD
Performance function                 MSE       MSE       MSE       MSE       MSE
Number of layers                     3         3         3         3         3
Neurons (layer 1) - inputs           16        24        24        36        48
Neurons (layer 2)                    8         12        12        18        24
Neurons (layer 3) - output           1         1         1         1         1
Activation function, hidden layer    TANSIG    TANSIG    TANSIG    TANSIG    TANSIG
Activation function, output layer    PURELIN   PURELIN   PURELIN   PURELIN   PURELIN
Epochs                               200       200       200       200       200

TRAINLM = Levenberg–Marquardt algorithm.
LEARNGD = Gradient descent weight and bias learning function.
MSE = Mean squared error.
TANSIG = Hyperbolic tangent sigmoid transfer function.
PURELIN = Linear transfer function.
allows the Elman network to detect and generate time-varying patterns (Fig. 9).

Fig. 9. Typical architecture of an Elman back-propagation network.

Firstly, the Elman network was applied with the same structural characteristics described in Table 1, assuming as input the hourly average power and as output the sum of the hourly average power at times t+1 ... t+h (h = prediction length). Secondly, two inputs were used: the hourly average power and the hourly average wind speed at time t, while the output is the sum of the hourly average power at times t+1 ... t+h.
Table 2 shows the final network parameters used in the training for each prediction length, determined after an optimisation process oriented to minimise the mean squared error.

4.4. Multi-layer perceptron network
The perceptron is a type of artificial neural network invented in
1957 at the Cornell Aeronautical Laboratory by Frank Rosenblatt. It
may be seen as the simplest kind of feed-forward neural network
and thus a linear classifier.
A multi-layer perceptron (MLP) is a modification of the standard
linear perceptron in that it uses three or more layers of neurons
with non-linear activation functions; it is more powerful than
the perceptron in that it can distinguish data that are not linearly
separable.
Learning occurs in the perceptron by changing connection
weights (or synaptic weights) after each datum is processed, based
on the amount of error in the output compared to the expected
result.
As for the Elman network, the MLP was first applied assuming
as input the hourly average power and as output the sum of the hourly
average power at times t+1 ... t+h (h = prediction length); secondly, two inputs were used, the hourly average power and the
hourly average wind speed at time t, adopting as output the sum
of the hourly average power at times t+1 ... t+h. The two structural
schemes, determined after an optimisation process oriented to
minimise the mean squared error, are described in Tables 3 and 4.
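The per-datum learning described above (weights changed after each datum according to the output error) can be sketched with the classic single-layer Rosenblatt perceptron rule; this illustrates the rule itself, not the multi-layer network used in this work:

```python
import numpy as np

def train_perceptron(X, t, epochs=20, lr=0.1):
    """Rosenblatt perceptron: after each datum the bias and weights are
    changed in proportion to the error between output and target."""
    w = np.zeros(X.shape[1] + 1)                # [bias, weights...]
    for _ in range(epochs):
        for x, target in zip(X, t):
            o = 1.0 if w[0] + x @ w[1:] >= 0 else 0.0
            w[0] += lr * (target - o)           # bias update
            w[1:] += lr * (target - o) * x      # weight update
    return w

# Linearly separable toy problem: the logical AND function
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
t = np.array([0.0, 0.0, 0.0, 1.0])
w = train_perceptron(X, t)
preds = [1.0 if w[0] + x @ w[1:] >= 0 else 0.0 for x in X]
# preds -> [0.0, 0.0, 0.0, 1.0]
```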
5. Adaptive Neuro-Fuzzy Inference Systems (ANFIS)
The acronym ANFIS is derived from Adaptive Neuro-Fuzzy Inference System. It is a hybrid of two intelligent system models and
combines the low-level computational power of a neural network
Table 2
Elman network (2 inputs) parameters used in the training for each prediction length.

Parameters                           1 h        3 h        6 h        12 h       24 h
Training function                    TRAINGDX   TRAINGDX   TRAINGDX   TRAINGDX   TRAINGDX
Adapt learning function              LEARNGD    LEARNGD    LEARNGD    LEARNGD    LEARNGD
Performance function                 MSE        MSE        MSE        MSE        MSE
Number of layers                     3          3          3          3          3
Neurons (layer 1) - inputs           2          2          2          2          2
Neurons (layer 2)                    5          5          5          5          5
Neurons (layer 3) - output           1          1          1          1          1
Activation function, hidden layer    TANSIG     TANSIG     TANSIG     TANSIG     TANSIG
Activation function, output layer    PURELIN    PURELIN    PURELIN    PURELIN    PURELIN
Epochs                               500        500        500        500        500

TRAINGDX = Gradient descent with momentum and adaptive learning rate backpropagation.
LEARNGD = Gradient descent weight and bias learning function.
MSE = Mean squared error.
TANSIG = Hyperbolic tangent sigmoid transfer function.
PURELIN = Linear transfer function.
Table 3
MLP network (1 input) parameters used in the training for each prediction length.

Parameters                           1 h       3 h       6 h       12 h      24 h
Adapt learning function              LEARNP    LEARNP    LEARNP    LEARNP    LEARNP
Performance function                 MSE       MSE       MSE       MSE       MSE
Number of layers                     3         3         3         3         3
Neurons (layer 1) - inputs           1         1         1         1         1
Neurons (layer 2)                    3         3         3         3         3
Neurons (layer 3) - output           1         1         1         1         1
Activation function, hidden layer    TANSIG    TANSIG    TANSIG    TANSIG    TANSIG
Activation function, output layer    PURELIN   PURELIN   PURELIN   PURELIN   PURELIN
Epochs                               200       200       200       200       200

LEARNP = Perceptron weight and bias learning function.
MSE = Mean squared error.
TANSIG = Hyperbolic tangent sigmoid transfer function.
PURELIN = Linear transfer function.
Table 4
MLP network (2 inputs) parameters used in the training for each prediction length.

Parameters                           1 h       3 h       6 h       12 h      24 h
Adapt learning function              LEARNP    LEARNP    LEARNP    LEARNP    LEARNP
Performance function                 MSE       MSE       MSE       MSE       MSE
Number of layers                     3         3         3         3         3
Neurons (layer 1) - inputs           2         2         2         2         2
Neurons (layer 2)                    3         3         3         3         3
Neurons (layer 3) - output           1         1         1         1         1
Activation function, hidden layer    TANSIG    TANSIG    TANSIG    TANSIG    TANSIG
Activation function, output layer    PURELIN   PURELIN   PURELIN   PURELIN   PURELIN
Epochs                               200       200       200       200       200

LEARNP = Perceptron weight and bias learning function.
MSE = Mean squared error.
TANSIG = Hyperbolic tangent sigmoid transfer function.
PURELIN = Linear transfer function.
Table 5
ANFIS parameters used in the training for each prediction length.

Parameters                      1 h       3 h       6 h       12 h      24 h
Input Membership Functions      GBELLMF   GBELLMF   GBELLMF   GBELLMF   GBELLMF
Number of input MFs             4         4         4         4         4
Output Membership Functions     LINEAR    LINEAR    LINEAR    LINEAR    LINEAR
Optimisation method             HYBRID    HYBRID    HYBRID    HYBRID    HYBRID
Epochs                          500       500       500       500       500

GBELLMF = Generalised bell-shaped built-in membership function.
LINEAR = Linear function.
HYBRID = Combination of backpropagation and the least-squares method to estimate membership function parameters.
with the high-level reasoning capability of a fuzzy inference system.
The easiest way to understand how the ANFIS model operates is to
consider it in two steps. Firstly, the system is trained in a similar way
to a neural network with a large set of input data. Once trained the
system will then operate exactly as a fuzzy expert system.
Using a given input/output data set, the ANFIS constructs a fuzzy inference system (FIS) whose membership function parameters are tuned (adjusted) using either a backpropagation algorithm alone or in combination with a least-squares method. This allows the fuzzy system to learn from the data it models.
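As an illustration of the membership functions being tuned, the generalised bell function (GBELLMF in Table 5) can be sketched in a few lines; the parameter values below are illustrative, not taken from the paper:

```python
import numpy as np

def gbellmf(x, a, b, c):
    """Generalised bell-shaped membership function:
    mu(x) = 1 / (1 + |(x - c) / a|**(2*b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

# Degree of membership of normalised power values in a fuzzy set
# centred at c = 0.5, with width a = 0.2 and slope b = 2 (toy values).
mu = gbellmf(np.array([0.1, 0.5, 0.9]), a=0.2, b=2, c=0.5)
```

The parameters a, b and c (width, slope and centre) are exactly the quantities the hybrid learning procedure adjusts.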
A network-type structure similar to that of a neural network maps inputs through input membership functions and associated parameters, and then through output membership functions and associated parameters, to outputs; this structure can then be used to interpret the input/output map. The parameters associated with the membership functions change through the learning process. Their computation (or adjustment) is facilitated by a gradient vector that measures how well the fuzzy inference system is modelling the input/output data for a given set of parameters. Once the gradient vector is obtained,
Fig. 10. ANFIS model structure.
Fig. 11. ARMA models: normalised absolute average error vs prediction length, for various combinations of training–testing periods (4 years–1 year, 3 years–2 years, 2 years–1 year, 1 year–1 year, 1 year–6 months, 6 months–6 months, 1 month–1 month, 1 month–1 week, 1 week–1 week).
Table 6
ARMA models: normalised absolute average error in the two testing sets.

Training 3 years, testing 2 years
Order   1 h (%)   3 h (%)   6 h (%)   12 h (%)  24 h (%)
1       6.875     12.603    16.979    20.913    23.680
2       6.903     12.568    16.791    20.707    23.842
3       6.906     12.577    16.819    20.717    23.795
4       6.881     12.511    16.754    20.713    23.792
5       6.897     12.552    16.795    20.718    23.812
6       6.918     12.607    16.833    20.722    23.673
7       6.899     12.554    16.782    20.682    23.726
8       6.921     12.620    16.880    20.840    23.924
9       6.910     12.579    16.818    20.722    23.832
10      6.928     12.632    16.890    20.793    23.759
11      6.913     12.583    16.802    20.713    23.830

Training 4 years, testing 1 year
Order   1 h (%)   3 h (%)   6 h (%)   12 h (%)  24 h (%)
1       6.939     12.494    16.673    20.787    23.961
2       6.958     12.439    16.530    20.633    24.010
3       6.962     12.459    16.577    20.672    24.092
4       6.977     12.508    16.658    20.729    24.057
5       6.964     12.469    16.594    20.682    24.075
6       6.962     12.456    16.577    20.669    24.065
7       6.967     12.479    16.607    20.657    23.934
8       6.975     12.511    16.663    20.734    24.054
9       6.964     12.465    16.582    20.667    24.037
any of several optimisation routines can be applied to adjust the parameters so as to reduce the error (usually defined as the sum of the squared differences between actual and desired outputs). ANFIS uses either backpropagation alone or a combination of least-squares estimation and backpropagation to estimate the membership function parameters.
In this application of the ANFIS technique, the input is the average hourly power, while the output is the sum of the average hourly power at times t + 1 … t + h (h = prediction length).
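As a sketch of this input/output arrangement (the function name and sample values are illustrative, not from the paper), the training pairs could be assembled as:

```python
import numpy as np

def make_pairs(power, h):
    """Build (input, target) pairs for a prediction length h:
    input  = average hourly power at time t,
    target = sum of the average hourly power over t+1 ... t+h."""
    x, y = [], []
    for t in range(len(power) - h):
        x.append(power[t])
        y.append(sum(power[t + 1 : t + 1 + h]))
    return np.array(x), np.array(y)

# Toy normalised power sequence, prediction length h = 2.
x, y = make_pairs([0.2, 0.4, 0.1, 0.3, 0.5], h=2)
```

The same construction applies to each of the five prediction lengths of Table 5.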
Table 5 shows the final network parameters used in the training
for each prediction length, determined after an optimisation
process aimed at minimising the mean squared error, while
Fig. 10 depicts the ANFIS structure.
6. Results and discussion
As already illustrated, each forecasting model has been applied to the following two test sets:
1. Application of the model on a training period of 3 years and on a
testing period of 2 years, in five forecasting cases (1 h, 3 h, 6 h,
12 h, 24 h).
2. Application of the model on a training period of 4 years and on a
testing period of 1 year, in the same cases.
For each prediction, an evaluation based on the normalised mean absolute percentage error has been made:

NMAPE = \frac{100}{n} \sum_{i=1}^{n} \frac{|P_i - T_i|}{\max_{i=1,\ldots,n}(T_i)} \qquad (5)

where i is the generic time instant, n the number of observations, P_i the predicted power at instant i and T_i the real power at instant i.
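Eq. (5) translates directly into code; a minimal numpy sketch (function name and sample values are illustrative):

```python
import numpy as np

def nmape(predicted, observed):
    """Normalised mean absolute percentage error, Eq. (5):
    NMAPE = (100/n) * sum_i |P_i - T_i| / max_i(T_i)."""
    p = np.asarray(predicted, dtype=float)
    t = np.asarray(observed, dtype=float)
    return 100.0 * np.mean(np.abs(p - t)) / t.max()

# Toy example: three predictions against three observations.
err = nmape([0.9, 0.5, 0.2], [1.0, 0.4, 0.2])
```

Normalising by the maximum observed power (rather than by each T_i) avoids the blow-up of ordinary percentage errors when the observed power approaches zero.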
As already illustrated, several orders of ARMA models were tested in order to evaluate the influence of the model order on the forecast and thus to identify a possible optimal order. Before applying these models to the two test sets mentioned above, several attempts were carried out to evaluate the performance of this kind of technique on the following combinations of training–testing periods: 2 years–1 year, 1 year–1 year, 1 year–6 months, 6 months–6 months, 1 month–1 month, 1 month–1 week, 1 week–1 week. The model of order five was chosen for this sensitivity analysis.
The results obtained are plotted in Fig. 11 and show that performance generally worsens as the training period decreases; thus in the following only the two main test sets (3 years–2 years and 4 years–1 year) are considered. Examining the best training–testing combination, the model error increases relatively quickly over the first 6 h, from 7% to 17%; beyond 6 h it stabilises between 20% and 24%.
In the first set (training period of 3 years and testing period of
2 years) orders from 1 to 11 were used, while in the second set
(training period of 4 years and testing period of 1 year) orders from
1 to 9 were tested, because the maximum admissible order of the
model was equal to 9 (to obtain a solution of the Eq. (1)).
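As a simplified illustration of this kind of model, the sketch below fits only the autoregressive part by least squares (no moving-average terms); it is not the paper's actual estimation procedure:

```python
import numpy as np

def fit_ar(series, p):
    """Least-squares fit of an AR(p) model:
    x_t = c + a_1*x_{t-1} + ... + a_p*x_{t-p}."""
    x = np.asarray(series, dtype=float)
    rows = [x[t - p : t][::-1] for t in range(p, len(x))]  # lags 1..p
    X = np.column_stack([np.ones(len(rows)), np.array(rows)])
    coef, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return coef  # [c, a_1, ..., a_p]

def forecast(series, coef, h):
    """Recursive h-step-ahead forecast: each prediction is fed back."""
    hist = list(series)
    p = len(coef) - 1
    for _ in range(h):
        hist.append(coef[0] + np.dot(coef[1:], hist[-1 : -p - 1 : -1]))
    return hist[-h:]

# Toy series decaying with factor 0.5, so an AR(1) fit is exact.
coef = fit_ar([1.0, 0.5, 0.25, 0.125, 0.0625], p=1)
f = forecast([1.0, 0.5, 0.25, 0.125, 0.0625], coef, h=2)
```

The recursive feedback in `forecast` is what makes multi-step errors grow with the prediction length, as seen in Table 6.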
Table 6 allows three important observations to be made:
• the forecast worsens as the prediction length increases;
• the training period does not influence the ARMA models, whose performance appears substantially independent of whether the training period is 3 or 4 years;
Fig. 12. Normalised absolute average error vs prediction length (training 3 years, testing 2 years). Series: Best ARMA, ANFIS, MLFF, MLP (1 input), Elman (1 input), MLP (2 inputs), Elman (2 inputs).
Fig. 13. Normalised absolute average error vs prediction length (training 4 years, testing 1 year). Series: Best ARMA, ANFIS, MLFF, MLP (1 input), Elman (1 input), MLP (2 inputs), Elman (2 inputs).
• there is no optimal model order: the various orders give similar performances.
In the following, a detailed comparison between the several forecasting techniques is illustrated. Qualitatively, the plots of observed versus predicted values for given prediction lengths are examined; quantitatively, the normalised percentage error described above is used. Its absolute average value identifies the best technique for each prediction length, while a deeper analysis of the statistical distribution of the normalised error identifies the techniques for which large prediction errors are least probable.
The next two graphs (Figs. 12 and 13) plot the trends of the normalised absolute average errors of all the experimented methods
for the two test sets. It is evident that the non-linear models improve very significantly on the linear model forecasts; indeed,
the performance of the ARMA models (for which the best value
is plotted among the ones obtained with the several orders used)
is worse in both the test sets except for the prediction length of
one hour; only in this case, therefore, are the ARMA models taken
into consideration in the following.
Fig. 14 shows observed and predicted values for a week in September of year V, for the prediction length of one hour (training 3 years, testing 2 years), with the following models: ARMA of order 5, ANFIS, MLP with one input (selected as the best of the three neural networks with one input) and MLP with two inputs (selected as the best of the two neural networks with two inputs). The agreement between observed and predicted values is good for all the forecasting techniques used, and their performances are quite similar (see Table 7); the choice between a simple linear technique such as the ARMA models and a more sophisticated non-linear algorithm therefore makes little difference at this horizon.
Fig. 15 shows observed and predicted values in a week in September, year V, for the prediction length of 6 h (training 3 years,
testing 2 years) with the following models: ANFIS, MLP with one
input (selected as the best among the three neural networks with
one input) and MLP with two inputs (selected as the best between
the two neural networks with two inputs). There is a visible but
slight gap between the observed and predicted values for all the
forecasting techniques used, the normalised absolute average error
of which is around 11% (see Table 7).
In Fig. 16 the observed and predicted values in a week in September, year V, are shown for the prediction length of 12 h (training 3 years, testing 2 years) with the following models: ANFIS, MLP
with one input (selected as the best among the three neural networks with one input) and MLP with two inputs (selected as the
best between the two neural networks with two inputs). Here
the gap between observed and predicted values is more evident,
and the normalised absolute average error is approximately 13%
(see Table 7).
Table 7 shows the performance of the several methods applied,
indicating the best value obtained among the several orders tested
for the ARMA models.
Observing the average values in Table 7, the performance of the non-linear models applied (ANNs and ANFIS) is clearly preferable to that of the linear ones (ARMA); among the non-linear models the performances are essentially similar, although the MLP and the Elman networks both stand out with slightly better results.
It is interesting to note that all the best results were obtained in the first test set (training period of 3 years and testing period of 2 years), suggesting that too long a training period could slightly decrease the quality of the training itself.
Moreover, it is remarkable that the MLP performs better over the first four time horizons (1, 3, 6, 12 h), while the Elman network is more suitable at the final prediction length (24 h). This may well be due to the different structures of the two network architectures examined above; in particular, the ability of the Elman network to both detect and generate time-varying patterns (through the recurrent connection shown in Fig. 9) appears practically negligible at short-medium time horizons yet becomes very important as the prediction length increases, outweighing the great advantage of the MLP network of not suffering from local minima. This advantage owes itself solely to the fact that the only parameters adjusted in the learning process are those of the linear mapping from hidden layer to output layer. Linearity ensures that the error surface is quadratic and therefore has a single, easily found minimum.
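The recurrent connection discussed above can be sketched as a single Elman time step; the weights and dimensions below are toy values for illustration, assuming the same TANSIG hidden and PURELIN output activations used for the MLPs (Tables 3 and 4):

```python
import numpy as np

def elman_step(x, h_prev, W_in, W_rec, W_out, b_h, b_o):
    """One time step of an Elman network: the previous hidden state,
    fed back through the recurrent weights W_rec, lets the network
    track time-varying patterns."""
    h = np.tanh(W_in @ x + W_rec @ h_prev + b_h)  # TANSIG hidden layer
    y = W_out @ h + b_o                           # PURELIN output layer
    return y, h

# Toy dimensions: 1 input, 3 hidden neurons, 1 output (illustrative only).
rng = np.random.default_rng(0)
W_in, W_rec = rng.normal(size=(3, 1)), rng.normal(size=(3, 3))
W_out, b_h, b_o = rng.normal(size=(1, 3)), np.zeros(3), np.zeros(1)
h = np.zeros(3)
for x_t in [0.2, 0.4, 0.1]:  # scan a short normalised power sequence
    y, h = elman_step(np.array([x_t]), h, W_in, W_rec, W_out, b_h, b_o)
```

In contrast, a feed-forward MLP would compute y from x_t alone, with no hidden state carried between time steps.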
It is also noticeable that the best networks identified in Table 7 for the first four time horizons (1, 3, 6, 12 h) have the simplest architectures: the MLP networks with one and two inputs have only one and two neurons, respectively, in the first layer. As a consequence, they are characterised by a very short computational time and appear quite suitable for the development of an online application. This is especially because of the possibility of periodical
Fig. 14. Observed vs predicted values (training 3 years, testing 2 years, prediction length 1 h). Panels: ARMA model, ANFIS, MLP (1 input) and MLP (2 inputs) networks; normalised hourly power over 1 week of September, year V.
Table 7
Normalised absolute average percentage error in the several testing sets.

Model              1 h (%)  3 h (%)  6 h (%)  12 h (%)  24 h (%)

Training period 3 years, testing period 2 years
Best ARMA          6.87     12.51    16.75    20.68     23.67
ANFIS              6.82      9.37    11.45    13.65     15.15
MLFF               6.89      9.89    11.96    13.97     15.93
ELMAN (1 input)    6.84      9.54    12.03    13.55     14.77
MLP (1 input)      6.73      9.15    11.94    13.19     15.32
ELMAN (2 inputs)   7.33      9.39    11.31    13.84     15.11
MLP (2 inputs)     6.79      9.16    11.19    13.34     15.15

Training period 4 years, testing period 1 year
Best ARMA          6.94     12.44    16.53    20.63     23.93
ANFIS              7.14      9.36    11.39    13.70     15.67
MLFF               7.51     10.06    12.16    14.17     15.91
ELMAN (1 input)    7.58      9.89    11.81    13.60     15.90
MLP (1 input)      7.17      9.45    11.73    15.01     16.99
ELMAN (2 inputs)   7.68      9.19    11.21    14.27     15.75
MLP (2 inputs)     7.15      9.39    11.47    13.66     15.62
training on new data, adapting to variations in the dynamics of the process.
Only at the last prediction length (24 h) is a more complex network architecture necessary in order to obtain better performance, with 48 neurons in the first layer and 24 in the second; it also requires a longer computational time due to its higher complexity.
Nevertheless, considering only the average values is insufficient
in the evaluation of the differences in performance of the forecasting methods. Thus, a further analysis of the statistical distribution
Fig. 15. Observed vs predicted values (training 3 years, testing 2 years, prediction length 6 h). Panels: ANFIS, MLP (1 input) and MLP (2 inputs) networks; normalised hourly power over 1 week of September, year V.
of the normalised error was made in order to identify the narrower error distribution curves and therefore the techniques for which prediction errors are less probable.
Fig. 16. Observed vs predicted values (training 3 years, testing 2 years, prediction length 12 h). Panels: ANFIS, MLP (1 input) and MLP (2 inputs) networks.

Figs. 17–19 depict error distributions for the seven methods tested (the ARMA model is of order 5) with a prediction length of
1 h, while Tables 8 and 9 contain, respectively, the probability that the error takes values in the ranges [−10%; +10%] and [−20%; +20%]. For each comparison the best value is highlighted in bold.
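The probabilities reported in Tables 8 and 9 can be computed as the empirical frequency with which the normalised error falls inside a band; a minimal sketch (function name and sample values illustrative):

```python
import numpy as np

def band_probability(predicted, observed, band):
    """Probability (%) that the error P_i - T_i, expressed as a
    percentage of the maximum observed power, lies in [-band, +band]."""
    p = np.asarray(predicted, dtype=float)
    t = np.asarray(observed, dtype=float)
    err = 100.0 * (p - t) / t.max()          # normalised error in %
    return 100.0 * np.mean(np.abs(err) <= band)

# Toy example: two of three errors fall within the +/-10% band.
prob10 = band_probability([0.9, 0.5, 0.45], [1.0, 0.4, 0.2], band=10.0)
```

A higher band probability corresponds to a narrower error distribution curve, which is the criterion used below to compare the methods.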
Fig. 17. Error distribution of some forecast models (ARMA order 5, ANFIS, MLFF; training 3 years, testing 2 years, prediction length 1 h).
Fig. 18. Error distribution of the Elman networks (1 and 2 inputs; training 3 years, testing 2 years, prediction length 1 h).
Focusing on the probability that the normalised error takes values in the ranges [−10%; +10%] and [−20%; +20%], it is evident that in most cases this probability is appreciably higher with the Elman neural networks, ensuring that the
Fig. 19. Error distribution of the MLP networks (1 and 2 inputs; training 3 years, testing 2 years, prediction length 1 h).
Table 8
Probability that the normalised error takes values in the range [−10%; +10%].

Model              1 h (%)  3 h (%)  6 h (%)  12 h (%)  24 h (%)

Training period 3 years, testing period 2 years
Best ARMA          78.90    64.28    54.67    46.49     40.91
ANFIS              78.08    69.74    61.24    54.05     43.89
MLFF               77.24    65.77    57.86    51.28     42.56
ELMAN (1 input)    77.92    66.84    58.64    52.06     48.80
MLP (1 input)      78.40    70.82    58.73    55.94     39.29
ELMAN (2 inputs)   76.86    70.27    66.67    50.87     46.39
MLP (2 inputs)     78.37    70.79    64.17    57.47     43.24

Training period 4 years, testing period 1 year
Best ARMA          77.77    62.12    51.86    44.72     39.08
ANFIS              77.32    70.22    64.17    55.26     44.28
MLFF               76.60    65.61    58.67    50.64     43.14
ELMAN (1 input)    75.92    66.40    60.10    54.07     44.92
MLP (1 input)      77.31    70.20    60.41    36.18     48.57
ELMAN (2 inputs)   76.18    70.17    63.91    48.83     42.54
MLP (2 inputs)     77.44    69.99    62.93    54.66     42.11
Table 9
Probability that the normalised error takes values in the range [−20%; +20%].

Model              1 h (%)  3 h (%)  6 h (%)  12 h (%)  24 h (%)

Training period 3 years, testing period 2 years
Best ARMA          91.27    81.30    74.29    66.77     60.13
ANFIS              91.38    87.18    83.24    77.65     73.83
MLFF               91.41    84.98    80.17    74.73     69.09
ELMAN (1 input)    91.55    87.06    81.92    76.29     73.12
MLP (1 input)      91.53    87.41    83.78    79.45     74.83
ELMAN (2 inputs)   91.04    87.91    86.25    78.27     75.40
MLP (2 inputs)     91.46    87.31    83.57    79.09     74.58

Training period 4 years, testing period 1 year
Best ARMA          90.72    79.97    71.85    63.43     57.81
ANFIS              90.77    86.54    82.82    78.44     74.47
MLFF               90.69    85.05    79.99    75.46     71.13
ELMAN (1 input)    90.61    86.59    82.24    75.98     72.13
MLP (1 input)      90.75    86.81    83.21    78.23     68.24
ELMAN (2 inputs)   90.31    86.36    82.75    77.76     73.12
MLP (2 inputs)     80.80    86.86    83.19    78.44     73.99
curve of error distribution is narrower and that prediction errors are less probable. This may again be explained by the aforementioned recurrent connection that characterises the Elman network, enabling it to detect time-varying patterns and thus generate more accurate forecasts.
7. Conclusion
The greatest problem for the diffusion of wind energy is its high
variability, both in space and time. Short-range wind energy
forecasting is very important in minimising the scheduling errors
that impact grid reliability and market service costs.
In this work a wide comparison has been carried out between the ARMA models (which perform a linear mapping between input and output), five kinds of Artificial Neural Networks (which perform a non-linear mapping between input and output and are thus suitable for describing real situations) and the ANFIS model (which combines the low-level computational power of a neural network with the high-level reasoning capability of a fuzzy inference system), in an attempt to create a forecasting model for a wind farm at a specific location in Southern Italy.
This comparison looks at many forecasting methods and time horizons, with a deep performance analysis focused upon the normalised mean error and its statistical distribution, in order to identify narrower error distributions and therefore the forecasting methods for which prediction errors are less probable.
In particular, ARMA, ANNs and ANFIS have been applied to forecasting power production using two test sets: first, a training period of 3 years and a testing period of 2 years, second a training
period of 4 years and a testing period of 1 year; in both the sets five
prediction lengths have been taken into consideration: 1, 3, 6, 12
and 24 h respectively.
For all the given techniques the forecast worsens as the prediction length increases.
The best results were obtained in the first test set (training period of 3 years and testing period of 2 years), meaning that too long
a training period may lead to a slight decrease in the performance
of the training itself.
Moreover, at a site with a complex orography like the one containing the examined wind farm, no significant benefits derive from the use of two inputs (wind speed and wind power) instead of one (wind power) in the forecast methods.
Analysing the normalised absolute average percentage error in the several testing sets, the performances of the models applied appear essentially similar, yet the MLP performs better over the first four time horizons (1, 3, 6, 12 h), while the Elman network is more suitable at the final prediction length (24 h).
It is also noticeable that the best networks identified for the first four time horizons (1, 3, 6, 12 h) are those with the simplest architectures: the MLP networks with one and two inputs have only one and two neurons, respectively, in the first layer. As a consequence, they have a very short computational time and appear quite suitable for the development of an online application, in particular owing to the possibility of periodical training on new data, adapting to variations in the dynamics of the process. Only at the last prediction length (24 h) is a more complex network architecture, with 48 neurons in the first layer and 24 in the second, necessary to obtain better performance; it also requires a longer computational time due to its higher complexity.
A further analysis of the statistical distribution of the normalised error was made, calculating the probability that the error takes values in the ranges [−10%; +10%] and [−20%; +20%]. In most cases this probability is appreciably higher with the Elman neural networks, ensuring a narrower error distribution curve and thus making prediction errors less probable.
References
[1] Morales JM, Minguez R, Conejo AJ. A methodology to generate statistically dependent wind speed scenarios. Appl Energy 2010;87:843–55.
[2] Zhou W, Chengzhi L, Zhongshi L, Lu L, Yang H. Current status of research on
optimum sizing of stand-alone hybrid solar–wind power generation systems.
Appl Energy 2010;87:380–9.
[3] Xydis G, Koroneos C, Loizidou M. Exergy analysis in a wind speed prognostic
model as a wind farm sitting selection tool: a case study in Southern Greece.
Appl Energy 2009;86:2411–20.
[4] Hongxing Y, Wei Z, Chengzhi L. Optimal design and techno-economic analysis
of a hybrid solar–wind power generation system. Appl Energy 2009;86:163–9.
[5] Jowder F. Wind power analysis and site matching of wind turbine generators in
Kingdom of Bahrain. Appl Energy 2009;86:538–45.
[6] Luickx PJ, Delarue ED, D’haeseleer WD. Considerations on the backup of wind
power: operational backup. Appl Energy 2008;85:787–99.
[7] Saylor DJ, Rosen JN, Hu T, Li X. A neural network approach to local downscaling
of GCM output for assessing wind power implications of climate change.
Renew Energy 2000;19:359–78.
[8] Sideratos G, Hatziargyriou ND. An advanced statistical method for wind power
forecasting. IEEE Trans Power Syst 2007:22.
[9] Burton NJ, Bossanyi E. Wind energy handbook. Wiley; 2001.
[10] Bossanyi E. Short-term stochastic wind prediction and possible control
application. In: Proceedings of the Delphi workshop on ‘‘wind energy
application’’ Greece; 1985. p. 137–142.
[11] Boland J, Ward K, Korolkowiecz M. Modelling the volatility in wind farm
output. School of Mathematics and Statistics-University of South Australia.
[12] Kavasseri RG, Seetharaman K. Day-ahead wind speed forecasting using fARIMA models. Renew Energy 2009;34:1388–93.
[13] Riahy GH, Abedi M. Short term wind speed forecasting for wind turbine
applications using linear prediction method. Renew Energy 2008;33:35–41.
[14] Flores AT, Tapia G. Application of a control algorithm for wind speed prediction
and active power generation. Renew Energy 2005;33:523–36.
[15] Sfetsos A. A novel approach for the forecasting of mean hourly wind speed
time series. Renew Energy 2002;27:163–74.
[16] Jayaraj KP, Padmakumari K, Sreevalsan E, Arun P. Wind speed and power
prediction using artificial neural networks. In: European wind energy
conference; 2004 [EWEC].
[17] Cadenas E, Rivera W. Wind speed forecasting in the South Coast of Oaxaca,
Mexico. Renew Energy 2006;32:2116–28.
[18] Potter C, Ringrose M, Negnevitsky M. Short-term wind forecasting techniques
for power generation. Australasian Universities power engineering
conference; 2004 [AUPEC].
[19] Johnson P, Negnevitsky M, Muttaqi KM. Short-term wind forecasting using
Adaptive Neural Fuzzy Inference. In: System Australasian Universities power
engineering conference; 2008 [AUPEC].
[20] Barbounis TG, Theocharis JB. Locally recurrent neural networks for long-term
wind speed and power prediction. Neurocomputing 2006;69:466–96.
[21] Bilgili M, Sahin B, Yasar A. Application of artificial neural networks for the
wind speed prediction of target station using reference stations data. Renew
Energy 2007;32:2350–60.
[22] Kalogirou S, Neocleous C, Pashiardis S, Schizas C. Wind speed prediction using artificial neural networks. In: European symposium on intelligent techniques; 1999.
[23] Fadare DA. The application of artificial neural networks to mapping of wind
speed profile for energy application in Nigeria. Appl Energy 2010;87:934–42.
[24] Beccali M, Cirrincione G, Marvuglia A, Serporta C. Estimation of wind velocity
over a complex terrain using the Generalized Mapping Regressor. Appl Energy
2010;87:884–93.
[25] Li G, Shi J. On comparing three artificial neural networks for wind speed
forecasting. Appl Energy 2010;87:2313–20.
[26] Kariniotakis G, Nogaret E, Stavrakakis G., Advanced short-term forecasting of
wind power production. In: Proceedings of the 1996 European union wind
energy conference EUWEC’97, Dublin, Ireland; 1997. p. 751–4.
[27] Palomares-Salas JC, De la Rosa JJG, Ramiro JG, Melgar J, Aguera A, Moreno A.
Comparison of models for wind speed forecasting. Research Unit PAIDI-TIC168. University of Cádiz.
[28] Sfetsos A. A comparison of various forecasting techniques applied to mean hourly wind speed time series. Renew Energy 2000;21:23–35.
[29] Ramirez-Rosado IJ, Ferandez-Jimenez LA, Monteiro C, Sousa J, Bessa R.
Comparison of two new short-term wind power forecasting systems. Renew
Energy 2009;34:1848–54.
[30] Lei M, Shiyan L, Chuanwen J, Hongling L, Yan Z. A review on the forecasting
of wind speed and generated power. Renew Sustain Energy Rev 2009;13:
915–20.