
Error analysis of short term wind power prediction models

2011, Applied Energy

Applied Energy 88 (2011) 1298–1311. doi:10.1016/j.apenergy.2010.10.035

Maria Grazia De Giorgi *, Antonio Ficarella, Marco Tarantino
Dipartimento di Ingegneria dell'Innovazione, Università del Salento, Via per Monteroni, 73100 Lecce, Italy

Article history: Received 22 March 2010; Received in revised form 13 October 2010; Accepted 24 October 2010; Available online 24 November 2010.

Keywords: Forecasting; Wind power; ARMA models; Artificial Neural Networks; ANFIS

Abstract

The integration of wind farms in power networks has become an important problem, because the electricity produced cannot be stored, owing to the high cost of storage, and electricity production must follow market demand. Short- to long-range wind forecasting over different time horizons is therefore becoming an important process for the management of wind farms. Time series modelling of wind speeds is based upon the valid assumption that all the causative factors are implicitly accounted for in the sequence of occurrence of the process itself; hence time series modelling is equivalent to physical modelling. Auto Regressive Moving Average (ARMA) models, which perform a linear mapping between inputs and outputs, and Artificial Neural Networks (ANNs) and Adaptive Neuro-Fuzzy Inference Systems (ANFIS), which perform a non-linear mapping, provide a robust approach to wind power prediction. In this work, these models are developed in order to forecast the power production of a wind farm with three wind turbines, using real load data and comparing different prediction horizons. This comparative analysis covers, for the first time, various forecasting methods and time horizons together with a deep performance analysis focused upon the normalised mean error and its statistical distribution, in order to identify the methods whose error distribution lies within a narrower curve and for which large prediction errors are therefore less probable. © 2010 Elsevier Ltd. All rights reserved.

* Corresponding author. E-mail address: mariagrazia.degiorgi@unisalento.it (M.G. De Giorgi).

1. Introduction

Among new sources of renewable energy, wind energy is undoubtedly the one that has seen greatest growth over recent years, thus becoming, in various countries, a true alternative to fossil fuels. At the end of 2009, worldwide nameplate capacity of wind-powered generators was 159.2 gigawatts (GW), with an energy production of 340 TW h, about 2% of worldwide electricity usage (compared to 0.1% in 1997).

The increasing interest of the worldwide academic literature in wind energy is attested by several works dealing with this important theme. Morales et al. [1] propose a procedure to produce a set of plausible scenarios characterising the uncertainty associated with wind speed at different geographic sites. This characterisation constitutes an essential part of the decision-making processes faced by both power system operators and producers with a generation portfolio including wind plants at several locations. Zhou et al. [2] review the current state of simulation, optimisation and control technologies for stand-alone hybrid solar–wind energy systems with battery storage. Xydis et al. [3] perform a wind resource assessment study in the area of the Central Peloponnese (inland) using Geographic Information Systems (GIS) tools and an exergy analysis.
Hongxing et al. [4] recommend an optimal design model for hybrid solar–wind systems employing battery banks, which calculates the optimum system configurations and ensures that the annualised cost of these systems is minimised while satisfying the required loss of power supply probability (LPSP). In [5] the hourly measured wind speed data for the years 2003–2005, at heights of 10 m, 30 m and 60 m respectively, for the Kingdom of Bahrain have been statistically analysed in order to determine the potential for wind power generation. Luickx et al. [6] define the various elements that come into play when considering backup for electricity generation from wind power. The effects of several parameters relating to the short-range operation of wind power backup are defined and analysed accordingly; the most important parameters are the load profiles, the wind power output profiles and the total amount of installed wind power.

The most important problem for the diffusion of wind energy is that it is characterised by high variability, both in space and time. Short-range wind energy forecasting is very important in order to minimise the scheduling errors that impact on both grid reliability and market service costs. In the field of energy, which is increasingly oriented towards market economics, energy producers ought to be able to predict the amount of energy produced in the subsequent hours or days with high precision. The natural consequence of the evolution of the electricity market has been the search for technical solutions, often based on time series, that are able to predict deliverable power over short–medium periods. The introduction, also in the energy field, of Artificial Neural Networks has given new impetus to the use of systemic techniques in the area of real time prediction systems. One requirement for a good model in a real time prediction system is the ability to maintain an acceptable degree of reliability in forecasting when prediction lengths increase; indeed, it is essentially the availability of sufficient lead-time in forecasting that in turn affects the potential for optimal use of the energy source in question. Owing to the variability of wind, the skill of forecasting wind speed in subsequent time intervals must be evaluated in the various cases.

Various forecasting models have been developed in the literature, each with its own characteristics and each performing well under different circumstances. Numerical weather prediction (NWP) models are effective in predicting wind speed over large areas and can achieve better results in long-range forecasting [7,8]. These use hydrodynamic atmospheric models which take into account physical phenomena such as frictional, thermal and convective effects. Forecasts are generally of two types, as Burton asserts [9]: (i) short-range predictions of wind speed on a time horizon from a few seconds to minutes, which can be useful for the operative control of wind turbines; (ii) long-range forecasting on a time horizon from a few hours to days, which is useful in order to plan the energy supply of wind parks. In [9] it is highlighted that for short-range forecasting statistical techniques offer good results, while for long-range forecasting it is necessary to rely on meteorological methods.
The ARMA models are sophisticated methods that make forecasts based upon a linear combination of n previous values. They are based upon the assumption that the value of wind speed at time k is a linear function of the two previous values at times (k − 1) and (k − 2) and that the coefficients of the linear function change every time. The initial assumption of this method is that the statistical properties of wind (average, auto-variance) do not change within the period taken into consideration for the prediction; however, this assumption may be limiting with regard to the use of ARMA models. Bossanyi [10] investigated the use of ARMA models for forecasting wind speed from a few seconds to a few minutes ahead, obtaining a decrease in error of up to 20% compared with statistical methods based on persistence. Boland et al. [11] applied ARMA models in order to predict wind power with a prediction length of 30 min, also proposing a method for the selection of the optimal order of the model. Significantly longer horizons were taken into account by Kavasseri and Seetharaman [12], who examined the use of fractional Autoregressive Integrated Moving Average (f-ARIMA) models to forecast wind speeds for 1-day-ahead (24 h) and 2-day-ahead (48 h) periods respectively. Forecasting accuracy was assessed by computing three indices, i.e. the daily mean error, the variance and the square root of the forecast mean square error. The results obtained indicate that significant improvements in forecasting accuracy are gained with the proposed models compared to the persistence method. Riahy and Abedi [13] proposed a new method based on linear prediction for wind speed forecasting. This method utilises linear prediction in conjunction with filtering of the wind speed waveform; however this approach, aimed at minimising the absolute percentage error, is applied only for a prediction range of a few seconds ahead.

Recent techniques such as ANNs, neuro-fuzzy networks and wavelet-based methods are used increasingly. An application of ANNs to wind speed forecasting with a time horizon of one hour is found in Flores and Tapia [14], on the basis of the three previous values, obtaining a mean squared error of 0.057 m/s, whereas a new approach to wind speed forecasting for the subsequent hour is found in Sfetsos [15], whose method is based on a multi-step prediction of average values with 10 min intervals. Among the medium-long-range forecasts, the work of Jayaraj et al. [16] is significant. Jayaraj applied an ANN in order to predict the wind speed over the subsequent 1-, 24- and 48-h periods respectively: good performance was noted in forecasting the wind speed in the subsequent hour (with a maximum root mean squared error of 13% on the generated output), while the use of the ANN over longer time intervals shows an increase in root mean squared error of up to 23% in the case of the 24-h period. Cadenas and Rivera [17] applied ARIMA models and Artificial Neural Network (ANN) methods to a time series based on 7 years of recorded wind speed measurements, with good results obtained through a seasonal parameter introduced in the ARIMA model, using a prediction length of one hour and evaluating the performances by the mean squared error, the mean absolute error and the mean absolute percentage error. Potter et al. [18] and Johnson
et al. [19] obtained an improvement in forecasting quality for brief prediction lengths by applying the ANFIS method; however, only very short prediction lengths are tested: 2.5 min in [18] and 5 min in [19]. Barbounis and Theocharis [20] carried out long-range wind speed and power forecasting in a wind farm using locally recurrent multi-layer networks as forecasting models with a prediction length of 15 min–3 h. Performance was evaluated by the normalised mean squared error, revealing an improvement with respect to the persistence and ARMA models. Bilgili et al. [21] applied resilient propagation (RP) Artificial Neural Networks to predict the mean monthly wind speed of a target station using the mean monthly wind speeds of neighbouring stations indicated as reference stations. The maximum and minimum mean absolute percentage errors were found to be, respectively, 14.13% and 4.49%. Another attempt to forecast the mean monthly wind speed by ANNs is found in Kalogirou et al. [22], who used a multi-layered artificial neural network for predicting the mean monthly wind speed in regions of Cyprus. Data for the period 1986–1996 were used to train the neural network and data from the year 1997 were used for validation; in this case the maximum percentage difference was only 1.8%. In [23] the profile of wind speed in Nigeria is modelled using an artificial neural network. The ANN model consists of a three-layered, feed-forward, back-propagation network with different configurations, designed using the Neural Toolbox for MATLAB. The geographic parameters (latitude, longitude and altitude) and the month of the year were used as input data and the monthly mean wind speed was used as the output of the network. Beccali et al. [24] present a novel approach to wind speed spatial estimation in Sicily (Italy): an incremental self-organising neural network (Generalised Mapping Regressor – GMR) is coupled with exploratory data analysis techniques in order to obtain a map of the spatial distribution of the average wind speed over the entire region. Li and Shi [25] illustrate a comprehensive comparison study on the application of different Artificial Neural Networks in 1-h-ahead wind speed forecasting. Three types of typical neural networks, namely adaptive linear element, back propagation and radial basis function, are investigated.

In the literature there are some interesting attempts to compare the new non-linear forecasting techniques with the linear ARMA models [26–28]. According to Kariniotakis et al. [26], fuzzy logic leads to improvements in the prediction of wind power compared with the simplest statistical techniques; however, the forecasting range considered there is between 10 min and 2 h. Palomares-Salas et al. [27] based their comparison between the ARIMA model and a back-propagation neural network on three parameters: the Pearson correlation coefficient between the original time series and the forecast series, the Index Of Agreement (IOA) of Willmott and the root mean square error. Their results show that the ARIMA model is better than the neural network for short-range forecast intervals (10 min, 1 h, 2 h and 4 h). Sfetsos [28] compares ARMA, ANN and ANFIS techniques by evaluating the RMS error for a prediction length of 1 h for the hourly wind speed. The multi-layer perceptron network (MLP) is the principal technique used in [29], alongside other forecasting methods such as the ARMA models and various kinds of ANNs.
Two main forecasting systems are presented in [29], based not only on historical real data but also on numerical weather predictions, obtaining an average RMS error of approximately 14% on the 12–24 h horizon. Lei et al. [30] give a bibliographical survey of the general background of research and developments within the fields of wind speed and wind power forecasting. Based on the assessment of wind power forecasting models, further directions for research and application are proposed. These are aimed at furthering the study of artificial intelligence methods and improving their training algorithms, thus aiming at more accurate results, as well as at combining different physical and statistical models in order to achieve good results in both long- and short-range prediction. They aim, furthermore, at research on the practical, and not only theoretical, application of the models, and at the proposal of new mathematical methods.

This work is also aimed at continuing the comparative analysis in the literature of the most important wind power forecasting techniques, by increasing the number of prediction horizons and by going into more depth in the statistical analysis of the forecasting error. In particular, a wide comparison between ARMA models, ANFIS and ANNs has been carried out in the attempt to create a wind power forecasting model for a wind park in Southern Italy, a country with a complex orography and very unstable meteorological conditions. The comparison concerns more forecasting methods, more time horizons and a deeper analysis than is to be found in the current literature; it focuses on the normalised absolute mean error, and a further analysis of the statistical distribution of the normalised error has been made in order to explore more deeply the quality of the techniques, by evaluating which error distributions lie within a narrower curve and therefore which forecasting methods are less likely to produce large prediction errors.

2. Wind farm characteristics

This work is aimed at wind power forecasting and is based upon data measured at three wind turbines in a wind park at a specific location in Southern Italy. The site is characterised by a complex orography; the wind farm is located in highly complex terrain, where geographical effects make wind speed predictions difficult, and meteorological conditions at the wind farm are also very unstable.

In order to define a prediction model for wind power, the most significant problem remains the selection of the best parameter to use from among the several variables of the system. In particular, an analysis of the time series was carried out. The series is represented by the following data (registered every 10 min): temperature (°C), wind speed (m/s), direction (°), pressure (mmHg) and wind power of each of the three turbines (kW), measured for 5 years. Firstly, an accurate elaboration of the measured values was necessary in order to check, for each month, the days in which the parameters were either unavailable or incorrect; an algorithm was then created in order to compensate for the gaps in the historical series, and two vectors were obtained accordingly: the hourly average wind speed for years 1–5 and the hourly average power for years 1–5. When the hourly average power was unavailable, it was estimated from the value of the hourly average wind speed through the power curve equation, while, in the absence of the hourly average wind speed for n time instants, an interpolation was made upon the n/2 previous and the n/2 subsequent values. It is important to highlight that the real values of wind power are confidential, so normalised values are plotted in the following, adopting the range [0; 1] to cover the whole range of measured power.

Other variables such as pressure or temperature were not taken into consideration owing to the low correlation between these variables and wind power (Figs. 1 and 2); the determination coefficient of the best interpolation curve between the time series of wind power and wind speed is equal to 0.907, while the same coefficient is almost zero between power and pressure or power and temperature.

Fig. 1. Normalised hourly wind power vs hourly temperature in the month of March, year V.
Fig. 2. Normalised hourly wind power vs hourly pressure in the month of March, year V.

Figs. 3 and 4 show the normalised measured hourly wind power in the months of September and March of year IV. The higher variability which characterises March is evident, complicating the elaboration of a forecasting method. Wind power and wind speed have a non-linear, basically cubic, relationship. Thus, a small error in the wind speed forecast will generate a large (cubic) error in wind power; this error can increase further because of the complex orography of the site, for which, at the same wind speed, there can be a great difference in wind power depending on direction. Therefore, the approach used in this study is based on the adoption of wind power as the target for all the forecast methods applied.

Fig. 3. Normalised hourly wind power vs hourly wind speed in the month of September, year IV.
Fig. 4. Normalised hourly wind power vs hourly wind speed in the month of March, year IV.

In order to evaluate the quality of the available wind power data, the real power curve (obtained by plotting real wind power against real wind speed) was compared with the theoretical one (obtained by plotting theoretical wind power against real wind speed): the two curves are in good agreement (the average difference being only 1.18%). Moreover, a spectral analysis of the time series of wind power and wind speed was carried out by using Fast Fourier Transforms (FFTs) and plotting the associated Power Spectral Density (the FFT of the autocorrelation function), in order to evaluate the frequency content of the time series and identify periodicities. Fig. 5 depicts the Power Spectral Density for the whole time series (years 1–5) of wind power and wind speed. The two spectra are in good agreement, both showing peaks at 1/24 h⁻¹ and 1/12 h⁻¹, thereby confirming the quality of the recorded data.

Fig. 5. Power Spectral Density of wind power (a) and wind speed (b).
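As an illustration of the spectral check described above, the following minimal Python sketch estimates the Power Spectral Density of an hourly power series and reports its strongest peaks, where the diurnal (1/24 h⁻¹) and semi-diurnal (1/12 h⁻¹) periodicities should appear. The file name and the use of Welch's method are assumptions for illustration only; the paper itself computes the FFT of the autocorrelation function.

```python
import numpy as np
from scipy.signal import welch

# Hypothetical hourly-averaged, normalised wind power series (file name assumed).
power = np.loadtxt("hourly_power_years1to5.txt")

# PSD estimate; the sampling rate is 1 sample per hour, so frequencies are in 1/h.
freqs, psd = welch(power, fs=1.0, nperseg=4096)
freqs, psd = freqs[freqs > 0], psd[freqs > 0]   # drop the zero-frequency bin

# Report the five strongest spectral peaks and their periods in hours.
top = np.argsort(psd)[-5:]
for f, p in sorted(zip(freqs[top], psd[top])):
    print(f"frequency {f:.4f} 1/h  (period {1.0 / f:.1f} h)  PSD {p:.3g}")
```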
3. Auto Regressive Moving Average (ARMA) models

A first attempt to predict the power generated was made through the so-called Auto Regressive Moving Average (ARMA) models. The general expression for the ARMA models is:

A(q) y(t) = C(q) e(t)    (1)

where y(t) is the variable of interest, i.e. the power produced at time t, and A(q) and C(q) are two polynomial functions of the backshift operator q:

A(q) = 1 + a_1 q^{-1} + ... + a_{na} q^{-na}    (2)

C(q) = 1 + c_1 q^{-1} + ... + c_{nc} q^{-nc}    (3)

e(t) is the error term at time t, and na and nc are the model orders (in this work na = nc).
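To make the model above concrete, here is a minimal, illustrative sketch of fitting an ARMA(na, nc) model to the hourly power series and producing an h-step-ahead forecast, using Python's statsmodels. The authors' own implementation and tooling are not specified in the paper, and the file name and train/test split below are assumptions.

```python
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

power = np.loadtxt("hourly_power_years1to5.txt")   # hypothetical normalised hourly series

hours_per_year = 8760
train = power[: 3 * hours_per_year]                 # 3-year training period
test = power[3 * hours_per_year :]                  # 2-year testing period

na = nc = 5                                         # ARMA orders (na = nc, as in the paper)
model = ARIMA(train, order=(na, 0, nc)).fit()       # ARMA = ARIMA with differencing order d = 0

h = 6                                               # prediction length in hours
forecast = model.forecast(steps=h)                  # hourly forecasts for t+1 ... t+h
print("forecast sum over next", h, "hours:", forecast.sum())
```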
Two test sets have been defined:

1. Application of the ARMA model (from order 1 to 11) on a training period of 3 years and a testing period of 2 years, for five prediction lengths (1 h, 3 h, 6 h, 12 h, 24 h).
2. Application of the ARMA model (from order 1 to 9) on a training period of 4 years and a testing period of 1 year, for the same prediction lengths.

The input is the hourly average power at time t, while the output of the model is the sum of the hourly average power at times t + 1 ... t + h (h = prediction length).

4. Artificial Neural Networks (ANNs)

4.1. Introduction

Neural networks are composed of simple elements operating in parallel. These elements are inspired by biological nervous systems. As in nature, the network function is determined largely by the connections between elements. A neural network can be trained to perform a particular function by adjusting the values of the connections (weights) between elements (Figs. 6 and 7). The basic component of such a system is a neuron. When active, electrochemical signals are received through synapses by the neuron cell. Each synapse has its own weight that determines the contribution and extent to which the respective input affects the output of the neuron. The weighted sum of the input electrochemical signals is fed to the nucleus, which sends electrical impulses in response; these are transmitted to other neurons or to other biological units as actuation signals. Neurons are interconnected through synapses, and the synaptic weights are modified continuously during learning. Groups of neurons are organised into subsystems and integrate to form the brain. The ANN technique simulates a small part of the central nervous system and is a rather basic mathematical model of the biological one. Inputs are fed into the corresponding neurons and the signals are scaled by weights; the weighted sum is operated upon by an activation function, and the outputs are fed to other neurons in the network. All these neurons are highly interconnected, and the activation values constitute the final output or may be fed to the next layer. The connection weights are continuously modified during training in order to obtain the desired accuracy and generalisation capabilities.

Fig. 6. Scheme of operating in a neural network.
Fig. 7. Model of a general neuron.

Five kinds of forecast have been made in this work by using ANNs in order to evaluate the wind power of the park: three networks (MLFF, Elman, MLP) based on one input (the hourly average power) and two networks (Elman and MLP) based on two inputs (the hourly average power and the hourly average wind speed).
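The one-input and two-input arrangements above all share the same target: the sum of the hourly average power over the next h hours. A small illustrative sketch of how such training pairs could be assembled from the hourly series is given below; the array names and the windowing choice are assumptions, not the authors' code.

```python
import numpy as np

def make_dataset(power, speed=None, h=6, lags=1):
    """Build (X, y) pairs: inputs at time t, target = sum of power over t+1 ... t+h."""
    X, y = [], []
    for t in range(lags - 1, len(power) - h):
        features = list(power[t - lags + 1 : t + 1])      # one input: recent hourly power values
        if speed is not None:                              # two inputs: add wind speed at time t
            features.append(speed[t])
        X.append(features)
        y.append(power[t + 1 : t + 1 + h].sum())           # sum of power over the next h hours
    return np.array(X), np.array(y)

# toy usage with synthetic data
rng = np.random.default_rng(0)
power = rng.random(1000)
speed = rng.random(1000) * 15.0
X1, y1 = make_dataset(power, h=6)                          # one-input scheme
X2, y2 = make_dataset(power, speed, h=6)                   # two-input scheme
print(X1.shape, X2.shape, y1.shape)
```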
The application of the scheme with two inputs has been carried out only for the Elman and the MLP networks, the two techniques with the best respective performances in the case of one input. For each of the networks (and also for the ANFIS model) the same two test sets described for the ARMA models have been defined:

1. Application of the model on a training period of 3 years and a testing period of 2 years, for five forecasting horizons (1 h, 3 h, 6 h, 12 h, 24 h).
2. Application of the model on a training period of 4 years and a testing period of 1 year, for the same horizons.

4.2. Multi-layer feed-forward network

The neural network used at this step is the Multi-Layer Feed-Forward network (MLFFN). In feed-forward networks (Fig. 8) the data flow from input to output units is strictly feed-forward. The data processing can extend over multiple layers of units, yet no feedback connections are present, i.e. no connections extending from outputs of units to inputs of units in the same layer or previous layers.

Fig. 8. Typical feed-forward network with two layers.

Several training algorithms utilise the gradient of the performance function to determine how to adjust the weights in order to minimise the performance error. The gradient is determined using a technique called back-propagation, which involves performing computations backwards through the network. The goal of the algorithm is to minimise the global error E, defined as

E = (1/2) Σ_{k=1}^{n} (t(k) − o(k))²    (4)

where o(k) and t(k) are the network output and target for the k-th output node.

The input is made up of hourly average power values (as many values as the neurons of the first layer, the last value being the hourly average power at time t), while the output is the sum of the hourly average power at times t + 1 ... t + h (h = prediction length). Table 1 shows the final network parameters used in the training for each prediction length, determined after an optimisation process oriented to minimise the mean squared error.

Table 1. MLFF network parameters used in the training for each prediction length.

Parameters                          1 h       3 h       6 h       12 h      24 h
Training function                   TRAINLM   TRAINLM   TRAINLM   TRAINLM   TRAINLM
Adapt learning function             LEARNGD   LEARNGD   LEARNGD   LEARNGD   LEARNGD
Performance function                MSE       MSE       MSE       MSE       MSE
Number of layers                    3         3         3         3         3
Neurons (layer 1) – inputs          16        24        24        36        48
Neurons (layer 2)                   8         12        12        18        24
Neurons (layer 3) – output          1         1         1         1         1
Activation function, hidden layer   TANSIG    TANSIG    TANSIG    TANSIG    TANSIG
Activation function, output layer   PURELIN   PURELIN   PURELIN   PURELIN   PURELIN
Epochs                              200       200       200       200       200

TRAINLM = Levenberg–Marquardt algorithm. LEARNGD = Gradient descent weight and bias learning function. MSE = Mean squared error. TANSIG = Hyperbolic tangent sigmoid transfer function. PURELIN = Linear transfer function.
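For readers who want to experiment with the feed-forward scheme, the sketch below trains a small multi-layer network on (X, y) pairs like those built earlier; it uses scikit-learn's MLPRegressor as a stand-in. The paper's networks use a tanh hidden layer, a linear output and Levenberg–Marquardt training, which MLPRegressor does not offer, so the optimiser here only approximates the published setup, and the data are synthetic.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(1)

# Assumed: X, y built as in the earlier dataset sketch (one lagged power value per sample).
X = rng.random((2000, 1))
y = 6.0 * X[:, 0] + 0.1 * rng.standard_normal(2000)   # synthetic stand-in target

X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

# tanh hidden units and a linear output, loosely mirroring the TANSIG/PURELIN choice in Table 1.
net = MLPRegressor(hidden_layer_sizes=(8,), activation="tanh",
                   solver="lbfgs", max_iter=200, random_state=0)
net.fit(X_train, y_train)

pred = net.predict(X_test)
mse = np.mean((pred - y_test) ** 2)                    # the performance function (Eq. (4) up to a factor)
print("test MSE:", mse)
```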
4.3. Elman networks

This kind of network is characterised by feedback from the first-layer output to the first-layer input. This recurrent connection allows the Elman network to detect and generate time-varying patterns (Fig. 9). Firstly, the Elman network was applied with the same structural characteristics described in Table 1, assuming as input the hourly average power and as output the sum of the hourly average power at times t + 1 ... t + h (h = prediction length). Secondly, two inputs were used: the hourly average power and the hourly average wind speed at time t, while the output is the sum of the hourly average power at times t + 1 ... t + h. Table 2 shows the final network parameters used in the training for each prediction length, determined after an optimisation process oriented to minimise the mean squared error.

Fig. 9. Typical architecture of an Elman back-propagation network.

Table 2. Elman network (2 inputs) parameters used in the training; the configuration is identical for all prediction lengths (1 h, 3 h, 6 h, 12 h, 24 h).
Training function: TRAINGDX; Adapt learning function: LEARNGD; Performance function: MSE; Number of layers: 3; Neurons (layer 1) – inputs: 2; Neurons (layer 2): 5; Neurons (layer 3) – output: 1; Activation function, hidden layer: TANSIG; Activation function, output layer: PURELIN; Epochs: 500.
TRAINGDX = Gradient descent with momentum and adaptive learning rate backpropagation. LEARNGD = Gradient descent weight and bias learning function. MSE = Mean squared error. TANSIG = Hyperbolic tangent sigmoid transfer function. PURELIN = Linear transfer function.
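The defining feature of the Elman network is the feedback from the hidden (first-layer) output back to its own input, which gives the model a memory of previous time steps. The minimal numpy sketch below shows only that forward recurrence for a single hidden layer; the weights are random and training, which the paper performs with gradient-descent backpropagation, is omitted.

```python
import numpy as np

rng = np.random.default_rng(2)

n_in, n_hidden = 2, 5                      # two inputs (power, wind speed) and five hidden neurons, as in Table 2
W_in = rng.standard_normal((n_hidden, n_in)) * 0.1
W_rec = rng.standard_normal((n_hidden, n_hidden)) * 0.1   # recurrent (context) connection
W_out = rng.standard_normal((1, n_hidden)) * 0.1
b_h = np.zeros(n_hidden)
b_o = np.zeros(1)

def elman_forward(sequence):
    """Run the Elman recurrence over a sequence of input vectors and return the last output."""
    h = np.zeros(n_hidden)                        # context units start at zero
    for x in sequence:
        h = np.tanh(W_in @ x + W_rec @ h + b_h)   # TANSIG hidden layer with feedback
        y = W_out @ h + b_o                       # PURELIN output layer
    return y

# toy sequence of (normalised power, wind speed) pairs
seq = rng.random((24, n_in))
print(elman_forward(seq))
```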
4.4. Multi-layer perceptron network

The perceptron is a type of artificial neural network invented in 1957 at the Cornell Aeronautical Laboratory by Frank Rosenblatt. It may be seen as the simplest kind of feed-forward neural network and thus a linear classifier. A multi-layer perceptron (MLP) is a modification of the standard linear perceptron in that it uses three or more layers of neurons with non-linear activation functions; it is more powerful than the perceptron in that it can distinguish data that are not linearly separable (i.e. not separable by a hyperplane). Learning occurs in the perceptron by changing the connection weights (or synaptic weights) after each datum is processed, based on the amount of error in the output compared to the expected result.

As for the Elman network, the MLP was applied first assuming as input the hourly average power and as output the sum of the hourly average power at times t + 1 ... t + h (h = prediction length); secondly, two inputs were used, the hourly average power and the hourly average wind speed at time t, adopting as output the sum of the hourly average power at times t + 1 ... t + h. The two structural schemes, determined after an optimisation process oriented to minimise the mean squared error, are described in Tables 3 and 4.

Table 3. MLP network (1 input) parameters used in the training; the configuration is identical for all prediction lengths (1 h, 3 h, 6 h, 12 h, 24 h).
Adapt learning function: LEARNP; Performance function: MSE; Number of layers: 3; Neurons (layer 1) – inputs: 1; Neurons (layer 2): 3; Neurons (layer 3) – output: 1; Activation function, hidden layer: TANSIG; Activation function, output layer: PURELIN; Epochs: 200.
LEARNP = Perceptron weight and bias learning function. MSE = Mean squared error. TANSIG = Hyperbolic tangent sigmoid transfer function. PURELIN = Linear transfer function.

Table 4. MLP network (2 inputs) parameters used in the training; the configuration is identical for all prediction lengths (1 h, 3 h, 6 h, 12 h, 24 h).
Adapt learning function: LEARNP; Performance function: MSE; Number of layers: 3; Neurons (layer 1) – inputs: 2; Neurons (layer 2): 3; Neurons (layer 3) – output: 1; Activation function, hidden layer: TANSIG; Activation function, output layer: PURELIN; Epochs: 200.
LEARNP = Perceptron weight and bias learning function. MSE = Mean squared error. TANSIG = Hyperbolic tangent sigmoid transfer function. PURELIN = Linear transfer function.

5. Adaptive Neuro-Fuzzy Inference Systems (ANFIS)

The acronym ANFIS stands for Adaptive Neuro-Fuzzy Inference System. It is a hybrid of two intelligent system models and combines the low-level computational power of a neural network with the high-level reasoning capability of a fuzzy inference system. The easiest way to understand how the ANFIS model operates is to consider it in two steps. Firstly, the system is trained in a similar way to a neural network with a large set of input data. Once trained, the system then operates exactly as a fuzzy expert system. Using a given input/output data set, the ANFIS constructs a fuzzy inference system (FIS) whose membership function parameters are tuned (adjusted) using either a backpropagation algorithm alone or in combination with a least-squares-type method. This allows fuzzy systems to learn from the data being modelled. A network-type structure, similar to that of a neural network, maps inputs through input membership functions and associated parameters and then through output membership functions and associated parameters to outputs; this structure can then be used to interpret the input/output map. The parameters associated with the membership functions change through the learning process. The computation of these parameters (or their adjustment) is facilitated by a gradient vector that provides a measure of how well the fuzzy inference system is modelling the input/output data for a given set of parameters. Once the gradient vector is obtained, any of several optimisation routines can be applied in order to adjust the parameters so as to reduce the error (usually defined by the sum of the squared differences between actual and desired outputs). ANFIS uses either backpropagation or a combination of least-squares estimation and backpropagation for membership function parameter estimation.

In this application of the ANFIS technique, the input is made up of the average hourly power, while the output is the sum of the average hourly power at times t + 1 ... t + h (h = prediction length). Table 5 shows the final parameters used in the training for each prediction length, determined after an optimisation process aimed at minimising the mean squared error, while Fig. 10 depicts the ANFIS structure.

Fig. 10. ANFIS model structure.

Table 5. ANFIS parameters used in the training; the configuration is identical for all prediction lengths (1 h, 3 h, 6 h, 12 h, 24 h).
Input Membership Functions: GBELLMF; Number of input MFs: 4; Output Membership Functions: LINEAR; Optimisation method: HYBRID; Epochs: 500.
GBELLMF = Generalised bell-shaped built-in membership function. LINEAR = Linear function. HYBRID = Combination of backpropagation and the least-squares method to estimate membership function parameters.
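As a rough illustration of what the trained system computes, the sketch below evaluates a first-order Sugeno fuzzy model with generalised bell membership functions, i.e. the GBELLMF/linear-output combination of Table 5. All parameter values are invented for the example; in ANFIS they would be tuned by the hybrid backpropagation/least-squares procedure described above.

```python
import numpy as np

def gbellmf(x, a, b, c):
    """Generalised bell membership function: 1 / (1 + |(x - c) / a|^(2 b))."""
    return 1.0 / (1.0 + np.abs((x - c) / a) ** (2 * b))

# Four input membership functions over normalised power (parameters invented for illustration).
mf_params = [(0.15, 2.0, 0.0), (0.15, 2.0, 0.33), (0.15, 2.0, 0.66), (0.15, 2.0, 1.0)]
# One linear (first-order Sugeno) consequent per rule: output_i = p_i * x + r_i (values invented).
consequents = [(0.5, 0.02), (1.2, 0.05), (2.4, 0.10), (4.0, 0.20)]

def sugeno_eval(x):
    """Weighted average of rule outputs, as in a single-input ANFIS forward pass."""
    w = np.array([gbellmf(x, a, b, c) for a, b, c in mf_params])   # rule firing strengths
    outputs = np.array([p * x + r for p, r in consequents])        # linear rule outputs
    return float((w * outputs).sum() / w.sum())                    # normalised weighted sum

print(sugeno_eval(0.4))   # illustrative predicted sum of power over the next h hours
```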
6. Results and discussion

As already illustrated, each forecasting model has been applied to the following two test sets:

1. Application of the model on a training period of 3 years and a testing period of 2 years, for five prediction lengths (1 h, 3 h, 6 h, 12 h, 24 h).
2. Application of the model on a training period of 4 years and a testing period of 1 year, for the same prediction lengths.

For each prediction an evaluation based on the normalised mean absolute percentage error (NMAPE) has been made:

NMAPE = (1/n) Σ_{i=1}^{n} [ |P_i − T_i| / max_{i=1..n}(T_i) ] × 100    (5)

where i is the generic time instant, n is the number of observations, P_i is the predicted power at instant i and T_i is the real power at instant i.
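A direct transcription of Eq. (5) into code, useful for checking results against the tables that follow (the toy arrays are invented):

```python
import numpy as np

def nmape(predicted, observed):
    """Normalised mean absolute percentage error (Eq. (5)):
    absolute errors are normalised by the maximum observed power over the test set."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return np.mean(np.abs(predicted - observed)) / observed.max() * 100.0

# toy example with invented values
print(nmape([0.42, 0.55, 0.61], [0.40, 0.60, 0.58]))
```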
As already illustrated, several orders of ARMA model were tested in order to evaluate the influence of the model order on the forecast and, as a consequence, to identify a possible optimal order. Before applying these models to the two test sets mentioned above, several attempts were carried out to evaluate the performance of this kind of technique on the following combinations of training–testing periods: 2 years–1 year, 1 year–1 year, 1 year–6 months, 6 months–6 months, 1 month–1 month, 1 month–1 week, 1 week–1 week. The model of order five was chosen for this sensitivity analysis. The results obtained are plotted in Fig. 11 and show that the performance generally worsens as the training period decreases; thus, in the following, only the two main test sets (3 years–2 years and 4 years–1 year) are considered. Examining the best training–testing combination, the model error increases relatively quickly in the first 6 h, from 7% to 17%; after these first 6 h the error stabilises at between 20% and 24%.

Fig. 11. ARMA models: normalised absolute average error vs prediction length (various combinations of training–testing periods).

In the first set (training period of 3 years and testing period of 2 years) orders from 1 to 11 were used, while in the second set (training period of 4 years and testing period of 1 year) orders from 1 to 9 were tested, because the maximum admissible order of the model was equal to 9 (to obtain a solution of Eq. (1)).

Table 6. ARMA models: normalised absolute average error (%) in the two testing sets.

Order   Training 3 years, testing 2 years           Training 4 years, testing 1 year
        1 h     3 h      6 h      12 h     24 h     1 h     3 h      6 h      12 h     24 h
1       6.875   12.603   16.979   20.913   23.680   6.939   12.494   16.673   20.787   23.961
2       6.903   12.568   16.791   20.707   23.842   6.958   12.439   16.530   20.633   24.010
3       6.906   12.577   16.819   20.717   23.795   6.962   12.459   16.577   20.672   24.092
4       6.881   12.511   16.754   20.713   23.792   6.977   12.508   16.658   20.729   24.057
5       6.897   12.552   16.795   20.718   23.812   6.964   12.469   16.594   20.682   24.075
6       6.918   12.607   16.833   20.722   23.673   6.962   12.456   16.577   20.669   24.065
7       6.899   12.554   16.782   20.682   23.726   6.967   12.479   16.607   20.657   23.934
8       6.921   12.620   16.880   20.840   23.924   6.975   12.511   16.663   20.734   24.054
9       6.910   12.579   16.818   20.722   23.832   6.964   12.465   16.582   20.667   24.037
10      6.928   12.632   16.890   20.793   23.759   –       –        –        –        –
11      6.913   12.583   16.802   20.713   23.830   –       –        –        –        –

Table 6 allows three important observations to be made:

- the forecast worsens when the prediction length increases;
- the training period does not influence the ARMA models, whose performances appear substantially independent of having a training period of 3 or 4 years;
- there is no optimal order of the model, but rather similar performances among the various orders.

In the following a detailed comparison between the several forecasting techniques is illustrated; qualitatively, the plots of observed versus predicted values for given prediction lengths can be examined, together with a quantitative analysis through the normalised percentage error described above. Considering its absolute average value, it is possible to identify the best technique for each prediction length; a deeper analysis of the statistical distribution of the normalised error then identifies the techniques for which prediction errors are less probable.

The next two graphs (Figs. 12 and 13) plot the trends of the normalised absolute average errors of all the tested methods for the two test sets. It is evident that the non-linear models improve very significantly on the linear model forecasts; indeed, the performance of the ARMA models (for which the best value among the several orders used is plotted) is worse in both test sets except for the prediction length of one hour; only in this case, therefore, are the ARMA models taken into consideration in the following.

Fig. 12. Normalised absolute average error vs prediction length (training 3 years, testing 2 years).
Fig. 13. Normalised absolute average error vs prediction length (training 4 years, testing 1 year).

Fig. 14 shows observed and predicted values during a week in September of year V, for the prediction length of one hour (training 3 years, testing 2 years), with the following models: ARMA of order 5, ANFIS, MLP with one input (selected as the best among the three neural networks with one input) and MLP with two inputs (selected as the best between the two neural networks with two inputs). It is particularly evident that there is good agreement between the observed and predicted values for all the forecasting techniques used, whose performances are quite similar (see Table 7); therefore the use of a simple linear technique such as the ARMA models or of a more sophisticated non-linear algorithm appears almost indifferent.

Fig. 14. Observed vs predicted values (training 3 years, testing 2 years, prediction length 1 h).
Fig. 15 shows observed and predicted values during a week in September of year V, for the prediction length of 6 h (training 3 years, testing 2 years), with the following models: ANFIS, MLP with one input (selected as the best among the three neural networks with one input) and MLP with two inputs (selected as the best between the two neural networks with two inputs). There is a visible but slight gap between the observed and predicted values for all the forecasting techniques used, whose normalised absolute average error is around 11% (see Table 7).

Fig. 15. Observed vs predicted values (training 3 years, testing 2 years, prediction length 6 h).

In Fig. 16 the observed and predicted values during a week in September of year V are shown for the prediction length of 12 h (training 3 years, testing 2 years), with the following models: ANFIS, MLP with one input (selected as the best among the three neural networks with one input) and MLP with two inputs (selected as the best between the two neural networks with two inputs). Here the gap between observed and predicted values is more evident, and the normalised absolute average error is approximately 13% (see Table 7).

Fig. 16. Observed vs predicted values (training 3 years, testing 2 years, prediction length 12 h).

Table 7 shows the performance of the several methods applied, indicating for the ARMA models the best value obtained among the several orders tested.

Table 7. Normalised absolute average percentage error (%) in the several testing sets.

Model              Training 3 years, testing 2 years     Training 4 years, testing 1 year
                   1 h    3 h    6 h    12 h   24 h      1 h    3 h    6 h    12 h   24 h
Best ARMA          6.87   12.51  16.75  20.68  23.67     6.94   12.44  16.53  20.63  23.93
ANFIS              6.82   9.37   11.45  13.65  15.15     7.14   9.36   11.39  13.70  15.67
MLFF               6.89   9.89   11.96  13.97  15.93     7.51   10.06  12.16  14.17  15.91
ELMAN (1 input)    6.84   9.54   12.03  13.55  14.77     7.58   9.89   11.81  13.60  15.90
MLP (1 input)      6.73   9.15   11.94  13.19  15.32     7.17   9.45   11.73  15.01  16.99
ELMAN (2 inputs)   7.33   9.39   11.31  13.84  15.11     7.68   9.19   11.21  14.27  15.75
MLP (2 inputs)     6.79   9.16   11.19  13.34  15.15     7.15   9.39   11.47  13.66  15.62

Observing the average values in Table 7, the performance of the several non-linear models applied (ANNs and ANFIS) appears clearly preferable to that of the linear ones (ARMA), and essentially similar among the non-linear models themselves, even if it is evident that the MLP and the Elman networks are both characterised by better performance. It is interesting to note that all the best results were obtained in the first test set (training period of 3 years and testing period of 2 years), meaning that too long a training period could slightly decrease the quality of the training itself. Moreover, it is remarkable that MLP performance appears better in the first four time horizons (1, 3, 6, 12 h), while the Elman network is more suitable for the final prediction length (24 h). This may well be due to the different structures of the two network architectures examined above; in particular, the ability of the Elman network to both detect and generate time-varying patterns (through the recurrent connection seen in Fig. 9) appears practically negligible in the short-medium time horizons, yet becomes of great importance as the prediction length increases, exceeding the advantage of the MLP network of not suffering from local minima. This advantage owes itself solely to the fact that the only parameters adjusted in the learning process are those of the linear mapping from hidden layer to output layer; linearity ensures that the error surface is quadratic and therefore has a single, easily found minimum. It is also noticeable that the best networks identified in Table 7 for the first four time horizons (1, 3, 6, 12 h) have the simplest architectures: the MLP networks with one and two inputs have respectively only one and two neurons in the first layer; as a consequence they are characterised by a very short computational time and appear quite suitable for the development of an online application.
This is especially because of the possibility of periodic retraining on new data, adapting to variations of the dynamics of the process. Only for the last prediction length (24 h) is a more complex network architecture necessary in order to obtain a better performance, with 48 neurons in the first layer and 24 in the second, which also needs a longer computational time due to its higher complexity.

Nevertheless, considering only the average values is insufficient for evaluating the differences in performance of the forecasting methods. Thus, a further analysis of the statistical distribution of the normalised error was made in order to identify the narrower error distribution curves and therefore the techniques for which prediction errors are less probable.
Figs. 17–19 depict the error distributions of the seven methods tested (the ARMA model is of order 5) for a prediction length of 1 h, while Tables 8 and 9 contain, respectively, the probability that the error takes values in the ranges [−10%; +10%] and [−20%; +20%]. For each comparison the best value is highlighted in bold.

Fig. 17. Error distribution of some forecast models (ARMA, ANFIS, MLFF; training 3 years, testing 2 years, prediction length 1 h).
Fig. 18. Error distribution of the Elman networks (training 3 years, testing 2 years, prediction length 1 h).
Fig. 19. Error distribution of the MLP networks (training 3 years, testing 2 years, prediction length 1 h).
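The band probabilities reported in Tables 8 and 9 are simply the fraction of test instants whose normalised error falls inside the given interval. A short sketch of that calculation, with invented values:

```python
import numpy as np

def band_probability(predicted, observed, band=10.0):
    """Percentage of instants whose normalised error (in % of the maximum observed
    power) lies within [-band, +band]."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    norm_err = (predicted - observed) / observed.max() * 100.0
    return float(np.mean(np.abs(norm_err) <= band) * 100.0)

# toy example with invented values
pred = np.array([0.42, 0.55, 0.61, 0.30])
obs = np.array([0.40, 0.60, 0.58, 0.45])
print(band_probability(pred, obs, band=10.0))
print(band_probability(pred, obs, band=20.0))
```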
Table 8. Probability (%) that the normalised error takes values in the range [−10%; +10%].

Model              Training 3 years, testing 2 years     Training 4 years, testing 1 year
                   1 h    3 h    6 h    12 h   24 h      1 h    3 h    6 h    12 h   24 h
Best ARMA          78.90  64.28  54.67  46.49  40.91     77.77  62.12  51.86  44.72  39.08
ANFIS              78.08  69.74  61.24  54.05  43.89     77.32  70.22  64.17  55.26  44.28
MLFF               77.24  65.77  57.86  51.28  42.56     76.60  65.61  58.67  50.64  43.14
ELMAN (1 input)    77.92  66.84  58.64  52.06  48.80     75.92  66.40  60.10  54.07  44.92
MLP (1 input)      78.40  70.82  58.73  55.94  39.29     77.31  70.20  60.41  36.18  48.57
ELMAN (2 inputs)   76.86  70.27  66.67  50.87  46.39     76.18  70.17  63.91  48.83  42.54
MLP (2 inputs)     78.37  70.79  64.17  57.47  43.24     77.44  69.99  62.93  54.66  42.11

Table 9. Probability (%) that the normalised error takes values in the range [−20%; +20%].

Model              Training 3 years, testing 2 years     Training 4 years, testing 1 year
                   1 h    3 h    6 h    12 h   24 h      1 h    3 h    6 h    12 h   24 h
Best ARMA          91.27  81.30  74.29  66.77  60.13     90.72  79.97  71.85  63.43  57.81
ANFIS              91.38  87.18  83.24  77.65  73.83     90.77  86.54  82.82  78.44  74.47
MLFF               91.41  84.98  80.17  74.73  69.09     90.69  85.05  79.99  75.46  71.13
ELMAN (1 input)    91.55  87.06  81.92  76.29  73.12     90.61  86.59  82.24  75.98  72.13
MLP (1 input)      91.53  87.41  83.78  79.45  74.83     90.75  86.81  83.21  78.23  68.24
ELMAN (2 inputs)   91.04  87.91  86.25  78.27  75.40     90.31  86.36  82.75  77.76  73.12
MLP (2 inputs)     91.46  87.31  83.57  79.09  74.58     80.80  86.86  83.19  78.44  73.99

Focusing on the probability that the normalised error takes values in the ranges [−10%; +10%] and [−20%; +20%], it becomes evident that in most of the cases this probability is appreciably higher with the Elman neural networks, ensuring that the curve of the error distribution is narrower and that large prediction errors are less probable. This may be justified once more by the aforementioned recurrent connection that characterises the Elman network, enabling it to detect time-varying patterns and thus generate more accurate forecasts.

7. Conclusion

The greatest problem for the diffusion of wind energy is its high variability, both in space and time. Short-range wind energy forecasting is very important in minimising the scheduling errors that impact grid reliability and market service costs. In this work a wide comparison has been carried out between the ARMA models (which perform a linear mapping between input and output), five kinds of Artificial Neural Networks (which perform a non-linear mapping between input and output and are thus suitable for describing real situations) and the ANFIS model (which combines the low-level computational power of a neural network with the high-level reasoning capability of a fuzzy inference system), in an attempt to create a forecasting model for a wind park at a specific location in Southern Italy.
This comparison looks at many forecasting methods and time horizons, with a deep performance analysis focused upon the normalised mean error and its statistical distribution, in order to identify the narrower error distributions and therefore the forecasting methods for which prediction errors are less probable. In particular, ARMA, ANNs and ANFIS have been applied to forecasting power production using two test sets: first, a training period of 3 years and a testing period of 2 years; second, a training period of 4 years and a testing period of 1 year. In both sets five prediction lengths have been taken into consideration: 1, 3, 6, 12 and 24 h respectively.

For all the given techniques the forecast worsens when the prediction length increases. The best results were obtained in the first test set (training period of 3 years and testing period of 2 years), meaning that too long a training period may lead to a slight decrease in the performance of the training itself. Moreover, in a site with a complex orography like the one containing the examined wind farm, no significant benefits derive from the use of two inputs (wind speed and wind power) instead of one (wind power) in the forecast methods.

Analysing the normalised absolute average percentage error in the several testing sets, the performances of the applied models appear essentially similar, yet it is evident that MLP performance is better in the first four time horizons (1, 3, 6, 12 h), while the Elman network is more suitable for the final prediction length (24 h). It is also noticeable that the best networks identified for the first four time horizons (1, 3, 6, 12 h) are the ones with the simplest architectures: the MLP networks with one and two inputs have respectively only one and two neurons in the first layer; as a consequence, they have a very short computational time and appear quite suitable for the development of an online application, in particular owing to the possibility of periodic retraining on new data, adapting to variations of the dynamics of the process. Only for the last prediction length (24 h) is a more complex network architecture necessary to obtain better performance, with 48 neurons in the first layer and 24 in the second, which also needs a longer computational time due to its higher complexity.

A further analysis of the statistical distribution of the normalised error was made, calculating the probability that the error takes values in the ranges [−10%; +10%] and [−20%; +20%]. In most cases this probability is appreciably higher with the Elman neural networks, ensuring a narrower error curve and thus making large prediction errors less probable.

References

[1] Morales JM, Minguez R, Conejio AJ. A methodology to generate statistically dependent wind speed scenarios. Appl Energy 2010;87:843–55.
[2] Zhou W, Chengzhi L, Zhongshi L, Lu L, Yang H. Current status of research on optimum sizing of stand-alone hybrid solar–wind power generation systems. Appl Energy 2010;87:380–9.
[3] Xydis G, Koroneos C, Loizidou M. Exergy analysis in a wind speed prognostic model as a wind farm sitting selection tool: a case study in Southern Greece. Appl Energy 2009;86:2411–20.
[4] Hongxing Y, Wei Z, Chengzhi L. Optimal design and techno-economic analysis of a hybrid solar–wind power generation system. Appl Energy 2009;86:163–9.
[5] Jowder F. Wind power analysis and site matching of wind turbine generators in Kingdom of Bahrain. Appl Energy 2009;86:538–45.
[6] Luickx PJ, Delarue ED, D'haeseleer WD. Considerations on the backup of wind power: operational backup. Appl Energy 2008;85:787–99.
[7] Saylor DJ, Rosen JN, Hu T, Li X. A neural network approach to local downscaling of GCM output for assessing wind power implications of climate change. Renew Energy 2000;19:359–78.
[8] Sideratos G, Hatziargyriou ND. An advanced statistical method for wind power forecasting. IEEE Trans Power Syst 2007;22.
[9] Burton NJ, Bossanyi E. Wind energy handbook. Wiley; 2001.
[10] Bossanyi E. Short-term stochastic wind prediction and possible control application. In: Proceedings of the Delphi workshop on "wind energy application", Greece; 1985. p. 137–142.
[11] Boland J, Ward K, Korolkowiecz M. Modelling the volatility in wind farm output. School of Mathematics and Statistics, University of South Australia.
[12] Kavasseri RG, Seetharaman K. Day-ahead wind speed forecasting using f-ARIMA models. Renew Energy 2009;34:1388–93.
[13] Riahy GH, Abedi M. Short term wind speed forecasting for wind turbine applications using linear prediction method. Renew Energy 2008;33:35–41.
[14] Flores AT, Tapia G. Application of a control algorithm for wind speed prediction and active power generation. Renew Energy 2005;33:523–36.
[15] Sfetsos A. A novel approach for the forecasting of mean hourly wind speed time series. Renew Energy 2002;27:163–74.
[16] Jayaraj KP, Padmakumari K, Sreevalsan E, Arun P. Wind speed and power prediction using artificial neural networks. In: European wind energy conference (EWEC); 2004.
[17] Cadenas E, Rivera W. Wind speed forecasting in the South Coast of Oaxaca, Mexico. Renew Energy 2006;32:2116–28.
[18] Potter C, Ringrose M, Negnevitsky M. Short-term wind forecasting techniques for power generation. In: Australasian Universities power engineering conference (AUPEC); 2004.
[19] Johnson P, Negnevitsky M, Muttaqi KM. Short-term wind forecasting using Adaptive Neural Fuzzy Inference System. In: Australasian Universities power engineering conference (AUPEC); 2008.
[20] Barbounis TG, Theocharis JB. Locally recurrent neural networks for long-term wind speed and power prediction. Neurocomputing 2006;69:466–96.
[21] Bilgili M, Sahin B, Yasar A. Application of artificial neural networks for the wind speed prediction of target station using reference stations data. Renew Energy 2007;32:2350–60.
[22] Kalogirou S, Neocleous C, Pashiardis S, Schizas C. Wind speed prediction using artificial neural networks. In: European symposium on intelligent techniques; 1999.
[23] Fadare DA. The application of artificial neural networks to mapping of wind speed profile for energy application in Nigeria. Appl Energy 2010;87:934–42.
[24] Beccali M, Cirrincione G, Marvuglia A, Serporta C. Estimation of wind velocity over a complex terrain using the Generalized Mapping Regressor. Appl Energy 2010;87:884–93.
[25] Li G, Shi J. On comparing three artificial neural networks for wind speed forecasting. Appl Energy 2010;87:2313–20.
[26] Kariniotakis G, Nogaret E, Stavrakakis G. Advanced short-term forecasting of wind power production. In: Proceedings of the 1996 European union wind energy conference (EUWEC'97), Dublin, Ireland; 1997. p. 751–4.
[27] Palomares-Salas JC, De la Rosa JJG, Ramiro JG, Melgar J, Aguera A, Moreno A. Comparison of models for wind speed forecasting. Research Unit PAIDI-TIC-168, University of Cádiz.
[28] Sfetsos A. A comparison of various forecasting techniques applied to mean hourly wind speed time series. Renew Energy 2000;21:23–35.
[29] Ramirez-Rosado IJ, Fernandez-Jimenez LA, Monteiro C, Sousa J, Bessa R. Comparison of two new short-term wind power forecasting systems. Renew Energy 2009;34:1848–54.
[30] Lei M, Shiyan L, Chuanwen J, Hongling L, Yan Z. A review on the forecasting of wind speed and generated power. Renew Sustain Energy Rev 2009;13:915–20.