Research Article: Global Solar Radiation Forecasting Using Square Root Regularization-Based Ensemble
Yao Dong¹,² and He Jiang¹,²
¹ School of Statistics, JiangXi University of Finance and Economics, Nanchang, JiangXi 330013, China
² Applied Statistics Research Center, Nanchang, Jiangxi 330013, China
Received 26 February 2019; Revised 30 April 2019; Accepted 2 May 2019; Published 14 May 2019
Copyright © 2019 Yao Dong and He Jiang. This is an open access article distributed under the Creative Commons Attribution
License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly
cited.
In recent decades, the integration of solar energy sources has gradually become the main challenge for global energy consumption.
Therefore, it is essential to predict global solar radiation in an accurate and efficient way when estimating outputs of the solar system.
Inaccurate predictions either cause load overestimation that results in increased cost or failure to gather adequate supplies. However,
accurate forecasting is a challenging task because solar resources are intermittent and uncontrollable. To tackle this difficulty, several
machine learning models have been established; however, the forecasting outcomes of these models are not sufficiently accurate.
Therefore, in this study, we investigate ensemble learning with square root regularization and intelligent optimization to forecast
hourly global solar radiation. The main structure of the proposed method is constructed based on ensemble learning with a random
subspace (RS) method that divides the original data into several covariate subspaces. A novel covariate-selection method called
square root smoothly clipped absolute deviation (SRSCAD) is proposed and is applied to each subspace with efficient extraction
of relevant covariates. To combine the forecasts obtained using RS and SRSCAD, a firefly algorithm (FA) is used to estimate the
weights assigned to individual forecasts. To handle the complexity of the proposed ensemble system, a simple and efficient algorithm
is derived based on a thresholding rule and accelerated gradient method. To illustrate the validity and effectiveness of the proposed
method, global solar radiation datasets of eight locations of Xinjiang province in China are considered. The experimental results
show that the proposed RS-SRSCAD-FA achieves the best performance, with a mean absolute percentage error, root-mean-square
error, Theil inequality coefficient, computation cost, and correlation coefficient of 0.066, 20.21 W/m², 0.016, 3.40 s, and 0.98 at site 1, respectively. For
the other seven datasets, RS-SRSCAD-FA still outperforms other approaches. Finally, a nonparametric Friedman test is applied to
perform statistical comparisons of results over eight datasets.
representation is infeasible. This advantage promotes the application of machine learning models in classification, pattern recognition, spam filtering, data mining, and forecasting [6]. At present, a popular method for forecasting global solar radiation is to use an artificial neural network (ANN). ANNs have been used in various areas with different input covariates (multiple meteorological, geographical, and astronomical parameters such as temperature, humidity, wind speed, precipitation, altitude, declination, and zenith angle) and multiple time scales (hourly, daily, and monthly) [7, 8]. The main advantage of ANNs is that they do not require as many adjustable parameters as other classical techniques and provide higher forecasting accuracy. Ozgoren et al. [9] proposed an ANN model based on a multinonlinear regression (MNLE) algorithm to forecast the monthly mean daily sum global solar radiation for Turkey. Hejase et al. used multilayer perceptron and radial basis function (RBF) techniques with comprehensive training architectures and different combinations of inputs to forecast global horizontal irradiance for three major cities in the United Arab Emirates [10]. Renno et al. [11] developed two ANNs to forecast daily global radiation and hourly direct normal irradiance for the University of Salerno. Notwithstanding their growing popularity, ANNs have drawbacks. For example, the backpropagation neural network (BPNN) is difficult to train owing to the iterative tuning of the parameters required for the hidden neurons, and it does not always produce a unique global solution with different models; moreover, BPNN exhibits a slow response owing to its gradient-based learning algorithm and requires a longer training time [12, 13].
It is well known that the support vector machine (SVM) developed by Vapnik [14] is a powerful tool for exploring appropriate SVM regressors to establish a prediction model with sufficiently high accuracy. The main goal of SVM regressors is to construct the optimum prediction model that boosts the prediction accuracy efficiently. It is a regularization network and exhibits an advantage over the ANN model. To reduce the high computation cost, least-squares SVM (LSSVM) uses a least-squares method via a set of linear equations based on structural risk minimization; therefore, it is able to prevent overfitting of the training data and does not require iterative tuning of model parameters [15–17]. Wu and Liu investigated an SVM model to estimate monthly mean daily solar radiation using measured minimum, mean, and maximum air temperatures at 24 sites in China. Five SVMs based on different input covariates were developed [18]. Chen et al. used seven SVMs and five empirical sunshine-based models to forecast daily global solar radiation at three sites in Liaoning, China. The experimental results demonstrated that all SVMs exhibited higher performance than the empirical models [19]. Ekici proposed an LSSVM-based intelligent model to forecast the daily solar insolation for Turkey [20]. It was noted that the temporal behavior of global solar radiation and its corresponding atmospheric input covariates are complex and chaotic [21]. Although SVM is flexible in handling nonlinear input covariates, it encounters fluctuations and difficulties in training input datasets with wide frequencies [22]. Some researchers have focused on Gaussian process regression (GPR) methodology, which shows promising results with lower computational costs. Rohani et al. [23] presented GPR with a $K$-fold cross-validation (CV) model to forecast daily and monthly solar radiation at Mashhad city in Iran based on several meteorological parameters. Considering the effect of climatic factors such as sunshine duration, relative humidity, and air temperature, Guermoui et al. [24] used GPR to predict the daily global solar radiation in the Saharan climate. Guermoui et al. [25] proposed a novel method called weighted Gaussian process regression (WGPR) to predict daily global and direct horizontal solar radiation in the Saharan climate. The experimental results illustrate that these developed models exhibit good forecasting performance.
It is challenging for a single forecasting model to provide an optimum solution to a complex problem. Each model has advantages and drawbacks. Thus, it is worthwhile to investigate appropriate combinations of their advantages for implementing the forecasting task. As solar radiation simultaneously exhibits nonstationary and nonlinear characteristics at different time scales, the analysis of these fluctuations at all scales can be performed using multiscale decomposition methods. Wang et al. [26] decomposed solar radiation into a quasi-stationary process at different frequency scales using the wavelet transform (WT) and employed BPNN to predict solar radiation based on temperature, clearness index, and phase space reconstruction radiation. Monjoly et al. [27] applied three multiscale decomposition methods, empirical mode decomposition (EMD), ensemble empirical mode decomposition (EEMD), and WT, to analyze the nonstationary and nonlinear features of the clear sky index, and then developed a hybrid AR-NN model that combines an autoregressive (AR) process and a neural network (NN) to forecast global horizontal solar irradiation (GHI). The experimental results showed that the WT-hybrid model has the best forecasting accuracy. To improve the forecasting performance, Hussain and Alalili [28] investigated a new hybrid technique based on WT and four different architectures of ANN to forecast GHI in the Abu Dhabi region.
Ensemble methods are straightforward and highly effective techniques that average the approximate predictions of a few weak learners to present accurate estimates rather than find a single remarkable sophisticated learner [29–31]. Ensemble methods have gained prominence. The most commonly used ensemble methods include random forest, gradient boosting, and bagging. Although these methods have wide applications in other scientific fields, they have only rarely been adopted in forecasting global solar radiation [6]. Gala et al. [32] used support vector regression, random forest regression, and gradient boosted regression as a hybrid model to improve 3-hour accumulated radiation forecasts for seven sites in Spain. Sun et al. [33] proposed different random forest models with and without considering the air pollution index to estimate solar radiation at three sites in China through comparisons with a few empirical models; their experimental results demonstrated that the performance of random forest techniques with the air pollution index is higher. Aler et al. [34] applied gradient boosting machine learning to improve the performance of diffuse and direct separation models. Ensembles of linear and nonlinear weak learners have both
Mathematical Problems in Engineering 3
been adopted. The results indicated that the gradient boosting method can reduce large random errors of separation models efficiently.
The computational cost is mainly determined by the complexity of the forecasting model. It is time-consuming if several covariates are employed to establish the model. Thus, it is essential to investigate an efficient method to simplify the model structure. Covariate selection is an efficient method to extract the important covariates. There are several penalized covariate selection methods including least absolute shrinkage and selection operator (LASSO) [35], least angle regression (LARS) [36], elastic net [37], and adaptive LASSO [38]. LASSO is a popular covariate selection method applying an $\ell_1$-type penalty. Yang et al. [39] selected the important time-lagged covariates using LASSO and revealed its advantages over the exponential smoothing (ES) model and the autoregressive integrated moving average (ARIMA) model. LARS is closely associated with LASSO, and their connection is established in Efron et al. [36]. Zou proposed adaptive LASSO, which adds weights to the penalty parameter of each feature [38]. It outperforms LASSO in terms of selection consistency. The elastic net was first proposed by Zou, and it considers an $\ell_2$-type penalty in addition to the $\ell_1$-type penalty [37]. Thus, the elastic net applies an $\ell_1 + \ell_2$-type penalty, which provides the following advantages: the $\ell_1$ part focuses on feature selection and the $\ell_2$ part can be applied to boost forecasting accuracy. However, Zhang showed that the abovementioned feature selection methods apply convex penalties, which makes it difficult to detect the important covariates when the model coherence is high [40]. To solve this problem, nonconvex penalized covariate selection approaches are advocated. Knight and Fu proposed bridge regression, which focuses on the penalty function of the $\ell_r$ ($0 < r < 2$) form [41]. Fan and Li [42] proposed the smoothly clipped absolute deviation (SCAD) method and demonstrated its advantage over LASSO. However, to the best of the present authors' knowledge, there is no research work on feature selection using both the square root loss function and the SCAD penalty in the literature. The main contribution of this study is a novel global solar radiation forecasting approach that combines ensemble learning, square root smoothly clipped absolute deviation (SRSCAD), and the firefly algorithm (FA). Specifically, the contributions of this study are as follows:
(i) An SRSCAD method is proposed to extract the important covariates efficiently. Moreover, the optimal regularization parameter is selected by a 10-fold CV; this CV completely utilizes the data through resampling, which is particularly useful for data with a small sample size.
(ii) Ensemble learning is the main structure of the proposed method. The diversity created by covariate selection of the ensemble system substantially aids in boosting the forecasting accuracy. The FA, which has been demonstrated to obtain the global optimal solution successfully, is applied as a weighting strategy in the forecasting model.
(iii) A simple-to-implement and efficient algorithm is designed based on thresholding rules, and an accelerated gradient method is used to further speed up the convergence of the algorithm.
The rest of this paper is organized as follows. Section 2 presents a preliminary discussion of related methodologies. Section 3 presents the proposed method and the design of the algorithm used in it. The empirical study and corresponding results analysis are presented in Section 4. Finally, Section 5 summarizes this study and provides concluding remarks. The nomenclature is provided in Table 1.
2. Methodology and Materials
2.1. Forecasting and Covariate Selection in the Interactive Model. Several nonlinear models including SVMs and ANNs have been considered to forecast global solar radiation because researchers have gradually realized that the outcomes given by a linear regression model are not accurate. Interactive models are one class of nonlinear models and have wide applications in many scientific fields. They are favored because they take the connections between covariates into account to capture the nonlinear relationship between the global solar radiation and the other meteorological and geographical covariates. In this study, an interactive model is considered to build the forecasting model in a global solar radiation problem, and the following nonlinear additive setup is studied:
\[ y = \beta_0 + \sum_{j} \beta_j x_j + \sum_{j,k} \theta_{jk}\, x_j \odot x_k + \varepsilon, \tag{1} \]
where $y$ denotes the global solar radiation to be studied, $X = [x_1, \ldots, x_p] \in \mathbb{R}^{n \times p}$ is the data matrix with $n$ samples and $p$ meteorological and geographical covariates, and $\varepsilon$ represents the noise term, which follows a Gaussian distribution $N(0, \sigma^2 I)$ with $\sigma^2$ representing the noise level and $I$ an identity matrix. The distribution of $\varepsilon$ can also be a binomial distribution if a classification problem is studied. The product $x_j \odot x_k$ denotes a two-way covariate. Here $\beta_0$ is the intercept term, and $\{\beta_j\}_{j=1}^{p}$ and $\{\theta_{jk}\}_{1 \le j,k \le p}$ are the coefficients of the covariates to be estimated. Note that a large number of covariates are included in this model. Furthermore, the large computation time is one challenge caused by the excessive number of covariates.
In the global solar radiation forecasting domain, a proper metric that measures the fitness of global solar radiation and other meteorological and geographical features needs to be defined. This metric is called a loss function, denoted by $L$, in the machine learning community. Because $\varepsilon$ follows a Gaussian distribution in the interactive model (1), a simple way to define this metric is to consider the negative log-likelihood of the distribution of $y$ (after centering), that is, of $f(y \mid X, \beta, \theta)$, which, up to an additive constant, is
\[ L \triangleq -\log f(y \mid X, \beta, \theta) = \frac{1}{2\sigma^2} \Bigl\| y - \sum_{j} \beta_j x_j - \sum_{j,k} \theta_{jk}\, x_j \odot x_k \Bigr\|_2^2, \tag{2} \]
where $\|\cdot\|_2$ represents the Euclidean norm. A common procedure is to restrict the number of covariates to avoid the overfitting problem. Namely, a few of the covariates may not be useful to the model and they are considered to be nuisance
\[ \min_{\beta_j,\, \theta_{jk}} \Bigl\{ \sqrt{L} + \sum_{j} P_{\mathrm{SCAD}}(\beta_j; \lambda) + \sum_{j,k} P_{\mathrm{SCAD}}(\theta_{jk}; \lambda) \Bigr\}. \tag{6} \]
In the following, we focus on (6).
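As a minimal illustrative sketch (not the authors' implementation), the objective in (6) can be evaluated numerically as follows. The sketch rests on three assumptions: the interactive design contains all $p^2$ element-wise products $x_j \odot x_k$; $\sqrt{L}$ is taken as the literal square root of the Gaussian loss $\|r\|_2^2/(2\sigma^2)$ with residual $r$; and $P_{\mathrm{SCAD}}$ is the standard SCAD penalty of Fan and Li [42] with $a = 3.7$. The function names `interaction_design`, `scad_penalty`, and `objective` are hypothetical.

```python
import numpy as np

def interaction_design(X):
    """Stack the p main effects with all p*p products x_j * x_k (assumed layout)."""
    n, p = X.shape
    # pairs[:, j * p + k] holds the element-wise product of columns j and k
    pairs = (X[:, :, None] * X[:, None, :]).reshape(n, p * p)
    return np.hstack([X, pairs])

def scad_penalty(t, lam, a=3.7):
    """Standard SCAD penalty (Fan and Li), applied element-wise; a = 3.7 is assumed."""
    t = np.abs(np.asarray(t, dtype=float))
    small = lam * t
    mid = (2.0 * a * lam * t - t**2 - lam**2) / (2.0 * (a - 1.0))
    large = lam**2 * (a + 1.0) / 2.0
    return np.where(t <= lam, small, np.where(t <= a * lam, mid, large))

def objective(y, X, beta, theta, lam, sigma=1.0):
    """Square-root loss plus SCAD penalties, following the form of (6).

    beta has shape (p,), theta has shape (p, p); y is assumed to be centered.
    """
    Z = interaction_design(X)
    coef = np.concatenate([beta, theta.ravel()])
    resid = y - Z @ coef
    # literal square root of ||resid||^2 / (2 sigma^2); this reading is an assumption
    sqrt_loss = np.linalg.norm(resid) / (np.sqrt(2.0) * sigma)
    return sqrt_loss + scad_penalty(coef, lam).sum()
```

With all coefficients set to zero the penalty vanishes and the objective reduces to $\|y\|_2/(\sqrt{2}\sigma)$, which gives a quick sanity check of the sketch.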
2.2. Ensemble Learning. The statistician George E. P. Box famously said “all the models are wrong but some of them are useful” [45]. This indicates that a single forecasting model cannot always perform very well under all circumstances. The basic idea of ensemble learning is to create a finite number of individual learners and combine them by assigning different weights and summing them up [46]. Ensemble learning is a supervised learning algorithm because it can be used to make predictions and classifications. A single model that is going to be trained can be seen as one hypothesis. However, it is difficult to decide which hypothesis is the best. Thus, the ensemble learning system combines all the hypotheses and creates a better hypothesis, although the computational time is increased. Therefore, ensemble learning can represent more flexible functions than a single model. Note that an overfitting problem can be caused by the ensemble techniques. In practice, regularization methods are applied to reduce the problem related to overfitting of the training data.
Empirically, ensemble learning tends to provide better results than a single model when there is significant diversity among the models. Therefore, many ensemble methods seek to promote diversity. Specifically, the diversity can be generated from the sample space (boosting and bagging), covariate space (RS [47] and random forest [48]), and parameter space. To gain better performance, the method used to combine single models is very important. The simplest way is just to calculate the average of all the results given by the single models. However, these single models may play different roles in the ensemble system. Thus, linear and nonlinear weighting strategies are applied to evaluate their contributions. In this paper, the FA, which is introduced in the following section, is used as a weighting strategy for single models.
2.3. Firefly Algorithm. Xin-She Yang proposed a nature-inspired metaheuristic in 2008 called FA, which was derived from the flashing behavior of fireflies [49]. The main purpose of a flash is to act as a signal that attracts other fireflies. There are three hypotheses in this algorithm: (i) regardless of their sex, each firefly can move towards any other firefly; (ii) attractiveness is associated with brightness, so the fireflies with low light intensities will be attracted by the brighter ones in the vicinity, and the brightness increases as their distance decreases; (iii) if no fireflies are brighter than the given one, it will move randomly [50, 51].
Let $F_i$ be the $i$th firefly in the population, where each firefly represents a candidate solution in the search space and $N$ is the population size. Fireflies move towards the optimal solution positions. The attractiveness is related to the light intensity; thus we can define the attractiveness between two fireflies $F_i$ and $F_j$ as follows:
\[ g(r_{ij}) = g_0 e^{-\gamma r_{ij}^2}, \tag{7} \]
and
\[ r_{ij} = \| F_i - F_j \|_2 = \sqrt{\sum_{d=1}^{D} (f_{id} - f_{jd})^2}, \tag{8} \]
for $d = 1, \ldots, D$, where $\|\cdot\|_2$ denotes the Euclidean norm, $D$ is the number of dimensions, $r_{ij}$ is the Cartesian distance between $F_i$ and $F_j$, and $f_{id}$ and $f_{jd}$ are the $d$th components of $F_i$ and $F_j$, respectively. The parameter $g_0$ represents the attractiveness at the distance $r = 0$, and $\gamma$ denotes the light absorption coefficient. According to [52], a typical initial value is $\gamma = 1/\Gamma^2$, where $\Gamma$ is the length scale of the optimization problem.
The movement of the firefly $F_i$ towards another brighter firefly $F_j$ can be given by
\[ f_{id}(t+1) = f_{id}(t) + g_0 e^{-\gamma r_{ij}^2} \bigl[ f_{jd}(t) - f_{id}(t) \bigr] + \alpha \operatorname{sgn}\Bigl( rand - \frac{1}{2} \Bigr) \oplus \text{Lévy}, \tag{9} \]
\[ \text{Lévy}: \; u = t^{-a} \quad (1 < a \le 3), \tag{10} \]
where $\alpha \in [0, 1]$ is the step size of the random walk and $\operatorname{sgn}(rand - 1/2)$ ($rand \in [0, 1]$) is a random sign, whereas the random step length is obtained from the Lévy distribution in (10). In this paper, minimization problems are considered. Let $V$ be the fitness function. If $V(F_i) < V(F_j)$, this means that $F_i$ is brighter than the firefly $F_j$. The main steps of the FA are described as follows:
(i) Step 1. Initialize the positions of the fireflies (or $N$ solutions) $F_i = [f_{i1}, f_{i2}, \ldots, f_{iD}]$ randomly based on the following:
\[ f_{id} = low + rand(0, 1)\,(up - low) \quad (i = 1, 2, \ldots, N,\; d = 1, 2, \ldots, D), \tag{11} \]
where $low$ and $up$ denote the lower and upper bounds of each dimension, respectively. The fitness function is then used to evaluate the positions of the fireflies in the initial population.
(ii) Step 2. Compare the positions of $F_i$ and $F_j$ ($i \neq j$); if $V(F_j) < V(F_i)$, that is, $F_j$ is brighter, $F_i$ will be attracted by $F_j$ and updates its position based on (9). The updated position of each firefly is then evaluated.
(iii) Step 3. If the iteration process satisfies the stopping criteria, the algorithm stops; otherwise, continue with step 2.
3. The RS-SRSCAD-FA Method
In this paper, a novel global solar radiation forecasting method is developed based on RS, SRSCAD, and the FA and is called RS-SRSCAD-FA for short. The main procedure of RS-SRSCAD-FA is described as follows:
(i) Step 1: Build the interactive model as described in (1).
(ii) Step 2: The RS method is applied to the interactive model. In particular, randomly select $b$ covariates without replacement $B$ times, which results in $B$ blocks denoted by $\{G_1, \ldots, G_B\}$ with $b$ covariates in each block.
(iii) Step 3: The SRSCAD covariate selection method is applied to each block $\{G_i\}_{i=1}^{B}$.
(iv) Step 4: Assign weights to the $B$ blocks and concatenate the estimates using the weights determined by the FA.
Remark. In step 1, an interactive model is established so that all the two-way covariates $x_j \odot x_k$ that describe the possible relationships between covariates $x_j$ and $x_k$ are considered. Note that the dimension of the model has increased from $p$ to $p^2 + p$. In step 2, similar to random forest, the interactive model is divided into a number of blocks using the RS method, which randomly extracts a number of covariates. This procedure is the same as the divide procedure in the “divide and conquer” strategy of ensemble learning. The complex model is divided into some simple models that are easy to handle separately. After this procedure, all the solutions are combined. In step 3, the proposed dimension reduction method SRSCAD is used to reduce the number of covariates. This step is essential because it can reduce the number of covariates effectively. This delivers two benefits: (1) the computation issue is solved because only a few covariates are included; (2) the collinearity between covariates is decreased. In step 4, the SRSCAD estimates obtained in step 3 are concatenated using the FA instead of a linear combination. This procedure aggregates the diversity generated by steps 2 and 3. The flowchart of the proposed method is shown in Figure 1.
To solve the penalized optimization problems efficiently, the thresholding rule (generally referred to as the $S$ function [53]), rather than the penalty function, is applied as the main tool throughout this study. The threshold function is defined rigorously as follows.
Definition (threshold rule). A threshold rule is a real-valued function $S(t; \lambda)$ defined for $-\infty < t < \infty$ and $0 \le \lambda < \infty$ such that
(1) $S(-t; \lambda) = -S(t; \lambda)$;
(2) $S(t; \lambda) \le S(t'; \lambda)$ for $t \le t'$;
(3) $\lim_{t \to +\infty} S(t; \lambda) = \infty$; and
(4) $0 \le S(t; \lambda) \le t$ for $0 \le t < \infty$.
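As a rough numerical illustration of this definition, the sketch below implements two familiar thresholding functions and checks properties (1), (2), and (4) on a finite grid; property (3), unboundedness, cannot be verified on a finite grid and is left out. The SCAD rule is written in the standard form given by Fan and Li [42] with $a = 3.7$, which is an assumption here and may differ from the form used in the paper's own algorithm. The names `soft_threshold`, `scad_threshold`, and `is_threshold_rule` are hypothetical.

```python
import numpy as np

def soft_threshold(t, lam):
    """Soft thresholding: sgn(t) * max(|t| - lam, 0)."""
    t = np.asarray(t, dtype=float)
    return np.sign(t) * np.maximum(np.abs(t) - lam, 0.0)

def scad_threshold(t, lam, a=3.7):
    """SCAD thresholding in the standard Fan-Li form (assumed, a > 2)."""
    t = np.asarray(t, dtype=float)
    abs_t = np.abs(t)
    middle = ((a - 1.0) * t - np.sign(t) * a * lam) / (a - 2.0)
    return np.where(abs_t <= 2.0 * lam, soft_threshold(t, lam),
                    np.where(abs_t <= a * lam, middle, t))

def is_threshold_rule(S, lam, grid):
    """Check oddness, monotonicity, and shrinkage on a sorted, symmetric grid."""
    s = S(grid, lam)
    odd = np.allclose(S(-grid, lam), -s)            # property (1)
    monotone = np.all(np.diff(s) >= -1e-12)         # property (2)
    pos = grid >= 0
    shrink = np.all((s[pos] >= -1e-12) & (s[pos] <= grid[pos] + 1e-12))  # property (4)
    return bool(odd and monotone and shrink)
```

For example, `is_threshold_rule(soft_threshold, 1.0, np.linspace(-10.0, 10.0, 2001))` exercises the check on a symmetric grid; both functions leave large inputs unbounded, but only the SCAD rule returns them unshrunk once $|t| > a\lambda$.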
Algorithm 1, line 16 (fragment): Step 3. $\Omega^{(t+1)} \leftarrow S_{\mathrm{SCAD}}\bigl(\zeta^{(t)};\ \lambda_u \|y - X\Omega^{(t)}\|_2 / \tau^2\bigr)$
From the definition, it is evident that $S(\cdot; \lambda)$ is an odd monotone unbounded shrinkage rule for $t$, at any $\lambda$. The LASSO and SCAD rules are defined as follows:
\[ S_{\mathrm{SOFT}}(t; \lambda) = \operatorname{sgn}(t)\,(|t| - \lambda)_+ ; \tag{12} \]
\[ S_{\mathrm{SCAD}}(t; \lambda) = \begin{cases} \lambda & \text{if } t \le \lambda \\ \dfrac{3.7\lambda - t}{2.7} & \text{if } \lambda \le t \le a\lambda \\ 0 & \text{if } t \ge a\lambda \end{cases} ; \tag{13} \]
where sgn is the sign function. The vector versions of $S_{\mathrm{SOFT}}(t; \lambda)$ and $S_{\mathrm{SCAD}}(t; \lambda)$ are denoted by $S_{\mathrm{SOFT}}(\mathbf{t}; \lambda)$ and $S_{\mathrm{SCAD}}(\mathbf{t}; \lambda)$ for any vector $\mathbf{t}$.
To describe the proposed RS-SRSCAD-FA algorithm in a simple way, we first define a matrix $\Omega$ with its $j$th column given by $\Omega_j = [\beta_j, \theta_{j1}, \ldots, \theta_{jp}]$, which indicates the $j$th covariate and all of its associated two-way covariates. The details of the proposed algorithm are given in Algorithm 1 after initialization with $\omega^{(-1)} = 0$, $\omega^{(0)} = 1$ and $\Omega^{(0)} = \Omega^{(-1)}$. When designing the algorithm, the following points should be noted:
(i) Data splitting scheme. A total of 75% of the original data are used as training data and the remaining part (25%) is considered to be the test data. The training data are used to train a forecasting model and the test data are used to evaluate the model performance.
(ii) Convergence.
Step size. The step size $\tau$ is used to ensure the convergence of the SRSCAD algorithm (cf. lines 11–19 in Algorithm 1). Usually, it can be determined based on a theoretical analysis with the surrogate function defined. Another way to
determine the step size is to apply a line search method.
Stopping criteria. The error tolerance between two successive iterates and the maximum iteration number are determined by trial and error. In particular, the error tolerance is chosen from the set $\{10^{-4}, 10^{-5}, 10^{-6}\}$ and the maximum number of iterations is selected from $\{100, 500, 1000\}$. The optimal combination of the error tolerance and maximum iteration number is determined using the trade-off between forecasting accuracy and computational efficiency.
(iii) Acceleration. The accelerated gradient method (AGM) is applied to increase the convergence speed. The design of AGM was first proposed by [54], and the convergence rate can be improved from $O(1/M)$ to $O(1/M^2)$, where $M$ represents the number of iterations of the algorithm.
(iv) Parameter tuning. In a regularization-based problem, the choice of regularization parameter is critical. In this work, the regularization parameters are selected based on 10-fold CV. The main procedure can be described as follows: the data are divided into $K$ parts with roughly equal sizes. $K - 1$ parts are used to train the model and the remaining part is used to calculate the test error. CV repeats this procedure $K$ times for each candidate regularization parameter, and the final CV error is the average of the $K$ test errors. The optimal parameter is selected based on the smallest CV error. When $K = n$, this procedure is known as leave-one-out cross-validation (LOOCV), which is often used in complex problems.
4. Forecasting Global Solar Radiation in Xinjiang Province of China
In this section, to demonstrate the forecasting accuracy of the proposed RS-SRSCAD-FA algorithm, comparisons are made between this and other forecasting methods including two benchmark models, the Elman neural network (ENN) and LSSVM, the Angstrom–Prescott empirical model (Angstrom–Prescott) [55], popular feature selection models including LASSO, SCAD, and SRL, and existing hybrid models including the combination of cuckoo search and square root progressive quantile variable selection procedure in sparse quadratic radial basis function (CS-SRPQVSP-QRBF) [56], particle swarm optimization in a backpropagation neural network (PSO-BPNN) [57], and the firefly algorithm in a support vector machine (SVM-FA) [58]. The experimental results are evaluated using forecasting accuracy and computation efficiency. Furthermore, the analysis of multihour-ahead cases including 24 h ahead and 48 h ahead is also provided in the experiments.
4.1. Data Collection and Description. The Xinjiang area is rich in sunshine and far from the population centers so that it is appropriate for large-scale solar thermal power industry installations. Therefore, it is worthwhile to investigate the global solar radiation by conducting research at eight sites in this region; see Figure 2. This dataset is collected from the National Renewable Energy Laboratory (NREL) and is available at the following website: http://www.nrel.gov/gis/solar.html. In addition to the global solar radiation (W/m²), the data consist of seven meteorological covariates, which are the solar zenith angle (degrees), precipitation (cm), temperature (°C), wind direction (degrees), wind speed (m/s), relative humidity (%), and air pressure (mbar). The data were collected between 11:00 and 20:00 from 1/1/2014 to 12/31/2014, as the sunshine is poor during other periods. As an illustrative example, the global solar radiation signal and other meteorological covariates from 1/1/2014 to 1/31/2014 at Site 1 are shown in Figure 3. The main purpose of this paper is to forecast the hourly global solar radiation using these meteorological covariates accurately and efficiently, which could play a dominant role in the design of solar power plants.
4.2. Individual Models. ENN and LSSVM are implemented using the MATLAB neural network toolbox and LSSVMlab 1.5 toolbox. Specifically, a three-layer ENN is established with the number of input neurons $N_i = 56$, which is the number of covariates in the interactive model. The optimal number of hidden neurons is 5, which is selected from the set $N_h = \{5, 10, 15, 20\}$ using 10-fold CV. The number of output neurons is $N_o = 1$. Thus, ENN uses a $56 \times 5 \times 1$ network architecture. Furthermore, to estimate the weights between layers more accurately, weight regularization with an $\ell_2$ penalty function is applied during the weight estimation procedure. The $\ell_2$ regularization parameter is selected from a grid of values denoted by $\eta = \{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}\}$. The optimal regularization parameter $\eta_o$ is determined as $10^{-5}$ using the Akaike information criterion (AIC) [59]. LSSVM is established by applying an RBF kernel with two kernel parameters selected from the grid $\{2^{-6}, 2^{-5}, \ldots, 2^{5}, 2^{6}\}$ using 10-fold CV. SRL is implemented based on the algorithm COORD, the MATLAB code of which may be downloaded from Belloni's website. Theoretical selection is applied for the regularization parameter $\lambda$ in COORD: $\lambda = 1.1\,G^{-1}(1 - 0.05/(2(p^2 + p)))$ in the optimization problem (5), as recommended in [60], with $G$ representing the cumulative distribution function of a Gaussian distribution. For the Angstrom–Prescott empirical model, we apply the empirical model in the form of a second-order polynomial, which has been shown to be superior to other forms of the model by [55]. The possible maximum monthly mean of the daily sunshine duration is given by $S_d = (2/15)\cos^{-1}(-\tan \ell \tan \delta)$ with $\ell$ representing the location latitude and $\delta = 23.45 \sin(360(284 + D_n)/365)$, where $D_n$ is the day number of the year. CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA are implemented using MATLAB and all the model parameters are tuned to give good performance. The proposed RS-SRSCAD-FA is implemented based on Algorithm 1.
4.3. Statistical Measures of Forecasting Performance. In this paper, four common criteria, the mean absolute percentage error (MAPE), root-mean-square error (RMSE), Theil
Figure 2: Details of the eight sites in the Xinjiang area for measuring global solar radiation. [Map of Xinjiang; legend: annual total global horizontal radiation (kWh/m²) in the classes <1050, 1050-1400, 1400-1750, and >1750. Site coordinates: Site 1, 36.85° N, 80.55° E; Site 2, 37.05° N, 81.65° E; Site 3, 36.95° N, 81.75° E; Site 5, 37.55° N, 83.15° E; Site 7, 37.55° N, 84.15° E.]
Mean global solar radiation (W/m²), one value per site: 560.213, 556.648, 557.266, 556.207, 555.014, 558.547, 559.223, 666.151
Maximum global solar radiation (W/m²), one value per site: 956, 960, 959, 963, 964, 960, 969, 1101
Minimum global solar radiation (W/m²), one value per site: 17, 9, 9, 11, 12, 16, 18, 25
inequality coefficient (TIC), and correlation coefficient ($R$), are used to evaluate the prediction accuracy. The definitions of the criteria are as follows:
\[ \mathrm{MAPE} = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i-\hat{y}_i}{y_i}\right|, \tag{14} \]
\[ \mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}, \tag{15} \]
\[ \mathrm{TIC} = \frac{\sqrt{(1/n)\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}}{\sqrt{(1/n)\sum_{i=1}^{n}\hat{y}_i^2}+\sqrt{(1/n)\sum_{i=1}^{n}y_i^2}}, \tag{16} \]
\[ R = \frac{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)\left(\hat{y}_i-\bar{\hat{y}}\right)}{\sqrt{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2\sum_{i=1}^{n}\left(\hat{y}_i-\bar{\hat{y}}\right)^2}}, \tag{17} \]
where $n$ is the sample size of the test data, $y_i$ and $\hat{y}_i$ represent the true value and estimated value, respectively, and $\bar{y}$ and $\bar{\hat{y}}$ denote the mean values of $y$ and $\hat{y}$. The data is divided into the training dataset (75%) and test dataset (25%). Specifically, the training data is used to establish the forecasting models, and the out-of-sample forecasting performances are evaluated based on the test data. The division procedure is repeated 30 times, and the median of the MAPEs, RMSEs, TICs, and $R$ values is reported. Furthermore, the computation cost (in seconds) is denoted by CC, and the experiments are implemented using a PC with an Intel Core i7 CPU at 3.60 GHz.
4.4. Results Analysis. Table 2 summarizes the forecasting results and computation cost of all the compared approaches at Sites 1–8. As the median is more robust to outliers than the mean, we focus on the median when making comparisons. The performance of the proposed RS-SRSCAD-FA is remarkable because it provides the lowest RMSE and TIC values, whereas its MAPEs are marginally higher than those of other forecasting methods. For instance, the RMSEs of RS-SRSCAD-FA are 20.21, 21.93, 21.20, and 20.29 W/m² at Sites 1–4, which are approximately 20.37%, 13.22%, 18.99%, and 22.65% lower than those of ENN. The Angstrom–Prescott model performs better than both CS-SRPQVSP-QRBF and SVM-FA in terms of accuracy, except for Site 8. For instance, the RMSEs of the Angstrom–Prescott model are given as 23.13, 24.13, 24.63, 25.14, 24.74, 22.32, and 22.22 W/m², which are lower than those of CS-SRPQVSP-QRBF at 25.78, 25.16, 25.62, 22.34, 27.41, 22.65, and 22.89 W/m². CS-SRPQVSP-QRBF delivers better results than SVM-FA except for Sites 1 and 2. Both CS-SRPQVSP-QRBF and SVM-FA outperform PSO-BPNN in terms of MAPE, RMSE, and TIC. The reason is probably that CS and FA are superior to PSO in parameter estimation. However, PSO-BPNN still produces better outcomes than LSSVM,
Figure 3: Example of a global solar radiation signal and other meteorological covariates including solar zenith angle (degrees), precipitation (cm), temperature (°C), wind direction (degrees), wind speed (m/s), relative humidity (%), and air pressure (mbar).
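The four accuracy criteria of Eqs. (14)–(17) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' MATLAB code: the function names are hypothetical, MAPE is returned as a fraction rather than a percentage, and the inputs are assumed to be equal-length sequences with nonzero true values and nonconstant series.

```python
import math
from typing import Sequence


def mape(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Mean absolute percentage error, Eq. (14), returned as a fraction."""
    return sum(abs((a - f) / a) for a, f in zip(y, y_hat)) / len(y)


def rmse(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Root-mean-square error, Eq. (15)."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(y, y_hat)) / len(y))


def tic(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Theil inequality coefficient, Eq. (16): RMSE scaled by the two root mean squares."""
    n = len(y)
    rms_forecast = math.sqrt(sum(f * f for f in y_hat) / n)
    rms_actual = math.sqrt(sum(a * a for a in y) / n)
    return rmse(y, y_hat) / (rms_forecast + rms_actual)


def corr(y: Sequence[float], y_hat: Sequence[float]) -> float:
    """Correlation coefficient R, Eq. (17)."""
    y_bar = sum(y) / len(y)
    f_bar = sum(y_hat) / len(y_hat)
    cov = sum((a - y_bar) * (f - f_bar) for a, f in zip(y, y_hat))
    var_y = sum((a - y_bar) ** 2 for a in y)
    var_f = sum((f - f_bar) ** 2 for f in y_hat)
    return cov / math.sqrt(var_y * var_f)


# Made-up illustrative values in W/m^2 (not data from the study).
actual = [100.0, 200.0, 300.0, 400.0]
forecast = [110.0, 190.0, 310.0, 390.0]
print(mape(actual, forecast), rmse(actual, forecast),
      tic(actual, forecast), corr(actual, forecast))
```

TIC lies between 0 and 1, with 0 for a perfect forecast, which makes it comparable across sites whose radiation levels differ, whereas RMSE keeps the unit of the data.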
LASSO, SCAD, and SRL, which demonstrates that the hybrid forecasting methods have improved the model performance to some extent.
LASSO, SCAD, and SRL provide comparable forecasting results and training times that are much shorter than those of other methods. Although LSSVM is computationally rapid, its performance is not as good as that of ENN, and it produces the largest forecasting errors at Sites 2–7. The Angstrom–Prescott model is more computationally efficient than CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA, which take more training time. For instance, at Site 4, the computational time of the Angstrom–Prescott model is 4.06 s, and those of CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA are 415.35, 502.35, and 408.56 s, respectively. The reason why CS-SRPQVSP-QRBF takes more computation time is that it considers the two-way interaction terms and the selection
Table 2: Statistical performances of the proposed RS-SRSCAD-FA and other competitors. The statistical performances are evaluated by mean absolute percentage error (MAPE), root-mean-square error (RMSE, W/m²), Theil inequality coefficient (TIC), computation cost (CC, in seconds), and correlation coefficient (R) at Sites 1–8.
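The evaluation protocol behind the reported values (a 75%/25% division repeated 30 times, with the median reported and the computation cost CC measured in seconds) can be sketched as follows. Several points are assumptions made only for illustration: the division is taken to be a random shuffle, CC is taken to be the training time alone, and a one-covariate least-squares line stands in for the forecasting models. All function names are hypothetical.

```python
import math
import random
import statistics
import time


def rmse(y_true, y_pred):
    """Root-mean-square error of a forecast."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(y_true, y_pred)) / len(y_true))


def fit_line(x, y):
    """Stand-in model (assumption): ordinary least squares with one covariate."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


def predict_line(model, x):
    slope, intercept = model
    return [slope * a + intercept for a in x]


def repeated_holdout(x, y, n_repeats=30, train_frac=0.75, seed=0):
    """Median test RMSE and median training time over repeated random splits."""
    rng = random.Random(seed)
    indices = list(range(len(y)))
    n_train = int(train_frac * len(indices))
    errors, costs = [], []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        start = time.perf_counter()
        model = fit_line([x[i] for i in train], [y[i] for i in train])
        costs.append(time.perf_counter() - start)  # training time only (assumption)
        preds = predict_line(model, [x[i] for i in test])
        errors.append(rmse([y[i] for i in test], preds))
    return statistics.median(errors), statistics.median(costs)


# Synthetic, noise-free data so that the expected error is essentially zero.
x_demo = [float(i) for i in range(40)]
y_demo = [2.0 * v + 1.0 for v in x_demo]
median_rmse, median_cc = repeated_holdout(x_demo, y_demo)
print(median_rmse, median_cc)
```

Reporting the median rather than the mean keeps a single unlucky split from dominating the summary, which matches the robustness argument given in Section 4.4.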
of parameters in the RBF also takes some time. The reason for the low training speed of PSO-BPNN and SVM-FA is that the models are retrained whenever the parameters change. Furthermore, both ENN and RS-SRSCAD-FA compute slowly as their model structures are more complex than those of other models. However, this is an acceptable trade-off between forecasting accuracy and computational efficiency because the accuracy is improved at the cost of more computation time. The correlation coefficients of the compared methods are given in the last column of the table. RS-SRSCAD-FA
Table 3: Statistical performances of the proposed RS-SRSCAD-FA and other competitors. The statistical performances are evaluated by mean absolute percentage error (MAPE), root-mean-square error (RMSE, W/m²), Theil inequality coefficient (TIC), computation cost (CC, in seconds), and correlation coefficient (R) at Sites 1–8 for 24-hour-ahead forecasting.
is still the best by giving the highest R values. For instance, the R values of RS-SRSCAD-FA are 0.98, 0.96, 0.92, 0.94, 0.94, 0.91, 0.95, and 0.93 for Sites 1–8, respectively. The Angstrom–Prescott model also provides comparable R values, which are also higher than 0.90. LSSVM only produces R values of 0.97 for Site 1 and around 0.80 for the other sites. The performances of the compared methods for 24-hour-ahead and 48-hour-ahead forecasting are listed in Tables 3 and 4, respectively. The results are consistent with what we have discussed above. It is observed
Table 4: Statistical performances of the proposed RS-SRSCAD-FA and other competitors. The statistical performances are evaluated by mean absolute percentage error (MAPE), root-mean-square error (RMSE, W/m²), Theil inequality coefficient (TIC), computation cost (CC, in seconds), and correlation coefficient (R) at Sites 1–8 for 48-hour-ahead forecasting.
that RS-SRSCAD-FA provides remarkable results followed by the Angstrom–Prescott model and CS-SRPQVSP-QRBF. The RMSEs of RS-SRSCAD-FA for 24 hours ahead are 13.11, 16.90, 13.72, 14.49, 13.16, 14.71, 13.70, and 14.53 W/m² for Sites 1–8, respectively. The correlation coefficients of RS-SRSCAD-FA are around 0.90, which are also higher than those of other methods. The RMSEs of LASSO, SCAD, and SRL are approximately 20–21 W/m², which are comparable to but
Figure 4: Boxplots of MAPE, RMSE, and TIC at Sites 1–8. The compared models ENN, LSSVM, LASSO, SCAD, SRL, Angstrom–Prescott, CS-SRPQVSP-QRBF, PSO-BPNN, SVM-FA, and RS-SRSCAD-FA are indicated by numbers 1–10, respectively (the y-axis is truncated to more effectively display the remaining boxplots).
worse than those of CS-SRPQVSP-QRBF, PSO-BPNN, and SVM-FA (approximately 14–18 W/m²). Similar phenomena can also be seen in Table 4 for the case of 48-hour-ahead forecasting.
The boxplots shown in Figure 4 clearly demonstrate that the error values of RS-SRSCAD-FA are considerably lower and that only a few outliers are observed (the red lines represent the median values). Figures 5 and 6 exhibit the relationship
[Figure 5 placeholder: scatter plots for Sites 1–4, one panel per model (ENN, LSSVM, LASSO, SCAD, SRL, Angstrom–Prescott, CS-SRPQVSP-QRBF, PSO-BPNN, SVM-FA, RS-SRSCAD-FA). x-axis: actual solar radiation (W/m²); y-axis: forecast global solar radiation (W/m²); both axes run from 0 to 1000.]
Figure 5: Plot of forecasting values and actual values of global solar radiation at Sites 1–4.
between the measured data and the estimated values in the solar radiation datasets. The points of RS-SRSCAD-FA are closer to the straight line, which demonstrates its satisfactory performance. ENN demonstrates better forecasting results than the other penalized feature selection methods. Figure 7 exhibits the errors between actual values and forecast values provided by the proposed RS-SRSCAD-FA. It is obvious that the RS-SRSCAD-FA values
[Figure 6 placeholder: scatter plots for Sites 5–8, one panel per model (ENN, LSSVM, LASSO, SCAD, SRL, Angstrom–Prescott, CS-SRPQVSP-QRBF, PSO-BPNN, SVM-FA, RS-SRSCAD-FA). x-axis: actual solar radiation (W/m²); y-axis: forecast global solar radiation (W/m²); both axes run from 0 to 1000.]
Figure 6: Plot of forecasting values and actual values of global solar radiation at Sites 5–8.
closely match the actual data in almost all time periods. That is, RS-SRSCAD-FA consistently displays the best forecasting performance over the vast majority of the time period.
4.5. Statistical Comparisons of Results over Multiple Real Datasets. Statistical tests for comparison of the proposed algorithm with other existing algorithms are more reliable and trustworthy on multiple datasets than on a single dataset [61]. To make a fair comparison, a nonparametric Friedman test is applied to evaluate the forecasting and selection performances of the algorithms on these global solar radiation datasets. The Friedman test examines whether the measured
[Figure 7 placeholder: eight panels, one per site (Sites 1–8). y-axis: error between actual data and forecast values of RS-SRSCAD-FA (W/m²), shown from −100 to 100; x-axis ticks from 0 to 800.]
Figure 7: Plot of errors between actual global solar radiation and forecast values of RS-SRSCAD-FA at Sites 1–8.
average rank is significantly different from the mean rank that is likely under the null hypothesis. Table 5 reveals the ranks of the MAPEs, RMSEs, TICs, and R values of the proposed approach and of the other competitors. It is convenient to calculate the $\chi_F^2$ and $F_F$ values as shown in the table. Using the six approaches and eight datasets, $F_F$ is distributed according to 8 − 1 = 7 and (10 − 1) × (6 − 1) = 45 degrees of freedom. The critical value for the statistic F(7, 45) for α = 0.1 is 1.85; therefore, the null hypothesis is rejected, which indicates that there are differences between these algorithms and more tests are needed.
To confirm the advantage, the newly proposed RS-SRSCAD-FA method is regarded as the control algorithm, and the other competitors are tested against it. The most straightforward method is to calculate the critical difference (CD) with the Bonferroni–Dunn test [62]. According to [61], the critical value $q_{0.1}$ for the 10 methods is 2.326. Thus, CD is 2.326 × √((10 × 11)/(6 × 8)) = 3.52. It is evident that the forecasting models except LSSVM, PSO-BPNN, and SVM-FA yield comparable MAPEs as the differences between their average ranks are less than the CD value 3.52. For example, the difference between SRL and RS-SRSCAD-FA is less than 3.52 (4.31 − 3.25 = 1.06 < 3.52). In terms of RMSE, RS-SRSCAD-FA performs better than LSSVM, LASSO, SCAD, and SRL and is comparable with ENN, Angstrom–Prescott, CS-SRPQVSP-QRBF, and SVM-FA based on the differences between the average ranks. For instance, the RMSE rank difference between LASSO and RS-SRSCAD-FA is larger than 3.52 (7.81 − 1.00 = 6.81 > 3.52). The difference between Angstrom–Prescott and RS-SRSCAD-FA is less than 3.52 (2.38 − 1.00 = 1.38 < 3.52). Similar phenomena can be observed from the aspect of TIC and R values: most of the results of the nonparametric Friedman test are consistent with our observations. Unlike ENN and SVM-FA, RS-SRSCAD-FA provides an interpretable model that does not employ all the covariates. Therefore, RS-SRSCAD-FA exhibits a higher prediction capacity relative to other forecasting methods.
5. Conclusions
Forecasting solar radiation is fundamental to solar power technology. For the utilization and conversion of solar power, accurate and continuous radiation data is essential. Therefore, establishing an accurate and efficient forecasting model plays a vital role in global solar radiation forecasting. To overcome the drawbacks of a single model, which yields low forecasting accuracy, a novel global solar radiation forecasting method called RS-SRSCAD-FA has been proposed. The main structure of the novel method is ensemble learning for enhancing the forecasting accuracy and model stability. An efficient covariate selection method SRSCAD is used in the input space; simultaneously, a weight vector, which best represents the importance of each individual model in the ensemble system, is determined by the FA. To illustrate the validity and superiority of the proposed method, the datasets from eight locations in the Xinjiang area of China have been applied as real data examples. The results demonstrate that the proposed RS-SRSCAD-FA method is superior to other forecasting approaches.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that the received funding did not lead to any conflicts of interest regarding the publication of this
manuscript and there are no conflicts of interest regarding the publication of this article.
Acknowledgments
This research was supported by the National Natural Science Foundation of China (Grant Nos. 71761016 and 71861012), the Natural Science Foundation of Jiangxi, China (Grant Nos. 20181BAB211020 and 20171BAA218001), the China Postdoctoral Science Foundation (Grant Nos. 2017M620277 and 2018T110654), and Scientific Research Fund of Jiangxi Provincial Education Department (Grant No. GJJ180287).
Table 5: Ranks of forecasting models for Sites 1–8 of global solar radiation data.
Methods Site 1 Site 2 Site 3 Site 4 Site 5 Site 6 Site 7 Site 8 Average rank χ²_F F_F
MAPE
ENN 2 4 3.5 5 3 1 1.5 3 2.88 46.23 12.56
LSSVM 10 10 10 10 10 6 10 6 9.00
LASSO 1 3 1 2 5 4.5 5.5 7 3.63
SCAD 5 1.5 2 5 4 4.5 4 4 3.75
SRL 5 1.5 3.5 5 2 7 5.5 5 4.31
Angstrom–Prescott 7 5 5 7 7 8 7 8.5 6.81
CS-SRPQVSP-QRBF 5 8 7 2 6 3 3 1.5 4.44
PSO-BPNN 8 7 9 9 9 10 8.5 10 8.81
SVM-FA 9 6 8 8 8 9 8.5 8.5 8.13
RS-SRSCAD-FA 3 9 6 2 1 2 1.5 1.5 3.25
RMSE
ENN 5 4 4 4 3 4 4 5 4.13 61.63 41.60
LSSVM 2 10 10 10 10 7 10 9 8.50
LASSO 8 7 7 7 7 9.5 7 10 7.81
SCAD 9 8 8 9 8 9.5 9 7 8.44
SRL 10 9 9 8 9 8 8 8 8.63
Angstrom–Prescott 3 2 2 3 2 2 2 3 2.38
CS-SRPQVSP-QRBF 6 3 3 2 5 3 3 2 3.38
PSO-BPNN 7 6 6 6 6 6 6 6 6.13
SVM-FA 4 5 5 5 4 5 5 4 4.63
RS-SRSCAD-FA 1 1 1 1 1 1 1 1 1.00
TIC
ENN 6 3.5 4.5 4.5 2.5 4 4.5 5 4.31 59.57 33.56
LSSVM 2.5 10 10 10 10 8.5 10 9 8.75
LASSO 8 7 7.5 8 7.5 8.5 8 10 8.06
SCAD 8 8.5 7.5 8 7.5 8.5 8 7.5 7.94
SRL 8 8.5 9 8 7.5 8.5 8 7.5 8.13
Angstrom–Prescott 4 2 2 3 2.5 3 2.5 3 2.75
CS-SRPQVSP-QRBF 2.5 3.5 3 2 4.5 2 2.5 1.5 2.69
PSO-BPNN 10 5.5 6 6 7.5 5.5 6 6 6.56
SVM-FA 5 5.5 4.5 4.5 4.5 5.5 4.5 4 4.75
RS-SRSCAD-FA 1 1 1 1 1 1 1 1.5 1.06
R
ENN 6 7 9 4 3.5 6 4 8 5.94 56.57 25.67
LSSVM 2 8 10 10 10 8 8 10 8.25
LASSO 9 9 8 9 8.5 9.5 9 9 8.88
SCAD 10 10 5 6.5 8.5 9.5 10 7 8.31
SRL 5 6 6.5 8 7 7 7 6 6.56
Angstrom–Prescott 3 2 2 2 2 2 2 2 2.13
CS-SRPQVSP-QRBF 7 3 3 3 3.5 3 3 3 3.56
PSO-BPNN 8 4.5 6.5 6.5 6 4.5 6 5 5.88
SVM-FA 4 4.5 4 5 5 4.5 5 4 4.50
RS-SRSCAD-FA 1 1 1 1 1 1 1 1 1.00
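The Friedman statistics and the critical difference discussed in Section 4.5 can be checked from the average ranks of Table 5 with a short sketch. It assumes the usual expressions for the Friedman chi-square, its F-distributed correction, and the Bonferroni–Dunn critical difference, with k methods and N datasets. Whether the authors used exactly these expressions is an assumption, although with k = 10 and N = 8 they reproduce the RMSE entries of the table. The value q = 2.326 is simply the one quoted in the text, and the function names are hypothetical.

```python
import math


def friedman_statistics(avg_ranks, n_datasets):
    """Return (chi2_F, F_F) computed from the average ranks of k methods over N datasets."""
    k = len(avg_ranks)
    sum_sq = sum(r * r for r in avg_ranks)
    chi2 = 12.0 * n_datasets / (k * (k + 1)) * (sum_sq - k * (k + 1) ** 2 / 4.0)
    f_f = (n_datasets - 1) * chi2 / (n_datasets * (k - 1) - chi2)
    return chi2, f_f


def critical_difference(q_alpha, k, n_datasets):
    """Bonferroni-Dunn critical difference: q_alpha * sqrt(k (k + 1) / (6 N))."""
    return q_alpha * math.sqrt(k * (k + 1) / (6.0 * n_datasets))


# Average RMSE ranks copied from Table 5 (rounded to two decimals there).
rmse_avg_ranks = {
    "ENN": 4.13, "LSSVM": 8.50, "LASSO": 7.81, "SCAD": 8.44, "SRL": 8.63,
    "Angstrom-Prescott": 2.38, "CS-SRPQVSP-QRBF": 3.38, "PSO-BPNN": 6.13,
    "SVM-FA": 4.63, "RS-SRSCAD-FA": 1.00,
}

chi2_f, f_f = friedman_statistics(list(rmse_avg_ranks.values()), n_datasets=8)
cd = critical_difference(q_alpha=2.326, k=10, n_datasets=8)  # q as quoted in the text
control = rmse_avg_ranks["RS-SRSCAD-FA"]

print(round(chi2_f, 2), round(f_f, 2), round(cd, 2))
# The two comparisons worked out in the text:
print("LASSO differs from control:", rmse_avg_ranks["LASSO"] - control > cd)
print("Angstrom-Prescott differs from control:",
      rmse_avg_ranks["Angstrom-Prescott"] - control > cd)
```

Because the table reports average ranks rounded to two decimals, the recomputed statistics agree with the tabulated 61.63 and 41.60 only up to rounding.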
[35] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), vol. 58, no. 1, pp. 267–288, 1996.
[36] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani, “Least angle regression,” The Annals of Statistics, vol. 32, no. 2, pp. 407–499, 2004.
[37] H. Zou and T. Hastie, “Regularization and variable selection via the elastic net,” Journal of the Royal Statistical Society B: Statistical Methodology, vol. 67, no. 2, pp. 301–320, 2005.
[38] H. Zou, “The adaptive lasso and its oracle properties,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1418–1429, 2006.
[39] D. Yang, Z. Ye, L. H. I. Lim, and Z. Dong, “Very short term irradiance forecasting using the lasso,” Solar Energy, vol. 114, pp. 314–326, 2015.
[40] T. Zhang, “Multi-stage convex relaxation for feature selection,” Bernoulli, vol. 19, no. 5, pp. 2277–2293, 2013.
[41] K. Knight and W. Fu, “Asymptotics for lasso-type estimators,” The Annals of Statistics, vol. 28, no. 5, pp. 1356–1378, 2000.
[42] J. Fan and R. Li, “Variable selection via nonconcave penalized likelihood and its oracle properties,” Journal of the American Statistical Association, vol. 96, no. 456, pp. 1348–1360, 2001.
[43] P. Zhao and B. Yu, “On model selection consistency of Lasso,” Journal of Machine Learning Research, vol. 7, no. 12, pp. 2541–2563, 2006.
[44] K. Lounici, “Sup-norm convergence rate and sign concentration property of Lasso and Dantzig estimators,” Electronic Journal of Statistics, vol. 2, pp. 90–102, 2008.
[45] G. E. P. Box, “Science and statistics,” Journal of the American Statistical Association, vol. 71, no. 356, pp. 791–799, 1976.
[46] J. M. Bates and C. W. J. Granger, “The combination of forecasts,” Operational Research Quarterly, vol. 20, no. 4, pp. 451–468, 1969.
[47] T. K. Ho, “The random subspace method for constructing decision forests,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20, no. 8, pp. 832–844, 1998.
[48] T. K. Ho, “Random decision forests,” in Proceedings of the 3rd International Conference on Document Analysis and Recognition, pp. 278–282, Montreal, Canada, 1995.
[49] X. She Yang, “Firefly algorithm, lévy flights and global optimization. Research and development in intelligent systems,” Research and Development in Intelligent Systems XXVI, vol. 26, pp. 209–218, 2010.
[50] O. P. Verma, D. Aggarwal, and T. Patodi, “Opposition and dimensional based modified firefly algorithm,” Expert Systems with Applications, vol. 44, pp. 168–176, 2016.
[51] H. Wang, W. Wang, X. Zhou et al., “Firefly algorithm with neighborhood attraction,” Information Sciences, vol. 382-383, pp. 374–387, 2017.
[52] X. She Yang, Nature-Inspired Metaheuristic Algorithms, Luniver Press, 2008.
[53] Y. She, “Thresholding-based iterative selection procedures for model selection and shrinkage,” Electronic Journal of Statistics, vol. 3, pp. 384–415, 2009.
[54] Yu. Nesterov, “Gradient methods for minimizing composite objective function,” Technical Report, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE), 2007.
[55] L. Feng, A. Lin, L. Wang, W. Qin, and W. Gong, “Evaluation of sunshine-based models for predicting diffuse solar radiation in China,” Renewable & Sustainable Energy Reviews, vol. 94, pp. 168–182, 2018.
[56] H. Jiang, “A novel approach for forecasting global horizontal irradiance based on sparse quadratic RBF neural network,” Energy Conversion and Management, vol. 152, pp. 266–280, 2017.
[57] M. A. Mohandes, “Modeling global solar radiation using Particle Swarm Optimization (PSO),” Solar Energy, vol. 86, no. 11, pp. 3137–3145, 2012.
[58] L. Olatomiwa, S. Mekhilef, S. Shamshirband, K. Mohammadi, D. Petković, and C. Sudheer, “A support vector machine-firefly algorithm-based model for global solar radiation prediction,” Solar Energy, vol. 115, pp. 632–644, 2015.
[59] H. Akaike, “A new look at the statistical model identification,” IEEE Transactions on Automatic Control, vol. 19, no. 6, pp. 716–723, 1974.
[60] A. Belloni, V. Chernozhukov, and L. Wang, “Square-root lasso: pivotal recovery of sparse signals via conic programming,” Biometrika, vol. 98, no. 4, pp. 791–806, 2011.
[61] A. Janez, “Statistical comparisons of classifiers over multiple data sets,” Journal of Machine Learning Research, vol. 70, no. 1, pp. 1–30, 2006.
[62] O. J. Dunn, “Multiple comparisons among means,” Journal of the American Statistical Association, vol. 56, no. 293, pp. 52–64, 1961.